The release of NumPy 1.9 a few days ago was a bit of a revelation for me. For the first time in the combined history of NumPy and its predecessor Numeric, a new release broke my own code so severely thatI don’t see any obvious way to fix it, given the limited means I can dedicate to software maintenance. And that makes me wonder for which scientific uses today’s Python ecosystem can still be recommended, since the lack of means for code maintenance is a chronic and endemic problem in science.
I’ll start with a historical review, for which I am particularly well placed as one of the oldtimers in the community: I was a founding member of the Matrix-SIG, a small group of scientists who in 1995 set out to use the still young Python language for computational science, starting with the design and implementation of a module called Numeric. Back then Python was a minority language in a field dominated by Fortran. The number of users started to grow seriously from 2000, to the point of now being a well-recognized and respected community that spans all domains of scientific research and holds several
conferences per year across the globe. The combination of technological change and the needs of new users has caused regular changes in the code base, which has grown as significantly as the user base: the first releases were small packages written and maintained by a single person (Jim Hugunin, who later became famous for Jython and IronPython), whereas today’s NumPy is a complex beast maintained by a team.
My oldest published Python packages, ScientificPython and MMTK, go back to 1997 and are still widely used. They underwent a single major code reorganization, from module collections to packages when Python 1.5 introduced the package system. Other than that, most of the changes to the code base were implementations of new features and the inevitable bug fixes. The two main dependencies of my code, NumPy and Python itself, did sometimes introduce incompatible changes (by design or as consequences of bug fixes) that required changes on my own code base, but they were surprisingly minor and never required more than about a day of work.
However, I now realize that I have simply been lucky. While Python and its standard library have indeed been very stable (not counting the transition to Python 3), NumPy has introduced incompatible changes with almost every new version over the last years. None of them ever touched functionalities that I was using, so I barely noticed them when looking at each new version’s release notes. That changed with release 1.9, which removes the compatbility layer with the old Numeric package, on which all of my code relies because of its early origins.
Backwards-incompatible changes are of course nothing exceptional in the computing world. User needs change, new ideas permit improvements, but existing APIs often prevent a clean or efficient implementation of new features or fundamental code redesigns. This is particularly true for APIs that are not the result of careful design, but of organic growth, which is the case for almost all scientific software. As a result, there is always a tension between improving a piece of software and keeping it compatible with code that depends on it. Several strategies have emerged to deal with, depending on the priorities of each community. The point I want to make in this post is that NumPy has made a bad choice, for several reasons.
The NumPy attitude can be summarized as “introduce incompatible changes slowly but continuously”. Every change goes through several stages. First, the intention of an upcoming changes is announced. Next, deprecation warnings are added in the code, which are printed when code relying on the soon-to-disappear feature is executed. Finally, the change becomes effective. Sometimes changes are made in several steps to ease the transition. A good example from the 1.9 release notes is this:
In NumPy 1.8, the diagonal and diag functions returned readonly copies, in NumPy 1.9 they return readonly views, and in 1.10 they
will return writeable views.
The idea behind this approach to change is that client code that depends on NumPy is expected to be adapted continuously. The early warnings and the slow but regular rythm of change help developers of client code to keep up with NumPy.
The main problem with this attitude is that it works only under the assumption that client code is actively maintained. In scientific computing, that’s not a reasonable assumption to make. Anyone who has followed the discussions about the scientific software crisis and the lack of reproduciblity in computational science should be well aware of this point that is frequently made. Much if not most scientific code is written by individuals or small teams for a specific study and then modified only as much as strictly required. One step up on the maintenance ladder, there is scientific code that is published and maintained by computational scientists as a side activity, without any significant means attributed to software development, usually because the work is not sufficiently valued by funding agencies. This is the category that my own libraries belong to. Of course the most visible software packages are those that are actively maintained by a sufficiently strong community, but I doubt they are representative for computational science as a whole.
A secondary problem with the “slow continuous change” philosophy is that client code becomes hard to read and understand. If you get a Python script, say as a reviewer for a submitted article, and see “import numpy”, you don’t know which version of numpy the authors had in mind. If that script calls array.diag() and modifies the return value, does it expect to modify a copy or a view? The result is very different, but there is no way to tell. It is possible, even quite probable, that the code would execute fine with both NumPy 1.8 and the upcoming NumPy 1.10, but yield different results.
Given the importance of NumPy in the scientific Python ecosystem – the majority of scientific libraries and applications depends on it -, I consider its lack of stability alarming. I would much prefer the NumPy developers to adopt the attitude to change taken by the Python language itself: accumulate ideas for incompatible changes, and apply them in a new version that is clearly labelled and announced as incompatible. Everyone in the Python community knows that there are important differences between Python 2 and Python 3. There’s a good chance that a scientist publishing a Python script will clearly say if it’s for Python 2 or Python 3, but even if not, the answer is often evident from looking at the code, because at least some of the many differences will be visible.
As for my initial question for which scientific uses today’s Python ecosystem can still be recommended, I hesitate to provide an answer. Today’s scientific Python ecosystem is not stable enough for use in small-scale science, in my opinion, although it remains an excellent choice for big communities that can somehow find the resources to maintain their code. What makes me hesitate to recommend not using Python is that there is no better alternative. The only widely used scientific programming language that can be considered stable, but anyone who has used Python is unlikely to be willing to switch to an environment with tedious edit-compile-run cycles.
One possible solution would be a long-time-support version of the core libraries of the Python ecosystem, maintained without any functional change by a separate development team. But that development team has be created and funded. Any volunteers?