Binary operators in Python

A two-hour train journey provided the opportunity to watch the video recording of the Panel with Guido van Rossum at the recent PyData Workshop. The lengthy discussion about PEP 225 (which proposes to add additional operators to Python that would enable to have both elementwise and aggregate operations on the same objects, in particular for providing both matrix and elementwise multiplication on arrays with a nice syntax) motivated me to write up my own thoughts about what’s wrong with operators in Python from my computational scientist’s point of view.

The real problem I see is that operators map to methods. In Python, a*b is just syntactic sugar for a.__mul__(b). This means that it’s the type of a that decides how to do the multiplication. The method implementing this operation can of course check the type of b, and it can even decide to give up and let b handle everything, in which case Python does b.__rmul__(a). But this is just a kludge to work around the real weakness of the operators-map-to-methods approach. Binary operators fundamentally require a dispatch on both types, the type of a and the type of b. What a*b should map to is __builtins__.__mul__(a, b), a global function that would then implement a binary dispatch operation. Implementing that dispatch would in fact be the real problem to solve, as Python currently has no multiple dispatch mechanisms at all.

But would multiple dispatch solve the issue addressed by PEP 225? Not at all, directly. But it would make some of the alternatives mentioned there feasible. A proper multiple dispatch system would allow NumPy (or any other library) to decide what multiplication of its own objects by a number means, no matter if the number is the first or the second factor.

More importantly, multiple dispatch would allow a major cleanup of many scientific packages, including NumPy, and even clean up the basic Python language by getting rid of __rmul__ and friends. NumPy’s current aggressive handling of binary operations is actually more of a problem for me than the lack of a nice syntax for matrix multiplication.

There are many details that would need to be discussed before binary dispatch could be proposed as a PEP. Of course the old method-based approach would need to remain in place as a fallback, to ensure compatibility with existing code. But the real work is defining a good multiple dispatch system that integrates well with Python’s dynamical type system and allows the right kind of extensibility. That same multiple dispatch method could then also be made available for use in plain functions.

Explore posts in the same categories: Programming

2 Comments on “Binary operators in Python”

  1. Dan Says:

    Very good point!
    I think it will become essential for numpy or even programming languages in general to separate the presentation layer/syntax more from what the code actually does.

    We are facing one very serious tradeoff in scientific computation, that is between writing explicit and readable code on the one hand side and code that is close enough to an efficient algorithmic implementation.
    I don’t see people taking this problem seriously enough, because numerics and cs are two very different areas and numerical people don’t see that a numerical library needs a cs approach similar to writing a compiler, and cs people don’t see the problem with forcing an already hopelessly complicated formula into an unreadable form like add(a,mul(b,c))+dot(a,b)

    Most libraries are hacks (in an admiring sense) that use more or less obscure language features of the underlying language to provide the user with a possibility to write expressions more or less the way the user want’s to see them.

    For example, in an object oriented language, to write down a mathematical expression, the library designer would probably use operator overloading.
    This of course is a huge tradeoff, as now you are forever locked into your object model and can’t easily add new operators and generally have to live with the given operator precedence.

    Suddenly it has become impossible for the library to see the entire expression before evaluation, and often unnecessary temporary variables have to be created, slowing down the evaluation.

    Now again, some smart people found a workaround with expression templates for c++, but this is far beyond what a normal person can be expected to understand and maintain and it just creates problems in different places that are even more difficult to address.

    Multiple dispatch, together with the possibility to create custom operators, is a big step in the right direction, as it will make it possible to separate the implementation from the syntactic constraints of the language.

    I like the pypy objspace, which goes in the right direction:

    I would even go one step further and add (optional) multiple dispatch over the entire tree of an expression, so something like A x (B x C) + D^T would be handed as an AST to a function to decide what to do with it.

    a*X + b*Y would be directly handed over to DAXPY and not result in several member function calls and temporaries.

    I can see why some people don’t like it, because it is a kind of macro, but the function to evaluate the AST is a much cleaner and easier solution than to get the same result using operator overloading, where each function only sees a tiny part of the expression.

    This would make it trivial to swap out backends and do the computation on the GPU instead of using BLAS, without any of the syntactic overhead a library like Theano brings.

    • khinsen Says:

      I fully agree with you. It is highly desirable for computational science to have more fine-grained control over the code transformation from math-like equations to efficient machine code than current languages and other tools provide. But, as you say, this requires more collaboration between CS and computational scientists than what is happening at the moment.

      On the Python side, a good entry point could be SymPy, which allows to obtain an equation as an in-memory object not very different from an AST. Code generators can start from there, and that has in fact be done (see for an example). But this is still too complicated for the everyday work of computational scientists.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: