This post originated from an RSS feed registered with Python Buzz
by Dmitry Dvoinikov.
Original Post: Python function guards
Feed Title: Things That Require Further Thinking
Feed URL: http://feeds.feedburner.com/ThingsThatRequireFurtherThinking
I really love Python, but unfortunately I don't get to use it in my current day job. So now I have to practice it in my spare time, making something generally useful and hopefully suitable for improving my Python application framework.
1. The idea
I already had a method signature checking decorator written years ago, and it turned out to be enormously useful, so along the same lines I started thinking about whether it would be possible to implement declarative function guards that select one version out of many to be executed depending on the actual call arguments. In pseudo-Python, I would like to write something like this:
def foo(a, b) when a > b: ...
def foo(a, b) when a < b: ...
foo(2, 1) # executes the first foo
2. Proof of concept
At first sight it looks impossible, because the second function kind of shadows the first one:
def foo(a, b): print("first")
def foo(a, b): print("second")
foo(2, 1) # second
but this is not exactly so. Technically, each def statement creates a new function object and then binds it to the name foo in the current namespace. So the problem is not that the function itself is overwritten, but that its identically named reference entry in the current namespace is. If you manage to save the reference in between, nothing stops you from calling it:
def foo(a, b):
    print("first")
old_foo = foo
def foo(a, b):
    if a > b:
        old_foo(a, b)
    elif a < b:
        print("second")
foo(2, 1) # first
foo(1, 2) # second
so there you have it, what's left is to automate the process and it's done.
3. Syntax
There is no question as to how the guard should be attached to the guarded function: it would be done by means of a decorator:
@guard
def foo(): # hey, I'm now being guarded !
    ...
@guard
def foo(): # and so am I
    ...
but the question remains where the guarding expression should appear. I see six ways of doing it:
A) as a parameter to the decorator itself:
@guard("a > b")
def foo(a, b):
    ...
B) as a default value for some predefined parameter:
@guard
def foo(a, b, _when = "a > b"):
    ...
C) as an annotation to some predefined parameter:
@guard
def foo(a, b, _when: "a > b"):
    ...
D) as an annotation to return value:
@guard
def foo(a, b) -> "a > b":
    ...
E) as a docstring:
@guard
def foo(a, b):
    "a > b"
    ...
F) as a comment:
@guard
def foo(a, b): # when a > b
    ...
Now I will dismiss them one by one until the winner is determined.
Method F (as a comment) is the first to go, because implementing it would require serious parsing and access to the source code, and it would be semantically misleading, as comments are treated as something insignificant that can be omitted or ignored. The rest of the methods at least depend on runtime information only and work on compiled modules.
Method A (as a parameter to the decorator) looks attractive, but is dismissed because it moves the decision from the function to the wrapper. The function alone can't carry a guard expression, and therefore it would not be possible to separate declaration from guarding:
def foo(a, b): # I want to be guarded
    ...
# but it is this guard here that knows how
foo = guard("a > b")(foo)
The rest of the methods are more or less equivalent and the choice is a matter of personal taste. Nevertheless, I discard method E (docstring) because there is just one docstring per function and it has other uses. Besides, to me it looks like it describes the insides of the function, not the outsides.
So the final choice is between having the guarding expression as an annotation and as a default value. The real difference is this: a parameter with a default value can always be put last, but a parameter with an annotation alone cannot, because a parameter without a default must not follow one that has a default:
def foo(a, b = 0, _when: "a > b"): # syntax error
    ...
This, and the fact that the aforementioned typecheck decorator already makes use of annotations, tips the decision towards the default value:
@guard
def foo(a, b, _when = "a > b"):
    ...
The choice of a name for the parameter containing the guard expression is arbitrary, but it has to be simple, clear and not conflicting at the same time. "_when" looks like a reasonable choice.
4. Semantics
With a few exceptions, the semantics of a guarded function is straightforward:
@guard
def foo(a, b, _when = "a > b"):
    ...
@guard
def foo(a, b, _when = "a < b"):
    ...
foo(2, 1) # executes the first version
foo(1, 2) # executes the second version
foo(1, 1) # throws
Except when there really is a question which version to invoke:
@guard
def foo(a, b, _when = "a > 0"):
    ...
@guard
def foo(a, b, _when = "b > 0"):
    ...
foo(2, 1) # now what ?
and if there is a default version, which is the one without the guarding expression:
@guard
def foo(a, b): # default
    ...
@guard
def foo(a, b, _when = "a > b"):
    ...
foo(2, 1) # uh ?
and the way that seems logical to me is this: the expressions are evaluated from top to bottom, one by one, until a match is found, except for the default version, which is always considered last. With the a > 0 and b > 0 versions and a default version combined, the comments show which version is executed:
foo(1, 1) # a > 0
foo(1, -1) # a > 0
foo(-1, 1) # b > 0
foo(-1, -1) # default
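The rule above can be sketched in a few lines. This is only a minimal illustration, not the actual implementation: the _versions dict, the GuardError exception and the use of eval() on the _when default are all assumptions made for the example, and it assumes that _when, when present, always has a string default.

```python
import inspect

# Hypothetical register: qualified function name -> list of (expression, function).
_versions = {}


class GuardError(Exception):
    """Raised when no version matches the call (hypothetical name)."""


def guard(func):
    # The guard expression is the default value of the _when parameter, if any.
    when = inspect.signature(func).parameters.get("_when")
    expr = when.default if when is not None else None
    chain = _versions.setdefault(func.__qualname__, [])
    chain.append((expr, func))

    def dispatch(*args, **kwargs):
        default = None
        for condition, version in chain:  # top to bottom, in declaration order
            if condition is None:
                default = version  # the default version is considered last
                continue
            bound = inspect.signature(version).bind(*args, **kwargs)
            bound.apply_defaults()
            # Evaluate the expression with the call arguments as local names.
            # eval() is acceptable here only because the text comes from the source itself.
            if eval(condition, {}, dict(bound.arguments)):
                return version(*args, **kwargs)
        if default is not None:
            return default(*args, **kwargs)
        raise GuardError("no version matches the call arguments")

    return dispatch


@guard
def foo(a, b):
    return "default"


@guard
def foo(a, b, _when="a > 0"):
    return "a > 0"


@guard
def foo(a, b, _when="b > 0"):
    return "b > 0"


print(foo(1, 1), foo(1, -1), foo(-1, 1), foo(-1, -1))
```

Note that the default version is declared first here and is still tried last, which is the behavior described above.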
5. Function compatibility
So far we have only seen the case of identical function versions being guarded. But what about functions that have the same name but different signatures ?
@guard
def foo(a):
    ...
@guard
def foo(a, b):
    ...
Should we even consider guarding these as versions of one function ? In my opinion, no, because it gives the impression of a different concept, function overloading, which is not supported by Python in the first place. Besides, it would be impossible to map the arguments across the versions.
Another question is the behavior of default arguments:
@guard
def foo(a = 1, _when = "a > 0"):
    ...
@guard
def foo(a = -1, _when = "a < 0"):
    ...
Guarding these as one could work, but would be confusing as to which value the argument has upon which call. So this case I also reject.
What about the simplest case of different names for the same positional arguments ?
@guard
def foo(a, b):
    ...
@guard
def foo(b, a):
    ...
Technically, those have identical signatures and can be guarded as one, but this is likely to be another source of confusion, possibly arising from a mistake, a typo or a bad copy/paste.
Therefore the way I implement it is this: all the guarded functions with the same name need to have identical signatures, down to parameter names, order and default values, except for the _when meta-parameter and annotations. The annotations are excused so that the guard decorator stays compatible with the typecheck decorator. So the following is about as far as two compatible versions can diverge:
Note how the _when parameter can be positional as well as keyword. This way it can be always put at the end of the parameter list in the declaration.
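As an illustration of the compatibility rule in this section, here is a minimal sketch of how such a check could look. The helper names comparable_signature and compatible are hypothetical, and comparing (name, kind, default) triples is my assumption about what "identical signatures" amounts to; the actual implementation may differ.

```python
import inspect


def comparable_signature(func):
    # Keep name, kind and default of every parameter except _when;
    # annotations are deliberately left out of the comparison.
    params = inspect.signature(func).parameters.values()
    return [(p.name, p.kind, p.default) for p in params if p.name != "_when"]


def compatible(f, g):
    return comparable_signature(f) == comparable_signature(g)


def v1(a, b=0, _when="a > b"):
    ...


def v2(a: int, b: int = 0, *, _when="a < b"):  # annotated, _when is keyword-only
    ...


def v3(b, a=0, _when="a > b"):  # same shape, but the names are swapped
    ...


def v4(a, b=1, _when="a > b"):  # different default value
    ...


print(compatible(v1, v2), compatible(v1, v3), compatible(v1, v4))
```

Here v1 and v2 differ only in annotations and in how _when is declared, so they count as compatible, while v3 and v4 do not.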
6. Function naming
So far we have used simple functions, presumably declared at module level. But how about this:
@guard
def foo():
    ...
def bar():
    @guard
    def foo():
        ...
class C:
    @guard
    def foo(self):
        ...
those three are obviously not versions of the same function, but they are called foo() so how do we tell them apart ?
In Python 3.3 and later the answer is this: f.__qualname__ contains the fully qualified name of the function, a kind of "path" to it:
foo
bar.<locals>.foo
C.foo
respectively. It doesn't matter much what exactly is in the __qualname__, only that the values are different, which is just what we need. Prior to Python 3.3 there is no __qualname__ and we need to fall back to a hacky implementation of qualname.
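A quick way to see this for yourself, assuming Python 3.3 or later and code run at module level (guard is left out, since only the names matter here):

```python
def foo():
    ...


def bar():
    def foo():
        ...
    return foo  # hand the nested function out so it can be inspected


class C:
    def foo(self):
        ...


names = [foo.__qualname__, bar().__qualname__, C.foo.__qualname__]
print(names)
```

All three functions have the same __name__, but three distinct qualified names, which is enough to keep their chains of versions apart.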
7. Special cases
Lambdas are unnamed functions. Their __qualname__ has <lambda> in it but no own name. They would be impossible to guard:
foo = lambda: ...
foo = guard(foo)
bar = lambda: ...
bar = guard(bar)
because from the guard's point of view they are not "foo" and "bar", but the same "<lambda>".
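The name clash is easy to check. A small sketch, assuming Python 3.3 or later:

```python
foo = lambda: None
bar = lambda: None

# Neither function object knows which variable it was assigned to.
same_name = foo.__qualname__ == bar.__qualname__
print(foo.__qualname__, bar.__qualname__, same_name)
```

Any register keyed by qualified name would therefore lump the two lambdas into one chain of versions.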
An interesting glitch allows guarding classmethods and staticmethods. See, classmethod and staticmethod are not regular decorator functions but objects, and therefore cannot be stacked under the guard decorator:
class C:
    @guard # this won't work
    @classmethod
    def foo(cls):
        ...
because classmethod can't be seen through to the original function foo. But it gets interesting when you swap the decorators around:
class C:
    @classmethod
    @guard
    def foo(cls):
        ...
the way it works now is that the guard decorator attaches to the original function foo before it is wrapped with classmethod. Therefore the guarded chain of versions contains only the original functions, not classmethods. But an actual call goes through the classmethod wrapper before it gets to guard: classmethod does its argument-binding magic, and whichever foo guard selects for execution gets its first argument bound to the class, as expected.
8. The register
Here is one final question: when a guarded function is encountered:
@guard
def foo(...):
    ...
where should the decorator look for previously declared versions of foo() ? There must exist some global state that maps function names to their previous implementations.
The most obvious solution is to attach a state dict to the guard decorator itself. The dict would then map (module_name, function_name) tuples to lists of previous function versions. This approach certainly works but has a downside, especially considering I'm going to use it with the Pythomnic3k framework. The reason is that in Pythomnic3k modules are reloaded automatically whenever the source files containing them change. Having a separate global structure holding references to expired modules would be bad, but having a chain of function versions span different identically named modules from the past would be a disaster.
There is a better solution: make the register slightly less global and attach the state dict to the module in which a function is encountered. This dict would map just function names to the lists of versions. Then all the information about the module's guarded functions disappears with the module, with no additional effort.
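A minimal sketch of such a per-module register follows. The helper versions_for and the attribute name _guarded_versions are made up for the example, and a throwaway module object stands in for the real one; inside a decorator the module could presumably be looked up as sys.modules[func.__module__].

```python
import types


def versions_for(module, qualname):
    # The register lives in the module's own namespace, so it is dropped
    # together with the module when the module is discarded or reloaded.
    register = vars(module).setdefault("_guarded_versions", {})
    return register.setdefault(qualname, [])


demo = types.ModuleType("demo")  # stand-in for the module being loaded
versions_for(demo, "foo").append("first version of foo")
versions_for(demo, "foo").append("second version of foo")
versions_for(demo, "C.foo").append("first version of C.foo")

print(demo._guarded_versions)
```

A freshly created module object of the same name starts with no register at all, so versions from an expired module cannot leak into its chains.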
9. Conclusion
The implementation works.
I'm integrating it with Pythomnic3k framework so that all public method functions are instrumented with it automatically, although it is tricky, because when you have a text of just a