Summary
I used to believe that multimethods were so advanced I would never need them. Well, maybe I still believe that, but here's a quick and dirty implementation of multimethods so you can see for yourself. Some assembly required; advanced functionality left as an exercise for the reader.
So what are multimethods? I'll give you my own definition, as I've come to understand them: a function that has multiple versions, distinguished by the type of the arguments. (Some people go beyond this and also allow versions distinguished by the value of the arguments; I'm not addressing this here.)
As a very simple example, let's suppose we have a function that we want to define for two ints, two floats, or two strings. Of course, we could define it as follows:
def foo(a, b):
    if isinstance(a, int) and isinstance(b, int):
        ...code for two ints...
    elif isinstance(a, float) and isinstance(b, float):
        ...code for two floats...
    elif isinstance(a, str) and isinstance(b, str):
        ...code for two strings...
    else:
        raise TypeError("unsupported argument types (%s, %s)" % (type(a), type(b)))
But this pattern gets tedious. (It also isn't very OO, but then, neither are multimethods, despite the name, IMO.) So what could this look like using multimethod dispatch? Decorators are a good match:
from mm import multimethod
@multimethod(int, int)
def foo(a, b):
    ...code for two ints...

@multimethod(float, float)
def foo(a, b):
    ...code for two floats...

@multimethod(str, str)
def foo(a, b):
    ...code for two strings...
The rest of this article will show how we can define the multimethod decorator.
It's really pretty simple: there's a global registry indexed by function name ('foo' in this case), pointing to a registry indexed by tuples of type objects corresponding to the arguments passed to the decorator. Like this:
# This is in the 'mm' module
registry = {}

class MultiMethod(object):
    def __init__(self, name):
        self.name = name
        self.typemap = {}
    def __call__(self, *args):
        types = tuple(arg.__class__ for arg in args)  # a generator expression!
        function = self.typemap.get(types)
        if function is None:
            raise TypeError("no match")
        return function(*args)
    def register(self, types, function):
        if types in self.typemap:
            raise TypeError("duplicate registration")
        self.typemap[types] = function
I hope that wasn't too much code at once; it's really very simple so far (please indulge me in using the words 'class' and 'type' interchangeably here):
The __init__() constructor creates a MultiMethod object with a given name and an empty typemap. (The name argument is only used for diagnostics; I'm leaving these out for brevity.)
The __call__() method makes the MultiMethod object itself callable. It looks up the actual function to call in the typemap based on the types of the arguments passed in; if there's no corresponding entry it raises TypeError. The biggest simplification here is that we don't try to find an inexact match using subclass relationships; a real multimethod implementation should attempt to find the best match such that the actual argument types are subclasses of the types found in the typemap. This is left as an exercise for the reader (or maybe I'll blog about it next week, if there's demand; there are a few subtleties when there's more than one near miss).
The register() method adds a new function to the typemap; it should be called with a tuple of type objects (== classes) and a function.
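As a rough illustration of the exercise mentioned above for __call__() (my sketch, not part of the article's code), a naive variant could fall back to scanning the typemap for a signature whose entries are base classes of the actual argument types, taking the first hit and ignoring the "best match" subtleties:

    # Naive inexact matching: takes the *first* compatible entry,
    # punting on the "best match" problem mentioned above.
    def __call__(self, *args):
        types = tuple(arg.__class__ for arg in args)
        function = self.typemap.get(types)
        if function is None:
            for sig, func in self.typemap.items():
                if len(sig) == len(types) and all(
                        issubclass(t, s) for t, s in zip(types, sig)):
                    function = func
                    break
        if function is None:
            raise TypeError("no match")
        return function(*args)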
I hope it's clear from this that the @multimethod decorator should return a MultiMethod object and somehow call its register() method. Let's see how to do that:
def multimethod(*types):
    def register(function):
        name = function.__name__
        mm = registry.get(name)
        if mm is None:
            mm = registry[name] = MultiMethod(name)
        mm.register(types, function)
        return mm
    return register
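To see the pieces working together, here is a quick hypothetical session (my example, assuming the code above is saved as mm.py):

    from mm import multimethod

    @multimethod(int, int)
    def foo(a, b):
        return a + b

    @multimethod(str, str)
    def foo(a, b):
        return a + " " + b

    foo(1, 2)       # 3
    foo("a", "b")   # 'a b'
    foo(1.0, 2.0)   # raises TypeError: no match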
That's it! Sparse but it works. Note that only positional parameters are supported; it gets pretty murky if you want to support keyword parameters as well. Default parameter values are somewhat against the nature of multimethods: instead of
@multimethod(int, int)
def foo(a, b=10):
    ...
you'd have to write
@multimethod(int, int)
def foo(a, b):
    ...

@multimethod(int)
def foo(a):
    return foo(a, 10)  # This calls the previous foo()!
I've got one improvement to make: I imagine that sometimes you'd want to write a single implementation that applies to multiple types. It would be convenient if the @multimethod decorators could be stacked, like this:
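@multimethod(int, int)
@multimethod(float, float)
def foo(a, b):
    ...code for two ints or two floats...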
This can be done by changing the decorator slightly (this is not thread-safe, but I don't think that matters much, since all this is typically happening at import time):
def multimethod(*types):
    def register(function):
        function = getattr(function, "__lastreg__", function)
        name = function.__name__
        mm = registry.get(name)
        if mm is None:
            mm = registry[name] = MultiMethod(name)
        mm.register(types, function)
        mm.__lastreg__ = function
        return mm
    return register
Note the three-argument getattr() call, which you may not be familiar with: getattr(x, "y", z) returns x.y if it exists, and z otherwise. So that line is equivalent to
if hasattr(function, "__lastreg__"):
    function = function.__lastreg__
You could try to put the assignment to mm.__lastreg__ inside the register() method, but that would just add more distance between the code that sets it and the code that uses it, so I like it better this way. In a more static language, of course, there would have to be a declaration of the __lastreg__ attribute; Python doesn't need this. It's important that this isn't a "normal" attribute name, so that other uses of function attributes aren't preempted. (Hm... There are almost no "normal" uses of function attributes; they are mostly used for various "secret" purposes so name conflicts in the __xxx__ namespace are not inconceivable. Oh well, maybe we should use something really long like multimethod_last_registered or even put the whole thing inside the MultiMethod class so we can use a private variable name like __lastreg.)
As a quick hack I tried to use a descriptor to hold the signatures for the functions. The decorator returns a descriptor, but since __set__ isn't called on the descriptor when foo is redefined (redefinition alters the class __dict__ directly?), it doesn't work. Or at least it isn't the easy place to accumulate signatures I thought it might be. You would still need a global hiding place to keep track of multiple redefinitions of foo, even if you stored the per-function signatures in a descriptor instead of globally.
Here is a minimal example that doesn't raise an exception where you might think it would.
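A sketch of such an example (my reconstruction; the names here are made up): a data descriptor whose __set__ you might expect to fire when the name is rebound in a class body, but which is silently bypassed.

    class Recording(object):
        # A data descriptor that tries to notice reassignment.
        def __init__(self, value):
            self.value = value
        def __get__(self, obj, objtype=None):
            return self.value
        def __set__(self, obj, value):
            raise AssertionError("you might expect this to be raised...")

    class C(object):
        foo = Recording(1)
        foo = Recording(2)  # ...but nothing happens: this just rebinds
                            # the name in the class body's namespace
                            # dict, so __set__ is never consulted

    print(C.foo)  # 2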
> As a very simple example, let's suppose we have a function that we want to define for two ints, two floats, or two strings. Of course, we could define it as follows: [the if/elif version above] But this pattern gets tedious.
This pattern may be tedious but don't knock it for that. Anyone with even the simplest understanding of Python can read it and immediately understand the coder's intentions.
The alternative that you have presented may be better in many technical ways but it's pretty opaque and really only accessible to an advanced Python programmer.
If that's the intention then fine, but given the choice, I'd go for 'simple' code that anyone can maintain over technical wizardry any day.
> This pattern may be tedious but don't knock it for that. Anyone with even the simplest understanding of Python can read it and immediately understand the coder's intentions.

To me this use of decorators looks pretty clear. Clearer than pages of if/elifs, on any account.
> The alternative that you have presented may be better in many technical ways but it's pretty opaque and really only accessible to an advanced Python programmer.

Which is true of most library code.
> If that's the intention then fine, but given the choice, I'd go for 'simple' code that anyone can maintain over technical wizardry any day.

First of all, users don't have to be able to maintain the multimethod module in order to use it. In contrast to the implementation, to me the usage looks even simpler than starting to write down a chain of if/elifs. Secondly, let's see how maintainable and easy to understand the code gets when you take 5-10 arguments for a few dozen branches into account. And on a side note, I didn't find the code hard to follow at all. OK, I'm not a beginner, but I'm by no means a Python wizard.
"when you take 5-10 arguments for a few dozen branches into account"
If you design APIs where you need pages of logic just to sort out what code a function/method should execute, you should probably hand over the API design to someone else.
> If you design APIs where you need pages of logic just to sort out what code a function/method should execute, you should probably hand over the API design to someone else.

I rather think that the call for realistic scalability of a pattern is justified in the face of more than two parameters and two types. I also think that people who need to do this kind of stuff are not doing it for the fun of branching between int/str, but are rather trying to hand a significant part of their domain model to a significant part of their API.
Personally I don't use multimethods because I don't like branching by type, but that's just me. I acknowledge that there are people who think they need just that.
> This pattern may be tedious but don't knock it for that. Anyone with even the simplest understanding of Python can read it and immediately understand the coder's intentions.
>
> The alternative that you have presented may be better in many technical ways but it's pretty opaque and really only accessible to an advanced Python programmer.
The very same argument used to be used against object-oriented programming. (Or so they say... I wasn't born yet.) And anyway, I'm not very adept at Python and I think the @multimethod decorator seems fairly self-explanatory, especially if the multiple methods appear next to each other.
I don't have strong feelings for or against multimethods. But I think the first step to appreciating them is understanding that they simply provide polymorphic dispatch over more objects than just "self", which doesn't seem like such an unreasonable thing to ask for in a dynamic language.
For PyProtocols' multimethods (aka generic functions), I used to have decorators like yours, but I found that I could provide a better API by having an explicit definition of the multimethod, distinct from defining individual methods. That way, I can take the argument signature and default values from a prototype, like so:
from dispatch import generic
from dispatch.strategy import Signature

@generic()
def foo(a, b=10):
    """Documentation for the multimethod here"""

@foo.when(Signature(a=int, b=int))
def foo(a, b):
    # code for int, int
    ...
Of course, normally one calls the 'when' decorators with a string like "isinstance(a,int) and isinstance(b,int)", but as you can see it's possible to use a simpler API for cases that only use type dispatching.
Anyway, having a prototype allows you to apply default values *before* dispatching, so that you don't have to do special cases for fewer arguments or whatever. And using a decorator method of the multimethod allows you to give the other functions different names, so you can easily see which one is executing, and you can also reuse them or call them explicitly.
Last, but not least, the prototype provides a nice place to put documentation and such, especially if you use an actual function object, as then pydoc and other tools work properly when inspecting it. (PyProtocols takes the original function's argument signature and then generates a new function stub that calls the dispatch machinery, which is part of where the 5 microsecond overhead comes from.)
P.S., no I didn't miss the point that this was supposed to be a 5-minute implementation, just suggesting a couple of 10-minute enhancements. ;) And I completely avoided mentioning how you implemented the easiest part of dispatch, where you're not taking inheritance or method resolution order into account. :)
"""So what are multimethods? I'll give you my own definition, as I've come to understand them: a function that has multiple versions, distinguished by the type of the arguments."""
I.e., all arguments are treated equally, as distinct from single-dispatch languages like Python, where multiple versions of a function are distinguished by the type of the first argument only. Fortunately for single-dispatch languages, while such gross asymmetry is not ideal, there seem to be enough asymmetric problems in the world that they don't embarrass themselves too much. :)
Regarding the 'five-minute' implementation, there's more to multimethods than just simple function overloading. As well as supporting 'best match' dispatching on subtypes, you also need to provide a mechanism that allows the invoked multimethod to redispatch a call for handling by the next best match; otherwise there's no way to call overridden methods.
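One way to sketch such a redispatch mechanism on top of the MultiMethod class above (my sketch: the names, like next_method, are my own invention, and every registered function is assumed to take the next-method callable as an extra first parameter):

    class RedispatchMM(MultiMethod):
        # Applicable methods are those whose registered signature
        # consists of base classes of the actual argument types,
        # crudely ordered by how many entries match exactly.
        def applicable(self, types):
            matches = [(sig, func) for sig, func in self.typemap.items()
                       if len(sig) == len(types) and all(
                           issubclass(t, s) for t, s in zip(types, sig))]
            matches.sort(key=lambda m: sum(t is s for t, s in zip(types, m[0])),
                         reverse=True)
            return [func for sig, func in matches]

        def __call__(self, *args):
            chain = self.applicable(tuple(arg.__class__ for arg in args))
            if not chain:
                raise TypeError("no match")
            def call_next(i, *a):
                if i >= len(chain):
                    raise TypeError("no next method")
                # each registered function receives the next-method
                # callable as its first argument
                return chain[i](lambda *b: call_next(i + 1, *b), *a)
            return call_next(0, *args)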
I think it's important to point out that all of the complexity in Bob's example code comes from 1) emulating Guido's "implicit" API where the multimethod is automatically created upon first definition, and 2) implementing exact type matching instead of using isinstance(). Without #1, most of the 'register' function wouldn't be needed, and without #2, the Getattr/Pointer stuff would be unnecessary. So, with the dispatch framework it's actually *harder* to create a crippled and implicit version of the API. :)
Part of the reason I point this out is so that people stumbling across Bob's example don't end up thinking that Guido's code is "better" than using the dispatch module, or -- even worse! -- they actually use Bob's code.
If you really want the semantics of Guido's example and you're using PyProtocols, you should use @foo.when("type(a) is int and type(b) is int"), as this works just fine without any need for additional decorators. And if you want multiple signatures, then @foo.when("type(a) is str and type(b) is str or type(a) is unicode and type(b) is unicode") also works as intended. There's then no need for last-method attribute hacks or stacked decorators.
A few extensions of this beautiful Python pattern can be considered that deal with sets of type signatures. Maybe the most important use cases are those sets defined by interchangeable types. An example is symmetric functions, where f(a,b) = f(b,a) holds.
A symmetric multimethod class and its decorator can be introduced:
class SymmetricMM(MultiMethod):
    def __call__(self, *args):
        # sort the argument types so that e.g. (int, float) and
        # (float, int) hit the same typemap entry; classes need an
        # explicit sort key
        types = tuple(sorted((arg.__class__ for arg in args),
                             key=lambda t: t.__name__))
        function = self.typemap.get(types)
        if function is None:
            raise TypeError("no match")
        return function(*args)

def symmetric_mm(*types):
    def register(function):
        name = function.__name__
        mm = registry.get(name)
        if mm is None:
            mm = registry[name] = SymmetricMM(name)
        mm.register(tuple(sorted(types, key=lambda t: t.__name__)), function)
        return mm
    return register
It may be reasonable that bar(int, int) as well as bar(float, float) return values instead of raising errors. Handling all permutations of the type-tuples (int,int), (int,float), (float,int), (float,float) with one decorator should be possible:
def perm(*args):
    # Generate all sorted (non-decreasing) n-tuples over the given
    # types; together with SymmetricMM's sorted lookup this covers
    # every permutation of the arguments.
    s = sorted(args, key=lambda t: t.__name__)
    def perm_n(n):
        if n == 1:
            return [[arg] for arg in s]
        p = []
        for x in s:
            for y in perm_n(n - 1):
                if x.__name__ <= y[0].__name__:
                    p.append([x] + y)
        return p
    return set(tuple(p) for p in perm_n(len(s)))

def perm_mm(*types):
    def register(function):
        name = function.__name__
        mm = registry.get(name)
        if mm is None:
            mm = registry[name] = SymmetricMM(name)
        for p in perm(*types):
            mm.register(p, function)
        return mm
    return register
Note that the permutation decorator perm_mm does not introduce a new MultiMethod class but reuses SymmetricMM. The function perm() returns only representatives of tuple permutations (i.e., one representative per permutation class, where the elements of a class are related by a sequence of transpositions).
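A quick usage sketch under the definitions above (my example, assuming the article's registry is in scope):

    @perm_mm(int, float)
    def bar(a, b):
        return a + b

    bar(1, 2)      # (int, int) was registered
    bar(1, 2.0)    # looked up as the sorted representative (float, int)
    bar(2.0, 1)    # same entry: argument order doesn't matter
    bar(1.5, 2.5)  # (float, float) was registered too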
The longer I think about the MultiMethod pattern, the less convincing I find some details of its implementation, especially the use of the typemap. The same goes for the MultiMethod variant of Ian Bicking, cited by Philip Eby.
While the first attempt to branch by type
def foo(a, b):
    if isinstance(a, int) and isinstance(b, int):
        ...code for two ints...
    elif isinstance(a, float) and isinstance(b, float):
        ...code for two floats...
may look awkward and not OO-ish, the latter is just a superficial objection due to the widespread meme that if..else is somehow not in touch with OO. The criticism is valid if one asks for type equivalence, type(a) == int, because that breaks polymorphism: the predicate is falsified for subclasses. Python's isinstance() instead is perfectly aware of polymorphism and inheritance, while doing a lookup of a type-tuple in a typemap breaks it again.
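For instance (my illustration), with the exact-match typemap above:

    class Celsius(float):
        pass

    @multimethod(float, float)
    def add(a, b):
        return a + b

    add(1.0, 2.0)                    # fine
    add(Celsius(1.0), Celsius(2.0))  # TypeError: no match, although
                                     # isinstance(Celsius(1.0), float) is True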
Unfortunately I don't see a way to change the pattern without blowing up the lookup time from constant to linear in the number of registered type-tuples.
Regards, Kay