Python Buzz Forum - doing the homework

There are a couple of nice posts on doing graphs in Python: One by Guido (the article ends with "This, too, will be the subject of another column." Does anyone know where the column is?), which concentrates on the ease of using Python's built-in data structures to represent graphs. Another one by David Eppstein which continues Guido's theme and adds some performance tricks (David basically makes the point that using a dict of dicts is overall better than a dict of lists). I really enjoyed the articles, but I have a somewhat different take:
These ad-hoc schemes are very cool very Pythonic - the language that makes (even not so-) easy things easy and hard things possible. But if we are to write a graph library in Python (and we are!) I think the interface offered by these native data structures is not rich enough. I know David mentioned that the dict-like interface in Python is kind of rich, due to the ability to override __getitem__ and the dict being an itrerable:
"Of course, G and G[v] need not be Python dict objects; they can be any other object that obeys dict protocol, for instance a wrapper in which vertices are URLs and a call to G[v] loads the web page and finds its links".
Still I do not think it is rich enough. A couple of examples come to mind: what if we want to somehow associate both incoming and outgoing edges with a vertex? Dictionary's __getitem__ only gets us from an object to a single (collection of) objects. We need to get to two sets: the 'downwind' vertecies and the 'upwind' vertecies? Or, what if we want edges instead of vertecies? That's just off the top of my head. Looking at the BGL's graph concepts we see that the graph interface required by different BGL algorithms is way beyond what Python's native dict can offer.
I will also point out that considering operator-based approach (such as the graph[vertex] approach discribed above) is a balancing act between succinctness and readability. Where the operator already has some traction in the domain, such as the + operator in many mathematical domains, it alleviates some of the readability worries. That said, choosing [] to mean get_adjacent_vertecies is pretty arbitrary.
So as far as we are concerned I think named (member) functions to represent graph the interface is the way to go. The benefit of using BGL as a model is that each algorithm "specifies" (well, only via compilation+documentation in C++ BGL) its graph interface. Great effort was applied to BGL to make these interfaces minimal, which means that a great variety of datastructures can be easily adapted to the algorithm's "Concept".
Let me say again that this is not a rant against Guido and David's approach. I am just saying this approach is not good enough for a library. There are plenty of times when a problem does not warrant writing a library, and approaches like these become a very viable option.


	Web Artima.com