Python Buzz Forum - generator-based graph algorithms

So Paul Moore gave me a much needed nudge to look into generator-based versions of graph algorithms. (BTW Paul, I'd like to see your implementation). Something like



for vertex in bfs():
# do stuff with the vertex

Is a very clear, powerful construct and very Pythonic, to boot (wait, that's the same thing! :). Since the main inspiration of this effort is Boost.Graph (BGL), my first thought was whether one of BGL's main constructs (Visitors) was compatible with generators. Basically I was wondering if I would have to keep two versions of the algorithm to have both Visitors and generators.
Another question that was bothering me was the following: this



for vertex in bfs():
# do stuff with the vertex

code is very cute, but what if I am not interested in the vertecies, but I am interested in the edges (for one, edges have enough information to build a graph, vertecies don't). Do I have to have both



for vertex in bfs_vertecies():
# do stuff with the vertex

and



for edge in bfs_edges():
# do stuff with the edge

??
Looks like generators are going to generate some serious PITA, both in implementation and naming (which is very important in libraries, BTW).
Then I rethought the whole Visitor-generator conflict. What do we want to visit? Vertecies, edges, back edges, etc. What do we want to yeild with generators? Pretty much the same things. So I guess they are not so different after all. So what if we put a yield in front of every call to a visitor function? E.g. (somewhere in bfs) we would have

yield visitor.visit_edge(e, graph)

and we'll have this visitor return the edge that's passed to it? This is not bad, as it gives us the opportunity to decide which sequence is going to generated (vertecies or edges) by simply passing the appropriate visitor. Well, there is a minor problem. As all of the visitor functions are going to be called with a 'yield' by convention, and the guts of the algorithm are oblivious to the fact that we only want the edges, we are going to get a bunch of None's (which are returned from any python function even if it does not explicitly return anything) intermixed with the returned edges. We could do



val = visitor.visit_edge(edge, graph)
if val:
    return val

, but it will 2) ruin the readability of the algorithms and b) is going to be a burden to program. Then I remambered that at least in case of bfs there is a function that wraps the entire bfs algorithm. It is mainly needed to call visitor.initialize_vertex() on all the vertecies in the graph. I could filter out all of the None's right there! It ends up looking like something



def breadth_first_search(graph, start, bfs_visitor = default_bfs_visitor()):
 for v in graph.vertices():
 bfs_visitor.initialize_vertex(v, graph)
 try:
 for res in breadth_first_visit(graph, start, bfs_visitor): # call the inner func
     if res:
  yield res
 except traversal_termination_exception, e:
 pass  # this is a valid termination, no rethrow necessary

Now we can pass an edge or a vertex visitor to bfs. As long as their add_edge and add_vertex functions, respectively, return their args they are going to make bfs() into a fine generator func. As a small application I thought of creating a bunch of images from my previous Amazon graph traversal and making an animation out of it. This would work very well with a generator as I could do this:



graph = pygraphlib.UGraph()
for edge in bfs(edges()):
   graph.add_edge(edge)
   dot = pydot.Dot(graph)
   dot.save_image()

going through the edges in a loop, adding them to a graph structure, and taking a 'current snapshot' of the graph inside the loop. As a final result I wanted to use some software to create an animation out of the images. Well, everything worked untill the last step, mainly due to the fact that 'dot' produced images of different size for almost each stem of the graph, which does not look good as a movie. I put up the generator-based code here.
One remaining issue is how to call the new-style algorithms when sequence generation is not desired/is irrelevant. In my unit test I have just been forcing the algorithm to execute with a list comprehension, but it is obviously a hack. I have some idea, but now it's time to sleep on it :). gnite.


	Web Artima.com