Kata 19: an optimization anecdote
by Ian Bicking (originally posted to his feed, http://www.ianbicking.org/feeds/atom.xml)
I spent some time last night working on Kata
19 (these Katas are little programming exercises, a nice
programming break from writing Useful Code). This was definitely
the hardest one I've done so far -- you find paths between words,
where words are connected if they only differ by one letter. For
example: beast to least to leant to leans to loads to loons to
loops.
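The connection rule is easy to state directly in code. This isn't from the original program, just a minimal sketch of the adjacency test (the function name is mine):

    def one_letter_apart(a, b):
        # Two words are connected when they have the same length and
        # differ in exactly one position.
        if len(a) != len(b):
            return False
        return sum(1 for x, y in zip(a, b) if x != y) == 1

    # e.g. one_letter_apart("beast", "least") is True,
    #      one_letter_apart("least", "loops") is False.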
The algorithm I used was pretty quick. Searches take at most
0.1sec, return the shortest path, and don't hang on impossible
paths. It does a breadth-first search from both ends at once,
starting from the start word and the end word and stopping where the
two frontiers meet somewhere in the middle.
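The original code is linked below; what follows is only a sketch of that bidirectional search, assuming an adjacent(word) helper that yields the words one letter away (one possible helper is sketched later, after the list of optimizations). The function and variable names are mine, not Ian's.

    from collections import deque

    def find_path(start, goal, adjacent):
        # Grow a frontier from each end, one level at a time, and stop as
        # soon as the two sides discover a common word.  A sketch of the
        # idea, not the original implementation.
        if start == goal:
            return [start]
        seen_fwd = {start: [start]}   # word -> path from the start word
        seen_bwd = {goal: [goal]}     # word -> path from the goal word
        frontier_fwd, frontier_bwd = deque([start]), deque([goal])
        while frontier_fwd and frontier_bwd:
            # Expand the smaller frontier; this keeps both sides shallow.
            if len(frontier_fwd) <= len(frontier_bwd):
                frontier, seen, other = frontier_fwd, seen_fwd, seen_bwd
            else:
                frontier, seen, other = frontier_bwd, seen_bwd, seen_fwd
            for _ in range(len(frontier)):
                word = frontier.popleft()
                for nxt in adjacent(word):
                    if nxt in seen:
                        continue
                    seen[nxt] = seen[word] + [nxt]
                    if nxt in other:
                        # The frontiers met: stitch the two half-paths together.
                        return seen_fwd[nxt] + seen_bwd[nxt][-2::-1]
                    frontier.append(nxt)
        return None   # no path exists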
The problem was that it was really slow to start up. Reading my
dictionary of 96 thousand words took about 53sec (and 166Mb of memory)
on my Athlon 650MHz PC. Once it is loaded it is really fast -- perfect
for a word-path-finding server. But I don't think this justifies a
persistent server.
Here is the original
code. I then set upon optimizing the start-up time. Here are the
things that worked well (a sketch combining them follows this list):
I was creating a dictionary that looked like {'hat':
Set(['_at', 'h_t', 'ha_']), ...}. Instead of calculating all
these permutations up-front, I calculated them as needed. Down to
31sec (22sec gained) and 124Mb.
Garbage collection was kicking in. I could tell because the
reading would stall at random times (when the GC was doing work). I
didn't need garbage collection while I was reading data in,
because I knew no circular garbage was being created (and not much
non-circular garbage either). Using the gc
module, I turned it off. Down to 17sec (14sec gained); it doesn't
help memory.
I used sets
in several places. One of the central data
structures was a dictionary that looked like {'_at': Set(['hat',
'bat', 'sat', ...]), ...}. But I never used set operations on
those values; I only iterated over them. I changed them to
lists. Down to 9sec (8sec gained) and 75Mb of memory.
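This is not the original load_words, just a sketch of what those three changes amount to together; the dictionary path and the names are my own illustration, and the adjacent helper is the one assumed by the search sketch above.

    import gc

    def load_words(path="/usr/share/dict/words"):
        # Build only the wildcard index, e.g. {'_at': ['hat', 'bat', ...], ...},
        # with plain lists as values, and with the cyclic collector paused
        # while the file is read.
        by_pattern = {}
        gc.disable()
        try:
            for line in open(path):
                word = line.strip()
                if word.endswith("'s"):
                    word = word[:-2]
                for i in range(len(word)):
                    key = word[:i] + "_" + word[i+1:]
                    by_pattern.setdefault(key, []).append(word)
        finally:
            gc.enable()
        return by_pattern

    by_pattern = load_words()

    def adjacent(word):
        # The per-word permutations are cheap to recompute on demand, so
        # nothing like {'hat': Set(['_at', 'h_t', 'ha_'])} is stored at all.
        for i in range(len(word)):
            for other in by_pattern.get(word[:i] + "_" + word[i+1:], []):
                if other != word:
                    yield other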
Those were all the effective changes I made. I tried some other
things, to little or no effect (a small benchmark sketch for trying
these comparisons yourself follows the list):
Using .setdefault(word_with_unders, []).append(word)
for creating the dictionary. Gained about 0.5sec over using
.has_key(word_with_unders). Using exceptions here lost me
7sec -- that should only be done when you are expecting very
few misses.
I iterate over range(len(word)) for each word. Since
words show up in the dictionary in sorted order, I tried caching the
range result (since I'd only need a handful of different
ranges for the entire run). No significant effect.
range vs. xrange: no difference.
Avoiding global lookups, by putting range=range in the argument
list (so range is found as a fast local instead of a global). No
significant gain.
"%s_%s" % (word[:i], word[i+1:]) vs. word[:i] +
word[i+1:]. Actually lost 2sec. Probably using %
instead of + for stings will only help when the strings are
long (these are very short).
The dictionary has 's at the end of some words. There
was no significant difference between if "'" in word: word =
word.split("'")[0] and if word.endswith("'s"): word =
word[:-2]. Using line.strip() to get rid of
\n wasn't significantly faster than line[:-1]
either.
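None of the numbers below are from the post; this is just a hypothetical micro-benchmark (the function names and key lengths are mine) for reproducing the dictionary-filling comparison. It mainly illustrates the miss-rate point: exceptions only pay off when nearly every key already exists.

    import random
    import string
    import timeit

    def fill_setdefault(keys):
        d = {}
        for k in keys:
            d.setdefault(k, []).append(k)

    def fill_check_first(keys):
        d = {}
        for k in keys:
            if k not in d:          # the has_key() check, in modern spelling
                d[k] = []
            d[k].append(k)

    def fill_try_except(keys):
        d = {}
        for k in keys:
            try:
                d[k].append(k)
            except KeyError:
                d[k] = [k]

    random.seed(0)
    for key_len in (3, 8):
        # Short keys repeat a lot (few misses); long keys are mostly new
        # (many misses).  The miss rate decides whether try/except is
        # cheap or painfully expensive.
        keys = [''.join(random.choice(string.ascii_lowercase) for _ in range(key_len))
                for _ in range(100000)]
        for fill in (fill_setdefault, fill_check_first, fill_try_except):
            t = timeit.timeit(lambda: fill(keys), number=5)
            print("key length %d, %-16s %.3fs" % (key_len, fill.__name__, t))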
The other big optimization changed the nature of the program slightly.
If you only want to test the path between two words (say hat
and bee), then you only need to load words of that length
(length 3 in this case). You can ignore all the other words. By
doing this I could load the necessary parts of the dictionary in about
1.5sec, and use only 36Mb (it depends somewhat on the word length,
of course). As a baseline, I can read the entire word list into a
Python dictionary in about 0.8sec.
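A sketch of that variant, again with an assumed dictionary path and my own names:

    def load_words_of_length(n, path="/usr/share/dict/words"):
        # Same wildcard index as before, but built only from words of
        # length n (the length of the two query words).
        by_pattern = {}
        for line in open(path):
            word = line.strip()
            if word.endswith("'s"):
                word = word[:-2]
            if len(word) != n:
                continue
            for i in range(n):
                by_pattern.setdefault(word[:i] + "_" + word[i+1:], []).append(word)
        return by_pattern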
By using Psyco I can
cut the 8sec (with all the little optimizations) down to 3.5sec, using
psyco.full(). It only brings the original down to 41sec (and
increases memory to 180Mb), which is still pretty slow. Psyco plus
turning GC off gives us 11sec, which is pretty good -- it does better
when we tell Psyco psyco.bind(load_words) (i.e., tell Psyco
to only try to optimize the one function: 11sec vs 15sec). In
contrast, psyco.bind doesn't even affect our optimized
load_words; only psyco.full helps there.
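For reference, the two ways of applying Psyco compared here look roughly like this. It is only a sketch: Psyco exists only for old Python 2.x interpreters, so on a modern Python the import simply fails, and load_words stands in for whatever function dominates start-up time.

    try:
        import psyco
    except ImportError:
        psyco = None

    if psyco is not None:
        psyco.full()                # specialize the whole program, or...
        # psyco.bind(load_words)    # ...just the one hot function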
I'm pretty happy with the finished program.
For the most part all the important, time-critical parts are already
in C -- they are dictionaries, sets, and lists. These data
structures make algorithm implementation in Python really easy and
scalable all at once. While you could write this in C for more
performance, it would take a lot of work just to get something
that worked, and a lot more work to match the speed of Python's
built-in data structures. Maybe a language like OCaml or Haskell could match (or beat)
Python's agility and performance. I'd be curious.