The Artima Developer Community

Python Buzz Forum
lazy Hamming

0 replies on 1 page.

maxim khesin

Posts: 251
Nickname: xamdam
Registered: Mar, 2005

Maxim Khesin is a developer for Liquidnet. I like C++ and Python, and attend a design patterns study group in NYC.
lazy Hamming Posted: Mar 24, 2005 1:17 AM

This post originated from an RSS feed registered with Python Buzz by maxim khesin.
Original Post: lazy Hamming
Feed Title: python and the web
Feed URL: http://feeds.feedburner.com/PythonAndTheWeb
Feed Description: blog dedicated to python and the networks we live in

In last month's C++ Users Journal, Andrew Koenig and Barbara Moo covered different ways of solving the "Hamming problem". For the uninitiated (like me), the problem is to produce, in increasing order, the sequence of all integers whose only prime factors are 2, 3 and 5. One of the nicer solutions presented was a piece of Haskell code that uses lazy evaluation. It looked something like this:

hamming = 1 : merge (f 2) (merge (f 3) (f 5))
  where
    f a = [ n*a | n <- hamming ]
    merge (a:x) (b:y)
      | a < b     = a : merge x (b:y)
      | a > b     = b : merge (a:x) y
      | otherwise = a : merge x y

(I could not get the actual code online, this version is borrowed from
http://moonbase.wwc.edu/~cs_dept/KU/PR/Haskell.html)
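
For a concrete picture of the sequence before any laziness comes into play, here is a tiny brute-force membership test (my own illustration, not from the article or the journal): a number belongs to the sequence exactly when repeatedly dividing out 2, 3 and 5 leaves 1.

# naive check: strip out all factors of 2, 3 and 5;
# whatever remains must be 1 for a Hamming number
def is_hamming(n):
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1

print [n for n in range(1, 31) if is_hamming(n)]
# prints [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30]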

It is no secret that Python's design has been influenced by Haskell, and remembering that we already have support for lazy sequences via generators, I attempted to translate this code into Python. My first take looked like this:


def scale_by(seq, n):
    # lazily yield every element of the (growing) list seq multiplied by n
    count = 0
    while 1:
        yield seq[count] * n
        count += 1

def generate_hamming_numbers(max):
    seq = [1]
    # three lazy views of the sequence so far, scaled by 2, 3 and 5
    i2, i3, i5 = scale_by(seq, 2), scale_by(seq, 3), scale_by(seq, 5)
    v2, v3, v5 = i2.next(), i3.next(), i5.next()
    while 1:
        smallest = min(v2, v3, v5)
        seq.append(smallest)
        # advance every scaled view that produced the value just emitted
        if v2 == smallest: v2 = i2.next()
        if v3 == smallest: v3 = i3.next()
        if v5 == smallest: v5 = i5.next()
        if smallest > max: break
        yield smallest

for i in generate_hamming_numbers(10000):
    print i

This version has a bit of a 'literal' flavor in the sense that an actual sequence is kept in a list, but it basically does the job. After looking around on da net I found the most serious discussion of Hamming on, surprise surprise, the Python mailing list. It offers some interesting alternatives and shows an 'evolutionary' improvement of the code by a number of people, which is nice to see. The exercise was very worthwhile, as it pushed me to catch up on some of the itertools stuff, particularly the tee() function. Here is the solution I like best, offered on the list by Nick Craig-Wood:

from itertools import *

def imerge(*generators):
    # merge several sorted generators into one sorted, duplicate-free stream
    values = [g.next() for g in generators]
    while True:
        x = min(values)
        yield x
        # advance every generator whose head equals the value just emitted,
        # so duplicates are collapsed
        for i in range(len(values)):
            if values[i] == x:
                values[i] = generators[i].next()

def hamming():
    def _hamming():
        yield 1
        # the sequence is defined in terms of three lazily scaled copies of itself
        for i in imerge(imap(lambda x: 2*x, hamming2),
                        imap(lambda x: 3*x, hamming3),
                        imap(lambda x: 5*x, hamming5)):
            yield i

    # tee() hands out four iterators over the same underlying generator
    hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
    return result

for i in hamming():
    print i
    if i > 10000: break

This is very cool. BTW, this is one of the things that confused me about tee(): in my head I kept wondering whether the copies of the iterator are going to evaluate the generator a second time (if you aren't bothered by this, it is probably my own brain quirk). Looking at the tee() pseudocode in the docs really helped: the answer is that only the 'front' iterator actually evaluates the generator; the rest simply look up the value in the cache. Of course in some algorithms the 'front' iterator can fall behind, but then some other iterator assumes the 'front' position.
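
To make that concrete, here is a much-simplified sketch of the caching idea, loosely in the spirit of the pure-Python equivalent shown in the itertools docs (my own illustration, not the actual implementation):

from collections import deque

def tee_sketch(iterable, n=2):
    # each copy gets its own queue of pending values; whichever copy is
    # currently in front advances the one real iterator and deposits the
    # new value into every queue, the others just read from their queue
    it = iter(iterable)
    queues = [deque() for i in range(n)]
    def gen(my_queue):
        while True:
            if not my_queue:            # this copy is the 'front' one
                newval = it.next()      # evaluate the underlying iterator once
                for q in queues:        # ...and cache the result for everybody
                    q.append(newval)
            yield my_queue.popleft()
    return tuple(gen(q) for q in queues)

a, b = tee_sketch(xrange(5))
print a.next(), a.next(), b.next()      # prints 0 1 0 -- b reads from the cache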
It is interesting that tee() apparently does preserve the entire sequence, which is what I did explicitly. I cannot prove it offhand, but it seems very intuitive that only the 3 highest values of the sequence need to be kept in memory at the same time. I wonder if there is a way of expressing this which is as elegant as the tee() solution...
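
One rough way to sketch that idea (my own attempt, not something from the mailing-list thread, and certainly not as elegant as tee()) is to go back to the explicit-list version, keep separate positions for the 2-, 3- and 5-multipliers, and discard every entry that all three positions have already moved past, so only a sliding window of the sequence stays in memory:

def hamming_window():
    # keep explicit positions i2, i3, i5 into the list of results and drop
    # every entry that all three positions have already passed
    seq = [1]
    i2 = i3 = i5 = 0        # absolute indices of the next value to multiply
    dropped = 0             # how many leading entries have been discarded
    yield 1
    while True:
        n = min(seq[i2 - dropped] * 2,
                seq[i3 - dropped] * 3,
                seq[i5 - dropped] * 5)
        seq.append(n)
        yield n
        # advance every multiplier that produced the value just emitted
        if seq[i2 - dropped] * 2 == n: i2 += 1
        if seq[i3 - dropped] * 3 == n: i3 += 1
        if seq[i5 - dropped] * 5 == n: i5 += 1
        while dropped < min(i2, i3, i5):   # nothing below the slowest index
            seq.pop(0)                     # can ever be needed again
            dropped += 1

for i in hamming_window():
    print i
    if i > 10000: break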
