This post originated from an RSS feed registered with Ruby Buzz
by Christian Neukirchen.
Original Post: Lazy Streams for Ruby
Feed Title: chris blogs: Ruby stuff
Feed URL: http://chneukirchen.org/blog/category/ruby.atom
Feed Description: a weblog by christian neukirchen - Ruby stuff
On #ruby-lang, we talked about
SICP
yesterday, and that reminded me I didn’t look into this wonderful book
for a long time. I had enough time today to open it…
Quickly skipping over the first two chapters that I remembered fairly
well (and that are “easy”), I started reading chapter 3 Modularity,
Objects, and
State.
The section on
Streams
finally made me switch to my always-open Emacs window and port that
beautiful Scheme into even better Ruby. :-)
In short, streams can be explained like this:
From an abstract point of view, a stream is simply a
sequence. However, we will find that the straightforward
implementation of streams as lists doesn’t fully reveal the power of
stream processing. As an alternative, we introduce the technique of
delayed evaluation, which enables us to represent very large (even
infinite) sequences as streams.
Stream processing lets us model systems that have state without ever
using assignment or mutable data.
What a great target for an functional installment in Ruby! Should you
not be any familiar with streams, I recommend you to read above
chapter now, and come back after. Or just read on, if you want to
bite into the code.
To make use of Ruby’s object-orientation, I decided to make each
string-cons instances of class Stream. A Stream contains two
objects, car and cdr, whereas cdr is evaluated lazily. In Ruby, we
do this using a block:
class Stream
def initialize(car, &cdr)
@car = car
@cdr = cdr
end
def car; @car; end
def cdr; @cdr.call; end
end
That way of creating new conses is not very beautiful, but everything
we get in Ruby (no macros…).
We will also need a few helper methods:
class Stream
def self.interval(low, high)
if low > high
nil
else
new(low) { interval low+1, high }
end
end
def filter(&pred)
if pred.call car
Stream.new(car) { cdr.filter &pred }
else
cdr.filter &pred
end
end
def take(n)
stream = self
results = []
n.times {
results << stream.car
stream = stream.cdr
}
results
end
end
Stream.interval(1, 100) will return a new stream that will evaluate
in the numbers of 1 to 100.
Stream#filter returns a new stream that only contains elements that
satisfy a given predicate. Lazily of course.
Stream#take finally will evaluate the string n times and return an
Array of the evaluated values.
Now, let’s do an example from SICP,
calculate the third prime between 10000 and 1000000.
If this was evaluated eagerly, you still
wouldn’t see the result. Instead, we simply run:
def prime?(n)
2.upto(Math.sqrt(n).round) { |i| return false if n % i == 0 }
true
end
Stream.interval(10000, 1000000).filter { |s| prime? s }.take(3).last
# => 10037
The answer comes instantly. Why is this? In the end, the stream
behaves just a loop: We loop over all the integers in the range, and,
basically, trackback if the numbers aren’t primes. Now, we do that
only three times, because that often the stream will be evaluated.
Therefore, we only test 37 numbers for being prime, not 999000.
Another nice example are fibonacci numbers, where each element—I
hope you knew that already—is the sum of its successors. But first,
we need to add two streams:
class Stream
def self.fibonacci
new(0) { new(1) { fibonacci.cdr + fibonacci } }
end
def +(other)
Stream.new(self.car + other.car) { self.cdr + other.cdr }
end
end
Unfortunately, lazyness with lambdas is not exactly a wonder of speed
in Ruby, therefore Streams are, beyond their very interesting
properties, not of much use for the usual Ruby programmer. In purely
functional languages (e.g. Haskell), where lazy evaluation is a common
thing, they are however a very common way of implementing state
without actually changing things.