This post originated from an RSS feed registered with Ruby Buzz
by Jim Weirich.
Original Post: Kata Two Worked -- Binary Chop
Feed Title: { | one, step, back | }
Feed URL: http://onestepback.org/index.cgi/synopsis.rss
Feed Description: Jim Weirich's Blog on Software Development, Ruby, and whatever else sparks his interest.
PragDave has put together some
programming "practice" exericises that he calls Katas. In Kata
number two
he suggests writing a binary search function, not once, but five different
ways. Now I’ve implemented a binary search several times, but writing
it five times in five different ways was intriguing.
Here’s my attempt at Variations on a Binary Search Theme.
Unit Tests
Dave provided a set of unit tests for the binary search function. The tests
below assume that the functions name is chop, and that it takes
the target value and the array to search as arguments. Chop should
return the index position in the array where the target value was found.
We put the tests into a separate module that will be included in the real
test classes below. This is a useful technique when you write tests you
wish to reuse in several test classes.
I first read about using loop invariants on a binary search algorithm, so I
thought it would be a good variation to start with.
If you have not used loop invariants in the past, they are a useful
technique for understanding loops. You specify an condition that is true
for every single iteration of the loop. This condition is called the loop
invariant.
For the binary search, I used the following invariant.
INV :
lo <= Result && Result < hi
Where lo and hi are indices into the array and
result is the location in the array where our target value lies
(i.e. the result of the binary search function). All this invariant states
is that the final result index will be between position lo and
hi. (Note that I’m assuming that the target value is in the
array. We will deal with that assumption later).
The invariant must be true when we start the loop, but that is easy to
handle. Just set lo to zero (the result index must be greater than
or equal to zero), and set hi to one more than the largest index value
(list.size works for this).
The second piece you need is a loop termination condition. Each time
through the loop we will adjust hi and lo so that they
move closer together (without violating the loop invariant). When the
difference between the hi and lo values is 1, we can exit the loop.
Here is the termination condition.
TERM :
(hi-lo) <= 1
When the loop exits, the condition TERM will be true. And the
invariant INV will also be true (because it is true for every
iteration). If TERM and INV are both true, then that implies
that result must be equal to lo! Or in other words, when the loop
exits, our search target value must be in the array indexed by lo
(if it is in the array at all).
Using the above invariant and termination conditions, we can be certain
that our loop will be correct (as long as they are faithfully implemented,
which is not a minor problem).
Code
Here’s the code. Notice that INV will be true the first time we enter
the loop. Inside the loop, we calculate the midpoint of the lo/hi indices,
and compare the value at the midpoint to our target value.
If the value at the midpoint is larger than our target, then the range
(lo..mid) still satisfies the invariant and we can adjust hi down
to mid. Otherwise we can be sure that the range (mid..hi)
satisfies the invariant, and we adjust lo up to the value of
mid.
The "list[lo] == target" test at the end checks our assumption
that the target value was in the array. If we found the value, we return
the lo value. Otherwise we return -1.
def chop_iterate(target, list)
lo = 0
hi = list.size
while (hi-lo) > 1
mid = (hi+lo) / 2
if list[mid] > target
hi = mid
else
lo = mid
end
end
list[lo] == target ? lo : -1
end
class TestChopIterative < Test::Unit::TestCase
alias chop chop_iterate
include ChopTests
end
Problems Encountered
So, did all that work figuring out a loop invariant help us? Perhaps. I did
have one bug where a typo slipped into the calculation of the midpoint. I
typed "(hi-lo)/2" by accident and had to scratch my head
for a few minutes.
After fixing the typo, the rest of the algorithm worked without a hitch.
Where’s the Invariant?
An interesting thing happened to our invariant and termination conditions.
The termination condition was inverted to "(hi-lo) >
1" instead of the "(hi-lo) <= 1" we
originally used. That’s because the while statement expects a
"stay-in-the-loop" test instead of a
"get-out-of-the-loop-test".
But more importantly, our loop invariant has disappeared! It has no
representation in the source code at all, except as an implicit constraint
on the loop values. This is unfortunate because some important information
is lost that should have been communicated to the reader.
The loop construct in the Eiffel language allows you to directly specify
the loop invariants. The same loop in Eiffel would look something like this
…
from
lo := list.lower
hi := list.upper + 1
invariant
lower_limit: -- lo <= Result
upper_limit: -- hi < Result
variant
hi - lo
until
(hi-lo) <= 1
loop
mid := (hi+lo) // 2
if list.at(mid) > target then
hi := mid
else
lo := target
end
end
Although rather verbose, notice how every part of the loop has its proper
place. The termination condition is expressed as an "until"
condition, therefore it isn’t inverted.
The invariants are expressed directly. It turns out that our invariants
cannot be checked at runtime (since we don’t know the actual value of
Result until after the loop terminates), so we expressed them as comments.
However, if they were executable, Eiffel would check them at
runtime for us.
The "variant" section is interesting. The variant is an
non-negative integer expression that must be smaller each time through the
loop. Providing a valid variant proves that your loop eventually
terminates.
Even the initialization of the loop variables (hi and lo)
have their own section.
Next Time …
I was going to write up all 9 variations at once, but this is long enough
now for a posting. I’ll hit the two recursive variations in the next
installment.