Ruby Buzz Forum - Kata Two Worked -- Binary Chop

Articles |
News |
Weblogs |
Books |
Forums

Artima Forums | Articles | Weblogs | Java Answers | News

Unit Tests

Dave provided a set of unit tests for the binary search function. The tests below assume that the functions name is chop, and that it takes the target value and the array to search as arguments. Chop should return the index position in the array where the target value was found.

We put the tests into a separate module that will be included in the real test classes below. This is a useful technique when you write tests you wish to reuse in several test classes.

  require 'test/unit'

  module ChopTests
    def test_chop
      assert_equal(-1, chop(3, []))
      assert_equal(-1, chop(3, [1]))
      assert_equal(0,  chop(1, [1]))

      assert_equal(0,  chop(1, [1, 3, 5]))
      assert_equal(1,  chop(3, [1, 3, 5]))
      assert_equal(2,  chop(5, [1, 3, 5]))
      assert_equal(-1, chop(0, [1, 3, 5]))
      assert_equal(-1, chop(2, [1, 3, 5]))
      assert_equal(-1, chop(4, [1, 3, 5]))
      assert_equal(-1, chop(6, [1, 3, 5]))
      assert_equal(0,  chop(1, [1, 3, 5, 7]))
      assert_equal(1,  chop(3, [1, 3, 5, 7]))
      assert_equal(2,  chop(5, [1, 3, 5, 7]))
      assert_equal(3,  chop(7, [1, 3, 5, 7]))
      assert_equal(-1, chop(0, [1, 3, 5, 7]))
      assert_equal(-1, chop(2, [1, 3, 5, 7]))
      assert_equal(-1, chop(4, [1, 3, 5, 7]))
      assert_equal(-1, chop(6, [1, 3, 5, 7]))
      assert_equal(-1, chop(8, [1, 3, 5, 7]))
    end
  end

Variation #1: Iterative with Loop Invariants

I first read about using loop invariants on a binary search algorithm, so I thought it would be a good variation to start with.

If you have not used loop invariants in the past, they are a useful technique for understanding loops. You specify an condition that is true for every single iteration of the loop. This condition is called the loop invariant.

For the binary search, I used the following invariant.

INV :

lo <= Result && Result < hi

Where lo and hi are indices into the array and result is the location in the array where our target value lies (i.e. the result of the binary search function). All this invariant states is that the final result index will be between position lo and hi. (Note that I’m assuming that the target value is in the array. We will deal with that assumption later).

The invariant must be true when we start the loop, but that is easy to handle. Just set lo to zero (the result index must be greater than or equal to zero), and set hi to one more than the largest index value (list.size works for this).

The second piece you need is a loop termination condition. Each time through the loop we will adjust hi and lo so that they move closer together (without violating the loop invariant). When the difference between the hi and lo values is 1, we can exit the loop.

Here is the termination condition.

TERM :

(hi-lo) <= 1

When the loop exits, the condition TERM will be true. And the invariant INV will also be true (because it is true for every iteration). If TERM and INV are both true, then that implies that result must be equal to lo! Or in other words, when the loop exits, our search target value must be in the array indexed by lo (if it is in the array at all).

Using the above invariant and termination conditions, we can be certain that our loop will be correct (as long as they are faithfully implemented, which is not a minor problem).

Code

Here’s the code. Notice that INV will be true the first time we enter the loop. Inside the loop, we calculate the midpoint of the lo/hi indices, and compare the value at the midpoint to our target value.

If the value at the midpoint is larger than our target, then the range (lo..mid) still satisfies the invariant and we can adjust hi down to mid. Otherwise we can be sure that the range (mid..hi) satisfies the invariant, and we adjust lo up to the value of mid.

The "list[lo] == target" test at the end checks our assumption that the target value was in the array. If we found the value, we return the lo value. Otherwise we return -1.

  def chop_iterate(target, list)
    lo = 0
    hi = list.size
    while (hi-lo) > 1
      mid = (hi+lo) / 2
      if list[mid] > target
        hi = mid
      else
        lo = mid
      end
    end
    list[lo] == target ? lo : -1
  end

  class TestChopIterative < Test::Unit::TestCase
    alias chop chop_iterate
    include ChopTests
  end

Problems Encountered

So, did all that work figuring out a loop invariant help us? Perhaps. I did have one bug where a typo slipped into the calculation of the midpoint. I typed "(hi-lo)/2" by accident and had to scratch my head for a few minutes.

After fixing the typo, the rest of the algorithm worked without a hitch.

Where’s the Invariant?

An interesting thing happened to our invariant and termination conditions. The termination condition was inverted to "(hi-lo) > 1" instead of the "(hi-lo) <= 1" we originally used. That’s because the while statement expects a "stay-in-the-loop" test instead of a "get-out-of-the-loop-test".

But more importantly, our loop invariant has disappeared! It has no representation in the source code at all, except as an implicit constraint on the loop values. This is unfortunate because some important information is lost that should have been communicated to the reader.

The loop construct in the Eiffel language allows you to directly specify the loop invariants. The same loop in Eiffel would look something like this …

           from
               lo := list.lower
               hi := list.upper + 1
           invariant
               lower_limit: -- lo <= Result
               upper_limit: -- hi <  Result
           variant
               hi - lo
           until
               (hi-lo) <= 1
           loop
               mid := (hi+lo) // 2
               if list.at(mid) > target then
                   hi := mid
               else
                   lo := target
               end
           end

Although rather verbose, notice how every part of the loop has its proper place. The termination condition is expressed as an "until" condition, therefore it isn’t inverted.

The invariants are expressed directly. It turns out that our invariants cannot be checked at runtime (since we don’t know the actual value of Result until after the loop terminates), so we expressed them as comments. However, if they were executable, Eiffel would check them at runtime for us.

The "variant" section is interesting. The variant is an non-negative integer expression that must be smaller each time through the loop. Providing a valid variant proves that your loop eventually terminates.

Even the initialization of the loop variables (hi and lo) have their own section.

Next Time …

I was going to write up all 9 variations at once, but this is long enough now for a posting. I’ll hit the two recursive variations in the next installment.

Read: Kata Two Worked -- Binary Chop

Previous Topic


	Web Artima.com