Articles |
News |
Weblogs |
Books |
Forums

Artima Forums | Articles | Weblogs | Java Answers | News

Introduction

I wrote this article on Design by Contract and Unit Testing to go along with the EiffelUnit documentation. I'm no longer maintaining EiffelUnit (I recommend using Gobo Eiffel Test instead), but thought the article was worth publishing independently. Hence this blog entry.
The article was written three years ago, so I added a postscript to the conclusions that reflects my current experience.

Design by Contract and Unit Testing

Got some questions about Design by Contract and unit testing?

I've been using Eiffel for almost three years, and have been exploring some Extreme Programming ideas for the past 6 months. In particular, I've been using XP's "Test-First" desgin process in conjunction with Eiffel. Here are some of my thoughts on how unit testing and Design by Contract fit together. The presentation is in the form of a question/answer session. If you have comments on this topic, I would be glad to here them at jweirich@one.net.

The first part of this page covers Design by Contract for those who are new to the idea. Eiffel programmers may want to skip directly to the section on Unit testing. Those whose time is limited may want to jump directly to the conclusions.

What is Design by Contract? Isn't it just adding assertions to your code?

Although Design by Contract and assertions are very closely related, DbC is more than just slapping a few assertions into your code at strategic locations. It is about identifying the contract under which your code will execute and you expect all clients to adhere to. It is about clearly defining responsibilities between client software and supplier software.

In short, Design by Contract starts by specifying the conditions under which it is legal to call a method. It is the responsibility of the client software to ensure these conditions (called preconditions) are met.

Given that the preconditions are met, the method in the supplier software guarantees that certion other conditions will be true when the method returns. These are called postcondition, and are the responsibility of the supplier code in ensure.

What's the advantages of identifying contracts?

By identifying the contract, we are clearly identifying the responsibilities of each party in a collaboration. Knowing responsibilities is a big help when the time comes to get two (or more) modules to work together. Precondition violations mean that the client is in error. Postcondition violations mean that the supplier is incorrect.

How does this work with inheritance?

Very well.

Ok, seriously. Contracts are inherited by child classes. If a child class overrides a method in a parent class, it needs to make sure the method's new implementation conforms to the contract of the parent class. By conforming, we mean that the child class is not allowed to tighten the preconditions of the parent class, or is it allowed to loosen the postconditions. We remember this by using the slogan...

Require no more, promise no less.

By the way, Eiffel automatically handles preconditions and postconditions in child classes without any need to copy and paste them from the parent.

This sounds complicated. Is it really useful?

Absolutely! By writing child classes so that they conform to the contract of their parent, we ensure that our classes obey the Liskov Substitution Principle.

So where do the assertions come in?

Once we have identified the contract, we can programmatically check that everybody play by the rules by checking for preconditions at the start of a method and checking for postconditions at the exit.

Note that the checking (assertions) are merely a way of verifying that everyone is playing by the rules. We should be able to remove the assertions without effecting program correctness. In fact, Eiffel allows you to specify several different levels of assertion checking.

What is unit testing?

Unit testing (at least from the Extreme Programming standpoint) is a set of executable code that exercises the production code, with the intent of checking it for correctness. Unit tests are automated and run without human interaction.

Eiffel has Design by Contract. Why do we need unit testing?

There are several advantages to running unit tests, even in a Design by Contract environment.

Better code coverage. Unit tests provide more thorough code coverage than what a typical application would do. DbC assertions are useless if the code is never called.
Repeatable. Unit tests are automatic and repeatable. You don't have to rely on a user manually running a program.
Reportable. The results of the unit test are easily summarized for reporting (e.g. 2 out of 200 unit tests failed).

Ok, so if we have unit tests, why do we still need DbC?

Although very similar in many ways, there are still some fundamental differences between DbC assertion checking and unit testing. Unit tests generally focus on a single module (class) and don't exercise modules (classes) in concert together. In particular, unit test are a poor vehicle for checking preconditions.

Wait a minute! I can check preconditions in my unit test!

When someone says they are using unit tests to check preconditions, what they really mean is that the unit tests check to make sure code is well behaved in the presence of bad data. They check to make sure that an exception is thrown or that a error flag is set, or even that the bad data is appropriately ignored.

From a Design by Contract perspective, this is not testing preconditions. Preconditions define when it is legal to invoke a particular method. According to DbC, calling a method when its preconditions are not established results in undefined behavior. How can you test for undefined behavior?

What the unit tester has really done is changed the contract on the tested code. They replaced the existing precondition with True (the universal precondition ... methods with a precondition of True may be called at any time). They also changed the postcondition to a stronger, more complicated one.

Can you give me an example?

Ok, here is a version of the stack operation push with a typical DbC contract.

    push (value: INTEGER)
        require
            not is_full
        ensure
            count = old count + 1
            top = value

When you do robustness checking in a unit test, you generally transform the contract into something slightly different. This is often called "defensive programming".

    push (value: INTEGER)
        -- Note: no precondition
        ensure
            success: (not old is_full) implies
                         ((count = old count + 1) and
                          (top = value))
            failure: (old is_full) implies
                         ((count = old count) and
                          (top = old top))
            error_set: old is_full = error

In this version, we added an error flag. We could have also specified an exception should be thrown.

But isn't defensive programming good thing?

As you can see, the defensive programming version has a much more complicated postcondition. Some might defend this by saying that all we have done is pushed existing complexity into the push method, and that the client code can be simplified.

Unfortunately, that's not necessarily the case.

  -- Design by Contract version
  if not stack.is_full then
      stack.push (value)
  else
      -- handle full stack
  end

  -- Defensive Programming Version
  stack.push (value)
  if stack.error then
      -- handle full stack
  end

Defensive programming generally doesn't simplify the client, it just makes him check for errors after the fact.

Defensive programming can also hide problems in the code. By ignoring bad values, bad data can propagate further in a program.

But what does this have to do with precondition checking?

Unit tests focus on verifying that a single class functions properly. Problems between modules are not generally uncovered by unit tests. Extreme programming uses functional (or acceptance) tests to verify end to end correctness in their programs. But since functional tests only check the final result, they give little feedback about what is going on inside the program.

By enabling contract precondition checking, we can immediately detect interface irregularities whenever they happen, whether in function tests or unit tests.

Ok, checking contract preconditions sounds useful. But what about postconditions?

Like unit tests, DbC postcondition assertions check the results of invoking a method. In other words, they are check the validity of a single class, and not the interface as precondition checking will do.

Although they are checking the same kind of behavior, their focus is different. DbC postconditions are general and answer the question of how a method responds under all possible legal conditions. Unit tests focus on how a method responds under certain, specific situations specified in the test.

Because DbC postconditions are general, they are able to check every invocation of the method for the proper behavior. They can be run during unit testing, but they also are in effect during functional testing (and normal program execution, assuming they aren't disabled), checking each and every invocation of the method. In contrast, unit tests only check a limited number of specific invocations.

Because they are more general, DbC postconditions are more difficult to write. In fact, sometimes the postcondition of a method can be more complicated that the actual code of the body.

Can you give me an example of a complicated postcondition?

Ok. Here is the resize method from the ARRAY class in the ELKS 2000 library proposal. The post condition carefully describes the resulting state of the array after the invocation of the resize method. Note that it uses recursive calls in several places and that it carefully handles all combinations of new/old array bounds.

   resize (min_index, max_index: INTEGER)
      -- Resize to bounds `min_index' and `max_index'.
      -- Do not lose any item whose index is in both
      -- `lower..upper' and `min_index..max_index'.
   require
      valid_bounds: min_index <= max_index + 1
   ensure
      new_lower: lower = min_index
      new_upper: upper = max_index
      default_if_too_small:
         min_index < old lower implies subarray
          (min_index, max_index.min (old lower - 1)).all_default
      default_if_too_large:
         max_index > old upper implies subarray 
          (min_index.max (old upper + 1), max_index).all_default
      stable_in_intersection:
         subarray ((min_index.max (old lower)).min(old upper + 1), 
            (max_index.min (old upper)).max(old lower - 1)).same_items
               (old subarray ((min_index.max (lower)).min(upper + 1), 
               (max_index.min (upper)).max(lower - 1)))

Wow, what would the unit test code for the same thing look like?

Here is a unit test for resize, written in EiffelUnit. Note that assert_array(lo,hi,expected,actual) is a convenience method that asserts that the array `actual' has the given lower and upper bounds and contains the same elements as `expected'.

The number of lines in the test case is actually longer that the postcondition above. But each individual test is simple and direct, and easy to verify that it is correct.

    array: ARRAY[INTEGER]

    test_same_range is
        do
            array := <<1, 2, 3>>;
            array.resize (1,3)
            assert_array (1, 3, <<1, 2, 3>>, array)
        end

    test_included_range is
        do
            array := <<1, 2, 3>>;
            array.resize (1,1)
            assert_array (1, 1, <<1>>, array)
        end

    test_big_range is
        do
            array := <<1, 2, 3>>;
            array.resize (0, 5)
            assert_array (0, 5, <<0, 1, 2, 3, 0, 0>>, array)
        end

    test_high_overlap is
        do
            array := <<1, 2, 3>>;
            array.resize (2, 5)
            assert_array (2, 5, <<2, 3, 0, 0>>, array)
        end

    test_low_overlap is
        do
            array := <<1, 2, 3>>;
            array.resize (-1, 1)
            assert_array (-1, 1, <<0, 0, 1>>, array)
        end

    test_no_overlap is
        do
            array := <<1, 2, 3>>;
            array.resize (5, 7)
            assert_array (5, 7, <<0, 0, 0>>, array)
        end

Note for non-Eiffel programmers: The form <<1, 2>> is a manifest array of two elements.

Any other postcondition examples?

I'm glad you asked. Consider the push and pop methods on a STACK. push is fairly easy to specify using DbC...

    push (value: INTEGER)
        ensure
            count = old count + 1
            top = value

But pop is more difficult.

    pop
        ensure
            count = old count - 1
            -- What is the value of `top'

What is the value of the top of the stack after pop completes? Well, it depends upon the history of pushing and popping leading up to a given call to pop. The logic required to write a postcondition specifying the value of pop

Unit tests don't suffer the same problem. Since we are only testing a specific scenario, we know what the value of the top of the stack should be.

    test_push_and_pop is
        do
            stack.push(5)
            assert_equal ("push5", 5, stack.top)
            stack.push(123)
            assert_equal ("push123", 123, stack.top)
            stack.pop
            assert_equal ("pop5", 5, stack.top)
            stack.pop
            assert ("empty", stack.is_empty)
        end

Conclusion

We see that although unit tests and Design by Contract assertions have a lot of overlap, but each has a different focus and different strong points. DbC preconditions are useful for checking inter-module correctness. Unit testing and DbC postconditions validate individual module behavior, with unit testing focusing on specific situations an DbC postconditions focusing on general behavior.

So, what does this mean in practice? I find that when I am doing Eiffel, I tend to specify preconditions along with a good set of unit tests. If the postconditions are obvious and easy to write, I will include them (mainly because postcondition are included in the short form view of a class). If the postconditions are difficult to specify in general, I might include them in comment form for the short view.

When I'm using something Java, or some other language with weaker support for Design by Contract, I definitely would use unit tests for checking individual modules. Depending on the availability of tools (e.g. some kind of assert mechanism), I would still do some level of precondition specification.

Postscript

At the time of writing the original text (sometime in 2000), I was doing a lot of DbC and Eiffel programming and had just started using XP style unit tests. Today (sometime in 2003) I am no longer writing Eiffel code and doing very little "formal" DbC. I find that I still think of interfaces in terms of contracts with pre/post conditions, but instead using Eiffel-like assertions to define the pre/post-condition assertions, I now use unit tests to drive and define the shape of the software.
Although the unit tests have a different focus than assertions (as noted in the article), I find that they are quite suitable for my needs and rarely find it necessary to supplement the coce with any additional pre/post-condition assertions.

...

Read: Design by Contract and Unit Testing

Previous Topic


	Web Artima.com