This post originated from an RSS feed registered with Ruby Buzz
by Jim Weirich.
Original Post: Design by Contract and Unit Testing
Feed Title: { | one, step, back | }
Feed URL: http://onestepback.org/index.cgi/synopsis.rss
Feed Description: Jim Weirich's Blog on Software Development, Ruby, and whatever else sparks his interest.
I wrote this article on Design by Contract and Unit Testing to go
along with the EiffelUnit documentation. I'm no longer maintaining
EiffelUnit (I recommend using Gobo
Eiffel Test instead), but thought the article was worth publishing
independently. Hence this blog entry.
The article was written three years ago, so I added a postscript to
the conclusions that reflects my current
experience.
Design by Contract and Unit Testing
Got some questions about Design by Contract and unit testing?
I've been using Eiffel for almost three years, and have been
exploring some Extreme Programming
ideas for the past 6 months. In particular, I've been using XP's
"Test-First" desgin process in conjunction with Eiffel. Here are some
of my thoughts on how unit testing and Design by Contract fit
together. The presentation is in the form of a question/answer
session. If you have comments on this topic, I would be glad to here
them at jweirich@one.net.
The first part of this page covers Design by Contract for those who
are new to the idea. Eiffel programmers may want to skip directly to
the section on Unit testing. Those whose
time is limited may want to jump directly to the conclusions.
What is Design by Contract? Isn't it just adding assertions to
your code?
Although Design by Contract and assertions are very closely related,
DbC is more than just slapping a few assertions into your code at
strategic locations. It is about identifying the contract
under which your code will execute and you expect all clients to
adhere to. It is about clearly defining responsibilities between
client software and supplier software.
In short, Design by Contract starts by specifying the conditions
under which it is legal to call a method. It is the responsibility of
the client software to ensure these conditions (called preconditions)
are met.
Given that the preconditions are met, the method in the supplier
software guarantees that certion other conditions will be true when
the method returns. These are called postcondition, and are the
responsibility of the supplier code in ensure.
What's the advantages of identifying contracts?
By identifying the contract, we are clearly identifying the
responsibilities of each party in a collaboration. Knowing
responsibilities is a big help when the time comes to get two (or
more) modules to work together. Precondition violations mean that the
client is in error. Postcondition violations mean that the supplier
is incorrect.
How does this work with inheritance?
Very well.
Ok, seriously. Contracts are inherited by child classes. If a
child class overrides a method in a parent class, it needs to make
sure the method's new implementation conforms to the contract of the
parent class. By conforming, we mean that the child class is not
allowed to tighten the preconditions of the parent class, or is it
allowed to loosen the postconditions. We remember this by using the
slogan...
Require no more, promise no less.
By the way, Eiffel automatically handles preconditions and
postconditions in child classes without any need to copy and paste
them from the parent.
This sounds complicated. Is it really useful?
Absolutely! By writing child classes so that they conform to the
contract of their parent, we ensure that our classes obey the Liskov
Substitution Principle.
So where do the assertions come in?
Once we have identified the contract, we can programmatically check
that everybody play by the rules by checking for preconditions at the
start of a method and checking for postconditions at the exit.
Note that the checking (assertions) are merely a way of verifying
that everyone is playing by the rules. We should be able to remove
the assertions without effecting program correctness. In fact, Eiffel
allows you to specify several different levels of assertion checking.
Unit testing (at least from the Extreme Programming standpoint) is a
set of executable code that exercises the production code, with the
intent of checking it for correctness. Unit tests are automated and run
without human interaction.
Eiffel has Design by Contract. Why do we need unit testing?
There are several advantages to running unit tests, even in a Design
by Contract environment.
Better code coverage. Unit tests provide more thorough
code coverage than what a typical application would do. DbC
assertions are useless if the code is never called.
Repeatable. Unit tests are automatic and repeatable.
You don't have to rely on a user manually running a program.
Reportable. The results of the unit test are easily
summarized for reporting (e.g. 2 out of 200 unit tests failed).
Ok, so if we have unit tests, why do we still need DbC?
Although very similar in many ways, there are still some fundamental
differences between DbC assertion checking and unit testing. Unit
tests generally focus on a single module (class) and don't exercise
modules (classes) in concert together. In particular, unit test are a
poor vehicle for checking preconditions.
Wait a minute! I can check preconditions in my unit test!
When someone says they are using unit tests to check preconditions,
what they really mean is that the unit tests check to make sure code
is well behaved in the presence of bad data. They check to make sure
that an exception is thrown or that a error flag is set, or even that
the bad data is appropriately ignored.
From a Design by Contract perspective, this is not testing
preconditions. Preconditions define when it is legal to invoke a
particular method. According to DbC, calling a method when its
preconditions are not established results in undefined
behavior. How can you test for undefined behavior?
What the unit tester has really done is changed the contract on the
tested code. They replaced the existing precondition with
True (the universal precondition ... methods with a
precondition of True may be called at any time).
They also changed the postcondition to a stronger, more complicated
one.
Can you give me an example?
Ok, here is a version of the stack operation push
with a typical DbC contract.
push (value: INTEGER)
requirenot is_full
ensure
count = old count + 1
top = value
When you do robustness checking in a unit test, you generally
transform the contract into something slightly different. This is
often called "defensive programming".
push (value: INTEGER)
-- Note: no preconditionensure
success: (notold is_full) implies
((count = old count + 1) and
(top = value))
failure: (old is_full) implies
((count = old count) and
(top = old top))
error_set: old is_full = error
In this version, we added an error flag. We could have also specified
an exception should be thrown.
But isn't defensive programming good thing?
As you can see, the defensive programming version has a much more
complicated postcondition. Some might defend this by saying that all
we have done is pushed existing complexity into the push
method, and that the client code can be simplified.
Unfortunately, that's not necessarily the case.
-- Design by Contract versionifnot stack.is_full then
stack.push (value)
else-- handle full stackend
-- Defensive Programming Version
stack.push (value)
if stack.error then-- handle full stackend
Defensive programming generally doesn't simplify the client, it just
makes him check for errors after the fact.
Defensive programming can also hide problems in the code. By
ignoring bad values, bad data can propagate further in a program.
But what does this have to do with precondition checking?
Unit tests focus on verifying that a single class functions properly.
Problems between modules are not generally uncovered by unit tests.
Extreme programming uses functional (or acceptance) tests to verify
end to end correctness in their programs. But since functional tests
only check the final result, they give little feedback about what is
going on inside the program.
By enabling contract precondition checking, we can immediately
detect interface irregularities whenever they happen, whether in
function tests or unit tests.
Ok, checking contract preconditions sounds useful. But what about
postconditions?
Like unit tests, DbC postcondition assertions check the results of
invoking a method. In other words, they are check the validity of a
single class, and not the interface as precondition checking will do.
Although they are checking the same kind of behavior, their focus
is different. DbC postconditions are general and answer the question
of how a method responds under all possible legal conditions. Unit
tests focus on how a method responds under certain, specific
situations specified in the test.
Because DbC postconditions are general, they are able to check
every invocation of the method for the proper behavior. They can be
run during unit testing, but they also are in effect during functional
testing (and normal program execution, assuming they aren't disabled),
checking each and every invocation of the method. In contrast, unit
tests only check a limited number of specific invocations.
Because they are more general, DbC postconditions are more
difficult to write. In fact, sometimes the postcondition of a method
can be more complicated that the actual code of the body.
Can you give me an example of a complicated postcondition?
Ok. Here is the resize method from the ARRAY class in the ELKS
2000 library proposal. The post condition carefully describes the
resulting state of the array after the invocation of the resize
method. Note that it uses recursive calls in several places and
that it carefully handles all combinations of new/old array bounds.
resize (min_index, max_index: INTEGER)
-- Resize to bounds `min_index' and `max_index'.-- Do not lose any item whose index is in both-- `lower..upper' and `min_index..max_index'.require
valid_bounds: min_index <= max_index + 1
ensure
new_lower: lower = min_index
new_upper: upper = max_index
default_if_too_small:
min_index < old lower implies subarray
(min_index, max_index.min (old lower - 1)).all_default
default_if_too_large:
max_index > old upper implies subarray
(min_index.max (old upper + 1), max_index).all_default
stable_in_intersection:
subarray ((min_index.max (old lower)).min(old upper + 1),
(max_index.min (old upper)).max(old lower - 1)).same_items
(old subarray ((min_index.max (lower)).min(upper + 1),
(max_index.min (upper)).max(lower - 1)))
Wow, what would the unit test code for the same thing look
like?
Here is a unit test for resize, written in EiffelUnit. Note
that assert_array(lo,hi,expected,actual) is a convenience
method that asserts that the array `actual' has the given lower and
upper bounds and contains the same elements as `expected'.
The number of lines in the test case is actually longer that the
postcondition above. But each individual test is simple and direct,
and easy to verify that it is correct.
Note for non-Eiffel programmers:The form <<1,
2>> is a manifest array of two elements.
Any other postcondition examples?
I'm glad you asked. Consider the push and pop methods
on a STACK. push is fairly easy to specify using DbC...
push (value: INTEGER)
ensure
count = old count + 1
top = value
But pop is more difficult.
popensure
count = old count - 1
-- What is the value of `top'
What is the value of the top of the stack after pop completes?
Well, it depends upon the history of pushing and popping leading up to
a given call to pop. The logic required to write a
postcondition specifying the value of pop
Unit tests don't suffer the same problem. Since we are only
testing a specific scenario, we know what the value of the
top of the stack should be.
We see that although unit tests and Design by Contract assertions
have a lot of overlap, but each has a different focus and different
strong points. DbC preconditions are useful for checking inter-module
correctness. Unit testing and DbC postconditions validate individual
module behavior, with unit testing focusing on specific situations an
DbC postconditions focusing on general behavior.
So, what does this mean in practice? I find that when I am doing
Eiffel, I tend to specify preconditions along with a good set of unit
tests. If the postconditions are obvious and easy to write, I will
include them (mainly because postcondition are included in the short
form view of a class). If the postconditions are difficult to specify
in general, I might include them in comment form for the short view.
When I'm using something Java, or some other language with weaker
support for Design by Contract, I definitely would use unit tests for
checking individual modules. Depending on the availability of tools
(e.g. some kind of assert mechanism), I would still do some level of
precondition specification.
Postscript
At the time of writing the original text (sometime in 2000), I was
doing a lot of DbC and Eiffel programming and had just started using
XP style unit tests. Today (sometime in 2003) I am no longer writing
Eiffel code and doing very little "formal" DbC. I find that I still
think of interfaces in terms of contracts with pre/post conditions,
but instead using Eiffel-like assertions to define the
pre/post-condition assertions, I now use unit tests to drive and
define the shape of the software.
Although the unit tests have a different focus than assertions (as
noted in the article), I find that they are quite suitable for my
needs and rarely find it necessary to supplement the coce with any
additional pre/post-condition assertions.