I was interested to read "Don't Let Yourself Get Unitized", I guess
because he refers to problems with JUnit, and unittest is based on
JUnit, and I don't like unittest that much.
But the article turns into a curmudgeonly rant and his underlying
resentment of unit testing comes out:
Consider the goals and tenets of unit testing:
- Very small "units" are tested
- Testing is almost always done of individual components in
isolation from other components
- Mocking strengthens the isolation aspect
- The code and the tests are almost always written by the same person
Taken together this means that unit tests are testing the lowest
level pieces of your code, each in turn and in isolation from all
other pieces, and the definition of the tests and the code are
done by the same person.
This sort of testing catches what I consider "low hanging
fruit". It catches problems-in-the-small. It'll find individual
methods or classes which don't match what the unit tests say
should happen.
This is a good thing and provides very valuable feedback on the
correctness of your code. But keep in mind it _only_ catches low
hanging fruit. By design, unit testing is supposed to be easy, and
to consider individual small pieces of a system in
isolation. Because of this, by its very nature, unit testing does
not consider the _composition_ of a system, only its individual
parts. Unit tests never check the interconnections of an application; they never check how its pieces are wired together.
In my experience, the interconnections and "wiring" of an app are where most of the complexity of the application lies. The wiring
defines your design, and if considered at a high enough level it
can even be considered to capture your architecture. How
information flows across many software layers and between many
components really defines what an application does. And the very
definition of Unit Testing is that it does not test these aspects
of an application. Unit testing ignores information flow across
software layers and components, ignores how classes and objects
are interrelated and are put together into larger designs and
architecture. This means that unit tests can catch simple errors
in individual pieces of code, but say nothing whatsoever about
your system's design or architecture. And what makes or breaks an
application really is the overall design and architecture. The
design and architecture capture your system's performance, its memory use, the "end-to-end" correctness from the user's inputs
out to whatever servers you might be using, and the round trip
back again. How all the wiring interconnects shows the true system
behavior, and it is in this area where the toughest bugs and
problems lie, and where people sweat blood to get things
right. Writing individual components in isolation is easy. It's
hooking them together into a cohesive whole that's hard - and unit
tests only pass judgement on the individual parts in isolation,
not the whole.
Getting one component to act "correctly" in a system is almost
always a pretty trivial exercise. Writing one component in
isolation is not the difficult part of computer programming. Any
single small component of a system is generally easy to code. The
hard part of development comes in getting all of the components of
a system to work together - to get the wiring right. Unit tests
can verify that each of your individual components does what you, the developer, think it should do. But by its very definition,
unit testing cannot check the more complex "wiring" - and the
wiring is where most of our design, development, and debugging time goes.
There are also some weird digs at Martin Fowler (and, I guess, at all consultants), which seem out of the blue and a little mean. The guy has a chip on his
shoulder. But anyway, I'll respond with my thinking on unit tests.
First: "Getting one component to act 'correctly' in a system is almost always a pretty trivial exercise." Sure, getting it to act "correctly" is easy. Getting a component to act correctly, without the scare quotes, is much harder. If your component acts correctly, all that wiring will work correctly too (by definition). When he's talking about "correctly", I think he really means "according to some spec that
was delivered to the programmer." This is a symptom of defensive
programming, where the programmer isn't at fault if there's a bug in
the spec, or a bug in the larger design, or what-have-you. A
responsible programmer cares about the success of the larger project,
and judges correctness based on the utility and reliability of their
code in that larger project, so there is no "correct" that does not
include the system.
But another error is in how he views "units". A unit test is small
and isolated, and tests a small amount of code. But a "small amount
of code" is a matter of perspective. I shudder to think about the
amount of code involved in the Python expression a =
some_dict.copy(). There's the Python compiler, and VM, and the
dictionary implementation, and the code behind classes and types, and
underneath that is libc and the kernel and who knows what. And the
combinations! Most of those pieces have multiple implementations;
alternate dictionary-like objects, alternate VMs, many underlying
platforms. We have created ourselves a Tower of Babel, an incredible
monument to abstraction. But God hasn't struck us down... well, we
have been "cursed" with a multitude of languages, but our efforts have
not collapsed.
And yet despite all that underlying code, that one line is too small
to be unit tested. So how did we get there? For the purposes of what we're testing, we ignore the code and trust the abstractions. All
those pieces underneath it are mature and well tested, having
undergone years of development and testing by a large number of
developers.
When I create a system, I have to build those same kinds of
abstractions, reliable pieces that I can trust. When I later create
code that depends on those libraries, I test that code, I do not test
the library.
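To make that concrete, here's a minimal sketch using unittest; the Inventory class and everything in it is made up for illustration, not taken from any real project. The test checks the contract of my own code and simply trusts dict.copy() underneath it:

    import unittest

    class Inventory:
        """Hypothetical example: a thin layer of my own code over a dict."""
        def __init__(self, items=None):
            self._items = dict(items or {})

        def add(self, name, count=1):
            self._items[name] = self._items.get(name, 0) + count

        def snapshot(self):
            # Relies on dict.copy(); the dict is trusted, not tested here.
            return self._items.copy()

    class InventoryTest(unittest.TestCase):
        def test_snapshot_is_independent(self):
            inv = Inventory({"apples": 2})
            snap = inv.snapshot()
            inv.add("apples")
            # The snapshot staying unchanged is *my* contract, and that's
            # the only thing the test passes judgement on.
            self.assertEqual(snap, {"apples": 2})
            self.assertEqual(inv.snapshot(), {"apples": 3})

    if __name__ == "__main__":
        unittest.main()

Nothing in that test exercises the dictionary itself; it only exercises the behavior I built on top of it.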
But unlike the maintainers of foundational code like libc, I can't put years of effort into building my abstractions. I have deadlines, and anyway I don't actually want to work on a single project for years at a time, all the way through to some time-based maturity, suffering through an unreliable interim and relying on QA departments and other bureaucratic processes.
Unit testing is in this way circular. Unit testing allows me to bring
something to maturity faster, to practically will it to maturity.
Along the way I have to make compromises -- I have to keep my code
decoupled, and I have to write for testability. But I make those
compromises because they allow me to move beyond my code, to build
up things that don't need to be constantly revisited and retested in every new context.
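As a sketch of what that compromise looks like in practice (the names here are invented, not from any particular project): instead of letting a function reach for the system clock or fetch its own data, I pass those things in as arguments, purely so the logic can be pinned down in an isolated test:

    import unittest
    from datetime import date

    # A hypothetical example written for testability: the function doesn't
    # consult the system clock or query a database; both come in as arguments.
    def overdue_items(items, today):
        """Return the names from (name, due_date) pairs that are past due."""
        return [name for name, due in items if due < today]

    class OverdueTest(unittest.TestCase):
        def test_overdue(self):
            items = [("alpha", date(2024, 1, 1)), ("beta", date(2024, 6, 1))]
            # The "clock" is just an argument, so the test controls time.
            self.assertEqual(overdue_items(items, date(2024, 3, 1)), ["alpha"])

    if __name__ == "__main__":
        unittest.main()

The decoupling isn't free -- in the real program something still has to supply the date and the data -- but it's the compromise that lets this piece settle into one of those trusted abstractions.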
This is circular, because only with this foundation can I move to
testing higher-level code in "isolation". Because of the Stable
Dependencies Principle, I can't build something reliable and mature
on top of something that is unreliable and immature. I can only make
my code "isolated" if I have real trust that my underlying code is
well defined and functional, that I can ignore it the way I ignore
libc and the virtual machine. And I can only do that with unit tests.