Summary
Refactoring is a wonderful thing. It lets
elegance evolve, instead of making you live
with faulty initial design decisions for the
life of the project. Unit testing makes it
possible to refactor safely, for sure. That's
an important reason for unit tests. But the
need for unit tests goes well beyond that...
Unit tests record the important use cases --
use cases that you frequently discover as
you're going along, and then forget about
a few weeks later, after you've modified
the code to handle them. You're now several months down the road,
and you've got some humongous wad of code
that badly needs refactoring.
I'm looking at one of those at the moment,
in fact. It's a link-managing routine that
takes an input link like http://xyz, ../xyz,
xyz, or /xyz. It has to normalize relative
links, take into account whether there was a
"base" directive on the page it came from, and
map the result to a new location, if a
mapping has been supplied by the user. It also
has to deal with the http://, https://, ftp://, and
file:// protocols, as well as plain directory
paths.
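To see how tests could pin those cases down, here is a minimal sketch in Ruby's Test::Unit. The normalize_link method is a stand-in for the routine above (the real code isn't shown), and the stub implementation just leans on Ruby's URI library:

    require 'test/unit'
    require 'uri'

    # A stand-in for the link-managing routine described above. The
    # real implementation isn't shown, so this name and this minimal
    # version (built on Ruby's URI library) are illustrative assumptions.
    def normalize_link(link, page_url)
      URI.join(page_url, link).to_s
    end

    class LinkNormalizerTest < Test::Unit::TestCase
      def test_absolute_link_passes_through
        assert_equal 'http://xyz/', normalize_link('http://xyz/', 'http://host/dir/page.html')
      end

      def test_parent_relative_link_climbs_a_directory
        assert_equal 'http://host/xyz', normalize_link('../xyz', 'http://host/dir/page.html')
      end

      def test_bare_name_resolves_into_the_page_directory
        assert_equal 'http://host/dir/xyz', normalize_link('xyz', 'http://host/dir/page.html')
      end

      def test_root_relative_link_resolves_against_the_host
        assert_equal 'http://host/xyz', normalize_link('/xyz', 'http://host/dir/page.html')
      end
    end

Each of the four input shapes gets one test; the "base" directive and user-supplied mappings would get their own cases as the problems surfaced.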
Frankly, the code is a mess. It was patched
in multiple places, each time to solve a problem
introduced by some set of factors that
I hadn't originally taken into account. Looking
at it now, I can see code that can never be
reached--a sure sign that it has grown too
large and too complex for my feeble brain to
manage.
Obviously, refactoring is needed. But what
kind of refactoring? The answer,
naturally enough, depends on what problems
the code is trying to solve.
The question is, what were those problems? They arrived over
the course of a couple of years. The code shows
the result of the attempted solutions, but I've long
since forgotten the problems I needed to solve.
One way to get such a list is to examine the
version modification history. With sufficient
scouring, the problems could be ferreted out. That's
one useful reason for maintaining version-controlled
sources.
A bug tracking system could also be examined.
Out of all the problems that needed to be solved,
it would be possible to extract the ones that involve
this particular part of the code.
But a better answer is unit tests. Every time
the code broke, my first task should have
been to create a unit test that replicated the
break. That speeds up the edit/debug cycle, too.
A small test takes a lot less time to run than the
real-life data that generates the error. That's
another great reason for unit tests.
But more importantly, had those tests been constructed,
they would now provide a complete list of the
problems the code had to solve. And the list would
be organized by section, with all the tests that involve
this method in one place.
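As a hypothetical example of what one of those recorded breaks might look like (the page, the base value, and the expected result are all invented for illustration):

    require 'test/unit'
    require 'uri'

    # Hypothetical regression test, written the moment the break surfaced:
    # a page at http://host/a/page.html declared <base href="http://host/b/">,
    # and relative links were wrongly resolved against the page's own
    # directory instead of the declared base.
    class BaseDirectiveRegressionTest < Test::Unit::TestCase
      def test_relative_link_resolves_against_base_directive
        page_url  = 'http://host/a/page.html'
        base_href = 'http://host/b/'            # from the page's <base> tag
        effective_base = base_href || page_url  # the base wins when present
        assert_equal 'http://host/b/xyz', URI.join(effective_base, 'xyz').to_s
      end
    end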
In addition, there would be a place to record new issues, as they
arise. (An almost certain occurrence, since the
history of the project has been one of finding out
that the code had to deal with things that I never
knew were possible.)
With that list of cases, I could be sure that when
I refactor to solve the problem in front of me, I
won't create a regression for some issue that I've
completely forgotten about--something that the
current code is handling successfully, no matter
how ugly it is.
Over time, then, the unit tests you collect tell
you what the code has to do. In effect, they give you
a complete, detailed specification--a specification you can use in
an automated way to ensure that your newly refactored
design achieves all of the goals that have been
identified for it, past and present.
So the moral here is that unit tests are very,
very good. They set you up so you can refactor
safely. They record the reasons for the
patches you made over time. And they tell you when
a new refactoring is successful. Taken all together,
that's a heck of a lot of value.
Of course unit tests make refactoring safer and easier, but it's also the case that refactoring can make it easier to write unit tests. We tend to refactor into smaller, more cohesive units (classes and methods). The smaller the scope of a unit, the easier it is to write unit tests for it with good coverage.
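As a small illustrative sketch of that effect (the method and its name are invented): a scheme check buried in a large link-handling routine can't be exercised on its own, but once it's extracted into a small method, a couple of one-line tests cover it completely:

    require 'test/unit'

    # Hypothetical result of extracting a scheme check that had been
    # buried inside a large link-handling routine. Its scope is now
    # narrow enough to test exhaustively.
    def known_scheme?(link)
      %w[http https ftp file].any? { |scheme| link.start_with?("#{scheme}:") }
    end

    class KnownSchemeTest < Test::Unit::TestCase
      def test_recognizes_supported_schemes
        assert known_scheme?('http://xyz')
        assert known_scheme?('file:/tmp/xyz')
      end

      def test_rejects_plain_paths
        assert !known_scheme?('../xyz')
        assert !known_scheme?('/xyz')
      end
    end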
Nice article. I've often felt that this was a natural takeaway from working with unit tests or TDD. I've been amazed, though, at how eyes glaze over when this is brought up. I hope you fare better with your evangelizing than I have.
I don't want to seem flip, but my only response is "duh!".
I think the synergy between widespread unit-tests and refactoring has been around since at least the advent of XP (1999?). I don't have my copy of "XP: Explained", but I'm pretty sure the diagram of how the 12 practices interact and support each other describes the positive feedback between testing and refactoring.
FWIW, I think your note is excellent. Anything that helps keep these notions at the forefront of the software-development mindset is the proverbial "good thing".
I liked the points you made here. There is always so much information about how important unit tests are in the initial development of a product, but relatively little is said about their long-term benefits. Good to remind us all.
"But more importantly, had those tests been constructed, they would now provide a complete list of the problems the code had to solve. And the list would be organized by section, with all the tests that involve this method in one place."
This is one reason why I love to write unit tests. Very often, I have to come back to my classes and add more business logic. Unit tests make my life so simple!
Thanks, Jeff. I have to agree. My forehead hurt so much from smacking it that I had to write up the note. I never saw that particular book, but I'm fascinated by systems diagrams that show how things interrelate. I'll have to get it.
> ...it's also the case that refactoring can make it easier
> to write unit tests. We tend to refactor into smaller,
> more cohesive units (classes and methods). The smaller
> the scope of a unit, the easier it is to write unit tests
> for it with good coverage.

True. But the revelation for me was the realization that as "stuff happens"--as I find out more about the domain than I originally knew--unit tests give me a way to record that knowledge. Because it's only when enough knowledge has accumulated that I can refactor intelligently--but if I'm not creating unit tests, I'm not accumulating that knowledge in the form of use cases.
Partly, the goal of writing the note was to motivate myself. The project I was working on was one in which problems tended to surface in a production setting. More often than not, I couldn't reproduce the problem with short tests, so I wound up debugging in semi-production mode. By the time I figured out what was going on, it was easier to just fix the code than it would have been to create a test that replicated it.
Refactoring to allow for better testing is another important concept, but the insight that burst upon me was that doing the extra work to set up a test is an important device for knowledge capture--especially since my knowledge is invariably limited in any given domain, and my "learning opportunities" (aka bugs) can be separated by months, if not years.
>> The smaller the scope of a unit, the easier it is to
>> write unit tests for it with good coverage.

> ... unit tests give me a way to record that knowledge.
> Because it's only when enough knowledge has accumulated
> that I can refactor intelligently...
So true. I've been trying to instill in the folks that I've been mentoring that tests are the perfect place to capture requirements. Functional/Acceptance tests *are* the system requirements, and unit-tests document lower-level behavior.
IM(H)O, it's all "about the testing", or more conventionally said, "about the design". I think it's crucially important to remember that "test first" and "test driven" initially affect the system's _design_. The residual unit-tests are very useful (e.g., for subsequent refactoring), but perhaps their greatest benefit is as a description of the design.
> ... it was easier to just fix the code than it would have
> been to create a test that replicated it.

> ... doing the extra work to set up a test is an important
> device for knowledge capture ...
That is a very significant observation. I forget where it was first espoused (probably XPE), but one of the best things you can do in the way of process improvement is capture *every* bug as a test or tests (most likely a functional-level test initially, but usually lower-level tests as you track down the root cause).
In the spirit of this discussion, creating the test forces you to really understand and _capture_ what the bug is.
Subsequently, the test(s) give you two benefits: 1) you have a focused environment to fix the bug in, and 2) you know when you've fixed the bug (the test(s) pass); they also prevent the bug from ever reappearing (unnoticed).
The epiphany comes when you realize that in many (most?) cases, the time/effort saved in fixing the bug can easily repay the effort to create the tests.
Geoff Sobering wrote:

> Tests are the perfect place to capture requirements.
> Functional/Acceptance tests *are* the system
> requirements, and unit-tests document lower-level
> behavior.

That's one to engrave in gold and hang on the wall.
I went to a Ruby conference over the weekend and was exposed to the latest testing concept: Behavior Driven Testing. It has a slightly clearer, more O-O testing syntax, where you invoke a method on the object to compare it to an expected result, instead of sending both the result and the expected value to a method in a third object. So you see things like actual.should.equal(expected), instead of assert_equal(actual, expected) -- or is it the other way around?
Syntactically, it's a little nicer. But the deeper motivation was the idea of focusing on desired behaviors rather than objects. Paralleling your line of thinking, it embodies the notion of "test suite as specification", to the point that it's called RSpec, rather than some variant of "Test".
What was particularly fascinating was that Dave Astels originally floated the idea as a thought experiment, but Ruby makes it so easy to extend the behavior of existing classes that Steven Baker implemented the concept in a matter of hours.
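A toy version of the idea shows why it went so quickly: Ruby lets you bolt a should-style syntax onto every object in a few lines. (This is an illustrative sketch, not RSpec's actual implementation; and for the record, Test::Unit's assert_equal does take the expected value first.)

    # Illustrative sketch only -- not RSpec's real implementation.
    class Should
      def initialize(actual)
        @actual = actual
      end

      def equal(expected)
        raise "expected #{expected.inspect}, got #{@actual.inspect}" unless @actual == expected
        true
      end
    end

    class Object
      def should            # open up the base class, as Ruby allows
        Should.new(self)
      end
    end

    (2 + 2).should.equal(4)   # behavior-driven style: reads like a sentence
    # assert_equal(4, 2 + 2)  # classic style: assert_equal(expected, actual)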
Geoff Sobering wrote:

> creating the test forces you to really understand and
> _capture_ what the bug is.

Couldn't agree more. But to understand what was happening well enough to write a test, I found myself using trace statements and the debugger while running on production data, so I could see just what the heck was going on. By the time I knew enough to write the test, I *also* knew enough to solve the problem.
That situation existed, of course, because this particular program was written before I had been bitten by the unit test bug. One of the joys of unit tests is that they keep all bugs /shallow/. A bug is almost always immediately pinpointed to the method it's in. And with a test harness already set up, you can easily set up experiments to test your assumptions, in the form of more tests. So you're almost always puzzling out one routine, and thinking about what it should do.
That's "fun" because it's like solving crossword puzzles for a living. The problems aren't too hard, and they're not to easy. They're just right. On the other hand, diving through a ton of intertwining functions trying to find the source of the bug... well, that's hard.
So now, I'm at that point where I've been looking at legacy code (which I sadly happen to have perpetrated myself), and I finally see what's going on. But...
* there's deadline pressure
* the unit test harness doesn't exist
* the code needs to be refactored to support testing
So now I'm facing one of those difficult decisions. Do I "fix the bug", or do I set up a test harness, do some refactoring to make testing possible, create a test, and /then/ fix the bug?
> Subsequently, the test(s) give you two benefits: 1) you
> have a focused environment to fix the bug in, and 2) you
> know when you've fixed the bug (the test(s) pass); they
> also prevent the bug from ever reappearing (unnoticed).

Ah, but in this case benefit #1 has *seemingly* disappeared. I no longer need an environment to fix the bug in, because I already know what I need to fix.
Of course, that appearance also disguises several very real risks. For one thing, I may not understand the problem as well as I think I do. For another, I may botch the fix. Even if I get it right, I may break something else--but lacking a test suite, I have no protection against that in any case--at least for the moment. (I do begin to acquire that protection in the future if I start collecting tests now.)
> The epiphany comes when you realize that in many (most?)
> cases, the time/effort saved in fixing the bug can easily
> repay the effort to create the tests.

Again, you speak the truth. But that traditional argument for unit testing (an argument I subscribe to and apply to all new projects) is one that doesn't apply well to legacy code--or spaghetti, as the case may be.
(Along those lines, the guys at the Ruby conference mentioned a must-read book by ___ called ___ Legacy Code. It's about retrofitting unit testing and refactoring in bits and pieces to help bring such dinosaurs under control.)
Anyway, the bottom line in all this was that "making it easier to fix the bug" wasn't a good argument for setting up a unit test, because once I knew enough to create the test, I already knew how to fix the bug. So it seemed like just so much "extra work".
Meanwhile, the argument of protection against future regressions was too abstract for me. I've just got way too much confidence in myself. Heck, I won't break things in the future. And if I do, I'll fix them. Hey, maybe it's not the best of reactions. But it's something.
But the one argument that strikes me as irrefutable is that if I need to make a patch, it means that there is *something I didn't know*. Even if it's someone else's code, it means there was something the code didn't anticipate.
We can probably afford to rule out simple typos, here, like maybe I forgot a parenthesis or used A instead of B. But when I find myself adding code--especially if I add new logic to handle some previously unexpected condition--I now have to realize that the new item is the tip of an iceberg. The tip is the new condition that surfaced. Underneath lies the vast body of ignorance that inhabits the majority of my brain.
So I really *do* need to write those unit tests, even if I've already fixed the bug. Thanks to your earlier note and the weekend seminar, I can now state the reason more succinctly:
I need to write a unit test for a bug, even if I already know how to fix it, in order to *capture the specification*.
Seen in that light, writing the unit test is essentially a documentation process--one that will guide later refactoring and provide a means for automated validation of the result.
> (... the guys at the Ruby conference
> mentioned a must-read book by ___ called ___ Legacy Code.
> It's about retrofitting unit testing and refactoring in
> bits and pieces to help bring such dinosaurs under
> control.)

The book is Working Effectively with Legacy Code, by Michael Feathers.