Summary
Half book review, half Python-vs-static-languages musings
I wrote this half a year ago, planning to go back to the book for
some fact checking before posting it. Then I accepted a new job,
the book got packed, and I was too busy to blog.
I'm accepting the status quo: I'm still too busy, the book is in
a box that I still haven't unpacked, so I'm just posting this unfinished.
I'm sure commenters will correct my facts if they need correcting. ;-)
I enjoyed reading Bob Martin's "Agile Programming". I haven't
quite finished it, but I'm confident I can start writing this review
now, while I have a few moments to myself. There's no Internet above
the Atlantic Ocean yet to distract me with email, and my family is
sadly remaining at home. And I'm procrastinating writing my keynote
for EuroPython :). (The second pass of this text was done on a plane
from Portland to San Francisco, returning from OSCON 2003 -- see my
previous blog entry.)
I really liked the book, although it was very different from what I
had expected. Well, what had I expected? Maybe more formal
writing; Bob's writing (this may come as no surprise to his fans) is
incredibly informal, despite the many footnotes (half of which contain
jokes) and references (mostly to the "Gang Of Four" patterns book and
Bob's own previous work). Maybe I had expected the code samples to be
more sophisticated. Maybe I had expected fewer typos; I expect Bob's
enough of a control freak to produce camera-ready copy, which must
limit the effectiveness of a publisher's editors and proofreaders
against slippery fingers.
But the typos didn't distract from the book's message. It starts
with a clear manifesto, proclaiming (among other things) that programmers
aren't to be treated like exchangeable production workers. While
touting various "agile" development methods, Bob is clearly partial to
Extreme Programming, probably the most outgoing of the agile methods.
(I would call it my favorite except I'm pretty ignorant of the others,
so that endorsement wouldn't mean much.)
There is a lot of code in the book. I haven't read all of it, but
I've skimmed it all and read much of it. I'd say two-thirds is Java
code, the rest is C++. Python gets a mention or two, together with
Smalltalk and Ruby (apparently Bob's current favorite) as examples of
"more dynamic languages" where some things are different or easier.
Seeing all that Java and C++ reminded me of how verbose these
languages often are! After reading Bob's very entertaining chapter on
how he and another Bob wrote a scoring program for bowling, I wrote a
similar program in Python in 75 minutes and 85 lines, plus 150 lines
of unit testing code. [XXX insert comparison in #lines.] My program
used a much simpler data structure (a list of lists, each sublist
representing a frame) and didn't follow the flow of Bob's program; I
simply did what I thought was the best way to solve the problem in
Python. I don't think that reading Bob's program did much to
influence my version, except that I used it to learn the scoring rules
for bowling (not being a native American, I had only the vaguest idea
of the difference between a strike and a spare before reading this
chapter).
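To make the list-of-lists idea concrete, here is a minimal sketch of scoring a finished game stored that way. It is not the program described above (which isn't reproduced here); the function name score and the convention that a strike frame is a one-element list are assumptions made for illustration.

```python
def score(frames):
    """Return the total for a complete ten-frame game.

    frames is a list of ten lists; each sublist holds the pins knocked
    down by each ball in that frame.  A strike frame is [10]; the tenth
    frame may hold three balls.
    """
    # Flatten the frames so bonus balls can be looked up by position.
    balls = [pins for frame in frames for pins in frame]
    total = 0
    index = 0  # position in balls of the first ball of the current frame
    for frame in frames[:9]:
        if frame[0] == 10:
            # Strike: ten plus the next two balls.
            total += 10 + balls[index + 1] + balls[index + 2]
        elif sum(frame) == 10:
            # Spare: ten plus the next ball.
            total += 10 + balls[index + 2]
        else:
            total += sum(frame)
        index += len(frame)
    # The tenth frame already contains its own bonus balls.
    total += sum(frames[9])
    return total


perfect = [[10]] * 9 + [[10, 10, 10]]
all_spares = [[5, 5]] * 9 + [[5, 5, 5]]
print(score(perfect), score(all_spares))
```

The sketch only handles complete games; scoring a game in progress would need extra care around missing bonus balls.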
But in a sense, the verbosity of Java and C++ helps the book; the
method of test-driven design practiced throughout the book doesn't
work for very small programs, and the bowling scoring program in
Python would perhaps have been too small to make a good example
(although the quirkiness of the rules of bowling definitely helps).
The large volume of code in the book is important for Bob's story:
there's a lot of it, often shown in various stages of development,
from the first skeleton version, that barely passes the most trivial
unit test, to the final, complete version, with full unit tests. This
is an excellent illustration of test-driven design, although I still
find it hard to believe that it really works like that in real life,
when practiced in the extreme form shown in the book.
I am a firm believer in writing unit tests; in a recent mid-size
programming project, adding a fairly elaborate new feature to Zope 3,
I found time after time that the remaining bugs in the code coincided
with places where I had cut corners in the unit tests. But the unit
tests played only a small role in the design of my code. There
was a solid high-level design on paper, done months before I got the
project, and this design for the most part worked well. It wasn't
grand enough to call it "big design up front" (which has a bad rep in
the XP world); most programming details weren't fixed by the design
and in fact I wrote large parts of the code twice, first to get a feel
for the "lay of the land", and then another time to "get it right".
Only the second version of the code had solid unit tests, which were
essential to get all the code details right -- but they didn't
influence the design much, which was formed by thinking over the
problem after writing the first prototype. To be honest, there was
an even earlier prototype by someone else, which I ended up mostly
throwing away, but which probably informed the design with which I
started.
Besides writing about XP and test-driven design, Bob writes about a
number of programming patterns. Most of these are repeats of
Gang-of-Four patterns; some come from Bob's own patterns book. I find
Bob's approach refreshing: while I own the GoF book, I haven't read it
cover-to-cover, and only occasionally used it as a reference. Bob's
treatment is lighter, but more importantly, he repeatedly warns against
mindless use of patterns. He explains how some patterns can easily
lead to overengineered code, and shows examples of lighter
alternatives (usually different patterns from the same sources).
But more than test-driven design or patterns, for me the
centerpiece of the book is the set of principles of agile
development. There is the Liskov Substitution Principle, a reminder
about the intent of subclassing; the Single Responsibility Principle,
which can be used to decide when it's better to split a class into
two; the Open-Closed Principle, stating that a class should be open to
extension but closed to change (since every change to a class requires
relinking all code that depends on it; not quite true in Python, but
still a useful principle).
Finally there's the Dependency Inversion Principle, which reminds
us that rather than having the higher-level classes depend on the
lower-level classes, like we used to do it, we should have the
higher-level classes define interfaces upon which the lower-level
classes depend. This principle, like several of the others, is
designed to decouple different parts of a program, so there's less
need for relinking. Again, this is less important for Python. In C++
or Java this change usually means that you have to create an interface
describing the concrete low-level class, rewrite the higher-level
class to take an instance implementing that interface, and rewrite the
lower-level class to implement the interface.
Once this is done, the low-level class can change without affecting
the higher-level class (which only depends on the interface), as
opposed to before the change, when every change in the lower-level
class required relinking the higher-level class. Why is this bad?
The higher-level class doesn't change that much, but the lower-level
class is where most of the action is, and hence much of the churn
during development. Surely reducing the dependencies on a high-churn
class must be a worthy goal!
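For comparison, the three steps of the static-language recipe can be spelled out in Python with the standard abc module. This is only a rendering of the C++/Java approach, not something Python requires, and the names Storage, Report and MemoryStorage are hypothetical.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Step 1: an explicit interface, owned by the higher-level code."""

    @abstractmethod
    def save(self, key, value):
        """Store value under key."""

    @abstractmethod
    def load(self, key):
        """Return the value stored under key."""


class Report:
    """Step 2: the higher-level class takes an instance of the interface."""

    def __init__(self, storage: Storage):
        self.storage = storage

    def publish(self, name, text):
        self.storage.save(name, text)
        return self.storage.load(name)


class MemoryStorage(Storage):
    """Step 3: the lower-level class implements the interface."""

    def __init__(self):
        self._data = {}

    def save(self, key, value):
        self._data[key] = value

    def load(self, key):
        return self._data[key]


print(Report(MemoryStorage()).publish("summary", "all tests pass"))
```

Report never mentions MemoryStorage, so the lower-level class can churn freely as long as it keeps implementing Storage.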
But in Python, the implementation of this cunning plan is almost
too simple to deserve a Principle with a capital P. Since in Python,
interfaces (we tend to call them protocols) are usually only in the
programmer's head, all we have to do is to make sure that the
higher-level class doesn't directly instantiate the lower-level class,
but gets an instance (or instances, as the case may be; or perhaps a
factory function) passed in. Then we can change the lower-level class
as long as the signatures of the methods used by the higher-level
class don't change (this is Python's equivalent of keeping the
interface fixed); and we can even substitute an entirely different
class as long as it has the same set of methods.
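A minimal sketch of what this looks like in practice, assuming a hypothetical higher-level Importer class whose only requirement on its collaborator is a write(text) method. The names Importer, ListSink and sink are made up for illustration; io.StringIO is the standard-library class.

```python
import io


class Importer:
    """Higher-level class: it is handed its collaborator, never creates it."""

    def __init__(self, sink):
        # The "interface" lives in our heads: sink must have write(text).
        self.sink = sink

    def run(self, records):
        for record in records:
            self.sink.write(f"imported {record}\n")
        return len(records)


class ListSink:
    """A lower-level class written for this purpose."""

    def __init__(self):
        self.lines = []

    def write(self, text):
        self.lines.append(text)


# An entirely unrelated class works too, because it has the same method.
buffer = io.StringIO()
Importer(buffer).run(["a", "b"])

sink = ListSink()
Importer(sink).run(["a", "b"])
```

Neither ListSink nor io.StringIO declares anything in common; Importer works with both simply because each has a compatible write method.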
PS. (back to today, February 2004): does anybody else think that
Ant is a poor excuse for a build tool? Some of my top complaints:
XML wasn't designed to be edited by humans on a regular basis.
Ant is extremely verbose.
The external command "cp A B" becomes a four-line construct.
Ant is not Unix-friendly. The <copy> command doesn't copy
the permission bits of Unix files; the <exec> command doesn't
consider an exit status of "2" to be an error.
Why do I have to say vmlauncher="false" everywhere?
"in a recent mid-size programming project, adding a fairly elaborate new feature to Zope 3, I found time after time that the remaining bugs in the code coincided with places where I had cut corners in the unit tests."
I suspect this is correlation, not causation. You didn't write complete unit tests because (a) it was the most challenging and subtle piece of the code, probably because (b) you didn't completely understand what you were trying to achieve, or (c) the external interface can't be easily decoupled from the internal implementation and you hadn't figured out the implementation yet, or (d) it was (perhaps intrinsically) a highly stateful portion of the code, and the correctness of the code was coupled with the correctness of a larger process that could not be easily decomposed into units. In other words, you didn't write tests because it was hard programming, and you had bugs because it was hard programming.
Now, by XP philosophy this means there's something wrong with your design. That's easy to say, but the only real proof is a better design. I don't think it's a given that such a better design always exists, where "better" has to satisfy the criterion of "more testable" and every other criterion you already are bringing into your programming (maintainable, extensible, performance requirements, given semantics, usable interface, etc).
You might - or might not - be interested to contrast this review with one written by me about the same book from a C++ programmer's viewpoint, on Accu, see:
"Then we can change the lower-level class as long as the signatures of the methods used by the higher-level class don't change"
I'm sorry. I don't usually write troll posts. But I can't take this "xxx is better because it has dynamic typing, which makes writing it faster" crap anymore! Here's my favorite:
"But in Python, the implementation of this cunning plan is almost too simple to deserve a Principle with a capital P."
You then proceed to write "all we have to do is...then we can...and we can substitute an entirely different class as long as it has the same set of methods." The adjectives you used suggest that this is somehow easier than the previously mentioned way of doing it in Java. But curiously, it sounds *exactly* the same. Let's just translate the terms.
Java: create an interface describing the concrete low-level class, rewrite the higher-level class to take an instance implementing that interface
Python: all we have to do is to make sure that the higher-level class doesn't directly instantiate the lower-level class, but gets an instance (or instances, as the case may be; or perhaps a factory function) passed in.
Java: and rewrite the lower-level class to implement the interface.
Python: Then we can change the lower-level class as long as the signatures of the methods used by the higher-level class don't change (this is Python's equivalent of keeping the interface fixed); and we can even substituting an entirely different class as long as it has the same set of methods.
I'm not getting it. I know the above wording isn't exactly equivalent, but boy the wording sure makes the Java way seem real hard. It makes a good case study of how injecting the appropriate adjectives into a sentence can make one sentence sound hard, and another one easy, while not changing any of the actual content. For example, in several places you mention that in Java, you have to rewrite your class to implement an interface. Funny how you don't use the word "rewrite" to describe how you have to change a Python class to take an instance of another class rather than creating it directly, huh? Rewrite - that must mean you throw the whole class out and have to write it from scratch, huh? It couldn't be that it's 4 mouse clicks in an IDE and it's done, could it?
And another thing, "we can even substituting an entirely different class as long as it has the same set of methods" - you mean, you create an interface? hmm?
I'm sorry to rant, but I'm sick and tired of hearing how much "easier" things are - when they aren't actually any easier.
On another note, I agree with what you said about Ant - it's just that the really, really horrid system of makefiles made Ant look wonderful in comparison. :-)
> "Then we can change the lower-level class as long as the
> signatures of the methods used by the higher-level class
> don't change"
[How are Java and Python *really* different?]
> Java:
> create an interface describing the concrete low-level
> class, rewrite the higher-level class to take an instance
> implementing that interface
The Interface must be written out and completely specified. This will cause at least one round of recompiling, and usually some debugging. Lower level classes have to implement the entire interface, even if you don't need it all for this task. If you solve that by using lots of smaller interfaces, then it becomes a hassle to keep track of exactly which interfaces to use when.
> Python:
> all we have to do is to make sure that the higher-level
> class doesn't directly instantiate the lower-level class,
> but gets an instance (or instances, as the case may be; or
> perhaps a factory function) passed in.
It will work exactly as before -- unless you choose to pass in some other factory. It won't require any compilation. You can even do it inside a still-running process; existing instances won't be broken. New objects created by existing code will still get the default Class1 that is known to work.
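A minimal sketch of the factory idea, assuming a default class named Class1 as in the comment above. The names Class2, Service, factory and greet are hypothetical and only serve the illustration.

```python
class Class1:
    """The existing lower-level class that is known to work."""

    def greet(self):
        return "hello from Class1"


class Class2:
    """A replacement with the same method; hypothetical."""

    def greet(self):
        return "hello from Class2"


class Service:
    """Higher-level class: builds its workers through a passed-in factory."""

    def __init__(self, factory=Class1):
        # Callers that pass nothing keep getting Class1, exactly as before.
        self.factory = factory

    def handle(self):
        worker = self.factory()
        return worker.greet()


old = Service()                 # existing code, unchanged behaviour
new = Service(factory=Class2)   # only new callers opt in to Class2
print(old.handle())
print(new.handle())
```

Because the factory is just a default argument, code that never heard of Class2 is unaffected by its introduction.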
> Java:
> and rewrite the lower-level class to implement the
> interface.
> Python:
> Then we can change the lower-level class as long as the
> signatures of the methods used by the higher-level class
> don't change (this is Python's equivalent of keeping the
> interface fixed); and we can even substituting an entirely
> different class as long as it has the same set of methods.
A Python dict has about 40 methods. Using the Java way, a replacement mapping would need to fill out the whole interface, even though you-the-programmer may know that you'll never delete a key, or ask whether the dictionary itself is less than some other object.
Using Python, you normally only need to supply one or two methods -- the ones that your code will use. (For a few well-known interfaces like dict, there are also mixins to supply the other methods based on those you do supply, so that you can optimize for the common case, but fail safely.)
It doesn't really matter whether or not the mapping interface allows you to delete a specific key, because that interface is not checked by the compiler. It only matters whether or not *your* caller tries to delete a key.
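The point about supplying only the methods a caller actually uses can be sketched as follows. The names lookup_all, Squares and SquaresUpTo are hypothetical, and collections.abc.Mapping is used here as a present-day standard-library example of the kind of mixin mentioned above; that choice is an assumption, not necessarily the mixin the comment had in mind.

```python
from collections.abc import Mapping


def lookup_all(mapping, keys):
    """A caller that only ever reads: it uses mapping[key] and nothing else."""
    return [mapping[key] for key in keys]


class Squares:
    """Supplies only __getitem__: no keys(), no deletion, no comparison."""

    def __getitem__(self, key):
        return key * key


class SquaresUpTo(Mapping):
    """Supplies three methods; the Mapping mixin derives get(), items(),
    keys(), values(), __contains__ and equality from them."""

    def __init__(self, limit):
        self.limit = limit

    def __getitem__(self, key):
        if not (isinstance(key, int) and 0 <= key < self.limit):
            raise KeyError(key)
        return key * key

    def __iter__(self):
        return iter(range(self.limit))

    def __len__(self):
        return self.limit


print(lookup_all(Squares(), [2, 3]))
print(list(SquaresUpTo(3).items()))
```

Squares would fail if some caller tried to delete a key or iterate over it, but lookup_all never does, so nothing ever notices.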
> And another thing, "we can even substituting an entirely
> different class as long as it has the same set of methods"
> - you mean, you create an interface? hmm?
Except that with Python you don't have to create the interface, and even if you do, you don't have to implement the whole thing.