Summary
If you read Computer Science textbooks you might imagine external program documentation actually exists. Yet, when I asked many of my colleagues if they had ever had such documentation throughout a project's life cycle, they laughed a bit uncomfortably and said, "well, no, not really." What's going on here?
Documentation
"Do you have written documentation for your project? You know, all that
stuff the books say you're supposed to have: theory of operations documents
and high-level designs and stuff like that?", I asked.
"You're kidding, right?", was the response.
The Myth of External Program Documentation
In this blog I'd like to pick up on something I ended with last time. Here's
the final note from
The Future of UML:
I ran into somebody who had still been associated with the project this summer,
some 4+ years after the first new Prescriber had shipped. "Had the documentation
been helpful?", I asked. "Are you kidding?!", my friend replied. "It scared the
hell out of 'em." I looked aghast. He continued, "they take one look at the size
of the notebook you left and they say, 'I'm not touching that! It looks complicated!'"
If you read Computer Science textbooks you might imagine that external program
documentation such as high-level designs, specifications, and theory of operation
descriptions actually exists in most projects. Yet, when I asked many of my friends
and colleagues if they had ever had such documentation throughout a project's
life cycle, they laughed a bit uncomfortably and said, "well, no, not really."
What's going on here?
While we're on it, how do you explain the reaction to the project documentation
I'd left for the Prescriber? Is this common? The answer to that appears to be
"yes". External program documentation, despite what you may have learned
in all those high-minded computer science textbooks, remains one of the weak links
in software development. How did this come to be and what is it costing us? And,
why, if the industry standard is to eschew external program documentation, do we
continue to preach the importance of such things?
Obsolescence
In his book
Extreme Programming Explained: Embrace Change, Kent Beck expands on the original
programmer adage "good, fast, cheap: pick two" and claims customers actually choose
three of four factors: cost, schedule, features, and quality. Developers choose
(or are at least stuck with) the last variable. In either
case I think we have, at best, done a poor job of identifying costs or, at worst,
have openly lied about costs, thereby skewing what customers might have selected.
Program maintenance constitutes 40-80% of software costs, yet I recall
few discussions during program planning or development about such things.
Further, maintenance costs fall largely into the category of enhancements.
As Robert Glass, in his book
Facts and Fallacies of Software Engineering, puts it:
The 60/60 rule: 60 percent of software's dollar is spent on
maintenance, and 60 percent of that maintenance is enhancement.
Enhancing old software is, therefore, a big deal.
Given this large number (60 percent of 60 percent means roughly 36 percent of every
software dollar goes to enhancing existing systems) one might expect that we'd be looking for ways
to manage this cost and mitigate any associated risks. Steve Rakitin, in
his book
Software Verification and Validation for Practitioners and Managers,
makes the observation (also quoting from DeMarco and Lister) that
"Turnover is incredibly expensive." Rakitin's book focuses squarely on
software quality and I believe he's right to discuss turnover in his
"Balancing People, Process, and Product" chapter. But, why is
turnover so expensive?
It is estimated that roughly 30% of the total maintenance time is spent
"understanding the existing product". (Again, I turned to Glass's book
for this figure since it was handy.) This figure relates directly to the
cost of turnover, as illustrated by a 1983 Air Force study in which
researchers found that the "biggest problem of software maintenance"
was "high [staff] turnover" (rated 8.7 on a scale of 1 to 10), followed by "understanding
and lack of documentation" (7.5) and "determining the place to make a
change" (6.9). I contend they are all related. If you have no *usable*
documentation then all of the information is in people's heads. If the
heads walk out the door (turnover) then the information needs to be
rediscovered. That is not cheap.
Of course, useless documentation is no help. Since a small
percentage of the software life cycle is dedicated to the creation of
documentation, the quality of it is immediately suspect. After all,
if the documentation is already untrustworthy, why maintain it?
Why throw good money after bad? But, is a given document useless
10 minutes after it is written? How about 10 days? How about 10
weeks? The tendency is to dismiss anything outside the code as
"out-of-sync with reality" whether, objectively, that is true or
not.
Finally, there is a notion that the code is golden (or at least that it
documents itself) even if no other external document is present
purporting to do so. Tools like JavaDOC, which scrape the Java
source code of your project and create a hierarchy of web pages,
can create the documentation at the push of a button. The bits
will be new and fresh, but are they *right*? That is, why would the comments
describing the inner workings of a particular subsystem be any more accurate,
descriptive, or insightful just because they were pulled from the Java code?
Put another way, can source code and its associated comments get "out-of-sync"
with each other? Of course they can. At best, the co-location of the documentation and the
code can eliminate the need for "finding the right place to update the
documentation", but it isn't a panacea. It still takes work, and
discipline, to ensure the documentation is correct.
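To make the point concrete, here is a minimal sketch (the class and the 30-versus-15-minute rule are invented for illustration) of a doc comment quietly drifting away from the code it sits beside:

```java
public class SessionPolicy {

    /**
     * Returns true if the session has been idle for more than 30 minutes.
     * (Written when the method was first added; never revisited.)
     */
    public boolean isExpired(long idleMillis) {
        // The rule was later tightened to 15 minutes; the comment above was not updated.
        return idleMillis > 15 * 60 * 1000;
    }
}
```

Running the javadoc tool over this file produces perfectly fresh-looking HTML, but that HTML still carries the stale 30-minute description.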
Assume for a moment we're willing to maroon the maintenance
programmer with no (useful) documentation. Can documentation created early in
the software life cycle help mitigate risk? It can't get too far out-of-sync
while we're still writing it, can it?
Failure to Drive Out Risk
One of the arguments for external program documentation, especially before the
coding stage, is that it should help drive out risk. I actually agree with this,
at least in principle, but have also been forced to recognize this activity's
limits. For example, it isn't unheard of to have a feasibility study to
determine if a project is even worth attempting. Certainly this would qualify
as a risk reducing activity. Yet, as Glass recounts in his book, he was in the
audience at the International Conference on Software Engineering (ICSE) in Tokyo in
1987 when Jerry Weinberg presented his keynote. Weinberg asked how many in the
audience had ever participated in a feasibility study where the answer
came back "No". Not a single hand of the 1,500 in attendance was raised.
Which begs the question: how many of these documents are good science,
and how many are simply position papers?
Before we have too much fun beating up management on their feasibility studies
I think we should take a hard look at our own writings. That is, how many of
our (scant few) documents are intended to be good science and how many
have simply been constructed to deflect attacks from our critics, get our way on
some technical issue, select our favorite vendor, choose our favorite product,
or simply embarrass those who dared to disagree with us?
Glass quotes a fellow named Bill Curtis who said "in a room full of top software
designers, if any two of them agree, that's a majority." We programmers do like
to have our opinions! But, it is sometimes difficult to even have documentation
crisp enough to know what it is we are arguing about. I've seen poor designs
win out over better alternatives simply because the advocate for the better
alternative didn't have the presentation skills to get the facts out there. Winning
the argument and getting the right answer are two different things. In the case
where the poorer design won, it is difficult to justify the documentation process
as risk mitigation.
Literature
In many ways, a good design document is like a novel. It speaks of a world
thus far only fictitious, in a tense that makes it sound like it already
exists. "The program does this and this and that..." Since most engineers
are poor writers (or, at least inexperienced novelists), it is no
wonder good design documents are hard to find.
I think it goes beyond that, however. It takes some courage to put things
in writing and then live with the consequences. Going on the record
may not be in your (personal) best interest -- even if you're right. And,
of course, being wrong, in writing, can hang about your neck like the
proverbial albatross.
In some ways software design documents differ from novels:
usually the novelist knows how the story is going to end. Such is not always
the case in software development. Yet, the design decisions made early in a
project will affect the software throughout its life cycle. Have you ever worked
on something and asked, either under your breath or even out loud, "what were
they thinking?!" Documentation of the thinking at the time, even if it drew
incorrect conclusions, would be revealing to a maintenance programmer later.
Such revelations could save that programmer hours or even days:
"Oh, I see where they were going with this, and I can see
why it didn't work out." Still, with all the value that might give, we don't
do it.
The final comparison to literature is, I believe, the most apt: writers
make little money. The same may be true of documentation in the software world. Our boss or
our customer needs the software; the documentation is our problem, or
just some internal software concern. Customers and management don't
put any emphasis on it, so software developers concentrate on what they're
being measured on: delivering code.
Myth vs. Reality
My problem with all this is that I can't reconcile the software engineering
book world with the real world. The software books blithely
continue to tell us about external program documentation and how it is used
throughout the software life cycle. For the vast majority of projects I have
worked on, there are no such documents. I contend we should fix one or the
other. The Extreme Programming crew has made their choice: they don't pretend
to create such artifacts. (Castigate them if you must, but at least they
are honest about what they do.) For the rest of us, it is a world of denial.
I can't help but wonder if we couldn't do better.
References
Facts and Fallacies of Software Engineering by Robert L. Glass.
Find it on amazon.com.
Extreme Programming Explained: Embrace Change by Kent Beck.
Find it on amazon.com.
Curtis, B., R. Guindon, H. Krasner, D. Waltz, J. Elam, and N. Iscoe. 1987.
Empirical Studies of the Design Process: Papers for the Second Workshop
on Empirical Studies of Programmers. MCC Technical Report Number STP-260-87.
I wasn't able to find this on the web. The reference is from the Glass book
(above). It sounds interesting, though. If somebody has a pointer to it, please
let me know. I'd like to read the original.
Software Verification and Validation for Practitioners and Managers, Second Edition
by Steven R. Rakitin
Find it on amazon.com.
My review of that work appears on that page as well.
Peopleware : Productive Projects and Teams, 2nd Edition by
Tom DeMarco & Timothy Lister
Find it on amazon.com.
The Psychology of Computer Programming Silver Anniversary Edition by
Gerald M. Weinberg.
Find it on amazon.com.
My review of the work may also be found on that page. Weinberg explores
what does, and does not, motivate software professionals, which, of course,
is directly related to this topic.
First, since we're discussing writing, a nitpick: "Begs the question" does not mean "Prompts one to ask."
Second, most projects I've worked on had copious amounts of documentation. Mostly it concerned what a customer was expecting, if and when various features would be included, and requirements for interacting with other, existing systems. But, sadly, the docs for the code itself were often scant and, increasingly, wrong. Tools such as JavaDoc have helped, but rely on developers writing and maintaining comments, a chore treated by many with disdain.
There's a cultural barrier that presents writing documentation as less than noble. *Real* hackers will simply tell you to read the code, but the code can only tell you what the program does, not what it should be doing, or why.
I think documentation is underrated, not just because of its ability to make future maintenance easier, but because it is an important part of the design process. It can be a time of reflection, when you consider what you've done and describe what that means. Underdocumented interfaces and projects tend, in my experience, to be poorly designed. The interfaces are often too bulky -- both programming and graphical interfaces. They don't predict actual usage, they support theoretical needs that don't actually occur, or they lack symmetry and orthogonality in their design.
But insofar as documentation is useful as a process more than a product, I think XP does attempt to cover this in some of their ideas with use cases -- producing ephemeral documentation. This may or may not be sufficient.
Other important areas of documentation are generally encountered when the underlying technology is flawed in some fashion. Perhaps there's no way to persistently annotate the source, or the structure of the source is forced into a restricted and unexpressive form. Or the source is simply opaque -- usually a persistent structure besides plain text.
Otherwise, good documentation can successfully take the form of careful comments, good naming conventions and good names, and localized documentation like JavaDOC which helps programmers learn where to start reading the code. Of course, all of these are documentation efforts that are also frequently ignored. Poorly factored code is common, and it can be difficult to justify fixing -- unlike external documentation, it is not well understood or measured by non-programmers.
> the code can only tell you what the program does, not what it should be doing, or why.
Well factored code *can* tell you what it should be doing and why. It can be difficult, and not always possible for all kinds of code and environments, but it is not without hope. Names of variables, objects, and functions all express intention. And like comments, they can also obscure intention and mislead.
> Program maintenance constitutes 40-80% of software costs, yet I recall few discussions during program planning or development about such things.
One of the nice things I've found about working with more agile methodologies is that they address this straight on.
For example if you're doing XP then every release after the first iteration is a maintenance release where you're adding new features.
> If you have no *usable* documentation then all of the information is in people's heads. If the heads walk out the door (turnover) then the information needs to be rediscovered. That is not cheap.
An alternative to more documentation is to get the information into more heads. Practices like common code ownership and pair programming can help there.
> The tendency is to dismiss anything outside the code as "out-of-sync with reality" whether, objectively, that is true or not.
True. However the dismissal tendency exists because of the common experience of documentation being radically out of sync with the codebase.
If it has been my experience that, nine times out of ten, the documentation isn't going to help, then instant dismissal is my best approach.
If only there was a way of automagically finding out whether documentation was good or bad :-)
> Finally, there is a notion that the code is golden (or at least that it documents itself) even if no other external document is present purporting to do so. Tools like JavaDOC, which scrape the Java source code of your project and create a hierarchy of web pages, can create the documentation at the push of a button. The bits will be new and fresh, but are they *right*? That is, why would the comments describing the inner workings of a particular subsystem be any more accurate, descriptive, or insightful just because they were pulled from the Java code? Put another way, can source code and its associated comments get "out-of-sync" with each other? Of course they can. At best, the co-location of the documentation and the code can eliminate the need for "finding the right place to update the documentation", but it isn't a panacea. It still takes work, and discipline, to ensure the documentation is correct.
Comments can get out of sync. They are just documentation in another place.
However, by definition, the code cannot get out of sync. The code describes accurately what the program currently does.
Now there may be an argument that separate documentation describes what the codebase should do, as opposed to what it actually does, but that's a different kettle of fish.
I prefer to spend time getting the code to match the requirements (using, for example, automated acceptance tests) than spend time documenting design decisions and then having to keep them in sync with the code. As ever YMMV :-)
Although you don't mention them explicitly, I hope you're including tests when you talk about the code, since they are often better than the code itself at answering questions about why the code does things in a certain way.
> The Extreme Programming crew has made their choice: they don't pretend to create such artifacts. (Castigate them if you must, but at least they are honest about what they do.)
XP myth alert! XP people don't create documentation until it is necessary - very different statement from never creating documentation artefacts at all.
It's just that we find that documentation is necessary in fewer places than many people think.
> but the code can only tell you what the program does, not what it should be doing, or why.
Depends on the code ;-)
It has been my experience that a lot of what the program should be doing and why can be placed in the code if you refactor well and use good naming practices.
You can, of course, argue that bad code doesn't perform this task - but neither does bad documentation.
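As a rough sketch of what I mean (the discount rule and every name here are invented), compare the same check before and after refactoring toward intention-revealing names:

```java
public class DiscountRules {

    // Before: the reader has to reverse-engineer the business rule.
    static boolean check(int d, int t) {
        return d > 3 && t < 500;
    }

    // After: the rule, and a hint of the reason behind it, travel with the code.
    static final int LOYALTY_MINIMUM_YEARS = 3;
    static final int DISCOUNT_CEILING_CENTS = 500;

    static boolean qualifiesForLoyaltyDiscount(int yearsAsCustomer, int requestedDiscountCents) {
        return yearsAsCustomer > LOYALTY_MINIMUM_YEARS
                && requestedDiscountCents < DISCOUNT_CEILING_CENTS;
    }
}
```

Neither version needs an external document to say what is allowed, but only the second tells the next reader why.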
Good acceptance tests and unit tests act as very good descriptions of what the code should do - and have the advantage of producing an immediate list of failures if they become out of sync with the code.
In my experience there are very few things that you can't make explicit in the code with some effort. "Global" requirements like "error rate of new users must be less than 60%" or "all transactions must complete in under 15 seconds" are hard - most everything else isn't.
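For instance, here is a minimal sketch of a test acting as documentation (ShoppingCart and its rules are hypothetical, and this assumes JUnit 4):

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Each test name states a requirement; the assertions pin it down.
// If the code drifts, the build produces an immediate list of failures.
public class ShoppingCartTest {

    @Test
    public void emptyCartTotalsToZero() {
        ShoppingCart cart = new ShoppingCart();
        assertEquals(0, cart.totalInCents());
    }

    @Test
    public void discountCodeIsIgnoredBelowMinimumSpend() {
        ShoppingCart cart = new ShoppingCart();
        cart.addItem("book", 1999);              // below the (hypothetical) 5000-cent threshold
        cart.applyDiscountCode("SAVE10");
        assertEquals(1999, cart.totalInCents()); // total unchanged: no discount applied
    }
}
```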
I have a hard time believing that any team could create a substantial product with no documentation at all. That would be like building a 50-storey office tower with no blueprints. You might not need sophisticated theory of operations documentation, but there is far more to building a complex product than any humans I have ever met could keep in their heads.
Start off with the detailed analysis of what the customers need: all the many customers of a product, from CEO to data entry clerk. Go on to the technical implications of resource load, existing infrastructure, competition, existing skills, and expected business direction. Work out all the myriad details of designing a product that is efficient, reliable, robust, maintainable, flexible, and powerful.
When the product is complete, there had better be documentation where it's needed, which is usually with the source code, not in a separate hard-copy book that can easily be mislaid. That isn't to say that documentation is ever a substitute for a good, clean, usable design. If software can't be used without an accompanying tome, then it probably should scare a sensible user.
I'm amused by JavaDoc. The idea that you could abstract out source code comments into a separate file and use them as product documentation was commonplace at one time. It didn't work then, and it doesn't work now, because quality programmers write good documentation and don't need such utilities, and hackers (i.e. bad programmers) can't write good comments any more than they can write good documentation.
Here is my little theory of why documentation gets out of sync:
1. Programmer A sits alone and writes a document.
2. Programmer B needs to change the described code.
3. Programmer B reads A's documentation, which at some point turns out to be unclear or just a little bit outdated.
The "right" way of handling this would be to correct the documentation first, looking at the original code, then update both. However, programmer B does not feel like as much of an expert as A, who was the original author of the code and docs. Therefore, she has several choices:
1) "fix" the documentation and possibly make some mistakes,
2) contact A and ask for help, in the hope that she still knows better, or
3) just let it be ("for now") and change the code.
The third alternative requires the least responsibility and is the fastest one to implement. Once it is chosen, the documentation becomes even more outdated, strengthening the repeated choice of this option on any later occasions.
Unlike buggy code, wrong documentation does not cause immediately apparent failures. I'd also argue that writing excellent documentation is more of an art than writing working code. Excellent documentation can speed up solving real problems. Bad documentation can mislead, raise questions, annoy or bore its reader. No documentation is better than bad documentation. A very easy choice, indeed.
Why is most documentation inadequate? Because we don't test it at all. Documentation, like any writing, is about communication. However, at the time when it is written down there is usually no one to assess its understandability and usefulness, just a lone expert spilling her current thoughts on paper.
Couldn't agree more on the point that developers are not the best of "novelists". However, I still feel that documenting what you design is a worthwhile effort. Especially in a world of dynamic changes, you may be in a distributed environment where documentation and communication are the best tools you have. Adding to this, UML is an excellent medium for communicating technical mumbo-jumbo in a language-independent manner. With IT companies in a perpetual race for CMM Levels 1, 2, 3, 4 and so on, documentation is only going to increase, not decrease.
> The 60/60 rule: 60 percent of software's dollar is spent on maintenance, and 60 percent of that maintenance is enhancement. Enhancing old software is, therefore, a big deal.
Ada has demonstrated over 20 years that programming language design can halve lifecycle costs. Ada encourages its users to spend more time in writing to describe their code. That extra effort communicates information to the tools and to all future readers of that code. Future readers derive the benefits.
> However, by definition, the code cannot get out of sync.
What is true is that code is never out of sync with what it does. But it does get out of sync with what it should be doing. Code that gets upgraded, loses functionality, and is extended ends up doing things simply because that was the voodoo that fixed the bug. The subsequent maintenance programmer then reads the code, becomes very confused, and the pattern named Lava Flow comes into being.
Without information about what the code is supposed to do and why, as opposed to what the code is actually doing, refactoring becomes a very dangerous exercise.
> The code describes accurately what the program currently does.
If one couples documentation to semantics, then one cannot let the documentation get "out of date", because then your system is no longer correct - it cannot be compiled, tested, or verified.
For our Java-based systems, we use the Java Modeling Language (JML), a behavior interface specification language (BISL) for Java.
With JML you write contracts and models for your software using Java plus a set of constructs specific to writing specifications (e.g., pre-/postconditions, invariants, model variables, frame axioms, etc.). This means that the programmer can use a language with Java syntax and semantics (no need to learn some new ambiguous or weak language like UML or OCL, or something arcane like VDM or Z) to write testable, checkable, formal documentation.
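For a flavor of what that looks like, here is a minimal sketch (the BankAccount class and its rules are invented for illustration; the constructs shown are standard JML):

```java
public class BankAccount {

    private /*@ spec_public @*/ int balanceInCents;

    //@ public invariant balanceInCents >= 0;

    //@ requires amountInCents > 0;
    //@ ensures balanceInCents == \old(balanceInCents) + amountInCents;
    public void deposit(int amountInCents) {
        balanceInCents += amountInCents;
    }

    //@ requires amountInCents > 0 && amountInCents <= balanceInCents;
    //@ ensures balanceInCents == \old(balanceInCents) - amountInCents;
    public void withdraw(int amountInCents) {
        balanceInCents -= amountInCents;
    }
}
```

The annotations are ordinary comments to the Java compiler, but JML-aware tools can check them, render them as documentation, or generate tests from them.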
The various tools that understand JML (of which there are around 10) can automatically transform your JML-annotated Java programs into formal documentation, generate unit test code, generate invariants for you, generate verification conditions, statically verify some of those conditions, and so on.
These technologies are being used in industry and academic settings with great success. We at KindSoftware are also applying these general principles to other systems (e.g., Eiffel, ML, etc.).
I strongly suggest that Java developers and managers who care about the correctness and clarity of their systems spend a bit of time and look into JML.
> What is true is that code is never out of sync with what it does. But it does get out of sync with what it should be doing. Code that gets upgraded, loses functionality, and is extended ends up doing things simply because that was the voodoo that fixed the bug. The subsequent maintenance programmer then reads the code, becomes very confused, and the pattern named Lava Flow comes into being.
SOOO true. Here, we emphasize writing code that "documents itself" through the use of variable names that make sense and commenting only the stuff that isn't immediately obvious.
Now that we have a project that is reusing code from a previous one, this technique has made integrating old code into the new system a lot easier.
Lava flow is a very good phrase to describe what happens when code isn't documented or even written properly. Some programmers at our work believed that by making poorly documented code they could ensure job security by being the only one who understood how something worked. Those guys were fired and their code rewritten...
The problem with documentation, in my experience, is that it normally only specifies what the code does, while it should specify why the code does what it does.
Any competent developer can see from the code what is done, but it is not possible to derive (at least not without a lot of effort and re-discovering of what was discovered in the original project) why the system was implemented the way it ended up.
> Any competent developer can see from the code what is done, but it is not possible to derive (at least not without a lot of effort and re-discovering of what was discovered in the original project) why the system was implemented the way it ended up.
That is exactly why I evolved my diary-driven software process with the minimal requirement being to document decisions - the alternatives considered and why a particular approach was used. http://www.oofile.com.au/adsother/sse.html
Note that this is documentation which, as it is historical, can never go out of date - at the time a decision was taken, you can see what was considered and what was adopted. If the design has later varied from that, hopefully there's another diary entry. Even if there isn't an explanation as to why it has changed, you at least have a checkpoint in the history of the design.