Quality software measurement
by Laurent Bossavit (Incipient(thoughts))
When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science.
-- Lord Kelvin
We honor scientists by giving their names to units of measurement, a distinction reserved for a select few. This makes Lord Kelvin, among all scientists, an impressive name to have at your side in a conversation.
The above quote is apparently a common fixture of Usenet discussions, and it popped up recently on an Extreme Programming newsgroup. The person invoking the manes of Kelvin was arguing that, in order to make wise choices about certain software development practices, you should be able to cite measurements of the effectiveness of those practices. "You recommend pair programming", goes the cry, "but where is your scientific data?"
I submit that the cry for data is a stall, a convenient means to sidetrack discussion of the reasons why something is being recommended.
Data matters, but so do, equally, the models or theories that make sense of the data. A classic episode in the history of science is Lord Kelvin's computation of the age of the Earth: about a hundred million years by his most generous allowance, or twenty million years in his later revisions. Geologists of the time suspected this to be a gross underestimate (which it was, by a factor of about fifty even for the first figure), but Kelvin's figures were based on the best available scientific data. Given the measured rate of heat dissipation, the Earth couldn't be that old without being frozen solid. His argument carried weight and was for a while an embarrassment to his opponents, who argued for a much older Earth to allow time for natural selection. It took an entirely new finding to invalidate Lord Kelvin's argument: the discovery of radioactivity, which implied that heat was being created even as heat was being dissipated. Kelvin's calculations relied solely on the dissipation of an original "capital" of heat.
Kelvin had "scientific" data on the age of the Earth, yet the conclusions he reached were wrong. Nor, it seems, was his interest in that particular problem driven by purely "scientific" motives. His conclusions were wrong because there were things he didn't know about the system he was studying, things which made the data irrelevant. Now whose knowledge of the whole affair should we say was of "a meager and unsatisfactory kind"? It's a toss-up. Our modern-day knowledge is certainly more satisfactory, but that is not due to anything we "express in numbers".
Data is good, but it's only data; the hard work is determining whether the data actually supports your hypotheses. Worse, data is usually theory-laden; research motivated by opposite purposes will often turn up data supporting opposite conclusions.
Worst of all, measuring performance data often has side-effects detrimental to what we are supposed to be measuring.
In discussions of performance, what we usually have in mind is a system consisting of various inputs (raw material, such as ore in steel works or ideas in software development) and outputs (steel, software), and processes turning the inputs into outputs. We may legitimately be interested in various dimensions of the system's performance, such as speed, cost, etc. The ideal of performance measurement consists of varying one parameter, "all other things being equal", and measuring the results on one relevant dimension.
This is possible only in a tiny fraction of all real-world situations.
The rest of the time, we are faced with problems where all other things are never equal, and the results have more than one relevant dimension. Some dimensions will be easy to measure (e.g. "faster", since we need only look at the clock on the wall) and others less easy (e.g. quality, accuracy, customer satisfaction, and so on).
There is a problem commonly faced by businesses and other organizations, called "dysfunction" by Robert Austin. Austin has devoted an entire book to exploring the topic, Measuring and Managing Performance in Organizations. It goes as follows.
Measurements are made of the things which are easy to measure, leading to strong pressure for improved performance along these dimensions. At the same time, no measurements are made of the things harder to measure, and no pressure applied there. However, for any given dimension of performance, "ease of measurement" does not necessarily correlate well with "criticality to business results". In fact, the correlation usually goes the other way.
Measurement efforts therefore tend to have adverse effects, because people respond to the differential pressure by slacking off on the aspects not measured, even if they are critical to business results.
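To make the mechanism concrete, here is a minimal sketch in Python; the numbers and the multiplicative value function are my own toy assumptions, not anything from Austin's book. An agent splits a fixed effort budget between a measured dimension (say, feature count) and an unmeasured one (say, quality); the dashboard score rewards only the former, while actual business value needs both.

    def business_value(measured_effort, unmeasured_effort):
        # Toy assumption: value is multiplicative, so neglecting
        # either dimension drags the whole result down.
        return measured_effort * unmeasured_effort

    BUDGET = 10.0  # total effort available, in arbitrary units

    for measured in (2.0, 5.0, 8.0, 10.0):
        unmeasured = BUDGET - measured
        score = measured  # the dashboard sees only this dimension
        value = business_value(measured, unmeasured)
        print(f"measured effort {measured:4.1f}  "
              f"score {score:4.1f}  actual value {value:5.1f}")

The dashboard score is maximized by pouring the entire budget into the measured dimension, which is exactly the allocation at which actual value collapses to zero; the balanced split that maximizes value looks mediocre on the dashboard.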
An excellent example is software quality. "Productivity", roughly defined as the number of features (or worse, lines of code) delivered per unit time, is all too easy to measure. "Quality" is much harder to measure, because even at its simplest it consists of several distinct dimensions, such as customer satisfaction with the product delivered, programming defects (i.e. "bugs") detected during the development process or after deployment of the product, and various "ilities" such as maintainability.
Austin's model - a purely qualitative one - does a lot to explain the classic paradox of software development: the more pressure you put on a team to have it deliver faster, the later the project will actually ship, as a result of poor quality.
Conversely, Austin's model also explains the "virtuous circle" of agile software development: the more effort you invest in quality, in all its aspects - even the less easily measured ones - the less your software projects will actually cost and the faster they will go.
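To see how such a qualitative model might play out in numbers, here is a toy calculation, again with illustrative assumptions of my own rather than Austin's: each unit of work splits between raw feature output and quality effort, and skimping on quality injects defects, each of which demands rework before the project can ship.

    def time_to_ship(quality_effort, scope=100.0):
        # quality_effort: fraction of each unit of work spent on
        # quality (tests, reviews, refactoring), between 0 and 1.
        feature_rate = 1.0 - quality_effort        # raw output per unit of effort
        defect_rate = (1.0 - quality_effort) ** 2  # defects injected per feature
        rework_cost = 3.0                          # effort to fix one escaped defect
        build_time = scope / feature_rate
        rework_time = scope * defect_rate * rework_cost
        return build_time + rework_time

    for q in (0.0, 0.2, 0.4, 0.6, 0.8):
        print(f"quality effort {q:.1f} -> total effort {time_to_ship(q):7.1f}")

Under these made-up numbers, the all-pressure strategy (zero quality effort) is the most expensive of all, and total cost is lowest when a substantial fraction of effort goes to quality: the paradox and the virtuous circle in miniature.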