Summary
This article gives you a simple trick you can use to make your Java method calls more readable when the list of parameters contains booleans and numeric primitives.
Advertisement
Now that I've gotten myself onto the Groovy mailing list, I'm getting involved
in language discussions again. I love those. The axe I generally grind in those
discussions is readability. To me, that is an overlooked strength of
the Java language that goes a long way towards explaining its popularity as
a vehicle for production applications and for teaching. A recent discussion
reminded me of a trick I use to make method calls more readable in Java--which
can become something of a problem when the list of parameters contains booleans
and numeric primitives. In this article, I share that trick with you.
You're looking through code written by someone else, or that you wrote long
ago (in a galaxy far away), and you find a method call like this:
addValue(13, 47, true);
The question is, what do the parameters mean?
Of course, you can search for the addValue method, navigate to it,
and read it to find out what the values mean, but that means you're no longer
reading the code, so much as translating it. It's as though
you're reading a book written in a foreign language, with the book in one hand
and a translation dictionary in the other. If you only have to do it once in
a while, it's not too bad. But if you're doing it all the time, it becomes a
problem.
In general, code that has to be "translated" is a major headache
for the hapless victim who has been tasked with modifying it--a victim who more
often than not turns out to be myself, 6 months later when I have completely
forgotten all of the things I had in my head when I wrote the original version.
So in an effort to make things slightly simpler for my future self, I began
looking for ways to make such method calls more readable.
A simple solution is to add variables that make the meaning clear:
int value = 13;
int location = 47;
boolean overwrite = true;
addValue(value, location, overwrite);
That code is self-documenting, so it's quite a bit more readable. It does add
some length, though. And the next time you invoke addValue in the same
scope, the code has to change slightly to remove the variable declarations:
It's a small thing, I know, but the code just became a little less consistent.
And it's 3 lines longer every time you make that call.
To shorten the code, you might be tempted to reuse a variable like overwrite,
for example. The code then reduces to
value = 13;
location = 47;
addValue(value, location, overwrite);
That's shorter, but the reader has just gone back into the translation business.
To make sense of the code, they have to answer the following questions:
Is overwrite a boolean or an integer?
(To answer that, they have to find the definition.)
Does it have a default value? Has the value changed anywhere?
(For that, they have to inspect the code for all occurences of assignments
to the variable.)
The simple solution does improve readability to a degree, if you avoid the
trap of reuse. But it turns out that there is an even better way to make your
method calls readable.
The solution that provides maximum readability is to declare variables without
defining them, and do the assignments only when using them in a method call.
The code template looks like this:
int value; int location; boolean overwrite;
...
doIt(value=13, location=47, overwrite=true);
Although I usually don't declare multiple variables on a single line, I do
so here to indicate that they are never intended to take a default value.
Note:
Some sort of naming scheme might also make sense for that purpose. For example,
a Hungarian prefix like "z" could be used to indicate a parameter-variable.
It would sound like French-English for "the", and would produce
variable names like zValue, zLocation, and zOverwrite.
Any reference to such a variable that did not also provide a value would be
considered a stylistic error. (In the ideal word, either the compiler or a
lint-sytle utility would give you a warning.)
With this mechanism, the method call doIt(value=13, location=47, overwrite=true)
now accomplishs three goals, all at the same time, in the least possible space:
It makes it clear that you're passing two integers and a boolean.
It specifies the values you're passing.
It names the values, so you know what purpose they serve.
Finally, note that the ordering of these "named parameters" is always
preserved. The names exist for semantic clarification--they do not permit you
to change the order of the parameters--a facility which tends to produce readability
problems of its own.
I avoid writing methods that accept booleans along with other parameters. I don't know where I got that idea from, I think it was Bertrand Meyer, something about passing control flags. But the reason is pretty much the same as what you gave; the values "true" and "false" in an argument list don't tell you anything, and it is hard to remember what the positive condition is. Moreover, removing the condition usually gives us a more descriptive set of method names:
If we could remove some duplication here, we could make it even more readable. We'd have to find a good name for a type that holds a value and a location. Nothing is occuring to me right now. It's a shame. If we could find something like that, we could end up with two single arg methods named addValue and overwriteValue.
I like your solution: Refactor the boolean into two methods, give the method a significant name and you don't need to declare the arguments.
This will indeed help in many cases.
But if you would create a parameter type to contain value and location you have the drawback of:
- creating a new unnecessary object - decreased readablility as it is easier to grep addValueAtLocation(13,47)instead of addValueAtLocation( myValLoc ) as you have to look for the concrete values on other places now (it's somewhat a reverse of the translation problem).
---- We can stop here or give a totally wider approach and ask, if our problem perhaps is a problem of the design, and why a "location", which contains a value, is not an object.
Then you would perhaps be able to code:
getLocation(47).setValue(13)
It is now implementation detail of the owner of getLocation to perhaps create a new instance if no Location for 47 exists.
(But, well, this only is applicable if you are responsible for the owner's design and not if it is a third party library).
> Although I usually don't declare multiple variables on > a single line, I do so here to indicate that they are never intended to take > a default value.
Unfortunately that intention, whilst it may be clear to you, is not apparent in the code. If I see a line like that when maintaining code it immediately puts me into a 'high alert' state on the grounds that I appear to be dealing with code written by someone who is not familiar with common coding standards or whose codeing style is simply 'thoughtless' - both interpretations which would be wide of the mark in this case.
Now I appreciate that the fashion these days is to eschew all comments from the code but I think this would be a case where a short comment would usefully clarify the intention, e.g.
// Clarify parameter values.
int value; int location; boolean overwrite;
I'm with Noam on this. My unsightly variation is as follows: // value, location, overwrite addValue(13, 47, true);
I think that declaring extra variables would add confusion and potentially slow down execution speed (albeit minimally), assuming a clever compiler didn't strip them out.
Am I the only one who thinks this discussion has gone odd?
If we want more readable method calls, the first thing we should do is look at the method name and figure out how to make it explain what the call means to the user.
Using temps is a neat trick, but it is like saying, well, when 2+2 produces 5, we'll decrement the 5.
> > Eschew boolean parameters... and use better method > > names. > > or declare classes for value and location > > Why add so much overhead just to solve a simple problem? > > > - Noam.
Well, skip the additional classes for a second. Better method names aren't overhead. They're an investment in understanding and they cost nothing except the time it takes to think them up.
On readability, addValueAtLocation(12,47) and overwriteValueAtLocation(12,47) are much better than addValue(12,47,false) and addValue(12,47,true)
Dirk's getLocation protocol is very nice too. I might move to it if the number of location parametered methods increases. Is there a little performance hit in doing that?
Possibly, but performance matters only when it matters, and when it does you find the critical path. Optimizing off the critical path is throwing money away. In this case, keeping the old names with those parameters without any urgent performance need is just hazardous:
addValue(12,47) addValue(47,12)
Eric's suggestion is fine when people use it, but it requires constant diligence. If we're up for constant diligence, why not be diligent about names so that we don't have to be as diligent at call sites? Save some work?
I'm liking the refactored naming approach Michael Feathers is recommending, when renaming is possible. But there seem to be situations where renaming is not feasible (3rd-party libraries, fixed APIs, big gradual safe refactorings, etc), it seems like some of the various ideas discussed in this thread are ways to clarify the intent of the code.
Of those ways, I'm liking the approach of initializing local variables, because it's clear, and allows comments about where the magic number comes from, and maybe its units. example:
int value = 10; // meters, mid-range int location = 5; // feet, lower-left corner doIt( value, location);
or possibly
int valueMidRangeMeters = 10; int locationLowerLeftFeet = 5; doIt( valueMidRangeMeters, locationLowerLeftFeet);
Within the spirit of the thread, some other ideas include:
doIt(10, 5); // doIt(int value, int location)
or shorter versions like:
doIt(10.0, 5.0); // value, location
Preference: Focus on naming: mehtods, then local variables.
This is true if we control the code and can change the method name but the problem still remains if we are using a badly named library method whose signature we cannot change.