This post originated from an RSS feed registered with Agile Buzz
by Keith Ray.
Original Post: Code Smells - Long Method
Feed Title: MemoRanda
Feed URL: http://homepage.mac.com/1/homepage404ErrorPage.html
Feed Description: Keith Ray's notes to be remembered on agile software development, project management, oo programming, and other topics.
Over recent years, in mailing lists and newsgroups, a conversion about refactoring is repeated many times which goes something like this:
A: "Refactoring is [gold-plating | rewriting | unnecessary ]"
B: "No, refactoring is improving the design without changing the functionality."
A: "Ungh. I don't have time for that. The code is good enough as it is. "
and so on... and sometimes...
B: "Of course you'll want to have an automated test-suite to insure your refactorings don't accidentally change the intended behavior.
A: "I certainly don't have time to write tests!
Introduction Two
Textbooks and courses try to teach good design, almost never show examples of bad design. The only examples I know of are the excellent books Refactoring by Martin Fowler, Refactoring to Patterns by Joshua Kerievsky and Working Effectively With Legacy Code by Michael Feathers. Code smells (names for various categories of bad design) has a chapter in "Refactoring" but could just as easily have a whole series of books.
This essay is an attempt to describe one code smell, and how to correct it. The code smell is known as "Long Method" (or Long Function in a non-object oriented language.)
The long method represents a lack of proper encapsulation -- it does too much itself, and doesn't delegate that work to the proper authorities. The irony here is the young coders sometimes think putting everything in one method or in one class is encapsulation. A long method often has varying levels of abstraction within a single block of code.
Besides having "too much information" in one place, long methods prevent appropriate code re-use: they encourage cut-and-paste code duplication instead of re-use by calling methods and instantiating classes. Long methods are hard to understand. Hard to debug. And they are very hard to test.
Here's an example - a 284-line function from an unnamed open-source project:
In this example, the first block of code after the variable declarations starts with a comment "Remove all carriage returns". Any time you see a block of code within a long method that tells what that code is doing, that's a candidate for an Extract Method refactoring. We extract out that code, and use the comment as the name of the extracted method. In this example, 21 lines of code in this method are now replaced with one line of code. And you can bet that somewhere else in our project, we will also want to do something like this - if not removing carriage returns, then maybe removing some other character. This extracted method is also easy to test.
The next comment in the code is "Trim all trailing white space" but unfortunately the code to do that is intermingled with code performing other activities. If the original programmer had been programming by intention, he would have created a method named TrimWhiteSpace, which would have been simple to write and simple to test. In this example, understanding this intermingled code and extracting out the TrimWhiteSpace code is going to be difficult enough that I would postpone that refactoring... getting other cleanup done may allow this refactoring to be easier to do, later.
Even if you can't get the code as clean as you would like in one pass, doing small amounts of refactoring every day will help increase your productivity, code reuse, and testability