This post originated from an RSS feed registered with Agile Buzz
by Keith Ray.
Original Post: Premature Optimization
Feed Title: MemoRanda
Feed URL: http://homepage.mac.com/1/homepage404ErrorPage.html
Feed Description: Keith Ray's notes to be remembered on agile software development, project management, oo programming, and other topics.
So what to optimize is easy: those parts of the program that are consuming the time. But local optimizations that ignore global performance issues are meaningless. And first-order effects (total time spent in the allocator, for example), may not be the dominant effects. Measure, measure, measure.
(He has examples of second-order and third-order effects.)
Why doesn't efficiency matter when you're updating menus or controls? Look at the human factors. A mouse is held approximately 2 feet from the ear. Sound travels at approximately 1100 ft/sec. This means that it takes approximately 2ms for the sound of the mouse click or keystroke to reach the ear. The neural path from the fingertip to the brain of an adult is approximately 3 feet. Propagation of nerve impulses is approximately 300 ft/sec, meaning the sensation of the mouse click or keystroke takes approximately 10ms to reach the brain. Perceptual delay in the brain can add between 50 and 250ms more.
Now, how many Pentium instructions can you execute in 2ms, 10ms, or 100ms? In 2ms, on a 500MHz machine that's 1,000,000 clock cycles, so you can execute a lot of instructions in that time. Even on a now-clunky 120MHz Pentium there is no noticeable delay in handling the controls.
And yet, UI on some Microsoft [and Apple :-( ] products is sometimes slower than my (not terribly fast) typing-speed... but seems to come from poor algorithms and data structures (or n-th order effects like cache-misses), not some skanky c/c++ "function-call efficiency" issues.
[benchmark optimization based on cache archituture]... That's a factor of 20 performance improvement based solely on fourth-order effects, cache line hits. If you're doing image processing, particularly of large images, an awareness of cache issues (even relatively machine-independent) can buy you an order of magnitude performance.
...
Perhaps the best example of pure programmer stupidity in "optimizing" code occurred when I was porting a large library we used for our research project. Think of it as a 16-bit-to-32-bit port [...] after 12 solid hours of debugging [...] Another 5 hours [...] instead of seeing something like something->p1 = NULL; something->p2= NULL;, I found the moral equivalent of (*(DWORD*)&something.p1) = 0! When I confronted the programmer, he justified it by explaining that he was now able to zero out two pointers with only a single doubleword store instruction [...] Of course, when the pointers became 32-bit pointers, this optimization only zeroed one of the two pointers, leaving the other either NULL (most of the time), or, occasionally, pointing into the heap in an eventually destructive manner. I pointed out that this optimization happened once, at object creation time; the average application that used our library created perhaps six of these objects, and that according to the CPU data of the day before I'd spent not only 17 hours of my time but 6 hours of CPU time, and that if we fixed the bug and ran the formerly-failing program continuously, starting it up instantly after it finished, for fourteen years, the time saved by his clever hack would just about break even with the CPU time required to find and fix this piece of gratuitous nonsense. Several years later he was still doing tricks like this; some people never learn.