Summary:
Computers make life easier because they're so fast, right? Well, yes and no. Do you write efficient code? The author reveals some disconcerting inefficiencies lurking in commonly used software and development practices.
|
In this article, the author reveals some disconcerting inefficiencies lurking in commonly used software and development practices:

http://www.artima.com/cppsource/how_to_go_slow.html

What do you think of Colvin's points about these seeming inefficiencies?
|
He mentioned BubbleSort so I immediately stopped reading.
|
> He mentioned BubbleSort so I immediately stopped reading.
I think you are being too cynical. This is a good article explaining basic sources of inefficiencies for people who don't want a PhD in complexity theory.
|
And Bubble Sort is very efficient if there are only a few items out of place. If I'm not mistaken, Python's built-in sort (Timsort) works on a similar insight: it's a hybrid of merge sort and insertion sort that exploits runs in the data that are already in order.
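The trick, for what it's worth, is the early-exit check: a pass that swaps nothing means the data is sorted, so on nearly-sorted input you do a pass or two and stop. A quick Java sketch (mine, not from the article):

    // Bubble sort with an early exit: a pass that makes no swaps
    // means the array is sorted, so nearly-sorted input is ~O(n).
    static void bubbleSort(int[] a) {
        boolean swapped = true;
        for (int end = a.length - 1; swapped && end > 0; end--) {
            swapped = false;
            for (int i = 0; i < end; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;
                    swapped = true;
                }
            }
        }
    }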
|
I believe that the main cause of slow (and resource-hungry) programs today is the use of prefabricated libraries and components. These components are often too general for the purpose they are used for, and are often built from other components.
Each such layer also adds unnecessary conversions of data and data structures.
The slowdown in these layers is mostly invisible during testing, because today's systems are too powerful and have too much memory. Tests are usually designed to check for "perceived" slowness rather than against theoretical expectations of how many resources the component should need.
However, once you start stacking these components together, the inefficiencies will become apparent.
Also, I believe that the (current) object-oriented programming paradigm is for the most part responsible for this situation (an opinion that will be quite hated on Artima, I guess :)). In my opinion, object-oriented programs hide the fundamental data structures of the program in vast networks of little objects. This prevents understanding and optimization of those data structures, and thus prevents the use of optimal algorithms (the whole refactoring business is just accommodating different data structures to the program so that better algorithms can be used).
There are ways out, in my opinion. One is to take a look at VHDL, for instance. It is a very modular language (for hardware design), yet the final design is optimized as a whole - there are no redundant data conversions after synthesis.
A second approach is to decouple the data structures used in the program from the actual program logic. Any program that uses a database essentially does that - the common point is the database schema, and how exactly (with what algorithms) it is implemented in the database is not relevant. The database implementation can then be optimized independently of the program, by the choice of indexes for example. This could be done at a much finer level, directly in memory. But that means centralizing data back out of the many little objects, which contradicts the current OO paradigm.
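To make that second approach concrete, here is roughly what I mean, sketched in Java (the names are mine, purely for illustration): the program logic depends only on a small "schema" interface, and the data structure behind it can be reorganized or re-indexed without touching that logic.

    import java.util.*;

    // The "schema": program logic depends only on this interface.
    interface OrderStore {
        void add(String orderId, String customerId);
        List<String> ordersFor(String customerId);
    }

    // One implementation choice: keep an index per customer. It can
    // be swapped for a flat list, a sorted array, etc. without
    // changing any code that talks to OrderStore.
    class IndexedOrderStore implements OrderStore {
        private final Map<String, List<String>> index = new HashMap<>();

        public void add(String orderId, String customerId) {
            index.computeIfAbsent(customerId, k -> new ArrayList<>()).add(orderId);
        }
        public List<String> ordersFor(String customerId) {
            return index.getOrDefault(customerId, Collections.emptyList());
        }
    }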
|
Jan wrote: "I believe that the main cause of slow (and resource-hungry) programs today is the use of prefabricated libraries and components. These components are often too general for the purpose they are used for, and are often built from other components."
I hear what you're saying about prefab libraries and components. I think I've run into a situation in which the "obvious" prefabricated choice doesn't cut it.
Off and on in the past couple weeks I've worked on creating a little app for my beau. It's in JRuby for historical reasons, and it eats memory something fierce.
Well, a hash is the most obvious choice for processing the mound of data in the way it needs processing. The built-in hash structures that come with Ruby, Python, Java, etc. store not only the "values" but also the "keys" used to access the values. (That's why you can ask these built-in hashes for a list of keys, and they'll cough them right up for you.) But, wouldn't you know, I'm not interested in ever seeing the keys again in my beau's app.
So a couple nights ago I created a fresh data structure, which is sort of a cross between an array and a hash, that doesn't bother to store the "keys" I feed it. Yes, it uses the built-in Array structure, but at least it takes less memory than the Hash. I still need to incorporate my mini-hash with the rest of the app and give it a whirl, though.
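The core of it looks something like this if I translate it to Java (JRuby runs on the JVM, so the same trick applies; this is just a sketch, and it assumes I can live with the slim chance of two keys sharing a hash code):

    import java.util.HashMap;
    import java.util.Map;

    // Remember only each key's int hash, never the key object itself.
    // You can no longer enumerate the keys, and two distinct keys with
    // equal hashCode() values will silently collide -- the trade-off
    // for not holding the key objects in memory.
    class ValueOnlyMap<V> {
        private final Map<Integer, V> byHash = new HashMap<>();

        void put(Object key, V value) { byHash.put(key.hashCode(), value); }
        V get(Object key)             { return byHash.get(key.hashCode()); }
        int size()                    { return byHash.size(); }
    }

(A real memory win would want a primitive-int-keyed table rather than boxed Integers, but it shows the shape of the idea.)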
|
> > He mentioned BubbleSort so I immediately stopped reading.
>
> I think you are being too cynical. This is a good article

So, I reread it. Here are some pearls of wisdom...
"Looping also amounts to How To Run Hot, where heat means wasted energy and fried equipment, and cooling also wastes a lot of expensive energy."
"You still may never get out of the loop, but at least you won't be hogging a CPU the whole time."
"The easiest way to avoid this is to avoid languages with undefined behavior"
Other than resource locking and some C-specific advice on how best to iterate over a 2D array, there's nothing useful there. Do you actually consider "Don't write infinite loops", "Don't use Bubble Sort" and "Don't dereference NULL" to be useful advice?
What if I wrote an article on extreme programming best practices with advice like "don't drop the monitor on your groin", "avoid pair programming with a schizophrenic psychopath" and "don't delete your entire CVS repository by mistake"? That's pretty close to his advice. If it were closer to April I'd think the article is a deliberate joke.
But it's inspired me to write one. Hey Artima, how do I get to write something for you?
|
When all I care about is the values in awk's associative array I just use the values as the keys.
So instead of array[i] = x and then for (i = 1; i <= n; ++i) x = array[i]... I write ++array[x] and then for (x in array) ...
And yes - it can be a good strategy to start with what your language and library provide and then rewrite as needed.
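The same idiom translated to Java, for comparison (just a sketch):

    import java.util.*;

    // awk's "++array[x]" followed by "for (x in array)": use the
    // values themselves as the map's keys, counting occurrences.
    static Map<String, Integer> countValues(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String x : values) {
            counts.merge(x, 1, Integer::sum);  // ++array[x]
        }
        return counts;  // iterate counts.keySet(), i.e. for (x in array)
    }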
|
> What if I wrote an article on extreme programming best
> practices with advice like "don't drop the monitor on your
> groin", "avoid pair programming with a schizophrenic
> psychopath" and "don't delete your entire CVS repository
> by mistake"?

Obviously you are a guru .. ;-)

Peace,

Cameron Purdy | Oracle
http://www.oracle.com/technology/products/coherence/index.html
|
Well yes, of course I'm deliberately joking, so perhaps Chuck should have waited until April. But there is a serious point to my jokes, which is to get people to pay attention to performance.

Of course no one purposely loops infinitely, dereferences a null pointer, chooses a quadratic algorithm over a linear one, or otherwise writes bad code. And indeed you'd think no one needs to be reminded. Nonetheless, programs too often loop infinitely, crash, and exhibit non-linear degradation in performance. So I gave a few examples of how that can happen.

The upshot of the complexity section was to first check the C++ standard library, which carefully documents the complexity of its algorithms. And if it doesn't have what you need, then be sure you know the complexity of the code you use or write instead. Again, this should be obvious. Nonetheless, the programs on my computer spend a lot of time chewing up resources to no apparent purpose. So I gave an example of two similar functions with much different computational complexity.

The suggestion to use languages with no undefined behavior was serious. There are a lot available that work well with C++ (e.g. Boost.Python) and you can also implement custom mini-languages in C++ to make particular tasks easier to do safely. You just need to beware of the performance tradeoffs.

And I'm rather liking your idea of an Extreme Programming Worst Practices article. If you submit it here we'd be happy to review it: http://www.artima.com/write.html
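In that spirit, here is the kind of pair I mean, in Java rather than C++ (my illustration here, not the pair from the article): two functions that look almost identical but differ between quadratic and linear time.

    // Quadratic: each += copies everything appended so far,
    // so n appends touch O(n^2) characters in total.
    static String joinSlow(String[] words) {
        String s = "";
        for (String w : words) s += w;
        return s;
    }

    // Linear: StringBuilder appends in amortized constant time.
    static String joinFast(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) sb.append(w);
        return sb.toString();
    }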
|
> Well yes, of course I'm deliberately joking, so perhaps
> Chuck should have waited until April. ...
> And I'm rather liking your idea of an Extreme Programming
> Worst Practices article. If you submit it here we'd be
> happy to review it: http://www.artima.com/write.html

Thanks for the link! Hopefully I'll be inspired to write such an "article". :-)
|
> Well yes, of course I'm deliberately joking, so perhaps
> Chuck should have waited until April. But there is a
> serious point to my jokes, which is to get people to pay
> attention to performance.
I'll try to make the point that I think Morgan is making in a more constructive tone.
This article is a good one for developers who don't know their asses from holes in the ground. In my experience that's probably anywhere from 70-85% of developers. However, the portion of those developers reading Artima articles is pretty small. I think this article is inappropriate for the audience.
Probably every person reading this already knows to yield in a loop when you are polling. Everybody reading this knows that bubblesort is for boneheads. And finally everyone reading this knows it's impossible to write a Java program that doesn't crash or hang.
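(For the record, the polling idiom in question looks something like this in Java - backing off between checks so the loop doesn't monopolize a core; workIsReady() is a stand-in for whatever condition you're polling:)

    // Poll for a condition without hogging a CPU: back off
    // between checks instead of spinning at full speed.
    while (!workIsReady()) {             // workIsReady(): hypothetical
        try {
            Thread.sleep(50);            // or Thread.yield()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            break;                       // stop polling if interrupted
        }
    }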
|
> Also, I believe that the (current) object-oriented programming
> paradigm is for the most part responsible for this situation
> (an opinion that will be quite hated on Artima, I guess :)).
> In my opinion, object-oriented programs hide the fundamental
> data structures of the program in vast networks of little
> objects. This prevents understanding and optimization of those
> data structures, and thus prevents the use of optimal
> algorithms (the whole refactoring business is just
> accommodating different data structures to the program so that
> better algorithms can be used).
I disagree on a number of these points. I think it's a gross misunderstanding of OO. It's a common misconception that OO means every object must have its own internal copy of the data structure it represents. Well-designed OO is also much easier to optimize in a lot of cases, because data structures can be changed without affecting anything that depends only on an object's public interface. Lastly, I think that's the worst description of refactoring I've ever seen. In fact most refactorings modify classes and object graphs without affecting the underlying structures or behaviors.
|
> Well, a hash is the most obvious choice for processing the
> mound of data in the way it needs processing. The built-in
> hash structures that come with Ruby, Python, Java, etc.
> store not only the "values" but also the "keys" used to
> access the values. (That's why you can ask these built-in
> hashes for a list of keys, and they'll cough them right up
> for you.) But, wouldn't you know, I'm not interested in
> ever seeing the keys again in my beau's app.
If you can guarantee that your keys' hash codes are unique, then you can definitely implement a hash structure that doesn't keep references to the keys.
One thing that I've long considered a flaw in the Java Map interface is that it puts far too much burden on the implementor. Often I just want an interface that has a single method: get(key) -> value.
It seems to me that this should be the base of the Map API, with subinterfaces layering on more functionality. That would make it possible to implement your structure as a standard map.
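Something like this, say (names hypothetical, just sketching the shape):

    // The minimal contract: a lookup and nothing else.
    interface Lookup<K, V> {
        V get(K key);
    }

    // Fuller maps would extend it, rather than forcing every
    // implementor to support the whole Map surface up front.
    interface SimpleMap<K, V> extends Lookup<K, V> {
        V put(K key, V value);
        int size();
    }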
|
There are a number of potential alternative designs for the collection classes. I sometimes wish for a separate interface for read-only collections, instead of just throwing an exception if a mutating method is called. The trouble is that most such alternatives greatly increase the number of interfaces and implementations.
The API we have is a compromise, where the classes/interfaces usually do rather more than we need in any one instance, but we don't have to remember so much. True this does cause more work for implementers, but most people merely use the provided classes, and for those that do implement collections the abstract collections do much of the work.
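To illustrate the trade-off, the read-only split I'm wishing for would look roughly like this (hypothetical names) - and you can see how the interface count starts multiplying once you repeat it for sets, maps, iterators, and so on:

    // Mutability expressed in the type system, instead of via
    // UnsupportedOperationException at runtime.
    interface ReadableList<E> {
        E get(int index);
        int size();
    }

    interface WritableList<E> extends ReadableList<E> {
        void set(int index, E element);
        void add(E element);
    }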
|