This post originated from an RSS feed registered with Agile Buzz
by James Robertson.
Original Post: The cost of non-context CPS in Smalltalk
Feed Title: Michael Lucas-Smith
Feed URL: http://www.michaellucassmith.com/site.atom
Feed Description: Smalltalk and my misinterpretations of life
What I'm about to cover here doesn't just limit itself to Smalltalk. This is a general issue with the most effecient form of CPS out there - non-context CPS. Non-Context CPS makes a new method for every variant of a method you have. This means every path through your entire system is pre-mapped out at compile time.
The cost upfront is time to compile and memory for all the code, the benefit at runtime is extremely fast code with no stack - that's about as fast as a program can run on most CPU's
So, how can we identify how many method variables we might need? Well, since we're in Smalltalk, we have the reflection abilities of Smalltalk to help us find out. First of all, lets make a nice convenience method:
This handy method lets us get the parse tree for any method in the system. We can get all the methods in a class by doing:
aClass methodDict
And we can get all the classes in the system by doing:
Object allSubclasses
This lets us analyse the entire Smalltalk code base! But! What to analysis? First off, I thought it'd be good to know the last used position in any parse tree of every variable. Given this, I know all the methods before it have to pass that variable as an argument.
To do this, I threw together a subclass of ProgramNodeEnumerator which is used to iterate over a parse tree. On it I record the last node use of each variable.
Next, I then go through and add up the selectors being called against the number of arguments they'll need to pass. Do this over the entire image and we end up with the number of variants for every method.
Aha, .. here's the catch. Smalltalk is Dynamic - so the maximum number of possible arguments being passed (or, the number of argument size variations) times the number of methods in the system is the actual number of variations the system will have.
I didn't analyse my entire image, that would have taken too long. So I took a reasonable snapshot of it. The answer: every argument count variation up to 41 existed, plus a 45. This means 42 variations for every method.
How many methods are there in the image? 80000. So 80000 * 42 is 3,360,000 methods.
This quite clearly demonstrates that a non-context approach for any sizeably complex application is not feasable. This leaves us with the previous option of constructing Continuation/Message objects at every single step in the application. Is that so bad?
Well, lets look at it a different way. Every call you move on to next has one - and you don't need to remember the last one (because we never return!).. so we can reuse it. Now that we know that our application will never pass more than 42 arguments, we also know the biggest size the next Continuation can be. This means, we leave a single spot in memory 42 slots big for the next call. The majority of arguments will be a straight copy or reordering, popping in a return value or removing one out of context.
I may blog about this again - while it looks seemingly expensive from my previous blog item, it turns out to be fairly cheap.