Last night I started thinking about Futures and Erlang and how fantastic they are. I had a simple question - how can you optimise away the multiple processes that are created in a Futures environment when they are unnecessary automatically so that the programmer doesn't have to think about it and you still have a fast running system?
The answer is pretty simple. If the future is to be used straight away (which you can find out with very simple bytecode analysis) then you can run the future in the same thread but in a new stack frame.
Okay, step back - let's talk about Futures and Continuations a little bit again. I've discussed Continuations and Continuation-Passing-Style on this blog before and it's a topic that still interests me greatly. First a refresher. This is what a normal stackbased execution model looks like (please excuse the crude drawings done in The Gimp):
Continuations work such that when you call a method, you pass to that method the next place for that method to run to when it returns. That means the method never 'returns' as such, but does a 'goto' to the next place of execution. This requires passing the 'closure' of the frame you're in that's relevant to the next piece of executable code in your method. No worries, we call this combination of closure + execution point the Continuation (ie: everything you need to know to 'continue' from somewhere).
Now, let's step to the side for a moment and discuss Futures. Futures (also known as Promises) execute every message send in a new thread and return to the calling thread an object called a Future immediately. This means that the sending thread continues to run while the first message send is running. When the sending thread tries to send another message to its result (which is a Future), it blocks and waits for the future to stop. The basic parallel computing model looks like this:
The futures model, on the other hand, promotes asynchronise execution - parallelism in such a way that the developer doesn't have to think about it. Futures execution model looks like this:
Futures have this fantastic feature in that you only ever have one stack frame per process. You end up with lots of processes instead of lots of stack frames. The problem becomes how do you manage all those processes - they are all processes, just most of them are blocked/suspended. Now, continuations have this fantastic feature where you never return, thus you only have the 'required' amount of a stack at any one time.
So, I wondered, what would happen if these two techniques are combined. An execution model that never returns and only ever has one stack frame per process. It'd look something like this:
This looks incredible and very powerful. The big problem of managing lots of processes is gone, because you only have a few running processes at any one time (which can be recycled) as whenever a process blocks, it terminates and sets a continuation in the future. We end up with lots of continuations instead of processes, but continuations that have only one stack frame are really really cheap.
Food for thought, isn't it...