Agile Buzz Forum - Use of variables in methods

James Robertson


David Buck, Smalltalker at large


Posted: Jan 17, 2004 3:02 PM

This post originated from an RSS feed registered with Agile Buzz by James Robertson.
Original Post: Use of variables in methods
Feed Title: Michael Lucas-Smith
Feed URL: http://www.michaellucassmith.com/site.atom
Feed Description: Smalltalk and my misinterpretations of life

I had a few minutes last night, so I decided to knock together the start of a simple Smalltalk CPS compiler. So far it does the following:

Parses Smalltalk code using the Refactoring Browser's parser (this seems to have trouble with numbers?)
Flattens the parse tree into a list of statements. Each statement is a receiver, a selector and arguments. The receiver and arguments are always variables once all statements are flattened.
Optimises the variables. If a variable is no longer used after a statement, then the statements below it can reuse that variable's slot. Likewise, if a result is only going to be used by the next statement, it is left in the 'return variable'.
Generates a basic description of the bytecodes needed to achieve the output of the program.
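The flattening step can be illustrated with a small Python sketch. This is not the compiler's actual code: the tuple-based tree, the `flatten` helper and the `t1, t2, ...` temporary names are all assumptions made for the example, and the Refactoring Browser's real parse nodes look different.

```python
from itertools import count

# Hypothetical mini-AST (an assumption for this sketch): a variable is a str,
# a message send is a tuple (receiver, selector, [arguments]).

def flatten(node, statements, fresh):
    """Return the variable holding node's value, appending flattened
    (result, receiver, selector, args) statements along the way."""
    if isinstance(node, str):  # already a variable, nothing to emit
        return node
    receiver, selector, args = node
    recv_var = flatten(receiver, statements, fresh)
    arg_vars = [flatten(arg, statements, fresh) for arg in args]
    result = f"t{next(fresh)}"  # fresh temporary for this send's result
    statements.append((result, recv_var, selector, arg_vars))
    return result

def flatten_expression(node):
    """Flatten one expression tree into a list of statements."""
    statements = []
    flatten(node, statements, count(1))
    return statements

# (a + b) * (c foo: d)
tree = (("a", "+", ["b"]), "*", [("c", "foo:", ["d"])])
for statement in flatten_expression(tree):
    print(statement)
```

After flattening, every receiver and argument is a plain variable name, which is the property the later variable analysis relies on.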

It has a lot of rough edges already, but that's okay because the exercise was to see how variables behave. I've learnt a lot of interesting things, such as:

Arguments passed into a method nearly always survive to the last statement of the method, or they are not used at all.
Most variables are used only in the next statement.
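One way to check observations like these on flattened code is to record where each variable is read. The sketch below assumes statements are `(result, receiver, selector, args)` tuples; the statement list and the names `anArg`, `store:` and `with:` are made up for illustration and are not measurements from a real image.

```python
def read_positions(statements):
    """Map each variable to the set of statement indices that read it."""
    reads = {}
    for index, (_result, receiver, _selector, args) in enumerate(statements):
        for var in (receiver, *args):
            reads.setdefault(var, set()).add(index)
    return reads

def used_only_in_next(statements):
    """Results whose only reader is the statement directly after them."""
    reads = read_positions(statements)
    return [result
            for index, (result, *_rest) in enumerate(statements)
            if reads.get(result) == {index + 1}]

# Hypothetical flattened method body with one argument, anArg.
statements = [
    ("t1", "self", "items", []),
    ("t2", "t1", "size", []),
    ("t3", "t2", "+", ["anArg"]),
    ("t4", "self", "store:", ["t3"]),
    ("t5", "t4", "with:", ["anArg"]),
]

print(used_only_in_next(statements))
print(max(read_positions(statements)["anArg"]))  # last statement reading anArg
```

In this made-up method the temporaries are each consumed by the very next statement, while the argument is still being read in the final statement.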

These two points are significant. The implications are that:

We should not use the 'return variable' to store the address we're jumping to, because it is cheaper to have it as one of the arguments to the method we're jumping to.
We should free up as many general-purpose registers as we can. Previously I said we have four free, because we used the following four: continuation address, stack allocation pointer, stack pointer and heap pointer. We can reduce these further, and I'll explain how below.

Reducing register usage: SS is your stack segment and ESP is the pointer into the stack. So we've lost one register already. We always allocate onto the top of the stack, so we don't need a stack allocation pointer per se, but if we re-enter this method via a continuation at some later point, ESP needs to have been adjusted to point to where it truly should be in the stack. When we save the continuation, we need to save ESP as a non-relative address. (It's been a long time since I've done assembler at this level, so I'll have to look up whether this is possible on x86.)

Why bother with a heap pointer at all? We have DS, our data segment, and at the start of it will be an allocation information structure header record thing. DS is always private to your program under Linux and, I assume, under Windows. (Assumptions will let you down under Windows ;))

This means we only need two special registers: continuation pointer and stack pointer.

During the course of an object-oriented call, we must look up the location we're jumping to, then jump to it. This costs us: a register to point to the method name; a jmp to look it up, after which the register that pointed to the method name holds a pointer to the method we're jumping to; a register to hold the continuation; and then a jmp to the method pointer.
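The call sequence can be modelled in a few lines of Python. This is only a toy: the `class_table`, the `lookup` and `send` helpers and the two-entry `regs` dictionary are assumptions standing in for real method dictionaries and machine registers, and methods are written as functions that take their continuation as an argument.

```python
# Hypothetical class table: each class has a superclass name and a method
# dictionary. Methods are in continuation-passing style: they take a
# continuation k and "return" by calling it.
class_table = {
    "Object": {"super": None,
               "methods": {"printString": lambda k: k("an Object")}},
    "Point": {"super": "Object",
              "methods": {"x": lambda k: k(3)}},
}

def lookup(class_name, selector):
    """Walk the superclass chain until a method for selector is found."""
    while class_name is not None:
        entry = class_table[class_name]
        if selector in entry["methods"]:
            return entry["methods"][selector]
        class_name = entry["super"]
    raise LookupError(f"doesNotUnderstand: {selector}")

def send(class_name, selector, continuation):
    regs = {"call": selector, "cont": None}          # register points at the method name
    regs["call"] = lookup(class_name, regs["call"])  # lookup replaces it with the method pointer
    regs["cont"] = continuation                      # second register holds the continuation
    return regs["call"](regs["cont"])                # jump to the method

print(send("Point", "x", lambda value: value + 1))
print(send("Point", "printString", lambda value: value))
```

Only two "registers" are touched by the send itself, which is the point the paragraph above is making; everything else is left free for arguments and temporaries.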

This is four operations in bytecode, but more than four in actual assembly. Polymorphic inline caching improves the speed of the lookup, but given the nature of a dynamic call, this is generally the best that can be done. It also highlights that we need two registers to make a call.

This means we have a register for the call address, a register for the continuation address and a register for our stack. Of EAX, EBX, ECX, EDX, EBP, ESP, ESI and EDI, we have used ESP, EDI and ESI, leaving five registers free.

Can we do better? Remember that one of the registers has to be nominated as the 'return variable'. This is likely to be EAX, as it is the most common choice. We could use EAX as the call address register, but we learnt above that it is cheaper to keep it as a general-purpose register.

Conclusion: We've saved a register and rationalised the real way the system will call other methods. Are five registers enough? Given that the majority of calls use up to three registers, and those registers will be either long-lived or very short-lived (let's say two out of those three registers are long-lived, leaving us with only two real registers used), we have three 'temporary' registers to play around with for argument calls.
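The register budget is simple arithmetic, spelled out here as a small Python sketch. The mapping of ESI and EDI to particular roles is an assumption (the post only says which three registers are taken), and the figure of two long-lived registers is the estimate from the paragraph above, not a measurement.

```python
# The eight 32-bit x86 general-purpose registers.
GENERAL_PURPOSE = ["EAX", "EBX", "ECX", "EDX", "EBP", "ESP", "ESI", "EDI"]

# Which of ESI/EDI holds which value is an assumption made for this sketch.
RESERVED = {
    "ESP": "stack pointer",
    "ESI": "call address",
    "EDI": "continuation address",
}

free = [reg for reg in GENERAL_PURPOSE if reg not in RESERVED]

LONG_LIVED = 2  # estimate from the text: two long-lived values per method
temporaries = len(free) - LONG_LIVED

print(free)
print(temporaries)
```

With three registers reserved, five remain, and setting aside two for long-lived values leaves the three 'temporary' registers discussed above.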

Are three 'toy' registers enough? Well, so far, of the methods I've compiled, most have used only the return register and one other. Other methods quickly jump up to four registers in use, but often those methods have only one argument. What this means is that the x86 is really 'pushing it'. We're constantly on the 'edge' of not being able to do it. Any fewer registers and the technique would probably have to be dumped. Is this just luck? I think it's actually a property of Smalltalk programs (and similar languages) more than anything else.

Once I've gotten far enough through with the compiler, I'll have it run over a large portion of the Smalltalk image to get some real statistics on register spill.

(Just to clarify: There may actually be 20 variables in the method, but because statements go out of scope, there may only be two variable slots required for the whole method to complete)
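The 20-variables-in-two-slots effect can be reproduced with a simple slot allocator. This is a sketch under stated assumptions, not the compiler's own optimiser: statements are `(result, receiver, selector, args)` tuples, only statement results are given slots (`self` and method arguments are assumed to live elsewhere), and a slot is released only after the statement that last reads it, so that "statements below" can reuse it.

```python
def allocate_slots(statements):
    """Assign each statement result a slot, reusing slots whose variable is
    never read again. Returns (assignment, number_of_slots_needed)."""
    last_read = {}
    for index, (_result, receiver, _selector, args) in enumerate(statements):
        for var in (receiver, *args):
            last_read[var] = index

    assignment, free_slots, next_slot = {}, [], 0
    for index, (result, receiver, _selector, args) in enumerate(statements):
        # Pick the result's slot first, so it never shares with its operands.
        if free_slots:
            slot = free_slots.pop()
        else:
            slot = next_slot
            next_slot += 1
        assignment[result] = slot
        # Operands read for the last time here release their slots
        # for the statements below.
        for var in {receiver, *args}:
            if var in assignment and last_read[var] == index:
                free_slots.append(assignment[var])
    return assignment, next_slot

# Hypothetical method: 20 temporaries, each used only by the next statement.
chain = [("t1", "self", "step", [])]
chain += [(f"t{i}", f"t{i - 1}", "step", []) for i in range(2, 21)]

assignment, slots_needed = allocate_slots(chain)
print(len(assignment), slots_needed)
```

For this chain the allocator alternates between two slots, because each temporary dies as soon as the next statement has consumed it.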
