Summary
Dynamic language support on the JVM depends to a great extent on the VM's ability to handle dynamic method call and dispatch. Sun's John Rose, spec lead of JSR 292, Supporting Dynamically Typed Languages on the Java Platform, explains how the JVM handles method invocations, and comments on improvements in support of dynamic languages.
Initially designed to execute bytecode compiled from the Java programming language, the JVM has in recent years become a popular execution environment for non-Java code as well. Unlike Java, some of the languages now targeting the JVM are dynamically typed, and how well the VM executes their code depends in part on how it handles their method invocations.
Sun's John Rose, spec lead of JSR 292, Supporting Dynamically Typed Languages on the Java Platform, penned Anatomy of a Call Site, a brief exposé of method invocation mechanics in the JVM with references to dynamic language support.
One of the topics Rose discusses is how the VM handles method arguments:
There is no essential reason the JVM cannot convert the arguments, as long as the conversions “preserve information”. More specifically, they should not violate intentions shared by the caller and the callee. The JVM cannot know such intentions, but it can provide conventions which align well with the implicit conversions found in most languages.
For example, if the caller passes an int and the callee receives an Integer wrapping the passed value, no type safety is violated, no information is destroyed, and the intentions of caller and callee should continue to match accurately. The Java compiler performs this conversion (called “autoboxing”) as part of method calls. The inverse conversion (“unboxing”) is also reasonable.
Conversion between reference types is also reasonable: If a caller passes a String and the callee expects any Object (which includes strings), there would be no harm if the JVM allowed the different descriptors to match...
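The conversions Rose describes can be tried out in plain Java. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes the java.lang.invoke API (MethodHandle.asType) that shipped with later Java releases, which is not described in Rose's post. The first call relies on javac inserting the boxing call at compile time; the other two ask the runtime to adapt one descriptor to another.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class DescriptorConversions {
    // Callee with descriptor (Integer)String.
    public static String describe(Integer boxed) {
        return "Integer:" + boxed;
    }

    // Callee with descriptor (Object)int.
    public static int hashOf(Object any) {
        return any.hashCode();
    }

    // Autoboxing: javac compiles this as describe(Integer.valueOf(value)),
    // so the descriptors in the bytecode still match exactly.
    public static String callWithInt(int value) {
        return describe(value);
    }

    // Boxing requested from the runtime: adapt (Integer)String to (int)String.
    public static String callViaHandle(int value) {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    DescriptorConversions.class, "describe",
                    MethodType.methodType(String.class, Integer.class));
            MethodHandle adapted =
                    target.asType(MethodType.methodType(String.class, int.class));
            return (String) adapted.invokeExact(value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Reference widening: adapt (Object)int so it accepts a String argument.
    public static int hashViaHandle(String s) {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    DescriptorConversions.class, "hashOf",
                    MethodType.methodType(int.class, Object.class));
            MethodHandle adapted =
                    target.asType(MethodType.methodType(int.class, String.class));
            return (int) adapted.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithInt(7));
        System.out.println(callViaHandle(7));
        System.out.println(hashViaHandle("abc") == "abc".hashCode());
    }
}
```

Both conversions are of the information-preserving kind the quote describes: the boxed Integer carries the same value as the int, and a String is always a valid Object.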
Rose also discusses the VM's handling of return values, noting that:
There is no fundamental reason the JVM cannot return several values from a single call. It would be possible to slightly extend the syntax of method descriptors to allow several return values to be specified just as several argument[s] can be. This would be useful for languages that feature tuples; it would allow compilers to avoid boxing a tuple value on return from a method.
For languages which support dynamically selected multiple value returns (e.g., Common Lisp), a varargs return convention would be simple to create, corresponding to the varargs argument passing convention already in the JVM. Conversion between varargs and positional value passing would be intention preserving for return values just as for argument values.
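To see what boxing a tuple on return means in practice, consider how Java source code (or a compiler targeting the JVM) has to express a two-value result when a method descriptor names exactly one return type. The following is a minimal sketch with hypothetical names; it shows the workaround, not the extended descriptor syntax Rose proposes.

```java
public class TupleReturn {
    // Hypothetical holder class: the two results must travel inside one object
    // because the method descriptor allows a single return type.
    public static final class QuotRem {
        public final int quotient;
        public final int remainder;

        QuotRem(int quotient, int remainder) {
            this.quotient = quotient;
            this.remainder = remainder;
        }
    }

    // Fixed-arity "tuple" return: the holder object is the boxing step
    // that a multi-value return convention would make unnecessary.
    public static QuotRem divMod(int a, int b) {
        return new QuotRem(a / b, a % b);
    }

    // Variable-arity return, loosely mirroring varargs argument passing:
    // the callee decides at run time how many values come back.
    public static Object[] values(Object... results) {
        return results;
    }

    public static void main(String[] args) {
        QuotRem qr = divMod(17, 5);
        System.out.println(qr.quotient + " remainder " + qr.remainder);
        System.out.println(values("a", 1, 2.0).length);
    }
}
```

The Object[] form is roughly what a Common Lisp style implementation might fall back on today; each call may allocate an array, and primitive results are boxed.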
Even more interesting are Rose's comments on reflective versus normal method calls, and on the VM's behavior when an invoked method is not found:
There is another difference between the JVM and single-language systems ... and that is the JVM’s strong distinction between reflective and normal method invocation. As discussed below, reflective calling sequences are slower and more complex because they perform many steps of boxing, unboxing, dispatching, and access control on every call. With normal JVM calls, these steps, if done at all, are finished in a linkage step before the first call executes.
A “message not understood” hook appropriate to the JVM needs to work this way also: It need[s] to perform its linkage work once before a number (potentially unlimited) of actual method calls. When the call to the hook delivers an actual method to be used, this method should be associated with that call site and reused for similar calls in the future...
A final difference between the JVM and a specific language’s runtime is that the details of the “method not understood” should not be tailored to one language, but should rather be a general and flexible means of satisfying method calls. Single inheritance or even single-argument dispatch are too limited a range of functionality, especially for dynamic languages. (Consider the case of an extensible “add” operation in a symbolic algebra library.) This means that there needs to be a low-level convention for associating the receiver and argument types with an actual method that has previously been associated with those types, and can be reused in the future without further up-calls to the hook.
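The link-once, reuse-afterwards behavior described above can be simulated at user level. The sketch below is an assumption-laden illustration, not the JSR 292 design or anything the JVM does internally: CachingCallSite, MissHook, and addSite are hypothetical names. The call site keys its cache on both the receiver type and the argument type, so the hook is consulted only the first time a given pair of types is seen.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

public class CachingCallSite {
    // Hypothetical "message not understood" hook: given the dynamic types
    // of receiver and argument, it supplies the method to use from now on.
    public interface MissHook {
        BinaryOperator<Object> link(Class<?> receiverType, Class<?> argType);
    }

    private final Map<List<Class<?>>, BinaryOperator<Object>> cache = new HashMap<>();
    private final MissHook hook;
    private int upCalls = 0;

    public CachingCallSite(MissHook hook) {
        this.hook = hook;
    }

    public Object invoke(Object receiver, Object arg) {
        // Dispatch on both types, not just on the receiver.
        List<Class<?>> key = Arrays.asList(receiver.getClass(), arg.getClass());
        BinaryOperator<Object> target = cache.get(key);
        if (target == null) {
            // Linkage step: one up-call per new type pair, then cached.
            upCalls++;
            target = hook.link(receiver.getClass(), arg.getClass());
            cache.put(key, target);
        }
        return target.apply(receiver, arg);
    }

    public int upCalls() {
        return upCalls;
    }

    // Example hook for an extensible "add" operation.
    public static CachingCallSite addSite() {
        return new CachingCallSite((r, a) -> {
            if (r == Integer.class && a == Integer.class) {
                return (x, y) -> (Integer) x + (Integer) y;
            }
            if (r == String.class || a == String.class) {
                return (x, y) -> String.valueOf(x) + y;
            }
            return (x, y) -> ((Number) x).doubleValue() + ((Number) y).doubleValue();
        });
    }

    public static void main(String[] args) {
        CachingCallSite add = addSite();
        System.out.println(add.invoke(1, 2));
        System.out.println(add.invoke(3, 4));   // reuses the cached target
        System.out.println(add.invoke("a", 1)); // new type pair, one more up-call
        System.out.println("up-calls: " + add.upCalls());
    }
}
```

A real VM-level mechanism would avoid the hash lookup and the boxing on the fast path; the map here only stands in for the association between types and a previously linked method.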
What do you think of the JVM's support for dynamic languages, and of JSR 292?