Summary
A recent IBM developerWorks article discusses a new feature in IBM's Java Virtual Machine that lets all JVMs running on the same system share loaded classes. Such class sharing significantly reduces both JVM startup time and memory footprint.
Application startup time has improved significantly in recent versions of most vendors' Java Virtual Machines. One popular technique for reducing both VM startup time and a JVM's memory footprint is class sharing: the ability of the virtual machines running on a system to share loaded classes.
Class sharing is typically implemented with a shared memory area, often backed by a persistent cache on disk, that stores a memory-mapped representation of the classes shared by all JVMs on the system.
Class sharing is frequently limited to the JDK core libraries, which provide classes needed by every JVM. IBM's latest JVM takes the concept a step further: all system and application classes can be stored in a persistent dynamic class cache in shared memory, and the cache even supports runtime bytecode modification.
Ben Corrie's article, "Java technology, IBM style: Class sharing," explains how to enable this feature on IBM's JVMs and how class sharing works with various types of classes and class loaders.
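For readers who want to experiment, class sharing on IBM's JVM is controlled with the -Xshareclasses command-line option, with -Xscmx setting the size of a newly created cache. The cache name, size, and application in this sketch are purely illustrative:

    java -Xshareclasses:name=appcache -Xscmx64m -cp app.jar com.example.Main

Any subsequent JVM started with the same cache name attaches to the existing cache and loads already-stored classes from it instead of reading and verifying them from disk again.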
The article also explains what happens when a class definition in the shared cache changes. This is especially relevant for classes that are modified, or annotated, on the fly by bytecode instrumentation tools. Such runtime modification of classes is a popular technique with, for instance, object-relational frameworks and code coverage tools:
Runtime bytecode modification is becoming a popular means of instrumenting behaviour into Java classes. It can be performed using the JVM Tools Interface (JVMTI) hooks... alternately, the class bytes can be replaced by the classloader before the class is defined. This presents an extra challenge to class sharing, as one JVM may cache instrumented bytecode that should not be loaded by another JVM sharing the same cache.
However, because of the dynamic nature of the IBM Shared Classes implementation, multiple JVMs using different types of modification can safely share the same cache. Indeed, if the bytecode modification is expensive, caching the modified classes has an even greater benefit, as the transformation only ever needs to be performed once. The only proviso is that the bytecode modifications should be deterministic and predictable. Once a class has been modified and cached, it cannot then be changed further.
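To illustrate the kind of load-time modification the excerpt refers to, the sketch below uses the standard java.lang.instrument API to register a ClassFileTransformer from a premain agent. The agent, package prefix, and helper method are hypothetical, and a real transformer would rewrite the bytes with a bytecode library rather than return them unchanged:

    import java.lang.instrument.ClassFileTransformer;
    import java.lang.instrument.Instrumentation;
    import java.security.ProtectionDomain;

    // Hypothetical premain agent; class and method names are illustrative.
    public class TimingAgent {

        public static void premain(String agentArgs, Instrumentation inst) {
            inst.addTransformer(new TimingTransformer());
        }

        static class TimingTransformer implements ClassFileTransformer {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                // Leave JDK and third-party classes alone; returning null
                // tells the JVM to keep the original bytes.
                if (className == null || !className.startsWith("com/example/")) {
                    return null;
                }
                // The rewrite must be deterministic: the same input bytes
                // must always yield the same output bytes, so that a modified
                // class cached by one JVM remains valid for every other JVM
                // sharing the cache.
                return instrumentMethods(classfileBuffer);
            }

            private byte[] instrumentMethods(byte[] original) {
                // Stand-in for a real bytecode rewrite; returning the bytes
                // unchanged keeps the sketch compilable and runnable.
                return original;
            }
        }
    }

Packaged in a jar whose manifest declares Premain-Class: TimingAgent and loaded with -javaagent, the transformer sees every class as it is defined. Keeping the rewrite deterministic, as the article requires, is what allows the modified bytes to be cached once and then reused safely by other JVMs.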
Some years ago, slow JVM startup and the Java VM's large memory footprint were often cited as reasons for Java's perceived sluggishness. Modern JVMs and techniques such as class sharing have reduced both startup time and footprint. At the same time, JVM implementations still differ greatly from platform to platform, from vendor to vendor, and even between releases from the same vendor.
With so many options to choose from, what JVMs do you prefer for development, and what JVMs do you recommend for deployment on a server?
"It is not possible to cache just-in-time (JIT) compiled code in the class cache"
If I understand this correctly, then it is only the bytecode that is stored in the class cache, leaving me wondering if this is nothing more than a glorified disk cache.
Ideally, the class cache should work the same way that shared objects/dynamic link libraries work: Processes share the same code (but not the same data). What are the reasons for not sharing (by default) the code between JVM processes for different users?
You are correct that only bytecode is cached. However, what you contend it *should* do, namely that "processes share the same code (but not the same data)," is exactly what it does:
The article describes the ROMClass/RAMClass split of each Java class. To oversimplify, the RAMClass is the "data" and the ROMClass is the "code". The ROMClass "code" is shared between JVM processes and users, while each JVM has its own copy of the "data".
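One rough way to see this split in practice is the printStats sub-option of -Xshareclasses, which reports what a shared cache contains, including counts and sizes of the stored ROMClasses (the cache name here is illustrative and the exact output varies by JVM release):

    java -Xshareclasses:name=appcache,printStats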