From the programmer's perspective, one of the most important aspects of Java's architecture to understand is the linking model. As mentioned in earlier chapters, Java's linking model allows you to design user-defined class loaders that extend your application in custom ways at run-time. Through user-defined class loaders, your application can load and dynamically link to classes and interfaces that were unknown or did not even exist when your application was compiled.
The engine that drives Java's linking model is the process of resolution. The previous chapter described all the various stages in the lifetime of a class, but didn't dive into the details of loading and resolution. This chapter looks at loading and resolution in depth, and shows how the process of resolution fits in with dynamic extension. It gives an overview of the linking model, explains constant pool resolution, describes method tables, shows how to write and use class loaders, and gives several examples.
When you compile a Java program, you get a separate class file for each class or interface in your program. Although the individual class files may appear to be independent, they actually harbor symbolic connections to one another and to the class files of the Java API. When you run your program, the Java virtual machine loads your program's classes and interfaces and hooks them together in a process of dynamic linking. As your program runs, the Java virtual machine builds an internal web of interconnected classes and interfaces.
A class file keeps all its symbolic references in one place, the constant pool. Each class file has a constant pool, and each class or interface loaded by the Java virtual machine has an internal version of its constant pool called the runtime constant pool. The runtime constant pool is an implementation-specific data structure that maps to the constant pool in the class file. Thus, after a type is initially loaded, all the symbolic references from the type reside in the type's runtime constant pool.
At some point during the running of a program, if a particular symbolic reference is to be used, it must be resolved. Resolution is the process of finding the entity identified by the symbolic reference and replacing the symbolic reference with a direct reference. Because all symbolic references reside in the constant pool, this process is often called constant pool resolution.
As described in Chapter 6, "The Java Class File," the constant pool is organized as a sequence of items. Each item has a unique index, much like an array element. A symbolic reference is one kind of item that may appear in the constant pool. Java virtual machine instructions that use a symbolic reference specify the index in the constant pool where the symbolic reference resides. For example, the getstatic opcode, which pushes the value of a static field onto the stack, is followed in the bytecode stream by an index into the constant pool. The constant pool entry at the specified index, a CONSTANT_Fieldref_info entry, reveals the fully qualified name of the class in which the field resides, and the name and type of the field.
Keep in mind that the Java virtual machine contains a separate runtime constant pool for each class and interface it loads. When an instruction refers to the fifth item in the constant pool, it is referring to the fifth item in the constant pool for the current class, the class that defined the method the Java virtual machine is currently executing.
Several instructions, from the same or different methods, may refer to the same constant pool entry, but each constant pool entry is resolved only once. After a symbolic reference has been resolved for one instruction, subsequent attempts to resolve it by other instructions take advantage of the hard work already done, and use the same direct reference resulting from the original resolution.
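The resolve-once behavior can be pictured with a toy model. The sketch below is only an illustration: the class ToyConstantPool, its fields, and the resolver function are hypothetical names invented here, and real virtual machines store and resolve runtime constant pool entries in implementation-specific ways.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy model of a per-class runtime constant pool (hypothetical, for illustration only). */
final class ToyConstantPool {
    private final String[] symbolic;                              // index -> symbolic reference
    private final Map<Integer, Object> direct = new HashMap<>();  // index -> direct reference
    private final Function<String, Object> resolver;              // assumed lookup of the entity
    int resolutions = 0;                                          // times real resolution work ran

    ToyConstantPool(String[] symbolic, Function<String, Object> resolver) {
        this.symbolic = symbolic.clone();
        this.resolver = resolver;
    }

    /** Returns the direct reference for an entry, resolving it only on first use. */
    Object resolve(int index) {
        Object ref = direct.get(index);
        if (ref == null) {
            // First use: do the expensive lookup and remember the result.
            ref = resolver.apply(symbolic[index]);
            direct.put(index, ref);
            resolutions++;
        }
        return ref;
    }
}
```

In this model, two instructions that name the same index receive the same direct reference, and the resolver runs once for that entry.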
Linking involves not only the replacement of symbolic references with direct ones but also
checking for correctness and permission. As mentioned in Chapter 7, "The Lifetime of a Class," the checking
of symbolic references for existence and access permission (one aspect of the full verification phase) is
performed during resolution. For example, when a Java virtual machine resolves a
getstatic
instruction to a field of another class, the Java virtual machine checks to
make sure that:
If any of these checks fail, an error is thrown and resolution fails. Otherwise, the symbolic reference is replaced by the direct reference and resolution succeeds.
As described in Chapter 7, "The Lifetime of a Class," different implementations of the Java virtual
machine are permitted to perform resolution at different times during the execution of a program. An
implementation may choose to link everything up front by following all symbolic references from the initial
class, then all symbolic references from subsequent classes, until every symbolic reference has been resolved.
In this case, the application would be completely linked before its main()
method was
ever invoked. This approach is called early resolution. Alternatively, an implementation may
choose to wait until the very last minute to resolve each symbolic reference. In this case, the Java virtual
machine would resolve a symbolic reference only when it is first used by the running program. This approach
is called late resolution. Implementations may also use a resolution strategy that falls between these
two extremes.
Although a Java virtual machine implementation has some freedom in choosing when to resolve symbolic references, every Java virtual machine must give the outward impression that it uses late resolution. No matter when a particular Java virtual machine performs its resolution, it will always throw any error that results from attempting to resolve a symbolic reference at the point in the execution of the program where the symbolic reference was actually used for the first time. In this way, it will always appear to the user as if the resolution were late. If a Java virtual machine does early resolution, and during early resolution discovers that a class file is missing, it won't report the class file missing by throwing the appropriate error until later in the program when something in that class file is actually used. If the class is never used by the program, the error will never be thrown.
In addition to simply linking types at run-time, Java applications can decide at run-time which types to link. Java's architecture allows Java programs to be dynamically extended: a running program can decide at run-time which other types to use, load them, and use them. You can dynamically extend a Java application by passing the name of a type to load to either the forName() method of class java.lang.Class or the loadClass() method of an instance of a user-defined class loader, which can be created from any subclass of java.lang.ClassLoader. Either of these approaches enables your running application to load types whose names are not mentioned in the source code of your application, but rather are determined by your application as it runs. An example of dynamic extension is a Java-capable web browser, which loads class files for applets from across a network. When the browser starts, it doesn't know what class files it will be loading across the network. The browser learns the names of the classes and interfaces required by each applet as it encounters the web pages that contain those applets.
The most straightforward way to dynamically extend a Java application is with the forName() method of class java.lang.Class, which has two overloaded forms:

    // Methods declared in class java.lang.Class:
    public static Class forName(String className)
        throws ClassNotFoundException;
    public static Class forName(String className, boolean initialize,
        ClassLoader loader) throws ClassNotFoundException;

The three-parameter form of forName(), which was added in version 1.2, takes the fully qualified name of the type to load in the String className parameter. If the boolean initialize parameter is true, the type will be linked and initialized as well as loaded before the forName() method returns. Otherwise, if the boolean initialize parameter is false, the type will be loaded and possibly linked but not explicitly initialized by the forName() method. Nevertheless, if the type had already been initialized prior to the forName() invocation, the type returned will have been initialized even though you pass false as the second parameter to forName(). In the third parameter, ClassLoader loader, you pass a reference to the user-defined class loader from which you want forName() to request the type. You can also indicate that you want forName() to request the type from the bootstrap class loader by passing null in the ClassLoader loader parameter. The version of forName() that takes one parameter, the fully qualified name of the type to load, always requests the type from the current class loader (the loader that loaded the class making the forName() request) and always initializes the type. Both versions of forName() return a reference to the Class instance that represents the loaded type or, if the type can't be loaded, throw ClassNotFoundException.
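The effect of the initialize parameter can be observed with a small experiment. The following is a minimal sketch, not a definitive test harness: the class ForNameDemo and its nested classes Eager and Deferred are hypothetical names made up for this illustration, and the log lives in the outer class so that reading it never triggers initialization of a nested class.

```java
import java.util.ArrayList;
import java.util.List;

final class ForNameDemo {
    // Shared log kept in the outer class, so reading it never initializes a nested class.
    static final List<String> LOG = new ArrayList<>();

    static class Eager {
        static { LOG.add("Eager initialized"); }
    }

    static class Deferred {
        static { LOG.add("Deferred initialized"); }
    }

    /** Call once per JVM: a static initializer runs at most once, so a second call logs less. */
    static List<String> run() {
        String outer = ForNameDemo.class.getName();
        ClassLoader loader = ForNameDemo.class.getClassLoader();
        try {
            // One-parameter form: always initializes the type.
            Class.forName(outer + "$Eager");
            // Three-parameter form with initialize == false: loads without initializing.
            Class<?> deferred = Class.forName(outer + "$Deferred", false, loader);
            LOG.add("Deferred loaded");
            // Asking again with initialize == true now runs the static initializer.
            Class.forName(deferred.getName(), true, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call, the entry "Deferred loaded" appears before "Deferred initialized", which shows that passing false postponed the static initializer until initialization was requested.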
The other way to dynamically extend a Java application is to load classes via the loadClass() method of a user-defined class loader. To request a type from a user-defined class loader, you invoke loadClass() on that class loader. Class ClassLoader contains two overloaded methods named loadClass(), which look like this:

    // Methods declared in class java.lang.ClassLoader:
    public Class loadClass(String name)
        throws ClassNotFoundException;
    protected Class loadClass(String name, boolean resolve)
        throws ClassNotFoundException;

Both loadClass()
methods accept the fully qualified name to load in their
String name
parameter. The semantics of loadClass()
are
similar to those of forName()
. If the loadClass()
method has
already loaded a type with the fully qualified name passed in the String name
parameter, it should return the Class
instance representing that already loaded type.
Otherwise, it should attempt to load the requested type in some custom way decided upon by the author of
the user-defined class loader. If the class loader is successful loading the type in its custom way,
loadClass()
should return the Class
instance representing the
newly loaded type. Otherwise, it should throw ClassNotFoundException
. The
details on writing your own user-defined class loader are given later in this chapter.
The boolean resolve
parameter of the two-parameter version of
loadClass()
indicates whether or not the type should be linked as well as loaded. As
mentioned in previous chapters, the process of linking involves three steps: verification of the loaded type,
preparation, which involves allocating memory for the type, and optionally, resolution of symbolic references
contained in the type. If resolve
is true
, the
loadClass()
method should ensure that the type has been linked as well as loaded
before it returns the Class
instance for that type. If resolve
is
false
, the loadClass()
method will merely attempt to load the
requested type and not concern itself with whether or not the type is linked. Because the Java virtual
machine specification gives implementations some flexibility in the timing of linking, when you pass
false
in the resolve
parameter, the type you get back from
loadClass()
may or may not have already been linked. The two parameter version
of loadClass()
is a legacy method whose resolve
parameter has,
since Java version 1.1, really served no useful purpose. In general, you should invoke the one-parameter
version of loadClass()
, which is equivalent to invoking the two-parameter version
with resolve
set to false
. When you invoke the one-parameter
version of loadClass()
, it will attempt to load and return the type, but will leave the
timing of linking and initializing the type to the virtual machine.
Whether you should use forName()
or invoke
loadClass()
on a user-defined class loader instance depends on your needs. If you
have no special needs that require a class loader, you should probably use forName()
,
because forName()
is the most straightforward approach to dynamic extension. In
addition, if you need the requested type to be initialized as well as loaded (and linked), you'll have to use
forName()
. When the loadClass()
method returns a type,
that type may or may not be linked. When you invoke the single parameter version of
forName()
, or invoke the three-parameter version and pass true
in the initialize
parameter, the returned type will definitely have been already
linked and initialized.
Initialization is the reason, for example, that JDBC drivers are usually loaded with a call to forName(). Because the static initializers of each JDBC driver class register the driver with a DriverManager, thereby making the driver available to the application, the driver class must be initialized, not just loaded. Were a driver class loaded but not initialized, the static initializers of the class would not be executed, the driver would not become registered with the DriverManager, and the driver would therefore not be available to the application. Loading a driver with forName() ensures that the class will be initialized, which ensures the driver will be available for use by the application after forName() returns.
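The registration pattern can be imitated without a real database driver. In the sketch below, the REGISTERED list stands in for a driver manager and AcmeDriver for a driver class; both are hypothetical names invented for this illustration and are not part of JDBC. The sketch assumes the application class loader does not initialize a type when loadClass() is called on it, which is the usual behavior.

```java
import java.util.ArrayList;
import java.util.List;

final class RegistrationDemo {
    // Hypothetical stand-in for a driver manager's list of registered drivers.
    static final List<String> REGISTERED = new ArrayList<>();

    // Hypothetical driver: registers itself from its static initializer.
    static class AcmeDriver {
        static { REGISTERED.add("acme"); }
    }

    /** Returns {registered after loadClass(), registered after forName()}; call once per JVM. */
    static boolean[] run() {
        String name = RegistrationDemo.class.getName() + "$AcmeDriver";
        ClassLoader loader = RegistrationDemo.class.getClassLoader();
        try {
            // loadClass() loads the type but leaves initialization to the virtual machine.
            loader.loadClass(name);
            boolean afterLoadClass = REGISTERED.contains("acme");
            // The one-parameter forName() initializes the type, so the initializer runs.
            Class.forName(name);
            boolean afterForName = REGISTERED.contains("acme");
            return new boolean[] { afterLoadClass, afterForName };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] seen = run();
        System.out.println("after loadClass: " + seen[0] + ", after forName: " + seen[1]);
    }
}
```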
Class loaders, on the other hand, can help you meet needs that forName()
can't.
If you have some custom way of loading types, such as by downloading them across a network, retrieving
them from a database, extracting them from encrypted files, or even generating them on the fly, you'll need a
class loader. One of the primary reasons to create a user-defined class loader is to customize the way in
which a fully qualified type name is transformed into an array of bytes in the Java class file format that define
the named type. Other reasons you may want to use a class loader rather than
forName()
involve security. As mentioned in Chapter 3, "Security," the separate
namespaces awarded to each class loader enable you to in effect place a shield between the types loaded into
different namespaces. You can write a Java application such that types cannot see any types that aren't
loaded into the same namespace. Also, as mentioned in Chapter 3, class loaders are responsible for placing
loaded code into protection domains. Thus, if your security needs include a custom way to place loaded
types into protection domains, you'll need to use class loaders rather than forName()
.
Both the general process of dynamic extension and the separate namespaces awarded to individual class
loaders are supported by one aspect of resolution: the way a virtual machine chooses a class loader when it
resolves a symbolic reference to a type. When the resolution of a constant pool entry requires loading a type,
the virtual machine uses the same class loader that loaded the referencing type to load the referenced type.
For example, imagine a Cat
class refers via a symbolic reference in its constant pool to
a type named Mouse
. Assume Cat
was loaded by a user-defined
class loader. When the virtual machine resolves the reference to Mouse
, it checks to see
if Mouse
has been loaded into the namespace to which Cat
belongs. (It checks to see if the class loader that loaded Cat
has previously loaded a
type named Mouse
.) If not, the virtual machine requests Mouse
from the same class loader that loaded Cat
. This is true even if a class named
Mouse
had previously been loaded into a different namespace. When a symbolic
reference from a type loaded by the bootstrap class loader is resolved, the Java virtual machine uses the
bootstrap class loader to load the referenced type. When a symbolic reference from a type loaded by a user-
defined class loader is resolved, the Java virtual machine uses the same user-defined class loader to load the
referenced type.
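Separate namespaces can be demonstrated by letting two class loaders each define a type with the same name. The sketch below is an illustration under stated assumptions: NamespaceDemo, MouseLoader, and the emptyClass() helper are hypothetical names, and the helper hand-assembles a minimal class file (a public class with no fields or methods) rather than reading one from disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class NamespaceDemo {
    /** Builds a minimal class file for a public, member-less class with the given internal name. */
    static byte[] emptyClass(String internalName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor_version
            out.writeShort(52);                                  // major_version (Java 8 format)
            out.writeShort(5);                                   // constant_pool_count = entries + 1
            out.writeByte(7); out.writeShort(2);                 // #1 CONSTANT_Class -> #2
            out.writeByte(1); out.writeUTF(internalName);        // #2 CONSTANT_Utf8
            out.writeByte(7); out.writeShort(4);                 // #3 CONSTANT_Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 CONSTANT_Utf8
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class
            out.writeShort(3);                                   // super_class
            out.writeShort(0);                                   // interfaces_count
            out.writeShort(0);                                   // fields_count
            out.writeShort(0);                                   // methods_count
            out.writeShort(0);                                   // attributes_count
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines Mouse itself; everything else is left to its parent, the bootstrap loader. */
    static final class MouseLoader extends ClassLoader {
        MouseLoader() { super(null); }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (!name.equals("Mouse")) throw new ClassNotFoundException(name);
            byte[] b = emptyClass("Mouse");
            return defineClass(name, b, 0, b.length);
        }
    }

    /** Loads Mouse through two independent loaders and returns both Class instances. */
    static Class<?>[] run() {
        try {
            Class<?> first = new MouseLoader().loadClass("Mouse");
            Class<?> second = new MouseLoader().loadClass("Mouse");
            return new Class<?>[] { first, second };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The two Class instances share a name but are distinct types, because each belongs to the namespace of the loader that defined it.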
As mentioned in Chapter 3, "Security," version 1.2 introduced a formal parent-delegation model for
class loaders. Although legacy class loaders written prior to 1.2 that don't take advantage of the parent-
delegation model will still work in 1.2, the recommended way to create class loaders from 1.2 on is to use
the parent-delegation model. Each user-defined class loader created in 1.2 is assigned a "parent" class loader
when it is created. If the parent class loader is not passed explicitly to the constructor of the user-defined
class loader, the system class loader is assigned to be the parent by default. Alternatively, a parent loader can
be explicitly passed to the constructor of a new user-defined class loader. If a reference to an existing user-
defined class loader is passed to the constructor, that user-defined class loader is assigned to be the parent. If
null
is passed to the constructor, the bootstrap class loader is assigned to be the
parent.
To better visualize the parent-delegation model, imagine a Java application creates a user-defined class
loader named "Grandma." Because the application passes null
to Grandma's
constructor, Grandma's parent is set to the bootstrap class loader. Time passes. Sometime later, the
application creates another class loader named "Mom." Because the application passes to Mom's constructor
a reference to Grandma, Mom's parent is set to the user-defined class loader referred to affectionately as
Grandma. More time passes. At some later time, the application creates a class loader named "Cindy."
Because the application passes to Cindy's constructor a reference to Mom, Cindy's parent is set to the user-
defined class loader referred to as Mom.
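The three-generation chain could be built as follows. This is a minimal sketch: FamilyLoaders is a hypothetical class name, and the loaders are empty anonymous subclasses of java.lang.ClassLoader that add no loading behavior of their own.

```java
final class FamilyLoaders {
    /** Returns {grandma, mom, cindy}, each constructed with the previous one as its parent. */
    static ClassLoader[] build() {
        // Passing null makes the bootstrap class loader Grandma's parent.
        ClassLoader grandma = new ClassLoader((ClassLoader) null) {};
        ClassLoader mom = new ClassLoader(grandma) {};
        ClassLoader cindy = new ClassLoader(mom) {};
        return new ClassLoader[] { grandma, mom, cindy };
    }

    public static void main(String[] args) {
        ClassLoader[] family = build();
        // getParent() reports null when the parent is the bootstrap class loader.
        System.out.println("Grandma's parent is bootstrap: " + (family[0].getParent() == null));
        System.out.println("Mom's parent is Grandma: " + (family[1].getParent() == family[0]));
        System.out.println("Cindy's parent is Mom: " + (family[2].getParent() == family[1]));
    }
}
```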
Now imagine the application asks Cindy to load a type named java.io.FileReader. When a class loader that follows the parent-delegation model loads a type, it first delegates to its parent -- it asks its parent to try to load the type. Its parent, in turn, asks its parent, which first asks its parent, and so on. The delegation continues all the way up to the end-point of the parent-delegation chain, which is usually the bootstrap class loader. Thus, the first thing Cindy does is ask Mom to load the type. The first thing Mom does is ask Grandma to load the type. And the first thing Grandma does is ask the bootstrap class loader to load the type. In this case, the bootstrap class loader is able to load (or already has loaded) the type, and returns the Class instance representing java.io.FileReader to Grandma. Grandma passes this Class reference back to Mom, who passes it back to Cindy, who returns it to the application.
Note that given delegation between class loaders, the class loader that initiates loading is not necessarily
the class loader that actually defines the type. In the previous example, the application initially asked Cindy
to load the type, but ultimately, the bootstrap class loader defined the type. In Java terminology, a class
loader that is asked to load a type, but returns a type loaded by some other class loader, is called an
initiating class loader of that type. The class loader that actually defines the type is called the
defining class loader for the type. In the previous example, therefore, the defining class loader
for java.io.FileReader
is the bootstrap class loader. Cindy is an initiating
class loader, but so are Mom, Grandma, and even the bootstrap class loader. Any class loader that is asked
to load a type and is able to return a reference to the Class
instance representing the
type is an initiating loader of that type.
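The order of delegation in the java.io.FileReader example can be traced by logging each request. The sketch below is illustrative only: DelegationTrace and NamedLoader are hypothetical names, and the loaders simply record that they were asked before handing the request to the inherited loadClass() logic, which delegates to the parent first.

```java
import java.util.ArrayList;
import java.util.List;

final class DelegationTrace {
    // Labels of the loaders in the order in which they were asked to load a type.
    static final List<String> ASKED = new ArrayList<>();

    static final class NamedLoader extends ClassLoader {
        private final String label;

        NamedLoader(String label, ClassLoader parent) {
            super(parent);
            this.label = label;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            ASKED.add(label);
            // The inherited implementation asks the parent before trying to load the type itself.
            return super.loadClass(name, resolve);
        }
    }

    /** Asks Cindy for java.io.FileReader and returns whatever comes back down the chain. */
    static Class<?> run() {
        ASKED.clear();
        NamedLoader grandma = new NamedLoader("Grandma", null); // parent: bootstrap loader
        NamedLoader mom = new NamedLoader("Mom", grandma);
        NamedLoader cindy = new NamedLoader("Cindy", mom);
        try {
            return cindy.loadClass("java.io.FileReader");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> type = run();
        System.out.println("asked: " + ASKED);
        System.out.println("defined by bootstrap: " + (type.getClassLoader() == null));
    }
}
```

The request travels from Cindy to Mom to Grandma, yet getClassLoader() on the result reports null, the conventional stand-in for the bootstrap class loader, which is the defining loader here.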
For another example, imagine the application asks Cindy to load a type named
com.artima.knitting.QuiltPattern
. Cindy delegates to Mom, who
delegates to Grandma, who delegates to the bootstrap class loader. In this case, however, the bootstrap class
loader is unable to load the type. So control returns back to Grandma, who attempts to load the type in her
custom way. Because Grandma is responsible for loading standard extensions, and the
com.artima.knitting
package is wisely installed in a JAR file in the standard
extensions directory, Grandma is able to load the type. Grandma defines the type and returns the
Class
instance representing
com.artima.knitting.QuiltPattern
to Mom. Mom passes this
Class
reference back to Cindy, who returns it to the application. In this example,
Grandma is the defining loader of the com.artima.knitting.QuiltPattern
type. Cindy, Mom, and Grandma -- but not the bootstrap class loader -- are initiating class loaders for the
type.
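The QuiltPattern scenario can be approximated in a few lines. This sketch departs from the text in one labeled way: instead of reading a JAR file from the standard extensions directory, the hypothetical Grandma loader generates a minimal class file in memory. QuiltDemo, Grandma, and emptyClass() are names invented for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class QuiltDemo {
    static final String QUILT = "com.artima.knitting.QuiltPattern";

    /** Builds a minimal class file for a public, member-less class with the given internal name. */
    static byte[] emptyClass(String internalName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor_version
            out.writeShort(52);                                  // major_version (Java 8 format)
            out.writeShort(5);                                   // constant_pool_count
            out.writeByte(7); out.writeShort(2);                 // #1 CONSTANT_Class -> #2
            out.writeByte(1); out.writeUTF(internalName);        // #2 CONSTANT_Utf8
            out.writeByte(7); out.writeShort(4);                 // #3 CONSTANT_Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 CONSTANT_Utf8
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class
            out.writeShort(3);                                   // super_class
            for (int i = 0; i < 4; i++) out.writeShort(0);       // no interfaces, fields, methods, attributes
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumed stand-in for the extensions loader: defines QuiltPattern after bootstrap fails. */
    static final class Grandma extends ClassLoader {
        Grandma() { super(null); }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (!name.equals(QUILT)) throw new ClassNotFoundException(name);
            byte[] b = emptyClass(name.replace('.', '/'));
            return defineClass(name, b, 0, b.length);
        }
    }

    /** Returns true if the type requested from Cindy was defined by Grandma. */
    static boolean run() {
        ClassLoader grandma = new Grandma();
        ClassLoader mom = new ClassLoader(grandma) {};
        ClassLoader cindy = new ClassLoader(mom) {};
        try {
            Class<?> quilt = cindy.loadClass(QUILT);
            return quilt.getClassLoader() == grandma;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```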
This section describes the details of resolving each type of constant pool entry, including the errors that
may be thrown during resolution. If an error is thrown during resolution, the error is seen as being thrown by
the instruction that refers to the constant pool entry being resolved. Besides the errors described here,
individual instructions that trigger the resolution of a constant pool entry may cause other errors to be
thrown. For example, getstatic
causes a
CONSTANT_Fieldref_info
entry to be resolved. If the entry is resolved
successfully, the virtual machine performs one additional check: it makes sure the field is actually static (a
class variable and not an instance variable). If the field is not static, the virtual machine throws an error. Any
extra errors that may be thrown during resolution besides those described in this section are described for
each individual instruction in Appendix A.
In the following sections, the term current class loader
refers to the defining class
loader, whether it be a user-defined class loader or the bootstrap class loader, for the type whose constant
pool contains the symbolic reference being resolved. The term current namespace
refers
to the namespace of the current class loader, the set of all type names for which the current class loader has
been marked as an initiating loader.
CONSTANT_Class_info Entries
Of all the types of constant pool entries, the most complicated to resolve is
CONSTANT_Class_info
. This type of entry is used to represent symbolic references
to classes (including array classes) and interfaces. Several instructions, such as new
and
anewarray
, refer directly to CONSTANT_Class_info
entries.
Other instructions, such as putfield
or invokevirtual
, refer
indirectly to CONSTANT_Class_info
entries through other types of entry. For
example, the putfield
instruction refers to a
CONSTANT_Fieldref_info
entry. The class_index
item
of a CONSTANT_Fieldref_info
gives the constant pool index of a
CONSTANT_Class_info
entry.
The details of resolving a CONSTANT_Class_info
entry vary depending on
whether or not the type is an array and whether the referencing type (the one that contains in its constant
pool the CONSTANT_Class_info
entry being resolved) was loaded via the bootstrap
class loader or a user-defined class loader.
A CONSTANT_Class_info entry refers to an array class if its name_index refers to a CONSTANT_Utf8_info string that begins with a left bracket, as in "[I". As described in Chapter 6, "The Java Class File," internal array names contain one left bracket for each dimension, followed by a component type. If the component type begins with an "L", as in "Ljava/lang/Integer;", the array is an array of references. Otherwise, the component type is a primitive type, such as "I" for int or "D" for double, and the array is an array of primitive types.
The end product of the resolution of a symbolic reference to an array class is a
Class
instance that represents the array class. If the current class loader has already
been recorded as an initiating loader for the array class being resolved, that same class is used. Otherwise,
the virtual machine performs the following steps: If the component type of the array is a reference type (the
array is an array of references), the virtual machine resolves the component type using the current class
loader. For example, if resolving an array class with the name "[[Ljava/lang/Integer;", the virtual machine would make certain class java.lang.Integer is loaded into the namespace of the current class loader. After
resolving the component type if the array is an array of references, or immediately, if the array is an array of
primitive types, the virtual machine creates a new array class of the indicated component type and number of
dimensions and instantiates a Class
instance to represent the type. For an array of
references, the array class is marked as having been defined by the defining class loader of the component
type. For an array of primitive types, the array class is marked as having been defined by the bootstrap class
loader.
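The defining loader recorded for an array class can be inspected through reflection. The following is a small sketch with a hypothetical class name, ArrayLoaderDemo. It relies on the documented behavior of Class.getClassLoader(), which for an array class returns the loader of the element type and returns null both for the bootstrap class loader and for arrays of primitive types. Note that Class.getName() reports array names with dots as package separators.

```java
final class ArrayLoaderDemo {
    /** Summarizes the loader reported for a type; null is treated as the bootstrap loader. */
    static String definedBy(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : "non-bootstrap";
    }

    public static void main(String[] args) {
        // Array of primitives: attributed to the bootstrap class loader.
        System.out.println(int[].class.getName() + " -> " + definedBy(int[].class));
        // Array of references to a bootstrap-defined class.
        System.out.println(Integer[][].class.getName() + " -> " + definedBy(Integer[][].class));
        // Array of references to an application class: same loader as the component type.
        System.out.println(ArrayLoaderDemo[].class.getName() + " -> "
                + definedBy(ArrayLoaderDemo[].class));
    }
}
```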
A CONSTANT_Class_info
entry whose name_index
refers to a CONSTANT_Utf8_info
string that doesn't begin with a left bracket is a
symbolic reference to a non-array class or an interface. Resolution of this kind of symbolic reference is a
multiple step process.
The Java virtual machine performs the same basic steps, described below as Steps 1a and 1b, to resolve
any symbolic reference (any CONSTANT_Class_info
entry) to a non-array class or
interface. In Step 1a, the type is loaded. In Step 1b, access permission to the type is checked. The precise
way in which the virtual machine performs Step 1a depends on whether the referencing type was loaded via
the bootstrap class loader or a user-defined class loader.
Also described in this section are Steps 2a through 2d, which describe the linking and initialization of the newly resolved type. These steps are not part of the resolution of the symbolic reference to the type that becomes linked and initialized. Resolution of a symbolic reference to a non-array class or interface involves only Steps 1a and 1b, the (potential) loading of the type and the checking of its access permission. However, whenever the resolution process of a symbolic reference to a type is being triggered by the first active use of the type, linking and initialization of the type will immediately follow the resolution of the symbolic reference to that type. Because Java virtual machine implementations are allowed to perform early resolution, however, resolution of references to types may occur much earlier than the linking and initialization of those types. As mentioned in Chapter 7, "The Lifetime of a Type," initialization (here, Step 2d) occurs on the first active use of the type. Before a type can be initialized, it must be linked (Steps 2a through 2c), and before it can be linked, it must be loaded (Step 1a).
Step 1a. Load the Type and any Supertypes
The fundamental activity required by the resolution of a non-array class or interface is making sure the
type is loaded into the current namespace. As a first step, the virtual machine must determine whether or not
the referenced type has already been loaded into the current namespace. To make that determination, the
virtual machine must find out whether the current class loader has been marked as an initiating loader for a
type with the desired fully qualified name (the type name given in the symbolic reference being resolved).
For each class loader, the Java virtual machine maintains a list of the names of all the types for which the
class loader has served as an initiating class loader. Each of these lists forms a namespace inside the Java
virtual machine. The virtual machine uses these lists during resolution to determine whether a class has
already been loaded by a particular class loader. If the virtual machine discovers the desired fully qualified
name is already mentioned in the current namespace, it will just use the already-loaded type, which is defined
by a chunk of type data in the method area and represented by an associated Class
instance on the heap. By first checking whether the current namespace already includes the desired fully
qualified name, the virtual machine helps ensure that only one type with a given name is loaded by any single
class loader.
If a type with the desired fully qualified name hasn't yet been loaded into the current namespace, the
virtual machine passes the fully qualified name to the current class loader. The Java virtual machine always
asks the current class loader, the defining loader of the referencing type whose runtime constant pool
contains the CONSTANT_Class_info
entry being resolved, to attempt to load the
referenced type. If the referencing type was defined by the bootstrap class loader, the virtual machine asks
the bootstrap class loader to load the referenced type. Otherwise, the referencing type was defined by a user-
defined class loader, and the virtual machine asks the same user-defined class loader to load the referenced
type.
If the current class loader is the bootstrap class loader, the virtual machine asks it in an implementation
dependent way to load the type. If the current class loader is a user-defined class loader, the Java virtual
machine makes the load request by invoking the user-defined class loader's
loadClass()
method, passing in parameter name
the fully
qualified name of the desired type.
When either the bootstrap class loader or a user-defined class loader is asked to load a type, the class loader has two choices: It can attempt to load the type by itself, or it can delegate the job to some other class loader. A user-defined class loader can ask either another user-defined class loader or the bootstrap class loader to attempt to load the type. The bootstrap class loader can ask a user-defined class loader to attempt to load the type.
To delegate to a user-defined class loader, a class loader (whether bootstrap or user-defined) invokes loadClass() on that class loader, passing in the fully qualified name of the desired type. To delegate to the bootstrap class loader, a user-defined class loader invokes findSystemClass(), a protected final method inherited from java.lang.ClassLoader, passing in the fully qualified name of the desired type. A class loader that has been delegated to can also decide whether to attempt to load the type itself or to delegate the job to yet another class loader. Eventually, some class loader will decide that the buck stops with it and, rather than delegate, attempt to actually load the type itself. If this class loader is successful at loading the type, it will be marked as the defining class loader for the type. All of the class loaders involved in the process -- the defining class loader and all the class loaders that delegated -- will be marked as initiating loaders of the type.
Given the existence of the parent-delegation model described earlier in this chapter, if a user-defined class loader delegates, the class loader to which it delegates will often be its parent in the parent-delegation model. The parent will, in turn, delegate to its parent, which will delegate to its parent, and so on. The delegation process continues all the way up to the end-point of the delegation process, which is the class loader that, rather than delegating, decides to try to load the type itself. Most often, this end-point class loader will be the bootstrap class loader. When a parent class loader attempts to load the type but fails, control returns to the child class loader. In the parent-delegation model, the child class loader, upon learning that its parent (and grandparent, great grandparent, and so on) was unable to load the type, attempts to load the type itself. If a class loader in the middle of the delegation chain is the class loader that first has success loading the type, that class loader will be marked as the defining class loader. The defining class loader and all the class loaders before it in the parent-delegation chain will be marked as initiating class loaders. However, its parent, grandparent, great grandparent, and so on, none of whom were successful in their attempts to load the type, will not be marked as initiating class loaders of the type.
If the loadClass() method of a user-defined class loader is able to locate or produce an array of bytes that purportedly defines the type in the Java class file format, loadClass() must invoke defineClass(), passing the fully qualified name of the desired type and a reference to the byte array. Invoking defineClass() will cause the virtual machine to attempt to parse the binary data into internal data structures in the method area. At this point the virtual machine will perform pass one of verification, as described in Chapter 3, "Security," which ensures the passed array of bytes adheres to the basic structure of the Java class file format. The Java virtual machine uses the passed fully qualified name to verify that the desired type name is actually declared as the name of the type in the passed array of bytes.
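As a minimal sketch of the defineClass() call and its name check, the code below hand-assembles the smallest class file it can (a public class with superclass java.lang.Object and no fields or methods) and passes it to defineClass(). The names DefineClassDemo, ByteArrayLoader, minimalClassFile() and tryDefine() are hypothetical, and the class file major version 52 is an assumption chosen so that current virtual machines accept the bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DefineClassDemo {
    // Builds the bytes of a minimal class file for a class with the given simple name.
    static byte[] minimalClassFile(String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);               // magic
            out.writeShort(0);                      // minor_version
            out.writeShort(52);                     // major_version (assumed)
            out.writeShort(5);                      // constant_pool_count (entries 1..4)
            out.writeByte(7); out.writeShort(2);    // #1 CONSTANT_Class_info -> #2
            out.writeByte(1); out.writeUTF(name);   // #2 CONSTANT_Utf8_info
            out.writeByte(7); out.writeShort(4);    // #3 CONSTANT_Class_info -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 CONSTANT_Utf8_info
            out.writeShort(0x0021);                 // access_flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                      // this_class
            out.writeShort(3);                      // super_class
            out.writeShort(0);                      // interfaces_count
            out.writeShort(0);                      // fields_count
            out.writeShort(0);                      // methods_count
            out.writeShort(0);                      // attributes_count
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);     // not expected for an in-memory stream
        }
    }

    // Hypothetical loader whose only job is to expose defineClass().
    static class ByteArrayLoader extends ClassLoader {
        Class<?> define(String expectedName, byte[] data) {
            return defineClass(expectedName, data, 0, data.length);
        }
    }

    // Returns "defined <name>" on success, or the simple name of the error thrown.
    static String tryDefine(String expectedName, byte[] data) {
        try {
            Class<?> c = new ByteArrayLoader().define(expectedName, data);
            return "defined " + c.getName();
        } catch (LinkageError e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDefine("CuteKitty", minimalClassFile("CuteKitty")));
        System.out.println(tryDefine("CuteKitty", minimalClassFile("HungryTiger")));
    }
}
```

The first call succeeds because the name passed to defineClass() matches the name declared in the bytes. The second call fails with NoClassDefFoundError because the bytes declare HungryTiger while the caller asked for CuteKitty.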
Once the referenced type is loaded in, the virtual machine peers into its binary data. If the type is a class
and not java.lang.Object
, the virtual machine determines from the class's data
the fully qualified name of the class's direct superclass. The virtual machine then checks to see if the
superclass has been loaded into the current namespace. If not, it loads the superclass. Once that class comes
in, the virtual machine can again peer into its binary data to find its superclass. This process repeats all the
way up to Object
.
When the virtual machine loads a superclass, it is really just resolving yet another symbolic reference. To determine what the fully qualified name of a class's superclass is, the virtual machine looks at the super_class field of the class file. This field gives a constant pool index of a CONSTANT_Class_info entry that serves as a symbolic reference to the class's superclass. When the virtual machine loads the superclass, it does so as Step 1a of the process of resolving the symbolic reference to the superclass. Thus, as part of Step 1a of the resolution process for CONSTANT_Class_info entries, the virtual machine recursively applies the resolution process for CONSTANT_Class_info entries on each superclass all the way up to Object.
On the way back down from Object
, the virtual machine will again peer into the
type data for each type it loaded to see if the type directly implements any interfaces. If so, it will make sure
those interfaces are also loaded. For each interface the virtual machine loads, the virtual machine peers into
its type data to see if it directly extends any other interfaces. If so, the virtual machine makes sure those
superinterfaces are loaded.
When the virtual machine loads superinterfaces, it is once again resolving more
CONSTANT_Class_info
entries. The indexes of all the constant pool entries that
serve as symbolic references to the interfaces directly implemented or extended by the type being loaded are
stored in the interfaces
component of the class file. When the virtual machine loads
superinterfaces, it is resolving the CONSTANT_Class_info
entries specified in the
interfaces
component, applying the resolution process for
CONSTANT_Class_info
entries recursively.
When the virtual machine applies the recursive resolution process to superclasses and superinterfaces, it
uses the defining class loader of the referencing subtype. The virtual machine makes its request in the usual
way, by invoking loadClass()
on the referencing subtype's defining class loader,
passing in the fully qualified name of the desired direct superclass or direct superinterface.
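The recursive request for superclasses can be made visible with a sketch like the following, which assumes a made-up hierarchy in which DemoSub extends DemoBase. The class names, the classFile() helper and the class file major version 52 are all assumptions for illustration. The loader records every name it is asked for, including the request the virtual machine itself makes for DemoBase while DemoSub is being defined.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class SuperclassLoadingDemo {
    // Minimal class file: the named class, the named superclass, nothing else.
    static byte[] classFile(String name, String superName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);
            out.writeShort(0);                        // minor_version
            out.writeShort(52);                       // major_version (assumed)
            out.writeShort(5);                        // constant_pool_count
            out.writeByte(7); out.writeShort(2);      // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name);     // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);      // #3 Class -> #4
            out.writeByte(1); out.writeUTF(superName); // #4 Utf8: superclass
            out.writeShort(0x0021);                   // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                        // this_class
            out.writeShort(3);                        // super_class
            for (int i = 0; i < 4; i++) {
                out.writeShort(0);                    // no interfaces, fields, methods, attributes
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static class TracingLoader extends ClassLoader {
        final List<String> requests = new ArrayList<>();

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            requests.add(name); // called by user code and by the virtual machine
            return super.loadClass(name, resolve);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String superName;
            if (name.equals("DemoSub")) {
                superName = "DemoBase";
            } else if (name.equals("DemoBase")) {
                superName = "java/lang/Object";
            } else {
                throw new ClassNotFoundException(name);
            }
            byte[] data = classFile(name, superName);
            // Defining DemoSub makes the virtual machine ask this loader for DemoBase.
            return defineClass(name, data, 0, data.length);
        }
    }

    static List<String> run() {
        TracingLoader loader = new TracingLoader();
        List<String> result;
        try {
            Class<?> sub = loader.loadClass("DemoSub");
            result = new ArrayList<>(loader.requests);
            result.add("superclass of DemoSub: " + sub.getSuperclass().getName());
        } catch (ClassNotFoundException e) {
            result = new ArrayList<>(loader.requests);
            result.add("failed: " + e);
        }
        return result;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Only the request for DemoSub comes from the program. The request for DemoBase arrives from inside defineClass(), which is the virtual machine resolving the superclass reference through the defining class loader of the subtype.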
Once a type has been loaded into the current namespace, and by recursion, all the type's superclasses
and superinterfaces have also been successfully loaded, the virtual machine instantiates the new
Class
instance to represent the type. If the bytes defining the type were located or
produced by a user-defined class loader and passed to defineClass()
,
defineClass()
will at that point return the new Class
instance. Alternatively, if a user-defined class loader delegated to the bootstrap class loader with a
findSystemClass()
invocation, findSystemClass()
will
at that point return the Class
instance. Upon receiving the Class
instance from either defineClass()
or
findSystemClass()
, the loadClass()
method returns the
Class
instance to its caller. If a user-defined class loader delegates to another user-
defined class loader, therefore, it receives the Class
instance from the delegated-to
user-defined class loader when its loadClass()
method returns. Upon receiving the
Class
instance from the delegated-to class loader, the delegated-from class loader
returns it from its own loadClass()
method.
Through Step 1a, the Java virtual machine makes sure a type is loaded, and if the type is a class, that all its superclasses are loaded, and whether the type is a class or an interface, that all of its superinterfaces are loaded. During this step, these types are not linked and initialized--just loaded.
During Step 1a, the virtual machine may throw the following errors:
If the bootstrap class loader is asked to load the type by the virtual machine itself (rather than by a user-defined class loader via a findSystemClass() invocation) and it is unable to locate or produce the binary data for the requested type, the virtual machine throws NoClassDefFoundError.
If a user-defined class loader delegates to the bootstrap class loader via a findSystemClass() invocation and the bootstrap class loader is unable to locate or produce the binary data for the requested type, the findSystemClass() method completes abruptly with a ClassNotFoundException. Similarly, if a user-defined class loader delegates to another user-defined class loader via a loadClass() invocation and that class loader is unable to locate or produce the binary data for the requested type, its loadClass() method should complete abruptly with a ClassNotFoundException.
If the bootstrap class loader is able to locate or produce the binary data, but the binary data isn't of the proper structure, the virtual machine throws ClassFormatError. Likewise, if a user-defined class loader is able to locate or produce the binary data and invoke the defineClass() method, but the defineClass() method discovers the binary data isn't of the proper structure, defineClass() will complete abruptly with a ClassFormatError.
If the binary data is of the proper structure but of a class file version the virtual machine does not support, the virtual machine throws UnsupportedClassVersionError.
If the binary data does not contain the requested type (for example, if the file CuteKitty.class is discovered to contain class HungryTiger instead of CuteKitty), the virtual machine throws NoClassDefFoundError.
If the binary data is passed to defineClass(), but contains a class or interface whose name already appears in the current class loader's namespace, the defineClass() method completes abruptly with a LinkageError.
If the class has no superclass and isn't class Object itself, the virtual machine throws a ClassFormatError. (Note that this check has to be done here, during the loading step, because that one piece of information, the symbolic reference to the superclass, is needed by the virtual machine during this step. During Step 1, the virtual machine must load in all the superclasses recursively.)
If a class is its own superclass, or an interface is its own superinterface, the virtual machine throws ClassCircularityError.
If a class names an interface as its superclass, or a class or interface names a class as a superinterface, the virtual machine throws IncompatibleClassChangeError.
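Two of these failure modes are easy to trigger from a user-defined class loader. The sketch below uses hypothetical names (LoadingErrorsDemo, Loader, no.such.Type). Note that in current JDKs findSystemClass() goes through the system class loader, which in turn delegates upward to the bootstrap class loader.

```java
class LoadingErrorsDemo {
    // Hypothetical loader that exposes the two protected ClassLoader methods used here.
    static class Loader extends ClassLoader {
        Class<?> define(String name, byte[] data) {
            return defineClass(name, data, 0, data.length);
        }

        Class<?> askSystemLoader(String name) throws ClassNotFoundException {
            return findSystemClass(name);
        }
    }

    // Bytes that are clearly not in the class file format.
    static String garbageBytes() {
        byte[] notAClassFile = {1, 2, 3, 4};
        try {
            new Loader().define("CuteKitty", notAClassFile);
            return "defined";
        } catch (ClassFormatError e) {
            return "ClassFormatError";
        }
    }

    // Delegating for a type that no loader in the chain can find.
    static String missingType() {
        try {
            new Loader().askSystemLoader("no.such.Type");
            return "found";
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException";
        }
    }

    public static void main(String[] args) {
        System.out.println(garbageBytes());
        System.out.println(missingType());
    }
}
```

The distinction matters in practice: ClassNotFoundException is a checked exception that a delegating loader can catch and recover from, whereas ClassFormatError is a LinkageError raised by the virtual machine while parsing the bytes.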
After loading is complete, the virtual machine checks for access permission. If the referencing type does
not have permission to access the referenced type, the virtual machine throws an
IllegalAccessError
. Step 1b is another activity that is logically part of
verification, but that is performed at some other time than the official verification phase. The check for
access permission will always take place after Step 1a, ensuring a type referenced from a symbolic reference
is loaded into the current namespace, as part of resolving that symbolic reference. Once this check is
complete, Step 1b--and the entire process of resolving the CONSTANT_Class_info
entry--is complete.
If an error occurred in Steps 1a or 1b, the resolution of the symbolic reference to the type fails. But if all went well up until the access permission check of Step 1b, the type is still usable in general, just not usable by the referencing type. If an error occurred before the access permission check, however, the type is unusable and must be marked as such or discarded.
Step 2. Link and Initialize the Type and any Superclasses
At this point, the type being referred to by the CONSTANT_Class_info
entry
being resolved has been loaded, but not necessarily linked or initialized. In addition, all the type's
superclasses and superinterfaces have been loaded, but not necessarily linked or initialized. Some of the
supertypes may be initialized at this point, because they may have been initialized during earlier resolutions.
As described in Chapter 7, "The Lifetime of a Class," superclasses must be initialized before subclasses.
If the virtual machine is resolving a reference to a class (not an interface) because of an active use of that class, it must make sure that the superclasses have been initialized, starting with Object and proceeding down the inheritance hierarchy to the referenced class. (Note that this is the opposite of the order in which they were loaded in Step 1a.) If a type hasn't yet been linked, it must be linked before it is initialized. Note that only superclasses must be initialized, not superinterfaces.
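A short sketch can show both rules at once: superclasses are initialized from the top down, and a superinterface is not initialized merely because an implementing class is. The names InitOrderDemo, Top, Middle, Bottom and Marker are made up. One caveat from the current Java Language Specification: a superinterface that declares default methods is initialized along with the class, so Marker deliberately declares none.

```java
class InitOrderDemo {
    static final StringBuilder log = new StringBuilder();

    // Appends a name to the log; the return value exists only so that the
    // call can be used as a field initializer.
    static int note(String who) {
        log.append(who).append(' ');
        return 0;
    }

    // UNUSED is not a compile-time constant, so Marker would log its name
    // if it were ever initialized.
    interface Marker {
        int UNUSED = note("Marker");
    }

    static class Top {
        static { note("Top"); }
    }

    static class Middle extends Top implements Marker {
        static { note("Middle"); }
    }

    static class Bottom extends Middle {
        static { note("Bottom"); }

        static void touch() { }
    }

    static String run() {
        Bottom.touch(); // active use of Bottom triggers initialization
        return log.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "Top Middle Bottom"
    }
}
```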
Step 2 begins with the official verification phase of linking, described in Chapter 7, "The Lifetime of a Class." As mentioned in Chapter 7, the process of verification may require that the virtual machine load new types to ensure the bytecodes are adhering to the semantics of the Java language. For example, if a reference to an instance of a particular class is assigned to a variable with a declared type of a different class, the virtual machine would have to load both types to make sure one is a subclass of the other. These classes would at this point be loaded and possibly linked, but definitely not initialized.
If during the verification process the Java virtual machine uncovers trouble, it throws
VerifyError
.
After the official verification phase is complete, the type must be prepared. As described in Chapter 7, "The Lifetime of a Class," during preparation the virtual machine allocates memory for class variables and implementation-dependent data structures such as method tables.
At this point, the type has been loaded, verified and prepared. As described in Chapter 7, "The Lifetime of a Class," a Java virtual machine implementation may optionally resolve the type at this point. Keep in mind that at this stage in the resolution process, Steps 1a, 2a, and 2b have been performed on a referenced type to resolve a CONSTANT_Class_info entry in the constant pool of a referencing type. Step 2c is the resolution of symbolic references contained in the referenced type, not the referencing type. (And by the way, Step 1b is not included in that list because Step 1b has nothing to do with the referenced type's loading, linking, and initialization process. Step 1b is actually part of pass four of the verification step of the linking phase of the referencing type, the type that contains the symbolic reference to the referenced type.)
For example, if the virtual machine is resolving a symbolic reference from class Cat to class Mouse, the virtual machine performs Steps 1a, 2a, and 2b on class Mouse. At this stage of resolving the symbolic reference to Mouse contained in the constant pool of Cat, the virtual machine could optionally (as Step 2c) resolve all the symbolic references contained in the constant pool for Mouse. If Mouse's constant pool contains a symbolic reference to class Cheese, for example, the virtual machine could load and optionally link (but not initialize) Cheese at this time. The virtual machine mustn't attempt to initialize Cheese here because Cheese is not being actively used. (Of course, Cheese may in fact have already been actively used elsewhere, so it could already have been loaded into this namespace, linked, and initialized.)
As mentioned earlier in this chapter, if an implementation does perform Step 2c at this point in the resolution process (early resolution), it must not report any errors until the symbolic references are actually used by the running program. For example, if during the resolution of Mouse's constant pool, the virtual machine can't find class Cheese, it won't throw a NoClassDefFoundError until (and unless) Cheese is actually used by the program.
At this point, the type has been loaded, verified, prepared and optionally resolved. At long last, the type is ready for initialization. As defined in Chapter 7, "The Lifetime of a Class," initialization consists of two steps: the initialization of the type's superclasses in top-down order, if the type has any superclasses, and the execution of the type's class initialization method, if it has one. Step 2d just consists of executing the class initialization method, if one exists. Because Step 2d is performed for all the referenced type's superclasses, from the top down, Step 2d will occur for superclasses before it occurs for subclasses.
If the class initialization method completes abruptly by throwing some exception that isn't a subclass of
Error
, the virtual machine throws
ExceptionInInitializerError
with the thrown exception as a parameter to
the constructor. Otherwise, if the thrown exception is already a subclass of Error
, that
error is thrown. If the virtual machine can't create a new
ExceptionInInitializerError
because there isn't enough memory, it throws
an OutOfMemoryError
.
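The wrapping rule can be checked with a small sketch. The names InitializerFailureDemo, Fragile and Broken are hypothetical. The sketch also shows one behavior the text above does not cover but that the Java Language Specification defines: once initialization has failed, later uses of the class fail with NoClassDefFoundError.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

class InitializerFailureDemo {
    // Initialization fails with an exception that is not a subclass of Error.
    static class Fragile {
        static int value = compute();

        static int compute() {
            throw new IllegalStateException("boom");
        }
    }

    // Initialization fails with something that already is a subclass of Error.
    static class Broken {
        static int value = compute();

        static int compute() {
            throw new AssertionError("already an Error");
        }
    }

    // Performs the access and describes the Error it produced, if any.
    static String describe(IntSupplier access) {
        try {
            return "value " + access.getAsInt();
        } catch (Error e) {
            Throwable cause = e.getCause();
            return e.getClass().getSimpleName()
                    + (cause == null ? "" : " caused by " + cause.getClass().getSimpleName());
        }
    }

    static List<String> run() {
        List<String> results = new ArrayList<>();
        results.add(describe(() -> Fragile.value)); // first active use: wrapped
        results.add(describe(() -> Fragile.value)); // later use: class is unusable
        results.add(describe(() -> Broken.value));  // Error passes through unwrapped
        return results;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first result is "ExceptionInInitializerError caused by IllegalStateException" and the third is "AssertionError". The second result starts with "NoClassDefFoundError"; whether a cause is attached depends on the JDK version.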
CONSTANT_Fieldref_info Entries
To resolve a constant pool entry of type CONSTANT_Fieldref_info, the virtual machine must first resolve the CONSTANT_Class_info entry specified in the class_index item. Therefore, any error that can be thrown because of the resolution of a CONSTANT_Class_info can be thrown during the resolution of a CONSTANT_Fieldref_info. If resolution of the CONSTANT_Class_info entry succeeds, the virtual machine searches for the indicated field in the type and its supertypes. If it finds the indicated field, the virtual machine checks to make sure the current class has permission to access the field.
If resolution of the CONSTANT_Class_info entry completes successfully, the virtual machine performs the field lookup process using these steps:
If the virtual machine discovers there is no field with the proper name and type in the referenced class or
any of its supertypes (if field lookup failed), the virtual machine throws
NoSuchFieldError
. Otherwise, if the field lookup succeeds, but the current class
doesn't have permission to access the field, the virtual machine throws
IllegalAccessError
.
Otherwise, the virtual machine marks the entry as resolved and places a direct reference to the field in the data for the constant pool entry.
CONSTANT_Methodref_info Entries
To resolve a constant pool entry of type CONSTANT_Methodref_info, the virtual machine must first resolve the CONSTANT_Class_info entry specified in the class_index item. Therefore, any error that can be thrown because of the resolution of a CONSTANT_Class_info can be thrown during the resolution of a CONSTANT_Methodref_info. If the resolution of the CONSTANT_Class_info entry succeeds, the virtual machine searches for the indicated method in the type and its supertypes. If it finds the indicated method, the virtual machine checks to make sure the current class has permission to access the method.
If resolution of the CONSTANT_Class_info entry completes successfully, the virtual machine performs method resolution using these steps:
IncompatibleClassChangeError
.
If the virtual machine discovers there is no method with the proper name, return type, and number and
types of parameters in the referenced class or any of its supertypes (if method lookup fails), the virtual
machine throws NoSuchMethodError
. Otherwise, if the method exists, but the
method is abstract, the virtual machine throws AbstractMethodError
. Otherwise,
if the method exists, but the current class doesn't have permission to access the method, the virtual machine
throws IllegalAccessError
.
Otherwise, the virtual machine marks the entry as resolved and places a direct reference to the method in the data for the constant pool entry.
CONSTANT_InterfaceMethodref_info Entries
To resolve a constant pool entry of type CONSTANT_InterfaceMethodref_info, the virtual machine must first resolve the CONSTANT_Class_info entry specified in the class_index item. Therefore, any error that can be thrown because of the resolution of a CONSTANT_Class_info can be thrown during the resolution of a CONSTANT_InterfaceMethodref_info. If the resolution of the CONSTANT_Class_info entry succeeds, the virtual machine searches for the indicated method in the interface and its supertypes. (The virtual machine need not check to make sure the current class has permission to access the method, because all methods declared in interfaces are implicitly public.)
If resolution of the CONSTANT_Class_info entry completes successfully, the virtual machine performs interface method resolution using these steps:
IncompatibleClassChangeError
.
java.lang.Object
for a method of the
specified name and descriptor. If the virtual machine discovers such a method, that method is the result of
the successful interface method lookup.
If the virtual machine discovers there is no method with the proper name, return type, and number and
types of parameters in the referenced interface or any of its supertypes, the virtual machine throws
NoSuchMethodError
.
Otherwise, the virtual machine marks the entry as resolved and places a direct reference to the method in the data for the constant pool entry.
CONSTANT_String_info Entries
To resolve an entry of type CONSTANT_String_info, the virtual machine must place a reference to an interned String object in the data for the constant pool entry being resolved. The String object (an instance of class java.lang.String) must have the character sequence specified by the CONSTANT_Utf8_info entry identified by the string_index item of the CONSTANT_String_info.
Each Java virtual machine must maintain an internal list of references to String
objects that have been "interned" during the course of running the application. Basically, a
String
object is said to be interned simply if it appears in the virtual machine's internal
list of interned String
objects. The point of maintaining this list is that any particular
sequence of characters is guaranteed to appear in the list no more than once.
To intern a sequence of characters represented by a CONSTANT_String_info
entry, the virtual machine checks to see if the sequence of characters is already in the list of interned strings.
If so, the virtual machine uses the reference to the existing, previously-interned String
object. Otherwise, the virtual machine creates a new String
object with the proper
character sequence and adds a reference to that String
object to the list. To complete
the resolution process for a CONSTANT_String_info
entry, the virtual machine
places the reference to the interned String
object in the data of the constant pool
entry being resolved.
In your Java programs, you can intern a string by invoking the intern() method of class String. All literal strings are interned via the process of resolving CONSTANT_String_info entries. If a string with the same sequence of Unicode characters has been previously interned, the intern() method returns a reference to the matching already-interned String object. If the intern() method is invoked on a String object that contains a sequence of characters that has not yet been interned, that object itself will be interned. In that case, the intern() method returns a reference to the same String object upon which it was invoked.
Here's an example:
    // On CD-ROM in file linking/ex1/Example1.java
    class Example1 {

        // Assume this application is invoked with one command-line
        // argument, the string "Hi!".
        public static void main(String[] args) {

            // argZero, because it is assigned a String from the command
            // line, does not reference a string literal. This string
            // is not interned.
            String argZero = args[0];

            // literalString, however, does reference a string literal.
            // It will be assigned a reference to a String with the value
            // "Hi!" by an instruction that references a
            // CONSTANT_String_info entry in the constant pool. The
            // "Hi!" string will be interned by this process.
            String literalString = "Hi!";

            // At this point, there are two String objects on the heap
            // that have the value "Hi!". The one from args[0], which
            // isn't interned, and the one from the literal, which
            // is interned.
            System.out.print("Before interning argZero: ");
            if (argZero == literalString) {
                System.out.println("they're the same string object!");
            }
            else {
                System.out.println("they're different string objects.");
            }

            // argZero.intern() returns the reference to the literal
            // string "Hi!" that is already interned. Now both argZero
            // and literalString refer to the same String object. The
            // non-interned version of "Hi!" is now available for
            // garbage collection.
            argZero = argZero.intern();
            System.out.print("After interning argZero: ");
            if (argZero == literalString) {
                System.out.println("they're the same string object!");
            }
            else {
                System.out.println("they're different string objects.");
            }
        }
    }

When executed with the string "Hi!" as the first command-line argument, the Example1 application prints the following:

    Before interning argZero: they're different string objects.
    After interning argZero: they're the same string object!
The CONSTANT_Integer_info, CONSTANT_Long_info, CONSTANT_Float_info, and CONSTANT_Double_info entries contain the constant values they represent within the entry itself. These are straightforward to resolve. To resolve this kind of entry, many virtual machine implementations may not have to do anything but use the value as is. Other implementations, however, may choose to do some processing on it. For example, a virtual machine on a little-endian machine could choose to swap the byte order of the value at resolve time.
Entries of type CONSTANT_Utf8_info
and
CONSTANT_NameAndType_info
are never referred to directly by instructions. They
are only referred to via other types of entries, and resolved when those referring entries are resolved.
A Java type can refer symbolically to another type in the constant pool in ways that require special
attention when performing resolution to ensure type safety in the presence of multiple class loaders. When
one type contains a symbolic reference to a field in another type, the symbolic reference includes a descriptor
that specifies the type of the field. When one type contains a symbolic reference to a method in another type,
the symbolic reference includes a descriptor that specifies the types of the return value and parameters, if
any. If the referenced and referencing types do not have the same initiating loader, the virtual machine must
make sure the types mentioned in the field and method descriptors are consistent across the namespaces. For
example, imagine class Cat
contains symbolic references to fields and methods declared
in class Mouse
, and that two different class loaders initiated the loading of
Cat
and Mouse
. To preserve type safety in the presence of multiple
class loaders, it is essential that the fully qualified type names mentioned in field and method descriptors
contained in Cat
refer to the same type data (in the method area) as those same names
in class Mouse
.
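The reason names alone are not enough can be seen in a sketch in which two class loaders each define a type from identical bytes. The result is two distinct types that share one fully qualified name. The names NamespaceDemo and Loader, the minimalClassFile() helper, and the class file major version 52 are assumptions for illustration. The sketch does not model the loading constraints themselves, only the situation that makes them necessary.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class NamespaceDemo {
    // Minimal class file: a public class with superclass java.lang.Object.
    static byte[] minimalClassFile(String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);
            out.writeShort(0);                      // minor_version
            out.writeShort(52);                     // major_version (assumed)
            out.writeShort(5);                      // constant_pool_count
            out.writeByte(7); out.writeShort(2);    // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name);   // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);    // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8
            out.writeShort(0x0021);                 // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                      // this_class
            out.writeShort(3);                      // super_class
            for (int i = 0; i < 4; i++) {
                out.writeShort(0);                  // no interfaces, fields, methods, attributes
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static class Loader extends ClassLoader {
        Class<?> define(String name, byte[] data) {
            return defineClass(name, data, 0, data.length);
        }
    }

    // Two loaders, one name, identical bytes: still two different types.
    static boolean sameNameDifferentType() {
        byte[] data = minimalClassFile("Mouse");
        Class<?> first = new Loader().define("Mouse", data);
        Class<?> second = new Loader().define("Mouse", data);
        return first.getName().equals(second.getName()) && first != second;
    }

    // Within one namespace a name may be defined only once.
    static String defineTwiceInOneLoader() {
        byte[] data = minimalClassFile("Mouse");
        Loader loader = new Loader();
        loader.define("Mouse", data);
        try {
            loader.define("Mouse", data);
            return "defined twice";
        } catch (LinkageError e) {
            return "LinkageError";
        }
    }

    public static void main(String[] args) {
        System.out.println(sameNameDifferentType());
        System.out.println(defineTwiceInOneLoader());
    }
}
```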
To ensure that Java virtual machine implementations enforce this type consistency across namespaces, the second edition of the Java virtual machine specification defined several loading constraints. Each Java virtual machine must maintain an internal list of these constraints, each of which basically states that a name in one namespace must refer to the same type data in the method area as the same name in another namespace. As a Java virtual machine encounters symbolic references to fields and methods of referenced types whose loading wasn't initiated by the same class loader that initiated loading of the referencing type, the virtual machine may add constraints to the list. The virtual machine must check that all current loading constraints are met when it resolves symbolic references.
To describe the loading constraints, the notation $\langle C, L_d \rangle^{L_i}$ will be used to represent types. $C$ denotes the fully qualified name of the type. $L_d$ denotes the defining class loader of the type. $L_i$ denotes the class loader that initiated loading of the type. When the defining class loader is irrelevant, the simplified notation $C^{L_i}$ will be used to denote the type and its initiating class loader. When the initiating loader is irrelevant, the simplified notation $\langle C, L_d \rangle$ will be used to denote the type and its defining class loader. An equals sign between two types denotes that both types are actually the exact same type, represented by the same type data in the method area.
Given this notation, the rules for generating loading constraints are:
If $\langle D, L_1 \rangle$ contains a symbolic reference to a field of type $T$ declared in class $\langle C, L_2 \rangle$, the virtual machine must generate the loading constraint: $T^{L_1} = T^{L_2}$
If $\langle D, L_1 \rangle$ contains a symbolic reference to a method with return type $T_0$ and parameter types $(T_1, \ldots, T_n)$ declared in class $\langle C, L_2 \rangle$, the virtual machine must generate the loading constraint: $T_0^{L_1} = T_0^{L_2}, \ldots, T_n^{L_1} = T_n^{L_2}$
If a method declared in class $\langle D, L_1 \rangle$ overrides a method with return type $T_0$ and parameter types $(T_1, \ldots, T_n)$ declared in class $\langle C, L_2 \rangle$, the virtual machine must generate the loading constraint: $T_0^{L_1} = T_0^{L_2}, \ldots, T_n^{L_1} = T_n^{L_2}$
If the virtual machine's internal list of constraints contains the two constraints $T^{L_1} = T^{L_2}$ and $T^{L_2} = T^{L_3}$, this implies that $T^{L_1} = T^{L_3}$. Even if type $T$ is never loaded by $L_2$ during the execution of the virtual machine instance, the types named $T$ loaded by $L_1$ and $L_3$ must still be the same exact type.
For a less mathematical look at loading constraints, refer to the last example in this chapter. This example, which is presented in the section titled "Example: Type Safety and Loading Constraints," shows how the lack of loading constraints can enable an industrious cracker to thwart the Java virtual machine's guarantee of type safety.
As mentioned in Chapter 7, "The Lifetime of a Class," references to static final variables initialized to a
compile-time constant are resolved at compile-time to a local copy of the constant value. This is true for
constants of all the primitive types and of type java.lang.String
.
This special treatment of constants facilitates two features of the Java language. First, local copies of
constant values enable static final variables to be used as case
expressions in
switch
statements. The two virtual machine instructions that implement
switch
statements in bytecodes, tableswitch
and
lookupswitch
, require the case
values in-line in the bytecode
stream. These instructions do not support run-time resolution of case
values. See
Chapter 16, "Control Flow," for more information about these instructions.
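One observable consequence of resolving constants at compile time is that reading such a constant is not an active use of the declaring class, so it does not trigger initialization. The sketch below illustrates this together with the case-label use described above. The class ConstantInliningDemo and the fields capacity and population are made up for illustration; AntHill here is a nested stand-in, not the AntHill class from the CD-ROM examples.

```java
class ConstantInliningDemo {
    static final StringBuilder log = new StringBuilder();

    static class AntHill {
        static final boolean debug = true;   // compile-time constant
        static final int capacity = 100;     // compile-time constant
        // Not a constant expression, so reads of this field go through the class.
        static final int population = Integer.parseInt("7");

        static {
            log.append("AntHill initialized");
        }
    }

    static String[] run() {
        int observed = 100;
        String label;
        switch (observed) {
            case AntHill.capacity:           // legal only because capacity is a constant
                label = "full";
                break;
            default:
                label = "not full";
        }
        // Only constants have been read so far, so AntHill is not yet initialized.
        String beforeFieldAccess = AntHill.debug + "/" + label + "/" + log;

        int p = AntHill.population;          // active use: triggers initialization
        String afterFieldAccess = p + "/" + log;
        return new String[] { beforeFieldAccess, afterFieldAccess };
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

On the first call, run() returns "true/full/" followed by "7/AntHill initialized": the log is still empty after the constants are read, and fills in only when the non-constant field is accessed.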
The other motivation behind the special treatment of constants is conditional compilation. Java supports
conditional compilation via if
statements whose expressions resolve to a compile-time
constant. Here's an example:
    // On CD-ROM in file linking/ex2/AntHill.java
    class AntHill {

        static final boolean debug = true;
    }

    // On CD-ROM in file linking/ex2/Example2.java
    class Example2 {

        public static void main(String[] args) {
            if (AntHill.debug) {
                System.out.println("Debug is true!");
            }
        }
    }
Because of the special treatment of primitive constants, the Java compiler can decide whether or not to
include the body of the if
statement in Example2.main()
depending upon the value of AntHill.debug
. Because
AntHill.debug
is true
in this case,
javac
generates bytecodes for Example2
's
main()
method that include the body of the if
statement, but not
a check of AntHill.debug
's value. The constant pool of
Example2
has no symbolic reference to class AntHill
. Here are
the bytecodes for main()
:
    // Push objref from System.out
    0 getstatic #8
    // Push objref to literal string "Debug is true!"
    3 ldc #1
    // Pop objref (to a String), pop objref (to
    // System.out), invoke println() on System.out
    // passing the string as the only parameter:
    // System.out.println("Debug is true!");
    5 invokevirtual #9
    8 return // return void
If the reference to AntHill.debug were resolved at run-time, the compiler would always need to include a check of AntHill.debug's value and the body of the if statement just in case the value of AntHill.debug ever changed.
The value of AntHill.debug
can't change after it is compiled, of course, because it
is declared as final. Still, you could change the source code of AntHill
and recompile
AntHill
, but not recompile Example2
.
Because the reference to AntHill.debug is resolved at compile-time, the compiler can conditionally compile out the body of the if statement if AntHill.debug is discovered to be false. Note that this means you can't change the behavior of the Example2 application just by setting AntHill.debug to false and recompiling only AntHill. You have to recompile Example2 as well.
Example3
, shown below, is Example2
with its name
changed to Example3
and compiled with an AntHill
that has
debug
set to false
:
    // On CD-ROM in file linking/ex3/AntHill.java
    class AntHill {

        static final boolean debug = false;
    }

    // On CD-ROM in file linking/ex3/Example3.java
    class Example3 {

        public static void main(String[] args) {
            if (AntHill.debug) {
                System.out.println("Debug is true!");
            }
        }
    }
Here are the bytecodes generated by javac
for Example3
's
main()
method:
0 return // return void
As you can see, the Java compiler has brazenly eliminated the entire if
statement
found in Example3.main()
. There is not even a hint of the
println()
invocation in this very short bytecode sequence.
The ultimate goal of constant pool resolution is to replace a symbolic reference with a direct reference. The form of symbolic references is well-defined in Chapter 6, "The Java Class File," but what form do direct references take? As you might expect, the form of direct references is yet another decision of the designers of individual Java virtual machine implementations. Nevertheless, there are some characteristics likely to be common among most implementations.
Direct references to types, class variables, and class methods are likely native pointers into the method area. A direct reference to a type can simply point to the implementation-specific data structure in the method area that holds the type data. A direct reference to a class variable can point to the class variable's value stored in the method area. A direct reference to a class method can point to a data structure in the method area that contains the data needed to invoke the method. For example, the data structure for a class method could include information such as whether or not the method is native. If the method is native, the data structure could include a function pointer to the dynamically linked native method implementation. If the method is not native, the data structure could include the method's bytecodes, max_stack, max_locals, and so on. If there is a just-in-time-compiled version of the method, the data structure could include a pointer to that just-in-time-compiled native code.
Direct references to instance variables and instance methods are offsets. A direct reference to an instance variable is likely the offset from the start of the object's image to the location of the instance variable. A direct reference to an instance method is likely an offset into a method table.
Using offsets to represent direct references to instance variables and instance methods depends on a predictable ordering of the fields in a class's object image and the methods in a class's method table. Although implementation designers may choose any way of placing instance variables into an object image or methods into a method table, they will almost certainly use the same way for all types. Therefore, in any one implementation, the ordering of fields in an object and methods in a method table is defined and predictable.
As an example, consider this hierarchy of three classes and one interface:
    // On CD-ROM in file linking/ex4/Friendly.java
    interface Friendly {

        void sayHello();
        void sayGoodbye();
    }

    // On CD-ROM in file linking/ex4/Dog.java
    class Dog {

        // How many times this dog wags its tail when
        // saying hello.
        private int wagCount = ((int) (Math.random() * 5.0)) + 1;

        void sayHello() {

            System.out.print("Wag");
            for (int i = 0; i < wagCount; ++i) {
                System.out.print(", wag");
            }
            System.out.println(".");
        }

        public String toString() {

            return "Woof!";
        }
    }

    // On CD-ROM in file linking/ex4/CockerSpaniel.java
    class CockerSpaniel extends Dog implements Friendly {

        // How many times this Cocker Spaniel woofs when saying hello.
        private int woofCount = ((int) (Math.random() * 4.0)) + 1;

        // How many times this Cocker Spaniel wimpers when saying
        // goodbye.
        private int wimperCount = ((int) (Math.random() * 3.0)) + 1;

        public void sayHello() {

            // Wag that tail a few times.
            super.sayHello();

            System.out.print("Woof");
            for (int i = 0; i < woofCount; ++i) {
                System.out.print(", woof");
            }
            System.out.println("!");
        }

        public void sayGoodbye() {

            System.out.print("Wimper");
            for (int i = 0; i < wimperCount; ++i) {
                System.out.print(", wimper");
            }
            System.out.println(".");
        }
    }

    // On CD-ROM in file linking/ex4/Cat.java
    class Cat implements Friendly {

        public void eat() {

            System.out.println("Chomp, chomp, chomp.");
        }

        public void sayHello() {

            System.out.println("Rub, rub, rub.");
        }

        public void sayGoodbye() {

            System.out.println("Scamper.");
        }

        protected void finalize() {

            System.out.println("Meow!");
        }
    }
Assume these types are loaded into a Java virtual machine that organizes objects by placing the instance variables declared in superclasses into the object image before those declared in subclasses, and by placing the instance variables for each individual class in their order of appearance in the class file. Assuming there are no instance variables in class Object, the object images for Dog, CockerSpaniel, and Cat would appear as shown in Figure 8-1.
In this figure, the object image for CockerSpaniel best illustrates this particular virtual machine's approach to laying out objects. The instance variable for Dog, the superclass, appears before the instance variables for CockerSpaniel, the subclass. The instance variables of CockerSpaniel appear in order of declaration: woofCount first, then wimperCount.
Note that the wagCount instance variable appears at offset one in both Dog and CockerSpaniel. In this implementation of the Java virtual machine, a symbolic reference to the wagCount field of class Dog would be resolved to a direct reference that is an offset of one. Regardless of whether the actual object being referred to was a Dog, a CockerSpaniel, or any other subclass of Dog, the wagCount instance variable would always appear at offset one in the object image.
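The layout rule just described can be sketched in a few lines of Java. This is only an illustrative model, not how any particular virtual machine stores objects: the class ObjectLayout and its methods are hypothetical, and slot zero is assumed to hold a pointer to type data, since the text says only that wagCount lands at offset one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of an object image: an ordered list of named slots.
final class ObjectLayout {
    private final List<String> slots;

    private ObjectLayout(List<String> slots) {
        this.slots = slots;
    }

    // Layout for class Object, which is assumed to declare no instance variables.
    // ASSUMPTION: slot 0 holds a pointer to type data, so fields start at offset one.
    static ObjectLayout forObject() {
        List<String> slots = new ArrayList<>();
        slots.add("<pointer to type data>");
        return new ObjectLayout(slots);
    }

    // Superclass slots come first, then this class's fields in declaration order.
    ObjectLayout extend(String... declaredFields) {
        List<String> copy = new ArrayList<>(slots);
        copy.addAll(Arrays.asList(declaredFields));
        return new ObjectLayout(copy);
    }

    // The "direct reference" for a field in this model: its offset in the object image.
    int offsetOf(String field) {
        return slots.indexOf(field);
    }

    public static void main(String[] args) {
        ObjectLayout dog = forObject().extend("wagCount");
        ObjectLayout spaniel = dog.extend("woofCount", "wimperCount");
        System.out.println("wagCount in Dog:              " + dog.offsetOf("wagCount"));
        System.out.println("wagCount in CockerSpaniel:    " + spaniel.offsetOf("wagCount"));
        System.out.println("woofCount in CockerSpaniel:   " + spaniel.offsetOf("woofCount"));
        System.out.println("wimperCount in CockerSpaniel: " + spaniel.offsetOf("wimperCount"));
    }
}
```

Because a subclass layout always starts as a copy of its superclass layout, an offset computed for a field of Dog stays valid for every subclass of Dog, which is the property the direct reference relies on.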
A similar pattern emerges in method tables. A method table entry is associated in some way with data structures in the method area that contain sufficient data to enable the virtual machine to invoke the method. Assume that in the Java virtual machine implementation being described here, method tables are arrays of native pointers into the method area. The data structures that the method table entries point to are similar to the data structures described above for class methods. Assume that the particular Java virtual machine implementation that loads these types organizes its method tables by placing methods for superclasses into the method table before those for subclasses, and by placing pointers for each class in the order the methods appear in the class file. The exception to the ordering is that methods overridden by a subclass appear in the slot where the overridden method first appears in a superclass.
The way this virtual machine would organize the method table for class Dog is shown in Figure 8-2. In this figure, the method table entries that point to methods defined in class Object are shown in dark gray. Entries that point to methods defined in Dog are shown in light gray.
Figure 8-2. The method table for Dog.
Note that only non-private instance methods appear in this method table. Class methods, which are invoked via the invokestatic instruction, need not appear here, because they are statically bound and don't need the extra indirection of a method table. Private methods and instance initialization methods need not appear here because they are invoked via the invokespecial instruction and are therefore statically bound. Only methods that are invoked with invokevirtual or invokeinterface appear in this method table. See Chapter 19, "Method Invocation and Return," for a discussion of the different invocation instructions.
By looking at the source code, you can see that Dog overrides the toString() method defined in class Object. In Dog's method table, the toString() method appears only once, in the same slot (offset seven) in which it appears in the method table for Object. The pointer residing at offset seven in Dog's method table points to the data for Dog's implementation of toString(). In this implementation of the Java virtual machine, the pointer to the method data for toString() will appear at offset seven for every method table of every class. (Actually, you could write your own version of java.lang.Object and load it in through a user-defined class loader. In this manner you could create a namespace in which the pointer to toString() occupies a method table offset other than seven in the same Java virtual machine implementation.)
Below the methods declared in Object, which appear first in this method table, come the methods declared in Dog that don't override any method in Object. There is only one such method, sayHello(), which has the method table offset 11. All of Dog's subclasses will either inherit or override this implementation of sayHello(), and some version of sayHello() will always appear at offset 11 in the method table of any subclass of Dog.
Figure 8-3 shows the method table for CockerSpaniel. Note that because CockerSpaniel declares sayHello() and sayGoodbye(), the pointers for those methods point to the data for CockerSpaniel's implementation of those methods. Because CockerSpaniel inherits Dog's implementation of toString(), the pointer for that method (which is still at offset seven) points to the data for Dog's implementation of that method. CockerSpaniel inherits all other methods from Object, so the pointers for those methods point directly into Object's type data. Note also that sayHello() is sitting at offset eleven, the same offset it has in Dog's method table.
Figure 8-3. The method table for CockerSpaniel.
When the virtual machine resolves a symbolic reference (a CONSTANT_Methodref_info entry) to the toString() method of any class, the direct reference is method table offset seven. When the virtual machine resolves a symbolic reference to the sayHello() method of Dog or any of its subclasses, the direct reference is method table offset eleven. When the virtual machine resolves a symbolic reference to the sayGoodbye() method of CockerSpaniel or any of its subclasses, the direct reference is method table offset twelve.
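The method table rules described above (superclass entries first, new methods appended, overrides reusing the superclass slot) can be modeled with a short Java sketch. The MethodTable class is hypothetical, and the order of the eleven Object entries is an assumption chosen so that toString() lands at offset seven as the text states; a real implementation may order them differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a method table: slot i holds "DeclaringClass.method".
final class MethodTable {
    private final List<String> slots;

    private MethodTable(List<String> slots) {
        this.slots = slots;
    }

    // ASSUMED ordering of Object's non-private instance methods (offsets 0 to 10).
    static MethodTable forObject() {
        return new MethodTable(new ArrayList<>()).extend("Object",
            "clone()", "equals(Object)", "finalize()", "getClass()", "hashCode()",
            "notify()", "notifyAll()", "toString()", "wait()", "wait(long)",
            "wait(long,int)");
    }

    // Builds a subclass table: overrides reuse the inherited slot, new methods are appended.
    MethodTable extend(String className, String... declaredMethods) {
        List<String> copy = new ArrayList<>(slots);
        for (String method : declaredMethods) {
            int slot = find(copy, method);
            String entry = className + "." + method;
            if (slot >= 0) {
                copy.set(slot, entry);
            } else {
                copy.add(entry);
            }
        }
        return new MethodTable(copy);
    }

    int offsetOf(String method) {
        return find(slots, method);
    }

    String entryAt(int offset) {
        return slots.get(offset);
    }

    private static int find(List<String> slots, String method) {
        for (int i = 0; i < slots.size(); i++) {
            String entry = slots.get(i);
            if (entry.substring(entry.indexOf('.') + 1).equals(method)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        MethodTable dog = forObject().extend("Dog", "sayHello()", "toString()");
        MethodTable spaniel = dog.extend("CockerSpaniel", "sayHello()", "sayGoodbye()");
        System.out.println(dog.entryAt(7));       // Dog's override of toString()
        System.out.println(spaniel.entryAt(7));   // still Dog's toString(), inherited
        System.out.println(spaniel.entryAt(11));  // CockerSpaniel's override of sayHello()
        System.out.println(spaniel.offsetOf("sayGoodbye()"));
    }
}
```

The sketch keys slots by method name and parameter list only, which is enough to show why an offset resolved against Dog remains valid for any subclass of Dog.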
Once a symbolic reference to an instance method is resolved to a method table offset, the virtual machine must still actually invoke the method. To invoke an instance method, the virtual machine goes through the object to get at the method table for the object's class. As mentioned in Chapter 5, "The Java Virtual Machine," given a reference to an object, every virtual machine implementation must have some way to get at the type data for that object's class. In addition, given a reference to an object, the method table (a part of the type data for the object's class) is usually very quickly accessible. (One potential scheme is shown in Figure 5-7.) Once the virtual machine has the method table for the object's class, it uses the offset to find the actual method to invoke. Voila!
The virtual machine can always depend on method table offsets when it has a reference of a class type (a CONSTANT_Methodref_info entry). If the sayHello() method appears at offset eleven in class Dog, it will appear at offset eleven in any subclass of Dog. The same is not true, however, if the reference is of an interface type (a CONSTANT_InterfaceMethodref_info entry). With direct references to instance methods accessed through an interface reference there is no guaranteed method table offset. Consider the method table for class Cat, shown in Figure 8-4.
Figure 8-4. The method table for Cat.
Note that both Cat and CockerSpaniel implement the Friendly interface. A variable of type Friendly could hold a reference to a Cat object or a CockerSpaniel object. With that reference, your program could invoke sayHello() or sayGoodbye() on a Cat, a CockerSpaniel, or any other object whose class implements the Friendly interface. The Example4 application demonstrates this:
// On CD-ROM in file linking/ex4/Example4.java
class Example4 {
    public static void main(String[] args) {
        Dog dog = new CockerSpaniel();
        dog.sayHello();

        Friendly fr = (Friendly) dog;

        // Invoke sayGoodbye() on a CockerSpaniel object through a
        // reference of type Friendly.
        fr.sayGoodbye();

        fr = new Cat();

        // Invoke sayGoodbye() on a Cat object through a reference
        // of type Friendly.
        fr.sayGoodbye();
    }
}
In Example4, sayGoodbye() is invoked through local variable fr on both a CockerSpaniel object and a Cat object. The same constant pool entry, a CONSTANT_InterfaceMethodref_info entry, is used to invoke this method on both objects. But when the virtual machine resolves the symbolic reference to sayGoodbye(), it can't just save a method table offset and expect that offset to always work in future uses of the constant pool entry.
The trouble is that classes that implement the Friendly interface aren't guaranteed to have a common superclass that also implements Friendly. As a result, the methods declared in Friendly aren't guaranteed to be in the same place in all method tables. If you compare the method table for CockerSpaniel against the method table for Cat, for example, you'll see that in CockerSpaniel, sayHello()'s pointer occupies offset 11. But in Cat, sayHello() occupies offset 12. Likewise, CockerSpaniel's sayGoodbye() method pointer resides at offset 12, but Cat's sayGoodbye() method pointer resides at offset 13.
Thus, whenever the Java virtual machine invokes a method from an interface reference, it must search the method table of the object's class until it finds the appropriate method. This is why invoking instance methods on interface references can be significantly slower than invoking instance methods on class references. Virtual machine implementations can attempt to be smart, of course, about how they search through a method table. For example, an implementation could save the last index at which it found the method and try there first the next time. Or an implementation could build data structures during preparation that help it search through method tables given an interface reference. Nevertheless, invoking a method given an interface reference will likely be to some extent slower than invoking a method given a class reference.
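The "save the last index and try there first" idea can be sketched as follows. This is a simplified model under stated assumptions: InterfaceCallSite is a hypothetical class, method tables are plain arrays of method names, and the eleven Object slots are filled with placeholders. It is not a description of how any real virtual machine implements invokeinterface.

```java
// Hypothetical per-call-site cache for an interface method lookup.
final class InterfaceCallSite {
    private final String methodName;  // e.g. "sayGoodbye()"
    private int lastIndex = -1;       // slot where the method was found last time
    int searches = 0;                 // number of full table searches performed

    InterfaceCallSite(String methodName) {
        this.methodName = methodName;
    }

    // Returns the slot of the method in the given table, trying the cached index first.
    int lookup(String[] methodTable) {
        if (lastIndex >= 0 && lastIndex < methodTable.length
                && methodTable[lastIndex].equals(methodName)) {
            return lastIndex;  // fast path: same slot as last time
        }
        searches++;
        for (int i = 0; i < methodTable.length; i++) {  // slow path: linear search
            if (methodTable[i].equals(methodName)) {
                lastIndex = i;
                return i;
            }
        }
        throw new IllegalStateException("no such method: " + methodName);
    }

    // Builds a table with 11 placeholder Object slots followed by the class's own methods.
    static String[] table(String... ownMethods) {
        String[] t = new String[11 + ownMethods.length];
        for (int i = 0; i < 11; i++) {
            t[i] = "<Object method " + i + ">";
        }
        System.arraycopy(ownMethods, 0, t, 11, ownMethods.length);
        return t;
    }

    public static void main(String[] args) {
        String[] spaniel = table("sayHello()", "sayGoodbye()");
        String[] cat = table("eat()", "sayHello()", "sayGoodbye()");
        InterfaceCallSite site = new InterfaceCallSite("sayGoodbye()");
        System.out.println(site.lookup(spaniel) + " after " + site.searches + " search(es)");
        System.out.println(site.lookup(spaniel) + " after " + site.searches + " search(es)");
        System.out.println(site.lookup(cat) + " after " + site.searches + " search(es)");
    }
}
```

The cache helps only while the same class keeps arriving at the call site; switching from a CockerSpaniel to a Cat forces a new search because sayGoodbye() sits in a different slot.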
_quick Instructions
The first edition of the Java virtual machine specification described a technique used by one of Sun's early implementations of the Java virtual machine to speed up the interpretation of bytecodes. In this scheme, opcodes that refer to constant pool entries are replaced by a "_quick" opcode when the constant pool entry is resolved. When the virtual machine encounters a _quick instruction, it knows the constant pool entry is already resolved and can therefore execute the instruction faster.
The core instruction set of the Java virtual machine consists of 200 single-byte opcodes, all of which are described in Appendix A, "Instruction Set by Opcode Mnemonic." These 200 opcodes are the only opcodes you will ever see in class files. Virtual machine implementations that use the "_quick" technique use another 25 single-byte opcodes internally, the "_quick" opcodes. For example, when a virtual machine that uses the _quick technique resolves a constant pool entry referred to by an ldc instruction (opcode value 0x12), it replaces the ldc opcode byte in the bytecode stream with an ldc_quick instruction (opcode value 0xcb). This technique is part of the process of replacing a symbolic reference with a direct reference in Sun's early virtual machine.
For some instructions, in addition to overwriting the normal opcode with a _quick opcode, a virtual machine that uses the _quick technique overwrites the operands of the instruction with data that represents the direct reference. For example, in addition to replacing an invokevirtual opcode with an invokevirtual_quick, the virtual machine also puts the method table offset and the number of arguments into the two operand bytes that follow every invokevirtual instruction. Placing the method table offset in the bytecode stream following the invokevirtual_quick opcode saves the virtual machine the time it would take to look up the offset in the resolved constant pool entry.
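As a rough illustration of the opcode rewriting, the sketch below models a tiny interpreter that handles only ldc. The opcode values 0x12 and 0xcb come from the text; everything else (the QuickSketch class, a constant pool that holds only strings, and interning as a stand-in for real resolution) is an assumption made for the example.

```java
// Hypothetical interpreter fragment that rewrites ldc to ldc_quick after resolution.
final class QuickSketch {
    static final int LDC = 0x12;        // opcode value quoted in the text
    static final int LDC_QUICK = 0xcb;  // opcode value quoted in the text

    private final String[] constantPool;  // ASSUMPTION: the pool holds only string constants
    private final boolean[] resolved;
    int resolutions = 0;                  // how many times resolution work was performed

    QuickSketch(String... constants) {
        constantPool = constants.clone();
        resolved = new boolean[constants.length];
    }

    // Executes the two-byte instruction at pc and returns the value it would push.
    String execute(byte[] code, int pc) {
        int opcode = code[pc] & 0xff;
        int index = code[pc + 1] & 0xff;
        if (opcode == LDC) {
            if (!resolved[index]) {
                // Stand-in for real resolution of the constant pool entry.
                constantPool[index] = constantPool[index].intern();
                resolved[index] = true;
                resolutions++;
            }
            code[pc] = (byte) LDC_QUICK;  // overwrite the opcode in the bytecode stream
        } else if (opcode != LDC_QUICK) {
            throw new IllegalArgumentException("unsupported opcode 0x" + Integer.toHexString(opcode));
        }
        return constantPool[index];       // the quick form skips the resolution check
    }

    public static void main(String[] args) {
        QuickSketch vm = new QuickSketch("<unused>", "Greetings, planet!", "Hello, world!");
        byte[] code = { (byte) LDC, 2 };
        System.out.println(vm.execute(code, 0));
        System.out.println("opcode is now 0x" + Integer.toHexString(code[0] & 0xff));
        System.out.println(vm.execute(code, 0));
        System.out.println("resolutions: " + vm.resolutions);
    }
}
```

After the first execution the opcode byte in the array has changed, so the second execution goes straight to the resolved entry.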
Salutation Application
As an example of Java's linking model, consider the Salutation application shown below:
// On CD-ROM in file linking/ex5/Salutation.java
class Salutation {
    private static final String hello = "Hello, world!";
    private static final String greeting = "Greetings, planet!";
    private static final String salutation = "Salutations, orb!";

    private static int choice = (int) (Math.random() * 2.99);

    public static void main(String[] args) {
        String s = hello;
        if (choice == 1) {
            s = greeting;
        }
        else if (choice == 2) {
            s = salutation;
        }
        System.out.println(s);
    }
}
Assume that you have asked a Java virtual machine to run Salutation. When the virtual machine starts, it attempts to invoke the main() method of Salutation. It quickly realizes, however, that it can't invoke main(). The invocation of a method declared in a class is an active use of that class, which is not allowed until the class is initialized. Thus, before the virtual machine can invoke main(), it must initialize Salutation. And before it can initialize Salutation, it must load and link Salutation. So, the virtual machine hands the fully qualified name of Salutation to the bootstrap class loader, which retrieves the binary form of the class, parses the binary data into internal data structures, and creates an instance of java.lang.Class. The constant pool for Salutation is shown in Table 8-1.
Index | Type | Value |
---|---|---|
1 | CONSTANT_String_info | 30 |
2 | CONSTANT_String_info | 31 |
3 | CONSTANT_String_info | 39 |
4 | CONSTANT_Class_info | 37 |
5 | CONSTANT_Class_info | 44 |
6 | CONSTANT_Class_info | 45 |
7 | CONSTANT_Class_info | 46 |
8 | CONSTANT_Class_info | 47 |
9 | CONSTANT_Methodref_info | 7, 16 |
10 | CONSTANT_Fieldref_info | 4, 17 |
11 | CONSTANT_Fieldref_info | 8, 18 |
12 | CONSTANT_Methodref_info | 5, 19 |
13 | CONSTANT_Methodref_info | 6, 20 |
14 | CONSTANT_Double_info | 2.99 |
16 | CONSTANT_NameAndType_info | 26, 22 |
17 | CONSTANT_NameAndType_info | 41, 32 |
18 | CONSTANT_NameAndType_info | 49, 34 |
19 | CONSTANT_NameAndType_info | 50, 23 |
20 | CONSTANT_NameAndType_info | 51, 21 |
21 | CONSTANT_Utf8_info | "()D" |
22 | CONSTANT_Utf8_info | "()V" |
23 | CONSTANT_Utf8_info | "(Ljava/lang/String;)V" |
24 | CONSTANT_Utf8_info | "([Ljava/lang/String;)V" |
25 | CONSTANT_Utf8_info | "<clinit>" |
26 | CONSTANT_Utf8_info | "<init>" |
27 | CONSTANT_Utf8_info | "Code" |
28 | CONSTANT_Utf8_info | "ConstantValue" |
29 | CONSTANT_Utf8_info | "Exceptions" |
30 | CONSTANT_Utf8_info | "Greetings, planet!" |
31 | CONSTANT_Utf8_info | "Hello, world!" |
32 | CONSTANT_Utf8_info | "I" |
33 | CONSTANT_Utf8_info | "LineNumberTable" |
34 | CONSTANT_Utf8_info | "Ljava/io/PrintStream;" |
35 | CONSTANT_Utf8_info | "Ljava/lang/String;" |
36 | CONSTANT_Utf8_info | "LocalVariables" |
37 | CONSTANT_Utf8_info | "Salutation" |
38 | CONSTANT_Utf8_info | "Salutation.java" |
39 | CONSTANT_Utf8_info | "Salutations, orb!" |
40 | CONSTANT_Utf8_info | "SourceFile" |
41 | CONSTANT_Utf8_info | "choice" |
42 | CONSTANT_Utf8_info | "greeting" |
43 | CONSTANT_Utf8_info | "hello" |
44 | CONSTANT_Utf8_info | "java/io/PrintStream" |
45 | CONSTANT_Utf8_info | "java/lang/Math" |
46 | CONSTANT_Utf8_info | "java/lang/Object" |
47 | CONSTANT_Utf8_info | "java/lang/System" |
48 | CONSTANT_Utf8_info | "main" |
49 | CONSTANT_Utf8_info | "out" |
50 | CONSTANT_Utf8_info | "println" |
51 | CONSTANT_Utf8_info | "random" |
52 | CONSTANT_Utf8_info | "salutation" |
Table 8-1. Class Salutation's constant pool
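The index chains in Table 8-1 can be followed mechanically. The sketch below copies a handful of entries from the table into a map and walks them; the class ConstantPoolWalk and its representation of entries (int arrays for index pairs, strings for CONSTANT_Utf8_info entries) are assumptions made for illustration, not a real class file parser.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical walk over a few constant pool entries copied from Table 8-1.
final class ConstantPoolWalk {
    static final Map<Integer, Object> POOL = new HashMap<>();
    static {
        POOL.put(6, new int[] {45});       // CONSTANT_Class_info: name index
        POOL.put(7, new int[] {46});       // CONSTANT_Class_info: name index
        POOL.put(13, new int[] {6, 20});   // CONSTANT_Methodref_info: class, name-and-type
        POOL.put(20, new int[] {51, 21});  // CONSTANT_NameAndType_info: name, descriptor
        POOL.put(21, "()D");               // CONSTANT_Utf8_info entries
        POOL.put(45, "java/lang/Math");
        POOL.put(46, "java/lang/Object");
        POOL.put(51, "random");
    }

    static String utf8(int index) {
        return (String) POOL.get(index);
    }

    // Follows a CONSTANT_Class_info entry to the class name it refers to.
    static String className(int classIndex) {
        int[] entry = (int[]) POOL.get(classIndex);
        return utf8(entry[0]);
    }

    // Follows a CONSTANT_Methodref_info entry to class name, method name, and descriptor.
    static String methodRef(int methodrefIndex) {
        int[] ref = (int[]) POOL.get(methodrefIndex);
        int[] nameAndType = (int[]) POOL.get(ref[1]);
        return className(ref[0]) + "." + utf8(nameAndType[0]) + utf8(nameAndType[1]);
    }

    public static void main(String[] args) {
        System.out.println(className(7));   // entry 7 names java/lang/Object
        System.out.println(methodRef(13));  // entry 13 names Math's random method
    }
}
```

What comes out of the walk is still only a set of names. Turning those names into a loaded class or a method pointer is the job of resolution.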
As part of the loading process for Salutation, the Java virtual machine must make sure all of Salutation's superclasses have been loaded. To start this process, the virtual machine looks into Salutation's type data at the super_class item, which is a seven. The virtual machine looks up entry seven in the constant pool, and finds a CONSTANT_Class_info entry that serves as a symbolic reference to class java.lang.Object. See Figure 8-5 for a graphical depiction of this symbolic reference. The virtual machine resolves this symbolic reference, which causes it to load class Object. Because Object is the top of Salutation's inheritance hierarchy, the virtual machine also links and initializes Object.
Figure 8-5. The symbolic reference from Salutation to Object.
Now that the Java virtual machine has loaded the Salutation class and loaded, linked and initialized all its superclasses, the virtual machine is ready to link Salutation. As the first step in the linking process, the virtual machine verifies the integrity of the binary representation of class Salutation. Assume this implementation of the Java virtual machine performs all verification up front, except for the verification of symbolic references. So by the time this official verification phase of linking is completed, the virtual machine will have verified:
- Salutation's binary data is structurally correct
- Salutation correctly implements the semantics of the Java language
- Salutation's bytecodes won't crash the virtual machine
After the Java virtual machine has verified Salutation, it must prepare for Salutation's use by allocating any memory needed by the class. At this stage, the virtual machine allocates memory for Salutation's class variable, choice, and gives it a default initial value. Because the choice class variable is an int, it receives the default initial value of zero.
The three literal Strings (hello, greeting, and salutation) are constants, not class variables. They do not occupy memory space as class variables in the method area. They don't receive default initial values. Because they are declared static and final, they appear as CONSTANT_String_info entries in Salutation's constant pool. The constant pool for Salutation that was generated by javac is shown in Table 8-1. The entries that represent Salutation's constant strings are: for greeting, entry one; for hello, entry two; and for salutation, entry three.
After the processes of verification and preparation have successfully completed, the class is ready for resolution. As mentioned above, different implementations of the Java virtual machine may perform the resolution phase of linking at different times. Resolution of Salutation is optional at this point in its lifetime. Java virtual machines are not required to perform resolution until each symbolic reference is actually used by the program. If a symbolic reference is never actually used by a program, the virtual machine is not required to resolve it.
A Java virtual machine implementation could perform the recursive resolution process, described above for Salutation, at this point in the lifetime of a program. If so, the program would be completely linked before main() is ever invoked. A different Java virtual machine implementation could perform none of the resolution process at this point. Instead, it could resolve each symbolic reference the first time it is actually used by the running program. Other implementations could choose a resolution strategy between these two extremes. Although different implementations may perform resolution at different times, all implementations will ensure that a type is loaded, verified, prepared, and initialized before it is used.
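Whatever resolution strategy an implementation picks, a running program can observe only when initialization happens, and that is tied to first active use. The following sketch makes the timing visible with a static initializer that has a side effect. The class names InitOrderDemo and Lazy are hypothetical and exist only for this demonstration.

```java
// Hypothetical demonstration that a class is initialized at its first active use.
final class InitOrderDemo {
    static final StringBuilder LOG = new StringBuilder();
    static int initCount = 0;  // how many times Lazy's initializer has run

    // Helper class whose static initializer leaves a visible trace.
    static final class Lazy {
        static int value = compute();

        private static int compute() {
            initCount++;
            LOG.append("Lazy initialized;");
            return 42;
        }
    }

    // Returns a log of what happened during this call.
    static String run(boolean touch) {
        LOG.setLength(0);
        LOG.append("start;");
        if (touch) {
            int v = Lazy.value;  // reading a non-constant static field is an active use
            LOG.append("value=").append(v).append(';');
        }
        LOG.append("end;");
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(false));  // Lazy is mentioned in the code but never used
        System.out.println(run(true));   // first active use triggers initialization
        System.out.println(run(true));   // already initialized, so no second run
    }
}
```

An implementation that resolves early may load Lazy before the first call to run(), but it must still delay running Lazy's initializer until the field is actually read.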
Assume this implementation of the Java virtual machine uses late resolution. As each symbolic reference is used for the first time by the program, it will be checked for accuracy and converted into a direct reference. Assume also that this implementation uses the technique of replacing opcodes that refer to the constant pool with their _quick equivalents.
Once this Java virtual machine implementation has loaded, verified, and prepared Salutation, it is ready to initialize it. As mentioned above, the Java virtual machine must initialize all superclasses of a class before it can initialize the class. In this case, the virtual machine has already initialized Object, the superclass of Salutation.
After the virtual machine has made sure all of Salutation's superclasses have been initialized (in this case, just class Object), it is ready to invoke Salutation's <clinit>() method. Because Salutation contains a class variable, choice, that has an initializer that doesn't resolve at compile-time to a constant, the compiler does place a <clinit>() method into Salutation's class file. Here's the <clinit>() method for Salutation:
// Invoke class method Math.random(), passing no
// parameters. Push double result.
 0 invokestatic #13 <Method double random()>
// Push double constant 2.99 from constant pool.
 3 ldc2_w #14 <Double 2.99>
 6 dmul      // Pop two doubles, multiply, push double result.
 7 d2i       // Pop double, convert to int, push int result.
// Pop int, store int Salutation.choice
 8 putstatic #10 <Field int choice>
11 return    // Return void from <clinit>()
The Java virtual machine executes Salutation's <clinit>() method to set the choice field to its proper initial value. Before executing <clinit>(), choice has its default initial value of zero. After executing <clinit>(), it has one of three values chosen pseudo-randomly: zero, one, or two.
The first instruction of the <clinit>() method, invokestatic #13, refers to constant pool entry 13, a CONSTANT_Methodref_info that represents a symbolic reference to the random() method of class java.lang.Math. You can see a graphical depiction of this symbolic reference in Figure 8-6. The Java virtual machine resolves this symbolic reference, which causes it to load, link, and initialize class java.lang.Math. It places a direct reference to the random() method into constant pool entry 13, marks the entry as resolved, and replaces the invokestatic opcode with invokestatic_quick.
Figure 8-6. The symbolic reference from Salutation to Math.random().
Having completed the resolution process for constant pool entry 13, the Java virtual machine is ready to invoke the method. When the virtual machine actually invokes the random() method, it will load, link, and initialize any types referenced symbolically from Math's constant pool and random()'s code. When this method returns, the virtual machine will push the returned double value onto the <clinit>() method's operand stack.
To execute the next instruction, ldc2_w #14, the virtual machine looks into constant pool entry 14 and finds an unresolved CONSTANT_Double_info entry. The virtual machine resolves this entry to the double value 2.99, marks the entry as resolved, and replaces the ldc2_w opcode with ldc2_w_quick. Once the virtual machine has resolved constant pool entry 14, it pushes the constant double value, 2.99, onto the operand stack.
Note that this entry, a CONSTANT_Double_info, does not refer to any other constant pool entry or item outside this class. The eight bytes of the double value 2.99 are specified within the entry itself.
Note also that in this constant pool, there is no entry with an index of 15. As mentioned in Chapter 6, "The Java Class File," entries of type CONSTANT_Double_info and CONSTANT_Long_info occupy two slots in the constant pool. Thus, the CONSTANT_Double_info at index 14 is considered to occupy both indices 14 and 15.
To execute the next instruction, dmul, the virtual machine pops two doubles, multiplies them, and pushes the double result. For the next instruction, d2i, the virtual machine pops the double, converts it to int, and pushes the int result. Assume that for this particular execution of Salutation, the result of this operation is the int value two.
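The effect of these instructions on the operand stack can be traced with a small simulation. This is a sketch only: the class ClinitStackSketch is hypothetical, the operand stack is modeled with a Deque of boxed values, and the result of Math.random() is passed in as a parameter so that the outcome is repeatable.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical trace of the operand stack for the first four <clinit>() instructions.
final class ClinitStackSketch {
    // Returns the int that putstatic would store into choice for a given random() result.
    static int run(double randomValue) {
        Deque<Object> stack = new ArrayDeque<>();

        stack.push(randomValue);          // invokestatic #13: result of Math.random()
        stack.push(2.99);                 // ldc2_w #14: push the double constant

        double b = (Double) stack.pop();  // dmul: pop two doubles,
        double a = (Double) stack.pop();
        stack.push(a * b);                //       push their product

        double d = (Double) stack.pop();  // d2i: pop the double,
        stack.push((int) d);              //      push it truncated to int

        return (Integer) stack.pop();     // the value putstatic #10 would pop
    }

    public static void main(String[] args) {
        System.out.println(run(0.0));  // lowest value random() can return
        System.out.println(run(0.5));
        System.out.println(run(0.9));
    }
}
```

Because Math.random() returns a value of at least 0.0 and less than 1.0, the product is always less than 2.99, so the truncated result can only be zero, one, or two.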
The next instruction, putstatic #10, uses another symbolic reference from the constant pool, this one to the choice variable of Salutation itself. This instruction illustrates that a class's bytecodes use symbolic references to refer not only to fields and methods of other types, but also to its own fields and methods. When the virtual machine executes this instruction, it looks up constant pool entry 10 and finds an as yet unresolved CONSTANT_Fieldref_info item. See Figure 8-7 for a graphical depiction of this symbolic reference. The virtual machine resolves the reference by locating the choice class variable in Salutation's type data in the method area, and placing a pointer to the actual variable data in constant pool entry 10. It marks the entry as resolved and replaces the putstatic opcode with putstatic_quick.
Figure 8-7. The symbolic reference from Salutation to its own choice field.
Once it has resolved the CONSTANT_Fieldref_info entry for choice, the virtual machine pops an int (in this case a two) from the operand stack and places it into the choice variable. The execution of the putstatic instruction is now complete.
Lastly, the virtual machine executes the return instruction, which signals to the virtual machine that the <clinit>() method, and hence the initialization of class Salutation, is complete.
Now that class Salutation has been initialized, it is finally ready for use. The Java virtual machine invokes main(), and the program begins. Here's the bytecode sequence for Salutation's main() method:
// Push objref to literal string from constant pool
// entry 2
 0 ldc #2 <String "Hello, world!">
 2 astore_1      // Pop objref into loc var 1: String s = hello;
// Push int from static field Salutation.choice. Note
// that by this time, choice has definitely been
// given its proper initial value.
 3 getstatic #10 <Field int choice>
 6 iconst_1      // Push int constant 1
// Pop two ints, compare, if not equal branch to 16:
 7 if_icmpne 16  // if (choice == 1) {
// Here, choice does equal 1. Push objref to string
// literal from constant pool:
10 ldc #1 <String "Greetings, planet!">
12 astore_1      // Pop objref into loc var 1: s = greeting;
13 goto 26       // Branch unconditionally to offset 26
// Push int from static field Salutation.choice
16 getstatic #10 <Field int choice>
19 iconst_2      // Push int constant 2
// Pop two ints, compare, if not equal branch to 26:
20 if_icmpne 26  // if (choice == 2) {
// Here, choice does equal 2. Push objref to string
// literal from constant pool:
23 ldc #3 <String "Salutations, orb!">
25 astore_1      // Pop objref into loc var 1: s = salutation;
// Push objref from System.out
26 getstatic #11 <Field java.io.PrintStream out>
29 aload_1       // Push objref (to a String) from loc var 1
// Pop objref (to a String), pop objref (to
// System.out), invoke println() on System.out
// passing the string as the only parameter:
// System.out.println(s);
30 invokevirtual #12 <Method void println(java.lang.String)>
33 return        // Return void from main()
The first instruction in main(), ldc #2, uses a symbolic reference to the string literal "Hello, world!". When the virtual machine executes this instruction, it looks up constant pool entry two and finds a CONSTANT_String_info item that hasn't yet been resolved. See Figure 8-8 for a graphical depiction of the symbolic reference to this string literal.
Figure 8-8. The symbolic reference from Salutation to "Hello, world!".
As part of executing the ldc instruction, the virtual machine resolves the constant pool entry. It creates and interns a new String object with the value "Hello, world!", places a reference to the string object in the constant pool entry, marks the entry as resolved, and replaces the ldc opcode with an ldc_quick.
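The interning step has an effect that is visible from Java code: every occurrence of the same string literal refers to one shared String object. The sketch below checks this with reference comparisons; the class name InternDemo is hypothetical.

```java
// Hypothetical demonstration that string literals refer to interned String objects.
final class InternDemo {
    // Two occurrences of the same literal yield the same object.
    static boolean sameLiteralObject() {
        String a = "Hello, world!";
        String b = "Hello, world!";
        return a == b;
    }

    // A String built at run-time is a separate object until it is interned.
    static boolean copyIsDistinctUntilInterned() {
        String literal = "Hello, world!";
        String copy = new String(literal);
        return copy != literal && copy.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(sameLiteralObject());
        System.out.println(copyIsDistinctUntilInterned());
    }
}
```

Both methods return true, because the Java language guarantees that equal string literals denote the same interned instance.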
Now that the virtual machine has resolved the "Hello, world!" string literal, it pushes the reference to that String object onto the stack. The next instruction, astore_1, pops the reference and stores it into local variable position one, the s variable.
To execute the next instruction, getstatic #10, the virtual machine looks up constant pool entry 10 and discovers a CONSTANT_Fieldref_info entry that has already been resolved. This entry, a symbolic reference to Salutation's own choice field, was resolved by the putstatic #10 instruction in the <clinit>() method. The virtual machine simply replaces the getstatic opcode with getstatic_quick, and pushes the int value of choice onto the stack.
To execute main()'s next instruction, iconst_1, the virtual machine simply pushes int one onto the operand stack. For the next instruction, if_icmpne 16, the virtual machine pops the top two ints and subtracts one from the other. In this case, since the value of choice was set by the <clinit>() method to be two, the result of the subtraction is not zero. As a consequence, the virtual machine takes the branch. It updates the pc register so that the next instruction it executes is the getstatic instruction at offset 16.
The getstatic instruction at offset 16 refers to the same constant pool entry referred to by the getstatic instruction at offset three: constant pool entry 10. When the virtual machine executes the getstatic at offset 16, it looks up constant pool entry 10 and finds a CONSTANT_Fieldref_info entry that is already resolved. It replaces the getstatic opcode with getstatic_quick, and pushes the int value of Salutation's choice class variable (a two) onto the operand stack.
To execute the next instruction, iconst_2, the virtual machine pushes an int two onto the stack. For the next instruction, another if_icmpne (this one if_icmpne 26), the virtual machine again pops two ints and subtracts one from the other. This time, however, both ints equal two, so the result of the subtraction is zero. As a consequence, the virtual machine does not take the branch and continues on to execute the next instruction in the bytecode array, another ldc.
The ldc instruction at offset 23 refers to constant pool entry three, a CONSTANT_String_info entry that serves as a symbolic reference to the string literal "Salutations, orb!". The virtual machine looks up this entry in the constant pool and discovers it is as yet unresolved. To resolve the entry, the virtual machine creates and interns a new String object with the value "Salutations, orb!", places a reference to the new object in the data for constant pool entry three, and replaces the ldc opcode with ldc_quick. Having resolved the string literal, the virtual machine pushes the reference to the String object onto the stack.
To execute the next instruction, astore_1, the virtual machine pops the object reference to the "Salutations, orb!" string literal off the stack and stores it into local variable slot one, overwriting the reference to "Hello, world!" written there by the astore_1 instruction at offset two.
The next instruction, getstatic #11, uses a symbolic reference to a public static class variable of java.lang.System with the name out and the type java.io.PrintStream. This symbolic reference occupies the CONSTANT_Fieldref_info entry at index 11 in the constant pool. See Figure 8-9 for a graphical depiction of this symbolic reference.
Figure 8-9. The symbolic reference from Salutation to System.out.
To resolve the reference to System.out, the Java virtual machine must load, link, and initialize java.lang.System to make sure it has a public static field, named out, of type java.io.PrintStream. Then, the virtual machine will replace the symbolic reference with a direct reference, such as a native pointer, so that any future uses of System.out by Salutation won't require resolution and will be faster. Lastly, the virtual machine will replace the getstatic opcode with getstatic_quick.
Once the virtual machine has successfully resolved the symbolic reference, it will push the reference to System.out onto the stack. To execute the next instruction, aload_1, the virtual machine simply pushes onto the stack the object reference from local variable one, which is the reference to the "Salutations, orb!" string literal.
To execute the next instruction, invokevirtual #12, the Java virtual machine looks up constant pool entry 12 and finds an unresolved CONSTANT_Methodref_info entry, a symbolic reference to the println() method of java.io.PrintStream. See Figure 8-10 for a graphical depiction of this symbolic reference. The virtual machine loads, links, and initializes java.io.PrintStream, and makes sure it has a println() method that is public, returns void, and takes a String argument. It marks the entry as resolved and puts a direct reference (an index into PrintStream's method table) into the data for the resolved constant pool entry. Lastly, the virtual machine replaces the invokevirtual opcode with invokevirtual_quick, and places the method table index and the number of arguments accepted by the method as operands to the invokevirtual_quick opcode.
Figure 8-10. The symbolic reference from Salutation to PrintStream.println().
When the virtual machine actually invokes the println() method, it will load, link, and initialize any types referenced symbolically from PrintStream's constant pool and println()'s code.
The next instruction is the last instruction of the main()
method:
return
. Because main()
was being executed by the only non-daemon
thread running in the Salutation
application, executing the
return
instruction will cause the virtual machine to exit. Note that constant pool entry
one, which contained a symbolic reference to the "Greetings, planet!"
string
literal, was never resolved during this execution of the Salutation
application.
Because choice
happened to be initialized with a value of two, the instruction that
referred to constant pool entry one, the ldc #1
instruction at offset 10, was never
executed. As a result, the virtual machine never created a String
object with the value
"Greetings, planet!"
.
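The resolve-on-first-use behavior traced in this walkthrough can be modeled in a few lines. The sketch below is only a toy: `LazyEntry` and `LazyResolutionDemo` are hypothetical names, and a real virtual machine keeps this state in its own internal data structures. It shows an entry being resolved once, reused afterward, and left unresolved when no executed instruction ever refers to it.

```java
import java.util.function.Supplier;

// Toy model of a constant pool entry: symbolic until first use,
// then cached as a "direct reference". Not how any real JVM stores it.
class LazyEntry<T> {
    private final Supplier<T> resolver; // stands in for the resolution work
    private T direct;                   // cached result of resolution
    private boolean resolved;
    private int resolutions;

    LazyEntry(Supplier<T> resolver) {
        this.resolver = resolver;
    }

    // Called each time an instruction refers to this entry.
    T use() {
        if (!resolved) {
            direct = resolver.get();
            resolved = true;
            resolutions++;
        }
        return direct;
    }

    boolean isResolved() { return resolved; }
    int resolutionCount() { return resolutions; }
}

class LazyResolutionDemo {
    public static void main(String[] args) {
        LazyEntry<String> greetings = new LazyEntry<>(() -> "Greetings, planet!");
        LazyEntry<String> salutations = new LazyEntry<>(() -> "Salutations, orb!");

        int choice = 2; // mirrors the value assumed in the walkthrough
        LazyEntry<String> picked = (choice == 1) ? greetings : salutations;

        System.out.println(picked.use());
        System.out.println(picked.use());                   // reuses the cached value
        System.out.println(salutations.resolutionCount());  // 1
        System.out.println(greetings.isResolved());         // false
    }
}
```

The entry that is never used stays symbolic for the whole run, just as constant pool entry one does in the Salutation example.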
Greet Application
As an example of an application that performs dynamic extension through user-defined class loaders, consider the following class:
// On CD-ROM in file linking/ex6/Greet.java
import com.artima.greeter.*;

public class Greet {

    // Arguments to this application:
    //     args[0] - path name of directory in which class files
    //               for greeters are stored
    //     args[1], args[2], ... - class names of greeters to load
    //               and invoke the greet() method on.
    //
    // All greeters must implement the com.artima.greeter.Greeter
    // interface.
    //
    static public void main(String[] args) {

        if (args.length <= 1) {
            System.out.println(
                "Enter base path and greeter class names as args.");
            return;
        }

        GreeterClassLoader gcl = new GreeterClassLoader(args[0]);

        for (int i = 1; i < args.length; ++i) {
            try {

                // Load the greeter specified on the command line
                Class c = gcl.loadClass(args[i]);

                // Instantiate it into a greeter object
                Object o = c.newInstance();

                // Cast the Object ref to the Greeter interface type
                // so greet() can be invoked on it
                Greeter greeter = (Greeter) o;

                // Greet the world in this greeter's special way
                greeter.greet();
            }
            catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
The Greet
application is a fancy incarnation of the typical "Hello, world!"
program. Greet
uses a user-defined class loader to dynamically extend itself with
classes--called "greeters"--that do the actual work of telling the world hello.
A greeter is any class that implements the com.artima.greeter.Greeter
interface:
// On CD-ROM in file linking/ex6/com/artima/greeter/Greeter.java
package com.artima.greeter;

public interface Greeter {

    void greet();
}
As you can see from the code above, the Greeter
interface declares only one
method: greet()
. When a greeter object's greet()
method is
invoked, the object should say hello to the world in its own unique way. Here are a few examples of
greeters:
// On CD-ROM in file linking/ex6/greeters/Hello.java
import com.artima.greeter.Greeter;

public class Hello implements Greeter {

    public void greet() {
        System.out.println("Hello, world!");
    }
}

// On CD-ROM in file linking/ex6/greeters/Greetings.java
import com.artima.greeter.Greeter;

public class Greetings implements Greeter {

    public void greet() {
        System.out.println("Greetings, planet!");
    }
}

// On CD-ROM in file linking/ex6/greeters/Salutations.java
import com.artima.greeter.Greeter;

public class Salutations implements Greeter {

    public void greet() {
        System.out.println("Salutations, orb!");
    }
}

// On CD-ROM in file linking/ex6/greeters/HowDoYouDo.java
import com.artima.greeter.Greeter;

public class HowDoYouDo implements Greeter {

    public void greet() {
        System.out.println("How do you do, globe!");
    }
}
Greeters can be more complex than the above four examples. Here's an example of a greeter that chooses a greeting based on the time of day:
// On CD-ROM in file linking/ex6/greeters/HiTime.java
import com.artima.greeter.Greeter;
import java.util.Date;

public class HiTime implements Greeter {

    public void greet() {

        // Date's no-arg constructor initializes itself to the
        // current date and time
        Date date = new Date();
        int hours = date.getHours();

        // Some hours: midnight, 0; noon, 12; 11PM, 23;
        if (hours >= 4 && hours <= 11) {
            System.out.println("Good morning, world!");
        }
        else if (hours >= 12 && hours <= 16) {
            System.out.println("Good afternoon, world!");
        }
        else if (hours >= 17 && hours <= 21) {
            System.out.println("Good evening, world!");
        }
        else {
            System.out.println("Good night, world!");
        }
    }
}
The Greet
application doesn't know at compile-time what greeter classes it will
load and where those classes will be stored. At run-time it takes a directory path as its first command-line
argument and greeter class names as subsequent arguments. It attempts to load the greeters using the path
name as a base directory.
For example, imagine you invoke the Greet
application with the following
command line:
java Greet greeters Hello
In this command line,
java
is the name of the Java virtual machine executable.
Greet
is the class name of the Greet
application.
greeters
is the name of a directory relative to the current directory in which the
Greet
application should look for greeters. Hello
is the name of
the greeter.
When the Greet
application is invoked with the above command line, it attempts
to load greeters/Hello.class
and invoke Hello
's
greet()
method. If the Hello.class
file is indeed sitting in a
directory named greeters
, the application will print:
Hello, world!
The Greet
application can handle more than one greeter. If you invoke it with the
following command line:
java Greet greeters Hello Greetings Salutations HowDoYouDo
the
Greet
application will load each of the four greeters listed and invoke their
greet()
methods, yielding the following output:
Hello, world!
Greetings, planet!
Salutations, orb!
How do you do, globe!
The Greet
application works by first checking to make sure there are at least two
command-line arguments: a directory path and at least one greeter class name. It then instantiates a new
GreeterClassLoader
object, which will be responsible for loading the greeters.
(The inner workings of class GreeterClassLoader
, a subclass of
java.lang.ClassLoader
, will be described later in this section.) The constructor
for GreeterClassLoader
accepts a String
that it uses as a
directory path in which to look for greeters.
After it has created the GreeterClassLoader
object, the
Greet
application invokes its loadClass()
method for each
greeter name that appears on the command line. When it invokes loadClass()
, it
passes the greeter class name, args[i]
, as the sole parameter:
// Load the greeter specified on the command line
Class c = gcl.loadClass(args[i]);
If the
loadClass()
method is unsuccessful, it throws an exception or error. If the
loadClass()
method is successful, it returns the Class
instance
for the newly loaded type.
Note that in addition to being loaded, the type requested of loadClass()
may
possibly be linked and initialized by the time loadClass()
returns. If the type had
been actively used prior to the loadClass()
invocation that requested the type, that
active use would have triggered its loading, linking, and initialization. Regardless, by the time the next
statement, which calls newInstance()
on a Class
reference, is
executed, the type will definitely have been initialized. If the type has not yet been initialized, calling
newInstance()
will trigger the initialization of the type (which must be a class),
because a class must be initialized before an object of that class is instantiated. So if the type hadn't been
initialized prior to the loadClass()
invocation, calling
newInstance()
will trigger the initialization.
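The difference between being loaded and being initialized can be observed with a static initializer. The following is a minimal sketch, not part of the book's CD-ROM examples: `InitOrderDemo` and its nested `Lazy` class are hypothetical names, and it uses `getDeclaredConstructor().newInstance()` because `Class.newInstance()` is deprecated in recent JDKs. It asks the class's own loader for the type, which returns a `Class` instance without running the static initializer, and then instantiates it.

```java
// Sketch: loadClass() hands back a Class without initializing it;
// creating the first instance is what triggers initialization.
class InitOrderDemo {
    static final StringBuilder LOG = new StringBuilder();

    static class Lazy {
        static {
            LOG.append("initialized;"); // runs when Lazy is initialized
        }
    }

    static String run() {
        LOG.setLength(0);
        try {
            ClassLoader loader = InitOrderDemo.class.getClassLoader();

            // A class literal and loadClass() do not initialize the type
            Class<?> c = loader.loadClass(Lazy.class.getName());
            LOG.append("loaded;");

            // Instantiation is an active use, so initialization happens here
            Object o = c.getDeclaredConstructor().newInstance();
            LOG.append("instantiated;");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call in a fresh virtual machine, the "initialized;" entry appears between "loaded;" and "instantiated;". On later calls it is absent, because a class is initialized only once.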
Once loadClass()
has returned a Class
instance, the
Greet
application's main()
method instantiates a new instance of
the greeter by calling newInstance()
on the Class
instance:
// Instantiate it into a greeter object
Object o = c.newInstance();
When the
newInstance()
method is invoked on a Class
object, the virtual machine creates and initializes a new instance of the class represented by the
Class
object. To initialize the new instance, the virtual machine invokes its no-arg
constructor. (Note that for this statement to work without throwing an exception, the newly loaded type
must be a class, not an interface, must be accessible, must not be abstract, and must contain a no-arg
constructor that is accessible.)
The Greet
application then casts the Object
reference that
points to the greeter object to type Greeter
:
// Cast the Object ref to the Greeter interface type
// so greet() can be invoked on it
Greeter greeter = (Greeter) o;
Finally, armed with a Greeter
reference, the main()
method
invokes the greet()
method on the greeter object:
// Greet the world in this greeter's special way
greeter.greet();
The Greet
application demonstrates the flexibility inherent in Java's linking model.
The Greet
application does not know at compile time what greeters it will be loading
and dynamically linking to at run-time. In the examples above, class Greet
invokes the
greet()
method in classes Hello
,
Greetings
, Salutations
, and
HowDoYouDo
. But if you look at Greet
's constant pool, there is
no symbolic reference to any of these classes. There is only a symbolic reference to their shared
superinterface, com.artima.greeter.Greeter
. Greeters themselves, so long as
they implement the com.artima.greeter.Greeter
interface, can be anything
and can be written and compiled anytime, even after the Greet
application itself is
compiled.
Prior to 1.2, the loadClass()
method of
java.lang.ClassLoader
was abstract. To create your own user-defined class
loader, you subclassed ClassLoader
and implemented
loadClass()
. In 1.2, a concrete implementation of
loadClass()
was included in ClassLoader
. This concrete
loadClass()
supports the parent-delegation model introduced in 1.2, and in general
makes it easier and less error prone to create a user-defined class loader. To create a user-defined class
loader in 1.2, you can subclass ClassLoader
and, rather than override
loadClass()
, you can override findClass()
-- a method
with a much simpler contract than loadClass()
. This approach to creating a user-
defined class loader will be described later in this chapter.
To give you some historical perspective of how class loaders changed between 1.1 and 1.2, consider this
implementation of GreeterClassLoader
, written for 1.1 and included in the first
edition of this book:
// On CD-ROM in file
// linking/ex6/COM/artima/greeter/GreeterClassLoader.java
package COM.artima.greeter;

import java.io.*;
import java.util.Hashtable;

public class GreeterClassLoader extends ClassLoader {

    // basePath gives the path to which this class
    // loader appends "/<typename>.class" to get the
    // full path name of the class file to load
    private String basePath;

    public GreeterClassLoader(String basePath) {

        this.basePath = basePath;
    }

    public synchronized Class loadClass(String className,
        boolean resolveIt) throws ClassNotFoundException {

        Class result;
        byte classData[];

        // Check the loaded class cache
        result = findLoadedClass(className);
        if (result != null) {
            // Return a cached class
            return result;
        }

        // Check with the primordial class loader
        try {
            result = super.findSystemClass(className);
            // Return a system class
            return result;
        }
        catch (ClassNotFoundException e) {
        }

        // Don't attempt to load a system file except through
        // the primordial class loader
        if (className.startsWith("java.")) {
            throw new ClassNotFoundException();
        }

        // Try to load it from the basePath directory.
        classData = getTypeFromBasePath(className);
        if (classData == null) {
            System.out.println("GCL - Can't load class: "
                + className);
            throw new ClassNotFoundException();
        }

        // Parse it
        result = defineClass(className, classData, 0,
            classData.length);
        if (result == null) {
            System.out.println("GCL - Class format error: "
                + className);
            throw new ClassFormatError();
        }

        if (resolveIt) {
            resolveClass(result);
        }

        // Return class from basePath directory
        return result;
    }

    private byte[] getTypeFromBasePath(String typeName) {

        FileInputStream fis;
        String fileName = basePath + File.separatorChar
            + typeName.replace('.', File.separatorChar)
            + ".class";

        try {
            fis = new FileInputStream(fileName);
        }
        catch (FileNotFoundException e) {
            return null;
        }

        BufferedInputStream bis = new BufferedInputStream(fis);

        ByteArrayOutputStream out = new ByteArrayOutputStream();

        try {
            int c = bis.read();
            while (c != -1) {
                out.write(c);
                c = bis.read();
            }
        }
        catch (IOException e) {
            return null;
        }

        return out.toByteArray();
    }
}
The 1.1 GreeterClassLoader
declares one instance variable,
basePath
. This variable, a String
, is used to store the directory
path (passed to GreeterClassLoader
's constructor) in which the
loadClass()
method should look for the class file of the type it has been requested
to load.
The loadClass()
method begins by checking to see if the requested type has
already been loaded by this class loader. It does this by invoking
findLoadedClass()
, an instance method in ClassLoader
,
passing in the fully qualified name of the requested type as a parameter. If this class loader has already been
marked as an initiating class loader of a type with the requested fully qualified name,
findLoadedClass()
will return the Class
instance
representing the type:
// Check the loaded class cache
result = findLoadedClass(className);
if (result != null) {
    // Return a cached class
    return result;
}
As mentioned earlier in this chapter, the virtual machine maintains a list of type names that have already
been requested of each class loader. These lists, which include all the types for which each class loader has
been marked as an initiating loader, represent the sets of unique names that currently populate each class
loader's namespace. When loading classes in Step 1a of the process of resolving
CONSTANT_Class_info
entries (described earlier in this chapter), the virtual
machine always checks its internal list before automatically invoking loadClass()
.
As a result, the virtual machine will never automatically invoke loadClass()
on a
user-defined class loader with the name of a type already loaded by that user-defined class loader.
Nevertheless, the GreeterClassLoader
invokes
findLoadedClass()
to check the requested class against the list of the names of the
types it has already loaded. Why? Because even though the virtual machine will never ask a user-defined
class loader to load the same type twice, the application just might.
As an example, imagine the Greet
application were invoked with this command
line:
java Greet greeters Hello Hello Hello Hello Hello
Given this command line, the
Greet
application would invoke
loadClass()
with the name Hello
five times on the same
GreeterClassLoader
object. The first time, the
GreeterClassLoader
would load the class. The next four times, however, the
GreeterClassLoader
would simply get the Class
instance
for Hello
by calling findLoadedClass()
and return that. It
would only load class Hello
once.
If the loadClass()
method determines that the requested type has not been
loaded into its name space, it next passes the name of the requested type to
findSystemClass()
:
// Check with the primordial class loader
try {
    result = super.findSystemClass(className);
    // Return a system class
    return result;
}
catch (ClassNotFoundException e) {
}
When the findSystemClass()
method is invoked in a 1.1 virtual machine, the
primordial class loader attempts to load the type. In 1.2, the system class loader attempts to load the type. If
the load is successful, findSystemClass()
returns the Class
instance representing the type, and loadClass()
returns that same
Class
instance.
If the primordial (in 1.1) or system (in 1.2) class loader is unable to load the type,
findSystemClass()
throws ClassNotFoundException
. In this
case, the loadClass()
method next checks to make sure the requested class is not
part of the java
package:
// Don't attempt to load a system file except through
// the primordial class loader
if (className.startsWith("java.")) {
    throw new ClassNotFoundException();
}
This check prevents members of the standard java
packages
(java.lang
, java.io
, etc.) from being loaded by anything but
the bootstrap class loader. As mentioned in Chapter 3, "Security," two types that declare themselves to be
part of the same named package are only granted access to each other's package-visible members if they
belong to the same runtime package (if they were loaded by the same class loader). But the notion of a
"runtime package" and its effect on accessibility was first introduced in the second edition of the Java virtual
machine specification. Thus, early versions of class loaders had to explicitly prevent user-defined class
loaders from attempting to load types that declare themselves to be part of the Java API (or any other
"restricted" packages) but that couldn't be loaded by the bootstrap class loader.
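In current JDKs, the `defineClass()` methods of `java.lang.ClassLoader` are themselves documented to throw `SecurityException` when the requested name begins with "java.", so a user-defined loader gets this protection even without an explicit check. The sketch below probes that behavior. `ProhibitedPackageDemo`, `ProbeLoader`, and the type name `java.lang.Impostor` are hypothetical names chosen for illustration, and the empty byte array is never parsed because the name is rejected first.

```java
// Sketch: defineClass() refuses names in the java.* packages
// when called on a user-defined class loader.
class ProhibitedPackageDemo {

    static class ProbeLoader extends ClassLoader {
        // Exposes the protected defineClass() for this experiment only
        Class<?> tryDefine(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    static boolean rejectsJavaPackage() {
        try {
            new ProbeLoader().tryDefine("java.lang.Impostor", new byte[0]);
            return false;
        } catch (SecurityException e) {
            // The name check fails before the bytes are examined
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsJavaPackage()); // true
    }
}
```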
If the type name doesn't begin with "java."
, the
loadClass()
method next invokes
getTypeFromBasePath()
, which attempts to import the binary data in the user-
defined class loader's custom way:
// Try to load it from the basePath directory.
classData = getTypeFromBasePath(className);
if (classData == null) {
    throw new ClassNotFoundException();
}
The getTypeFromBasePath()
method looks for a file with the type name plus
a ".class
" extension in the base directory passed to the
GreeterClassLoader
's constructor. If the
getTypeFromBasePath()
method is unable to find the file, it returns a
null
result and the loadClass()
method throws
ClassNotFoundException
. Otherwise, loadClass()
invokes defineClass()
, passing the byte
array returned by
getTypeFromBasePath()
:
// Parse it
result = defineClass(className, classData, 0,
    classData.length);
if (result == null) {
    System.out.println("GCL - Class format error: "
        + className);
    throw new ClassFormatError();
}
The defineClass()
method completes the loading process: it parses the binary
data into internal data structures and creates a Class
instance. The
defineClass()
method does not link and initialize the type. (As mentioned earlier in
this chapter, the defineClass()
method also makes sure all the type's supertypes
are loaded. It does this by invoking loadClass()
on this user-defined class loader for
each direct superclass and superinterface, and recursively applies the resolution process on all supertypes in
the hierarchy.)
If defineClass()
is successful, the loadClass()
method checks to see if resolveIt
was set to true
. If so, it
invokes resolveClass()
, passing the Class
instance returned
by defineClass()
. The resolveClass()
method links the
class. Finally, loadClass()
returns the newly created Class
instance:
if (resolveIt) {
    resolveClass(result);
}

// Return class from basePath directory
return result;
The class loader described in the previous section, which was originally designed for a 1.1 virtual
machine, will still work in 1.2. Although 1.2 added a concrete default implementation of
loadClass()
to java.lang.ClassLoader
, this concrete
method can still be overridden in subclasses. Because the contract of loadClass()
did not change from 1.1 to 1.2, legacy user-defined class loaders that override
loadClass()
should still work as expected in 1.2.
The basic contract of loadClass()
is this: Given the fully qualified name of the
type to find, the loadClass()
method should in some way attempt to locate or
produce an array of bytes, purportedly in the Java class file format, that define the type. If
loadClass()
is unable to locate or produce the bytes, it should throw
ClassNotFoundException
. Otherwise, loadClass()
should pass the array of bytes to one of the defineClass()
methods declared in
class ClassLoader
. By passing the byte array to
defineClass()
, loadClass()
asks the virtual machine to
import the type represented by the passed byte array into the namespace of this user-defined class loader.
When loadClass()
calls defineClass()
in 1.2, it can also
specify a protection domain with which the type data should be associated. When the
loadClass()
method of a class loader successfully loads a type, it returns a
java.lang.Class
object to represent the newly loaded type.
The concrete implementation of loadClass()
from class
java.lang.ClassLoader
fulfills the loadClass()
method's contract using these four basic steps:
1. See if the requested type has already been loaded into this class loader's namespace (via findLoadedClass()). If so, return the Class instance for that already-loaded type.
2. Otherwise, delegate to this class loader's parent loader. If the parent returns a Class instance, return that same Class instance.
3. Otherwise, invoke findClass(), which should attempt to locate or produce an array of bytes, purportedly in the Java class file format, that define the desired type. If successful, findClass() should pass those bytes to defineClass(), which will attempt to import the type and return a Class instance. If findClass() returns a Class instance, loadClass() returns that same Class instance.
4. Otherwise, findClass() completes abruptly with some exception, and loadClass() completes abruptly with the same exception.
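The steps above can be sketched as an explicit override of `loadClass()`. This is an approximation for illustration, not the JDK's actual source: `DelegatingLoader` is a hypothetical name, and it requires a non-null parent, whereas the real method falls back to the bootstrap loader when the parent is null.

```java
import java.util.Objects;

// Sketch of the parent-delegation steps; an approximation only.
class DelegatingLoader extends ClassLoader {

    DelegatingLoader(ClassLoader parent) {
        super(Objects.requireNonNull(parent, "parent"));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Step 1: already loaded into this loader's namespace?
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // Step 2: give the parent the first chance
                    c = getParent().loadClass(name);
                } catch (ClassNotFoundException e) {
                    // Step 3: load it in this loader's custom way.
                    // Step 4: any exception from findClass() propagates.
                    c = findClass(name);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

Because this sketch does not override `findClass()`, the inherited version throws `ClassNotFoundException` for any type the parent cannot supply. A subclass would override `findClass()` to provide its own source of class files.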
Although in 1.2 you can still subclass ClassLoader
and override the
loadClass()
method, the recommended approach to creating your own user-defined
class loader in 1.2 is to subclass ClassLoader
and implement the
findClass()
method. The findClass()
method looks like
this:
// A method declared in class java.lang.ClassLoader:
protected Class findClass(String name)
    throws ClassNotFoundException;
The basic contract of the findClass()
method is this:
findClass()
accepts the fully qualified name of a desired type as its only parameter.
findClass()
first attempts to locate or produce an array of bytes, purportedly in the
Java class file format, that define the type of the requested name. If findClass()
is
unable to locate or produce the array of bytes, it completes abruptly with
ClassNotFoundException
. Otherwise, findClass()
invokes defineClass()
, passing in the requested name, the array of bytes and,
optionally, a ProtectionDomain
object with which the type should be associated. If
defineClass()
returns a Class
instance for the type,
findClass()
simply returns that same Class
instance to its
caller. Otherwise, defineClass()
completes abruptly with some exception, and
findClass()
completes abruptly with the same exception.
Here's a version of GreeterClassLoader
that, rather than overriding
loadClass()
, merely overrides findClass()
:
// On CD-ROM in file
// linking/ex7/com/artima/greeter/GreeterClassLoader.java
package com.artima.greeter;

import java.io.*;

public class GreeterClassLoader extends ClassLoader {

    // basePath gives the path to which this class
    // loader appends "/<typename>.class" to get the
    // full path name of the class file to load
    private String basePath;

    public GreeterClassLoader(String basePath) {

        this.basePath = basePath;
    }

    public GreeterClassLoader(ClassLoader parent, String basePath) {

        super(parent);
        this.basePath = basePath;
    }

    protected Class findClass(String className)
        throws ClassNotFoundException {

        byte classData[];

        // Try to load it from the basePath directory.
        classData = getTypeFromBasePath(className);
        if (classData == null) {
            throw new ClassNotFoundException();
        }

        // Parse it
        return defineClass(className, classData, 0,
            classData.length);
    }

    private byte[] getTypeFromBasePath(String typeName) {

        FileInputStream fis;
        String fileName = basePath + File.separatorChar
            + typeName.replace('.', File.separatorChar)
            + ".class";

        try {
            fis = new FileInputStream(fileName);
        }
        catch (FileNotFoundException e) {
            return null;
        }

        BufferedInputStream bis = new BufferedInputStream(fis);

        ByteArrayOutputStream out = new ByteArrayOutputStream();

        try {
            int c = bis.read();
            while (c != -1) {
                out.write(c);
                c = bis.read();
            }
        }
        catch (IOException e) {
            return null;
        }

        return out.toByteArray();
    }
}
This version of GreeterClassLoader
appears in the
linking/ex7
directory of the CD-ROM. All of the source files in
linking/ex6
, which were described in detail in previous sections, appear unchanged
in linking/ex7
, except for GreeterClassLoader.java
.
Where the GreeterClassLoader
class, described in the previous section, that
overrides loadClass()
appears in linking/ex6
, the
GreeterClassLoader
, described in this section, that overrides
findClass()
appears in linking/ex7
.
This second version of GreeterClassLoader
declares one instance variable,
basePath
, which is a String
that is used to store the directory
path in which findClass()
should look for the class file of the type it has been
requested to load. The basePath
String
is the only parameter
passed to GreeterClassLoader
's 1-arg constructor. Because the 1-arg
constructor accepts no reference to a caller-specified parent class loader, this class loader can't invoke the
superclass constructor that takes a reference to a user-defined class loader. Thus, it simply invokes the
superclass's no-arg constructor by default, which sets this class loader's parent to be the system class loader.
The other constructor (the 2-arg constructor), however, accepts a reference to a user-defined class loader
instance as well as the basePath
String
. This constructor explicitly
invokes the superclass's 1-arg constructor, passing along the reference. The superclass sets this class loader's
parent to be the passed user-defined class loader instance.
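The effect of the two superclass constructors on a loader's parent can be checked with `getParent()`. In the small sketch below, `ParentDemo` and `PlainLoader` are hypothetical names, and the loader defines no types of its own. It only shows which parent each constructor installs.

```java
// Sketch: which parent each ClassLoader constructor installs.
class ParentDemo {

    static class PlainLoader extends ClassLoader {
        PlainLoader() {
            super(); // parent becomes the system class loader
        }

        PlainLoader(ClassLoader parent) {
            super(parent); // parent becomes the loader passed in
        }
    }

    public static void main(String[] args) {
        ClassLoader first = new PlainLoader();
        ClassLoader second = new PlainLoader(first);

        System.out.println(
            first.getParent() == ClassLoader.getSystemClassLoader()); // true
        System.out.println(second.getParent() == first);              // true
    }
}
```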
By comparing the implementation of findClass()
in this version of
GreeterClassLoader
with the implementation of
loadClass()
in the previous version of
GreeterClassLoader
, you can easily see how much simpler it is to write
findClass()
than loadClass()
. You have much less to
worry about when you write findClass()
, and fewer opportunities to make
mistakes. findClass()
merely invokes
getTypeFromBasePath()
to attempt in this user-defined class loader's custom way
to load the requested type. If getTypeFromBasePath()
is unable to locate the
requested type in the basePath
directory, it returns null
and
findClass()
throws a ClassNotFoundException
.
Otherwise, getTypeFromBasePath()
returns the array of bytes, which
findClass()
simply passes on to defineClass()
. If
defineClass()
returns a reference to a Class
instance to
represent the successfully loaded type, findClass()
returns that same reference.
Otherwise, defineClass()
completes abruptly with an exception, which causes
findClass()
to complete abruptly with the same exception.
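For comparison, here is how the same `findClass()` contract might be met with the `java.nio.file` API available in later JDKs. This is a sketch under assumptions rather than a replacement for the book's listing: the class name `NioGreeterClassLoader` is hypothetical, and it reports a missing file by wrapping the `IOException` instead of returning `null` from a helper method.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: a findClass()-only loader that reads class files with NIO.
class NioGreeterClassLoader extends ClassLoader {

    private final Path basePath;

    NioGreeterClassLoader(String basePath) {
        this.basePath = Paths.get(basePath);
    }

    @Override
    protected Class<?> findClass(String className)
            throws ClassNotFoundException {

        // Map a name such as "a.b.Hello" to <basePath>/a/b/Hello.class
        Path file = basePath.resolve(
            className.replace('.', '/') + ".class");

        byte[] classData;
        try {
            classData = Files.readAllBytes(file);
        } catch (IOException e) {
            // Missing or unreadable file: the type cannot be found
            throw new ClassNotFoundException(className, e);
        }

        // Any error from defineClass() propagates to the caller
        return defineClass(className, classData, 0, classData.length);
    }
}
```

As with the version above, checking for already-loaded types and delegating to the parent are left to the inherited `loadClass()`.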
The findClass()
method's contract is a subset of the
loadClass()
method's contract. findClass()
isolates the
only two parts of loadClass()
that should in general be customized by subclasses of
java.lang.ClassLoader
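: locating or producing the array of bytes that defines the type, and selecting the protection domain with which the type should be associated.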
When an implementation of findClass()
performs these two tasks, the result is
an array of bytes and a reference to a ProtectionDomain
object.
findClass()
passes both the byte array and the
ProtectionDomain
reference to defineClass()
.
forName()
As an example of a Java application that performs dynamic extension with
forName()
, consider the EasyGreet
class:
// On CD-ROM in file linking/ex7/EasyGreet.java
import com.artima.greeter.*;

public class EasyGreet {

    // Arguments to this application:
    //     args[0], args[1], ... - class names of greeters to load
    //               and invoke the greet() method on.
    //
    // All greeters must implement the com.artima.greeter.Greeter
    // interface.
    //
    static public void main(String[] args) {

        if (args.length == 0) {
            System.out.println(
                "Enter greeter class names as args.");
            return;
        }

        for (int i = 0; i < args.length; ++i) {
            try {

                // Load the greeter specified on the command line
                Class c = Class.forName(args[i]);

                // Instantiate it into a greeter object
                Object o = c.newInstance();

                // Cast the Object ref to the Greeter interface type
                // so greet() can be invoked on it
                Greeter greeter = (Greeter) o;

                // Greet the world in this greeter's special way
                greeter.greet();
            }
            catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
The EasyGreet
application is very similar to the Greet
application from the previous example. Like Greet
, EasyGreet
will attempt to dynamically load and execute greeters mentioned in command line arguments. But unlike
Greet
, EasyGreet
doesn't take as its first command line
argument a path name of the directory in which the class files for the greeters are stored. All of
EasyGreet
's command line arguments are greeter class names. Another difference is
that EasyGreet
, because it is going to use forName()
to
load greeters dynamically, doesn't instantiate a GreeterClassLoader
. Then, where
Greet
invoked loadClass()
on its
GreeterClassLoader
instance, EasyGreet
invokes
forName()
, a static method of class Class
.
EasyGreet
's forName()
invocation looks very similar to
Greet
's loadClass()
invocation. Like
loadClass()
, forName()
accepts the fully qualified name of
the requested type in a String
parameter. If successful in loading the type (or if the
type had been loaded previously), forName()
, like
loadClass()
, returns the Class
instance that represents the
type. If unsuccessful, forName()
, like loadClass()
, throws
ClassNotFoundException
. The big difference between the two approaches is that
whereas loadClass()
attempts to ensure the requested type is loaded into the user-
defined class loader's namespace, forName()
attempts to ensure the requested type is
loaded into the current namespace -- the namespace of the defining class loader for the type whose method
includes the forName()
invocation.
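One way to see which loader the one-argument `forName()` uses is to compare it with the three-argument form, which names the loader explicitly. In this sketch, `ForNameDemo` and `sameAsExplicitLoader()` are hypothetical names, and `no.such.Greeter` is a made-up type name used only to provoke the exception.

```java
// Sketch: Class.forName(name) behaves like the three-argument form
// called with the defining class loader of the calling class.
class ForNameDemo {

    static boolean sameAsExplicitLoader(String typeName)
            throws ClassNotFoundException {

        ClassLoader callerLoader = ForNameDemo.class.getClassLoader();

        Class<?> implicit = Class.forName(typeName);
        Class<?> explicit = Class.forName(typeName, true, callerLoader);

        // Both requests go through the same loader, so they
        // yield the very same Class instance
        return implicit == explicit;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sameAsExplicitLoader("java.util.ArrayList")); // true

        try {
            Class.forName("no.such.Greeter");
        } catch (ClassNotFoundException e) {
            System.out.println("Could not load the requested type");
        }
    }
}
```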
Because forName()
is invoked from the main()
method of
class EasyGreet
, the class loader that forName()
asks to load
the requested type is EasyGreet
's defining class loader. When run from Sun's Java 2
SDK version 1.2, the class loader that loads EasyGreet
is the system class loader,
which looks for classes on the class path. To use the class path environment variable, you can execute the
EasyGreet
application in the linking/ex7
directory of the
CD-ROM with a command like this:
java EasyGreet Hello
If you don't specify a class path either explicitly on the command line or in an environment variable, the
system loader will look in the current directory for requested types. Because the current directory (the
linking/ex7
directory from the CD-ROM) doesn't contain
Hello.class
, the system class loader is unable to locate
Hello.class
. The forName()
method, and in turn
EasyGreet
's main()
method, completes abruptly with a
ClassNotFoundException
:
java.lang.ClassNotFoundException: Hello
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:191)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:290)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:275)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:124)
    at EasyGreet.main(EasyGreet.java, Compiled Code)
To enable EasyGreet
to find Hello.class
merely
requires that the greeters
directory be included in a class path specified on the
command line with the "-cp
" option, as in:
java -cp .;greeters; EasyGreet Hello
When started with this command, the EasyGreet
program prints:
Hello, world!
Like the Greet
application, EasyGreet
will accept multiple
greeter names on the command line:
java -cp .;greeters; EasyGreet Hello Greetings Salutations HowDoYouDo
When invoked with this command, the
EasyGreet
application will load each of the
four greeters listed and invoke their greet()
methods, yielding this output:
Hello, world!
Greetings, planet!
Salutations, orb!
How do you do, globe!
The important difference that arises from Greet
's use of
loadClass()
on GreeterClassLoader
and
EasyGreet
's use of forName()
is the namespaces into which
the greeter classes get loaded. In Greet
, the greeter classes get loaded into the
GreeterClassLoader
's namespace. In EasyGreet
, the
greeter classes get loaded into the system class loader's namespace.
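The practical consequence of separate namespaces is that the same class file, defined by two different loaders, yields two distinct types. The sketch below is a minimal illustration with hypothetical names (`NamespaceDemo`, `Payload`, `IsolatingLoader`). It assumes the compiled class files are reachable as resources on the class path, and it deliberately defines one named type itself instead of delegating, which a production loader would rarely do.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: one class file, two loaders, two distinct Class instances.
class NamespaceDemo {

    // A trivial type that stands in for a greeter
    static class Payload { }

    // Defines the target type itself; delegates everything else
    static class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target) {
            super(NamespaceDemo.class.getClassLoader());
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                // Reuse the type if this loader already defined it
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    // Assumption: the class file is available as a class path resource
    static byte[] readClassBytes(String name) throws ClassNotFoundException {
        String resource = "/" + name.replace('.', '/') + ".class";
        try (InputStream in = NamespaceDemo.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    public static void main(String[] args) throws Exception {
        String name = Payload.class.getName();
        ClassLoader a = new IsolatingLoader(name);
        ClassLoader b = new IsolatingLoader(name);

        System.out.println(a.loadClass(name) == a.loadClass(name)); // true
        System.out.println(a.loadClass(name) == b.loadClass(name)); // false
    }
}
```

Requests made twice of the same loader return the same `Class` instance, while the two loaders each hold their own type of the same name.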
As an example of dynamically loaded types becoming unreachable and getting unloaded by the virtual machine, consider the following application:
// On CD-ROM in file linking/ex7/GreetAndForget.java
import com.artima.greeter.*;

public class GreetAndForget {

    // Arguments to this application:
    //     args[0] - path name of directory in which class files
    //               for greeters are stored
    //     args[1], args[2], ... - class names of greeters to load
    //               and invoke the greet() method on.
    //
    // All greeters must implement the com.artima.greeter.Greeter
    // interface.
    //
    static public void main(String[] args) {

        if (args.length <= 1) {
            System.out.println(
                "Enter base path and greeter class names as args.");
            return;
        }

        for (int i = 1; i < args.length; ++i) {
            try {

                GreeterClassLoader gcl =
                    new GreeterClassLoader(args[0]);

                // Load the greeter specified on the command line
                Class c = gcl.loadClass(args[i]);

                // Instantiate it into a greeter object
                Object o = c.newInstance();

                // Cast the Object ref to the Greeter interface type
                // so greet() can be invoked on it
                Greeter greeter = (Greeter) o;

                // Greet the world in this greeter's special way
                greeter.greet();

                // Forget the class loader object, Class
                // instance, and greeter object
                gcl = null;
                c = null;
                o = null;
                greeter = null;

                // At this point, the types loaded through the
                // GreeterClassLoader object created at the top of
                // this for loop are unreferenced and can be unloaded
                // by the virtual machine.
            }
            catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
The GreetAndForget
application accepts the same command line arguments as
the Greet
application of the previous example. The first argument is a base directory
path name where the GreetAndForget
application will look for greeters. Subsequent
arguments are greeter names. To understand this example you should be familiar with the
Greet
application presented earlier in this chapter.
Imagine you invoke the GreetAndForget
application with the following
command line:
java GreetAndForget greeters Surprise HiTime Surprise
The code for the
HiTime
greeter, which selects a different greeting based on the time
of day, is shown above in the previous section of this chapter. The code for the
Surprise
greeter, which pseudo-randomly selects one of four helper greeters--
Hello
, Greetings
, Salutations
, or
HowDoYouDo
--and invokes its greet()
method, is shown here:
// On CD-ROM in file linking/ex7/greeters/Surprise.java
import com.artima.greeter.Greeter;
public class Surprise implements Greeter {
    public void greet() {
        // Choose one of four greeters pseudo-randomly and
        // invoke its greet() method.
        int choice = (int) (Math.random() * 3.99);
        Greeter g;
        switch(choice) {
        case 0:
            g = new Hello();
            g.greet();
            break;
        case 1:
            g = new Greetings();
            g.greet();
            break;
        case 2:
            g = new Salutations();
            g.greet();
            break;
        case 3:
            g = new HowDoYouDo();
            g.greet();
            break;
        }
    }
}
Given the command line shown above, the GreetAndForget
application invokes
the greet()
method of the Surprise
greeter first, then the
HiTime
greeter, then the Surprise
greeter again.
GreetAndForget
's actual output would vary depending on the time of day and
Surprise
's pseudo-random mood. For the purposes of this example, assume that you
typed in the above command, hit return, and got the following output:
How do you do, globe!
Good afternoon, world!
Greetings, planet!
This output indicates
Surprise
chose to execute HowDoYouDo
's
greet()
method the first time around and Greetings
's
greet()
method the second time around.
The first pass through GreetAndForget
's for loop, the virtual machine loads the
Surprise
class and invokes its greet()
method. The constant
pool for Surprise
includes a symbolic reference to each of the four helper greeters
that it may choose: Hello
, Greetings
,
Salutations
, and HowDoYouDo
. Assuming the Java virtual
machine that you used to run the GreetAndForget
application uses late resolution,
only one of these four symbolic references will be resolved during the first pass of
GreetAndForget
's for loop: the symbolic reference to
HowDoYouDo
. The virtual machine resolves this symbolic reference when it executes
the bytecodes that correspond to the following statement in Surprise
's
greet()
method:
g = new HowDoYouDo();
To resolve the symbolic reference from Surprise
's constant pool to
HowDoYouDo
, the virtual machine invokes the
GreeterClassLoader
object's loadClass()
method,
passing the string "HowDoYouDo
" in the name
parameter. The
virtual machine uses the GreeterClassLoader
object to load
HowDoYouDo
because Surprise
was loaded through the
GreeterClassLoader
object. As mentioned earlier in this chapter, when the Java
virtual machine resolves a symbolic reference, it uses the same class loader that defined the referencing type
(in this case, Surprise
) to initiate loading the referenced type (in this case,
HowDoYouDo
).
Once Surprise
's greet()
method has created a new
HowDoYouDo
instance, it invokes its greet()
method:
g.greet();
As the virtual machine executes HowDoYouDo
's greet()
method, it must resolve two symbolic references from HowDoYouDo
's constant pool--
one to class java.lang.System
and another to class
java.io.PrintStream
. To resolve these symbolic references, the virtual machine
invokes the GreeterClassLoader
object's loadClass()
method, once with the name java.lang.System
and once with the name
java.io.PrintStream
. As before, the virtual machine uses the
GreeterClassLoader
object to load these classes because the referencing class--in
this case, HowDoYouDo
--was loaded through the
GreeterClassLoader
object. But these two classes, both members of the Java
API, will end up being loaded by the bootstrap class loader anyway, because
loadClass()
will first delegate to its parent.
Remember that before the loadClass()
method of
GreeterClassLoader
attempts to look for a requested type in the base directory
(in this case, directory greeters
), it invokes its parent, the system class loader. The
system class loader will first delegate to its parent, which will first delegate to its parent, and so on.
Eventually findSystemClass()
will be invoked to delegate to the bootstrap class
loader, the end-point of the parent-delegation chain. Because the bootstrap class loader (via
findSystemClass()
) is able to load both
java.lang.System
and java.io.PrintStream
, the
loadClass()
method will simply return the Class
instance
returned by findSystemClass()
. These classes will be marked not as having been
loaded by the GreeterClassLoader
object, but as having been loaded by the
bootstrap class loader. To resolve any references from java.lang.System
or
java.io.PrintStream
, the virtual machine will not invoke the
loadClass()
method of the GreeterClassLoader
object,
or even the system class loader. It will just use the bootstrap class loader directly.
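The delegation just described can be observed with a small sketch. It assumes a hypothetical `LoggingLoader` (not the chapter's `GreeterClassLoader`) that records each request and then relies on the parent-first delegation built into `java.lang.ClassLoader`. The names `LoggingLoader`, `DelegationDemo`, and `loadThrough` are illustrative only. On HotSpot-based virtual machines, `getClassLoader()` reports the bootstrap class loader as `null`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for GreeterClassLoader: it records every request,
// then lets ClassLoader's built-in parent-first delegation do the work.
class LoggingLoader extends ClassLoader {
    final List<String> requests = new ArrayList<>();

    LoggingLoader() {
        super(ClassLoader.getSystemClassLoader());
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        requests.add(name);                     // this loader is asked first
        return super.loadClass(name, resolve);  // its parents supply the type
    }
}

class DelegationDemo {
    // Wraps the checked exception so callers can stay simple.
    static Class<?> loadThrough(LoggingLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LoggingLoader loader = new LoggingLoader();
        Class<?> c = loadThrough(loader, "java.io.PrintStream");
        // The request went through the user-defined loader...
        System.out.println("requested via: " + loader.requests);
        // ...but the defining loader is the bootstrap loader, not LoggingLoader.
        System.out.println("defining loader: " + c.getClassLoader());
        System.out.println("same type as System.out's class: "
            + (c == System.out.getClass() || c == java.io.PrintStream.class));
    }
}
```

The user-defined loader initiates the load, yet the type it hands back is the one shared `PrintStream` defined further up the delegation chain.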
As a result, after Surprise
's greet()
method has returned,
there will be two types marked as having been loaded by the GreeterClassLoader
object: class Surprise
and class HowDoYouDo
. These two types
will be in the virtual machine's internal list of the types loaded by the
GreeterClassLoader
object.
Just after Surprise
's greet()
method returns, the
Class
instances for Surprise
and
HowDoYouDo
are reachable by the application. The garbage collector will not reclaim
the space occupied by these Class
instances, because there are ways for the
application's code to access and use them. See Figure 8-11 for a graphical depiction of the reachability of
these two Class
instances.
Figure 8-11. The reachability of the Class instances for Surprise and HowDoYouDo.
The Class
instance for Surprise
can be reached in three
ways. First, it can be reached directly from local variable c
of
GreetAndForget
's main()
method. Second, it can be reached
from local variables o
and greeter
, which both point to the same
Surprise
object. From the Surprise
object, the virtual
machine can get at Surprise
's type data, which includes a reference to
Surprise
's Class
object. The third way the
Class
instance for Surprise
can be reached is through the
gcl
local variable of GreetAndForget
's
main()
method. This local variable points to the
GreeterClassLoader
object, which includes a reference to a
Hashtable
object in which a reference to Surprise
's
Class
instance is stored.
The Class
instance for HowDoYouDo
can be reached in two
ways. One way is identical to one of the paths to the Class
instance for
Surprise
: the gcl
local variable of
GreetAndForget
's main()
method points to the
GreeterClassLoader
object, which includes a reference to a
Hashtable
object. The Hashtable
contains a reference to
HowDoYouDo
's Class
instance. The other way to reach
HowDoYouDo
's Class instance is through Surprise
's constant
pool.
When the virtual machine resolved the symbolic reference from Surprise
's
constant pool to HowDoYouDo
, it replaced the symbolic reference with a direct
reference. The direct reference points to HowDoYouDo
's type data, which includes a
reference to HowDoYouDo
's Class
instance.
Thus, starting from Surprise
's constant pool, the Class
instance for HowDoYouDo
is reachable. But why would the garbage collector look at
direct references emanating from Surprise
's constant pool in the first place? Because
Surprise
's Class
instance is reachable. When the garbage
collector finds that it can reach Surprise
's Class
instance, it
makes sure it marks the Class
instances for any types that are directly referenced from
Surprise
's constant pool as reachable. If Surprise
is still live,
the virtual machine can't unload any types Surprise
may need to use.
Note that of the three ways, described above, that Surprise
's
Class
instance can be reached, none of them involve a constant pool of another type.
Surprise
does not appear as a symbolic reference in the constant pool for
GreetAndForget
. Class GreetAndForget
did not know
about Surprise
at compile-time. Instead, the
GreetAndForget
application decided at run-time to load and link to class
Surprise
. Thus, the Class
instance for class
Surprise
is only reachable by starting from the local variables of
GreetAndForget
's main()
method. Unfortunately for
Surprise
(and ultimately for HowDoYouDo)
, this does not
constitute a very firm grasp on life.
The next four statements in GreetAndForget
's main()
method will change the reachability situation completely:
// Forget the user-defined class loader, Class
// instance, and greeter object
gcl = null;
c = null;
o = null;
greeter = null;
These statements null out all four starting places from which
Surprise
's
Class
instance is reachable. As a result, after these statements have been executed, the
Class
instance for Surprise
is no longer reachable. These
statements also render unreachable the Class
instance for
HowDoYouDo
, the Surprise
instance that was formerly pointed
to by the o
and greeter
variables, the
GreeterClassLoader
instance that was formerly pointed to by the
gcl
variable, and the Hashtable
instance that was pointed to by
the classes
variable of the GreeterClassLoader
object. All
five of these objects are now available for garbage collection.
When (and if) the garbage collector gets around to freeing the unreferenced Class
instances for Surprise
and HowDoYouDo
, it can also free up all
the associated type data in the method area for Surprise
and
HowDoYouDo
. Because these classes' Class
instances are
unreachable, the types themselves are unreachable and can be unloaded by the virtual machine.
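The effect of forgetting a class loader can be sketched with a weak reference. This example makes several assumptions that are not part of the chapter's code: `ForgettableLoader`, `ForgetDemo`, and `emptyClassBytes()` are hypothetical names, and the loader defines an empty class from generated bytes instead of reading a greeter from disk. Because `System.gc()` is only a request, the program reports what it observes and no particular result is guaranteed.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.ref.WeakReference;

// Hypothetical stand-in for GreeterClassLoader: it defines an empty class
// from generated bytes, so no greeters directory is needed.
class ForgettableLoader extends ClassLoader {
    Class<?> define(String name) {
        byte[] b = emptyClassBytes(name);
        return defineClass(name, b, 0, b.length);
    }

    // Builds class-file bytes for a public class with no members.
    static byte[] emptyClassBytes(String name) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(0xCAFEBABE);                 // magic
            out.writeShort(0);                        // minor_version
            out.writeShort(49);                       // major_version
            out.writeShort(5);                        // constant_pool_count
            out.writeByte(7); out.writeShort(2);      // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name.replace('.', '/'));
            out.writeByte(7); out.writeShort(4);      // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");
            out.writeShort(0x0021);                   // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                        // this_class
            out.writeShort(3);                        // super_class
            out.writeShort(0);                        // interfaces_count
            out.writeShort(0);                        // fields_count
            out.writeShort(0);                        // methods_count
            out.writeShort(0);                        // attributes_count
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buf.toByteArray();
    }
}

class ForgetDemo {
    // Loads a type through a fresh loader, then drops every strong reference,
    // keeping only a weak reference that does not prevent collection.
    static WeakReference<ClassLoader> loadAndForget(String name) {
        ForgettableLoader gcl = new ForgettableLoader();
        Class<?> c = gcl.define(name);
        WeakReference<ClassLoader> tracker =
            new WeakReference<>(c.getClassLoader());
        gcl = null;   // same idea as the nulling in GreetAndForget
        c = null;
        return tracker;
    }

    public static void main(String[] args) {
        WeakReference<ClassLoader> tracker = loadAndForget("Hello");
        System.gc();  // a request only; the collector may ignore it
        System.out.println(tracker.get() == null
            ? "loader collected"
            : "loader still reachable (not collected yet)");
    }
}
```

If the weak reference has been cleared, nothing can reach the loader any longer, which is the precondition for the virtual machine to unload the types that loader defined.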
Note that two iterations of the for loop later (given the command line shown above), the
GreetAndForget
application will again load class Surprise
.
Keep in mind that the virtual machine will not reuse the type data for Surprise
that
was loaded during the first pass of the for loop. Granted, that type data became available for unloading at
the end of the first pass. But even if the Class
instance for
Surprise
hadn't become unreferenced at the end of the first pass, the type data from
the first pass wouldn't be reused during the third pass.
With each pass of the for loop, the main()
method of
GreetAndForget
creates a new GreeterClassLoader
object. Thus, every greeter that GreetAndForget
loads is loaded through a different
user-defined class loader. For example, if you invoke the GreetAndForget
application with the Hello
greeter listed five times on the command line, the
application will create five instances of class GreeterClassLoader
. The
Hello
greeter will be loaded five times by five different user-defined class loaders. The
method area will contain five different copies of the type data for Hello
. The heap will
contain five Class
instances that represent the Hello
class--one
for each namespace into which Hello
is loaded. When one of the
Class
instances for Hello
becomes unreferenced, only the
Hello
type data associated with that particular Class
instance
would be available for unloading.
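A minimal sketch of this multiplicity, under stated assumptions: `OneClassLoader`, `NamespaceDemo`, and `emptyClassBytes()` are hypothetical names, and each loader defines an empty class called `Hello` from generated bytes instead of reading `Hello.class` from a greeters directory. It shows that the same fully qualified name loaded through separate loaders yields separate `Class` instances.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical loader that defines one empty class itself, so each
// instance of the loader gets its own copy of the type.
class OneClassLoader extends ClassLoader {
    Class<?> define(String name) {
        byte[] b = emptyClassBytes(name);
        return defineClass(name, b, 0, b.length);
    }

    // Builds class-file bytes for a public class with no members.
    static byte[] emptyClassBytes(String name) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(0xCAFEBABE);                 // magic
            out.writeShort(0);                        // minor_version
            out.writeShort(49);                       // major_version
            out.writeShort(5);                        // constant_pool_count
            out.writeByte(7); out.writeShort(2);      // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name.replace('.', '/'));
            out.writeByte(7); out.writeShort(4);      // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");
            out.writeShort(0x0021);                   // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                        // this_class
            out.writeShort(3);                        // super_class
            out.writeShort(0);                        // interfaces_count
            out.writeShort(0);                        // fields_count
            out.writeShort(0);                        // methods_count
            out.writeShort(0);                        // attributes_count
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buf.toByteArray();
    }
}

class NamespaceDemo {
    // Loads the same class name once per fresh loader, as GreetAndForget
    // does when a greeter is listed several times on the command line.
    static Class<?>[] loadInSeparateNamespaces(String name, int count) {
        Class<?>[] copies = new Class<?>[count];
        for (int i = 0; i < count; i++) {
            copies[i] = new OneClassLoader().define(name);
        }
        return copies;
    }

    public static void main(String[] args) {
        Class<?>[] copies = loadInSeparateNamespaces("Hello", 5);
        // Same fully qualified name in every namespace: prints true
        System.out.println(copies[0].getName().equals(copies[1].getName()));
        // Yet the Class instances are distinct types: prints false
        System.out.println(copies[0] == copies[1]);
    }
}
```

Each `Class` instance is tied to its own defining loader, so dropping one loader affects only the copy of the type data that belongs to it.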
In early implementations of the Java virtual machine, it was possible to confuse Java's type system. A
Java application could trick the Java virtual machine into using an object of one type as if it were an object
of a different type. This capability makes crackers happy, because they can potentially spoof trusted classes
to gain access to non-public data or change the behavior of methods by replacing them with new versions.
For example, if a cracker could write a class and successfully fool the Java virtual machine into thinking it
was class SecurityManager
, that cracker could potentially break out of the
sandbox. The example presented in this section is designed to help you understand the type safety problems
that can arise with delegating class loaders, and the loading constraints that appeared in the second edition of
the Java virtual machine specification to address the problem.
The type safety problem arises because the multiple namespaces inside a Java virtual machine can share types. If one class loader delegates to another class loader, and the delegated-to class loader defines the type, both class loaders are marked as initiating loaders for that type. The type defined by the delegated-to class loader is shared among all the namespaces of the initiating loaders of the type.
At compile time, a type is uniquely identifiable by its fully qualified name. For example, only one class
named Spoofed
can exist at compile time. At runtime, however, a fully qualified name
is not enough to uniquely identify a type that has been loaded into a Java virtual machine. Because a Java
application can have multiple class loaders, and each class loader maintains its own namespace, multiple
types with the same fully qualified name can be loaded into the same Java virtual machine. Thus, to uniquely
identify a type loaded into a Java virtual machine requires the fully qualified name and the
defining class loader.
The type safety problems made possible by this class loader architecture arose from the Java virtual
machine's initial reliance on the compile time notion of a type being uniquely identifiable by only its fully
qualified name. You can always load two types both named Spoofed
into the same
Java virtual machine. Each Spoofed
class would be defined by a different class loader.
But with a little finesse, you could fool an early implementation of the Java virtual machine into treating an
instance of one Spoofed
as if it were an instance of the other
Spoofed.
To address this problem, the second edition of the Java virtual machine specification introduced the notion of loading constraints. Loading constraints basically enable the Java virtual machine to enforce type safety based not just on fully qualified name, but also on the defining class loader, without forcing eager class loading. When the virtual machine detects a potential for type confusion during constant pool resolution, it adds a constraint to an internal list of constraints. All future resolutions must satisfy this new constraint, as well as all other constraints in the list.
For an example of the type confusion problem and its loading constraints solution, consider this implementation of a greeter, written by a devious cracker:
// On CD-ROM in file linking/ex8/greeters/Cracker.java
import com.artima.greeter.Greeter;
public class Cracker implements Greeter {
    public void greet() {
        Spoofed spoofed = new Spoofed();
        System.out.println("secret val = "
            + spoofed.giveMeFive());
        spoofed = Delegated.getSpoofed();
        System.out.println("secret val = "
            + spoofed.giveMeFive());
    }
}
Class Cracker
is a greeter, like Hello
or
Salutations
of the previous examples, because it implements the
com.artima.greeter.Greeter
interface. Class Cracker
is
sitting in the linking/ex8
directory of the CD-ROM, along with other, more well-
meaning, greeters.
All the classes from the linking/ex7
directory appear unchanged in
linking/ex8
, except for GreeterClassLoader
, which has
been slightly modified. (More on this modification later.) You can invoke Cracker
with the Greet
application just like any other greeter. From the
linking/ex8
directory, you can simply type:
java Greet greeters Cracker
The main()
method of Greet
will, as it did in the previous
examples, create a GreeterClassLoader
and invoke its
loadClass()
method, passing in the name Cracker
.
GreeterClassLoader
's loadClass()
method will look in
the greeters
directory and load Cracker.class. The main() method of Greet will then instantiate a new Cracker object and invoke greet() on it.
Cracker
's greet()
method starts by instantiating a new
Spoofed
. This is where the plot thickens.
It turns out that there are two implementations of a class named Spoofed
. The
class file for the "trusted" implementation is sitting in the linking/ex8
directory,
where it will be discovered by the system class loader:
// On CD-ROM in file linking/ex8/Spoofed.java
// Trusted version - when asked to give five, gives 5
public class Spoofed {
    private int secretValue = 42;
    public int giveMeFive() {
        return 5;
    }
    static {
        System.out.println(
            "linking/ex8/Spoofed initialized.");
    }
}
The trusted Spoofed
declares a private variable, named
secretValue
, that is initialized to 42. This private variable represents anything that
needs to be kept secret: a credit card number, a private key, an amount of e-cash, a reference to the current
Policy
object, and so on. Because the designers of this class didn't want the rest of the
world to have access to the secret value, they made the secretValue
variable
private. Only the methods of class Spoofed
can access
secretValue
. If you inspect the code to the trusted Spoofed
class, you'll see that the designers of Spoofed didn't provide any method that reveals
information about secretValue
. The only method in Spoofed
,
giveMeFive()
, returns the value 5.
But what if a maladjusted cracker were able to trick the virtual machine into believing that an instance of the trusted
Spoofed
was really an instance of this class, also named
Spoofed
, which was written by the cracker:
// On CD-ROM in file linking/ex8/greeters/Spoofed.java
// Malicious version - when asked to give five, this
// version of Spoofed reveals secret_value
public class Spoofed {
    private int secretValue = 100;
    public int giveMeFive() {
        return secretValue;
    }
    static {
        System.out.println(
            "linking/ex8/greeters/Spoofed initialized.");
    }
}
When this Spoofed
class's giveMeFive()
method is
invoked, it returns secretValue
, effectively rendering the value of the private
variable public knowledge.
So which version of Spoofed
gets used by the Cracker
greeter? Cracker
deviously attempts to use both. First,
Cracker
's greet()
method loads the malicious
Spoofed
and executes its giveMeFive()
method, just to get the feel of
it:
Spoofed spoofed = new Spoofed();
System.out.println("secret val = "
    + spoofed.giveMeFive());
The Java compiler translates the new Spoofed()
expression into a
new
bytecode instruction that gives the index of a
CONSTANT_Class_info
constant pool entry, which represents a symbolic reference
to Spoofed
. When the virtual machine resolves this reference, it will ask the defining
loader of Cracker
to load Spoofed. The defining loader of
Cracker
is this version of GreeterClassLoader
, which the
cracker has had the opportunity to modify:
// On CD-ROM in file
// linking/ex8/COM/artima/greeter/GreeterClassLoader.java
package com.artima.greeter;
import java.io.*;
import java.util.Hashtable;
public class GreeterClassLoader extends ClassLoader {
    // basePath gives the path to which this class
    // loader appends "/<typename>.class" to get the
    // full path name of the class file to load
    private String basePath;
    public GreeterClassLoader(String basePath) {
        this.basePath = basePath;
    }
    public synchronized Class loadClass(String className,
        boolean resolveIt) throws ClassNotFoundException {
        Class result;
        byte classData[];
        // Check the loaded class cache
        result = findLoadedClass(className);
        if (result != null) {
            // Return a cached class
            return result;
        }
        // If Spoofed, don't delegate
        if (className.compareTo("Spoofed") != 0) {
            // Check with the system class loader
            try {
                result = super.findSystemClass(className);
                // Return a system class
                return result;
            }
            catch (ClassNotFoundException e) {
            }
        }
        // Don't attempt to load a system file except through
        // the primordial class loader
        if (className.startsWith("java.")) {
            throw new ClassNotFoundException();
        }
        // Try to load it from the basePath directory.
        classData = getTypeFromBasePath(className);
        if (classData == null) {
            System.out.println("GCL - Can't load class: "
                + className);
            throw new ClassNotFoundException();
        }
        // Parse it
        result = defineClass(className, classData, 0,
            classData.length);
        if (result == null) {
            System.out.println("GCL - Class format error: "
                + className);
            throw new ClassFormatError();
        }
        if (resolveIt) {
            resolveClass(result);
        }
        // Return class from basePath directory
        return result;
    }
    private byte[] getTypeFromBasePath(String typeName) {
        FileInputStream fis;
        String fileName = basePath + File.separatorChar
            + typeName.replace('.', File.separatorChar)
            + ".class";
        try {
            fis = new FileInputStream(fileName);
        }
        catch (FileNotFoundException e) {
            return null;
        }
        BufferedInputStream bis = new BufferedInputStream(fis);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            int c = bis.read();
            while (c != -1) {
                out.write(c);
                c = bis.read();
            }
        }
        catch (IOException e) {
            return null;
        }
        return out.toByteArray();
    }
}
To create this user-defined class loader, the cracker took the
GreeterClassLoader
from the linking/ex6
directory of
the CD-ROM (the one that overrides loadClass()
), and added one if statement:
// If Spoofed, don't delegate
if (className.compareTo("Spoofed") != 0) {
    // Check with the system class loader
    try {
        result = super.findSystemClass(className);
        // Return a system class
        return result;
    }
    catch (ClassNotFoundException e) {
    }
}
If the type name passed to loadClass()
is "Spoofed"
, the
loadClass()
method doesn't first delegate to the system class loader before
attempting to load the class in its custom way, by looking in the basePath
directory.
As a result, when the virtual machine asks this class loader (Cracker
's defining class
loader) to load Spoofed
, its loadClass()
doesn't delegate. It
just looks in the basePath
directory for Spoofed.class
,
where it finds and loads the definition of the malicious Spoofed
. The application
prints:
linking/ex8/greeters/Spoofed initialized.
The next statement in Cracker
's greet()
method invokes
giveMeFive()
on the new Spoofed
instance and prints its
return value:
secret val = 100
Having exercised the giveMeFive()
method and feeling smug,
Cracker
's greet()
method invokes a static method in a class
named Delegated
, which returns a reference of type Spoofed
:
spoofed = Delegated.getSpoofed();
The Java compiler transforms the Delegated.getSpoofed()
expression in
the source code to an invokestatic
bytecode instruction that gives the index of a
CONSTANT_Methodref_info
entry in the constant pool. To execute this
instruction, the virtual machine must resolve the constant pool entry. As the first step in resolving this
symbolic reference to getSpoofed()
, the virtual machine resolves the
CONSTANT_Class_info
reference whose index is given in the
class_index
of the CONSTANT_Methodref_info
entry.
The CONSTANT_Class_info
entry is a symbolic reference to class
Delegated
.
To resolve Cracker's symbolic reference to Delegated, the virtual machine asks the defining class loader of Cracker to load Delegated. Once again the virtual machine invokes GreeterClassLoader's loadClass() method, this time passing in the name Delegated. However, because the requested name isn't "Spoofed", the loadClass() method goes ahead and delegates the load request to the system class loader. Because Delegated.class is sitting in the linking/ex8 directory, the system class loader is able to load the class. The system class loader is marked as the defining class loader for Delegated, and both the system class loader and the GreeterClassLoader are marked as initiating class loaders.
Once Delegated has been loaded, the virtual machine completes the resolution of the CONSTANT_Methodref_info and invokes the getSpoofed() method. Here's what Delegated's getSpoofed() method looks like:
// On CD-ROM in file linking/ex8/Delegated.java
public class Delegated {
    public static Spoofed getSpoofed() {
        return new Spoofed();
    }
}
In Java source code, this looks quite innocuous. The getSpoofed() method merely instantiates yet another Spoofed object and returns a reference to it. Inside the Java virtual machine, however, a serious challenge to Java's guarantee of type safety is looming.
When the Java compiler encounters the new Spoofed() expression in class Delegated, it generates a new bytecode that gives the index of a CONSTANT_Class_info that forms a symbolic reference to Spoofed. This is exactly what happened when the Java compiler encountered the new Spoofed() expression in class Cracker. When the Java virtual machine executes this new instruction, just as when it executed the new instruction in Cracker's greet() method, it starts by resolving the symbolic reference to Spoofed. The virtual machine asks Delegated's defining loader, which is the system class loader, to load Spoofed.
Although this is the same process that the virtual machine used to resolve Cracker's symbolic reference to Spoofed, the class loader to which the virtual machine makes its load request is different. Because Cracker's defining loader was GreeterClassLoader, the virtual machine asked GreeterClassLoader to load Spoofed. But because Delegated's defining loader was the system class loader, the virtual machine now asks the system class loader to load Spoofed.
Because the trusted version of Spoofed is sitting in the linking/ex8 directory of the CD-ROM, the system class loader is able to read in the bytes of Spoofed.class and pass them to defineClass(). What happens next depends on whether or not the application is running in a Java virtual machine that adheres to the loading constraints specified in the second edition of the Java virtual machine specification.
Assume for a moment that the application is running in an early Java virtual machine implementation that doesn't apply the loading constraints. In that case, defineClass() is able to define the type from the bytes read in from linking/ex8/Spoofed.class. The virtual machine creates a new instance of this trusted Spoofed type. Shortly thereafter, Delegated's getSpoofed() method returns a reference to the trusted Spoofed object to its caller, Cracker's greet() method. Cracker stores this reference in local variable spoofed, and proceeds to print out the value returned by invoking giveMeFive() on spoofed.
When Cracker.java was compiled, the Java compiler transformed this second giveMeFive() invocation into yet another invokevirtual instruction that references a CONSTANT_Methodref_info entry in the constant pool, the symbolic reference to giveMeFive() in Spoofed. When the virtual machine goes to resolve this symbolic reference, however, it discovers it has already been resolved. The CONSTANT_Methodref_info entry specified by the second giveMeFive() invocation is the same as that specified by the first one, which was resolved to the malicious Spoofed's implementation of giveMeFive(). The virtual machine invokes the malicious Spoofed's giveMeFive() method on the trusted Spoofed object, and the application prints:
secret val = 42
Although this kind of type confusion attack was possible in many implementations of the Java virtual machine prior to version 1.2, it usually couldn't be exploited in practice, because it requires the assistance of the class loader. In this example, the cracker added an if statement to GreeterClassLoader's loadClass() method that causes it to treat Spoofed specially. Were the cracker to attempt to instigate this kind of type confusion attack via an untrusted applet, he or she would run into trouble. Untrusted applets are not allowed to create class loaders. Thus, provided the designers of the class loaders in the application that loads applets into browsers did their jobs correctly, the cracker would have no way to exploit this (former) weakness in Java's type safety guarantee.
In Java virtual machine implementations that check the loading constraints that are now part of the Java virtual machine specification, the type confusion is not possible at all. All virtual machines must now keep an internal list of loading constraints that must be met as types are loaded. For example, when such a virtual machine resolves the CONSTANT_Methodref_info entry in Cracker's constant pool that forms a symbolic reference to the getSpoofed() method of class Delegated, the virtual machine records a loading constraint. Because Delegated was defined by a different class loader than Cracker, and Delegated's getSpoofed() method returns a reference to a Spoofed, the virtual machine records the following constraint:
The type named Spoofed for which the system class loader (Delegated's defining class loader) is marked as an initiating loader must be the same type named Spoofed for which GreeterClassLoader (Cracker's defining class loader) is marked as an initiating class loader.
This constraint is checked later, when the virtual machine attempts to resolve the CONSTANT_Class_info entry in Delegated's constant pool that forms a symbolic reference to class Spoofed. At that time, the virtual machine discovers that the constraint is violated. The type named Spoofed that is being loaded by the system class loader is not the same type named Spoofed that was loaded by GreeterClassLoader. As a result, the Java virtual machine throws a LinkageError:
Exception in thread "main" java.lang.LinkageError: Class Spoofed violates loader constraints
    at java.lang.ClassLoader.defineClass0(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:422)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:10)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:248)
    at java.net.URLClassLoader.access$1(URLClassLoader.java:216)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:191)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:290)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:275)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at Delegated.getSpoofed(Delegated.java, Compiled Code)
    at Cracker.greet(Cracker.java:13)
    at Greet.main(Greet.java, Compiled Code)
Java's guarantee of type safety is a cornerstone of its security model. Type safety means that programs are allowed to manipulate the memory occupied by an object's instance variables on the heap only in ways that are defined by that object's class. Likewise, type safety means that programs are allowed to manipulate the memory occupied by a class's static variables in the method area only in ways that are defined by that class. If the virtual machine can become confused about types, as demonstrated in this example, malicious code can potentially look at or change non-public variables. In addition, if malicious code could use a method defined in one version of a type to set an int instance variable, then use a method in another version of that type to interpret and return the value of the int as an array, the malicious code would in effect transform an int to an array reference. With this forged pointer, the malicious code could wreak all kinds of havoc. Thus, it is important that Java's type safety guarantee be iron-clad. The loading constraints ensure that, even in the presence of multiple namespaces, Java's type safety will be enforced at runtime.
On the CD-ROM
The CD-ROM contains the source code examples from this chapter in the linking directory.
The Resources Page
For more information about the material presented in this chapter, visit the resources page:
http://www.artima.com/insidejvm/resources/