Yup, still hype (Legacy Design Forum)

This page contains an archived post to the Design Forum (formerly called the Flexible Java Forum) made prior to February 25, 2002. If you wish to participate in discussions, please visit the new Artima Forums.

Message:

Yup, still hype

Posted by Bill Venners on 01 Jun 1998, 12:02 PM

> "The Hotspot Java virtual machine promises to bring Java program performance on par with that of natively compiled C++. In fact ... could eventually push Java performance past that of statically compiled C++."
> I take it this refers to some crippled model of C++ where pointer arithmetic is not permitted, objects are assumed to be dynamically allocated even where the programmer can easily avoid it, array checking is forced etc.
> Java syntax simply does not have the capability to instruct the processor as precisely as C++.
> Also, I see nothing about improving the incredibly slow startup times of Java programs. Even quite small programs take several seconds to start from a hard disk. It is true I have only dealt with applets - maybe applications are faster?

I do agree with you that Hotspot is being hyped at the moment,
because we only know what Sun is telling us about it. We haven't
actually seen it. But the techniques described do
sound promising.

In fact, when I look at the techniques they describe, it makes
sense to me that Java programs could run "as fast as C++," and
may even be able to run "faster than C++," as Sun is claiming.

I think that superstitions about performance abound, and where
actual performance bottlenecks lie is often very non-intuitive.
My article explains adaptive optimization, the "big idea" of
Hotspot, which shows how the performance critical sections of
a Java program can be transformed into native code as heavily
optimized as statically compiled C++. That is the main thing
that tells me that Java could indeed match C++ performance.

But, you say, what about all the safety checks built into Java
that don't exist in C++? Once again, the performance impact of
all these Java features isn't necessarily as large as it
may seem on the surface. To take two examples from your comment,
array bounds checking needn't always be performed in a Java
program, because in some cases the optimizer can determine that
the bounds will never be exceeded, and remove the checks. In
addition, only those array-bounds checks that are in the
time-critical portion of the code matter to performance.
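
As a rough illustration of that idea, here is a hand-written C++
sketch of what "removing the checks" amounts to. It is not how
Hotspot does it; the function names sum_checked and sum_hoisted
are made up for this example. The first version tests the index
on every access, the second proves the whole range valid once,
before the loop, and then indexes without a per-element test.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checked version: every access goes through at(), which compares the
// index against the size and throws std::out_of_range on failure.
long sum_checked(const std::vector<int>& v, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v.at(i);
    }
    return total;
}

// Hoisted version: a single test before the loop shows that every index
// in [0, n) is valid, so the per-iteration test can be dropped.
long sum_hoisted(const std::vector<int>& v, std::size_t n) {
    if (n > v.size()) {
        throw std::out_of_range("n exceeds vector size");
    }
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v[i];  // safe: i < n <= v.size()
    }
    return total;
}
```

An optimizer that can establish the same fact about a Java loop
is free to make the same transformation automatically.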

Another biggie is that Java makes all objects sit on the heap,
where C++ gives the programmer a choice between heap and stack.
The C++ lore is that making an object a local variable sitting
on the stack is much faster than allocating it on the heap.
Wouldn't that make Java programs by definition slower than C++
programs? Well, it isn't so straightforward a comparison, because
it all depends on memory allocation and garbage collection
algorithms and their associated costs.

For example, when a C++ function is called, the stack pointer
can simply be incremented past all required local variable data,
including objects. So with one add operation, you've allocated
memory for the objects. When the function exits, the destructor
must be called on any objects sitting there, and then the stack
pointer can be decremented back to its original value, thereby
"freeing" the memory occupied by those local objects.
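
A small sketch of that lifecycle, assuming a hypothetical Tracer
type invented for this example. It only records when objects are
constructed and destroyed, so the order of events at function
entry and exit becomes visible.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: logs its own construction and destruction.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;

    Tracer(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("construct " + name);
    }
    ~Tracer() { log.push_back("destroy " + name); }
};

// The two Tracer objects themselves live in this function's stack frame;
// no new expression is needed to create them.
void use_locals(std::vector<std::string>& log) {
    Tracer a(log, "a");
    Tracer b(log, "b");
    log.push_back("body");
}   // destructors run here, in reverse order of construction
```

Calling use_locals() with an empty log leaves five entries in
it: both constructions, the body, then the destruction of b
followed by the destruction of a.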

When a C++ object is allocated on the heap, a fancier (and more
time-consuming) operation must take place. A basic malloc() (or
new) could simply search through a linked list of open slots of
memory until it finds one big enough, mark that slot as taken,
update the linked list, and return the pointer to the allocated
memory. This algorithm would yield a fragmented heap over time
and would make dynamic allocation expensive.
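
The linked-list search described above can be sketched as a toy
first-fit allocator. This is an illustration only, not the code
of any real malloc(): the class FirstFitArena and the constant
kNoSpace are assumptions of this sketch, and it hands out offsets
into an imaginary arena rather than real pointers.

```cpp
#include <cstddef>
#include <list>

// Sentinel returned when no free slot is large enough (assumed name).
constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

class FirstFitArena {
public:
    explicit FirstFitArena(std::size_t capacity) {
        free_list_.push_back({0, capacity});
    }

    // Walk the free list and carve the request out of the first slot that
    // is big enough. The cost grows with the length of the list.
    std::size_t allocate(std::size_t size) {
        for (auto it = free_list_.begin(); it != free_list_.end(); ++it) {
            if (it->size >= size) {
                std::size_t offset = it->offset;
                it->offset += size;
                it->size -= size;
                if (it->size == 0) free_list_.erase(it);
                return offset;
            }
        }
        return kNoSpace;
    }

    // Put a slot back on the list. Neighbouring free slots are deliberately
    // not merged, which is what lets fragmentation build up.
    void release(std::size_t offset, std::size_t size) {
        free_list_.push_back({offset, size});
    }

    std::size_t free_slots() const { return free_list_.size(); }

private:
    struct Slot { std::size_t offset; std::size_t size; };
    std::list<Slot> free_list_;
};
```

With a 32-byte arena split into four 8-byte allocations, freeing
the first and third leaves 16 bytes free in total, yet a request
for 16 bytes fails because no single slot is that large.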

There are, however, other approaches malloc() could take. It
could slice up the heap into bins, each of which contains
some number of like-sized chunks of memory. So there might
be a bin with 1000 8-byte chunks, a bin with 1000 16-byte chunks,
and so on. This malloc() would just return a pointer to one of
the smallest chunks in which the allocated object would fit. So
if your program allocated an 11-byte object, malloc() would
return a pointer to a 16-byte chunk and 5 of the bytes would be
unused. This technique trades off some memory for speedier
allocation time.
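
The rounding step can be sketched as follows. The list of chunk
sizes and the names kChunkSizes, chunk_size_for and wasted_bytes
are assumptions made for this example; a real allocator chooses
its own size classes and also keeps a list of free chunks per
bin.

```cpp
#include <array>
#include <cstddef>

// Assumed size classes for this sketch.
constexpr std::array<std::size_t, 5> kChunkSizes = {8, 16, 32, 64, 128};

// Return the smallest chunk size that can hold `request` bytes,
// or 0 if the request is larger than the biggest bin.
std::size_t chunk_size_for(std::size_t request) {
    for (std::size_t size : kChunkSizes) {
        if (request <= size) return size;
    }
    return 0;
}

// Bytes left unused inside the chosen chunk.
std::size_t wasted_bytes(std::size_t request) {
    std::size_t chunk = chunk_size_for(request);
    return chunk == 0 ? 0 : chunk - request;
}
```

For the 11-byte object in the example above, chunk_size_for(11)
gives 16 and wasted_bytes(11) gives 5.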

So the actual difference in performance cost between dynamic
allocation and stack allocation in C++ depends on what malloc()
or new is actually doing. Likewise, the cost of object allocation
in Java depends on what is going on when you allocate a new
object.

One heap model, which I describe in my book and which is used
by Microsoft's VM, is called stop and copy. In this model, the
heap is divided into two halves. To allocate an object, the VM
merely increments a pointer up the heap to make enough space
for the object. This is basically a similar process to making
space for an object on the stack in C++, though incrementing
a stack pointer may be faster than incrementing a heap pointer.
Regardless, it is likely faster than allocating an object on
the heap in C++.
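
A minimal sketch of that pointer-incrementing allocation, written
in C++ for comparison with the malloc() discussion. The class
name BumpSpace is made up for this example, and alignment is
ignored to keep it short; it is not taken from any actual VM.

```cpp
#include <cstddef>
#include <vector>

// One half-heap with a bump pointer. Allocation is a size test plus an add.
class BumpSpace {
public:
    explicit BumpSpace(std::size_t capacity) : storage_(capacity), top_(0) {}

    // Returns a pointer to `size` fresh bytes, or nullptr when the space is
    // full, which is the point at which a collection would be triggered.
    void* allocate(std::size_t size) {
        if (size > storage_.size() - top_) return nullptr;
        void* result = storage_.data() + top_;
        top_ += size;
        return result;
    }

    std::size_t used() const { return top_; }

private:
    std::vector<unsigned char> storage_;
    std::size_t top_;
};
```

There is no list to search and no slot to mark, which is why
this style of allocation is cheap compared with the linked-list
malloc() described earlier.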

When a half-heap fills up with objects, execution is stopped,
and the garbage collector identifies and copies all live objects
to the other half-heap. In this way, the heap is never
fragmented.
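
The copying step can be sketched with a toy model in which
objects refer to each other by index rather than by pointer. The
names Object, HalfHeap, kNull and copy_live are assumptions of
this sketch, and a real collector works on raw memory, but the
shape is the same: start from the roots, copy each reachable
object once, and rewrite references to the new locations.

```cpp
#include <cstddef>
#include <vector>

// kNull stands for "no reference" and "not copied yet" (assumed name).
constexpr std::size_t kNull = static_cast<std::size_t>(-1);

// Toy object: a payload plus references stored as half-heap indices.
struct Object {
    int payload;
    std::vector<std::size_t> refs;
};

using HalfHeap = std::vector<Object>;

// Copy everything reachable from `roots` out of `from` into a fresh
// half-heap, rewriting roots and references to the new indices.
HalfHeap copy_live(const HalfHeap& from, std::vector<std::size_t>& roots) {
    HalfHeap to;
    std::vector<std::size_t> forward(from.size(), kNull);  // old -> new index

    // Copy one object unless it was copied already; return its new index.
    auto evacuate = [&](std::size_t old_index) -> std::size_t {
        if (old_index == kNull) return kNull;
        if (forward[old_index] == kNull) {
            forward[old_index] = to.size();
            to.push_back(from[old_index]);  // its refs still hold old indices
        }
        return forward[old_index];
    };

    for (std::size_t& root : roots) root = evacuate(root);

    // Scan copied objects in order; `to` grows as new objects are reached.
    for (std::size_t scan = 0; scan < to.size(); ++scan) {
        for (std::size_t i = 0; i < to[scan].refs.size(); ++i) {
            std::size_t moved = evacuate(to[scan].refs[i]);
            to[scan].refs[i] = moved;
        }
    }
    return to;
}
```

Live objects end up packed together at the start of the new
half-heap, and anything unreachable is simply never copied.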

So in this picture, allocating an object on a Java heap looks
not much more expensive than allocating an object on a C++
stack. But one must also ask the question: is calling a
C++ destructor less expensive than the garbage collection
process that a Java program must use? The answer to that
one is a resounding yes.

Garbage collection will almost certainly be more expensive than
plain old destruction, because it has to do so much more
work to figure out what to free. But once again, that doesn't
necessarily mean that all Java programs will run slower than
C++ programs. Many programs have times when they aren't doing
anything, for example they may be waiting on something to
come across the network or for the user to click a button. In
such slots GC could be run without negatively impacting the
"performance" of the program. In addition, many Java programs
may execute completely without ever needing a garbage collection.
In other words, they may run and never run out of memory.

So that's a long way of saying that it is hard to predict
performance by guessing at the impact of language features
or VM features. The best way to do it is by measuring actual
Java programs running on actual JVMs. So once we get our hands
on Hotspot and can try it out, then we'll see the extent to
which Sun delivers on its claims of bringing Java performance
on par with C++.
