Measuring and analyzing memory consumption in Java and other OOP languages is not an easy task, because all the objects form a big, highly connected graph. Counting only the flat, also called "shallow", size of the objects is not always helpful:
This figure shows a typical memory histogram. In general such a table does not help very much to figure out which objects really cause the high memory usage. Typically String and char[] will show up among the classes with the highest shallow size, simply because they are used so frequently.
There is probably a correlation between the high number of char[] instances and the high number of String instances (indicated by the yellow arrows), but we cannot really know that from the information in the table. We also don't know whether the Order objects could be the root cause for all those other objects.
So how can we find out which objects are really responsible for the memory usage?
Let's take a step back and ask ourselves how Garbage Collection works in a language like Java (and also in other "modern" languages ;)).
Garbage Collection
The general idea is that the Garbage Collector will reclaim all objects that cannot be reached from a so-called "GC root object" (a small sketch follows the list below).
GC root objects are:
- Local variables on the stack
- Parameters of functions on the stack
- JNI native references
- All classes loaded by the bootstrap class loader
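To make "reachable from a GC root" concrete, here is a minimal sketch (the class and variable names are made up for illustration): an object becomes eligible for collection as soon as no root, be it a stack frame, a loaded class with its static fields, or a JNI reference, can still reach it.

```java
// A minimal sketch, with made-up names, of what "reachable from a GC root" means.
public class GcRootsExample {

    // Reachable through the static field of the loaded class
    // (the class acts as a root as long as its class loader is alive).
    static byte[] classVariable = new byte[1024 * 1024];

    public static void main(String[] args) {
        // Reachable through a local variable on the stack frame of main().
        byte[] localVariable = new byte[1024 * 1024];

        byte[] temporary = new byte[1024 * 1024];
        temporary = null;   // this 1 MB array is now unreachable from every GC root
                            // and may be reclaimed by the next collection

        System.gc();        // only a hint; the JVM decides when it actually collects

        // localVariable and classVariable still keep their arrays alive here.
        System.out.println(localVariable.length + classVariable.length);
    }
}
```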
Now, coming back to our original problem, we can ask ourselves an interesting question:
"How much memory would be freed, if we would remove all those "Order" objects and afterwards a full GC would run?"This would give us a measure for how much memory would be freed if those objects would not be there.
We call this measure the "retained size". The set of objects that would be freed is called the "retained set".
With the "retained set" we can figure out how many Strings would be freed if those "Order" objects would not be there. We can therefore solve the problem to find the objects "responsible" for the memory usage.
Definition "Retained size/set"Retained set of X is the set of objects which would be collected by the GC
Retained size of X is the sum of shallow sizes of all objects in the retained set of X, i.e. memory kept alive by X.
Generally speaking, shallow size of an object is its "flat" size in the heap whereas the retained size of the same object is the amount of heap memory that will be freed when the object is garbage collected.
The retained set for a leading set of objects, such as all objects of a particular class, all objects of all classes loaded by a particular class loader, or simply a bunch of arbitrary objects, is the set of objects that are released if all objects of that leading set become inaccessible. The retained set includes these objects as well as all other objects only accessible through them. The retained size is the total heap size of all objects contained in the retained set.
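To make the difference between shallow and retained size concrete, here is a small, hypothetical example (the Order and Customer classes are made up for illustration, they are not taken from any real code): an object that is also reachable through another path is not part of the retained set, because it would survive the collection anyway.

```java
import java.util.ArrayList;
import java.util.List;

class Customer {
    String name;                                  // shared with the registry below
    Customer(String name) { this.name = name; }
}

class Order {
    Customer customer;                            // shared -> NOT in the order's retained set
    List<String> lineItems = new ArrayList<>();   // referenced only by this order -> retained by it
    Order(Customer customer) { this.customer = customer; }
}

public class RetainedSizeExample {
    static List<Customer> customerRegistry = new ArrayList<>();

    public static void main(String[] args) {
        Customer alice = new Customer("Alice");
        customerRegistry.add(alice);              // second path to the Customer, independent of any Order

        Order order = new Order(alice);
        order.lineItems.add("item-1");
        order.lineItems.add("item-2");

        // Shallow size of 'order': just its own two reference fields plus the object header.
        // Retained set of 'order': the Order itself, the ArrayList, its backing array and
        // the two item strings -- but NOT 'alice', because the registry still reaches her.
        // Dropping 'order' would free exactly its retained set.
        order = null;
    }
}
```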
How to find the object with the biggest retained size?
The Eclipse Memory Analyzer was built from the beginning with support for computing the retained set/size of objects.
The retained size was pioneered, as far as I know, by the commercial Yourkit profiler (great product!). When we started to develop the Eclipse Memory Analyzer (SAP Memory Analyzer at that point in time), Yourkit already had this very nice feature to find the single object with the biggest retained size. We still don't know how Yourkit does it, but for sure at that point in time they used a different approach than we do now. The reason is that it took Yourkit several minutes to get the first 10 (I think) biggest (in terms of retained size) objects in the heap. We knew that this was still much faster than the naive method, which would have to simulate a Garbage Collector run for each object in the heap. With millions of objects on the heap this would have taken years!
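Just to show why the naive method is hopeless, here is a rough sketch of it (the HeapObject graph model below is an assumption made for illustration, not MAT's API): one full reachability pass from the GC roots per candidate object.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a heap dump: every object has a shallow size and outgoing references.
class HeapObject {
    long shallowSize;
    List<HeapObject> references = new ArrayList<>();
}

public class NaiveRetainedSize {

    /**
     * Retained size of 'candidate': rerun reachability from the GC roots while
     * pretending the candidate is gone; everything no longer reached would be freed.
     * (Assumes every object in allObjects was reachable to begin with.)
     */
    static long retainedSize(List<HeapObject> gcRoots, List<HeapObject> allObjects, HeapObject candidate) {
        Set<HeapObject> reachable = new HashSet<>();
        Deque<HeapObject> stack = new ArrayDeque<>();
        for (HeapObject root : gcRoots) {
            if (root != candidate) stack.push(root);
        }
        while (!stack.isEmpty()) {                 // plain depth-first search from the GC roots
            HeapObject current = stack.pop();
            if (current == candidate || !reachable.add(current)) continue;
            for (HeapObject ref : current.references) stack.push(ref);
        }
        long retained = 0;
        for (HeapObject o : allObjects) {
            if (!reachable.contains(o)) retained += o.shallowSize;
        }
        return retained;                           // one full graph traversal per candidate object
    }
    // Doing this for each of n objects costs O(n * (n + e)) -- hopeless for millions of objects.
}
```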
So I went on to research what kind of graph algorithms could help us speed things up. After a lot of Google searches I finally found the dominator tree.
The dominator tree
The dominator tree is a data structure that allows you to answer the question about the biggest objects in almost no time. After further research I found that it could be built in (almost) O(n) time by a complicated algorithm invented by Lengauer and Tarjan. Since the plan of the project's architect was always to store precomputed information on disk, we could compute the dominator tree once and then store it on disk. Therefore opening a heap dump for the first time is a little bit slow (but still often faster than with any other tool), whereas opening it again is blazing fast (just a few seconds).
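As a rough sketch of why this works (the arrays below are an assumption for illustration, not MAT's internal format): in the dominator tree the retained set of an object is exactly its subtree, so a single bottom-up pass over the tree yields every retained size at once, and sorting that result immediately gives the biggest objects.

```java
public class DominatorTreeRetainedSize {

    /**
     * @param dominator   dominator[i] = index of the immediate dominator of object i,
     *                    or -1 for the artificial root; indices are assumed to be in
     *                    topological order, i.e. dominator[i] < i (parents before children)
     * @param shallowSize shallow size of each object
     * @return retained size of each object
     */
    static long[] retainedSizes(int[] dominator, long[] shallowSize) {
        int n = shallowSize.length;
        long[] retained = shallowSize.clone();         // every object retains at least itself
        for (int i = n - 1; i >= 0; i--) {             // children are processed before their parents
            if (dominator[i] >= 0) {
                retained[dominator[i]] += retained[i]; // roll the subtree total up to the dominator
            }
        }
        return retained;                               // sort descending to list the "biggest objects"
    }
}
```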
Still, implementing the algorithm was a high risk, because the (at that point in time) relatively new paper "Finding Dominators in Practice" only talked about graphs with at most a few thousand nodes, whereas we wanted to run it on a few million nodes on a 32-bit machine.
Still, the team got it done (I was only ever in a "consultant" role), and it turned out that the dominator tree not only allowed us to speed up the process of finding the biggest objects, but also significantly simplified the process of finding the root causes of high memory usage.
More on this in the next post.
Stay tuned!