At first look this seems like a clear definition and a relatively boring topic to read about. So why did I decide to write about it? Because despite the understandable definition, it is not always straightforward (for the tool developers) to calculate the shallow size, or (for a user) to understand how the size was calculated. The reasons? – different JVM vendors, different pointer sizes (32 / 64 bit), different dump formats, insufficient data in some heap dumps, etc … These factors can lead to small differences in the shallow sizes displayed for objects of the same type, and thus to questions.
Is it really important to know the precise size? Not necessarily. If you got a heap dump from an OutOfMemoryError in your production system, and MAT helps you to easily find the leak suspect there – let’s say it is some 500MB big object – then the shallow size of every individual object accumulated in the suspect’s size doesn’t really matter. The suspect is clear and you can go on and try to fix the problem.
On the other hand, if you are trying to understand the impact of adding some fields to your “base” classes, then the size of the individual instance can be of interest.
In the rest of the post I will have a look at the information available (or missing) in the different snapshot formats, explain what MAT displays as shallow size in the different cases, and try to answer some of the questions related to shallow size which we usually get. If you are interested, read further.
As I mentioned already, the various snapshot formats contain different pieces of information about the objects. I will look at each of them separately, and additionally differentiate between object instances and classes. For more information on the different heap dump formats see part one of the blog series.
Instance Size in HPROF Heap Dumps
The heap dumps in HPROF binary format do not provide the correct size of each instance. What they provide is the number of bytes used to store the necessary data in the heap dump, but not the number of bytes the VM really needs to store the instance on the heap. Therefore, in MAT we attempt to model how the VM would store the instance and how much memory it would need.
The approach we use to calculate the sizes is the following:
Instance Shallow Size = [object header] + space for fields of super class N + [some bytes because of alignment] + … + space for own fields + [some bytes because of alignment]
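To make the formula above more concrete, here is a small sketch of the computation. The header size and alignment constants are my own assumptions (roughly matching a 64-bit HotSpot JVM with compressed references), not values MAT guarantees; real values vary by vendor and configuration.

```java
// Hypothetical sketch of the shallow size formula above. HEADER and ALIGNMENT
// are assumptions for a 64-bit JVM with compressed references.
public class ShallowSizeSketch {
    static final long HEADER = 12;   // assumed object header size in bytes
    static final long ALIGNMENT = 8; // assumed object alignment in bytes

    // Round a size up to the next multiple of the alignment.
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        // Example instance with one int field (4 bytes) and one compressed
        // reference field (4 bytes): header + fields, rounded up to alignment.
        long fields = 4 + 4;
        System.out.println(align(HEADER + fields)); // prints 24
    }
}
```

Note that under these assumptions adding one more small field would not necessarily grow the shallow size at all, because the extra byte(s) may simply fit into the alignment padding.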
The sizes originally provided in the HPROF file contain neither the object header nor the additional space the VM uses to keep the object addresses aligned in a certain way. These are precisely the parameters we have to guess on our own. Does this always work? No. Unfortunately not. In Bugzilla entry 231296 you can find some discussions on the topic, and also what the current state is. Here is just a short summary:
Instance Size in IBM System Dumps (read with DTFJ)
DTFJ already provides the correct instance size for the objects, and in MAT we don’t have to do any guessing – the instance sizes are correct. What needs to be mentioned is that two instances of the same class in the same heap dump may have different shallow sizes. Until recently the Memory Analyzer was not prepared to handle such a case. More information on when exactly such a difference can appear, and some discussions on the necessary changes, can be found in Bugzilla entry 301228.
Instance Size in PHD Dumps
The sizes provided in PHD (Portable Heap Dump, a format produced by IBM JVMs) files are also correct, and MAT just displays them without any further computations.
Class Size in HPROF
The HPROF format does not provide information about the memory needed for a class - for bytecode, for jitted code, etc... For every class the Memory Analyzer will show as shallow size the sum of the shallow sizes of all static fields of the class.
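As an illustration of this summing, here is a hedged sketch; the per-field sizes below are my own assumptions for a JVM with 4-byte (compressed) references, not values taken from MAT.

```java
// Hypothetical illustration of the HPROF class shallow size: the sum of the
// space taken by the static fields. The field sizes are assumptions for a
// JVM with compressed (4-byte) references.
public class StaticFieldsSketch {
    public static void main(String[] args) {
        int sizeOfInt = 4;      // e.g. a static int field
        int sizeOfBoolean = 1;  // e.g. a static boolean field
        int sizeOfRef = 4;      // e.g. a static String reference field
        System.out.println(sizeOfInt + sizeOfBoolean + sizeOfRef); // prints 9
    }
}
```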
Class Size in IBM System Dumps
DTFJ provides more information about class sizes. The shallow size for classes reported by MAT includes the size of all methods (bytecode and jitted code sections) and also the on-heap size of the java.lang.Class object.
Class Size in PHD Dumps
The PHD dumps do not contain information about the method sizes. The shallow size for classes in PHD dumps is just the size of the java.lang.Class object.
Shallow Size of a Set of Objects
The "Shallow Size" column appears in many views where objects have been aggregated in groups based on different criteria. The shallow size of a set of objects is just the sum of the shallow sizes of the individual objects in the set. There are two things to mention here, which have raised questions in the past:
My personal view is that if one is using the Memory Analyzer to find the root cause of an OutOfMemoryError, then the shallow sizes of the individual objects are not that important.
If for a given purpose one needs to understand the sizes of objects in detail, then it is important to remember that they depend on the concrete JVM and heap dump type, and that in some cases the displayed sizes are not provided by the VM but are calculated by MAT. I hope the short overview given in this blog helps in better understanding these details.
What Comes Next?
In the next post I plan to write again about size, but a different one - the retained (or keep alive) size of objects and object sets.