So, can we use Java without creating garbage? Is it possible, for example, in natural Java, to solve this problem:
Create 10 million financial instrument objects, store them in a map, retrieve them and perform a calculation using each object without creating any garbage at all.
It is if you use Chronicle! Chronicle provides libraries that make it easy to use off-heap storage, in the form of memory-mapped files, for your objects. (For full source code for this article see here.)
Let's look at implementing a solution to the above problem.
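To make the idea of off-heap storage in a memory-mapped file a bit more concrete, here is a minimal sketch that uses only the JDK's `MappedByteBuffer`. It is not Chronicle's API and leaves out everything a real map does (hashing, variable-sized values, concurrency). The class name `OffHeapInstruments` and the record layout (an 8-byte price followed by a 4-byte quantity) are assumptions made up for this illustration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical fixed-width record store backed by a memory-mapped file. */
class OffHeapInstruments {
    // Assumed record layout: 8-byte price followed by 4-byte quantity.
    static final int RECORD_SIZE = Double.BYTES + Integer.BYTES;

    private final MappedByteBuffer buffer;

    OffHeapInstruments(Path file, int capacity) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping in READ_WRITE mode grows the file to the requested size.
            // The mapping remains valid after the channel is closed.
            buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, (long) capacity * RECORD_SIZE);
        }
    }

    /** Writes primitives straight into the mapped region; no objects are allocated. */
    void put(int index, double price, int quantity) {
        int offset = index * RECORD_SIZE;
        buffer.putDouble(offset, price);
        buffer.putInt(offset + Double.BYTES, quantity);
    }

    double price(int index) {
        return buffer.getDouble(index * RECORD_SIZE);
    }

    int quantity(int index) {
        return buffer.getInt(index * RECORD_SIZE + Double.BYTES);
    }

    public static void main(String[] args) throws IOException {
        int count = 1_000_000; // kept small for the sketch
        Path file = Files.createTempFile("instruments", ".dat");
        OffHeapInstruments store = new OffHeapInstruments(file, count);
        for (int i = 0; i < count; i++) {
            store.put(i, 100.0 + i % 10, i % 100);
        }
        long totalQuantity = 0;
        for (int i = 0; i < count; i++) {
            totalQuantity += store.quantity(i);
        }
        System.out.println("totalQuantity = " + totalQuantity);
    }
}
```

Because the records live in the mapped region rather than as Java objects, the loops above only read and write primitives, so the data itself never becomes work for the garbage collector.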
First let's have a look at how you might do this in normal Java so that we make sure we understand the problem and what happens if we use the standard Java libraries for our implementation.
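As a rough sketch of what the plain-Java approach might look like (not necessarily the exact program that was run for this article), assume a hypothetical `Instrument` class holding a price and a quantity, stored in a `ConcurrentHashMap`. The class name `HeapInstruments`, its fields and the default count are all illustrative assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class HeapInstruments {
    /** Hypothetical stand-in for a financial instrument. */
    static final class Instrument {
        final double price;
        final int quantity;

        Instrument(double price, int quantity) {
            this.price = price;
            this.quantity = quantity;
        }
    }

    static Map<Integer, Instrument> populate(int count) {
        Map<Integer, Instrument> map = new ConcurrentHashMap<>();
        for (int i = 0; i < count; i++) {
            // Each put allocates an Instrument, a map node and (outside the
            // small-integer cache) a boxed Integer key, all on the heap.
            map.put(i, new Instrument(100.0 + i % 10, i % 100));
        }
        return map;
    }

    static long totalQuantity(Map<Integer, Instrument> map, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            // The lookup key is boxed again here, which may allocate.
            total += map.get(i).quantity;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed default; pass 10000000 (with a large enough heap) for the full problem.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;

        System.out.println("*** Entering critical section ***");
        long start = System.currentTimeMillis();
        Map<Integer, Instrument> map = populate(count);
        long afterPut = System.currentTimeMillis();
        long total = totalQuantity(map, count);
        long afterGet = System.currentTimeMillis();
        System.out.println("*** Exiting critical section ***");

        System.out.println("Time for putting " + (afterPut - start));
        System.out.println("Time for getting " + (afterGet - afterPut));
        System.out.println("totalQuantity " + total);
    }
}
```

Running something like this with GC logging switched on (for example `-verbose:gc`) is how you would see collections reported inside the critical section.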
This is the output from the program:
*** Entering critical section ***
[GC (Allocation Failure) 98816K->92120K(125952K), 0.0317021 secs]
[Full GC (Ergonomics) 92120K->91917K(216576K), 0.2510530 secs]
[GC (Allocation Failure) 125197K->125430K(224256K), 0.0449051 secs]
[GC (Allocation Failure) 166390K->166686K(244224K), 0.0504341 secs]
[Full GC (Ergonomics) 166686K->165777K(387072K), 0.6243385 secs]
[GC (Allocation Failure) 226705K->226513K(388096K), 0.0785121 secs]
[GC (Allocation Failure) 293073K->293497K(392704K), 0.0825828 secs]
[Full GC (Ergonomics) 293497K->292649K(591872K), 1.2479519 secs]
[GC (Allocation Failure) 359209K->359433K(689664K), 0.0666344 secs]
[GC (Allocation Failure) 449033K->449417K(695296K), 0.1759746 secs]
[GC (Allocation Failure) 539017K->539385K(747008K), 0.1907760 secs]
[GC (Allocation Failure) 632569K->633009K(786944K), 0.2293778 secs]
[Full GC (Ergonomics) 633009K->631584K(1085952K), 2.1328028 secs]
[GC (Allocation Failure) 724768K->723368K(1146368K), 0.3092297 secs]
[GC (Allocation Failure) 827816K->825088K(1174016K), 0.3156138 secs]
[GC (Allocation Failure) 929536K->929952K(1207296K), 0.3891754 secs]
[GC (Allocation Failure) 1008800K->1009560K(1273856K), 0.4149915 secs]
[Full GC (Ergonomics) 1009560K->1007636K(1650688K), 3.4521240 secs]
[GC (Allocation Failure) 1086484K->1087425K(1671680K), 0.3884906 secs]
[GC (Allocation Failure) 1195969K->1196129K(1694208K), 0.2905121 secs]
[GC (Allocation Failure) 1304673K->1305257K(1739776K), 0.4291658 secs]
[GC (Allocation Failure) 1432745K->1433137K(1766912K), 0.4470582 secs]
[GC (Allocation Failure) 1560625K->1561697K(1832960K), 0.6003558 secs]
[Full GC (Ergonomics) 1561697K->1558537K(2343936K), 4.9359721 secs]
[GC (Allocation Failure) 1728009K->1730019K(2343936K), 0.7616385 secs]
[GC (Allocation Failure) 1899491K->1901139K(2413056K), 0.5187234 secs]
[Full GC (Ergonomics) 1901139K->1897477K(3119616K), 5.7177263 secs]
[GC (Allocation Failure) 2113029K->2114505K(3119616K), 0.6768888 secs]
[GC (Allocation Failure) 2330057K->2331441K(3171840K), 0.4812436 secs]
[Full GC (Ergonomics) 2331441K->2328578K(3530240K), 6.3054896 secs]
[GC (Allocation Failure) 2600962K->2488834K(3528704K), 0.1580837 secs]
*** Exiting critical section ***
Time for putting 32088
Time for getting 454
[GC (System.gc()) 2537859K->2488834K(3547136K), 0.1599314 secs]
[Full GC (System.gc()) 2488834K->2488485K(3547136K), 6.2759293 secs]
[GC (System.gc()) 2488485K->2488485K(3559936K), 0.0060901 secs]
[Full GC (System.gc()) 2488485K->2488485K(3559936K), 6.0975322 secs]
Memory(heap) used 2.6 GB
The two main points that jump out of this output are, first, the number and expense of the garbage collections (clearly this could be tuned) and, second, the amount of heap used: 2.6GB. In short, there's no getting away from it: this program produces masses of garbage.
Let's try exactly the same thing, this time using ChronicleMap.
This is the code to solve the problem:
This is the output from the program:
[GC (Allocation Failure) 33280K->6595K(125952K), 0.0072065 secs]
[GC (Allocation Failure) 39875K->12177K(125952K), 0.0106678 secs]
[GC (Allocation Failure) 45457K->15289K(125952K), 0.0068434 secs]
[GC (Allocation Failure) 48569K->18357K(159232K), 0.0098287 secs]
[GC (Allocation Failure) 84917K->21008K(159232K), 0.0156393 secs]
*** Entering critical section ***
*** Exiting critical section ***
Time for putting 8554
Time for getting 4351
[GC (System.gc()) 36921K->21516K(230400K), 0.0331916 secs]
[Full GC (System.gc()) 21516K->15209K(230400K), 0.0630483 secs]
[GC (System.gc()) 15209K->15209K(230912K), 0.0006491 secs]
[Full GC (System.gc()) 15209K->15209K(230912K), 0.0234045 secs]
Memory(heap) used 18.2 MB
The main point here is obviously that there were no GCs in the critical section and that the whole program only used 18MB of heap. We have managed to create a program that ordinarily would have produced gigabytes of garbage without producing any garbage at all.
A note on timings
ChronicleMap is clearly not a drop-in replacement for ConcurrentHashMap. They have very different uses, and it is beyond the scope of this post to go much further into that line of discussion. The main differences in functionality are that ChronicleMap is persisted and can be shared amongst many JVMs. (ChronicleMap can also be replicated over TCP.) Nevertheless, it is interesting to quickly compare timings, if only to make sure we are in the same ballpark. ChronicleMap was faster for putting: 8.5s compared to 32s. But most of the time in ConcurrentHashMap was spent in GC, and that might be tuned away to some extent. ConcurrentHashMap was faster for getting: 0.5s compared to 4.3s. On other runs, though, I've seen ConcurrentHashMap take over 7s because of a GC that occurred in that section. Even though ChronicleMap is doing significantly more work, the lack of garbage produced actually makes the timings comparable with ConcurrentHashMap.
Restarting the program
Where ChronicleMap really comes into its own is on a restart. Let's say your program goes down and you need to redo the same computation we did earlier. In the case of ConcurrentHashMap, we would have to repopulate the map in exactly the same way we did earlier. With ChronicleMap, since the map is persistent, it is just a matter of pointing the map at the existing file and rerunning the calculation to produce the totalQuantity.
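The restart idea can be sketched with nothing but the JDK. This is not ChronicleMap's API, just an illustration of the principle that data in a memory-mapped file outlives the process. The class name `RestartDemo`, the file name `instruments.dat` and the layout (one 4-byte quantity per instrument) are assumptions for this example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

class RestartDemo {
    /** Maps the file read/write, creating and sizing it if necessary. */
    static MappedByteBuffer map(Path file, int count) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, (long) count * Integer.BYTES);
        }
    }

    /** First run: write one quantity per instrument into the file. */
    static void populate(Path file, int count) throws IOException {
        MappedByteBuffer buffer = map(file, count);
        for (int i = 0; i < count; i++) {
            buffer.putInt(i * Integer.BYTES, i % 100);
        }
        buffer.force(); // flush the mapped pages to disk
    }

    /** Any later run: map the existing file and recompute, with no repopulation. */
    static long totalQuantity(Path file, int count) throws IOException {
        MappedByteBuffer buffer = map(file, count);
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += buffer.getInt(i * Integer.BYTES);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        int count = 1_000_000; // kept small for the sketch
        Path file = Paths.get(System.getProperty("java.io.tmpdir"), "instruments.dat");
        if (Files.notExists(file)) {
            populate(file, count);
            System.out.println("First run: populated " + file);
        }
        System.out.println("totalQuantity = " + totalQuantity(file, count));
    }
}
```

Run it twice: the second run skips the populate step entirely and goes straight to the calculation, which is the behaviour described above.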
Summary
Metric | ConcurrentHashMap | ChronicleMap |
---|---|---|
gc pauses | Many | None |
update time | 32s | 8s |
reads allowing gc | 7s | 4s |
reads no gc | 0.5s | 4s |
heap size | 2.6GB | 18MB |
persistence | No | Yes |
fast restart | No | Yes |