Java: ChronicleMap, Part 2: Super RAM Maps
The standard Java Maps, such as the ubiquitous HashMap, are ultimately limited by the available RAM. Read this article to learn how you can create Java Maps of virtually unlimited size, even exceeding the target machine's RAM.
The built-in Map implementations, such as HashMap and ConcurrentHashMap, work fine as long as they are relatively small. In all cases, they are limited by the available heap and therefore, ultimately, by the available RAM. ChronicleMap can store its contents in files, thereby circumventing this limitation and opening the door to terabyte-sized mappings, as shown in this second article in a series about ChronicleMap.
Read more about the fundamentals of ChronicleMap in my first article.
File Mapping
A file is mapped by invoking the createPersistedTo() method on a ChronicleMap builder, as shown in the method below:
private static Map<Long, Point> createFileMapped() {
    try {
        return ChronicleMap
            .of(Long.class, Point.class)
            .averageValueSize(8)
            .valueMarshaller(PointSerializer.getInstance())
            .entries(10_000_000)
            .createPersistedTo(new File("my-map"));
    } catch (IOException ioe) {
        throw new RuntimeException(ioe);
    }
}
This will create a Map that lays out its content in a memory-mapped file named "my-map" rather than in direct memory. The following example shows how we can create 10 million Point objects and store them all in a file-mapped map:
final Map<Long, Point> m3 = LongStream.range(0, 10_000_000)
    .boxed()
    .collect(
        toMap(
            Function.identity(),
            FillMaps::pointFrom,
            (u, v) -> {
                throw new IllegalStateException();
            },
            FillMaps::createFileMapped
        )
    );
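The four-argument Collectors.toMap overload used above takes a key mapper, a value mapper, a merge function, and a map factory. The following is a minimal, standard-library-only sketch of that pattern. The Point class and the pointFrom method here are hypothetical stand-ins for the ones from the first article, and a HashMap factory is swapped in for the ChronicleMap factory so that the example runs without any extra dependency.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

public class ToMapSupplierSketch {

    // Hypothetical stand-in for the Point class from the first article.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Hypothetical stand-in for FillMaps::pointFrom.
    static Point pointFrom(long id) {
        return new Point((int) id, (int) id * 2);
    }

    // Collects the ids 0..count-1 into whatever Map the factory provides.
    static Map<Long, Point> fill(long count, Supplier<Map<Long, Point>> mapFactory) {
        return LongStream.range(0, count)
            .boxed()
            .collect(Collectors.toMap(
                Function.identity(),              // key: the id itself
                ToMapSupplierSketch::pointFrom,   // value: a Point derived from the id
                (u, v) -> {                       // keys are unique, so a merge is a bug
                    throw new IllegalStateException("duplicate key");
                },
                mapFactory));                     // target map, e.g. a file-mapped one
    }

    public static void main(String[] args) {
        // HashMap is used here only to keep the sketch dependency-free.
        Map<Long, Point> map = fill(1_000, HashMap::new);
        System.out.println(map.size());
    }
}
```

Because the target map is supplied by a factory, swapping an on-heap map for a file-mapped one only changes the last argument.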
The following command shows the newly created file:
Pers-MacBook-Pro:target pemi$ ls -lart my-map
-rw-r--r-- 1 pemi staff 330305536 Jul 10 16:56 my-map
As can be seen, the file is about 330 MB (330,305,536 bytes), and thus, each of the 10 million entries occupies about 33 bytes on average.
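The per-entry figure follows directly from the file listing: 330,305,536 bytes divided by 10 million entries. A small sketch of that arithmetic is shown below. The 16-byte payload is an assumption (an 8-byte long key plus the 8 bytes passed to averageValueSize), so the remainder is only a rough indication of the bookkeeping space per entry, not an official figure.

```java
import java.util.Locale;

public class EntrySizeEstimate {

    // Average number of file bytes used per map entry.
    static double bytesPerEntry(long fileSizeBytes, long entries) {
        return (double) fileSizeBytes / entries;
    }

    public static void main(String[] args) {
        long fileSizeBytes = 330_305_536L; // size reported by ls for "my-map"
        long entries = 10_000_000L;        // value passed to entries(...)

        // Assumption: 8-byte key plus 8-byte average value.
        long assumedPayloadBytes = 8 + 8;

        double perEntry = bytesPerEntry(fileSizeBytes, entries);
        System.out.println(String.format(Locale.ROOT,
            "%.2f bytes per entry, about %.2f of them beyond the assumed payload",
            perEntry, perEntry - assumedPayloadBytes));
    }
}
```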
Persistence
When the JVM terminates, the mapped file is still there, making it easy to pick up a previously created map including its content. This works much like a rudimentary superfast database. Here is how we can start off from an existing file:
return ChronicleMap
    .of(Long.class, Point.class)
    .averageValueSize(8)
    .valueMarshaller(PointSerializer.getInstance())
    .entries(10_000_000)
    .createOrRecoverPersistedTo(new File("my-map"));
The Map will be available directly, including its previous content.
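The reason content survives is that writes to a memory-mapped region end up in the backing file. The sketch below illustrates this with the plain java.nio API rather than with ChronicleMap, so it is only an analogy for the principle and not a description of ChronicleMap's internals. The class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedFilePersistenceSketch {

    // Maps the first 8 bytes of the file and stores a long there.
    static void write(Path file, long value) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
            buffer.putLong(0, value);
            buffer.force(); // flush the mapped region to the file
        }
    }

    // Opens the file again with a fresh mapping and reads the long back.
    static long read(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, Long.BYTES);
            return buffer.getLong(0);
        }
    }

    // Writes a value, drops the mapping, then reads it through a new one.
    static long roundTrip(long value) {
        try {
            Path file = Files.createTempFile("mapped-sketch", ".dat");
            file.toFile().deleteOnExit();
            write(file, value);
            return read(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
    }
}
```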
Java Map Exceeding RAM Limit
One interesting aspect of memory-mapped files is that they can exceed both the heap and RAM limits. The file mapping logic makes sure that the parts currently in use are loaded into RAM on demand. It also retains recently accessed portions of the mapped memory in physical memory to improve performance. This occurs behind the scenes and need not be managed by the application itself.
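The on-demand behavior can be observed with the plain java.nio API as well. The sketch below maps a region and touches only its first and last eight bytes, leaving it to the operating system to decide which pages need to be resident. The 64 MiB region size and all names are arbitrary choices for this illustration, and a single FileChannel.map call cannot exceed Integer.MAX_VALUE bytes, so this does not show how ChronicleMap itself manages larger files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OnDemandMappingSketch {

    // Arbitrary region size for the illustration: 64 MiB.
    static final long REGION_BYTES = 64L * 1024 * 1024;

    // Maps the whole region but reads and writes only its two ends.
    static long touchFirstAndLast() {
        try {
            Path file = Files.createTempFile("on-demand-sketch", ".dat");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ,
                    StandardOpenOption.WRITE)) {
                MappedByteBuffer region =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, REGION_BYTES);
                int lastSlot = (int) (REGION_BYTES - Long.BYTES);

                region.putLong(0, 1L);        // touches the first page
                region.putLong(lastSlot, 2L); // touches the last page

                // Everything in between is never accessed by this program.
                return region.getLong(0) + region.getLong(lastSlot);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(touchFirstAndLast());
    }
}
```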
My desktop computer is an older MacBook Pro with only 16 GB of memory (yes, I know that sucks). Nevertheless, I can allocate a Map with 1 billion entries, potentially occupying 33 * 1,000,000,000 = 33 GB of memory (recall from above that each entry occupied 33 bytes on average). The code looks like this:
return ChronicleMap
    .of(Long.class, Point.class)
    .averageValueSize(8)
    .valueMarshaller(PointSerializer.getInstance())
    .entries(1_000_000_000)
    .createPersistedTo(new File("huge-map"));
Even though I try to create a Java Map with 2x my RAM size, the code runs flawlessly and I get this file:
Pers-MacBook-Pro:target pemi$ ls -lart | grep huge-map
-rw-r--r-- 1 pemi staff 34573651968 Jul 10 18:52 huge-map
Needless to say, you should make sure that the file you are mapping to is located on a file system with high random access performance, for example a file system on a local SSD.
Summary
- ChronicleMap can be mapped to an external file
- The mapped file is retained when the JVM exits
- New applications can pick up an existing mapped file
- ChronicleMap can hold more data than there is RAM
- Mapped files are best placed on file systems with high random access performance