Addressing Fragmentation in ZGC through Custom Allocators
This is a repost of my blog entry about my master's thesis at Oracle. The original was published on inside.java.
My name is Joel and I am currently finishing my 5-year degree in Computer and Information Engineering at Uppsala University, with a focus on Software Engineering. This blog post details my master's thesis work as part of the GC team at Oracle’s Stockholm office in the spring of 2024. I’ve worked closely with Casper Norrbin and Niclas Gärds, who have also conducted their master's theses at Oracle this spring in the same area as I have.
Problem Statement
ZGC and other garbage collectors typically use bump-pointer allocation, which is efficient for sequential allocations but leads to fragmentation over time. Fragmentation occurs when gaps are created in memory that cannot easily be reused. Collectors resolve it by relocating live objects, which is costly. The goal of this research is to reduce the need for relocation in ZGC by using a free-list-based allocator alongside the bump-pointer allocator, since a free list can track and reuse fragmented memory more effectively in certain scenarios.
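To make the problem concrete, here is a minimal C++ sketch of a bump-pointer page. It is not ZGC's code: `BumpPage`, `kNoSpace`, and the 64-byte capacity in the usage note are hypothetical names and values chosen for illustration, and the page tracks offsets only rather than real memory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sentinel returned when the page cannot satisfy a request.
constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

// Toy model of a bump-pointer page: it only ever allocates at `top`.
struct BumpPage {
    std::size_t capacity;  // total bytes in the page
    std::size_t top;       // offset of the next unallocated byte
    std::size_t live;      // bytes currently occupied by live objects

    explicit BumpPage(std::size_t cap) : capacity(cap), top(0), live(0) {}

    // Returns the offset of the new object, or kNoSpace if it does not fit.
    std::size_t allocate(std::size_t size) {
        if (size > capacity - top) return kNoSpace;
        std::size_t offset = top;
        top += size;   // "bump" the pointer past the new object
        live += size;
        return offset;
    }

    // A dead object leaves a gap below `top` that allocate() never revisits.
    void object_died(std::size_t size) {
        assert(size <= live);
        live -= size;
    }

    // Bytes below `top` that are free but unreachable for bump allocation.
    std::size_t unusable_gap_bytes() const { return top - live; }
};
```

With a 64-byte page, allocating 16 and then 32 bytes and letting the first object die leaves 32 bytes free in total (a 16-byte gap plus 16 bytes at the end), yet a 32-byte request still fails. A free-list allocator could at least hand the 16-byte gap back out, which is the kind of reuse this work is after.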
Methodology
My research adapts the Two-Level Segregated Fit (TLSF) allocator by Masmano et al. to make it better suited for use in ZGC. My main contributions are the following adaptations:
- 0-byte Header: By utilizing information within ZGC, the allocator introduces a 0-byte header, which significantly reduces internal fragmentation. The image below shows 1) the reference design, 2) the general design, and 3) the optimized 0-byte header.
- ZGC Small Pages: By limiting the allocator to the page size (2 MB) and allocation size range ([16 B, 256 KB]) of ZGC's small pages, internal representations can be stored and used more efficiently. The image below shows how the many first and second levels are flattened into a single 64-bit word.
- Concurrency: Concurrent operations on the allocator are supported through a lock-free mechanism designed to handle a range of problems and use cases.
The 0-byte header is especially noteworthy as it is made possible by a series of smaller adaptations to the allocator: deferring coalescing, reducing the supported heap size to that of ZGC’s small pages, and leveraging information that is already part of the Java object header. Additionally, concurrency can be solved in many ways, but the lock-free solution in my research is significantly easier to implement thanks to these same adaptations. Without them, a lock-free solution would be much more complex.
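The flattened 64-bit representation and the lock-free updates described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: the choice of 4 second-level subdivisions per power of two, the names `flat_index` and `try_claim`, and the simplification that a set bit is claimed by clearing it are all hypothetical. The real allocator also manages the free lists behind each bit.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed parameters (not taken from the thesis).
constexpr unsigned kMinShift = 4;  // smallest class starts at 2^4 = 16 B
constexpr unsigned kSlBits = 2;    // 4 second-level subdivisions per power of two

// floor(log2(x)) for x > 0, written as a portable loop.
inline unsigned floor_log2(std::uint64_t x) {
    assert(x != 0);
    unsigned r = 0;
    while (x >>= 1) ++r;
    return r;
}

// Map a block size in [16 B, 256 KB] to one flat index instead of a
// (first level, second level) pair. The largest index is 56, so every
// class fits in a single 64-bit word.
inline unsigned flat_index(std::size_t size) {
    assert(size >= 16 && size <= 256 * 1024);
    unsigned fl = floor_log2(size);
    unsigned sl = static_cast<unsigned>(size >> (fl - kSlBits)) & ((1u << kSlBits) - 1);
    return ((fl - kMinShift) << kSlBits) | sl;
}

// Atomically claim the lowest set bit at or above `from` using a
// compare-and-swap loop. Returns the claimed index, or -1 if none is set.
inline int try_claim(std::atomic<std::uint64_t>& bitmap, unsigned from) {
    assert(from < 64);
    std::uint64_t seen = bitmap.load(std::memory_order_acquire);
    for (;;) {
        std::uint64_t candidates = seen & (~std::uint64_t{0} << from);
        if (candidates == 0) return -1;
        std::uint64_t lowest = candidates & (~candidates + 1);  // isolate lowest set bit
        if (bitmap.compare_exchange_weak(seen, seen & ~lowest,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire)) {
            return static_cast<int>(floor_log2(lowest));
        }
        // CAS failed: `seen` was refreshed with the current value, so retry.
    }
}
```

Because the whole index lives in one word, a single compare-and-swap covers what would otherwise be separate first-level and second-level updates. That is one plausible reading of why the size restriction makes a lock-free design simpler.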
Results
The adapted allocator shows promise for use in ZGC, particularly when it comes to allocating memory.
- Performance: For single allocations, the new allocator performed on par with the reference implementation. However, it was slightly slower for single deallocations and real-world allocation patterns. This trade-off is considered acceptable given the significant reduction in fragmentation.
- Memory Efficiency: The introduction of the 0-byte header and other optimizations led to a notable decrease in internal fragmentation. This improvement in memory efficiency suggests that the new allocator is effective in managing fragmented memory.
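As a rough illustration of why removing the header matters, the arithmetic below computes the per-block overhead for a given payload. The 8-byte header and 8-byte alignment used in the usage note are assumptions made for this example, not figures from the thesis, and `align_up` and `overhead` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>

// Round `n` up to a multiple of `align` (align must be a power of two).
constexpr std::size_t align_up(std::size_t n, std::size_t align) {
    return (n + align - 1) & ~(align - 1);
}

// Bytes of a block that are not payload: the header plus alignment padding.
constexpr std::size_t overhead(std::size_t payload, std::size_t header,
                               std::size_t align) {
    return align_up(payload + header, align) - payload;
}
```

Under these assumptions, a 24-byte object costs 8 extra bytes with an 8-byte header (`overhead(24, 8, 8)`) and none with a 0-byte header (`overhead(24, 0, 8)`). For small objects that difference is a large share of the block.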
Conclusion
My work indicates that customizing allocators for use within garbage collectors like ZGC is a viable approach to addressing memory fragmentation. The adapted allocator not only reduces the need for costly relocations but also enhances overall memory efficiency. I’ve shown that there is a lot of potential for adapting TLSF for use in ZGC, and similar adaptations might apply to other allocators. The most apparent next step is to integrate the allocator into ZGC (which Niclas Gärds, who conducted his thesis at Oracle in parallel with mine, has done). Other areas to explore include accounting for the new minimum allocation size in Java from the Lilliput project and addressing starvation in the adapted allocator’s concurrency implementation.
I would like to end by giving a huge thank you to everyone at the Oracle office in Stockholm for sharing their knowledge, making me feel part of the team and fostering an environment which inspires learning. Thank you to Erik Österlund and Tobias Wrigstad for your steady support, knowledge and guidance throughout the project. And finally, thank you to Casper and Niclas for making this spring special and exciting!
The full report of my work can be found here. It covers the concepts explained in this post in more detail and depth, along with some areas that are not included here.