Why is the first C++ (m)allocation always 72 KB?
TL;DR: The C++ standard library sets up its exception-handling infrastructure early on, allocating an “emergency pool” so that exceptions can still be allocated if malloc ever runs out of memory.
Introduction
I like to spend (some of) my time hacking and experimenting on custom memory allocators with my own malloc implementation(s). While unit tests are useful for correctness, the ultimate test is seeing how the allocator behaves in real-world programs. On Linux, overriding the default malloc is surprisingly simple: wrap the standard allocation functions (e.g., malloc, calloc, realloc, free, and utilities like malloc_usable_size), compile your implementation into a shared library, and use LD_PRELOAD to force programs to load it first. For example, you can test your allocator with a simple command like this:
LD_PRELOAD=/home/joel/mymalloc/libmymalloc.so ls
To better understand how programs allocate memory, I built a debug tool that logs the size of every allocation request to a file. A tool like this, living inside a malloc implementation, must not call malloc itself to log its output. Otherwise the logger re-enters malloc, recurses without bound, and crashes. To avoid this, I use a stack-allocated buffer together with low-level functions like creat, write and snprintf to safely capture the data.
$ LOG_ALLOC=log.txt LD_PRELOAD=/home/joel/mymalloc/libmymalloc.so ls
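The allocation-free logging described above could look roughly like the sketch below. It is not the actual tool: the helper names format_alloc_size and log_alloc_size are hypothetical, and it assumes that snprintf does not allocate for a plain integer conversion, which the C standard does not guarantee.

```cpp
#include <cstddef>
#include <cstdio>
#include <unistd.h>

// Hypothetical helper: writes "<size>\n" into a caller-provided buffer and
// returns the number of characters stored (excluding the terminating NUL).
inline std::size_t format_alloc_size(char* buf, std::size_t cap, std::size_t size) {
    if (cap == 0) return 0;
    int n = std::snprintf(buf, cap, "%zu\n", size);
    if (n < 0) return 0;                      // formatting error
    std::size_t len = static_cast<std::size_t>(n);
    return len < cap ? len : cap - 1;         // clamp if the output was truncated
}

// Hypothetical helper: logs one allocation size to an already-open file
// descriptor (for example one obtained from creat) without touching the heap.
inline void log_alloc_size(int fd, std::size_t size) {
    char buf[32];                             // lives on the stack, not the heap
    std::size_t len = format_alloc_size(buf, sizeof buf, size);
    ssize_t written = ::write(fd, buf, len);  // direct system call, no stdio buffering
    (void)written;                            // a real tool might retry on short writes
}
```

Calling log_alloc_size(fd, size) at the top of the malloc wrapper would then append one line per request, which matches the one-number-per-line format of the log shown in this post.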
The 72 KB Mystery
While analyzing allocation patterns across different programs, I noticed something unusual: the very first allocation is always 73728 bytes (72 KB). Every program I tested exhibited this behavior, as confirmed by my debug logs:
$ head -n 1 log.txt
73728
To track down the first call to malloc, I use gdb to set a breakpoint on my own malloc function and inspect the backtrace.
A quick side note: Setting a breakpoint on the “malloc” symbol will trigger not only for our own malloc, but also for the dynamic linker’s (RTLD) internal malloc, so we have to be more specific. RTLD uses its own minimal malloc implementation for early memory allocation, before libc (or our own malloc) is loaded. I encourage you to take a look at glibc’s elf/dl-minimal-malloc.c; it is remarkably approachable.
$ gdb --args ls
...
(gdb) set environment LD_PRELOAD=/home/joel/mymalloc/libmymalloc.so
(gdb) b MallocWrapper.cpp:malloc
...
(gdb) r
Starting program: /usr/bin/ls
Breakpoint 1, malloc (size=73728) at src/MallocWrapper.cpp:44
...
(gdb) bt
#0 malloc (size=73728) at src/MallocWrapper.cpp:44
#1 0x00007ffff78bd17f in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007ffff7fca71f in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdce8, env=env@entry=0x7fffffffdcf8) at ./elf/dl-init.c:74
#3 0x00007ffff7fca824 in call_init (env=<optimized out>, argv=<optimized out>, argc=<optimized out>, l=<optimized out>) at ./elf/dl-init.c:120
#4 _dl_init (main_map=0x7ffff7ffe2e0, argc=1, argv=0x7fffffffdce8, env=0x7fffffffdcf8) at ./elf/dl-init.c:121
#5 0x00007ffff7fe45a0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
# ...
Diving Into libstdc++
The backtrace revealed that the first 72 KB allocation originated from libstdc++. While adding debug symbols helps narrow it down a bit, it’s hard to pinpoint the exact function responsible for the malloc call due to inlining. All we know is that the malloc call comes from something down the line from __pool_alloc_base::_M_allocate_chunk.
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
...
0x00007ffff78aa600 0x00007ffff79eef42 Yes (*) /lib/x86_64-linux-gnu/libstdc++.so.6
...
(*): Shared library is missing debugging information.
(gdb) add-symbol-file /lib/x86_64-linux-gnu/debug/libstdc++.so.6 0x00007ffff78aa600
...
Reading symbols from /lib/x86_64-linux-gnu/debug/libstdc++.so.6...
(gdb) bt
#0 malloc (size=73728) at src/MallocWrapper.cpp:44
#1 0x00007ffff78bd17f in __gnu_cxx::__pool_alloc_base::_M_allocate_chunk (this=0x7ffff79ef08d <std::filesystem::filesystem_error::_Impl::~_Impl()+3>, __n=8, __nobjs=<error reading variable: Cannot access memory at address 0x0>)
# at ../../../../../../src/libstdc++-v3/src/c++98/pool_allocator.cc:114
# ...
Identifying the exact caller took some time, but I narrowed it down by cross-referencing known functions in the assembly code with the libstdc++ source code. The investigation led me to libstdc++-v3/libsupc++/eh_alloc.cc, where “eh” stands for “exception handling”. This made sense because _M_allocate_chunk is likely the first point where an exception could be thrown, so the exception-handling infrastructure must be initialized, which is presumably done lazily.
libstdc++-v3/src/c++98/pool_allocator.cc (simplified for readability):
char* __pool_alloc_base::_M_allocate_chunk(size_t __n, int& __nobjs) {
// ...
__try {
_S_start_free = static_cast<char*>(::operator new(__bytes_to_get));
} __catch(const std::bad_alloc&) { /* ... */ }
// ...
}
Exception Handling Infrastructure (Emergency Pool)
The 72 KB call to malloc we’re seeing is memory for the so-called “emergency pool”, which is allocated in the constructor of the pool:
libstdc++-v3/libsupc++/eh_alloc.cc (simplified for readability):
pool::pool() noexcept {
// ...
arena_size = buffer_size_in_bytes(obj_count, obj_size);
if (arena_size == 0)
return;
arena = (char *)malloc (arena_size);
if (!arena) {
// If the allocation failed go without an emergency pool.
arena_size = 0;
return;
}
// Populate the free-list with a single entry covering the whole arena
first_free_entry = reinterpret_cast <free_entry *> (arena);
new (first_free_entry) free_entry;
first_free_entry->size = arena_size;
first_free_entry->next = NULL;
}
Normally, exceptions are allocated directly via malloc, but if the malloc call fails, the exception is allocated from the emergency pool instead. This ensures that exceptions can still be thrown (to the extent of the size of the emergency pool) even when malloc fails, providing a last line of defense for error handling. The emergency pool is allocated eagerly at program startup, since memory is more likely to be available then, which explains why we see this allocation so consistently.
libstdc++-v3/libsupc++/eh_alloc.cc (simplified for readability):
extern "C" void *
__cxxabiv1::__cxa_allocate_exception(std::size_t thrown_size) noexcept {
// ..
void *ret = malloc (thrown_size);
#if USE_POOL
if (!ret)
ret = emergency_pool.allocate (thrown_size);
#endif
// ...
}
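To make the fallback pattern concrete, here is a minimal sketch of "try the normal allocator first, then fall back to memory reserved up front". It is not how libstdc++ implements its pool: ReservedArena and allocate_with_fallback are hypothetical names, the arena is a simple bump allocator rather than a free list, and the primary allocator is passed in as a function pointer so that a failing malloc can be simulated.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical, heavily simplified stand-in for the emergency pool: a bump
// allocator over one block reserved up front. Unlike the free-list based
// pool in libstdc++, it never reuses memory.
class ReservedArena {
public:
    explicit ReservedArena(std::size_t bytes)
        : arena_(static_cast<char*>(std::malloc(bytes))),
          size_(arena_ ? bytes : 0) {}  // if malloc failed, go without a reserve
    ~ReservedArena() { std::free(arena_); }
    ReservedArena(const ReservedArena&) = delete;
    ReservedArena& operator=(const ReservedArena&) = delete;

    // Returns suitably aligned memory from the reserve, or nullptr if exhausted.
    void* allocate(std::size_t n) {
        constexpr std::size_t align = alignof(std::max_align_t);
        if (n > size_ - used_) return nullptr;  // cannot fit at all
        std::size_t rounded = (n + align - 1) / align * align;
        if (rounded > size_ - used_) return nullptr;
        void* p = arena_ + used_;
        used_ += rounded;
        return p;
    }

    // True if p points into the reserved block.
    bool owns(const void* p) const {
        auto addr = reinterpret_cast<std::uintptr_t>(p);
        auto base = reinterpret_cast<std::uintptr_t>(arena_);
        return arena_ != nullptr && addr >= base && addr - base < size_;
    }

private:
    char* arena_;
    std::size_t size_;
    std::size_t used_ = 0;
};

using AllocFn = void* (*)(std::size_t);

// Try the primary allocator; only if it fails, use the reserve.
inline void* allocate_with_fallback(AllocFn primary, ReservedArena& reserve,
                                    std::size_t n) {
    void* p = primary(n);
    if (!p) p = reserve.allocate(n);
    return p;
}
```

Reserving the block in the constructor mirrors the design choice in the pool constructor shown earlier: the memory is obtained while the normal allocator still works, so that it is available later when it does not.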
Emergency Pool Sizing. Why 72 KB?
The source file briefly explains how the size of the emergency pool is calculated. Both the object size and the number of objects scale with the pointer size (sizeof(void*)), which is 8 bytes on a 64-bit system.
libstdc++-v3/libsupc++/eh_alloc.cc:
// The size of the buffer is N * (S * P + R + D), where:
// N == The number of objects to reserve space for.
// Defaults to EMERGENCY_OBJ_COUNT, defined below.
// S == Estimated size of exception objects to account for.
// This size is in units of sizeof(void*) not bytes.
// Defaults to EMERGENCY_OBJ_SIZE, defined below.
// P == sizeof(void*).
// R == sizeof(__cxa_refcounted_exception).
// D == sizeof(__cxa_dependent_exception).
// ...
#define EMERGENCY_OBJ_SIZE 6
#define EMERGENCY_OBJ_COUNT (4 * __SIZEOF_POINTER__ * __SIZEOF_POINTER__)
The object size (obj_size) and number of objects (obj_count) can be tuned manually via the GLIBCXX_TUNABLES environment variable. We can verify empirically that the initial allocation is actually for the emergency pool by changing the number of objects in the pool. As expected, we see the initial allocation size go down when we change the number of objects:
$ GLIBCXX_TUNABLES=glibcxx.eh_pool.obj_count=10 \
LOG_ALLOC=log.txt \
LD_PRELOAD=/home/joel/mymalloc/libmymalloc.so \
ls
$ head -n 1 log.txt
2880
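The two logged sizes can be checked against the formula N * (S * P + R + D) with a few lines of arithmetic. The sketch below assumes a 64-bit system (P = 8) and does not read R and D from the libstdc++ headers. Instead it infers their sum from the tuned run: 2880 bytes for 10 objects is 288 bytes per object, of which S * P = 48, leaving R + D = 240. The constant and function names are made up for this example.

```cpp
#include <cstddef>

// Assumed values for a 64-bit system; names are hypothetical.
constexpr std::size_t kPointerSize = 8;                                    // P
constexpr std::size_t kObjSize = 6;                                        // S, EMERGENCY_OBJ_SIZE
constexpr std::size_t kDefaultObjCount = 4 * kPointerSize * kPointerSize;  // N = 256

// R + D inferred from the tuned run: 2880 bytes for obj_count=10.
constexpr std::size_t kTunedBytes = 2880;
constexpr std::size_t kTunedObjCount = 10;
constexpr std::size_t kPerObject = kTunedBytes / kTunedObjCount;              // 288
constexpr std::size_t kHeaderBytes = kPerObject - kObjSize * kPointerSize;    // 240

// N * (S * P + R + D), with R + D passed as one value.
constexpr std::size_t pool_bytes(std::size_t n, std::size_t s, std::size_t p,
                                 std::size_t r_plus_d) {
    return n * (s * p + r_plus_d);
}

static_assert(kHeaderBytes == 240, "R + D inferred from the tuned run");
static_assert(pool_bytes(kDefaultObjCount, kObjSize, kPointerSize, kHeaderBytes) == 73728,
              "default pool: 256 objects * 288 bytes = 72 KB");
```

With the default of 256 objects, the same 288 bytes per object gives exactly 73728 bytes, so both log entries are consistent with the formula.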
As a side note, the emergency pool can also be disabled (i.e., not allocated) by setting the number of objects to 0. Alternatively, you can opt in to a fixed-size static buffer for the emergency pool by configuring --enable-libstdcxx-static-eh-pool when building libstdc++.
Valgrind and Memory Leak Confusion
Tying into our findings is the behavior observed in Valgrind, a popular tool that can, among other things, detect memory management bugs. Reddit user ismbks posts on r/cpp_questions “Why does my program allocate ~73kB of memory even tho it doesn’t do anything?”, which is the same number of bytes we’re seeing being allocated unconditionally for the emergency pool.
==1174489== HEAP SUMMARY:
==1174489== in use at exit: 0 bytes in 0 blocks
==1174489== total heap usage: 1 allocs, 1 frees, 73,728 bytes allocated
However, in older Valgrind versions, this memory appeared as “still reachable” rather than properly freed. While “still reachable” memory isn’t technically a leak (the program still has references to it), it can be misleading. See the post on Stack Overflow detailing this behavior. Interestingly, this person sees a 71 KB allocation instead of 72 KB.
==8511== HEAP SUMMARY:
==8511== in use at exit: 72,704 bytes in 1 blocks
==8511== total heap usage: 1 allocs, 0 frees, 72,704 bytes allocated
==8511==
==8511== LEAK SUMMARY:
==8511== definitely lost: 0 bytes in 0 blocks
==8511== indirectly lost: 0 bytes in 0 blocks
==8511== possibly lost: 0 bytes in 0 blocks
==8511== still reachable: 72,704 bytes in 1 blocks
==8511== suppressed: 0 bytes in 0 blocks
Many developers mistakenly interpret this behavior as a memory leak, leading to unnecessary confusion. To address this, newer Valgrind versions explicitly free the emergency pool during cleanup, providing clearer reports. This is implemented through the mechanisms shown below, which were added specifically for tools like Valgrind:
/* g++ mangled __gnu_cxx::__freeres yields -> _ZN9__gnu_cxx9__freeresEv */
extern void _ZN9__gnu_cxx9__freeresEv(void) __attribute__((weak));
if (((to_run & VG_RUN__GNU_CXX__FREERES) != 0) &&
(_ZN9__gnu_cxx9__freeresEv != NULL)) {
_ZN9__gnu_cxx9__freeresEv();
}
libstdc++-v3/libsupc++/eh_alloc.cc (simplified for readability):
namespace __gnu_cxx {
__attribute__((cold)) void __freeres() noexcept {
#ifndef _GLIBCXX_EH_POOL_STATIC
if (emergency_pool.arena) {
::free(emergency_pool.arena);
emergency_pool.arena = 0;
}
#endif
}
}
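The Valgrind snippet relies on a weak symbol declaration: if no loaded library defines the function, its address is null and the call is skipped. The pattern can be sketched in isolation as follows. The function name hypothetical_release_caches is made up for this example, and the weak attribute is a GCC/Clang extension, so this assumes one of those compilers on an ELF platform such as Linux.

```cpp
// Weak declaration: if nothing in the program or its libraries defines this
// function, its address is null instead of producing a link error.
// The name is hypothetical and exists only for this sketch.
extern "C" void hypothetical_release_caches(void) __attribute__((weak));

// Calls the optional hook when it is present; returns whether it was called.
inline bool run_optional_cleanup() {
    if (hypothetical_release_caches != nullptr) {
        hypothetical_release_caches();
        return true;
    }
    return false;
}
```

This lets a tool support a cleanup hook without requiring it: a program linked against a library that provides the function gets the cleanup, while any other program keeps working unchanged.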
Takeaways
The memory allocated for the emergency pool explains why I’ve been able to consistently observe a 72 KB allocation when testing my custom allocator. Since I’ve implemented my custom allocator in C++, it inherently depends on libstdc++, which initializes the emergency pool on every program invocation. Interestingly, if I had written my allocator in C instead, as several popular malloc implementations are (mimalloc, jemalloc), I would only see this initial allocation when testing C++ binaries, which explicitly link against libstdc++.
As you quickly find out when working with memory allocation, almost everything needs to allocate memory. RTLD needs its own malloc because libc hasn’t been loaded yet, and the emergency pool uses malloc only to obtain memory for its own pool allocator!
Digging through the code and piecing this together was rewarding and fun. I hope you enjoyed the journey as much as I did!