NSTL Allocators
NSTL offers a variety of
allocators for efficient and effective memory management. These
allocators are standard-compliant, meaning that their interface agrees
with the allocator interface described by the C++ standard, so they can
be used with the standard containers. Their descriptions and intended
methods of use are discussed below:
1. nstd::node_allocator
This is the default node allocator and
should be used for node-based containers such as std::list, std::map,
std::multimap, etc. It is optimized for single-object allocations.
The overhead per allocated block is zero bytes as long as the size
of the type on which the allocator is parameterized is greater than
or equal to the size of a pointer. This allocator hands out memory from
a free list, and each allocation and deallocation is guaranteed to be
amortized constant time. Its only defect is that if memory is given
back to the allocator in an order different from that in which it was
acquired, the nodes become scattered in memory, causing many cache
misses for any container that subsequently acquires nodes from this
allocator; the performance of that container will therefore seem to
degrade the next time around. This allocator is used by including the
header node_alloc.hpp.
2. nstd::vma_allocator
VMA stands for Variable-size Memory
Allocator. This is the default large-object allocator. It is used to
acquire memory for multiple objects at once; typical users would
include std::vector, etc. This allocator has an overhead of 20 bytes
for a free block and 12 bytes per allocated block (for a typical
32-bit architecture, in which the sizes of an int and a pointer are
both 4 bytes; to calculate these overheads for other architectures,
refer to nstl_install_dir/nstl/detail/nstl_vma_list.hpp and see the
definition of the struct chunk_header). It differs from the default
node_allocator in that:
- It should be used for multiple-object allocations.
- The overhead per allocated block is larger than node_allocator's.
- It implements merging of freed blocks.
The last point needs some detail. When a block of memory is
deallocated, it is merged with any adjacent block that is also free,
so that external fragmentation is avoided at this stage. This means
that if you successively allocate objects for the same container,
objects that are intended to be used together, this allocator will try
to keep the blocks of memory as close to each other as possible. The
node_allocator is unaware of a node's position in memory and makes no
effort to maintain the relative order of nodes, whereas vma_allocator
does implement merging of adjacent blocks.
3. nstd::bitmap_allocator
This is again a single-object allocator
intended to be used with node-based containers. The only differences
between bitmap_allocator and node_allocator are given below:
- The overhead per allocated/free block is 1 bit, which follows from
its being a bitmapped allocator.
- The relative order of nodes in memory is preserved, so cache misses
during subsequent allocations, and during use of the memory associated
with those nodes, are minimized as far as possible.
This is possible because the allocator internally maintains a free
list of all the regions of memory used as bitmaps.
Allocations are guaranteed to have a complexity of O(log(N)),
where N is the number of nodes held by the allocator. In general, the
allocator should remain quite fast even for decently sized containers.
You should prefer this allocator for all node-based containers,
because it is quick at allocating memory and also preserves cache
locality, improving how the program interacts with the CPU's L1/L2
cache subsystem.
This allocator can be made thread-safe by defining the
macro NSTL_BITMAP_ALLOCATOR_THREAD_SUPPORT in the file
nstl/details/nstl_defines.hpp.
This allocator is used by including the header bit_alloc.hpp.
Refer to the file balloc_test.cpp in the testsuite to compare the
performance of this allocator with the others in various situations.
4. nstd::allocator
This is the default allocator and
should be used in all general cases. It accepts three template
parameters:
- The type for which you want to instantiate the allocator.
- A small-object allocator.
- A large-object allocator.
The small-object allocator is used for requests less than or
equal to the size of 5 ints on that particular system. The way this is
done is a bit unusual: nstd::allocator creates 5 instances of the
small-object allocator, one each for 1, 2, 3, 4, and 5 ints.
Each instance is then used as a single-object allocator.
The large-object allocator is used to satisfy all other memory
requests.
By default, the small-object allocator is nstd::node_allocator and the
large-object allocator is nstd::vma_allocator. This allocator can be
used by including the header allocator.hpp.
5. nstd::cheap_allocator
This allocator does nothing but call _aux_getmem()
and _aux_putmem() from the header nstl_aux_mem.hpp to get and free
memory respectively; that's why it is called cheap_allocator :-) It can
be used by including the header cheap_alloc.hpp.
6. nstd::mt_allocator
This allocator is meant to be used for
containers in a multi-threaded environment, where many threads create
their own containers such as
vector/deque/list/map/multimap/set, etc., and insert or delete
values from them. When you use cheap_allocator in a
multi-threaded application, every time a thread requests or releases
memory, some sort of lock is taken, and it cannot be released until
that operation finishes. This hampers performance drastically: on a
single-CPU system, whenever a context switch occurs between the
allocation function being called and the allocation function
returning; and on multi-CPU systems, whenever two or more CPUs call the
allocate function simultaneously or within a very small time gap. To
alleviate this excessive locking, mt_allocator reduces
locking to only those moments when a global lock is absolutely
necessary to obtain more memory from the OS by calling the (hopefully!)
thread-safe operator new(). The test case that corresponds to this
allocator is thread_test.cpp.
Comparison of the allocators' various parameters.

nstd::bitmap_allocator
  Pool-type allocator? Yes, but it releases memory when the pool
  exceeds a certain limit, and it has a clear function by which the
  internal pool can be cleared.
  Thread-safe? Yes by default, but it can be made thread-unsafe by
  undefining a macro in the file nstl/details/nstl_defines.hpp.
  Best used for: Node-based containers such as list, map, multimap,
  set, etc.
  Defined in file: bit_alloc.hpp

nstd::node_allocator
  Pool-type allocator? Yes.
  Thread-safe? No, but it will be made configurable like
  nstd::bitmap_allocator in the future.
  Best used for: Node-based containers such as list, map, multimap,
  set, etc.
  Defined in file: node_alloc.hpp

nstd::vma_allocator
  Pool-type allocator? Yes.
  Thread-safe? No, but it will be made configurable like
  nstd::bitmap_allocator in the future.
  Best used for: All containers except node-based containers, because
  the overhead per allocated block is 12 bytes on an x86 system.
  Defined in file: vma_alloc.hpp

nstd::mt_allocator
  Pool-type allocator? Yes, but it releases memory when the pool
  exceeds a certain limit, and it has a clear function by which the
  internal pool can be cleared.
  Thread-safe? Yes. That's what it's made for!
  Best used for: All containers used in a multi-threaded environment.
  Defined in file: mt_alloc.hpp

nstd::allocator
  Pool-type allocator? Depends on the nature of the underlying
  allocator(s).
  Thread-safe? Depends on the nature of the underlying allocator(s).
  Best used for: General allocation.
  Defined in file: allocator.hpp

nstd::cheap_allocator
  Pool-type allocator? Depends on the nature of the underlying
  operator new; mostly, the answer is No.
  Thread-safe? Depends on the nature of the underlying operator new;
  mostly, the answer is Yes.
  Best used for: General allocation using operator new, without any
  further fancy features.
  Defined in file: cheap_alloc.hpp
Dhruv Matani.
dhruv_nospam_bird at yahoo dot com
Remove _nospam_ to send me an email.