NSTL Allocators
NSTL offers a variety of
allocators for efficient and effective memory management. These
allocators are standard-compliant, meaning that their interface agrees
with the allocator interface described by the C++ standard, so they can
be used with the standard containers. Their descriptions and intended
methods of use are discussed below:
1. nstd::node_allocator
This is the default node allocator and
should be used for node-based containers such as std::list, std::map,
std::multimap, etc. It is optimized for single-object allocations.
The overhead per allocated block is zero bytes as long as the size
of the type on which the allocator is parameterized is greater than
or equal to the size of a pointer. This allocator hands out memory from
a free list, and each allocation and deallocation is guaranteed to be
amortized constant time. Its only defect is that if memory is given
back to the allocator in an order different from that in which it was
acquired, the nodes become scattered in memory, causing many cache
misses for any container that subsequently acquires nodes from this
allocator; the performance of that container will therefore seem to
degrade the next time around. This allocator is used by including the
header node_alloc.hpp.
2. nstd::vma_allocator
VMA stands for Variable-size Memory
Allocator. This is the default large-object allocator. It is used to
acquire memory for multiple objects at once; typical users would
include std::vector, etc. This allocator has an overhead of 20 bytes
for a free block and 12 bytes per allocated block (for a typical
32-bit architecture, in which the sizes of an int and a pointer are
both 4 bytes; to calculate these overheads for other architectures,
refer to nstl_install_dir/nstl/detail/nstl_vma_list.hpp and see the
definition of the struct chunk_header). It differs from the default
node_allocator in that:
- It should be used for multiple-object allocations.
- The overhead per allocated block is larger than node_allocator's.
- It implements merging of freed blocks.
The last point needs some detail. When a block of memory is
deallocated, it is merged with any adjacent block that is also free,
so that external fragmentation is avoided at this stage. This means
that if you successively allocate objects for the same container,
objects that are intended to be used together, this allocator will try
to keep the blocks of memory as close to each other as possible. The
node_allocator is unaware of a node's position in memory and makes no
effort to maintain the relative order of nodes, whereas vma_allocator
does implement merging of adjacent blocks.
3. nstd::bitmap_allocator
This is again a single-object allocator
intended to be used with node-based containers. The only differences
between bitmap_allocator and node_allocator are given below:
- The overhead per allocated/free block is 1 bit, which follows from
its being a bitmapped allocator.
- The relative order of nodes in memory is preserved, so cache misses
during subsequent allocations, and during use of the memory associated
with those nodes, are minimized as far as possible.
This is possible because the allocator internally maintains a free
list of all the regions of memory used as bitmaps.
Allocations are guaranteed to have a complexity of O(log(N)),
where N is the number of nodes held by the allocator. In general, the
allocator should remain quite fast even for decently sized containers.
You should prefer this allocator for all node-based containers,
because it is quick at allocating memory and also preserves cache
locality, improving how the program interacts with the CPU's L1/L2
cache subsystem.
This allocator can be made thread-safe by defining the
macro NSTL_BITMAP_ALLOCATOR_THREAD_SUPPORT in the file
nstl/details/nstl_defines.hpp.
This allocator is used by including the header bit_alloc.hpp.
Refer to the file balloc_test.cpp in the testsuite to compare the
performance of this allocator with the others in various situations.
4. nstd::allocator
This is the default allocator and
should be used in all general cases. It accepts three template
parameters:
- The type for which you want to instantiate the allocator.
- A small-object allocator.
- A large-object allocator.
The small-object allocator is used for requests less than or
equal to the size of 5 ints on that particular system. The way this is
done is a bit unusual: nstd::allocator creates 5 instances of the
small-object allocator, one each for 1, 2, 3, 4, and 5 ints.
Each instance is then used as a single-object allocator.
The large-object allocator is used to satisfy all other memory
requests.
By default, the small-object allocator is nstd::node_allocator and the
large-object allocator is nstd::vma_allocator. This allocator can be
used by including the header allocator.hpp.
5. nstd::cheap_allocator
This allocator does nothing but call _aux_getmem()
and _aux_putmem() from the header nstl_aux_mem.hpp to get and free
memory respectively; that's why it is called cheap_allocator :-) It can
be used by including the header cheap_alloc.hpp.
6. nstd::mt_allocator
This allocator is meant to be used for
containers in a multi-threaded environment, where many threads create
their own containers such as
vector/deque/list/map/multimap/set, etc., and insert or delete
values from them. When you use cheap_allocator in a
multi-threaded application, every time a thread requests or releases
memory, some sort of lock is taken, and it cannot be released until
that operation finishes. This hampers performance drastically: on a
single-CPU system, whenever a context switch occurs between the
allocation function being called and the allocation function
returning; and on multi-CPU systems, whenever two or more CPUs call the
allocate function simultaneously or within a very small time gap. To
alleviate this excessive locking, mt_allocator reduces
locking to only those moments when a global lock is absolutely
necessary to obtain more memory from the OS by calling the (hopefully!)
thread-safe operator new(). The test case that corresponds to this
allocator is thread_test.cpp.
Comparison of the allocators' various parameters.

nstd::bitmap_allocator
  Pool-type allocator? Yes, but it releases memory when the pool
  exceeds a certain limit, and it has a clear function by which the
  internal pool can be cleared.
  Thread-safe? Yes by default, but it can be made thread-unsafe by
  undefining a macro in the file nstl/details/nstl_defines.hpp.
  Best used for: Node-based containers such as list, map, multimap,
  set, etc.
  Defined in file: bit_alloc.hpp

nstd::node_allocator
  Pool-type allocator? Yes.
  Thread-safe? No, but it will be made configurable like
  nstd::bitmap_allocator in the future.
  Best used for: Node-based containers such as list, map, multimap,
  set, etc.
  Defined in file: node_alloc.hpp

nstd::vma_allocator
  Pool-type allocator? Yes.
  Thread-safe? No, but it will be made configurable like
  nstd::bitmap_allocator in the future.
  Best used for: All containers except node-based containers, because
  the overhead per allocated block is 12 bytes on an x86 system.
  Defined in file: vma_alloc.hpp

nstd::mt_allocator
  Pool-type allocator? Yes, but it releases memory when the pool
  exceeds a certain limit, and it has a clear function by which the
  internal pool can be cleared.
  Thread-safe? Yes. That's what it's made for!
  Best used for: All containers used in a multi-threaded environment.
  Defined in file: mt_alloc.hpp

nstd::allocator
  Pool-type allocator? Depends on the nature of the underlying
  allocator(s).
  Thread-safe? Depends on the nature of the underlying allocator(s).
  Best used for: General allocation.
  Defined in file: allocator.hpp

nstd::cheap_allocator
  Pool-type allocator? Depends on the nature of the underlying
  operator new; mostly, the answer is No.
  Thread-safe? Depends on the nature of the underlying operator new;
  mostly, the answer is Yes.
  Best used for: General allocation using operator new, without any
  further fancy features.
  Defined in file: cheap_alloc.hpp
Dhruv Matani.
dhruv_nospam_bird at yahoo dot com
Remove _nospam_ to send me an email.