Memory management

Memory management is especially important for immutable data structures. This is mainly due to:

  1. In order to preserve the old value, new memory has to be allocated to store the new data whenever a container is manipulated. Thus, more allocations are performed when changing than with traditional mutable data structures.
  2. In order to support structural sharing transparently, some kind of garbage collection mechanism is required. Passing immutable data structures around is, internally, just passing references, thus the system needs to figure out somehow when old values are not referenced anymore and should be deallocated.

Thus, most containers in this library can be customized via policies in order to use different allocation and garbage collection strategies.

using default_memory_policy = memory_policy<default_heap_policy, default_refcount_policy, default_lock_policy>

The default memory policy.

using default_heap_policy = free_list_heap_policy<cpp_heap>

The default heap policy just uses the standard heap with a free_list_heap_policy. If IMMER_NO_FREE_LIST is defined to 1 then it just uses the standard heap.

using default_refcount_policy = refcount_policy

By default we use thread safe reference counting.

Memory policy

template <typename HeapPolicy, typename RefcountPolicy, typename LockPolicy, typename TransiencePolicy = get_transience_policy_t<RefcountPolicy>, bool PreferFewerBiggerObjects = get_prefer_fewer_bigger_objects_v<HeapPolicy>, bool UseTransientRValues = get_use_transient_rvalues_v<RefcountPolicy>>
struct immer::memory_policy

This is a default implementation of a memory policy. A memory policy is just a bag of other policies plus some flags with hints to the user about the best way to use these strategies.

Template Parameters
  • HeapPolicy: A heap policy, for example, heap_policy.
  • RefcountPolicy: A reference counting policy, for example, refcount_policy.
  • TransiencePolicy: A transience policy, for example, no_transience_policy.
  • PreferFewerBiggerObjects: Boolean flag indicating whether the user should prefer to allocate memory in bigger chunks e.g. by putting various objects in the same memory region or not.
  • UseTransientRValues: Boolean flag indicating whether immutable containers should try to modify contents in-place when manipulating an r-value reference.

Public Types

using heap = HeapPolicy
using refcount = RefcountPolicy
using transience = TransiencePolicy
using lock = LockPolicy
using transience_t = typename transience::template apply<heap>::type

Public Static Attributes

constexpr bool prefer_fewer_bigger_objects = PreferFewerBiggerObjects
constexpr bool use_transient_rvalues = UseTransientRValues

Example: tracing garbage collection

It is noteworthy that all aspects of a immer::memory_policy are not completely orthogonal.

Let’s say you want to use a tracing garbage collector. Actually, we already provide a heap that interfaces with the Boehm’s conservative garbage collector. Chunks of memory allocated with this heap do not need to be deallocated, instead, after a certain number of allocations, the heap itself will scan the stack and all allocated memory to find references to other blocks of memory. The memory that is not referenced anymore is automatically freed. Thus, no reference counting mechanism is needed, and it makes no sense to use this heap with anything else than the immer::no_refcount_policy. Also, a big object can be separated in parts that contain pointers and parts that do not, which make scanning references faster. So it makes most sense to use prefer_fewer_bigger_objects = false.

Note

There are few considerations to note when using gc_heap with containers. Please make sure to read its documentation section before using it.

#include <immer/heap/gc_heap.hpp>
#include <immer/heap/heap_policy.hpp>
#include <immer/memory_policy.hpp>
#include <immer/refcount/no_refcount_policy.hpp>
#include <immer/vector.hpp>

#include <iostream>

// declare a memory policy for using a tracing garbage collector
using gc_policy = immer::memory_policy<immer::heap_policy<immer::gc_heap>,
                                       immer::no_refcount_policy,
                                       immer::default_lock_policy,
                                       immer::gc_transience_policy,
                                       false>;

// alias the vector type so we are not concerned about memory policies
// in the places where we actually use it
template <typename T>
using my_vector = immer::vector<T, gc_policy>;

int main()
{
    auto v =
        my_vector<const char*>().push_back("hello, ").push_back("world!\n");

    for (auto s : v)
        std::cout << s;
}

Heaps

A heap policy is a metafunction class that given the sizes of the objects that we want to allocate, it returns a heap that can allocate objects of those sizes.

A heap is a type with methods void* allocate(std::size_t) and void deallocate(void*) that return and release raw memory. For a canonical model of this concept check the immer::cpp_heap.

Note

Currently, heaps can only have global state. Having internal state poses conceptual problems for immutable data structures: should a const method of a container modify its internal allocator state? Should every immutable container object have its own internal state, or new objects made from another one just keep a reference to the allocator of the parent?

On the other hand, having some scoped state does make sense for some use-cases of immutable data structures. For example, we might want to support variations of region based allocation. This interface might evolve to support some kind of non-global state to accommodate these use cases.

Why not use the standard allocator interface?

The standard allocator API was not designed to support different allocation strategies, but to abstract over the underlying memory model instead. In C++11 the situation improves, but the new API is stateful, posing various challenges as described in the previous note. So far it was easier to provide our own allocation interface. In the future, we will provide adaptors so standard-compatible allocators can also be used with immer.

Heap policies

template <typename Heap>
struct immer::heap_policy

Heap policy that unconditionally uses its Heap argument.

Public Types

using type = Heap
template <std::size_t>
struct optimized

Public Types

template<>
using type = Heap
template <typename Heap, std::size_t Limit = default_free_list_size>
struct immer::free_list_heap_policy

Heap policy that returns a heap with a free list of objects of max_size = max(Sizes...) on top an underlying Heap. Note these two properties of the resulting heap:

  • Allocating an object that is bigger than max_size may trigger undefined behavior.
  • Allocating an object of size less than max_size still returns an object of max_size.

Basically, this heap will always return objects of max_size. When an object is freed, it does not directly invoke std::free, but it keeps the object in a global linked list instead. When a new object is requested, it does not need to call std::malloc but it can directly pop and return the other object from the global list, a much faster operation.

This actually creates a hierarchy with two free lists:

  • A thread_local free list is used first. It does not need any kind of synchronization and is very fast. When the thread finishes, its contents are returned to the next free list.
  • A global free list using lock-free access via atomics.

Tip

For many applications that use immutable data structures significantly, this is actually the best heap policy.

Note that most our data structures internally use trees with the same big branching factors. This means that all vectors, maps, etc. can just allocate elements from the same free-list optimized heap. Not only does this lowers the allocation time, but also makes up for more efficient cache utilization. When a new node is needed, there are high chances the allocator will return a node that was just accessed. When batches of immutable updates are made, this can make a significant difference.

Template Parameters
  • Heap: Heap to be used when the free list is empty.

Standard heap

struct immer::cpp_heap

A heap that uses operator new and operator delete.

Public Static Functions

template <typename… Tags>
static void *allocate(std::size_t size, Tags...)

Returns a pointer to a memory region of size size, if the allocation was successful, and throws otherwise.

static void deallocate(std::size_t, void *data)

Releases a memory region data that was previously returned by allocate. One must not use nor deallocate again a memory region that once it has been deallocated.

Malloc heap

struct immer::malloc_heap

A heap that uses std::malloc and std::free to manage memory.

Public Static Functions

template <typename… Tags>
static void *allocate(std::size_t size, Tags...)

Returns a pointer to a memory region of size size, if the allocation was successful and throws std::bad_alloc otherwise.

static void deallocate(std::size_t, void *data)

Releases a memory region data that was previously returned by allocate. One must not use nor deallocate again a memory region that once it has been deallocated.

Garbage collected heap

class immer::gc_heap

Heap that uses a tracing garbage collector.

This heap uses the Boehm’s conservative garbage collector under the hood. This is a tracing garbage collector that automatically reclaims unused memory. Thus, it is not needed to call deallocate() in order to release memory.

Dependencies

In order to use this header file, you need to make sure that Boehm’s libgc is your include path and link to its binary library.

Caution

Memory that is allocated with the standard malloc and free is not visible to libgc when it is looking for references. This means that if, let’s say, you store a immer::vector using a gc_heap inside a std::vector that uses a standard allocator, the memory of the former might be released automatically at unexpected times causing crashes.

Caution

When using a gc_heap in combination with immutable containers, the destructors of the contained objects will never be called. It is ok to store containers inside containers as long as all of them use a gc_heap consistently, but storing other kinds of objects with relevant destructors (e.g. containers with reference counting or other kinds of resource handles) might cause memory leaks and other problems.

Heap adaptors

Inspired by Andrei Alexandrescu’s talk on allocators and Emery Berger’s heap layers we provide allocator adaptors that can be combined using C++ mixins. These enable building more complex allocators out of simpler strategies, or provide application specific optimizations on top of general allocators.

template <typename T, typename Base>
struct immer::with_data

Appends a default constructed extra object of type T at the before the requested region.

Template Parameters
  • T: Type of the appended data.
  • Base: Type of the parent heap.

template <std::size_t Size, std::size_t Limit, typename Base>
struct immer::free_list_heap

Adaptor that does not release the memory to the parent heap but instead it keeps the memory in a thread-safe global free list. Must be preceded by a with_data<free_list_node, ...> heap adaptor.

Template Parameters
  • Size: Maximum size of the objects to be allocated.
  • Base: Type of the parent heap.

template <std::size_t Size, std::size_t Limit, typename Base>
struct immer::thread_local_free_list_heap

Adaptor that does not release the memory to the parent heap but instead it keeps the memory in a thread_local global free list. Must be preceded by a with_data<free_list_node, ...> heap adaptor. When the current thread finishes, the memory is returned to the parent heap.

Template Parameters
  • Size: Maximum size of the objects to be allocated.
  • Limit: Maximum number of elements to keep in the free list.
  • Base: Type of the parent heap.

template <std::size_t Size, std::size_t Limit, typename Base>
struct immer::unsafe_free_list_heap

Adaptor that does not release the memory to the parent heap but instead it keeps the memory in a global free list that is not thread-safe. Must be preceded by a with_data<free_list_node, ...> heap adaptor.

Template Parameters
  • Size: Maximum size of the objects to be allocated.
  • Limit: Maximum number of elements to keep in the free list.
  • Base: Type of the parent heap.

template <typename Base>
struct immer::identity_heap

A heap that simply passes on to the parent heap.

template <typename Base>
struct immer::debug_size_heap

A heap that in debug mode ensures that the sizes for allocation and deallocation do match.

template <std::size_t Size, typename SmallHeap, typename BigHeap>
struct immer::split_heap

Adaptor that uses SmallHeap for allocations that are smaller or equal to Size and BigHeap otherwise.

Reference counting

Reference counting is the most commonly used garbage collection strategy for C++. It can be implemented non-intrusively, in a way orthogonal to the allocation strategy. It is deterministic, playing well with RAII.

A memory policy can provide a reference counting strategy that containers can use to track their contents.

struct immer::refcount_policy

A reference counting policy implemented using an atomic int count. It is thread-safe.

struct immer::unsafe_refcount_policy

A reference counting policy implemented using a raw int count. It is not thread-safe.

struct immer::no_refcount_policy

Disables reference counting, to be used with an alternative garbage collection strategy like a gc_heap.

Transience

In order to support transients, it is needed to provide a mechanism to track the ownership of the data allocated inside the container. This concept is encapsulated in transience policies.

Note that when reference counting is available, no such mechanism is needed. However, when tracing garbage collection is used instead, a special policy has to be provided. Otherwise, the transient API is still available, but it will perform poorly, since it won’t be able to mutate any data in place.

struct immer::no_transience_policy

Disables any special transience tracking. To be used when reference counting is available instead.

struct immer::gc_transience_policy

Provides transience ownership tracking when a tracing garbage collector is used instead of reference counting.

Warning

Using this policy without an allocation scheme that includes automatic tracing garbage collection may cause memory leaks.