The immer::persist library persists persistent data structures, allowing the preservation of structural sharing of immer containers when serializing, deserializing or transforming the data.
Warning
This library is still experimental and its API may change in the future. The headers can be found in immer/extra/persist/... and the extra subpath will be removed once its interface stabilizes.
In addition to the dependencies of immer, this library makes use of C++17, Boost.Hana, fmt and cereal.
Structural sharing allows immer containers to be efficient. At runtime, two distinct containers can be operated on independently, but internally they share nodes and so use memory efficiently.
However, when such containers are serialized in a trivial form, for example as JSON lists, this sharing is lost. The containers become truly independent: the same data is stored multiple times on disk, and when it is read back into memory, the program has lost the structural sharing.
This library operates on the internal structure of immer containers, allowing it to be serialized, deserialized and transformed. This enables more efficient storage, particularly when many nodes are reused, and, even more importantly, preserves structural sharing after the containers are deserialized.
Consider a scenario where you have multiple immer::vector<std::string> instances that are derived from one another. Some of these vectors are completely identical, while others differ in just a few elements. This scenario is not uncommon, for example when implementing the undo history of an application by preserving its previous states.
The goal is to apply a transformation function to these vectors with something like std::transform.
A direct approach would be to take each vector and create a new one by applying the transformation function to each element. However, after this process all the structural sharing of the original containers would be lost: the result would be multiple independent vectors, and the transformation may have been applied multiple times, unnecessarily, to identical elements that were previously shared.
This library enables the application of the transformation function directly on the nodes, preserving structural sharing. Additionally, regardless of how many times a node is reused, the transformation needs to be performed only once.
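The idea of transforming each shared node only once can be illustrated with a toy model that does not use immer at all. In this sketch, Leaf, ToyVector and transform_shared are hypothetical names: each "vector" is a sequence of reference-counted leaf chunks, and a cache keyed by leaf address makes sure the transformation runs once per distinct leaf. It is only an analogy for what the library does on the internal nodes of immer containers, not its actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for an internal node: an immutable chunk of elements.
using Leaf = std::vector<std::string>;
using LeafPtr = std::shared_ptr<const Leaf>;
// Toy stand-in for a persistent vector: a sequence of shared leaves.
using ToyVector = std::vector<LeafPtr>;

// Applies `fn` to every element of every input vector, but visits each
// distinct leaf only once. `calls` is incremented on every invocation of
// `fn`, so the caller can observe how much work was actually done.
std::vector<ToyVector> transform_shared(
    const std::vector<ToyVector>& inputs,
    const std::function<std::string(const std::string&)>& fn,
    std::size_t& calls)
{
    // Maps an original leaf to its already-transformed counterpart.
    std::unordered_map<const Leaf*, LeafPtr> cache;
    std::vector<ToyVector> outputs;
    for (const auto& vec : inputs) {
        ToyVector out;
        for (const auto& leaf : vec) {
            assert(leaf != nullptr);
            auto it = cache.find(leaf.get());
            if (it == cache.end()) {
                // First time this leaf is seen: transform its elements.
                Leaf transformed;
                for (const auto& s : *leaf) {
                    transformed.push_back(fn(s));
                    ++calls;
                }
                it = cache.emplace(
                    leaf.get(),
                    std::make_shared<Leaf>(std::move(transformed))).first;
            }
            // Reuse the transformed leaf, so sharing carries over.
            out.push_back(it->second);
        }
        outputs.push_back(std::move(out));
    }
    return outputs;
}
```

If two input vectors share a leaf, the corresponding output vectors share the transformed leaf as well, and fn is called only once for each element of that leaf.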
To solve this problem, this library introduces the notion of a pool.
A pool represents a set of immer containers of a specific type. For example, we may have a pool that contains all the immer::vector<int> of our document. You can think of it as a small database of immer containers. When serializing the pool, the internal structure of all those immer containers is written as a whole, preserving the structural sharing between those containers.
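To make the "small database" picture concrete, here is a minimal sketch of a pool in plain C++, without immer. ToyPool and make_pool are hypothetical names, and the layout is an assumption chosen for illustration: every distinct leaf is stored once under a numeric ID, and each container is described only by the IDs of its leaves. The format that immer::persist actually writes may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-ins for internal nodes and persistent vectors.
using Leaf = std::vector<int>;
using LeafPtr = std::shared_ptr<const Leaf>;
using ToyVector = std::vector<LeafPtr>;

// Hypothetical pool: each distinct leaf is stored exactly once, and each
// container is a list of leaf IDs (indices into `leaves`).
struct ToyPool {
    std::vector<Leaf> leaves;
    std::vector<std::vector<std::size_t>> containers;
};

ToyPool make_pool(const std::vector<ToyVector>& vectors)
{
    ToyPool pool;
    std::unordered_map<const Leaf*, std::size_t> ids; // leaf address -> ID
    for (const auto& vec : vectors) {
        std::vector<std::size_t> refs;
        for (const auto& leaf : vec) {
            assert(leaf != nullptr);
            // Try to register the leaf under the next free ID.
            auto result = ids.emplace(leaf.get(), pool.leaves.size());
            if (result.second) {
                // Newly registered: store its contents once.
                pool.leaves.push_back(*leaf);
            }
            refs.push_back(result.first->second);
        }
        pool.containers.push_back(std::move(refs));
    }
    return pool;
}
```

Writing such a pool to disk stores a shared leaf once, and two containers that refer to the same ID can be rebuilt so that they share that leaf again.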
Note that for the most part, the user of the library is not concerned with pools, as they are generated automatically from your data structures. However, you may become aware of them in the JSON output or when transforming recursive data structures.