Implementing an heterogeneous dynamic array in C++
Introduction
heterogeneous
: consisting of dissimilar or diverse constituents
dynamic array
: a dynamic array is a random access, variable-size list data structure that allows elements to be added or removed. […] A simple dynamic array can be constructed by allocating an array of fixed-size, typically larger than the number of elements immediately required. The elements of the dynamic array are stored contiguously at the start of the underlying array, […]
Heterogeneous containers are data structures wherein elements can be of different types. struct
and std::tuple
are example of heterogeneous containers in C++. In contrast, arrays provided by the language or the standard library are homogeneous containers, their elements can only be of the same type. std::array
and std::vector
are examples of static array and dynamic array implementations, respectively.
In this post, I will outline the implementation of an heterogeneous dynamic array in C++, which satisfies the API below:
|
|
and resulting in the following memory layout:
Motivation
There are regularly questions or posts popping-up about the existence of an heterogenous container in C++.
- A true heterogeneous container in C++
- c++ heterogeneous container, get entry as type
- Who else would like to see multi-type lists?
- Proof of concept runtime heterogeneous containers
- Multi type vectors
- Is heterogeneous array possible ?
- C++ Heterogeneous list
- How do I create heterogeneous containers in c++? I want to have different types of elements in the container.
- Etc
Answers are often centered around the use of vector<any>
or vector<variant>
. However, they each have shortcomings:
- With
vector<any>
solution, any introduces an indirection for each inserted object (vector→any→type) and inserted objects are not stored contiguously, barring special cases (custom allocator). - With
vector<variant>
solution, variant requires the user to provide a list of types known at compile-time. Moreover, memory is inherently wasted because the size of a variant is the size of the largest possible type it can hold, plus some extra for the type identifier.
In “A true heterogeneous container in C++”, the container relies on a static member variable to store/retrieve the inserted/requested objects, using the memory address of the container itself as a key in a map:
|
|
The API is pleasant, but only instances of a same type are stored contiguously, which favors fast per-type linear iteration. As noted by its author, it has some drawbacks making it ill-suited beyond an experimental usage.
The layout provided by these solutions is
I was left bugged by these solutions, so I decided to explore the feasibility of having a class that has the semantics of an heterogeneous vector/a structure populated at runtime. Basically, a contiguous piece of memory (one indirection) where any object can be inserted one after another. I used this exercise as a good opportunity to experiment with C++17 meta-programming features and low-level memory manipulation. Beware, knowledge of variadic templates and fold-expressions is highly recommended in the following. Let’s now see that together !
Principle
For a bit, there were only bytes
We will have to manipulate and manage directly the memory in order to perform our goal. Since we want the least amount of indirection while remaining extensible, the best is to store everything inside an array of bytes on the heap. This member will hold the actual instances of our objects.
|
|
Know your style
With this single member, only the caller who inserted the object would know what is where. Type information is lost and our heterogenous array is far from being self-contained. Indeed, there would be no way to access its content solely with a reference to the container. Thus, we add a member that keeps track of what is where.
|
|
Values represented by typeid_t
uniquely identify a type at runtime and those represented by offset_t
give the memory offset where the instance of a type is located in data
.
Since C++14, it is possible to have a unique value assigned to a type easily thanks to a variable template. The implementation holds in a few lines of code:
|
|
In order to get, you must give
We now have the minimum necessary members to implement insertion into the container. It consists of allocating needed memory first, then constructing the object within that space and finally doing some bookkeeping (record what was stored where).
|
|
Allocation is the most involved task. It must compute the new size needed to hold the desired content, resize accordingly, while computing the position where each object should be constructed. The easy but wrong solution would be to resize data
solely with the additional storage needed by each inserted type:
|
|
Repeat after me: padding is a thing, padding is a thing, padding is a thing. It also works with alignment my torment 🎵 depending on your history with the topic. In brief, objects cannot reside wherever they want in memory, and their memory addresses are restricted to be dividable by some values (a power of 2). For example, std::uint16_t
has a size and alignment of 2 bytes, then the least significant bit in its memory address will always be 0 (i.e. memory address is an even number).
In general, the alignment and padding within user-defined structures are automatically set by the compiler. It ensures that each structure members will be properly aligned in regular scenarios. However, we cannot rely here on the compiler to do it because objects are constructed in a piece of memory only appearing as an array of bytes. Hence, we have to compute the padding for each object ourselves and take them into account when resizing the container.
|
|
You may raise a serious warning here: “ok, newly inserted objects are properly padded with respect to current end address (ptr_end
), but resizing the vector may provoke a reallocation, thus destroying the memory alignment since we have no guarantee that the vector would put everything at a same memory ending!” *catch up breath* Fine observation, you are right !
Obtaining properly aligned memory manually could be achieved, accepting a waste of alignment value bytes. New facilities were added in C++17 to deal with alignment in dynamic memory allocation. For example, the alignment value can be explicitly provided during dynamic memory allocation via additional overload to new
. Alas, this is not yet (ever?) portable due to how each OS deals with aligned memory allocation. To keep it simple, we make use of the Boost.Align library, which provides an allocator that can force our array of bytes to always be aligned on a specific value.
|
|
Consequently, every type whose alignment is lower or equal than max_alignment
is guaranteed to be properly aligned within the buffer, even after memory relocation. For security, we could add the following check to guarantee no type inserted in the container has a higher alignment than the arbitrary chosen one.
|
|
Now that allocation is finally correctly done, we can construct our instance within the resized data
. For this purpose, in-place constructor suffices. Phew !
|
|
You shall have the aptitude to get
The procedure to retrieve a reference from the container is conceptually straightforward. We lookup where the type is stored and cast the bytes starting from that address as a reference to the desired type, which we know is there. This implies the use of reinterpret_cast
. Unfortunately, C++ is quite unforgiving when one attempts to mess the type system. If the reference is a structure/class which has some members, any end-user that attempts to access a member would trigger Undefined Behavior, which conveniently rhymes with Unexpected Bug.
From
reinterpret_cast
Performing a class member access that designates a non-static data member or a non-static member function on a glvalue that does not actually designate an object of the appropriate type - such as one obtained through a reinterpret_cast - results in undefined behavior:
A way to get around this safely was introduced in C++17 with std::launder
. One of the example given on cppreference fits exactly our situation:
|
|
Finally, the method for accessing the content of the container is:
|
|
Neither auto&
or auto&&
would be warning/error-free because the method either returns a reference (T&
) or a value (std::tuple<T&...>
). As a result, the more obscure decltype(auto)
is necessary, which infers the proper return type.
No leeks in my cup of tea
For a basic implementation to be minimally correct, we must properly resolve the destruction of the container. If it were left with a default destructor, any object in the container that allocated on the heap would leak its memory. Once again, this is the result of our manual management of the memory. Since the compiler only sees an array of bytes and not the underlying types, it will not call the destructors of inserted objects. As a result, we must deal with them explicitly:
|
|
EDIT: UB strikes back
Performing a simple resize of data
during allocation, which could trigger allocation of a new buffer, is actually Undefined Behavior because no object is actually “exists” in the new memory location. On Windows, it works without flinching for any types and will act as if the object was in that new memory location, but not on Linux, where storing for example std::string
can send the program to oblivion. One possibility chosen here is to restrict the usage to types that are not breaking, e.g. std::vector
of any type is ok. Otherwise, memory allocation must be modified so that move constructor is called for each already present type, which means it must also be stored somewhere:
|
|
Conclusion
A minimal working implementation of an heterogeneous dynamic array has been described and it holds in less than a 100 lines of code ! 😮 See for yourself on Compiler Explorer or at the end of this page. The main restriction of the current implementation is that it only accepts a single instance of each type. Furthermore, several features could be added on top of it to make it more efficient and user-friendly:
- Copy/move construction/assignment of the container
- Insertion that does not require the type to be move/copy constructible, à la emplace
- Reduce memory waste when inserting several types in a single call
- Avoid storing empty types
- Avoid storing trivial destructors
- Provide additional methods for retrieving offsets type identifier, or checking presence
In a follow-up post, the performance of the implementation will be compared to some alternatives.
Acknowledgment
Thanks Rasti for the diligent review !
Code
|
|