This is really clever but better to call this a list rather than an array; functions which expect array semantics will simply not work, and there's no way to transparently pass slices of this data structure around.
In the past I've abused virtual memory systems to block off a bunch of pages after my array. This lets you use an array data structure, have guard pages to prevent out of bounds access, and to have stable pointers in the data structure.
zokier · 2h ago
> Today’s computers use only 48 bits of the 64 bits in a pointer
I don't know the precise details of how deques are implemented in C++, but given the most popular stack overflow explanation of them, some immidiate pitfalls are that the T* map itself sounds unbounded and if each chunk allocates only a fixed constant size it's probably horrible for fragmentation or overallocation. The indexing also seems dependent on division.
With this power of twos approach you can't really truly delete from the front of the array but the amount of pointers you store is constant and the memory fragmentation is better. (Though OP never claimed to want to support deque behavior, it shouldn't be that hard to modify, though indexing seems like it has to go thru more arithmetic again)
I haven't used OP's array, but I have been bit plenty of times with std::deque's memory allocation patterns and had to rewrite with raw arrays and pointer tracking.
cornstalks · 1h ago
What kind of setups use over 256 TiB of RAM?
TapamN · 1h ago
It not necessarily physical RAM. If you memmap large files, like maybe a large file from RAID or network share, you could still need that much virtual address space.
bonzini · 1h ago
In practice it's over 64 TiB because kernels often use a quarter of the available addressing space (half of the kernel addressing space) to map the physical addresses (e.g. FFFFC000_12345678 maps physical address 0x12345678). So 48 virtual address bits can be used with up to 2^46 bytes of RAM.
reorder9695 · 1h ago
"640k ought to be enough for anybody"
mwkaufma · 1h ago
In principle it's not that different that deque, though:
(1) deque uses fixed-sized blocks, not increasing-size blocks.
(2) dequeue supports prepending, which adds another level of indirection internally.
sigbottle · 1h ago
You can support prepending by mirroring the allocations, probably? eg for the "negative index" case do an exponential thing in the other direction.
Your indexing has some legitimate math to be done now which can be annoying (efficiency perspective) I think you can still get o(1) with careful allocation of powers of 2.
o11c · 1h ago
That's fine if you only ever add, but is likely to fail if you pop FIFO-style. This too is ultimately fixable but means we can no longer assume "every bucket size doubles".
That said, IMO "stable pointers" is overrated; "minimize copying" is all that's useful.
sestep · 1h ago
Does std::deque support random access?
mwkaufma · 1h ago
Yes, you can see operator[] in the linked reference docs.
unwind · 1h ago
Very nice, although I think the level of "trickery" with the macros becomes a bit much. I do understand that is The Way in C (I've written C for 30 years), it's just not something I'd do very often.
Also, from a strictly prose point of view, isn't it strange that the `clz` instruction doesn't actually appear in the 10-instruction disassembly of the indexing function? It feels like it was optimized out by the compiler perhaps due to the index being compile-time known or something, but after the setup and explanation that was a bit jarring to me.
mananaysiempre · 59m ago
The POSIX name for the function is clz() [the C23 name is stdc_leading_zeros(), because that's how the committee names things now, while the GCC intrinsic is __builtin_clz()]. The name of the x86 instruction, on the other hand, is BSR (80386+) or LZCNT (Nehalem+, K10+) depending on what semantics you want on zero input (take care that early implementations of BSF/BSR are very slow and take time proportional to the output value). The compiled code uses BSR. (All of these are specified slightly differently, take care if you plan to actually use them.)
variadix · 1h ago
You can also use virtual memory for a stable resizable vector implementation, up to some max length based on how much you virtual memory you reserve initially, then commit as required to grow the physical capacity.
loeg · 1h ago
Yeah, with less runtime overhead, so long as you're ok with the ~4kB minimum allocation size.
fyrn_ · 37m ago
This mention this alturnative in the article, and also point out how it does not work in embeded contexts or with WASM
pfg_ · 42m ago
Zig has this as std.SegmentedList, but it can resize the segment array dynamically
o11c · 1h ago
Can we really call it an array if it's not contiguous (or at least strided)? Only a small fraction of APIs take an `iovec, iovcnt`-equivalent ...
jandrese · 1h ago
Yeah, the limitation that it can't be just dumped into anything that expects a C array is a large one. You need to structure your code around the access primitives this project implements.
dhooper · 1h ago
feel free to call it a "levelwise-allocated pile"
tovej · 1h ago
Very nice! I do wonder if it would be useful to be able to skip even more smaller segments, maybe a ctor argument for the minimum segment size. Or maybe some housekeeping functions to collapse the smallest segments into one.
Mostly the thing that feels strange is when using say, n > 10 segments, then the smallest segment will be less than a thousandth of the largest, and iterating over the first half will access n-1 or n-2 segments, worse cache behaviour, while iterating over the second half will access 1 or two segments.
Seems like, in most cases, you would want to be able to collapse those earlier segments together.
In the past I've abused virtual memory systems to block off a bunch of pages after my array. This lets you use an array data structure, have guard pages to prevent out of bounds access, and to have stable pointers in the data structure.
https://en.wikipedia.org/wiki/Intel_5-level_paging introduced in Ice Lake 6 years ago.
But anyways, isn't this just variant of std::deque? https://en.cppreference.com/w/cpp/container/deque.html
With this power of twos approach you can't really truly delete from the front of the array but the amount of pointers you store is constant and the memory fragmentation is better. (Though OP never claimed to want to support deque behavior, it shouldn't be that hard to modify, though indexing seems like it has to go thru more arithmetic again)
I haven't used OP's array, but I have been bit plenty of times with std::deque's memory allocation patterns and had to rewrite with raw arrays and pointer tracking.
(1) deque uses fixed-sized blocks, not increasing-size blocks. (2) dequeue supports prepending, which adds another level of indirection internally.
Your indexing has some legitimate math to be done now which can be annoying (efficiency perspective) I think you can still get o(1) with careful allocation of powers of 2.
That said, IMO "stable pointers" is overrated; "minimize copying" is all that's useful.
Also, from a strictly prose point of view, isn't it strange that the `clz` instruction doesn't actually appear in the 10-instruction disassembly of the indexing function? It feels like it was optimized out by the compiler perhaps due to the index being compile-time known or something, but after the setup and explanation that was a bit jarring to me.
Mostly the thing that feels strange is when using say, n > 10 segments, then the smallest segment will be less than a thousandth of the largest, and iterating over the first half will access n-1 or n-2 segments, worse cache behaviour, while iterating over the second half will access 1 or two segments.
Seems like, in most cases, you would want to be able to collapse those earlier segments together.