Inside Boost.Container: comparing different deque implementations

Many C++ developers have used std::deque, and most assume all implementations are roughly equivalent. In practice, they differ quite a bit. The three major standard library implementations — libc++ (Clang/LLVM), libstdc++ (GCC), and Microsoft STL (Visual Studio) — each make different design choices about block sizes, iterator layout, map management, and memory reclamation. Boost.Container 1.90 adds a fourth option with a recently rewritten implementation that improves performance and reduces significantly the size of an empty deque.

The common architecture

All four implementations share the same high-level idea: a map (an array of pointers) pointing to fixed-size blocks, each holding a contiguous chunk of elements. This two-level indirection is what gives deque its characteristic properties — O(1) push at both ends without invalidating references. The differences are in how each library manages those blocks.

Internal data structure diagrams

The key differences lie in how each implementation stores its metadata: what the map looks like, how iterators track their position, and what block size strategy they use. The diagrams below show the internal layout of each implementation for 64-bit systems.

libstdc++ (GCC)

libstdc++ stores the valid range as a pair of fat iterators. Blocks are allocated centered in the map; when one side fills up, in-use blocks are slid to re-center.

libc++ (Clang/LLVM)

libc++ uses a __split_buffer for the map itself — effectively a deque of block pointers. Iterators are lean (2 pointers). The valid range is tracked by start offset + size.

MSVC STL (Microsoft)

MSVC treats the map as a circular buffer — no sliding needed. The 16-byte block size is quite small compared to other implementations, and is a legacy decision the team has said they plan to revisit at the next ABI break. Iterators are proxies back to the parent deque.

Boost.Container 1.90 deque (rewritten)

Boost was rewritten for version 1.90. The previous SGI-derived layout (10 words, fat iterators) was replaced with a 4-word structure using double-ended offsets (m_start_off / m_finish_off) into the block map, and 2-pointer iterators. The default block size was also increased to 1024 bytes on 64-bit.

`sizeof(deque<int>)` and dynamic allocation comparison

The in-memory footprint of an empty deque object varies quite a bit across implementations. On a 64-bit platform:

Implementation	sizeof (bytes)	Breakdown
Boost	32	m_map (8) + m_map_size (8) + m_start_off (8) + m_finish_off (8) = 32
MSVC STL	40	map ptr (8) + map_size (8) + offset (8) + size (8) + proxy (8) = 40
libc++	48	__split_buffer (4 ptrs = 32) + start offset (8) + size (8) = 48
libstdc++	80	map ptr (8) + map_size (8) + 2 iterators × 4 ptrs × 8 = 64 → total 80

Boost’s reimplemented deque comes in at 32 bytes — the smallest among the four implementations tested. The previous Boost layout was 80 bytes, the same as libstdc++. The reduction comes from replacing the two stored fat iterators with a pair of offsets (m_start_off / m_finish_off) that index into the block map.

Apart from this static size, implementations differ whether dynamic allocation is required when default constructing or move constructing a deque. It’s important to note that operations that allocate memory can’t be noexcept.

Implementation	Dynamic memory allocation and noexcept-ness on default or move construction
Boost	No allocation. noexcept constructors.
MSVC STL	16 byte allocation if _ITERATOR_DEBUG_LEVEL ==2 (typically in Debug). Constructors are not marked as noexcept.
libc++	No allocation. noexcept constructors.
libstdc++	One block (512 bytes) + minimal map (64 bytes). Constructors are not marked as noexcept.

Having noexcept default and move constructors can be important for those users wanting to replace vector members with deques. Implementations having potentially throwing constructors could require code changes (default and move constructors for user-defined types using deques could not longer be noexcept).

Key design trade-offs

Property	libstdc++	libc++	MSVC	Boost
Block bytes	512	4096	16	1024 (64-bit) 512 (32-bit)*
Initial map size	8	2	8	8
Map strategy	Center + slide	split_buffer	Circular buffer	Center + slide
Spare block policy	Free immediately	Keep 1 per end	Keep all	Free immediately
noexcept default/ move constructor	No	Yes	No	Yes
Dynamic allocation on default/move constructor	One block (512B) + minimal map (64B)	No	When debug iterators are enabled (e.g. Debug builds)	No
Iterator size	32 bytes	16 bytes	16 bytes	16 bytes
sizeof(deque)	80	48	40	32
Configurable	No	No	No	Yes

* Boost defaults; configurable via block_size<N> and block_bytes<N> deque options.

Benchmarks

The following deque implementations were measured:

libstdc++ 14.2 (GCC 14)
libc++ 18.1 (Clang 18)
Microsoft STL shipped with MSVC 19.40 (VS 2022 17.10)
Boost was compiled with GCC 14 in three block-size configurations (512B, 1K, 4K).

Tested were executed on x86-64 systems with deque<int>at N = 100,000 elements:

The range-insert tests use d.insert(pos, src.begin(), src.end()) with a pre-populated 128 element size source vector into a pre-populated 100K-element deque;
The “at back” variant inserts at d.end()
The “at position ¾” variant inserts at d.begin() + d.size()*3/4 128 elements, requiring the implementation to shift about 25K elements.

Notes:

Times are in nanoseconds per operation (or per inserted element where noted), lower is better.
Std implementation bars are ordered by block size –> smallest(MSVC, 16Byte) to biggest(4KByte, libc++).
Boost deque is tested with 3 block sizes (512B, 1KB (default), 4KB)

Analysis

Block size appears to be the most significant factor in these results. The three Boost configurations help illustrate this — as block size increases from 512B to 1K to 4K, performance improves consistently across all benchmarks. Note that a big block size increases the memory consumption of a deque with very few elements, big block sizes have also downsides. Generally speaking, deque is not a good container when the most common case is having a few elements in the container.

Comparing libstdc++ and Boost (512B) is useful because both use the same 512-byte blocks (128 ints each), which isolates the effect of the different internal algorithms. Boost shows a slight advantage here and the difference likely comes from the newly reimplemented segmented algorithms and (maybe) the smaller 32-byte deque object (vs libstdc++’s 80 bytes), which could mean fewer cache lines are touched when the deque metadata is accessed.

Range insertion at the back is where block size differences show up most clearly. MSVC’s per-element cost reflects the overhead of crossing 32 block boundaries (128 ints ÷ 4 ints per block) for every insert call. Boost (4K) and libc++, with their large blocks, can often absorb 128 ints (512 bytes) into a single contiguous region, bringing the cost down.

The near-back insertion benchmark (position ¾) adds a new dimension: the cost of shifting existing elements. When inserting at position ¾ of a 100K-element deque, implementations need to move roughly 25,000 elements (the shorter back quarter) to make room. This shift dominates the cost and is also sensitive to block size — larger blocks allow the shift to proceed as contiguous memmove operations within blocks rather than element-by-element across block boundaries.

It’s worth noting that MSVC’s circular map design has the advantage of avoiding block sliding, but the small block size leads to frequent cache misses and allocation overhead in bulk operations. The MSVC team has noted this is something they plan to address at the next ABI break.

Takeaways

Among the implementations tested, this shows that the object size reduction achieved by Boost has been compatible with an improved perfomance. When configured with 512-byte blocks to match libstdc++’s geometry, it shows pretty good performance whereas with the 4KB block option ( deque_options<block_bytes<4096>>), its throughput is comparable to libc++’s.

The main takeaway is that std::deque is not a single data structure — these four implementations make substantially different trade-offs, and the choice of block size alone can account for large performance differences in iteration and bulk insertion. It’s worth understanding which implementation you’re using and what its defaults are, especially if deque performance is relevant to your workload.

If you need a portable high-performance deque implementation that offers noexcept guarantees for essential constructors or you want to take advantage of a configurable block size, take a look at Boost.Container’s new deque implementation.

Responses

Howard Hinnant

April 7, 2026

Nice write up! Love the comparisons.

I’d love to see a “queue benchmark” comparison. That is, how does std::queue (uses deque under the covers) perform under a series of pushes and pops that keeps the average size() about the same over a time?

This might be done by alternating push and pop. Or it might be done by randomly calling push and pop with equal probability.

This benchmark is designed to highlight a potential strong point of deque. deque should be able to pull off this use pattern better than other std::containers.

Loading…

Ion Gaztañaga

April 7, 2026

Thanks Howard!

Certainly std::queue use cases would be interesting to benchmark. I expect the libc++ implementation will shine on some of them, as it keeps a spare block on each end, avoiding allocator calls in some cases.

I also noticed that I should add one additional data to the comparison table regarding the noexceptness of the default constructor, as it can make a difference if users want to achieve noexcept default constructors on types holding deques.

Loading…

1. Howard Hinnant
  
  April 8, 2026
  
  Here is a comparison I made about a decade ago on the noexcept issue. I don’t know if it is still accurate today or not.
  
  https://howardhinnant.github.io/container_summary.html
  
  Loading…
  
Ion Gaztañaga

April 8, 2026

Thanks Howard, I’ll update the article adding the noexcept info, it’s certainly relevant.

Loading…

Ion Gaztañaga

April 8, 2026

Hi, I reviewed a bit the default and move constructor guarantees for existing implementations (let’s hope there are no mistakes) and updated the article accordingly. Thanks again Howard!

Loading…