Solver Benchmark Results

Benchmarks for the IterativeDeepeningSolver comparing different StateTraversal parent chain strategies.

All benchmarks run with Criterion and allocation tracking via stats_alloc. The stats_alloc global allocator is active during benchmarks, so timings include its overhead — relative comparisons are valid but absolute numbers are slightly inflated vs production.

Test Cases

Case	Board	Target	Depth
1-step	`[1, 2, 3, 4, 5, 6]`	12	1
2-step	`[1, 4, 4, 5, 6, 50]`	350	2
3-step	`[1, 3, 3, 8, 9, 50]`	410	3
4-step	`[2, 3, 3, 5, 6, 75]`	277	4
5-step	`[1, 10, 25, 50, 75, 100]`	831	5
impossible	`[3, 7, 6, 2, 1, 7]`	824	all 6

Approach 1: Baseline (Box + deep clone)

Each child StateTraversal owns its parent chain via Box<StateTraversal<S>>. When generating children from a candidate, candidate.clone() deep-copies the entire parent chain for every child.

Timing

Case	Time
1-step	~39 us
2-step	~506 us
3-step	~1.1 ms
4-step	~5.7 ms
5-step	~24.3 ms
impossible	~8.3 ms

Allocations (per solve)

Case	Allocs	Bytes allocated	Bytes freed
1-step	862	72,228	72,000
2-step	11,608	830,932	830,648
3-step	26,102	1,704,096	1,703,760
4-step	135,791	7,622,888	7,622,504
5-step	613,880	30,021,272	30,020,844
impossible	185,776	10,179,728	10,179,728

Approach 2: Rc (shared ownership)

Replace Box<StateTraversal<S>> with Rc<StateTraversal<S>>. The popped candidate is wrapped in Rc::new() before child generation; each child receives Rc::clone() (cheap reference count increment) instead of a deep clone.

Timing

Case	Time	vs Baseline
1-step	~38 us	-3%
2-step	~490 us	-3%
3-step	~1.03 ms	-7%
4-step	~4.88 ms	-14%
5-step	~18.9 ms	-22%
impossible	~7.0 ms	-15%

Allocations (per solve)

Case	Allocs	Bytes allocated	Bytes freed	Alloc change vs Baseline
1-step	795	70,348	70,120	-8%
2-step	10,494	803,592	803,308	-3%
3-step	21,516	1,579,720	1,579,384	-7%
4-step	95,395	6,491,748	6,491,364	-15%
5-step	353,637	22,649,324	22,648,896	-25%
impossible	122,589	8,403,592	8,403,592	-17%

Approach 3: Arena (typed-arena) -- Winner

Replace individual heap allocations with a per-depth-iteration typed_arena::Arena. All StateTraversal nodes are bump-allocated into the arena; children reference parents via &'arena StateTraversal borrows. The entire arena is freed in one shot when the depth iteration ends.

Timing

Case	Time	vs Baseline	vs Rc
1-step	~38 us	-3%	~0%
2-step	~482 us	-5%	-2%
3-step	~1.02 ms	-7%	-1%
4-step	~4.79 ms	-16%	-2%
5-step	~18.5 ms	-24%	-2%
impossible	~6.9 ms	-17%	-2%

Allocations (per solve)

Case	Allocs	Bytes allocated	Bytes freed	Alloc change vs Baseline
1-step	780	70,540	70,312	-10%
2-step	10,098	809,080	808,796	-13%
3-step	20,407	1,600,264	1,599,928	-22%
4-step	87,871	6,542,836	6,542,452	-35%
5-step	313,503	22,701,036	22,700,608	-49%
impossible	111,762	8,367,144	8,367,144	-40%

Analysis

The arena approach wins on both timing and allocations. Key observations:

Allocation count reduction scales dramatically with depth — at 5-step depth, arena uses 49% fewer allocations than baseline. This is because the arena batches many small allocations into large chunks, and individual StateTraversal nodes no longer need their own heap allocation for the parent pointer.
Timing improvement over Rc is modest (~2%) — the main win (eliminating deep cloning) was already captured by Rc. The arena's additional advantage comes from cheaper allocation (bump pointer vs malloc) and cheaper deallocation (bulk free vs individual drop).
Bytes allocated is slightly higher than Rc for some cases — the arena pre-allocates chunks, so there is some unused capacity within each chunk. This is a space-time trade-off: slightly more memory reserved, but faster allocation and deallocation.
Code complexity is comparable to Rc — the arena version uses &'arena StateTraversal borrows instead of Rc<StateTraversal>. The arena is created per depth iteration and naturally scopes the lifetime of all nodes.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Solver Benchmark Results

Test Cases

Approach 1: Baseline (Box + deep clone)

Timing

Allocations (per solve)

Approach 2: Rc (shared ownership)

Timing

Allocations (per solve)

Approach 3: Arena (typed-arena) -- Winner

Timing

Allocations (per solve)

Analysis

FilesExpand file tree

BENCHMARK_RESULTS.md

Latest commit

History

BENCHMARK_RESULTS.md

File metadata and controls

Solver Benchmark Results

Test Cases

Approach 1: Baseline (Box + deep clone)

Timing

Allocations (per solve)

Approach 2: Rc (shared ownership)

Timing

Allocations (per solve)

Approach 3: Arena (typed-arena) -- Winner

Timing

Allocations (per solve)

Analysis