packedsets: simplify implementation using Table #1687

Draft
alaviss wants to merge 4 commits into nim-works:devel from alaviss:push-ktrysyywokxm

@alaviss alaviss commented Feb 22, 2026

Summary

Instead of using a custom ordered table implementation, convert packedsets to take advantage of Table and the built-in set[T]. This cuts down on the amount of code in the stdlib without compromising performance.

Details

This came out of the prior work on Roaring Bitmap, where I needed a few sample implementations to measure performance against. I found that the original PackedSet resembles a simple ordered table with clustered lower bits, and experimented with building a simplified version based on existing stdlib components.

In this new structure, keys are grouped by their top N - 8 bits and indexed via a Table, with the bottom 8 bits packed into a set[uint8]. Due to the change in internal structure, items() no longer yields keys in insertion order.
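
The layout described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in this PR: the names SketchSet, initSketchSet and splitKey are hypothetical, and the sketch assumes int keys with an arithmetic right shift.

```nim
import std/tables

type
  SketchSet = object
    # Hypothetical stand-in for the new layout, not the stdlib type:
    # upper bits of the key -> bitset holding the least-significant bytes.
    buckets: Table[int, set[uint8]]

proc initSketchSet(): SketchSet =
  SketchSet(buckets: initTable[int, set[uint8]]())

proc splitKey(key: int): (int, uint8) =
  # Upper N - 8 bits select the bucket, the low byte selects the bit.
  (key shr 8, uint8(key and 0xFF))

proc incl(s: var SketchSet, key: int) =
  let (hi, lo) = splitKey(key)
  # Create an empty bucket on first use, then set the bit for the low byte.
  incl(s.buckets.mgetOrPut(hi, default(set[uint8])), lo)

proc contains(s: SketchSet, key: int): bool =
  let (hi, lo) = splitKey(key)
  # A missing bucket behaves like an empty bitset.
  lo in s.buckets.getOrDefault(hi)

iterator items(s: SketchSet): int =
  # Buckets come out in the Table's iteration order, so keys are not
  # yielded in insertion order.
  for hi, bits in s.buckets:
    for lo in bits:
      yield (hi shl 8) or int(lo)

var demo = initSketchSet()
demo.incl(5)
demo.incl(300)
doAssert 5 in demo
doAssert 6 notin demo
```

Sequential keys share a bucket in this sketch, so adding or checking them touches a single Table entry and then a single bit.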

Performance metrics for nim c --hints:off --warnings:off -fc compiler/nim.nim (on the same compiler checkout):

| Implementation | Compiler runtime | Ratio | Improvement |
|---|---|---|---|
| Original | 10.469s ± 0.075s | 1.0 ± 0.007 | 0 ± 0.007 |
| Simplified | 10.114s ± 0.107s | 0.97 ± 0.011 | 0.03 ± 0.011 |

| Implementation | Compiler peak memory | Ratio | Improvement |
|---|---|---|---|
| Original | 1.137GiB | 1.0 | 0 |
| Simplified | 1.123GiB | 0.988 | 0.012 |

For the compiler, the new implementation's performance is very similar to the original's.

Notes for Reviewers

  • While the tests pass, the original suite isn't very comprehensive. Other compiler failures showed that the tests are not good enough. This stays a draft until tests and microbenchmarks are written.
  • I have not yet measured the performance of individual set operations in detail. The compiler mostly uses incl and contains and rarely employs the mathematical set operations.
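
For reference, the mathematical operations mentioned above would reduce to per-bucket bitset operations in this kind of layout. The following is a minimal sketch under that assumption; SketchSet, initSketchSet, union and intersection here are hypothetical illustrations, not the procs in this PR, and their performance is exactly what remains unmeasured.

```nim
import std/tables

type
  SketchSet = object
    # Hypothetical layout: upper bits of the key -> bitset of low bytes.
    buckets: Table[int, set[uint8]]

proc initSketchSet(): SketchSet =
  SketchSet(buckets: initTable[int, set[uint8]]())

proc incl(s: var SketchSet, key: int) =
  incl(s.buckets.mgetOrPut(key shr 8, default(set[uint8])),
       uint8(key and 0xFF))

proc contains(s: SketchSet, key: int): bool =
  uint8(key and 0xFF) in s.buckets.getOrDefault(key shr 8)

proc union(a, b: SketchSet): SketchSet =
  # Start from a copy of `a`, then merge each bucket of `b` with `+`.
  result = a
  for hi, bits in b.buckets:
    result.buckets[hi] = result.buckets.getOrDefault(hi) + bits

proc intersection(a, b: SketchSet): SketchSet =
  result = initSketchSet()
  for hi, bits in a.buckets:
    # `*` is the built-in set intersection; skip buckets that end up empty.
    let common = bits * b.buckets.getOrDefault(hi)
    if common.card > 0:
      result.buckets[hi] = common
```

A microbenchmark for these operations would likely need to vary how densely keys cluster within a bucket, since the cost here scales with the number of buckets rather than the number of keys.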

@alaviss alaviss added the refactor, stdlib, and simplification labels Feb 22, 2026
The new implementation packs ordinals by clustering their
least-significant byte in a bitset. This optimizes the set for the
common case of checking in sequential keys.

The default equivalence check does not always come through, so
implement our own.

Also fix the wrong set comparison operator used for comparing words.

I forgot to preserve the elements in the LHS...

They already don't, but add some try-except so the compiler doesn't
complain.
@alaviss alaviss marked this pull request as draft February 22, 2026 02:40