fix: replace per-token String with byte range #83

Merged
eikopf merged 2 commits into main from
fix/16-token-string-allocation
Mar 14, 2026

@eikopf eikopf commented Mar 14, 2026

Summary

  • Changed Token struct to store span: Range<usize> instead of text: String, eliminating one heap allocation per token during lexing
  • Added Token::text(&self, source: &str) -> &str method to derive token text on demand by slicing the original source string
  • Updated Parser to carry the source string and use tok.text(&self.source) instead of &tok.text
  • Removed the manually-tracked offset field from Parser; byte offsets are now read directly from token spans
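The span-based layout described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `TokenKind`, its variants, and the whitespace-splitting `lex` function are hypothetical stand-ins, and only the `span: Range<usize>` field and the `text` accessor come from the summary. The explicit lifetime on `text` is an assumption about how the signature has to be written so the returned slice borrows from `source` rather than from the token.

```rust
use std::ops::Range;

/// Hypothetical token kinds; the real lexer's kinds are not shown in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenKind {
    Word,
    Whitespace,
}

/// A token that records where it sits in the source instead of owning its text.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Token {
    kind: TokenKind,
    /// Byte offsets into the original source string.
    span: Range<usize>,
}

impl Token {
    /// Derive the token text on demand by slicing the source.
    /// The result borrows from `source`, so no allocation happens here.
    fn text<'a>(&self, source: &'a str) -> &'a str {
        &source[self.span.clone()]
    }
}

/// Hypothetical lexer: splits the input into alternating runs of
/// whitespace and non-whitespace, recording only byte ranges.
fn lex(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = source.char_indices().peekable();

    while let Some(&(start, first)) = chars.peek() {
        let is_ws = first.is_whitespace();
        let mut end = start;

        // Consume characters while they belong to the same class.
        while let Some(&(idx, ch)) = chars.peek() {
            if ch.is_whitespace() != is_ws {
                break;
            }
            end = idx + ch.len_utf8();
            chars.next();
        }

        let kind = if is_ws { TokenKind::Whitespace } else { TokenKind::Word };
        tokens.push(Token { kind, span: start..end });
    }

    tokens
}
```

Calling `lex("let x")` in this sketch yields three tokens with spans `0..3`, `3..4`, and `4..5`. The only allocation is the `Vec` holding the tokens, and each token's text is recovered with `tok.text(source)`.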

Test plan

  • All 566 existing tests pass (cargo test --workspace)
  • Lossless round-trip property preserved (token text derived from source matches original)
  • Parser error ranges still correct (using tok.span.clone() instead of manual offset tracking)
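The two properties in the test plan can be illustrated with a small sketch, assuming a simplified `Token` that holds only a span. `ParseError`, `Parser::expect`, and `round_trip` are hypothetical names chosen for this example and are not taken from the repository. The parser owns a copy of the source, as the summary describes, and reports error ranges by cloning the offending token's span.

```rust
use std::ops::Range;

/// Simplified token: only the byte span, as introduced by this PR.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Token {
    span: Range<usize>,
}

impl Token {
    fn text<'a>(&self, source: &'a str) -> &'a str {
        &source[self.span.clone()]
    }
}

/// Hypothetical error type carrying the byte range of the problem.
#[derive(Debug, PartialEq, Eq)]
struct ParseError {
    message: String,
    range: Range<usize>,
}

/// Hypothetical parser that carries the source alongside the tokens.
struct Parser {
    source: String,
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn new(source: &str, tokens: Vec<Token>) -> Self {
        Parser { source: source.to_owned(), tokens, pos: 0 }
    }

    /// Consume the next token if its text equals `expected`.
    /// On mismatch, the error range is the token's own span,
    /// so no separate offset counter is needed.
    fn expect(&mut self, expected: &str) -> Result<(), ParseError> {
        match self.tokens.get(self.pos) {
            Some(tok) if tok.text(&self.source) == expected => {
                self.pos += 1;
                Ok(())
            }
            Some(tok) => Err(ParseError {
                message: format!(
                    "expected `{}`, found `{}`",
                    expected,
                    tok.text(&self.source)
                ),
                range: tok.span.clone(),
            }),
            None => Err(ParseError {
                message: format!("expected `{}`, found end of input", expected),
                range: self.source.len()..self.source.len(),
            }),
        }
    }
}

/// Lossless round trip: concatenating every token's text rebuilds the source,
/// provided the spans are contiguous and cover the whole input.
fn round_trip(source: &str, tokens: &[Token]) -> String {
    tokens.iter().map(|t| t.text(source)).collect()
}
```

With tokens covering every byte of the input in order, `round_trip(source, &tokens)` equals `source`. A failed `expect` returns a range that can be used directly to slice or underline the offending text.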

Closes #16

🤖 Generated with Claude Code

eikopf and others added 2 commits March 14, 2026 13:46
Token now stores a Range<usize> span instead of an owned String,
eliminating one heap allocation per token during lexing. The token
text is derived on demand by slicing the original source string.

Closes #16

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@eikopf eikopf merged commit 6196513 into main Mar 14, 2026
4 checks passed
@eikopf eikopf deleted the fix/16-token-string-allocation branch March 14, 2026 12:49
Development

Successfully merging this pull request may close these issues.

Token struct allocates a String per token