perf: eliminate redundant tokenization and binary search allocation churn#29
Conversation
…hurn (#28)

- B4: Add `known_token_count` parameter to `truncate_to_token_budget` so the cascade loop can pass its already-computed count, skipping the redundant fast-path tokenization of the full text.
- B5: Replace per-iteration `lines[..mid].join("\n")` in the binary search with a single pre-joined string and byte-offset index. Each iteration now slices from the joined string instead of doing O(mid) allocation work, reducing total allocation from O(N log N) to O(N).

Also adds 3 new unit tests for the `known_token_count` parameter and a dedicated criterion benchmark group for truncation at 100-5000 lines.
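The B5 pre-join strategy described above can be sketched as follows. This is a hedged, self-contained illustration: `truncate_lines_to_budget` and the whitespace-based token counter are hypothetical stand-ins, not rskim-core's actual implementation.

```rust
// Sketch of the pre-join + byte-offset approach (B5). Instead of allocating
// lines[..mid].join("\n") on every binary-search iteration, join once and
// record where each line ends; every probe is then a zero-allocation slice.
fn truncate_lines_to_budget(
    text: &str,
    budget: usize,
    count_tokens: impl Fn(&str) -> usize,
) -> String {
    let lines: Vec<&str> = text.lines().collect();
    let joined = lines.join("\n");
    // ends[i] = byte offset just past line i in `joined`
    let mut ends = Vec::with_capacity(lines.len());
    let mut off = 0usize;
    for (i, line) in lines.iter().enumerate() {
        off += line.len();
        ends.push(off);
        if i + 1 < lines.len() {
            off += 1; // account for the '\n' separator
        }
    }
    // Binary search for the largest prefix of lines within the budget.
    let (mut lo, mut hi) = (0usize, lines.len());
    while lo < hi {
        let mid = (lo + hi + 1) / 2;
        // Zero-allocation slice of the pre-joined string, instead of
        // re-joining lines[..mid] on every iteration.
        let prefix = &joined[..ends[mid - 1]];
        if count_tokens(prefix) <= budget {
            lo = mid;
        } else {
            hi = mid - 1;
        }
    }
    if lo == 0 { String::new() } else { joined[..ends[lo - 1]].to_string() }
}
```

Only the single `join` and the final `to_string` allocate, which is where the O(N log N) → O(N) claim comes from.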
```rust
    language: Language,
    token_budget: usize,
    count_tokens: F,
    known_token_count: Option<usize>,
```
Issue: Public API signature change breaks backward compatibility
The function `truncate_to_token_budget` is exported publicly via `pub use` in lib.rs (line 41). Adding this `known_token_count: Option<usize>` parameter changes the signature from 4 to 5 arguments, which is a breaking change for external consumers of rskim-core on crates.io.
Recommended fix (choose one):
Option A - Backward-compatible wrapper (preferred):
Keep the 4-param public API and create a new 5-param variant `truncate_to_token_budget_with_hint()`. Have the 4-param version call the 5-param with `None`.
Option B - Make it internal:
If external consumers aren't intended, reduce visibility to `pub(crate)` and remove the `pub use` from lib.rs:41.
Since rskim-core is published on crates.io, this needs to be addressed before merge.
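A minimal sketch of Option A, with the `Language` enum, the return type, and the truncation body stubbed out; the real rskim-core signatures and generic bounds may differ, and the real functions would be `pub`:

```rust
// Placeholder enum standing in for rskim-core's Language type.
enum Language { TypeScript }

// New 5-param variant carrying the optional pre-computed count.
// (Would be `pub` in the real crate.)
fn truncate_to_token_budget_with_hint<F: Fn(&str) -> usize>(
    text: &str,
    _language: Language,
    token_budget: usize,
    count_tokens: F,
    known_token_count: Option<usize>,
) -> String {
    // Fast path: use the caller's hint instead of re-tokenizing the full text.
    let total = known_token_count.unwrap_or_else(|| count_tokens(text));
    if total <= token_budget {
        return text.to_string();
    }
    // ... binary-search truncation would go here; elided for the sketch.
    String::new()
}

// Existing 4-param public API keeps its signature and delegates with None,
// so external consumers on crates.io are unaffected.
fn truncate_to_token_budget<F: Fn(&str) -> usize>(
    text: &str,
    language: Language,
    token_budget: usize,
    count_tokens: F,
) -> String {
    truncate_to_token_budget_with_hint(text, language, token_budget, count_tokens, None)
}
```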
Reviewed by: Architecture & Regression (HIGH, 90% confidence)
```rust
    Language::TypeScript,
    *budget,
    word_count,
    None,
```
Issue: Benchmark doesn't test the Some(count) optimization path (B4)
The B4 optimization (skip redundant tokenization) only executes when `known_token_count` is `Some(count)`, but this benchmark always passes `None`. This means:
- The benchmark only measures B5 (pre-join allocation savings), not B4
- The B4 optimization has no regression guard
- Future changes that break the `Some` fast-path won't be caught
Recommended fix: Add a second benchmark variant that passes `Some(total)` to measure the B4 improvement:

```rust
let total = word_count(&text);
// Add a new benchmark group exercising the Some(total) path:
group.bench_with_input(
    BenchmarkId::new("known/lines", num_lines),
    &(text, budget, total),
    |b, (input, budget, total)| {
        b.iter(|| {
            truncate_to_token_budget(
                black_box(input), Language::TypeScript,
                *budget, word_count, Some(*total),
            ).unwrap()
        })
    },
);
```

This provides complete regression coverage for both B4 and B5 optimizations.
Reviewed by: Tests & Performance (MEDIUM, 85% confidence)
```rust
}

#[test]
fn test_token_budget_known_count_returns_early_when_within_budget() {
```
Issue: Test doesn't verify fast-path execution
The test `test_token_budget_known_count_returns_early_when_within_budget` asserts the return value is correct, but it doesn't verify that `count_tokens` was actually skipped when `known_token_count` is `Some(count)`. The test passes `word_count` as the counter, which would return the same result regardless of whether it was called.
This means if someone broke the B4 optimization (e.g., removed the `unwrap_or_else` short-circuit), this test would still pass.
Recommended fix: Use a "poison" counter that proves the fast-path was taken:

```rust
#[test]
fn test_token_budget_known_count_returns_early_when_within_budget() {
    let text = "line one\nline two\nline three\n";
    // Use a poison counter that panics if called on the full text
    let poison_counter = |s: &str| -> usize {
        if s == text {
            panic!("count_tokens should NOT be called when known_token_count is Some");
        }
        s.split_whitespace().count()
    };
    let result = truncate_to_token_budget(
        text, Language::TypeScript, 100, poison_counter, Some(5)
    ).unwrap();
    assert_eq!(result, text, "Fast-path should return text unchanged");
}
```

If the fast-path is working, the poison counter is never called. If it's broken, the test panics.
Reviewed by: Tests (MEDIUM, 85% confidence)
| { | ||
| // Fast path: if text already fits, return unchanged | ||
| if count_tokens(text) <= token_budget { | ||
| // Fast path: if text already fits, return unchanged (B4: skip recount |
Issue: Inline ticket references (B4, B5) deviate from codebase style
Comments like "// B4: skip recount" and similar ticket shorthand use codes that won't be meaningful to future contributors. The rest of the codebase uses descriptive prose without ticket codes (e.g., "// Fast path: if text already fits, return unchanged").
This also appears at line 376 ("// B5: Pre-join...") and in test headers ("// known_token_count tests (B4)").
Recommended fix: Replace with descriptive comments that explain the optimization without external context.

Line 353:

```rust
// Fast path: when caller provides a known token count from the cascade loop,
// skip redundant full-text tokenization and use the pre-computed count instead.
```

Line 376:

```rust
// Pre-compute line byte-offsets once to avoid allocating O(N log N) intermediate
// strings during binary search. Each iteration slices from the pre-joined string.
```

Test section header:

```rust
// Tests for known_token_count parameter optimization
```

This keeps the codebase consistent and makes the intent clear without requiring external knowledge of the optimization tickets.
Reviewed by: Consistency (MEDIUM, 80% confidence)
Code Review Summary: PR #29
Branch: perf/28-token-budget-truncation → main

Overview

This PR introduces two targeted performance optimizations to token budget truncation:
Inline Comments Posted: 4 high-confidence blocking issues (≥80% confidence)

Findings by Severity

BLOCKING ISSUES (Address Before Merge)
Action: All 4 issues have detailed inline comments with recommended fixes.

Lower-Confidence Findings (60-79%) → Suggestions

1.
| Category | CRITICAL | HIGH | MEDIUM | LOW | Total |
|---|---|---|---|---|---|
| Blocking (≥80%) | 0 | 1 | 3 | 0 | 4 |
| Should Fix (60-79%) | 0 | 0 | 4 | 0 | 4 |
| Pre-existing | 0 | 0 | 2 | 4 | 6 |
Review Scores
| Reviewer | Score | Recommendation |
|---|---|---|
| Architecture | 7/10 | CHANGES_REQUESTED |
| Rust | 8/10 | APPROVED |
| Performance | 7/10 | APPROVED_WITH_CONDITIONS |
| Tests | 7/10 | APPROVED_WITH_CONDITIONS |
| Regression | 8/10 | APPROVED_WITH_CONDITIONS |
| Security | 9/10 | APPROVED_WITH_CONDITIONS |
| Complexity | 8/10 | APPROVED |
| Consistency | 8/10 | APPROVED_WITH_CONDITIONS |
Overall: Strong optimizations with good test coverage. Primary blocker is the public API breaking change, which has two straightforward solutions (backward-compatible wrapper or reduce visibility).
What This PR Does Well
✅ Well-motivated optimizations (skip redundant tokenization, eliminate allocation churn)
✅ All 103+ tests pass; 3 new tests for known_token_count parameter
✅ Property test verifies None and Some(actual) produce identical output
✅ Benchmark added to track B5 optimization (though missing B4 variant)
✅ Commit message accurately describes both optimizations (B4, B5 labels in code)
✅ All in-repo call sites updated correctly
✅ No unsafe Rust; no new dependencies; no security issues
Recommendation
Status: APPROVED_WITH_CHANGES
Conditions:
- Address the 4 blocking inline comments before merge
- Optionally implement 4 "should fix" suggestions (token counting guards, complexity reduction, etc.)
The optimizations are sound and well-tested. The blocking issues are straightforward to address and do not indicate deeper problems with the implementation.
Review consolidation by Claude Code | Deduplicated 8 review reports | 4 inline comments + summary
- Change `last_token_count` from `usize` to `Option<usize>` for type safety (#6)
- Add `debug_assert!` to validate `known_token_count` matches actual (#2)
- Replace B4/B5 ticket shorthand with descriptive comments (#5)
- Strengthen fast-path test with call-counting poison counter (#3)
- Add `lines_known` benchmark variant exercising `Some(total)` path (#4)
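The `debug_assert!` item in the list above could take roughly this shape; `checked_count` is an illustrative helper invented for this sketch, not the crate's actual code:

```rust
// Hypothetical guard: trust the caller's hint in release builds, but verify
// it against a fresh count in debug builds to catch stale hints early.
fn checked_count(
    text: &str,
    count_tokens: impl Fn(&str) -> usize,
    hint: Option<usize>,
) -> usize {
    match hint {
        Some(n) => {
            debug_assert_eq!(
                n,
                count_tokens(text),
                "known_token_count hint disagrees with actual count"
            );
            n
        }
        None => count_tokens(text),
    }
}
```

In release builds the `debug_assert_eq!` compiles away, so the B4 fast path keeps its zero-cost property.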
Summary
Fixes the two performance issues deferred from #27 to tech debt (#28):

- `truncate_to_token_budget` now accepts `known_token_count: Option<usize>`, eliminating the redundant fast-path tokenization when the cascade loop already computed the count
- Replaces per-iteration `lines[..mid].join("\n")` with a single pre-joined string and byte-offset index, reducing allocation work from O(N log N) to O(N)

Also adds 3 new unit tests for the `known_token_count` parameter and a dedicated criterion benchmark group (`token_budget_truncation`) at 100–5000 line scales.

Closes B4 and B5 from #28. Remaining items (B6, B8, S2, S5, S6, S13) are unchanged.
Test plan

- `cargo test` — 151 existing + 3 new
- `cargo clippy -- -D warnings` clean
- `test_token_budget_known_count_none_behaves_like_before` — property test: `None` and `Some(actual)` produce identical output for budgets 1..20