add token_counting_strategy override for provider-aware token counting by adilhafeez · Pull Request #843 · katanemo/plano

adilhafeez · 2026-03-23T04:46:59Z

add token_counting_strategy override (estimate|auto) for provider-aware token counting

Use a len/4 token estimate by default and make tiktoken opt-in via enable_token_counting. This removes roughly 80ms from each request, as the trace below shows:

.404    Brightstaff accepts connection, reuses idle conn to localhost:12001
.405    Envoy WASM filter receives request, resolves model, starts tokenization
.484    Tokenization complete (79ms to count 225 tokens — result unused, ratelimit skipped)
.484    Upstream transform — request sent to archfc.katanemo.dev:443
.557    Upstream response received (73ms round-trip, TLS session reused)
.560    Response processed, route=coding → openai/gpt-4o
.563    Response returned to client
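
To make the latency claim concrete, the arithmetic behind the trace can be sketched as follows. The timestamps are copied from the trace above and read as fractional seconds converted to milliseconds; the variable names are illustrative only and do not come from the plano codebase.

```python
# Timestamps from the trace above, in milliseconds within the same second.
# The variable names are hypothetical labels for the trace rows.
connection_accepted_ms = 404
tokenization_start_ms = 405
tokenization_done_ms = 484
upstream_response_ms = 557
response_returned_ms = 563

# Time spent per stage, derived by subtracting adjacent timestamps.
tokenization_ms = tokenization_done_ms - tokenization_start_ms
upstream_ms = upstream_response_ms - tokenization_done_ms
total_ms = response_returned_ms - connection_accepted_ms

# Share of the whole request that went to counting tokens.
tokenization_share = round(100 * tokenization_ms / total_ms, 1)

print(f"tokenization: {tokenization_ms} ms")
print(f"upstream round-trip: {upstream_ms} ms")
print(f"total: {total_ms} ms ({tokenization_share}% spent tokenizing)")
```

In this single trace, tokenization takes about as long as the upstream round-trip, which is why skipping it roughly halves the request time.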

By default, use a cheap len/4 estimate for input token counting (metrics and ratelimit). When enable_token_counting is set to true in overrides, use tiktoken BPE for exact counts. This eliminates ~80ms of per-request latency from tiktoken in the WASM filter while keeping metrics and ratelimit functional.

Made-with: Cursor
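
A minimal sketch of the idea, not the PR's actual code: the real filter is not shown in this conversation, so every name below (`TokenCountingStrategy`, `estimate_tokens`, `count_input_tokens`, `exact_counter`) is hypothetical. It assumes that "len/4" means the character length divided by four with integer division, and that the `auto` strategy delegates to an exact tokenizer when one is supplied. The PR does not spell out either detail.

```python
from enum import Enum
from typing import Callable, Optional


class TokenCountingStrategy(Enum):
    """Hypothetical mirror of the (estimate|auto) override values."""
    ESTIMATE = "estimate"
    AUTO = "auto"


def estimate_tokens(text: str) -> int:
    # Assumption: "len/4" is the text length divided by four, rounded down.
    return len(text) // 4


def count_input_tokens(
    text: str,
    strategy: TokenCountingStrategy = TokenCountingStrategy.ESTIMATE,
    exact_counter: Optional[Callable[[str], int]] = None,
) -> int:
    # Assumption: "auto" uses an exact tokenizer (for example a BPE encoder)
    # when the caller provides one, and otherwise falls back to the estimate.
    if strategy is TokenCountingStrategy.AUTO and exact_counter is not None:
        return exact_counter(text)
    return estimate_tokens(text)
```

The estimate needs only the length of the input, so it avoids the cost of running a tokenizer on every request, at the price of approximate counts for metrics and ratelimit.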

…(estimate|auto)

adilhafeez marked this pull request as ready for review March 23, 2026 04:48

adilhafeez force-pushed the adil/optional-token-counting branch from a295a1b to e5f3039 Compare March 23, 2026 04:53

replace enable_token_counting bool with token_counting_strategy enum …

20e8e0c

…(estimate|auto)

adilhafeez changed the title ~~make tiktoken token counting optional via enable_token_counting override~~ add token_counting_strategy override (estimate|auto) for provider-aware token counting Mar 25, 2026

adilhafeez changed the title ~~add token_counting_strategy override (estimate|auto) for provider-aware token counting~~ add token_counting_strategy override for provider-aware token counting Mar 25, 2026

adilhafeez wants to merge 2 commits into main from adil/optional-token-counting