[token] Reduce DB reads on tracked validations#268
Merged
Conversation
- Populate token cache after ValidateTokenWithTracking - For cached limited tokens: increment usage without extra DB read - Tighten DB IncrementTokenUsage gating (inactive/expired/quota) and return semantic errors Testing: make test, make lint
Contributor
There was a problem hiding this comment.
Pull request overview
This PR optimizes token validation performance by extending cache usage to tracked validations (previously only used for non-tracking validation). The key insight is that for limited tokens, the cached metadata (active/expiry/project) can be reused while still enforcing max_requests through a synchronous DB write that avoids an extra DB read.
- Caches tokens after successful tracked validation to avoid redundant DB reads on subsequent requests
- For cached limited tokens, uses cached metadata but still calls
IncrementTokenUsagefor quota enforcement - Tightens DB-level enforcement by adding
is_activeandexpires_atchecks to the UPDATE's WHERE clause, returning semantic errors when these constraints aren't met
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
internal/token/cache.go |
Implements limited-token caching by calling IncrementTokenUsage directly on cache hits; replaces IsValid() with new isCacheableTokenValid() helper that excludes quota checks; populates cache after successful tracking |
internal/database/token.go |
Adds is_active and expires_at guards to the UPDATE WHERE clause in IncrementTokenUsage; expands error-path SELECT to return ErrTokenInactive or ErrTokenExpired when appropriate |
internal/token/cache_test.go |
Adds countingStore mock and test verifying that limited tokens hit the cache on second call with only one increment and no extra DB read |
internal/token/token_integration_test.go |
Updates MockStore.IncrementTokenUsage to match new DB behavior (check active/expired/quota before incrementing) |
internal/database/token_test.go |
Adds test cases verifying IncrementTokenUsage returns ErrTokenInactive and ErrTokenExpired when appropriate |
- Use SQL boolean literal TRUE for is_active check (works across sqlite/mysql/postgres) - Tighten cache test assertion per review
- Invalidate cached limited-token entries when IncrementTokenUsage returns rate-limit/inactive/expired - Treat any non-NULL expires_at as authoritative (remove IsZero guard) Testing: go test ./internal/token ./internal/database
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Reduce unnecessary DB reads on the uncached request path by ensuring token caching applies to tracked validations, while keeping quota enforcement correct.
Changes
ValidateTokenWithTracking()to avoid repeated token DB lookups.IncrementTokenUsageDB update gating foris_activeandexpires_at, and return semantic errors.Testing
make testmake lintNotes
This targets proxy-side latency under concurrency for uncached calls; limited tokens still require a per-request DB write to enforce
max_requests.