Conversation


@xlyk xlyk commented Feb 3, 2026

Update Gemini model capabilities with correct token limits from official Google Cloud documentation:

  • gemini_defaults(): context_window: 1,048,576 (1M), max_output_tokens: 65,536
  • gemini_2_0_flash_lite_defaults(): context_window: 1,048,576 (1M), max_output_tokens: 8,192
  • Updated gemini-2.0-flash-lite model to use gemini_2_0_flash_lite_defaults()

Correct GPT-5 context window from 1M to 400k tokens. Add support for Codex models with 192k context. Separate GPT-5 and o-series capabilities with distinct context windows (400k vs 200k). Update Gemini max output tokens from 8k to 65k, with special handling for Flash-Lite models.
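The changes described above amount to a per-model capability table keyed by model family. A minimal Python sketch of that idea follows; the `ModelCapabilities` type and the `context_window_for` helper are hypothetical (the PR only names `gemini_defaults()` and `gemini_2_0_flash_lite_defaults()`), and the token figures are the ones quoted in this PR description, with round numbers assumed where only "192k"/"400k"/"200k" is given:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelCapabilities:
    context_window: int
    max_output_tokens: int

def gemini_defaults() -> ModelCapabilities:
    # Gemini defaults: 1M-token context, 65k output (per the Google Cloud docs cited above)
    return ModelCapabilities(context_window=1_048_576, max_output_tokens=65_536)

def gemini_2_0_flash_lite_defaults() -> ModelCapabilities:
    # Flash-Lite keeps the 1M context but caps output at 8k tokens
    return ModelCapabilities(context_window=1_048_576, max_output_tokens=8_192)

def context_window_for(model: str) -> int:
    """Hypothetical dispatch by model name; sizes are those quoted in the PR."""
    if model.startswith("gemini-2.0-flash-lite"):
        return gemini_2_0_flash_lite_defaults().context_window
    if model.startswith("gemini"):
        return gemini_defaults().context_window
    if "codex" in model:
        return 192_000   # Codex models: "192k" (exact value assumed)
    if model.startswith("gpt-5"):
        return 400_000   # corrected from 1M to 400k
    if model.startswith("o"):
        return 200_000   # o-series keeps a distinct 200k window
    return 128_000       # fallback, purely illustrative
```

Note the ordering: the `codex` check precedes the `gpt-5` check so that a name like `gpt-5-codex` resolves to the 192k window rather than the 400k one.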
@xlyk xlyk merged commit 0f7b716 into main Feb 3, 2026
3 checks passed