Refactor to replace litellm with provider catalog and SDK adapters#61

Merged
Starlitnightly merged 18 commits into dev from main on Apr 3, 2026
Conversation

@Starlitnightly
Collaborator

No description provided.

Nanguage and others added 18 commits on March 30, 2026
…SDK adapters

Remove litellm dependency entirely and replace with a catalog-driven provider
abstraction layer using native SDKs (openai, anthropic, google-genai).

New architecture:
- llm_catalog.json: single source of truth for 12 providers, 80+ models
  (OpenAI, Anthropic, Gemini, DeepSeek, Zhipu, MiniMax, Moonshot, Qwen,
   Groq, Mistral, Together AI, OpenRouter)
- provider_registry.py: catalog loader + get_model_info(), completion_cost(),
  token_counter(), models_by_provider()
- adapters/: per-SDK adapters (openai, anthropic, gemini) with unified interface
  - OpenAI adapter handles all OpenAI-compatible providers
  - Anthropic adapter converts message format + normalizes streaming events
  - Gemini adapter wraps google-genai SDK
- stream_chunk_builder(): local replacement for litellm.stream_chunk_builder()
  with reasoning_content support
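
The catalog-driven lookup could be sketched roughly as follows. This is a minimal illustration, not the real provider_registry.py: the catalog entries, model names, and cost field names here are assumptions, and the actual llm_catalog.json schema may differ.

```python
# Minimal sketch of catalog-driven model lookup; the embedded catalog and
# its field names are illustrative, not the real llm_catalog.json schema.
import json

CATALOG = json.loads("""
{
  "providers": {
    "openai":    {"sdk": "openai",    "models": {"gpt-4o": {"input_cost_per_token": 2.5e-06, "output_cost_per_token": 1e-05}}},
    "anthropic": {"sdk": "anthropic", "models": {"claude-sonnet": {"input_cost_per_token": 3e-06, "output_cost_per_token": 1.5e-05}}}
  }
}
""")

def get_model_info(model: str) -> dict:
    """Resolve 'provider/model' (or a bare model name) against the catalog."""
    provider, _, name = model.partition("/")
    if not name:  # bare model name: search every provider
        for p, entry in CATALOG["providers"].items():
            if model in entry["models"]:
                return {"provider": p, **entry["models"][model]}
        raise KeyError(model)
    return {"provider": provider, **CATALOG["providers"][provider]["models"][name]}

def completion_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    info = get_model_info(model)
    return (prompt_tokens * info["input_cost_per_token"]
            + completion_tokens * info["output_cost_per_token"])

def models_by_provider(provider: str) -> list[str]:
    return sorted(CATALOG["providers"][provider]["models"])
```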

Key changes:
- All litellm imports removed from codebase
- pyproject.toml: litellm → anthropic, google-genai, tiktoken
- Proxy mode: backward-compat LITELLM_PROXY_* env vars + new LLM_PROXY_*
- remove_metadata(): whitelist-based field cleanup (strict providers like Groq
  reject any non-standard fields)
- Null field cleanup: tool_calls=null → field removed
- Tool call error recovery: stream interruptions from server-side validation
  (e.g. Groq hallucinated tool names) return partial text instead of crashing
- stream_chunk_builder: handles usage=null from partial/interrupted streams
- Responses API support via OpenAI adapter for gpt-5.x-pro and codex models
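
The whitelist-based cleanup plus null-field removal could look like this sketch. The exact field whitelist is an assumption; the real remove_metadata() may keep a different set.

```python
# Sketch of whitelist-based message cleanup for strict providers (e.g. Groq):
# keep only known fields and drop null values such as tool_calls=null.
# ALLOWED_FIELDS is a guess, not the whitelist actually used in remove_metadata().
ALLOWED_FIELDS = {"role", "content", "name", "tool_calls", "tool_call_id"}

def remove_metadata(messages: list[dict]) -> list[dict]:
    cleaned = []
    for msg in messages:
        cleaned.append({k: v for k, v in msg.items()
                        if k in ALLOWED_FIELDS and v is not None})
    return cleaned
```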

Tested with real API calls across all providers (52/52 tests passing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clean up all remaining litellm references in variable names, function names,
enum values, parameters, comments, and documentation:

- ProviderType.LITELLM → ProviderType.NATIVE
- force_litellm parameter → relaxed_schema (Agent, detect_provider)
- acompletion_litellm() → acompletion()
- litellm_mode parameter → removed (only relaxed_schema remains)
- _convert_functions(litellm_mode=) → _convert_functions(relaxed_schema=)
- get_litellm_proxy_kwargs() backward-compat alias deleted
- litellm_model variable → resolved_model
- All comments and docstrings updated
- Documentation updated (agent.rst, utils.rst, models.rst, etc.)
- Test names updated (test_agent_force_litellm → test_agent_relaxed_schema)

Only remaining "LITELLM" references are env var names in get_proxy_kwargs()
for backward compatibility (LITELLM_PROXY_ENABLED/URL/KEY).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ters

Anthropic: thinking_delta events now written into collected_chunks
(previously only sent via process_chunk callback, lost in stream_chunk_builder)

Gemini: add include_thoughts=True to ThinkingConfig, capture thought=True
parts as reasoning_content chunks (previously thinking parts were ignored)

Both adapters now emit reasoning_content in the standard delta format,
compatible with stream_chunk_builder's reasoning_content accumulation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Groq gpt-oss models use 'reasoning' (not 'reasoning_content') for thinking
output. stream_chunk_builder now accumulates both field names.

OpenAI gpt-5 does not expose reasoning content at all (by design).
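
Accumulating both field names can be sketched as below. This is a simplified stand-in for stream_chunk_builder(), using an assumed chunk/delta dict shape rather than the actual SDK objects.

```python
# Simplified sketch of stream_chunk_builder()'s reasoning accumulation:
# Groq gpt-oss emits 'reasoning', most other providers emit 'reasoning_content'.
# Chunk shape is an assumption (plain dicts, not SDK objects).
def build_message(chunks: list[dict]) -> dict:
    content, reasoning = [], []
    for chunk in chunks:
        delta = chunk.get("choices", [{}])[0].get("delta", {})
        if delta.get("content"):
            content.append(delta["content"])
        for field in ("reasoning_content", "reasoning"):
            if delta.get(field):
                reasoning.append(delta[field])
    msg = {"role": "assistant", "content": "".join(content)}
    if reasoning:
        msg["reasoning_content"] = "".join(reasoning)
    return msg
```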

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New OAuth infrastructure for browser-based authentication:
- pantheon/utils/oauth/codex.py: CodexOAuthManager with login(), refresh(),
  import_from_codex_cli(), and persistent token storage (~/.pantheon/oauth/)
- OAuth 2.0 Authorization Code + PKCE flow, local callback server
- Auto-refresh expired tokens, import from Codex CLI (~/.codex/auth.json)
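
The PKCE part of the flow (RFC 7636) can be sketched with the standard library alone; endpoint URLs and client details are deliberately omitted, and this is not the actual CodexOAuthManager code.

```python
# Sketch of PKCE pair generation for the OAuth 2.0 Authorization Code flow
# (RFC 7636): a random code_verifier plus its S256 code_challenge.
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for an OAuth PKCE flow."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode()).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

The verifier is sent only in the final token exchange, so the local callback server never needs to hold a client secret.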

New Codex adapter:
- pantheon/utils/adapters/codex_adapter.py: calls chatgpt.com/backend-api
  using Responses API format with OAuth bearer tokens
- Handles SSE streaming, tool calls, usage extraction

Integration:
- llm_catalog.json: new "codex" provider with sdk="codex", auth_mode="oauth"
- acompletion(): detects codex provider, auto-fetches OAuth token
- call_llm_provider(): routes codex/ models to dedicated adapter
- Models: gpt-5.4, gpt-5.4-mini, gpt-5.2-codex, gpt-5, o4-mini (free via OAuth)

Usage:
  # Import from Codex CLI (if installed)
  from pantheon.utils.oauth import CodexOAuthManager
  CodexOAuthManager().import_from_codex_cli()

  # Or browser login
  CodexOAuthManager().login()

  # Then use codex/ prefix
  await acompletion(model="codex/gpt-5.4-mini", messages=[...])

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gration

CLI commands (pantheon-chatroom oauth):
- oauth status: check auth status
- oauth login: browser-based OAuth login
- oauth import: import from Codex CLI (~/.codex/auth.json)
- oauth logout: remove stored tokens

NATS RPC tools for frontend:
- oauth_status(): returns all OAuth provider statuses
- oauth_login(provider): start browser-based login
- oauth_import(provider): import from native CLI

Model selector:
- Detects codex as available provider when OAuth tokens exist
- Added codex to DEFAULT_PROVIDER_MODELS and PROVIDER_API_KEYS
- codex/ models appear in list_available_models() when authenticated

acompletion():
- Routes codex/ models through OAuth token + CodexAdapter
- Passes account_id for chatgpt-account-id header
- Returns message dict directly (no stream_chunk_builder)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OpenAI refresh_tokens are single-use. If Codex CLI already used the
refresh_token, our refresh attempt fails with "refresh_token_reused".

Now import_from_codex_cli() copies tokens as-is without refreshing.
get_access_token() handles lazy refresh when actually needed.
Only attempt refresh if there's no access_token at all.
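
The lazy-refresh policy above could be sketched like this. Class and field names are illustrative, not the real CodexOAuthManager API, and the refresh call is stubbed out.

```python
# Sketch of lazy token refresh: since OpenAI refresh_tokens are single-use,
# never refresh eagerly; only refresh when the access token is missing/expired.
# Names are assumptions; _refresh() is a stub for the real token-endpoint POST.
import time

class TokenStore:
    def __init__(self, access_token=None, refresh_token=None, expires_at=0.0):
        self.access_token = access_token
        self.refresh_token = refresh_token
        self.expires_at = expires_at

    def _refresh(self):
        # Placeholder for POSTing refresh_token to the token endpoint.
        self.access_token = "new-token"
        self.expires_at = time.time() + 3600

    def get_access_token(self, auto_refresh=True):
        if self.access_token and time.time() < self.expires_at:
            return self.access_token  # still valid: don't burn the refresh_token
        if auto_refresh and self.refresh_token:
            self._refresh()
            return self.access_token
        raise RuntimeError("not authenticated")
```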

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
oauth_status() now returns supports_import=true only when Codex CLI
auth file is detected. Frontend hides the import button otherwise.
Also renamed button to "Import from Codex CLI" for clarity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Provide structured project documentation for AI assistants (Claude Code,
Cursor, Copilot, etc.) covering architecture, conventions, module reference,
team templates, and task toolset mechanism.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: add .agents/ directory for AI coding tool context
Ollama is detected automatically when running at localhost:11434.
No API key or manual configuration needed.

- llm_catalog.json: new "ollama" provider with local=true, sdk=openai
- model_selector.py: _detect_ollama() pings /api/tags to check availability,
  _list_ollama_models() fetches model names (cached 30s),
  _get_provider_models() returns dynamic ollama model list
- llm.py: auto-fills dummy api_key="ollama" for local providers

Models appear in the UI model selector as ollama/model-name.
Usage: just run `ollama serve` and models show up automatically.
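
The detection ping could be sketched with the standard library as below; the real _detect_ollama() also caches results for 30 s, which this sketch omits.

```python
# Sketch of local-Ollama auto-detection via GET /api/tags: returns installed
# model names, or [] when no server is listening on localhost:11434.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def detect_ollama(timeout: float = 0.5) -> list[str]:
    """Return installed Ollama model names, or [] if Ollama is not running."""
    try:
        with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except OSError:
        return []
```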

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a chat fails (e.g. OAuth token expired, model error), the error
was silently swallowed — frontend just saw the model stop responding.

Now chat_finished event includes status="error" and metadata.message
when thread.response indicates failure. Frontend ChatManager shows
the error as an assistant message in the chat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously is_authenticated() returned true if refresh_token existed in
the file, even if both access_token and refresh_token were expired/reused.
Now oauth_status() calls get_access_token(auto_refresh=True) to actually
verify the token works before reporting "Connected".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Improved error messages for Codex OAuth failures to be user-friendly
and include [OAUTH_REQUIRED] tag for frontend to detect and show
actionable UI (settings button).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolved conflict in pantheon/repl/__main__.py:
- main added _update_litellm_cost_map() wrapper
- our branch removed all litellm code
- kept our version (no litellm)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test workflow referenced --extra slack but pyproject.toml has no
slack optional-dependency group (slack-sdk/slack-bolt are in main deps).
This caused all CI jobs to fail with "Extra slack is not defined".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
refactor: replace litellm with lightweight provider catalog + native SDK adapters
feat: CC-aligned token optimization pipeline (5-stage)
@Starlitnightly Starlitnightly merged commit b624140 into dev Apr 3, 2026
0 of 5 checks passed