
Support multiple LLM providers (Claude, Gemini, local models) #1

@Aditya8840


Problem

The agent loop in agent.py is tightly coupled to OpenAI's Python SDK. Every LLM interaction — client init, chat completion, tool-call parsing, message serialization — uses OpenAI-specific APIs directly. This makes it impossible to use Claude, Gemini, Ollama, or any other provider without rewriting the core loop.

Hardcoded OpenAI touchpoints:

| Location | What |
| --- | --- |
| `agent.py:4` | `from openai import OpenAI` |
| `agent.py:112` | `client = OpenAI()` (relies on the `OPENAI_API_KEY` env var) |
| `agent.py:138-143` | `client.chat.completions.create(model=..., tools=..., tool_choice="required")` |
| `agent.py:145-146` | `response.choices[0].message` / `.model_dump(exclude_none=True)` |
| `agent.py:152-154` | `tool_call.function.name` / `tool_call.function.arguments` (OpenAI `tool_calls` structure) |
| `main.py:20-22` | `--model` defaults to `gpt-4o` via the `DROIDPILOT_MODEL` env var |
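The tool-call parsing touchpoints are the trickiest to abstract, because each provider returns requested tool invocations in a different shape (OpenAI puts JSON-string arguments on `message.tool_calls[i].function`; Anthropic emits `tool_use` content blocks with dict arguments). A minimal sketch of a normalization layer — the `ToolCall` dataclass and function names here are illustrative, not existing code:

```python
import json
from dataclasses import dataclass


@dataclass
class ToolCall:
    """Provider-neutral view of a single requested tool invocation."""
    id: str
    name: str
    arguments: dict


def parse_openai_tool_calls(message) -> list[ToolCall]:
    # OpenAI: message.tool_calls[i].function carries the name plus
    # the arguments serialized as a JSON string.
    return [
        ToolCall(id=tc.id, name=tc.function.name,
                 arguments=json.loads(tc.function.arguments))
        for tc in (message.tool_calls or [])
    ]


def parse_anthropic_tool_calls(response) -> list[ToolCall]:
    # Anthropic: tool calls arrive as "tool_use" content blocks,
    # with the arguments already parsed into a dict.
    return [
        ToolCall(id=block.id, name=block.name, arguments=block.input)
        for block in response.content
        if block.type == "tool_use"
    ]
```

With something like this in place, the loop at `agent.py:152-154` only ever sees `ToolCall` objects, regardless of backend.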

Expected behavior

Users should be able to choose their LLM provider via CLI flag or env var:

```
droidpilot "Open Settings" --provider anthropic --model claude-sonnet-4-20250514
droidpilot "Open Settings" --provider ollama --model llama3
```

Suggested approach

Introduce a provider abstraction that normalizes:

  • Client initialization (API key handling per provider)
  • Chat completion with tool use
  • Tool-call response parsing (each provider returns tool calls differently)
  • Message history serialization

The tool definitions in actions.py already use a provider-agnostic JSON schema format, so those can be reused across providers with minimal adaptation.
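For example, if actions.py emits OpenAI-style tool envelopes (an assumption about its current shape), mapping them to Anthropic's format is mostly re-nesting, since both wrap the same JSON Schema:

```python
def to_anthropic_tools(openai_tools: list[dict]) -> list[dict]:
    """Map OpenAI-style tool definitions to Anthropic's shape.

    OpenAI nests the schema under "function" -> "parameters";
    Anthropic expects a flat dict with "input_schema". The JSON
    Schema payload itself passes through untouched.
    """
    return [
        {
            "name": t["function"]["name"],
            "description": t["function"].get("description", ""),
            "input_schema": t["function"]["parameters"],
        }
        for t in openai_tools
    ]
```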

Scope

  • agent.py — extract LLM interaction into a provider interface
  • main.py — add --provider CLI arg
  • New: provider implementations (openai, anthropic, ollama at minimum)
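On the main.py side, a name-to-class registry keeps the CLI change small. A sketch — the provider classes and the `DROIDPILOT_PROVIDER` env var are proposals, not existing code:

```python
import argparse
import os


class OpenAIProvider: ...     # stubs standing in for the real
class AnthropicProvider: ...  # provider implementations this
class OllamaProvider: ...     # issue proposes


# Registry consulted by --provider; adding a backend is one entry here.
PROVIDERS = {
    "openai": OpenAIProvider,
    "anthropic": AnthropicProvider,
    "ollama": OllamaProvider,
}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="droidpilot")
    parser.add_argument("task", help="natural-language instruction")
    parser.add_argument(
        "--provider",
        choices=sorted(PROVIDERS),
        default=os.environ.get("DROIDPILOT_PROVIDER", "openai"),
    )
    parser.add_argument(
        "--model",
        default=os.environ.get("DROIDPILOT_MODEL", "gpt-4o"),
    )
    return parser.parse_args(argv)
```

The existing `DROIDPILOT_MODEL` default is preserved; `argparse`'s `choices` gives a clear error for unknown providers for free.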
