# LLMIO

LLMIO is a Go-based LLM load-balancing gateway that provides a unified REST API, weighted scheduling, logging, and a modern admin UI for LLM clients such as openclaw, claude code, codex, gemini cli, cherry studio, and open webui. It lets you access OpenAI, Anthropic, Gemini, and other model providers through a single service.
QQ group: 1083599685
## Features

- Unified API: Compatible with OpenAI Chat Completions, OpenAI Responses, Gemini Native, and Anthropic Messages. Supports both streaming and non-streaming passthrough.
- Weighted scheduling: `balancers/` provides two strategies (random by weight / priority by weight). Requests can also be routed by capability: tool calling, structured output, and multimodal support.
- Admin Web UI: React + TypeScript + Tailwind + Vite console for providers, models, associations, logs, and metrics.
- Rate limiting & failure handling: Built-in rate-limit fallback and provider connectivity checks for fault isolation.
- Local persistence: Pure Go SQLite (`db/llmio.db`) for config and request logs, ready to use out of the box.
## Quick start

### Docker Compose

```yaml
services:
  llmio:
    image: atopos31/llmio:latest
    ports:
      - 7070:7070
    volumes:
      - ./db:/app/db
    environment:
      - GIN_MODE=release
      - TOKEN=<YOUR_TOKEN>
      - TZ=Asia/Shanghai
```

```shell
docker compose up -d
```

### Docker

```shell
docker run -d \
  --name llmio \
  -p 7070:7070 \
  -v $(pwd)/db:/app/db \
  -e GIN_MODE=release \
  -e TOKEN=<YOUR_TOKEN> \
  -e TZ=Asia/Shanghai \
  atopos31/llmio:latest
```

### Binary release

Download the release package for your OS/arch from releases (version > 0.5.13). Example for linux amd64:
```shell
wget https://github.com/atopos31/llmio/releases/download/v0.5.13/llmio_0.5.13_linux_amd64.tar.gz
```

Extract:

```shell
tar -xzf ./llmio_0.5.13_linux_amd64.tar.gz
```

Start:

```shell
GIN_MODE=release TOKEN=<YOUR_TOKEN> ./llmio
```

The service will create `./db/llmio.db` in the current directory as the SQLite persistence file.
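Once the service is up, a quick smoke test is to list models through the generic compatibility route (replace `YOUR_TOKEN` with the value you set for `TOKEN`):

```shell
# List the models the gateway currently exposes; an empty list is normal
# before any providers and models are configured in the admin UI.
curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:7070/v1/models
```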
## Environment variables

| Variable | Description | Default | Notes |
|---|---|---|---|
| `TOKEN` | Console login and API auth for `/openai`, `/anthropic`, `/gemini`, `/v1` | None | Required for public access |
| `GIN_MODE` | Gin runtime mode | `debug` | Use `release` in production |
| `LLMIO_SERVER_PORT` | Server listen port | `7070` | |
| `TZ` | Timezone for logs and scheduling | Host default | Set explicitly in containers (e.g. `Asia/Shanghai`) |
| `DB_VACUUM` | Run SQLite `VACUUM` on startup | Disabled | Set to `true` to reclaim space |
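For example, to compact the database on a one-off restart of the binary (a sketch combining the variables above; paths assume the binary layout from the release install):

```shell
# Start once with VACUUM enabled so SQLite reclaims free pages in ./db/llmio.db.
DB_VACUUM=true GIN_MODE=release TOKEN=<YOUR_TOKEN> ./llmio
```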
## Development

Clone:

```shell
git clone https://github.com/atopos31/llmio.git
cd llmio
```

Build frontend (pnpm required):

```shell
make webui
```

Run backend (Go >= 1.26.1):

```shell
TOKEN=<YOUR_TOKEN> make run
```

Web UI: http://localhost:7070/
## API

LLMIO provides a multi-provider REST API with the following endpoints:

| Provider | Path | Method | Description | Auth |
|---|---|---|---|---|
| OpenAI | `/openai/v1/models` | GET | List available models | Bearer Token |
| OpenAI | `/openai/v1/chat/completions` | POST | Create chat completion | Bearer Token |
| OpenAI | `/openai/v1/responses` | POST | Create response | Bearer Token |
| Anthropic | `/anthropic/v1/models` | GET | List available models | x-api-key |
| Anthropic | `/anthropic/v1/messages` | POST | Create message | x-api-key |
| Anthropic | `/anthropic/v1/messages/count_tokens` | POST | Count tokens | x-api-key |
| Gemini | `/gemini/v1beta/models` | GET | List available models | x-goog-api-key |
| Gemini | `/gemini/v1beta/models/{model}:generateContent` | POST | Generate content | x-goog-api-key |
| Gemini | `/gemini/v1beta/models/{model}:streamGenerateContent` | POST | Stream content | x-goog-api-key |
| Generic | `/v1/models` | GET | List models (compat) | Bearer Token |
| Generic | `/v1/chat/completions` | POST | Create chat completion (compat) | Bearer Token |
| Generic | `/v1/responses` | POST | Create response (compat) | Bearer Token |
| Generic | `/v1/messages` | POST | Create message (compat) | x-api-key |
| Generic | `/v1/messages/count_tokens` | POST | Count tokens (compat) | x-api-key |
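As an example, a chat completion request through the OpenAI-compatible route looks like the following (`gpt-4o` is a placeholder; use a model name you have configured in the console):

```shell
curl http://localhost:7070/openai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Add `"stream": true` to the body to receive the response as server-sent events instead of a single JSON object.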
## Authentication

LLMIO uses different auth headers depending on the endpoint.

Bearer token, for `/openai/v1/*` and OpenAI-compatible endpoints under `/v1/*`:

```shell
curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:7070/openai/v1/models
```

`x-api-key`, for `/anthropic/v1/*` and Anthropic-compatible endpoints under `/v1/*`:

```shell
curl -H "x-api-key: YOUR_TOKEN" http://localhost:7070/anthropic/v1/messages
```

`x-goog-api-key`, for `/gemini/v1beta/*` endpoints:

```shell
curl -H "x-goog-api-key: YOUR_TOKEN" http://localhost:7070/gemini/v1beta/models
```

For claude code or codex, use these environment variables:

```shell
export OPENAI_API_KEY=<YOUR_TOKEN>
export ANTHROPIC_API_KEY=<YOUR_TOKEN>
export GEMINI_API_KEY=<YOUR_TOKEN>
```

Note: `/v1/*` paths are kept for compatibility. Prefer the provider-specific routes.
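A complete request on the Anthropic route might look like this (a sketch: `claude-sonnet-4` is a placeholder model name, and the body follows the Anthropic Messages format that the route proxies):

```shell
curl http://localhost:7070/anthropic/v1/messages \
  -H "x-api-key: YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```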
## Project structure

```
.
├─ main.go       # HTTP server entry and routes
├─ handler/      # REST handlers
├─ service/      # Business logic and load-balancing
├─ middleware/   # Auth, rate limit, streaming middleware
├─ providers/    # Provider adapters
├─ balancers/    # Weight and scheduling strategies
├─ models/       # GORM models and DB init
├─ common/       # Shared helpers
├─ webui/        # React + TypeScript admin UI
└─ docs/         # Ops & usage docs
```
## License

This project is released under the MIT License.


