Status: Open

Labels: `arch: integration` (external service integration), `arch: ports-adapters` (touches adapter/port boundaries), `component: proxy` (OpenAI-compatible proxy), `enhancement` (new feature or request), `priority: medium` (should be done soon), `size: l` (1-3 days), `type: feature` (new functionality or enhancement)
Description
Add Ollama-native API support to the gglib proxy, making it a drop-in replacement for Ollama on port 11434.
Problem
Apps expecting an Ollama endpoint reject the gglib proxy even when it is configured on port 11434, because:
- The proxy only served OpenAI `/v1/*` endpoints.
- Ollama clients hit `/api/*` endpoints (version, tags, chat, generate, embed, etc.), which all returned 404.
- Response formats were incompatible:
  - Ollama: NDJSON streaming + flat JSON responses
  - OpenAI: SSE streaming + nested `choices` array
- Model names were matched strictly (no `:latest` suffix handling).
Solution
Implement an Adapter pattern with two simultaneous API surfaces:
- `/v1/*` — OpenAI-compatible (existing, unchanged)
- `/api/*` — Ollama-native (new)
Core Features
✅ Ollama Endpoints Implemented:
- `GET /` — Root probe (returns "Ollama is running")
- `GET /api/version` — Version info
- `GET /api/tags` — List models
- `POST /api/show` — Model metadata
- `GET /api/ps` — Running models
- `POST /api/chat` — Chat completions (streaming + non-streaming)
- `POST /api/generate` — Text generation (streaming + non-streaming)
- `POST /api/embed` — Embeddings
- `POST /api/embeddings` — Legacy single-embedding endpoint
- Stubs for `/api/pull`, `/api/delete`, `/api/copy`, `/api/create` (redirect to CLI)
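For reference, the translation target for a non-streaming `POST /api/chat` is Ollama's flat response shape, sketched below with placeholder values (field names follow Ollama's public API; the real response also carries timing stats such as `total_duration`):

```json
{
  "model": "phi3",
  "created_at": "2024-01-01T00:00:00Z",
  "message": { "role": "assistant", "content": "Hello!" },
  "done": true
}
```

Note the assistant message sits at the top level, unlike OpenAI's nested `choices[0].message`.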
✅ Format Translation:
- Client request → OpenAI format → llama-server → OpenAI response → Ollama format
- SSE ↔ NDJSON streaming adapter using `futures_util::stream::unfold`
- Proper timestamp handling via the `chrono` crate
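The framing half of that streaming adapter can be sketched in plain Rust. This is a hypothetical, synchronous illustration (the real adapter in `ollama_stream.rs` works over an async byte stream via `futures_util::stream::unfold`): upstream SSE frames carry `data: {...}` payloads terminated by a `[DONE]` sentinel, while Ollama clients expect one JSON object per line.

```rust
// Illustrative sketch of SSE -> NDJSON re-framing; function names are
// made up for this example and are not the real gglib API.

/// Extract the JSON payloads from a raw SSE body, dropping the
/// "data: " framing and the "[DONE]" sentinel.
fn sse_payloads(sse_body: &str) -> Vec<&str> {
    sse_body
        .lines()
        .filter_map(|line| line.strip_prefix("data: "))
        .filter(|payload| *payload != "[DONE]")
        .collect()
}

/// Re-frame the payloads as NDJSON: one object per line, newline-terminated.
fn to_ndjson(payloads: &[&str]) -> String {
    let mut out = String::new();
    for p in payloads {
        out.push_str(p);
        out.push('\n');
    }
    out
}

fn main() {
    let sse = "data: {\"id\":1}\n\ndata: {\"id\":2}\n\ndata: [DONE]\n\n";
    let ndjson = to_ndjson(&sse_payloads(sse));
    assert_eq!(ndjson, "{\"id\":1}\n{\"id\":2}\n");
    println!("{ndjson}");
}
```

The real adapter must additionally translate each payload's fields (OpenAI delta chunks into Ollama message chunks), which this framing-only sketch leaves out.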
✅ Model Name Normalization:
- Strips the `:latest` suffix automatically (e.g., `phi3:latest` → `phi3`)
- Preserves other tags and variants
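The normalization rule is small enough to sketch directly; this is an illustrative stand-in for the logic in `ollama_models.rs` (the function name here is hypothetical):

```rust
/// Strip a trailing ":latest" tag; any other tag or variant is kept as-is.
fn normalize_model_name(name: &str) -> &str {
    name.strip_suffix(":latest").unwrap_or(name)
}

fn main() {
    assert_eq!(normalize_model_name("phi3:latest"), "phi3");
    // Other tags are preserved untouched:
    assert_eq!(normalize_model_name("phi3:mini"), "phi3:mini");
    assert_eq!(normalize_model_name("phi3"), "phi3");
}
```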
✅ Code Quality:
- Single unified `ProxyState` struct shared by both API surfaces
- Extracted helper functions (`apply_openai_options`, `parse_upstream_completion`)
- Comprehensive unit tests (10 new tests in `ollama_models.rs`)
- All 164 existing tests still pass
Files Changed
- New: `crates/gglib-proxy/src/ollama_models.rs` — Ollama data types + normalization
- New: `crates/gglib-proxy/src/ollama_handlers.rs` — Route handlers + translation logic
- New: `crates/gglib-proxy/src/ollama_stream.rs` — SSE→NDJSON streaming adapter
- Modified: `crates/gglib-proxy/src/server.rs` — Route registration, state unification
- Modified: `crates/gglib-proxy/src/lib.rs` — Module exports
- Modified: `crates/gglib-proxy/Cargo.toml` — Added `chrono` dependency
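The `Cargo.toml` addition would look roughly like this (the version number is an assumption for illustration, not taken from the actual diff):

```toml
[dependencies]
chrono = "0.4"
```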
Testing
- ✅ All 12 proxy unit tests pass
- ✅ All 2 proxy doc-tests pass
- ✅ All 67 runtime tests pass
- ✅ All 34 CLI tests pass
- ✅ All 49 axum tests pass
- Total: 164 tests passing, 0 failures
Usage
# Start proxy on port 11434 (Ollama default)
gglib proxy --port 11434
# Now compatible with any Ollama-expecting client:
curl http://localhost:11434/api/tags
curl -X POST http://localhost:11434/api/chat -d '{"model":"phi3","messages":[...]}'Breaking Changes
None. Both API surfaces coexist without configuration or toggling.
Closes
TBD (link to related issues if applicable)