Python implementation of the original modelrelay by Elliptic Marketing.

Intelligent LLM Model Routing - Python Edition

Routes requests to the best available model based on quality scores and provider availability.

- Smart Routing: Automatically selects the best available model based on quality scores
- Multi-Provider Support: NVIDIA NIM, Groq, Cerebras, OpenRouter, Codestral, Scaleway, Google AI
- Health Checking: Monitors provider availability and latency
- OpenAI-Compatible API: Drop-in replacement for OpenAI API clients
- CLI Tools: Manage models, providers, and check availability
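
The routing and health-checking features above can be pictured together as "drop unhealthy candidates, then take the highest score". The following is a minimal sketch of that idea only, not modelrelay's actual implementation: the `Candidate` class, the `pick_best` function, the latency tie-break, and the model and provider names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one model on one provider (not a modelrelay type)."""
    model_id: str
    provider: str
    score: float        # quality score, higher is better
    healthy: bool       # outcome of the most recent health check
    latency_ms: float   # latency measured by that health check


def pick_best(candidates: list[Candidate]) -> Optional[Candidate]:
    """Return the highest-scoring healthy candidate, or None if none is healthy."""
    available = [c for c in candidates if c.healthy]
    if not available:
        return None
    # Assumption: ties on score are broken by lower latency.
    return max(available, key=lambda c: (c.score, -c.latency_ms))


candidates = [
    Candidate("model-a", "provider-x", 0.91, healthy=False, latency_ms=120.0),
    Candidate("model-b", "provider-y", 0.84, healthy=True, latency_ms=300.0),
    Candidate("model-c", "provider-z", 0.84, healthy=True, latency_ms=90.0),
    Candidate("model-d", "provider-z", 0.60, healthy=True, latency_ms=40.0),
]

best = pick_best(candidates)
print(best.model_id)  # model-c
```

Here `model-a` has the highest score but is skipped because its provider failed the health check, and `model-c` wins the tie with `model-b` on latency.
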
```bash
pip install modelrelay
```

For the HTTP server:

```bash
pip install "modelrelay[server]"
```

List available models:

```bash
modelrelay models
```

Show best available model:

```bash
modelrelay best
```

Check provider status:
```bash
modelrelay check
```

Use it as a library:

```python
import asyncio

from modelrelay import ModelRouter


async def main():
    router = ModelRouter()

    # Get best available model
    model = await router.get_best_available_model()
    print(f"Best model: {model.model_id} (score: {model.score})")

    # Route a chat completion request
    response = await router.route_request(
        messages=[
            {"role": "user", "content": "Hello, world!"}
        ]
    )
    print(response["choices"][0]["message"]["content"])

    # Release the router's resources when done
    await router.close()


asyncio.run(main())
```

Start the OpenAI-compatible server:
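```bash
modelrelay serve --port 8000
```

Then use with any OpenAI client:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
)

response = client.chat.completions.create(
    # The OpenAI client requires a model argument. "your-model-id" is a
    # placeholder: replace it with an ID listed by `modelrelay models`.
    model="your-model-id",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

| Provider | Models | Specialization |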
|---|---|---|
| NVIDIA NIM | 30+ | Best variety, high quality |
| Groq | 7 | Ultra-fast inference |
| Cerebras | 5 | Fast inference |
| OpenRouter | 6+ | Free tier available |
| Codestral | 1 | Code generation |
| Scaleway | 3 | European hosting |
| Google AI | 3 | Gemma models |
Models are scored from 0.0 to 1.0 based on quality benchmarks. Higher scores indicate better performance.
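
One way such scores could drive routing is as the ordering for a fallback chain: try the highest-scoring model first and move to the next one when a provider fails. This is a sketch under stated assumptions rather than modelrelay's real code: `route_with_fallback`, the `send` callback, the use of `ConnectionError` to signal provider failure, and the model names are all hypothetical.

```python
from typing import Callable


def route_with_fallback(
    scores: dict[str, float],
    send: Callable[[str], str],
) -> tuple[str, str]:
    """Try models from highest to lowest score; return (model_id, reply)."""
    for model_id, score in scores.items():
        # Scores are documented as lying in the range 0.0 to 1.0.
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {model_id} out of range: {score}")

    ranked = sorted(scores, key=lambda m: scores[m], reverse=True)
    for model_id in ranked:
        try:
            return model_id, send(model_id)
        except ConnectionError:
            # Assumption: a provider outage surfaces as ConnectionError.
            continue
    raise RuntimeError("all models in the fallback chain failed")


def fake_send(model_id: str) -> str:
    """Stand-in for a real provider call; model-b is simulated as down."""
    if model_id == "model-b":
        raise ConnectionError("provider down")
    return f"reply from {model_id}"


scores = {"model-a": 0.62, "model-b": 0.91, "model-c": 0.78}
used, reply = route_with_fallback(scores, fake_send)
print(used, "->", reply)  # model-c -> reply from model-c
```

The top-ranked `model-b` fails, so the request falls through to `model-c`, the next-highest score.
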
View all scores:

```bash
modelrelay scores
```

License: MIT

This project is a Python port of the original modelrelay Node.js project by Elliptic Marketing. All core design concepts, including the model scoring system, provider abstraction, auto-routing with fallback chains, and health checking, are theirs. I've rewritten it in Python with a different architecture (FastAPI instead of Express) while preserving the spirit and functionality of the original.

Thank you to the original authors for building such a useful tool.