ModelRelay (Python Port)

Python implementation of the original modelrelay by Elliptic Marketing.

Intelligent LLM Model Routing - Python Edition

Routes requests to the best available model based on quality scores and provider availability.

Features

Smart Routing: Automatically selects the best available model based on quality scores
Multi-Provider Support: NVIDIA NIM, Groq, Cerebras, OpenRouter, Codestral, Scaleway, Google AI
Health Checking: Monitors provider availability and latency
OpenAI-Compatible API: Drop-in replacement for OpenAI API clients
CLI Tools: Manage models, providers, and check availability

Installation

pip install modelrelay

For the HTTP server:

pip install modelrelay[server]

Quick Start

CLI Usage

List available models:

modelrelay models

Show best available model:

modelrelay best

Check provider status:

modelrelay check

Python API

import asyncio
from modelrelay import ModelRouter

async def main():
    router = ModelRouter()

    # Get best available model
    model = await router.get_best_available_model()
    print(f"Best model: {model.model_id} (score: {model.score})")

    # Route a chat completion request
    response = await router.route_request(
        messages=[
            {"role": "user", "content": "Hello, world!"}
        ]
    )
    print(response["choices"][0]["message"]["content"])

    await router.close()

asyncio.run(main())

HTTP Server

Start the OpenAI-compatible server:

modelrelay serve --port 8000

Then use with any OpenAI client:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}]
)

Supported Providers

Provider	Models	Specialization
NVIDIA NIM	30+	Best variety, high quality
Groq	7	Ultra-fast inference
Cerebras	5	Fast inference
OpenRouter	6+	Free tier available
Codestral	1	Code generation
Scaleway	3	European hosting
Google AI	3	Gemma models

Model Scoring

Models are scored from 0.0 to 1.0 based on quality benchmarks. Higher scores indicate better performance.

View all scores:

modelrelay scores

License

MIT

Acknowledgments

This project is a Python port of the original modelrelay Node.js project by Elliptic Marketing. All core design concepts—including the model scoring system, provider abstraction, auto-routing with fallback chains, and health-checking—are theirs. I've rewritten it in Python with a different architecture (FastAPI instead of Express) while preserving the spirit and functionality of the original.

Thank you to the original authors for building such a useful tool.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
src/modelrelay		src/modelrelay
tests		tests
PROGRESS.md		PROGRESS.md
README.md		README.md
THIRD-PARTY-NOTICES.md		THIRD-PARTY-NOTICES.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ModelRelay (Python Port)

Features

Installation

Quick Start

CLI Usage

Python API

HTTP Server

Supported Providers

Model Scoring

License

Acknowledgments

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ModelRelay (Python Port)

Features

Installation

Quick Start

CLI Usage

Python API

HTTP Server

Supported Providers

Model Scoring

License

Acknowledgments

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages