A Flask web application that acts as an Ollama-compatible server and forwards requests to OpenRouter. This allows you to use Ollama clients with OpenRouter's LLM models.

## Features
- Compatible with Ollama API clients
- Forwards requests to OpenRouter
- Supports both single prompt generation and chat conversations
- Maps Ollama model names to OpenRouter model names
- Configurable via environment variables
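
To give a feel for the forwarding, here is a minimal illustrative sketch of one endpoint. This is not the actual `app.py` (which also handles `/api/chat`, `/api/models`, streaming, and model-name mapping); it only shows the core idea of accepting an Ollama-style request and replaying it against OpenRouter's OpenAI-compatible API:

```python
import os

from flask import Flask, jsonify, request
from openai import OpenAI

app = Flask(__name__)

# OpenRouter exposes an OpenAI-compatible API, so the official OpenAI
# client can be pointed at it by overriding base_url.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

@app.post("/api/generate")
def generate():
    # force=True: accept the JSON body even when the client (e.g. curl -d)
    # doesn't send a Content-Type: application/json header.
    body = request.get_json(force=True)
    completion = client.chat.completions.create(
        model=body["model"],  # the real app maps Ollama names first
        messages=[{"role": "user", "content": body["prompt"]}],
    )
    # Wrap the result in an Ollama-style response envelope.
    return jsonify({
        "model": body["model"],
        "response": completion.choices[0].message.content,
        "done": True,
    })

if __name__ == "__main__":
    app.run(port=11434)
```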
## Requirements

- Python 3.10+
- OpenRouter API key (get one at https://openrouter.ai/keys)
- Optionally, uv for the streamlined installation described below
## Installation with uv (recommended)

This is the recommended, streamlined way to run the app without manually managing virtual environments or pip:
1. Clone this repository:

   ```bash
   git clone https://github.com/yourusername/ForwardLLM.git
   cd ForwardLLM
   ```

2. Create a `.env` file with your OpenRouter API key:

   ```bash
   cp .env.example .env
   ```

   Then edit `.env` and replace `your_openrouter_api_key_here` with your OpenRouter API key.

3. Run the app directly with `uv`:

   ```bash
   uv run app.py
   ```
uv will automatically:
- Create a temporary isolated environment
- Install dependencies (Flask, python-dotenv, OpenAI, etc.)
- Run the proxy server
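
For `uv run app.py` to resolve dependencies on its own, the project presumably declares them somewhere uv can see them, for example as PEP 723 inline script metadata at the top of `app.py`. A sketch of what that could look like (the repository may use a `pyproject.toml` instead):

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "flask",
#     "python-dotenv",
#     "openai",
# ]
# ///
```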
## Manual installation

If you prefer the classic way:
1. Clone this repository:

   ```bash
   git clone https://github.com/yourusername/ForwardLLM.git
   cd ForwardLLM
   ```

2. Create a virtual environment and activate it:

   ```bash
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   ```

3. Install the required packages:

   ```bash
   pip install -r requirements.txt
   ```

4. Create a `.env` file with your OpenRouter API key:

   ```bash
   cp .env.example .env
   ```

   Then edit the `.env` file and replace `your_openrouter_api_key_here` with your actual OpenRouter API key.
## Usage

1. Start the server:

   ```bash
   python app.py
   ```

   By default, the server runs on port 11434 (the same as Ollama).

2. Run the test script to verify everything is working:

   ```bash
   python test_api.py
   ```

   This will test all the API endpoints and show the results.

3. Use any Ollama client to connect to the server. For example, using curl:

   Generate a response to a single prompt:

   ```bash
   curl -X POST http://localhost:11434/api/generate -d '{
     "model": "gpt-3.5-turbo",
     "prompt": "Tell me a joke about programming",
     "stream": false
   }'
   ```

   Chat conversation:

   ```bash
   curl -X POST http://localhost:11434/api/chat -d '{
     "model": "gpt-3.5-turbo",
     "messages": [
       {"role": "user", "content": "Hello, how are you?"},
       {"role": "assistant", "content": "I am doing well, thank you!"},
       {"role": "user", "content": "Tell me a joke about programming."}
     ]
   }'
   ```

   List available models:

   ```bash
   curl http://localhost:11434/api/models
   ```
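
The same endpoints can also be called from Python. A small illustrative client using `requests` (mirroring the curl examples above; `requests` is not a dependency of the server itself):

```python
import requests

BASE = "http://localhost:11434"

# Single-prompt generation, non-streaming (mirrors the curl example).
resp = requests.post(f"{BASE}/api/generate", json={
    "model": "gpt-3.5-turbo",
    "prompt": "Tell me a joke about programming",
    "stream": False,
})
print(resp.json())

# Chat conversation ("stream": False is assumed to apply here as well).
resp = requests.post(f"{BASE}/api/chat", json={
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Tell me a joke about programming."}],
    "stream": False,
})
print(resp.json())

# List available models.
print(requests.get(f"{BASE}/api/models").json())
```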
## Model mapping

The application maps common Ollama model names to their OpenRouter equivalents:

- `llama2` → `meta-llama/llama-2-13b-chat`
- `mistral` → `mistralai/mistral-7b-instruct`

You can also use OpenRouter model names directly, such as `openai/gpt-4` or `anthropic/claude-2`.
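
Under the hood this is presumably just a small lookup table with passthrough for unknown names. An illustrative sketch (only the two mappings above come from this README; the names `MODEL_MAP` and `resolve_model` are invented here):

```python
# Illustrative sketch of the name mapping; the real table in app.py may differ.
MODEL_MAP = {
    "llama2": "meta-llama/llama-2-13b-chat",
    "mistral": "mistralai/mistral-7b-instruct",
}

def resolve_model(name: str) -> str:
    # Unmapped names (e.g. "openai/gpt-4") are passed through to OpenRouter as-is.
    return MODEL_MAP.get(name, name)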
## Configuration

Set the following environment variables in your `.env` file:

- `OPENROUTER_API_KEY` (required): Your OpenRouter API key
- `PORT` (optional): The port to run the server on (default: 11434)
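
Since python-dotenv is among the dependencies, the app presumably picks these up at startup roughly like so (a sketch, not the actual `app.py`):

```python
import os
from dotenv import load_dotenv

load_dotenv()  # read .env from the working directory

OPENROUTER_API_KEY = os.environ["OPENROUTER_API_KEY"]  # required
PORT = int(os.getenv("PORT", "11434"))                 # optional, Ollama's default port
```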
## License

MIT