This project provides a FastAPI-based proxy server that exposes an OpenAI-compatible API (/v1/chat/completions) for the cascadeflow library. This allows you to use cascadeflow with tools and extensions that support the OpenAI API format, such as the Continue VSCode extension.
## Features

- OpenAI Compatibility: Implements the `/v1/chat/completions` endpoint, accepting standard OpenAI chat completion requests.
- Streaming Support: Fully supports streaming responses via Server-Sent Events (SSE).
- Configurable Models: Define your models, providers, and costs in a simple YAML configuration file.
- Cascadeflow Integration: Leverages the `CascadeAgent` to orchestrate model interactions.
## Requirements

- Python 3.8+
- `cascadeflow` library installed (included in requirements)
## Installation

- Clone the repository:

  ```bash
  git clone <repository-url>
  cd cascadeflow-openai-proxy
  ```

- Install dependencies. It is recommended to use a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  pip install -r requirements.txt
  ```
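## Running the Server

Start the proxy with uvicorn. This is a sketch that assumes the FastAPI app is exposed as `app` in `main.py`; adjust the module path to match the repository's actual entrypoint.

```bash
# Hypothetical entrypoint: replace main:app with the actual module:attribute.
uvicorn main:app --host 0.0.0.0 --port 8000
```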
The server will start on http://0.0.0.0:8000.
For convenience, a script is provided to start multiple Ollama instances and the proxy together. This is useful if you want to serve different models on different ports.
```bash
./start_services.sh
```

This script will:

- Start an Ollama instance on port 11434 (default).
- Start a second Ollama instance on port 11435.
- Start the `cascadeflow-openai-proxy`.
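The authoritative version of `start_services.sh` is in the repository; as a rough sketch under the same assumption as above (proxy entrypoint in `main.py`), it does something like:

```bash
#!/usr/bin/env bash
set -euo pipefail

# First Ollama instance on the default port.
OLLAMA_HOST=127.0.0.1:11434 ollama serve &

# Second Ollama instance on an alternate port.
OLLAMA_HOST=127.0.0.1:11435 ollama serve &

# Give both instances a moment to bind, then start the proxy.
sleep 2
uvicorn main:app --host 0.0.0.0 --port 8000
```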
## Configuration

The `config.yaml` file defines the models available to the proxy. You can configure multiple models with different providers and specific URLs.

Example `config.yaml`:
```yaml
models:
  - name: qwen3:1.7b
    provider: ollama
    url: http://localhost:11434
    cost: 0.0
  - name: ministral-3:8b
    provider: ollama
    url: http://localhost:11435
    cost: 0.0
  - name: claude-3-5-sonnet-20241022
    provider: anthropic
    cost: 0.003
```

- `name`: The model name to be used in API requests.
- `provider`: The provider name (e.g., `openai`, `anthropic`, `ollama`).
- `url`: (Optional) The base URL for the provider (useful for local models like Ollama).
- `cost`: (Optional) Cost per token or request.
## Usage

You can now point any OpenAI-compatible client to your proxy.

Base URL: `http://localhost:8000/v1`
```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

To use this with Continue, add the following to your `config.json`:
```json
{
  "models": [
    {
      "title": "Cascadeflow Proxy",
      "provider": "openai",
      "model": "gpt-4o-mini",
      "apiBase": "http://localhost:8000/v1",
      "apiKey": "EMPTY"
    }
  ]
}
```

Note: The `apiKey` field is required by some clients but is ignored by the proxy, which reads provider credentials from server-side environment variables.
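For providers that do need credentials (e.g. the `anthropic` model in the example config above), export the keys before starting the proxy. A sketch, assuming cascadeflow picks up the usual provider environment variables:

```bash
# Assumed variable names; consult the cascadeflow docs for the exact ones.
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
```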
## API Endpoints

- `POST /v1/chat/completions`: Handles chat completion requests. Supports both streaming (`"stream": true`) and non-streaming modes, as shown in the example below.
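For example, a streaming request (assuming the proxy emits standard OpenAI-style SSE chunks terminated by `data: [DONE]`):

```bash
# -N disables curl's output buffering so SSE chunks print as they arrive.
curl -N http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
```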