Zero dependencies. Maximum performance. One unified API.
Installation • Quick Start • Providers • Features • Docs
| Feature | ArcLLM | Others |
|---|---|---|
| Dependencies | 0 (stdlib only) | 10-50+ packages |
| Install size | ~100KB | 50-200MB |
| Cold start | ~10ms | 500ms-2s |
| API | OpenAI-compatible | Varies |
ArcLLM is built for developers who want speed, simplicity, and reliability when working with LLMs.
## Installation

```bash
pip install arcllm
```

That's it. No dependency hell. No version conflicts. Just works.
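ArcLLM reads provider credentials from environment variables (one per provider; see the table in the Providers section). For example, if you are calling OpenAI:

```shell
# Set the key for whichever provider you plan to call
export OPENAI_API_KEY="sk-..."
```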
## Quick Start

```python
import arcllm

# Simple completion
response = arcllm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

### Streaming

```python
stream = arcllm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about coding"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

### Async

```python
response = await arcllm.acompletion(
    model="anthropic/claude-3-5-sonnet-latest",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)
```

## Providers

```python
# OpenAI
response = arcllm.completion(model="gpt-4o", messages=messages)

# Anthropic
response = arcllm.completion(model="anthropic/claude-3-5-sonnet-latest", messages=messages)

# Google Gemini
response = arcllm.completion(model="gemini/gemini-1.5-pro", messages=messages)

# Groq (ultra-fast inference)
response = arcllm.completion(model="groq/llama-3.3-70b-versatile", messages=messages)

# Local with Ollama
response = arcllm.completion(model="ollama/llama3.2", messages=messages)
```

| Provider | Prefix | Models | Environment Variable |
|---|---|---|---|
| OpenAI | openai/ | GPT-4o, GPT-4, o1, o3 | OPENAI_API_KEY |
| Anthropic | anthropic/ | Claude 3.5, Claude 3 | ANTHROPIC_API_KEY |
| Google Gemini | gemini/ | Gemini 1.5, Gemini 2.0 | GEMINI_API_KEY |
| Azure OpenAI | azure/ | GPT-4o, GPT-4 | AZURE_OPENAI_API_KEY |
| AWS Bedrock | bedrock/ | Claude, Llama, Titan | AWS credentials |
| Google Vertex | vertex_ai/ | Gemini, PaLM | GOOGLE_ACCESS_TOKEN |
| Mistral | mistral/ | Mistral Large, Codestral | MISTRAL_API_KEY |
| Groq | groq/ | Llama 3.3, Mixtral | GROQ_API_KEY |
| Together AI | together_ai/ | Llama, Mixtral, Qwen | TOGETHER_API_KEY |
| Fireworks | fireworks_ai/ | Llama, Mixtral | FIREWORKS_API_KEY |
| DeepSeek | deepseek/ | DeepSeek V3, Coder | DEEPSEEK_API_KEY |
| Perplexity | perplexity/ | Sonar, Online | PERPLEXITY_API_KEY |
| Cohere | cohere/ | Command R+ | COHERE_API_KEY |
| Databricks | databricks/ | DBRX, Llama | DATABRICKS_TOKEN |
| Ollama | ollama/ | Any local model | (local) |
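The prefix convention is purely string-based: everything before the first `/` selects the provider, and a bare model name falls through to a default. A minimal sketch of that routing idea (hypothetical, not ArcLLM's actual internals):

```python
def split_model(model: str, default_provider: str = "openai") -> tuple[str, str]:
    """Split 'provider/model-name' into (provider, model-name).

    A bare name like 'gpt-4o' is routed to the default provider.
    """
    provider, sep, name = model.partition("/")
    if not sep:  # no prefix present
        return default_provider, model
    return provider, name

print(split_model("anthropic/claude-3-5-sonnet-latest"))  # ('anthropic', 'claude-3-5-sonnet-latest')
print(split_model("gpt-4o"))                              # ('openai', 'gpt-4o')
```

Using `partition` rather than `split` keeps multi-slash names (e.g. `together_ai/meta-llama/...`) intact after the provider prefix.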
## Features

### Tool Calling

```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }
}]

response = arcllm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)

if response.choices[0].message.tool_calls:
    for tool_call in response.choices[0].message.tool_calls:
        print(f"Call: {tool_call.function.name}({tool_call.function.arguments})")
```
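In the OpenAI-compatible format, `tool_call.function.arguments` is a JSON string, so executing a requested tool comes down to decoding it and dispatching by name. A sketch of that glue code (the `get_weather` implementation here is a stand-in, not part of ArcLLM):

```python
import json

def get_weather(location: str) -> str:
    # Stand-in implementation; a real tool would call a weather API.
    return f"Sunny in {location}"

TOOLS = {"get_weather": get_weather}

def run_tool(name: str, arguments: str) -> str:
    """Decode the model's JSON arguments and call the matching local function."""
    kwargs = json.loads(arguments)
    return TOOLS[name](**kwargs)

print(run_tool("get_weather", '{"location": "Tokyo"}'))  # Sunny in Tokyo
```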
### Structured Output

```python
response = arcllm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Generate a user profile"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "user_profile",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                    "interests": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["name", "age"]
            }
        }
    }
)
```
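With a `json_schema` response format, the message content comes back as a JSON document matching the schema, so it can be decoded directly with the standard library. A small sketch with a sample payload (illustrative, not a real API response):

```python
import json

# Example payload shaped like the user_profile schema above
content = '{"name": "Ada", "age": 36, "interests": ["math", "engines"]}'

profile = json.loads(content)
print(profile["name"], profile["age"])  # Ada 36
```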
### Vision

```python
response = arcllm.completion(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
    }]
)
```

### Embeddings

```python
response = arcllm.embedding(
    model="text-embedding-3-small",
    input=["Hello world", "Goodbye world"]
)
print(f"Dimensions: {len(response.data[0].embedding)}")
```
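Embedding vectors are typically compared with cosine similarity, which needs nothing beyond the standard library. A self-contained sketch (the vectors here are toy values, not real embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```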
### Cost Tracking

```python
response = arcllm.completion(model="gpt-4o", messages=messages)

# Calculate cost
cost = arcllm.completion_cost(response)
print(f"Cost: ${cost:.6f}")

# Or get per-token pricing
input_cost, output_cost = arcllm.cost_per_token(
    model="gpt-4o",
    prompt_tokens=1000,
    completion_tokens=500
)
```
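Per-token pricing is just arithmetic over per-million-token rates, so `cost_per_token` results are easy to sanity-check by hand. The rates below are illustrative, not ArcLLM's actual pricing table:

```python
def manual_cost(prompt_tokens: int, completion_tokens: int,
                input_per_million: float, output_per_million: float) -> float:
    """Recompute a request's cost from hypothetical per-million-token rates."""
    return (prompt_tokens * input_per_million
            + completion_tokens * output_per_million) / 1_000_000

# 1000 prompt tokens at $2.50/M plus 500 completion tokens at $10.00/M
print(f"${manual_cost(1000, 500, 2.50, 10.00):.6f}")  # $0.007500
```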
### Model Capabilities

```python
# Check what models can do
arcllm.supports_vision("gpt-4o")                    # True
arcllm.supports_vision("gpt-3.5-turbo")             # False
arcllm.supports_tools("claude-3-5-sonnet-latest")   # True
arcllm.supports_pdf_input("gemini-1.5-pro")         # True
arcllm.get_max_tokens("gpt-4o")                     # 16384
```
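Capability checks like these make it easy to gate features at call time, e.g. picking the first candidate model that can handle images. A sketch with a stubbed-out checker (in real code you would pass `arcllm.supports_vision`):

```python
from typing import Callable, Optional

def first_vision_model(candidates: list[str],
                       supports_vision: Callable[[str], bool]) -> Optional[str]:
    """Return the first model in the list that the checker says handles images."""
    for model in candidates:
        if supports_vision(model):
            return model
    return None

# Stub checker standing in for arcllm.supports_vision
stub = {"gpt-3.5-turbo": False, "gpt-4o": True}.get
print(first_vision_model(["gpt-3.5-turbo", "gpt-4o"], stub))  # gpt-4o
```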
### Error Handling

```python
from arcllm import (
    ArcLLMError,
    AuthenticationError,
    RateLimitError,
    TimeoutError,
)

try:
    response = arcllm.completion(model="gpt-4o", messages=messages)
except AuthenticationError:
    print("Check your API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except TimeoutError:
    print("Request timed out")
except ArcLLMError as e:
    print(f"Error: {e.message}")
```
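The typed exceptions compose naturally with a retry loop, e.g. backing off on rate limits while letting auth errors fail fast. A generic sketch, written against plain callables so it runs without ArcLLM (with ArcLLM you would pass `retryable=RateLimitError` and could seed the delay from `e.retry_after`):

```python
import time

def call_with_backoff(fn, retryable, max_attempts=5, base_delay=0.1):
    """Call fn(), retrying on `retryable` exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Demo: a flaky callable that succeeds on its third call
class Flaky:
    def __init__(self):
        self.calls = 0
    def __call__(self):
        self.calls += 1
        if self.calls < 3:
            raise TimeoutError("transient")
        return "ok"

print(call_with_backoff(Flaky(), retryable=TimeoutError, base_delay=0.0))  # ok
```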
### Configuration

```python
# Per-request configuration
response = arcllm.completion(
    model="gpt-4o",
    messages=messages,
    api_key="sk-...",          # Override API key
    api_base="https://...",    # Custom endpoint
    timeout=120.0,             # Request timeout
    max_retries=5,             # Retry count
)

# Azure OpenAI
response = arcllm.completion(
    model="azure/my-deployment",
    messages=messages,
    api_base="https://myresource.openai.azure.com",
    api_version="2024-10-21",
)
```

## Migration from LiteLLM

ArcLLM is designed as a drop-in replacement:
```python
# Before
import litellm
response = litellm.completion(model="gpt-4o", messages=messages)

# After
import arcllm
response = arcllm.completion(model="gpt-4o", messages=messages)

# Or alias it
import arcllm as litellm
response = litellm.completion(model="gpt-4o", messages=messages)
```

An arc is the shortest path between two points. ArcLLM is the shortest path between your code and any LLM provider: minimal, direct, efficient.
Apache 2.0 - see LICENSE
Built with ❤️ for developers who value simplicity