# Mercury Client

Python SDK for the Inception Labs Mercury diffusion-LLM API.

## Features

- 🚀 Full API Coverage: Chat completions with streaming support
- 🔄 Async/Sync Support: Both synchronous and asynchronous clients
- 🛡️ Type Safety: Full type hints with Pydantic models
- 🔁 Retry Logic: Built-in exponential backoff for reliability
- 📝 Comprehensive Logging: Detailed logging for debugging
- ⚡ Streaming: Real-time response streaming for chat completions
## Installation

```bash
pip install mercury-api-client
```

Or install from source:
```bash
git clone https://github.com/hamzaamjad/mercury-client.git
cd mercury-client
pip install -e .
```

## Authentication

First, obtain your API key from Inception Labs and set it as an environment variable:
```bash
# Using export (Linux/macOS)
export MERCURY_API_KEY="sk_your_api_key_here"

# Using set (Windows)
set MERCURY_API_KEY=sk_your_api_key_here

# Or add to your shell profile (.bashrc, .zshrc, etc.)
echo 'export MERCURY_API_KEY="sk_your_api_key_here"' >> ~/.bashrc
```

Alternatively, you can pass the API key directly when initializing the client:
```python
from mercury_client import MercuryClient

client = MercuryClient(api_key="sk_your_api_key_here")
```

## Quick Start
```python
from mercury_client import MercuryClient

# Initialize the client (uses MERCURY_API_KEY env var by default)
client = MercuryClient()

# Create a chat completion
response = client.chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is a diffusion model?"}
    ],
    model="mercury-coder-small"
)

print(response.choices[0].message.content)
```
## Async Usage

```python
import asyncio

from mercury_client import AsyncMercuryClient

async def main():
    async with AsyncMercuryClient() as client:
        response = await client.chat_completion(
            messages=[
                {"role": "user", "content": "Explain quantum computing"}
            ]
        )
        print(response.choices[0].message.content)

asyncio.run(main())
```
## Streaming

```python
# Synchronous streaming
for chunk in client.chat_completion_stream(
    messages=[{"role": "user", "content": "Write a story about AI"}],
    max_tokens=1000
):
    if chunk.choices[0].delta and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
```python
# Async streaming (inside an async function, with AsyncMercuryClient)
async for chunk in client.chat_completion_stream(
    messages=[{"role": "user", "content": "Write a poem"}]
):
    if chunk.choices[0].delta and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
## Fill-in-the-Middle Completions

```python
response = client.fim_completion(
    prompt="def fibonacci(",
    suffix=" return a + b",
    max_tokens=100
)
print(response.choices[0].text)
```
## Retry Configuration

```python
from mercury_client import MercuryClient, RetryConfig

retry_config = RetryConfig(
    max_retries=5,
    initial_delay=2.0,
    max_delay=30.0,
    exponential_base=2.0,
    jitter=True
)

client = MercuryClient(
    api_key="your-api-key",
    retry_config=retry_config,
    timeout=60.0  # Request timeout in seconds
)
```
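Assuming the usual capped exponential schedule (`delay = initial_delay * exponential_base ** attempt`, capped at `max_delay`; the SDK's exact formula and jitter range are not documented here), the delays produced by the configuration above can be sketched as:

```python
import random

def backoff_delays(max_retries=5, initial_delay=2.0, max_delay=30.0,
                   exponential_base=2.0, jitter=False):
    """Delay before each retry attempt, capped at max_delay (sketch)."""
    delays = []
    for attempt in range(max_retries):
        delay = min(initial_delay * exponential_base ** attempt, max_delay)
        if jitter:
            delay *= random.uniform(0.5, 1.5)  # hypothetical jitter range
        delays.append(delay)
    return delays

print(backoff_delays())  # [2.0, 4.0, 8.0, 16.0, 30.0]
```

Note how the cap kicks in on the last attempt (2 × 2⁴ = 32 → 30.0); jitter, when enabled, spreads retries out to avoid thundering-herd effects.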
## Error Handling

```python
from mercury_client import MercuryClient
from mercury_client.exceptions import (
    AuthenticationError,
    RateLimitError,
    ServerError,
    EngineOverloadedError,
)

client = MercuryClient()

try:
    response = client.chat_completion(
        messages=[{"role": "user", "content": "Hello"}]
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after {e.retry_after} seconds")
except EngineOverloadedError:
    print("Service is temporarily overloaded")
except ServerError:
    print("Server error occurred")
```
## Tool Calling

```python
from mercury_client.models import Tool, FunctionDefinition

tools = [
    Tool(
        type="function",
        function=FunctionDefinition(
            name="get_weather",
            description="Get current weather for a location",
            parameters={
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        )
    )
]

response = client.chat_completion(
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)
```

## Environment Variables

- `MERCURY_API_KEY` - Your Mercury API key (primary)
- `INCEPTION_API_KEY` - Alternative environment variable (for backward compatibility)
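A plausible sketch of this lookup order (explicit argument first, then the primary variable, then the legacy one; the SDK's actual resolution logic may differ, and `resolve_api_key` is an illustrative helper, not an SDK function):

```python
import os

def resolve_api_key(explicit=None):
    """Explicit argument wins; otherwise fall back to the documented variables in order."""
    return (explicit
            or os.environ.get("MERCURY_API_KEY")
            or os.environ.get("INCEPTION_API_KEY"))

os.environ.pop("MERCURY_API_KEY", None)
os.environ["INCEPTION_API_KEY"] = "sk_legacy"
print(resolve_api_key())             # sk_legacy  (backward-compatible fallback)

os.environ["MERCURY_API_KEY"] = "sk_primary"
print(resolve_api_key())             # sk_primary (primary variable wins)
print(resolve_api_key("sk_direct"))  # sk_direct  (explicit argument wins)
```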
## API Reference

### Client Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` | `None` | API key for authentication |
| `base_url` | `str` | `https://api.inceptionlabs.ai/v1` | Base URL for the API |
| `timeout` | `float` | `30.0` | Request timeout in seconds |
| `retry_config` | `RetryConfig` | Default config | Retry behavior configuration |
### Methods

- `chat_completion()` - Create a chat completion
- `chat_completion_stream()` - Create a streaming chat completion
- `fim_completion()` - Create a fill-in-the-middle completion (coming soon)
- `close()` - Close the HTTP client (also supports context manager)
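Because the client supports the context-manager protocol, `close()` rarely needs to be called by hand. The contract can be illustrated with a stub (`StubClient` is a hypothetical stand-in for this sketch, not the SDK class):

```python
class StubClient:
    """Stand-in for MercuryClient; illustrates the close()/context-manager contract only."""

    def __init__(self):
        self.closed = False

    def close(self):
        # In the real client this would release the underlying HTTP connections.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions

with StubClient() as client:
    pass  # make requests here

print(client.closed)  # True
```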
### Models

All models are fully typed with Pydantic:

- `ChatCompletionRequest` / `ChatCompletionResponse`
- `FIMCompletionRequest` / `FIMCompletionResponse`
- `Message`, `Tool`, `ToolCall`, `Usage`, etc.
### Exceptions

- `MercuryAPIError` - Base exception for all API errors
- `AuthenticationError` - Invalid or missing API key (401)
- `RateLimitError` - Rate limit exceeded (429)
- `ServerError` - Server error (500)
- `EngineOverloadedError` - Service overloaded (503)
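The status codes listed suggest a dispatch along these lines (a sketch of the pattern; the SDK's internal mapping logic is an assumption, though the class hierarchy matches the list above):

```python
class MercuryAPIError(Exception):
    """Base exception for all API errors."""

class AuthenticationError(MercuryAPIError): ...
class RateLimitError(MercuryAPIError): ...
class ServerError(MercuryAPIError): ...
class EngineOverloadedError(MercuryAPIError): ...

STATUS_TO_ERROR = {
    401: AuthenticationError,
    429: RateLimitError,
    500: ServerError,
    503: EngineOverloadedError,
}

def raise_for_status(status):
    """Raise the mapped exception for an error status; do nothing on success."""
    error = STATUS_TO_ERROR.get(status)
    if error is not None:
        raise error(f"HTTP {status}")

try:
    raise_for_status(429)
except MercuryAPIError as e:  # the base class catches every mapped error
    print(type(e).__name__)   # RateLimitError
```

Catching `MercuryAPIError` is the broad net; catch the subclasses (as in the error-handling example above) when each status needs different treatment.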
## Development

```bash
# Clone the repository
git clone https://github.com/hamzaamjad/mercury-client.git
cd mercury-client

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install
```
### Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=mercury_client --cov-report=html

# Run specific test file
pytest tests/test_client.py

# Run integration tests (requires MERCURY_API_KEY)
pytest tests/test_integration.py -v -m integration
```
### Code Quality

```bash
# Format code
black mercury_client tests

# Sort imports
isort mercury_client tests

# Type checking
mypy mercury_client

# Linting
ruff check mercury_client
```

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
Please ensure:
- All tests pass
- Code is formatted with Black
- Type hints are added for new code
- Documentation is updated
## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Support

- 📧 Email: hamza@example.com
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions