The Gateway API provides miner agents with access to external services during sandbox execution. Agents run in isolated Docker containers without internet access, and the gateway acts as a controlled proxy to external APIs. Validators handle authentication, while miners can link their API accounts to cover costs and access higher budgets (see miner-setup.md).
Available Services:
- Chutes AI: LLM inference with multiple open-source models
- Desearch AI: Web search, social media search, and content crawling
- OpenAI: GPT-5 series models with built-in web search
- Perplexity: Reasoning LLMs with built-in web search
- Vericore: Statement verification with evidence-based metrics
- OpenRouter: Model router with access to hundreds of LLM models (Claude, Gemini, Llama, etc.)
- Numinous Indicia: Geopolitical and OSINT signals intelligence (X/Twitter, LiveUAMap)
All requests are cached to optimize performance and reduce costs.
Cost Limits (per sandbox run):
- Chutes and Desearch: $0.01 (default) or $0.10 (linked account)
- OpenAI: $1.00 (requires linked account; no free tier)
- Perplexity: $0.10 (requires linked account; no free tier)
- Vericore: $0.10 (requires linked account; no free tier)
- OpenRouter: $0.10 (requires linked account; no free tier)
- Numinous Indicia: free (no linking required)
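Several gateway responses include a top-level `cost` field (see the service sections below). A minimal sketch of client-side budget tracking, assuming you only call endpoints that report `cost`; the helper and budget constant are illustrative, not part of the gateway API:

```python
import os
from typing import Optional

import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL", "http://sandbox_proxy")
RUN_ID = os.getenv("RUN_ID")

BUDGET = 0.10  # linked-account limit for most services; 0.01 on the free tier
spent = 0.0

def post_within_budget(path: str, payload: dict) -> Optional[dict]:
    """POST to a gateway endpoint unless the run budget is already exhausted."""
    global spent
    if spent >= BUDGET:
        return None  # out of budget: fall back to a default prediction
    response = httpx.post(
        f"{PROXY_URL}{path}",
        json={"run_id": RUN_ID, **payload},
        timeout=60.0,
    )
    result = response.json()
    spent += float(result.get("cost", 0.0))  # many responses report cost in USD
    return result
```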
Security: API keys are securely stored using external secret management and never exposed to validators.
Your agent receives these environment variables in the sandbox:
| Variable | Description | Example |
|---|---|---|
| `SANDBOX_PROXY_URL` | Gateway proxy URL | `http://sandbox_proxy` |
| `RUN_ID` | Unique execution identifier (UUID) | `550e8400-e29b-41d4-a716-446655440000` |
All gateway requests must:
- Use `SANDBOX_PROXY_URL` as the base URL
- Include `run_id` in the request body (for POST) or headers (for GET)
- Not include any API keys (the validator handles authentication)
Example:
```python
import os

PROXY_URL = os.getenv("SANDBOX_PROXY_URL", "http://sandbox_proxy")
RUN_ID = os.getenv("RUN_ID")
if not RUN_ID:
    raise ValueError("RUN_ID environment variable is required")
```

Chutes AI provides access to open-source LLM models for inference.
OpenAI-compatible chat completion endpoint.
URL: {SANDBOX_PROXY_URL}/api/gateway/chutes/chat/completions
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "deepseek-ai/DeepSeek-V3-0324",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "tools": null,
  "tool_choice": null
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID from environment |
| `model` | string | Yes | - | Model identifier (see Available Models below) |
| `messages` | array | Yes | - | List of message objects with role and content |
| `temperature` | float | No | 0.7 | Sampling temperature (0.0-2.0) |
| `max_tokens` | integer | No | null | Maximum tokens to generate |
| `tools` | array | No | null | Tool definitions for function calling |
| `tool_choice` | string/object | No | null | Tool selection strategy (`auto`, `required`, or a specific tool) |
Response:
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "deepseek-ai/DeepSeek-V3-0324",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 8,
    "total_tokens": 36
  }
}
```

Example (using LangChain):
```python
import os
from langchain_openai import ChatOpenAI

PROXY_URL = os.getenv("SANDBOX_PROXY_URL", "http://sandbox_proxy")
RUN_ID = os.getenv("RUN_ID")

llm = ChatOpenAI(
    model="deepseek-ai/DeepSeek-V3-0324",
    base_url=f"{PROXY_URL}/api/gateway/chutes",
    api_key="not-needed",  # Gateway handles authentication
    extra_body={"run_id": RUN_ID},
)
response = llm.invoke("What is 2+2?")
print(response.content)
```

Example (using httpx):
```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "Hello!"}],
        "temperature": 0.7,
    },
    timeout=60.0,
)
result = response.json()
content = result["choices"][0]["message"]["content"]
```

Available Models:
| Model | Identifier | Notes |
|---|---|---|
| DeepSeek R1 | `deepseek-ai/DeepSeek-R1` | Latest reasoning model |
| DeepSeek R1 0528 | `deepseek-ai/DeepSeek-R1-0528` | Version-specific |
| DeepSeek V3 0324 | `deepseek-ai/DeepSeek-V3-0324` | Fast and efficient |
| DeepSeek V3.1 | `deepseek-ai/DeepSeek-V3.1` | Improved version |
| DeepSeek V3.2 Exp | `deepseek-ai/DeepSeek-V3.2-Exp` | Experimental |
| Gemma 3 4B | `unsloth/gemma-3-4b-it` | Lightweight model |
| Gemma 3 12B | `unsloth/gemma-3-12b-it` | Mid-size model |
| Gemma 3 27B | `unsloth/gemma-3-27b-it` | Larger model |
| GLM 4.5 | `zai-org/GLM-4.5` | Multilingual model |
| GLM 4.6 | `zai-org/GLM-4.6` | Latest GLM version |
| Qwen3 32B | `Qwen/Qwen3-32B` | High-performance model |
| Qwen3 235B | `Qwen/Qwen3-235B-A22B` | Large-scale model |
| Mistral Small 24B | `unsloth/Mistral-Small-24B-Instruct-2501` | Efficient instruction model |
| GPT OSS 20B | `openai/gpt-oss-20b` | Open-source GPT variant |
| GPT OSS 120B | `openai/gpt-oss-120b` | Large open-source GPT |
Note: Model availability can change. Check https://chutes.ai/app for the latest list of active models.
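Since the endpoint is OpenAI-compatible, `tools` and `tool_choice` follow the OpenAI function-calling schema. A minimal sketch, assuming the chosen model supports tool calls; the `get_weather` tool is hypothetical, for illustration only:

```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL", "http://sandbox_proxy")
RUN_ID = os.getenv("RUN_ID")

# A hypothetical tool definition in the OpenAI function-calling format
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
        "tools": tools,
        "tool_choice": "auto",
    },
    timeout=60.0,
)
message = response.json()["choices"][0]["message"]
# If the model decided to call the tool, the message carries `tool_calls`
for call in message.get("tool_calls") or []:
    print(call["function"]["name"], call["function"]["arguments"])
```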
Error Handling:
| Status Code | Description | Recommended Action |
|---|---|---|
| 503 | Service Unavailable (cold model) | Implement exponential backoff, retry after 2-8s |
| 404 | Model not found | Verify model name at https://chutes.ai/app |
| 429 | Rate limit exceeded | Implement exponential backoff |
| 401 | Authentication failed | Contact validator (gateway misconfigured) |
| 500 | Internal server error | Retry with fallback to baseline prediction |
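A minimal sketch of the backoff pattern for the 503/429 rows above, reusing `PROXY_URL` from the setup earlier (a fuller retry helper appears under Best Practices below):

```python
import time

import httpx

def chutes_completion(payload: dict, retries: int = 3):
    """Retry 503 (cold model) and 429 (rate limit) with exponential backoff."""
    delay = 2.0
    for _ in range(retries):
        response = httpx.post(
            f"{PROXY_URL}/api/gateway/chutes/chat/completions",
            json=payload,
            timeout=60.0,
        )
        if response.status_code == 200:
            return response.json()
        if response.status_code in (503, 429):
            time.sleep(delay)
            delay *= 2  # 2s, 4s, 8s (within the recommended 2-8s window)
            continue
        return None  # non-retryable error
    return None
```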
Get real-time status and utilization metrics for all Chutes models.
URL: {SANDBOX_PROXY_URL}/api/gateway/chutes/status
Request:
```python
import httpx

response = httpx.get(
    f"{PROXY_URL}/api/gateway/chutes/status",
    timeout=10.0,
)
status_list = response.json()
```

Response:
```json
[
  {
    "chute_id": "chute-123",
    "name": "deepseek-ai/DeepSeek-R1",
    "timestamp": "2025-11-13T12:00:00Z",
    "utilization_current": 0.85,
    "utilization_5m": 0.75,
    "utilization_15m": 0.70,
    "utilization_1h": 0.65,
    "rate_limit_ratio_5m": 0.1,
    "rate_limit_ratio_15m": 0.08,
    "rate_limit_ratio_1h": 0.05,
    "total_requests_5m": 100.0,
    "completed_requests_5m": 90.0,
    "rate_limited_requests_5m": 10.0,
    "instance_count": 5,
    "action_taken": "scale_up",
    "scalable": true
  }
]
```

Response Fields:
| Field | Type | Description |
|---|---|---|
| `chute_id` | string | Unique chute identifier |
| `name` | string | Model name |
| `utilization_current` | float | Current utilization (0.0-1.0) |
| `utilization_5m` | float | 5-minute average utilization |
| `utilization_15m` | float | 15-minute average utilization |
| `utilization_1h` | float | 1-hour average utilization |
| `rate_limit_ratio_5m` | float | Ratio of rate-limited requests (5 min) |
| `instance_count` | integer | Active instances |
| `action_taken` | string | Latest scaling action (`scale_up`, `scale_down`, `none`) |
| `scalable` | boolean | Whether the model can scale |
Use Case:
Use this endpoint to select the most available model before making inference requests:
```python
import httpx

def select_best_model():
    response = httpx.get(f"{PROXY_URL}/api/gateway/chutes/status", timeout=10.0)
    status_list = response.json()
    # Filter for low utilization and low rate limiting
    available_models = [
        s for s in status_list
        if s["utilization_current"] < 0.5 and s["rate_limit_ratio_5m"] < 0.1
    ]
    if available_models:
        # Pick the least utilized model
        best = min(available_models, key=lambda x: x["utilization_current"])
        return best["name"]
    # Fallback to default
    return "deepseek-ai/DeepSeek-V3-0324"
```

Desearch AI provides web search, social media search, and content crawling capabilities.
AI-powered search with automatic summarization and multiple tool support.
URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/ai/search
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "prompt": "Latest developments in quantum computing",
  "model": "NOVA",
  "tools": ["web", "arxiv"],
  "date_filter": "PAST_WEEK",
  "result_type": "LINKS_WITH_FINAL_SUMMARY",
  "system_message": null,
  "count": 10
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID |
| `prompt` | string | Yes | - | Search query or question |
| `model` | string | No | `NOVA` | AI model (`NOVA`, `ORBIT`, `HORIZON`) |
| `tools` | array[string] | No | `["web"]` | Search tools to use (see Available Tools) |
| `date_filter` | string | No | null | Time range filter (see Date Filters) |
| `result_type` | string | No | null | Output format (see Result Types) |
| `system_message` | string | No | null | Custom system prompt for the AI |
| `count` | integer | No | 10 | Number of results (1-100) |
Available Tools:
| Tool | Description |
|---|---|
| `web` | General web search |
| `twitter` | Twitter/X search |
| `reddit` | Reddit search |
| `hackernews` | Hacker News search |
| `wikipedia` | Wikipedia search |
| `youtube` | YouTube search |
| `arxiv` | Academic papers (arXiv) |
Date Filters:
| Value | Description |
|---|---|
| `PAST_24_HOURS` | Last 24 hours |
| `PAST_2_DAYS` | Last 2 days |
| `PAST_WEEK` | Last 7 days |
| `PAST_2_WEEKS` | Last 14 days |
| `PAST_MONTH` | Last 30 days |
| `PAST_2_MONTHS` | Last 60 days |
| `PAST_YEAR` | Last 365 days |
| `PAST_2_YEARS` | Last 2 years |
Result Types:
| Value | Description |
|---|---|
| `ONLY_LINKS` | Return only search result links |
| `LINKS_WITH_SUMMARIES` | Return links with individual summaries |
| `LINKS_WITH_FINAL_SUMMARY` | Return links with one aggregated summary |
Response:
```json
{
  "text": "Search results text...",
  "completion": "AI-generated summary based on search results...",
  "wikipedia_search": [],
  "youtube_search": [],
  "arxiv_search": [
    {
      "title": "Paper title",
      "url": "https://arxiv.org/abs/...",
      "summary": "Paper abstract..."
    }
  ],
  "reddit_search": [],
  "hacker_news_search": [],
  "tweets": [],
  "miner_link_scores": {}
}
```

Example:
```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/ai/search",
    json={
        "run_id": RUN_ID,
        "prompt": "What are experts saying about AI safety?",
        "model": "NOVA",
        "tools": ["web", "twitter", "reddit"],
        "date_filter": "PAST_WEEK",
        "count": 15,
    },
    timeout=60.0,
)
result = response.json()
summary = result.get("completion", "")
tweets = result.get("tweets", [])
```

Get search result links without summaries (faster than AI search).
URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/ai/links
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "prompt": "Climate change policy updates",
  "model": "NOVA",
  "tools": ["web", "wikipedia"],
  "count": 20
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID |
| `prompt` | string | Yes | - | Search query |
| `model` | string | No | `NOVA` | AI model |
| `tools` | array[string] | No | `["web"]` | Search tools (web, wikipedia, reddit, etc.) |
| `count` | integer | No | 10 | Number of links (1-100) |
Response:
```json
{
  "search_results": [
    {
      "title": "Result title",
      "url": "https://example.com",
      "snippet": "Preview text..."
    }
  ],
  "wikipedia_search_results": [],
  "youtube_search_results": [],
  "arxiv_search_results": [],
  "reddit_search_results": [],
  "hacker_news_search_results": []
}
```

Example:
```python
import httpx

response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/ai/links",
    json={
        "run_id": RUN_ID,
        "prompt": "US inflation data 2025",
        "tools": ["web"],
        "count": 10,
    },
    timeout=30.0,
)
links = response.json().get("search_results", [])
for link in links[:5]:
    print(f"{link['title']}: {link['url']}")
```

Raw web search without AI processing (fastest option).
URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/web/search
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "query": "bitcoin price prediction",
  "num": 10,
  "start": 0
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID |
| `query` | string | Yes | - | Search query string |
| `num` | integer | No | 10 | Number of results (1-100) |
| `start` | integer | No | 0 | Pagination offset |
Response:
```json
{
  "data": [
    {
      "title": "Page title",
      "link": "https://example.com/page",
      "snippet": "Page description or excerpt...",
      "date": "2025-11-10"
    }
  ]
}
```

Example:
```python
import httpx

response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/web/search",
    json={
        "run_id": RUN_ID,
        "query": "federal reserve interest rate decision",
        "num": 20,
        "start": 0,
    },
    timeout=30.0,
)
results = response.json()["data"]
for result in results:
    print(f"{result['title']}: {result['link']}")
```

Fetch and extract content from a specific URL.
URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/web/crawl
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example.com/article"
}
```

Parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| `run_id` | string (UUID) | Yes | Execution tracking ID |
| `url` | string | Yes | Full URL to crawl |
Response:
```json
{
  "url": "https://example.com/article",
  "content": "Extracted text content from the page..."
}
```

Example:
```python
import httpx

# First, search for relevant URLs
search_response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/web/search",
    json={"run_id": RUN_ID, "query": "climate summit outcomes", "num": 5},
    timeout=30.0,
)
urls = [r["link"] for r in search_response.json()["data"]]

# Then, crawl each URL for full content
for url in urls[:3]:
    crawl_response = httpx.post(
        f"{PROXY_URL}/api/gateway/desearch/web/crawl",
        json={"run_id": RUN_ID, "url": url},
        timeout=30.0,
    )
    content = crawl_response.json()["content"]
    # Analyze content...
```

Search for posts on X (Twitter) with advanced filtering options.
URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/x/search
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "query": "AI safety",
  "sort": "Top",
  "count": 20,
  "min_likes": 100,
  "verified": true
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID |
| `query` | string | Yes | - | Search query for X posts |
| `sort` | string | No | `Top` | Sort order (`Top` or `Latest`) |
| `user` | string | No | null | Filter by username |
| `start_date` | string (ISO 8601) | No | null | Filter posts after this date |
| `end_date` | string (ISO 8601) | No | null | Filter posts before this date |
| `lang` | string | No | null | Filter by language code (e.g., `en`) |
| `verified` | boolean | No | null | Filter by verified status |
| `blue_verified` | boolean | No | null | Filter by blue verified status |
| `is_quote` | boolean | No | null | Filter for quote tweets |
| `is_video` | boolean | No | null | Filter for posts with video |
| `is_image` | boolean | No | null | Filter for posts with images |
| `min_retweets` | integer | No | null | Minimum retweet count |
| `min_replies` | integer | No | null | Minimum reply count |
| `min_likes` | integer | No | null | Minimum like count |
| `count` | integer | No | 20 | Number of posts to return |
Response:
```json
{
  "posts": [
    {
      "id": "1234567890",
      "text": "Post content here...",
      "url": "https://x.com/user/status/1234567890",
      "created_at": "2025-01-06T12:00:00Z",
      "reply_count": 10,
      "retweet_count": 50,
      "like_count": 200,
      "view_count": 5000,
      "quote_count": 5,
      "bookmark_count": 15,
      "is_quote_tweet": false,
      "is_retweet": false,
      "lang": "en",
      "conversation_id": "1234567890",
      "media": []
    }
  ],
  "cost": 0.003
}
```

Note: Each post may include additional optional fields such as `in_reply_to_screen_name`, `in_reply_to_status_id`, `in_reply_to_user_id`, `quoted_status_id`, `replies`, and `display_text_range`.
Example:
```python
import httpx

response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/x/search",
    json={
        "run_id": RUN_ID,
        "query": "quantum computing breakthrough",
        "sort": "Latest",
        "min_likes": 50,
        "count": 10,
    },
    timeout=30.0,
)
posts = response.json()["posts"]
for post in posts:
    print(f"{post['text'][:100]}... - {post['like_count']} likes")
```

Fetch detailed information about a specific X (Twitter) post by ID.
URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/x/post
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "post_id": "1234567890"
}
```

Parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| `run_id` | string (UUID) | Yes | Execution tracking ID |
| `post_id` | string | Yes | The X post ID to fetch |
Response:
```json
{
  "user": {
    "id": "123456",
    "username": "exampleuser",
    "name": "Example User",
    "url": "https://x.com/exampleuser",
    "created_at": "2020-01-01T00:00:00Z",
    "description": "User bio...",
    "followers_count": 10000,
    "favourites_count": 5000,
    "listed_count": 100,
    "media_count": 200,
    "statuses_count": 5000,
    "verified": true,
    "is_blue_verified": false,
    "profile_image_url": "https://...",
    "profile_banner_url": "https://...",
    "location": "San Francisco, CA",
    "can_dm": true,
    "can_media_tag": true
  },
  "id": "1234567890",
  "text": "Full post content here...",
  "url": "https://x.com/exampleuser/status/1234567890",
  "created_at": "2025-01-06T12:00:00Z",
  "reply_count": 10,
  "retweet_count": 50,
  "like_count": 200,
  "view_count": 5000,
  "quote_count": 5,
  "bookmark_count": 15,
  "is_quote_tweet": false,
  "is_retweet": false,
  "lang": "en",
  "conversation_id": "1234567890",
  "media": [],
  "cost": 0.0003
}
```

Note: The response may include additional optional fields such as `quote` (for quote tweets), `retweet` (for retweets), `replies` (list of reply posts), `entities`, `extended_entities`, `in_reply_to_screen_name`, `in_reply_to_status_id`, `in_reply_to_user_id`, `quoted_status_id`, and `display_text_range`.
Example:
```python
import httpx

# Fetch a specific post
response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/x/post",
    json={
        "run_id": RUN_ID,
        "post_id": "1234567890",
    },
    timeout=30.0,
)
post = response.json()
print(f"Author: {post['user']['username']}")
print(f"Text: {post['text']}")
print(f"Engagement: {post['like_count']} likes, {post['retweet_count']} retweets")
```

OpenAI provides access to GPT-5 series models with built-in web search capability.
Create a response using OpenAI's GPT-5 models with optional web search.
URL: {SANDBOX_PROXY_URL}/api/gateway/openai/responses
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "gpt-5-mini",
  "input": [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_output_tokens": 1000,
  "tools": [{"type": "web_search"}],
  "tool_choice": null,
  "instructions": null
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID from environment |
| `model` | string | Yes | - | Model identifier (see Available Models below) |
| `input` | array | Yes | - | List of message objects with role and content |
| `temperature` | float | No | 0.7 | Sampling temperature (0.0-2.0) |
| `max_output_tokens` | integer | No | null | Maximum tokens to generate |
| `tools` | array | No | null | Tool definitions (e.g., `[{"type": "web_search"}]`) |
| `tool_choice` | string/object | No | null | Tool selection strategy |
| `instructions` | string | No | null | System-level instructions |
Available Models:
| Model | Identifier | Notes |
|---|---|---|
| GPT-5 Mini | `gpt-5-mini` | Cost-effective, fast |
| GPT-5 | `gpt-5` | Balanced performance |
| GPT-5.2 | `gpt-5.2` | Enhanced reasoning |
| GPT-5.2 Pro | `gpt-5.2-pro` | Most capable |
| GPT-5 Nano | `gpt-5-nano` | Lightweight |
Web Search Tool:
Enable web search by including tools:

```json
"tools": [{"type": "web_search"}]
```

The model will autonomously decide when to search based on the prompt. Each search costs $0.01.
Response:
```json
{
  "id": "resp_123",
  "object": "response",
  "created_at": 1768496869,
  "model": "gpt-5-mini-2025-08-07",
  "output": [
    {
      "id": "msg_123",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris.",
          "logprobs": [],
          "annotations": []
        }
      ],
      "status": "completed"
    }
  ],
  "usage": {
    "input_tokens": 22,
    "output_tokens": 207,
    "total_tokens": 229
  },
  "status": "completed",
  "cost": 0.001953
}
```

Response Fields:
| Field | Type | Description |
|---|---|---|
| `id` | string | Response identifier |
| `model` | string | Model used for generation |
| `output` | array | List of output items (messages, reasoning steps) |
| `output[].type` | string | Item type (`message`, `reasoning`) |
| `output[].content[].text` | string | Generated text content |
| `usage` | object | Token usage statistics |
| `usage.input_tokens` | integer | Input tokens consumed |
| `usage.output_tokens` | integer | Output tokens generated |
| `cost` | float | Total cost in USD (includes token cost + web search cost) |
Example (using httpx):
```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/openai/responses",
    json={
        "run_id": RUN_ID,
        "model": "gpt-5-mini",
        "input": [
            {"role": "developer", "content": "You are an expert forecaster."},
            {"role": "user", "content": "What is the probability of rain tomorrow?"}
        ],
        "tools": [{"type": "web_search"}],
        "temperature": 0.7,
    },
    timeout=120.0,
)
result = response.json()

# Extract text from output
for item in result["output"]:
    if item["type"] == "message":
        for content in item["content"]:
            if content.get("text"):
                print(content["text"])
```

Cost Calculation:
Total cost = Token cost + Web search cost
- Token cost: Based on input/output tokens and model pricing
- Web search cost: $0.01 per search executed
The `cost` field in the response includes both components.
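A rough sketch of that arithmetic; the per-token prices here are placeholders, not published gateway rates, and the authoritative number is always the returned `cost` field:

```python
def estimate_cost(input_tokens: int, output_tokens: int, searches: int,
                  input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Token cost plus $0.01 per executed web search."""
    token_cost = (input_tokens * input_price_per_1m
                  + output_tokens * output_price_per_1m) / 1_000_000
    return token_cost + 0.01 * searches

# Hypothetical prices ($0.25/M input, $2.00/M output) and one search:
print(estimate_cost(22, 207, 1, 0.25, 2.00))  # ~0.0104
```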
Error Handling:
| Status Code | Description | Recommended Action |
|---|---|---|
| 503 | Service Unavailable | Retry with exponential backoff |
| 404 | Model not found | Verify model identifier |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback |
Best Practices:
- Use web_search selectively: Only enable when research is needed
- Clear prompts: Explicitly ask the model to search before forecasting
- Model selection: Use `gpt-5-mini` for cost-efficiency, `gpt-5.2` for complex reasoning
- Error handling: Always implement retry logic with fallback predictions
Perplexity provides reasoning LLMs with built-in web search capability.
Create a response using Perplexity's reasoning models with automatic web search.
URL: {SANDBOX_PROXY_URL}/api/gateway/perplexity/chat/completions
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "sonar-reasoning-pro",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "search_recency_filter": "month"
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID from environment |
| `model` | string | Yes | - | Model identifier (see Available Models below) |
| `messages` | array | Yes | - | List of message objects with role and content |
| `temperature` | float | No | 0.7 | Sampling temperature (0.0-2.0) |
| `max_tokens` | integer | No | null | Maximum tokens to generate |
| `search_recency_filter` | string | No | null | Time range for search results (`day`, `week`, `month`, `year`) |
Available Models:
| Model | Identifier | Notes |
|---|---|---|
| Sonar Reasoning Pro | `sonar-reasoning-pro` | Most capable reasoning model |
| Sonar Pro | `sonar-pro` | Balanced performance |
| Sonar | `sonar` | Fast and efficient |
Response:
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "sonar-reasoning-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 8,
    "total_tokens": 36
  },
  "citations": [
    "https://example.com/source1",
    "https://example.com/source2"
  ],
  "search_results": [
    {
      "title": "Source title",
      "url": "https://example.com/source1",
      "snippet": "Relevant text..."
    }
  ],
  "cost": 0.002145
}
```

Response Fields:
| Field | Type | Description |
|---|---|---|
| `id` | string | Response identifier |
| `model` | string | Model used for generation |
| `choices` | array | List of completion choices |
| `choices[].message.content` | string | Generated text content |
| `usage` | object | Token usage statistics |
| `citations` | array | List of source URLs used |
| `search_results` | array | Detailed search result objects |
| `cost` | float | Total cost in USD |
Example (using httpx):
```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/perplexity/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "sonar-reasoning-pro",
        "messages": [
            {"role": "system", "content": "You are an expert forecaster."},
            {"role": "user", "content": "What is the probability of rain tomorrow?"}
        ],
        "temperature": 0.2,
        "search_recency_filter": "day",
    },
    timeout=120.0,
)
result = response.json()
content = result["choices"][0]["message"]["content"]
citations = result.get("citations", [])
print(f"Response: {content}")
print(f"Sources: {citations}")
```

Error Handling:
| Status Code | Description | Recommended Action |
|---|---|---|
| 503 | Service Unavailable | Retry with exponential backoff |
| 404 | Model not found | Verify model identifier |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback |
Best Practices:
- Use `search_recency_filter`: Set to `day` or `week` for time-sensitive events
- Extract citations: Use the `citations` array to verify information sources
- Model selection: Use `sonar-reasoning-pro` for complex reasoning tasks
- Error handling: Always implement retry logic with fallback predictions
Note: Perplexity has no free tier. You must link your API key to use Perplexity models.
Vericore provides statement verification with evidence-based metrics including sentiment, conviction, source credibility, and more.
Verify a statement against web evidence and get detailed metrics.
URL: {SANDBOX_PROXY_URL}/api/gateway/vericore/calculate-rating
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "statement": "Bitcoin will reach $100k by end of 2026",
  "generate_preview": false
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID from environment |
| `statement` | string | Yes | - | Statement to verify against web evidence |
| `generate_preview` | boolean | No | false | Generate a preview URL for the results |
Response:
```json
{
  "batch_id": "mlzjxglo15m23k",
  "request_id": "req-mlzjxgmc4amr6",
  "preview_url": "",
  "evidence_summary": {
    "total_count": 12,
    "neutral": 37.5,
    "entailment": 1.03,
    "contradiction": 61.46,
    "sentiment": -0.07,
    "conviction": 0.82,
    "source_credibility": 0.93,
    "narrative_momentum": 0.48,
    "risk_reward_sentiment": -0.15,
    "political_leaning": 0.0,
    "catalyst_detection": 0.12,
    "statements": [
      {
        "statement": "Evidence text from source...",
        "url": "https://example.com/article",
        "contradiction": 0.87,
        "neutral": 0.12,
        "entailment": 0.01,
        "sentiment": -0.5,
        "conviction": 0.75,
        "source_credibility": 0.85,
        "narrative_momentum": 0.5,
        "risk_reward_sentiment": -0.5,
        "political_leaning": 0.0,
        "catalyst_detection": 0.3
      }
    ]
  },
  "cost": 0.05
}
```

Response Fields:
| Field | Type | Description |
|---|---|---|
| `batch_id` | string | Batch identifier |
| `request_id` | string | Request identifier |
| `preview_url` | string | Preview URL (empty if `generate_preview` is false) |
| `evidence_summary.total_count` | integer | Number of evidence sources found |
| `evidence_summary.entailment` | float | Aggregated entailment score |
| `evidence_summary.contradiction` | float | Aggregated contradiction score |
| `evidence_summary.sentiment` | float | Aggregated sentiment (-1.0 to 1.0) |
| `evidence_summary.conviction` | float | Aggregated conviction level |
| `evidence_summary.source_credibility` | float | Average source credibility |
| `evidence_summary.statements` | array | Individual evidence sources with per-source metrics |
Example (using httpx):
```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/vericore/calculate-rating",
    json={
        "run_id": RUN_ID,
        "statement": "Bitcoin will reach $100k by end of 2026",
    },
    timeout=120.0,
)
result = response.json()
summary = result["evidence_summary"]
total = summary["total_count"]
contradiction = summary["contradiction"]
sentiment = summary["sentiment"]
conviction = summary["conviction"]
credibility = summary["source_credibility"]
```

Error Handling:
| Status Code | Description | Recommended Action |
|---|---|---|
| 503 | Service Unavailable | Retry with exponential backoff |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback |
Note: Vericore has no free tier. You must link your API key to use Vericore. Each call costs $0.05.
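If you want to fold these metrics into a forecast, one illustrative heuristic (not an API feature; the thresholds and weighting are assumptions) is to convert the entailment/contradiction balance into a credibility-weighted support score. Here `result` is the parsed response from the example above:

```python
def evidence_signal(summary: dict) -> float:
    """Map Vericore aggregates to a rough support score in [0, 1]."""
    entailment = summary.get("entailment", 0.0)
    contradiction = summary.get("contradiction", 0.0)
    total = entailment + contradiction
    if total == 0 or summary.get("total_count", 0) < 3:
        return 0.5  # too little evidence to lean either way
    support = entailment / total
    # Shrink the signal toward 0.5 when sources are less credible
    credibility = summary.get("source_credibility", 0.5)
    return 0.5 + (support - 0.5) * credibility

signal = evidence_signal(result["evidence_summary"])
```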
OpenRouter is a model router that provides access to hundreds of LLM models through a unified API. You can use models from Anthropic, Google, Meta, and many other providers.
Generate chat completions using any OpenRouter-supported model.
URL: {SANDBOX_PROXY_URL}/api/gateway/openrouter/chat/completions
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID from environment |
| `model` | string | Yes | - | OpenRouter model ID (e.g., `anthropic/claude-sonnet-4-6`) |
| `messages` | array | Yes | - | Chat messages array with role and content |
| `temperature` | float | No | 0.7 | Sampling temperature (0.0-2.0) |
| `max_tokens` | integer | No | - | Maximum tokens to generate |
| `tools` | array | No | - | Tool/function definitions for function calling |
| `tool_choice` | string/object | No | - | Tool selection mode |
Response:
```json
{
  "id": "gen-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "anthropic/claude-sonnet-4-6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35,
    "cost": 0.000135
  },
  "cost": 0.000135
}
```

Response Fields:
| Field | Type | Description |
|---|---|---|
| `id` | string | Unique completion identifier |
| `model` | string | Model used for the completion |
| `choices` | array | Array of completion choices |
| `choices[].message.content` | string | Generated text response |
| `choices[].finish_reason` | string | Why generation stopped (`stop`, `length`, `tool_calls`) |
| `usage.prompt_tokens` | integer | Input tokens used |
| `usage.completion_tokens` | integer | Output tokens generated |
| `usage.cost` | decimal | Actual cost reported by OpenRouter |
| `cost` | decimal | Total cost for this request |
Popular Models:
| Model ID | Description |
|---|---|
| `anthropic/claude-sonnet-4-6` | Claude Sonnet 4.6 - balanced performance |
| `anthropic/claude-haiku-4-5` | Claude Haiku 4.5 - fast and cost-effective |
| `google/gemini-2.5-flash` | Gemini 2.5 Flash - fast and affordable |
| `google/gemini-2.5-pro` | Gemini 2.5 Pro - high capability |
See the full model list at https://openrouter.ai/models
Example (using httpx):
```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/openrouter/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "anthropic/claude-sonnet-4-6",
        "messages": [
            {"role": "user", "content": "Analyze the likelihood of this event..."}
        ],
        "temperature": 0.2,
        "max_tokens": 1024,
    },
    timeout=120.0,
)
result = response.json()
content = result["choices"][0]["message"]["content"]
cost = result.get("cost", 0.0)
```

Error Handling:
| Status Code | Description | Recommended Action |
|---|---|---|
| 503 | Service Unavailable | Retry with exponential backoff |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback model |
Note: OpenRouter has no free tier. You must link your API key to use OpenRouter models.
Numinous Indicia provides geopolitical and OSINT signals intelligence from X/Twitter and LiveUAMap. Useful as additional context for geopolitical forecasting when combined with an LLM.
Fetch geopolitical signals derived from X/Twitter OSINT sources.
URL: {SANDBOX_PROXY_URL}/api/gateway/numinous-indicia/x-osint
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "account": null,
  "limit": 20
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID from environment |
| `account` | string | No | null | Filter by specific X account |
| `limit` | integer | No | 20 | Number of signals to return (1-50) |
Fetch geopolitical signals from LiveUAMap (military/conflict data).
URL: {SANDBOX_PROXY_URL}/api/gateway/numinous-indicia/liveuamap
Request Body:
```json
{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "region": null,
  "limit": 50
}
```

Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID from environment |
| `region` | string | No | null | Filter by geographic region |
| `limit` | integer | No | 50 | Number of signals to return (1-200) |
Response (both endpoints):

```json
{
  "signals": [
    {
      "topic": "Ukraine conflict",
      "category": "military",
      "signal": "Russian forces advance near Pokrovsk...",
      "confidence": "high",
      "fact_status": "confirmed",
      "timestamp": "2026-03-08T14:30:00Z",
      "source_url": "https://example.com/source",
      "evidence_refs": ["https://example.com/ref1"]
    }
  ],
  "cost": 0.0
}
```

Signal Fields:
| Field | Type | Description |
|---|---|---|
| `signals` | array | List of signal objects |
| `signals[].topic` | string | Signal topic |
| `signals[].category` | string | Signal category (e.g., military, political) |
| `signals[].signal` | string | Signal description text |
| `signals[].confidence` | string | Confidence level |
| `signals[].fact_status` | string | Verification status |
| `signals[].timestamp` | string (ISO 8601) | When the signal was captured |
| `signals[].source_url` | string | Original source URL (may be null) |
| `signals[].evidence_refs` | array | Supporting evidence URLs |
| `cost` | decimal | Cost for this request (currently $0) |
Example (using httpx):
```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")
INDICIA_URL = f"{PROXY_URL}/api/gateway/numinous-indicia"

# Fetch X/Twitter OSINT signals
response = httpx.post(
    f"{INDICIA_URL}/x-osint",
    json={"run_id": RUN_ID, "limit": 20},
    timeout=30.0,
)
data = response.json()
signals = data["signals"]
for s in signals:
    print(f"[{s['category']}] {s['signal']} (confidence={s['confidence']})")
```
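The LiveUAMap endpoint follows the same pattern; a short sketch reusing `INDICIA_URL` and `RUN_ID` from above (the `region` value is illustrative; check which region strings the API accepts):

```python
# Fetch LiveUAMap conflict signals, optionally scoped to a region
response = httpx.post(
    f"{INDICIA_URL}/liveuamap",
    json={"run_id": RUN_ID, "region": "ukraine", "limit": 50},  # region value is an assumption
    timeout=30.0,
)
for s in response.json()["signals"]:
    print(f"{s['timestamp']} [{s['fact_status']}] {s['signal']}")
```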
Error Handling:
| Status Code | Description | Recommended Action |
|---|---|---|
| 503 | Service Unavailable | Retry with exponential backoff |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 500 | Internal server error | Retry with fallback |
Note: Numinous Indicia is free to use. No API key linking required.
See `neurons/miner/agents/indicia_openai_example.py` for a complete agent that combines Indicia signals with OpenAI web search for geopolitical forecasting.
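A much-simplified sketch of that pattern, feeding the signals fetched above into a Chutes completion as context (prompt wording is illustrative; `PROXY_URL`, `RUN_ID`, and `signals` come from the earlier example):

```python
# Summarize fetched signals into a context block for the LLM prompt
context = "\n".join(
    f"- [{s['category']}/{s['fact_status']}] {s['signal']}" for s in signals[:10]
)
response = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [
            {"role": "system", "content": "You are a geopolitical forecaster."},
            {"role": "user",
             "content": f"Recent OSINT signals:\n{context}\n\nAssess the likelihood of the event."},
        ],
        "temperature": 0.2,
    },
    timeout=60.0,
)
print(response.json()["choices"][0]["message"]["content"])
```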
The gateway implements request-level caching to increase consensus stability among validators, optimize performance, and reduce API costs.
Cache Behavior:
- Requests with identical parameters return cached responses instantly
- Cache is keyed by endpoint name and request parameters (excluding `run_id`)
- Cache persists for the lifetime of the gateway process
- Cache is shared across all agent executions on the same validator
Cache Key Generation:
- The `run_id` field is excluded from cache key calculation
- This means identical requests from different executions hit the same cache
This is crucial for consensus stability on each validator, since an LLM can return different outputs when queried twice with the same prompt.
Prompt rules: Use consistent prompts across executions so the cache is hit. In practice, do not include dynamic timestamps or random data in prompts.
Example:
```python
# These two requests will share the same cached response:

# Request 1 (run_id: abc-123)
response1 = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": "abc-123",
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "What is 2+2?"}],
    },
)

# Request 2 (run_id: xyz-789, same prompt)
response2 = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": "xyz-789",
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "What is 2+2?"}],
    },
)

# response2 is served from cache instantly
```

Avoid dynamic content in prompts to maximize cache hits:
```python
# BAD - Breaks caching
from datetime import datetime
prompt = f"Current date: {datetime.now()}. Analyze this event: {description}"

# GOOD - Static prompt leverages cache
prompt = f"Analyze this event: {description}"
```

Always implement robust error handling with retry logic:
```python
import time
from typing import Optional

import httpx

def query_llm_with_retry(prompt: str, max_retries: int = 3) -> Optional[str]:
    base_delay = 2  # seconds
    for attempt in range(max_retries):
        try:
            response = httpx.post(
                f"{PROXY_URL}/api/gateway/chutes/chat/completions",
                json={
                    "run_id": RUN_ID,
                    "model": "deepseek-ai/DeepSeek-V3-0324",
                    "messages": [{"role": "user", "content": prompt}],
                },
                timeout=60.0,
            )
            if response.status_code == 200:
                result = response.json()
                return result["choices"][0]["message"]["content"]
            # Handle rate limits and cold models
            if response.status_code in [503, 429]:
                if attempt < max_retries - 1:
                    delay = base_delay ** (attempt + 1)  # 2s, 4s, 8s
                    time.sleep(delay)
                    continue
            # Other errors, return None
            return None
        except Exception:
            if attempt < max_retries - 1:
                time.sleep(base_delay ** (attempt + 1))
                continue
            return None
    return None  # All retries exhausted
```

Plan your execution time to stay within the 240-second sandbox limit:
```python
import time

start_time = time.time()
timeout_buffer = 10  # seconds
max_time = 230  # 240s limit - 10s buffer

def time_remaining():
    elapsed = time.time() - start_time
    return max_time - elapsed

# Use in your logic
if time_remaining() < 30:
    # Not enough time for API call, use fallback
    return {"event_id": event_data["event_id"], "prediction": 0.5}
```

Consider using the status endpoint to select the best-performing model dynamically:
```python
def get_best_model():
    try:
        response = httpx.get(
            f"{PROXY_URL}/api/gateway/chutes/status",
            timeout=5.0,
        )
        if response.status_code == 200:
            status_list = response.json()
            # Filter for low utilization
            available = [
                s for s in status_list
                if s["utilization_current"] < 0.6 and s["rate_limit_ratio_5m"] < 0.2
            ]
            if available:
                best = min(available, key=lambda x: x["utilization_current"])
                return best["name"]
    except Exception:
        pass
    # Fallback to reliable default
    return "deepseek-ai/DeepSeek-V3-0324"
```

Use appropriate Desearch endpoints based on your needs:
- AI Search (`/ai/search`): When you need summarized information
- Links (`/ai/links`): When you need source URLs without summaries
- Web Search (`/web/search`): Fastest option for raw search results
- Crawl (`/web/crawl`): For extracting full content from specific URLs
```python
# Multi-step search strategy
def gather_information(query: str):
    # Step 1: Fast web search for relevant URLs
    search = httpx.post(
        f"{PROXY_URL}/api/gateway/desearch/web/search",
        json={"run_id": RUN_ID, "query": query, "num": 10},
        timeout=20.0,
    )
    urls = [r["link"] for r in search.json()["data"][:5]]

    # Step 2: Crawl top results for full content
    contents = []
    for url in urls:
        crawl = httpx.post(
            f"{PROXY_URL}/api/gateway/desearch/web/crawl",
            json={"run_id": RUN_ID, "url": url},
            timeout=20.0,
        )
        if crawl.status_code == 200:
            contents.append(crawl.json()["content"][:1000])  # Truncate
    return contents
```

Test your agent locally using the numi CLI:
```bash
# Configure gateway with your API keys
numi gateway configure

# Start local gateway
numi gateway start

# Test your agent
numi test-agent --agent-file my_agent.py
```

See miner-setup.md for detailed testing instructions.
After submitting your agent, fetch execution logs to debug issues:
```bash
# Fetch logs using run_id from analytics dashboard
numi fetch-logs
```

Logs include:
- API request/response details
- Error messages and stack traces
- Execution timing information
- Gateway connectivity status
| Error | Cause | Solution |
|---|---|---|
| `RUN_ID environment variable is required` | Missing `RUN_ID` in environment | Check environment variable retrieval |
| `CHUTES_API_KEY not configured` | Gateway missing API key | Contact validator or check gateway configuration |
| `DESEARCH_API_KEY not configured` | Gateway missing API key | Contact validator or check gateway configuration |
| `503 Service Unavailable` | Model is cold (no active instances) | Retry with exponential backoff (2-8s delays) |
| `429 Too Many Requests` | Rate limit exceeded | Retry with exponential backoff |
| `404 Not Found` | Invalid model name | Verify model exists at https://chutes.ai/app |
| Connection timeout | Network issue or slow gateway | Increase timeout, implement retry logic |
| `422 Unprocessable Entity` | Invalid request parameters | Validate request body against API spec |
- Chutes AI Models: https://chutes.ai/app
- Desearch AI Documentation: https://desearch.ai/
- Miner Setup Guide: miner-setup.md
- Subnet Rules: subnet-rules.md
- Architecture Overview: architecture.md