Gateway API Reference

Overview

The Gateway API provides miner agents with access to external services during sandbox execution. Agents run in isolated Docker containers without internet access, and the gateway acts as a controlled proxy to external APIs. Validators handle authentication, while miners can link their API accounts to cover costs and access higher budgets (see miner-setup.md).

Available Services:

  • Chutes AI: LLM inference with multiple open-source models
  • Desearch AI: Web search, social media search, and content crawling
  • OpenAI: GPT-5 series models with built-in web search
  • Perplexity: Reasoning LLMs with built-in web search
  • Vericore: Statement verification with evidence-based metrics
  • OpenRouter: Model router with access to hundreds of LLM models (Claude, Gemini, Llama, etc.)
  • Numinous Indicia: Geopolitical and OSINT signals intelligence (X/Twitter, LiveUAMap)

All requests are cached to optimize performance and reduce costs.

Cost Limits (per sandbox run):

  • Chutes and Desearch: $0.01 (default) or $0.10 (linked account)
  • OpenAI: $1.00 (requires linked account; no free tier)
  • Perplexity: $0.10 (requires linked account; no free tier)
  • Vericore: $0.10 (requires linked account; no free tier)
  • OpenRouter: $0.10 (requires linked account; no free tier)
  • Numinous Indicia: free (no linking required)

Security: API keys are securely stored using external secret management and never exposed to validators.


Authentication

Environment Variables

Your agent receives these environment variables in the sandbox:

| Variable | Description | Example |
| --- | --- | --- |
| SANDBOX_PROXY_URL | Gateway proxy URL | http://sandbox_proxy |
| RUN_ID | Unique execution identifier (UUID) | 550e8400-e29b-41d4-a716-446655440000 |

Request Requirements

All gateway requests must:

  1. Use SANDBOX_PROXY_URL as the base URL
  2. Include run_id in the request body (for POST) or headers (for GET)
  3. Not include any API keys (validator handles authentication)

Example:

import os

PROXY_URL = os.getenv("SANDBOX_PROXY_URL", "http://sandbox_proxy")
RUN_ID = os.getenv("RUN_ID")

if not RUN_ID:
    raise ValueError("RUN_ID environment variable is required")

Chutes AI Endpoints

Chutes AI provides access to open-source LLM models for inference.

POST /api/gateway/chutes/chat/completions

OpenAI-compatible chat completion endpoint.

URL: {SANDBOX_PROXY_URL}/api/gateway/chutes/chat/completions

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "deepseek-ai/DeepSeek-V3-0324",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "tools": null,
  "tool_choice": null
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID from environment |
| model | string | Yes | - | Model identifier (see Available Models below) |
| messages | array | Yes | - | List of message objects with role and content |
| temperature | float | No | 0.7 | Sampling temperature (0.0-2.0) |
| max_tokens | integer | No | null | Maximum tokens to generate |
| tools | array | No | null | Tool definitions for function calling |
| tool_choice | string/object | No | null | Tool selection strategy (auto, required, or specific tool) |

Response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "deepseek-ai/DeepSeek-V3-0324",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 8,
    "total_tokens": 36
  }
}

Example (using LangChain):

import os
from langchain_openai import ChatOpenAI

PROXY_URL = os.getenv("SANDBOX_PROXY_URL", "http://sandbox_proxy")
RUN_ID = os.getenv("RUN_ID")

llm = ChatOpenAI(
    model="deepseek-ai/DeepSeek-V3-0324",
    base_url=f"{PROXY_URL}/api/gateway/chutes",
    api_key="not-needed",  # Gateway handles authentication
    extra_body={"run_id": RUN_ID},
)

response = llm.invoke("What is 2+2?")
print(response.content)

Example (using httpx):

import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "Hello!"}],
        "temperature": 0.7,
    },
    timeout=60.0,
)

result = response.json()
content = result["choices"][0]["message"]["content"]
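
The tools and tool_choice parameters follow the OpenAI-compatible function-calling schema. The sketch below shows the general shape of a tool-calling request and of parsing any returned tool_calls; the tool name get_price is a hypothetical example, and whether a given Chutes model emits tool calls is model-dependent, so treat this as a sketch to verify.

import json
import httpx

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_price",  # hypothetical tool, for illustration only
            "description": "Look up the latest price for an asset.",
            "parameters": {
                "type": "object",
                "properties": {"symbol": {"type": "string"}},
                "required": ["symbol"],
            },
        },
    }
]

response = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "What is BTC trading at?"}],
        "tools": tools,
        "tool_choice": "auto",
    },
    timeout=60.0,
)

message = response.json()["choices"][0]["message"]
for call in message.get("tool_calls") or []:
    args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
    print(call["function"]["name"], args)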

Available Models:

| Model | Identifier | Notes |
| --- | --- | --- |
| DeepSeek R1 | deepseek-ai/DeepSeek-R1 | Latest reasoning model |
| DeepSeek R1 0528 | deepseek-ai/DeepSeek-R1-0528 | Version-specific |
| DeepSeek V3 0324 | deepseek-ai/DeepSeek-V3-0324 | Fast and efficient |
| DeepSeek V3.1 | deepseek-ai/DeepSeek-V3.1 | Improved version |
| DeepSeek V3.2 Exp | deepseek-ai/DeepSeek-V3.2-Exp | Experimental |
| Gemma 3 4B | unsloth/gemma-3-4b-it | Lightweight model |
| Gemma 3 12B | unsloth/gemma-3-12b-it | Mid-size model |
| Gemma 3 27B | unsloth/gemma-3-27b-it | Larger model |
| GLM 4.5 | zai-org/GLM-4.5 | Multilingual model |
| GLM 4.6 | zai-org/GLM-4.6 | Latest GLM version |
| Qwen3 32B | Qwen/Qwen3-32B | High-performance model |
| Qwen3 235B | Qwen/Qwen3-235B-A22B | Large-scale model |
| Mistral Small 24B | unsloth/Mistral-Small-24B-Instruct-2501 | Efficient instruction model |
| GPT OSS 20B | openai/gpt-oss-20b | Open-source GPT variant |
| GPT OSS 120B | openai/gpt-oss-120b | Large open-source GPT |

Note: Model availability can change. Check https://chutes.ai/app for the latest list of active models.

Error Handling:

| Status Code | Description | Recommended Action |
| --- | --- | --- |
| 503 | Service Unavailable (cold model) | Implement exponential backoff, retry after 2-8s |
| 404 | Model not found | Verify model name at https://chutes.ai/app |
| 429 | Rate limit exceeded | Implement exponential backoff |
| 401 | Authentication failed | Contact validator (gateway misconfigured) |
| 500 | Internal server error | Retry with fallback to baseline prediction |
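
For the retryable cases (503 and 429), a minimal backoff loop might look like the sketch below; a fuller retry helper appears under Best Practices near the end of this document.

import time
import httpx

for attempt in range(3):
    response = httpx.post(
        f"{PROXY_URL}/api/gateway/chutes/chat/completions",
        json={
            "run_id": RUN_ID,
            "model": "deepseek-ai/DeepSeek-V3-0324",
            "messages": [{"role": "user", "content": "Hello!"}],
        },
        timeout=60.0,
    )
    if response.status_code not in (503, 429):
        break
    time.sleep(2 ** (attempt + 1))  # 2s, 4s, 8s backoff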

GET /api/gateway/chutes/status

Get real-time status and utilization metrics for all Chutes models.

URL: {SANDBOX_PROXY_URL}/api/gateway/chutes/status

Request:

import httpx

response = httpx.get(
    f"{PROXY_URL}/api/gateway/chutes/status",
    timeout=10.0,
)
status_list = response.json()

Response:

[
  {
    "chute_id": "chute-123",
    "name": "deepseek-ai/DeepSeek-R1",
    "timestamp": "2025-11-13T12:00:00Z",
    "utilization_current": 0.85,
    "utilization_5m": 0.75,
    "utilization_15m": 0.70,
    "utilization_1h": 0.65,
    "rate_limit_ratio_5m": 0.1,
    "rate_limit_ratio_15m": 0.08,
    "rate_limit_ratio_1h": 0.05,
    "total_requests_5m": 100.0,
    "completed_requests_5m": 90.0,
    "rate_limited_requests_5m": 10.0,
    "instance_count": 5,
    "action_taken": "scale_up",
    "scalable": true
  }
]

Response Fields:

| Field | Type | Description |
| --- | --- | --- |
| chute_id | string | Unique chute identifier |
| name | string | Model name |
| utilization_current | float | Current utilization (0.0-1.0) |
| utilization_5m | float | 5-minute average utilization |
| utilization_15m | float | 15-minute average utilization |
| utilization_1h | float | 1-hour average utilization |
| rate_limit_ratio_5m | float | Ratio of rate-limited requests (5min) |
| instance_count | integer | Active instances |
| action_taken | string | Latest scaling action (scale_up, scale_down, none) |
| scalable | boolean | Whether model can scale |

Use Case:

Use this endpoint to select the most available model before making inference requests:

import httpx

def select_best_model():
    response = httpx.get(f"{PROXY_URL}/api/gateway/chutes/status", timeout=10.0)
    status_list = response.json()

    # Filter for low utilization and low rate limiting
    available_models = [
        s for s in status_list
        if s["utilization_current"] < 0.5 and s["rate_limit_ratio_5m"] < 0.1
    ]

    if available_models:
        # Pick the least utilized model
        best = min(available_models, key=lambda x: x["utilization_current"])
        return best["name"]

    # Fallback to default
    return "deepseek-ai/DeepSeek-V3-0324"

Desearch AI Endpoints

Desearch AI provides web search, social media search, and content crawling capabilities.

POST /api/gateway/desearch/ai/search

AI-powered search with automatic summarization and multiple tool support.

URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/ai/search

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "prompt": "Latest developments in quantum computing",
  "model": "NOVA",
  "tools": ["web", "arxiv"],
  "date_filter": "PAST_WEEK",
  "result_type": "LINKS_WITH_FINAL_SUMMARY",
  "system_message": null,
  "count": 10
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID |
| prompt | string | Yes | - | Search query or question |
| model | string | No | NOVA | AI model (NOVA, ORBIT, HORIZON) |
| tools | array[string] | No | ["web"] | Search tools to use (see Available Tools) |
| date_filter | string | No | null | Time range filter (see Date Filters) |
| result_type | string | No | null | Output format (see Result Types) |
| system_message | string | No | null | Custom system prompt for AI |
| count | integer | No | 10 | Number of results (1-100) |

Available Tools:

| Tool | Description |
| --- | --- |
| web | General web search |
| twitter | Twitter/X search |
| reddit | Reddit search |
| hackernews | Hacker News search |
| wikipedia | Wikipedia search |
| youtube | YouTube search |
| arxiv | Academic papers (arXiv) |

Date Filters:

| Value | Description |
| --- | --- |
| PAST_24_HOURS | Last 24 hours |
| PAST_2_DAYS | Last 2 days |
| PAST_WEEK | Last 7 days |
| PAST_2_WEEKS | Last 14 days |
| PAST_MONTH | Last 30 days |
| PAST_2_MONTHS | Last 60 days |
| PAST_YEAR | Last 365 days |
| PAST_2_YEARS | Last 2 years |

Result Types:

| Value | Description |
| --- | --- |
| ONLY_LINKS | Return only search result links |
| LINKS_WITH_SUMMARIES | Return links with individual summaries |
| LINKS_WITH_FINAL_SUMMARY | Return links with one aggregated summary |

Response:

{
  "text": "Search results text...",
  "completion": "AI-generated summary based on search results...",
  "wikipedia_search": [],
  "youtube_search": [],
  "arxiv_search": [
    {
      "title": "Paper title",
      "url": "https://arxiv.org/abs/...",
      "summary": "Paper abstract..."
    }
  ],
  "reddit_search": [],
  "hacker_news_search": [],
  "tweets": [],
  "miner_link_scores": {}
}

Example:

import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/ai/search",
    json={
        "run_id": RUN_ID,
        "prompt": "What are experts saying about AI safety?",
        "model": "NOVA",
        "tools": ["web", "twitter", "reddit"],
        "date_filter": "PAST_WEEK",
        "count": 15,
    },
    timeout=60.0,
)

result = response.json()
summary = result.get("completion", "")
tweets = result.get("tweets", [])

POST /api/gateway/desearch/ai/links

Get search result links without summaries (faster than AI search).

URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/ai/links

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "prompt": "Climate change policy updates",
  "model": "NOVA",
  "tools": ["web", "wikipedia"],
  "count": 20
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID |
| prompt | string | Yes | - | Search query |
| model | string | No | NOVA | AI model |
| tools | array[string] | No | ["web"] | Search tools (web, wikipedia, reddit, etc.) |
| count | integer | No | 10 | Number of links (1-100) |

Response:

{
  "search_results": [
    {
      "title": "Result title",
      "url": "https://example.com",
      "snippet": "Preview text..."
    }
  ],
  "wikipedia_search_results": [],
  "youtube_search_results": [],
  "arxiv_search_results": [],
  "reddit_search_results": [],
  "hacker_news_search_results": []
}

Example:

import httpx

response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/ai/links",
    json={
        "run_id": RUN_ID,
        "prompt": "US inflation data 2025",
        "tools": ["web"],
        "count": 10,
    },
    timeout=30.0,
)

links = response.json().get("search_results", [])
for link in links[:5]:
    print(f"{link['title']}: {link['url']}")

POST /api/gateway/desearch/web/search

Raw web search without AI processing (fastest option).

URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/web/search

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "query": "bitcoin price prediction",
  "num": 10,
  "start": 0
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID |
| query | string | Yes | - | Search query string |
| num | integer | No | 10 | Number of results (1-100) |
| start | integer | No | 0 | Pagination offset |

Response:

{
  "data": [
    {
      "title": "Page title",
      "link": "https://example.com/page",
      "snippet": "Page description or excerpt...",
      "date": "2025-11-10"
    }
  ]
}

Example:

import httpx

response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/web/search",
    json={
        "run_id": RUN_ID,
        "query": "federal reserve interest rate decision",
        "num": 20,
        "start": 0,
    },
    timeout=30.0,
)

results = response.json()["data"]
for result in results:
    print(f"{result['title']}: {result['link']}")

POST /api/gateway/desearch/web/crawl

Fetch and extract content from a specific URL.

URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/web/crawl

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example.com/article"
}

Parameters:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| run_id | string (UUID) | Yes | Execution tracking ID |
| url | string | Yes | Full URL to crawl |

Response:

{
  "url": "https://example.com/article",
  "content": "Extracted text content from the page..."
}

Example:

import httpx

# First, search for relevant URLs
search_response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/web/search",
    json={"run_id": RUN_ID, "query": "climate summit outcomes", "num": 5},
    timeout=30.0,
)
urls = [r["link"] for r in search_response.json()["data"]]

# Then, crawl each URL for full content
for url in urls[:3]:
    crawl_response = httpx.post(
        f"{PROXY_URL}/api/gateway/desearch/web/crawl",
        json={"run_id": RUN_ID, "url": url},
        timeout=30.0,
    )
    content = crawl_response.json()["content"]
    # Analyze content...

POST /api/gateway/desearch/x/search

Search for posts on X (Twitter) with advanced filtering options.

URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/x/search

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "query": "AI safety",
  "sort": "Top",
  "count": 20,
  "min_likes": 100,
  "verified": true
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID |
| query | string | Yes | - | Search query for X posts |
| sort | string | No | Top | Sort order (Top or Latest) |
| user | string | No | null | Filter by username |
| start_date | string (ISO 8601) | No | null | Filter posts after this date |
| end_date | string (ISO 8601) | No | null | Filter posts before this date |
| lang | string | No | null | Filter by language code (e.g., en) |
| verified | boolean | No | null | Filter by verified status |
| blue_verified | boolean | No | null | Filter by blue verified status |
| is_quote | boolean | No | null | Filter for quote tweets |
| is_video | boolean | No | null | Filter for posts with video |
| is_image | boolean | No | null | Filter for posts with images |
| min_retweets | integer | No | null | Minimum retweet count |
| min_replies | integer | No | null | Minimum reply count |
| min_likes | integer | No | null | Minimum like count |
| count | integer | No | 20 | Number of posts to return |

Response:

{
  "posts": [
    {
      "id": "1234567890",
      "text": "Post content here...",
      "url": "https://x.com/user/status/1234567890",
      "created_at": "2025-01-06T12:00:00Z",
      "reply_count": 10,
      "retweet_count": 50,
      "like_count": 200,
      "view_count": 5000,
      "quote_count": 5,
      "bookmark_count": 15,
      "is_quote_tweet": false,
      "is_retweet": false,
      "lang": "en",
      "conversation_id": "1234567890",
      "media": []
    }
  ],
  "cost": 0.003
}

Note: Each post may include additional optional fields such as in_reply_to_screen_name, in_reply_to_status_id, in_reply_to_user_id, quoted_status_id, replies, and display_text_range.

Example:

import httpx

response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/x/search",
    json={
        "run_id": RUN_ID,
        "query": "quantum computing breakthrough",
        "sort": "Latest",
        "min_likes": 50,
        "count": 10,
    },
    timeout=30.0,
)

posts = response.json()["posts"]
for post in posts:
    print(f"{post['text'][:100]}... - {post['like_count']} likes")

POST /api/gateway/desearch/x/post

Fetch detailed information about a specific X (Twitter) post by ID.

URL: {SANDBOX_PROXY_URL}/api/gateway/desearch/x/post

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "post_id": "1234567890"
}

Parameters:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| run_id | string (UUID) | Yes | Execution tracking ID |
| post_id | string | Yes | The X post ID to fetch |

Response:

{
  "user": {
    "id": "123456",
    "username": "exampleuser",
    "name": "Example User",
    "url": "https://x.com/exampleuser",
    "created_at": "2020-01-01T00:00:00Z",
    "description": "User bio...",
    "followers_count": 10000,
    "favourites_count": 5000,
    "listed_count": 100,
    "media_count": 200,
    "statuses_count": 5000,
    "verified": true,
    "is_blue_verified": false,
    "profile_image_url": "https://...",
    "profile_banner_url": "https://...",
    "location": "San Francisco, CA",
    "can_dm": true,
    "can_media_tag": true
  },
  "id": "1234567890",
  "text": "Full post content here...",
  "url": "https://x.com/exampleuser/status/1234567890",
  "created_at": "2025-01-06T12:00:00Z",
  "reply_count": 10,
  "retweet_count": 50,
  "like_count": 200,
  "view_count": 5000,
  "quote_count": 5,
  "bookmark_count": 15,
  "is_quote_tweet": false,
  "is_retweet": false,
  "lang": "en",
  "conversation_id": "1234567890",
  "media": [],
  "cost": 0.0003
}

Note: The response may include additional optional fields such as quote (for quote tweets), retweet (for retweets), replies (list of reply posts), entities, extended_entities, in_reply_to_screen_name, in_reply_to_status_id, in_reply_to_user_id, quoted_status_id, and display_text_range.

Example:

import httpx

# Fetch a specific post
response = httpx.post(
    f"{PROXY_URL}/api/gateway/desearch/x/post",
    json={
        "run_id": RUN_ID,
        "post_id": "1234567890",
    },
    timeout=30.0,
)

post = response.json()
print(f"Author: {post['user']['username']}")
print(f"Text: {post['text']}")
print(f"Engagement: {post['like_count']} likes, {post['retweet_count']} retweets")

OpenAI Endpoints

OpenAI provides access to GPT-5 series models with built-in web search capability.

POST /api/gateway/openai/responses

Create a response using OpenAI's GPT-5 models with optional web search.

URL: {SANDBOX_PROXY_URL}/api/gateway/openai/responses

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "gpt-5-mini",
  "input": [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_output_tokens": 1000,
  "tools": [{"type": "web_search"}],
  "tool_choice": null,
  "instructions": null
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID from environment |
| model | string | Yes | - | Model identifier (see Available Models below) |
| input | array | Yes | - | List of message objects with role and content |
| temperature | float | No | 0.7 | Sampling temperature (0.0-2.0) |
| max_output_tokens | integer | No | null | Maximum tokens to generate |
| tools | array | No | null | Tool definitions (e.g., [{"type": "web_search"}]) |
| tool_choice | string/object | No | null | Tool selection strategy |
| instructions | string | No | null | System-level instructions |

Available Models:

| Model | Identifier | Notes |
| --- | --- | --- |
| GPT-5 Mini | gpt-5-mini | Cost-effective, fast |
| GPT-5 | gpt-5 | Balanced performance |
| GPT-5.2 | gpt-5.2 | Enhanced reasoning |
| GPT-5.2 Pro | gpt-5.2-pro | Most capable |
| GPT-5 Nano | gpt-5-nano | Lightweight |

Web Search Tool:

Enable web search by including tools:

"tools": [{"type": "web_search"}]

The model will autonomously decide when to search based on the prompt. Each search costs $0.01.

Response:

{
  "id": "resp_123",
  "object": "response",
  "created_at": 1768496869,
  "model": "gpt-5-mini-2025-08-07",
  "output": [
    {
      "id": "msg_123",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris.",
          "logprobs": [],
          "annotations": []
        }
      ],
      "status": "completed"
    }
  ],
  "usage": {
    "input_tokens": 22,
    "output_tokens": 207,
    "total_tokens": 229
  },
  "status": "completed",
  "cost": 0.001953
}

Response Fields:

| Field | Type | Description |
| --- | --- | --- |
| id | string | Response identifier |
| model | string | Model used for generation |
| output | array | List of output items (messages, reasoning steps) |
| output[].type | string | Item type (message, reasoning) |
| output[].content[].text | string | Generated text content |
| usage | object | Token usage statistics |
| usage.input_tokens | integer | Input tokens consumed |
| usage.output_tokens | integer | Output tokens generated |
| cost | float | Total cost in USD (includes token cost + web search cost) |

Example (using httpx):

import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/openai/responses",
    json={
        "run_id": RUN_ID,
        "model": "gpt-5-mini",
        "input": [
            {"role": "developer", "content": "You are an expert forecaster."},
            {"role": "user", "content": "What is the probability of rain tomorrow?"}
        ],
        "tools": [{"type": "web_search"}],
        "temperature": 0.7,
    },
    timeout=120.0,
)

result = response.json()

# Extract text from output
for item in result["output"]:
    if item["type"] == "message":
        for content in item["content"]:
            if content.get("text"):
                print(content["text"])

Cost Calculation:

Total cost = Token cost + Web search cost

  • Token cost: Based on input/output tokens and model pricing
  • Web search cost: $0.01 per search executed

The cost field in the response includes both components.
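
As a worked illustration, the sketch below recomputes a total cost from token counts and search count. The per-token rates are hypothetical placeholders, not OpenAI's actual pricing; only the $0.01 web search cost is documented above.

INPUT_RATE = 0.25 / 1_000_000   # assumed $ per input token (hypothetical)
OUTPUT_RATE = 2.00 / 1_000_000  # assumed $ per output token (hypothetical)
WEB_SEARCH_COST = 0.01          # $ per executed search (documented above)

def estimate_cost(input_tokens: int, output_tokens: int, searches: int) -> float:
    token_cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
    return token_cost + searches * WEB_SEARCH_COST

# e.g. 22 input tokens, 207 output tokens, one web search:
print(round(estimate_cost(22, 207, 1), 6))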

Error Handling:

| Status Code | Description | Recommended Action |
| --- | --- | --- |
| 503 | Service Unavailable | Retry with exponential backoff |
| 404 | Model not found | Verify model identifier |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback |

Best Practices:

  1. Use web_search selectively: Only enable when research is needed
  2. Clear prompts: Explicitly ask the model to search before forecasting
  3. Model selection: Use gpt-5-mini for cost-efficiency, gpt-5.2 for complex reasoning
  4. Error handling: Always implement retry logic with fallback predictions

Perplexity Endpoints

Perplexity provides reasoning LLMs with built-in web search capability.

POST /api/gateway/perplexity/chat/completions

Create a response using Perplexity's reasoning models with automatic web search.

URL: {SANDBOX_PROXY_URL}/api/gateway/perplexity/chat/completions

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "sonar-reasoning-pro",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "search_recency_filter": "month"
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID from environment |
| model | string | Yes | - | Model identifier (see Available Models below) |
| messages | array | Yes | - | List of message objects with role and content |
| temperature | float | No | 0.7 | Sampling temperature (0.0-2.0) |
| max_tokens | integer | No | null | Maximum tokens to generate |
| search_recency_filter | string | No | null | Time range for search results (day, week, month, year) |

Available Models:

| Model | Identifier | Notes |
| --- | --- | --- |
| Sonar Reasoning Pro | sonar-reasoning-pro | Most capable reasoning model |
| Sonar Pro | sonar-pro | Balanced performance |
| Sonar | sonar | Fast and efficient |

Response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "sonar-reasoning-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 8,
    "total_tokens": 36
  },
  "citations": [
    "https://example.com/source1",
    "https://example.com/source2"
  ],
  "search_results": [
    {
      "title": "Source title",
      "url": "https://example.com/source1",
      "snippet": "Relevant text..."
    }
  ],
  "cost": 0.002145
}

Response Fields:

| Field | Type | Description |
| --- | --- | --- |
| id | string | Response identifier |
| model | string | Model used for generation |
| choices | array | List of completion choices |
| choices[].message.content | string | Generated text content |
| usage | object | Token usage statistics |
| citations | array | List of source URLs used |
| search_results | array | Detailed search result objects |
| cost | float | Total cost in USD |

Example (using httpx):

import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/perplexity/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "sonar-reasoning-pro",
        "messages": [
            {"role": "system", "content": "You are an expert forecaster."},
            {"role": "user", "content": "What is the probability of rain tomorrow?"}
        ],
        "temperature": 0.2,
        "search_recency_filter": "day",
    },
    timeout=120.0,
)

result = response.json()

content = result["choices"][0]["message"]["content"]
citations = result.get("citations", [])

print(f"Response: {content}")
print(f"Sources: {citations}")

Error Handling:

| Status Code | Description | Recommended Action |
| --- | --- | --- |
| 503 | Service Unavailable | Retry with exponential backoff |
| 404 | Model not found | Verify model identifier |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback |

Best Practices:

  1. Use search_recency_filter: Set to day or week for time-sensitive events
  2. Extract citations: Use the citations array to verify information sources
  3. Model selection: Use sonar-reasoning-pro for complex reasoning tasks
  4. Error handling: Always implement retry logic with fallback predictions

Note: Perplexity has no free tier. You must link your API key to use Perplexity models.


Vericore Endpoints

Vericore provides statement verification with evidence-based metrics including sentiment, conviction, source credibility, and more.

POST /api/gateway/vericore/calculate-rating

Verify a statement against web evidence and get detailed metrics.

URL: {SANDBOX_PROXY_URL}/api/gateway/vericore/calculate-rating

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "statement": "Bitcoin will reach $100k by end of 2026",
  "generate_preview": false
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID from environment |
| statement | string | Yes | - | Statement to verify against web evidence |
| generate_preview | boolean | No | false | Generate a preview URL for the results |

Response:

{
  "batch_id": "mlzjxglo15m23k",
  "request_id": "req-mlzjxgmc4amr6",
  "preview_url": "",
  "evidence_summary": {
    "total_count": 12,
    "neutral": 37.5,
    "entailment": 1.03,
    "contradiction": 61.46,
    "sentiment": -0.07,
    "conviction": 0.82,
    "source_credibility": 0.93,
    "narrative_momentum": 0.48,
    "risk_reward_sentiment": -0.15,
    "political_leaning": 0.0,
    "catalyst_detection": 0.12,
    "statements": [
      {
        "statement": "Evidence text from source...",
        "url": "https://example.com/article",
        "contradiction": 0.87,
        "neutral": 0.12,
        "entailment": 0.01,
        "sentiment": -0.5,
        "conviction": 0.75,
        "source_credibility": 0.85,
        "narrative_momentum": 0.5,
        "risk_reward_sentiment": -0.5,
        "political_leaning": 0.0,
        "catalyst_detection": 0.3
      }
    ]
  },
  "cost": 0.05
}

Response Fields:

| Field | Type | Description |
| --- | --- | --- |
| batch_id | string | Batch identifier |
| request_id | string | Request identifier |
| preview_url | string | Preview URL (empty if generate_preview is false) |
| evidence_summary.total_count | integer | Number of evidence sources found |
| evidence_summary.entailment | float | Aggregated entailment score |
| evidence_summary.contradiction | float | Aggregated contradiction score |
| evidence_summary.sentiment | float | Aggregated sentiment (-1.0 to 1.0) |
| evidence_summary.conviction | float | Aggregated conviction level |
| evidence_summary.source_credibility | float | Average source credibility |
| evidence_summary.statements | array | Individual evidence sources with per-source metrics |

Example (using httpx):

import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/vericore/calculate-rating",
    json={
        "run_id": RUN_ID,
        "statement": "Bitcoin will reach $100k by end of 2026",
    },
    timeout=120.0,
)

result = response.json()

summary = result["evidence_summary"]
total = summary["total_count"]
contradiction = summary["contradiction"]
sentiment = summary["sentiment"]
conviction = summary["conviction"]
credibility = summary["source_credibility"]
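
One way to fold these metrics into a forecast is a simple credibility-weighted shift of a baseline probability. The weighting below is an illustrative assumption, not a recommended calibration; note that the aggregated entailment/contradiction values appear to be percentages in the sample response above, while per-statement values are 0-1.

def evidence_signal(summary: dict, baseline: float = 0.5) -> float:
    # Illustrative heuristic: nudge the baseline toward entailment and
    # away from contradiction, weighted by average source credibility.
    entailment = summary["entailment"] / 100.0
    contradiction = summary["contradiction"] / 100.0
    credibility = summary["source_credibility"]
    shift = 0.5 * (entailment - contradiction) * credibility
    return min(max(baseline + shift, 0.0), 1.0)

probability = evidence_signal(result["evidence_summary"])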

Error Handling:

| Status Code | Description | Recommended Action |
| --- | --- | --- |
| 503 | Service Unavailable | Retry with exponential backoff |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback |

Note: Vericore has no free tier. You must link your API key to use Vericore. Each call costs $0.05.


OpenRouter Endpoints

OpenRouter is a model router that provides access to hundreds of LLM models through a unified API. You can use models from Anthropic, Google, Meta, and many other providers.

POST /api/gateway/openrouter/chat/completions

Generate chat completions using any OpenRouter-supported model.

URL: {SANDBOX_PROXY_URL}/api/gateway/openrouter/chat/completions

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID from environment |
| model | string | Yes | - | OpenRouter model ID (e.g., anthropic/claude-sonnet-4-6) |
| messages | array | Yes | - | Chat messages array with role and content |
| temperature | float | No | 0.7 | Sampling temperature (0.0-2.0) |
| max_tokens | integer | No | - | Maximum tokens to generate |
| tools | array | No | - | Tool/function definitions for function calling |
| tool_choice | string/object | No | - | Tool selection mode |

Response:

{
  "id": "gen-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "anthropic/claude-sonnet-4-6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35,
    "cost": 0.000135
  },
  "cost": 0.000135
}

Response Fields:

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique completion identifier |
| model | string | Model used for the completion |
| choices | array | Array of completion choices |
| choices[].message.content | string | Generated text response |
| choices[].finish_reason | string | Why generation stopped (stop, length, tool_calls) |
| usage.prompt_tokens | integer | Input tokens used |
| usage.completion_tokens | integer | Output tokens generated |
| usage.cost | decimal | Actual cost reported by OpenRouter |
| cost | decimal | Total cost for this request |

Popular Models:

| Model ID | Description |
| --- | --- |
| anthropic/claude-sonnet-4-6 | Claude Sonnet 4.6 - balanced performance |
| anthropic/claude-haiku-4-5 | Claude Haiku 4.5 - fast and cost-effective |
| google/gemini-2.5-flash | Gemini 2.5 Flash - fast and affordable |
| google/gemini-2.5-pro | Gemini 2.5 Pro - high capability |

See the full model list at https://openrouter.ai/models

Example (using httpx):

import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
    f"{PROXY_URL}/api/gateway/openrouter/chat/completions",
    json={
        "run_id": RUN_ID,
        "model": "anthropic/claude-sonnet-4-6",
        "messages": [
            {"role": "user", "content": "Analyze the likelihood of this event..."}
        ],
        "temperature": 0.2,
        "max_tokens": 1024,
    },
    timeout=120.0,
)

result = response.json()
content = result["choices"][0]["message"]["content"]
cost = result.get("cost", 0.0)

Error Handling:

| Status Code | Description | Recommended Action |
| --- | --- | --- |
| 503 | Service Unavailable | Retry with exponential backoff |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback model |

Note: OpenRouter has no free tier. You must link your API key to use OpenRouter models.
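
When a request fails, the error table above suggests retrying with a fallback model. A minimal sketch of that pattern (the candidate list and loop are illustrative, not prescribed):

import httpx

CANDIDATE_MODELS = [
    "anthropic/claude-sonnet-4-6",
    "google/gemini-2.5-flash",
    "anthropic/claude-haiku-4-5",
]

def complete_with_fallback(prompt: str):
    for model in CANDIDATE_MODELS:
        try:
            response = httpx.post(
                f"{PROXY_URL}/api/gateway/openrouter/chat/completions",
                json={
                    "run_id": RUN_ID,
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                    "max_tokens": 512,
                },
                timeout=120.0,
            )
            if response.status_code == 200:
                return response.json()["choices"][0]["message"]["content"]
        except httpx.HTTPError:
            pass  # fall through to the next candidate model
    return None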


Numinous Indicia Endpoints

Numinous Indicia provides geopolitical and OSINT signals intelligence from X/Twitter and LiveUAMap. These signals are useful as additional context for geopolitical forecasting when combined with an LLM.

POST /api/gateway/numinous-indicia/x-osint

Fetch geopolitical signals derived from X/Twitter OSINT sources.

URL: {SANDBOX_PROXY_URL}/api/gateway/numinous-indicia/x-osint

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "account": null,
  "limit": 20
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID from environment |
| account | string | No | null | Filter by specific X account |
| limit | integer | No | 20 | Number of signals to return (1-50) |

POST /api/gateway/numinous-indicia/liveuamap

Fetch geopolitical signals from LiveUAMap (military/conflict data).

URL: {SANDBOX_PROXY_URL}/api/gateway/numinous-indicia/liveuamap

Request Body:

{
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "region": null,
  "limit": 50
}

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| run_id | string (UUID) | Yes | - | Execution tracking ID from environment |
| region | string | No | null | Filter by geographic region |
| limit | integer | No | 50 | Number of signals to return (1-200) |

Response (both endpoints):

{
  "signals": [
    {
      "topic": "Ukraine conflict",
      "category": "military",
      "signal": "Russian forces advance near Pokrovsk...",
      "confidence": "high",
      "fact_status": "confirmed",
      "timestamp": "2026-03-08T14:30:00Z",
      "source_url": "https://example.com/source",
      "evidence_refs": ["https://example.com/ref1"]
    }
  ],
  "cost": 0.0
}

Signal Fields:

| Field | Type | Description |
| --- | --- | --- |
| signals | array | List of signal objects |
| signals[].topic | string | Signal topic |
| signals[].category | string | Signal category (e.g., military, political) |
| signals[].signal | string | Signal description text |
| signals[].confidence | string | Confidence level |
| signals[].fact_status | string | Verification status |
| signals[].timestamp | string (ISO 8601) | When the signal was captured |
| signals[].source_url | string | Original source URL (may be null) |
| signals[].evidence_refs | array | Supporting evidence URLs |
| cost | decimal | Cost for this request (currently $0) |

Example (using httpx):

import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

INDICIA_URL = f"{PROXY_URL}/api/gateway/numinous-indicia"

# Fetch X/Twitter OSINT signals
response = httpx.post(
    f"{INDICIA_URL}/x-osint",
    json={"run_id": RUN_ID, "limit": 20},
    timeout=30.0,
)

data = response.json()
signals = data["signals"]

for s in signals:
    print(f"[{s['category']}] {s['signal']} (confidence={s['confidence']})")

Error Handling:

| Status Code | Description | Recommended Action |
| --- | --- | --- |
| 503 | Service Unavailable | Retry with exponential backoff |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 500 | Internal server error | Retry with fallback |

Note: Numinous Indicia is free to use. No API key linking required.

See neurons/miner/agents/indicia_openai_example.py for a complete agent that combines Indicia signals with OpenAI web search for geopolitical forecasting.
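
As a minimal sketch of that combination, continuing the example above (prompt wording and signal formatting are illustrative; the referenced agent file is more complete):

# Fold Indicia signals into an LLM prompt as context
context_lines = [
    f"- [{s['category']}/{s['fact_status']}] {s['signal']}" for s in signals[:10]
]
prompt = (
    "Using the OSINT signals below as context, estimate the probability "
    "of the event in question.\n" + "\n".join(context_lines)
)

llm_response = httpx.post(
    f"{PROXY_URL}/api/gateway/openai/responses",
    json={
        "run_id": RUN_ID,
        "model": "gpt-5-mini",
        "input": [{"role": "user", "content": prompt}],
    },
    timeout=120.0,
)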


Caching

The gateway implements request-level caching to increase consensus stability among validators, improve performance, and reduce API costs.

Cache Behavior:

  • Requests with identical parameters return cached responses instantly
  • Cache is keyed by endpoint name and request parameters (excluding run_id)
  • Cache persists for the lifetime of the gateway process
  • Cache is shared across all agent executions on the same validator

Cache Key Generation:

  • The run_id field is excluded from cache key calculation
  • This means identical requests from different executions hit the same cache

This is crucial for consensus stability on each validator, given that an LLM can return different outputs when queried twice with the same prompt.

Prompt rules: use consistent prompts across executions to ensure cache hits. In practice, do NOT include dynamic timestamps or random data in prompts.
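
Conceptually, the cache key can be thought of as a hash of the endpoint plus the request payload with run_id stripped. The sketch below only illustrates the idea; it is not the gateway's actual implementation.

import hashlib
import json

def conceptual_cache_key(endpoint: str, payload: dict) -> str:
    # Illustrative only: drop run_id, then hash endpoint + canonical JSON
    keyed = {k: v for k, v in payload.items() if k != "run_id"}
    canonical = json.dumps(keyed, sort_keys=True)
    return hashlib.sha256(f"{endpoint}:{canonical}".encode()).hexdigest()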

Example:

# These two requests will share the same cached response:

# Request 1 (run_id: abc-123)
response1 = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": "abc-123",
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "What is 2+2?"}],
    },
)

# Request 2 (run_id: xyz-789, same prompt)
response2 = httpx.post(
    f"{PROXY_URL}/api/gateway/chutes/chat/completions",
    json={
        "run_id": "xyz-789",
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "What is 2+2?"}],
    },
)
# response2 is served from cache instantly

Best Practices

Prompt Rules

Avoid dynamic content in prompts to maximize cache hits:

# BAD - Breaks caching
from datetime import datetime
prompt = f"Current date: {datetime.now()}. Analyze this event: {description}"

# GOOD - Static prompt leverages cache
prompt = f"Analyze this event: {description}"

Error Handling

Always implement robust error handling with retry logic:

import time
from typing import Optional

import httpx

def query_llm_with_retry(prompt: str, max_retries: int = 3) -> Optional[str]:
    base_delay = 2  # seconds

    for attempt in range(max_retries):
        try:
            response = httpx.post(
                f"{PROXY_URL}/api/gateway/chutes/chat/completions",
                json={
                    "run_id": RUN_ID,
                    "model": "deepseek-ai/DeepSeek-V3-0324",
                    "messages": [{"role": "user", "content": prompt}],
                },
                timeout=60.0,
            )

            if response.status_code == 200:
                result = response.json()
                return result["choices"][0]["message"]["content"]

            # Handle rate limits and cold models
            if response.status_code in [503, 429]:
                if attempt < max_retries - 1:
                    delay = base_delay ** (attempt + 1)  # 2s, 4s, 8s
                    time.sleep(delay)
                    continue

            # Other errors, return None
            return None

        except Exception as e:
            if attempt < max_retries - 1:
                time.sleep(base_delay ** (attempt + 1))
                continue
            return None

    return None  # All retries exhausted

Timeout Management

Plan your execution time to stay within the 240-second sandbox limit:

import time

start_time = time.time()
timeout_buffer = 10  # seconds
max_time = 240 - timeout_buffer  # stay under the 240s sandbox limit

def time_remaining():
    elapsed = time.time() - start_time
    return max_time - elapsed

# Use inside your agent's main function
if time_remaining() < 30:
    # Not enough time for API call, use fallback
    return {"event_id": event_data["event_id"], "prediction": 0.5}

Model Selection

Consider using the status endpoint to select the best-performing model dynamically:

def get_best_model():
    try:
        response = httpx.get(
            f"{PROXY_URL}/api/gateway/chutes/status",
            timeout=5.0,
        )

        if response.status_code == 200:
            status_list = response.json()

            # Filter for low utilization
            available = [
                s for s in status_list
                if s["utilization_current"] < 0.6 and s["rate_limit_ratio_5m"] < 0.2
            ]

            if available:
                best = min(available, key=lambda x: x["utilization_current"])
                return best["name"]
    except Exception:
        pass

    # Fallback to reliable default
    return "deepseek-ai/DeepSeek-V3-0324"

Search Strategy

Use appropriate Desearch endpoints based on your needs:

  • AI Search (/ai/search): When you need summarized information
  • Links (/ai/links): When you need source URLs without summaries
  • Web Search (/web/search): Fastest option for raw search results
  • Crawl (/web/crawl): For extracting full content from specific URLs

# Multi-step search strategy
def gather_information(query: str):
    # Step 1: Fast web search for relevant URLs
    search = httpx.post(
        f"{PROXY_URL}/api/gateway/desearch/web/search",
        json={"run_id": RUN_ID, "query": query, "num": 10},
        timeout=20.0,
    )
    urls = [r["link"] for r in search.json()["data"][:5]]

    # Step 2: Crawl top results for full content
    contents = []
    for url in urls:
        crawl = httpx.post(
            f"{PROXY_URL}/api/gateway/desearch/web/crawl",
            json={"run_id": RUN_ID, "url": url},
            timeout=20.0,
        )
        if crawl.status_code == 200:
            contents.append(crawl.json()["content"][:1000])  # Truncate

    return contents

Testing

Local Testing

Test your agent locally using the numi CLI:

# Configure gateway with your API keys
numi gateway configure

# Start local gateway
numi gateway start

# Test your agent
numi test-agent --agent-file my_agent.py

See miner-setup.md for detailed testing instructions.

Production Testing

After submitting your agent, fetch execution logs to debug issues:

# Fetch logs using run_id from analytics dashboard
numi fetch-logs

Logs include:

  • API request/response details
  • Error messages and stack traces
  • Execution timing information
  • Gateway connectivity status

Common Errors

| Error | Cause | Solution |
| --- | --- | --- |
| RUN_ID environment variable is required | Missing RUN_ID in environment | Check environment variable retrieval |
| CHUTES_API_KEY not configured | Gateway missing API key | Contact validator or check gateway configuration |
| DESEARCH_API_KEY not configured | Gateway missing API key | Contact validator or check gateway configuration |
| 503 Service Unavailable | Model is cold (no active instances) | Retry with exponential backoff (2-8s delays) |
| 429 Too Many Requests | Rate limit exceeded | Retry with exponential backoff |
| 404 Not Found | Invalid model name | Verify model exists at https://chutes.ai/app |
| Connection timeout | Network issue or slow gateway | Increase timeout, implement retry logic |
| 422 Unprocessable Entity | Invalid request parameters | Validate request body against API spec |

Additional Resources