This document outlines all supported reasoning models across providers and their specific API requirements.
| Provider | Models | Auto-Detection | Special Parameters | Thinking Tokens |
|---|---|---|---|---|
| OpenAI | o1-preview, o1-mini, o1-2024-12-17, o1, o3-mini | ✅ Yes | reasoning_effort, max_completion_tokens | ✅ Yes |
| Anthropic | claude-3-7-sonnet | ✅ Yes | thinking (budget_tokens) | ✅ Yes |
| Ollama | deepseek-r1 (all sizes) | ✅ Yes | None | ✅ Yes (inline `<think>` tags) |
| vLLM | deepseek-r1 (all sizes) | ✅ Yes | None | ✅ Yes (inline `<think>` tags) |
| LiteLLM | All above via routing | ✅ Yes | Provider-specific | ✅ Yes |
- `o1-preview` - Original reasoning model
- `o1-mini` - Smaller, faster reasoning model
- `o1-2024-12-17` - Latest o1 with `reasoning_effort`
- `o1` - Base o1 model
- `o3-mini` - Next-gen reasoning model
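Auto-detection of the OpenAI models above can be as simple as a name-prefix check (a minimal sketch; the helper name is illustrative, not an existing API):

```typescript
// Illustrative helper: treat any o1* / o3* model name as a reasoning model.
// A production check may need to exclude future non-reasoning names that
// happen to share the prefix.
function isOpenAIReasoningModel(modelName: string): boolean {
  const name = modelName.toLowerCase();
  return name.startsWith('o1') || name.startsWith('o3');
}
```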
```typescript
{
  'o1-preview': {
    supportsStreaming: true,
    supportsTools: false,
    supportsSystemMessages: false,
    supportsReasoningEffort: false,
    requiresMaxCompletionTokens: false
  },
  'o1-2024-12-17': {
    supportsStreaming: false,
    supportsTools: false,
    supportsSystemMessages: false,
    supportsReasoningEffort: true,
    requiresMaxCompletionTokens: true
  },
  'o3-mini': {
    supportsStreaming: true,
    supportsTools: true,
    supportsSystemMessages: false,
    supportsReasoningEffort: true,
    requiresMaxCompletionTokens: true
  }
}
```

Special parameters:

- `max_completion_tokens` (instead of `max_tokens` for some models)
- `reasoning_effort`: `"low" | "medium" | "high"` (for o1-2024-12-17, o1, o3-mini)
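Putting the capability map above to use, a request for o1-2024-12-17 might be assembled like this (a sketch; the `caps` literal mirrors the map, and the parameter names follow the OpenAI chat completions API):

```typescript
// Sketch: map capability flags to request parameters.
// `caps` holds the o1-2024-12-17 entry from the capability map above.
const caps = { supportsReasoningEffort: true, requiresMaxCompletionTokens: true };

const payload: Record<string, unknown> = {
  model: 'o1-2024-12-17',
  messages: [{ role: 'user', content: 'Prove that 17 is prime.' }],
};
if (caps.requiresMaxCompletionTokens) {
  payload.max_completion_tokens = 4096; // replaces max_tokens for these models
}
if (caps.supportsReasoningEffort) {
  payload.reasoning_effort = 'high';
}
```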
```json
{
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 200,
    "total_tokens": 300,
    "completion_tokens_details": {
      "reasoning_tokens": 50
    }
  }
}
```

Pricing (USD per 1M tokens):

- o1-preview: $15 input, $60 output
- o1-mini: $3 input, $12 output
- o1-2024-12-17: $15 input, $60 output
- o1: $15 input, $60 output
- o3-mini: $1 input, $5 output
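Reasoning tokens are already counted inside `completion_tokens`, so cost can be derived directly from the usage block above (a sketch; rates are per 1M tokens, as listed):

```typescript
// Sketch: compute request cost in USD from an OpenAI-style usage block.
// reasoning_tokens are included in completion_tokens, so they are
// billed at the output rate and must not be added a second time.
function computeCost(
  usage: { prompt_tokens: number; completion_tokens: number },
  inputPerM: number,
  outputPerM: number,
): number {
  return (usage.prompt_tokens / 1e6) * inputPerM +
         (usage.completion_tokens / 1e6) * outputPerM;
}
```

For the example usage block (100 prompt / 200 completion tokens at o1-preview rates) this comes to about $0.0135.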
- `claude-3-7-sonnet` - Hybrid reasoning model
```typescript
{
  'claude-3-7-sonnet': {
    supportsStreaming: true,
    supportsTools: true,
    supportsSystemMessages: true,
    supportsExtendedThinking: true,
    requiresThinkingBudget: true
  }
}
```

Request parameter:

```json
{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 2000
  }
}
```

(`budget_tokens` minimum is 1024.)

Response format:

```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Step-by-step reasoning process...",
      "signature": "optional_signature"
    },
    {
      "type": "text",
      "text": "Final answer..."
    }
  ]
}
```

Pricing (USD per 1M tokens):

- claude-3-7-sonnet: $3 input, $15 output (thinking tokens included in output)
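The `findThinkingBlock` helper referenced later in this document could be a simple scan over the content array (a sketch; the `ContentBlock` shape mirrors the response format above):

```typescript
// Sketch: pull the first `thinking` block out of an Anthropic-style
// content array; returns undefined when extended thinking was not enabled.
interface ContentBlock {
  type: 'thinking' | 'text';
  thinking?: string;
  text?: string;
}

function findThinkingBlock(content: ContentBlock[]): ContentBlock | undefined {
  return content.find((block) => block.type === 'thinking');
}
```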
- `deepseek-r1:1.5b` - Ultra-small reasoning
- `deepseek-r1:8b` - Balanced reasoning
- `deepseek-r1:70b` - Large reasoning
- `deepseek-r1:671b` - Full-scale reasoning
- `deepseek-r1-0528:*` - Latest version
```typescript
{
  'deepseek-r1': {
    supportsStreaming: true,
    supportsTools: true,
    supportsSystemMessages: true,
    requiresSpecialHandling: false // standard model interface
  }
}
```

- Uses standard Ollama/vLLM parameters
- No special reasoning parameters needed
- Model handles reasoning internally
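DeepSeek-R1 returns its reasoning inline, wrapped in `<think>…</think>` tags in the completion text; a caller that wants the chain of thought separated from the answer can split on the tag (a sketch; `splitThinkTags` is an illustrative helper, not an existing API):

```typescript
// Sketch: split a DeepSeek-R1 completion into its inline <think> block
// and the final answer. Returns an empty thinking string if no tag is found.
function splitThinkTags(text: string): { thinking: string; answer: string } {
  const match = text.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) return { thinking: '', answer: text };
  return {
    thinking: match[1].trim(),
    answer: text.replace(match[0], '').trim(),
  };
}
```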
```typescript
function isDeepSeekR1(modelName: string): boolean {
  const name = modelName.toLowerCase();
  // 'deepseek-r1' is a substring of 'deepseek-r1-0528', so one check covers both.
  return name.includes('deepseek-r1');
}
```

- Self-hosted, no API costs
- Hardware costs only
```json
{
  "model_list": [
    {
      "model_name": "reasoning-model",
      "litellm_params": {
        "model": "o1-mini",
        "reasoning_effort": "high"
      }
    },
    {
      "model_name": "reasoning-model",
      "litellm_params": {
        "model": "claude-3-7-sonnet",
        "thinking": {"type": "enabled", "budget_tokens": 2000}
      }
    }
  ]
}
```

- Check if LiteLLM is available
- If available, use LiteLLM's routing
- If not available, use built-in provider detection
- Fallback to standard model handling
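The routing strategy above can be sketched as a simple cascade (the names here are illustrative):

```typescript
// Sketch of the detection cascade described above: prefer LiteLLM,
// then built-in provider detection, then standard model handling.
type Route = 'litellm' | 'builtin' | 'standard';

function chooseRoute(liteLLMAvailable: boolean, detectedByBuiltin: boolean): Route {
  if (liteLLMAvailable) return 'litellm';  // use LiteLLM's routing
  if (detectedByBuiltin) return 'builtin'; // built-in provider detection
  return 'standard';                       // fallback: standard handling
}
```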
```typescript
// Extend existing types
interface ReasoningModelInfo {
  isReasoning: boolean;
  provider: 'openai' | 'anthropic' | 'ollama' | 'vllm';
  supportsStreaming: boolean;
  supportsTools: boolean;
  supportsSystemMessages: boolean;
  supportsReasoningEffort?: boolean;
  supportsExtendedThinking?: boolean;
  requiresMaxCompletionTokens?: boolean;
  requiresThinkingBudget?: boolean;
}
```

Each provider implements:

```typescript
function getReasoningModelInfo(modelName: string): ReasoningModelInfo
```

```typescript
// Auto-configure based on model info
if (modelInfo.supportsReasoningEffort && request.extra?.reasoning_effort) {
  payload.reasoning_effort = request.extra.reasoning_effort;
}
if (modelInfo.supportsExtendedThinking && request.extra?.thinking) {
  payload.thinking = request.extra.thinking;
}
```

```typescript
// Extract thinking from response
if (modelInfo.provider === 'anthropic') {
  const thinkingBlock = findThinkingBlock(response.content);
  metadata.thinking = thinkingBlock?.thinking;
}
if (modelInfo.provider === 'openai') {
  metadata.reasoning_tokens = response.usage?.completion_tokens_details?.reasoning_tokens;
}
```
}- Model detection accuracy
- Parameter configuration
- Cost calculation
- Response parsing
- Backward compatibility
- Cross-provider reasoning cascades
- LiteLLM fallback behavior
- n8n node functionality
- Error handling
```typescript
// OpenAI
test('o1-mini auto-configures reasoning_effort', ...)
test('o3-mini supports tools', ...)

// Anthropic
test('claude-3-7-sonnet extracts thinking blocks', ...)
test('thinking budget validation', ...)

// DeepSeek-R1
test('deepseek-r1:8b detected for ollama', ...)
test('deepseek-r1-0528 detected for vllm', ...)
```

- ✅ OpenAI types and detection
- ✅ Basic Anthropic thinking types
- 🔄 DeepSeek detection types
- ✅ OpenAI provider (complete)
- 🔄 Anthropic provider (extend)
- 🔄 Ollama provider (add DeepSeek)
- 🔄 vLLM provider (add DeepSeek)
- 🔄 LiteLLM wrapper
- 🔄 n8n nodes
- 🔄 Examples
- 🔄 Documentation
- ✅ Zero-configuration for all reasoning models
- ✅ Automatic parameter handling
- ✅ Accurate cost tracking
- ✅ Comprehensive test coverage (>90%)
- ✅ Backward compatible
- ✅ Cross-provider cascade support
- ✅ Documentation complete
- All reasoning models should work without special configuration
- Developers specify only the model name; everything else is automatic
- Cost tracking must include reasoning/thinking tokens
- Fall back to standard behavior if detection fails
- LiteLLM integration is optional but leveraged when available
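The fallback principle above can be enforced with a wrapper that never throws (a sketch; `getModelInfoSafe` and `defaultInfo` are illustrative names):

```typescript
// Sketch: wrap detection so an unknown model degrades to standard handling
// instead of failing. Field names follow ReasoningModelInfo above.
const defaultInfo = {
  isReasoning: false,
  supportsStreaming: true,
  supportsTools: true,
  supportsSystemMessages: true,
};

function getModelInfoSafe(
  detect: (name: string) => typeof defaultInfo,
  modelName: string,
): typeof defaultInfo {
  try {
    return detect(modelName);
  } catch {
    return defaultInfo; // detection failed: treat as a standard model
  }
}
```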