Module 04 · Enterprise Patterns

Production hardening patterns for AI systems at scale.

Files

| File | Description |
| --- | --- |
| multi_model_router.py | Route to the optimal model by cost, latency, and capability, with circuit breaking |
| cost_optimizer.py | Per-tenant budget enforcement and automatic model downgrade |
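
To make "budget enforcement with automatic downgrade" concrete, here is a minimal sketch of the idea. It is not the code in cost_optimizer.py: the tier names, the prices, and the `TenantBudget`, `pick_model`, and `charge` helpers are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical tiers, most capable first, with made-up prices in USD per 1K tokens.
MODEL_TIERS = [
    ("premium-model", 0.010),
    ("standard-model", 0.002),
    ("local-model", 0.0),
]


@dataclass
class TenantBudget:
    """Spending state for a single tenant (assumed shape, for illustration)."""
    limit_usd: float
    spent_usd: float = 0.0

    @property
    def remaining_usd(self) -> float:
        return self.limit_usd - self.spent_usd


def pick_model(budget: TenantBudget, est_tokens: int) -> tuple[str, float]:
    """Return the first tier whose estimated cost fits the remaining budget."""
    for name, price_per_1k in MODEL_TIERS:
        cost = est_tokens / 1000 * price_per_1k
        if cost <= budget.remaining_usd:
            return name, cost
    raise RuntimeError("no tier fits the remaining budget")


def charge(budget: TenantBudget, cost_usd: float) -> None:
    """Record the cost of a completed request against the tenant."""
    budget.spent_usd += cost_usd


if __name__ == "__main__":
    # One budget object per tenant keeps limits isolated from each other.
    budgets = {"tenant-a": TenantBudget(limit_usd=0.05)}
    for _ in range(3):
        model, cost = pick_model(budgets["tenant-a"], est_tokens=4000)
        charge(budgets["tenant-a"], cost)
        print(model, round(budgets["tenant-a"].remaining_usd, 4))
```

As the tenant's remaining budget shrinks, the same request size is served by progressively cheaper tiers rather than being rejected outright.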

Supported Providers (all configured via environment variables)

| Provider | Model Examples | Env Var |
| --- | --- | --- |
| xAI Grok | grok-2-latest, grok-2-mini | XAI_API_KEY |
| OpenRouter | 100+ models | OPENROUTER_API_KEY |
| HuggingFace | Mistral, Llama, etc. | HUGGINGFACE_API_KEY |
| Ollama (local) | llama3.2, phi3 | OLLAMA_BASE_URL |
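
Since every provider is enabled through an environment variable, a quick way to see which ones are active is to check which variables are set. The snippet below is a sketch under that assumption; the `configured_providers` helper is hypothetical and is not part of the module's scripts. Only the variable names come from the table above.

```python
import os

# Environment variable per provider, copied from the table above.
PROVIDER_ENV_VARS = {
    "xAI Grok": "XAI_API_KEY",
    "OpenRouter": "OPENROUTER_API_KEY",
    "HuggingFace": "HUGGINGFACE_API_KEY",
    "Ollama (local)": "OLLAMA_BASE_URL",
}


def configured_providers(env=None):
    """Return provider names whose variable is set to a non-blank value.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    active = configured_providers()
    print("Configured providers:", ", ".join(active) if active else "none")
```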

Quick Start

# Test the router
python 04-enterprise-patterns/multi_model_router.py

# Test the cost optimizer
python 04-enterprise-patterns/cost_optimizer.py

Key Patterns

  • Circuit Breaker: Automatically skip providers that are failing
  • Cost-aware Routing: Stay within budget by preferring cheaper models
  • Fallback Chains: When a provider fails, degrade gracefully to cheaper or local models instead of failing the request
  • Tenant Budget Enforcement: Each customer has isolated token and cost limits
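
The circuit breaker and fallback chain patterns work together, and a small sketch may help show how. This is an illustration under assumptions, not the implementation in multi_model_router.py: the `CircuitBreaker` class, the `call_with_fallback` function, and the threshold and cooldown values are all hypothetical.

```python
import time


class CircuitBreaker:
    """Skip a provider after repeated failures, then retry after a cooldown."""

    def __init__(self, max_failures=3, cooldown_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Once the cooldown has elapsed, calls are allowed again;
        # another failure re-opens the circuit and restarts the cooldown.
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def call_with_fallback(chain, breakers, prompt):
    """Try each (name, callable) in order and return the first success.

    Providers whose circuit is open are skipped without being called.
    """
    errors = {}
    for name, call in chain:
        breaker = breakers[name]
        if not breaker.allow():
            errors[name] = "circuit open"
            continue
        try:
            result = call(prompt)
        except Exception as exc:
            breaker.record_failure()
            errors[name] = repr(exc)
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError(f"all providers failed: {errors}")
```

Ordering the chain from most capable to cheapest or local means a failing hosted provider degrades quality rather than availability. The request only fails if every provider in the chain is failing or skipped.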