Skip to content

Latest commit

Β 

History

History
Β 
Β 

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

README.md

cascadeflow TypeScript Examples

Complete collection of examples demonstrating cascadeflow from basics to production deployment.


πŸš€ Quick Start (5 Minutes)

# 1. Install cascadeflow
npm install @cascadeflow/core

# 2. Set your API key
export OPENAI_API_KEY="sk-..."

# 3. Run your first example
cd packages/core/examples/nodejs
npx tsx basic-usage.ts

That's it! You'll see cascading in action with cost savings.


🎯 Quick Reference - Find What You Need

Example What It Does Complexity Time Best For
basic-usage.ts Learn cascading basics ⭐ Easy 5 min First-time users
tool-calling.ts Function calling ⭐⭐ Medium 15 min Agent builders
cost-tracking.ts Budget management ⭐⭐ Medium 15 min Cost optimization
multi-provider.ts Mix AI providers ⭐⭐ Medium 10 min Multi-cloud
reasoning-models.ts o1, o3, Claude 3.7, DeepSeek-R1 ⭐⭐ Medium 10 min Complex reasoning
semantic-quality.ts ML-based quality validation ⭐⭐⭐ Advanced 15 min Quality assurance
production-patterns.ts Enterprise patterns ⭐⭐⭐ Advanced 30 min Production deployment

πŸ’‘ Tip: Start with basic-usage.ts, then explore based on your use case!


πŸ” Find by Feature

I want to...

  • Use tools/functions? β†’ tool-calling.ts
  • Track costs? β†’ cost-tracking.ts
  • Enforce budgets? β†’ cost-tracking.ts, production-patterns.ts
  • Use multiple providers? β†’ multi-provider.ts
  • Deploy to production? β†’ production-patterns.ts
  • Use reasoning models? β†’ reasoning-models.ts
  • Validate quality with ML? β†’ semantic-quality.ts
  • Access DeepSeek/Gemini/Azure? β†’ Python examples (LiteLLM integration)

πŸ“‹ Table of Contents


πŸ“š Examples by Category

🌟 Core Examples (4 examples) - Start Here

Perfect for learning cascadeflow basics. Start with these!

1. Basic Usage ⭐ START HERE

File: nodejs/basic-usage.ts Time: 5 minutes What you'll learn:

  • How cascading works (cheap model β†’ expensive model)
  • Automatic quality-based routing
  • Cost tracking and savings
  • When drafts are accepted vs rejected

Run it:

export OPENAI_API_KEY="sk-..."
cd packages/core/examples/nodejs
npx tsx basic-usage.ts

Expected output:

Query 1/8: What color is the sky?
   πŸ’š Model: gpt-4o-mini only
   πŸ’° Cost: $0.000081
   βœ… Draft Accepted

Query 5/8: Write a function to reverse a string...
   πŸ’› Model: gpt-4o
   πŸ’° Cost: $0.001320
   🎯 Direct Route (hard complexity)

πŸ’° TOTAL SAVINGS: 48.2% reduction

Key concepts:

  • Token-based pricing (not flat rates)
  • PreRouter detects complexity and routes accordingly
  • Draft accepted = verifier skipped (saves money!)
  • Semantic quality checking with embeddings

2. Tool Calling 🎯

File: nodejs/tool-calling.ts Time: 15 minutes What you'll learn:

  • Define tools with TypeScript types
  • Type-safe tool definitions
  • Tool execution across cascade tiers
  • Universal tool format

Key features:

  • Full TypeScript type safety
  • Weather and calculator tools
  • Automatic tool format conversion
  • Cross-provider compatibility

Run it:

export OPENAI_API_KEY="sk-..."
npx tsx tool-calling.ts

3. Multi-Provider Cascade 🌐

File: nodejs/multi-provider.ts Time: 10 minutes What you'll learn:

  • Mix models from different providers
  • OpenAI + Anthropic + Groq
  • Provider-specific configurations
  • Cross-provider cost comparison

Example setup:

const agent = new CascadeAgent({
  models: [
    { name: 'llama-3.1-8b-instant', provider: 'groq', cost: 0.00005 },  // Fast & cheap
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625 },              // Quality
    { name: 'claude-3-5-sonnet', provider: 'anthropic', cost: 0.003 },  // Reasoning
  ],
});

Requirements:

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GROQ_API_KEY="gsk_..."

4. Reasoning Models 🧠

File: nodejs/reasoning-models.ts Time: 10 minutes What you'll learn:

  • Use o1, o3-mini, Claude 3.7, DeepSeek-R1
  • Extended thinking mode
  • Chain-of-thought reasoning
  • Zero configuration (auto-detects reasoning capabilities)

Supported models:

  • OpenAI: o1, o1-mini, o3-mini
  • Anthropic: claude-3-7-sonnet-20250219
  • Ollama: deepseek-r1, deepseek-r1-distill (free local)
  • vLLM: deepseek-r1 (self-hosted)

Example:

const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015 },
    { name: 'o1', provider: 'openai', cost: 0.015 },  // Auto-detected
  ],
});

// Reasoning tokens automatically tracked
const result = await agent.run('Solve the Traveling Salesman Problem');

πŸ’° Cost Management (1 example)

Track costs and manage budgets in production.

Cost Tracking

File: nodejs/cost-tracking.ts Time: 15 minutes What you'll learn:

  • Real-time cost tracking across queries
  • Per-model and per-provider cost analysis
  • Budget limits and alerts
  • Cost history and trends

Features:

  • Tracks costs per query, model, and provider
  • Manual tracking implementation (TypeScript doesn't have telemetry module yet)
  • Budget warnings at configurable thresholds
  • Cost breakdown by complexity

Run it:

export OPENAI_API_KEY="sk-..."
npx tsx cost-tracking.ts

Output example:

πŸ“Š Cost Breakdown:
   GPT-4o-mini: $0.000420 (5 queries)
   GPT-4o:      $0.004650 (3 queries)
   Total:       $0.005070

πŸ’° Budget Status:
   Used:   $0.005070 / $10.00 (0.05%)
   Remaining: $9.995

πŸ€– Quality & Validation (1 example)

ML-based semantic validation using embeddings.

Semantic Quality Validation

File: nodejs/semantic-quality.ts Time: 15 minutes What you'll learn:

  • Semantic similarity scoring with BGE-small-en-v1.5
  • Off-topic response detection
  • Integration with cascade quality validation
  • Request-scoped caching for performance

Features:

  • Model: BGE-small-en-v1.5 (~40MB, auto-downloads)
  • Runtime: CPU-based, fully local inference
  • Latency: ~50-100ms per check (with caching)
  • Caching: 50% latency reduction on cache hits

Installation:

npm install @cascadeflow/ml @xenova/transformers

Example:

import { CascadeAgent } from '@cascadeflow/core';
import { UnifiedEmbeddingService } from '@cascadeflow/ml';

const embeddingService = await UnifiedEmbeddingService.getInstance();

const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015 },
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625 },
  ],
  quality: {
    semanticThreshold: 0.7,  // Reject if similarity < 70%
    embeddingService,
  },
});

Run it:

export OPENAI_API_KEY="sk-..."
npx tsx semantic-quality.ts

🏭 Production (1 example)

Deploy cascadeflow to production with enterprise patterns.

Production Patterns ⭐

File: nodejs/production-patterns.ts Time: 30 minutes What you'll learn:

  • Error handling and automatic retries
  • Response caching for performance
  • Rate limiting and throttling
  • Monitoring and logging
  • Cost tracking and budgets
  • Failover strategies

Patterns covered:

  • Exponential backoff retries
  • In-memory and Redis caching
  • Token bucket rate limiting
  • Structured logging
  • Budget enforcement
  • Multi-provider fallback

Features:

// Error handling with retries
async function withRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await sleep(Math.pow(2, i) * 1000);  // Exponential backoff
    }
  }
}

// Response caching
class ResponseCache {
  cache = new Map();
  ttl = 3600;  // 1 hour

  get(key: string): string | null {
    const entry = this.cache.get(key);
    if (!entry || Date.now() > entry.expiry) return null;
    return entry.value;
  }
}

// Rate limiting
class RateLimiter {
  tokens: number;
  maxTokens: number;
  refillRate: number;  // tokens per second
}

Run it:

export OPENAI_API_KEY="sk-..."
npx tsx production-patterns.ts

πŸŽ“ Learning Path

Step 1: Basics (30 minutes)

  1. βœ… Run basic-usage.ts - Understand core concepts
  2. βœ… Read the code comments - Learn patterns
  3. βœ… Try different queries - See routing decisions

Key concepts:

  • Cascading = cheap model first, escalate if needed
  • Draft accepted = money saved βœ…
  • Draft rejected = quality ensured βœ…
  • PreRouter detects complexity before calling models

Step 2: Tools & Multi-Provider (30 minutes)

  1. βœ… Run tool-calling.ts - Learn tool usage
  2. βœ… Run multi-provider.ts - Mix providers
  3. βœ… Run reasoning-models.ts - Try o1/o3

Key concepts:

  • Type-safe tool definitions
  • Universal tool format
  • Cross-provider compatibility
  • Reasoning token tracking

Step 3: Cost Management (30 minutes)

  1. βœ… Run cost-tracking.ts - Learn budget tracking
  2. βœ… Implement custom budget logic
  3. βœ… Read Cost Tracking Guide

Key concepts:

  • Token-based pricing
  • Per-model breakdown
  • Budget alerts
  • Cost optimization

Step 4: Production (1 hour)

  1. βœ… Run production-patterns.ts - Enterprise patterns
  2. βœ… Run semantic-quality.ts - ML validation
  3. βœ… Read Production Guide

Key concepts:

  • Error handling
  • Rate limiting
  • Caching
  • Monitoring

πŸ› οΈ Running Examples

Prerequisites

# Install cascadeflow
npm install @cascadeflow/core

# For semantic quality example
npm install @cascadeflow/ml @xenova/transformers

# Install peer dependencies for providers you'll use
npm install openai                    # OpenAI
npm install @anthropic-ai/sdk         # Anthropic
npm install groq-sdk                  # Groq

Set API Keys

# OpenAI (most examples)
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Groq (free, fast)
export GROQ_API_KEY="gsk_..."

# Together AI
export TOGETHER_API_KEY="..."

# HuggingFace
export HF_TOKEN="hf_..."

Run Examples

# Navigate to examples directory
cd packages/core/examples/nodejs

# Run with tsx (recommended)
npx tsx basic-usage.ts
npx tsx tool-calling.ts
npx tsx cost-tracking.ts

# Or install tsx globally
npm install -g tsx
tsx basic-usage.ts

πŸ’‘ Example Code

Minimal Example (Recommended Setup)

import { CascadeAgent } from '@cascadeflow/core';

// Recommended: Claude Haiku + GPT-5
const agent = new CascadeAgent({
  models: [
    { name: 'claude-3-5-haiku-20241022', provider: 'anthropic', cost: 0.0008 },
    { name: 'gpt-5', provider: 'openai', cost: 0.00125 },
  ],
});

const result = await agent.run('What is TypeScript?');
console.log(`Cost: $${result.totalCost}, Savings: ${result.savingsPercentage}%`);

Note: GPT-5 requires organization verification. The cascade works immediately - Claude Haiku handles 75% of queries!

OpenAI Only

const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015 },
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625 },
  ],
});

With Full Configuration

import { CascadeAgent, ModelConfig } from '@cascadeflow/core';

const models: ModelConfig[] = [
  {
    name: 'claude-3-5-haiku-20241022',
    provider: 'anthropic',
    cost: 0.0008,
    apiKey: process.env.ANTHROPIC_API_KEY,
  },
  {
    name: 'gpt-4o',
    provider: 'openai',
    cost: 0.00625,
    apiKey: process.env.OPENAI_API_KEY,
  },
];

const agent = new CascadeAgent({
  models,
  quality: {
    threshold: 0.7,  // Quality configured at agent level
    requireMinimumTokens: 10,
  },
});

const result = await agent.run('Explain quantum computing');

console.log(result.content);
console.log(`Cost: $${result.totalCost}, Saved: ${result.savingsPercentage}%`);

🌐 Universal Browser Support

All 7 providers work in both Node.js and browser:

  • βœ… OpenAI
  • βœ… Anthropic
  • βœ… Groq
  • βœ… Together AI
  • βœ… Ollama (local)
  • βœ… HuggingFace
  • βœ… vLLM (local)

cascadeflow automatically detects your runtime environment!

Browser Example: See browser/vercel-edge/ for edge function deployment.


πŸ”§ Troubleshooting

API key errors
# Check if set
echo $OPENAI_API_KEY

# Set it
export OPENAI_API_KEY="sk-..."

# Windows
set OPENAI_API_KEY=sk-...

# Or use .env file
echo "OPENAI_API_KEY=sk-..." > .env
Import errors
# Install core package
npm install @cascadeflow/core

# Install peer dependencies
npm install openai @anthropic-ai/sdk groq-sdk

# For semantic quality
npm install @cascadeflow/ml @xenova/transformers
Examples run but show errors
# Check Node.js version (18+ required)
node --version

# Reinstall
rm -rf node_modules package-lock.json
npm install
tsx command not found
# Install tsx
npm install -g tsx

# Or use npx
npx tsx basic-usage.ts
Semantic quality model download fails
# Model downloads automatically on first run
# If it fails, check internet connection and try again

# Manual cache clear
rm -rf ~/.cache/huggingface

πŸ’‘ Pro Tips

1. Start Simple

Begin with basic-usage.ts before advanced examples.

2. Read the Code

All examples are heavily commented. Read through to understand patterns.

3. Key Concepts

Token-Based Pricing:

  • Input and output tokens priced differently
  • gpt-4o: $0.0025 input, $0.010 output per 1K tokens
  • Actual costs depend on query/response length

Cost Savings:

  • Draft accepted = only cheap model used (big savings!)
  • Draft rejected = both models used (quality ensured)
  • Direct routing = only expensive model used (no savings)

Quality Validation:

  • Logprobs-based (default)
  • Semantic similarity with embeddings (optional)
  • Custom validators supported

4. Watch Statistics

const result = await agent.run(query);

// Access result properties
console.log(`Model: ${result.modelUsed}`);
console.log(`Cost: $${result.totalCost}`);
console.log(`Savings: ${result.savingsPercentage}%`);
console.log(`Latency: ${result.latencyMs}ms`);
console.log(`Cascaded: ${result.cascaded}`);
console.log(`Draft Accepted: ${result.draftAccepted}`);

5. Cost Tracking Pattern

// Track costs manually
const costs = {
  total: 0,
  byModel: {} as Record<string, number>,
  queries: 0,
};

for (const query of queries) {
  const result = await agent.run(query);

  costs.total += result.totalCost;
  costs.byModel[result.modelUsed] =
    (costs.byModel[result.modelUsed] || 0) + result.totalCost;
  costs.queries++;
}

console.log(`Average cost per query: $${(costs.total / costs.queries).toFixed(6)}`);

πŸ“– Complete Documentation

Getting Started Guides

Advanced Guides

πŸ“š View All Documentation β†’


🀝 Contributing Examples

Have a great use case? Contribute an example!

Template

/**
 * Your Example - Brief Description
 *
 * What it demonstrates:
 * - Feature 1
 * - Feature 2
 *
 * Requirements:
 * - npm install @cascadeflow/core
 * - export OPENAI_API_KEY="..."
 *
 * Setup:
 *     npm install @cascadeflow/core
 *     export OPENAI_API_KEY="..."
 *     npx tsx your-example.ts
 *
 * Expected Results:
 *     Description of output
 */

import { CascadeAgent } from '@cascadeflow/core';

async function main() {
  console.log('='.repeat(80));
  console.log('YOUR EXAMPLE TITLE');
  console.log('='.repeat(80));

  // Your code here

  console.log('\nKEY TAKEAWAYS:');
  console.log('- Takeaway 1');
  console.log('- Takeaway 2');
}

main().catch(console.error);

See CONTRIBUTING.md for guidelines.


πŸ“ž Need Help?

Documentation

πŸ“– Complete Guides πŸ› οΈ Tools Guide πŸ’° Cost Tracking Guide 🏭 Production Guide πŸ“˜ TypeScript Guide

Community

πŸ’¬ GitHub Discussions - Ask questions πŸ› GitHub Issues - Report bugs πŸ’‘ Use "question" label for general questions


πŸ“Š Summary

βœ… Available Examples (7 total)

Core (4): Basic usage, tool calling, multi-provider, reasoning models

Cost Management (1): Cost tracking

Quality & Validation (1): Semantic quality with ML

Production (1): Production patterns

πŸ“š Documentation Coverage

  • βœ… 7 TypeScript examples (~2,500+ lines of code)
  • βœ… Comprehensive README (this file)
  • βœ… Individual example READMEs (nodejs/README.md, browser/README.md)
  • βœ… Full TypeScript definitions with IDE autocomplete
  • βœ… 10+ comprehensive guides in main docs

πŸ”‘ Key Learnings

Essential Concepts:

  • βœ… Draft accepted = money saved
  • βœ… Draft rejected = quality ensured
  • βœ… PreRouter detects complexity before cascade
  • βœ… Token-based pricing (input/output split)
  • βœ… Semantic quality with embeddings
  • βœ… Universal tool format

Production Ready:

  • βœ… Error handling
  • βœ… Rate limiting
  • βœ… Caching
  • βœ… Budget management
  • βœ… ML-based validation

πŸ’° Save 40-85% on AI costs with intelligent cascading! πŸš€

View All Documentation β€’ Python Examples β€’ TypeScript Examples β€’ GitHub Discussions