A model-agnostic implementation of Anthropic's Programmatic Tool Calling pattern. Instead of making sequential tool calls that each consume context tokens, any LLM can write Rhai scripts that orchestrate multiple tools efficiently.
Traditional AI tool calling follows a request-response pattern:
LLM: "Call get_expenses(employee_id=1)"
→ Returns 100 expense items to context
LLM: "Call get_expenses(employee_id=2)"
→ Returns 100 more items to context
... (20 employees later)
→ 2,000+ line items polluting the context window
→ 110,000+ tokens just to produce a summary
Each intermediate result floods the model's context window, wasting tokens and degrading performance.
In November 2025, Anthropic introduced Programmatic Tool Calling (PTC) as part of their advanced tool use features. The key insight:
LLMs excel at writing code. Instead of reasoning through one tool call at a time, let them write code that orchestrates entire workflows.
Their approach:
- Claude writes Python code that calls multiple tools
- Code executes in Anthropic's managed sandbox
- Only the final result returns to the context window
Results: 37-98% token reduction, lower latency, more reliable control flow.
- Introducing advanced tool use on the Claude Developer Platform - Anthropic Engineering Blog
- CodeAct: Executable Code Actions Elicit Better LLM Agents - Academic research on code-based tool orchestration
Anthropic's implementation has constraints:
- Claude-only: Requires Claude 4.5 with the `advanced-tool-use-2025-11-20` beta header
- Python-only: Scripts must be Python
- Anthropic-hosted: Execution happens in their managed sandbox
- API-dependent: Requires their code execution tool to be enabled
Tool Orchestrator provides the same benefits for any LLM provider:
| Constraint | Anthropic's PTC | Tool Orchestrator |
|---|---|---|
| Model | Claude 4.5 only | Any LLM that can write code |
| Language | Python | Rhai (Rust-like, easy for LLMs) |
| Execution | Anthropic's sandbox | Your local process |
| Runtime | Server-side (their servers) | Client-side (your control) |
| Dependencies | API call + beta header | Pure Rust, zero runtime deps |
| Targets | Python environments | Native Rust + WASM (browser/Node.js) |
This works with:
- Claude (all versions, not just 4.5)
- OpenAI (GPT-4, GPT-4o, o1, etc.)
- Google (Gemini Pro, etc.)
- Anthropic competitors (Mistral, Cohere, etc.)
- Local models (Ollama, llama.cpp, vLLM)
- Any future provider
┌─────────────────────────────────────────────────────────────────┐
│ TRADITIONAL APPROACH │
│ │
│ LLM ─→ Tool Call ─→ Full Result to Context ─→ LLM reasons │
│ LLM ─→ Tool Call ─→ Full Result to Context ─→ LLM reasons │
│ LLM ─→ Tool Call ─→ Full Result to Context ─→ LLM reasons │
│ (tokens multiply rapidly) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ PROGRAMMATIC TOOL CALLING │
│ │
│ LLM writes script: │
│ ┌──────────────────────────────────────┐ │
│ │ let results = []; │ │
│ │ for id in employee_ids { │ Executes locally │
│ │ let expenses = get_expenses(id); │ ─────────────────→ │
│ │ let flagged = expenses.filter(...); │ Tools called │
│ │ results.push(flagged); │ in sandbox │
│ │ } │ │
│ │ summarize(results) // Only this │ ←───────────────── │
│ └──────────────────────────────────────┘ returns to LLM │
└─────────────────────────────────────────────────────────────────┘
- Register tools - Your actual tool implementations (file I/O, APIs, etc.)
- LLM writes script - Any LLM generates a Rhai script orchestrating those tools
- Sandboxed execution - Script runs locally with configurable safety limits
- Minimal context - Only the final result enters the conversation
This crate produces two outputs from a single codebase:
| Target | Description | Use Case |
|---|---|---|
| Rust Library | Native Rust crate with `Arc<Mutex>` thread safety | CLI tools, server-side apps, native integrations |
| WASM Package | Browser/Node.js module with `Rc<RefCell>` | Web apps, npm packages, browser-based AI |
- 37-98% token reduction - Intermediate results stay in sandbox, only final output returns
- Batch operations - Process thousands of items in loops without context pollution
- Conditional logic - if/else based on tool results, handled in code not LLM reasoning
- Data transformation - Filter, aggregate, transform between tool calls
- Explicit control flow - Loops, error handling, retries are code, not implicit reasoning
- Model agnostic - Works with any LLM that can write Rhai/Rust-like code
- Audit trail - Every tool call is recorded with timing and results
# Add to Cargo.toml
cargo add tool-orchestrator
# Or build from source
cargo build

# Build for web (browser)
wasm-pack build --target web --features wasm --no-default-features
# Build for Node.js
wasm-pack build --target nodejs --features wasm --no-default-features
# The package is generated in ./pkg/

use tool_orchestrator::{ToolOrchestrator, ExecutionLimits};
// Create orchestrator
let mut orchestrator = ToolOrchestrator::new();
// Register tools as executor functions
orchestrator.register_executor("read_file", |input| {
let path = input.get("path").and_then(|v| v.as_str()).unwrap_or("");
std::fs::read_to_string(path).map_err(|e| e.to_string())
});
orchestrator.register_executor("list_directory", |input| {
let path = input.get("path").and_then(|v| v.as_str()).unwrap_or(".");
let entries: Vec<String> = std::fs::read_dir(path)
.map_err(|e| e.to_string())?
.filter_map(|e| e.ok().map(|e| e.path().display().to_string()))
.collect();
Ok(entries.join("\n"))
});
// Execute a Rhai script (written by any LLM)
let script = r#"
let files = list_directory("src");
let rust_files = [];
for file in files.split("\n") {
if file.ends_with(".rs") {
rust_files.push(file);
}
}
`Found ${rust_files.len()} Rust files: ${rust_files}`
"#;
let result = orchestrator.execute(script, ExecutionLimits::default())?;
println!("Output: {}", result.output); // Final result only
println!("Tool calls: {:?}", result.tool_calls); // Audit trailimport init, { WasmOrchestrator, ExecutionLimits } from 'tool-orchestrator';
await init();
const orchestrator = new WasmOrchestrator();
// Register a JavaScript function as a tool
orchestrator.register_tool('get_weather', (inputJson: string) => {
const input = JSON.parse(inputJson);
// Your implementation here
return JSON.stringify({ temp: 72, condition: 'sunny' });
});
// Execute a Rhai script
const limits = new ExecutionLimits();
const result = orchestrator.execute(`
let weather = get_weather("San Francisco");
\`Current weather: \${weather}\`
`, limits);
console.log(result);
// { success: true, output: "Current weather: ...", tool_calls: [...] }The orchestrator includes built-in limits to prevent runaway scripts:
| Limit | Default | Description |
|---|---|---|
| `max_operations` | 100,000 | Prevents infinite loops |
| `max_tool_calls` | 50 | Limits tool invocations |
| `timeout_ms` | 30,000 | Execution timeout (ms) |
| `max_string_size` | 10 MB | Maximum string length |
| `max_array_size` | 10,000 | Maximum array elements |
// Preset profiles
let quick = ExecutionLimits::quick(); // 10k ops, 10 calls, 5s
let extended = ExecutionLimits::extended(); // 500k ops, 100 calls, 2m
// Custom limits
let limits = ExecutionLimits::default()
.with_max_operations(50_000)
.with_max_tool_calls(25)
.with_timeout_ms(10_000);

Rhai scripts executed by this crate cannot:
- Access the filesystem (no `std::fs`, no file I/O)
- Make network requests (no sockets, no HTTP)
- Execute shell commands (no `std::process`)
- Access environment variables
- Spawn threads or processes
- Access raw memory or use unsafe code
The only way scripts interact with the outside world is through explicitly registered tools.
You are responsible for the security of tools you register. If you register a tool that:
- Executes shell commands → scripts can run arbitrary commands
- Reads/writes files → scripts can access your filesystem
- Makes HTTP requests → scripts can exfiltrate data
Design your tools with the principle of least privilege.
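For example, a file-reading tool can be confined to a single directory. A minimal sketch, reusing the closure shape from the Quick Start (the tool name `read_public_file` and the `public` root are hypothetical, not part of the crate's API):

use std::path::Path;

// Hypothetical least-privilege tool: only serves files under ./public.
orchestrator.register_executor("read_public_file", |input| {
    let requested = input.get("path").and_then(|v| v.as_str()).unwrap_or("");
    let allowed_root = Path::new("public").canonicalize().map_err(|e| e.to_string())?;
    // Canonicalize to resolve ".." and symlinks before checking containment.
    let full = allowed_root.join(requested).canonicalize().map_err(|e| e.to_string())?;
    if !full.starts_with(&allowed_root) {
        return Err("access denied: path escapes allowed root".to_string());
    }
    std::fs::read_to_string(&full).map_err(|e| e.to_string())
});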
The `timeout_ms` limit uses Rhai's `on_progress` callback for real-time enforcement:
- Timeout is checked after every Rhai operation (not just at the end)
- CPU-intensive loops will be terminated mid-execution when timeout is exceeded
- Note: Timeout checks don't occur during a tool call - if a registered tool blocks for 10 seconds, that time isn't interruptible
- For tools that may block, implement your own timeouts within the tool executor, as in the sketch below
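A minimal sketch of a self-timeouting executor, again reusing the Quick Start closure shape (the tool name, 2-second budget, and helper thread are illustrative assumptions):

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical tool that enforces its own deadline, since timeout_ms
// cannot interrupt a tool executor mid-call.
orchestrator.register_executor("read_file_with_deadline", |input| {
    let path = input.get("path").and_then(|v| v.as_str()).unwrap_or("").to_string();
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The potentially blocking work runs on its own thread.
        let _ = tx.send(std::fs::read_to_string(&path).map_err(|e| e.to_string()));
    });
    // recv_timeout fails if the worker hasn't finished in time; note the
    // detached worker may keep running in the background after the deadline.
    match rx.recv_timeout(Duration::from_secs(2)) {
        Ok(result) => result,
        Err(_) => Err("tool timed out after 2s".to_string()),
    }
});

A conservative production profile combines per-tool deadlines with tight global limits: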
let limits = ExecutionLimits::default()
.with_max_operations(10_000) // Tight loop protection
.with_max_tool_calls(5) // Limit external interactions
.with_timeout_ms(5_000) // 5 second max
.with_max_string_size(100_000) // 100KB strings
.with_max_array_size(1_000); // 1K elements

Anthropic uses Python because Claude is trained extensively on it. We chose Rhai for different reasons:
| Factor | Python | Rhai |
|---|---|---|
| Safety | Requires heavy sandboxing | Sandboxed by design, no filesystem/network access |
| Embedding | CPython runtime (large) | Pure Rust, compiles into your binary |
| WASM | Complex (Pyodide, etc.) | Native WASM support |
| Syntax | Python-specific | Rust-like (familiar to many LLMs) |
| Performance | Interpreter overhead | Optimized for embedding |
| Dependencies | Python ecosystem | Zero runtime dependencies |
LLMs have no trouble generating Rhai - it's syntactically similar to Rust/JavaScript:
// Variables
let x = 42;
let name = "Claude";
// String interpolation (backticks)
let greeting = `Hello, ${name}!`;
// Arrays and loops
let items = [1, 2, 3, 4, 5];
let sum = 0;
for item in items {
sum += item;
}
// Conditionals
if sum > 10 {
"Large sum"
} else {
"Small sum"
}
// Maps (objects)
let config = #{
debug: true,
limit: 100
};
// Tool calls (registered functions)
let content = read_file("README.md");
let files = list_directory("src");
| Feature | Default | Description |
|---|---|---|
| `native` | Yes | Thread-safe with `Arc<Mutex>` (for native Rust) |
| `wasm` | No | Single-threaded with `Rc<RefCell>` (for browser/Node.js) |
# Run all native tests
cargo test
# Run with verbose output
cargo test -- --nocapture

WASM tests require `wasm-pack`. Install it with:
cargo install wasm-pack

Run WASM tests:
# Test with Node.js (fastest)
wasm-pack test --node --features wasm --no-default-features
# Test with headless Chrome
wasm-pack test --headless --chrome --features wasm --no-default-features
# Test with headless Firefox
wasm-pack test --headless --firefox --features wasm --no-default-features

The test suite includes:
Native tests (39 tests)
- Orchestrator creation and configuration
- Tool registration and execution
- Script compilation and execution
- Error handling (compilation errors, tool errors, runtime errors)
- Execution limits (max operations, max tool calls, timeout)
- JSON type conversion
- Loop and conditional execution
- Timing and metrics recording
WASM tests (25 tests)
- ExecutionLimits constructors and setters
- WasmOrchestrator creation
- Script execution (simple, loops, conditionals, functions)
- Tool registration and execution
- JavaScript callback integration
- Error handling (compilation, runtime, tool errors)
- Max operations and tool call limits
- Complex data structures (arrays, maps, nested)
The orchestrator integrates with AI agents via a tool definition:
// Register as "execute_script" tool for the LLM
Tool {
name: "execute_script",
description: "Execute a Rhai script for programmatic tool orchestration.
Write code that calls registered tools, processes results,
and returns only the final output. Use loops for batch
operations, conditionals for branching logic.",
input_schema: /* script parameter */,
requires_approval: false, // Scripts are sandboxed
}When the LLM needs to perform multi-step operations, it writes a Rhai script instead of making sequential individual tool calls. The script executes locally, and only the final result enters the context window.
To use programmatic tool calling, your LLM needs to know how to write Rhai scripts. Include something like this in your system prompt:
You have access to a script execution tool that runs Rhai code. When you need to:
- Call multiple tools in sequence
- Process data from tool results
- Loop over items or aggregate results
- Apply conditional logic based on tool outputs
Write a Rhai script instead of making individual tool calls.
## Rhai Syntax Quick Reference
Variables and types:
let x = 42; // integer
let name = "hello"; // string
let items = [1, 2, 3]; // array
let config = #{ key: "value" }; // map (object)
String interpolation (use backticks):
let msg = `Hello, ${name}!`;
let result = `Found ${items.len()} items`;
Loops:
for item in items { /* body */ }
for i in 0..10 { /* 0 to 9 */ }
Conditionals:
if x > 5 { "big" } else { "small" }
String methods:
s.len(), s.contains("x"), s.starts_with("x"), s.ends_with("x")
s.split(","), s.trim(), s.to_upper(), s.to_lower()
s.sub_string(start, len), s.index_of("x")
Array methods:
arr.push(item), arr.len(), arr.pop()
arr.filter(|x| x > 5), arr.map(|x| x * 2)
Parsing:
"42".parse_int(), "3.14".parse_float()
Available tools (call as functions):
{TOOL_LIST}
## Important Rules
1. The LAST expression in your script is the return value
2. Use string interpolation with backticks for output: `Result: ${value}`
3. Process data locally - don't return intermediate results
4. Only return the final summary/answer
## Example
Task: Get total expenses for employees 1-3
Script:
let total = 0;
for id in [1, 2, 3] {
let expenses = get_expenses(id); // Returns JSON array
// Parse and sum (simplified)
total += expenses.len() * 100; // Estimate
}
`Total across 3 employees: $${total}`
| Concept | Rhai Syntax | Notes |
|---|---|---|
| Variables | `let x = 5;` | Declared with `let` |
| Reassignment | `x = 10;` | `let` variables are mutable |
| Strings | `"hello"` or `` `hello` `` | Backticks allow interpolation |
| Interpolation | `` `Value: ${x}` `` | Only in backtick strings |
| Arrays | `[1, 2, 3]` | Dynamic, mixed types OK |
| Maps | `#{ a: 1, b: 2 }` | Like JSON objects |
| For loops | `for x in arr { }` | Iterates over arrays |
| Ranges | `for i in 0..5 { }` | 0, 1, 2, 3, 4 |
| If/else | `if x > 5 { a } else { b }` | Expression-based |
| Functions | `fn add(a, b) { a + b }` | Last expr is return |
| Tool calls | `tool_name(arg)` | Registered tools are functions |
| Comments | `// comment` | Single line |
| Unit (null) | `()` | Like None/null |
- open-ptc-agent - Python implementation using Daytona sandbox
- LangChain DeepAgents - LangChain's agent framework with code execution
This project implements patterns from:
- Anthropic's Advanced Tool Use - The original Programmatic Tool Calling concept
- Rhai - The embedded scripting engine that makes this possible
MIT