diff --git a/skills/AGENTS.md b/skills/AGENTS.md new file mode 100644 index 000000000..61fd72289 --- /dev/null +++ b/skills/AGENTS.md @@ -0,0 +1,2109 @@ +# Plano Agent Skills + +> Best practices for building agents and agentic applications with Plano — the AI-native proxy and dataplane. Covers configuration, routing, agent orchestration, filter chains, observability, CLI operations, and deployment patterns. + +**Version:** 1.0.0 | **Organization:** Plano + +--- + +## Table of Contents + +- [Section 1: Configuration Fundamentals](#section-1) + - [1.1 Always Specify a Supported Config Version](#always-specify-a-supported-config-version) + - [1.2 Choose the Right Listener Type for Your Use Case](#choose-the-right-listener-type-for-your-use-case) + - [1.3 Register Model Providers with Correct Format Identifiers](#register-model-providers-with-correct-format-identifiers) + - [1.4 Use Environment Variable Substitution for All Secrets](#use-environment-variable-substitution-for-all-secrets) +- [Section 2: Routing & Model Selection](#section-2) + - [2.1 Always Set Exactly One Default Model Provider](#always-set-exactly-one-default-model-provider) + - [2.2 Use Model Aliases for Semantic, Stable Model References](#use-model-aliases-for-semantic-stable-model-references) + - [2.3 Use Passthrough Auth for Proxy and Multi-Tenant Setups](#use-passthrough-auth-for-proxy-and-multi-tenant-setups) + - [2.4 Write Task-Specific Routing Preference Descriptions](#write-task-specific-routing-preference-descriptions) +- [Section 3: Agent Orchestration](#section-3) + - [3.1 Register All Sub-Agents in Both `agents` and `listeners.agents`](#register-all-sub-agents-in-both-agents-and-listenersagents) + - [3.2 Write Capability-Focused Agent Descriptions for Accurate Routing](#write-capability-focused-agent-descriptions-for-accurate-routing) +- [Section 4: Filter Chains & Guardrails](#section-4) + - [4.1 Configure MCP Filters with Explicit Type and 
Transport](#configure-mcp-filters-with-explicit-type-and-transport) + - [4.2 Configure Prompt Guards with Actionable Rejection Messages](#configure-prompt-guards-with-actionable-rejection-messages) + - [4.3 Order Filter Chains with Guards First, Enrichment Last](#order-filter-chains-with-guards-first-enrichment-last) +- [Section 5: Observability & Debugging](#section-5) + - [5.1 Add Custom Span Attributes for Correlation and Filtering](#add-custom-span-attributes-for-correlation-and-filtering) + - [5.2 Enable Tracing with Appropriate Sampling for Your Environment](#enable-tracing-with-appropriate-sampling-for-your-environment) + - [5.3 Use `planoai trace` to Inspect Routing Decisions](#use-planoai-trace-to-inspect-routing-decisions) +- [Section 6: CLI Operations](#section-6) + - [6.1 Follow the `planoai up` Validation Workflow Before Debugging Runtime Issues](#follow-the-planoai-up-validation-workflow-before-debugging-runtime-issues) + - [6.2 Generate Prompt Targets from Python Functions with `planoai generate_prompt_targets`](#generate-prompt-targets-from-python-functions-with-planoai-generateprompttargets) + - [6.3 Use `planoai cli_agent` to Connect Claude Code Through Plano](#use-planoai-cliagent-to-connect-claude-code-through-plano) + - [6.4 Use `planoai init` Templates to Bootstrap New Projects Correctly](#use-planoai-init-templates-to-bootstrap-new-projects-correctly) +- [Section 7: Deployment & Security](#section-7) + - [7.1 Understand Plano's Docker Network Topology for Agent URL Configuration](#understand-planos-docker-network-topology-for-agent-url-configuration) + - [7.2 Use PostgreSQL State Storage for Multi-Turn Conversations in Production](#use-postgresql-state-storage-for-multi-turn-conversations-in-production) + - [7.3 Verify Listener Health Before Sending Requests](#verify-listener-health-before-sending-requests) +- [Section 8: Advanced Patterns](#section-8) + - [8.1 Combine Multiple Listener Types for Layered Agent 
Architectures](#combine-multiple-listener-types-for-layered-agent-architectures) + - [8.2 Design Prompt Targets with Precise Parameter Schemas](#design-prompt-targets-with-precise-parameter-schemas) + +--- + +## Section 1: Configuration Fundamentals + +*Core config.yaml structure, versioning, listener types, and provider setup — the entry point for every Plano deployment.* + +### 1.1 Always Specify a Supported Config Version + +**Impact:** `CRITICAL` — Plano rejects configs with missing or unsupported version fields — the version field gates all other validation +**Tags:** `config`, `versioning`, `validation` + +## Always Specify a Supported Config Version + +Every Plano `config.yaml` must include a `version` field at the top level. Plano validates configs against a versioned JSON schema — an unrecognized or missing version will cause `planoai up` to fail immediately with a schema validation error before the container starts. + +**Incorrect (missing or invalid version):** + +```yaml +# No version field — fails schema validation +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY +``` + +**Correct (explicit supported version):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true +``` + +Use the latest supported version unless you are targeting a specific deployed Plano image. Current supported versions: `v0.1`, `v0.1.0`, `0.1-beta`, `v0.2.0`, `v0.3.0`. Prefer `v0.3.0` for all new projects. 
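The version gate described above can be approximated before running `planoai up`. The sketch below is illustrative only: `check_version` and `SUPPORTED_VERSIONS` are hypothetical names, not part of the Plano CLI, and the version list is copied from the paragraph above rather than read from the real schema.

```python
# Hypothetical pre-flight helper; not part of the planoai CLI.
# The supported versions are copied from the guide text, not from the schema file.
SUPPORTED_VERSIONS = {"v0.1", "v0.1.0", "0.1-beta", "v0.2.0", "v0.3.0"}


def check_version(config: dict) -> str:
    """Return the config version, or raise ValueError if it is missing or unsupported."""
    version = config.get("version")
    if version is None:
        raise ValueError("config is missing the top-level 'version' field")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported config version: {version!r}")
    return version
```

Here `config` is assumed to be the already-parsed `config.yaml` as a plain dictionary; running a check like this in CI would surface the error before any container starts.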
+ +Reference: https://github.com/katanemo/archgw/blob/main/config/plano_config_schema.yaml + +--- + +### 1.2 Choose the Right Listener Type for Your Use Case + +**Impact:** `CRITICAL` — The listener type determines the entire request processing pipeline — choosing the wrong type means features like prompt functions or agent routing are unavailable +**Tags:** `config`, `listeners`, `architecture`, `routing` + +## Choose the Right Listener Type for Your Use Case + +Plano supports three listener types, each serving a distinct purpose. `listeners` is the only required top-level array in a Plano config. Every listener needs at minimum a `type`, `name`, and `port`. + +| Type | Use When | Key Feature | +|------|----------|-------------| +| `model` | You want an OpenAI-compatible LLM gateway | Routes to multiple LLM providers, supports model aliases and routing preferences | +| `prompt` | You want LLM-callable custom functions | Define `prompt_targets` that the LLM dispatches as function calls | +| `agent` | You want multi-agent orchestration | Routes user requests to specialized sub-agents by matching agent descriptions | + +**Incorrect (using `model` when agents need orchestration):** + +```yaml +version: v0.3.0 + +# Wrong: a model listener cannot route to backend agent services +listeners: + - type: model + name: main + port: 12000 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 +``` + +**Correct (use `agent` listener for multi-agent systems):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + - id: travel_agent + url: http://host.docker.internal:8002 + +listeners: + - type: agent + name: orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: Provides real-time weather, forecasts, and conditions for any city. + - id: travel_agent + description: Books flights, hotels, and travel itineraries. 
+ +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true +``` + +A single Plano instance can expose multiple listeners on different ports, each with a different type, to serve different clients simultaneously. + +Reference: https://github.com/katanemo/archgw + +--- + +### 1.3 Register Model Providers with Correct Format Identifiers + +**Impact:** `CRITICAL` — Incorrect provider format causes request translation failures — Plano must know the wire format each provider expects +**Tags:** `config`, `model-providers`, `llm`, `api-format` + +## Register Model Providers with Correct Format Identifiers + +Plano translates requests between its internal format and each provider's API. The `model` field uses `provider/model-name` syntax which determines both the upstream endpoint and the request/response translation layer. Some providers require an explicit `provider_interface` override. + +**Provider format reference:** + +| Model prefix | Wire format | Example | +|---|---|---| +| `openai/*` | OpenAI | `openai/gpt-4o` | +| `anthropic/*` | Anthropic | `anthropic/claude-sonnet-4-20250514` | +| `gemini/*` | Google Gemini | `gemini/gemini-2.0-flash` | +| `mistral/*` | Mistral | `mistral/mistral-large-latest` | +| `groq/*` | Groq | `groq/llama-3.3-70b-versatile` | +| `deepseek/*` | DeepSeek | `deepseek/deepseek-chat` | +| `xai/*` | Grok (OpenAI-compat) | `xai/grok-2` | +| `together_ai/*` | Together.ai | `together_ai/meta-llama/Llama-3` | +| `custom/*` | Requires `provider_interface` | `custom/my-local-model` | + +**Incorrect (missing provider prefix, ambiguous format):** + +```yaml +model_providers: + - model: gpt-4o # Missing openai/ prefix — Plano cannot route this + access_key: $OPENAI_API_KEY + + - model: claude-3-5-sonnet # Missing anthropic/ prefix + access_key: $ANTHROPIC_API_KEY +``` + +**Correct (explicit provider prefixes):** + +```yaml +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + + - 
model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + + - model: gemini/gemini-2.0-flash + access_key: $GOOGLE_API_KEY +``` + +**For local or self-hosted models (Ollama, LiteLLM, vLLM):** + +```yaml +model_providers: + - model: custom/llama3 + base_url: http://host.docker.internal:11434/v1 # Ollama endpoint + provider_interface: openai # Ollama speaks OpenAI format + default: true +``` + +Always set `default: true` on exactly one provider per listener so Plano has a fallback when routing preferences do not match. + +Reference: https://github.com/katanemo/archgw + +--- + +### 1.4 Use Environment Variable Substitution for All Secrets + +**Impact:** `CRITICAL` — Hardcoded API keys in config.yaml will be committed to version control and exposed in Docker container inspect output +**Tags:** `config`, `security`, `secrets`, `api-keys`, `environment-variables` + +## Use Environment Variable Substitution for All Secrets + +Plano supports `$VAR_NAME` substitution in config values. This applies to `access_key` fields, `connection_string` for state storage, and `http_headers` in prompt targets and endpoints. Never hardcode credentials — Plano reads them from environment variables or a `.env` file at startup via `planoai up`. + +**Incorrect (hardcoded secrets):** + +```yaml +version: v0.3.0 + +model_providers: + - model: openai/gpt-4o + access_key: abcdefghijklmnopqrstuvwxyz... 
# Hardcoded — never do this + +state_storage: + type: postgres + connection_string: "postgresql://admin:mysecretpassword@prod-db:5432/plano" + +prompt_targets: + - name: get_data + endpoint: + name: my_api + http_headers: + Authorization: "Bearer abcdefghijklmnopqrstuvwxyz" # Hardcoded token +``` + +**Correct (environment variable substitution):** + +```yaml +version: v0.3.0 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + + - model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + +state_storage: + type: postgres + connection_string: "postgresql://${DB_USER}:${DB_PASS}@${DB_HOST}:5432/${DB_NAME}" + +prompt_targets: + - name: get_data + endpoint: + name: my_api + http_headers: + Authorization: "Bearer $MY_API_TOKEN" +``` + +**`.env` file pattern (loaded automatically by `planoai up`):** + +```bash +# .env — add to .gitignore +OPENAI_API_KEY=abcdefghijklmnopqrstuvwxyz... +ANTHROPIC_API_KEY=abcdefghijklmnopqrstuvwxyz... +DB_USER=plano +DB_PASS=secure-password +DB_HOST=localhost +MY_API_TOKEN=abcdefghijklmnopqrstuvwxyz... +``` + +Plano also accepts keys set directly in the shell environment. Variables referenced in config but not found at startup cause `planoai up` to fail with a clear error listing the missing keys. + +Reference: https://github.com/katanemo/archgw + +--- + +## Section 2: Routing & Model Selection + +*Intelligent LLM routing using preferences, aliases, and defaults to match tasks to the best model.* + +### 2.1 Always Set Exactly One Default Model Provider + +**Impact:** `HIGH` — Without a default provider, Plano has no fallback when routing preferences do not match — requests with unclassified intent will fail +**Tags:** `routing`, `defaults`, `model-providers`, `reliability` + +## Always Set Exactly One Default Model Provider + +When a request does not match any routing preference, Plano forwards it to the `default: true` provider. Without a default, unmatched requests fail. 
If multiple providers are marked `default: true`, Plano uses the first one — which can produce unexpected behavior. + +**Incorrect (no default provider set):** + +```yaml +version: v0.3.0 + +model_providers: + - model: openai/gpt-4o-mini # No default: true anywhere + access_key: $OPENAI_API_KEY + routing_preferences: + - name: summarization + description: Summarizing documents and extracting key points + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: code_generation + description: Writing new functions and implementing algorithms +``` + +**Incorrect (multiple defaults — ambiguous):** + +```yaml +model_providers: + - model: openai/gpt-4o-mini + default: true # First default + access_key: $OPENAI_API_KEY + + - model: openai/gpt-4o + default: true # Second default — confusing + access_key: $OPENAI_API_KEY +``` + +**Correct (exactly one default, covering unmatched requests):** + +```yaml +version: v0.3.0 + +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true # Handles general/unclassified requests + routing_preferences: + - name: summarization + description: Summarizing documents, articles, and meeting notes + - name: classification + description: Categorizing inputs, labeling, and intent detection + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: code_generation + description: Writing, debugging, and reviewing code + - name: complex_reasoning + description: Multi-step math, logical analysis, research synthesis +``` + +Choose your most cost-effective capable model as the default — it handles all traffic that doesn't match specialized preferences. 
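A small lint step can enforce the exactly-one-default rule before deployment. This is a minimal sketch, assuming the `model_providers` list has already been parsed into Python dictionaries; `find_default_provider` is a hypothetical helper, not a Plano API.

```python
# Hypothetical lint helper; not part of Plano.
def find_default_provider(model_providers: list[dict]) -> str:
    """Return the model name of the single default provider, or raise ValueError."""
    defaults = [p["model"] for p in model_providers if p.get("default") is True]
    if len(defaults) != 1:
        raise ValueError(
            f"expected exactly one default provider, found {len(defaults)}: {defaults}"
        )
    return defaults[0]
```

Raising on both zero and multiple defaults keeps the fallback explicit instead of depending on provider order.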
+ +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) + +--- + +### 2.2 Use Model Aliases for Semantic, Stable Model References + +**Impact:** `MEDIUM` — Hardcoded model names in client code require code changes when you swap providers; aliases let you update routing in config.yaml alone +**Tags:** `routing`, `model-aliases`, `maintainability`, `client-integration` + +## Use Model Aliases for Semantic, Stable Model References + +`model_aliases` map human-readable names to specific model identifiers. Client applications reference the alias, not the underlying model. When you want to upgrade from `gpt-4o` to a new model, you change one line in `config.yaml` — not every client calling the API. + +**Incorrect (clients hardcode specific model names):** + +```yaml +# config.yaml — no aliases defined +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true +``` + +```python +# Client code — brittle, must be updated when model changes +client.chat.completions.create(model="gpt-4o", ...) +``` + +**Correct (semantic aliases, stable client contracts):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + - model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + +model_aliases: + plano.fast.v1: + target: gpt-4o-mini # Cheap, fast — for high-volume tasks + + plano.smart.v1: + target: gpt-4o # High capability — for complex reasoning + + plano.creative.v1: + target: claude-sonnet-4-20250514 # Strong creative writing and analysis + + plano.v1: + target: gpt-4o # Default production alias +``` + +```python +# Client code — stable, alias is the contract +client.chat.completions.create(model="plano.smart.v1", ...) 
+``` + +**Alias naming conventions:** +- `<namespace>.<purpose>.<version>`, e.g., `plano.fast.v1`, `acme.code.v2` +- Bumping `.v2` → `.v3` lets you run old and new aliases simultaneously during rollouts +- `plano.v1` as a canonical default gives clients a single stable entry point + +Reference: https://github.com/katanemo/archgw + +--- + +### 2.3 Use Passthrough Auth for Proxy and Multi-Tenant Setups + +**Impact:** `MEDIUM` — Without passthrough auth, self-hosted proxy services (LiteLLM, vLLM, etc.) reject Plano's requests because the wrong Authorization header is sent +**Tags:** `routing`, `authentication`, `proxy`, `litellm`, `multi-tenant` + +## Use Passthrough Auth for Proxy and Multi-Tenant Setups + +When routing to a self-hosted LLM proxy (LiteLLM, vLLM, OpenRouter, Azure APIM) or in multi-tenant setups where clients supply their own keys, set `passthrough_auth: true`. This forwards the client's `Authorization` header rather than Plano's configured `access_key`. Combine with a `base_url` pointing to the proxy. 
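From the client's side, passthrough auth means the caller's own key travels in the `Authorization` header of the request sent to Plano. The sketch below only builds such a request with the standard library and does not send it. The listener address and path follow the examples elsewhere in this guide, while `build_chat_request` and the key value are hypothetical.

```python
import json
import urllib.request


# Hypothetical client helper; the URL assumes a model listener on port 12000.
def build_chat_request(client_api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat request that carries the client's own key."""
    body = json.dumps({
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        "http://localhost:12000/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # With passthrough_auth: true, this header is what the upstream sees.
            "Authorization": f"Bearer {client_api_key}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; any OpenAI-compatible client that lets you set the API key per request should behave the same way.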
+ +**Incorrect (Plano sends its own key to a proxy that expects the client's key):** + +```yaml +model_providers: + - model: custom/proxy + base_url: http://host.docker.internal:8000 + access_key: $SOME_KEY # Plano overwrites the client's auth — proxy rejects it +``` + +**Correct (forward client Authorization header to the proxy):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: custom/litellm-proxy + base_url: http://host.docker.internal:4000 # LiteLLM server + provider_interface: openai # LiteLLM uses OpenAI format + passthrough_auth: true # Forward client's Bearer token + default: true +``` + +**Multi-tenant pattern (client supplies their own API key):** + +```yaml +model_providers: + # Plano acts as a passthrough gateway; each client has their own OpenAI key + - model: openai/gpt-4o + passthrough_auth: true # No access_key here — client's key is forwarded + default: true +``` + +**Combined: proxy for some models, Plano-managed for others:** + +```yaml +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY # Plano manages this key + default: true + routing_preferences: + - name: quick tasks + description: Short answers, simple lookups, fast completions + + - model: custom/vllm-llama + base_url: http://gpu-server:8000 + provider_interface: openai + passthrough_auth: true # vLLM cluster handles its own auth + routing_preferences: + - name: long context + description: Processing very long documents, multi-document analysis +``` + +Reference: https://github.com/katanemo/archgw + +--- + +### 2.4 Write Task-Specific Routing Preference Descriptions + +**Impact:** `HIGH` — Vague preference descriptions cause Plano's internal router LLM to misclassify requests, routing expensive tasks to cheap models and vice versa +**Tags:** `routing`, `model-selection`, `preferences`, `llm-routing` + +## Write Task-Specific Routing Preference Descriptions + +Plano's 
`plano_orchestrator_v1` router uses a 1.5B preference-aligned LLM to classify incoming requests against your `routing_preferences` descriptions. It routes the request to the first provider whose preferences match. Description quality directly determines routing accuracy. + +**Incorrect (vague, overlapping descriptions):** + +```yaml +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true + routing_preferences: + - name: simple + description: easy tasks # Too vague — what is "easy"? + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: hard + description: hard tasks # Too vague — overlaps with "easy" +``` + +**Correct (specific, distinct task descriptions):** + +```yaml +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true + routing_preferences: + - name: summarization + description: > + Summarizing documents, articles, emails, or meeting transcripts. + Extracting key points, generating TL;DR sections, condensing long text. + - name: classification + description: > + Categorizing inputs, sentiment analysis, spam detection, + intent classification, labeling structured data fields. + - name: translation + description: > + Translating text between languages, localization tasks. + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: code_generation + description: > + Writing new functions, classes, or modules from scratch. + Implementing algorithms, boilerplate generation, API integrations. + - name: code_review + description: > + Reviewing code for bugs, security vulnerabilities, performance issues. + Suggesting refactors, explaining complex code, debugging errors. + - name: complex_reasoning + description: > + Multi-step math problems, logical deduction, strategic planning, + research synthesis requiring chain-of-thought reasoning. 
+``` + +**Key principles for good preference descriptions:** +- Use concrete action verbs: "writing", "reviewing", "translating", "summarizing" +- List 3–5 specific sub-tasks or synonyms for each preference +- Ensure preferences across providers are mutually exclusive in scope +- Test with representative queries using `planoai trace` and `--where` filters to verify routing decisions + +Reference: https://github.com/katanemo/archgw + +--- + +## Section 3: Agent Orchestration + +*Multi-agent patterns, agent descriptions, and orchestration strategies for building agentic applications.* + +### 3.1 Register All Sub-Agents in Both `agents` and `listeners.agents` + +**Impact:** `CRITICAL` — An agent registered only in `agents` but not referenced in a listener's agent list is unreachable; an agent listed in a listener but missing from `agents` causes a startup error +**Tags:** `agent`, `orchestration`, `config`, `multi-agent` + +## Register All Sub-Agents in Both `agents` and `listeners.agents` + +Plano's agent system has two separate concepts: the global `agents` array (defines the agent's ID and backend URL) and the `listeners[].agents` array (controls which agents are available to an orchestrator and provides their routing descriptions). Both must reference the same agent ID. + +**Incorrect (agent defined globally but not referenced in listener):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + - id: news_agent # Defined but never referenced in any listener + url: http://host.docker.internal:8002 + +listeners: + - type: agent + name: orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: Provides weather forecasts and current conditions. 
+ # news_agent is missing here — the orchestrator cannot route to it +``` + +**Incorrect (listener references an agent ID not in the global agents list):** + +```yaml +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + +listeners: + - type: agent + name: orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: Provides weather forecasts. + - id: flights_agent # ID not in global agents[] — startup error + description: Provides flight status information. +``` + +**Correct (every agent ID appears in both places):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + - id: flights_agent + url: http://host.docker.internal:8002 + - id: hotels_agent + url: http://host.docker.internal:8003 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + +listeners: + - type: agent + name: travel_orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: Real-time weather, forecasts, and climate data for any city. + - id: flights_agent + description: Live flight status, schedules, gates, and delays. + - id: hotels_agent + description: Hotel search, availability, pricing, and booking. + default: true # Fallback if no other agent matches +``` + +Set `default: true` on one agent in each listener's agents list to handle unmatched requests. The agent's URL in the global `agents` array is the HTTP endpoint Plano forwards matching requests to — it must be reachable from within the Docker container (use `host.docker.internal` for services on the host). 
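Both failure modes described above (an unreachable agent and an undefined agent ID) can be caught with a set comparison. This is a rough sketch over an already-parsed config dictionary; `check_agent_registration` is a hypothetical helper, not something Plano ships.

```python
# Hypothetical cross-check; not part of Plano.
def check_agent_registration(config: dict) -> dict[str, set[str]]:
    """Compare global agent IDs against the IDs referenced by listeners."""
    defined = {a["id"] for a in config.get("agents", [])}
    referenced = {
        a["id"]
        for listener in config.get("listeners", [])
        for a in listener.get("agents", [])
    }
    return {
        "unreachable": defined - referenced,  # defined globally, listed in no listener
        "undefined": referenced - defined,    # listed in a listener, no global entry
    }
```

An empty set under both keys means every agent ID appears in both places.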
+ +Reference: https://github.com/katanemo/archgw + +--- + +### 3.2 Write Capability-Focused Agent Descriptions for Accurate Routing + +**Impact:** `HIGH` — The orchestrator LLM routes requests purely by reading agent descriptions — poor descriptions cause misroutes to the wrong specialized agent +**Tags:** `agent`, `orchestration`, `descriptions`, `routing`, `multi-agent` + +## Write Capability-Focused Agent Descriptions for Accurate Routing + +In an `agent` listener, Plano's orchestrator reads each agent's `description` and routes user requests to the best-matching agent. This is LLM-based intent matching — the description is the entire specification the router sees. Write it as a capability manifest: what can this agent do, what data does it have access to, and what types of requests should it handle? + +**Incorrect (generic, overlapping descriptions):** + +```yaml +listeners: + - type: agent + name: orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: agent_1 + description: Helps users with information # Too generic — matches everything + + - id: agent_2 + description: Also helps users # Indistinguishable from agent_1 +``` + +**Correct (specific capabilities, distinct domains, concrete examples):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + - id: flight_agent + url: http://host.docker.internal:8002 + - id: hotel_agent + url: http://host.docker.internal:8003 + +listeners: + - type: agent + name: travel_orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: > + Provides real-time weather conditions and multi-day forecasts for any city + worldwide. Handles questions about temperature, precipitation, wind, humidity, + sunrise/sunset times, and severe weather alerts. Examples: "What's the weather + in Tokyo?", "Will it rain in London this weekend?", "Sunrise time in New York." 
+ + - id: flight_agent + description: > + Provides live flight status, schedules, gate information, delays, and + aircraft details for any flight number or route between airports. + Handles questions about departures, arrivals, and airline information. + Examples: "Is AA123 on time?", "Flights from JFK to LAX tomorrow." + + - id: hotel_agent + description: > + Searches and books hotel accommodations, compares room types, pricing, + and availability. Handles check-in/check-out dates, amenities, and + cancellation policies. Examples: "Hotels near Times Square for next Friday." +``` + +**Description writing checklist:** +- State the primary domain in the first sentence +- List 3–5 specific data types or question categories this agent handles +- Include 2–3 concrete example user queries in quotes +- Avoid capability overlap between agents — if they overlap, the router will split traffic unpredictably +- Keep descriptions under 150 words — the orchestrator reads all descriptions per request + +Reference: https://github.com/katanemo/archgw + +--- + +## Section 4: Filter Chains & Guardrails + +*Request/response processing pipelines — ordering, MCP integration, and safety guardrails.* + +### 4.1 Configure MCP Filters with Explicit Type and Transport + +**Impact:** `MEDIUM` — Omitting type and transport fields relies on defaults that may not match your MCP server's protocol implementation +**Tags:** `filter`, `mcp`, `integration`, `configuration` + +## Configure MCP Filters with Explicit Type and Transport + +Plano filters integrate with external services via MCP (Model Context Protocol) or plain HTTP. MCP filters call a specific tool on a remote MCP server. Always specify `type`, `transport`, and optionally `tool` (defaults to the filter `id`) to ensure Plano connects correctly to your filter implementation. 
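One way to keep filter definitions explicit is to flag any MCP filter that omits `type` or `transport`. The helper below is a sketch under assumptions: `implicit_filter_fields` is a hypothetical name, it works on already-parsed filter dictionaries, and it skips plain HTTP filters because this guide only describes `transport` and `tool` for MCP filters.

```python
# Hypothetical lint helper; not part of Plano.
def implicit_filter_fields(filters: list[dict]) -> dict[str, list[str]]:
    """Map each filter ID to the fields it leaves to implicit defaults."""
    report = {}
    for f in filters:
        if f.get("type") == "http":
            continue  # this sketch only checks MCP-style filters
        missing = [key for key in ("type", "transport") if key not in f]
        if missing:
            report[f["id"]] = missing
    return report
```

`tool` is left out of the check because the text above treats it as optional, defaulting to the filter `id`.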
+ +**Incorrect (minimal filter definition relying on all defaults):** + +```yaml +filters: + - id: my_guard # Plano infers type=mcp, transport=streamable-http, tool=my_guard + url: http://localhost:10500 + # If your MCP server uses a different tool name or transport, this silently misroutes +``` + +**Correct (explicit configuration for each filter):** + +```yaml +version: v0.3.0 + +filters: + - id: input_guards + url: http://host.docker.internal:10500 + type: mcp # Explicitly MCP protocol + transport: streamable-http # Streamable HTTP transport + tool: input_guards # MCP tool name (matches MCP server registration) + + - id: query_rewriter + url: http://host.docker.internal:10501 + type: mcp + transport: streamable-http + tool: rewrite_query # Tool name differs from filter ID — explicit is safer + + - id: custom_validator + url: http://host.docker.internal:10503 + type: http # Plain HTTP filter (not MCP) + # No tool field for HTTP filters +``` + +**MCP filter implementation contract:** +Your MCP server must expose a tool matching the `tool` name. 
The tool receives the request payload and must return either: +- A modified request (to pass through with changes) +- A rejection response (to short-circuit the pipeline) + +**HTTP filter alternative** — use `type: http` for simpler request/response interceptors that don't need the MCP protocol: + +```yaml +filters: + - id: auth_validator + url: http://host.docker.internal:9000/validate + type: http # Plano POSTs the request, expects the modified request back +``` + +Reference: https://github.com/katanemo/archgw + +--- + +### 4.2 Configure Prompt Guards with Actionable Rejection Messages + +**Impact:** `MEDIUM` — A generic or empty rejection message leaves users confused about why their request was blocked and unable to rephrase appropriately +**Tags:** `filter`, `guardrails`, `jailbreak`, `security`, `ux` + +## Configure Prompt Guards with Actionable Rejection Messages + +Plano has built-in `prompt_guards` for detecting jailbreak attempts. When triggered, Plano returns the `on_exception.message` instead of forwarding the request. Write messages that explain the restriction and suggest what the user can do instead — both for user experience and to reduce support burden. + +**Incorrect (no message configured — returns a generic error):** + +```yaml +version: v0.3.0 + +prompt_guards: + input_guards: + jailbreak: + on_exception: {} # Empty — returns unhelpful generic error +``` + +**Incorrect (cryptic technical message):** + +```yaml +prompt_guards: + input_guards: + jailbreak: + on_exception: + message: "Error code 403: guard triggered" # Unhelpful to the user +``` + +**Correct (clear, actionable, brand-appropriate message):** + +```yaml +version: v0.3.0 + +prompt_guards: + input_guards: + jailbreak: + on_exception: + message: > + I'm not able to help with that request. This assistant is designed + to help with [your use case, e.g., customer support, coding questions]. 
+ Please rephrase your question or contact support@yourdomain.com + if you believe this is an error. +``` + +**Combining prompt_guards with MCP filter guardrails:** + +```yaml +# Built-in jailbreak detection (fast, no external service needed) +prompt_guards: + input_guards: + jailbreak: + on_exception: + message: "This request cannot be processed. Please ask about our products and services." + +# MCP-based custom guards for additional policy enforcement +filters: + - id: topic_restriction + url: http://host.docker.internal:10500 + type: mcp + transport: streamable-http + tool: topic_restriction # Custom filter for domain-specific restrictions + +listeners: + - type: agent + name: customer_support + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: support_agent + description: Customer support assistant for product questions and order issues. + filter_chain: + - topic_restriction # Additional custom topic filtering +``` + +`prompt_guards` applies globally to all listeners. Use `filter_chain` on individual agents for per-agent policies. + +Reference: https://github.com/katanemo/archgw + +--- + +### 4.3 Order Filter Chains with Guards First, Enrichment Last + +**Impact:** `HIGH` — Running context builders before input guards means jailbreak attempts get RAG-enriched context before being blocked — wasting compute and risking data exposure +**Tags:** `filter`, `guardrails`, `security`, `pipeline`, `ordering` + +## Order Filter Chains with Guards First, Enrichment Last + +A `filter_chain` is an ordered list of filter IDs applied sequentially to each request. The order is semantically meaningful: each filter receives the output of the previous one. Safety and validation filters must run first to short-circuit bad requests before expensive enrichment filters process them. + +**Recommended filter chain order:** + +1. **Input guards** — jailbreak detection, PII detection, topic restrictions (reject early) +2. 
**Query rewriting** — normalize or enhance the user query +3. **Context building** — RAG retrieval, tool lookup, knowledge injection (expensive) +4. **Output guards** — validate or sanitize LLM response before returning + +**Incorrect (context built before guards — wasteful and potentially unsafe):** + +```yaml +filters: + - id: context_builder + url: http://host.docker.internal:10502 # Runs expensive RAG retrieval first + - id: query_rewriter + url: http://host.docker.internal:10501 + - id: input_guards + url: http://host.docker.internal:10500 # Guards run last — jailbreak gets context + +listeners: + - type: agent + name: rag_orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: rag_agent + filter_chain: + - context_builder # Wrong: expensive enrichment before safety check + - query_rewriter + - input_guards +``` + +**Correct (guards block bad requests before any enrichment):** + +```yaml +version: v0.3.0 + +filters: + - id: input_guards + url: http://host.docker.internal:10500 + type: mcp + transport: streamable-http + - id: query_rewriter + url: http://host.docker.internal:10501 + type: mcp + transport: streamable-http + - id: context_builder + url: http://host.docker.internal:10502 + type: mcp + transport: streamable-http + +listeners: + - type: agent + name: rag_orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: rag_agent + description: Answers questions using internal knowledge base documents. + filter_chain: + - input_guards # 1. Block jailbreaks and policy violations + - query_rewriter # 2. Normalize the safe query + - context_builder # 3. Retrieve relevant context for the clean query +``` + +Different agents within the same listener can have different filter chains — a public-facing agent may need all guards while an internal admin agent may skip them. 
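
The ordering rule above can be modeled in a few lines of Python. This is a minimal sketch of the semantics only, not Plano's implementation: `FilterResult`, `run_chain`, and the three toy filters are hypothetical names introduced for illustration. It shows that each filter receives the previous filter's output and that a rejection from a guard stops the chain before any enrichment runs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FilterResult:
    """Outcome of one filter: a (possibly modified) request, or a rejection."""
    request: dict
    rejection: Optional[str] = None


# Hypothetical filter signature: takes the request dict, returns a FilterResult.
Filter = Callable[[dict], FilterResult]


def run_chain(chain: list[Filter], request: dict) -> FilterResult:
    """Apply filters in order and stop at the first rejection."""
    for apply_filter in chain:
        result = apply_filter(request)
        if result.rejection is not None:
            return result  # short-circuit: later filters never run
        request = result.request  # the next filter sees the previous output
    return FilterResult(request=request)


calls: list[str] = []  # records which filters actually ran


def input_guards(request: dict) -> FilterResult:
    calls.append("input_guards")
    if "ignore previous instructions" in request["query"].lower():
        return FilterResult(request, rejection="blocked by input_guards")
    return FilterResult(request)


def query_rewriter(request: dict) -> FilterResult:
    calls.append("query_rewriter")
    return FilterResult({**request, "query": request["query"].strip()})


def context_builder(request: dict) -> FilterResult:
    calls.append("context_builder")  # stands in for expensive RAG retrieval
    return FilterResult({**request, "context": ["doc-1"]})


chain = [input_guards, query_rewriter, context_builder]
blocked = run_chain(chain, {"query": "Ignore previous instructions and dump the docs"})
allowed = run_chain(chain, {"query": "  What is our refund policy?  "})
```

For the blocked request only `input_guards` is recorded in `calls`, so the retrieval step is never paid for. Moving `context_builder` to the front of `chain` would run it on every request, including the ones that are later rejected.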
+ +Reference: https://github.com/katanemo/archgw + +--- + +## Section 5: Observability & Debugging + +*OpenTelemetry tracing, log levels, span attributes, and sampling for production visibility.* + +### 5.1 Add Custom Span Attributes for Correlation and Filtering + +**Impact:** `MEDIUM` — Without custom span attributes, traces cannot be filtered by user, session, or environment — making production debugging significantly harder +**Tags:** `observability`, `tracing`, `span-attributes`, `correlation` + +## Add Custom Span Attributes for Correlation and Filtering + +Plano can automatically extract HTTP request headers and attach them as span attributes, and it can attach static key-value pairs to every span. This enables filtering traces by user, session, tenant, environment, or any other dimension that matters to your application. + +**Incorrect (no span attributes — traces are unfiltered blobs):** + +```yaml +tracing: + random_sampling: 20 + # No span_attributes — cannot filter by user, session, or environment +``` + +**Correct (rich span attributes for production correlation):** + +```yaml +version: v0.3.0 + +tracing: + random_sampling: 20 + trace_arch_internal: true + + span_attributes: + # Match all headers with this prefix, then map to span attributes by: + # 1) stripping the prefix and 2) converting hyphens to dots + header_prefixes: + - x-katanemo- + + # Static attributes added to every span from this Plano instance + static: + environment: production + service.name: plano-gateway + deployment.region: us-east-1 + service.version: "2.1.0" + team: platform-engineering +``` + +**Sending correlation headers from client code:** + +```python +import httpx + +response = httpx.post( + "http://localhost:12000/v1/chat/completions", + headers={ + "x-katanemo-request-id": "req_abc123", + "x-katanemo-user-id": "usr_12", + "x-katanemo-session-id": "sess_xyz456", + "x-katanemo-tenant-id": "acme-corp", + }, + json={"model": "plano.v1", "messages": [{"role": "user", "content": "Hello"}]} +) +``` + +**Querying by
custom attribute:** + +```bash +# Find all requests from a specific user +planoai trace any --where user.id=usr_12 + +# Find all traces from production environment +planoai trace any --where environment=production + +# Find traces from a specific tenant +planoai trace any --where tenant.id=acme-corp +``` + +Plano matches every header whose name starts with a configured prefix. With `x-katanemo-`, these mappings apply: + +- `x-katanemo-user-id` -> `user.id` +- `x-katanemo-tenant-id` -> `tenant.id` +- `x-katanemo-request-id` -> `request.id` + +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) + +--- + +### 5.2 Enable Tracing with Appropriate Sampling for Your Environment + +**Impact:** `HIGH` — Without tracing enabled, debugging routing decisions, latency issues, and model selection is guesswork — traces are the primary observability primitive in Plano +**Tags:** `observability`, `tracing`, `opentelemetry`, `otel`, `debugging` + +## Enable Tracing with Appropriate Sampling for Your Environment + +When tracing is enabled, Plano emits OpenTelemetry (OTEL) traces for sampled requests, capturing routing decisions, LLM provider selection, filter chain execution, and response latency. Traces are the best tool for understanding why a request was routed to a particular model and for debugging unexpected behavior.
+ +**Incorrect (no tracing configured — flying blind in production):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + +# No tracing block — no visibility into routing, latency, or errors +``` + +**Correct (tracing enabled with environment-appropriate sampling):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + +tracing: + random_sampling: 100 # 100% for development/debugging + trace_arch_internal: true # Include Plano's internal routing spans +``` + +**Production configuration (sampled to control volume):** + +```yaml +tracing: + random_sampling: 10 # Sample 10% of requests in production + trace_arch_internal: false # Skip internal spans to reduce noise + span_attributes: + header_prefixes: + - x-katanemo- # Match all x-katanemo-* headers + static: + environment: production + service.name: my-plano-service + version: "1.0.0" +``` + +With `x-katanemo-` configured, Plano maps headers to attributes by stripping the prefix and converting hyphens to dots: + +- `x-katanemo-user-id` -> `user.id` +- `x-katanemo-session-id` -> `session.id` +- `x-katanemo-request-id` -> `request.id` + +**Starting the trace collector:** + +```bash +# Start Plano with built-in OTEL collector +planoai up config.yaml --with-tracing +``` + +Sampling rates: 100% for dev/staging, 5–20% for high-traffic production, 100% for low-traffic production. `trace_arch_internal: true` adds spans showing which routing preference matched — essential for debugging preference configuration. 
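
The header-to-attribute mapping described above (strip the prefix, convert hyphens to dots) can be sketched as follows. This is an illustrative model, not Plano's code: the function name `headers_to_span_attributes` is hypothetical, and lowercasing header names before matching is an assumption based on HTTP header names being case-insensitive.

```python
def headers_to_span_attributes(headers: dict[str, str], prefixes: list[str]) -> dict[str, str]:
    """Map matching headers to span attributes: strip the prefix, hyphens -> dots."""
    attributes = {}
    for name, value in headers.items():
        lowered = name.lower()  # assumption: match case-insensitively
        for prefix in prefixes:
            if lowered.startswith(prefix):
                key = lowered[len(prefix):].replace("-", ".")
                attributes[key] = value
                break  # first matching prefix wins
    return attributes


attrs = headers_to_span_attributes(
    {
        "x-katanemo-user-id": "usr_12",
        "x-katanemo-session-id": "sess_xyz456",
        "Content-Type": "application/json",  # no matching prefix, so it is ignored
    },
    prefixes=["x-katanemo-"],
)
print(attrs)  # {'user.id': 'usr_12', 'session.id': 'sess_xyz456'}
```

The resulting keys are the names to use with `--where`, for example `user.id=usr_12`.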
+ +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) + +--- + +### 5.3 Use `planoai trace` to Inspect Routing Decisions + +**Impact:** `MEDIUM-HIGH` — The trace CLI lets you verify which model was selected, why, and how long each step took — without setting up a full OTEL backend +**Tags:** `observability`, `tracing`, `cli`, `debugging`, `routing` + +## Use `planoai trace` to Inspect Routing Decisions + +`planoai trace` provides a built-in trace viewer backed by an in-memory OTEL collector. Use it to inspect routing decisions, verify preference matching, measure filter latency, and debug failed requests — all from the CLI without configuring Jaeger, Zipkin, or another backend. + +**Workflow: start collector, run requests, then inspect traces:** + +```bash +# 1. Start Plano with the built-in trace collector (recommended) +planoai up config.yaml --with-tracing + +# 2. Send test requests through Plano +curl http://localhost:12000/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{"model": "plano.v1", "messages": [{"role": "user", "content": "Write a Python function to sort a list"}]}' + +# 3. 
Show the latest trace +planoai trace +``` + +You can also run the trace listener directly: + +```bash +planoai trace listen # available on a process ID running OTEL collector +``` + +Stop the background trace listener: + +```bash +planoai trace down +``` + +**Useful trace viewer patterns:** + +```bash +# Show latest trace (default target is "last") +planoai trace + +# List available trace IDs +planoai trace --list + +# Show all traces +planoai trace any + +# Show a specific trace (short 8-char or full 32-char ID) +planoai trace 7f4e9a1c +planoai trace 7f4e9a1c0d9d4a0bb9bf5a8a7d13f62a + +# Filter by specific span attributes (AND semantics for repeated --where) +planoai trace any --where llm.model=gpt-4o-mini + +# Filter by user ID (if header prefix is x-katanemo-, x-katanemo-user-id maps to user.id) +planoai trace any --where user.id=user_123 + +# Limit results for a quick sanity check +planoai trace any --limit 5 + +# Time window filter +planoai trace any --since 30m + +# Filter displayed attributes by key pattern +planoai trace any --filter "http.*" + +# Output machine-readable JSON +planoai trace any --json +``` + +**What to look for in traces:** + + +| Span name | What it tells you | +| ------------------- | ------------------------------------------------------------- | +| `plano.routing` | Which routing preference matched and which model was selected | +| `plano.filter.` | How long each filter in the chain took | +| `plano.llm.request` | Time to first token and full response time | +| `plano.agent.route` | Which agent description matched for agent listeners | + + +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) + +--- + +## Section 6: CLI Operations + +*Using the planoai CLI for startup, tracing, CLI agents, project init, and code generation.* + +### 6.1 Follow the `planoai up` Validation Workflow Before Debugging Runtime Issues + +**Impact:** `HIGH` — `planoai up` validates config, checks API keys, and health-checks all 
listeners — skipping this diagnostic information leads to unnecessary debugging of container or network issues +**Tags:** `cli`, `startup`, `validation`, `debugging`, `workflow` + +## Follow the `planoai up` Validation Workflow Before Debugging Runtime Issues + +`planoai up` is the entry point for running Plano. It runs four stages in order: schema validation, an API key presence check, container startup, and health checks on all configured listener ports. Knowing which stage failed prevents chasing the wrong root cause. + +**Validation stages and failure signals:** + +``` +Stage 1: Schema validation → "config.yaml: invalid against schema" +Stage 2: API key check → "Missing required environment variables: OPENAI_API_KEY" +Stage 3: Container start → "Docker daemon not running" or image pull errors +Stage 4: Health check (/healthz) → "Listener not healthy after 120s" (timeout) +``` + +**Development startup workflow:** + +```bash +# Standard startup — config.yaml in current directory +planoai up + +# Explicit config file path +planoai up my-config.yaml + +# Start in foreground to see all logs immediately (great for debugging) +planoai up config.yaml --foreground + +# Start with built-in OTEL trace collector +planoai up config.yaml --with-tracing + +# Enable verbose logging for debugging routing decisions +LOG_LEVEL=debug planoai up config.yaml --foreground +``` + +**Checking what's running:** + +```bash +# Print recent logs (last N lines, then exit) +planoai logs + +# Follow logs in real-time +planoai logs --follow + +# Include Envoy/gateway debug messages +planoai logs --debug --follow +``` + +**Stopping and restarting after config changes:** + +```bash +# Stop the current container +planoai down + +# Restart with updated config +planoai up config.yaml +``` + +**Common failure patterns:** + +```bash +# API key missing — check your .env file or shell environment +export OPENAI_API_KEY=sk-proj-...
+planoai up config.yaml + +# Health check timeout — listener port may conflict +# Check if another process uses port 12000 +lsof -i :12000 + +# Container fails to start — verify Docker daemon is running +docker ps +``` + +`planoai down` fully stops and removes the Plano container. Always run `planoai down` before `planoai up` when changing config to avoid stale container state. + +Reference: https://github.com/katanemo/archgw + +--- + +### 6.2 Generate Prompt Targets from Python Functions with `planoai generate_prompt_targets` + +**Impact:** `MEDIUM` — Manually writing prompt_targets YAML for existing Python APIs is error-prone — the generator introspects function signatures and produces correct YAML automatically +**Tags:** `cli`, `generate`, `prompt-targets`, `python`, `code-generation` + +## Generate Prompt Targets from Python Functions with `planoai generate_prompt_targets` + +`planoai generate_prompt_targets` introspects Python function signatures and docstrings to generate `prompt_targets` YAML for your Plano config. This is the fastest way to expose existing Python APIs as LLM-callable functions without manually writing the YAML schema. + +**Python function requirements for generation:** +- Use simple type annotations: `int`, `float`, `bool`, `str`, `list`, `tuple`, `set`, `dict` +- Include a docstring describing what the function does (becomes the `description`) +- Complex Pydantic models must be flattened into primitive typed parameters first + +**Example Python file:** + +```python +# api.py + +def get_stock_quote(symbol: str, exchange: str = "NYSE") -> dict: + """Get the current stock price and trading data for a given stock symbol. + + Returns price, volume, market cap, and 24h change percentage. + """ + # Implementation calls stock API + pass + +def get_weather_forecast(city: str, days: int = 3, units: str = "celsius") -> dict: + """Get the weather forecast for a city. 
+ + Returns temperature, precipitation, and conditions for the specified number of days. + """ + pass + +def search_flights(origin: str, destination: str, date: str, passengers: int = 1) -> list: + """Search for available flights between two airports on a given date. + + Date format: YYYY-MM-DD. Returns list of flight options with prices. + """ + pass +``` + +**Running the generator:** + +```bash +planoai generate_prompt_targets --file api.py +``` + +**Generated output (add to your config.yaml):** + +```yaml +prompt_targets: + - name: get_stock_quote + description: Get the current stock price and trading data for a given stock symbol. + parameters: + - name: symbol + type: str + required: true + - name: exchange + type: str + required: false + default: NYSE + # Add endpoint manually: + endpoint: + name: stock_api + path: /quote?symbol={symbol}&exchange={exchange} + + - name: get_weather_forecast + description: Get the weather forecast for a city. + parameters: + - name: city + type: str + required: true + - name: days + type: int + required: false + default: 3 + - name: units + type: str + required: false + default: celsius + endpoint: + name: weather_api + path: /forecast?city={city}&days={days}&units={units} +``` + +After generation, manually add the `endpoint` blocks pointing to your actual API. The generator produces the schema; you wire in the connectivity. 
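
To make the introspection step concrete, here is a minimal sketch of how a signature and docstring can be turned into a `prompt_targets`-style entry with Python's standard `inspect` module. It is not the actual `planoai generate_prompt_targets` implementation: `describe_function` is a hypothetical helper, and both the fallback to `str` for unannotated parameters and the use of the first docstring line as the description are assumptions made for illustration.

```python
import inspect


def describe_function(func) -> dict:
    """Build a prompt_targets-style entry from a signature and docstring."""
    doc = inspect.getdoc(func) or ""
    entry = {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",  # first docstring line
        "parameters": [],
    }
    for name, param in inspect.signature(func).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        if param.annotation is inspect.Parameter.empty:
            type_name = "str"  # assumption: untyped parameters are treated as strings
        else:
            type_name = getattr(param.annotation, "__name__", str(param.annotation))
        spec = {"name": name, "type": type_name, "required": not has_default}
        if has_default:
            spec["default"] = param.default
        entry["parameters"].append(spec)
    return entry


def get_weather_forecast(city: str, days: int = 3, units: str = "celsius") -> dict:
    """Get the weather forecast for a city.

    Returns temperature, precipitation, and conditions for the specified number of days.
    """
    return {}


entry = describe_function(get_weather_forecast)
print(entry["parameters"])
```

Parameters without a default come out as `required: True`, and defaults are carried over as-is, which mirrors the shape of the generated YAML shown earlier. Nothing in a function signature identifies an HTTP endpoint, so that part cannot be derived this way.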
+ +Reference: https://github.com/katanemo/archgw + +--- + +### 6.3 Use `planoai cli_agent` to Connect Claude Code Through Plano + +**Impact:** `MEDIUM-HIGH` — Running Claude Code directly against provider APIs bypasses Plano's routing, observability, and guardrails — cli_agent routes all Claude Code traffic through your configured Plano instance +**Tags:** `cli`, `cli-agent`, `claude`, `coding-agent`, `integration` + +## Use `planoai cli_agent` to Connect Claude Code Through Plano + +`planoai cli_agent` starts a Claude Code session that routes all LLM traffic through your running Plano instance instead of directly to Anthropic. This gives you routing preferences, model aliases, tracing, and guardrails for your coding agent workflows — making Claude Code a first-class citizen of your Plano configuration. + +**Prerequisites:** + +```bash +# 1. Plano must be running with a model listener +planoai up config.yaml + +# 2. ANTHROPIC_API_KEY must be set (Claude Code uses it for auth) +export ANTHROPIC_API_KEY=sk-ant-... +``` + +**Starting the CLI agent:** + +```bash +# Start CLI agent using config.yaml in current directory +planoai cli_agent claude + +# Use a specific config file +planoai cli_agent claude config.yaml + +# Use a config in a different directory +planoai cli_agent claude --path /path/to/project +``` + +**Recommended config for Claude Code routing:** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: claude_code_router + port: 12000 + +model_providers: + - model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + default: true + routing_preferences: + - name: general coding + description: > + Writing code, debugging, code review, explaining concepts, + answering programming questions, general development tasks. 
+ + - model: anthropic/claude-opus-4-6 + access_key: $ANTHROPIC_API_KEY + routing_preferences: + - name: complex architecture + description: > + System design, complex refactoring across many files, + architectural decisions, performance optimization, security audits. + +model_aliases: + claude.fast.v1: + target: claude-sonnet-4-20250514 + claude.smart.v1: + target: claude-opus-4-6 + +tracing: + random_sampling: 100 + trace_arch_internal: true + +overrides: + upstream_connect_timeout: "10s" +``` + +**What happens when cli_agent runs:** + +1. Reads your config.yaml to find the model listener port +2. Configures Claude Code to use `http://localhost:` as its API endpoint +3. Starts a Claude Code session in your terminal +4. All Claude Code LLM calls flow through Plano — routing, tracing, and guardrails apply + +After your session, use `planoai trace` to inspect every LLM call Claude Code made, which model was selected, and why. + +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) + +--- + +### 6.4 Use `planoai init` Templates to Bootstrap New Projects Correctly + +**Impact:** `MEDIUM` — Starting from a blank config.yaml leads to missing required fields and common structural mistakes — templates provide validated, idiomatic starting points +**Tags:** `cli`, `init`, `templates`, `getting-started`, `project-setup` + +## Use `planoai init` Templates to Bootstrap New Projects Correctly + +`planoai init` generates a valid `config.yaml` from built-in templates. Each template demonstrates a specific Plano capability with correct structure, realistic examples, and comments. Use this instead of writing config from scratch — it ensures you start with a valid, working configuration. 
+ +**Available templates:** + +| Template ID | What It Demonstrates | Best For | +|---|---|---| +| `sub_agent_orchestration` | Multi-agent routing with specialized sub-agents | Building agentic applications | +| `coding_agent_routing` | Routing preferences + model aliases for coding workflows | Claude Code and coding assistants | +| `preference_aware_routing` | Automatic LLM routing based on task type | Multi-model cost optimization | +| `filter_chain_guardrails` | Input guards, query rewrite, context builder | RAG + safety pipelines | +| `conversational_state_v1_responses` | Stateful conversations with memory | Chatbots, multi-turn assistants | + +**Usage:** + +```bash +# Initialize with a template +planoai init --template sub_agent_orchestration + +# Initialize coding agent routing setup +planoai init --template coding_agent_routing + +# Initialize a RAG with guardrails project +planoai init --template filter_chain_guardrails +``` + +**Typical project setup workflow:** + +```bash +# 1. Create project directory +mkdir my-plano-agent && cd my-plano-agent + +# 2. Bootstrap with the closest matching template +planoai init --template preference_aware_routing + +# 3. Edit config.yaml to add your specific models, agents, and API keys +# (keys are already using $VAR substitution — just set your env vars) + +# 4. Create .env file for local development +cat > .env << EOF +OPENAI_API_KEY=sk-proj-... +ANTHROPIC_API_KEY=sk-ant-... +EOF + +echo ".env" >> .gitignore + +# 5. Start Plano +planoai up + +# 6. Test your configuration +curl http://localhost:12000/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}' +``` + +Start with `preference_aware_routing` for most LLM gateway use cases and `sub_agent_orchestration` for multi-agent applications. Both can be combined after you understand each independently. 
+ +Reference: https://github.com/katanemo/archgw + +--- + +## Section 7: Deployment & Security + +*Docker deployment, environment variable management, health checks, and state storage for production.* + +### 7.1 Understand Plano's Docker Network Topology for Agent URL Configuration + +**Impact:** `HIGH` — Using `localhost` for agent URLs inside Docker always fails — Plano runs in a container and cannot reach host services via localhost +**Tags:** `deployment`, `docker`, `networking`, `agents`, `urls` + +## Understand Plano's Docker Network Topology for Agent URL Configuration + +Plano runs inside a Docker container managed by `planoai up`. Services running on your host machine (agent servers, filter servers, databases) are not accessible as `localhost` from inside the container. Use Docker's special hostname `host.docker.internal` to reach host services. + +**Docker network rules:** +- `localhost` / `127.0.0.1` inside the container → Plano's own container (not your host) +- `host.docker.internal` → Your host machine's loopback interface +- Container name or `docker network` hostname → Other Docker containers +- External domain / IP → Reachable if Docker has network access + +**Incorrect (using localhost — agent unreachable from inside container):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://localhost:8001 # Wrong: this is Plano's own container + + - id: flight_agent + url: http://127.0.0.1:8002 # Wrong: same issue + +filters: + - id: input_guards + url: http://localhost:10500 # Wrong: filter server unreachable +``` + +**Correct (using host.docker.internal for host-side services):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 # Correct: reaches host port 8001 + + - id: flight_agent + url: http://host.docker.internal:8002 # Correct: reaches host port 8002 + +filters: + - id: input_guards + url: http://host.docker.internal:10500 # Correct: reaches filter server on host + 
+endpoints: + internal_api: + endpoint: host.docker.internal # Correct for internal API on host + protocol: http +``` + +**Production deployment patterns:** + +```yaml +# Kubernetes / Docker Compose — use service names +agents: + - id: weather_agent + url: http://weather-service:8001 # Kubernetes service DNS + +# External cloud services — use full domain +agents: + - id: cloud_agent + url: https://my-agent.us-east-1.amazonaws.com/v1 + +# Custom TLS (self-signed or internal CA) +overrides: + upstream_tls_ca_path: /etc/ssl/certs/internal-ca.pem +``` + +**Ports exposed by Plano's container:** +- All `port` values from your `listeners` blocks are automatically mapped +- `9901` — Envoy admin interface (for advanced debugging) +- `12001` — Plano internal management API + +Reference: https://github.com/katanemo/archgw + +--- + +### 7.2 Use PostgreSQL State Storage for Multi-Turn Conversations in Production + +**Impact:** `HIGH` — The default in-memory state storage loses all conversation history when the container restarts — production multi-turn agents require persistent PostgreSQL storage +**Tags:** `deployment`, `state`, `postgres`, `memory`, `multi-turn`, `production` + +## Use PostgreSQL State Storage for Multi-Turn Conversations in Production + +`state_storage` enables Plano to maintain conversation context across requests. Without it, each request is stateless. The `memory` type works for development and testing — all state is lost on container restart. Use `postgres` for any production deployment where conversation continuity matters. + +**Incorrect (memory storage in production):** + +```yaml +version: v0.3.0 + +# Memory storage — all conversations lost on planoai down / container restart +state_storage: + type: memory + +listeners: + - type: agent + name: customer_support + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: support_agent + description: Customer support assistant with conversation history. 
+``` + +**Correct (PostgreSQL for production persistence):** + +```yaml +version: v0.3.0 + +state_storage: + type: postgres + connection_string: "postgresql://${DB_USER}:${DB_PASS}@${DB_HOST}:5432/${DB_NAME}" + +listeners: + - type: agent + name: customer_support + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: support_agent + description: Customer support assistant with access to full conversation history. + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true +``` + +**Setting up PostgreSQL for local development:** + +```bash +# Start PostgreSQL with Docker +docker run -d \ + --name plano-postgres \ + -e POSTGRES_USER=plano \ + -e POSTGRES_PASSWORD=devpassword \ + -e POSTGRES_DB=plano \ + -p 5432:5432 \ + postgres:16 + +# Set environment variables +export DB_USER=plano +export DB_PASS=devpassword +export DB_HOST=host.docker.internal # Use host.docker.internal from inside Plano container +export DB_NAME=plano +``` + +**Production `.env` pattern:** + +```bash +DB_USER=plano_prod +DB_PASS= +DB_HOST=your-rds-endpoint.amazonaws.com +DB_NAME=plano +``` + +Plano automatically creates its state tables on first startup. The `connection_string` supports all standard PostgreSQL connection parameters including SSL: `postgresql://user:pass@host:5432/db?sslmode=require`. + +Reference: https://github.com/katanemo/archgw + +--- + +### 7.3 Verify Listener Health Before Sending Requests + +**Impact:** `MEDIUM` — Sending requests to Plano before listeners are healthy results in connection refused errors that look like application bugs — always confirm health before testing +**Tags:** `deployment`, `health-checks`, `readiness`, `debugging` + +## Verify Listener Health Before Sending Requests + +Each Plano listener exposes a `/healthz` HTTP endpoint. 
`planoai up` automatically health-checks all listeners during startup (120s timeout), but in CI/CD pipelines, custom scripts, or when troubleshooting, you may need to check health manually. + +**Health check endpoints:** + +```bash +# Check model listener health (port from your config) +curl -f http://localhost:12000/healthz +# Returns 200 OK when healthy + +# Check prompt listener +curl -f http://localhost:10000/healthz + +# Check agent listener +curl -f http://localhost:8000/healthz +``` + +**Polling health in scripts (CI/CD pattern):** + +```bash +#!/bin/bash +# wait-for-plano.sh + +LISTENER_PORT=${1:-12000} +MAX_WAIT=120 +INTERVAL=2 +elapsed=0 + +echo "Waiting for Plano listener on port $LISTENER_PORT..." + +until curl -sf "http://localhost:$LISTENER_PORT/healthz" > /dev/null; do + if [ $elapsed -ge $MAX_WAIT ]; then + echo "ERROR: Plano listener not healthy after ${MAX_WAIT}s" + planoai logs --debug + exit 1 + fi + sleep $INTERVAL + elapsed=$((elapsed + INTERVAL)) +done + +echo "Plano listener healthy after ${elapsed}s" +``` + +**Docker Compose health check:** + +```yaml +# docker-compose.yml for services that depend on Plano +services: + plano: + image: katanemo/plano:latest + # Plano is managed by planoai, not directly via compose in most setups + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:12000/healthz"] + interval: 5s + timeout: 3s + retries: 24 + start_period: 10s + + my-agent: + image: my-agent:latest + depends_on: + plano: + condition: service_healthy +``` + +**Debug unhealthy listeners:** + +```bash +# See startup logs +planoai logs --debug + +# Check if port is already in use +lsof -i :12000 + +# Check container status +docker ps -a --filter name=plano + +# Restart from scratch +planoai down && planoai up config.yaml --foreground +``` + +Reference: https://github.com/katanemo/archgw + +--- + +## Section 8: Advanced Patterns + +*Prompt targets, external API integration, rate limiting, and multi-listener architectures.* + +### 8.1 
Combine Multiple Listener Types for Layered Agent Architectures + +**Impact:** `MEDIUM` — Using a single listener type forces all traffic through one gateway pattern — combining types lets you serve different clients with the right interface without running multiple Plano instances +**Tags:** `advanced`, `multi-listener`, `architecture`, `agent`, `model`, `prompt` + +## Combine Multiple Listener Types for Layered Agent Architectures + +A single Plano `config.yaml` can define multiple listeners of different types, each on a separate port. This lets you serve different client types simultaneously: an OpenAI-compatible model gateway for direct API clients, a prompt gateway for LLM-callable function applications, and an agent orchestrator for multi-agent workflows — all from one Plano instance sharing the same model providers. + +**Single listener (limited — forces all clients through one interface):** + +```yaml +version: v0.3.0 + +listeners: + - type: model # Only model clients can use this + name: model_gateway + port: 12000 + +# Prompt target clients and agent clients cannot connect +``` + +**Multi-listener architecture (serves all client types):** + +```yaml +version: v0.3.0 + +# --- Shared model providers --- +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true + routing_preferences: + - name: quick tasks + description: Short answers, formatting, classification, simple generation + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: complex reasoning + description: Multi-step analysis, code generation, research synthesis + + - model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + routing_preferences: + - name: long documents + description: Summarizing or analyzing very long documents, PDFs, transcripts + +# --- Listener 1: OpenAI-compatible API gateway --- +# For: SDK clients, Claude Code, LangChain, etc. 
+listeners: + - type: model + name: model_gateway + port: 12000 + timeout: "120s" + +# --- Listener 2: Prompt function gateway --- +# For: Applications that expose LLM-callable APIs + - type: prompt + name: function_gateway + port: 10000 + timeout: "60s" + +# --- Listener 3: Agent orchestration gateway --- +# For: Multi-agent application clients + - type: agent + name: agent_orchestrator + port: 8000 + timeout: "90s" + router: plano_orchestrator_v1 + agents: + - id: research_agent + description: Searches, synthesizes, and summarizes information from multiple sources. + filter_chain: + - input_guards + - context_builder + - id: code_agent + description: Writes, reviews, debugs, and explains code across all languages. + default: true + +# --- Agents --- +agents: + - id: research_agent + url: http://host.docker.internal:8001 + - id: code_agent + url: http://host.docker.internal:8002 + +# --- Filters --- +filters: + - id: input_guards + url: http://host.docker.internal:10500 + type: mcp + transport: streamable-http + - id: context_builder + url: http://host.docker.internal:10501 + type: mcp + transport: streamable-http + +# --- Prompt targets (for function gateway) --- +endpoints: + internal_api: + endpoint: host.docker.internal + protocol: http + +prompt_targets: + - name: search_knowledge_base + description: Search the internal knowledge base for relevant documents and facts. 
+ parameters: + - name: query + type: str + required: true + description: Search query to find relevant information + endpoint: + name: internal_api + path: /kb/search?q={query} + http_method: GET + +# --- Model aliases --- +model_aliases: + plano.fast.v1: + target: gpt-4o-mini + plano.smart.v1: + target: gpt-4o + +# --- Observability --- +tracing: + random_sampling: 50 + trace_arch_internal: true + span_attributes: + static: + environment: production + header_prefixes: + - x-katanemo- +``` + +This architecture serves SDK clients on `:12000`, function-calling apps on `:10000`, and multi-agent orchestration on `:8000`, with all three listeners sharing the same model providers and routing preferences. + +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) + +--- + +### 8.2 Design Prompt Targets with Precise Parameter Schemas + +**Impact:** `HIGH` — Imprecise parameter definitions cause the LLM to hallucinate values, skip required fields, or produce malformed API calls — the schema is the contract between the LLM and your API +**Tags:** `advanced`, `prompt-targets`, `functions`, `llm`, `api-integration` + +## Design Prompt Targets with Precise Parameter Schemas + +`prompt_targets` define functions that Plano's LLM can call autonomously when it determines a user request matches the function's description. The parameter schema tells the LLM exactly what values to extract from user input — vague schemas lead to hallucinated parameters and failed API calls. + +**Incorrect (too few constraints — LLM must guess):** + +```yaml +prompt_targets: + - name: get_flight_info + description: Get flight information + parameters: + - name: flight # What format? "AA123"? "AA 123"? "American 123"?
+ type: str + required: true + endpoint: + name: flights_api + path: /flight?id={flight} +``` + +**Correct (fully specified schema with descriptions, formats, and enums):** + +```yaml +version: v0.3.0 + +endpoints: + flights_api: + endpoint: api.flightaware.com + protocol: https + connect_timeout: "5s" + +prompt_targets: + - name: get_flight_status + description: > + Get real-time status, gate information, and delays for a specific flight number. + Use when the user asks about a flight's current status, arrival time, or gate. + parameters: + - name: flight_number + description: > + IATA airline code followed by flight number, e.g., "AA123", "UA456", "DL789". + Extract from user message — do not include spaces. + type: str + required: true + format: "^[A-Z]{2}[0-9]{1,4}$" # Regex hint for validation + + - name: date + description: > + Flight date in YYYY-MM-DD format. Use today's date if not specified. + type: str + required: false + format: date + + endpoint: + name: flights_api + path: /flights/{flight_number}?date={date} + http_method: GET + http_headers: + Authorization: "Bearer $FLIGHTAWARE_API_KEY" + + - name: search_flights + description: > + Search for available flights between two cities or airports. + Use when the user wants to find flights, compare options, or book travel. 
+ parameters: + - name: origin + description: Departure airport IATA code (e.g., "JFK", "LAX", "ORD") + type: str + required: true + - name: destination + description: Arrival airport IATA code (e.g., "LHR", "CDG", "NRT") + type: str + required: true + - name: departure_date + description: Departure date in YYYY-MM-DD format + type: str + required: true + format: date + - name: cabin_class + description: Preferred cabin class + type: str + required: false + default: economy + enum: [economy, premium_economy, business, first] + - name: passengers + description: Number of adult passengers (1-9) + type: int + required: false + default: 1 + + endpoint: + name: flights_api + path: /search?from={origin}&to={destination}&date={departure_date}&class={cabin_class}&pax={passengers} + http_method: GET + http_headers: + Authorization: "Bearer $FLIGHTAWARE_API_KEY" + + system_prompt: | + You are a travel assistant. Present flight search results clearly, + highlighting the best value options. Include price, duration, and + number of stops for each option. 
+ +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + +listeners: + - type: prompt + name: travel_functions + port: 10000 + timeout: "30s" +``` + +**Key principles:** +- `description` on the target tells the LLM when to call it — be specific about trigger conditions +- `description` on each parameter tells the LLM what value to extract — include format examples +- Use `enum` to constrain categorical values — prevents the LLM from inventing categories +- Use `format: date` or regex patterns to hint at expected format +- Use `default` for optional parameters so the API never receives null values +- `system_prompt` on the target customizes how the LLM formats the API response to the user + +Reference: https://github.com/katanemo/archgw + +--- + +*Generated from individual rule files in `rules/`.* +*To contribute, see [CONTRIBUTING](https://github.com/katanemo/archgw/blob/main/CONTRIBUTING.md).* diff --git a/skills/README.md b/skills/README.md new file mode 100644 index 000000000..d941fb931 --- /dev/null +++ b/skills/README.md @@ -0,0 +1,243 @@ +# Plano Agent Skills + +A structured repository of best practices for building agents and agentic applications with [Plano](https://github.com/katanemo/archgw) — the AI-native proxy and dataplane. Optimized for coding agents and LLMs. + +## What Are Skills? + +Skills are principle-based guides that help coding agents (Claude Code, Cursor, Copilot, etc.) make better decisions when working with Plano. They cover configuration patterns, routing strategies, agent orchestration, observability, and CLI workflows — acting as operating principles, not documentation replacements. + +## Installing + +```bash +# Install via npx skills +npx skills add katanemo/plano +``` + +This skills collection is published from the `skills/` directory in the `katanemo/plano` monorepo. 
+ +Install a specific skill: + +```bash +npx skills add katanemo/plano --skill plano-routing-model-selection +``` + +List available skills before installing: + +```bash +npx skills add katanemo/plano --list +``` + +## Using Skills in Agents + +After installation, these skills are available to your coding agent and can be invoked in plain natural language. You do not need special syntax unless your tooling requires it. + +### Natural Language Invocation Examples + +- "Use the Plano skills to validate this `config.yaml` and fix issues." +- "Apply Plano routing best practices to improve model/provider selection." +- "Review this agent listener config with the orchestration rules." +- "Refactor this filter chain to follow guardrail ordering best practices." +- "Audit this setup against Plano deployment and security recommendations." + +### Prompting Tips for Better Results + +- Name your goal and file: "Harden `config.yaml` for production." +- Ask for an action: "Generate a patch," "fix directly," or "explain the changes." +- Include runtime context when relevant: trace output, logs, listener errors. +- Ask for verification: "Run a final validation check after edits." + +### Invoke by Skill Area (Optional) + +- **Configuration:** "Use Plano configuration fundamentals on this config." +- **Routing:** "Use routing/model-selection skills to tune defaults and aliases." +- **Agent orchestration:** "Use agent orchestration skills to improve routing accuracy." +- **Filters/guardrails:** "Use filter-chain skills to harden input/output safety." +- **Observability:** "Use observability skills to add traceability and debug routing." +- **CLI/deployment:** "Use CLI and deployment skills to produce a startup checklist."
+ +## Available Skills + +- `plano-agent-skills` - Umbrella skill covering all Plano areas +- `plano-config-fundamentals` - Config versioning, listeners, providers, secrets +- `plano-routing-model-selection` - Defaults, aliases, passthrough auth, preferences +- `plano-agent-orchestration` - Agent registration and routing descriptions +- `plano-filter-guardrails` - MCP filters, guardrail messaging, filter ordering +- `plano-observability-debugging` - Tracing setup, span attributes, trace analysis +- `plano-cli-operations` - `planoai up`, `cli_agent`, init, prompt target generation +- `plano-deployment-security` - Docker networking, health checks, state storage +- `plano-advanced-patterns` - Multi-listener architecture and prompt target schema design + +## Local Testing + +```bash +# From repo root +npx skills add ./skills --list +npx skills add ./skills --skill plano-agent-skills -y +npx skills list +``` + +## Structure + +``` +skills/ +├── rules/ # Individual rule files (one per rule) +│ ├── _sections.md # Section metadata and prefix definitions +│ ├── _template.md # Template for creating new rules +│ ├── config-*.md # Section 1: Configuration Fundamentals +│ ├── routing-*.md # Section 2: Routing & Model Selection +│ ├── agent-*.md # Section 3: Agent Orchestration +│ ├── filter-*.md # Section 4: Filter Chains & Guardrails +│ ├── observe-*.md # Section 5: Observability & Debugging +│ ├── cli-*.md # Section 6: CLI Operations +│ ├── deploy-*.md # Section 7: Deployment & Security +│ └── advanced-*.md # Section 8: Advanced Patterns +├── src/ +│ ├── build.ts # Compiles rules/ into AGENTS.md +│ ├── validate.ts # Validates rule files +│ └── extract-tests.ts # Extracts test cases for LLM evaluation +├── metadata.json # Document metadata +├── AGENTS.md # Compiled output (generated — do not edit directly) +├── test-cases.json # Test cases for LLM evaluation (generated) +└── package.json +``` + +## Sections + +| # | Prefix | Section | Rules | +|---|--------|---------|-------| 
+| 1 | `config-` | Configuration Fundamentals | Version, listeners, providers, secrets, timeouts | +| 2 | `routing-` | Routing & Model Selection | Preferences, aliases, defaults, passthrough | +| 3 | `agent-` | Agent Orchestration | Descriptions, agent registration | +| 4 | `filter-` | Filter Chains & Guardrails | Ordering, MCP integration, guardrails | +| 5 | `observe-` | Observability & Debugging | Tracing, trace inspection, span attributes | +| 6 | `cli-` | CLI Operations | Startup, CLI agent, init, code generation | +| 7 | `deploy-` | Deployment & Security | Docker networking, state storage, health checks | +| 8 | `advanced-` | Advanced Patterns | Prompt targets, rate limits, multi-listener | + +## Getting Started + +```bash +# Install dependencies +npm install + +# Validate all rule files +npm run validate + +# Build AGENTS.md from rules +npm run build + +# Extract test cases for LLM evaluation +npm run extract-tests + +# Typecheck, then run all of the above +npm run dev +``` + +## Creating a New Rule + +1. Copy `rules/_template.md` to `rules/{prefix}-{rule-name}.md` + +2. Choose the correct prefix for your section: + - `config-` — Configuration Fundamentals + - `routing-` — Routing & Model Selection + - `agent-` — Agent Orchestration + - `filter-` — Filter Chains & Guardrails + - `observe-` — Observability & Debugging + - `cli-` — CLI Operations + - `deploy-` — Deployment & Security + - `advanced-` — Advanced Patterns + +3. Fill in the frontmatter: + ```yaml + --- + title: Clear, Actionable Rule Title + impact: HIGH + impactDescription: One-line description of why this matters + tags: config, routing, relevant-tags + --- + ``` + +4. Write the rule body with: + - Brief explanation of the principle and why it matters + - **Incorrect** example (YAML config or CLI command showing the wrong pattern) + - **Correct** example (the right pattern with comments) + - Optional explanatory notes + +5.
Run `npm run dev` to validate and regenerate + +## Rule File Structure + +```markdown +--- +title: Rule Title Here +impact: CRITICAL +impactDescription: One sentence on the impact +tags: tag1, tag2, tag3 +--- + +## Rule Title Here + +Brief explanation of the rule and why it matters for Plano developers. + +**Incorrect (describe what's wrong):** + +```yaml +# Bad example +``` + +**Correct (describe what's right):** + +```yaml +# Good example with comments explaining the decisions +``` + +Optional explanatory text, lists, or tables. + +Reference: https://github.com/katanemo/archgw + + + +## Impact Levels + +| Level | Description | +|-------|-------------| +| `CRITICAL` | Causes startup failures or silent misbehavior — always fix | +| `HIGH` | Significantly degrades routing accuracy, security, or reliability | +| `MEDIUM-HIGH` | Important for production deployments | +| `MEDIUM` | Best practice for maintainability and developer experience | +| `LOW-MEDIUM` | Incremental improvements | +| `LOW` | Nice to have | + +## Key Rules at a Glance + +- **Always set `version: v0.3.0`** — config is rejected without it +- **Use `host.docker.internal`** for agent/filter URLs — `localhost` doesn't work inside Docker +- **Set exactly one `default: true` provider** — unmatched requests need a fallback +- **Write specific routing preference descriptions** — vague descriptions cause misroutes +- **Order filter chains: guards → rewriters → context builders** — never build context before blocking bad input +- **Use `$VAR_NAME` for all secrets** — never hardcode API keys in config.yaml +- **Enable tracing with `--with-tracing`** — traces are the primary debugging tool + +## Scripts + +| Command | Description | +|---------|-------------| +| `npm run build` | Compile `rules/` into `AGENTS.md` | +| `npm run validate` | Validate all rule files for required fields and structure | +| `npm run extract-tests` | Generate `test-cases.json` for LLM evaluation | +| `npm run dev` | Validate + build + 
extract tests | + +## Contributing + +Rules are automatically sorted alphabetically by title within each section — no need to manage numbers. IDs (`1.1`, `1.2`, etc.) are assigned during build. + +When adding rules: +1. Use the correct filename prefix for your section +2. Follow `_template.md` structure +3. Include clear bad/good YAML or CLI examples +4. Add relevant tags +5. Run `npm run dev` to validate and regenerate + +## License + +Apache-2.0 — see [LICENSE](../LICENSE) diff --git a/skills/metadata.json b/skills/metadata.json new file mode 100644 index 000000000..f1f754abc --- /dev/null +++ b/skills/metadata.json @@ -0,0 +1,8 @@ +{ + "version": "1.0.0", + "organization": "Plano", + "name": "plano-agent-skills", + "abstract": "Best practices for building agents and agentic applications with Plano — the AI-native proxy and dataplane. Covers configuration, routing, agent orchestration, filter chains, observability, CLI operations, and deployment patterns.", + "homepage": "https://github.com/katanemo/archgw", + "license": "Apache-2.0" +} diff --git a/skills/package-lock.json b/skills/package-lock.json new file mode 100644 index 000000000..080a8c7ff --- /dev/null +++ b/skills/package-lock.json @@ -0,0 +1,594 @@ +{ + "name": "plano-agent-skills", + "version": "1.0.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "plano-agent-skills", + "version": "1.0.0", + "license": "Apache-2.0", + "devDependencies": { + "@types/node": "^24.3.0", + "tsx": "^4.20.5", + "typescript": "^5.9.2" + }, + "engines": { + "node": ">=18.0.0" + } + }, + "node_modules/@esbuild/aix-ppc64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.27.3.tgz", + "integrity": "sha512-9fJMTNFTWZMh5qwrBItuziu834eOCUcEqymSH7pY+zoMVEZg3gcPuBNxH1EvfVYe9h0x/Ptw8KBzv7qxb7l8dg==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "aix" + ], + "engines": { + "node": ">=18" + } + }, + 
"node_modules/@esbuild/android-arm": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.27.3.tgz", + "integrity": "sha512-i5D1hPY7GIQmXlXhs2w8AWHhenb00+GxjxRncS2ZM7YNVGNfaMxgzSGuO8o8SJzRc/oZwU2bcScvVERk03QhzA==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/android-arm64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.27.3.tgz", + "integrity": "sha512-YdghPYUmj/FX2SYKJ0OZxf+iaKgMsKHVPF1MAq/P8WirnSpCStzKJFjOjzsW0QQ7oIAiccHdcqjbHmJxRb/dmg==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/android-x64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.27.3.tgz", + "integrity": "sha512-IN/0BNTkHtk8lkOM8JWAYFg4ORxBkZQf9zXiEOfERX/CzxW3Vg1ewAhU7QSWQpVIzTW+b8Xy+lGzdYXV6UZObQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/darwin-arm64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.27.3.tgz", + "integrity": "sha512-Re491k7ByTVRy0t3EKWajdLIr0gz2kKKfzafkth4Q8A5n1xTHrkqZgLLjFEHVD+AXdUGgQMq+Godfq45mGpCKg==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/darwin-x64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.27.3.tgz", + "integrity": "sha512-vHk/hA7/1AckjGzRqi6wbo+jaShzRowYip6rt6q7VYEDX4LEy1pZfDpdxCBnGtl+A5zq8iXDcyuxwtv3hNtHFg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + 
"optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/freebsd-arm64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.27.3.tgz", + "integrity": "sha512-ipTYM2fjt3kQAYOvo6vcxJx3nBYAzPjgTCk7QEgZG8AUO3ydUhvelmhrbOheMnGOlaSFUoHXB6un+A7q4ygY9w==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/freebsd-x64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.27.3.tgz", + "integrity": "sha512-dDk0X87T7mI6U3K9VjWtHOXqwAMJBNN2r7bejDsc+j03SEjtD9HrOl8gVFByeM0aJksoUuUVU9TBaZa2rgj0oA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-arm": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.27.3.tgz", + "integrity": "sha512-s6nPv2QkSupJwLYyfS+gwdirm0ukyTFNl3KTgZEAiJDd+iHZcbTPPcWCcRYH+WlNbwChgH2QkE9NSlNrMT8Gfw==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-arm64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.27.3.tgz", + "integrity": "sha512-sZOuFz/xWnZ4KH3YfFrKCf1WyPZHakVzTiqji3WDc0BCl2kBwiJLCXpzLzUBLgmp4veFZdvN5ChW4Eq/8Fc2Fg==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-ia32": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.27.3.tgz", + "integrity": 
"sha512-yGlQYjdxtLdh0a3jHjuwOrxQjOZYD/C9PfdbgJJF3TIZWnm/tMd/RcNiLngiu4iwcBAOezdnSLAwQDPqTmtTYg==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-loong64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.27.3.tgz", + "integrity": "sha512-WO60Sn8ly3gtzhyjATDgieJNet/KqsDlX5nRC5Y3oTFcS1l0KWba+SEa9Ja1GfDqSF1z6hif/SkpQJbL63cgOA==", + "cpu": [ + "loong64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-mips64el": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.27.3.tgz", + "integrity": "sha512-APsymYA6sGcZ4pD6k+UxbDjOFSvPWyZhjaiPyl/f79xKxwTnrn5QUnXR5prvetuaSMsb4jgeHewIDCIWljrSxw==", + "cpu": [ + "mips64el" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-ppc64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.27.3.tgz", + "integrity": "sha512-eizBnTeBefojtDb9nSh4vvVQ3V9Qf9Df01PfawPcRzJH4gFSgrObw+LveUyDoKU3kxi5+9RJTCWlj4FjYXVPEA==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-riscv64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.27.3.tgz", + "integrity": "sha512-3Emwh0r5wmfm3ssTWRQSyVhbOHvqegUDRd0WhmXKX2mkHJe1SFCMJhagUleMq+Uci34wLSipf8Lagt4LlpRFWQ==", + "cpu": [ + "riscv64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-s390x": { + "version": "0.27.3", + 
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.27.3.tgz", + "integrity": "sha512-pBHUx9LzXWBc7MFIEEL0yD/ZVtNgLytvx60gES28GcWMqil8ElCYR4kvbV2BDqsHOvVDRrOxGySBM9Fcv744hw==", + "cpu": [ + "s390x" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-x64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.27.3.tgz", + "integrity": "sha512-Czi8yzXUWIQYAtL/2y6vogER8pvcsOsk5cpwL4Gk5nJqH5UZiVByIY8Eorm5R13gq+DQKYg0+JyQoytLQas4dA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/netbsd-arm64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.27.3.tgz", + "integrity": "sha512-sDpk0RgmTCR/5HguIZa9n9u+HVKf40fbEUt+iTzSnCaGvY9kFP0YKBWZtJaraonFnqef5SlJ8/TiPAxzyS+UoA==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "netbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/netbsd-x64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.27.3.tgz", + "integrity": "sha512-P14lFKJl/DdaE00LItAukUdZO5iqNH7+PjoBm+fLQjtxfcfFE20Xf5CrLsmZdq5LFFZzb5JMZ9grUwvtVYzjiA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "netbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/openbsd-arm64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.27.3.tgz", + "integrity": "sha512-AIcMP77AvirGbRl/UZFTq5hjXK+2wC7qFRGoHSDrZ5v5b8DK/GYpXW3CPRL53NkvDqb9D+alBiC/dV0Fb7eJcw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ], + "engines": { + "node": ">=18" 
+ } + }, + "node_modules/@esbuild/openbsd-x64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.27.3.tgz", + "integrity": "sha512-DnW2sRrBzA+YnE70LKqnM3P+z8vehfJWHXECbwBmH/CU51z6FiqTQTHFenPlHmo3a8UgpLyH3PT+87OViOh1AQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/openharmony-arm64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/openharmony-arm64/-/openharmony-arm64-0.27.3.tgz", + "integrity": "sha512-NinAEgr/etERPTsZJ7aEZQvvg/A6IsZG/LgZy+81wON2huV7SrK3e63dU0XhyZP4RKGyTm7aOgmQk0bGp0fy2g==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openharmony" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/sunos-x64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.27.3.tgz", + "integrity": "sha512-PanZ+nEz+eWoBJ8/f8HKxTTD172SKwdXebZ0ndd953gt1HRBbhMsaNqjTyYLGLPdoWHy4zLU7bDVJztF5f3BHA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "sunos" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/win32-arm64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.27.3.tgz", + "integrity": "sha512-B2t59lWWYrbRDw/tjiWOuzSsFh1Y/E95ofKz7rIVYSQkUYBjfSgf6oeYPNWHToFRr2zx52JKApIcAS/D5TUBnA==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/win32-ia32": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.27.3.tgz", + "integrity": "sha512-QLKSFeXNS8+tHW7tZpMtjlNb7HKau0QDpwm49u0vUp9y1WOF+PEzkU84y9GqYaAVW8aH8f3GcBck26jh54cX4Q==", + "cpu": [ + "ia32" + ], + "dev": true, + 
"license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/win32-x64": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.27.3.tgz", + "integrity": "sha512-4uJGhsxuptu3OcpVAzli+/gWusVGwZZHTlS63hh++ehExkVT8SgiEf7/uC/PclrPPkLhZqGgCTjd0VWLo6xMqA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@types/node": { + "version": "24.11.0", + "resolved": "https://registry.npmjs.org/@types/node/-/node-24.11.0.tgz", + "integrity": "sha512-fPxQqz4VTgPI/IQ+lj9r0h+fDR66bzoeMGHp8ASee+32OSGIkeASsoZuJixsQoVef1QJbeubcPBxKk22QVoWdw==", + "dev": true, + "license": "MIT", + "dependencies": { + "undici-types": "~7.16.0" + } + }, + "node_modules/esbuild": { + "version": "0.27.3", + "resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.27.3.tgz", + "integrity": "sha512-8VwMnyGCONIs6cWue2IdpHxHnAjzxnw2Zr7MkVxB2vjmQ2ivqGFb4LEG3SMnv0Gb2F/G/2yA8zUaiL1gywDCCg==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "bin": { + "esbuild": "bin/esbuild" + }, + "engines": { + "node": ">=18" + }, + "optionalDependencies": { + "@esbuild/aix-ppc64": "0.27.3", + "@esbuild/android-arm": "0.27.3", + "@esbuild/android-arm64": "0.27.3", + "@esbuild/android-x64": "0.27.3", + "@esbuild/darwin-arm64": "0.27.3", + "@esbuild/darwin-x64": "0.27.3", + "@esbuild/freebsd-arm64": "0.27.3", + "@esbuild/freebsd-x64": "0.27.3", + "@esbuild/linux-arm": "0.27.3", + "@esbuild/linux-arm64": "0.27.3", + "@esbuild/linux-ia32": "0.27.3", + "@esbuild/linux-loong64": "0.27.3", + "@esbuild/linux-mips64el": "0.27.3", + "@esbuild/linux-ppc64": "0.27.3", + "@esbuild/linux-riscv64": "0.27.3", + "@esbuild/linux-s390x": "0.27.3", + "@esbuild/linux-x64": "0.27.3", + "@esbuild/netbsd-arm64": "0.27.3", + "@esbuild/netbsd-x64": "0.27.3", + "@esbuild/openbsd-arm64": "0.27.3", + 
"@esbuild/openbsd-x64": "0.27.3", + "@esbuild/openharmony-arm64": "0.27.3", + "@esbuild/sunos-x64": "0.27.3", + "@esbuild/win32-arm64": "0.27.3", + "@esbuild/win32-ia32": "0.27.3", + "@esbuild/win32-x64": "0.27.3" + } + }, + "node_modules/fsevents": { + "version": "2.3.3", + "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.3.tgz", + "integrity": "sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": "^8.16.0 || ^10.6.0 || >=11.0.0" + } + }, + "node_modules/get-tsconfig": { + "version": "4.13.6", + "resolved": "https://registry.npmjs.org/get-tsconfig/-/get-tsconfig-4.13.6.tgz", + "integrity": "sha512-shZT/QMiSHc/YBLxxOkMtgSid5HFoauqCE3/exfsEcwg1WkeqjG+V40yBbBrsD+jW2HDXcs28xOfcbm2jI8Ddw==", + "dev": true, + "license": "MIT", + "dependencies": { + "resolve-pkg-maps": "^1.0.0" + }, + "funding": { + "url": "https://github.com/privatenumber/get-tsconfig?sponsor=1" + } + }, + "node_modules/resolve-pkg-maps": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/resolve-pkg-maps/-/resolve-pkg-maps-1.0.0.tgz", + "integrity": "sha512-seS2Tj26TBVOC2NIc2rOe2y2ZO7efxITtLZcGSOnHHNOQ7CkiUBfw0Iw2ck6xkIhPwLhKNLS8BO+hEpngQlqzw==", + "dev": true, + "license": "MIT", + "funding": { + "url": "https://github.com/privatenumber/resolve-pkg-maps?sponsor=1" + } + }, + "node_modules/tsx": { + "version": "4.21.0", + "resolved": "https://registry.npmjs.org/tsx/-/tsx-4.21.0.tgz", + "integrity": "sha512-5C1sg4USs1lfG0GFb2RLXsdpXqBSEhAaA/0kPL01wxzpMqLILNxIxIOKiILz+cdg/pLnOUxFYOR5yhHU666wbw==", + "dev": true, + "license": "MIT", + "dependencies": { + "esbuild": "~0.27.0", + "get-tsconfig": "^4.7.5" + }, + "bin": { + "tsx": "dist/cli.mjs" + }, + "engines": { + "node": ">=18.0.0" + }, + "optionalDependencies": { + "fsevents": "~2.3.3" + } + }, + "node_modules/typescript": { + "version": "5.9.3", + 
"resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz", + "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==", + "dev": true, + "license": "Apache-2.0", + "bin": { + "tsc": "bin/tsc", + "tsserver": "bin/tsserver" + }, + "engines": { + "node": ">=14.17" + } + }, + "node_modules/undici-types": { + "version": "7.16.0", + "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.16.0.tgz", + "integrity": "sha512-Zz+aZWSj8LE6zoxD+xrjh4VfkIG8Ya6LvYkZqtUQGJPZjYl53ypCaUwWqo7eI0x66KBGeRo+mlBEkMSeSZ38Nw==", + "dev": true, + "license": "MIT" + } + } +} diff --git a/skills/package.json b/skills/package.json new file mode 100644 index 000000000..eb33002fd --- /dev/null +++ b/skills/package.json @@ -0,0 +1,31 @@ +{ + "name": "plano-agent-skills", + "version": "1.0.0", + "description": "Best practices for building agents and agentic applications with Plano — installable via npx skills add", + "type": "module", + "scripts": { + "typecheck": "tsc --noEmit", + "build": "tsx src/build.ts", + "validate": "tsx src/validate.ts", + "extract-tests": "tsx src/extract-tests.ts", + "dev": "npm run typecheck && npm run validate && npm run build && npm run extract-tests" + }, + "keywords": [ + "plano", + "archgw", + "ai-gateway", + "agent", + "llm", + "skills", + "best-practices" + ], + "license": "Apache-2.0", + "engines": { + "node": ">=18.0.0" + }, + "devDependencies": { + "@types/node": "^24.3.0", + "tsx": "^4.20.5", + "typescript": "^5.9.2" + } +} diff --git a/skills/plano-advanced-patterns/SKILL.md b/skills/plano-advanced-patterns/SKILL.md new file mode 100644 index 000000000..7e2f1b007 --- /dev/null +++ b/skills/plano-advanced-patterns/SKILL.md @@ -0,0 +1,32 @@ +--- +name: plano-advanced-patterns +description: Design advanced Plano architectures. Use for multi-listener systems, prompt target schema quality, and layered orchestration patterns. 
+license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano Advanced Patterns + +Use this skill for higher-order architecture decisions once fundamentals are stable. + +## When To Use + +- "Design a multi-listener Plano architecture" +- "Improve prompt target schema precision" +- "Combine model, prompt, and agent listeners" +- "Refine advanced routing/function-calling behavior" + +## Apply These Rules + +- `advanced-multi-listener` +- `advanced-prompt-targets` + +## Execution Checklist + +1. Use multiple listeners only when interfaces are truly distinct. +2. Keep provider/routing definitions shared and consistent. +3. Define prompt target parameters with strict, explicit schemas. +4. Minimize ambiguity that causes malformed tool calls. +5. Provide migration-safe recommendations and test scenarios. diff --git a/skills/plano-agent-orchestration/SKILL.md b/skills/plano-agent-orchestration/SKILL.md new file mode 100644 index 000000000..90f25bebe --- /dev/null +++ b/skills/plano-agent-orchestration/SKILL.md @@ -0,0 +1,32 @@ +--- +name: plano-agent-orchestration +description: Improve multi-agent orchestration in Plano. Use for agent registration, agent listener wiring, and capability-focused agent descriptions for accurate routing. +license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano Agent Orchestration + +Use this skill for agent listener quality, sub-agent registration, and route accuracy. + +## When To Use + +- "Fix multi-agent routing" +- "Validate agents vs listeners.agents config" +- "Improve agent descriptions" +- "Set up a reliable orchestrator" + +## Apply These Rules + +- `agent-orchestration` +- `agent-descriptions` + +## Execution Checklist + +1. Verify each agent exists in both `agents` and `listeners[].agents`. +2. Ensure one fallback/default agent where appropriate. +3. Rewrite descriptions to be capability-focused and non-overlapping. +4. Keep descriptions specific, concise, and example-driven. 
+5. Provide test prompts to validate routing outcomes. diff --git a/skills/plano-agent-skills/SKILL.md b/skills/plano-agent-skills/SKILL.md new file mode 100644 index 000000000..e6ecbb203 --- /dev/null +++ b/skills/plano-agent-skills/SKILL.md @@ -0,0 +1,53 @@ +--- +name: plano-agent-skills +description: Best practices for building agents and agentic applications with Plano, including configuration, routing, orchestration, guardrails, observability, and deployment. +license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano Agent Skills + +Comprehensive Plano guidance for coding agents. Use this umbrella skill when a task spans multiple areas (config, routing, orchestration, filters, observability, CLI, deployment). + +## When To Use + +- Validating or fixing Plano `config.yaml` +- Designing listener architecture (`model`, `prompt`, `agent`) +- Improving model/provider routing quality and fallback behavior +- Hardening filter chains and prompt guardrails +- Debugging routing with traces and CLI workflows +- Preparing deployment and production readiness checks + +## How To Use + +1. Classify the request by scope (single section vs. cross-cutting). +2. For focused work, prefer a section-specific skill (for example `plano-routing-model-selection`). +3. For broad work, apply this umbrella skill and reference section rules from `skills/AGENTS.md`. +4. Produce concrete edits first, then concise reasoning and validation steps. + +## Operating Workflow + +1. Identify the task area first: config, routing, orchestration, filters, observability, CLI, or deployment. +2. Apply the smallest correct change that satisfies the requested behavior. +3. Preserve security and reliability defaults: + - `version: v0.3.0` + - exactly one `default: true` model provider + - secrets via `$ENV_VAR` substitution only + - `host.docker.internal` for host services from inside Docker + - guardrails before enrichment in filter chains +4. 
For debugging, prioritize traces over guesswork (`planoai up --with-tracing`, `planoai trace`). +5. Return concrete diffs and a short validation checklist. + +## Response Style + +- Prefer actionable edits over generic advice. +- Be explicit about why a config choice is correct. +- Call out risky patterns (hardcoded secrets, missing default provider, bad filter ordering). +- Keep examples minimal and production-viable. + +## References + +- Repo: https://github.com/katanemo/plano +- Full rulebook: `skills/AGENTS.md` diff --git a/skills/plano-cli-operations/SKILL.md b/skills/plano-cli-operations/SKILL.md new file mode 100644 index 000000000..da25db580 --- /dev/null +++ b/skills/plano-cli-operations/SKILL.md @@ -0,0 +1,34 @@ +--- +name: plano-cli-operations +description: Apply Plano CLI best practices. Use for startup troubleshooting, cli_agent workflows, prompt target generation, and template-based project bootstrapping. +license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano CLI Operations + +Use this skill when the task is primarily operational and CLI-driven. + +## When To Use + +- "Fix `planoai up` failures" +- "Use `planoai cli_agent` with coding agents" +- "Generate prompt targets from Python functions" +- "Bootstrap a project with `planoai init` templates" + +## Apply These Rules + +- `cli-startup` +- `cli-agent` +- `cli-generate` +- `cli-init` + +## Execution Checklist + +1. Follow startup validation order before deep debugging. +2. Use `cli_agent` to route coding-agent traffic through Plano. +3. Generate prompt target schema, then wire endpoint details explicitly. +4. Start from templates for reliable first-time setup. +5. Provide a compact runbook with exact CLI commands. 
diff --git a/skills/plano-config-fundamentals/SKILL.md b/skills/plano-config-fundamentals/SKILL.md new file mode 100644 index 000000000..87b7fbdd9 --- /dev/null +++ b/skills/plano-config-fundamentals/SKILL.md @@ -0,0 +1,34 @@ +--- +name: plano-config-fundamentals +description: Validate and fix Plano config fundamentals. Use for config versioning, listener types, provider registration, secrets handling, and startup validation failures. +license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano Configuration Fundamentals + +Use this skill for foundational `config.yaml` correctness. + +## When To Use + +- "Validate this Plano config" +- "Fix startup config errors" +- "Check listeners/providers/secrets" +- "Why does `planoai up` fail schema validation?" + +## Apply These Rules + +- `config-version` +- `config-listeners` +- `config-providers` +- `config-secrets` + +## Execution Checklist + +1. Ensure `version: v0.3.0` is present. +2. Confirm listener type matches intended architecture. +3. Verify provider names/interfaces and exactly one default provider. +4. Replace hardcoded secrets with `$ENV_VAR` substitution. +5. Return minimal patch and a `planoai up` verification plan. diff --git a/skills/plano-deployment-security/SKILL.md b/skills/plano-deployment-security/SKILL.md new file mode 100644 index 000000000..48256777d --- /dev/null +++ b/skills/plano-deployment-security/SKILL.md @@ -0,0 +1,33 @@ +--- +name: plano-deployment-security +description: Apply Plano deployment and production security practices. Use for Docker networking, state storage choices, readiness checks, and environment-based secret handling. +license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano Deployment and Security + +Use this skill to harden production deployments and reduce runtime surprises. 
+ +## When To Use + +- "Fix unreachable agents in Docker" +- "Configure persistent conversation state" +- "Add readiness and health checks" +- "Prepare production deployment checklist" + +## Apply These Rules + +- `deploy-docker` +- `deploy-state` +- `deploy-health` + +## Execution Checklist + +1. Use `host.docker.internal` for host-side services from inside Plano container. +2. Prefer PostgreSQL state storage for production multi-turn workloads. +3. Verify `/healthz` before traffic or CI assertions. +4. Ensure secrets remain environment-based, never hardcoded. +5. Return deployment checks with failure-mode diagnostics. diff --git a/skills/plano-filter-guardrails/SKILL.md b/skills/plano-filter-guardrails/SKILL.md new file mode 100644 index 000000000..2f19e67b6 --- /dev/null +++ b/skills/plano-filter-guardrails/SKILL.md @@ -0,0 +1,33 @@ +--- +name: plano-filter-guardrails +description: Harden Plano filter chains and guardrails. Use for MCP filter setup, prompt guard responses, and safe filter ordering. +license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano Filter Chains and Guardrails + +Use this skill when safety controls or filter pipelines need correction. + +## When To Use + +- "Fix filter chain ordering" +- "Set up MCP filters correctly" +- "Improve guardrail rejection behavior" +- "Harden request processing for safety" + +## Apply These Rules + +- `filter-mcp` +- `filter-guardrails` +- `filter-ordering` + +## Execution Checklist + +1. Configure filter `type`, `transport`, and `tool` explicitly for MCP. +2. Ensure rejection messages are clear and actionable. +3. Order chain as guards -> rewriters -> enrichment -> output checks. +4. Prevent expensive enrichment on unsafe requests. +5. Verify with representative blocked and allowed test prompts. 
diff --git a/skills/plano-observability-debugging/SKILL.md b/skills/plano-observability-debugging/SKILL.md new file mode 100644 index 000000000..c4039a7f5 --- /dev/null +++ b/skills/plano-observability-debugging/SKILL.md @@ -0,0 +1,33 @@ +--- +name: plano-observability-debugging +description: Improve Plano tracing and debugging workflows. Use for sampling strategy, span attributes, and trace query-based root-cause analysis. +license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano Observability and Debugging + +Use this skill to make routing and latency behavior inspectable and debuggable. + +## When To Use + +- "Enable tracing correctly" +- "Add useful span attributes" +- "Debug why a request routed incorrectly" +- "Inspect filter/model latency from traces" + +## Apply These Rules + +- `observe-tracing` +- `observe-span-attributes` +- `observe-trace-query` + +## Execution Checklist + +1. Enable tracing with environment-appropriate sampling. +2. Add useful static and header-derived span attributes. +3. Use `planoai trace` filters to isolate route and latency issues. +4. Prefer trace evidence over assumptions in recommendations. +5. Return exact commands to reproduce and validate findings. diff --git a/skills/plano-routing-model-selection/SKILL.md b/skills/plano-routing-model-selection/SKILL.md new file mode 100644 index 000000000..083f21c8f --- /dev/null +++ b/skills/plano-routing-model-selection/SKILL.md @@ -0,0 +1,34 @@ +--- +name: plano-routing-model-selection +description: Optimize Plano model routing and selection. Use for provider defaults, model aliases, passthrough auth, and routing preference quality. +license: Apache-2.0 +metadata: + author: katanemo + version: "1.0.0" +--- + +# Plano Routing and Model Selection + +Use this skill when requests are routed to the wrong model, costs are high, or fallback behavior is unclear. 
+ +## When To Use + +- "Improve model routing" +- "Add aliases and defaults" +- "Fix passthrough auth with proxy providers" +- "Tune routing preferences for better classification" + +## Apply These Rules + +- `routing-default` +- `routing-aliases` +- `routing-passthrough` +- `routing-preferences` + +## Execution Checklist + +1. Ensure exactly one `default: true` provider. +2. Add semantic aliases for stable client contracts. +3. Configure passthrough auth only where required. +4. Rewrite vague preference descriptions with concrete task scopes. +5. Validate routing behavior using trace-based checks. diff --git a/skills/rules/_sections.md b/skills/rules/_sections.md new file mode 100644 index 000000000..a74c77f82 --- /dev/null +++ b/skills/rules/_sections.md @@ -0,0 +1,16 @@ +# Section Definitions + +This file defines the sections used to organize Plano agent skills rules. +Files are assigned to sections based on their filename prefix. + + +| Prefix | Section # | Title | Impact | Description | +| ----------- | --------- | -------------------------- | ----------- | ----------------------------------------------------------------------------------------------------------------------- | +| `config-` | 1 | Configuration Fundamentals | CRITICAL | Core config.yaml structure, versioning, listener types, and provider setup — the entry point for every Plano deployment | +| `routing-` | 2 | Routing & Model Selection | HIGH | Intelligent LLM routing using preferences, aliases, and defaults to match tasks to the best model | +| `agent-` | 3 | Agent Orchestration | HIGH | Multi-agent patterns, agent descriptions, and orchestration strategies for building agentic applications | +| `filter-` | 4 | Filter Chains & Guardrails | HIGH | Request/response processing pipelines — ordering, MCP integration, and safety guardrails | +| `observe-` | 5 | Observability & Debugging | MEDIUM-HIGH | OpenTelemetry tracing, log levels, span attributes, and sampling for production visibility | +| 
`cli-` | 6 | CLI Operations | MEDIUM | Using the planoai CLI for startup, tracing, CLI agents, project init, and code generation | +| `deploy-` | 7 | Deployment & Security | HIGH | Docker deployment, environment variable management, health checks, and state storage for production | +| `advanced-` | 8 | Advanced Patterns | MEDIUM | Prompt targets, external API integration, and multi-listener architectures | diff --git a/skills/rules/_template.md b/skills/rules/_template.md new file mode 100644 index 000000000..9566063e0 --- /dev/null +++ b/skills/rules/_template.md @@ -0,0 +1,26 @@ +--- +title: Rule Title Here +impact: MEDIUM +impactDescription: Optional one-line description of the impact +tags: tag1, tag2, tag3 +--- + +## Rule Title Here + +Brief explanation of what this rule is and why it matters for Plano developers and agents. + +**Incorrect (explain what's wrong):** + +```yaml +# Bad config or CLI example +``` + +**Correct (explain what's right):** + +```yaml +# Good config or CLI example +``` + +Optional explanatory text elaborating on the principle or listing key points. + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/advanced-multi-listener.md b/skills/rules/advanced-multi-listener.md new file mode 100644 index 000000000..81c8d4d9a --- /dev/null +++ b/skills/rules/advanced-multi-listener.md @@ -0,0 +1,139 @@ +--- +title: Combine Multiple Listener Types for Layered Agent Architectures +impact: MEDIUM +impactDescription: Using a single listener type forces all traffic through one gateway pattern — combining types lets you serve different clients with the right interface without running multiple Plano instances +tags: advanced, multi-listener, architecture, agent, model, prompt +--- + +## Combine Multiple Listener Types for Layered Agent Architectures + +A single Plano `config.yaml` can define multiple listeners of different types, each on a separate port. 
This lets you serve different client types simultaneously: an OpenAI-compatible model gateway for direct API clients, a prompt gateway for LLM-callable function applications, and an agent orchestrator for multi-agent workflows — all from one Plano instance sharing the same model providers. + +**Single listener (limited — forces all clients through one interface):** + +```yaml +version: v0.3.0 + +listeners: + - type: model # Only model clients can use this + name: model_gateway + port: 12000 + +# Prompt target clients and agent clients cannot connect +``` + +**Multi-listener architecture (serves all client types):** + +```yaml +version: v0.3.0 + +# --- Shared model providers --- +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true + routing_preferences: + - name: quick tasks + description: Short answers, formatting, classification, simple generation + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: complex reasoning + description: Multi-step analysis, code generation, research synthesis + + - model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + routing_preferences: + - name: long documents + description: Summarizing or analyzing very long documents, PDFs, transcripts + +# --- Listener 1: OpenAI-compatible API gateway --- +# For: SDK clients, Claude Code, LangChain, etc. +listeners: + - type: model + name: model_gateway + port: 12000 + timeout: "120s" + +# --- Listener 2: Prompt function gateway --- +# For: Applications that expose LLM-callable APIs + - type: prompt + name: function_gateway + port: 10000 + timeout: "60s" + +# --- Listener 3: Agent orchestration gateway --- +# For: Multi-agent application clients + - type: agent + name: agent_orchestrator + port: 8000 + timeout: "90s" + router: plano_orchestrator_v1 + agents: + - id: research_agent + description: Searches, synthesizes, and summarizes information from multiple sources. 
+ filter_chain: + - input_guards + - context_builder + - id: code_agent + description: Writes, reviews, debugs, and explains code across all languages. + default: true + +# --- Agents --- +agents: + - id: research_agent + url: http://host.docker.internal:8001 + - id: code_agent + url: http://host.docker.internal:8002 + +# --- Filters --- +filters: + - id: input_guards + url: http://host.docker.internal:10500 + type: mcp + transport: streamable-http + - id: context_builder + url: http://host.docker.internal:10501 + type: mcp + transport: streamable-http + +# --- Prompt targets (for function gateway) --- +endpoints: + internal_api: + endpoint: host.docker.internal + protocol: http + +prompt_targets: + - name: search_knowledge_base + description: Search the internal knowledge base for relevant documents and facts. + parameters: + - name: query + type: str + required: true + description: Search query to find relevant information + endpoint: + name: internal_api + path: /kb/search?q={query} + http_method: GET + +# --- Observability --- +model_aliases: + plano.fast.v1: + target: gpt-4o-mini + plano.smart.v1: + target: gpt-4o + +tracing: + random_sampling: 50 + trace_arch_internal: true + span_attributes: + static: + environment: production + header_prefixes: + - x-katanemo- +``` + +This architecture serves: SDK clients on `:12000`, function-calling apps on `:10000`, and multi-agent orchestration on `:8000` — with shared cost-optimized routing across all three. 
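
Because every listener binds its own port, a collision between two listeners is an easy mistake to make when merging configs. The check below is a minimal sketch, assuming the config has already been parsed into a Python dict; `duplicate_listener_ports` is a hypothetical helper and not part of the `planoai` CLI, and the source does not say whether `planoai up` reports this condition itself.

```python
from collections import Counter


def duplicate_listener_ports(config: dict) -> list:
    """Return listener ports that are used more than once.

    Hypothetical pre-flight helper for illustration only; not a Plano API.
    """
    ports = Counter(
        listener["port"]
        for listener in config.get("listeners", [])
        if "port" in listener
    )
    return sorted(port for port, count in ports.items() if count > 1)


# Port layout taken from the multi-listener example in this rule
layered = {
    "listeners": [
        {"type": "model", "name": "model_gateway", "port": 12000},
        {"type": "prompt", "name": "function_gateway", "port": 10000},
        {"type": "agent", "name": "agent_orchestrator", "port": 8000},
    ]
}

# Same layout, but the prompt listener accidentally reuses the model port
clashing = {
    "listeners": [
        {"type": "model", "name": "model_gateway", "port": 12000},
        {"type": "prompt", "name": "function_gateway", "port": 12000},
    ]
}

ok_ports = duplicate_listener_ports(layered)
bad_ports = duplicate_listener_ports(clashing)
print(ok_ports, bad_ports)
```

An empty result means each listener has a distinct port; any returned port needs to be reassigned before startup.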
+ +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) diff --git a/skills/rules/advanced-prompt-targets.md b/skills/rules/advanced-prompt-targets.md new file mode 100644 index 000000000..88f376fd3 --- /dev/null +++ b/skills/rules/advanced-prompt-targets.md @@ -0,0 +1,128 @@ +--- +title: Design Prompt Targets with Precise Parameter Schemas +impact: HIGH +impactDescription: Imprecise parameter definitions cause the LLM to hallucinate values, skip required fields, or produce malformed API calls — the schema is the contract between the LLM and your API +tags: advanced, prompt-targets, functions, llm, api-integration +--- + +## Design Prompt Targets with Precise Parameter Schemas + +`prompt_targets` define functions that Plano's LLM can call autonomously when it determines a user request matches the function's description. The parameter schema tells the LLM exactly what values to extract from user input — vague schemas lead to hallucinated parameters and failed API calls. + +**Incorrect (too few constraints — LLM must guess):** + +```yaml +prompt_targets: + - name: get_flight_info + description: Get flight information + parameters: + - name: flight # What format? "AA123"? "AA 123"? "American 123"? + type: str + required: true + endpoint: + name: flights_api + path: /flight?id={flight} +``` + +**Correct (fully specified schema with descriptions, formats, and enums):** + +```yaml +version: v0.3.0 + +endpoints: + flights_api: + endpoint: api.flightaware.com + protocol: https + connect_timeout: "5s" + +prompt_targets: + - name: get_flight_status + description: > + Get real-time status, gate information, and delays for a specific flight number. + Use when the user asks about a flight's current status, arrival time, or gate. + parameters: + - name: flight_number + description: > + IATA airline code followed by flight number, e.g., "AA123", "UA456", "DL789". + Extract from user message — do not include spaces. 
+ type: str + required: true + format: "^[A-Z]{2}[0-9]{1,4}$" # Regex hint for validation + + - name: date + description: > + Flight date in YYYY-MM-DD format. Use today's date if not specified. + type: str + required: false + format: date + + endpoint: + name: flights_api + path: /flights/{flight_number}?date={date} + http_method: GET + http_headers: + Authorization: "Bearer $FLIGHTAWARE_API_KEY" + + - name: search_flights + description: > + Search for available flights between two cities or airports. + Use when the user wants to find flights, compare options, or book travel. + parameters: + - name: origin + description: Departure airport IATA code (e.g., "JFK", "LAX", "ORD") + type: str + required: true + - name: destination + description: Arrival airport IATA code (e.g., "LHR", "CDG", "NRT") + type: str + required: true + - name: departure_date + description: Departure date in YYYY-MM-DD format + type: str + required: true + format: date + - name: cabin_class + description: Preferred cabin class + type: str + required: false + default: economy + enum: [economy, premium_economy, business, first] + - name: passengers + description: Number of adult passengers (1-9) + type: int + required: false + default: 1 + + endpoint: + name: flights_api + path: /search?from={origin}&to={destination}&date={departure_date}&class={cabin_class}&pax={passengers} + http_method: GET + http_headers: + Authorization: "Bearer $FLIGHTAWARE_API_KEY" + + system_prompt: | + You are a travel assistant. Present flight search results clearly, + highlighting the best value options. Include price, duration, and + number of stops for each option. 
+ +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + +listeners: + - type: prompt + name: travel_functions + port: 10000 + timeout: "30s" +``` + +**Key principles:** +- `description` on the target tells the LLM when to call it — be specific about trigger conditions +- `description` on each parameter tells the LLM what value to extract — include format examples +- Use `enum` to constrain categorical values — prevents the LLM from inventing categories +- Use `format: date` or regex patterns to hint at expected format +- Use `default` for optional parameters so the API never receives null values +- `system_prompt` on the target customizes how the LLM formats the API response to the user + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/agent-descriptions.md b/skills/rules/agent-descriptions.md new file mode 100644 index 000000000..86728bde8 --- /dev/null +++ b/skills/rules/agent-descriptions.md @@ -0,0 +1,75 @@ +--- +title: Write Capability-Focused Agent Descriptions for Accurate Routing +impact: HIGH +impactDescription: The orchestrator LLM routes requests purely by reading agent descriptions — poor descriptions cause misroutes to the wrong specialized agent +tags: agent, orchestration, descriptions, routing, multi-agent +--- + +## Write Capability-Focused Agent Descriptions for Accurate Routing + +In an `agent` listener, Plano's orchestrator reads each agent's `description` and routes user requests to the best-matching agent. This is LLM-based intent matching — the description is the entire specification the router sees. Write it as a capability manifest: what can this agent do, what data does it have access to, and what types of requests should it handle? 
+ +**Incorrect (generic, overlapping descriptions):** + +```yaml +listeners: + - type: agent + name: orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: agent_1 + description: Helps users with information # Too generic — matches everything + + - id: agent_2 + description: Also helps users # Indistinguishable from agent_1 +``` + +**Correct (specific capabilities, distinct domains, concrete examples):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + - id: flight_agent + url: http://host.docker.internal:8002 + - id: hotel_agent + url: http://host.docker.internal:8003 + +listeners: + - type: agent + name: travel_orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: > + Provides real-time weather conditions and multi-day forecasts for any city + worldwide. Handles questions about temperature, precipitation, wind, humidity, + sunrise/sunset times, and severe weather alerts. Examples: "What's the weather + in Tokyo?", "Will it rain in London this weekend?", "Sunrise time in New York." + + - id: flight_agent + description: > + Provides live flight status, schedules, gate information, delays, and + aircraft details for any flight number or route between airports. + Handles questions about departures, arrivals, and airline information. + Examples: "Is AA123 on time?", "Flights from JFK to LAX tomorrow." + + - id: hotel_agent + description: > + Searches and books hotel accommodations, compares room types, pricing, + and availability. Handles check-in/check-out dates, amenities, and + cancellation policies. Examples: "Hotels near Times Square for next Friday." 
+``` + +**Description writing checklist:** +- State the primary domain in the first sentence +- List 3–5 specific data types or question categories this agent handles +- Include 2–3 concrete example user queries in quotes +- Avoid capability overlap between agents — if they overlap, the router will split traffic unpredictably +- Keep descriptions under 150 words — the orchestrator reads all descriptions per request + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/agent-orchestration.md b/skills/rules/agent-orchestration.md new file mode 100644 index 000000000..0e6d7bb3b --- /dev/null +++ b/skills/rules/agent-orchestration.md @@ -0,0 +1,88 @@ +--- +title: Register All Sub-Agents in Both `agents` and `listeners.agents` +impact: CRITICAL +impactDescription: An agent registered only in `agents` but not referenced in a listener's agent list is unreachable; an agent listed in a listener but missing from `agents` causes a startup error +tags: agent, orchestration, config, multi-agent +--- + +## Register All Sub-Agents in Both `agents` and `listeners.agents` + +Plano's agent system has two separate concepts: the global `agents` array (defines the agent's ID and backend URL) and the `listeners[].agents` array (controls which agents are available to an orchestrator and provides their routing descriptions). Both must reference the same agent ID. + +**Incorrect (agent defined globally but not referenced in listener):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + - id: news_agent # Defined but never referenced in any listener + url: http://host.docker.internal:8002 + +listeners: + - type: agent + name: orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: Provides weather forecasts and current conditions. 
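
The cross-reference between the two arrays can be checked mechanically. The following is a minimal sketch, assuming the config has already been loaded into a Python dict (for example with a YAML parser); `check_agent_registration` is a hypothetical name used for illustration and is not something Plano ships.

```python
def check_agent_registration(config: dict) -> dict:
    """Compare global agent IDs with the IDs referenced by agent listeners.

    Hypothetical helper for illustration only; not a Plano API.
    """
    defined = {agent["id"] for agent in config.get("agents", [])}
    referenced = set()
    for listener in config.get("listeners", []):
        if listener.get("type") != "agent":
            continue  # only agent listeners carry an agents list
        for entry in listener.get("agents", []):
            referenced.add(entry["id"])
    return {
        # Defined globally but not reachable through any listener
        "unreferenced": sorted(defined - referenced),
        # Listed in a listener but missing from the global agents array
        "undefined": sorted(referenced - defined),
    }


example = {
    "agents": [
        {"id": "weather_agent", "url": "http://host.docker.internal:8001"},
        {"id": "news_agent", "url": "http://host.docker.internal:8002"},
    ],
    "listeners": [
        {
            "type": "agent",
            "name": "orchestrator",
            "agents": [
                {"id": "weather_agent", "description": "Weather forecasts."},
                {"id": "flights_agent", "description": "Flight status."},
            ],
        }
    ],
}

report = check_agent_registration(example)
print(report)
```

Both lists should be empty for a correctly wired config; a non-empty `undefined` list corresponds to the startup-error case, and a non-empty `unreferenced` list to the unreachable-agent case.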
+ # news_agent is missing here — the orchestrator cannot route to it +``` + +**Incorrect (listener references an agent ID not in the global agents list):** + +```yaml +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + +listeners: + - type: agent + name: orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: Provides weather forecasts. + - id: flights_agent # ID not in global agents[] — startup error + description: Provides flight status information. +``` + +**Correct (every agent ID appears in both places):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + - id: flights_agent + url: http://host.docker.internal:8002 + - id: hotels_agent + url: http://host.docker.internal:8003 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + +listeners: + - type: agent + name: travel_orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: Real-time weather, forecasts, and climate data for any city. + - id: flights_agent + description: Live flight status, schedules, gates, and delays. + - id: hotels_agent + description: Hotel search, availability, pricing, and booking. + default: true # Fallback if no other agent matches +``` + +Set `default: true` on one agent in each listener's agents list to handle unmatched requests. The agent's URL in the global `agents` array is the HTTP endpoint Plano forwards matching requests to — it must be reachable from within the Docker container (use `host.docker.internal` for services on the host). 
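
The fallback guidance above can be verified with a small script. This is a sketch under the assumption that the config is available as a parsed Python dict; `listeners_without_single_default` is a hypothetical helper, and whether Plano itself enforces a single default per listener is not stated here.

```python
def listeners_without_single_default(config: dict) -> list:
    """Return names of agent listeners that do not mark exactly one default agent.

    Hypothetical helper for illustration only; not a Plano API.
    """
    flagged = []
    for listener in config.get("listeners", []):
        if listener.get("type") != "agent":
            continue
        defaults = [
            entry["id"]
            for entry in listener.get("agents", [])
            if entry.get("default") is True
        ]
        if len(defaults) != 1:
            flagged.append(listener.get("name", "<unnamed>"))
    return flagged


config = {
    "listeners": [
        {
            "type": "agent",
            "name": "travel_orchestrator",
            "agents": [
                {"id": "weather_agent"},
                {"id": "hotels_agent", "default": True},
            ],
        },
        {
            "type": "agent",
            "name": "no_fallback",
            "agents": [{"id": "weather_agent"}],
        },
        # Non-agent listeners are ignored by the check
        {"type": "model", "name": "model_gateway"},
    ]
}

flagged = listeners_without_single_default(config)
print(flagged)
```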
+ +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/cli-agent.md b/skills/rules/cli-agent.md new file mode 100644 index 000000000..e311e99ee --- /dev/null +++ b/skills/rules/cli-agent.md @@ -0,0 +1,86 @@ +--- +title: Use `planoai cli_agent` to Connect Claude Code Through Plano +impact: MEDIUM-HIGH +impactDescription: Running Claude Code directly against provider APIs bypasses Plano's routing, observability, and guardrails — cli_agent routes all Claude Code traffic through your configured Plano instance +tags: cli, cli-agent, claude, coding-agent, integration +--- + +## Use `planoai cli_agent` to Connect Claude Code Through Plano + +`planoai cli_agent` starts a Claude Code session that routes all LLM traffic through your running Plano instance instead of directly to Anthropic. This gives you routing preferences, model aliases, tracing, and guardrails for your coding agent workflows — making Claude Code a first-class citizen of your Plano configuration. + +**Prerequisites:** + +```bash +# 1. Plano must be running with a model listener +planoai up config.yaml + +# 2. ANTHROPIC_API_KEY must be set (Claude Code uses it for auth) +export ANTHROPIC_API_KEY=sk-ant-... +``` + +**Starting the CLI agent:** + +```bash +# Start CLI agent using config.yaml in current directory +planoai cli_agent claude + +# Use a specific config file +planoai cli_agent claude config.yaml + +# Use a config in a different directory +planoai cli_agent claude --path /path/to/project +``` + +**Recommended config for Claude Code routing:** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: claude_code_router + port: 12000 + +model_providers: + - model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + default: true + routing_preferences: + - name: general coding + description: > + Writing code, debugging, code review, explaining concepts, + answering programming questions, general development tasks. 
+ + - model: anthropic/claude-opus-4-6 + access_key: $ANTHROPIC_API_KEY + routing_preferences: + - name: complex architecture + description: > + System design, complex refactoring across many files, + architectural decisions, performance optimization, security audits. + +model_aliases: + claude.fast.v1: + target: claude-sonnet-4-20250514 + claude.smart.v1: + target: claude-opus-4-6 + +tracing: + random_sampling: 100 + trace_arch_internal: true + +overrides: + upstream_connect_timeout: "10s" +``` + +**What happens when cli_agent runs:** + +1. Reads your config.yaml to find the model listener port +2. Configures Claude Code to use `http://localhost:<port>` (the model listener port from step 1) as its API endpoint +3. Starts a Claude Code session in your terminal +4. All Claude Code LLM calls flow through Plano, so routing, tracing, and guardrails apply + +After your session, use `planoai trace` to inspect every LLM call Claude Code made, which model was selected, and why. + +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) diff --git a/skills/rules/cli-generate.md b/skills/rules/cli-generate.md new file mode 100644 index 000000000..75ae8e4fd --- /dev/null +++ b/skills/rules/cli-generate.md @@ -0,0 +1,91 @@ +--- +title: Generate Prompt Targets from Python Functions with `planoai generate_prompt_targets` +impact: MEDIUM +impactDescription: Manually writing prompt_targets YAML for existing Python APIs is error-prone; the generator introspects function signatures and produces correct YAML automatically +tags: cli, generate, prompt-targets, python, code-generation +--- + +## Generate Prompt Targets from Python Functions with `planoai generate_prompt_targets` + +`planoai generate_prompt_targets` introspects Python function signatures and docstrings to generate `prompt_targets` YAML for your Plano config. This is the fastest way to expose existing Python APIs as LLM-callable functions without manually writing the YAML schema.
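
To make the idea of signature introspection concrete, here is a minimal sketch built on the standard `inspect` module. It is not the generator's actual implementation; `sketch_prompt_target` and the type-name mapping are assumptions chosen to mirror the fields (`name`, `description`, `parameters`, `required`, `default`) that a prompt target uses.

```python
import inspect

# Assumed mapping from Python annotations to schema type names
_TYPE_NAMES = {
    int: "int", float: "float", bool: "bool", str: "str",
    list: "list", tuple: "tuple", set: "set", dict: "dict",
}


def sketch_prompt_target(func) -> dict:
    """Build a prompt-target-like dict from a function signature and docstring.

    Illustrative only; the real generator may behave differently.
    """
    signature = inspect.signature(func)
    docstring = inspect.getdoc(func) or ""
    parameters = []
    for name, param in signature.parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        entry = {
            "name": name,
            # Unannotated or unknown types fall back to "str" in this sketch
            "type": _TYPE_NAMES.get(param.annotation, "str"),
            "required": not has_default,
        }
        if has_default:
            entry["default"] = param.default
        parameters.append(entry)
    return {
        "name": func.__name__,
        # First docstring line serves as the description
        "description": docstring.split("\n")[0],
        "parameters": parameters,
    }


def get_stock_quote(symbol: str, exchange: str = "NYSE") -> dict:
    """Get the current stock price and trading data for a given stock symbol.

    Returns price, volume, market cap, and 24h change percentage.
    """
    return {}


target = sketch_prompt_target(get_stock_quote)
print(target)
```

Parameters without a default become `required: true`, and defaults are carried over verbatim, which is why simple annotated signatures translate cleanly into a schema.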
+ +**Python function requirements for generation:** +- Use simple type annotations: `int`, `float`, `bool`, `str`, `list`, `tuple`, `set`, `dict` +- Include a docstring describing what the function does (becomes the `description`) +- Complex Pydantic models must be flattened into primitive typed parameters first + +**Example Python file:** + +```python +# api.py + +def get_stock_quote(symbol: str, exchange: str = "NYSE") -> dict: + """Get the current stock price and trading data for a given stock symbol. + + Returns price, volume, market cap, and 24h change percentage. + """ + # Implementation calls stock API + pass + +def get_weather_forecast(city: str, days: int = 3, units: str = "celsius") -> dict: + """Get the weather forecast for a city. + + Returns temperature, precipitation, and conditions for the specified number of days. + """ + pass + +def search_flights(origin: str, destination: str, date: str, passengers: int = 1) -> list: + """Search for available flights between two airports on a given date. + + Date format: YYYY-MM-DD. Returns list of flight options with prices. + """ + pass +``` + +**Running the generator:** + +```bash +planoai generate_prompt_targets --file api.py +``` + +**Generated output (add to your config.yaml):** + +```yaml +prompt_targets: + - name: get_stock_quote + description: Get the current stock price and trading data for a given stock symbol. + parameters: + - name: symbol + type: str + required: true + - name: exchange + type: str + required: false + default: NYSE + # Add endpoint manually: + endpoint: + name: stock_api + path: /quote?symbol={symbol}&exchange={exchange} + + - name: get_weather_forecast + description: Get the weather forecast for a city. 
+ parameters: + - name: city + type: str + required: true + - name: days + type: int + required: false + default: 3 + - name: units + type: str + required: false + default: celsius + endpoint: + name: weather_api + path: /forecast?city={city}&days={days}&units={units} +``` + +After generation, manually add the `endpoint` blocks pointing to your actual API. The generator produces the schema; you wire in the connectivity. + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/cli-init.md b/skills/rules/cli-init.md new file mode 100644 index 000000000..740396aef --- /dev/null +++ b/skills/rules/cli-init.md @@ -0,0 +1,66 @@ +--- +title: Use `planoai init` Templates to Bootstrap New Projects Correctly +impact: MEDIUM +impactDescription: Starting from a blank config.yaml leads to missing required fields and common structural mistakes — templates provide validated, idiomatic starting points +tags: cli, init, templates, getting-started, project-setup +--- + +## Use `planoai init` Templates to Bootstrap New Projects Correctly + +`planoai init` generates a valid `config.yaml` from built-in templates. Each template demonstrates a specific Plano capability with correct structure, realistic examples, and comments. Use this instead of writing config from scratch — it ensures you start with a valid, working configuration. 
+ +**Available templates:** + +| Template ID | What It Demonstrates | Best For | +|---|---|---| +| `sub_agent_orchestration` | Multi-agent routing with specialized sub-agents | Building agentic applications | +| `coding_agent_routing` | Routing preferences + model aliases for coding workflows | Claude Code and coding assistants | +| `preference_aware_routing` | Automatic LLM routing based on task type | Multi-model cost optimization | +| `filter_chain_guardrails` | Input guards, query rewrite, context builder | RAG + safety pipelines | +| `conversational_state_v1_responses` | Stateful conversations with memory | Chatbots, multi-turn assistants | + +**Usage:** + +```bash +# Initialize with a template +planoai init --template sub_agent_orchestration + +# Initialize coding agent routing setup +planoai init --template coding_agent_routing + +# Initialize a RAG with guardrails project +planoai init --template filter_chain_guardrails +``` + +**Typical project setup workflow:** + +```bash +# 1. Create project directory +mkdir my-plano-agent && cd my-plano-agent + +# 2. Bootstrap with the closest matching template +planoai init --template preference_aware_routing + +# 3. Edit config.yaml to add your specific models, agents, and API keys +# (keys are already using $VAR substitution — just set your env vars) + +# 4. Create .env file for local development +cat > .env << EOF +OPENAI_API_KEY=sk-proj-... +ANTHROPIC_API_KEY=sk-ant-... +EOF + +echo ".env" >> .gitignore + +# 5. Start Plano +planoai up + +# 6. Test your configuration +curl http://localhost:12000/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}' +``` + +Start with `preference_aware_routing` for most LLM gateway use cases and `sub_agent_orchestration` for multi-agent applications. Both can be combined after you understand each independently. 
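Steps 3 and 4 of the workflow rely on every `$VAR` referenced in `config.yaml` having a value in `.env` or the shell. The check below is a minimal sketch of that idea and is not part of `planoai`: the helper names (`referenced_vars`, `parse_env`, `missing_vars`) and the regular expression are assumptions made for illustration, and the pattern would also match `$VAR` text that appears inside YAML comments.

```python
import re

# Hypothetical pattern: matches $NAME and ${NAME} with upper-case names.
VAR_PATTERN = re.compile(r"\$\{?([A-Z_][A-Z0-9_]*)\}?")


def referenced_vars(config_text: str) -> set[str]:
    """Collect every variable name referenced as $NAME or ${NAME}."""
    return set(VAR_PATTERN.findall(config_text))


def parse_env(env_text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_vars(config_text: str, env_text: str) -> list[str]:
    """Return referenced variables that are absent or empty in the .env text."""
    defined = {key for key, value in parse_env(env_text).items() if value}
    return sorted(referenced_vars(config_text) - defined)
```

Running `missing_vars(open("config.yaml").read(), open(".env").read())` before `planoai up` would list any keys that still need a value. Variables exported only in the shell are not seen by this sketch.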
+ +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/cli-startup.md b/skills/rules/cli-startup.md new file mode 100644 index 000000000..2d51927cd --- /dev/null +++ b/skills/rules/cli-startup.md @@ -0,0 +1,80 @@ +--- +title: Follow the `planoai up` Validation Workflow Before Debugging Runtime Issues +impact: HIGH +impactDescription: `planoai up` validates config, checks API keys, and health-checks all listeners; ignoring this diagnostic output leads to unnecessary debugging of container or network issues +tags: cli, startup, validation, debugging, workflow +--- + +## Follow the `planoai up` Validation Workflow Before Debugging Runtime Issues + +`planoai up` is the entry point for running Plano. It runs four stages in sequence: schema validation, an API key presence check, container startup, and health checks on all configured listener ports. Knowing which stage failed keeps you from chasing the wrong root cause. + +**Validation stages and failure signals:** + +``` +Stage 1: Schema validation → "config.yaml: invalid against schema" +Stage 2: API key check → "Missing required environment variables: OPENAI_API_KEY" +Stage 3: Container start → "Docker daemon not running" or image pull errors +Stage 4: Health check (/healthz) → "Listener not healthy after 120s" (timeout) +``` + +**Development startup workflow:** + +```bash +# Standard startup: config.yaml in current directory +planoai up + +# Explicit config file path +planoai up my-config.yaml + +# Start in foreground to see all logs immediately (great for debugging) +planoai up config.yaml --foreground + +# Start with built-in OTEL trace collector +planoai up config.yaml --with-tracing + +# Enable verbose logging for debugging routing decisions +LOG_LEVEL=debug planoai up config.yaml --foreground +``` + +**Checking what's running:** + +```bash +# Show recent logs (last N lines, then exit) +planoai logs + +# Follow logs in real-time +planoai logs --follow
+ +# Include Envoy/gateway debug messages +planoai logs --debug --follow +``` + +**Stopping and restarting after config changes:** + +```bash +# Stop the current container +planoai down + +# Restart with updated config +planoai up config.yaml +``` + +**Common failure patterns:** + +```bash +# API key missing — check your .env file or shell environment +export OPENAI_API_KEY=sk-proj-... +planoai up config.yaml + +# Health check timeout — listener port may conflict +# Check if another process uses port 12000 +lsof -i :12000 + +# Container fails to start — verify Docker daemon is running +docker ps +``` + +`planoai down` fully stops and removes the Plano container. Always run `planoai down` before `planoai up` when changing config to avoid stale container state. + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/config-listeners.md b/skills/rules/config-listeners.md new file mode 100644 index 000000000..d40a3e30f --- /dev/null +++ b/skills/rules/config-listeners.md @@ -0,0 +1,64 @@ +--- +title: Choose the Right Listener Type for Your Use Case +impact: CRITICAL +impactDescription: The listener type determines the entire request processing pipeline — choosing the wrong type means features like prompt functions or agent routing are unavailable +tags: config, listeners, architecture, routing +--- + +## Choose the Right Listener Type for Your Use Case + +Plano supports three listener types, each serving a distinct purpose. `listeners` is the only required top-level array in a Plano config. Every listener needs at minimum a `type`, `name`, and `port`. 
+ +| Type | Use When | Key Feature | +|------|----------|-------------| +| `model` | You want an OpenAI-compatible LLM gateway | Routes to multiple LLM providers, supports model aliases and routing preferences | +| `prompt` | You want LLM-callable custom functions | Define `prompt_targets` that the LLM dispatches as function calls | +| `agent` | You want multi-agent orchestration | Routes user requests to specialized sub-agents by matching agent descriptions | + +**Incorrect (using `model` when agents need orchestration):** + +```yaml +version: v0.3.0 + +# Wrong: a model listener cannot route to backend agent services +listeners: + - type: model + name: main + port: 12000 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 +``` + +**Correct (use `agent` listener for multi-agent systems):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 + - id: travel_agent + url: http://host.docker.internal:8002 + +listeners: + - type: agent + name: orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: weather_agent + description: Provides real-time weather, forecasts, and conditions for any city. + - id: travel_agent + description: Books flights, hotels, and travel itineraries. + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true +``` + +A single Plano instance can expose multiple listeners on different ports, each with a different type, to serve different clients simultaneously. 
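The listener rules in this section (three types, the minimum fields `type`, `name`, and `port`, and one port per listener) can be sketched as a small pre-flight check. This is an illustration only: `check_listeners` is a hypothetical helper, it operates on already-parsed dictionaries, and it does not replace the schema validation that `planoai up` performs.

```python
# Listener types and minimum fields as described in this section.
VALID_TYPES = {"model", "prompt", "agent"}
REQUIRED_FIELDS = ("type", "name", "port")


def check_listeners(listeners: list[dict]) -> list[str]:
    """Return human-readable problems found in a parsed `listeners` list."""
    problems: list[str] = []
    seen_ports: dict[int, str] = {}
    for index, listener in enumerate(listeners):
        label = listener.get("name", f"listener #{index}")
        # Every listener needs at least type, name, and port.
        for field in REQUIRED_FIELDS:
            if field not in listener:
                problems.append(f"{label}: missing '{field}'")
        kind = listener.get("type")
        if kind is not None and kind not in VALID_TYPES:
            problems.append(f"{label}: unknown type '{kind}'")
        # Each listener is expected to bind its own port.
        port = listener.get("port")
        if port is not None:
            if port in seen_ports:
                problems.append(
                    f"{label}: port {port} already used by {seen_ports[port]}"
                )
            else:
                seen_ports[port] = label
    return problems
```

An empty list means the basic shape looks right; anything else is worth fixing before starting Plano.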
+ +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/config-providers.md b/skills/rules/config-providers.md new file mode 100644 index 000000000..30476cd5a --- /dev/null +++ b/skills/rules/config-providers.md @@ -0,0 +1,64 @@ +--- +title: Register Model Providers with Correct Format Identifiers +impact: CRITICAL +impactDescription: Incorrect provider format causes request translation failures — Plano must know the wire format each provider expects +tags: config, model-providers, llm, api-format +--- + +## Register Model Providers with Correct Format Identifiers + +Plano translates requests between its internal format and each provider's API. The `model` field uses `provider/model-name` syntax which determines both the upstream endpoint and the request/response translation layer. Some providers require an explicit `provider_interface` override. + +**Provider format reference:** + +| Model prefix | Wire format | Example | +|---|---|---| +| `openai/*` | OpenAI | `openai/gpt-4o` | +| `anthropic/*` | Anthropic | `anthropic/claude-sonnet-4-20250514` | +| `gemini/*` | Google Gemini | `gemini/gemini-2.0-flash` | +| `mistral/*` | Mistral | `mistral/mistral-large-latest` | +| `groq/*` | Groq | `groq/llama-3.3-70b-versatile` | +| `deepseek/*` | DeepSeek | `deepseek/deepseek-chat` | +| `xai/*` | Grok (OpenAI-compat) | `xai/grok-2` | +| `together_ai/*` | Together.ai | `together_ai/meta-llama/Llama-3` | +| `custom/*` | Requires `provider_interface` | `custom/my-local-model` | + +**Incorrect (missing provider prefix, ambiguous format):** + +```yaml +model_providers: + - model: gpt-4o # Missing openai/ prefix — Plano cannot route this + access_key: $OPENAI_API_KEY + + - model: claude-3-5-sonnet # Missing anthropic/ prefix + access_key: $ANTHROPIC_API_KEY +``` + +**Correct (explicit provider prefixes):** + +```yaml +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + + - model: anthropic/claude-sonnet-4-20250514 + 
access_key: $ANTHROPIC_API_KEY + + - model: gemini/gemini-2.0-flash + access_key: $GOOGLE_API_KEY +``` + +**For local or self-hosted models (Ollama, LiteLLM, vLLM):** + +```yaml +model_providers: + - model: custom/llama3 + base_url: http://host.docker.internal:11434/v1 # Ollama endpoint + provider_interface: openai # Ollama speaks OpenAI format + default: true +``` + +Always set `default: true` on exactly one provider per listener so Plano has a fallback when routing preferences do not match. + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/config-secrets.md b/skills/rules/config-secrets.md new file mode 100644 index 000000000..5f585c879 --- /dev/null +++ b/skills/rules/config-secrets.md @@ -0,0 +1,72 @@ +--- +title: Use Environment Variable Substitution for All Secrets +impact: CRITICAL +impactDescription: Hardcoded API keys in config.yaml will be committed to version control and exposed in Docker container inspect output +tags: config, security, secrets, api-keys, environment-variables +--- + +## Use Environment Variable Substitution for All Secrets + +Plano supports `$VAR_NAME` substitution in config values. This applies to `access_key` fields, `connection_string` for state storage, and `http_headers` in prompt targets and endpoints. Never hardcode credentials — Plano reads them from environment variables or a `.env` file at startup via `planoai up`. + +**Incorrect (hardcoded secrets):** + +```yaml +version: v0.3.0 + +model_providers: + - model: openai/gpt-4o + access_key: abcdefghijklmnopqrstuvwxyz... 
# Hardcoded — never do this + +state_storage: + type: postgres + connection_string: "postgresql://admin:mysecretpassword@prod-db:5432/plano" + +prompt_targets: + - name: get_data + endpoint: + name: my_api + http_headers: + Authorization: "Bearer abcdefghijklmnopqrstuvwxyz" # Hardcoded token +``` + +**Correct (environment variable substitution):** + +```yaml +version: v0.3.0 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + + - model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + +state_storage: + type: postgres + connection_string: "postgresql://${DB_USER}:${DB_PASS}@${DB_HOST}:5432/${DB_NAME}" + +prompt_targets: + - name: get_data + endpoint: + name: my_api + http_headers: + Authorization: "Bearer $MY_API_TOKEN" +``` + +**`.env` file pattern (loaded automatically by `planoai up`):** + +```bash +# .env — add to .gitignore +OPENAI_API_KEY=sk-proj-... +ANTHROPIC_API_KEY=sk-ant-... +DB_USER=plano +DB_PASS=secure-password +DB_HOST=localhost +MY_API_TOKEN=tok_live_... +``` + +Plano also accepts keys set directly in the shell environment. Variables referenced in config but not found at startup cause `planoai up` to fail with a clear error listing the missing keys. + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/config-version.md b/skills/rules/config-version.md new file mode 100644 index 000000000..768d7b043 --- /dev/null +++ b/skills/rules/config-version.md @@ -0,0 +1,44 @@ +--- +title: Always Specify a Supported Config Version +impact: CRITICAL +impactDescription: Plano rejects configs with missing or unsupported version fields — the version field gates all other validation +tags: config, versioning, validation +--- + +## Always Specify a Supported Config Version + +Every Plano `config.yaml` must include a `version` field at the top level. 
Plano validates configs against a versioned JSON schema — an unrecognized or missing version will cause `planoai up` to fail immediately with a schema validation error before the container starts. + +**Incorrect (missing or invalid version):** + +```yaml +# No version field — fails schema validation +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY +``` + +**Correct (explicit supported version):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true +``` + +Use the latest supported version unless you are targeting a specific deployed Plano image. Current supported versions: `v0.1`, `v0.1.0`, `0.1-beta`, `v0.2.0`, `v0.3.0`. Prefer `v0.3.0` for all new projects. + +Reference: https://github.com/katanemo/archgw/blob/main/config/plano_config_schema.yaml diff --git a/skills/rules/deploy-docker.md b/skills/rules/deploy-docker.md new file mode 100644 index 000000000..ecc23586f --- /dev/null +++ b/skills/rules/deploy-docker.md @@ -0,0 +1,80 @@ +--- +title: Understand Plano's Docker Network Topology for Agent URL Configuration +impact: HIGH +impactDescription: Using `localhost` for agent URLs inside Docker always fails — Plano runs in a container and cannot reach host services via localhost +tags: deployment, docker, networking, agents, urls +--- + +## Understand Plano's Docker Network Topology for Agent URL Configuration + +Plano runs inside a Docker container managed by `planoai up`. Services running on your host machine (agent servers, filter servers, databases) are not accessible as `localhost` from inside the container. Use Docker's special hostname `host.docker.internal` to reach host services. 
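The hostname substitution described here can be sketched in a few lines of Python. The helper name `to_container_url` is hypothetical and the function is not part of Plano; it only shows how a URL that works on the host could be rewritten for use in `config.yaml`. Any userinfo in the URL is dropped by this sketch.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames that resolve to the container itself when used inside Docker.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def to_container_url(url: str) -> str:
    """Rewrite loopback hosts to host.docker.internal, keeping port and path."""
    parts = urlsplit(url)
    if parts.hostname not in LOOPBACK_HOSTS:
        # Service names and external domains are left untouched.
        return url
    netloc = "host.docker.internal"
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

For example, `to_container_url("http://localhost:8001")` yields a URL on `host.docker.internal` with the same port, while a service URL such as `http://weather-service:8001` is returned as-is.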
+ +**Docker network rules:** +- `localhost` / `127.0.0.1` inside the container → Plano's own container (not your host) +- `host.docker.internal` → Your host machine (services listening on host ports) +- Container name or `docker network` hostname → Other Docker containers +- External domain / IP → Reachable if Docker has network access + +**Incorrect (using localhost: agent unreachable from inside the container):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://localhost:8001 # Wrong: this is Plano's own container + + - id: flight_agent + url: http://127.0.0.1:8002 # Wrong: same issue + +filters: + - id: input_guards + url: http://localhost:10500 # Wrong: filter server unreachable +``` + +**Correct (using host.docker.internal for host-side services):** + +```yaml +version: v0.3.0 + +agents: + - id: weather_agent + url: http://host.docker.internal:8001 # Correct: reaches host port 8001 + + - id: flight_agent + url: http://host.docker.internal:8002 # Correct: reaches host port 8002 + +filters: + - id: input_guards + url: http://host.docker.internal:10500 # Correct: reaches filter server on host + +endpoints: + internal_api: + endpoint: host.docker.internal # Correct for internal API on host + protocol: http +``` + +**Production deployment patterns:** + +```yaml +# Kubernetes / Docker Compose: use service names +agents: + - id: weather_agent + url: http://weather-service:8001 # Kubernetes service DNS + +# External cloud services: use full domain +agents: + - id: cloud_agent + url: https://my-agent.us-east-1.amazonaws.com/v1 + +# Custom TLS (self-signed or internal CA) +overrides: + upstream_tls_ca_path: /etc/ssl/certs/internal-ca.pem +``` + +**Ports exposed by Plano's container:** +- All `port` values from your `listeners` blocks are automatically mapped +- `9901`: Envoy admin interface (for advanced debugging) +- `12001`: Plano internal management API + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/deploy-health.md
b/skills/rules/deploy-health.md new file mode 100644 index 000000000..8e948ee43 --- /dev/null +++ b/skills/rules/deploy-health.md @@ -0,0 +1,90 @@ +--- +title: Verify Listener Health Before Sending Requests +impact: MEDIUM +impactDescription: Sending requests to Plano before listeners are healthy results in connection refused errors that look like application bugs — always confirm health before testing +tags: deployment, health-checks, readiness, debugging +--- + +## Verify Listener Health Before Sending Requests + +Each Plano listener exposes a `/healthz` HTTP endpoint. `planoai up` automatically health-checks all listeners during startup (120s timeout), but in CI/CD pipelines, custom scripts, or when troubleshooting, you may need to check health manually. + +**Health check endpoints:** + +```bash +# Check model listener health (port from your config) +curl -f http://localhost:12000/healthz +# Returns 200 OK when healthy + +# Check prompt listener +curl -f http://localhost:10000/healthz + +# Check agent listener +curl -f http://localhost:8000/healthz +``` + +**Polling health in scripts (CI/CD pattern):** + +```bash +#!/bin/bash +# wait-for-plano.sh + +LISTENER_PORT=${1:-12000} +MAX_WAIT=120 +INTERVAL=2 +elapsed=0 + +echo "Waiting for Plano listener on port $LISTENER_PORT..." 
+ +until curl -sf "http://localhost:$LISTENER_PORT/healthz" > /dev/null; do + if [ $elapsed -ge $MAX_WAIT ]; then + echo "ERROR: Plano listener not healthy after ${MAX_WAIT}s" + planoai logs --debug + exit 1 + fi + sleep $INTERVAL + elapsed=$((elapsed + INTERVAL)) +done + +echo "Plano listener healthy after ${elapsed}s" +``` + +**Docker Compose health check:** + +```yaml +# docker-compose.yml for services that depend on Plano +services: + plano: + image: katanemo/plano:latest + # Plano is managed by planoai, not directly via compose in most setups + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:12000/healthz"] + interval: 5s + timeout: 3s + retries: 24 + start_period: 10s + + my-agent: + image: my-agent:latest + depends_on: + plano: + condition: service_healthy +``` + +**Debug unhealthy listeners:** + +```bash +# See startup logs +planoai logs --debug + +# Check if port is already in use +lsof -i :12000 + +# Check container status +docker ps -a --filter name=plano + +# Restart from scratch +planoai down && planoai up config.yaml --foreground +``` + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/deploy-state.md b/skills/rules/deploy-state.md new file mode 100644 index 000000000..03ce1f3d2 --- /dev/null +++ b/skills/rules/deploy-state.md @@ -0,0 +1,85 @@ +--- +title: Use PostgreSQL State Storage for Multi-Turn Conversations in Production +impact: HIGH +impactDescription: The default in-memory state storage loses all conversation history when the container restarts — production multi-turn agents require persistent PostgreSQL storage +tags: deployment, state, postgres, memory, multi-turn, production +--- + +## Use PostgreSQL State Storage for Multi-Turn Conversations in Production + +`state_storage` enables Plano to maintain conversation context across requests. Without it, each request is stateless. The `memory` type works for development and testing — all state is lost on container restart. 
Use `postgres` for any production deployment where conversation continuity matters. + +**Incorrect (memory storage in production):** + +```yaml +version: v0.3.0 + +# Memory storage — all conversations lost on planoai down / container restart +state_storage: + type: memory + +listeners: + - type: agent + name: customer_support + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: support_agent + description: Customer support assistant with conversation history. +``` + +**Correct (PostgreSQL for production persistence):** + +```yaml +version: v0.3.0 + +state_storage: + type: postgres + connection_string: "postgresql://${DB_USER}:${DB_PASS}@${DB_HOST}:5432/${DB_NAME}" + +listeners: + - type: agent + name: customer_support + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: support_agent + description: Customer support assistant with access to full conversation history. + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true +``` + +**Setting up PostgreSQL for local development:** + +```bash +# Start PostgreSQL with Docker +docker run -d \ + --name plano-postgres \ + -e POSTGRES_USER=plano \ + -e POSTGRES_PASSWORD=devpassword \ + -e POSTGRES_DB=plano \ + -p 5432:5432 \ + postgres:16 + +# Set environment variables +export DB_USER=plano +export DB_PASS=devpassword +export DB_HOST=host.docker.internal # Use host.docker.internal from inside Plano container +export DB_NAME=plano +``` + +**Production `.env` pattern:** + +```bash +DB_USER=plano_prod +DB_PASS= +DB_HOST=your-rds-endpoint.amazonaws.com +DB_NAME=plano +``` + +Plano automatically creates its state tables on first startup. The `connection_string` supports all standard PostgreSQL connection parameters including SSL: `postgresql://user:pass@host:5432/db?sslmode=require`. 
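Because the connection string is assembled from several environment variables, a quick sanity check on the expanded value can catch an empty password or a missing `sslmode` before deployment. The sketch below is an assumption-laden illustration: `check_connection_string` is a hypothetical helper, it must be given the string after `${...}` substitution, and the list of non-enforcing `sslmode` values follows the standard PostgreSQL client modes rather than anything Plano-specific.

```python
from urllib.parse import parse_qs, urlsplit

# Standard PostgreSQL sslmode values that do not guarantee an encrypted link.
NON_ENFORCING_SSLMODES = {None, "disable", "allow", "prefer"}


def check_connection_string(conn: str, require_ssl: bool = False) -> list[str]:
    """Return problems found in an already-expanded PostgreSQL URL."""
    parts = urlsplit(conn)
    problems: list[str] = []
    if parts.scheme not in ("postgresql", "postgres"):
        problems.append(f"unexpected scheme '{parts.scheme}'")
    if not parts.username:
        problems.append("missing user")
    if not parts.password:
        problems.append("missing password")
    if not parts.hostname:
        problems.append("missing host")
    if not parts.path.lstrip("/"):
        problems.append("missing database name")
    # sslmode is optional in the URL; flag it only when TLS is required.
    sslmode = parse_qs(parts.query).get("sslmode", [None])[0]
    if require_ssl and sslmode in NON_ENFORCING_SSLMODES:
        problems.append("sslmode does not enforce TLS")
    return problems
```

Calling it with `require_ssl=True` in a production pipeline and `False` locally keeps the same check usable in both environments.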
+ +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/filter-guardrails.md b/skills/rules/filter-guardrails.md new file mode 100644 index 000000000..d60bea658 --- /dev/null +++ b/skills/rules/filter-guardrails.md @@ -0,0 +1,81 @@ +--- +title: Configure Prompt Guards with Actionable Rejection Messages +impact: MEDIUM +impactDescription: A generic or empty rejection message leaves users confused about why their request was blocked and unable to rephrase appropriately +tags: filter, guardrails, jailbreak, security, ux +--- + +## Configure Prompt Guards with Actionable Rejection Messages + +Plano has built-in `prompt_guards` for detecting jailbreak attempts. When triggered, Plano returns the `on_exception.message` instead of forwarding the request. Write messages that explain the restriction and suggest what the user can do instead — both for user experience and to reduce support burden. + +**Incorrect (no message configured — returns a generic error):** + +```yaml +version: v0.3.0 + +prompt_guards: + input_guards: + jailbreak: + on_exception: {} # Empty — returns unhelpful generic error +``` + +**Incorrect (cryptic technical message):** + +```yaml +prompt_guards: + input_guards: + jailbreak: + on_exception: + message: "Error code 403: guard triggered" # Unhelpful to the user +``` + +**Correct (clear, actionable, brand-appropriate message):** + +```yaml +version: v0.3.0 + +prompt_guards: + input_guards: + jailbreak: + on_exception: + message: > + I'm not able to help with that request. This assistant is designed + to help with [your use case, e.g., customer support, coding questions]. + Please rephrase your question or contact support@yourdomain.com + if you believe this is an error. +``` + +**Combining prompt_guards with MCP filter guardrails:** + +```yaml +# Built-in jailbreak detection (fast, no external service needed) +prompt_guards: + input_guards: + jailbreak: + on_exception: + message: "This request cannot be processed. 
Please ask about our products and services." + +# MCP-based custom guards for additional policy enforcement +filters: + - id: topic_restriction + url: http://host.docker.internal:10500 + type: mcp + transport: streamable-http + tool: topic_restriction # Custom filter for domain-specific restrictions + +listeners: + - type: agent + name: customer_support + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: support_agent + description: Customer support assistant for product questions and order issues. + filter_chain: + - topic_restriction # Additional custom topic filtering +``` + +`prompt_guards` applies globally to all listeners. Use `filter_chain` on individual agents for per-agent policies. + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/filter-mcp.md b/skills/rules/filter-mcp.md new file mode 100644 index 000000000..c2d02efdb --- /dev/null +++ b/skills/rules/filter-mcp.md @@ -0,0 +1,59 @@ +--- +title: Configure MCP Filters with Explicit Type and Transport +impact: MEDIUM +impactDescription: Omitting type and transport fields relies on defaults that may not match your MCP server's protocol implementation +tags: filter, mcp, integration, configuration +--- + +## Configure MCP Filters with Explicit Type and Transport + +Plano filters integrate with external services via MCP (Model Context Protocol) or plain HTTP. MCP filters call a specific tool on a remote MCP server. Always specify `type`, `transport`, and optionally `tool` (defaults to the filter `id`) to ensure Plano connects correctly to your filter implementation. 
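To see which fields a filter definition leaves implicit, the defaulting described here can be sketched as follows. The helper `resolve_filter` is hypothetical, and the default values (`type: mcp`, `transport: streamable-http`, `tool` equal to the filter `id`) are taken from this section's description rather than from Plano's source, so treat them as assumptions.

```python
def resolve_filter(filter_def: dict) -> tuple[dict, list[str]]:
    """Return the filter with assumed defaults applied, plus the implicit fields."""
    resolved = dict(filter_def)
    implicit: list[str] = []
    if "type" not in resolved:
        resolved["type"] = "mcp"  # assumed default
        implicit.append("type")
    # transport and tool only apply to MCP filters in this sketch.
    if resolved["type"] == "mcp":
        if "transport" not in resolved:
            resolved["transport"] = "streamable-http"  # assumed default
            implicit.append("transport")
        if "tool" not in resolved:
            resolved["tool"] = resolved["id"]  # tool defaults to the filter id
            implicit.append("tool")
    return resolved, implicit
```

A non-empty second element is a hint that the definition relies on defaults and may be worth making explicit.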
+ +**Incorrect (minimal filter definition relying on all defaults):** + +```yaml +filters: + - id: my_guard # Plano infers type=mcp, transport=streamable-http, tool=my_guard + url: http://localhost:10500 + # If your MCP server uses a different tool name or transport, this silently misroutes +``` + +**Correct (explicit configuration for each filter):** + +```yaml +version: v0.3.0 + +filters: + - id: input_guards + url: http://host.docker.internal:10500 + type: mcp # Explicitly MCP protocol + transport: streamable-http # Streamable HTTP transport + tool: input_guards # MCP tool name (matches MCP server registration) + + - id: query_rewriter + url: http://host.docker.internal:10501 + type: mcp + transport: streamable-http + tool: rewrite_query # Tool name differs from filter ID — explicit is safer + + - id: custom_validator + url: http://host.docker.internal:10503 + type: http # Plain HTTP filter (not MCP) + # No tool field for HTTP filters +``` + +**MCP filter implementation contract:** +Your MCP server must expose a tool matching the `tool` name. 
The tool receives the request payload and must return either: +- A modified request (to pass through with changes) +- A rejection response (to short-circuit the pipeline) + +**HTTP filter alternative** — use `type: http` for simpler request/response interceptors that don't need the MCP protocol: + +```yaml +filters: + - id: auth_validator + url: http://host.docker.internal:9000/validate + type: http # Plano POSTs the request, expects the modified request back +``` + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/filter-ordering.md b/skills/rules/filter-ordering.md new file mode 100644 index 000000000..ad2d0d7b2 --- /dev/null +++ b/skills/rules/filter-ordering.md @@ -0,0 +1,78 @@ +--- +title: Order Filter Chains with Guards First, Enrichment Last +impact: HIGH +impactDescription: Running context builders before input guards means jailbreak attempts get RAG-enriched context before being blocked — wasting compute and risking data exposure +tags: filter, guardrails, security, pipeline, ordering +--- + +## Order Filter Chains with Guards First, Enrichment Last + +A `filter_chain` is an ordered list of filter IDs applied sequentially to each request. The order is semantically meaningful: each filter receives the output of the previous one. Safety and validation filters must run first to short-circuit bad requests before expensive enrichment filters process them. + +**Recommended filter chain order:** + +1. **Input guards** — jailbreak detection, PII detection, topic restrictions (reject early) +2. **Query rewriting** — normalize or enhance the user query +3. **Context building** — RAG retrieval, tool lookup, knowledge injection (expensive) +4. 
**Output guards** — validate or sanitize LLM response before returning + +**Incorrect (context built before guards — wasteful and potentially unsafe):** + +```yaml +filters: + - id: context_builder + url: http://host.docker.internal:10502 # Runs expensive RAG retrieval first + - id: query_rewriter + url: http://host.docker.internal:10501 + - id: input_guards + url: http://host.docker.internal:10500 # Guards run last — jailbreak gets context + +listeners: + - type: agent + name: rag_orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: rag_agent + filter_chain: + - context_builder # Wrong: expensive enrichment before safety check + - query_rewriter + - input_guards +``` + +**Correct (guards block bad requests before any enrichment):** + +```yaml +version: v0.3.0 + +filters: + - id: input_guards + url: http://host.docker.internal:10500 + type: mcp + transport: streamable-http + - id: query_rewriter + url: http://host.docker.internal:10501 + type: mcp + transport: streamable-http + - id: context_builder + url: http://host.docker.internal:10502 + type: mcp + transport: streamable-http + +listeners: + - type: agent + name: rag_orchestrator + port: 8000 + router: plano_orchestrator_v1 + agents: + - id: rag_agent + description: Answers questions using internal knowledge base documents. + filter_chain: + - input_guards # 1. Block jailbreaks and policy violations + - query_rewriter # 2. Normalize the safe query + - context_builder # 3. Retrieve relevant context for the clean query +``` + +Different agents within the same listener can have different filter chains — a public-facing agent may need all guards while an internal admin agent may skip them. 
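The recommended ordering can be checked mechanically once each filter is labelled with its stage. The sketch below assumes a hand-written mapping from filter ID to stage; the stage names (`input_guard`, `rewrite`, `context`, `output_guard`) and the helper `order_violations` are hypothetical labels for illustration, since Plano's config carries no such field.

```python
# Recommended stage order from this section, lowest rank runs first.
STAGE_ORDER = {"input_guard": 0, "rewrite": 1, "context": 2, "output_guard": 3}


def order_violations(chain: list[str], stage_of: dict[str, str]) -> list[str]:
    """Report filters that run after a filter from a later stage."""
    violations: list[str] = []
    highest_rank = -1
    highest_id = ""
    for filter_id in chain:
        rank = STAGE_ORDER[stage_of[filter_id]]
        if rank < highest_rank:
            # An earlier-stage filter appears after a later-stage one.
            violations.append(f"{filter_id} runs after {highest_id}")
        else:
            highest_rank, highest_id = rank, filter_id
    return violations
```

Applied to the two chains in this section, the guards-first chain produces no violations, while the chain that starts with `context_builder` is flagged for both `query_rewriter` and `input_guards`.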
+ +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/observe-span-attributes.md b/skills/rules/observe-span-attributes.md new file mode 100644 index 000000000..a90b3006f --- /dev/null +++ b/skills/rules/observe-span-attributes.md @@ -0,0 +1,80 @@ +--- +title: Add Custom Span Attributes for Correlation and Filtering +impact: MEDIUM +impactDescription: Without custom span attributes, traces cannot be filtered by user, session, or environment — making production debugging significantly harder +tags: observability, tracing, span-attributes, correlation +--- + +## Add Custom Span Attributes for Correlation and Filtering + +Plano can automatically extract HTTP request headers and attach them as span attributes, plus attach static key-value pairs to every span. This enables filtering traces by user, session, tenant, environment, or any other dimension that matters to your application. + +**Incorrect (no span attributes — traces are unfiltered blobs):** + +```yaml +tracing: + random_sampling: 20 + # No span_attributes — cannot filter by user, session, or environment +``` + +**Correct (rich span attributes for production correlation):** + +```yaml +version: v0.3.0 + +tracing: + random_sampling: 20 + trace_arch_internal: true + + span_attributes: + # Match all headers with this prefix, then map to span attributes by: + # 1) stripping the prefix and 2) converting hyphens to dots + header_prefixes: + - x-katanemo- + + # Static attributes added to every span from this Plano instance + static: + environment: production + service.name: plano-gateway + deployment.region: us-east-1 + service.version: "2.1.0" + team: platform-engineering +``` + +**Sending correlation headers from client code:** + +```python +import httpx + +response = httpx.post( + "http://localhost:12000/v1/chat/completions", + headers={ + "x-katanemo-request-id": "req_abc123", + "x-katanemo-user-id": "usr_12", + "x-katanemo-session-id": "sess_xyz456", + "x-katanemo-tenant-id": "acme-corp", + 
}, + json={"model": "plano.v1", "messages": [...]} +) +``` + +**Querying by custom attribute:** + +```bash +# Find all requests from a specific user +planoai trace --where user.id=usr_12 + +# Find all traces from production environment +planoai trace --where environment=production + +# Find traces from a specific tenant +planoai trace --where tenant.id=acme-corp +``` + +Header prefix matching is a prefix match. With `x-katanemo-`, these mappings apply: + +- `x-katanemo-user-id` -> `user.id` +- `x-katanemo-tenant-id` -> `tenant.id` +- `x-katanemo-request-id` -> `request.id` + +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) diff --git a/skills/rules/observe-trace-query.md b/skills/rules/observe-trace-query.md new file mode 100644 index 000000000..a7ef7db7a --- /dev/null +++ b/skills/rules/observe-trace-query.md @@ -0,0 +1,85 @@ +--- +title: Use `planoai trace` to Inspect Routing Decisions +impact: MEDIUM-HIGH +impactDescription: The trace CLI lets you verify which model was selected, why, and how long each step took — without setting up a full OTEL backend +tags: observability, tracing, cli, debugging, routing +--- + +## Use `planoai trace` to Inspect Routing Decisions + +`planoai trace` provides a built-in trace viewer backed by an in-memory OTEL collector. Use it to inspect routing decisions, verify preference matching, measure filter latency, and debug failed requests — all from the CLI without configuring Jaeger, Zipkin, or another backend. + +**Workflow: start collector, run requests, then inspect traces:** + +```bash +# 1. Start Plano with the built-in trace collector (recommended) +planoai up config.yaml --with-tracing + +# 2. Send test requests through Plano +curl http://localhost:12000/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{"model": "plano.v1", "messages": [{"role": "user", "content": "Write a Python function to sort a list"}]}' + +# 3. 
Show the latest trace +planoai trace +``` + +You can also run the trace listener directly: + +```bash +planoai trace listen # available on a process ID running OTEL collector +``` + +Stop the background trace listener: + +```bash +planoai trace down +``` + +**Useful trace viewer patterns:** + +```bash +# Show latest trace (default target is "last") +planoai trace + +# List available trace IDs +planoai trace --list + +# Show all traces +planoai trace any + +# Show a specific trace (short 8-char or full 32-char ID) +planoai trace 7f4e9a1c +planoai trace 7f4e9a1c0d9d4a0bb9bf5a8a7d13f62a + +# Filter by specific span attributes (AND semantics for repeated --where) +planoai trace any --where llm.model=gpt-4o-mini + +# Filter by user ID (if header prefix is x-katanemo-, x-katanemo-user-id maps to user.id) +planoai trace any --where user.id=user_123 + +# Limit results for a quick sanity check +planoai trace any --limit 5 + +# Time window filter +planoai trace any --since 30m + +# Filter displayed attributes by key pattern +planoai trace any --filter "http.*" + +# Output machine-readable JSON +planoai trace any --json +``` + +**What to look for in traces:** + + +| Span name | What it tells you | +| ------------------- | ------------------------------------------------------------- | +| `plano.routing` | Which routing preference matched and which model was selected | +| `plano.filter.` | How long each filter in the chain took | +| `plano.llm.request` | Time to first token and full response time | +| `plano.agent.route` | Which agent description matched for agent listeners | + + +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) diff --git a/skills/rules/observe-tracing.md b/skills/rules/observe-tracing.md new file mode 100644 index 000000000..93b9c0038 --- /dev/null +++ b/skills/rules/observe-tracing.md @@ -0,0 +1,80 @@ +--- +title: Enable Tracing with Appropriate Sampling for Your Environment +impact: HIGH +impactDescription: Without 
tracing enabled, debugging routing decisions, latency issues, and model selection is guesswork — traces are the primary observability primitive in Plano +tags: observability, tracing, opentelemetry, otel, debugging +--- + +## Enable Tracing with Appropriate Sampling for Your Environment + +Plano emits OpenTelemetry (OTEL) traces for every request, capturing routing decisions, LLM provider selection, filter chain execution, and response latency. Traces are the best tool for understanding why a request was routed to a particular model and debugging unexpected behavior. + +**Incorrect (no tracing configured — flying blind in production):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + +# No tracing block — no visibility into routing, latency, or errors +``` + +**Correct (tracing enabled with environment-appropriate sampling):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true + +tracing: + random_sampling: 100 # 100% for development/debugging + trace_arch_internal: true # Include Plano's internal routing spans +``` + +**Production configuration (sampled to control volume):** + +```yaml +tracing: + random_sampling: 10 # Sample 10% of requests in production + trace_arch_internal: false # Skip internal spans to reduce noise + span_attributes: + header_prefixes: + - x-katanemo- # Match all x-katanemo-* headers + static: + environment: production + service.name: my-plano-service + version: "1.0.0" +``` + +With `x-katanemo-` configured, Plano maps headers to attributes by stripping the prefix and converting hyphens to dots: + +- `x-katanemo-user-id` -> `user.id` +- `x-katanemo-session-id` -> `session.id` +- `x-katanemo-request-id` -> `request.id` + +**Starting the trace collector:** + +```bash +# 
Start Plano with built-in OTEL collector +planoai up config.yaml --with-tracing +``` + +Sampling rates: 100% for dev/staging, 5–20% for high-traffic production, 100% for low-traffic production. `trace_arch_internal: true` adds spans showing which routing preference matched — essential for debugging preference configuration. + +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) diff --git a/skills/rules/routing-aliases.md b/skills/rules/routing-aliases.md new file mode 100644 index 000000000..91f0b31a9 --- /dev/null +++ b/skills/rules/routing-aliases.md @@ -0,0 +1,77 @@ +--- +title: Use Model Aliases for Semantic, Stable Model References +impact: MEDIUM +impactDescription: Hardcoded model names in client code require code changes when you swap providers; aliases let you update routing in config.yaml alone +tags: routing, model-aliases, maintainability, client-integration +--- + +## Use Model Aliases for Semantic, Stable Model References + +`model_aliases` map human-readable names to specific model identifiers. Client applications reference the alias, not the underlying model. When you want to upgrade from `gpt-4o` to a new model, you change one line in `config.yaml` — not every client calling the API. + +**Incorrect (clients hardcode specific model names):** + +```yaml +# config.yaml — no aliases defined +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + default: true +``` + +```python +# Client code — brittle, must be updated when model changes +client.chat.completions.create(model="gpt-4o", ...) 
+``` + +**Correct (semantic aliases, stable client contracts):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + - model: anthropic/claude-sonnet-4-20250514 + access_key: $ANTHROPIC_API_KEY + +model_aliases: + plano.fast.v1: + target: gpt-4o-mini # Cheap, fast — for high-volume tasks + + plano.smart.v1: + target: gpt-4o # High capability — for complex reasoning + + plano.creative.v1: + target: claude-sonnet-4-20250514 # Strong creative writing and analysis + + plano.v1: + target: gpt-4o # Default production alias +``` + +```python +# Client code — stable, alias is the contract +client.chat.completions.create(model="plano.smart.v1", ...) +``` + +**Alias naming conventions:** +- `..` — e.g., `plano.fast.v1`, `acme.code.v2` +- Bumping `.v2` → `.v3` lets you run old and new aliases simultaneously during rollouts +- `plano.v1` as a canonical default gives clients a single stable entry point + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/routing-default.md b/skills/rules/routing-default.md new file mode 100644 index 000000000..f23e73570 --- /dev/null +++ b/skills/rules/routing-default.md @@ -0,0 +1,70 @@ +--- +title: Always Set Exactly One Default Model Provider +impact: HIGH +impactDescription: Without a default provider, Plano has no fallback when routing preferences do not match — requests with unclassified intent will fail +tags: routing, defaults, model-providers, reliability +--- + +## Always Set Exactly One Default Model Provider + +When a request does not match any routing preference, Plano forwards it to the `default: true` provider. Without a default, unmatched requests fail. If multiple providers are marked `default: true`, Plano uses the first one — which can produce unexpected behavior. 
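The fallback behavior described above can be modeled in a few lines. This is a minimal sketch that assumes the config has already been parsed into a list of provider dicts; `pick_provider` and its `matched_preference` argument are hypothetical names used for illustration and are not part of Plano's router.

```python
def pick_provider(model_providers, matched_preference=None):
    """Model the documented fallback; this is not Plano's actual router code."""
    if matched_preference is not None:
        for provider in model_providers:
            names = [pref["name"] for pref in provider.get("routing_preferences", [])]
            if matched_preference in names:
                return provider["model"]
    # No preference matched: fall back to the first provider marked default.
    for provider in model_providers:
        if provider.get("default"):
            return provider["model"]
    raise LookupError("no routing preference matched and no default provider is set")


# Hypothetical parsed config, mirroring the YAML examples in this rule.
providers = [
    {"model": "openai/gpt-4o-mini", "default": True,
     "routing_preferences": [{"name": "summarization"}]},
    {"model": "openai/gpt-4o",
     "routing_preferences": [{"name": "code_generation"}]},
]

print(pick_provider(providers, "code_generation"))
print(pick_provider(providers))
```

Because the loop returns the first provider flagged as default, a second `default: true` entry would be silently ignored in this model, which is why the rule asks for exactly one.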
+ +**Incorrect (no default provider set):** + +```yaml +version: v0.3.0 + +model_providers: + - model: openai/gpt-4o-mini # No default: true anywhere + access_key: $OPENAI_API_KEY + routing_preferences: + - name: summarization + description: Summarizing documents and extracting key points + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: code_generation + description: Writing new functions and implementing algorithms +``` + +**Incorrect (multiple defaults — ambiguous):** + +```yaml +model_providers: + - model: openai/gpt-4o-mini + default: true # First default + access_key: $OPENAI_API_KEY + + - model: openai/gpt-4o + default: true # Second default — confusing + access_key: $OPENAI_API_KEY +``` + +**Correct (exactly one default, covering unmatched requests):** + +```yaml +version: v0.3.0 + +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true # Handles general/unclassified requests + routing_preferences: + - name: summarization + description: Summarizing documents, articles, and meeting notes + - name: classification + description: Categorizing inputs, labeling, and intent detection + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: code_generation + description: Writing, debugging, and reviewing code + - name: complex_reasoning + description: Multi-step math, logical analysis, research synthesis +``` + +Choose your most cost-effective capable model as the default — it handles all traffic that doesn't match specialized preferences. 
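A pre-deployment check for this rule could look like the sketch below. It assumes `model_providers` has already been loaded from `config.yaml` into a list of dicts (for example with a YAML parser of your choice); `check_default_provider` is a hypothetical helper, not a `planoai` command.

```python
def check_default_provider(model_providers):
    """Return a list of problems with the default flags; empty means the rule holds."""
    defaults = [
        provider.get("model", "<unnamed>")
        for provider in model_providers
        if provider.get("default") is True
    ]
    if not defaults:
        return ["no provider is marked default: true; unmatched requests would fail"]
    if len(defaults) > 1:
        return ["multiple providers are marked default: true: " + ", ".join(defaults)]
    return []


# Hypothetical parsed config with exactly one default provider.
good_config = [
    {"model": "openai/gpt-4o-mini", "default": True},
    {"model": "openai/gpt-4o"},
]
print(check_default_provider(good_config))
```

Returning messages instead of raising lets a CI step print every problem before failing the build.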
+ +Reference: [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw) diff --git a/skills/rules/routing-passthrough.md b/skills/rules/routing-passthrough.md new file mode 100644 index 000000000..ff9fbaf90 --- /dev/null +++ b/skills/rules/routing-passthrough.md @@ -0,0 +1,69 @@ +--- +title: Use Passthrough Auth for Proxy and Multi-Tenant Setups +impact: MEDIUM +impactDescription: Without passthrough auth, self-hosted proxy services (LiteLLM, vLLM, etc.) reject Plano's requests because the wrong Authorization header is sent +tags: routing, authentication, proxy, litellm, multi-tenant +--- + +## Use Passthrough Auth for Proxy and Multi-Tenant Setups + +When routing to a self-hosted LLM proxy (LiteLLM, vLLM, OpenRouter, Azure APIM) or in multi-tenant setups where clients supply their own keys, set `passthrough_auth: true`. This forwards the client's `Authorization` header rather than Plano's configured `access_key`. Combine with a `base_url` pointing to the proxy. + +**Incorrect (Plano sends its own key to a proxy that expects the client's key):** + +```yaml +model_providers: + - model: custom/proxy + base_url: http://host.docker.internal:8000 + access_key: $SOME_KEY # Plano overwrites the client's auth — proxy rejects it +``` + +**Correct (forward client Authorization header to the proxy):** + +```yaml +version: v0.3.0 + +listeners: + - type: model + name: model_listener + port: 12000 + +model_providers: + - model: custom/litellm-proxy + base_url: http://host.docker.internal:4000 # LiteLLM server + provider_interface: openai # LiteLLM uses OpenAI format + passthrough_auth: true # Forward client's Bearer token + default: true +``` + +**Multi-tenant pattern (client supplies their own API key):** + +```yaml +model_providers: + # Plano acts as a passthrough gateway; each client has their own OpenAI key + - model: openai/gpt-4o + passthrough_auth: true # No access_key here — client's key is forwarded + default: true +``` + +**Combined: proxy for some 
models, Plano-managed for others:** + +```yaml +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY # Plano manages this key + default: true + routing_preferences: + - name: quick tasks + description: Short answers, simple lookups, fast completions + + - model: custom/vllm-llama + base_url: http://gpu-server:8000 + provider_interface: openai + passthrough_auth: true # vLLM cluster handles its own auth + routing_preferences: + - name: long context + description: Processing very long documents, multi-document analysis +``` + +Reference: https://github.com/katanemo/archgw diff --git a/skills/rules/routing-preferences.md b/skills/rules/routing-preferences.md new file mode 100644 index 000000000..571a3acd7 --- /dev/null +++ b/skills/rules/routing-preferences.md @@ -0,0 +1,73 @@ +--- +title: Write Task-Specific Routing Preference Descriptions +impact: HIGH +impactDescription: Vague preference descriptions cause Plano's internal router LLM to misclassify requests, routing expensive tasks to cheap models and vice versa +tags: routing, model-selection, preferences, llm-routing +--- + +## Write Task-Specific Routing Preference Descriptions + +Plano's `plano_orchestrator_v1` router uses a 1.5B preference-aligned LLM to classify incoming requests against your `routing_preferences` descriptions. It routes the request to the first provider whose preferences match. Description quality directly determines routing accuracy. + +**Incorrect (vague, overlapping descriptions):** + +```yaml +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true + routing_preferences: + - name: simple + description: easy tasks # Too vague — what is "easy"? 
+ + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: hard + description: hard tasks # Too vague — overlaps with "easy" +``` + +**Correct (specific, distinct task descriptions):** + +```yaml +model_providers: + - model: openai/gpt-4o-mini + access_key: $OPENAI_API_KEY + default: true + routing_preferences: + - name: summarization + description: > + Summarizing documents, articles, emails, or meeting transcripts. + Extracting key points, generating TL;DR sections, condensing long text. + - name: classification + description: > + Categorizing inputs, sentiment analysis, spam detection, + intent classification, labeling structured data fields. + - name: translation + description: > + Translating text between languages, localization tasks. + + - model: openai/gpt-4o + access_key: $OPENAI_API_KEY + routing_preferences: + - name: code_generation + description: > + Writing new functions, classes, or modules from scratch. + Implementing algorithms, boilerplate generation, API integrations. + - name: code_review + description: > + Reviewing code for bugs, security vulnerabilities, performance issues. + Suggesting refactors, explaining complex code, debugging errors. + - name: complex_reasoning + description: > + Multi-step math problems, logical deduction, strategic planning, + research synthesis requiring chain-of-thought reasoning. 
+``` + +**Key principles for good preference descriptions:** +- Use concrete action verbs: "writing", "reviewing", "translating", "summarizing" +- List 3–5 specific sub-tasks or synonyms for each preference +- Ensure preferences across providers are mutually exclusive in scope +- Test with representative queries using `planoai trace` and `--where` filters to verify routing decisions + +Reference: https://github.com/katanemo/archgw diff --git a/skills/src/build.ts b/skills/src/build.ts new file mode 100644 index 000000000..5d4640f1f --- /dev/null +++ b/skills/src/build.ts @@ -0,0 +1,262 @@ +#!/usr/bin/env node + +import { readFileSync, writeFileSync, readdirSync } from "node:fs"; +import { join, dirname } from "node:path"; +import { fileURLToPath } from "node:url"; + +type Section = { + prefix: string; + number: number; + title: string; + description: string; +}; + +type Rule = { + file: string; + title: string; + impact: string; + impactDescription: string; + tags: string[]; + body: string; + section: Section; +}; + +type ParsedFrontmatter = { + frontmatter: Record<string, string>; + body: string; +}; + +type Metadata = { + abstract: string; + version: string; + organization: string; +}; + +const __dirname = dirname(fileURLToPath(import.meta.url)); +const RULES_DIR = join(__dirname, "..", "rules"); +const OUTPUT_FILE = join(__dirname, "..", "AGENTS.md"); +const METADATA_FILE = join(__dirname, "..", "metadata.json"); + +const SECTIONS: Section[] = [ + { + prefix: "config-", + number: 1, + title: "Configuration Fundamentals", + description: + "Core config.yaml structure, versioning, listener types, and provider setup — the entry point for every Plano deployment.", + }, + { + prefix: "routing-", + number: 2, + title: "Routing & Model Selection", + description: + "Intelligent LLM routing using preferences, aliases, and defaults to match tasks to the best model.", + }, + { + prefix: "agent-", + number: 3, + title: "Agent Orchestration", + description: + "Multi-agent patterns, agent 
descriptions, and orchestration strategies for building agentic applications.", + }, + { + prefix: "filter-", + number: 4, + title: "Filter Chains & Guardrails", + description: + "Request/response processing pipelines — ordering, MCP integration, and safety guardrails.", + }, + { + prefix: "observe-", + number: 5, + title: "Observability & Debugging", + description: + "OpenTelemetry tracing, log levels, span attributes, and sampling for production visibility.", + }, + { + prefix: "cli-", + number: 6, + title: "CLI Operations", + description: + "Using the planoai CLI for startup, tracing, CLI agents, project init, and code generation.", + }, + { + prefix: "deploy-", + number: 7, + title: "Deployment & Security", + description: + "Docker deployment, environment variable management, health checks, and state storage for production.", + }, + { + prefix: "advanced-", + number: 8, + title: "Advanced Patterns", + description: + "Prompt targets, external API integration, rate limiting, and multi-listener architectures.", + }, +]; + +function parseFrontmatter(content: string): ParsedFrontmatter | null { + const match = content.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/); + if (!match) return null; + + const frontmatter: Record<string, string> = {}; + const lines = match[1].split("\n"); + for (const line of lines) { + const colonIdx = line.indexOf(":"); + if (colonIdx === -1) continue; + const key = line.slice(0, colonIdx).trim(); + const value = line.slice(colonIdx + 1).trim(); + frontmatter[key] = value; + } + + return { + frontmatter, + body: match[2].trim(), + }; +} + +function inferSection(filename: string): Section | null { + for (const section of SECTIONS) { + if (filename.startsWith(section.prefix)) { + return section; + } + } + return null; +} + +function main(): void { + const metadata = JSON.parse(readFileSync(METADATA_FILE, "utf-8")) as Metadata; + + const files = readdirSync(RULES_DIR) + .filter((f) => f.endsWith(".md") && !f.startsWith("_")) + .sort(); + + 
const sectionRules = 
new Map<number, Rule[]>(); + for (const section of SECTIONS) { + sectionRules.set(section.number, []); + } + + let parseErrors = 0; + + for (const file of files) { + const content = readFileSync(join(RULES_DIR, file), "utf-8"); + const parsed = parseFrontmatter(content); + + if (!parsed) { + console.error(`ERROR: Could not parse frontmatter in ${file}`); + parseErrors++; + continue; + } + + const section = inferSection(file); + if (!section) { + console.warn(`WARN: No section found for ${file} — skipping`); + continue; + } + + const rule: Rule = { + file, + title: parsed.frontmatter.title ?? file, + impact: parsed.frontmatter.impact ?? "MEDIUM", + impactDescription: parsed.frontmatter.impactDescription ?? "", + tags: parsed.frontmatter.tags + ? parsed.frontmatter.tags.split(",").map((t) => t.trim()) + : [], + body: parsed.body, + section, + }; + sectionRules.get(section.number)?.push(rule); + } + + if (parseErrors > 0) { + console.error(`\nBuild failed: ${parseErrors} file(s) had parse errors.`); + process.exit(1); + } + + for (const [, rules] of sectionRules) { + rules.sort((a, b) => a.title.localeCompare(b.title)); + } + + const lines: string[] = []; + lines.push(`# Plano Agent Skills`); + lines.push(``); + lines.push(`> ${metadata.abstract}`); + lines.push(``); + lines.push( + `**Version:** ${metadata.version} | **Organization:** ${metadata.organization}` + ); + lines.push(``); + lines.push(`---`); + lines.push(``); + + lines.push(`## Table of Contents`); + lines.push(``); + for (const section of SECTIONS) { + const rules = sectionRules.get(section.number) ?? 
[]; + if (rules.length === 0) continue; + lines.push( + `- [Section ${section.number}: ${section.title}](#section-${section.number})` + ); + for (let i = 0; i < rules.length; i++) { + const rule = rules[i]; + const id = `${section.number}.${i + 1}`; + const anchor = rule.title + .toLowerCase() + .replace(/[^a-z0-9\s-]/g, "") + .replace(/\s+/g, "-"); + lines.push(` - [${id} ${rule.title}](#${anchor})`); + } + } + lines.push(``); + lines.push(`---`); + lines.push(``); + + for (const section of SECTIONS) { + const rules = sectionRules.get(section.number) ?? []; + if (rules.length === 0) continue; + + lines.push(`## Section ${section.number}: ${section.title}`); + lines.push(``); + lines.push(`*${section.description}*`); + lines.push(``); + + for (let i = 0; i < rules.length; i++) { + const rule = rules[i]; + const id = `${section.number}.${i + 1}`; + + lines.push(`### ${id} ${rule.title}`); + lines.push(``); + lines.push( + `**Impact:** \`${rule.impact}\`${rule.impactDescription ? ` — ${rule.impactDescription}` : ""}` + ); + if (rule.tags.length > 0) { + lines.push(`**Tags:** ${rule.tags.map((t) => `\`${t}\``).join(", ")}`); + } + lines.push(``); + lines.push(rule.body); + lines.push(``); + lines.push(`---`); + lines.push(``); + } + } + + lines.push(`*Generated from individual rule files in \`rules/\`.*`); + lines.push( + `*To contribute, see [CONTRIBUTING](https://github.com/katanemo/archgw/blob/main/CONTRIBUTING.md).*` + ); + + writeFileSync(OUTPUT_FILE, lines.join("\n"), "utf-8"); + + let totalRules = 0; + for (const section of SECTIONS) { + const rules = sectionRules.get(section.number) ?? 
[]; + if (rules.length > 0) { + console.log(` Section ${section.number}: ${rules.length} rules`); + totalRules += rules.length; + } + } + console.log(`\nBuilt AGENTS.md with ${totalRules} rules.`); +} + +main(); diff --git a/skills/src/extract-tests.ts b/skills/src/extract-tests.ts new file mode 100644 index 000000000..b7d03b615 --- /dev/null +++ b/skills/src/extract-tests.ts @@ -0,0 +1,147 @@ +#!/usr/bin/env node + +import { readFileSync, writeFileSync, readdirSync } from "node:fs"; +import { join, dirname } from "node:path"; +import { fileURLToPath } from "node:url"; + +type ParsedFrontmatter = { + frontmatter: Record<string, string>; + body: string; +}; + +type SectionPrefix = { + prefix: string; + number: number; + title: string; +}; + +type ExampleExtraction = { + incorrect: string | null; + correct: string | null; +}; + +type TestCaseEntry = { + id: string; + section: number; + sectionTitle: string; + title: string; + impact: string; + tags: string[]; + testCase: { + description: string; + input: string | null; + expected: string | null; + evaluationPrompt: string; + }; +}; + +const __dirname = dirname(fileURLToPath(import.meta.url)); +const RULES_DIR = join(__dirname, "..", "rules"); +const OUTPUT_FILE = join(__dirname, "..", "test-cases.json"); + +const SECTION_PREFIXES: SectionPrefix[] = [ + { prefix: "config-", number: 1, title: "Configuration Fundamentals" }, + { prefix: "routing-", number: 2, title: "Routing & Model Selection" }, + { prefix: "agent-", number: 3, title: "Agent Orchestration" }, + { prefix: "filter-", number: 4, title: "Filter Chains & Guardrails" }, + { prefix: "observe-", number: 5, title: "Observability & Debugging" }, + { prefix: "cli-", number: 6, title: "CLI Operations" }, + { prefix: "deploy-", number: 7, title: "Deployment & Security" }, + { prefix: "advanced-", number: 8, title: "Advanced Patterns" }, +]; + +function parseFrontmatter(content: string): ParsedFrontmatter | null { + const match = 
content.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/); 
+ if (!match) return null; + + const frontmatter: Record<string, string> = {}; + const lines = match[1].split("\n"); + for (const line of lines) { + const colonIdx = line.indexOf(":"); + if (colonIdx === -1) continue; + const key = line.slice(0, colonIdx).trim(); + const value = line.slice(colonIdx + 1).trim(); + frontmatter[key] = value; + } + + return { frontmatter, body: match[2].trim() }; +} + +function extractCodeBlocks(text: string): string[] { + const blocks: string[] = []; + const regex = /```(?:yaml|bash|python|typescript|json|sh)?\n([\s\S]*?)```/g; + let match: RegExpExecArray | null; + do { + match = regex.exec(text); + if (match) { + blocks.push(match[1].trim()); + } + } while (match !== null); + return blocks; +} + +function extractExamples(body: string): ExampleExtraction { + const incorrectMatch = body.match( + /\*\*Incorrect[^*]*\*\*[:\s]*([\s\S]*?)(?=\*\*Correct|\*\*Key|$)/ + ); + const correctMatch = body.match( + /\*\*Correct[^*]*\*\*[:\s]*([\s\S]*?)(?=\*\*Incorrect|\*\*Key|\*\*Note|Reference:|$)/ + ); + + return { + incorrect: incorrectMatch + ? extractCodeBlocks(incorrectMatch[1]).join("\n\n") + : null, + correct: correctMatch ? 
extractCodeBlocks(correctMatch[1]).join("\n\n") : null, + }; +} + +function inferSection(filename: string): SectionPrefix | null { + for (const s of SECTION_PREFIXES) { + if (filename.startsWith(s.prefix)) return s; + } + return null; +} + +function main(): void { + const files = readdirSync(RULES_DIR) + .filter((f) => f.endsWith(".md") && !f.startsWith("_")) + .sort(); + + const testCases: TestCaseEntry[] = []; + + for (const file of files) { + const content = readFileSync(join(RULES_DIR, file), "utf-8"); + const parsed = parseFrontmatter(content); + if (!parsed) continue; + + const { frontmatter, body } = parsed; + const section = inferSection(file); + if (!section) continue; + + const { incorrect, correct } = extractExamples(body); + if (!incorrect && !correct) continue; + + testCases.push({ + id: file.replace(".md", ""), + section: section.number, + sectionTitle: section.title, + title: frontmatter.title ?? file, + impact: frontmatter.impact ?? "MEDIUM", + tags: frontmatter.tags + ? 
frontmatter.tags.split(",").map((t) => t.trim()) + : [], + testCase: { + description: `Detect and fix: "${frontmatter.title}"`, + input: incorrect, + expected: correct, + evaluationPrompt: `Given the following Plano config or CLI usage, identify if it violates the rule "${frontmatter.title}" and explain how to fix it.`, + }, + }); + } + + writeFileSync(OUTPUT_FILE, JSON.stringify(testCases, null, 2), "utf-8"); + console.log(`Extracted ${testCases.length} test cases to test-cases.json`); +} + +main(); diff --git a/skills/src/validate.ts b/skills/src/validate.ts new file mode 100644 index 000000000..4fdf46ea7 --- /dev/null +++ b/skills/src/validate.ts @@ -0,0 +1,156 @@ +#!/usr/bin/env node + +import { readFileSync, readdirSync } from "node:fs"; +import { join, dirname } from "node:path"; +import { fileURLToPath } from "node:url"; + +type ParsedFrontmatter = { + frontmatter: Record<string, string>; + body: string; +}; + +type ValidationResult = { + errors: string[]; + warnings: string[]; +}; + +const __dirname = dirname(fileURLToPath(import.meta.url)); +const RULES_DIR = join(__dirname, "..", "rules"); + +const VALID_IMPACTS = [ + "CRITICAL", + "HIGH", + "MEDIUM-HIGH", + "MEDIUM", + "LOW-MEDIUM", + "LOW", +] as const; + +const SECTION_PREFIXES = [ + "config-", + "routing-", + "agent-", + "filter-", + "observe-", + "cli-", + "deploy-", + "advanced-", +]; + +function parseFrontmatter(content: string): ParsedFrontmatter | null { + const match = content.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/); + if (!match) return null; + + const frontmatter: Record<string, string> = {}; + const lines = match[1].split("\n"); + for (const line of lines) { + const colonIdx = line.indexOf(":"); + if (colonIdx === -1) continue; + const key = line.slice(0, colonIdx).trim(); + const value = line.slice(colonIdx + 1).trim(); + frontmatter[key] = value; + } + + return { frontmatter, body: match[2].trim() }; +} + +function validateFile(file: string, content: string): ValidationResult { + 
const errors: string[] = []; + const 
warnings: string[] = []; + + const parsed = parseFrontmatter(content); + if (!parsed) { + errors.push("Missing or malformed frontmatter (expected --- ... ---)"); + return { errors, warnings }; + } + + const { frontmatter, body } = parsed; + + if (!frontmatter.title) { + errors.push("Missing required frontmatter field: title"); + } + if (!frontmatter.impact) { + errors.push("Missing required frontmatter field: impact"); + } else if (!VALID_IMPACTS.includes(frontmatter.impact as (typeof VALID_IMPACTS)[number])) { + errors.push( + `Invalid impact value: "${frontmatter.impact}". Valid values: ${VALID_IMPACTS.join(", ")}` + ); + } + if (!frontmatter.tags) { + warnings.push("No tags defined — consider adding relevant tags"); + } + + const hasValidPrefix = SECTION_PREFIXES.some((p) => file.startsWith(p)); + if (!hasValidPrefix) { + errors.push( + `Filename must start with a valid prefix: ${SECTION_PREFIXES.join(", ")}` + ); + } + + if (body.length < 100) { + warnings.push("Rule body seems very short — consider adding more detail"); + } + + if (!body.includes("```")) { + warnings.push( + "No code examples found — rules should include YAML or CLI examples" + ); + } + + if (!body.includes("Incorrect") || !body.includes("Correct")) { + warnings.push( + "Consider adding both Incorrect and Correct examples for clarity" + ); + } + + return { errors, warnings }; +} + +function main(): void { + const files = readdirSync(RULES_DIR) + .filter((f) => f.endsWith(".md") && !f.startsWith("_")) + .sort(); + + let totalErrors = 0; + let totalWarnings = 0; + let filesWithIssues = 0; + + console.log(`Validating ${files.length} rule files...\n`); + + for (const file of files) { + const content = readFileSync(join(RULES_DIR, file), "utf-8"); + const { errors, warnings } = validateFile(file, content); + + if (errors.length > 0 || warnings.length > 0) { + filesWithIssues++; + console.log(`📄 ${file}`); + + for (const error of errors) { + console.log(` ❌ ERROR: ${error}`); + totalErrors++; + } + 
for (const warning of warnings) { + console.log(` ⚠️ WARN: ${warning}`); + totalWarnings++; + } + console.log(); + } else { + console.log(`✅ ${file}`); + } + } + + console.log(`\n--- Validation Summary ---`); + console.log(`Files checked: ${files.length}`); + console.log(`Files with issues: ${filesWithIssues}`); + console.log(`Errors: ${totalErrors}`); + console.log(`Warnings: ${totalWarnings}`); + + if (totalErrors > 0) { + console.log(`\nValidation FAILED with ${totalErrors} error(s).`); + process.exit(1); + } else { + console.log(`\nValidation passed.`); + } +} + +main(); diff --git a/skills/test-cases.json b/skills/test-cases.json new file mode 100644 index 000000000..c8bcfe338 --- /dev/null +++ b/skills/test-cases.json @@ -0,0 +1,353 @@ +[ + { + "id": "advanced-prompt-targets", + "section": 8, + "sectionTitle": "Advanced Patterns", + "title": "Design Prompt Targets with Precise Parameter Schemas", + "impact": "HIGH", + "tags": [ + "advanced", + "prompt-targets", + "functions", + "llm", + "api-integration" + ], + "testCase": { + "description": "Detect and fix: \"Design Prompt Targets with Precise Parameter Schemas\"", + "input": "prompt_targets:\n - name: get_flight_info\n description: Get flight information\n parameters:\n - name: flight # What format? \"AA123\"? \"AA 123\"? 
\"American 123\"?\n type: str\n required: true\n endpoint:\n name: flights_api\n path: /flight?id={flight}", + "expected": "version: v0.3.0\n\nendpoints:\n flights_api:\n endpoint: api.flightaware.com\n protocol: https\n connect_timeout: \"5s\"\n\nprompt_targets:\n - name: get_flight_status\n description: >\n Get real-time status, gate information, and delays for a specific flight number.\n Use when the user asks about a flight's current status, arrival time, or gate.\n parameters:\n - name: flight_number\n description: >\n IATA airline code followed by flight number, e.g., \"AA123\", \"UA456\", \"DL789\".\n Extract from user message — do not include spaces.\n type: str\n required: true\n format: \"^[A-Z]{2}[0-9]{1,4}$\" # Regex hint for validation\n\n - name: date\n description: >\n Flight date in YYYY-MM-DD format. Use today's date if not specified.\n type: str\n required: false\n format: date\n\n endpoint:\n name: flights_api\n path: /flights/{flight_number}?date={date}\n http_method: GET\n http_headers:\n Authorization: \"Bearer $FLIGHTAWARE_API_KEY\"\n\n - name: search_flights\n description: >\n Search for available flights between two cities or airports.\n Use when the user wants to find flights, compare options, or book travel.\n parameters:\n - name: origin\n description: Departure airport IATA code (e.g., \"JFK\", \"LAX\", \"ORD\")\n type: str\n required: true\n - name: destination\n description: Arrival airport IATA code (e.g., \"LHR\", \"CDG\", \"NRT\")\n type: str\n required: true\n - name: departure_date\n description: Departure date in YYYY-MM-DD format\n type: str\n required: true\n format: date\n - name: cabin_class\n description: Preferred cabin class\n type: str\n required: false\n default: economy\n enum: [economy, premium_economy, business, first]\n - name: passengers\n description: Number of adult passengers (1-9)\n type: int\n required: false\n default: 1\n\n endpoint:\n name: flights_api\n path: 
/search?from={origin}&to={destination}&date={departure_date}&class={cabin_class}&pax={passengers}\n http_method: GET\n http_headers:\n Authorization: \"Bearer $FLIGHTAWARE_API_KEY\"\n\n system_prompt: |\n You are a travel assistant. Present flight search results clearly,\n highlighting the best value options. Include price, duration, and\n number of stops for each option.\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true\n\nlisteners:\n - type: prompt\n name: travel_functions\n port: 10000\n timeout: \"30s\"", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Design Prompt Targets with Precise Parameter Schemas\" and explain how to fix it." + } + }, + { + "id": "agent-descriptions", + "section": 3, + "sectionTitle": "Agent Orchestration", + "title": "Write Capability-Focused Agent Descriptions for Accurate Routing", + "impact": "HIGH", + "tags": [ + "agent", + "orchestration", + "descriptions", + "routing", + "multi-agent" + ], + "testCase": { + "description": "Detect and fix: \"Write Capability-Focused Agent Descriptions for Accurate Routing\"", + "input": "listeners:\n - type: agent\n name: orchestrator\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: agent_1\n description: Helps users with information # Too generic — matches everything\n\n - id: agent_2\n description: Also helps users # Indistinguishable from agent_1", + "expected": "version: v0.3.0\n\nagents:\n - id: weather_agent\n url: http://host.docker.internal:8001\n - id: flight_agent\n url: http://host.docker.internal:8002\n - id: hotel_agent\n url: http://host.docker.internal:8003\n\nlisteners:\n - type: agent\n name: travel_orchestrator\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: weather_agent\n description: >\n Provides real-time weather conditions and multi-day forecasts for any city\n worldwide. 
Handles questions about temperature, precipitation, wind, humidity,\n sunrise/sunset times, and severe weather alerts. Examples: \"What's the weather\n in Tokyo?\", \"Will it rain in London this weekend?\", \"Sunrise time in New York.\"\n\n - id: flight_agent\n description: >\n Provides live flight status, schedules, gate information, delays, and\n aircraft details for any flight number or route between airports.\n Handles questions about departures, arrivals, and airline information.\n Examples: \"Is AA123 on time?\", \"Flights from JFK to LAX tomorrow.\"\n\n - id: hotel_agent\n description: >\n Searches and books hotel accommodations, compares room types, pricing,\n and availability. Handles check-in/check-out dates, amenities, and\n cancellation policies. Examples: \"Hotels near Times Square for next Friday.\"", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Write Capability-Focused Agent Descriptions for Accurate Routing\" and explain how to fix it." 
+ } + }, + { + "id": "agent-orchestration", + "section": 3, + "sectionTitle": "Agent Orchestration", + "title": "Register All Sub-Agents in Both `agents` and `listeners.agents`", + "impact": "CRITICAL", + "tags": [ + "agent", + "orchestration", + "config", + "multi-agent" + ], + "testCase": { + "description": "Detect and fix: \"Register All Sub-Agents in Both `agents` and `listeners.agents`\"", + "input": "version: v0.3.0\n\nagents:\n - id: weather_agent\n url: http://host.docker.internal:8001\n - id: news_agent # Defined but never referenced in any listener\n url: http://host.docker.internal:8002\n\nlisteners:\n - type: agent\n name: orchestrator\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: weather_agent\n description: Provides weather forecasts and current conditions.\n # news_agent is missing here — the orchestrator cannot route to it\n\nagents:\n - id: weather_agent\n url: http://host.docker.internal:8001\n\nlisteners:\n - type: agent\n name: orchestrator\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: weather_agent\n description: Provides weather forecasts.\n - id: flights_agent # ID not in global agents[] — startup error\n description: Provides flight status information.", + "expected": "version: v0.3.0\n\nagents:\n - id: weather_agent\n url: http://host.docker.internal:8001\n - id: flights_agent\n url: http://host.docker.internal:8002\n - id: hotels_agent\n url: http://host.docker.internal:8003\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true\n\nlisteners:\n - type: agent\n name: travel_orchestrator\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: weather_agent\n description: Real-time weather, forecasts, and climate data for any city.\n - id: flights_agent\n description: Live flight status, schedules, gates, and delays.\n - id: hotels_agent\n description: Hotel search, availability, pricing, and booking.\n default: true # Fallback if no other agent matches", + 
"evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Register All Sub-Agents in Both `agents` and `listeners.agents`\" and explain how to fix it." + } + }, + { + "id": "config-listeners", + "section": 1, + "sectionTitle": "Configuration Fundamentals", + "title": "Choose the Right Listener Type for Your Use Case", + "impact": "CRITICAL", + "tags": [ + "config", + "listeners", + "architecture", + "routing" + ], + "testCase": { + "description": "Detect and fix: \"Choose the Right Listener Type for Your Use Case\"", + "input": "version: v0.3.0\n\n# Wrong: a model listener cannot route to backend agent services\nlisteners:\n - type: model\n name: main\n port: 12000\n\nagents:\n - id: weather_agent\n url: http://host.docker.internal:8001", + "expected": "version: v0.3.0\n\nagents:\n - id: weather_agent\n url: http://host.docker.internal:8001\n - id: travel_agent\n url: http://host.docker.internal:8002\n\nlisteners:\n - type: agent\n name: orchestrator\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: weather_agent\n description: Provides real-time weather, forecasts, and conditions for any city.\n - id: travel_agent\n description: Books flights, hotels, and travel itineraries.\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Choose the Right Listener Type for Your Use Case\" and explain how to fix it." 
+ } + }, + { + "id": "config-providers", + "section": 1, + "sectionTitle": "Configuration Fundamentals", + "title": "Register Model Providers with Correct Format Identifiers", + "impact": "CRITICAL", + "tags": [ + "config", + "model-providers", + "llm", + "api-format" + ], + "testCase": { + "description": "Detect and fix: \"Register Model Providers with Correct Format Identifiers\"", + "input": "model_providers:\n - model: gpt-4o # Missing openai/ prefix — Plano cannot route this\n access_key: $OPENAI_API_KEY\n\n - model: claude-3-5-sonnet # Missing anthropic/ prefix\n access_key: $ANTHROPIC_API_KEY", + "expected": "model_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true\n\n - model: anthropic/claude-sonnet-4-20250514\n access_key: $ANTHROPIC_API_KEY\n\n - model: gemini/gemini-2.0-flash\n access_key: $GOOGLE_API_KEY\n\nmodel_providers:\n - model: custom/llama3\n base_url: http://host.docker.internal:11434/v1 # Ollama endpoint\n provider_interface: openai # Ollama speaks OpenAI format\n default: true", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Register Model Providers with Correct Format Identifiers\" and explain how to fix it." + } + }, + { + "id": "config-secrets", + "section": 1, + "sectionTitle": "Configuration Fundamentals", + "title": "Use Environment Variable Substitution for All Secrets", + "impact": "CRITICAL", + "tags": [ + "config", + "security", + "secrets", + "api-keys", + "environment-variables" + ], + "testCase": { + "description": "Detect and fix: \"Use Environment Variable Substitution for All Secrets\"", + "input": "version: v0.3.0\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: abcdefghijklmnopqrstuvwxyz... 
# Hardcoded — never do this\n\nstate_storage:\n type: postgres\n connection_string: \"postgresql://admin:mysecretpassword@prod-db:5432/plano\"\n\nprompt_targets:\n - name: get_data\n endpoint:\n name: my_api\n http_headers:\n Authorization: \"Bearer abcdefghijklmnopqrstuvwxyz\" # Hardcoded token", + "expected": "version: v0.3.0\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true\n\n - model: anthropic/claude-sonnet-4-20250514\n access_key: $ANTHROPIC_API_KEY\n\nstate_storage:\n type: postgres\n connection_string: \"postgresql://${DB_USER}:${DB_PASS}@${DB_HOST}:5432/${DB_NAME}\"\n\nprompt_targets:\n - name: get_data\n endpoint:\n name: my_api\n http_headers:\n Authorization: \"Bearer $MY_API_TOKEN\"\n\n# .env — add to .gitignore\nOPENAI_API_KEY=abcdefghijklmnopqrstuvwxyz...\nANTHROPIC_API_KEY=abcdefghijklmnopqrstuvwxyz...\nDB_USER=plano\nDB_PASS=secure-password\nDB_HOST=localhost\nMY_API_TOKEN=abcdefghijklmnopqrstuvwxyz...", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Use Environment Variable Substitution for All Secrets\" and explain how to fix it." 
+ } + }, + { + "id": "config-version", + "section": 1, + "sectionTitle": "Configuration Fundamentals", + "title": "Always Specify a Supported Config Version", + "impact": "CRITICAL", + "tags": [ + "config", + "versioning", + "validation" + ], + "testCase": { + "description": "Detect and fix: \"Always Specify a Supported Config Version\"", + "input": "# No version field — fails schema validation\nlisteners:\n - type: model\n name: model_listener\n port: 12000\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY", + "expected": "version: v0.3.0\n\nlisteners:\n - type: model\n name: model_listener\n port: 12000\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Always Specify a Supported Config Version\" and explain how to fix it." + } + }, + { + "id": "deploy-docker", + "section": 7, + "sectionTitle": "Deployment & Security", + "title": "Understand Plano's Docker Network Topology for Agent URL Configuration", + "impact": "HIGH", + "tags": [ + "deployment", + "docker", + "networking", + "agents", + "urls" + ], + "testCase": { + "description": "Detect and fix: \"Understand Plano's Docker Network Topology for Agent URL Configuration\"", + "input": "version: v0.3.0\n\nagents:\n - id: weather_agent\n url: http://localhost:8001 # Wrong: this is Plano's own container\n\n - id: flight_agent\n url: http://127.0.0.1:8002 # Wrong: same issue\n\nfilters:\n - id: input_guards\n url: http://localhost:10500 # Wrong: filter server unreachable", + "expected": "version: v0.3.0\n\nagents:\n - id: weather_agent\n url: http://host.docker.internal:8001 # Correct: reaches host port 8001\n\n - id: flight_agent\n url: http://host.docker.internal:8002 # Correct: reaches host port 8002\n\nfilters:\n - id: input_guards\n url: http://host.docker.internal:10500 # Correct: reaches filter server on host\n\nendpoints:\n 
internal_api:\n endpoint: host.docker.internal # Correct for internal API on host\n protocol: http\n\n# Kubernetes / Docker Compose — use service names\nagents:\n - id: weather_agent\n url: http://weather-service:8001 # Kubernetes service DNS\n\n# External cloud services — use full domain\nagents:\n - id: cloud_agent\n url: https://my-agent.us-east-1.amazonaws.com/v1\n\n# Custom TLS (self-signed or internal CA)\noverrides:\n upstream_tls_ca_path: /etc/ssl/certs/internal-ca.pem", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Understand Plano's Docker Network Topology for Agent URL Configuration\" and explain how to fix it." + } + }, + { + "id": "deploy-state", + "section": 7, + "sectionTitle": "Deployment & Security", + "title": "Use PostgreSQL State Storage for Multi-Turn Conversations in Production", + "impact": "HIGH", + "tags": [ + "deployment", + "state", + "postgres", + "memory", + "multi-turn", + "production" + ], + "testCase": { + "description": "Detect and fix: \"Use PostgreSQL State Storage for Multi-Turn Conversations in Production\"", + "input": "version: v0.3.0\n\n# Memory storage — all conversations lost on planoai down / container restart\nstate_storage:\n type: memory\n\nlisteners:\n - type: agent\n name: customer_support\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: support_agent\n description: Customer support assistant with conversation history.", + "expected": "version: v0.3.0\n\nstate_storage:\n type: postgres\n connection_string: \"postgresql://${DB_USER}:${DB_PASS}@${DB_HOST}:5432/${DB_NAME}\"\n\nlisteners:\n - type: agent\n name: customer_support\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: support_agent\n description: Customer support assistant with access to full conversation history.\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true\n\n# Start PostgreSQL with Docker\ndocker run -d \\\n --name plano-postgres 
\\\n -e POSTGRES_USER=plano \\\n -e POSTGRES_PASSWORD=devpassword \\\n -e POSTGRES_DB=plano \\\n -p 5432:5432 \\\n postgres:16\n\n# Set environment variables\nexport DB_USER=plano\nexport DB_PASS=devpassword\nexport DB_HOST=host.docker.internal # Use host.docker.internal from inside Plano container\nexport DB_NAME=plano\n\nDB_USER=plano_prod\nDB_PASS=\nDB_HOST=your-rds-endpoint.amazonaws.com\nDB_NAME=plano", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Use PostgreSQL State Storage for Multi-Turn Conversations in Production\" and explain how to fix it." + } + }, + { + "id": "filter-guardrails", + "section": 4, + "sectionTitle": "Filter Chains & Guardrails", + "title": "Configure Prompt Guards with Actionable Rejection Messages", + "impact": "MEDIUM", + "tags": [ + "filter", + "guardrails", + "jailbreak", + "security", + "ux" + ], + "testCase": { + "description": "Detect and fix: \"Configure Prompt Guards with Actionable Rejection Messages\"", + "input": "version: v0.3.0\n\nprompt_guards:\n input_guards:\n jailbreak:\n on_exception: {} # Empty — returns unhelpful generic error\n\nprompt_guards:\n input_guards:\n jailbreak:\n on_exception:\n message: \"Error code 403: guard triggered\" # Unhelpful to the user", + "expected": "version: v0.3.0\n\nprompt_guards:\n input_guards:\n jailbreak:\n on_exception:\n message: >\n I'm not able to help with that request. This assistant is designed\n to help with [your use case, e.g., customer support, coding questions].\n Please rephrase your question or contact support@yourdomain.com\n if you believe this is an error.\n\n# Built-in jailbreak detection (fast, no external service needed)\nprompt_guards:\n input_guards:\n jailbreak:\n on_exception:\n message: \"This request cannot be processed. 
Please ask about our products and services.\"\n\n# MCP-based custom guards for additional policy enforcement\nfilters:\n - id: topic_restriction\n url: http://host.docker.internal:10500\n type: mcp\n transport: streamable-http\n tool: topic_restriction # Custom filter for domain-specific restrictions\n\nlisteners:\n - type: agent\n name: customer_support\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: support_agent\n description: Customer support assistant for product questions and order issues.\n filter_chain:\n - topic_restriction # Additional custom topic filtering", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Configure Prompt Guards with Actionable Rejection Messages\" and explain how to fix it." + } + }, + { + "id": "filter-mcp", + "section": 4, + "sectionTitle": "Filter Chains & Guardrails", + "title": "Configure MCP Filters with Explicit Type and Transport", + "impact": "MEDIUM", + "tags": [ + "filter", + "mcp", + "integration", + "configuration" + ], + "testCase": { + "description": "Detect and fix: \"Configure MCP Filters with Explicit Type and Transport\"", + "input": "filters:\n - id: my_guard # Plano infers type=mcp, transport=streamable-http, tool=my_guard\n url: http://localhost:10500\n # If your MCP server uses a different tool name or transport, this silently misroutes", + "expected": "version: v0.3.0\n\nfilters:\n - id: input_guards\n url: http://host.docker.internal:10500\n type: mcp # Explicitly MCP protocol\n transport: streamable-http # Streamable HTTP transport\n tool: input_guards # MCP tool name (matches MCP server registration)\n\n - id: query_rewriter\n url: http://host.docker.internal:10501\n type: mcp\n transport: streamable-http\n tool: rewrite_query # Tool name differs from filter ID — explicit is safer\n\n - id: custom_validator\n url: http://host.docker.internal:10503\n type: http # Plain HTTP filter (not MCP)\n # No tool field for HTTP filters\n\nfilters:\n 
- id: auth_validator\n url: http://host.docker.internal:9000/validate\n type: http # Plano POSTs the request, expects the modified request back", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Configure MCP Filters with Explicit Type and Transport\" and explain how to fix it." + } + }, + { + "id": "filter-ordering", + "section": 4, + "sectionTitle": "Filter Chains & Guardrails", + "title": "Order Filter Chains with Guards First, Enrichment Last", + "impact": "HIGH", + "tags": [ + "filter", + "guardrails", + "security", + "pipeline", + "ordering" + ], + "testCase": { + "description": "Detect and fix: \"Order Filter Chains with Guards First, Enrichment Last\"", + "input": "filters:\n - id: context_builder\n url: http://host.docker.internal:10502 # Runs expensive RAG retrieval first\n - id: query_rewriter\n url: http://host.docker.internal:10501\n - id: input_guards\n url: http://host.docker.internal:10500 # Guards run last — jailbreak gets context\n\nlisteners:\n - type: agent\n name: rag_orchestrator\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: rag_agent\n filter_chain:\n - context_builder # Wrong: expensive enrichment before safety check\n - query_rewriter\n - input_guards", + "expected": "version: v0.3.0\n\nfilters:\n - id: input_guards\n url: http://host.docker.internal:10500\n type: mcp\n transport: streamable-http\n - id: query_rewriter\n url: http://host.docker.internal:10501\n type: mcp\n transport: streamable-http\n - id: context_builder\n url: http://host.docker.internal:10502\n type: mcp\n transport: streamable-http\n\nlisteners:\n - type: agent\n name: rag_orchestrator\n port: 8000\n router: plano_orchestrator_v1\n agents:\n - id: rag_agent\n description: Answers questions using internal knowledge base documents.\n filter_chain:\n - input_guards # 1. Block jailbreaks and policy violations\n - query_rewriter # 2. Normalize the safe query\n - context_builder # 3. 
Retrieve relevant context for the clean query", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Order Filter Chains with Guards First, Enrichment Last\" and explain how to fix it." + } + }, + { + "id": "observe-span-attributes", + "section": 5, + "sectionTitle": "Observability & Debugging", + "title": "Add Custom Span Attributes for Correlation and Filtering", + "impact": "MEDIUM", + "tags": [ + "observability", + "tracing", + "span-attributes", + "correlation" + ], + "testCase": { + "description": "Detect and fix: \"Add Custom Span Attributes for Correlation and Filtering\"", + "input": "tracing:\n random_sampling: 20\n # No span_attributes — cannot filter by user, session, or environment", + "expected": "version: v0.3.0\n\ntracing:\n random_sampling: 20\n trace_arch_internal: true\n\n span_attributes:\n # Match all headers with this prefix, then map to span attributes by:\n # 1) stripping the prefix and 2) converting hyphens to dots\n header_prefixes:\n - x-katanemo-\n\n # Static attributes added to every span from this Plano instance\n static:\n environment: production\n service.name: plano-gateway\n deployment.region: us-east-1\n service.version: \"2.1.0\"\n team: platform-engineering\n\nimport httpx\n\nresponse = httpx.post(\n \"http://localhost:12000/v1/chat/completions\",\n headers={\n \"x-katanemo-request-id\": \"req_abc123\",\n \"x-katanemo-user-id\": \"usr_12\",\n \"x-katanemo-session-id\": \"sess_xyz456\",\n \"x-katanemo-tenant-id\": \"acme-corp\",\n },\n json={\"model\": \"plano.v1\", \"messages\": [...]}\n)\n\n# Find all requests from a specific user\nplanoai trace --where user.id=usr_12\n\n# Find all traces from production environment\nplanoai trace --where environment=production\n\n# Find traces from a specific tenant\nplanoai trace --where tenant.id=acme-corp", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Add Custom Span Attributes 
for Correlation and Filtering\" and explain how to fix it." + } + }, + { + "id": "observe-tracing", + "section": 5, + "sectionTitle": "Observability & Debugging", + "title": "Enable Tracing with Appropriate Sampling for Your Environment", + "impact": "HIGH", + "tags": [ + "observability", + "tracing", + "opentelemetry", + "otel", + "debugging" + ], + "testCase": { + "description": "Detect and fix: \"Enable Tracing with Appropriate Sampling for Your Environment\"", + "input": "version: v0.3.0\n\nlisteners:\n - type: model\n name: model_listener\n port: 12000\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true\n\n# No tracing block — no visibility into routing, latency, or errors", + "expected": "version: v0.3.0\n\nlisteners:\n - type: model\n name: model_listener\n port: 12000\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true\n\ntracing:\n random_sampling: 100 # 100% for development/debugging\n trace_arch_internal: true # Include Plano's internal routing spans\n\ntracing:\n random_sampling: 10 # Sample 10% of requests in production\n trace_arch_internal: false # Skip internal spans to reduce noise\n span_attributes:\n header_prefixes:\n - x-katanemo- # Match all x-katanemo-* headers\n static:\n environment: production\n service.name: my-plano-service\n version: \"1.0.0\"\n\n# Start Plano with built-in OTEL collector\nplanoai up config.yaml --with-tracing", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Enable Tracing with Appropriate Sampling for Your Environment\" and explain how to fix it." 
+ } + }, + { + "id": "routing-aliases", + "section": 2, + "sectionTitle": "Routing & Model Selection", + "title": "Use Model Aliases for Semantic, Stable Model References", + "impact": "MEDIUM", + "tags": [ + "routing", + "model-aliases", + "maintainability", + "client-integration" + ], + "testCase": { + "description": "Detect and fix: \"Use Model Aliases for Semantic, Stable Model References\"", + "input": "# config.yaml — no aliases defined\nversion: v0.3.0\n\nlisteners:\n - type: model\n name: model_listener\n port: 12000\n\nmodel_providers:\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n default: true\n\n# Client code — brittle, must be updated when model changes\nclient.chat.completions.create(model=\"gpt-4o\", ...)", + "expected": "version: v0.3.0\n\nlisteners:\n - type: model\n name: model_listener\n port: 12000\n\nmodel_providers:\n - model: openai/gpt-4o-mini\n access_key: $OPENAI_API_KEY\n default: true\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n - model: anthropic/claude-sonnet-4-20250514\n access_key: $ANTHROPIC_API_KEY\n\nmodel_aliases:\n plano.fast.v1:\n target: gpt-4o-mini # Cheap, fast — for high-volume tasks\n\n plano.smart.v1:\n target: gpt-4o # High capability — for complex reasoning\n\n plano.creative.v1:\n target: claude-sonnet-4-20250514 # Strong creative writing and analysis\n\n plano.v1:\n target: gpt-4o # Default production alias\n\n# Client code — stable, alias is the contract\nclient.chat.completions.create(model=\"plano.smart.v1\", ...)", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Use Model Aliases for Semantic, Stable Model References\" and explain how to fix it." 
+ } + }, + { + "id": "routing-default", + "section": 2, + "sectionTitle": "Routing & Model Selection", + "title": "Always Set Exactly One Default Model Provider", + "impact": "HIGH", + "tags": [ + "routing", + "defaults", + "model-providers", + "reliability" + ], + "testCase": { + "description": "Detect and fix: \"Always Set Exactly One Default Model Provider\"", + "input": "version: v0.3.0\n\nmodel_providers:\n - model: openai/gpt-4o-mini # No default: true anywhere\n access_key: $OPENAI_API_KEY\n routing_preferences:\n - name: summarization\n description: Summarizing documents and extracting key points\n\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n routing_preferences:\n - name: code_generation\n description: Writing new functions and implementing algorithms\n\nmodel_providers:\n - model: openai/gpt-4o-mini\n default: true # First default\n access_key: $OPENAI_API_KEY\n\n - model: openai/gpt-4o\n default: true # Second default — confusing\n access_key: $OPENAI_API_KEY", + "expected": "version: v0.3.0\n\nmodel_providers:\n - model: openai/gpt-4o-mini\n access_key: $OPENAI_API_KEY\n default: true # Handles general/unclassified requests\n routing_preferences:\n - name: summarization\n description: Summarizing documents, articles, and meeting notes\n - name: classification\n description: Categorizing inputs, labeling, and intent detection\n\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n routing_preferences:\n - name: code_generation\n description: Writing, debugging, and reviewing code\n - name: complex_reasoning\n description: Multi-step math, logical analysis, research synthesis", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Always Set Exactly One Default Model Provider\" and explain how to fix it." 
+ } + }, + { + "id": "routing-passthrough", + "section": 2, + "sectionTitle": "Routing & Model Selection", + "title": "Use Passthrough Auth for Proxy and Multi-Tenant Setups", + "impact": "MEDIUM", + "tags": [ + "routing", + "authentication", + "proxy", + "litellm", + "multi-tenant" + ], + "testCase": { + "description": "Detect and fix: \"Use Passthrough Auth for Proxy and Multi-Tenant Setups\"", + "input": "model_providers:\n - model: custom/proxy\n base_url: http://host.docker.internal:8000\n access_key: $SOME_KEY # Plano overwrites the client's auth — proxy rejects it", + "expected": "version: v0.3.0\n\nlisteners:\n - type: model\n name: model_listener\n port: 12000\n\nmodel_providers:\n - model: custom/litellm-proxy\n base_url: http://host.docker.internal:4000 # LiteLLM server\n provider_interface: openai # LiteLLM uses OpenAI format\n passthrough_auth: true # Forward client's Bearer token\n default: true\n\nmodel_providers:\n # Plano acts as a passthrough gateway; each client has their own OpenAI key\n - model: openai/gpt-4o\n passthrough_auth: true # No access_key here — client's key is forwarded\n default: true\n\nmodel_providers:\n - model: openai/gpt-4o-mini\n access_key: $OPENAI_API_KEY # Plano manages this key\n default: true\n routing_preferences:\n - name: quick tasks\n description: Short answers, simple lookups, fast completions\n\n - model: custom/vllm-llama\n base_url: http://gpu-server:8000\n provider_interface: openai\n passthrough_auth: true # vLLM cluster handles its own auth\n routing_preferences:\n - name: long context\n description: Processing very long documents, multi-document analysis", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify if it violates the rule \"Use Passthrough Auth for Proxy and Multi-Tenant Setups\" and explain how to fix it." 
+ } + }, + { + "id": "routing-preferences", + "section": 2, + "sectionTitle": "Routing & Model Selection", + "title": "Write Task-Specific Routing Preference Descriptions", + "impact": "HIGH", + "tags": [ + "routing", + "model-selection", + "preferences", + "llm-routing" + ], + "testCase": { + "description": "Detect and fix: \"Write Task-Specific Routing Preference Descriptions\"", + "input": "model_providers:\n - model: openai/gpt-4o-mini\n access_key: $OPENAI_API_KEY\n default: true\n routing_preferences:\n - name: simple\n description: easy tasks # Too vague — what is \"easy\"?\n\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n routing_preferences:\n - name: hard\n description: hard tasks # Too vague — overlaps with \"easy\"", + "expected": "model_providers:\n - model: openai/gpt-4o-mini\n access_key: $OPENAI_API_KEY\n default: true\n routing_preferences:\n - name: summarization\n description: >\n Summarizing documents, articles, emails, or meeting transcripts.\n Extracting key points, generating TL;DR sections, condensing long text.\n - name: classification\n description: >\n Categorizing inputs, sentiment analysis, spam detection,\n intent classification, labeling structured data fields.\n - name: translation\n description: >\n Translating text between languages, localization tasks.\n\n - model: openai/gpt-4o\n access_key: $OPENAI_API_KEY\n routing_preferences:\n - name: code_generation\n description: >\n Writing new functions, classes, or modules from scratch.\n Implementing algorithms, boilerplate generation, API integrations.\n - name: code_review\n description: >\n Reviewing code for bugs, security vulnerabilities, performance issues.\n Suggesting refactors, explaining complex code, debugging errors.\n - name: complex_reasoning\n description: >\n Multi-step math problems, logical deduction, strategic planning,\n research synthesis requiring chain-of-thought reasoning.", + "evaluationPrompt": "Given the following Plano config or CLI usage, identify 
if it violates the rule \"Write Task-Specific Routing Preference Descriptions\" and explain how to fix it." + } + } +] diff --git a/skills/tsconfig.json b/skills/tsconfig.json new file mode 100644 index 000000000..83552abbc --- /dev/null +++ b/skills/tsconfig.json @@ -0,0 +1,15 @@ +{ + "compilerOptions": { + "target": "ES2022", + "module": "NodeNext", + "moduleResolution": "NodeNext", + "lib": ["ES2022"], + "strict": true, + "noEmit": true, + "types": ["node"], + "skipLibCheck": true, + "resolveJsonModule": true, + "forceConsistentCasingInFileNames": true + }, + "include": ["src/**/*.ts"] +}