
feat: smart routing v3, provider-locked selection, and NanoGPT endpoint policy #6

Open
Daltonganger wants to merge 39 commits into feat/omos-pr2-manual-backend from feat/omos-pr3-command-ux-rollout

Conversation

Daltonganger (Owner) commented Feb 7, 2026

Summary

  • add Smart Routing v3 runtime/scoring with per-agent assignments and fallback chains
  • enforce strict provider-bound model selection (never picks models outside selected providers)
  • implement endpoint-driven NanoGPT access policy using endpoints /api/v1/models, /api/subscription/v1/models, and /api/paid/v1/models
  • rewrite NanoGPT usage parsing to support weekly input tokens as primary source (weeklyInputTokens) with compatibility mapping for existing budget consumers
  • add startup auto-refresh hook (model_refresh) with periodic checks (default every 24 hours)
  • add per-agent model preferences and planner integration
  • include merge fixes from upstream and review follow-ups (including refresh checker warning)
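
A minimal sketch of what provider-bound selection with a fallback chain could look like. The `CatalogModel` shape, the `buildChain` helper, and the "primary plus three fallbacks" limit are assumptions for illustration, not the PR's actual implementation.

```typescript
// Hypothetical catalog entry; the PR's real catalog shape is not shown on this page.
interface CatalogModel {
  id: string; // e.g. "nanogpt/some-model"; the provider is the first path segment
  score: number;
}

// The provider is everything before the first slash.
function providerOf(modelId: string): string {
  const slash = modelId.indexOf('/');
  return slash === -1 ? modelId : modelId.slice(0, slash);
}

// Build a primary + fallback chain using only models from the selected providers.
function buildChain(
  catalog: CatalogModel[],
  selectedProviders: string[],
  maxFallbacks = 3, // assumed limit
): string[] {
  const allowed = new Set(selectedProviders);
  return catalog
    .filter((m) => allowed.has(providerOf(m.id)))
    // Highest score first; ties broken by id so the result is deterministic.
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id))
    .slice(0, 1 + maxFallbacks)
    .map((m) => m.id);
}
```

Filtering before ranking means a higher-scoring model from an unselected provider can never enter the chain, which is the property the second summary bullet describes.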

Why

This aligns model selection behavior with explicit provider and billing constraints while keeping routing current as catalogs and usage change.

Validation

  • bun run check:ci
  • bun run typecheck
  • bun test
  • bun run build:runtime
  • bun run build:types

Notes

  • Local untracked data artifacts are not part of this PR.
  • Stacked branch context remains: base is feat/omos-pr2-manual-backend.

kassieclaire and others added 2 commits February 7, 2026 19:59
* feat: restrict subagent delegation based on agent type (alvinunreal#116)

Add SUBAGENT_DELEGATION_RULES to control which agents can spawn subagents:
- orchestrator: can spawn all subagents (full delegation)
- fixer/designer: can spawn explorer only (for research)
- explorer/librarian/oracle: cannot spawn any subagents (leaf nodes)

Update BackgroundTaskManager to:
- Track agent type per session via agentBySessionId map
- Calculate tool permissions based on parent agent's delegation rules
- Apply appropriate background_task/task tool permissions when spawning

Add comprehensive tests for all delegation restrictions.

Fixes alvinunreal#116

* fix: enforce agent-type validation in background_task tool delegation

The background_task tool accepted any agent string without checking
SUBAGENT_DELEGATION_RULES, allowing agents like fixer to delegate to
any agent despite being restricted to explorer only. Add isAgentAllowed()
and getAllowedSubagents() methods to BackgroundTaskManager and validate
the requested agent in the tool before launching.

* fix: treat untracked sessions as root orchestrator in delegation checks

The root orchestrator session is created by OpenCode, not by
BackgroundTaskManager, so it is never registered in agentBySessionId.
This caused isAgentAllowed(), getAllowedSubagents(), and
calculateToolPermissions() to reject all delegation from the root
session — completely blocking the primary agent from launching tasks.

Default untracked sessions to 'orchestrator' instead of rejecting them.

* feat: default unknown agent types to explorer-only delegation

New background agent types no longer need explicit entries in
SUBAGENT_DELEGATION_RULES. Agents not listed default to ['explorer']
access via a centralized getSubagentRules() helper.
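
The rules described in the commits above can be sketched roughly as follows. The `DelegationTracker` class and the exact list of orchestrator targets are assumptions; only the rule names and defaults come from the commit messages.

```typescript
// Rules as described in the commit messages; the real constant may differ in shape.
const SUBAGENT_DELEGATION_RULES = new Map<string, readonly string[]>([
  ['orchestrator', ['explorer', 'librarian', 'oracle', 'designer', 'fixer']], // assumed full list
  ['fixer', ['explorer']],
  ['designer', ['explorer']],
  ['explorer', []],
  ['librarian', []],
  ['oracle', []],
]);

// Agents without an explicit entry default to explorer-only delegation.
function getSubagentRules(agent: string): readonly string[] {
  return SUBAGENT_DELEGATION_RULES.get(agent) ?? ['explorer'];
}

// Hypothetical stand-in for the relevant part of BackgroundTaskManager.
class DelegationTracker {
  private agentBySessionId = new Map<string, string>();

  register(sessionId: string, agent: string): void {
    this.agentBySessionId.set(sessionId, agent);
  }

  // Untracked sessions are treated as the root orchestrator.
  isAgentAllowed(parentSessionId: string, requestedAgent: string): boolean {
    const parent = this.agentBySessionId.get(parentSessionId) ?? 'orchestrator';
    return getSubagentRules(parent).includes(requestedAgent);
  }
}
```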

* fix: base tool permissions on spawned agent's own delegation rules, not parent's

* test: add multi-layered delegation chain tests for orchestrator→fixer→explorer paths

---------

Co-authored-by: vllm-user <vllm-user@example.com>
kilo-code-bot (Bot) commented Feb 7, 2026

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge


Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 1 |
| SUGGESTION | 0 |

Issue Details

WARNING

| File | Line | Issue |
| --- | --- | --- |
| src/hooks/model-refresh-checker/index.ts | 264 | hasOpencodeZen: true is hardcoded in refreshConfig and ignores detected.hasOpencodeZen, causing model catalog filtering and plan building to always assume OpenCode Zen is available |

Other Observations (not in diff)

Issues found in unchanged code or carried over from prior review:

| File | Line | Issue |
| --- | --- | --- |
| src/cli/commands.ts | N/A | WARNING: Unused imports in stub file |
| src/cli/system.ts | N/A | WARNING: Unused variable proc |
| src/hooks/model-refresh-checker/index.ts | N/A | WARNING: Dead-code logic: inferOpenCodeFreeEnabled guards on hasOpencodeZen but refreshConfig always passes true (see inline comment on line 264) |

Files Reviewed (12 files)
  • .gitignore — no issues
  • AGENTS.md — no issues
  • README.md — no issues
  • biome.json — no issues
  • src/config/constants.ts — no issues
  • src/config/loader.ts — no issues
  • src/config/schema.ts — no issues
  • src/hooks/index.ts — no issues
  • src/hooks/model-refresh-checker/index.ts — 1 issue
  • src/index.ts — no issues
  • src/tools/background.ts — no issues
  • src/tools/grep/tools.ts — no issues
  • src/utils/tmux.ts — no issues

Ruben Beuker added 19 commits February 7, 2026 21:09
Comment thread src/cli/commands.ts Outdated
@@ -0,0 +1,2 @@
import { getConfigDir } from './paths';

WARNING: Unused imports in stub file

This file imports getConfigDir and ConfigMergeResult but never uses them. The file has no exports, making the export * from './commands' in config-manager.ts a no-op. This appears to be an incomplete stub that should either be completed or removed.

Either export functions that use these imports, or remove the file and the re-export from config-manager.ts.

Comment thread src/cli/system.ts Outdated
for (const opencodePath of paths) {
try {
// Check if we can execute it
const proc = Bun.spawn([opencodePath, '--version'], {

WARNING: Unused variable proc

The proc variable from Bun.spawn() is never used. While the intent is to check if spawn succeeds, the variable should either be used (e.g., to check the result) or removed with a comment explaining why we don't await the process.

Suggested change
- const proc = Bun.spawn([opencodePath, '--version'], {
+ // Check if spawn succeeds - don't await, just verify no throw
+ Bun.spawn([opencodePath, '--version'], {
+ stdout: 'pipe',
+ stderr: 'pipe',
+ });

Note: This approach may cache an invalid path since Bun.spawn() might not throw synchronously for non-existent executables on all platforms.

Ruben Beuker and others added 7 commits February 9, 2026 11:44
…nreal#122)

- ast-grep: Log background init errors instead of silently swallowing them
- grep: Add caseSensitive, wholeWord, fixedStrings options to tool definition

These fixes improve debuggability and expose more functionality to users.
…nunreal#123)

* feat: add reusable OpenCode free-model selection flow

* docs: keep antigravity guide focused on antigravity setup

* feat: add Chutes provider with adaptive model selection

* fix: remove request-tier bias from Chutes model scoring

* feat: add provider-aware 6-agent fallback chains with 15s failover

* feat: add subscription prompts and provider-aware chains for anthropic/copilot/zai

* refactor: rebalance mixed provider defaults toward Kimi and GPT-5.3

* feat: add dynamic provider-aware model planning from live catalog

* docs: add 5 provider-combination config scenarios

* feat: blend Artificial Analysis and OpenRouter signals into dynamic model planning

* feat: prompt API keys during install for external ranking signals

* refactor: rebalance dynamic planner for provider diversity and non-flash depth

* refactor: boost explorer speed signals and map chutes aliases

* refactor: prefer Gemini 3 Pro over 2.5 in dynamic ranking

* feat: add version-aware recency scoring across model families

* refactor: enforce provider-balanced primaries and richer fallback bundles

* Address code review comments

* Fix formatting issues

* refactor: remove model-name and provider bonus heuristics

* fix: check if chutes alias exists before adding

* revert: restore heuristic bonuses in dynamic model planner

* Merge pull request 3: Add external ranking signals from Artificial Analysis and OpenRouter

* fix: harden fallback chains and external signal handling

Prevent key leakage in installer prompts, isolate external ranking aliasing, and make free-model tails deterministic before Big Pickle fallback. Also relax fallback chain schema to preserve custom agent keys and align tests.

* feat: add scoring v2 foundation with precedence resolver

Introduce modular V2 scoring components, shadow-mode plumbing, and deterministic precedence resolution with provenance while keeping V1 as the applied default. Adds schema flag support and regression tests for determinism and fallback compilation.

* feat: add OMOS manual preference backend operations

Introduce a validated manual plan schema and new omos_preferences tool with show/plan/apply/reset-agent operations, including atomic writes with backups. Wire the tool into plugin registration and add tests for config loading and manual plan compilation.
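
One common way to get "atomic writes with backups" is to copy the existing file aside, write to a temporary sibling, and rename it over the target. The sketch below assumes that approach; the helper name `writeConfigAtomically` and the `.bak` / `.tmp` suffixes are hypothetical and not taken from the PR.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical helper: back up the old file, then replace it via rename.
function writeConfigAtomically(targetPath: string, contents: string): void {
  // Make sure the target directory exists before writing.
  fs.mkdirSync(path.dirname(targetPath), { recursive: true });

  // Keep a copy of the previous version, if there is one.
  if (fs.existsSync(targetPath)) {
    fs.copyFileSync(targetPath, `${targetPath}.bak`);
  }

  // Write to a temp file in the same directory, then rename over the target.
  // A same-directory rename is atomic on typical POSIX filesystems.
  const tempPath = `${targetPath}.${process.pid}.tmp`;
  fs.writeFileSync(tempPath, contents, 'utf8');
  fs.renameSync(tempPath, targetPath);
}
```

Writing the temp file next to the target (rather than in a system temp directory) keeps the rename on one filesystem, so a reader sees either the old or the new config, never a partial one.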

* feat: ship /omos command UX and rollout hardening

Install the /omos command entry, add diff-first confirm flow for omos_preferences, and handle global/project target precedence warnings. Include tests for command install, diff behavior, and precedence plus rollout gate documentation updates.

* test: add dynamic scoring matrix scenarios

Add a deterministic provider-combination matrix test that compares v1, v2-shadow, and v2 outputs across curated mixes plus three random mixes. Refresh the provider matrix doc with the new scenario results and scoring-mode behavior.

* feat: expose manual scoring preview and harden apply flow

Add score-plan output for manual model plans, including per-agent ranked scores for v1/v2/shadow comparisons. Also align plan/apply diff hashes to target config, ensure target directories exist before writes, and update provider-coverage provenance after swap logic.
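
A diffHash-safe apply can be sketched as: hash the pair (current config, proposed config) at plan time, then refuse to apply if the hash no longer matches what is on disk. The function names and the choice of SHA-256 below are assumptions; the PR does not show how its hash is computed.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical: hash the current and proposed config text together.
function computeDiffHash(currentConfig: string, proposedConfig: string): string {
  return createHash('sha256')
    .update(currentConfig)
    .update('\0') // separator so ("ab", "c") and ("a", "bc") hash differently
    .update(proposedConfig)
    .digest('hex');
}

// Apply is only safe when the target still matches what the plan was computed against.
function canApply(
  expectedDiffHash: string,
  currentConfigOnDisk: string,
  proposedConfig: string,
): boolean {
  return computeDiffHash(currentConfigOnDisk, proposedConfig) === expectedDiffHash;
}
```

If the target config changes between `plan` and `apply`, the recomputed hash differs and the apply step can ask the user to re-plan instead of overwriting newer edits.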

* docs: add /omos score-plan guidance in English

Update README and rollout docs with an English how-to for viewing model scores during manual planning, and refresh the installed /omos command template to include score-plan and diffHash-safe apply flow.

* docs: add copy-paste /omos scoring workflow example

Include an English command sequence for show -> score-plan -> plan -> apply using operation-based omos_preferences calls, with target selection and diffHash-safe apply guidance.

* feat: move chutes to auth flow and full catalog refresh

Remove CHUTES_API_KEY env dependency from installer/config wiring and align Chutes with OpenCode auth provider flow. Switch Chutes discovery to all provider models from opencode refresh output while keeping OpenCode free-model filtering unchanged.

* fix: parse multi-segment provider model headers

Accept provider/model headers with additional path segments so chutes catalog entries like chutes/vendor/model are discovered during opencode verbose parsing.

* fix: support multi-segment model ids in /omos scoring

Allow manual plan model ids like provider/vendor/model and keep them in score-plan ranking. Add regressions for loader validation and /omos scoring with multi-segment Chutes model identifiers.

* fix: strip provider prefix for nested model signal matching

Use model ids after the first slash for external signal lookup so chutes/vendor/model names map correctly in v1 and v2 scoring. Add regressions for multi-segment chutes ids in dynamic and scoring engine tests.
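
Splitting at the first slash only, as these fixes describe, might look like the sketch below. `splitModelId` is a hypothetical helper name.

```typescript
// Split "provider/rest" at the FIRST slash so nested ids keep their vendor segment.
function splitModelId(fullId: string): { provider: string; model: string } | null {
  const slash = fullId.indexOf('/');
  // Reject ids with no slash, an empty provider, or an empty model part.
  if (slash <= 0 || slash === fullId.length - 1) return null;
  return {
    provider: fullId.slice(0, slash),
    model: fullId.slice(slash + 1),
  };
}
```

With this, `chutes/vendor/model` yields provider `chutes` and model `vendor/model`, so the part used for external signal lookup still contains the vendor segment.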

* feat: normalize model keys and rebalance v2 capability scoring

Apply canonical model alias normalization across external signal ingestion and v1/v2 lookups (strip provider prefix, remove TEE/FP* tokens, lowercase, and normalize slash/space/hyphen variants). Also switch V2 capability scoring to one-sided bonuses, add K2.5 version preference, and update regression matrices/tests for the new ranking behavior.
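
The normalization steps listed above could be approximated as in the sketch below. The helper name, the exact token patterns (`tee`, `fp` followed by digits), and the choice of `-` as the canonical separator are assumptions; the PR's real rules may treat vendor segments differently.

```typescript
// Hypothetical canonical key for matching catalog ids against external signals.
function normalizeModelKey(modelId: string): string {
  // 1. Strip the provider prefix (everything up to the first slash).
  const slash = modelId.indexOf('/');
  const withoutProvider = slash === -1 ? modelId : modelId.slice(slash + 1);

  return withoutProvider
    .toLowerCase() // 2. Lowercase.
    .split(/[\/\s-]+/) // 3. Treat slash, whitespace, and hyphen as equivalent separators.
    .filter((token) => token !== '' && token !== 'tee' && !/^fp\d+$/.test(token)) // 4. Drop TEE/FP* tokens.
    .join('-');
}
```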

* feat: add balanced subscription mode for install and /omos

Add an installer question for subscription/pay-per-API usage and persist balanceProviderUsage in config. When enabled, dynamic planning rebalances provider assignments toward even distribution using a max score-loss tolerance. Also expose the setting in omos_preferences show/plan/apply flows and update docs/tests.
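
One possible reading of "rebalance toward even distribution using a max score-loss tolerance" is a greedy pass that, for each agent, picks the least-used provider among candidates within the tolerance of that agent's best score. The sketch below assumes that reading; the types and `balanceAssignments` are hypothetical.

```typescript
interface ScoredCandidate {
  model: string;
  provider: string;
  score: number;
}

// Greedy sketch: prefer less-used providers as long as the score loss stays within tolerance.
function balanceAssignments(
  candidatesByAgent: Record<string, ScoredCandidate[]>,
  maxScoreLoss: number,
): Record<string, string> {
  const usage = new Map<string, number>();
  const result: Record<string, string> = {};

  // Sort agent names so the outcome does not depend on object key order.
  for (const [agent, candidates] of Object.entries(candidatesByAgent).sort(([a], [b]) =>
    a.localeCompare(b),
  )) {
    if (candidates.length === 0) continue;
    const best = Math.max(...candidates.map((c) => c.score));
    const pick = candidates
      .filter((c) => best - c.score <= maxScoreLoss)
      .sort(
        (a, b) =>
          (usage.get(a.provider) ?? 0) - (usage.get(b.provider) ?? 0) || b.score - a.score,
      )[0];
    result[agent] = pick.model;
    usage.set(pick.provider, (usage.get(pick.provider) ?? 0) + 1);
  }
  return result;
}
```

A tolerance of zero degenerates to "always take the top score", which is a useful sanity check for this kind of rebalancing.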

* tune: reduce chutes qwen3 over-ranking

Add targeted Chutes role priors that down-rank Qwen3 and prefer Kimi K2.5/Minimax M2.1 by role in both v1 and v2 scoring. Update matrix and regression tests to reflect the calibrated ranking behavior.

* tune: remove Gemini bonus from v1 scoring

* feat: add guided click-through /omos command flow

* feat: remove /omos command, add manual model selection to CLI install

- Remove omos_preferences tool and /omos command flow
- Add --dry-run flag for testing install without OpenCode
- Add manual setup mode to choose models per agent
- Exclude already-selected models from fallback choices
- Limit model list display to 5 items with option to type any model ID
- Support primary + 3 fallbacks configuration per agent
- Add ManualAgentConfig type for storing manual selections

* fix: auto-detect opencode installation in common paths

- Add OPENCODE_PATHS with common installation locations
- Add resolveOpenCodePath() helper with caching
- Update all opencode commands to use resolved path
- Show found path in success message
- Add helpful instructions when opencode not found in PATH

* chore: remove accidentally added submodule

* fix: expand opencode path detection for macOS and Linux

Add comprehensive path detection including:
- macOS: Applications, Homebrew, Library paths
- Linux: /usr/bin, snap, flatpak, nix paths
- Package managers: Homebrew, Cargo, npm, yarn, pnpm
- More system-wide and user-local locations

* chore: remove accidentally added submodule

* fix: enforce selected provider models in dynamic assignments

Respect user-selected primary/secondary models during dynamic planning, lock forced assignments from rebalance overrides, and update tests to match manual-plan precedence behavior. Also include robust OpenCode path resolution updates for cross-platform installs.

* feat: add models-only update flow and refine model selection UX

- add  command and  mode for updating assignments without reinstalling plugins/skills
- keep full provider model lists available for primary/support choices
- improve large-list selection with  expansion option
- refine dynamic planner provider candidate narrowing and pinned model handling
- update matrix/planner tests and remove residual internal placeholder wording

* chore: polish installer menu copy and provider prompt consistency

Clarify quick/manual setup descriptions, standardize provider prompts, and improve models-only messaging.

* fix: address PR review for OpenCode path resolution and alias regex

Avoid spawning processes in resolveOpenCodePath by selecting only existing filesystem paths and simplify slash normalization regex for clarity.
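
A filesystem-only resolver, in the spirit of this fix, might look like the following sketch. The candidate list is a hypothetical stand-in for OPENCODE_PATHS, and the injectable `exists` parameter is an assumption added to make the sketch easy to exercise.

```typescript
import { existsSync } from 'node:fs';

// Hypothetical candidates; the PR's actual OPENCODE_PATHS list is not reproduced here.
const CANDIDATE_PATHS: readonly string[] = [
  '/usr/local/bin/opencode',
  '/usr/bin/opencode',
  '/opt/homebrew/bin/opencode',
];

let cachedOpenCodePath: string | undefined;

// Return the first candidate that exists on disk, without spawning any process.
function resolveOpenCodePath(
  candidates: readonly string[] = CANDIDATE_PATHS,
  exists: (p: string) => boolean = existsSync,
): string | null {
  if (cachedOpenCodePath !== undefined) return cachedOpenCodePath;
  const found = candidates.find((p) => exists(p));
  if (found === undefined) return null; // misses are not cached, so a later install is picked up
  cachedOpenCodePath = found;
  return found;
}
```

Checking existence avoids the earlier concern that a spawn call may not fail synchronously for a missing executable, at the cost of not verifying that the file is actually runnable.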

* fix: remove stale commands exports and placeholder test

---------

Co-authored-by: Ruben Beuker <rubenbeuker@MacBook-Pro-van-Ruben.local>
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
Co-authored-by: Your Name <you@example.com>
Updated the project description to reflect the open status of the Multi Agent Suite.
mfold111 and others added 9 commits February 14, 2026 01:09
Co-authored-by: Claw <claw@Claws-Virtual-Machine.local>
…unreal#127)

* Fix: Tmux Session Leak - Complete Session Lifecycle Management

## Problem
When background tasks complete, tmux panes remained open and opencode attach
processes became orphaned, accumulating over time.

## Root Cause
Missing session lifecycle management:
1. No session.abort() on task completion
2. No session.abort() on cancellation
3. No session.deleted event handler
4. No graceful shutdown (Ctrl+C before kill)

## Solution
- Add session.abort() in completeTask() (single point of responsibility)
- Add handleSessionDeleted() for cleanup on session deletion
- Add onSessionDeleted() in TmuxSessionManager for pane cleanup
- Implement graceful shutdown: Ctrl+C before kill-pane
- Wire up event handlers in main dispatcher

## Code Quality
- Removed duplicate code in completeTask()
- Eliminated redundant session.abort() calls
- All 43 tests pass

## Documentation
- Updated AGENTS.md with session lifecycle section
- Updated docs/tmux-integration.md with troubleshooting

## Testing
- @explorer and @librarian tasks complete and close panes automatically
- Zero orphaned processes after task completion

Inspired by oh-my-opencode session management implementation.

* fix: Address Greptile review feedback

- Remove redundant resolver deletion in handleSessionDeleted()
- Add session tracking cleanup in completeTask() as fallback
- Prevents memory leak if session.deleted event doesn't fire

Fixes issues identified in latest Greptile review.
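
The "Ctrl+C before kill-pane" step can be illustrated by the tmux commands involved. The sketch below only builds the argument lists; `buildPaneShutdownCommands` is a hypothetical helper, and how the plugin actually runs them (and how long it waits between them) is not shown in this PR page.

```typescript
// Build the tmux invocations for a graceful pane shutdown.
// The caller is expected to run them in order, ideally with a short pause in between.
function buildPaneShutdownCommands(paneId: string): string[][] {
  return [
    // Send Ctrl+C so the attached process can exit cleanly.
    ['tmux', 'send-keys', '-t', paneId, 'C-c'],
    // Then close the pane itself.
    ['tmux', 'kill-pane', '-t', paneId],
  ];
}
```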

* chore: Add .aim/ to .gitignore for AI memory directory

* fix: disable timeout when fallback is disabled

- Remove max(120000) constraint on timeoutMs
- Allow timeoutMs: 0 to disable timeout entirely
- Auto-disable timeout when fallback.enabled = false
- Default remains 15000ms (unchanged)
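
The timeout behavior in this list can be summarized in a few lines. The `FallbackConfig` shape and `resolveTimeoutMs` name are assumptions; only the rules (no upper bound, 0 disables, auto-disable without fallback, 15000 ms default) come from the commit message.

```typescript
// Hypothetical config shape for the fallback settings.
interface FallbackConfig {
  enabled: boolean;
  timeoutMs?: number;
}

const DEFAULT_TIMEOUT_MS = 15000;

// Returns the effective timeout in milliseconds; 0 means "no timeout".
function resolveTimeoutMs(fallback: FallbackConfig): number {
  // Without fallback there is nothing to fail over to, so the timeout is disabled.
  if (!fallback.enabled) return 0;
  // An explicit 0 is preserved (?? only replaces undefined/null), and there is no upper cap.
  return fallback.timeoutMs ?? DEFAULT_TIMEOUT_MS;
}
```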
},
);

return hasExplicitOpenCodeAgent || input.hasOpencodeZen;

WARNING: Dead-code logic — || input.hasOpencodeZen makes hasExplicitOpenCodeAgent irrelevant

Because the function already returns false early on line 97 when !input.hasOpencodeZen, execution only reaches line 110 when input.hasOpencodeZen is true. The || input.hasOpencodeZen term is therefore always true, meaning the function always returns true whenever hasOpencodeZen is set — regardless of whether any agent actually uses an OpenCode model.

If the intent is to return true only when there is an explicit OpenCode agent assignment, the condition should be:

Suggested change
- return hasExplicitOpenCodeAgent || input.hasOpencodeZen;
+ return hasExplicitOpenCodeAgent;

If the intent is to always enable OpenCode free models when hasOpencodeZen is true, the hasExplicitOpenCodeAgent check and the entire some() loop above are dead code and should be removed.
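
To make the first option concrete, here is a simplified sketch of the function with the suggested return. The input shape and the `opencode/` prefix check are assumptions; the real function's agent inspection is only partially visible in the diff.

```typescript
// Hypothetical, simplified input; the real type has more fields.
interface InferInput {
  hasOpencodeZen: boolean;
  agentModels: Record<string, string>; // agent name -> assigned model id
}

function inferOpenCodeFreeEnabled(input: InferInput): boolean {
  // Early return: past this point hasOpencodeZen is always true,
  // which is why "|| input.hasOpencodeZen" at the end would always be true.
  if (!input.hasOpencodeZen) return false;

  const hasExplicitOpenCodeAgent = Object.values(input.agentModels).some((model) =>
    model.startsWith('opencode/'), // assumed way to detect an OpenCode model
  );

  // With the suggested change, the result now depends on the agent assignments.
  return hasExplicitOpenCodeAgent;
}
```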

hasAntigravity: detected.hasAntigravity,
hasChutes: detected.hasChutes ?? false,
hasNanoGpt: hasNanoGptForRun,
hasOpencodeZen: true,

WARNING: hasOpencodeZen is hardcoded to true in refreshConfig, ignoring the actual detected value.

Line 209 correctly passes detected.hasOpencodeZen to inferOpenCodeFreeEnabled, but then refreshConfig unconditionally sets hasOpencodeZen: true. This means filterCatalogToEnabledProviders and buildDynamicModelPlan will always treat the user as having OpenCode Zen, potentially selecting models the user cannot access.

Suggested change
- hasOpencodeZen: true,
+ hasOpencodeZen: detected.hasOpencodeZen,

Daltonganger changed the title from "feat: add /omos command UX, conflict handling, and rollout gates" to "feat: smart routing v3, provider-locked selection, and NanoGPT endpoint policy" on Feb 23, 2026

8 participants