NL tool discovery via local qwen model

Agent asks a question in English; a local 7B model routes it to the right spai command. ~1200 prompt tokens for the full 35-tool catalog, sub-second after prompt cache warmup.

- plugins/spai-search: reads spai help, builds a compact prompt, queries Ollama, returns EDN recommendations
- setup.clj: ollama + qwen2.5-coder:7b check (optional dependency, same pattern as ripgrep)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- `spai search` plugin: natural language → `spai` command recommendation via local Ollama (qwen2.5-coder:7b)
- setup.clj: ollama + model check added as an optional dependency (same pattern as ripgrep)

Agent asks "find class predicates" → the local 7B routes to `spai refs`. ~1200 prompt tokens for the full catalog, sub-second after prompt cache warmup. Zero API cost for tool routing.

Inspired by the MCP token cost analysis, this is the CLI lazy-loading pattern taken to its conclusion: don't load schemas at all; let a local model be the index.
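The routing pattern is small enough to sketch. The actual plugin is Clojure; this is a language-neutral Python illustration with a hypothetical four-tool catalog and hypothetical prompt wording — only the Ollama `/api/generate` endpoint and the model name come from the PR. The idea: one line per tool keeps the whole catalog around ~1200 tokens, and the model answers with a single tool name.

```python
import json
import urllib.request

# Hypothetical compact catalog, one line per tool. The real plugin
# derives this from `spai help` output rather than hard-coding it.
CATALOG = {
    "refs":     "find references to a symbol, class, or predicate",
    "who":      "show which files depend on a given file",
    "changes":  "list recently changed files",
    "hotspots": "rank the biggest / most-churned files",
}

def build_prompt(question: str) -> str:
    """Render the tool catalog plus the user question as one compact prompt."""
    lines = [f"{name}: {desc}" for name, desc in sorted(CATALOG.items())]
    return (
        "You route questions to CLI subcommands. Tools:\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}\nAnswer with the single best tool name."
    )

def route(question: str, url: str = "http://localhost:11434/api/generate") -> str:
    """Ask a local Ollama model to pick a tool (requires a running Ollama)."""
    payload = json.dumps({
        "model": "qwen2.5-coder:7b",
        "prompt": build_prompt(question),
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"].strip()
```

With a warm prompt cache, only the short question varies between calls, which is what makes repeat routing sub-second.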
Test plan
- `spai search "who depends on this file"` → should recommend `spai who`
- `spai search "what changed recently"` → should recommend `spai changes`
- `spai search "biggest files"` → should recommend `spai hotspots`
- `spai setup` on a machine without ollama → shows an optional-dependency warning, doesn't block
- `spai setup` with ollama but no qwen model → offers to pull the model

🤖 Generated with Claude Code
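The optional-dependency behavior in the last two test-plan items (warn, never block) can be sketched as follows. This is a hedged Python illustration of the pattern, not the setup.clj code; the `check_ollama` name and return strings are hypothetical, while `ollama list` and `ollama pull` are real Ollama CLI commands.

```python
import shutil
import subprocess

def check_ollama(model: str = "qwen2.5-coder:7b") -> str:
    """Optional-dependency check, ripgrep-style: warn but never fail setup."""
    if shutil.which("ollama") is None:
        return "warn: ollama not found; spai search disabled (optional)"
    # `ollama list` prints installed models, one per line.
    out = subprocess.run(["ollama", "list"], capture_output=True, text=True)
    if model not in out.stdout:
        return f"warn: model {model} missing; run: ollama pull {model}"
    return "ok"
```

Returning a warning string instead of raising keeps setup non-blocking, matching how the ripgrep check is described.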