feat(init,rgai): auto-install grepai via init, delegate rtk rgai to grepai with silent fallback by heAdz0r · Pull Request #136 · rtk-ai/rtk

heAdz0r · 2026-02-15T20:58:06Z

Summary

Adds optional grepai installation to rtk init --global and integrates rtk rgai delegation to external grepai with silent fallback to built-in keyword search.

User flow after this PR:

$ rtk init --global
RTK hook installed (global).
  ...
Install grepai for semantic code search? [Y/n] y
  Installing grepai...
  grepai installed: ~/.local/bin/grepai
  Run `grepai init` in any project, then `grepai watch --background`.

Then rtk rgai "query" automatically delegates to grepai when available, or falls back to built-in search silently.

Presets & Defaults

| Setting | Default | Where / effect |
|---|---|---|
| Embedding provider | `ollama` | `grepai init --provider ollama` |
| Index backend | `gob` (file-based) | `grepai init --backend gob` |
| `grepai.enabled` | `true` | `~/.config/rtk/config.toml` |
| `grepai.auto_init` | `true` | Auto-init project on first `rtk rgai` |
| `grepai.binary_path` | `None` (auto-detect) | Override binary location |
| Install prompt default | Yes (`[Y/n]`) | Consent during `rtk init` |
| `--builtin` flag | `false` | Force built-in, skip grepai |

Why these defaults: Ollama runs locally (no API keys needed), gob is zero-config (no external DB), auto-init reduces friction. The [Y/n] default (Yes) is intentionally different from the settings.json [y/N] (No) — grepai is low-risk and high-value.

Prerequisites for grepai

rtk rgai works without grepai (built-in keyword engine). For full embedding-based semantic search:

Ollama running locally (ollama serve)
grepai installed (this PR automates it via rtk init)
Project indexed: grepai init && grepai watch --background

If any prerequisite is missing, rtk rgai falls back to built-in search silently (no warnings, no nagging).

Security Compliance (SECURITY.md Workflow)

Layer 1: Automated security-check.yml

Critical-file match: src/init.rs — triggers enhanced review
Dangerous-pattern scan: no eval, no exec, no unsanitized interpolation
No Cargo.toml or CI workflow modifications

Layer 2: Installer security

The install step (grepai::install_grepai) uses safe stdin piping instead of curl | sh:

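// Step 1: Download the installer script into memory (-f: non-zero exit on HTTP errors)
let script = Command::new("curl").args(["-fsSL", URL]).output()?;
// Step 2: Pipe the script to sh via stdin (no shell interpolation)
let mut installer = Command::new("sh")
    .env("INSTALL_DIR", &install_dir)
    .stdin(Stdio::piped())
    .spawn()?;
// `stdin` is an Option<ChildStdin>: take() it so the pipe is closed (EOF) once the write finishes
installer.stdin.take().expect("stdin was piped").write_all(&script.stdout)?;
// Wait for the installer to exit before reporting the result
installer.wait()?;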

-f flag: fail on HTTP errors (no HTML error pages executed)
No sh -c "curl ... | sh" — avoids shell interpolation of URL
INSTALL_DIR passed as env var, not interpolated into command string
All Command::new() calls use absolute binary paths to avoid hook circular rewriting

Layer 3: Manual review

PR touches init.rs (critical file) — requires maintainer sign-off per SECURITY.md
All subprocess invocations are auditable (grep for Command::new in src/grepai.rs)

Implementation Details

New file: `src/grepai.rs`

GrepaiState enum: Ready / NotInitialized / NotInstalled
Binary discovery: PATH → ~/.local/bin → /usr/local/bin (testable via DI)
Install, init, search — all use explicit binary paths
7 unit tests (state detection, fallback chain, priority)
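The discovery order above (PATH, then ~/.local/bin, then /usr/local/bin) with an injectable existence check could look roughly like the following. This is a minimal sketch, not the PR's code: the name `find_grepai`, its parameters, and the closure-based injection are assumptions chosen for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the first candidate location where `exists`
/// reports a grepai binary. `exists` is injected so tests can fake the
/// filesystem instead of touching the real one.
fn find_grepai<F>(path_dirs: &[PathBuf], home: &Path, exists: F) -> Option<PathBuf>
where
    F: Fn(&Path) -> bool,
{
    // Priority order taken from the PR text: PATH entries first,
    // then ~/.local/bin, then /usr/local/bin.
    path_dirs
        .iter()
        .map(|dir| dir.join("grepai"))
        .chain([
            home.join(".local/bin/grepai"),
            PathBuf::from("/usr/local/bin/grepai"),
        ])
        .find(|candidate| exists(candidate.as_path()))
}
```

In production the closure would typically be something like `|p| p.is_file()`; in tests it can be a plain predicate over fake paths, which is one way the "testable via DI" note can be satisfied.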

Modified: `src/config.rs`

GrepaiConfig { enabled, auto_init, binary_path } with serde defaults
1 unit test
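As a rough illustration of the defaults listed under Presets & Defaults, the sketch below mirrors the three config fields using a plain `Default` impl. The PR itself uses serde defaults; this std-only version and the helper `effective_binary` are assumptions made to keep the example self-contained.

```rust
use std::path::PathBuf;

/// Hypothetical mirror of the PR's `GrepaiConfig` (the real struct relies on
/// serde defaults rather than a hand-written `Default`).
#[derive(Debug, Clone, PartialEq)]
struct GrepaiConfig {
    enabled: bool,
    auto_init: bool,
    binary_path: Option<PathBuf>,
}

impl Default for GrepaiConfig {
    fn default() -> Self {
        // Defaults from the PR table: enabled, auto-init on, binary auto-detected.
        Self { enabled: true, auto_init: true, binary_path: None }
    }
}

/// Assumed precedence: an explicit `binary_path` overrides auto-detection.
fn effective_binary(cfg: &GrepaiConfig, detected: Option<PathBuf>) -> Option<PathBuf> {
    cfg.binary_path.clone().or(detected)
}
```

With this shape, a missing `[grepai]` section in `~/.config/rtk/config.toml` would behave the same as the documented defaults.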

Modified: `src/init.rs`

Step 6 in run_default_mode(): setup_grepai(patch_mode, verbose)
prompt_grepai_consent(): [Y/n] default-Yes (non-interactive defaults to Yes)
Follows existing PatchMode semantics: Auto installs silently, Skip prints manual URL, Ask prompts
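The consent rules described above can be sketched as a pure decision function. The enum variants match the names in the PR text, but `parse_consent`, `should_install`, and the exact set of accepted replies are assumptions for illustration only.

```rust
/// Variant names taken from the PR description; the real enum lives in init.rs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PatchMode {
    Ask,
    Auto,
    Skip,
}

/// Interprets a reply to a `[Y/n]` prompt: anything other than an explicit
/// "n"/"no" (case-insensitive) counts as Yes, including empty input.
fn parse_consent(reply: &str) -> bool {
    !matches!(reply.trim().to_ascii_lowercase().as_str(), "n" | "no")
}

/// `reply` is `None` when no terminal is attached (non-interactive run),
/// which the PR says defaults to Yes.
fn should_install(mode: PatchMode, reply: Option<&str>) -> bool {
    match mode {
        PatchMode::Auto => true,  // install silently
        PatchMode::Skip => false, // caller prints the manual URL instead
        PatchMode::Ask => reply.map_or(true, parse_consent),
    }
}
```

Keeping the decision separate from the actual prompt I/O makes the default-Yes behaviour easy to unit test.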

Modified: `src/rgai_cmd.rs`

try_grepai_delegation() before built-in engine
Respects --path by running grepai in target project directory
Auto-init when grepai installed but project not initialized
Silent fallback on any error (verbose mode shows diagnostics)
2 unit tests for resolve_grepai_project_dir
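The "silent fallback, diagnostics only in verbose mode" behaviour might be structured along these lines. This is a sketch under assumptions: `search_with_fallback`, the closure parameters, and the in-memory `log` are hypothetical stand-ins for the real delegation and stderr output.

```rust
/// Hypothetical wrapper: runs `delegate` (the grepai path) first and falls
/// back to `builtin` on any error. A diagnostic is recorded only when
/// `verbose` is set, so the default path stays silent.
fn search_with_fallback<D, B>(
    delegate: D,
    builtin: B,
    verbose: bool,
    log: &mut Vec<String>,
) -> String
where
    D: FnOnce() -> Result<String, String>,
    B: FnOnce() -> String,
{
    match delegate() {
        Ok(output) => output,
        Err(err) => {
            if verbose {
                log.push(format!("grepai delegation failed: {}", err));
            }
            // The built-in engine only runs when delegation did not succeed.
            builtin()
        }
    }
}
```

Because the built-in search is passed as a closure, it is never executed when delegation succeeds, so the fallback costs nothing on the happy path.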

Modified: `src/main.rs`

mod grepai; declaration
--builtin flag on Commands::Rgai

Fixed: ARCHITECTURE.md

Module map updated: 30 → 49 modules (adds all missing modules from Python, Go, analytics, search categories)
Fixes validate CI check (module count mismatch)

Fixed: `.gitignore`

Added .grepai/ and .claude/settings.local.json
Removed accidentally committed local artifacts from tracking

Delegation Flow

rtk rgai "query"
  │
  ├─ --builtin? ──→ built-in keyword search
  │
  ├─ config.grepai.enabled == false? ──→ built-in
  │
  ├─ detect_grepai(project_path)
  │   ├─ NotInstalled ──→ built-in (silent)
  │   ├─ NotInitialized + auto_init ──→ grepai init → search
  │   └─ Ready ──→ grepai search
  │
  └─ grepai error? ──→ built-in (silent fallback)
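The first three branches of the flow above can be expressed as a small pure function. The `GrepaiState` variants come from the PR; the `Plan` enum, the `plan_search` name, and the choice to use the built-in engine when the project is not initialized and `auto_init` is off are assumptions, since the diagram does not cover that case. The final "grepai error" branch happens at run time and is not modelled here.

```rust
/// State names as listed for src/grepai.rs in the PR description.
#[derive(Debug, Clone, Copy, PartialEq)]
enum GrepaiState {
    Ready,
    NotInitialized,
    NotInstalled,
}

/// Hypothetical result type describing what `rtk rgai` should do next.
#[derive(Debug, PartialEq)]
enum Plan {
    Builtin,
    InitThenSearch,
    Search,
}

fn plan_search(force_builtin: bool, enabled: bool, auto_init: bool, state: GrepaiState) -> Plan {
    // --builtin and grepai.enabled = false both short-circuit to the built-in engine.
    if force_builtin || !enabled {
        return Plan::Builtin;
    }
    match state {
        GrepaiState::Ready => Plan::Search,
        GrepaiState::NotInitialized if auto_init => Plan::InitThenSearch,
        // NotInstalled, or NotInitialized without auto_init (assumed): built-in, silently.
        _ => Plan::Builtin,
    }
}
```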

Verification

# 1. All tests pass
cargo fmt --all --check && cargo clippy --all-targets && cargo test
# 369 passed

# 2. rtk init offers grepai install
cargo run -- init --global

# 3. Built-in fallback works
cargo run -- rgai --builtin "token tracking"

# 4. Delegation works (if grepai+ollama available)
cargo run -- rgai "token tracking"

# 5. Graceful fallback when grepai unavailable
# → Falls back to built-in without error

Chain / Dependency Note

Part of the rgai adoption chain. Depends on / aligned with: #124, #125, #127.

Record project_path (cwd) in tracking database and add filtered query methods. `rtk gain -p` shows savings scoped to the current project directory instead of global aggregates.

- tracking.rs: Add project_path column with auto-migration, index, and filtered variants for all query methods (summary, daily, weekly, monthly, recent)
- gain.rs: Add resolve_project_scope(), shorten_path(), scope-aware header, pass project filter to all queries and exports
- main.rs: Add --project/-p flag to Gain command

Backward-compatible: existing rows get empty project_path, unfiltered queries delegate to filtered(None) which returns all data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Address reviewer feedback on PR rtk-ai#128:

1. Replace SQL LIKE with GLOB in all project-scoped queries to prevent `_` and `%` characters in path names from being interpreted as wildcards (e.g., `my_project` matching `myXproject`). GLOB uses `*` for wildcard matching which is safer for file system paths.
2. Guard the startup `UPDATE commands SET project_path = ''` migration with an `EXISTS` check so it only runs when NULL rows actually exist, avoiding a no-op UPDATE on every startup after the first migration.
3. Add `DEFAULT ''` to the ALTER TABLE migration so new installs never create NULL project_path values.
4. Add 3 new unit tests for project_filter_params GLOB behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Rust-native semantic search that scores files and lines by term relevance, symbol definitions, and path matching. No external dependencies (no grepai/embeddings required).

Features:
- Natural-language multi-word queries: rtk rgai "auth token refresh"
- File scoring with symbol definition boost (+2.5) and comment penalty
- Stop word removal + basic stemming for better recall
- Compact and JSON output modes
- File type filtering (--file-type ts/py/rust/etc.)
- gitignore-aware traversal via `ignore` crate
- Binary and large file skipping
- Backward-compat: trailing path token auto-detection

Includes 8 unit tests (5 in rgai_cmd, 3 for arg normalization).

…ment scoring

- stem_token: remove "es" suffix to fix broken stems for -ce/-ge/-ve words (caches→cache, services→service, changes→change instead of cach/servic/chang)
- looks_like_path_token: remove bare contains('/') check that treated "client/server" as a path; now requires actual path prefixes (./ ../ / ~/)
- is_comment_line: make '#' detection extension-aware to avoid penalizing Markdown headers and YAML in non-script files; only applies to py/sh/rb/etc.
- Add tests for all three fixes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
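To make the first two fixes in this commit concrete, here is a minimal sketch of what the corrected behaviour could look like. The function names come from the commit message, but the bodies are assumptions: the real stemmer presumably has more rules, and only the "strip a trailing s, never es" and "path prefix required" behaviours are taken from the text.

```rust
/// Hypothetical stemmer: strips a plural "s" only. Stripping "es" would turn
/// "caches" into "cach", which is the bug the commit describes.
fn stem_token(token: &str) -> String {
    let t = token.to_ascii_lowercase();
    if t.len() > 3 && t.ends_with('s') && !t.ends_with("ss") {
        // The last byte is ASCII 's', so slicing one byte off is safe.
        t[..t.len() - 1].to_string()
    } else {
        t
    }
}

/// A token counts as a path only with an explicit path prefix, so a query
/// word like "client/server" is no longer mistaken for a directory.
fn looks_like_path_token(token: &str) -> bool {
    ["./", "../", "/", "~/"]
        .iter()
        .any(|prefix| token.starts_with(*prefix))
}
```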

# Conflicts: # src/gain.rs

# Conflicts: # src/init.rs

Search priority (mandatory): rgai > rg > grep.

Hook changes:
- Add rewrite rules: grepai/rgai search -> rtk rgai (Tier 1)
- Split rg and grep into separate rules (Tier 2/3)
- Source-of-truth comment for hook sync
- Test infrastructure: HOOK env override, script-relative path

Doc updates (README, INSTALL, TROUBLESHOOTING, awareness template):
- Add search priority section
- Update command tables with rtk rgai examples
- Add search ladder (rgai -> grep -> proxy)
- Remove unverifiable benchmark table

Template updates (init.rs):
- RTK_INSTRUCTIONS: add rtk rgai to Files & Search section
- show_config: display search priority hint
- Tests: assert rtk rgai in top-level commands list

Test fixes:
- Fix pre-existing find/tree/wget test expectations (hook already rewrites them on master, tests incorrectly expected no rewrite)
- Add 7 new hook tests for rgai/grepai rewrite rules

Add comprehensive benchmark suite comparing grep, rtk grep, rtk rgai, and head_n (negative control) for code search tasks.

Key methodology improvements:
- Pinned commit verification (exit 2 if HEAD != gold_standards.json commit)
- Dirty tree detection (exit 3 if uncommitted changes in src/)
- Token-based TE using tiktoken (cl100k_base) instead of byte approximation
- No output truncation (full quality samples preserved)
- head_n negative control baseline for comparison
- Auto-generated gold_auto.json from grep output for objective verification

Benchmark categories:
- A: Exact Identifier (6 queries) - rtk_grep recommended
- B: Regex Pattern (6 queries) - grep/rtk_grep recommended
- C: Semantic Intent (10 queries) - rtk_rgai recommended (100% vs 0% grep)
- D: Cross-File Pattern (5 queries) - rtk_grep recommended
- E: Edge Cases (3 queries)

Key findings:
- rtk rgai excels at semantic/intent queries (cosine similarity)
- rtk grep provides best exact-match with token savings (~30%)
- Recommended: rgai for discovery → grep fallback for precision

`rtk grep -n "pattern" path` failed with "unexpected argument '-n'" because clap didn't recognize -n before the positional arguments. Users naturally place -n before the pattern (muscle memory from grep/rg). The flag is a no-op since grep_cmd::run() already passes -n to ripgrep unconditionally, but clap must accept it. Adds -n/--line-numbers as an explicit bool field to the Grep command and ignores it in the match arm. Test added in grep_cmd::tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Remove .grepai/* and .claude/settings.local.json from tracking
- Add .grepai/ and .claude/settings.local.json to .gitignore
- Update ARCHITECTURE.md module map: 30 → 49 modules (adds cargo, curl, go, python, rgai, grepai, analytics modules)
- Fix cargo fmt on gain.rs (long line split)

Fixes validate CI check (module count mismatch).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Keep our module map (49 modules with curl_cmd, grepai, rgai_cmd, analytics) and discard duplicate PYTHON/GO sections from upstream. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FlorianBruniaux · 2026-02-16T08:39:34Z

Thanks for the contribution. The auto-install grepai + silent fallback concept is interesting.

However, 145 changed files and 12K+ additions is not reviewable. This needs to be significantly reduced in scope.

A few concerns:

Scope: The PR description mentions init changes + rgai delegation, but 145 files suggests much more is bundled in
Dependencies on other PRs: This seems to depend on #124 (feat: add rtk rgai command for semantic code search). Is that correct? If so, #124 should be merged first
Size: Please rebase on latest master, remove any unrelated changes, and keep this focused strictly on the init grepai install + delegation logic

Could you trim this down to just the core feature (init changes + delegation fallback)?

heAdz0r · 2026-02-16T08:50:20Z

Thanks for the review — you're right, the diff is way too large.

Root cause

The branch accumulated commits from 5 other open PRs (#124, #125, #127, #128, #135) that were stacked on top of each other during development. The actual init + rgai delegation changes are only 2 commits / ~8 files / ~680 additions.

Dependency

Yes, this PR depends on #124 (rtk rgai command). The rgai_cmd.rs module and grepai.rs delegation logic require the base rgai command to be present.

Action plan

Please review and merge #124 (feat: add rtk rgai command for semantic code search) first. It's the foundation (3 files, ~1K additions, CI passing)
Also mergeable independently: #135 (fix(grep): accept -n flag for grep/rg compatibility; 2 files, 22 additions) and #128 (feat(gain): add per-project token savings with -p flag; 3 files, 275 additions)
After #124 merges, I'll rebase this PR on fresh master with only the init+delegation commits, bringing it down to ~8 files / ~680 additions
#125 (feat(docs,hooks): enforce rgai-first search policy) and #127 (feat(benchmark): reproducible code-search methodology with rgai/grep strategy) will also be rebased independently after their dependencies are resolved

What this PR actually adds (after cleanup)

| File | Change |
|---|---|
| `src/init.rs` | grepai auto-install step in `rtk init --global` |
| `src/grepai.rs` | grepai binary detection + delegation logic |
| `src/rgai_cmd.rs` | rgai → grepai safe fallback |
| `src/config.rs` | grepai config support |
| `src/main.rs` | command registration |
| `.gitignore` | exclude `.grepai/` artifacts |
| `ARCHITECTURE.md` | updated module map |

I'll force-push the clean rebase as soon as #124 is merged.

heAdz0r · 2026-02-17T07:43:40Z

Closing — agreed with maintainers to keep grepai/rgai activity in my fork (heAdz0r/rtk) and not mix it into upstream for now.

heAdz0r and others added 13 commits February 14, 2026 13:52

feat(init,docs,hooks): enforce rgai-first search policy

b19c7db

docs(readme): remove private benchmark source references

4b0a413

Merge remote-tracking branch 'origin/feat/gain-project-scope'

8dc10c6

# Conflicts: # src/gain.rs

Merge remote-tracking branch 'origin/codex/rgai-priority-init-docs'

77ae9be

# Conflicts: # src/init.rs

feat(grepai): integrate init flow and safe rgai delegation

1cf0a28

heAdz0r changed the title ~~feat(init,rgai): optional grepai install and delegated semantic search fallback~~ feat(init,rgai): auto-install grepai via init, delegate rtk rgai to grepai with silent fallback Feb 15, 2026

Merge upstream/master, resolve ARCHITECTURE.md conflict

b680aac

This was referenced Feb 16, 2026

feat: add rtk rgai command for semantic code search #124

Closed

feat(docs,hooks): enforce rgai-first search policy #125

Closed

pszymkowiak added the pending-problem label Feb 17, 2026

heAdz0r closed this Feb 17, 2026

heAdz0r wants to merge 14 commits into rtk-ai:master from heAdz0r:codex/grepai-init-review-fixes
