feat: Add Knowledge Sync workflow for RAG-ready knowledge index #1778
mbergo wants to merge 1 commit into bmad-code-org:main from
Conversation
📝 Walkthrough

Introduces a new GenAI Knowledge Sync workflow with three sequential steps (discover, index, optimize) to build RAG-ready knowledge indexes from project artifacts. Updates the analyst agent configuration with a KS trigger pointing to the workflow. Adds CLI verbose mode and YAML config caching to the tools layer.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant Workflow as Workflow<br/>(Main Orchestrator)
    participant Step1 as Step 01<br/>(Discover)
    participant Step2 as Step 02<br/>(Index)
    participant Step3 as Step 03<br/>(Optimize)
    User->>Workflow: Trigger KS (Knowledge Sync)
    Workflow->>Step1: Load step-01-discover.md
    Step1->>Step1: Scan artifacts & categorize
    Step1->>User: Present discovery report
    User->>Step1: Confirm & approve
    Step1->>Workflow: Signal completion
    Workflow->>Step2: Load step-02-index.md
    Step2->>Step2: Process artifacts into chunks
    Step2->>Step2: Generate knowledge index
    Step2->>User: Present indexed knowledge
    User->>Step2: Confirm & approve
    Step2->>Workflow: Signal completion
    Workflow->>Step3: Load step-03-optimize.md
    Step3->>Step3: Test retrieval scenarios
    Step3->>Step3: Optimize tags & config
    Step3->>User: Present finalized index
    User->>Step3: Confirm completion
    Step3->>Workflow: Workflow complete
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes

Suggested reviewers
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🟠 Major comments (27)
.github/instructions/*.instructions.md-3-3 (1)
3-3: ⚠️ Potential issue | 🟠 Major
Severity labels are underspecified and non-actionable.
Line 3 defines `NORMAL | IMPROVEMENT | FIX | CRITICAL` but gives no criteria or required response per level, so different reviewers will classify the same issue inconsistently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/instructions/*.instructions.md at line 3: the severity labels string "NORMAL | IMPROVEMENT | FIX | CRITICAL" is underspecified; update that entry so each label has a one-line definition and the expected reviewer action/response (e.g., triage timeframe, required reviewer role, must-block vs advisory, or required follow-up). Specifically, expand "NORMAL", "IMPROVEMENT", "FIX", and "CRITICAL" with acceptance criteria and the concrete action required from the reviewer/assignee (examples and escalation steps optional), and include one or two short examples per level to guide consistent use.

src/bmm/workflows/4-implementation/genai-knowledge-sync/workflow.md-30-35 (1)
30-35: ⚠️ Potential issue | 🟠 Major
Initialization lacks required-field validation before workflow execution.
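As a minimal sketch of the missing guard (the field list mirrors the finding; the function name and config shape are hypothetical, not the actual workflow code):

```javascript
// Hypothetical fail-fast check: reject the run if any required config
// key is missing or empty, naming every offending key in the error.
const REQUIRED_FIELDS = [
  'project_name',
  'output_folder',
  'project_knowledge',
  'communication_language',
  'document_output_language',
];

function validateConfig(config) {
  const missing = REQUIRED_FIELDS.filter(
    (key) => config[key] == null || String(config[key]).trim() === '',
  );
  if (missing.length > 0) {
    throw new Error(`Invalid _bmad/bmm/config.yaml - missing or empty: ${missing.join(', ')}`);
  }
  return config;
}
```

Failing loudly here, before the first step loads, is what turns the "brittle start state" into an actionable config error.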
The workflow assumes `project_name`, `output_folder`, `project_knowledge`, etc. exist, but there's no fail-fast rule for missing/empty values. That's a brittle start state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/workflow.md` around lines 30-35: add a fail-fast validation step immediately after loading config from _bmad/bmm/config.yaml to ensure required fields (project_name, output_folder, project_knowledge, user_name, communication_language, document_output_language, user_skill_level, planning_artifacts, implementation_artifacts) are present and non-empty; if any are missing or empty, raise/return a clear error and stop workflow execution instead of proceeding, and only generate the system date value (date) after successful validation. Ensure the validation logic lives alongside the config-loading routine (the code that resolves project_name/output_folder/project_knowledge/date) and includes descriptive error messages identifying which key is invalid so callers can fail fast and fix the config.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md-85-88 (1)
85-88: ⚠️ Potential issue | 🟠 Major
"Read the complete source file" for every artifact is unbounded and non-scalable.
Large repositories will hit context/performance limits. Add bounded extraction rules (size caps, prioritized sections, sampling, or staged reads).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md` around lines 85-88: the step "Read the complete source file" is unbounded; update the text in step-02-index.md to replace that requirement with bounded extraction rules: enforce a per-file size cap (e.g., max bytes/lines), prioritize sections (headers, function definitions, docs) for full extraction, apply sampling for very large files, and support staged reads (full read only for prioritized/high-value files). Mention concrete policies (size limits, priority order, sampling rate) and reference the step content bullets ("Read the complete source file", "Identify distinct knowledge units", "Create one chunk per knowledge unit") so the extraction flow uses these bounds when creating chunks and tagging for retrieval.

tools/cli/lib/config.js-31-33 (1)
31-33: ⚠️ Potential issue | 🟠 Major
Cached object is returned by reference, enabling accidental cache mutation.
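A defensive-copy sketch of the cache-hit path (the `{ data, mtime }` entry shape is inferred from the finding; `structuredClone` is a global in Node 17+, with a JSON round-trip as fallback):

```javascript
// Return a deep copy on cache hits so callers cannot mutate shared state.
function cloneData(data) {
  return typeof structuredClone === 'function'
    ? structuredClone(data)
    : JSON.parse(JSON.stringify(data));
}

// Hypothetical cache-hit path, mirroring the { data, mtime } entry shape:
function getCached(cache, resolved, mtime) {
  const cached = cache.get(resolved);
  if (cached && cached.mtime === mtime) {
    return cloneData(cached.data); // a copy, not the internal reference
  }
  return null; // miss: caller re-reads and re-populates
}
```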
Callers can mutate `cached.data`, and every future caller gets polluted state from Line 32.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/cli/lib/config.js` around lines 31-33: the cached object is returned by reference (cached.data), allowing callers to mutate the cached state; update the cache hit path in the module (the check using cached and cached.mtime === mtime) to return a safe clone of cached.data instead of the original reference (use structuredClone if available, with a JSON.parse(JSON.stringify(...)) fallback) so callers receive an immutable copy and the internal cache cannot be polluted.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md-15-15 (1)
15-15: ⚠️ Potential issue | 🟠 Major
"Show your analysis before taking any action" is unsafe as an execution rule.
This instruction pushes internal reasoning disclosure instead of controlled outputs and can lead to inconsistent behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md` at line 15: the line "🎯 Show your analysis before taking any action" in step-02-index.md is unsafe; remove or replace it with a safe execution rule that requests only a brief, actionable summary or plan (e.g., "Provide a concise plan or next steps without revealing internal chain-of-thought") so the document no longer instructs disclosure of internal reasoning; update the step content accordingly to reference the revised wording.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md-5-5 (1)
5-5: ⚠️ Potential issue | 🟠 Major
The step has a hard instruction conflict.
Line 5 says “NEVER generate content without user input,” while Line 24 says “This step will generate content.” One of these needs to be scoped or rewritten, or the agent gets contradictory directives.
Also applies to: 24-24
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md` at line 5: resolve the contradictory hard instructions by reconciling the two phrases: change the absolute "NEVER generate content without user input" to a scoped rule (e.g., "Do not generate content unless explicit user input or consent is present") and/or rewrite "This step will generate content" to "This step may generate content only after explicit user input/consent"; update the occurrences of the exact strings "NEVER generate content without user input" and "This step will generate content" in step-02-index.md so the step clearly requires explicit user input or consent before any generation occurs.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md-15-15 (1)
15-15: ⚠️ Potential issue | 🟠 Major
This step also hardcodes analysis disclosure as a rule.
Line 15 should not require exposing internal analysis before actions; constrain outputs to user-facing summaries/findings instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md` at line 15: replace the hardcoded directive "🎯 Show your analysis before taking any action" in the step-01-discover.md content with a requirement that outputs be limited to user-facing summaries/findings and explicit prohibition of exposing internal chain-of-thought; update the step's wording to something like "Provide a concise, user-facing summary of findings and recommended next steps (do not include internal analysis or chain-of-thought)" so the step enforces withholding internal analysis while still requiring clear actionable summaries.

src/bmm/workflows/4-implementation/genai-knowledge-sync/knowledge-index-template.md-1-10 (1)
1-10: ⚠️ Potential issue | 🟠 Major
Template placeholders do not align with declared frontmatter keys.
Frontmatter defines `total_chunks`/`sources_indexed`, but the summary uses `{{total_count}}`/`{{source_count}}` (and `{{ref_count}}`). This will produce unresolved tokens unless another layer remaps keys.

Also applies to: 20-23
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/knowledge-index-template.md` around lines 1-10: the template's Handlebars-like placeholders ({{total_count}}, {{source_count}}, {{ref_count}}) do not match the frontmatter keys (total_chunks, sources_indexed, tag_vocabulary_size), causing unresolved tokens; fix by making the keys and placeholders consistent: either rename the frontmatter keys to total_count, source_count, ref_count or update all placeholders to use {{total_chunks}}, {{sources_indexed}}, {{tag_vocabulary_size}} (and update the summary occurrences referenced around lines 20-23) so the template and frontmatter use the exact same identifiers (search for the placeholders and the frontmatter block to ensure all instances are updated).

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md-37-37 (1)
37-37: ⚠️ Potential issue | 🟠 Major
The recursive `{project-root}/**/knowledge-index.md` scan is too broad and costly.

Without exclusions, this can traverse large irrelevant trees and degrade performance. Scope the search roots and exclude generated/vendor folders.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md` at line 37: the doc currently instructs scanning `{project-root}/**/knowledge-index.md`, which is too broad; update the guidance so the search is limited to specific roots (e.g., known source directories) and add exclusions for generated/vendor folders (node_modules, vendor, dist, build, .git, .venv, etc.). Replace the broad pattern with scoped patterns and/or recommend using glob exclusions or a configurable exclude list and a max-depth limit when searching for `{project_knowledge}/knowledge-index.md` and `{project-root}/**/knowledge-index.md` to avoid traversing large irrelevant trees.

src/bmm/workflows/4-implementation/genai-knowledge-sync/workflow.md-33-37 (1)
33-37: ⚠️ Potential issue | 🟠 Major
Language directives are conflicting and can produce wrong output language.
You load both `communication_language` and `document_output_language`, but Line 36 mandates output using only `communication_language`. Clarify precedence for user chat vs generated document content.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/workflow.md` around lines 33-37: the workflow currently loads both communication_language and document_output_language but contains a conflicting directive forcing all output to communication_language; update the text to clarify precedence: state that communication_language is used for Agent/chat communication (messages, prompts, inline commentary) while document_output_language is used for generated artifact content (reports, docs in planning_artifacts/implementation_artifacts/project_knowledge); also add a clear fallback rule (use document_output_language when present, otherwise fall back to communication_language) and mention user_skill_level remains for tone/complexity, so references to communication_language in the "YOU MUST ALWAYS SPEAK OUTPUT" line should be revised to specify "Agent communications use {communication_language}; generated documents use {document_output_language} (fallback to {communication_language})."

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md-48-58 (1)
48-58: ⚠️ Potential issue | 🟠 Major
Chunking rules omit explicit size/token limits, so retrieval quality will vary wildly.
You define principles, but no hard max size (tokens/chars) and no split policy when limits are exceeded.
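A deterministic split policy with a hard cap could be as simple as this sketch (the word cap and paragraph-first splitting are illustrative values, not from the step):

```javascript
// Split text into chunks at paragraph boundaries, enforcing a hard
// per-chunk word cap; an oversized paragraph falls back to a hard cut.
function chunkText(text, maxWords = 400) {
  const chunks = [];
  let current = [];
  const flush = () => {
    if (current.length) chunks.push(current.join(' '));
    current = [];
  };
  for (const para of text.split(/\n{2,}/)) {
    const words = para.split(/\s+/).filter(Boolean);
    if (current.length + words.length > maxWords) flush();
    if (words.length > maxWords) {
      // Hard cut: this paragraph alone exceeds the cap.
      for (let i = 0; i < words.length; i += maxWords) {
        chunks.push(words.slice(i, i + maxWords).join(' '));
      }
    } else {
      current.push(...words);
    }
  }
  flush();
  return chunks;
}
```

The fixed order of operations (paragraph boundary first, hard cut only on overflow) is what makes two runs over the same artifact produce the same chunks.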
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md` around lines 48-58: the CHUNKING PRINCIPLES section (Chunk Design Rules) lacks concrete size and split policies; add explicit limits and a deterministic split strategy: specify a max token count (e.g., max_tokens_per_chunk = 500-1000), a character fallback (e.g., 3-5k chars), and an overlap window (e.g., 50-100 tokens) to preserve context; define a split policy for overflow (prefer sentence/paragraph boundaries, then semantic delimiter heuristics, then hard token cut) and tie this into the metadata fields (category, priority, source path, semantic tags) so each chunk includes its byte/token range and source section reference for traceability; update the CHUNKING PRINCIPLES/Chunk Design Rules text to state these concrete values and the order of operations for splitting and deduplication.

tools/cli/lib/config.js-27-33 (1)
27-33: ⚠️ Potential issue | 🟠 Major
mtime-only cache validation is fragile for rapid consecutive writes.
Using strict `mtimeMs` equality can serve stale data when filesystem timestamp granularity is coarse or writes happen within the same tick.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/cli/lib/config.js` around lines 27-33: the cache invalidation currently relies on strict equality of stat.mtimeMs (variables: resolved, mtime, stat.mtimeMs, this.#cache, cached.mtime), which is fragile for rapid writes; update the logic to store and compare at least two stable attributes (e.g., mtimeMs and stat.size) in the cache entry and treat the cache as valid only if both match, or if stronger guarantees are needed compute a quick content hash when populating the cache and compare that on lookup; adjust where you set and read this.#cache (ensure you save cached.size or cached.hash alongside cached.mtime) and use the combined comparison instead of strict mtime equality.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md-95-109 (1)
95-109: ⚠️ Potential issue | 🟠 Major
Source scanning lacks exclusion/privacy boundaries.
This step asks to mine source patterns but doesn’t exclude directories like vendored/generated/cache artifacts or secret-bearing files. Add explicit denylist/exclusion rules before indexing.
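The kind of gate the step could mandate, sketched minimally (the directory names and secret-file patterns are common examples, not a vetted list):

```javascript
// Refuse to index paths under vendored/generated trees or files that
// typically carry secrets; callers skip 'excluded' and abort on 'secret'.
const DENIED_DIRS = /(^|\/)(node_modules|vendor|dist|build|\.git|cache|tmp)(\/|$)/;
const SECRET_FILES = /(^|\/)(\.env[^/]*|[^/]*\.(key|pem))$/;

function classifyPath(relPath) {
  if (SECRET_FILES.test(relPath)) return 'secret';   // abort indexing
  if (DENIED_DIRS.test(relPath)) return 'excluded';  // silently skip
  return 'indexable';
}
```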
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md` around lines 95-109: update the "### 5. Scan Source Code for Patterns" step to add an explicit denylist/exclusion policy that runs before any indexing of "Configuration Files" or "Key Source Files"; list common exclusions (e.g., node_modules, .git, vendor/, build/, dist/, generated/, cache/, tmp/, lockfiles, and secret-bearing files like .env, *.key, *.pem) and describe enforcing it in the scanner logic so those paths are skipped, plus include a short note on how to add project-specific exclusions and a privacy boundary check that aborts indexing if secrets are detected.

.github/instructions/*.instructions.md-2-2 (1)
2-2: ⚠️ Potential issue | 🟠 Major
This rule can bypass review/audit controls for post-review changes.
Line 2 currently mandates “never create PRs” and pushes direct fix+commit flow. That can skip required approvals for non-trivial or risky post-review edits.
Suggested wording to keep governance safe
```diff
-* Never creates PRs for altering code after review. Always offer a fix and the option to commit.
+* For post-review changes, propose a concrete fix first. If the change is non-trivial, risky, or policy-sensitive, require a PR; otherwise offer the option to commit directly.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/instructions/*.instructions.md at line 2: the current rule text "Never creates PRs for altering code after review. Always offer a fix and the option to commit." can bypass approvals; update the sentence to allow creating a PR for post-review changes that are non-trivial or require governance, e.g. replace it with guidance that prefers offering a direct fix and optional commit for trivial/low-risk changes but mandates creating a PR and following normal review/approval workflows for anything non-trivial, security-sensitive, or policy-impacting so reviewers retain audit controls.

tools/cli/commands/status.js-56-70 (1)
56-70: ⚠️ Potential issue | 🟠 Major
Per-module workflow reporting conflicts with BMAD artifact scope.
This code implies workflows are module-scoped, but BMAD’s artifact model treats workflows/tasks as global cross-cutting assets. The output can mislead users about missing workflows.
Based on learnings: “workflows and tasks are cross-cutting methodology artifacts that should always be installed globally... Only agents are module-scoped.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/cli/commands/status.js` around lines 56-70: the per-module verbose loop incorrectly treats workflows as module-scoped; remove the per-module workflow glob and reporting from the loop (the block using moduleName/modDir and the workflows variable) so only agents are counted per module (keep agents = await glob(...) and the prompts.log.info message updated to report only agents for moduleName); instead ensure global workflow/task reporting is done elsewhere (outside this loop) by using a single global glob for workflows/tasks when generating overall status. Reference symbols: options.verbose, modules loop, moduleName, modDir, agents, workflows, prompts.log.info.

tools/cli/lib/config.js-37-38 (1)
37-38: ⚠️ Potential issue | 🟠 Major
Cache coherence is incomplete because writes don't evict cache entries explicitly.
`saveYaml` and `processConfig` can update files, but cache eviction relies solely on later mtime checks. Add explicit invalidation on write paths for deterministic freshness.

Suggested fix (cache eviction on write)
```diff
 class Config {
   /** @type {Map<string, { data: Object, mtime: number }>} */
   #cache = new Map();

+  #invalidateCacheFor(configPath) {
+    this.#cache.delete(path.resolve(configPath));
+  }
+
   async saveYaml(configPath, config) {
     @@
     await fs.writeFile(configPath, content, 'utf8');
+    this.#invalidateCacheFor(configPath);
   }

   async processConfig(configPath, replacements = {}) {
     @@
     await fs.writeFile(configPath, content, 'utf8');
+    this.#invalidateCacheFor(configPath);
   }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/cli/lib/config.js` around lines 37-38: cache entries are not evicted when files are written, causing stale reads; update the write paths by explicitly invalidating the cache key used in the reader (the same resolved path used with this.#cache.set) after successful writes in saveYaml and processConfig (or any other file write functions), e.g., call this.#cache.delete(resolved) or update the entry with the new {data, mtime} so subsequent reads see the fresh content; ensure you use the exact resolved path variable/name that the reader uses to identify the cache entry.

tools/cli/commands/status.js-67-69 (1)
67-69: ⚠️ Potential issue | 🟠 Major
Workflow counts are significantly inflated by the glob pattern.
The glob pattern `workflows/**/*.{yaml,yml,md}` counts 167 files, but only 18 are actual workflow definitions (named `workflow.md`, `workflow.yaml`, or `workflow.yml`). The remaining 149 are step documentation, templates (`.template.md`), checklists, instructions, and other supporting materials. The status command will report inflated counts that misrepresent the actual number of workflows.

Use the pattern `workflows/**/workflow.{md,yaml,yml}` to count only workflow definition files.
Verify each finding against the current code and only fix it if needed. In `@tools/cli/commands/status.js` around lines 67-69: the current glob call that assigns workflows uses the pattern 'workflows/**/*.{yaml,yml,md}', which inflates counts by including non-definition files; update the glob pattern in the assignment to use 'workflows/**/workflow.{md,yaml,yml}' so that the workflows variable only counts real workflow definition files, and then retain the existing log line that reports Module "${moduleName}": ${agents.length} agent(s), ${workflows.length} workflow(s) to reflect the corrected count.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-86-101 (1)
86-101: ⚠️ Potential issue | 🟠 Major
No concrete mechanism to execute retrieval simulation.
The instruction says "Simulate these common developer queries against the knowledge index" but provides:
- No algorithm for how to match queries against chunk tags/content
- No scoring or ranking mechanism to determine "would be retrieved"
- No threshold for what constitutes a successful match
Without execution logic, this becomes a subjective manual review rather than systematic testing. Agents can't follow this instruction consistently.
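To make the gap concrete, here is one minimal way such a simulation could be made deterministic: tag-overlap scoring with a priority tiebreak and a top-k cut (all thresholds are illustrative, and the chunk shape is assumed):

```javascript
// Rank chunks for a query by (critical priority, tag-overlap count),
// requiring at least `minMatches` overlapping tags to count as retrieved.
function simulateRetrieval(queryTerms, chunks, { minMatches = 2, topK = 5 } = {}) {
  const terms = new Set(queryTerms.map((t) => t.toLowerCase()));
  return chunks
    .map((chunk) => ({
      id: chunk.id,
      score: chunk.tags.filter((tag) => terms.has(tag.toLowerCase())).length,
      critical: chunk.priority === 'critical',
    }))
    .filter((c) => c.score >= minMatches)
    .sort((a, b) => Number(b.critical) - Number(a.critical) || b.score - a.score)
    .slice(0, topK);
}
```

Any fixed rule of this shape would do; the point is that two agents running the same query against the same index must get the same retrieved set.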
🔧 Proposed simulation algorithm
**Simulation Algorithm:** For each test query:

1. Extract key terms and implied categories from the query
2. Match against chunk tags using keyword overlap (minimum 2 tag matches for retrieval)
3. Match against chunk content using semantic similarity (if available) or keyword search
4. Rank retrieved chunks by: (critical priority) > (tag match count) > (content relevance)
5. Select top 3-5 chunks as "would be retrieved"
6. Manually review if retrieved set answers the query intent

**Report per query:**

- Retrieved chunks: [CHUNK-IDs and match scores]
- Missing chunks: Chunks that should match but didn't (explain why)
- False positives: Chunks retrieved incorrectly (explain why)
- Tag adjustments: Specific tags to add/remove

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 86-101: the "Test Queries" section lacks a concrete retrieval simulation; implement the proposed "Simulation Algorithm" into step-03-optimize by defining: a) query processing steps (extract key terms and implied categories), b) matching rules (match chunk tags by keyword overlap with a minimum of 2 tag matches and/or content matching via keyword or semantic similarity), c) scoring/ranking (priority weight: critical flag > tag match count > content relevance), and d) selection/thresholds (select top 3-5 chunks above a configurable score threshold); update the instructions to require reporting retrieved chunk IDs with match scores, missing chunks, false positives and specific tag adjustments so agents can execute the simulation deterministically.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-1-289 (1)
1-289: ⚠️ Potential issue | 🟠 Major
Missing error recovery and rollback procedures.
The entire workflow assumes a happy path with no guidance for:
- What to do if optimization fails midway (e.g., user rejects tag standardization)
- How to recover from incorrect optimizations
- Whether changes can be rolled back to step-2 output
- How to save partial progress if the user needs to pause
Without error handling, agents and users will be stuck if any step fails or produces unsatisfactory results.
Consider adding a "Troubleshooting & Recovery" section at the end:
## TROUBLESHOOTING & RECOVERY:

**If optimization fails:**

1. Review the specific failure in the section that failed
2. User can request to skip that optimization step and proceed
3. Mark the step as 'partial' in frontmatter instead of 'complete'

**If user rejects optimization:**

1. Document which optimizations were rejected
2. Revert to step-2 output (keep backup before optimization)
3. Offer to retry optimization with different parameters

**Saving partial progress:**

- Save optimization progress after each major section (tag optimization, retrieval config, etc.)
- Allow resumption from last completed section

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 1-289: add a "Troubleshooting & Recovery" section to step-03-optimize.md that documents error recovery, rollback and partial-save procedures: describe how to handle failures during "Tag Enrichment", "Generate Retrieval Configuration", or "Finalize Knowledge Index" (e.g., log the failing section, allow user to skip or retry), require creating a timestamped backup of the step-2 output before any mutation so you can revert to it, update the frontmatter keys used in "Update Frontmatter" (e.g., retrieval_tested, status) to support 'partial' states and a changeset list, and mandate saving progress after each major stage (Tag Optimization, Retrieval Configuration, Finalize Knowledge Index) with clear instructions for resumption and documenting user rejections so the workflow can rollback or re-run specific sections.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-44-56 (1)
44-56: 🛠️ Refactor suggestion | 🟠 Major
No edge case handling for chunks outside target range.
The chunk size analysis identifies problematic chunks but provides no guidance for:
- Atomic chunks that can't be split further but exceed 500 words (e.g., a single complex algorithm)
- Critical context that's under 100 words but complete (e.g., a configuration requirement)
- Trade-offs when splitting reduces semantic coherence
Without fallback strategies, agents may make poor chunking decisions or get stuck.
♻️ Proposed edge case guidance
```diff
 **Chunk Size Analysis:**
 - Identify chunks that are too large (reduce retrieval precision)
 - Identify chunks that are too small (lack sufficient context)
 - Recommend splits or merges for optimal retrieval size
 - Target: Each chunk should be 100-500 words for optimal embedding
+
+**Edge Cases:**
+- If a chunk is atomic and cannot be split but exceeds 500 words, keep it intact and flag with `oversized: true` metadata
+- If a chunk is semantically complete but under 100 words, preserve it if it represents a discrete concept (e.g., a single requirement)
+- Prioritize semantic coherence over strict word count compliance
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 44-56: update the "Chunk Size Analysis" and "Context Sufficiency Check" sections in step-03-optimize.md to add explicit edge-case handling and fallback strategies: for atomic chunks >500 words add guidance to mark them as "atomic" (preserve semantics), generate a concise abstract/summary to create a smaller embedding, and/or flag for human review; for critical chunks <100 words allow merging with neighboring chunks or attach a "context_augment" summary and glossary tags; add trade-off rules when splitting reduces coherence (prefer semantic-preservation, use "merge_with" or "preserve_semantics" metadata, score-based automated split thresholds) and a short list of actions agents should take (mark atomic, auto-summarize, merge, or escalate) so agents won't get stuck.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-60-65 (1)
60-65: 🛠️ Refactor suggestion | 🟠 Major
Missing conflict resolution strategy for tag normalization.
Tag standardization example shows "api-design" and "api-patterns" merging to a "single standard," but doesn't specify:
- How to choose which tag to keep (frequency? user preference? semantic precision?)
- What to do when similar tags have subtly different meanings that shouldn't be merged
- How to handle tag dependencies or hierarchies
Without decision criteria, agents may merge tags incorrectly or prompt the user for every decision, creating decision fatigue.
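One low-fatigue default is frequency-based merging, sketched below (the similarity grouping itself is assumed to come from elsewhere; only canonical-tag selection is shown, and the function name is hypothetical):

```javascript
// Given a group of near-duplicate tags, keep the most frequent one as
// canonical and return a remap table for the rest; only ambiguous
// groups need to be escalated to the user.
function canonicalize(tagGroup, tagCounts) {
  const canonical = [...tagGroup].sort(
    (a, b) => (tagCounts[b] || 0) - (tagCounts[a] || 0),
  )[0];
  const remap = {};
  for (const tag of tagGroup) {
    if (tag !== canonical) remap[tag] = canonical;
  }
  return { canonical, remap };
}
```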
🔀 Proposed normalization strategy
```diff
 **Tag Standardization:**
 - Normalize similar tags (e.g., "api-design" and "api-patterns" → single standard)
+  - Strategy: Keep the most frequently used tag unless user specifies preference
+  - Present conflicts where tags may have distinct meanings for user decision
+  - Example: "api-design" (architectural decisions) vs "api-patterns" (implementation patterns) may both be valid
 - Create a tag vocabulary for the project
 - Apply consistent tag format across all chunks
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 60-65: the "Tag Standardization" section lacks a conflict-resolution policy for merging tags like "api-design" and "api-patterns"; update the doc to define concrete rules: choose the canonical tag by corpus frequency unless a curated "preferred tag" list overrides it, use a semantic-similarity threshold (e.g., embedding/confidence) to auto-merge, maintain a synonyms map and tag-hierarchy so subtly different meanings are not merged (require human confirmation when similarity is borderline), and log all automatic merges and provide an opt-out prompt only for high-ambiguity cases; refer to the "Tag Standardization" heading and example tags "api-design" / "api-patterns" when adding these rules.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-287-289 (1)
287-289: 🛠️ Refactor suggestion | 🟠 Major
No workflow state update or orchestration link.
The "WORKFLOW COMPLETE" section claims this is the final step but doesn't:
- Update workflow status in any configuration file
- Link back to the parent workflow document for context
- Explain how to re-trigger the workflow for maintenance
- Clear any temporary workflow state
This leaves the workflow in an ambiguous state without proper orchestration closure.
🔗 Proposed workflow closure
```diff
 ## WORKFLOW COMPLETE:
-This is the final step of the GenAI Knowledge Sync workflow. The user now has a retrieval-optimized knowledge index that enables AI agents to find and use exactly the project knowledge they need for any implementation task, improving both speed and accuracy of AI-assisted development.
+This is the final step of the GenAI Knowledge Sync workflow.
+
+**Workflow State Update:**
+- Set workflow status to 'complete' in workflow.md frontmatter
+- Clear any temporary workflow state or flags
+- Archive workflow execution logs (if applicable)
+
+**Re-triggering this workflow:**
+To re-sync knowledge when artifacts change, use the KS trigger: `analyst:KS` or load the workflow at `src/bmm/workflows/4-implementation/genai-knowledge-sync/workflow.md`
+
+The user now has a retrieval-optimized knowledge index that enables AI agents to find and use exactly the project knowledge they need for any implementation task, improving both speed and accuracy of AI-assisted development.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 287 - 289, The "WORKFLOW COMPLETE" section in step-03-optimize.md for the GenAI Knowledge Sync workflow lacks orchestration closure; update this markdown to (1) record a workflow status update (e.g., "status: complete" or a link to whatever orchestration/status file your repo uses) referencing the GenAI Knowledge Sync workflow name, (2) add a link back to the parent workflow document for context, (3) add a short "Re-run / maintenance" subsection that explains how to re-trigger the workflow (CLI command, CI job name, or script) and when to re-index, and (4) add a "Cleanup" step that documents clearing temporary state (temp dirs, cache keys, or config flags) so CI/orchestration can mark the run as finished; place these additions under the existing "WORKFLOW COMPLETE" heading and use clear identifiers like "GenAI Knowledge Sync", "WORKFLOW COMPLETE", and "Re-run / maintenance" so reviewers can locate the changes.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-219-235 (1)
219-235: ⚠️ Potential issue | 🟠 Major

Aspirational validation without enforcement.
The "Completion Validation" section presents checkboxes (✅) as if validation is performed, but there's no:
- Logic to verify these conditions
- Mechanism to halt completion if validation fails
- User confirmation that these checks passed
This creates a false impression of rigor. The checklist is descriptive (what should be true) but not prescriptive (how to verify it).
🔍 Proposed validation enforcement
```diff
 ### 7. Completion Validation

-Final checks before completion:
+**Perform these checks before marking complete:**

 **Content Validation:**

-✅ All discovered artifacts indexed into chunks
-✅ Each chunk has proper metadata and source tracing
+Verify:
+- [ ] Chunk count matches artifact count from step-1 catalog
+- [ ] Randomly sample 5 chunks: confirm metadata is complete (Source, Category, Priority, Tags)
+- [ ] Verify no duplicate CHUNK-IDs exist
+- [ ] Confirm retrieval test results are documented
+
+Present validation results to user. Proceed to completion only if user confirms all checks passed.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 219 - 235, The "Completion Validation" checklist in the "Completion Validation" section currently only describes desired outcomes without enforcement; add concrete validation logic that programmatically checks each listed condition (e.g., functions like validateIndexedArtifacts(), validateChunkMetadata(), validateDeduplication(), validateRetrievalScenarios(), validateChunkFormat()), make the main completion flow call these validators and fail/abort completion if any return false or throw, and add a user-facing confirmation/prompt (or a --force flag) before marking completion as passed so the checklist reflects actual verified state; ensure validation results are surfaced in the UI/logs with clear messages for each checklist item.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-20-26 (1)
20-26: ⚠️ Potential issue | 🟠 Major

Missing prerequisite verification mechanism.
The context boundaries claim "All knowledge chunks from step-2 are indexed" and "Cross-references and deduplication are complete," but there's no validation that step-2 actually completed successfully. If step-2 failed or was skipped, this step will proceed with incomplete or missing data.
🛡️ Suggested prerequisite check
Add a verification section before optimization begins:
```diff
 ## CONTEXT BOUNDARIES:

-- All knowledge chunks from step-2 are indexed
-- Cross-references and deduplication are complete
+**Prerequisites Check:**
+
+Before optimizing, verify:
+- ✅ Step-2 status is 'complete' in knowledge index frontmatter
+- ✅ Knowledge index file exists at expected path
+- ✅ Minimum chunk count threshold met (e.g., > 0 chunks indexed)
+
+If any prerequisite fails, halt and prompt user to complete step-2 first.
+
+**Assumed State:**
+- All knowledge chunks from step-2 are indexed
+- Cross-references and deduplication are complete
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 20 - 26, Add a prerequisite verification step at the top of step-03-optimize: before any optimization, assert that step-2 completed by checking the existence and freshness of the knowledge index, cross-reference status, and deduplication flag. Implement high-level checks like verify_index_exists(), validate_cross_references(), and ensure_dedup_complete() (or equivalent metadata checks) and fail-fast with a clear error message if any check is missing or stale so optimization only runs when step-2 state is confirmed.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-122-122 (1)
122-122: ⚠️ Potential issue | 🟠 Major

Vague and unhelpful embedding model recommendation.
"Use an embedding model that handles technical content well" provides no actionable guidance. What makes a model handle technical content well? Options include:
- Code-specific models (e.g., CodeBERT, GraphCodeBERT)
- Domain-adapted models (fine-tuned on technical documentation)
- General-purpose models with large context windows
- Multilingual models for international teams
Without specificity, users can't evaluate or select appropriate models for their RAG pipeline.
```diff
-**Model:** Use an embedding model that handles technical content well
+**Model:** Choose based on your content type:
+  - For code-heavy projects: CodeBERT, GraphCodeBERT, or OpenAI code-davinci-002
+  - For technical documentation: all-MiniLM-L6-v2, instructor-xl, or OpenAI text-embedding-ada-002
+  - For domain-specific terminology: Consider fine-tuning a base model on your documentation corpus
+  - For multilingual teams: multilingual-e5-large or similar
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` at line 122, Replace the vague bullet under the "**Model:**" heading in step-03-optimize.md with concrete, actionable guidance: enumerate specific model types to consider (code-specific models like CodeBERT/GraphCodeBERT, domain-adapted models fine-tuned on technical docs, general-purpose models with large context windows, and multilingual models), list criteria to evaluate them (tokenization fidelity for code, training/fine-tune corpus, context window size, embedding dimension, license/cost), and provide example recommendations for common scenarios (e.g., use CodeBERT/GraphCodeBERT for code-heavy repos, a domain-fine-tuned model for internal API docs, and a large-context general model for long technical manuals) so readers can map their use case to a concrete model choice.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-142-143 (1)
142-143: ⚠️ Potential issue | 🟠 Major

Automatic success flags without validation.
The frontmatter template sets `retrieval_tested: true` and `status: 'complete'` unconditionally, even if:
- Retrieval scenario testing was skipped
- Tests failed or produced poor results
- User rejected the optimization results
This creates false success indicators that downstream systems may rely on.
🛡️ Conditional status setting
**Update Frontmatter:**

```diff
 ---
 project_name: '{{project_name}}'
 user_name: '{{user_name}}'
 date: '{{date}}'
 total_chunks: {{total_count}}
 sources_indexed: {{source_count}}
 tag_vocabulary_size: {{tag_count}}
-retrieval_tested: true
-status: 'complete'
+retrieval_tested: {{retrieval_test_passed}}
+retrieval_test_scenarios: {{test_count}}
+status: 'complete' # Only set if all validation checks passed
 ---
+
+Important: Only set `status: 'complete'` if the user confirms:
+- ✅ All retrieval tests passed
+- ✅ Tag optimization is satisfactory
+- ✅ Chunk quality meets project needs
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 142 - 143, Update the frontmatter generation in step-03-optimize.md to stop hardcoding retrieval_tested: true and status: 'complete'; instead expose and use variables like retrieval_test_passed and retrieval_test_scenarios (or test_count) so the template can emit retrieval_tested: {{retrieval_test_passed}} and retrieval_test_scenarios: {{test_count}}, and only emit status: 'complete' when a combined validation flag (e.g., all_validations_passed) or explicit user confirmation is true; modify any code that populates these template variables to set retrieval_test_passed based on actual test results, set test_count from executed scenarios, and compute all_validations_passed before rendering the status field so downstream systems receive accurate success indicators.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-44-49 (1)
44-49: ⚠️ Potential issue | 🟠 Major

Use tokens, not words, and clarify that chunk size must be tuned per embedding model and use case.
The "100-500 words" claim conflates two separate concerns. Embedding models enforce input limits in tokens, not words (1 token ≈ ¾ word), and this matters because:
Unit mismatch: Modern embedding systems (OpenAI, BERT, Cohere) measure chunks in tokens with hard context limits (e.g., OpenAI text-embedding-3-small is 8,191 tokens max; BERT-like models ~512). Recommending "words" without converting to tokens or clarifying the model's tokenizer makes this guidance unusable.
Industry ranges are narrower and token-based: The "sweet spot" is ~200–400 tokens for general prose, but varies by use case:
- Technical docs: 100–300 tokens
- Procedural/broad-context: 512–1024 tokens
- Factoid/precise lookup: 64–128 tokens
Missing critical constraints:
- The document must specify that chunk size is bounded by the target embedding model's max input tokens.
- Chunk size should be tuned empirically via retrieval metrics (recall/precision), not assumed as a universal "target."
- Overlap (typically 10–20%) is a standard practice but not mentioned.
Reframe this as: "Start with a model-aware baseline (e.g., 300–512 tokens for general prose with 10–20% overlap), then evaluate and adjust based on your retrieval quality on representative queries. Smaller chunks work better for factoid questions; larger chunks preserve broader context."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 44 - 49, Update the "Chunk Size Analysis" guidance in step-03-optimize.md to use tokens instead of words and make it model-aware: replace "100-500 words" with a token-based baseline (e.g., 300–512 tokens for general prose) and state that chunk size must not exceed the target embedding model's max input tokens (reference models like text-embedding-3-small and BERT for examples), recommend typical ranges for different use cases (factoid 64–128, technical 100–300, broad context 512–1024), advise 10–20% overlap as standard, and instruct readers to empirically tune chunk size and overlap using retrieval metrics (precision/recall) on representative queries rather than using a universal fixed value.
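The token-based guidance in this finding can be sketched as a small overlapping chunker. This is an illustrative sketch, not code from the PR: the whitespace split is a deliberate stand-in for the embedding model's real tokenizer, and the function name and defaults are assumptions chosen to match the 300-512 token / 10-20% overlap baseline discussed above.

```python
def chunk_tokens(tokens, chunk_size=400, overlap_pct=0.15):
    """Split a token sequence into overlapping chunks.

    Sizes are in tokens, not words or characters; 0.10-0.20 overlap is a
    common starting range. Both values should be tuned against retrieval
    metrics (recall/precision) on representative queries, and chunk_size
    must stay under the target embedding model's max input tokens.
    """
    overlap = int(chunk_size * overlap_pct)
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

# Whitespace split is a crude stand-in for the model's real tokenizer
# (e.g., a BPE tokenizer) -- swap it in before measuring chunk sizes.
doc_tokens = ("lorem " * 1000).split()
chunks = chunk_tokens(doc_tokens, chunk_size=400, overlap_pct=0.15)
```

Each chunk after the first begins with the last 15% of its predecessor, which is the "boundary insurance" role overlap plays; shrinking `chunk_size` toward 64-128 tokens would suit the factoid-lookup case the finding describes.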
🟡 Minor comments (5)
src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-74-80 (1)
74-80: ⚠️ Potential issue | 🟡 Minor

Undefined template variables in agent output.
The example output uses `{{chunk_count}}`, `{{unique_tag_count}}`, `{{top_tags_by_frequency}}`, and `{{gaps_fixed}}` without explaining:
- Where these values are calculated or stored
- What happens if counts aren't available
- Whether the agent is expected to compute these on-the-fly
If agents follow this template literally without variable population logic, they'll output malformed messages.
Consider adding a calculation instruction before the output template:
**Calculate Metrics:**
- chunk_count: Total chunks in knowledge index
- unique_tag_count: Distinct tags after normalization
- top_tags_by_frequency: Top 5 tags by usage count
- gaps_fixed: Number of chunks that received new tags

**Present Tag Summary:**
"I've optimized the semantic tags across {{chunk_count}} chunks: ...

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 74 - 80, The template in step-03-optimize.md uses undefined variables ({{chunk_count}}, {{unique_tag_count}}, {{top_tags_by_frequency}}, {{gaps_fixed}}); add a short "Calculate Metrics" instruction block before the output template that specifies how to compute and store each value (chunk_count = total chunks in the knowledge index, unique_tag_count = distinct normalized tags, top_tags_by_frequency = top 5 tags by frequency, gaps_fixed = number of chunks that received new tags), describe where to persist them (e.g., as keys in the agent's result object or context), and add fallback behavior when values are unavailable (e.g., show "N/A" or compute on-the-fly); reference the template variables exactly so agents know to populate {{chunk_count}}, {{unique_tag_count}}, {{top_tags_by_frequency}}, and {{gaps_fixed}} before emitting the final "Present Tag Summary" message.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-250-250 (1)
250-250: ⚠️ Potential issue | 🟡 Minor

Undefined test count variable in completion message.
Line 250 references `{{test_count}}` but this document never specifies:
- How many retrieval scenarios were tested (Section 3 shows 5 hardcoded queries, but were more added?)
- Where this count is stored
- What to display if testing was skipped
This will either show a literal `{{test_count}}` or fail to render the message correctly.

Add calculation instruction:
```diff
+**Calculate Final Metrics:**
+- test_count: Number of retrieval scenarios tested (default: 5 from section 3, plus any custom queries)
+
 **🎯 RAG Integration Ready:**
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` at line 250, The template contains an undefined placeholder {{test_count}} in step-03-optimize.md; update the build/render logic to calculate and pass a concrete test_count value (e.g., count of entries in the retrieval test array used in Section 3 or zero/"skipped" fallback) into the template context before rendering, or change the document to use an existing symbol (e.g., the retrievalTests length) and a clear fallback string when tests are not run so the completion message always displays a valid value; locate references to the template variable/test harness around step-03-optimize.md and the code that composes the template context to implement this fix.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-176-218 (1)
176-218: ⚠️ Potential issue | 🟡 Minor

No mechanism to determine user skill level.
The completion summary provides three different outputs for "Expert Mode," "Intermediate Mode," and "Beginner Mode," but the document never explains:
- How the agent determines which skill level to use
- Where this preference is configured
- What to do if skill level is unknown
Without this, agents will either guess, ask the user mid-completion (breaking flow), or always use one default mode.
Consider adding a skill level reference:
```diff
 ### 6. Present Completion Summary

-Based on user skill level, present the completion:
+Based on user skill level from workflow config `{user_skill_level}`, present the completion:
+(If unset, default to Intermediate Mode)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` around lines 176 - 218, Add a clear skill-level selection and fallback for the three presentation branches (Expert Mode, Intermediate Mode, Beginner Mode): introduce a user_skill_level input/config (e.g., "user_skill_level" with values expert|intermediate|beginner|auto) and update the summary-rendering logic that emits the blocks containing "{{chunk_count}}", "{{source_count}}", "{{tag_count}}", and "File saved to: {project_knowledge}/knowledge-index.md" to choose the matching block; if user_skill_level is "auto" or missing, implement a deterministic fallback (prefer "intermediate") or a single inline sentence that asks the user once for their preference before emitting the full block, and document the new config key and the unknown-case behavior in the same step-03-optimize.md text.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-94-94 (1)
94-94: ⚠️ Potential issue | 🟡 Minor

Undefined variable in test query.
Line 94 uses `{{core_feature}}` without defining what it is, how it's populated, or what to do if the project doesn't have an identifiable "core feature." This will either output literal braces or fail silently.

Consider:
```diff
-5. "What are the business rules for {{core_feature}}?" → Should retrieve: requirements + domain chunks
+5. "What are the business rules for [project-specific feature]?" → Should retrieve: requirements + domain chunks
+   (Agent: Replace [project-specific feature] with the primary domain concept from the requirements artifacts)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` at line 94, The test query uses an undefined template variable {{core_feature}} which will render literally or fail; update the workflow so {{core_feature}} is defined (e.g., populate it in earlier step or front-matter) and add a clear fallback when missing (e.g., use a project-wide field like project_core_feature or a default value such as "primary feature" or "project" ); locate the usage in step-03-optimize.md (the test/query string on line with "What are the business rules for {{core_feature}}?") and either replace it with the agreed-upon variable name (project_core_feature) or document/assign core_feature in the prior step/function that prepares template context so the query always receives a defined value.

src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md-123-123 (1)
123-123: ⚠️ Potential issue | 🟡 Minor

Clarify chunk overlap specification with technical justification.
The recommendation of "50-100 characters overlap" lacks critical context:
- Industry standard measures overlap in tokens (not characters) to align with embedding model tokenization and chunk size limits. Specify token count or justify character choice.
- The specific range needs rationale: is this 10–20% overlap as a percentage? A fixed-token baseline? Industry practice typically starts with 10–20% overlap or ~10–20 tokens as boundary insurance.
- No mention that overlap strategy varies by content type: semantic boundary-aware splitting (headings, paragraphs) has different overlap needs than fixed-window splitting, and overlap should scale with chunk size.
Provide concrete justification aligned with your embedding model's tokenization and clarify when and why this range applies.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` at line 123, Update the "**Chunk Overlap:** 50-100 characters overlap between adjacent chunks from the same source" line to specify overlap in tokens (not characters), give a concrete token-based recommendation (e.g., 10–20 tokens or 10–20% of chunk size), and add a short justification: explain that token counts align with embedding model tokenization, that overlap should scale with chunk size (percentage vs fixed tokens), and that semantic-aware splitting (headings/paragraphs) may need less overlap than fixed-window splitting; mention when to prefer fixed-token overlap vs percentage-based overlap and tie the guidance to embedding/model tokenization.
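The character-vs-token mismatch this finding describes can be made concrete with a back-of-the-envelope conversion. The ~4 characters per token ratio used here is an assumption that holds roughly for English prose under common BPE tokenizers, not a property of any specific model; both helper functions are illustrative, not part of the PR.

```python
def chars_to_tokens(n_chars, chars_per_token=4.0):
    """Rough character-to-token conversion. English prose averages about
    4 characters per token under common BPE tokenizers; the real ratio
    depends on the tokenizer and content type (code often differs)."""
    return int(n_chars / chars_per_token)

def overlap_tokens(chunk_size_tokens, overlap_fraction=0.15):
    """Percentage-based overlap, so overlap scales with chunk size
    instead of being a fixed character count."""
    return int(chunk_size_tokens * overlap_fraction)
```

Under this assumption, the document's "50-100 characters" works out to roughly 12-25 tokens, which happens to straddle the 10-20 token baseline the finding recommends, but expressing it in tokens (or as a percentage of chunk size) is what keeps it aligned with the embedding model's actual limits.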
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- .github/instructions/*.instructions.md
- src/bmm/agents/analyst.agent.yaml
- src/bmm/workflows/4-implementation/genai-knowledge-sync/knowledge-index-template.md
- src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md
- src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md
- src/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md
- src/bmm/workflows/4-implementation/genai-knowledge-sync/workflow.md
- tools/cli/commands/status.js
- tools/cli/lib/config.js
This pull request introduces a new "Knowledge Sync" workflow for building a Retrieval-Augmented Generation (RAG)-ready knowledge index from project artifacts, optimized for AI agent retrieval. It adds a new workflow trigger, comprehensive templates, and detailed step-by-step instructions for artifact discovery and knowledge chunking. The changes emphasize collaborative, user-driven processes, strict execution protocols, and robust chunking/indexing standards to ensure high-quality, retrievable knowledge assets.
Knowledge Sync Workflow Integration
Adds a new trigger (`KS`/`knowledge-sync`) to the `analyst.agent.yaml`, enabling automated initiation of the knowledge sync process for building a RAG-ready knowledge index from project artifacts.

Templates and Documentation
Adds a `knowledge-index-template.md` that structures the knowledge index document, including metadata, index summary, knowledge categories, retrieval configuration, and embedding recommendations for optimal AI retrieval.

Knowledge Sync Workflow Steps
- `step-01-discover.md` outlines the artifact discovery and cataloging process, including mandatory execution rules, discovery protocols, artifact classification, and user-facing summary/reporting.
- `step-02-index.md` details the knowledge indexing and chunking process, with rules for chunk design, metadata tagging, deduplication, cross-referencing, and structured output for the knowledge index document.
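Several of the review findings above (duplicate CHUNK-IDs, incomplete metadata, completion checkboxes with no enforcement) reduce to the same mechanical check. A minimal sketch of what that enforcement could look like, assuming a chunk record that carries the Source/Category/Priority/Tags metadata the workflow steps describe; the `Chunk` class and field names here are hypothetical, not an actual BMAD data structure:

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """Hypothetical record for one indexed knowledge chunk."""
    chunk_id: str
    source: str
    category: str
    priority: str
    tags: list = field(default_factory=list)

def validate_index(chunks):
    """Return a list of problems instead of silently passing, so a
    completion step can halt (or surface results to the user) when the
    index is not actually valid."""
    problems = []
    seen = set()
    for c in chunks:
        if c.chunk_id in seen:
            problems.append(f"duplicate CHUNK-ID: {c.chunk_id}")
        seen.add(c.chunk_id)
        if not all([c.source, c.category, c.priority, c.tags]):
            problems.append(f"incomplete metadata: {c.chunk_id}")
    return problems
```

Turning the checklist into a function like this is what moves the doc from descriptive ("these things should be true") to prescriptive ("completion fails unless `validate_index` returns an empty list"), which is the gap the Major findings call out.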