Add AGENTS.md review artifacts and Stripe-informed exec plan #14
WellDunDun merged 10 commits into master
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…gistry steps (WS6) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…b-checks (WS1-WS5)
- WS1: scoreAgentWorkflowBlueprints awards a point for agent commands or a session orchestrator
- WS2: scoreConditionalContext awards a point for glob-based rules or hierarchical context files
- WS3: scoreToolRegistry awards a point for MCP config or a skills manifest
- WS4: Updated maturity level thresholds and names (Manual/Inloop/Guided Outloop/Full Outloop/Zero Touch)
- WS5: detectZteSignals adds findings for auto-merge and deploy-on-merge CI patterns
- Added schema_version "2.0" to AuditResult; max_score raised from 18 to 21
- New context signals: hasAgentCommands, hasSessionOrchestrator, hasMcpConfig, hasGlobBasedRules, hierarchicalAgentContextCount, hasSkillsManifest
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
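The WS1 and WS3 sub-checks each award a single point when either of two signals is present. A minimal sketch of that pattern follows. The signal names come from the commit message; the simplified `DimensionScore` and `AgentSignals` types, the `awardOnePoint` helper, and the finding strings are illustrative assumptions, not the code in `scoring.ts`.

```typescript
// Simplified stand-ins for the real AuditResult / AuditRuntimeContext types (assumed shapes).
interface DimensionScore {
  score: number;
  findings: string[];
}

interface AgentSignals {
  hasAgentCommands: boolean;
  hasSessionOrchestrator: boolean;
  hasMcpConfig: boolean;
  hasSkillsManifest: boolean;
}

// Hypothetical helper: record every signal that fired, but add at most one point.
function awardOnePoint(dim: DimensionScore, signals: Array<[boolean, string]>): void {
  const fired = signals.filter(([present]) => present);
  for (const [, message] of fired) {
    dim.findings.push(message);
  }
  if (fired.length > 0) {
    dim.score++;
  }
}

// WS1: agent commands or a session orchestrator (finding strings are made up).
function scoreAgentWorkflowBlueprints(dim: DimensionScore, ctx: AgentSignals): void {
  awardOnePoint(dim, [
    [ctx.hasAgentCommands, "Agent commands detected"],
    [ctx.hasSessionOrchestrator, "Session orchestrator detected"],
  ]);
}

// WS3: MCP config or a skills manifest (finding strings are made up).
function scoreToolRegistry(dim: DimensionScore, ctx: AgentSignals): void {
  awardOnePoint(dim, [
    [ctx.hasMcpConfig, "MCP config detected"],
    [ctx.hasSkillsManifest, "Skills manifest detected"],
  ]);
}
```

Three sub-checks (WS1-WS3) worth one point each would account for the max_score change from 18 to 21.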
detectZteSignals was appending findings per-file instead of per-signal, causing repos with many CI workflows (e.g., langchainjs with 8+ YAML files) to get duplicate "deploy-on-merge" findings. Now collects unique signals first, then pushes each finding once. Caught by wild-repo validation against langchainjs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
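The per-signal fix described above can be sketched as a two-pass scan: first record which signals exist across all workflow files, then push one finding per signal. This is a sketch under assumptions: the function name `detectZteFindings`, the simplified detection patterns, and the finding strings are hypothetical, and only the collect-then-push structure reflects the commit message.

```typescript
// Hypothetical finding messages; the real strings in scoring.ts may differ.
const AUTO_MERGE_FINDING = "ZTE signal: auto-merge configured in CI";
const DEPLOY_ON_MERGE_FINDING = "ZTE signal: deploy-on-merge configured in CI";

function detectZteFindings(workflowContents: string[]): string[] {
  let hasAutoMerge = false;
  let hasDeployOnMerge = false;

  // Pass 1: scan every workflow file and only record which signals exist.
  for (const content of workflowContents) {
    if (/auto-?merge/i.test(content)) {
      hasAutoMerge = true; // assumed, simplified pattern
    }
    if (/deploy/i.test(content)) {
      hasDeployOnMerge = true; // assumed, simplified pattern
    }
  }

  // Pass 2: push each finding once, no matter how many files matched.
  const findings: string[] = [];
  if (hasAutoMerge) {
    findings.push(AUTO_MERGE_FINDING);
  }
  if (hasDeployOnMerge) {
    findings.push(DEPLOY_ON_MERGE_FINDING);
  }
  return findings;
}
```

With eight workflow files that all mention a deploy step, this returns a single deploy-on-merge finding rather than eight.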
Includes PDF evaluation, review markdown, execution plan for the Stripe-informed next iteration, and session PRD memory files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Internal work session files should not live in the open-source repo. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
📝 Walkthrough
Increases audit max score to 21 and adds …
Sequence Diagram
sequenceDiagram
participant User as "CLI / Caller"
participant Audit as "Audit Orchestrator"
participant Context as "Context Detector"
participant Scoring as "Scoring Engine"
participant Resolver as "Maturity Resolver"
User->>Audit: run audit(projectDir)
Audit->>Context: buildAuditRuntimeContext(projectDir)
Context->>Context: inspect files (.claude, .cursor, AGENTS.md, CLAUDE.md, skills.json, mcp.json, rules)
Context-->>Audit: AuditRuntimeContext {hasAgentCommands, hasSessionOrchestrator, hasMcpConfig, hasGlobBasedRules, hierarchicalAgentContextCount, hasSkillsManifest}
Audit->>Scoring: scoreRepositoryKnowledge(ctx)
Scoring->>Scoring: scoreConditionalContext(ctx)
Scoring-->>Audit: repository_knowledge score
Audit->>Scoring: scoreAgentLegibility(ctx)
Scoring->>Scoring: scoreToolRegistry(ctx)
Scoring-->>Audit: agent_legibility score
Audit->>Scoring: scoreAgentWorkflow(ctx)
Scoring->>Scoring: scoreAgentWorkflowBlueprints(ctx)
Scoring->>Scoring: detectZteSignals(workflow files)
Scoring-->>Audit: agent_workflow score
Audit->>Resolver: resolveMaturityLevel(total_score)
Resolver-->>Audit: maturity level (L0-L4)
Audit-->>User: AuditResult {scores, max_score:21, schema_version:"2.0", level}
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~55 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/design-docs/ecosystem-positioning.md (1)
48-49: ⚠️ Potential issue | 🟡 Minor
Update the audit score range to match the current schema.
This still says 0-18, but the PR updates audit max score to 21. Please sync this doc to avoid stale guidance for agents/users.
Suggested doc fix
-- `reins audit` — score the repo on 6 dimensions (0-18)
+- `reins audit` — score the repo on 6 dimensions (0-21)
Based on learnings: cli/reins is the product engine and the only source of truth for readiness scoring and JSON outputs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/design-docs/ecosystem-positioning.md` around lines 48 - 49, Update the documentation line that describes the `reins audit` score range to reflect the new schema: change the current "0-18" range to "0-21" so the `reins audit` entry correctly states "score the repo on 6 dimensions (0-21)"; ensure the `reins audit` text is the one edited to match the CLI's source-of-truth output.
🧹 Nitpick comments (2)
docs/design-docs/ecosystem-positioning.md (1)
105-105: Consider tightening wording in this sentence.
“exactly” reads slightly overstated here; replacing it with a less absolute term would improve tone and precision.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/design-docs/ecosystem-positioning.md` at line 105, Replace the word "exactly" in the sentence "The structural foundation Reins provides at Layer 2 is exactly what systems like Minions depend on to operate effectively at the layers above." with a less absolute term (e.g., "fundamentally", "largely", or "substantially") to soften tone and improve precision; update the sentence in the same paragraph so it reads with the chosen substitute while keeping the surrounding phrasing intact.
cli/reins/src/index.test.ts (1)
264-271: Add schema_version to the audit response contract test.
Since schema_version is now part of AuditResult, asserting it here will prevent silent contract regressions.
Proposed test update
 expect(result).toHaveProperty("project");
+expect(result).toHaveProperty("schema_version");
 expect(result).toHaveProperty("timestamp");
 expect(result).toHaveProperty("scores");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/reins/src/index.test.ts` around lines 264 - 271, The test currently asserts AuditResult fields but misses the new schema_version field; update the test in index.test.ts (the block using expect(result)... and the existing expect(result.max_score).toBe(21) assertion) to also assert the presence of schema_version by adding an assertion like expect(result).toHaveProperty("schema_version"); and optionally assert its expected value if there is a known constant (e.g., compare to AuditResult.schema_version or the expected literal) to prevent silent contract regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli/reins/src/lib/audit/scoring.ts`:
- Around line 447-449: The current regex in scoring.ts that sets
hasDeployOnMerge matches push-to-main workflows even when no deployment action
exists; update the check that sets hasDeployOnMerge so it only becomes true when
both a push-to-main trigger and an explicit deploy step/signal are present
(e.g., require the content to match a push:.*branches:.*main pattern AND also
match deploy-related keywords/steps), or alternatively limit to merge-triggered
deployment patterns only (keep the existing merge.*deploy|deploy.*merge branch).
Modify the logic around the if that assigns hasDeployOnMerge to perform two
separate tests (one for push-to-main trigger and one for deploy action) and
combine them with &&, referencing the existing variable hasDeployOnMerge and the
regex used currently.
In `@docs/exec-plans/active/next-iteration-stripe-informed.md`:
- Around line 225-235: The fenced code block starting with triple backticks and
containing the "Phase 1/2/3" plan should include a language tag to satisfy
markdownlint MD040; update the opening fence from ``` to ```text so the block is
tagged as text (the block that contains "Phase 1 (P0): WS1 + WS2 + WS3 → ..."
through "Ship as reins-cli v0.3.0"). Ensure only the opening fence is changed
and no content lines are edited.
- Line 10: The doc currently conflicts on the score range: the intro phrase
"6-dimension/0-18 JSON contract" disagrees with later recalibration to "21";
pick one canonical range and make all references consistent (either change the
intro "0-18" phrasing to "0-21" or update the recalibration text to "0-18").
Edit the occurrences of the phrase "6-dimension/0-18 JSON contract" and the
later recalibration paragraphs that mention "21" and any summary/notation that
references the score range so the entire document uses the same numeric
contract.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b00e0e6a-4c6c-4e1d-807d-991787052004
⛔ Files ignored due to path filters (1)
Evaluating AGENTS.md.pdf is excluded by !**/*.pdf
📒 Files selected for processing (13)
.gitignore
cli/reins/package.json
cli/reins/src/index.test.ts
cli/reins/src/index.ts
cli/reins/src/lib/audit/context.ts
cli/reins/src/lib/audit/scoring.ts
cli/reins/src/lib/commands/evolve.test.ts
cli/reins/src/lib/commands/evolve.ts
cli/reins/src/lib/types.ts
docs/design-docs/ecosystem-positioning.md
docs/exec-plans/active/next-iteration-stripe-informed.md
evaluating-agents-md-review.md
package.json
```
Phase 1 (P0): WS1 + WS2 + WS3 → deeper sub-checks in 3 dimensions
              + score threshold recalibration + schema_version field
              Ship as reins-cli v0.2.0

Phase 2 (P1): WS4 + WS5 + WS6 → maturity language + ZTE findings + evolve paths
              Ship as reins-cli v0.2.1

Phase 3 (P2): WS7-WS10 → tech debt resolution + new features
              Ship as reins-cli v0.3.0
```
Add a language to the fenced block for markdownlint compliance.
This block triggers MD040; tagging it as text is sufficient.
Proposed markdown fix
-```
+```text
Phase 1 (P0): WS1 + WS2 + WS3 → deeper sub-checks in 3 dimensions
+ score threshold recalibration + schema_version field
Ship as reins-cli v0.2.0
@@
Phase 3 (P2): WS7-WS10 → tech debt resolution + new features
Ship as reins-cli v0.3.0
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 225-225: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
- Fix deploy-on-merge false positives: require both push-to-main AND deploy signal instead of matching any push-to-main workflow
- Align score-range statement (0-18 → 0-21) with recalibration section
- Add language tag to fenced code block for markdownlint compliance
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
🧹 Nitpick comments (2)
docs/exec-plans/active/next-iteration-stripe-informed.md (2)
219-220: Make schema_version concrete, not just conceptual.
This is marked as a breaking change, but the doc does not define concrete version values or compatibility behavior. Add explicit examples (e.g., schema_version: "1.0" for 0–18 and "2.0" for 0–21), plus expected client fallback behavior for unknown versions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/exec-plans/active/next-iteration-stripe-informed.md` around lines 219 - 220, Update the documentation for the `reins audit` output to make `schema_version` concrete: define explicit version values (for example add examples like `schema_version: "1.0"` to indicate scoring model for range 0–18 and `schema_version: "2.0"` for 0–21), document the exact semantic differences each version implies (what fields/score changes each version alters), and include a short client compatibility policy describing expected fallback behavior when encountering unknown `schema_version` values (e.g., log a warning, attempt best-effort parsing using the latest supported schema, and fail-safe to reject or request migration if critical fields are missing). Ensure these additions reference the `reins audit` output and the `schema_version` field explicitly so consumers can detect and handle different scoring models.
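One way a consumer could apply the compatibility policy suggested above is sketched below. Only `schema_version: "2.0"` with a 0–21 range is confirmed by this PR; the `"1.0"` label is the reviewer's example, and the `resolveSchema` function, its return shape, and the treatment of a missing field as the older 0–18 model are assumptions for illustration.

```typescript
const LATEST_SUPPORTED_SCHEMA = "2.0";

// Version-to-range mapping from the review suggestion ("1.0" is the reviewer's example label).
function maxScoreFor(version: string): number | undefined {
  switch (version) {
    case "1.0":
      return 18;
    case "2.0":
      return 21;
    default:
      return undefined;
  }
}

// Minimal view of the audit JSON; only the field this sketch needs.
interface AuditOutputLike {
  schema_version?: string;
}

interface ResolvedSchema {
  version: string;
  maxScore: number;
  warning?: string;
}

function resolveSchema(output: AuditOutputLike): ResolvedSchema {
  const version = output.schema_version;
  if (version === undefined) {
    // Assumption: output that predates the field uses the older 0-18 model.
    return { version: "1.0", maxScore: 18, warning: "schema_version missing; assuming 0-18 scoring" };
  }
  const known = maxScoreFor(version);
  if (known !== undefined) {
    return { version, maxScore: known };
  }
  // Unknown version: warn and fall back to the latest schema this client understands.
  return {
    version: LATEST_SUPPORTED_SCHEMA,
    maxScore: 21,
    warning: `Unknown schema_version "${version}"; parsing as ${LATEST_SUPPORTED_SCHEMA}`,
  };
}
```

A stricter client could throw on unknown versions instead of falling back; which behavior is right depends on how critical the score range is to the consumer.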
241-242: Add an explicit verification command in the success criteria.
Consider adding a concrete validation step for this plan section (for example: run audit before/after and confirm score non-regression and schema marker presence). This will make execution/review sign-off objective.
Based on learnings: Self-audit using `cd cli/reins && bun src/index.ts audit ../..`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/exec-plans/active/next-iteration-stripe-informed.md` around lines 241 - 242, Add an explicit verification step to the success criteria: instruct the reviewer to run the self-audit command (use the suggested command "cd cli/reins && bun src/index.ts audit ../..") and verify two concrete checks — that the audit score does not regress compared to the baseline and that the expected schema marker(s) are present in the output; update the success criteria text in the same plan section to include these commands and checks so execution and sign-off are objective.
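The two checks named above (score non-regression and schema marker presence) could be made mechanical with a small comparison over two audit outputs. This is a sketch only: the `total_score` field name and the `verifyAudit` function are assumptions, and the real JSON shape emitted by `reins audit` may differ.

```typescript
// Assumed subset of the audit JSON; field names are illustrative.
interface AuditSnapshot {
  total_score: number;
  schema_version?: string;
}

// Returns a list of problems; an empty list means both checks passed.
function verifyAudit(baseline: AuditSnapshot, current: AuditSnapshot): string[] {
  const problems: string[] = [];
  if (current.total_score < baseline.total_score) {
    problems.push(`Score regressed from ${baseline.total_score} to ${current.total_score}`);
  }
  if (current.schema_version === undefined) {
    problems.push("schema_version marker is missing from the audit output");
  }
  return problems;
}
```

In practice the two snapshots would be parsed from the audit output captured before and after the change.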
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli/reins/src/lib/audit/scoring.ts`:
- Around line 447-449: The hasPushToMain regex used to detect push-to-main in
scoring.ts currently misses YAML variants like - "main" or - 'main'; update the
pattern assigned to hasPushToMain so it accepts unquoted or quoted main in both
single-item lists and array syntax (e.g., - main, - "main", - 'main', ["main"],
['main'], ["main", ...]) by allowing optional quotes around main and optional
whitespace inside the branch token; modify the
/push:[\s\S]*?branches:[\s\S]*?(?:-\s*main|\[\s*["']?main["']?\s*\])/i
expression in hasPushToMain to include quotes around main and handle quoted list
entries consistently.
In `@docs/exec-plans/active/next-iteration-stripe-informed.md`:
- Around line 173-190: The current plan auto-discovers plugins from
`.reins/plugins/*.ts|.js` and risks executing untrusted code; change `reins
audit` and the plugin discovery flow so plugins are only loaded when an explicit
opt-in flag (e.g., `--enable-plugins`) is provided, limit discovery to
explicitly trusted local paths (e.g., the target repo's `.reins/plugins` only
when the repo is verified or a user-specified trusted path), and run each plugin
(implementing the ReinsPlugin interface and discovered by the reins audit flow)
in an isolated process/sandbox with enforced timeouts and resource limits so a
misbehaving plugin cannot block or compromise the auditor.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4bce46da-51b9-4251-a230-bb982d2d21aa
📒 Files selected for processing (2)
cli/reins/src/lib/audit/scoring.ts
docs/exec-plans/active/next-iteration-stripe-informed.md
- Regex now matches `- "main"` and `- 'main'` YAML variants
- Plugin system spec requires --enable-plugins opt-in flag
- Plugins run in isolated child processes with 10s timeout
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
🧹 Nitpick comments (1)
cli/reins/src/lib/audit/scoring.ts (1)
75-84: Consider recording both conditional-context signals before scoring once.
The early return at Line 78 suppresses hierarchical-context findings when glob-based rules are also present. Keeping both findings improves audit explainability without changing max points.
Refactor sketch
 function scoreConditionalContext(result: AuditResult, ctx: AuditRuntimeContext): void {
+  let hasSignal = false;
   if (ctx.hasGlobBasedRules) {
-    result.scores.repository_knowledge.score++;
     result.scores.repository_knowledge.findings.push("Glob-based rule files detected");
-    return;
+    hasSignal = true;
   }
   if (ctx.hierarchicalAgentContextCount >= 3) {
-    result.scores.repository_knowledge.score++;
     result.scores.repository_knowledge.findings.push(
       `Conditional context engineering detected (${ctx.hierarchicalAgentContextCount} hierarchical files)`,
     );
+    hasSignal = true;
+  }
+  if (hasSignal) {
+    result.scores.repository_knowledge.score++;
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/reins/src/lib/audit/scoring.ts` around lines 75 - 84, The current early return after checking ctx.hasGlobBasedRules prevents recording hierarchical-context findings; change the logic in the block that references ctx.hasGlobBasedRules and ctx.hierarchicalAgentContextCount so both findings are pushed to result.scores.repository_knowledge.findings, but the score increment happens at most once; e.g., remove the return and use a boolean (e.g., foundSignal) or similar to track whether to increment result.scores.repository_knowledge.score once after evaluating both conditions while still pushing both messages when applicable.
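Applied to the function, the suggested refactor would look roughly like the following. The function body mirrors the reviewer's sketch; the trimmed `AuditResult` and `AuditRuntimeContext` interfaces are assumptions that include only the fields this function touches, so the block runs on its own.

```typescript
// Trimmed, assumed shapes; the real types in cli/reins/src/lib/types.ts have more fields.
interface DimensionScore {
  score: number;
  findings: string[];
}

interface AuditResult {
  scores: { repository_knowledge: DimensionScore };
}

interface AuditRuntimeContext {
  hasGlobBasedRules: boolean;
  hierarchicalAgentContextCount: number;
}

function scoreConditionalContext(result: AuditResult, ctx: AuditRuntimeContext): void {
  const dimension = result.scores.repository_knowledge;
  let hasSignal = false;

  if (ctx.hasGlobBasedRules) {
    dimension.findings.push("Glob-based rule files detected");
    hasSignal = true;
  }
  if (ctx.hierarchicalAgentContextCount >= 3) {
    dimension.findings.push(
      `Conditional context engineering detected (${ctx.hierarchicalAgentContextCount} hierarchical files)`,
    );
    hasSignal = true;
  }

  // Both findings may be recorded, but the sub-check is still worth at most one point.
  if (hasSignal) {
    dimension.score++;
  }
}
```

A repo with glob-based rules and four hierarchical context files then gets two findings and one point, which keeps max_score unchanged.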
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli/reins/src/lib/audit/scoring.ts`:
- Around line 445-450: The failing formatting is in the block that sets
hasAutoMerge and computes hasPushToMain and hasDeploySignal (symbols:
hasAutoMerge, hasPushToMain, hasDeploySignal and the surrounding regex
expressions); run the project's formatter (e.g., npm/yarn format or the
configured formatter) against this file to normalize whitespace and line breaks
so the regex lines match the repo style, then re-stage the formatted file and
push the change to satisfy CI.
- Around line 447-451: The regex used to detect merge/deploy sequences
(/merge.*deploy|deploy.*merge/i) doesn't match across newlines in YAML, so
update the pattern used where hasDeployOnMerge is set (the code calculating
hasPushToMain, hasDeploySignal and setting hasDeployOnMerge) to use [\s\S]*
instead of . (e.g., /merge[\s\S]*deploy|deploy[\s\S]*merge/i) so multi-line
merge→deploy or deploy→merge sequences are correctly detected across YAML
newlines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5a395aed-67a2-4293-852f-ea281096372e
📒 Files selected for processing (2)
cli/reins/src/lib/audit/scoring.ts
docs/exec-plans/active/next-iteration-stripe-informed.md
cli/reins/src/lib/audit/scoring.ts
Outdated
      hasAutoMerge = true;
    }
    const hasPushToMain =
      /push:[\s\S]*?branches:[\s\S]*?(?:-\s*["']?main["']?|\[\s*["']?main["']?\s*\])/i.test(content);
    const hasDeploySignal = /\bdeploy(?:ment)?\b/i.test(content);
    if ((hasPushToMain && hasDeploySignal) || /merge.*deploy|deploy.*merge/i.test(content)) {
Formatting check is failing on Lines 445-450.
CI already reports formatter drift here; please run the project formatter so this hunk is normalized and the pipeline passes.
🧰 Tools
🪛 GitHub Actions: CI
[error] 445-450: Formatter would have printed the following content: code formatting issues detected. Please run the formatter to fix.
The `.*` pattern doesn't match across YAML newlines, causing valid merge-triggered deploy workflows to be missed. Switched to `[\s\S]*`. Also ran biome formatter to satisfy CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
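Putting the review rounds together, the deploy-on-merge check after this commit can be sketched as a standalone predicate. The first two regexes are copied from the reviewed hunk and the third uses the `[\s\S]*` form described in the commit message; the function name `detectsDeployOnMerge` and the sample workflows are hypothetical.

```typescript
// Returns true when a workflow file looks like it deploys on merge to main.
function detectsDeployOnMerge(content: string): boolean {
  // Trigger: push to main, with main unquoted or quoted, as a list item or inline array.
  const hasPushToMain =
    /push:[\s\S]*?branches:[\s\S]*?(?:-\s*["']?main["']?|\[\s*["']?main["']?\s*\])/i.test(content);
  // Action: any standalone "deploy" or "deployment" token.
  const hasDeploySignal = /\bdeploy(?:ment)?\b/i.test(content);
  // Merge-triggered deploys: [\s\S]* spans newlines, unlike "." without the s flag.
  const hasMergeDeploySequence = /merge[\s\S]*deploy|deploy[\s\S]*merge/i.test(content);
  return (hasPushToMain && hasDeploySignal) || hasMergeDeploySequence;
}

// Hypothetical workflows used only for illustration.
const deployWorkflow = [
  "on:",
  "  push:",
  "    branches:",
  '      - "main"',
  "jobs:",
  "  release:",
  "    steps:",
  "      - run: ./scripts/deploy.sh",
].join("\n");

const testOnlyWorkflow = [
  "on:",
  "  push:",
  "    branches:",
  '      - "main"',
  "jobs:",
  "  test:",
  "    steps:",
  "      - run: bun test",
].join("\n");

console.log(detectsDeployOnMerge(deployWorkflow)); // push-to-main plus a deploy step
console.log(detectsDeployOnMerge(testOnlyWorkflow)); // push-to-main only
```

One trade-off worth noting: because `[\s\S]*` spans the whole file, any workflow that mentions both "merge" and "deploy" anywhere will match the third pattern, so it is deliberately broad.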
Summary
Adds evaluation artifacts for AGENTS.md and establishes the next iteration exec plan informed by Stripe integration strategy. Includes review documentation, architecture updates, and expanded CLI infrastructure.
Changes
Testing
- `bun test` passes
- `bun src/index.ts audit ../..` score maintained or improved
Merge Readiness
Audit Impact
Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests
Chores