-
Notifications
You must be signed in to change notification settings - Fork 132
Description
Problem
The execution agent (not the review/scoring bot) consistently over-corrects when fixing findings, producing code that is larger and more complex than before. The review bot finds real issues (dead exports, missing tests, duplication), but the execution agent's fixes violate the target codebase's core principle: "every change should make the codebase smaller or more explicit."
Recurring patterns observed
Evidence from chore/desloppify-queue-2026-03-02 on peteromallet/reigh:
1. Single-consumer extractions
Creates new files/functions with exactly 1 caller:
toolSettingsAuthCache.ts,toolSettingsErrors.ts,toolSettingsOperationFailures.ts,toolSettingsScopeReader.ts— all had 0-1 external consumers, added +155 lines of overhead (reverted inbc01baa7f)buildShotEditorControllerSlices,buildShotEditorSections+ 117 lines of interface defs, each called once (reverted)
2. __internal test export hacks
7 files used export const __internal = {...} to expose private functions for testing, instead of exporting them properly or testing through the public API. Files affected:
useShotEditorLayoutModel.ts,generateVideoService.ts,useTimelineOrchestrator.ts,calculate-task-cost/index.ts,huggingface-upload/index.ts
3. Naming convention churn
8 functions renamed with OrThrow suffix; none had a non-throwing variant; pure noise across 18 files (reverted in bc01baa7f)
4. Stripping "why" comments
Navigator.locks rationale, "IMPORTANT: update these transformers" warnings, per-field JSDoc on HiresFixConfig — all stripped as collateral during moves (restored in 7561f7b9e)
5. Behavioral changes hidden in refactors
resolvePrimaryStructureVideo dropped a legacy field fallback chain in a "unify contracts" commit; the test was rewritten to match the new behavior rather than preserving the old (fixed in 7561f7b9e)
6. Net line growth while claiming cleanup
4 of 5 commits in one batch added lines (+440, +17, +90, +38); only 1 reduced lines (-20)
7. Repetitive boilerplate
buildTravelStructureSource had 22 identical pickFirstDefinedValue calls in 148 lines. A 25-line field-descriptor loop does the same thing.
Suggested constraints for the execution agent
- Net lines must decrease or stay flat — if a cleanup commit adds lines, something is wrong
- Don't extract unless 2+ consumers exist — single-consumer extractions add indirection without value
- Don't create new files for single-consumer code — keep things colocated
- Never use
__internalexports — either export the function properly or test through the public API - Preserve "why" comments; only strip "what" comments — rationale comments are load-bearing
- Refactors must preserve behavior — no test expectation changes in refactor commits
- Don't rename for convention alone —
OrThrowsuffixes, etc. are noise if there's no non-throwing variant - Don't delete tests without replacement — two test files (191 + 155 lines) were deleted in
6fd933f83without equivalent coverage