Skip to content

Comments

fix: prevent runaway testing agent spawning (critical) #130

Merged

leonvanzyl merged 1 commit into AutoForgeAI:master from ipodishima:fix/too_many_agents_spawned on Jan 30, 2026

Conversation

@ipodishima
Contributor

@ipodishima commented Jan 29, 2026

CRITICAL BUG FIX - Testing agents spawn uncontrollably (~130+) when launching a project, overwhelming the system.

  • running_testing_agents was a dict keyed by feature_id, so when multiple testing agents targeted the same feature, each
    new spawn overwrote the previous entry. The dict length stayed at 1, defeating all max-agent guards and causing an infinite
    spawn loop.
  • Re-keyed the dict by process PID so each agent is tracked independently. Existing limits (testing_agent_ratio,
    max_concurrency, MAX_TOTAL_AGENTS) now work as intended.
  • Added return-value check on _spawn_testing_agent() to break the maintain loop immediately on failure.
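The failure mode in the first bullet, and the re-keying fix, can be sketched in a few lines. This is a simplified stand-in, not the real ParallelOrchestrator: only the running_testing_agents name and the PID-keyed (feature_id, process) scheme come from the PR; the class, limit, and placeholder child process are illustrative.

```python
import subprocess
import sys

class OrchestratorSketch:
    """Simplified stand-in showing why PID keying fixes the guard."""

    MAX_TESTING_AGENTS = 3  # illustrative limit, not the real config

    def __init__(self):
        # Buggy scheme: {feature_id: process}. Two agents testing the
        # same feature share one key, so len() stays at 1 and every
        # max-agent guard based on it is defeated.
        # Fixed scheme: {pid: (feature_id, process)}. Every live agent
        # gets a unique key, so len() reflects reality.
        self.running_testing_agents = {}

    def spawn_testing_agent(self, feature_id):
        if len(self.running_testing_agents) >= self.MAX_TESTING_AGENTS:
            return False  # the guard now actually triggers
        # Placeholder child process standing in for a testing agent.
        proc = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )
        self.running_testing_agents[proc.pid] = (feature_id, proc)
        return True
```

With feature_id keys, five spawns for the same feature would leave the dict at length 1 and all five calls would succeed; with PID keys the fourth call is refused.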

Test plan

  • Run python test_security.py — no regressions
  • Launch a project from the UI with parallel mode and verify only the expected number of testing agents spawn (matching
    testing_agent_ratio)
  • Confirm stop_all correctly terminates all testing agents
  • Verify Mission Control shows accurate agent counts

Summary by CodeRabbit

  • Refactor
    • Improved internal testing agent process tracking and error handling mechanisms for more reliable test execution and clearer failure messages.


running_testing_agents was keyed by feature_id, so when multiple agents
tested the same feature, each spawn overwrote the previous dict entry.
The count stayed at 1 regardless of how many processes were actually
running, causing the maintain loop to spawn agents indefinitely (~130+).

Re-key the dict by PID so each agent gets a unique entry and the
existing max-agent guards work correctly. Also check the return value
of _spawn_testing_agent() to break the loop on failure.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
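The return-value check described in the commit message might be structured like the sketch below. The function name and signature are hypothetical; only the idea of breaking the maintain loop on a failed spawn, and the PID-keyed dict, come from the PR.

```python
def maintain_testing_agents(spawn, running, pending, limit):
    """Top up testing agents toward `limit`, breaking on spawn failure.

    `spawn(feature_id)` is assumed to return a (pid, process) pair, or
    None on failure. Before the fix a failed spawn was ignored and the
    loop kept trying, which is how the runaway happened; returning
    early on the first failure stops that.
    """
    for feature_id in pending:
        if len(running) >= limit:
            break  # enough agents already running
        handle = spawn(feature_id)
        if handle is None:
            break  # bail out immediately instead of spinning
        pid, proc = handle
        running[pid] = (feature_id, proc)
    return running
```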
@coderabbitai

coderabbitai bot commented Jan 29, 2026

📝 Walkthrough

Refactored testing agent tracking in ParallelOrchestrator to use process ID as the dictionary key instead of feature ID, now storing (feature_id, process) tuples. Updated all spawn, completion, and stop paths with corresponding lookup and logging changes.

Changes

Testing Agent Tracking Refactor (parallel_orchestrator.py):
Changed the running_testing_agents dictionary keying from feature_id to PID, with values updated to (feature_id, process) tuples. Updated all spawn, completion, and stop paths to look up entries by proc.pid and adjusted log messages to reference the PID instead of the feature_id. Modified _spawn_testing_agent return handling to support (success, msg) tuple responses.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A hop, a skip, through PID keys so keen,
Where agents once scattered now gather between,
The tuples align, both feature and proc,
No more searching by ID through the orchestrator's flock!
Order emerges from process refine. ✨

🚥 Pre-merge checks | ✅ 3 passed

  • Description Check — ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed. The title accurately describes the main change: fixing runaway testing agent spawning by re-keying running_testing_agents by PID so each process is tracked independently and the dictionary-overwrite bug cannot recur.
  • Docstring Coverage — ✅ Passed. Docstring coverage is 100.00%, above the required threshold of 80.00%.


@ipodishima
Contributor Author

@leonvanzyl ^

@leonvanzyl leonvanzyl merged commit 21fe28f into AutoForgeAI:master Jan 30, 2026
3 checks passed
@leonvanzyl
Collaborator

Thanks

leonvanzyl added a commit that referenced this pull request Jan 30, 2026
- Add comment on running_coding_agents explaining why feature_id keying
  is safe (start_feature checks for duplicates before spawning), since
  the sister dict running_testing_agents required PID keying to avoid
  overwrites from concurrent same-feature testing
- Clear running_testing_agents dict in stop_all() after killing
  processes so get_status() doesn't report stale agent counts while
  _on_agent_complete callbacks are still in flight

Follow-up to PR #130 (runaway testing agent spawn fix).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
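The stop_all() change in this follow-up could be sketched as below. The free function and fake process objects are illustrative; only the terminate-then-clear behaviour, and the motivation about get_status() and _on_agent_complete, come from the commit message.

```python
def stop_all(running_testing_agents):
    """Terminate every tracked testing agent, then clear the dict.

    Clearing after the kill loop means get_status() cannot report
    stale agent counts while _on_agent_complete callbacks are still
    in flight.
    """
    for pid, (feature_id, proc) in list(running_testing_agents.items()):
        proc.terminate()
    running_testing_agents.clear()  # the follow-up fix
    return running_testing_agents
```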
rudiheydra added a commit to rudiheydra/AutoBuildr that referenced this pull request Feb 2, 2026
…utoForgeAI#130)

- Added TestAcceptanceGateEvaluatesValidators class with 8 tests
- Step 1: Verify AcceptanceGate.evaluate() called after kernel execution
- Step 2: Verify each validator executed independently
- Step 3: Verify ValidatorResult contains passed, message, score
- Step 4: Verify all_pass gate mode requires all validators
- Step 5: Verify any_pass gate mode requires at least one validator
- Step 6: Verify AgentRun.final_verdict set correctly
- Step 7: Verify acceptance_results contains per-validator JSON array
- Step 8: Verify acceptance_check event recorded in agent_events
- All 62 tests in test_dspy_pipeline_e2e.py pass with no regressions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
rudiheydra added a commit to rudiheydra/AutoBuildr that referenced this pull request Feb 2, 2026
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
rudiheydra added a commit to rudiheydra/AutoBuildr that referenced this pull request Feb 2, 2026
…and determines final verdict

- All 8 verification steps passed via pytest
- TestAcceptanceGateEvaluatesValidators: 8/8 tests PASS
- Full suite: 62/62 tests pass (no regressions)
- No production code changes needed - tests prove existing behavior
- Marked feature AutoForgeAI#130 as passing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CoreAspectStu pushed a commit to CoreAspectStu/autocoder-custom that referenced this pull request Feb 9, 2026
- Add comment on running_coding_agents explaining why feature_id keying
  is safe (start_feature checks for duplicates before spawning), since
  the sister dict running_testing_agents required PID keying to avoid
  overwrites from concurrent same-feature testing
- Clear running_testing_agents dict in stop_all() after killing
  processes so get_status() doesn't report stale agent counts while
  _on_agent_complete callbacks are still in flight

Follow-up to PR AutoForgeAI#130 (runaway testing agent spawn fix).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CoreAspectStu pushed a commit to CoreAspectStu/autocoder-custom that referenced this pull request Feb 9, 2026
…s_spawned

fix: prevent runaway testing agent spawning (critical)
CoreAspectStu pushed a commit to CoreAspectStu/autocoder-custom that referenced this pull request Feb 9, 2026
- Add comment on running_coding_agents explaining why feature_id keying
  is safe (start_feature checks for duplicates before spawning), since
  the sister dict running_testing_agents required PID keying to avoid
  overwrites from concurrent same-feature testing
- Clear running_testing_agents dict in stop_all() after killing
  processes so get_status() doesn't report stale agent counts while
  _on_agent_complete callbacks are still in flight

Follow-up to PR AutoForgeAI#130 (runaway testing agent spawn fix).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>