fix: prevent false usage limit detection and upload failure logs on auto-restart #1291
Adding CLAUDE.md with task information for AI processing. This file will be removed when the task is complete. Issue: #1290
Fixes #1290 - When using --tool agent with --attach-logs, auto-restart sessions now report their completion with logs when they fail (e.g., due to usage limit or other errors).

Changes:
- solve.watch.lib.mjs: Add log upload for failed auto-restart iterations
  - Upload failure logs with "⚠️ Auto-restart X/Y Failure Log" title
  - Include error info and usage limit details
  - Track whether iterations ran and logs were uploaded
- solve.mjs: Update final log upload condition
  - Upload logs if auto-restart ran but last iteration wasn't uploaded
  - Ensures final logs are always attached when --attach-logs is enabled

This ensures consistent behavior between --tool claude and --tool agent for auto-restart session reporting.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
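A minimal sketch of the final-upload decision this commit describes, not the actual code in solve.mjs. The flag names come from this pull request's description; the helper shouldUploadFinalLogs, its parameter object, and the exact boolean combination are assumptions made for illustration.

```javascript
// Hypothetical helper, not taken from solve.mjs.
// Flag names follow the PR description; the way they are combined is an assumption.
function shouldUploadFinalLogs({
  attachLogs,               // true when --attach-logs is enabled
  logsAlreadyUploaded,      // set earlier, e.g. before auto-restart began
  autoRestartIterationsRan, // at least one auto-restart iteration ran
  lastIterationLogUploaded, // the last iteration already attached its log
}) {
  if (!attachLogs) return false;

  // Auto-restart ran, but its last iteration left no log on the PR.
  const autoRestartRanButNotUploaded =
    autoRestartIterationsRan && !lastIterationLogUploaded;

  // Upload when nothing was uploaded yet, or when the earlier upload
  // predates an auto-restart iteration that was never reported.
  return !logsAlreadyUploaded || autoRestartRanButNotUploaded;
}

const decision = shouldUploadFinalLogs({
  attachLogs: true,
  logsAlreadyUploaded: true,
  autoRestartIterationsRan: true,
  lastIterationLogUploaded: false,
});
console.log(decision); // true
```

In this sketch an earlier upload no longer suppresses the final one when an auto-restart iteration ran afterwards without attaching its own log.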
--tool agent
This reverts commit d9ada0a.
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost estimation:
The working session has now ended. Feel free to review and add any feedback on the solution draft.
I also noticed a second problem: this error was falsely identified as a rate limit error, but it is not one. It is just a regular error and failure. We need to ensure we catch the last error and correctly recognize its category. Ensure all changes are correct, consistent, validated, tested, and fully meet all discussed requirements (check the issue description and all comments in the issue and in the pull request). Ensure all CI/CD checks pass.
🤖 AI Work Session Started
Starting automated work session at 2026-02-14T10:44:46.827Z
The PR has been converted to draft mode while work is in progress.
This comment marks the beginning of an AI work session. Please wait for the session to finish, and provide your feedback.
…"resets"

Two root causes found via case study analysis of issue #1290:

1. The "resets" pattern in isUsageLimitError() was too broad - a simple substring match that matched ordinary English words in agent code output (e.g., "loads a shell and resets"). Changed to a regex requiring time-like content after "resets" (digit or month name).
2. Agent fallback pattern matching ran even after the agent had recovered from errors and completed successfully (exitCode=0, session.idle seen). Added guard to skip fallback when agent completed successfully.

These two bugs combined caused an AI_JSONParseError from kimi-k2.5 to be falsely categorized as a "UsageLimit" error - the fallback found a stale "type":"error" in the output, then detectUsageLimit scanned the full output and matched "resets" in C# game code comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
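The difference between the old substring match and a pattern that requires time-like content after "resets" can be sketched as follows. This is an illustration under assumptions, not the code in src/usage-limit.lib.mjs: the helper names are hypothetical, and the month alternation is written out here as a guess because the pull request elides it.

```javascript
// Assumption: the full month list is not shown in the PR; this one is illustrative.
const MONTHS = 'Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec';

// "resets" must be followed by whitespace and then either a digit
// (optionally preceded by "at ") or a month abbreviation.
const RESETS_WITH_TIME = new RegExp(
  `resets\\s+(?:(?:at\\s+)?[0-9]|(?:${MONTHS}))`,
  'i'
);

// Hypothetical stand-in for the old behaviour: plain substring match.
function oldResetsCheck(text) {
  return text.toLowerCase().includes('resets');
}

// Hypothetical stand-in for the narrowed check.
function newResetsCheck(text) {
  return RESETS_WITH_TIME.test(text);
}

const codeComment = 'loads a shell and resets _dragStartPosition';
console.log(oldResetsCheck(codeComment)); // true  (false positive)
console.log(newResetsCheck(codeComment)); // false
console.log(newResetsCheck('resets 5am')); // true
console.log(newResetsCheck('resets Jan 15, 8am')); // true
```

One limitation of this sketch: because the match is case-insensitive and the month names are not anchored, ordinary text such as "resets may" would still match through the "May" alternative.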
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost estimation:
The working session has now ended. Feel free to review and add any feedback on the solution draft.
🔄 Auto-restart 1/3
Detected uncommitted changes from previous run. Starting new session to review and commit them.
Uncommitted files:
Auto-restart will stop after changes are committed or after 2 more iterations.
Please wait until the working session ends, then give your feedback.
Add the complete solve session log (3MB) that demonstrates the bug where AI_JSONParseError from kimi-k2.5 was falsely categorized as a UsageLimit error. This log is referenced by the case study README and log-analysis documents.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🔄 Auto-restart 1/3 Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost estimation:
The working session has now ended. Feel free to review and add any feedback on the solution draft.
Summary

Fixes #1290 - Two root causes found and fixed for the missing auto-restart session report when using --tool agent.

Problem
When an auto-restart iteration fails with --tool agent:
- AI_JSONParseError from a corrupted kimi-k2.5 API stream

Root Cause Analysis
Problem 1 (false error categorization): Two bugs combined:
- The 'resets' pattern in isUsageLimitError() was too broad - a simple substring match that caught ordinary English words in agent code output (e.g., C# game code comments like "loads a shell and resets")
- Agent fallback pattern matching ran even after the agent had completed successfully (exitCode=0, session.idle seen), finding stale "type":"error" events in the output

Problem 2 (missing log upload): Per-iteration log uploads only happened on toolResult.success, and the final log upload was skipped due to logsAlreadyUploaded being set by verifyResults() before auto-restart.

Evidence from the original log
The "resets" word matched in C# code output: "loads a shell and resets _dragStartPosition".

Solution
Changes in src/usage-limit.lib.mjs:
- Changed the 'resets' pattern from substring match to regex: /resets\s+(?:(?:at\s+)?[0-9]|(?:Jan|Feb|...))/i
- Matches "resets 5am", "resets Jan 15, 8am" but NOT "loads a shell and resets"

Changes in src/agent.lib.mjs:
- Skip fallback pattern matching when exitCode === 0 && agentCompletedSuccessfully

Changes in src/solve.watch.lib.mjs (from previous commit):
- Upload logs for failed auto-restart iterations when --attach-logs is enabled
- Track iterations and uploads with the autoRestartIterationsRan and lastIterationLogUploaded flags

Changes in src/solve.mjs (from previous commit):
- Upload final logs when autoRestartRanButNotUploaded

Test plan
- Lint (npm run lint)
- Unit tests (npm test) - 67 usage limit tests, 23 agent error detection tests
- Run with --tool agent --attach-logs and trigger auto-restart scenario

Case Study
Full analysis documented in docs/case-studies/issue-1290/:
- README.md: Root cause analysis with detailed chain of events
- log-analysis.md: Complete timeline and error analysis of the original log file

Generated with Claude Code
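The guard listed under the changes to src/agent.lib.mjs can be illustrated with a small sketch. It is not the project's implementation: only exitCode and agentCompletedSuccessfully are named in this pull request, while the function name, the output field, and the return shape are assumptions.

```javascript
// Hypothetical sketch of the fallback guard, not taken from src/agent.lib.mjs.
// `output` stands for the raw agent output scanned by the fallback.
function detectAgentErrorViaFallback({ exitCode, agentCompletedSuccessfully, output }) {
  // Guard: a clean exit plus a completed session means any error events
  // still present in the output were recovered from and are stale.
  if (exitCode === 0 && agentCompletedSuccessfully) {
    return null;
  }

  // Fallback: scan the raw output for an error event.
  return output.includes('"type":"error"')
    ? { detectedBy: 'fallback-pattern' }
    : null;
}

const staleOutput = '{"type":"error"}\n{"type":"session.idle"}';

// Recovered run: the stale error event is ignored.
console.log(detectAgentErrorViaFallback({
  exitCode: 0,
  agentCompletedSuccessfully: true,
  output: staleOutput,
})); // null

// Failed run: the same event is reported.
console.log(detectAgentErrorViaFallback({
  exitCode: 1,
  agentCompletedSuccessfully: false,
  output: staleOutput,
})); // { detectedBy: 'fallback-pattern' }
```

Checking both conditions together keeps the fallback active for runs that exit with code 0 but never reach a completed session.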