fix: 3 critical bugs + routing prefix strip #134
Closed
dpbmaverick98 wants to merge 2 commits into TinyAGI:main from
…s, multi-agent race condition

Bug 1 (Response Drops): Channel clients had a non-atomic send-then-ack flow causing random response loss. Added 'delivering' status to SQLite responses table, claim/unclaim API endpoints, and claim-before-send pattern in Telegram client with retry tracking (max 3 attempts).

Bug 2 (Inter-Agent Mention Failures): Teammate mentions were silently dropped due to case sensitivity, typos, and missing validation logging. Added case-insensitive agent ID lookup, detailed logging in isTeammate() and extractTeammateMentions(), improved regex for bracket handling, and validateAgentResponse() helper.

Bug 3 (Multi-Agent Reply Loss): Race condition on conv.pending counter when multiple agents complete simultaneously caused replies to never arrive. Added withConversationLock() promise-chain mutex, safe incrementPending/decrementPending operations, and automatic conversation state recovery.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
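For reviewers unfamiliar with the promise-chain mutex mentioned under Bug 3, here is a minimal sketch of the idea. The function names mirror the commit message, but the bodies, the `Conversation` shape, and the simulated async gap are assumptions for illustration, not the code in this PR.

```typescript
// Hypothetical conversation shape; only `pending` is named in the PR.
interface Conversation {
  id: string;
  pending: number; // agents that have not replied yet
}

// One promise chain per conversation ID. Each task waits for the previous one.
const lockChains = new Map<string, Promise<void>>();

function withConversationLock<T>(
  convId: string,
  task: () => Promise<T> | T
): Promise<T> {
  const previous = lockChains.get(convId) ?? Promise.resolve();
  // `previous` never rejects (see `tail` below), so the task always runs.
  const result = previous.then(task);
  // Store a tail that swallows errors so one failed task cannot poison the chain.
  const tail = result.then(
    () => undefined,
    () => undefined
  );
  lockChains.set(convId, tail);
  // Drop the map entry once the chain is idle so the map does not grow forever.
  tail.then(() => {
    if (lockChains.get(convId) === tail) lockChains.delete(convId);
  });
  return result;
}

// Assumed shape of a "safe" decrement: read, await, write, all under the lock.
// The await stands in for any async work between the read and the write,
// which is where an unlocked version could lose an update.
function decrementPending(conv: Conversation): Promise<boolean> {
  return withConversationLock(conv.id, async () => {
    const current = conv.pending;
    await new Promise<void>((resolve) => setTimeout(resolve, 1));
    conv.pending = Math.max(0, current - 1);
    return conv.pending === 0; // true only for the call that completes the conversation
  });
}
```

With three agents finishing at the same time, the three `decrementPending()` calls run one after another, so exactly one of them observes the counter reaching zero.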
The messages API route prepends [channel/sender]: to incoming messages, which caused the parseAgentRouting() regex to fail since the message no longer starts with @agent_id. This made all messages fall back to the first agent regardless of @mentions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
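A rough sketch of the prefix strip described in this commit, for illustration only. The function name `parseRoutingSketch`, the exact prefix regex, and the return shape are assumptions; the real `parseAgentRouting()` signature is not shown in this PR page.

```typescript
// Assumed prefix shape: "[channel/sender]: " at the very start of the message.
const CHANNEL_PREFIX = /^\[[^\]\/]+\/[^\]]+\]:\s*/;

interface RoutingResult {
  agentId: string | null; // null: no leading @mention, caller picks a default agent
  message: string;
}

// Hypothetical stand-in for parseAgentRouting(): strip the prefix first,
// then look for a leading @agent_id.
function parseRoutingSketch(raw: string): RoutingResult {
  const stripped = raw.replace(CHANNEL_PREFIX, "");
  const match = stripped.match(/^@([A-Za-z0-9_-]+)\s+([\s\S]*)$/);
  if (!match) {
    return { agentId: null, message: stripped };
  }
  return { agentId: match[1], message: match[2] };
}

// Both forms now route to the same agent:
const prefixed = parseRoutingSketch("[telegram/alice]: @coder fix the bug");
const bare = parseRoutingSketch("@coder fix the bug");
console.log(prefixed.agentId, bare.agentId);
```

Whether the forwarded message should keep the `[channel/sender]:` prefix after routing is a separate choice; this sketch drops it for simplicity.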
Author
Collaborator
Thanks, this is my bad; the last commit was too aggressive. What are your thoughts on the SQLite migration?
Author
SQLite fixes durability, but the conversations Map is still in-memory, so agent handoffs break on restart.

Proposal: Keep SQLite for the queue + add NATS for agent-to-agent coordination (request-reply pattern). When @coder calls @writer, it just waits for a response instead of juggling the Map. SQLite isn't wasted; it becomes our agent memory layer (RAG, history). Bonus: NATS makes adding a Web UI trivial (built-in WebSocket). This fits TinyClaw's multi-agent vision without the fragility.

Worth me implementing the NATS layer as a follow-up?

Also: Is this PR good to merge as-is? It's my first public contribution, so I'm happy to address any feedback!


Summary

- Bug 1 (channel response drops): The send-then-ack flow in channel clients was non-atomic, causing random response loss after the SQLite migration. Added `delivering` status to the responses table, `claim`/`unclaim` API endpoints, and a claim-before-send pattern in the Telegram client with retry tracking (max 3 attempts) and periodic recovery for stuck deliveries.
- Bug 2 (inter-agent mention failures): Teammate mentions (`[@agent: msg]`) were silently dropped due to case sensitivity, missing validation logging, and a weak regex. Added case-insensitive agent ID lookup, detailed failure logging in `isTeammate()` and `extractTeammateMentions()`, improved regex for bracket handling, and a `validateAgentResponse()` helper.
- Bug 3 (multi-agent reply loss, race condition): When multiple agents completed simultaneously, concurrent `conv.pending--` operations caused the conversation to never complete. Added `withConversationLock()` promise-chain mutex, safe `incrementPending`/`decrementPending` operations, and automatic conversation state recovery.
- Bug 4 (@agent routing broken by message prefix): The messages API route prepends `[channel/sender]:` to incoming messages, which caused `parseAgentRouting()` to fail since messages no longer start with `@agent_id`. All messages fell back to the first configured agent. Fixed by stripping the prefix before parsing.

Files Changed

- `src/lib/db.ts`: `delivering` status, `claimResponseForDelivery()`, `unclaimResponse()`, `recoverStuckDeliveringResponses()`
- `src/server/routes/queue.ts`: `/api/responses/:id/claim` and `/unclaim` endpoints
- `src/lib/conversation.ts`: `withConversationLock()`, `incrementPending()`, `decrementPending()`, state validation + recovery
- `src/queue-processor.ts`
- `src/lib/routing.ts`: `parseAgentRouting()`
- `src/channels/telegram-client.ts`
- `docs/bug-fixes/CLAUDE.md`

Test plan

- `npm run build` compiles cleanly
- `./tinyclaw.sh start` launches queue processor + API server successfully
- `/api/queue/status` returns correct stats
- `/api/responses/:id/claim` and `/unclaim` endpoints respond correctly
- `@agent_id` routing works correctly from Telegram (no longer falls back to first agent)
- `@team_id` routing activates the team leader

🤖 Generated with Claude Code
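To make the claim-before-send flow from Bug 1 easier to review, here is a minimal in-memory sketch. It swaps the SQLite table for a `Map` so it runs standalone. The `ResponseStore` class, the `deliver()` helper, and every status name other than `delivering` are assumptions for illustration. Only the `delivering` status, the claim/unclaim idea, and the limit of 3 attempts come from the description above.

```typescript
type Status = "pending" | "delivering" | "acked" | "failed";

interface QueuedResponse {
  id: number;
  text: string;
  status: Status;
  attempts: number;
}

const MAX_ATTEMPTS = 3; // retry limit stated in the PR description

// In-memory stand-in for the responses table (the PR uses SQLite).
class ResponseStore {
  private rows = new Map<number, QueuedResponse>();

  add(id: number, text: string): void {
    this.rows.set(id, { id, text, status: "pending", attempts: 0 });
  }

  // Move pending -> delivering. Returns false if the row is already claimed or done.
  claim(id: number): boolean {
    const row = this.rows.get(id);
    if (!row || row.status !== "pending") return false;
    row.status = "delivering";
    row.attempts += 1;
    return true;
  }

  // Release a failed delivery for retry, or mark it failed once the limit is reached.
  unclaim(id: number): void {
    const row = this.rows.get(id);
    if (!row || row.status !== "delivering") return;
    row.status = row.attempts >= MAX_ATTEMPTS ? "failed" : "pending";
  }

  // Acknowledge only rows that this flow actually claimed.
  ack(id: number): void {
    const row = this.rows.get(id);
    if (row && row.status === "delivering") row.status = "acked";
  }

  get(id: number): QueuedResponse | undefined {
    return this.rows.get(id);
  }
}

// Claim first, then send, then ack. A failed send releases the claim.
async function deliver(
  store: ResponseStore,
  id: number,
  send: (text: string) => Promise<void>
): Promise<boolean> {
  if (!store.claim(id)) return false;
  const row = store.get(id)!;
  try {
    await send(row.text);
    store.ack(id);
    return true;
  } catch {
    store.unclaim(id);
    return false;
  }
}
```

Because the row is marked `delivering` before the send starts, a second poller that sees the same row cannot send it again, and a crash mid-send leaves a `delivering` row that a periodic recovery pass can find and release.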