fix(web): filter system-injected XML tags from rendering as raw text #387
hqhq1025 wants to merge 2 commits into tiann:main
Claude Code injects internal messages (<task-notification>, <system-reminder>, <command-name>, <local-command-caveat>) as user-role messages. The web UI was rendering these as raw XML text visible to users.
- Parse <task-notification> and display as agent-event with summary text
- Silently drop <system-reminder>, <command-name>, <local-command-caveat>
- Add tests covering all injection prefixes and edge cases
Findings
- [Major] Generic XML filtering now rewrites or drops legitimate human prompts across non-Claude sessions, because reduceTimeline() applies the Claude-specific tag heuristic to every role: 'user' message. Codex and Gemini local sessions both forward human input through sendUserMessage() as plain user text (cli/src/api/apiSession.ts:407, cli/src/codex/session.ts:87, cli/src/gemini/session.ts:93), so a prompt that starts with <system-reminder> disappears entirely and <task-notification>...<summary>...</summary> is rendered as an agent event instead of the user's text. Evidence: web/src/chat/reducerTimeline.ts:75.
Summary
Review mode: initial
One major regression found in web/src/chat/reducerTimeline.ts: Claude-specific XML filtering was added at the generic timeline layer, so it now affects normal user messages for other agent flavors as well.
Testing
Not run (automation; bun is unavailable in this runner, so the added Vitest file could not be executed here)
HAPI Bot
web/src/chat/reducerTimeline.ts
Outdated
})
continue
}
if (isSystemInjectedMessage(msg.content.text)) {
[MAJOR] This runs in the generic msg.role === 'user' path, so it also rewrites or drops real human prompts for non-Claude sessions. sendUserMessage() still forwards Codex/Gemini user input as plain role: 'user' text (cli/src/api/apiSession.ts:407, cli/src/codex/session.ts:87, cli/src/gemini/session.ts:93), which means a prompt starting with <system-reminder> now vanishes and a prompt starting with <task-notification> becomes an agent-event.
Suggested fix:
// Keep Claude transcript cleanup in normalizeUserOutput(), which only runs for
// Claude JSONL `output` payloads, instead of filtering every user message here.
if (trimmed.startsWith('<task-notification>')) {
    const summary = trimmed.match(/<summary>([\s\S]*?)<\/summary>/)?.[1]?.trim()
    if (!summary) return null
    return {
        id: messageId,
        localId,
        createdAt,
        role: 'event',
        content: { type: 'message', message: summary },
        isSidechain: false,
        meta
    }
}
if (
    trimmed.startsWith('<system-reminder>') ||
    trimmed.startsWith('<command-name>') ||
    trimmed.startsWith('<local-command-caveat>')
) {
    return null
}
Address review feedback: the XML tag filtering was applied at the generic timeline layer, which could incorrectly hide legitimate user messages in Codex/Gemini sessions.
- Add isClaudeSession flag threaded from Session.metadata.claudeSessionId
- Only filter system-injected tags when isClaudeSession is true
- Add tests verifying non-Claude sessions pass through all messages
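The gating described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: only the names isClaudeSession and parseTaskNotificationSummary appear in the thread, and their bodies here are assumptions. ReducerContext, TimelineBlock, SILENT_PREFIXES, and reduceUserText are hypothetical names introduced for the example.

```typescript
// Hypothetical sketch of session-gated filtering. Types and function bodies
// are assumptions for illustration, not the PR's actual implementation.

type ReducerContext = { isClaudeSession: boolean }

type TimelineBlock =
    | { kind: 'user-text'; text: string }
    | { kind: 'agent-event'; message: string }

// Prefixes that produce no UI output at all when filtering is active.
const SILENT_PREFIXES = ['<system-reminder>', '<command-name>', '<local-command-caveat>']

// Returns the trimmed <summary> body, or null when it is missing or empty.
function parseTaskNotificationSummary(text: string): string | null {
    const summary = text.match(/<summary>([\s\S]*?)<\/summary>/)?.[1]?.trim()
    return summary ? summary : null
}

// Returns the block to render, or null when the message should be hidden.
function reduceUserText(text: string, context: ReducerContext): TimelineBlock | null {
    // Only Claude sessions carry injected XML; other flavors pass through untouched.
    if (context.isClaudeSession) {
        const trimmed = text.trimStart()
        if (trimmed.startsWith('<task-notification>')) {
            const summary = parseTaskNotificationSummary(trimmed)
            return summary ? { kind: 'agent-event', message: summary } : null
        }
        if (SILENT_PREFIXES.some((prefix) => trimmed.startsWith(prefix))) {
            return null
        }
    }
    return { kind: 'user-text', text }
}
```

With the flag off, a Codex or Gemini prompt that happens to begin with <system-reminder> is returned unchanged as user text; with the flag on, the same string is hidden.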
Findings
- [Major] <task-notification> conversion is still unreachable on the real Claude path, because injected XML is stored as role: 'agent' / type: 'output' in sendClaudeSessionMessage() and normalizeUserOutput() still returns null for <task-notification> before reduceTimeline() sees a msg.role === 'user'. The new reducer branch therefore only handles synthetic inputs like the new test, so the UI will keep silently dropping task notifications instead of rendering the promised agent event. Evidence: web/src/chat/reducerTimeline.ts:77; related context: web/src/chat/normalizeAgent.ts:117, cli/src/api/apiSession.ts:364, web/src/chat/reducerTimeline.test.ts:29.
Suggested fix:
if (typeof messageContent === 'string') {
    const trimmed = messageContent.trimStart()
    if (trimmed.startsWith('<task-notification>')) {
        const summary = trimmed.match(/<summary>([\s\S]*?)<\/summary>/)?.[1]?.trim()
        if (!summary) return null
        return {
            id: messageId,
            localId,
            createdAt,
            role: 'event',
            content: { type: 'message', message: summary },
            isSidechain: false,
            meta
        }
    }
    if (
        trimmed.startsWith('<system-reminder>') ||
        trimmed.startsWith('<command-name>') ||
        trimmed.startsWith('<local-command-caveat>')
    ) {
        return null
    }
}
Summary
Review mode: follow-up after new commits
One major issue remains in the follow-up diff: the new reducer logic does not run on the actual Claude transcript shape for <task-notification>, so the user-visible regression is only partially fixed.
Testing
Not run (automation; bun is not installed in this runner)
HAPI Bot
if (msg.role === 'user') {
    if (context.isClaudeSession) {
        const taskSummary = parseTaskNotificationSummary(msg.content.text)
[MAJOR] parseTaskNotificationSummary() is wired into the msg.role === 'user' reducer path, but real Claude injections never reach this branch. sendClaudeSessionMessage() stores non-external Claude XML as role: 'agent' / type: 'output' (cli/src/api/apiSession.ts:364), and normalizeUserOutput() still returns null for <task-notification> before reduction (web/src/chat/normalizeAgent.ts:117). That means the new agent-event behavior is unreachable in production, while the added test only covers a synthetic traced role: 'user' message (web/src/chat/reducerTimeline.test.ts:29).
Suggested fix:
if (typeof messageContent === 'string') {
    const trimmed = messageContent.trimStart()
    if (trimmed.startsWith('<task-notification>')) {
        const summary = trimmed.match(/<summary>([\s\S]*?)<\/summary>/)?.[1]?.trim()
        if (!summary) return null
        return {
            id: messageId,
            localId,
            createdAt,
            role: 'event',
            content: { type: 'message', message: summary },
            isSidechain: false,
            meta
        }
    }
    if (
        trimmed.startsWith('<system-reminder>') ||
        trimmed.startsWith('<command-name>') ||
        trimmed.startsWith('<local-command-caveat>')
    ) {
        return null
    }
}
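For readers who want to try the suggested fix in isolation, here is a self-contained sketch of the same logic as a standalone function. The field names mirror the snippet above, but the type shapes (MessageMeta, Envelope, NormalizedEvent), the helper name normalizeInjectedXml, and the InjectionResult wrapper are assumptions added for illustration; the real normalizeUserOutput() signature is not shown in this thread.

```typescript
// Hypothetical, standalone version of the suggested fix. All type shapes and
// the function name are assumptions; only the matching logic comes from the review.

type MessageMeta = Record<string, unknown>

type Envelope = {
    messageId: string
    localId: string | null
    createdAt: number
    meta: MessageMeta | undefined
}

type NormalizedEvent = {
    id: string
    localId: string | null
    createdAt: number
    role: 'event'
    content: { type: 'message'; message: string }
    isSidechain: false
    meta: MessageMeta | undefined
}

// handled: false -> not injected XML, caller continues with normal handling.
// handled: true  -> event to render, or null to drop the message silently.
type InjectionResult =
    | { handled: false }
    | { handled: true; event: NormalizedEvent | null }

const DROPPED_PREFIXES = ['<system-reminder>', '<command-name>', '<local-command-caveat>']

function normalizeInjectedXml(messageContent: unknown, env: Envelope): InjectionResult {
    if (typeof messageContent !== 'string') return { handled: false }
    const trimmed = messageContent.trimStart()

    if (trimmed.startsWith('<task-notification>')) {
        // Lazy match so only the first <summary> element is captured.
        const summary = trimmed.match(/<summary>([\s\S]*?)<\/summary>/)?.[1]?.trim()
        if (!summary) return { handled: true, event: null }
        return {
            handled: true,
            event: {
                id: env.messageId,
                localId: env.localId,
                createdAt: env.createdAt,
                role: 'event',
                content: { type: 'message', message: summary },
                isSidechain: false,
                meta: env.meta
            }
        }
    }

    if (DROPPED_PREFIXES.some((prefix) => trimmed.startsWith(prefix))) {
        return { handled: true, event: null }
    }
    return { handled: false }
}
```

The explicit handled flag is a design choice of this sketch: it separates "drop this message" from "not an injected message", which a bare null return cannot express.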
Thanks for the follow-up review! Regarding the second finding — I've traced the full message flow and believe the current implementation is correct:
The comment in The |
Summary
- Claude Code injects internal messages (<task-notification>, <system-reminder>, <command-name>, <local-command-caveat>) as user-role messages for model context. The web UI was rendering these as raw XML text visible to users.
- <task-notification> is now parsed and displayed as an agent-event block showing the summary text (e.g., "Background command X was stopped")
- <system-reminder>, <command-name>, and <local-command-caveat> are silently dropped
- SYSTEM_INJECTION_PREFIXES list from cli/src/api/apiSession.ts to keep filtering in sync
Test plan
- reducerTimeline.test.ts with 8 tests covering:
  - <task-notification> with summary → agent-event
  - <task-notification> without summary → silently dropped
  - <task-notification> with empty <summary></summary> → silently dropped
  - <task-notification> with leading whitespace → handled correctly
  - <system-reminder> → hidden
  - <command-name> → hidden
  - <local-command-caveat> → hidden
- bun run typecheck passes
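The cases in the test plan can be expressed as a small table-driven check. This is a sketch only: it does not reproduce reducerTimeline.test.ts, it avoids any test framework so it runs on its own, and classify, Outcome, cases, and failingCases are hypothetical names with an assumed classifier body.

```typescript
// Hypothetical table-driven check of the cases listed in the test plan.
// classify() is an assumed stand-in for the filtering logic under test.

type Outcome = 'agent-event' | 'hidden' | 'user-text'

function classify(text: string): Outcome {
    const trimmed = text.trimStart()
    if (trimmed.startsWith('<task-notification>')) {
        const summary = trimmed.match(/<summary>([\s\S]*?)<\/summary>/)?.[1]?.trim()
        return summary ? 'agent-event' : 'hidden'
    }
    const hiddenPrefixes = ['<system-reminder>', '<command-name>', '<local-command-caveat>']
    return hiddenPrefixes.some((prefix) => trimmed.startsWith(prefix)) ? 'hidden' : 'user-text'
}

// Each row pairs an input with the outcome the test plan describes.
const cases: Array<[string, Outcome]> = [
    ['<task-notification><summary>Background command X was stopped</summary></task-notification>', 'agent-event'],
    ['<task-notification>no summary here</task-notification>', 'hidden'],
    ['<task-notification><summary></summary></task-notification>', 'hidden'],
    ['  \n<task-notification><summary>ok</summary></task-notification>', 'agent-event'],
    ['<system-reminder>x</system-reminder>', 'hidden'],
    ['<command-name>/clear</command-name>', 'hidden'],
    ['<local-command-caveat>x</local-command-caveat>', 'hidden']
]

// Returns the inputs whose classification differs from the expected outcome.
function failingCases(): string[] {
    return cases
        .filter(([input, expected]) => classify(input) !== expected)
        .map(([input]) => input)
}
```

An empty array from failingCases() means every listed case behaves as the plan describes under this assumed classifier.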