sanitize_content_core.cjs: incomplete invisible-char strip set and missing detection gate for MCP-driven safe-output writes #24154
Description
Summary
hardenUnicodeText in sanitize_content_core.cjs (v0.65.0–v0.65.3) strips a subset of invisible Unicode characters but omits U+034F (Combining Grapheme Joiner), U+200E (Left-to-Right Mark), U+200F (Right-to-Left Mark), and U+00AD (Soft Hyphen). A prompt-injection payload can use any of these four characters to obfuscate markers or patterns and pass through the sanitizer unchanged. Separately, native-card-style workflows where write safe-outputs are triggered via MCP tool calls compile to a safe_outputs job with needs: agent only — no detection job is wired, contrary to the reference documentation's claim that detection is "automatically enabled when safe outputs are configured."
Affected Area
Safe-outputs content sanitization layer (output trust boundary). Specifically:
- sanitize_content_core.cjs: hardenUnicodeText strip regex (line 935)
- Compiled safe_outputs job dependency in native-card workflows vs. pentest-full-style orchestrated workflows
Reproduction Outline
- On any gh-aw runner, load the deployed sanitizer and inject a CGJ-obfuscated string:
    const { hardenUnicodeText } = require(process.env.RUNNER_TEMP + '/gh-aw/safeoutputs/sanitize_content_core.cjs');
    const marker = 'SECRET-MARKER-test';
    const obf = marker.split('').join('\u034F'); // Combining Grapheme Joiner between each char
    const result = hardenUnicodeText(obf);
    console.log('CGJ survives:', result.includes('\u034F')); // → true
- Pass the obfuscated string as the body of a create_issue safe-outputs call from an individual native-card workflow (one that uses MCP tool calls rather than frontmatter-declared safe-outputs).
- Observe: safe_outputs executes without a detection gate; the issue body contains the obfuscated string verbatim.
- Confirm the detection gap by comparing the compiled safe_outputs job: native-card has needs: agent; pentest-full has needs: [agent, detection].
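The comparison in the last step can be sketched as a small check over the compiled jobs. This is a minimal sketch, not part of gh-aw: it assumes the compiled workflow has already been parsed into a plain object with a jobs map, and both hasDetectionGate and the two sample objects are hypothetical, built only to mirror the needs values reported above.

```javascript
// Hypothetical helper: reports whether a compiled job depends on a `detection` job.
// Assumes `workflow` is the compiled workflow already parsed into a plain object.
function hasDetectionGate(workflow, jobName = 'safe_outputs') {
  const job = workflow.jobs?.[jobName];
  if (!job) return false;
  // GitHub Actions allows `needs` to be a single string or an array of strings.
  const needs = [].concat(job.needs ?? []);
  return needs.includes('detection') && Boolean(workflow.jobs.detection);
}

// Sample shapes mirroring the two compiled workflows described above (illustrative only).
const nativeCard = {
  jobs: { agent: {}, safe_outputs: { needs: 'agent' } },
};
const pentestFull = {
  jobs: {
    agent: {},
    detection: { needs: 'agent' },
    safe_outputs: { needs: ['agent', 'detection'] },
  },
};

console.log(hasDetectionGate(nativeCard));  // false
console.log(hasDetectionGate(pentestFull)); // true
```

A check of this shape only confirms that the dependency exists; it does not verify that safe_outputs is also conditioned on the detection result.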
Observed Behavior
- U+034F, U+200E, U+200F, and U+00AD survive hardenUnicodeText unchanged (output length equals input length, characters still present).
- Markdown table-cell and code-block fragmentation of marker strings also survives sanitizeContentCore intact.
- Native-card compiled workflows have no detection job and no detection dependency on safe_outputs, leaving the detection gate absent for MCP-driven writes.
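To check which of the four characters are still present in a sanitized string, a counter along the following lines could be run on the sanitizer's output. This is a sketch under stated assumptions: countSurvivors is a hypothetical helper, not a gh-aw function, and the example runs on the raw obfuscated string because the deployed sanitizer is not loaded here.

```javascript
// The four code points reported as surviving, with their Unicode names.
const SURVIVORS = new Map([
  [0x034f, 'COMBINING GRAPHEME JOINER'],
  [0x200e, 'LEFT-TO-RIGHT MARK'],
  [0x200f, 'RIGHT-TO-LEFT MARK'],
  [0x00ad, 'SOFT HYPHEN'],
]);

// Hypothetical helper: counts occurrences of each listed code point in `text`.
function countSurvivors(text) {
  const counts = {};
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (SURVIVORS.has(cp)) {
      const key = 'U+' + cp.toString(16).toUpperCase().padStart(4, '0');
      counts[key] = (counts[key] ?? 0) + 1;
    }
  }
  return counts;
}

// Same obfuscation as in the reproduction: one CGJ between each character.
const obf = 'SECRET-MARKER-test'.split('').join('\u034F');
console.log(countSurvivors(obf)); // { 'U+034F': 17 }
```

Passing the return value of hardenUnicodeText instead of obf would show whether any of the four code points remain after sanitization.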
Expected Behavior
- hardenUnicodeText should strip (or normalize) all invisible/format Unicode characters, including U+034F, U+200E, U+200F, and U+00AD. Consider extending to the full Unicode Cf (format) category; note that U+034F is general category Mn rather than Cf, so it would still need to be listed explicitly.
- Any workflow where write safe-outputs are configured, whether via frontmatter declarations or MCP tool calls, should compile with a detection job and gate safe_outputs on needs.detection.result == 'success', consistent with the documented automatic-detection guarantee.
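One possible shape for the extended strip set is sketched below. It is an illustration only, not the regex currently at line 935: stripInvisible and INVISIBLE_RE are hypothetical names, and the sketch assumes a JavaScript runtime with Unicode property escapes (the u flag with \p{...}).

```javascript
// Hypothetical replacement strip set; not the actual hardenUnicodeText regex.
// \p{Cf} matches the Unicode format category (includes U+00AD, U+200E, U+200F).
// U+034F is general category Mn, so it is listed explicitly.
const INVISIBLE_RE = /[\p{Cf}\u034F]/gu;

// Hypothetical helper: removes every character matched by INVISIBLE_RE.
function stripInvisible(text) {
  return text.replace(INVISIBLE_RE, '');
}

const marker = 'SECRET-MARKER-test';
for (const ch of ['\u034F', '\u200E', '\u200F', '\u00AD']) {
  const obf = marker.split('').join(ch);
  console.log(stripInvisible(obf) === marker); // true for each of the four
}
```

Stripping all of Cf is a blunt choice: the category also contains U+200D (Zero Width Joiner), so emoji ZWJ sequences in legitimate content would be split into their component emoji. Whether that trade-off is acceptable for safe-outputs bodies is a design decision for the maintainers.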
Security Relevance
Incomplete invisible-character stripping allows prompt-injection payloads to use standard Unicode obfuscation techniques to survive the sanitizer and reach GitHub API write endpoints undetected. The missing detection gate in native-card workflows means the "automatically enabled" detection guarantee does not hold for the MCP-driven write path, which is the primary write path for agentic tool-call workflows. Together these gaps reduce the effective depth of the output trust boundary.
gh-aw version (this workflow): v0.65.5
Original finding: https://github.com/githubnext/gh-aw-security/issues/1654
Generated by File Issue