fix: sanitize JSON before parsing in settler-delta-parser#133
fix: sanitize JSON before parsing in settler-delta-parser#133pzysvip99999 wants to merge 2 commits intoNarcooo:masterfrom
Conversation
…ors from crashing the pipeline The state validator was throwing an uncaught exception when the LLM output contains invalid JSON characters (e.g. control chars, unescaped sequences), causing the entire write pipeline to crash with 'State validator returned invalid JSON'. This wraps the entire function in a try-catch so that any such errors are returned as a validation issue rather than crashing.
Add sanitizeJSON() to strip control characters (\x00-\x1F\x7F) and trailing commas from LLM output before passing to JSON.parse. This prevents the parser from crashing when MiniMax or other providers emit invalid JSON sequences, allowing state settlement to proceed normally.
JiwaniZakir
left a comment
There was a problem hiding this comment.
The trailing-comma regex in sanitizeJSON — .replace(/,\s*([}\]])/g, "$1") — is not context-aware and will incorrectly mutate JSON string values that happen to contain patterns like ", }" or ", ]". For example, {"note": "array, }"} would be corrupted to {"note": "array }"}, producing a parse error or wrong data. This kind of sanitization really needs a proper tokenizer pass or at minimum a stricter heuristic scoped to whitespace-only trailing commas.
In state-validator.ts, wrapping the entire validateRuntimeState body in a try/catch and returning a validator_crash issue changes the failure contract significantly. The original logic in parseOrIssue never threw — it pushed to issues and returned null on failure — so the only things that could throw now are genuine programming errors (e.g., unexpected null dereferences in the hook/chapter-summary loops). Swallowing those as a recoverable issue code means bugs in the validator itself will silently appear as a soft validation failure rather than crashing visibly, making them harder to detect and diagnose. It would be safer to let unexpected exceptions propagate or at least log them before returning the synthetic issue.
Fix: sanitize JSON before parsing in settler-delta-parser
Problem
When the LLM (e.g. MiniMax) outputs JSON in the
RUNTIME_STATE_DELTAblock, it occasionally emits invalid sequences such as:\x00,\x1F, etc.)}or]Calling
JSON.parse()directly on this raw string throws an exception, causing the entire Phase 2b (Reflector) to fail. This means state files are never written, even though the chapter text itself is valid.Fix
Add a
sanitizeJSON()helper that runs beforeJSON.parse():This handles the two most common causes of LLM-generated JSON parse failures without changing the schema validation logic.
Testing
Combined with PR #132
PR #132 wraps
validateRuntimeStatein a try-catch to prevent crashes when validation itself fails. This PR fixes the root cause — invalid JSON reaching the parser in the first place. The two PRs together ensure: