fix: sanitize JSON before parsing in settler-delta-parser #133

Open
pzysvip99999 wants to merge 2 commits into Narcooo:master from pzysvip99999:fix/settler-json-sanitizer

@pzysvip99999

Fix: sanitize JSON before parsing in settler-delta-parser

Problem

When the LLM (e.g. MiniMax) outputs JSON in the RUNTIME_STATE_DELTA block, it occasionally emits invalid sequences such as:

  • Control characters (\x00, \x1F, etc.)
  • Trailing commas before } or ]

Calling JSON.parse() directly on this raw string throws an exception, causing the entire Phase 2b (Reflector) to fail. This means state files are never written, even though the chapter text itself is valid.
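
As a minimal reproduction of the failure described above, the sketch below feeds two hand-written payloads to JSON.parse(). The payload contents and the tryParse helper are illustrative assumptions, not taken from the project or from real MiniMax output.

```typescript
// Hypothetical payloads resembling what an LLM might emit in the delta block.
const withTrailingComma = '{"chapter": 12, "hooks": ["a", "b",],}';
// "\u0001" in this TypeScript literal becomes a raw control character in the string.
const withControlChar = '{"note": "line one\u0001line two"}';

// Illustrative helper: reports whether JSON.parse accepts the input.
function tryParse(raw: string): { ok: boolean; error?: string } {
  try {
    JSON.parse(raw);
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

const trailingCommaResult = tryParse(withTrailingComma);
const controlCharResult = tryParse(withControlChar);
```

Both calls are rejected by a standards-compliant JSON.parse(), since JSON allows neither trailing commas nor unescaped control characters inside strings.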

Fix

Add a sanitizeJSON() helper that runs before JSON.parse():

function sanitizeJSON(str: string): string {
  return str
    .replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g, "")  // strip control chars
    .replace(/,\s*([}\]])/g, "$1");                     // strip trailing commas
}

This handles the two most common causes of LLM-generated JSON parse failures without changing the schema validation logic.
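
For illustration, here is how the helper might be used in front of JSON.parse(). The sanitizeJSON body is copied from this PR; the input string and the asserted shape of the parsed object are made-up assumptions for the example.

```typescript
// Copied from this PR.
function sanitizeJSON(str: string): string {
  return str
    .replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g, "") // strip control chars (keeps \t, \n, \r)
    .replace(/,\s*([}\]])/g, "$1");                    // strip trailing commas
}

// Hypothetical raw delta: one stray control character and two trailing commas.
const rawDelta = '{"chapter": 12,\u0001 "hooks": ["a", "b",],}';

// Assumed shape, for the example only.
const parsedDelta = JSON.parse(sanitizeJSON(rawDelta)) as {
  chapter: number;
  hooks: string[];
};
```

Note that the character class leaves tab, line feed, and carriage return untouched, so ordinary pretty-printed JSON passes through unchanged.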

Testing

# Before: write next crashes at Phase 2b, state files never updated
# After: JSON parsed successfully, state settlement completes normally

Combined with PR #132

PR #132 wraps validateRuntimeState in a try-catch to prevent crashes when validation itself fails. This PR fixes the root cause — invalid JSON reaching the parser in the first place. The two PRs together ensure:

  1. JSON parse errors are recovered before state settlement (this PR)
  2. Any remaining validation errors don't crash the pipeline (PR #132: wrap validateRuntimeState in try-catch to prevent pipeline crash)

OpenClaw Bot added 2 commits April 1, 2026 00:23
…ors from crashing the pipeline

The state validator was throwing an uncaught exception when the LLM output
contains invalid JSON characters (e.g. control chars, unescaped sequences),
causing the entire write pipeline to crash with 'State validator returned
invalid JSON'. This wraps the entire function in a try-catch so that any
such errors are returned as a validation issue rather than crashing.

Add sanitizeJSON() to strip control characters (\x00-\x1F\x7F) and
trailing commas from LLM output before passing to JSON.parse. This
prevents the parser from crashing when MiniMax or other providers emit
invalid JSON sequences, allowing state settlement to proceed normally.

@JiwaniZakir JiwaniZakir left a comment

The trailing-comma regex in sanitizeJSON, .replace(/,\s*([}\]])/g, "$1"), is not context-aware and will incorrectly mutate JSON string values that happen to contain patterns like ", }" or ", ]". For example, {"note": "array, }"} would be rewritten to {"note": "array}"}, which still parses but silently changes the data. This kind of sanitization really needs a proper tokenizer pass or at minimum a stricter heuristic scoped to whitespace-only trailing commas.
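
One possible shape for a context-aware pass is sketched below: a single scan that tracks whether the cursor is inside a string literal and only drops commas that sit outside strings and are followed (after optional whitespace) by } or ]. The function name stripTrailingCommas and the sample input are hypothetical; this is not code from the PR and it is not a full JSON tokenizer.

```typescript
// Hypothetical replacement for the trailing-comma regex; a sketch only.
function stripTrailingCommas(input: string): string {
  let out = "";
  let inString = false; // true while the cursor is inside a "..." literal
  let escaped = false;  // true right after a backslash inside a string

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);

    if (inString) {
      out += ch; // string contents are copied verbatim
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }

    if (ch === '"') {
      inString = true;
      out += ch;
      continue;
    }

    if (ch === ",") {
      // Look ahead past whitespace to see what follows the comma.
      let j = i + 1;
      while (j < input.length && /\s/.test(input.charAt(j))) j++;
      const next = input.charAt(j);
      if (next === "}" || next === "]") continue; // drop the trailing comma
    }

    out += ch;
  }
  return out;
}

// The string value contains ", }" and must survive; the two real trailing commas must go.
const reviewSample = '{"note": "array, }", "xs": [1, 2, ],}';
const reviewParsed = JSON.parse(stripTrailingCommas(reviewSample)) as {
  note: string;
  xs: number[];
};
```

The trade-off is a few more lines than the regex in exchange for leaving string values untouched; malformed input with unbalanced quotes would still need separate handling.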

In state-validator.ts, wrapping the entire validateRuntimeState body in a try/catch and returning a validator_crash issue changes the failure contract significantly. The original logic in parseOrIssue never threw — it pushed to issues and returned null on failure — so the only things that could throw now are genuine programming errors (e.g., unexpected null dereferences in the hook/chapter-summary loops). Swallowing those as a recoverable issue code means bugs in the validator itself will silently appear as a soft validation failure rather than crashing visibly, making them harder to detect and diagnose. It would be safer to let unexpected exceptions propagate or at least log them before returning the synthetic issue.
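
The "log before returning the synthetic issue" option could look roughly like the sketch below. The ValidationIssue and ValidationResult types, the wrapper name validateWithVisibleCrash, and the injected log function are assumptions made for the example; only the validator_crash code and the validateRuntimeState name come from the PR under review.

```typescript
// Hypothetical types; the real ones live in state-validator.ts and may differ.
interface ValidationIssue {
  code: string;
  message: string;
}
interface ValidationResult {
  issues: ValidationIssue[];
}

// Wraps a validation run so an unexpected throw is logged loudly
// before being converted into a synthetic issue.
function validateWithVisibleCrash(
  runValidation: () => ValidationResult,
  log: (message: string, err: unknown) => void = console.error,
): ValidationResult {
  try {
    return runValidation();
  } catch (err) {
    // Make the validator bug visible instead of silently downgrading it.
    log("validateRuntimeState threw unexpectedly", err);
    return {
      issues: [
        {
          code: "validator_crash",
          message: err instanceof Error ? err.message : String(err),
        },
      ],
    };
  }
}
```

Letting the exception propagate instead would amount to replacing the return in the catch block with `throw err` after logging.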
