fix: normalize malformed Bash tool arguments from OpenAI-compatible providers#385
fix: normalize malformed Bash tool arguments from OpenAI-compatible providers#385kevincodex1 merged 8 commits intomainfrom
Conversation
|
Thanks you so much! |
|
@gnanam1990 is this good for review bro? |
|
@kevincodex1 pushed an update for this and it is ready for another review pass.\n\nThis PR addresses #383. The latest revision fixes the remaining local issues I found around invalid Bash argument normalization and malformed streaming Bash input, and I re-checked it locally with the targeted test file plus build before pushing. |
|
one more eyes here @Vasanthdev2004 when you can |
| if (isRecord(parsed)) { | ||
| return parsed | ||
| } | ||
| if (toolName === 'Bash') { |
There was a problem hiding this comment.
This still turns malformed non-object Bash arguments into executable commands. On the current head,
ormalizeToolArguments('Bash', 'false'),
ormalizeToolArguments('Bash', 'null'), and
ormalizeToolArguments('Bash', '[]') all return { command: ... } instead of preserving the invalid input as raw. I reproduced the same thing through the real non-streaming shim path: a provider response with unction.arguments: 'false' becomes a ool_use block with input: { command: 'false' }. Main does not do that — it preserves the parsed boolean/null/array instead of silently converting them into Bash commands. Since this PR's stated goal is to stop malformed Bash args from becoming commands, these JSON-literal cases still violate that contract.
Vasanthdev2004
left a comment
There was a problem hiding this comment.
Rechecked the latest head e214b8a360ee38190f92ff34f608a2fa6f7fba5a against current origin/main.
I still can't approve this because there is one branch-specific normalization regression left.
Current blocker:
-
normalizeToolArguments()still turns malformed non-object Bash arguments into executable commands.
The latest commit improves empty strings and structured-object-looking inputs, but JSON literals likefalse,null, and[]are still coerced into{ command: ... }for the Bash tool.Direct repros on this head:
normalizeToolArguments('Bash', 'false')->{ command: 'false' }normalizeToolArguments('Bash', 'null')->{ command: 'null' }normalizeToolArguments('Bash', '[]')->{ command: '[]' }
I also reproduced it through the real non-streaming shim path: a provider response with
function.arguments: 'false'becomes atool_useblock withinput: { command: 'false' }.Current main does not do that — it preserves the parsed boolean/null/array instead of silently converting them into Bash commands. Since this PR's stated goal is to stop malformed Bash args from becoming commands, these JSON-literal cases still violate that contract.
What I verified on this head:
- direct helper repros above
- direct non-streaming shim repro for
function.arguments: 'false' - direct repros for the fixed cases:
- malformed object literal now stays raw
- empty / whitespace-only terminal Bash args stay raw
bun test src/services/api/openaiShim.test.ts-> 20 passbun run build-> successbun run smoke-> success
I didn't find a compile/runtime blocker outside that, but I wouldn't merge this until malformed JSON literals stop becoming Bash commands.
|
maybe please rebase to latest main @gnanam1990 and then will asked @VennDev help in testing this |
e214b8a to
930b6a3
Compare
|
@kevincodex1 rebased this branch onto the latest This also fixes the remaining blocker from @Vasanthdev2004: malformed Bash JSON literals like Rechecked locally on the rebased head with:
@VennDev when you have time, please re-test this latest head as well. |
|
@kevincodex1 @Vasanthdev2004 @VennDev pushed one more update for this PR. This latest commit is Manual recheck on the latest branch state:
From my side, the earlier Bash normalization cases and the rebased test/CI issues are no longer reproducing on the latest head. When you have time, please recheck this newest commit. |
|
@gnanam1990 conflicts |
d38c58a to
f2fc454
Compare
|
@kevincodex1 @Vasanthdev2004 @VennDev pushed one more rebase update for this PR. This resolves the follow-up rebase conflicts in the test files as well. Rechecked on the latest head:
From my side, the earlier normalization issue, the rebased CI instability, and the rebase conflict fallout are no longer reproducing on the newest commit. |
|
@Vasanthdev2004 latest head is updated and |
Vasanthdev2004
left a comment
There was a problem hiding this comment.
Rechecked the latest head f2fc454baf6b537db50fcf50698662e4e97b2c71 against current origin/main 39f3b2babd43d65546f1f71388596d354047316d.
I still can't approve this because one normalization regression remains in src/services/api/toolArgumentNormalization.ts.
Current blocker:
-
The structured-object guard only recognizes strict JSON object literals with quoted keys, so malformed object-shaped Bash args like
{command:"pwd"}or{'command':'pwd'}still fall through towrapPlainStringToolArguments()and become executable{ command: ... }inputs.Direct repros on this head:
normalizeToolArguments('Bash', '{command:"pwd"}')->{ command: '{command:"pwd"}' }- real non-streaming shim path returns
input: { command: '{command:"pwd"}' } - real streaming shim path emits
{"command":"{command:\"pwd\"}"}
Current
origin/maindoes not do that:- non-streaming preserves
{ raw: '{command:"pwd"}' } - streaming emits the raw partial
{command:"pwd"}
Since this PR's goal is to stop malformed Bash args from silently becoming commands, these object-like non-JSON literals are still unsafe and block approval.
What I verified on this head:
bun test src/services/api/openaiShim.test.ts-> 25 passbun run build-> successbun run smoke-> success
…cases
- Extend STRING_ARGUMENT_TOOL_FIELDS to normalize Read, Write, Edit,
Glob, and Grep plain-string arguments (fixes "Invalid tool parameters"
errors reported by VennDev)
- Normalize streaming Bash args regardless of finish_reason, not only
when finish_reason is 'tool_calls'
- Broaden isLikelyStructuredObjectLiteral to catch malformed object-shaped
strings like {command:"pwd"} and {'command':'pwd'} (fixes CR2 from
Vasanthdev2004)
- Apply blank/object-literal guard to all tools, not just Bash
- Extract duplicated JSON repair suffix combinations into shared constant
- Add 32 isolated unit tests for toolArgumentNormalization
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@kevincodex1 @Vasanthdev2004 @VennDev pushed another update ( This commit addresses all outstanding review feedback: Vasanthdev2004's change requests:
VennDev's "Invalid tool parameters" errors (Read, Write, Edit, Glob, Grep):
Additional hardening:
Local verification on latest head:
|
Vasanthdev2004
left a comment
There was a problem hiding this comment.
Rechecked the latest head b20d878b76f5b3e15409f9223a44a8f979c78161 against current origin/main 39f3b2babd43d65546f1f71388596d354047316d.
The previous malformed-object blocker is fixed on this head, but I still can't approve because the new streaming change introduces another regression.
Current blocker:
-
src/services/api/openaiShim.ts:694-705now normalizes buffered Bash arguments on every stream stop, includingfinish_reason: "length". That turns a max-token-truncated Bash tool call into a valid executable{ command: ... }payload instead of preserving the incomplete raw input.Direct repro on this head:
- stream first chunk with
function.arguments: 'rg --fi' - end with
finish_reason: 'length' - emitted
input_json_delta.partial_jsonbecomes{"command":"rg --fi"}
Current
origin/maindoes not do that for the same repro; it emits the raw partialrg --fi.This is not harmless metadata drift: the app treats any assistant message containing a
tool_useblock as a real follow-up/tool-execution turn regardless ofstop_reason(src/query.ts:828-842,src/utils/messages.ts:830-837). So converting a truncated stream into a structured Bash command can make an incomplete command executable.The new test at
src/services/api/openaiShim.test.ts:1380-1423also locks in that behavior by asserting{"command":"rg --fi"}for thefinish_reason: 'length'case. - stream first chunk with
What I verified on this head:
- previous
{command:"pwd"}malformed-object case now stays raw - direct streaming repro above on latest head
- same streaming repro on current
origin/mainfor baseline bun test src/services/api/toolArgumentNormalization.test.ts src/services/api/openaiShim.test.ts-> 57 passbun run build-> successbun run smoke-> success
Truncated tool calls (finish_reason: 'length') now preserve the raw buffer instead of normalizing into executable commands, preventing incomplete commands from becoming runnable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@Vasanthdev2004 good catch — pushed fix in Streaming normalization now skips The test at Local verification:
|
|
@Vasanthdev2004 quick process note — the last three review rounds each raised a single new issue that was visible in the same code at the same time:
Each of these was a valid catch, but batching all findings into a single review would save both of us multiple push → CI → wait → review cycles. If you spot multiple issues, please list them all in one pass so we can address everything together. Thanks! |
|
can you help test if its all good @VennDev ? |
- Remove all { raw: ... } returns that caused InputValidationError with
z.strictObject schemas — return {} instead for clean Zod errors
- Extend normalizeAtStop buffering to all mapped tools (Read, Write,
Edit, Glob, Grep) so streaming paths also get normalized
- Make repairPossiblyTruncatedObjectJson generic — repair any valid
JSON object, not just ones with a command field
- Export hasToolFieldMapping for streaming normalizeAtStop decision
- Skip normalization on finish_reason: length to preserve raw truncated
buffer
- Update all test expectations to match new behavior
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@kevincodex1 @Vasanthdev2004 @VennDev pushed comprehensive hardening update ( This commit addresses all outstanding review feedback and every edge case found across all review rounds: What changed1. Removed all 2. Extended streaming normalization to all tools (Read, Write, Edit, Glob, Grep) 3. Made 4. Skip normalization on 5. JSON literals preserved (Vasanthdev2004 CR1) 6. Malformed object-shaped strings rejected (Vasanthdev2004 CR2) Manual verification with
|
| Tool | Result |
|---|---|
Bash (run pwd) |
Working |
Read (read package.json) |
Working |
Write (create file) |
Working |
Edit (change content) |
Working |
Glob (find .ts files) |
Working (ripgrep not installed — system dep) |
Grep (search pattern) |
Working (same ripgrep note) |
Test verification
bun test src/services/api/toolArgumentNormalization.test.ts src/services/api/openaiShim.test.ts→ 57 pass, 0 failbun test --max-concurrency=1→ 512 pass, 0 fail across 77 filesbun run scripts/build.ts→ success (v0.1.8)node dist/cli.mjs --version→0.1.8 (Open Claude)
kevincodex1
left a comment
There was a problem hiding this comment.
I think we are good now.
Vasanthdev2004
left a comment
There was a problem hiding this comment.
Rechecked the latest head 3a25d71004a853e82f4e5821a61675ac0aeb6786 against current origin/main aff2bd89e22161f57ef330737b7f149208b96097.
I did not find a remaining blocker on this revision.
What I verified on this head:
- malformed Bash object-literal cases now stay non-executable in both paths:
- helper repros:
{command:"pwd"}and{'command':'pwd'}->{} - non-streaming shim path ->
input: {} - streaming stop-path -> emitted partial JSON
{}
- helper repros:
- malformed JSON literals no longer become Bash commands:
- non-streaming
function.arguments: 'false'->input: false
- non-streaming
- truncated streaming Bash input is preserved raw on
finish_reason: 'length':- repro with
rg --fistill emits rawrg --fi, not a normalized command object
- repro with
- mapped plain-string helpers still normalize as intended:
Read('/tmp/file.txt')->{ file_path: '/tmp/file.txt' }Glob('**/*.ts')->{ pattern: '**/*.ts' }
bun test src/services/api/toolArgumentNormalization.test.ts src/services/api/openaiShim.test.ts src/utils/providerProfiles.test.ts src/utils/providerFlag.test.ts src/utils/model/modelOptions.github.test.ts src/ink/termio/osc.test.tsonly hit the oldproviderProfiles.test.ts+modelOptions.github.test.tssuite interaction; I reproduced that same pair failure on currentorigin/main, so I am not counting it as a branch-specific blocker herebun run build-> successbun run smoke-> success
Residual risk:
- the touched code still relies mainly on unit-level shim coverage rather than a full end-to-end tool execution integration test, but I didn't find a concrete correctness issue on the current head.
|
@VennDev That test failure is not from this PR's changes — it's a Please try: git fetch origin
git checkout fix/383-bash-tool-args
git pull origin fix/383-bash-tool-args
bun test --max-concurrency=1On the latest branch head ( |
It's done! |
|
I will try build and link test it! |
|
Still happening! :v Maybe I shouldn’t post images because I’m scared of getting banned by GitHub :v |
|
@VennDev The first two Bash calls ( The third error ( This behavior is the same on Which provider/model are you testing with? |
|
Im testing with |
|
@VennDev Thanks for confirming! Gemini 3.1 Pro is exactly the provider this PR targets — and your first two Bash calls succeeding confirms the normalization is working correctly for it. The third empty-argument error is the model's decision not to send a command — that's not something we can fix on the normalization side. It would happen on |
|
Thank you everyone for your help! |
|
I think we are good now here and this can be merge already. @VennDev if you have any issue related to this please file it again and just link this PR for additional context |
…roviders (Gitlawb#385) * fix: normalize malformed Bash tool arguments from OpenAI-compatible providers * fix: keep invalid Bash tool args from becoming commands * fix: preserve malformed Bash JSON literals * test: stabilize rebased PR 385 checks * test: isolate provider profile env assertions * fix: extend tool argument normalization to all tools and harden edge cases - Extend STRING_ARGUMENT_TOOL_FIELDS to normalize Read, Write, Edit, Glob, and Grep plain-string arguments (fixes "Invalid tool parameters" errors reported by VennDev) - Normalize streaming Bash args regardless of finish_reason, not only when finish_reason is 'tool_calls' - Broaden isLikelyStructuredObjectLiteral to catch malformed object-shaped strings like {command:"pwd"} and {'command':'pwd'} (fixes CR2 from Vasanthdev2004) - Apply blank/object-literal guard to all tools, not just Bash - Extract duplicated JSON repair suffix combinations into shared constant - Add 32 isolated unit tests for toolArgumentNormalization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: skip streaming normalization on finish_reason length Truncated tool calls (finish_reason: 'length') now preserve the raw buffer instead of normalizing into executable commands, preventing incomplete commands from becoming runnable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: comprehensive tool argument normalization hardening - Remove all { raw: ... } returns that caused InputValidationError with z.strictObject schemas — return {} instead for clean Zod errors - Extend normalizeAtStop buffering to all mapped tools (Read, Write, Edit, Glob, Grep) so streaming paths also get normalized - Make repairPossiblyTruncatedObjectJson generic — repair any valid JSON object, not just ones with a command field - Export hasToolFieldMapping for streaming normalizeAtStop decision - Skip normalization on finish_reason: length to preserve raw truncated buffer - Update all test expectations to match new behavior Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


























Summary
Testing
\"/home/gnanasekaran/.bun/bin/bun\" test src/services/api/openaiShim.test.ts\"/home/gnanasekaran/.bun/bin/bun\" run scripts/build.tsFixes #383