Skip to content

fix: strip Anthropic-specific params from 3P provider paths#258

Open
auriti wants to merge 5 commits intoGitlawb:mainfrom
auriti:fix/repl-anthropic-params-3p
Open

fix: strip Anthropic-specific params from 3P provider paths#258
auriti wants to merge 5 commits intoGitlawb:mainfrom
auriti:fix/repl-anthropic-params-3p

Conversation

@auriti
Copy link
Copy Markdown
Collaborator

@auriti auriti commented Apr 3, 2026

Summary

Fixes 3 of 5 findings from #248 — silent failure modes affecting all third-party provider users. Findings 1 and 4 were already fixed upstream.

Root Causes

Finding Severity Root Cause
#2 Thinking blocks as text HIGH convertContentBlocks() serialized thinking blocks as <thinking>...</thinking> raw text, corrupting multi-turn context on 3P providers
#3 Unknown model 200k default HIGH Models not in the lookup table got 200k default — auto-compact never triggered, causing hard context_window_exceeded errors
#5 Session resume thinking blocks MEDIUM deserializeMessages() restored thinking blocks from Anthropic sessions, causing 400 errors when resumed against 3P providers

Changes

File Change
src/services/api/openaiShim.ts Strip thinking/redacted_thinking blocks instead of converting to text tags
src/utils/context.ts Use conservative 8k default for unknown 3P models + warning log
src/utils/conversationRecovery.ts Strip thinking blocks from deserialized messages for non-Anthropic providers
src/services/api/openaiShim.test.ts Regression test: thinking blocks must not leak as text in assistant messages

Issues addressed

Test plan

  • bun test src/services/api/openaiShim.test.ts — 5/5 pass (includes new regression test)
  • bun test src/utils/context.test.ts — 6/6 pass
  • Manual: resume an Anthropic session against an OpenAI provider — no 400 errors
  • Manual: use an unknown model — verify 8k default + warning in stderr

Three silent failure modes affecting all third-party provider users:

1. Thinking blocks serialized as <thinking> text corrupt multi-turn
   context — strip them instead of converting to raw text tags.

2. Unknown models fall through to 200k context window default, so
   auto-compact never triggers — use conservative 8k for unknown
   3P models with a warning log.

3. Session resume with thinking blocks causes 400 or context corruption
   on 3P providers — strip thinking/redacted_thinking content blocks
   from deserialized messages when resuming against a non-Anthropic
   provider.

Addresses findings 2, 3, and 5 from Gitlawb#248.
kevincodex1
kevincodex1 previously approved these changes Apr 4, 2026
@kevincodex1 kevincodex1 requested a review from gnanam1990 April 4, 2026 06:41
Copy link
Copy Markdown
Collaborator

@gnanam1990 gnanam1990 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Request Changes

The core fixes are correct and well-scoped — stripping thinking blocks instead of serializing them as text is the right call. A few issues need addressing before merge.


🔴 Issue 1 — Fragile opt-out provider check (conversationRecovery.ts)

const isThirdPartyProvider = provider !== 'firstParty' && provider !== 'bedrock' && provider !== 'vertex' && provider !== 'foundry'

This is an opt-out blocklist. Any new Anthropic-compatible provider added in the future will incorrectly strip thinking blocks by default, causing silent data loss on resume.

Suggestion: Flip to a positive allowlist:

const isThirdPartyProvider = provider === 'openai' || provider === 'gemini' || provider === 'github' || ...

Or better, add a dedicated isOpenAICompatibleProvider() utility that can be maintained in one place.


🔴 Issue 2 — No test coverage for stripThinkingBlocks in session resume

The conversationRecovery.ts change is the riskiest part of this PR — session resume is a complex flow and a regression here is hard to catch manually. The openaiShim.test.ts regression test is great, but this path has zero coverage.

Suggestion: Add a unit test for deserializeMessagesWithInterruptDetection (or stripThinkingBlocks directly) that verifies:

  • Thinking blocks are stripped for 3P providers
  • Text blocks are preserved
  • Messages that become empty after stripping are dropped without breaking role alternation
  • Thinking blocks are NOT stripped for firstParty/bedrock/vertex

🟡 Minor — console.error bypasses project logging (context.ts:84)

console.error(`[context] Warning: model "${model}" not in context window table...`)

The project uses a logError utility elsewhere. This should use the same abstraction for consistency and to ensure log output follows the same format/routing.


🟡 Minor — Silent message drop in stripThinkingBlocks

If an assistant message contains only thinking blocks, it is silently dropped. This could cause role-alternation issues in edge cases. Worth adding a logError or debug trace when a message is removed.


Overall this is a solid fix for real bugs. Addressing issues 1 and 2 before merge would make it safe to land.

@auriti
Copy link
Copy Markdown
Collaborator Author

auriti commented Apr 7, 2026

Updated this branch on latest main and addressed the requested changes.

What changed:

  • replaced the fragile opt-out provider check with a positive helper (isOpenAICompatibleProvider) so thinking stripping only applies to OpenAI-compatible providers
  • added focused resume-path coverage for deserializeMessagesWithInterruptDetection(...)
    • strips thinking blocks for OpenAI-compatible providers
    • preserves visible text
    • does not apply the 3P stripping path to Anthropic-compatible providers
    • stays consistent with the existing orphan-thinking filtering behavior
  • replaced the console.error warning path in context.ts with logError
  • added a debug log when a resume assistant message is removed because it contains only thinking blocks

Validation run locally:

  • bun test src/utils/conversationRecovery.test.ts src/services/api/openaiShim.test.ts src/utils/context.test.ts
  • bun run smoke

@auriti
Copy link
Copy Markdown
Collaborator Author

auriti commented Apr 7, 2026

Follow-up pushed for the provider-sensitive resume test placement. The latest smoke-and-tests run is now green.

@kevincodex1
Copy link
Copy Markdown
Contributor

@auriti please fix conflicts bro

auriti added a commit that referenced this pull request Apr 20, 2026
- Strip store field from request body for local providers (Ollama, vLLM)
  that reject unknown JSON fields with 400 errors
- Add Gemini 3.x model context windows and output token limits
  (gemini-3-flash-preview, gemini-3.1-pro-preview, google/ OpenRouter variants)
- Preserve reasoning_content on assistant tool-call message replays
  for providers that require it (Kimi k2.5, DeepSeek reasoner)
- Use conservative max_output_tokens fallback (4096/16384) for unknown
  3P models to prevent vLLM/Ollama 400 errors from exceeding max_model_len

Consolidates fixes from: #258, #268, #237, #643, #666, #677

Co-authored-by: auriti <auriti@users.noreply.github.com>
Co-authored-by: Gustavo-Falci <Gustavo-Falci@users.noreply.github.com>
Co-authored-by: lttlin <lttlin@users.noreply.github.com>
Co-authored-by: Durannd <Durannd@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants