Skip to content

Feature/model agnostic converse api#1

Open
OmoleOO wants to merge 16 commits intomainfrom
feature/model-agnostic-converse-api
Open

Feature/model agnostic converse api#1
OmoleOO wants to merge 16 commits intomainfrom
feature/model-agnostic-converse-api

Conversation

@OmoleOO
Copy link

@OmoleOO OmoleOO commented Feb 4, 2026

Summary

Refactors the chatbot to use AWS Bedrock's unified Converse API for model-agnostic inference, enabling support for any Bedrock model provider without code changes.

Key Changes

  • Worker Lambda: Replace InvokeModelWithResponseStream with ConverseStreamCommand for unified streaming across all model providers
  • Config Schema: Remove Anthropic-specific fields (anthropicVersion, topK), add modelSpecific for provider-specific parameters
  • CLI Validation: Add model ID pattern matching for known providers and parameter range validation
  • IAM Permissions: Add bedrock:Converse and bedrock:ConverseStream permissions

Supported Model Providers

Provider Example Model ID
Anthropic anthropic.claude-3-5-sonnet-20240620-v1:0
Amazon amazon.titan-text-express-v1, amazon.nova-lite-v1:0
Meta meta.llama3-70b-instruct-v1:0
Cohere cohere.command-r-plus-v1:0
Mistral mistral.mistral-large-2402-v1:0
AI21 ai21.jamba-1-5-large-v1:0
DeepSeek deepseek.deepseek-r1-v1:0

Why Converse API?

The Converse API provides:

  • Unified request/response format across all Bedrock models
  • Internal handling of model-specific differences
  • Streaming support via ConverseStream
  • Simpler codebase without custom adapters per provider

Test Plan

  • All 61 CLI tests pass (including 36 new model validation tests)
  • CDK synthesizes successfully
  • Manual testing with Claude model
  • Manual testing with Titan model (optional)
  • Verify streaming works end-to-end

Breaking Changes

Configuration schema changes require updating existing SSM parameters:

  • Remove model.anthropicVersion
  • Remove generation.topK
  • Add modelSpecific: {} (optional)

Run npm run cli config reset to apply new defaults after deployment.

Remove Anthropic-specific configuration fields to support any Bedrock
model provider through the unified Converse API:

- Remove model.anthropicVersion (Converse API handles versioning internally)
- Remove generation.topK (not universally supported across models)
- Add modelSpecific object for provider-specific parameters passed as
  additionalModelRequestFields when needed

This enables switching between Claude, Titan, Llama, Cohere, Mistral,
and AI21 models without config schema changes.
…ference

Migrate from model-specific InvokeModelWithResponseStream to the unified
ConverseStream API that works across all Bedrock model providers:

- Replace InvokeModelWithResponseStreamCommand with ConverseStreamCommand
- Update payload construction to use Converse API message format
- Parse streaming responses via contentBlockDelta events
- Add proper error handling for stream exceptions
- Use ConverseCommand for inference profile fallback path
- Remove Anthropic-specific request body construction

The Converse API provides a consistent interface across Claude, Titan,
Llama, Cohere, Mistral, and AI21 models, eliminating the need for
model-specific payload builders and response parsers.
Update CLI validation to support multiple Bedrock model providers:

- Remove anthropicVersion and topK from required field validation
- Add validateModelId() with regex patterns for known providers:
  Anthropic, Amazon (Titan/Nova), Meta, Cohere, Mistral, AI21, DeepSeek
- Add validateGenerationParams() for parameter range validation:
  - temperature: 0-1
  - topP: 0-1
  - maxTokens: >= 1
- Export new validation functions for use in tests

Provides early feedback when users configure unknown model providers
or invalid parameter ranges, improving the developer experience.
Update existing test fixtures to align with the new model-agnostic
configuration schema:

- Remove anthropicVersion from all test config objects
- Remove topK from generation parameters in fixtures
- Update expected error count for empty config validation (11 fields)
- Use valid model IDs (e.g., anthropic.claude-*, amazon.titan-*) in
  fixtures to pass new model ID validation
- Add modelSpecific field to valid config fixtures

All 25 existing tests continue to pass with the updated schema.
Add new test file covering model-agnostic validation functions:

Model ID Validation (17 test cases):
- Valid patterns for all supported providers (Claude, Titan, Nova,
  Llama, Cohere, Mistral, AI21, DeepSeek)
- Rejection of unknown providers and malformed IDs
- Edge cases: null, undefined, non-string inputs

Generation Parameter Validation (12 test cases):
- Boundary testing for temperature and topP (0, 1)
- Out-of-range rejection (negative values, > 1)
- maxTokens minimum value validation
- Multiple error accumulation

Cross-model Configuration (7 test cases):
- Full config validation with different model providers
- Combined model and parameter error reporting

Total: 36 new tests ensuring robust multi-model support.
Add unit tests for the refactored worker Lambda with mocked AWS SDK:

ConverseStream API Integration:
- Verify correct parameter construction for Converse API
- Test streaming delta parsing and WebSocket forwarding
- Test stream error handling (modelStreamErrorException)

Model-Specific Configuration:
- Test additionalModelRequestFields injection
- Verify different model IDs (Titan, Llama) work correctly

Knowledge Base Integration:
- Test context retrieval flow
- Verify systemWithContext prompt selection

Mock Mode:
- Test MOCK KB ID bypasses Bedrock calls

Inference Profile Fallback:
- Test ConverseCommand (non-streaming) path

Tests use Jest with ES module mocking for AWS SDK clients.
Grant bedrock:Converse and bedrock:ConverseStream permissions to the
worker Lambda function to enable the model-agnostic Converse API:

- bedrock:Converse - non-streaming inference (inference profile path)
- bedrock:ConverseStream - streaming inference (primary path)

Retains legacy InvokeModel permissions for backwards compatibility
during migration period. These can be removed in a future cleanup
once the Converse API is confirmed stable in production.
Update the example configuration file to reflect the new model-agnostic
schema:

- Remove anthropicVersion from model section
- Remove topK from generation parameters
- Add empty modelSpecific object for optional provider-specific params

Users can now use this template with any supported Bedrock model by
simply changing the modelId field.
@OmoleOO OmoleOO self-assigned this Feb 4, 2026
@OmoleOO OmoleOO marked this pull request as ready for review February 4, 2026 12:58
@OmoleOO OmoleOO requested a review from mfittko as a code owner February 4, 2026 12:58
@OmoleOO OmoleOO added the enhancement New feature or request label Feb 4, 2026
@OmoleOO OmoleOO requested a review from Copilot February 4, 2026 14:47
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the chatbot to use the Bedrock Converse/ConverseStream APIs and introduces model-agnostic configuration and validation, so that multiple Bedrock providers can be targeted via a unified config and worker implementation.

Changes:

  • CLI config validation: adds provider-aware modelId validation and generation parameter range checks, and removes Anthropic-specific schema fields.
  • Worker Lambda: replaces InvokeModelWithResponseStream and custom signing logic with ConverseStreamCommand / ConverseCommand, adds support for modelSpecific fields, and introduces comprehensive unit tests around streaming, KB usage, mock mode, and inference profiles.
  • Infrastructure and schema: updates the configuration schema, example config, and worker IAM to reflect Converse usage and new optional modelSpecific settings.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tools/cli/lib/config.js Adds MODEL_PATTERNS, validateModelId, and validateGenerationParams, integrates them into validateConfig, and removes validation of model.anthropicVersion/generation.topK.
tools/cli/__tests__/model-validation.test.js New Jest tests covering model ID validation across supported providers, generation parameter range validation, and cross-model config validation paths.
tools/cli/__tests__/config.test.js Aligns existing config validation tests with the new schema (no anthropicVersion/topK) and ensures modelSpecific is round-tripped where appropriate.
lambda/worker/index.js Switches the worker to Converse/ConverseStream; builds Converse request objects from the shared config, wires in optional modelSpecific via additionalModelRequestFields, and simplifies the inference-profile path using ConverseCommand.
lambda/worker/__tests__/handler.test.js Adds unit tests for ConverseStream usage, WebSocket streaming behavior, model-specific overrides, KB retrieval integration, mock mode, and inference profile fallback.
lambda/config-schema.js Updates the configuration schema to remove Anthropic-specific fields, document Converse-based behavior, and add the modelSpecific section to the default config.
config.example.json Updates the example config JSON to match the new schema (no anthropicVersion/topK, adds modelSpecific).
cdk/lib/api-stack.ts Extends the worker Lambda role with bedrock:Converse and bedrock:ConverseStream permissions while retaining legacy InvokeModel* and KB-related actions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…ders

Change unknown provider check from blocking error to informational warning
to support truly model-agnostic operation:

- validateModelId now returns {error, warning} instead of string
- Known providers (anthropic, amazon, meta, etc.) pass silently
- Unknown providers with valid format (provider.model-name) pass with warning
- Only structurally invalid IDs (missing dot separator) cause errors
- Update CLI handlers to display warnings without blocking updates
- Add includeWarnings option to validateConfig for detailed validation

This ensures the system works with any future Bedrock providers without
requiring code changes, while still providing helpful feedback for
potentially mistyped model IDs.
Copy link

@mfittko mfittko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Approved

Excellent implementation of SSM-based dynamic configuration with a professional CLI tool.

Key Strengths:

  • Clean architecture with proper separation of concerns
  • Comprehensive test coverage (15 tests, all passing)
  • Well-documented with IAM policies included
  • Backward compatible (no breaking changes)
  • Smart caching strategy (5-min TTL)

What Works Well:

  • CLI tool provides intuitive config management (get, set, update, validate, backup, restore)
  • Lambda worker gracefully falls back to defaults if SSM unavailable
  • Knowledge Base can be enabled/disabled dynamically
  • Proper error handling and validation

Minor Notes:

  • Config changes take up to 5 minutes to apply (cache TTL) - acceptable and documented
  • Consider adding integration tests in future PR (unit tests are comprehensive)

Ready to merge. 🚀

Copy link

@mfittko mfittko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Found 2 issues that need attention:

Issue 1: Multi-turn conversation history dropped (HIGH SEVERITY)

The new Converse API implementation hardcodes messages to a single user message, completely ignoring job.messages which contains the conversation history. This breaks multi-turn conversations - every request is treated as isolated with no memory of prior exchanges.

Location: lambda/worker/index.js:140

Issue 2: Stream errors leave client hanging (MEDIUM SEVERITY)

When internalServerException or modelStreamErrorException occurs, the code throws without sending any WebSocket notification to the client. The frontend has no timeout or error handler, so users are left with a loading state indefinitely.

Location: lambda/worker/index.js:178-181

Checked for bugs and compliance with CLAUDE.md/AGENTS.md (no guideline files found).

// Build Converse API parameters (model-agnostic format)
const converseParams = {
modelId: config.model.modelId,
messages: [{ role: 'user', content: [{ text: user }] }],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-turn conversation history is dropped

The new code hardcodes messages to a single user message, completely ignoring job.messages which contains the conversation history.

Previous behavior: The old code (main branch, lines 143-181) normalized job.messages and included the full conversation history, allowing multi-turn conversations with context.

Impact: Every API call is now treated as a brand new conversation with no memory. Users cannot have ongoing conversations where the model remembers prior exchanges.

Suggested fix: Restore conversation history handling:

  1. Parse and normalize job.messages array
  2. Convert messages to Converse API format with content: [{text: '...'}] structure
  3. Pass the full message history to the messages parameter

The Converse API supports multi-turn conversations - see AWS docs.

break
}
// Handle errors in stream
if (evt.internalServerException || evt.modelStreamErrorException) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stream errors leave client hanging

When a stream error occurs (internalServerException or modelStreamErrorException), the code immediately throws without sending any notification to the WebSocket client.

Impact: The frontend has no timeout or error handler, so users are left staring at a loading/partial state indefinitely with no feedback about what went wrong.

Previous behavior: The old streaming code wrapped chunk parsing in try/catch and logged errors without throwing, allowing the stream to complete and send the complete event.

Suggested fix:
Send an error event to the client before throwing, or at minimum send a complete event to unblock the UI:

if (evt.internalServerException || evt.modelStreamErrorException) {
  const error = evt.internalServerException || evt.modelStreamErrorException
  console.log('Stream error:', error.message)
  // Notify client of error
  await ws.send(
    new PostToConnectionCommand({
      ConnectionId: connectionId,
      Data: Buffer.from(JSON.stringify({ 
        event: 'error', 
        message: error.message || 'Stream error' 
      })),
    }),
  )
  break  // Exit loop gracefully instead of throwing
}

The frontend should also be updated to handle error events.

Address two issues from PR code review:
- Add normalizeMessagesToConverseFormat() to pass multi-turn conversation
  history to the Converse API instead of dropping job.messages
- Send error events to client via WebSocket on stream failures instead
  of silently breaking, with corresponding frontend error bubble display
- Add worker Lambda tests to CI matrix alongside CLI tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mfittko
Copy link

mfittko commented Feb 18, 2026

@OmoleOO can you try setting this up and then try to deploy?

https://sofatutor.slack.com/archives/CG1LWUKMJ/p1770998319357849

@mfittko
Copy link

mfittko commented Feb 25, 2026

@OmoleOO is there anything missing here? did you try the deployment?

@OmoleOO
Copy link
Author

OmoleOO commented Feb 25, 2026

@OmoleOO is there anything missing here? did you try the deployment?

I need to check that nothing is missing, I'll try the deployment after.

OmoleOO and others added 5 commits February 25, 2026 13:12
…ocket interactions

- Introduced functions to send error and warning messages to WebSocket clients, improving user feedback during streaming operations.
- Implemented a method to convert AWS SDK errors into user-friendly messages, enhancing error reporting.
- Updated the worker Lambda to utilize these new functions for handling stream errors and knowledge base retrieval failures, ensuring clients are informed of issues without breaking the flow.

This update enhances the robustness of the WebSocket communication and provides clearer feedback to users in case of errors.
- Introduced package-lock.json to ensure consistent dependency versions for the worker Lambda.
- This file includes all dependencies and their respective versions, enhancing reproducibility and stability in the development environment.
- Included Jest version 30.2.0 in the devDependencies of package.json to facilitate unit testing for the worker Lambda.
- This addition supports the ongoing efforts to enhance test coverage and maintain code quality.
- Added unit tests to improve error handling in the worker Lambda, covering various AWS SDK error scenarios including ThrottlingException, AccessDeniedException, and ResourceNotFoundException.
- Implemented tests for user-friendly error messages and ensured that long error messages are truncated appropriately.
- Included tests for handling knowledge base retrieval failures, ensuring that warnings are sent to clients while still processing responses.
- Verified that the system prompt defaults correctly when knowledge base retrieval fails, enhancing the robustness of the error handling mechanism.
…llback

- Introduced a mechanism to determine the effective model ID, prioritizing inference profile ARNs.
- Added support for non-streaming models, allowing the system to fall back to a Converse API call when streaming is unsupported.
- Enhanced error handling for streaming failures, logging unsupported models and providing user-friendly error messages.
- Improved output streaming by sending responses in word-sized chunks for a more natural experience.

This update enhances the flexibility and robustness of the worker Lambda's interaction with the Converse API.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants