feat(models): add guardrails configuration and redaction support to Bedrock provider #573
…edrock provider

- Add GuardrailConfig interface with guardrailIdentifier, guardrailVersion, trace, streamProcessingMode, and redaction options
- Add guardrailConfig to BedrockModelConfig for Bedrock model configuration
- Update _formatRequest to include guardrailConfig in Bedrock API requests
- Add ModelRedactContentEvent and ModelRedactContentEventData to streaming.ts
- Add _hasBlockedGuardrail method for recursive detection of blocked policies
- Add _generateRedactionEvents method for creating redaction events based on config
- Update metadata event handling to emit redaction events when guardrails block content
- Support both streaming and non-streaming modes
- Add comprehensive tests for guardrail configuration, request formatting, blocked detection, and redaction events

Resolves #564
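To make the configuration and the recursive blocked-policy check above concrete, here is a minimal sketch. It is not the PR's code: the first four `GuardrailConfig` fields are named in the commit message, but the `redaction` field names are hypothetical, and the check assumes a blocked policy appears in the trace as an object with `action: 'BLOCKED'` and `detected: true`.

```typescript
// Sketch only. The first four field names come from the commit message;
// the shape of `redaction` is an assumption made for illustration.
interface GuardrailConfig {
  guardrailIdentifier: string
  guardrailVersion: string
  trace?: 'enabled' | 'disabled' | 'enabled_full'
  streamProcessingMode?: 'sync' | 'async'
  redaction?: {
    input?: boolean
    inputMessage?: string
    output?: boolean
    outputMessage?: string
  }
}

// Walks an arbitrary trace payload and reports whether any nested object
// marks a policy as both detected and blocked (assumed trace convention).
function hasBlockedGuardrail(value: unknown): boolean {
  if (Array.isArray(value)) {
    return value.some((item) => hasBlockedGuardrail(item))
  }
  if (value !== null && typeof value === 'object') {
    const record = value as Record<string, unknown>
    if (record['action'] === 'BLOCKED' && record['detected'] === true) {
      return true
    }
    return Object.values(record).some((child) => hasBlockedGuardrail(child))
  }
  return false
}

// Illustrative values; the identifier and trace contents are made up.
const exampleConfig: GuardrailConfig = {
  guardrailIdentifier: 'example-guardrail-id',
  guardrailVersion: '1',
  trace: 'enabled',
  redaction: { input: true, inputMessage: '[User input redacted.]' },
}

const exampleTrace = {
  guardrail: {
    inputAssessment: {
      'example-guardrail-id': {
        contentPolicy: {
          filters: [{ type: 'VIOLENCE', action: 'BLOCKED', detected: true }],
        },
      },
    },
  },
}

const exampleIsBlocked = hasBlockedGuardrail(exampleTrace)
```

Checking every nested value, rather than returning on the first child, means a blocked filter is still found when it is not the first entry in an array or object.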
Assessment: Request Changes
This PR adds a well-structured guardrails configuration API for Bedrock with good test coverage and documentation. The implementation follows existing patterns and handles both streaming and non-streaming modes correctly.
Review Summary
Nice work on the comprehensive test coverage! 🎉
/strands how does this pull request update the
- Update _convert_to_class_event to handle modelRedactContentEvent
- Add redaction handling in streamAggregated to modify messages array
- When redactUserContentMessage is present, redact the last user message
- When redactAssistantContentMessage is present, replace assistant content blocks
- Add comprehensive tests for message redaction in model.test.ts

This completes the guardrails Phase 1 implementation by ensuring that when guardrails block content, the messages array is updated with the redacted content so consumers have access to the sanitized messages.
I've implemented the message redaction support in Changes Made
/strands I'm liking this better, but can you update the AggregatedStreamResult to include the redact content, and update the agent class to overwrite the latest message with this redacted content if it's present?
- Model now only returns redactContent info without modifying messages
- Agent handles both user and assistant message redaction
- Simplifies model layer by keeping all message manipulation in agent
src/agent/agent.ts
Outdated
- const { message, stopReason } = yield* this._streamFromModel(this.messages, streamOptions)
+ const { message, stopReason, redactContent } = yield* this._streamFromModel(this.messages, streamOptions)

  const afterModelCallEvent = new AfterModelCallEvent({ agent: this, stopData: { message, stopReason } })
Can you add redactContent to AfterModelCallEvent?
Done - added redactionMessage to ModelStopData interface and updated AfterModelCallEvent to include it in stopData.
src/agent/agent.ts
Outdated
  // result.done is true, result.value contains the return value
- return result.value
+ return {
nit: Can this still just be result.value?
Done - simplified to return result.value.
/**
 * Redacts the last user message in the conversation history.
 * Called when guardrails block user input and redaction is enabled.
 *
 * @param redactMessage - The redaction message to replace the user content with
 */
private _redactLastUserMessage(redactMessage: string): void {
  // Find and redact the last user message
  for (let i = this.messages.length - 1; i >= 0; i--) {
    const message = this.messages[i]
    if (message && message.role === 'user') {
      this.messages[i] = new Message({
        role: 'user',
        content: [new TextBlock(redactMessage)],
      })
      break
    }
  }
}
Can you update this so that it just redacts the last message, not the last user message?
Done - renamed to _redactLastMessage and simplified to redact the last message regardless of role.
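For comparison with the earlier `_redactLastUserMessage`, a simplified helper along the lines requested might look like the sketch below. This is an illustration under assumptions, not the merged code: `Message` and `TextBlock` here are minimal stand-ins for the SDK classes, and the helper is written as a free function taking the messages array rather than as a private agent method.

```typescript
// Minimal stand-ins for the SDK's Message and TextBlock (assumed shapes).
type Role = 'user' | 'assistant'

class TextBlock {
  readonly type = 'textBlock'
  readonly text: string

  constructor(text: string) {
    this.text = text
  }
}

class Message {
  readonly role: Role
  readonly content: TextBlock[]

  constructor(data: { role: Role; content: TextBlock[] }) {
    this.role = data.role
    this.content = data.content
  }
}

// Replaces the content of the last message, whatever its role, with the
// redaction text. The original role is kept. An empty history is left as-is.
function redactLastMessage(messages: Message[], redactMessage: string): void {
  const lastIndex = messages.length - 1
  const last = messages[lastIndex]
  if (last === undefined) {
    return
  }
  messages[lastIndex] = new Message({
    role: last.role,
    content: [new TextBlock(redactMessage)],
  })
}

// Usage: only the final entry of the history is rewritten.
const history: Message[] = [
  new Message({ role: 'assistant', content: [new TextBlock('How can I help?')] }),
  new Message({ role: 'user', content: [new TextBlock('some blocked input')] }),
]
redactLastMessage(history, '[User input redacted.]')
```

Dropping the backwards search removes the loop entirely, at the cost of assuming the blocked input is always the most recent message when the helper runs.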
src/models/model.ts
Outdated
redactUserContentMessage?: string

/**
 * Message to replace assistant output with when output was blocked by guardrails.
 */
redactAssistantContentMessage?: string
This can just be redactionMessage. We will only ever return this value if the redactUserContentMessage is set. If redactAssistantContentMessage event is returned from the underlying model, then we just update the message of the StreamAggregatedResult so there is no need to redact.
Done - simplified to just redactionMessage. Assistant redaction is now handled directly in the model by updating the message content.
/strands
- Add redactionMessage to ModelStopData and AfterModelCallEvent
- Simplify _streamFromModel to return result.value directly
- Rename _redactLastUserMessage to _redactLastMessage (redacts any last message)
- Simplify StreamAggregatedResult to use single redactionMessage field
- Model now handles assistant redaction directly, returns redactionMessage for user redaction
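The split of responsibilities in this commit can be sketched roughly as follows. This is a simplified, synchronous illustration under assumptions: the event union, the flattened `message` shape, and the `aggregate` function are hypothetical stand-ins for the SDK's streaming events and its `streamAggregated`, and the stop reason string is illustrative. Only `redactUserContentMessage`, `redactAssistantContentMessage`, and `redactionMessage` are names taken from the thread.

```typescript
// Hypothetical, reduced event union; the real SDK events carry more data.
type StreamEvent =
  | { type: 'textDelta'; text: string }
  | {
      type: 'modelRedactContentEvent'
      redactUserContentMessage?: string
      redactAssistantContentMessage?: string
    }
  | { type: 'messageStop'; stopReason: string }

// Assumed result shape with the single optional redactionMessage field.
interface StreamAggregatedResult {
  message: { role: 'assistant'; text: string }
  stopReason: string
  redactionMessage?: string
}

function aggregate(events: Iterable<StreamEvent>): StreamAggregatedResult {
  let text = ''
  let stopReason = 'endTurn'
  let assistantRedaction: string | undefined
  let userRedaction: string | undefined

  for (const event of events) {
    switch (event.type) {
      case 'textDelta':
        text += event.text
        break
      case 'modelRedactContentEvent':
        // Assistant redaction is applied by the model layer itself.
        if (event.redactAssistantContentMessage !== undefined) {
          assistantRedaction = event.redactAssistantContentMessage
        }
        // User redaction is only reported; the agent rewrites its history.
        if (event.redactUserContentMessage !== undefined) {
          userRedaction = event.redactUserContentMessage
        }
        break
      case 'messageStop':
        stopReason = event.stopReason
        break
    }
  }

  const result: StreamAggregatedResult = {
    message: { role: 'assistant', text: assistantRedaction ?? text },
    stopReason,
  }
  if (userRedaction !== undefined) {
    result.redactionMessage = userRedaction
  }
  return result
}

// Usage: blocked user input surfaces as redactionMessage on the result.
const blockedInputResult = aggregate([
  { type: 'textDelta', text: 'partial output' },
  { type: 'modelRedactContentEvent', redactUserContentMessage: '[User input redacted.]' },
  { type: 'messageStop', stopReason: 'guardrailIntervened' },
])
```

An agent consuming this result would then check `redactionMessage` and, when it is set, overwrite its latest message with that text.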
Summary
This PR implements proper handling of modelRedactContentEvent in the Agent SDK, enabling guardrail-based content redaction for both user input and assistant output.

Why This Change Is Needed
When Bedrock Guardrails detect content that should be blocked (either inappropriate user input or harmful assistant output), they emit modelRedactContentEvent events with redaction messages. The SDK needs to:

Public API Changes
StreamAggregatedResult Interface (model.ts)
The result returned by streamAggregated now includes an optional redactionMessage:

ModelStopData Interface (events.ts)
The AfterModelCallEvent now includes redaction info:

Architecture
Separation of Concerns
Model Layer (streamAggregated):
- modelRedactContentEvent events
- redactionMessage for agent to handle user input redaction

Agent Layer (_stream):
- redactionMessage from model via invokeModel
- _redactLastMessage helper
- redactionMessage via AfterModelCallEvent.stopData

Implementation Details
- _redactLastMessage: Redacts the last message in the conversation history (preserves role)
- redactionMessage

Testing
- redactionMessage is returned correctly for user input redaction

Resolves #564