Tools can also return DocumentBlock, ImageBlock, VideoBlock #396
Tools can also return DocumentBlock, ImageBlock, VideoBlock #396dpruessner wants to merge 1 commit intostrands-agents:mainfrom
Conversation
… string and JSONValue
|
👋 Welcome @dpruessner and thanks for this contribution! This looks like an interesting enhancement allowing tools to return richer content types (DocumentBlock, ImageBlock, VideoBlock). This would enable more sophisticated tool outputs beyond simple text. Hoping maintainers can take a look when they have a chance! 👀 🤖 This comment was generated by an AI agent using strands-agents. Workflow Run: 20944495454 |
| expect(result.type).toBe('documentBlock') | ||
| expect(result.name).toBe('RESULT') | ||
| expect(result.format).toBe('md') | ||
| }) |
There was a problem hiding this comment.
Issue: Test coverage is incomplete for media block returns.
Only DocumentBlock return is tested. Consider adding tests for ImageBlock and VideoBlock returns to ensure consistent behavior across all media types.
Suggestion: Add similar tests for ImageBlock and VideoBlock:
it('handles ImageBlock return', async () => {
const { ImageBlock } = await import('../../types/media.js')
const imgTool = tool({
name: 'create_image',
description: 'Creates an image',
inputSchema: z.object({ data: z.string() }),
callback: (input) => {
return new ImageBlock({
format: 'png',
source: { bytes: new TextEncoder().encode(input.data) },
})
},
})
const result = await imgTool.invoke({ data: 'test' })
expect(result.type).toBe('imageBlock')
expect(result.format).toBe('png')
})
Review SummaryAssessment: Comment (Request minor changes before approval) Key ThemesStrengths:
Areas Needing Attention:
OverallThis is a valuable enhancement that enables richer tool outputs. The core implementation in Bedrock and the type system changes look solid. The main suggestions focus on cross-provider compatibility and test coverage to ensure a robust feature. Nice work on this contribution! 🎉 |
This comment was marked as off-topic.
This comment was marked as off-topic.
Move tests to `tests_integ` as `tests-integ` is not a proper module name. Also extract all provider ignoring to a new providers file which centralizes the environment variables needed.
Unshure
left a comment
There was a problem hiding this comment.
Can you also add support for the other model providers at a part of this pr?
| // eslint-disable-next-line @typescript-eslint/no-explicit-any | ||
| return { text: `[Unsupported content type: ${(content as any).type}]` } |
There was a problem hiding this comment.
nit: Lets just remove the default case so we implicitly skip if an unsupported content type is provided. Or we can warn and skip to be explicit.
| private _wrapInToolResult(value: unknown, toolUseId: string): ToolResultBlock { | ||
| try { | ||
| // Handle DocumentBlock, ImageBlock, VideoBlock directly | ||
| if (value instanceof DocumentBlock || value instanceof ImageBlock || value instanceof VideoBlock) { |
There was a problem hiding this comment.
Can we just do an instance check against ToolResultContent instead? That way TextBlock and JsonBlock are also valid here
https://github.com/strands-agents/sdk-typescript/blob/main/src/types/messages.ts#L195
| if (value instanceof DocumentBlock || value instanceof ImageBlock || value instanceof VideoBlock) { | |
| if (value instanceof ToolResultContent) { |
| * Valid return types for tool callbacks. | ||
| * Includes JSON-serializable values and media blocks. | ||
| */ | ||
| export type ToolReturnValue = JSONValue | DocumentBlock | ImageBlock | VideoBlock |
There was a problem hiding this comment.
| export type ToolReturnValue = JSONValue | DocumentBlock | ImageBlock | VideoBlock | |
| export type ToolReturnValue = JSONValue | ToolResultContent |
| * Tool result content block. | ||
| */ | ||
| export class ToolResultBlock implements ToolResultBlockData { | ||
| export class ToolResultBlock { |
There was a problem hiding this comment.
nit: Why no longer implement ToolResultBlockData?
| expect(result).toBe(3) | ||
| }) | ||
|
|
||
| it('handles DocumentBlock return', async () => { |
There was a problem hiding this comment.
Can you add a similar test to the agent.test.ts file to add a tool which returns a document, and create a mock model to invoke that tool, then ensure the loop completes successfully?
https://github.com/strands-agents/sdk-typescript/blob/main/src/agent/__tests__/agent.test.ts#L78
mehtarac
left a comment
There was a problem hiding this comment.
We have additional model providers in the repo now as well such as openai, anthropic, and gemini. Will be beneficial to add implemenatation for all model providers.
| private _wrapInToolResult(value: unknown, toolUseId: string): ToolResultBlock { | ||
| try { | ||
| // Handle DocumentBlock, ImageBlock, VideoBlock directly | ||
| if (value instanceof DocumentBlock || value instanceof ImageBlock || value instanceof VideoBlock) { |
There was a problem hiding this comment.
The instanceof check happens before the null check, but media blocks could theoretically be null in edge cases.
- Suggestion: Keep the null check first (line 231) before the media block check
| } else if ('image' in contentItem) { | ||
| return new ImageBlock(contentItem.image as ImageBlockData) | ||
| } else if ('video' in contentItem) { | ||
| return new VideoBlock(contentItem.video as VideoBlockData) |
There was a problem hiding this comment.
I think we can remove the type assertions and let TypeScript infer the types to ensure the data matches the expected shape
| expect(result).toBe(3) | ||
| }) | ||
|
|
||
| it('handles DocumentBlock return', async () => { |
There was a problem hiding this comment.
Test only verifies the structure but doesn't validate the actual bytes content was preserved. Consider adding assertion to decode and verify the bytes match the input:
const decoded = new TextDecoder().decode(result.source.bytes)
expect(decoded).toBe('Hello World!')
Description
Enables tools to return
DocumentBlock,ImageBlock, andVideoBlockcontent directly to multi-modal models.Previously, tools could only return strings or JSON. This PR adds support for rich media blocks, allowing more efficient processing of documents, images, and videos through the Bedrock Converse API.
Key Changes:
ToolReturnValuetype supporting media blockstool()helper to accept DocumentBlock/ImageBlock/VideoBlock returnsRelated Issues
Closes #395
Documentation PR
Type of Change
New feature
Testing
How have you tested the change?
npm run checkChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.