Tools can also return DocumentBlock, ImageBlock, VideoBlock #396

Open
dpruessner wants to merge 1 commit into strands-agents:main from dpruessner:main

Conversation

@dpruessner

Description

Enables tools to return DocumentBlock, ImageBlock, and VideoBlock content directly to multi-modal models.

Previously, tools could only return strings or JSON. This PR adds support for rich media blocks, allowing more efficient processing of documents, images, and videos through the Bedrock Converse API.

Key Changes:

  • Added ToolReturnValue type supporting media blocks
  • Enhanced tool() helper to accept DocumentBlock/ImageBlock/VideoBlock returns
  • Updated Bedrock formatting to handle media content natively
  • Maintained full backward compatibility
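As a rough illustration of the return-type change described above, the sketch below wraps a tool's return value into result content. The classes, the `ToolResultContent` shape, and the `toToolResultContent` helper are simplified stand-ins invented for this example, not the SDK's actual definitions.

```typescript
// Stand-in types for illustration only; the SDK's real classes carry more fields.
type JSONValue = string | number | boolean | null | JSONValue[] | { [key: string]: JSONValue }

class DocumentBlock {
  readonly type = 'documentBlock' as const
  name: string
  format: string
  bytes: Uint8Array
  constructor(name: string, format: string, bytes: Uint8Array) {
    this.name = name
    this.format = format
    this.bytes = bytes
  }
}

class ImageBlock {
  readonly type = 'imageBlock' as const
  format: string
  bytes: Uint8Array
  constructor(format: string, bytes: Uint8Array) {
    this.format = format
    this.bytes = bytes
  }
}

class VideoBlock {
  readonly type = 'videoBlock' as const
  format: string
  bytes: Uint8Array
  constructor(format: string, bytes: Uint8Array) {
    this.format = format
    this.bytes = bytes
  }
}

// Mirrors the union named in the PR: JSON-serializable values plus media blocks.
type ToolReturnValue = JSONValue | DocumentBlock | ImageBlock | VideoBlock

// Hypothetical result-content shape used only by this sketch.
type ToolResultContent =
  | { type: 'text'; text: string }
  | { type: 'json'; json: JSONValue }
  | DocumentBlock
  | ImageBlock
  | VideoBlock

function toToolResultContent(value: ToolReturnValue): ToolResultContent {
  // Media blocks pass through untouched so the model provider can format them natively.
  if (value instanceof DocumentBlock || value instanceof ImageBlock || value instanceof VideoBlock) {
    return value
  }
  // Strings and other JSON values keep their earlier handling (backward compatibility).
  if (typeof value === 'string') {
    return { type: 'text', text: value }
  }
  return { type: 'json', json: value }
}
```

Under these assumptions, a tool that returns a plain string or JSON value behaves as before, while a returned media block reaches the provider formatter unchanged.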

Related Issues

Closes #395

Documentation PR

Type of Change

New feature

Testing

How have you tested the change?

  • I ran npm run check

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@strands-agent
Collaborator

👋 Welcome @dpruessner and thanks for this contribution!

This looks like an interesting enhancement allowing tools to return richer content types (DocumentBlock, ImageBlock, VideoBlock). This would enable more sophisticated tool outputs beyond simple text.

Hoping maintainers can take a look when they have a chance! 👀


🤖 This comment was generated by an AI agent using strands-agents. Workflow Run: 20944495454

  expect(result.type).toBe('documentBlock')
  expect(result.name).toBe('RESULT')
  expect(result.format).toBe('md')
})

Issue: Test coverage is incomplete for media block returns.

Only DocumentBlock return is tested. Consider adding tests for ImageBlock and VideoBlock returns to ensure consistent behavior across all media types.

Suggestion: Add similar tests for ImageBlock and VideoBlock:

it('handles ImageBlock return', async () => {
  const { ImageBlock } = await import('../../types/media.js')

  const imgTool = tool({
    name: 'create_image',
    description: 'Creates an image',
    inputSchema: z.object({ data: z.string() }),
    callback: (input) => {
      return new ImageBlock({
        format: 'png',
        source: { bytes: new TextEncoder().encode(input.data) },
      })
    },
  })

  const result = await imgTool.invoke({ data: 'test' })
  expect(result.type).toBe('imageBlock')
  expect(result.format).toBe('png')
})

@github-actions

Review Summary

Assessment: Comment (Request minor changes before approval)

Key Themes

Strengths:

  • Clean implementation that extends the existing type system naturally
  • Proper handling in Bedrock model with _formatDocumentSource and _formatMediaSource
  • Good backward compatibility maintained
  • TSDoc documentation is present

Areas Needing Attention:

  1. OpenAI Compatibility: The OpenAI model now silently ignores media blocks in tool results, returning empty strings. This could lead to unexpected behavior when users switch between providers.

  2. Test Coverage: Only DocumentBlock return type is tested. Adding tests for ImageBlock and VideoBlock would improve confidence in the implementation.

Overall

This is a valuable enhancement that enables richer tool outputs. The core implementation in Bedrock and the type system changes look solid. The main suggestions focus on cross-provider compatibility and test coverage to ensure a robust feature.

Nice work on this contribution! 🎉

@github-actions

This comment was marked as off-topic.

dbschmigelski pushed a commit to dbschmigelski/sdk-typescript that referenced this pull request Feb 5, 2026
Move tests to `tests_integ` as `tests-integ` is not a proper module name.  Also extract all provider ignoring to a new providers file which centralizes the environment variables needed.
@Unshure self-assigned this Feb 10, 2026
Member

@Unshure left a comment

Can you also add support for the other model providers as part of this PR?

Comment on lines +586 to +587
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return { text: `[Unsupported content type: ${(content as any).type}]` }
Member

nit: Let's just remove the default case so we implicitly skip any unsupported content type. Alternatively, we can warn and skip to be explicit.
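One possible shape for the "warn and skip" option is sketched below. The `ContentItem` and `ProviderContent` types, the `formatContent` name, and the injected `warn` callback are all assumptions made for this example; the actual formatter in the PR is not shown in this thread.

```typescript
// Hypothetical content shapes; only text and JSON are supported by this imagined provider.
type ContentItem =
  | { type: 'textBlock'; text: string }
  | { type: 'jsonBlock'; json: unknown }
  | { type: 'imageBlock' }
  | { type: 'videoBlock' }

type ProviderContent = { text: string } | { json: unknown }

function formatContent(
  items: readonly ContentItem[],
  warn: (message: string) => void
): ProviderContent[] {
  const formatted: ProviderContent[] = []
  for (const item of items) {
    switch (item.type) {
      case 'textBlock':
        formatted.push({ text: item.text })
        break
      case 'jsonBlock':
        formatted.push({ json: item.json })
        break
      default:
        // Unsupported here: report it and skip, instead of emitting placeholder text.
        warn(`Skipping unsupported tool result content type: ${item.type}`)
    }
  }
  return formatted
}
```

Because the remaining union members are handled in `default`, `item.type` stays typed and the `any` cast (with its eslint-disable comment) is no longer needed in this sketch.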

private _wrapInToolResult(value: unknown, toolUseId: string): ToolResultBlock {
  try {
    // Handle DocumentBlock, ImageBlock, VideoBlock directly
    if (value instanceof DocumentBlock || value instanceof ImageBlock || value instanceof VideoBlock) {
Member

Can we just do an instance check against ToolResultContent instead? That way TextBlock and JsonBlock are also valid here
https://github.com/strands-agents/sdk-typescript/blob/main/src/types/messages.ts#L195

Suggested change
- if (value instanceof DocumentBlock || value instanceof ImageBlock || value instanceof VideoBlock) {
+ if (value instanceof ToolResultContent) {
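A caveat worth checking against the linked definition: if `ToolResultContent` is a type alias for a union rather than a class, it is erased at compile time and cannot appear on the right-hand side of `instanceof`. In that case a type guard over the member classes would give the same effect. The sketch below assumes a union of three stand-in classes; the class bodies, the class list, and the `isToolResultContent` name are hypothetical.

```typescript
// Stand-in classes; the real SDK classes are assumed to be richer than this.
class TextBlock {
  readonly type = 'textBlock' as const
  text: string
  constructor(text: string) {
    this.text = text
  }
}

class JsonBlock {
  readonly type = 'jsonBlock' as const
  json: unknown
  constructor(json: unknown) {
    this.json = json
  }
}

class ImageBlock {
  readonly type = 'imageBlock' as const
  format: string
  constructor(format: string) {
    this.format = format
  }
}

// Assumed here to be a union type alias, which has no runtime representation.
type ToolResultContent = TextBlock | JsonBlock | ImageBlock

// Runtime list of the classes that make up the union.
const TOOL_RESULT_CONTENT_CLASSES = [TextBlock, JsonBlock, ImageBlock] as const

// Type guard: true only for actual instances of one of the listed classes.
function isToolResultContent(value: unknown): value is ToolResultContent {
  return TOOL_RESULT_CONTENT_CLASSES.some((cls) => value instanceof cls)
}
```

With a guard like this, the call site could read `if (isToolResultContent(value))`, and `TextBlock` and `JsonBlock` would be accepted alongside the media blocks.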

* Valid return types for tool callbacks.
* Includes JSON-serializable values and media blocks.
*/
export type ToolReturnValue = JSONValue | DocumentBlock | ImageBlock | VideoBlock
Member

Suggested change
- export type ToolReturnValue = JSONValue | DocumentBlock | ImageBlock | VideoBlock
+ export type ToolReturnValue = JSONValue | ToolResultContent

* Tool result content block.
*/
export class ToolResultBlock implements ToolResultBlockData {
export class ToolResultBlock {
Member

nit: Why does this no longer implement ToolResultBlockData?

expect(result).toBe(3)
})

it('handles DocumentBlock return', async () => {
Member

Can you add a similar test to the agent.test.ts file to add a tool which returns a document, and create a mock model to invoke that tool, then ensure the loop completes successfully?

https://github.com/strands-agents/sdk-typescript/blob/main/src/agent/__tests__/agent.test.ts#L78

Member

@mehtarac left a comment

We now have additional model providers in the repo as well, such as openai, anthropic, and gemini. It would be beneficial to add an implementation for all model providers.

private _wrapInToolResult(value: unknown, toolUseId: string): ToolResultBlock {
  try {
    // Handle DocumentBlock, ImageBlock, VideoBlock directly
    if (value instanceof DocumentBlock || value instanceof ImageBlock || value instanceof VideoBlock) {
Member

The instanceof check happens before the null check, but media blocks could theoretically be null in edge cases.

  • Suggestion: Keep the null check first (line 231) before the media block check
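For context on the ordering question: in JavaScript, `instanceof` evaluates to `false` for `null` and `undefined` without throwing, so either order should give the same result, and putting the null check first is mainly a readability choice. The sketch below demonstrates this with a stand-in `ImageBlock` class and a hypothetical `classify` helper, with the `instanceof` check deliberately placed first.

```typescript
// Stand-in class used only to demonstrate the check ordering.
class ImageBlock {
  readonly type = 'imageBlock' as const
}

function classify(value: unknown): 'media' | 'empty' | 'other' {
  // instanceof on null or undefined is simply false, so this is safe to run first.
  if (value instanceof ImageBlock) {
    return 'media'
  }
  if (value === null || value === undefined) {
    return 'empty'
  }
  return 'other'
}
```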

} else if ('image' in contentItem) {
  return new ImageBlock(contentItem.image as ImageBlockData)
} else if ('video' in contentItem) {
  return new VideoBlock(contentItem.video as VideoBlockData)
Member

I think we can remove the type assertions and let TypeScript infer the types to ensure the data matches the expected shape

expect(result).toBe(3)
})

it('handles DocumentBlock return', async () => {
Member

The test only verifies the structure; it doesn't validate that the actual bytes were preserved. Consider adding an assertion that decodes the bytes and verifies they match the input:

const decoded = new TextDecoder().decode(result.source.bytes)
expect(decoded).toBe('Hello World!')

@mehtarac self-assigned this Feb 12, 2026
@Unshure removed their assignment Feb 20, 2026
Development

Successfully merging this pull request may close these issues.

[FEATURE] Tools can return DocumentBlock, ImageBlock, VideoBlock

4 participants