feat(deepagents): support multimodal files for backends #298
Colin Francis (colifran) wants to merge 61 commits into main
🦋 Changeset detected. Latest commit: 2930258. The changes in this PR will be included in the next version bump. This PR includes changesets to release 6 packages.
Christian Bromann (christian-bromann)
left a comment
One nit, optional.
Great work 👍
Hunter Lovell (hntrl)
left a comment
on the mime type thing:
- I don't think the filename heuristic to determine what content block gets made is our best approach. E.g. in S3 I know I can attach a content-type header to a file but have an ambiguous file name
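To make the concern concrete, here is a minimal sketch of a resolver that prefers a MIME type declared by the backend (for example an S3 content-type header) and only falls back to the file extension. `resolveMimeType` and `EXTENSION_MIME` are hypothetical names for illustration and are not part of this PR.

```typescript
// Hypothetical fallback table; a real implementation would cover more types.
const EXTENSION_MIME: Record<string, string> = {
  png: "image/png",
  pdf: "application/pdf",
  mp3: "audio/mpeg",
};

// Hypothetical helper: prefer an explicitly declared MIME type, and fall back
// to the extension heuristic only when the backend supplies nothing.
function resolveMimeType(path: string, declared?: string): string {
  if (declared) {
    // Drop parameters such as "; charset=..." and normalize case.
    return declared.split(";")[0].trim().toLowerCase();
  }
  const dot = path.lastIndexOf(".");
  const ext = dot >= 0 ? path.slice(dot + 1).toLowerCase() : "";
  return EXTENSION_MIME[ext] ?? "application/octet-stream";
}
```

With this shape, an object stored under an ambiguous key such as `uploads/abc123` still resolves correctly as long as the backend passes its declared content type through.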
/** File content as a string (text or base64-encoded binary), undefined on failure */
content?: string;
same thing w/ mimetype
when would this be base64 encoded? can we save the encoding time if we just always had a Uint8Array? For reference this is how we type in https://github.com/langchain-ai/langchainjs/blob/c3fcea5f288ac5264b11731d271385e9a2cee537/libs/langchain-core/src/messages/content/multimodal.ts#L51
I looked into this with Claude so please confirm but I think we would run into an issue with state and store backends. It sounds like the JsonPlusSerializer only handles Uint8Array at the top level (["bytes", obj]) but when it's nested inside the files record it goes through JSON.stringify and comes back as a plain object with numeric keys ({"0": 137, "1": 80, ...}). The replacer/reviver has cases for Set, Map, RegExp, etc. but not Uint8Array. So we would run into issues with data corruption.
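The nested-serialization behavior described above can be reproduced with plain `JSON.stringify`, independent of any serializer. The sketch below only demonstrates the standard JavaScript behavior and a base64 round trip; it does not exercise JsonPlusSerializer itself, and `FilesRecord`, `toBase64`, and `fromBase64` are hypothetical names, not the PR's types or helpers.

```typescript
// Hypothetical shape of a files record, for illustration only.
type FilesRecord = Record<string, { content: unknown }>;

const png = new Uint8Array([137, 80, 78, 71]); // first four bytes of a PNG header

// 1. A Uint8Array nested in a record does not survive a plain JSON round trip:
//    it comes back as an object with numeric string keys.
const nested: FilesRecord = { "/img.png": { content: png } };
const roundTripped = JSON.parse(JSON.stringify(nested)) as FilesRecord;
const corrupted = roundTripped["/img.png"].content;
const isStillBytes = corrupted instanceof Uint8Array;

// 2. A base64 string is plain JSON data, so it round-trips unchanged.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

function fromBase64(b64: string): Uint8Array {
  const binary = atob(b64);
  const out = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) out[i] = binary.charCodeAt(i);
  return out;
}

const safe = JSON.parse(JSON.stringify({ content: toBase64(png) })) as { content: string };
const restored = fromBase64(safe.content);

console.log(isStillBytes, JSON.stringify(corrupted), Array.from(restored));
```

The trade-off is the one raised in the review: base64 costs encoding time and roughly a third more space, but the string form stays intact wherever the value passes through JSON.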
Description
Adds support for binary and multimodal files (images, PDFs, audio, video, etc.) across all backend implementations, with a versioned protocol layer that returns structured results instead of plain values or arrays.
Changes
Protocol
Binary File Support
Middleware
Provider Updates
Backward Compatibility
Tests
Example: