feat: add responseModalities support for Gemini image generation #41
Summary
Add support for the `responseModalities` parameter in the Google and VertexAI LLM classes so that native Gemini image generation models (`gemini-2.5-flash-image`, `gemini-3-pro-image-preview`, etc.) can return images alongside text.
Changes
- Add `responseModalities?: ('TEXT' | 'IMAGE' | 'AUDIO')[]` to the `GoogleClientOptions` and `VertexAIClientOptions` types
- Pass `responseModalities` to `generationConfig` in the `CustomChatGoogleGenerativeAI` constructor
- Handle `inlineData` (image) parts in response processing (`convertResponseContentToChatGenerationChunk` and `mapGenerateContentResultToChatResult`), converting them to `image_url` content blocks with base64 data URLs
- Add `responseModalities` to the generation config in VertexAI `CustomChatConnection.formatData`
Usage
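A minimal sketch of how a caller might set the option and what the converted response content could look like. The `ImageGenOptions` type, the `toContentBlock` helper, and the sample parts are hypothetical stand-ins for illustration, not the classes touched by this PR. Only the `responseModalities` values, the `inlineData` part shape, and the `image_url` block with a base64 data URL come from the change list.

```typescript
// Hypothetical, self-contained types mirroring the shapes described in this PR.
type ResponseModality = 'TEXT' | 'IMAGE' | 'AUDIO';

interface ImageGenOptions {
  model: string;
  responseModalities?: ResponseModality[];
}

// Gemini returns images as inlineData parts: base64 payload plus a MIME type.
interface InlineDataPart {
  inlineData: { mimeType: string; data: string };
}
interface TextPart {
  text: string;
}
type Part = InlineDataPart | TextPart;

type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

// Convert one response part into a content block.
// Image parts become image_url blocks carrying a base64 data URL.
function toContentBlock(part: Part): ContentBlock {
  if ('inlineData' in part) {
    const { mimeType, data } = part.inlineData;
    return {
      type: 'image_url',
      image_url: { url: `data:${mimeType};base64,${data}` },
    };
  }
  return { type: 'text', text: part.text };
}

// Request both text and image output (assumed option shape).
const options: ImageGenOptions = {
  model: 'gemini-2.5-flash-image',
  responseModalities: ['TEXT', 'IMAGE'],
};

// Sample parts standing in for a real model response; the base64 string is a placeholder.
const sampleParts: Part[] = [
  { text: 'Here is your image.' },
  { inlineData: { mimeType: 'image/png', data: 'iVBORw0KGgo=' } },
];

const blocks: ContentBlock[] = sampleParts.map(toContentBlock);
```

Keeping the MIME type from `inlineData` in the data URL means the same conversion covers both PNG and JPEG outputs without a per-model branch.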
Tested Models
- `gemini-2.5-flash-image` - Returns text + PNG image
- `gemini-3-pro-image-preview` - Returns JPEG image
Related