@usnavy13
Contributor

@usnavy13 usnavy13 commented Nov 25, 2025

Summary

This PR adds a comprehensive Gemini Image Generation tool for LibreChat Agents with flexible authentication supporting both Gemini API and Google Cloud Vertex AI.

Key Features:

  • Multi-Authentication Support:
    • User-provided API keys (via GUI)
    • Admin-configured API keys (GEMINI_API_KEY or GOOGLE_KEY)
    • Vertex AI service accounts (automatic fallback)
  • Image Generation Capabilities:
    • Text-to-image generation with Gemini Nano Banana (gemini-2.5-flash-image) and Nano Banana Pro (gemini-3-pro-image)
    • Image context-aware generation (use existing images as inspiration/reference)
  • Storage Compatibility: Works with all LibreChat file storage strategies (local, S3, Azure, Firebase)
  • Consistent Implementation: Uses the same loadServiceKey pattern as the main Google endpoint and Anthropic Vertex AI
  • User-Friendly: Clear safety filter messages for content policy violations

Why a Tool-Based Approach?

Gemini's image generation models (gemini-2.5-flash-image, gemini-3-pro-image-preview, etc.) require special API parameters to return images:

responseModalities: ['TEXT', 'IMAGE']

Architectural Constraint

LibreChat's current chat architecture routes all endpoint requests (including Google) through the unified agents system (/api/agents/chat/:endpoint), which uses the @librechat/agents package with LangChain.

The limitation:

  • The @librechat/agents package's CustomChatGoogleGenerativeAI class creates the Google client with a hardcoded generationConfig that doesn't include responseModalities
  • LangChain's @langchain/google-genai package doesn't currently expose this parameter
  • Without responseModalities: ['TEXT', 'IMAGE'], Gemini returns text descriptions instead of actual images

Why native model selection won't work (yet):

  • Even with image models available in the model dropdown, the agents system can't request image output
  • The API call succeeds but returns only text, resulting in empty responses

The Tool Approach Works

This tool uses the @google/genai SDK directly (not LangChain), which:

  • ✅ Supports responseModalities: ['TEXT', 'IMAGE']
  • ✅ Returns inline image data that can be saved and displayed
  • ✅ Integrates seamlessly with the agents tool system
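
A minimal sketch of what such a direct call might look like, assuming the request and response shapes used by the @google/genai SDK's generateContent (a config.responseModalities array on the request, inlineData parts on the response). The helper names buildImageRequest and extractInlineImages are hypothetical and are not taken from this PR; the SDK call itself is left as a comment so the snippet runs without the package or network access.

```javascript
// Hypothetical helpers; names are illustrative, not from GeminiImageGen.js.

/** Build a generateContent request that asks for both text and image output. */
function buildImageRequest(model, prompt) {
  return {
    model,
    contents: [{ role: 'user', parts: [{ text: prompt }] }],
    config: { responseModalities: ['TEXT', 'IMAGE'] },
  };
}

/** Collect inline image parts (base64 data plus MIME type) from a response. */
function extractInlineImages(response) {
  const images = [];
  for (const candidate of response?.candidates ?? []) {
    for (const part of candidate?.content?.parts ?? []) {
      if (part.inlineData?.data) {
        images.push({
          mimeType: part.inlineData.mimeType ?? 'application/octet-stream',
          base64: part.inlineData.data,
        });
      }
    }
  }
  return images;
}

// With the SDK installed, the request would be sent roughly like this (assumed usage):
//   const { GoogleGenAI } = require('@google/genai');
//   const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
//   const response = await ai.models.generateContent(buildImageRequest(model, prompt));

// Stand-in response with one text part and one image part.
const sampleResponse = {
  candidates: [
    {
      content: {
        parts: [
          { text: 'Here you go' },
          { inlineData: { mimeType: 'image/png', data: 'aGVsbG8=' } },
        ],
      },
    },
  ],
};
const images = extractInlineImages(sampleResponse);
console.log(images.length, images[0].mimeType);
```

Text parts are skipped on purpose here; a real tool would likely also surface them, for example to relay a safety-filter message.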

Future Native Support

Native support for Gemini image models as selectable endpoints would require:

  1. Updates to @librechat/agents to support responseModalities in CustomChatGoogleGenerativeAI
  2. Handling of image content parts in the response pipeline

A separate feature request has been opened for the @librechat/agents package.
danny-avila/agents#41


Configuration Options

# Option A: Vertex AI with Service Account (recommended for production)
GEMINI_VERTEX_ENABLED=true
GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json
GOOGLE_LOC=us-central1  # or GOOGLE_CLOUD_LOCATION=global

# Option B: Dedicated Gemini API Key
GEMINI_API_KEY=your-gemini-api-key

# Option C: Shared Google API Key (uses same key as Google chat)
GOOGLE_KEY=your-google-api-key

# Optional: Change model (default: gemini-2.5-flash-image - Nano Banana)
GEMINI_IMAGE_MODEL=gemini-3-pro-image  # Nano Banana Pro

Authentication Priority:

  1. User-provided API key (via GUI when adding tool)
  2. GEMINI_API_KEY env var (admin-configured)
  3. GOOGLE_KEY env var (shared with Google chat endpoint)
  4. Vertex AI service account (automatic fallback)

This allows:

  • ✅ Vertex AI users: Tool works immediately without API keys
  • ✅ API key users: Admin sets global key OR users provide their own
  • ✅ Mixed environments: Vertex AI with optional user override
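
The priority order above can be sketched as a small resolver. This is an illustration only: the function name resolveGeminiAuth and the shape of its return value are assumptions, and it is not the code merged in this PR (a later review comment in this thread notes that auth values are already resolved by the tool loader).

```javascript
// Hypothetical resolver mirroring the documented priority order.
function resolveGeminiAuth({ userKey, env = process.env } = {}) {
  // 1. A key the user typed into the GUI always wins.
  if (userKey) {
    return { type: 'apiKey', source: 'user', apiKey: userKey };
  }
  // 2. Admin-configured dedicated Gemini key.
  if (env.GEMINI_API_KEY) {
    return { type: 'apiKey', source: 'GEMINI_API_KEY', apiKey: env.GEMINI_API_KEY };
  }
  // 3. Key shared with the Google chat endpoint.
  if (env.GOOGLE_KEY) {
    return { type: 'apiKey', source: 'GOOGLE_KEY', apiKey: env.GOOGLE_KEY };
  }
  // 4. No API key anywhere: fall back to the Vertex AI service account.
  return { type: 'vertex', source: 'service_account' };
}

// GEMINI_API_KEY takes precedence over GOOGLE_KEY when both are set.
console.log(resolveGeminiAuth({ env: { GOOGLE_KEY: 'g', GEMINI_API_KEY: 'k' } }).source);
```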

Related:

Change Type

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Testing

Tested locally with multiple authentication configurations:

Vertex AI Configuration:

  1. ✅ Service account authentication with GEMINI_VERTEX_ENABLED=true
  2. ✅ Tool available without API key prompts
  3. ✅ Text-to-image generation with Nano Banana (gemini-2.5-flash-image)
  4. ✅ Image-to-image editing with context

API Key Configuration:

  1. ✅ User-provided keys via GUI
  2. ✅ Admin GEMINI_API_KEY env var
  3. ✅ Shared GOOGLE_KEY env var

General Testing:

  1. ✅ Safety filter handling for blocked content (clear error messages)
  2. ✅ Local file storage strategy
  3. ✅ Image context/editing with uploaded images
  4. ✅ Both Nano Banana and Nano Banana Pro models

Test Configuration:

  • Node.js v20
  • MongoDB 7.x
  • Vertex AI service account (us-central1 region)
  • Local file storage strategy
  • Models: gemini-2.5-flash-image (Nano Banana), gemini-3-pro-image (Nano Banana Pro)

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • I have made pertinent documentation changes (updated .env.example)
  • My changes do not introduce new warnings
  • Local unit tests pass with my changes
  • A pull request for updating the documentation has been submitted (Here)

devilb2103 and others added 13 commits September 10, 2025 15:31
* Refactored the credentials path to follow a consistent pattern with other Google service integrations, allowing for an environment variable override.
* Updated documentation in README-GeminiNanoBanana.md to reflect the new credentials handling approach and removed references to hardcoded paths.
- Bump @google/genai package version to ^1.19.0 for improved functionality.
- Refactor GeminiImageGen to createGeminiImageTool for better clarity and consistency.
- Enhance manifest.json for Gemini Image Tools with updated descriptions and icon.
- Add SVG icon for Gemini Image Tools.
- Implement progress tracking for Gemini image generation in the UI.
- Introduce new toolkit and context handling for image generation tools.

This update improves the Gemini image generation capabilities and user experience.
…icon

- Deleted the obsolete PNG file for Gemini image generation.
- Updated the SVG icon with a new design featuring a gradient and shadow effect, enhancing visual appeal and consistency.
@usnavy13
Contributor Author

@danny-avila Corresponding Docs PR LibreChat-AI/librechat.ai#452

@KiGamji
Contributor

KiGamji commented Nov 26, 2025

shouldn't it also work natively?

@KiGamji
Contributor

KiGamji commented Nov 26, 2025

image

like this

@KiGamji
Contributor

KiGamji commented Nov 26, 2025

nvm, that should invoke tools too lmao

@usnavy13
Contributor Author

shouldn't it also work natively?

I was thinking about that, but it would be a departure from how the project handles image tools. I organized it like the OpenAI tools so the workflows stay the same for users.

@KiGamji
Contributor

KiGamji commented Nov 26, 2025

@danny-avila with native multimodal image generation models appearing, it would be great to implement this functionality actually!

image

like this

@avimar

avimar commented Nov 30, 2025

This is great that it's a tool - it can be called by other models.

But yes, there are more models that can natively return text AND images (and audio?), so it would be good if it could handle that too.

@marlonka
Contributor

marlonka commented Dec 7, 2025

@danny-avila can you review this?

- Updated .env.example to include new environment variables for Google Cloud region, service account configuration, and Gemini API key options.
- Modified GeminiImageGen.js to support both user-provided API keys and Vertex AI service accounts, improving flexibility in client initialization.
- Updated manifest.json to reflect changes in authentication methods for the Gemini Image Tools.
- Bumped @google/genai package version to 1.19.0 in package-lock.json for compatibility with new features.
@paulchaum

Correct me if I'm wrong, but looking at the PR, I get the impression that the tool will only work with the Gemini or Vertex AI API. I think it would be nice to have the option to make it work with any OpenAI-compatible API, such as OpenRouter.

- Adjusted the return statement in getDefaultServiceKeyPath function for improved readability by formatting it across multiple lines. This change enhances code clarity without altering functionality.
usnavy13 and others added 5 commits December 16, 2025 18:39
Resolved conflicts:
- api/package.json: Keep @google/genai (new SDK), accept dev changes
- package-lock.json: Regenerated with npm install
Resolved conflicts in peerDependencies by keeping both:
- @google/genai (from feature branch)
- @aws-sdk/client-bedrock-runtime (from dev)

Also merged transitive dependencies in package-lock.json.
@inv-Eldho

@danny-avila
Hi,
Could you please move forward with this pull request? We’re really looking forward to this feature

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a comprehensive Gemini Image Generation tool that integrates Google's Gemini image generation models (Nano Banana and Nano Banana Pro) into LibreChat's agent system. The implementation takes a tool-based approach due to architectural limitations in the current agents package that prevent native model support for image generation with responseModalities parameters.

Key Changes:

  • Multi-authentication support (user-provided keys, admin API keys, or Vertex AI service accounts) with flexible fallback
  • Complete image generation workflow including text-to-image and image-to-image with context
  • Integration with all LibreChat storage strategies (local, S3, Azure, Firebase)

Reviewed changes

Copilot reviewed 17 out of 19 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| api/app/clients/tools/structured/GeminiImageGen.js | Core tool implementation with authentication, image generation, and storage handling |
| packages/api/src/tools/toolkits/gemini.ts | Tool schema definition with comprehensive parameters for aspect ratio and image size |
| packages/api/src/tools/toolkits/imageContext.ts | Reusable helper for building image context strings for tools |
| packages/api/src/endpoints/google/initialize.ts | Updated service key path to use 'api/data/auth.json' for consistency |
| api/app/clients/tools/util/handleTools.js | Tool loading integration with image context builder |
| api/app/clients/tools/manifest.json | Tool registration with flexible authentication config |
| packages/api/package.json | Added @google/genai dependency |
| api/models/tx.js | Token pricing configuration for image generation models |
| packages/data-provider/src/config.ts | Added gemini_image_gen to imageGenTools set |
| client/src/components/Chat/Messages/Content/Part.tsx | Client-side tool detection for rendering |
| client/src/components/Chat/Messages/Content/Parts/OpenAIImageGen/ProgressText.tsx | Progress display messaging for Gemini image generation |
| client/public/assets/gemini_image_gen.svg | Custom icon for the tool |
| .env.example | Comprehensive documentation for configuration options |

Comment on lines 124 to 129
if (!fs.existsSync(userDir)) {
  fs.mkdirSync(userDir, { recursive: true });
}

const filePath = path.join(userDir, imageName);
fs.writeFileSync(filePath, Buffer.from(base64Data, 'base64'));

Copilot AI Jan 3, 2026

The synchronous file system operations (fs.existsSync, fs.mkdirSync, fs.writeFileSync) can block the event loop and impact performance under load. Consider using the asynchronous equivalents (fs.promises.access, fs.promises.mkdir, fs.promises.writeFile) to avoid blocking, especially since this function is already async and the caller expects a promise.
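
A sketch of the asynchronous variant this comment describes, using Node's fs.promises API. The function name saveImage and the temp-directory demo are hypothetical; the variable names userDir, imageName, and base64Data are borrowed from the snippet under review.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

/**
 * Create the user directory if needed and write the decoded image
 * without blocking the event loop.
 */
async function saveImage(userDir, imageName, base64Data) {
  // mkdir with { recursive: true } does not throw when the directory
  // already exists, so no separate existence check is required.
  await fs.promises.mkdir(userDir, { recursive: true });
  const filePath = path.join(userDir, imageName);
  await fs.promises.writeFile(filePath, Buffer.from(base64Data, 'base64'));
  return filePath;
}

// Example run against a throwaway temp directory (hypothetical names).
const demoDir = path.join(os.tmpdir(), `gemini-demo-${process.pid}`, 'user-123');
saveImage(demoDir, 'example.png', Buffer.from('not a real png').toString('base64'))
  .then((filePath) => console.log('wrote', filePath));
```

Dropping the existsSync check also removes a small race between checking for the directory and creating it.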

"@azure/identity": "^4.7.0",
"@azure/search-documents": "^12.0.0",
"@azure/storage-blob": "^12.27.0",
"@google/genai": "^1.19.0",

Copilot AI Jan 3, 2026

The package.json specifies version ^1.19.0 for @google/genai, but the package-lock.json has installed version 1.33.0. This version mismatch should be resolved by updating the package.json to match the actual installed version or by ensuring the lock file reflects the specified version constraint. Consider updating package.json to "^1.33.0" to match what's actually installed.
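
For context on the constraint discussed here: under npm's semver rules, a caret range such as ^1.19.0 accepts any 1.x.y at or above 1.19.0, so a lock file resolved to 1.33.0 falls within the declared range. A simplified check follows; the function name satisfiesCaret is hypothetical, and it handles only plain major.minor.patch strings with major >= 1, not the full semver algorithm.

```javascript
// Simplified caret check: assumes plain "major.minor.patch" strings,
// major >= 1, and no prerelease or build tags.
function satisfiesCaret(version, base) {
  const [vMaj, vMin, vPat] = version.split('.').map(Number);
  const [bMaj, bMin, bPat] = base.split('.').map(Number);
  if (vMaj !== bMaj) return false; // a caret range never crosses a major version
  if (vMin !== bMin) return vMin > bMin;
  return vPat >= bPat;
}

console.log(satisfiesCaret('1.33.0', '1.19.0')); // true
console.log(satisfiesCaret('2.0.0', '1.19.0')); // false
```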

*/
function getDefaultServiceKeyPath() {
  return (
    process.env.GOOGLE_SERVICE_KEY_FILE || path.join(__dirname, '../../../..', 'data', 'auth.json')

Copilot AI Jan 3, 2026

The path construction using relative path segments from __dirname ('../../../../data/auth.json') is inconsistent with the updated path in initialize.ts which uses 'api/data/auth.json'. Given that this file is located at api/app/clients/tools/structured/GeminiImageGen.js, the path '../../../..' would resolve to the project root, making it 'data/auth.json'. However, the updated Google endpoint uses 'api/data/auth.json'. For consistency with the main Google endpoint and to match the updated path pattern, this should be changed to use path.join(process.cwd(), 'api', 'data', 'auth.json') instead of the __dirname-based approach.

Suggested change
- process.env.GOOGLE_SERVICE_KEY_FILE || path.join(__dirname, '../../../..', 'data', 'auth.json')
+ process.env.GOOGLE_SERVICE_KEY_FILE ||
+   path.join(process.cwd(), 'api', 'data', 'auth.json')
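
To make the two variants concrete, here is the path arithmetic under two assumptions that are not stated in the thread: a checkout at /repo, and a server process started from the repository root. With four `..` segments from api/app/clients/tools/structured, the __dirname variant lands in api/, so under these assumptions both expressions point at the same file. The practical difference is that the process.cwd() variant depends on where the server is launched.

```javascript
const path = require('path');

// Hypothetical checkout location, used only to make the arithmetic concrete.
const repoRoot = '/repo';
const dirname = path.posix.join(repoRoot, 'api/app/clients/tools/structured');

// __dirname-based variant from the original line.
const fromDirname = path.posix.join(dirname, '../../../..', 'data', 'auth.json');

// process.cwd()-based variant from the suggestion, with cwd assumed to be repoRoot.
const fromCwd = path.posix.join(repoRoot, 'api', 'data', 'auth.json');

console.log(fromDirname); // /repo/api/data/auth.json
console.log(fromCwd); // /repo/api/data/auth.json
```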

# Set this to enable Vertex AI and allow tool without requiring API keys
# GEMINI_VERTEX_ENABLED=true

# Vertex AI model for image generation (defaults to gemini-2.5-flash-image)

Copilot AI Jan 3, 2026

The documentation states "Vertex AI model for image generation (defaults to gemini-2.5-flash-image)" but refers to it as "Nano Banana" in the PR description. The model name "gemini-2.5-flash-image" is the actual API model identifier, while "Nano Banana" appears to be an informal name. Consider clarifying this in the comment to avoid confusion, perhaps: "Model for image generation (defaults to gemini-2.5-flash-image, known as Nano Banana)".

Suggested change
- # Vertex AI model for image generation (defaults to gemini-2.5-flash-image)
+ # Vertex AI model for image generation (defaults to gemini-2.5-flash-image, known as "Nano Banana")

… resolution

- Changed the default service key path to use process.cwd() for better compatibility.
- Replaced synchronous file system operations with asynchronous promises for mkdir and writeFile, enhancing performance and error handling.
- Added error handling for credential file access to prevent crashes when the file does not exist.
- Refactored API key checks to improve clarity and consistency.
- Removed redundant checks for user-provided keys, enhancing code readability.
- Ensured proper logging for API key usage across different configurations.
@danny-avila danny-avila changed the title from "feat: Gemini Image Generation Tool (Nano Banana)" to "🍌 feat: Gemini Image Generation Tool (Nano Banana)" Jan 3, 2026
- Added a check to ensure imageSize is only applied if the gemini model does not include 'gemini-2.5-flash-image', improving compatibility.
- Enhanced the logic for setting imageConfig to prevent potential issues with unsupported configurations.
- Simplified the handling of API keys by removing redundant checks for user-provided keys.
- Updated logging to reflect the new priority order for API key usage, enhancing clarity and consistency.
- Improved code readability by consolidating key retrieval logic.
@danny-avila
Owner

Made a few necessary changes to get this ready for merge:

  1. imageSize parameter: Only valid for gemini-3-pro-image-preview, not gemini-2.5-flash-image (undocumented by Google but causes API errors). Added conditional check.

  2. Format handling: Pro-preview wasn't outputting PNG as hardcoded. I think it returns either JPG or WEBP. Also, the librechat.yaml imageOutputType config wasn't being respected like it is for OpenAI tools. Fixed to detect actual MIME type from response.

  3. API key refactoring: Auth values are already resolved by the tool loader, there's no need to check if something is user provided or not.

  4. Path/async operations: Updated service key path to match main Google endpoint + converted to async file ops.
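
Items 1 and 2 can be illustrated with two small helpers. This is a sketch, not the merged code: the function names, the example values '16:9' and '2K', and the fallback to png for unknown MIME types are all assumptions made for illustration.

```javascript
// Hypothetical helpers illustrating items 1 and 2; not the merged code.

/** Only include imageSize for models other than gemini-2.5-flash-image. */
function buildImageConfig(model, { aspectRatio, imageSize } = {}) {
  const imageConfig = {};
  if (aspectRatio) imageConfig.aspectRatio = aspectRatio;
  const supportsImageSize = !model.includes('gemini-2.5-flash-image');
  if (imageSize && supportsImageSize) imageConfig.imageSize = imageSize;
  return imageConfig;
}

/** Pick a file extension from the MIME type the API actually returned. */
function extensionFromMime(mimeType) {
  const known = { 'image/png': 'png', 'image/jpeg': 'jpg', 'image/webp': 'webp' };
  return known[mimeType] ?? 'png'; // assumption: fall back to png for unknown types
}

// imageSize is dropped for the flash model and kept for the pro preview model.
console.log(buildImageConfig('gemini-2.5-flash-image', { aspectRatio: '16:9', imageSize: '2K' }));
console.log(buildImageConfig('gemini-3-pro-image-preview', { aspectRatio: '16:9', imageSize: '2K' }));
console.log(extensionFromMime('image/webp'));
```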

I was hesitant to merge this since LC is moving away from native tools (especially AI provider API wrappers) in favor of MCP, but I am only merging given the amount of work already put into this by others. Maintaining this tool will not be high priority.

@danny-avila
Owner

Also worth mentioning: gemini-3-pro-preview generates images at about 15 MB in file size. I am thinking a setting to limit output image size is in order and would need to be implemented here.

@danny-avila danny-avila merged commit 200098d into danny-avila:dev Jan 3, 2026
6 checks passed
@KiGamji
Contributor

KiGamji commented Jan 3, 2026

Do you have any plans for multi-modal image generation?

@danny-avila
Owner

Do you have any plans for multi-modal image generation?

Yes but not a high priority feature right now. It would be better to allow multi-modal generation via chat instead of tool, but at least the tool extends "nano banana" to other models.

@KiGamji
Contributor

KiGamji commented Jan 3, 2026

Appreciate it man!

@usnavy13
Contributor Author

usnavy13 commented Jan 3, 2026

Thanks, and I agree it's not a great template for future tools, but it fits the weird place we are in with all the SDKaos going on right now. Ideally, native multimodal input and output models should be supported like other models. I took a shot at implementing that but ended up deep in the agents package and decided to bring this home instead.

@RedwindA
Contributor

RedwindA commented Jan 3, 2026

Does it respect GOOGLE_REVERSE_PROXY in .env?

janpdevops pushed a commit to janpdevops/LibreChat that referenced this pull request Jan 4, 2026
* Added fully functioning Agent Tool supporting Google's Nano Banana

* 🔧 refactor: Update Google credentials handling in GeminiImageGen.js

* Refactored the credentials path to follow a consistent pattern with other Google service integrations, allowing for an environment variable override.
* Updated documentation in README-GeminiNanoBanana.md to reflect the new credentials handling approach and removed references to hardcoded paths.

* 🛠️ refactor: Remove unnecessary whitespace in handleTools.js

* 🔧 feat: Update Gemini Image Generation Tool

- Bump @google/genai package version to ^1.19.0 for improved functionality.
- Refactor GeminiImageGen to createGeminiImageTool for better clarity and consistency.
- Enhance manifest.json for Gemini Image Tools with updated descriptions and icon.
- Add SVG icon for Gemini Image Tools.
- Implement progress tracking for Gemini image generation in the UI.
- Introduce new toolkit and context handling for image generation tools.

This update improves the Gemini image generation capabilities and user experience.

* 🗑️ chore: Remove outdated Gemini image generation PNG and update SVG icon

- Deleted the obsolete PNG file for Gemini image generation.
- Updated the SVG icon with a new design featuring a gradient and shadow effect, enhancing visual appeal and consistency.

* fix: ESLint formatting and unused variable in GeminiImageGen

* fix: Update default model to gemini-2.5-flash-image

* ✨ feat: Enhance Gemini Image Generation Configuration

- Updated .env.example to include new environment variables for Google Cloud region, service account configuration, and Gemini API key options.
- Modified GeminiImageGen.js to support both user-provided API keys and Vertex AI service accounts, improving flexibility in client initialization.
- Updated manifest.json to reflect changes in authentication methods for the Gemini Image Tools.
- Bumped @google/genai package version to 1.19.0 in package-lock.json for compatibility with new features.

* 🔧 fix: Format Default Service Key Path in GeminiImageGen.js

- Adjusted the return statement in getDefaultServiceKeyPath function for improved readability by formatting it across multiple lines. This change enhances code clarity without altering functionality.

* ✨ feat: Enhance Gemini Image Generation with Token Usage Tracking

- Added `recordTokenUsage` function to track token usage for balance management.
- Integrated token recording into the image generation process.
- Updated Gemini image generation tool to accept optional `aspectRatio` and `imageSize` parameters for improved image customization.
- Updated token values for new Gemini models in the transaction model.
- Improved documentation for image generation tool descriptions and parameters.

* ✨ feat: Add new Gemini models for image generation token limits

- Introduced token limits for 'gemini-3-pro-image' and 'gemini-2.5-flash-image' models.
- Updated token values to enhance the Gemini image generation capabilities.

* 🔧 fix: Update Google Service Key Path for Consistency in Initialization (danny-avila#11001)

* 🔧 refactor: Update GeminiImageGen for improved file handling and path resolution

- Changed the default service key path to use process.cwd() for better compatibility.
- Replaced synchronous file system operations with asynchronous promises for mkdir and writeFile, enhancing performance and error handling.
- Added error handling for credential file access to prevent crashes when the file does not exist.

* 🔧 refactor: Update GeminiImageGen to streamline API key handling

- Refactored API key checks to improve clarity and consistency.
- Removed redundant checks for user-provided keys, enhancing code readability.
- Ensured proper logging for API key usage across different configurations.

* 🔧 fix: Update GeminiImageGen to handle imageSize support conditionally

- Added a check to ensure imageSize is only applied if the gemini model does not include 'gemini-2.5-flash-image', improving compatibility.
- Enhanced the logic for setting imageConfig to prevent potential issues with unsupported configurations.

* 🔧 refactor: Simplify local storage condition in createGeminiImageTool function

* 🔧 feat: Enhance image format handling in GeminiImageGen with conversion support

* 🔧 refactor: Streamline API key initialization in GeminiImageGen

- Simplified the handling of API keys by removing redundant checks for user-provided keys.
- Updated logging to reflect the new priority order for API key usage, enhancing clarity and consistency.
- Improved code readability by consolidating key retrieval logic.

---------

Co-authored-by: Dev Bhanushali <dev.bhanushali@hingehealth.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
janpdevops pushed a commit to janpdevops/LibreChat that referenced this pull request Jan 6, 2026
lihe8811 pushed a commit to lihe8811/LibreChat that referenced this pull request Jan 7, 2026
* Added fully functioning Agent Tool supporting Google's Nano Banana

* 🔧 refactor: Update Google credentials handling in GeminiImageGen.js

* Refactored the credentials path to follow a consistent pattern with other Google service integrations, allowing for an environment variable override.
* Updated documentation in README-GeminiNanoBanana.md to reflect the new credentials handling approach and removed references to hardcoded paths.

* 🛠️ refactor: Remove unnecessary whitespace in handleTools.js

* 🔧 feat: Update Gemini Image Generation Tool

- Bump @google/genai package version to ^1.19.0 for improved functionality.
- Refactor GeminiImageGen to createGeminiImageTool for better clarity and consistency.
- Enhance manifest.json for Gemini Image Tools with updated descriptions and icon.
- Add SVG icon for Gemini Image Tools.
- Implement progress tracking for Gemini image generation in the UI.
- Introduce new toolkit and context handling for image generation tools.

This update improves the Gemini image generation capabilities and user experience.

* 🗑️ chore: Remove outdated Gemini image generation PNG and update SVG icon

- Deleted the obsolete PNG file for Gemini image generation.
- Updated the SVG icon with a new design featuring a gradient and shadow effect, enhancing visual appeal and consistency.

* fix: ESLint formatting and unused variable in GeminiImageGen

* fix: Update default model to gemini-2.5-flash-image

* ✨ feat: Enhance Gemini Image Generation Configuration

- Updated .env.example to include new environment variables for Google Cloud region, service account configuration, and Gemini API key options.
- Modified GeminiImageGen.js to support both user-provided API keys and Vertex AI service accounts, improving flexibility in client initialization.
- Updated manifest.json to reflect changes in authentication methods for the Gemini Image Tools.
- Bumped @google/genai package version to 1.19.0 in package-lock.json for compatibility with new features.

* 🔧 fix: Format Default Service Key Path in GeminiImageGen.js

- Adjusted the return statement in getDefaultServiceKeyPath function for improved readability by formatting it across multiple lines. This change enhances code clarity without altering functionality.

* ✨ feat: Enhance Gemini Image Generation with Token Usage Tracking

- Added `recordTokenUsage` function to track token usage for balance management.
- Integrated token recording into the image generation process.
- Updated Gemini image generation tool to accept optional `aspectRatio` and `imageSize` parameters for improved image customization.
- Updated token values for new Gemini models in the transaction model.
- Improved documentation for image generation tool descriptions and parameters.

* ✨ feat: Add new Gemini models for image generation token limits

- Introduced token limits for 'gemini-3-pro-image' and 'gemini-2.5-flash-image' models.
- Updated token values to enhance the Gemini image generation capabilities.

* 🔧 fix: Update Google Service Key Path for Consistency in Initialization (danny-avila#11001)

* 🔧 refactor: Update GeminiImageGen for improved file handling and path resolution

- Changed the default service key path to use process.cwd() for better compatibility.
- Replaced synchronous file system operations with asynchronous promises for mkdir and writeFile, enhancing performance and error handling.
- Added error handling for credential file access to prevent crashes when the file does not exist.
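
A minimal sketch of the asynchronous file handling described above, not the code from GeminiImageGen.js. The helper names `readServiceKey` and `saveImage`, their parameters, and the choice to return `null` for a missing credentials file are assumptions made for illustration.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

// Hypothetical helper: returns the parsed service key, or null when the
// credentials file does not exist, so a missing file cannot crash the caller.
async function readServiceKey(keyPath) {
  try {
    const raw = await fs.readFile(keyPath, 'utf8');
    return JSON.parse(raw);
  } catch (err) {
    if (err.code === 'ENOENT') {
      return null;
    }
    throw err; // other failures (permissions, invalid JSON) still surface
  }
}

// Hypothetical helper: writes image bytes with promise-based mkdir/writeFile,
// creating the target directory first if it is missing.
async function saveImage(dir, fileName, buffer) {
  await fs.mkdir(dir, { recursive: true });
  const filePath = path.join(dir, fileName);
  await fs.writeFile(filePath, buffer);
  return filePath;
}
```

Treating only `ENOENT` as "no credentials" lets other errors, such as a malformed key file, still reach the caller.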

* 🔧 refactor: Update GeminiImageGen to streamline API key handling

- Refactored API key checks to improve clarity and consistency.
- Removed redundant checks for user-provided keys, enhancing code readability.
- Ensured proper logging for API key usage across different configurations.

* 🔧 fix: Update GeminiImageGen to handle imageSize support conditionally

- Added a check to ensure imageSize is only applied if the gemini model does not include 'gemini-2.5-flash-image', improving compatibility.
- Enhanced the logic for setting imageConfig to prevent potential issues with unsupported configurations.
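
The conditional described above could look roughly like the following sketch. `buildImageConfig`, its signature, and the decision to return `undefined` for an empty config are hypothetical; only the `aspectRatio` and `imageSize` parameter names and the `'gemini-2.5-flash-image'` substring check come from the commit notes.

```javascript
// Hypothetical helper, assumed for illustration; not the function in GeminiImageGen.js.
function buildImageConfig(model, { aspectRatio, imageSize } = {}) {
  const imageConfig = {};

  if (aspectRatio) {
    imageConfig.aspectRatio = aspectRatio;
  }

  // Apply imageSize only when the model name does not include
  // 'gemini-2.5-flash-image', as the commit note describes.
  if (imageSize && !model.includes('gemini-2.5-flash-image')) {
    imageConfig.imageSize = imageSize;
  }

  // Return undefined when nothing was set so the caller can omit the field.
  return Object.keys(imageConfig).length > 0 ? imageConfig : undefined;
}
```

With this shape, a request for `gemini-2.5-flash-image` keeps its `aspectRatio` but silently drops `imageSize`, while other models receive both.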

* 🔧 refactor: Simplify local storage condition in createGeminiImageTool function

* 🔧 feat: Enhance image format handling in GeminiImageGen with conversion support

* 🔧 refactor: Streamline API key initialization in GeminiImageGen

- Simplified the handling of API keys by removing redundant checks for user-provided keys.
- Updated logging to reflect the new priority order for API key usage, enhancing clarity and consistency.
- Improved code readability by consolidating key retrieval logic.
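
As a rough illustration of consolidated key retrieval, the sketch below assumes the priority order listed in the PR summary: a user-provided key, then an admin-configured `GEMINI_API_KEY` or `GOOGLE_KEY`, then the Vertex AI service account as fallback. The function name `resolveAuth`, the shape of its return value, and the precedence of `GEMINI_API_KEY` over `GOOGLE_KEY` are assumptions, not taken from the source.

```javascript
// Hypothetical helper, assumed for illustration; not the logic in GeminiImageGen.js.
function resolveAuth({ userApiKey, env = process.env } = {}) {
  // 1. A key entered by the user in the GUI takes priority.
  if (userApiKey) {
    return { type: 'api_key', source: 'user', apiKey: userApiKey };
  }

  // 2. Otherwise use an admin-configured key from the environment.
  //    (The order between these two variables is an assumption.)
  const adminKey = env.GEMINI_API_KEY || env.GOOGLE_KEY;
  if (adminKey) {
    return { type: 'api_key', source: 'admin', apiKey: adminKey };
  }

  // 3. With no API key available, fall back to the Vertex AI service account.
  return { type: 'vertex_ai', source: 'service_account' };
}
```

Passing `env` as a parameter keeps the lookup in one place and makes the order easy to check without changing real environment variables.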

---------

Co-authored-by: Dev Bhanushali <dev.bhanushali@hingehealth.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
@neverhoodboy

Why does it always generate multiple images in response to a single user request? Even if we ask it in the agent's prompt to generate only one image per request, it still generates multiple images.

@usnavy13
Contributor Author

usnavy13 commented Jan 7, 2026

Why does it always generate multiple images in response to a single user request? Even if we ask it in the agent's prompt to generate only one image per request, it still generates multiple images.

I do not have this issue.

@teecrow

teecrow commented Jan 8, 2026

Why does it always generate multiple images in response to a single user request? Even if we ask it in the agent's prompt to generate only one image per request, it still generates multiple images.

This also happened to me when, in the configuration for the Agent, I set the "Model" to gemini-3-flash-preview, but it resolved when I changed the model to gemini-3-pro-preview.
