
[FEATURE] Automate VLM Unloading for VRAM Management #2

@maocide

Description


User Story: As a user with a VRAM-limited GPU (<= 16 GB), I want the application to automatically unload the Vision-Language Model (VLM) from memory once it has finished generating the caption and tags, so that I have enough free VRAM to load my local Large Language Model (LLM) for character card generation without restarting the application.

Problem: The current workflow for users with limited VRAM is clunky and inefficient. The VLM remains loaded in VRAM even when it is no longer needed, preventing the user from loading a local LLM (such as one served via KoboldCPP) due to insufficient memory. The current workaround is to manually click the "Unload Model" button or restart the entire application, either of which disrupts the creative workflow.

Proposed Solution:

  1. Implement an optional setting, likely a checkbox in the Settings or Caption tab, labeled something like: "Automatically unload VLM after captioning to save VRAM".

  2. When this option is enabled, the application should automatically trigger the unload process once the caption and tags have been successfully generated and passed to the next tab.

  3. Provide clear feedback in the status bar (e.g., "Caption generated. VLM Unloaded!") so the user knows what's happening.
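The three steps above could be wired together in a single post-captioning hook. The sketch below is a minimal illustration, not the project's actual code: the `vlm.caption()` / `vlm.unload()` methods, the `auto_unload` flag, and the `set_status` callback are all hypothetical placeholders for whatever the real model wrapper and UI expose. A stand-in model class is included so the flow can be run end to end without a GPU.

```python
import gc

class DummyVLM:
    """Stand-in for the real VLM wrapper (hypothetical API)."""
    def __init__(self):
        self.loaded = True

    def caption(self, image):
        return f"a caption for {image}"

    def unload(self):
        # Real code would free the model weights here, e.g. move the
        # model off the GPU and drop references to its tensors.
        self.loaded = False

def generate_caption(vlm, image, auto_unload=True, set_status=print):
    """Run captioning, then optionally free the VLM to reclaim VRAM."""
    caption = vlm.caption(image)
    if auto_unload:                       # step 1: the opt-in setting
        vlm.unload()                      # step 2: trigger the unload
        gc.collect()                      # drop lingering Python references
        # With PyTorch one would also call torch.cuda.empty_cache() here
        # so the caching allocator returns freed blocks to the driver.
        set_status("Caption generated. VLM Unloaded!")  # step 3: feedback
    return caption

vlm = DummyVLM()
text = generate_caption(vlm, "portrait.png")
```

Note that with PyTorch, deleting the model object alone is not enough: the caching allocator holds on to freed VRAM until `torch.cuda.empty_cache()` is called, which is what lets an external process like KoboldCPP actually see the memory as available.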

Goal: To completely solve the "VRAM juggling act" for users on mainstream hardware. This will create a seamless, single-session workflow and dramatically improve the user experience.

Source: This feature was suggested by a user running a local KoboldCPP instance who identified this VRAM bottleneck.
