This repository was archived by the owner on Jan 4, 2026. It is now read-only.

Add cpu offload for Qwen/Qwen2-VL-72B-Instruct-AWQ #24

@nguyen-brat

Description

Hello, I want to run Qwen/Qwen2-VL-72B-Instruct-AWQ on my local machine. I currently have 2x RTX 3090, but I run into out-of-memory (OOM) errors. I see that vision.py has a --max-memory option to offload to CPU. Could you please implement it for Qwen/Qwen2-VL-72B-Instruct-AWQ as well?
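For reference, CPU offload in the Hugging Face transformers/accelerate stack is usually driven by a `max_memory` dict passed to `from_pretrained` alongside `device_map="auto"`. A minimal sketch of how a `--max-memory` flag could be parsed into that dict follows; the `0:20GiB,1:20GiB,cpu:60GiB` spec format and the `parse_max_memory` helper are assumptions for illustration, not the repository's actual syntax:

```python
def parse_max_memory(spec: str) -> dict:
    """Parse a spec like "0:20GiB,1:20GiB,cpu:60GiB" into the max_memory
    dict that transformers/accelerate expect: int keys for GPU ids and
    the string "cpu" for host RAM."""
    max_memory = {}
    for entry in spec.split(","):
        device, _, limit = entry.partition(":")
        device = device.strip()
        # GPU ids are integers; "cpu" (and "disk") stay as strings.
        key = int(device) if device.isdigit() else device
        max_memory[key] = limit.strip()
    return max_memory


# With the dict in hand, loading would look roughly like this
# (hypothetical; requires transformers and accelerate installed):
#
#   from transformers import Qwen2VLForConditionalGeneration
#   model = Qwen2VLForConditionalGeneration.from_pretrained(
#       "Qwen/Qwen2-VL-72B-Instruct-AWQ",
#       device_map="auto",  # let accelerate place layers across GPUs/CPU
#       max_memory=parse_max_memory("0:20GiB,1:20GiB,cpu:60GiB"),
#   )

if __name__ == "__main__":
    print(parse_max_memory("0:20GiB,1:20GiB,cpu:60GiB"))
```

Layers that do not fit within the per-GPU limits are placed on the CPU, at the cost of slower inference for the offloaded layers.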

Labels: enhancement (New feature or request)