Support quantized models #36

@laoliu5280

Description

Please modify the LLMWrapper classes in local.py to support quantized versions of models.

For testing, check whether Llama-405B and DeepSeek-V3/R1 are runnable using FP8 or INT4.
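A minimal sketch of how the wrapper could accept a quantization option, assuming a Hugging Face transformers-based loader. The class name `LocalLLMWrapper`, the `quantization` parameter, and the constructor shape are hypothetical; the actual interface in local.py may differ.

```python
# Sketch only: adapt to the real LLMWrapper interface in local.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig


class LocalLLMWrapper:  # hypothetical class name
    def __init__(self, model_name: str, quantization: str | None = None):
        quant_config = None
        if quantization == "int4":
            # 4-bit NF4 weights via bitsandbytes; compute in bfloat16.
            quant_config = BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_quant_type="nf4",
                bnb_4bit_compute_dtype=torch.bfloat16,
            )
        elif quantization == "fp8":
            # Checkpoints shipped with FP8 weights (e.g. DeepSeek-V3/R1)
            # can typically be loaded as-is; serving them efficiently may
            # require a backend such as vLLM rather than plain transformers.
            pass

        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForCausalLM.from_pretrained(
            model_name,
            quantization_config=quant_config,
            device_map="auto",
        )

    def generate(self, prompt: str, max_new_tokens: int = 256) -> str:
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        output = self.model.generate(**inputs, max_new_tokens=max_new_tokens)
        return self.tokenizer.decode(output[0], skip_special_tokens=True)
```

Example usage (model IDs are illustrative): `LocalLLMWrapper("meta-llama/Llama-3.1-405B-Instruct", quantization="int4")`. Even at INT4, the 405B model needs several high-memory GPUs, so multi-GPU sharding via `device_map="auto"` or an external serving stack is assumed.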


Labels

enhancement (New feature or request)
