
40 GB model #6

@DHOFM


Hi,
thanks for your nice repo. You mention 2x RTX 3090:
```
The following hardware is needed to run different models in MiniLLM:

Model           GPU Memory Requirements  Compatible GPUs
llama-7b-4bit   6GB                      RTX 2060, 3050, 3060
llama-13b-4bit  10GB                     GTX 1080, RTX 2060, 3060, 3080
llama-30b-4bit  20GB                     RTX 3080, A5000, 3090, 4090, V100
llama-65b-4bit  40GB                     A100, 2x3090, 2x4090, A40, A6000
```

So when I try the 65B version with 2x RTX 3090, I get an OOM error. How can I get MiniLLM to use both GPUs?
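
For context, this is roughly the two-GPU split I was expecting. Below is a minimal sketch using the transformers/accelerate `device_map` mechanism; it is not MiniLLM's own loader (MiniLLM uses GPTQ-quantized weights), and the checkpoint path and per-GPU memory caps are made up for illustration:

```python
# Minimal sketch: sharding a large model across two GPUs via the
# transformers/accelerate "device_map" mechanism. NOT MiniLLM's loader;
# the model path and memory caps below are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llama-65b-hf"  # placeholder checkpoint directory

# Capping each GPU forces accelerate to spread layers across both 3090s
# instead of trying to place everything on cuda:0, which OOMs.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    max_memory={0: "22GiB", 1: "22GiB"},
    load_in_4bit=True,  # bitsandbytes 4-bit, not MiniLLM's GPTQ format
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Inputs go to the first device; generate() handles the cross-GPU hops.
inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Is there an equivalent option in MiniLLM, or does it always load onto a single GPU?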

Kind regards,

Dirk
