llama inference with tensor parallelism by stefpi · Pull Request #1403 · ml-explore/mlx-examples

stefpi · 2026-01-10T23:24:04Z

Related to ml-explore/mlx#2973.

Made llama.py use tensor parallelism when it is run with mlx.launch on 2 or more devices.
Also added a class_predicate to the quantize call, because quantization wasn't working properly with quantized MLX models from Hugging Face.
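For readers new to the idea, the sketch below simulates tensor parallelism for one feed-forward block in plain NumPy. It is not the code from this PR (which, per the commit list, imports `shard_linear` from MLX); the function name `tensor_parallel_mlp`, the ReLU activation, and the shapes are assumptions chosen for brevity. The first weight matrix is split by columns, the second by rows, and the per-rank partial outputs are summed, which is the step an all-reduce would perform across real devices.

```python
import numpy as np


def tensor_parallel_mlp(x, w_up, w_down, world_size):
    """Simulate a two-layer MLP whose weights are split across `world_size` ranks."""
    # Column-parallel split: each rank owns a slice of the hidden units.
    up_shards = np.split(w_up, world_size, axis=1)
    # Row-parallel split: each rank owns the matching rows of the down projection.
    down_shards = np.split(w_down, world_size, axis=0)

    partials = []
    for rank in range(world_size):
        # Local work only: the elementwise activation needs no communication.
        hidden = np.maximum(x @ up_shards[rank], 0.0)
        partials.append(hidden @ down_shards[rank])

    # Summing the partial outputs stands in for an all-reduce across ranks.
    return sum(partials)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
w_up = rng.normal(size=(8, 16))
w_down = rng.normal(size=(16, 8))

reference = np.maximum(x @ w_up, 0.0) @ w_down
sharded = tensor_parallel_mlp(x, w_up, w_down, world_size=2)
print(np.allclose(reference, sharded))
```

Only one reduction is needed per block because the column split and the row split line up: each rank's hidden slice is consumed by exactly the rows of the second matrix that the same rank holds.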

awni · 2026-01-28T20:54:27Z

llms/llama/llama.py

+        class_predicate = (
+            lambda p, m: isinstance(m, (nn.Linear, nn.Embedding))
+            and f"{p}.scales" in weights
+        )


I'm curious about that change? Was it needed for a specific model?

stefpi · Jan 29, 2026

I was having trouble running quantized models, specifically Llama-2-4bit, and I saw this predicate used in the GGUF LLM example for quantized models.
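To make the effect of the predicate in the diff concrete, here is a small pure-Python sketch of the same check. The classes `Linear`, `Embedding`, and `RMSNorm` are empty stand-ins for the `nn` modules, and the weight keys and module paths are hypothetical, not taken from a real checkpoint. The point it illustrates: a layer is selected for quantization only if it is a linear or embedding layer and the checkpoint actually contains a `<path>.scales` entry for it.

```python
# Stand-in module classes (assumptions; the PR uses nn.Linear and nn.Embedding).
class Linear:
    pass


class Embedding:
    pass


class RMSNorm:
    pass


def make_class_predicate(weights):
    """Build a predicate with the same logic as the lambda in the diff."""

    def class_predicate(path, module):
        return isinstance(module, (Linear, Embedding)) and f"{path}.scales" in weights

    return class_predicate


# Hypothetical checkpoint keys: only `wq` was stored in quantized form.
weights = {
    "layers.0.attention.wq.weight": None,
    "layers.0.attention.wq.scales": None,
    "layers.0.attention.wq.biases": None,
    "output.weight": None,
    "norm.weight": None,
}

# Hypothetical module tree, keyed by path.
modules = {
    "layers.0.attention.wq": Linear(),
    "output": Linear(),
    "norm": RMSNorm(),
}

predicate = make_class_predicate(weights)
to_quantize = [path for path, module in modules.items() if predicate(path, module)]
print(to_quantize)  # ['layers.0.attention.wq']
```

Without the `scales` check, `output` would also be converted to a quantized layer even though the checkpoint stores it unquantized, so the two would no longer match when the weights are loaded.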

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

stefpi added 4 commits January 9, 2026 15:36

llama tp

6499e50

ffn fix

7e4a1b8

add support for TP in llama inference

9051e9d

cleanup

55f0e5c

stefpi mentioned this pull request Jan 10, 2026

[Docs] Simple example of using MLX distributed ml-explore/mlx#2973

Merged

stefpi added 2 commits January 11, 2026 13:22

pre-commit formatting

fb16039

import shard_linear

704bab7

stefpi marked this pull request as ready for review January 15, 2026 18:09

stefpi changed the title ~~[WIP] llama inference with tensor parallelism~~ llama inference with tensor parallelism Jan 15, 2026

awni reviewed Jan 28, 2026

llms/llama/llama.py (outdated)

awni reviewed Jan 28, 2026

remove redundant repeats definition

3c7583e

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

stefpi wants to merge 7 commits into ml-explore:main from stefpi:main
