MLX Kernel Sandbox

Motivation

vLLM is difficult to run on macOS and requires significant workarounds. MLX is Apple's array framework for Apple silicon, and it operates closer to the hardware than PyTorch does.

MLX lets you manage streams and memory explicitly from Python, much as CUDA does from C++. This gives fine-grained control over where computation runs and how memory is used.
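As a rough illustration of the stream control described above, the sketch below runs one matrix multiplication on a freshly created stream. The function name `matmul_on_stream` and the size `n` are hypothetical and not taken from this repository's main.py. The import is guarded only so the snippet still loads on machines without MLX installed.

```python
try:
    import mlx.core as mx  # assumes MLX is installed (e.g. via `uv sync`)
except ImportError:
    mx = None  # lets the file load on machines without MLX


def matmul_on_stream(n: int = 512):
    """Multiply two random n x n matrices on a newly created stream.

    Hypothetical helper for illustration; returns the result shape,
    or None when MLX is not available.
    """
    if mx is None:
        return None

    # Create an extra stream on the default device (the GPU on Apple silicon).
    stream = mx.new_stream(mx.default_device())

    a = mx.random.normal((n, n))
    b = mx.random.normal((n, n))

    # Most MLX ops accept a `stream` argument that selects where they run.
    c = mx.matmul(a, b, stream=stream)

    # MLX is lazy: nothing is computed until the result is evaluated.
    mx.eval(c)
    return c.shape


if __name__ == "__main__":
    print(matmul_on_stream(512))
```

Because MLX evaluates lazily, the call to `mx.eval` is the point where the work is actually scheduled, so any timing code should wrap that call and not only the `mx.matmul` line.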

Setup

This project uses uv for package management and requires Python 3.12+.

Install uv if you haven't already:

curl -LsSf https://astral.sh/uv/install.sh | sh

Install dependencies:
```
uv sync
```

This will create a virtual environment (.venv/) and install MLX and its dependencies.

Running

MLX examples:

uv run main.py          # Matrix multiplication benchmark
uv run generate.py      # Interactive text generation

Modal examples (NVIDIA GPU):

uv run modal run modal/get_started.py

See docs/modal.md for more details.

Why Modal vs RunPod?

When you need NVIDIA GPUs, use Modal instead of RunPod:

- Function-based: decorate a function with @app.function(gpu="A10g") and it runs on a GPU without any machine to provision first
- Pay-per-second: you pay only for execution time, with no VM to manage
- No setup overhead: write code locally and execute it in the cloud immediately
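The function-based workflow above can be sketched roughly as follows. This is not the contents of modal/get_started.py; the app name, `parse_gpu_names`, and `remote_gpu_names` are hypothetical names chosen for illustration. The import guard only keeps the file importable without Modal installed.

```python
import subprocess

try:
    import modal  # assumes the Modal client is installed and authenticated
except ImportError:
    modal = None


def parse_gpu_names(output: str) -> list[str]:
    """Split `nvidia-smi` CSV output into one GPU name per entry."""
    return [line.strip() for line in output.splitlines() if line.strip()]


if modal is not None:
    app = modal.App("gpu-check-example")  # hypothetical app name

    @app.function(gpu="A10G")
    def remote_gpu_names() -> list[str]:
        # Runs inside the Modal container, where the GPU is attached.
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
        return parse_gpu_names(result.stdout)

    @app.local_entrypoint()
    def main():
        # .remote() ships the call to Modal; only the result comes back.
        print(remote_gpu_names.remote())
```

Saved under a file name of your choice, a script like this would be launched with `uv run modal run <file>`, in the same way as the get_started.py example above.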
