lamng3/mlx-kernel-sandbox

MLX Kernel Sandbox

Motivation

vLLM is difficult to run on a Mac and requires significant workarounds. MLX is Apple's array framework for Apple silicon, and it operates closer to the hardware than PyTorch.

MLX exposes streams and memory management explicitly, similar to CUDA, but from within Python. This gives fine-grained control over where computation runs, when it runs, and how memory is used.
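As a rough illustration of stream control, the sketch below places two independent operations on two separate streams and then forces evaluation. The function name `two_stream_demo` is hypothetical and not part of this repository. The sketch assumes a recent MLX release where `mx.new_stream`, `mx.stream`, and the `stream=` keyword are available. Memory-related calls are left out because their location in the API has changed between MLX versions.

```python
def two_stream_demo(n: int = 1024) -> tuple[float, float]:
    """Run a matmul and an add on two streams; return one entry of each result."""
    import mlx.core as mx  # imported lazily; needs the mlx package installed

    # Create two streams on whichever device MLX picked as the default.
    device = mx.default_device()
    s1 = mx.new_stream(device)
    s2 = mx.new_stream(device)

    a = mx.ones((n, n))

    # Option 1: a context manager routes every op inside it to s1.
    with mx.stream(s1):
        c = a @ a

    # Option 2: pass the stream explicitly to a single op.
    d = mx.add(a, a, stream=s2)

    # MLX is lazy: nothing has run yet. mx.eval forces both results.
    mx.eval(c, d)
    return c[0, 0].item(), d[0, 0].item()
```

With all-ones inputs, `two_stream_demo(4)` should return `(4.0, 2.0)`. The two operations do not depend on each other, so MLX is free to overlap them.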

Setup

This project uses uv for package management and requires Python 3.12+.

  1. Install uv if you haven't already:

    curl -LsSf https://astral.sh/uv/install.sh | sh

  2. Install dependencies:

    uv sync

This will create a virtual environment (.venv/) and install MLX and its dependencies.

Running

MLX examples:

    uv run main.py          # Matrix multiplication benchmark
    uv run generate.py      # Interactive text generation
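The contents of main.py are not shown here. As a hedged sketch of what a matrix multiplication benchmark in MLX might look like, the helpers below time a square float32 matmul. The names `time_fn` and `bench_matmul` are hypothetical. The important detail is that MLX evaluates lazily, so the timed callable must call `mx.eval`; otherwise only graph construction is measured.

```python
import time
from typing import Callable


def time_fn(fn: Callable[[], object], warmup: int = 2, iters: int = 10) -> float:
    """Return the mean wall-clock seconds per call of fn."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


def bench_matmul(n: int = 1024, iters: int = 10) -> float:
    """Return approximate GFLOP/s for an n x n float32 matmul in MLX."""
    import mlx.core as mx  # lazy import so time_fn is usable without MLX

    a = mx.random.normal((n, n))
    b = mx.random.normal((n, n))
    mx.eval(a, b)  # materialize the inputs so their creation is not timed

    # a @ b only builds a graph; mx.eval forces the actual computation.
    seconds = time_fn(lambda: mx.eval(a @ b), iters=iters)
    return 2 * n**3 / seconds / 1e9  # a matmul costs about 2 * n^3 FLOPs
```

Calling `print(f"{bench_matmul():.1f} GFLOP/s")` on a machine with MLX installed prints a single throughput figure. The number depends on the hardware and the matrix size.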

Modal examples (NVIDIA GPU):

    uv run modal run modal/get_started.py

See docs/modal.md for more details.

Why Modal vs RunPod?

When you need NVIDIA GPUs, use Modal instead of RunPod:

  • Function-based: Decorate a function with @app.function(gpu="A10g") and it runs on a cloud GPU without provisioning a machine first
  • Pay-per-second: Only pay for execution time, no VM management
  • No setup overhead: Write code locally, execute on cloud immediately
