(WIP) picotron rewritten for mac silicon using MLX

stefpi/mlxtron

MLXtron (work in progress)

4D parallelizable training for models using MLX. Based on Picotron.

a very minimal implementation that will probably only support the Llama architecture for now.

as mac users we mostly operate in the GPU-poor case 😭, but with enough macs working together some real power kicks in.

read the blog to learn along with me at stefpi.net/blog/

design

the benefit of training across multiple macs (aside from having the largest RAM capacity available on consumer hardware) is that each GPU in the training network is attached to a significant amount of storage and CPU power. This lets us skip many communication/broadcast steps, because each device can keep the dataset locally and pull only the samples it needs into unified memory.
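
as a rough illustration of the idea above (not mlxtron's actual code), here is a minimal sketch of how each device could pick its own samples from a local copy of the dataset without any broadcast. The function names `local_indices` and `batches`, and the `seed`/`epoch` parameters, are hypothetical and only there for the example. It assumes every device knows its `rank` and the `world_size`, and that all devices agree on the same seed.

```python
import random


def local_indices(num_samples, rank, world_size, seed=0, epoch=0):
    """Return the dataset indices this rank should read from its local copy.

    Hypothetical helper: every rank computes the same shuffled order from the
    shared (seed, epoch) pair, then keeps a strided slice of it. No rank has
    to receive indices or samples from any other rank.
    """
    order = list(range(num_samples))
    # Same seed on every device -> same permutation on every device.
    random.Random(seed * 1_000_003 + epoch).shuffle(order)
    # Drop the tail so that every rank gets the same number of samples.
    usable = num_samples - num_samples % world_size
    return order[rank:usable:world_size]


def batches(indices, batch_size):
    """Yield consecutive batches of indices; the last batch may be shorter."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]


if __name__ == "__main__":
    # Simulate 4 devices sharing a 10-sample dataset.
    for rank in range(4):
        mine = local_indices(num_samples=10, rank=rank, world_size=4, seed=1)
        print(rank, list(batches(mine, batch_size=2)))
```

in a real run the rank and world size would come from the distributed backend rather than a loop, and only the selected indices would be loaded from disk into unified memory. The slices are disjoint across ranks, so gradient synchronization remains the only communication this step needs.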
