shawnschulz/loki_distributed


Loki

This is very much a work in progress. The idea is to set up Docker containers on multiple hosts with CUDA + MPI, then use either Docker Swarm or MPI with traditional networking to run a distributed transformer or VAE model. The CUDA kernels for the VAE and transformer models are still being written, but the Makefile should work for the given Docker image.
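To make the multi-host idea concrete, here is a minimal sketch of one common way to coordinate training across hosts: data-parallel gradient averaging. It is an assumption about how the pieces could fit together, not code from this repository, and the names `allreduce_mean` and `sgd_step` are hypothetical. The sketch simulates all ranks inside one process using plain C. In a real MPI program, each rank would hold only its own buffer, and the summation would typically be done with `MPI_Allreduce` using `MPI_SUM`, followed by a division by the communicator size.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the Loki sources).
 * Single-process simulation of a mean all-reduce over gradients.
 * grads is row-major: grads[r * n_params + i] is rank r's gradient
 * for parameter i. After the call, every rank's row holds the
 * element-wise mean across all ranks.
 */
void allreduce_mean(float *grads, size_t n_ranks, size_t n_params)
{
    if (n_ranks == 0) {
        return; /* nothing to average */
    }
    for (size_t i = 0; i < n_params; ++i) {
        float sum = 0.0f;
        for (size_t r = 0; r < n_ranks; ++r) {
            sum += grads[r * n_params + i];
        }
        float mean = sum / (float)n_ranks;
        /* Broadcast the result back so all ranks agree. */
        for (size_t r = 0; r < n_ranks; ++r) {
            grads[r * n_params + i] = mean;
        }
    }
}

/*
 * Hypothetical helper: plain SGD update for one rank.
 * Because every rank applies the same averaged gradient,
 * the model replicas stay in sync.
 */
void sgd_step(float *params, const float *grad, size_t n_params, float lr)
{
    for (size_t i = 0; i < n_params; ++i) {
        params[i] -= lr * grad[i];
    }
}
```

A training step under this scheme would compute local gradients on each host (on the GPU, via the CUDA kernels), average them with the all-reduce, and then call `sgd_step` with the averaged row. Only the gradient buffers cross the network, which keeps the communication pattern simple on modest infrastructure.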

About

A WIP distributed VAE and transformer model using CUDA C kernels and MPI. The goal is to perform reasonable distributed training and inference on modest infrastructure with few dependencies.
