pAplakidis / DriveGPT Public


My implementation of comma.ai's "Learning a Driving Simulator"



DriveGPT

A toy implementation of a driving simulator that uses a VQVAE image tokenizer with an autoregressive GPT model

Resources

TODO

Too slow to train without large compute
- pre-tokenize dataset images once (offline) (DONE)
- do not run vqvae (self.model.image_tokenizer) in trainer.train_step()
- reduce transformer token count (currently T × H_e × W_e tokens): pool spatially, OR flatten the spatial dimensions and project through an MLP into a smaller token
- Sanity check with profiler (or a simple timer around one batch)
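```
import time
import torch

torch.cuda.synchronize()  # wait for queued GPU work so the timer starts clean
start = time.perf_counter()
...  # run one training step here
torch.cuda.synchronize()  # wait for the step's GPU kernels to finish
print("Batch time:", time.perf_counter() - start)
```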
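For the token-count item in the list above, a quick back-of-the-envelope calculation shows how much spatial pooling could help. This is only a sketch: the sizes `T`, `H_E`, `W_E` and the helpers `token_count` and `attention_pairs` are hypothetical and not taken from this repo, and it assumes full self-attention, whose cost grows with the square of the sequence length.

```python
def token_count(num_frames, h_e, w_e, pool=1):
    """Tokens per sequence after pooling each frame's latent grid by `pool`."""
    if h_e % pool or w_e % pool:
        raise ValueError("pool must divide both H_e and W_e")
    return num_frames * (h_e // pool) * (w_e // pool)


def attention_pairs(n_tokens):
    """Number of query-key pairs in full self-attention (grows as N^2)."""
    return n_tokens * n_tokens


# Hypothetical sizes, chosen only for illustration.
T, H_E, W_E = 8, 16, 32

full = token_count(T, H_E, W_E)            # 8 * 16 * 32 = 4096
pooled = token_count(T, H_E, W_E, pool=2)  # 8 * 8 * 16 = 1024

# Ratio of attention pairs before vs. after pooling.
print(full, pooled, attention_pairs(full) // attention_pairs(pooled))
# prints: 4096 1024 16
```

Under these assumed sizes, 2×2 pooling cuts the sequence 4× and the attention pairs 16×, at the price of coarser spatial detail per token.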
Train MLSIM
Inference App
Add RNN state to video decoder (smooth video)
Train VQVAE as a GAN
Move from autoregressive GPT model to latent diffusion