- [ ] Scripts for setting up the training env
- [ ] Save model checkpoints regularly, send to the dev machine
- [ ] Make sure the process restarts when the training is interrupted
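
The checkpoint and restart items could be sketched roughly as follows. This is a minimal illustration, not a finished design: `save_checkpoint`, `load_checkpoint`, `train`, the JSON state layout, and the `fail_at` parameter (used only to simulate an interruption) are all hypothetical names, and the training step is a placeholder. The idea is that each checkpoint is written atomically, and every start of the process resumes from the last saved step.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so an interruption never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename over the old checkpoint; atomic when both paths share a filesystem.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "loss": None}


def train(path, total_steps, save_every, fail_at=None):
    """Run (or resume) training, saving a checkpoint every `save_every` steps."""
    state = load_checkpoint(path)
    for step in range(state["step"], total_steps):
        if fail_at is not None and step == fail_at:
            # Hypothetical hook, only here to simulate an interrupted run.
            raise RuntimeError("simulated interruption")
        # Placeholder for the real training step (assumption).
        state = {"step": step + 1, "loss": 1.0 / (step + 1)}
        if state["step"] % save_every == 0:
            save_checkpoint(path, state)
            # Copying the file to the dev machine could be hooked in here.
    return state
```

With this shape, restarting after an interruption only requires running the same entry point again, for example from a process supervisor or a retry loop. Work done since the last checkpoint is repeated, so `save_every` trades checkpoint overhead against lost progress.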