## to be added:
- [x] BERT language model
- [x] decoder-only model with `RMSNorm`, `RoPE`, `SwiGLU`
- [x] config file for different model size configs
- [x] run file
- [ ] quantization logic
- [ ] fine-tuning script for model

## to be done:
- [ ] test run once with smaller data
- [x] hyperparameter tuning