This repository is what I use to understand NLP, mainly attention and transformers. Most of the work comes from learning through Andrej Karpathy's series on LLMs.
Some of the papers I used are:
- Neural Machine Translation by Jointly Learning to Align and Translate
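The paper above introduces additive (Bahdanau) attention, where a small feed-forward network scores each encoder state against the decoder query. A minimal NumPy sketch of that scoring step, with hypothetical parameter names (`W_q`, `W_k`, `v` are illustrative, not from this repo):

```python
import numpy as np

def additive_attention(query, keys, W_q, W_k, v):
    """Bahdanau-style additive attention over a set of encoder states.

    query: (d,) decoder hidden state
    keys:  (n, d) encoder hidden states
    W_q, W_k: (d, d) learned projections; v: (d,) learned scoring vector
    """
    # Score each key: e_i = v^T tanh(W_q q + W_k k_i)
    scores = np.tanh(query @ W_q + keys @ W_k) @ v
    # Softmax over scores (stabilised by subtracting the max)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of encoder states
    context = weights @ keys
    return context, weights

rng = np.random.default_rng(0)
d, n = 4, 3
ctx, w = additive_attention(
    rng.standard_normal(d),        # query
    rng.standard_normal((n, d)),   # keys
    rng.standard_normal((d, d)),   # W_q
    rng.standard_normal((d, d)),   # W_k
    rng.standard_normal(d),        # v
)
```

The attention weights form a probability distribution over the encoder states, so they sum to 1, and the context vector has the same dimensionality as a single encoder state.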
Resources I used:
- Andrej Karpathy's lecture on Transformers
The GPT model was trained on Kaggle using the free 2×T4 GPUs. Notebook Link