Step-by-step experiments in character-level language modelling, starting from a plain `RNNCell` and finishing with a stacked LSTM.
PyTorch and PyTorch Lightning are used for training and testing.
The RNN is trained to predict the next letter in a given text sequence. The trained model can then be used to generate new text resembling the original data.
TensorBoard is used to log the training process and to observe the improvement in the generated text.
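To make the idea concrete, here is a minimal, self-contained sketch of next-character training and greedy sampling with `nn.RNNCell`. The model, text, and hyperparameters below are illustrative only, not the repository's own configuration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy corpus and character vocabulary (illustrative, not the repo's dataset).
text = "hello world, hello torch"
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}
vocab = len(chars)

class CharRNN(nn.Module):
    def __init__(self, vocab, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.cell = nn.RNNCell(hidden, hidden)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, idx):  # idx: (batch, seq) of character ids
        h = torch.zeros(idx.size(0), self.cell.hidden_size)
        logits = []
        for t in range(idx.size(1)):  # unroll the cell one step at a time
            h = self.cell(self.embed(idx[:, t]), h)
            logits.append(self.head(h))
        return torch.stack(logits, dim=1)  # (batch, seq, vocab)

model = CharRNN(vocab)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
ids = torch.tensor([[stoi[c] for c in text]])
x, y = ids[:, :-1], ids[:, 1:]  # target is the input shifted by one character
for _ in range(100):
    opt.zero_grad()
    logits = model(x)
    loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), y.reshape(-1))
    loss.backward()
    opt.step()

# Greedy generation: feed the model's own prediction back in as the next input.
h = torch.zeros(1, model.cell.hidden_size)
out, cur = "h", torch.tensor([stoi["h"]])
for _ in range(10):
    h = model.cell(model.embed(cur), h)
    cur = model.head(h).argmax(dim=-1)
    out += itos[cur.item()]
```

The experiments in this repository follow the same pattern, swapping the recurrent core (`RNNCell`, LSTM, stacked LSTM) while the next-character objective stays the same.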
Trained and tested on:
- Python 3.12
- numpy==1.26.4
- pytorch-lightning==2.3.3
- tensorboard==2.17.0
- torch==2.2.2+cu118
- Clone the repository.
- Install the required packages by running `pip install -r requirements_cuda.txt`.
- Change directory to the cloned repository.
- Each experiment is in a separate directory, named starting from `1_***`.
- Each experiment directory contains a `main.py` file, which is the entry point for the experiment and includes a detailed explanation of it.
- Run an experiment like `python 1_RNNCell\main.py` in the cloned directory.
- The `params.yml` file in each experiment directory contains the hyperparameters and configuration for the experiment: parameters for the PyTorch modules, the PyTorch Lightning Trainer, the dataset, inference and logging.
- During training, logs are saved in the `logs` directory, with a specific folder for each experiment.
- To visualize the training process, run `tensorboard.exe --logdir logs` in the cloned directory and open the link in a browser.
- In the default configuration via `params.yml`, the model is trained on the `data/shakespeare.txt` file and limited to 5 minutes of runtime.
- The `train_loss` graph shows the progress and the final loss value reached within 5 minutes for each experiment.
- The generated text can also be inspected in TensorBoard.
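The text samples appear in TensorBoard because they are written alongside the scalar metrics. A minimal sketch of that logging, using `torch.utils.tensorboard.SummaryWriter` directly (the tag names and sample string here are illustrative, not the repository's own):

```python
import os
import tempfile
from torch.utils.tensorboard import SummaryWriter

# Write to a throwaway directory; the repo uses per-experiment folders under logs/.
logdir = tempfile.mkdtemp()
writer = SummaryWriter(log_dir=logdir)

sample = "To be, or not to be"  # e.g. text sampled from the model at this step
writer.add_text("generated_text", sample, global_step=0)   # TEXT tab
writer.add_scalar("train_loss", 2.31, global_step=0)       # SCALARS tab
writer.close()
```

After running `tensorboard --logdir <logdir>`, the sample shows up under the TEXT tab and the loss under SCALARS, which is how the per-experiment progress described above can be compared.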
