A simple character-level RNN implementation in PyTorch for text generation. This model learns to predict the next character in a sequence and can generate new text in the style of the training data.

Sample generated output:
New Shrek just dropped arrattlese. I tell have layers. I'm to you?
-Uh, no. No! No! No! No! No! Noppotic, don't look down. Don't mean the shoor before this beforge princess.
-Oh, you want to talk and her face
Oh-oh-oh
It's a new romance
Oh-oh-oh
The best years you goions.
Ander through comming to play
Gonna good. Not have lay it place you have to do it that foret my bad say
off and get me
{Belling}
{Shouting}
{Roaring}
{Gasps}
-Oh, hey, wait, but you talking about me!
-Uh, that you got a good it the should Shrek.
- Character-level text generation using vanilla RNN
- Easily configurable sequence length, batch size, and model architecture
- Learned character embeddings instead of one-hot encoding for richer character representations
- Learning rate scheduling with linear warmup and cosine annealing
- Support for CUDA and MPS (Apple Silicon) acceleration (see the device-selection sketch after this list)
- Temperature-based sampling for text generation
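The device selection referenced above can be done with a standard availability check; the snippet below is a minimal sketch (the actual logic in train.py and test.py may be organized differently):

```python
import torch

def get_device() -> torch.device:
    """Pick the fastest available backend (illustrative sketch)."""
    if torch.cuda.is_available():
        return torch.device("cuda")   # NVIDIA GPU
    if torch.backends.mps.is_available():
        return torch.device("mps")    # Apple Silicon GPU
    return torch.device("cpu")        # Fallback

device = get_device()
```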
.
├── models/
│ └── chaRNN.py # RNN model implementation
├── datasets/
│ ├── UnstructuredText.py # Dataset handling
│ └── data/ # Training data directory
├── chkpts/ # Saved model checkpoints
├── config.py # Hyperparameter configuration
├── train.py # Training script
└── test.py # Generation/inference script
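The dataset code in datasets/UnstructuredText.py is not reproduced here; the sketch below shows one common way to slice a raw text file into fixed-length character sequences for next-character prediction. The class and method names are illustrative only, not the repository's actual API:

```python
import torch
from torch.utils.data import Dataset

class CharSequenceDataset(Dataset):
    """Hypothetical character-level dataset sketch (the real UnstructuredText.py
    may differ). Each item is an (input, target) pair of index tensors, where
    the target is the input shifted forward by one character."""

    def __init__(self, path: str, sequence_length: int = 50):
        with open(path, encoding="utf-8") as f:
            text = f.read()
        self.chars = sorted(set(text))
        self.char_to_idx = {c: i for i, c in enumerate(self.chars)}
        self.data = torch.tensor([self.char_to_idx[c] for c in text], dtype=torch.long)
        self.sequence_length = sequence_length

    def __len__(self):
        return len(self.data) - self.sequence_length

    def __getitem__(self, idx):
        x = self.data[idx : idx + self.sequence_length]
        y = self.data[idx + 1 : idx + self.sequence_length + 1]
        return x, y
```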
The model consists of:
- An embedding layer to convert character indices to dense vectors
- A vanilla RNN layer
- A linear layer to project to vocabulary size
Model dimensions are configurable through the config dictionary; a sketch of the architecture follows.
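This sketch is illustrative only; the actual models/chaRNN.py may differ in naming and details:

```python
import torch
import torch.nn as nn

class CharRNN(nn.Module):
    """Embedding -> vanilla RNN -> linear projection to vocabulary logits."""

    def __init__(self, vocab_size, embedding_dim=32, hidden_state_dim=128, num_layers=1):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)        # char indices -> dense vectors
        self.rnn = nn.RNN(embedding_dim, hidden_state_dim,
                          num_layers=num_layers, batch_first=True)      # vanilla RNN layer
        self.fc = nn.Linear(hidden_state_dim, vocab_size)               # project to vocab size

    def forward(self, x, hidden=None):
        # x: (batch, seq_len) of character indices
        emb = self.embedding(x)               # (batch, seq_len, embedding_dim)
        out, hidden = self.rnn(emb, hidden)   # (batch, seq_len, hidden_state_dim)
        logits = self.fc(out)                 # (batch, seq_len, vocab_size)
        return logits, hidden
```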
Model hyperparameters are centralized in config.py for easier experimentation and consistency between training and generation.
{
"sequence_length": 50, # Length of input sequences (in characters)
"batch_size": 64, # Batch size for training
"max_epochs": 200, # Number of training epochs
"hidden_state_dim": 128, # RNN hidden state dimension
"embedding_dim": 32, # Character embedding dimension
"num_layers": 1, # Number of RNN layers
"lr": 0.001, # Learning rate
"warmup_epoch_ratio": 0.2, # Portion of training used for warmup
"warmup_lr_factor": 0.25, # Starting learning rate factor for warmup
"temperature": 1.0 # Sampling temperature for generation
}
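The `warmup_epoch_ratio` and `warmup_lr_factor` values drive the linear-warmup-then-cosine-annealing schedule listed in the features. The snippet below is a minimal sketch of one way to implement it with PyTorch's `LambdaLR`; the actual scheduler in train.py may differ:

```python
import math
from torch.optim.lr_scheduler import LambdaLR

def build_scheduler(optimizer, max_epochs=200, warmup_epoch_ratio=0.2, warmup_lr_factor=0.25):
    """Linear warmup from warmup_lr_factor * lr up to lr, then cosine annealing
    toward zero. Illustrative sketch only."""
    warmup_epochs = int(max_epochs * warmup_epoch_ratio)

    def lr_lambda(epoch):
        if epoch < warmup_epochs:
            # Linear ramp from warmup_lr_factor to 1.0 over the warmup epochs
            progress = epoch / max(1, warmup_epochs)
            return warmup_lr_factor + (1.0 - warmup_lr_factor) * progress
        # Cosine annealing over the remaining epochs
        progress = (epoch - warmup_epochs) / max(1, max_epochs - warmup_epochs)
        return 0.5 * (1.0 + math.cos(math.pi * progress))

    return LambdaLR(optimizer, lr_lambda)
```

Call `scheduler.step()` once per epoch to advance the schedule.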
- Place your training text file in the `datasets/data/` directory
- Update the file path in `train.py`
- Run training: `python train.py`

The model checkpoint will be saved in the `chkpts/` directory with a timestamp.
- Update the checkpoint path in `test.py` to point to your trained model
- Modify the seed text and number of characters to generate
- Run generation: `python test.py`
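During generation, temperature scaling divides the logits before the softmax: values below 1.0 make sampling more conservative, values above 1.0 make it more random. A hedged sketch of the sampling step (test.py's actual generation loop may be structured differently):

```python
import torch

@torch.no_grad()
def sample_next_char(model, input_ids, hidden, temperature=1.0):
    """Sample the index of the next character. Illustrative sketch only."""
    logits, hidden = model(input_ids, hidden)        # logits: (1, seq_len, vocab_size)
    last_logits = logits[:, -1, :] / temperature     # scale the final step's logits
    probs = torch.softmax(last_logits, dim=-1)
    next_id = torch.multinomial(probs, num_samples=1)  # (1, 1) sampled character index
    return next_id, hidden
```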
- Uses a simple vanilla RNN (could be extended to LSTM/GRU)
- No regularization
- Character-level only (no word-level or subword tokenization)