First, install the required packages:
```
pip install torchao liger_kernel pyarrow tensorboard
```

💡 **Note:** `torchao` and `liger_kernel` may require a recent version of PyTorch (≥ 2.3) and a CUDA-enabled environment for optimal performance.
- Download all files from this repository.
- Place them in a single working directory.
- Inside this directory, create a subfolder named `128`.
- Download the training data (Parquet files) into the `128/` folder:
  👉 TinyCorpus-v2
Your directory should look like this:
```
your-training-folder/
├── trainGPT-token.py
├── fast_self_attn_model.py
├── data_utils.py
├── dev_optim.py
└── 128/
    ├── tinycorpus-000-of-128.parquet
    ├── tinycorpus-001-of-128.parquet
    └── ...                      # all shard files
```
Run the training script from inside `your-training-folder`:

```
python trainGPT-token.py
```

This replicates MiniModel with 12 layers; the original model used 24 layers. Set `'layers': 24` in `trainGPT-token.py` if you wish to replicate the original model.
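The edit above amounts to changing one value in the script's hyperparameter dict. The exact structure of the config in `trainGPT-token.py` is not shown here, so this is a hypothetical sketch; only the `'layers'` key comes from the instructions above.

```python
# Hypothetical sketch of a dict-style config as used in trainGPT-token.py;
# keys other than 'layers' are illustrative placeholders.
config = {
    'layers': 12,   # default: 12-layer MiniModel
}

# To replicate the original model, set:
config['layers'] = 24
```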
By default, the script logs training loss and other metrics to a directory called `runs/` using PyTorch's `SummaryWriter`.
While training is running (or after it finishes), launch TensorBoard to visualize the loss curve:
```
tensorboard --logdir=runs
```

Then open your browser and go to:
👉 http://localhost:6006
You'll see real-time plots of the training loss (refreshed every 30 seconds).
If you encounter memory issues, open `trainGPT-token.py` and adjust one or both of the following:

- Reduce model size:
  ```
  'input_dims': 512   # default 768
  ```
- Reduce batch size:
  ```
  batch_size = 32     # default 64
  ```
Smaller values will lower VRAM usage at the cost of training speed or stability.
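To get a feel for why shrinking `input_dims` helps, here is a rough back-of-envelope parameter count for a GPT-style transformer. The ~12·d² per-layer approximation (attention ≈ 4·d², MLP ≈ 8·d²) is a standard rule of thumb, not the exact count for this repository's model, and the vocabulary size is an assumed placeholder.

```python
def approx_params(d_model: int, n_layers: int, vocab: int = 50_000) -> int:
    """Rough GPT-style count: ~12*d^2 per layer (attn 4*d^2 + MLP 8*d^2) plus embeddings."""
    per_layer = 12 * d_model ** 2
    return n_layers * per_layer + vocab * d_model

default = approx_params(768, 12)   # 'input_dims': 768
reduced = approx_params(512, 12)   # 'input_dims': 512
print(f"default: {default / 1e6:.1f}M params, reduced: {reduced / 1e6:.1f}M params")
```

Since the per-layer cost grows quadratically in `input_dims`, dropping from 768 to 512 roughly halves the transformer-block parameters (and the activations that dominate VRAM shrink with it).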