A "tiny" implementation of GPT-2 trained on a Haiku dataset, designed to run on limited resources (like a single T4 GPU on Google Colab).
This project demonstrates how to build and train a small-scale GPT-2 model from scratch. The goal is to understand the architecture and mechanics of Transformers by implementing them, rather than just using pre-trained models.
Key features:
- Byte-Level BPE Tokenization: Training a byte-level BPE tokenizer on the haiku corpus, so any input text can be encoded without out-of-vocabulary tokens (see the tokenizer sketch below).
- Transformer Architecture: Implementing causal Self-Attention, LayerNorm, and FeedForward (MLP) layers, stacked into GPT-2 style blocks (see the block sketch below).
- Training: Autoregressive next-token prediction on a dataset of haikus (see the training-step sketch below).
- Generation: Sampling new haikus from the trained model (see the sampling sketch below).
- Google Colab Notebook: Run the Code
- Detailed Blog Post: Read the Tutorial
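To give a flavour of the tokenization step, here is a minimal sketch using the Hugging Face `tokenizers` library; the file name `haikus.txt`, the vocabulary size, and the special token are illustrative assumptions rather than the notebook's actual settings:

```python
from tokenizers import ByteLevelBPETokenizer

# Train a byte-level BPE tokenizer on the haiku corpus.
# "haikus.txt" and the hyperparameters below are placeholder assumptions.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["haikus.txt"],
    vocab_size=5000,
    min_frequency=2,
    special_tokens=["<|endoftext|>"],  # GPT-2's end-of-text marker
)

enc = tokenizer.encode("an old silent pond")
print(enc.tokens)  # subword pieces
print(enc.ids)     # integer ids fed to the model
```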
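The core of the model is a stack of GPT-2 style Transformer blocks. The sketch below shows one possible block in plain PyTorch (pre-LayerNorm, as in GPT-2); the class and argument names (`n_embd`, `n_head`, `block_size`) are placeholders, and the notebook's exact structure may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)  # joint projection to queries, keys, values
        self.proj = nn.Linear(n_embd, n_embd)
        # lower-triangular mask so each position can only attend to earlier positions
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / (k.size(-1) ** 0.5)           # scaled dot-product scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))    # hide the future
        y = F.softmax(att, dim=-1) @ v
        y = y.transpose(1, 2).contiguous().view(B, T, C)                # re-assemble the heads
        return self.proj(y)

class Block(nn.Module):
    """One GPT-2 style block: pre-LayerNorm, attention, then a feed-forward MLP."""
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head, block_size)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))  # residual connection around attention
        x = x + self.mlp(self.ln2(x))   # residual connection around the MLP
        return x
```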
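Training is ordinary autoregressive language modelling: at every position the model predicts the next token, so the targets are simply the inputs shifted one position to the left. A single training step might look like the sketch below, assuming a hypothetical `model(x)` that returns logits of shape `(batch, time, vocab_size)`:

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, x, y):
    """One step of next-token prediction.

    x: (B, T) token ids; y: (B, T) the same sequence shifted left by one token.
    """
    logits = model(x)                              # (B, T, vocab_size), assumed output shape
    loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)),          # flatten to (B*T, vocab_size)
        y.view(-1),                                # flatten targets to (B*T,)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```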
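Generation then feeds the model its own output one token at a time. The loop below is a minimal sampling sketch with temperature scaling and optional top-k filtering; the notebook's actual sampling settings may differ:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size, temperature=1.0, top_k=40):
    """Autoregressively sample tokens; idx is a (batch, time) tensor of token ids."""
    model.eval()
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]             # crop context to the model's window
        logits = model(idx_cond)[:, -1, :]          # logits at the last position (assumed shape)
        logits = logits / temperature
        if top_k is not None:
            v, _ = torch.topk(logits, top_k)
            logits[logits < v[:, [-1]]] = float("-inf")  # keep only the top-k candidates
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)
    return idx
```

Starting from a prompt or end-of-text token and decoding the returned ids with the tokenizer yields new haiku candidates.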
The easiest way to run this project is via the Google Colab link above.
If you wish to run it locally, ensure you have the necessary dependencies installed (PyTorch, Transformers, etc.) and run the notebook `GPT_2_Implementation.ipynb`.
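As a quick sanity check before launching the notebook locally, something like the following confirms that the main dependencies import and whether a GPU is visible (the exact requirement list may differ from what the notebook uses):

```python
import torch, transformers, tokenizers

print("torch", torch.__version__)
print("transformers", transformers.__version__)
print("tokenizers", tokenizers.__version__)
print("CUDA available:", torch.cuda.is_available())  # a GPU such as a T4 helps, but is not required
```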