Tech Stack: CUDA C++ · PyTorch (for reference) · Jupyter Notebook
miniGPT is a minimal, CUDA-accelerated implementation of a transformer-based language model inspired by GPT architectures. This project demonstrates the core components of transformer inference, including attention, feedforward layers, and positional encoding, all implemented from scratch in CUDA C++.
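To give a flavor of what these from-scratch components look like, below is a minimal sketch of a sinusoidal positional-encoding kernel. It is an illustration only: the kernel name, launch configuration, and row-major layout are assumptions for this sketch, not miniGPT's actual API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sinusoidal positional encoding: PE(pos, 2k)   = sin(pos / 10000^(2k/d_model)),
//                                 PE(pos, 2k+1) = cos(pos / 10000^(2k/d_model)).
// Illustrative kernel; names and layout are assumptions, not miniGPT's code.
__global__ void positional_encoding(float* pe, int seq_len, int d_model) {
    int pos = blockIdx.x;   // one block per token position
    int i   = threadIdx.x;  // one thread per embedding dimension
    if (pos >= seq_len || i >= d_model) return;

    // Dimensions 2k and 2k+1 share the frequency 10000^(-2k/d_model).
    float freq  = powf(10000.0f, -(float)((i / 2) * 2) / (float)d_model);
    float angle = (float)pos * freq;
    pe[pos * d_model + i] = (i % 2 == 0) ? sinf(angle) : cosf(angle);
}

int main() {
    const int seq_len = 8, d_model = 16;
    const size_t bytes = seq_len * d_model * sizeof(float);

    float h_pe[seq_len * d_model];
    float* d_pe = nullptr;
    cudaMalloc(&d_pe, bytes);

    positional_encoding<<<seq_len, d_model>>>(d_pe, seq_len, d_model);
    cudaMemcpy(h_pe, d_pe, bytes, cudaMemcpyDeviceToHost);

    // First values of position 1: sin(1), cos(1), then lower frequencies.
    printf("PE[1][0..3] = %.3f %.3f %.3f %.3f\n",
           h_pe[d_model + 0], h_pe[d_model + 1],
           h_pe[d_model + 2], h_pe[d_model + 3]);

    cudaFree(d_pe);
    return 0;
}
```

One block per position and one thread per embedding dimension keeps the indexing obvious; a production kernel would tile larger `d_model` values across multiple threads or blocks.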
Requirements:

- CUDA Toolkit (version 11.0 or higher recommended)
- NVIDIA GPU with compute capability 6.0+
- g++ (for compiling C++/CUDA code)
- Python 3.x (for running notebooks and reference scripts)
- PyTorch (for reference and comparison, optional)
- Jupyter Notebook (for exploration and demonstration)
Quick Start:

- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/cuda-miniGPT.git
  cd cuda-miniGPT
  ```

- Compile the CUDA source code:

  ```bash
  bash scripts/compile.sh
  ```

- Run inference:

  ```bash
  ./build/minigptinference
  ```

- (Optional) Explore the Jupyter Notebook:

  ```bash
  jupyter notebook miniGPT.ipynb
  ```
