PMPP: Programming Massively Parallel Processors

A learning repository following the book "Programming Massively Parallel Processors: A Hands-on Approach" by Wen-mei W. Hwu, David B. Kirk, and Izzat El Hajj. This project explores parallel programming through both CUDA C/C++ and Python/Triton implementations.

📚 About

This repository documents my journey learning GPU programming and parallel computing. I'm experimenting with:

CUDA C/C++ for low-level GPU programming
Triton for high-level, Pythonic GPU kernels
CMake for C/C++ build management
uv for Python dependency management
Doxygen for C/C++ code documentation
jj (Jujutsu) for version control

Current Progress: Chapter 3

Note: This is an experimental learning repository. Code may not be production-ready and is intended for educational purposes.

📂 Project Structure

pmpp/
├── src/
│   ├── cuda/              # CUDA C/C++ implementations
│   │   └── vector_add/    # Chapter 2-3: Vector addition example
│   └── triton/            # Python/Triton implementations (coming soon)
├── notes/                 # Chapter summaries and learning notes
├── html/                  # Doxygen-generated documentation
├── CMakeLists.txt         # CMake configuration (if using top-level build)
├── pyproject.toml         # Python project configuration
├── uv.lock                # Locked Python dependencies
├── Doxyfile               # Doxygen configuration
└── README.md              # This file

Directory Purposes

src/cuda/: Contains CUDA C/C++ kernel implementations organized by chapter/topic
src/triton/: Will contain Python/Triton kernel implementations for comparison
notes/: Personal notes, chapter summaries, and key concepts
html/: Auto-generated API documentation (gitignored, generated locally)

🔧 Prerequisites

Hardware

NVIDIA GPU with CUDA support (Compute Capability 3.5+)
Check your GPU: nvidia-smi

Software

CUDA Toolkit (≥11.0 recommended) - Installation Guide
CMake (≥3.18) - For building C/C++ projects
Python (≥3.11) - For Triton implementations
uv - Modern Python package manager (Installation)
Doxygen (optional) - For generating C/C++ documentation
jj (Jujutsu) (optional) - Version control (Installation)

Verify CUDA Installation

nvcc --version
nvidia-smi
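
The checks above can be bundled into one script that covers every tool in the Software list. This is only a sketch: the function name check_tool and the exact tool list are illustrative assumptions, not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): report which of the
# prerequisite tools are available on PATH.

check_tool() {
    # Prints "ok <name>" if the command exists, "missing <name>" otherwise.
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok $1"
    else
        echo "missing $1"
    fi
}

# Tool names taken from the Software list; doxygen and jj are optional.
for tool in nvcc nvidia-smi cmake python3 uv doxygen jj; do
    check_tool "$tool"
done
```

A "missing" line for doxygen or jj is harmless, since both are optional; a missing nvcc or cmake means the CUDA examples will not build.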

🚀 Getting Started

1. Clone the Repository

Using jj:

jj git clone <repository-url>
cd pmpp

Or with git:

git clone <repository-url>
cd pmpp

2. Python Setup with uv

# Install dependencies (Triton ≥3.5.0)
uv sync

# Verify installation
uv run python -c "import triton; print(triton.__version__)"

3. Building CUDA C/C++ Projects

Each CUDA project has its own CMakeLists.txt. Navigate to the project directory:

# Example: Building vector_add
cd src/cuda/vector_add
cmake -B build
cmake --build build

# Run the executable
./build/vector_add.out

Alternatively, an in-source build is shorter to type, though it writes CMake's generated files next to the sources:

cd src/cuda/vector_add
cmake .
make
./vector_add.out
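
Because each project carries its own CMakeLists.txt, a short loop can configure and build all of them in one pass. The following is a minimal sketch under two assumptions: it is run from the repository root, and every project is a direct subdirectory of src/cuda. The function name build_all is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): build every project under a
# root directory that has its own CMakeLists.txt, each in its own build/ dir.

build_all() {
    root="$1"
    for dir in "$root"/*/; do
        # Skip directories without a CMakeLists.txt (and an unmatched glob).
        [ -f "${dir}CMakeLists.txt" ] || continue
        echo "building ${dir%/}"
        # Out-of-source build, equivalent to "cmake -B build" run inside $dir.
        cmake -S "$dir" -B "${dir}build" && cmake --build "${dir}build" || return 1
    done
}

# Default to src/cuda when no directory is given on the command line.
build_all "${1:-src/cuda}"
```

The loop stops at the first project that fails to configure or build, so an error is not buried under later output.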

🏃 Running Examples

CUDA C/C++ Examples

cd src/cuda/vector_add
cmake -B build && cmake --build build
./build/vector_add.out

Python/Triton Examples (Coming Soon)

uv run python src/triton/example.py

📖 Documentation

Generating C/C++ API Documentation

# Generate HTML documentation
doxygen Doxyfile

# View in browser
firefox html/index.html
# or
xdg-open html/index.html

Doxygen reads the documentation comments in the CUDA source files, as configured in Doxyfile, and writes the HTML API documentation to html/.

🌿 Version Control with jj

This project uses jj (Jujutsu), a Git-compatible version control system, in place of the git command line. Basic commands:

# Describe the current change
jj describe          # Set the description of the working-copy change

# View history
jj log               # View commit graph

# Finish the current change
jj commit            # Record the working-copy change and start a new empty one

# Sync with remote
jj git push          # Push to git remote
jj git fetch         # Fetch from git remote

New to jj? Check out the Jujutsu Tutorial

📚 Learning Resources

Primary Resource

Book: Programming Massively Parallel Processors (4th Edition recommended)

Documentation

Supplementary Materials

CUDA by Example
GPU Gems Series
Chapter notes available in the notes/ directory

🎯 Learning Goals

✅ Understand GPU architecture and memory hierarchies
✅ Master CUDA programming fundamentals (kernels, threads, blocks, grids)
🔄 Learn advanced optimization techniques (memory coalescing, shared memory)
🔄 Explore Triton for high-level GPU programming
🔜 Compare CUDA and Triton approaches
🔜 Implement real-world parallel algorithms

Legend: ✅ Completed | 🔄 In Progress | 🔜 Upcoming

🗺️ Chapter Progress

Chapter	Topic	CUDA C/C++	Triton	Notes
1	Introduction	✅	-	✅
2	Heterogeneous Data Parallel Computing	✅	🔜	✅
3	Multidimensional Grids and Data	🔄	🔜	🔄
4	Compute Architecture and Scheduling	🔜	🔜	🔜
...	...	...	...	...

🤝 Contributing

This is a personal learning repository, but suggestions and corrections are welcome! Feel free to:

Open issues for questions or clarifications
Submit pull requests for bug fixes
Share alternative implementations

📝 License

This project is for educational purposes. Code implementations are based on exercises and examples from "Programming Massively Parallel Processors."

For academic use, please cite the original book:

Hwu, W. W., Kirk, D. B., & El Hajj, I. (2022).
Programming Massively Parallel Processors: A Hands-on Approach (4th ed.).
Morgan Kaufmann.

Built with: 🚀 CUDA • 🐍 Python • ⚡ Triton • 🛠️ CMake • 📦 uv • 📚 Doxygen • 🌿 jj

Happy parallel programming! 🎉
