"If you really learn all of these, you'll know 90% of what matters today" - Ilya Sutskever
Complete implementations • Interactive notebooks • Beginner-friendly • 100% Free
30u30 is an open-source study guide that takes you through the 30 foundational AI papers recommended by Ilya Sutskever — one paper per day.
Each paper comes with a full implementation from scratch, detailed notes, interactive exercises with solutions, and visualizations.
Think of it as Rustlings, but for deep learning fundamentals. You read the paper, understand the math, and build it yourself.
Day 1: "The Unreasonable Effectiveness of Recurrent Neural Networks"
- Character-level RNN from scratch (pure NumPy)
- Interactive Jupyter notebook with visualizations
- 5 progressive exercises + solutions
Start Day 1 →
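To give a taste of what Day 1 builds, here is a minimal sketch of one forward step of a character-level RNN in NumPy. The weight names (`Wxh`, `Whh`, `Why`) follow the common min-char-rnn convention; the shapes and values are illustrative, not the repo's actual code:

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, Why, bh, by):
    """One forward step of a vanilla character-level RNN.

    x      : one-hot input vector, shape (V,)
    h_prev : previous hidden state, shape (H,)
    Returns the new hidden state and unnormalized next-character scores.
    """
    h = np.tanh(Wxh @ x + Whh @ h_prev + bh)  # new hidden state
    y = Why @ h + by                          # scores over the vocabulary
    return h, y

# Tiny example: vocabulary of 4 characters, hidden size 8
rng = np.random.default_rng(0)
V, H = 4, 8
Wxh = rng.normal(0, 0.1, (H, V))
Whh = rng.normal(0, 0.1, (H, H))
Why = rng.normal(0, 0.1, (V, H))
bh, by = np.zeros(H), np.zeros(V)

x = np.eye(V)[2]  # one-hot vector for character index 2
h, y = rnn_step(x, np.zeros(H), Wxh, Whh, Why, bh, by)
```

Unrolling this single step over a text, plus a softmax and backprop through time, is essentially the whole Day 1 model.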
Day 2: "Understanding LSTM Networks"
- Complete LSTM with 4 gates from scratch
- Gate activation analysis & visualizations
- 5 exercises including LSTM vs GRU comparison
Start Day 2 →
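The four gates from Day 2 fit in a few lines. Below is a hedged sketch of a single LSTM step with the gate weights stacked into one matrix `W` (a common implementation trick; the names and shapes are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step; the four gates are stacked in W, shape (4H, D+H)."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0*H:1*H])   # forget gate: what to erase from the cell
    i = sigmoid(z[1*H:2*H])   # input gate: what to write
    g = np.tanh(z[2*H:3*H])   # candidate cell contents
    o = sigmoid(z[3*H:4*H])   # output gate: what to expose
    c = f * c_prev + i * g    # new cell state
    h = o * np.tanh(c)        # new hidden state
    return h, c

rng = np.random.default_rng(1)
D, H = 3, 5
W, b = rng.normal(0, 0.1, (4*H, D+H)), np.zeros(4*H)
h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, b)
```

The additive update `c = f * c_prev + i * g` is the "mechanics of memory": gradients flow through the cell state without being repeatedly squashed.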
Day 3: "RNN Regularization"
- Dropout, Layer Norm, Weight Decay, Early Stopping
- Complete regularization pipeline from scratch
- 5 exercises on preventing overfitting
Start Day 3 →
Day 4: "Minimizing Description Length"
- Bayesian / noisy-weight networks, MDL intuition
- Uncertainty envelopes, compression analysis, Pareto frontier
- 5 exercises that demonstrate gaps, beta tuning, MC inference
Start Day 4 →
Day 5: "MDL Principle Tutorial"
- Two-Part Codes, Prequential MDL, NML Complexity
- MDL vs AIC vs BIC comparison, compression analysis
- 5 exercises: from basic MDL to model selection showdown
Start Day 5 →
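The MDL vs. AIC vs. BIC comparison from Day 5 boils down to two textbook formulas. A minimal sketch (the likelihoods here are made-up numbers for illustration):

```python
import numpy as np

def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * np.log(n) - 2 * log_likelihood

# Two hypothetical fits to n = 100 points: a 2-parameter model and a
# 10-parameter model with a slightly better likelihood.
n = 100
scores = {
    "simple (k=2)":   (aic(-120.0, 2),  bic(-120.0, 2, n)),
    "complex (k=10)": (aic(-118.0, 10), bic(-118.0, 10, n)),
}
```

Both criteria charge the complex model more for its extra parameters than its likelihood gain is worth, which is the MDL trade-off in miniature.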
Day 6: "The First Law of Complexodynamics"
- Information equilibration, channel capacity, evolutionary dynamics
- Complete complexity evolution simulator with 7 visualizations
- 5 exercises: from Shannon entropy to real genome analysis
Start Day 6 →
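Day 6 starts from Shannon entropy, which is small enough to sketch directly (the genome-flavored example strings below are illustrative):

```python
import numpy as np

def shannon_entropy(seq):
    """Entropy in bits/symbol of the empirical distribution of seq."""
    _, counts = np.unique(list(seq), return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# A uniform 4-symbol string carries 2 bits per symbol; a constant one, 0.
h_uniform = shannon_entropy("ACGT" * 10)   # -> 2.0
h_const   = shannon_entropy("AAAA" * 10)   # -> 0.0
```

Entropy measures disorder, but the paper's point is that *interesting* complexity peaks somewhere between these two extremes.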
Day 7: "The Coffee Automaton"
- Cellular automaton complexity, heat diffusion, emergent behavior
- Chaos theory, Lyapunov exponents, information flow analysis
- 5 exercises: edge of chaos, pattern classification, neural network initialization
Start Day 7 →
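The coffee-automaton idea, cream mixing into coffee under local update rules, can be sketched as a toy nearest-neighbor averaging automaton (this is a simplified stand-in for the repo's simulator, with assumed grid sizes and a crude boundary treatment):

```python
import numpy as np

def diffuse(grid, steps=1):
    """Toy diffusion automaton: each cell averages with its 4 neighbors."""
    g = grid.astype(float)
    for _ in range(steps):
        p = np.pad(g, 1, mode="edge")  # replicate borders (no-flux walls)
        g = (p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
             + p[1:-1, :-2] + p[1:-1, 2:]) / 5.0
    return g

# Start fully separated: cream (1) on the top half, coffee (0) below.
grid = np.zeros((8, 8))
grid[:4, :] = 1.0
mixed = diffuse(grid, steps=50)
```

Entropy rises monotonically as the grid smooths out, but visually interesting structure (the swirling interface) appears and then vanishes in between, which is exactly the rise-and-fall of complexity the paper studies.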
Day 8: "ImageNet Classification with Deep CNNs (AlexNet)"
- The paper that sparked the deep learning revolution
- GPU-accelerated training, ReLU activations, dropout regularization
- 5 exercises: GPU impact analysis, activation functions, data augmentation
Start Day 8 →
Day 9: "Deep Residual Learning for Image Recognition (ResNet)"
- Skip connections that enable 100+ layer networks
- Identity mappings, residual blocks, gradient highway
- 5 exercises: vanishing gradients, skip connection ablation, depth analysis
Start Day 9 →
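The ResNet insight fits in one line: compute `y = x + F(x)` instead of `y = F(x)`. A minimal dense-layer sketch (the real blocks use convolutions and BatchNorm; weight scales here are illustrative):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, W2):
    """y = x + F(x): the skip connection gives gradients an identity path."""
    return x + W2 @ relu(W1 @ x)

rng = np.random.default_rng(2)
D = 16
W1 = rng.normal(0, 0.01, (D, D))
W2 = rng.normal(0, 0.01, (D, D))
x = rng.normal(size=D)
y = residual_block(x, W1, W2)
# With near-zero weights the block is close to the identity - the property
# that lets 100+ of them stack without destroying the signal.
```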
Day 10: "Identity Mappings in Deep Residual Networks (ResNet v2)"
- Pre-activation design: BN → ReLU → Conv
- Why order matters for 1000+ layer networks
- 5 exercises: pre vs post activation, information flow, extreme depth
Start Day 10 →
Day 11: "Multi-Scale Context Aggregation by Dilated Convolutions"
- Exponentially expanding receptive fields without pooling
- Dense prediction, semantic segmentation, WaveNet foundations
- 5 exercises: receptive field analysis, dilation patterns, context modules
Start Day 11 →
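The key arithmetic of Day 11, exponential receptive-field growth from doubling dilations, can be checked in a few lines (stride 1, 1-D case for simplicity):

```python
def receptive_field(kernel_size, dilations):
    """1-D receptive field of a stack of dilated convolutions (stride 1)."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d  # each layer widens the field by (k-1)*d
    return rf

# Doubling dilations: exponential context growth with linear depth
rf_dilated = receptive_field(3, [1, 2, 4, 8])   # -> 31
rf_plain   = receptive_field(3, [1, 1, 1, 1])   # -> 9
```

Four 3-wide layers see 31 positions with doubling dilations versus 9 without, and no pooling or resolution loss was needed.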
Day 12: "Dropout: A Simple Way to Prevent Overfitting"
- The standard regularization technique for neural networks
- Inverted dropout, MC Dropout for uncertainty, ensemble interpretation
- 5 exercises: implement dropout, rate sweep, spatial dropout, MC uncertainty
Start Day 12 →
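The "inverted" variant mentioned above is the one modern frameworks use; a minimal sketch (the rate and array sizes are illustrative):

```python
import numpy as np

def inverted_dropout(x, p_drop, rng):
    """Training-time inverted dropout: scale kept units by 1/(1-p)
    so the test-time forward pass needs no change at all."""
    keep = 1.0 - p_drop
    mask = (rng.random(x.shape) < keep) / keep
    return x * mask

rng = np.random.default_rng(3)
x = np.ones(100_000)
y = inverted_dropout(x, p_drop=0.5, rng=rng)
# E[y] equals x, so inference just skips this function entirely.
```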
Day 13: "Attention Is All You Need"
- The paper that revolutionized NLP and beyond - the Transformer
- Self-attention, multi-head attention, positional encoding
- 5 exercises: from scaled dot-product to full Transformer + interactive visualization
Start Day 13 →
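The core of Day 13 is one formula: softmax(QKᵀ/√d_k)V. A minimal NumPy sketch of scaled dot-product attention (single head, no masking; shapes are illustrative):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # query-key similarities
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ V, weights

rng = np.random.default_rng(4)
Q = rng.normal(size=(2, 8))  # 2 queries
K = rng.normal(size=(5, 8))  # 5 keys
V = rng.normal(size=(5, 8))  # 5 values
out, w = attention(Q, K, V)
```

Multi-head attention is just this function run h times on learned projections of Q, K, V, with the outputs concatenated.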
Day 14: "The Annotated Transformer"
- Code-level understanding of the Transformer - from math to PyTorch
- Production-quality implementation with all training infrastructure
- 5 exercises: attention, multi-head, encoder, training, inference
Start Day 14 →
Day 15: "Neural Machine Translation by Jointly Learning to Align and Translate"
- The original attention mechanism - before Transformers existed!
- Bahdanau (additive) attention, bidirectional encoder, alignment visualization
- 5 exercises: attention from scratch, encoder-decoder, beam search, visualization
Start Day 15 →
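Bahdanau's additive score, eᵢ = vᵀ tanh(W s + U hᵢ), contrasts with the dot-product score of Day 13. A hedged sketch of computing alignments and a context vector (dimension names and random values are illustrative):

```python
import numpy as np

def additive_attention(s, H, Wa, Ua, va):
    """Bahdanau-style attention: score each encoder state h_i against
    decoder state s, softmax into alignment weights, return the context."""
    e = np.tanh(Wa @ s + H @ Ua.T) @ va    # (T,) one score per encoder state
    a = np.exp(e - e.max())
    a /= a.sum()                           # alignment weights sum to 1
    return a @ H, a                        # weighted context vector, weights

rng = np.random.default_rng(5)
T, d, d_att = 6, 4, 8
Wa = rng.normal(size=(d_att, d))
Ua = rng.normal(size=(d_att, d))
va = rng.normal(size=d_att)
context, align = additive_attention(rng.normal(size=d),
                                    rng.normal(size=(T, d)), Wa, Ua, va)
```

Plotting `align` over decoding steps reproduces the famous alignment heatmaps from the paper.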
Day 16: "Order Matters: Sequence to Sequence for Sets"
- Pointer Networks - process sets, output sequences by pointing!
- Order-invariant encoding, Read-Process-Write framework
- 5 exercises: pointer attention, set encoder, sorting, convex hull, TSP
Start Day 16 →
Day 17: "Neural Turing Machines"
- Differentiable external memory for neural networks
- Addressing mechanics: Content-based, Interpolation, Shift, Sharpening
- 5 exercises: addressing logic, circular convolution, memory updates
Start Day 17 →
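The shift step of NTM addressing is a circular convolution of the attention weights with a small shift distribution. A minimal sketch over shifts {-1, 0, +1} (the repo's exercises generalize this):

```python
import numpy as np

def circular_shift(w, s):
    """NTM shift addressing: circularly convolve head weights w with a
    probability distribution s over shifts [-1, 0, +1]."""
    out = np.zeros(len(w))
    for shift, p in zip([-1, 0, 1], s):
        out += p * np.roll(w, shift)
    return out

w = np.array([0.0, 1.0, 0.0, 0.0])           # head focused on memory slot 1
shifted = circular_shift(w, [0.0, 0.0, 1.0])  # a certain shift of +1
# -> focus moves entirely to slot 2
```

Because the shift is a convex combination, the result is still a valid attention distribution, which is what keeps the whole memory controller differentiable.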
Day 18: "Pointer Networks"
- Networks that can "point" to their input (essential for combinatorial problems)
- Laser pointer attention, sampling without replacement, combinatorial optimization
- 5 exercises: pointer attention, convex hull formatting, TSP cost analysis
Start Day 18 →
Day 19: "Relational Reasoning"
- Pairwise object processing for VQA and physical reasoning
- g_theta and f_phi modules, set-based inductive bias
- 5 exercises: pair generation, sort-of-CLEVR logic, masking
Start Day 19 →
Day 20: "Relational Recurrent Neural Networks"
- Multi-head dot-product attention inside a recurrent cell (MHDPA)
- Relational memory core: memory slots interact via self-attention at each timestep
- 5 exercises: memory attention, slot interactions, sequence modeling
Start Day 20 →
Day 21: "Neural Message Passing for Quantum Chemistry"
- Unifying framework for graph neural networks: message, update, readout
- Edge networks, GRU update, Set2Set readout, QM9 benchmark
- 5 exercises: message functions, graph construction, property prediction
Start Day 21 →
Day 22: "Deep Speech 2: End-to-End Speech Recognition"
- End-to-end speech recognition replacing traditional ASR pipelines
- Conv + bidirectional GRU + CTC loss, sequence-wise BatchNorm, SortaGrad
- 5 exercises: spectrogram features, CTC decoding, RNN BatchNorm, curriculum learning, full pipeline
Start Day 22 →
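CTC's decoding rule, collapse repeats, then drop blanks, is simple enough to sketch in its greedy form (the label mapping below is made up for illustration; the repo covers beam search too):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Greedy CTC decoding: merge adjacent repeats, then remove blanks."""
    out, prev = [], None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out

# Per-frame argmax labels for "cat" (1=c, 2=a, 3=t, 0=blank):
frames = [1, 1, 0, 2, 2, 2, 0, 0, 3]
decoded = ctc_greedy_decode(frames)   # -> [1, 2, 3]
```

Note that only *adjacent* repeats merge: a blank between two identical labels, as in `[1, 0, 1]`, preserves a genuine double letter. This is why CTC needs the blank symbol at all.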
Day 23: "Variational Lossy Autoencoder"
- Curing posterior collapse in VAEs with powerful decoders
- Restricted receptive field (PixelCNN) + Inverse Autoregressive Flows (IAF)
- 5 exercises: from masked convolutions to full flow priors
Start Day 23 →
Day 24: "GPipe: Efficient Training of Giant Neural Networks"
- Pipeline parallelism + micro-batching + activation checkpointing
- Training giant 6B+ parameter models on limited hardware
- 5 exercises: from micro-batching to full pipeline integration
Start Day 24 →
Day 25: "Scaling Laws for Neural Language Models"
- The power-law relationships between model size, compute, and performance
- Scaling compute budget vs. model size vs. dataset size
- 5 exercises: scaling law calculations, compute-optimal training, dataset scaling
Start Day 25 →
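The headline result of Day 25 is a pure power law in parameter count. A sketch with constants roughly those reported by Kaplan et al. for the parameter-count law (treat them as illustrative, not exact):

```python
def scaling_loss(N, N_c=8.8e13, alpha=0.076):
    """Kaplan-style power law L(N) = (N_c / N)^alpha, valid when data
    and compute are not the bottleneck."""
    return (N_c / N) ** alpha

# Loss keeps falling smoothly as a power law in model size:
l_small = scaling_loss(1e8)    # 100M parameters
l_large = scaling_loss(1e10)   # 10B parameters
```

On a log-log plot this is a straight line of slope -alpha, which is why the paper can extrapolate across several orders of magnitude.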
Day 26: "Kolmogorov Complexity and Algorithmic Randomness"
- The mathematical bedrock of information theory: Compression = Intelligence
- From-scratch implementation of Huffman and Arithmetic coding
- 5 exercises: entropy comparison, NCD similarity clustering, and incompressibility
Start Day 26 →
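Kolmogorov complexity is uncomputable, but Day 26's NCD exercise approximates it with a real compressor. A minimal sketch using zlib as the stand-in for K(·) (the example strings are illustrative):

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: how much does knowing x help
    compress y? Smaller means more similar."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"pack my box with five dozen liquor jugs " * 20
similar = ncd(a, a)     # near 0: a string shares everything with itself
different = ncd(a, b)   # larger: unrelated strings share little
```

The same trick, swapping the ideal compressor for a practical one, powers the exercise on clustering texts by similarity with no features at all.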
Day 27: "Machine Super Intelligence (Shane Legg)"
- Universal Intelligence (Υ), Kolmogorov Complexity proxies, and the Agent-Environment loop
- Formal benchmarking of Random vs. RL vs. Predictive agents
- 5 exercises on Upsilon calculation, environment design, and complexity invariance
Start Day 27 →
Day 28: "CS231n: CNNs for Visual Recognition"
- Conv layers (naive + im2col), pooling, ReLU, FC — full CNN from scratch in NumPy
- VGGNet-16 parameter analysis, spatial dimension progression, architecture case studies
- 5 exercises: conv forward, pooling backprop, output sizes, parameter counting, feature viz
Start Day 28 →
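The output-size and parameter-counting exercises rest on one formula drilled throughout CS231n: a sketch of it, with VGG-style sanity checks:

```python
def conv_output_size(W, F, P, S):
    """Spatial output size of a conv/pool layer: (W - F + 2P) / S + 1."""
    out = (W - F + 2 * P) / S + 1
    assert out == int(out), "hyperparameters don't tile the input evenly"
    return int(out)

# 224x224 input, 3x3 conv, pad 1, stride 1 preserves spatial size:
assert conv_output_size(224, 3, 1, 1) == 224
# 2x2 max-pool with stride 2 halves it:
assert conv_output_size(224, 2, 0, 2) == 112
```

Chaining this function layer by layer reproduces the spatial-dimension progression of VGGNet-16 that the notes analyze.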
Day 29: "Proximal Policy Optimization (PPO)"
- The algorithm behind ChatGPT (RLHF)
- Clipped surrogate objective & GAE from scratch
- 5 exercises on policy constraints and stability
Start Day 29 →
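PPO's clipped surrogate objective is a one-liner once you have probability ratios and advantages. A minimal sketch (single-sample arrays for illustration; the real thing averages over a batch of rollouts):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate: mean of min(r*A, clip(r, 1-eps, 1+eps)*A).
    This is maximized; the training loss is its negative."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantage
    return np.minimum(unclipped, clipped).mean()

# A large ratio with positive advantage gets clipped - no runaway updates:
capped = ppo_clip_objective(np.array([5.0]), np.array([1.0]))   # -> 1.2
normal = ppo_clip_objective(np.array([1.1]), np.array([1.0]))   # -> 1.1
```

The clip removes the incentive to push the policy more than eps away from the one that collected the data, which is the whole stability story behind PPO (and, downstream, RLHF).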
Day 30: "Deep Reinforcement Learning from Human Feedback (RLHF)"
- Aligning AI with human preferences
- Reward Modeling, Preference Loss, and PPO integration
- Synthetic Oracle and full training loop from scratch
Start Day 30 →
This is the most comprehensive, beginner-friendly, open-source journey through the papers that defined modern AI. No paywalls. No gatekeeping. Just pure knowledge.
Whether you're pivoting to AI, a student, or a curious mind - this is your roadmap.
Each paper gets the full treatment:
- 📖 Deep-dive README - Complete explanations with real-world analogies
- 💡 ELI5 Notes - "Explain Like I'm 5" summaries
- 💻 Implementation - Clean, commented, CPU-friendly code
- 🎨 Visualizations - See the concepts come alive
- 🏋️ Exercises - Build it yourself (with solutions)
- 📓 Notebooks - Interactive Jupyter walkthroughs
- ⚡ Quick-start - Minimal training scripts that run in minutes
| Day | Paper | Status | Core Concept |
|---|---|---|---|
| 1 | The Unreasonable Effectiveness of RNNs | 🚀 LIVE | Why predicting = intelligence |
| 2 | Understanding LSTM Networks | 🚀 LIVE | The mechanics of memory |
| 3 | RNN Regularization | 🚀 LIVE | Making RNNs generalize |
| 4 | Minimizing Description Length | 🚀 LIVE | Compression = Intelligence |
| 5 | MDL Principle Tutorial | 🚀 LIVE | Math of compression |
| 6 | The First Law of Complexodynamics | 🚀 LIVE | Physics of complexity |
| 7 | The Coffee Automaton | 🚀 LIVE | Why intelligence exists |
Vision, depth, and the techniques that changed everything
| Day | Paper | Status | Core Concept |
|---|---|---|---|
| 8 | ImageNet Classification (AlexNet) | 🚀 LIVE | Deep learning revolution |
| 9 | Deep Residual Learning (ResNet) | 🚀 LIVE | Skip connections |
| 10 | Identity Mappings in ResNets | 🚀 LIVE | Pre-activation design |
| 11 | Multi-Scale Context (Dilated Conv) | 🚀 LIVE | Dilated convolutions |
| 12 | Dropout (Srivastava et al.) | 🚀 LIVE | Preventing overfitting |
The architecture that ate the world
| Day | Paper | Status | Core Concept |
|---|---|---|---|
| 13 | Attention Is All You Need | 🚀 LIVE | Self-attention, Transformer |
| 14 | The Annotated Transformer | 🚀 LIVE | Code-level Transformer |
| 15 | Bahdanau Attention (NMT) | 🚀 LIVE | Original attention mechanism |
| 16 | Order Matters (Pointer Networks) | 🚀 LIVE | Set-to-sequence problems |
Memory, graphs, and reasoning
| Day | Paper | Status | Core Concept |
|---|---|---|---|
| 17 | Neural Turing Machines | 🚀 LIVE | Differentiable external memory |
| 18 | Pointer Networks | 🚀 LIVE | Selecting input via attention |
| 19 | Relational Reasoning | 🚀 LIVE | Pairwise object relations; g_theta & f_phi modules |
| 20 | Relational RNNs | 🚀 LIVE | Self-attention inside recurrence |
| 21 | Neural Message Passing | 🚀 LIVE | MPNN framework for graph neural networks |
| 22 | Deep Speech 2 | 🚀 LIVE | End-to-end speech recognition with CTC |
From theory to massive models
| Day | Paper | Status | Core Concept |
|---|---|---|---|
| 23 | Variational Lossy Autoencoder | 🚀 LIVE | Curing posterior collapse with IAF |
| 24 | GPipe: Efficient Training of Giant Neural Networks | 🚀 LIVE | Pipeline parallelism |
| 25 | Scaling Laws for Neural Language Models | 🚀 LIVE | The physics of AI scaling |
| 26 | Kolmogorov Complexity | 🚀 LIVE | Math of compression & randomness |
| 27 | Machine Super Intelligence | 🚀 LIVE | Safety & intelligence definitions |
| 28 | CS231n: CNNs for Visual Recognition | 🚀 LIVE | CNN layers from scratch, VGGNet analysis |
From Policy Gradients to RLHF
| Day | Paper | Status | Core Concept |
|---|---|---|---|
| 29 | Proximal Policy Optimization (PPO) | 🚀 LIVE | The algorithm behind ChatGPT (RLHF) |
| 30 | Deep Reinforcement Learning from Human Feedback | 🚀 LIVE | The birth of "Human Feedback" (RLHF) |
The modern era of LLMs
| Paper | Status |
|---|---|
| BERT: Pre-training of Deep Bidirectional Transformers | Coming Soon |
| GPT-2: Language Models are Unsupervised Multitask Learners | Coming Soon |
| GPT-3: Language Models are Few-Shot Learners | Coming Soon |
| Chinchilla: Training Compute-Optimal Large Language Models | Coming Soon |
Complete paper list with links →
```bash
# Clone the repo
git clone https://github.com/yourusername/30u30.git
cd 30u30

# Start with Day 1
cd papers/01_Unreasonable_Effectiveness
```

For beginners: Basic Python knowledge. We'll teach you the rest.
For practitioners: Jump to any paper that interests you.
🎯 The 30-Day Challenge:
- One paper per day
- Read the README
- Run the code
- Complete exercises
- Share your progress with #30u30
🔀 Choose Your Path:
- Theory-First: README → Notes → Code
- Code-First: Notebook → Implementation → README
- Practice-First: Exercises → Solutions → Deep-dive
There are many paper summaries online. But this is different:
- You build everything - No "import magic_ai_library"
- Multiple learning paths - Theory-first, code-first, or interactive
- Production-quality - Code that actually works and teaches
- Beginner-friendly - Real-world analogies + rigorous math
- Community-driven - Your feedback shapes future days
Goal: The best free resource for learning AI fundamentals.
If this helps you, ⭐ star the repo and share it with others!
We'd love your help making this better!
- 🐛 Found a bug? Open an issue
- 💡 Have an idea? Open an issue with the "enhancement" label or start a discussion
- 📝 Want to contribute code? See CONTRIBUTING.md
Every contribution helps thousands of learners.
CC BY-NC-ND 4.0 — Free to read, learn, and share with attribution. Not for commercial use.
- Ilya Sutskever for the original reading list
- All paper authors for advancing the field
- You for taking this journey
- 🐦 Twitter: Share progress with #30u30
- 📧 Issues: Report bugs or request features
- 💬 Discussions: Join the conversation
- ⭐ Star the repo to stay updated on new releases!
Ready to start?
→ Day 1: Character-Level RNN
→ Day 2: Understanding LSTMs
→ Day 3: RNN Regularization
→ Day 4: Minimizing Description Length
→ Day 5: MDL Principle Tutorial
→ Day 6: The First Law of Complexodynamics
→ Day 7: The Coffee Automaton
→ Day 8: ImageNet Classification (AlexNet)
→ Day 9: Deep Residual Learning (ResNet)
→ Day 10: Identity Mappings (ResNet v2)
→ Day 11: Dilated Convolutions
→ Day 12: Dropout
→ Day 13: Attention Is All You Need
→ Day 14: The Annotated Transformer
→ Day 15: Bahdanau Attention (NMT)
→ Day 16: Order Matters (Pointer Networks)
→ Day 17: Neural Turing Machines
→ Day 18: Pointer Networks
→ Day 19: Relational Reasoning
→ Day 20: Relational RNNs
→ Day 21: Neural Message Passing
→ Day 22: Deep Speech 2
→ Day 23: Variational Lossy Autoencoder
→ Day 24: GPipe (Giant Neural Networks)
→ Day 25: Scaling Laws for Neural Language Models
→ Day 26: Kolmogorov Complexity
→ Day 27: Machine Super Intelligence
→ Day 28: CS231n — CNNs for Visual Recognition
→ Day 29: Proximal Policy Optimization (PPO)
→ Day 30: RLHF 🆕 ← Start here!
Let's build something amazing together! 🚀