30u30: Master Ilya's 30 Foundational AI Papers in 30 Days


"If you really learn all of these, you'll know 90% of what matters today" - Ilya Sutskever

Complete implementations • Interactive notebooks • Beginner-friendly • 100% Free

30u30 is an open-source study guide that takes you through the 30 foundational AI papers recommended by Ilya Sutskever — one paper per day. Each paper comes with a full implementation from scratch, detailed notes, interactive exercises with solutions, and visualizations.
Think of it as Rustlings, but for deep learning fundamentals. You read the paper, understand the math, and build it yourself.


🚀 All units are complete! Bonus papers are coming soon!

✅ Unit 1: The Foundations (Days 1-7) - COMPLETE!

Day 1: "The Unreasonable Effectiveness of Recurrent Neural Networks"

  • Character-level RNN from scratch (pure NumPy)
  • Interactive Jupyter notebook with visualizations
  • 5 progressive exercises + solutions
    Start Day 1 →
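
For a taste of what you'll implement, here is a minimal sketch of the character-level recurrence (sizes and weight names are illustrative, not the repo's actual code):

```python
import numpy as np

vocab_size, hidden_size = 65, 100  # illustrative sizes
rng = np.random.default_rng(0)
Wxh = rng.normal(0, 0.01, (hidden_size, vocab_size))   # input -> hidden
Whh = rng.normal(0, 0.01, (hidden_size, hidden_size))  # hidden -> hidden
Why = rng.normal(0, 0.01, (vocab_size, hidden_size))   # hidden -> output
bh, by = np.zeros(hidden_size), np.zeros(vocab_size)

def rnn_step(char_id, h):
    """One step: one-hot character in, new hidden state and next-char logits out."""
    x = np.zeros(vocab_size)
    x[char_id] = 1.0
    h = np.tanh(Wxh @ x + Whh @ h + bh)  # the entire recurrence
    return h, Why @ h + by               # hidden state, unnormalized scores

h = np.zeros(hidden_size)
h, logits = rnn_step(42, h)  # feed one character id
```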

Day 2: "Understanding LSTM Networks"

  • Complete LSTM with 4 gates from scratch
  • Gate activation analysis & visualizations
  • 5 exercises including LSTM vs GRU comparison
    Start Day 2 →
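
As a preview, a minimal sketch of one LSTM step with all four gates. The layout (one stacked weight matrix, illustrative sizes) is an assumption, not the repo's exact code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """One LSTM step; the four gates come from [h; x] via a single matmul.
    W: (4*H, H+D), b: (4*H,). Gate order here: forget, input, candidate, output."""
    H = h.shape[0]
    z = W @ np.concatenate([h, x]) + b
    f = sigmoid(z[:H])        # forget gate: what to erase from the cell
    i = sigmoid(z[H:2*H])     # input gate: what to write
    g = np.tanh(z[2*H:3*H])   # candidate values
    o = sigmoid(z[3*H:])      # output gate: what to expose
    c = f * c + i * g         # new cell state
    h = o * np.tanh(c)        # new hidden state
    return h, c

D, H = 8, 16  # illustrative sizes
rng = np.random.default_rng(0)
h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H),
                 rng.normal(0, 0.1, (4*H, H+D)), np.zeros(4*H))
```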

Day 3: "RNN Regularization"

  • Dropout, Layer Norm, Weight Decay, Early Stopping
  • Complete regularization pipeline from scratch
  • 5 exercises on preventing overfitting
    Start Day 3 →

Day 4: "Minimizing Description Length"

  • Bayesian / Noisy-Weight networks, MDL intuition
  • Uncertainty envelopes, compression analysis, Pareto frontier
  • 5 exercises covering gaps, beta tuning, and MC inference
    Start Day 4 →

Day 5: "MDL Principle Tutorial"

  • Two-Part Codes, Prequential MDL, NML Complexity
  • MDL vs AIC vs BIC comparison, compression analysis
  • 5 exercises: from basic MDL to model selection showdown
    Start Day 5 →
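
For intuition, a crude two-part code for a binary string: spend about log2(n+1) bits to state the model (the empirical rate of ones), then n·H(p̂) bits for the data under it. A toy sketch, not the tutorial's method:

```python
import numpy as np

def two_part_codelength(bits):
    """Crude two-part MDL: bits to describe the model (which of n+1 possible
    counts of ones) plus bits to encode the data given that model."""
    n, k = len(bits), int(np.sum(bits))
    p = k / n
    if p in (0.0, 1.0):
        data_bits = 0.0  # data is fully determined by the model
    else:
        data_bits = n * -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return np.log2(n + 1) + data_bits

rng = np.random.default_rng(0)
print(two_part_codelength(np.zeros(1000)))            # ~10 bits: highly compressible
print(two_part_codelength(rng.integers(0, 2, 1000)))  # ~1000 bits: incompressible
```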

Day 6: "The First Law of Complexodynamics"

  • Information equilibration, channel capacity, evolutionary dynamics
  • Complete complexity evolution simulator with 7 visualizations
  • 5 exercises: from Shannon entropy to real genome analysis
    Start Day 6 →
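
Order-0 Shannon entropy, the starting point of the exercises, fits in a few lines (a minimal sketch; the example strings are made up):

```python
from collections import Counter
import math

def shannon_entropy(s):
    """Bits per symbol under the empirical (order-0) symbol distribution."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

print(shannon_entropy("aaaaaaaa"))      # 0.0: no surprise at all
print(shannon_entropy("abcdefgh"))      # 3.0: 8 equiprobable symbols
print(shannon_entropy("ACGTACGTGGCA"))  # in between: skewed 4-letter alphabet
```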

Day 7: "The Coffee Automaton"

  • Cellular automaton complexity, heat diffusion, emergent behavior
  • Chaos theory, Lyapunov exponents, information flow analysis
  • 5 exercises: edge of chaos, pattern classification, neural network initialization
    Start Day 7 →

✅ Unit 2: Deep Learning Explosion (Vision & Architectures, Days 8-12) - COMPLETE!

Day 8: "ImageNet Classification with Deep CNNs (AlexNet)"

  • The paper that sparked the deep learning revolution
  • GPU-accelerated training, ReLU activations, dropout regularization
  • 5 exercises: GPU impact analysis, activation functions, data augmentation
    Start Day 8 →

Day 9: "Deep Residual Learning for Image Recognition (ResNet)"

  • Skip connections that enable 100+ layer networks
  • Identity mappings, residual blocks, gradient highway
  • 5 exercises: vanishing gradients, skip connection ablation, depth analysis
    Start Day 9 →
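
The whole idea is y = F(x) + x. A minimal NumPy sketch (toy two-layer F, illustrative sizes) showing the block defaulting to the identity when the weights are small:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, W2):
    """y = F(x) + x: the shortcut adds the input back, so even if training
    drives F toward zero, the block can still pass x through unchanged."""
    return relu(x @ W1) @ W2 + x  # F(x) = W2 * relu(W1 * x), plus shortcut

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 64))
W1, W2 = rng.normal(0, 1e-3, (64, 64)), rng.normal(0, 1e-3, (64, 64))
y = residual_block(x, W1, W2)
print(np.allclose(y, x, atol=1e-2))  # True: tiny weights -> block ~ identity
```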

Day 10: "Identity Mappings in Deep Residual Networks (ResNet v2)"

  • Pre-activation design: BN → ReLU → Conv
  • Why order matters for 1000+ layer networks
  • 5 exercises: pre vs post activation, information flow, extreme depth
    Start Day 10 →

Day 11: "Multi-Scale Context Aggregation by Dilated Convolutions"

  • Exponentially expanding receptive fields without pooling
  • Dense prediction, semantic segmentation, WaveNet foundations
  • 5 exercises: receptive field analysis, dilation patterns, context modules
    Start Day 11 →
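
The receptive-field arithmetic is easy to check yourself: each 3x3 layer extends the field by (k - 1) * dilation. A sketch of the growth rule (the dilation schedules are illustrative):

```python
def receptive_field(dilations, kernel=3):
    """Receptive field of stacked 1D/2D convolutions, one dilation per layer."""
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d  # each layer adds (k-1)*dilation to the field
    return rf

print(receptive_field([1, 1, 1, 1, 1]))   # plain convs: 11 (linear growth)
print(receptive_field([1, 2, 4, 8, 16]))  # doubling dilations: 63 (exponential)
```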

Day 12: "Dropout: A Simple Way to Prevent Overfitting"

  • The standard regularization technique for neural networks
  • Inverted dropout, MC Dropout for uncertainty, ensemble interpretation
  • 5 exercises: implement dropout, rate sweep, spatial dropout, MC uncertainty
    Start Day 12 →
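
A minimal sketch of inverted dropout (names and sizes are illustrative): scale the survivors by 1/(1-p) at train time so the test-time forward pass needs no change at all:

```python
import numpy as np

def inverted_dropout(x, p_drop, rng, train=True):
    """Inverted dropout: zero out units with probability p_drop and rescale
    the rest, keeping the expected activation equal to x at train time."""
    if not train:
        return x  # test time is just the identity
    mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)
    return x * mask

rng = np.random.default_rng(0)
x = np.ones((2, 8))
print(inverted_dropout(x, 0.5, rng))               # survivors scaled to 2.0
print(inverted_dropout(x, 0.5, rng, train=False))  # unchanged at test time
```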

✅ Unit 3: The Transformer Era (Days 13-16) - COMPLETE!

Day 13: "Attention Is All You Need"

  • The paper that revolutionized NLP and beyond - the Transformer
  • Self-attention, multi-head attention, positional encoding
  • 5 exercises: from scaled dot-product to full Transformer + interactive visualization
    Start Day 13 →
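
The core operation fits in a dozen lines. A minimal NumPy sketch of scaled dot-product attention (single head, no mask; shapes are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — the heart of the Transformer."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # stable softmax over keys
    return weights @ V                                # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 64)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)  # shape (5, 64)
```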

Day 14: "The Annotated Transformer"

  • Code-level understanding of the Transformer - from math to PyTorch
  • Production-quality implementation with all training infrastructure
  • 5 exercises: attention, multi-head, encoder, training, inference
    Start Day 14 →

Day 15: "Neural Machine Translation by Jointly Learning to Align and Translate"

  • The original attention mechanism - before Transformers existed!
  • Bahdanau (additive) attention, bidirectional encoder, alignment visualization
  • 5 exercises: attention from scratch, encoder-decoder, beam search, visualization
    Start Day 15 →
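
A minimal sketch of the additive scoring function e_j = v^T tanh(W_s s + W_h h_j) for one decoder step (matrix names and sizes are illustrative):

```python
import numpy as np

def additive_attention(s, H, W_s, W_h, v):
    """Bahdanau attention: score every encoder state against the decoder
    state, softmax into alignment weights, return the context vector."""
    e = np.tanh(W_s @ s + H @ W_h.T) @ v   # one score per source position
    a = np.exp(e - e.max())
    a /= a.sum()                           # alignment distribution over source
    return a @ H, a                        # context vector, weights

rng = np.random.default_rng(0)
T, d = 7, 32                                        # source length, state size
s, H = rng.normal(size=d), rng.normal(size=(T, d))  # decoder state, encoder states
ctx, align = additive_attention(s, H, rng.normal(size=(d, d)),
                                rng.normal(size=(d, d)), rng.normal(size=d))
```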

Day 16: "Order Matters: Sequence to Sequence for Sets"

  • Pointer Networks - process sets, output sequences by pointing!
  • Order-invariant encoding, Read-Process-Write framework
  • 5 exercises: pointer attention, set encoder, sorting, convex hull, TSP
    Start Day 16 →

✅ Unit 4: Specialized Architectures (Days 17-22) - COMPLETE!

Day 17: "Neural Turing Machines"

  • Differentiable external memory for neural networks
  • Addressing mechanisms: content-based, interpolation, shift, sharpening
  • 5 exercises: addressing logic, circular convolution, memory updates
    Start Day 17 →
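
A minimal sketch of the content-based addressing step: cosine similarity between a key and each memory row, sharpened by key strength beta, then softmax (sizes are illustrative):

```python
import numpy as np

def content_addressing(M, key, beta):
    """NTM content-based read weights over memory rows."""
    sim = M @ key / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
    w = np.exp(beta * sim)   # beta sharpens or flattens the focus
    return w / w.sum()

rng = np.random.default_rng(0)
M = rng.normal(size=(8, 16))              # 8 memory slots of width 16
w = content_addressing(M, M[3], beta=10.0)
print(w.argmax())                          # 3: the key matches slot 3
```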

Day 18: "Pointer Networks"

  • Networks that can "point" to their input (essential for combinatorial problems)
  • Laser pointer attention, sampling without replacement, combinatorial optimization
  • 5 exercises: pointer attention, convex hull formatting, TSP cost analysis
    Start Day 18 →

Day 19: "Relational Reasoning"

  • Pairwise object processing for VQA and physical reasoning
  • g_theta and f_phi modules, set-based inductive bias
  • 5 exercises: pair generation, sort-of-CLEVR logic, masking
    Start Day 19 →
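
The pair construction at the heart of a Relation Network is just broadcasting. A minimal sketch (toy feature sizes):

```python
import numpy as np

def all_pairs(objects):
    """Build every ordered pair (o_i, o_j) — the input to g_theta,
    whose outputs are summed and passed to f_phi."""
    n, d = objects.shape
    left = np.repeat(objects, n, axis=0)           # o_1,o_1,...,o_2,o_2,...
    right = np.tile(objects, (n, 1))               # o_1,o_2,...,o_1,o_2,...
    return np.concatenate([left, right], axis=1)   # (n*n, 2*d)

objs = np.arange(12.0).reshape(4, 3)  # 4 objects, 3 features each
print(all_pairs(objs).shape)          # (16, 6)
```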

Day 20: "Relational Recurrent Neural Networks"

  • Multi-head dot-product attention inside a recurrent cell (MHDPA)
  • Relational memory core: memory slots interact via self-attention at each timestep
  • 5 exercises: memory attention, slot interactions, sequence modeling
    Start Day 20 →

Day 21: "Neural Message Passing for Quantum Chemistry"

  • Unifying framework for graph neural networks: message, update, readout
  • Edge networks, GRU update, Set2Set readout, QM9 benchmark
  • 5 exercises: message functions, graph construction, property prediction
    Start Day 21 →
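
A minimal sketch of one message-passing round with the simplest possible choices (linear messages, sum aggregation, tanh update); the paper's edge networks and GRU update are richer:

```python
import numpy as np

def message_passing_step(h, edges, W_msg, W_upd):
    """One MPNN round: each node sums linear messages from its neighbors,
    then updates its state from [h; m]."""
    m = np.zeros_like(h)
    for u, v in edges:          # messages flow along both directions
        m[v] += h[u] @ W_msg
        m[u] += h[v] @ W_msg
    return np.tanh(np.concatenate([h, m], axis=1) @ W_upd)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))           # 4 atoms, 8 features each
edges = [(0, 1), (1, 2), (2, 3)]      # a small chain "molecule"
h = message_passing_step(h, edges, rng.normal(0, 0.1, (8, 8)),
                         rng.normal(0, 0.1, (16, 8)))
```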

Day 22: "Deep Speech 2: End-to-End Speech Recognition"

  • End-to-end speech recognition replacing traditional ASR pipelines
  • Conv + bidirectional GRU + CTC loss, sequence-wise BatchNorm, SortaGrad
  • 5 exercises: spectrogram features, CTC decoding, RNN BatchNorm, curriculum learning, full pipeline
    Start Day 22 →

✅ Unit 5: Generative Models & Scaling (Days 23-28) - COMPLETE!

Day 23: "Variational Lossy Autoencoder"

  • Curing posterior collapse in VAEs with powerful decoders
  • Restricted receptive field (PixelCNN) + Inverse Autoregressive Flows (IAF)
  • 5 exercises: from masked convolutions to full flow priors
    Start Day 23 →

Day 24: "GPipe: Efficient Training of Giant Neural Networks"

  • Pipeline parallelism + micro-batching + activation checkpointing
  • Training giant 6B+ parameter models on limited hardware
  • 5 exercises: from micro-batching to full pipeline integration
    Start Day 24 →

Day 25: "Scaling Laws for Neural Language Models"

  • The power-law relationships between model size, compute, and performance
  • Scaling compute budget vs. model size vs. dataset size
  • 5 exercises: scaling law calculations, compute-optimal training, dataset scaling
    Start Day 25 →
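
The headline result has the form L(N) = (N_c / N)^alpha. A toy sketch; the constants below are roughly the magnitude the paper reports for non-embedding parameters, so treat them as illustrative:

```python
# Power-law loss curve of the Kaplan et al. form L(N) = (N_c / N)^alpha.
N_c, alpha = 8.8e13, 0.076  # illustrative constants, not exact fitted values

def loss_from_params(N):
    """Predicted loss as a function of (non-embedding) parameter count N."""
    return (N_c / N) ** alpha

for N in (1e6, 1e8, 1e10):
    print(f"{N:.0e} params -> loss {loss_from_params(N):.2f}")
# Each 100x in parameters buys a roughly constant multiplicative improvement.
```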

Day 26: "Kolmogorov Complexity and Algorithmic Randomness"

  • The mathematical bedrock of information theory: Compression = Intelligence
  • From-scratch implementation of Huffman and Arithmetic coding
  • 5 exercises: entropy comparison, NCD similarity clustering, and incompressibility
    Start Day 26 →
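
A minimal sketch of Normalized Compression Distance, with zlib standing in for the uncomputable K(x) (the example strings are made up):

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed length under a real compressor."""
    cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog" * 20
b = b"the quick brown fox leaps over the lazy cat" * 20
c = bytes(range(256)) * 4
print(ncd(a, b))  # small: similar strings compress well together
print(ncd(a, c))  # larger: unrelated inputs share little structure
```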

Day 27: "Machine Super Intelligence (Shane Legg)"

  • Universal Intelligence (Υ), Kolmogorov Complexity proxies, and the Agent-Environment loop
  • Formal benchmarking of Random vs. RL vs. Predictive agents
  • 5 exercises on Upsilon calculation, environment design, and complexity invariance
    Start Day 27 →

Day 28: "CS231n: CNNs for Visual Recognition"

  • Conv layers (naive + im2col), pooling, ReLU, FC — full CNN from scratch in NumPy
  • VGGNet-16 parameter analysis, spatial dimension progression, architecture case studies
  • 5 exercises: conv forward, pooling backprop, output sizes, parameter counting, feature viz
    Start Day 28 →
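
The sizing formula alone saves a lot of debugging. A minimal sketch (the AlexNet and VGG numbers are the standard textbook cases):

```python
def conv_output_size(W, F, P, S):
    """CS231n's sizing formula: output width = (W - F + 2P) / S + 1."""
    out = (W - F + 2 * P) / S + 1
    assert out == int(out), "hyperparameters don't tile the input evenly"
    return int(out)

print(conv_output_size(W=227, F=11, P=0, S=4))  # AlexNet conv1: 55
print(conv_output_size(W=224, F=3,  P=1, S=1))  # VGG 3x3 'same' conv: 224
```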

✅ Unit 6: Deep Reinforcement Learning & Alignment (Days 29-30) - COMPLETE!

Day 29: "Proximal Policy Optimization (PPO)"

  • The algorithm behind ChatGPT (RLHF)
  • Clipped surrogate objective & GAE from scratch
  • 5 exercises on policy constraints and stability
    Start Day 29 →
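
A minimal sketch of the clipped surrogate objective (toy ratios and advantages; no GAE, value, or entropy terms):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Take the minimum of the unclipped and clipped terms, so the policy
    gains nothing by pushing the probability ratio outside [1-eps, 1+eps]."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantage
    return np.minimum(unclipped, clipped).mean()

ratio = np.array([0.5, 1.0, 1.5])      # pi_new / pi_old for three actions
adv = np.array([1.0, 1.0, 1.0])        # positive advantages
print(ppo_clip_objective(ratio, adv))  # 0.9: the 1.5 ratio is capped at 1.2
```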

Day 30: "Deep Reinforcement Learning from Human Feedback (RLHF)"

  • Aligning AI with human preferences
  • Reward Modeling, Preference Loss, and PPO integration
  • Synthetic Oracle and full training loop from scratch
    Start Day 30 →
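
A minimal sketch of the Bradley-Terry preference loss used for reward modeling (scalar toy rewards; the full pipeline wires this into PPO):

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """-log sigmoid(r_chosen - r_rejected): minimized when the reward model
    scores the human-preferred response higher than the rejected one."""
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

print(preference_loss(2.0, 0.0))  # ~0.13: model agrees with the label
print(preference_loss(0.0, 2.0))  # ~2.13: model disagrees -> large loss
```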

🎯 Mission

This is the most comprehensive, beginner-friendly, open-source journey through the papers that defined modern AI. No paywalls. No gatekeeping. Just pure knowledge.

Whether you're a professional pivoting to AI, a student, or just curious, this is your roadmap.

📚 What You'll Find Here

Each paper gets the full treatment:

  • 📖 Deep-dive README - Complete explanations with real-world analogies
  • 💡 ELI5 Notes - "Explain Like I'm 5" summaries
  • 💻 Implementation - Clean, commented, CPU-friendly code
  • 🎨 Visualizations - See the concepts come alive
  • 🏋️ Exercises - Build it yourself (with solutions)
  • 📓 Notebooks - Interactive Jupyter walkthroughs
  • ⚡ Quick-start - Minimal training scripts that run in minutes

🗺️ The Journey

Unit 1: The Foundations (Days 1-7) - ✅ COMPLETE

| Day | Paper | Status | Core Concept |
|-----|-------|--------|--------------|
| 1 | The Unreasonable Effectiveness of RNNs | 🚀 LIVE | Why predicting = intelligence |
| 2 | Understanding LSTM Networks | 🚀 LIVE | The mechanics of memory |
| 3 | RNN Regularization | 🚀 LIVE | Making RNNs generalize |
| 4 | Minimizing Description Length | 🚀 LIVE | Compression = Intelligence |
| 5 | MDL Principle Tutorial | 🚀 LIVE | Math of compression |
| 6 | The First Law of Complexodynamics | 🚀 LIVE | Physics of complexity |
| 7 | The Coffee Automaton | 🚀 LIVE | Why intelligence exists |

Unit 2: Deep Learning Explosion (Days 8-12) - ✅ COMPLETE

Vision, depth, and the techniques that changed everything

| Day | Paper | Status | Core Concept |
|-----|-------|--------|--------------|
| 8 | ImageNet Classification (AlexNet) | 🚀 LIVE | Deep learning revolution |
| 9 | Deep Residual Learning (ResNet) | 🚀 LIVE | Skip connections |
| 10 | Identity Mappings in ResNets | 🚀 LIVE | Pre-activation design |
| 11 | Multi-Scale Context (Dilated Conv) | 🚀 LIVE | Dilated convolutions |
| 12 | Dropout (Srivastava et al.) | 🚀 LIVE | Preventing overfitting |

Unit 3: The Transformer Era (Days 13-16) - ✅ COMPLETE

The architecture that ate the world

| Day | Paper | Status | Core Concept |
|-----|-------|--------|--------------|
| 13 | Attention Is All You Need | 🚀 LIVE | Self-attention, Transformer |
| 14 | The Annotated Transformer | 🚀 LIVE | Code-level Transformer |
| 15 | Bahdanau Attention (NMT) | 🚀 LIVE | Original attention mechanism |
| 16 | Order Matters (Pointer Networks) | 🚀 LIVE | Set-to-sequence problems |

Unit 4: Specialized Architectures (Days 17-22) - ✅ COMPLETE

Memory, graphs, and reasoning

| Day | Paper | Status | Core Concept |
|-----|-------|--------|--------------|
| 17 | Neural Turing Machines | 🚀 LIVE | Differentiable external memory |
| 18 | Pointer Networks | 🚀 LIVE | Selecting input via attention |
| 19 | Relational Reasoning | 🚀 LIVE | Pairwise object relations; g_theta & f_phi modules |
| 20 | Relational RNNs | 🚀 LIVE | Self-attention inside recurrence |
| 21 | Neural Message Passing | 🚀 LIVE | MPNN framework for graph neural networks |
| 22 | Deep Speech 2 | 🚀 LIVE | End-to-end speech recognition with CTC |

Unit 5: Generative Models & Scaling (Days 23-28) - ✅ COMPLETE

From theory to massive models

| Day | Paper | Status | Core Concept |
|-----|-------|--------|--------------|
| 23 | Variational Lossy Autoencoder | 🚀 LIVE | Curing posterior collapse with IAF |
| 24 | GPipe: Efficient Training of Giant Neural Networks | 🚀 LIVE | Pipeline parallelism |
| 25 | Scaling Laws for Neural Language Models | 🚀 LIVE | The physics of AI scaling |
| 26 | Kolmogorov Complexity | 🚀 LIVE | Math of compression & randomness |
| 27 | Machine Super Intelligence | 🚀 LIVE | Safety & intelligence definitions |
| 28 | CS231n: CNNs for Visual Recognition | 🚀 LIVE | CNN layers from scratch, VGGNet analysis |

Unit 6: Deep Reinforcement Learning & Alignment (Days 29-30) - ✅ COMPLETE

From Policy Gradients to RLHF

| Day | Paper | Status | Core Concept |
|-----|-------|--------|--------------|
| 29 | Proximal Policy Optimization (PPO) | 🚀 LIVE | The algorithm behind ChatGPT (RLHF) |
| 30 | Deep Reinforcement Learning from Human Feedback | 🚀 LIVE | The birth of RLHF |

💻⏭️ Bonus Papers: The Language Model Revolution (Coming Soon)

The modern era of LLMs

| Paper | Status |
|-------|--------|
| BERT: Pre-training of Deep Bidirectional Transformers | Coming Soon |
| GPT-2: Language Models are Unsupervised Multitask Learners | Coming Soon |
| GPT-3: Language Models are Few-Shot Learners | Coming Soon |
| Chinchilla: Training Compute-Optimal Large Language Models | Coming Soon |

Complete paper list with links →

⚡ Quick Start

```bash
# Clone the repo
git clone https://github.com/Ojhaharsh/30u30.git
cd 30u30

# Start with Day 1
cd papers/01_Unreasonable_Effectiveness
```

Prerequisites

For beginners: Basic Python knowledge. We'll teach you the rest.

For practitioners: Jump to any paper that interests you.

🎓 How to Use This Repo

🎯 The 30-Day Challenge:

  • One paper per day
  • Read the README
  • Run the code
  • Complete exercises
  • Share your progress with #30u30

🔀 Choose Your Path:

  • Theory-First: README → Notes → Code
  • Code-First: Notebook → Implementation → README
  • Practice-First: Exercises → Solutions → Deep-dive

🌟 Why This Project?

There are many paper summaries online. But this is different:

  1. You build everything - No "import magic_ai_library"
  2. Multiple learning paths - Theory-first, code-first, or interactive
  3. Production-quality - Code that actually works and teaches
  4. Beginner-friendly - Real-world analogies + rigorous math
  5. Community-driven - Your feedback shapes future days

Goal: The best free resource for learning AI fundamentals.

If this helps you, ⭐ star the repo and share it with others!


🤝 Contributing

We'd love your help making this better!

Every contribution helps thousands of learners.

📜 License

CC BY-NC-ND 4.0 — Free to read, learn, and share with attribution. Not for commercial use.

🙏 Acknowledgments

  • Ilya Sutskever for the original reading list
  • All paper authors for advancing the field
  • You for taking this journey


Ready to start?
Day 1: Character-Level RNN
Day 2: Understanding LSTMs
Day 3: RNN Regularization
Day 4: Minimizing Description Length
Day 5: MDL Principle Tutorial
Day 6: The First Law of Complexodynamics
Day 7: The Coffee Automaton
Day 8: ImageNet Classification (AlexNet)
Day 9: Deep Residual Learning (ResNet)
Day 10: Identity Mappings (ResNet v2)
Day 11: Dilated Convolutions
Day 12: Dropout
Day 13: Attention Is All You Need
Day 14: The Annotated Transformer
Day 15: Bahdanau Attention (NMT)
Day 16: Order Matters (Pointer Networks)
Day 17: Neural Turing Machines
Day 18: Pointer Networks
Day 19: Relational Reasoning
Day 20: Relational RNNs
Day 21: Neural Message Passing
Day 22: Deep Speech 2
Day 23: Variational Lossy Autoencoder
Day 24: GPipe (Giant Neural Networks)
Day 25: Scaling Laws for Neural Language Models
Day 26: Kolmogorov Complexity
Day 27: Machine Super Intelligence
Day 28: CS231n — CNNs for Visual Recognition
Day 29: Proximal Policy Optimization (PPO)
Day 30: RLHF 🆕 ← Start here!

Let's build something amazing together! 🚀
