This repository documents my hands-on exploration of reinforcement learning (RL) in the context of robotics, simulation, and control systems. It follows a structured 12-week learning plan that combines theoretical study with practical implementation, with the goal of mastering tools like Isaac Gym and deep reinforcement learning algorithms.
RL_Projects/
├── scripts/ # Training and evaluation scripts
├── models/ # Trained agents (e.g. PPO models)
├── logs/ # TensorBoard logs
├── envs/ # (Optional) custom env wrappers or configs
└── README.md # You're here!
🧠 Use the centralized scripts/, models/, and logs/ folders for running core environments like:
- LunarLander-v3
- HalfCheetah-v3
- CartPole-v1

This approach ensures consistent logging, versioning, and ease of comparison across algorithms.
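One way to keep run outputs comparable across algorithms is to derive every model and log path from the same few fields. The sketch below is only an illustration: the helper name `run_paths`, the `<env>/<algo>_seed<k>` naming scheme, and the `.pt` extension are assumptions, not the layout the scripts in this repository necessarily use.

```python
from pathlib import Path


def run_paths(env_id: str, algo: str, seed: int, root: Path = Path(".")) -> dict:
    """Return the model file and log directory for one training run.

    Hypothetical scheme: models/<env>/<algo>_seed<k>.pt and logs/<env>/<algo>_seed<k>/
    """
    run_name = f"{algo.lower()}_seed{seed}"
    return {
        "model": root / "models" / env_id / f"{run_name}.pt",
        "logs": root / "logs" / env_id / run_name,
    }


paths = run_paths("CartPole-v1", "PPO", seed=0)
print(paths["model"].as_posix())  # models/CartPole-v1/ppo_seed0.pt
print(paths["logs"].as_posix())   # logs/CartPole-v1/ppo_seed0
```

Because both paths share one run name, a TensorBoard log directory can always be matched to the checkpoint it belongs to.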
| Week | Project | Description |
|---|---|---|
| 1 | CartPole DQN | Deep Q-Learning agent that balances a pole on a 2D cart (1 DOF). Based on the PyTorch RL tutorial. |
| 2 | MuJoCo PPO/SAC | PPO and SAC applied to continuous control tasks like HalfCheetah and Walker2d. |
| 3 | Custom RL Baselines | Reimplementation of PPO and SAC from scratch with GAE and modular PyTorch code. |
| 4 | Robotic Arm Grasping (Isaac Gym) | Use PPO/SAC to teach a simulated Franka Panda arm to grasp a cube. |
| 5 | Cloud Training Deployment | Train agents remotely using AWS EC2 or Lambda Labs + Ray RLlib. |
| 6 | Humanoid Locomotion | Teach a humanoid robot to walk using Isaac Gym + PPO + curriculum learning. |
| 7 | Dexterous Manipulation | Use a ShadowHand model to manipulate objects with fine motor control. |
| 8 | RL Competition Entry | Train and submit an agent to an open RL competition (e.g. AIcrowd). |
| 9–12 | Final Project & Portfolio Polish | Showcase project inspired by Tesla Optimus: full-body locomotion + manipulation. |
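Week 3 mentions Generalized Advantage Estimation (GAE). As a rough, dependency-free sketch of the idea (not the code from the Week 3 folder), the advantage at each step is a discounted sum of TD errors, computed backwards over a rollout. The function name, argument names, and default values of `gamma` and `lam` below are illustrative assumptions.

```python
def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Return (advantages, returns) for one rollout.

    rewards[t], values[t]: reward received and value estimate V(s_t) at step t
    dones[t]: True if the episode ended after step t
    last_value: value estimate for the state following the final step
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    next_advantage = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - float(dones[t])
        # TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Recursion: A_t = delta_t + gamma * lam * A_{t+1}, cut at episode ends
        next_advantage = delta + gamma * lam * not_done * next_advantage
        advantages[t] = next_advantage
        next_value = values[t]
    # Value targets for the critic: advantage plus the baseline value
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


adv, ret = compute_gae(
    rewards=[1.0, 1.0], values=[0.0, 0.0], dones=[False, True],
    last_value=5.0, gamma=0.5, lam=1.0,
)
print(adv)  # [1.5, 1.0]
```

Setting `lam=0` reduces this to one-step TD errors, while `lam=1` gives discounted Monte Carlo returns minus the value baseline.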
- Deep RL: DQN, PPO, SAC, GAE
- Custom reward shaping, curriculum learning
- Isaac Gym (NVIDIA) for high-performance robotic sim
- Visual input + partial observability
- Cloud deployment & RLlib
- Portfolio & interview prep
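To make the reward-shaping item concrete, here is a minimal sketch of potential-based shaping, where a term `gamma * phi(s') - phi(s)` is added to the environment reward. The potential used here (negative gripper-to-cube distance, as might suit the grasping task) and all function names are hypothetical choices for illustration, not the shaping used in any project folder.

```python
import math


def potential(gripper_pos, cube_pos):
    """Hypothetical potential: negative Euclidean distance from gripper to cube."""
    return -math.dist(gripper_pos, cube_pos)


def shaped_reward(reward, pos, next_pos, cube_pos, gamma=0.99, done=False):
    """Add the shaping term gamma * phi(s') - phi(s) to the raw reward."""
    # The potential of a terminal state is taken as 0, a common convention.
    next_phi = 0.0 if done else potential(next_pos, cube_pos)
    return reward + gamma * next_phi - potential(pos, cube_pos)


# Moving from 2 m to 1 m away from the cube yields a positive bonus.
bonus = shaped_reward(0.0, (0, 0, 0), (0, 0, 1), cube_pos=(0, 0, 2), gamma=1.0)
print(bonus)  # 1.0
```

Shaping of this form rewards progress toward the cube at every step, which can help when the task reward itself is sparse.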
🏅Hackathon: Robo Innovate 2025 - 2nd place!
- Participated in robo.innovate 2025 at TUM in Munich, Germany: 4 days, team of 12, won the second main prize
- Challenge (posed & sponsored by BMW): use two Franka Research 3 robotic arms with 7 DOFs each to unpack car parts from unsealed plastic bags
- GitHub repo & pitch deck
- David Silver's Reinforcement Learning Course
  - Covers fundamentals: MDPs, dynamic programming, Monte Carlo, TD learning, Q-learning, policy gradients.
- Reinforcement Learning Specialization (University of Alberta - Coursera)
  - Focuses on practical implementation, bandits, prediction/control methods, and function approximation.
- Reinforcement Learning: An Introduction by Sutton & Barto
  - Primary reference for both the Silver and Alberta courses.
Each project folder includes sample output videos, training logs, and architecture diagrams. A short demo reel will be available after Week 12.
Coming soon:
- How I Built a Grasping Robot in Isaac Gym (Week 4)
- What Tesla Optimus Taught Me About RL (Final Project)
- Python 3.8+
- PyTorch
- Isaac Gym (Preview 4)
- MuJoCo / Gymnasium
- Ubuntu 22.04 (dual boot)
- CUDA 11.6
- wandb / Ray / RLlib
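A quick way to see which of the listed dependencies are present on a machine is to look up each package without importing it. This is a small sketch using only the standard library; the import names in `PACKAGES` are my assumption of how each listed tool is imported, and `check_setup` is a hypothetical helper rather than a script shipped in this repository.

```python
import importlib.util
import sys

# Assumed import names for the tools listed above
PACKAGES = ["torch", "gymnasium", "mujoco", "isaacgym", "wandb", "ray"]


def check_setup(packages=PACKAGES, min_python=(3, 8)):
    """Report whether Python is recent enough and which packages can be found."""
    report = {"python": sys.version_info >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


for name, ok in check_setup().items():
    print(f"{name:10s} {'found' if ok else 'MISSING'}")
```

The check only confirms that a package can be located. It does not verify versions, CUDA availability, or that Isaac Gym's native libraries load.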
I'm Jonas Petersen, an aspiring robotics & AI engineer with a background in mechanical engineering and embedded systems, and a passion for embodied intelligence and startups.
📬 Reach out: