Welcome to Rainforcement, a structured and interactive collection of Jupyter notebooks designed to guide learners through the core concepts of Reinforcement Learning (RL). This project emphasizes building an intuitive understanding of decision-making algorithms used in games and AI systems by implementing them from scratch.
This repository includes four hands-on exercises that gradually increase in complexity, ranging from basic planning methods to deep reinforcement learning.
Learn the foundations of reinforcement learning through one-step lookahead value iteration, a classic algorithm that, for each state, evaluates every available action and keeps the best expected value. You’ll:
- Implement a simple Markov Decision Process (MDP)
- Understand the Bellman equation and how policies evolve
- Evaluate and improve the policy until convergence
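As a rough illustration of the steps above, here is a minimal value iteration sketch in plain Python. The three-state chain MDP, the `TRANSITIONS` table, the discount `GAMMA`, and the function names are all assumptions made up for this example; they are not taken from the notebooks.

```python
# Hypothetical 3-state chain MDP (not from the notebooks): state 2 is terminal.
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    0: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 1, 0.0)]},
    1: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 2, 1.0)]},
    2: {},  # terminal state: no actions available
}
GAMMA = 0.9  # assumed discount factor


def one_step_lookahead(state, values):
    """Bellman backup: expected return of each action, looking one step ahead."""
    return {
        action: sum(p * (r + GAMMA * values[s2]) for p, s2, r in outcomes)
        for action, outcomes in TRANSITIONS[state].items()
    }


def value_iteration(tol=1e-8):
    """Repeat Bellman optimality backups until values change by less than tol."""
    values = {s: 0.0 for s in TRANSITIONS}
    while True:
        delta = 0.0
        for s in TRANSITIONS:
            q = one_step_lookahead(s, values)
            if not q:  # terminal state keeps value 0
                continue
            best = max(q.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy: pick the action with the highest one-step lookahead value.
    policy = {}
    for s in TRANSITIONS:
        q = one_step_lookahead(s, values)
        if q:
            policy[s] = max(q, key=q.get)
    return values, policy


values, policy = value_iteration()
print(values, policy)
```

In this toy chain the greedy policy moves right in both non-terminal states, since the only reward sits on the transition into the terminal state.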
Build upon the previous notebook by extending the concept to n-step lookahead methods. This enhances strategic planning over multiple future actions:
- Develop recursive planning strategies
- Learn how future rewards impact current decision making
- Implement rollouts and simulations for value estimation
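The recursive planning idea can be sketched as follows. This is only an illustration under stated assumptions: the five-cell corridor, the deterministic `step` model, `GAMMA`, and the helper names are hypothetical and not part of the notebooks. Rollout-based estimation is not shown here.

```python
GAMMA = 0.9                          # assumed discount factor
GOAL = 4                             # hypothetical corridor: cells 0..4, reward at cell 4
ACTIONS = {"left": -1, "right": +1}


def step(state, action):
    """Deterministic toy model: returns (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def lookahead_value(state, depth):
    """Best discounted return reachable within `depth` further steps."""
    if depth == 0:
        return 0.0
    return max(action_values(state, depth).values())


def action_values(state, depth):
    """Value of each first action when planning `depth` steps ahead."""
    values = {}
    for action in ACTIONS:
        next_state, reward, done = step(state, action)
        # Recurse on the remaining depth unless the episode has ended.
        future = 0.0 if done else lookahead_value(next_state, depth - 1)
        values[action] = reward + GAMMA * future
    return values


shallow = action_values(1, depth=1)  # goal is out of reach: actions look equal
deep = action_values(1, depth=3)     # goal is reachable: "right" looks better
print(shallow, deep)
```

Starting from cell 1, a one-step lookahead cannot distinguish the two actions, while a three-step lookahead sees the discounted goal reward and prefers moving right. That is the sense in which future rewards shape the current decision.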
Now it's time to play and test the environment using agents you’ve built! In this notebook, you:
- Interact with a grid-based game environment
- Visualize agent decisions based on current policy
- Evaluate the effectiveness of different planning strategies
Step into modern AI with Deep Q-Learning. This is where function approximation meets reinforcement learning:
- Train a neural network to approximate Q-values
- Implement experience replay and epsilon-greedy strategies
- Understand target networks and stability techniques
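The pieces listed above can be sketched without committing to a deep learning framework. The snippet below uses only the Python standard library; the class and function names (`ReplayBuffer`, `epsilon_greedy`, `td_target`) are hypothetical, and the Q-values are plain lists standing in for the outputs of an online network and a target network.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest entries are dropped first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, epsilon):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, done, gamma=0.99):
    """Regression target for Q(s, a).

    `next_q_values` is assumed to come from a separate, slowly updated
    target network, which keeps the target from shifting on every update.
    """
    return reward if done else reward + gamma * max(next_q_values)


buffer = ReplayBuffer(capacity=2)
for t in range(3):
    buffer.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
batch = buffer.sample(2)
greedy_action = epsilon_greedy([0.1, 0.5, 0.2], epsilon=0.0)
target = td_target(1.0, [0.5, 2.0], done=False, gamma=0.5)
```

In a full training loop, the network would be fit so that its prediction for each sampled `(state, action)` pair moves toward `td_target`, with the target network's weights copied from the online network at intervals.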
- Python 3
- NumPy
- OpenAI Gym (optional)
- Matplotlib
- PyTorch / TensorFlow (for deep learning notebook)
Clone the repository and open notebooks using Jupyter:
git clone https://github.com/rusiru-erandaka/Rainfrocement_Learning_and_GameAI.git
cd Rainfrocement_Learning_and_GameAI

- Students studying AI or machine learning
- Anyone learning reinforcement learning through implementation
- Developers wanting to understand planning algorithms
This project is licensed under the MIT License. Feel free to use and modify it for your own learning or teaching purposes.