This project implements a Reinforcement Learning (RL) approach to control a building's radiator system, balancing energy savings and occupant comfort. The goal is to minimize electricity costs (taking off-peak pricing into account) while maintaining comfortable temperatures when occupants are home. Although this particular problem could be solved with classical statistical approaches, the project is meant as a template for more complex behavior (strong non-linearity, randomness, etc.). Overall, it is a nice project for having some fun with RL.
- Simulated Environment: Custom linear physical model (conductance + capacity) for building thermal dynamics.
- RL Agents: Rule-based baseline and a DQN (Deep Q-Network) implemented in PyTorch.
- Discrete Action Space: Radiator power levels.
- Reward Function: Weighted sum of electricity cost and comfort (temperature deviation).
- Data: Synthetic data for algorithm testing; real-world data (MeteoSwiss) for training.
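The linear conductance + capacity model mentioned above can be sketched as follows. This is an illustrative assumption, not the project's actual code: the function name, parameter values, and forward-Euler integration are all hypothetical.

```python
# Minimal sketch of a linear conductance + capacity thermal model.
# All names and parameter values here are illustrative assumptions.

def step_temperature(t_in, t_out, heater_power_w, dt_s=900.0,
                     conductance_w_per_k=150.0, capacity_j_per_k=5e6):
    """Advance indoor temperature by one time step (forward Euler).

    dT/dt = (P_heater - U * (T_in - T_out)) / C
    """
    heat_flow_w = heater_power_w - conductance_w_per_k * (t_in - t_out)
    return t_in + heat_flow_w / capacity_j_per_k * dt_s

# Example: cold outdoor air, heater off -> indoor temperature drops.
t_next = step_temperature(t_in=20.0, t_out=0.0, heater_power_w=0.0)
```

A single conductance and a single capacity give a first-order lag, which is enough to make the cost/comfort trade-off non-trivial while staying fast to simulate.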
The agent's observation (state) includes:
- Current indoor temperature
- Outdoor temperature
- Radiator state
- Occupant presence
- Time of day (for pricing)
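The observation components above could be packed into a flat numeric vector before being fed to an agent. The following dataclass is a hypothetical sketch; the field names and scaling are assumptions, not the project's implementation.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """One environment observation (field names are illustrative)."""
    indoor_temp_c: float
    outdoor_temp_c: float
    radiator_level: int      # index into the discrete power levels
    occupant_present: bool
    hour_of_day: int         # 0-23, used to look up the electricity price

    def as_vector(self):
        # Flat numeric vector suitable as input to a Q-network.
        return [self.indoor_temp_c, self.outdoor_temp_c,
                float(self.radiator_level), float(self.occupant_present),
                self.hour_of_day / 23.0]

obs = Observation(20.5, 3.0, 2, True, 18)
```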
- Discrete radiator power levels (e.g., 0%, 33%, 66%, 100%)
- Cost Term: Penalizes high electricity usage, scaled by real-time pricing.
- Comfort Term: Penalizes deviation from the desired temperature range.
- Total Reward: Weighted sum of cost and comfort terms.
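As a rough sketch, the weighted reward might look like the function below. The weights, time step, and comfort band are assumptions chosen for illustration, not the project's tuned values.

```python
# Hedged sketch of the weighted cost + comfort reward.
# Weights, time step, and the comfort band are illustrative assumptions.

def reward(power_kw, price_per_kwh, indoor_temp_c, occupant_present,
           dt_h=0.25, comfort_low=19.0, comfort_high=22.0,
           w_cost=1.0, w_comfort=0.5):
    cost = power_kw * dt_h * price_per_kwh  # electricity cost this step
    # Comfort penalty only matters while someone is home.
    if occupant_present:
        deviation = max(comfort_low - indoor_temp_c,
                        indoor_temp_c - comfort_high, 0.0)
    else:
        deviation = 0.0
    return -(w_cost * cost + w_comfort * deviation)

r = reward(power_kw=2.0, price_per_kwh=0.25, indoor_temp_c=18.0,
           occupant_present=True)
```

Gating the comfort term on presence is what lets the agent pre-heat cheaply during off-peak hours instead of holding the setpoint around the clock.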
- Weather Data: Sourced from MeteoSwiss (real-world) and synthetic datasets (testing).
- Electricity Pricing: Simulated off-peak/peak pricing (averaged for training).
- Occupant Presence: Simulated presence ranges.
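A minimal generator for the synthetic data described above might look like this. The sinusoidal temperature profile, the two-tier peak/off-peak prices, and the fixed presence window are all assumptions made for the sketch.

```python
import math
import random

def synthetic_day(seed=0, steps_per_day=96):
    """Generate one synthetic day of outdoor temperature, price and presence.

    Purely illustrative: the sinusoidal temperature, two-tier pricing,
    and morning/evening presence window are assumptions.
    """
    rng = random.Random(seed)
    day = []
    for k in range(steps_per_day):
        hour = 24.0 * k / steps_per_day
        # Coldest around 04:00, warmest around 16:00, plus noise.
        t_out = (5.0 + 6.0 * math.sin(2 * math.pi * (hour - 10.0) / 24.0)
                 + rng.gauss(0.0, 0.5))
        price = 0.30 if 7 <= hour < 22 else 0.15   # peak vs off-peak
        present = hour < 8 or hour >= 18           # home mornings/evenings
        day.append((hour, t_out, price, present))
    return day

day = synthetic_day()
```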
- Rule-Based Baseline: Simple heuristic (e.g., turn the radiator on when the temperature drops below a threshold).
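Such a heuristic can be sketched as a thermostat with hysteresis, so the radiator does not chatter on and off around a single threshold. The thresholds and the power-level indices below are illustrative assumptions.

```python
# Hypothetical rule-based baseline with hysteresis to avoid rapid
# on/off switching (thresholds and level indices are illustrative).

def rule_based_action(indoor_temp_c, currently_on,
                      on_below=19.5, off_above=21.0):
    """Return a power level index: 0 = off, 3 = full power."""
    if indoor_temp_c < on_below:
        return 3                       # too cold: full power
    if indoor_temp_c > off_above:
        return 0                       # warm enough: off
    return 3 if currently_on else 0    # inside the band: keep last state

a = rule_based_action(18.0, currently_on=False)   # -> 3 (heat on)
```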
- DQN Agent:
  - Network: PyTorch implementation.
  - Training: Offline (pre-collected data) or online (interaction with the environment).
  - Hyperparameters: Learning rate, discount factor, exploration rate (ε-greedy).
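The ε-greedy exploration mentioned above is typically paired with a decay schedule: explore heavily early in training, then mostly exploit. The linear schedule and values below are assumptions for illustration, not the project's settings.

```python
import random

def epsilon_by_step(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linear ε decay (schedule shape and values are assumptions)."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)

def select_action(q_values, step, rng=random):
    """ε-greedy over the discrete radiator power levels."""
    if rng.random() < epsilon_by_step(step):
        return rng.randrange(len(q_values))                         # explore
    return max(range(len(q_values)), key=q_values.__getitem__)      # exploit
```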
- Total Reward: Sum of rewards over a 24-hour period.
- Total Cost: Sum of electricity costs over a 24-hour period.
- Comfort Metrics: Average deviation from the desired temperature.
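The metrics above can be computed from logged per-step data at the end of an episode. The tuple layout and target temperature below are assumptions made for the sketch.

```python
# Illustrative computation of per-episode metrics from logged steps.
# Each step is (reward, cost, indoor_temp_c); names are assumptions.

def episode_metrics(steps, target_temp_c=20.5):
    total_reward = sum(r for r, _, _ in steps)
    total_cost = sum(c for _, c, _ in steps)
    avg_deviation = sum(abs(t - target_temp_c)
                        for _, _, t in steps) / len(steps)
    return total_reward, total_cost, avg_deviation

metrics = episode_metrics([(-0.5, 0.2, 20.0), (-0.1, 0.0, 21.0)])
```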
- Rule-based agent (serves as the baseline for comparison).
- Python 3.8+
- PyTorch
- Gymnasium
- Poetry (for dependency management)
- Clone the repository:

        git clone https://github.com/mp-mech-ai/radiator-rl

- Install dependencies:

        poetry install

- Download weather data from MeteoSwiss and place it in `data/weather/`.
- Run the DQN training script:

        python dqn_training.py
- Long Training Time: DQN requires extensive interaction with the environment.
- Localization: Model is currently trained for a single location.
- Temperature Assumptions: Simplified linear model may not capture all real-world dynamics.
- Scalability: Extend to multiple locations with diverse weather patterns.
- Advanced Algorithms: Implement PPO for continuous power control.
- Real-World Deployment: Test on physical hardware or more complex simulators.