From 4d3e58ec4620bc2aa8577af0b023dbf03485fae8 Mon Sep 17 00:00:00 2001
From: hbarkh
Date: Sun, 24 Aug 2025 13:18:02 -0700
Subject: [PATCH] Create users_guide.md

---
 users_guide.md | 142 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 142 insertions(+)
 create mode 100644 users_guide.md

diff --git a/users_guide.md b/users_guide.md
new file mode 100644
index 00000000..fe1c9b88
--- /dev/null
+++ b/users_guide.md
@@ -0,0 +1,142 @@
+# Controls Challenge - Complete User Guide
+
+## Overview
+
+**Data**: Synthetic comma-steering-control dataset (real openpilot driving data)
+**Model**: TinyPhysics ML model that simulates the car's lateral movement
+**Controllers**: Implement BaseController; output a steering command from target vs. current lateral acceleration
+**Eval**: Weighted sum of lataccel_cost and jerk_cost; use eval.py for reports
+**CSV**: Time series with velocity, acceleration, steering, and target lateral acceleration
+
+## Available Controllers
+
+- **PID**: Baseline proportional-integral-derivative controller
+- **Q-Learning**: Tabular RL with discretized state space
+- **DQN**: Deep Q-Network with continuous states
+- **PPO**: Proximal Policy Optimization with continuous actions
+
+## Quick Start
+
+**Training**: `python algorithms/sb3_ppo/train_sb3_ppo.py`
+**Evaluation**: `python eval.py --test_controller sb3_ppo_controller --baseline_controller pid`
+
+## CSV Data Format
+```csv
+t,vEgo,aEgo,roll,targetLateralAcceleration,steerCommand
+0.0,33.77,-0.017,0.037,1.004,-0.330
+0.1,16.69,-0.071,0.023,0.017,0.115
+```
+
+---
+
+# Controller Development Guide
+
+## 🎯 CRITICAL FOR CONTROLLER: `controller.update()` Interface
+
+Your controller must implement this signature:
+```python
+def update(self, target_lataccel: float, current_lataccel: float, state: State, future_plan: FuturePlan) -> float:
+    ...  # Return steering action in range [-2, 2]
+```
+
+**Action Range**: `[-2, 2]` (STEER_RANGE)
+
+## 📊 Data Structures Your Controller Receives
+
+### State
+`namedtuple` with:
+- `roll_lataccel`: Lateral acceleration from road banking (m/s²)
+- `v_ego`: Vehicle speed (m/s)
+- `a_ego`: Vehicle acceleration (m/s²)
+
+### FuturePlan
+5-second future trajectory with:
+- `lataccel`: Target lateral accelerations
+- `roll_lataccel`: Future road banking effects
+- `v_ego`: Future speeds
+- `a_ego`: Future accelerations
+
+## ⚡ Controller Integration Summary
+
+**What you control**: Steering commands in `[-2, 2]`
+**What you receive**: Current state + 5-second future plan
+**Goal**: Minimize lateral acceleration tracking error + jerk
+**Evaluation**: Steps 100-500 (40 seconds of control at 10 Hz)
+
+### Key Constants:
+- Control starts at step 100
+- 10 Hz update rate
+- Max lataccel change: 0.5 m/s² per step
+- Lateral accel range: [-5, 5] m/s²
+- Future plan duration: 5 seconds
+
+## 🔄 Simulation Flow
+
+1. **Initialization**: Load data, reset histories
+2. **For each timestep**:
+   - Get current state and future plan
+   - Call your `controller.update()`
+   - Clip action to `[-2, 2]`
+   - Use transformer to predict next lateral acceleration
+   - Update histories
+3. **Evaluation**: Calculate costs on steps 100-500
+
+## 📈 Performance Metrics
+
+- **lataccel_cost**: How well you track the target lateral acceleration
+- **jerk_cost**: How smooth the resulting lateral acceleration is (penalizes rapid changes)
+- **total_cost**: `lataccel_cost × 50 + jerk_cost` (lateral tracking is heavily weighted)
+
+## 💡 Controller Development Tips
+
+1. **Use the future plan**: The 5-second lookahead gives you valuable trajectory information
+2. **Balance tracking vs smoothness**: Large steering changes increase jerk cost
+3. **Consider vehicle dynamics**: Speed and acceleration affect how steering translates to lateral acceleration
+4. **Account for road banking**: `roll_lataccel` affects the vehicle's natural lateral acceleration
+
+---
+
+# Technical Implementation Details
+
+## TinyPhysics Architecture Functions
+
+### LataccelTokenizer Class
+- `__init__()`: Creates 1024 bins from -5 to 5 m/s²
+- `encode()`: Converts continuous lataccel → discrete token
+- `decode()`: Converts token → continuous lataccel
+- `clip()`: Clamps values to [-5, 5] range
+
+*Not directly used by controller*
+
+### TinyPhysicsModel Class
+- `__init__()`: Loads ONNX transformer model
+- `softmax()`: Probability distribution calculation
+- `predict()`: Gets next lataccel token from model
+- `get_current_lataccel()`: Main model prediction interface
+
+*Physics simulation - not for controller*
+
+### TinyPhysicsSimulator Class
+
+#### Key Functions:
+- `__init__()`: Sets up simulation with your controller
+- `reset()`: Initializes 20-step history buffer
+- `get_data()`: Processes CSV data (converts steering convention)
+- `control_step()`: **CALLS YOUR CONTROLLER**
+- `sim_step()`: Updates physics using transformer model
+- `step()`: Runs one simulation timestep
+- `rollout()`: **MAIN EVALUATION LOOP**
+- `compute_cost()`: Calculates performance metrics
+
+#### Cost Evaluation:
+- Steps 100-500 are evaluated
+- `lataccel_cost`: MSE between target/actual × 100
+- `jerk_cost`: Smoothness penalty × 100
+- `total_cost`: Weighted sum (lataccel × 50 + jerk)
+
+### Utility Functions
+- `get_available_controllers()`: Lists controller files
+- `run_rollout()`: Single trajectory evaluation
+- `download_dataset()`: Gets training data
+
+Your controller's `update()` method is called once per timestep in `control_step()` and must return a float steering command.
\ No newline at end of file
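
As a reading aid for the `update()` interface this patch documents, here is a minimal sketch of a PID-style controller. The `State` and `FuturePlan` namedtuples below are stand-ins built from the field lists in the guide, and `SketchPIDController` and its gains are hypothetical and untuned; the repository's own `BaseController` and baseline `pid` controller may look different.

```python
from collections import namedtuple

# Stand-ins mirroring the fields listed in the guide (assumption: the real
# definitions live in the challenge repository and may differ).
State = namedtuple("State", ["roll_lataccel", "v_ego", "a_ego"])
FuturePlan = namedtuple("FuturePlan", ["lataccel", "roll_lataccel", "v_ego", "a_ego"])

STEER_RANGE = (-2.0, 2.0)  # action range stated in the guide


class SketchPIDController:
    """Hypothetical PID controller; the gains are illustrative, not tuned."""

    def __init__(self, kp: float = 0.3, ki: float = 0.05, kd: float = 0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.error_integral = 0.0
        self.prev_error = 0.0

    def update(self, target_lataccel: float, current_lataccel: float,
               state: State, future_plan: FuturePlan) -> float:
        # Tracking error between the target and the simulated lateral acceleration.
        error = target_lataccel - current_lataccel
        self.error_integral += error
        error_diff = error - self.prev_error
        self.prev_error = error

        action = (self.kp * error
                  + self.ki * self.error_integral
                  + self.kd * error_diff)
        # Clip to the documented steering range before returning a float.
        return float(min(max(action, STEER_RANGE[0]), STEER_RANGE[1]))


if __name__ == "__main__":
    controller = SketchPIDController()
    state = State(roll_lataccel=0.0, v_ego=20.0, a_ego=0.0)
    plan = FuturePlan(lataccel=[], roll_lataccel=[], v_ego=[], a_ego=[])
    print(controller.update(1.0, 0.0, state, plan))
```

This sketch ignores `state` and `future_plan` entirely. A feed-forward term based on `future_plan.lataccel` or a correction for `state.roll_lataccel` would be natural next steps, in line with the development tips in the guide.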