realada/treatquest

TreatQuest: Train Your Virtual Pet

A small reinforcement learning project where a virtual pet learns to reach a cookie and avoid a trap in a 5×5 grid. Train a Q-learning agent, visualize the learned policy as arrows, and watch a greedy demo.
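
For readers new to Q-learning, the core update can be sketched as follows. This is a minimal illustration, not the code in q_learning.py: it assumes the state is just the pet's cell index (25 states), four movement actions, and hypothetical values for alpha and gamma. The project's actual state encoding and hyperparameters may differ.

```python
import numpy as np

N_ACTIONS = 4  # assumed action set: up, down, left, right


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Terminal transitions have no future value to bootstrap from.
    best_next = 0.0 if done else float(np.max(q[next_state]))
    td_error = reward + gamma * best_next - q[state, action]
    q[state, action] += alpha * td_error
    return td_error


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q[state]))


# 5x5 grid flattened to 25 states (an assumption made for this sketch).
q = np.zeros((25, N_ACTIONS))
q_update(q, state=0, action=3, reward=1.0, next_state=1, done=True)
print(q[0, 3])  # 0.1
```

With an all-zero table, a terminal reward of 1.0 and alpha of 0.1 move the visited entry to 0.1; repeated visits pull it further toward the true return.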


Quick start

Windows (PowerShell)

cd path\to\TreatQuest
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install -r requirements.txt

# train the agent and create artifacts
python -m scripts.train_q_learning

# create policy.png and print the arrow map
python -m scripts.make_policy

# watch one greedy episode (uses saved Q-table)
python -m treatquest.demo

macOS / Linux

cd /path/to/TreatQuest
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt

python -m scripts.train_q_learning
python -m scripts.make_policy
python -m treatquest.demo

Outputs after training

  • saved_models/q_table.npy — learned Q-table
  • plots/reward_curve.png — reward per episode with moving average
  • plots/policy.png — arrow map of the greedy action per cell
  • plots/reward_log.csv — per-episode log (episode, total_reward, epsilon)
  • runs/<timestamp or name>/ — copy of artifacts plus run_meta.json with hyperparameters
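
The reward log can be inspected outside the plotting script. The sketch below computes a trailing moving average similar in spirit to the one drawn in reward_curve.png. It assumes the CSV has a header row naming the columns listed above (episode, total_reward, epsilon); the window size of 50 is a hypothetical choice, not the project's setting.

```python
import csv
from collections import deque


def moving_average(values, window=50):
    """Trailing moving average; early entries average over fewer points."""
    recent = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(recent) == window:
            total -= recent[0]  # this value is about to be evicted
        recent.append(v)
        total += v
        out.append(total / len(recent))
    return out


def load_rewards(path="plots/reward_log.csv"):
    """Read the total_reward column; assumes a header row is present."""
    with open(path, newline="") as f:
        return [float(row["total_reward"]) for row in csv.DictReader(f)]


print(moving_average([1, 2, 3, 4], window=2))  # [1.0, 1.5, 2.5, 3.5]
```

After a training run, `moving_average(load_rewards())` gives the smoothed series; a curve that keeps rising suggests the agent would benefit from more episodes.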

Common commands

# random agent smoke test (Phase 1)
python -m scripts.run_random

# Q-learning training (Phase 2)
python -m scripts.train_q_learning

# make the policy image and print a text arrow map (Phase 3)
python -m scripts.make_policy

# run one greedy episode using the saved Q-table (falls back to random if missing)
python -m treatquest.demo
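
The arrow map printed by make_policy is the greedy action for each cell. A rough sketch of that idea, assuming a Q-table of shape (25, 4) with rows in row-major cell order and actions ordered up, down, left, right (both are assumptions; policy_viz.py may use a different layout):

```python
import numpy as np

ARROWS = ["↑", "↓", "←", "→"]  # assumed action order, not taken from the project


def arrow_map(q, size=5):
    """Return a text grid showing the highest-valued action in each cell."""
    greedy = np.argmax(q, axis=1).reshape(size, size)
    return "\n".join(" ".join(ARROWS[a] for a in row) for row in greedy)


# Toy table in which "right" is the best action everywhere.
q = np.zeros((25, 4))
q[:, 3] = 1.0
print(arrow_map(q))
```

To try this on a trained agent, load the table with `np.load("saved_models/q_table.npy")` and check its `.shape` first, since the real table may be indexed differently.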

Useful flags

Training

# example: longer run, fixed cookie/trap, small step penalty for shorter paths
python -m scripts.train_q_learning \
  --episodes 2000 --max-steps 75 \
  --epsilon-decay 0.997 --fixed-positions --step-penalty -0.01 \
  --run-name my_experiment
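
To get a feel for what --epsilon-decay 0.997 means over 2000 episodes, the schedule can be simulated. This sketch assumes epsilon starts at 1.0, is multiplied by the decay factor once per episode, and is clamped at a floor of 0.05. The start value, the floor, and the per-episode multiplicative rule are all assumptions; check train_q_learning.py for the actual behaviour.

```python
def epsilon_schedule(episodes, decay, start=1.0, floor=0.05):
    """Return the epsilon used in each episode under multiplicative decay."""
    eps = start
    out = []
    for _ in range(episodes):
        out.append(eps)
        eps = max(floor, eps * decay)  # never explore less than the floor
    return out


sched = epsilon_schedule(2000, 0.997)
# Index of the first episode that runs at the exploration floor.
first_at_floor = next(i for i, e in enumerate(sched) if e <= 0.05)
print(first_at_floor, sched[-1])
```

Under these assumptions the agent reaches the floor about halfway through the run, so the remaining episodes are mostly exploitation. A decay closer to 1.0 keeps exploration going for longer.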

Demo

python -m treatquest.demo --steps 75
python -m treatquest.demo --random    # force random even if Q-table exists

Repository layout

treatquest/
  __init__.py
  config.py
  env.py                 # Grid world: reset(), step(), render_text()
  q_learning.py          # Q-learning agent
  policy_viz.py          # Extract best action per cell, render arrows / image
  demo.py                # Greedy episode using saved Q-table (or random fallback)

scripts/
  run_random.py          # Random episodes smoke test
  train_q_learning.py    # Training entrypoint with CLI flags
  make_policy.py         # Generates policy.png + prints arrow map
  make_release.py        # Builds a submission zip

plots/                   # reward_curve.png, policy.png, reward_log.csv
saved_models/            # q_table.npy
runs/                    # per-run copies of artifacts + run_meta.json
tests/
  test_env.py
  test_smoke.py
  test_train_short.py
ui/
  demo_gui.py

About

TreatQuest open challenge by Parity
