TreatQuest: Train Your Virtual Pet

A small reinforcement learning project where a virtual pet learns to reach a cookie and avoid a trap in a 5×5 grid. Train a Q-learning agent, visualize the learned policy as arrows, and watch a greedy demo.
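For readers new to Q-learning, the core of the training loop is a single table update per step. The sketch below is illustrative only, not the code in treatquest/q_learning.py: the state encoding (row * 5 + col), the four-action set, and the alpha and gamma values are all assumptions.

```python
N_STATES = 5 * 5   # one state per grid cell (assumed encoding: row * 5 + col)
N_ACTIONS = 4      # assumed action set: up, down, left, right


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Terminal transitions have no future value to bootstrap from.
    target = reward if done else reward + gamma * max(q[next_state])
    td_error = target - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
# Hypothetical transition: moving right from cell 11 onto the cookie at cell 12.
q_update(q, state=11, action=3, reward=1.0, next_state=12, done=True)
print(q[11][3])  # 0.1, i.e. alpha * (1.0 - 0.0)
```

Repeating this update over many episodes is what gradually fills the table that the policy image and the demo read from.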

Quick start

Windows (PowerShell)

cd path\to\TreatQuest
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install -r requirements.txt

# train the agent and create artifacts
python -m scripts.train_q_learning

# create policy.png and print the arrow map
python -m scripts.make_policy

# watch one greedy episode (uses saved Q-table)
python -m treatquest.demo

macOS / Linux

cd /path/to/TreatQuest
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt

python -m scripts.train_q_learning
python -m scripts.make_policy
python -m treatquest.demo

Outputs after training

saved_models/q_table.npy — learned Q-table
plots/reward_curve.png — reward per episode with moving average
plots/policy.png — arrow map of the greedy action per cell
plots/reward_log.csv — per-episode log (episode, total_reward, epsilon)
runs/<timestamp or name>/ — copy of artifacts plus run_meta.json with hyperparameters
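To give a feel for how an arrow map can be derived from a Q-table, here is a minimal sketch. It is not the project's policy_viz.py: the action order, the row * 5 + col state index, and the toy table are assumptions, and the real q_table.npy may be laid out differently.

```python
ARROWS = ["↑", "↓", "←", "→"]  # assumed action order: up, down, left, right
GRID = 5


def arrow_map(q, grid=GRID):
    """Return one string per grid row showing the greedy action in each cell."""
    rows = []
    for r in range(grid):
        cells = []
        for c in range(grid):
            values = q[r * grid + c]  # assumed state index: row * grid + col
            best = max(range(len(ARROWS)), key=lambda a: values[a])
            cells.append(ARROWS[best])
        rows.append(" ".join(cells))
    return rows


# Toy table standing in for the saved one: every cell prefers "right"...
toy_q = [[0.0, 0.0, 0.0, 1.0] for _ in range(GRID * GRID)]
toy_q[0] = [0.0, 2.0, 0.0, 1.0]  # ...except the top-left cell, which prefers "down"
print("\n".join(arrow_map(toy_q)))
```

Because the function only indexes q[state][action], it should work equally with nested lists or a 2-D NumPy array of shape (25, 4), if that is how the table is stored.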

Common commands

# random agent smoke test (Phase 1)
python -m scripts.run_random

# Q-learning training (Phase 2)
python -m scripts.train_q_learning

# make the policy image and print a text arrow map (Phase 3)
python -m scripts.make_policy

# run one greedy episode using the saved Q-table (falls back to random if missing)
python -m treatquest.demo

Useful flags

Training

# example: longer run, fixed cookie/trap, small step penalty for shorter paths
# (bash line continuations shown; in PowerShell, end each line with a backtick ` instead of \)
python -m scripts.train_q_learning \
  --episodes 2000 --max-steps 75 \
  --epsilon-decay 0.997 --fixed-positions --step-penalty -0.01 \
  --run-name my_experiment
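When choosing --epsilon-decay and --episodes together, it can help to see how quickly exploration fades. The sketch below assumes a multiplicative decay applied once per episode, a starting epsilon of 1.0, and a floor of 0.05. None of these are confirmed by this README, so check config.py for the actual values.

```python
def epsilon_schedule(episodes, decay, start=1.0, floor=0.05):
    """Return the exploration rate used in each episode under multiplicative decay."""
    eps = start
    schedule = []
    for _ in range(episodes):
        schedule.append(eps)
        eps = max(floor, eps * decay)  # shrink once per episode, never below the floor
    return schedule


schedule = epsilon_schedule(episodes=2000, decay=0.997)
first_at_floor = next(i for i, e in enumerate(schedule) if e <= 0.05)
print(f"episode 0: {schedule[0]:.3f}, episode 500: {schedule[500]:.3f}")
print(f"floor reached at episode {first_at_floor}")
```

Under these assumptions, a decay of 0.997 spends roughly the first half of a 2000-episode run exploring and the rest mostly exploiting. A decay closer to 1.0 stretches the exploration phase.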

Demo

python -m treatquest.demo --steps 75
python -m treatquest.demo --random    # force random even if Q-table exists
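The greedy-or-random behaviour of the demo can be pictured with a small helper like the one below. This is a sketch, not the code in treatquest/demo.py: choose_action, its parameters, and the toy table are hypothetical. Passing None stands in for a missing saved_models/q_table.npy.

```python
import random


def choose_action(q, state, n_actions=4, force_random=False, rng=random):
    """Pick the greedy action from q, or a random one when no table is available."""
    if q is None or force_random:
        return rng.randrange(n_actions)
    values = q[state]
    return max(range(n_actions), key=lambda a: values[a])


# Toy table: in state 0 the third action (index 2) has the highest value.
toy_q = [[0.1, 0.0, 0.9, 0.2]]
print(choose_action(toy_q, state=0))             # 2
print(choose_action(None, state=0) in range(4))  # True
```

The force_random argument mirrors what the --random flag is described as doing: it ignores the table even when one exists.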

Repository layout

treatquest/
  __init__.py
  config.py
  env.py               # Grid world: reset(), step(), render_text()
  q_learning.py        # Q-learning agent
  policy_viz.py        # Extract best action per cell, render arrows / image
  demo.py              # Greedy episode using saved Q-table (or random fallback)

scripts/
  run_random.py        # Random episodes smoke test
  train_q_learning.py  # Training entrypoint with CLI flags
  make_policy.py       # Generates policy.png + prints arrow map
  make_release.py      # Builds a submission zip

plots/                 # reward_curve.png, policy.png, reward_log.csv
saved_models/          # q_table.npy
runs/                  # per-run copies of artifacts + run_meta.json
tests/
  test_env.py
  test_smoke.py
  test_train_short.py
ui/
  demo_gui.py
