RL-Sim — Reinforcement Learning Simulation

Home for experiments that simulate intervention policies before we run real-world pilots. The repo will host:

Environment definitions (gym-like) for different study designs.
Policy training scripts and evaluation notebooks.
Reporting utilities that summarize safety signals for the ethics board.

TL;DR

python scripts/demo.py — synthetic rewards, CSV summary, HTML report (outputs/demo_*)
python scripts/plot_severity.py --output docs/severity_mean_rewards.png
python scripts/doctor.py — env check (Python, deps, writable outputs)
Install: python3 -m venv .venv && source .venv/bin/activate && pip install matplotlib
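For orientation, here is a minimal sketch of the kind of check an environment doctor might run (Python version, importable dependencies, writable output directory). It is not the contents of scripts/doctor.py: the function name check_environment, the minimum Python version of 3.9, and the default outputs directory are assumptions made for illustration.

```python
import importlib.util
import os
import sys
import tempfile


def check_environment(min_python=(3, 9), deps=("matplotlib",), output_dir="outputs"):
    """Return a dict mapping each check name to True (pass) or False (fail).

    All defaults are illustrative assumptions, not values taken from the repo.
    """
    results = {}

    # Python version: compare the running interpreter against the minimum.
    results["python"] = sys.version_info >= min_python

    # Dependencies: find_spec returns None when a top-level module is missing.
    for name in deps:
        results[f"dep:{name}"] = importlib.util.find_spec(name) is not None

    # Writable outputs: create the directory and try a throwaway temp file.
    try:
        os.makedirs(output_dir, exist_ok=True)
        with tempfile.NamedTemporaryFile(dir=output_dir):
            pass
        results["writable_outputs"] = True
    except OSError:
        results["writable_outputs"] = False

    return results


if __name__ == "__main__":
    for check, ok in check_environment().items():
        print(f"{'OK  ' if ok else 'FAIL'} {check}")
```

Returning a dict rather than printing directly keeps the checks easy to reuse from a notebook or a CI step.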

See ROADMAP.md for the current build order and meta/launchpad/project-priorities.md for near-term deliverables.

Current Prototype

The src/rl_sim/bandit.py module now exposes stable and volatile reward schedules plus a minimal epsilon-greedy simulator. Start with build_default_bandit() in a notebook to compare conditions before wiring up heavier experiments.
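To make the idea concrete, the following is a self-contained sketch of an epsilon-greedy agent run against a stable and a volatile reward schedule. It does not reproduce src/rl_sim/bandit.py: the helper names (stable_schedule, volatile_schedule, run_epsilon_greedy), the reward probabilities, and the reversal interval are all assumptions chosen for illustration.

```python
import random


def stable_schedule(probs):
    """Schedule whose per-arm reward probabilities never change."""
    def schedule(trial):
        return list(probs)
    return schedule


def volatile_schedule(probs, reversal_every):
    """Schedule that reverses the arm order every `reversal_every` trials."""
    def schedule(trial):
        flipped = (trial // reversal_every) % 2 == 1
        return list(reversed(probs)) if flipped else list(probs)
    return schedule


def run_epsilon_greedy(schedule, n_arms, trials, epsilon, seed=0):
    """Simulate one agent with sample-average value estimates; return mean reward."""
    rng = random.Random(seed)
    values = [0.0] * n_arms   # running estimate of each arm's reward
    counts = [0] * n_arms     # number of pulls per arm
    total = 0.0
    for t in range(trials):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < schedule(t)[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
        total += reward
    return total / trials


stable_mean = run_epsilon_greedy(stable_schedule([0.2, 0.8]), 2, 500, 0.1, seed=7)
volatile_mean = run_epsilon_greedy(volatile_schedule([0.2, 0.8], 50), 2, 500, 0.1, seed=7)
print(f"stable: {stable_mean:.3f}  volatile: {volatile_mean:.3f}")
```

Sample-average estimates adapt slowly after a reversal, so a constant step size may be a better fit for the volatile condition. That is a design choice to explore rather than a claim about what the module does.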

Comparing Severity Groups

Use simulate_severity_groups() with the built-in LOW_SEVERITY and HIGH_SEVERITY profiles to generate quick contrasts before building plots. Example:

from rl_sim import simulate_severity_groups, LOW_SEVERITY, HIGH_SEVERITY
results = simulate_severity_groups(trials=200, profiles=[LOW_SEVERITY, HIGH_SEVERITY], rng_seed=7)
print(results['low']['mean_reward'], results['high']['mean_reward'])

Swap in your own SeverityProfile instances if you need different schedules or exploration rates.
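The fields of SeverityProfile are not documented in this README, so the sketch below uses a stand-in dataclass to show the general shape of a custom profile comparison. ProfileSketch, its fields (name, reward_probs, epsilon), and simulate_groups_sketch are hypothetical names and do not come from rl_sim. Only the result layout (one mean_reward entry per group) mirrors the example above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileSketch:
    """Hypothetical stand-in for SeverityProfile; field names are assumptions."""
    name: str
    reward_probs: tuple  # per-arm probability of a reward
    epsilon: float       # exploration rate


def simulate_groups_sketch(trials, profiles, rng_seed=0):
    """Run one epsilon-greedy agent per profile.

    Returns {profile.name: {"mean_reward": float}}.
    """
    results = {}
    for index, profile in enumerate(profiles):
        rng = random.Random(rng_seed + index)  # separate stream per profile
        n_arms = len(profile.reward_probs)
        values = [0.0] * n_arms
        counts = [0] * n_arms
        total = 0.0
        for _ in range(trials):
            if rng.random() < profile.epsilon:
                arm = rng.randrange(n_arms)                       # explore
            else:
                arm = max(range(n_arms), key=values.__getitem__)  # exploit
            reward = 1.0 if rng.random() < profile.reward_probs[arm] else 0.0
            counts[arm] += 1
            values[arm] += (reward - values[arm]) / counts[arm]
            total += reward
        results[profile.name] = {"mean_reward": total / trials}
    return results


baseline = ProfileSketch(name="low", reward_probs=(0.2, 0.8), epsilon=0.1)
custom = ProfileSketch(name="moderate", reward_probs=(0.3, 0.6), epsilon=0.2)
results = simulate_groups_sketch(trials=200, profiles=[baseline, custom], rng_seed=7)
print(results["low"]["mean_reward"], results["moderate"]["mean_reward"])
```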

Figure Workflow

Install matplotlib (pip install matplotlib) and run python scripts/plot_severity.py to produce docs/severity_mean_rewards.png, our first sketch for the simulation paper. Details live in docs/figures.md.

Documentation

docs/quickstart.md — one-page demo/plot/doctor steps.


stephenmjerge/reinforcement-learning-simulation
