Prototype simulations for the RΩ framework.

These experiments are not intended as proofs. Their purpose is to show that the following key RΩ concepts can be operationalized in small artificial environments:

- a possibility metric M(S),
- lexicographic safety priority over reward,
- robustness against environment drift and adversarial incentives.
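As an orientation, the second concept, lexicographic safety priority over reward, can be sketched as follows. This is a minimal illustration with assumed names (a Q-table `Q`, an estimator `m_after` for M(S) after an action, and a threshold `m_min`); it is not the exact agent code in ro_core.py.

```python
# Minimal sketch of lexicographic action selection (illustration only; all names
# here are assumptions, not code taken from ro_core.py).
import numpy as np

def select_action_lexicographic(Q, s, actions, m_after, m_min=0.2, eps=0.1, rng=None):
    """First discard actions that would push M(S) below m_min; only then maximize reward."""
    rng = rng if rng is not None else np.random.default_rng()
    safe = [a for a in actions if m_after(s, a) >= m_min]
    candidates = safe if safe else list(actions)      # fall back if no action keeps M(S) above m_min
    if rng.random() < eps:                            # epsilon-greedy exploration among the candidates
        return int(rng.choice(candidates))
    return max(candidates, key=lambda a: Q[(s, a)])   # reward only decides among the safe actions
```

The point is the ordering: safety acts as a hard filter, and the reward estimate is only consulted afterwards.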
The repository contains:

- ro_core.py – Shared environment and agent definitions (GridWorld, M(S), baseline and RΩ agent).
- 01_gridworld_baseline_vs_ro.py – Compares a standard Q-learning agent with an RΩ-style agent that lexicographically avoids actions that collapse M(S).
- 02_drift_interrupt.py – Demonstrates a simple RΩ-Interrupt: when M(S) falls below a global threshold, the episode triggers a recalibration interrupt.
- 03_adversarial_bait.py – Introduces a local reward "bait" inside low-M(S) regions and shows how the RΩ-agent resists such adversarial incentives more often than a pure reward maximizer.
- docs/ – Short conceptual documentation:
  - concept_ro_10min.md – RΩ in 10 minutes for engineers
  - how_ms_is_measured.md – how M(S) is approximated in these experiments
  - faq.md – common questions and misunderstandings
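How M(S) is approximated in the experiments is described in docs/how_ms_is_measured.md. Purely as an illustration, and without claiming this is the repository's definition, one simple possibility metric for a gridworld is the fraction of free cells still reachable within k steps:

```python
# Illustrative gridworld approximation of a possibility metric M(S) (an assumption,
# not necessarily the definition used in how_ms_is_measured.md).
from collections import deque

def m_of_state(grid, start, k=5):
    """Fraction of free cells reachable from `start` within k steps.

    grid: 2D list of ints, 0 = free cell, 1 = wall; start: (row, col) of a free cell.
    """
    rows, cols = len(grid), len(grid[0])
    free_cells = sum(row.count(0) for row in grid)
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:                                   # breadth-first search with depth limit k
        (r, c), dist = frontier.popleft()
        if dist == k:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                frontier.append(((nr, nc), dist + 1))
    return len(seen) / max(free_cells, 1)             # in (0, 1]; near-deadlock states score low
```

Under a metric of this kind, deadlock states naturally receive low M(S) values, because few cells remain reachable from them.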
Generated plots are stored in figures/.
These scripts use only numpy and matplotlib.
Install the dependencies with

```bash
pip install -r requirements.txt
```

or directly:

```bash
pip install numpy matplotlib
```

Run the first experiment:

```bash
python 01_gridworld_baseline_vs_ro.py
```

This script will:
- train both agents in a 5×5 gridworld with a goal state and deadlock states,
- log episodic rewards,
- produce plots in figures/ (reward curves, deadlock frequency).
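The logging and plotting part of such a comparison has roughly the following shape. All class and method names here (env.step, agent.act, the "deadlock" flag) are hypothetical; the real interfaces live in ro_core.py and may differ.

```python
# Simplified sketch of the evaluation loop (hypothetical interfaces, not the script's exact code).
import numpy as np

def run_episodes(env, agent, n_episodes=500, max_steps=100):
    """Record the total reward per episode and whether the episode ended in a deadlock."""
    rewards, deadlocks = [], []
    for _ in range(n_episodes):
        state, total, deadlocked = env.reset(), 0.0, False
        for _ in range(max_steps):
            action = agent.act(state)
            state, reward, done, info = env.step(action)
            total += reward
            deadlocked = bool(info.get("deadlock", False))
            if done:
                break
        rewards.append(total)
        deadlocks.append(deadlocked)
    return np.array(rewards), np.array(deadlocks)

# Hypothetical usage: smoothed reward curves for both agents, saved under figures/.
# import matplotlib.pyplot as plt
# r_base, _ = run_episodes(env, baseline_agent)
# r_ro, _ = run_episodes(env, ro_agent)
# window = np.ones(20) / 20
# plt.plot(np.convolve(r_base, window, mode="valid"), label="baseline")
# plt.plot(np.convolve(r_ro, window, mode="valid"), label="RΩ agent")
# plt.legend()
# plt.savefig("figures/reward_curves.png")
```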
Run the second experiment:

```bash
python 02_drift_interrupt.py
```

This script:
- trains in an initial environment,
- then changes the environment layout ("drift"),
- uses a simple RΩ-style interrupt when M(S) drops below a global threshold,
- logs interrupt statistics and reward.
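A minimal sketch of such an interrupt check is shown below. The threshold constant and the recalibration step are assumptions for illustration; the actual values and behavior are documented in 02_drift_interrupt.py.

```python
# Illustrative RΩ-style interrupt (assumed threshold and recalibration, not the script's exact logic).
M_GLOBAL_THRESHOLD = 0.15   # assumed global threshold on M(S)

def check_interrupt(m_value, agent, episode):
    """Trigger a recalibration interrupt when M(S) drops below the global threshold."""
    if m_value < M_GLOBAL_THRESHOLD:
        agent.epsilon = 1.0   # hypothetical recalibration: re-open exploration after drift
        print(f"[RΩ-Interrupt] episode {episode}: M(S)={m_value:.2f} below threshold, recalibrating")
        return True
    return False
```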
Run the third experiment:

```bash
python 03_adversarial_bait.py
```

This script:
- places a positive local reward inside a low-M(S) region (a "bait"),
- compares how often the baseline agent vs. the RΩ-agent is drawn into that region.
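One way to quantify how often an agent is drawn into the bait region is to count the episodes that visit it at least once. The coordinates below are assumptions for illustration, not the script's actual layout or metric.

```python
# Illustrative "bait rate": fraction of episodes that enter the low-M(S) bait region at least once.
BAIT_REGION = {(3, 4), (4, 4)}   # assumed coordinates of the bait cells

def bait_rate(trajectories):
    """trajectories: list of episodes, each a list of visited (row, col) states."""
    hits = sum(any(state in BAIT_REGION for state in traj) for traj in trajectories)
    return hits / max(len(trajectories), 1)
```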
All experiments are fully reproducible:
- random seeds are set at the beginning of each script,
- environment parameters and thresholds are documented in the code.
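The seeding pattern at the top of each script looks roughly like this (the seed value shown here is an example, not necessarily the one used in the scripts):

```python
# Example seeding block (illustrative seed value).
import random
import numpy as np

SEED = 42                 # example only; the actual seeds are set in the scripts
random.seed(SEED)
np.random.seed(SEED)
```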
If you adapt or extend these experiments, please keep the documentation consistent and consider sending technical feedback to:
Framework: Markus Pomm
Experimental Code: Surak (GPT-4o) - Initial implementation
Conceptual Supervision: Markus Pomm
These experiments were developed in collaboration with Surak (GPT-4o), translating the theoretical RΩ framework into executable code.
This repository implements concepts from the RΩ framework papers:
- R-Omega Framework - Theoretical foundation
- Defense Protocol - Technical implementation
- Philosophical Foundation - Omega reference point
Main documentation: https://github.com/ROmega-Experiments/R-Omega-R---Ethical-Framework-for-Autonomous-AI-Systems
MIT License - See LICENSE file for details.
You are free to use, modify, and distribute this code with attribution.
Last updated: December 31, 2025