OpenRay is a Monte Carlo Tree Search (MCTS) project that uses the page DOM to speed up web search tasks. It treats a browsing session like a game board: it explores possible actions with MCTS and then spends its compute on the paths that look most promising.
If you have ever watched a web agent waste tokens by rereading the same page or clicking in circles, OpenRay is built to address that. It keeps a structured view of what the browser can do next, and it learns which moves tend to produce real progress.
OpenRay sits between your goal and the browser.
It
- reads the current page DOM and turns it into a compact state representation
- proposes realistic next actions like click, type, scroll, open, go back
- runs MCTS to explore action sequences without committing to just one guess
- picks the best next action based on search statistics plus lightweight scoring
- repeats until the goal is met or a stop rule triggers
The big idea is simple. Instead of asking a model to guess the next step from scratch every time, OpenRay builds a tree of possible futures and reuses what it already learned.
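As a rough illustration of the first two steps in the list above (reading the DOM and proposing actions), here is a minimal sketch. It assumes a simplified, hypothetical `DomNode` shape instead of a real browser DOM, and `BrowserAction`, `describe`, and `enumerateActions` are illustrative names, not OpenRay's actual API.

```typescript
// Hypothetical, simplified stand-in for a DOM element (not a browser API).
interface DomNode {
  tag: string;
  attrs: Record<string, string>;
  text: string;
  children: DomNode[];
}

// Hypothetical action type covering a few of the moves named above.
type BrowserAction =
  | { kind: "click"; target: string }
  | { kind: "type"; target: string }
  | { kind: "back" };

// Build a short selector-like label from the tag plus id or name.
function describe(node: DomNode): string {
  const id = node.attrs["id"];
  const name = node.attrs["name"];
  if (id) return `${node.tag}#${id}`;
  if (name) return `${node.tag}[name=${name}]`;
  return node.tag;
}

// Walk the tree and propose one action per interactive element.
function enumerateActions(root: DomNode): BrowserAction[] {
  const actions: BrowserAction[] = [];
  const visit = (node: DomNode): void => {
    if (node.tag === "a" || node.tag === "button") {
      actions.push({ kind: "click", target: describe(node) });
    } else if (node.tag === "input" || node.tag === "textarea") {
      actions.push({ kind: "type", target: describe(node) });
    }
    node.children.forEach(visit);
  };
  visit(root);
  actions.push({ kind: "back" }); // assumed here to be always available
  return actions;
}
```

A real enumerator would also need to handle visibility, disabled elements, and scroll targets; this sketch only shows the shape of the idea.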
The DOM is the ground truth of a web page. It is the thing your clicks actually interact with.
A DOM driven planner can
- target stable elements using structure, attributes, and text
- avoid brittle screenshot only reasoning when the page is mostly text and forms
- cache useful page facts so it does not keep paying the token tax
- detect dead ends earlier by recognizing repeated states
In practice this should mean fewer wasted steps and less wall clock time for search style tasks.
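One way to recognize repeated states is to fingerprint a compact view of the page and remember which fingerprints have been seen. The sketch below assumes a hypothetical `PageState` (a URL plus labels of interactive elements) and uses a simple FNV-1a style hash over UTF-16 code units; OpenRay's own state representation and hashing may differ.

```typescript
// Hypothetical compact page state: a URL plus labels of interactive elements.
interface PageState {
  url: string;
  elements: string[];
}

// 32-bit FNV-1a style hash over UTF-16 code units; a cheap fingerprint, not a secure hash.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Sort element labels so that ordering noise does not change the fingerprint.
function fingerprint(state: PageState): string {
  const sorted = [...state.elements].sort();
  return fnv1a(state.url + "\n" + sorted.join("\n"));
}

// Tracks fingerprints already explored in this session.
class SeenStates {
  private readonly seen = new Set<string>();

  // Returns true if the state was seen before; records it otherwise.
  isRepeat(state: PageState): boolean {
    const key = fingerprint(state);
    if (this.seen.has(key)) return true;
    this.seen.add(key);
    return false;
  }
}
```

Sorting before hashing is a design choice: it treats two pages with the same elements in a different order as the same state, which may or may not be what you want for a given site.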
MCTS is a way to search a huge decision space without brute forcing everything.
OpenRay uses the classic loop
- Selection
  Walk down the tree using an exploration rule that balances trying new actions and exploiting proven actions
- Expansion
  Add one or more new child actions from the current page state
- Simulation
  Do a short rollout using cheap heuristics or a small model call
- Backpropagation
  Push the outcome back up the tree so future choices get smarter
Over time the tree becomes a map of what actually works on a given site and for a given kind of goal.
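The four phases can be sketched as a small generic search. This is a minimal illustration, assuming UCB1 as the exploration rule and a hypothetical `Env` interface that supplies actions, simulated transitions, and a cheap rollout score; none of these names come from OpenRay itself, and the toy environment at the end is only there to make the block runnable.

```typescript
// Hypothetical environment interface; OpenRay's real state and action types are not shown here.
interface Env<S, A> {
  actions(state: S): A[];        // legal actions; empty means terminal
  step(state: S, action: A): S;  // simulated transition
  rollout(state: S): number;     // cheap value estimate in [0, 1]
}

interface TreeNode<S, A> {
  state: S;
  action: A | null;              // action that led here (null at the root)
  parent: TreeNode<S, A> | null;
  children: TreeNode<S, A>[];
  untried: A[];
  visits: number;
  totalValue: number;
}

function makeNode<S, A>(
  env: Env<S, A>, state: S, action: A | null, parent: TreeNode<S, A> | null,
): TreeNode<S, A> {
  return { state, action, parent, children: [], untried: [...env.actions(state)], visits: 0, totalValue: 0 };
}

// UCB1: mean value plus an exploration bonus that shrinks as the child is visited more.
function ucb1<S, A>(child: TreeNode<S, A>, parentVisits: number, c: number): number {
  return child.totalValue / child.visits + c * Math.sqrt(Math.log(parentVisits) / child.visits);
}

function search<S, A>(env: Env<S, A>, rootState: S, iterations: number, c: number = Math.SQRT2): A | null {
  const root = makeNode<S, A>(env, rootState, null, null);
  for (let i = 0; i < iterations; i++) {
    let node: TreeNode<S, A> = root;
    // Selection: descend while the node is fully expanded and has children.
    while (node.untried.length === 0 && node.children.length > 0) {
      const parentVisits = node.visits;
      node = node.children.reduce((best, child) =>
        ucb1(child, parentVisits, c) > ucb1(best, parentVisits, c) ? child : best);
    }
    // Expansion: add one untried action as a new child (skipped at terminal nodes).
    const action = node.untried.pop();
    if (action !== undefined) {
      const child = makeNode<S, A>(env, env.step(node.state, action), action, node);
      node.children.push(child);
      node = child;
    }
    // Simulation: score the new state with the cheap rollout.
    const value = env.rollout(node.state);
    // Backpropagation: update statistics from the new node up to the root.
    for (let cur: TreeNode<S, A> | null = node; cur !== null; cur = cur.parent) {
      cur.visits += 1;
      cur.totalValue += value;
    }
  }
  // Recommend the most visited root action.
  let best: TreeNode<S, A> | null = null;
  for (const child of root.children) {
    if (best === null || child.visits > best.visits) best = child;
  }
  return best === null ? null : best.action;
}

// Toy environment (illustrative only): a walk on a line where positions further right score higher.
interface Walk { pos: number; depth: number }
type Move = "left" | "right";
const walkEnv: Env<Walk, Move> = {
  actions: (s) => (s.depth >= 3 ? [] : ["left", "right"]),
  step: (s, a) => ({ pos: s.pos + (a === "right" ? 1 : -1), depth: s.depth + 1 }),
  rollout: (s) => (s.pos + 3) / 6,
};
const bestMove = search(walkEnv, { pos: 0, depth: 0 }, 300);
```

Choosing the most visited child instead of the highest mean value is a common convention because visit counts are less noisy; whether OpenRay does the same is not stated here.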
The scoring function is where web search becomes practical. A good score is not just about reaching a final page. It is also about consistent forward movement.
Common signals include
- query match
  The page contains more goal relevant text than before
- navigation progress
  The agent moved from a listing to a detail page or from a landing page to results
- interaction confirmation
  A click opened a menu, a filter applied, or a new section loaded
- novelty
  The DOM state is meaningfully different from what has already been explored
You can start simple and then tune this over time based on your use case.
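A simple starting point is a weighted sum of the four signals. The sketch below is an assumption about how such a score could be combined, not OpenRay's scoring function; `StepSignals`, `ScoreWeights`, and the weight values are hypothetical placeholders to tune.

```typescript
// Hypothetical per-step observations; field names are illustrative.
interface StepSignals {
  queryMatchBefore: number;      // fraction of goal terms on the previous page, 0..1
  queryMatchAfter: number;       // fraction of goal terms on the new page, 0..1
  navigationProgress: boolean;   // e.g. listing to detail page
  interactionConfirmed: boolean; // e.g. a filter applied
  novelState: boolean;           // state not explored before
}

interface ScoreWeights {
  queryMatch: number;
  navigation: number;
  interaction: number;
  novelty: number;
}

// Placeholder weights, chosen only for illustration.
const exampleWeights: ScoreWeights = { queryMatch: 0.4, navigation: 0.3, interaction: 0.2, novelty: 0.1 };

// Fraction of goal terms that appear in the page text (case-insensitive substring match).
function queryMatch(goalTerms: string[], pageText: string): number {
  if (goalTerms.length === 0) return 0;
  const haystack = pageText.toLowerCase();
  const hits = goalTerms.filter((t) => haystack.indexOf(t.toLowerCase()) !== -1).length;
  return hits / goalTerms.length;
}

// Weighted sum normalized to [0, 1]; only improvements in query match are rewarded.
function scoreStep(s: StepSignals, w: ScoreWeights = exampleWeights): number {
  const gain = Math.max(0, s.queryMatchAfter - s.queryMatchBefore);
  const raw =
    w.queryMatch * gain +
    w.navigation * (s.navigationProgress ? 1 : 0) +
    w.interaction * (s.interactionConfirmed ? 1 : 0) +
    w.novelty * (s.novelState ? 1 : 0);
  const total = w.queryMatch + w.navigation + w.interaction + w.novelty;
  return total > 0 ? raw / total : 0;
}
```

Keeping the result in a fixed range makes it easy to feed into the backpropagation step of a tree search without rescaling.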
OpenRay is best when there are many plausible next actions and only a few lead to useful information.
Great fits
- product or place searches with filters
- forms where the next field depends on prior input
- multi page research where the agent must gather evidence
- sites that change layout but keep stable DOM semantics
Less ideal fits
- highly visual tasks where the DOM has little meaning
- pages that block automation or require heavy auth flows
This repository is a monorepo. You will usually see folders like
- src
  Core planning logic and integration glue
- extensions
  Optional add ons that hook into the main runtime
- ui
  Tools for inspecting runs and debugging agent behavior
- docs
  Notes, references plus design write ups
Exact folder names may evolve, but the core concept stays the same.
Prerequisites
- Node installed
- pnpm installed
Setup
- Clone the repo using your preferred Git client
- Install dependencies
  Run pnpm install
- Build the project
  Run pnpm build
- Start the runtime
  Run pnpm start
A typical loop looks like this
- Give a goal
  Example: find the best matching page for a query and extract key facts
- OpenRay parses the DOM and enumerates actions
- MCTS explores short action sequences
- OpenRay executes the top action
- Repeat until done
If you are integrating OpenRay into another agent stack, treat it as a planner. It should decide the next browser action, not write the final user facing summary.
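To make the planner role concrete, here is a minimal sketch of the outer loop with explicit stop rules. The `Browser` and `Planner` interfaces and the `runLoop` function are hypothetical names for illustration; a real browser driver would be asynchronous, and the sketch is synchronous only to stay short.

```typescript
// Hypothetical interfaces; OpenRay's real integration points may look different.
interface Browser<S, A> {
  observe(): S;             // current compact page state
  execute(action: A): void; // perform one browser action
}

interface Planner<S, A> {
  nextAction(state: S): A | null; // null means no useful action was found
}

type StopReason = "goal_met" | "step_limit" | "no_action";

// Observe, check the goal, ask the planner for one action, execute, repeat.
function runLoop<S, A>(
  browser: Browser<S, A>,
  planner: Planner<S, A>,
  isGoalMet: (state: S) => boolean,
  maxSteps: number,
): { reason: StopReason; steps: number } {
  for (let steps = 0; steps < maxSteps; steps++) {
    const state = browser.observe();
    if (isGoalMet(state)) return { reason: "goal_met", steps };
    const action = planner.nextAction(state);
    if (action === null) return { reason: "no_action", steps };
    browser.execute(action);
  }
  // Check once more so a goal reached on the last allowed step is not missed.
  const reason: StopReason = isGoalMet(browser.observe()) ? "goal_met" : "step_limit";
  return { reason, steps: maxSteps };
}
```

The planner only returns the next action; anything that turns the final page into a user facing answer would live outside this loop.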
OpenRay is designed to be tunable. Useful knobs include
- search budget
  Maximum expansions or rollouts per step
- depth limit
  Maximum action sequence depth before a rollout ends
- exploration strength
  How aggressively the tree tries untested actions
- cache policy
  How long to remember DOM states and extracted facts
- safety gates
  Blocklist actions like purchase flows or destructive clicks
Keep the defaults conservative, then widen the budget when you trust the behavior.
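The knobs above could be grouped into a single config object, with safety gates applied as a filter over candidate actions. This is a sketch under assumptions: the field names, default values, and blocklist patterns below are hypothetical and are not OpenRay's actual configuration schema.

```typescript
// Hypothetical config shape mirroring the knobs listed above.
interface PlannerConfig {
  searchBudget: number;        // max expansions or rollouts per step
  depthLimit: number;          // max action sequence depth per rollout
  explorationStrength: number; // exploration constant for the tree policy
  cacheTtlMs: number;          // how long to remember DOM states and facts
  blockedPatterns: RegExp[];   // safety gates matched against action labels
}

// Placeholder values chosen to be conservative; tune for your own use case.
const conservativeDefaults: PlannerConfig = {
  searchBudget: 50,
  depthLimit: 4,
  explorationStrength: 1.0,
  cacheTtlMs: 60000,
  blockedPatterns: [/\b(buy|purchase|checkout|pay)\b/i, /\b(delete|remove|unsubscribe)\b/i],
};

// Hypothetical candidate action with a human readable label.
interface CandidateAction {
  kind: string;
  label: string;
}

// Merge overrides on top of the conservative defaults.
function withOverrides(overrides: Partial<PlannerConfig>): PlannerConfig {
  return { ...conservativeDefaults, ...overrides };
}

// Drop any action whose label matches a blocked pattern.
function applySafetyGates(actions: CandidateAction[], config: PlannerConfig): CandidateAction[] {
  return actions.filter((a) => !config.blockedPatterns.some((p) => p.test(a.label)));
}
```

Label matching is a crude gate; in practice you would likely also check URLs and form targets before trusting an action.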
MCTS based agents are only as good as their observability.
Recommended debug outputs
- chosen action plus top alternatives
- per node visit counts and value estimates
- detected page landmarks like search boxes and result cards
- a replay log of DOM snapshots or hashed states
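The first two debug outputs can be produced directly from the root's child statistics. The sketch below assumes a hypothetical `ChildStats` record per root action; the names and the ranking rule (visits first, then mean value) are illustrative choices.

```typescript
// Hypothetical per-action statistics collected at the root of the search tree.
interface ChildStats {
  action: string;
  visits: number;
  totalValue: number;
}

interface RankedAction {
  action: string;
  visits: number;
  meanValue: number;
}

interface DecisionLog {
  chosen: string | null;
  alternatives: RankedAction[];
}

// Rank actions by visit count (ties broken by mean value) and keep the top alternatives.
function logDecision(children: ChildStats[], topK: number = 3): DecisionLog {
  const ranked: RankedAction[] = children
    .map((c) => ({
      action: c.action,
      visits: c.visits,
      meanValue: c.visits > 0 ? c.totalValue / c.visits : 0,
    }))
    .sort((a, b) => b.visits - a.visits || b.meanValue - a.meanValue);
  const [first, ...rest] = ranked;
  return { chosen: first ? first.action : null, alternatives: rest.slice(0, topK) };
}
```

Writing one such record per step gives a compact trace that can be diffed between runs.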
For evaluation, measure
- time to first relevant result
- number of page loads
- token usage for model calls
- success rate on a fixed task set
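These four measurements can be aggregated over a fixed task set with a few lines of code. The `RunRecord` shape below is a hypothetical log format, assumed for illustration; adapt the fields to whatever your runs actually record.

```typescript
// Hypothetical record of one run on one task.
interface RunRecord {
  success: boolean;
  msToFirstRelevant: number | null; // null if no relevant result was found
  pageLoads: number;
  tokens: number;
}

interface Summary {
  runs: number;
  successRate: number | null;
  meanMsToFirstRelevant: number | null;
  meanPageLoads: number | null;
  meanTokens: number | null;
}

// Arithmetic mean, or null when there is nothing to average.
function mean(values: number[]): number | null {
  if (values.length === 0) return null;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Aggregate the evaluation metrics over a set of runs.
function summarize(records: RunRecord[]): Summary {
  const times: number[] = [];
  for (const r of records) {
    if (r.msToFirstRelevant !== null) times.push(r.msToFirstRelevant);
  }
  return {
    runs: records.length,
    successRate: mean(records.map((r) => (r.success ? 1 : 0))),
    meanMsToFirstRelevant: mean(times), // averaged only over runs that found something
    meanPageLoads: mean(records.map((r) => r.pageLoads)),
    meanTokens: mean(records.map((r) => r.tokens)),
  };
}
```

Averaging time to first relevant result only over runs that found one avoids mixing in failures, so read it together with the success rate.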