The ProcThor Agent is designed to navigate within simulated environments using a Vision Language Model (VLM). The system employs a structured interaction loop where the model receives visual observations and executes specific tool calls.
The interaction follows a strict template where the User acts as the environment interface, providing:
- The Goal (e.g., "Visit all rooms").
- A history of recent actions and observations.
- The current visual observation (RGB image).
The Assistant (the VLM) then analyzes the visual input and context to determine the next best action, and outputs a structured function call.
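For concreteness, here is a minimal sketch of one iteration of that loop against the OpenAI chat-completions API. The helper name `build_user_message`, the prompt layout, and the model string are illustrative assumptions, not the repository's actual implementation; the `tools` argument takes function schemas like the ones sketched after the tool list below.

```python
import base64
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def build_user_message(goal: str, history: str, frame_png: bytes) -> dict:
    """Assemble the User turn: goal, recent action/observation history,
    and the current RGB frame encoded as a base64 data URL."""
    image_b64 = base64.b64encode(frame_png).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": f"Goal: {goal}\nHistory: {history}"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }


def next_action(goal: str, history: str, frame_png: bytes, tools: list):
    """One step of the loop: send the observation, return the tool call."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; substitute the model you use
        messages=[build_user_message(goal, history, frame_png)],
        tools=tools,
        tool_choice="required",  # force a structured function call
    )
    call = resp.choices[0].message.tool_calls[0]
    return call.function.name, json.loads(call.function.arguments)
```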
The agent is equipped with a precise set of tools to manipulate its position and orientation. These tools are defined with specific arguments to ensure deterministic control over the agent's movement.
- Navigation: Moves the agent in cardinal directions (`Ahead`, `Back`, `Left`, `Right`) with configurable magnitude (0.1 to 1.0).
- Rotation: Rotates the view (`Left`, `Right`) by fixed degrees (15, 30, 45, 90).
- Done: Signals the completion of the task.
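Declared as OpenAI-style function schemas, the three tools might look like the sketch below. The tool names (`navigate`, `rotate`, `done`) and argument fields are assumptions inferred from the descriptions above, not the repository's canonical definitions.

```python
# Illustrative tool schemas; names and fields are assumptions, not the
# repository's canonical definitions.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "navigate",
            "description": "Move the agent relative to its current heading.",
            "parameters": {
                "type": "object",
                "properties": {
                    "direction": {"type": "string",
                                  "enum": ["Ahead", "Back", "Left", "Right"]},
                    "magnitude": {"type": "number",
                                  "minimum": 0.1, "maximum": 1.0},
                },
                "required": ["direction", "magnitude"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "rotate",
            "description": "Rotate the agent's view in place.",
            "parameters": {
                "type": "object",
                "properties": {
                    "direction": {"type": "string", "enum": ["Left", "Right"]},
                    "degrees": {"type": "integer", "enum": [15, 30, 45, 90]},
                },
                "required": ["direction", "degrees"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "done",
            "description": "Signal that the task is complete.",
            "parameters": {"type": "object", "properties": {}},
        },
    },
]
```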
- Python 3.12
- Run `pip install -r requirements.txt` to install dependencies.
- Run `cp .env.example .env` to create the environment file, then add your API key: `OPENAI_API_KEY=TOKEN`.
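To confirm the key is picked up correctly, a minimal check, assuming the scripts load `.env` via python-dotenv (the repository may wire this up differently):

```python
# Sanity check: load .env and confirm the key is visible.
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set in .env"
```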
Run one of the entry-point scripts:

```bash
python scripts/interactive_wasd.py
python scripts/ai_agent.py
python scripts/ai_agent_chunked.py
```

Evaluate agent navigation performance on ProcTHOR environments:
```bash
python scripts/create_benchmark_dataset.py --num 50 --split test --output benchmark_dataset.jsonl
python scripts/run_benchmark.py --benchmark benchmark_dataset.jsonl --max_steps 70
```

Results are saved to `benchmark_results.jsonl`, and trajectory visualizations to `benchmark_visualizations/`.
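For a quick look at a finished run, the output file can be summarized in a few lines of Python. The field names `success` and `steps` are assumptions about the record schema, not documented fields:

```python
# Summarize benchmark_results.jsonl; field names are assumed, not documented.
import json

with open("benchmark_results.jsonl") as f:
    records = [json.loads(line) for line in f]

successes = [r for r in records if r.get("success")]
print(f"episodes: {len(records)}, success rate: {len(successes) / len(records):.1%}")
if successes:
    mean_steps = sum(r["steps"] for r in successes) / len(successes)
    print(f"mean steps over successful episodes: {mean_steps:.1f}")
```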
Analyze execution logs, calculate metrics (redundancy, blocked actions), and generate annotated videos of agent performance.
```bash
python scripts/analyze_benchmark_logs.py benchmark_results/<benchmark_run_directory>
```

This generates `result_detailed.json` and videos in `analysis_videos/` within the run directory.
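To make the two metric names concrete, here is one plausible reading of them, assuming each log record carries an `action` string and a `blocked` flag; the analyzer's actual schema and exact definitions may differ:

```python
# Illustrative metric definitions; record fields ("action", "blocked")
# and the file path are assumptions for this sketch.
import json

def redundancy(actions: list[str]) -> float:
    """Fraction of steps that repeat the immediately preceding action."""
    repeats = sum(a == b for a, b in zip(actions, actions[1:]))
    return repeats / max(len(actions) - 1, 1)

def blocked_rate(records: list[dict]) -> float:
    """Fraction of actions the environment rejected (e.g., hit a wall)."""
    return sum(bool(r.get("blocked")) for r in records) / max(len(records), 1)

with open("run_log.jsonl") as f:  # hypothetical per-run log file
    records = [json.loads(line) for line in f]

actions = [r["action"] for r in records]
print(f"redundancy: {redundancy(actions):.2f}, "
      f"blocked rate: {blocked_rate(records):.2f}")
```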