johanity/openai-cua-sample-lite

GPT-5.4 CUA Sample Lite

Watch GPT-5.4 control a browser in real time. Pick a demo, paste your API key, and go.

Forked from openai/openai-cua-sample-app with a simplified UI, scenario carousel, and bring-your-own-key flow.

It offers a simpler, more accessible way to view the three original examples.

Quick Start

git clone https://github.com/johanity/openai-cua-sample-lite.git
cd openai-cua-sample-lite
corepack enable && pnpm install
pnpm dev

Open http://127.0.0.1:3000, click Set API Key, paste your OpenAI key, and pick a demo.

Linux only: run pnpm playwright:install:with-deps after install if the default Playwright setup fails.

Using a .env file instead

cp .env.example .env
# edit .env → set OPENAI_API_KEY=sk-...
pnpm dev

When the runner has a key configured, the browser skips the API key prompt.

Demos

Demo             What the model does
Launch Planner   Drags cards across a kanban board to match a target layout
Sketch Studio    Draws pixel art on a paint canvas using color swatches
Northstar Stays  Fills out a hotel booking form and confirms the reservation

Each demo launches an isolated browser, runs the model against a local lab app, and streams screenshots + activity back to the console in real time.

How It Works

Browser (Next.js)  ←→  Runner (Fastify)  ←→  GPT-5.4 Responses API
     ↕                      ↕
  Screenshots            Playwright
  Activity feed          Browser sessions
  API key modal          Lab workspaces

The runner starts a Playwright browser, sends the scenario prompt to the Responses API, and executes the model's tool calls against the browser. The web console connects via SSE and displays live screenshots and an activity trace.
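To make the SSE leg of that flow concrete, here is a minimal sketch of how a console could split a `text/event-stream` buffer into events. It follows the standard SSE framing (`event:` and `data:` fields, with a blank line ending each event) and assumes LF line endings. The function name `parseSse` and the event names `activity` and `screenshot` are hypothetical; this README does not document the runner's actual event names or payloads.

```typescript
// One parsed Server-Sent Event.
interface SseEvent {
  event: string; // "message" when the frame has no event: field
  data: string;  // multiple data: lines are joined with "\n"
}

// Parse every complete frame in `buffer` and return the unfinished tail,
// which the caller should prepend to the next chunk it receives.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const frames = buffer.split("\n\n");
  const rest = frames.pop() ?? ""; // last piece is incomplete (or empty)
  const events: SseEvent[] = [];

  for (const frame of frames) {
    let event = "message";
    const dataLines: string[] = [];

    for (const line of frame.split("\n")) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }

    // Frames without any data: line carry nothing to display, so skip them.
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}
```

In a browser, the built-in `EventSource` API handles this framing for you; a hand-written parser like this is mainly useful when reading the stream through `fetch`.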

Two execution modes:

  • code — the model writes JavaScript that runs in a Playwright REPL
  • native — the model issues raw clicks, drags, and keystrokes via the computer tool
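As a rough illustration of native mode, the sketch below translates a high-level action into primitive pointer and keyboard steps. The `NativeAction` and `Step` shapes and the `planNative` function are hypothetical and chosen for illustration only; the real computer tool payloads and the runner's executor may differ.

```typescript
// Hypothetical high-level actions a model might request in native mode.
type NativeAction =
  | { type: "click"; x: number; y: number }
  | { type: "drag"; path: { x: number; y: number }[] }
  | { type: "keypress"; keys: string[] }
  | { type: "type"; text: string };

// Primitive steps that a browser driver can replay one by one.
type Step =
  | { op: "move"; x: number; y: number }
  | { op: "down" }
  | { op: "up" }
  | { op: "press"; key: string }
  | { op: "type"; text: string };

// Expand one action into an ordered list of primitive steps.
function planNative(action: NativeAction): Step[] {
  switch (action.type) {
    case "click":
      return [
        { op: "move", x: action.x, y: action.y },
        { op: "down" },
        { op: "up" },
      ];
    case "drag": {
      const [start, ...rest] = action.path;
      if (!start) return []; // an empty path has nothing to drag
      const moves: Step[] = rest.map((p) => ({ op: "move", x: p.x, y: p.y }));
      return [
        { op: "move", x: start.x, y: start.y },
        { op: "down" },
        ...moves,
        { op: "up" },
      ];
    }
    case "keypress":
      return action.keys.map((key): Step => ({ op: "press", key }));
    case "type":
      return [{ op: "type", text: action.text }];
  }
}
```

With Playwright, each step could be replayed through `page.mouse.move`, `page.mouse.down`, `page.mouse.up`, `page.keyboard.press`, and `page.keyboard.type`. Keeping the planning step pure makes it easy to unit test without launching a browser.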

Project Structure

apps/demo-web/       Next.js operator console
apps/runner/         Fastify runner + SSE + artifact serving
packages/
  replay-schema/     Shared TypeScript contracts
  scenario-kit/      Scenario manifests and prompts
  browser-runtime/   Playwright session wrapper
  runner-core/       Responses loop, executors, verification
labs/                HTML lab templates (kanban, paint, booking)

Development

# Run services separately
pnpm dev:runner
pnpm dev:web

# Quality checks
pnpm check          # lint + typecheck + test + build
pnpm test:live      # requires OPENAI_API_KEY

Environment Variables

All optional — the app works with just a browser-entered API key.

Variable           Default                 Notes
OPENAI_API_KEY     (none)                  Set in .env or enter in browser
HOST               127.0.0.1               Runner bind address
PORT               4001                    Runner port
CUA_DEFAULT_MODEL  gpt-5.4                 Model for Responses API
RUNNER_BASE_URL    http://127.0.0.1:4001   Web → runner connection

See .env.example for the full list.
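For reference, the defaults listed in this section could be resolved along these lines. This is a sketch only: `RunnerConfig` and `loadConfig` are hypothetical names, and the repository's actual configuration code may be structured differently. Empty values are treated as unset, which is an assumption.

```typescript
// Hypothetical shape for the resolved settings.
interface RunnerConfig {
  apiKey: string | undefined; // undefined means the browser must supply a key
  host: string;
  port: number;
  defaultModel: string;
  runnerBaseUrl: string;
}

// Resolve settings from an environment-like object, applying the defaults
// documented in this README. Pass process.env when running under Node.
function loadConfig(env: Record<string, string | undefined>): RunnerConfig {
  const rawPort = env["PORT"] || "4001";
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }
  return {
    apiKey: env["OPENAI_API_KEY"] || undefined,
    host: env["HOST"] || "127.0.0.1",
    port,
    defaultModel: env["CUA_DEFAULT_MODEL"] || "gpt-5.4",
    runnerBaseUrl: env["RUNNER_BASE_URL"] || "http://127.0.0.1:4001",
  };
}
```

Taking the environment as a parameter, rather than reading a global directly, keeps the function easy to test with plain objects.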

Safety

Computer use is high risk. This app runs against local lab apps only — do not point it at real websites, authenticated sessions, or anything sensitive. The demos are sandboxed and designed for deterministic verification, not general web autonomy.

License

MIT — see LICENSE.
