TSPlay

A browser automation execution engine for AI agents and delivery teams. It unifies observe -> draft -> validate -> run -> repair, session reuse, and explicit security controls into one workflow.

TSPlay is built with Go + Playwright and exposes three entry points: Lua CLI / Script, Flow DSL, and MCP Server.

It is not just a thin wrapper around browser actions. TSPlay organizes the capabilities that usually become fragmented in real delivery work:

  • page observation and interactive element extraction
  • structured Flow drafting, validation, and execution
  • failure traces, screenshots, HTML, and DOM snapshots
  • login state and browser session reuse
  • agent-facing MCP tools with explicit permission boundaries

If you want browser automation that can be maintained over time, generated by AI, reviewed by a team, and delivered reliably, TSPlay is designed for that goal.

Use Cases

  • Web RPA: login, search, click, upload, download, export
  • page data extraction: text, attributes, links, tables, HTML, cookies, storage state
  • business workflow automation: variables, control flow, assertions, recovery, resumable runs
  • agent browser tooling: let Codex, OpenClaw, and similar models observe a page before drafting, running, and repairing flows
  • browser plus external systems: HTTP APIs, Redis, CSV/Excel, databases

Why TSPlay Is Not "Just Another Playwright Wrapper"

  • Flow is versionable, reviewable, reusable, and structured enough for AI to generate against a strict schema
  • failures automatically preserve trace data, screenshots, HTML, and DOM snapshots for debugging and repair
  • sessions can be named, saved, and reused for long-term delivery instead of one-off scripts
  • MCP mode comes with security boundaries, which is much safer for agent-facing automation than exposing a raw browser directly

Three Ways to Use TSPlay

| Mode | Best for | Entry |
| --- | --- | --- |
| Lua CLI / Script | quick debugging, page exploration, one-off tasks | go run . -action cli / go run . -script ... |
| Flow DSL | versioned, reviewable, reusable business workflows that AI can generate | go run . -flow ... |
| MCP Server | exposing observe, draft, execute, repair, and session capabilities to agents | go run . -action srv |

For day-to-day delivery work, Flow should usually be your main path.
Use CLI to explore a page first, and MCP when you want to connect TSPlay to an AI product or agent workflow.

Capability Matrix

The goal of this matrix is not to force every capability into mechanical 1:1 parity. The point is to separate what should stay aligned across layers and what belongs naturally at the Flow level.

| Capability Area | Typical Actions | Flow | Lua | MCP | Recommendation |
| --- | --- | --- | --- | --- | --- |
| page primitives | navigate, click, type_text, select_option | Yes | Yes | Yes | Keep aligned |
| file and spreadsheet I/O | screenshot, save_html, read_csv, read_excel, write_json, write_csv | Yes | Yes | Yes | Keep aligned; constrained by allow_file_access in MCP |
| HTTP requests | http_request, json_extract | Yes | Yes | Yes | Keep aligned; Lua inside Flow / MCP also obeys allow_http, allow_file_access, and file-root constraints |
| Redis operations | redis_get, redis_set, redis_del, redis_incr | Yes | Yes | Yes | Keep aligned; Lua inside Flow / MCP also obeys allow_redis |
| database operations | db_insert, db_insert_many, db_upsert, db_query, db_query_one, db_execute, db_transaction | Yes | Yes | Yes | Keep aligned; Lua inside Flow / MCP also obeys allow_database, and db_transaction auto-commits or rolls back |
| browser state | get_storage_state, get_cookies_string, browser.use_session | Yes | Yes | Yes | Keep aligned; constrained by allow_browser_state in MCP |
| Flow convenience actions | extract_text, assert_visible, assert_text, set_var, append_var | Yes | Yes | Yes | Already aligned; better treated as orchestration sugar than low-level primitives |
| Flow control flow | retry, if, foreach, on_error, wait_until | Yes | No | Yes | No need to force parity into Lua |
| Lua callback-style capability | intercept_request | No | Yes | No | Best kept Lua-only |

Recommended rule of thumb:

  • Atomic data actions such as HTTP / Redis / database / file I/O should ideally work in both Flow and Lua, so exploration, productionization, and integration do not drift apart.
  • Orchestration capabilities such as retry / foreach / on_error / wait_until belong more naturally in Flow DSL and do not need to be translated into Lua extension functions.
  • Semantic actions such as extract_text / assert_text / assert_visible can start as higher-level Flow actions; if Lua users keep rebuilding the same patterns, then a Lua sugar layer is worth adding.

Quick Start

Requirements

  • Go 1.23.6+
  • an environment that can run Playwright Chromium
  • on the first browser-related run, TSPlay will call playwright.Install() automatically to download the browser

Install Dependencies

go mod download

Pick One Way to Start

| What you want to do | Command |
| --- | --- |
| start the interactive CLI | go run . -action cli |
| run a Lua script | go run . -script script/open_url.lua |
| run a Flow | go run . -flow script/demo_baidu.flow.yaml |
| start the built-in static file server | go run . -action file-srv -addr :8000 |
| call one TSPlay MCP tool directly | go run . -action mcp-tool -tool tsplay.list_actions |
| list macOS screen recording devices | go run . -action list-record-devices |
| record the entire desktop | go run . -action record-screen -record-cmd "go run . -flow script/tutorials/10_assert_page_state.flow.yaml" |
| record only browser-page video | go run . -flow script/tutorials/10_assert_page_state.flow.yaml -browser-video-output artifacts/recordings/lesson-10-assert-page-state.webm |
| list bundled assets inside the binary | go run . -action list-assets |
| extract bundled docs/script/demo assets | go run . -action extract-assets -extract-root ./tsplay-assets |
| start the MCP server | go run . -action srv |

Add -headless if you want to hide the browser window.

record-screen captures the entire macOS desktop, which is useful for desktop demos.
-browser-video-output records the Playwright page itself, which is better for browser tutorials.
TSPlay also keeps the page open slightly longer by default so short recordings remain watchable.
For the full instructor workflow, see docs/training/tutorial-video-recording.md.

When you build the ./tsplay binary, the following are bundled into it: ReadMe.md, docs/, script/, and demo/. This lets you:

  • run a bundled example directly: ./tsplay -script script/tutorials/01_hello_world.lua
  • run a bundled Flow directly: ./tsplay -flow script/tutorials/01_hello_world.flow.yaml
  • serve the bundled demo pages directly: ./tsplay -action file-srv -addr :8000
  • extract the reference assets to a local directory: ./tsplay -action extract-assets -extract-root ./tsplay-assets

Run a Flow First

The repository already includes a minimal example:

go run . -flow script/demo_baidu.flow.yaml

The Flow looks roughly like this:

schema_version: "1"
name: baidu_search
vars:
  query: 山东大学
steps:
  - action: navigate
    url: https://www.baidu.com

  - action: wait_for_selector
    selector: "#kw"
    timeout: 5000

  - action: type_text
    selector: "#kw"
    text: "{{query}}"

  - action: click
    selector: "#su"

  - action: wait_for_network_idle

  - action: get_all_links
    selector: "xpath=//body"
    save_as: links

The run returns structured JSON with variables, step traces, durations, and failure artifact paths when something goes wrong.

Explore with the CLI

Start the CLI:

go run . -action cli

Then enter:

start

After that you can execute Lua-style commands directly:

navigate("https://www.baidu.com")
wait_for_network_idle()
type_text("#kw", "山东大学")
click("#su")

Start the MCP Server

go run . -action srv
go run . -action srv -addr :8081
go run . -action srv -flow-root script -artifact-root artifacts
go run . -action mcp-stdio -flow-root script -artifact-root artifacts
go run . -action mcp-tool -tool tsplay.list_actions
go run . -action mcp-tool -tool tsplay.observe_page -args-file script/tutorials/113_mcp_observe_page_template_release.args.json

Default constraints:

  • flow_path can only read files under script/ unless you change it with -flow-root
  • file input and output are limited to the artifact root by default
  • run_flow defaults to headless=true

If you want to start from user intent and let the model help draft and run a Flow, begin with docs/training/ai-intent-to-flow.md.

Why Flow Should Be the Main Path

Compared with raw Lua, Flow works better as a long-lived business asset:

  • easier for AI to generate within a strict structure
  • easier for humans to review and diff
  • easier to validate with a schema and return structured issues
  • easier to preserve failure context and repair hints
  • easier to expose through MCP to agents

Common Flow capabilities include:

  • variables: vars, save_as, set_var, append_var
  • control flow: retry, if, foreach, on_error, wait_until
  • page actions: click, type, wait, assert, screenshot, upload, download
  • data actions: http_request, json_extract, read_csv, read_excel, write_json, write_csv
  • browser state: use_session, storage_state, save_storage_state

Core Capabilities

  • Chromium automation driven by Playwright
  • direct browser control through Lua
  • structured Flow in YAML / JSON
  • page observation, Flow drafting, validation, execution, and failure repair
  • named browser session save, reuse, and export
  • data actions such as redis_get/set/del/incr and db_insert/db_query/db_transaction
  • automatic failure artifacts including screenshots, HTML, and DOM snapshots
  • explicit security boundaries and capability grants through MCP

Execution Output and Failure Artifacts

Every Flow run returns a step trace that usually includes:

  • action
  • parameter summary
  • status
  • duration_ms
  • output summary
  • current page_url

When a step fails, TSPlay saves the failure context into the artifact root, which defaults to artifacts/.
Common files include:

  • failure.png
  • page.html
  • dom_snapshot.json

You can change the output directory with:

go run . -flow script/demo_baidu.flow.yaml -artifact-root artifacts

MCP / Agent Integration

TSPlay can run as an MCP server so an agent does not need to read a full HTML page or hand-author selectors directly.

If you want a shorter default path for the model, prefer tsplay.finalize_flow.
If you need finer-grained control, use the full chain: observe_page -> draft_flow -> validate_flow -> run_flow -> repair_flow_context -> repair_flow.

Common tsplay.finalize_flow statuses:

  • ready: the Flow can run immediately
  • needs_input: more variables or user input are required
  • needs_permission: a security boundary was hit and additional permission is required
  • needs_repair: the Flow is close, but should be fixed before running

MCP Tool Groups

| Group | Tools |
| --- | --- |
| Flow discovery | tsplay.list_actions, tsplay.flow_schema, tsplay.flow_examples |
| page observation and drafting | tsplay.observe_page, tsplay.draft_flow, tsplay.finalize_flow |
| validation, execution, and repair | tsplay.validate_flow, tsplay.run_flow, tsplay.repair_flow_context, tsplay.repair_flow |
| session management | tsplay.save_session, tsplay.list_sessions, tsplay.get_session, tsplay.export_session_flow_snippet, tsplay.delete_session |

Recommended Call Order

If you want a shorter default path for smaller models or productized integrations, start with:

  1. tsplay.finalize_flow
  2. if status=ready, run tsplay.run_flow
  3. if status=needs_permission, grant permissions and call tsplay.finalize_flow again
  4. if status=needs_input, provide the missing input and call tsplay.finalize_flow again
  5. if status=needs_repair, move into tsplay.validate_flow / tsplay.repair_flow_context / tsplay.repair_flow
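On the client side, the short path above amounts to a dispatch on the returned status. The Go sketch below shows one way to express it. The status strings and tool names are taken from the list above; the function name nextStep and the wording of the returned hints are made up for this example, and how an agent actually invokes the tools is left out.

```go
package main

import "fmt"

// nextStep maps a tsplay.finalize_flow status to the follow-up described in
// the numbered list above. It is a client-side helper for illustration only,
// not part of TSPlay.
func nextStep(status string) string {
	switch status {
	case "ready":
		return "call tsplay.run_flow"
	case "needs_permission":
		return "grant permissions, then call tsplay.finalize_flow again"
	case "needs_input":
		return "provide the missing input, then call tsplay.finalize_flow again"
	case "needs_repair":
		return "use tsplay.validate_flow, tsplay.repair_flow_context, tsplay.repair_flow"
	default:
		// Unknown statuses are not described in the README; stop and inspect.
		return "unrecognized status: inspect the response"
	}
}

func main() {
	for _, s := range []string{"ready", "needs_permission", "needs_input", "needs_repair"} {
		fmt.Printf("%-17s -> %s\n", s, nextStep(s))
	}
}
```

Keeping the default branch explicit means a status added in a later version fails visibly instead of being treated as ready.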

If you need finer control, use the full path:

  1. tsplay.flow_schema
  2. tsplay.flow_examples
  3. tsplay.observe_page
  4. tsplay.draft_flow
  5. tsplay.validate_flow
  6. tsplay.run_flow
  7. tsplay.repair_flow_context / tsplay.repair_flow
  8. tsplay.save_session

The golden-path tools try to return a unified envelope whose top-level fields usually include:

  • ok
  • tool
  • summary
  • artifacts
  • next_action
  • warnings
  • run

Flow Authoring Tips

  • use type_text, not fill
  • use save_as, not result_var
  • for file I/O, uploads, downloads, and screenshots in MCP mode, pass allow_file_access=true or use security_preset=browser_write
  • set page-level timeout in browser.timeout instead of adding an unsupported timeout directly to navigate
  • if you are unsure about an action name, check tsplay.list_actions and tsplay.flow_schema first

Browser Sessions and Top-Level Flow Config

If a business flow depends on login state, put browser config at the top of the Flow instead of scattering it across individual steps:

schema_version: "1"
name: admin_orders
browser:
  headless: true
  use_session: admin
  save_storage_state: states/admin-latest.json
  timeout: 30000
  viewport:
    width: 1440
    height: 900
steps:
  - action: navigate
    url: https://example.com/admin/orders

  - action: assert_visible
    selector: "#orders-table"
    timeout: 10000

Common browser fields:

  • headless
  • use_session
  • storage_state / storage_state_path / load_storage_state
  • save_storage_state
  • persistent
  • profile
  • session
  • timeout
  • user_agent
  • viewport.width / viewport.height

Notes:

  • use_session expands automatically from a named session saved with tsplay.save_session
  • save_storage_state saves the current login state after the Flow finishes
  • profile / session enables a persistent browser context
  • persistent profile/session cannot be combined with storage_state or use_session

If you want business users to remember only a single session name, save one first:

{
  "name": "admin",
  "storage_state_path": "states/admin.json"
}

Then write this in later Flows:

browser:
  use_session: admin

Security Boundaries

MCP mode is not fully open by default. High-risk capabilities require explicit permission per request.

security_preset

  • readonly: default minimum permissions
  • browser_write: enables file I/O and browser-state capabilities, which is suitable for upload, download, screenshots, and storage-state reuse
  • full_automation: enables all MCP security capabilities

Explicit allow_* flags override the corresponding fields inside security_preset.
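The override rule can be pictured as two passes: start from the preset, then apply any explicit flags on top. The Go sketch below models that under stated assumptions. It treats readonly as all flags off, which the README only describes as "default minimum permissions", and it uses a hypothetical Permissions struct and resolve function that are not TSPlay's internal types. The flag names match the permission flags documented in this README.

```go
package main

import "fmt"

// Permissions is a hypothetical model of the allow_* flags.
type Permissions struct {
	Lua, JavaScript, FileAccess, BrowserState, HTTP, Redis, Database bool
}

// fromPreset returns the starting permissions for a preset. Treating
// readonly (and an empty preset) as "everything off" is an assumption.
func fromPreset(preset string) (Permissions, error) {
	switch preset {
	case "", "readonly":
		return Permissions{}, nil
	case "browser_write":
		return Permissions{FileAccess: true, BrowserState: true}, nil
	case "full_automation":
		return Permissions{Lua: true, JavaScript: true, FileAccess: true,
			BrowserState: true, HTTP: true, Redis: true, Database: true}, nil
	}
	return Permissions{}, fmt.Errorf("unknown security_preset %q", preset)
}

// resolve applies explicit allow_* flags on top of the preset, so an
// explicit value wins whether it grants or revokes a capability.
func resolve(preset string, explicit map[string]bool) (Permissions, error) {
	p, err := fromPreset(preset)
	if err != nil {
		return Permissions{}, err
	}
	for flag, v := range explicit {
		switch flag {
		case "allow_lua":
			p.Lua = v
		case "allow_javascript":
			p.JavaScript = v
		case "allow_file_access":
			p.FileAccess = v
		case "allow_browser_state":
			p.BrowserState = v
		case "allow_http":
			p.HTTP = v
		case "allow_redis":
			p.Redis = v
		case "allow_database":
			p.Database = v
		default:
			return Permissions{}, fmt.Errorf("unknown flag %q", flag)
		}
	}
	return p, nil
}

func main() {
	p, err := resolve("browser_write", map[string]bool{
		"allow_http":          true,  // added on top of the preset
		"allow_browser_state": false, // revoked despite the preset
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```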

Common Permission Flags

| Permission Flag | Allows |
| --- | --- |
| allow_lua=true | lua |
| allow_javascript=true | execute_script, evaluate |
| allow_file_access=true | screenshot, save_html, read_csv, read_excel, upload/download, write_json, write_csv |
| allow_browser_state=true | cookies / storage state / browser.use_session / persistent profile |
| allow_http=true | http_request |
| allow_redis=true | redis_get, redis_set, redis_del, redis_incr, foreach.with.progress_key |
| allow_database=true | db_insert, db_insert_many, db_upsert, db_query, db_query_one, db_execute, db_transaction |

Additional notes:

  • even when file actions are allowed, reads and writes still stay within the artifact root
  • relative paths inside top-level browser config are also resolved under the artifact root
  • http_request, redis_*, and db_* inside Lua inherit the corresponding allow_* constraints when running inside Flow / MCP security contexts
  • local CLI runs such as go run . -flow ... remain a more flexible local workflow

External System Integrations

TSPlay can place browser automation and data actions inside the same Flow.

HTTP

You can call external APIs directly with http_request and continue the orchestration with json_extract.
Typical examples:

  • OCR or captcha recognition
  • internal lookup or patch APIs
  • webhook or notification endpoints

Additional notes:

  • both Flow and Lua support http_request
  • when Lua http_request runs inside a Flow / MCP security context, it also obeys allow_http
  • if http_request uses save_path or multipart_files, Lua follows the same allow_file_access constraints as Flow
  • in restricted mode, relative paths in save_path and multipart_files resolve under the configured file root

Redis

Redis works well for shared cookies, cursors, deduplication keys, and resumable checkpoints.

Environment variable conventions:

  • default connection: TSPLAY_REDIS_*
  • named connection: TSPLAY_REDIS_<NAME>_*
  • URL form is also supported: TSPLAY_REDIS_URL, TSPLAY_REDIS_<NAME>_URL

CSV / Excel

Useful for batch import, chunked execution, and writing results back into a ledger.

  • read_csv treats the first non-empty row as the header by default
  • read_excel currently supports .xlsx
  • read_excel.range supports rectangular ranges such as A2:B20
  • you can combine with.start_row, with.limit, and with.row_number_field for resumable processing
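The idea behind chunked, resumable reads can be sketched in Go with the standard encoding/csv package. This is an illustration of the technique, not how read_csv is implemented. The sketch assumes start_row is 1-based and counts data rows after the header, that limit 0 means no limit, and that the row number is stored as a string under the configured field name; the real semantics of with.start_row, with.limit, and with.row_number_field may differ. The function name readChunk is hypothetical.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// sampleCSV starts with a blank line to show that the first non-empty row
// becomes the header (encoding/csv skips empty lines).
const sampleCSV = "\nid,name\n1,alpha\n2,beta\n3,gamma\n4,delta\n"

// readChunk returns at most limit data rows starting at startRow, each as a
// header-keyed map with its row number stored under rowNumberField.
func readChunk(text string, startRow, limit int, rowNumberField string) ([]map[string]string, error) {
	records, err := csv.NewReader(strings.NewReader(text)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	header := records[0]
	var out []map[string]string
	for i, rec := range records[1:] {
		rowNum := i + 1 // 1-based over data rows (assumption)
		if rowNum < startRow {
			continue
		}
		if limit > 0 && len(out) >= limit {
			break
		}
		row := map[string]string{rowNumberField: strconv.Itoa(rowNum)}
		for j, col := range header {
			if j < len(rec) {
				row[col] = rec[j]
			}
		}
		out = append(out, row)
	}
	return out, nil
}

func main() {
	rows, err := readChunk(sampleCSV, 2, 2, "row")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for _, r := range rows {
		fmt.Println(r["row"], r["id"], r["name"])
	}
}
```

A resumed run would persist the last processed row number, for example in Redis, and pass the next value as the start row.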

Database

Useful when you want to persist structured output directly into tables or query business data during a Flow.

Environment variable conventions:

  • default connection: TSPLAY_DB_*
  • named connection: TSPLAY_DB_<NAME>_*

Common supported drivers:

  • mysql
  • pgsql
  • sqlserver
  • oracle

Notes:

  • both Flow and Lua support db_insert, db_insert_many, db_upsert, db_query, db_query_one, db_execute, and db_transaction
  • when Lua db_* or Lua db_transaction runs inside a Flow / MCP security context, it also requires allow_database=true
  • db_transaction executes inner database actions in one transaction scope, auto-commits on success, and auto-rolls back on failure
  • db_* actions require allow_database=true in MCP mode
  • SQL Server and Oracle require binaries built with the corresponding driver build tags

Documentation Map

This README covers project positioning and quick start. Training, enablement, and delivery-oriented materials live under docs/.

| Content | Description | Entry |
| --- | --- | --- |
| docs index | repository-wide documentation map and recommended reading order | docs/README.md |
| training overview | a single entry for implementers, testers, developers, and trainers | docs/training/README.md |
| AI intent to Flow | hands-on guide for the agent path from user intent to MCP to Flow to execution and repair | docs/training/ai-intent-to-flow.md |
| learning path | roadmap from beginner to MCP integrator / trainer | docs/training/learning-path.md |
| bootcamp plan | 2-day bootcamp and 4-week rollout rhythm | docs/training/bootcamp-plan.md |
| labs | hands-on exercises built on demo/ and script/ | docs/training/labs.md |
| assessment and certification | scoring dimensions, evidence standards, and graduation criteria | docs/training/assessment.md |
| trainer playbook | preparation, delivery, and retrospective guidance for instructors | docs/training/trainer-playbook.md |

Project Structure

.
├── main.go
├── docs/             # docs index, training system, labs, and trainer materials
├── tsplay_core/      # core engine, Flow, MCP, observation, and repair
├── script/           # Lua and Flow examples
├── demo/             # local demo pages
├── tsplay_test/      # tests and demo resources
└── mcp_test/         # MCP-related experiment code

Development and Testing

go test ./...

If this is your first time with TSPlay, this reading order works well:

  1. start with this page to understand the three layers and quick start
  2. continue with docs/README.md to locate the rest of the materials
  3. if you want to learn Flow delivery, focus on docs/training/learning-path.md
  4. if you want to integrate agents or MCP, focus on docs/training/ai-intent-to-flow.md