chrome-pilot

AI-controlled Chrome with real profile sync. Your AI copilot drives, you watch — and take the wheel when needed.

Why chrome-pilot?

Headless automation tools (Playwright, agent-browser) give AI a clean, empty browser. No logins, no cookies, no bookmarks. Great for scraping. Useless for real work.

chrome-pilot gives AI your real Chrome — with your profiles, passwords, bookmarks, and extensions. The browser is visible on screen so you can watch every step, and take over when needed (2FA, CAPTCHA, payment confirmation).

Headless tools:   AI works alone in a clean browser you can't see
chrome-pilot:     AI works in YOUR browser while you watch

How it's different

vs Playwright / Playwright MCP

| | Playwright | chrome-pilot |
|---|---|---|
| Browser | Bundled Chromium (clean) | Your real Chrome |
| Sessions | Lost on restart | Persist forever (profiles) |
| Passwords & bookmarks | | ✅ via Google Sync |
| Extensions | | |
| Re-login every time? | Yes | No |
| Profile management | | ✅ list / create / switch / delete |

vs agent-browser (Vercel)

| | agent-browser | chrome-pilot |
|---|---|---|
| Design | AI works alone (headless) | AI + human together (headed) |
| Browser | Chrome for Testing (clean) | Your real Chrome |
| MCP support | ❌ CLI only | ✅ Native MCP server |
| Profile management | Manual (--profile <path>) | Full CRUD (list / create / delete) |
| Google Sync | | |
| Human handoff | ❌ Can't intervene | ✅ You see everything, take over anytime |
| Extensions | | |
| Commands | 69 (automation-focused) | 25 (interaction-focused) |
| Token efficiency | ✅ 93% less context | ✅ 3-tier snap system |

When to use what

"Scrape 100 product pages"           → agent-browser (headless, fast, isolated)
"Help me submit my expense claim"    → chrome-pilot (needs your login, you confirm)
"Run E2E tests in CI"                → Playwright (clean environment, no profile)
"Book a flight on my account"        → chrome-pilot (needs your session, you approve)

chrome-pilot is built for tasks where AI needs your identity — and you need to stay in control.

Install

npm install @mcpware/chrome-pilot

Quick Start

As an MCP Server (for Claude, etc.)

Add to your MCP config:

{
  "mcpServers": {
    "chrome-pilot": {
      "command": "npx",
      "args": ["@mcpware/chrome-pilot"]
    }
  }
}

Then ask your AI:

  • "Open my personal Chrome and go to GitHub"
  • "Take a screenshot of the current page"
  • "List my Chrome profiles"

As a Library

import { launchProfile, listProfiles, closeProfile } from "@mcpware/chrome-pilot";

// Launch Chrome with a persistent profile
const { page, port } = await launchProfile("personal", {
  url: "https://github.com",
});

// First time? Chrome sign-in page appears.
// Sign into Google → Chrome Sync pulls your bookmarks, passwords, extensions.
// Next time? Everything is already there.

await page.screenshot({ path: "screenshot.png" });
await closeProfile("personal");
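
If the task between launch and close throws, the session in the example above would stay open. A minimal sketch of a wrapper that always closes it is shown here. The `PilotApi` and `Session` interfaces and the `withProfile` helper are hypothetical: they only mirror the two calls used above, and the real exports of `@mcpware/chrome-pilot` may have different or richer signatures.

```typescript
// Hypothetical minimal shape of the two library calls used in the example above.
// The real exports of @mcpware/chrome-pilot may carry more fields or options.
interface Session<P> {
  page: P;
  port: number;
}

interface PilotApi<P> {
  launchProfile(name: string, opts?: { url?: string }): Promise<Session<P>>;
  closeProfile(name: string): Promise<unknown>;
}

// Run `task` against a launched profile and always close the session afterwards,
// even if the task throws.
async function withProfile<P, T>(
  api: PilotApi<P>,
  name: string,
  url: string,
  task: (session: Session<P>) => Promise<T>,
): Promise<T> {
  const session = await api.launchProfile(name, { url });
  try {
    return await task(session);
  } finally {
    await api.closeProfile(name);
  }
}
```

Passing the API in as an argument keeps the helper easy to test with a fake. In real use you would pass an object holding the imported `launchProfile` and `closeProfile` functions, assuming their signatures match.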

MCP Tools

Profile Management

| Tool | Description |
|---|---|
| list_profiles | List all Chrome profiles with linked Google accounts |
| create_profile | Create a new profile |
| delete_profile | Delete a profile and its data |
| launch | Launch Chrome with a profile (auto-creates if needed) |
| close | Close a profile session |
| get_active | List running sessions with CDP ports |

Browser Control

| Tool | Description |
|---|---|
| navigate | Go to a URL |
| screenshot | Capture the page as PNG |
| click | Click an element by CSS selector |
| type | Type text into the focused element |
| fill | Fill a form field by CSS selector |
| eval | Run JavaScript in the page context |

3-Tier Snapshot System

Optimized for AI token efficiency with automatic fallback:

Tier 1:  snap          → Interactive elements only (@ref IDs). Fast, ~90% fewer tokens.
Tier 2:  snap_full     → Complete accessibility tree. More context when snap isn't enough.
Tier 3:  screenshot    → Full visual capture. Last resort, 100% accurate.

| Tool | Description |
|---|---|
| snap | Compact snapshot: only buttons, links, and inputs, with @ref IDs |
| snap_full | Full accessibility tree: all text, headings, landmarks |
| screenshot | Visual screenshot of the page |
| click_ref | Click element by @ref from snap (e.g. @e3) |
| fill_ref | Fill input by @ref from snap (e.g. @e1) |
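
One way a client might walk the three tiers is sketched below: start with the cheapest tool and escalate only when the result is not sufficient. The `CallTool` type, the `isEnough` predicate, and `readPage` are all hypothetical stand-ins. The source does not describe how the fallback is triggered or what the tools return, so treat this as an illustration of the idea rather than chrome-pilot's actual behavior.

```typescript
// Hypothetical tool-call function: takes an MCP tool name and resolves to its text output.
// How a real MCP client invokes tools will differ; this is only a stand-in.
type CallTool = (tool: string) => Promise<string>;

interface PageRead {
  tier: "snap" | "snap_full" | "screenshot";
  output: string;
}

// Try the cheapest tier first and escalate only when `isEnough` rejects the result.
async function readPage(
  callTool: CallTool,
  isEnough: (output: string) => boolean,
): Promise<PageRead> {
  for (const tier of ["snap", "snap_full"] as const) {
    const output = await callTool(tier);
    if (isEnough(output)) return { tier, output };
  }
  // Last resort: the visual capture.
  return { tier: "screenshot", output: await callTool("screenshot") };
}
```

The predicate is left to the caller because "enough context" depends on the task: finding a button may only need `snap`, while reading an article body likely needs `snap_full`.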

Tab Management

| Tool | Description |
|---|---|
| list_tabs | List all open tabs |
| new_tab | Open a new tab |

How It Works

  1. Uses your real Chrome (not a bundled automation browser)
  2. Headed by default — you see what the AI sees, take over anytime
  3. Persistent profiles stored in ~/.chrome-pilot/profiles/
  4. Chrome Sync brings your bookmarks, passwords, and extensions
  5. Multi-profile — run personal and work Chrome simultaneously on different ports
  6. MCP native — AI agents call tools directly, no CLI parsing
  7. 3-tier snapshots — AI reads the page efficiently, falls back to visuals when needed
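
To make points 3 and 5 concrete, here is a rough sketch of how per-profile launch arguments could be derived. `--user-data-dir` and `--remote-debugging-port` are standard Chrome command-line switches, but whether chrome-pilot passes exactly these, and whether each profile lives in a subdirectory named after it, are assumptions. `profileDir` and `chromeArgs` are hypothetical helpers, not part of the package.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Assumed layout: one subdirectory per profile under ~/.chrome-pilot/profiles/.
function profileDir(name: string, home: string = os.homedir()): string {
  return path.join(home, ".chrome-pilot", "profiles", name);
}

// Assumed launch flags. Both are standard Chrome switches, but it is not
// confirmed that chrome-pilot uses exactly this pair.
function chromeArgs(name: string, port: number, home?: string): string[] {
  return [
    `--user-data-dir=${profileDir(name, home)}`,
    `--remote-debugging-port=${port}`,
  ];
}
```

Giving each profile its own data directory and its own debugging port is what would let a personal and a work session run side by side without sharing cookies or colliding on the CDP endpoint.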

Human-in-the-loop Design

chrome-pilot is designed for AI and humans to share a browser:

  • AI navigates → you watch the browser window
  • AI fills forms → you verify before submitting
  • 2FA / CAPTCHA appears → you handle it, AI continues after
  • Payment page → you click Buy, not the AI
  • Something goes wrong → you see it immediately and intervene

This is not a limitation — it's the point. For tasks involving your identity, money, or sensitive data, you should always be watching.
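
The handoff pattern above can be pictured as a step list where each step has an owner. The sketch below is purely illustrative: `Step`, `runSharedSession`, and the `waitForHuman` callback are hypothetical names, and chrome-pilot does not document an API like this. It only shows the control flow of the AI pausing while a person handles a step in the visible browser window.

```typescript
// Hypothetical step model: AI-owned steps carry an action, human-owned steps do not.
type Step =
  | { owner: "ai"; description: string; act: () => Promise<void> }
  | { owner: "human"; description: string };

// Run steps in order. AI steps execute their action; human steps block until
// `waitForHuman` resolves, i.e. until the person reports they are done.
async function runSharedSession(
  steps: Step[],
  waitForHuman: (description: string) => Promise<void>,
): Promise<string[]> {
  const log: string[] = [];
  for (const step of steps) {
    if (step.owner === "human") {
      await waitForHuman(step.description);
    } else {
      await step.act();
    }
    log.push(`${step.owner}: ${step.description}`);
  }
  return log;
}
```

In this model a payment confirmation or 2FA prompt would be a human-owned step, so the AI never performs it and simply resumes once the person has finished.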

Requirements

  • Google Chrome installed on your system
  • Node.js 20+

License

MIT