Integrate Chrome DevTools Protocol CLI Tool for Browser Automation #7
Summary
Request to integrate chrome-ws (Chrome DevTools Protocol CLI tool) into the browser container system as a lightweight alternative to Playwright for browser automation and testing.
Background
The Playwright MCP currently performs poorly on complex web pages: it returns massive YAML snapshots (100KB+) that consume excessive context and are impractical for automation tasks. The chrome-ws tool is a zero-dependency alternative that talks to Chrome directly over the Chrome DevTools Protocol.
Upstream Project
- Repository: https://github.com/obra/superpowers-chrome
- Tool Location: skills/browsing/ directory
- Key Files:
  - chrome-ws - Main executable (bash script)
  - chrome-ws-lib.js - Core library
- Documentation: SKILL.md, COMMANDLINE-USAGE.md, EXAMPLES.md
Technical Requirements
Dependencies
- Node.js 16+ (already available)
- Chrome with remote debugging port enabled
- No npm packages required (zero-dependency design)
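The dependency list above could be checked before any automation runs. The following is a minimal sketch, not part of chrome-ws: the function names are hypothetical, it assumes bash and curl are available, and it assumes the debugging port is 9222 as in the flags used in this issue. The `/json/version` path is Chrome's standard DevTools HTTP endpoint.

```shell
# Hypothetical preflight helpers; the names are illustrative, not part of chrome-ws.

# Succeeds if a `node --version` style string (e.g. "v18.17.0") has major >= 16.
node_version_ok() {
  local version="${1#v}"        # strip the leading "v"
  local major="${version%%.*}"  # keep everything before the first dot
  [[ "$major" =~ ^[0-9]+$ ]] && (( major >= 16 ))
}

# Succeeds if Chrome answers on its DevTools HTTP endpoint.
# Assumption: curl is installed; the port defaults to 9222.
debug_port_ready() {
  local port="${1:-9222}"
  curl --silent --fail --max-time 2 \
    "http://127.0.0.1:${port}/json/version" >/dev/null
}

# Runs both checks and prints "ok" only if every requirement is met.
preflight() {
  command -v node >/dev/null || { echo "node not found" >&2; return 1; }
  node_version_ok "$(node --version)" || { echo "Node.js 16+ required" >&2; return 1; }
  debug_port_ready "${1:-9222}" || { echo "Chrome debugging port not reachable" >&2; return 1; }
  echo "ok"
}
```

Calling `preflight` (or `preflight 9333` for a non-default port) at the start of a test run would fail early with a readable message rather than partway through a browser session.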
Wayland Compatibility
Successfully tested on Wayland with the following Chrome flags:
google-chrome \
--remote-debugging-port=9222 \
--no-sandbox \
--enable-features=UseOzonePlatform \
--ozone-platform=wayland \
--user-data-dir=/path/to/profile \
--no-first-run \
--no-default-browser-check \
--disable-default-apps \
--disable-sync
Environment Variables
export DISPLAY=:0
export WAYLAND_DISPLAY=wayland-0
Key Advantages
- Minimal Output: Returns clean text instead of massive YAML structures
  - Example.com extraction: "Example Domain" (14 bytes)
  - vs Playwright MCP: 100KB+ YAML snapshot
- Performance:
  - Complex page extraction: 0.05 seconds
  - Amazon homepage: 84 lines, 4KB output
- Simplicity:
  - Direct bash commands
  - No complex dependencies
  - Works with existing browser sessions
- Functionality:
  - Navigation and page loading
  - Content extraction (text, HTML, markdown)
  - Element interaction (click, type, select)
  - JavaScript evaluation
  - Cookie/authentication support
  - Multi-tab management
  - Screenshot capture
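For context on what "works directly with Chrome via DevTools Protocol" means at the wire level: several of the capabilities listed above correspond to standard protocol methods (`Page.navigate`, `Runtime.evaluate`, `Page.captureScreenshot`), which are sent as JSON messages over the WebSocket URL that Chrome reports for each tab. The sketch below only builds those messages. The helper names are hypothetical, and it is not a description of how chrome-ws is implemented internally.

```shell
# Escape backslashes and double quotes so a string can sit inside a JSON string.
# Simplification: control characters such as newlines are not handled.
json_escape() {
  local s="$1"
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# Build a Page.navigate message: cdp_navigate <id> <url>
cdp_navigate() {
  printf '{"id":%d,"method":"Page.navigate","params":{"url":"%s"}}\n' \
    "$1" "$(json_escape "$2")"
}

# Build a Runtime.evaluate message: cdp_evaluate <id> <javascript expression>
cdp_evaluate() {
  printf '{"id":%d,"method":"Runtime.evaluate","params":{"expression":"%s","returnByValue":true}}\n' \
    "$1" "$(json_escape "$2")"
}

# Build a Page.captureScreenshot message: cdp_screenshot <id>
cdp_screenshot() {
  printf '{"id":%d,"method":"Page.captureScreenshot","params":{"format":"png"}}\n' "$1"
}
```

For example, `cdp_evaluate 2 'document.title'` prints a single JSON line that a WebSocket client could send to the tab. The small size of such request and response messages is what keeps output far below a full page snapshot.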
Proven Use Cases
Successfully demonstrated:
- Extracting structured data from HackerNews (story titles)
- Navigating between pages with visual confirmation
- Clicking links and interacting with elements
- Running JavaScript to extract specific DOM elements
- Exporting pages as markdown
Implementation Options
Option 1: Install as-is
- Clone upstream repository
- Symlink chrome-ws and chrome-ws-lib.js to /usr/local/bin/
- Add to container build process
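The Option 1 steps could be scripted roughly as follows. This is a sketch under assumptions: `install_chrome_ws` is a hypothetical helper name, the checkout is assumed to keep both files under skills/browsing/ as described in this issue, and whether chrome-ws locates chrome-ws-lib.js correctly when invoked through a symlink has not been verified here.

```shell
# Symlink the two key files from a checkout into a bin directory.
# Usage: install_chrome_ws <checkout-dir> [bin-dir]
# Both paths are parameters so the function can be tried against a scratch directory.
install_chrome_ws() {
  local checkout="$1"
  local bin_dir="${2:-/usr/local/bin}"
  local src file

  # Resolve to an absolute path so the symlinks stay valid from any directory.
  src="$(cd "${checkout}/skills/browsing" 2>/dev/null && pwd)" || {
    echo "skills/browsing not found under ${checkout}" >&2
    return 1
  }

  for file in chrome-ws chrome-ws-lib.js; do
    [ -f "${src}/${file}" ] || { echo "missing ${src}/${file}" >&2; return 1; }
    ln -sfn "${src}/${file}" "${bin_dir}/${file}"
  done
}
```

In a container build this would follow a `git clone https://github.com/obra/superpowers-chrome` into a directory of our choosing, with that directory passed as the first argument.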
Option 2: Fork and customize (Recommended)
- Fork repository for our specific needs
- Add Wayland-specific wrapper script
- Pre-configure chrome launch flags for container environment
- Add integration with existing test frameworks
- Maintain as part of fedora-desktop tooling
Recommended Approach
Create a fork with the following customizations:
- Wrapper Script (/usr/local/bin/chrome-ws-wayland):
  - Auto-configures Wayland environment variables
  - Provides sensible defaults for Chrome flags
  - Handles profile management for ephemeral containers
- Helper Functions:
  - chrome-start - Launch Chrome with correct flags
  - chrome-stop - Clean shutdown
  - chrome-auth - Set authentication cookies via JavaScript
- Integration:
  - Add to browser container image
  - Document usage patterns for test automation
  - Provide examples for common scenarios
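To make the proposal more concrete, here is one possible shape for the launch and shutdown helpers. It is a sketch, not an agreed design: the function and variable names (`chrome_flags`, `chrome_start`, `chrome_stop`, `CHROME_PID`, `CHROME_PROFILE_DIR`) are hypothetical, it requires bash 4+ for `mapfile`, and it reuses exactly the flags and environment variables tested in this issue. The chrome-auth helper is left out because it depends on chrome-ws command syntax that should be taken from the upstream documentation.

```shell
# Print the Chrome flags listed in this issue, one per line.
# Usage: chrome_flags <profile-dir> [port]
chrome_flags() {
  local profile_dir="$1" port="${2:-9222}"
  printf '%s\n' \
    "--remote-debugging-port=${port}" \
    --no-sandbox \
    --enable-features=UseOzonePlatform \
    --ozone-platform=wayland \
    "--user-data-dir=${profile_dir}" \
    --no-first-run \
    --no-default-browser-check \
    --disable-default-apps \
    --disable-sync
}

# Launch Chrome in the background with a throwaway profile.
chrome_start() {
  # Keep values already set by the container; otherwise use this issue's defaults.
  export DISPLAY="${DISPLAY:-:0}"
  export WAYLAND_DISPLAY="${WAYLAND_DISPLAY:-wayland-0}"

  CHROME_PROFILE_DIR="$(mktemp -d)" || return 1
  local -a flags
  mapfile -t flags < <(chrome_flags "$CHROME_PROFILE_DIR" "${1:-9222}")

  google-chrome "${flags[@]}" >/dev/null 2>&1 &
  CHROME_PID=$!
}

# Stop the Chrome started by chrome_start and delete its profile directory.
chrome_stop() {
  [ -n "${CHROME_PID:-}" ] && kill "$CHROME_PID" 2>/dev/null
  [ -n "${CHROME_PROFILE_DIR:-}" ] && rm -rf "$CHROME_PROFILE_DIR"
  unset CHROME_PID CHROME_PROFILE_DIR
}
```

Keeping the flag list in its own function means a test can inspect it without launching a browser, and a fresh `mktemp -d` profile per launch fits ephemeral containers where no state should persist between runs.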
Testing Confirmation
All features tested and working on Fedora/Wayland:
- ✅ Chrome launches in headed mode with Wayland
- ✅ Remote debugging port accessible
- ✅ CLI commands work correctly
- ✅ Content extraction produces minimal output
- ✅ Interactive browser control functions properly
- ✅ No popup dialogs when using suppression flags
Questions for Discussion
- Should we install upstream as-is or maintain a fork?
- Where should the tool be installed? (/usr/local/bin/, /opt/chrome-ws/, etc.)
- Should Chrome launch be automatic or manual in tests?
- Do we need additional wrapper scripts for common operations?
- Should this replace Playwright MCP entirely or complement it?
References
- Upstream repository: https://github.com/obra/superpowers-chrome/tree/main/skills/browsing
- Chrome DevTools Protocol: https://chromedevtools.github.io/devtools-protocol/
- Alternative to Playwright MCP which has known performance issues with large pages