diff --git a/README.md b/README.md index 506451d..9232824 100644 --- a/README.md +++ b/README.md @@ -1,22 +1,13 @@ # Browser Commander -A universal browser automation library that supports both Playwright and Puppeteer with a unified API. The key focus is on **stoppable page triggers** - ensuring automation logic is properly mounted/unmounted during page navigation. +A universal browser automation library with a unified API across multiple browser engines and programming languages. The key focus is on **stoppable page triggers** - ensuring automation logic is properly mounted/unmounted during page navigation. -## Installation +## Available Implementations -```bash -npm install browser-commander -``` - -You'll also need either Playwright or Puppeteer: - -```bash -# With Playwright -npm install playwright - -# Or with Puppeteer -npm install puppeteer -``` +| Language | Package | Status | +|----------|---------|--------| +| JavaScript/TypeScript | [browser-commander](https://www.npmjs.com/package/browser-commander) | [![npm](https://img.shields.io/npm/v/browser-commander)](https://www.npmjs.com/package/browser-commander) | +| Rust | [browser-commander](https://crates.io/crates/browser-commander) | [![crates.io](https://img.shields.io/crates/v/browser-commander)](https://crates.io/crates/browser-commander) | ## Core Concept: Page State Machine @@ -35,103 +26,9 @@ Browser Commander manages the browser as a state machine with two states: **WORKING STATE**: Page is fully loaded (30 seconds of network idle). Page triggers can safely interact with DOM. -## Quick Start - -```javascript -import { - launchBrowser, - makeBrowserCommander, - makeUrlCondition, -} from 'browser-commander'; - -// 1. Launch browser -const { browser, page } = await launchBrowser({ engine: 'playwright' }); - -// 2. Create commander -const commander = makeBrowserCommander({ page, verbose: true }); - -// 3. Register page trigger with condition and action -commander.pageTrigger({ - name: 'example-trigger', - condition: makeUrlCondition('*example.com*'), // matches URLs containing 'example.com' - action: async (ctx) => { - // ctx.commander has all methods, but they throw ActionStoppedError if navigation happens - // ctx.checkStopped() - call in loops to check if should stop - // ctx.abortSignal - use with fetch() for cancellation - // ctx.onCleanup(fn) - register cleanup when action stops - - console.log(`Processing: ${ctx.url}`); - - // Safe iteration - stops if navigation detected - await ctx.forEach(['item1', 'item2'], async (item) => { - await ctx.commander.clickButton({ selector: `[data-id="${item}"]` }); - }); - }, -}); - -// 4. Navigate - action auto-starts when page is ready -await commander.goto({ url: 'https://example.com' }); - -// 5. Cleanup -await commander.destroy(); -await browser.close(); -``` - -## URL Condition Helpers - -The `makeUrlCondition` helper makes it easy to create URL matching conditions: - -```javascript -import { - makeUrlCondition, - allConditions, - anyCondition, - notCondition, -} from 'browser-commander'; - -// Exact URL match -makeUrlCondition('https://example.com/page'); - -// Contains substring (use * wildcards) -makeUrlCondition('*checkout*'); // URL contains 'checkout' -makeUrlCondition('*example.com*'); // URL contains 'example.com' - -// Starts with / ends with -makeUrlCondition('/api/*'); // starts with '/api/' -makeUrlCondition('*.json'); // ends with '.json' - -// Express-style route patterns -makeUrlCondition('/vacancy/:id'); // matches /vacancy/123 -makeUrlCondition('https://hh.ru/vacancy/:vacancyId'); // matches specific domain + path -makeUrlCondition('/user/:userId/profile'); // multiple segments - -// RegExp -makeUrlCondition(/\/product\/\d+/); - -// Custom function (receives full context) -makeUrlCondition((url, ctx) => { - const parsed = new URL(url); - return ( - parsed.pathname.startsWith('/admin') && parsed.searchParams.has('edit') - ); -}); - -// Combine conditions -allConditions( - makeUrlCondition('*example.com*'), - makeUrlCondition('*/checkout*') -); // Both must match - -anyCondition(makeUrlCondition('*/cart*'), makeUrlCondition('*/checkout*')); // Either matches - -notCondition(makeUrlCondition('*/admin*')); // Negation -``` - ## Page Trigger Lifecycle -### The Guarantee - -When navigation is detected: +The library provides a guarantee when navigation is detected: 1. **Action is signaled to stop** (AbortController.abort()) 2. **Wait for action to finish** (up to 10 seconds for graceful cleanup) @@ -143,192 +40,16 @@ This ensures: - Actions can do proper cleanup (clear intervals, save state) - No race conditions between action and navigation -## Action Context API - -When your action is called, it receives a context object with these properties: - -```javascript -commander.pageTrigger({ - name: 'my-trigger', - condition: makeUrlCondition('*/checkout*'), - action: async (ctx) => { - // Current URL - ctx.url; // 'https://example.com/checkout' - - // Trigger name (for debugging) - ctx.triggerName; // 'my-trigger' - - // Check if action should stop - ctx.isStopped(); // Returns true if navigation detected - - // Throw ActionStoppedError if stopped (use in manual loops) - ctx.checkStopped(); - - // AbortSignal - use with fetch() or other cancellable APIs - ctx.abortSignal; - - // Safe wait (throws if stopped during wait) - await ctx.wait(1000); - - // Safe iteration (checks stopped between items) - await ctx.forEach(items, async (item) => { - await ctx.commander.clickButton({ selector: item.selector }); - }); - - // Register cleanup (runs when action stops) - ctx.onCleanup(() => { - console.log('Cleaning up...'); - }); - - // Commander with all methods wrapped to throw on stop - await ctx.commander.fillTextArea({ selector: 'input', text: 'hello' }); - - // Raw commander (use carefully - does not auto-throw) - ctx.rawCommander; - }, -}); -``` - -## API Reference +## Getting Started -### launchBrowser(options) +For installation and usage instructions, see the documentation for your preferred language: -```javascript -const { browser, page } = await launchBrowser({ - engine: 'playwright', // 'playwright' or 'puppeteer' - headless: false, // Run in headless mode - userDataDir: '~/.hh-apply/playwright-data', // Browser profile directory - slowMo: 150, // Slow down operations (ms) - verbose: false, // Enable debug logging - args: ['--no-sandbox', '--disable-setuid-sandbox'], // Custom Chrome args to append -}); -``` - -The `args` option allows passing custom Chrome arguments, which is useful for headless server environments (Docker, CI/CD) that require flags like `--no-sandbox`. - -### makeBrowserCommander(options) - -```javascript -const commander = makeBrowserCommander({ - page, // Required: Playwright/Puppeteer page - verbose: false, // Enable debug logging - enableNetworkTracking: true, // Track HTTP requests - enableNavigationManager: true, // Enable navigation events -}); -``` - -### commander.pageTrigger(config) - -```javascript -const unregister = commander.pageTrigger({ - name: 'trigger-name', // For debugging - condition: (ctx) => boolean, // When to run (receives {url, commander}) - action: async (ctx) => void, // What to do - priority: 0, // Higher runs first -}); -``` - -### commander.goto(options) - -```javascript -await commander.goto({ - url: 'https://example.com', - waitUntil: 'domcontentloaded', // Playwright/Puppeteer option - timeout: 60000, -}); -``` - -### commander.clickButton(options) - -```javascript -await commander.clickButton({ - selector: 'button.submit', - scrollIntoView: true, - waitForNavigation: true, -}); -``` - -### commander.fillTextArea(options) - -```javascript -await commander.fillTextArea({ - selector: 'textarea.message', - text: 'Hello world', - checkEmpty: true, -}); -``` - -### commander.destroy() - -```javascript -await commander.destroy(); // Stop actions, cleanup -``` - -## Best Practices - -### 1. Use ctx.forEach for Loops - -```javascript -// BAD: Won't stop on navigation -for (const item of items) { - await ctx.commander.click({ selector: item }); -} - -// GOOD: Stops immediately on navigation -await ctx.forEach(items, async (item) => { - await ctx.commander.click({ selector: item }); -}); -``` - -### 2. Use ctx.checkStopped for Complex Logic - -```javascript -action: async (ctx) => { - while (hasMorePages) { - ctx.checkStopped(); // Throws if navigation detected - - await processPage(ctx); - hasMorePages = await ctx.commander.isVisible({ selector: '.next' }); - } -}; -``` - -### 3. Register Cleanup for Resources - -```javascript -action: async (ctx) => { - const intervalId = setInterval(updateStatus, 1000); - - ctx.onCleanup(() => { - clearInterval(intervalId); - console.log('Interval cleared'); - }); - - // ... rest of action -}; -``` - -### 4. Use ctx.abortSignal with Fetch - -```javascript -action: async (ctx) => { - const response = await fetch(url, { - signal: ctx.abortSignal, // Cancels on navigation - }); -}; -``` - -## Debugging - -Enable verbose mode for detailed logs: - -```javascript -const commander = makeBrowserCommander({ page, verbose: true }); -``` +- **JavaScript/TypeScript**: See [js/README.md](js/README.md) +- **Rust**: See [rust/README.md](rust/README.md) ## Architecture -See [src/ARCHITECTURE.md](src/ARCHITECTURE.md) for detailed architecture documentation. +See [js/src/ARCHITECTURE.md](js/src/ARCHITECTURE.md) for detailed architecture documentation. ## License diff --git a/js/.changeset/fix-readme-npm-publish.md b/js/.changeset/fix-readme-npm-publish.md new file mode 100644 index 0000000..0fb0f77 --- /dev/null +++ b/js/.changeset/fix-readme-npm-publish.md @@ -0,0 +1,13 @@ +--- +'browser-commander': patch +--- + +Include README.md in npm package + +Added language-specific README.md files for each implementation: + +- js/README.md: JavaScript/npm-specific documentation with installation and API usage +- rust/README.md: Rust/Cargo-specific documentation +- Root README.md: Common overview linking to both implementations + +The npm package now includes the JavaScript-specific README.md directly from the js/ directory. diff --git a/js/README.md b/js/README.md new file mode 100644 index 0000000..fa54610 --- /dev/null +++ b/js/README.md @@ -0,0 +1,335 @@ +# Browser Commander + +A universal browser automation library for JavaScript/TypeScript that supports both Playwright and Puppeteer with a unified API. The key focus is on **stoppable page triggers** - ensuring automation logic is properly mounted/unmounted during page navigation. + +## Installation + +```bash +npm install browser-commander +``` + +You'll also need either Playwright or Puppeteer: + +```bash +# With Playwright +npm install playwright + +# Or with Puppeteer +npm install puppeteer +``` + +## Core Concept: Page State Machine + +Browser Commander manages the browser as a state machine with two states: + +``` ++------------------+ +------------------+ +| | navigation start | | +| WORKING STATE | -------------------> | LOADING STATE | +| (action runs) | | (wait only) | +| | <----------------- | | ++------------------+ page ready +------------------+ +``` + +**LOADING STATE**: Page is loading. Only waiting/tracking operations are allowed. No automation logic runs. + +**WORKING STATE**: Page is fully loaded (30 seconds of network idle). Page triggers can safely interact with DOM. + +## Quick Start + +```javascript +import { + launchBrowser, + makeBrowserCommander, + makeUrlCondition, +} from 'browser-commander'; + +// 1. Launch browser +const { browser, page } = await launchBrowser({ engine: 'playwright' }); + +// 2. Create commander +const commander = makeBrowserCommander({ page, verbose: true }); + +// 3. Register page trigger with condition and action +commander.pageTrigger({ + name: 'example-trigger', + condition: makeUrlCondition('*example.com*'), // matches URLs containing 'example.com' + action: async (ctx) => { + // ctx.commander has all methods, but they throw ActionStoppedError if navigation happens + // ctx.checkStopped() - call in loops to check if should stop + // ctx.abortSignal - use with fetch() for cancellation + // ctx.onCleanup(fn) - register cleanup when action stops + + console.log(`Processing: ${ctx.url}`); + + // Safe iteration - stops if navigation detected + await ctx.forEach(['item1', 'item2'], async (item) => { + await ctx.commander.clickButton({ selector: `[data-id="${item}"]` }); + }); + }, +}); + +// 4. Navigate - action auto-starts when page is ready +await commander.goto({ url: 'https://example.com' }); + +// 5. Cleanup +await commander.destroy(); +await browser.close(); +``` + +## URL Condition Helpers + +The `makeUrlCondition` helper makes it easy to create URL matching conditions: + +```javascript +import { + makeUrlCondition, + allConditions, + anyCondition, + notCondition, +} from 'browser-commander'; + +// Exact URL match +makeUrlCondition('https://example.com/page'); + +// Contains substring (use * wildcards) +makeUrlCondition('*checkout*'); // URL contains 'checkout' +makeUrlCondition('*example.com*'); // URL contains 'example.com' + +// Starts with / ends with +makeUrlCondition('/api/*'); // starts with '/api/' +makeUrlCondition('*.json'); // ends with '.json' + +// Express-style route patterns +makeUrlCondition('/vacancy/:id'); // matches /vacancy/123 +makeUrlCondition('https://hh.ru/vacancy/:vacancyId'); // matches specific domain + path +makeUrlCondition('/user/:userId/profile'); // multiple segments + +// RegExp +makeUrlCondition(/\/product\/\d+/); + +// Custom function (receives full context) +makeUrlCondition((url, ctx) => { + const parsed = new URL(url); + return ( + parsed.pathname.startsWith('/admin') && parsed.searchParams.has('edit') + ); +}); + +// Combine conditions +allConditions( + makeUrlCondition('*example.com*'), + makeUrlCondition('*/checkout*') +); // Both must match + +anyCondition(makeUrlCondition('*/cart*'), makeUrlCondition('*/checkout*')); // Either matches + +notCondition(makeUrlCondition('*/admin*')); // Negation +``` + +## Page Trigger Lifecycle + +### The Guarantee + +When navigation is detected: + +1. **Action is signaled to stop** (AbortController.abort()) +2. **Wait for action to finish** (up to 10 seconds for graceful cleanup) +3. **Only then start waiting for page load** + +This ensures: + +- No DOM operations on stale/loading pages +- Actions can do proper cleanup (clear intervals, save state) +- No race conditions between action and navigation + +## Action Context API + +When your action is called, it receives a context object with these properties: + +```javascript +commander.pageTrigger({ + name: 'my-trigger', + condition: makeUrlCondition('*/checkout*'), + action: async (ctx) => { + // Current URL + ctx.url; // 'https://example.com/checkout' + + // Trigger name (for debugging) + ctx.triggerName; // 'my-trigger' + + // Check if action should stop + ctx.isStopped(); // Returns true if navigation detected + + // Throw ActionStoppedError if stopped (use in manual loops) + ctx.checkStopped(); + + // AbortSignal - use with fetch() or other cancellable APIs + ctx.abortSignal; + + // Safe wait (throws if stopped during wait) + await ctx.wait(1000); + + // Safe iteration (checks stopped between items) + await ctx.forEach(items, async (item) => { + await ctx.commander.clickButton({ selector: item.selector }); + }); + + // Register cleanup (runs when action stops) + ctx.onCleanup(() => { + console.log('Cleaning up...'); + }); + + // Commander with all methods wrapped to throw on stop + await ctx.commander.fillTextArea({ selector: 'input', text: 'hello' }); + + // Raw commander (use carefully - does not auto-throw) + ctx.rawCommander; + }, +}); +``` + +## API Reference + +### launchBrowser(options) + +```javascript +const { browser, page } = await launchBrowser({ + engine: 'playwright', // 'playwright' or 'puppeteer' + headless: false, // Run in headless mode + userDataDir: '~/.hh-apply/playwright-data', // Browser profile directory + slowMo: 150, // Slow down operations (ms) + verbose: false, // Enable debug logging + args: ['--no-sandbox', '--disable-setuid-sandbox'], // Custom Chrome args to append +}); +``` + +The `args` option allows passing custom Chrome arguments, which is useful for headless server environments (Docker, CI/CD) that require flags like `--no-sandbox`. + +### makeBrowserCommander(options) + +```javascript +const commander = makeBrowserCommander({ + page, // Required: Playwright/Puppeteer page + verbose: false, // Enable debug logging + enableNetworkTracking: true, // Track HTTP requests + enableNavigationManager: true, // Enable navigation events +}); +``` + +### commander.pageTrigger(config) + +```javascript +const unregister = commander.pageTrigger({ + name: 'trigger-name', // For debugging + condition: (ctx) => boolean, // When to run (receives {url, commander}) + action: async (ctx) => void, // What to do + priority: 0, // Higher runs first +}); +``` + +### commander.goto(options) + +```javascript +await commander.goto({ + url: 'https://example.com', + waitUntil: 'domcontentloaded', // Playwright/Puppeteer option + timeout: 60000, +}); +``` + +### commander.clickButton(options) + +```javascript +await commander.clickButton({ + selector: 'button.submit', + scrollIntoView: true, + waitForNavigation: true, +}); +``` + +### commander.fillTextArea(options) + +```javascript +await commander.fillTextArea({ + selector: 'textarea.message', + text: 'Hello world', + checkEmpty: true, +}); +``` + +### commander.destroy() + +```javascript +await commander.destroy(); // Stop actions, cleanup +``` + +## Best Practices + +### 1. Use ctx.forEach for Loops + +```javascript +// BAD: Won't stop on navigation +for (const item of items) { + await ctx.commander.click({ selector: item }); +} + +// GOOD: Stops immediately on navigation +await ctx.forEach(items, async (item) => { + await ctx.commander.click({ selector: item }); +}); +``` + +### 2. Use ctx.checkStopped for Complex Logic + +```javascript +action: async (ctx) => { + while (hasMorePages) { + ctx.checkStopped(); // Throws if navigation detected + + await processPage(ctx); + hasMorePages = await ctx.commander.isVisible({ selector: '.next' }); + } +}; +``` + +### 3. Register Cleanup for Resources + +```javascript +action: async (ctx) => { + const intervalId = setInterval(updateStatus, 1000); + + ctx.onCleanup(() => { + clearInterval(intervalId); + console.log('Interval cleared'); + }); + + // ... rest of action +}; +``` + +### 4. Use ctx.abortSignal with Fetch + +```javascript +action: async (ctx) => { + const response = await fetch(url, { + signal: ctx.abortSignal, // Cancels on navigation + }); +}; +``` + +## Debugging + +Enable verbose mode for detailed logs: + +```javascript +const commander = makeBrowserCommander({ page, verbose: true }); +``` + +## Architecture + +See [src/ARCHITECTURE.md](src/ARCHITECTURE.md) for detailed architecture documentation. + +## License + +[UNLICENSE](../LICENSE) diff --git a/rust/README.md b/rust/README.md new file mode 100644 index 0000000..74bc7f0 --- /dev/null +++ b/rust/README.md @@ -0,0 +1,175 @@ +# Browser Commander + +A Rust library for universal browser automation that provides a unified API for different browser automation engines. The key focus is on **stoppable page triggers** - ensuring automation logic is properly mounted/unmounted during page navigation. + +## Installation + +Add this to your `Cargo.toml`: + +```toml +[dependencies] +browser-commander = "0.4" +tokio = { version = "1.0", features = ["full"] } +``` + +## Core Concept: Page State Machine + +Browser Commander manages the browser as a state machine with two states: + +``` ++------------------+ +------------------+ +| | navigation start | | +| WORKING STATE | -------------------> | LOADING STATE | +| (action runs) | | (wait only) | +| | <----------------- | | ++------------------+ page ready +------------------+ +``` + +**LOADING STATE**: Page is loading. Only waiting/tracking operations are allowed. No automation logic runs. + +**WORKING STATE**: Page is fully loaded (30 seconds of network idle). Page triggers can safely interact with DOM. + +## Quick Start + +```rust +use browser_commander::prelude::*; + +#[tokio::main] +async fn main() -> anyhow::Result<()> { + // Launch a browser with chromiumoxide engine + let options = LaunchOptions::chromiumoxide() + .headless(true); + + let result = launch_browser(options).await?; + println!("Browser launched: {:?}", result.browser.engine); + + // Navigate to a URL + let page = &result.page; + goto(page, "https://example.com", None).await?; + + // Click a button + click_button(page, "button.submit", None).await?; + + // Fill a text field + fill_text_area(page, "input[name='email']", "test@example.com", None).await?; + + Ok(()) +} +``` + +## Features + +- **Unified API** across multiple browser engines +- **Built-in navigation safety handling** +- **Element visibility and scroll management** +- **Click, fill, and other interaction support with verification** +- **Async/await support with Tokio** + +## API Reference + +### Browser Launch + +```rust +use browser_commander::prelude::*; + +// Launch with chromiumoxide (CDP-based) +let options = LaunchOptions::chromiumoxide() + .headless(true) + .user_data_dir("~/.browser-data"); + +let result = launch_browser(options).await?; +``` + +### Navigation + +```rust +// Navigate to URL +goto(&page, "https://example.com", None).await?; + +// Navigate with options +let nav_options = NavigationOptions { + wait_until: WaitUntil::NetworkIdle, + timeout: Some(30000), +}; +goto(&page, "https://example.com", Some(nav_options)).await?; + +// Wait for URL to match condition +wait_for_url_condition(&page, |url| url.contains("success")).await?; +``` + +### Element Interactions + +```rust +// Click a button +click_button(&page, "button.submit", None).await?; + +// Click with options +let click_options = ClickOptions { + scroll_into_view: true, + wait_for_navigation: true, + ..Default::default() +}; +click_button(&page, "button.submit", Some(click_options)).await?; + +// Fill text area +fill_text_area(&page, "textarea.message", "Hello world", None).await?; + +// Scroll element into view +scroll_into_view(&page, ".target-element", None).await?; +``` + +### Element Queries + +```rust +// Check visibility +let visible = is_visible(&page, ".element").await?; + +// Check if enabled +let enabled = is_enabled(&page, "button.submit").await?; + +// Get text content +let text = text_content(&page, ".message").await?; + +// Get attribute value +let href = get_attribute(&page, "a.link", "href").await?; + +// Count matching elements +let count = count(&page, ".item").await?; +``` + +### Utilities + +```rust +// Wait for a duration +wait(1000).await; + +// Get current URL +let url = get_url(&page).await?; + +// Parse URL +let parsed = parse_url("https://example.com/path?query=value")?; + +// Evaluate JavaScript +let result: String = evaluate(&page, "document.title").await?; +``` + +## Modules + +- `core` - Core types and traits (constants, engine adapter, logger) +- `elements` - Element operations (selectors, visibility, content) +- `interactions` - User interactions (click, scroll, fill) +- `browser` - Browser management (launcher, navigation) +- `utilities` - General utilities (URL handling, wait operations) +- `high_level` - High-level DRY utilities + +## Prelude + +For convenience, import everything commonly needed with: + +```rust +use browser_commander::prelude::*; +``` + +## License + +[UNLICENSE](../LICENSE)