Sync by metehanozdev · Pull Request #2 · emregucerr/stagehand

metehanozdev · 2025-08-09T01:42:18Z

# why

Adds support for using Claude 4.5 Opus with CUA.

# what changed

Added Opus to the model maps.

# test plan

---

## Summary by cubic

Adds support for Anthropic Claude 4.5 Opus in CUA. Registers anthropic/claude-opus-4-5-20251101 and maps it to the Anthropic provider.

Written for commit 2e54c27. Summary will update automatically on new commits.

## 🤖 Installing Claude Code GitHub App This PR adds a GitHub Actions workflow that enables Claude Code integration in our repository. ### What is Claude Code? [Claude Code](https://claude.com/claude-code) is an AI coding agent that can help with: - Bug fixes and improvements - Documentation updates - Implementing new features - Code reviews and suggestions - Writing tests - And more! ### How it works Once this PR is merged, we'll be able to interact with Claude by mentioning @claude in a pull request or issue comment. Once the workflow is triggered, Claude will analyze the comment and surrounding context, and execute on the request in a GitHub action. ### Important Notes - **This workflow won't take effect until this PR is merged** - **@claude mentions won't work until after the merge is complete** - The workflow runs automatically whenever Claude is mentioned in PR or issue comments - Claude gets access to the entire PR or issue context including files, diffs, and previous comments ### Security - Our Anthropic API key is securely stored as a GitHub Actions secret - Only users with write access to the repository can trigger the workflow - All Claude runs are stored in the GitHub Actions run history - Claude's default tools are limited to reading/writing files and interacting with our repo by creating comments, branches, and commits. - We can add more allowed tools by adding them to the workflow file like: ``` allowed_tools: Bash(npm install),Bash(npm run build),Bash(npm run lint),Bash(npm run test) ``` There's more information in the [Claude Code action repo](https://github.com/anthropics/claude-code-action). After merging this PR, let's try mentioning @claude in a comment on any PR to get started!  --- ## Summary by cubic Adds GitHub Actions to integrate Claude Code for automated PR reviews and comment-triggered help. Enables @claude to review code and perform tasks using repository context. 
- **New Features** - claude.yml: Runs when @claude is mentioned in issue/PR comments or reviews, or in issue title/body; uses anthropics/claude-code-action@v1 with actions: read to access CI; requires secrets.ANTHROPIC_API_KEY. - claude-code-review.yml: Auto-reviews PRs on open/sync for quality, bugs, performance, security, and tests; posts feedback via gh; uses claude.md for guidance; includes optional filters and limited allowed tools. - **Migration** - Add ANTHROPIC_API_KEY to repository secrets. - Merge this PR, then mention @claude in a PR or issue to trigger. - Optional: adjust file path filters, author filters, or allowed tools. Written for commit d7e4303. Summary will update automatically on new commits.  --------- Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

Keeps `@claude` support but drops the automatic PR reviews by Claude; we already have plenty of auto-review feedback from greptile and cubic.

---

## Summary by cubic

Removed the Claude PR review GitHub Action to stop automatic reviews. Keeps @claude support and reduces duplicate bot feedback, since greptile and cubic already provide auto-review.

Written for commit 740d927. Summary will update automatically on new commits.

# why

After the transition to v3, the model handling for agent evals was not updated to account for the new model formats.

# what changed

- added an isCua flag and two separate model maps, to allow for models that can be run with CUA and models that cannot
- adjusted model handling to properly parse CUA models
- added a tag to distinguish whether a run is using CUA or not

# test plan

- tested evals for both CUA and non-CUA models

---

## Summary by cubic

Updated the agent evals CLI to support and correctly run both CUA and non-CUA agent models in v3. Fixes agent model parsing and enables mixed eval runs.

- **New Features**
  - Split agent models into standard and CUA lists; added getAgentModelEntries with a cua flag.
  - Passed isCUA through EvalInput to initV3 and tasks; selects a safe internal model for handlers when CUA.
  - Improved provider lookup and error messages for CUA models using short names; testcases now tag models as "cua" or "agent".

Written for commit 13b906c. Summary will update automatically on new commits.

# why - to clean up the actHandler before #1330  --- ## Summary by cubic Refactors actHandler to centralize LLM action parsing and execution, reduce duplication, and improve metrics reporting. Behavior stays the same, with clearer naming and more reliable two-step and fallback flows. ## Why: - Reduce duplicated LLM calls and normalization logic. - Improve readability and maintainability. - Ensure consistent metrics and variable substitution. - Make the self-heal/fallback path more robust. ## What: - Renamed actFromObserveResult to takeDeterministicAction and updated all call sites (ActCache, AgentCache, v3). - Added getActionFromLLM for inference, metrics, normalization, and variable substitution. - Added recordActMetrics to centralize ACT metrics reporting. - Extracted normalizeActInferenceElement and substituteVariablesInArguments helpers. - Simplified two-step act flow and fallback retry using shared helpers. - Kept existing behavior (selector normalization, variable substitution, retries). ## Test Plan: - [ ] Run unit tests for actHandler to confirm no regressions. - [ ] Verify single-step actions execute as before. - [ ] Verify two-step flow triggers when LLM returns twoStep and executes the second action. - [ ] Confirm fallback self-heal path updates selector and retries successfully. - [ ] Check metrics are recorded once per inference call in both steps and fallback. - [ ] Validate variable substitution replaces %key% tokens in action arguments. - [ ] Exercise AgentCache and ActCache paths to ensure takeDeterministicAction works end-to-end. - [ ] Build passes and type checks for all renamed method references. Written for commit 08d8454. Summary will update automatically on new commits.
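The `%key%` variable substitution mentioned in the test plan above can be sketched roughly as follows. This is an illustration only: the function name `substituteVariablesSketch`, its signature, and the choice to leave unknown tokens untouched are assumptions, not the actual `substituteVariablesInArguments` helper from the PR.

```typescript
// Hypothetical sketch of %key% substitution in action arguments.
// The name, signature, and handling of unknown keys are assumptions.
function substituteVariablesSketch(
  args: string[],
  variables: Record<string, string>,
): string[] {
  return args.map((arg) =>
    arg.replace(/%([^%]+)%/g, (match: string, key: string) =>
      // Only replace tokens that have a matching variable; keep others as-is.
      Object.prototype.hasOwnProperty.call(variables, key)
        ? variables[key]
        : match,
    ),
  );
}

// Usage: tokens with a known key are replaced, unknown tokens are preserved.
const substituted = substituteVariablesSketch(
  ["%username%", "hello %name%, id %missing%"],
  { username: "alice", name: "Bob" },
);
console.log(substituted);
```

Keeping unknown tokens intact makes a missing variable visible in the resulting action instead of silently producing an empty string.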

great catch from @loic-carbonne # why - currently it is not possible to rerun a cached agent run with a different prompt - therefore, this docs example is misleading # what changed - removed misleading example  --- ## Summary by cubic Removed the incorrect docs example that suggested cached agent workflows can be reused with different inputs. This aligns the deterministic agent page with current behavior where each instruction generates a new cache key, so runs cannot be rerun with a different prompt. Written for commit 4908805. Summary will update automatically on new commits.  Co-authored-by: Loïc Carbonne <loic.carbonne.mail@gmail.com>

…1330) # why - async functions invoked by act, extract, and observe all continued to run even after the timeout was reached # what changed - this PR introduces a time remaining check mechanism which runs between each major IO operation inside each of the handlers - this ensures that user defined timeout are actually respected inside of act, extract, and observe # test plan - added tests to confirm that internal async functions do not continue running after the timeout is reached  --- ## Summary by cubic Fixes act, extract, and observe to truly honor the timeout parameter with step-wise guards that abort early and return clear errors. Deterministic actions now use the same guard path in v3. - **Bug Fixes** - Added createTimeoutGuard and specific ActTimeoutError, ExtractTimeoutError, and ObserveTimeoutError (exported). - Replaced Promise.race with per-step checks across snapshot capture, LLM inference, action execution, and self-heal retries. - Enforced per-step timeouts in ActHandler.takeDeterministicAction; metrics unchanged. - Wired v3 deterministic actions to pass a timeout guard; shadow DOM and unsupported actions behavior unchanged. Written for commit d6bbfb8. Summary will update automatically on new commits.  --------- Co-authored-by: miguel <miguelg71921@gmail.com> Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
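The step-wise time-remaining check described above can be sketched as follows. This is a minimal illustration under assumptions, not the code from the PR: `createTimeoutGuardSketch`, `StepTimeoutError`, the injectable clock, and the step names are all hypothetical.

```typescript
// Hypothetical sketch of a per-step timeout guard. Names and signatures
// are illustrative and are not taken from the Stagehand source.
class StepTimeoutError extends Error {
  constructor(step: string, timeoutMs: number) {
    super(`timed out after ${timeoutMs}ms before step "${step}"`);
    this.name = "StepTimeoutError";
  }
}

type TimeoutGuard = (step: string) => void;

// Returns a function that throws once the deadline has passed.
// `now` is injectable so the guard can be exercised without real waiting.
function createTimeoutGuardSketch(
  timeoutMs: number | undefined,
  now: () => number = Date.now,
): TimeoutGuard {
  if (timeoutMs === undefined) return () => {}; // no timeout configured
  const deadline = now() + timeoutMs;
  return (step: string) => {
    if (now() > deadline) throw new StepTimeoutError(step, timeoutMs);
  };
}

// Usage: call the guard before each major I/O operation so that no new
// work starts after the deadline has passed.
async function runGuardedSteps(
  steps: Array<[string, () => Promise<void>]>,
  guard: TimeoutGuard,
): Promise<string[]> {
  const completed: string[] = [];
  for (const [name, fn] of steps) {
    guard(name);
    await fn();
    completed.push(name);
  }
  return completed;
}
```

Unlike a `Promise.race` against a timer, which only stops the caller from waiting, a guard of this kind prevents later steps from starting at all once the budget is spent.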

# why our slack link expired # what changed updated slack invite link # test plan  --- ## Summary by cubic Replaced the expired Slack invite link with a new working one. Updated the core README and contributing docs so contributors can join the community without broken links. Written for commit 9f0b262. Summary will update automatically on new commits.

# why Users don't know about the v2/v3 version toggle in the docs navigation. # what changed Added a banner at the top of the v3 docs pages to help users easily discover Stagehand Python (v2). # test plan n/a  --- ## Summary by cubic Added a reusable banner to the top of all v3 docs pages to highlight the Stagehand Python (v2) option. Improves discoverability of the v2/v3 toggle and reduces confusion. - **New Features** - Added V3Banner MDX snippet linking to “/v2/first-steps/introduction”. - Imported and rendered the banner across v3 Basics, Best Practices, Configuration, First Steps, Integrations, Migrations, and References pages. - Minor metadata/formatting updates in v2 docs (e.g., User Data frontmatter) for consistency. Written for commit 515a13d. Summary will update automatically on new commits.

# why

Anthropic agents in CUA mode are unable to issue key presses (not to be confused with `type` actions).

# what changed

The Anthropic tool `computer_20250124` replies with key actions in this format:

```ts
{ "action":"key", "text":"BackSpace" }
```

This wasn't properly mapped to our internal action abstraction, `keypress`, which accepts a `keys` parameter; the action was issued directly in the Anthropic format. Updated `AnthropicCUAClient.ts` to account for this and map it appropriately.

# test plan

- [x] Tested on sample eval

---

## Summary by cubic

Fixes key action mapping in Anthropic CUA so agents can send key presses (e.g., Backspace) correctly instead of failing on the "key" action.

- **Bug Fixes**
  - Map Anthropic "key" to internal "keypress" and pass keys from input.text.
  - Remove the old "key" path and Playwright key mapping to avoid mismatches.

Written for commit b9716b9. Summary will update automatically on new commits.

---------

Co-authored-by: Sean McGuire <75873287+seanmcguire12@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
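The key-action mapping from this fix can be sketched as below. The input shape follows the JSON quoted in the description; the `KeypressActionSketch` type and the decision to wrap the text in a one-element array are assumptions, since the description only says that `keypress` accepts a `keys` parameter.

```typescript
// Shape of the Anthropic tool input, as quoted in the PR description.
interface AnthropicKeyInput {
  action: "key";
  text: string;
}

// Hypothetical internal action shape. The PR states that `keypress` accepts
// `keys`; whether that is a string or a string[] is an assumption here.
interface KeypressActionSketch {
  type: "keypress";
  keys: string[];
}

// Translate the provider-specific "key" action into the internal abstraction
// instead of forwarding the Anthropic format unchanged.
function mapAnthropicKeyAction(input: AnthropicKeyInput): KeypressActionSketch {
  return { type: "keypress", keys: [input.text] };
}

const mapped = mapAnthropicKeyAction({ action: "key", text: "BackSpace" });
console.log(mapped.type, mapped.keys);
```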

# why The banner was hard-coded for light mode only # what changed <img width="706" height="316" alt="image" src="https://github.com/user-attachments/assets/64fadf31-a96e-43ae-b435-7082db9b6a64" /> <img width="707" height="314" alt="image" src="https://github.com/user-attachments/assets/515ab34a-f040-4574-89bf-7c2d621a63e6" /> # test plan  --- ## Summary by cubic Fixed the V3 docs banner to support light mode while preserving dark mode styling. Added light-theme border, background, and text colors with dark: variants and aligned link hover states to improve readability. Written for commit 14ab04f. Summary will update automatically on new commits.

…rve/extract, CLICK/HOVER/SCROLL, and CDP (#1283) # why Clarify where the execution flow goes when stagehand runs by showing more detailed logs. <img width="1443" height="529" alt="image" src="https://github.com/user-attachments/assets/1c85f91e-de94-46c3-8226-fe42d4c3e338" /> # what changed Adds a log line printed at the beginning and end of each layer's execution: 1. 🅰 Agent TASK: top-level user intent: when agent.execute('<intent here>') is called (the initial entrypoint) 2. 🆂 Stagehand STEP: any call to .act(...) .extract() or .observe() 3. 🆄 Understudy ACTION: any playwright or browser interaction api action dispatched, e.g. CLICK, HOVER, SCROLL, etc. 4. 🧠 LLM req/resp, 🅲 CDP CALL/Event: any LLM calls or CDP websocket msgs to/from the browser Log lines are written to `./.browserbase/sessions/{sessionId}/{agent,stagehand,understudy,cdp}.log` at runtime, and can be followed in a single unified screen by doing: `tail -f ./.browserbase/sessions/latest/*.log` # test plan Test by running: ```bash # (make sure `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are both set in env too) export BROWSERBASE_CONFIG_DIR=./.browserbase nano packages/core/examples/flowLoggingJourney.ts # paste in contents (it's just a basic test of the main apis) pnpm tsx packages/core/examples/flowLoggingJourney.ts & tail -f ./.browserbase/sessions/latest/* ``` `flowLoggingJourney.ts`: ```typescript import { Stagehand } from "../lib/v3"; async function run(): Promise<void> { const openaiKey = process.env.OPENAI_API_KEY; const anthropicKey = process.env.ANTHROPIC_API_KEY; if (!openaiKey || !anthropicKey) { throw new Error( "Set both OPENAI_API_KEY and ANTHROPIC_API_KEY before running this demo.", ); } const stagehand = new Stagehand({ env: "LOCAL", verbose: 2, model: { modelName: "openai/gpt-4.1-mini", apiKey: openaiKey }, localBrowserLaunchOptions: { headless: true, args: ["--window-size=1280,720"], }, disablePino: true, }); try { await stagehand.init(); const [page] = stagehand.context.pages(); await 
page.goto("https://example.com/", { waitUntil: "load" }); // Test standard agent path const agent = stagehand.agent({ systemPrompt: "You are a QA assistant. Keep answers short and deterministic. Finish quickly.", }); const agentResult = await agent.execute( "Glance at the Example Domain page and confirm that you see the hero text.", ); console.log("Agent result:", agentResult); // Test CUA (Computer Use Agent) path await page.goto("https://example.com/", { waitUntil: "load" }); const cuaAgent = stagehand.agent({ cua: true, model: { modelName: "anthropic/claude-sonnet-4-5-20250929", apiKey: anthropicKey, }, }); const cuaResult = await cuaAgent.execute({ instruction: "Click on the 'More information...' link on the page.", maxSteps: 3, }); console.log("CUA Agent result:", cuaResult); const observations = await stagehand.observe("Find any links on the page"); console.log("Observe result:", observations); if (observations.length > 0) { await stagehand.act(observations[0]); } else { await stagehand.act("click the link on the page"); } const extraction = await stagehand.extract( "Summarize the current page title and contents in a single sentence", ); console.log("Extraction result:", extraction); } finally { await stagehand.close({ force: true }).catch(() => {}); } } run().catch((error) => { console.error(error); process.exitCode = 1; }); ``` EXPECTED OUTPUT: ```bash 2025-12-08 12:20:26.23300 ⤑ ⤑ [🆄 #694a GOTO] ▷ Page.goto({args:[https://example.com/,{waitUntil:load}]}) 2025-12-08 12:20:26.23401 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏵ Page.navigate({url:https://example.com/}) 2025-12-08 12:20:26.26402 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStartedNavigating({frameId:8A6B…FE7B,u…rId:F41F…7B31,navigationType:differentDocument}) 2025-12-08 12:20:26.26403 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStartedLoading({frameId:8A6B…FE7B}) 2025-12-08 12:20:26.57304 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏵ Page.setLifecycleEventsEnabled({enabled:true}) 2025-12-08 12:20:26.57605 ⤑ ⤑ [🆄 
#694a GOTO] [🅲 #FE7B CDP] ⏴ Page.frameNavigated({frame:{id:8A6B…FE7B,loaderI…tIsolated,gatedAPIFeatures:[]},type:Navigation}) 2025-12-08 12:20:26.57706 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Network.policyUpdated({}) 2025-12-08 12:20:26.57807 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Runtime.consoleAPICalled({type:info,args:[{type:…ptId:5,url:",lineNumber:0,columnNumber:2837}]}}) 2025-12-08 12:20:26.57908 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.domContentEventFired({timestamp:545864.312948}) 2025-12-08 12:20:26.58009 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.loadEventFired({timestamp:545864.313355}) 2025-12-08 12:20:26.58110 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStoppedLoading({frameId:8A6B…FE7B}) 2025-12-08 12:20:26.58311 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:document.readyState,contextId:2,returnByValue:true}) 2025-12-08 12:20:26.58412 ⤑ ⤑ [🆄 #694a GOTO] ✓ GOTO completed in 0.35s 2025-12-08 12:20:26.58513 [🅰 #1d66] ▷ Agent.execute(Glance at the Example Domain page and confirm that you see the hero text.) 2025-12-08 12:20:26.59314 [🅰 #1d66] ⤑ [🧠 #21e1 LLM] gpt-4.1-mini ⏴ user: Glance at the Example Domain page and confirm that you see the hero text. 
+{10 tools} 2025-12-08 12:20:29.44715 [🅰 #1d66] ⤑ [🧠 #21e1 LLM] gpt-4.1-mini ↳ ꜛ688 ꜜ12 | tool call: ariaTree() 2025-12-08 12:20:29.44816 [🅰 #1d66] [🆂 #9ac4 EXTRACT] ▷ Stagehand.extract() 2025-12-08 12:20:29.45317 [🅰 #1d66] [🆂 #9ac4 EXTRACT] ⤑ [🅲 #FE7B CDP] ⏵ DOM.getDocument({depth:-1,pierce:true}) 2025-12-08 12:20:29.46018 [🅰 #1d66] [🆂 #9ac4 EXTRACT] ⤑ [🅲 #FE7B CDP] ⏵ Accessibility.getFullAXTree({frameId:8A6B…FE7B}) 2025-12-08 12:20:29.46419 [🅰 #1d66] [🆂 #9ac4 EXTRACT] ✓ EXTRACT completed in 0.02s 2025-12-08 12:20:29.46520 [🅰 #1d66] ⤑ [🧠 #03a1 LLM] gpt-4.1-mini ⏴ tool result: ariaTree(): Accessibility Tre…7] paragraph [0-18] link: Learn more +{10 tools} 2025-12-08 12:20:32.21321 [🅰 #1d66] ⤑ [🧠 #03a1 LLM] gpt-4.1-mini ↳ ꜛ806 ꜜ34 | tool call: close() 2025-12-08 12:20:32.21422 [🅰 #1d66] ✓ Agent.execute() DONE in 5.6s | 2 LLM calls ꜛ1494 ꜜ46 tokens | 6 CDP msgs 2025-12-08 12:20:32.21523 ⤑ ⤑ [🆄 #cb65 GOTO] ▷ Page.goto({args:[https://example.com/,{waitUntil:load}]}) 2025-12-08 12:20:32.21524 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏵ Page.navigate({url:https://example.com/}) 2025-12-08 12:20:32.25425 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStartedNavigating({frameId:8A6B…FE7B,u…rId:2130…4BDE,navigationType:differentDocument}) 2025-12-08 12:20:32.25426 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStartedLoading({frameId:8A6B…FE7B}) 2025-12-08 12:20:32.25727 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏵ Page.setLifecycleEventsEnabled({enabled:true}) 2025-12-08 12:20:32.25828 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ DOM.scrollableFlagUpdated({nodeId:1,isScrollable:false}) 2025-12-08 12:20:32.25929 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.frameNavigated({frame:{id:8A6B…FE7B,loaderI…tIsolated,gatedAPIFeatures:[]},type:Navigation}) 2025-12-08 12:20:32.26030 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Network.policyUpdated({}) 2025-12-08 12:20:32.26031 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ DOM.documentUpdated({}) 2025-12-08 12:20:32.26032 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ 
Runtime.consoleAPICalled({type:info,args:[{type:…ptId:5,url:",lineNumber:0,columnNumber:2837}]}}) 2025-12-08 12:20:32.26133 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ DOM.documentUpdated({}) 2025-12-08 12:20:32.26134 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.domContentEventFired({timestamp:545869.998129}) 2025-12-08 12:20:32.26135 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.loadEventFired({timestamp:545869.998762}) 2025-12-08 12:20:32.26136 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStoppedLoading({frameId:8A6B…FE7B}) 2025-12-08 12:20:32.26237 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:document.readyState,contextId:3,returnByValue:true}) 2025-12-08 12:20:32.26338 ⤑ ⤑ [🆄 #cb65 GOTO] ✓ GOTO completed in 0.05s 2025-12-08 12:20:32.26339 [🅰 #c756] ▷ Agent.execute({instruction:Click on the More information... link on the page.,maxSteps:3}) 2025-12-08 12:20:32.26440 [🅰 #c756] ⤑ ⤑ [🅲 #FE7B CDP] ⏵ Page.addScriptToEvaluateOnNewDocument({source:(() => …ue });\n setTimeout(install, 100);\n }\n })();}) 2025-12-08 12:20:32.26441 [🅰 #c756] ⤑ ⤑ [🅲 #FE7B CDP] ⏴ Accessibility.loadComplete({root:{nodeId:23,ignored:f…ds:[24],backendDOMNodeId:23,frameId:8A6B…FE7B}}) 2025-12-08 12:20:32.26542 [🅰 #c756] ⤑ ⤑ [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:({ w: window.innerWidth,…ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:32.26543 [🅰 #c756] ⤑ ⤑ [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() => {\n const ID = __… 100);\n }\n })();,includeCommandLineAPI:false}) 2025-12-08 12:20:32.26744 [🅰 #c756] ⤑ [🧠 #2798 LLM] claude-sonnet-4-5-20250929 ⏴ Click on the More information... link on the page. 2025-12-08 12:20:36.15745 [🅰 #c756] ⤑ [🧠 #2798 LLM] claude-sonnet-4-5-20250929 ↳ ꜛ1875 ꜜ79 | Ill help you click on the More information... 
l tool_use:computer 2025-12-08 12:20:36.96146 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] ▷ Page.screenshot({args:[{fullPage:false}]}) 2025-12-08 12:20:36.96447 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:36.96648 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] [🅲 #FE7B CDP] ⏵ Page.captureScreenshot({format:png,fromSurface:true,captureBeyondViewport:false}) 2025-12-08 12:20:37.01149 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:37.01250 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] ✓ SCREENSHOT completed in 0.05s 2025-12-08 12:20:37.01251 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] ▷ Page.screenshot({args:[{fullPage:false}]}) 2025-12-08 12:20:37.01352 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:37.01453 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Page.captureScreenshot({format:png,fromSurface:true,captureBeyondViewport:false}) 2025-12-08 12:20:37.04054 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:37.04155 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] ✓ SCREENSHOT completed in 0.03s 2025-12-08 12:20:37.04156 [🅰 #c756] ⤑ [🧠 #ce80 LLM] claude-sonnet-4-5-20250929 ⏴ Current URL: https://example.com/ +{15.8kb image} 2025-12-08 12:20:44.82757 [🅰 #c756] ⤑ [🧠 #ce80 LLM] claude-sonnet-4-5-20250929 ↳ ꜛ3192 ꜜ192 | I can see a pag…ith Example Domain as the head tool_use:computer 2025-12-08 12:20:45.12958 [🅰 #c756] ⤑ [🆄 #f8c3 V3CUA.SCROLL] ▷ v3CUA.scroll({target:(644, 400),args:[{type:sc…scroll_amount:3,pageUrl:https://example.com/}]}) 2025-12-08 12:20:45.12959 [🅰 #c756] ⤑ [🆄 #3fc9 SCROLL] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:typeof w…"undefined\"&&window.__v3Cursor.move(644, 400)}) 2025-12-08 12:20:45.12960 [🅰 #c756] 
⤑ [🆄 #3fc9 SCROLL] ▷ Page.scroll({args:[644,400,0,300]}) 2025-12-08 12:20:45.13061 [🅰 #c756] ⤑ [🆄 #3fc9 SCROLL] [🅲 #FE7B CDP] ⏵ Input.dispatchMouseEvent({type:mouseMoved,x:644,y:400,button:none}) 2025-12-08 12:20:45.13762 [🅰 #c756] ⤑ [🆄 #3fc9 SCROLL] [🅲 #FE7B CDP] ⏵ Input.dispatchMouseEvent({type:mouseW…el,x:644,y:400,button:none,deltaX:0,deltaY:300}) 2025-12-08 12:20:45.14663 [🅰 #c756] ⤑ [🆄 #3fc9 SCROLL] ✓ SCROLL completed in 0.02s 2025-12-08 12:20:45.64764 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] ▷ Page.screenshot({args:[{fullPage:false}]}) 2025-12-08 12:20:45.64965 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:45.65266 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Page.captureScreenshot({format:png,fromSurface:true,captureBeyondViewport:false}) 2025-12-08 12:20:45.68567 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:45.68668 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] ✓ SCREENSHOT completed in 0.04s 2025-12-08 12:20:45.68769 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] ▷ Page.screenshot({args:[{fullPage:false}]}) 2025-12-08 12:20:45.68770 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:45.68971 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Page.captureScreenshot({format:png,fromSurface:true,captureBeyondViewport:false}) 2025-12-08 12:20:45.71372 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:45.71473 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] ✓ SCREENSHOT completed in 0.03s 2025-12-08 12:20:45.71474 [🅰 #c756] ⤑ [🧠 #ed51 LLM] claude-sonnet-4-5-20250929 ⏴ Current URL: https://example.com/ +{15.8kb image} ``` --------- Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> 
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com> Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com>

# why Stand up a Fastify Stagehand server we can reuse for thin-client SDKs across multiple languages. # what changed created new fastify server # test Plan - Start the Fastify server (pnpm --filter server dev or your usual command). - Local browser smoke: MODEL_API_KEY=... ./scripts/test_local_browser.sh - Browserbase smoke: MODEL_API_KEY=... BROWSERBASE_API_KEY=... BROWSERBASE_PROJECT_ID=... ./scripts/test_remote_browser.sh.  --- ## Summary by cubic Adds a new Fastify-based Stagehand API server exposing V3 browser automation over REST with streaming responses and session management. Supports both local Chrome and Browserbase, includes health/readiness endpoints, and ships an OpenAPI spec. - **New Features** - New packages/server with REST routes: start, navigate, observe, act, extract, agentExecute, end (streaming logs/results) - In-memory LRU session store with TTL, lazy V3 init, and cleanup on end - Local and Browserbase browsers; credentials passed via headers - Health (/healthz) and readiness (/readyz), metrics, and structured request logging - OpenAPI v3 spec and README - Removed v2 code and DB dependency; auth currently disabled - **Migration** - Run: pnpm --filter @browserbasehq/stagehand-server dev - Required header: x-model-api-key; for Browserbase also x-bb-api-key and x-bb-project-id Written for commit ed1089b. Summary will update automatically on new commits.

# Agent Abort Signal and Message Continuation ## Why Enable users to cancel long-running agent tasks and continue conversations across multiple `execute()` calls. Also ensures graceful shutdown when `stagehand.close()` is called by automatically aborting any running agent tasks. ## What Changed ### New Features (behind `experimental: true`) #### Abort Signal Support - Pass `signal` to `agent.execute()` to cancel execution mid-run - Works with `AbortController` and `AbortSignal.timeout()` - Throws `AgentAbortError` when aborted #### Message Continuation - `execute()` now returns `messages` in the result - Pass previous messages to continue a conversation across calls ### New Utilities | File | Purpose | |---------------------------------|-------------------------------------------------------------------------------------------| | `combineAbortSignals.ts` | Merges multiple signals (uses native `AbortSignal.any()` on Node 20+, fallback for older) | | `errorHandling.ts` | Consolidates abort detection logic—needed because `close()` may cause indirect errors (e.g., null context) that should still be treated as abort | | `validateExperimentalFeatures.ts` | Single place for all experimental/CUA feature validation | ### CUA Limitations Abort signal and message continuation are not supported with CUA mode (throws `StagehandInvalidArgumentError`). This matches existing streaming limitation. ### Tests Added - `agent-abort-signal.spec.ts` (7 tests) - `agent-message-continuation.spec.ts` (4 tests) - `agent-experimental-validation.spec.ts` (17 tests)  --- ## Summary by cubic Adds agent abort support and conversation continuation. You can cancel long runs, auto-abort on close, and carry messages across execute() calls. Feature is gated behind experimental: true and has clear CUA limitations. 
- **New Features** - Abort signal for execute() and stream() with AbortController and AbortSignal.timeout; throws AgentAbortError; stagehand.close() auto-aborts via an internal controller combined with any user signal. - Message continuation: execute() returns messages and accepts previous messages on the next call; tool calls and results are included. - **Refactors** - Centralized experimental/CUA validation via validateExperimentalFeatures: CUA disallows streaming, abort signal, and message continuation; experimental required for integrations, tools, streaming, callbacks, signal, and messages. - Public API updates: re-export ModelMessage; Agent types include messages and signal; AgentAbortError exported for consistent abort typing. Written for commit 5276e41. Summary will update automatically on new commits.  --------- Co-authored-by: Nick Sweeting <github@sweeting.me>
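Merging a user-supplied signal with an internal one, as described for `combineAbortSignals.ts`, could look roughly like the sketch below. It is not the file from the PR: the function name `combineSignalsSketch` and the exact fallback behaviour are assumptions. It uses the native `AbortSignal.any()` when the runtime provides it and a manual fallback otherwise.

```typescript
// Hypothetical sketch of combining several AbortSignals into one.
// Not the PR's implementation; the name and fallback details are assumptions.
function combineSignalsSketch(
  ...signals: Array<AbortSignal | undefined>
): AbortSignal {
  const active = signals.filter((s): s is AbortSignal => s !== undefined);

  // Prefer the native implementation where the runtime has it.
  const nativeAny = (
    AbortSignal as unknown as { any?: (s: AbortSignal[]) => AbortSignal }
  ).any;
  if (typeof nativeAny === "function") {
    return nativeAny.call(AbortSignal, active);
  }

  // Fallback: abort a fresh controller as soon as any input signal aborts.
  const controller = new AbortController();
  for (const signal of active) {
    if (signal.aborted) {
      controller.abort(signal.reason);
      break;
    }
    signal.addEventListener("abort", () => controller.abort(signal.reason), {
      once: true,
    });
  }
  return controller.signal;
}

// Usage: an internal controller (e.g. one aborted on close) combined with an
// optional user signal; aborting either aborts the merged signal.
const internal = new AbortController();
const user = new AbortController();
const merged = combineSignalsSketch(internal.signal, user.signal);
user.abort(new Error("cancelled by user"));
console.log(merged.aborted);
```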

# why

The click count in CDP's [Input.dispatchMouseEvent](https://chromedevtools.github.io/devtools-protocol/tot/Input/#method-dispatchMouseEvent) does **not** issue multiple click events; it is mainly kept for tracking. Individual `mousePressed`/`mouseReleased` events must be sent for each click.

# what changed

Added a for loop over the `clickCount` number provided in both `locator.click()` and `page.click()`. Also built redundancy around `AnthropicCUAClient` double_click coordinate parsing.

# test plan

- [x] tested on https://doubleclicktest.com/
- [x] added evals site and unit tests in `click-count.spec.ts`

---

## Summary by cubic

Fixes multiple-click behavior by dispatching individual mousePressed/mouseReleased events per click and normalizes Anthropic CUA doubleClick coordinates. Double-clicks and multi-clicks now work reliably via CDP and CUA.

- **Bug Fixes**
  - locator.click and page.click now loop over clickCount, sending pressed/released pairs for each click.
  - AnthropicCUAClient parses doubleClick consistently and falls back to coordinate arrays when x/y are missing.
  - Added tests for single, double, and triple clicks for locator.click and page.click.

Written for commit 26b784d. Summary will update automatically on new commits.
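The per-click loop described above can be sketched as follows. The `SendCdp` type and the `clickAtSketch` function are hypothetical stand-ins for a CDP session, and passing the running count (1, 2, 3, ...) as `clickCount` on each event is an assumption about the approach rather than a statement of what the PR does.

```typescript
// Minimal stand-in for a CDP session's send(); hypothetical, for illustration.
type SendCdp = (
  method: string,
  params: Record<string, unknown>,
) => Promise<void>;

// Dispatch one mousePressed/mouseReleased pair per click, rather than a
// single pair carrying clickCount: N.
async function clickAtSketch(
  send: SendCdp,
  x: number,
  y: number,
  clickCount = 1,
): Promise<void> {
  for (let i = 1; i <= clickCount; i++) {
    // Assumption: each pair reports the running click count.
    const base = { x, y, button: "left", clickCount: i };
    await send("Input.dispatchMouseEvent", { ...base, type: "mousePressed" });
    await send("Input.dispatchMouseEvent", { ...base, type: "mouseReleased" });
  }
}

// Usage with a recording sender instead of a real browser connection.
const recorded: string[] = [];
clickAtSketch(
  async (_method, params) => {
    recorded.push(`${String(params.type)}:${String(params.clickCount)}`);
  },
  10,
  20,
  2,
).then(() => console.log(recorded));
```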

# why - Google CUA agent was crashing with `Cannot read properties of undefined (reading 'parts')` - This can happen when the model's response is blocked due to safety filters, rate limiting, or other API-level issues # what changed - Added a null check for `candidate.content` and `candidate.content.parts` in `GoogleCUAClient.processResponse()` - When content is missing, the agent now gracefully returns with the finishReason logged for debugging  --- ## Summary by cubic Fixed crash in the Google CUA agent when Gemini returns an empty or blocked response. We now guard against missing content, log the finishReason, and return a safe, completed response with no actions or function calls. Written for commit 5309757. Summary will update automatically on new commits.
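The null check described above might look like the sketch below. The `CandidateLike` and `PartLike` types are simplified assumptions about the response shape, and `processCandidateSketch` is a hypothetical stand-in, not `GoogleCUAClient.processResponse()` itself.

```typescript
// Simplified, assumed shapes for a model response candidate.
interface PartLike {
  text?: string;
  functionCall?: { name: string; args?: Record<string, unknown> };
}
interface CandidateLike {
  content?: { parts?: PartLike[] };
  finishReason?: string;
}
interface ProcessedSketch {
  empty: boolean; // true when the response carried no usable content
  parts: PartLike[];
  finishReason: string | undefined;
}

// Guard against missing content before reading `parts`, and surface the
// finishReason so blocked or rate-limited responses can be diagnosed.
function processCandidateSketch(
  candidate: CandidateLike | undefined,
): ProcessedSketch {
  const parts = candidate?.content?.parts;
  const finishReason = candidate?.finishReason;
  if (!parts) {
    console.warn(
      `response had no content; finishReason: ${finishReason ?? "unknown"}`,
    );
    return { empty: true, parts: [], finishReason };
  }
  return { empty: false, parts, finishReason };
}

// Usage: a blocked response no longer throws on `.parts`.
console.log(processCandidateSketch({ finishReason: "SAFETY" }).empty);
```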

# why

A CI test failed because its timeout was hit on 1 of 3 CI runs. It is unclear whether this will fail again, but the timeout is being increased to prevent it in the future.

# what changed

Increased the timeout from 10s to 20s.

---

## Summary by cubic

Increased the test timeout from 10s to 20s in agent-abort-signal.spec to reduce CI flakiness and avoid false timeouts on slower runs.

Written for commit 8d3c418. Summary will update automatically on new commits.

# why These dev dependencies don't belong here. Some are no longer used, some should go into their respective packages # what changed Moved dev dependencies to respective packages and removed unused ones # test plan  --- ## Summary by cubic Moved dev dependencies from the workspace root into the packages that use them and removed unused ones to cut install bloat. - **Dependencies** - Removed unused devDependencies from the root; moved required ones into packages/core and packages/evals. - Added missing dev deps to packages/core (@types/adm-zip, @types/node, @types/ws, adm-zip, chalk, esbuild) and packages/evals (braintrust, chalk, string-comparison). - Cleaned pnpm-lock.yaml (large reduction in entries). Written for commit c6f6221. Summary will update automatically on new commits.

# why - writing base64 screenshots to disk is unnecessary: screenshots do not get replayed, so there is no sense in writing it to disk # what changed - added a `pruneAgentResult()` fn which prunes the screenshot entry before it is written to disk # test plan - existing tests & evals should suffice for this one  --- ## Summary by cubic Stop writing base64 screenshots to the agent cache to reduce disk usage and keep cache entries lean. Screenshots aren’t replayed, so pruning them has no impact on behavior. - **Refactors** - Added pruneAgentResult to remove screenshot base64 blobs from actions before persisting. - Prunes only the cached copy; the live AgentResult returned to callers is unchanged. Written for commit 625f982. Summary will update automatically on new commits.
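Pruning screenshots from the cached copy only, as described above, could be sketched like this. The result and action shapes, the `base64` field name, and the function name `pruneScreenshotsSketch` are assumptions for illustration; the actual `pruneAgentResult()` may differ.

```typescript
// Assumed, simplified shapes; the real AgentResult type is richer.
interface AgentActionLike {
  type: string;
  [key: string]: unknown;
}
interface AgentResultLike {
  success: boolean;
  actions: AgentActionLike[];
}

// Return a copy of the result with screenshot payloads removed.
// Assumption: the payload lives in a `base64` field on screenshot actions.
function pruneScreenshotsSketch(result: AgentResultLike): AgentResultLike {
  return {
    ...result,
    actions: result.actions.map((action) => {
      if (!("base64" in action)) return action;
      const copy: AgentActionLike = { ...action };
      delete copy["base64"]; // drop the blob from the copy only
      return copy;
    }),
  };
}

// Usage: the live result keeps its screenshot, the cached copy does not.
const live: AgentResultLike = {
  success: true,
  actions: [{ type: "screenshot", base64: "iVBORw0..." }, { type: "click" }],
};
const cached = pruneScreenshotsSketch(live);
console.log("base64" in cached.actions[0], "base64" in live.actions[0]);
```

Working on a copy keeps the object returned to callers unchanged, which matches the note that only the cached entry is pruned.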

# why - `extract()` was missing from `stagehand.history()` - addresses #1357 # what changed - added a call to `addToHistory()` after `extract()` finishes # test plan  --- ## Summary by cubic Include extract() in stagehand.history() so extract actions and results are tracked with instruction, selector, timeout, and schema details. Fixes missing history entries for extract and addresses #1357. Written for commit 84f95db. Summary will update automatically on new commits.

@miguelg719

This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated. # Releases ## @browserbasehq/stagehand@3.0.6 ### Patch Changes - [#1388](#1388) [`605ed6b`](605ed6b) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix multiple click event dispatches on CDP and Anthropic CUA handling (double clicks) - [#1400](#1400) [`34e7e5b`](34e7e5b) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - don't write base64 encoded screenshots to disk when caching agent actions - [#1345](#1345) [`943d2d7`](943d2d7) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for aborting / stopping an agent run & continuing an agent run using messages from prior runs - [#1334](#1334) [`0e95cd2`](0e95cd2) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for google vertex provider - [#1410](#1410) [`d4237e4`](d4237e4) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: include extract in stagehand.history() - [#1315](#1315) [`86975e7`](86975e7) Thanks [@tkattkat](https://github.com/tkattkat)! - Add streaming support to agent through stream:true in the agent config - [#1304](#1304) [`d5e119b`](d5e119b) Thanks [@miguelg719](https://github.com/miguelg719)! - Add support for Microsoft's Fara-7B - [#1346](#1346) [`4e051b2`](4e051b2) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: don't attach to targets twice - [#1327](#1327) [`6b5a3c9`](6b5a3c9) Thanks [@miguelg719](https://github.com/miguelg719)! - Informed error parsing from api - [#1335](#1335) [`bb85ad9`](bb85ad9) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - add support for page.addInitScript() - [#1331](#1331) [`88d28cc`](88d28cc) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! 
- fix: page.evaluate() now works with scripts injected via context.addInitScript() - [#1316](#1316) [`45bcef0`](45bcef0) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for callbacks in stagehand agent - [#1374](#1374) [`6aa9d45`](6aa9d45) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix key action mapping in Anthropic CUA - [#1330](#1330) [`d382084`](d382084) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: make act, extract, and observe respect user defined timeout param - [#1336](#1336) [`1df08cc`](1df08cc) Thanks [@tkattkat](https://github.com/tkattkat)! - Patch agent on api - [#1358](#1358) [`2b56600`](2b56600) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for 4.5 opus in cua agent ## @browserbasehq/stagehand-evals@1.1.5 ### Patch Changes - [#1364](#1364) [`ca0630e`](ca0630e) Thanks [@tkattkat](https://github.com/tkattkat)! - Update model handling in agent evals cli - Updated dependencies \[[`605ed6b`](605ed6b), [`34e7e5b`](34e7e5b), [`943d2d7`](943d2d7), [`0e95cd2`](0e95cd2), [`d4237e4`](d4237e4), [`86975e7`](86975e7), [`d5e119b`](d5e119b), [`4e051b2`](4e051b2), [`6b5a3c9`](6b5a3c9), [`bb85ad9`](bb85ad9), [`88d28cc`](88d28cc), [`45bcef0`](45bcef0), [`6aa9d45`](6aa9d45), [`d382084`](d382084), [`1df08cc`](1df08cc), [`2b56600`](2b56600)]: - @browserbasehq/stagehand@3.0.6 ## @browserbasehq/stagehand-server@3.0.6 ### Patch Changes - Updated dependencies \[[`605ed6b`](605ed6b), [`34e7e5b`](34e7e5b), [`943d2d7`](943d2d7), [`0e95cd2`](0e95cd2), [`d4237e4`](d4237e4), [`86975e7`](86975e7), [`d5e119b`](d5e119b), [`4e051b2`](4e051b2), [`6b5a3c9`](6b5a3c9), [`bb85ad9`](bb85ad9), [`88d28cc`](88d28cc), [`45bcef0`](45bcef0), [`6aa9d45`](6aa9d45), [`d382084`](d382084), [`1df08cc`](1df08cc), [`2b56600`](2b56600)]: - @browserbasehq/stagehand@3.0.6 Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

# why update agent docs to reflect new features # what changed - docs on abort signal - docs on message continuation - docs on streaming - docs on callbacks  --- ## Summary by cubic Updated Agent docs to cover new experimental capabilities—streaming, callbacks, abort signals, and message continuation—and clarified what’s supported for Computer Use Agents vs non-CUA. This helps build real-time UIs, control execution, and maintain conversation state. - **New Features** - Added CUA vs non-CUA feature matrix. - Documented streaming mode (`stream: true`), `textStream`/`fullStream`, and `AgentStreamResult`. - Added lifecycle callbacks for non-streaming and streaming, with examples. - Added `AbortSignal` usage, timeout patterns, and streaming abort behavior. - Added message continuation via `messages` in `execute` options. - Updated references: `AgentConfig.stream`, `messages`, `signal`, `callbacks`, response fields (e.g., `messages`, `timestamp`), and new error types. - **Migration** - Set `experimental: true` to use these features; they are not supported with CUA. - Enable `stream: true` for streaming and streaming callbacks; using streaming-only callbacks without streaming will throw. - Pass previous result `messages` to continue conversations; use `AbortController.signal` to cancel runs. Written for commit 3b58bf9. Summary will update automatically on new commits.

# why - `act`, `extract`, & `observe` fail, and stagehand logs `AI_LoadAPIKeyError` if a user attempts to use a google LLM, and has `GOOGLE_API_KEY` in their `.env` instead of `GOOGLE_GENERATIVE_AI_API_KEY` or `GEMINI_API_KEY` # what changed - this PR widens the accepted env vars for google models to accept `GOOGLE_API_KEY`  --- ## Summary by cubic Allow GOOGLE_API_KEY for Google models by expanding the env var lookup. Fixes key-loading failures in act, extract, and observe when users set GOOGLE_API_KEY in .env. Written for commit 4984318. Summary will update automatically on new commits.
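The widened lookup can be sketched roughly as follows. The three variable names come from the description above; the precedence order and the helper name `resolveGoogleApiKey` are assumptions, not confirmed by the PR.

```typescript
type EnvVars = Record<string, string | undefined>;

// Variable names taken from the PR description. The order in which they
// are checked is an assumption made for this sketch.
const GOOGLE_KEY_VARS = [
  "GOOGLE_GENERATIVE_AI_API_KEY",
  "GEMINI_API_KEY",
  "GOOGLE_API_KEY",
];

// Returns the first non-empty key found, or undefined if none is set.
function resolveGoogleApiKey(env: EnvVars): string | undefined {
  for (const name of GOOGLE_KEY_VARS) {
    const value = env[name];
    if (value) return value;
  }
  return undefined;
}
```

In a Node process this would typically be called as `resolveGoogleApiKey(process.env)`.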

…1409) # why - update `act` reference to use `"provider/model-name"` formatting --------- Co-authored-by: Sean McGuire <seanmcguire1@outlook.com>

# why We didn't have a link to our Discord # what changed <img width="289" height="245" alt="image" src="https://github.com/user-attachments/assets/d1e12f96-db02-4982-806f-fc45d6bb42fb" /> # test plan n/a  --- ## Summary by cubic Added a Discord link across the docs (global anchors, navbar, footer) so users can quickly join the community. Also added a GitHub anchor and removed the outdated “Stagehand by Browserbase” link. Written for commit fb5d591. Summary will update automatically on new commits.

# why - when transitioning to v3, we did not use the latest version of screenshot collector - screenshot collector currently fails due to not having page.on and page.off support for the load, and domcontentloaded events. # what changed - added latest version of screenshot collector # test plan - ran evals in cli with additional logging to also verify everything is working as expected  --- ## Summary by cubic Updated the evals CLI screenshot collector to the latest version, adding image-diff filtering and a V3 event bus that emits agent screenshots. This reduces duplicate screenshots and stabilizes capture on v3 pages where navigation events are disabled. - **New Features** - Skip similar screenshots using MSE/SSIM thresholds with sharp. - Event bus integration: agents emit screenshots; collector can ingest them. - Non-blocking initial/final captures and safer interval capture with error handling. - **Dependencies** - Added sharp ^0.34.5 for image processing (evals and core). - Patch bump via changeset for @browserbasehq/stagehand-evals. Written for commit f4e90f8. Summary will update automatically on new commits.  --------- Co-authored-by: miguel <miguelg71921@gmail.com> Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com>
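To illustrate the "skip similar screenshots" idea, here is a small sketch that uses only the MSE half of the check on raw pixel buffers. It does not use sharp or SSIM, and the function names and the default threshold of 30 are hypothetical values chosen for illustration.

```typescript
// Mean squared error between two equally sized raw pixel buffers.
function meanSquaredError(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("buffers must be non-empty and the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    sum += diff * diff;
  }
  return sum / a.length;
}

// Keep a screenshot when there is no previous one, when the dimensions
// changed, or when it differs enough from the previous one.
// The default threshold is an arbitrary illustrative value.
function shouldKeepScreenshot(
  prev: Uint8Array | null,
  next: Uint8Array,
  mseThreshold = 30,
): boolean {
  if (prev === null || prev.length !== next.length) return true;
  return meanSquaredError(prev, next) > mseThreshold;
}
```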

# why we need more evals for agent # what changed - Added 19 new evals composed primarily of "hard" level tasks from public datasets such as onlineMind2web - Updated evals to import agent from agent, rather than v3Agent, as it was an incorrect import causing tasks to fail # test plan ran evals  --- ## Summary by cubic Added 18 new hard-level agent evals and fixed the agent import to use the correct agent, improving coverage and stability of browser tasks. - **New Features** - Added evals for diverse sites (Amazon cart, KFC order, Redfin rentals, Flipkart filters, WebMD tools, Trustpilot, Uniqlo, Alibaba, NVIDIA drivers, OED search, Radiotimes, TheGamer, Trailhead, etc.). - Integrated ScreenshotCollector in new evals to capture journeys for better automated evaluation. - Updated evals.config.json to register all new tasks under the agent category. - **Bug Fixes** - Replaced v3Agent with agent across existing evals to prevent task failures. - Standardized agent.execute usage and evaluation flow to improve reliability. Written for commit b947d97. Summary will update automatically on new commits.

# why We had `page.click(x, y)` for coordinate-based clicking but no equivalent for hovering. Also, the agent will need hover abilities # what changed - Added `page.hover(x, y, options?)` to dispatch mouse move events at coordinates # test plan Added `page-hover.spec.ts` with 6 tests covering: - Mouseover event triggers - Hover doesn't click - `returnXpath` option - CSS `:hover` pseudo-class activation - Multiple sequential hovers  --- ## Summary by cubic Adds page.hover(x, y, options?) for coordinate-based hovering. Enables mouseover and CSS :hover without clicking, with an option to return the hovered element’s XPath. - **New Features** - Dispatches mouseMoved at absolute page coordinates via CDP. - Supports options.returnXpath to return the element XPath. - Moves cursor without triggering click; activates mouseover and :hover states. Written for commit 5b3b39f. Summary will update automatically on new commits.
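A rough sketch of what dispatching a hover over CDP can look like. `Input.dispatchMouseEvent` with `type: "mouseMoved"` is a real CDP command; the `CdpSend` type and the helpers `buildHoverParams` and `hoverAt` are hypothetical and are not the actual `page.hover()` implementation.

```typescript
// Subset of the parameters accepted by the CDP command
// Input.dispatchMouseEvent that a plain hover needs.
interface MouseMovedParams {
  type: "mouseMoved";
  x: number;
  y: number;
  button: "none";
}

// Hypothetical transport: anything that can send a CDP command.
type CdpSend = (method: string, params: object) => Promise<unknown>;

function buildHoverParams(x: number, y: number): MouseMovedParams {
  return { type: "mouseMoved", x, y, button: "none" };
}

// Moves the pointer without pressing any button, so no click is produced.
async function hoverAt(send: CdpSend, x: number, y: number): Promise<void> {
  await send("Input.dispatchMouseEvent", buildHoverParams(x, y));
}
```

Because only a `mouseMoved` event is sent, with no `mousePressed` or `mouseReleased`, the page sees pointer movement but no click.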

# why - this function was from legacy stagehand which only operated on one page - presently, it was only being used to produce a log which: - at best, misinformed users on whether the page had actually navigated, and, - at worst, resulted in a noisy error log - the error log would happen if `clickElement()` triggered page closure. this means that the frame.evaluate() to get the URL would attempt `.evaluate()` on a frame that no longer existed # what changed - removed `handlePossibleNavigation()` # test plan - existing tests are fine here  --- ## Summary by cubic Removed the legacy handlePossibleNavigation() that tried to detect navigation by URL and produced misleading logs. This also prevents errors when clicks close the page and evaluate runs on a non-existent frame, reducing log noise. Written for commit 1d2c3d6. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1761">Review in cubic</a>

# what changed - added documentation for the `context.setExtraHTTPHeaders()` function  --- ## Summary by cubic Add v3 docs for context.setExtraHTTPHeaders(), including API, context-wide behavior (applies to all pages, replaces not merges, clear via {}), examples, and error docs. Also updates the V3Context interface to include this method; addresses Linear STG-1414. Written for commit c6f64ee. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1762">Review in cubic</a>
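The "replaces, not merges" and "clear via {}" semantics mentioned above can be modelled with a tiny in-memory sketch. `HeaderStoreSketch` is a hypothetical class that only mirrors the described behavior; it does not talk to a browser and is not the Stagehand implementation.

```typescript
// Hypothetical model of context-wide extra headers.
class HeaderStoreSketch {
  private headers: Record<string, string> = {};

  // Each call replaces the previous set entirely; nothing is merged.
  setExtraHTTPHeaders(next: Record<string, string>): void {
    this.headers = { ...next };
  }

  // Returns a copy so callers cannot mutate internal state.
  current(): Record<string, string> {
    return { ...this.headers };
  }
}
```

Under this model, calling `setExtraHTTPHeaders({ "X-A": "1" })` and then `setExtraHTTPHeaders({ "X-B": "2" })` leaves only `X-B`, and a final `setExtraHTTPHeaders({})` leaves no extra headers at all.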

Fixes CE-731 ## Summary - Remove Claude 3.5 Sonnet (`claude-3-5-sonnet-latest`, `-20241022`, `-20240620`) and Claude 3.7 Sonnet (`claude-3-7-sonnet-latest`, `-20250219`) from all supported model lists - These models are **retired** by Anthropic — API calls to them will fail - Replace with `claude-sonnet-4-20250514` across evals, CI, docs, and examples ## What changed (27 files) - **Core types**: Removed from `model.ts` type union, `agent.ts` CUA models list - **Provider mappings**: Removed from `LLMProvider.ts`, `AgentProvider.ts`, server `utils.ts`, server `model.ts` - **Evals/CI**: Updated `taskConfig.ts`, `initV3.ts`, `ci.yml`, `.env.example` to use `claude-sonnet-4-20250514` - **Tests**: Updated `model-deprecation.test.ts` and `llm-and-agents.test.ts` (513/513 pass) - **Docs**: Updated all v2 and v3 documentation references (11 `.mdx` files) - **Other**: Issue template, MCP example ## Context Per [Anthropic's model deprecations page](https://docs.anthropic.com/en/docs/resources/model-deprecations): | Model | Retired | |-------|---------| | `claude-3-5-sonnet-20240620` | Oct 28, 2025 | | `claude-3-5-sonnet-20241022` | Oct 28, 2025 | | `claude-3-7-sonnet-20250219` | Feb 19, 2026 | ## Test plan - [x] All 513 unit tests pass (`pnpm exec turbo run test:core`) - [x] `grep` confirms zero remaining references outside CHANGELOG.md (historical) - [ ] Verify CI passes 🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary - expose `headers` on `GoogleVertexProviderSettings` in Stagehand public model types - add a public API type test proving model configs with headers are accepted for google/openai/anthropic - add a patch changeset ## Context Runtime already forwards provider options to `createVertex()`, but TypeScript rejected `headers` in model config. This aligns public types with runtime behavior. ## Validation - `pnpm -C packages/core run typecheck` - `pnpm -C packages/core run build:esm` - `pnpm -C packages/core run test:core -- packages/core/dist/esm/tests/unit/public-api/llm-and-agents.test.js` - `pnpm -C packages/core run test:core -- packages/core/dist/esm/tests/unit/llm-provider.test.js`  --- ## Summary by cubic Expose the headers field on GoogleVertexProviderSettings in the public model config types so custom provider headers (e.g., X-Goog-Priority) are accepted without TypeScript errors. Updated the public API type test to cover Vertex headers and align the model config check with the public API style, keeping types consistent with runtime behavior. Written for commit bf4907d. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1764">Review in cubic</a>

…l Cache sections (#1770) ## Summary Restructures the caching best practices docs page into two clear sections: ### Changes - **Removed** the disclaimer/note about server-side caching only working with `env: "BROWSERBASE"` — this is now naturally conveyed in the section description - **Renamed** "Server-side Caching" → **"Browserbase Cache"** with a clear description of what it is (managed, server-side, automatic, zero-config) - **Renamed** "Local Caching" → **"Local Cache"** with a clear description of what it is (file-based, works everywhere, portable) - **Added** use-case bullets to the Local Cache section explaining when to reach for it (agent workflows, CI/CD, local dev, cross-machine sharing) - **Preserved** all existing code snippets, configuration examples, and best practices ### What stays the same - All code examples (disabling on constructor, disabling per call, inspecting cache status, act/agent caching, cache directory organization) - The limitations section for Browserbase Cache - The best practices accordion (descriptive dirs, clearing cache, committing for CI/CD) - The blog link for deeper technical details Only modifies `packages/docs/v3/best-practices/caching.mdx`. Linear: https://linear.app/browserbase/issue/STG-1482

…ction time (#1719) # why Init script injection was sometimes racing with Debugger.resume(), causing frames to load without init scripts running. This led to flaky init script tests, which were legitimately catching the issue. - https://github.com/browserbase/stagehand/actions/runs/22233062982/job/64336364420?pr=1580 <img width="1613" height="987" alt="image" src="https://github.com/user-attachments/assets/e836cd65-ed3b-41c8-8f8e-152fd70f30f4" /> # what changed - queues calls on page load to run before we resume - catches OOPIFs, lazy frames, and click-triggered popups the same way Playwright does - removes the prior flaky timeout/retry-based approach https://deepwiki.com/search/how-does-playwright-guarantee_8cf2339b-c060-4cfc-bc62-f3baaf57b229?mode=deep # test plan  --- ## Summary by cubic Fixes the init‑script race by guaranteeing pre‑resume setup and correcting popup attach order. Init scripts now run reliably in same‑ and cross‑process popups, OOPIF iframes, and across reloads; race tests verify addScript is sent before resume per session. - **Bug Fixes** - Enforce pre‑resume ordering: per‑session dispatch waiters ensure Page/Runtime enables, Target.setAutoAttach(waitForDebuggerOnStart), Network.enable/setExtraHTTPHeaders, and Page.addScriptToEvaluateOnNewDocument(runImmediately) are sent before Runtime.runIfWaitingForDebugger; resume only after dispatch; log ordering issues only for top‑level pages. - Stabilize attach and evaluation: fix popup attach ordering; fan out Target.* events to root listeners; retry Runtime.evaluate once on stale context ids; pre‑register the piercer script before resume and lazy‑install if needed. - Harden lifecycle: convert detach errors to PageNotFoundError and propagate; treat Page.enable/lifecycle acks as best‑effort; never drop top‑level Page.create due to local timeouts. 
- Expand tests and deflake: add delayed‑CDP‑send popup/iframe race repro with real URLs; assert addScript precedes resume per session; cover in‑process and cross‑process popups, window.open, OOPIF iframes, and reload persistence; update detach expectations and timeouts. Written for commit 6f464d3. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1719">Review in cubic</a>
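The "addScript precedes resume per session" assertion can be sketched as a pure check over a log of sent CDP commands. The two method names are real CDP commands; the `SentCommand` shape and the helper `sessionsResumedTooEarly` are hypothetical and are not taken from the actual test suite.

```typescript
// Hypothetical record of one CDP command sent on a given session.
interface SentCommand {
  sessionId: string;
  method: string;
}

const ADD_SCRIPT = "Page.addScriptToEvaluateOnNewDocument";
const RESUME = "Runtime.runIfWaitingForDebugger";

// Returns the ids of sessions that were resumed before (or without)
// an init script being registered on that same session.
function sessionsResumedTooEarly(log: SentCommand[]): string[] {
  const sawAddScript = new Set<string>();
  const violations = new Set<string>();
  for (const entry of log) {
    if (entry.method === ADD_SCRIPT) {
      sawAddScript.add(entry.sessionId);
    } else if (entry.method === RESUME && !sawAddScript.has(entry.sessionId)) {
      violations.add(entry.sessionId);
    }
  }
  return Array.from(violations);
}
```

An empty result means every session in the log had its init script registered before it was resumed.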

# why - to allow for setting HTTP headers at the page level # what changed - added new function `page.setExtraHTTPHeaders()`, which sets HTTP headers for the CDP session of the page and all of its child sessions (e.g., iframes)  --- ## Summary by cubic Adds page.setExtraHTTPHeaders() to set per-page HTTP headers on all requests from the page and its iframes. Applies to pipeline sessions immediately and replays on newly adopted child sessions. Addresses STG-1316. Written for commit cf677c2. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1774">Review in cubic</a>

) ## Summary - Adds support for CDP (Chrome DevTools Protocol) extra HTTP headers when connecting to browser sessions - Passes `extraHTTPHeaders` from the Stagehand config through to the CDP connection layer - Warns when `cdpHeaders` provided without `cdpUrl` - Includes integration test for the new functionality Related: #1737 Co-authored-by: aditya-silna <aditya@silnahealth.com> 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: aditya-silna <aditya@silnahealth.com>

# why After the build migration, `pnpm build:cli` was no longer linking or preserving overridden configs # what changed - Added bin field in `package.json` to enable npm linking - Implemented smart config merging in the build script that updates tasks/benchmarks from source while preserving user-customized defaults - Added auto-linking via `npm link --force` at the end of the build process with graceful fallback, for whenever users run `pnpm build:cli` - Set `serverCache: false` in initV3 for consistent eval behavior on API # test plan --------- Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
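One way to read the "smart config merging" bullet is sketched below: tasks and benchmarks always come from the source config, while defaults already customized by the user win over source defaults. The `EvalsConfigSketch` shape, its field names, and `mergeCliConfig` are assumptions for illustration; the real build script may structure this differently.

```typescript
// Hypothetical config shape; the real config file may have other fields.
interface EvalsConfigSketch {
  defaults: Record<string, unknown>;
  tasks: string[];
  benchmarks: string[];
}

// Source of truth for tasks and benchmarks is the repo config. User-edited
// defaults from the existing built config are preserved on top.
function mergeCliConfig(
  source: EvalsConfigSketch,
  existing: EvalsConfigSketch | null,
): EvalsConfigSketch {
  return {
    tasks: [...source.tasks],
    benchmarks: [...source.benchmarks],
    defaults: { ...source.defaults, ...(existing ? existing.defaults : {}) },
  };
}
```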

## Summary - Server integration tests, evals, and Stainless preview builds require repo secrets that GitHub doesn't expose to fork PRs - These jobs were running and failing with missing env var errors on every fork PR - Add the same fork guard (`head.repo.full_name == github.repository`) that e2e tests already use ### Jobs fixed: - `server-integration-tests` in `ci.yml` - `run-evals` in `ci.yml` - `preview` in `stainless.yml` ## Test plan - [ ] Verify existing (non-fork) PRs still run all CI jobs - [ ] Verify fork PRs skip the guarded jobs gracefully 🤖 Generated with [Claude Code](https://claude.com/claude-code)  --- ## Summary by cubic Skip CI jobs that require repo secrets on fork PRs to prevent missing env errors. These jobs now run only when the PR comes from this repo. - **Bug Fixes** - Guarded server integration tests in ci.yml. - Guarded eval runs in ci.yml. - Guarded Stainless preview builds in stainless.yml. Written for commit f71de8d. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1780">Review in cubic</a>

PSA potential hackers: don't get excited, we don't have any real secrets in CI worth stealing, and our CI does not autodeploy anything to prod. All important secrets and CD processes are kept in our closed-source repos. # why # what changed # test plan  --- ## Summary by cubic Add a gating workflow that blocks CI until a maintainer approves running secrets on forked PRs. CI now triggers from that gate, resolves labels and path filters under workflow_run, removes same-repo guards so integration/e2e/evals run on approved forks, and checks out the PR commit consistently across jobs. Written for commit c682847. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1782">Review in cubic</a>

…ed" (#1786) Reverts #1782  --- ## Summary by cubic Reverts the approval-based CI for external contributors. CI now runs on pull_request and blocks secrets for forked PRs by skipping integration, E2E, and eval jobs. - **Refactors** - Removed the “Ensure Contributor Is Trusted to Run CI” workflow. - Switched CI trigger to pull_request; removed workflow_run logic. - Read labels from github.event.pull_request; removed API calls. - Simplified checkouts; dropped explicit head_sha refs. - Updated concurrency group to use github.ref. - Ignored docs-only changes in CI. Written for commit d6ace82. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1786">Review in cubic</a>

Reverts #1780  --- ## Summary by cubic Reverts the change that skipped CI on forked PRs. Integration tests, evals, and the Stainless preview now run for all PRs by removing the head-repo equality checks in ci.yml and stainless.yml. Written for commit 18480e8. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1787">Review in cubic</a>

# why cdpHeaders is already plumbed through packages/server correctly, it was just missing from the spec. - packages/core/lib/v3/types/public/api.ts:15 defines cdpHeaders on LocalBrowserLaunchOptionsSchema. - packages/server/src/routes/v1/sessions/start.ts:192 forwards browser.launchOptions with a spread into localBrowserLaunchOptions, so cdpHeaders is preserved. - packages/server/src/lib/InMemorySessionStore.ts:240 passes localBrowserLaunchOptions straight into new V3(...). - packages/core/lib/v3/v3.ts:750 passes lbo.cdpHeaders into V3Context.create(...). - packages/core/lib/v3/understudy/context.ts:167 finally uses it in CdpConnection.connect(wsUrl, { headers: opts?.cdpHeaders }). # what changed # test plan  --- ## Summary by cubic Added the missing `cdpHeaders` field to the v3 server OpenAPI spec so clients can send custom Chrome DevTools Protocol headers. This aligns the spec with server launch options and prevents client codegen/validation errors. Written for commit 39ee737. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1797">Review in cubic</a>

…and server-v4 dirs (#1796) # Follow-up Tasks - [ ] Update stainless SDK custom code for all languages to pull new `stagehand-server-v3-darwin-x64` binary names (`-v3-` added)  --- ## Summary by cubic Split the Stagehand API into `packages/server-v3` and `packages/server-v4`, each with its own builds, tests, SEA binaries, and release workflows. Delivers STG-1536 and lets us keep v3 stable while iterating on v4; CI/test discovery and OpenAPI artifacts are versioned. - **Refactors** - Renamed the original server to `packages/server-v3` (`@browserbasehq/stagehand-server-v3`); updated docs and runtime path helpers (now synced across core/docs/evals and both servers), ESLint globs/ignores, scripts/Turbo filters, tests, and Stainless to read `packages/server-v3/openapi.v3.yaml`; v3 SEA binaries use `stagehand-server-v3-*`. - Added `packages/server-v4` (`@browserbasehq/stagehand-server-v4`) with `/v4/**` routes, SSE streaming via `x-stream-response`, LRU/TTL in-memory session store, health/readiness, logging/metrics, `openapi.v4.yaml` + generator, SEA tooling, and v4 integration tests. - CI: path filters, test discovery, and artifacts cover both versions; added `stagehand-server-v4-release.yml` and `stagehand-server-v4-sea-build.yml`; renamed v3 workflows; artifacts include `packages/server-v3/**` and `packages/server-v4/**` dists and OAS. - **Migration** - Replace `packages/server/**` refs with `packages/server-v3/**` or `packages/server-v4/**`. - Use new package filters and binary names: `@browserbasehq/stagehand-server-v3` / `@browserbasehq/stagehand-server-v4`; `stagehand-server-v3-*` / `stagehand-server-v4-*`. - Update OpenAPI consumers to `packages/server-v3/openapi.v3.yaml` or `packages/server-v4/openapi.v4.yaml`. Written for commit 2b9114c. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1796">Review in cubic</a>

## Summary - Adds the `@browserbasehq/browse-cli` package (`packages/cli`) to the stagehand monorepo, open-sourcing browser automation for AI agents - CLI provides stateful browser control via a daemon architecture — navigation, clicking, typing, screenshots, accessibility snapshots, multi-tab, network capture, and env switching (local/remote) - Uses `@browserbasehq/stagehand` as a workspace dependency (bundled into the CLI binary via tsup) - Includes full test suite and documentation ## Changes - `packages/cli/` — all CLI source code, config, tests, and docs - `pnpm-workspace.yaml` — added `packages/cli` to workspace - `.github/workflows/ci.yml` — added CLI path filters and build artifact uploads - `.changeset/open-source-browse-cli.md` — changeset for initial release - `pnpm-lock.yaml` — updated lockfile ## Test plan - [x] CLI builds successfully (`pnpm --filter @browserbasehq/browse-cli run build`) - [x] Full monorepo build passes (`turbo run build` — 9/9 tasks) - [x] `browse --help` and `browse --version` output correctly - [x] `browse status` returns valid JSON - [x] Lint passes clean (`pnpm --filter @browserbasehq/browse-cli run lint`) - [x] Source verified identical to stagent-cli (only import path changed) - [x] Empirically tested Browserbase credential requirements match core - [ ] Run `pnpm --filter @browserbasehq/browse-cli run test` (requires Chrome/browser environment) ## Known issues (pre-existing from stagent-cli, not introduced by this PR) - Network capture `response.json` always writes `status: 0` — response metadata from `responseReceived` CDP event is not persisted to `loadingFinished` handler - Ref-based `click` command silently ignores `--button`/`--count`/`--force` flags (coordinate-based `click_xy` handles them correctly) 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…g CI (#1801) # why # what changed # test plan  --- ## Summary by cubic Corrects the changeset package reference from `@browserbasehq/stagehand-server` to `@browserbasehq/stagehand-server-v3` to unblock CI and ensure the correct package receives the patch release. Written for commit 177bc48. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1801">Review in cubic</a>

## Summary - `browse env` showed stale "local" mode after `browse env remote` - Root cause: `.mode` file was only written during lazy browser init (`ensureBrowserInitialized`), not at daemon startup. Between daemon start and first command, `readCurrentMode()` returned `null` and fell back to hardcoded `"local"` - Write `.mode` eagerly in `runDaemon()` at startup so it's immediately available - Fall back to `desiredMode` instead of `"local"` in the `env` display handler as a safety net ## Test plan - [x] Reproduced bug: `browse env remote` → `browse env` showed `"mode":"local"` - [x] Verified fix: `browse env remote` → `browse env` now shows `"mode":"remote"` - [x] `mode.test.ts` passes (3/3)  --- ## Summary by cubic Fixes `browse env` showing stale "local" after `browse env remote` (STG-1547). The daemon now writes `.mode` at startup, the display falls back to `desiredMode` until mode is written, and a patch changeset is added for `@browserbasehq/browse-cli`. Written for commit 9661d92. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1806">Review in cubic</a>  --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
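The fallback change in the `env` display handler boils down to something like the following sketch. `CliMode` and `displayedMode` are hypothetical names; the actual handler in the CLI may look different.

```typescript
type CliMode = "local" | "remote";

// currentMode is what was read from the `.mode` file, or null if the
// file has not been written yet. Before the fix the fallback was the
// hardcoded string "local"; after the fix it is the desired mode.
function displayedMode(currentMode: CliMode | null, desiredMode: CliMode): CliMode {
  return currentMode ?? desiredMode;
}
```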

## Summary - Stacked on #1800 - Only `BROWSERBASE_API_KEY` is required for remote mode in the CLI - `BROWSERBASE_PROJECT_ID` is still passed through if set, but no longer checked ## Changes - `packages/cli/src/index.ts` — `hasBrowserbaseCredentials()` only checks for API key - `packages/cli/tests/mode.test.ts` — Updated test to match new error message - `packages/cli/README.md` — Updated docs to reflect optional project ID ## Test plan - [x] Existing mode test updated - [x] Manual: `browse env remote` with only `BROWSERBASE_API_KEY` set 🤖 Generated with [Claude Code](https://claude.com/claude-code)  --- ## Summary by cubic Make `BROWSERBASE_PROJECT_ID` optional in the CLI for remote mode, so only `BROWSERBASE_API_KEY` is required. The project ID is still forwarded when provided. - **Bug Fixes** - Updated remote mode check and error message to only require `BROWSERBASE_API_KEY`. - Autodetection now defaults to `remote` when the API key is set; otherwise `local`. - Updated tests and `@browserbasehq/browse-cli` README to match. Written for commit 99eb186. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1803">Review in cubic</a>  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
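As a sketch of the relaxed check, assuming the environment is passed in as a plain object: the function names ending in `Sketch` and `autodetectMode` are hypothetical, and only the two environment variable names come from the description above.

```typescript
type CliEnv = Record<string, string | undefined>;

// Only the API key is required; the project ID is optional.
function hasBrowserbaseCredentialsSketch(env: CliEnv): boolean {
  return Boolean(env["BROWSERBASE_API_KEY"]);
}

// Autodetection: remote when the API key is set, otherwise local.
function autodetectMode(env: CliEnv): "local" | "remote" {
  return hasBrowserbaseCredentialsSketch(env) ? "remote" : "local";
}

// The project ID is still forwarded when it is provided.
function sessionOptionsSketch(env: CliEnv): { apiKey?: string; projectId?: string } {
  const options: { apiKey?: string; projectId?: string } = {};
  const apiKey = env["BROWSERBASE_API_KEY"];
  const projectId = env["BROWSERBASE_PROJECT_ID"];
  if (apiKey) options.apiKey = apiKey;
  if (projectId) options.projectId = projectId;
  return options;
}
```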

…r PRs to run CI with secrets (#1794) # why - External contributor PRs currently fail CI because they can't run with secrets - We don't want to allow them to run with secrets until a team member "claims" them and reviews for any secrets exfiltration / sketchy code - Once claimed, we want to run the full CI suite with secrets # what changed # test plan  --- ## Summary by cubic Adds two GitHub Actions that let maintainers claim external contributor PRs by mirroring the approved head SHA to a maintainer-owned branch so full CI can run with secrets. Claims come from an approving review by a team member with write access on the latest commit and are auto-invalidated on new commits (Linear STG-1518). - **New Features** - Detects forked PRs and posts claim instructions; manages labels: `external-contributor`, `external-contributor:awaiting-approval`, `external-contributor:mirrored`, `external-contributor:stale`, `external-contributor:completed`. - On approving review of the latest commit, verifies reviewer permission, mirrors that exact SHA to `external-contributor-pr-<PR#>-<12sha>`, and creates/reopens a “[Claimed #X]” PR assigned to the approver. - Closes and links the original PR with marker comments; keeps labels/status in sync on both PRs. - Auto-closes the mirror when new commits land on the external PR and comments with next steps; if the mirror closes without merge, reopens and relabels the original PR; if the external PR is reopened with the same approved SHA while the mirror is open, it is closed again to keep discussion on the mirror. - Implemented via `external-contributor-pr-approval-handoff.yml` (captures approved reviews, uploads artifact) and `external-contributor-pr.yml` (consumes artifact, performs mirroring); uses `actions/github-script@v7`, `actions/create-github-app-token@v1`, `actions/checkout@v4`, `actions/download-artifact@v4`, `actions/upload-artifact@v4`; concurrency scoped per PR/workflow run. 
- **Migration** - Create a GitHub App with `contents:write`, `pull_requests:write`, and `issues:write`; add `EXTERNAL_CONTRIBUTOR_PR_APP_ID` and `EXTERNAL_CONTRIBUTOR_PR_APP_PRIVATE_KEY` secrets. - To claim: submit an approving review on the latest commit of a forked PR. If new commits are pushed, approve again to re-claim and rerun CI. Written for commit 4875e99. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1794">Review in cubic</a>
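The claim rule and the mirror branch naming described above can be sketched as below. This is only an illustration of the logic, not the workflow's actual script: the `Review` shape, the `WRITE_LEVELS` set, and both function names are assumptions; the branch pattern `external-contributor-pr-<PR#>-<12sha>` is taken from the summary.

```typescript
// Hypothetical shapes; the real workflows run inside actions/github-script.
interface Review {
  state: string; // review state, e.g. "approved"
  commitId: string; // SHA the review was submitted against
  permission: string; // reviewer's repository permission level
}

// Assumption: these permission levels count as "write access".
const WRITE_LEVELS = ["admin", "maintain", "write"];

// A claim is valid only for an approving review, by a reviewer with write
// access, on the PR's latest commit. A new commit changes headSha, which
// invalidates earlier claims.
function isValidClaim(review: Review, headSha: string): boolean {
  return (
    review.state.toLowerCase() === "approved" &&
    WRITE_LEVELS.includes(review.permission) &&
    review.commitId === headSha
  );
}

// Branch name pattern from the summary: external-contributor-pr-<PR#>-<12sha>.
function mirrorBranchName(prNumber: number, approvedSha: string): string {
  return `external-contributor-pr-${prNumber}-${approvedSha.slice(0, 12)}`;
}
```

Tying the claim to the exact reviewed SHA is what makes re-approval necessary after every new push.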

# why

bug in previous approach

# what changed

# test plan

---

## Summary by cubic

Fixes the external PR approval flow by switching to the correct `GITHUB_TOKEN`, stabilizing the mirror/refresh behavior, and ignoring third-party bot comments when parsing claim markers. Also improves the `claude` workflow to build the repo before edits and allow rerunning failed jobs.

- **Bug Fixes**
  - Use `GITHUB_TOKEN` for branch pushes and API calls; remove the GitHub App token path.
  - Enable `persist-credentials: true` during checkout to allow pushes.
  - Keep the mirrored PR open and mark it stale when new commits land on the external PR; relabel both PRs consistently.
  - Auto-handle reopen/close transitions across external and mirrored PRs.
  - Ignore comments from non-managed bots (e.g., Greptile, Cubic); only parse claim markers from `github-actions[bot]` to avoid false triggers.
- **Refactors**
  - Inline a small JS lib (`ECPR_LIB`) to manage labels, comments, lifecycle, and claims; jobs run in clear phases (external lifecycle → claim prep → branch refresh → claim finalize).
  - Refresh internal branches by rebasing onto the approved external ref; report conflicts cleanly for manual follow-up.
  - Improve `claude.yml`: upgrade to `actions/checkout@v6`, set `actions: write`, run `pnpm`/`turbo` build via `setup-node-pnpm-turbo`, enable `track_progress`, and use an explicit tool allowlist for `anthropics/claude-code-action@v1`.

Written for commit a46b159. Summary will update on new commits. [Review in cubic](https://cubic.dev/pr/browserbase/stagehand/pull/1812)
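The "only parse claim markers from `github-actions[bot]`" rule might look something like the sketch below. The marker format, the `IssueComment` shape, and the function name are all hypothetical, since the summary does not show the real marker text or the contents of `ECPR_LIB`; the sketch also assumes comments arrive oldest first.

```typescript
interface IssueComment {
  authorLogin: string;
  body: string;
}

const MANAGED_BOT = "github-actions[bot]";

// Hypothetical marker format; the real marker text is not given in the summary.
const CLAIM_MARKER = /<!--\s*ecpr-claim\s+sha=([0-9a-f]{40})\s*-->/;

// Returns the SHA from the most recent claim marker written by the managed bot.
// Comments from any other author (for example third-party review bots) are
// skipped, even if their body happens to contain a matching marker.
function latestClaimedSha(comments: IssueComment[]): string | null {
  for (const comment of [...comments].reverse()) {
    if (comment.authorLogin !== MANAGED_BOT) continue;
    const sha = CLAIM_MARKER.exec(comment.body)?.[1];
    if (sha) return sha;
  }
  return null;
}
```

Filtering on the author before matching the marker is what prevents a bot that quotes an earlier comment from triggering a false claim.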

# Why

OpenAI organizations with Zero Data Retention (ZDR) reject stored responses from the Responses API (`store: true` is the default when the AI SDK auto-selects it). This causes agent runs to fail.

# What Changed

- Set `openai: { store: false }` in `providerOptions` across `generateText` / `streamText` calls: `v3AgentHandler.ts` (execute + stream) and `handleDoneToolCall.ts`.
- Simplified the existing Gemini `providerOptions`: removed the conditional `modelId.includes("gemini-3")` check and always pass `google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" }`, since non-Google providers ignore it.

# Test Plan

- [ ] Run agent in mode with an OpenAI model to confirm no breaking changes

---

## Summary by cubic

Defaulted agent calls to OpenAI to not store responses, preventing failures for Zero Data Retention orgs. Also simplified Gemini options by always sending high media resolution.

- **Bug Fixes**
  - Set `providerOptions.openai.store` to `false` for agent `generateText` and `streamText` calls in `v3AgentHandler` (execute + stream) and `handleDoneToolCall`, avoiding Responses API rejections in ZDR orgs.
- **Refactors**
  - Always pass `google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" }` in `providerOptions`; non-Google providers ignore it.

Added a changeset for a patch release of `@browserbasehq/stagehand`.

Written for commit a01d8c0. Summary will update on new commits. [Review in cubic](https://cubic.dev/pr/browserbase/stagehand/pull/1814)
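The `providerOptions` object this commit describes has roughly the shape below. The two default entries come from the PR text; the `ProviderOptions` type and the `buildProviderOptions` merge helper are assumptions added for illustration, and the sketch deliberately avoids calling the AI SDK so that it stays self-contained.

```typescript
type ProviderOptions = Record<string, Record<string, unknown>>;

// Defaults described in the PR: never store OpenAI responses, and always
// request high media resolution from Google models. Providers other than the
// one being called ignore keys that are not theirs.
const DEFAULT_PROVIDER_OPTIONS: ProviderOptions = {
  openai: { store: false },
  google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" },
};

// Hypothetical helper: merges caller overrides on top of the defaults,
// provider by provider. The PR does not say such a helper exists.
function buildProviderOptions(overrides: ProviderOptions = {}): ProviderOptions {
  const merged: ProviderOptions = {};
  const providers = Object.keys(DEFAULT_PROVIDER_OPTIONS).concat(Object.keys(overrides));
  for (const provider of providers) {
    merged[provider] = { ...DEFAULT_PROVIDER_OPTIONS[provider], ...overrides[provider] };
  }
  return merged;
}
```

The resulting object would be passed as the `providerOptions` argument of a `generateText` or `streamText` call.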

## Summary

- Adds `--context-id <id>` and `--persist` flags to `browse open` so agents can load/persist browser state (cookies, localStorage, etc.) across Browserbase sessions using Contexts
- Validates edge cases: `--persist` requires `--context-id`, `--context-id` requires remote mode, context change triggers daemon restart

## Usage

```bash
# Load a context (read-only; state not saved back)
browse open https://app.com --context-id ctx_abc123

# Load and persist changes back on session end
browse open https://app.com --context-id ctx_abc123 --persist
```

## How it works

1. `browse open --context-id` writes context config to `/tmp/browse-{session}.context`
2. The daemon reads this file during browser initialization and passes it through as `browserbaseSessionCreateParams.browserSettings.context`
3. If a second `browse open` is called with a different context ID, the daemon is restarted (context is baked into the BB session at creation time)

Context config uses a temp file (same pattern as `.mode`) because it's needed at Browserbase session creation time, before the daemon's command socket is up.

## Test plan

- [x] `browse open https://example.com --context-id <known-id> --persist` on remote mode: verify session created with context in BB dashboard
- [x] `browse stop` then reopen with same context: verify state persists
- [x] Verify context mismatch triggers daemon restart (open with context A, then open with context B)
- [x] Same context, second open: verify no unnecessary restart
- [x] `browse open https://example.com --context-id <id>` on local mode: verify clear error
- [x] `browse open https://example.com --persist` without `--context-id`: verify clear error
- [x] Plain `browse open` (no context flags): verify no regression
- [x] `cleanupStaleFiles` removes `.context` file on shutdown
- [x] Stale `.context` file from crashed daemon is cleared on next `browse open` without `--context-id`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
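The flag validation and restart decision described in this commit can be sketched as follows. This is an assumption-laden illustration rather than the CLI's code: the type names, function names, and error wording are hypothetical, and the restart rule here only covers the case the PR states explicitly (a different context ID on a later `browse open`).

```typescript
interface OpenFlags {
  mode: "remote" | "local";
  contextId?: string;
  persist?: boolean;
}

// Shape passed through as browserbaseSessionCreateParams.browserSettings.context.
interface ContextConfig {
  id: string;
  persist: boolean;
}

// Returns an error message for invalid flag combinations, or null when valid.
// The message wording is illustrative.
function validateContextFlags(flags: OpenFlags): string | null {
  if (flags.persist && !flags.contextId) return "--persist requires --context-id";
  if (flags.contextId && flags.mode !== "remote") return "--context-id requires remote mode";
  return null;
}

// Builds the config that would be written to /tmp/browse-{session}.context.
function toContextConfig(flags: OpenFlags): ContextConfig | null {
  return flags.contextId ? { id: flags.contextId, persist: flags.persist ?? false } : null;
}

// The context is fixed at session creation, so requesting a different context
// ID needs a fresh daemon. Assumption: a plain open never forces a restart.
function needsDaemonRestart(current: ContextConfig | null, requested: ContextConfig | null): boolean {
  return requested !== null && requested.id !== current?.id;
}
```

Keeping the validation separate from the restart decision mirrors the two bullet points in the summary: invalid flag combinations fail early, and only a genuine context change restarts the daemon.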

# why

Running `pnpm format` reformats files that are unrelated to the current changes, which is annoying.

# what changed

Formatted the unformatted files in the CLI package.

# test plan

---

## Summary by cubic

Standardized Prettier/ESLint formatting in `packages/cli` so `pnpm format` runs are stable and don’t touch unrelated files. No functional changes.

- **Refactors**
  - Applied Prettier across `packages/cli/src` and tests (line breaks, parens, quotes).
  - Tidied lint/Prettier config formatting (`eslint.config.mjs`, `.prettierrc` newline).
  - Adjusted test imports and one assertion to match formatter.

Written for commit 31570db. Summary will update on new commits. [Review in cubic](https://cubic.dev/pr/browserbase/stagehand/pull/1819)

# why

Allow users to pass custom headers in their LLM calls

# what changed

Add headers to the `model.ts` types

# test plan

---

## Summary by cubic

Adds `headers` support to `ClientOptions` so clients can send custom HTTP headers with every provider request. Useful for auth tokens or routing hints without changing global config.

- **New Features**
  - Added `headers?: Record<string, string>` to `ClientOptions` in `packages/core/lib/v3/types/public/model.ts`; headers are sent with each request.
  - No breaking changes; default behavior is unchanged.

Written for commit 424dc1a. Summary will update on new commits. [Review in cubic](https://cubic.dev/pr/browserbase/stagehand/pull/1817)
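As a rough picture of what the new field enables, the sketch below layers user-supplied headers onto a client's default request headers. Only `headers?: Record<string, string>` comes from the PR; the trimmed-down `ClientOptions` interface, the `buildRequestHeaders` helper, the default header set, and the choice to let custom headers win on conflict are all assumptions for illustration.

```typescript
// Subset of ClientOptions for illustration; only `headers` comes from the PR.
interface ClientOptions {
  apiKey?: string;
  headers?: Record<string, string>;
}

// Hypothetical helper showing how custom headers could be layered onto the
// headers a provider client sends by default. Custom headers are applied last,
// so they override defaults with the same name (an assumption of this sketch).
function buildRequestHeaders(options: ClientOptions): Record<string, string> {
  const base: Record<string, string> = { "content-type": "application/json" };
  if (options.apiKey) {
    base["authorization"] = `Bearer ${options.apiKey}`;
  }
  return { ...base, ...options.headers };
}
```

A caller might pass something like `{ headers: { "x-routing-hint": "eu" } }`, where the header name is purely an example.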

# why

Sync the Stagehand MCP docs with the Browserbase MCP docs for STG-1576.

# what changed

Copied the refreshed Browserbase MCP introduction and setup pages into `packages/docs/v3/integrations/mcp`.

# test plan

- `pnpm exec prettier --check packages/docs/docs.json packages/docs/v3/integrations/mcp/introduction.mdx packages/docs/v3/integrations/mcp/setup.mdx`
- `pnpm --dir packages/docs exec mint broken-links` (unrelated existing failures only)
- `pnpm lint` fails in `packages/core` on an existing ESLint rule config issue.

---------

Co-authored-by: ci-test <ci-test@example.com>

miguelg719 force-pushed the main branch from 4994eab to bd0a799 Compare October 29, 2025 16:15

tkattkat and others added 29 commits December 4, 2025 11:15

feat: enabling gpt 5.2 (#1403)

6255e4c

[docs]: update act reference with preferred model name formatting (#…

d4b5bd4

…1409)

# why

- update `act` reference to use `"provider/model-name"` formatting

---------

Co-authored-by: Sean McGuire <seanmcguire1@outlook.com>

seanmcguire12 and others added 30 commits February 25, 2026 19:38

[STG-1458] server cache docs (#1753)

54ea8ba

[feat]: add configurable timeout to agent tools (#1766)

7817fcc

Make projectId optional for Browserbase sessions (#1800)

2abf5b9

[chore]: refactor & fix lint for browse CLI (#1821)

ac4539a


Sync #2

metehanozdev wants to merge 725 commits into emregucerr:main from browserbase:main

metehanozdev commented Aug 9, 2025


14 participants
