A multi-agent VoltAgent application for browser automation and test generation.
This application demonstrates Playwright test automation using a coordinated multi-agent system:
- Orchestrator Agent: The main planner that analyzes test requirements, breaks them down into subtasks, and coordinates the execution between specialized agents.
- Browser Automation Agent: Handles all interactions with the web browser - navigation, clicking, typing, extracting data, and taking screenshots. This agent maintains a single browser instance for efficiency.
- Code Generation Agent: Transforms recorded browser actions into maintainable Playwright test scripts in TypeScript, ensuring best practices and proper assertions.
The agents communicate through a context-sharing mechanism, allowing them to pass data and maintain state throughout the test session. When a user provides a natural language instruction like "Test the login flow on example.com," the system proceeds as follows:
- The Orchestrator analyzes the request and creates an execution plan
- The Browser Agent navigates to the site and performs the required interactions
- The Code Generation Agent records these actions as they are performed
- The generated test code is returned, ready to be saved and integrated into test suites
This architecture supports both exploratory testing and automated test generation from a single natural language interface, so a test can be described in plain English instead of being scripted by hand.
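The flow described above can be sketched in plain TypeScript. This is a minimal illustration, not VoltAgent's actual API: the `SharedContext` type, the `StepAgent` interface, and the `orchestrate` function are hypothetical names, and the agents are synchronous stand-ins (real agents would call an LLM and a browser asynchronously).

```typescript
// Hypothetical types for illustration; not part of VoltAgent.
type SharedContext = Map<string, unknown>;

interface StepAgent {
  name: string;
  run(instruction: string, context: SharedContext): void;
}

// Stand-in for the Browser Automation Agent: stores the actions it "performed".
const browserAgent: StepAgent = {
  name: "browser",
  run(instruction, context) {
    context.set("actions", ["goto https://example.com", `perform: ${instruction}`]);
  },
};

// Stand-in for the Code Generation Agent: reads the actions left by the browser agent.
const codegenAgent: StepAgent = {
  name: "codegen",
  run(_instruction, context) {
    const actions = (context.get("actions") as string[] | undefined) ?? [];
    context.set("testCode", actions.map((a) => `// ${a}`).join("\n"));
  },
};

// Stand-in for the Orchestrator: runs each agent in order over one shared context.
function orchestrate(instruction: string, agents: StepAgent[]): SharedContext {
  const context: SharedContext = new Map();
  context.set("instruction", instruction);
  for (const agent of agents) {
    agent.run(instruction, context);
  }
  return context;
}
```

Calling `orchestrate("Test the login flow", [browserAgent, codegenAgent])` returns a context that holds the original instruction, the recorded actions, and the generated placeholder code. The point of the sketch is only that later agents read what earlier agents wrote.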
- Node.js (v18 or newer)
- npm, yarn, or pnpm
- Playwright browsers (installed automatically)
- Clone this repository
- Install dependencies
npm install
# or
yarn
# or
pnpm install
- Install Playwright browsers
npx playwright install
Run the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
Here are some example prompts you can use to interact with the agents:
Go to google.com, search for "Playwright automation", and take a screenshot of the results.
Visit github.com, navigate to the Playwright repository, and extract the number of stars it has.
Go to a login page, fill in the username "testuser" and password "password123", then click the login button.
Start recording a test session, navigate to example.com, click on the first link, wait for the page to load, verify the page title contains "Example", and then generate a test script.
Create a test that checks if the login form on myapp.com validates email addresses correctly.
Generate a test script that verifies all navigation links on wikipedia.org work correctly.
Go to an e-commerce site, add three items to the cart, proceed to checkout, and generate a test script for this entire workflow.
Create a monitoring test that checks if our company website responds within 2 seconds and doesn't have any console errors.
Visit my web application, create a new account, verify email confirmation works, then generate a regression test for this user registration flow.
This project uses VoltAgent as a framework for building AI agents with the following capabilities:
- Multi-Agent Architecture - Specialized agents working together to solve complex tasks
- Browser Automation - Interact with websites programmatically using Playwright
- Test Generation - Create Playwright test scripts from browser sessions
- Natural Language Interface - Control browser and generate tests using plain English commands
- Tool Integration - Extensible system with specialized tools for various tasks
The Browser Agent specializes in web automation tasks. It can navigate websites, interact with page elements, capture screenshots, and extract data from web pages.
The Code Generation Agent creates automated Playwright test scripts from recorded browser sessions. It can record browser actions and generate executable test code.
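As a rough sketch of what turning recorded actions into a test script can look like, the following dependency-free example renders a list of actions as Playwright test source. The `RecordedAction` shape and the `generateTest` function are assumptions made for illustration; the project's actual recording format may differ. The emitted code uses standard Playwright calls (`page.goto`, `page.locator(...).click()`, `page.locator(...).fill()`, `expect(page).toHaveTitle()`).

```typescript
// Hypothetical action shape; the real recorded format may differ.
type RecordedAction =
  | { kind: "navigate"; url: string }
  | { kind: "click"; selector: string }
  | { kind: "type"; selector: string; text: string }
  | { kind: "expectTitle"; contains: string };

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Render one recorded action as a line of Playwright test code.
function renderAction(action: RecordedAction): string {
  switch (action.kind) {
    case "navigate":
      return `await page.goto(${JSON.stringify(action.url)});`;
    case "click":
      return `await page.locator(${JSON.stringify(action.selector)}).click();`;
    case "type":
      return `await page.locator(${JSON.stringify(action.selector)}).fill(${JSON.stringify(action.text)});`;
    case "expectTitle":
      return `await expect(page).toHaveTitle(new RegExp(${JSON.stringify(escapeRegExp(action.contains))}));`;
  }
}

// Wrap the rendered actions in a single Playwright test.
function generateTest(name: string, actions: RecordedAction[]): string {
  const body = actions.map((a) => `  ${renderAction(a)}`).join("\n");
  return [
    `import { test, expect } from "@playwright/test";`,
    ``,
    `test(${JSON.stringify(name)}, async ({ page }) => {`,
    body,
    `});`,
  ].join("\n");
}
```

Using `JSON.stringify` for URLs, selectors, and text keeps quotes in user input from breaking the generated source.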
The application leverages a supervisor-agent architecture where the main agent can delegate tasks to specialized sub-agents:
- Task Delegation - The main agent determines which specialized agent should handle a task
- Context Sharing - Agents can share context and results with each other
- Collaborative Problem Solving - Complex tasks are broken down and solved by multiple agents working together
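The delegation idea can be illustrated with a small routing sketch. Everything here is hypothetical: the `SubAgent` interface, the `SharedState` type, and the keyword matching in `canHandle` are stand-ins. In the real application the supervisor's language model decides which sub-agent to call, not a regular expression.

```typescript
// Hypothetical shared state passed between sub-agents.
interface SharedState {
  lastAction?: string;
}

// Hypothetical sub-agent shape for illustration only.
interface SubAgent {
  name: string;
  canHandle(task: string): boolean;
  handle(task: string, shared: SharedState): string;
}

const browser: SubAgent = {
  name: "browser",
  canHandle: (task) => /navigate|click|type|screenshot/i.test(task),
  handle: (task, shared) => {
    shared.lastAction = task; // leave a result for other agents to read
    return `browser did: ${task}`;
  },
};

const codegen: SubAgent = {
  name: "codegen",
  canHandle: (task) => /generate|record/i.test(task),
  handle: (_task, shared) => `test covering: ${shared.lastAction ?? "nothing yet"}`,
};

// The supervisor picks the first agent that claims each task and shares one state object.
function delegate(tasks: string[], agents: SubAgent[]): string[] {
  const shared: SharedState = {};
  return tasks.map((task) => {
    const agent = agents.find((a) => a.canHandle(task));
    if (!agent) throw new Error(`No agent can handle: ${task}`);
    return `${agent.name}: ${agent.handle(task, shared)}`;
  });
}
```

Here the code generation stand-in depends on state written by the browser stand-in, which mirrors the context sharing listed above.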
| Tool | Description |
|---|---|
| navigationTool | Navigate to a URL |
| goBackTool | Navigate back in browser history |
| goForwardTool | Navigate forward in browser history |
| refreshPageTool | Refresh the current page |
| closeBrowserTool | Close the browser |
| Tool | Description |
|---|---|
| clickTool | Click on an element |
| typeTool | Type text into an input field |
| getTextTool | Get text content from an element |
| selectOptionTool | Select an option from a dropdown |
| checkTool | Check a checkbox or radio button |
| uncheckTool | Uncheck a checkbox |
| hoverTool | Hover over an element |
| pressKeyTool | Press a keyboard key |
| waitForElementTool | Wait for an element to appear |
| Tool | Description |
|---|---|
| saveToFileTool | Save content to a file |
| exportPdfTool | Export page as PDF |
| extractDataTool | Extract data from the page |
| screenshotTool | Take a screenshot |
| Tool | Description |
|---|---|
| expectResponseTool | Expect a specific response |
| assertResponseTool | Assert properties of a response |
| Tool | Description |
|---|---|
| setUserAgentTool | Set a custom user agent |
| getUserAgentTool | Get the current user agent |
| Tool | Description |
|---|---|
| getVisibleTextTool | Get visible text from the page |
| getVisibleHtmlTool | Get visible HTML from the page |
| listInteractiveElementsTool | List all interactive elements |
| Tool | Description |
|---|---|
| startCodegenSessionTool | Start recording a session |
| recordActionTool | Record a browser action |
| generateTestTool | Generate a test from recorded actions |
| endCodegenSessionTool | End a recording session |
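The four code generation tools imply a start, record, end lifecycle. A minimal in-memory sketch of that lifecycle follows; the `CodegenSession` class and its method names are hypothetical and are not taken from the project's source.

```typescript
// Hypothetical in-memory recorder illustrating the session lifecycle.
class CodegenSession {
  // null means no session is active.
  private actions: string[] | null = null;

  start(): void {
    if (this.actions !== null) throw new Error("A session is already active");
    this.actions = [];
  }

  record(action: string): void {
    if (this.actions === null) throw new Error("No active session");
    this.actions.push(action);
  }

  // Ends the session and returns everything recorded during it.
  end(): string[] {
    if (this.actions === null) throw new Error("No active session");
    const recorded = this.actions;
    this.actions = null;
    return recorded;
  }
}
```

Rejecting `record` calls outside an active session keeps actions from an earlier run from leaking into a later test.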
- VoltAgent - Framework for building and running AI agents
- Playwright - Browser automation library for reliable end-to-end testing
- TypeScript - Type-safe JavaScript for better development experience
- Mistral AI - Large language model for natural language processing
- Vercel AI SDK - Integration with AI models
.
├── src/
│ ├── agents/ # Agent definitions
│ │ ├── browserAgent.ts # Browser automation agent
│ │ └── codegenAgent.ts # Test generation agent
│ ├── tools/ # Tool implementations
│ │ ├── browser/ # Browser automation tools
│ │ └── codegen/ # Code generation tools
│ └── index.ts # Main application entry point
├── .voltagent/ # Auto-generated folder for agent memory
├── package.json
├── tsconfig.json
└── README.md
MIT