Merged
2 changes: 1 addition & 1 deletion README.md
@@ -45,7 +45,7 @@ A collection of examples for building browser automations with [Intuned](https:/
| [EHR-integration](./python-examples/ehr-integration-python/) | Data Extraction from Openimis Website|
| [playwright-basics](./python-examples/playwright-python/) | Playwright Basics |
| [e-commerece-category](./python-examples/e-commerece-category/) | E-commerce category and product scraper |
-| [hyprid-automation](./python-examples/hyprid-automation/) | Hybrid automation combining Intuned Browser SDK with AI-powered tools like Stagehand and extract_structured_data |
+| [hybrid-automation](./python-examples/hybrid-automation/) | Hybrid automation combining Intuned Browser SDK with AI-powered tools like Stagehand and extract_structured_data |
| [computer-use](./python-examples/computer-use/) | AI-powered browser automation with Anthropic, OpenAI, Gemini, and Browser Use |
| [cdp-connection](./python-examples/cdp-connection/) | Basic example demonstrating Chrome DevTools Protocol (CDP) connection |
| [setup-hooks](./python-examples/setup-hooks/) | Demonstrates setup hooks for preparing data before API execution |
2 changes: 1 addition & 1 deletion python-examples/README.md
@@ -14,7 +14,7 @@ Intuned sample projects in Python.
| [e-commerce-auth-scrapingcourse](./e-commerce-auth-scrapingcourse/) | Authenticated e-commerce scraper with Auth Sessions |
| [e-commerece-shopify](./e-commerece-shopify/) | Shopify store product scraper |
| [e-commerece-category](./e-commerece-category/) | E-commerce category and product scraper |
-| [hyprid-automation](./hyprid-automation/) | Hybrid automation combining Intuned Browser SDK with AI-powered tools like Stagehand and extract_structured_data |
+| [hybrid-automation](./hybrid-automation/) | Hybrid automation combining Intuned Browser SDK with AI-powered tools like Stagehand and extract_structured_data |
| [computer-use](./computer-use/) | AI-powered browser automation with Anthropic, OpenAI, and Gemini |
| [cdp-connection](./cdp-connection/) | Basic example demonstrating Chrome DevTools Protocol (CDP) connection |
| [setup-hooks](./setup-hooks/) | Demonstrates setup hooks for preparing data before API execution |
@@ -26,7 +26,7 @@ class Book(BaseModel):

# Extract from the Page directly using Pydantic model.
# You can also extract from a specific locator or by passing TextContentItem.
-# Check https://docs.intunedhq.com/automation-sdks/intuned-sdk/python/helpers/functions/extract_structured_data for more details.
+# Check https://docs.intunedhq.com/automation-sdks/intuned-sdk/python/ai/functions/extract_structured_data for more details.
product = await extract_structured_data(
source=page,
strategy="HTML",
34 changes: 34 additions & 0 deletions python-examples/hybrid-automation/Intuned.jsonc
@@ -0,0 +1,34 @@
// For more information, see our Intuned settings reference
// https://docs.intunedhq.com/docs/05-references/intuned-json
{
  // "workspaceId": "your-workspace-id", // Add your workspace ID here for local development
  // "projectName": "your-project-name", // Add your project name here for local development
  "apiAccess": {
    "enabled": false
  },
  "authSessions": {
    "enabled": false
  },
  "replication": {
    "maxConcurrentRequests": 1,
    "size": "standard"
  },
  "metadata": {
    "template": {
      "name": "hybrid-automation",
      "description": "Hybrid automation combining Intuned Browser SDK with AI-powered tools like Stagehand and extract_structured_data",
      "tags": ["hybrid", "ai", "scraping", "rpa", "crawler", "stagehand"]
    },
    "defaultRunPlaygroundInput": {
      "apiName": "rpa/fill-form",
      "parameters": {
        "name": "Sarah Williams",
        "email": "sarah.w@startup.com",
        "phone": "+1-555-7890",
        "date": "2025-01-05",
        "time": "11:47",
        "topic": "api-integration"
      }
    }
  }
}
156 changes: 156 additions & 0 deletions python-examples/hybrid-automation/README.md
@@ -0,0 +1,156 @@
# Hybrid Automation

Flexible automation combining the [Intuned Browser SDK](https://docs.intunedhq.com/automation-sdks/overview) with AI-powered tools like [Stagehand](https://docs.stagehand.dev/) and `extract_structured_data` for speed, reliability, and adaptability.

## Key Features

- **Best of Both Worlds**: Combines fast, reliable SDK automation with AI adaptability
- **Smart Fallbacks**: Uses deterministic methods first, falls back to AI when needed
- **Three Use Cases**: RPA form filling, e-commerce scraping, and job board crawling
- **Production Ready**: Cost-effective primary path with AI safety net for edge cases

## Why Hybrid?

| Approach | Pros | Cons |
|----------|------|------|
| **Deterministic (Intuned Browser SDK)** | Fast, reliable, cost-effective | Breaks when site structure changes |
| **AI-Driven (Stagehand, extract_structured_data)** | Adapts to layout changes | Slower, less predictable |
| **Hybrid (This example)** | Best of both worlds | Slightly more complex |

The hybrid pattern: Use Intuned Browser SDK first (fast path), fall back to AI tools when needed.
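
The fast-path-then-fallback pattern can be sketched roughly as follows. This is a minimal illustration, not code from this project: `with_fallback`, `fill_with_selector`, and `fill_with_ai` are hypothetical names, and the two stand-in coroutines only simulate a failing selector-based step and a succeeding AI step.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_fallback(
    primary: Callable[[], Awaitable[T]],
    fallback: Callable[[], Awaitable[T]],
    label: str,
) -> tuple[T, str]:
    """Run the deterministic step first; use the AI step only if it raises."""
    try:
        return await primary(), "deterministic"
    except Exception as exc:
        print(f"Deterministic path failed for {label}, using AI fallback: {exc}")
        return await fallback(), "ai"


# Hypothetical stand-in for a selector-based step that breaks.
async def fill_with_selector() -> str:
    raise TimeoutError("selector #name not found")


# Hypothetical stand-in for an AI-driven step.
async def fill_with_ai() -> str:
    return "filled"


result, path = asyncio.run(with_fallback(fill_with_selector, fill_with_ai, "name"))
print(result, path)
```

Returning which path ran alongside the result makes it easy to log how often the fallback is used, which is a reasonable signal that the deterministic selectors need updating.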

Learn more: [Flexible Automations](https://docs.intunedhq.com/docs/02-features/flexible-automation)


## `intuned-browser`: Intuned Browser SDK

This project uses the Intuned Browser SDK. For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/overview).

<!-- IDE-IGNORE-START -->
## Run on Intuned

[![Run on Intuned](https://cdn1.intuned.io/button.svg)](https://app.intuned.io?repo=https://github.com/Intuned/cookbook/tree/main/python-examples/hybrid-automation)

## Getting Started

To get started developing browser automation projects with Intuned, check out our [Quick Starts Guide](https://docs.intunedhq.com/docs/00-getting-started/quickstarts).

## Development

> **_NOTE:_** All commands support the `--help` flag, which shows more information about the command and its arguments and options.

### Setup

**Important:** This template uses Intuned's AI gateway for AI-powered features (Stagehand and `extract_structured_data`). The AI gateway requires the project to be saved before running any APIs.


To save the project to Intuned, you need to set up your Intuned workspace:

1. **Create a workspace** - Follow the [workspace management guide](https://docs.intunedhq.com/docs/03-how-to/manage/manage-workspace) to create your Intuned workspace

2. **Get your API key** - Generate an API key from the [API keys page](https://docs.intunedhq.com/docs/03-how-to/manage/manage-api-keys#how-to-manage-api-keys) in your Intuned dashboard

3. **Configure workspace ID** - Add your workspace ID to `Intuned.jsonc`:
```jsonc
{
"workspaceId": "your-workspace-id",
"projectName": "your-project-name", // Will be used as the name of this project.
// ... rest of config
}
```

4. **Set environment variable** - Add your API key as an environment variable:
```bash
export INTUNED_API_KEY=your-api-key
```

### Install dependencies
```bash
uv sync
```

After installing dependencies, the `intuned` command should be available in your environment.

### Initialize project

Run the save command to upload your project and set up the required `.env` file:

```bash
uv run intuned save
```

This will configure your local environment and prepare the AI gateway for running.

For more on saving a project, see the [local development CLI reference](https://docs.intunedhq.com/docs/02-features/local-development-cli#use-runtime-sdk-and-browser-sdk-helpers).

### Run an API

Now you're ready to run the APIs:

```bash
uv run intuned run api rpa/fill-form .parameters/api/rpa/fill-form/default.json
uv run intuned run api scraper/list .parameters/api/scraper/list/default.json
uv run intuned run api scraper/details .parameters/api/scraper/details/default.json
uv run intuned run api crawler/crawl .parameters/api/crawler/crawl/default.json
uv run intuned run api crawler/crawl .parameters/api/crawler/crawl/job-posting.json
uv run intuned run api crawler/crawl .parameters/api/crawler/crawl/not-lever.json
```

### Deploy project
```bash
uv run intuned deploy
```

<!-- IDE-IGNORE-END -->


## Project Structure
```
/
├── .parameters/                  # Test parameters for APIs
│   └── api/
│       ├── rpa/
│       │   └── fill-form/
│       │       └── default.json
│       ├── scraper/
│       │   ├── list/
│       │   │   └── default.json
│       │   └── details/
│       │       ├── default.json
│       │       └── example2.json
│       └── crawler/
│           └── crawl/
│               ├── default.json
│               ├── job-posting.json
│               └── not-lever.json
├── api/                          # API endpoints
│   ├── rpa/
│   │   └── fill-form.py          # Form filling with Stagehand fallback
│   ├── scraper/
│   │   ├── list.py               # Product list with pagination
│   │   └── details.py            # Product details with AI extraction
│   └── crawler/
│       └── crawl.py              # Job board crawler (hybrid extraction)
├── hooks/
│   └── setup_context.py          # CDP URL setup for Stagehand
├── utils/
│   └── crawler/                  # Crawler utilities
├── Intuned.jsonc                 # Intuned project configuration
└── pyproject.toml                # Python project dependencies
```

## APIs

| API | Description |
|-----|-------------|
| `rpa/fill-form` | RPA automation that fills consultation booking forms. Uses Playwright via Intuned Browser SDK for form fields, falls back to `stagehand.page.act()` if selectors fail. Verifies success with Playwright, falls back to `stagehand.page.extract()` |
| `scraper/list` | E-commerce product list scraping. Uses Intuned Browser SDK for pagination and link extraction with AI-powered adaptability |
| `scraper/details` | Product details extraction combining SDK methods with `extract_structured_data` for unstructured fields like descriptions and specifications |
| `crawler/crawl` | Job board crawler that extracts structured job postings. Uses static Playwright extraction for Lever (`jobs.lever.co`), AI extraction with `extract_structured_data` for other boards (Greenhouse, etc.). Extracts title, location, department, team, description, commitment, workplace type |
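
The routing described for `crawler/crawl` (static extraction for a known host, AI extraction otherwise) might look something like the sketch below. It is an assumption about the shape of that decision, not the project's actual code: `STATIC_HOSTS` and `choose_extraction` are hypothetical names, and the example URLs are made up.

```python
from urllib.parse import urlparse

# Hosts with a known layout that static selectors can handle.
# Only the host named in the table above is listed here.
STATIC_HOSTS = {"jobs.lever.co"}


def choose_extraction(url: str) -> str:
    """Return "static" for known hosts and "ai" for everything else."""
    host = (urlparse(url).hostname or "").lower()
    return "static" if host in STATIC_HOSTS else "ai"


print(choose_extraction("https://jobs.lever.co/example/123"))
print(choose_extraction("https://boards.greenhouse.io/example/jobs/456"))
```

Matching on the parsed hostname rather than a substring of the URL avoids misrouting pages that merely mention the host in a path or query string.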

## Learn More

- [Flexible Automations](https://docs.intunedhq.com/docs/02-features/flexible-automation)
- [Intuned Browser SDK](https://docs.intunedhq.com/automation-sdks/overview)
- [Extract Structured Data](https://docs.intunedhq.com/automation-sdks/intuned-sdk/python/ai/functions/extract_structured_data)
- [Stagehand act/extract/observe](https://docs.stagehand.dev/v2/basics/act)
@@ -85,7 +85,8 @@ async def automation(
         print("✓ Filled name with Playwright")
     except Exception as e:
         print(f"Playwright failed for name, using Stagehand act: {e}")
-        await stagehand.act(f'Type "{name}" in the name input field')
+        stagehand_page = stagehand.page
+        await stagehand_page.act(f'Type "{name}" in the name input field')
         print("✓ Filled name with Stagehand act")

     # Step 2: Fill email field
@@ -94,7 +95,8 @@ async def automation(
         print("✓ Filled email with Playwright")
     except Exception as e:
         print(f"Playwright failed for email, using Stagehand act: {e}")
-        await stagehand.act(f'Type "{email}" in the email input field')
+        stagehand_page = stagehand.page
+        await stagehand_page.act(f'Type "{email}" in the email input field')
         print("✓ Filled email with Stagehand act")

     # Step 3: Fill phone field
@@ -103,7 +105,8 @@ async def automation(
         print("✓ Filled phone with Playwright")
     except Exception as e:
         print(f"Playwright failed for phone, using Stagehand act: {e}")
-        await stagehand.act(f'Type "{phone}" in the phone input field')
+        stagehand_page = stagehand.page
+        await stagehand_page.act(f'Type "{phone}" in the phone input field')
         print("✓ Filled phone with Stagehand act")

     # Step 4: Fill date field
@@ -112,7 +115,8 @@ async def automation(
         print("✓ Filled date with Playwright")
     except Exception as e:
         print(f"Playwright failed for date, using Stagehand act: {e}")
-        await stagehand.act(f'Type "{date}" in the date input field')
+        stagehand_page = stagehand.page
+        await stagehand_page.act(f'Type "{date}" in the date input field')
         print("✓ Filled date with Stagehand act")

     # Step 5: Fill time field
@@ -121,7 +125,8 @@ async def automation(
         print("✓ Filled time with Playwright")
     except Exception as e:
         print(f"Playwright failed for time, using Stagehand act: {e}")
-        await stagehand.act(f'Type "{time}" in the time input field')
+        stagehand_page = stagehand.page
+        await stagehand_page.act(f'Type "{time}" in the time input field')
         print("✓ Filled time with Stagehand act")

     # Step 6: Select the consultation topic from dropdown
@@ -130,7 +135,8 @@ async def automation(
         print("✓ Selected topic with Playwright")
     except Exception as e:
         print(f"Playwright failed for topic selection, using Stagehand act: {e}")
-        await stagehand.act(f'Select "{topic}" from the topic dropdown')
+        stagehand_page = stagehand.page
+        await stagehand_page.act(f'Select "{topic}" from the topic dropdown')
         print("✓ Selected topic with Stagehand act")

     # Step 7: Submit the booking form
@@ -139,7 +145,8 @@ async def automation(
         print("✓ Submitted form with Playwright")
     except Exception as e:
         print(f"Playwright failed for submit, using Stagehand act: {e}")
-        await stagehand.act("Click the submit button to submit the booking form")
+        stagehand_page = stagehand.page
+        await stagehand_page.act("Click the submit button to submit the booking form")
         print("✓ Submitted form with Stagehand act")

     # Step 8: Wait for and verify the success modal
@@ -149,7 +149,7 @@ async def automation(
     for product in all_products:
         extend_payload(
             {
-                "api": "hyprid-scraper/details",
+                "api": "hybrid-scraper/details",
                 "parameters": dict(product),
             }
         )
@@ -1,12 +1,10 @@
-from .content import extract_page_content
 from .helpers import get_job_run_id, sanitize_key
 from .links import extract_links, get_base_domain, is_file_url, normalize_url

 __all__ = [
     "extract_links",
     "normalize_url",
     "get_base_domain",
-    "extract_page_content",
     "is_file_url",
     "sanitize_key",
     "get_job_run_id",
26 changes: 0 additions & 26 deletions python-examples/hyprid-automation/Intuned.jsonc

This file was deleted.
