Merged
2 changes: 1 addition & 1 deletion .claude/CLAUDE.md
@@ -379,7 +379,7 @@ Each project must have a `.parameters/` folder containing test parameters for ru
## Documentation Links

- AuthSessions: https://docs.intunedhq.com/docs/02-features/auth-sessions
- Browser SDK: https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview
- Browser SDK: https://docs.intunedhq.com/automation-sdks/overview
- Intuned in depth: https://docs.intunedhq.com/docs/01-learn/deep-dives/intuned-indepth
- Introduction / Quickstarts: https://docs.intunedhq.com/docs/00-getting-started/introduction
- Recipe docs: https://docs.intunedhq.com/docs/01-learn/recipes/
2 changes: 1 addition & 1 deletion .cursorrules
@@ -22,7 +22,7 @@

## Doc Links
- AuthSessions: https://docs.intunedhq.com/docs/02-features/auth-sessions
- Browser SDK: https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview
- Browser SDK: https://docs.intunedhq.com/automation-sdks/overview
- Intuned in depth: https://docs.intunedhq.com/docs/01-learn/deep-dives/intuned-indepth
- Introduction / Quickstarts: https://docs.intunedhq.com/docs/00-getting-started/introduction
- Recipe docs: download-file, pagination, upload-files, capture-screenshots under https://docs.intunedhq.com/docs/01-learn/recipes/
2 changes: 1 addition & 1 deletion python-examples/browser-sdk-showcase/README.md
@@ -155,4 +155,4 @@ See [ai/README.md](./api/ai/README.md) for AI helpers that require API keys and

For detailed documentation on each helper function, visit:
- [Intuned Browser SDK - Python](https://docs.intunedhq.com/automation-sdks/intuned-sdk/python/helpers/functions/)
- [Browser SDK Overview](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview)
- [Browser SDK Overview](https://docs.intunedhq.com/automation-sdks/overview)
2 changes: 1 addition & 1 deletion python-examples/e-commerce-auth-scrapingcourse/README.md
@@ -193,4 +193,4 @@ This project uses the Intuned browser SDK for enhanced reliability:
- **`save_file_to_s3`**: Automatically upload images and files to S3 storage
- **`extend_payload`**: Trigger additional API calls dynamically (used to trigger `details` API for each product)

For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview).
For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/overview).
@@ -1 +1,3 @@
{}
{
"limit": 10
}
7 changes: 6 additions & 1 deletion python-examples/e-commerce-scrapingcourse/Intuned.jsonc
@@ -12,7 +12,12 @@
"metadata": {
"template": {
"name": "e-commerce-scrapingcourse",
"description": "Basic e-commerce scraper using scrapingcourse.com"
"description": "Basic e-commerce scraper using scrapingcourse.com",
"tags": ["web-scraping", "e-commerce", "pagination", "jobs"]
},
"defaultRunPlaygroundInput": {
"apiName": "list",
"parameters": {}
},
"defaultJobInput": {
"configuration": {
154 changes: 41 additions & 113 deletions python-examples/e-commerce-scrapingcourse/README.md
@@ -1,7 +1,15 @@
# E-Commerce Product Scraper
# e-commerce-scrapingcourse Intuned project

E-commerce scraping automation that extracts product information from an online store with pagination support.

## Key Features

- **Automatic Pagination**: The `list` API automatically handles pagination to scrape multiple pages
- **Dynamic API Chaining**: Uses `extend_payload` to automatically trigger the `details` API for each product found
- **S3 File Upload**: Product images are automatically uploaded to S3 using `save_file_to_s3`
- **Job Configuration**: Configured as a job template with retry logic and concurrent request handling

<!-- IDE-IGNORE-START -->
## Run on Intuned

Open this project in Intuned by clicking the button below.
@@ -26,141 +34,61 @@ After installing dependencies, `intuned` command should be available in your env


### Run an API

```bash
uv run intuned run api list .parameters/api/list/default.json
uv run intuned run api details .parameters/api/details/default.json
```

#### Example: List Products

### Save project
```bash
# List products with default page limit
uv run intuned run api list .parameters/api/list/default.json
uv run intuned run save
```

#### Example: Get Product Details

```bash
# Get details for a specific product
uv run intuned run api details .parameters/api/details/default.json
```
Reference for saving project [here](https://docs.intunedhq.com/docs/02-features/local-development-cli#use-runtime-sdk-and-browser-sdk-helpers)

### Deploy project
```bash
uv run intuned deploy
```

### `intuned-browser`: Intuned Browser SDK

This project uses Intuned browser SDK. For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/overview).

<!-- IDE-IGNORE-END -->


## Project Structure
The project structure is as follows:
```
/
├── api/ # Your API endpoints
│ ├── list.py # API to scrape product list with pagination
│ └── details.py # API to scrape detailed product information
├── utils/ # Utility files
│ └── types_and_schemas.py # Python types and Pydantic models
└── Intuned.jsonc # Intuned project configuration file
├── .parameters/ # Test parameters for APIs
│ └── api/
│ ├── list/
│ │ └── default.json
│ └── details/
│ └── default.json
├── api/ # API endpoints
│ ├── list.py # Scrape product list with pagination
│ └── details.py # Extract detailed product information
├── utils/ # Utility modules
│ └── types_and_schemas.py # Type definitions and Pydantic models
├── Intuned.jsonc # Intuned project configuration
└── pyproject.toml # Python project dependencies
```


## APIs

### `list` - Product List Scraper

Scrapes products from the e-commerce store with pagination support.

**Parameters:**
- `limit` (optional): Maximum number of pages to scrape (default: 50)

**Returns:**
List of products with:
- `name`: Product name
- `detailsUrl`: URL to product details page

**Features:**
- Automatic pagination handling
- Triggers `details` API for each product using `extend_payload`
- Configurable page limit

### `details` - Product Details Scraper

Scrapes detailed information for a specific product.

**Parameters:**
- `name`: Product name
- `detailsUrl`: URL to the product details page

**Returns:**
Product details object with:
- `name`: Product name
- `price`: Product price
- `sku`: Stock Keeping Unit
- `category`: Product category
- `shortDescription`: Brief product description
- `fullDescription`: Complete product description
- `imageAttachments`: List of product images (uploaded to S3)
- `availableSizes`: List of available sizes
- `availableColors`: List of available colors
- `variants`: List of product variants with stock information


## `Intuned.jsonc` Reference
```jsonc
{
// API access settings
"apiAccess": {
// Whether to enable consumption through Intuned API
"enabled": false
},

// Auth session settings
"authSessions": {
// Auth sessions are not used in this project
"enabled": false
},

// Replication settings
"replication": {
// The maximum number of concurrent executions allowed via Intuned API
"maxConcurrentRequests": 1,

// The machine size to use for this project
// "standard": Standard machine size (6 shared vCPUs, 2GB RAM)
// "large": Large machine size (8 shared vCPUs, 4GB RAM)
// "xlarge": Extra large machine size (1 performance vCPU, 8GB RAM)
"size": "standard"
},

// Default job configuration
"metadata": {
"defaultJobInput": {
"configuration": {
// Number of concurrent API calls within the job
"maxConcurrentRequests": 2,
// Retry configuration
"retry": {
"maximumAttempts": 3
}
},
"payload": [
{
"apiName": "list",
"parameters": {}
}
]
}
}
}
```

## Using `intuned_browser` SDK
| API | Description |
|-----|-------------|
| `list` | Scrapes products from the e-commerce store with pagination support. Automatically triggers `details` API for each product using `extend_payload` |
| `details` | Extracts detailed information for a specific product including price, SKU, category, descriptions, images (uploaded to S3), sizes, colors, and variants |

This project uses the Intuned browser SDK for enhanced reliability:

- **`go_to_url`**: Navigate to URLs with automatic retries and intelligent timeout detection
- **`save_file_to_s3`**: Automatically upload images and files to S3 storage
- **`extend_payload`**: Trigger additional API calls dynamically (used to trigger `details` API for each product)
## Learn More

For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview).
- [Intuned Documentation](https://docs.intunedhq.com)
- [Intuned Browser SDK](https://docs.intunedhq.com/automation-sdks/overview)
- [Web Scraping Recipe](https://docs.intunedhq.com/docs/01-learn/recipes/)
- [extend_payload Helper](https://docs.intunedhq.com/docs/05-references/runtime-sdk-python/extend-payload)
- [save_file_to_s3 Helper](https://docs.intunedhq.com/automation-sdks/intuned-sdk/python/helpers/functions/save_file_to_s3)
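
The pagination and API chaining described in this README can be sketched without a browser. This is a minimal illustration only, not the project's code: `FAKE_SITE`, `scrape_pages`, and the `on_product` callback are hypothetical names, and the callback merely stands in for the role `extend_payload` plays when it queues a `details` call for each product.

```python
from typing import Callable, Dict, List, Optional, TypedDict


class Product(TypedDict):
    name: str
    detailsUrl: str


# Hypothetical stand-in for a paginated store: page number -> products on that page.
FAKE_SITE: Dict[int, List[Product]] = {
    1: [{"name": "A", "detailsUrl": "https://example.com/a"}],
    2: [{"name": "B", "detailsUrl": "https://example.com/b"}],
    3: [{"name": "C", "detailsUrl": "https://example.com/c"}],
}


def scrape_pages(
    site: Dict[int, List[Product]],
    page_limit: int,
    on_product: Optional[Callable[[Product], None]] = None,
) -> List[Product]:
    """Collect products page by page until the limit or the last page is reached."""
    all_products: List[Product] = []
    current_page = 1
    # Stop when the page limit is exceeded or there is no further page.
    while current_page <= page_limit and current_page in site:
        for product in site[current_page]:
            all_products.append(product)
            # Assumption: this callback plays the role of queuing a follow-up
            # `details` run for the product; it is not the real extend_payload.
            if on_product is not None:
                on_product(product)
        current_page += 1
    return all_products
```

Calling `scrape_pages(FAKE_SITE, page_limit=2)` visits only the first two pages, which mirrors how a small `limit` in `.parameters/api/list/default.json` keeps local test runs short.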
23 changes: 12 additions & 11 deletions python-examples/e-commerce-scrapingcourse/api/details.py
@@ -1,16 +1,22 @@
# Extract detailed product information from e-commerce product page
from playwright.async_api import Page
from typing import List, Optional
from typing import TypedDict, List
from intuned_browser import go_to_url, save_file_to_s3, Attachment
import json
import re

from pydantic import HttpUrl
from utils.types_and_schemas import (
ProductDetails,
ProductVariant,
DetailsSchema,
)


class Params(TypedDict):
name: str
detailsUrl: HttpUrl


async def get_product_images(page: Page) -> List[Attachment]:
# Extract all product images from the gallery
image_elements = await page.locator(".woocommerce-product-gallery__image img").all()
@@ -109,7 +115,7 @@ async def extract_product_details(page: Page, params: DetailsSchema) -> ProductD
price = await price_element.text_content()

# Extract id
id_element = page.locator(".sku_wrapper .id")
id_element = page.locator(".sku_wrapper .sku")
id = await id_element.text_content() or ""

# Extract category
@@ -148,14 +154,7 @@ async def extract_product_details(page: Page, params: DetailsSchema) -> ProductD
)


async def handler(
page: Page,
params: Optional[dict] = None,
**_kwargs,
) -> ProductDetails:
if params is None:
raise ValueError("Params are required for this handler")

async def automation(page: Page, params: Params, **_kwargs) -> ProductDetails:
# Validate params using pydantic model
validated_params = DetailsSchema(**params)

@@ -168,5 +167,7 @@ async def handler(
# Extract all detailed product information
product_details = await extract_product_details(page, validated_params)

print(f"Successfully extracted details for product: {product_details.name}")

# Return the complete product details
return product_details
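
Note on the new `Params` type in this file: a `TypedDict` is only checked by static type checkers and performs no validation at runtime, which is presumably why the handler still passes `params` through `DetailsSchema`. The sketch below illustrates that distinction using only the standard library. `validate_details_params` is a hypothetical, hand-rolled stand-in for the Pydantic model, not the project's implementation, and `detailsUrl` is typed as `str` here instead of `HttpUrl` to avoid the Pydantic dependency.

```python
from typing import Any, Mapping, TypedDict
from urllib.parse import urlparse


class Params(TypedDict):
    name: str
    detailsUrl: str


def validate_details_params(params: Mapping[str, Any]) -> Params:
    """Hypothetical runtime check standing in for a Pydantic model."""
    name = params.get("name")
    url = params.get("detailsUrl")
    if not isinstance(name, str) or not name:
        raise ValueError("name must be a non-empty string")
    if not isinstance(url, str):
        raise ValueError("detailsUrl must be a string")
    parsed = urlparse(url)
    # Require an absolute http(s) URL with a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("detailsUrl must be an http(s) URL")
    return {"name": name, "detailsUrl": url}


# A TypedDict constructor builds a plain dict and does not check value types,
# so this succeeds at runtime even though a type checker would flag it.
unchecked = Params(name=1, detailsUrl=2)  # type: ignore[typeddict-item]
```

The annotation documents the expected shape for editors and type checkers, while the explicit validation step is what actually rejects malformed input.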
22 changes: 12 additions & 10 deletions python-examples/e-commerce-scrapingcourse/api/list.py
@@ -1,11 +1,16 @@
# List products from e-commerce site with pagination
from playwright.async_api import Page
from typing import TypedDict, List, Optional
from typing import TypedDict, List
from runtime_helpers import extend_payload
from intuned_browser import go_to_url

from utils.types_and_schemas import ListSchema


class Params(TypedDict):
limit: int


class Product(TypedDict):
name: str
detailsUrl: str
@@ -83,16 +88,9 @@ async def navigate_to_next_page(page: Page) -> None:
await page.locator("#product-list").wait_for(state="visible")


async def handler(
page: Page,
params: Optional[dict] = None,
**_kwargs,
) -> List[Product]:
async def automation(page: Page, params: Params, **_kwargs) -> List[Product]:
# Get the page limit from params, default to 50 if not provided
if params is None:
params = {}

validated_params = ListSchema(**params)
validated_params = ListSchema(**(params or {}))
page_limit = validated_params.limit or 50

# Navigate to the e-commerce website
@@ -133,5 +131,9 @@ async def handler(

current_page += 1

print(
f"Successfully scraped {len(all_products)} products from {current_page} page(s)"
)

# Return the scraped products
return all_products
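
Note on the defaulting pattern in this file: `ListSchema(**(params or {}))` followed by `validated_params.limit or 50` treats every falsy value as "not provided", so a limit of `0` also falls back to 50. The sketch below reproduces that behavior with plain dictionaries. `resolve_page_limit` and `DEFAULT_PAGE_LIMIT` are hypothetical names used only for illustration, and the Pydantic model is left out.

```python
from typing import Any, Mapping, Optional

DEFAULT_PAGE_LIMIT = 50  # Mirrors the default stated in the handler comment.


def resolve_page_limit(params: Optional[Mapping[str, Any]]) -> int:
    """Return the requested page limit, falling back to the default."""
    # `params or {}` guards against the handler being called with None.
    safe_params = params or {}
    limit = safe_params.get("limit")
    # `or` falls back for None and for 0, since both are falsy.
    return limit or DEFAULT_PAGE_LIMIT
```

If a limit of zero ever needs to mean "scrape nothing", an explicit `is None` check would be required in place of `or`.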
2 changes: 1 addition & 1 deletion python-examples/empty-auth/README.md
@@ -64,7 +64,7 @@ uv run intuned run authsession validate <auth-session-id>

### `intuned-browser`: Intuned Browser SDK

This project uses Intuned browser SDK. For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview).
This project uses Intuned browser SDK. For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/overview).



2 changes: 1 addition & 1 deletion python-examples/empty/README.md
@@ -46,7 +46,7 @@ uv run intuned deploy

### `intuned-browser`: Intuned Browser SDK

This project uses Intuned browser SDK. For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview).
This project uses Intuned browser SDK. For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/overview).



4 changes: 2 additions & 2 deletions python-examples/hyprid-automation/README.md
@@ -1,6 +1,6 @@
# Hybrid Automation

This example demonstrates **hybrid automation** - a flexible approach that combines the [Intuned Browser SDK](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview) with AI-powered tools like [Stagehand](https://docs.stagehand.dev/) and `extract_structured_data`. This gives you the speed and reliability of traditional automation with the adaptability of AI when needed.
This example demonstrates **hybrid automation** - a flexible approach that combines the [Intuned Browser SDK](https://docs.intunedhq.com/automation-sdks/overview) with AI-powered tools like [Stagehand](https://docs.stagehand.dev/) and `extract_structured_data`. This gives you the speed and reliability of traditional automation with the adaptability of AI when needed.

## Run on Intuned

@@ -85,6 +85,6 @@ utils/crawler/ # Crawler utilities
## Learn More

- [Flexible Automations](https://docs.intunedhq.com/docs/02-features/flexible-automation)
- [Intuned Browser SDK](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview)
- [Intuned Browser SDK](https://docs.intunedhq.com/automation-sdks/overview)
- [Extract Structured Data](https://docs.intunedhq.com/automation-sdks/intuned-sdk/python/helpers/functions/extract_structured_data)
- [Stagehand act/extract/observe](https://docs.stagehand.dev/v2/basics/act)
2 changes: 1 addition & 1 deletion python-examples/playwright-python/README.md
@@ -112,4 +112,4 @@ uv run intuned deploy
## Related

- [Playwright deep dive](https://docs.intunedhq.com/docs/01-learn/deep-dives/playwright)
- [Intuned SDK](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview)
- [Intuned SDK](https://docs.intunedhq.com/automation-sdks/overview)
2 changes: 1 addition & 1 deletion python-examples/project_getting_started_template.md
@@ -71,7 +71,7 @@ uv run intuned deploy

### `intuned-browser`: Intuned Browser SDK

This project uses Intuned browser SDK. For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/intuned-sdk/overview).
This project uses Intuned browser SDK. For more information, check out the [Intuned Browser SDK documentation](https://docs.intunedhq.com/automation-sdks/overview).


<!-- This should always match the project structure the readme is in -->