1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
.idea/
.DS_Store
node_modules/
5 changes: 5 additions & 0 deletions api-reference/answers/create.mdx
@@ -1,4 +1,9 @@
---
title: "Create Answer"
description: "Create an AI-powered answer by searching the web and extracting information."
openapi: "POST /v1/answers"
---

<Note>
**Metadata** <Icon icon="clock" /> Coming Soon — See [Metadata](/api-reference/common/metadata) for details.
</Note>
5 changes: 5 additions & 0 deletions api-reference/answers/get.mdx
@@ -1,4 +1,9 @@
---
title: "Get Answer"
description: "Retrieve a previously created answer by its ID."
openapi: "GET /v1/answers/{answer_id}"
---

<Note>
**Metadata** <Icon icon="clock" /> Coming Soon — See [Metadata](/api-reference/common/metadata) for details.
</Note>
4 changes: 4 additions & 0 deletions api-reference/batches/create.mdx
@@ -3,3 +3,7 @@ title: 'Create Batch'
description: 'Starts a new batch. You receive an `id` that you can use to track the progress of the batch as shown [here](/api-reference/batches/info). Note: Processing time is constant regardless of batch size'
openapi: POST /v1/batches
---

<Tip>
**Metadata** <Icon icon="sparkles" /> New — See [Metadata](/api-reference/common/metadata) for details.
</Tip>
4 changes: 4 additions & 0 deletions api-reference/batches/info.mdx
@@ -3,3 +3,7 @@ title: 'Batch Info'
description: 'Retrieves the status and progress information about a batch. To retrieve the content for a batch, see [here](/api-reference/batches/items)'
openapi: GET /v1/batches/{batch_id}
---

<Tip>
**Metadata** <Icon icon="sparkles" /> New — See [Metadata](/api-reference/common/metadata) for details.
</Tip>
4 changes: 4 additions & 0 deletions api-reference/batches/items.mdx
@@ -3,3 +3,7 @@ title: 'Batch Items'
description: 'Retrieves the list of items processed for a batch. You can then use the `retrieve_id` to get the content with the Retrieve Endpoint'
openapi: GET /v1/batches/{batch_id}/items
---

<Tip>
**Metadata** <Icon icon="sparkles" /> New — See [Metadata](/api-reference/common/metadata) for details.
</Tip>
7 changes: 3 additions & 4 deletions api-reference/batches/list.mdx
@@ -1,5 +1,4 @@
---
-title: 'Batch Items'
-description: 'Fetches the list of items processed for a batch.'
-openapi: GET /v1/batches/{batch_id}/items
----
+title: 'List Batches'
+description: 'Fetches the list of recent batches.'
+openapi: GET /v1/batches/recent
52 changes: 52 additions & 0 deletions api-reference/common/metadata.mdx
@@ -0,0 +1,52 @@
---
title: 'Metadata'
sidebarTitle: 'Metadata'
description: 'Attach custom key-value pairs to API requests'
icon: 'tag'
---

Many endpoints accept a `metadata` parameter for storing additional information with your requests. Metadata is returned in responses and can be used for tracking, filtering, or storing context.

## Usage

```json
{
  "url_to_scrape": "https://example.com",
  "metadata": {
    "order_id": "12345",
    "customer_name": "John Doe",
    "priority": "high"
  }
}
```

Metadata follows [Stripe's approach](https://stripe.com/docs/api/metadata): simple, flexible, and consistent across every endpoint that supports it (see [Availability](#availability)).

---

## Validation Rules

| Constraint | Limit | Error Example |
|------------|-------|---------------|
| Maximum keys | 50 | `"Metadata can have a maximum of 50 keys. You provided 51 keys."` |
| Key length | 40 characters | `"Metadata key \"my_very_long_key_name...\" exceeds 40 character limit."` |
| Key format | No square brackets | `"Metadata key \"items[0]\" cannot contain square brackets ([ or ])."` |
| Value length | 500 characters | `"Metadata value for key \"description\" exceeds 500 character limit (got 523 characters)."` |
| Value type | Strings only | `"Metadata value for key \"count\" must be a string. Got object."` |

<Note>
Numbers and booleans are automatically converted to strings. Objects and arrays are rejected.
</Note>
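
The rules above can also be checked on the client before a request is sent. The following is a minimal sketch, not part of the API or any SDK: the helper name `normalize_metadata` and its error messages are hypothetical, the limits are copied from the table, and rendering booleans as `"true"`/`"false"` is an assumption about how the server converts them.

```python
MAX_KEYS = 50
MAX_KEY_LENGTH = 40
MAX_VALUE_LENGTH = 500


def normalize_metadata(metadata):
    """Return a copy of `metadata` with string values, or raise ValueError.

    Hypothetical client-side helper that mirrors the documented limits.
    """
    if len(metadata) > MAX_KEYS:
        raise ValueError(f"too many keys: {len(metadata)} (maximum {MAX_KEYS})")

    normalized = {}
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise ValueError(f"key {key!r} must be a string")
        if len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"key {key!r} exceeds {MAX_KEY_LENGTH} characters")
        if "[" in key or "]" in key:
            raise ValueError(f"key {key!r} cannot contain square brackets")

        # Check bool before int: in Python, bool is a subclass of int.
        if isinstance(value, bool):
            value = "true" if value else "false"  # assumed string form
        elif isinstance(value, (int, float)):
            value = str(value)
        elif not isinstance(value, str):
            # Objects, arrays, and null are rejected rather than converted.
            raise ValueError(f"value for key {key!r} must be a string")

        if len(value) > MAX_VALUE_LENGTH:
            raise ValueError(
                f"value for key {key!r} exceeds {MAX_VALUE_LENGTH} characters"
            )
        normalized[key] = value
    return normalized
```

Calling `normalize_metadata({"order_id": 12345})` returns `{"order_id": "12345"}`, while a key such as `items[0]` or a nested object raises `ValueError` before any request is made. The server remains the source of truth for validation; this only catches obvious mistakes early.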

---

## Availability

| Endpoint | Status |
|----------|--------|
| [Batches](/api-reference/batches/create) | <Icon icon="check" color="green" /> Available |
| [Crawls](/api-reference/crawls/create) | <Icon icon="check" color="green" /> Available |
| [Maps](/api-reference/maps/create) | <Icon icon="check" color="green" /> Available |
| [Scrapes](/api-reference/scrapes/create) | <Icon icon="clock" /> Coming Soon |
| [Answers](/api-reference/answers/create) | <Icon icon="clock" /> Coming Soon |

127 changes: 127 additions & 0 deletions api-reference/common/pagination.mdx
@@ -0,0 +1,127 @@
---
title: 'Pagination'
sidebarTitle: 'Pagination'
description: 'How to paginate through large result sets using cursor-based pagination'
icon: 'arrow-right'
---

Endpoints that return large result sets use **cursor-based pagination**. Each request returns one page of results, and you retrieve the full set by making multiple requests.

## How It Works

Pagination uses two query parameters:

- **`cursor`**: A token that indicates where to start fetching results. On the first request, **omit the `cursor` parameter**. For subsequent requests, use the `cursor` value from the previous response. See the [`cursor` parameter](/api-reference/batches/items#query-cursor) documentation for details.
- **`limit`**: The maximum number of results to return per request (recommended: 10-50 for batches/crawls).

When there are more results available, the response includes a `cursor` field. Continue making requests with the new `cursor` value until the `cursor` field is absent, indicating all results have been retrieved.

---

## Examples

<CodeGroup>

```python Python
import requests

API_URL = 'https://api.olostep.com/v1'
API_KEY = '<your_token>'
HEADERS = {'Authorization': f'Bearer {API_KEY}'}

def get_batch_items(batch_id, cursor=None, limit=10):
    """Fetch one page of batch items; omit `cursor` on the first request."""
    params = {'limit': limit}
    if cursor is not None:
        params['cursor'] = cursor
    response = requests.get(
        f'{API_URL}/batches/{batch_id}/items',
        headers=HEADERS,
        params=params
    )
    response.raise_for_status()  # fail on HTTP errors instead of parsing an error body
    return response.json()

# Paginate through all items
cursor = None
while True:
    result = get_batch_items('batch_abc123', cursor=cursor, limit=10)

    for item in result['items']:
        print(f"Custom ID: {item['custom_id']}, URL: {item['url']}")

    # A missing `cursor` field means every item has been retrieved
    if 'cursor' not in result:
        break

    cursor = result['cursor']
```

```js Node.js
const API_URL = 'https://api.olostep.com/v1';
const API_KEY = '<your_token>';

async function getBatchItems(batchId, cursor = null, limit = 10) {
  const params = new URLSearchParams();
  if (cursor !== null) params.append('cursor', cursor);
  params.append('limit', limit);

  const response = await fetch(
    `${API_URL}/batches/${batchId}/items?${params}`,
    { headers: { 'Authorization': `Bearer ${API_KEY}` } }
  );
  return response.json();
}

// Paginate through all items
let cursor = null;
while (true) {
  const result = await getBatchItems('batch_abc123', cursor, 10);

  result.items.forEach(item => {
    console.log(`Custom ID: ${item.custom_id}, URL: ${item.url}`);
  });

  if (result.cursor === undefined) break;
  cursor = result.cursor;
}
```

```bash cURL
# First request (omit cursor parameter)
curl -G "https://api.olostep.com/v1/batches/batch_abc123/items" \
-H "Authorization: Bearer $OLOSTEP_API_KEY" \
--data-urlencode "limit=10"

# Subsequent requests use the cursor from previous response
curl -G "https://api.olostep.com/v1/batches/batch_abc123/items" \
-H "Authorization: Bearer $OLOSTEP_API_KEY" \
--data-urlencode "cursor=10" \
--data-urlencode "limit=10"
```

</CodeGroup>

---

## Best Practices

1. **Omit `cursor` on first request**: For batches and crawls, omit the `cursor` parameter entirely on the first request. Only include it when continuing from a previous response.

2. **Use appropriate limits**:
   - Batches/Crawls: 10-50 items per request
   - Maps: Handled automatically (up to 10MB per response)

3. **Check for cursor**: Always check if a `cursor` field exists in the response before making the next request. If it's absent, you've retrieved all results.

4. **Handle errors**: Retry a request that failed with a network error using the same cursor, but don't re-request a cursor whose results you have already processed, or you will handle those items twice.

5. **Streaming**: For crawls, you can start paginating while the crawl is `in_progress` to stream results as they become available. See the [`cursor` parameter](/api-reference/crawls/pages#query-cursor) documentation for details.
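
The practices above can be combined into one helper. This is a sketch under stated assumptions rather than an official client: `iter_all_items`, its parameters, and the exponential backoff policy are hypothetical, and `fetch_page` stands for any callable that takes a cursor (or `None`) and returns one parsed response containing `items` and, when more results remain, `cursor`.

```python
import time


def iter_all_items(fetch_page, max_retries=3, backoff_seconds=1.0,
                   retry_on=(ConnectionError, TimeoutError)):
    """Yield every item across all pages of a cursor-paginated endpoint.

    Hypothetical helper: `fetch_page(cursor)` returns one parsed page.
    """
    cursor = None  # the first request omits the cursor
    while True:
        # Retry only the failed request; items from this page have not
        # been yielded yet, so nothing is processed twice.
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(cursor)
                break
            except retry_on:
                if attempt == max_retries:
                    raise
                time.sleep(backoff_seconds * 2 ** attempt)

        yield from page.get('items', [])

        # An absent `cursor` field means all results have been retrieved.
        if 'cursor' not in page:
            return
        cursor = page['cursor']
```

With the `get_batch_items` function from the Examples section, usage could look like `for item in iter_all_items(lambda c: get_batch_items('batch_abc123', cursor=c)): ...`. Because `requests` raises its own exception classes, pass `retry_on=(requests.exceptions.ConnectionError, requests.exceptions.Timeout)` in that case.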

---

## Availability

| Endpoint | Cursor Type | Limit Parameter | Notes |
|----------|-------------|-----------------|-------|
| [Batch Items](/api-reference/batches/items) | Integer | Yes (10-50 recommended) | See [`cursor` parameter](/api-reference/batches/items#query-cursor) |
| [Crawl Pages](/api-reference/crawls/pages) | Integer | Yes (10-50 recommended) | See [`cursor` parameter](/api-reference/crawls/pages#query-cursor) |
| [Maps](/api-reference/maps/create) | String | No (automatic) | Auto-paginates at 10MB |

4 changes: 4 additions & 0 deletions api-reference/crawls/create.mdx
@@ -3,3 +3,7 @@ title: 'Create Crawl'
description: 'Starts a new crawl. You receive an `id` to track the progress. The operation may take 1-10 minutes, depending on the site and on the depth and pages parameters.'
openapi: POST /v1/crawls
---

<Tip>
**Metadata** <Icon icon="sparkles" /> New — See [Metadata](/api-reference/common/metadata) for details.
</Tip>
4 changes: 4 additions & 0 deletions api-reference/crawls/info.mdx
@@ -3,3 +3,7 @@ title: 'Crawl Info'
description: 'Fetches information about a specific crawl.'
openapi: GET /v1/crawls/{crawl_id}
---

<Tip>
**Metadata** <Icon icon="sparkles" /> New — See [Metadata](/api-reference/common/metadata) for details.
</Tip>
4 changes: 4 additions & 0 deletions api-reference/crawls/pages.mdx
@@ -3,3 +3,7 @@ title: 'Crawl Pages'
description: 'Fetches the list of pages for a specific crawl.'
openapi: GET /v1/crawls/{crawl_id}/pages
---

<Tip>
**Metadata** <Icon icon="sparkles" /> New — See [Metadata](/api-reference/common/metadata) for details.
</Tip>