Summary
The Docker API already calls `arun_many()` internally for multi-URL requests, but the current `/crawl` contract only accepts a single `crawler_config` per request.
This makes the Docker service less expressive than the SDK, where `arun_many()` supports:
- a single `CrawlerRunConfig`, or
- a list of `CrawlerRunConfig` objects with `url_matcher` patterns
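To make the second form concrete, here is a minimal, self-contained sketch of the per-URL config selection the SDK's `url_matcher` enables. It assumes glob-style matching and uses plain dicts as stand-ins for `CrawlerRunConfig`; `pick_config` is an illustrative helper, not part of the Crawl4AI API.

```python
from fnmatch import fnmatch

def pick_config(url, configs, default=None):
    """Return the first config whose url_matcher glob matches the URL.
    Illustrates (as an assumption) how a list of configs is paired with URLs."""
    for cfg in configs:
        if fnmatch(url, cfg["url_matcher"]):
            return cfg
    return default

# Stand-ins for serialized CrawlerRunConfig objects.
configs = [
    {"name": "docs", "url_matcher": "https://docs.example.com/*"},
    {"name": "blog", "url_matcher": "*/blog/*"},
]

print(pick_config("https://docs.example.com/intro", configs)["name"])   # docs
print(pick_config("https://example.com/blog/post-1", configs)["name"])  # blog
```

With a single config, every URL falls through to the same settings; with a list, each URL gets the first matching entry.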
Why this matters
For production orchestrators, the Docker API is the cleanest architecture because it keeps:
- the app as orchestration only
- Crawl4AI as the actual crawler runtime
- browser/playwright dependencies out of the app image
But today this forces a tradeoff:
- use the SDK directly for full control, or
- use the Docker API with reduced batch/config flexibility
Requested enhancement
Please extend the official Docker API so it can expose the richer `arun_many()` capability from the SDK, ideally by supporting one of these forms:
- `crawler_configs` as a list of serialized `CrawlerRunConfig` objects
- support for URL-specific config matching equivalent to the SDK's `config=[...]` with `url_matcher`
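A possible request shape for the first form, as a sketch only: the `crawler_configs` field is the proposed addition, and the per-config parameters shown (`url_matcher`, `css_selector`, `word_count_threshold`) are illustrative values, not a confirmed payload schema.

```json
{
  "urls": [
    "https://docs.example.com/intro",
    "https://example.com/blog/post-1"
  ],
  "browser_config": {"headless": true},
  "crawler_configs": [
    {"url_matcher": "https://docs.example.com/*", "css_selector": "main"},
    {"url_matcher": "*/blog/*", "word_count_threshold": 50}
  ]
}
```

Keeping `crawler_config` (singular) working unchanged would make this a backward-compatible extension.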
Desired behavior
- multi-URL synchronous crawling without webhooks
- per-result success/failure isolation
- support for mixed URL configs in one request
- optional dispatcher settings if possible
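The per-result isolation point can be sketched in a few lines: one failing URL should surface as a failed entry in the batch response rather than aborting the whole request. The result shape below (`url`/`success`/`error` keys) is an assumption for illustration, not the actual Crawl4AI response schema.

```python
def partition_results(results):
    """Split a batch response into successes and failures,
    so callers can retry or log failed URLs independently."""
    ok = [r for r in results if r["success"]]
    failed = [r for r in results if not r["success"]]
    return ok, failed

results = [
    {"url": "https://example.com/a", "success": True},
    {"url": "https://example.com/b", "success": False, "error": "timeout"},
]
ok, failed = partition_results(results)
print(len(ok), len(failed))  # 1 1
```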
Current observation
The current Docker API implementation loads:

- `BrowserConfig.load(browser_config)`
- `CrawlerRunConfig.load(crawler_config)`

and then calls `arun_many()` when `len(urls) > 1`, but only with a single config object.
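One way the handler could grow into the list form without breaking existing clients is to normalize the payload before loading configs. This is a sketch under assumptions: `normalize_crawler_configs` is a hypothetical helper, and the real endpoint would still pass each entry through `CrawlerRunConfig.load()`.

```python
def normalize_crawler_configs(payload):
    """Accept either a single serialized config ('crawler_config') or a
    list ('crawler_configs'), and return a list in both cases so the
    downstream arun_many() dispatch only has to handle one shape."""
    cfg = payload.get("crawler_configs") or payload.get("crawler_config")
    if cfg is None:
        return []
    return cfg if isinstance(cfg, list) else [cfg]

print(normalize_crawler_configs({"crawler_config": {"cache_mode": "bypass"}}))
# [{'cache_mode': 'bypass'}]
```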
Benefit
This would make the official Docker API much more viable for real production systems that want clean service separation without losing SDK-level batch control.