A lightweight actors scraper that collects structured information about published actors and returns clean, filterable datasets. It helps teams explore actor listings, analyze usage metrics, and track performance trends without manual browsing.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for apify-actors, you've just found your team. Let's Chat.
This project gathers actor metadata and statistics into a structured dataset that's easy to filter and analyze. It solves the problem of manually reviewing large actor catalogs by turning them into queryable data. It's built for developers, analysts, and product teams who need visibility into actor ecosystems.
- Collects actor profiles with titles, usernames, and descriptions
- Supports filtering by name, title keywords, and result limits
- Outputs normalized records ready for analytics or storage
- Designed to scale across hundreds or thousands of actors
| Feature | Description |
|---|---|
| Keyword Filtering | Narrow results by username or title for targeted discovery. |
| Result Limiting | Control output size to match analysis or testing needs. |
| Rich Metadata | Includes usage stats, categories, and descriptive fields. |
| Dataset Output | Returns consistent, structured records for easy reuse. |
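The filtering and limiting features above can be sketched as follows. This is a minimal illustration, not the project's actual `filters.py`: the function name `filter_actors`, its parameters, and the matching rules (exact username, substring title match, both case-insensitive) are assumptions made for the example.

```python
from typing import Iterable, List, Optional


def filter_actors(
    records: Iterable[dict],
    username: Optional[str] = None,
    title_keyword: Optional[str] = None,
    limit: Optional[int] = None,
) -> List[dict]:
    """Return records matching the optional filters, capped at `limit`."""
    matched: List[dict] = []
    for record in records:
        # Stop as soon as the requested number of results is reached.
        if limit is not None and len(matched) >= limit:
            break
        # Assumed semantics: exact, case-insensitive username match.
        if username and record.get("username", "").lower() != username.lower():
            continue
        # Assumed semantics: case-insensitive substring match on the title.
        if title_keyword and title_keyword.lower() not in record.get("title", "").lower():
            continue
        matched.append(record)
    return matched


sample = [
    {"title": "Y Combinator Companies", "username": "prog-party"},
    {"title": "Example Scraper", "username": "someone-else"},
]
result = filter_actors(sample, title_keyword="combinator", limit=10)
print(result)
```

Checking the limit before appending means a limit of zero returns an empty list rather than one record.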
| Field Name | Field Description |
|---|---|
| title | Human-readable title of the actor. |
| name | Unique identifier or slug for the actor. |
| username | Publisher or owner username. |
| description | Short summary of what the actor does. |
| categories | Functional categories associated with the actor. |
| stats_totalBuilds | Total number of builds. |
| stats_totalRuns | Total number of runs executed. |
| stats_totalUsers | Count of unique users. |
| stats_lastRunStartedAt | Timestamp of the most recent run. |
| pictureUrl | URL to the actor's icon image. |
| currentPricingInfo | Pricing and trial configuration details. |
[
  {
    "title": "Y Combinator Companies",
    "name": "y-combinator-companies",
    "username": "prog-party",
    "stats_totalBuilds": 11,
    "stats_totalRuns": 8,
    "stats_totalUsers": 1,
    "stats_lastRunStartedAt": "2025-03-13T21:47:39.490Z",
    "description": "Retrieves Y Combinator company data and returns it as a dataset.",
    "categories": ["AUTOMATION", "LEAD_GENERATION"],
    "pictureUrl": "https://example.com/actor-icon.png"
  }
]
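A record shaped like the sample above could be loaded into a typed structure along these lines. This is a sketch only: the `ActorRecord` class and `parse_record` helper are hypothetical names, not the contents of the project's `actor_schema.py`, and only a subset of the documented fields is mapped.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ActorRecord:
    """Hypothetical typed view over one output record."""
    title: str
    name: str
    username: str
    description: str = ""
    categories: List[str] = field(default_factory=list)
    total_runs: int = 0
    total_users: int = 0
    last_run_started_at: Optional[datetime] = None


def parse_record(raw: dict) -> ActorRecord:
    started = raw.get("stats_lastRunStartedAt")
    return ActorRecord(
        title=raw["title"],
        name=raw["name"],
        username=raw["username"],
        description=raw.get("description", ""),
        categories=list(raw.get("categories", [])),
        total_runs=int(raw.get("stats_totalRuns", 0)),
        total_users=int(raw.get("stats_totalUsers", 0)),
        # The sample timestamp ends in "Z", so it is parsed explicitly as UTC.
        last_run_started_at=(
            datetime.strptime(started, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
            if started
            else None
        ),
    )


raw_output = """
[
  {
    "title": "Y Combinator Companies",
    "name": "y-combinator-companies",
    "username": "prog-party",
    "stats_totalBuilds": 11,
    "stats_totalRuns": 8,
    "stats_totalUsers": 1,
    "stats_lastRunStartedAt": "2025-03-13T21:47:39.490Z",
    "description": "Retrieves Y Combinator company data and returns it as a dataset.",
    "categories": ["AUTOMATION", "LEAD_GENERATION"],
    "pictureUrl": "https://example.com/actor-icon.png"
  }
]
"""
records = [parse_record(item) for item in json.loads(raw_output)]
print(records[0].name, records[0].last_run_started_at.isoformat())
```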
Apify Actors/
├── src/
│   ├── main.py
│   ├── filters.py
│   ├── collectors/
│   │   └── actor_collector.py
│   ├── models/
│   │   └── actor_schema.py
│   └── utils/
│       └── validators.py
├── data/
│   ├── sample_input.json
│   └── sample_output.json
├── requirements.txt
└── README.md
- Product teams use it to analyze actor adoption, so they can prioritize integrations.
- Developers use it to discover relevant actors, so they can speed up implementation.
- Market researchers use it to track trends, so they can understand ecosystem growth.
- Automation builders use it to shortlist tools, so they can design workflows faster.
What input does this scraper require? It accepts a JSON configuration where you can define filters such as username, title keywords, and a maximum result limit.
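As an illustration of such a configuration, the sketch below loads and validates a JSON input. The key names `username`, `title`, and `limit` are assumptions chosen for the example; the real keys are defined by the project's `sample_input.json`, which is not reproduced here.

```python
import json

# Hypothetical input keys and their defaults (None means "no filter").
DEFAULTS = {"username": None, "title": None, "limit": None}


def load_config(text: str) -> dict:
    """Parse a JSON config string and check types before use."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("Input must be a JSON object")
    unknown = set(raw) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown keys: {sorted(unknown)}")
    config = {**DEFAULTS, **raw}
    for key in ("username", "title"):
        if config[key] is not None and not isinstance(config[key], str):
            raise ValueError(f"{key} must be a string")
    limit = config["limit"]
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if limit is not None and (isinstance(limit, bool) or not isinstance(limit, int) or limit < 1):
        raise ValueError("limit must be a positive integer")
    return config


config = load_config('{"username": "prog-party", "limit": 50}')
print(config)
```

Rejecting unknown keys early makes typos in filter names visible instead of silently returning an unfiltered listing.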
Can I run it without filters? Yes, running it without filters returns a general listing, though applying filters is recommended for performance and relevance.
Is the output suitable for analytics pipelines? Absolutely. The structured dataset format is designed to plug directly into analytics or storage systems.
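One way to hand records to a downstream tool is to flatten them to CSV. The following is a sketch under assumptions: the `records_to_csv` helper and the chosen column subset are illustrative, and joining categories with `;` is just one possible convention.

```python
import csv
import io

# Columns to export; names follow the output fields documented above.
FIELDS = ["title", "name", "username", "stats_totalRuns", "stats_totalUsers", "categories"]


def records_to_csv(records) -> str:
    """Serialize records to CSV text, ignoring fields not listed in FIELDS."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        row = dict(record)
        # Lists do not map onto a single CSV cell, so categories are joined.
        row["categories"] = ";".join(record.get("categories", []))
        writer.writerow(row)
    return buffer.getvalue()


records = [
    {
        "title": "Y Combinator Companies",
        "name": "y-combinator-companies",
        "username": "prog-party",
        "stats_totalRuns": 8,
        "stats_totalUsers": 1,
        "categories": ["AUTOMATION", "LEAD_GENERATION"],
        "pictureUrl": "https://example.com/actor-icon.png",
    }
]
csv_text = records_to_csv(records)
print(csv_text)
```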
Does it support large datasets? It's optimized to handle large result sets, with limit controls to manage performance.
Primary Metric: Processes several hundred actor records per minute under normal conditions.
Reliability Metric: Consistently achieves high success rates with stable data collection across runs.
Efficiency Metric: Minimal memory footprint due to streaming-style data handling.
Quality Metric: High data completeness with consistent field coverage across records.
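The streaming-style handling mentioned above can be illustrated with Python generators. This is a sketch, not the project's collector: `fetch_pages` is a hypothetical stand-in that fabricates placeholder records in place of a real paginated source, and `fetched_pages` exists only to show how many pages were actually requested.

```python
from itertools import islice
from typing import Iterable, Iterator, List

fetched_pages: List[int] = []  # records which pages were requested


def fetch_pages(page_size: int = 10, total_pages: int = 1000) -> Iterator[List[dict]]:
    """Hypothetical paginated source; yields one page of records at a time."""
    for page in range(total_pages):
        fetched_pages.append(page)
        yield [{"name": f"actor-{page}-{i}"} for i in range(page_size)]


def iter_records(pages: Iterable[List[dict]]) -> Iterator[dict]:
    """Flatten pages into a single lazy stream of records."""
    for page in pages:
        yield from page


def collect(limit: int) -> List[dict]:
    # islice stops pulling from the stream once `limit` records are taken,
    # so pages beyond the limit are never fetched or held in memory.
    return list(islice(iter_records(fetch_pages()), limit))


first = collect(25)
print(len(first), first[-1]["name"], fetched_pages)
```

With a limit of 25 and a page size of 10, only the first three of the thousand available pages are requested.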
