Medium User Search Scraper helps you discover Medium writers directly from search results and turn them into structured profile records. Use it to build outreach lists, map niche communities, and track authors by keyword—without manual copying and cleanup.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
This project searches Medium users by keyword and extracts profile-level details into a clean, analysis-ready dataset. It solves the problem of finding relevant writers at scale while keeping the data consistent for research, influencer tracking, and outreach workflows. It’s built for marketers, researchers, founders, and developers who need a repeatable way to collect Medium user search profiles.
- Searches users using one or more keyword queries
- Collects profile identity, bio, and public profile URL in a structured format
- Supports configurable result limits per run for controlled list building
- Produces consistent records suitable for spreadsheets, CRMs, and analytics pipelines
- Works well for running multiple niche searches to expand coverage
| Feature | Description |
|---|---|
| Keyword user search | Find Medium users by one or more search terms to discover writers in specific niches. |
| Rich profile extraction | Captures bio, username, avatar, profile URL, and the membership and book-author flags for quick filtering. |
| Multi-keyword batching | Run multiple searches in one execution and unify results into a single dataset. |
| Configurable maxItems | Control how many user profiles are collected to match budget, time, or list size goals. |
| Clean, structured output | Produces consistent fields for easy deduplication, segmentation, and enrichment. |
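The multi-keyword batching and `maxItems` features above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `collect_users` and the `search` callable are hypothetical names, and the cap is assumed to apply to the whole run rather than to each keyword.

```python
from itertools import chain, islice
from typing import Callable, Iterable


def collect_users(
    keywords: Iterable[str],
    search: Callable[[str], Iterable[dict]],
    max_items: int,
) -> list[dict]:
    """Run one search per keyword and merge the results into one list.

    `search` is a hypothetical callable that yields profile dicts for a
    single keyword. At most `max_items` profiles are returned in total.
    """
    # Chain the per-keyword result streams lazily, in keyword order.
    merged = chain.from_iterable(search(keyword) for keyword in keywords)
    # islice stops pulling results as soon as the cap is reached.
    return list(islice(merged, max_items))
```

Because the streams are consumed lazily, later keywords are never searched once the cap has been hit, which keeps small test runs cheap.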
| Field Name | Field Description |
|---|---|
| id | Unique user identifier from the source profile record. |
| name | Display name shown on the user profile. |
| username | Medium username/handle used for profile addressing. |
| bio | Short biography text from the user profile. |
| avatarUrl | Profile avatar image URL (useful for previews and enrichment). |
| userUrl | Public profile URL for the user. |
| isBookAuthor | Boolean flag indicating book-author status (when available). |
| isMember | Boolean flag indicating membership status (when available). |
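For downstream Python code, the fields above could be mapped onto a typed record. The sketch below is one possible shape; the `UserProfile` class and its snake_case attribute names are illustrative assumptions, not part of the scraper. The two flags default to `None` because they are only present when available.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class UserProfile:
    """Hypothetical typed view of one output record."""

    id: str
    username: str
    name: str = ""
    bio: str = ""
    avatar_url: str = ""
    user_url: str = ""
    is_book_author: Optional[bool] = None  # None when the flag is absent
    is_member: Optional[bool] = None       # None when the flag is absent

    @classmethod
    def from_record(cls, record: dict[str, Any]) -> "UserProfile":
        # id and username are treated as required; a KeyError signals a bad record.
        return cls(
            id=record["id"],
            username=record["username"],
            name=record.get("name", ""),
            bio=record.get("bio", ""),
            avatar_url=record.get("avatarUrl", ""),
            user_url=record.get("userUrl", ""),
            is_book_author=record.get("isBookAuthor"),
            is_member=record.get("isMember"),
        )
```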
[
  {
    "id": "2d196168bd00",
    "name": "chel writes",
    "username": "chelwrites",
    "bio": "write articles and personal thoughts. publish drafts regularly. sometimes in english or indonesian :)",
    "avatarUrl": "https://miro.medium.com/v2/resize:fill:96:96/1*5GUJnGMlifgnut6UmNvdng.jpeg",
    "userUrl": "https://chelwrites.medium.com",
    "isBookAuthor": false,
    "isMember": false
  },
  {
    "id": "11841531c264",
    "name": "Jo Ann Harris, Writer of Daily Musings",
    "username": "joannharris-53598",
    "bio": "Writing on Medium since 2018. Writer for Illumination. I write on a myriad of subjects with you in mind.",
    "avatarUrl": "https://miro.medium.com/v2/resize:fill:96:96/0*nZwiKx3sl_Wcu6yS.",
    "userUrl": "https://joannharris-53598.medium.com",
    "isBookAuthor": false,
    "isMember": true
  }
]
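Records in this shape can be flattened for spreadsheets or CRM imports. The following is a small sketch using only the standard library; `records_to_csv` is a hypothetical helper name, and the column order simply mirrors the field table.

```python
import csv
import io

FIELDS = [
    "id", "name", "username", "bio",
    "avatarUrl", "userUrl", "isBookAuthor", "isMember",
]


def records_to_csv(records: list[dict]) -> str:
    """Serialize profile records to CSV text with a fixed column order."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not listed in FIELDS.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({field: record.get(field, "") for field in FIELDS})
    return buffer.getvalue()
```

The `csv` module quotes values that contain commas, so a display name such as "Jo Ann Harris, Writer of Daily Musings" stays in a single column.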
Medium User Search Scraper/
├── src/
│   ├── main.py
│   ├── runner.py
│   ├── pipelines/
│   │   ├── search_users.py
│   │   └── normalize_records.py
│   ├── extractors/
│   │   ├── profile_parser.py
│   │   └── field_mapper.py
│   ├── http/
│   │   ├── client.py
│   │   └── retries.py
│   ├── storage/
│   │   ├── dataset_writer.py
│   │   └── dedupe.py
│   └── config/
│       ├── schema.json
│       └── settings.example.json
├── data/
│   ├── input.example.json
│   └── output.sample.json
├── tests/
│   ├── test_parser.py
│   └── test_normalize.py
├── .gitignore
├── requirements.txt
├── LICENSE
└── README.md
- Content marketers use it to discover writers by niche keywords, so they can build targeted outreach lists for collaborations.
- Startup founders use it to find relevant Medium authors in their industry, so they can identify partners and guest-post contributors faster.
- Researchers use it to collect Medium user search profiles at scale, so they can analyze author ecosystems and topic clusters.
- Agencies use it to generate segmented writer databases, so they can match clients with suitable creators and publications.
- Analysts use it to track influencer discovery over time, so they can monitor emerging authors in specific fields.
Q1: How do I choose the best keywords for finding relevant writers?

Use intent-focused keywords that match writer niches (e.g., “product management”, “web3”, “data analytics”, “copywriting”). Start narrow, review the first batch, then expand with adjacent terms and synonyms. Running multiple smaller searches typically produces cleaner lists than one broad keyword.

Q2: What does maxItems control, and what’s a good default?

maxItems sets the maximum number of user profiles collected for the run. For testing, 20–50 is a practical range. For list-building, 200–500 per keyword is common, but you should scale gradually and deduplicate across runs to keep your dataset clean.

Q3: How do I avoid duplicates when running multiple keywords?

Deduplicate by stable identifiers like id or username. If you’re merging multiple runs, keep a “seen” set keyed by id/username and only store new profiles. This keeps outreach lists and analytics consistent across overlapping keyword searches.

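The "seen" set approach could look like the sketch below. `merge_new_profiles` is a hypothetical helper, not a function from this repository; it assumes `id` is preferred as the key and `username` is the fallback, and it keeps records that have neither because they cannot be compared.

```python
from typing import Iterable, Optional


def profile_key(profile: dict) -> Optional[tuple]:
    """Return a stable dedupe key, preferring id over username."""
    if profile.get("id"):
        return ("id", profile["id"])
    if profile.get("username"):
        return ("username", profile["username"])
    return None  # no stable identifier available


def merge_new_profiles(seen: set, batch: Iterable[dict]) -> list[dict]:
    """Return only profiles not seen before, updating `seen` in place."""
    fresh = []
    for profile in batch:
        key = profile_key(profile)
        if key is not None:
            if key in seen:
                continue  # already collected in this or an earlier run
            seen.add(key)
        fresh.append(profile)
    return fresh
```

Reusing the same `seen` set across keyword runs means each overlapping profile is stored only once:

```python
seen = set()
first = merge_new_profiles(seen, [{"id": "a1"}, {"id": "b2"}])
second = merge_new_profiles(seen, [{"id": "b2"}, {"id": "c3"}])
# second contains only the profile with id "c3"
```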
Q4: What fields are most useful for outreach segmentation?

username, bio, userUrl, and isMember are typically the most actionable for segmentation. Use bio keyword matching for niche tagging, and the isMember and isBookAuthor flags as optional signals for prioritization.

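Bio keyword matching and flag-based prioritization might be sketched like this. The niche names, search terms, and scoring rule are all illustrative assumptions; substring matching is deliberately simple and will also match inside longer words.

```python
def tag_niches(bio: str, niches: dict[str, list[str]]) -> list[str]:
    """Return the niches whose terms appear in the bio (case-insensitive)."""
    text = bio.lower()
    return [
        niche
        for niche, terms in niches.items()
        if any(term.lower() in text for term in terms)
    ]


def priority_score(profile: dict) -> int:
    """Count how many optional flags are set; missing flags count as 0."""
    return int(bool(profile.get("isMember"))) + int(bool(profile.get("isBookAuthor")))


def segment(profiles: list[dict], niches: dict[str, list[str]]) -> list[dict]:
    """Attach niche tags and a score, then sort highest score first."""
    enriched = [
        {**p, "niches": tag_niches(p.get("bio", ""), niches), "score": priority_score(p)}
        for p in profiles
    ]
    return sorted(enriched, key=lambda p: p["score"], reverse=True)
```

A caller would pass its own mapping, for example `{"product": ["product management"], "data": ["data analytics"]}`, and tune the terms after reviewing a first batch.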
- Primary Metric: ~2.0–4.5 profiles/second on stable connections with moderate result limits (100–300), depending on response latency.
- Reliability Metric: 97–99% run completion rate under typical usage, with automatic retry handling for transient failures.
- Efficiency Metric: ~120–220 MB peak memory usage for 500–1,000 collected profiles, with streaming writes to keep overhead low.
- Quality Metric: 95–98% field completeness for core identity fields (id, name, username, userUrl), with optional flags (isMember/isBookAuthor) varying by profile visibility.
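
To check field completeness on your own dataset, one option is the sketch below. It assumes completeness means the share of non-empty values across the core identity fields; how the figure above was actually measured is not stated, so treat this definition and the `field_completeness` name as assumptions.

```python
CORE_FIELDS = ("id", "name", "username", "userUrl")


def field_completeness(records: list[dict], fields: tuple = CORE_FIELDS) -> float:
    """Return the percentage of core field values that are present and non-empty."""
    if not records:
        return 0.0  # avoid dividing by zero on an empty dataset
    filled = sum(
        1
        for record in records
        for field in fields
        if record.get(field) not in (None, "")
    )
    return 100.0 * filled / (len(records) * len(fields))
```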
