A lightweight Instagram media scraper that extracts post summaries, media download links, and engagement metadata from public profiles. It helps developers, researchers, and marketers quickly collect structured Instagram post data without login or complex setup.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
This project extracts recent public Instagram posts from multiple usernames and returns direct media URLs with engagement metadata. It gives quick access to usable post data for analysis, archiving, or integration, with no manual browsing. It is designed for developers, analysts, students, and social media professionals who need simple, reliable access to public content.
- Works with public profiles only and requires no authentication
- Supports up to 10 usernames per run with up to 12 posts each
- Extracts images, videos, and carousel cover media
- Returns engagement metrics and timestamps in structured format
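The per-run limits listed above can be enforced before any request is made. The following is a minimal sketch, not the project's actual input handling: the function names `normalize_usernames` and `clamp_post_count` are hypothetical, and lowercasing usernames and stripping a leading `@` are assumptions about how input might be cleaned.

```python
MAX_USERNAMES = 10        # per-run limit stated in the list above
MAX_POSTS_PER_USER = 12   # per-username limit stated in the list above


def normalize_usernames(raw_usernames):
    """Clean a list of usernames and enforce the per-run limit.

    Hypothetical helper: strips whitespace and a leading '@', lowercases,
    and drops blanks and duplicates while keeping the original order.
    """
    seen = set()
    cleaned = []
    for name in raw_usernames:
        name = name.strip().lstrip("@").lower()
        if name and name not in seen:
            seen.add(name)
            cleaned.append(name)
    if len(cleaned) > MAX_USERNAMES:
        raise ValueError(
            f"{len(cleaned)} usernames given; at most {MAX_USERNAMES} are supported per run"
        )
    return cleaned


def clamp_post_count(requested):
    """Keep a requested post count within 1..MAX_POSTS_PER_USER."""
    return max(1, min(requested, MAX_POSTS_PER_USER))
```

Rejecting an oversized username list up front, rather than silently truncating it, makes it obvious to the caller that some profiles would otherwise be skipped.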
| Feature | Description |
|---|---|
| Multi-user support | Process up to 10 Instagram usernames in a single run. |
| Media extraction | Collect images, videos, and carousel cover images. |
| Direct download links | Provides real media URLs ready for download or processing. |
| Engagement metadata | Extracts likes, comments, views, and timestamps. |
| Lightweight workflow | Fast startup and minimal configuration required. |
| Field Name | Field Description |
|---|---|
| username | Instagram username that owns the post. |
| postUrl | Direct URL to the Instagram post or reel. |
| mediaDownloadUrl | Downloadable URL of the image or video media. |
| mediaType | Type of media: image, video, or carousel. |
| title | Caption title or headline text of the post. |
| description | Full or partial post caption text. |
| likes | Total number of likes on the post. |
| comments | Total number of comments on the post. |
| timestamp | Original posting time in ISO 8601 format (UTC). |
| scrapedAt | Time when the data was collected, in ISO 8601 format (UTC). |
| dimensions | Width and height of the media asset. |
| videoViewCount | Total views for video posts, if available. |
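For downstream code, the fields in the table above can be loaded into a typed record. This is a sketch under assumptions: `PostRecord` and `parse_iso_utc` are hypothetical names that do not come from the project, and treating `dimensions` and `videoViewCount` as optional is a guess based on the "if available" wording.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_iso_utc(value: str) -> datetime:
    """Parse a timestamp such as '2025-08-20T02:12:16.000Z' into an aware datetime."""
    # Swap the trailing 'Z' for an explicit offset so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class PostRecord:
    """Hypothetical typed view of one output record."""
    username: str
    post_url: str
    media_download_url: str
    media_type: str
    likes: int
    comments: int
    timestamp: datetime
    scraped_at: datetime
    width: Optional[int]
    height: Optional[int]
    video_view_count: Optional[int]
    title: str = ""
    description: str = ""

    @classmethod
    def from_dict(cls, raw: dict) -> "PostRecord":
        dims = raw.get("dimensions") or {}
        return cls(
            username=raw["username"],
            post_url=raw["postUrl"],
            media_download_url=raw["mediaDownloadUrl"],
            media_type=raw["mediaType"],
            likes=int(raw["likes"]),
            comments=int(raw["comments"]),
            timestamp=parse_iso_utc(raw["timestamp"]),
            scraped_at=parse_iso_utc(raw["scrapedAt"]),
            width=dims.get("width"),
            height=dims.get("height"),
            video_view_count=raw.get("videoViewCount"),  # absent for non-video posts
            title=raw.get("title", ""),
            description=raw.get("description", ""),
        )
```

Parsing both timestamps into timezone-aware `datetime` objects makes it straightforward to compute, for example, how old a post was at scrape time.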
[
{
"username": "nick_saraev",
"postUrl": "https://www.instagram.com/reel/DNj0JScvl7j/",
"mediaDownloadUrl": "https://scontent.cdninstagram.com/video.mp4",
"mediaType": "video",
"title": "Comment \"VIDEO\" to get this free open-source AI video model",
"description": "Comment \"VIDEO\" to get this free open-source AI video model...",
"likes": 2278,
"comments": 1746,
"timestamp": "2025-08-20T02:12:16.000Z",
"scrapedAt": "2025-08-20T22:36:56.082Z",
"dimensions": {
"width": 640,
"height": 1137
},
"videoViewCount": 19224,
"success": true
}
]
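Because the sample record above nests `dimensions`, exporting to a spreadsheet needs a small flattening step. The sketch below uses only the Python standard library. The column selection and the helper names `flatten` and `to_csv` are illustrative assumptions, not part of the project.

```python
import csv
import io

# Illustrative column choice; any subset of the output fields would work.
COLUMNS = [
    "username", "postUrl", "mediaType", "likes", "comments",
    "videoViewCount", "timestamp", "width", "height",
]


def flatten(record):
    """Copy top-level fields and lift dimensions.width/height to the top level."""
    dims = record.get("dimensions") or {}
    row = {key: record.get(key, "") for key in COLUMNS if key not in ("width", "height")}
    row["width"] = dims.get("width", "")
    row["height"] = dims.get("height", "")
    return row


def to_csv(records):
    """Render a list of output records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(flatten(record))
    return buffer.getvalue()
```

Missing fields, such as `videoViewCount` on image posts, come out as empty cells rather than raising an error.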
instagram-media-summary-scraper/
├── src/
│ ├── main.py
│ ├── scraper/
│ │ ├── profile_fetcher.py
│ │ ├── post_parser.py
│ │ └── media_resolver.py
│ ├── utils/
│ │ ├── time_utils.py
│ │ └── http_client.py
│ └── config/
│ └── settings.example.json
├── data/
│ ├── input.sample.json
│ └── output.sample.json
├── requirements.txt
└── README.md
- Social media managers use it to analyze recent posts across multiple accounts, so they can compare engagement performance quickly.
- Researchers use it to collect public Instagram data, so they can study content trends and user interaction patterns.
- Developers use it to integrate Instagram media into applications, so they can automate content aggregation workflows.
- Students use it for academic projects, so they can work with real-world social media datasets.
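As an illustration of the engagement comparison use case above, the following sketch groups output records by `username` and averages `likes` and `comments`. The function name `average_engagement` and the shape of the returned summary are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def average_engagement(records):
    """Return {username: {"posts": n, "avg_likes": x, "avg_comments": y}}."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["username"]].append(record)

    summary = {}
    for username, posts in grouped.items():
        summary[username] = {
            "posts": len(posts),
            "avg_likes": mean(post["likes"] for post in posts),
            "avg_comments": mean(post["comments"] for post in posts),
        }
    return summary
```

With at most 12 posts per username, these averages describe only the most recent activity of each account, not its long-term performance.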
Does this work with private Instagram accounts? No, only publicly accessible profiles are supported. Private accounts are intentionally excluded.
How many posts can be extracted per user? Up to 12 recent posts per username are collected in each run to keep extraction lightweight.
Are carousel posts fully extracted? Carousel posts return the cover image as a basic representation. Individual carousel items are not fully expanded.
Do media URLs expire over time? Yes, media download links may expire and should be used or stored shortly after extraction.
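Since media links may expire, a consumer can use `scrapedAt` to decide whether a stored link is still worth trying. The sketch below is only a heuristic: the six-hour window is a made-up placeholder, because the actual lifetime of a link is not documented here, and `is_probably_fresh` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness window; the real link lifetime is not stated in this README.
ASSUMED_LINK_LIFETIME = timedelta(hours=6)


def is_probably_fresh(scraped_at, now=None, lifetime=ASSUMED_LINK_LIFETIME):
    """Return True if a record scraped at `scraped_at` is within the assumed window.

    `scraped_at` is the ISO 8601 string from the record's scrapedAt field.
    `now` can be injected for testing and defaults to the current UTC time.
    """
    scraped = datetime.fromisoformat(scraped_at.replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return now - scraped <= lifetime
```

Records that fail the check are candidates for re-scraping rather than for a download attempt.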
Primary Metric: Extraction typically completes within seconds for runs of 5–10 public profiles under normal conditions.
Reliability Metric: Consistently achieves a high success rate on public accounts with valid usernames.
Efficiency Metric: Minimal resource usage due to limited post count and streamlined requests.
Quality Metric: Captures core engagement and media data accurately for the most recent posts.
