forked from EddyLuten/domain-scrape
Scrapes the pages and resources on a domain, starting from the provided URL.
The local directory structure mimics the URL paths as closely as possible.
HTML pages are inspected for src and href attributes to find further pages and resources.
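
The behaviour described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actual code in scrape.py. The names LinkCollector, same_domain_urls and local_path are hypothetical, as are the choice to treat subdomains as part of the domain and the use of index.html for URLs that end in a slash.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the values of src and href attributes from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.links.append(value)


def same_domain_urls(html, page_url, domain):
    """Resolve collected links against page_url and keep those on `domain`."""
    collector = LinkCollector()
    collector.feed(html)
    urls = []
    for link in collector.links:
        absolute = urljoin(page_url, link)
        host = urlparse(absolute).hostname or ""
        # Assumption: subdomains count as part of the domain.
        if host == domain or host.endswith("." + domain):
            urls.append(absolute)
    return urls


def local_path(url, out_dir="."):
    """Map a URL to a file path under out_dir that mirrors the URL path."""
    path = urlparse(url).path
    if not path or path.endswith("/"):
        path += "index.html"  # assumed file name for directory-style URLs
    return os.path.join(out_dir, *path.lstrip("/").split("/"))
```

For example, same_domain_urls(page_html, "http://github.com/", "github.com") would return the absolute URLs of the links on that page that stay on github.com, and local_path("http://github.com/about/", "./github") would place that page under ./github/about/.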
Usage: scrape.py [OPTIONS] domain url
Options:
  -h, --help  show the help message and exit
  --out       output directory; defaults to the working directory if not provided
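
For reference, a command line of this shape could be parsed as in the sketch below. It uses argparse from the standard library and is only an assumption about how the arguments fit together; the parser actually used by scrape.py is not shown in this README, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser for: scrape.py [OPTIONS] domain url (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="scrape.py",
        description="Scrapes the pages and resources on a domain.",
    )
    # Assumption: --out takes a directory and defaults to the working directory.
    parser.add_argument("--out", default=".", help="output directory")
    parser.add_argument("domain", help="domain to scrape, e.g. github.com")
    parser.add_argument("url", help="URL to start scraping from")
    return parser


args = build_parser().parse_args(
    ["--out", "./github", "github.com", "http://github.com/"]
)
print(args.out, args.domain, args.url)
```

argparse adds the -h/--help option automatically, which matches the option list above.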
Examples:
Scrape the google.com domain, starting at http://google.com/:
python ./scrape.py google.com http://google.com/
Scrape the github.com domain, store in the provided directory:
python ./scrape.py --out ./github github.com http://github.com/