rileycleavenger/SeleniumWebScraper-Docker

Selenium Web Scraper for Docker

I created this for developers who want to use Selenium's WebDriver to scrape the web from inside a Docker container. It is useful for automating web scraping tasks with an external scheduler. The example provided in app.py accepts HTTP GET requests at <insert docker url>/<insert website to scrape url> and returns a text file containing the requested page's source code. Ideally, this setup will serve as an example for anyone looking to utilize Selenium and Docker in their web scraping processes.
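To make the request flow above concrete, here is a minimal sketch of a service with the same shape: a GET handler that takes the target URL from the request path, loads it in headless Chrome, and returns the page source as a text attachment. This is not the repository's app.py. It uses Python's standard http.server, and the names normalize_target, selenium_page_source, make_handler, and serve, the Chrome flags, the default https:// scheme, and the page_source.txt filename are all assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Callable, Optional


def normalize_target(path: str) -> Optional[str]:
    """Turn a request path such as '/example.com' into a URL to scrape."""
    target = path.lstrip("/")
    if not target:
        return None
    # Assumption: default to https:// when the caller omits the scheme.
    if not target.startswith(("http://", "https://")):
        target = "https://" + target
    return target


def selenium_page_source(url: str) -> str:
    """Load `url` in headless Chrome and return the rendered page source."""
    # Imported lazily so the URL helper can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    options.add_argument("--no-sandbox")             # often needed when Chrome runs as root in a container
    options.add_argument("--disable-dev-shm-usage")  # works around a small /dev/shm in containers
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


def make_handler(fetch: Callable[[str], str]):
    """Build a request handler that scrapes with the given `fetch` function."""

    class ScrapeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            target = normalize_target(self.path)
            if target is None:
                self.send_error(400, "Missing target URL")
                return
            try:
                body = fetch(target).encode("utf-8")
            except Exception as exc:
                self.send_error(502, "Scrape failed", str(exc))
                return
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Disposition", 'attachment; filename="page_source.txt"')
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return ScrapeHandler


def serve(port: int = 8080) -> None:
    """Listen on all interfaces so the port can be published from a container."""
    ThreadingHTTPServer(("0.0.0.0", port), make_handler(selenium_page_source)).serve_forever()
```

Calling serve() starts the listener on port 8080. Passing the fetch function into make_handler keeps the HTTP layer separate from Selenium, so the handler can be exercised with a stub that returns fixed text.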

How To Use

Command Line Initialization

git clone https://github.com/rileycleavenger/SeleniumWebScraper-Docker.git
cd SeleniumWebScraper-Docker
./build.sh

Simple Scraping

Using a web browser, visit localhost:8080/<insert website to scrape url> to get back a .txt file containing the page's source code.
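The same request can be made from a script, which is how an external scheduler would typically call the container. The sketch below uses only the Python standard library. The function names scrape_endpoint and save_page_source, the 60-second timeout, and the output path are hypothetical, and whether the target needs a scheme such as https:// depends on how app.py parses the path.

```python
from pathlib import Path
from urllib.request import urlopen


def scrape_endpoint(target: str, base: str = "http://localhost:8080") -> str:
    """Build the request URL: the container's base URL followed by the site to scrape."""
    return f"{base.rstrip('/')}/{target.lstrip('/')}"


def save_page_source(target: str, out_path: str,
                     base: str = "http://localhost:8080",
                     timeout: float = 60.0) -> int:
    """Request the page source from the running container and write it to `out_path`.

    Returns the number of bytes written.
    """
    # A generous timeout, since the container has to start a browser and load the page.
    with urlopen(scrape_endpoint(target, base), timeout=timeout) as response:
        data = response.read()
    Path(out_path).write_bytes(data)
    return len(data)
```

With the container running, a scheduled job could call save_page_source("example.com", "example.txt") and then process the saved file.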

About

simple open source docker setup for web scraping with selenium
