Spotify Web Scraper

Spotify Web Scraper crawls through the Spotify web interface and extracts artist logo information.

Technology Stack

Beautiful Soup - Parse HTML code
PyMongo - Python MongoDB Driver
Selenium - Browser Automation Tool

Dependencies

Install Dependencies with pip

pip install beautifulsoup4
pip install pymongo
pip install selenium

Alternatively, install everything listed in requirement.txt in one step:

pip install -r requirement.txt

Configuration Setup

The scraper looks for a config.py file at the root of the repository with the following parameters for connecting to your database:

# config.py

DATABASE_CONFIG = {
    'host': 'mongodb://{0}:{1}@MONGODB_HOST/DB_COLLECTION_NAME',
    'dbuser': '', # Database user
    'dbuserpassword': '', # Database password
}
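
The README does not show how these values are combined, so the following is only a minimal sketch. It assumes the {0} and {1} placeholders in 'host' are filled with 'dbuser' and 'dbuserpassword', in that order. The helper names build_mongo_uri and connect, and the example values, are hypothetical and not taken from scraper.py.

```python
from urllib.parse import quote_plus

# Hypothetical example values; replace them with your own settings.
DATABASE_CONFIG = {
    'host': 'mongodb://{0}:{1}@localhost:27017/spotify',
    'dbuser': 'scraper',
    'dbuserpassword': 'p@ss/word',
}


def build_mongo_uri(config):
    """Fill the {0} and {1} placeholders in 'host' with escaped credentials.

    Assumption: {0} is the user and {1} is the password.
    """
    # Characters such as '@' and '/' must be percent-encoded inside a URI.
    user = quote_plus(config['dbuser'])
    password = quote_plus(config['dbuserpassword'])
    return config['host'].format(user, password)


def connect(config):
    """Open a PyMongo client for the configured database."""
    # Imported here so the URI helper can be used without pymongo installed.
    from pymongo import MongoClient
    return MongoClient(build_mongo_uri(config))


if __name__ == '__main__':
    print(build_mongo_uri(DATABASE_CONFIG))
```

Escaping the credentials with quote_plus keeps passwords that contain reserved characters from breaking the connection string.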

Selenium Chrome Webdriver

Because Spotify's pages are generated dynamically by JavaScript, the raw page source does not contain the final content, so the scraper needs a headless browser to render each page first. I chose the Chrome WebDriver, which works well for this.

Download the Chrome WebDriver and save it at the root of the repository.
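
As a rough illustration of the render-then-parse flow described above, here is a minimal sketch. It assumes Selenium 4, a driver binary saved as ./chromedriver, and a placeholder URL. The function names render_page and extract_image_urls are hypothetical, and collecting every img tag is a stand-in for whatever selectors scraper.py actually uses to find artist logos.

```python
def render_page(url, driver_path='./chromedriver'):
    """Load a page in headless Chrome and return the rendered HTML."""
    # Imported here so the parsing helper can be used without selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.add_argument('--headless')  # run Chrome without a visible window
    driver = webdriver.Chrome(service=Service(driver_path), options=options)
    try:
        driver.get(url)
        # page_source reflects the DOM after the browser has run the scripts
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


def extract_image_urls(html):
    """Return the src attribute of every <img> tag that has one."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, 'html.parser')
    return [img.get('src') for img in soup.find_all('img') if img.get('src')]


if __name__ == '__main__':
    # Placeholder URL; the pages the scraper visits are not listed in this README.
    html = render_page('https://open.spotify.com/')
    for src in extract_image_urls(html):
        print(src)
```

Content that loads late may need an explicit wait before page_source is read; this sketch leaves that out for brevity.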

Author

Chris Yang
