novamind/dataparsing

Lyrics Scraper and Text Analysis

A Python project for scraping song lyrics from AZLyrics and performing basic text analysis on the collected data. This project includes data collection, cleaning, tokenization, and simple statistics about the lyrics of different artists.

Note: This repository supported the research and writing of a personal blog post analyzing differences in lyrical style between artists from different genres.

Project Overview

Song lyrics are rich in emotion, storytelling, and linguistic patterns. This project demonstrates how to:

  1. Scrape song lyrics from AZLyrics using BeautifulSoup.
  2. Collect metadata like album info and song details.
  3. Tokenize and analyze lyrics to understand patterns in language.
  4. Compare lyrics across genres (pop vs. rap).

We focus on Eminem (rap) and Miley Cyrus (pop) as case studies.
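
Steps 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual scraper: the HTML snippet, the class names (`song-title`, `album`, `lyrics`), and the `parse_song_page` helper are all hypothetical, and the real AZLyrics page layout should be inspected before writing selectors. The example parses an inline string so that it runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real AZLyrics page
# layout is an unknown here and will likely need different selectors.
SAMPLE_HTML = """
<html><body>
  <h1 class="song-title">Example Song</h1>
  <div class="album">Example Album</div>
  <div class="lyrics">First line of the song<br>Second line of the song</div>
</body></html>
"""


def parse_song_page(html: str) -> dict:
    """Extract title, album, and lyrics from one (assumed) song page."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.find("h1", class_="song-title")
    album = soup.find("div", class_="album")
    lyrics = soup.find("div", class_="lyrics")
    return {
        # Fall back to None / empty string when an element is missing
        "title": title.get_text(strip=True) if title else None,
        "album": album.get_text(strip=True) if album else None,
        # Join the text fragments around <br> tags with newlines
        "lyrics": lyrics.get_text(separator="\n", strip=True) if lyrics else "",
    }


song = parse_song_page(SAMPLE_HTML)
print(song["title"], "/", song["album"])
print(song["lyrics"])
```

In a real run, the HTML would come from an HTTP request to the song page rather than from a string, ideally with a delay between requests so the site is not hammered.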

Features

  • Web scraping: Pull song lyrics and metadata (title, album, featured artists).
  • Text preprocessing: Clean lyrics and tokenize words.
  • Basic statistics:
    • Total number of songs scraped
    • Vocabulary size (unique words)
    • Token frequency
    • Most common and rarest words
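
The statistics listed above can be computed with the standard library alone. The sketch below is one possible approach under stated assumptions: the `tokenize` and `lyric_stats` helpers, the regex-based tokenizer, and the sample lyrics are all made up for illustration, and "rare" is taken here to mean a word that occurs exactly once.

```python
import re
from collections import Counter


def tokenize(lyrics: str) -> list[str]:
    """Lowercase the text and keep words, allowing one inner apostrophe."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", lyrics.lower())


def lyric_stats(songs: list[str], top_n: int = 3) -> dict:
    """Summarize a list of lyric strings (one string per song)."""
    counts = Counter(tok for song in songs for tok in tokenize(song))
    return {
        "num_songs": len(songs),
        "vocab_size": len(counts),            # number of unique words
        "total_tokens": sum(counts.values()),
        "most_common": counts.most_common(top_n),
        # Assumption: "rare" means the word appears exactly once
        "rare_words": sorted(w for w, c in counts.items() if c == 1),
    }


# Made-up lyrics, not scraped data
sample_songs = ["La la la, sing along", "Sing it loud, sing it proud"]
stats = lyric_stats(sample_songs)
print(stats)
```

Running the same function once per artist gives comparable numbers, for example vocabulary size per artist, which is the kind of pop vs. rap comparison the project aims at.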

Next Steps

  • Sentiment analysis
  • Topic modeling
  • Visualizations (word clouds, token frequency plots)
  • Batch scraping of multiple artists
