An exploratory data analysis (EDA) of Taylor Swift's Spotify presence. This project examines the relationship between song popularity and technical track attributes using a Kaggle Spotify dataset.
The objective was to determine whether song duration influences popularity and to analyze how tracks are distributed across her discography.
- Language: Python
- Libraries: Pandas, Matplotlib, NumPy
- Dataset: Spotify top artist tracks (Kaggle)
- Correlation Result: Track duration and popularity are only weakly correlated, which suggests that Taylor Swift's audience engagement is driven by factors other than song length.
- Artist Dominance: The data shows a high density of tracks with popularity scores above 75, suggesting consistently strong listener interest.
- Data Cleaning: Performed preprocessing to handle missing values and formatted the duration metrics for accurate statistical plotting.
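
The findings above can be reproduced with a few lines of pandas. The following is a minimal sketch, not the notebook's actual code: the column names `duration_ms` and `popularity` are assumptions about the dataset's schema, the helper names `clean_tracks` and `summarize` are hypothetical, and Pearson correlation is assumed as the correlation measure.

```python
import pandas as pd


def clean_tracks(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows missing either metric and add the duration in minutes.

    Assumes columns named "duration_ms" and "popularity" (hypothetical names;
    adjust them to match the actual dataset).
    """
    cleaned = df.dropna(subset=["duration_ms", "popularity"]).copy()
    cleaned["duration_min"] = cleaned["duration_ms"] / 60_000
    return cleaned


def summarize(df: pd.DataFrame, threshold: int = 75) -> dict:
    """Return the track count, the duration/popularity correlation,
    and the share of tracks whose popularity exceeds the threshold."""
    cleaned = clean_tracks(df)
    return {
        "n_tracks": len(cleaned),
        # Series.corr defaults to the Pearson correlation coefficient.
        "pearson_r": cleaned["duration_min"].corr(cleaned["popularity"]),
        # The mean of a boolean Series is the fraction of True values.
        "share_above_threshold": (cleaned["popularity"] > threshold).mean(),
    }
```

Under these assumptions, `summarize(pd.read_csv("spotify_data_clean.csv"))` would return the three summary figures for the cleaned dataset. A `pearson_r` value close to zero would correspond to the weak relationship described above.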
- Clone the repository.
- Ensure you have `pandas` and `matplotlib` installed.
- Run `spotify_analysis.ipynb` to view the visualizations.
- `spotify_analysis.ipynb`: The main analysis and visualization notebook.
- `spotify_data_clean.csv`: Preprocessed dataset used for the analysis.