Skip to content

Exploratory Data Analysis (EDA) of Netflix’s movie & TV show dataset using Python — uncovering trends, genres, ratings, global distribution, and content evolution.

Notifications You must be signed in to change notification settings

amaanuddin9318/Netflix-Data-Analysis-Project

Repository files navigation

🎬 Netflix Data Analysis – Exploratory Data Analysis (EDA)

This project performs an in-depth Exploratory Data Analysis (EDA) on the Netflix Titles dataset. It uncovers meaningful insights about content trends, genres, ratings, durations, and global distribution to support content strategy and decision-making.

📌 Project Overview

This project aims to:

Analyze Netflix’s movie and TV show catalog

Study genre dominance and viewer preferences

Understand rating distributions and audience segmentation

Explore global content contributions (countries producing the most titles)

Analyze content release trends across years

Examine movie durations and TV show season counts

Identify patterns useful for strategy, recommendations, and content planning

📁 Dataset

Source: Netflix Titles dataset

Format: CSV

Records: 8,800+ titles

Key Features:

type, title, director, cast

country, rating, listed_in (genres)

duration, release_year

date_added

Cleaning steps performed:

Removed duplicates

Filled missing values in country, rating, date_added

Extracted numeric duration

Split genres

Converted dates to proper datetime formats

🛠 Technologies Used

Python

Pandas – data cleaning & manipulation

NumPy – numerical operations

Matplotlib & Seaborn – data visualization

Jupyter Notebook – analysis environment

📊 Key Insights Extracted 🔹 1. Genre Distribution

Drama, Comedy, Documentaries dominate Netflix’s library

Genre diversity is high, supporting a broad audience base

🔹 2. Release Year Trends

Major surge in content after 2015

Indicates rising investment in originals and acquisitions

🔹 3. Country Contributions

USA leads significantly

India, UK, Canada, Japan follow

Strong global expansion patterns observed

🔹 4. Content Ratings

TV-MA & TV-14 dominate → mature audience focus

🔹 5. Durations & Engagement

Movies mostly between 80–120 minutes

TV Shows vary; many have 1–2 seasons (binge-friendly)

📈 Visualizations Included

Genre bar charts

Release year trends

Country-wise content distribution

Rating distribution

Movie duration histograms

TV seasons count

Genre evolution over years

Ratings vs Duration correlations

Monthly/Yearly addition trends

📘 Project Files in Repository File Description netflix_data_analysis.ipynb Full Python analysis & visualizations Netflix_Data_Analysis_Presentation.pdf Clean presentation summarizing insights netflix_titles.csv (optional) Dataset used for analysis README.md Documentation 🚀 Future Improvements

Add clustering for viewer preference segmentation

Build a recommendation engine

Integrate user watch-time / engagement data

Dashboard using Power BI / Tableau

💡 Conclusion

This analysis provides strong insights into Netflix’s evolving content library, enabling better content strategy decisions, audience targeting, global planning, and recommendation systems.

👤 Author

Amaan Uddin Data Analyst | Python | SQL | Visualization

About

Exploratory Data Analysis (EDA) of Netflix’s movie & TV show dataset using Python — uncovering trends, genres, ratings, global distribution, and content evolution.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published