Power BI dashboard analyzing YouTube trending data, combines storytelling visuals, regression-based prediction, and clustering segmentation to uncover engagement patterns and optimal posting times.

9eek9/YouTube_Trending_Analytics

📺 YouTube Trending Analytics: Insights, Prediction & Segmentation with Power BI

This project explores YouTube video performance using data-driven storytelling and machine learning inside Power BI.
It combines data visualization, predictive modeling, and clustering analysis to uncover what makes videos trend across regions and time zones.


🎯 Objective

The goal of this project is to understand what drives a YouTube video's success - from engagement metrics like likes and views to publishing time and regional trends.
By applying descriptive analytics, linear regression, and K-Means clustering, this dashboard turns raw YouTube metrics into meaningful insights and recommendations for content creators and marketers.


🧩 Dataset Overview

| Attribute | Description |
| --- | --- |
| Video ID | Unique identifier for each video |
| Title | Title of the YouTube video |
| Channel | Channel or creator name |
| Views | Total number of video views |
| Likes | Total number of likes received |
| Region | Country/region code (e.g., IN, US) |
| Published | Date & time of upload |

  • 📦 Source: YouTube Data API v3 (via REST API / Power BI web connector)
  • 📊 Size: ~800 rows × 7 columns
  • 🕐 Data refreshed dynamically via API or batch queries
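
As a rough illustration of how one API record could map onto the seven columns above, here is a minimal sketch. It assumes the rows come from the `videos.list` endpoint requested with `part=snippet,statistics`; the project does not say which endpoint or query it used, and the helper name `flatten_video_item` is hypothetical.

```python
from typing import Any


def flatten_video_item(item: dict[str, Any], region: str) -> dict[str, Any]:
    """Map one `videos.list` item to the seven dataset columns (assumed mapping)."""
    snippet = item.get("snippet", {})
    stats = item.get("statistics", {})
    return {
        "Video ID": item.get("id"),
        "Title": snippet.get("title"),
        "Channel": snippet.get("channelTitle"),
        # The API returns counts as strings; likeCount may be absent when hidden.
        "Views": int(stats.get("viewCount", 0)),
        "Likes": int(stats.get("likeCount", 0)),
        # The region is the code used in the request, not a field of the item.
        "Region": region,
        "Published": snippet.get("publishedAt"),
    }


# Hand-written sample shaped like an API item (not real data).
sample_item = {
    "id": "abc123",
    "snippet": {
        "title": "Example video",
        "channelTitle": "Example channel",
        "publishedAt": "2024-01-01T01:30:00Z",
    },
    "statistics": {"viewCount": "1000", "likeCount": "50"},
}
row = flatten_video_item(sample_item, region="IN")
```

Keeping the mapping in a pure function means it can be checked without network access or an API key.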

⚙️ Data Preparation & Modeling

Data cleaning and feature engineering were performed using Power Query and Python scripts embedded inside Power BI:

  • Extracted Publish Hour from timestamp to study time-based engagement.
  • Created Engagement Rate metric → Likes / Views.
  • Applied log-transformed scaling for numeric stability.
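
The three steps above can be sketched in pandas roughly as follows. This is a sketch under assumptions, not the project's actual script: the column names follow the dataset table, `add_features` is a hypothetical helper, and the choice of `log1p` on Views and Likes is a guess at what "log-transformed scaling" covers.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add Publish Hour, Engagement Rate, and log-scaled columns (assumed names)."""
    out = df.copy()

    # Parse the upload timestamp; the hour is taken in UTC here.
    published = pd.to_datetime(out["Published"], utc=True)
    out["Publish Hour"] = published.dt.hour

    # Engagement Rate = Likes / Views, set to 0 where a video has no views.
    views = out["Views"].astype("float64")
    out["Engagement Rate"] = (out["Likes"] / views).where(views > 0, 0.0)

    # log1p keeps zero counts finite and compresses the heavy right tail.
    out["Log Views"] = np.log1p(out["Views"])
    out["Log Likes"] = np.log1p(out["Likes"])
    return out
```

Inside a Power Query "Run Python script" step the input table is exposed as a DataFrame named `dataset`, so the call would look like `dataset = add_features(dataset)`. Because the hour is computed in UTC, region-local posting times would need a time-zone conversion first.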

🧠 Machine Learning Models

| Algorithm | Purpose | Description |
| --- | --- | --- |
| Linear Regression | Predictive | Predicts views based on Likes, Region, and Publish Hour using scikit-learn |
| K-Means Clustering | Segmentation | Groups videos into High, Medium, and Low performers based on engagement |
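
One way the regression row could be wired up with scikit-learn is shown below. It is a hedged sketch: the one-hot encoding of Region, the log transform of Likes and Views, and the helper names `build_view_model` and `fit_and_predict` are assumptions for illustration, not details confirmed by the project.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def build_view_model() -> Pipeline:
    """Linear regression with Region one-hot encoded and numeric columns passed through."""
    prep = ColumnTransformer(
        transformers=[("region", OneHotEncoder(handle_unknown="ignore"), ["Region"])],
        remainder="passthrough",
    )
    return Pipeline([("prep", prep), ("reg", LinearRegression())])


def fit_and_predict(df: pd.DataFrame) -> pd.Series:
    """Fit on log-scaled counts and return predictions on the original view scale."""
    X = pd.DataFrame({
        "Region": df["Region"],
        "Log Likes": np.log1p(df["Likes"]),
        "Publish Hour": df["Publish Hour"],
    })
    y = np.log1p(df["Views"])  # assumed: the target is modeled in log space
    model = build_view_model().fit(X, y)
    # expm1 undoes log1p so the output is comparable with actual Views.
    return pd.Series(np.expm1(model.predict(X)), index=df.index, name="Predicted Views")
```

This version predicts on the same rows it was fitted on, which is enough for an Actual vs Predicted visual but overstates accuracy; a held-out split would give a fairer estimate. Publish Hour is also treated as a plain number here, which ignores that hour 23 and hour 0 are adjacent.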

📊 Dashboard Design

The Power BI dashboard consists of three main sections:
Data Storytelling, Data Art, and Data Showcasing.

📘 Data Storytelling Visuals

  • Scatter Plot (Likes vs Views) → Shows engagement correlation
  • Bar Chart (Top 10 Channels by Views) → Highlights leading creators
  • Heatmap (Publish Hour vs Avg Views) → Reveals peak posting hours
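
The aggregations behind the bar chart and heatmap can be approximated in pandas as below. This is only a sketch: Power BI would normally compute these with its own aggregation or DAX measures, and splitting the heatmap rows by Region is an assumption, as are the helper names.

```python
import pandas as pd


def top_channels(df: pd.DataFrame, n: int = 10) -> pd.Series:
    """Total views per channel, largest first (data for the bar chart)."""
    return df.groupby("Channel")["Views"].sum().nlargest(n)


def hourly_avg_views(df: pd.DataFrame) -> pd.DataFrame:
    """Average views per Region (rows) and Publish Hour (columns) for the heatmap."""
    return df.pivot_table(
        index="Region", columns="Publish Hour", values="Views", aggfunc="mean"
    )
```

Cells with no uploads for a given region and hour come back as NaN rather than 0, which keeps empty hours distinguishable from low-performing ones.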

🎨 Data Art Visual

  • Treemap (Channel Contribution by Region) → Visualizes view share and geographic dominance

💡 Data Showcasing Visuals

  • Regression Model (Actual vs Predicted Views) → Evaluates model accuracy
  • K-Means Segmentation (Video Performance) → Groups content by performance level
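
A minimal sketch of the segmentation step follows. It assumes clustering on log-scaled Views plus Engagement Rate and names the three clusters by their mean engagement rate; the project only says the grouping is "based on engagement", so the feature choice and the helper `segment_videos` are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def segment_videos(df: pd.DataFrame, random_state: int = 0) -> pd.Series:
    """Assign each video a Low/Medium/High label from a 3-cluster K-Means fit."""
    feats = pd.DataFrame({
        "Log Views": np.log1p(df["Views"]),
        "Engagement Rate": df["Engagement Rate"],
    })
    # Standardize so neither feature dominates the Euclidean distance.
    X = StandardScaler().fit_transform(feats)
    labels = KMeans(n_clusters=3, n_init=10, random_state=random_state).fit_predict(X)

    # Cluster ids are arbitrary, so rank clusters by mean engagement rate.
    order = feats["Engagement Rate"].groupby(labels).mean().sort_values().index
    names = dict(zip(order, ["Low", "Medium", "High"]))
    return pd.Series(labels, index=df.index).map(names).rename("Segment")
```

Ranking the clusters after fitting is what turns K-Means' arbitrary integer ids into stable labels that a Power BI legend can use.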

📈 Key Insights & Findings

  • 🔹 Likes strongly correlate with views, which is consistent with engagement driving visibility.
  • 🔹 T-Series dominates the global landscape with over 34M views.
  • 🔹 Best publishing times: 1 AM–3 AM (India region) for maximum reach.
  • 🔹 Regression model captures general view patterns, though extreme viral videos deviate from its predictions.
  • 🔹 K-Means clustering reveals clear High/Medium/Low performer segments for strategic content targeting.

🎯 Recommendations

  • Focus on high-engagement formats - niche content often yields loyal audiences.
  • Post during region-specific peak hours to boost reach.
  • Benchmark against top creators (e.g., T-Series, Universal Music India).
  • Use clustering insights to tailor optimization strategies for underperforming videos.

🧠 Tools & Technologies

  • Microsoft Power BI Desktop
  • Python (scikit-learn, pandas) via Power Query scripting
  • DAX & Power Query for data transformation
  • K-Means & Linear Regression for ML integration

🖼️ Dashboard Preview

(Add screenshots or exported visuals from your Power BI dashboard here)

| Visual | Description |
| --- | --- |
| Scatter Plot | Likes vs Views correlation |
| Heatmap | Optimal publish hours |
| Cluster Chart | Performance segmentation |

🔮 Future Work

  • Add comment sentiment and keyword trend analysis.
  • Include watch time, shares, and comments metrics for deeper engagement modeling.
  • Test Gradient Boosting and Neural Networks for improved prediction accuracy.
  • Extend dataset to multi-year timeframes for trend forecasting.

👩‍💻 Author

Ei Ei Khaing
Graduate Certificate in Artificial Intelligence & Machine Learning | Fanshawe College
📧 ellenkhaing@gmail.com

