
A Python-based machine learning project that performs user profiling and segmentation using clustering techniques for personalized advertising.


nandanarnandu/user_profiling_and_segmentation


🧠📊 AI-Powered User Profiling & Segmentation

Python Flask scikit-learn pandas License: MIT

A complete Flask-based machine learning app that profiles and segments users using demographics, behavior, and interests to power targeted advertising strategies. Built with Python, Flask, Pandas, and Scikit-learn.


✨ Features

  • 🔐 Upload & Manage Data
    Import CSVs with demographic, behavioral, and interest features

  • 🔎 EDA at a Glance
    Summary stats, missing value report, distributions, and correlations

  • 🧼 Preprocessing Pipeline
    Scaling (Standard/MinMax), one-hot encoding, and feature selection

  • 🧩 Clustering (K-Means)
    Configurable K, inertia/Elbow and Silhouette diagnostics

  • 🗺️ Segment Insights & Labels
    Auto-generated segment summaries (e.g., “Weekend Warriors”, “Engaged Professionals”, “Budget Browsers”)

  • 📊 Visual Analytics
    Radar charts for segment profiles, cluster counts, PCA 2D plot
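The "EDA at a Glance" step can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the helper name `eda_report` and the synthetic values are assumptions, and only the column names come from the Dataset section.

```python
import numpy as np
import pandas as pd


def eda_report(df: pd.DataFrame) -> dict:
    """Collect summary stats, missing-value counts, and numeric correlations."""
    numeric = df.select_dtypes(include="number")
    return {
        "summary": df.describe(include="all"),
        "missing": df.isna().sum().sort_values(ascending=False),
        "correlations": numeric.corr(),
    }


# Tiny synthetic frame (made-up values) just to exercise the helper
demo = pd.DataFrame({
    "age": [25, 34, np.nan, 41],
    "time_spent_weekday": [1.5, 2.0, 3.5, 0.5],
    "ctr": [0.02, 0.05, 0.04, 0.01],
    "device_type": ["mobile", "desktop", "mobile", None],
})
report = eda_report(demo)
print(report["missing"])
print(report["correlations"].round(2))
```

Distribution plots are left out here; `df.hist()` or a Plotly histogram would be a natural next step.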


🚀 Quick Start

# Clone and setup
git clone https://github.com/nandanarnandu/user_profiling_and_segmentation.git
cd user_profiling_and_segmentation

python -m venv venv
# Linux/Mac:
source venv/bin/activate
# Windows (PowerShell):
# .\venv\Scripts\Activate.ps1

# Install dependencies
pip install -r requirements.txt

# (Optional) set environment variables
# Linux/Mac:
export FLASK_APP=app.py
export FLASK_DEBUG=1  # FLASK_ENV was removed in Flask 2.3
# Windows (PowerShell):
# $env:FLASK_APP="app.py"
# $env:FLASK_DEBUG="1"

# Run the app
flask run

# Open http://127.0.0.1:5000

📂 Dataset

The dataset includes demographic, behavioral, and interest-based attributes for ad users.

Typical Columns

⦁ Demographics: age, gender, income_level

⦁ Device & Usage: device_type, time_spent_weekday, time_spent_weekend

⦁ Engagement: likes, reactions, ctr (click-through rate)

⦁ Interests: top_interests (or one-hot encoded interest columns)

Place your CSV inside data/ (e.g., data/ad_users.csv) or upload it via the web UI.
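As a rough sketch of loading such a file, the snippet below reads a CSV with pandas and checks that the typical columns are present. The helper `load_users`, the `EXPECTED_COLUMNS` list, and the two sample rows are illustrative assumptions; the real dataset may use different names or value formats.

```python
import io

import pandas as pd

# Column names taken from the "Typical Columns" list; adjust to your CSV
EXPECTED_COLUMNS = [
    "age", "gender", "income_level", "device_type",
    "time_spent_weekday", "time_spent_weekend",
    "likes", "reactions", "ctr", "top_interests",
]


def load_users(source) -> pd.DataFrame:
    """Read a user CSV and fail early if an expected column is absent."""
    df = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"CSV is missing expected columns: {missing}")
    return df


# In-memory stand-in for data/ad_users.csv (made-up rows)
sample_csv = io.StringIO(
    "age,gender,income_level,device_type,time_spent_weekday,"
    "time_spent_weekend,likes,reactions,ctr,top_interests\n"
    '29,Female,mid,mobile,2.5,4.0,120,35,0.045,"travel, food"\n'
    "41,Male,high,desktop,3.0,1.5,60,12,0.081,finance\n"
)
users = load_users(sample_csv)
print(users.dtypes)
```

In practice you would pass a path such as `data/ad_users.csv` instead of the in-memory buffer.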

📊 Features Used

⦁ Age, Gender, Income Level

⦁ Device Usage

⦁ Time Spent Online (Weekday & Weekend)

⦁ Likes and Reactions

⦁ Click-Through Rate (CTR)

⦁ Top Interests
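One plausible way to turn these features into a model-ready matrix is a scikit-learn `ColumnTransformer` that scales the numeric columns and one-hot encodes the categorical ones. This is a sketch under assumptions: it treats `age` as numeric, leaves `top_interests` out, and the `NUMERIC`/`CATEGORICAL` split and sample values are made up for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed split of the feature columns; adapt to the actual dataset
NUMERIC = ["age", "time_spent_weekday", "time_spent_weekend",
           "likes", "reactions", "ctr"]
CATEGORICAL = ["gender", "income_level", "device_type"]


def build_preprocessor() -> ColumnTransformer:
    """Standard-scale numeric columns and one-hot encode categorical ones."""
    return ColumnTransformer(
        transformers=[
            ("num", StandardScaler(), NUMERIC),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ],
        sparse_threshold=0,  # always return a dense array
    )


# Made-up rows for demonstration only
users = pd.DataFrame({
    "age": [23, 35, 47, 52],
    "time_spent_weekday": [1.0, 2.5, 3.0, 0.5],
    "time_spent_weekend": [4.0, 2.0, 1.5, 3.5],
    "likes": [120, 60, 30, 80],
    "reactions": [35, 12, 8, 20],
    "ctr": [0.045, 0.081, 0.020, 0.033],
    "gender": ["Female", "Male", "Female", "Male"],
    "income_level": ["low", "mid", "high", "mid"],
    "device_type": ["mobile", "desktop", "mobile", "mobile"],
})
X = build_preprocessor().fit_transform(users)
print(X.shape)
```

Swapping `StandardScaler` for `MinMaxScaler` gives the MinMax variant listed under Features.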

🔍 Techniques Applied

⦁ Exploratory Data Analysis (EDA)

⦁ Data Preprocessing (Scaling & Encoding)

⦁ K-Means Clustering

⦁ Segment Analysis & Visualization (Radar Charts & PCA)
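The K-Means step with Elbow and Silhouette diagnostics might look like the sketch below. It runs on synthetic blobs rather than the ad-user data, and `kmeans_diagnostics` is a hypothetical helper, not a function from this repository.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def kmeans_diagnostics(X, k_values, random_state=42):
    """Fit K-Means for each k and record inertia and silhouette score."""
    results = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        results.append({
            "k": k,
            "inertia": model.inertia_,                    # for the Elbow plot
            "silhouette": silhouette_score(X, labels),    # higher is better
        })
    return results


# Synthetic, well-separated clusters standing in for preprocessed user data
centers = [[0, 0], [10, 0], [0, 10], [10, 10]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=0.5,
                  random_state=0)

diagnostics = kmeans_diagnostics(X, range(2, 8))
best = max(diagnostics, key=lambda r: r["silhouette"])
for row in diagnostics:
    print(row)
print("k with the highest silhouette:", best["k"])
```

On real data the two diagnostics often disagree, so it is worth reading both plots before fixing K.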

📌 Segment Examples

⦁ Weekend Warriors: high weekend activity, mobile-first

⦁ Engaged Professionals: higher income, strong CTR on weekdays

⦁ Budget Browsers: low spend, moderate engagement
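How segment names like these get assigned is not specified here. One simple approach is to average each metric per cluster and apply rule-of-thumb thresholds, as sketched below. The thresholds in `suggest_name` and the sample values are purely hypothetical and would need tuning against real cluster profiles.

```python
import pandas as pd


def summarize_segments(df: pd.DataFrame, label_col: str = "cluster") -> pd.DataFrame:
    """Mean of every numeric metric per cluster, plus the cluster size."""
    summary = df.groupby(label_col).mean(numeric_only=True)
    summary["size"] = df.groupby(label_col).size()
    return summary


def suggest_name(row: pd.Series) -> str:
    """Hypothetical naming rules; thresholds are illustrative only."""
    if row["time_spent_weekend"] > 1.5 * row["time_spent_weekday"]:
        return "Weekend Warriors"
    if row["ctr"] >= 0.05:
        return "Engaged Professionals"
    return "Budget Browsers"


# Made-up users that already carry a cluster label
users = pd.DataFrame({
    "cluster": [0, 0, 1, 1, 2, 2],
    "time_spent_weekday": [1.0, 1.2, 3.0, 3.4, 1.5, 1.7],
    "time_spent_weekend": [4.0, 4.4, 2.0, 2.2, 1.6, 1.8],
    "ctr": [0.02, 0.03, 0.08, 0.09, 0.02, 0.03],
})
summary = summarize_segments(users)
summary["name"] = summary.apply(suggest_name, axis=1)
print(summary)
```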

📈 Output

⦁ Radar chart visualizing 5 user segments across online behavior and interaction metrics

⦁ Cluster diagnostics: Elbow (inertia) & Silhouette score plots

⦁ Segment summary table with key stats per cluster
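Radar charts need every metric on a shared scale, so a common preparation step is to min-max normalize the per-segment means and compute one angle per axis. The sketch below shows only that data preparation; `radar_values`, `radar_angles`, and the sample numbers are assumptions, and the plotting call itself is omitted.

```python
import numpy as np
import pandas as pd


def radar_values(segment_means: pd.DataFrame) -> pd.DataFrame:
    """Min-max scale each metric across segments so all axes span 0 to 1."""
    span = segment_means.max() - segment_means.min()
    span = span.replace(0, 1)  # avoid dividing by zero for constant metrics
    return (segment_means - segment_means.min()) / span


def radar_angles(n_axes: int) -> np.ndarray:
    """Evenly spaced angles, with the first repeated to close the polygon."""
    angles = np.linspace(0, 2 * np.pi, n_axes, endpoint=False)
    return np.append(angles, angles[0])


# Made-up per-segment means for three metrics
segment_means = pd.DataFrame(
    {
        "time_spent_weekday": [1.0, 3.0, 2.0],
        "likes": [10, 50, 30],
        "ctr": [0.02, 0.08, 0.05],
    },
    index=["Segment 0", "Segment 1", "Segment 2"],
)
scaled = radar_values(segment_means)
angles = radar_angles(scaled.shape[1])
print(scaled)
```

Each row of `scaled` (with its first value appended again) can then be drawn against `angles` on a Matplotlib polar axis or a Plotly `Scatterpolar` trace.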

🛠️ Tech Stack

⦁ Backend: Python, Flask

⦁ ML/DS: Pandas, NumPy, Scikit-learn

⦁ Visualization: Matplotlib, Plotly (optional)

⦁ Utilities: joblib (model persistence), python-dotenv (env vars)
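Model persistence with joblib could work along these lines. This is a sketch on synthetic data: the pipeline layout and the file name `segmenter.joblib` are hypothetical, not taken from the repository.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed user features
X, _ = make_blobs(n_samples=120, centers=3, random_state=0)

# Keep scaling and clustering together so they are saved as one object
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
pipeline.fit(X)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "segmenter.joblib"  # hypothetical file name
    joblib.dump(pipeline, model_path)
    restored = joblib.load(model_path)

# The restored pipeline should assign every point to the same cluster
same = bool((restored.predict(X) == pipeline.predict(X)).all())
print("predictions match after reload:", same)
```

Persisting the whole pipeline, rather than the K-Means model alone, keeps the scaler's fitted parameters in sync with the cluster centers.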

💡 Contributions, issues, and feature requests are welcome!

