# Google Ads Support Optimization: ML + LLM Hybrid System

A Data Science + LLM project simulating how Google's gTech teams optimize Ads customer operations.
## Overview

This project builds an end-to-end support ticket optimization pipeline inspired by real workflows inside Google Ads / gTech.

It demonstrates how machine learning, LLMs, and operational analytics can be combined to:

- classify support ticket severity
- extract semantic tags using lightweight LLM prompts
- compute a priority score blending ML, LLM, and business-impact signals
- enable smarter ticket triage, routing, and escalation

The goal is to show how data science and LLMs can directly improve customer support outcomes at scale, aligning with responsibilities in Google's Business Data Science (gDATA) and BizOps roles.
## Project Architecture

```
+------------------+
| Raw Support Data |
+------------------+
         |
         v
+------------------+
|  EDA + Cleaning  |
+------------------+
         |
         v
+-------------------------------+
|    ML Severity Classifier     |
| (TF-IDF + Logistic Regression)|
+-------------------------------+
         |
         v
+-------------------------------+
| LLM Issue Tagger (Groq LLaMA) |
|  - topic tags                 |
|  - urgency estimation         |
|  - concise summarization      |
+-------------------------------+
         |
         v
+-------------------------------+
|     Priority Score Engine     |
|  ML severity + LLM tags +     |
|  revenue impact (optional)    |
+-------------------------------+
         |
         v
+------------------+
|  Ranked Tickets  |
+------------------+
```
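The sketch below shows how these stages could be chained for a single ticket. The helper names (`tag_issue`, `compute_priority`) and their signatures are illustrative assumptions about the modules in `src/`, not their exact API.

```python
# Illustrative end-to-end flow for one ticket.
# `tag_issue` and `compute_priority` are assumed names, not necessarily the real src/ API.
import joblib

from src.llm_issue_module import tag_issue        # hypothetical: text -> dict of LLM tags
from src.priority_engine import compute_priority  # hypothetical: signals -> 0-100 score

ticket_text = "Conversions stopped tracking after our account was flagged for a policy review."

# 1. ML severity from the trained TF-IDF + Logistic Regression pipeline
severity_model = joblib.load("models/severity_classifier.pkl")
severity = severity_model.predict([ticket_text])[0]   # e.g. "high"

# 2. LLM semantic tags and urgency hint (Groq LLaMA)
llm_tags = tag_issue(ticket_text)                      # e.g. {"tracking_related": True, ...}

# 3. Hybrid priority score
print(compute_priority(severity=severity, llm_tags=llm_tags, revenue_impact=12_000))
```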
## 1. Exploratory Data Analysis (EDA)

The dataset includes synthetic support tickets with:

- ticket text
- customer metadata (region, spend, segment)
- sentiment and escalation info
- timestamps
- severity labels
The notebooks provide:

- distribution plots
- correlations
- text length analysis
- severity imbalance checks
- baseline exploratory insights
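A minimal sketch of these checks as run in `01_EDA.ipynb`, assuming the synthetic file uses column names such as `ticket_text` and `severity`:

```python
# Quick EDA sketch; column names (ticket_text, severity) are assumptions about the schema.
import pandas as pd

df = pd.read_excel("data/raw_support_tickets.xlsx")  # requires openpyxl

# Severity imbalance check
print(df["severity"].value_counts(normalize=True))

# Text length analysis
df["text_len"] = df["ticket_text"].str.len()
print(df["text_len"].describe())
```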
## 2. ML Severity Classifier

A supervised ML model predicts ticket severity levels (e.g., low, medium, high).

Key components:

- TF-IDF vectorizer
- Logistic Regression classifier
- Pipeline stored in `models/severity_classifier.pkl`
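A minimal training sketch for such a pipeline; the column names (`ticket_text`, `severity`) and hyperparameters are assumptions, not necessarily what `02_severity_classifier.ipynb` uses:

```python
# Minimal training sketch; column names and hyperparameters are assumptions.
import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

df = pd.read_excel("data/raw_support_tickets.xlsx")
X_train, X_test, y_train, y_test = train_test_split(
    df["ticket_text"], df["severity"],
    test_size=0.2, random_state=42, stratify=df["severity"],
)

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2)),
    ("logreg", LogisticRegression(max_iter=1000)),
])
clf.fit(X_train, y_train)
print(f"Held-out accuracy: {clf.score(X_test, y_test):.3f}")

joblib.dump(clf, "models/severity_classifier.pkl")
```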
**Why this matters**

Severity classification is the first triage step used by real gTech teams.
## 3. LLM Issue Module (Groq LLaMA-3.1)

A lightweight NLP layer extracts semantics from the ticket text using Groq's high-speed inference.

LLM outputs:

A. Category classification (Policy, Billing, Performance, Tracking, Access, etc.)
B. A 1–2 sentence summary for agents
C. JSON semantic tags:
```json
{
  "billing_related": false,
  "policy_related": true,
  "performance_related": true,
  "access_security_related": false,
  "tracking_related": false,
  "urgency_hint": "high"
}
```
This creates rich contextual metadata that ML alone cannot capture.
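A hedged sketch of what the Groq call behind these tags might look like; the prompt wording and the use of JSON mode are assumptions, not the exact implementation in `llm_client.py` / `llm_issue_module.py`:

```python
# Sketch of a Groq chat completion that returns the semantic tags as JSON.
# Prompt text and JSON-mode usage are assumptions, not the repo's exact code.
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

def tag_ticket(ticket_text: str) -> dict:
    response = client.chat.completions.create(
        model="llama-3.1-8b-instant",
        temperature=0,                            # deterministic, parse-friendly output
        response_format={"type": "json_object"},  # ask for strict JSON
        messages=[
            {
                "role": "system",
                "content": (
                    "You label Google Ads support tickets. Return a JSON object with boolean keys "
                    "billing_related, policy_related, performance_related, access_security_related, "
                    "tracking_related, and a string key urgency_hint (low|medium|high)."
                ),
            },
            {"role": "user", "content": ticket_text},
        ],
    )
    return json.loads(response.choices[0].message.content)
```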
## 4. Priority Scoring Engine (ML + LLM Hybrid)

The engine fuses multiple signals:

| Component | Description |
| --- | --- |
| ML severity | 0–1–2 or low/med/high (baseline urgency) |
| LLM urgency | low / medium / high |
| LLM semantic tags | topic-based risk cues |
| Revenue impact | optional financial weighting |
Example output:

```json
{
  "priority_score": 82.5,
  "components": {
    "severity_weight": 0.60,
    "llm_urgency_weight": 0.20,
    "llm_topic_weights": 0.15,
    "revenue_weight": 0.05
  }
}
```
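A simplified sketch of how these weights could be blended into the 0–100 score shown above; the exact normalization and tag weighting in `src/priority_engine.py` may differ:

```python
# Simplified blending sketch; the real src/priority_engine.py may normalize differently.
SEVERITY_SCORE = {"low": 0.0, "medium": 0.5, "high": 1.0}
URGENCY_SCORE = {"low": 0.0, "medium": 0.5, "high": 1.0}
TOPIC_TAGS = [
    "billing_related", "policy_related", "performance_related",
    "access_security_related", "tracking_related",
]

def priority_score(severity: str, llm_tags: dict,
                   revenue_impact: float = 0.0, max_revenue: float = 50_000.0) -> float:
    """Blend ML severity, LLM urgency, LLM topic tags, and revenue into a 0-100 score."""
    severity_part = SEVERITY_SCORE[severity]
    urgency_part = URGENCY_SCORE[llm_tags.get("urgency_hint", "low")]
    topic_part = sum(bool(llm_tags.get(tag)) for tag in TOPIC_TAGS) / len(TOPIC_TAGS)
    revenue_part = min(revenue_impact / max_revenue, 1.0)

    score = (0.60 * severity_part      # ML severity (baseline urgency)
             + 0.20 * urgency_part     # LLM urgency hint
             + 0.15 * topic_part       # LLM topic-based risk cues
             + 0.05 * revenue_part)    # optional financial weighting
    return round(100 * score, 1)

# Example: a high-severity, high-urgency policy + performance ticket
print(priority_score("high",
                     {"policy_related": True, "performance_related": True, "urgency_hint": "high"},
                     revenue_impact=12_000))
```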
**Why this matters**

This mirrors the real multi-signal decision logic in enterprise support systems, ensuring the right agent sees the right issue at the right time.
## 5. Visualizations

The notebooks include:

- Severity distribution
- Priority score histogram
- Correlation heatmap (LLM tags vs. priority)
- Scatter plot (severity vs. priority)
- Component breakdown bar chart
These give clear evidence of system behavior and interpretability.
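As an illustration, the priority score histogram and the severity-vs-priority scatter could be produced with a few lines of matplotlib; the tiny frame below is dummy data standing in for the engine's real output:

```python
# Plotting sketch with dummy data; in the notebooks these values come from the priority engine.
import matplotlib.pyplot as plt
import pandas as pd

scored = pd.DataFrame({
    "severity": ["low", "medium", "high", "high", "medium"],
    "priority_score": [21.0, 48.5, 82.5, 91.0, 55.0],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.hist(scored["priority_score"], bins=10)
ax1.set_title("Priority Score Histogram")
ax1.set_xlabel("priority_score")

severity_codes = scored["severity"].map({"low": 0, "medium": 1, "high": 2})
ax2.scatter(severity_codes, scored["priority_score"], alpha=0.6)
ax2.set_title("Severity vs. Priority")
ax2.set_xlabel("severity (0=low, 1=medium, 2=high)")
ax2.set_ylabel("priority_score")

plt.tight_layout()
plt.show()
```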
## Repository Structure

```
google_ads_support_optimization/
│
├── data/
│   └── raw_support_tickets.xlsx
│
├── models/
│   └── severity_classifier.pkl
│
├── src/
│   ├── config.py
│   ├── llm_client.py
│   ├── llm_issue_module.py
│   └── priority_engine.py
│
├── notebooks/
│   ├── 01_EDA.ipynb
│   ├── 02_severity_classifier.ipynb
│   └── 03_priority_engine.ipynb
│
├── .env
├── requirements.txt
└── README.md
```
## Installation

1. Clone the repo

```bash
git clone https://github.com/<your-username>/google_ads_support_optimization.git
cd google_ads_support_optimization
```

2. Create and activate a virtual environment

```bash
python -m venv .venv
.\.venv\Scripts\activate       # Windows
# source .venv/bin/activate    # macOS / Linux
```

3. Install dependencies

```bash
pip install -r requirements.txt
```

4. Set up `.env`

```
GROQ_API_KEY=your_key_here
LLM_PROVIDER=groq
LLM_MODEL=llama-3.1-8b-instant
```
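`src/config.py` presumably reads these variables; a minimal sketch using `python-dotenv`, which is an assumption about how the repo loads its settings:

```python
# Minimal sketch of how src/config.py might load .env (an assumption, not the actual file).
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

GROQ_API_KEY = os.getenv("GROQ_API_KEY")
LLM_PROVIDER = os.getenv("LLM_PROVIDER", "groq")
LLM_MODEL = os.getenv("LLM_MODEL", "llama-3.1-8b-instant")

if not GROQ_API_KEY:
    raise RuntimeError("GROQ_API_KEY is not set; add it to .env before running the notebooks.")
```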
## Run the notebooks

```bash
jupyter notebook
```
## Why This Project Is Relevant to Google

This project demonstrates capability in:

- Data analysis and statistical modeling
- ML model development
- LLM integration and prompt engineering
- Operations optimization
- Cross-functional communication (summaries, explainability)
- Building production-ready pipelines
- Prioritizing high-impact business problems

It directly aligns with responsibilities in:

- gTech Business Data Science (gBDS)
- Business Data Scientist (gDATA)
- Google Ads Strategy & Operations
- BizOps / Product Operations
- AI/LLM-enabled support analytics
## Future Extensions

- Routing classifier for assigning tickets to agent groups
- Real-time API for scoring new tickets
- Streamlit dashboard for interactive triage
- Integration with BigQuery or Vertex AI
- Multi-label classification for a richer taxonomy