# Google Ads Support Optimization: ML + LLM Hybrid System

A Data Science + LLM project simulating how Google's gTech teams optimize Ads customer operations.
## Overview

This project builds an end-to-end support ticket optimization pipeline inspired by real workflows inside Google Ads / gTech.

It demonstrates how machine learning, LLMs, and operational analytics can be combined to:

- classify support ticket severity
- extract semantic tags using lightweight LLM prompts
- compute a priority score blending ML, LLM, and business-impact signals
- enable smarter ticket triage, routing, and escalation

The goal is to show how data science and LLMs can directly improve customer support outcomes at scale, aligning with responsibilities in Google's Business Data Science (gDATA) and BizOps roles.
## Project Architecture

```
+------------------+
| Raw Support Data |
+------------------+
         |
         v
+------------------+
|  EDA + Cleaning  |
+------------------+
         |
         v
+-------------------------------+
|    ML Severity Classifier     |
| (TF-IDF + Logistic Regression)|
+-------------------------------+
         |
         v
+-------------------------------+
| LLM Issue Tagger (Groq LLaMA) |
|  - topic tags                 |
|  - urgency estimation         |
|  - concise summarization      |
+-------------------------------+
         |
         v
+-------------------------------+
|     Priority Score Engine     |
|  ML severity + LLM tags +     |
|  revenue impact (optional)    |
+-------------------------------+
         |
         v
+------------------+
|  Ranked Tickets  |
+------------------+
```
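The sketch below shows how these stages could be chained for a single ticket. The helper names (`tag_issue`, `compute_priority`) and their signatures are illustrative assumptions about the modules in `src/`, not their exact API.

```python
# Illustrative end-to-end flow for one ticket.
# `tag_issue` and `compute_priority` are assumed names, not necessarily the real src/ API.
import joblib

from src.llm_issue_module import tag_issue        # hypothetical: text -> dict of LLM tags
from src.priority_engine import compute_priority  # hypothetical: signals -> 0-100 score

ticket_text = "Conversions stopped tracking after our account was flagged for a policy review."

# 1. ML severity from the trained TF-IDF + Logistic Regression pipeline
severity_model = joblib.load("models/severity_classifier.pkl")
severity = severity_model.predict([ticket_text])[0]   # e.g. "high"

# 2. LLM semantic tags and urgency hint (Groq LLaMA)
llm_tags = tag_issue(ticket_text)                      # e.g. {"tracking_related": True, ...}

# 3. Hybrid priority score
print(compute_priority(severity=severity, llm_tags=llm_tags, revenue_impact=12_000))
```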
## 1. Exploratory Data Analysis (EDA)

The dataset includes synthetic support tickets with:

- ticket text
- customer metadata (region, spend, segment)
- sentiment and escalation info
- timestamps
- severity labels
The notebooks provide:

- distribution plots
- correlations
- text length analysis
- severity imbalance checks
- baseline exploratory insights
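A minimal sketch of these checks as run in `01_EDA.ipynb`, assuming the synthetic file uses column names such as `ticket_text` and `severity`:

```python
# Quick EDA sketch; column names (ticket_text, severity) are assumptions about the schema.
import pandas as pd

df = pd.read_excel("data/raw_support_tickets.xlsx")  # requires openpyxl

# Severity imbalance check
print(df["severity"].value_counts(normalize=True))

# Text length analysis
df["text_len"] = df["ticket_text"].str.len()
print(df["text_len"].describe())
```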
## 2. ML Severity Classifier

A supervised ML model predicts ticket severity levels (e.g., low, medium, high).

Key components:

- TF-IDF vectorizer
- Logistic Regression classifier
- Pipeline stored in `models/severity_classifier.pkl`
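A minimal training sketch for such a pipeline; the column names (`ticket_text`, `severity`) and hyperparameters are assumptions, not necessarily what `02_severity_classifier.ipynb` uses:

```python
# Minimal training sketch; column names and hyperparameters are assumptions.
import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

df = pd.read_excel("data/raw_support_tickets.xlsx")
X_train, X_test, y_train, y_test = train_test_split(
    df["ticket_text"], df["severity"],
    test_size=0.2, random_state=42, stratify=df["severity"],
)

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2)),
    ("logreg", LogisticRegression(max_iter=1000)),
])
clf.fit(X_train, y_train)
print(f"Held-out accuracy: {clf.score(X_test, y_test):.3f}")

joblib.dump(clf, "models/severity_classifier.pkl")
```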
**Why this matters**

Severity classification is the first triage step used by real gTech teams.
## 3. LLM Issue Module (Groq LLaMA-3.1)

A lightweight NLP layer extracts semantics from the ticket text using Groq's high-speed inference.

LLM outputs:

A. Category classification (Policy, Billing, Performance, Tracking, Access, etc.)
B. A 1–2 sentence summary for agents
C. JSON semantic tags:
```json
{
  "billing_related": false,
  "policy_related": true,
  "performance_related": true,
  "access_security_related": false,
  "tracking_related": false,
  "urgency_hint": "high"
}
```
This creates rich contextual metadata that ML alone cannot capture.
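A hedged sketch of what the Groq call behind these tags might look like; the prompt wording and the use of JSON mode are assumptions, not the exact implementation in `llm_client.py` / `llm_issue_module.py`:

```python
# Sketch of a Groq chat completion that returns the semantic tags as JSON.
# Prompt text and JSON-mode usage are assumptions, not the repo's exact code.
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

def tag_ticket(ticket_text: str) -> dict:
    response = client.chat.completions.create(
        model="llama-3.1-8b-instant",
        temperature=0,                            # deterministic, parse-friendly output
        response_format={"type": "json_object"},  # ask for strict JSON
        messages=[
            {
                "role": "system",
                "content": (
                    "You label Google Ads support tickets. Return a JSON object with boolean keys "
                    "billing_related, policy_related, performance_related, access_security_related, "
                    "tracking_related, and a string key urgency_hint (low|medium|high)."
                ),
            },
            {"role": "user", "content": ticket_text},
        ],
    )
    return json.loads(response.choices[0].message.content)
```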
## 4. Priority Scoring Engine (ML + LLM Hybrid)

The engine fuses multiple signals:

| Component | Description |
| --- | --- |
| ML severity | 0–1–2 or low/med/high (baseline urgency) |
| LLM urgency | low / medium / high |
| LLM semantic tags | topic-based risk cues |
| Revenue impact | optional financial weighting |
Example output:

```json
{
  "priority_score": 82.5,
  "components": {
    "severity_weight": 0.60,
    "llm_urgency_weight": 0.20,
    "llm_topic_weights": 0.15,
    "revenue_weight": 0.05
  }
}
```
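A simplified sketch of how these weights could be blended into the 0–100 score shown above; the exact normalization and tag weighting in `src/priority_engine.py` may differ:

```python
# Simplified blending sketch; the real src/priority_engine.py may normalize differently.
SEVERITY_SCORE = {"low": 0.0, "medium": 0.5, "high": 1.0}
URGENCY_SCORE = {"low": 0.0, "medium": 0.5, "high": 1.0}
TOPIC_TAGS = [
    "billing_related", "policy_related", "performance_related",
    "access_security_related", "tracking_related",
]

def priority_score(severity: str, llm_tags: dict,
                   revenue_impact: float = 0.0, max_revenue: float = 50_000.0) -> float:
    """Blend ML severity, LLM urgency, LLM topic tags, and revenue into a 0-100 score."""
    severity_part = SEVERITY_SCORE[severity]
    urgency_part = URGENCY_SCORE[llm_tags.get("urgency_hint", "low")]
    topic_part = sum(bool(llm_tags.get(tag)) for tag in TOPIC_TAGS) / len(TOPIC_TAGS)
    revenue_part = min(revenue_impact / max_revenue, 1.0)

    score = (0.60 * severity_part      # ML severity (baseline urgency)
             + 0.20 * urgency_part     # LLM urgency hint
             + 0.15 * topic_part       # LLM topic-based risk cues
             + 0.05 * revenue_part)    # optional financial weighting
    return round(100 * score, 1)

# Example: a high-severity, high-urgency policy + performance ticket
print(priority_score("high",
                     {"policy_related": True, "performance_related": True, "urgency_hint": "high"},
                     revenue_impact=12_000))
```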
**Why this matters**

This mirrors the real multi-signal decision logic in enterprise support systems, ensuring the right agent sees the right issue at the right time.
## 5. Visualizations

The notebooks include:

- Severity distribution
- Priority score histogram
- Correlation heatmap (LLM tags vs. priority)
- Scatter plot (severity vs. priority)
- Component breakdown bar chart
These give clear evidence of system behavior and interpretability.
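As an illustration, the priority score histogram and the severity-vs-priority scatter could be produced with a few lines of matplotlib; the tiny frame below is dummy data standing in for the engine's real output:

```python
# Plotting sketch with dummy data; in the notebooks these values come from the priority engine.
import matplotlib.pyplot as plt
import pandas as pd

scored = pd.DataFrame({
    "severity": ["low", "medium", "high", "high", "medium"],
    "priority_score": [21.0, 48.5, 82.5, 91.0, 55.0],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.hist(scored["priority_score"], bins=10)
ax1.set_title("Priority Score Histogram")
ax1.set_xlabel("priority_score")

severity_codes = scored["severity"].map({"low": 0, "medium": 1, "high": 2})
ax2.scatter(severity_codes, scored["priority_score"], alpha=0.6)
ax2.set_title("Severity vs. Priority")
ax2.set_xlabel("severity (0=low, 1=medium, 2=high)")
ax2.set_ylabel("priority_score")

plt.tight_layout()
plt.show()
```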
## Repository Structure

```
google_ads_support_optimization/
│
├── data/
│   └── raw_support_tickets.xlsx
│
├── models/
│   └── severity_classifier.pkl
│
├── src/
│   ├── config.py
│   ├── llm_client.py
│   ├── llm_issue_module.py
│   └── priority_engine.py
│
├── notebooks/
│   ├── 01_EDA.ipynb
│   ├── 02_severity_classifier.ipynb
│   └── 03_priority_engine.ipynb
│
├── .env
├── requirements.txt
└── README.md
```
## Installation

1. Clone the repo

```bash
git clone https://github.com/<your-username>/google_ads_support_optimization.git
cd google_ads_support_optimization
```

2. Create and activate a virtual environment

```bash
python -m venv .venv
.\.venv\Scripts\activate       # Windows
# source .venv/bin/activate    # macOS / Linux
```

3. Install dependencies

```bash
pip install -r requirements.txt
```

4. Set up `.env`

```
GROQ_API_KEY=your_key_here
LLM_PROVIDER=groq
LLM_MODEL=llama-3.1-8b-instant
```
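`src/config.py` presumably reads these variables; a minimal sketch using `python-dotenv`, which is an assumption about how the repo loads its settings:

```python
# Minimal sketch of how src/config.py might load .env (an assumption, not the actual file).
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

GROQ_API_KEY = os.getenv("GROQ_API_KEY")
LLM_PROVIDER = os.getenv("LLM_PROVIDER", "groq")
LLM_MODEL = os.getenv("LLM_MODEL", "llama-3.1-8b-instant")

if not GROQ_API_KEY:
    raise RuntimeError("GROQ_API_KEY is not set; add it to .env before running the notebooks.")
```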
## Run the notebooks

```bash
jupyter notebook
```
## Why This Project Is Relevant to Google

This project demonstrates capability in:

- Data analysis and statistical modeling
- ML model development
- LLM integration and prompt engineering
- Operations optimization
- Cross-functional communication (summaries, explainability)
- Building production-ready pipelines
- Prioritizing high-impact business problems

It directly aligns with responsibilities in:

- gTech Business Data Science (gBDS)
- Business Data Scientist (gDATA)
- Google Ads Strategy & Operations
- BizOps / Product Operations
- AI/LLM-enabled support analytics
## Future Extensions

- Routing classifier for assigning tickets to agent groups
- Real-time API for scoring new tickets
- Streamlit dashboard for interactive triage
- Integration with BigQuery or Vertex AI
- Multi-label classification for a richer taxonomy