COUPON ACCEPTANCE ANALYSIS PROJECT
Assignment 5.1: Will the Customer Accept the Coupon?
Berkeley ML/AI Professional Certificate
PROJECT OVERVIEW
This analysis investigates factors that influence whether drivers accept mobile coupons delivered to their cell phones while driving. Using survey data from Amazon Mechanical Turk, we examine acceptance patterns across different driver demographics, driving contexts, and coupon types to identify high-value customer segments.
Data Source: UCI Machine Learning Repository
Dataset Size: 12,684 survey responses
Analysis Focus: Bar coupons (2,017 observations)
- Younger, socially active individuals respond more positively to short-term bar and restaurant offers.
- Campaigns can improve conversion rates by focusing on these groups while tailoring messaging for families differently.
- Python • pandas • Matplotlib • Seaborn • Jupyter Notebook
✅ Repository includes:
- Data (data/coupons.csv)
- Jupyter notebook (prompt.ipynb)
- Supporting images and visuals
- This README summary
Coupon Types:
- Bar
- Coffee House
- Carry Out & Take Away
- Restaurant (< $20)
- Restaurant ($20-$50)
Key Variables:

User Attributes:
• Age (categorical ranges: below21, 21, 26, 31, 36, 41, 46, 50plus)
• Gender
• Marital Status
• Income (categorical ranges)
• Education
• Occupation
• Venue visit frequencies (Bar, CoffeeHouse, Restaurant, CarryAway)

Contextual Attributes:
• Driving destination (home, work, no urgent place)
• Weather (sunny, rainy, snowy)
• Temperature (30F, 55F, 80F)
• Time of day (10AM, 2PM, 6PM)
• Passenger type (alone, partner, kid(s), friend(s))
• Distance to venue
• Direction (same/opposite as destination)

Target Variable:
• Y: Acceptance (1 = accepted, 0 = rejected)
KEY FINDINGS
• Baseline coupon acceptance rate (all coupon types): 56.8%
• Bar coupon acceptance rate: 41.0%
• Bar coupons have lower acceptance than the overall average
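
The baseline and per-coupon rates above are simply the mean of the 0/1 target Y, overall and grouped by coupon type. A minimal sketch of that calculation, assuming the coupon type lives in a column named "coupon" (an assumption, check the CSV header) and using a handful of made-up rows rather than the real data:

```python
import pandas as pd

# Toy rows for illustration only; the real data lives in data/coupons.csv.
# The "coupon" column name and its labels are assumptions.
toy = pd.DataFrame({
    "coupon": ["Bar", "Bar", "Coffee House", "Coffee House", "Bar", "Restaurant(<20)"],
    "Y": [1, 0, 1, 1, 0, 1],
})

def acceptance_rates(df: pd.DataFrame) -> pd.Series:
    """Return the acceptance rate (%) per coupon type, highest first."""
    # Y is 0/1, so its mean within each group is the share of acceptances.
    return (df.groupby("coupon")["Y"].mean() * 100).sort_values(ascending=False)

overall_rate = toy["Y"].mean() * 100
by_coupon = acceptance_rates(toy)
print(f"Overall: {overall_rate:.1f}%")
print(by_coupon.round(1))
```

On the real dataset the same two lines would be applied to the DataFrame loaded from data/coupons.csv.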
★★★ HIGHEST IMPACT FINDING ★★★
Drivers who go to bars MORE THAN 3 times per month:
• Acceptance Rate: 76.9%
• Sample Size: 199 drivers

Drivers who go to bars 3 OR FEWER times per month:
• Acceptance Rate: 37.1%
• Sample Size: 1,797 drivers

DIFFERENCE: 39.8 percentage points (pp)
INTERPRETATION: Frequent bar-goers (>3/month) are MORE THAN TWICE as likely to accept bar coupons compared to infrequent bar-goers. This is the strongest predictor identified in the analysis.
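
The arithmetic behind this finding can be checked directly from the reported figures. The sketch below only uses the rates and sample sizes quoted above; the size-weighted "pooled" rate is an extra sanity check added here, not something the notebook is known to compute.

```python
# Figures reported above for bar coupons.
frequent_rate, frequent_n = 76.9, 199        # more than 3 bar visits per month
infrequent_rate, infrequent_n = 37.1, 1797   # 3 or fewer bar visits per month

lift_pp = frequent_rate - infrequent_rate    # absolute gap in percentage points
ratio = frequent_rate / infrequent_rate      # relative likelihood of acceptance

# Size-weighted average of the two groups; it should land near the
# overall bar-coupon acceptance rate if the figures are consistent.
pooled_rate = (frequent_rate * frequent_n + infrequent_rate * infrequent_n) / (
    frequent_n + infrequent_n
)

print(f"lift = {lift_pp:.1f} pp, ratio = {ratio:.2f}x, pooled = {pooled_rate:.1f}%")
```

The ratio comes out slightly above 2, which is what "more than twice as likely" refers to, and the pooled rate sits close to the 41.0% bar-coupon rate. The two groups cover 1,996 of the 2,017 bar-coupon rows; the remainder presumably have a missing bar-frequency value.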
Profile 1: Bar-Goers Over Age 25 (Question 4)
Conditions: Bar visits >1/month AND Age >25
• Target Group Acceptance: 62.2%
• All Others Acceptance: 55.4%
• Lift: +6.8 pp
• Sample Size: 2,777 drivers

Profile 2: Social Bar-Goers (Question 5)
Conditions: Bar >1/month AND No kids as passengers AND Not in farming
• Target Group Acceptance: 62.3%
• All Others Acceptance: 54.6%
• Lift: +7.7 pp
• Sample Size: 3,696 drivers

Profile 3: Social Non-Widowed Bar-Goers (Question 6a)
Conditions: Bar >1/month AND No kids AND Not widowed
• Target Group Acceptance: 62.3%
• All Others Acceptance: 54.6%
• Lift: +7.7 pp
• Sample Size: 3,696 drivers

Profile 4: Young Bar-Goers (Question 6b)
Conditions: Bar >1/month AND Age <30
• Target Group Acceptance: 62.8%
• All Others Acceptance: 55.5%
• Lift: +7.3 pp
• Sample Size: 2,272 drivers

Profile 5: Budget-Conscious Diners (Question 6c)
Conditions: Cheap restaurants >4/month AND Income <$50K
• Target Group Acceptance: 60.1%
• All Others Acceptance: 56.1%
• Lift: +4.0 pp
• Sample Size: 2,279 drivers
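
Each profile above is a boolean mask built from several conditions, compared against everyone outside the mask. A minimal sketch of that pattern for Profile 1 follows. The helper name segment_lift, the "Bar" and "age" column names, and the frequency labels ('1~3', '4~8', 'gt8') are assumptions about how coupons.csv encodes its values, and the rows are toy data.

```python
import pandas as pd

# Category labels below are assumptions about the encoding in coupons.csv.
BAR_MORE_THAN_ONCE = ["1~3", "4~8", "gt8"]
AGE_OVER_25 = ["26", "31", "36", "41", "46", "50plus"]

def segment_lift(df: pd.DataFrame, mask: pd.Series) -> dict:
    """Compare acceptance (%) in the masked segment against all other rows."""
    target = df.loc[mask, "Y"].mean() * 100
    others = df.loc[~mask, "Y"].mean() * 100
    return {
        "target": target,
        "others": others,
        "lift_pp": target - others,
        "n": int(mask.sum()),
    }

# Toy rows for illustration only.
toy = pd.DataFrame({
    "Bar": ["4~8", "never", "1~3", "less1", "gt8", "never"],
    "age": ["26", "21", "31", "46", "below21", "50plus"],
    "Y":   [1, 0, 1, 0, 1, 1],
})

# Conditions are combined with & (AND). isin() returns False for missing
# values, so rows with a missing frequency fall into "all others".
mask = toy["Bar"].isin(BAR_MORE_THAN_ONCE) & toy["age"].isin(AGE_OVER_25)
result = segment_lift(toy, mask)
print(result)
```

Further conditions (passenger type, occupation, marital status) would be added as extra terms joined with &, using ~ to negate a condition such as "not widowed".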
- Frequent bar-goers (>3/month): +39.8 pp ⭐⭐⭐ HIGHEST
- Social non-widowed bar-goers: +7.7 pp
- Young bar-goers (<30): +7.3 pp
- Bar-goers over 25: +6.8 pp
- Budget-conscious diners: +4.0 pp
Drivers who accept bar coupons tend to:
✓ Have established bar-going habits (frequency is key)
✓ Be younger and more socially active
✓ Drive without children (more spontaneous)
✓ Have flexibility in their schedule and destination

Geographic Considerations:
• High-acceptance areas likely include college campuses, downtown city centers, and entertainment districts, where young, social drivers are prevalent
ACTIONABLE ITEMS & RECOMMENDATIONS
- SEGMENT TARGETING
  Action: Focus bar coupon campaigns on drivers who visit bars >3 times/month
  Expected Impact: Roughly 2x the acceptance rate (76.9% vs 37.1%)
  Implementation: Partner with bar establishments to identify frequent patrons or use credit card transaction data
- AGE-BASED TARGETING
  Action: Prioritize drivers under age 30 for bar coupons
  Expected Impact: +7.3 pp lift in acceptance
  Implementation: Use app demographic data or device behavioral patterns
- CONTEXTUAL FILTERING
  Action: Do NOT send bar coupons when kids are passengers
  Expected Impact: +7.7 pp lift when targeting "no kids" scenarios
  Implementation: Use passenger detection via Bluetooth/GPS patterns or explicit user input
- TIME & LOCATION OPTIMIZATION
  Action: Deploy bar coupons in high-opportunity zones
  Target Areas:
    • College campuses (young drivers)
    • Downtown entertainment districts
    • Areas near existing bars and nightlife
  Target Times:
    • Evening hours (6PM onwards)
    • Weekends
    • "No urgent destination" scenarios
- CROSS-COUPON STRATEGY
  Action: Test similar targeting for restaurant coupons with budget-conscious drivers (income <$50K + frequent cheap restaurant visits)
  Expected Impact: +4.0 pp lift
  Rationale: These drivers show sensitivity to discounts
- A/B TESTING FRAMEWORK
  Action: Set up controlled experiments to validate findings
  Test Groups:
    • Control: Random distribution
    • Test 1: Frequency-based targeting (>3 bars/month)
    • Test 2: Multi-factor targeting (frequency + age + passenger type)
  Metrics: Acceptance rate, redemption rate, customer lifetime value
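
One way to judge whether a test group's acceptance rate really differs from the control group's is a pooled two-proportion z-test. The notebook itself performs no hypothesis testing, so this is only a sketch of how the A/B comparison could be evaluated. It uses the standard library alone, and the counts in the example are hypothetical.

```python
from math import erf, sqrt

def two_proportion_ztest(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    x1/n1 and x2/n2 are acceptances and group sizes for the two arms.
    The pooled rate must be strictly between 0 and 1, otherwise the
    standard error is zero and the statistic is undefined.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical counts: 130 of 200 accept in the test arm, 110 of 200 in control.
z, p_value = two_proportion_ztest(130, 200, 110, 200)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference is unlikely to be chance alone. With three arms (control, Test 1, Test 2), each test arm would be compared against control, ideally with a correction for multiple comparisons.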
- MACHINE LEARNING MODEL DEVELOPMENT
  Action: Build predictive model using all identified features
  Features to Include:
    • Bar visit frequency (most important)
    • Age
    • Passenger type
    • Marital status
    • Income
    • Time of day
    • Destination
    • Weather/temperature
  Goal: Real-time acceptance probability scoring
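
As a starting point for such a model, the sketch below one-hot encodes a few categorical features and fits a logistic regression with scikit-learn. Logistic regression is chosen here purely for illustration; the project does not specify a model. The column names ("Bar", "age", "passanger") and category labels are assumptions to be checked against the CSV header, and the rows are toy data.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Assumed column names; adjust to match the header of coupons.csv.
FEATURES = ["Bar", "age", "passanger"]

# Toy rows for illustration only.
toy = pd.DataFrame({
    "Bar": ["gt8", "never", "4~8", "less1", "1~3", "never", "gt8", "less1"],
    "age": ["21", "50plus", "26", "41", "26", "36", "below21", "46"],
    "passanger": ["Friend(s)", "Kid(s)", "Alone", "Kid(s)",
                  "Partner", "Alone", "Friend(s)", "Alone"],
    "Y": [1, 0, 1, 0, 1, 0, 1, 0],
})

model = Pipeline([
    # One-hot encode every categorical feature; categories unseen during
    # training are encoded as all zeros instead of raising an error.
    ("encode", ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), FEATURES)]
    )),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(toy[FEATURES], toy["Y"])

# Score a single hypothetical driver: column 1 is P(Y = 1).
new_driver = pd.DataFrame([{"Bar": "4~8", "age": "26", "passanger": "Alone"}])
accept_probability = model.predict_proba(new_driver)[0, 1]
print(f"Estimated acceptance probability: {accept_probability:.2f}")
```

On the real data, a train/test split and a held-out evaluation metric would be needed before trusting any probability the model produces.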
- DYNAMIC COUPON VALUE OPTIMIZATION
  Action: Test variable discount levels for different segments
  Hypothesis: Frequent bar-goers may accept with lower discounts (already motivated), while infrequent visitors may need higher incentives
- LOOKALIKE AUDIENCE EXPANSION
  Action: Use frequent bar-goers as seed audience to find similar users who don't yet show bar-going behavior but share other characteristics (age, income, lifestyle patterns)
  Goal: Expand addressable market beyond current frequent visitors
- VENUE PARTNERSHIP STRATEGY
  Action: Collaborate with bars that attract target demographics
  Tactics:
    • Co-marketing campaigns
    • Loyalty program integration
    • Event-based coupon triggers (sports events, concerts)
HIGH PRIORITY (Launch First):
✓ Bar frequency >3/month
✓ Age <30
✓ No kids in car
✓ Evening/weekend timing
✓ Near entertainment districts

MEDIUM PRIORITY (Phase 2):
✓ Age >25 + Bar >1/month
✓ Income <$50K + Restaurant frequency >4/month
✓ Single or unmarried partner status
LOW PRIORITY (Avoid):
✗ Bar frequency ≤3/month (37.1% acceptance, below baseline)
✗ Kids as passengers
✗ Morning hours (10AM)
✗ Residential areas
Based on acceptance lift and segment size:
Segment                          Budget %   Rationale
────────────────────────────────────────────────────────
Frequent bar-goers (>3/month)    40%        Highest ROI
Young bar-goers (<30)            25%        Good lift + sizable
Social non-widowed               20%        Strong lift
Budget-conscious diners          10%        Lower lift
Experimental/testing             5%         Learning
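
The percentages above can be turned into spend per segment with a small helper. The shares are taken from the table; the helper name allocate and the total budget of 10,000 are hypothetical.

```python
# Budget shares (%) as listed in the allocation table above.
BUDGET_SHARES = {
    "Frequent bar-goers (>3/month)": 40,
    "Young bar-goers (<30)": 25,
    "Social non-widowed": 20,
    "Budget-conscious diners": 10,
    "Experimental/testing": 5,
}

def allocate(total: float, shares: dict[str, int]) -> dict[str, float]:
    """Split a total budget across segments according to percentage shares."""
    if sum(shares.values()) != 100:
        raise ValueError("budget shares must sum to 100%")
    return {segment: total * pct / 100 for segment, pct in shares.items()}

allocation = allocate(10_000, BUDGET_SHARES)  # hypothetical total budget
for segment, amount in allocation.items():
    print(f"{segment:<32}{amount:>10,.0f}")
```

The check on the sum guards against a share being edited without rebalancing the others.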
• Monitor for coupon fatigue in high-frequency segments
• Ensure responsible marketing (avoid targeting impaired drivers)
• Track redemption rates, not just acceptance (validate actual behavior)
• Consider legal/ethical implications of targeting based on drinking habits

Primary KPIs:
• Coupon acceptance rate (target: >60% for targeted segments)
• Redemption rate
• Cost per acquisition (CPA)
• Return on ad spend (ROAS)

Secondary KPIs:
• Customer lifetime value (CLV) of acquired users
• Repeat coupon usage rate
• Cross-coupon category adoption
TECHNICAL NOTES
• Removed 'car' column (99% missing values: 12,576 of 12,684)
• Retained other columns with minor missing values:
  - Bar: 107 missing (0.8%)
  - CoffeeHouse: 217 missing (1.7%)
  - CarryAway: 151 missing (1.2%)
  - RestaurantLessThan20: 130 missing (1.0%)
  - Restaurant20To50: 189 missing (1.5%)
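
A minimal sketch of this cleaning step: count missing values per column, then drop any column that is mostly empty. The notebook removed 'car' explicitly; the threshold-based helper drop_sparse_columns and its 50% cutoff are hypothetical choices made here for illustration, and the rows are toy data.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only; 'car' is mostly missing, as in the real data.
toy = pd.DataFrame({
    "car": [np.nan, np.nan, np.nan, "Scooter"],
    "Bar": ["never", np.nan, "1~3", "gt8"],
    "Y": [0, 1, 1, 1],
})

def drop_sparse_columns(df: pd.DataFrame, max_missing: float = 0.5) -> pd.DataFrame:
    """Drop columns whose share of missing values exceeds max_missing."""
    missing_share = df.isna().mean()           # fraction missing per column
    sparse = missing_share[missing_share > max_missing].index
    return df.drop(columns=sparse)

missing_counts = toy.isna().sum()              # missing values per column
cleaned = drop_sparse_columns(toy)

print(missing_counts)
print(list(cleaned.columns))
```

Columns with only a few missing values, such as the venue frequency columns, stay in place under this rule, which matches the decision described above.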
Age Column Challenge:
• Age stored as categorical strings ('below21', '21', '26', etc.), not numeric integers
• Cannot use direct numeric comparison (age > 25)
• Solution: Use .isin() method with explicit category lists

Example:
    AGE_OVER_25 = ['26', '31', '36', '41', '46', '50plus']
    over25 = df['age'].isin(AGE_OVER_25)

Alternative approaches explored:
1. Numeric mapping (convert to midpoint values)
2. Ordinal categorical encoding (preserve ordering)
3. Custom age grouping (broad life stages)
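
The second alternative, ordinal categorical encoding, can be sketched with a pandas ordered categorical. Once the category order is declared, comparisons such as "older than the '21' bracket" work directly on the labels. This illustrates the idea and is not necessarily how the notebook implements it; the helper name to_ordered_age is hypothetical.

```python
import pandas as pd

# Age labels in ascending order, as listed under Key Variables.
AGE_ORDER = ["below21", "21", "26", "31", "36", "41", "46", "50plus"]
AGE_OVER_25 = ["26", "31", "36", "41", "46", "50plus"]

def to_ordered_age(age: pd.Series) -> pd.Series:
    """Convert age labels to an ordered categorical that supports <, >, <=, >=."""
    dtype = pd.CategoricalDtype(categories=AGE_ORDER, ordered=True)
    return age.astype(dtype)

ages = pd.Series(["21", "below21", "36", "50plus", "26"])
ordered = to_ordered_age(ages)

# "Over 25" means any bracket strictly above the '21' label.
over25 = ordered > "21"
print(over25.tolist())

# The result matches the explicit .isin() approach.
print(over25.tolist() == ages.isin(AGE_OVER_25).tolist())
```

The ordered encoding avoids maintaining a separate category list for every threshold, at the cost of having to declare the order once up front.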
• Comparison approach: Target segment vs. "All Others"
• Metric: Percentage point (pp) difference in acceptance rates
• No formal hypothesis testing conducted (exploratory analysis)
• Sample sizes sufficient for meaningful comparisons (all segments ≥199)
Current visualizations:
• Bar plots for coupon type acceptance rates
• Histograms for temperature distribution
• Count plots for bar visit frequency among acceptors

Recommended additions:
• Side-by-side acceptance rate comparisons for each question
• Summary dashboard with all key findings
• Segmentation heatmaps
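
The first recommended addition, a side-by-side comparison per question, could look like the grouped bar chart sketched below. The rates are the documented figures from Key Findings; the short profile labels and styling are illustrative choices, not taken from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Acceptance rates (%) as documented in Key Findings.
profiles = ["Q4: over 25", "Q5: social", "Q6a: non-widowed",
            "Q6b: under 30", "Q6c: budget diners"]
target = [62.2, 62.3, 62.3, 62.8, 60.1]
others = [55.4, 54.6, 54.6, 55.5, 56.1]

x = np.arange(len(profiles))   # one slot per profile
width = 0.38                   # width of each bar within a slot

fig, ax = plt.subplots(figsize=(9, 4))
ax.bar(x - width / 2, target, width, label="Target group")
ax.bar(x + width / 2, others, width, label="All others")
ax.set_xticks(x)
ax.set_xticklabels(profiles, rotation=20, ha="right")
ax.set_ylabel("Acceptance rate (%)")
ax.set_title("Acceptance rate by driver profile")
ax.legend()
fig.tight_layout()
```

In a notebook the Agg line can be dropped so the figure renders inline, and fig.savefig(...) can write the chart to the images folder.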
HOW TO USE THIS NOTEBOOK
Python Libraries Required:
• pandas
• numpy
• matplotlib
• seaborn

Files:
  prompt.ipynb      - Main analysis notebook
  data/coupons.csv  - Source dataset
  README.txt        - This file
- Ensure data/coupons.csv is in the correct path
- Run cells sequentially from top to bottom
- Key sections:
  - Cells 1-13: Data loading and cleaning
  - Cells 17-23: Overall acceptance analysis
  - Cells 29-41: Bar coupon baseline analysis
  - Cell 49: Categorical age handling workflow (educational)
  - Cells 51-52: Question 4 analysis + visualization
  - Cell 57: Question 5 analysis
  - Cell 59: Question 6 analysis (three comparisons)
  - Cell 61: Hypothesis statement
All results are deterministic. Running the notebook will produce identical output to the documented findings above.
Opportunities for further investigation:
• Analyze other coupon types (Coffee House, Restaurant, CarryAway)
• Examine interaction effects (e.g., age × income × bar frequency)
• Build predictive models using scikit-learn
• Conduct statistical significance testing
• Analyze temporal patterns (time of day, expiration period)
• Geographic analysis (distance, direction factors)
PROJECT METADATA
Author: [Your Name]
Course: Berkeley ML/AI Professional Certificate - Module 5
Assignment: 5.1 - Practical Application 1
Date: October 2025
Tools: Python 3, Jupyter Notebook, pandas, matplotlib, seaborn
Dataset: UCI ML Repository - In-Vehicle Coupon Recommendation

For questions or collaboration: [Your Contact]
GitHub Repository: [Your GitHub URL]
CHANGELOG
Version 1.0 (Initial Analysis)
• Data cleaning and exploration
• Bar coupon acceptance analysis (Questions 1-7)
• Driver profile segmentation
• Categorical age handling workflow
• Key findings documentation

Future Versions:
• Add statistical significance testing
• Expand to other coupon types analysis
• Develop predictive ML model
• Create interactive dashboard
END OF README (enhanced with AI using OpenAI)
================================================================================