akijainopera/brazil-retention

Opera Browser -- Brazil Retention Survey Analysis

Demographic heatmap analysis of Opera Browser retention and new-user survey data from the Brazilian market. Built for the Opera Growth & Analytics team to understand user acquisition, feature preferences, pain points, and churn risk across age segments in Brazil -- one of Opera's largest mobile markets.


What You'll Find Here

For a human reader: This repository contains the raw survey exports and reproducible Python scripts that generate cross-tabulated heatmaps breaking down Brazilian Opera users' behaviors and opinions by age group. Each heatmap shows both absolute response counts and within-group percentages, making it straightforward to spot age-driven patterns in feature appeal, churn risk, and discovery channels. The pre-rendered PNG outputs are included so you can review the findings without running any code.

For an LLM or automated system: The two CSV files contain structured survey microdata (9,730 retention survey responses and 24,650 new-user survey responses) with consistent column layouts. Age groups are one-hot encoded across eight columns (Below 15 through 55+), listed under Data Schema. Each analysis script follows the same repeatable pattern: load the CSV, map column indices to human-readable labels, cross-tabulate against age groups, compute row-wise percentages, and render an annotated seaborn heatmap. The data and scripts can be reused to answer questions about Opera's Brazilian user base, to extend the analysis to new survey dimensions, or as a template for similar survey analysis in other markets.


Dates and Provenance

| Detail | Value |
| --- | --- |
| All files created | March 12, 2025 |
| Survey source | Opera internal user surveys (Brazil market) |
| Analysis tool | Python 3.x with pandas, seaborn, matplotlib, numpy |
| Author context | Opera Software growth/analytics work |

Data Schema

Age Groups (shared across all scripts)

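Encoded as eight one-hot columns, one per age group. The headers suggest that the positions listed here are 1-based: pandas names a blank header "Unnamed: N" after its 0-based position, which would put these columns at 99-106 when addressed with iloc, the same range that verify_data.py checks.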

| Column Index | CSV Header | Age Group |
| --- | --- | --- |
| 100 | How old are you? | Below 15 |
| 101 | Unnamed: 100 | 15 - 17 |
| 102 | Unnamed: 101 | 18 - 24 |
| 103 | Unnamed: 102 | 25 - 29 |
| 104 | Unnamed: 103 | 30 - 34 |
| 105 | Unnamed: 104 | 35 - 44 |
| 106 | Unnamed: 105 | 45 - 54 |
| 107 | Unnamed: 106 | 55+ |
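
As a minimal sketch of how the eight one-hot age columns could be collapsed into a single label per respondent: the function name `decode_age` and the toy frame are hypothetical, and the code assumes that a selected option is any non-null cell while unselected options are empty. Neither assumption is confirmed by the repository scripts.

```python
import pandas as pd

# Age labels in the order given in the table above.
AGE_LABELS = ["Below 15", "15 - 17", "18 - 24", "25 - 29",
              "30 - 34", "35 - 44", "45 - 54", "55+"]


def decode_age(age_block: pd.DataFrame) -> pd.Series:
    """Collapse eight one-hot age columns into one label per respondent.

    Assumption: a non-null cell means the option was selected.
    """
    if age_block.shape[1] != len(AGE_LABELS):
        raise ValueError("expected exactly eight age columns")
    selected = age_block.notna()
    selected.columns = AGE_LABELS
    has_answer = selected.any(axis=1)
    # idxmax picks the first selected column; rows with no answer become NaN.
    return selected.astype(int).idxmax(axis=1).where(has_answer)


def one_hot_row(label):
    """Build a toy row in which only the given age label is filled in."""
    return [label if candidate == label else None for candidate in AGE_LABELS]


# Toy data: two respondents with an answer, one who skipped the question.
toy = pd.DataFrame([one_hot_row("18 - 24"), one_hot_row("55+"), [None] * 8])
decoded = decode_age(toy)
print(decoded.tolist())
```

With the real CSV, the block passed to `decode_age` would be the eight age columns selected by position, once those positions have been confirmed against the file.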

Survey Dimensions Analyzed

| Dimension | CSV Column Range | Answer Options |
| --- | --- | --- |
| Discovery method | 6-18 | Friend/family, Play Store, online search, other Opera products, online ad, social media, influencer, Shake & Win, #OperaMeNota, Brasilidades wallpaper contest, Opera Rewards, pre-installed, other |
| Convincing factors | 33-39 | Good features, design, speed, lightweight, performance, Opera Rewards, other |
| Appealing features | 40-53 | Privacy, ad blocker, design, customization, free VPN, paid VPN, Aria AI, offline reading, performance, data saving, Opera Flow, Opera Rewards, none, other |
| Disliked aspects | 54-66 | Confusing menus, hard to find features, broken features, lack of features, bugs, safety concerns, website incompatibility, design/organization, newsfeed, slow, initial page, nothing, other |
| Stop-using reasons | 69-77 | Bad browsing performance, slow, lack of features, privacy concerns, difficult to use, bad Opera Rewards experience, initial page overload, prefer another browser, other |
| Ease of use | 78-82 | Very difficult, difficult, neutral, easy, very easy (1-5 Likert scale) |
| Default browser | 83-95 | Chrome, Safari, Brave, Edge, Opera GX, Opera for Android, Opera Mini, Firefox, Arc, Samsung Browser, DuckDuckGo, UC Browser, other |
| Installation tenure | 96-100 | Less than a month, 1-3 months, 4-12 months, 1-2 years, 2+ years |
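
The column ranges above can be captured in a small lookup so that each range is checked against the number of answer options it is supposed to cover. This is an illustrative sketch only: the names `DIMENSION_COLUMNS`, `EXPECTED_OPTION_COUNTS`, and `column_positions` are hypothetical, and the code assumes the ranges are inclusive on both ends.

```python
from typing import Dict, List, Tuple

# Inclusive (first, last) column positions, copied from the table above.
DIMENSION_COLUMNS: Dict[str, Tuple[int, int]] = {
    "discovery_method": (6, 18),
    "convincing_factors": (33, 39),
    "appealing_features": (40, 53),
    "disliked_aspects": (54, 66),
    "stop_using_reasons": (69, 77),
    "ease_of_use": (78, 82),
    "default_browser": (83, 95),
    "installation_tenure": (96, 100),
}

# Number of answer options listed for each dimension in the table above.
EXPECTED_OPTION_COUNTS: Dict[str, int] = {
    "discovery_method": 13,
    "convincing_factors": 7,
    "appealing_features": 14,
    "disliked_aspects": 13,
    "stop_using_reasons": 9,
    "ease_of_use": 5,
    "default_browser": 13,
    "installation_tenure": 5,
}


def column_positions(dimension: str) -> List[int]:
    """Return every column position covered by a dimension's inclusive range."""
    first, last = DIMENSION_COLUMNS[dimension]
    return list(range(first, last + 1))


# Sanity check: each range should span exactly one column per answer option.
for name, expected in EXPECTED_OPTION_COUNTS.items():
    width = len(column_positions(name))
    print(f"{name}: {width} columns, {expected} options")
```

A check like this catches an off-by-one in a range before it silently shifts every label in a heatmap.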

File Summaries

Data Files

| File | Size | Rows | Description |
| --- | --- | --- | --- |
| Brazil Retention Survey.csv | 3.1 MB | ~9,730 | Primary retention survey. Each row is one respondent. Columns cover discovery method, convincing factors, appealing features, disliked aspects, churn reasons, ease of use, default browser, mobile plan type, age, and gender. First data row (index 0) contains sub-question labels; actual responses start at index 1. |
| OFI - New User Survey.csv | 12.6 MB | ~24,650 | Opera First Impression (OFI) new-user survey. Larger respondent pool capturing first-experience sentiment from Brazilian users. Used as a supplementary data source. |
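
Because the first data row of the retention survey holds sub-question labels rather than responses, it has to be split off before any counting. The sketch below shows one way to do that on a tiny in-memory CSV; the toy content and variable names are invented for illustration and do not reproduce the real file.

```python
import io

import pandas as pd

# Toy CSV mimicking the layout described above: question text appears only in
# the first column of each block, and the first data row carries the option labels.
toy_csv = io.StringIO(
    "How did you find Opera?,,How old are you?,\n"
    "Friend/family,Play Store,18 - 24,25 - 29\n"
    "Friend/family,,,25 - 29\n"
    ",Play Store,18 - 24,\n"
)

raw = pd.read_csv(toy_csv)

# pandas fills blank headers with "Unnamed: <0-based position>".
print(list(raw.columns))

# Row 0 holds the sub-question labels; real responses start at row 1.
sub_labels = raw.iloc[0]
responses = raw.iloc[1:].reset_index(drop=True)

print(sub_labels.tolist())
print(len(responses), "responses")
```

Keeping `sub_labels` around is useful later, since those values can serve as the human-readable column names in a heatmap.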

Heatmap Analysis Scripts

Each script reads Brazil Retention Survey.csv, cross-tabulates one survey dimension against age groups, computes row-wise percentages, and saves an annotated heatmap PNG.

| Script | Output PNG | Survey Dimension | Color Map | Key Question Answered |
| --- | --- | --- | --- | --- |
| opera_heatmap_analysis.py | opera_discovery_heatmap.png | Discovery method (cols 6-18) | YlOrRd | How did users in each age group first hear about Opera? |
| opera_appealing_features_heatmap.py | opera_appealing_features_heatmap.png | Appealing features (cols 40-53) | YlOrRd | Which browser features resonate most with each age group? |
| opera_convince_heatmap.py | opera_convince_heatmap.png | Convincing factors (cols 33-39) | YlOrRd | What persuaded users in each age group to try Opera? |
| opera_default_browser_heatmap.py | opera_default_browser_heatmap.png | Default browser (cols 83-95) | Greens | What is each age group's default browser? |
| opera_disliked_aspects_heatmap.py | opera_disliked_aspects_heatmap.png | Disliked aspects (cols 54-66) | YlOrRd | What do users in each age group dislike about Opera? |
| opera_ease_of_use_heatmap.py | opera_ease_of_use_heatmap.png | Ease of use (cols 78-82) | Blues | How do ease-of-use ratings distribute across age groups? |
| opera_installation_time_heatmap.py | opera_installation_time_heatmap.png | Installation tenure (cols 96-100) | Purples | How long has each age group had Opera installed? |
| opera_stop_using_reasons_heatmap.py | opera_stop_using_reasons_heatmap.png | Stop-using reasons (cols 69-77) | YlOrRd | What would make each age group stop using Opera? |
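
The cross-tabulation step shared by these scripts can be sketched roughly as follows. This is not the repository's code: the toy data is invented, a non-null cell is assumed to mean "option selected", and the percentage is taken over the row total of counts, which is one plausible reading of "row-wise percentages". Dividing by the number of respondents in each age group would be an equally reasonable alternative for multi-select questions.

```python
import pandas as pd

# Toy responses: one age label per respondent plus two one-hot option columns.
responses = pd.DataFrame({
    "age_group": ["18 - 24", "18 - 24", "25 - 29", "25 - 29", "25 - 29"],
    "Free VPN": ["Free VPN", None, "Free VPN", "Free VPN", None],
    "Ad blocker": ["Ad blocker", "Ad blocker", None, "Ad blocker", None],
})
option_cols = ["Free VPN", "Ad blocker"]

# Assumption: any non-null cell counts as a selection.
selected = responses[option_cols].notna()

# Rows are age groups, columns are answer options, cells are selection counts.
counts = selected.groupby(responses["age_group"]).sum()

# Row-wise percentages: each age group's row sums to 100.
percentages = counts.div(counts.sum(axis=1), axis=0) * 100

print(counts)
print(percentages.round(1))
```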

Data Validation Scripts

Utility scripts used during initial data exploration to understand CSV structure before building the heatmaps.

| Script | Purpose |
| --- | --- |
| check_columns.py | Prints every column name in the CSV. Used to map column indices to survey questions. |
| check_age.py | Inspects age column distribution and data quality. Prints value counts and first 10 rows of the age column. |
| check_data.py | Deep inspection: prints raw data head, all column names with indices, header row values, age column breakdowns, and discovery method frequency counts. |
| verify_data.py | Verifies raw data structure, prints column headers and first-row values for non-null entries, and checks age-related columns (99-106). |

Generated Heatmap PNGs (8 files)

Pre-rendered at 300 DPI, 20x10 inch figures. Each cell shows the absolute count and the percentage within its age group.

| File | Visualization |
| --- | --- |
| opera_appealing_features_heatmap.png | 8 age groups x 14 features matrix; "Free VPN" and "Built-in Ad blocker" are consistently popular across age groups |
| opera_convince_heatmap.png | 8 age groups x 7 factors; "Good/interesting features" and "Fast browser" dominate |
| opera_default_browser_heatmap.png | 8 age groups x 13 browsers; Google Chrome dominates, Opera variants (GX, Android, Mini) show age skew |
| opera_disliked_aspects_heatmap.png | 8 age groups x 13 complaints; "Nothing I don't like" is common but "Unexpected errors and bugs" appears frequently in younger groups |
| opera_discovery_heatmap.png | 8 age groups x 13 channels; Play Store and friend recommendations lead discovery |
| opera_ease_of_use_heatmap.png | 8 age groups x 5 ratings; skews heavily toward "Easy" and "Very easy" |
| opera_installation_time_heatmap.png | 8 age groups x 5 tenure bands; older groups show longer tenure |
| opera_stop_using_reasons_heatmap.png | 8 age groups x 9 reasons; "Prefer another browser" is the top churn risk |
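
A heatmap whose cells show both the count and the percentage, at the figure size and resolution stated above, could be produced along these lines. The helper names `build_annotations` and `render_heatmap` are hypothetical, the toy numbers are invented, and colouring the cells by percentage rather than by count is an assumption; the actual scripts may differ in all of these respects.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is required

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def build_annotations(counts: pd.DataFrame, percentages: pd.DataFrame) -> np.ndarray:
    """Return one 'count (percent%)' label per cell, split over two lines."""
    labels = np.empty(counts.shape, dtype=object)
    for i in range(counts.shape[0]):
        for j in range(counts.shape[1]):
            labels[i, j] = f"{int(counts.iat[i, j])}\n({percentages.iat[i, j]:.1f}%)"
    return labels


def render_heatmap(counts, percentages, out_path, cmap="YlOrRd"):
    """Draw a 20x10 inch heatmap coloured by percentage and save it at 300 DPI."""
    fig, ax = plt.subplots(figsize=(20, 10))
    sns.heatmap(
        percentages,
        annot=build_annotations(counts, percentages),
        fmt="",  # annotations are already formatted strings
        cmap=cmap,
        ax=ax,
    )
    ax.set_xlabel("Answer option")
    ax.set_ylabel("Age group")
    fig.tight_layout()
    fig.savefig(out_path, dpi=300)
    plt.close(fig)


# Toy cross-tab: rows are age groups, columns are answer options.
toy_counts = pd.DataFrame(
    [[1, 2], [2, 1]],
    index=["18 - 24", "25 - 29"],
    columns=["Free VPN", "Ad blocker"],
)
toy_percentages = toy_counts.div(toy_counts.sum(axis=1), axis=0) * 100
toy_labels = build_annotations(toy_counts, toy_percentages)
print(toy_labels[0, 0])
```

Calling `render_heatmap(toy_counts, toy_percentages, "toy_heatmap.png")` would then write the figure to disk; passing a different `cmap` (for example "Greens" or "Blues") mirrors the per-script color maps listed earlier.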

How to Run

Prerequisites

Python 3.8+
pandas
seaborn
matplotlib
numpy

Install dependencies:

pip install pandas seaborn matplotlib numpy

Running a Script

All scripts expect to be run from the repository root (same directory as the CSV files):

cd brazil-retention
python opera_appealing_features_heatmap.py

Each script reads Brazil Retention Survey.csv from the current directory and writes its output PNG to the current directory, overwriting any existing file with the same name.

Running All Heatmaps

for script in opera_*_heatmap.py opera_heatmap_analysis.py; do
    echo "Running $script..."
    python "$script"
done

Repository Structure

brazil-retention/
├── README.md
├── .gitignore
│
├── Brazil Retention Survey.csv          # Primary survey data (~9,730 responses)
├── OFI - New User Survey.csv            # New-user survey data (~24,650 responses)
│
├── opera_heatmap_analysis.py            # Discovery method heatmap
├── opera_appealing_features_heatmap.py  # Feature appeal heatmap
├── opera_convince_heatmap.py            # Convincing factors heatmap
├── opera_default_browser_heatmap.py     # Default browser heatmap
├── opera_disliked_aspects_heatmap.py    # Disliked aspects heatmap
├── opera_ease_of_use_heatmap.py         # Ease of use heatmap
├── opera_installation_time_heatmap.py   # Installation tenure heatmap
├── opera_stop_using_reasons_heatmap.py  # Stop-using reasons heatmap
│
├── check_columns.py                     # Utility: list CSV columns
├── check_age.py                         # Utility: inspect age data
├── check_data.py                        # Utility: deep data inspection
├── verify_data.py                       # Utility: verify data structure
│
├── opera_appealing_features_heatmap.png # Output: feature appeal
├── opera_convince_heatmap.png           # Output: convincing factors
├── opera_default_browser_heatmap.png    # Output: default browsers
├── opera_disliked_aspects_heatmap.png   # Output: complaints
├── opera_discovery_heatmap.png          # Output: discovery channels
├── opera_ease_of_use_heatmap.png        # Output: usability ratings
├── opera_installation_time_heatmap.png  # Output: install tenure
└── opera_stop_using_reasons_heatmap.png # Output: churn reasons

License

Internal Opera Software analysis. Not intended for public redistribution of the underlying survey data.
