Beauty Market Analysis

Is Sephora a good launchpad for a beauty brand with global ambitions?

A data analysis project combining Sephora's product catalog with a global beauty consumer dataset to map market structure, consumer targeting, pricing dynamics, and ethical positioning.

Project context

This project was built as part of a data analysis training program. The analysis is presented from the perspective of two researchers investigating whether Sephora accurately reflects the global beauty market and what that means for a brand looking to launch and expand in the EU market.

Project presentation

https://docs.google.com/presentation/d/1LAG3IEvvtY3rd52WlLuxbaos3TKq7kmd/edit?usp=sharing&ouid=116785207806004174375&rtpof=true&sd=true

Datasets

Dataset	Source	Description
`product_info.csv`	Sephora Products & Skincare Reviews — Kaggle	8,000+ Sephora products with pricing, ratings, reviews, and product attributes
`most_used_beauty_cosmetics_products_extended.csv`	Most Used Beauty Cosmetics Products — Kaggle	Global beauty product usage data across countries, age groups, skin types, and gender

Business hypotheses

#	Hypothesis	Dataset	Variables
H1	The higher the price, the lower the rating	Both	price, rating
H2	EU countries have more cruelty-free products than the rest of the world	Worldwide	country, cruelty_free, product count
H3	Skincare accounts for more than 40% of Sephora's catalog	Sephora	category, product count

Analysis conclusions

Hypothesis	Status	Key Data Point	Executive Takeaway
Higher price → lower ratings	❌ Rejected	4.07–4.36 range across all tiers	No rating risk at any price point
EU has more cruelty-free products	✅ Confirmed	—	Cruelty-free certification is an expectation at this market stage
Skincare > 40% of Sephora catalog	❌ Rejected	Skincare = 19.7%	Skincare is less crowded than expected

Project structure

beauty_project/sql_beauty_project/
│
├── data/
│   ├── processed/
│   │   ├── dim_category.csv
│   │   ├── dim_country_id.csv
│   │   └── dim_gender.csv
│   │
│   ├── query-results/
│   │   ├── sephora products average rating per price.csv
│   │   ├── sephora share of each category in the catalog.csv
│   │   ├── sephora skincare vs everything else.csv
│   │   ├── world dataset in EU cruelty free product.csv
│   │   └── World products average rating per price.csv
│   │
│   └── raw/
│       ├── sephora_clean.csv
│       └── world_clean.csv
│
├── notebooks/
│   ├── beauty-sephora-data-processing.ipynb
│   ├── beauty-world-data-processing.ipynb
│   └── data-analysis.ipynb
│
├── sql-scripts/
│   ├── create-schema.sql
│   └── query-scripts.sql
│
├── ER-diagram.png
└── README.md

Database schema

The project uses 4 analytical tables built on top of 2 raw tables loaded from Python.

SQL concepts used

Concept	Used in
`GROUP BY`	H1, H2, H3
`JOIN`	H3
Subqueries	H1, H2, H5
`AVG`, `COUNT`, `SUM`	All hypotheses

How to get the data

The datasets are sourced from Kaggle via kagglehub. Install it, run the download, and use the local path it returns to load your dataframes.

pip install kagglehub

import kagglehub
path = kagglehub.dataset_download("nadyinky/sephora-products-and-skincare-reviews")
print("Path to dataset files:", path)  # use this path to read your CSVs

Authors

Valeria ACEVEDO
Hanane MAMALIK

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Beauty Market Analysis

Is Sephora a good launchpad for a beauty brand with global ambitions?

Project context

Project presentation

Datasets

Business hypotheses

Analysis conclusions

Project structure

Database schema

SQL concepts used

How to get the data

Authors

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
data		data
sql-scripts		sql-scripts
.DS_Store		.DS_Store
ER-diagram.png		ER-diagram.png
README.md		README.md
beauty-sephora-data-processing.ipynb		beauty-sephora-data-processing.ipynb
beauty-world-data-processing.ipynb		beauty-world-data-processing.ipynb
data-analysis.ipynb		data-analysis.ipynb

Folders and files

Latest commit

History

Repository files navigation

Beauty Market Analysis

Is Sephora a good launchpad for a beauty brand with global ambitions?

Project context

Project presentation

Datasets

Business hypotheses

Analysis conclusions

Project structure

Database schema

SQL concepts used

How to get the data

Authors

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages