A beginner-friendly, end-to-end data analytics project for restaurant business insights. This repository demonstrates how to analyze a real-world restaurant dataset using Python, pandas, and visualization libraries. All code and outputs are reproducible and ready for extension or learning.
- Top Cuisines Analysis: Find the most popular cuisines and their market share.
- City Analysis: Discover which cities have the most restaurants and the highest average ratings.
- Price Range Distribution: Visualize how restaurants are distributed across price categories.
- Online Delivery Impact: See how online delivery availability affects restaurant ratings.
- Votes & Popularity: Explore the relationship between customer votes and ratings.
- Service Features: Analyze how price range relates to online delivery and table booking.
- Review Text Insights: Extract frequent positive & negative keywords, review length distribution, and correlation between review length and rating.
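As an illustration of two of the analyses listed above, the following is a minimal pandas sketch, not the code from analysis/analysis.py. The column names `Cuisines`, `Has Online delivery`, and `Aggregate rating` are assumptions about the dataset and may need adjusting, and both function names are hypothetical.

```python
import pandas as pd


def top_cuisines(df: pd.DataFrame, column: str = "Cuisines", n: int = 3) -> pd.DataFrame:
    """Return the n most common cuisines and the share of restaurants listing each.

    The column name is an assumption about the dataset, not taken from the script.
    """
    # Split multi-cuisine cells such as "Italian, Pizza" into one row per cuisine.
    cuisines = df[column].dropna().str.split(",").explode().str.strip()
    counts = cuisines.value_counts().head(n)
    # Share is measured against restaurants that have any cuisine value at all.
    total = df[column].notna().sum()
    return pd.DataFrame({
        "restaurant_count": counts,
        "share_pct": (counts / total * 100).round(2),
    })


def rating_by_delivery(
    df: pd.DataFrame,
    flag_column: str = "Has Online delivery",
    rating_column: str = "Aggregate rating",
) -> pd.Series:
    """Average rating for restaurants with and without online delivery."""
    return df.groupby(flag_column)[rating_column].mean().round(2)


if __name__ == "__main__":
    # Tiny made-up frame, only to show the call pattern.
    sample = pd.DataFrame({
        "Cuisines": ["Italian, Pizza", "Italian", "Chinese", None, "Italian, Chinese"],
        "Has Online delivery": ["Yes", "No", "Yes", "No", "Yes"],
        "Aggregate rating": [4.0, 3.0, 4.5, 2.0, 4.1],
    })
    print(top_cuisines(sample))
    print(rating_by_delivery(sample))
```

Because a restaurant can list several cuisines, the shares in this sketch can sum to more than 100%.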
- Clone the repository
- Install dependencies
python -m venv .venv
.\.venv\Scripts\activate
pip install -r analysis\requirements.txt
- Run the analysis
python analysis\analysis.py
- View results
  - Output CSVs and images are in analysis/output/
  - Open PNG files to see visualizations
In the default mode (FULL_MODE not set), the script produces these core outputs:
- Key tables (CSV):
  - level1_city_analysis.csv
  - level1_price_range_distribution.csv
  - level1_online_delivery_summary.csv
  - level1_online_vs_offline_rating.csv
  - level3_price_range_vs_delivery_table.csv
  - level1_top_cuisines.csv
- Key charts (PNG):
  - level1_top3_cuisines.png
  - level1_city_most_restaurants.png
  - level1_price_range_distribution.png
  - level1_online_vs_offline_rating_bar.png
  - level3_votes_vs_rating.png
  - level3_price_range_vs_services.png
Full mode (FULL_MODE=1) additionally produces: top/bottom city and cuisine lists, extended plots (top 10, top 15, pies, highest-rated city), top/bottom vote lists, a level1 summary, README_generated, and extended review keyword artifacts (if review data is provided).
Provide a reviews.csv (recommended columns: review_text, rating) in Data analysis dataset/ to enable:
- reviews_top_positive_keywords.csv / reviews_top_negative_keywords.csv
- reviews_length_stats.csv
- reviews_length_rating_corr.txt (only if rating column present)
- Associated PNG charts
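The review outputs above could be derived along the following lines. This is a hedged sketch and not the script's actual method: the two word lists are tiny hypothetical lexicons chosen for illustration, and the function names are made up. Only the column names `review_text` and `rating` come from the recommendation above.

```python
import re
from collections import Counter

import pandas as pd

# Hypothetical mini-lexicons; a real analysis would use a fuller word list.
POSITIVE_WORDS = {"great", "good", "delicious", "friendly", "tasty"}
NEGATIVE_WORDS = {"bad", "slow", "cold", "rude", "bland"}


def keyword_counts(texts, lexicon, n=10):
    """Count how often each lexicon word appears across all review texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", str(text).lower())
        counts.update(t for t in tokens if t in lexicon)
    return counts.most_common(n)


def review_insights(reviews: pd.DataFrame) -> dict:
    """Keyword counts, review length statistics, and length/rating correlation."""
    texts = reviews["review_text"].fillna("")
    lengths = texts.str.len()  # review length in characters
    result = {
        "positive_keywords": keyword_counts(texts, POSITIVE_WORDS),
        "negative_keywords": keyword_counts(texts, NEGATIVE_WORDS),
        "length_stats": lengths.describe(),
    }
    # The correlation only makes sense when a rating column is available.
    if "rating" in reviews.columns:
        result["length_rating_corr"] = lengths.corr(reviews["rating"])
    return result


if __name__ == "__main__":
    # Made-up reviews, only to show the call pattern.
    demo = pd.DataFrame({
        "review_text": ["Great food, friendly staff", "Cold and bland, slow service"],
        "rating": [5, 1],
    })
    print(review_insights(demo))
```

Review length is measured in characters here; counting words instead would be an equally reasonable choice.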
Run in full mode:
- Command Prompt: set "FULL_MODE=1" & python analysis\analysis.py
- PowerShell: $env:FULL_MODE='1'; python analysis\analysis.py
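For readers adapting the flag to their own scripts, a check of this kind might look like the sketch below. The function name `full_mode_enabled` is hypothetical, and the way analysis/analysis.py actually reads FULL_MODE may differ.

```python
import os


def full_mode_enabled(environ=None) -> bool:
    """Return True when FULL_MODE is set to 1 in the given environment mapping."""
    env = os.environ if environ is None else environ
    # strip() tolerates stray whitespace around the value, which cmd.exe can
    # introduce when a space precedes the "&" in "set VAR=1 & ...".
    return env.get("FULL_MODE", "").strip() == "1"


if __name__ == "__main__":
    mode = "full" if full_mode_enabled() else "default"
    print(f"Running in {mode} mode")
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real process environment.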
- analysis/analysis.py: Main script (supports FULL_MODE flag)
- analysis/requirements.txt: Python dependencies
- analysis/output/: Generated results (CSV + PNG)
- Data placement: put the dataset CSV (any of: Dataset .csv, dataset.csv, Dataset.csv) inside Data analysis dataset/.
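The file lookup described in the data placement note could be sketched as follows. The helper `find_dataset` is hypothetical and not necessarily how the script resolves the file; the candidate names and folder are the ones listed above.

```python
from pathlib import Path

# Accepted file names, in the order listed in this README.
CANDIDATE_NAMES = ("Dataset .csv", "dataset.csv", "Dataset.csv")


def find_dataset(data_dir="Data analysis dataset") -> Path:
    """Return the first accepted dataset file found inside data_dir."""
    folder = Path(data_dir)
    for name in CANDIDATE_NAMES:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        f"No dataset CSV found in {folder!s}; expected one of {CANDIDATE_NAMES}"
    )


if __name__ == "__main__":
    try:
        print(f"Using dataset: {find_dataset()}")
    except FileNotFoundError as exc:
        print(exc)
```

Failing early with a message that names the accepted files tends to be friendlier to beginners than a traceback from a later read step.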
- Fork and add new analyses (e.g., sentiment, time trends)
- Use as a template for your own data projects
- All code is commented for easy understanding
MIT License. Free for learning and commercial use.





