An interactive data analysis and prediction project using coffee shop transaction data (March 2024 β March 2025).
The project includes EDA, statistical analysis, regression models, and a Streamlit dashboard.
βββ Coffe_sales.csv # Dataset
βββ Project_coffe_sales_with_models.ipynb # Jupyter Notebook (EDA + Modeling)
βββ dashboard.py # Streamlit app for dashboard
βββ requirements.txt # Dependencies
βββ README.md # Project documentation
- Analyze sales patterns by time, weekday, month, and season.
- Identify top-selling coffee types and customer spending habits.
- Compare weekday vs weekend performance.
- Build predictive models (OLS, Linear Regression, XGBoost).
- Deploy results via a Streamlit dashboard.
- Source: Kaggle Coffee Shop Transactions Dataset
- Period Covered: March 2024 β March 2025 (1 full year)
- Transactions: 3,500
| Column | Description |
|---|---|
hour_of_day |
Hour of purchase (0β23) |
cash_type |
Payment type (Cash/Card) |
money |
Transaction amount |
coffee_name |
Coffee type (Latte, Americano, etc.) |
Time_of_Day |
Morning / Afternoon / Night |
Weekday |
Day of the week |
Month_name |
Month name |
Weekdaysort |
Numeric weekday (1=Mon β¦ 7=Sun) |
Monthsort |
Numeric month (1=Jan β¦ 12=Dec) |
Date |
Transaction date |
Time |
Transaction time |
π Dataset Link
Clone the repository:
git clone https://github.com/MUSAB10000/Project-Coffee-Sales.git
cd Project-Coffee-Salesπ Key Insights
Sales are evenly split between weekdays and weekends (~50/50).
Autumn recorded the highest seasonal sales (~29%).
Americano with Milk and Latte are the most popular coffee types.
XGBoost provided the best predictive accuracy compared to OLS and Linear Regression.