🐍 Python - Kiwilytics Project

This repository contains a Python project developed as part of the Kiwilytics Data Engineering Course.
The project demonstrates practical data analysis and manipulation techniques using Python and Pandas.

📘 Project Overview

The goal of this project is to perform essential data engineering tasks, including:

Handling missing values
Calculating total prices and revenues
Grouping and aggregating data
Identifying insights such as top-selling products and highest-spending customers

All tasks were implemented in a Jupyter Notebook to demonstrate the workflow clearly.

🧩 Main Tasks Performed

Data Cleaning and Handling Missing Values
- Filled missing unit_price values using the average price per product.
Feature Engineering
- Created a new column total_price by multiplying unit_price and quantity.
Revenue Calculation
- Calculated total revenue across all orders.
Data Analysis and Insights
- Identified which product has the highest total quantity sold.
- Determined which customer has the highest total spending.
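
The tasks above can be sketched in a few lines of Pandas. This is a minimal illustration rather than the notebook's actual code: the sample rows are made up, and the column names `product_id` and `customer_id` are assumptions, since only `unit_price`, `quantity`, and `total_price` are named in this README.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows standing in for data/kiwilytics_orders.csv.
# The column names product_id and customer_id are assumptions.
orders = pd.DataFrame({
    "customer_id": ["C1", "C2", "C1", "C3", "C2"],
    "product_id": ["P1", "P1", "P2", "P2", "P1"],
    "quantity": [2, 1, 3, 1, 4],
    "unit_price": [10.0, np.nan, 5.0, np.nan, 12.0],
})

# Data cleaning: fill each missing unit_price with the mean price
# of the same product.
product_mean = orders.groupby("product_id")["unit_price"].transform("mean")
orders["unit_price"] = orders["unit_price"].fillna(product_mean)

# Feature engineering: total price per order line.
orders["total_price"] = orders["unit_price"] * orders["quantity"]

# Revenue calculation: total revenue across all orders.
total_revenue = orders["total_price"].sum()

# Insights: product with the highest total quantity sold and
# customer with the highest total spending.
top_product = orders.groupby("product_id")["quantity"].sum().idxmax()
top_customer = orders.groupby("customer_id")["total_price"].sum().idxmax()

print(total_revenue, top_product, top_customer)
```

Note that a product with no known price at all would keep its missing values under this approach, because its group mean is also missing.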

🛠️ Tools and Libraries Used

Python 3
Pandas
Jupyter Notebook

🧠 Skills Demonstrated

Data cleaning and transformation
Use of groupby(), fillna(), and transform() in Pandas
Logical problem solving for real-world data engineering tasks
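
As a short illustration of why transform() pairs well with fillna(), the sketch below contrasts a plain group aggregation with transform(). The data and the `product_id` column name are hypothetical.

```python
import pandas as pd

# Hypothetical prices; product_id is an assumed column name.
prices = pd.DataFrame({
    "product_id": ["P1", "P1", "P2"],
    "unit_price": [10.0, 12.0, 5.0],
})

grouped = prices.groupby("product_id")["unit_price"]

# Aggregation: one value per product, indexed by product_id.
per_product = grouped.mean()

# transform: one value per original row, aligned with the original index,
# so the result can be passed directly to fillna() or assigned as a column.
per_row = grouped.transform("mean")

print(len(per_product), len(per_row))
```

Because the transformed series keeps the original row index, no merge or join is needed to line the group means up with the rows they belong to.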

🧩 Dataset

The dataset used in this project is stored in the data folder of this repository: data/kiwilytics_orders.csv

📝 The dataset was provided by the course instructors for educational purposes and does not contain any real or sensitive information.

📂 Files in This Repository

| File/Folder | Description |
| --- | --- |
| `Kiwilytics-Python-Project.ipynb` | Main Jupyter Notebook containing all Python code, data cleaning, and analysis |
| `data/kiwilytics_orders.csv` | Dataset used for the analysis (sample data provided by the course) |
| `README.md` | Project documentation, objectives, and instructions |

🚀 How to Run the Project

Clone or download this repository.
Open the Jupyter Notebook file: Kiwilytics-Python-Project.ipynb
Make sure the CSV file exists at data/kiwilytics_orders.csv, relative to the repository root.
Run all notebook cells sequentially to reproduce the results.
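
Before running the notebook, the dataset location can be checked with a small helper like the one below. This is a sketch only: `load_orders` is a hypothetical function, not part of the notebook, and it assumes the code runs from the repository root.

```python
from pathlib import Path

import pandas as pd

# Path taken from this README, relative to the repository root.
DATA_PATH = Path("data") / "kiwilytics_orders.csv"


def load_orders(path=DATA_PATH):
    """Hypothetical helper: load the orders CSV, failing early if it is missing."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"Expected dataset at {path}")
    return pd.read_csv(path)
```

Calling `load_orders()` in the first notebook cell would surface a missing file immediately instead of partway through the analysis.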
