This project is part of the CodeAlpha Internship Program and focuses on building a Sales Prediction model using supervised machine learning techniques. It involves data preprocessing, exploratory data analysis (EDA), model training, and performance evaluation to forecast future sales.
The goal of this project is to predict sales based on historical data using regression models. The notebook walks through the steps of:
- Loading and preprocessing the dataset
- Exploring relationships in the data
- Training a linear regression model
- Evaluating the model using appropriate metrics
- CodeAlpha_Sales_Prediction.ipynb: The main Jupyter Notebook containing all code and analysis.
- data/: (Optional) Folder for storing dataset files if added later.
- Python 3.x
- Jupyter Notebook
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn
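
A quick way to confirm the packages listed above are installed before opening the notebook is sketched below. This is only an illustrative helper, not part of the project: the function name `missing_packages` and the `REQUIRED` mapping are assumptions. Note that scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Maps the install name to the importable module name.
# This mapping is an assumption based on the requirements list above.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the install names of packages that cannot be found."""
    return [name for name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any package reported as missing can then be installed with your usual package manager.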
- Data Cleaning & Preprocessing: Handles missing values and formats the data for modeling.
- Visualization: Utilizes Seaborn and Matplotlib to understand feature relationships.
- Model Building: Implements Linear Regression for sales forecasting.
- Performance Evaluation: Assesses model accuracy using metrics like R² and Mean Squared Error (MSE).
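
To make the two evaluation metrics concrete, here is a minimal sketch that computes MSE and R² directly from their definitions with NumPy. The function names and the sample numbers are made up for illustration; the notebook itself may rely on `sklearn.metrics.mean_squared_error` and `sklearn.metrics.r2_score`, which compute the same quantities.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean Squared Error: the average of the squared residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r2(y_true, y_pred):
    """R²: 1 minus (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical actual and predicted sales values, for illustration only.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 39]

print(f"MSE: {mse(actual, predicted):.3f}")  # MSE: 4.500
print(f"R²:  {r2(actual, predicted):.3f}")   # R²:  0.964
```

A lower MSE means smaller average errors, while an R² closer to 1 means the model explains more of the variance in sales.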
- Load Data: Reads the dataset into a DataFrame.
- EDA: Visualizes feature distributions and correlations.
- Preprocess: Converts categorical variables and scales numerical features.
- Train Model: Fits a Linear Regression model.
- Evaluate Model: Prints R² score and plots prediction vs. actual sales.
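
The workflow steps above could be sketched roughly as follows. This is not the notebook's actual code: the dataset is synthetic, and the column names (`tv_spend`, `radio_spend`, `region`, `sales`), the 80/20 split, and the helper names are all assumptions chosen for illustration. The EDA and plotting steps are left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_demo_data(n_rows=200, seed=0):
    """Build a synthetic stand-in dataset (hypothetical columns)."""
    rng = np.random.default_rng(seed)
    tv = rng.uniform(0, 300, n_rows)
    radio = rng.uniform(0, 50, n_rows)
    region = rng.choice(["north", "south", "west"], n_rows)
    effect = pd.Series(region).map({"north": 0.0, "south": 1.0, "west": 2.0}).to_numpy()
    sales = 3.0 + 0.05 * tv + 0.1 * radio + effect + rng.normal(0, 0.5, n_rows)
    return pd.DataFrame(
        {"tv_spend": tv, "radio_spend": radio, "region": region, "sales": sales}
    )


def train_and_evaluate(df, target="sales"):
    """Preprocess, fit Linear Regression, and return (model, R², MSE)."""
    df = df.dropna()  # simplest possible missing-value handling
    X, y = df.drop(columns=[target]), df[target]

    numeric_cols = X.select_dtypes(include="number").columns.tolist()
    categorical_cols = X.select_dtypes(exclude="number").columns.tolist()

    # Scale numeric features and one-hot encode categorical ones.
    preprocess = ColumnTransformer(
        [
            ("num", StandardScaler(), numeric_cols),
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        ]
    )
    model = Pipeline([("preprocess", preprocess), ("regressor", LinearRegression())])

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    return model, r2_score(y_test, predictions), mean_squared_error(y_test, predictions)


model, r2_value, mse_value = train_and_evaluate(make_demo_data())
print(f"R² score: {r2_value:.3f}")
print(f"MSE:      {mse_value:.3f}")
```

Wrapping preprocessing and the regressor in a single `Pipeline` keeps the scaler and encoder fitted on the training split only, which avoids leaking test data into the model. To use a real dataset, replace `make_demo_data()` with a `pd.read_csv(...)` call on your own file.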
To run this project locally:
- Clone the repository:
git clone https://github.com/your-username/sales-prediction.git
cd sales-prediction