🎯 Student Math Score Prediction - Machine Learning Project
This project predicts students' math scores from academic and demographic features such as reading and writing scores, gender, parental education, lunch type, and test preparation.
It showcases a full machine learning workflow, from data preprocessing and EDA to feature engineering and model evaluation.
📂 Dataset Description
| Column | Description |
| --- | --- |
| gender | Student's gender (male / female) |
| race_ethnicity | Student group based on race/ethnicity |
| parental_level_of_education | Parent's highest education level |
| lunch | Type of lunch (standard / free/reduced) |
| test_preparation_course | Completion of test preparation course |
| reading_score | Reading test score |
| writing_score | Writing test score |
| math_score | Target variable - score in mathematics |
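A quick schema check can catch a renamed or missing column before any modeling starts. The sketch below is only an illustration built from the column list above; the helper name `check_schema` is hypothetical and not part of this repository.

```python
import pandas as pd

# Column names taken from the dataset description above.
EXPECTED_COLUMNS = [
    "gender",
    "race_ethnicity",
    "parental_level_of_education",
    "lunch",
    "test_preparation_course",
    "reading_score",
    "writing_score",
    "math_score",
]
SCORE_COLUMNS = ["reading_score", "writing_score", "math_score"]


def check_schema(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fail early if the frame does not match the description."""
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for col in SCORE_COLUMNS:
        if not pd.api.types.is_numeric_dtype(df[col]):
            raise TypeError(f"{col} should be numeric, got {df[col].dtype}")
    return df
```

Calling `check_schema(df)` right after loading the data returns the frame unchanged when everything matches, so it can be chained into the rest of the notebook.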
📝 Sample Dataset
| gender | race_ethnicity | parental_level_of_education | lunch | test_preparation_course | reading_score | writing_score | math_score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| female | group B | bachelor's degree | standard | none | 72 | 74 | 72 |
| female | group C | some college | standard | completed | 90 | 88 | 69 |
| female | group B | master's degree | standard | none | 95 | 93 | 90 |
| male | group A | associate's degree | free/reduced | none | 57 | 44 | 47 |
| male | group C | some college | standard | none | 78 | 75 | 76 |
🎯 Project Objectives
📊 Conduct Exploratory Data Analysis (EDA) to understand distributions and correlations.
🧹 Preprocess data with encoding, scaling, and transformation techniques.
🧠 Train and evaluate multiple regression models to predict math_score.
📈 Evaluate models using R², MAE, and RMSE metrics.
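The objectives above can be sketched as a single scikit-learn pipeline. This is a minimal illustration, not the notebook's actual code: the three candidate models, the 80/20 split, and the helper names `build_pipeline` and `compare_models` are all assumptions, since this README does not name the regressors or the split that were used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

CATEGORICAL = [
    "gender",
    "race_ethnicity",
    "parental_level_of_education",
    "lunch",
    "test_preparation_course",
]
NUMERIC = ["reading_score", "writing_score"]
TARGET = "math_score"


def build_pipeline(model):
    """One-hot encode the categorical columns, scale the numeric ones, then fit the model."""
    preprocess = ColumnTransformer(
        [
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
            ("num", StandardScaler(), NUMERIC),
        ]
    )
    return Pipeline([("preprocess", preprocess), ("model", model)])


def compare_models(df, test_size=0.2, random_state=42):
    """Train each candidate on one split and report R², MAE, and RMSE on the held-out part."""
    X = df[CATEGORICAL + NUMERIC]
    y = df[TARGET]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    # Assumed candidate list; swap in whichever regressors you want to compare.
    candidates = {
        "linear": LinearRegression(),
        "ridge": Ridge(alpha=1.0),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=random_state),
    }
    rows = []
    for name, model in candidates.items():
        pipe = build_pipeline(model).fit(X_train, y_train)
        pred = pipe.predict(X_test)
        rows.append(
            {
                "model": name,
                "r2": r2_score(y_test, pred),
                "mae": mean_absolute_error(y_test, pred),
                "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
            }
        )
    return pd.DataFrame(rows).set_index("model")
```

`compare_models(df)` expects the full dataset as a pandas DataFrame with the columns described earlier and returns one row of metrics per model. Keeping the preprocessing inside the pipeline means the encoder and scaler are fitted on the training split only, which avoids leaking test data into the transformation step.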
🧪 Machine Learning Techniques Applied
Categorical Encoding: Label & One-Hot Encoding
Feature Engineering: Derived features & normalization
Multiple Regression Models: Tried various ML regressors
Evaluation Metrics: R² Score, MAE, RMSE
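To make the encoding and feature-engineering items concrete, here is a small pandas sketch using a few columns from the sample rows shown earlier. The derived feature `literacy_avg` and the choice of min-max scaling are assumptions made for illustration; this README does not say which derived features or normalization the notebook uses.

```python
import pandas as pd

# A few columns from the sample rows in this README.
sample = pd.DataFrame(
    {
        "lunch": ["standard", "standard", "standard", "free/reduced", "standard"],
        "test_preparation_course": ["none", "completed", "none", "none", "none"],
        "reading_score": [72, 90, 95, 57, 78],
        "writing_score": [74, 88, 93, 44, 75],
    }
)

# Label encoding: each category becomes an integer code.
# Categories are ordered alphabetically, so free/reduced -> 0 and standard -> 1.
sample["lunch_label"] = sample["lunch"].astype("category").cat.codes

# One-hot encoding: one 0/1 column per category.
one_hot = pd.get_dummies(sample["test_preparation_course"], prefix="prep", dtype=int)

# Derived feature (hypothetical): average of the two language scores.
sample["literacy_avg"] = (sample["reading_score"] + sample["writing_score"]) / 2

# Min-max normalization: rescale reading_score to the [0, 1] range.
reading = sample["reading_score"]
sample["reading_scaled"] = (reading - reading.min()) / (reading.max() - reading.min())

encoded = pd.concat([sample, one_hot], axis=1)
print(encoded)
```

Label encoding implies an order between categories, which suits ordinal features; one-hot encoding is usually the safer default for unordered ones such as `race_ethnicity`.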
📈 Key Insights
📚 Students who completed the test preparation course generally scored higher.
✍️ Reading and writing scores are strongly correlated with math scores.
🎓 Parental education and lunch type are associated with differences in student performance.
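Insights like these can be checked with a group-wise mean and a correlation table. The helpers below are a sketch with hypothetical names (`mean_math_by`, `score_correlations`); they assume `df` is the full dataset loaded into a pandas DataFrame with the columns described earlier. The five sample rows in this README are too few to confirm any of these patterns.

```python
import pandas as pd


def mean_math_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Average math_score for each category of `column`, highest first."""
    return df.groupby(column)["math_score"].mean().sort_values(ascending=False)


def score_correlations(df: pd.DataFrame) -> pd.Series:
    """Pearson correlation of reading_score and writing_score with math_score."""
    scores = df[["reading_score", "writing_score", "math_score"]]
    return scores.corr()["math_score"].drop("math_score")
```

For example, `mean_math_by(df, "test_preparation_course")` compares students who completed the course with those who did not, and `mean_math_by(df, "lunch")` or `mean_math_by(df, "parental_level_of_education")` does the same for the other two factors. These are descriptive comparisons and do not by themselves show that a factor causes higher scores.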
💻 Technologies Used
Python (Jupyter Notebook)
Pandas, NumPy – Data manipulation
Matplotlib, Seaborn – Visualizations
Scikit-learn – Machine learning
Flask – Web app deployment (optional)
▶️ How to Run
1. Clone the repository or download the notebook file.
2. Install dependencies:
       pip install -r requirements.txt
3. Run the notebook step-by-step in Jupyter.
4. Optional: Start Flask app using:
       python app.py
📬 Feedback & Contributions
Feel free to fork the repo, contribute, or share your suggestions!