End to End ML project

A general approach to implementing any ML project with an organised structure and pipeline. The project is also deployed on AWS Elastic Beanstalk with an EC2 instance.

URL : http://end2endml-env.eba-mtvphctj.eu-west-2.elasticbeanstalk.com/predictdata

Tech Stack :

numpy
pandas
scikit-learn : used for data preprocessing, model selection and hyperparameter tuning
Flask : used to implement the prediction pipeline and present it as a web app
xgboost
catboost
joblib : to handle model saving and loading

Data Example :

Student performance dataset

Dataframe Head :

Approach for the project

Data Ingestion :
- Data reading and splitting.
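A minimal sketch of the ingestion step, assuming the raw data is a CSV file read with pandas and split with scikit-learn's `train_test_split`. The column names, the inline sample rows, the 80/20 split and the `ingest` helper are illustrative assumptions, not taken from the project.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for a CSV file under notebook/data; columns are illustrative.
RAW_CSV = """gender,lunch,reading_score,writing_score,math_score
female,standard,72,74,72
female,standard,90,88,69
male,free/reduced,57,44,47
male,standard,78,75,76
female,standard,83,78,71
male,free/reduced,43,39,40
female,standard,95,92,88
male,standard,64,67,65
female,free/reduced,60,70,38
male,standard,54,52,58
"""


def ingest(csv_source, test_size=0.2, random_state=42):
    """Read the raw CSV and return (train_df, test_df)."""
    df = pd.read_csv(csv_source)
    train_df, test_df = train_test_split(
        df, test_size=test_size, random_state=random_state
    )
    return train_df, test_df


# io.StringIO lets the sketch run without a file on disk; a real path works the same way.
train_df, test_df = ingest(io.StringIO(RAW_CSV))
```

Fixing `random_state` keeps the split reproducible between runs, which makes later model comparisons easier to interpret.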
Data Transformation :
- Used scikit-learn to preprocess the data: SimpleImputer handles missing values (median strategy for numeric features, most-frequent strategy for categorical features), followed by standard scaling.
- The preprocessing pipeline is saved with joblib so that the prediction pipeline can reuse it on unseen data.
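The transformation step could look roughly like the sketch below. The imputation strategies and the scaling follow the description above; the column names, the `OneHotEncoder` for categorical features, the choice to scale only the numeric columns, and the file name `preprocessor.joblib` are assumptions made for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names, for illustration only.
NUMERIC = ["reading_score", "writing_score"]
CATEGORICAL = ["gender", "lunch"]


def build_preprocessor():
    """Median imputation + scaling for numeric columns, most-frequent imputation for categorical ones."""
    numeric = Pipeline([
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ])
    categorical = Pipeline([
        ("imputer", SimpleImputer(strategy="most_frequent")),
        # Assumption: categories are one-hot encoded so the models receive numbers.
        ("encoder", OneHotEncoder(handle_unknown="ignore")),
    ])
    # sparse_threshold=0 forces a dense array, which is simpler to inspect.
    return ColumnTransformer(
        [("num", numeric, NUMERIC), ("cat", categorical, CATEGORICAL)],
        sparse_threshold=0,
    )


df = pd.DataFrame({
    "gender": ["female", "male", np.nan, "female"],
    "lunch": ["standard", "free/reduced", "standard", "standard"],
    "reading_score": [72.0, np.nan, 95.0, 57.0],
    "writing_score": [74.0, 88.0, 93.0, 44.0],
})

preprocessor = build_preprocessor()
features = preprocessor.fit_transform(df)

# Persist the fitted preprocessor and load it back, as the prediction step would.
path = os.path.join(tempfile.mkdtemp(), "preprocessor.joblib")
joblib.dump(preprocessor, path)
restored = joblib.load(path)
```

Saving the fitted object, rather than refitting at prediction time, means new inputs are imputed and scaled with the statistics learned from the training data.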
Model Training :
- Models tested: RandomForestRegressor, DecisionTreeRegressor, GradientBoostingRegressor, LinearRegression, XGBRegressor, CatBoostRegressor, AdaBoostRegressor
- Used GridSearchCV for hyperparameter tuning.
- The model with the best evaluation score is saved as a joblib file.
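A reduced sketch of the model selection loop, under several assumptions: only three scikit-learn regressors are included (XGBoost and CatBoost are left out to keep it dependency-free), the parameter grids are tiny placeholders, the data is synthetic, and R² on a held-out split is used as the selection score. The project's actual grids and metric may differ.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Candidate estimators with illustrative (not the project's) parameter grids.
CANDIDATES = {
    "LinearRegression": (LinearRegression(), {}),
    "DecisionTreeRegressor": (
        DecisionTreeRegressor(random_state=0),
        {"max_depth": [3, 5]},
    ),
    "RandomForestRegressor": (
        RandomForestRegressor(random_state=0),
        {"n_estimators": [10, 30]},
    ),
}


def select_best_model(X_train, y_train, X_test, y_test):
    """Tune each candidate with GridSearchCV and keep the one with the best held-out R^2."""
    best_name, best_model, best_score = None, None, float("-inf")
    for name, (estimator, grid) in CANDIDATES.items():
        search = GridSearchCV(estimator, grid, cv=3, scoring="r2")
        search.fit(X_train, y_train)
        score = r2_score(y_test, search.best_estimator_.predict(X_test))
        if score > best_score:
            best_name, best_model, best_score = name, search.best_estimator_, score
    return best_name, best_model, best_score


# Synthetic regression data so the sketch runs on its own.
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

best_name, best_model, best_score = select_best_model(X_train, y_train, X_test, y_test)

# Persist the winner; the file name is a placeholder.
model_path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(best_model, model_path)
```

Adding another regressor is a matter of adding one entry to `CANDIDATES`, which keeps the comparison loop unchanged.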
Prediction Pipeline :
- This pipeline converts the input data into a DataFrame, loads the saved model and preprocessor files, and predicts the final result in Python.
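The prediction pipeline described above can be sketched as follows. To stay self-contained, the example first writes stand-in artifacts (a `StandardScaler` and a `LinearRegression`) to a temporary directory; the feature names, file names and the `to_dataframe` and `predict` helpers are hypothetical.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

ARTIFACT_DIR = tempfile.mkdtemp()
PREPROCESSOR_PATH = os.path.join(ARTIFACT_DIR, "preprocessor.joblib")
MODEL_PATH = os.path.join(ARTIFACT_DIR, "model.joblib")

# Stand-in artifacts so the sketch runs without a prior training step.
train = pd.DataFrame({
    "reading_score": [50.0, 60.0, 70.0, 80.0],
    "writing_score": [52.0, 58.0, 73.0, 79.0],
})
target = [51.0, 59.0, 71.0, 80.0]
scaler = StandardScaler().fit(train)
joblib.dump(scaler, PREPROCESSOR_PATH)
joblib.dump(LinearRegression().fit(scaler.transform(train), target), MODEL_PATH)


def to_dataframe(reading_score, writing_score):
    """Convert raw user input into a one-row DataFrame with the training column names."""
    return pd.DataFrame({
        "reading_score": [reading_score],
        "writing_score": [writing_score],
    })


def predict(features):
    """Load the saved preprocessor and model, then predict on new data."""
    preprocessor = joblib.load(PREPROCESSOR_PATH)
    model = joblib.load(MODEL_PATH)
    return model.predict(preprocessor.transform(features))


prediction = predict(to_dataframe(65.0, 66.0))
```

Building the DataFrame with the same column names used in training matters, because the saved preprocessor selects and checks columns by name.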
Flask App creation :
- A Flask app provides a convenient way for users to enter input.
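A minimal sketch of how a Flask route could accept user input and return a prediction. The `/predictdata` path matches the deployed URL; the form field names, the JSON response and the `predict_score` placeholder (which simply averages its inputs instead of calling a trained model) are assumptions for illustration.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_score(reading_score, writing_score):
    """Hypothetical placeholder for the real prediction pipeline."""
    return (reading_score + writing_score) / 2


@app.route("/predictdata", methods=["POST"])
def predict_data():
    # Validate the form fields before handing them to the model.
    try:
        reading = float(request.form["reading_score"])
        writing = float(request.form["writing_score"])
    except (KeyError, ValueError):
        return jsonify(error="reading_score and writing_score must be numbers"), 400
    return jsonify(prediction=predict_score(reading, writing))


# Flask's built-in test client exercises the route without starting a server.
client = app.test_client()
response = client.post(
    "/predictdata", data={"reading_score": "70", "writing_score": "80"}
)
```

Returning a 400 response for missing or non-numeric fields keeps malformed input from reaching the model code.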

As this approach is general, one can easily replace the data in notebook/data and edit the data transformer accordingly to train a new model. The model selection will work mainly on regression problems.

To improve the application's performance, you can comment out or delete the CatBoost model.

Todo

Create a UI
Present a more detailed analysis for the provided data

Repository : farok-amo/mlproject
