Clément Antheaume - Camille-Amaury Juge
This repository collects the tutorials and practical assignments we completed during the first year of the Master's degree in machine learning at Nantes University.
Here is a list of the Python libraries we used:
Dataset: user reviews (a comment and a rating from 1 to 5) of a course, see the CSV file "/TP1/reviews_by_course.csv"
Aim:
- Open a CSV file with pandas.
- Quickly explore the data with basic pandas functions.
- Learn how to turn text into features with two approaches (bag of words (one-hot encoder) and TF-IDF) using Sklearn.
- Learn how to implement a Multinomial Naive Bayes model to predict the class (the user's rating) using Sklearn.
- Learn how to adjust model parameters.
- Learn how to calculate and read result indicators.
- Learn how to draw line charts and standard deviation charts using Plotly.
--> The TF-IDF part may need to be corrected (be careful with the results)
--> More parameters than alpha still need to be explored
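
The text-classification steps above can be sketched as follows. This is a minimal illustration, not the notebook's code: the reviews and ratings are made up, and the choice of `make_pipeline` with the default vectorizer settings is an assumption.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy reviews and ratings, invented for illustration
# (they do not come from reviews_by_course.csv).
reviews = [
    "great course, very clear",
    "excellent teacher and great content",
    "boring and too long",
    "terrible course, boring videos",
]
ratings = [5, 5, 1, 1]

# Bag of words: each review becomes a vector of word counts.
bow_model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
bow_model.fit(reviews, ratings)

# TF-IDF: counts are reweighted so that words found in many reviews weigh less.
tfidf_model = make_pipeline(TfidfVectorizer(), MultinomialNB(alpha=1.0))
tfidf_model.fit(reviews, ratings)

new_review = ["great and clear content"]
bow_pred = bow_model.predict(new_review)
tfidf_pred = tfidf_model.predict(new_review)
print(bow_pred, tfidf_pred)
```

`alpha` is the smoothing parameter of `MultinomialNB`; other settings worth trying include the vectorizer's `ngram_range` and `min_df`.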
Dataset:
- Multivalued indicators related to candidates' chances of admission, see the CSV file "/TP2/admissiondata.csv"
- Multivalued indicators related to houses' environment and their price, see Sklearn's load_boston
Aim:
- Work with Google Colab: basic import functions and directory management.
- Open a CSV file with pandas.
- Quickly explore the data with basic pandas functions.
- Pre-process the data with pandas and numpy.
- Learn how to read a correlation matrix (see the other repository named Data Analysis Project for more detailed statistics).
- Learn how to implement a Linear Regression using Sklearn.
- Learn how to calculate and read result indicators.
- Learn how to implement a Gradient Boosting Regression using Sklearn.
- Learn how to calculate and read result indicators.
- Learn how to draw heatmaps and scatter plots using Seaborn.
- Learn how to draw bar plots using Matplotlib.
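
As a rough sketch of the regression workflow above, the example below computes a correlation matrix and compares the two models. The data is synthetic and the column names (`test_score`, `gpa`, `chance`) are hypothetical stand-ins, not the columns of admissiondata.csv.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an admission dataset (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "test_score": rng.uniform(290, 340, n),
    "gpa": rng.uniform(6.0, 10.0, n),
})
df["chance"] = (
    0.005 * (df["test_score"] - 290)
    + 0.08 * (df["gpa"] - 6.0)
    + rng.normal(0, 0.02, n)
)

# Correlation matrix: values near +1 or -1 indicate a strong linear relationship.
corr = df.corr()
print(corr)

X = df[["test_score", "gpa"]]
y = df["chance"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit both regressors and collect the same indicators for each.
models = {
    "linear": LinearRegression(),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "mse": mean_squared_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }
    print(name, scores[name])
```

The `corr` DataFrame can be passed directly to Seaborn's `heatmap` function to draw the heatmap mentioned in the list.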
Dataset:
- Multivalued indicators related to people's ratings of movies, see the CSV file "/TP3/movies.csv"
Aim:
- Work with Google Colab: basic import functions and directory management.
- Open a CSV file with pandas.
- Quickly explore the data with basic pandas functions.
- Pre-process the data with pandas and numpy.
- Learn how to read a confusion matrix.
- Learn how to implement a Decision Tree using Sklearn.
- Learn how to calculate and read result indicators.
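
A minimal sketch of the decision tree and confusion matrix steps, assuming a generic binary classification task. The data is generated with `make_classification` rather than read from movies.csv, and `max_depth=4` is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (stand-in for the real dataset).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(X_train, y_train)
y_pred = tree.predict(X_test)

# Rows are true classes, columns are predicted classes.
# The diagonal holds the correctly classified samples.
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
print(cm)
print(acc)
```

Accuracy equals the sum of the diagonal divided by the total number of test samples, which is a quick way to check that the matrix is being read correctly.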
Dataset:
- Multivalued indicators related to people's bank credit, see the file "/DataMining/TP2/german.data"
Aim:
- Work with Google Colab: basic import functions and directory management.
- Open a CSV file with pandas.
- Quickly explore the data with basic pandas functions.
- Pre-process the data with pandas and numpy.
- Learn how to read a confusion matrix.
- Learn how to implement a Decision Tree, a Random Forest and KNN using Sklearn.
- Learn how to implement a Grid Search for fine-tuning.
- Learn how to implement Cross-Validation rather than a simple validation split.
- Learn how to calculate and read result indicators.
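
The cross-validation and grid search steps might look like the sketch below. The data is synthetic (not german.data), and the parameter grid is a small hypothetical example chosen to keep the run short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data (stand-in for the real dataset).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=0
)

# 5-fold cross-validation: the model is trained and scored on 5 different splits,
# which gives a more stable estimate than a single train/validation split.
knn_scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5)
print(knn_scores.mean(), knn_scores.std())

# Grid search: every parameter combination is evaluated with cross-validation.
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

After fitting, `search.best_estimator_` is the model refitted on the whole dataset with the best parameters found.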
Dataset:
See the file "/ProbabilisticModel/PM_Vitterbi_Forward_Backward.html"
Aim:
- Implement the Forward and Backward algorithms in Python for PFA.
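
For illustration, here is a small sketch of the Forward and Backward recursions. It assumes a hidden-Markov-style formulation with two states and two symbols. All probabilities are made up, and the notebook's own PFA definition may differ.

```python
import numpy as np

# Hypothetical 2-state, 2-symbol model (all numbers invented for illustration).
initial = np.array([0.6, 0.4])        # initial[i] = P(first state = i)
transition = np.array([[0.7, 0.3],
                       [0.4, 0.6]])   # transition[i, j] = P(next = j | current = i)
emission = np.array([[0.9, 0.1],
                     [0.2, 0.8]])     # emission[i, k] = P(symbol = k | state = i)


def forward(obs):
    """alpha[t, i] = P(obs[0..t], state at time t = i)."""
    alpha = np.zeros((len(obs), len(initial)))
    alpha[0] = initial * emission[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ transition) * emission[:, obs[t]]
    return alpha


def backward(obs):
    """beta[t, i] = P(obs[t+1..end] | state at time t = i)."""
    beta = np.ones((len(obs), len(initial)))
    for t in range(len(obs) - 2, -1, -1):
        beta[t] = transition @ (emission[:, obs[t + 1]] * beta[t + 1])
    return beta


obs = [0, 1, 0]
alpha = forward(obs)
beta = backward(obs)

# Both passes must give the same probability for the whole sequence.
prob_forward = alpha[-1].sum()
prob_backward = (initial * emission[:, obs[0]] * beta[0]).sum()
print(prob_forward, prob_backward)
```

Comparing the two sequence probabilities is a simple sanity check: if they disagree, one of the recursions is wrong.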
You can freely copy the notebooks, except for the correction notebooks and the PDF files, which belong to Nantes University.
See the LinkedIn shields at the top of the README.md