ML-Classification

Protein Expression in mice with Down Syndrome

Description

The project is divided into four main parts:

1. Preliminary analysis
2. Classification
3. Prediction with different algorithms and evaluation
4. a. Prediction of the expression values of the protein
   b. Determination of whether the test performance of the best model found at step 2 improves if the SOD1_N feature is also used for prediction (and training)

Programming language: Python, in a Jupyter notebook

Datasets used for the analysis: a training dataset and a test dataset

1. Preliminary analysis: exploratory data analysis on training data

Inspection of classes and parameters/proteins
Inspection of protein expression distribution
Missing data
Extreme values
Overview of the proteins most important for predicting each class
Feature-to-feature relationships and collinearity
Check for unbalanced classes
Division of proteins into groups of low, medium and high expression for each class
Important proteins for each pair of biologically meaningful classes
Clustering: search for structure in the data
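
A few of the checks above (missing data, class balance, collinearity) could be sketched as follows. This is only an illustration on synthetic data: the column names `protein_a`, `protein_b`, `protein_c` and `class`, the class labels, and the 0.9 correlation threshold are assumptions, not taken from the notebook.

```python
import numpy as np
import pandas as pd


def summarize_training_data(df, class_col="class", corr_threshold=0.9):
    """Return missing-value fractions, class counts and highly correlated feature pairs."""
    features = df.drop(columns=[class_col]).select_dtypes(include="number")

    # Fraction of missing values per protein, largest first.
    missing_fraction = features.isna().mean().sort_values(ascending=False)

    # Number of samples per class, to spot unbalanced classes.
    class_counts = df[class_col].value_counts()

    # Absolute pairwise correlations; keep the upper triangle so each pair appears once.
    corr = features.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    stacked = upper.stack()
    collinear_pairs = stacked[stacked > corr_threshold]
    return missing_fraction, class_counts, collinear_pairs


# Synthetic stand-in for the training data (hypothetical columns and labels).
rng = np.random.default_rng(0)
base = rng.normal(size=200)
demo = pd.DataFrame({
    "protein_a": base,
    "protein_b": base * 2.0 + rng.normal(scale=0.01, size=200),  # nearly collinear with protein_a
    "protein_c": rng.normal(size=200),
    "class": rng.choice(["class_0", "class_1"], size=200),
})
demo.loc[:9, "protein_c"] = np.nan  # inject 10 missing values

missing_fraction, class_counts, collinear_pairs = summarize_training_data(demo)
print(missing_fraction)
print(class_counts)
print(collinear_pairs)
```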

2. Classification: find a model that can classify mice into the 8 different classes

Feature selection
Comparison of different classification algorithms
Test the best algorithm on the test set
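
One way to combine feature selection with a comparison of classifiers is sketched below with scikit-learn. The synthetic data, the two candidate algorithms, the choice of `SelectKBest` with `k=15`, and 5-fold stratified cross-validation are all assumptions made for illustration; the notebook may use different methods.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data: 8 classes, 30 numeric features.
X, y = make_classification(
    n_samples=400, n_features=30, n_informative=10,
    n_classes=8, n_clusters_per_class=1, random_state=0,
)

# Hypothetical candidate algorithms to compare.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
mean_accuracy = {}
for name, estimator in candidates.items():
    # Scaling and feature selection sit inside the pipeline, so they are
    # refit on each training fold and never see the validation fold.
    pipeline = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=15), estimator)
    scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
    mean_accuracy[name] = scores.mean()

best_name = max(mean_accuracy, key=mean_accuracy.get)
print(mean_accuracy)
print("best:", best_name)
```

Only the algorithm selected this way would then be refit on the full training set and scored once on the test set.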

3. Regression: using the training data, train and compare different regression algorithms to predict the expression value of the SOD1 protein (feature SOD1_N) from the other features, using robust evaluation techniques to compare the algorithms
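
The comparison in step 3 might look like the sketch below. Repeated k-fold cross-validation is used here as one example of a robust evaluation technique; that choice, the synthetic data, the two regressors and the R² metric are assumptions, not necessarily what the notebook does.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Synthetic stand-in: X plays the role of the other proteins, y the role of SOD1_N.
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=10, noise=1.0, random_state=0
)

# Hypothetical candidate regressors.
regressors = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# 5 folds repeated 3 times gives 15 scores per model, which reduces the
# dependence of the comparison on a single random split.
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
r2_summary = {}
for name, regressor in regressors.items():
    scores = cross_val_score(regressor, X, y, cv=cv, scoring="r2")
    r2_summary[name] = (scores.mean(), scores.std())

for name, (mean_r2, std_r2) in r2_summary.items():
    print(f"{name}: R2 = {mean_r2:.3f} +/- {std_r2:.3f}")
```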

4.a Use the best regression model found in step 3 to predict the expression value of SOD1_N, so that the corresponding column in the test data is filled in
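
Filling the SOD1_N column of the test data could be done along these lines. The sketch assumes the column is entirely missing in the test set and uses a plain linear regression as a placeholder for whichever model was selected; apart from `SOD1_N`, the column names and data are made up.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
feature_cols = ["protein_a", "protein_b"]  # hypothetical predictor columns

# Synthetic training data in which SOD1_N depends on the other proteins.
train = pd.DataFrame(rng.normal(size=(100, 2)), columns=feature_cols)
train["SOD1_N"] = (
    0.5 * train["protein_a"] - 0.2 * train["protein_b"]
    + rng.normal(scale=0.05, size=100)
)

# Synthetic test data with an empty SOD1_N column.
test = pd.DataFrame(rng.normal(size=(20, 2)), columns=feature_cols)
test["SOD1_N"] = np.nan

# Fit on the training data only, then predict the missing column.
model = LinearRegression()
model.fit(train[feature_cols], train["SOD1_N"])

test_filled = test.copy()  # leave the original test frame untouched
test_filled["SOD1_N"] = model.predict(test[feature_cols])
print(test_filled.head())
```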

4.b Determine whether the test performance of the best classification model found in step 2 improves when the SOD1_N feature is also used for training and prediction
