Phenotype Classification of Medical Text using Machine Learning

Electronic Health Record (EHR) data is a rapidly growing source of unstructured biomedical data. This data is extremely rich, often capturing a patient’s phenotype. In a clinical context, phenotype refers to the medical conditions, diseases, and disorders of a patient. These records can capture data in greater detail than structured encodings such as the International Classification of Diseases (ICD). Traditional methods for extracting phenotypes from this data typically rely on manual review or on rule-based expert systems. Both approaches are time-intensive, rely heavily on human expertise, and scale poorly. This project proposes an automated approach to identifying phenotypes in EHR data through machine learning.

**Data files have been excluded due to size and security. Please contact renzeer@berkeley.edu to request access to the data files.**

Repository files:

- archive
- w266_common
- 1.0_Introduction_Data-WordEmbeddings-NearestNeighbor.ipynb
- 2.0_CardiovascularDisease-CNN-Fulltext.ipynb
- 2.1_CardiovascularDisease-LSTM-FullText.ipynb
- 2.2_Phenotype-LSTM-FullText.ipynb
- 3.0_CardiovascularDisease-CNN-Sentences-NoWordLimit.ipynb
- 3.1_CardiovascularDisease-LSTM-Sentences-NoWordLimit.ipynb
- 4.0_CardiovascularDisease-CNN-Sentences-30WordLimit.ipynb
- 4.1_OrthopedicDisorder-CNN-Sentences-30WordLimit.ipynb
- 4.2_Phenotype-CNN-Sentences-30WordLimit.ipynb
- 5.0_CardiovascularDisease-LSTM-Sentences-30WordLimit.ipynb
- 5.1_OrthopedicDisorder-LSTM-Sentences-30WordLimit.ipynb
- 5.2_Phenotype-LSTM-Sentences-30WordLimit.ipynb
- 6.0_CardiovascularDisease-CNN-WeightedCost.ipynb
- 6.1_CardiovascularDisease-CNN-WeightedCostv2.ipynb
- 6.2_CardiovascularDisease-CNN-WeightedCost-L2.ipynb
- 6.3_CardiovascularDisease-CNN-WeightedCost-MultiFilter-MultiDropout.ipynb
- 6.4_CardiovascularDisease-CNN-WeightedCost-MultiFilter-SingleDropout.ipynb
- LICENSE
- README.md
- glove_helper.py
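
Several notebook names mention sentence-level input with a 30-word limit. The notebooks themselves are not reproduced here, so the following is only a minimal sketch of what such preprocessing might look like, not the project's actual code. The function names, the `<pad>` token, the naive sentence splitter, and the sample note are all assumptions made for illustration.

```python
import re

MAX_WORDS = 30        # taken from the "30WordLimit" notebook names
PAD_TOKEN = "<pad>"   # hypothetical padding token, not from the repository


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def to_fixed_length(sentence, max_words=MAX_WORDS, pad=PAD_TOKEN):
    """Lowercase, split on whitespace, then truncate or right-pad to max_words."""
    tokens = sentence.lower().split()[:max_words]
    return tokens + [pad] * (max_words - len(tokens))


# Made-up example text; real EHR notes are not included in the repository.
note = "Patient reports chest pain. History of hypertension and atrial fibrillation."
rows = [to_fixed_length(s) for s in split_sentences(note)]
```

Each row then has exactly `MAX_WORDS` tokens, which is the fixed-length shape that CNN and LSTM sentence classifiers typically expect before an embedding lookup. Clinical text often contains abbreviations with periods, so a real pipeline would likely need a more careful sentence splitter than this one.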
