Machine Learning Pipeline for TDA of Physicians' Networks

Compute persistence images for the physicians' network data, then use K-fold cross-validation (CV) to select the best algorithm and its parameters, including the parameters of the persistence image weight function.
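To make the persistence image step concrete, here is a minimal pure-Python sketch. It is not the code used in this repository: the function name `persistence_image`, the Gaussian kernel width `sigma`, the image bounds, and the default linear weight function are all assumptions chosen for illustration. Each (birth, death) point is mapped to (birth, persistence), weighted by a function of its persistence, and smeared onto a pixel grid.

```python
import math


def persistence_image(diagram, resolution=5, sigma=0.1,
                      bounds=(0.0, 1.0, 0.0, 1.0), weight=None):
    """Return a resolution x resolution image as a list of rows.

    diagram: list of (birth, death) pairs.
    bounds:  (birth_min, birth_max, pers_min, pers_max) of the image plane.
    weight:  function of persistence; defaults to linear weighting
             (an assumption, the pipeline treats this as a tunable choice).
    """
    if weight is None:
        weight = lambda pers: pers
    b_min, b_max, p_min, p_max = bounds
    db = (b_max - b_min) / resolution   # pixel width along the birth axis
    dp = (p_max - p_min) / resolution   # pixel height along the persistence axis
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    image = [[0.0] * resolution for _ in range(resolution)]
    for birth, death in diagram:
        pers = death - birth
        w = weight(pers)
        for i in range(resolution):
            y = p_min + (i + 0.5) * dp          # pixel-center persistence
            for j in range(resolution):
                x = b_min + (j + 0.5) * db      # pixel-center birth
                d2 = (x - birth) ** 2 + (y - pers) ** 2
                # Approximate the Gaussian's integral over the pixel by its
                # value at the pixel center times the pixel area.
                image[i][j] += w * norm * math.exp(-d2 / (2.0 * sigma ** 2)) * db * dp
    return image


# Rows index persistence, columns index birth.
img = persistence_image([(0.1, 0.8), (0.3, 0.4)], resolution=5, sigma=0.1)
```

With the linear weight, points on the diagonal (zero persistence) contribute nothing to the image. The pixel resolution and the weight function are the kinds of parameters the cross-validation step would tune.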

Current Pipeline

1. Generate persistence diagrams (PDs) from the graph data (not included here), using a file of HSA IDs.
   - Code/generate_PDs_as_strings.py
2. Generate test and fold indices for the PDs.
   - Code/generate_kfolds_indices.py
3. Run cross-validation.
   - Module: paths, parameter values, functions, and other useful information
     - Code/modules/cv_prep_vars.py
     - Most of the editing and the generic (algorithm-agnostic) information is here
   - Command line script: runs CV for a given year, outcome, pixel resolution, H dimension, scoring metric, and k/test percent information
     - Code/run_cv_cmd_line_script.py
   - Run this script through a job scheduler, because some algorithms can take days. See the "submit_scripts" directory, jobs_submit.sh (to easily submit and name jobs), and the eg_submit_cmd example file.
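The test and fold index step above can be sketched as follows. This is an illustrative sketch only, not the contents of Code/generate_kfolds_indices.py: the function names, the default `k`, the test fraction, and the seeding scheme are assumptions. It holds out a test set first, then splits the remaining indices into k folds, with a fixed seed so the split is reproducible.

```python
import random


def make_test_and_folds(n_samples, k=5, test_fraction=0.2, seed=0):
    """Split range(n_samples) into a held-out test set and k CV folds."""
    rng = random.Random(seed)           # local RNG keeps the split reproducible
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    test_idx = sorted(indices[:n_test])
    remaining = indices[n_test:]
    # Deal the remaining indices round-robin so fold sizes differ by at most 1.
    folds = [sorted(remaining[i::k]) for i in range(k)]
    return test_idx, folds


def cv_splits(folds):
    """Yield (train_indices, validation_indices) for each fold in turn."""
    for i, val in enumerate(folds):
        train = sorted(x for j, fold in enumerate(folds) if j != i for x in fold)
        yield train, val


test_idx, folds = make_test_and_folds(103, k=5, test_fraction=0.2, seed=0)
for train, val in cv_splits(folds):
    pass  # fit on `train`, score on `val`; `test_idx` stays untouched until the end
```

Keeping the test indices out of every fold means the final model evaluation uses data that played no role in parameter selection.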

Future Directions

1. Edit graph script:
   - Option to run before generating PDs
2. Generate PDs:
   - Fixes to this script, including making it a command line script
3. Generate Test and Fold indices
4. Generate CV Data:
   - Make more reproducible and faster
   - Finish testing modifications and replace original version
   - Add option to select an arbitrary pixel size
   - Add option to output data in a different directory
5. Select Best Parameters/Model(s)
   - Determine how to select the best models
6. Fit and Explore Best Model(s)
   - Fit best models and explore the results
7. Sensitivity Tests
   - Redo steps 4-6 with different outcome definitions and different pixel resolutions
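How to select the best models is still an open item in the list above. One common option, shown here only as a hedged sketch, is to rank each algorithm/parameter configuration by its mean score across folds. The function name `select_best`, the configuration labels, and the scores in the usage example are all made up for illustration and are not results from this project.

```python
from statistics import mean, stdev


def select_best(cv_scores, higher_is_better=True):
    """Pick the configuration with the best mean per-fold score.

    cv_scores: dict mapping a configuration label to a list of fold scores.
    Returns (best_label, summary), where summary maps label -> (mean, std).
    """
    summary = {
        cfg: (mean(scores), stdev(scores) if len(scores) > 1 else 0.0)
        for cfg, scores in cv_scores.items()
    }
    pick = max if higher_is_better else min
    best = pick(summary, key=lambda cfg: summary[cfg][0])
    return best, summary


# Made-up scores for two hypothetical configurations.
example_scores = {
    "config_a": [0.5, 0.6, 0.7],
    "config_b": [0.8, 0.7, 0.9],
}
best, summary = select_best(example_scores)
```

Set `higher_is_better=False` for error-type metrics. The per-configuration standard deviation is returned so that near-ties can be inspected rather than decided by the mean alone.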
