dtak/decision-points

Running DecisionPoints README

Input/Requirements:

DV0, the unaggregated data

Step 1:

Regenerate the aggregated data using Jiayu’s method and save that version (DV1, the aggregated data).

Link to running Notebook: Running of step1_engineering_features

Step 2:

Regenerate engineered features and save that version (DV2, engineered data).

Step 3:

Impute the data again and save that version (DV3, imputed data).

Remarks: We don't have ppo2 or o2.

Output: imputed_data_engineered_hypotensive_reordered_action_for_model.pickle, a file containing the engineered features of interest, imputed by forward fill, with four action labels.
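
As a rough illustration of the forward-fill imputation in Step 3, here is a minimal sketch. It assumes the engineered data is a pandas DataFrame with one row per patient per time step; the column names `pid`, `hour`, and `map` are hypothetical and not taken from the notebooks.

```python
import numpy as np
import pandas as pd

# Toy engineered-feature table; all column names are hypothetical.
df = pd.DataFrame({
    "pid":  [1, 1, 1, 2, 2, 2],
    "hour": [0, 1, 2, 0, 1, 2],
    "map":  [65.0, np.nan, 70.0, np.nan, 80.0, np.nan],
})

# Sort so that "forward" means forward in time within each patient.
df = df.sort_values(["pid", "hour"]).reset_index(drop=True)

feature_cols = ["map"]
# Forward fill within each patient so values never leak across patients.
df[feature_cols] = df.groupby("pid")[feature_cols].ffill()

# Leading gaps have no earlier value to copy, so they remain NaN here.
imputed = df
```

Grouping by patient before filling keeps the last value of one patient from being carried into the start of the next. How leading gaps are handled in the actual notebook is not stated here, so the sketch leaves them as NaN.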

Link to running Notebook: Running of step2+Step 3_engineering_features.ipynb

Step 4:

Run the models and save them.

Running of step4

Link to running Notebook: Testing of step4_just_kernel.ipynb

Link to running Notebook: Testing of step4_rnn_kernel.ipynb

Step 4.5:

Evaluate the models.

Link to running Notebook: Testing of step4.5_model_eval.ipynb

Step 5:

Run the decision points and save them with Uncertainty Labels and Decision Points status (DV4, imputed data + DP + UL).

  • Prerequisites: a kernel trained on the non-time-series data, an RNN trained on the time series, and a kernel trained on the RNN embeddings.

Kernel (non-time series) pipeline

For every patient:

  • States 0 through the second-to-last state are S1 (“all_states”).
  • States 1 through the last state are S2 (“cand_states”).
  • Run the decision point logic using just the kernel, giving you P: a matrix of decision points. If a row of P sums to more than one, that row is a decision point.
  • Pass decision points through the uncertainty label mapping.
  • Match DP and UL with the original time series for each patient and save.
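
The state split and the row-sum rule above can be sketched as follows. This is only an illustration under assumptions: the helper names `split_states` and `flag_decision_points` are hypothetical, and the matrix `P` is a hand-written placeholder, since the kernel-based decision point logic that produces it is not reproduced here.

```python
import numpy as np

def split_states(trajectory):
    """Return (S1, S2) for one patient's trajectory of shape (T, n_features)."""
    trajectory = np.asarray(trajectory)
    s1 = trajectory[:-1]  # states 0 .. T-2 ("all_states")
    s2 = trajectory[1:]   # states 1 .. T-1 ("cand_states")
    return s1, s2

def flag_decision_points(P):
    """Mark row i as a decision point when row i of P sums to more than one."""
    return np.asarray(P).sum(axis=1) > 1

# Toy trajectory with 4 time steps and 2 features.
traj = np.arange(8).reshape(4, 2)
s1, s2 = split_states(traj)

# Placeholder P with one row per state in S1. In the real pipeline P comes
# from the decision point logic, and its column layout may differ.
P = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1]])
is_dp = flag_decision_points(P)
```

With this placeholder, the second and third rows sum to two and are flagged, while the first row sums to one and is not.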

Notebook: Running of step5_kernel_dp.ipynb

Renamed final output csv file (just decision points): kernel_computed_decision_points_justdp_uncertainty_label_withPid.csv

Renamed final output csv file (all points, decision points and non-decision points alike): kernel_computed_decision_points_all_withPid.csv

RNN (time-series) pipeline

For every patient:

  • Get S1 and S2 the same way as in the kernel pipeline.
  • “Window” S1 and S2 separately.
  • Pass windowed S1 and S2 through the RNN embedder.
  • Run the decision point logic using the RNN kernel, giving you P.
  • Pass the decision points through the uncertainty label mapping.
  • Match DP and UL with the original time series for each patient and save.

Note: One weakness of this process is that we can't calculate decision points for the first 7 hourly time stamps of a trajectory. We window the data (chunk it with a sliding window) with size 8, so the first valid data point in each trajectory is the 8th: it is the first point that completes a full window.
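
To make the windowing constraint concrete, here is a minimal sketch using NumPy's `sliding_window_view`. The helper name `window_states` and the array layout are assumptions for illustration, not the code used in the notebooks.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

WINDOW = 8  # window size stated in the note above

def window_states(states, window=WINDOW):
    """Return windows of shape (T - window + 1, window, n_features).

    Window k covers rows k .. k + window - 1 of `states`.
    """
    states = np.asarray(states)
    if len(states) < window:
        # Trajectory too short to form even one full window.
        return np.empty((0, window, states.shape[1]), dtype=states.dtype)
    # sliding_window_view puts the window axis last, so move it to the middle.
    return sliding_window_view(states, window, axis=0).transpose(0, 2, 1)

# Toy trajectory with 10 time steps and 2 features.
states = np.arange(20).reshape(10, 2)
windows = window_states(states)

# The first window ends at row index 7, i.e. the 8th data point, so the
# first 7 time stamps get no window of their own.
first_valid_index = WINDOW - 1
```

A trajectory of length T therefore yields T - 7 windows, and trajectories shorter than 8 steps yield none.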

Link to testing Notebook: Testing of step4.5_model_eval.ipynb

Renamed final output csv file (just decision points): rnn_computed_decision_points_justdp_uncertainty_label_withPid.csv

Renamed final output csv file (all points, decision points and non-decision points alike): rnn_computed_decision_points_all_withPid

Binary Classification:

Kernel:

Renamed final output csv file (just decision points): binary_kernel_computed_decision_points_justdp_uncertainty_label_withPid.csv

Renamed final output csv file (all points, decision points and non-decision points alike): binary_kernel_computed_decision_points_all_withPid.csv

Plots:

Binary kernel tsne/umap source plots cluster.ipynb

RNN:

Renamed final output csv file (just decision points): binary_rnn_computed_decision_points_justdp_uncertainty_label_withPid.csv

Renamed final output csv file (all points, decision points and non-decision points alike): binary_rnn_computed_decision_points_all_withPid

Plots:

Binary RNN tsne/umap source plots cluster.ipynb

Step 6: Run the plots

Run UMAP and t-SNE, and save the resulting 2D representations (DV5a/b, reduced data).

  • Plot connected components in DV4, DV5a/b.
  • Plot the normalized average of the features (DV5) within each manually selected cluster (see Step 7). This is what the doctors (Leo) want to see.
  • Plot the average percentage of the top 20 nearest neighbors (DV5a/b).
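
One possible reading of the "normalized average of features" plot is sketched below: z-score each feature over the whole dataset, then average within each cluster. The choice of z-scoring and the column names (`cluster`, `map`, `lactate`) are assumptions for illustration; the notebooks may normalize differently.

```python
import pandas as pd

# Toy data: two features and a manually assigned cluster label per point.
# All column names here are hypothetical.
df = pd.DataFrame({
    "cluster": ["A", "A", "B", "B"],
    "map":     [60.0, 70.0, 80.0, 90.0],
    "lactate": [4.0, 2.0, 2.0, 0.0],
})
feature_cols = ["map", "lactate"]

# Z-score each feature over the whole dataset so features share a scale.
features = df[feature_cols]
z = (features - features.mean()) / features.std(ddof=0)

# Average the normalized features within each cluster: one row per cluster,
# one column per feature, ready for a bar plot or heatmap.
cluster_profile = z.groupby(df["cluster"]).mean()
```

Positive entries in `cluster_profile` mean the cluster sits above the dataset-wide average for that feature, negative entries below it, which makes clusters comparable across features with different units.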

Link to Source Plots: Link to Source Plots

Step 7: Cluster selection
