Hi Ed,
thank you very much for adding the process_mimic.py script :)
It all worked fairly painlessly, following your clear instructions (I used "counts") - and now I'm the very proud owner of 10000 synthetic EHRs - woohoo !!!
So I loaded samples, but I'm not sure how to interpret them?
>>> import numpy as np
>>> X = np.load('/home/ajay/PythonProjects/medgan-master/samples/samples.npy')
>>> X
array([[ 0.42479137, 0.38992843, 0.3843686 , ..., 0.48570082,
0.44278869, 0.4656629 ],
[ 0.28643027, 0.45749718, 0.23394403, ..., 0.47090551,
0.41072363, 0.43643555],
[ 0.29359645, 0.46955556, 0.22549649, ..., 0.48150307,
0.41780272, 0.45492986],
...,
[ 0.56480783, 0.66771448, 0.54325938, ..., 0.47483209,
0.43128845, 0.45304856],
[ 0.68514657, 0.79574692, 0.73424697, ..., 0.47857872,
0.43853614, 0.44970644],
[ 0.17376943, 0.19806506, 0.27509841, ..., 0.47925362,
0.44123808, 0.46058744]], dtype=float32)
>>> X.shape
(10000, 1071)
>>> synthetic_ehr = X[0,:]
>>> synthetic_ehr
array([ 0.42479137, 0.38992843, 0.3843686 , ..., 0.48570082,
0.44278869, 0.4656629 ], dtype=float32)
I just realized I'm not sure what synthetic_ehr actually is. Does it look right to you?
I thought it would be like a row of a table where the columns are the 1071 ICD-9 codes, and the values are the number of times each code appears in the patient's EHR? So the counts should be whole numbers, and would give some idea of co-morbidities? For example, cardiovascular and metabolic disorders would frequently co-occur?
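To make the whole-number question concrete, here is a minimal sketch of the check I have in mind. The helper name `summarize_samples` is my own (hypothetical), and the demo array is just the first few values printed above, not the full samples.npy:

```python
import numpy as np

def summarize_samples(X):
    """Report the value range and whether every entry is a whole number."""
    return {
        "min": float(X.min()),
        "max": float(X.max()),
        # True only if each value equals its rounded value (up to float tolerance)
        "all_whole_numbers": bool(np.allclose(X, np.round(X))),
    }

# Stand-in data: the first three values of the first two rows shown above.
demo = np.array([[0.42479137, 0.38992843, 0.3843686],
                 [0.28643027, 0.45749718, 0.23394403]], dtype=np.float32)
summary = summarize_samples(demo)
print(summary)
```

On the values visible in my output, the entries all lie strictly between 0 and 1, so they are clearly not whole-number counts, which is what confuses me.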
So would a correlation matrix across the codes be one way to analyze the samples?
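In case it helps to show what I mean, this is roughly how I would compute it: a sketch only, assuming rows are patients and columns are codes. The function name `code_correlations` is hypothetical, and the demo matrix is random stand-in data rather than the real samples:

```python
import numpy as np

def code_correlations(X):
    """Pearson correlation between columns (codes); rows are patients.

    Note: a column with zero variance would produce NaN entries.
    """
    return np.corrcoef(X, rowvar=False)

rng = np.random.default_rng(0)
# Random stand-in: 200 "patients" by 5 "codes" (not real medgan output).
demo = rng.random((200, 5)).astype(np.float32)
# Make columns 0 and 1 co-vary so one strong correlation is visible.
demo[:, 1] = demo[:, 0] + 0.01 * rng.random(200)

corr = code_correlations(demo)
print(corr.shape)
print(corr[0, 1])
```

For the real data I would pass X in place of demo, which would give a 1071 x 1071 matrix, and then look for pairs of codes with high off-diagonal values.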
Thanks very much 👍