I am trying to reproduce the results from your paper and have a few questions about the code and methodology:
- Zero replacement: In the "Compositional Feature Dropout" section of the paper, you mention that you add a small positive pseudo-count (the inverse of the library size) to each component before renormalising. Where in the repository is this implemented, and for a given dataset is the same pseudo-count used in the other two augmentation strategies?
- From the available code, it looks like all models are trained on the data transformed into proportions, without any further data transformation (e.g. CLR, ILR, etc.). Is that correct?
- To my understanding, in your implementation of compositional feature dropout you set randomly selected entries of the training examples to one, rather than to zero as described in the paper (see the definition of augment_X in train_and_evaluate.py). Why is that?
- Using the paper’s nomenclature, for task 7 (colorectal cancer data), do you apply any preprocessing beyond excluding features with zero standard deviation? Specifically, in task 7 do you use all 980 taxa as-is? Also, since task 7 includes paired samples from the same patients (two samples per patient), how do you account for within-subject correlation?
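To make my reading of the first three points concrete, here is a minimal sketch of what I currently understand the pipeline to do. All function names here are my own (hypothetical, not from the repository), and the `fill` value in the dropout step reflects the discrepancy I am asking about (the code appears to use 1, the paper says 0):

```python
import numpy as np

def zero_replace(counts):
    """My reading of the paper's zero replacement: add a pseudo-count of
    1/library_size to each component, then renormalise to proportions.
    Hypothetical helper, not from the repository."""
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=1, keepdims=True)      # per-sample library size
    shifted = counts + 1.0 / lib_size                 # small positive pseudo-count
    return shifted / shifted.sum(axis=1, keepdims=True)

def feature_dropout(X, rate, rng, fill=1.0):
    """Compositional feature dropout as I read augment_X: randomly selected
    entries are set to `fill` (1.0 in the code, 0.0 per the paper),
    then rows are renormalised. Hypothetical helper."""
    X = np.array(X, dtype=float)
    mask = rng.random(X.shape) < rate                 # entries to perturb
    X[mask] = fill
    return X / X.sum(axis=1, keepdims=True)

def clr(P):
    """Centred log-ratio transform, which the code does NOT seem to apply;
    included only to clarify what I mean by 'further data transformation'."""
    logP = np.log(P)
    return logP - logP.mean(axis=1, keepdims=True)
```

If this matches your intent, then the models see `feature_dropout(zero_replace(counts), ...)` directly as proportions, with no `clr` (or ILR) step in between. Please correct me where this sketch diverges from the actual implementation.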
Thanks in advance for your help!