This repository was archived by the owner on Jul 10, 2020. It is now read-only.
ToDo from feedback by He and Thiseas #2
Open
Description
He
- For the merge module, it may be better to have a checkpoint on the number of bed/bim/fam/janno files found in each package path. That may be the job of package maintenance, but it would be helpful in some cases. For example, when there is an extra ._PACKAGE.bim file, the file list will include it and cause an error in plink.
- For the convert module, would it be better to allow the output directory to be specified as well? My understanding is that the poseidon packages are maintained in the project folder and people can convert/merge them for their own use. Currently it only writes converted files to the input folder. You could make it an optional parameter: if a fourth parameter is given, write the files to that path; otherwise write them to the input path.
- Also for the convert module, why do you generate a .pedind file for the convertf run? I thought we could use the .fam file as the input individual file, but maybe I'm wrong.
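A rough sketch of how the first two suggestions could look in bash. The function names `count_package_files`, `check_package` and `resolve_outdir` are hypothetical and not taken from the existing scripts, and the check assumes that each package should contain exactly one file per extension.

```bash
#!/usr/bin/env bash

# Hypothetical helper: count the files with a given extension in a package
# directory, ignoring hidden files such as "._PACKAGE.bim".
count_package_files() {
  local dir="$1" ext="$2"
  find "$dir" -maxdepth 1 -type f -name "*.${ext}" ! -name '.*' | wc -l | tr -d ' '
}

# Hypothetical checkpoint: succeed only if the package directory holds
# exactly one bed, bim, fam and janno file (an assumption about the layout).
check_package() {
  local dir="$1" ext n status=0
  for ext in bed bim fam janno; do
    n=$(count_package_files "$dir" "$ext")
    if [ "$n" -ne 1 ]; then
      echo "ERROR: expected 1 .${ext} file in ${dir}, found ${n}" >&2
      status=1
    fi
  done
  return "$status"
}

# Hypothetical helper for the convert module: use the optional output
# directory if one is given, otherwise fall back to the input directory.
resolve_outdir() {
  local outdir="${2:-$1}"
  mkdir -p "$outdir" && printf '%s\n' "$outdir"
}
```

With something like this, the merge module could call `check_package` on every package path before building the file list, and the convert module could pass its input directory and optional fourth parameter to `resolve_outdir`.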
Thiseas
- No check for too many arguments. This could lead to odd results that "worked" when a user provides multiple input files, for example.
- The log folder could get messy if someone submits jobs in a loop, as multiple jobs will have the same timestamp. This could cause issues when jobs are submitted with sbatch. Perhaps add a hash to the end of the folder name? Or maybe deter people from looping by checking whether the log directory already exists and raising an error?
- _print_packages also prints commented-out lines and comments within lines. Bug or feature?
- poseidon_merge
  - line 101,146: No need to create an empty bash array. It is easier to use unset _janno_files. The current formulation will trick text editors into thinking _janno_files is a function definition. (minor thing)
  - _plink_merge, line 167: --keep-allele-order? I remember there was an issue with plink where it would sometimes flip alleles to Maj/Min rather than Ref/Alt, which is what we generally use. This could cause issues in downstream merging with Boston datasets in non-Poseidon format. Elina Salmela was the one who ran into these issues. Maybe she could explain how she solved it? Maybe it isn't exactly relevant and I'm misremembering.
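For the argument count and log folder points, here is a minimal sketch of what I have in mind. `require_args` and `make_logdir` are hypothetical names, and the timestamp format is only an assumption. The random suffix comes from `mktemp`, which stands in for the hash idea.

```bash
#!/usr/bin/env bash

# Hypothetical guard: fail unless exactly the expected number of
# arguments was passed. Usage: require_args <expected> "$@"
require_args() {
  local expected="$1"
  shift
  if [ "$#" -ne "$expected" ]; then
    echo "ERROR: expected ${expected} argument(s), got $#" >&2
    return 1
  fi
}

# Hypothetical log folder creation: timestamp plus a random suffix from
# mktemp, so two jobs started in the same second get different folders.
make_logdir() {
  local base="$1" stamp
  stamp=$(date +%Y%m%d_%H%M%S)
  mkdir -p "$base" && mktemp -d "${base}/${stamp}_XXXXXX"
}
```

A script could then start with `require_args 3 "$@" || exit 1` and create its log folder with `logdir=$(make_logdir "logs")`, which stays safe when jobs are submitted in a loop.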