Update artic pipeline to 1.8.4 #283
Currently updating poreCov to some adjusted syntax and thus a higher Nextflow version requirement (see #281). I will merge this into master this week. You might need a git rebase to avoid some merge conflicts later on.
@fischer-hub now in master
script:
def normalise_arg = normalise_threshold ? "--normalise ${normalise_threshold}" : '--normalise 0'
def normalise_arg = normalise_threshold ? "--normalise ${normalise_threshold}" : '--normalise 0' // why is the --normalise flag not part of the bash script ^^
Not sure if I understand your comment 🤔 Can you elaborate?
I would recommend adding some log.err logic since you changed the primer version names. Alternatively, still allow the old ones and convert them in the workflow (if this is possible).
You could copy the BED file from an ARTIC primer scheme and use it as custom input. The consensus sequences should then be identical between the poreCov run with the custom primer and the poreCov run with the same pre-defined primer.
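The comparison suggested here could be scripted roughly as follows. This is only a sketch: the helper names `read_fasta` and `same_consensus` are hypothetical, and it assumes each run writes its consensus as a plain FASTA file (the actual poreCov output paths are not given in this thread). Headers are ignored so that differing sample or primer labels do not cause a false mismatch.

```python
from pathlib import Path


def read_fasta(path):
    """Return {header: sequence} for a FASTA file, with sequences upper-cased and unwrapped."""
    records = {}
    header = None
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def same_consensus(fasta_a, fasta_b):
    """True if both files hold the same sequences, ignoring header names and line wrapping."""
    return sorted(read_fasta(fasta_a).values()) == sorted(read_fasta(fasta_b).values())
```

Calling `same_consensus()` with the consensus FASTA of the custom-BED run and that of the pre-defined primer run should then return `True` if the two runs agree.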
Some clarifying comments: The models for
Regarding the new primer version formats, after talking back to @MarieLataretu, the current behaviour is:
I also added a version format check, so there will be warnings if the format is neither the old nor the new one, and if a BED file without a reference fasta is provided. There are also warnings when using the old

I'll check if the output with these changes is reasonable, and then I think this would be ready to merge from my side :)
Thanks for the summary @fischer-hub! I think there are two general questions to discuss @replikation, @hoelzer : ad a): With the new ARTIC version and Clair3 it's at least easier to update the models. ad b):
I tested this container: |
As long as the container runs, I'm fine with it. Most of the nanozoo containers we use are because alternatives were not compatible with a cluster or cloud setting, or simply weren't working.
With newer artic releases, the usage of `medaka` and `nanopolish` is deprecated. Instead, `Clair3` is used by default. This PR updates the artic version and adjusts some code around the artic sub workflow to work with the new output. It also removes some of the old code related to `nanopolish`/`medaka` and adjusts the parameters, as well as throwing warnings when old flags are used anyway.

I also had to patch the main script of `CoVarPlot` since the new VCF output writes e.g. the allele frequency and read depth to the `FORMAT` and `SAMPLE` fields instead of `INFO`. I created a PR on the `CoVarPlots` repo, but until this is merged, we can get away with running the patched main script from the workflows `/bin` directory. Once we get a merge, we can rebuild the container and switch back to the original tool.

EDIT: this was merged already and Martin helped update the respective container :)
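To illustrate the VCF change behind the patch: a minimal sketch of reading per-sample values from the `FORMAT` and `SAMPLE` columns instead of `INFO`. It assumes the allele frequency and read depth are stored under the keys `AF` and `DP`; the function name `sample_fields` and the example record are made up for illustration and are not taken from the actual CoVarPlot patch.

```python
def sample_fields(vcf_line):
    """Map the FORMAT keys of one VCF data line to the values of its first SAMPLE column."""
    columns = vcf_line.rstrip("\n").split("\t")
    if len(columns) < 10:
        raise ValueError("expected at least 10 tab-separated columns (FORMAT plus one SAMPLE)")
    format_keys = columns[8].split(":")    # column 9: FORMAT, e.g. GT:GQ:DP:AD:AF
    sample_values = columns[9].split(":")  # column 10: first SAMPLE, same order as FORMAT
    return dict(zip(format_keys, sample_values))


# Hypothetical record; all values are invented for this example.
record = "MN908947.3\t241\t.\tC\tT\t30.5\tPASS\t.\tGT:GQ:DP:AD:AF\t1:30:412:5,407:0.9879"
fields = sample_fields(record)
allele_frequency = float(fields["AF"])
read_depth = int(fields["DP"])
print(allele_frequency, read_depth)
```

A tool that previously looked up these values in the `INFO` column would find nothing there for such a record, which is why the lookup has to move to the `FORMAT`/`SAMPLE` pair.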
Since the allele frequency is added to the VCF by default with `Clair3`, the `add_allel_frequencies()` process was renamed to `count_mixed_sites()`. It's only counting mixed sites in the VCF file now (from the correct VCF fields again).

Had to also rename some hardcoded paths in process `seqrs()`, and I just now saw some part of the artic pipeline is actually pulling primer schemes into 'data/external_primer_schemes/artic-sars-cov-2' and I think it's `artic minion`, will check what is happening there ^^

Everything is running fine with the test data that I have (some fastq files) but I feel like it's quite a lot of changes in this PR. What do you think is the best way to test this fully? Comparing the VCF output? It has quite a different structure, especially in the `INFO, FORMAT, SAMPLE` fields. I also couldn't test the custom BED file subworkflow since I don't have any custom BED files, haha. If you have some in mind that I can use for testing that would be great.

Lastly, I wasn't sure whether to remove the `--fast5` parameters completely. From my understanding, this only worked with `--nanopolish`, but maybe I'm missing something here - lmk what you think.

EDIT 2: Since I started on this, artic got a few more updates, with 1.8.4 now being the newest version. I created a new Seqera container for that version and had to adjust some more things, but this now seems good to merge!

- `artic_make_depth_mask` was integrated to run as part of `artic minion`, and running it manually afterwards results in duplicated lines in the `coverage_mask.depth` files. This is fixed by writing the output of both calls to `artic_make_depth_mask` to separate files.
- `artic_custom_bed` subworkflow: the entries were not naturally sorted (e.g. SARS-CoV-10_LEFT would come before SARS-CoV-2_LEFT). This is now fixed.

closes #280
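The difference between plain and natural sorting mentioned in the last item can be sketched in a few lines. This is an illustration only, not the code used in the `artic_custom_bed` subworkflow; the helper `natural_key` is a hypothetical name.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so that digit runs compare numerically."""
    return [int(chunk) if chunk.isdigit() else chunk for chunk in re.split(r"(\d+)", name)]


primers = ["SARS-CoV-10_LEFT", "SARS-CoV-2_LEFT", "SARS-CoV-1_LEFT"]

# Plain string sorting compares character by character, so "10" sorts before "2".
print(sorted(primers))

# Natural sorting compares the number as an integer: 1, 2, 10.
print(sorted(primers, key=natural_key))
```

Because `re.split` with a capturing group always alternates text and digit chunks, the keys of two names are compared text-to-text and number-to-number.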