GenotypeQuality

How the HaplotypeCaller's reference confidence model works:

https://software.broadinstitute.org/gatk/documentation/article.php?id=4042

GenotypeLikelihoods:

https://software.broadinstitute.org/gatk/documentation/article.php?id=4442
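
As a reminder of how the two concepts above connect, here is a minimal sketch, assuming the usual convention that PL values are Phred-scaled genotype likelihoods normalized so the most likely genotype has PL 0, and that GQ is the gap between the two lowest PLs, capped at 99. The function names and the example likelihoods are made up for illustration; this is not GATK code, and GATK's exact rounding may differ.

```python
import math


def phred_scaled_likelihoods(likelihoods):
    """Convert raw genotype likelihoods to normalized, rounded PL values.

    Assumption: PL = -10 * log10(likelihood), shifted so the best genotype is 0.
    """
    raw = [-10.0 * math.log10(l) for l in likelihoods]
    best = min(raw)
    return [round(pl - best) for pl in raw]


def genotype_quality(pls, cap=99):
    """GQ as the difference between the second-lowest and lowest PL, capped."""
    ordered = sorted(pls)
    return min(ordered[1] - ordered[0], cap)


# Hypothetical likelihoods for genotypes 0/0, 0/1, 1/1 at a biallelic site.
example_pls = phred_scaled_likelihoods([1e-5, 0.1, 1e-3])
example_gq = genotype_quality(example_pls)
```

In this toy case the heterozygous genotype is the most likely one, and the GQ reflects how far the next-best genotype (1/1) lags behind it.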

What is gVCF:

https://software.broadinstitute.org/gatk/documentation/article.php?id=4017

TODO:

check out the Avocado and Guacamole projects
check out GATK Spark functionality

AVOCADO: For highest accuracy, Avocado is run as a two-phase tool. In the first phase, we reassemble or realign our reads around INDEL variants. In the second phase, we apply a probabilistic model built around a biallelic model to the reads to identify variants.

Our approach does not rely on the input reads being sorted, and as such, is not unduly impacted by variations in coverage across the genome. This point is critical in a parallel approach, as coverage can vary dramatically across the genome.

We then use Apache Spark’s reduceByKey functionality to compute the number of times each variant was observed with high quality. We do this to discard sequence variants that were observed in a read but represent a sequencing error rather than a true variant. (Why don't they filter out such reads right away?) https://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-204.pdf [chapter 7]
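
To make the counting step concrete, here is a minimal sketch in plain Python that imitates what a reduceByKey over (variant, 1) pairs would compute. Everything in it is an assumption for illustration: the local reduce_by_key helper stands in for Spark's operation, the observation tuples are invented, and the two thresholds are arbitrary. It is not Avocado's actual implementation.

```python
from operator import add


def reduce_by_key(pairs, func):
    """Local stand-in for Spark's reduceByKey: merge the values of each key."""
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged


# Hypothetical per-read observations: (contig, position, ref, alt, base quality).
observations = [
    ("chr1", 100, "A", "G", 35),
    ("chr1", 100, "A", "G", 38),
    ("chr1", 100, "A", "G", 12),  # low quality, not counted
    ("chr1", 250, "C", "T", 40),  # seen only once, likely a sequencing error
    ("chr2", 77, "G", "GA", 33),
    ("chr2", 77, "G", "GA", 31),
]

MIN_QUALITY = 30      # assumed threshold for a "high quality" observation
MIN_OBSERVATIONS = 2  # assumed minimum support to keep a variant


def count_high_quality(obs, min_quality=MIN_QUALITY):
    """Count high-quality observations per variant key."""
    pairs = [((c, p, r, a), 1) for c, p, r, a, q in obs if q >= min_quality]
    return reduce_by_key(pairs, add)


def candidate_variants(obs, min_quality=MIN_QUALITY, min_obs=MIN_OBSERVATIONS):
    """Keep only variants with enough high-quality support."""
    counts = count_high_quality(obs, min_quality)
    return {variant: n for variant, n in counts.items() if n >= min_obs}


counts = count_high_quality(observations)
candidates = candidate_variants(observations)
```

One possible answer to the question above: the filter here acts on a variant's total support across all reads, so a read carrying one erroneous base can still contribute its other, valid observations instead of being discarded whole.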
