Skip to content

Genotype Quality (GQ) calculation #67

@Pasky222

Description

@Pasky222

Hi ! I've been looking for SNPs from EM-seq data and i have some questions about the GQ calculation.

In the SNVFromATCGmap.py script, the GQ calculation is the following :

def VCF_line (CHR, POS, REF, GN, DP, prob_pre, prob_nuc, FILTER = "PASS"):

...

        if REF in AllCases[GN]:  # no mutation
                GQ = 99  # log10 (0) be very large
            else:
                if (1 - prob_nuc - prob_pre) > 0:
                    GQ = int(- math.log10((1 - prob_nuc - prob_pre) / (1 - prob_nuc)) * 10)
                else:
                    GQ = 99

The issue is when you call the VCF_line function, prob_pre will always be equals to prob_nuc

if show_all:
                print("\t".join([chr, nuc, pos,
                                 "%d,%d,%d,%d"%(W_A, W_T, W_C, W_G), "%d,%d,%d,%d"%(C_A, C_T, C_C, C_G),
                                 NUC, "%.1e" % (1-prob_pre)]))
                #
                if vcffile:
                    VCF.write(VCF_line(chr, pos, nuc, NUC, COV, prob_nuc, prob_nuc) + "\n")
                #
            elif (NUC != "-" or NUC != "N"):
                #
                if (NUC in AllCases):
                    if (nuc not in AllCases[NUC]):
                        print ("\t".join([chr, nuc, pos,
                                        "%d,%d,%d,%d"%(W_A, W_T, W_C, W_G), "%d,%d,%d,%d"%(C_A, C_T, C_C, C_G),
                                         NUC, "%.1e" % (1-prob_pre)]))
                        #
                        if vcffile:
                            VCF.write(VCF_line(chr, pos, nuc, NUC, COV, prob_nuc, prob_nuc) + "\n")

Is there something i don't understand here ?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions