Question about extremely high Z-scores (Z = 100) in qpDstat results #122

@aaannaw

Dear Author,

I am currently using qpDstat from the Admixtools package to calculate D-statistics among several rodent species. However, I have noticed that many of my results show Z-scores = 100, and most tests have |Z| > 3. I would like to ask whether this pattern is expected or indicates a problem in my setup.

Here are some details about my data and parameters:

Reference genome size: ~2.3 Gb

Number of SNPs used: 278,518,514 (quite large)

Parameter file: blgsize: 0.01

Outgroup: a relatively distant species

Many tests were run for all possible quartets (W, X, Y, Z), not only for topologies consistent with the species tree.
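
To make "all possible quartets" concrete, here is a minimal Python sketch of the configurations tested for each trio when the outgroup is fixed. The species labels and the helper name `d_tests_with_fixed_outgroup` are hypothetical, and this is not how qpDstat itself enumerates tests. Each trio yields three D(W, X; Y, Z) arrangements, and only one of them pairs the two species that are sisters in the species tree; the other two would be expected to deviate from zero even without gene flow.

```python
from itertools import combinations


def d_tests_with_fixed_outgroup(ingroups, outgroup):
    """List every (W, X, Y, Z) arrangement for each ingroup trio.

    For a trio (a, b, c) there are three ways to choose which pair is
    treated as sisters; the outgroup always sits in position Z.
    """
    tests = []
    for a, b, c in combinations(ingroups, 3):
        tests.append((a, b, c, outgroup))  # assumes a and b are sisters
        tests.append((a, c, b, outgroup))  # assumes a and c are sisters
        tests.append((b, c, a, outgroup))  # assumes b and c are sisters
    return tests


# Hypothetical labels, for illustration only
species = ["sp1", "sp2", "sp3", "sp4"]
for test in d_tests_with_fixed_outgroup(species, "outgroup"):
    print(test)
```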

My questions are:

Is it normal to get Z = 100 for many tests, or does this indicate numerical saturation (e.g., SE(D) too small)?

Should I increase the block size (e.g., blgsize: 0.05 or larger) to avoid unrealistically small SE values?

Would it be more appropriate to limit the tests to quartets consistent with the species tree, instead of testing all possible combinations?

Could the high Z-scores result from using too distant an outgroup or from excessive divergence among species? In that case, would you recommend restricting the D-statistic tests within clades and choosing the nearest outgroup for each trio?
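
To illustrate the concern behind the first two questions, here is a minimal Python sketch of how Z = D / SE(D) depends on block size in a delete-one block jackknife. It is a simplification under stated assumptions, not the qpDstat implementation: blocks are equal-length runs of sites rather than genetic-distance windows, the jackknife is unweighted, and the per-site data are simulated toy values (the function names `block_jackknife` and `toy_linked_sites` are hypothetical).

```python
import math
import random


def block_jackknife(num, den, block_len):
    """Delete-one block jackknife for D = sum(num) / sum(den).

    Simplification: equal-length contiguous blocks, unweighted jackknife.
    Returns (D, SE, Z).
    """
    n_blocks = len(num) // block_len
    used = n_blocks * block_len  # drop a trailing partial block, if any
    tot_num = sum(num[:used])
    tot_den = sum(den[:used])
    d_hat = tot_num / tot_den

    # D recomputed with each block left out in turn
    loo = []
    for b in range(n_blocks):
        lo, hi = b * block_len, (b + 1) * block_len
        loo.append((tot_num - sum(num[lo:hi])) / (tot_den - sum(den[lo:hi])))

    mean_loo = sum(loo) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((x - mean_loo) ** 2 for x in loo)
    se = math.sqrt(var)
    return d_hat, se, d_hat / se


def toy_linked_sites(n_segments=100, segment_len=200, seed=1):
    """Toy per-site numerators: sites within a segment share a common term,
    mimicking linkage. The expected D is 0; denominators are all 1."""
    rng = random.Random(seed)
    num = []
    for _ in range(n_segments):
        shared = rng.gauss(0.0, 1.0)
        num.extend(shared + rng.gauss(0.0, 0.1) for _ in range(segment_len))
    return num, [1.0] * len(num)


num, den = toy_linked_sites()
for block_len in (10, 200):
    d_hat, se, z = block_jackknife(num, den, block_len)
    print(f"block_len={block_len:4d}  D={d_hat:+.4f}  SE={se:.4f}  Z={z:+.2f}")
```

In this toy setup, blocks shorter than the linked segments treat correlated sites as independent, so SE shrinks and |Z| grows even though D itself is unchanged. Whether the same effect explains the values reported above is exactly what is being asked.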

Any guidance on how to interpret these large Z-scores and how to adjust parameters or filtering strategies would be greatly appreciated.

Thank you very much for your time and for maintaining this excellent tool.

Best regards,
Na Wan
