Dear Author,
I am currently using qpDstat from the ADMIXTOOLS package to calculate D-statistics among several rodent species. However, I have noticed that many of my results show Z-scores of 100, and most tests have |Z| > 3. I would like to ask whether this pattern is expected or whether it indicates a problem in my setup.
Here are some details about my data and parameters:
Reference genome size: ~2.3 Gb
Number of SNPs used: 278,518,514 (quite large)
Parameter file: blgsize: 0.01
Outgroup: a relatively distant species
Tests run: all possible quartets (W, X, Y, Z), not only topologies consistent with the species tree.
My questions are:
1. Is it normal to get Z = 100 for many tests, or does this indicate numerical saturation (e.g., SE(D) too small)?
2. Should I increase the block size (e.g., blgsize: 0.05 or larger) to avoid unrealistically small SE values?
3. Would it be more appropriate to limit the tests to quartets consistent with the species tree, instead of testing all possible combinations?
4. Could the high Z-scores result from using too distant an outgroup or from excessive divergence among species? In that case, would you recommend restricting the D-statistic tests to within clades and choosing the nearest outgroup for each trio?
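To make the block-size question concrete, here is a minimal Python sketch of how I understand Z = D / SE(D) to depend on the jackknife blocks. It is not qpDstat's actual code: it uses a simple unweighted delete-one-block jackknife, and the function name `jackknife_d`, the simulated per-SNP terms, and the block lengths are all hypothetical choices made only for illustration.

```python
import numpy as np

def jackknife_d(num, den, block_ids):
    """Delete-one-block jackknife for D = sum(num) / sum(den).

    num, den  : per-SNP numerator and denominator terms (hypothetical inputs)
    block_ids : block label of each SNP
    Returns (D, SE, Z). Unweighted jackknife, for illustration only.
    """
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    _, inverse = np.unique(np.asarray(block_ids), return_inverse=True)
    n_blocks = inverse.max() + 1

    # Per-block sums of the numerator and denominator terms.
    num_b = np.bincount(inverse, weights=num, minlength=n_blocks)
    den_b = np.bincount(inverse, weights=den, minlength=n_blocks)

    d_full = num.sum() / den.sum()
    # Estimate of D with each block left out in turn.
    d_loo = (num.sum() - num_b) / (den.sum() - den_b)
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((d_loo - d_loo.mean()) ** 2))
    return d_full, se, d_full / se

# Simulated data (assumption): runs of 100 neighbouring SNPs share one value,
# a crude stand-in for linkage between nearby sites.
rng = np.random.default_rng(0)
n_snps, ld_run = 20_000, 100
num = np.repeat(rng.normal(0.0, 1.0, n_snps // ld_run), ld_run) * 0.01
den = np.ones(n_snps)
pos = np.arange(n_snps)

# Blocks shorter than the correlated runs vs. blocks spanning several runs.
d_small, se_small, z_small = jackknife_d(num, den, pos // 10)
d_large, se_large, z_large = jackknife_d(num, den, pos // 1000)
print(f"small blocks: SE={se_small:.2e}, Z={z_small:.2f}")
print(f"large blocks: SE={se_large:.2e}, Z={z_large:.2f}")
```

In this toy setting D is identical for both block sizes, but blocks shorter than the correlated runs give a smaller SE and therefore a larger |Z|. That is the effect I suspect might be inflating my results with blgsize: 0.01, although I do not know whether it explains values of exactly 100.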
Any guidance on how to interpret these large Z-scores and how to adjust parameters or filtering strategies would be greatly appreciated.
Thank you very much for your time and for maintaining this excellent tool.
Best regards,
Na Wan