Asteroid on non-randomly missing data #5

@LPDagallier

Hi Benoit,

Thanks for Asteroid, it looks like a very promising tool!
This is not an issue with the program, but more of a question.
From what I understand of the paper, Asteroid performs well even when a high proportion of the data is missing, as long as the missingness comes from a stochastic process: random data deletion in the simulated datasets, or data absence in the empirical datasets.

Do you have any idea how Asteroid performs when data are missing non-randomly?
For example, when a dataset combines a few species represented by many genes (e.g. a phylogenomic dataset) with many species represented by only a few genes (e.g. Sanger sequencing/barcode data); see e.g. https://doi.org/10.1093/molbev/msad109.

Did you try simulating missing data in a non-random manner?
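
To make the pattern concrete, here is a minimal sketch of the kind of structured missingness I have in mind, next to a uniformly random pattern with the same amount of missing data. This is only my own illustration: the function names, the matrix layout (species by genes, 1 = gene sampled) and the parameter values are all hypothetical, and none of it comes from Asteroid or from the paper.

```python
import random

def nonrandom_presence(n_species, n_genes, n_rich, n_markers, seed=0):
    """Species x genes 0/1 matrix with a structured missing-data pattern.

    Hypothetical setup: the first n_rich species have every gene
    (phylogenomic-style sampling); all other species only have the same
    n_markers genes (barcode-style sampling).
    """
    rng = random.Random(seed)
    markers = set(rng.sample(range(n_genes), n_markers))
    matrix = []
    for s in range(n_species):
        if s < n_rich:
            row = [1] * n_genes
        else:
            row = [1 if g in markers else 0 for g in range(n_genes)]
        matrix.append(row)
    return matrix

def random_presence(n_species, n_genes, n_present, seed=0):
    """Same shape, but the n_present sampled cells are chosen uniformly at random."""
    rng = random.Random(seed)
    cells = [(s, g) for s in range(n_species) for g in range(n_genes)]
    kept = set(rng.sample(cells, n_present))
    return [[1 if (s, g) in kept else 0 for g in range(n_genes)]
            for s in range(n_species)]

def missing_fraction(matrix):
    """Fraction of species/gene cells that are absent."""
    total = sum(len(row) for row in matrix)
    return 1 - sum(map(sum, matrix)) / total

# 10 gene-rich species plus 90 species with 5 marker genes, over 500 genes.
structured = nonrandom_presence(100, 500, n_rich=10, n_markers=5)
n_present = sum(map(sum, structured))
uniform = random_presence(100, 500, n_present)
```

Both matrices have the same overall proportion of missing cells (about 89% with these made-up numbers), but in the structured one the gene-poor species only ever overlap on the marker genes, which is the situation I am asking about.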

I'm curious to know whether Asteroid would perform similarly well with high levels of non-random missing data.

Thanks,
Léo-Paul
