Add tree sequence initialization flag for "compute mutation parents" #2757

@daikitag

Description

When I construct a tree sequence from a mutation table, violations of the following mutation requirements (https://tskit.dev/tskit/docs/stable/data-model.html#sec-mutation-requirements) are detected as errors:

  • time must either be UNKNOWN_TIME (a NAN value which indicates the time is unknown) or be a finite value which is greater or equal to the mutation node’s time, less than the node above the mutation’s time and equal to or less than the time of the parent mutation if this mutation has one. If one mutation on a site has UNKNOWN_TIME then all mutations at that site must, as a mixture of known and unknown is not valid.
  • Mutations must be sorted by site ID

However, code such as

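import tskit

ts = tskit.Tree.generate_comb(4, span=10).tree_sequence
tables = ts.dump_tables()
# One site at position 0 with ancestral state "A"
tables.sites.add_row(0, "A")
# Mutation 0 (node 3) is added before mutation 1 (node 5)
tables.mutations.add_row(site=0, node=3, derived_state="T")
tables.mutations.add_row(site=0, node=5, derived_state="T")
ts = tables.tree_sequence()  # no error is raised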

(mutation ID 0, on node 3, is more recent than mutation ID 1, on node 5, so a mutation is listed before the mutation above it in the tree)

and

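import tskit

ts = tskit.Tree.generate_comb(4, span=10).tree_sequence
tables = ts.dump_tables()
# One site at position 0 with ancestral state "A"
tables.sites.add_row(0, "A")
# Two mutations on the same node at the same site; parent is left at its default
tables.mutations.add_row(site=0, node=0, derived_state="T")
tables.mutations.add_row(site=0, node=0, derived_state="G")
ts = tables.tree_sequence()  # no error is raised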

(mutation ID 1 should have mutation ID 0 as its parent, but ts.mutation(1).parent is -1 instead of 0)

is not detected as an error, even though both examples violate the mutation requirements for tree sequence data. I would appreciate any insights on this.
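
To make the two cases concrete, here is a minimal pure-Python sketch of the check I have in mind. It is not tskit's implementation: it works out which mutation each row would be expected to name as its parent, and flags rows that are listed before that parent. The `COMB_PARENT` map is a hand-written assumption about the topology of `tskit.Tree.generate_comb(4)` (leaves 0 to 3, internal nodes 4 to 6), and the function names `expected_mutation_parents` and `listed_before_parent` are hypothetical.

```python
NULL = -1

# Assumed topology of generate_comb(4), written out by hand:
# child node -> parent node, NULL for the root.
COMB_PARENT = {0: 6, 1: 5, 2: 4, 3: 4, 4: 5, 5: 6, 6: NULL}


def expected_mutation_parents(mutations, parent_of):
    """Return the expected parent mutation ID for each (site, node) row.

    `mutations` is a list of (site, node) pairs in table order. The parent
    is the nearest mutation at the same site found by walking from the
    mutation's node towards the root; on the mutation's own node only
    earlier rows are considered.
    """
    expected = []
    for mut_id, (site, node) in enumerate(mutations):
        found = NULL
        u = node
        while found == NULL and u != NULL:
            # On the mutation's own node, only earlier rows can be parents.
            limit = mut_id if u == node else len(mutations)
            on_u = [i for i in range(limit) if mutations[i] == (site, u)]
            if on_u:
                # Take the last row on this node (assumed most recent).
                found = on_u[-1]
            u = parent_of[u]
        expected.append(found)
    return expected


def listed_before_parent(mutations, parent_of):
    """Return IDs of mutations that appear in the table before their parent."""
    expected = expected_mutation_parents(mutations, parent_of)
    return [i for i, p in enumerate(expected) if p > i]


# First example: mutations on node 3, then node 5, at site 0.
first = [(0, 3), (0, 5)]
print(expected_mutation_parents(first, COMB_PARENT))  # [1, -1]
print(listed_before_parent(first, COMB_PARENT))       # [0]

# Second example: two mutations on node 0 at site 0.
second = [(0, 0), (0, 0)]
print(expected_mutation_parents(second, COMB_PARENT))  # [-1, 0]
```

Under these assumptions, the first example has a child mutation listed before its parent, and the second has an expected parent column of [-1, 0] where the stored one is [-1, -1]. tskit's `TableCollection.compute_mutation_parents()` derives the parent column from the topology in a similar spirit.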
