Closed
When I construct a tree sequence from a mutation table, violations of the following mutation requirements (https://tskit.dev/tskit/docs/stable/data-model.html#sec-mutation-requirements) are detected as errors:
- time must either be UNKNOWN_TIME (a NAN value which indicates the time is unknown) or be a finite value which is greater than or equal to the mutation node’s time, less than the node above the mutation’s time and equal to or less than the time of the parent mutation if this mutation has one. If one mutation on a site has UNKNOWN_TIME then all mutations at that site must, as a mixture of known and unknown is not valid.
- mutations must be sorted by site ID
However, snippets such as
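import tskit

# Four-leaf comb tree; node 5 is an ancestor of node 3.
ts = tskit.Tree.generate_comb(4, span=10).tree_sequence
tables = ts.dump_tables()
tables.sites.add_row(position=0, ancestral_state="A")
# Two mutations at site 0; the one on the descendant node is listed first.
tables.mutations.add_row(site=0, node=3, derived_state="T")
tables.mutations.add_row(site=0, node=5, derived_state="T")
ts = tables.tree_sequence()  # no error is raised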
(Mutation ID 0 is more recent than mutation ID 1, even though it comes first in the table)
and
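import tskit

ts = tskit.Tree.generate_comb(4, span=10).tree_sequence
tables = ts.dump_tables()
tables.sites.add_row(position=0, ancestral_state="A")
# Two mutations on the same node; the parent column is left at its default of -1.
tables.mutations.add_row(site=0, node=0, derived_state="T")
tables.mutations.add_row(site=0, node=0, derived_state="G")
ts = tables.tree_sequence()  # no error is raised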
(Mutation ID 1 should have mutation ID 0 as its parent, but ts.mutation(1) reports parent=-1 instead of parent=0)
are not detected as errors, even though they violate the mutation requirements for tree sequence data. I would appreciate any insight into why this happens.
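To make the expected behaviour concrete, here is a minimal plain-Python sketch of the check I had in mind. It does not use tskit and is not how tskit validates tables. `expected_mutation_parents` and `strict_ancestors` are hypothetical helpers, and `COMB_PARENT` is a hand-written parent map that I am assuming matches the four-leaf comb tree above (node 5 is an ancestor of node 3).

```python
NULL = -1  # same sentinel value as tskit.NULL

# Assumed parent map for a four-leaf comb tree: 6 is the root,
# 5 is the parent of leaf 1 and node 4, and 4 is the parent of leaves 2 and 3.
COMB_PARENT = {0: 6, 1: 5, 2: 4, 3: 4, 4: 5, 5: 6, 6: NULL}


def strict_ancestors(tree_parent, node):
    """Return the nodes strictly above `node`, nearest first."""
    result = []
    node = tree_parent[node]
    while node != NULL:
        result.append(node)
        node = tree_parent[node]
    return result


def expected_mutation_parents(tree_parent, mutation_nodes):
    """Return the expected parent mutation ID for each mutation at ONE site.

    `mutation_nodes` lists the node of each mutation in table order.
    Raises ValueError if a mutation is listed before another mutation
    that sits on one of its ancestor nodes (child listed before parent).
    """
    parents = []
    last_on_node = {}  # node -> ID of the most recent mutation seen on it
    for mut_id, node in enumerate(mutation_nodes):
        above = strict_ancestors(tree_parent, node)
        # Ordering check: no later row may sit on an ancestor node.
        for later_id in range(mut_id + 1, len(mutation_nodes)):
            if mutation_nodes[later_id] in above:
                raise ValueError(
                    f"mutation {mut_id} is listed before mutation {later_id}, "
                    "which lies above it in the tree"
                )
        # The parent is the nearest earlier mutation on this node or above it.
        parent = NULL
        for candidate in [node] + above:
            if candidate in last_on_node:
                parent = last_on_node[candidate]
                break
        parents.append(parent)
        last_on_node[node] = mut_id
    return parents
```

With this sketch, `expected_mutation_parents(COMB_PARENT, [0, 0])` returns `[-1, 0]`, which is the parent column I expected for the second snippet, and `expected_mutation_parents(COMB_PARENT, [3, 5])` raises `ValueError` for the ordering in the first snippet. As far as I understand, `tables.compute_mutation_parents()` is the tskit method that fills in the parent column from the tree topology, but I have not confirmed how it treats the first case.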