Description
@tivdnbos proposed propagating peptide counts from taxa higher in the NCBI taxonomy tree down to their descendants, in order to increase the identification confidence of taxa at lower ranks.
An example is appropriate here:
The root of the tree is annotated (directly or indirectly) with 27 peptides. On the next level, only 9 peptides are left in total. On the third level, only 3 peptides are left in total, and looking at that level alone, all three taxa (E, F and G) are equally probable. We know, however, from the previous level (2) that B is annotated with 6 / 9 ≈ 67% of the peptides, and thus that E on level 3 probably occurs more often than F and G.
The approach proposed here is to propagate information from higher levels in the NCBI taxonomy to lower levels and in this way increase identification confidence. How exactly this propagation should work is not yet settled.
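One possible way to read this idea, as a minimal sketch rather than a settled design: give each node a score equal to its parent's score multiplied by the node's share of the peptides among its siblings. Everything in the block below is an assumption made for illustration. The issue only states that B holds 6 of the 9 peptides on level 2 and that E, F and G hold one peptide each on level 3, so the nodes C and D, their counts (2 and 1), the parent-child links, and the function name `propagate` are all hypothetical.

```python
from fractions import Fraction

# Hypothetical tree: A is the root; B, C, D are on level 2; E, F, G on level 3.
# Only B's count (6 of 9) and the level-3 counts (1 each) come from the issue;
# C, D, their counts, and the parent-child links are assumed for illustration.
children = {"A": ["B", "C", "D"], "B": ["E"], "C": ["F"], "D": ["G"]}
counts = {"B": 6, "C": 2, "D": 1, "E": 1, "F": 1, "G": 1}


def propagate(root, children, counts):
    """Score each node as parent score times its share among its siblings."""
    scores = {root: Fraction(1)}
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        total = sum(counts[kid] for kid in kids)
        for kid in kids:
            if total == 0:
                # No peptides below this node: nothing to distribute.
                scores[kid] = Fraction(0)
            else:
                scores[kid] = scores[node] * Fraction(counts[kid], total)
            stack.append(kid)
    return scores


scores = propagate("A", children, counts)
for taxon in ("E", "F", "G"):
    print(taxon, scores[taxon])
```

Under these assumed counts, E ends up with a score of 2/3 while F and G get 2/9 and 1/9, so the level-2 information breaks the tie that exists when level 3 is considered on its own. Other weighting schemes are equally conceivable; this one was chosen only because the level-3 scores still sum to 1.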
