Ignore less reliable HMMer domain results (noted with '?') in reconstructing alignment #7

@dustine32

Description

Attempting to graft this sequence:

>Cyanophora_paradoxa_CPAR027107_Apc11
QKTLTILAKDRNYKVEDFKAAGAIAKTRLDQQREPCSCKVAASDAHPCVRRVLFLNLSAA
VGAREPRLGARRAPALRSMKVKIVWHAVASWTWNVDDEACGICRNAYDGCCPDCKTPGDD
CPLWGECRHAFHLHCILKWVNSQQEGKQHCPMCRRDWKFRSSD

...onto the PANTHER 15.0 library, TreeGrafter outputs this error:

ERROR MSF of Cyanophora_paradoxa_CPAR027107_Apc11 should have length 90, actual length is 203

After some debugging, it appears that the treeGrafter.pl script parses the hmmscan output for the top hit to PTHR11210 incorrectly. As a result, the query sequence alignment that TreeGrafter reconstructs does not match the alignment length of the PTHR11210 family PIR file:

[image]

Specifically, the script recognizes that this hit has two domains and uses that count when iterating through the start/end alignment values. However, the regex /!/ used to parse out those start/end values skips the first domain, which hmmscan marks with ?, so the wrong values are used. We'll need to debug further to figure out how to line all of these parts up correctly.
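
One possible way to keep the domain count and the start/end values in step is to parse the reliability marker together with the coordinates, then filter on it once. The Python sketch below only illustrates that idea; it is not the treeGrafter.pl code. The names `Domain`, `parse_domain_rows`, and `reliable_domains` are hypothetical, and the two sample rows are invented values laid out like an hmmscan per-domain table, not the real output for this query.

```python
import re
from typing import List, NamedTuple


class Domain(NamedTuple):
    index: int       # domain number within the hit
    reliable: bool   # True for '!', False for '?'
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


# A per-domain row starts with the domain number, then '!' or '?'.
# Header and separator lines do not match and are skipped.
ROW = re.compile(r"^\s*(\d+)\s+([!?])\s+(.*)$")

# Invented rows for illustration only (NOT the real output for this query).
SAMPLE = """\
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ?    2.1   0.3      0.02        15      12      30 ..      40      58 ..      38      60 .. 0.80
   2 !   95.4   1.2   1.1e-30   8.3e-28       1      90 [.     100     160 ..      98     163 .. 0.97
"""


def parse_domain_rows(lines: List[str]) -> List[Domain]:
    """Parse every domain row, keeping its '!'/'?' marker with its coordinates."""
    domains = []
    for line in lines:
        match = ROW.match(line)
        if match is None:
            continue
        # Remaining fields: score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to,
        # hmm bounds, alifrom, ali to, ali bounds, envfrom, env to, env bounds, acc
        fields = match.group(3).split()
        domains.append(
            Domain(
                index=int(match.group(1)),
                reliable=match.group(2) == "!",
                hmm_from=int(fields[4]),
                hmm_to=int(fields[5]),
                ali_from=int(fields[7]),
                ali_to=int(fields[8]),
            )
        )
    return domains


def reliable_domains(domains: List[Domain]) -> List[Domain]:
    """Drop '?' domains so the count and the coordinates come from the same list."""
    return [d for d in domains if d.reliable]


all_domains = parse_domain_rows(SAMPLE.splitlines())
kept = reliable_domains(all_domains)
```

Because the count is taken from the filtered list (`len(kept)`) rather than from the total number of domains reported for the hit, the loop over start/end values cannot run past, or misalign with, the rows that were actually parsed.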
