
Cannot use custom GFF3 file for annotations #167

@rickymagner

Hi, I wanted to experiment with SpliceAI a bit and was wondering how I can use an updated version of GENCODE (e.g. v49). I tried passing the GFF3 file directly to the -A flag, but got this error:

ERROR:root:Gene annotation file ../Exome/gencode.v49.annotation.gff3.gz not formatted properly: Error tokenizing data. C error: Expected 1 fields in line 8, saw 9

It looks like the tool hardcodes some assumptions about the header, which break in GENCODE v49 (and maybe earlier versions). Is there a way I can reformat the file so it works here, or could the tool be extended to handle this case? If the tool depends on a GFF3 parser, maybe updating that dependency would help?
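
For reference, "Error tokenizing data. C error" is a pandas parser message, which suggests the file is being read as a plain delimited table rather than parsed as GFF3. A minimal sketch of a reformatting step follows. It assumes the target is a tab-delimited table shaped like the bundled spliceai/annotations/grch38.txt template (columns #NAME, CHROM, STRAND, TX_START, TX_END, EXON_START, EXON_END, with comma-terminated exon lists). The 0-based starts, the stripped "chr" prefix, and the use of the Ensembl_canonical tag to pick one transcript per gene are all assumptions to check against that template, and the function names are hypothetical.

```python
import csv
import gzip
from collections import defaultdict

# Assumed column layout, modelled on the bundled grch38.txt template.
COLUMNS = ["#NAME", "CHROM", "STRAND", "TX_START", "TX_END", "EXON_START", "EXON_END"]


def parse_attributes(field):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def gff3_to_table(lines):
    """Yield one table row per canonical transcript from an iterable of GFF3 lines."""
    transcripts = {}           # transcript ID -> (gene name, chrom, strand, start, end)
    exons = defaultdict(list)  # transcript ID -> [(start, end), ...]
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip '##' directives, comments, and blank lines
        chrom, _, feature, start, end, _, strand, _, attrs = line.rstrip("\n").split("\t")
        attrs = parse_attributes(attrs)
        if feature == "transcript":
            # Assumption: one transcript per gene carries the 'Ensembl_canonical' tag.
            if "Ensembl_canonical" in attrs.get("tag", "").split(","):
                transcripts[attrs["ID"]] = (
                    attrs["gene_name"], chrom, strand, int(start), int(end))
        elif feature == "exon":
            exons[attrs["Parent"]].append((int(start), int(end)))
    for tx_id, (name, chrom, strand, start, end) in transcripts.items():
        spans = sorted(exons[tx_id])
        yield [
            name,
            chrom[3:] if chrom.startswith("chr") else chrom,  # assumed: no 'chr' prefix
            strand,
            start - 1,  # assumed: 0-based starts, GFF3 is 1-based
            end,
            "".join(f"{s - 1}," for s, _ in spans),
            "".join(f"{e}," for _, e in spans),
        ]


def convert(gff3_path, out_path):
    """Write the converted table for a plain or gzipped GFF3 file."""
    opener = gzip.open if gff3_path.endswith(".gz") else open
    with opener(gff3_path, "rt") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerow(COLUMNS)
        writer.writerows(gff3_to_table(src))
```

Under those assumptions, something like `convert("gencode.v49.annotation.gff3.gz", "gencode.v49.txt")` would produce a file that could then be passed to -A. Genes without a tagged canonical transcript are dropped by this sketch.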

P.S. I also tried running the tool with updated package versions and -A grch38, and hit a NumPy error: np.fromstring is being used, but it was deprecated many versions ago and should be replaced with np.frombuffer. This finally started raising an error in NumPy >= 2.3.0, so it might be worth replacing this call.
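
A minimal sketch of that replacement, assuming the existing call has the shape `np.fromstring(seq, np.int8)` with `seq` being a Python str of single-byte characters (the helper name is hypothetical). Unlike the old binary mode of np.fromstring, np.frombuffer needs a bytes-like object, so a str has to be encoded first.

```python
import numpy as np


def decode_bytes(seq):
    """Hypothetical stand-in for a call like np.fromstring(seq, np.int8)."""
    # np.frombuffer only accepts bytes-like input, so encode str first.
    # latin-1 maps each code point below 256 to exactly one byte.
    raw = seq.encode("latin-1") if isinstance(seq, str) else seq
    return np.frombuffer(raw, dtype=np.int8)
```

One difference to keep in mind: np.frombuffer returns a read-only view over the bytes, so a call site that modifies the array in place would need a `.copy()`.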
