Description
Hi, I wanted to experiment with SpliceAI and was wondering how I can use a newer GENCODE release (e.g. v49). I tried passing the GFF3 file directly to the -A flag, but got this error:
ERROR:root:Gene annotation file ../Exome/gencode.v49.annotation.gff3.gz not formatted properly: Error tokenizing data. C error: Expected 1 fields in line 8, saw 9
It looks like the tool hardcodes some assumptions about the header, which break with GENCODE v49 (and maybe earlier versions). Is there a way I can reformat the file so it works here, or could the tool be extended to handle this case? If it depends on a GFF3 parser, maybe updating that dependency would help?
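For reference, here is a minimal sketch of the kind of reformatting I have in mind. It rests on assumptions I have not verified against the tool's loader: that -A accepts a tab-separated table with the columns #NAME, CHROM, STRAND, TX_START, TX_END, EXON_START, EXON_END (as in the annotation tables bundled with the package), that start coordinates are 0-based, and that exon lists are comma-separated with a trailing comma. It also assumes one transcript per gene, chosen by the GENCODE `Ensembl_canonical` tag. The function names `gff3_to_rows` and `write_table` are hypothetical.

```python
import csv

# ASSUMED column layout; check it against the tables shipped with the tool.
HEADER = ["#NAME", "CHROM", "STRAND", "TX_START", "TX_END", "EXON_START", "EXON_END"]


def parse_attributes(field):
    """Turn a GFF3 attribute column ('key=value;key=value') into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def gff3_to_rows(lines, tag="Ensembl_canonical"):
    """Collect one row per transcript carrying `tag` (hypothetical helper)."""
    transcripts = {}  # transcript ID -> (gene_name, chrom, strand, start, end)
    exons = {}        # transcript ID -> list of (start, end), 1-based inclusive
    for line in lines:
        # Skip the '#' header lines that the tabular reader tripped over.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue
        chrom, _, feature, start, end, _, strand, _, raw_attrs = fields
        attrs = parse_attributes(raw_attrs)
        if feature == "transcript" and tag in attrs.get("tag", "").split(","):
            transcripts[attrs["ID"]] = (
                attrs["gene_name"], chrom, strand, int(start), int(end))
        elif feature == "exon":
            exons.setdefault(attrs["Parent"], []).append((int(start), int(end)))

    rows = []
    for tx_id, (name, chrom, strand, start, end) in transcripts.items():
        tx_exons = sorted(exons.get(tx_id, []))
        if not tx_exons:
            continue
        # ASSUMPTION: 0-based starts, 1-based ends, trailing comma in each list.
        exon_starts = "".join(f"{s - 1}," for s, _ in tx_exons)
        exon_ends = "".join(f"{e}," for _, e in tx_exons)
        rows.append([name, chrom, strand, start - 1, end, exon_starts, exon_ends])
    return rows


def write_table(rows, handle):
    """Write the rows as a tab-separated table with the assumed header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
```

To use it, the GFF3 could be read with `gzip.open(path, "rt")`, passed to `gff3_to_rows`, and the result written to a text file with `write_table`; that file would then be given to -A. Chromosome names are passed through unchanged, so any "chr" prefix handling is left to the tool.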
P.S. I also tried running the tool with updated package versions and -A grch38, and hit a NumPy error: np.fromstring is being used, but it was deprecated many versions ago and should be replaced with np.frombuffer. It finally raises an error in NumPy >= 2.3.0, so it might be worth replacing this call.
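A minimal sketch of the replacement, assuming the existing call decodes a Python str of single-byte characters into an int8 array (I have not checked every call site; the helper name `str_to_int8` is hypothetical):

```python
import numpy as np


def str_to_int8(seq: str) -> np.ndarray:
    """Decode a string of single-byte characters into an int8 array."""
    # Old form that now fails: np.fromstring(seq, np.int8)
    # np.frombuffer needs a bytes-like object, so the str is encoded first.
    # ASSUMPTION: every character fits in one byte (latin-1).
    return np.frombuffer(seq.encode("latin-1"), dtype=np.int8)
```

One difference to keep in mind: np.frombuffer returns a read-only view of the buffer, so code that modifies the array in place would need a `.copy()` first. Using the result only as an index, or in an expression that builds a new array, should be unaffected.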