Hi there,
I am trying to test scikit-ribo on some yeast ribo-seq data we have just generated.
When running scikit-ribo-build.py I get the following error:
  File "/home/drew/anaconda3/envs/scikit-ribo/lib/python3.6/site-packages/gffutils/interface.py", line 227, in __getitem__
    raise FeatureNotFoundError(key)
gffutils.exceptions.FeatureNotFoundError: YDR106W_BY4741
I am using the yeast BY4741 Toronto genome and a subset of the associated gff file, which I converted to gtf using gffread. The subset includes only coding genes (and excludes some other transcripts annotated as Dubious). Subsetting seemed to get around two other errors I was hitting with the full gff, or with the gtf made from it. These were either
pandas.errors.ParserError: Too many columns specified: expected 9 and found 1.
when using a gff, or
KeyError: 'gene_id'
when using a gtf that included noncoding genes (the problem here is that the "gene_id" attribute is missing for these kinds of genes or for exon annotations).
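To pin down which records trigger these two errors, a check along these lines can help. It is only a minimal sketch using the Python standard library, not anything taken from scikit-ribo: the function name `scan_gtf` and the sample records are hypothetical, and it assumes the usual 9-column, tab-separated layout with `key "value";` attributes.

```python
import re

# Matches GTF-style attributes such as: gene_id "YDR106W_BY4741";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')

def scan_gtf(lines):
    """Return (bad_columns, missing_gene_id) as lists of 1-based line numbers.

    bad_columns: records that do not have exactly 9 tab-separated fields.
    missing_gene_id: 9-column records whose attribute field lacks gene_id.
    """
    bad_columns, missing_gene_id = [], []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) != 9:
            bad_columns.append(lineno)
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" not in attrs:
            missing_gene_id.append(lineno)
    return bad_columns, missing_gene_id

# Hypothetical sample records, for illustration only.
sample = [
    "##gff-version 3",
    'chrIV\tsrc\tCDS\t1\t90\t.\t+\t0\ttranscript_id "t1"; gene_id "YDR106W_BY4741";',
    'chrIV\tsrc\texon\t1\t90\t.\t+\t.\ttranscript_id "t2";',
    "a line without tabs",
]
print(scan_gtf(sample))
```

On a real file, passing `open("annotation.gtf")` (a placeholder path) instead of `sample` would list the line numbers worth inspecting by hand.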
However I ran into the problem described above.
I am not sure why gffutils is not finding the genes by these names (e.g. YDR106W_BY4741), since this is exactly how they are named in the gtf file; the geneNames list in gtf_preprocess.py is built from the gtf itself.
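One thing that might be worth checking is where exactly a name like YDR106W_BY4741 appears in the gtf: on which feature types, and under which attribute key. The sketch below is a rough standard-library check under stated assumptions, not a description of how scikit-ribo or gffutils resolves names. The function name `locate_name`, the sample records, and the `_mRNA` transcript ID are all hypothetical.

```python
import re
from collections import Counter

# Matches GTF-style attributes such as: gene_id "YDR106W_BY4741";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')

def locate_name(lines, name):
    """Count (feature_type, attribute_key) pairs whose value equals name."""
    hits = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(fields) != 9:
            continue  # ignore comments and malformed records
        for key, value in ATTR_RE.findall(fields[8]):
            if value == name:
                hits[(fields[2], key)] += 1
    return hits

# Hypothetical sample records, for illustration only.
sample = [
    'chrIV\tsrc\ttranscript\t1\t90\t.\t+\t.\t'
    'transcript_id "YDR106W_BY4741_mRNA"; gene_id "YDR106W_BY4741";',
    'chrIV\tsrc\tCDS\t1\t90\t.\t+\t0\t'
    'transcript_id "YDR106W_BY4741_mRNA"; gene_id "YDR106W_BY4741";',
]
hits = locate_name(sample, "YDR106W_BY4741")
print(dict(hits))
```

In this made-up sample the name only ever shows up as a `gene_id` on transcript and CDS records, and there is no record of type `gene` carrying it. Whether that matters depends on how the database is built, which I have not verified.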
Your help would be much appreciated!! Thanks.