Hi,
I have struggled with installing Mikado, but I want to ask a question about input data.
I first annotated my genome with Tiberius, and then used that output as input for annotation with EviAnn, together with protein, transcript, and RNA-seq data. I was recommended to use Mikado to polish the annotation.
I mapped all the reads, merged the BAM files, and then used Portcullis to polish the boundaries. I will also input the Tiberius and EviAnn GFFs, but I wanted an additional GFF based on the NCBI annotation of another individual of the same species. I checked that GFF, and it includes annotated pseudogenes. I tried Liftoff annotation transfer + polishing using the GFF with and without pseudogenes, and I got more genes/exons/CDS using the GFF without pseudogenes, which kind of makes sense.
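For context, a pseudogene-free GFF can be produced along these lines. This is only a minimal sketch, not necessarily the exact procedure used here: it assumes NCBI-style GFF3 in which pseudogenes are either typed `pseudogene` in column 3 or carry a `pseudo=true` attribute, and the function name `drop_pseudogenes` and the example records are made up for illustration.

```python
def drop_pseudogenes(gff_lines):
    """Return GFF3 lines with pseudogene features and their descendants removed.

    Assumption: pseudogenes are marked by feature type 'pseudogene' or by the
    attribute 'pseudo=true', and children point to them via 'Parent='.
    """

    def parse_attrs(col9):
        # Column 9 is a ';'-separated list of key=value pairs.
        return dict(f.split("=", 1) for f in col9.strip().split(";") if "=" in f)

    def is_pseudo(ftype, attrs):
        return ftype == "pseudogene" or attrs.get("pseudo") == "true"

    # Parse once; comments, blank lines and malformed lines pass through untouched.
    records = []
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9:
            records.append((line, None, None))
        else:
            records.append((line, cols[2], parse_attrs(cols[8])))

    # Seed the removal set with the IDs of the pseudogene features themselves.
    removed = {
        attrs["ID"]
        for _, ftype, attrs in records
        if attrs is not None and is_pseudo(ftype, attrs) and "ID" in attrs
    }

    # Propagate to descendants (transcripts, exons, CDS) until nothing changes.
    changed = True
    while changed:
        changed = False
        for _, _, attrs in records:
            if attrs is None:
                continue
            fid = attrs.get("ID")
            parents = attrs.get("Parent", "").split(",")
            if fid and fid not in removed and any(p in removed for p in parents):
                removed.add(fid)
                changed = True

    kept = []
    for line, ftype, attrs in records:
        if attrs is None:
            kept.append(line)
            continue
        parents = attrs.get("Parent", "").split(",")
        if is_pseudo(ftype, attrs) or any(p in removed for p in parents):
            continue
        kept.append(line)
    return kept


# Hypothetical toy records, only to show the call.
example = [
    "##gff-version 3\n",
    "chr1\tRefSeq\tgene\t1\t900\t.\t+\t.\tID=gene1\n",
    "chr1\tRefSeq\tmRNA\t1\t900\t.\t+\t.\tID=rna1;Parent=gene1\n",
    "chr1\tRefSeq\tpseudogene\t1000\t1500\t.\t-\t.\tID=gene2;pseudo=true\n",
    "chr1\tRefSeq\ttranscript\t1000\t1500\t.\t-\t.\tID=rna2;Parent=gene2\n",
    "chr1\tRefSeq\texon\t1000\t1500\t.\t-\t.\tID=exon2;Parent=rna2\n",
]
filtered = drop_pseudogenes(example)
```

The descendants are removed together with the pseudogene itself so that no orphaned transcripts or exons are left behind for downstream tools to trip over.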
I have a question.
Should I filter the polished GFF with gffread to keep only genes that have START and STOP codons and no in-frame STOP codons, or
should I feed Mikado the raw GFF output by Liftoff?
Another question: should I use a soft-masked genome, or is a regular FASTA file OK?
Thanks.