Replies: 2 comments
Some of the utility programs (e.g. diff_isoform_usage) write the gene id and transcript id as separate columns, which is great, but that differs from the other programs and so requires even more programming.
Also, transcript ids composed of two UUIDs make it harder to display the transcripts (e.g. in the UCSC browser) and seem pointless.
The convention of merging the transcript id and gene id into transId_geneId wastes time in analysis pipelines.
When the GTF is written out, the transcript_id is set to just the trans_id part. So then if you run other tools on the GTF,
and want to access the flair files, you need to do id mapping from transId_geneId.
Plus using "_" conflicts with RefSeq accessions 🤯
Please don't encode metadata in ids; this has long been known to cause all kinds of grief in bioinformatics (TCGA). Please use two columns in files, one for transcript id and one for gene id. For BEDs, you can make it a BED12+1 if you want (nice for turning into a bigBed track).
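
To illustrate the "_" problem, here is a minimal Python sketch. It assumes the combined id is simply the transcript id, an underscore, then the gene id. The helper names and the example ids are made up for illustration and are not taken from flair's code. Neither "split at the first underscore" nor "split at the last underscore" recovers both parts for every id, whereas a two-column file needs no parsing at all.

```python
import csv
import io


def split_first(combined: str) -> tuple[str, str]:
    """Split an assumed 'transId_geneId' string at the FIRST underscore."""
    trans_id, _, gene_id = combined.partition("_")
    return trans_id, gene_id


def split_last(combined: str) -> tuple[str, str]:
    """Split an assumed 'transId_geneId' string at the LAST underscore."""
    trans_id, _, gene_id = combined.rpartition("_")
    return trans_id, gene_id


def write_two_columns(pairs, out) -> None:
    """Write transcript id and gene id as separate tab-separated columns."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["transcript_id", "gene_id"])
    writer.writerows(pairs)


# Illustrative ids only: a RefSeq-style transcript accession contains "_".
refseq_style = "NM_000546.6_TP53"
# Hypothetical case where the gene id also contains "_".
both_underscored = "NM_000546.6_NR_046018.2"

first_result = split_first(refseq_style)      # breaks the accession apart
last_result = split_last(refseq_style)        # happens to work here
last_result_bad = split_last(both_underscored)  # fails when the gene id has "_"

# With two columns there is nothing to parse.
buf = io.StringIO()
write_two_columns([("NM_000546.6", "TP53")], buf)
two_column_text = buf.getvalue()
```

Reading the two-column output back is a plain `csv.reader(..., delimiter="\t")` call, with no per-tool id mapping.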