merging of transcript ids and gene ids is frustrating to use #312

diekhans · 2023-08-27T20:19:17Z

diekhans
Aug 27, 2023
Maintainer

The convention of merging the transcript id and gene id into transId_geneId wastes a lot of time in analysis pipelines.

When the GTF is written out, the transcript_id is set to just the transId part. So if you run other tools on the GTF
and then want to look things up in the flair files, you need to map ids back from transId_geneId.

Plus, using "_" as the separator conflicts with RefSeq accessions, which contain an underscore themselves (e.g. NM_000546) 🤯
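To make the ambiguity concrete, here is a minimal sketch. The helper names `split_first` and `split_last` are hypothetical and not part of flair, and the ids are only illustrative. It shows that neither "split at the first underscore" nor "split at the last underscore" recovers both parts once either id can itself contain "_".

```python
def split_first(merged):
    """Split a merged 'transId_geneId' string at the FIRST underscore (hypothetical helper)."""
    return tuple(merged.split("_", 1))


def split_last(merged):
    """Split a merged 'transId_geneId' string at the LAST underscore (hypothetical helper)."""
    trans_id, _, gene_id = merged.rpartition("_")
    return (trans_id, gene_id)


# Ensembl-style ids contain no underscore, so both strategies agree.
ensembl = "ENST00000269305.9_ENSG00000141510.18"
print(split_first(ensembl), split_last(ensembl))

# A RefSeq transcript accession contains "_", so splitting at the first one breaks it.
refseq_tx = "NM_000546.6_TP53"
print(split_first(refseq_tx))  # ('NM', '000546.6_TP53')
print(split_last(refseq_tx))   # ('NM_000546.6', 'TP53')

# If the gene id contains "_" instead, splitting at the last one breaks it.
underscore_gene = "ENST00000269305.9_NR_046018.2"
print(split_first(underscore_gene))  # ('ENST00000269305.9', 'NR_046018.2')
print(split_last(underscore_gene))   # ('ENST00000269305.9_NR', '046018.2')
```

With both ids free to contain "_", no fixed splitting rule is safe, which is why every downstream script ends up carrying its own id-mapping table.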

Please don't encode metadata in ids; this has long been known to cause all kinds of grief in bioinformatics (TCGA). Please use two columns in files, one for transcript id and one for gene id. For BEDs, you can make it a BED12+1 if you want (nice for turning into a bigBed track).
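A rough sketch of what the BED12+1 layout could look like, assuming the gene id is simply appended as a 13th tab-separated column. The function names and the example record are made up for illustration and are not flair code.

```python
def format_bed12_plus_gene(bed12_fields, gene_id):
    """Return one BED12+1 line: the 12 standard BED fields plus a gene id column."""
    if len(bed12_fields) != 12:
        raise ValueError(f"expected 12 BED fields, got {len(bed12_fields)}")
    return "\t".join([str(f) for f in bed12_fields] + [gene_id])


def parse_bed12_plus_gene(line):
    """Return (transcript_id, gene_id) from a BED12+1 line, with no id splitting needed."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 13:
        raise ValueError(f"expected 13 columns, got {len(fields)}")
    return fields[3], fields[12]


# Illustrative record; the coordinates and ids are invented for the example.
record = ["chr17", 7668401, 7687550, "NM_000546.6", 0, "-",
          7668401, 7687550, "0", 2, "100,200,", "0,18949,"]
line = format_bed12_plus_gene(record, "TP53")
print(line)
print(parse_bed12_plus_gene(line))  # ('NM_000546.6', 'TP53')
```

The name column stays a plain transcript id, so it matches the transcript_id in the GTF directly, and underscores inside either id no longer matter. For a bigBed track, the extra column would also need to be declared in an autoSql (.as) file passed to bedToBigBed with `-type=bed12+1`.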

diekhans · 2023-08-28T03:10:19Z

diekhans
Aug 28, 2023
Maintainer Author

Some of the utility programs (e.g. diff_isoform_usage) write gene id and transcript id separately, which is great, but that differs from the other programs, requiring even more programming.


diekhans · 2023-08-28T03:14:56Z

diekhans
Aug 28, 2023
Maintainer Author

Also, transcript ids composed of two UUIDs make it harder to display the transcripts (e.g. in the UCSC browser) and seem pointless.

