Accessing transcript data for genes from bacterial artificial chromosomes (BACs) #89

@davmlaw

Hi Dave, I am currently working on a project in Alex Wagner's laboratory aimed at standardizing the output of different gene fusion detection algorithms. A central component of this work is a transcript-based model of the junctions for each partner in a fusion (see our [specification](https://fusions.cancervariants.org/en/latest/information_model.html) for further reference).

We currently use UTA to get this transcript data, but have observed several cases where a reported fusion includes a gene from a bacterial artificial chromosome as a fusion partner (e.g. RP5-899B16.3 and CTD-2055G21.1). We are considering using cdot alongside UTA to get transcript data for gene symbols that may not exist in the recent UTA release.

By processing earlier versions of the GENCODE GTFs, such as version 38, we were able to extract the transcripts linked to these gene symbols. However, when we queried the matching transcripts with cdot, the `gene_name` attribute was `None`. For example, for the gene RP5-899B16.3 we observed:

```python
{'id': 'ENST00000666152.1',
 'chrom': 'NC_000006.12',
 'start': 139938863,
 'end': 139991094,
 'strand': '-',
 'cds_start': 139991094,
 'cds_end': 139991094,
 'gene_name': None,
 'exons': [[139938863, 139939458],
  [139978011, 139978404],
  [139978621, 139978873],
  [139990992, 139991094]]}
```
We were wondering why the `gene_name` attribute is `None` here. Also, would cdot be appropriate for this use case (getting the list of transcripts associated with a gene symbol)? Thank you for your help.
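
For context, the GTF lookup described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper names `transcripts_by_gene` and `load_gtf` and the file name in the usage comment are hypothetical, and the code assumes the usual GENCODE GTF layout (nine tab-separated columns, with `gene_name` and `transcript_id` as quoted key/value pairs in the last column).

```python
import gzip
import re
from collections import defaultdict

# Matches quoted key "value" pairs in the GTF attributes column (column 9).
# Unquoted attributes such as `level 2;` are ignored, which is fine here
# because only gene_name and transcript_id are needed.
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def transcripts_by_gene(lines):
    """Map each gene_name to the sorted transcript IDs found on 'transcript' rows."""
    mapping = defaultdict(set)
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "transcript":
            continue  # only transcript-level features are of interest
        attrs = dict(_ATTR_RE.findall(fields[8]))
        gene_name = attrs.get("gene_name")
        transcript_id = attrs.get("transcript_id")
        if gene_name and transcript_id:
            mapping[gene_name].add(transcript_id)
    return {gene: sorted(ids) for gene, ids in mapping.items()}


def load_gtf(path):
    """Build the gene_name -> transcript IDs mapping from a plain or gzipped GTF."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return transcripts_by_gene(handle)


# Hypothetical usage (the file name is an assumption, not a tested path):
# mapping = load_gtf("gencode.v38.annotation.gtf.gz")
# mapping.get("RP5-899B16.3", [])
```

The transcript IDs returned this way are the ones we then look up in cdot.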
