Compatibility of createKEGGdb with keyType option of clusterProfiler::enrichKEGG function #12

@thegrebe

Hello,

Thanks for this useful package!

I have some questions on what exactly is stored in the resulting KEGG.db, and how that relates to the options of clusterProfiler::enrichKEGG.
enrichKEGG has an option keyType, which accepts kegg, ncbi-geneid, ncbi-proteinid or uniprot.


Background/context

I would like a way to run KEGG enrichment analysis starting from gene symbols (SYMBOL), and I want the same approach to work for any arbitrary species.

From this reply YuLab-SMU/clusterProfiler#108 (comment)

KEGG id and ENTREZID are the same for only some of the species, but not always the same.

and this blog post https://guangchuangyu.github.io/2016/05/convert-biological-id-with-kegg-api-using-clusterprofiler/

A rule of thumb for the ‘kegg’ ID is entrezgene ID for eukaryote species and Locus ID for prokaryotes.

I conclude that kegg IDs are not reliable enough, or not sufficiently well described, for my use. I would therefore prefer to use ncbi-geneid.


However, when I open the SQLite database created by createKEGGdb, the only gene identifier field I see is gene_or_orf_id, in table pathway2gene.
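For reference, this kind of inspection can be sketched as follows. It is a minimal Python sketch using only the standard `sqlite3` module, and it is not part of createKEGGdb. So that it runs on its own, it builds a small in-memory stand-in database: the `pathway_id` column name and the row values are assumptions for illustration, and only `pathway2gene` and `gene_or_orf_id` come from what I actually saw. The helper `describe_sqlite` is a hypothetical name.

```python
import sqlite3


def describe_sqlite(conn):
    """Return {table_name: [column names]} for every table in the database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in columns]
    return schema


# Stand-in database (assumption): replace ":memory:" with the path to the
# real KEGG.sqlite file and drop the CREATE/INSERT lines to inspect it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pathway2gene (pathway_id TEXT, gene_or_orf_id TEXT)")
conn.execute("INSERT INTO pathway2gene VALUES ('xxx00010', 'PLACEHOLDER_GENE')")

schema = describe_sqlite(conn)
sample = conn.execute("SELECT * FROM pathway2gene LIMIT 5").fetchall()

print(schema)  # table names mapped to their column names
print(sample)  # first few rows, to see what the identifiers look like
```

Looking at a few real rows this way shows what the identifiers look like, but it does not by itself say which ID type they are.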

Questions:

  • What is the gene_or_orf_id stored in the KEGG.db database? Is it a kegg ID?
  • Can I use createKEGGdb to create a KEGG.db package, and then use it with clusterProfiler::enrichKEGG with keyType = "ncbi-geneid" (and use_internal_data = TRUE)?

Thank you in advance for your help.
All the best
