Question about NCyc_95.faa.gz vs NCyc_100.faa.gz #40

@pthieringer

Hello and thank you for this very useful tool for finding N-cycling related genes!

I have a question about the output from the NCyc_95 and NCyc_100(_2019Jul) databases: how can the 95 database produce more hits for a gene than the 100 database? For background, I am running the program on a set of MAGs and wanted to see how the output differs between the two databases. For multiple MAGs, some genes get more hits with the 95 database than with the 100 database. Why does this happen?

I ask because I would expect every representative sequence in the 95 database to also be present in the 100 database. Am I misunderstanding how these two databases are created?

I am running the tool with DIAMOND and am happy to share any code or output you may need. Thank you for your time and advice!
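
A minimal sketch of the per-gene comparison described above, assuming the hits from each run have already been reduced to (query ID, gene name) pairs. The function names, query IDs, and toy hit lists are hypothetical and made up for illustration; they are not the tool's actual output or format.

```python
from collections import Counter


def count_hits_per_gene(hits):
    """Count hits per gene from an iterable of (query_id, gene_name) pairs."""
    return Counter(gene for _query, gene in hits)


def genes_with_more_hits(counts_a, counts_b):
    """Return {gene: (count_a, count_b)} for genes where run A has more hits than run B."""
    return {
        gene: (counts_a[gene], counts_b[gene])
        for gene in sorted(set(counts_a) | set(counts_b))
        if counts_a[gene] > counts_b[gene]  # Counter returns 0 for missing genes
    }


# Toy, made-up hit lists (placeholders, not real results).
hits_95 = [("contig1_1", "nifH"), ("contig1_2", "nifH"), ("contig2_5", "narG")]
hits_100 = [("contig1_1", "nifH"), ("contig2_5", "narG")]

diff = genes_with_more_hits(
    count_hits_per_gene(hits_95),
    count_hits_per_gene(hits_100),
)
print(diff)  # {'nifH': (2, 1)}
```

Listing the query IDs behind each extra hit, rather than only the counts, would show which sequences match the 95 database but not the 100 database.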
