How to make surpi db for new NCBI accession2taxid files #36

@DooYal

I modified the create_taxonomy shell and Python scripts, replacing "gi_taxid_nucl.dmp.gz" with "nucl_gb.accession2taxid.gz" and making the corresponding filename change for the protein file.
However, step 4 of the create_taxonomy shell script always fails with this error:

Starting creation of taxonomy SQLite databases...
Creating names_nodes_scientific.db...
Creating taxid_prot.db...
Traceback (most recent call last):
  File "/mnt/upan/share/surpi/create_taxonomy_db.py", line 76, in <module>
    c.execute("INSERT INTO gi_taxid VALUES ("+line[0]+","+line[1]+")")
sqlite3.OperationalError: no such column: accession

The corresponding files do contain "accession" in their header line. Which part of the scripts should I modify so that the SURPI protein and nucleotide databases build correctly?
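For context, the error is consistent with how the INSERT statement is assembled. The accession2taxid files start with a header line (accession, accession.version, taxid, gi), and the script concatenates the first two fields directly into the SQL text, so SQLite reads the bare word accession as a column name. Real accessions are also text rather than integers, so they would likely hit the same problem without quoting. Below is a minimal sketch of one possible workaround, not the SURPI authors' method: it skips the header and uses parameterized inserts. The function name `load_accession2taxid`, the table name `acc_taxid`, and its schema are hypothetical, and keying on the accession.version column is an assumption; the rest of the pipeline would need matching changes wherever it looks up by GI.

```python
import csv
import gzip
import sqlite3


def load_accession2taxid(gz_path, db_path):
    """Load an NCBI accession2taxid file into SQLite (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    c = conn.cursor()
    # Accessions are text, unlike the integer GIs used by the old files.
    c.execute(
        "CREATE TABLE IF NOT EXISTS acc_taxid "
        "(accession TEXT PRIMARY KEY, taxid INTEGER)"
    )
    with gzip.open(gz_path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        # Skip the header line: accession, accession.version, taxid, gi
        next(reader, None)
        # Assumption: key on accession.version (column 1); taxid is column 2.
        rows = ((row[1], int(row[2])) for row in reader if len(row) >= 3)
        # "?" placeholders let sqlite3 quote the values, so an accession
        # is never parsed as a column name.
        c.executemany("INSERT OR REPLACE INTO acc_taxid VALUES (?, ?)", rows)
    conn.commit()
    conn.close()
```

A call such as `load_accession2taxid("nucl_gb.accession2taxid.gz", "acc_taxid_nucl.db")` would then build the nucleotide table; the output filename here is only an illustration.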
