1. Scorpio CreateDb Module
Mohammad Saleh Refahi edited this page Nov 17, 2024 · 2 revisions
The createdb module is used to create a database for the Scorpio model. This guide provides instructions on how to use the createdb.py script, including the necessary arguments and example usage.
The script accepts the following arguments:
- --scorpio_model (str): Path to the Scorpio model
- --output (str): Output DB folder
- --db_fasta (str): FASTA file
- --val_fasta (str): Validation FASTA file
- --max_len (int): Maximum sequence length
- --batch_size (int): Batch size (default: 60)
- --db_embedding (str): Embedding file
- --val_embedding (str): Validation embedding file
- --metadata (str): Metadata file
- --cal_kmer_freq (bool): Calculate k-mer frequency (default: False)
- --num_distance (int): Number of distances, a parameter for confidence score training (default: 2000)
- --num_device (int): Number of devices to use (default: 1)
- --required_memory_gb (int): Required memory in GB (default: 79)
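To make the types and defaults above concrete, here is a minimal argparse sketch that mirrors the documented arguments. It is an illustration only, not the actual contents of createdb.py: the names `str2bool` and `build_parser` are hypothetical, and the real script may declare or validate its arguments differently. The boolean helper is an assumption added because a plain `type=bool` in argparse would treat the string "False" as true.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse 'True'/'False'-style strings into a real boolean."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser that mirrors the documented arguments and defaults.
    # Arguments without a documented default fall back to None here.
    parser = argparse.ArgumentParser(prog="scorpio createdb")
    parser.add_argument("--scorpio_model", type=str, help="Path to the Scorpio model")
    parser.add_argument("--output", type=str, help="Output DB folder")
    parser.add_argument("--db_fasta", type=str, help="FASTA file")
    parser.add_argument("--val_fasta", type=str, help="Validation FASTA file")
    parser.add_argument("--max_len", type=int, help="Maximum sequence length")
    parser.add_argument("--batch_size", type=int, default=60, help="Batch size")
    parser.add_argument("--db_embedding", type=str, help="Embedding file")
    parser.add_argument("--val_embedding", type=str, help="Validation embedding file")
    parser.add_argument("--metadata", type=str, help="Metadata file")
    parser.add_argument("--cal_kmer_freq", type=str2bool, default=False,
                        help="Calculate k-mer frequency")
    parser.add_argument("--num_distance", type=int, default=2000,
                        help="Number of distances for confidence score training")
    parser.add_argument("--num_device", type=int, default=1,
                        help="Number of devices to use")
    parser.add_argument("--required_memory_gb", type=int, default=79,
                        help="Required memory in GB")
    return parser


if __name__ == "__main__":
    # Parse a small example argument list instead of sys.argv.
    args = build_parser().parse_args(["--cal_kmer_freq", "True", "--max_len", "2096"])
    print(args.cal_kmer_freq, args.max_len, args.batch_size)
```

Any argument omitted on the command line takes the default listed above, so in this sketch `--batch_size` resolves to 60 unless it is overridden.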
To run the createdb module (createdb.py), use the following command:
scorpio createdb --scorpio_model "./models/Scorpio-6Freq" \
--db_fasta "./data/data-amr/train.fasta" \
--val_fasta "./data/data-amr/val.fasta" \
--metadata "./data/data-amr/metadata.csv" \
--cal_kmer_freq True \
--output "db_path" \
--max_len 2096 \
--num_distance 2000 \
--batch_size 100
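If the command needs to be launched from a Python pipeline rather than typed by hand, it can be assembled as an argument list. The sketch below is an assumption-laden illustration: `build_createdb_command` is a hypothetical helper, it only covers the flags used in the example above, and it assumes the `scorpio` executable is available on the PATH. The actual call is left commented out so the snippet runs without Scorpio installed.

```python
import shlex
from typing import List


def build_createdb_command(
    scorpio_model: str,
    db_fasta: str,
    val_fasta: str,
    metadata: str,
    output: str,
    max_len: int,
    num_distance: int = 2000,
    batch_size: int = 60,
    cal_kmer_freq: bool = False,
) -> List[str]:
    """Return the `scorpio createdb` invocation as a list of arguments."""
    return [
        "scorpio", "createdb",
        "--scorpio_model", scorpio_model,
        "--db_fasta", db_fasta,
        "--val_fasta", val_fasta,
        "--metadata", metadata,
        "--cal_kmer_freq", str(cal_kmer_freq),  # rendered as "True" or "False"
        "--output", output,
        "--max_len", str(max_len),
        "--num_distance", str(num_distance),
        "--batch_size", str(batch_size),
    ]


# Same values as the example command above.
cmd = build_createdb_command(
    scorpio_model="./models/Scorpio-6Freq",
    db_fasta="./data/data-amr/train.fasta",
    val_fasta="./data/data-amr/val.fasta",
    metadata="./data/data-amr/metadata.csv",
    output="db_path",
    max_len=2096,
    num_distance=2000,
    batch_size=100,
    cal_kmer_freq=True,
)

# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))

# To actually run it (requires Scorpio to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.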