build_bowtie2db issue #158

@michaelwasney

Hi Dr. Zhao,

I am trying to build a genome index, and am getting the following error:

[michaelw@n1015 errors]$ cat build_genome_indices.e12643016
1773252698.0:  Build Bowtie2 genome database in subcommand build_bowtie2db with args
1773252698.0:  {
1773252698.0:      "subcommand": "build_bowtie2db",
1773252698.0:      "force": false,
1773252698.0:      "debug": false,
1773252698.0:      "zzz_worker_mode": false,
1773252698.0:      "batch_branch": "master",
1773252698.0:      "batch_memory": 378880,
1773252698.0:      "batch_vcpus": 48,
1773252698.0:      "batch_queue": "pairani",
1773252698.0:      "batch_ecr_image": "pairani:latest",
1773252698.0:      "bt2_indexes_dir": "/u/project/ngarud/Garud_lab/JJacobs/MIDAS_output/one_bt2_indexes",
1773252698.0:      "bt2_indexes_name": "repgenomes",
1773252698.0:      "midasdb_name": "localdb",
1773252698.0:      "midasdb_dir": "/u/scratch/m/michaelw/assembling_mouse_db/MIDASDB-mgbc",
1773252698.0:      "species_list": null,
1773252698.0:      "species_profile": "/u/project/ngarud/Garud_lab/JJacobs/MIDAS_output/merge/species/species_prevalence.tsv",
1773252698.0:      "select_by": "mean_abundance,sample_counts",
1773252698.0:      "select_threshold": "3,1",
1773252698.0:      "num_cores": 8,
1773252698.0:      "prune_centroids": false,
1773252698.0:      "prune_method": "max",
1773252698.0:      "prune_cutoff": 0.4,
1773252698.0:      "remove_singleton": false
1773252698.0:  }
1773252698.1:  MIDAS2::build_bowtie2db::build bt2 indexees for the listed species: []
1773252698.3:  MIDAS2::build_bowtie2_repgenomes_indexes::start
1773252698.3:  Deleting untrustworthy outputs due to error. Specify --debug flag to keep.
Traceback (most recent call last):
  File "/u/home/m/michaelw/.conda/envs/midasv3/bin/midas", line 10, in <module>
    sys.exit(main())
  File "/u/home/m/michaelw/.conda/envs/midasv3/lib/python3.9/site-packages/midas/__main__.py", line 25, in main
    return subcommand_main(subcommand_args)
  File "/u/home/m/michaelw/.conda/envs/midasv3/lib/python3.9/site-packages/midas/subcommands/build_bowtie2db.py", line 144, in main
    build_bowtie2db(args)
  File "/u/home/m/michaelw/.conda/envs/midasv3/lib/python3.9/site-packages/midas/subcommands/build_bowtie2db.py", line 137, in build_bowtie2db
    raise error
  File "/u/home/m/michaelw/.conda/envs/midasv3/lib/python3.9/site-packages/midas/subcommands/build_bowtie2db.py", line 118, in build_bowtie2db
    midas_db.fetch_files("repgenome", species_of_interest)
  File "/u/home/m/michaelw/.conda/envs/midasv3/lib/python3.9/site-packages/midas/models/midasdb.py", line 190, in fetch_files
    return self.fetch_tarball(filename, list_of_species)
  File "/u/home/m/michaelw/.conda/envs/midasv3/lib/python3.9/site-packages/midas/models/midasdb.py", line 259, in fetch_tarball
    md5_fetched = file_md5sum(_fetched_file)
  File "/u/home/m/michaelw/.conda/envs/midasv3/lib/python3.9/site-packages/midas/models/midasdb.py", line 364, in file_md5sum
    return md5(open(local_file, "rb").read()).hexdigest()
IsADirectoryError: [Errno 21] Is a directory: '/u/scratch/m/michaelw/assembling_mouse_db/MIDASDB-mgbc/gene_annotations//'
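
The last frame of the traceback can be reproduced in isolation. The following is a minimal sketch, not MIDAS code: `file_md5sum` mirrors the one-line pattern shown in the traceback, while `try_md5` and the temporary paths are hypothetical helpers added for illustration. It shows that hashing a path that ends in `/` (a directory rather than a file) fails inside `open()` before any checksum is computed.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5sum(local_file):
    # Same pattern as the last traceback frame: read the whole file and hash it.
    with open(local_file, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def try_md5(path):
    """Return the MD5 hex digest, or the name of the OSError subclass raised."""
    try:
        return file_md5sum(path)
    except OSError as error:
        return type(error).__name__


with tempfile.TemporaryDirectory() as tmp:
    # A directory path with a trailing slash, like the one in the error message.
    directory_result = try_md5(tmp + "/")

    # A regular file hashes without trouble.
    regular_file = Path(tmp) / "genome.fa"
    regular_file.write_bytes(b">contig\nACGT\n")
    file_result = try_md5(regular_file)

print(directory_result)
print(file_result)
```

On Linux the first result should be the same exception name as in the traceback, which suggests the path handed to the checksum step had an empty file name component.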

I did not get this error the last time I ran the build_bowtie2db step. /u/scratch/m/michaelw/assembling_mouse_db/MIDASDB-mgbc/gene_annotations/ is indeed a directory; it contains all of the genome annotations for a custom database that I built. Here are the arguments I passed:

db_name="localdb"
db_dir=/u/scratch/m/michaelw/assembling_mouse_db/MIDASDB-mgbc
species_profile=/u/project/ngarud/Garud_lab/JJacobs/MIDAS_output/merge/species/species_prevalence.tsv
b2index_dir=/u/project/ngarud/Garud_lab/JJacobs/MIDAS_output/one_bt2_indexes
num_cores=8

echo "Building SNPs repgenomes."

midas build_bowtie2db \
  --midasdb_name $db_name \
  --midasdb_dir $db_dir \
  --species_profile $species_profile \
  --select_by mean_abundance,sample_counts \
  --select_threshold 3,1 \
  --bt2_indexes_name repgenomes \
  --bt2_indexes_dir $b2index_dir \
  --num_cores $num_cores \
  --debug
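
The log above reports an empty species list (`[]`), so it may be worth checking whether any row of the species profile passes the selection thresholds. A minimal sketch of that check follows. It assumes the profile is a tab-separated file with columns named `mean_abundance` and `sample_counts` (taken from `--select_by`) plus a `species_id` column, and that selection uses a strict "greater than" comparison; the column name `species_id`, the comparison, and the example rows are all assumptions, not confirmed MIDAS behavior.

```python
import csv
import io


def select_species(tsv_handle, thresholds):
    """Return species_id values whose columns all exceed the given thresholds."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    selected = []
    for row in reader:
        # Assumption: a species is kept only if every column is strictly above its cutoff.
        if all(float(row[column]) > cutoff for column, cutoff in thresholds.items()):
            selected.append(row["species_id"])
    return selected


# Made-up rows for illustration only; not taken from the real species_prevalence.tsv.
example_profile = io.StringIO(
    "species_id\tmean_abundance\tsample_counts\n"
    "100001\t0.002\t5\n"
    "100002\t0.010\t12\n"
)

# Mirrors --select_by mean_abundance,sample_counts --select_threshold 3,1
thresholds = {"mean_abundance": 3.0, "sample_counts": 1.0}
selected = select_species(example_profile, thresholds)
print(selected)
```

To run it against the real profile, replace `example_profile` with `open(species_profile_path, newline="")`, where `species_profile_path` is a placeholder for the file passed to `--species_profile`. If no species pass, the thresholds rather than the database layout may be what changed since the previous run.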

Can you please help me understand what is causing the error?

Thank you!

Best,
Michael
