Updated Gambit 1.1.0 #1584

Open
cwoodside1278 wants to merge 20 commits into StaPH-B:master from cwoodside1278:gambit-1.1.0

Conversation

@cwoodside1278
Contributor

cwoodside1278 commented Mar 8, 2026

Pull Request (PR) checklist:

  • Include a description of what is in this pull request in this message.
  • The dockerfile successfully builds to a test target for the user creating the PR. (i.e. docker build --tag samtools:1.15test --target test docker-builds/build-files/samtools/1.15 )
  • Directory structure as name of the tool in lower case with special characters removed with a subdirectory of the version number in build-files (i.e. docker-builds/build-files/spades/3.12.0/Dockerfile)
    • (optional) All test files are located in same directory as the Dockerfile (i.e. build-files/shigatyper/2.0.1/test.sh)
  • Create a simple container-specific README.md in the same directory as the Dockerfile (i.e. docker-builds/build-files/spades/3.12.0/README.md)
    • If this README is longer than 30 lines, there is an explanation as to why more detail was needed
  • Dockerfile includes the recommended LABELS
  • Main README.md has been updated to include the tool and/or version of the dockerfile(s) in this PR
  • Program_Licenses.md contains the tool(s) used in this PR and has been updated for any missing
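As a rough illustration of the build-to-test-target item in the checklist above, the command can be assembled from the tool name and version using the directory convention the checklist describes. The helper name `build_test_cmd` is hypothetical and only prints the command; it does not run Docker.

```shell
# Print the test-target build command for a tool/version pair,
# following the docker-builds/build-files/<tool>/<version> layout
# from the checklist. build_test_cmd is a hypothetical helper name.
build_test_cmd() {
  tool="$1"
  version="$2"
  printf 'docker build --tag %s:%stest --target test docker-builds/build-files/%s/%s\n' \
    "$tool" "$version" "$tool" "$version"
}

build_test_cmd gambit 1.1.0
# prints: docker build --tag gambit:1.1.0test --target test docker-builds/build-files/gambit/1.1.0
```

The printed line can be pasted into a terminal from the directory that contains docker-builds.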

Description

Updated GAMBIT from version 1.0.0 to 1.1.0:

  • Created the new directory gambit/1.1.0
  • Updated the micromamba base image from 0.27.0 to 2.5.0: 'FROM mambaorg/micromamba:2.5.0 AS app_base'
  • Raised the minimum package versions in env.yaml to more recent releases that still work with GAMBIT
  • Updated the container-specific README for gambit/1.1.0
  • Updated the main README to include the new version

Diff Output

diff -r 1.0.0/Dockerfile 1.1.0/Dockerfile
2c2
< FROM mambaorg/micromamba:0.27.0 as app_base
---
> FROM mambaorg/micromamba:2.5.0 AS app_base
4c4
< ARG GAMBIT_SOFTWARE_VERSION="1.0.0"
---
> ARG GAMBIT_SOFTWARE_VERSION="1.1.0"
8c8
< LABEL base.image="mambaorg/micromamba:0.27.0"
---
> LABEL base.image="mambaorg/micromamba:2.5.0"
59c59
< FROM app as test
---
> FROM app AS test
63d62
< 
diff -r 1.0.0/env.yaml 1.1.0/env.yaml
5c5
<   - python ==3.9
---
>   - python ==3.11
7d6
< 
10,12c9,10
<   - cython >=0.29
<   - numpy >=1.13
< 
---
>   - cython >=3.0
>   - numpy >=1.24
14,20c12,18
<   - sqlalchemy >=1.1
<   - biopython >=1.69
<   - alembic >=1.0
<   - attrs >=20
<   - cattrs >=1.0
<   - click >=7.0
<   - h5py >=3.0
---
>   - sqlalchemy >=2.0
>   - biopython >=1.80
>   - alembic >=1.13
>   - attrs >=23
>   - cattrs >=23.0
>   - click >=8.0
>   - h5py >=3.9
22,23c20
<   - scipy >=1.7
< 
---
>   - scipy >=1.11
26d22
< 
28c24
<   - pandas >=1.4
---
>   - pandas >=2.0

Test Build

[+] Building 1.0s (17/17) FINISHED                                                                                                                                    docker:default
 => [internal] load build definition from Dockerfile                                                                                                                            0.0s
 => => transferring dockerfile: 2.10kB                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/mambaorg/micromamba:2.5.0                                                                                                            0.3s
 => [internal] load .dockerignore                                                                                                                                               0.0s
 => => transferring context: 2B                                                                                                                                                 0.0s
 => [app_base 1/6] FROM docker.io/mambaorg/micromamba:2.5.0@sha256:af06736ba66714dd4b18b63011e4405091561b8777da14da30ba618ef280c0b5                                             0.0s
 => => resolve docker.io/mambaorg/micromamba:2.5.0@sha256:af06736ba66714dd4b18b63011e4405091561b8777da14da30ba618ef280c0b5                                                      0.0s
 => [internal] load build context                                                                                                                                               0.0s
 => => transferring context: 57B                                                                                                                                                0.0s
 => [app 2/2] ADD --chown=mambauser:mambauser https://storage.googleapis.com/jlumpe-gambit/public/databases/refseq-curated/1.0/gambit-refseq-curated-1.0.gs /gambit-db/         0.4s
 => [app 1/2] ADD --chown=mambauser:mambauser https://storage.googleapis.com/jlumpe-gambit/public/databases/refseq-curated/1.0/gambit-refseq-curated-1.0.gdb /gambit-db/        0.4s
 => CACHED [app_base 2/6] COPY --chown=mambauser:mambauser env.yaml /tmp/env.yaml                                                                                               0.0s
 => CACHED [app_base 3/6] RUN micromamba install -y -n base -f /tmp/env.yaml &&     micromamba clean --all --yes                                                                0.0s
 => CACHED [app_base 4/6] RUN pip install https://github.com/jlumpe/gambit/archive/refs/tags/v1.1.0.tar.gz &&   micromamba clean -a -y                                          0.0s
 => CACHED [app_base 5/6] RUN mkdir /gambit-db /data &&   chown mambauser:mambauser /gambit-db /data                                                                            0.0s
 => CACHED [app_base 6/6] WORKDIR /data                                                                                                                                         0.0s
 => CACHED [app 1/2] ADD --chown=mambauser:mambauser https://storage.googleapis.com/jlumpe-gambit/public/databases/refseq-curated/1.0/gambit-refseq-curated-1.0.gdb /gambit-db  0.0s
 => CACHED [app 2/2] ADD --chown=mambauser:mambauser https://storage.googleapis.com/jlumpe-gambit/public/databases/refseq-curated/1.0/gambit-refseq-curated-1.0.gs /gambit-db/  0.0s
 => CACHED [test 1/2] COPY test.sh .                                                                                                                                            0.0s
 => CACHED [test 2/2] RUN bash test.sh                                                                                                                                          0.0s
 => exporting to image                                                                                                                                                          0.1s
 => => exporting layers                                                                                                                                                         0.0s
 => => exporting manifest sha256:6225d3c5386ec2cb5b325b363731cea065eddc0866fb254b628997ad2d5fc8e3                                                                               0.0s
 => => exporting config sha256:d65102f87bec5410f8e06d190deb43f88c33a774e25de85474dc5abb0b581cab                                                                                 0.0s
 => => exporting attestation manifest sha256:4d6c81ede0b691dc9acc4af9a7453bbf799eaa52b2ab8468866f765ae4eec045                                                                   0.0s
 => => exporting manifest list sha256:099484a289118fb2b9737acf42510b65e3631ee0f51b74182243b1e340476c95                                                                          0.0s
 => => naming to docker.io/library/gambit:1.1.0                                                                                                                                 0.0s
 => => unpacking to docker.io/library/gambit:1.1.0   

Comment on lines +61 to +62
LABEL maintainer1="Kevin Libuit"
LABEL maintainer.email1="kevin.libuit@theiagen.com"
Contributor

Can you adjust LABEL maintainer1 to LABEL maintainer and LABEL maintainer.email1 to LABEL maintainer.email?

Contributor Author

@cwoodside1278 cwoodside1278 Mar 11, 2026

Wait, I thought you wanted me to remove the labels from the app stage haha. I removed all the labels from the app stage. Would you like me to just do it to the top ones?

@erinyoung
Contributor

The tests work

#17 [test 2/2] RUN bash test.sh
#17 0.683 gambit, version 1.1.0
#17 0.757 --2026-03-09 21:53:21--  https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/240/185/GCF_000240185.1_ASM24018v2/GCF_000240185.1_ASM24018v2_genomic.fna.gz
#17 0.765 Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.10, 130.14.250.11, 2607:f220:41e:250::31, ...
#17 0.886 Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.10|:443... connected.
#17 0.936 HTTP request sent, awaiting response... 200 OK
#17 0.999 Length: 1679403 (1.6M) [application/x-gzip]
#17 1.000 Saving to: ‘GCF_000240185.1_ASM24018v2_genomic.fna.gz’
#17 1.000 
#17 1.000      0K .......... .......... .......... .......... ..........  3% 1.04M 1s
#17 1.047     50K .......... .......... .......... .......... ..........  6% 2.08M 1s
#17 1.070    100K .......... .......... .......... .......... ..........  9% 47.4M 1s
#17 1.071    150K .......... .......... .......... .......... .......... 12% 2.16M 1s
#17 1.094    200K .......... .......... .......... .......... .......... 15%  142M 1s
#17 1.094    250K .......... .......... .......... .......... .......... 18% 99.1M 0s
#17 1.095    300K .......... .......... .......... .......... .......... 21%  187M 0s
#17 1.095    350K .......... .......... .......... .......... .......... 24% 2.16M 0s
#17 1.118    400K .......... .......... .......... .......... .......... 27%  113M 0s
#17 1.118    450K .......... .......... .......... .......... .......... 30% 35.6M 0s
#17 1.120    500K .......... .......... .......... .......... .......... 33% 79.5M 0s
#17 1.120    550K .......... .......... .......... .......... .......... 36%  157M 0s
#17 1.121    600K .......... .......... .......... .......... .......... 39% 82.9M 0s
#17 1.121    650K .......... .......... .......... .......... .......... 42%  159M 0s
#17 1.121    700K .......... .......... .......... .......... .......... 45%  111M 0s
#17 1.122    750K .......... .......... .......... .......... .......... 48% 2.48M 0s
#17 1.142    800K .......... .......... .......... .......... .......... 51%  148M 0s
#17 1.142    850K .......... .......... .......... .......... .......... 54% 45.1M 0s
#17 1.143    900K .......... .......... .......... .......... .......... 57% 70.6M 0s
#17 1.144    950K .......... .......... .......... .......... .......... 60%  103M 0s
#17 1.144   1000K .......... .......... .......... .......... .......... 64%  156M 0s
#17 1.144   1050K .......... .......... .......... .......... .......... 67% 73.4M 0s
#17 1.145   1100K .......... .......... .......... .......... .......... 70%  109M 0s
#17 1.146   1150K .......... .......... .......... .......... .......... 73% 73.7M 0s
#17 1.146   1200K .......... .......... .......... .......... .......... 76%  104M 0s
#17 1.147   1250K .......... .......... .......... .......... .......... 79%  162M 0s
#17 1.147   1300K .......... .......... .......... .......... .......... 82%  127M 0s
#17 1.147   1350K .......... .......... .......... .......... .......... 85% 77.2M 0s
#17 1.148   1400K .......... .......... .......... .......... .......... 88%  158M 0s
#17 1.148   1450K .......... .......... .......... .......... .......... 91% 74.5M 0s
#17 1.149   1500K .......... .......... .......... .......... .......... 94% 3.08M 0s
#17 1.165   1550K .......... .......... .......... .......... .......... 97% 74.4M 0s
#17 1.166   1600K .......... .......... .......... ..........           100%  128M=0.2s
#17 1.166 
#17 1.166 2026-03-09 21:53:21 (9.66 MB/s) - ‘GCF_000240185.1_ASM24018v2_genomic.fna.gz’ saved [1679403/1679403]
#17 1.166 
#17 2.586 Parsing input
#17 2.631 Calculating distances
#17 5.005 Classifying
#17 5.206 GCF_000240185.1_ASM24018v2_genomic,Klebsiella pneumoniae,species,573,0.4691092073917389,0.0,[GCF_000240185.1] Klebsiella pneumoniae subsp. pneumoniae HS11286 (enterobacteria),,,,
#17 DONE 5.2s

@erinyoung
Contributor

I've moved the LABELS to the 'app' stage and fixed maintainer1 to maintainer

@cwoodside1278
Contributor Author

Gotcha. Thanks for doing that! I wasn't sure.

- python ==3.11
- pip
# Build requirements
- c-compiler
Contributor

Is the compiler needed at runtime or just when building the software?

Contributor Author

I think just for building the software.

@erinyoung
Contributor

Sorry about all the failures. I wanted to simplify the ARGs and broke EVERYTHING!

Those are now working.

This image is missing ps, which is required when using these images in Nextflow workflows. Can you ensure that it is installed in the app stage?
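Nextflow relies on ps inside the container to collect task metrics. A minimal sketch of a check that could be added to a test script or run inside the container is below; `have_cmd` is a hypothetical helper name, not something from this PR.

```shell
# Report whether a command is available on PATH.
# have_cmd is a hypothetical helper name used only for this sketch.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd ps; then
  echo "ps found: $(command -v ps)"
else
  echo "ps missing: install procps in the app stage"
fi
```

On Debian-based images such as the micromamba base used here, ps is provided by the procps package.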

Added code to install ps (procps), required for Nextflow workflows
@cwoodside1278
Contributor Author

Okay, I updated the Dockerfile to install ps. I ran docker build and it completed successfully:

[+] Building 191.4s (19/19) FINISHED                                                                                                                                  docker:default
 => [internal] load build definition from Dockerfile                                                                                                                            0.0s
 => => transferring dockerfile: 2.61kB                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/mambaorg/micromamba:2.5.0                                                                                                            0.4s
 => [internal] load .dockerignore                                                                                                                                               0.0s
 => => transferring context: 2B                                                                                                                                                 0.0s
 => [app_base 1/6] FROM docker.io/mambaorg/micromamba:2.5.0@sha256:af06736ba66714dd4b18b63011e4405091561b8777da14da30ba618ef280c0b5                                             0.0s
 => => resolve docker.io/mambaorg/micromamba:2.5.0@sha256:af06736ba66714dd4b18b63011e4405091561b8777da14da30ba618ef280c0b5                                                      0.0s
 => [internal] load build context                                                                                                                                               0.0s
 => => transferring context: 780B                                                                                                                                               0.0s
 => CACHED [app 3/4] ADD --chown=mambauser:mambauser https://storage.googleapis.com/jlumpe-gambit/public/databases/refseq-curated/1.0/gambit-refseq-curated-1.0.gs /gambit-db/  0.1s
 => CACHED [app 2/4] ADD --chown=mambauser:mambauser https://storage.googleapis.com/jlumpe-gambit/public/databases/refseq-curated/1.0/gambit-refseq-curated-1.0.gdb /gambit-db  0.2s
 => CACHED [app_base 2/6] COPY --chown=mambauser:mambauser env.yaml /tmp/env.yaml                                                                                               0.0s
 => [app_base 3/6] RUN micromamba install -y -n base -f /tmp/env.yaml &&     micromamba clean --all --yes                                                                      31.9s
 => [app_base 4/6] RUN pip install https://github.com/jlumpe/gambit/archive/refs/tags/v1.1.0.tar.gz &&   micromamba clean -a -y                                                27.1s 
 => [app_base 5/6] RUN mkdir /gambit-db /data &&   chown mambauser:mambauser /gambit-db /data                                                                                   0.2s 
 => [app_base 6/6] WORKDIR /data                                                                                                                                                0.0s 
 => [app 1/4] RUN apt-get update && apt-get install -y --no-install-recommends procps &&     rm -rf /var/lib/apt/lists/*                                                        4.6s 
 => [app 2/4] ADD --chown=mambauser:mambauser https://storage.googleapis.com/jlumpe-gambit/public/databases/refseq-curated/1.0/gambit-refseq-curated-1.0.gdb /gambit-db/        0.1s 
 => [app 3/4] ADD --chown=mambauser:mambauser https://storage.googleapis.com/jlumpe-gambit/public/databases/refseq-curated/1.0/gambit-refseq-curated-1.0.gs /gambit-db/        12.1s 
 => [app 4/4] WORKDIR /data                                                                                                                                                     0.0s 
 => [test 1/2] COPY test.sh .                                                                                                                                                   0.0s 
 => [test 2/2] RUN bash test.sh                                                                                                                                                 7.6s 
 => exporting to image                                                                                                                                                        106.9s 
 => => exporting layers                                                                                                                                                        79.1s
 => => exporting manifest sha256:2edfcb507132b49b28ca3730b787d433c0f84c226317b6223b1f4fe323dd4122                                                                               0.0s 
 => => exporting config sha256:ae1d8ed139a8df85247ec6f3a7ed427148beba60de1175eed90bc82e7008f9e3                                                                                 0.0s 
 => => exporting attestation manifest sha256:25ea84f620bc48f441b87b568093f8d6ad1188f318408ff687106790646350b5                                                                   0.0s 
 => => exporting manifest list sha256:17d658fbca0a2b91ce00812ee0589584b9839330b9962c0a36a6f16dfd393dfb                                                                          0.0s 
 => => naming to docker.io/library/gambit:1.1.0                                                                                                                                 0.0s
 => => unpacking to docker.io/library/gambit:1.1.0           

@erinyoung
Contributor

This image has a lot of compiler software in it. Compilers are rarely needed at run time, so we can reduce the size of the image if they are only installed in the stage where they are used.
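One possible way to do this, sketched under several assumptions: build a GAMBIT wheel in a builder stage that has the compilers, then install only that wheel into a runtime stage that never sees them. The snippet below writes such a Dockerfile to a scratch directory for inspection. The file names env-build.yaml and env-runtime.yaml are hypothetical (the PR has a single env.yaml), and whether the wheel works against the runtime packages would need to be confirmed by the test stage.

```shell
# Write a hypothetical two-stage Dockerfile to a scratch directory.
# env-build.yaml and env-runtime.yaml are assumed names, not files in this PR.
sketch_dir="$(mktemp -d)"

cat > "$sketch_dir/Dockerfile" <<'EOF'
# Stage 1: compilers and build requirements live only here.
FROM mambaorg/micromamba:2.5.0 AS builder
COPY --chown=mambauser:mambauser env-build.yaml /tmp/env-build.yaml
RUN micromamba install -y -n base -f /tmp/env-build.yaml && \
    micromamba clean --all --yes
# Activate the base environment so pip is on PATH in RUN steps.
ARG MAMBA_DOCKERFILE_ACTIVATE=1
RUN pip wheel --no-deps --wheel-dir /tmp/wheels \
    https://github.com/jlumpe/gambit/archive/refs/tags/v1.1.0.tar.gz

# Stage 2: runtime packages only, plus the prebuilt wheel.
FROM mambaorg/micromamba:2.5.0 AS app
COPY --chown=mambauser:mambauser env-runtime.yaml /tmp/env-runtime.yaml
RUN micromamba install -y -n base -f /tmp/env-runtime.yaml && \
    micromamba clean --all --yes
ARG MAMBA_DOCKERFILE_ACTIVATE=1
COPY --from=builder /tmp/wheels /tmp/wheels
RUN pip install --no-deps /tmp/wheels/*.whl
EOF

echo "Sketch written to $sketch_dir/Dockerfile"
```

The design choice is that nothing from the builder stage reaches the final image except the wheel, so the compiler packages add no size to the app stage.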

@erinyoung
Contributor

Closes #905

@cwoodside1278
Contributor Author

> Closes #905

So would you still like me to try updating the Dockerfile to remove the extra compiler software? I was already working on it.

@erinyoung
Contributor

Yes! I'm sorry for the confusion. I'm just linking one of the issues to this PR, so that the issue closes when this PR gets merged.

@cwoodside1278
Contributor Author

I made some edits to the Dockerfile (that I haven't pushed here yet) that brought the image down by about 50MB:

gambit:1.1.0         17d658fbca0a       4.61GB         1.32GB        
gambit:1.1.0v2       fd2b91061e35       4.44GB         1.27GB 

That is small, but better. I think most of the size has to do with the conda env and then the databases. I could mount the databases as a volume at runtime?
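The two size columns in the listing above shrink by different amounts. A quick arithmetic check, assuming both columns are reported in decimal GB (1 GB = 1000 MB), shows that the quoted 50MB matches the second column, while the first column drops by about 170MB. `size_diff_mb` is a hypothetical helper name.

```shell
# Difference between two sizes given in GB, printed in whole MB.
# Assumes decimal units (1 GB = 1000 MB); size_diff_mb is a hypothetical helper.
size_diff_mb() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.0f\n", (before - after) * 1000 }'
}

size_diff_mb 1.32 1.27   # second size column: prints 50
size_diff_mb 4.61 4.44   # first size column: prints 170
```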

@cwoodside1278
Contributor Author

Never mind! I forgot that mounting would require the databases to be downloaded to the host machine first, which isn't practical.

@erinyoung
Contributor

Tests worked

#23 [test 2/2] RUN bash test.sh
#23 0.733 gambit, version 1.1.0
#23 0.890 --2026-03-13 18:26:52--  https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/240/185/GCF_000240185.1_ASM24018v2/GCF_000240185.1_ASM24018v2_genomic.fna.gz
#23 0.900 Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.31, 130.14.250.7, 2607:f220:41e:250::31, ...
#23 0.917 Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.31|:443... connected.
#23 0.925 HTTP request sent, awaiting response... 200 OK
#23 0.933 Length: 1679403 (1.6M) [application/x-gzip]
#23 0.934 Saving to: ‘GCF_000240185.1_ASM24018v2_genomic.fna.gz’
#23 0.934 
#23 0.934      0K .......... .......... .......... .......... ..........  3% 10.5M 0s
#23 0.939     50K .......... .......... .......... .......... ..........  6% 9.24M 0s
#23 0.944    100K .......... .......... .......... .......... ..........  9% 13.1M 0s
#23 0.948    150K .......... .......... .......... .......... .......... 12% 9.32M 0s
#23 0.953    200K .......... .......... .......... .......... .......... 15% 16.1M 0s
#23 0.956    250K .......... .......... .......... .......... .......... 18% 17.2M 0s
#23 0.959    300K .......... .......... .......... .......... .......... 21% 12.0M 0s
#23 0.963    350K .......... .......... .......... .......... .......... 24% 6.96M 0s
#23 0.970    400K .......... .......... .......... .......... .......... 27% 15.6M 0s
#23 0.973    450K .......... .......... .......... .......... .......... 30% 17.4M 0s
#23 0.976    500K .......... .......... .......... .......... .......... 33% 17.3M 0s
#23 0.979    550K .......... .......... .......... .......... .......... 36% 18.2M 0s
#23 0.981    600K .......... .......... .......... .......... .......... 39% 17.6M 0s
#23 0.984    650K .......... .......... .......... .......... .......... 42% 19.5M 0s
#23 0.987    700K .......... .......... .......... .......... .......... 45% 25.5M 0s
#23 0.989    750K .......... .......... .......... .......... .......... 48% 28.4M 0s
#23 0.990    800K .......... .......... .......... .......... .......... 51% 18.9M 0s
#23 0.993    850K .......... .......... .......... .......... .......... 54% 18.0M 0s
#23 0.996    900K .......... .......... .......... .......... .......... 57% 45.0M 0s
#23 0.997    950K .......... .......... .......... .......... .......... 60% 23.2M 0s
#23 0.999   1000K .......... .......... .......... .......... .......... 64% 18.4M 0s
#23 1.002   1050K .......... .......... .......... .......... .......... 67% 32.1M 0s
#23 1.003   1100K .......... .......... .......... .......... .......... 70% 27.5M 0s
#23 1.005   1150K .......... .......... .......... .......... .......... 73% 20.2M 0s
#23 1.007   1200K .......... .......... .......... .......... .......... 76% 18.8M 0s
#23 1.010   1250K .......... .......... .......... .......... .......... 79% 34.9M 0s
#23 1.012   1300K .......... .......... .......... .......... .......... 82% 31.1M 0s
#23 1.013   1350K .......... .......... .......... .......... .......... 85% 20.4M 0s
#23 1.015   1400K .......... .......... .......... .......... .......... 88% 32.5M 0s
#23 1.017   1450K .......... .......... .......... .......... .......... 91% 38.6M 0s
#23 1.018   1500K .......... .......... .......... .......... .......... 94% 27.1M 0s
#23 1.020   1550K .......... .......... .......... .......... .......... 97% 46.4M 0s
#23 1.021   1600K .......... .......... .......... ..........           100% 23.2M=0.09s
#23 1.023 
#23 1.023 2026-03-13 18:26:52 (18.1 MB/s) - ‘GCF_000240185.1_ASM24018v2_genomic.fna.gz’ saved [1679403/1679403]
#23 1.023 
#23 2.990 Parsing input
#23 3.046 Calculating distances
#23 5.932 Classifying
#23 6.285 GCF_000240185.1_ASM24018v2_genomic,Klebsiella pneumoniae,species,573,0.4691092073917389,0.0,[GCF_000240185.1] Klebsiella pneumoniae subsp. pneumoniae HS11286 (enterobacteria),,,,
#23 DONE 6.3s

Comment on lines +5 to +7
Full documentation: https://gambit-genomics.readthedocs.io/en/latest/

GAMBIT (Genomic Approximation Method for Bacterial Identification and Tracking) is a tool for rapid taxonomic identification of microbial pathogens. It uses an extremely efficient genomic distance metric along with a curated database of approximately 50,000 reference genomes (derived from NCBI RefSeq) to identify unknown bacterial genomes within seconds.
Contributor

Could you include all the dependencies installed with micromamba?

Contributor Author

Okay I also just added those!

@cwoodside1278
Contributor Author

I just merged the app stage! @erinyoung
