@jayavenkatesh19
Contributor

Towards #582

Has instructions to build a custom Docker image for RAPIDS libraries on conda and pip:

  • conda uses cuda-base as the base image and installs all the libraries inside a conda environment
  • pip uses cuda-runtime as the base image and installs all the libraries inside a pip virtual environment
  • Users have the option to choose what libraries to include in the Docker image by editing the env.yaml for conda and requirements.txt for pip.
  • Follows the pattern of base and notebook images, giving the user the choice of using an IPython kernel or a JupyterHub server.
  • Has instructions for build customization through arguments, and extending to include libraries needed by the user.

Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
@jayavenkatesh19 jayavenkatesh19 self-assigned this Aug 8, 2025
@jayavenkatesh19 jayavenkatesh19 requested a review from a team as a code owner August 8, 2025 01:04
Member

@jacobtomlinson jacobtomlinson left a comment

This is a great start! I have a few high-level thoughts.

There are two personas I want to address in #582:

  • Folks who think our kitchen-sink images are too big and want to build their own cut-down versions.
  • Folks who have an existing container image and want to add RAPIDS to it.

I think this draft of the page satisfies the first case, but not the second. This is ok, but we should either update this page to try and satisfy both, or add another page specifically for the second case.

If we look at this page through the lens of Diátaxis it falls somewhere between a tutorial and a guide. It feels like enough of a cookbook recipe that maybe it is a guide. But it also covers a heck of a lot of customization options that tip it towards a tutorial.

If it's a guide then it's way too complicated. There are so many options and opinionated customization points. We are providing a template and then explaining how to use the template. We aren't explaining what the template is actually doing. I think if we want this to be a guide we need to simplify heavily. How many lines from the Dockerfile can we remove? Do we need build args? Do we need multi-stage builds?
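
For a sense of scale, a stripped-down pip variant with no build args and no multi-stage build might look like the sketch below. This is not the Dockerfile from this PR: the base image tag, the `/opt/venv` path, the `Dockerfile.minimal` filename, and the choice of `cudf-cu12` as the only package are all assumptions made for illustration.

```shell
# Write a hypothetical single-stage Dockerfile with hard-coded versions.
cat > Dockerfile.minimal <<'EOF'
# Assumed base image tag; the PR describes cuda-runtime as the pip base.
FROM nvidia/cuda:12.9.1-runtime-ubuntu24.04

RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-venv \
    && rm -rf /var/lib/apt/lists/*

# Keep packages in a virtual environment, as the pip flavor of the page does.
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# cudf-cu12 is just an example; list whichever RAPIDS packages you need.
RUN pip install --no-cache-dir --extra-index-url=https://pypi.nvidia.com cudf-cu12
EOF

# Build it (requires Docker; the tag name is a placeholder):
#   docker build -f Dockerfile.minimal -t rapids-pip-minimal .
```

Hard-coding the versions keeps the file readable top to bottom, at the cost of editing it by hand for each new combination.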

We also don't necessarily demonstrate at the end that we have achieved the goals we started with. We state that images will be smaller, and contain a more flexible selection of packages. Let's prove it. Can we show container image size comparisons? Can we count the number of installed packages vs the official images?
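
One possible way to gather those numbers is sketched below. The helper names are hypothetical, the image names are whatever you have built or pulled locally, and `pip_package_count` assumes `pip` is on the image's `PATH`.

```shell
# Convert a byte count to whole MiB (integer division).
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Print the size of a local image in MiB, as reported by Docker.
image_size_mib() {
  local bytes
  bytes=$(docker image inspect --format '{{.Size}}' "$1") || return 1
  bytes_to_mib "$bytes"
}

# Count the Python packages that pip can see inside an image.
pip_package_count() {
  docker run --rm "$1" pip list --format=freeze | wc -l
}

# Usage (requires Docker; substitute your own and the official image names):
#   image_size_mib rapids-pip-base
#   pip_package_count rapids-pip-base
```

Running both helpers against the custom image and an official image would give a side-by-side comparison for the page.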

I also want to try and tackle some more explanation and understanding. As a user who is trying to build custom images there are a bunch of unknowns. Maybe this should live on a separate page. I want to answer things like:

  • What is in the nvidia/cuda images? What do they add over standard ubuntu?
  • What does the docker run --gpus flag actually add in terms of libraries and binaries? nvidia-smi becomes magically available, which means it isn't in the image, so where does it come from? What else gets mounted in? Are system libraries mounted in?
  • What is the minimum an image needs to run CUDA applications? CUDA toolkit? Something from the drivers? Anything else?
  • You start with the nvidia/cuda images and add miniforge. Can we start with the miniforge images and add CUDA? Or start from something else? If not, why not?
  • What is the difference between the conda packages and pip packages? It mentions that conda packages have the runtime components. Why?

Reading this page brings a lot of these questions to mind. It's ok not to directly address them here, but maybe we need an FAQ page with supporting information that we can link to from here.
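
As a starting point for the `--gpus` question, the following sketch lists NVIDIA-related files in a plain Ubuntu container with and without the flag, then shows what only appears with it. The helper names, the `ubuntu:24.04` image, and the search patterns are assumptions; the output depends on the host driver and the NVIDIA Container Toolkit setup.

```shell
# List files under /usr whose names mention nvidia or libcuda.
# Any arguments are passed to docker run as extra flags, e.g. --gpus all.
list_gpu_files() {
  docker run --rm "$@" ubuntu:24.04 \
    sh -c 'find /usr -name "*nvidia*" -o -name "libcuda*" 2>/dev/null | LC_ALL=C sort'
}

# Print the lines that appear only in the second (sorted) file.
only_in_second() {
  LC_ALL=C comm -13 "$1" "$2"
}

# Usage (requires Docker and a GPU host):
#   list_gpu_files > without.txt
#   list_gpu_files --gpus all > with.txt
#   only_in_second without.txt with.txt
```

Anything printed by the last command was injected at container start rather than shipped in the image.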

Comment on lines 171 to 178
### Available Build Arguments

| Argument | Default Value | Description | Example Values |
| ------------------ | ------------- | ------------------------------------------------------ | -------------------- |
| `CUDA_VER` | `12.9.1` | Sets the CUDA version for the base image and packages. | `12.0` |
| `PYTHON_VER` | `3.12` | Defines the Python version to install and use. | `3.11`, `3.10` |
| `LINUX_DISTRO` | `ubuntu` | The base Linux distribution. | `rockylinux9`, `cm2` |
| `LINUX_DISTRO_VER` | `24.04` | The version of the Linux distribution. | `20.04`, `24.04` |
Member

I understand, but I would consider build args an advanced feature of building containers that may be unfamiliar to folks. Build args are great if you're building a whole matrix of images, but most folks just want to build one.

I think most people would be comfortable setting these in the Dockerfile directly. Maybe setting them as an ENV would be better, then make a note that if folks want to build a whole matrix they could use build args instead.

This is just a gut feeling, and a drive to keep things simple. Happy for further discussion on this.
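
To illustrate the matrix case where build args pay off, here is a dry-run sketch that only prints the `docker build` command for each combination. `CUDA_VER` and `PYTHON_VER` come from the table above; the `rapids-custom` tag scheme and the function name are hypothetical.

```shell
# Print one docker build command per CUDA/Python combination (dry run).
# $1: space-separated CUDA versions, $2: space-separated Python versions.
print_build_matrix() {
  for cuda in $1; do
    for py in $2; do
      echo "docker build --build-arg CUDA_VER=${cuda} --build-arg PYTHON_VER=${py} -t rapids-custom:cuda${cuda}-py${py} ."
    done
  done
}

# Example values taken from the build-argument table.
print_build_matrix "12.0 12.9.1" "3.11 3.12"
```

For a single image the loop collapses to one command, which is the case where setting the values directly in the Dockerfile is simpler.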

```
docker run --gpus all -it rapids-pip-base
```

2. **Test RAPIDS within IPython**
Member

You could tell them to install it at runtime. Don't include it in the image. Just tell them to install it with pip if they want to use it.

Member

@jacobtomlinson jacobtomlinson left a comment

This is looking great. Let's get this merged and then we can iterate further.

@jacobtomlinson jacobtomlinson merged commit b205997 into rapidsai:main Aug 20, 2025
2 checks passed