Support simultaneous ptx and lib building (multiple builds in a single builder) #5
Conversation
This is a different solution: it makes the builder reusable instead of using multiple thread pools as in #3.

@ivarflakstad Hi Ivar, are you able to review this PR? I believe the PR at huggingface/candle#3221 depends on this.
```diff
 authors = ["Nicolas Patry <patry.nicolas@protonmail.com>"]
 name = "bindgen_cuda"
-version = "0.1.5"
+version = "0.1.7"
```
| version = "0.1.7" | |
| version = "0.1.5" |
We can increment the version in a separate PR when creating a new release.
```rust
    true
};
let ccbin_env = std::env::var("NVCC_CCBIN");
let nvcc_binary = if std::path::Path::new("/usr/local/cuda/bin/nvcc").exists() {
```
Nit: could create a utility à la get_nvcc_binary and avoid the repetition.
This is horrible. Let users set up their own path. This location is hardcoded to specific Linux distributions; it doesn't belong here.
If you really want this, allow environment overrides. That's pretty bad too, but at least it won't be hardcoded.
> This is horrible. Let users set up their own path. This location is hardcoded to specific Linux distributions; it doesn't belong here.
> If you really want this, allow environment overrides. That's pretty bad too, but at least it won't be hardcoded.
This is intended as a fallback to the default nvcc path. At the moment, bindgen_cuda can panic even when nvcc is available in the standard installation location (/usr/local/cuda/bin).
Given that, we either provide a reasonable fallback here, or require every user to manually set and hardcode the path in their own environment. A get_nvcc_binary helper may be needed, as suggested by @ivarflakstad.
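A minimal sketch of such a helper, not taken from this PR: the NVCC_PATH override name is hypothetical, and the lookup order (env override, default install location, PATH) is one reasonable choice.

```rust
use std::path::PathBuf;

// Hypothetical helper consolidating nvcc discovery. NVCC_PATH is an
// illustrative env var name, not an existing bindgen_cuda variable.
fn get_nvcc_binary() -> PathBuf {
    // 1. An explicit environment override wins.
    if let Ok(path) = std::env::var("NVCC_PATH") {
        return PathBuf::from(path);
    }
    // 2. Fall back to the standard installation location.
    let default = std::path::Path::new("/usr/local/cuda/bin/nvcc");
    if default.exists() {
        return default.to_path_buf();
    }
    // 3. Otherwise rely on `nvcc` being resolvable on PATH.
    PathBuf::from("nvcc")
}
```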
```diff
-pub fn build_ptx(self) -> Result<Bindings, Error> {
-    let cuda_root = self.cuda_root.expect("Could not find CUDA in standard locations, set it manually using Builder().set_cuda_root(...)");
+pub fn build_ptx(&self) -> Result<Bindings, Error> {
+    let mut cuda_include_dir = PathBuf::from("/usr/local/cuda/include");
```
This doesn't belong here.
```rust
/// Force to use a given compute capability
pub fn set_compute_cap(&mut self, cap: usize) {
    self.compute_cap = Some(cap);
}

pub fn get_compute_cap(&self) -> Option<usize> {
    self.compute_cap
}
```
They already exist, remove this.
Sorry, this shouldn't be here; I was pushing to my own forked repo and forgot it had been pushed for a merge. But these interfaces meet an actual need: we haven't exposed compute_cap (initialized to a default) to users, and sometimes they need a specific CUDA compute capability to build certain kernels, instead of relying on an env var or nvidia-smi. For example, flash attention v2 kernels need to be compiled with sm_90 or below even when other kernels are compiled with sm_120+. Related issue: guoqingbao/vllm.rs#194
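A sketch of that usage, assuming the accessors above are merged (the capability value and builder name are illustrative):

```rust
fn main() {
    // Illustrative: pin flash attention v2 kernels to sm_90 while other
    // kernels keep the default (detected) compute capability.
    let mut fa_builder = bindgen_cuda::Builder::default();
    fa_builder.set_compute_cap(90);
    assert_eq!(fa_builder.get_compute_cap(), Some(90));
}
```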
````diff
 /// println!("cargo:rustc-link-lib=flash");
 /// ```
-pub fn build_lib<P>(self, out_file: P)
+pub fn build_lib<P>(&self, out_file: P)
````
There were reasons for consuming the builder...
Everything already works; instead of making build_lib(&self), you should implement Clone and clone the builder directly.
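A sketch of the reviewer's alternative, assuming Builder implemented Clone (it does not in the released crate); the consuming signatures would stay unchanged:

```rust
fn main() {
    let builder = bindgen_cuda::Builder::default();
    // The first build consumes a clone, leaving the original intact.
    builder.clone().build_lib("libflash.a");
    // The original builder is still available for a second build.
    let bindings = builder.build_ptx().unwrap();
    bindings.write("src/lib.rs").unwrap();
}
```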
The current version does not support multiple builder instances, and a builder instance cannot be reused for another build (as also mentioned in #2). I have fixed this by making the builder reusable, e.g., building ptx after building a lib.
The tested use case:
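A minimal build.rs sketch of that use case under this PR's by-reference API (the library name and output paths are illustrative):

```rust
// build.rs — one builder, two builds: a static lib, then PTX bindings.
fn main() {
    let builder = bindgen_cuda::Builder::default();

    // First build: compile the kernels into a static library and link it.
    builder.build_lib("libflash.a");
    println!("cargo:rustc-link-lib=flash");

    // Second build, reusing the same builder: emit PTX and write bindings.
    let bindings = builder.build_ptx().unwrap();
    bindings.write("src/lib.rs").unwrap();
}
```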