Consistent OOB behaviour for wgpu#296
| // both wgpu and Dawn handle this by either returning dummy data or clamping the index | ||
| // to valid bounds. This means it's harmless to use in a select. | ||
| let out_item = out.item(); | ||
| value = format!("select({out_item}(0), {value}, {ind} < {len})"); |
Not sure this is going to work with vectorization.
Just with the syntax of vec4(0)? Or more that indexing needs to take vectorization into account? I'm really not sure how vectorization works; if you could add a test for it, that'd be amazing!
|
|
||
| let length = match lhs.has_buffer_length() { | ||
| true => cube::Metadata::BufferLength { var: lhs }, | ||
| false => cube::Metadata::Length { var: lhs }, | ||
| }; | ||
|
|
||
| instructions.push(self.compile_metadata(length, Some(array_len))); | ||
| instructions.push(wgsl::Instruction::CheckedIndex { | ||
| len: self.compile_variable(array_len), | ||
| lhs: self.compile_variable(lhs), | ||
| rhs: self.compile_variable(rhs), | ||
| out: self.compile_variable(out), | ||
| }); |
I don't think it's necessary right now, but I think having this implemented with a cpa kernel like CheckedIndexAssign would be easier.
I'm also not sure when we don't have BufferLength
+1 to cpa someday.
And yeah, this behaviour is copied from the CPP version. Personally I feel like that kind of thing should fail to compile, rather than creating a language with strange edge cases of UB. I'm not sure what kind of situation would leave you without a buffer length.
Currently, CubeCL relies on wgpu to have consistent behaviour for OOB indexing (read-0, write-discard). While this has been true so far, it's not true in Dawn on WebGPU. The WebGPU specification defines an OOB access as a "dynamic error" that might result in anything, including program termination. In practice it's not that severe, but some backends do behave differently, for example by clamping the index instead of discarding a write.
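To make the difference between the two behaviours concrete, here is a minimal CPU-side sketch in plain Rust. It is only an illustration: the function names are hypothetical and do not come from CubeCL, wgpu, or Dawn, and it assumes non-empty buffers.

```rust
/// Policy A (read-0): an out-of-bounds read yields zero.
fn read_zero(buf: &[u32], i: usize) -> u32 {
    buf.get(i).copied().unwrap_or(0)
}

/// Policy A (write-discard): an out-of-bounds write is silently dropped.
fn write_discard(buf: &mut [u32], i: usize, v: u32) {
    if let Some(slot) = buf.get_mut(i) {
        *slot = v;
    }
}

/// Policy B (clamp): the index is clamped to the last valid element.
/// Assumes `buf` is non-empty.
fn read_clamped(buf: &[u32], i: usize) -> u32 {
    buf[i.min(buf.len() - 1)]
}

/// Policy B (clamp): an out-of-bounds write lands on the last element,
/// overwriting valid data. Assumes `buf` is non-empty.
fn write_clamped(buf: &mut [u32], i: usize, v: u32) {
    let last = buf.len() - 1;
    buf[i.min(last)] = v;
}
```

With a three-element buffer, writing at index 5 leaves the buffer untouched under policy A but overwrites the last element under policy B, which is why a kernel that relies on write-discard can produce wrong results on a clamping backend.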
This PR changes Cube to use an indexing mode much like cubecl-cpp, and inserts checks manually. For performance, I still do a read-oob-0 by using a `select()`, which hopefully compiles to a conditional move. This isn't quite correct, as the WebGPU spec allows any kind of behaviour when indexing OOB, but in practice it can't do much besides picking a random in-bounds index, so this is practically safe.

On Vulkan + SPIR-V we're relying on Vulkan robustness; this doesn't change anything there.
In the future we could also disable these checks on Vulkan with robustness when using WGSL (or on any other platform where this behaviour is guaranteed).
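As a rough sketch of the kind of WGSL the checked indexing mode emits, the helpers below build the guarded read and a guarded write as strings. The read mirrors the `select` format string in the diff. The write guard and both function names are assumptions made for illustration, not the PR's actual code.

```rust
/// Hypothetical helper: wraps an indexing expression in a WGSL `select`.
/// WGSL `select(f, t, cond)` yields `t` when `cond` is true, otherwise `f`,
/// so an out-of-bounds index produces a zero of the item type.
fn checked_read_expr(item_ty: &str, value: &str, index: &str, len: &str) -> String {
    format!("select({item_ty}(0), {value}, {index} < {len})")
}

/// Hypothetical helper: guards a store so that an out-of-bounds write is skipped.
/// This is an assumed shape for the write side, not taken from the PR.
fn checked_write_stmt(target: &str, value: &str, index: &str, len: &str) -> String {
    format!("if ({index} < {len}) {{ {target} = {value}; }}")
}
```

For example, `checked_read_expr("f32", "input[i]", "i", "arrayLength(&input)")` gives `select(f32(0), input[i], i < arrayLength(&input))`. Note that both operands of `select` are evaluated, so the OOB index is still computed; the guard only decides which result is kept, which is the caveat the description raises about the spec.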
This might unblock #211. As mentioned in tracel-ai/burn#2435, Metal had new issues when the OOB behaviour in wgpu changed.
Very much not my area so please have Genna sign this off!