-
Notifications
You must be signed in to change notification settings - Fork 191
Add option for canonical direct I/O layout #3108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
Creates an image with a memory layout that ensures that axis 0 is the most contiguous (fastest varying), followed by axis 1, then axis 2, and so on, up to the highest dimension. All axes involved in this ordering will have positive effective strides.
|
clang-tidy review says "All clean, LGTM! 👍" |
|
There's already the function One of the reasons for this more general solution is that it's quite common when working with 4D data to want data along the fourth direction to be contiguous. We don't care all that much about the spatial axes, its good to be able to manipulate them arbitrarily and that's exploited at image load but it's primarily making all data for a voxel contiguous that yields performance benefits. You might find that for what you're currently doing with 3D data you're seeking trivial spatial strides, whereas if you were to be trying to do something with 4D data on the GPU you'd instead be seeking the contiguous-along-nonspatial-axes solution. |
My main motivation to add a new function to
Would it also make sense to have something like a |
Yep, something like that makes sense. Mostly we use |
|
Unfortunately it's not that simple. The main priority in All this to say that requesting a specific set of strides is not guaranteed to give you those strides: it'll treat -1,+2,-3,+4 as equivalent to +1,-2,+3,-4. So we either need an extra method (or option to |
|
By the way, in response to this:
This already exists: https://github.com/MRtrix3/mrtrix3/blob/master/core/stride.h#L386-L408 |
|
Ok, so we could probably do something like this: Then change In Then, passing in a canonical symbolic list should do the job. If we want an even more general solution, I guess modifying |
|
@jdtournier @Lestropie I've modified the PR based on your feedback and suggestions. |
A new StrideInterpretation option is introduced to determining the target memory layout in with_direct_io. The default behaviour is preserved with StrideInterpretation::NearestMatch and the user can pass StrideInterpretation::Exact to use exact target symbolic layout.
57a0437 to
63e0342
Compare
No, I don't think it does. The first two functions highlighted ensure contiguity along one axis, whereas my description was specifically for contiguity along all non-spatial axes.
I don't think there should be any need for a second function argument to There's an existing (perhaps implicit?) capability within the strides handling that setting the stride to 0 in a list will, at the point at which the strides are actualised, be interpreted as "put these axes after all developer-specified axes, with the same relative ordering and sign as the input image". That should be adequate AFAICT; I think your proposal here is providing a parallel mechanism to existing capabilities. Say you have an input image that is [-3,+1,+2,+4,+5]. What I originally believed you wanted is something like the following two functions:
Is there a necessary scenario that isn't covered by either of these? |
|
The use case @daljit46 is specifically interested in here is getting images in & out of the GPU in an efficient manner, which means a single memory transfer. For that to work, we need to make sure the data are stored in the correct data type with exactly the strides specified (what @daljit46 referred to as 'canonical') - otherwise we'd need to handle arbitrary strides in the GPU shader, which seems likely to complicate things too much. The current infrastructure almost covers this, up to the guaranteeing the sign of the strides. What we would need is a call like: or something like that, where We can easily do the |
|
Yes, as Donald said, canonical here is used as : contiguous + strict axis order + positive signs only. Note for the specific use case of uploading images as textures to the GPU, the idea only needs to be applied to the 2D and 3D case (as rendering APIs natively support 2D and 3D textures only and higher order images would be ordinary buffers). However, even for the general nD case, having the canonical direct IO layout would be useful when the built-in MRtrix3 facilities aren't available (e.g. in GPU shaders) as I believe it is the most algorithmically straightforward layout to deal with. |
OK, I wasn't aware of the failure to enforce stride sign despite a developer very explicitly providing manual strides. I think that's the origin of the divergence. If that's the case though, I'm still not sure that the proposed interface is the right approach. Having a singular data structure called "
, and your issue is that you need to achieve 2. but the API is only facilitating 3., then arguably these warrant being stored as different structures. The structure storing your strides should intrinsically reflect what they encode / how they are to be imposed, and not be dependent on conditional interpretation. 3 is just a permutation and should enforce non-negativity. Further, there may be elsewhere in the code base where there is a discrepancy between a developer's expectations of whether the signs of strides matters and how those data are utilised. Certainly that wasn't obvious to me, so this could well be a source of problems elsewhere... |
I somewhat agree with this, but I would argue that Image with_direct_io(Stride::List with_strides = Stride::List());At the site of declaration, the doc comment says:
One would expect that if I pass in a given |
If that's the case, then perhaps rather than adding a flag that modulates whether Or instead, check the contents of
This is probably the more direct redress of the issue you're encountering, and doesn't require a new enum. You just provide as input a |
63e0342 to
88cb9e5
Compare
|
clang-tidy review says "All clean, LGTM! 👍" |
70031c3 to
6bf4cec
Compare
88cb9e5 to
63e0342
Compare
|
clang-tidy review says "All clean, LGTM! 👍" |
This PR introduces
Image::with_canonical_axes_order_layout(), a utility method to obtain anImagewith its data preloaded into RAM in a standardized, contiguous layout. This layout ensures that axis 0 is the most contiguous (fastest varying), followed by axis 1, then axis 2, and so on, up to the highest dimension. All axes involved in this ordering will have positive effective strides. This is a common packed layout suitable for interoperability with GPU APIs (where axis 0 often maps to width, axis 1 to height, axis 2 to depth) and possibly other external libraries.Checking the implementation of
Image::with_direct_io, I believe that the function should also ensure that the resulting stride has all positive directions.I'm slightly unsure about the naming though. I called the function
canonicalbut I admit that may be confusing (the other one I was debating wasstandard).