This is in part enforced by the shape of the CNM scatter map: the map mandates that the last dimensions of the input have the same shape as the buffer. Since then, however, CNM scatter has been made bufferizable, and if the memref has a custom layout or strides, its elements may not be contiguous anyway.
For instance:
%3 = scf.for %arg6 = %c0 to %c1024 step %c32 iter_args(%arg7 = %alloc_0) -> (memref<1x128xi32>) {
  %subview_2 = memref.subview %arg0[%arg2, %arg6] [1, 32] [1, 1] : memref<1x1024xi32> to memref<1x32xi32, strided<[1024, 1], offset: ?>>
  %4 = upmem.alloc_dpus : !upmem.hierarchy<1x128x1>
  upmem.scatter %subview_2[132, 32, #map] onto %4 : memref<1x32xi32, strided<[1024, 1], offset: ?>> onto !upmem.hierarchy<1x128x1>
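Is this legal?
-> Yes, because the stride in the second dimension is 1. The first dimension has size 1, so its stride of 1024 is never applied, and the 32 elements of the subview are contiguous in memory.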
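
The contiguity argument can be checked mechanically. The following is only a sketch in Python, not the actual CNM or UPMEM verifier: the helper names `element_offsets` and `is_contiguous` are hypothetical, and the dynamic offset (`offset: ?`) is assumed to be 0, since it shifts all elements equally and does not affect contiguity.

```python
def element_offsets(shape, strides, offset=0):
    """Linear offsets (in elements) of all elements, in row-major order."""
    offsets = [offset]
    for size, stride in zip(shape, strides):
        # Expand each base offset along the current dimension.
        offsets = [base + i * stride for base in offsets for i in range(size)]
    return offsets


def is_contiguous(shape, strides):
    """True if consecutive elements in row-major order are adjacent in memory."""
    offs = element_offsets(shape, strides)
    return all(b - a == 1 for a, b in zip(offs, offs[1:]))


# The subview from the example: memref<1x32xi32, strided<[1024, 1]>>.
print(is_contiguous((1, 32), (1024, 1)))  # contiguous: outer dim has size 1
# A hypothetical 2x32 subview with the same strides skips 992 elements per row.
print(is_contiguous((2, 32), (1024, 1)))
```

Under these assumptions the 1x32 subview passes the check, while a subview with more than one row and the same strides would not, which is the case where a scatter that assumes contiguous input would need extra care.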