(Un)signedness of the base type is not taken into account in min/max and some comparison primitives. #4

We always use one of the `uintX_t` types from the `<cstdint>` header as the base type of the processing style when specializing the TVL primitives, and we usually use the matching `_mm*_epiX` intrinsic (for Intel's SIMD extensions). While this is fine for almost all primitives, it yields wrong results for certain inputs in primitives where the (un)signedness of the base type actually matters, namely those that must determine which of two vector elements is less/greater than the other. These primitives are (at least): `min`, `max`, `less`, `greater`, `lessequal`, `greaterequal`.

For instance, given the two input elements `0xffffffffffffffff` and `0x0000000000000001`, the current AVX-512 specialization of the `min` primitive interprets them as `-1` and `1`, respectively, and thus identifies the former as the minimum, which is correct only for signed `int64_t`. However, since the primitive is a specialization for the `uint64_t` base type, the inputs should be interpreted as `(2^64)-1` and `1`, so the latter is the minimum.
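The discrepancy can be reproduced without any SIMD code. The following is only a scalar sketch: the helper names `min_as_unsigned` and `min_as_signed` are made up for illustration and are not part of the TVL. Both helpers receive the same 64-bit patterns and differ only in how they interpret them for the comparison.

```cpp
#include <cstdint>

// Hypothetical helper: compares the bit patterns as unsigned values,
// which is what a uint64_t specialization is expected to do.
constexpr uint64_t min_as_unsigned(uint64_t a, uint64_t b) {
    return a < b ? a : b;
}

// Hypothetical helper: compares the same bit patterns as signed values,
// mimicking what an _epi64-style comparison does.
constexpr uint64_t min_as_signed(uint64_t a, uint64_t b) {
    return static_cast<int64_t>(a) < static_cast<int64_t>(b) ? a : b;
}

// The two inputs from the example above.
constexpr uint64_t kAllOnes = 0xffffffffffffffffULL;
constexpr uint64_t kOne     = 0x0000000000000001ULL;

// Unsigned interpretation: 1 is the minimum.
static_assert(min_as_unsigned(kAllOnes, kOne) == kOne, "unsigned min");
// Signed interpretation: 0xff..ff is read as -1 and wins.
static_assert(min_as_signed(kAllOnes, kOne) == kAllOnes, "signed min");
```

The two results differ exactly when the MSB is set in one input and not in the other.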

Please note that we really need the specialization for `uintX_t`, since we exclusively work with unsigned integers in MorphStore at the moment.

Please also note that the current state of things also incurs an inconsistency between the scalar and the vectorized primitives, since the specializations for a scalar processing style always return the correct result.

The solution would be to use the `_mm*_epu` intrinsics, e.g., `_mm512_min_epu64` instead of `_mm512_min_epi64`, whenever such an intrinsic is available. When no intrinsic for unsigned elements is available, we require an efficient workaround. For instance, there is no `_mm_min_epu64` in SSE.
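One commonly used workaround, shown here only as a scalar sketch and not necessarily the one MorphStore should adopt, is to flip the most significant bit of both operands before a signed comparison. XOR-ing with `0x8000000000000000` maps the unsigned order onto the signed order, so a signed less-than on the flipped values gives the unsigned result. The names `kSignBit`, `less_unsigned_via_signed`, and `min_unsigned_via_signed` are hypothetical.

```cpp
#include <cstdint>

// Mask with only the most significant bit set.
constexpr uint64_t kSignBit = uint64_t{1} << 63;

// Unsigned "less than" built from a signed comparison only:
// flipping the MSB of both operands turns the unsigned order into the signed order.
constexpr bool less_unsigned_via_signed(uint64_t a, uint64_t b) {
    return static_cast<int64_t>(a ^ kSignBit) < static_cast<int64_t>(b ^ kSignBit);
}

// Unsigned minimum selected with the comparison above;
// the original (unflipped) operands are returned.
constexpr uint64_t min_unsigned_via_signed(uint64_t a, uint64_t b) {
    return less_unsigned_via_signed(a, b) ? a : b;
}

// The example from this issue now yields 1 as the minimum.
static_assert(min_unsigned_via_signed(0xffffffffffffffffULL, 1) == 1, "unsigned min");
```

In a vectorized SSE variant, the same idea could presumably be expressed with `_mm_xor_si128` for the flip and the signed `_mm_cmpgt_epi64` (SSE4.2) for the comparison, followed by a blend to select the elements. Whether the two extra XORs are cheap enough would need to be measured.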

Furthermore, we should keep the current specializations of the primitives mentioned above, but correct them such that they assume `intX_t` as the base type, rather than `uintX_t`.

Finally, this issue is worth fixing, since we often try all possible bit widths when experimenting with compression. That is, we really encounter data elements with the MSB set. In fact, some micro benchmarks in the Engine repo need to circumvent this issue, which they should not need to do.
