Description
Opening this issue to track the discussion of the cast operator in PR #478.
(raised by @wacky6 in Chromium CL-5056249 review)
Casting between fp <-> int or fp <-> fp is well understood. But what about the casting behavior for fp / int <-> uint?
Would -1.23 fp32 cast to 0 or 255 uint8?
The spec PR says cast is implementation-defined, which isn't ideal. We should at least describe what callers should expect.
> what about the casting behavior for fp / int <-> uint?
That's a trickier case for conformance: casting from a wider range into a narrower one, where different hardware gives different results. On CPU, using SSE vs the classic FPU could return different results. On GPU, you could get different results depending on whether your GPU supports typed UAVs or native structured UAVs. On NPU, I don't even know yet.
> Would -1.23 fp32 cast to 0 or 255 uint8?
I can say that locally, for -1.0f -> uint8, I get 255 on CPU via C++ static_cast<uint8_t>, but 0 on my GPU: -1.0f is mapped to int32_t -1, which then clamps to [0, 255] when written out to the typed UAV (since uint8_t is not a natively supported type within HLSL). For -1.0f -> uint16_t it's a similar story: 0xFFFF on CPU but 0 on GPU. Though if I tried this on a GPU with D3D12_FEATURE_DATA_D3D12_OPTIONS4::Native16BitShaderOpsSupported true, I might well get 0xFFFF instead.
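To make the two paths concrete, here is a small C++ sketch that reproduces both behaviors on the host. The GPU clamp-on-store is emulated with std::clamp; the direct cast is formally undefined behavior in C++ for out-of-range inputs, so the 255 is an observed x86 result, not a guarantee:

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>

int main() {
    float v = -1.0f;

    // Direct narrowing conversion, as on the CPU path above. Formally UB in
    // C++ for out-of-range values; on x86 the compiler typically converts
    // through int32 and keeps the low byte, yielding the 255 observed above.
    uint8_t cpu_style = static_cast<uint8_t>(v);

    // The GPU path described above: convert to int32 first, then clamp to
    // [0, 255] on store to the typed UAV (emulated here with std::clamp,
    // since uint8 is not a native HLSL type).
    int32_t wide = static_cast<int32_t>(v);                            // -1
    uint8_t gpu_style =
        static_cast<uint8_t>(std::clamp<int32_t>(wide, 0, 255));      // 0

    std::cout << +cpu_style << " vs " << +gpu_style << "\n";  // e.g. "255 vs 0"
}
```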
So, if we require an always-consistent answer in the spec for negative float -> uint cases, then we'd need some intermediate casts; a sketch of that approach follows below. Surprisingly though, this issue evidently hasn't come up so far in the DML EP. Want to open an issue for it?
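For illustration, a minimal sketch of what such an intermediate-cast (saturating) definition could look like. The helper cast_f32_to_u8 and its NaN policy are hypothetical, not anything the spec currently requires:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>

// Hypothetical helper illustrating the "intermediate casts" idea: clamp into
// the destination range (and pick a fixed NaN policy) before truncating, so
// every backend produces the same answer.
uint8_t cast_f32_to_u8(float v) {
    if (std::isnan(v)) return 0;       // assumed NaN policy; the spec would decide
    v = std::clamp(v, 0.0f, 255.0f);   // saturate into the representable range
    return static_cast<uint8_t>(v);    // in-range conversion is well defined
}

int main() {
    std::cout << +cast_f32_to_u8(-1.23f) << "\n";  // 0, on every backend
    std::cout << +cast_f32_to_u8(300.5f) << "\n";  // 255
}
```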
Type casting from floating-point to signed/unsigned integer is a complex process, and one without a clear industry standard, because the cast is both lossy and range-dependent. It is generally treated as "undefined" behavior whose outcome may depend on a number of runtime factors, including hardware support. This is part of the reason why no other framework to date attempts to define it concretely; they all leave part of it implementation-dependent. The same applies to WebNN.
Another open question: if the behavior depends on hardware internals, would it cause fingerprinting issues?