63 changes: 62 additions & 1 deletion main/acle.md
@@ -465,6 +465,8 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin

* Added feature test macro for FEAT_SSVE_FEXPA.
* Added feature test macro for FEAT_CSSC.
* Added [**Alpha**](#current-status-and-anticipated-changes) support
for Brain 16-bit floating-point vector multiplication intrinsics.

### References

@@ -2122,6 +2124,20 @@ are available. Specifically, if this macro is defined to `1`, then:
for the FEAT_SME_B16B16 instructions and if their associated intrinsics
are available.

#### Brain 16-bit floating-point vector multiplication support

This section is in
[**Alpha** state](#current-status-and-anticipated-changes) and might change or be
extended in the future.

`__ARM_FEATURE_SVE_BFSCALE` is defined to `1` if there is hardware
support for the SVE BF16 vector multiplication extensions and if the
associated ACLE intrinsics are available.

See [Half-precision brain
floating-point](#half-precision-brain-floating-point) for details
of half-precision brain floating-point types.

### Cryptographic extensions

#### “Crypto” extension
@@ -2634,6 +2650,7 @@ be found in [[BA]](#BA).
| [`__ARM_FEATURE_SVE`](#scalable-vector-extension-sve) | Scalable Vector Extension (FEAT_SVE) | 1 |
| [`__ARM_FEATURE_SVE_B16B16`](#non-widening-brain-16-bit-floating-point-support) | Non-widening brain 16-bit floating-point intrinsics (FEAT_SVE_B16B16) | 1 |
| [`__ARM_FEATURE_SVE_BF16`](#brain-16-bit-floating-point-support) | SVE support for the 16-bit brain floating-point extension (FEAT_BF16) | 1 |
| [`__ARM_FEATURE_SVE_BFSCALE`](#brain-16-bit-floating-point-vector-multiplication-support) | SVE support for the 16-bit brain floating-point vector multiplication extension (FEAT_SVE_BFSCALE) | 1 |
| [`__ARM_FEATURE_SVE_BITS`](#scalable-vector-extension-sve) | The number of bits in an SVE vector, when known in advance | 256 |
| [`__ARM_FEATURE_SVE_MATMUL_FP32`](#multiplication-of-32-bit-floating-point-matrices) | 32-bit floating-point matrix multiply extension (FEAT_F32MM) | 1 |
| [`__ARM_FEATURE_SVE_MATMUL_FP64`](#multiplication-of-64-bit-floating-point-matrices) | 64-bit floating-point matrix multiply extension (FEAT_F64MM) | 1 |
@@ -9374,6 +9391,26 @@ BFloat16 floating-point multiply vectors.
uint64_t imm_idx);
```

### SVE BFloat16 floating-point adjust exponent vectors instructions

The specification for SVE BFloat16 floating-point adjust exponent vectors instructions is in
[**Alpha** state](#current-status-and-anticipated-changes) and might change or be
extended in the future.

#### BFSCALE

BFloat16 floating-point adjust exponent vectors.

``` c
// Only if __ARM_FEATURE_SVE_BFSCALE != 0
svbfloat16_t svscale[_bf16]_m (svbool_t pg, svbfloat16_t zdn, svint16_t zm);
svbfloat16_t svscale[_bf16]_x (svbool_t pg, svbfloat16_t zdn, svint16_t zm);
svbfloat16_t svscale[_bf16]_z (svbool_t pg, svbfloat16_t zdn, svint16_t zm);
svbfloat16_t svscale[_n_bf16]_m (svbool_t pg, svbfloat16_t zdn, int16_t zm);
svbfloat16_t svscale[_n_bf16]_x (svbool_t pg, svbfloat16_t zdn, int16_t zm);
svbfloat16_t svscale[_n_bf16]_z (svbool_t pg, svbfloat16_t zdn, int16_t zm);
```

> **Review (feature guard):** Should be `__ARM_FEATURE_SVE_BFSCALE != 0 && __ARM_FEATURE_SVE2 != 0`.
>
> **Contributor:** I think we should only express direct feature dependencies here, and I don't see SVE2 in the requirements for the corresponding instructions, so this is okay the way it is. Also, you don't need SVE2 to run these even if you follow the dependency chain: what you need is either SVE2 or SME2.
>
> **Review (whitespace):** The space between the backticks and the `c`, and the leading whitespace before the closing backticks. Minor enough that it should not hold up the PR. LGTM.
>
> **Contributor:** The space between the backticks and `c` is used throughout the whole document, so I am not sure it is something we need to correct, but I can fix it if you prefer it without the space.

### SVE2.1 instruction intrinsics

The specification for SVE2.1 is in
@@ -11639,7 +11676,7 @@ Multi-vector floating-point fused multiply-add/subtract
__arm_streaming __arm_inout("za");
```

#### BFMLA. BFMLS, FMLA, FMLS (indexed)
#### BFMLA, BFMLS, FMLA, FMLS (indexed)

Multi-vector floating-point fused multiply-add/subtract

@@ -12732,6 +12769,30 @@ element types.
svint8x4_t svuzpq[_s8_x4](svint8x4_t zn) __arm_streaming;
```

#### BFMUL

Multi-vector BFloat16 floating-point multiply.

``` c
// Only if __ARM_FEATURE_SVE_BFSCALE != 0
svbfloat16x2_t svmul[_bf16_x2](svbfloat16x2_t zd, svbfloat16x2_t zm) __arm_streaming;
svbfloat16x2_t svmul[_single_bf16_x2](svbfloat16x2_t zd, svbfloat16_t zm) __arm_streaming;
svbfloat16x4_t svmul[_bf16_x4](svbfloat16x4_t zd, svbfloat16x4_t zm) __arm_streaming;
svbfloat16x4_t svmul[_single_bf16_x4](svbfloat16x4_t zd, svbfloat16_t zm) __arm_streaming;
```

> **Review (feature guard):** Should be `__ARM_FEATURE_SVE_BFSCALE != 0 && __ARM_FEATURE_SME2 != 0`.
>
> **Review (whitespace):** Trailing space after the opening backticks and a leading space before the closing backticks.

#### BFSCALE

BFloat16 floating-point adjust exponent vectors.

``` c
// Only if __ARM_FEATURE_SVE_BFSCALE != 0
svbfloat16x2_t svscale[_bf16_x2](svbfloat16x2t zdn, svint16x2_t zm) __arm_streaming;
svbfloat16x2_t svscale[_single_bf16_x2](svbfloat16x2_t zn, svint16_t zm) __arm_streaming;
svbfloat16x4_t svscale[_bf16_x4](svbfloat16x4_t zdn, svint16x4_t zm) __arm_streaming;
svbfloat16x4_t svscale[_single_bf16_x4](svbfloat16x4_t zn, svint16_t zm) __arm_streaming;
```

> **Review (feature guard):** Should be `__ARM_FEATURE_SVE_BFSCALE != 0 && __ARM_FEATURE_SME2 != 0`.

### SME2.1 instruction intrinsics

The specification for SME2.1 is in