Add support for the Brain 16-bit floating-point vector multiplication (FEAT_SVE_BFSCALE) intrinsics #410
```diff
@@ -465,6 +465,8 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin
 * Added feature test macro for FEAT_SSVE_FEXPA.
 * Added feature test macro for FEAT_CSSC.
+* Added [**Alpha**](#current-status-and-anticipated-changes) support
+  for Brain 16-bit floating-point vector multiplication intrinsics.

 ### References
```
```diff
@@ -2122,6 +2124,20 @@ are available. Specifically, if this macro is defined to `1`, then:
 for the FEAT_SME_B16B16 instructions and if their associated intrinsics
 are available.

+#### Brain 16-bit floating-point vector multiplication support
+
+This section is in
+[**Alpha** state](#current-status-and-anticipated-changes) and might change or be
+extended in the future.
+
+`__ARM_FEATURE_SVE_BFSCALE` is defined to `1` if there is hardware
+support for the SVE BF16 vector multiplication extensions and if the
+associated ACLE intrinsics are available.
+
+See [Half-precision brain
+floating-point](#half-precision-brain-floating-point) for details
+of half-precision brain floating-point types.
+
 ### Cryptographic extensions

 #### “Crypto” extension
```
```diff
@@ -2634,6 +2650,7 @@ be found in [[BA]](#BA).
 | [`__ARM_FEATURE_SVE`](#scalable-vector-extension-sve) | Scalable Vector Extension (FEAT_SVE) | 1 |
 | [`__ARM_FEATURE_SVE_B16B16`](#non-widening-brain-16-bit-floating-point-support) | Non-widening brain 16-bit floating-point intrinsics (FEAT_SVE_B16B16) | 1 |
 | [`__ARM_FEATURE_SVE_BF16`](#brain-16-bit-floating-point-support) | SVE support for the 16-bit brain floating-point extension (FEAT_BF16) | 1 |
+| [`__ARM_FEATURE_SVE_BFSCALE`](#brain-16-bit-floating-point-vector-multiplication-support) | SVE support for the 16-bit brain floating-point vector multiplication extension (FEAT_SVE_BFSCALE) | 1 |
 | [`__ARM_FEATURE_SVE_BITS`](#scalable-vector-extension-sve) | The number of bits in an SVE vector, when known in advance | 256 |
 | [`__ARM_FEATURE_SVE_MATMUL_FP32`](#multiplication-of-32-bit-floating-point-matrices) | 32-bit floating-point matrix multiply extension (FEAT_F32MM) | 1 |
 | [`__ARM_FEATURE_SVE_MATMUL_FP64`](#multiplication-of-64-bit-floating-point-matrices) | 64-bit floating-point matrix multiply extension (FEAT_F64MM) | 1 |
```
````diff
@@ -9374,6 +9391,26 @@ BFloat16 floating-point multiply vectors.
                     uint64_t imm_idx);
 ```

+### SVE BFloat16 floating-point adjust exponent vectors instructions.
+
+The specification for SVE BFloat16 floating-point adjust exponent vectors instructions is in
+[**Alpha** state](#current-status-and-anticipated-changes) and might change or be
+extended in the future.
+
+#### BFSCALE
+
+BFloat16 floating-point adjust exponent vectors.
+
+``` c
+// Only if __ARM_FEATURE_SVE_BFSCALE != 0
+svbfloat16_t svscale[_bf16]_m (svbool_t pg, svbfloat16_t zdn, svint16_t zm);
+svbfloat16_t svscale[_bf16]_x (svbool_t pg, svbfloat16_t zdn, svint16_t zm);
+svbfloat16_t svscale[_bf16]_z (svbool_t pg, svbfloat16_t zdn, svint16_t zm);
+svbfloat16_t svscale[_n_bf16]_m (svbool_t pg, svbfloat16_t zdn, int16_t zm);
+svbfloat16_t svscale[_n_bf16]_x (svbool_t pg, svbfloat16_t zdn, int16_t zm);
+svbfloat16_t svscale[_n_bf16]_z (svbool_t pg, svbfloat16_t zdn, int16_t zm);
+```
+
 ### SVE2.1 instruction intrinsics

 The specification for SVE2.1 is in
````

Review comments on this hunk:

> **Reviewer:** Should be
>
> **Contributor:** I think we should only express direct feature dependencies here, and I don't see SVE2 in the requirements for the corresponding instructions, so I think this is okay the way it is. Also, you don't need SVE2 to run these even if you follow the dependency chain: what you need is either SVE2 or SME2.

> **Reviewer:** Whitespace
>
> **Contributor:** I am not sure what the whitespace issue is here. I don't see any trailing whitespace after the quotes?
>
> **Reviewer:** Leading whitespace at the start of the line (before the backticks). But as I said, it is minor and I don't care enough to hold up the PR.
````diff
@@ -11639,7 +11676,7 @@ Multi-vector floating-point fused multiply-add/subtract
                __arm_streaming __arm_inout("za");
 ```

-#### BFMLA. BFMLS, FMLA, FMLS (indexed)
+#### BFMLA, BFMLS, FMLA, FMLS (indexed)

 Multi-vector floating-point fused multiply-add/subtract
````
````diff
@@ -12732,6 +12769,30 @@ element types.
 svint8x4_t svuzpq[_s8_x4](svint8x4_t zn) __arm_streaming;
 ```

+#### BFMUL
+
+BFloat16 Multi-vector floating-point multiply
+
+``` c
+// Only if __ARM_FEATURE_SVE_BFSCALE != 0
+svbfloat16x2_t svmul[_bf16_x2](svbfloat16x2_t zd, svbfloat16x2_t zm) __arm_streaming;
+svbfloat16x2_t svmul[_single_bf16_x2](svbfloat16x2_t zd, svbfloat16_t zm) __arm_streaming;
+svbfloat16x4_t svmul[_bf16_x4](svbfloat16x4_t zd, svbfloat16x4_t zm) __arm_streaming;
+svbfloat16x4_t svmul[_single_bf16_x4](svbfloat16x4_t zd, svbfloat16_t zm) __arm_streaming;
+```
+
+#### BFSCALE
+
+BFloat16 floating-point adjust exponent vectors.
+
+``` c
+// Only if __ARM_FEATURE_SVE_BFSCALE != 0
+svbfloat16x2_t svscale[_bf16_x2](svbfloat16x2_t zdn, svint16x2_t zm) __arm_streaming;
+svbfloat16x2_t svscale[_single_bf16_x2](svbfloat16x2_t zn, svint16_t zm) __arm_streaming;
+svbfloat16x4_t svscale[_bf16_x4](svbfloat16x4_t zdn, svint16x4_t zm) __arm_streaming;
+svbfloat16x4_t svscale[_single_bf16_x4](svbfloat16x4_t zn, svint16_t zm) __arm_streaming;
+```

 ### SME2.1 instruction intrinsics

 The specification for SME2.1 is in
````

Review comments on this hunk:

> **Reviewer:** Trailing space

> **Reviewer:** Should be *(amilendra marked this conversation as resolved)*

> **Reviewer:** Leading space

> **Reviewer:** Should be
> **Contributor:** What is the whitespace issue here? Are you talking about the space between the quotes and the "c"? If so, this is done pretty much throughout the whole document, so I am not sure it is actually something we need to correct. But I can fix it if you prefer it without the space.
>
> **Contributor:** Hello @Kmeakin, would it be possible to clarify the issue here? I am not exactly sure what to address.
>
> **Kmeakin:** Yes, I am referring to the whitespace between the backticks and the c. It is minor enough that I don't think it should hold up the PR. LGTM