Fix SIMD calls #59
Fast SIMD calls and secure align for result.
fabiangreffrath
left a comment
Looks good, thank you! But what was the exact issue with the former code that needed to get fixed?
fabiangreffrath
left a comment
Please fix the failing builds.
libfaac/quantize.c
Outdated
x = _mm_max_ps(x, _mm_sub_ps((__m128){0, 0, 0, 0}, x));
x = _mm_mul_ps(x, (__m128){sfacfix, sfacfix, sfacfix, sfacfix});
const __m256d x_d = _mm256_load_pd(&xr[cnt]);
__m128 x = _mm256_cvtpd_ps(x_d);
These are AVX intrinsics, but this is SSE2 code. This will crash on systems without AVX support.
You need to use two _mm_load_pd and _mm_cvtpd_ps calls and one _mm_movelh_ps.
The main problem appears when unaligned data is used. The implicit [...] To make this a fix-only PR, I will change this part.
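For illustration, the SSE2-only conversion described in the review comment (two `_mm_load_pd`, two `_mm_cvtpd_ps`, one `_mm_movelh_ps`) could look roughly like the sketch below. The helper name `load4_pd_as_ps` is hypothetical and not taken from libfaac. Because it uses the aligned load, it assumes `src` is 16-byte aligned.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE ones) */

/* Hypothetical helper, not part of libfaac: convert four consecutive
 * doubles to four floats using SSE2 only, with no AVX instructions.
 * Assumption: src is 16-byte aligned, as _mm_load_pd requires. */
static inline __m128 load4_pd_as_ps(const double *src)
{
    /* Each conversion yields two floats in the low half, zeros above. */
    const __m128 lo = _mm_cvtpd_ps(_mm_load_pd(src));      /* f0 f1 0 0 */
    const __m128 hi = _mm_cvtpd_ps(_mm_load_pd(src + 2));  /* f2 f3 0 0 */

    /* Move the low half of hi into the high half of lo. */
    return _mm_movelh_ps(lo, hi);                          /* f0 f1 f2 f3 */
}
```

In the diff above, this would stand in for the `_mm256_load_pd` / `_mm256_cvtpd_ps` pair, for example `__m128 x = load4_pd_as_ps(&xr[cnt]);`, provided `&xr[cnt]` is known to be aligned.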
Potential unaligned memory location.
Only fix unaligned memory access.
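A minimal sketch of what an unaligned-safe version of this step might look like is given below. It is not the code merged in this PR. It swaps the aligned loads for `_mm_loadu_pd` and stores with `_mm_storeu_ps`, so neither pointer needs 16-byte alignment. The helper name `abs_scale4`, the `out` parameter, and treating `sfacfix` as a `float` are assumptions made for illustration.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper, not part of libfaac: compute |xr[i]| * sfacfix
 * for four doubles and write the results as floats.
 * No alignment is assumed for xr or out. */
static inline void abs_scale4(const double *xr, float sfacfix, float *out)
{
    /* Unaligned loads, then narrow each pair of doubles to floats. */
    const __m128 lo = _mm_cvtpd_ps(_mm_loadu_pd(xr));
    const __m128 hi = _mm_cvtpd_ps(_mm_loadu_pd(xr + 2));
    __m128 x = _mm_movelh_ps(lo, hi);

    /* Absolute value as max(x, 0 - x), mirroring the diff above. */
    x = _mm_max_ps(x, _mm_sub_ps(_mm_setzero_ps(), x));

    /* Scale by the broadcast factor. */
    x = _mm_mul_ps(x, _mm_set1_ps(sfacfix));

    /* Unaligned store of the four results. */
    _mm_storeu_ps(out, x);
}
```

The unaligned variants work on any valid address; whether they cost anything compared with the aligned forms depends on the target CPU, so that trade-off is left open here.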
fabiangreffrath
left a comment
Thank you!
