369b4fe to 6805fe6
Walkthrough
Replace Vc with xsimd in the build and headers, add an octahedral normal packing API, generalize vector/matrix SIMD code (add 3x3 support and swizzling), rewrite matrix inversion/transpose, adjust literal operator syntax, add and extend unit tests, and update the CMake/test presets and .gitignore.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Test as TestSuite
    participant API as mr::API
    participant SIMD as xsimd
    participant Math as Matr/Vec
    Test->>API: Create Vec/Matr/Norm (swizzle, pack, invert)
    API->>SIMD: load/store, swizzle, reduce_add, select
    SIMD->>Math: perform SIMD ops (via xsimd primitives)
    Math->>Math: transposition / Gaussian elimination (inverse)
    Math->>API: return results
    Test->>Test: assert results (identity, equality, tolerances)
```
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes

Areas to focus review on:
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Actionable comments posted: 9
🔭 Outside diff range comments (3)
CMakeLists.txt (1)
82-82: Update DEPENDENCIES to reflect xsimd instead of Vc.
The DEPENDENCIES still references "Vc 1.4" but the library now uses xsimd. This inconsistency could cause issues for package consumers.
Apply this diff to fix the dependency reference:

```diff
-  DEPENDENCIES "Vc 1.4"
+  DEPENDENCIES "xsimd"
```

include/mr-math/vec.hpp (1)
117-143: Clean cross product implementation using swizzle.
The refactored implementation is more readable and leverages the new swizzle functionality effectively. Consider removing the commented alternative implementations.
include/mr-math/matr.hpp (1)
197-283: Matrix inversion correctly implements Gaussian elimination.
The new implementation using the `Row2N` helper struct for augmented matrix operations is well-structured. The algorithm correctly handles pivot normalization and zero pivots with epsilon checks.

Consider adding a comment about the numerical stability limitations of Gaussian elimination for ill-conditioned matrices. For production use, implementing pivoting strategies or using more stable algorithms like LU decomposition might be beneficial.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (12)
- .gitignore (1 hunks)
- CMakeLists.txt (1 hunks)
- cmake/dependencies.cmake (1 hunks)
- include/mr-math/color.hpp (2 hunks)
- include/mr-math/debug.hpp (4 hunks)
- include/mr-math/def.hpp (2 hunks)
- include/mr-math/matr.hpp (6 hunks)
- include/mr-math/norm.hpp (1 hunks)
- include/mr-math/row.hpp (3 hunks)
- include/mr-math/units.hpp (1 hunks)
- include/mr-math/vec.hpp (6 hunks)
- tests/main.cpp (6 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/main.cpp (1)
include/mr-math/def.hpp (4)
equal (45-47), equal (45-45), equal (56-58), equal (56-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Analyze (actions)
- GitHub Check: Analyze (python)
🔇 Additional comments (23)
.gitignore (1)
41-41: LGTM: Good addition for VS Code support.
Adding `.vscode/*` to .gitignore is a good practice to prevent VS Code workspace settings from being committed to the repository.

CMakeLists.txt (1)
46-47: LGTM: Library linking correctly updated.
The migration from Vc to xsimd in the library linking is implemented correctly.
include/mr-math/debug.hpp (1)
27-27: Double instantiation removal is intentional across the codebase.
All `double` debug instantiations are commented out in debug.hpp, matching the commented-out `double` aliases in matr.hpp (e.g. `Matr4d`) and elsewhere. This appears to be a deliberate change; no further updates are needed unless restoring double-precision debug support is required in the future.

include/mr-math/color.hpp (2)
71-72: LGTM: Improved efficiency with swizzled method.
Replacing explicit element-wise construction with the `swizzled` method is more efficient and leverages the new SIMD capabilities from xsimd. The implementation correctly maintains the same functionality while improving performance.
Also applies to: 76-77, 81-82

116-116: LGTM: Literal operator syntax correction.
Removing the space between `operator""` and `_rgba` follows proper C++ literal operator syntax.

include/mr-math/norm.hpp (1)
87-88: LGTM! Correct migration to xsimd API.
The change from `.sum()` to `stdx::reduce_add()` properly aligns with the xsimd library API for vector reduction operations.

include/mr-math/def.hpp (2)
8-8: New C++23 header included.
The `<print>` header requires C++23. Ensure your build configuration and compiler support this standard.

23-27: SIMD library migration looks good.
The switch from Vc to xsimd is properly implemented with appropriate namespace aliasing.
include/mr-math/units.hpp (1)
124-146: Correct C++ standard compliance for literal operators.
Removing the space between `operator""` and the literal suffix is required by the C++ standard. The old syntax with a space was deprecated and is now ill-formed.

tests/main.cpp (2)
38-44: Great test coverage for the new swizzle functionality!
The test properly validates both the non-mutating `swizzled()` and the mutating `swizzle()` methods.

208-230: Comprehensive matrix inversion tests added.
The tests thoroughly validate matrix inversion for both 4x4 and 3x3 matrices, including verification against expected results and identity matrix checks.
Also applies to: 272-367
include/mr-math/vec.hpp (3)
151-152: Consistent xsimd API usage for reductions.
Both `length2()` and `dot()` correctly use `stdx::reduce_add()` instead of the Vc-specific `.sum()` method.
Also applies to: 219-220
271-281: Well-designed swizzle API.
The swizzle methods follow the established pattern of providing both mutating and non-mutating variants, maintaining API consistency.

303-313: Efficient SIMD-based clamping implementation.
The refactored method properly leverages SIMD select operations for branchless clamping, which should improve performance compared to element-wise operations.
include/mr-math/row.hpp (6)
52-56: LGTM! Array-based copy constructor implementation is correct.
The implementation correctly handles type conversion between different Row types using an intermediate buffer. The buffer sizing with `std::max` ensures sufficient space for both source and destination SIMD types.
92-99: Element access methods correctly adapted to xsimd API.
The switch from direct array access to the `.get()` method aligns with xsimd's API requirements.
101-118: Well-implemented swizzle functionality.
The swizzle methods correctly handle element reordering with proper SIMD padding. The parameter pack overload provides a convenient interface while delegating to the array-based implementation.
142-158: Epsilon-based equality comparison correctly implemented.
The array-based approach with `mr::equal` properly handles floating-point comparison with tolerance.
163-171: Protected _set method correctly adapted.
The array-based loading approach with proper SIMD buffer sizing is consistent with the xsimd migration pattern.
174-179: Element modification correctly implemented via array operations.
The store-modify-load pattern appropriately handles single-element updates within xsimd's API constraints.
include/mr-math/matr.hpp (3)
55-59: Excellent generalization using compile-time lambda.
The use of `std::integer_sequence` with a compile-time lambda elegantly generalizes the constructor to support any matrix size, replacing the hardcoded 4x4 implementation.

106-119: Clarify TODO and consider enabling matrix multiplication specialization.
The TODO comment suggests issues with the generic implementation. The commented N=3 specialization using `xsimd::insert` appears necessary for proper 3x3 matrix multiplication.
Please clarify what "extra cases" the TODO refers to and whether the N=3 specialization should be enabled for correct 3x3 matrix multiplication.
175-195: Well-designed transpose specializations for 3x3 and 4x4 matrices.
The implementation cleverly handles SIMD requirements by using a 4x4 temporary array for 3x3 transposition. The specialized approach ensures optimal performance for supported matrix sizes.
```diff
 # CPMFindPackage(
 #   NAME Vc
 #   GITHUB_REPOSITORY 4J-company/Vc
 #   GIT_TAG 1.4
 #   OPTIONS
 #     "USE_CCACHE ON"
 # )

 CPMFindPackage(
-  NAME Vc
-  GITHUB_REPOSITORY 4J-company/Vc
-  GIT_TAG 1.4
-  OPTIONS
-    "USE_CCACHE ON"
+  NAME xsimd
+  GITHUB_REPOSITORY xtensor-stack/xsimd
+  # GIT_TAG 12.1.1
+  GIT_TAG master
 )
```
🛠️ Refactor suggestion
Consider using a specific version tag instead of the master branch.
While the migration from Vc to xsimd looks correct, using the master branch can lead to build instability as it may include breaking changes. Consider using the commented-out version 12.1.1 or another stable release tag.
Apply this diff to use a specific version:
```diff
 CPMFindPackage(
   NAME xsimd
   GITHUB_REPOSITORY xtensor-stack/xsimd
-  # GIT_TAG 12.1.1
-  GIT_TAG master
+  GIT_TAG 12.1.1
 )
```

🤖 Prompt for AI Agents
In cmake/dependencies.cmake between lines 9 and 22, replace the GIT_TAG value
for the xsimd package from "master" to the specific version tag "12.1.1" by
uncommenting the line with "12.1.1" and commenting out or removing the "master"
tag line. This ensures build stability by using a fixed, stable release instead
of the potentially unstable master branch.
```cpp
// using SimdImpl = stdx::fixed_size_simd<T, N>;
  requires(sizeof(T) == 4)
using SimdImpl = typename stdx::make_sized_batch<T, 4>::type;
// using SimdImpl = typename stdx::make_sized_batch<T, N>::type;
```
🧹 Nitpick (assertive)
Consider the limitations of the hardcoded SIMD configuration.
The new implementation:
- Restricts types to 4-byte sizes only (excluding `double` and 64-bit integers)
- Uses a fixed batch size of 4, which may not be optimal for all architectures

The commented line suggests this might be temporary. Consider making this more flexible or documenting why these restrictions are necessary.
🤖 Prompt for AI Agents
In include/mr-math/def.hpp around lines 34 to 37, the current SIMD
implementation restricts types to 4-byte sizes and uses a fixed batch size of 4,
limiting support for types like double and 64-bit integers and potentially
reducing performance on some architectures. To fix this, either revert to or
enhance the commented flexible implementation that uses the template parameter N
for batch size, or add clear documentation explaining why the fixed size and
type restrictions are necessary and any plans for future flexibility.
```cpp
// std::println("-----------------------------------------");
// matrix.inverse();
// std::println("after functrion");
EXPECT_TRUE(matrix.inverse().equal(expected, 0.0001));
```
🧹 Nitpick (assertive)
Remove commented debug code.
Please remove the commented println debug statements before merging.
```diff
-  // std::println("-----------------------------------------");
-  // matrix.inverse();
-  // std::println("after functrion");
```
🤖 Prompt for AI Agents
In tests/main.cpp around lines 226 to 229, remove the commented out debug print
statements to clean up the code before merging. Delete the lines containing the
commented std::println calls so only the relevant test assertion remains.
```diff
 template <ArithmeticT R, std::size_t S, ArithmeticT ... Args>
-constexpr Vec(const Vec<R, S> &v, Args ... args) noexcept : _data(v, args...) {}
-#endif
+  requires (sizeof...(Args) + S == N)
+// TODO: use RowT consturctor here, move code to it (now doesnt compile)
+// constexpr Vec(const Vec<R, S> &v, Args ... args) noexcept : _data(v, args...) {}
+constexpr Vec(const Vec<R, S> &v, Args ... args) noexcept {
+  std::array<T, RowT::SimdT::size> buf;
+  v._data._data.store_unaligned(buf.data());
+  std::array args_buf {args...};
+  for (std::size_t i = S; i < N; i++) {
+    buf[i] = args_buf[i - S];
+  }
+  _data._data = stdx::load_unaligned(buf.data());
+}
```
🧹 Nitpick (assertive)
Address the TODO for constructor delegation.
The implementation is correct but could be more efficient by delegating to a RowT constructor as suggested in the TODO comment. This would avoid the store/load round-trip.
Would you like me to help implement the RowT constructor to enable this optimization?
🤖 Prompt for AI Agents
In include/mr-math/vec.hpp around lines 84 to 96, the constructor currently
copies data by storing and loading arrays, which is inefficient. To fix this,
implement a constructor in the RowT class that accepts a Vec<R, S> and Args...,
then delegate the Vec constructor to this RowT constructor. This will eliminate
the store/load round-trip and improve performance.
```cpp
std::array<T, RowT::SimdT::size> buf {};
for (std::size_t i = 0; i < len; i++) {
  buf[i] = span[i];
}
_data._data = stdx::load_unaligned(buf.data());
// for (size_t i = 0; i < len; i++) {
//   _data._set_ind(i, span[i]);
// }
}
```
🧹 Nitpick (assertive)
Proper SIMD data loading implementation.
The refactored constructor correctly uses a temporary buffer with load_unaligned for SIMD-compatible data loading. Consider removing the commented old code.
```diff
 _data._data = stdx::load_unaligned(buf.data());
-// for (size_t i = 0; i < len; i++) {
-//   _data._set_ind(i, span[i]);
-// }
```
🤖 Prompt for AI Agents
In include/mr-math/vec.hpp around lines 70 to 78, the constructor uses a
temporary buffer and load_unaligned for SIMD data loading correctly, but the old
commented-out code remains. Remove the commented-out loop code to clean up and
finalize the implementation.
```cpp
constexpr bool operator==(const Row &other) const noexcept {
  // return stdx::all_of(_data == other._data);
  // return _data.eq(other._data);

  // return stdx::all(stdx::eq(_data, other._data));

  // this variant works only for vec3, make this more generic is too hard
  // auto batch = stdx::eq(_data, other._data);
  // decltype(batch) mask_batch(true, true, true, false);
  // return stdx::all(batch || mask_batch);

  std::array<T, SimdT::size> left_buf, right_buf;
  _data.store_unaligned(left_buf.data());
  other._data.store_unaligned(right_buf.data());
  for (uint32_t i = 0; i < N; i++) {
    if (left_buf[i] != right_buf[i]) {
      return false;
    }
  }
  return true;
}
```
🧹 Nitpick (assertive)
Consider optimizing equality comparison with SIMD operations.
While the array-based comparison is correct, it loses SIMD performance benefits. The commented attempts suggest challenges with generic SIMD equality across different vector sizes.
Consider investigating xsimd's masked operations or size-specific specializations to enable SIMD-based equality checks for better performance. The current implementation is functionally correct but could be optimized.
🤖 Prompt for AI Agents
In include/mr-math/row.hpp around lines 120 to 140, the equality operator
currently compares elements by storing SIMD data into arrays and iterating,
which is correct but does not leverage SIMD performance. To fix this, refactor
the operator== to use SIMD masked equality operations provided by xsimd or
implement size-specific specializations that perform SIMD comparisons directly,
avoiding the need to store data into arrays and loop manually. This will
optimize the equality check while maintaining correctness.
```cpp
// TODO: this code for vec3 fill 4th float as value instead 0 too
constexpr Row(const T value) : _data(value) {}
```
Critical: Vec3 initialization fills 4th SIMD lane incorrectly
The TODO comment correctly identifies a bug where vec3 initialization fills the 4th SIMD lane with the constructor value instead of 0. This can cause incorrect results in SIMD operations that expect zero-padding.
The commented-out code (lines 28-34) provides the correct implementation. Enable it to fix this issue:
```diff
-  // TODO: this code for vec3 fill 4th float as value instead 0 too
-  constexpr Row(const T value) : _data(value) {}
-  // constexpr Row(const T data) {
-  //   std::array<T, SimdT::size> buf {};
-  //   for (uint32_t i = 0; i < N; i++) {
-  //     buf[i] = data;
-  //   }
-  //   _data = _data.load_unaligned(buf.data());
-  // }
+  constexpr Row(const T data) {
+    std::array<T, SimdT::size> buf {};
+    for (uint32_t i = 0; i < N; i++) {
+      buf[i] = data;
+    }
+    _data = _data.load_unaligned(buf.data());
+  }
```
🤖 Prompt for AI Agents
In include/mr-math/row.hpp at lines 26 to 27, the constructor for vec3
incorrectly fills the 4th SIMD lane with the given value instead of zero,
causing errors in SIMD operations. Replace the current constructor
implementation with the commented-out code below it (lines 28-34) that correctly
initializes the 4th lane to zero while setting the first three lanes to the
value. This ensures proper zero-padding for vec3 SIMD vectors.
```cpp
static Matr get_identity() {
  // TODO: beautify it
  std::array<RowT, N> id;
  constexpr auto io = std::ranges::iota_view {(size_t)0, N};
  if constexpr (N == 4) {
    id[0] = {1, 0, 0, 0};
    id[1] = {0, 1, 0, 0};
    id[2] = {0, 0, 1, 0};
    id[3] = {0, 0, 0, 1};
  } else if (N == 3) {
    id[0] = {1, 0, 0};
    id[1] = {0, 1, 0};
    id[2] = {0, 0, 1};
  }

  // std::transform(
  //     io.begin(), io.end(), id.begin(),
  //     [&io](size_t i) -> RowT {
  //       return SimdImpl<T, N>([i](size_t j) { return j == i ? 1 : 0; });
  //     });

  return id;
```
🧹 Nitpick (assertive)
Identity matrix initialization is clear and efficient.
The explicit initialization for 3x3 and 4x4 matrices is more readable and efficient than the previous generic approach.
The TODO comment about beautifying can be removed - the current implementation is already clean and straightforward.
🤖 Prompt for AI Agents
In include/mr-math/matr.hpp around lines 426 to 449, remove the TODO comment
about beautifying the identity matrix initialization since the current explicit
initialization for 3x3 and 4x4 matrices is clear, readable, and efficient as is.
```cpp
template <ArithmeticT T>
using Matr3 = Matr<T, 3>;

using Matr3f = Matr3<float>;
// using Matr4d = Matr4<double>;
using Matr3i = Matr3<int>;
using Matr3u = Matr3<uint32_t>;
```
Fix copy-paste error in comment.
Line 27 has a duplicate comment about Matr4d which should refer to Matr3d.
```diff
 using Matr3f = Matr3<float>;
-// using Matr4d = Matr4<double>;
+// using Matr3d = Matr3<double>;
 using Matr3i = Matr3<int>;
```
🤖 Prompt for AI Agents
In include/mr-math/matr.hpp around lines 23 to 29, the comment on line 27
incorrectly refers to Matr4d due to a copy-paste error. Update the comment to
correctly refer to Matr3d instead, ensuring it matches the surrounding code
context and naming conventions.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
include/mr-math/quat.hpp (1)
74-129: Resolve merge conflict and fix quaternion constructor calls with incorrect types
This block contains unresolved merge conflict markers that prevent compilation. Additionally, the stashed implementation has critical compile errors beyond what was previously noted.

Merge conflict (lines 90-129):
- Conflict markers must be removed; the file will not compile as-is.
- There are two competing `Vec`×`Quat` rotation implementations.
- The `std::cout` debug statement at line 102 will spam logs and must be removed.

Critical compile errors in stashed code:
- Line 100: `Quat(0, v.x(), v.y(), v.z())` passes `0`, a raw scalar, but Quat constructors require `Radians<T>` as the first argument. The `Radians` constructor is marked `explicit`, so implicit conversion is not possible.
- Line 122: `Quat(w(), -vec())` passes `w()`, which returns `T` (a raw scalar), but the Quat constructor expects `Radians<T>`. This will fail to compile.

Correct these by wrapping scalar angles with `Radians<T>`:

```cpp
[[nodiscard]] friend constexpr Vec3<T> operator*(const Vec3<T>& v, const Quat& q) noexcept {
  const Quat pure(Radians<T>(0), v.x(), v.y(), v.z());
  const Quat result = q * pure * q.conjugate();
  return result.vec();
}

friend constexpr Vec<T, 3> & operator*=(Vec<T, 3> &lhs, const Quat &rhs) noexcept {
  lhs = lhs * rhs;
  return lhs;
}

[[nodiscard]] friend constexpr Quat operator/(const Quat &lhs, T rhs) noexcept {
  return { Radians<T>(lhs.w() / rhs), lhs.vec() / rhs };
}

friend constexpr Quat & operator/=(Quat &lhs, T rhs) noexcept {
  lhs = lhs / rhs;
  return lhs;
}

[[nodiscard]] constexpr Quat conjugate() const noexcept {
  return Quat(Radians<T>(w()), -vec());
}

friend std::ostream & operator<<(std::ostream &os, const Quat &q) noexcept {
  os << '(' << q.w() << ", " << q.vec() << ')';
  return os;
}
```
♻️ Duplicate comments (2)
include/mr-math/row.hpp (1)
26-35: `Row(const T)` still corrupts the padded SIMD lane for 3-component rows
For `N == 3`, `SimdT` has 4 lanes but `constexpr Row(const T value) : _data(value) {}` fills all lanes with
value. Any code that reduces over the full SIMD register (e.g., dot products using `stdx::reduce_add`) will then see 4 contributions instead of 3, which breaks things like lengths or "broadcast" rows built via this constructor. The TODO above already calls this out.
Suggested fix (based on the commented code and current xsimd API) is to initialize only the first `N` lanes and leave the padding lane(s) at 0:

```diff
-  // TODO: this code for vec3 fill 4th float as value instead 0 too
-  constexpr Row(const T value) : _data(value) {}
-  // constexpr Row(const T data) {
-  //   std::array<T, SimdT::size> buf {};
-  //   for (uint32_t i = 0; i < N; i++) {
-  //     buf[i] = data;
-  //   }
-  //   _data = _data.load_unaligned(buf.data());
-  // }
+  constexpr Row(const T value) {
+    std::array<T, SimdT::size> buf{};
+    for (uint32_t i = 0; i < N; ++i) {
+      buf[i] = value;
+    }
+    _data = SimdT::load_unaligned(buf.data());
+  }
```

(The updated `_set`/`_set_ind` implementations already follow the "store or load through a full-sized buffer" pattern; this constructor should be consistent with that.)
_set/_set_indimplementations already follow the “store or load through a full-sized buffer” pattern; this constructor should be consistent with that.)Also applies to: 170-181, 184-195
include/mr-math/def.hpp (1)
8-9: SimdImpl is now hard-wired to 4-byte types and 4-wide batches
The new alias

```cpp
template <ArithmeticT T, std::size_t N>
  requires(sizeof(T) == 4)
using SimdImpl = typename stdx::make_sized_batch<T, 4>::type;
```

means:
- SIMD is only available for 4-byte types (e.g., `float`, `int32_t`); `double` and 64-bit integers cannot use `SimdImpl` at all.
- The template parameter `N` is ignored; all rows/vecs use a 4-lane batch regardless of their logical size.

If that restriction is intentional for this project, it would help to document it clearly (and possibly `static_assert` when incompatible types are instantiated). Otherwise, consider moving toward `make_sized_batch<T, N>` or a small meta that caps `N` at the hardware width, so `N` actually drives the batch shape and doubles remain supported.
Unrelated but minor: `#include <print>` in this core header forces a C++23 library feature even though this file doesn't use it directly; you may want to keep `<print>` confined to test or utility code that actually calls `std::println`.
Also applies to: 23-28, 33-39
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (10)
- CMakeUserPresets.json (1 hunks)
- include/mr-math/color.hpp (2 hunks)
- include/mr-math/def.hpp (2 hunks)
- include/mr-math/norm.hpp (2 hunks)
- include/mr-math/packing.hpp (1 hunks)
- include/mr-math/quat.hpp (1 hunks)
- include/mr-math/row.hpp (3 hunks)
- tests/CMakeLists.txt (1 hunks)
- tests/misc.cpp (1 hunks)
- tests/norm.cpp (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
tests/misc.cpp (2)
include/mr-math/color.hpp (14)
Color(14-14)Color(17-18)Color(20-21)Color(24-31)Color(33-39)Color(33-33)r(42-42)r(42-42)g(43-43)g(43-43)b(44-44)b(44-44)a(45-45)a(45-45)include/mr-math/def.hpp (8)
equal(46-48)equal(46-46)equal(57-59)equal(57-57)within(98-100)within(98-98)within_ex(121-123)within_ex(121-121)
include/mr-math/norm.hpp (1)
include/mr-math/def.hpp (4)
equal(46-48)equal(46-46)equal(57-59)equal(57-57)
include/mr-math/packing.hpp (1)
include/mr-math/norm.hpp (1)
_data(52-52)
include/mr-math/def.hpp (2)
include/mr-math/matr.hpp (1)
N(175-195)include/mr-math/vec.hpp (8)
N(106-106)N(107-107)N(108-108)N(109-109)N(237-246)N(237-237)N(259-269)N(259-259)
include/mr-math/row.hpp (2)
include/mr-math/matr.hpp (18)
value(226-231)value(226-226)value(233-237)value(233-233)buf(215-218)buf(215-215)buf(221-224)buf(221-221)i(140-140)i(144-144)i(245-247)i(245-245)nodiscard(140-142)nodiscard(144-146)nodiscard(149-162)N(175-195)other(80-89)other(80-80)include/mr-math/vec.hpp (37)
i(99-99)i(99-99)i(110-110)nodiscard(106-106)nodiscard(107-107)nodiscard(108-108)nodiscard(109-109)nodiscard(110-110)nodiscard(150-153)nodiscard(155-157)nodiscard(160-162)nodiscard(218-221)nodiscard(223-225)N(106-106)N(107-107)N(108-108)N(109-109)N(237-246)N(237-237)N(259-269)N(259-259)swizzle(272-274)swizzle(272-272)other(116-143)other(116-116)other(145-147)other(145-145)other(218-218)other(223-223)other(228-234)other(228-228)other(249-256)other(249-249)other(321-323)other(321-321)other(325-327)other(325-325)
tests/norm.cpp (2)
include/mr-math/packing.hpp (12)
pack_oct32(125-128)pack_oct32(125-125)unpack_oct32(130-133)unpack_oct32(130-130)pack_oct24(135-138)pack_oct24(135-135)unpack_oct24(140-143)unpack_oct24(140-140)pack_oct16(145-148)pack_oct16(145-145)unpack_oct16(150-153)unpack_oct16(150-150)include/mr-math/def.hpp (4)
equal(46-48)equal(46-46)equal(57-59)equal(57-57)
include/mr-math/quat.hpp (1)
include/mr-math/vec.hpp (20)
N(106-106)N(107-107)N(108-108)N(109-109)N(237-246)N(237-237)N(259-269)N(259-259)nodiscard(106-106)nodiscard(107-107)nodiscard(108-108)nodiscard(109-109)nodiscard(110-110)nodiscard(150-153)nodiscard(155-157)nodiscard(160-162)nodiscard(218-221)nodiscard(223-225)w(103-103)w(103-103)
🔇 Additional comments (14)
include/mr-math/norm.hpp (1)
48-50: Dot product switch to `stdx::reduce_add` looks correct
Using `stdx::reduce_add(_data._data._data * other._data._data)` is the right xsimd-style replacement for the previous `.sum()` call and keeps the semantics of `dot` unchanged. The `UncheckedTag` constructor still only asserts normalization, so no behavioral change there.
Also applies to: 86-89
tests/CMakeLists.txt (1)
1-5: GoogleTest wiring and test target setup look good
Using `GTest::gtest_main` plus `gtest_discover_tests` is a clean way to integrate the suite, and the test sources list matches the new test files (with quat.cpp intentionally commented out).

include/mr-math/packing.hpp (12)
1-6: LGTM! Header guards and includes are appropriate.

10-12: LGTM! Type aliases are clear and provide semantic meaning for packed normal representations.

44-51: LGTM! Quantization constants are correctly defined.

63-68: LGTM! Quantization logic correctly transforms and packs the 2D vector into a 32-bit value.

70-74: LGTM! Dequantization correctly inverts the quantize32 operation.

76-81: LGTM! 24-bit quantization correctly uses 12 bits per component.

83-88: LGTM! Dequantization correctly inverts the quantize24 operation.

90-95: LGTM! 16-bit quantization correctly uses 8 bits per component.

97-101: LGTM! Dequantization correctly inverts the quantize16 operation.

125-153: LGTM! The public API functions are well-structured, consistently delegate to implementation details, and follow a clear naming convention for the three packing variants (32-bit, 24-bit, 16-bit).

105-112: Norm3f guarantees normalization; no defensive check needed.
The `Norm<T, N>` template constructor automatically normalizes on construction (lines 43-45 of include/mr-math/norm.hpp): `_data.normalize()`. Since `Norm3f` is guaranteed to be a normalized vector by construction, the denominator in `pack_oct_impl` (`abs(x) + abs(y) + abs(z)`) can never be zero. The code is safe as-is.

53-56: No issues found; the code is correct.
The `Vec2f` constructor accepts `ArithmeticT` template arguments, which includes `bool` (as `bool` satisfies `std::integral`). The function `sign_non_zero` correctly constructs `Vec2f{v.x() >= 0, v.y() >= 0}` with boolean comparison results, and the implicit bool-to-float conversion is standard C++ behavior. The code compiles and functions as intended.
```json
    "conan": {}
  },
  "include": [
    "build/Release/generators/CMakePresets.json"
```
🧩 Analysis chain
Verify the hardcoded Release build type accommodates project workflows.
The include path hardcodes build/Release/generators/CMakePresets.json, which limits support for Debug, MinSizeRel, or RelWithDebInfo builds. If the project needs to support multiple build configurations (especially with multi-config generators like Visual Studio or Xcode), this path may not work as intended.
Please verify:

- Is the build directory structure always `build/Release/...`, or should it be dynamic based on the active configuration?
- Does the project use single-config generators (Unix Makefiles) or multi-config generators (Visual Studio, Xcode)?
- If CMakeUserPresets.json should remain user-specific and not committed, is it listed in `.gitignore`?
🏁 Script executed:
#!/bin/bash
# Check if CMakeUserPresets.json is in .gitignore
grep -E '^CMakeUserPresets\.json$|CMakeUserPresets' .gitignore
# Check CMakeLists.txt for how Conan integration is configured
rg -C 5 'cmake_minimum_required|conan_toolchain|CMAKE_BUILD_TYPE' CMakeLists.txt

Length of output: 189
I need to gather more information about the project's Conan setup and .gitignore configuration.
🏁 Script executed:
#!/bin/bash
# Check if .gitignore exists and its full content
echo "=== .gitignore content ==="
cat .gitignore 2>/dev/null || echo ".gitignore not found"
# Search for Conan configuration files
echo -e "\n=== Searching for conanfile.* ==="
fd -t f 'conanfile\.(txt|py)' --max-one-result
# Check if there's a conan profile or toolchain in standard locations
echo -e "\n=== Searching for CMake toolchain references ==="
rg -l 'CMAKE_TOOLCHAIN_FILE|conan_toolchain' --type cmake
# Look for build scripts that might set configuration
echo -e "\n=== Searching for build or setup scripts ==="
fd -t f '(build|setup|configure)' -e sh -e py | head -20

Length of output: 756
I found a critical detail: .gitignore ignores the entire build/ directory, but CMakeUserPresets.json is NOT ignored (committed to repo). Let me search for the Conan configuration file to understand the complete setup.
🏁 Script executed:
#!/bin/bash
# Correct the conanfile search syntax
echo "=== Searching for conanfile.* ==="
fd -t f 'conanfile'
# Also check for any README or documentation about build setup
echo -e "\n=== Searching for README or build documentation ==="
fd -t f '(README|BUILD|SETUP)' -e md -e txt
# Check the complete CMakeLists.txt file
echo -e "\n=== Full CMakeLists.txt ==="
cat CMakeLists.txt 2>/dev/null | head -50

Length of output: 1717
Let me check the Conan integration in cmake/dependencies.cmake and the build instructions in README.md:
🏁 Script executed:
#!/bin/bash
# Check cmake/dependencies.cmake for Conan integration
echo "=== cmake/dependencies.cmake ==="
cat cmake/dependencies.cmake 2>/dev/null || echo "File not found"
# Check README for build instructions
echo -e "\n=== README.md (first 60 lines) ==="
head -60 README.md 2>/dev/null || echo "File not found"

Length of output: 3155
Remove or rewrite CMakeUserPresets.json—it references Conan, which is not used in this project.
The file declares "vendor": "conan" and references build/Release/generators/CMakePresets.json, but the project uses CPM.cmake for dependency management (as seen in cmake/dependencies.cmake). There is no conanfile.txt or conanfile.py in the repository. This mismatch means the preset file won't work with the actual build system.
Either:
- Remove the file if not needed, or
- Rewrite it to align with CPM.cmake, or
- Add proper Conan integration if intentional (conanfile + CMakeLists.txt updates).
🤖 Prompt for AI Agents
CMakeUserPresets.json around line 7 references "vendor": "conan" and
build/Release/generators/CMakePresets.json which is incorrect for this repo that
uses CPM.cmake; either delete CMakeUserPresets.json if not needed, or replace
its contents to remove Conan-specific fields and point to a presets/generator
configuration compatible with CPM.cmake (or, if Conan was intended, add a
conanfile and update CMakeLists to integrate Conan). Make the change by removing
the vendor/conan references and any generator paths that assume Conan, or delete
the file entirely if no IDE/user presets are required.
```diff
   // format conversions
   // TODO: implement using shuffle
   Vec4f argb() const noexcept {
-    return {a(), r(), g(), b()};
+    // return {a(), r(), g(), b()};
+    return _data.swizzled(3, 0, 1, 2);
   }

   Vec4f bgra() const noexcept {
-    return {b(), g(), r(), a()};
+    // return {b(), g(), r(), a()};
+    return _data.swizzled(2, 1, 0, 3);
   }

   Vec4f abgr() const noexcept {
-    return {a(), b(), g(), r()};
+    // return {a(), b(), g(), r()};
+    return _data.swizzled(3, 2, 1, 0);
   }
```
🧹 Nitpick | 🔵 Trivial
Color swizzle conversions are fine; be cautious with exposing _data
- Using `_data.swizzled(…)` in `argb`/`bgra`/`abgr` is a nice fit with the new Vec swizzle API and avoids manual component reshuffling.
- Commenting out `private:` effectively makes `_data` part of the public API; if that's intentional (to enable low-level interop), it's worth documenting, because external code can now mutate the underlying `Vec4f` without going through `Color`'s setters.
- The updated literal `operator""_rgba(unsigned long long)` matches the `_rgba` usages in tests and is standard UDL syntax.
If you want to keep some encapsulation while still allowing advanced users access, consider adding a named accessor (e.g. const Vec4f& data() const noexcept) instead of fully exposing the member.
Also applies to: 110-112, 116-119
🤖 Prompt for AI Agents
In include/mr-math/color.hpp around lines 69-83 (and similarly at 110-112 and
116-119), the member _data was made public by commenting out the private:
section; restore encapsulation by re-enabling private: for _data, and provide a
controlled accessor instead — add a const Vec4f& data() const noexcept to expose
read-only access for advanced users and, only if external mutation is required,
add a non-const Vec4f& data() noexcept with a clear comment; update any call
sites to use these accessors and document the choice in the header comment.
```cpp
struct PackedNorm24 {
  PackedNorm24() = default;

  constexpr explicit PackedNorm24(uint32_t value) noexcept {
    _data[0] = value & 0xFF;
    _data[1] = (value >> 8) & 0xFF;
    _data[2] = (value >> 16) & 0xFF;
  }

  constexpr explicit operator uint32_t() const noexcept {
    return
      (static_cast<uint32_t>(_data[2]) << 16) |
      (static_cast<uint32_t>(_data[1]) << 8) |
      static_cast<uint32_t>(_data[0]);
  }

  constexpr explicit operator const uint8_t* () const noexcept {
    return _data.data();
  }

  constexpr explicit operator uint8_t* () noexcept {
    return _data.data();
  }

private:
  std::array<uint8_t, 3> _data;
};
```
🧩 Analysis chain
Verify required includes are available.
The struct uses std::array<uint8_t, 3> and various fixed-width integer types, but this file doesn't directly include <array> or <cstdint>. While these may be transitively included via norm.hpp or vec.hpp, relying on transitive includes is fragile and can break if those headers change.
Run the following script to check if the required headers are included in the dependency chain:
If these includes are not present, consider adding them directly to this file:
#ifndef __MR_PACKING_HPP_
#define __MR_PACKING_HPP_
+#include <array>
+#include <cstdint>
+
#include "norm.hpp"
#include "vec.hpp"

🏁 Script executed:
#!/bin/bash
# Check if array and cstdint are included in norm.hpp or vec.hpp
echo "Checking norm.hpp for required includes:"
rg -n '#include\s+<(array|cstdint)>' include/mr-math/norm.hpp
echo -e "\nChecking vec.hpp for required includes:"
rg -n '#include\s+<(array|cstdint)>' include/mr-math/vec.hpp

Length of output: 81
🏁 Script executed:
#!/bin/bash
# Get the full packing.hpp file to see all includes
echo "=== Full packing.hpp content (first 50 lines) ==="
head -50 include/mr-math/packing.hpp
echo -e "\n=== All #include directives in packing.hpp ==="
rg -n '^#include' include/mr-math/packing.hpp
echo -e "\n=== All #include directives in norm.hpp ==="
rg -n '^#include' include/mr-math/norm.hpp
echo -e "\n=== All #include directives in vec.hpp ==="
rg -n '^#include' include/mr-math/vec.hpp

Length of output: 1533
🏁 Script executed:
#!/bin/bash
# Check def.hpp and row.hpp for includes
echo "=== All #include directives in def.hpp ==="
rg -n '^#include' include/mr-math/def.hpp
echo -e "\n=== All #include directives in row.hpp ==="
rg -n '^#include' include/mr-math/row.hpp
echo -e "\n=== Check for std::array usage in def.hpp ==="
rg -n 'std::array|<array>' include/mr-math/def.hpp
echo -e "\n=== Check for cstdint or integer types in def.hpp ==="
rg -n '<cstdint>|uint32_t|uint8_t' include/mr-math/def.hpp | head -20

Length of output: 563
🏁 Script executed:
#!/bin/bash
# Check operators.hpp
echo "=== First 40 lines of operators.hpp ==="
head -40 include/mr-math/operators.hpp 2>/dev/null || echo "File not found or error"
echo -e "\n=== All #include directives in operators.hpp ==="
rg -n '^#include' include/mr-math/operators.hpp 2>/dev/null || echo "Not found"
# Check if xsimd.hpp or any xsimd includes provide array/cstdint
echo -e "\n=== Checking xsimd directory for array/cstdint includes ==="
find . -name "*.hpp" -path "*/xsimd/*" 2>/dev/null | head -20
# Most direct test: search entire mr-math directory for std::array to see what includes it
echo -e "\n=== Files using std::array in mr-math ==="
rg -l 'std::array' include/mr-math/ 2>/dev/null
# Check all header files for <array> and <cstdint> in mr-math directory
echo -e "\n=== All files in mr-math with <array> include ==="
rg -l '<array>' include/mr-math/ 2>/dev/null
echo -e "\n=== All files in mr-math with <cstdint> include ==="
rg -l '<cstdint>' include/mr-math/ 2>/dev/null

Length of output: 1548
🏁 Script executed:
#!/bin/bash
# Check specific usage of std::array in the files that use it
echo "=== std::array usage in vec.hpp ==="
rg -n 'std::array' include/mr-math/vec.hpp | head -5
echo -e "\n=== std::array usage in norm.hpp ==="
rg -n 'std::array' include/mr-math/norm.hpp | head -5
echo -e "\n=== std::array usage in packing.hpp ==="
rg -n 'std::array' include/mr-math/packing.hpp | head -5
# Final check: look for any system header that might be commonly providing both array and cstdint
echo -e "\n=== Check if any single header could provide both (like <bits/stdc++.h>) ==="
rg -n '<bits|stdc++' include/mr-math/*.hpp 2>/dev/null || echo "Not found (which is good)"

Length of output: 474
Add explicit includes for std::array and fixed-width integer types.
Verification confirms that packing.hpp uses std::array<uint8_t, 3> (line 39) and fixed-width integer types (uint32_t, uint8_t) throughout the struct, but neither <array> nor <cstdint> are included directly or available through the transitive include chain (norm.hpp → def.hpp and vec.hpp → def.hpp). While the code may compile through implicit transitive includes (likely from xsimd), this violates best practice and risks breakage if dependencies change.
Add to the top of the file:
#ifndef __MR_PACKING_HPP_
#define __MR_PACKING_HPP_
+#include <array>
+#include <cstdint>
+
#include "norm.hpp"
#include "vec.hpp"

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In include/mr-math/packing.hpp around lines 14 to 40, the PackedNorm24 struct
uses std::array and fixed-width integer types (uint8_t, uint32_t) but the file
doesn't include <array> or <cstdint>; add direct includes for these headers at
the top of the file (include <array> and include <cstdint>) so the code no
longer relies on transitive includes and will compile reliably.
```cpp
inline Vec2f oct_wrap(const Vec2f& v)
{
  return (Vec2f{1} - std::abs(Vec2f{v.y(), v.x()})) * sign_non_zero(v);
}
```
🧩 Analysis chain
Verify std::abs works with Vec2f.
The function calls std::abs(Vec2f{v.y(), v.x()}), which requires std::abs to be overloaded for Vec2f or for the type to provide a compatible interface.
Run the following script to check if abs is defined for Vec types:
🏁 Script executed:
#!/bin/bash
# Search for abs function overloads or member functions for Vec types
rg -nP --type=cpp -C3 'std::abs.*Vec|abs\s*\(.*Vec|Vec.*::abs'
# Also check for using std::abs or abs definitions in vec.hpp
rg -n 'using std::abs|^\s*inline.*abs.*Vec' include/mr-math/vec.hpp

Length of output: 1689
🏁 Script executed:
#!/bin/bash
# Check vec.hpp for abs overload definitions
cat include/mr-math/vec.hpp | grep -n -A5 -B5 "abs"
# Also search for overloads of operator* for Vec2f
rg -n "operator\*" include/mr-math/vec.hpp

Length of output: 793
🏁 Script executed:
#!/bin/bash
# Search all header files for std::abs overload for Vec types
rg -n "std::abs.*Vec|namespace std.*abs" include/
# Check if there's an abs free function defined for Vec
rg -n "^\s*inline.*abs\s*\(" include/mr-math/

Length of output: 104
🏁 Script executed:
#!/bin/bash
# Find stdx::abs definition
rg -n "namespace stdx|stdx::abs" include/
# Check includes in packing.hpp to understand what's available
head -20 include/mr-math/packing.hpp

Length of output: 618
🏁 Script executed:
#!/bin/bash
# Check for using declarations in packing.hpp
rg -n "using " include/mr-math/packing.hpp
# Check norm.hpp for any abs overloads
rg -n "abs" include/mr-math/norm.hpp
# Check if there's an abs free function in mr namespace
rg -n "^\s*(inline\s+)?abs\s*\(" include/mr-math/*.hpp

Length of output: 72
🏁 Script executed:
#!/bin/bash
# Look at all uses of abs in packing.hpp to see the pattern
rg -n "abs" include/mr-math/packing.hpp | head -20
# Check what sign_non_zero returns and if it's related
rg -n "sign_non_zero" include/mr-math/packing.hpp -A3 -B3

Length of output: 702
Replace std::abs(Vec2f{...}) with component-wise absolute value.
Line 60 calls std::abs() on a Vec2f object, but std::abs has no overload for composite vector types. The code will not compile. Other calls in the file correctly apply std::abs to scalar components. Change:
return (Vec2f{1} - std::abs(Vec2f{v.y(), v.x()})) * sign_non_zero(v);

to:

return (Vec2f{1} - Vec2f{std::abs(v.y()), std::abs(v.x())}) * sign_non_zero(v);

🤖 Prompt for AI Agents
In include/mr-math/packing.hpp around lines 58 to 61, the code calls std::abs on
a Vec2f which has no overload for vector types; replace the vector-level
std::abs call with a component-wise absolute construction: compute
Vec2f{std::abs(v.y()), std::abs(v.x())} and use that in the expression so the
return becomes (Vec2f{1} - Vec2f{std::abs(v.y()), std::abs(v.x())}) *
sign_non_zero(v).
```cpp
inline Norm3f unpack_oct_impl(const Vec2f& norm) {
  // TODO: add shader code for unpacking normals
  Vec3f vec{ norm, 1 - std::abs(norm.x()) - std::abs(norm.y())};
  if (vec.z() < 0) {
    vec = Vec3f{mr::details::oct_wrap(norm), vec.z()};
  }
  return vec.normalized_unchecked();
}
```
🧩 Analysis chain
Address TODO comment and verify normalized_unchecked safety.
The function contains a TODO comment suggesting incomplete implementation. Additionally, it calls normalized_unchecked() on the reconstructed vector, which bypasses safety checks. For octahedral decoding, the reconstructed vector should always be normalizable, but this assumption should be verified.
- Should this TODO be resolved before merging, or is it acceptable to track it separately?
- Would you like me to open an issue to track the shader code implementation?
Run the following script to verify the behavior of normalized_unchecked:
🏁 Script executed:
#!/bin/bash
# Find the definition of normalized_unchecked and check if it has any safety checks
ast-grep --pattern 'normalized_unchecked() {
$$$
}'
# Also search for documentation about when it's safe to use
rg -nP --type=cpp -C3 'normalized_unchecked'

Length of output: 1555
Let me also verify whether the TODO comment blocks merge:
🏁 Script executed:
#!/bin/bash
# Check for other TODOs in the file to understand the project's approach
rg -nP 'TODO' include/mr-math/packing.hpp
# Search for oct_wrap implementation to understand the wrapping logic
rg -nP --type=cpp -B2 -A5 'oct_wrap'

Length of output: 1515
🌐 Web query:
octahedral normal encoding decoding guaranteed non-zero vector
💡 Result:
Short answer: use the standard octahedral encode/decode and on decode detect zero-length and replace it (or add a tiny epsilon) before normalizing. This guarantees a non‑zero unit vector.
Recommended decode (GLSL-style, minimal):
- Encode (assumes unit n):

  p = n.xy / (abs(n.x)+abs(n.y)+abs(n.z));
  if (n.z < 0) p = (1 - abs(p.y), 1 - abs(p.x)) * sign(p);

- Decode:

  vec3 n; n.x = p.x; n.y = p.y; n.z = 1.0 - abs(p.x) - abs(p.y);
  if (n.z < 0.0) {
    float tx = (1.0 - abs(n.y)) * sign(n.x);
    float ty = (1.0 - abs(n.x)) * sign(n.y);
    n.x = tx; n.y = ty; n.z = -n.z; // or recompute as 1-abs(x)-abs(y)
  }
  // guard: avoid zero-length
  float len = length(n);
  if (len == 0.0) n = vec3(0.0, 0.0, 1.0);
  else n = n / len;
Notes: many implementations instead add a tiny epsilon before normalization (e.g. len = max(length(n), 1e-6)) to avoid branches. This is the standard approach described in the literature and practical guides. [1][2]
Sources:
[1] Knarkowicz, "Octahedron normal vector encoding" (implementation + decode/encode formulas).
[2] Cigolle et al., "A Survey of Efficient Representations for Independent Unit Vectors" (octahedral encoding background and variants).
Add guard for zero-length vectors before normalization in octahedral decoding.
The unpack_oct_impl function calls normalized_unchecked() without checking if the reconstructed vector is zero-length. While octahedral decoding is mathematically well-defined, industry best practice for octahedral decoding is to detect zero-length and replace it (or add epsilon) before normalizing to guarantee a non-zero unit vector. The current implementation omits this safety check, creating a potential edge case vulnerability.
Add a guard before normalization—either check for zero-length and replace with a default vector, or add a small epsilon to the length before division.
Regarding the TODO comment: clarify whether shader code implementation is a prerequisite for merging or can be tracked separately.
🤖 Prompt for AI Agents
In include/mr-math/packing.hpp around lines 114–121, the reconstructed Vec3f may
be zero-length before calling normalized_unchecked(); add a guard that computes
the squared length (or length) of vec and if it's below a tiny epsilon replace
vec with a safe default normal (e.g., {0,0,1}) or perform normalization using
length + epsilon to avoid division by zero, then return the normalized result;
also update or remove the TODO comment to state whether shader-side decoding
must be implemented before merge or can be tracked as a follow-up task.
```cpp
constexpr void swizzle(const std::array<std::size_t, N> &mask) {
  using IntSimdT = SimdImpl<uint32_t, N>;
  // using IntSimdT = stdx::batch<uint32_t, xsimd::sse2>;
  std::array<uint32_t, IntSimdT::size> buf;
  for (std::size_t i = 0; i < N; i++) {
    buf[i] = mask[i];
  }
  for (std::size_t i = N; i < buf.size(); i++) {
    buf[i] = static_cast<uint32_t>(i);
  }
  // IntSimdT sm;
  // sm = sm.load_unaligned(buf.data());
  auto simd_mask = stdx::load_unaligned(buf.data());
<<<<<<< Updated upstream
=======
  // std::cout << typeid(_data).name() << std::endl;
  // std::cout << typeid(sm).name() << std::endl;

  // TODO(dk6): this code isnt compile in benchmark
>>>>>>> Stashed changes
  _data = stdx::swizzle(_data, simd_mask);
}

template<std::integral ...Ints>
  requires(sizeof...(Ints) == N)
constexpr void swizzle(Ints ...ints) {
  swizzle(std::array {(std::size_t)ints...});
}
```
🧩 Analysis chain
Unresolved merge conflict and fragile mask construction in swizzle
This block still contains conflict markers:
<<<<<<< Updated upstream
=======
// std::cout << ...
// TODO(dk6): this code isnt compile in benchmark
>>>>>>> Stashed changes

which will prevent compilation, and there's no reason for the debug printing here anyway.
Also, to keep the architecture of the mask batch consistent with SimdT, it’s safer to use the batch’s own load_unaligned rather than an unqualified stdx::load_unaligned, which may default to default_arch instead of the make_sized_batch arch used in SimdT.
A minimal, cleaner version could be:
- constexpr void swizzle(const std::array<std::size_t, N> &mask) {
- using IntSimdT = SimdImpl<uint32_t, N>;
- std::array<uint32_t, IntSimdT::size> buf;
- for (std::size_t i = 0; i < N; i++) {
- buf[i] = mask[i];
- }
- for (std::size_t i = N; i < buf.size(); i++) {
- buf[i] = static_cast<uint32_t>(i);
- }
- auto simd_mask = stdx::load_unaligned(buf.data());
-<<<<<<< Updated upstream
-=======
- // std::cout << typeid(_data).name() << std::endl;
- // std::cout << typeid(sm).name() << std::endl;
-
- // TODO(dk6): this code isnt compile in benchmark
->>>>>>> Stashed changes
- _data = stdx::swizzle(_data, simd_mask);
- }
+ constexpr void swizzle(const std::array<std::size_t, N> &mask) {
+ using IntSimdT = SimdImpl<uint32_t, N>;
+ std::array<uint32_t, IntSimdT::size> buf;
+ for (std::size_t i = 0; i < N; ++i) {
+ buf[i] = static_cast<uint32_t>(mask[i]);
+ }
+ for (std::size_t i = N; i < buf.size(); ++i) {
+ buf[i] = static_cast<uint32_t>(i);
+ }
+ auto simd_mask = IntSimdT::load_unaligned(buf.data());
+ _data = stdx::swizzle(_data, simd_mask);
+ }This removes the conflict markers and debug I/O, and ensures the mask batch uses the same architecture and size as SimdT, which is what xsimd::swizzle expects.
I'll verify the review comment by examining the actual code and understanding the architectural concerns.
---
🏁 Script executed:
```shell
#!/bin/bash
# Check the actual current state of the swizzle method in row.hpp
cat include/mr-math/row.hpp | sed -n '95,135p'
```

Length of output: 1375
🏁 Script executed:
#!/bin/bash
# Find the SimdImpl definition to verify it has load_unaligned
rg -n "class SimdImpl|struct SimdImpl" --type=cpp --type=hpp -A 30 | head -100

Length of output: 27
🏁 Script executed:
#!/bin/bash
# Check for SimdImpl::load_unaligned method
rg -n "load_unaligned" --type=cpp --type=hpp -B 2 -A 2 | head -60

Length of output: 27
🏁 Script executed:
#!/bin/bash
# Understand the relationship between SimdImpl and stdx namespace
rg -n "using.*stdx|namespace.*stdx" --type=cpp --type=hpp | head -20

Length of output: 27
🏁 Script executed:
#!/bin/bash
# Check if there are any architecture/batch related setup patterns in SimdImpl
ast-grep --pattern 'class SimdImpl { $$$ }'

Length of output: 0
🏁 Script executed:
#!/bin/bash
# Look for make_sized_batch references mentioned in the review
rg -n "make_sized_batch" --type=cpp --type=hpp -B 2 -A 2

Length of output: 27
🏁 Script executed:
#!/bin/bash
# Find where _data is defined in the row class to understand its type
rg -n "_data.*SimdT\|using.*SimdT" --type=cpp --type=hpp -B 3 -A 1 | head -40

Length of output: 27
🏁 Script executed:
#!/bin/bash
# Check xsimd::swizzle expectations
rg -n "xsimd::swizzle\|stdx::swizzle" --type=cpp --type=hpp -B 3 -A 3 | head -50

Length of output: 27
🏁 Script executed:
#!/bin/bash
# Search for xsimd documentation on swizzle and architecture consistency
# to understand if batch types must match
echo "Checking if merge conflict markers are present in current version..."
grep -n "<<<<<<< Updated upstream" include/mr-math/row.hpp || echo "No merge conflict found in current version"

Length of output: 97
🏁 Script executed:
#!/bin/bash
# Find files containing SimdImpl definition
fd "\.hpp$|\.h$" | xargs grep -l "class SimdImpl\|struct SimdImpl" 2>/dev/null

Length of output: 0
🏁 Script executed:
#!/bin/bash
# Get SimdImpl definition
fd "\.hpp$|\.h$" | xargs grep -n "class SimdImpl\|struct SimdImpl" -A 40 2>/dev/null | head -80

Length of output: 0
🏁 Script executed:
#!/bin/bash
# Search for load_unaligned in header files
fd "\.hpp$|\.h$" | xargs grep -n "load_unaligned" 2>/dev/null | head -30

Length of output: 988
🏁 Script executed:
#!/bin/bash
# Find where _data is defined with SimdT type
fd "\.hpp$|\.h$" | xargs grep -n "SimdT.*_data\|_data.*SimdT\|using.*SimdT" 2>/dev/null | head -20

Length of output: 431
🏁 Script executed:
#!/bin/bash
# Check for make_sized_batch in the codebase
fd "\.hpp$|\.h$|\.cpp$" | xargs grep -n "make_sized_batch" 2>/dev/null

Length of output: 188
🏁 Script executed:
#!/bin/bash
# Look for xsimd includes and usage patterns to understand architecture setup
fd "\.hpp$|\.h$|\.cpp$" | xargs grep -n "xsimd\|stdx" 2>/dev/null | head -40

Length of output: 3419
🏁 Script executed:
#!/bin/bash
# Search row.hpp for the Row class definition and SimdT template parameter
head -150 include/mr-math/row.hpp | grep -n "class Row\|template\|SimdT\|_data"

Length of output: 2038
I need to verify the xsimd API to confirm the architectural concern and the suggested fix.
🌐 Web query:
xsimd load_unaligned API method signature batch
💡 Result:
The xsimd batch unaligned load signatures (template forms) are:
- `template <class From, class A> inline batch<From, A> load_unaligned(From const *ptr) noexcept`. [1]
- `template <class U> inline batch<T, A> load_unaligned(U const *mem) noexcept` (member overload on `batch<T, A>`). [2]
References: xsimd data-transfer and batch docs. [1][2]
🌐 Web query:
xsimd batch make_sized_batch load_unaligned static method
💡 Result:
Short summary:
- `xsimd::make_sized_batch<T, N>` is a type-trait that yields a batch type whose element type is T and whose number of elements is N (or void if no native register of that size exists). [1]
- `xsimd::batch<T, Arch>::load_unaligned(ptr)` is a static method that returns a new batch loaded from the (possibly unaligned) memory address ptr. As of the migration to the newer API, load/load_unaligned/load_aligned are static functions (they create and return a batch) rather than instance methods that mutate an existing batch. [2][3]
Minimal example:
using batch_t = xsimd::make_sized_batch<float,8>::type; // select batch of 8 floats (if available)
batch_t v = batch_t::load_unaligned(ptr); // static load_unaligned returns a batch
References:
- make_sized_batch / batch API docs. [xsimd docs]. [1]
- Data transfer / load_unaligned description. [xsimd docs]. [2]
- Migration note: load* functions are now static. [xsimd migration guide]. [3]
🌐 Web query:
xsimd swizzle batch type consistency requirements
💡 Result:
Short answer: swizzle requires the source batch and the mask to share the same architecture (same A) and the mask to be an integer-type batch whose element size matches the element size expected by the swizzle overload (or a compile-time batch_constant). In other words use swizzle(batch<T,A> const&, batch<ITy,A> mask) or swizzle(batch<T,A> const&, batch_constant<... , A, ...>), where ITy is an integer type and A is the same architecture for both arguments. [1][2]
Sources:
- xsimd data-transfer / swizzle API. [1]
- generic arch implementation showing swizzle(batch<T,A>, batch<ITy,A>) requiring same A and integer mask elements. [2]
Resolve merge conflict and fix architecture mismatch in swizzle
The merge conflict markers at lines 114–119 in include/mr-math/row.hpp will prevent compilation. Additionally, the architecture consistency issue is valid: using the free function stdx::load_unaligned(buf.data()) defaults to the default architecture, but xsimd's swizzle requires the source batch and mask to share the same architecture. Since _data is created via SimdImpl<T, N> (which uses make_sized_batch), the mask must use the same architecture.
Replace lines 101–121 with:
constexpr void swizzle(const std::array<std::size_t, N> &mask) {
using IntSimdT = SimdImpl<uint32_t, N>;
std::array<uint32_t, IntSimdT::size> buf;
for (std::size_t i = 0; i < N; ++i) {
buf[i] = static_cast<uint32_t>(mask[i]);
}
for (std::size_t i = N; i < buf.size(); ++i) {
buf[i] = static_cast<uint32_t>(i);
}
auto simd_mask = IntSimdT::load_unaligned(buf.data());
_data = stdx::swizzle(_data, simd_mask);
}This removes the conflict markers, debug code, and ensures the mask uses the same architecture as _data for compatibility with xsimd::swizzle.
🤖 Prompt for AI Agents
In include/mr-math/row.hpp around lines 101 to 128, remove the leftover merge
conflict markers and debug comments and ensure the SIMD mask uses the same
architecture as _data: use IntSimdT (the SimdImpl<uint32_t, N> alias) to declare
buf of size IntSimdT::size, fill buf[0..N-1] with static_cast<uint32_t>(mask[i])
and buf[N..end] with static_cast<uint32_t>(i), then create the mask with
IntSimdT::load_unaligned(buf.data()) (not stdx::load_unaligned) and call _data =
stdx::swizzle(_data, simd_mask); this removes conflict markers, keeps casts
explicit, and fixes architecture mismatch.
| TEST(Transform, ScaleVector) { | ||
| EXPECT_EQ(mr::Vec3f(1, -1, 0) * mr::scale(mr::Vec3f{30, 47, 80}), mr::Vec3f(30, -47, 0)); | ||
| } | ||
|
|
||
| TEST(Transform, TranslateVector) { | ||
| EXPECT_EQ(mr::Vec3f(0, 0, 22) * mr::translate(mr::Vec3f{30, 47, 80}), mr::Vec3f(30, 47, 102)); | ||
| } | ||
|
|
||
| // TODO(dk6): uncomment | ||
| // TEST(Transform, RotateVector) { | ||
| // // mr::Vec3f v{30, 47, 80}; | ||
| // // mr::Vec3f expected{69.0185819718, 23.0733671242, 64.908050904}; | ||
| // // EXPECT_TRUE(mr::equal(v * mr::rotate(mr::Norm3f(1, 1, 1), 102_rad), expected, 0.0001)); | ||
| // | ||
| // EXPECT_TRUE(mr::equal(mr::axis::x * mr::rotate(mr::Yaw(90_deg)), mr::axis::z)); | ||
| // std::println("{}", mr::axis::x * mr::rotate(mr::Yaw(90_deg))); | ||
| // | ||
| // // EXPECT_TRUE(mr::equal(mr::axis::y * mr::rotate(mr::Pitch(90_deg)), -mr::axis::z)); | ||
| // // EXPECT_TRUE(mr::equal(mr::axis::z * mr::rotate(mr::Roll(90_deg)), mr::axis::z)); | ||
| // } | ||
|
|
||
|
|
||
| // TEST(Transform, TransformVector) { | ||
| // mr::Vec3f v = mr::axis::x; | ||
| // mr::Vec3f expected{0}; | ||
| // EXPECT_TRUE(mr::equal(v * mr::scale(mr::Vec3f{2}) * mr::rotate(mr::Yaw(90_deg)) * mr::translate(mr::Vec3f{0, 0, 2}), expected, 0.001)); | ||
| // } | ||
|
|
||
| #if 1 | ||
|
|
||
| // TODO: camera tests | ||
|
|
||
TEST(Color, Constructors) {
  EXPECT_EQ(mr::Color(), mr::Color(0, 0, 0, 0));

  const mr::Color expected1{0.3, 0.47, 0.8, 1.0};
  EXPECT_EQ(mr::Color(0.3, 0.47, 0.8), expected1);
  EXPECT_EQ(mr::Color(mr::Vec4f(0.3, 0.47, 0.8, 1)), expected1);

  const mr::Color expected2{0.2980392156862745, 0.4666666666666667, 0.8, 1.0};
  EXPECT_EQ(mr::Color(76, 119, 204, 255), expected2);
  EXPECT_EQ(mr::Color(0x4C'77'CC'FF), expected2);
  EXPECT_EQ(0x4C'77'CC'FF_rgba, expected2);
}

TEST(Color, Formats) {
  const auto color = 0x4C'77'CC'FF_rgba;
  EXPECT_EQ(color.argb(), 0xFF'4C'77'CC_rgba);
  EXPECT_EQ(color.bgra(), 0xCC'77'4c'FF_rgba);
  EXPECT_EQ(color.abgr(), 0xFF'CC'77'4c_rgba);
}

TEST(Color, Getters) {
  const auto color = 0x4C'77'CC'FF_rgba;
  EXPECT_FLOAT_EQ(color.r(), 0.2980392156862745f);
  EXPECT_FLOAT_EQ(color.g(), 0.4666666666666667f);
  EXPECT_FLOAT_EQ(color.b(), 0.8);
  EXPECT_FLOAT_EQ(color.a(), 1.0);

  EXPECT_EQ(color[0], color.r());
  EXPECT_EQ(color[1], color.g());
  EXPECT_EQ(color[2], color.b());
  EXPECT_EQ(color[3], color.a());

  const auto [r, g, b, a] = color;
  EXPECT_EQ(r, color.r());
  EXPECT_EQ(g, color.g());
  EXPECT_EQ(b, color.b());
  EXPECT_EQ(a, color.a());
}

TEST(Color, Setters) {
  auto color = 0x4C'77'CC'FF_rgba;
  color.r(1.0);
  color.set(1, 0.5);
  EXPECT_EQ(color, mr::Color(1.0, 0.5, 0.8, 1.0));
}

TEST(Color, Equality) {
  const auto color1 = 0x4C'77'CC'FF_rgba;
  const auto copy = color1;
  EXPECT_EQ(color1, copy);
  EXPECT_TRUE(color1.equal(copy));
  EXPECT_TRUE(equal(color1, copy));

  const auto color2 = 0x00'00'00'00_rgba;
  EXPECT_NE(color1, color2);
  EXPECT_FALSE(color1.equal(color2));
  EXPECT_FALSE(equal(color1, color2));
}
TEST(Color, Addition) {
  // Values can exceed 1.0 (should they?)
  EXPECT_EQ(mr::Color(1.0, 0.0, 0.5, 1.0) + mr::Color(0.0, 1.0, 0.5, 1.0), mr::Color(1.0, 1.0, 1.0, 2.0));
}

TEST(Utility, Within) {
  EXPECT_FALSE(mr::within(1, 10)(0));
  EXPECT_TRUE(mr::within(1, 10)(1));
  EXPECT_TRUE(mr::within(1, 10)(5));
  EXPECT_TRUE(mr::within(1, 10)(10));
  EXPECT_FALSE(mr::within(1, 10)(11));

  EXPECT_FALSE(mr::within_ex(1, 10)(0));
  EXPECT_FALSE(mr::within_ex(1, 10)(1));
  EXPECT_TRUE(mr::within_ex(1, 10)(5));
  EXPECT_FALSE(mr::within_ex(1, 10)(10));
  EXPECT_FALSE(mr::within_ex(1, 10)(11));

  EXPECT_TRUE(mr::within(1., 10.f)(5));
  EXPECT_TRUE(mr::within_ex(1., 10.f)(5));
}
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
New transform/color/utility tests are solid; consider where strict float equality is really needed
These tests give good coverage for vector scaling/translation, color construction and format conversions, equality, and the within/within_ex helpers. The only potential fragility is the use of exact EXPECT_EQ / EXPECT_FLOAT_EQ against long decimal literals for color channels—those are fine as long as the normalization path stays exactly component / 255, but if you ever tweak implementation (e.g., change types or intermediate math), it may be safer to switch some of these to tolerance-based checks via mr::equal or EXPECT_NEAR.
TEST(NormalPackingTest, Octahedron32) {
  mr::Norm3f n{ 30, 47, 80 };
  mr::PackedNorm32 packed = mr::pack_oct32(n);
  mr::Norm3f unpacked = mr::unpack_oct32(packed);
  std::cout << n << " " << unpacked << std::endl;
  EXPECT_TRUE(mr::equal(unpacked, n, 0.01));
}

TEST(NormalPackingTest, Octahedron24) {
  mr::Norm3f n{ 30, 47, 80 };
  mr::PackedNorm24 packed = mr::pack_oct24(n);
  mr::Norm3f unpacked = mr::unpack_oct24(packed);
  EXPECT_TRUE(mr::equal(unpacked, n, 0.01));
}

TEST(NormalPackingTest, Octahedron16) {
  mr::Norm3f n{ 30, 47, 80 };
  mr::PackedNorm16 packed = mr::pack_oct16(n);
  mr::Norm3f unpacked = mr::unpack_oct16(packed);
  EXPECT_TRUE(mr::equal(unpacked, n, 0.1));
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Octahedral packing tests are good; consider dropping the stdout debug print
The three tests nicely cover pack/unpack for 32/24/16‑bit formats with sensible tolerances via mr::equal. The only nit is the std::cout in Octahedron32, which will clutter test output in CI; it’s better removed or wrapped in a debug macro.
🤖 Prompt for AI Agents
In tests/norm.cpp around lines 6 to 25, remove the stray std::cout debug print
in the Octahedron32 test (the line printing n and unpacked) because it clutters
CI output; either delete that line or wrap it in a debug-only macro (e.g.,
#ifdef DEBUG / #endif) so normal test runs produce no stdout noise, leaving the
EXPECT_TRUE assertions unchanged.