Skip to content

[asm] Memory offset optimization: split oversized constants#967

Open
Hardcode84 wants to merge 2 commits intoiree-org:mainfrom
Hardcode84:mem-offset-opt
Open

[asm] Memory offset optimization: split oversized constants#967
Hardcode84 wants to merge 2 commits intoiree-org:mainfrom
Hardcode84:mem-offset-opt

Conversation

@Hardcode84
Copy link
Contributor

Summary

  • Extend the MemoryOffsetOptimization pass to handle constants that exceed the hardware offset limit by splitting them into an aligned high part (K_hi) kept in the v_add_u32 and a low residual (K_lo) folded into the static offset field.
  • Before: v_add_u32(base, 68608) with offset:0 — full constant in VALU.
  • After: v_add_u32(base, 65536) with offset:3072 — residual absorbed into hardware.
  • The downstream ScopedCSE pass merges v_add_u32 ops that now share the same K_hi and base, eliminating redundant VALU instructions across nearby loads.

Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
@Hardcode84 Hardcode84 marked this pull request as ready for review February 24, 2026 19:34
Copy link
Contributor

@ftynse ftynse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is much better when the commit message explains why a change is made, we can see what it is from reading the code.

return std::nullopt;

// Check shift overflow before creating any new ops
// Check shift overflow before creating any new ops.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually avoid making irrelevant stylistic changes in a load-bearing commit. These show up in git history with unrelated commit message and you will be the one to blame for this code. Putting them in a separate NFC is a better idea.

@Hardcode84
Copy link
Contributor Author

It is much better when the commit message explains why a change is made, we can see what it is from reading the code.

It does

The downstream ScopedCSE pass merges v_add_u32 ops that now share the same K_hi and base, eliminating redundant VALU instructions across nearby loads.

@ftynse ftynse requested review from ftynse and removed request for ftynse March 2, 2026 08:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants