@abaresk
Contributor

@abaresk abaresk commented Jun 19, 2024

This change normalizes the assembly output to remove 2 common sources of diff scoring penalties:

  1. Extra padding appearing at the end of function definitions (line 62 in before)
  2. Optional destination register omitted by assembler (line 28 in before)

Before: https://pastebin.com/raw/CybxVDe2
After: https://pastebin.com/raw/BMXyfD5s
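As a rough illustration of the two normalizations listed above, here is a minimal Python sketch. It is not the code from this PR: the function names, the regular expression, the choice to expand toward the three-operand form, and the set of lines treated as padding (`nop` only) are all assumptions made for the example.

```python
import re

# Hypothetical pattern: a two-operand "adds"/"subs" with an immediate,
# where the destination register doubles as the first source operand.
TWO_OPERAND_RE = re.compile(
    r"^(?P<mnemonic>adds|subs)\s+(?P<rd>r\d+),\s*(?P<imm>#-?\w+)$"
)


def expand_destination(insn: str) -> str:
    """Rewrite 'adds r0, #1' as 'adds r0, r0, #1'; leave other lines alone."""
    insn = insn.strip()
    m = TWO_OPERAND_RE.match(insn)
    if m is None:
        return insn
    return f"{m['mnemonic']} {m['rd']}, {m['rd']}, {m['imm']}"


def strip_trailing_padding(lines, padding=("nop",)):
    """Drop padding lines from the end of a function body.

    Which instructions count as padding is an assumption here.
    """
    out = list(lines)
    while out and out[-1].strip() in padding:
        out.pop()
    return out


def normalize(lines):
    """Apply both normalizations to a list of disassembled instructions."""
    return [expand_destination(line) for line in strip_trailing_padding(lines)]


print(normalize(["adds r0, #1", "bx lr", "nop", "nop"]))
# → ['adds r0, r0, #1', 'bx lr']
```

Note that a rewrite like `expand_destination` makes the two spellings compare equal, so any real difference between them in the binary would no longer show up in the diff.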

arm32-discrepancies.zip

@abaresk abaresk force-pushed the arm32-discrepancies branch from 645a716 to ef2fb96 Compare June 19, 2024 21:21
@simonlindholm
Owner

> Optional destination register omitted by assembler (line 28 in before)

Surprising to hear of this source of mismatches. Is it different in the binary, e.g. some ARM vs Thumb thing? Or what causes objdump to emit either one or the other?

@abaresk
Contributor Author

abaresk commented Jun 21, 2024

The issue is with the assembler used by decomp.me. Objdump is correct in disambiguating these 2 cases, as the instructions have different encodings:

1c40: adds r0, r0, #1
3001: adds r0, #1

Unfortunately, the assembler used by decomp.me, arm-none-eabi-as, coerces instances of the former case into the latter.
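To make the encoding difference concrete, the following Python sketch decodes the two halfwords quoted above. It assumes the standard 16-bit Thumb layouts for ADDS with a 3-bit immediate (`0001110 imm3 Rn Rd`) and with an 8-bit immediate (`00110 Rdn imm8`); the function name is made up for the example and it handles only these two forms.

```python
def decode_thumb_adds(halfword: int) -> str:
    """Decode the two 16-bit Thumb ADDS-immediate forms; reject anything else."""
    if halfword >> 9 == 0b0001110:
        # ADDS Rd, Rn, #imm3: bits 8..6 = imm3, bits 5..3 = Rn, bits 2..0 = Rd
        imm3 = (halfword >> 6) & 0b111
        rn = (halfword >> 3) & 0b111
        rd = halfword & 0b111
        return f"adds r{rd}, r{rn}, #{imm3}"
    if halfword >> 11 == 0b00110:
        # ADDS Rdn, #imm8: bits 10..8 = Rdn, bits 7..0 = imm8
        rdn = (halfword >> 8) & 0b111
        imm8 = halfword & 0xFF
        return f"adds r{rdn}, #{imm8}"
    raise ValueError(f"not an ADDS-immediate encoding: {halfword:#06x}")


print(decode_thumb_adds(0x1C40))  # → adds r0, r0, #1
print(decode_thumb_adds(0x3001))  # → adds r0, #1
```

Both instructions add 1 to r0, but they are distinct bit patterns, which is why a disassembler can tell them apart while an assembler that picks one form for both spellings cannot reproduce the other.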

@simonlindholm
Owner

Can we change the assembler? I worry that this could hide real mismatches from wrong assembler choices, and make them hard to debug when you notice that your build doesn't match. If it's impractical or never comes up in practice outside decomp.me then I guess we could merge this.

(We have a similar case to this in MIPS, where li encoded as addiu vs ori shows up differently. While confusion vs clarity has been a mixed bag, I think it has helped cast light on some real assembler mismatches.)
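For readers unfamiliar with the MIPS case, this small Python sketch shows how the same `li` pseudo-instruction can end up as two different machine words. It assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate) and uses `li $t0, 1` as an arbitrary example; the helper name is hypothetical.

```python
ADDIU = 0x09  # I-type opcode for addiu
ORI = 0x0D    # I-type opcode for ori
ZERO = 0      # register $zero
T0 = 8        # register $t0


def encode_itype(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack a MIPS I-type instruction into a 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# "li $t0, 1" expanded two ways, depending on the assembler's choice:
li_as_addiu = encode_itype(ADDIU, ZERO, T0, 1)  # addiu $t0, $zero, 1
li_as_ori = encode_itype(ORI, ZERO, T0, 1)      # ori   $t0, $zero, 1

print(f"{li_as_addiu:#010x}")  # → 0x24080001
print(f"{li_as_ori:#010x}")    # → 0x34080001
```

The two words load the same value but differ in the opcode field, so a diff that shows them separately exposes which expansion the original assembler chose.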

@abaresk
Contributor Author

abaresk commented Jun 22, 2024

That makes sense. I'll report back after investigating the effort involved in adding proprietary assemblers to decomp.me.

I did some spelunking into modern binutils to see if there are any flags to disable this behavior. Unfortunately, there did not seem to be any. From what I can tell, this is baked into the relocation-processing stage of the assembler (md_apply_fix()).
