
Stage 1: Mechanical Ghidra-to-C# translation (10185 methods) #8

Open
AstrandPallas wants to merge 2 commits into main from ghidra-stage1

AstrandPallas (Contributor) commented Feb 12, 2026

Summary

  • Deterministic translator replaces // TODO stubs with Ghidra-decompiled method bodies across 520+ files
  • 10,185 methods written with 0 Roslyn compile errors
  • 57.6% line resolution rate (291,169 / 505,734 lines)
  • Pipeline: boilerplate stripping, alias tracking, field/static/array resolution, method call resolution, type mapping, Roslyn validation with auto-revert
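The last pipeline step, validation with auto-revert, can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `write_with_auto_revert` and the `compile_errors` callback are hypothetical names, and the callback stands in for the real Roslyn type check by returning the names of methods that currently fail.

```python
from typing import Callable, Dict, List


def write_with_auto_revert(
    stubs: Dict[str, str],
    translated: Dict[str, str],
    compile_errors: Callable[[Dict[str, str]], List[str]],
) -> Dict[str, str]:
    """Keep a translated body only if the checker accepts it.

    `compile_errors` is a hypothetical stand-in for the Roslyn check: it
    receives all current method bodies and returns the failing method names.
    """
    accepted = dict(stubs)
    for name, body in translated.items():
        previous = accepted[name]
        accepted[name] = body
        if name in compile_errors(accepted):
            # Revert to the original stub so the tree keeps compiling.
            accepted[name] = previous
    return accepted


def fake_checker(bodies: Dict[str, str]) -> List[str]:
    # Toy rule for the demo: a leftover Ghidra type name counts as an error.
    return [name for name, body in bodies.items() if "undefined8" in body]


demo = write_with_auto_revert(
    stubs={"A": "// TODO", "B": "// TODO"},
    translated={"A": "return 1;", "B": "undefined8 x;"},
    compile_errors=fake_checker,
)
```

Checking after every single method is the simplest form to show here; a real run over thousands of methods would more plausibly validate in batches and revert only the methods named in the diagnostics.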

Phase History

| Phase | Methods | Roslyn Errors | Unity Errors | Key Fixes |
|---|---|---|---|---|
| 1 (Fixes A-E) | 7,253 | 0 | 7,156 | Core pipeline, field resolution, method calls |
| 2 (Fixes F-I) | 7,258 | 0 | 6,316 | Bool/enum bitwise, smart demangling, visibility |
| 3 (Fixes K-P) | 9,681 translated | 0 | 5,266 | Type-aware array guards, narrowing casts, indexers |

Stats

| Metric | Count |
|---|---|
| Methods processed | 33,811 |
| Fully resolved | 9,681 |
| Written (post-validation) | ~10,185 |
| Partial (has // GHIDRA: comments) | 24,126 |
| Files modified | 520+ |
| Roslyn errors | 0 |
| Unity errors | 5,266 |
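For anyone cross-checking the figures, the percentages and per-phase deltas can be recomputed from the counts quoted in this description. The helper name `percent` is only illustrative.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Line resolution rate quoted in the Summary (291,169 / 505,734 lines).
line_resolution = percent(291_169, 505_734)

# Share of processed methods that were fully resolved (9,681 / 33,811).
fully_resolved_share = percent(9_681, 33_811)

# Unity error counts after phases 1, 2 and 3, from the Phase History table.
unity_errors_by_phase = [7_156, 6_316, 5_266]
reductions = [
    before - after
    for before, after in zip(unity_errors_by_phase, unity_errors_by_phase[1:])
]
```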

Phase 3 Improvements

  • Fix K: Type-aware array offset guards — only emit .Length/[0] for actual array types (CS1061: 1330→328, CS0021: 1184→226)
  • Fix L: Narrowing cast insertion for non-literal assignments (int→byte, int→enum)
  • Fix M: get_Item(idx) → [idx] indexer resolution (CS0571: 160→96)
  • Fix N: --unity-check flag with integrated Unity build + iterative visibility auto-fixer
  • Fix P: CS0052 inconsistent accessibility auto-fixer
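As an illustration of Fix M, the accessor-to-indexer rewrite could look something like the sketch below. It is a simplified regex version written for this description, not the translator's actual implementation; `resolve_indexers` is a hypothetical name, and arguments that themselves contain parentheses (other than nested `get_Item` calls) are deliberately left untouched.

```python
import re

# Explicit accessor calls trigger CS0571 in C#; rewrite them as indexers.
_GET_ITEM = re.compile(r"\.get_Item\(([^()]*)\)")
_SET_ITEM = re.compile(r"\.set_Item\(([^(),]*),\s*([^()]*)\)")


def resolve_indexers(line: str) -> str:
    """Rewrite x.get_Item(i) as x[i] and x.set_Item(i, v) as x[i] = v."""
    previous = None
    # Repeat until nothing changes so nested get_Item calls resolve inside-out.
    while previous != line:
        previous = line
        line = _SET_ITEM.sub(r"[\1] = \2", line)
        line = _GET_ITEM.sub(r"[\1]", line)
    return line
```

For example, `resolve_indexers("var x = list.get_Item(idx);")` yields `var x = list[idx];`.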

Test plan

  • Roslyn type check passes with 0 errors
  • Unity batch build — 5,266 errors remaining (iterative reduction in progress)
  • Spot-check translated methods against Ghidra pseudocode

…, 514 files)

Translator improvements since Stage 1:
- Constructor merging: IL2CPP split new()+.ctor() patterns merged into proper C# constructors
- Class field type awareness: bool/enum/ref fixes now apply to this.fieldName patterns
- Alternation-based regex: pre-compiled patterns per category for O(1) matching
- Prefix capture for member access: handles this.field without lookbehind issues
- File-level cast fixup: corrects this.(int) → (int)this. syntax

Stats: 7,249 methods written, 2,908 stubs replaced, 58.4% line resolution, 0 Roslyn errors
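
The file-level cast fixup listed above can be pictured with a small sketch. It assumes the malformed output looks like `this.(int)field`; the pattern and the name `fix_misplaced_casts` are illustrative and not taken from the translator.

```python
import re

# Matches a cast that ended up after "this." instead of in front of it.
_MISPLACED_CAST = re.compile(r"\bthis\.\((\w+)\)")


def fix_misplaced_casts(source: str) -> str:
    """Move a misplaced cast so that this.(int)x becomes (int)this.x."""
    return _MISPLACED_CAST.sub(r"(\1)this.", source)
```

Running it over a whole file as one string is enough for this pattern, since each match is local and does not depend on surrounding lines.
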
Translator improvements (ghidra_to_csharp.py Phase 3):
- Type-aware array offset guards: only emit .Length/[0] for actual arrays
- Narrowing cast insertion for non-literal assignments (int→byte/enum)
- get_Item/set_Item → C# indexer resolution
- Integrated --unity-check with iterative visibility auto-fixer
- Multi-level prefix capture fix for bool negation patterns

Results: 9681 methods translated, 0 Roslyn errors, 5266 Unity errors
(down from 6316 with 2936 additional methods written)
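
The type-aware array guard amounts to consulting the declared C# type before emitting array syntax. A minimal sketch, assuming the translator has a map from variable names to declared types (the `declared_types` argument and the function name are hypothetical):

```python
from typing import Dict, Optional


def guard_array_access(
    name: str, kind: str, declared_types: Dict[str, str]
) -> Optional[str]:
    """Emit array syntax only when `name` is declared as a C# array.

    `kind` is "length" or "first". Returning None means the caller should
    leave the line unresolved rather than emit code that fails to compile.
    """
    csharp_type = declared_types.get(name, "")
    if not csharp_type.endswith("[]"):
        # Non-array type: .Length would raise CS1061 and [0] would raise CS0021.
        return None
    return f"{name}.Length" if kind == "length" else f"{name}[0]"


types = {"items": "int[]", "count": "int"}
length_expr = guard_array_access("items", "length", types)
first_expr = guard_array_access("items", "first", types)
rejected = guard_array_access("count", "length", types)
```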
AstrandPallas changed the title from "Stage 1: Mechanical Ghidra-to-C# translation (8143 methods)" to "Stage 1: Mechanical Ghidra-to-C# translation (10185 methods)" on Feb 12, 2026
AstrandPallas (Contributor, Author) commented:

Don't review this yet, I've still got more to go.
