From 09a7b0c770aee60cc7f199d7ffa98b05a764ca7b Mon Sep 17 00:00:00 2001
From: dgw
Date: Sun, 15 Oct 2017 02:44:26 -0500
Subject: [PATCH] Overhaul readme (spelling, punctuation, grammar)

Slight wording changes in a few places for better flow. Mostly just fixing spelling and formatting (like using `code` instead of "quotes").

Also fixed spelling of the first image filename, because why not. :)
---
 ...ox_remcompiler.jpg => xbox_recompiler.jpg} | Bin
 readme.md                                     |  96 +++++++++---------
 2 files changed, 48 insertions(+), 48 deletions(-)
 rename _images/{xbox_remcompiler.jpg => xbox_recompiler.jpg} (100%)

diff --git a/_images/xbox_remcompiler.jpg b/_images/xbox_recompiler.jpg
similarity index 100%
rename from _images/xbox_remcompiler.jpg
rename to _images/xbox_recompiler.jpg
diff --git a/readme.md b/readme.md
index ae5ccf8..64ca96e 100644
--- a/readme.md
+++ b/readme.md
@@ -1,24 +1,24 @@
 ## Porting Xbox360 executables to Windows

-![DolphinDemoScreenshot](/_images/xbox_remcompiler.jpg)
+![DolphinDemoScreenshot](/_images/xbox_recompiler.jpg)

-The idea is simple: *what if you could take the Xbox360 game and run it on your PC?* Is this even possible in principle? I was pondering this question few years ago and that should not come as a supprise that there are some obvious technical difficulties in getting this done:
+The idea is simple: *What if you could take an Xbox360 game and run it on your PC?* Is this even possible in principle? I was pondering this question a few years ago, and it should not come as a surprise that there are some obvious technical difficulties in getting this done:

-- **Different CPUs** - Xbox360 uses PowerPC based CPU, our PCs are based on x86 architexture. They are different in so many ways that I don't even know where to start :) PowerPC is RISC based, has shitloads of registers but very simple instructions. x86 is totally different on the other hand - not so many registers and many more instructions that are more complicated (addressing modes...). It's obvious that a simple transcription is not feasible.
+- **Different CPUs** - Xbox360 uses a PowerPC-based CPU, while PCs are based on x86 architecture. They are different in so many ways that I don't even know where to start. :) PowerPC is RISC-based, with shitloads of registers but very simple instructions. x86 is totally different on the other hand—not so many registers and many more instructions that are more complicated (addressing modes...). It's obvious that a simple translation is not feasible.

-- **Memory Layout** - Xbox360 uses BigEndian byte ordering, x86 CPUs use LittleEndian. To be compatible with incoming data that is being read from files and read/written into the memory all memory based operands must be byteswapped. This may pose a significant performance issue.
+- **Memory Layout** - Xbox360 uses big-endian byte ordering, x86 CPUs use little-endian. To be compatible with incoming data that is being read from files and read/written into the memory, all memory based operands must be byte-swapped. This may pose a significant performance issue.

-- **Encrypted executable image** - Yup, for various reasons the executables on Xbox360 are encrypted. There are some cleaver guys in Russia though that figured how :)
+- **Encrypted executable image** - Yup, for various reasons the executables on Xbox360 are encrypted. There are some clever guys in Russia, though, who figured it out. :)

-- **Different and outdated GPU architecture** - If we want to see any graphics rendered the GPU needs to be emulated. Ther are two hard nuts to crack: first, the shaders we see will be complied into the GPU compatible format, no HLSL on input, sorry. Those shaders will have to be reverse engineered as well. Secondly, the Xbox360 GPU was using ~10MB of internal memory called EDRAM that was serving as a temporary storage of render target for the duration of rendering. Although some card today still use similar concept this is never exposed directly to the user. Since there a lot of differnt ways people used the EDRAM on Xbox this part has to be emulated. To be honest probably differently for every game.
+- **Different and outdated GPU architecture** - If we want to see any graphics rendered, the GPU needs to be emulated. There are two hard nuts to crack: first, the shaders we see will be compiled into the GPU-compatible format (no HLSL on input, sorry). Those shaders will have to be reverse engineered as well. Secondly, the Xbox360 GPU was using ~10MB of internal memory called EDRAM that was serving as a temporary storage of render target for the duration of rendering. Although some cards today still use similar concept, this is never exposed directly to the user. Since there are a lot of different ways people used the EDRAM on Xbox, this part has to be emulated (to be honest, probably differently for every game).

-- **Inlining of graphics/kernel functions** - Some of the functions used while compiling the executable were inlined directly into the compiled code making it much harder to write a simple API level wrapper. This kills the dream of making "function level" wrapper where we could just go and wrap the "d3d->DrawPrimitive" call directly. Nope, this is not going ot happen.
+- **Inlining of graphics/kernel functions** - Some of the functions used while compiling the executable were inlined directly into the compiled code making it much harder to write a simple API level wrapper. This kills the dream of making a "function level" wrapper where we could just go and wrap the `d3d->DrawPrimitive` call directly. Nope, this is not going to happen.

-Forunatelly, every problem is solvable and the answard is ***YES*** in principle. If you want to know how, keep reading :)
+Fortunately, every problem is solvable and the answer is ***YES***, in principle. If you want to know how, keep reading. :)

 ## Current state of the project

-Currently the published branch of the project allows to run simple Xbox360 demo apps (samples). I've not yet attempted to run it with any real game as it probably would not work with anything big and serious. Also, on the legal side, this is a fine line because getting anything bigger is tricky as it requires going basically to the Torrent Sites and digging through old Xbox Live Arcace content or pirated game. Xbox360 is not yet abandonware :) For the same reason there are no source executables given, you need to get one "from somewhere". Sorry :(
+Currently the published branch of the project allows running simple Xbox360 demo apps (samples). I've not yet attempted to run it with any real game as it probably would not work with anything big and serious. Also, on the legal side, this is a fine line because getting anything bigger is tricky as it basically requires going to the torrent sites and digging through old Xbox Live Arcade content or pirated games. Xbox360 is not yet abandonware. :) For the same reason there are no source executables given, you need to get one from "somewhere". Sorry. :(

 Stuff currently implemented:

@@ -52,17 +52,17 @@ Stuff currently implemented:
 + Basic GPU trace viewer that allows to inspect internal GPU state at each point
 + *Time Machine* tool that makes it possible to find previous instruction that touched given register or memory

-## How to run it ?
+## How to run it?

 - You will need Visual Studio 2015 (sorry, Windows only)
-- Get the wxWidgets in 3.1.0 and compile the x64 DLL libs, place them in dev\external\wxWidgets-3.1.0\
-- Compile the whole solution from dev\src\recompile.sln
-- Run the "framework\frontend" project
-- Open the project "projects\xenon\doplhin\dolphin.px"
+- Get wxWidgets 3.1.0 and compile the x64 DLL libs, place them in dev\external\wxWidgets-3.1.0\
+- Compile the whole solution from `dev\src\recompile.sln`
+- Run the `framework\frontend` project
+- Open the project `projects\xenon\doplhin\dolphin.px`
 - Select the "Final" configuration
-- Click the "Build button"
-- Assumming you've installed the project in C:\recompiler run the "launcher\frontend" project with following parameters: "-platform=Recompiler.Xenon.Launcher.dll -image=C:\recompiler\projects\xenon\doplhin\Dolphin.px.Final.VS2015.dll -dvd=C:\recompiler\projects\xenon\doplhin\data -devkit=C:\recompiler\projects\xenon\doplhin\data"
-- To exit the app close the GPU output window
+- Click the "Build" button
+- Assuming you've installed the project in `C:\recompiler`, run the `launcher\frontend` project with the parameters: `-platform=Recompiler.Xenon.Launcher.dll -image=C:\recompiler\projects\xenon\doplhin\Dolphin.px.Final.VS2015.dll -dvd=C:\recompiler\projects\xenon\doplhin\data -devkit=C:\recompiler\projects\xenon\doplhin\data`
+- To exit the app, close the GPU output window

 ## References

@@ -95,12 +95,12 @@ Stuff currently implemented:

 ## XEX

-First, the XEX (Xbox Executable) format must be ripped apart and the actual code has to be extracted. XEX is a Xbox360 specific executable packing/encryption format. It's not very complicated and quite good description can be found here: [Free60](http://free60.org/wiki/Main_Page). There are also some old references [here](http://www.openrce.org/forums/posts/111). Inside the XEX there are some platform specific headers (like file certificate, media/region information, file encryption key, etc) but also there's a normal
-PE style executable, unfortunatelly it's packed and encrypted.
+First, the XEX (Xbox Executable) format must be ripped apart and the actual code has to be extracted. XEX is an Xbox360-specific executable packing/encryption format. It's actually not very complicated, and a quite good description can be found at [Free60](http://free60.org/wiki/Main_Page). There are also some old references [here](http://www.openrce.org/forums/posts/111). Inside the XEX there are some platform-specific headers (like file certificate, media/region information, file encryption key, etc), but also there's a normal
+PE-style executable; unfortunately it's packed and encrypted.

-Decryption of any actual executables requires knowing the secret AES key that is used internally by the loader to compute another AES key that is actually used to decrypt the file content. I found it on a Russian site few years ago but could not retrace my steps any more, most likely the site is down gone for good. The rest of the XEX format suggests strongly that it was bascically built on top of existing PE image loader that existed in the OS. The compression used in the XEX is either simple block based compression or a variation of LZ compression. Both were identified and reversed years ago by people trying to break the Xbox360 anti-piracy protection.
+Decryption of any actual executables requires knowing the secret AES key that is used internally by the loader to compute another AES key that is actually used to decrypt the file content. I found it on a Russian site a few years ago but could not retrace my steps any more, most likely the site is gone for good. The rest of the XEX format suggests strongly that it was basically built on top of existing PE image loader that existed in the OS. The compression used in the XEX is either simple block based compression or a variation of LZ compression. Both were identified and reversed years ago by people trying to break the Xbox360 anti-piracy protection.

-Any way, after dealing with those two bumps on the road and unpacking the "internal EXE" from the XEX we follow normal disassembly procedure. In general case we end up with a list of sections:
+Anyway, after dealing with those two bumps on the road and unpacking the "internal EXE" from the XEX we follow normal disassembly procedure. In the general case we end up with a list of sections:

 ```
 .rdata 0x00000400-0x0004C100 r__
@@ -113,29 +113,29 @@ Any way, after dealing with those two bumps on the road and unpacking the "inter
 .reloc 0x002A0200-0x002B2EF4 r__
 ```

-The only section that contains executable code is the .text section and only that section requires disassembly. Rest of the sections must still be loaded into the memory when we try to execute the code since data may be read/written into those addresses. For now I did not implement any data relocation so the unpacked image must be loaded exactly under it's base address. Fortunatelly on 64-bit systems this is achievable fairly easy.
+The only section that contains executable code is the `.text` section, and only that section requires disassembly. The rest of the sections must still be loaded into the memory when we try to execute the code since data may be read/written into those addresses. For now I did not implement any data relocation so the unpacked image must be loaded exactly under its base address. Fortunately on 64-bit systems this is achievable fairly easy.

 ## Disassembly

-Disassembling the PowerPC instructions is a pleasure. After we identify the .text (code) section in the PE executable the rest is straightforward. Every instruction is always 4 bytes so there is no ambigiuity like with x86 instructions and even if we don't know how to decode a particular instruction we can easily continue with the rest of the code. This allows us to have semi-working solution way sooner than with x86. See the [actual file](https://github.com/rexdex/recompiler/blob/master/dev/src/platforms/xenon/xenonDecompiler/xenonInstructionDecoding.cpp) for details.
+Disassembling the PowerPC instructions is a pleasure. After we identify the `.text` (code) section in the PE executable the rest is straightforward. Every instruction is always 4 bytes so there is no ambiguity like with x86 instructions, and even if we don't know how to decode a particular instruction we can easily continue with the rest of the code. This allows us to have semi-working solution way sooner than with x86. See the [actual file](https://github.com/rexdex/recompiler/blob/master/dev/src/platforms/xenon/xenonDecompiler/xenonInstructionDecoding.cpp) for details.

-Basically in this project I've tried 3 ways to approach the subject:
+Basically in this project I've tried three ways to approach the subject:

-- Script based (for faster iteration) - I've written a LUA script that was doing the disassembly. It was fast to iterate in small samples but very slow to run and disassemble milions of instructions in normal executables was taking minutes. Even to check if an instruction is valid instruction was taking way too much time.
+- Script-based (for faster iteration) - I've written a Lua script that was doing the disassembly. It was fast to iterate in small samples but very slow to run, and disassembling millions of instructions in normal executables was taking minutes. Even just to check if an instruction is valid was taking way too much time.

-- Data driven pattern matching - Basically an big XML with binary "rules" that were matching bit patterns and emitting instructions. This was much faster but because of the corner cases in the PowerPC instruction encoding it got messy in the end and required me to do a lot of copy-pasting. Performance wise it was fast and could work if not for a one little detail: it's not enough just to disassemble the code, we still need to extract some "metadata" out of the code (like register dependnecies, calculated jump addresses, etc). This still requires us to know a little bit about the instruction that just it's name. So, the template-based diassembler was producing an instruction named "bc" but I still had to write manual code to understand that it's a "conditional branch" and even more code to be able to evaluate this condition.
+- Data-driven pattern matching - Basically a big XML with binary "rules" that were matching bit patterns and emitting instructions. This was much faster but because of the corner cases in the PowerPC instruction encoding it got messy in the end and required me to do a lot of copy-pasting. Performance-wise it was fast and could work if not for a one little detail: it's not enough just to disassemble the code, we still need to extract some metadata out of the code (like register dependencies, calculated jump addresses, etc). This still requires us to know a little bit more about the instruction than just its name. So, the template-based disassembler was producing an instruction named `bc` but I still had to write manual code to understand that it's a conditional branch, and even more code to be able to evaluate this condition.

-- Finally I ended up with an abstract CPUInstruction class that is implemented for every instruction that CPU implements + a big ass C++ switch() to do the disassemblly. This is actually very nice and maintaintable solution.
+- Finally I ended up with an abstract `CPUInstruction` class that is implemented for every instruction that the Xbox360 CPU implements + a big ass C++ `switch()` to do the disassembly. This is actually a very nice and maintainable solution.

 The ***biggest*** and most valuable resources on this topic were the official PowerPC instruction set documentation: [Power ISA Version 2.07](http://fileadmin.cs.lth.se/cs/education/EDAN25/PowerISA_V2.07_PUBLIC.pdf)

 ## Testing the disassembler

-I had lots of bugs in the disassembler. Of course I could write an unit test for each instruction but that would just take ages. The fastest way I've found to test the correctness of the disassembly is to compare the output with something that we know works. Basically, disassembling any big PowerPC executable by IDA or any other disassemble and comparing the output of tens of milions of instructions is a very good step towards some level of trust that the disassembler is working :)
+I had lots of bugs in the disassembler. Of course I could write a unit test for each instruction but that would just take ages. The fastest way I've found to test the correctness of the disassembly is to compare the output with something that we know works. Basically, disassembling any big PowerPC executable by IDA or any other disassembler and comparing the output of tens of millions of instructions is a very good step towards some level of trust that the disassembler is working. :)

-## XBox360 specific instructions
+## Xbox360-specific instructions

-Xbox360 has a special version of the PowerPC CPU that has 128 VMX registers (instead of 32 ones in the standard CPU). Those registers are used for vectorized math operations (similary to SSE). There's no way to address 128 registers in normal PowerPC instructions because there are only 5 bits delegated to indicate the register index in every instruction and this pattern is CPU-wide. Unfortunatelly, the opcodes for those special instructions are not avaiable on the internet (or are buried deeply). I ended up reversing the opcodes manually by observing the bit patters in the generated listing files. There's a simple tool I've wrote for that [XPrint](https://github.com/rexdex/recompiler/blob/master/dev/tools/xprint/XOpPrint.cpp). Typical output of a decoded instruction bit pattern looks like this:
+Xbox360 has a special version of the PowerPC CPU that has 128 VMX registers (instead of 32 ones in the standard CPU). Those registers are used for vectorized math operations (similarly to SSE). There's no way to address 128 registers in normal PowerPC instructions because there are only 5 bits delegated to indicate the register index in every instruction and this pattern is CPU-wide. Unfortunately, the opcodes for those special instructions are not available on the Internet (or are buried deeply). I ended up reversing the opcodes manually by observing the bit patterns in the generated listing files. There's a simple tool I've written for that called [XPrint](https://github.com/rexdex/recompiler/blob/master/dev/tools/xprint/XOpPrint.cpp). Typical output of a decoded instruction bit pattern looks like this:

 ```
 Instruction 'vxor128' (3 params)
@@ -165,9 +165,9 @@ Instruction 'vxor128' (3 params)
 Variant vr97: ------00 00100001 00001*-- --*-**11
 ```

-## Abstract instruction
+## Abstract instructions

-The result of disassembly process is an "unpacked instruction". The most useful thing is that the opcode and operands are unpacked so an easy "ToString" method can be written for presentation purposes. Surprisingly, this structure captures a lot of quirks of not only the PowerPC instructions but Intel as well. On PowerPC the operand type is closely related to the particular instruction type (add vs addi, etc). On Intel this is not the case and the same instruction (add) may be used with immediate value as well as memory location, etc. To capture this generalization the Operand structure is introduced.
+The result of disassembly process is an unpacked instruction. The most useful thing is that the opcode and operands are unpacked so an easy `ToString` method can be written for presentation purposes. Surprisingly, this structure captures a lot of quirks of not only the PowerPC instructions but Intel as well. On PowerPC the operand type is closely related to the particular instruction type (`add` vs `addi`, etc). On Intel this is not the case and the same instruction (`add`) may be used with immediate value as well as memory location, etc. To capture this generalization the `Operand` structure is introduced.

 ```c++
 class Instruction
@@ -231,7 +231,7 @@ protected:
 };
 ```

-In practice, the unpacked format is not good enough for many operations. After the disassembly is completed more work is needed to get the code to a useful state than just unpacking. For example we need to identify the "blocks" - places in the code where execution enters a particular linear set of instruction that are going to be executed without interuptions until a "jump" or "call" to another block. It's nice and useful to abstract this instruction concept a little bit more. This is done by the following class:
+In practice, the unpacked format is not good enough for many operations. After the disassembly is completed, more work is needed to get the code to a useful state than just unpacking. For example we need to identify the "blocks"—places in the code where execution enters a particular linear set of instructions that are going to be executed without interruptions until a `jump` or `call` to another block. It's nice and useful to abstract this instruction concept a little bit more. This is done by the following class:

 ```c++
 class InstructionExtendedInfo
@@ -296,17 +296,17 @@ public:
 };
 ```

-By filling in this class the decompiler can express much more about the instruction - ie. what is it going to do with the control flow of the program or what registeres are being read/written by it. Each actual PowerPC instruction has it's opcode class that is able to transform the Instruction into the InstructionExtendedInfo.
+By filling in this class the decompiler can express much more about the instruction—i.e. what is it going to do with the control flow of the program or what registers are being read/written by it. Each actual PowerPC instruction has its opcode class that is able to transform the `Instruction` into the `InstructionExtendedInfo`.

 ## Blocks

-After all instructions are disassembled it's important to identify blocks of instructions that can have known single place of entry. This is done by analyzing all the "call" and "jump" instructions that can be resolved. This is not foolproof as it does not identify properly the indirect calls (vtable, function pointers) and indirect jumps (switch statements). The more knowledge about a block we have and the more certainty about the points of entry the faster code we will be able to generate.
+After all instructions are disassembled it's important to identify blocks of instructions that can have a known single point of entry. This is done by analyzing all the `call` and `jump` instructions that can be resolved. This is not foolproof as it does not properly identify the indirect calls (vtable, function pointers) and indirect jumps (switch statements). The more knowledge about a block we have and the more certainty about the points of entry the faster code we will be able to generate.

 ![DolphinDemoScreenshot](/_images/xex_decompiled.jpg)

 ## Recompilation

-After all of the code is disassembled we can start to decompile it into logically equivalent representation. The simple trick here is to realize that for the sake of just getting it to work we don't need to convert the code into any high-level language. What matters is to get exactly the same execution results. The CPU state is represented as a structure:
+After all of the code is disassembled we can start to decompile it into a logically equivalent representation. The simple trick here is to realize that for the sake of just getting it to work we don't need to convert the code into any high-level language. What matters is to get exactly the same execution results. The CPU state is represented as a structure:

 ```c++
 class CpuRegs : public runtime::RegisterBank
@@ -340,7 +340,7 @@ public:
 };
 ```

-All the PowerPC instructions are rewritten as a heavily templatized and inlined C++ functions:
+All the PowerPC instructions are rewritten as heavily templatized and inlined C++ functions:

 ```c++
 // addic - add immediate with the update of the carry flag
@@ -353,13 +353,13 @@ static inline void addic( CpuRegs& regs, TReg* out, const TReg a, const uint32 i
 }
 ```

-Finally all the blocks that were identified are transformed 1-1 into equivalent block functions. Blocks function signature is following:
+Finally all the blocks that were identified are transformed 1:1 into equivalent block functions. The block function signature is:

 ```c++
 uint64 __fastcall _code__block82060508( uint64 ip, cpu::CpuRegs& regs )
 ```

-It takes the current IP (instruction pointer) directly as the argument + the current CPU state expressed by the "regs". The returned value represents next address to execute. This is the lowest (slowest) code generation level that we have. In this mode we are putting ***ALL BURDEN*** of optimizing this to final assembly to the target compiler. Suprisingly, even using this naive approach most of the recompiled executables are running suprisingly well. Typical block looks like that:
+It takes the current IP (instruction pointer) directly as the argument + the current CPU state expressed by the `regs`. The returned value represents next address to execute. This is the lowest (slowest) code generation level that we have. In this mode we are putting ***all the burden*** of optimizing this into final assembly on the target compiler. Surprisingly, even using this naïve approach most of the recompiled executables are running surprisingly well. A typical block looks like this:

 ```c++
 //////////////////////////////////////////////////////
@@ -395,7 +395,7 @@ uint64 __fastcall _code__block82060508( uint64 ip, cpu::CpuRegs& regs )
 }
 ```

-In case the block is confirmed to be single point of entry we can use following simplified form:
+In cases where the block is confirmed to have a single point of entry, we can use the following simplified form:

 ```c++
 //////////////////////////////////////////////////////
@@ -427,9 +427,9 @@ uint64 __fastcall _code__block82060508( uint64 ip, cpu::CpuRegs& regs )
 }
 ```

-There are more optimization steps possible that I'm currently working on - for example if all blocks in a function are "well behaved" - no indirect jumps are found and the function follows ABI rules - clear preamble can be identified + all return statements have proper cleanup code, then we can promote the whole function to a single C++ function pulling all blocks inside + defining all VOLATILE registers inside the function (on stack) and not using the ones in the *regs* structure.
+There are more optimization steps possible that I'm currently working on. For example, if all blocks in a function are "well behaved"—no indirect jumps are found and the function follows ABI rules—a clear preamble can be identified + all return statements have proper cleanup code, then we can promote the whole function to a single C++ function pulling all blocks inside + defining all VOLATILE registers inside the function (on stack) and not using the ones in the `regs` structure.

-Next optimization step can occur when two "well behaved" functions are calling each other. Then, instead of going through the generic call via the returned "next instruction address" we can generate code like this:
+Next optimization step can occur when two "well behaved" functions are calling each other. Then, instead of going through the generic call via the returned "next instruction address", we can generate code like this:

 ```c++
 uint64 __fastcall _code__func82060508( uint64 ip, cpu::CpuRegs& regs )
@@ -449,9 +449,9 @@ uint64 __fastcall _code__func82060508( uint64 ip, cpu::CpuRegs& regs )

 This again makes the generated code faster.

-## The thread and the thread execution.
+## The thread and the thread execution

-All the generated blocks are then compiled using standard C++ compiler and produce a DLL. Pointers to all block functions are then registered into a "block table". Block table allows easily to retrieve the block that will contain the code for given IP (Instruction Pointer). Finally the core of the simulated CPU thread boils down to this function:
+All the generated blocks are then compiled using a standard C++ compiler and produce a DLL. Pointers to all block functions are then registered into a "block table". Block table allows to easily retrieve the block that will contain the code for given IP (Instruction Pointer). Finally the core of the simulated CPU thread boils down to this function:

 ```c++
 void CodeExecutor::Run()
@@ -467,9 +467,9 @@ void CodeExecutor::Run()

 ## Imported functions

-The XEX image contains import of functions from another modules. Unlink the quite common "named" imports, the ones in the XEX executable are only ordinal based. A table is required that contans the "human readable" names of the functions as well as their ordering in the given lib. See [here](https://github.com/rexdex/recompiler/blob/master/dev/src/platforms/xenon/xenonDecompiler/Recompiler.Xenon.Platform.exports).
+The XEX image contains imports of functions from another modules. Unlike the quite common "named" imports, the ones in the XEX executable are only ordinal based. A table is required that contains the human-readable names of the functions as well as their ordering in the given lib. See [here](https://github.com/rexdex/recompiler/blob/master/dev/src/platforms/xenon/xenonDecompiler/Recompiler.Xenon.Platform.exports).

-When we load an image for decompiled executable we can patch the entries in the block table for given import stubs with a C++ reimplementation of that function. There is still the same and we have to "unpack" the arguments from the registers manually. For example:
+When we load an image for decompiled executable we can patch the entries in the block table for given import stubs with a C++ reimplementation of that function. There is still the same and we have to unpack the arguments from the registers manually. For example:

 ```c++
 uint64 __fastcall XboxThreads_KeDelayExecutionThread( uint64 ip, cpu::CpuRegs& regs )
@@ -486,9 +486,9 @@ uint64 __fastcall XboxThreads_KeDelayExecutionThread( uint64 ip, cpu::CpuRegs& r
 }
 ```

-It takes around 300 functions to get the simple app to start. Most of the are very similar (if not exactly the same) as the Windows counterparts. The rest is mostly guess work.
+It takes around 300 functions to get the simple app to start. Most of them are very similar (if not exactly the same) as the Windows counterparts. The rest is mostly guesswork.

 ## Future Work

-Well, it would be much cooler to run an actual game, maybe some day :)
+Well, it would be much cooler to run an actual game... Maybe some day. :)