diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000..0e4d02f2 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,32 @@ +# Contributing + +Your contributions are very much appreciated! If you want to work on this tool, we recommend you do the following: +1. Set up a virtual environment in this directory. +2. Install this project within itself in editable mode: `pip install -e .` +3. Install the dev requirements: `pip install -r requirements-tests.txt` + +If you also have a decompilation project, we recommend the following: +1. Set up a _separate_ virtual environment in your decompilation project. +2. Inside that virtual environment, `pip install -e path/to/your/local/reccmp/repository`. + +This way, you can easily run your latest `reccmp` changes against your decompilation project. + +## Testing + +`isledecomp` comes with a suite of tests based on `pytest`. A number of them can be run out of the box: +```bash +pytest . +``` + +As of this writing, some of the tests still depend on the [LEGO Island decompilation project](https://github.com/isledecomp/isle). You will need a copy of the _original_ binaries for LEGO Island in order to execute all tests. This can be done by +```bash +pytest . --lego1=/path/to/LEGO1.DLL +``` + +## Linting and formatting + +In order to keep the Python code clean and consistent, we use `pylint` and `black`: + +* Run `pylint`: `pylint reccmp` +* Check formatting without making changes: `black --check reccmp` +* Apply formatting: `black reccmp` diff --git a/README.md b/README.md index 550b4311..bf21d796 100644 --- a/README.md +++ b/README.md @@ -1,255 +1,76 @@ -# LEGO Island Decompilation Tools +# Reccmp Decompilation Toolchain -Accuracy to the game's original code is the main goal of the [decompilation project](https://github.com/isledecomp/isle). 
To facilitate the decompilation effort and maintain overall quality, we have devised a set of annotations, to be embedded in the source code, which allow us to automatically verify the accuracy of re-compiled functions' assembly, virtual tables, variable offsets and more. - -In order for contributions to be accepted, the annotations must be used in accordance to the rules outlined here. Proper use is enforced by [GitHub Actions](/.github/workflows) which run the Python tools found in this folder. It is recommended to integrate these tools into your local development workflow as well. - -# Overview - -We are continually working on extending the capabilities of our "decompilation language" and the toolset around it. Some of the following annotations have not made it into formal verification and thus are not technically enforced on the source code level yet (marked as **WIP**). Nevertheless, it is recommended to use them since it is highly likely they will eventually be fully integrated. - -## Functions - -All non-inlined functions in the code base with the exception of [3rd party code](https://github.com/isledecomp/isle/tree/master/3rdparty) must be annotated with one of the following markers, which include the module name and address of the function as found in the original binaries. This information is then used to compare the recompiled assembly with the original assembly, resulting in an accuracy score. Functions in a given compilation unit must be ordered by their address in ascending order. - -The annotations can be attached to the function implementation, which is the most common case, or use the "comment" syntax (see examples below) for functions that cannot be referred to directly (such as templated, synthetic or non-inlined inline functions). The latter should only ever appear in `.h` files. - -### `FUNCTION` - -Functions with a reasonably complete implementation which are not templated or synthetic (see below) should be annotated with `FUNCTION`. 
- -``` +`reccmp` (recompilation comparison) is a collection of tools for decompilation projects. It was born from the [decompilation of LEGO Island](https://github.com/isledecomp/isle). Functions and data are matched based on comments in the source code. For example: +```cpp // FUNCTION: LEGO1 0x100b12c0 MxCore* MxObjectFactory::Create(const char* p_name) { // implementation } - -// FUNCTION: LEGO1 0x100140d0 -// MxCore::IsA -``` - -### `STUB` - -Functions with no or a very incomplete implementation should be annotated with `STUB`. These will not be compared to the original assembly. - -``` -// STUB: LEGO1 0x10011d50 -LegoCameraController::LegoCameraController() -{ - // TODO -} -``` - -### `TEMPLATE` - -Templated functions should be annotated with `TEMPLATE`. Since the goal is to eventually have a full accounting of all the functions present in the binaries, please make an effort to find and annotate every function of a templated class. - ``` -// TEMPLATE: LEGO1 0x100c0ee0 -// list >::_Buynode +This allows you to automatically verify the accuracy of re-compiled functions, virtual tables, variable offsets and more. See [here](docs/annotations.md) for the full syntax. -// TEMPLATE: LEGO1 0x100c0fc0 -// MxStreamListMxDSSubscriber::~MxStreamListMxDSSubscriber +At the moment, C++ compiled to 32-bit x86 with old versions of MSVC (like 4.20) is supported. Work on support for newer MSVC versions is in progress - testing and bug reports are greatly appreciated. Other compilers, languages and architectures are not supported at the moment, but feel free to contribute if you wish to do so! -// TEMPLATE: LEGO1 0x100c1010 -// MxStreamListMxDSAction::~MxStreamListMxDSAction -``` - -### `SYNTHETIC` - -Synthetic functions should be annotated with `SYNTHETIC`. A synthetic function is generated by the compiler; most common is the "scalar deleting destructor" found in virtual tables. Other cases include default destructors and assignment operators. 
Note: `SYNTHETIC` takes precedence over `TEMPLATE`. - -``` -// SYNTHETIC: LEGO1 0x10003210 -// Helicopter::`scalar deleting destructor' - -// SYNTHETIC: LEGO1 0x100c4f50 -// MxCollection::`scalar deleting destructor' - -// SYNTHETIC: LEGO1 0x100c4fc0 -// MxList::`scalar deleting destructor' -``` - -### `LIBRARY` - -Functions located in 3rd party libraries should be annotated with `LIBRARY`. Since the goal is to eventually have a full accounting of all the functions present in the binaries, please make an effort to find and annotate every function of every statically linked library, including the MSVC standard libraries. - -``` -// LIBRARY: ISLE 0x4061b0 -// _MemPoolInit@4 +## Getting started -// LIBRARY: ISLE 0x406520 -// _MemPoolSetPageSize@8 +### Installing / upgrading `reccmp` +1. (Recommended) Set up and activate a virtual Python environment in the directory of your recompilation project (the exact steps depend on your operating system and shell). +2. Install `reccmp`: `pip install git+https://github.com/isledecomp/reccmp` -// LIBRARY: ISLE 0x406630 -// _MemPoolSetBlockSizeFS@8 -``` +The next steps differ based on what kind of project you have. -## Virtual tables +### Contributing to a project that already uses `reccmp` +1. Compile the C++ project. +2. Run `reccmp-project detect --search-path path/to/folder/with/original/binaries`. +3. If there is no `reccmp-build.yml` after building, navigate to the recompiled binaries folder and run `reccmp-project detect --what recompiled`. +4. Look into `reccmp-project.yml` to see what the target is called. +5. Run `reccmp-reccmp --target `. You should see a list of functions and other annotated items together with their match percentages. -Classes with a virtual table should be annotated using the `VTABLE` marker, which includes the module name and address of the virtual table. Additionally, virtual function declarations should be annotated with a comment indicating their relative offset. Please use the following example as a reference. 
+### Setting up an existing decompilation project that has not used `reccmp` before -``` -// VTABLE: LEGO1 0x100dc900 -class MxEventManager : public MxMediaManager { -public: - MxEventManager(); - virtual ~MxEventManager() override; - - virtual void Destroy() override; // vtable+0x18 - virtual MxResult Create(MxU32 p_frequencyMS, MxBool p_createThread); // vtable+0x28 -``` +1. Run `reccmp-project create --originals path/to/original --scm`. This generates two files, `reccmp-project.yml` and `reccmp-user.yml`; the latter will automatically be added to the `.gitignore`. +2. Annotate one function of your existing project as shown above and recompile. Note that the recompiled binary should have the same file name as the original. +3. Navigate to your recompiled binary and run `reccmp-project detect --what recompiled`. A file `reccmp-build.yml` will be generated. This file should also be user-specific (see below on how to have the build toolchain generate it automatically). +4. Look into `reccmp-project.yml` to see what the target is called. +5. Run `reccmp-reccmp --target ` from the same directory. If all goes well, you will see the match percentage of the function you annotated above. -## Class size +### Fresh project -Classes should be annotated using the `SIZE` marker to indicate their size. If you are unsure about the class size in the original binary, please use the currently available information (known member variables) and detail the circumstances in an extra comment if necessary. +1. Run `reccmp-project create --originals path/to/original/binary --cmake-project` +2. You will see a lot of new files. Set up your C++ compiler and compile the project defined by `CMakeLists.txt`, ideally into a sub-directory like `./build`. Advice on building with old MSVC versions can be found at the [LEGO Island Decompilation project](https://github.com/isledecomp/isle). +3. Look into `reccmp-project.yml` to see what the target is called. +4. 
Navigate to the build directory and run `reccmp-reccmp --target `. -``` -// SIZE 0x1c -class MxCriticalSection { -public: - MxCriticalSection(); - ~MxCriticalSection(); - static void SetDoMutex(); -``` +## Tooling -Furthermore, add `DECOMP_SIZE_ASSERT(MxCriticalSection, 0x1c)` to the respective `.cpp` file (if the class has no dedicated `.cpp` file, use any appropriate `.cpp` file where the class is used). - -## Member variables - -Member variables should be annotated with their relative offsets. - -``` -class MxDSObject : public MxCore { -private: - MxU32 m_sizeOnDisk; // 0x8 - MxU16 m_type; // 0xc - char* m_sourceName; // 0x10 - undefined4 m_unk0x14; // 0x14 -``` - -## Global variables - -Global variables should be annotated using the `GLOBAL` marker, which includes the module name and address of the variable. - -``` -// GLOBAL: LEGO1 0x100f456c -MxAtomId* g_jukeboxScript = NULL; - -// GLOBAL: LEGO1 0x100f4570 -MxAtomId* g_pz5Script = NULL; - -// GLOBAL: LEGO1 0x100f4574 -MxAtomId* g_introScript = NULL; -``` - -## Strings - -String values should be annotated using the `STRING` marker, which includes the module name and address of the string. - -``` -inline virtual const char* ClassName() const override // vtable+0x0c -{ - // STRING: LEGO1 0x100f03fc - return "Act2PoliceStation"; -} -``` - -# Tooling - -Use `pip` to install the required packages to be able to use the Python tools found in this folder: - -``` -pip install -e . -``` - -All scripts will become available to use in your terminal with the `reccmp-` prefix. The example usages below assume that the retail binaries have been copied to `./legobin`. +All scripts will become available to use in your terminal with the `reccmp-` prefix. Note that these scripts need to be executed in the directory where `reccmp-build.yml` is located. * [`decomplint`](/reccmp/tools/decomplint.py): Checks the decompilation annotations (see above) * e.g. 
`reccmp-decomplint --module LEGO1 LEGO1` -* [`isledecomp`](/reccmp/isledecomp): A library that implements a parser to identify the decompilation annotations (see above) -* [`reccmp`](/reccmp/reccmp): Compares an original binary with a recompiled binary, provided a PDB file. For example: - * Display the diff for a single function: `reccmp-reccmp --verbose 0x100ae1a0 legobin/LEGO1.DLL build/LEGO1.DLL build/LEGO1.PDB .` - * Generate an HTML report: `reccmp-reccmp --html output.html legobin/LEGO1.DLL build/LEGO1.DLL build/LEGO1.PDB .` - * Create a base file for diffs: `reccmp-reccmp --json base.json --silent legobin/LEGO1.DLL build/LEGO1.DLL build/LEGO1.PDB .` - * Diff against a base file: `reccmp-reccmp --diff base.json legobin/LEGO1.DLL build/LEGO1.DLL build/LEGO1.PDB .` +* [`reccmp`](/reccmp/tools/asmcmp.py): Compares an original binary with a recompiled binary, provided a PDB file. For example: + * Display the diff for a single function: `reccmp-reccmp --target LEGO1 --verbose 0x100ae1a0` + * Generate an HTML report: `reccmp-reccmp --target LEGO1 --html output.html` + * Create a base file for diffs: `reccmp-reccmp --target LEGO1 --json base.json --silent` + * Diff against a base file: `reccmp-reccmp --target LEGO1 --diff base.json` * [`stackcmp`](/reccmp/tools/stackcmp.py): Compares the stack layout for a given function that almost matches. - * e.g. `reccmp-stackcmp legobin/BETA10.DLL build_debug/LEGO1.DLL build_debug/LEGO1.pdb . 0x1007165d` + * e.g. `reccmp-stackcmp --target BETA10 0x1007165d` * [`roadmap`](/reccmp/tools/roadmap.py): Compares symbol locations in an original binary with the same symbol locations of a recompiled binary * [`verexp`](/reccmp/tools/verexp.py): Verifies exports by comparing the exports of the original DLL and the recompiled DLL * [`vtable`](/reccmp/tools/vtable.py): Asserts virtual table correctness by comparing a recompiled binary with the original - * e.g. `reccmp-vtable legobin/LEGO1.DLL build/LEGO1.DLL build/LEGO1.PDB .` + * e.g. 
`reccmp-vtable --target LEGO1` * [`datacmp`](/reccmp/tools/datacmp.py): Compares global data found in the original with the recompiled version - * e.g. `reccmp-datacmp legobin/LEGO1.DLL build/LEGO1.DLL build/LEGO1.PDB .` + * e.g. `reccmp-datacmp --target LEGO1` -## Testing +## Ghidra Import -`isledecomp` comes with a suite of tests. Install `requirements-tests.txt` and run it like this: +There are existing scripts to import the information from the decompilation into [Ghidra](https://github.com/NationalSecurityAgency/ghidra). See the relevant [README](reccmp/ghidra_scripts/README.md) for additional information. -``` -pip install -r requirements-tests.txt -pytest . -``` - -## Tool Development - -In order to keep the Python code clean and consistent, we use `pylint` and `black`: - -`pip install -r requirements-tests.txt` - -### Run pylint (ignores build and virtualenv) - -`pylint reccmp` - -### Check Python code formatting without rewriting files - -`black --check reccmp` - -### Apply Python code formatting - -`black reccmp` - -# Modules -The following is a list of all the modules found in the annotations (e.g. `// FUNCTION: [module] [address]`) and which binaries they refer to. See [this list of all known versions of the game](https://www.legoisland.org/wiki/LEGO_Island#Download). - -## Retail v1.1.0.0 (v1.1) -* `LEGO1` -> `LEGO1.DLL` -* `CONFIG`-> `CONFIG.EXE` -* `ISLE` -> `ISLE.EXE` - -These modules are the most important ones and refer to the English retail version 1.1.0.0 (often shortened to v1.1), which is the most widely released one. These are the ones we attempt to decompile and match as best as possible. - -## BETA v1.0 - -* `BETA10` -> `LEGO1D.DLL` - -The Beta 1.0 version contains a debug build of the game. 
While it does not have debug symbols, it still has a number of benefits: -* It is built with less or no optimisation, leading to better decompilations in Ghidra -* Far fewer functions are inlined by the compiler, so it can be used to recognise inlined functions -* It contains assertions that tell us original variable names and code file paths - -It is therefore advisable to search for the corresponding function in `BETA10` when decompiling a function in `LEGO1`. Finding the correct function can be tricky, but is usually worth it, especially for longer functions. - -Unfortunately, some code has been changed after this beta version was created. Therefore, we are not aiming for a perfect binary match of `BETA10`. In case of discrepancies, `LEGO1` (as defined above) is our "gold standard" for matching. - -### Re-compiling a beta build (**WIP**) - -If you want to match the code against `BETA10`, use the following `cmake` setup to create a debug build: -``` -cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_BUILD_TYPE=Debug -DISLE_USE_SMARTHEAP=OFF -``` -**TODO**: If you can figure out how to make a debug build with SmartHeap enabled, please add it here. +## Best practices -If you want to run scripts to compare your debug build to `BETA10` (e.g. `reccmp`), it is advisable to add a copy of `LEGO1D.DLL` to `/legobin` and rename it to `BETA10.DLL`. +We have established some best practices that have no impact on `reccmp`'s output, but have made a positive impact on the LEGO Island decompilation. We have listed them [here](docs/recommendations.md) for convenience. -### Finding matching functions -This is not a recipe, but rather a list of things you can try. -* If you are working on a virtual function in a class, try to find the class' vtable. Many (but not all) classes implement `ClassName()`. These functions are usually easy to find by searching the memory for the string consisting of the class name. 
Keep in mind that not all child classes overwrite this function, so if the function you found is used in multiple vtables (or if you found multiple `ClassName()`-like functions), make sure you actually have the parent's vtable. -* If that does not help, you can try to walk up the call tree and try to locate a function that calls the function you are interested in. -* Assertions can also help you - most `.cpp` file names have already been matched based on `BETA10`, so you can search for the name of your `.cpp` file and check all the assertions in that file. While that does not find all functions in a given source file, it usually finds the more complex ones. -* _If you have found any other strategies, please add them here._ +## Contributing -## Others (**WIP**) -* `ALPHA` (only used twice) +Feel free to contribute to this project if you are interested! More information can be found at [CONTRIBUTING.md](./CONTRIBUTING.md). diff --git a/docs/annotations.md b/docs/annotations.md new file mode 100644 index 00000000..9ae8bc3e --- /dev/null +++ b/docs/annotations.md @@ -0,0 +1,166 @@ +# Annotations + +The following describes how the source code of the recompilation can be annotated such that `reccmp` can compare the recompilation to the original binary. + +All annotations are of the form +```c++ +// :
+``` +For example, +```c++ +// FUNCTION: LEGO1 0x100b12c0 +``` +refers to a function at address `0x100b12c0` in the build target aliased by `LEGO1` (since it is possible to build different targets from the same source code). + + +## Functions + +Functions can be annotated by one of the markers below. Each marker contains the address of the function as found in the original binaries. This information is then used to compare the recompiled assembly with the original assembly, resulting in an accuracy score. + +Note that functions in a given compilation unit must be ordered by their address in ascending order. + +Function annotations can have multiple different types, which are explained below. + +There are three ways to annotate a function: + +### Annotating the implementation +The preferable way is to annotate the implementation directly. For example: +```c++ +// FUNCTION: LEGO1 0x100b12c0 +MxCore* MxObjectFactory::Create(const char* p_name) +{ + // implementation +} +``` + +### Annotating a comment of the function name + +There are situations where the previous kind of annotation is not possible. Typical examples are: +- templated functions +- synthetic functions (generated by the compiler) +- library functions (like the C++ standard library) +- non-inlined inline functions + +In those cases, one can spell out the function's name in a comment: +```c++ +// TEMPLATE: LEGO1 0x100c4f50 +// MxCollection::`scalar deleting destructor' +``` + +### Annotating a comment of the function's symbol + +There are a few cases where two functions of the same name need to be annotated by comment (e.g. in function overloads). In such cases, you can annotate a comment of the function's debug symbol: +```c++ +// TEMPLATE: LEGO1 0x10035790 +// ?_Construct@@YAXPAPAVROI@@ABQAV1@@Z +``` + +### Annotation types + +#### `FUNCTION` + +Functions with a reasonably complete implementation which are not templated or synthetic (see below) should be annotated with `FUNCTION`. 
It is preferable to annotate the function's implementation directly. + +#### `STUB` + +Functions with no or a very incomplete implementation should be annotated with `STUB`. These will not be compared to the original assembly. + +```c++ +// STUB: LEGO1 0x10011d50 +LegoCameraController::LegoCameraController() +{ + // TODO +} +``` + +#### `TEMPLATE` + +Templated functions should be annotated with `TEMPLATE`. Since the goal is to eventually have a full accounting of all the functions present in the binaries, please make an effort to find and annotate every function of a templated class. + +```c++ +// TEMPLATE: LEGO1 0x100c0ee0 +// list >::_Buynode + +// TEMPLATE: LEGO1 0x100c0fc0 +// MxStreamListMxDSSubscriber::~MxStreamListMxDSSubscriber + +// TEMPLATE: LEGO1 0x100c1010 +// MxStreamListMxDSAction::~MxStreamListMxDSAction +``` + +#### `SYNTHETIC` + +Synthetic functions should be annotated with `SYNTHETIC`. A synthetic function is generated by the compiler; most common is the "scalar deleting destructor" found in virtual tables. Other cases include default destructors and assignment operators. Note: `SYNTHETIC` takes precedence over `TEMPLATE`. + +```c++ +// SYNTHETIC: LEGO1 0x10003210 +// Helicopter::`scalar deleting destructor' + +// SYNTHETIC: LEGO1 0x100c4f50 +// MxCollection::`scalar deleting destructor' + +// SYNTHETIC: LEGO1 0x100c4fc0 +// MxList::`scalar deleting destructor' +``` + +#### `LIBRARY` + +Functions located in 3rd party libraries should be annotated with `LIBRARY`. This can be useful for working towards a full accounting of all the functions present in the binaries. 
+ +```c++ +// LIBRARY: ISLE 0x4061b0 +// _MemPoolInit@4 + +// LIBRARY: ISLE 0x406520 +// _MemPoolSetPageSize@8 + +// LIBRARY: ISLE 0x406630 +// _MemPoolSetBlockSizeFS@8 +``` + + +## Virtual tables + +Classes with a virtual table should be annotated using the `VTABLE` marker, which includes the module name and address of the virtual table: +```c++ +// VTABLE: LEGO1 0x100dc900 +class MxEventManager : public MxMediaManager { + // ... +}; +``` + +## Global variables + +Global variables should be annotated using the `GLOBAL` marker, which includes the module name and address of the variable. + +```c++ +// GLOBAL: LEGO1 0x100f456c +MxAtomId* g_jukeboxScript = NULL; + +// GLOBAL: LEGO1 0x100f4570 +MxAtomId* g_pz5Script = NULL; + +// GLOBAL: LEGO1 0x100f4574 +MxAtomId* g_introScript = NULL; +``` + +## Strings + +String values should be annotated using the `STRING` marker, which includes the module name and address of the text content. Note that this is usually not required since most strings can be auto-detected. If you want, you can use this for bookkeeping, but it will usually not affect the `reccmp` match. + +```c++ +inline virtual const char* ClassName() const override // vtable+0x0c +{ + // STRING: LEGO1 0x100f03fc + return "Act2PoliceStation"; +} +``` + +String constants can have a distinct `STRING` and `GLOBAL` address at the same time. The `STRING` points at the actual text while the `GLOBAL` is a _pointer_ to the text: +```c++ +// GLOBAL: LEGO1 0x10102048 +// STRING: LEGO1 0x10102040 +const char* g_strACTION = "ACTION"; +``` + +In this example, there is an `A` at address `0x10102040` and a 32-bit pointer to `0x10102040` at address `0x10102048`. diff --git a/docs/project_files.md b/docs/project_files.md new file mode 100644 index 00000000..2b203215 --- /dev/null +++ b/docs/project_files.md @@ -0,0 +1,31 @@ +# Project Files + +The configuration of `reccmp` requires three different files. 
As explained in the [main README](../README.md), `reccmp-project` can be used to generate each of them. + +* `reccmp-project.yml` contains the main configuration. We recommend that you keep this file at the root of your repository and add it to your VCS (like `git`). +* `reccmp-user.yml` contains information that may differ from user to user, like the location of the original binary files. We recommend that you exclude this file from your VCS and keep it at the root of your repository. +* `reccmp-build.yml` contains information that may differ in each recompilation, like the location of the recompiled binary and debug symbol file. We recommend that you exclude this file from your VCS. + * If the names or paths of your build artifacts change, we recommend you generate this file as part of your build process. + * If they do not, you can generate this file once and keep it in your build directory or at the repository root. + * Note that as of this writing, the Ghidra import needs to have a `reccmp-build.yml` at the repository root. + + +## Additional information in `reccmp-project.yml` + +> See the relevant [Python file](../reccmp/project/config.py) in case this documentation is outdated. + +Some additional information can be added to `reccmp-project.yml` by hand. For example: +```yml +targets: + BETA10: + filename: BETA10.DLL + source-root: LEGO1 + hash: + sha256: ... + ghidra: + ignore-types: + - Act2Actor + ignore-functions: + - 0x100f8ad0 +``` +This tells the Ghidra import script to ignore certain types and functions. diff --git a/docs/recommendations.md b/docs/recommendations.md new file mode 100644 index 00000000..dfdc5322 --- /dev/null +++ b/docs/recommendations.md @@ -0,0 +1,68 @@ +# Recommendations + +The following is a list of recommendations and best practices we have established at the [LEGO Island Decompilation project](https://github.com/isledecomp/isle). They do not affect the output of `reccmp`, so whether you adopt them is up to you. 
+ +## Class/struct size annotation and assertion + +Once we have a reasonable guess for the size of a class or struct, we add it in a comment like so: +```c++ +// SIZE 0x1c +class MxCriticalSection { +public: + MxCriticalSection(); + ~MxCriticalSection(); + static void SetDoMutex(); + // ... +}; +``` +Furthermore, we use a compile-time assertion to verify that the recompiled size is correct (see also [this file](https://github.com/isledecomp/isle/blob/82453f62d84f979f8a6fc7b46e21b61cb835d2f1/util/decomp.h)): +```c++ +#define DECOMP_STATIC_ASSERT(V) \ + namespace \ + { \ + typedef int foo[(V) ? 1 : -1]; \ + } +#define DECOMP_SIZE_ASSERT(T, S) DECOMP_STATIC_ASSERT(sizeof(T) == S) +``` +Then we add `DECOMP_SIZE_ASSERT(MxCriticalSection, 0x1c)` to the respective `.cpp` file (if the class has no dedicated `.cpp` file, we use any appropriate `.cpp` file where the class is used). + +## Member variables + +We annotate member variables with their relative offsets. + +```c++ +class MxDSObject : public MxCore { +private: + MxU32 m_sizeOnDisk; // 0x08 + MxU16 m_type; // 0x0c + char* m_sourceName; // 0x10 + undefined4 m_unk0x14; // 0x14 + // ... +}; +``` + +## VTable members + +In addition to the `VTABLE` annotation (which is relevant to `reccmp`), we also add comments to indicate the relative offset of each function: +```c++ +// VTABLE: LEGO1 0x100dc900 +class MxEventManager : public MxMediaManager { +public: + MxEventManager(); + virtual ~MxEventManager() override; + + virtual void Destroy() override; // vtable+0x18 + virtual MxResult Create(MxU32 p_frequencyMS, MxBool p_createThread); // vtable+0x28 + // ... +}; +``` + +## Aliases for unknown scalar types + +In order to distinguish known from unknown types, we have added the following typedefs: +```c++ +typedef unsigned char undefined; +typedef unsigned short undefined2; +typedef unsigned int undefined4; +``` +Note that the behaviour of signed and unsigned integers can be different even when no arithmetic is involved. 
If changing e.g. from `undefined4` to `int` improves the match, this is a strong indicator that the original variable was signed as well. \ No newline at end of file diff --git a/reccmp/project/create.py b/reccmp/project/create.py index bd541199..0eddbd89 100644 --- a/reccmp/project/create.py +++ b/reccmp/project/create.py @@ -251,16 +251,18 @@ def create_project( if scm: # Update existing .gitignore to skip build.yml and user.yml. gitignore_path = project_directory / ".gitignore" - if gitignore_path.exists(): - ignore_rules = gitignore_path.read_text().splitlines() - if RECCMP_USER_CONFIG not in ignore_rules: - logger.debug("Adding '%s' to .gitignore...", RECCMP_USER_CONFIG) - with gitignore_path.open("a") as f: - f.write(f"{RECCMP_USER_CONFIG}\n") - if RECCMP_BUILD_CONFIG not in ignore_rules: - logger.debug("Adding '%s' to .gitignore...", RECCMP_BUILD_CONFIG) - with gitignore_path.open("a") as f: - f.write(f"{RECCMP_BUILD_CONFIG}\n") + if not gitignore_path.exists(): + gitignore_path.touch() + + ignore_rules = gitignore_path.read_text().splitlines() + if RECCMP_USER_CONFIG not in ignore_rules: + logger.debug("Adding '%s' to .gitignore...", RECCMP_USER_CONFIG) + with gitignore_path.open("a") as f: + f.write(f"{RECCMP_USER_CONFIG}\n") + if RECCMP_BUILD_CONFIG not in ignore_rules: + logger.debug("Adding '%s' to .gitignore...", RECCMP_BUILD_CONFIG) + with gitignore_path.open("a") as f: + f.write(f"{RECCMP_BUILD_CONFIG}\n") if cmake: # Generate tempalte files so you can start building each target with CMake. 
diff --git a/tests/test_project.py b/tests/test_project.py index b1876405..82f6c4b0 100644 --- a/tests/test_project.py +++ b/tests/test_project.py @@ -110,7 +110,7 @@ def test_project_creation(tmp_path_factory, binfile: PEImage): assert project.targets[target_name].target_id == target_name assert project.targets[target_name].filename == bin_path.name assert project.targets[target_name].source_root == project_root - assert not (project_root / ".gitignore").is_file() + assert (project_root / ".gitignore").is_file() assert (project_root / "CMakeLists.txt").is_file() assert (project_root / "cmake/reccmp.cmake").is_file()
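
The `.gitignore` handling changed in `reccmp/project/create.py` above can be sketched in isolation. This is a minimal illustration rather than the project's actual code: the function name `ensure_ignored` is hypothetical, and the two constant values are assumed to be the file names the README mentions (`reccmp-user.yml` and `reccmp-build.yml`). It mirrors the behavior the updated test checks for: the file is created when it is missing, and each rule is appended at most once.

```python
import tempfile
from pathlib import Path

# Assumed values: stand-ins for the constants referenced in create.py,
# taken from the file names mentioned in the README.
RECCMP_USER_CONFIG = "reccmp-user.yml"
RECCMP_BUILD_CONFIG = "reccmp-build.yml"


def ensure_ignored(project_directory: Path) -> list[str]:
    """Create .gitignore if needed and append any missing reccmp rules.

    Hypothetical helper; returns the rules added by this call.
    """
    gitignore_path = project_directory / ".gitignore"
    # New behavior in the diff: create the file instead of skipping it.
    if not gitignore_path.exists():
        gitignore_path.touch()

    # Compare against whole lines so partial matches do not count.
    ignore_rules = gitignore_path.read_text().splitlines()
    added = []
    with gitignore_path.open("a") as f:
        for rule in (RECCMP_USER_CONFIG, RECCMP_BUILD_CONFIG):
            if rule not in ignore_rules:
                f.write(f"{rule}\n")
                added.append(rule)
    return added


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project_root = Path(tmp)
        print(ensure_ignored(project_root))  # first call adds both rules
        print(ensure_ignored(project_root))  # second call adds nothing
```

Calling the function a second time adds nothing, since both rules are already present. Like the code in the diff, the sketch appends directly to the file, so it relies on an existing `.gitignore` ending with a newline.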