Annotations

This document describes how to annotate the source code of the recompilation so that reccmp can compare the recompilation to the original binary.

All annotations are of the form

// <annotation type>: <target> <address>

For example,

// FUNCTION: LEGO1 0x100b12c0

refers to a function at address 0x100b12c0 in the build target aliased by LEGO1 (since it is possible to build different targets from the same source code).
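To make the three fields of an annotation concrete, here is a minimal sketch that splits an annotation line into its type, target and address. It is purely illustrative and is not how reccmp itself parses annotations; the `Annotation` struct and the `ParseAnnotation` function are hypothetical names introduced for this example.

```cpp
#include <cstdint>
#include <regex>
#include <string>

// Hypothetical holder for the three fields of an annotation line.
struct Annotation {
    std::string type;      // e.g. "FUNCTION"
    std::string target;    // e.g. "LEGO1"
    std::uint64_t address; // e.g. 0x100b12c0
};

// Illustrative only: accepts lines of the form
//   // <annotation type>: <target> <address>
// where the address is written in hexadecimal with a 0x prefix.
// Returns true and fills `out` if the line matches, false otherwise.
bool ParseAnnotation(const std::string& line, Annotation& out)
{
    static const std::regex pattern(R"(^\s*//\s*([A-Z]+):\s*(\w+)\s+0x([0-9a-fA-F]+)\s*$)");
    std::smatch match;
    if (!std::regex_match(line, match, pattern)) {
        return false;
    }
    out.type = match[1].str();
    out.target = match[2].str();
    out.address = std::stoull(match[3].str(), nullptr, 16);
    return true;
}
```

Given the line `// FUNCTION: LEGO1 0x100b12c0`, this sketch yields the type `FUNCTION`, the target `LEGO1` and the address `0x100b12c0`.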

Functions

Functions can be annotated by one of the markers below. Each marker contains the address of the function as found in the original binaries. This information is then used to compare the recompiled assembly with the original assembly, resulting in an accuracy score.

Note that functions in a given compilation unit must be ordered by their address in ascending order.
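As a sketch of the ordering rule, the following compilation unit lists its functions by ascending address. The target alias `EXAMPLE`, the function names and the addresses are all hypothetical and do not refer to any real binary.

```cpp
// Hypothetical target, names and addresses, shown only to illustrate ordering.

// FUNCTION: EXAMPLE 0x00401000
int ExampleFirst()
{
    return 1;
}

// FUNCTION: EXAMPLE 0x00401040
int ExampleSecond()
{
    return 2;
}
```

Placing `ExampleSecond` above `ExampleFirst` would violate the rule, because 0x00401040 is greater than 0x00401000.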

Function annotations come in several types, which are explained below.

There are three ways to annotate a function:

Annotating the implementation

The preferable way is to annotate the implementation directly. For example:

// FUNCTION: LEGO1 0x100b12c0
MxCore* MxObjectFactory::Create(const char* p_name)
{
  // implementation
}

Annotating a comment of the function name

There are situations where the previous kind of annotation is not possible. Typical examples are:

  • templated functions
  • synthetic functions (generated by the compiler)
  • library functions (like the C++ standard library)
  • non-inlined inline functions

In those cases, one can spell out the function's name in a comment:

// SYNTHETIC: LEGO1 0x100c4f50
// MxCollection<MxRegionLeftRight *>::`scalar deleting destructor'

Annotating a comment of the function's symbol

There are a few cases where two functions of the same name need to be annotated by comment (e.g. in function overloads). In such cases, you can annotate a comment of the function's debug symbol:

// TEMPLATE: LEGO1 0x10035790
// ?_Construct@@YAXPAPAVROI@@ABQAV1@@Z

Annotation types

FUNCTION

Functions with a reasonably complete implementation which are not templated or synthetic (see below) should be annotated with FUNCTION. It is preferable to annotate the function's implementation directly.

STUB

Functions with no or a very incomplete implementation should be annotated with STUB. These will not be compared to the original assembly.

// STUB: LEGO1 0x10011d50
LegoCameraController::LegoCameraController()
{
  // TODO
}

TEMPLATE

Templated functions should be annotated with TEMPLATE. Since the goal is to eventually have a full accounting of all the functions present in the binaries, please make an effort to find and annotate every function of a templated class.

// TEMPLATE: LEGO1 0x100c0ee0
// list<MxNextActionDataStart *,allocator<MxNextActionDataStart *> >::_Buynode

// TEMPLATE: LEGO1 0x100c0fc0
// MxStreamListMxDSSubscriber::~MxStreamListMxDSSubscriber

// TEMPLATE: LEGO1 0x100c1010
// MxStreamListMxDSAction::~MxStreamListMxDSAction

SYNTHETIC

Synthetic functions should be annotated with SYNTHETIC. A synthetic function is one generated by the compiler; the most common example is the "scalar deleting destructor" found in virtual tables. Other cases include default destructors and assignment operators. Note: SYNTHETIC takes precedence over TEMPLATE, so a compiler-generated function of a templated class is annotated with SYNTHETIC.

// SYNTHETIC: LEGO1 0x10003210
// Helicopter::`scalar deleting destructor'

// SYNTHETIC: LEGO1 0x100c4f50
// MxCollection<MxRegionLeftRight *>::`scalar deleting destructor'

// SYNTHETIC: LEGO1 0x100c4fc0
// MxList<MxRegionLeftRight *>::`scalar deleting destructor'
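To show where such a function comes from, here is a minimal sketch of a class with a virtual destructor. With MSVC, a class like this typically gets a compiler-generated scalar deleting destructor that is referenced from its virtual table. There is no source code for that function, so it can only be annotated through a comment. The class name, the counter, the target alias `EXAMPLE` and all addresses are hypothetical.

```cpp
// Hypothetical counter used only to observe that the destructor ran.
int g_exampleDestroyed = 0;

// VTABLE: EXAMPLE 0x00402000
class ExampleShape {
public:
    // FUNCTION: EXAMPLE 0x00401000
    virtual ~ExampleShape() { g_exampleDestroyed++; }
};

// The compiler generates this function; it has no body in the source,
// so the name is spelled out in a comment instead.
// SYNTHETIC: EXAMPLE 0x00401020
// ExampleShape::`scalar deleting destructor'
```

A `delete` expression on an `ExampleShape*` is what ends up calling the generated function, which in turn runs the hand-written destructor and frees the memory.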

LIBRARY

Functions located in third-party libraries should be annotated with LIBRARY. This is useful when working towards a full accounting of all the functions present in the binaries.

// LIBRARY: ISLE 0x4061b0
// _MemPoolInit@4

// LIBRARY: ISLE 0x406520
// _MemPoolSetPageSize@8

// LIBRARY: ISLE 0x406630
// _MemPoolSetBlockSizeFS@8

Virtual tables

Classes with a virtual table should be annotated using the VTABLE marker, which includes the module name and address of the virtual table:

// VTABLE: LEGO1 0x100dc900
class MxEventManager : public MxMediaManager {
    // ...
};

Global variables

Global variables should be annotated using the GLOBAL marker, which includes the module name and address of the variable.

// GLOBAL: LEGO1 0x100f456c
MxAtomId* g_jukeboxScript = NULL;

// GLOBAL: LEGO1 0x100f4570
MxAtomId* g_pz5Script = NULL;

// GLOBAL: LEGO1 0x100f4574
MxAtomId* g_introScript = NULL;

Strings

String values should be annotated using the STRING marker, which includes the module name and address of the text content. Note that this is usually not required, since most strings can be auto-detected. You can still use it for bookkeeping, but it will usually not affect the reccmp match.

inline virtual const char* ClassName() const override // vtable+0x0c
{
	// STRING: LEGO1 0x100f03fc
	return "Act2PoliceStation";
}

String constants can have a distinct STRING and GLOBAL address at the same time. The STRING points at the actual text while the GLOBAL is a pointer to the text:

// GLOBAL: LEGO1 0x10102048
// STRING: LEGO1 0x10102040
const char* g_strACTION = "ACTION";

In this example, the character A (the first byte of the text "ACTION") is stored at address 0x10102040, and a 32-bit pointer holding the value 0x10102040 is stored at address 0x10102048.
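The difference between the two addresses can be observed in any C++ program. The sketch below is not part of reccmp and the helper function names are hypothetical; it only shows that the pointer variable and the text it points to are separate objects with separate addresses.

```cpp
#include <cstdint>

const char* g_strACTION = "ACTION";

// Address of the pointer variable itself: this is what GLOBAL refers to.
std::uintptr_t PointerVariableAddress()
{
    return reinterpret_cast<std::uintptr_t>(&g_strACTION);
}

// Address of the text the pointer holds: this is what STRING refers to.
std::uintptr_t TextAddress()
{
    return reinterpret_cast<std::uintptr_t>(g_strACTION);
}
```

In a rebuilt binary the concrete values will differ from those in the original, but the relationship is the same: reading the pointer stored at `PointerVariableAddress()` gives `TextAddress()`, and the first byte at `TextAddress()` is `A`.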