-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved behavior for indirect name lookup #89
Conversation
I currently get an
It come from
|
Update: Looks like this can be fixed by doing the following in # Check whether the thunk destination is a matched symbol
if orig_func.recomp_addr not in recomp_thunks:
best_name = orig_func.best_name()
assert best_name is not None
self._db.create_orig_thunk(orig_thunk, best_name)
continue And then changing for key in ("computed_name", "name", "import_name"): Not sure if that has any unintended side effects, I couldn't find any. |
The regression can be fixed by adding // GLOBAL: BETA10 0x101faf70
// _aexit_rtn to |
I think we can solve it very simply by just giving the import entity a name. The recomp imports in the PDB already have names like We don't have the granularity in the DB API to set the name attribute only if it is not already set. That means we would lose the PDB values for now, but could still have: |
Sounds reasonable to me. Feel free to merge after addressing this issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. While testing I found another issue that is already present on master
- I'll open a separate ticket for that.
There are a few different types of "indirect" calls:
The path to get the name depends on what we're looking at:
In 1, we are calling the function pointer at the given variable, which may or may not have an initial value. 2 is a virtual method but we are calling it this way instead of loading the vptr base into
ecx
and using an offset. 3 is an imported function from a DLL.In all cases, the address in the instruction is a pointer to the actual address we want to call. In 1, we have the variable in our database and it's more useful to use that as the name. (The initial value may be irrelevant.) For 2, we can get to the annotated function by reading the pointer address from the binary and looking up that value in our database.
We can do the same for 3, except that the pointer location does not have an address. It is a relative virtual address. When the image is loaded, this value should be updated to point at the location of the imported function, but in the file it looks like this:
0x10be1e
is the RVA to theIMAGE_IMPORT_BY_NAME
struct for this function.This currently works for asm sanitize by the hack of loading the RVA as if it were an address. The problem comes later when we want to iterate through all entities in the database. We have to filter out the fake addresses so they are ignored by
roadmap
or the ghidra import scripts.We could patch the binary (in memory) so the RVA becomes a real address, then use that as the entity with the DLL name. What I did instead is change the name replacement function to recognize an indirect lookup and use a different attribute of the import entity for this situation.
That change required these steps:
The binary lookup was a dependency of asm sanitize. This is now used directly by the name replacement function and it specifically returns a pointer instead of allowing a read of any size.
We now have
EntityType.IMPORT
specifically for imports from.idata
. Previously we usedPOINTER
as the type. We also set theimport_name
attribute to contain the DLL name and function name. This attribute is used by the name replacement function.Loading and matching of imports is a lot simpler because we don't need to set up the aforementioned hack with the RVA.
We also have a lot of new comments and tests surrounding name replacement and asm sanitize. The only changes to the generated assembly are:
POINTER
is nowIMPORT
for the various__imp__
entities.BETA10 had one minor drop in accuracy on the library function
_amsg_exit
. No change to LEGO1.