Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New binary files for unit tests #3

Open
2 of 5 tasks
disinvite opened this issue Oct 20, 2024 · 8 comments · May be fixed by #90
Open
2 of 5 tasks

New binary files for unit tests #3

disinvite opened this issue Oct 20, 2024 · 8 comments · May be fixed by #90
Labels
cleanup Fix the leftovers of a major migration

Comments

@disinvite
Copy link
Collaborator

disinvite commented Oct 20, 2024

The unit tests in tests/test_islebin.py depend on LEGO1.DLL and make sure we correctly interpret various points of interest from the file. We need a new generic file (or files) to adapt these tests for general use.

Some things to consider:

  • Decide which types of file we want to test. We currently only read PE-format files. Do we want to look at MZ or NE executables in the future?
  • Test for detecting compiler-specific metadata? e.g. MSVC debug information like build timestamp, etc
  • Decide where the code for these dummy executables will live and how they will get built.
  • Exclude these files from the python module install. They should only exist here for CI or on reccmp dev setups
  • Right now we just test for reading the data structures out of the file. We could test more of the analysis tools as we add them (e.g. function/vtable range detection. ascii string detection)
@jonschz jonschz added the cleanup Fix the leftovers of a major migration label Oct 27, 2024
@disinvite
Copy link
Collaborator Author

#26 has raised this topic again. I think open questions 1 and 2 above (and probably 5) can be answered with "yes, we should do these things". The challenge now is creating these files. Or should we just find some files that would meet the criteria in the meantime? (Old shareware games or whatever)

@madebr
Copy link
Collaborator

madebr commented Dec 2, 2024

My intention in #26 was to add small executables completely under our control.

win32_binaries.zip built with MSVC 6
By avoiding the CRT, and linking with /ENTRY, it's possible to keep the file sizes low.

@disinvite
Copy link
Collaborator Author

Pulling the discussion back over here. @madebr brought up open watcom as a modern way to build old exes without dosbox/dosemu and an old compiler. We have some sample files (for PE) to start with. But where do we put them?

  • Build the files somewhere, bundle them here in the reccmp repo.
    Pros: Easy access, every contributor can test locally, can track changes to the binaries and companion tests.
    Cons: Awkward to ship lots of exe files with the module. Potential for virus scan alerts?

  • Build files from source on a separate repo.
    Pros: Clear separation of code and sample files.
    Cons: But now they are separate. Need to keep tests (in this repo) and binary changes in sync. Would need to download files locally to test locally.

And would we have separate binary files for each test case?

@madebr
Copy link
Collaborator

madebr commented Dec 31, 2024

I dislike it when binaries are added to git.
At libsdl-org/SDL_image and libsdl-org/SDL_mixer they added x86/x64 dlls for commen encoders/decoders.
The .git directory of those projects are close to 1GiB and 250 MiB because of this.

I propose to add build scripts to the repo and try to deduce all locations using pdbs, dbg files, linker map files, ... and put those in a json/yml.
I'd put the binaries themselves (and accompanying metadata) in a separate repo or as a release asset.

An advantage of this is it allows quick iteration if you have all toolchains available on your system.
A disadvantage is you have to be fairly confident in the post build scripts.
Another disadvantage is that the tests cannot have embedded absolute addresses: those must be configurable as well.

@jonschz
Copy link
Collaborator

jonschz commented Dec 31, 2024

Further option:

  • Keep the source code but not the binaries in this repo, build the binaries here in releases. Has most of the pros/cons of your second option, but should mitigate the "keep up to date" issue. Still has the drawback that you need to download binaries or install another compiler in order to run the tests locally, but you might need to do that sooner or later anyway.

Regarding separate binary files:

In my opinion, these binaries are useful for two kinds of tests:

  • Integration tests
  • Unit testing file parsing

@madebr
Copy link
Collaborator

madebr commented Dec 31, 2024

I think we should put the binaries in a git submodule.
Otherwise it will be hard to run git bisect, or similar archeology, in the future.

@disinvite
Copy link
Collaborator Author

The thing I haven't been able to figure out is what kinds of tests we can run if the sample binary files might be different. All the tests we have now depend on the fact that LEGO1.DLL is the same every time.

@madebr
Copy link
Collaborator

madebr commented Jan 12, 2025

That's also what is putting me off. I want to avoid a situation where we re-upload the same binary, slightly modified, over and over again.
Perhaps we select a OSS project, build a binary with a "good toolchain", and treat is as "the golden reference"?

I think we should select both a C and C++ project, making use of virtual inheritance.

@disinvite disinvite linked a pull request Feb 23, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup Fix the leftovers of a major migration
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants