Sourcerer: Channeling the void

The rest of this document will walk you through replicating the results from our paper, Sourcerer: Channeling the void, published at DIMVA'25.

Citation


@inproceedings{Badoux_sourcerer_2025,
  author = {Badoux, Nicolas and Toffalini, Flavio and Payer, Mathias},
  month = jul,
  booktitle={International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment},
  title = {{Sourcerer: Channeling the void}},
  year = {2025}
}

Repository layout

The repository is originally a fork of LLVM 19.0.0. The code of Sourcerer is committed in the respective folders of the LLVM project (clang, llvm, etc.). The Type++ folder contains all the scripts to run the different experiments.

For SPEC CPU, the repository provides patches and scripts to run the experiments and collect the results.

The LLVM code is released under the Apache License v2.0 with LLVM Exceptions. The type++ code follows the same license.

Artifact

Instructions are available in artifact.pdf.

The artifact contains the material to reproduce the results:

Porting Effort (Section 7.1 – Table 1)
Performance Overhead (Section 7.2 – Table 2)
Source of Performance Overhead (Section 7.3 – Table 3)
Sourcerer as a Sanitizer for Fuzzing Campaigns (Section 7.4 – Figure 1)

This material is released under the Apache License 2.0, in line with the LLVM project Sourcerer builds upon.

1. Accessing the artifact:
We release the artifact on a public GitHub repository. The main branch contains the latest version of the code.

2. Hardware dependencies:

Minimum: 16GB RAM for SPEC CPU evaluations
Recommended: 128GB RAM and 1TB disk
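
As a rough pre-flight check, the figures above can be compared against the local machine. The sketch below assumes a Linux host that exposes `/proc/meminfo`; the helper names `total_ram_gib` and `free_disk_gib` are illustrative and not part of the artifact. The kernel usually reports slightly less memory than is installed, so treat the numbers as approximate.

```shell
# Total RAM in GiB, rounded to the nearest integer, read from a
# meminfo-style file (defaults to /proc/meminfo on Linux).
total_ram_gib() {
    awk '/^MemTotal:/ { printf "%.0f\n", $2 / 1048576 }' "${1:-/proc/meminfo}"
}

# Free space in GiB (truncated) on the filesystem holding the given directory.
free_disk_gib() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}

# Report only; nothing is enforced here.
if [ -r /proc/meminfo ]; then
    echo "RAM: $(total_ram_gib) GiB (minimum 16 GB, recommended 128 GB)"
fi
echo "Free disk: $(free_disk_gib .) GiB (recommended 1 TB disk)"
```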

3. Software dependencies:

Ubuntu 20.04
Docker
Active internet connection (for Chromium evaluation)
Installed: curl, git, docker, pip
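
One way to confirm that the listed tools are installed before starting is sketched below. The function name `missing_tools` is a hypothetical helper, not something shipped with the artifact; it only checks that each command is on the `PATH`, not its version.

```shell
# Print the name of every tool given as an argument that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools named in the dependency list above.
missing=$(missing_tools curl git docker pip)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found."
fi
```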

4. Benchmarks:

SPEC CPU 2006 and SPEC CPU 2017 benchmarks

Artifact Installation

Clone the repository and build the Docker image:

REPO=https://github.com/HexHive/Sourcerer
git clone $REPO --single-branch --branch main --depth 100 Sourcerer
cd Sourcerer
pip install -r requirements.txt

Each experiment is encapsulated in Docker containers. The `Dockerfile` is at the root of the repository. We do not provide support for running experiments outside Docker.

Experiment Workflow

The artifact reproduces results from four experiments:

Compatibility (extra classes to instrument over type++)
Performance Overhead (SPEC CPU 2006 and 2017)
Ablation Study (source of overhead)
Fuzzing Campaign (on OpenCV)

Scripts are provided to run experiments and generate the corresponding tables and figure. More details are in the repository’s `README.md`.

Major Claims

(C1) Compatibility:
Sourcerer is compatible with C++ codebases with minor changes.
→ See Experiment E1, Table 1.
(C2) Performance Overhead:
Sourcerer introduces negligible overhead while adding protection.
→ See Experiment E2, Table 2.
(C3) Ablation Study:
Overhead analysis of Sourcerer’s components.
→ See Experiment E3, Table 3.
(C4) Fuzzing Campaign:
Demonstrates Sourcerer in a fuzzing context.
→ See Experiment E4, Figure 1.

Evaluation

Experiment 1 (E1) - Compatibility Analysis

[2 minutes human + 2 compute-hours]

Evaluate extra classes in SPEC CPU benchmarks.

Preparation:

Ensure both .iso benchmark files are in the repository root.
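
The same preparation step applies to E1, E2, and E3, so it can be worth checking once before launching any of them. The sketch below counts the `.iso` files in a directory. It assumes the two images are the SPEC CPU 2006 and SPEC CPU 2017 images and does not check their names, since this README does not give them; `count_isos` is an illustrative helper.

```shell
# Count the .iso files directly inside a directory (no recursion).
count_isos() {
    count=0
    for f in "$1"/*.iso; do
        # The glob stays unexpanded when nothing matches, so test for existence.
        [ -e "$f" ] && count=$((count + 1))
    done
    echo "$count"
}

# Run from the repository root; the experiments expect two images.
found=$(count_isos .)
if [ "$found" -lt 2 ]; then
    echo "Found $found .iso file(s) in $(pwd); expected 2."
fi
```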

Execution:

./table1.sh

Output:

Logs and a table similar to Table 1.

Experiment 2 (E2) - Performance Overhead

[2 minutes human + 15 compute-hours]

Compare SPEC CPU performance with/without cast checking (Sourcerer vs LLVM-CFI).

Preparation:

Ensure both .iso benchmark files are in the repository root.

Execution:

./table2.sh

Output:

Logs and a table similar to Table 2. Expect ~10% variation in performance.
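
One way to read the expected variation is as a relative difference between a value you measured and the corresponding value in Table 2. The sketch below computes that difference; the helper names and the numbers in the example are hypothetical and are not taken from the paper.

```shell
# Absolute difference of a measured value relative to a reference, in percent.
relative_diff_pct() {
    awk -v m="$1" -v r="$2" 'BEGIN {
        d = (m - r) * 100 / r
        if (d < 0) d = -d
        printf "%.1f\n", d
    }'
}

# Succeeds when the measured value is within the given tolerance (percent)
# of the reference value.
within_tolerance() {
    awk -v m="$1" -v r="$2" -v t="$3" 'BEGIN {
        d = (m - r) * 100 / r
        if (d < 0) d = -d
        exit !(d <= t)
    }'
}

# Hypothetical runtimes in seconds: 412 measured against 400 reported.
relative_diff_pct 412 400   # prints 3.0
```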

Experiment 3 (E3) - Ablation Study

[2 minutes human + 15 compute-hours]

Measure cost of individual components in the type checking process.

Preparation:

Ensure both .iso benchmark files are in the repository root.

Execution:

./table3.sh

Output:

Table similar to Table 3.

Experiment 4 (E4) - Fuzzing Campaign

[2 minutes human + 25 compute-hours]

Compare Sourcerer and ASan in a fuzzing campaign on OpenCV.

Preparation:

./fig1_requirements.sh

Execution:

./fig1.sh

Output:

Figures similar to Figure 1, saved in fuzzing_pics.

Usage

Todo

Troubleshooting

Disk space issue

The Docker images require a fair amount of disk space. If you run out of space, you can remove a specific container with docker rm $CONTAINER_ID, or run docker system prune, which removes dangling images and stopped containers. You might then have to rebuild some images with the respective docker build command.

Container name already in use

To rerun the evaluation, first remove the named containers, because container names must be unique: execute docker rm $CONTAINER_NAME before relaunching the evaluation. If the evaluation is still running, first kill the container with docker kill $CONTAINER_NAME.
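
The kill-then-remove sequence can be wrapped in a small helper, sketched below. `reset_container` and the overridable `DOCKER` variable are illustrative conveniences rather than part of the artifact; setting `DOCKER=echo` gives a dry run that prints the removal command instead of executing it.

```shell
# Docker CLI to invoke; override with DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Stop a named container if it is still running, then remove it.
reset_container() {
    name="$1"
    # `docker kill` fails when the container is not running; that is fine here,
    # so its output and exit status are ignored.
    "$DOCKER" kill "$name" >/dev/null 2>&1
    "$DOCKER" rm "$name"
}

# Usage, with the container name reported in the error message:
#   reset_container "$CONTAINER_NAME"
```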

Permission issue inside the Docker container

The whole artifact folder is mounted inside the Docker containers. Any modification to the permissions of the folder will be reflected inside the container. In particular, if the owner of the folder is changed, the ID inside the container will not match the owner of the files, resulting in permission issues. If you encounter this problem, reset the ownership inside the container with the chown -R $USER:$USER command. To open a shell inside the container, use the docker exec -it $CONTAINER_ID zsh command.
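
To spot an ownership mismatch from the host before starting a container, a check along the following lines may help. It assumes that the user inside the container shares the numeric ID of the host user who launches the evaluation, which this README does not state explicitly; `owner_uid` and `owned_by_me` are hypothetical helpers.

```shell
# Numeric owner ID of a path, taken from the third column of `ls -ldn`.
owner_uid() {
    ls -ldn "$1" | awk '{ print $3 }'
}

# Succeeds when the current user owns the given path.
owned_by_me() {
    [ "$(owner_uid "$1")" = "$(id -u)" ]
}

# Run from the artifact folder on the host.
if ! owned_by_me .; then
    echo "$(pwd) is owned by UID $(owner_uid .), but you are UID $(id -u)."
fi
```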
