Add wasm_of_ocaml benchmarks with current-bench output #1842
Conversation
Note: this is ready for review, but I still need to verify a few details (namely, how current-bench deals with multiple opam files, and what the name of the results file should be), so we should hold off on merging for now.
Thanks to @punchagan’s advice, this should now hopefully work with current-bench.
Force-pushed from 458a34c to df5a387
I think I would put all the results together, grouping them:
{
  "name": "Wasm_of_ocaml",
  "results": [
    {
      "name": "Microbenchmarks",
      "metrics": [
        {
          "name": "microbenchmark/almabench",
          "value": 1.6087265,
          "units": "s"
        },
        {
          "name": "microbenchmark/bdd",
          "value": 0.31883700000000004,
          "units": "s"
        },
This will cause all the run times to appear on a single graph. I’m not sure we want that?
Maybe not, actually, since they have very different running times. This is done for OCaml, but it is not very readable. We should still put them in the same list of metrics; this will be more readable when we add other benchmarks. Maybe we could report the geometric mean of all the running times as well?
Actually, I disagree. When you look at the last line of https://github.com/ocurrent/current-bench/blob/b0528d005b34b8ea1ff4cd7f893165ce782ef5db/pipeline/lib/custom_dockerfile.ml#L78-L96, it also installs regular dependencies. I have rearranged the metrics list as requested.
A geometric mean of absolute values? Isn’t the geometric mean usually used on relative numbers, such as speedup ratios?
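For concreteness, here is a minimal OCaml sketch of what a geometric mean over run times would compute (the exponential of the mean of the logarithms); the two sample values are simply the microbenchmark timings quoted above, used purely as illustration:

(* Geometric mean of a list of positive run times, in seconds. *)
let geometric_mean times =
  match times with
  | [] -> invalid_arg "geometric_mean: empty list"
  | _ ->
      let n = float_of_int (List.length times) in
      exp (List.fold_left (fun acc t -> acc +. log t) 0. times /. n)

let () =
  let times = [ 1.6087265; 0.318837 ] in
  Printf.printf "geometric mean: %.4f s\n" (geometric_mean times)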
Force-pushed from 9e15f25 to 9087508
Force-pushed from 2846158 to da6c377
This reverts commit 8556727.
Force-pushed from 9913438 to 4262355
I have opened #1854, which should unblock installing Binaryen as a binary here.
This reverts commit 4a495e9.
# This target is the one run by `current-bench`,
# see https://github.com/ocurrent/current-bench
.PHONY: bench
bench: __run_wasm
Shouldn't this be called `bench-wasm`?
if js
then (
  compile_jsoo ~wasm "" code Spec.byte code Spec.js_of_ocaml;
This doesn't look correct. If both js and wasm are set, this will use wasm_of_ocaml instead of js_of_ocaml to compile the file.
I think it would be better to rename compile_jsoo to something like:
val compile_js : backend:[`Jsoo | `Wasmoo] -> ...
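A minimal sketch of what such a compile_js could look like, assuming a hypothetical helper that simply shells out to the selected compiler; apart from the backend variants taken from the suggested signature, the names and command line below are illustrative, not the benchmark script's actual interface:

type js_backend = [ `Jsoo | `Wasmoo ]

(* Select the compiler binary from the requested backend instead of a ~wasm flag. *)
let compiler_of_backend : js_backend -> string = function
  | `Jsoo -> "js_of_ocaml"
  | `Wasmoo -> "wasm_of_ocaml"

(* Hypothetical replacement for compile_jsoo: build the command and run it. *)
let compile_js ~(backend : js_backend) ?(flags = "") ~src ~dst () =
  let cmd =
    Printf.sprintf "%s %s %s -o %s" (compiler_of_backend backend) flags src dst
  in
  Sys.command cmd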
compile "ocamlc -unsafe" src Spec.ml code Spec.byte_unsafe;
compile "ocamlopt" src Spec.ml code Spec.opt_unsafe;
compile_jsoo "" code Spec.byte_unsafe code Spec.js_of_ocaml_unsafe;
if wasm then compile_jsoo ~wasm "" code Spec.byte code Spec.wasm_of_ocaml;
Why don't we take effects into account?
; Some Spec.js_of_ocaml_call
; Some Spec.js_of_ocaml_effects
] )
else if effects
You've dropped this logic. The documentation of effects was previously "only run with and without effect handler support", which I think meant that we wanted to run a small subset of the runs.
; "-effects", Arg.Set effects, " only run with and without effect handler support"
; ( "-effects"
  , Arg.Symbol
      ( [ "none"; "cps"; "double-translation" ]
      , function
        | "none" -> set_effects `None
        | "cps" -> set_effects `Cps
        | "double-translation" -> set_effects `Double_translation
        | _ -> assert false )
  , " only run with and without effect handler support" )
I'm not sure this is what we want. When benching for effects, we probably want to be able to compare all backends.
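As a hedged illustration of that suggestion (not code from the PR; set_effects, the variant names, and the help text are assumptions), the symbol list could accept an extra "all" value that selects every effects backend:

(* Accumulate the effects backends selected on the command line. *)
let selected_effects : [ `None | `Cps | `Double_translation ] list ref = ref []

let set_effects e = selected_effects := [ e ]

let effects_spec =
  ( "-effects"
  , Arg.Symbol
      ( [ "none"; "cps"; "double-translation"; "all" ]
      , function
        | "none" -> set_effects `None
        | "cps" -> set_effects `Cps
        | "double-translation" -> set_effects `Double_translation
        | "all" -> selected_effects := [ `None; `Cps; `Double_translation ]
        | _ -> assert false )
  , " which effect handler backend(s) to benchmark" )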
The PR does many things at once:
- Setup docker: LGTM
- Some refactoring of the benchmarks (e.g. introducing Measure). Can we move that to a separate PR?
- Add a new output format. Can we move that to a separate PR (or together with the refactoring above)?
- Add support for the effects backend. I don't think it is correct. We probably want to be able to bench different backends, bench all backends, or none? Can this be moved to a different PR?
- Add support for wasm in benchmarks. Can we have this in a dedicated PR?
Maybe we could have the following stack of PRs:
- refactoring and new output format
- wasm support and docker setup
- effects backends
The benchmark scripts have received very little attention so far. Maybe it's a good time for a larger refactoring.
run.ml is responsible for producing all data for reports. It has a few flags to produce a subset of the data, presumably to speed up the process.
We could imagine a way to generate only the data we need for a given set of reports.
See #1855.
I think it would be useful to extend the scope of this to track things like …
@vouillon and I (so far, mostly @vouillon) have started to look at ways of improving performance of Wasm code. To keep track of our progress, we need to set up a benchmarking suite that gets run regularly.
This PR makes the repository compatible with monitoring by https://github.com/ocurrent/current-bench/: the benchmark suite is extended with the current-bench output format, and I modified the scripts so that Wasm benchmarks are first-class citizens. The existing benchmarking utilities should still work; I ran a few checks to verify that.
This is a first step; we will soon want to add macrobenchmarks as well, presumably by having a benchmarks/sources/macro directory where each subdirectory is a macrobenchmark with its own dune file.