Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBMC: Add contracts to native backends #685

Merged
merged 4 commits into from
Jan 23, 2025
Merged

CBMC: Add contracts to native backends #685

merged 4 commits into from
Jan 23, 2025

Conversation

hanno-becker
Copy link
Contributor

@hanno-becker hanno-becker commented Jan 22, 2025

So far, the CBMC proofs covered the combination of C frontend and C backend. While most native backend functions are drop-in replacements of functions in the default C backend and should thus inherit their specification, there are minor differences in signature (in particular since #675) or functionality (e.g. the native implementation of rej_uniform may reject certain inputs, and the frontend uses fallback logic which is not present with the C backend) that merit explicit contract annotations for all functions in the arithmetic and FIPS202 backend, and proofs against those specs.

This PR achieves this by adding CBMC contracts for all functions in the native and FIPS202 backends, and proving that their call-sites still uphold their spec when the respective MLKEM_USE_NATIVE_XXX option is set. This is largely trivial, but it is still worth doing for (a) documentation purposes, and since (b) the native backends operate on raw arrays, while their call-sites operate in terms of the poly_xxx structs.

The only non-trivial case is rej_uniform, where we had to slightly alter the fallback logic to avoid having to reason about buffer shifting, which CBMC struggles with.

@hanno-becker hanno-becker added enhancement New feature or request CBMC labels Jan 22, 2025
@hanno-becker hanno-becker changed the title CBMC: Add contracts to native backend CBMC: Add contracts to most of the native backend Jan 22, 2025
@hanno-becker hanno-becker force-pushed the cbmc_native branch 2 times, most recently from 9ebf93e to 2ad5077 Compare January 23, 2025 03:54
@hanno-becker hanno-becker marked this pull request as ready for review January 23, 2025 04:15
@hanno-becker hanno-becker requested a review from a team January 23, 2025 04:15
@hanno-becker hanno-becker changed the title CBMC: Add contracts to most of the native backend CBMC: Add contracts to native backend Jan 23, 2025
@hanno-becker hanno-becker changed the title CBMC: Add contracts to native backend CBMC: Add contracts to native backends Jan 23, 2025
@hanno-becker hanno-becker added the benchmark this PR should be benchmarked in CI label Jan 23, 2025
Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A76 (Raspberry Pi 5) benchmarks

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 29107 cycles 29052 cycles 1.00
ML-KEM-512 encaps 35391 cycles 35399 cycles 1.00
ML-KEM-512 decaps 45894 cycles 45876 cycles 1.00
ML-KEM-768 keypair 49310 cycles 49321 cycles 1.00
ML-KEM-768 encaps 55628 cycles 55570 cycles 1.00
ML-KEM-768 decaps 70412 cycles 70358 cycles 1.00
ML-KEM-1024 keypair 72191 cycles 72070 cycles 1.00
ML-KEM-1024 encaps 80887 cycles 80854 cycles 1.00
ML-KEM-1024 decaps 100742 cycles 100755 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel Xeon 4th gen (c7i)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 13800 cycles 13802 cycles 1.00
ML-KEM-512 encaps 17962 cycles 17570 cycles 1.02
ML-KEM-512 decaps 23132 cycles 23387 cycles 0.99
ML-KEM-768 keypair 23457 cycles 23201 cycles 1.01
ML-KEM-768 encaps 25319 cycles 25123 cycles 1.01
ML-KEM-768 decaps 33912 cycles 33676 cycles 1.01
ML-KEM-1024 keypair 32085 cycles 32115 cycles 1.00
ML-KEM-1024 encaps 35613 cycles 35735 cycles 1.00
ML-KEM-1024 decaps 47098 cycles 47116 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel Xeon 4th gen (c7i) (no-opt)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 33074 cycles 33132 cycles 1.00
ML-KEM-512 encaps 38655 cycles 38766 cycles 1.00
ML-KEM-512 decaps 50252 cycles 50299 cycles 1.00
ML-KEM-768 keypair 55347 cycles 55296 cycles 1.00
ML-KEM-768 encaps 62252 cycles 62242 cycles 1.00
ML-KEM-768 decaps 76965 cycles 76974 cycles 1.00
ML-KEM-1024 keypair 83082 cycles 82932 cycles 1.00
ML-KEM-1024 encaps 93825 cycles 93743 cycles 1.00
ML-KEM-1024 decaps 114068 cycles 114136 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A55 (Snapdragon 888) benchmarks

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 58300 cycles 58289 cycles 1.00
ML-KEM-512 encaps 65747 cycles 65711 cycles 1.00
ML-KEM-512 decaps 84466 cycles 84570 cycles 1.00
ML-KEM-768 keypair 98981 cycles 98990 cycles 1.00
ML-KEM-768 encaps 110581 cycles 110176 cycles 1.00
ML-KEM-768 decaps 136457 cycles 137638 cycles 0.99
ML-KEM-1024 keypair 150190 cycles 150332 cycles 1.00
ML-KEM-1024 encaps 166819 cycles 166279 cycles 1.00
ML-KEM-1024 decaps 202425 cycles 202235 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD EPYC 3rd gen (c6a)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 18010 cycles 18063 cycles 1.00
ML-KEM-512 encaps 22914 cycles 22850 cycles 1.00
ML-KEM-512 decaps 30155 cycles 30165 cycles 1.00
ML-KEM-768 keypair 30930 cycles 31002 cycles 1.00
ML-KEM-768 encaps 33800 cycles 34023 cycles 0.99
ML-KEM-768 decaps 44402 cycles 44557 cycles 1.00
ML-KEM-1024 keypair 44215 cycles 44313 cycles 1.00
ML-KEM-1024 encaps 49565 cycles 49591 cycles 1.00
ML-KEM-1024 decaps 64263 cycles 64079 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel Xeon 3rd gen (c6i)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 20411 cycles 20410 cycles 1.00
ML-KEM-512 encaps 27107 cycles 27066 cycles 1.00
ML-KEM-512 decaps 35987 cycles 35942 cycles 1.00
ML-KEM-768 keypair 34957 cycles 34920 cycles 1.00
ML-KEM-768 encaps 38184 cycles 38141 cycles 1.00
ML-KEM-768 decaps 50973 cycles 50974 cycles 1.00
ML-KEM-1024 keypair 48025 cycles 48008 cycles 1.00
ML-KEM-1024 encaps 54104 cycles 54099 cycles 1.00
ML-KEM-1024 decaps 71510 cycles 71491 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton4

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 18133 cycles 18133 cycles 1
ML-KEM-512 encaps 22166 cycles 22155 cycles 1.00
ML-KEM-512 decaps 28785 cycles 28800 cycles 1.00
ML-KEM-768 keypair 30586 cycles 30543 cycles 1.00
ML-KEM-768 encaps 33622 cycles 33621 cycles 1.00
ML-KEM-768 decaps 43154 cycles 43142 cycles 1.00
ML-KEM-1024 keypair 44178 cycles 44188 cycles 1.00
ML-KEM-1024 encaps 49656 cycles 49646 cycles 1.00
ML-KEM-1024 decaps 62637 cycles 62618 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD EPYC 3rd gen (c6a) (no-opt)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 43537 cycles 43518 cycles 1.00
ML-KEM-512 encaps 52107 cycles 52036 cycles 1.00
ML-KEM-512 decaps 67340 cycles 67331 cycles 1.00
ML-KEM-768 keypair 71662 cycles 71702 cycles 1.00
ML-KEM-768 encaps 83138 cycles 83257 cycles 1.00
ML-KEM-768 decaps 103392 cycles 103871 cycles 1.00
ML-KEM-1024 keypair 106857 cycles 106943 cycles 1.00
ML-KEM-1024 encaps 121941 cycles 121977 cycles 1.00
ML-KEM-1024 decaps 147823 cycles 147661 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD EPYC 4th gen (c7a)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 14961 cycles 14993 cycles 1.00
ML-KEM-512 encaps 19666 cycles 19725 cycles 1.00
ML-KEM-512 decaps 26323 cycles 26357 cycles 1.00
ML-KEM-768 keypair 25512 cycles 25553 cycles 1.00
ML-KEM-768 encaps 28054 cycles 28039 cycles 1.00
ML-KEM-768 decaps 37802 cycles 37799 cycles 1.00
ML-KEM-1024 keypair 35473 cycles 35520 cycles 1.00
ML-KEM-1024 encaps 40389 cycles 40414 cycles 1.00
ML-KEM-1024 decaps 53875 cycles 53907 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel Xeon 3rd gen (c6i) (no-opt)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 51752 cycles 51799 cycles 1.00
ML-KEM-512 encaps 59942 cycles 59977 cycles 1.00
ML-KEM-512 decaps 76827 cycles 76884 cycles 1.00
ML-KEM-768 keypair 84472 cycles 84408 cycles 1.00
ML-KEM-768 encaps 95849 cycles 95781 cycles 1.00
ML-KEM-768 decaps 118298 cycles 118281 cycles 1.00
ML-KEM-1024 keypair 125305 cycles 125281 cycles 1.00
ML-KEM-1024 encaps 139554 cycles 139567 cycles 1.00
ML-KEM-1024 decaps 168492 cycles 168487 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton2

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 29100 cycles 29104 cycles 1.00
ML-KEM-512 encaps 35388 cycles 35462 cycles 1.00
ML-KEM-512 decaps 45894 cycles 45978 cycles 1.00
ML-KEM-768 keypair 49325 cycles 49306 cycles 1.00
ML-KEM-768 encaps 55651 cycles 55556 cycles 1.00
ML-KEM-768 decaps 70445 cycles 70360 cycles 1.00
ML-KEM-1024 keypair 72165 cycles 72012 cycles 1.00
ML-KEM-1024 encaps 80897 cycles 80773 cycles 1.00
ML-KEM-1024 decaps 100761 cycles 100684 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton4 (no-opt)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 37921 cycles 37911 cycles 1.00
ML-KEM-512 encaps 43324 cycles 43321 cycles 1.00
ML-KEM-512 decaps 55571 cycles 55507 cycles 1.00
ML-KEM-768 keypair 63131 cycles 63061 cycles 1.00
ML-KEM-768 encaps 70477 cycles 70589 cycles 1.00
ML-KEM-768 decaps 86923 cycles 86995 cycles 1.00
ML-KEM-1024 keypair 94818 cycles 94556 cycles 1.00
ML-KEM-1024 encaps 105592 cycles 105344 cycles 1.00
ML-KEM-1024 decaps 126615 cycles 126634 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD EPYC 4th gen (c7a) (no-opt)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 39546 cycles 39428 cycles 1.00
ML-KEM-512 encaps 45506 cycles 45586 cycles 1.00
ML-KEM-512 decaps 58948 cycles 59063 cycles 1.00
ML-KEM-768 keypair 64914 cycles 65034 cycles 1.00
ML-KEM-768 encaps 73349 cycles 73496 cycles 1.00
ML-KEM-768 decaps 91498 cycles 91598 cycles 1.00
ML-KEM-1024 keypair 96721 cycles 97195 cycles 1.00
ML-KEM-1024 encaps 107920 cycles 108129 cycles 1.00
ML-KEM-1024 decaps 131312 cycles 131588 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton2 (no-opt)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 60634 cycles 61333 cycles 0.99
ML-KEM-512 encaps 69811 cycles 69801 cycles 1.00
ML-KEM-512 decaps 88729 cycles 88869 cycles 1.00
ML-KEM-768 keypair 101841 cycles 102004 cycles 1.00
ML-KEM-768 encaps 114045 cycles 114158 cycles 1.00
ML-KEM-768 decaps 139580 cycles 139434 cycles 1.00
ML-KEM-1024 keypair 154066 cycles 154696 cycles 1.00
ML-KEM-1024 encaps 169861 cycles 170496 cycles 1.00
ML-KEM-1024 decaps 202272 cycles 202941 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton3

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 18943 cycles 18938 cycles 1.00
ML-KEM-512 encaps 23518 cycles 23513 cycles 1.00
ML-KEM-512 decaps 30637 cycles 30656 cycles 1.00
ML-KEM-768 keypair 32296 cycles 32286 cycles 1.00
ML-KEM-768 encaps 35881 cycles 35869 cycles 1.00
ML-KEM-768 decaps 46028 cycles 46014 cycles 1.00
ML-KEM-1024 keypair 46563 cycles 46584 cycles 1.00
ML-KEM-1024 encaps 52414 cycles 52428 cycles 1.00
ML-KEM-1024 decaps 66191 cycles 66248 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton3 (no-opt)

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 39444 cycles 39331 cycles 1.00
ML-KEM-512 encaps 45270 cycles 45257 cycles 1.00
ML-KEM-512 decaps 57186 cycles 57179 cycles 1.00
ML-KEM-768 keypair 65888 cycles 65949 cycles 1.00
ML-KEM-768 encaps 73744 cycles 73801 cycles 1.00
ML-KEM-768 decaps 89817 cycles 89807 cycles 1.00
ML-KEM-1024 keypair 99026 cycles 99032 cycles 1.00
ML-KEM-1024 encaps 109988 cycles 109977 cycles 1.00
ML-KEM-1024 decaps 130809 cycles 130765 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bananapi bpi-f3 benchmarks

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 332331 cycles 332366 cycles 1.00
ML-KEM-512 encaps 441735 cycles 441764 cycles 1.00
ML-KEM-512 decaps 590620 cycles 590572 cycles 1.00
ML-KEM-768 keypair 550609 cycles 550510 cycles 1.00
ML-KEM-768 encaps 690083 cycles 689970 cycles 1.00
ML-KEM-768 decaps 882592 cycles 882276 cycles 1.00
ML-KEM-1024 keypair 811642 cycles 811520 cycles 1.00
ML-KEM-1024 encaps 984478 cycles 984270 cycles 1.00
ML-KEM-1024 decaps 1218919 cycles 1218595 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Becker and others added 2 commits January 23, 2025 05:28
So far, the CBMC proofs covered the combination of C frontend and C
backend. While most native backend functions are drop-in replacements
of functions in the default C backend and should thus inherit their
specification, there are minor differences in signature (in particular
since #675) or functionality (e.g. the native implementation of
rej_uniform may reject certain inputs, and the frontend uses fallback
logic which is not present with the C backend) that merit explicit
contract annotations for all functions in the arithmetic and FIPS202
backend, and proofs against those specs.

This commit takes a step in this direction by adding CBMC contracts
for all functions in the native arithmetic backend, except for
`rej_uniform_native`, and proving that their call-sites still uphold
their spec when the respective MLKEM_USE_NATIVE_XXX option is set.
This is largely trivial, but it is still worth doing for
(a) documentation purposes, and since (b) the native backends operate
on raw arrays, while their call-sites operate in terms of the
`poly_xxx` structs.

The case of `rej_uniform` is more complicated and will be
handled separately.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
From a perspective of the native backend, the rejection sampling
routine `rej_uniform` is special in that it is not expected to
be replaced by native code in its entirety, but only in special cases.
Outside of those special cases, the default C implementation is used.

Previously, this fallback logic was implemented as follows:
First, the native backend would be called. Upon success, the function
would return immediately. Otherwise, it would fall back to the default
implementation. Success/Failure would be communicated through a special
return value -1.

There are two problems with this logic:
- It appears very difficult to reason about in CBMC: Specifically, when
  we call the native backend, we shift the input buffer by the amount of
  data that has already been successfully sampled, and CBMC struggles
  reasoning about that.
- We call the native backend with a potentially unaligned buffer, which
  seems unnatural. This is not an issue for the existing backends,
  because their bounds checks ensure that the native implementation only
  takes effect at the beginning of rejection sampling, where the output
  buffer is aligned; yet, alignment is not guaranteed in general.

This commit simplifies the fallback logic by invoking the native backend
only upon the first call to `rej_uniform`, when no coefficients have yet
been sampled. This solves both issues above.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
This commit adds a contract for the rejection sampling backend and
proves the C implementation correct on the basis of that contract.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A72 (Raspberry Pi 4) benchmarks

Benchmark suite Current: 043f340 Previous: 68a82c6 Ratio
ML-KEM-512 keypair 51944 cycles 51950 cycles 1.00
ML-KEM-512 encaps 58407 cycles 58300 cycles 1.00
ML-KEM-512 decaps 74250 cycles 74975 cycles 0.99
ML-KEM-768 keypair 88206 cycles 87886 cycles 1.00
ML-KEM-768 encaps 97143 cycles 97356 cycles 1.00
ML-KEM-768 decaps 119798 cycles 120518 cycles 0.99
ML-KEM-1024 keypair 132251 cycles 131973 cycles 1.00
ML-KEM-1024 encaps 145060 cycles 145348 cycles 1.00
ML-KEM-1024 decaps 176696 cycles 175597 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@rod-chapman rod-chapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. See comments below.

Copy link
Contributor

@mkannwischer mkannwischer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hanno-becker.

I double checked that the contracts match those of the original C functions and that you have one proof for each contract.

A few very minor comments.

This commit adds CBMC specifications to the FIPS202 backend API and
proves the functions depending on it -- KeccakF1600_StatePermute and
KeccakF1600x4_StatePermute -- memory-safe with respect to those
specifications.

Since the logic implementing KeccakF1600x4_StatePermute in the presence
of a FIPS202 backend depends on whether x1/x2/x4-batched Keccak is present
in the native backend, we provide 3 dummy FIPS202 backends for each case.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
@hanno-becker hanno-becker merged commit 3ec8542 into main Jan 23, 2025
125 checks passed
@hanno-becker hanno-becker deleted the cbmc_native branch January 23, 2025 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
benchmark this PR should be benchmarked in CI CBMC enhancement New feature or request
Projects
None yet
4 participants