Document demonic non-determinism #3895

tautschnig · 2025-02-18T13:56:11Z

This PR adds developer-facing documentation to clarify what we mean by demonic non-determinism, and why we use it.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

This PR adds developer-facing documentation to clarify what we mean by demonic non-determinism, and why we use it.

carolynzech

Thank you for writing this up!

docs/src/demonic-nondet.md

carolynzech · 2025-02-18T15:58:52Z

docs/src/demonic-nondet.md

+Note that this is an "if" statement, not "if, and only if:" there may also be
+some other invalid pointer `other_pointer` that is invalid at the time of
+dereference, and the back-end solver may choose to resolve the non-determinism
+such that the property fails because `global_pointer` equals `other_pointer`.
+We call such use of non-determinism _demonic non-determinism_, because the
+SAT or SMT-solver will resolve the non-deterministic choice such that a
+counterexample is reported _if_ there exists a model such that the overall
+property fails.


To make sure I understand, imagine this case:

int* new_ptr_to_track = malloc(...); int* other_ptr = malloc(...); ... free(new_ptr_to_track); // the solver switches to new_ptr_to_track free(other_ptr); // the solver stays with new_ptr_to_track *other_ptr // this UB is not caught because we're tracking new_ptr_to_track still

I believe this part:

SAT or SMT-solver will resolve the non-deterministic choice such that a
counterexample is reported if there exists a model such that the overall
property fails.

suggests that this unsoundness is impossible; the solver will simulate all possible outcomes of the nondeterministic decision
global_pointer = if nondet { new_pointer_to_track } else { global_pointer };, which includes the case where global_pointer tracks other_ptr, and thereby find the UB.
Is it an accurate mental model to think of this as a sort of breadth-first search, where we explore all possible assignments to global_pointer at every point of the program and succeed iff the property holds in all cases? (I don't expect the solver to actually perform such a search, since that would be expensive, but is that in effect what happens)?

I think for those with less background in non-determinism, I'd like a final sentence like:

Thus, choosing non-deterministically is not the same as choosing randomly--by choosing non-determistically, soundness is ensured, since the solver will always resolve the non-deterministic choice such that it finds violations of the property if they exist.

I think my primary confusion was initially conflating nondeterminism with randomness, so thinking that only tracking one object at a time would be unsound. This is somewhat redundant with what you have--I just want to hammer this point home, since I suspect this will be the most confusing/unintuitive point for people.

is it an accurate mental model to think of this as a sort of breadth-first search, where we explore all possible assignments to global_pointer at every point of the program and succeed iff the property holds in all cases?

The global_pointer is indeed a symbolic pointer that ranges over all possible pointers in the program, controlled by the nondet choices used to update the variable.

These choices are hidden inputs for the program, so if you have n choices, the final formula encodes 2^n executions, one for each valuation of these variables. The breadth-first search analogy is not the best one though. We don't control in which order the SAT solver explores these decisions.

void foo() { // allocate objects int *p1 = malloc(...); int *p2 = malloc(...); // free objects free(p1); free(p2); // use dead pointers *p1 = 1; *p2 = 2; }

The instrumented program looks like this:

// demonic monitor static void* dead_ptr = (void *)0; void foo() { // allocate objects int *p1 = malloc(...); int *p2 = malloc(...); // free objects // demonic choice variable bool b1 = nondet_bool(); dead_ptr = b1 ? p1 : dead_ptr; free(p1); // demonic choice variable bool b2 = nondet_bool(); dead_ptr = b2 ? p2 : dead_ptr; free(p2); // use dead pointers assert(!__CPROVER_same_object(p1, dead_ptr)); *p1 = 1; assert(!__CPROVER_same_object(p2, dead_ptr)); *p2 = 2; }

So our program is controlled by b1, b2 we have 4 possible ways of picking values for these variables:

0, 0 = track null

0, 1 = track null then switch to p2

1, 0 = track null then switch to p1

1, 1 = track null then switch to p1 then switch to p2

The SAT solver analyzes all of them, but there is no notion of depth/breadth first.

The idea is that any property of the form forall x \in {x1, ,,,, xn}, P(x) can be checked by introducing a nondet variable x that ranges over {x1, ,,,, xn}, instead of explicitly expanding the forall into P(x1) and ... and P(xn). This makes the encoding more compact and defers the reasoning to SAT/SMT solving time.

This makes a ton of sense, thank you @remi-delmas-3000!
(It'd be great if this text made it into the document--I think a longer example like this is really useful).

carolynzech · 2025-02-18T16:29:21Z

docs/src/demonic-nondet.md

+the back-end SAT/SMT solver can only encode a violation of one property of the
+kind that the demonic non-determinism is used for, even if multiple properties
+can be violated along a single execution path.


I don't understand quite what you mean here--could you give an example?

Looking at the example above, to find the safety violation for both p1 and p2 we need to invoke the SAT solver twice, because any choice of b1 and b2 only violates at most one assertion along the same trace

If on the other hand we had this program:

void foo() { // allocate objects int *p1 = malloc(...); int *p2 = malloc(...); // free/use free(p1); *p1 = 1; // free/use free(p2); *p2 = 2; }

void foo() { // allocate objects int *p1 = malloc(...); int *p2 = malloc(...); // free/use bool b1 = nondet_bool(); dead_ptr = b1 ? p1 : dead_ptr; free(p1); assert(!__CPROVER_same_object(p1, dead_ptr)); *p1 = 1; // free/use bool b2 = nondet_bool(); dead_ptr = b2 ? p1 : dead_ptr; free(p2); assert(!__CPROVER_same_object(p2, dead_ptr)); *p2 = 1; }

By picking b1 = 1 and b2 = 1 we can violate both assertions along a single trace.

Ah okay, so this is more of a statement about not being guaranteed to catch every violation in a single trace--it depends on the structure of your program. If the program looks like the one in your initial example (free, free, use, use) we'll only find one violation. But if the structure is (free, use, free, use) like in this example, we can find both. So we'll find at least one violation of the property if it exists.

carolynzech · 2025-02-18T16:30:56Z

docs/src/demonic-nondet.md

+non-deterministic values such that the address of the object having gone out of
+scope has never been assigned.


I don't get this part either--I think a short example would really help.

Extending the example above, imagine we encapsulate the condition !__CPROVER_same_object(p, dead_ptr) into a predicate to hide implementation detail and offer an intrisinc to check for dead pointers:

extern void *dead_ptr; bool __pointer_not_dead(void *ptr) { return !__CPROVER_same_object(ptr, dead_ptr); }

Due to the demonic behavior of dead_ptr, this predicate will only work as expected under an assertion context (where it implicitly becomes quantified over all nondet choices that define dead_ptr).

If you try to use it in an assume:

#include <assert.h> #include <stdbool.h> #include <stdlib.h> static void *dead_ptr; bool __pointer_not_dead(void *ptr) { return __CPROVER_same_object(ptr, dead_ptr); } bool nondet_bool(); int main() { int *ptr = malloc(sizeof(int)); bool alive = ptr != NULL; if (alive && nondet_bool()) { dead_ptr = nondet_bool() ? ptr : dead_ptr; free(ptr); alive = false; } // here `ptr` is either alive or dead depending on malloc failure and nondet choice // we try to assume we are on a path were the ptr is not deallocated __CPROVER_assume(__pointer_not_dead(ptr)); assert(__pointer_not_dead(ptr)); assert(alive); *ptr = 0; }

This assume does not really force us to be on the path alive == true all it does is force the demonic monitor to not track the deallocation of ptr, and introduces unsoundness (because once we force the monitor to not look at an object, we miss all violations related to that object).

Okay, I think I get it now--this was helpful. To make sure that I actually get it, connecting it back to Kani:

kani/kani-compiler/src/kani_middle/transform/mod.rs

Lines 97 to 105 in 366ce20

transformer.add_pass(

queries,

UninitPass {

// Since this uses demonic non-determinism under the hood, should not assume the assertion.

safety_check_type: CheckType::new_safety_check_assert_no_assume(queries),

unsupported_check_type: unsupported_check_type.clone(),

mem_init_fn_cache: queries.kani_functions().clone(),

},

);

Imagine we have some array of length two bytes, and we want to execute this UninitPass to make sure that the memory we're reading is initialized. With demonic nondeterminism, we're going to track one byte at a time. We want to assert that the memory we're reading is initialized, which works in an assertion context since it gets implicitly quantified over all possible ways to track the bytes. But if then try to assume that the memory is initialized after the assert, we're really just telling the solver not to track a certain byte anymore, which would be unsound.

carolynzech · 2025-02-18T16:32:55Z

docs/src/demonic-nondet.md

+This is sound even in presence of multiple objects (and, possibly, multiple
+invalid pointers), because the solver will
+return at least one counterexample if there is any undefined behavior, and can
+indeed be forced to return all such counterexamples.


I think this sentence about soundness would better go in the "What is demonic nondeterminism" section (or you could address soundness twice), but I wouldn't leave it until the last sentence.

Co-authored-by: Carolyn Zech <cmzech@amazon.com>

Document demonic non-determinism

d26359d

This PR adds developer-facing documentation to clarify what we mean by demonic non-determinism, and why we use it.

tautschnig requested a review from a team as a code owner February 18, 2025 13:56

carolynzech reviewed Feb 18, 2025

View reviewed changes

Update docs/src/demonic-nondet.md

9f0075b

Co-authored-by: Carolyn Zech <cmzech@amazon.com>

tautschnig self-assigned this Feb 18, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Document demonic non-determinism #3895

Document demonic non-determinism #3895

tautschnig commented Feb 18, 2025

carolynzech left a comment

carolynzech Feb 18, 2025

carolynzech Feb 18, 2025

remi-delmas-3000 Feb 26, 2025

carolynzech Feb 26, 2025 •

edited

Loading

carolynzech Feb 18, 2025

remi-delmas-3000 Feb 26, 2025

carolynzech Feb 26, 2025

carolynzech Feb 18, 2025

remi-delmas-3000 Feb 26, 2025 •

edited

Loading

carolynzech Feb 26, 2025

carolynzech Feb 18, 2025

		non-deterministic values such that the address of the object having gone out of
		scope has never been assigned.

	transformer.add_pass(
	queries,
	UninitPass {
	// Since this uses demonic non-determinism under the hood, should not assume the assertion.
	safety_check_type: CheckType::new_safety_check_assert_no_assume(queries),
	unsupported_check_type: unsupported_check_type.clone(),
	mem_init_fn_cache: queries.kani_functions().clone(),
	},
	);

Document demonic non-determinism #3895

Are you sure you want to change the base?

Document demonic non-determinism #3895

Conversation

tautschnig commented Feb 18, 2025

carolynzech left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

carolynzech Feb 26, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

remi-delmas-3000 Feb 26, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

carolynzech Feb 26, 2025 •

edited

Loading

remi-delmas-3000 Feb 26, 2025 •

edited

Loading