Document not found (404)
+This URL is invalid, sorry. Please use the navigation bar or search to continue.
+ +diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml new file mode 100644 index 0000000..47b9410 --- /dev/null +++ b/.github/workflows/deploy.yml @@ -0,0 +1,51 @@ +# Reference: https://github.com/rust-lang/mdBook/wiki/Automated-Deployment:-GitHub-Actions#github-pages-deploy +name: Deploy +on: + push: + branches: + - main + +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v2 + with: + fetch-depth: 0 + - name: Install mdbook + run: | + mkdir mdbook + curl -sSL https://github.com/rust-lang/mdBook/releases/download/v0.4.36/mdbook-v0.4.36-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=./mdbook + echo `pwd`/mdbook >> $GITHUB_PATH + - name: Install mdbook-admonish + run: | + mkdir mdbook-admonish + curl -sSL https://github.com/tommilligan/mdbook-admonish/releases/download/v1.15.0/mdbook-admonish-v1.15.0-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=./mdbook-admonish + echo `pwd`/mdbook-admonish >> $GITHUB_PATH + - name: Install mdbook-katex + run: | + mkdir mdbook-katex + curl -sSL https://github.com/lzanini/mdbook-katex/releases/download/v0.5.9/mdbook-katex-v0.5.9-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=./mdbook-katex + echo `pwd`/mdbook-katex >> $GITHUB_PATH + - name: Deploy GitHub Pages + run: | + cd book + mdbook build + git worktree add deployed-pages + git config user.name "Deploy from CI" + git config user.email "" + cd deployed-pages + # Delete the ref to avoid keeping history. + git update-ref -d refs/heads/deployed-pages + rm -rf * + mv ../book/* . + git add . + git commit -m "Deploy $GITHUB_SHA to deployed-pages" + git push --force --set-upstream origin deployed-pages + +# NOTE: `gh-pages` branch was replaced w/ `deployed-pages` b/c GitHub +# had an un-removeable default GHAction on branch "gh-pages" that +# over-wrote pushes from the main branch's deploy action (this script) + +# Another note- I wish .yml had ability to have variables to avoid +# hardcoding the "forge-docs-book" directory and branch name "deployed-pages" \ No newline at end of file diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..9712139 --- /dev/null +++ b/.gitignore @@ -0,0 +1,2 @@ +.DS_Store +book/book/ diff --git a/404.html b/404.html new file mode 100644 index 0000000..6edee13 --- /dev/null +++ b/404.html @@ -0,0 +1,225 @@ + + +
+ + +This URL is invalid, sorry. Please use the navigation bar or search to continue.
+ +Now that we've written our first model—tic-tac-toe boards—let's switch to something a bit more serious: binary search trees. A binary search tree (BST) with an added property about its structure. So let's start modeling by following the our 5-step process.
+A binary tree is made up of nodes:
+#lang forge/froglet
+sig Node {
+ key: one Int, -- every node has some key
+ left: lone Node, -- every node has at most one left-child
+ right: lone Node -- every node has at most one right-child
+}
+
+What makes a binary tree a binary tree? One way to say it is:
+Let's get started encoding this. Sometimes it can be helpful to write a domain predicate along with wellformedness, if it makes the model more clear, hence why we wrote root
:
#lang forge/froglet
+sig Node {
+ key: one Int, -- every node has some key
+ left: lone Node, -- every node has at most one left-child
+ right: lone Node -- every node has at most one right-child
+}
+
+pred root[n: Node] {
+ -- a node is a root if it has no ancestor (note this doesn't enforce uniqueness)
+ no n2: Node | n = n2.left or n = n2.right
+}
+
+pred wellformed {
+ -- no cycles: no node can reach itself via a succession of left and right fields
+ all n: Node | not reachable[n, n, left, right]
+
+ -- all non-root nodes have a common ancestor from which both are reachable
+ -- the "disj" keyword means that n1 and n2 must be _different_
+ all disj n1, n2: Node | (not root[n1] and not root[n2]) implies {
+ some anc: Node | reachable[n1, anc, left, right] and
+ reachable[n2, anc, left, right] }
+
+ -- nodes have a unique parent (if any)
+ all disj n1, n2, n3: Node |
+ not ((n1.left = n3 or n1.right = n3) and (n2.left = n3 or n2.right = n3))
+
+}
+
+[FILL: example trees: singleton, empty, unbalanced...]
+-- View a tree or two
+run {binary_tree} for exactly 8 Node
+
+[FILL: Oops, not quite right, we were missing a constraint; underconstraint bug -- fix it]
+Missing:
+ -- left+right differ (unless both are empty)
+ all n: Node | some n.left => n.left != n.right
+
+FILL: and iterate.
+-- Run a test: our predicate enforces a unique root exists (if any node exists)
+pred req_unique_root {
+ no Node or {
+ one root: Node |
+ all other: Node-root | other in descendantsOf[root]}}
+assert binary_tree is sufficient for req_unique_root for 5 Node
+
+
+ If you're taking this course or reading this book, you've probably had to complete some programming assignments---or at least written some small program for a course or an online tutorial. Take a moment to list a handful of such assignments: what did you have to build?
+Now ask yourself:
+In the context of a course, there are "expected" answers to these questions. For instance, you might say you know your code worked because you tested it (very thoroughly)! But is that really the truth? In terms of consequences, the true bar for excellent in a programming class is the grade you got. That is:
+But outside that context, as (say) a professional engineer, you lose the safety net. You might be working on a program that controls the fate of billions of dollars, tempers geopolitical strife, or controls a patient's insulin pump. Even if you had a TA, would you trust them to tell you that those programs worked? Would you trust your boss to understand exactly what needed to happen, and tell you exactly how to do it? Probably not! Instead, you need to think carefully about what you want, how to build it, and how to evaluate what you and others have built.
+As engineers, we strive for perfection. But perfection is an ideal; it's not obtainable. Why? First: we're human. Even if we could read our customers' minds, that's no guarantee that they know what they really need. And even if we can prove our code is correct, we might be checking for the wrong things. Second: our environment is hostile. Computers break. Patches to dependencies introduce errors. Cosmic radiation can flip bits in memory. 100% reliability is hopeless. Anyone who tells you differently is trying to sell you something.
+But that doesn't mean we should give up. It just means that we should moderate our expectations. This book won't teach you how to never make mistakes. Instead, it will teach you a multiple ways to increase your confidence in correctness.
+Let's start with unit-testing and deconstruct it. What does it do well? What does it do poorly?
+Exercise: Make two lists to answer the above question. Why do we test? What could go wrong, and how can the sort of testing you've done in other classes let us down?
+Hopefully we all agree that concrete input-output testing has its virtues and that we should keep doing it. But let's focus on the things that testing doesn't do well.
+You might have observed that (for most interesting programs, anyway) tests cannot be exhaustive because there are infinitely many possible inputs. And since we're forced to test non-exhaustively, we have to hope we pick good tests---tests that not only focus on our own implementation, but on others (like the implementation that replaces yours eventually) too.
+Worse, we can't test the things we don't think of, or don't know about; we're vulnerable to our limited knowledge, the availability heuristic, confirmation bias, and so on. In fact, we humans are generally ill equipped for logical reasoning, even if trained.
+Suppose we're thinking about the workings of a small company. We're given some facts about the company, and have to answer a question based on those facts. Here's an example. We know that:
+Question: Does someone who graduated from Brown directly supervise someone who graduated from another University?
+Yes! Regardless of whether Bob graduated from Brown, some Brown graduate supervises some non-Brown graduate. Reasoning by hypotheticals, there is one fact we don't know: where Bob graduated. In case he graduated Brown, he supervises Charlie, a non-Brown graduate. In case he graduated from another school, he's supervised by Alice, a Brown graduate.
+Humans tend to be very bad at reasoning by hypotheticals. There's a temptation to think that this puzzle isn't solvable because we don't know where Bob graduated from. Even Tim thought this at first after seeing the puzzle—in grad school! For logic!
+Now imagine a puzzle with a thousand of these unknowns. A thousand boolean variables means cases to reason through. Want to use a computer yet?
+ +Of course, this isn't really about logic puzzles.
+There's a real cryptographic protocol called the Needham-Schroeder public-key protocol. You can read about it here. Unfortunately the protocol has a bug: it's vulnerable to attack if one of the principles can be fooled into starting an exchange with a badly-behaved or compromised agent. We won't go into specifics. Instead, let's focus on the fact that it's quite easy to get things like protocols wrong, and sometimes challenging for us humans to completely explore all possible behaviors -- especially since there might be behaviors we'd never even considered! It sure would be nice if we could get a computer to help out with that.
+A pair of former 1710 students did an ISP on modeling crypto protocols, using the tools you'll learn in class. Here's an example picture, generated by their model, of the flaw in the Needham-Schroeder protocol:
+You don't need to understand the specifics of the visualization; the point is that someone who has studied crypto protocols would. And this really does show the classic attack on Needham-Schroeder. You may not be a crypto-protocol person, but you probably are an expert in something you'd like to model, and you might very well get the chance to do so this semester.
+The human species has been so successful, in part, because of our ability to use assistive devices—tools! Eyeglasses, bicycles, hammers, bridges: all devices that assist us in navigating the world in our fragile meat-bodies. One of our oldest inventions, writing, is an assistive device that increases our long-term memory space and makes that memory persistent. Computers are only one in a long line of such inventions.
+So, naturally, we've found ways to:
+There's a large body of work in Computer Science that uses logic to do all those things. We tend to call it formal methods, especially when the focus is on reasoning about systems. It touches on topics like system modeling, constraint solving, program analysis, design exploration, and more. That's what this book is about: the foundational knowledge to engage with many different applications of these ideas, even if you don't end up working with them directly every day.
+More concretely, we'll focus on a class of techniques called lightweight formal methods, which are characterized by tradeoffs that favor ease of use over strong guarantees (although we'll sometimes achieve those as well).
+Jeanette Wing and Daniel Jackson wrote a short article coining the term "lightweight FM" in the 90's, which you can find online.
+When we say "systems" in this book we don't necessarily mean the kind of systems you see in a class on networks, hardware architecture, or operating systems. You can apply the techniques in this book to those subjects quite naturally, but you can also apply it to user interfaces, type systems in programming, hardware, version control systems like Git, web security, cryptographic protocols, robotics, puzzles, sports and games, and much more. So we construe the word "system" very broadly.
+Here are some examples of "systems" that students have modeled in Forge: lifetimes in Rust, network reachability, and poker!
+For better or worse, The shape of engineering is changing. Lots of people are excited, scared, or both about large language models like ChatGPT. This book won't teach you how to use generative AI, so it's reasonable to wonder: why learn from this book, instead of another AI book?
+There are two questions that will never go out of style, and won't be answered by AI (at least, not in our lifetimes):
+Even setting aside the customer-facing aspects, we'll still need to think critically about what it is we want and how to evaluate whether we're getting it. The skills you learn here will remain useful (or become even more so) as engineering evolves. And those skills will be useful for more than just code.
+ + +The main tool we'll use in this course is Forge. Forge is a tool for modeling systems; we'll talk about what that means later on. For now, be aware that we'll be progressing through three language levels in Forge:
+We'll also use:
+A model is a representation of a system that faithfully includes some but not all of the system's complexity. There are many different ways to model a system, all of which have different advantages and disadvantages. Think about what a car company does before it produces a new car design. Among other things, it creates multiple models. E.g.,
+There may be many different models of a system, all of them focused on something different. As the statisticians say, "all models are wrong, but some models are useful". Learning how to model a system is a key skill for engineers, not just within "formal methods". Abstraction is one of the key tools in Computer Science, and modeling lies at the heart of abstraction.
+In this course, the models we build aren't inert; we have tools that we can use the explore and analyze them!
+We don't need to fully model a system to be able to make useful inferences. We can simplify, omit, and abstract concepts/attributes to make models that approximate the system while preserving the fundamentals that we're interested in.
+EXERCISE: If you've studied physics, there's a great example of this in statics and dynamics. Suppose I drop a coin from the top of the science library, and ask you what its velocity will be when it hits the ground. Using the methods you learn in beginning physics, what's something you usefully disregard?
+Air resistance! Friction! We can still get a reasonable approximation for many problems without needing to include that. (And advanced physics adds even more factors that aren't worth considering at this scale.) The model without friction is often enough.
+When we say "systems" in this module, we mean the term broadly. A distributed system (like replication in MongoDB) is a system, but so are user interfaces and hardware devices like CPUs and insulin pumps. Git is a system for version control. The web stack, cryptographic protocols, chemical reactions, the rules of sports and games---these are all systems too!
+To help build intuition, let's work with a simple system: the game of tic-tac-toe (also called noughts and crosses). There are many implementations of this game, including this one that Tim wrote for these notes in Python. And, of course, these implementations often have corresponding test suites, like this (incomplete) example.
+Exercise: Play a quick game of tic-tac-toe by hand. If you can, find a partner, but if not, then play by yourself.
+Notice what just happened. You played the game. In doing so, you ran your own mental implementation of the rules. The result you got was one of many possible games, each with its own specific sequence of legal moves, leading to a particular ending state. Maybe someone won, or maybe the game was a tie. Either way, many different games could have ended with that same board.
+Modeling is different from programming. When you're programming traditionally, you give the computer a set of instructions and it follows them. This is true whether you're programming functionally or imperatively, with or without objects, etc. Declarative modeling languages like Forge work differently. The goal of a model isn't to run instructions, but rather to describe the rules that govern systems.
+Here's a useful comparison to help reinforce the difference (with thanks to Daniel Jackson):
+What are the essential concepts in a game of tic-tac-toe?
+When we're first writing a model, we'll start with 5 steps. For each step, I'll give examples from tic-tac-toe and also for binary search trees (which we'll start modeling soon) for contrast.
+X
and O
marks that go in board locations. 0
and 2
, since the board is 3-by-3. X
moves first into the middle square.X
and O
marks.Why make this distinction between well-formedness and domain predicates? Because one should always hold in any instance Forge considers, but the other may or may not hold. In fact, we might want to use Forge to verify that a domain predicate always holds! And if we've told Forge that any instance that doesn't satisfy it is garbage, Forge won't find us such an instance.
+We might list:
+X
and O
;Now let's add those ideas to a model in Forge!
+#lang forge/froglet
+
+The first line of any Forge model will be a #lang
line, which says which Forge language the file uses. We'll start with the Froglet language for now. Everything you learn in this language will apply in other Forge languages, so I'll use "Forge" interchangeably.
Now we need a way to talk about the noughts and crosses themselves. So let's add a sig
that represents them:
#lang forge/froglet
+abstract sig Player {}
+one sig X, O extends Player {}
+
+You can think of sig
in Forge as declaring a kind of object. A sig
can extend another, in which case we say that it is a child of its parent, and child sig
s cannot overlap. When a sig is abstract
, any member must also be a member of one of that sig
's children; in this case, any Player
must either be X
or O
. Finally, a one
sig has exactly one member---there's only a single X
and O
in our model.
We also need a way to represent the game board. We have a few options here: we could create an Index
sig, and encode an ordering on those (something like "column A, then column B, then column C"). Another is to use Forge's integer support. Both solutions have their pros and cons. Let's use integers, in part to get some practice with them.
#lang forge/froglet
+abstract sig Player {}
+one sig X, O extends Player {}
+
+sig Board {
+ board: pfunc Int -> Int -> Player
+}
+
+Every Board
object contains a board
field describing the moves made so far. This field is a partial function, or dictionary, for every Board
that maps each (Int
, Int
) pair to at most one Player
.
These definitions sketch the overall shape of a board: players, marks on the board, and so on. But not all boards that fit the definition will be valid. For example:
+0
, 1
, and 2
, not a board where (e.g.) row -5
, column -1
is a valid location. We'll call these well-formedness constraints. They aren't innately enforced by our sig
declarations, but we'll almost always want Forge to enforce them, so that it doesn't find "garbage instances". Let's write a wellformedness predicate:
-- a Board is well-formed if and only if:
+pred wellformed[b: Board] {
+ -- row and column numbers used are between 0 and 2, inclusive
+ all row, col: Int | {
+ (row < 0 or row > 2 or col < 0 or col > 2)
+ implies no b.board[row][col]
+ }
+}
+
+This predicate is true of any Board
if and only if the above 2 constraints are satisfied. Let's break down the syntax:
all row, col: Int | ...
says that for any pair of integers (up to the given bidwidth), the following condition (...
) must hold. Forge also supports, e.g., existential quantification (some
), but we don't need that yet. We also have access to standard boolean operators like or
, implies
, etc. b.board[row][col]
evaluates to the Player
(if any) with a mark at location (row
, col
) in board b
; butno b.board[row][col]
is true if and only if there is no such `Player``.pred
(predicate) in Forge is a helper function that evaluates to a boolean. Thus, its body should always be a formula. Notice that, rather than describing a process that produces a well-formed board, or even instructions to check well-formedness, we've just given a declarative description of what's necessary and sufficient for a board to be well-formed. If we'd left the predicate body empty, any board would be considered well-formed---there'd be no formulas to enforce!
+Since a predicate is just a function that returns true or false, depending on its arguments and whichever instance Forge is looking at, we can write tests for it the same way we would for any other boolean-valued function. But even if we're not testing, it can be useful to write a small number of examples, so we can build intuition for what the predicate means.
+We'll write two examples in Forge:
+-- Helper to make these examples easier to write
+pred all_wellformed { all b: Board | wellformed[b]}
+
+-- all_wellformed should be _true_ for the following instance
+example firstRowX_wellformed is {all_wellformed} for {
+ Board = `Board0 -- backquote labels atoms
+ X = `X O = `O -- examples must define all sigs
+ Player = X + O
+ `Board0.board = (0, 0) -> `X + -- examples must define all fields
+ (0, 1) -> `X + -- here, the partial function for
+ (0, 2) -> `X -- the board (empty squares remain empty)
+}
+
+-- all_wellformed should be _false_ for the following instance
+example off_board_not_wellformed is {not all_wellformed} for {
+ Board = `Board0
+ X = `X O = `O
+ Player = X + O
+ `Board0.board = (-1, 0) -> `X +
+ (0, 1) -> `X +
+ (0, 2) -> `X
+}
+
+In Forge, example
s are automatically run whenever your model executes. They describe basic intent about a given predicate; in this case, we're testing that a board where X
has moved 3 times in valid locations is considered well formed. In contrast, if X
has moved in an invalid location, it won't be considered well formed. Again, we're not making judgements about the rules being obeyed yet—just about whether our model is conforming to how we've encoded the game, using 0
, 1
, and 2
as the only valid indexes.
Notice that we've got a test thats a positive example and another test that's a negative example. We want to make sure to exercise both cases, or else "always true" or "always" false could pass our suite.
+The run
command tells Forge to search for an instance satisfying the given constraints:
run { some b: Board | wellformed[b]}
+
+When we click the play button in the VSCode extension, the engine solves the constraints and produces a satisfying instance, (Because of differences across solver versions, hardware, etc., it's possible you'll see a different instance than the one shown here.) A browser window should pop up with a visualization. You can also run racket <filename.frg>
in the terminal, although we recommend the VSCode extension.
If you're running on Windows, the Windows-native cmd
and PowerShell terminals will not properly load Forge's visualizer. Instead, we suggest using one of many other options on Windows that we've tested and know to work: the VSCode extension (available on the VSCode Marketplace), DrRacket, Git bash
, Windows Subsystem for Linux, or Cygwin.
There are many options for visualization. The default which loads initially is a directed-graph based one:
+This isn't very useful; it looks nothing like a tic-tac-toe board! We can make more progress by using the "Table" visualization---which isn't ideal either:
+Forge also allows users to make custom visualizations via short JavaScript programs; here's an example basic visualizer for this specific tic-tac-toe model that produces images like this one:
+We'll talk more about visualization scripts later. For now, let's proceed. TODO: replace img with one matching the table view +TODO: add side-by-side CSS
+This instance contains a single board, and it has 9 entries. Player O
has moved in all of them (the 0
suffix of O0
in the display is an artifact of how Forge's engine works; ignore it for now). It's worth noticing two things:
O
occupies all the squares. We might ask: has player O
been cheating? But the fact is that this board satisfies the constraints we have written so far. Forge produces it simply because our model isn't yet restrictive enough, and for no other reason. "Cheating" doesn't exist yet. So far we've just modeled boards, not full games. But we can still contrast our work here against the implementation of tic-tac-toe shared above. We might ask:
+Now let's write predicates that describe important ideas in the domain. What's important in the game of tic-tac-toe? Here are a few things.
+What would it mean to be a starting state in a game? The board is empty:
+pred starting[s: Board] {
+ all row, col: Int |
+ no s.board[row][col]
+}
+
+How do we tell when it's a given player's turn? It's X
's turn when there are the same number of each mark on the board:
pred XTurn[s: Board] {
+ #{row, col: Int | s.board[row][col] = X} =
+ #{row, col: Int | s.board[row][col] = O}
+}
+
+The {row, col: Int | ...}
syntax means a set comprehension, and describes the set of row-column pairs where the board contains X
(or O
). The #
operator gives the size of these sets, which we then compare.
Question: Is it enough to say that OTurn
is the negation of XTurn
?
No! At least not in the model as currently written. If you're curious to see why, run the model and look at the instances produced. Instead, we need to say something like this:
+pred OTurn[s: Board] {
+ #{row, col: Int | s.board[row][col] = X} =
+ add[#{row, col: Int | s.board[row][col] = O}, 1]
+}
+
+Forge supports arithmetic operations on integers like add
. While it doesn't matter for this model yet, addition (and other operations) can overflow according to 2's complement arithmetic. For example, if we're working with 4-bit integers, then add[7,1]
will be -8
. You can experiment with this in the visualizer's evaluator, which we'll be using a lot after the initial modeling tour is done.
Don't try to use +
for addition in any Forge language. Use add
instead; this is because +
is reserved for something else (which we'll explain later).
What does it mean to win? A player has won on a given board if:
+We'll express this in a winner
predicate that takes the current board and a player name. Let's also define a couple helper predicates along the way:
pred winRow[s: Board, p: Player] {
+ -- note we cannot use `all` here because there are more Ints
+ some row: Int | {
+ s.board[row][0] = p
+ s.board[row][1] = p
+ s.board[row][2] = p
+ }
+}
+
+pred winCol[s: Board, p: Player] {
+ some column: Int | {
+ s.board[0][column] = p
+ s.board[1][column] = p
+ s.board[2][column] = p
+ }
+}
+
+pred winner[s: Board, p: Player] {
+ winRow[s, p]
+ or
+ winCol[s, p]
+ or
+ {
+ s.board[0][0] = p
+ s.board[1][1] = p
+ s.board[2][2] = p
+ } or {
+ s.board[0][2] = p
+ s.board[1][1] = p
+ s.board[2][0] = p
+ }
+}
+
+After writing these domain predicates, we're reaching a fairly complete model for a single tic-tac-toe board. Let's decide how to fix the issue we saw above (the reason why OTurn
couldn't be the negation of XTurn
): perhaps a player has moved too often.
Should we add something like OTurn[s] or XTurn[s]
to our wellformedness predicate? No! If we then later enforced wellformedness for all boards, that would exclude "cheating" instances where a player has more moves on the board than are allowed. But this has some risk, depending on how we intend to use the wellformed
predicate:
wellformed
and rule it out. wellformed
.not cheating
to wellformed
; because wellformed
also excludes garbage boards, we'd probably use it in our verification—in which case, Forge will never find us a counterexample! [TODO: move this point to position it properly, if PBT comes after vs. before]
+Notice the similarity between this issue and what we do in property-based testing. Here, we're forced to distinguish between what a reasonable board is (analogous to the generator's output in PBT) and what a reasonable behavior is (analogous to the validity predicate in PBT). One narrows the scope of possible worlds to avoid true "garbage"; the other checks whether the system behaves as expected in one of those worlds.
+We'll come back to this later, when we've had a bit more modeling experience. For now, let's separate our goal into a new predicate called balanced
, and add it to our run
command above:
pred balanced[s: Board] {
+ XTurn[s] or OTurn[s]
+}
+run { some b: Board | wellformed[b] and balanced[b]}
+
+If we click the "Next" button a few times, we see that not all is well: we're getting boards where wellformed
is violated (e.g., entries at negative rows, or multiple moves in one square).
We're getting this because of how the run
was phrased. We said to find an instance where some board was well-formed and valid, not one where all boards were. By default, Forge will find instances with up to 4 Boards
. So we can fix the problem either by telling Forge to find instances with only 1 Board:
run { some b: Board | wellformed[b] and balanced[b]}
+for exactly 1 Board
+
+or by saying that all boards must be well-formed and balanced:
+run { all b: Board | wellformed[b] and balanced[b]}
+
+run
The run
command can be used to give Forge more detailed instructions for its search.
Is it possible for an instance with no boards to still satisfy constraints like these?
+run {
+ all b: Board | {
+ -- X has won, and the board looks OK
+ wellformed[b]
+ winner[b, X]
+ balanced[b]
+ }
+ }
+
+Yes! There aren't any boards, so there's no obligation for anything to satisfy the constraints inside the quantifier. You can think of the all
as something like a for
loop in Java or the all()
function in Python: it checks every Board
in the instance. If there aren't any, there's nothing to check—return true.
This addition also requires that X
moved in the middle of the board:
run {
+ all b: Board | {
+ -- X has won, and the board looks OK
+ wellformed[b]
+ winner[b, X]
+ balanced[b]
+ -- X started in the middle
+ b.board[1][1] = X
+ }
+ } for exactly 2 Board
+
+Notice that, because we said exactly 2 Board
here, Forge must find instances containing 2 tic-tac-toe boards, and both of them must satisfy the constraints: wellformedness, X
moving in the middle, etc. You could ask for a board where X
hasn't won by adding not winner[b, X]
.
What do you think a game of tic-tac-toe looks like? How should we model the moves between board states?
+It's often convenient to think of the game as a big graph, where the nodes are the states (possible board configurations) and the edges are transitions (in this case, legal moves of the game). Here's a rough sketch:
+A game of tic-tac-toe is a sequence of steps in this graph, starting from the empty board. Let's model it.
+First, what does a move look like? A player puts their mark at a specific location. In Alloy, we'll represent this using a transition predicate: a predicate that says when it's legal for one state to evolve into another. We'll often call these the pre-state and post-state of the transition:
+pred move[pre: Board, row: Int, col: Int, p: Player, post: Board] {
+ // ...
+}
+
+What constraints should we add? It's useful to divide the contents of such a predicate into:
+For the guard, in order for the move to be valid, it must hold that in the pre-state:
+For the action:
+Now we can fill in the predicate. Let's try something like this:
+pred move[pre: Board, row: Int, col: Int, p: Player, post: Board] {
+ -- guard:
+ no pre.board[row][col] -- nobody's moved there yet
+ p = X implies XTurn[pre] -- appropriate turn
+ p = O implies OTurn[pre]
+
+ -- action:
+ post.board[row][col] = p
+ all row2: Int, col2: Int | (row!=row2 and col!=col2) implies {
+ post.board[row2][col2] = pre.board[row2][col2]
+ }
+}
+
+There are many ways to write this predicate. However, we're going to stick with this form because it calls out an important point. Suppose we had only written post.board[row][col] = p
for the action, without the all
on the next following lines. Those added lines, which we'll call a frame condition, say that all other squares remain unchanged; without them, the contents of any other square might change in any way. Leaving them out would cause an underconstraint bug: the predicate would be too weak to accurately describe moves in tic-tac-toe.
Exercise: comment out the 3 frame-condition lines and run the model. Do you see moves where the other 8 squares change arbitrarily?
+Exercise: could there be a bug in this predicate? (Run Forge and find out!)
+The all row2...
formula says that for any board location where both the row and column differ from the move's, the board remains the same. But is that what we really wanted? Suppose X
moves at location 1
, 1
. Then of the 9 locations, which is actually protected?
Row | Column | Protected? |
---|---|---|
0 | 0 | yes |
0 | 1 | no (column 1 = column 1) |
0 | 2 | yes |
1 | 0 | no (row 1 = row 1) |
1 | 1 | no (as intended) |
1 | 2 | no (row 1 = row 1) |
2 | 0 | yes |
2 | 1 | no (column 1 = column 1) |
2 | 2 | yes |
Our frame condition was too weak! We need to have it take effect whenever either the row or column is different. Something like this will work:
+ all row2: Int, col2: Int |
+ ((row2 != row) or (col2 != col)) implies {
+ post.board[row2][col2] = pre.board[row2][col2]
+ }
+
+
+Once someone wins a game, does their win still persist, even if more moves are made? I'd like to think so: moves never get undone, and in our model winning just means the existence of 3-in-a-row for some player. We probably even believe this property without checking it. However, it won't always be so straightforward to show that properties are preserved by the system. We'll check this one in Forge as an example of how you might prove something similar in a more complex system.
+This is our first step into the world of verification. Asking whether or not a program, algorithm, or other system always satisfies some assertion is a core problem in formal methods, and has a long and storied history.
+We'll tell Forge to find us pairs of states, connected by a move: the pre-state before the move, and the post-state after it. That's any potential transition in tic-tac-toe. The trick is in adding two more constraints. We'll say that someone has won in the pre-state, but they haven't won in the post-state.
+pred winningPreservedCounterexample {
+ some pre, post: Board | {
+ some row, col: Int, p: Player |
+ move[pre, post, row, col, p]
+ winner[pre, X]
+ not winner[post, X]
+ }
+}
+run {
+ all s: Board | wellformed[s]
+ winningPreservedCounterexample
+}
+
+The check passes---Forge can't find any counterexamples. We'll see this reported as "UNSAT" in the visualizer.
+ +Recall that our worldview for this model is that systems transition between states, and thus we can think of a system as a directed graph. If the transitions have arguments, we'll sometimes label the edges of the graph with those arguments. This view is sometimes called a discrete event model, because one event happens at a time. Here, the events are moves of the game. In a bigger model, there might be many different types of events.
+TODO: I think this goes later in the book, we don't want to start on finite traces yet.
+Today, we'll ask Forge to find us traces of the system, starting from an initial state. We'll also add a Game
sig to incorporate some metadata.
one sig Game {
+ initialState: one Board,
+ next: pfunc Board -> Board
+}
+
+pred traces {
+ -- The trace starts with an initial state
+ starting[Game.initialState]
+ no sprev: Board | Game.next[sprev] = Game.initialState
+ -- Every transition is a valid move
+ all s: Board | some Game.next[s] implies {
+ some row, col: Int, p: Player |
+ move[s, row, col, p, Game.next[s]]
+ }
+}
+
+By itself, this wouldn't be quite enough; we might see a bunch of disjoint traces. We could add more constraints manually, but there's a better option: tell Forge, at run
time, that next
represents a linear ordering on states.
run {
+ traces
+} for {next is linear}
+
+The key thing to notice here is that next is linear
isn't a constraint; it's a separate annotation given to Forge alongside a run
or a test. Never put such an annotation in a constraint block; Forge won't understand it. These annotations narrow Forge's bounds (the space of possible worlds to check) which means they can often make problems more efficient for Forge to solve.
In general, Forge accepts such annotations after numeric bounds. E.g., if we wanted to see full games, rather than unfinished game prefixes (the default bound on any sig, including Board
, is up to 4) we could have asked:
run {
+ traces
+} for exactly 10 Board for {next is linear}
+
+You might notice that because of this, some traces are excluded. That's because next is linear
forces exact bounds on Board
. More on this next time.
Moreover, since we're now viewing a single fixed instance, we can evaluate Forge expressions in it. This is great for debugging, but also for just understanding Forge a little bit better. Open the evaluator here at the bottom of the right-side tray, under theming. Then enter an expression or constraint here:
+Type in something like some s: State | winner[s, X]
. Forge should give you either #t
(for true) or #f
(for false) depending on whether the game shows X
winning the game.
You might notice that this model takes a while to run, after we start reasoning about full games. Why might that be? Let's re-examine our bounds and see if there's anything we can adjust. In particular, here's what the evaluator says we've got for integers:
+Wow---wait, do we really need to be able to count up to 7
for this model? Probably not. If we change our integer bounds to 3 Int
we'll still be able to use 0
, 1
, and 2
, and the search space is much smaller.
TODO: much of this should be moved later
+Suppose we're using Forge to search for family trees. There are infinitely many potential family tree instances, but Forge needs to work with a finite search space. This is where the bounds of a run
come in; they limit the set of instances Forge will even consider. Constraints you write are not yet involved at this point.
Once the bounded search space has been established, Forge uses the constraints you write to find satisfying instances within the bounded search space.
+The engine uses bounds and constraints very differently, and inferring constraints is often less efficient than inferring bounds. But the engine treats them differently, which means sometimes the distinction leaks through.
+In Forge, there is a special value called none
. It's analogous (but not the same!) to a null
in languages like Java.
Suppose I added this predicate to our run
command in the tic-tac-toe model:
pred myIdea {
+ all row1, col1, row2, col2: Int |
+ (row1 != row2 or col1 != col2) implies
+ Game.initialState.board[row1][col1] !=
+ Game.initialState.board[row2][col2]
+}
+
+I'm trying to express that every entry in the board is different. This should easily be true about the initial board, as there are no pieces there.
+For context, recall that we had defined a Game
sig earlier:
one sig Game {
+ initialState: one State,
+ nextState: pfunc State -> State
+}
+
+What do you think would happen?
+It's very likely this predicate would be unsatisfiable, given the constraints on the initial state. Why?
+Because none
equals itself! We can check this:
test expect {
+ nullity: {none != none} is unsat
+}
+
+Thus, when you're writing constraints like the above, you need to watch out for none
: the value for every cell in the initial board is equal to the value for every other cell!
The none
value in Forge has at least one more subtlety: none
is "reachable" from everything if you're using the built-in reachable
helper predicate. That has an impact even if we don't use none
explicitly. If I write something like: reachable[p.spouse, Nim, parent1, parent2]
I'm asking whether, for some person p
, their spouse is an ancestor of Nim
. If p
doesn't have a spouse, then p.spouse
is none
, and so this predicate would yield true for p
.
The keyword some
is used in 2 different ways in Forge:
some b: Board, p: Player | winner[s, p]
, which says that somebody has won in some board; andsome Game.initialState.board[1][1]
, which says that that cell of the initial board is populated. You can read some row : Int | ...
as "There exists some integer row
such that ...". The transliteration isn't quite as nice for all
; it's better to read all row : Int | ...
as "In all integer row
s, it holds that ...".
If you want to further restrict the values used in an all
, you'd use implies
. But if you want to add additional requirements for a some
, you'd use and
. Here are 2 examples:
parent1
doesn't also have that person as their parent2
": all p: Person | some p.parent1 implies p.parent1 != p.parent2
.parent1
and a spouse
": some p: Person | some p.parent1 and some p.spouse
.Technical aside: The type designation on the variable can be interpreted as having a character similar to these add-ons: and
(for some
) and implies
(for all
). E.g., "there exists some row
such that row
is an integer and ...", or "In all row
s, if row
is an integer, it holds that...".
some
Atom vs. Some InstanceForge searches for instances that satisfy the constraints you give it. Every run
in Forge is about satisfiability; answering the question "Does there exist an instance, such that...".
Crucially, you cannot write a Forge constraint that quantifies over instances themselves. You can ask Forge "does there exist an instance such that...", which is pretty flexible on its own. E.g., if you want to check that something holds of all instances, you can ask Forge to find counterexamples. This is how assert ... is necessary for ...
is implemented, and how the examples from last week worked.
The one
quantifier is for saying "there exists a UNIQUE ...". As a result, there are hidden constraints embedded into its use. one x: A | myPred[x]
really means, roughly, some x: A | myPred[x] and all x2: A | not myPred[x]
. This means that interleaving one
with other quantifiers can be subtle; for that reason, we won't use it except for very simple constraints.
If you use quantifiers other than some
and all
, beware. They're convenient, but various issues can arise.
Checking whether or not two predicates are equivalent is the core of quite a few Forge applications---and a great debugging technique sometimes.
+How do you do it? Like this:
+pred myPred1 {
+ some i1, i2: Int | i1 = i2
+}
+pred myPred2 {
+ not all i2, i1: Int | i1 != i2
+}
+assert myPred1 is necessary for myPred2
+assert myPred2 is necessary for myPred1
+
+If you get an instance where the two predicates aren't equivalent, you can use the Sterling evaluator to find out why. Try different subexpressions, discover which is producing an unexpected result! E.g., if we had written (forgetting the not
):
pred myPred2 {
+ all i2, i1: Int | i1 != i2
+}
+
+One of the assertions would fail, yielding an instance in Sterling you could use the evaluator with.
+I'm not going to use this syntax in class if I don't need to, because it's less intentional. But using it here lets me highlight a common conceptual issue with Forge in general.
+test expect {
+ -- correct: "no counterexample exists"
+ p1eqp2_A: {
+ not (myPred1 iff myPred2)
+ } is unsat
+ -- incorrect: "it's possible to satisfy what i think always holds"
+ p1eqp2_B: {
+ myPred1 iff myPred2
+ } is sat
+
+}
+
+These two tests do not express the same thing! One asks Forge to find an instance where the predicates are not equivalent (this is what we want). The other asks Forge to find an instance where they are equivalent (this is what we're hoping holds for any instance, not just one)!
+ +If you're taking this course or reading this book, you've probably had to complete some programming assignments---or at least written some small program for a course or an online tutorial. Take a moment to list a handful of such assignments: what did you have to build?
+Now ask yourself:
+In the context of a course, there are "expected" answers to these questions. For instance, you might say you know your code worked because you tested it (very thoroughly)! But is that really the truth? In terms of consequences, the true bar for excellent in a programming class is the grade you got. That is:
+But outside that context, as (say) a professional engineer, you lose the safety net. You might be working on a program that controls the fate of billions of dollars, tempers geopolitical strife, or controls a patient's insulin pump. Even if you had a TA, would you trust them to tell you that those programs worked? Would you trust your boss to understand exactly what needed to happen, and tell you exactly how to do it? Probably not! Instead, you need to think carefully about what you want, how to build it, and how to evaluate what you and others have built.
+As engineers, we strive for perfection. But perfection is an ideal; it's not obtainable. Why? First: we're human. Even if we could read our customers' minds, that's no guarantee that they know what they really need. And even if we can prove our code is correct, we might be checking for the wrong things. Second: our environment is hostile. Computers break. Patches to dependencies introduce errors. Cosmic radiation can flip bits in memory. 100% reliability is hopeless. Anyone who tells you differently is trying to sell you something.
+But that doesn't mean we should give up. It just means that we should moderate our expectations. This book won't teach you how to never make mistakes. Instead, it will teach you a multiple ways to increase your confidence in correctness.
+Let's start with unit-testing and deconstruct it. What does it do well? What does it do poorly?
+Exercise: Make two lists to answer the above question. Why do we test? What could go wrong, and how can the sort of testing you've done in other classes let us down?
+Hopefully we all agree that concrete input-output testing has its virtues and that we should keep doing it. But let's focus on the things that testing doesn't do well.
+You might have observed that (for most interesting programs, anyway) tests cannot be exhaustive because there are infinitely many possible inputs. And since we're forced to test non-exhaustively, we have to hope we pick good tests---tests that not only focus on our own implementation, but on others (like the implementation that replaces yours eventually) too.
+Worse, we can't test the things we don't think of, or don't know about; we're vulnerable to our limited knowledge, the availability heuristic, confirmation bias, and so on. In fact, we humans are generally ill equipped for logical reasoning, even if trained.
+Suppose we're thinking about the workings of a small company. We're given some facts about the company, and have to answer a question based on those facts. Here's an example. We know that:
+Question: Does someone who graduated from Brown directly supervise someone who graduated from another University?
+Yes! Regardless of whether Bob graduated from Brown, some Brown graduate supervises some non-Brown graduate. Reasoning by hypotheticals, there is one fact we don't know: where Bob graduated. In case he graduated Brown, he supervises Charlie, a non-Brown graduate. In case he graduated from another school, he's supervised by Alice, a Brown graduate.
+Humans tend to be very bad at reasoning by hypotheticals. There's a temptation to think that this puzzle isn't solvable because we don't know where Bob graduated from. Even Tim thought this at first after seeing the puzzle—in grad school! For logic!
+Now imagine a puzzle with a thousand of these unknowns. A thousand boolean variables means cases to reason through. Want to use a computer yet?
+ +Of course, this isn't really about logic puzzles.
+There's a real cryptographic protocol called the Needham-Schroeder public-key protocol. You can read about it here. Unfortunately the protocol has a bug: it's vulnerable to attack if one of the principles can be fooled into starting an exchange with a badly-behaved or compromised agent. We won't go into specifics. Instead, let's focus on the fact that it's quite easy to get things like protocols wrong, and sometimes challenging for us humans to completely explore all possible behaviors -- especially since there might be behaviors we'd never even considered! It sure would be nice if we could get a computer to help out with that.
+A pair of former 1710 students did an ISP on modeling crypto protocols, using the tools you'll learn in class. Here's an example picture, generated by their model, of the flaw in the Needham-Schroeder protocol:
+You don't need to understand the specifics of the visualization; the point is that someone who has studied crypto protocols would. And this really does show the classic attack on Needham-Schroeder. You may not be a crypto-protocol person, but you probably are an expert in something you'd like to model, and you might very well get the chance to do so this semester.
+The human species has been so successful, in part, because of our ability to use assistive devices—tools! Eyeglasses, bicycles, hammers, bridges: all devices that assist us in navigating the world in our fragile meat-bodies. One of our oldest inventions, writing, is an assistive device that increases our long-term memory space and makes that memory persistent. Computers are only one in a long line of such inventions.
+So, naturally, we've found ways to:
+There's a large body of work in Computer Science that uses logic to do all those things. We tend to call it formal methods, especially when the focus is on reasoning about systems. It touches on topics like system modeling, constraint solving, program analysis, design exploration, and more. That's what this book is about: the foundational knowledge to engage with many different applications of these ideas, even if you don't end up working with them directly every day.
+More concretely, we'll focus on a class of techniques called lightweight formal methods, which are characterized by tradeoffs that favor ease of use over strong guarantees (although we'll sometimes achieve those as well).
+Jeanette Wing and Daniel Jackson wrote a short article coining the term "lightweight FM" in the 90's, which you can find online.
+When we say "systems" in this book we don't necessarily mean the kind of systems you see in a class on networks, hardware architecture, or operating systems. You can apply the techniques in this book to those subjects quite naturally, but you can also apply it to user interfaces, type systems in programming, hardware, version control systems like Git, web security, cryptographic protocols, robotics, puzzles, sports and games, and much more. So we construe the word "system" very broadly.
+Here are some examples of "systems" that students have modeled in Forge: lifetimes in Rust, network reachability, and poker!
+For better or worse, The shape of engineering is changing. Lots of people are excited, scared, or both about large language models like ChatGPT. This book won't teach you how to use generative AI, so it's reasonable to wonder: why learn from this book, instead of another AI book?
+There are two questions that will never go out of style, and won't be answered by AI (at least, not in our lifetimes):
+Even setting aside the customer-facing aspects, we'll still need to think critically about what it is we want and how to evaluate whether we're getting it. The skills you learn here will remain useful (or become even more so) as engineering evolves. And those skills will be useful for more than just code.
+ + +The main tool we'll use in this course is Forge. Forge is a tool for modeling systems; we'll talk about what that means later on. For now, be aware that we'll be progressing through three language levels in Forge:
+We'll also use:
+