Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(spec/consensus): Small fixes to overview.md file. #873

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 45 additions & 28 deletions specs/consensus/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,8 @@

## Heights

The algorithm presented in the [pseudo-code][pseudo-code] represent the
operation of an instance of consensus in a process `p`.
The algorithm presented in the [pseudo-code][pseudo-code] represents the
operation the consensus algorithm running in a process `p`.

Each instance or **height** of the consensus algorithm is identified by an
integer, represented by the `h_p` variable in the pseudo-code.
Expand All @@ -39,7 +39,7 @@
of consensus, in which the same algorithm is executed again.

For the sake of the operation of the consensus algorithm, heights are
completely independent executions.
completely independent executions. NM{Not sure I fully understand what do we want to say here?}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Messages and events from a height H have no effect on the operation of any height H' != H.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
completely independent executions. NM{Not sure I fully understand what do we want to say here?}
communication-closed, that is, a message from some height `H` does not influence a process that is in height `H' != H`.

For this reason, in the remainder of this document we consider and discuss the
execution of a **single height of consensus**.

Expand Down Expand Up @@ -73,7 +73,7 @@
The current round step of a process `p` is stored in the `step_p` variable.
In general terms, when entering a round step, a process performs one or more
**actions**.
And the reception a given set of **events** while in a round step, leads the
And the reception of a given set of **events** while in a round step, leads the
process to move to the next round step.

### Propose
Expand All @@ -83,7 +83,7 @@
function, where it also increases `round_p` to the started `round` number.
The `propose` step is the only round step that is asymmetric: different
processes perform different actions when starting it.
More specifically, the proposer of the round has a distinguish role in this round step.
More specifically, the proposer of the round has a distinguished role in this round step.

In the `propose` round step, the **proposer** of the current round selects the
value to be the proposed in that round and **broadcast**s the proposed value to all
Expand All @@ -97,7 +97,7 @@
`propose` step.
The value proposed for the round can be accepted (lines 24 or 30) or rejected
(lines 26 or 32) by the process.
The proposed value can be also rejected if not received when the timeout
The proposed value can also be rejected if not received before the timeout
scheduled in the `propose` step expires (line 59).

The action taken by a process when it moves from the `propose` to the `prevote`
Expand Down Expand Up @@ -166,9 +166,9 @@
associated to a [round step](#round-steps):

- `⟨PROPOSAL, h, r, v, vr⟩`: broadcast by the process returned by the proposer
selection function `proposer(h, r)` function when entering the
selection function [`proposer(h, r)`](#proposer-selection) when entering the
[`propose`](#propose) step of round `r` of height `h`.
Carries the proposed value `v` for height `h` of consensus.
It carries the proposed value `v` for height `h` of consensus.
Since only proposed values can be decided, the success of round `r` depends
on the reception of this message.
- `⟨PREVOTE, h, r, *⟩` broadcast by all processes when entering the
Expand All @@ -180,7 +180,8 @@
[`precommit`](#precommit) step of round `r` of height `h`.
The last field can be either the unique identifier `id(v)` of a proposed
value `v` for which the process has received `⟨PREVOTE, h, r, id(v)⟩`
messages from a super-majority of processes, or the special `nil` value otherwise.
messages from a [super-majority](#voting-power) of processes, or the special `nil` value otherwise.
NM{I am not sure we have defined super-majority anywhere?}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add a link to the #voting-power section, where we define it properly. Since we indeed did not define it yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NM{I am not sure we have defined super-majority anywhere?}

I think @cason already addressed this


Before discussing in detail the role of each message in the protocol, it is
worth highlighting a main aspect that differentiate the adopted messages.
Expand Down Expand Up @@ -215,10 +216,10 @@
The proposer of a round `r` defines which value `v` it will propose based on
the values of the two state variables `validValue_p` and `validRound_p`.
They are initialized to `nil` and `-1` at the beginning of each height, meaning
that the process is not aware of any proposed value that has became **valid**
that the process is not aware of any proposed value that has become **valid**
in a previous round.
A value `v` becomes **valid** at round `r` when a `PROPOSAL` for `v` and an
enough number of `PREVOTE` messages for `id(v)` are received during round `r`.
A value `v` becomes **valid** at round `r` when a `PROPOSAL` for `v` and
`PREVOTE` messages for `id(v)` are received from a super-majority of processes during round `r`.
This logic is part of the pseudo-code block from line 36, where `validValue_p`
and `validRound_p` are updated.

Expand Down Expand Up @@ -257,20 +258,23 @@
Attacks 2. and 3. are constitute forms of the **amnesia attack** and are harder
to identify.
Notice, however, that correct processes check whether they can accept a proposed
value `v` with valid round `vr` based in the content of its state variables
value `v` with a valid round `vr` based on the content of its state variables
`lockedValue_p` and `lockedRound_p` (lines 23 and 29) and are likely to reject
such proposals.

Attack 4. constitutes a double-signing or **equivocation** attack.
The most common approach for a correct process is to only consider the first
A correct process can only consider the first
`⟨PROPOSAL, h, r, v, *⟩` received in the `propose` step, which can be accepted
or rejected.
sounds like they can also act in a different way}
However, it is possible that a different `⟨PROPOSAL, h, r, v', *⟩` with
`v' != v` is accepted by different processes and, as a result, triggers state
transitions in the `prevote` or `precommit` round steps.
So, a priori, the algorithm expects a correct process to store all the
multiple proposals broadcast by a Byzantine proposer.
Which, by itself, constitutes an attack vector to be considered.
NM{I don't know how this is implemented in the code, but in a round process do need to store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. Are you sure that this approach (always) work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can start a discussion or have an issue on that. And point out in > code that this might be a solution. At the moment I am not sure that this is strictly correct.

In any case, the algorithm assumes we to store all proposals, and changes on that should come in the hopefully upcoming design section or directory.

all values but if one of the values collects 2f+1 prevotes it can discard others? }

While hard to handle, it is easy to prove that a process has performed an
equivocation attack: it is enough to receive and store distinct messages for
Expand Down Expand Up @@ -368,12 +372,12 @@
expected contents of its `lockedValue_q` and `lockedRound_q` variables.

Since Byzantine processes can always produce **equivocation attacks**, a
correct process can deal with them is by only considering the first
correct process can deal with them by only considering the first
`⟨PREVOTE, h, r, *⟩` or `⟨PRECOMMIT, h, r, *⟩` messages received from a process
in a round `r` of height `h`.
Different (equivocating) versions of the same message from the same sender
should, from a defensive point of view, be disregarded and dropped by the
consensus logic as they were duplicated messages.
consensus logic as they were duplicated messages. NM{Ok I guess this is the change you want to implement or have alrady implemented, that is different from the paper?}

Check warning on line 380 in specs/consensus/overview.md

View workflow job for this annotation

GitHub Actions / Check spelling

"alrady" should be "already".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are talking about the algorithm, that assumes that we store multiple votes from a Byzantine sender, and reasoning that we could simplify the assumption.

Then in the next paragraph we show that this is not that simple.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point here is that it should be safe to disregard equivocating votes. The next paragraph shows that this, however, can constitute a liveness issue.

Probably we need to reword this discussion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
consensus logic as they were duplicated messages. NM{Ok I guess this is the change you want to implement or have alrady implemented, that is different from the paper?}
consensus logic as they were duplicated messages (note that doing this maintains safety, but leads to violations of the gossip property from the [Tendermint paper](https://arxiv.org/abs/1807.04938), and thus introduces liveness issues; see below).

The reason for which is the fact that a Byzantine process can produce an
arbitrary number of such messages, therefore store all of them may constitute
an attack vector.
Expand Down Expand Up @@ -575,15 +579,15 @@
the data about the blockchain state, that is, past decisions etc., is captured in `decision_p`.)
This implies that
a value that might be valid at blockchain height 5, might be invalid at height 6. Observe
that this, strictly speaking, the above defined property 1.
that this, strictly speaking, violates the above defined property 1.
However, as long as all processes agree on
the state of the chain up to height 5 when they
start consensus for height 6, this still satisfies properties 2 and 3 and it will not harm
liveness. (There have been cases of
non-determinism in the application level that led to processes disagreeing on the
application state and thus consensus being blocked at a given height)

> **Remark.** We have seen slight (ab)uses of valid, that use external data. Consider he toy
> **Remark.** We have seen slight (ab)uses of valid, that use external data. Consider the toy
example of the proposer proposing the current room temperature, and the processes
checking in `valid` whether the temperature is within one degree of their room
temperature. It is easy to see that this a priori violates the points 1-3 above.
Expand Down Expand Up @@ -744,7 +748,7 @@
for an arbitrary amount of time.
There is, however, a (possibly unknown) Global Stabilization Time (`GST`), a
time from which the system becomes synchronous and `∆` is observed for all
messages sent by correct processes.
messages sent by correct processes. \NM{Ok this "possibly unknown" in both sentences is confusing me? Did you try to somehow capture both versions of partially synchronous system models? }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we are. But this is theoretically confusing.

A solution here is to remove the two "(possibly unknown)". And possibly add a sentence that either Delta or GST might be unknown. But from the next paragraph, this sentence is probably of not much help.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine, I would just keep it. Even if both are unknown, the algorithm should work, no? That would be the model in the Chandra/Toueg paper on failure detectors.


In practical systems, `GST` is usually unknown, although it is assumed that the
system eventually stabilizes.
Expand Down Expand Up @@ -815,7 +819,7 @@
in round `vr`, if `p` _had received_ the POL messages while in the `prevote`
round step of round `vr`.
If `p` could have locked `v` in round `vr`, then any correct process could
have produced a valid lock in round `vr`.
have produced a valid lock in round `vr`. \NM{Ok, this explanation is too high level for me but I understand that it is hard to explain at this point that it can accept it because it knows that lockedValue is not decided}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I understood. What should we change here?

And more recent (from high-numbered rounds) locks prevail.

> **Remark**: notice that the actual condition in line 29 is `vr >= lockedRound_p`.
Expand All @@ -838,9 +842,9 @@
Those values are rejected by processes locked in round `r`, but they are
accepted by correct processes that have no locks.
With the votes of Byzantine processes, which may misbehave, it is then
possible to produce another lock in a round `r' > r`.
possible to produce another lock in a round `r' > r`. \NM{These processes do not need to be Byzantine.}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I understood.

The previous sentence mentions correct processes that accept the value because they are not locked. Are you saying that we might not need votes from Byzantine ones?

None of the locked values can be decided, for the lack of votes, but liveness
is under threat, as detailed in this [discussion][equivocation-discussion].
is under threat, as detailed in this [discussion][equivocation-discussion]. \NM{Potentially they cannot be decided, so maybe we can rephrase a bit this sentence. }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph is describing a scenario, where we assume that this happens. So I am not sure if we need to fix the sentence.

The only way out of this _hidden locks_ scenario is when the processes locked
in round `r` learn the POL for round `r' > r`, so that they can disregard
their own lock.
Expand All @@ -859,14 +863,14 @@
may have locked `v` in this round.
In this case, to ensure liveness, if the process becomes a proposer in a future
round, it should re-propose `v`.
This is achieved by setting `validValue_p` to `v` the in the pseudo-code line
This is achieved by setting `validValue_p` to `v` in the pseudo-code line
42 then using it as the proposal value when it becomes the proposer of a round,
in line 16.
The concrete scenario is detailed as follows.

A proposed value `v` becomes _globally_ valid (in opposition to the _local_
validity represented by the [`valid(v)` function](#validation)) in a round `r`
when it is accepted by a big enough number of processes; they accept it by
when it is accepted by a super-majority of processes; they accept it by
broadcasting a `PREVOTE` for `id(v)`.
If a process `p` observes these conditions, while still in round `r`, line 36
of the pseudo-code is eventually triggered.
Expand All @@ -879,6 +883,8 @@
3. `step_p = precommit`: in this case, `p` only updates `validValue_p` to `v`
and `validRound_p` to `r`.

\NM{I would maybe just talk about two scenarios here, when it locks and when it just updates valid values}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, at the end there are two. For the first we say that either 2. or 3. apply (or maybe none of them).


In scenarios 1 and 2, `p` locks the proposed value `v` and therefore cannot
accept any value different than `v` in rounds greater than `r`.
To provide liveness, if `p` becomes the proposer of a upcoming round, it must
Expand Down Expand Up @@ -907,6 +913,11 @@
This is the reason for which pseudo-code lines 15-16 adopt `validValue_p`
instead of `lockedValue_p`.

\NM{The explanations here are fine for me, but I am not good to review this as I know in detail how this mechanism
works. I think we should ask someone who does not understand how valid value and locked value correlate and see
if he/she got it from this. I think this explanation introduces the difference but do not explain the mechanism.
Update: I saw that latter you do explain, so I am happy with the explanations. }
Comment on lines +916 to +919
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
\NM{The explanations here are fine for me, but I am not good to review this as I know in detail how this mechanism
works. I think we should ask someone who does not understand how valid value and locked value correlate and see
if he/she got it from this. I think this explanation introduces the difference but do not explain the mechanism.
Update: I saw that latter you do explain, so I am happy with the explanations. }

I guess the update says that it is fine as is.


### Safety

This section argues that Tendermint is a safe consensus algorithm.
Expand All @@ -929,7 +940,8 @@
- every process `p` in `C` has set `lockedValue_p` to `v` and `lockedRound_p`
to `r` ;
- every process `p` in `C` will broadcast a `PREVOTE` for `nil` in rounds
`r' > r` where a value `v' != v` is proposed.
`r' > r` where a value `v' != v` is proposed. \NM{this might not be clear for all and this is
what we explain below, so maybe we can somehow rephrase" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where we describe there, I could not follow here. Maybe we can explain why they prevote nil, otherwise I don't know how to change the description.


Next, lets define a `Q'` any set of processes that does not include process in
`C`, i.e. `Q' ∩ C = ∅`:
Expand Down Expand Up @@ -984,14 +996,14 @@
- The content of the variables `lockedValue_p` and `validValue_p` change in
a rather complicated manner at different processes in arbitrary
asynchronous prefixes of the computation.
- **Synchrony.** Messages should be delivered and timeouts should not expire.
- **Synchrony.** Messages should be delivered before the associated timeouts expire.

#### Value Handling

The first set of conditions for the success of a round `r` depends on its proposer:

1. The [`proposer(h, r)` function](#proposer-selection) returns a correct process `p`;
2. And `validRound_p` equals the maximum `validRound_q` among every correct process `q`.
2. And `validRound_p` is larger or equals the maximum `lockedRound_q` among every correct process `q`.

A correct proposer `p` (Condition 1) broadcasts a single `PROPOSAL` message (it
does not equivocate) in round `r`, including a proposed value `v` that will be
Expand All @@ -1018,6 +1030,10 @@
Notice, however, that from the [Gossip communication property](#network), `q`
should eventually receive the POL for `v` in round `vr`, since `p` is a correct
process.
And, as we detail on the next section, when the synchrony assumptions are observed, `p` will receive such messages before it receives the new `PROPOSAL`.
\NM{I think we should mention in the text here that we show the intution how condition 2 is achieved later, because
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Added a sentence to the last paragraph, check if it works. :)

while reading I was feeling that this was missing, and then I saw that you address it below}


#### Synchrony

Expand Down Expand Up @@ -1047,14 +1063,15 @@
In other words, it should be ensured that, from `GST` on, if any correct
process executes pseudo-code line 36 for a round `r*`, then every correct
process executes the same line while still in round `r*`.
This is achieved by the means of, and it is actually the reason for which
This is achieved by the means of, and it is actually the key reason for which
Tendermint has, the `timeoutPrecommit(r*)`.
Before moving to the next round, because the current one has not succeeded,
processes wait for a while to ensure that they have received all the `PREVOTE`
messages from that round, broadcast or received by other correct processes.
As a result, if a correct process has locked or updated its valid value in
round `r*`, then every correct process will at least update its valid value in
round `r*`.
This happens because, due to [Gossip communication property](#network), every correct process receives the same set of messages within `∆`.
This enables the next correct proposer, of a round `r > r*` to fulfill
Condition 2, thus enabling `r` to be a successful round.

Expand Down
Loading