Byzantine Fault Tolerance - DCL
Byzantine Fault Tolerance and Consensus
Adi Seredinschi Distributed Programming Laboratory
(Original) Problem
General goal: Run a distributed algorithm
[Diagram labels: Correct process, Arbitrary failure]
(Recasting the) Problem
[Diagram labels: Client, REQUEST, REPLY, Application, Distributed algorithm]
Application requirements:
• High-Availability (give a reply to a request)
• Reliability (give correct replies)
Boils down to fault-tolerance
Solution
Fault-tolerance basic techniques:
• Agreement = Consensus
• Replication = State Machine Replication
In the following we will see…
• PBFT
  • Seminal algorithm for Byzantine Fault Tolerance
PBFT: Practical Byzantine Fault Tolerance
Miguel Castro, Barbara Liskov
OSDI’99
Modules
[Diagram: stack of modules, top to bottom; the final build adds the label PBFT]
• Application
• State Machine Replication (SMR)
• Consensus
• Perfect Links (Channels)
Overview
PBFT:
• System model: slightly different from what we’ve seen so far
• SMR
• Consensus
System model: Processes
Three types of processes in this algorithm:
• Clients
• n replicas
  • one of them is primary
  • others are backup
[Diagram: replicas numbered 1, 2, 3, …, n]
System model: Failure model
• Arbitrary (Byzantine) faults
• Clients:
  • Any client can be faulty
• Replicas:
  • n = 3f + 1
  • f faulty (upper bound)
Examples: n = 4, f = 1; n = 7, f = 2
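The relation n = 3f + 1 can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names `min_replicas` and `max_faulty` are made up for illustration and do not come from the slides.

```python
def min_replicas(f: int) -> int:
    """Smallest number of replicas that tolerates f faulty ones: n = 3f + 1."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1 (integer division rounds down)."""
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 3


# The two examples from the slide: n = 4 with f = 1, and n = 7 with f = 2.
for f in (1, 2):
    print(f"f = {f} needs n = {min_replicas(f)} replicas")
```

Note that adding replicas does not always help: with n = 6 the bound is still f = 1, because 3f + 1 must fit within n.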
System model: Network & crypto
• Assume perfect links
  • Direct links between any two processes
• For messages:
  • Public-key signatures, message authentication codes
  • Avoid spoofing, replays, corruption
• Clients are authenticated
  • Can revoke access to faulty clients
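To make the role of message authentication codes concrete, here is a minimal sketch using Python's standard `hmac` module. The key, the message text, and the helper names `make_mac` and `verify_mac` are hypothetical, and SHA-256 is chosen here only as a convenient hash; none of this is taken from the slides.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> bytes:
    """Compute a MAC over the message with a key shared by sender and receiver."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), tag)


shared_key = b"hypothetical-session-key"   # assumption: agreed on beforehand
request = b"REQUEST write file.txt"        # assumption: illustrative payload
tag = make_mac(shared_key, request)

print(verify_mac(shared_key, request, tag))              # genuine message
print(verify_mac(shared_key, request + b"!", tag))       # corrupted message
print(verify_mac(b"attacker-key", request, tag))         # spoofed sender
```

A MAC by itself detects corruption and spoofing but not replays; for that, the authenticated message would also need to carry something fresh, such as a timestamp or counter, which this sketch leaves out.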
State Machine Replication (SMR)
• A fault-tolerance technique
• Basic ideas:
  • Application = state machine
  • Run the application on multiple processes
  • Each process is a faithful replica of the application
• Note: We can ignore the primary/backup distinction in this example
[Diagram: Process 1 … Process n]
SMR in a nutshell
• All SMR replicas start from the same state (S1)
• Transition to state (S2) due to operation ‘a’
• A faulty client can wreak havoc
  [Diagram: the three replicas receive ‘a’, null, and ‘c’, and end up in states S2, S3, and S4]
• Non-determinism is also harmful
  [Diagram: each replica executes rand(), and the replicas end up in states S5, S6, and S7]
• Diverging states!
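The divergence shown on this slide can be reproduced with a toy state machine. The sketch below is an illustration only: the `Replica` class, the choice of a list of applied operations as the state, and the helper `run` are all assumptions, not part of the presentation.

```python
class Replica:
    """A toy state machine whose state is the list of operations applied so far."""

    def __init__(self):
        self.state = []  # every replica starts from the same (empty) state

    def apply(self, op):
        # A null operation (None) leaves the state unchanged.
        if op is not None:
            self.state.append(op)


def run(ops_per_replica):
    """Feed each replica its own operation sequence and return the final states."""
    replicas = [Replica() for _ in ops_per_replica]
    for replica, ops in zip(replicas, ops_per_replica):
        for op in ops:
            replica.apply(op)
    return [tuple(r.state) for r in replicas]


# Same operation everywhere: all replicas end in the same state.
same = run([["a"], ["a"], ["a"]])

# A faulty client then sends 'a', null, and 'c' to different replicas.
diverged = run([["a", "a"], ["a", None], ["a", "c"]])

print(len(set(same)), "distinct state(s) after identical operations")
print(len(set(diverged)), "distinct state(s) after the faulty client")
```

Counting distinct final states is a simple way to see whether the replicas still agree.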
SMR Requirements
• Avoid diverging states
• All replicas must:
1. Start in the same state (simple)
2. Execute the same sequence of operations (consensus)
3. Use only the provided operation (+ parameters), thus avoiding non-determinism (depends on application)
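Requirement 3 can be illustrated by contrasting an operation that draws its own random number with one that receives the value as a parameter. This is a sketch under assumptions: the function names, the integer state, and the value 42 are invented for the example.

```python
import random


def apply_with_local_randomness(state: int, rng: random.Random) -> int:
    """Non-deterministic step: each replica consults its own random source."""
    return state + rng.randrange(1_000_000)


def apply_with_parameter(state: int, value: int) -> int:
    """Deterministic step: the value is part of the operation itself."""
    return state + value


# Each replica has its own generator, so the results may differ per replica.
local_results = [apply_with_local_randomness(0, random.Random()) for _ in range(3)]

# Hypothetical value chosen once and shipped with the operation to every replica.
chosen_value = 42
agreed_results = [apply_with_parameter(0, chosen_value) for _ in range(3)]

print(local_results)   # typically three different numbers
print(agreed_results)  # the same number three times
```

The design point is that anything a replica would otherwise decide locally (random numbers, clocks) is fixed once and passed along as a parameter of the operation.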
SMR in PBFT
• Application = a distributed file system: Network File System (NFS)
• Operations = write to a file, delete, etc.
• Primary/backup distinction is relevant
• The replicated state is a file system
[Diagram: module stack with NFS as the application, over State Machine Replication (SMR), Consensus, and Perfect Links (Channels); replicas start in state S1]
SMR in PBFT
1. REQUEST: Client contacts the primary with a request
2. Consensus: All replicas agree on the request & execute it
3. REPLY: Client gets the same reply from all correct replicas
• Redundancy in replies = cope with failures (in the diagram, one REPLY is lost)
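How a client might use the redundant replies can be sketched as a simple vote. The threshold of f + 1 matching replies from different replicas follows the Castro and Liskov paper rather than these slides, and the function name `accept_reply` and the reply strings are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def accept_reply(replies: Dict[int, str], f: int) -> Optional[str]:
    """Return a result once at least f + 1 different replicas reported it.

    `replies` maps a replica id to the result that replica sent, so each
    replica is counted at most once. Returns None if no result has enough votes.
    """
    votes = Counter(replies.values())
    for result, count in votes.most_common():
        if count >= f + 1:
            return result
    return None


f = 1  # n = 4 replicas, at most one faulty

# One reply is lost and one replica lies: two matching replies still suffice.
print(accept_reply({1: "ok", 2: "ok", 3: "bogus"}, f))

# Only one honest reply has arrived so far: keep waiting.
print(accept_reply({1: "ok", 3: "bogus"}, f))
```

With f + 1 matching replies, at least one of them must come from a correct replica, so a lost or forged reply cannot mislead the client.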
Consensus
• The core for many algorithms, including:
  • TRB, Group membership, View synchronous b-cast, State machine replication
Traditionally:
• Processes propose values
• Agree on a proposed value
In PBFT:
• Clients propose requests
• Primary multicasts the requests to backup replicas
• Primary & replicas agree on the sequence of requests
[Diagram: a sequence of consensus instances]
Consensus in PBFT
• We’ll assume one client
• Proposals = requests for application operations
• Assume:
  • n = 4, f = 1
  • The faulty replica does not cooperate
• Concurrent requests:
  • Consensus to agree on a sequential execution of requests
Algorithm ideas:
• Client sends requests to the primary replica
• Execute a sequence of consensus instances:
  • Each instance is dedicated to a request
  • Instances (and therefore requests) are sequentially ordered by the primary
  • Backup replicas adopt requests from the primary in the imposed order
Properties: Validity, Agreement, Termination, Integrity
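The ordering idea above (the primary numbers the requests, the backups adopt them in that order) can be sketched as follows. The sketch deliberately leaves out the agreement phases and all fault handling; the `Primary` and `Backup` classes and the request names are assumptions made for illustration.

```python
class Primary:
    """Assigns consecutive sequence numbers: one consensus instance per request."""

    def __init__(self):
        self.next_seq = 1

    def order(self, request):
        seq = self.next_seq
        self.next_seq += 1
        return seq, request


class Backup:
    """Adopts requests in the order imposed by the primary, even if they arrive shuffled."""

    def __init__(self):
        self.pending = {}          # sequence number -> request not yet executed
        self.next_to_execute = 1
        self.log = []              # requests in execution order

    def adopt(self, seq, request):
        self.pending[seq] = request
        # Execute every request whose predecessors have all been executed.
        while self.next_to_execute in self.pending:
            self.log.append(self.pending.pop(self.next_to_execute))
            self.next_to_execute += 1


primary = Primary()
ordered = [primary.order(r) for r in ("write", "delete", "read")]

backup = Backup()
for seq, request in reversed(ordered):  # deliver in the "wrong" network order
    backup.adopt(seq, request)

print(backup.log)
```

Buffering out-of-order arrivals until the gap is filled is what keeps the execution sequence identical on every backup.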
Consensus instance = three-phase protocol
[Diagram: a REQUEST triggers a consensus instance with phases 1, 2, and 3, followed by the REPLY]
Consensus: Corner case
What if the primary is faulty, e.g. does not multicast the request to the backups?
• View change protocol: primary replaced by one of the backups
• Idea:
  • Replicas are numbered 1 … n
  • In view v, the replica p is the primary, where p = v mod n
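The rule p = v mod n rotates the primary role through the replicas as views change. One detail needs an assumption: with replicas numbered 1 … n, a remainder of 0 does not name a replica, so the sketch below reads it as replica n. (The Castro and Liskov paper numbers replicas from 0 instead, in which case v mod n always names a replica directly.) The function name `primary_for_view` is hypothetical.

```python
def primary_for_view(view: int, n: int) -> int:
    """Return the id (1..n) of the primary in the given view.

    Assumption of this sketch: replicas are numbered 1..n as on the slide,
    and a remainder of 0 is read as replica n.
    """
    if n < 1 or view < 0:
        raise ValueError("need n >= 1 and view >= 0")
    p = view % n
    return p if p != 0 else n


n = 4
rotation = [primary_for_view(v, n) for v in range(1, 7)]
print(rotation)  # the primary role cycles through replicas 1, 2, 3, 4 and wraps around
```

Each view change advances v by one, so a faulty primary is replaced by the next replica in the rotation.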
Consensus: Why 3 phases?
1. PRE-PREPARE
2. PREPARE
3. COMMIT
• Agree on (the order of) requests within the same view
• Ensure that requests which execute are totally ordered across different views
• + Garbage collection
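The slides do not give the message counts behind the three phases. In the Castro and Liskov paper, a replica treats a request as prepared once it holds the pre-prepare and 2f matching PREPARE messages from different replicas, and as committed once it is prepared and 2f + 1 matching COMMIT messages have arrived. The sketch below encodes just those two thresholds; the function names and the plain counters are simplifications assumed for illustration.

```python
def prepared(has_pre_prepare: bool, matching_prepares: int, f: int) -> bool:
    """Order agreed within a view: pre-prepare plus 2f matching prepares."""
    return has_pre_prepare and matching_prepares >= 2 * f


def committed_local(is_prepared: bool, matching_commits: int, f: int) -> bool:
    """Safe to execute: prepared plus 2f + 1 matching commits (own commit included)."""
    return is_prepared and matching_commits >= 2 * f + 1


f = 1  # n = 4, as assumed earlier in the slides

# With one silent faulty replica, the three correct ones still reach both thresholds.
ok_prepare = prepared(has_pre_prepare=True, matching_prepares=2, f=f)
ok_commit = committed_local(ok_prepare, matching_commits=3, f=f)
print(ok_prepare, ok_commit)
```

With n = 4 and f = 1, the thresholds are 2 prepares and 3 commits, which the three cooperating replicas can supply without the faulty one.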
Practical BFT: “Reasonable overhead”
• Does not assume synchrony
• Some clever optimizations:
  • MD5-based message authentication codes replace digital signatures
  • Message digests
  • Read-only requests, tentative execution
[Chart: PBFT vs. NFS, 3%]
Further reading
• Castro, M., & Liskov, B. (1999). Practical Byzantine fault tolerance. OSDI, (February), 1–14. Available at: http://dl.acm.org/citation.cfm?id=296806.296824
• Castro, M. (2011). Practical Consensus. Microsoft Research Cambridge. Available at: http://msrvideo.vo.msecnd.net/rmcvideos/167097/dl/167097.pdf