Byzantine Fault Tolerance - DCL

Byzantine Fault Tolerance and Consensus

Adi Seredinschi Distributed Programming Laboratory

1

(Original) Problem

2

Correct process

General goal:Run a distributed algorithm

(Original) Problem

2

Correct process

Arbitraryfailure

General goal:Run a distributed algorithm

(Recasting the) Problem

3

Client

ApplicationREQUEST

REPLY

Distributedalgorithm

Recasting the problem

4

Client

ApplicationREQUEST

REPLY


Application requirements: • High-Availability

(give a reply to a request) • Reliability

(give correct replies)

Boils down to fault-tolerance

SolutionFault-tolerance basic techniques:

• Agreement = Consensus

• Replication = State Machine Replication

In the following we will see…

• PBFT

• Seminal algorithm for Byzantine Fault Tolerance

5

Client

ApplicationREQUEST

REPLY


PBFT Practical Byzantine Fault Tolerance

6

OSDI’99

Miguel Castro

Barbara Liskov

Modules

7

ApplicationState Machine

Replication (SMR)

Consensus

Perfect LinksChannels

Modules

7

ApplicationState Machine

Replication (SMR)

Consensus


PBFT

Overview

PBFT:

• System model— slightly different from what we’ve seen so far

• SMR

• Consensus

8

System modelProcesses

Three types of processes in this algorithm:

• Clients

• n replicas

• one of them is primary

• others are backup

9

…

…1

2 n3

System modelFailure model

• Arbitrary (Byzantine) faults

• Clients: • Any client can be faulty

• Replicas: • n = 3f + 1 • f faulty (upper bound)

10

….

n=4 f=1

n=7 f=2

System model Network & crypto

• Assume perfect links

• Direct links between any two processes

• For messages:

• Public-key signatures, message authentication codes

• Avoid spoofing, replays, corruption

• Clients are authenticated

• Can revoke access to faulty clients

11

Overview

PBFT:


• SMR

• Consensus

12

State Machine Replication (SMR)

• A fault-tolerance technique • Basic ideas:

• Application = state machine

• Run the application on multiple processes

• Each processes is a faithful replica of the application

• Note: We can ignore the primary/backup distinction in this example

13

….

….

Process 1 Process n

SMR in a nutshell

14

S1

S1

S1

All SMR replicas start from the

same state (S1)

SMR in a nutshell

14

S1

S1

S1

a

a

a

S2

S2

S2


same state (S1)

Transition tostate (S2) due to

operation ‘a’

SMR in a nutshell

14

S1

S1

S1

a

a

a

S2

S2

S2


same state (S1)


operation ‘a’

a

null

c

SMR in a nutshell

14

S1

S1

S1

a

a

a

S2

S2

S2


same state (S1)


operation ‘a’

A faulty client can

wreak havoc

S2

S3

S4

a

null

c

SMR in a nutshell

14

S1

S1

S1

a

a

a

S2

S2

S2


same state (S1)


operation ‘a’

A faulty client can

wreak havoc

S2

S3

S4

a

null

c

rand()

rand()

rand()

SMR in a nutshell

14

S1

S1

S1

a

a

a

S2

S2

S2


same state (S1)


operation ‘a’

A faulty client can

wreak havoc

S2

S3

S4

a

null

c

Non-determinism is also harmful

S5

S6

S7

rand()

rand()

rand()

SMR in a nutshell

14

S1

S1

S1

a

a

a

S2

S2

S2


same state (S1)


operation ‘a’

A faulty client can

wreak havoc

S2

S3

S4

a

null

c

Non-determinism is also harmful

S5

S6

S7

rand()

rand()

rand()

Diverging states!

• Avoid diverging states

• All replicas must:

1. Start in the same state

2. Execute the same sequence of operations

3. Use only providedoperation (+parameters), thus avoid non-determinism

15

SMR Requirements

• Avoid diverging states

• All replicas must:

1. Start in the same state

2. Execute the same sequence of operations

3. Use only providedoperation (+parameters), thus avoid non-determinism

15

SMR Requirements

Simple

Consensus

Depends on application

SMRin PBFT

• Application = a distributed file system — Network File System (NFS)

• Operations = write to a file, delete, etc. • Primary/backup distinction is relevant

16

NFSState Machine

Replication (SMR)

Consensus


S1

…S1

SMRin PBFT

• Application = a distributed file system — Network File System (NFS)

• Operations = write to a file, delete, etc. • Primary/backup distinction is relevant

16

NFSState Machine

Replication (SMR)

Consensus


S1

…S1

The replicated state is a file system

17

SMRin PBFT

REQUEST

Client contacts the primary

with a request

17

SMRin PBFT

REQUEST


with a request

Cons

ensu

s

All replicas agree on the request

& execute it

17

SMRin PBFT

REQUEST


with a request

Cons

ensu

s

REPLY

REPLY

REPLY

Client gets thesame reply from

all correct replicas

REPLY


& execute it

18

SMRin PBFT

REQUEST


with a request

REPLY

REPLY

Redundancy in Replies =

Cope with failures

REPLY

Lost

REPLY


& execute it

Cons

ensu

s

18

SMRin PBFT

REQUEST


with a request

REPLY

REPLY

Redundancy in Replies =

Cope with failures

REPLY

Lost

REPLYREPLY


& execute it

Cons

ensu

s

Overview

PBFT:


• SMR

• Consensus

19

Consensus• The core for many algorithms, including: • TRB, Group membership, View synchronous

b-cast, State machine replication

20

Traditionally

• Processes propose values

• Agree on a proposed value

In PBFT: • Clients propose requests • Primary multicasts the

requests to backup replicas • Primary & replicas agree on

the sequence of request

Inst

ance

Inst

ance

21

• We’ll assume one client • Proposals = requests for

application operations

• Assume:

• n = 4, f = 1

• The faulty replica does not cooperate

• Concurrent requests: • Consensus to agree on a

sequential execution of requests

Consensus in PBFT

REQUEST

REQUEST

REQUEST

Algorithm ideas:

• Client sends requests to the primary replica

• Execute a sequence of consensus instances:

• Each instance is dedicated to a request

• Instances (and therefore requests) are sequentially ordered by the primary

• Backup replicas adopt requests from the primary in the imposed order

Properties: Validity, Agreement, Termination, Integrity

22

Consensus in PBFT

Consensus in PBFT

23

REQUEST

Cons

ensu

s

Consensus in PBFT

23

REQUEST

Cons

ensu

sA three-phase protocol

Consensus instance = three-phase protocol

24

REQUEST

123


25

REQUEST

123


26

REQUEST

123


27

REQUEST

123


28

REQUEST REPLY

123

Consensus Corner case

What if the primary is faulty, e.g. does not multicast the request to the backups?

• View change protocol: primary replaced by one of the backups

• Idea:

• Replicas are numbered 1 … n

• In view v, the replica p is the primary, wherep = v mod n

29

Consensus Why 3 phases?

1. PRE-PREPARE

2. PREPARE

3. COMMIT

30


1. PRE-PREPARE

2. PREPARE

3. COMMIT

30

Agree on (the order of) requests within the same view


1. PRE-PREPARE

2. PREPARE

3. COMMIT

30

Agree on (the order of) requests within the same view

Ensure that requests which execute are totally ordered

across different views +

Garbage collection

Practical BFT“Reasonable overhead”

• Does not assume synchrony

• Some clever optimizations:

• MD5 replaces digital signatures

• Message digests

• Read-only requests, tentative execution

31

PBFTNFS

3%

• Castro, M., & Liskov, B. (1999). Practical Byzantine fault tolerance. OSDI, (February), 1–14. Available at: http://dl.acm.org/citation.cfm?id=296806.296824

• Castro, M. (2011). Practical Consensus. Microsoft Research Cambridge. Available at: http://msrvideo.vo.msecnd.net/rmcvideos/167097/dl/167097.pdf

Further reading

32

http://dl.acm.org/citation.cfm?id=296806.296824

http://msrvideo.vo.msecnd.net/rmcvideos/167097/dl/167097.pdf

Byzantine Fault Tolerance - DCL

Documents

Transcript of Byzantine Fault Tolerance - DCL