Byzantine Fault Tolerance

33
Byzantine Fault Tolerance CS 425: Distributed Systems Fall 2011 1 Material drived from slides by I. Gupta and N.Vaidya

description

Byzantine Fault Tolerance. CS 425: Distributed Systems Fall 2011. Material drived from slides by I . Gupta and N .Vaidya. Reading List. L. Lamport , R. Shostak , M. Pease, “The Byzantine Generals Problem,” ACM ToPLaS 1982. - PowerPoint PPT Presentation

Transcript of Byzantine Fault Tolerance

Page 1: Byzantine Fault Tolerance

1

Byzantine Fault Tolerance

CS 425: Distributed SystemsFall 2011

Material drived from slides by I. Gupta and N.Vaidya

Page 2: Byzantine Fault Tolerance

2

Reading List

• L. Lamport, R. Shostak, M. Pease, “The Byzantine Generals Problem,” ACM ToPLaS 1982.

• M. Castro and B. Liskov, “Practical Byzantine Fault Tolerance,” OSDI 1999.

Page 3: Byzantine Fault Tolerance

Byzantine Generals Problem

A sender wants to send message to n-1 other peers

• Fault-free nodes must agree

• Sender fault-free agree on its message

• Up to f failures

Page 4: Byzantine Fault Tolerance

Byzantine Generals Problem

A sender wants to send message to n-1 other peers

• Fault-free nodes must agree

• Sender fault-free agree on its message

• Up to f failures

Page 5: Byzantine Fault Tolerance

S

321

v

vv

Byzantine Generals Algorithm

5

value v

Faulty peer

Page 6: Byzantine Fault Tolerance

S

321

v

vv

6

value v

v v

Byzantine Generals Algorithm

Page 7: Byzantine Fault Tolerance

S

321

v

vv

7

value v

v v

?

?

Byzantine Generals Algorithm

Page 8: Byzantine Fault Tolerance

S

321

v

vv

8

value v

v

v v

?

v?

Byzantine Generals Algorithm

Page 9: Byzantine Fault Tolerance

32

v

vv

9

value v

x

v v

?

v?[v,v,?]

[v,v,?]

S

1

Byzantine Generals Algorithm

Page 10: Byzantine Fault Tolerance

32

v

vv

10

value v

x

v v

?

v?v

v

S

1Majorityvote resultsin correctresult atgood peers

Byzantine Generals Algorithm

Page 11: Byzantine Fault Tolerance

S

321

v

wx

11

Faulty source

Byzantine Generals Algorithm

Page 12: Byzantine Fault Tolerance

S

32

v

w

x

12

w w1

Byzantine Generals Algorithm

Page 13: Byzantine Fault Tolerance

S

32

v

wx

13x

w w

v

xv

1

Byzantine Generals Algorithm

Page 14: Byzantine Fault Tolerance

S

32

v

wx

14x

w w

v

xv

1[v,w,x]

[v,w,x]

[v,w,x]

Byzantine Generals Algorithm

Page 15: Byzantine Fault Tolerance

S

32

v

wx

15x

w w

v

xv

1[v,w,x]

[v,w,x]

[v,w,x]

Vote resultidentical atgood peers

Byzantine Generals Algorithm

Page 16: Byzantine Fault Tolerance

Known Results

• Need 3f + 1 nodes to tolerate f failures

• Need Ω(n2) messages in general

16

Page 17: Byzantine Fault Tolerance

Ω(n2) Message Complexity

• Each message at least 1 bit

• Ω(n2) bits “communication complexity” to agree on just 1 bit value

17

Page 18: Byzantine Fault Tolerance

18

Practical Byzantine Fault Tolerance

• Computer systems provide crucial services• Computer systems fail– Crash-stop failure– Crash-recovery failure– Byzantine failure

• Example: natural disaster, malicious attack, hardware failure, software bug, etc.

• Need highly available service Replicate to increase availability

Page 19: Byzantine Fault Tolerance

19

Challenges

Request A Request B

Client Client

Page 20: Byzantine Fault Tolerance

20

Requirements

• All replicas must handle same requestsdespite failure.

• Replicas must handle requests in identical order despite failure.

Page 21: Byzantine Fault Tolerance

21

Challenges

2: Request B

1: Request A

Client Client

Page 22: Byzantine Fault Tolerance

22

State Machine Replication

2: Request B

1: Request A

2: Request B

1: Request A

2: Request B

1: Request A

2: Request B

1: Request A

Client Client

How to assign sequence number to requests?

Page 23: Byzantine Fault Tolerance

23

Primary Backup Mechanism

Client Client

2: Request B

1: Request A

What if the primary is faulty?Agreeing on sequence number

Agreeing on changing the primary (view change)

View 0

Page 24: Byzantine Fault Tolerance

24

Normal Case Operation

• Three phase algorithm:– PRE-PREPARE picks order of requests– PREPARE ensures order within views– COMMIT ensures order across views

• Replicas remember messages in log• Messages are authenticated– .σk denotes a message sent by k

Page 25: Byzantine Fault Tolerance

25

Pre-prepare Phase

Primary: Replica 0

Replica 1

Replica 2

Replica 3

Request: m

PRE-PREPARE, v, n, mσ0

Fail

Page 26: Byzantine Fault Tolerance

26

Prepare PhaseRequest: m

PRE-PREPARE

Primary: Replica 0

Replica 1

Replica 2

Replica 3 Fail

Accepted PRE-PREPARE

Page 27: Byzantine Fault Tolerance

27

Prepare PhaseRequest: m

PRE-PREPARE

Primary: Replica 0

Replica 1

Replica 2

Replica 3 Fail

PREPARE, v, n, D(m), 1σ1

Accepted PRE-PREPARE

Page 28: Byzantine Fault Tolerance

28

Prepare PhaseRequest: m

PRE-PREPARE

Primary: Replica 0

Replica 1

Replica 2

Replica 3 Fail

PREPARE, v, n, D(m), 1σ1

Accepted PRE-PREPARE

Collect PRE-PREPARE + 2f matching PREPARE

Page 29: Byzantine Fault Tolerance

29

Commit PhaseRequest: m

PRE-PREPARE

Primary: Replica 0

Replica 1

Replica 2

Replica 3 Fail

PREPARE

COMMIT, v, n, D(m)σ2

Page 30: Byzantine Fault Tolerance

30

Commit Phase (2)Request: m

PRE-PREPARE

Primary: Replica 0

Replica 1

Replica 2

Replica 3 Fail

PREPARE COMMIT

Collect 2f+1 matching COMMIT: execute and reply

Page 31: Byzantine Fault Tolerance

31

View Change

• Provide liveness when primary fails– Timeouts trigger view changes– Select new primary (= view number mod 3f+1)

• Brief protocol– Replicas send VIEW-CHANGE message along with

the requests they prepared so far– New primary collects 2f+1 VIEW-CHANGE messages– Constructs information about committed requests

in previous views

Page 32: Byzantine Fault Tolerance

32

View Change Safety

• Goal: No two different committed request with same sequence number across views

Quorum for Committed Certificate (m, v, n)

At least one correct replica has Prepared Certificate (m, v, n)

View Change Quorum

Page 33: Byzantine Fault Tolerance

33

Related WorksFault Tolerance

Fail Stop Fault Tolerance

Paxos1989 (TR)

VS ReplicationPODC 1988

Byzantine Fault Tolerance

Byzantine Agreement

RampartTPDS 1995

SecureRingHICSS 1998

PBFT OSDI ‘99

BASETOCS ‘03

Byzantine Quorums

Malkhi-ReiterJDC 1998

PhalanxSRDS 1998

FleetToKDI ‘00

Q/USOSP ‘05

Hybrid Quorum

HQ Replication OSDI ‘06