What is "hard" in distributed computing? R. Guerraoui (EPFL/MIT), joint work with C. Delporte and H. Fauconnier (Univ. of Paris)

Page 1:

What is "hard" in distributed computing?

R. Guerraoui EPFL/MIT

joint work with C. Delporte and H. Fauconnier (Univ. of Paris)

Page 6:

What is "hard" in distributed computing?

Page 7:

Problem A is harder than problem B if any solution to A can be used to solve B

B is said to be reducible to A

Page 8:

Roadmap

• Black-box reductions

• Grey-box reductions

Page 9:

Black-box reduction

[Figure: B on top of A]

Page 10:

Distributed system

[Figure: processes p1, p2, p3]

Page 11:

Compare&Swap

Register

Queue

Which one is harder?

Page 12:

In particular

• Register: read() and write(x)

• Queue: enq(x) and deq()

• Compare&Swap: c&s(x,y)
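
The three interfaces listed above can be written down as sequential specifications. The following is only a minimal sketch: the class names, the `cas` method name, the use of `None` as the initial value, and the behaviour of `deq()` on an empty queue are assumptions made for illustration, not definitions taken from the slides.

```python
from collections import deque

BOTTOM = None  # assumed stand-in for the initial "no value yet" state


class Register:
    """Sequential specification of a read/write register."""

    def __init__(self, initial=BOTTOM):
        self._value = initial

    def read(self):
        return self._value

    def write(self, x):
        self._value = x
        return "ok"


class Queue:
    """Sequential specification of a FIFO queue."""

    def __init__(self, items=()):
        self._items = deque(items)

    def enq(self, x):
        self._items.append(x)
        return "ok"

    def deq(self):
        # Returning BOTTOM on an empty queue is an assumption of this sketch.
        return self._items.popleft() if self._items else BOTTOM


class CompareAndSwap:
    """Sequential specification of compare&swap returning the old value."""

    def __init__(self, initial=BOTTOM):
        self._value = initial

    def cas(self, expected, new):
        # Install `new` only if the current value equals `expected`;
        # in both cases return the value held before the call.
        old = self._value
        if old == expected:
            self._value = new
        return old
```

Each method here runs alone; in a real shared object the same behaviour has to hold when operations of different processes overlap.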

Page 13:

Register execution

[Figure: timelines of p1, p2, p3 with operations write(a) -> ok, read() -> b, write(b) -> ok]

Page 14:

Queue execution

[Figure: timelines of p1, p2, p3 with operations enq(0) -> ok, deq() -> 0, enq(1) -> ok]

Page 15:

C&S execution

[Figure: timelines of p1, p2, p3 with operations c&s(⊥,1) -> ⊥, c&s(⊥,2) -> 1, c&s(⊥,3) -> 1]

Page 16:

Roadmap

• Black-box reductions: some established results

• Grey-box reductions

Page 17:

SRSW-Safe-Binary-Register

MRMW-Atomic-M-Register

All registers are equivalent (L86)

NB. They are clearly not equivalent if we consider (memory) complexity (L86, CDG06)

Page 18:

Compare&Swap

Register

Queue

Which one is harder?

Page 19:

The consensus benchmark

• One operation propose()

• All operations return the same value, and this has to be one of the values proposed
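
The two requirements above (every operation returns the same value, and that value was proposed) can be checked mechanically on a finished run. This is a hedged sketch; the function name and the list-based encoding of a run are hypothetical choices for illustration.

```python
def is_valid_consensus_run(proposed, returned):
    """Check one finished run.

    proposed[i] is the value passed to propose() by process i,
    returned[i] is the value that call returned.
    """
    # Agreement: all operations return the same value.
    agreement = len(set(returned)) <= 1
    # Validity: every returned value is one of the proposed values.
    validity = all(value in proposed for value in returned)
    return agreement and validity
```

For example, `is_valid_consensus_run([0, 1, 1], [0, 0, 0])` holds, while a run returning `[0, 1, 0]` violates agreement and a run returning `[2, 2, 2]` violates validity.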

Page 20:

Consensus execution

[Figure: timelines of p1, p2, p3 with operations prop(0) -> 0, prop(1) -> 0, prop(1) -> 0]

Page 21:

Consensus number (H91)

The consensus number of an object is the maximum number of processes among which the object implements consensus

Page 22:

[Figure: consensus hierarchy with levels (1), (2), (3), (∞). Register and Snapshot sit at level 1; Queue, Test&Set and Fetch&Add at level 2; Compare&Swap at level ∞.]

Page 23:

Consensus with registers?

[Figure: timelines of p1(0) and p2(1) with operations write(0) -> ok, write(1) -> ok, read() -> 0, read() -> 1]

Page 24:

Consensus with registers?

[Figure: timelines of P(0) and Q(1) with operations write(1) -> ok, read() -> 1, and a crash]

Page 25:

Queue execution

[Figure: timelines of p1, p2, p3 with operations enq(0) -> ok, deq() -> 0, enq(1) -> ok]

Page 26:

2-Consensus with queues

[Figure: p1 performs w(0) on R1, then deq() on Q -> winner, then Return(0). p2 performs w(1) on R2, then deq() on Q -> loser, then r() -> 0, then Return(0).]
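
The 2-consensus construction on this slide can be sketched as follows. This is a sequential simulation under stated assumptions: process ids are 0 and 1, the queue is pre-filled with the tokens "winner" and "loser", the loser reads the other process's register, and each `propose` call runs without interleaving. The class name is hypothetical.

```python
from collections import deque


class TwoProcessConsensus:
    """Consensus for process ids 0 and 1 from one queue and two registers."""

    def __init__(self):
        self.registers = [None, None]            # R1 and R2 on the slide
        self.queue = deque(["winner", "loser"])  # Q, pre-filled (assumption)

    def propose(self, pid, value):
        # Announce the proposal before competing on the queue.
        self.registers[pid] = value
        # In a real system this deq() must be atomic.
        token = self.queue.popleft()
        if token == "winner":
            return value
        # The winner wrote its register before dequeuing, so its value is there.
        return self.registers[1 - pid]
```

Whichever process dequeues first decides its own value and the other adopts it. A faithful test of the construction would interleave the individual steps of the two processes, which this sketch does not do.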

Page 27:

3-Consensus with queues?

[Figure: p1 performs w(0) on R1, then deq() on Q -> winner, then Return(0). p2 performs w(1) on R2, then deq() on Q -> loser. p3 performs w(0) on R1, then deq() on Q -> loser.]

Page 28:

C&S execution

[Figure: timelines of p1, p2, p3 with operations c&s(⊥,1) -> ⊥, c&s(⊥,2) -> 1, c&s(⊥,3) -> 1]

Page 29:

3-Consensus with c&s

[Figure: P1(1) performs c&s(⊥,1) -> ⊥ and Return(1). P2(2) performs c&s(⊥,2) -> 1 and Return(1). P3(3) performs c&s(⊥,3) -> 1 and Return(1).]
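
The compare&swap construction on this slide works for any number of processes, and can be sketched as below. The class name is hypothetical, `None` stands in for the initial value of the object, and proposals are assumed never to be `None`; the compare&swap step is modelled sequentially.

```python
BOTTOM = None  # assumed initial value of the compare&swap object


class CasConsensus:
    """Consensus among any number of processes from one compare&swap object."""

    def __init__(self):
        self._cell = BOTTOM

    def _cas(self, expected, new):
        # Atomic in a real system; returns the value held before the call.
        old = self._cell
        if old == expected:
            self._cell = new
        return old

    def propose(self, value):
        old = self._cas(BOTTOM, value)
        # The first caller sees BOTTOM and decides its own value;
        # every later caller sees, and adopts, the installed value.
        return value if old is BOTTOM else old
```

With proposals 1, 2 and 3 made in that order, all three calls return 1, matching the execution drawn on the slide.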

Page 30:

Consensus hierarchy

For any integer k, there is an object with consensus number k

An object with consensus number ∞ is said to be universal

Page 31:

Roadmap

• Black-box reductions: some established results

• Grey-box reductions: some new results

Page 32:

The traditional notion of black-box reduction classifies objects, assuming these objects were available

What if the objects are not available?

Page 33:

[Figure: consensus hierarchy with levels (1), (2), (3), (∞). Register and Snapshot sit at level 1; Queue, Test&Set and Fetch&Add at level 2; Compare&Swap at level ∞.]

Page 34:

Registers cannot implement consensus… in an asynchronous system (FLP85, LA87, DLS86, H91)

Page 35:

Consensus with registers?

[Figure: timelines of p1(0) and p2(1) with operations write(0) -> ok, write(1) -> ok, read() -> 0, read() -> 1]

Page 36:

Consensus with registers?

[Figure: timelines of p1(0) and p2(1) with operations write(1) -> ok, read() -> 1, and a crash]

Page 37:

Consensus with registers and a failure detector

[Figure: timelines of p1(0) and p2(1) with a crash, the output suspected(p1), and Return(1)]

Page 38:

Consensus with registers and a failure detector

[Figure: timelines of p1(0) and p2(1) with operations write(0) -> ok, read() -> 0, and Return(0) at both processes]

Page 39:

Consensus

Weakest failure detector (encapsulating timing assumptions)

Register

Page 40:

Failure detector

A distributed oracle that provides each process with information about the status (correct or failed) of other processes

A failure detector is implemented with timing assumptions

A failure detector A is harder than B if A can emulate B
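
The remark that a failure detector is implemented with timing assumptions can be illustrated with a heartbeat timeout. This is a hedged sketch, not the construction used in the talk: the class name, the `heartbeat` and `suspected` methods, and the fixed `timeout` are all hypothetical, and time is passed in explicitly so the example stays deterministic.

```python
class HeartbeatFailureDetector:
    """Suspects a process when no heartbeat arrived within `timeout`."""

    def __init__(self, processes, timeout):
        self.timeout = timeout
        # Time of the last heartbeat received from each monitored process.
        self.last_heard = {p: 0.0 for p in processes}

    def heartbeat(self, process, now):
        # Record that `process` was heard from at time `now`.
        self.last_heard[process] = now

    def suspected(self, now):
        # The timing assumption: a correct process is heard from
        # at least once every `timeout` time units.
        return {p for p, t in self.last_heard.items() if now - t > self.timeout}
```

If the timing assumption does not hold, a slow but correct process is wrongly suspected, which is why the accuracy of such an oracle depends on how synchronous the system is.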

Page 41:

Compare&Swap, Queue, Test&Set, Fetch&Add

Consensus

Weakest failure detectors

Register

Page 42:

Classic result (CHT92, LH94, GK04)

The weakest failure detector to implement consensus (among any number of processes) with registers is Ω

Page 43:

Failure detector

[Figure: timelines of p1, p2, p3 annotated with failure detector outputs (p1), (p2), (p3)]

Page 44:

[Figure: levels (1), (2), …, (N); objects shown: Compare&Swap, Register, Queue?, Test&Set?, Fetch&Add?, … ?]

Page 45:

Less classic result (DFG05)

The weakest failure detector to implement any object that can solve consensus among at least 2 processes is Ω

Page 46:

Compare&Swap

Queue, Test&Set

Fetch&Add

Consensus

All objects are equivalent

Page 47:

Step 1

Consider a pair of processes {p,q}:

Ω{p,q} outputs at each process of {p,q} a leader (which might not be in {p,q})

Ω{p,q} is the weakest failure detector to implement a consensus object shared by {p,q} (CHT92, LH94, GK04)

Page 48:

Failure detector Ω{p,q}

[Figure: processes p, q, r, s annotated with outputs (q), (r), (p), (r)]

Page 49:

Ω{p,q}

[Figure: processes p, q, r]

Page 50:

Step 2

* (Ω{p,q}, ∀{p,q}) = Ω

Page 51:

Emulating Ω with * (Ω{p,q}, ∀{p,q})

Processes periodically exchange their Ω{p,q} outputs and

(1) Build a digraph of leaders

(2) Extract the sub-digraph of accessible leaders

(3) Output a process in the sink of the super-digraph of strongly connected components
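
The three steps above can be sketched as a local computation at one process p. This is an illustrative sketch under assumptions: `leader_reports` is a hypothetical list of pairs (q, r) meaning "p knows that r is a leader output at q", the message exchange is left out, and picking the smallest identifier among sink processes is a tie-break chosen here, not taken from the slides.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from `start`, including itself."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def elect_leader(p, leader_reports):
    # (1) Build the digraph of leaders: edge q -> r for every report (q, r).
    graph = {}
    for q, r in leader_reports:
        graph.setdefault(q, set()).add(r)
        graph.setdefault(r, set())

    # (2) Keep only the processes accessible from p.
    nodes = reachable(graph, p)

    # (3) A node lies in a sink strongly connected component exactly when
    # every node it can reach can also reach it back.
    reach = {n: reachable(graph, n) for n in nodes}
    sink = [n for n in nodes if all(n in reach[m] for m in reach[n])]
    return min(sink)
```

With reports `[("p", "r"), ("q", "r"), ("r", "r"), ("s", "q")]`, process p keeps only p and r, the sink component is {r}, and r is output. The pairwise reachability test is quadratic; a linear-time strongly connected components algorithm would replace it in a serious implementation.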

Page 52:

* (Ω{p,q}, ∀{p,q})

[Figure: processes p, s, q, r]

Page 53:

Phase 1

Edge q -> r at p if p knows that r is the leader of q

[Figure: digraph over p, q, r]

The graph might contain faulty processes

Page 54:

Phase 2

q is removed from p's graph if q is not accessible from p

[Figure: digraph over p, q, r]

The graphs have only correct processes but might be different

Page 55:

Phase 3

p extracts the sink of its digraph of strongly connected components

All digraphs of strongly connected components eventually have the same sink (here we use the property of Ω{p,q})

Page 56:

Compare&Swap

Queue, Test&Set

Fetch&Add

Consensus

(Almost) All objects are equivalent

Page 57:

Reductions (black-box vs. grey-box)

If objects are given as black boxes, they are different

If we can extract from the objects the failure information needed to implement them, then they are all equivalent (and universal)

Page 59:

Grey-box reduction

Object A is harder than object B if the weakest failure detector to implement A implements B

Page 60:

Grey-box reduction

[Figure: B, A and FD(A)]

Page 61:

Conjecture (Neiger)

The weakest failure detector to boost the consensus number of an object from level k to k+1 is

Page 62:

Failure detector

[Figure: timelines of p1, p2, p3 annotated with failure detector outputs (p1), (p2), (p1,p3)]

Page 63:

(1-set) consensus

2-consensus 1-consensus

2-set consensus

Impossible

Page 64:

(1-set) consensus

2-consensus 1-consensus

2-set consensus

Same weakest failure detector

Conjecture

Conjecture
