CPSC 668 Set 19: Asynchronous Solvability

CPSC 668 Distributed Algorithms and Systems

Fall 2006

Prof. Jennifer Welch

Problems Solvable in Failure-Prone Asynchronous Systems

• Although consensus is not solvable in failure-prone asynchronous systems (neither message passing nor read/write shared memory), there are some interesting problems that are solvable:
– set consensus and approximate agreement: weakenings of consensus
– renaming: "opposite" of consensus
– k-exclusion: fault-tolerant variant of mutex

Model Assumptions

• asynchronous

• shared memory with read/write registers

• at most f crash failures of procs.

• results can be translated to message passing if f < n/2 (cf. Chapter 10)

• may be a few asides into message passing

Set Consensus Motivation

• By judiciously weakening the definition of the consensus problem, we can overcome the asynchronous impossibility

• We've already seen a weakening of consensus:
– weaker termination condition for randomized algorithms

• How about weakening the agreement condition?

• One weakening is to allow more than one decision value:
– allow a set of decisions

Set Consensus Definition

Termination: Eventually, each nonfaulty processor decides.

k-Agreement: The number of different values decided on by nonfaulty processors is at most k.

Validity: Every nonfaulty processor decides on a value that is the input of some processor.

Set Consensus Algorithm

• Uses a shared atomic snapshot object X
– can be implemented with read/write registers

• update your segment of X with your input

• repeatedly scan X until there are at least n - f nonempty segments

• decide on minimum value appearing in any segment
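
The steps above can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions: the `Snapshot` class and `run_set_consensus` function are hypothetical helpers (not from the slides), a fixed schedule of proc ids stands in for asynchrony, and a proc that never appears in the schedule models a crashed proc.

```python
from typing import Dict, List, Optional


class Snapshot:
    """Toy stand-in for the atomic snapshot object X (hypothetical helper).

    The simulation is single-threaded, so update and scan are trivially atomic.
    """

    def __init__(self, n: int) -> None:
        self.segments: List[Optional[int]] = [None] * n  # None marks an empty segment

    def update(self, i: int, value: int) -> None:
        self.segments[i] = value

    def scan(self) -> List[Optional[int]]:
        return list(self.segments)


def run_set_consensus(inputs: List[int], f: int, schedule: List[int]) -> Dict[int, int]:
    """Replay a schedule of proc ids; each step is one update or one scan."""
    n = len(inputs)
    X = Snapshot(n)
    written = set()
    decisions: Dict[int, int] = {}
    for i in schedule:
        if i in decisions:
            continue  # decided procs take no more steps
        if i not in written:
            X.update(i, inputs[i])  # first step: publish own input
            written.add(i)
            continue
        seen = [v for v in X.scan() if v is not None]
        if len(seen) >= n - f:  # at least n - f nonempty segments
            decisions[i] = min(seen)
    return decisions


# p0, p1, p2 update and scan before p3 wakes up (n = 4, f = 1).
decisions = run_set_consensus([5, 3, 9, 1], f=1, schedule=[0, 1, 2, 0, 1, 2, 3, 3])
print(decisions)  # {0: 3, 1: 3, 2: 3, 3: 1}
```

In this schedule two distinct values are decided, which is within the bound of f + 1 = 2.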

Correctness of Set Consensus Algorithm

• Termination: at most f crashes, so at least n - f procs eventually update their segments.

• Validity: every decision is some proc's input

• Why does k-agreement hold?
– We'll show it does as long as k > f.
– Sanity check: When k = 1, we have standard consensus. As long as there are fewer than 1 failures (i.e., none), we can solve the problem.

k-Set Agreement Condition

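• Let S be the set of min values in the final scan of each nonfaulty proc; these are the nonfaulty decisions

• Suppose in contradiction |S| > f + 1.

• Let v be the largest value in S, the decision of pi.

• So pi's final scan misses at least f + 1 values (the smaller elements of S, each the input of a different proc) and thus sees fewer than n - f nonempty segments, contradicting the code.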

Set Consensus Lower Bound

Theorem: There is no algorithm for solving k-set consensus in the presence of f failures, if f ≥ k.

• Straightforward extensions of consensus impossibility result fail; even proving the existence of an initial bivalent configuration is quite involved.

• Original proof of set-consensus impossibility used concepts from algebraic topology

• Textbook's proof uses more elementary machinery, but still rather involved

Approximate Agreement Motivation

• An alternative way to weaken the agreement condition for consensus:

• Require that the decisions be close to each other, but not necessarily equal

• Seems appropriate for continuous-valued problems (as opposed to discrete)

Approximate Agreement Definition

Termination: Eventually, each nonfaulty processor decides.

ε-Agreement: All nonfaulty decisions are within ε of each other.

Validity: Every nonfaulty decision is within the range of the input values.

Approximate Agreement Algorithm

• Assume procs know the range from which input values are drawn:
– let D be the length of this range

• up to n - 1 procs can fail

• algorithm is structured as a series of "asynchronous rounds":
– exchange values via a snapshot object, one per round
– compute midpoint for next round

• continue until spread of values is within ε, which requires about log2(D/ε) rounds

Approximate Agreement Algorithm

Initially local variable v = pi's input
Initially local variable r = 1

• update pi's segment of ASO[r] to be v

• let scan be set of values obtained by scanning ASO[r]

• v := midpoint(scan)

• if r = log2(D/ε) + 1 then decide v and terminate

• else r++
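
A minimal sketch of the round structure above, replayed under a fixed schedule. The names `run_approx_agreement` and `midpoint` are hypothetical, the snapshot objects are modelled as plain lists in a single thread, and the round count applies a ceiling (with a floor of one round) so that it is a positive integer; that rounding is an assumption of this sketch, not something stated on the slide.

```python
import math
from typing import Dict, List, Optional


def midpoint(values: List[float]) -> float:
    return (min(values) + max(values)) / 2


def run_approx_agreement(
    inputs: List[float], D: float, eps: float, schedule: List[int]
) -> Dict[int, float]:
    """Replay a schedule of proc ids; each step is one update or one scan."""
    n = len(inputs)
    # Assumption: ceiling and a minimum of one round, so M is a positive integer.
    M = max(1, math.ceil(math.log2(D / eps)) + 1)
    # aso[r] plays the role of ASO[r] for r = 1..M; index 0 is unused.
    aso: List[List[Optional[float]]] = [[None] * n for _ in range(M + 1)]
    v = list(inputs)       # current local value of each proc
    r = [1] * n            # current round of each proc
    wrote = [False] * n    # has proc i already updated aso[r[i]]?
    decided: Dict[int, float] = {}
    for i in schedule:
        if i in decided:
            continue
        if not wrote[i]:
            aso[r[i]][i] = v[i]          # update own segment of ASO[r]
            wrote[i] = True
            continue
        scan = [x for x in aso[r[i]] if x is not None]
        v[i] = midpoint(scan)
        if r[i] == M:
            decided[i] = v[i]            # last round: decide
        else:
            r[i] += 1
            wrote[i] = False
    return decided


# Each proc runs all of its 2 * M = 8 steps before the next proc starts.
result = run_approx_agreement([0.0, 8.0, 4.0], D=8.0, eps=1.0,
                              schedule=[0] * 8 + [1] * 8 + [2] * 8)
print(result)  # {0: 0.0, 1: 0.5, 2: 0.5}
```

Even in this very unbalanced schedule the decisions differ by 0.5, which is within ε = 1, and all lie inside the input range.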

Analysis of Algorithm

Definitions w.r.t. a particular execution:

• M = log2(D/ε) + 1

• U_0 = set of input values

• U_r = set of all values ever written to ASO[r]

Helpful Lemma

Lemma (16.8): Consider any round r < M. Let u be the first value written to ASO[r]. Then the values written to ASO[r+1] are in this range:

min(U_r) ≤ (min(U_r) + u)/2 ≤ u ≤ (max(U_r) + u)/2 ≤ max(U_r)

elements of U_{r+1} are in the interval [(min(U_r) + u)/2, (max(U_r) + u)/2]

Implications of Lemma

• The range of values written to the ASO object for round r + 1 is contained within the range of values written to the ASO object for round r.
– range(U_{r+1}) ⊆ range(U_r)

• The spread (max - min) of values written to the ASO object for round r + 1 is at most half the spread of values written to the ASO object for round r.
– spread(U_{r+1}) ≤ spread(U_r)/2

Correctness of Algorithm

• Termination: Each proc executes M asynchronous rounds.

• Validity: The range at each round is contained in the range at the previous round.

• ε-Agreement: spread(U_M) ≤ spread(U_0)/2^M ≤ D/2^M
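
The final step of the ε-agreement bound can be spelled out as follows. This is a worked sketch that assumes only that M is at least log2(D/ε), which the definition of M guarantees.

```latex
\mathrm{spread}(U_M)
  \;\le\; \frac{\mathrm{spread}(U_0)}{2^{M}}
  \;\le\; \frac{D}{2^{M}}
  \;\le\; \frac{D}{2^{\log_2(D/\varepsilon)}}
  \;=\; \varepsilon
```

For instance, with the illustrative values D = 8 and ε = 1, M = log2(8) + 1 = 4 and D/2^M = 8/16 = 0.5 ≤ ε.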

Handling Unknown Input Range

• Range might not be known.

• Actual range in an execution might be much smaller than maximum possible range.

• First idea: have a preprocessing phase in which procs try to determine input range
– but asynchrony and possible failures make this approach problematic

Handling Unknown Input Range

• Use just one atomic snapshot object

• Dynamically recalculate how many rounds are needed as more inputs are revealed

• Skip over rounds to try to catch up to maximum observed round

• Only consider values associated with maximum observed round

• Still use midpoint

Unknown Input Range Algorithm

shared atomic snapshot object A; initially all segments are ⊥

update_i(A, [x,1,x]), where x is pi's input
repeat
    scan A
    let S be spread of all inputs in non-⊥ segments
    if S = 0 then maxRound := 0
    else maxRound := log2(S/ε)
    let rmax be largest round in non-⊥ segments
    let values be set of candidates in segments with round number rmax
    update pi's segment in A with [x,rmax+1,midpt(values)]
until rmax ≥ maxRound
decide midpoint(values)
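
The loop above can be sketched as a single-threaded simulation. The function name `run_unknown_range`, the tuple layout of a segment, and the use of a ceiling when computing maxRound are assumptions made for this illustration; a fixed schedule of proc ids stands in for asynchrony, and each scheduled step performs exactly one update or one scan.

```python
import math
from typing import Dict, List, Optional, Tuple

Segment = Tuple[float, int, float]  # (input, round number, candidate value)


def midpoint(values: List[float]) -> float:
    return (min(values) + max(values)) / 2


def run_unknown_range(inputs: List[float], eps: float, schedule: List[int]) -> Dict[int, float]:
    n = len(inputs)
    A: List[Optional[Segment]] = [None] * n      # None plays the role of ⊥
    started = set()
    pending: Dict[int, Tuple[int, int, float]] = {}  # scan results awaiting the update
    decided: Dict[int, float] = {}
    for i in schedule:
        if i in decided:
            continue
        x = inputs[i]
        if i not in started:
            A[i] = (x, 1, x)                     # initial update [x, 1, x]
            started.add(i)
        elif i not in pending:
            view = [seg for seg in A if seg is not None]   # scan A
            seen_inputs = [seg[0] for seg in view]
            S = max(seen_inputs) - min(seen_inputs)
            # Assumption: ceiling so that maxRound is an integer.
            max_round = 0 if S == 0 else math.ceil(math.log2(S / eps))
            rmax = max(seg[1] for seg in view)
            mid = midpoint([seg[2] for seg in view if seg[1] == rmax])
            pending[i] = (rmax, max_round, mid)
        else:
            rmax, max_round, mid = pending.pop(i)
            A[i] = (x, rmax + 1, mid)            # update [x, rmax + 1, midpt(values)]
            if rmax >= max_round:
                decided[i] = mid
    return decided


# p0 runs alone and decides before p1 wakes up with a very different input.
late = run_unknown_range([0.0, 8.0], eps=1.0, schedule=[0] * 3 + [1] * 5)
print(late)  # {0: 0.0, 1: 0.0}
```

In this schedule p1 wakes up after p0 has already reached round 2, so p1 ignores its own input and only uses the candidate from the largest observed round.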

Analysis of Unknown Input Range Algorithm

Definitions w.r.t. a particular execution:

• U_0 = set of all input values

• U_r = set of all values ever written to A with round number r

• M = largest r s.t. U_r is not empty

With these changes, correctness proof is similar to that for known input range algorithm.

Key Differences in Proof

• Why does termination hold?
– a proc's local maxRound variable can only increase if another proc wakes up and increases the spread of the observable inputs. This can happen at most n - 1 times.

• Why does ε-agreement hold?
– If pi's input is observed by pj the last time pj computes its maxRound, same argument as before.
– Otherwise, when pi wakes up, it ignores its own input and uses values from maxRound or later.

Renaming

• Procs start with unique names from a large domain

• Procs should pick new names that are still distinct but that are from a smaller domain

• Motivation: Suppose original names are serial numbers (many digits), but we'd like the procs to do some kind of time slicing based on their ids

Renaming Problem Definition

Termination: Eventually every nonfaulty proc pi decides on a new name yi

Uniqueness: If pi and pj are distinct nonfaulty procs, then yi ≠ yj.

We are interested in anonymous algorithms: procs don't have access to their indices, just to their original names. Code depends only on your original name.

Performance of Renaming Algorithm

• New names should be drawn from {1,2,…,M}.

• We would like M to be as small as possible.

• Uniqueness implies M must be at least n.

• Due to the possibility of failures, M will actually be larger than n.

Renaming Results

• Algorithm for wait-free case (f = n - 1) with M = n + f = 2n - 1.

• Algorithm for general f with M = n + f.

• Lower bound that M must be at least n + 1, for wait-free case.
– Proof similar to impossibility of wait-free consensus

• Stronger lower bound that M must be at least n + f, if f is the number of failures
– Proof uses algebraic topology and is related to lower bound for set consensus

Wait-Free Renaming Algorithm

Shared atomic snapshot object A; initially all segments are ⊥

s := 1 // suggestion for my new name
while true do
    update pi's segment of A to be [x,s], where x is pi's original name
    scan A
    if s is also someone else's suggestion then
        let r be rank of x among original names of non-⊥ segments
        let s be r-th smallest positive integer not currently suggested by another proc
    else decide on s for new name and terminate
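
A minimal sketch of the renaming loop above, again as a single-threaded replay of a fixed schedule. The helper names `next_suggestion` and `run_renaming` and the `(original name, suggestion)` tuple layout are assumptions for illustration; the code depends only on a proc's original name, with list indices used purely to address segments.

```python
from typing import Dict, List, Optional, Tuple

Segment = Tuple[int, int]  # (original name, current suggestion)


def next_suggestion(view: List[Optional[Segment]], me: int) -> Optional[int]:
    """Return None if my suggestion is conflict-free, else a new suggestion."""
    x, s = view[me]
    others = {seg[1] for j, seg in enumerate(view) if seg is not None and j != me}
    if s not in others:
        return None  # nobody else suggests s: decide on it
    names = sorted(seg[0] for seg in view if seg is not None)
    r = names.index(x) + 1  # rank of x among the original names seen
    # r-th smallest positive integer not suggested by another proc
    candidate = count = 0
    while count < r:
        candidate += 1
        if candidate not in others:
            count += 1
    return candidate


def run_renaming(orig_names: List[int], schedule: List[int]) -> Dict[int, int]:
    n = len(orig_names)
    A: List[Optional[Segment]] = [None] * n   # None plays the role of ⊥
    s = [1] * n                               # every proc first suggests 1
    need_update = [True] * n
    decided: Dict[int, int] = {}
    for i in schedule:
        if i in decided:
            continue
        if need_update[i]:
            A[i] = (orig_names[i], s[i])      # update own segment with [x, s]
            need_update[i] = False
            continue
        nxt = next_suggestion(list(A), i)     # scan A and compare suggestions
        if nxt is None:
            decided[i] = s[i]
        else:
            s[i] = nxt
            need_update[i] = True
    return decided


new_names = run_renaming([300, 100, 200], schedule=[0, 1, 2] * 4)
print(new_names)  # {0: 4, 1: 2, 2: 3}
```

In this round-robin run all three procs collide on the suggestion 1, re-suggest by rank, and end with distinct names no larger than 2n - 1 = 5.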

Analysis of Renaming Algorithm

Uniqueness: Suppose in contradiction pi and pj choose same new name, s.

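• Suppose w.l.o.g. that pi's last scan before deciding s precedes pj's last scan before deciding s.

• pi's last update before deciding, which suggests s, precedes pi's last scan and hence also pj's last scan.

• So pj's last scan sees s as pi's suggestion and pj doesn't decide s, a contradiction.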

Analysis of Renaming Algorithm

• New name space is {1,…,2n - 1}.

• Why?

• rank of a proc pi's original name is at most n (the largest one)

• worst case is when each of the n - 1 other procs has suggested a different new name for itself, say {1,…,n - 1}.

• Then pi suggests n + n - 1 = 2n - 1.
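
The worst case above can be checked by brute force for small n. This is a sketch with hypothetical helper names (`rth_free`, `worst_suggestion`); it enumerates every rank up to n and every set of at most n - 1 names that other procs might be suggesting, drawn from {1,…,2n - 1}.

```python
from itertools import combinations
from typing import Set


def rth_free(r: int, taken: Set[int]) -> int:
    """r-th smallest positive integer that is not in `taken`."""
    candidate = count = 0
    while count < r:
        candidate += 1
        if candidate not in taken:
            count += 1
    return candidate


def worst_suggestion(n: int) -> int:
    """Largest name any proc can suggest, over all ranks and all sets of
    at most n - 1 names suggested by the other procs."""
    pool = range(1, 2 * n)
    return max(
        rth_free(r, set(taken))
        for size in range(n)
        for taken in combinations(pool, size)
        for r in range(1, n + 1)
    )


print(rth_free(4, {1, 2, 3}))  # 7, i.e. 2n - 1 for n = 4
print(worst_suggestion(4))     # 7
```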

Analysis of Renaming Algorithm

Termination: Suppose in contradiction some set T of nonfaulty procs never decide in some execution.

• Consider the suffix α of the execution in which
– each proc in T has already done at least one update and
– only procs in T take steps (others have either already crashed or decided).

Analysis of Renaming Algorithm

• Let F be the set of new names that are free (not suggested at the beginning of α by any proc not in T) -- the trying procs need to choose new names from this set.

• Let z1, z2,… be the names in F in order.

• By the definition of α, no proc wakes up during α and reveals an additional original name, so all procs in T are working with the same set of original names during α.

• Let pi be the proc in T whose original name has smallest rank (among this set of original names). Let r be this rank.

Analysis of Renaming Algorithm

• Eventually procs other than pi stop suggesting zr as a new name:

– After α starts, every scan indicates a set of free names that is no larger than F.

– Every trying proc other than pi has a larger rank and thus continually suggests a new name for itself that is larger than zr, once it does the first scan in α.

Analysis of Renaming Algorithm

• Eventually pi does suggest zr as its new name:
– By choice of zr as r-th smallest free new name, and fact that eventually other trying procs stop suggesting z1 through zr, eventually pi sees zr as free name with r-th smallest rank.

• Contradicts assumption that pi is trying (i.e., stuck).

• So termination holds.

General Renaming

• Suppose we know that at most f procs will fail, where f is not necessarily n - 1.

• We can use the wait-free algorithm, but it is wasteful in the size of the new name space, 2n - 1, if f < n - 1.

• We can do better (if f < n - 1) with a slightly different algorithm:
– keep track in the snapshot object of whether you have decided
– an undecided proc suggests a new name only if its original name is among the f + 1 lowest names of procs that have not yet decided.
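
The extra rule for general f can be sketched as a predicate over one scan of the snapshot object. The function name `may_suggest` and the `(original name, suggestion, decided)` segment layout are assumptions for illustration; the slide only says that the decided flag is tracked in the snapshot object.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int, bool]  # (original name, suggestion, has decided?)


def may_suggest(my_name: int, view: List[Optional[Segment]], f: int) -> bool:
    """True if my_name is among the f + 1 lowest original names
    of procs that have not yet decided, according to this scan."""
    undecided = sorted(seg[0] for seg in view if seg is not None and not seg[2])
    return my_name in undecided[: f + 1]


view = [(300, 1, False), (100, 2, True), (200, 3, False), (400, 1, False)]
print(may_suggest(200, view, f=1))  # True
print(may_suggest(300, view, f=1))  # True
print(may_suggest(400, view, f=1))  # False
```

Here the proc named 100 has already decided, so with f = 1 only the two lowest undecided names, 200 and 300, are allowed to suggest.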

k-Exclusion Problem

• A fault-tolerant version of mutual exclusion.

• Processors can fail by crashing, even in the critical section (stay there forever).

• Allow up to k processors to be in the critical section simultaneously.

• If fewer than k processors fail, then any nonfaulty processor that wishes to enter the critical section eventually does so.

k-Exclusion Algorithm

cf. paper by Afek et al. [5].

k-Assignment Problem

• A specialization of k-Exclusion to include:

• Uniqueness: Each proc in the critical section has a variable called slot, which is an integer between 1 and m. If pi and pj are in the C.S. concurrently, then they have different slots.

• Models situation when there is a pool of identical resources, each of which must be used exclusively:
– k is number of procs that can be in the pool concurrently
– m is the number of resources
– To handle failures, m should be larger than k

k-Assignment Algorithm Schema

k-assignment entry section:
– k-exclusion entry section
– renaming using m = 2k-1 names

k-assignment exit section:
– k-exclusion exit section

k-Assignment Algorithm Schema

k-assignment entry section:
– k-exclusion entry section
– request-name for long-lived renaming using m = 2k-1 names

k-assignment exit section:
– k-exclusion exit section
– release-name for long-lived renaming using m = 2k-1 names