Coterie availability in sites Flavio Junqueira and Keith Marzullo University of California, San...


Page 1: Coterie availability in sites Flavio Junqueira and Keith Marzullo University of California, San Diego DISC, Krakow, Poland, September 2005.

Coterie availability in sites

Flavio Junqueira and Keith Marzullo

University of California, San Diego

DISC, Krakow, Poland, September 2005


Multi-site systems

Emerging class of distributed systems
  Collection of sites across a WAN
  Multiple nodes in each site
  Share resources
    Data sets
    Computational power
  E.g. BIRN, Geon, TeraGrid, PlanetLab

Site failure
  All the nodes in a site simultaneously unavailable


Site availability: BIRN

10 sites experience at least one outage

One site under 97%


Improving availability

Better availability through replication

Coteries
  Set system of processes: a set of subsets of processes
  Each subset is called a quorum
  Quorums are minimal sets that pairwise intersect

Coteries are useful
  Distributed mutual exclusion
  Distributed registers
  Consensus through Paxos

Coterie availability in multi-site systems
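The two coterie properties named above (quorums are minimal and pairwise intersect) can be checked mechanically. The following is a minimal sketch; the function name `is_coterie` and the sample process names are illustrative and do not come from the slides.

```python
from itertools import combinations

def is_coterie(quorums):
    """Return True if the quorums pairwise intersect and none contains another."""
    qs = {frozenset(q) for q in quorums}
    # Intersection: every two quorums share at least one process.
    intersecting = all(a & b for a, b in combinations(qs, 2))
    # Minimality: no quorum is a proper subset of another quorum.
    minimal = all(not (a < b or b < a) for a, b in combinations(qs, 2))
    return bool(qs) and intersecting and minimal

majority_of_three = [{"a", "b"}, {"a", "c"}, {"b", "c"}]
disjoint = [{"a", "b"}, {"c"}]   # two quorums that do not intersect
nested = [{"a"}, {"a", "b"}]     # second quorum is not minimal

print(is_coterie(majority_of_three), is_coterie(disjoint), is_coterie(nested))
```

The majority system over three processes passes both checks; the other two sets each violate one property.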


Roadmap

System model
Availability metrics
  Previous deterministic metrics not necessarily good
  A new metric
Failure model
  Characterize failures using survivor sets
  Survivor sets: more expressive
Quorum construction
  Multi-site hierarchical construction
Practical issues
  Failure model in practice
  PlanetLab experiment
Conclusions


System model

Set $P$ of processes
  Pairwise connected by quasi-reliable asynchronous channels
  Process failure: crash
  Processes can recover

Set $B$ of sites
  Partition of the set of processes
  Site failure: simultaneous failure of all the processes in the site
  Process failures are not independent

Execution
  Sequence of steps of processes
  $\mathcal{E}$: set of all executions

In a step $s$
  $F(s) = \{p : (p \in P) \wedge (p \text{ is faulty in } s)\}$
  $NF(s) = P \setminus F(s)$
  Available process in $s$: $p \in P$ is available if $p \notin F(s)$


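Survivor sets

A set $S \subseteq P$ is a survivor set iff
  $\exists E \in \mathcal{E} : \exists s \in E : S = NF(s)$
  $\forall p \in S : \forall E \in \mathcal{E} : \forall s \in E : S \setminus \{p\} \neq NF(s)$

Example (figure): processes grouped into sites, with executions $\mathcal{E} = \{E_1, E_2, E_3, E_4\}$ ($E_1, E_2$: steps $s_1, s_2$; $E_3$: step $s_1$; $E_4$: step $s_1$), the sets $NF(s_i)$, and the resulting survivor sets.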
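The definition can be evaluated directly on a finite set of executions. This is a small sketch that assumes each execution is given as a list of steps and each step as the set $F(s)$ of faulty processes. The representation, the name `survivor_sets`, and the sample data are illustrative and are not the example from the slide's figure.

```python
def survivor_sets(processes, executions):
    """Survivor sets per the definition: S = NF(s) for some step s,
    and no S minus a single process equals NF(s') for any step s'."""
    P = frozenset(processes)
    # NF(s) = P \ F(s) for every step s of every execution E.
    nf = {P - frozenset(faulty) for execution in executions for faulty in execution}
    return {S for S in nf if all(S - {p} not in nf for p in S)}

# Hypothetical executions over P = {1, 2, 3}; each step lists its faulty processes.
executions = [[set(), {3}], [{1}], [{1, 2}]]
result = survivor_sets({1, 2, 3}, executions)
print(sorted(sorted(s) for s in result))  # prints [[1, 2], [3]]
```

Here $NF$ takes the values {1,2,3}, {1,2}, {2,3} and {3}. The sets {1,2,3} and {2,3} are dropped because removing one process from each yields another observed $NF(s)$.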


Availability metrics

Traditional deterministic metrics
  Undirected graph: nodes = processes, edges = communication links
  Node vulnerability: minimal number of nodes whose failure leaves no quorum available
  Edge vulnerability: minimal number of edges whose failure leaves no quorum available

Majority is optimal [Barbara and Garcia-Molina’86]
  Complete graphs


A counterexample

Figure: processes grouped into sites, with the survivor sets $S_p$.

Majority
  Quorum: 5 processes
  In some step, no quorum can be formed

Using $S_p$ as quorums
  In every step, at least one quorum can be formed

Majority is not optimal


Availability metrics


A new metric
  $A(\mathcal{Q})$, where $\mathcal{Q}$ is a coterie
  Number of survivor sets covered in $\mathcal{Q}$
  A survivor set $S$ is covered in $\mathcal{Q}$ if: $\exists Q \in \mathcal{Q} : Q \subseteq S$
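To make the metric concrete, the sketch below counts covered survivor sets for two candidate quorum systems. It assumes three sites of three processes each, and survivor sets in which one site is down and one process is down at each remaining site. The process names and the helper `availability` are illustrative.

```python
from itertools import combinations

# Hypothetical process names: three sites a, b, c with three processes each.
sites = {b: [f"{b}{i}" for i in (1, 2, 3)] for b in "abc"}
processes = [p for members in sites.values() for p in members]

# Assumed survivor sets: two of the three sites, two processes from each.
survivor_sets = {
    frozenset(x + y)
    for s1, s2 in combinations(sites, 2)
    for x in combinations(sites[s1], 2)
    for y in combinations(sites[s2], 2)
}

def availability(coterie, survivors):
    """A(Q): number of survivor sets S that contain at least one quorum Q."""
    return sum(1 for s in survivors if any(q <= s for q in coterie))

# Majority coterie: every set of 5 out of the 9 processes.
majority = {frozenset(q) for q in combinations(processes, 5)}

a_majority = availability(majority, survivor_sets)
a_survivors = availability(survivor_sets, survivor_sets)
print(a_majority, a_survivors)  # prints 0 27
```

Under these assumptions every survivor set has only four processes, so no five-process majority quorum fits inside one. Using the survivor sets themselves as quorums covers all 27.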


Failure model

Multi-site hierarchical model
  A set $F_s$ of subsets of $B$
    Subsets of simultaneously faulty sites
  An array $F_p$
    One entry per site
    Each entry: subsets of processes in the site
    Subsets of simultaneously faulty processes at a site
  A survivor set $S$: $\exists F_S \in F_s$ such that
    $\forall B_i \notin F_S : \exists F_P \in F_p[i] : (B_i \setminus F_P) \subseteq S$
    $\forall B_i \in F_S : B_i \cap S = \emptyset$

Example (figure): processes ($P$) numbered 1, 2, 3 in each of the sites ($B$) $B_1$, $B_2$, $B_3$
  $F_s = \{\{B_1\}, \{B_2\}, \{B_3\}\}$
  $F_p[1]$, $F_p[2]$, $F_p[3]$: each contains the singleton sets of its site's processes, one per process $i \in \{1, 2, 3\}$
  $S_p$: for each pair of sites, the sets of two processes $i \neq j$ from one site and two processes $k \neq l$ from the other, with $i, j, k, l \in \{1, 2, 3\}$
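The model can be turned into a generator of survivor sets. The sketch below reads the definition as follows: pick one set of faulty sites from $F_s$, then for every other site pick one set of faulty processes from $F_p[i]$ and keep the rest of that site. The function `msh_survivor_sets`, the dictionary layout, and the process names are illustrative assumptions.

```python
from itertools import product

def msh_survivor_sets(sites, Fs, Fp):
    """sites: site -> processes; Fs: sets of faulty sites;
    Fp: site -> sets of simultaneously faulty processes at that site."""
    result = set()
    for failed_sites in Fs:
        up = [b for b in sites if b not in failed_sites]
        # Choose one faulty subset F_P for each site that is not faulty.
        for choice in product(*(Fp[b] for b in up)):
            alive = (frozenset(sites[b]) - frozenset(fp) for b, fp in zip(up, choice))
            result.add(frozenset().union(*alive))
    return result

# Hypothetical instance matching the example: three sites, single-site and
# single-process failures.
sites = {"B1": {"a1", "a2", "a3"}, "B2": {"b1", "b2", "b3"}, "B3": {"c1", "c2", "c3"}}
Fs = [{"B1"}, {"B2"}, {"B3"}]
Fp = {b: [{p} for p in members] for b, members in sites.items()}

Sp = msh_survivor_sets(sites, Fs, Fp)
print(len(Sp), {len(s) for s in Sp})  # prints 27 {4}
```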


Quorum construction

Optimal availability with respect to $A$
  Coterie $\mathcal{Q}$: $S_p = \mathcal{Q}$ or $\mathcal{Q}$ dominates $S_p$
  Survivor sets in $S_p$ pairwise intersect
  If not, then optimally discarding survivor sets is NP-complete

A special case: Qsite
  All subsets of $B$ of size $f_s$ in $F_s$
  All subsets of size $t$ of $B_i$ in $F_p[i]$, for every $i$

E.g.: $f_s = 1$, $t = 1$ (figure: Site 1, Site 2, Site 3 and the resulting quorums)
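A short check of why the $f_s = 1$, $t = 1$ example yields pairwise intersecting quorums. It assumes (a reading of the special case, not a statement from the slide) that each Qsite quorum takes $|B| - f_s$ sites and $n_i - t$ processes from each chosen site $B_i$, where $n_i = |B_i|$ is notation introduced here. The two conditions are sufficient, not necessarily required.

```latex
% Two quorums each use |B| - f_s sites, so they share a site whenever
\[ 2\,(|B| - f_s) > |B| \iff |B| > 2 f_s \]
% Within a shared site B_i, each quorum holds n_i - t processes,
% so they share a process whenever
\[ 2\,(n_i - t) > n_i \iff n_i > 2t \]
% Example: |B| = 3, n_i = 3, f_s = t = 1 gives 3 > 2 in both cases.
```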


Model in practice

Qsite
  $f_s$: threshold on site failures
    Data on site availability
  $t$: threshold on process failures
    Markov chains
      One Markov chain for each site
      Transitions
        Failure transitions: same probability, homogeneous processes
        Repair transitions: variable probability, amount of resources used

Figure labels: failure transitions, repair transitions.
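The slides do not give the chain's states or probabilities, so the following is only a sketch under stated assumptions. It models one site as a discrete-time birth-death chain whose state is the number of failed processes. Every failure transition has the same probability `fail`, and the repair probability `repair[k]` depends on the state. The threshold $t$ is then taken as the smallest number of failures that the site stays within with a target probability. All names and numbers are hypothetical.

```python
def stationary(n, fail, repair):
    """Stationary distribution over states 0..n (number of failed processes).
    Assumes a birth-death chain: k -> k+1 with probability `fail`,
    k -> k-1 with probability repair[k]; repair[0] is unused."""
    weights = [1.0]
    for k in range(n):
        # Detailed balance: pi[k] * fail == pi[k+1] * repair[k+1].
        weights.append(weights[-1] * fail / repair[k + 1])
    total = sum(weights)
    return [w / total for w in weights]

def threshold(pi, target):
    """Smallest t such that P(failed processes <= t) >= target."""
    cumulative = 0.0
    for k, p in enumerate(pi):
        cumulative += p
        if cumulative >= target:
            return k
    return len(pi) - 1

# Hypothetical site with 3 processes; probabilities are made up for illustration.
pi = stationary(3, fail=0.01, repair=[0.0, 0.5, 0.5, 0.5])
print(threshold(pi, 0.999))  # prints 1
```

With these made-up values the site has at most one failed process about 99.96% of the time, which is why the sketch selects $t = 1$.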


PlanetLab experiment

Toy application
  Paxos: quorums of acceptors
  Client accessing quorums

Hosts used
  Three sites: three hosts from each site
  One UCSD host: proposer, learner

Three settings
  3Sites: one acceptor per site
    Quorum: two hosts
  3SitesMaj: all hosts
    Quorum: four hosts, majority from each of two sites
  SimpleMaj: all hosts
    Quorum: any five processes

Figure: hosts at UC Davis, UT Austin, Duke, and UC San Diego.

SimpleMaj has worse availability
3SitesMaj has better availability
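The three settings can be compared analytically under a simple failure model. The sketch below is not the PlanetLab measurement. It assumes each site fails independently with probability `q`, each host fails independently with probability `p`, and a host is reachable only if both it and its site are up. It then enumerates all states exactly. The probabilities and function names are hypothetical.

```python
from itertools import product

SITES, HOSTS = 3, 3

def quorum_available(setting, alive):
    """alive[i][j] is True if host j at site i is reachable."""
    per_site = [sum(row) for row in alive]
    if setting == "3Sites":        # one acceptor per site (host 0), quorum of two
        return sum(row[0] for row in alive) >= 2
    if setting == "3SitesMaj":     # majority (2 of 3) from each of two sites
        return sum(1 for n in per_site if n >= 2) >= 2
    if setting == "SimpleMaj":     # any five of the nine hosts
        return sum(per_site) >= 5
    raise ValueError(setting)

def availability(setting, q, p):
    """Probability that some quorum is reachable, by exact enumeration."""
    total = 0.0
    for site_up in product([True, False], repeat=SITES):
        for host_up in product([True, False], repeat=SITES * HOSTS):
            prob, alive = 1.0, []
            for i in range(SITES):
                prob *= (1 - q) if site_up[i] else q
                row = []
                for j in range(HOSTS):
                    up = host_up[i * HOSTS + j]
                    prob *= (1 - p) if up else p
                    row.append(site_up[i] and up)
                alive.append(row)
            if quorum_available(setting, alive):
                total += prob
    return total

for name in ("3Sites", "3SitesMaj", "SimpleMaj"):
    print(name, availability(name, q=0.05, p=0.02))
```

Neither 3SitesMaj nor SimpleMaj always wins. With one site down and one host down at each other site, only 3SitesMaj has a quorum. With three, one, and one hosts alive across the sites, only SimpleMaj does. Which setting is more available therefore depends on how likely whole-site failures are relative to single-host failures.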


The Bimodal model

Sites are survivor sets
  $S_p$ is not a coterie

“Throw out” survivor sets
  In general, optimal solution is NP-complete
  Simple solution for this model

Practical issues
  Practical for two sites
  More than two sites: open problem

Figure: grid of states $ij$ for $i, j \in \{0, 1, \ldots, t, \ldots, n\}$ (00, 01, ..., 0t, ..., 0n through n0, n1, ..., nt, ..., nn).


Conclusions

Coteries for multi-site systems
  Site failures: process failures not independent
A new metric
  Counts covered survivor sets
Multi-site hierarchical construction
  Practical
  Illustrated with Markov model
  Experiment shows better availability
Using majority quorums is not a good idea
  Not optimal
  Poor performance
Future work
  More experiments, more constructions, real deployment


END


Backup Slides


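Failure models

The multi-site hierarchical model
  A set $F_s$ of subsets of $B$
  An array $F_p$
    One entry per site
    Each entry: subsets of processes in the site
  A survivor set $S$: $\exists F_S \in F_s$ such that
    $\forall B_i \notin F_S : \exists F_P \in F_p[i] : (B_i \setminus F_P) \subseteq S$
    $\forall B_i \in F_S : B_i \cap S = \emptyset$

The bimodal model
  A set $F_s$ of subsets of $B$
    There is one site that is in no element of $F_s$
  An array $F_p$
  A survivor set $S$
    As in the previous model, or
    $\exists B_i \in B : S = B_i$

Example (figure): processes numbered 1, 2, 3 in each of the sites $B_1$, $B_2$
  $F_s =$
  $F_p[1]$, $F_p[2]$: each contains the singleton sets of its site's processes, one per process $i \in \{1, 2, 3\}$
  MSH: $S_p$ is the family of sets of two processes $i \neq j$ from $B_1$ and two processes $k \neq l$ from $B_2$, with $i, j, k, l \in \{1, 2, 3\}$
  B: $S_p$ is the same family $\cup\ B$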


Bimodal construction

Bimodal model
  By construction: not all pairs of survivor sets intersect

Discard survivor sets until the remaining ones intersect
  Selecting optimally is NP-complete

Solution: remove $|B| - 1$ survivor sets
  Survivor sets containing processes from multiple sites pairwise intersect
  Construction is also optimal with respect to metric $A$

A special case: Bsite
  All elements of $F_s$ have size $f_s$
  All elements of $F_p[i]$ have the same size $t$, for every $i$

E.g.: $f_s = 1$, $t = 1$ (figure: sites $B_1$, $B_2$ and the resulting quorums)
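The discarding step can be illustrated on a small instance. The sketch assumes two sites of three processes, multi-site survivor sets that lose one process at each site, and every site also being a survivor set on its own. This is one reading of the bimodal model, with hypothetical process names, and is not the exact instance shown in the figure.

```python
from itertools import combinations

# Hypothetical two-site system with three processes per site.
sites = {"B1": ["a1", "a2", "a3"], "B2": ["b1", "b2", "b3"]}

# Survivor sets spanning both sites: one process down at each site.
multi_site = {
    frozenset(x + y)
    for x in combinations(sites["B1"], 2)
    for y in combinations(sites["B2"], 2)
}
# Bimodal model: each site is also a survivor set by itself.
site_sets = {frozenset(members) for members in sites.values()}

def pairwise_intersect(sets):
    return all(a & b for a, b in combinations(sets, 2))

before = pairwise_intersect(multi_site | site_sets)

# Discard |B| - 1 site survivor sets, keeping one site (B1, chosen arbitrarily).
kept = multi_site | {frozenset(sites["B1"])}
after = pairwise_intersect(kept)
print(before, after)  # prints False True
```

Before discarding, the two single-site survivor sets are disjoint. After dropping one of them, every remaining pair shares a process, so the remaining sets can serve as quorums.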


Site availability

Goals
  Show that sites are unavailable frequently enough

BIRN - Biomedical Informatics Research Network
  Test bed projects centered around brain imaging
  Currently: 19 universities, 26 research groups

Availability
  Monthly basis
  Pings (BIRN-CC)
  Storage broker logs

Site availability Jan/04-Aug/04
  Availability under 100%
    On average in 5 out of the 8 months

$\text{Availability} = \frac{\text{Total hours} - \text{Unplanned outages}}{\text{Total hours}} \times 100$
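A one-line worked instance of the availability formula, using made-up numbers (a 31-day month with 24 hours of unplanned outage) and not BIRN data:

```python
def availability_percent(total_hours, unplanned_outage_hours):
    """Monthly availability in percent, per the formula above."""
    return (total_hours - unplanned_outage_hours) / total_hours * 100

# Hypothetical month: 31 days (744 hours) with 24 hours of unplanned outage.
print(round(availability_percent(31 * 24, 24), 2))  # prints 96.77
```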


Causes of site failures

Misconfigured software
Shared resources
  1. Storage
  2. Power circuits
  3. Cooling pipes
  4. Air conditioning
  5. Network