CS 294-8 Distributed Data Structures http://www.cs.berkeley.edu/~yelick/294


CS294, Yelick DataStructs, p1

CS 294-8
Distributed Data Structures
http://www.cs.berkeley.edu/~yelick/294


Agenda
• Overview
• Interface Issues
• Implementation Techniques
• Fault Tolerance
• Performance


Overview
• Distributed data structures are an obvious abstraction for distributed systems. Right?
• What do you want to hide within one?
  – Data layout?
  – When communication is required?
  – # and location of replicas
  – Load balancing


Distributed Data Structures
• Most of these are containers
• Two fundamentally different kinds:
  – Those with iterators or the ability to look at all container elements
    • Arrays, meshes, databases*, graphs*, and trees* (sometimes)
  – Those with only single-element ops
    • Queue, directory (hash table or tree), all *’d items above


DDS in Ninja
• Described in Gribble, Brewer, Hellerstein, Culler
• A distributed data structure (DDS) is a self-managing layer for persistent data.
  – High availability, concurrency, consistency, durability, fault tolerance, scalability
• A distributed hash table is an example
  – Uses two-phase commits for consistency
  – Partitioning for scalability


Scheduling Structures
• In serial code, most scheduling is done with a stack (often implicit), a FIFO queue, or a priority queue
• Do all of these make sense in a distributed setting?
• Are there others?


Distributed Queues
• Load balancing (work stealing…)
  – Push new work onto a stack
  – Execute locally by popping from the stack
  – Steal remotely by removing from the bottom of the stack (FIFO)
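The push/pop/steal discipline above can be sketched in a few lines. This is a minimal single-process illustration, not a distributed implementation: the class name `WorkStealingDeque` and its methods are hypothetical, and a real version would need synchronization between the owner and remote thieves.

```python
from collections import deque
from typing import Any, Optional


class WorkStealingDeque:
    """Per-worker task deque: the owner works LIFO, thieves steal FIFO."""

    def __init__(self) -> None:
        self._tasks: deque = deque()

    def push(self, task: Any) -> None:
        # The owner pushes new work onto the top of its stack.
        self._tasks.append(task)

    def pop(self) -> Optional[Any]:
        # The owner executes the most recently pushed task (stack order).
        return self._tasks.pop() if self._tasks else None

    def steal(self) -> Optional[Any]:
        # A remote worker removes the oldest task from the bottom.
        return self._tasks.popleft() if self._tasks else None


local = WorkStealingDeque()
for task in ["t1", "t2", "t3"]:
    local.push(task)

stolen = local.steal()    # taken from the bottom: the oldest task
executed = local.pop()    # taken from the top: the newest task
remaining = local.pop()
```

Stealing from the bottom means a thief takes the oldest work while the owner keeps the work it touched most recently.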


Interfaces (1)
• Blocking atomic interfaces: operations happen between invocation and return
  – Internally each operation performs locking or other form of synchronization
• Non-blocking “atomic” interfaces: operation happens sometime after invocation
  – Often paired with completion synchronization
    • Request/response for each operation
    • Wait for all “my” operations to complete
    • Wait for all operations in the world to complete
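One way to picture a non-blocking interface with the three kinds of completion synchronization is with futures. The sketch below is an assumption-laden illustration in a single process: `AsyncTable`, `wait_all`, and `close` are hypothetical names, and a thread pool stands in for remote nodes.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Dict, List, Optional


class AsyncTable:
    """Hypothetical table whose put() returns before the update is applied."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}
        self._lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._issued: List[Future] = []

    def _apply(self, key: str, value: int) -> bool:
        with self._lock:  # each individual operation is atomic internally
            self._data[key] = value
        return True

    def put(self, key: str, value: int) -> Future:
        # Non-blocking: the update happens sometime after this call returns.
        future = self._pool.submit(self._apply, key, value)
        with self._lock:
            self._issued.append(future)
        return future

    def get(self, key: str) -> Optional[int]:
        with self._lock:
            return self._data.get(key)

    def wait_all(self) -> None:
        # "All operations in the world": everything issued by any caller so far.
        with self._lock:
            issued = list(self._issued)
        wait(issued)

    def close(self) -> None:
        self._pool.shutdown(wait=True)


table = AsyncTable()
mine = [table.put(f"k{i}", i) for i in range(5)]
first_ok = mine[0].result()   # request/response for a single operation
wait(mine)                    # wait for all "my" operations
table.wait_all()              # wait for every operation issued on the table
values = [table.get(f"k{i}") for i in range(5)]
table.close()
```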


Interfaces (2)
• Non-atomic interface: use external synchronization
  – Undefined under certain kinds of (or all) concurrency
  – May be paired with bracketing synchronization
    • Acquire-insert-lock, insert, insert, Release-insert-lock
    • Begin-transaction…
• Operations with no semantics (no-ops)
  – Prefetch, Flush copies, …
• Operations that allow for failures
  – Signal “failed”
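The acquire, insert, insert, release bracket can be sketched with an ordinary lock. This is only an illustration with threads in one process: `UnsyncList` and `insert_lock` are hypothetical names, and the container itself deliberately does no locking.

```python
import threading
from typing import List


class UnsyncList:
    """Hypothetical non-atomic container: insert() does no locking itself."""

    def __init__(self) -> None:
        self.items: List[int] = []
        self.insert_lock = threading.Lock()  # external lock that callers must hold

    def insert(self, value: int) -> None:
        # Behavior under concurrency is only defined if the caller holds insert_lock.
        self.items.append(value)


def worker(shared: UnsyncList, start: int) -> None:
    with shared.insert_lock:       # Acquire-insert-lock
        shared.insert(start)       # insert
        shared.insert(start + 1)   # insert
    # Release-insert-lock happens on leaving the with-block.


shared = UnsyncList()
threads = [threading.Thread(target=worker, args=(shared, 10 * i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the lock brackets both inserts, each worker's pair of values ends up adjacent in the list, whatever order the workers run in.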


DDS Interfaces
• Contrast:
  – RDBMSs provide ACID semantics on transactions
  – Distributed file systems: NFS weak, Frangipani and AFS stronger
• DDS:
  – All operations on elements are atomic (indivisible, all or nothing)
    • This seems to mean that the hash table operations that involve a single element are atomic
  – One-copy equivalence: replication of elements is invisible
  – No transaction across elements or operations


Implementation Strategies (1)
• Two simple techniques
  – Partitioning:
    • Used when the data structure is large
    • Used when writes/updates are frequent
  – Replication:
    • Used when writes are infrequent and reads are very frequent
    • Used to tolerate failures
    • Full static replication is extreme; dynamic partial replication is more common
• Many hybrids and variations


Implementation Strategies (2)
• Moving data to computation good for:
  – dynamic load balancing
    • I.e., idle processors grab work
  – smaller objects in ops involving > 1 object
• Moving computation to data good for:
  – large data structures
• Other?


DDS: Distributed Hash Table
• Operations include:
  – Create, Destroy
  – Put, Get, and Remove
• Built with storage “bricks”
  – Each manages a single-node, network-visible hash table
  – Each contains a buffer cache, lock manager, network stubs and skeletons
• Data is partitioned, and partitions are replicated
  – Replica groups are used for each partition


DDS: Distributed Hash Table
• Operations on elements:
  – Get – use any replica in appropriate group
  – Put or remove – update all replicas in group using two-phase commit
    • DDS library is commit coordinator
    • If an individual node crashes during the commit phase, it is removed from the replica group
    • If the DDS library fails during the commit phase, individual nodes will coordinate: if any have committed, all must
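The put path above can be sketched as a coordinator driving two phases over a replica group. This is a simplified in-memory model under stated assumptions: `Brick`, `ReplicaGroup`, and the `crash_on_commit` flag are hypothetical, aborting when a replica fails to prepare is an assumption, and recovery from a coordinator failure is not modeled.

```python
from typing import Dict, List, Optional


class Brick:
    """Hypothetical single-node store taking part in two-phase commit."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.crash_on_commit = False  # test hook: simulate a crash in phase 2
        self.data: Dict[str, str] = {}
        self.pending: Dict[str, str] = {}

    def prepare(self, key: str, value: str) -> bool:
        if not self.alive:
            return False
        self.pending[key] = value  # stage the update without making it visible
        return True

    def commit(self, key: str) -> bool:
        if self.crash_on_commit:
            self.alive = False
        if not self.alive:
            return False
        self.data[key] = self.pending.pop(key)
        return True

    def abort(self, key: str) -> None:
        self.pending.pop(key, None)


class ReplicaGroup:
    """The coordinator role, played here by the caller's library code."""

    def __init__(self, bricks: List[Brick]) -> None:
        self.bricks = list(bricks)

    def put(self, key: str, value: str) -> bool:
        # Phase 1: every replica must agree to the update.
        if not all(b.prepare(key, value) for b in self.bricks):
            for b in self.bricks:
                b.abort(key)
            return False
        # Phase 2: commit; a replica that crashes here is dropped from the group.
        self.bricks = [b for b in self.bricks if b.commit(key)]
        return True

    def get(self, key: str) -> Optional[str]:
        # Any live replica in the group can serve a read.
        for b in self.bricks:
            if b.alive:
                return b.data.get(key)
        return None


a, b, c = Brick("dds1"), Brick("dds2"), Brick("dds3")
group = ReplicaGroup([a, b, c])
ok1 = group.put("k", "v1")        # all three replicas commit
c.crash_on_commit = True
ok2 = group.put("k", "v2")        # dds3 crashes in phase 2 and is removed
members = [x.name for x in group.bricks]
b.alive = False
ok3 = group.put("k", "v3")        # dds2 cannot prepare, so the put aborts
```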


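DDS: Hash Table
[Figure: the key 110011 is matched bit by bit against the DP map (a trie with 0/1 edge labels) to select a replica group (RG) name; the RG map then gives that group's members.]

RG map:
RG name   RG members
000       dds1, dds2
100       dds2
10        dds5, dds4
01        dds7
011       dds5, dds3
111       dds2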
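The two-step lookup in the figure (key to RG name through the DP map, then RG name to members through the RG map) can be sketched as follows. The sketch assumes the trie is walked starting from the key's least significant bit; that reading is an assumption, chosen because it makes the six RG names in the table a complete, non-overlapping partition of the key space. The function names `build_dp_map` and `lookup` are hypothetical.

```python
from typing import Dict, List, Tuple

# RG map copied from the slide's table.
RG_MAP: Dict[str, List[str]] = {
    "000": ["dds1", "dds2"],
    "100": ["dds2"],
    "10": ["dds5", "dds4"],
    "01": ["dds7"],
    "011": ["dds5", "dds3"],
    "111": ["dds2"],
}


def build_dp_map(rg_names) -> dict:
    """Build a trie keyed by bits read from least significant to most."""
    root: dict = {}
    for name in rg_names:
        node = root
        for bit in reversed(name):
            node = node.setdefault(bit, {})
        node["rg"] = name  # leaf: records the replica group name
    return root


def lookup(key_bits: str, dp_map: dict, rg_map: Dict[str, List[str]]) -> Tuple[str, List[str]]:
    """Walk the trie with the key's low-order bits until a leaf is reached."""
    node = dp_map
    for bit in reversed(key_bits):
        if "rg" in node:
            break
        node = node[bit]
    name = node["rg"]
    return name, rg_map[name]


DP_MAP = build_dp_map(RG_MAP)
rg_name, members = lookup("110011", DP_MAP, RG_MAP)
```

Under that assumption, the slide's key 110011 ends in 011 and so resolves to replica group 011, whose members are dds5 and dds3.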


Example: Aleph Directory
• Maps names to mobile objects
  – Files, locks (?), processes,…
• Interested in performance at scale, not reliability
• Two basic protocols:
  – Home: each object has a fixed “home” PE that keeps track of cache copies
  – Arrow: based on path-reversal idea


Path Reversal
[Figure: Find]


Path Reversal
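Since the figures did not survive, here is a rough sequential sketch of the path-reversal idea as it is commonly described for arrow-style directories. It is an assumption, not taken from the slides: each node keeps one arrow pointing toward the current end of the request path (or to itself if it is the end), and a find flips every arrow it crosses so that they point back toward the requester. Concurrency and message passing are not modeled, and the name `find` and the four-node chain are hypothetical.

```python
from typing import Dict, Tuple


def find(link: Dict[str, str], requester: str) -> Tuple[str, int]:
    """Follow arrows from requester to the current owner, reversing them.

    Returns the previous owner and the number of hops the request took.
    """
    if link[requester] == requester:
        return requester, 0          # requester already holds the object
    prev, node = requester, link[requester]
    link[requester] = requester      # requester becomes the new end of the path
    hops = 1
    while True:
        nxt = link[node]
        link[node] = prev            # flip the arrow back toward the requester
        if nxt == node:
            return node, hops        # node was the old end of the path
        prev, node = node, nxt
        hops += 1


# A chain a - b - c - d in which d currently holds the object.
link = {"a": "b", "b": "c", "c": "d", "d": "d"}
owner1, hops1 = find(link, "a")
after_first = dict(link)
owner2, hops2 = find(link, "c")
after_second = dict(link)
```

After each find, every arrow in the chain leads to the most recent requester, so the next request is routed without consulting any fixed home node.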


Aleph Directory Performance
• Aleph is implemented as Java packages on top of RMI (and UDP?)
• Run on small systems (up to 16 nodes)
  – Assumed that “home” centralized solution would be faster at this scale
    • 2 messages to request; 2 to retrieve
  – Arrow was actually faster
    • log2 p to request; 1 to retrieve
    • In practice, only 2 to request (counter ex.)


Hybrid Directory Protocol
• Essentially the same as the “home” protocol, except
• Link waiting processors into a chain (across the processors)
  – Each keeps the id of the processor ahead of it in the chain
• Under high contention, resource moves down the chain
• Performance:
  – Faster than home and arrow on counter benchmark and some others…


How Many Data Structures?
• Gribble et al. claim:
  – “We believe that given a small set of DDS types (such as a hash table, a tree, and an administrative log), authors will be able to build a large class of interesting and sophisticated servers.”
  – Do you believe this?
  – What does it imply about tools vs. libraries?


Administrivia
• Gautam Kar and Joe L. Hellerstein speaking Thursday
  – Papers online
  – Contact me about meeting with them
• Final projects:
  – Send mail to schedule meeting with me
• Next week:
  – Tuesday: guest lecture by Aaron Brown on benchmarks; related to Kar and Hellerstein work.
  – Still to come: Gray, Lamport, and Liskov