
Lecture 10

Naming services for flat namespaces

EECE 411: Design of Distributed Software Applications

Logistics / reminders

Project: send Samer and me your group membership by the end of the week.

Quizzes: Q1 next time; Q2 on 11/16.

Implementation options: Flat namespace

Problem: given an essentially unstructured name, how can we design a scalable solution that associates names with addresses?

Possible designs:
  [last time] Simple solutions (broadcasting, forwarding pointers)
  Hash table-like approaches: consistent hashing, Distributed Hash Tables (DHTs)

Functionality to implement

Map: names → access points (addresses)

Similar to a hash table: manage a (huge) list of (name, address), i.e. (key, value), pairs

  put(key, value)
  lookup(key) → value

Key idea: partitioning. Allocate parts of the list to different nodes, as the sketch below illustrates.
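A minimal sketch of this interface and the partitioning idea, in Python (class and variable names are mine, not from the lecture): each key is hashed and lands in one of n per-node tables.

```python
import hashlib

class NaivePartitionedMap:
    """Toy put/lookup store: hash the key, pick a node with modulo-n."""

    def __init__(self, num_nodes):
        self.nodes = [dict() for _ in range(num_nodes)]  # one table per node

    def _node_for(self, key):
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def lookup(self, key):
        return self._node_for(key).get(key)

m = NaivePartitionedMap(num_nodes=4)
m.put("www.example.com", "192.0.2.17")
print(m.lookup("www.example.com"))   # -> 192.0.2.17
```

The catch: with modulo-n placement, changing the number of nodes remaps almost every key. Consistent hashing, introduced below, is the partitioning scheme that avoids this reshuffle.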

Why the put()/get() interface?

The API supports a wide range of applications because it imposes no structure or meaning on keys.

Key/value pairs are persistent and global: one can store keys inside other values (indirection) and thus build complex data structures, as sketched below.
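A quick illustration of the indirection point, assuming only a flat put/lookup interface (the keys and fields here are invented for the example): because values can contain keys, linked structures can be layered on the flat store.

```python
store = {}   # stand-in for the global key/value service

def put(key, value): store[key] = value
def lookup(key): return store.get(key)

put("track:42", {"title": "Some Track"})
put("album:7", {"title": "Some Album", "tracks": ["track:42"]})  # a key stored in a value

album = lookup("album:7")
first = lookup(album["tracks"][0])   # follow the indirection
print(first["title"])                # -> Some Track
```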

Why Might The Design Be Hard?

  Decentralized: no central authority
  Scalable: low network traffic overhead
  Efficient: find items quickly (low latency)
  Dynamic: nodes fail, new nodes join
  General-purpose: flexible naming

The Lookup Problem

[Figure: a publisher somewhere on the Internet performs Put(key="title", value=file data…) into an overlay of nodes N1–N6; a client issues Get(key="title"). Which node answers?]

• This lookup problem is at the heart of all these services

Motivation: Centralized Lookup (Napster)

[Figure: the publisher at N4 stores (key="title", value=file data…) locally and registers SetLoc("title", N4) with a central DB; the client resolves Lookup("title") against that DB, then fetches from N4.]

Simple, but O(N) state at the central server and a single point of failure.

Motivation: Flooded Queries (Gnutella)

[Figure: the publisher at N4 holds (key="title", value=file data…); the client's Lookup("title") is flooded hop by hop through N1–N9 until it reaches N4.]

Robust, but worst case O(N) messages per lookup.

Motivation: FreeDB, Routed DHT Queries (Chord, etc.)

[Figure: the publisher at N4 stores (key=H(audio data), value={artist, album title, track title}); the client's Lookup(H(audio data)) is routed through the overlay directly toward the responsible node.]

Hash table-like approaches: consistent hashing, Distributed Hash Tables

Partition Solution: Consistent hashing

Consistent hashing: the output range of a hash function is treated as a fixed circular space or "ring".

[Figure: a circular ID space (0 to 128, wrapping around) with node IDs N10, N32, N60, N80, N100 and key IDs K5, K11, K30, K33, K52, K99 placed on it.]

Partition Solution: Consistent hashing

Mapping keys to nodes. Advantages: incremental scalability, load balancing.

[Figure: the same ring; each key is stored at its successor node: K5, K10 at N10; K11, K30 at N32; K33, K40, K52 at N60; K65, K70 at N80; K99 at N100.]

Consistent hashing

How do store & lookup work?

[Figure: the same ring; the answer to "What node stores K5?" is found by walking the circular ID space: "Key 5 is at N10".]
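A minimal runnable sketch of the ring (my own Python rendering; SHA-1 is assumed as the hash and the node addresses are made up): node IDs and key IDs share one circular space, and successor(k) is a binary search in a sorted list.

```python
import bisect
import hashlib

def ring_hash(s, bits=32):
    """Hash a string onto the circular ID space [0, 2**bits)."""
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** bits)

class ConsistentHashRing:
    def __init__(self, node_addrs):
        # Nodes get an identity by hashing their address into the key space.
        self.ring = sorted((ring_hash(a), a) for a in node_addrs)

    def node_for(self, key):
        """successor(k): first node ID equal to or following the key ID."""
        kid = ring_hash(key)
        idx = bisect.bisect_left(self.ring, (kid,))
        return self.ring[idx % len(self.ring)][1]   # wrap past the top of the ring

ring = ConsistentHashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(ring.node_for("K5"))   # analogous to the slide's "Key 5 is at N10"
```

Adding or removing one address only moves the keys between that node and its ring neighbor, which is the incremental-scalability property claimed above.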

Additional trick: Virtual Nodes

Problem: how to do load balancing when nodes are heterogeneous?

Solution idea: each node owns an amount of ID space proportional to its 'power'.

Virtual nodes: each physical node hosts multiple (similar) virtual nodes, and all virtual nodes are treated the same.

Advantages: load balancing, incremental scalability, dealing with failures.
  Dealing with heterogeneity: the number of virtual nodes a physical node is responsible for can be decided based on its capacity, accounting for heterogeneity in the physical infrastructure.
  When a node joins (if it hosts many virtual nodes), it accepts a roughly equivalent amount of load from each of the other existing nodes.
  If a node becomes unavailable, the load it handled is evenly dispersed across the remaining available nodes.
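A sketch of the trick (the capacities are invented for the example): each physical node is hashed onto the ring several times, in proportion to its capacity, and the successor lookup is unchanged.

```python
import bisect
import hashlib

def ring_hash(s, bits=32):
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** bits)

# Made-up capacities: a node twice as powerful hosts twice the virtual nodes.
capacities = {"10.0.0.1": 1, "10.0.0.2": 2, "10.0.0.3": 4}

ring = sorted(
    (ring_hash(f"{addr}#vnode{i}"), addr)      # one ring point per virtual node
    for addr, weight in capacities.items()
    for i in range(8 * weight)                 # 8 virtual nodes per capacity unit
)

def node_for(key):
    idx = bisect.bisect_left(ring, (ring_hash(key),))
    return ring[idx % len(ring)][1]

print(node_for("K5"))   # keys land on physical nodes roughly in proportion to capacity
```

Because a physical node's virtual nodes are scattered around the ring, its failure spreads load over many survivors instead of dumping it all on one neighbor.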

Consistent Hashing – Summary so far

Mechanism:
  Nodes get an identity by hashing their IP address; keys are hashed into the same space.
  A key with (hashed) id k is assigned to the first node whose hashed id is equal to or follows k in the circular space: successor(k).

Advantages: incremental scalability, load balancing.

Theoretical results [N = number of nodes, K = number of keys in the system]:
  [With high probability] each node is responsible for at most (1+ε)K/N keys.
  [With high probability] the joining or leaving of a node relocates O(K/N) keys (and only to or from the newly responsible node).

BUT: Consistent hashing – problem

How large is the state maintained at each node? O(N), where N is the number of nodes: answering "What node stores K5?" with "Key 5 is at N10" in a single step requires every node to know the entire ring membership.

[Figure: the same ring and key placement as before.]

Basic Lookup (non-solution)

[Figure: nodes N5, N10, N20, N32, N40, N60, N80, N99, N110 on the ring; "Where is key 50?" is forwarded node by node until the answer "Key 50 is at N60" comes back.]

• Lookups find the ID's successor
• Correct if successors are correct
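A runnable toy version of this non-solution (node IDs taken from the figure; the Node class is mine): each node knows only its successor, so a query crawls the ring in O(N) hops.

```python
from dataclasses import dataclass

@dataclass
class Node:
    id: int
    successor: "Node" = None

def in_interval(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def linear_lookup(start, key_id):
    """Walk the ring one successor at a time: O(N) hops in the worst case."""
    node, hops = start, 0
    while not in_interval(key_id, node.id, node.successor.id):
        node, hops = node.successor, hops + 1
    return node.successor, hops

# Build the slide's ring: N5 ... N110, linked in a circle.
ids = [5, 10, 20, 32, 40, 60, 80, 99, 110]
nodes = [Node(i) for i in ids]
for a, b in zip(nodes, nodes[1:] + nodes[:1]):
    a.successor = b

owner, hops = linear_lookup(nodes[0], 50)
print(owner.id, hops)   # -> 60 4, i.e. "Key 50 is at N60" after four forwards
```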

Successor Lists Ensure Robust Lookup

• Each node remembers r successors
• Lookup can skip over dead nodes

[Figure: the ring with r = 3: N5 remembers (10, 20, 32), N10 (20, 32, 40), N20 (32, 40, 60), N32 (40, 60, 80), N40 (60, 80, 99), N60 (80, 99, 110), N80 (99, 110, 5), N99 (110, 5, 10), N110 (5, 10, 20).]
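A sketch of the fallback (names are mine; the alive flag stands in for a real network liveness probe): forwarding picks the first live entry in the successor list.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: int
    alive: bool = True
    successor_list: list = field(default_factory=list)   # the r nearest successors

def next_hop(node):
    """Forward to the first live successor, skipping dead nodes."""
    for succ in node.successor_list:
        if succ.alive:                    # stand-in for a network probe
            return succ
    raise RuntimeError("all r successors dead: the ring is broken at this node")

# r = 3, as in the figure: N5 remembers N10, N20, N32.
n32 = Node(32)
n20 = Node(20, alive=False)
n10 = Node(10, alive=False)
n5 = Node(5, successor_list=[n10, n20, n32])
print(next_hop(n5).id)                    # -> 32: the lookup skipped two dead nodes
```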

“Finger Table” Accelerates Lookups

[Figure: N80's fingers reach ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring, so each successive finger covers half the distance of the previous one.]
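A small sketch of how a finger table is filled in, using the figures' 128-slot ID space and node set (find_successor here is a centralized stand-in for the distributed lookup): finger i points at successor(n + 2^i), so each finger roughly doubles the distance covered.

```python
import bisect

M = 7                                        # ID space [0, 2**7) = [0, 128), as in the figures
ids = sorted([5, 10, 20, 32, 40, 60, 80, 99, 110])

def find_successor(k):
    """First node ID equal to or following k, wrapping around the ring."""
    return ids[bisect.bisect_left(ids, k) % len(ids)]

def build_fingers(n):
    """finger[i] = successor(n + 2**i) for i = 0..M-1."""
    return [find_successor((n + 2 ** i) % 2 ** M) for i in range(M)]

print(build_fingers(80))   # N80's fingers reach 1, 2, 4, ..., 64 slots ahead (up to half the ring)
```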

Lookups take O(log N) hops

[Figure: Lookup(K19), issued at some node, hops along fingers that halve the remaining distance each step until it reaches N20 = successor(K19).]
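A runnable sketch of the routing rule on the same toy ring (this simulates Chord's closest-preceding-finger step centrally, so it is an illustration of the hop count rather than the full protocol):

```python
import bisect

M = 7
ids = sorted([5, 10, 20, 32, 40, 60, 80, 99, 110])

def find_successor(k):
    return ids[bisect.bisect_left(ids, k) % len(ids)]

def succ_of(n):
    return find_successor((n + 1) % 2 ** M)      # the node right after n

fingers = {n: [find_successor((n + 2 ** i) % 2 ** M) for i in range(M)] for n in ids}

def in_open(x, a, b):                            # x in circular (a, b)
    return (a < x < b) if a < b else (x > a or x < b)

def in_half_open(x, a, b):                       # x in circular (a, b]
    return (a < x <= b) if a < b else (x > a or x <= b)

def chord_lookup(n, key):
    """Route from node n toward successor(key), halving the distance per hop."""
    hops = 0
    while not in_half_open(key, n, succ_of(n)):
        # jump to the farthest finger that still precedes the key
        nxt = next((f for f in reversed(fingers[n]) if in_open(f, n, key)),
                   succ_of(n))
        n, hops = nxt, hops + 1
    return succ_of(n), hops

print(chord_lookup(32, 19))   # -> (20, 3): K19 is at N20, reached in a few hops
```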

Summary of Performance Characteristics

  Efficient: O(log N) messages per lookup
  Scalable: O(log N) state per node
  Robust: survives massive membership changes

Joining the Ring: a three-step process

  1. Initialize all fingers of the new node
  2. Update fingers of existing nodes
  3. Transfer keys from the successor to the new node

Two invariants to maintain to ensure correctness:
  Each node's successor list is maintained
  successor(k) is responsible for key k

Join: Initialize New Node's Finger Table

  Locate any node p already in the ring
  Ask node p to look up the fingers of the new node

[Figure: joining node N36 performs step 1, Lookup(37, 38, 40, …, 100, 164), via an existing node to fill its finger table.]

Join: Update Fingers of Existing Nodes

  The new node calls an update function on existing nodes
  Existing nodes recursively update the fingers of other nodes

[Figure: after N36 joins, existing nodes adjust the fingers that should now point at N36.]

Join: Transfer Keys

Only keys in the new node's range are transferred.

[Figure: N36 joins; keys 21..36 (e.g. K30) are copied from N40 to N36, while the others (e.g. K38) stay at N40.]
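A small runnable rendering of this step (the store contents are invented; in_range is the usual circular interval test): the new node takes over exactly the keys in (predecessor, new_id] from its successor.

```python
def in_range(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def transfer_keys(successor_store, pred_id, new_id):
    moved = {k: v for k, v in successor_store.items() if in_range(k, pred_id, new_id)}
    for k in moved:
        del successor_store[k]            # the other keys stay at the successor
    return moved

n40_store = {30: "value-a", 38: "value-b"}
n36_store = transfer_keys(n40_store, pred_id=20, new_id=36)
print(n36_store, n40_store)               # K30 moves to N36; K38 stays at N40
```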

Handling Failures

Problem: failures could cause incorrect lookups.

[Figure: Lookup(90) reaches N80, whose successor N85 has failed; without extra state the lookup cannot proceed past the gap (ring shown: N10, N80, N85, N102, N113, N120).]

Solution (fallback): keep track of the successor's successor, i.e., keep a list of r successors.

Choosing Successor List Length

r = length of the successor list; N = number of nodes in the system.

Assume 50% of the nodes fail, independently:

  P(the successor list of a specific node is all dead) = (1/2)^r
    i.e., the probability that this node breaks the ring; this depends on the independent-failure assumption

  P(no node breaks the ring) = (1 - (1/2)^r)^N

Choosing r = 2 log2(N) makes this probability about 1 - 1/N, since (1/2)^r then equals 1/N^2 and (1 - 1/N^2)^N ≈ 1 - 1/N.
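A quick numerical check of this claim (plain Python; the node counts are arbitrary):

```python
import math

# With r = 2*log2(N) successors and half the nodes failed,
# the ring almost surely stays connected.
for N in (100, 1_000, 10_000):
    r = math.ceil(2 * math.log2(N))
    p_node_breaks = 0.5 ** r                       # all r successors dead
    p_ring_ok = (1 - p_node_breaks) ** N           # no node breaks the ring
    print(f"N={N:>6} r={r:>2} P(ring survives)={p_ring_ok:.4f}"
          f" (~ 1 - 1/N = {1 - 1/N:.4f})")
```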

DHT – Summary so far

Mechanism:
  Nodes get an identity by hashing their IP address; keys are hashed into the same space.
  A key with (hashed) id k is assigned to the first node whose hashed id is equal to or follows k in the circular space: successor(k).

Properties:
  Incremental scalability, good load balancing
  Efficient: O(log N) messages per lookup
  Scalable: O(log N) state per node
  Robust: survives massive membership changes

Some experimental results


Chord Lookup Cost Is O(log N)

[Figure: average messages per lookup vs. number of nodes; the curve grows logarithmically, with a constant of about 1/2, i.e., roughly (1/2) log2 N messages.]

Failure Experimental Setup

  Start 1,000 CFS/Chord servers; each successor list has 20 entries
  Wait until they stabilize
  Insert 1,000 key/value pairs, with five replicas of each
  Stop X% of the servers
  Immediately perform 1,000 lookups

DHash Replicates Blocks at r Successors

[Figure: a ring of ten nodes (N5, N10, N20, N40, N50, N60, N68, N80, N99, N110); Block 17 is stored at one node (N68 in the figure) and replicated at its r successors.]

• Replicas are easy to find if the successor fails
• Hashed node IDs ensure independent failure
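A sketch of the placement rule on the toy sorted-ID ring (node IDs from the figure; the block ID below is illustrative, and DHash's real block layout has more detail): a block lives at its successor and at the r-1 nodes after it.

```python
import bisect

ids = sorted([5, 10, 20, 40, 50, 60, 68, 80, 99, 110])

def replica_set(block_id, r=3):
    """The block's successor plus the next r-1 nodes along the ring."""
    start = bisect.bisect_left(ids, block_id)
    return [ids[(start + j) % len(ids)] for j in range(r)]

print(replica_set(61))   # -> [68, 80, 99]: if N68 fails, N80 already has a copy
```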

Massive Failures Have Little Impact

[Figure: failed lookups (percent, 0–1.4) vs. failed nodes (percent, 5–50); even with 50% of the nodes failed, only about 1.5% of lookups fail. (1/2)^6 is 1.6%.]

Applications


An Example Application: The CD Database

[Figure: a client computes a disc fingerprint and asks the service "Recognize fingerprint?"; the service returns the album & track titles.]

An Example Application: The CD Database

[Figure: if the service replies "No such fingerprint", the user types in the album and track titles, which are then submitted to the service.]

A DHT-Based FreeDB Cache

FreeDB is a volunteer service:
  It has suffered outages as long as 48 hours
  Service costs are borne largely by volunteer mirrors

Idea: build a cache of FreeDB with a DHT, adding to the availability of the main service.
Goal: explore how easy this is to do.

Cache Illustration

[Figure: disc-fingerprint lookups go to the DHT cache; on a hit the disc info comes straight from the DHT, and new albums are inserted into it.]

Trackerless BitTorrent

First, the tracker-based flow: a client wants to download a file.
  It contacts the tracker identified in the .torrent file (using HTTP).
  The tracker sends the client a (random) list of peers who have, or are downloading, the file.
  The client contacts peers on the list to see which segments of the file they have.
  The client requests segments from those peers.
  The client reports to the other peers it knows about that it now has a segment.
  Other peers start to contact the client to get that segment (while the client is fetching other segments).
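In the trackerless variant, a DHT stands in for the tracker: peers register under the torrent's infohash and look each other up there. A toy sketch, assuming the put/lookup interface from earlier (real trackerless BitTorrent uses a Kademlia-style DHT; the names below are mine):

```python
class ToyDHT:                                   # dict-backed stand-in for the routed DHT
    def __init__(self):
        self.data = {}
    def put(self, key, value):
        self.data[key] = value
    def lookup(self, key):
        return self.data.get(key)

def announce(dht, infohash, my_addr):
    """Register as a member of the torrent's swarm under its infohash."""
    peers = dht.lookup(infohash) or []
    dht.put(infohash, peers + [my_addr])

def find_peers(dht, infohash):
    return dht.lookup(infohash) or []           # a real client samples a subset

dht = ToyDHT()
announce(dht, "infohash-abc", "198.51.100.7:6881")
announce(dht, "infohash-abc", "203.0.113.9:6881")
print(find_peers(dht, "infohash-abc"))          # both peers found, no tracker needed
```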

Next

A distributed system is: a collection of independent computers that appears to its users as a single coherent system.

Components need to communicate and cooperate => support needed for:
  Naming: enables some resource sharing
  Synchronization