Distributed Hash Tables - KTH

9
KTH ROYAL INSTITUTE OF TECHNOLOGY Distributed Hash Tables Vladimir Vlassov Distributed Hash Tables Large scale data bases hundreds of servers High churn rate servers will come and go Benefits fault tolerant high performance self administrating ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES A key-value store Associative array to store key-value pairs, a data structure known as a hash table (array of buckets) that maps keys to values. Operations: put (key, object) – store a given object with a given key object: = get (key) – read a object given key. Design issues: Identify : how to uniquely identify an object Store: how to distribute objects among servers Route: how to find an object ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES Unique identifiers We need unique identifiers to identify objects, i.e. to find a bucket to get/put an object with a given key identifier = f(key, size_of_hash_table) How to select identifiers: use a key (a name) a cryptographic hash of the key a cryptographic hash of the object why hash? ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Transcript of Distributed Hash Tables - KTH

Page 1: Distributed Hash Tables - KTH

KTH ROYAL INSTITUTEOF TECHNOLOGY

Distributed Hash TablesVladimir Vlassov

Distributed Hash Tables• Large scale data bases

– hundreds of servers• High churn rate

– servers will come and go• Benefits

– fault tolerant– high performance– self administrating

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

A key-value storeAssociative array to store key-value pairs, a data structure known as a hash table (array of buckets) that maps keys to values.

Operations:put (key, object) – store a given object with a given keyobject: = get (key) – read a object given key.

Design issues:• Identify : how to uniquely identify an object• Store: how to distribute objects among servers• Route: how to find an object

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Unique identifiersWe need unique identifiers to identify objects, i.e. to find a bucket to get/put an object with a given key

identifier = f(key, size_of_hash_table)

How to select identifiers:• use a key (a name)• a cryptographic hash of the key• a cryptographic hash of the object

why hash?

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Page 2: Distributed Hash Tables - KTH

Key distribution – direct map

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Direct map of keys to identifiers (buckets) gives a non-uniform (uneven) distribution of keys among buckets

Key distribution – hashing keys

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

A cryptographic hash function gives a uniform (even) distribution of the keys among buckets

hash = hashfunc(key)identifier = hash % table_size

Add a server

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

at three-o’clock-in-the-morning do:

Random distribution

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Random distribution of key ranges among servers

How to find a server responsible for a given key?

Page 3: Distributed Hash Tables - KTH

• ID domain: 0,1,2,…, size-1• clockwise step along the ring

i = (i + 1)% size• responsibility: from your

predecessor to your number• when inserted: take over responsibility

Circular domain

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

blue:45

• responsibility: from your predecessor to your number

• when inserted: take over responsibility

Circular domain

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

blue:45

red:120

• responsibility: from your predecessor to your number

• when inserted: take over responsibility

Circular domain

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

blue:45

red:120

green:2900• responsibility: from your

predecessor to your number• when inserted: take over

responsibility• talk to the node in front of you

Circular domain

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

blue:45

red:120

yellow:250

green:2900

Page 4: Distributed Hash Tables - KTH

• predecessor• successor• how do we insert a new node• concurrently

Double linked circle

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70

q:120

0

q

p

:12q

r:82

s:97

s: - Who is your predecessor?q: - It’s p at 70.s: - Why don’t you point to me!

Stabilization

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70

q:120

0

q

p

:12q

s:97

Ask your successor: Who is your predecessor? Correct a wrong link if any

20

s: - Who is your predecessor?q: - It’s p at 70.s: - Why don’t you point to me!p: - Who is your predecessor?q: - It’s s at 97.p: - Hmmm, that’s a better successor.

Stabilization

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70

q:120q

p

:12q

s:97

Ask your successor: Who is your predecessor? Correct a wrong link if any

20

s: - Who is your predecessor?q: - It’s p at 70.s: - Why don’t you point to me!p: - Who is your predecessor?q: - It’s s at 97.p: - Hmmm, that’s a better successor.p: - Who is your predecessor?s: - I don’t have one.p: - Why don’t you point to me!

Stabilization

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70

q:120

p

:12q

s:97

Ask your successor: Who is your predecessor? Correct a wrong link if any

20

0

Let’s play a game!

Page 5: Distributed Hash Tables - KTH

StabilizationStabilization is run periodically: allow nodes to be inserted concurrently.

Inserted node will take over responsibility for part of a segment.

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

• monitor neighbors• safety pointer

Crashing nodes

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70r:82

s:97

q:120

97

0p

q

r

s

q:120q:120

• monitor neighbors• safety pointer• detect crash

Crashing nodes

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70

s:97

q:120

p

q

s

q:120q

• monitor neighbors• safety pointer• detect crash• update forward pointer

Crashing nodes

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70

s:97

q:120

p

q

s

q:120q

s:97s

Page 6: Distributed Hash Tables - KTH

• monitor neighbors• safety pointer• detect crash• update forward pointer• update safety pointer

Crashing nodes

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70

s:97

q:120

p

qq:120qq:12q

s:97

• monitor neighbors• safety pointer• detect crash• update forward pointer• update safety pointer• stabilize

Crashing nodes

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

p:70

s:97

q:120

p

qq:120qq:12q

s:97

0

Russian roulette

How many safety pointers do we need?

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Replication

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Where should we store a replica of our data?

Page 7: Distributed Hash Tables - KTH

Routing overlay• The problem of finding an object in our distributed table:

• nodes can join and crash• trade-off between routing overhead and update

overhead

In the worst case we can always forward a request to our successor.

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Leaf setAssume that each node holds a leaf set of its closest (±l )neighbors (a.k.a. a finder table).

We can jump l nodes in each routing step but we still have complexity of O(n).

Leaf set is updated in O(l).

The leaf set could be as small as only the immediate neighbors but is often chosen to be a handful.

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

• we’re looking for the responsible node of an object

• each router hop brings us closer to the responsible node

• the leaf set gives us the final destination

Improvement

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

1020

4050

70

85

112120

130145

350337

310

280

267

250

238

224

210 158195 170

get (222)

0

20

4

0

224

210

8

PastryA routing table, each row represents one level of routing.

• 32 rows• 16 entries per row• any node found in 32 hops• maximal number of nodes 1632 or 2128 (more than

enough)• search is O(lg(n)) where n is number of nodes

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Page 8: Distributed Hash Tables - KTH

• be lazy• detect failed nodes when used• route in alternative direction• ask neighbors of alternative node

The price of fast routing

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

1020

4050

70

85

112120

130145

350337

310

280

267

238

224

210 158195 170

0

20

4

0

get (222)00

1

4

• when inserting new node• attach to the network-wise closest

node• adopt the routing entries on the

way down

Network aware routing

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

1020

4050

70

85

112120

130145

350337

310

280

267

238

224

210 158195 170

0

20

4

0

get (230)

250

230

5

Structured• a well-defined structure• takes time to add or delete nodes• takes time to add objects• easy to find objects

Overlay networks

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Unstructured• a random structure• easy to add or delete nodes• easy to add objects• takes time to find objects

DHT usageLarge scale key-value store.

• fault tolerant system in high churn rate environment• high availability low maintenance

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Page 9: Distributed Hash Tables - KTH

• replaces the tracker by a DHT• clients connects as part in the DHT• DHT keeps track of peers that share

content

The Pirate Bay

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

• large scale key-value store• inspired by Amazon Dynamo• implemented in Erlang

Riak

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES

Summary DHT

• why hashing?• distribute storage in ring• replication• routing

ID2201 DISTRIBUTED SYSTEMS / DISTRIBUTED HASH TABLES