An Overlay Infrastructure for Decentralized Object Location and Routing


Page 1: An Overlay Infrastructure for Decentralized Object Location and Routing

An Overlay Infrastructure for Decentralized Object Location and Routing

Ben Y. Zhao, [email protected]

University of California at Berkeley, Computer Science Division

Page 2: An Overlay Infrastructure for Decentralized Object Location and Routing

Peer-based Distributed Computing

Cooperative approach to large-scale applications
peer-based: available resources scale w/ # of participants
better than client/server: limited resources & scalability

Large-scale, cooperative applications are coming
content distribution networks (e.g. FastForward)
large-scale backup / storage utilities
leverage peers' storage for higher resiliency / availability
cooperative web caching
application-level multicast
video on-demand, streaming movies

Page 3: An Overlay Infrastructure for Decentralized Object Location and Routing

What Are the Technical Challenges?

File system: replicate files for resiliency/performance
how do you find nearby replicas?
how does this scale to millions of users? billions of files?

Page 4: An Overlay Infrastructure for Decentralized Object Location and Routing

Node Membership Changes

Nodes join and leave the overlay, or fail
data or control state needs to know about available resources
node membership management is a necessity

Page 5: An Overlay Infrastructure for Decentralized Object Location and Routing

A Fickle Internet

Internet disconnections are not rare (UMichTR98, IMC02)
TCP retransmission is not enough; need to route around failures
IP route repair takes too long: IS-IS ~5s, BGP 3-15 mins
good end-to-end performance requires fast response to faults

Page 6: An Overlay Infrastructure for Decentralized Object Location and Routing

An Infrastructure Approach

First generation of large-scale apps: vertical approach
each app (FastForward, Yahoo IM, SETI) re-solves the same problems:
reliable communication, dynamic node membership algorithms, and
efficient, scalable data location

Hard problems, difficult to get right
instead, solve common challenges once
build a single overlay infrastructure at the application layer

[Diagram: the overlay sits at the application layer of the Internet
stack (physical, link, network, transport, session, presentation),
shared by the applications above it.]

Page 7: An Overlay Infrastructure for Decentralized Object Location and Routing

Personal Research Roadmap

[Diagram: research projects centered on Tapestry, a DOLR building on
PRR 97, with publication venues:]

Tapestry:
robust dynamic algorithms (SPAA 02 / TOCS)
structured overlay APIs (IPTPS 03)
resilient overlay routing (ICNP 03)
WAN deployment, 1500+ downloads (JSAC 04)
landmark routing, Brocade (IPTPS 02)

Applications:
multicast, Bayeux (NOSSDAV 02)
file system, OceanStore (ASPLOS 99 / FAST 03)
spam filtering, SpamWatch (Middleware 03)
rapid mobility, Warp (IPTPS 04)
service discovery

Other work:
XSet lightweight XML DB (MobiCom 99, 5000+ downloads)
TSpaces
modeling of non-stationary datasets

Page 8: An Overlay Infrastructure for Decentralized Object Location and Routing

Talk Outline

Motivation
Decentralized object location and routing
Resilient routing
Tapestry deployment performance
Wrap-up

Page 9: An Overlay Infrastructure for Decentralized Object Location and Routing

What should this infrastructure look like?

here is one appealing direction…

Page 10: An Overlay Infrastructure for Decentralized Object Location and Routing

Structured Peer-to-Peer Overlays

Node IDs and keys from a randomized namespace (SHA-1)
incremental routing towards destination ID
each node has a small set of outgoing routes, e.g. prefix routing
log(n) neighbors per node, log(n) hops between any node pair

[Figure: a message addressed to ABCD is forwarded through nodes
(e.g. A930, AB5F, ABC0, ABCE), each hop matching one more prefix
digit.]
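
As a concrete illustration of the routing mesh just described, here is a minimal prefix-routing sketch, assuming hexadecimal-digit IDs; RoutingTable and its methods are illustrative names, not Tapestry's actual API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RoutingTable {
    private final String localId;                    // e.g. "ABCE"
    // rTable.get(i) maps digit d to a neighbor that shares the first i
    // digits with localId and has digit d in position i
    private final List<Map<Character, String>> rTable = new ArrayList<>();

    RoutingTable(String localId) {
        this.localId = localId;
        for (int i = 0; i < localId.length(); i++) rTable.add(new HashMap<>());
    }

    void addNeighbor(int level, char digit, String neighborId) {
        rTable.get(level).put(digit, neighborId);
    }

    // Each hop resolves one more digit of the destination ID, giving
    // log(n) hops between any node pair.
    String nextHop(String dest) {
        int i = 0;
        while (i < localId.length() && localId.charAt(i) == dest.charAt(i)) i++;
        if (i == localId.length()) return localId;   // we are the destination
        return rTable.get(i).get(dest.charAt(i));    // neighbor fixing digit i
    }
}
```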

Page 11: An Overlay Infrastructure for Decentralized Object Location and Routing

Related Work

Unstructured peer-to-peer approaches
Napster, Gnutella, KaZaa
probabilistic search (optimized for the hay, not the needle)
locality-agnostic routing (resulting in high network b/w costs)

Structured peer-to-peer overlays
the first protocols (2001): Tapestry, Pastry, Chord, CAN
then: Kademlia, SkipNet, Viceroy, Symphony, Koorde, Ulysseus...
distinction: how to choose your neighbors
Tapestry, Pastry: latency-optimized routing mesh
distinction: application interface
distributed hash table: put(key, data); data = get(key)
Tapestry: decentralized object location and routing

Page 12: An Overlay Infrastructure for Decentralized Object Location and Routing

Defining the Requirements

1. efficient routing to nodes and data
   low routing stretch (ratio of overlay latency to shortest-path distance)

2. flexible data location
   applications want/need to control data placement
   allows for application-specific performance optimizations
   directory interface: publish(ObjID), RouteToObj(ObjID, msg)

3. resilient and responsive to faults
   more than just retransmission: route around failures
   reduce negative impact (loss/jitter) on the application
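
To make the interface contrast concrete, here is a minimal sketch of the two directory signatures named above next to the DHT interface from the previous slide; the Java types are illustrative, not Tapestry's actual API:

```java
// DOLR directory interface: the app controls replica placement and
// the overlay routes messages to some nearby replica.
interface Dolr {
    void publish(byte[] objId);                   // announce a local replica
    void routeToObj(byte[] objId, byte[] msg);    // deliver msg to a replica
}

// DHT interface: the overlay itself decides where data lives.
interface Dht {
    void put(byte[] key, byte[] data);
    byte[] get(byte[] key);
}
```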

Page 13: An Overlay Infrastructure for Decentralized Object Location and Routing

Decentralized Object Location & Routing

redirect data traffic using log(n) in-network redirection pointers
average # of pointers/machine: log(n) * avg files/machine
keys to performance: proximity-enabled routing mesh with routing convergence

[Figure: a replica server publishes key k; publish(k) leaves pointers
at each hop toward k's root across the backbone, and routeobj(k)
requests route toward the root until they hit a pointer and divert to
the nearby replica.]
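
A hedged sketch of the redirection-pointer mechanics this slide illustrates: publish(k) deposits a pointer at each hop from the replica toward k's root, and routeobj(k) walks toward the root until it crosses one. DolrNode, Router, and every method name are assumptions, not the Tapestry implementation:

```java
import java.util.HashMap;
import java.util.Map;

class DolrNode {
    // objId -> address of a nearby replica; log(n) pointers per publish path
    final Map<String, String> pointers = new HashMap<>();

    // called at each hop of a publish message
    void onPublish(String objId, String replicaAddr, Router router) {
        pointers.put(objId, replicaAddr);
        String next = router.nextHopToward(objId);
        if (next != null) router.forwardPublish(next, objId, replicaAddr);
        // next == null means we are the root; the pointer stays here
    }

    // called at each hop of a routeobj message
    void onRouteToObj(String objId, byte[] msg, Router router) {
        String replica = pointers.get(objId);
        if (replica != null) {              // found a pointer: redirect early
            router.deliver(replica, msg);
            return;
        }
        // no pointer yet: keep climbing toward the root, where a pointer
        // must exist for any published object
        router.forward(router.nextHopToward(objId), objId, msg);
    }
}

interface Router {
    String nextHopToward(String id);        // null if this node is the root
    void forwardPublish(String next, String objId, String replicaAddr);
    void forward(String next, String objId, byte[] msg);
    void deliver(String replicaAddr, byte[] msg);
}
```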

Page 14: An Overlay Infrastructure for Decentralized Object Location and Routing

Why Proximity Routing?

Fewer/shorter IP hops: shorter end-to-end latency, less
bandwidth/congestion, less likely to cross broken/lossy links

[Figure: the same overlay hops with and without proximity-aware
neighbor selection; proximity keeps each hop on nearby links.]

Page 15: An Overlay Infrastructure for Decentralized Object Location and Routing

Performance Impact (Proximity)

Simulated Tapestry w/ and w/o proximity on a 5000-node transit-stub
network; measured pair-wise routing stretch between 200 random nodes.

Mean routing stretch, prefix routing w/ and w/o proximity:

                 In-LAN   In-WAN   Far-WAN
Ideal (RDP=1)      1        1        1
Proximity          1.46     1.73     1.79
Randomized       108.31    15.81     4.46

Page 16: An Overlay Infrastructure for Decentralized Object Location and Routing

DOLR vs. Distributed Hash Table

DHT: hash of the content name determines replica placement;
modifications mean replicating the new version into the DHT

DOLR: the app places a copy near requests, and the overlay routes
messages to it

Page 17: An Overlay Infrastructure for Decentralized Object Location and Routing

Performance Impact (DOLR)

Simulated Tapestry w/ DOLR and DHT interfaces on a 5000-node
transit-stub network; measured route-to-object latency from clients in
2 stub networks. DHT: 5 object replicas; DOLR: 1 replica placed in
each stub network.

[Graph: average routing latency vs. overlay size (64, 256, 1024, 4096
nodes), plotting DHT min, DHT avg, DHT max, and DOLR.]

Page 18: An Overlay Infrastructure for Decentralized Object Location and Routing

Talk Outline

Motivation
Decentralized object location and routing
Resilient and responsive routing
Tapestry deployment performance
Wrap-up

Page 19: An Overlay Infrastructure for Decentralized Object Location and Routing

How do you get fast responses to faults?

Response time = fault-detection + alternate path discovery + time to switch

Page 20: An Overlay Infrastructure for Decentralized Object Location and Routing

Fast Response via Static Resiliency

Reducing fault-detection time
monitor paths to neighbors with periodic UDP probes
O(log(n)) neighbors: higher probe frequency w/ low bandwidth
exponentially weighted moving average for link quality estimation
avoid route flapping due to short-term loss artifacts
loss rate: L_n = (1 - α) · L_{n-1} + α · p

Eliminate synchronous backup path discovery
actively maintain redundant paths, redirect traffic immediately
repair redundancy asynchronously
create and store backups at node insertion
restore redundancy via random pair-wise queries after failures

End result: fast detection + precomputed paths = increased responsiveness
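
A minimal sketch of this loss filter, assuming one estimator per monitored overlay link fed by UDP probe outcomes; the class and method names are illustrative (the deployment slides use α = 0.2 and α = 0.4):

```java
// Exponentially weighted moving average over periodic probe results:
// L_n = (1 - alpha) * L_{n-1} + alpha * p
class LinkQualityEstimator {
    private final double alpha;    // filter constant
    private double lossRate = 0;   // L_n, the smoothed loss estimate

    LinkQualityEstimator(double alpha) { this.alpha = alpha; }

    // p = instantaneous loss: 1.0 if the probe was lost, 0.0 if answered
    void onProbeResult(boolean lost) {
        double p = lost ? 1.0 : 0.0;
        lossRate = (1 - alpha) * lossRate + alpha * p;
    }

    // smoothed estimate used by the routing policies on the next slide
    double lossRate() { return lossRate; }
}
```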

Page 21: An Overlay Infrastructure for Decentralized Object Location and Routing

Routing Policies

Use estimated overlay link quality to choose the shortest "usable"
link: the shortest overlay link with quality above a minimal
threshold T.

Alternative policies prioritize low loss over latency:
use the least lossy overlay link
use the path w/ minimal "cost function": cf = x · latency + y · loss rate
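
A sketch of both policies, assuming each routing entry carries a measured latency and the EWMA loss estimate from the previous slide; OverlayLink and the method names are illustrative:

```java
import java.util.List;

record OverlayLink(String node, double latencyMs, double lossRate) {}

class RoutePolicy {
    // Primary policy: shortest overlay link whose loss clears threshold T.
    static OverlayLink shortestUsable(List<OverlayLink> links, double maxLoss) {
        OverlayLink best = null;
        for (OverlayLink l : links)
            if (l.lossRate() <= maxLoss
                    && (best == null || l.latencyMs() < best.latencyMs()))
                best = l;
        return best;   // null => every link is below the quality threshold
    }

    // Alternative policy: minimize cf = x * latency + y * loss rate.
    static OverlayLink minCost(List<OverlayLink> links, double x, double y) {
        OverlayLink best = null;
        double bestCf = Double.MAX_VALUE;
        for (OverlayLink l : links) {
            double cf = x * l.latencyMs() + y * l.lossRate();
            if (cf < bestCf) { bestCf = cf; best = l; }
        }
        return best;
    }
}
```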

Page 22: An Overlay Infrastructure for Decentralized Object Location and Routing

Talk Outline

Motivation
Decentralized object location and routing
Resilient and responsive routing
Tapestry deployment performance
Wrap-up

Page 23: An Overlay Infrastructure for Decentralized Object Location and Routing

Tapestry, a DOLR Protocol

Routing based on incremental prefix matching
Latency-optimized routing mesh
nearest neighbor algorithm (HKRZ02)
supports massive failures and large group joins

Built-in redundant overlay links
2 backup links maintained w/ each primary

Use "objects" as endpoints for rendezvous
nodes publish names to announce their presence
e.g. wireless proxy publishes nearby laptop's ID
e.g. multicast listeners publish multicast session name to self-organize

Page 24: An Overlay Infrastructure for Decentralized Object Location and Routing

Weaving a Tapestry

Inserting node 0123 into an existing Tapestry network:
1. route to own ID, find 012X nodes, fill last column
2. request backpointers to 01XX nodes
3. measure distance, add to rTable
4. prune to nearest K nodes
5. repeat 2-4 (see the sketch below)

[Figure: new node ID = 0123 filling routing levels XXXX (1XXX, 2XXX,
3XXX), 0XXX (00XX, 02XX, 03XX), 01XX (010X, 011X, 013X), and 012X
(0120, 0121, 0122).]
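
A pseudocode-style Java rendering of steps 1-5; every helper is a hypothetical placeholder for the corresponding protocol message exchange, and K is an assumed constant:

```java
import java.util.Set;

abstract class NodeInsertion {
    static final int K = 3;   // nearest neighbors kept per routing level

    void insert(String newId) {
        // 1. route toward our own ID; nodes sharing the longest prefix
        //    (012X for 0123) fill the last routing-table column
        Set<String> level = routeToSurrogate(newId);
        for (int i = newId.length() - 1; i > 0; i--) {
            // 2. request backpointers at the next-shorter prefix (01XX)
            Set<String> candidates = requestBackpointers(level, i - 1);
            // 3. measure distance to each candidate, add to rTable
            for (String c : candidates)
                addToRoutingTable(i - 1, c, measureRtt(c));
            // 4. prune to the nearest K nodes, then 5. repeat steps 2-4
            level = nearestK(candidates, K);
        }
    }

    abstract Set<String> routeToSurrogate(String id);
    abstract Set<String> requestBackpointers(Set<String> nodes, int level);
    abstract double measureRtt(String node);
    abstract void addToRoutingTable(int level, String node, double rtt);
    abstract Set<String> nearestK(Set<String> nodes, int k);
}
```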

Page 25: An Overlay Infrastructure for Decentralized Object Location and Routing

Implementation Performance

Java implementation
35,000+ lines in core Tapestry, 1500+ downloads

Micro-benchmarks
per-msg overhead: ~50 μs, most latency from byte copying
performance scales w/ CPU speedup
5KB msgs on P-IV 2.4GHz: throughput ~10,000 msgs/sec

Routing stretch
route to node: < 2
route to objects/endpoints: < 3 (higher stretch for close-by objects)

Page 26: An Overlay Infrastructure for Decentralized Object Location and Routing

Responsiveness to Faults (PlanetLab)

Probing bandwidth grows slowly with network size N: N = 300, ~7 KB/s
per node; N = 10^6, ~20 KB/s. Simulation: if link failure rate < 10%,
can route around 90% of survivable failures.

[Graph: time to switch routes (ms) vs. link probe period (0-1200 ms)
for filter constants α = 0.2 and α = 0.4, with annotated switch times
of ~300 ms and ~660 ms.]

Page 27: An Overlay Infrastructure for Decentralized Object Location and Routing

Stability Under Membership Changes

Routing operations on a 40-node Tapestry cluster. Churn: nodes
join/leave every 10 seconds, average lifetime = 2 mins.

[Graph: success rate (%) and network size over 30 minutes, across
kill-nodes, large-group-join, and constant-churn phases.]

Page 28: An Overlay Infrastructure for Decentralized Object Location and Routing

Talk Outline

Motivation
Decentralized object location and routing
Resilient and responsive routing
Tapestry deployment performance
Wrap-up

Page 29: An Overlay Infrastructure for Decentralized Object Location and Routing

Lessons and Takeaways

Consider system constraints in algorithm design
limited by finite resources (e.g. file descriptors, bandwidth)
simplicity wins over small performance gains
easier adoption and faster time to implementation

Wide-area state management (e.g. routing state)
reactive algorithm for best-effort, fast response
proactive periodic maintenance for correctness

Naïve event programming model is too low-level
much code complexity from managing stack state
important for protocols with asynchronous control algorithms
need explicit thread support for callbacks / stack management

Page 30: An Overlay Infrastructure for Decentralized Object Location and Routing

Future Directions

Ongoing work to explore the p2p application space
resilient anonymous routing, attack resiliency

Intelligent overlay construction
router-level listeners allow application queries
efficient meshes, fault-independent backup links, failure notification

Deploying and measuring a lightweight peer-based application
focus on usability and low overhead
p2p incentives, security, deployment meet the real world

A holistic approach to overlay security and control
p2p good for self-organization, not for security/management
decouple administration from normal operation
explicit domains/hierarchy for configuration, analysis, control

Page 31: An Overlay Infrastructure for Decentralized Object Location and Routing

Thanks!

Questions, comments?

[email protected]

Page 32: An Overlay Infrastructure for Decentralized Object Location and Routing

Impact of Correlated Events

Web / application servers handle independent requests and maximize
individual throughput.

Correlated requests (A + B + C -> D) behave differently: e.g. online
continuous queries, sensor aggregation, p2p control layer, streaming
data mining.

[Figure: an event handler combining correlated events A, B, and C
arriving from the network into a single result D.]

Page 33: An Overlay Infrastructure for Decentralized Object Location and Routing

Some Details

Simple fault detection techniques
periodically probe overlay links to neighbors
exponentially weighted moving average for link quality estimation
avoid route flapping due to short-term loss artifacts
loss rate: L_n = (1 - α) · L_{n-1} + α · p
(p = instantaneous loss rate, α = filter constant)
other techniques are topics of open research

How do we get and repair the backup links?
each hop has a flexible routing constraint
e.g. in prefix routing, the 1st hop just requires 1 fixed digit
backups always available until the last hop to the destination
create and store backups at node insertion
restore redundancy via random pair-wise queries after failures
e.g. to replace a 123X neighbor, talk to local 12XX neighbors
(sketched below)
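
A small sketch of that last repair example, under the assumption that a node can query a peer's routing table remotely; the RemoteQuery interface and all names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

class BackupRepair {
    interface RemoteQuery {
        // ask a peer for its routing entries at (level, digit),
        // e.g. its 123X slot
        List<String> routingEntries(String peer, int level, char digit);
    }

    // To replace a failed 123X neighbor, gather candidates from our
    // surviving same-prefix (12XX) neighbors' 123X entries.
    static List<String> findReplacements(RemoteQuery rpc, int level,
                                         char digit,
                                         List<String> sameLevelNeighbors) {
        List<String> candidates = new ArrayList<>();
        for (String peer : sameLevelNeighbors)
            candidates.addAll(rpc.routingEntries(peer, level, digit));
        return candidates;   // caller measures distance, keeps the nearest
    }
}
```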

Page 34: An Overlay Infrastructure for Decentralized Object Location and Routing

Route Redundancy (Simulator)

Simulation of Tapestry with 2 backup paths per routing entry;
2 backups: low maintenance overhead, good resiliency.

[Graph: portion of all pairs reachable vs. proportion of IP links
broken (0-0.2), comparing instantaneous IP routing with
Tapestry / FRLS.]

Page 35: An Overlay Infrastructure for Decentralized Object Location and Routing

Another Perspective on Reachability

[Graph: proportion of all paths vs. proportion of IP links broken
(0-0.2), partitioned into: paths where no failure-free route remains;
paths where IP and FRLS both route successfully; paths that exist but
neither IP nor FRLS can locate; and paths FRLS finds where short-term
IP routing fails.]

Page 36: An Overlay Infrastructure for Decentralized Object Location and Routing

Single Node Software Architecture

[Diagram: one Tapestry node, built on the SEDA event-driven framework
over the Java Virtual Machine; the core router, dynamic Tapestry,
distance map, and Patchwork components sit beneath an application
programming interface serving the applications, all above the
network.]

Page 37: An Overlay Infrastructure for Decentralized Object Location and Routing

Related Work

Unstructured peer-to-peer applications
Napster, Gnutella, KaZaa
probabilistic search, difficult to scale, inefficient b/w

Structured peer-to-peer overlays
Chord, CAN, Pastry, Kademlia, SkipNet, Viceroy, Symphony, Koorde,
Coral, Ulysseus, ...
routing efficiency; application interface

Resilient routing
traffic redirection layers: Detour, Resilient Overlay Networks (RON),
Internet Indirection Infrastructure (I3)
our goals: scalability, in-network traffic redirection

Page 38: An Overlay Infrastructure for Decentralized Object Location and Routing

Node to Node Routing (PlanetLab)

Ratio of end-to-end routing latency to ping distance between nodes;
all node pairs measured, placed into 5 ms buckets.

[Graph: RDP (min, median, 90th percentile) vs. internode RTT ping time
(0-300 ms); annotated: median = 31.5, 90th percentile = 135.]

Page 39: An Overlay Infrastructure for Decentralized Object Location and Routing

Object Location (PlanetLab)

Ratio of end-to-end latency to client-object ping distance; local-area
stretch improved w/ additional location state.

[Graph: RDP (min, median, 90th percentile) vs. client-to-object RTT
ping time (1 ms buckets, 0-200 ms); annotated: 90th percentile = 158.]

Page 40: An Overlay Infrastructure for Decentralized Object Location and Routing

Micro-benchmark Results (LAN)

Per-msg overhead ~50 μs; latency dominated by byte copying.
Performance scales with CPU speedup. For 5KB messages, throughput =
~10,000 msgs/sec.

[Graphs: transmission time (s) and bandwidth (MB/s) vs. message size
(0.06-2048 KB), for P-III 1GHz and P-IV 2.4GHz (local and 100Mb
Ethernet), plus a P-III to P-IV speedup curve (~2.3x); the Ethernet
curve is capped at 100 Mb/s.]

Page 41: An Overlay Infrastructure for Decentralized Object Location and Routing

Traffic Tunneling

Legacy nodes A and B (A, B are IP addresses) attach to proxies that
register them in the structured peer-to-peer overlay:
put(hash(A), P'(A)) and put(hash(B), P'(B)).
To reach B, A's traffic is tunneled through its proxy, which looks up
get(hash(B)) to find B's proxy P'(B) and routes through the overlay.

Store the mapping from an end host's IP to its proxy's overlay ID.
Similar to the approach in Internet Indirection Infrastructure (I3).
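
A minimal sketch of that registration and lookup; the Dht interface mirrors the put/get interface from the earlier slide, SHA-1 matches the namespace slide, and all other names are assumptions:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Tunneling {
    interface Dht {
        void put(byte[] key, byte[] value);
        byte[] get(byte[] key);
    }

    // hash an end host's IP into the overlay's key namespace
    static byte[] hash(String ip) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-1")
                            .digest(ip.getBytes(StandardCharsets.UTF_8));
    }

    // proxy P'(B) announces that it fronts legacy node B
    static void register(Dht overlay, String legacyIp, byte[] proxyId)
            throws NoSuchAlgorithmException {
        overlay.put(hash(legacyIp), proxyId);    // put(hash(B), P'(B))
    }

    // sender-side proxy resolves B's proxy before tunneling traffic
    static byte[] resolve(Dht overlay, String legacyIp)
            throws NoSuchAlgorithmException {
        return overlay.get(hash(legacyIp));      // get(hash(B)) -> P'(B)
    }
}
```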

Page 42: An Overlay Infrastructure for Decentralized Object Location and Routing

Constrained Multicast

Used only when all paths are below the quality threshold:
send duplicate messages on multiple paths, leveraging route convergence.

Assign unique message IDs, mark duplicates, keep a moving window of
recently seen IDs, and recognize and drop duplicates (sketched below).

Limitations
assumes loss is not from congestion
ideal for local-area routing

[Figure: duplicate copies fanned out across converging overlay paths
toward the destination.]
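
A minimal sketch of the duplicate recognition above: a fixed-size moving window of recently seen message IDs; the class and method names are illustrative:

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

class DuplicateFilter {
    private final int window;                           // moving window size
    private final ArrayDeque<Long> order = new ArrayDeque<>();  // arrival order
    private final Set<Long> seen = new HashSet<>();     // fast membership test

    DuplicateFilter(int window) { this.window = window; }

    // returns true if this ID was already seen and the copy should drop
    boolean isDuplicate(long msgId) {
        if (seen.contains(msgId)) return true;
        seen.add(msgId);
        order.addLast(msgId);
        if (order.size() > window)          // slide the window forward,
            seen.remove(order.removeFirst()); // forgetting the oldest ID
        return false;
    }
}
```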

Page 43: An Overlay Infrastructure for Decentralized Object Location and Routing

Link Probing Bandwidth (PlanetLab)

Bandwidth increases logarithmically with overlay size; medium-sized
routing overlays incur low probing bandwidth.

[Graph: probing bandwidth per node (KB/s) vs. overlay size (1-1000,
log scale), for probe periods PR = 300 ms and PR = 600 ms.]