PIC: Practical Internet Coordinates for Distance Estimation Manuel Costa joint work with Miguel Castro, Ant Rowstron, Peter Key Microsoft Research Cambridge

Page 1:

PIC: Practical Internet Coordinates for Distance Estimation

Manuel Costa

joint work with

Miguel Castro, Ant Rowstron, Peter Key

Microsoft Research Cambridge

Page 2:

Why estimate distances?

Page 3:

Why estimate distances?

• Distance estimation can be used to optimize large-scale distributed systems:
  – Server selection
  – Locality-aware peer-to-peer overlay networks
  – Application-level multicast

• Problems with on-demand measurement:
  – Slow
  – High overhead

Page 4:

PIC

• Maps the Internet into a geometric space

• Allows very low cost distance estimation

• Fully decentralized

• Tolerates malicious nodes

Page 5:

Outline

• Estimating distances with coordinates

• Securing the coordinate computation process

• Application to peer-to-peer overlays

• Conclusion

Page 6:

Internet as a geometric space

• Map each node to a position in the geometric space

• Compute distances based on coordinates

• Any node can compute the distance between any other two nodes

• Proposed by GNP (Global Network Positioning)

[Figure: nodes placed at (x1,y1), (x2,y2), and (x3,y3) in a two-dimensional space with axes x and y]
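
The idea on this slide can be sketched in a few lines of Python. This is only an illustration, assuming a Euclidean space; the function name `estimated_distance` and the example coordinates are made up here and are not from the talk.

```python
import math

def estimated_distance(a, b):
    """Estimate the network distance between two nodes from their coordinates."""
    return math.dist(a, b)  # Euclidean distance (Python 3.8+)

# Hypothetical coordinates for three nodes.
coords = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 8.0)}

# Node A can estimate the distance between B and C without contacting either.
b_to_c = estimated_distance(coords["B"], coords["C"])
```

No measurement is needed at estimation time, which is what makes the estimate cheap.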

Page 7:

GNP – computing coordinates

• Measure distance to fixed landmarks

• Assign coordinates by solving a multi-dimensional global minimization problem

• There is no exact solution:
  – Internet is not Euclidean
  – Measurements have errors

[Figure: points (x1,y1) to (x4,y4) on x/y axes, with measured distances d1, d2, d3]
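
A rough sketch of the minimization step follows. The slides do not say which objective or solver is used, so this example assumes a sum of squared errors and plain gradient descent; `fit_coordinates`, `steps`, and `lr` are hypothetical names chosen for illustration.

```python
import math

def fit_coordinates(landmarks, measured, steps=2000, lr=0.05):
    """Find a position whose distances to the landmarks best match the
    measured distances, by minimizing the sum of squared errors."""
    dims = len(landmarks[0])
    # Start from the centroid of the landmarks.
    pos = [sum(lm[k] for lm in landmarks) / len(landmarks) for k in range(dims)]
    for _ in range(steps):
        grad = [0.0] * dims
        for lm, d in zip(landmarks, measured):
            est = math.dist(pos, lm)
            if est == 0.0:
                continue  # gradient is undefined exactly on a landmark
            scale = 2.0 * (est - d) / est
            for k in range(dims):
                grad[k] += scale * (pos[k] - lm[k])
        pos = [p - lr * g for p, g in zip(pos, grad)]
    return pos

# Hypothetical landmarks and error-free measurements from a node at (3, 4).
landmarks = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
measured = [math.dist((3.0, 4.0), lm) for lm in landmarks]
position = fit_coordinates(landmarks, measured)
```

With real measurements the residual error does not reach zero, for the two reasons listed on the slide.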

Page 8:

PIC – computing coordinates

• Any node in the system can act as a landmark

• Strategies for choosing landmarks include:
  – Random nodes
  – Close nodes
  – Hybrid

[Figure: points (x1,y1) to (x5,y5) on x/y axes, with measured distances d1, d2, d3]
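
The three strategies could be sketched as below. The slides only name them, so the details are assumptions: `distance_to` is a hypothetical callable giving some distance for each candidate, and the hybrid split (half closest, half random) is a guess made for illustration.

```python
import random

def choose_landmarks(candidates, distance_to, k, strategy, seed=None):
    """Pick k landmarks from the candidate nodes (assumes k <= len(candidates))."""
    rng = random.Random(seed)
    nodes = list(candidates)
    by_distance = sorted(nodes, key=distance_to)
    if strategy == "random":
        return rng.sample(nodes, k)
    if strategy == "close":
        return by_distance[:k]
    if strategy == "hybrid":
        # Assumed split: half closest, the remainder random among the rest.
        close = by_distance[: k // 2]
        rest = [n for n in nodes if n not in close]
        return close + rng.sample(rest, k - len(close))
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical example: 20 nodes, distance is how far the id is from 7.
nodes = list(range(20))
closest = choose_landmarks(nodes, lambda n: abs(n - 7), 4, "close")
```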

Page 9:

PIC – any node can act as landmark

Page 10:

PIC – advantages

• Self-organizing - no provisioning of servers needed

• Scalable - load distributed among all the peers

• Resilient - avoids centralized points of failure

Page 11:

Experimental evaluation

• 40 000 node network on 3 topologies: Georgia Tech, Mercator, CorpNet

• Compare predicted distance to real distance for 100 000 node pairs

• Euclidean space with 8 dimensions, 16 landmarks
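
The comparison can be sketched as follows. The slides do not give the exact error formula, so this assumes the common definition |predicted − real| / real; the function names and sample values are illustrative only.

```python
def relative_error(predicted, real):
    """Relative error of a predicted distance (assumed definition)."""
    return abs(predicted - real) / real

def fraction_within(pairs, max_error):
    """Fraction of (predicted, real) pairs whose relative error is at most max_error."""
    errors = [relative_error(p, r) for p, r in pairs]
    return sum(e <= max_error for e in errors) / len(errors)

# Hypothetical (predicted, real) distances for four node pairs.
pairs = [(10.0, 10.0), (15.0, 10.0), (30.0, 10.0), (9.0, 10.0)]
share = fraction_within(pairs, 0.5)  # share of pairs within 50% error
```

Evaluating `fraction_within` over a range of thresholds gives one curve of the kind plotted on the accuracy slides.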

Page 12:

Accuracy: Georgia Tech

[Figure: fraction of distances (0 to 1) against relative error in % (0 to 200); curves: GNP, random, closest, hybrid]

Page 13:

Accuracy over short distances

[Figure: fraction of distances (0 to 1) against relative error in % (0 to 200); curves: random, GNP, closest, hybrid]

Page 14:

Accuracy: CorpNet

[Figure: fraction of distances (0 to 1) against relative error in % (0 to 200); curves: GNP, random, closest, hybrid]

Page 15:

Accuracy: Mercator

[Figure: fraction of distances (0 to 1) against relative error in % (0 to 200); curves: GNP, random, closest, hybrid]

Page 16:

PIC – security

• Problem: malicious or compromised nodes can provide incorrect coordinates or fake distances

• Solution:
  – Incorrect coordinates and distances are likely to violate the triangle inequality
  – Remove landmarks that violate the triangle inequality

Page 17:

PIC – security

• When testing landmark i, check:

  $d_{n,i} \le d_{i,j} + d_{n,j}$

  $d_{n,i} \ge d_{i,j} - d_{n,j}$

  $d_{n,i} \ge d_{n,j} - d_{i,j}$

• Remove landmarks with highest sum of deviations from these bounds

[Figure: triangle formed by joining node n, landmark i (under test), and landmark j, with sides $d_{n,i}$, $d_{n,j}$, $d_{i,j}$]
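
A minimal sketch of this test, under stated assumptions: distances between landmarks are computed from the coordinates the landmarks report, distances from the joining node are the ones it measured, and exactly one landmark (the one with the largest sum) is flagged. The function names are hypothetical, and any threshold or repetition used in PIC itself is not modeled here.

```python
import math

def deviation_sums(coords, measured):
    """For each landmark i, sum how far d_{n,i} falls outside the bounds
    implied by every other landmark j.

    coords[i]   -- coordinates reported by landmark i
    measured[i] -- distance d_{n,i} between joining node n and landmark i
    """
    sums = []
    for i in range(len(coords)):
        total = 0.0
        for j in range(len(coords)):
            if i == j:
                continue
            d_ij = math.dist(coords[i], coords[j])
            upper = d_ij + measured[j]       # d_{n,i} <= d_{i,j} + d_{n,j}
            lower = abs(d_ij - measured[j])  # the two lower bounds combined
            total += max(0.0, measured[i] - upper) + max(0.0, lower - measured[i])
        sums.append(total)
    return sums

def most_suspicious(coords, measured):
    """Index of the landmark with the highest sum of deviations."""
    sums = deviation_sums(coords, measured)
    return max(range(len(sums)), key=sums.__getitem__)

# Three honest landmarks seen from a node at (3, 4), plus one landmark
# that reports far-away coordinates together with a very short distance.
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (100.0, 100.0)]
measured = [math.dist((3.0, 4.0), c) for c in coords[:3]] + [1.0]
suspect = most_suspicious(coords, measured)
```

An inconsistent landmark also adds some deviation to the sums of honest landmarks it is paired with, which is why the landmarks are ranked by their sums instead of rejecting every one that has a non-zero sum.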

Page 18:

Security evaluation

• Fraction f of colluding attackers
  – Know everything

• When a node joins, attackers collude to provide a set of fake coordinates and distances that maximize the distance to the correct position

• This is a very powerful attack

Page 19:

Accuracy under attack

[Figure: fraction of distances (0 to 1) against relative error in % (0 to 140); curves: no attackers with security on, 10% colluding attackers, 20% colluding attackers]

Page 20:

Application to peer-to-peer overlays

• Structured overlays:
  – Nodes have nodeIds
  – Message sent to a key is delivered to the node with the closest nodeId

Page 21:

Structured overlays: Mapping keys to nodes

• large id space (128-bit integers)

• nodeIds picked randomly from space

• keys picked randomly from space

• key is managed by its root node:
  – live node with id closest to the key

[Figure: id space showing nodeIds, a key, and the root node for that key]
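
The key-to-root mapping can be sketched as below. It assumes the id space wraps around (a ring) and breaks ties toward the smaller id; both choices, and the names `ring_distance` and `root_node`, are assumptions made for this illustration.

```python
ID_BITS = 128
ID_SPACE = 1 << ID_BITS  # number of ids in a 128-bit space

def ring_distance(a, b):
    """Distance between two ids on a circular id space (assumed)."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)

def root_node(key, live_node_ids):
    """Live node whose id is closest to the key (ties go to the smaller id)."""
    return min(live_node_ids, key=lambda n: (ring_distance(n, key), n))

# Hypothetical small ids for readability.
root = root_node(10, [1, 12, 200])
```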

Page 22:

Pastry: Node routing state

0* 1* 2* 3*
20* 21* 22* 23*
200* 201* 202* 203*
2030* 2031* 2032* 2033*

[Figure: the routing table rows above, with nodeId 203231 and the leaf set marked]

• topology aware routing table

• nodeIds and keys in some base $2^b$ (e.g., 4)

• prefix constraints on nodeIds for each slot

• pick closest node satisfying slot constraints

Page 23:

Pastry: routing

• prefix matching: each hop resolves an extra key digit

[Figure: route(m,323310) starting at node 103231 and passing through nodeIds 313221, 322021, and 323211, each sharing one more prefix digit with the key 323310]
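
Prefix matching can be illustrated with the nodeIds from this slide. The sketch is simplified: every node is assumed to know the same list of nodes and to forward to the first known node that matches at least one more digit of the key. Proximity-based choice and the leaf set are not modeled, and the function names are hypothetical.

```python
from typing import List, Optional

def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current: str, key: str, known: List[str]) -> Optional[str]:
    """First known node that matches the key in more leading digits than current."""
    p = shared_prefix_len(current, key)
    for node in known:
        if shared_prefix_len(node, key) > p:
            return node
    return None  # no better node known: current delivers the message

def route(start: str, key: str, known: List[str]) -> List[str]:
    """Nodes visited when routing a message for key, beginning at start."""
    path = [start]
    hop = next_hop(start, key, known)
    while hop is not None:
        path.append(hop)
        hop = next_hop(hop, key, known)
    return path

known = ["313221", "322021", "323211", "323310"]
path = route("103231", "323310", known)
```

Each hop strictly lengthens the matched prefix, so the loop always terminates.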

Page 24:

Proximity neighbour selection

• Select close nodes for use in routing

• Important to achieve low delay routes

• PIC can replace network distance probes

Page 25:

Pastry: prefix-based routing

• Prefix matching: each hop resolves an extra key digit

• Proximity neighbour selection: use closest known node that matches an extra digit

[Figure: route(m,323310) forwarded hop by hop from node 103231 through 313221, 322021, and 323211 to 323310]

Page 26:

Proximity test variants

• Full probing
  – RTT measured by taking the minimum of three probes

• PIC
  – RTT estimated with coordinates

• Filtered probing
  – Use coordinates to filter bad candidates, always probe before replacing a neighbour
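
Filtered probing might look roughly like this. The sketch assumes the node already knows the RTT to its current neighbour and that `probe` is a caller-supplied function returning one RTT measurement to the candidate; these names and the exact comparison are illustrative assumptions, not details from the talk.

```python
import math

def measure_rtt(probe, attempts=3):
    """Full probing: take the minimum of several probe measurements."""
    return min(probe() for _ in range(attempts))

def should_replace(my_coords, candidate_coords, current_rtt, probe):
    """Decide whether a candidate should replace the current neighbour."""
    # Filter: if the coordinates predict no improvement, do not probe at all.
    if math.dist(my_coords, candidate_coords) >= current_rtt:
        return False
    # The estimate looks promising, so confirm with real probes before replacing.
    return measure_rtt(probe) < current_rtt

# Hypothetical probe that records how often it is called.
calls = []
def fake_probe():
    calls.append(1)
    return 20.0

far = should_replace((0.0, 0.0), (100.0, 0.0), 50.0, fake_probe)
probes_after_far = len(calls)   # candidate filtered out, no probes sent
near = should_replace((0.0, 0.0), (10.0, 0.0), 50.0, fake_probe)
probes_after_near = len(calls)  # promising candidate probed three times
```

Most candidates are rejected by the coordinate check alone, so probes are spent only on candidates that are likely to be accepted.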

Page 27:

Trace-driven evaluation

• Dynamic node arrival and failure generated from UW Gnutella study
  – 60 hour trace
  – Average session time 2.3 hours
  – Number of active nodes varies from 1300 to 2700

• Georgia Tech topology

Page 28:

Distance probes

[Figure: probes per second per node (0 to 0.3) against time in hours (0 to 60); curves: full probing, PIC, filtered probing]

Page 29:

Relative delay penalty

[Figure: RDP (0 to 3.5) for full probing, filtered probing, PIC, and no locality]

Page 30:

Related Work

• GNP: maps Internet into geometric space using centralized landmarks

• Lighthouses: uses decentralized random landmarks

• Mithos: uses closest nodes as landmarks

• Virtual landmarks: partitions nodes into sets, maps coordinates between sets

• Vivaldi: computes coordinates continuously by passively monitoring RPC delays

Page 31:

Conclusion

• PIC enables practical distance estimation in large distributed systems
  – Accurate
  – Self-organizing
  – Scalable
  – Secure

• Future Work
  – Deployment and evaluation on the Internet
  – Different distance metrics (e.g. bandwidth)

Page 32:

Questions?