Modeling Internet Topology Ellen W. Zegura College of Computing Georgia Tech.


Modeling Internet Topology

Ellen W. Zegura

College of Computing

Georgia Tech


Zegura - Mar 2002 IPAM Workshop Tutorial 2

Outline

• Part I - Modeling topology

– Background

– Survey of models + what is known about topology

– Example: mathematical foundations of degree-based generation

– Evaluation of topologies

• Part II - Reality check

– Beyond simple topology

– Visualization

• Open questions/Bold statements/Random thoughts

• Reading list


Networking background

[Figure: network diagram with labels: access networks, hosts/end systems, routers, domains/autonomous systems, exchange point, stub domains, transit domains, border routers, peering, lowly worm]


Topology modeling

• Graph representation

• Router-level modeling

– vertices are routers

– edges are one-hop IP connectivity

• Domain- (AS-) level modeling

– vertices are domains (ASes)

– edges are peering relationships


Survey of models

• Waxman (Waxman 1988)

– router level model capturing locality

• Transit-stub (Zegura 1997), Tiers (Doar 1997)

– router level model capturing hierarchy

• Inet (Jin 2000)

– AS level model based on degree sequence

• BRITE (Medina 2000)

– AS level model based on evolution


Waxman model (Waxman 1988)

• Router level model

• Nodes placed at random in 2-d space with dimension L

• Probability of edge (u,v):

– a e^{-d/(bL)}, where d is the Euclidean distance between u and v, and a and b are constants

• Models locality

[Figure: two nodes u and v separated by distance d(u,v)]
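A minimal Python sketch of the Waxman edge rule may help make the formula concrete. It assumes the "dimension L" is the side length of a square in which nodes are placed uniformly; the function name `waxman_graph` and the default values of a and b are illustrative choices, not taken from the slides.

```python
import math
import random


def waxman_graph(n, a=0.4, b=0.2, L=1.0, seed=None):
    """Return (positions, edges) for a Waxman-style random graph.

    Assumption: nodes are uniform in an L x L square; a, b are the
    model constants from the edge probability a * exp(-d / (b * L)).
    """
    rng = random.Random(seed)
    pos = [(rng.uniform(0, L), rng.uniform(0, L)) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Euclidean distance between u and v
            d = math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])
            # Nearby pairs are more likely to be linked (locality)
            if rng.random() < a * math.exp(-d / (b * L)):
                edges.append((u, v))
    return pos, edges
```

Raising a increases the overall edge density, while raising b increases the share of long edges relative to short ones.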


Transit-stub model (Zegura 1997)

• Router level model

• Transit domains

– placed in 2-d space

– populated with routers

– connected to each other

• Stub domains

– placed in 2-d space

– populated with routers

– connected to transit domains

• Models hierarchy


Real data: AS topology

• Oregon route view server; peers with routers to collect BGP routing tables

• Data publicly available from Nov 97 to present (nlanr.org, routeviews.org)

• Faloutsos 1999

– degree sequence approximated by power law

– i.e., let f(d) be the fraction of nodes with degree d, then f(d) ∝ d^α for a constant exponent α

• Chen 2002

– Oregon data incomplete (but so is theirs!)

– degree sequence highly variable but not strict power law
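As a rough illustration of what "approximated by a power law" means in practice, the sketch below computes f(d) from a list of node degrees and estimates the exponent as the least-squares slope of log f(d) against log d. This is only one simple estimator, and the function names are hypothetical; it is not claimed to be the procedure used in either cited paper.

```python
import math
from collections import Counter


def degree_fractions(degrees):
    """Map each degree d to f(d), the fraction of nodes with that degree."""
    counts = Counter(degrees)
    n = len(degrees)
    return {d: c / n for d, c in counts.items()}


def fit_power_law_exponent(degrees):
    """Least-squares slope of log f(d) versus log d (degree 0 is ignored).

    Needs at least two distinct positive degrees.
    """
    f = degree_fractions(degrees)
    pts = [(math.log(d), math.log(p)) for d, p in f.items() if d > 0]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx
```

A slope near a negative constant on the log-log plot is consistent with a power law, but a good-looking straight line alone does not rule out other heavy-tailed distributions, which is the caution behind the Chen 2002 bullet.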


Inet (Jin 2000)

• Generate degree sequence

• Build spanning tree over nodes with degree larger than 1, using preferential connectivity

– randomly select node u not in tree

– join u to existing node v with probability d(v) / Σ_w d(w), where the sum runs over nodes w already in the tree

• Connect degree 1 nodes using preferential connectivity

• Add remaining edges using preferential connectivity
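The spanning-tree step can be sketched as follows. This is a simplified illustration, not the Inet implementation: it reads "preferential connectivity" as choosing the attachment point v with weight proportional to its target degree d(v) among nodes already in the tree, and it ignores how many edge slots each node has left. The function name `preferential_tree` is hypothetical.

```python
import random


def preferential_tree(degrees, seed=None):
    """Spanning tree over nodes whose target degree exceeds 1.

    degrees: dict mapping node -> target degree.
    Returns a list of (u, v) tree edges.
    """
    rng = random.Random(seed)
    core = [v for v, d in degrees.items() if d > 1]
    if not core:
        return []
    rng.shuffle(core)              # random order in which nodes join
    tree = [core[0]]
    edges = []
    for u in core[1:]:
        # Attach u to an existing node, weighted by target degree
        weights = [degrees[v] for v in tree]
        v = rng.choices(tree, weights=weights, k=1)[0]
        edges.append((u, v))
        tree.append(u)
    return edges
```

Degree 1 nodes and the remaining edges would then be attached with the same weighted choice.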


BRITE (Medina 2000)

• Generate small backbone, with nodes placed:

– randomly or

– concentrated (skewed)

• Add nodes one at a time (incremental growth)

• New node has constant # of edges connected using:

– preferential connectivity and/or

– locality


Router-level measurement

• General technique: traceroute, returns list of IP addresses on a path from source to destination

• Collection challenges:

– obtaining sufficient traceroute origin points

– deciding set of destination IP addresses (for coverage)

– limiting traceroute load

• Postprocessing challenges:

– resolving aliases (which IP addresses belong to same router)

[Figure: traceroute paths; labels: source 0, destination 0, S1, D1]


Projects

• Lucent (Burch 1999)

– single source (Lucent), ~100k destinations

– emphasis: longitudinal study, visualization

• Skitter (Broido 2001)

– 20 sources (“monitors”), ~400k destinations

– emphasis: measurement repository, analysis

• Mercator (Govindan 2000)

– single source (but uses source routing), 150k interfaces

– emphasis: heuristics for map construction


What is known? (hard to say)

• Caveat: router-level mapping clearly incomplete, so conclusions are weak

• Observations:

– qualitatively similar to AS graph on a number of measures

– Weibull distributions are a good fit for a number of quantities (including degree distribution)


Foundations of degree-based generation (Mihail 2002)

• Given degree sequence d(1) >= d(2) >= … >= d(n)

• A degree sequence is realizable if there is a simple graph (no self-loops or multiple links) with this sequence

• Necessary and sufficient condition for degree sequence to be realizable:

– for each subset of k highest degree nodes, degrees can be “absorbed” within the nodes and the outside degrees
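The "absorbed" condition reads like the classical Erdős–Gallai inequality: the k highest degrees can contribute at most k(k-1) to links among themselves, and each outside node i can absorb at most min(d(i), k) more. Assuming that is the intended condition, together with the requirement that the degree sum be even, a small check looks like this (the function name is illustrative):

```python
def is_realizable(degrees):
    """Erdős–Gallai test: can a simple graph have exactly these degrees?"""
    d = sorted(degrees, reverse=True)
    # Every edge contributes 2 to the degree sum, so the sum must be even
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    n = len(d)
    for k in range(1, n + 1):
        top = sum(d[:k])                        # degrees of the k highest nodes
        inside = k * (k - 1)                    # absorbed among those k nodes
        outside = sum(min(x, k) for x in d[k:]) # absorbed by the other nodes
        if top > inside + outside:
            return False
    return True
```

For example, the sequence 3, 3, 1, 1 fails at k = 2: the two highest-degree nodes need 6 link endpoints but only 2 + 1 + 1 = 4 can be absorbed.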


Construction algorithm

• Maintain residual degrees of vertices, d(v)

• Repeat until all vertices have been chosen:

– pick arbitrary vertex v

– add edges from v to d(v) vertices of highest residual degree

– update residual degrees

• Note: order to pick v arbitrary
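The steps above can be sketched in Python as follows. The function name `construct` and the optional `order` argument (the sequence in which vertices are picked) are illustrative; ties among equal residual degrees are broken by dictionary order, which is an arbitrary choice.

```python
def construct(degrees, order=None):
    """Greedy realization of a degree sequence as a simple graph.

    degrees: dict mapping vertex -> required degree.
    order:   optional sequence in which vertices are picked.
    Returns a list of edges; raises ValueError if the greedy step fails.
    """
    residual = dict(degrees)
    order = list(degrees) if order is None else list(order)
    edges = []
    for v in order:
        need = residual[v]
        residual[v] = 0  # v is now chosen and takes no further edges
        # Other vertices that can still accept an edge, highest residual first
        candidates = sorted(
            (u for u in residual if u != v and residual[u] > 0),
            key=lambda u: residual[u],
            reverse=True,
        )
        if len(candidates) < need:
            raise ValueError("degree sequence is not realizable")
        for u in candidates[:need]:
            edges.append((v, u))
            residual[u] -= 1
    return edges
```

Because a chosen vertex has residual degree 0 afterwards, it is never selected as a neighbor again, so no self-loops or multiple links are created.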


Sparse/dense core

• Dense core

– pick v’s starting with high degree vertices

– will tend to connect high degree vertices

• Sparse core

– pick v’s starting with low degree vertices

– less likely to connect high degree vertices


Example

• Large topology (11000+ nodes, 32000+ edges)

• Dense core

– diameter 5

– average path length 3.6

• Sparse core

– diameter 29

– average path length 17.9


Random instance

• Start from any realization of degree sequence

• Pick two edges at random, (u,v) and (s,t), with distinct endpoints

• If the swap does not disconnect the graph, remove both edges and insert (u,s) and (v,t)

• Result satisfies degree sequence

• In the limit, reaches every possible connected realization with equal probability

[Figure: edges (u,v) and (s,t) before the swap; edges (u,s) and (v,t) after the swap]
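One swap step can be sketched as below. Edges are kept as a list of (u, v) tuples; the helper names are illustrative. The check that rejects a swap creating a multiple link is an added assumption, needed to keep the graph simple, and rebuilding the edge set and re-testing connectivity on every step is deliberately naive.

```python
import random


def is_connected(nodes, edges):
    """Depth-first search connectivity test on an undirected graph."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x] - seen:
            seen.add(y)
            stack.append(y)
    return len(seen) == len(adj)


def swap_step(nodes, edges, rng):
    """Try one degree-preserving swap in place; return True if applied."""
    i, j = rng.sample(range(len(edges)), 2)
    (u, v), (s, t) = edges[i], edges[j]
    if len({u, v, s, t}) < 4:
        return False                      # endpoints must be distinct
    present = {frozenset(e) for e in edges}
    if frozenset((u, s)) in present or frozenset((v, t)) in present:
        return False                      # assumption: keep the graph simple
    trial = list(edges)
    trial[i], trial[j] = (u, s), (v, t)
    if not is_connected(nodes, trial):
        return False                      # swap would disconnect the graph
    edges[i], edges[j] = (u, s), (v, t)
    return True
```

Each of u, v, s, t loses one edge and gains one, so every node's degree is unchanged after a successful swap.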


Example

• Different starting points

• Snapshots: initial graph, then after 25k, 50k, 100k, 300k, 600k iterations

• Large topology, sparse initial core

– diameter: 29, 13, 11, 11, 10, 10

– avgspl (average shortest path length): 5.6, 3.6, 3.4, 3.4, 3.4, 3.4

• Large topology, dense initial core

– diameter: 5, 10, 10, 10, 10, 10

– avgspl: 3.6, 3.2, 3.2, 3.4, 3.4, 3.4


Notes about models

• Variants on evolutionary models

• Variants on degree-driven models

• Appeal of evolutionary

• Relationship to work on “networks” in general


Evaluation

• Question: what determines whether a topology generator is “good”?

• Essentially an unsolved (and hard) problem

– depends on what topologies are used for

• NOT “degree sequence follows a power law!”


Metrics

• Path-related metrics

– diameter, shortest path length

• Clustering metrics

– neighborhood size (“expansion”), eigenvalue decomposition, clustering coefficient

• Robustness metrics

– resilience

• Hierarchy metrics

– link usage, size of layers


Small world topologies (Bu 2002)

• Defined by two measures:

– characteristic path length L = number of edges in shortest path between two vertices, averaged over all vertex pairs

– clustering coefficient C:

• take vertex v with k > 1 neighbors

• at most k(k-1)/2 edges among neighbors

• C(v) = fraction of k(k-1)/2 edges present

• C = average clustering coefficient

• C >> C_random, L ≈ L_random

[Figure: vertex v and its k neighbor nodes]
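Both measures are easy to compute on a small graph. The sketch below takes an adjacency dictionary (vertex -> set of neighbors) and makes two assumptions that the slide does not spell out: vertices with fewer than two neighbors are skipped when averaging C(v), and L is averaged over connected pairs only.

```python
from collections import deque


def clustering_coefficient(adj):
    """Average of C(v) over vertices with at least two neighbors."""
    values = []
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # C(v) undefined: no possible edges among neighbors
        nb = list(nbrs)
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nb[j] in adj[nb[i]]
        )
        values.append(links / (k * (k - 1) / 2))
    return sum(values) / len(values) if values else 0.0


def characteristic_path_length(adj):
    """Mean shortest-path length (in edges) over connected vertex pairs."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0
```

The small-world test then compares these values with the same measures on a random graph that has the same number of nodes and edges.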


Findings

• AS-level topologies satisfy small-world test

• Example, Mar 00:

– L=3.7, L_random=3.8

– C=.39, C_random=.0023

• Example, Sept 01:

– L=3.6, L_random=3.6

– C=.47, C_random=.0015


Distinguishing between types of generators (Tangmunarunkit 2001)

• Goal: large-scale metrics that distinguish between classes of graphs

• Proposal: Expansion, resilience and distortion

– differentiate between canonical graphs (mesh, tree, random graph)

– differentiate between three types of generators

• random graph (e.g., Waxman)

• structural (e.g., Transit-Stub, Tiers)

• degree-based (e.g., PLRG, BRITE)


Model “signatures”

• Signature: expansion, resilience, distortion

• Waxman: H H H (like random)

• Tiers: L H L

• Transit-stub: H L L (like tree)

• PLRG: H H L (like complete graph)

• Also: real topologies and other degree-based generators have H H L signature


Measure of hierarchy

• link-value measure

• see paper for details…

• bottom line: degree-based generators contain loose notion of hierarchy that is somewhat similar to loose notion in Internet


Semantics: policy-based routes

• Internet routes are not hop-based shortest paths

• General policies:

– path between two nodes in a domain remains in that domain

– path between two nodes in two different domains traverses zero or more transit domains


Transit-stub

• Use edge weights so that shortest-paths obey general policies

• Four weights (in order)

– intra-domain edges

– T-T edges

– S-T edges

– S-S edges


BGP peering relationships (Gao 2000)

• Problem: Routes determined by routing policy, including AS-level contractual agreements

• Idea: label edges in AS-level graph as

– provider-to-customer (customer pays provider for connectivity to rest of Internet)

– peer-to-peer (exchange traffic between customers free of charge)

– sibling-to-sibling (provide connectivity to rest of Internet for each other)

• Use BGP routing table entries

[Figure: example AS graph with nodes AS1 through AS7]


Principles

• e.g., routing table entry = AS path 1849 702 701 1

• downhill path: all edges provider-to-customer or sibling-to-sibling

• uphill path: all edges customer-to-provider or sibling-to-sibling

• An AS path of a BGP routing table is:

– an uphill path followed by a downhill path (either path segment may be empty)…or...

– an uphill path followed by a peer-to-peer edge followed by a downhill path (either path segment may be empty)
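The two allowed shapes can be checked mechanically once each edge along a path has a label. The sketch below uses hypothetical short labels ('c2p', 'p2c', 'p2p', 's2s') for the four edge types; paths that pass this check are commonly called valley-free.

```python
def matches_path_pattern(labels):
    """Check an AS path against: uphill, optional peer edge, downhill.

    labels: edge types in path order, each one of
      'c2p' (customer-to-provider), 'p2c' (provider-to-customer),
      'p2p' (peer-to-peer), 's2s' (sibling-to-sibling).
    """
    phase = "up"
    for label in labels:
        if label == "s2s":
            continue              # siblings are allowed uphill and downhill
        if label == "c2p":
            if phase != "up":
                return False      # climbing again after descending
        elif label == "p2p":
            if phase != "up":
                return False      # at most one peer edge, at the top
            phase = "down"
        elif label == "p2c":
            phase = "down"
        else:
            raise ValueError(f"unknown edge label: {label!r}")
    return True
```

An empty uphill or downhill segment is accepted, matching the note that either path segment may be empty.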


Examples

• an uphill path followed by a downhill path

– AS4-AS2-AS1-AS3-AS5

– AS7-AS1-AS2

• an uphill path followed by a peer-to-peer edge followed by a downhill path

– AS5-AS6-AS3-AS5

– AS6-AS3-AS2-AS4

[Figure: example AS graph with nodes AS1 through AS7]


Basic algorithm sketch

• Compute degrees for each AS

• For each routing table path:

– find highest degree AS (“top provider” T)

– for each AS edge (u,v) to left of T, (u,v) is assigned value 1

– for each AS edge (u,v) to right of T, (v,u) is assigned value 1

• For each edge (u,v):

– if (u,v) = 1 and (v,u) = 1 then sibling-to-sibling

– else if (v,u) = 1 then provider-to-customer

– else if (u,v) = 1 then customer-to-provider

• Note: complete algorithm also identifies peer-to-peer edges
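A compact sketch of the basic (non-peering) part of the algorithm follows. It assumes paths are given as lists of AS numbers, takes an AS's degree to be its number of distinct neighbors across all paths, and records an edge to the right of T in the reverse direction (v,u), which is what lets the three classification cases be told apart. Names are illustrative.

```python
from collections import defaultdict


def infer_relationships(paths):
    """Label each directed AS pair seen in the paths.

    Returns a dict mapping (u, v) to 'sibling-to-sibling',
    'provider-to-customer' (u provides for v) or 'customer-to-provider'.
    """
    neighbors = defaultdict(set)
    for path in paths:
        for u, v in zip(path, path[1:]):
            if u != v:
                neighbors[u].add(v)
                neighbors[v].add(u)
    degree = {a: len(n) for a, n in neighbors.items()}

    marked = set()  # (u, v) in marked means "(u,v) assigned value 1"
    for path in paths:
        # Index of the highest-degree AS: the "top provider" T
        top = max(range(len(path)), key=lambda i: degree.get(path[i], 0))
        for i in range(len(path) - 1):
            u, v = path[i], path[i + 1]
            if u == v:
                continue
            if i < top:
                marked.add((u, v))   # left of T: uphill edge
            else:
                marked.add((v, u))   # right of T: downhill edge

    labels = {}
    for u in neighbors:
        for v in neighbors[u]:
            if (u, v) in marked and (v, u) in marked:
                labels[(u, v)] = "sibling-to-sibling"
            elif (v, u) in marked:
                labels[(u, v)] = "provider-to-customer"
            elif (u, v) in marked:
                labels[(u, v)] = "customer-to-provider"
    return labels
```

Peer-to-peer edges are not identified here; every edge comes out as one of the three labels above.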


Hierarchical classification (Subramanian 2002)

• Idea: partition ASes into hierarchical levels using directed graph of peering relationships

• Process:

– identify and remove nodes with out-degree 0 (customers)

– recursively identify and remove nodes with out-degree 0 (small ISPs)

– identify dense core as largest subset of nodes that is “almost a clique” (in and out-degree at least half nodes)

– identify transit core as smallest subset of nodes that peer primarily with each other and ASes in dense core

– remaining nodes are outer core
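The first two steps of the process can be sketched as repeated removal of nodes with out-degree 0. The sketch assumes directed edges point from provider to customer, so a node with no outgoing edges serves nobody; the dense core, transit core and outer core steps depend on thresholds not given here and are left out. The function name is hypothetical.

```python
def peel_customers_and_small_isps(out_edges):
    """Split nodes into customers, small ISPs, and the remaining core.

    out_edges: dict mapping node -> set of nodes it points to
               (assumed direction: provider -> customer).
    Returns (customers, small_isps, remaining) as sets.
    """
    nodes = set(out_edges) | {v for vs in out_edges.values() for v in vs}
    graph = {u: set(out_edges.get(u, ())) for u in nodes}

    def remove(layer):
        for u in layer:
            del graph[u]
        for vs in graph.values():
            vs -= layer

    # Step 1: nodes with out-degree 0 in the full graph are customers
    customers = {u for u, vs in graph.items() if not vs}
    remove(customers)

    # Step 2: keep removing nodes whose out-degree has dropped to 0
    small_isps = set()
    while True:
        layer = {u for u, vs in graph.items() if not vs}
        if not layer:
            break
        small_isps |= layer
        remove(layer)
    return customers, small_isps, set(graph)
```

Nodes that sit on a directed cycle, such as two ASes that point to each other, never reach out-degree 0 and so stay in the remaining set.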


Example result

• Dense core - 20 ASes

• Transit core - 162 ASes

• Outer core - 675 ASes

• Small regional ISPs - 950 ASes

• Customers - 8852 ASes


Visualization: netvisor (Eagan 2002)

• Tool for router-level layout

• Combines automatic placement with user-assisted placement

• Understands domain semantics

• Collaboration between Information Visualization experts and Networking experts


Visualization: conceptual model (Faloutsos 2002)

• Idea: simple representation of AS-level topology, useful for intuitive understanding (and NY Times publication!)

• e.g., bowtie model for web

• jellyfish model

– highly connected core

– layers (“shells”)

– degree one nodes form legs

– length of legs denotes density

[Figure: jellyfish diagram with labels: core, layers, legs]


Open Problems

• Evaluation

– what metrics are important?

• Useful modeling/scaling

– what topologies should be used for simulations?

• Semantics

– let’s move beyond simple topology


Are AS-level topologies useful?

• Many interesting problems arise due to large scale of Internet, hence need simulations that are “big enough”

• AS-level topology (about 10,000 nodes) manageable for some simulations

• But…representation of every AS as a comparable node (especially in 2-d space!) is a gross simplification


Observations on level of detail

• AS level models are limited (useless?)

– not enough distinction (all ASes look alike)

– not suitable for packet level simulations

• router level models are limited (useless?)

– too small to be realistic…or...

– too large for simulations

• need alternative models

– intermediate (border routers, exchange points,…)

– fluid flow network model??

• need better understanding of scaling


Reading List (1 of 3)

• [Broido 2001] Broido and Claffy, “Internet topology: local properties”, SPIE ITCom 2001.

• [Bu 2002] Bu and Towsley, “Distinguishing between Internet power-law generators”, IEEE Infocom 2002.

• [Burch 1999] Burch and Cheswick, “Mapping the Internet”, IEEE Computer, April 1999.

• [Chen 2002] Chen, Chang, Govindan, Jamin, Shenker and Willinger, “The origin of power laws in Internet topologies revisited”.

• [Calvert 1997] Calvert, Doar and Zegura, “Modeling Internet topology”, IEEE Communications Magazine, June 1997.

• [Doar 1997] Doar and Leslie, “How bad is naïve multicast routing”, IEEE Infocom 1993.

• [Eagan 2002] Netvisor. http://www.cc.gatech.edu/gvu/ii/netviz/

• [Faloutsos 1999] Faloutsos, Faloutsos and Faloutsos, “On power-law relationships of the Internet topology”, ACM Sigcomm 1999.


Reading List (2 of 3)

• [Gao 2000] Gao, “On inferring autonomous system relationships in the Internet”, IEEE Infocom 2000.

• [Govindan 2000] Govindan and Tangmunarunkit, “Heuristics for Internet map discovery”, IEEE Infocom 2000.

• [Jin 2000] Jin, Chen and Jamin, “Inet: Internet topology generator”, U. Michigan technical report CSE-TR-433-00, September 2000.

• [Medina 2000] Medina, Matta and Byers, “On the origin of power-laws in Internet topologies”, ACM CCR, April 2000.

• [Mihail 2002] Mihail, Gkantsidis, Saberi, Zegura, “On semantics of Internet topologies”, GT technical report, January 2002.

• [Subramanian 2002] Subramanian, Agarwal, Rexford and Katz, “Characterizing the Internet from multiple vantage points”, IEEE Infocom 2002.

• [Tauro 2002] Tauro, Palmer, Siganos and Faloutsos, “A simple conceptual model for the Internet topology”, Global Internet 2001.


Reading List (3 of 3)

• [Tangmunarunkit 2001] Tangmunarunkit, Govindan, Jamin, Shenker and Willinger, “Network topologies, power laws, and hierarchy”, USC technical report 01-746, 2001.

• [Waxman 1988] Waxman, “Routing of multipoint connections”, IEEE JSAC, 1988.

• [Zegura 1997] Zegura, Calvert and Donahoo, “A quantitative comparison of graph-based models for Internet topology”, IEEE/ACM Transactions on Networking, December 1997.


The End