School of Computing National University of Singaporetbma/teaching/cs5229y14... · 2014-10-02 ·...

Richard T. B. Ma

School of Computing

National University of Singapore

Forwarding and Routing

CS 5229: Advanced Compute Networks

IP Protocol Stack: Key Abstractions

Best-effort local packet delivery

Best-effort global packet delivery

Reliable streams

Applications

Unreliable datagrams

Link

Network

Transport

Application

Datagram networks

no call setup at network layer

routers: no end-to-end state information no network-level concept of “connection”

packets forwarded using destination host address

1. send datagrams

application

transport

network

data link

physical

application

transport

network

data link

physical

2. receive datagrams

Two Key Network-Layer Functions

Forwarding (data plane): move packets from an input interface to an appropriate output interface in a router

Routing (control plane): determine route taken by packets from source to destination

routing algorithms

analogy:

routing: process of planning trip from source to destination

forwarding: process of getting through single interchange

Forwarding: Hop-by-Hop

Each router has a forwarding table maps destination addresses to outgoing

interfaces

Upon receiving a packet inspect the destination IP address in the

header index into the table

determine the outgoing interface

forward the packet out that interface

Then, the next router on the path repeats and the packet travels along the path to the

destination

Router Architecture Overview

two key router functions: run routing algorithms/protocol

forwarding datagrams from incoming to outgoing link

router input ports switching

fabric

routing

processor

line card

line card

line card

line card

router output ports

Data Plane

Control Plane

line

termination

link layer

protocol (receive)

lookup,

forwarding

queueing

Input port functions

decentralized switching: given datagram destination, lookup output

port using forwarding table in input port memory (“match plus action”)

goal: complete input port processing at “line speed”

queuing: if datagrams arrive faster than forwarding rate into switch fabric

physical layer: bit-level reception

data link layer: e.g., Ethernet

switch fabric

Switching fabrics

transfer packet from input buffer to appropriate output buffer

switching rate: rate at which packets can be transfer from inputs to outputs often measured as multiple of input/output line rate

N inputs: switching rate N times line rate desirable

three types of switching fabrics

memory

memory

bus crossbar

Switching via memory

first generation routers: traditional computers with switching under direct

control of CPU

packet copied to system’s memory

speed limited by memory bandwidth (2 bus crossings per datagram)

input port

(e.g., Ethernet)

memory

output port

(e.g., Ethernet)

system bus

Switching via a bus

datagram from input port memory to output port memory via a shared bus

internal label is used to indicate output port

bus contention: switching speed limited by bus bandwidth

32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers

bus

Switching via interconnection nets

overcome bus bandwidth limitations

banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor

advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric.

Cisco 12000: switches 60 Gbps through the interconnection network

crossbar

Banyan networks

Output ports

buffering required when datagrams arrive from fabric faster than the transmission rate

scheduling discipline chooses among queued datagrams for transmission

line termination

link layer

protocol (send)

switch fabric

datagram

buffer

queueing

Output port queueing

buffering when arrival rate via switch exceeds output line speed

queueing (delay) and loss due to congestion and output port buffer overflow!

at t, packets more

from input to output

one packet time later

switch

fabric

switch

fabric

Input port queuing

fabric slower than input ports combined -> queueing may occur at input queues

delay and loss due to input buffer overflow! Head-of-the-Line (HOL) blocking: queued datagram

at front of queue prevents others in queue from moving forward

output port contention:

only one red packet can be transferred

(lower red packet is blocked)

switch

fabric

one packet time later:

green packet experiences

HOL blocking

switch

fabric

Routing Processor

“Control-Plane” software implementation of routing protocols

creation of forwarding table for the line cards

Handling of special data packets packets with IP options enabled

packets with expired Time-to-Live

Network management functions command-line interface (CLI) for configuration

transmission of measurement statistics

switching fabric

routing

processor

line card

line card

line card

line card

Routing vs. Forwarding

Routing: control plane Computing paths the packets will follow

Routers talking amongst themselves

Creating the forwarding tables

Forwarding: data plane Directing a data packet to an outgoing link

Using the forwarding tables

1 2 3

0111

value in arriving

packet’s header

routing algorithm

local forwarding table

header value output link

0100

0101

0111

1001

3

2

2

1

Interplay between routing and forwarding

Transmitted

(forwarded)

packet

buffers: arriving packets

dropped if no free space

queued packets

routing algorithm determines end-end-path through network

forwarding table determines local forwarding at this router

Router architecture overview two key router functions: run routing algorithms/protocol

forwarding datagrams from incoming to outgoing link

high-seed switching

fabric

routing processor

router input ports router output ports

forwarding data

plane (hardware)

routing, management

control plane (software)

forwarding tables computed,

pushed to input ports

Goal of routing

You may be able to send datagrams directly to the destination

If not, the router attempts to send datagrams to a router that is nearer the destination.

The goal of a routing protocol is very simple: Find a “good” path from source to destination.

Hosts often have default routers/gateways We can focus on the routing from the source router

to the destination router

Routing algorithm classification

global: all routers have complete topology, link cost infomation

“link state” algorithms/protocols

Dijkstra’s algorithm and OSPF protocol (RFC 2328)

decentralized: router knows costs to physically-connected neighbors

iterative process of computation, exchange of info with neighbors

“distance vector” algorithms/protocols

Bellman-Ford algorithm and RIP protocol (RFC 2453)

u

y x

w v

z 2

2 1

3

1

1

2

5 3

5

Graph: G = (N,E)

N = set of routers = { u, v, w, x, y, z }

E = set of links ={ (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }

Graph abstraction

Remark: Graph abstraction is useful in other network contexts

Example: P2P, where N is set of peers and E is set of TCP connections

Graph abstraction: costs

u

y x

w v

z 2

2 1

3

1

1

2

5 3

5 • c(x,x’) = cost of link (x,x’)

- e.g., c(w,z) = 5

• cost could always be 1, or inversely related to bandwidth, or inversely related to congestion

Cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)

Question: What’s the least-cost path between u and z ?

Routing algorithm: algorithm that finds least-cost path

A Link-State Routing Algorithm

Dijkstra’s algorithm net topology, link costs

known to all nodes accomplished via “link

state broadcast”

all nodes have same info

computes least cost paths from one node (source) to all others gives forwarding table

for that node

notation: c(x,y): link cost from

node x to y; = ∞ if not direct neighbors

D(v): current value of cost of path from source to destination v

N': set of nodes whose least cost path definitively known

w 3

4

v

x

u

5

3 7 4

y

8

z 2

7

9

Dijsktra’s algorithm

Step

N' D(v)

0

1

2

3

4

5

D(w) D(x) D(y) D(z)

u ∞ ∞ 7 3 5

uw ∞ 11 6 5

14 11 6 uwx

uwxv 14 10

uwxvy 12

notes: construct shortest

path tree by tracing predecessor nodes

ties can exist (can be broken arbitrarily)

uwxvyz

Distance Vector (DV) algorithm

distributed, iterative, and asynchronous receives info from directly attached neighbors

continues on until no more info is exchanged

nodes can update and send info asynchronously

Dx(y) = estimate of least cost from x to y x maintains distance vector Dx = [Dx(y): y є N ]

For node x: knows cost to each neighbor v: c(x,v)

also maintains its neighbors’DVs. For each neighbor v, x maintains Dv = [Dv(y): y є N ]

Distance Vector Algorithm

Bellman-Ford Equation (dynamic programming)

Define

dx(y) := cost of least-cost path from x to y

Then

dx(y) = min { c(x,v) + dv(y) }

v

cost to neighbor v

cost from neighbor

v to destination y min taken over all

neighbors v of x

The fact behind Bellman-Ford

dx(y) = min {c(x,v) + dv(y) }

dv(y) – shortest distance between v and y c(x,v) – cost between x and v N - number of neighbor of x

v

Bellman-Ford example

u

y x

w v

z

2

2

1 3

1

1

2

5 3

5

clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3

du(z) = min { c(u,v) + dv(z), c(u,x) + dx(z), c(u,w) + dw(z) }

= min { 2 + 5, 1 + 3, 5 + 3 } = 4

B-F equation says:

Practical contributions of B-F equation: node achieving minimum is next hop in the

shortest path, used in forwarding table

it suggests the form of the neighbor-to-neighbor communication used in the DV algorithm

key idea: from time-to-time, each node sends its own

distance vector estimate to neighbors

when x receives new DV estimate from neighbor, it updates its own DV using B-F equation:

Dx(y) ← minv { c(x,v) + Dv(y) } for each y ∊ N

under minor, natural conditions, the estimate Dx(y) converge to the actual least cost dx(y)

Distance vector algorithm

x y z

x

y

z

0 2 3

fro

m

cost to

x y z

x

y

z

0 2 7

fro

m

cost to

x y z

x

y

z

0 2 3

fro

m

cost to

x y z

x

y

z

0 2 3

from

cost to

x y z

x

y

z

0 2 7

fro

m

cost to

2 0 1

7 1 0

2 0 1

3 1 0

2 0 1

3 1 0

2 0 1

3 1 0

2 0 1

3 1 0

time

x y z

x

y

z

0 2 7

∞ ∞ ∞

∞ ∞ ∞

fro

m

cost to

fro

m

fro

m

x y z

x

y

z

0

x y z

x

y

z

∞ ∞

∞ ∞ ∞

cost to

x y z

x

y

z ∞ ∞ ∞

7 1 0

cost to

∞

2 0 1

∞ ∞ ∞

2 0 1

7 1 0

time

x z

1 2

7

y

node x table

Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)}

= min{2+0 , 7+1} = 2

Dx(z) = min{c(x,y) +

Dy(z), c(x,z) + Dz(z)}

= min{2+1 , 7+0} = 3

3 2

node y table

node z table

cost to

from

Real routing problems …

count-to-infinity problem (of distance vector algo): Dy(x) = 4, Dy(z) = 1, Dz(y) = 1, and Dz(x) = 5 link cost c(x,y) increases to 60 Dy(x) = min{c(y,x) + Dx(x), c(y,z) + Dz(x)} = min{60 + 0, 1 + 5} = 6 Dz(x) = min{50 + 0,1 + 6} = 7

oscillations problem (of link state algorithm): e.g., suppose link cost = amount of carried traffic:

x z

1 4

50

y 60

A

D

C

B

1 1+e

e 0

e

1 1

0 0

initially

A

D

C

B

given these costs, find new routing…. new costs

2+e 0

0 0

1+e 1

A

D

C

B


0 2+e

1+e 1

0 0

A

D

C

B


2+e 0

0 0

1+e 1

Routing is difficult in practice! Theory is simple, implementation is difficult:

Bellman-Ford algorithm 1-2 pages

RIP protocol, RFC 2453, 39 pages

Dijkstra’s algorithm 1-2 pages

OSPF protocol, RFC 2328, 245 pages

Protocols need to handle issues related to distributed systems in nature: Link failure

Inconsistent views

Routing loops

School of Computing National University of Singaporetbma/teaching/cs5229y14... · 2014-10-02 ·...

Documents

Transcript of School of Computing National University of Singaporetbma/teaching/cs5229y14... · 2014-10-02 ·...