Internet Protocols Fall 2005

47
Internet Protocols Fall 2005 Lectures 7-8 Andreas Terzis

Transcript of Internet Protocols Fall 2005

Page 1: Internet Protocols Fall 2005

Internet ProtocolsFall 2005

Lectures 7-8Andreas Terzis

Page 2: Internet Protocols Fall 2005

Outline

CS 349/Fall05 2

• Internet Protocol– Service Model– Fragmentation– Addressing

• Original addressing scheme• Subnetting• CIDR

– Forwarding– ICMP– ARP– Address Shortage

• NAT• IPv6

Page 3: Internet Protocols Fall 2005

IP Internet

CS 349/Fall05 3

• Concatenation of Networks

• Protocol Stack

R2

R1

H4

H5

H3H2H1

Network 2 (Ethernet)

Network 1 (Ethernet)

H6

Network 4(point-to-point)

H7 R3 H8

Network 3 (FDDI)

R1 R2 R3

H1 H8

ETH FDDI

IP

ETH

TCP

FDDI PPP PPP ETH

IP

ETH

TCP

IP IP IP

Page 4: Internet Protocols Fall 2005

Service Model

CS 349/Fall05 4

• Connectionless (datagram-based)• Best-effort delivery (unreliable service)

– packets are lost– packets are delivered out of order– duplicate copies of a packet are delivered– packets can be delayed for a long time

• Datagram format

Version HLen TOS Length

Ident Flags Offset

TTL Protocol Checksum

SourceAddr

DestinationAddr

Options (variable) Pad(variable)

0 4 8 16 19 31

Data

App

Transport

Network

TCP / UDP

IP

Data Hdr

Data

TCP Segment

IP DatagramHdr

Link

Page 5: Internet Protocols Fall 2005

Fragmentation

CS 349/Fall05 5

Problem: A router may receive a packet larger than the maximum transmission unit (MTU) of the outgoing link.

Source Destination

AMTU=1500 bytes

BMTU=1500 bytesEthernet

MTU<1500 bytesR1 R2

Solution: R1 fragments the IP datagram into mutiple, self-contained datagrams.

Data HDR (ID=x)

Data HDR (ID=x) Data HDR (ID=x) Data HDR (ID=x)

Offset>0More Frag=0

Offset=0More Frag=1

Page 6: Internet Protocols Fall 2005

Fragmentation (II)

CS 349/Fall05 6

• Fragments are re-assembled by the destination host; not by intermediate routers.

• To avoid fragmentation, hosts commonly use path MTU discovery to find the smallest MTU along the path.

• Path MTU discovery involves sending various size datagramsuntil they do not require fragmentation along the path.

• Most links use MTU>=1500bytes today. • Try: traceroute –f www.mit.edu 1500 and

traceroute –f www.mit.edu 1501• (DF=1 set in IP header; routers send “ICMP” error message,

which is shown as “!F”).

Page 7: Internet Protocols Fall 2005

Global Addresses

CS 349/Fall05 7

• Properties– globally unique– hierarchical: network + host

• Dot Notation– 10.3.2.4– 128.96.33.81– 192.12.69.77

Network Host

7 24

0(a)

Network Host

14 16

1 0(b)

Network Host

21 8

1 1 0(c)

Page 8: Internet Protocols Fall 2005

Datagram Forwarding

CS 349/Fall05 8

• Strategy– every datagram contains destination’s address– if connected to destination network, then forward

to host– if not directly connected, then forward to some

router– forwarding table maps network number into next

hop– each host has a default router– each router maintains a forwarding table

• Example (R2)

Network Number Next Hop

1 R3

2 R1

3 interface 1

4 interface 0

R2

R1

H4

H5

H3H2H1

Network 2 (Ethernet)

Network 1 (Ethernet)

H6

Network 4(point-to-point)

H7 R3 H8

Network 3 (FDDI)

Page 9: Internet Protocols Fall 2005

IP Addressing

CS 349/Fall05 9

• Problem:– Address classes were too “rigid”. For most organizations,

Class C were too small and Class B too big. Led to very inefficient use of address space, and a shortage of addresses.

– Organizations with internal routers needed to have a separate (Class C) network ID for each link.

– And then every other router in the Internet had to know about every network ID in every organization, which led to large address tables.

– Small organizations wanted Class B in case they grew to more than 255 hosts. But there were only about 16,000 Class B network IDs.

Page 10: Internet Protocols Fall 2005

IP Addressing

CS 349/Fall05 10

• Two solutions were introduced:• Subnetting is used within an organization to

subdivide the organization’s network ID.• Classless Interdomain Routing (CIDR) was

introduced in 1993 to provide more efficient and flexible use of IP address space across the whole Internet.

• CIDR is also known as “supernetting”because subnetting and CIDR are basically the same idea.

Page 11: Internet Protocols Fall 2005

Subnetting

CS 349/Fall05 11

• Add another level to address/routing hierarchy: subnet

• Subnet masks define variable partition of host part• Subnets visible only within site

Network number Host number

Class B address

Subnet mask (255.255.255.0)

Subnetted address

111111111111111111111111 00000000

Network number Host IDSubnet ID

Page 12: Internet Protocols Fall 2005

Subnet Example

CS 349/Fall05 12

Forwarding table at router R1Subnet Number Subnet Mask Next Hop128.96.34.0 255.255.255.128 interface 0128.96.34.128 255.255.255.128 interface 1128.96.33.0 255.255.255.0 R2

Subnet mask: 255.255.255.128Subnet number: 128.96.34.0

128.96.34.15128.96.34.1

H1 R1

128.96.34.130 Subnet mask: 255.255.255.128Subnet number: 128.96.34.128

128.96.34.129128.96.34.139

R2H2

128.96.33.1128.96.33.14

Subnet mask: 255.255.255.0Subnet number: 128.96.33.0

H3

Page 13: Internet Protocols Fall 2005

Forwarding Algorithm

CS 349/Fall05 13

D = destination IP addressfor each entry (SubnetNum, SubnetMask, NextHop)

D1 = SubnetMask & Dif D1 = SubnetNum

if NextHop is an interfacedeliver datagram directly to D

elsedeliver datagram to NextHop

• Use a default router if nothing matches• Not necessary for all 1s in subnet mask to be contiguous • Can put multiple subnets on one physical network• Subnets not visible from the rest of the Internet

Page 14: Internet Protocols Fall 2005

Classless Interdomain Routing (CIDR)Addressing

CS 349/Fall05 14

The IP address space is broken into line segments.Each line segment is described by a prefix.A prefix is of the form x/y where x indicates the prefix of all addresses in the line segment, and y indicates the length of the segment.e.g. The prefix 128.9/16 represents the line segment containing addresses in the range: 128.9.0.0 … 128.9.255.255.

0 232-1

128.9/16

128.9.0.0

216

142.12/1965/8

128.9.16.14

Page 15: Internet Protocols Fall 2005

CS 349/Fall05 15

Classless Interdomain Routing (CIDR)Addressing

0 232-1

128.9/16

128.9.16.14

128.9.16/20 128.9.176/20

128.9.19/24

128.9.25/24

Most specific route = “longest matching prefix”

Page 16: Internet Protocols Fall 2005

Classless Interdomain Routing (CIDR)Addressing

CS 349/Fall05 16

Prefix aggregation:If a service provider serves two organizations with prefixes, itcan (sometimes) aggregate them to form a larger prefix. Other routers can refer to this larger prefix, and so reduce the size of their address table.E.g. ISP serves 128.9.14.0/24 and 128.9.15.0/24, it can tell other routers to send it all packets belonging to the prefix 128.9.14.0/23.

ISP Choice:In principle, an organization can keep its prefix if it changes service providers.

Page 17: Internet Protocols Fall 2005

Hierarchical addressing: route aggregation

CS 349/Fall05 17

Hierarchical addressing allows efficient advertisement of routing information:

“Send me anythingwith addresses beginning 200.23.16.0/20”

200.23.16.0/23

200.23.18.0/23

200.23.30.0/23

Fly-By-Night-ISP

Organization 0

Organization 7Internet

Organization 1

ISPs-R-Us “Send me anythingwith addresses beginning 199.31.0.0/16”

200.23.20.0/23Organization 2

...

...

Page 18: Internet Protocols Fall 2005

Hierarchical addressing: route aggregation

CS 349/Fall05 18

200.23.16.0/23

200.23.18.0/23

200.23.30.0/23

Fly-By-Night-ISP

Organization 0

Organization 7Internet

Organization 1

ISPs-R-Us “Send me anything with addresses beginning 199.31.0.0/16”

200.23.20.0/23Organization 2

...

...

“Send me anything with addresses beginning 200.23.16.0/20”

Multi-homing“Send me anything with addresses beginning 199.31.0.0/16, or200.23.30.0/23 ”

Page 19: Internet Protocols Fall 2005

CS 349/Fall05 19

Size of the Routing Table at the core of the Internet

Source: http://bgp.potaroo.net/

Page 20: Internet Protocols Fall 2005

Prefix Length Distribution

CS 349/Fall05 20

0

10000

20000

30000

40000

50000

60000

70000

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32

Prefix Length

Num

ber

of P

refixe

s

Source: Geoff Huston, Oct 2001

Page 21: Internet Protocols Fall 2005

How a Router Forwards Datagrams

CS 349/Fall05 21

e.g. 128.9.16.14 => Port 2

R1

R2

R3

R4

12

3

128.17.20.1

128.17.16.1

128.9/16128.9.16/20

128.9.176/20

128.9.19/24128.9.25/24

142.12/19

65/8

Prefix Port3227213

128.17.14.1128.17.14.1

128.17.20.1

128.17.10.1128.17.14.1

128.17.16.1

128.17.16.1

Next-hop

Forwarding table

Page 22: Internet Protocols Fall 2005

Forwarding in an IP Router

CS 349/Fall05 22

• Lookup packet DA in forwarding table.– If known, forward to correct port.– If unknown, drop packet.

• Decrement TTL, update header Checksum.• Forward packet to outgoing interface.• Transmit packet onto link.

Question: How is the address looked up in a real router?

Page 23: Internet Protocols Fall 2005

Making a Forwarding DecisionClass-based addressing

CS 349/Fall05 23

IP Address Space

Class A Class B Class C D

Class A

Class B

Class C212.17.9.0 Port 4

ExactMatch

Routing Table:212.17.9.4

212.17.9.0

Exact Match: There are many well-known ways to find an exact match in a table.

Page 24: Internet Protocols Fall 2005

Lookup Performance Required

CS 349/Fall05 24

Line Line Rate Pkt-size=40B Pkt-size=240B

T1 1.5Mbps 4.68 Kpps 0.78 Kpps

OC3 155Mbps 480 Kpps 80 Kpps

OC12 622Mbps 1.94 Mpps 323 Kpps

OC48 2.5Gbps 7.81 Mpps 1.3 Mpps

OC192 10 Gbps 31.25 Mpps 5.21 Mpps

Page 25: Internet Protocols Fall 2005

Direct Lookup

CS 349/Fall05 25

IP Address

Address

MemoryD

ataNext-hop, Port

Problem: With 232 addresses, the memory would require 4 billion entries.

Page 26: Internet Protocols Fall 2005

Associative Lookups“Contents addressable memory” (CAM)

CS 349/Fall05 26

Advantages:• Simple

Disadvantages• Slow• High Power• Small• Expensive

AssociativeMemory or CAM

NetworkAddress

PortNumber

PortNumberSearch

Data

32Hit?

Page 27: Internet Protocols Fall 2005

Hashed Lookups

CS 349/Fall05 27

HashingFunction

Memory

Addr

ess

Dat

a

Search Data

log2N

AssociatedData

Hit?

Address{1632

Page 28: Internet Protocols Fall 2005

Lookups Using HashingAn example

CS 349/Fall05 28

Hashing Function16

#1 #2 #3 #4

#1 #2

#1 #2 #3Linked list of entrieswith same hash key.

Memory

Search Data

AssociatedData

Hit?32

Page 29: Internet Protocols Fall 2005

Lookups Using Hashing

CS 349/Fall05 29

Advantages:• Simple

• Expected lookup time can be small

Disadvantages• Non-deterministic lookup time

• Inefficient use of memory

Page 30: Internet Protocols Fall 2005

Patricia Tries

CS 349/Fall05 30

Example Prefixes:a) 00001b) 00010c) 00011d) 001e) 0101f) 011g) 100h) 1010i) 1100j) 11110000e

f g

h i

Skip 5j

0 1

a b c

d

Page 31: Internet Protocols Fall 2005

How to Implement a router

CS 349/Fall05 31

• Issues– Performance

• Throughput

– Scaling

Page 32: Internet Protocols Fall 2005

Workstation-Based

CS 349/Fall05 32

• Aggregate bandwidth– 1/2 of the I/O bus bandwidth – capacity shared among all hosts connected to router– example: 1Gbps bus can support 5 x 100Mbps ports (in theory)

• Packets-per-second– must be able to switch

small packets– 300,000 packets-per-

second is achievable– e.g., 64-byte packets

implies 155Mbps

I/O bus

Interface 1

Interface 2

Interface 3

CPU

Main memory

Page 33: Internet Protocols Fall 2005

Switching Hardware

CS 349/Fall05 33

• Design Goals– throughput (depends on traffic model)– scalability (a function of n)

• Ports– buffering (input and/or output)

• Fabric– as simple as possible– sometimes do buffering (internal)

Switchfabric

Controlprocessor

Outputport

Inputport

Page 34: Internet Protocols Fall 2005

LAN Addresses and ARP

CS 349/Fall05 34

32-bit IP address:• network-layer address• used to get datagram to destination IP network (recall IP

network definition)LAN (or MAC or physical or Ethernet) address: • used to get datagram from one interface to another

physically-connected interface (same network)• 48 bit MAC address (for most LANs)

“burned” in the adapter EPROM

Page 35: Internet Protocols Fall 2005

LAN Address (more)

CS 349/Fall05 35

• MAC address allocation administered by IEEE• manufacturer buys portion of MAC address space (to assure

uniqueness)• Analogy:

(a) MAC address: like Social Security Number(b) IP address: like postal address

• MAC flat address => portability – can move LAN card from one LAN to another

• IP hierarchical address NOT portable– depends on IP network to which node is attached

Page 36: Internet Protocols Fall 2005

ARP: Address Resolution Protocol

CS 349/Fall05 36

• Each IP node (Host, Router) on LAN has ARP table

• ARP Table: IP/MAC address mappings for some LAN nodes

< IP address; MAC address; TTL>

– TTL (Time To Live): time after which address mapping will be forgotten (typically 20 min)

Question: how to determineMAC address of Bknowing B’s IP address?

Page 37: Internet Protocols Fall 2005

ARP protocol

CS 349/Fall05 37

• A wants to send datagram to B, and A knows B’s IP address.

• Suppose B’s MAC address is not in A’s ARP table.

• A broadcasts ARP query packet, containing B's IP address – all machines on LAN receive

ARP query• B receives ARP packet, replies

to A with its (B's) MAC address– frame sent to A’s MAC address

(unicast)

• A caches (saves) IP-to-MAC address pair in its ARP table until information becomes old (times out) – soft state: information

that times out (goes away) unless refreshed

• ARP is “plug-and-play”:– nodes create their ARP

tables without intervention from net administrator

Page 38: Internet Protocols Fall 2005

ARP Protocol (II)

CS 349/Fall05 38

• Proxy ARP– Reply on behalf of another node

• ARP Highjacking– Node can “steal” packets destined to another node– Solutions

• Static ARP entries• Switched LANs• Lock MAC addresses to Switch ports

Page 39: Internet Protocols Fall 2005

ICMP

CS 349/Fall05 39

• Internet Control Message Protocol:– Used by a router/end-host to report some types of error:– E.g. Destination Unreachable: packet can’t be forwarded

to/towards its destination. – E.g. Time Exceeded: TTL reached zero, or fragment didn’t

arrive in time. Traceroute uses this error to its advantage.– An ICMP message is an IP datagram, and is sent back to the

source of the packet that caused the error.

Page 40: Internet Protocols Fall 2005

IP Address Shortage

CS 349/Fall05 40

• Global IPv4 addresses are getting depleted– Increase in number of hosts

• PCs, PDAs, cellphones, microwaves, etc– Address inefficiencies

• What to do– Get larger address space -> IPv6– Remove the assumption that address is globally unique

• Can reuse the same address multiple times -> NAT

Page 41: Internet Protocols Fall 2005

NAT: Network Address Translation

CS 349/Fall05 41

10.0.0.1

10.0.0.2

10.0.0.3

10.0.0.4

138.76.29.7

local network(e.g., home network)

10.0.0/24

rest ofInternet

Datagrams with source or destination in this networkhave 10.0.0/24 address for

source, destination (as usual)

All datagrams leaving localnetwork have same single source NAT IP address: 138.76.29.7,different source port numbers

Page 42: Internet Protocols Fall 2005

NAT: Network Address Translation

CS 349/Fall05 42

10.0.0.1

10.0.0.2

10.0.0.3

S: 10.0.0.1, 3345D: 128.119.40.186, 80

1

10.0.0.4

138.76.29.7

1: host 10.0.0.1 sends datagram to 128.119.40, 80

NAT translation tableWAN side addr LAN side addr138.76.29.7, 5001 10.0.0.1, 3345…… ……

S: 128.119.40.186, 80 D: 10.0.0.1, 3345 4

S: 138.76.29.7, 5001D: 128.119.40.186, 802

2: NAT routerchanges datagramsource addr from10.0.0.1, 3345 to138.76.29.7, 5001,updates table

S: 128.119.40.186, 80 D: 138.76.29.7, 5001 3

3: Reply arrivesdest. address:138.76.29.7, 5001

4: NAT routerchanges datagramdest addr from138.76.29.7, 5001 to 10.0.0.1, 3345

Page 43: Internet Protocols Fall 2005

NAT implementation

CS 349/Fall05 43

• NAT router must:– outgoing datagrams: replace (source IP address, port #) of every

outgoing datagram to (NAT IP address, new port #)• . . . remote clients/servers will respond using (NAT IP address, new

port #) as destination addr.– remember (in NAT translation table) every (source IP address, port

#) to (NAT IP address, new port #) translation pair– incoming datagrams: replace (NAT IP address, new port #) in dest

fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table

Page 44: Internet Protocols Fall 2005

NAT Problems

CS 349/Fall05 44

• Problems due to NAT– Increased network complexity, reduced

robustness– Cannot run services inside NAT (maybe)

• Address shortage should instead be solved by IPv6

Page 45: Internet Protocols Fall 2005

IPv6

CS 349/Fall05 45

• Motivation: 32-bit address space exhaustion• Take the opportunity for some clean-up

• IPv6 datagram format:– fixed-length 40 byte header– Address length changed from 32 bits to 128 bits– fragmentation fields moved out of base header– IP options moved out of base header

• Header Length field eliminated– Header Checksum eliminated– Type of Service field eliminated– Time to Live Hop Limit, Protocol Next Header– Precedence Priority, added Flow Label field– Length field excludes IPv6 header

Page 46: Internet Protocols Fall 2005

IPv6 header format

CS 349/Fall05 46

Destination Address (16 bytes)

Version Priority Flow LabelPayload Length Next Header Hop Limit

Source Address (16 bytes, 128 bits)

Version Hdr Len Total LengthIdentification Fragment Offset

Prec TOS

Time to Live Protocol Header ChecksumFlags

Source AddressDestination Address

PaddingOptions

32 bits

IPv4 header

Page 47: Internet Protocols Fall 2005

Transition From IPv4 To IPv6

CS 349/Fall05 47

• Not all routers can be upgraded simultaneously

• Two proposed approaches to allow the Internet operate with mixed IPv4 and IPv6 routers :

Dual Stack: some routers with dual stack (v6, v4) can “translate”between formats

Tunneling: IPv6 carried as payload in IPv4 packets among IPv4 routers