Post on 02-Jan-2016
Internet Protocol
ECS 152A
Xin Liu
Ref: slides by J. Kurose and K. Ross
Goals
• Principles of network layer services
• Internet Protocol
– Addressing
– Routing
– ICMP
Overview
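[Figure: TCP/IP protocol stack. User processes (e.g., HTTP, SNMP) sit above TCP and UDP, which sit above IP (with ICMP); ARP and RARP sit between IP and the hardware interface.]
Data unit at each layer:
• application: message
• transport: segment
• network: datagram
• link: frame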
Encapsulation
Demultiplexing
Network layer functions
• transport packet from sending to receiving hosts
• network layer protocols in every host, router
[Figure: the two end hosts run all five layers (application, transport, network, data link, physical); each router along the path runs only the network, data link, and physical layers.]
Functions
• path determination: route taken by packets from source to dest. Routing algorithms
• forwarding: move packets from router’s input to appropriate router output
• call setup: some network architectures require router call setup along path before data flows (not Internet)
Network service model
Q: What service model for transporting packets from sender to receiver?
• guaranteed bandwidth?
• preservation of inter-packet timing (no jitter)?
• loss-free delivery?
• in-order delivery?
• congestion feedback to sender?
Virtual circuit or datagram?
The most important abstraction provided by the network layer: the service abstraction
Virtual circuits
• call setup, teardown for each call before data can flow
• each packet carries VC identifier (not destination host ID)
• every router on source-dest path maintains “state” for each passing connection
• link, router resources (bandwidth, buffers) may be allocated to VC
– to get circuit-like perf.
“source-to-dest path behaves much like telephone circuit”
Virtual circuits: signaling protocols
• used to set up, maintain, and tear down VC
• used in ATM, frame-relay, X.25
• not used in today’s Internet
[Figure: signaling exchange between two end hosts across the network]
1. Initiate call
2. Incoming call
3. Accept call
4. Call connected
5. Data flow begins
6. Receive data
Datagram networks: the Internet model
• no call setup at network layer
• routers: no state about end-to-end connections
– no network-level concept of “connection”
• packets forwarded using destination host address
– packets between same source-dest pair may take different paths
[Figure: two end hosts exchanging datagrams across the network]
1. Send data
2. Receive data
VC vs. Datagram
• VC
– Guaranteed service
– Complexity
• Datagram
– Simple
– Best effort
Internet Protocol
• Functionality:
– Determine how to route packets from source to destination
– Hide the details of the physical network
– Unreliable, connectionless, datagram delivery
• To be mentioned:
– Addressing
– Routing
– ICMP
Encapsulation
[Figure: the original message at the source passes down through the Application, Transport, Network, and Link layers, and back up through the same layers at the destination.]
IP header
IP datagram format (each row is 32 bits wide):
ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | header checksum
32 bit source IP address
32 bit destination IP address
Options (if any)
data (variable length, typically a TCP or UDP segment)

Field notes:
• ver: IP protocol version number
• head. len: header length (bytes)
• type of service: “type” of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number of remaining hops (decremented at each router)
• upper layer: upper layer protocol to deliver payload to
• Options (if any): e.g. timestamp, record route taken, specify list of routers to visit
• 20 bytes of header overhead
IP Header
• Version: 4
• Header length: 4 bits, counted in 32-bit words; max 15x4=60 bytes
• TOS: 0 for normal service
• Total length: 16 bits, max 65535 bytes
• TTL: typically initialized to 32 or 64; decreased by one at each hop
• Protocol field: TCP, UDP, ICMP, IGMP, etc.
• Checksum: header only
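The field list above can be made concrete with a short sketch that decodes the fixed 20-byte part of an IPv4 header and checks the header-only checksum. This is an illustration under stated assumptions, not a reference implementation: the function names (`parse_ipv4_header`, `internet_checksum`, `build_sample_header`) and the sample field values are hypothetical, and options are ignored.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options ignored)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes")
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len_bytes": (ver_ihl & 0x0F) * 4,  # field counts 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,
        "header_checksum": checksum,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }

def build_sample_header() -> bytes:
    """Hypothetical sample: version 4, 20-byte header, TTL 64, protocol 6 (TCP)."""
    fields = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0, 64, 6, 0,
                         bytes([223, 1, 1, 1]), bytes([223, 1, 2, 2]))
    csum = internet_checksum(fields)  # computed with the checksum field set to 0
    return fields[:10] + struct.pack("!H", csum) + fields[12:]

if __name__ == "__main__":
    header = build_sample_header()
    print(parse_ipv4_header(header))
    print("checksum verifies:", internet_checksum(header) == 0)
```

Because the checksum covers the header only, a receiver can verify it by summing the 20 header bytes (checksum field included) and checking that the result is zero.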
IP Address
Class-based address (32 bits):
class  leading bits  network bits  host bits  address range
A      0             7             24         1.0.0.0 to 127.255.255.255
B      10            14            16         128.0.0.0 to 191.255.255.255
C      110           21            8          192.0.0.0 to 223.255.255.255
D      1110          28-bit multicast address 224.0.0.0 to 239.255.255.255
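As a small illustration of the class rules, the class of an address can be read off the leading bits of its first octet. The sketch below is a hedged example; the function name `address_class` is hypothetical, and anything outside classes A to D is simply reported as "other".

```python
def address_class(addr: str) -> str:
    """Return the classful address class from the leading bits of the first octet."""
    first = int(addr.split(".")[0])
    if first & 0x80 == 0x00:   # leading bit 0
        return "A"
    if first & 0xC0 == 0x80:   # leading bits 10
        return "B"
    if first & 0xE0 == 0xC0:   # leading bits 110
        return "C"
    if first & 0xF0 == 0xE0:   # leading bits 1110
        return "D"
    return "other"

if __name__ == "__main__":
    for a in ["10.0.0.1", "128.46.209.167", "223.1.1.1", "224.0.0.1"]:
        print(a, "-> class", address_class(a))
```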
IP addressing: CIDR
• Classful addressing:
– inefficient use of address space, address space exhaustion
– e.g., class B net allocated enough addresses for 65K hosts, even if only 500 hosts in that network
• CIDR: Classless InterDomain Routing
– network portion of address of arbitrary length
– address format: a.b.c.d/x, where x is # bits in network portion of address
Example: 200.23.16.0/23
11001000 00010111 0001000 | 0 00000000
network part (first 23 bits) | host part (last 9 bits)
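The a.b.c.d/x notation can be explored with Python's standard `ipaddress` module. The sketch below, using the 200.23.16.0/23 prefix from the slide, derives the mask, the number of host bits, and the block size; the helper names `split_cidr` and `contains` are hypothetical.

```python
import ipaddress

def split_cidr(cidr: str):
    """Return (netmask, host bits, number of addresses) for a CIDR prefix."""
    net = ipaddress.ip_network(cidr)
    host_bits = net.max_prefixlen - net.prefixlen
    return str(net.netmask), host_bits, net.num_addresses

def contains(cidr: str, addr: str) -> bool:
    """True if addr shares the network part of the given prefix."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network(cidr)

if __name__ == "__main__":
    print(split_cidr("200.23.16.0/23"))
    print(contains("200.23.16.0/23", "200.23.17.255"))
    print(contains("200.23.16.0/23", "200.23.18.0"))
```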
CIDR
• Network address: 200.23.16.0/23
– /23: length of the network mask (here 255.255.254.0)
• More efficient use of addresses
– Consider a network with 500 hosts
– Classful address: a class B address, wasting over 64K addresses
– CIDR: a network with /23
– One class B address can be used for 128 such networks using CIDR
• Routing difficulty
– Classful: only need the IP address to determine the network address
– CIDR: also need network mask information to determine the network address
– Longest match first
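The arithmetic behind the 500-host example can be checked directly. In this sketch the class B block 172.16.0.0/16 is a hypothetical stand-in (any /16 gives the same counts), and the function names are illustrative only.

```python
import ipaddress

def count_subnets(block: str, new_prefix: int) -> int:
    """Number of prefixes of length new_prefix that fit in the given block."""
    return sum(1 for _ in ipaddress.ip_network(block).subnets(new_prefix=new_prefix))

def usable_hosts(cidr: str) -> int:
    """Addresses in the block minus the all-zeros and all-ones host values."""
    return ipaddress.ip_network(cidr).num_addresses - 2

if __name__ == "__main__":
    class_b = "172.16.0.0/16"  # hypothetical class B block
    print("addresses in a class B:", ipaddress.ip_network(class_b).num_addresses)
    print("unused if only 500 hosts:", 2**16 - 500)
    print("usable hosts in a /23:", usable_hosts("172.16.0.0/23"))
    print("/23 networks per class B:", count_subnets(class_b, 23))
```

A /23 leaves 9 host bits, so it holds 512 addresses, enough for the 500 hosts, and a /16 splits into 2^(23-16) = 128 such blocks.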
IP routing
• IP address: 32-bit identifier for host, router interface
• interface: connection between host/router and physical link
– routers typically have multiple interfaces
– host may have multiple interfaces
– IP addresses associated with each interface
[Figure: three LANs joined by one router. LAN 223.1.1: hosts 223.1.1.1, 223.1.1.2, 223.1.1.3 and router interface 223.1.1.4. LAN 223.1.2: hosts 223.1.2.1, 223.1.2.2 and router interface 223.1.2.9. LAN 223.1.3: hosts 223.1.3.1, 223.1.3.2 and router interface 223.1.3.27.]
223.1.1.1 = 11011111 00000001 00000001 00000001
            (223)    (1)      (1)      (1)
Network address: 223.1.1.0/24
IP Routing
• IP address:
– network part (high order bits) is used for routing
– host part (low order bits) not used for routing
[Figure: network consisting of 3 IP networks (LANs 223.1.1, 223.1.2, 223.1.3) joined by a router; for IP addresses starting with 223, the first 24 bits are the network address]
Getting a datagram from source to dest.
IP datagram:
misc fields | source IP addr | dest IP addr | data
• datagram remains unchanged, as it travels source to destination
• addr fields of interest here
• Default router for all other networks
[Figure: the three-LAN network, with A = 223.1.1.1, B = 223.1.1.3, E = 223.1.2.2]
forwarding table in A:
Dest. Net.  next router  Nhops
223.1.1                  1
223.1.2     223.1.1.4    2
223.1.3     223.1.1.4    2
Others      223.1.1.4    x
Getting a datagram from source to dest.
Starting at A, send IP datagram addressed to B:
• look up net. address of B in forwarding table
• find B is on same net. as A
• link layer will send datagram directly to B inside link-layer frame
– B and A are directly connected
IP datagram:
misc fields | 223.1.1.1 | 223.1.1.3 | data
forwarding table in A:
Dest. Net.  next router  Nhops
223.1.1                  1
223.1.2     223.1.1.4    2
223.1.3     223.1.1.4    2
Getting a datagram from source to dest.
Starting at A, dest. E:
• look up network address of E in forwarding table
• E on different network
– A, E not directly attached
• routing table: next hop router to E is 223.1.1.4
• link layer sends datagram to router 223.1.1.4 inside link-layer frame
• datagram arrives at 223.1.1.4
• continued…
IP datagram:
misc fields | 223.1.1.1 | 223.1.2.2 | data
forwarding table in A:
Dest. Net.  next router  Nhops
223.1.1                  1
223.1.2     223.1.1.4    2
223.1.3     223.1.1.4    2
Getting a datagram from source to dest.
Arriving at 223.1.1.4, destined for 223.1.2.2
• look up network address of E in router’s forwarding table
• E on same network as router’s interface 223.1.2.9
– router, E directly attached
• link layer sends datagram to 223.1.2.2 inside link-layer frame via interface 223.1.2.9
• datagram arrives at 223.1.2.2!!! (hooray!)
IP datagram:
misc fields | 223.1.1.1 | 223.1.2.2 | data
forwarding table in router:
Dest. Net  router  Nhops  interface
223.1.1    -       1      223.1.1.4
223.1.2    -       1      223.1.2.9
223.1.3    -       1      223.1.3.27
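The two lookups walked through in these slides (at host A, then at the router) can be sketched as dictionary lookups keyed on the 24-bit network part. This is a simplified illustration that assumes every network here is a /24; the table and function names are hypothetical, while the addresses come from the slides.

```python
# Forwarding table in A: network -> next router (None means directly attached).
TABLE_A = {"223.1.1": None, "223.1.2": "223.1.1.4", "223.1.3": "223.1.1.4"}
DEFAULT_ROUTER_A = "223.1.1.4"  # the "Others" row

# Forwarding table in the router: network -> outgoing interface.
TABLE_ROUTER = {"223.1.1": "223.1.1.4", "223.1.2": "223.1.2.9", "223.1.3": "223.1.3.27"}

def network_of(addr: str) -> str:
    """Network part of an address, assuming a 24-bit prefix."""
    return addr.rsplit(".", 1)[0]

def next_hop_from_a(dest: str) -> str:
    """Where A hands the datagram: the destination itself or a router."""
    net = network_of(dest)
    if net not in TABLE_A:
        return DEFAULT_ROUTER_A
    hop = TABLE_A[net]
    return dest if hop is None else hop

def router_interface(dest: str):
    """Interface the router uses to deliver directly, or None if unknown."""
    return TABLE_ROUTER.get(network_of(dest))

if __name__ == "__main__":
    print("A -> B:", next_hop_from_a("223.1.1.3"))          # same network, direct
    print("A -> E:", next_hop_from_a("223.1.2.2"))          # via router
    print("router -> E via:", router_interface("223.1.2.2"))
```

Note that only the link-layer destination changes from hop to hop; the source and destination IP addresses in the datagram stay the same.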
CIDR Routing
200.23.16.0/23
11001000 00010111 0001000 | 0 00000000
network part (23 bits) | host part (9 bits)
200.23.0.0/17
11001000 00010111 0 | 0000000 00000000
network part (17 bits) | host part (15 bits)
CIDR routing: longest match first
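Because 200.23.16.0/23 lies inside 200.23.0.0/17, an address such as 200.23.17.5 matches both entries, and the longer prefix wins. The following is a minimal linear-scan sketch of longest-prefix matching; real routers use specialized data structures, and the output names `link-1` and `link-2` are hypothetical.

```python
import ipaddress

ROUTES = [
    ("200.23.16.0/23", "link-1"),  # more specific prefix
    ("200.23.0.0/17", "link-2"),   # less specific, covering prefix
]

def longest_prefix_match(dest: str, routes=ROUTES):
    """Return the output of the matching route with the longest prefix, or None."""
    addr = ipaddress.ip_address(dest)
    best_len, best_out = -1, None
    for cidr, out in routes:
        net = ipaddress.ip_network(cidr)
        if addr in net and net.prefixlen > best_len:
            best_len, best_out = net.prefixlen, out
    return best_out

if __name__ == "__main__":
    for d in ["200.23.17.5", "200.23.100.1", "10.0.0.1"]:
        print(d, "->", longest_prefix_match(d))
```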
IP Fragmentation & Reassembly
• network links have MTU (max. transfer size): largest possible link-level frame
– different link types, different MTUs
• large IP datagram divided (“fragmented”) within net
– one datagram becomes several datagrams
– “reassembled” only at final destination
– IP header bits used to identify, order related fragments
[Figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the destination]
IP Fragmentation and Reassembly
Example
• 4000 byte datagram
• MTU = 1500 bytes
One large datagram becomes several smaller datagrams:
            length  ID  fragflag  offset
original    4000    x   0         0
fragment 1  1500    x   1         0
fragment 2  1500    x   1         1480
fragment 3  1040    x   0         2960
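The numbers in this example follow from subtracting the 20-byte header: the 4000-byte datagram carries 3980 data bytes, and each 1500-byte fragment can carry 1480 of them. The sketch below reproduces the table; it assumes a 20-byte header with no options, and the function name `fragment` is hypothetical. The slide gives offsets in bytes; the header's fragment offset field actually stores the offset in 8-byte units, which the sketch reports separately as `offset_field`.

```python
def fragment(total_length: int, mtu: int, header_len: int = 20):
    """Split a datagram into fragments that fit the MTU.

    Returns a list of dicts with the fragment length (header included),
    the byte offset of its data, the more-fragments flag, and the value
    stored in the header's offset field (offset in 8-byte units).
    """
    payload = total_length - header_len
    max_data = (mtu - header_len) // 8 * 8  # data per fragment, multiple of 8
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_data, payload - offset)
        more = 1 if offset + size < payload else 0
        fragments.append({
            "length": size + header_len,
            "offset": offset,
            "fragflag": more,
            "offset_field": offset // 8,
        })
        offset += size
    return fragments

if __name__ == "__main__":
    for f in fragment(4000, 1500):
        print(f)
```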
IPv6
• Initial motivation: 32-bit address space projected to be completely allocated by 2008.
• Additional motivation:
– header format helps speed processing/forwarding
– header changes to facilitate QoS
– new “anycast” address: route to “best” of several replicated servers
• IPv6 datagram format:
– fixed-length 40 byte header
– no fragmentation allowed (by routers)
IPv6 Header (Cont)
• Priority: identify priority among datagrams in flow
• Flow Label: identify datagrams in same “flow” (concept of “flow” not well defined)
• Next header: identify upper layer protocol for data
Other Changes from IPv4
• Checksum: removed entirely to reduce processing time at each hop
• Options: allowed, but outside of header, indicated by “Next Header” field
• ICMPv6: new version of ICMP
– additional message types, e.g. “Packet Too Big”
– multicast group management functions
Transition From IPv4 To IPv6
• Not all routers can be upgraded simultaneously
– no “flag days”
– How will the network operate with mixed IPv4 and IPv6 routers?
• Two proposed approaches:
– Dual Stack: some routers with dual stack (v6, v4) can “translate” between formats
– Tunneling: IPv6 carried as payload in IPv4 datagram among IPv4 routers
Dual Stack Approach
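[Figure: path A - B - C - D - E - F. A, B, E, F are IPv6 nodes; C and D are IPv4 nodes.]
• A-to-B: IPv6 (Flow: X, Src: A, Dest: F, data)
• B-to-C: IPv4 (Src: A, Dest: F, data)
• D-to-E: IPv4 (Src: A, Dest: F, data)
• E-to-F: IPv6 (Flow: ??, Src: A, Dest: F, data)
The flow label is lost when the datagram is translated to IPv4.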
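Tunneling
Logical view: A - B ==tunnel== E - F (all IPv6)
Physical view: A - B - C - D - E - F (A, B, E, F are IPv6 nodes; C and D are IPv4 nodes)
• A-to-B: IPv6 (Flow: X, Src: A, Dest: F, data)
• B-to-C: IPv6 inside IPv4 (outer IPv4 header Src: B, Dest: E; inner IPv6 datagram Flow: X, Src: A, Dest: F, data)
• D-to-E: IPv6 inside IPv4 (outer IPv4 header Src: B, Dest: E; inner IPv6 datagram Flow: X, Src: A, Dest: F, data)
• E-to-F: IPv6 (Flow: X, Src: A, Dest: F, data)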
ICMP (Internet Control Message Protocol)
Type  Code  description                                     Query  Error
0     0     echo reply (ping)                               x
3     0     dest. network unreachable                              x
3     1     dest host unreachable                                  x
3     2     dest protocol unreachable                              x
3     3     dest port unreachable                                  x
3     6     dest network unknown                                   x
3     7     dest host unknown                                      x
4     0     source quench (congestion control - not used)          x
8     0     echo request (ping)                             x
9     0     route advertisement                             x
10    0     router discovery                                x
11    0     TTL expired                                            x
12    0     bad IP header                                          x
ICMP
IP datagram = IP header + ICMP message
ICMP message: 8-bit type | 8-bit code | 16-bit checksum, followed by contents that depend on type and code
Error message
• ICMP error message:
– ICMP header:
• type, code, checksum
– ICMP message
• IP header plus first 8 bytes of IP datagram causing error
• To prevent broadcast storm: do NOT generate an ICMP error message in response to
– ICMP error message
– Dest=IP broadcast address
– Link layer broadcast
– A fragment other than the first
– Source address not defined as a single host
Ping
• Basic connectivity test
• uses ICMP echo request/reply messages instead of UDP/TCP.
• Client/server paradigm
• Usually implemented in the kernel.
• “man ping”
Format
type (8 = request, 0 = reply) | code (0) | 16-bit checksum
identifier | sequence no.
Optional data
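The echo message layout can be illustrated by building a request in memory. The sketch below only constructs and verifies the bytes; it does not send anything (that would need a raw socket and elevated privileges). The function names and the identifier value are hypothetical. With 56 data bytes, the 8-byte ICMP header brings the message to 64 bytes.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, seq: int, payload: bytes = b"") -> bytes:
    """ICMP echo request: type 8, code 0, checksum, identifier, sequence no., data."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, seq)  # checksum = 0 for now
    csum = internet_checksum(header + payload)  # covers header and data
    return struct.pack("!BBHHH", 8, 0, csum, identifier, seq) + payload

if __name__ == "__main__":
    msg = build_echo_request(identifier=0x1234, seq=0, payload=b"\x00" * 56)
    print(len(msg), "bytes, checksum verifies:", internet_checksum(msg) == 0)
```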
Ping
bread% ping -s shannon.cs.ucdavis.edu
PING shannon.cs.ucdavis.edu: 56 data bytes
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=0. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=1. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=2. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=3. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=4. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=5. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=6. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=7. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=8. time=0. ms
64 bytes from shannon.cs.ucdavis.edu (169.237.6.199): icmp_seq=9. time=0. ms
…
----shannon.cs.ucdavis.edu PING Statistics----
30 packets transmitted, 30 packets received, 0% packet loss
round-trip (ms) min/avg/max = 0/0/0
Ping
bread% ping -s mark.ecn.purdue.edu
PING mark.ecn.purdue.edu: 56 data bytes
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=0. time=66. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=1. time=64. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=3. time=64. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=4. time=65. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=5. time=64. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=8. time=65. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=10. time=65. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=11. time=65. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=12. time=65. ms
64 bytes from mark.ecn.purdue.edu (128.46.209.167): icmp_seq=15. time=64. ms
^C
----mark.ecn.purdue.edu PING Statistics----
18 packets transmitted, 10 packets received, 44% packet loss
round-trip (ms) min/avg/max = 64/65/66
Traceroute
• By Van Jacobson
• See the route that IP datagrams follow
• Uses ICMP and TTL
– A router that gets an IP datagram with TTL 0 or 1 discards the packet and sends an ICMP “time exceeded” message back to the source.
– Source sends UDP datagrams with TTL values 1, 2, 3, ...
– Each IP packet carries a UDP datagram addressed to an unused port #; the destination replies with a “port unreachable” ICMP message.
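The TTL mechanism can be sketched without touching the network by simulating probes along a known path. This is a toy model under stated assumptions: the path is given as a list of router names followed by the destination, every node always replies, and the names `probe` and `simulate_traceroute` are hypothetical.

```python
def probe(path, ttl):
    """Send one probe with the given TTL along path (routers..., destination).

    Each router discards a datagram arriving with TTL <= 1 and reports
    "time exceeded"; otherwise it decrements the TTL and forwards. The
    destination answers "port unreachable" because the UDP port is unused.
    """
    *routers, dest = path
    for router in routers:
        if ttl <= 1:
            return router, "time exceeded"
        ttl -= 1
    return dest, "port unreachable"

def simulate_traceroute(path, max_hops=30):
    """Probe with TTL 1, 2, 3, ... until the destination answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        node, reply = probe(path, ttl)
        hops.append((ttl, node, reply))
        if reply == "port unreachable":
            break
    return hops

if __name__ == "__main__":
    for hop in simulate_traceroute(["R1", "R2", "R3", "D"]):
        print(hop)
```

The real tool sends several probes per TTL value and times each reply, which is why each line of traceroute output shows three round-trip times.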
Traceroute
bread% traceroute ector.cs.purdue.edu
traceroute: Warning: Multiple interfaces found; using 169.237.6.16 @ qfe0
traceroute to ector.cs.purdue.edu (128.10.2.10), 30 hops max, 40 byte packets
 1 169.237.5.254 (169.237.5.254) 0.594 ms 0.337 ms 0.298 ms
 2 169.237.246.238 (169.237.246.238) 0.533 ms 0.479 ms 0.474 ms
 3 128.120.2.49 (128.120.2.49) 0.547 ms 0.475 ms 0.475 ms
 4 core0.ucdavis.edu (128.120.0.30) 0.616 ms 0.671 ms 0.642 ms
 5 area0-area14p.ucdavis.edu (128.120.0.222) 0.570 ms 0.468 ms 0.821 ms
 6 area14p-border20.ucdavis.edu (128.120.0.250) 1.149 ms 0.691 ms 3.132 ms
 7 dc-oak-dc2--ucd-ge.cenic.net (137.164.24.225) 4.751 ms 2.434 ms 4.521 ms
 8 dc-oak-dc1--oak-dc2-ge.cenic.net (137.164.22.36) 2.394 ms 4.217 ms 2.452 ms
 9 dc-svl-dc1--oak-dc1-10ge.cenic.net (137.164.22.30) 201.245 ms 5.091 ms 183.393 ms
10 dc-sol-dc1--svl-dc1-pos.cenic.net (137.164.22.28) 13.421 ms 11.258 ms 11.155 ms
11 hpr-lax-hrp1--dc-lax-dc1-ge.cenic.net (137.164.22.13) 11.571 ms 14.390 ms 11.809 ms
12 abilene-LA--hpr-lax-gsr1-10ge.cenic.net (137.164.25.3) 13.431 ms 11.417 ms 11.289 ms
13 snvang-losang.abilene.ucaid.edu (198.32.8.95) 19.141 ms 20.516 ms 19.117 ms
14 kscyng-snvang.abilene.ucaid.edu (198.32.8.103) 54.300 ms 53.943 ms 53.998 ms
15 iplsng-kscyng.abilene.ucaid.edu (198.32.8.80) 64.783 ms 68.220 ms 63.659 ms
16 ul-abilene.indiana.gigapop.net (192.12.206.250) 63.567 ms 63.381 ms 63.025 ms
17 tel-210-m10-01-gp.tcom.purdue.edu (192.5.40.9) 65.017 ms * 64.982 ms
18 cs-2u01-c3550-01-242.tcom.purdue.edu (128.210.242.51) 65.527 ms 65.282 ms 65.083 ms
19 * ector.cs.purdue.edu (128.10.2.10) 65.528 ms *
NAT: Network Address Translation
[Figure: local network (e.g., home network) 10.0.0/24 with hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 and a NAT router (LAN side 10.0.0.4, WAN side 138.76.29.7) connected to the rest of the Internet]
• Datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)
• All datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
NAT: Network Address Translation
• Motivation: local network uses just one IP address as far as outside world is concerned:
– no need to be allocated range of addresses from ISP: just one IP address is used for all devices
– can change addresses of devices in local network without notifying outside world
– can change ISP without changing addresses of devices in local network
– devices inside local net not explicitly addressable, visible by outside world (a security plus).
NAT: Network Address Translation
Implementation: NAT router must:
– outgoing datagrams: replace (source IP address, port #) of every outgoing datagram with (NAT IP address, new port #)
. . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
– remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
– incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
NAT: Network Address Translation
[Figure: hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 behind a NAT router (LAN side 10.0.0.4, WAN side 138.76.29.7)]
1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   S: 10.0.0.1, 3345  D: 128.119.40.186, 80
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   S: 138.76.29.7, 5001  D: 128.119.40.186, 80
3: Reply arrives dest. address: 138.76.29.7, 5001
   S: 128.119.40.186, 80  D: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
   S: 128.119.40.186, 80  D: 10.0.0.1, 3345
NAT translation table
WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……
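The four steps of this example can be sketched as a small translation table keyed on (address, port) pairs. This is a simplified model: the class name `NatRouter`, its method names, and the choice to hand out ports sequentially starting at 5001 are assumptions for illustration, and entries never expire.

```python
class NatRouter:
    """Toy NAT: rewrites (IP address, port) pairs using a translation table."""

    def __init__(self, nat_ip: str, first_port: int = 5001):
        self.nat_ip = nat_ip
        self.next_port = first_port
        self.lan_to_wan = {}  # (LAN addr, port) -> (NAT addr, new port)
        self.wan_to_lan = {}  # (NAT addr, new port) -> (LAN addr, port)

    def outgoing(self, src, dst):
        """Replace the source of an outgoing datagram; remember the pair."""
        if src not in self.lan_to_wan:
            wan = (self.nat_ip, self.next_port)
            self.next_port += 1
            self.lan_to_wan[src] = wan
            self.wan_to_lan[wan] = src
        return self.lan_to_wan[src], dst

    def incoming(self, src, dst):
        """Replace the destination of an incoming datagram, or drop it (None)."""
        lan = self.wan_to_lan.get(dst)
        if lan is None:
            return None  # no table entry: not addressable from outside
        return src, lan

if __name__ == "__main__":
    nat = NatRouter("138.76.29.7")
    print(nat.outgoing(("10.0.0.1", 3345), ("128.119.40.186", 80)))   # step 2
    print(nat.incoming(("128.119.40.186", 80), ("138.76.29.7", 5001)))  # step 4
```

The `None` result for an unknown destination port mirrors the point that devices inside the local network are not explicitly addressable from outside.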
NAT: Network Address Translation
• 16-bit port-number field:
– about 60,000 simultaneous connections (2^16 = 65,536 port numbers) through a single NAT IP address!
• NAT is controversial:
– routers should only process up to layer 3
– violates end-to-end argument
• NAT possibility must be taken into account by app designers, e.g., P2P applications
– address shortage should instead be solved by IPv6