Introduction to Networking
Introduction to Networking
Internet: Example
• Click -> get page
• Specifies
– protocol (http)
– location (www.cnn.com)
Internet: Locating Resource
• www.cnn.com – name of a computer – Implicitly also a file
• Map name to IP address– DNS
host local comcnn.com? cnn.com?
a.b.c.d a.b.c.d
Internet: Connection
• Http sets up a connection (tcp) – between the host and cnn.com to transfer the page
• The connection transfers page as a byte stream – without errors: flow control + error control
Host www.cnn.com
Connect
OK
Get page
Page; close
Internet: End-to-end
• Byte stream flows end to end across many links/switches:
– routing (+ addressing)
• That stream is regulated and controlled by both ends:
– retransmission of erroneous or missing bytes; flow control
HOST
CNN.COM
end-to-end pacing and error control
routing
Internet: Packets
• The network transports bytes grouped into packets
• Packets are “self-contained”; routers handle them 1 by 1
• The end hosts worry about errors and pacing
– Destination sends ACKs; Source checks losses
C
HOST: B
CNN.COM: AA | B | # , CRC | bytes
B: toC
Internet: Bits
• Equipment in each node sends packets as a string of bits
• That equipment is not aware of the meaning of the bits
• Frames (packetizing) vs. streams
01011...011...110
Transmitter Physical Medium Receiver
01011...011...110
OpticalCopperWireless
Internet: Points to remember
• Separation of tasks– send bits on a link: transmitter/receiver [clock, modulation,…]
– send packet on each hop [framing, error detection,…]
– send packet end to end [addressing, routing]
– pace transmissions [detect congestion]
– retransmit erroneous or missing packets [acks, timeout]
– find destination address from name [DNS]
• Scalability– routers don’t know full path– names and addresses are hierarchical
Internet : Challenges
• Addressing ?• Routing ?• Reliable transmission ?• Interoperability ?• Resource management ?• Quality of service ?
Concepts at heart of the Internet
• Protocol• Layered Architecture• Packet Switching• Distributed Control• Open System
Protocol
• Two communicating entities must agree on:– Expected order and meaning of messages they exchange– The action to perform on sending/receiving a message
• Asking the time
Layered Architectures
• Human beings can handle lots of complexity in their protocol processing.– Ambiguously defined protocols– Many protocols all at once
• How computers manage complex protocol processing?– Specify well defined protocols to enact.– Decompose complicated jobs into layers;
• each has a well defined task
Layered Architectures
• Break-up design problem into smaller problems– More manageable
• Modular design: easy to extend/modify.• Difficult to implement
– careful with interaction of layers for efficiency
Layered Architecture
Web, e-mail, file transfer, ...
Reliable/ordered transmission, QOS,security, compression, ...
End-to-end transmission,resource allocation, routing, ...
Point-to-point links,LANs, radios, ...
Applications
Middleware
Routing
Physical Links
usersnetwork
The OSI Model
• The Open Systems Interconnection (OSI) model is a standard way of understanding the conceptual layers of network communication.
• This is a model; nobody builds systems exactly like this.
• Each level provides certain functions and guarantees, and communicates with the same level on remote nodes.
• A message is generated at the highest level and is passed down the levels, encapsulated by lower levels, until it is sent over the wire.
• On the destination, it makes its way up the layers until the high-level message reaches its high-level destination.
OSI Levels
Presentation
Transport
Network
Data Link
Physical
Application
Presentation
Transport
Network
Data Link
Physical
ApplicationNode A Node B
Network
OSI Levels
• Physical Layer: electrical details of bits on the wire
• Data Link: sending “frames” of bits and error detection
• Network Layer: routing packets to the destination
• Transport Layer: reliable transmission of messages, disassembly/assembly, ordering, retransmission of lost packets
• Session Layer: really part of transport, typically not implemented
• Presentation Layer: data representation in the message
• Application: high-level protocols (mail, ftp, etc.)
Internet protocol stack
HTTP, SMTP, FTP, TELNET, DNS, …
TCP, UDP.
IP
Point-to-point links,LANs, radios, ...
Application
Transport
Network
Physical
usersnetwork
Air travel
Ticket (purchase)
Baggage (check)
Gates (load)
Runway (take off)
Passenger Origin
Ticket (complain)
Baggage (claim)
Gates (unload)
Runway (landing)
Passenger Destination
Airplane routing
Protocol stack
e-mail client
TCP server
IP server
ethernetdriver/card
user X
SMTP
TCP
IP
e-mail server
TCP server
IP server
ethernetdriver/card
user Y
IEEE 802.3 standard
electric signals
English
Protocol interfaces
e-mail client
TCP server
IP server
ethernetdriver/card
user X
e-mail server
TCP server
IP server
ethernetdriver/card
user Y
s = open_socket();socket_write(s, buffer);…
Addressing
• Each network interface has a hardware address– Multiple interfaces multiple addresses
• Each application communicates via a port– Port is a logical connection endpoint
– Allows multiple local applications to use network resources
– Up to 65535
• < 1024: used by privileged applications
• 1024 to 49151: available for use
• 49152 to 65535: dynamic/private ports
– http ports 80 and 8080
– telnet 23, ftp 21, etc
• Think of a telephone network …
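The port ranges listed above can be sketched as a small lookup. This is only an illustration: the function name and the label strings ("privileged", "registered", "dynamic") are assumptions chosen for this example, not terms defined by the slides.

```python
def classify_port(port: int) -> str:
    """Return which of the three ranges a 16-bit port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("a port must fit in 16 bits (0-65535)")
    if port < 1024:
        return "privileged"   # e.g. http 80, telnet 23, ftp 21
    if port <= 49151:
        return "registered"   # available for general use, e.g. 8080
    return "dynamic"          # dynamic/private ports


if __name__ == "__main__":
    for p in (23, 80, 8080, 49152):
        print(p, classify_port(p))
```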
Addressing and Packet Format
• The ``Data'' segment contains higher level protocol information.
– Which protocol is this packet destined for?
– Which process is the packet destined for?
– Which packet is this in a sequence of packets?
– What kind of packet is this?
• This is the stuff of the OSI reference model.
Start (7 bytes)
Destination (6)
Source (6)
Length (2)
Msg Data (1500)
Checksum (4)
Ethernet packet dispatching
• An incoming packet comes into the Ethernet controller.
• The Ethernet controller reads it off the network into a buffer.
• It interrupts the CPU.
• A network interrupt handler reads the packet out of the controller into memory.
• A dispatch routine looks at the Data part and hands it to a higher level protocol.
• The higher level protocol copies it out into user space.
• A program manipulates the data.
• The output path is similar.
• Consider what happens when you send mail.
Example: MailHi Dad.
Hi Dad.
To: Dad
SrcAddr: 128.95.1.2DestAddr: 128.95.1.3SrcPort: 110, DestPort: 110Bytes: 1-20
Hi Dad.
To: Dad
SrcEther: 0xdeadbeefDestEther: 0xfeedfaceSrcAddr: 128.95.1.2DestAddr: 128.95.1.3SrcPort: 100DestPort: 200Bytes: 1-20
Hi Dad.
To: Dad
Mail Composition And Display
Mail Transport Layer
Network Transport Layer
Link Layer
Network
User
Kernel
Hi Dad.
Hi Dad.
To: Dad
SrcAddr: 128.95.1.2DestAddr: 128.95.1.3SrcPort: 110, DestPort: 110Bytes: 1-20
Hi Dad.
To: Dad
SrcEther: 0xdeadbeefDestEther: 0xfeedfaceSrcAddr: 128.95.1.2DestAddr: 128.95.1.3SrcPort: 100DestPort: 200Bytes: 1-20
Hi Dad.
To: Dad
Protocol encapsulation
e-mail client
TCP server
IP server
ethernetdriver/card
user X
e-mail server
TCP server
IP server
ethernetdriver/card
user Y“Hello”
“Hello”
“Hello”
“Hello”
“Hello”
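Encapsulation, as pictured above, can be sketched as each layer prepending its own header on the way down and stripping it on the way up. The bracketed text headers below are toy placeholders assumed for illustration; real TCP, IP, and Ethernet headers are binary structures.

```python
def encapsulate(payload: bytes, headers: list) -> bytes:
    """Wrap payload in headers; headers[0] is the innermost (highest) layer."""
    frame = payload
    for header in headers:
        frame = header + frame      # each lower layer prepends its header
    return frame


def decapsulate(frame: bytes, headers: list) -> bytes:
    """Strip the headers again, outermost (lowest layer) first."""
    for header in reversed(headers):
        if not frame.startswith(header):
            raise ValueError("unexpected header")
        frame = frame[len(header):]
    return frame


if __name__ == "__main__":
    stack = [b"[TCP]", b"[IP]", b"[ETH]"]   # toy headers, top layer first
    wire = encapsulate(b"Hello", stack)
    print(wire)                              # what travels on the link
    print(decapsulate(wire, stack))          # what user Y's client sees
```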
End-to-End Argument
• What function to implement in each layer?
• Saltzer, Reed, Clark 1984
– A function can be correctly and completely implemented only with the knowledge and help of applications standing at the communication endpoints
– Argues for moving function upward in a layered architecture
• Should the network guarantee packet delivery ?– Think about a file transfer program
– Read file from disk, send it, the receiver reads packets and writes them to the disk
End-to-End Argument
• If the network guaranteed packet delivery
– one might think that the applications would be simpler
• No need to worry about retransmits
– But need to check that the file was written to the remote disk intact
• A check is necessary if nodes can fail
– Consequently, applications need to perform their own retransmits
• No need to burden the internals of the network with properties that can, and must, be implemented at the periphery
End-to-End Argument
• An Occam’s razor for Internet design
– If there is a problem, the simplest explanation is probably the correct one
• Application-specific properties are best provided by the applications, not the network
– Guaranteed, or ordered, packet delivery, duplicate suppression, security, etc.
• The Internet performs the simplest packet routing and delivery service it can
– Packets are sent on a best-effort basis
– Higher-level applications do the rest
Two ways to handle networking
• Circuit Switching– What you get when you make a phone call– Dedicated circuit per call
• Packet Switching– What you get when you send a bunch of letters– Network bandwidth consumed only when sending– Packets are routed independently
Circuit Switching
• End-to-end resources reserved for “call”– Link bandwidth, switch capacity
– Dedicated resources: no sharing
– Circuit-like (guaranteed) performance
– Call setup required
Packet Switching
• Each end-to-end data stream divided into packets– User’s packets share network resources
• Compared to dedicated allocation
– Each packet uses full link bandwidth• Compared to dividing bandwidth into pieces
– Resources are used as needed• Compared to resource reservation
• Resource contention:– Aggregate demand can exceed amount available– Congestion: packets queue, wait for link use– Store and forward: packets move one hop at a time
• Transmit over link• Wait turn at next link
Routing
• Goal: move data among routers from source to dest.• Datagram packet network:
– Destination address determines next hop
– Routes may change during session
– Analogy: driving, asking directions
– No notion of call state
• Circuit-switched network:– Call allocated time slots of bandwidth at each link
– Fixed path (for call) determined at call setup
– Switches maintain lots of per call state: resource allocation
Packet vs. Circuit Switching
• Reliability: no congestion, in-order data in circuit-switch• Packet switching: better bandwidth use• State, resources: packet switching has less state
– Good: less control plane processing resources along the way
– More data plane (address lookup) processing
• Failure modes (routers/links down)– Packet switch reconfigures sub-second timescale
– Circuit switching: more complicated• Involves all switches in the path
A small Internet
A
V
R
B
W
r1,e1r2,e2
r3
a,e3
w,e5 b,e4
Scenario:A wants to send data to B.
Packet forwarding
HTTP
TCP
IP
ethernet
Host A
IP
eth
Router R
link
HTTP
TCP
IP
ethernet
Router W
Host B
IP
ethlink
The Link Layer
What is purpose of this layer?
• Physically encode bits on the wire• Link = pipe to send information
– E.g. point to point or broadcast
• Can be built out of:– Twisted pair, coaxial cable, optical fiber, radio waves, etc
• Links should only be able to send data– Could corrupt, lose, reorder, duplicate, (fail in other ways)
How to connect routers/machines?
• WAN/Router Connections– Commercial:
• T1 (1.5 Mbps), T3 (44 Mbps)• OC1 (51 Mbps), OC3 (155 Mbps)• ISDN (64 Kbps)• Frame Relay (1-100 Mbps, usually 1.5 Mbps)• ATM (some Gbps)
– To your home:• DSL• Cable
• Local Area:– Ethernet: IEEE 802.3 (10 Mbps, 100 Mbps, 1 Gbps, 10Gbps)– Wireless: IEEE 802.11 b/g/a (11 Mbps, 22 Mbps, 54 Mbps)
Link level Issues
• Encoding: map bits to analog signals• Framing: Group bits into frames (packets)• Arbitration: multiple senders, one resource• Addressing: multiple receivers, one wire
Encoding
• Map 1s and 0s to electric signals• Simple scheme: Non-Return to Zero (NRZ)
– 0 = low voltage, 1 = high voltage
• Problems:– How to tell an error? When jammed? When is bus idle?
– When to sample? Clock recovery is difficult.• Idea: Recover clock using encoding transitions
1 0 1 1 0
Manchester Encoding
• Used by Ethernet• Idea: Map 0 to low-to-high transition, 1 to high-to-low
• Plusses: can detect dead-link, can recover clock• Bad: reduce bandwidth, i.e. bit rate = ½ baud rate
– If wire can do X transition per second?
0 1 1 0
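A minimal sketch of Manchester encoding, using the mapping stated on this slide (0 as low-to-high, 1 as high-to-low). Polarity conventions differ between references, so treat the mapping as an assumption taken from the slide. Each bit becomes two half-bit signal levels, which is why the bit rate is half the baud rate.

```python
def manchester_encode(bits):
    """Map each bit to two half-bit levels: 0 -> low,high and 1 -> high,low."""
    signal = []
    for bit in bits:
        signal.extend((0, 1) if bit == 0 else (1, 0))
    return signal


def manchester_decode(signal):
    """Recover bits from pairs of levels; a pair with no transition is an error."""
    bits = []
    for i in range(0, len(signal), 2):
        pair = tuple(signal[i:i + 2])
        if pair == (0, 1):
            bits.append(0)
        elif pair == (1, 0):
            bits.append(1)
        else:
            raise ValueError("no mid-bit transition: dead link or corruption")
    return bits


if __name__ == "__main__":
    levels = manchester_encode([0, 1, 1, 0])
    print(levels)
    print(manchester_decode(levels))
```

The guaranteed mid-bit transition is what lets the receiver recover the clock and notice a dead link.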
Framing
• Why send packets?– Error control
• How do you know when to stop reading?– Sentinel approach: send start and end sequence
– For example, if sentinel is 11111
– 11111 00101001111100 11111 10101001 11111 010011 11111
– What if sentinel appears in the data?• map sentinel to something else, receiver maps it back
– Bit stuffing
Example: HDLC
• Same sentinel for begin and end: 0111 1110• packet format:
• Bit stuffing– Sender: If 5 1s then insert a 0
– Receiver: if 5 1s followed by a 0, remove 0
• Else read next bit
• Packet size now depends on the contents
0111 1110 header data CRC 0111 1110
0111 1110 0111 1101 0
0111 1101 0 0111 1110
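The stuffing rule above can be sketched on strings of '0' and '1' characters. This is an illustrative sketch that handles only the payload between the flags; the function names are assumptions, and a real HDLC implementation works on the bit stream in hardware.

```python
def stuff(bits: str) -> str:
    """Sender: after five consecutive 1s, insert a 0."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")   # stuffed bit, so the flag never appears in data
            run = 0
    return "".join(out)


def unstuff(bits: str) -> str:
    """Receiver: after five consecutive 1s, drop the 0 that follows."""
    out, run, expect_stuffed = [], 0, False
    for b in bits:
        if expect_stuffed:
            if b != "0":
                raise ValueError("six 1s in a row: flag or error, not data")
            expect_stuffed, run = False, 0
            continue
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            expect_stuffed = True
    return "".join(out)


if __name__ == "__main__":
    data = "01111110"             # payload that looks like the sentinel
    print(stuff(data))            # 011111010, as in the slide's example
    print(unstuff(stuff(data)))   # 01111110
```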
Arbitration
• One medium, multiple senders– What did we do for CPU, memory, readers/writers?
– New Problem: No centralized control
• Approaches– TDMA: Time Division Multiple Access
• Divide time into slots, round robin among senders• If you exceed the capacity do not admit more (busy signal)
– FDMA: Frequency Division Multiple Access (AMPS)• Divide spectrum into channels, give each sender a channel• If no more channels available, give a busy signal
– Good for continuous streams: fixed delay, constant data rate
– Bad for bursty Internet traffic: idle slots
Ethernet
• Developed in 1976, Metcalfe and Boggs at Xerox• Uses CSMA/CD:
– Carrier Sense Multiple Access with Collision Detection
• Easy way to connect LANs
Metcalfe’s Ethernet sketch
CSMA/CD
• Carrier Sense:– Listen before you speak
• Multiple Access:– Multiple hosts can access the network
• Collision Detection:– Can make out if someone else started speaking
Older Ethernet Frame
CSMA
Wait until
carrier free
CSMA/CD
Garbled signals
If the sender detects a collision, it will stop and then retry!
What is the problem?
CSMA/CD
Packet?
Sense Carrier
Discard Packet
Send Detect Collision
Jam channel b=CalcBackoff();
wait(b);attempts++;
No
Yes
attempts < 16
attempts == 16
Ethernet’s CSMA/CD (more)
Jam Signal: make sure all other transmitters are aware of collision; 48 bits
Exponential Backoff:
• Goal: adapt retransmission attempts to estimated current load
– heavy load: random wait will be longer
• first collision: choose K from {0,1}; delay is K x 512 bit transmission times
• after second collision: choose K from {0,1,2,3}…
• after ten or more collisions, choose K from {0,1,2,3,4,…,1023}
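The backoff rule above can be sketched as follows. The function and constant names are assumptions for illustration; the 512-bit slot, the cap at ten doublings, and the limit of 16 attempts come from the slides.

```python
import random

SLOT_BIT_TIMES = 512   # one backoff slot = 512 bit transmission times
MAX_ATTEMPTS = 16      # the packet is discarded after 16 attempts


def backoff_slots(collisions: int, rng=random) -> int:
    """Choose K uniformly from {0, ..., 2^n - 1}, with n capped at 10."""
    if collisions < 1:
        raise ValueError("backoff applies only after a collision")
    exponent = min(collisions, 10)
    return rng.randrange(2 ** exponent)


def backoff_bit_times(collisions: int, rng=random) -> int:
    """Delay before the next attempt, in bit transmission times."""
    return backoff_slots(collisions, rng) * SLOT_BIT_TIMES


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 15):
        print(n, backoff_bit_times(n))
```

Doubling the range after each collision makes the expected wait grow with load, without any station needing to know how many others are contending.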
Packet Size
• If packets are too small, the collision goes unnoticed
– Limit packet size
– Limit network diameter
• Use CRC to check frame integrity
– truncated packets are filtered out
Ethernet Problems
• What if there is a malicious user?– Might not use exponential backoff– Might listen promiscuously to packets
• Integrating Fast and Gigabit Ethernet
Addressing & ARP
• ARP is used to discover physical addresses• ARP = Address Resolution Protocol
“What is the physical address of the host named 128.84.96.89”
128.84.96.90128.84.96.89
128.84.96.91
“I’m at 1a:34:2c:9a:de:cc”
Addressing & RARP
• RARP is used to discover virtual addresses• RARP = Reverse Address Resolution Protocol
“I just got here. My physical address is 1a:34:2c:9a:de:cc. What’s my name ?”
128.84.96.90RARP Server
???
128.84.96.91
“Your name is 128.84.96.89”
Repeaters and Bridges
• Both connect LAN segments• Usually do not originate data• Repeaters (Hubs): physical layer devices
– forward packets on all LAN segments– Useful for increasing range– Increases contention
• Bridges: link layer devices– Forward packets only if meant on that segment– Isolates congestion– More expensive
Backbone Bridge
The Network Layer
Purpose of Network layer
• Given a packet, send it across the network to destination• 2 key issues:
– Portability: • connect different technologies
– Scalability• To the Internet scale
networkdata link
physical
networkdata link
physical
networkdata link
physical
networkdata link
physical
networkdata link
physical
networkdata link
physical
networkdata link
physical
networkdata link
physical
application
transport
networkdata link
physical
application
transport
networkdata link
physical
What does it involve?
Two important functions:• routing: determine path from source to dest. • forwarding: move packets from router’s input to
output
T1T3
Sts-1
T3
T1
Network service modelQ: What service model for
“channel” transporting packets from sender to receiver?
• guaranteed bandwidth?
• preservation of inter-packet timing (no jitter)?
• loss-free delivery?
• in-order delivery?
• congestion feedback to sender?? ??
virtual circuitor
datagram?
The most important abstraction provided
by network layer:
service abstraction
Which things can be “faked” at the transport layer?
Two connection models• Connectionless (or “datagram”):
– each packet contains enough information that routers can decide how to get it to its final destination
• Connection-oriented (or “virtual circuit”)– first set up a connection between two nodes– label it (called a virtual circuit identifier (VCI))– all packets carry label
BAb b
C
BA1 1
C
1
Virtual circuits: signaling protocols
• used to set up, maintain, and tear down a VC
• setup gives opportunity to reserve resources
• used in ATM, frame-relay, X.25
• not used in today’s Internet
application
transportnetwork
data linkphysical
application
transportnetwork
data linkphysical
1. Initiate call 2. incoming call
3. Accept call4. Call connected5. Data flow begins 6. Receive data
Virtual circuit switching• Forming a circuit:
– send a connection request from A to B; it contains a VCI + the address of B
– rule: VCI must be unique on the link it’s used on
– switch creates an entry mapping input messages with VCI to output port
– switch picks a new VCI unique between it and next switch
a b
25
21
c12
1
(Input link,VCI) (output link, new VCI)(1, 2) (?, ?)(1, 5) (?, ?)
Virtual circuit forwarding• For each VCI switch has a table which maps input link to
output link and gives the new VCI to use– if a’s messages come into switch 1 on link 2 and go out on link 3 then
the table will be:
a b
25
21
c12
1
Switch 1
Switch 2
Switch 3
Virtual Circuits: Discussion
• Plusses: easy to associate resources with VC– Easy to provide QoS guarantees (bandwidth, delay)– Very little state in packet
• Minuses:– Not good in case of crashes
• Requires explicit connect and teardown phases
– What if teardown does not get to all routers?– What if one switch crashes?
• Will have to teardown and rebuild route
Datagram networks• no call setup at network layer• routers: no state about end-to-end connections
– no network-level concept of “connection”
• packets typically routed using destination host ID– packets between same source-dest pair may take different paths
• Best effort: data corruption, packet drops, route loops
application
transportnetwork
data linkphysical
application
transportnetwork
data linkphysical
1. Send data 2. Receive data
Datagrams: ForwardingHow does packet get to the destination?• switch creates a “forwarding table”, mapping destinations to
output port (ignores input ports)• when a packet with a destination address in the table arrives, it
pushes it out on the appropriate output port• when a packet with a destination address not in the table
arrives, it must find out more routing information (next problem)
a b
c1d
22
00S1
S2
S3
1
01
Datagrams
• Plusses:– No round trip connection setup time
– No explicit route teardown
– No resource reservation each flow could get max bandwidth
– Easily handles switch failures; routes around it
• Minuses– Difficult to provide resource guarantees
– Higher per packet overhead
• Internet uses datagrams: IP (Internet Protocol)
Datagrams Forwarding
• How to build forwarding tables?– Manually enter it
• What if nodes crashed
• What about scale?
• The graph-theoretic routing problem– Given a graph, with vertices (switches), edges (links), and
edge costs (cost of sending on that link)– Find the least cost path between any two nodes
• Path cost = sum of (cost of edges in path)
Simple Routing Algorithm
• Choose a central node– All nodes send their (nbr, cost) information to this node
– Central node uses info to learn entire topology of the network
– It then computes shortest paths between all pairs of nodes• Using All Pair Shortest Path Algorithm
– Sends the new matrix to every node
• Nice, simple, elegant!• What is the problem?
– Scalability: centralization hurts scalability
– Central node is “crushed” with traffic
Link State Routing
• Basic idea:– Every node propagates its (nbr, cost) information
– This information at all nodes is enough to construct topology
– Can use a graph algorithm to find the shortest routes
• Mechanisms required:– Reliable flooding of link information
– Method to calculate shortest route (Dijkstra’s algorithm)
• Example link state update packet:– [node id, (nbr, cost) list, seq. no., ttl]
• Seq. no. to identify latest updates, ttl specifies when to stop msg.
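Once flooding has given a node the full topology, the shortest-route step can be sketched with Dijkstra's algorithm. The graph below is a hypothetical four-node topology invented for this example; it is not the network drawn on any slide.

```python
import heapq


def shortest_paths(graph, source):
    """Return the least cost from source to every reachable node.

    graph maps node -> {neighbor: link cost}, i.e. the (nbr, cost) lists
    collected from every node's link-state packets.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                      # stale queue entry
        for nbr, cost in graph.get(node, {}).items():
            candidate = d + cost
            if candidate < dist.get(nbr, float("inf")):
                dist[nbr] = candidate
                heapq.heappush(heap, (candidate, nbr))
    return dist


if __name__ == "__main__":
    topology = {                          # hypothetical example topology
        "A": {"B": 1, "C": 4},
        "B": {"A": 1, "C": 2},
        "C": {"A": 4, "B": 2, "D": 1},
        "D": {"C": 1},
    }
    print(shortest_paths(topology, "A"))
```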
Reliable flooding
receive(pkt)
    if already have a copy of LSP from pkt.ID
        if pkt’s sequence number <= copy’s
            discard pkt
        else
            decrement pkt.TTL
            replace copy with pkt
            forward pkt to all links besides the one that we received it on

# done every 10 minutes or so
gen_LSP()
    increment node’s sequence # by one
    recompute cost vector
    send created LSP to all neighbors
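The receive step above might look like this in Python. It is a sketch under stated assumptions: packets are plain dicts with hypothetical keys "id", "seq" and "ttl", a first-seen LSP is stored and forwarded like a newer one (the pseudocode leaves that case implicit), and a packet whose TTL reaches zero is stored but not forwarded.

```python
def receive(lsdb, pkt, in_link, links):
    """Process one link-state packet and return the links to forward it on.

    lsdb maps originating node id -> latest LSP seen from that node.
    """
    copy = lsdb.get(pkt["id"])
    if copy is not None and pkt["seq"] <= copy["seq"]:
        return []                              # old or duplicate: discard
    fresh = dict(pkt, ttl=pkt["ttl"] - 1)      # decrement TTL on a copy
    lsdb[pkt["id"]] = fresh                    # store or replace our copy
    if fresh["ttl"] <= 0:
        return []                              # TTL exhausted: stop here
    return [link for link in links if link != in_link]


if __name__ == "__main__":
    db = {}
    lsp = {"id": "A", "seq": 1, "ttl": 3, "neighbors": {"B": 1}}
    print(receive(db, lsp, "e1", ["e1", "e2", "e3"]))   # forwarded
    print(receive(db, lsp, "e2", ["e1", "e2", "e3"]))   # duplicate, dropped
```

The sequence-number check is what keeps the flood from circulating forever: each node forwards a given LSP at most once.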
Discussion: Link-State Routing• Plusses:
– Simple, determines the optimal route most of the time– Used by OSPF
• Minuses:– Might have oscillations
– Avoid using load as cost metric, reduce herding effect
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
Initially start withalmost equal routes
… everyone goes with least loaded
… recomputeLeast loaded =>
Most loaded
… recompute
Is our routing algorithm scalable?
• Route table size grows with size of network– Because our address structure is flat!
• Solution: have a hierarchical structure– Used by OSPF– Divide the network into areas, each area has unique number
• Nodes carry their area number in the address 1.A, 2.B, etc.
– Nodes know complete topology in their area– Area border routers (ABR) know how to get to any other
area
Hierarchical Addressing
1.a3.b
2.b11.b2
2
00S1
S2
S3
1
01
2.a
3
3.a
2
Zone 3
Zone 2
Forwarding table for switch 1Destination switch port
2. ?3. ? 1.b ? 1.a ?
IP has 2-layer addressing• Each IP address is 32 bits
– Network part: which network the host is on?– Host part: identifies the host.
• All hosts on same network have the same network part
• 3 classes of addresses: A, B and C
Example: 18.26.0.1 (32 bits: network part, then host part)
Class A: 0 | net (7 bits) | host (24 bits)
Class B: 10 | net (14 bits) | host (16 bits)
Class C: 110 | net (21 bits) | host (8 bits)
IP addressing
• The different classes (32-bit addresses):
Class  Leading bits  Fields             Range                          Use
A      0             network, host      1.0.0.0 to 127.255.255.255     Unicast
B      10            network, host      128.0.0.0 to 191.255.255.255   Unicast
C      110           network, host      192.0.0.0 to 223.255.255.255   Unicast
D      1110          multicast address  224.0.0.0 to 239.255.255.255   Multicast
E      1111          reserved           240.0.0.0 to 255.255.255.255   Reserved
• Problems: inefficient, address space exhaustion
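Because the class is fixed by the leading bits, it can be read off the first octet alone. A minimal sketch, with a function name assumed for this example:

```python
def address_class(addr: str) -> str:
    """Return the classful category (A-E) of a dotted-quad IPv4 address."""
    octets = [int(part) for part in addr.split(".")]
    if len(octets) != 4 or not all(0 <= o <= 255 for o in octets):
        raise ValueError("not a dotted-quad IPv4 address: " + addr)
    first = octets[0]
    if first < 128:
        return "A"    # leading bit 0
    if first < 192:
        return "B"    # leading bits 10
    if first < 224:
        return "C"    # leading bits 110
    if first < 240:
        return "D"    # leading bits 1110, multicast
    return "E"        # leading bits 1111, reserved


if __name__ == "__main__":
    for a in ("18.26.0.1", "128.84.96.89", "200.23.16.0", "224.0.0.1"):
        print(a, address_class(a))
```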
IP addressing: CIDR
• Classless InterDomain Routing
– network portion of address of arbitrary length
– address format: a.b.c.d/x, where x is # bits in network portion
– Examples:
• Class A: /8
• Class B: /16
• Class C: /24
200.23.16.0/23 = 11001000 00010111 0001000 | 0 00000000 (network part | host part)
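The /23 example can be checked with Python's standard ipaddress module. The helper names below are assumptions made for this sketch; the module calls themselves are standard library.

```python
import ipaddress


def describe(cidr: str) -> dict:
    """Summarize a CIDR block: netmask, network/host bit split, block size."""
    net = ipaddress.ip_network(cidr)
    return {
        "netmask": str(net.netmask),
        "network_bits": net.prefixlen,
        "host_bits": net.max_prefixlen - net.prefixlen,
        "addresses": net.num_addresses,
    }


def same_network(addr: str, cidr: str) -> bool:
    """True if addr shares the network portion of the CIDR block."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    print(describe("200.23.16.0/23"))
    print(same_network("200.23.17.42", "200.23.16.0/23"))
    print(same_network("200.23.18.1", "200.23.16.0/23"))
```

With 23 network bits there are 9 host bits, so the block spans 200.23.16.0 through 200.23.17.255.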
Internet Protocol Datagram
IP datagram format (each row is 32 bits):
ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | Internet checksum
32 bit source IP address
32 bit destination IP address
Options (if any)
data (variable length, typically a TCP or UDP segment)

• ver: IP protocol version number
• head. len: header length
• type of service: “type” of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number of remaining hops (decremented at each router)
• upper layer: upper layer protocol to deliver payload to
• Options: e.g. timestamp, record route taken, specify list of routers to visit
Datagram Portability
• IP Goal: To create one logical network from multiple physical networks– All intermediate routers should understand IP
– IP header information sufficient to carry the packet to destination
– Goal: Run over anything!
• Problem:– Physical networks have different MTUs
– “max. transmission unit”: 1500 for Ethernet, 48 for ATM
• Solution 1:– Fit everything in the MTU (!)
IP Fragmentation & Reassembly
• Solution 2: (the one used)– If packet size > MTU of network, then fragment into pieces
• Each fragment is less than MTU size• Each has IP headers + frag bit set + frag id + offset
– Packets may get refragmented on the way to destination
– Reassembly only done at the destination
– What is a good initial packet size?
fragmentation: in: one large datagramout: 3 smaller datagrams
reassembly
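The fragmentation arithmetic can be sketched as below. Assumptions for this example: a 20-byte IP header with no options, and a 4000-byte datagram crossing a 1500-byte MTU link (hypothetical numbers; the slide only says one large datagram becomes three smaller ones). Offsets are expressed in 8-byte units, as in the IP fragment offset field.

```python
IP_HEADER = 20   # bytes; assumes no IP options


def fragment(total_length: int, mtu: int) -> list:
    """Split a datagram of total_length bytes into fragments that fit the MTU."""
    payload = total_length - IP_HEADER
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - IP_HEADER) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments, sent = [], 0
    while sent < payload:
        size = min(max_data, payload - sent)
        fragments.append({
            "length": size + IP_HEADER,          # each fragment has its own header
            "offset": sent // 8,                 # in 8-byte units
            "more_fragments": sent + size < payload,
        })
        sent += size
    return fragments


if __name__ == "__main__":
    for frag in fragment(4000, 1500):
        print(frag)
```

The destination reassembles by sorting on offset and waiting for the fragment whose more-fragments flag is clear.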
Internet: Names and Addresses
Naming in the Internet
• What are named? All Internet Resources.– Objects: www.cs.cornell.edu/~einar
– Services: weather.yahoo.com/forecast
– Hosts: planetlab1.cs.cornell.edu
• Characteristics of Internet Names– human recognizable
– unique
– Persistent?
• Universal Resource Names (URNs)
Locating the resources
• Internet services and resources are provided by end-hosts– ex. www1.cs.cornell.edu and www2.cs.cornell.edu host Einar’s home
page.
• Names are mapped to Locations– Universal Resource Locators (URL)
– Embedded in the name itself: ex. weather.yahoo.com/forecast
• Semantics of Internet naming human recognizable
uniqueness
x persistent
Locating the Hosts?
• Internet Protocol Addresses (IP Addresses)– ex. planetlab1.cs.cornell.edu 128.84.154.49
• Characteristics of IP Addresses– 32 bit fixed-length
– enables network routers to efficiently handle packets in the Internet
• Locating services on hosts– port numbers (16 bit unsigned integer) 65536 ports
– standard ports: HTTP 80, FTP 21, SSH 22, Telnet 23
Mapping Not 1 to 1
• One host may map to more than one name – One server machine may be the web server (www.foo.com),
mail server (mail.foo.com)etc.
• One host may have more than one IP address– IP addresses are per network interface
• But IP addresses are generally unique!– two globally visible machines should not have the same IP
address– Anycast is an Exception:
• routers send packets dynamically to the closest host matching an anycast address
How to get a name?
• Naming in Internet is Hierarchical– decreases centralization– improves name space management
• First, get a domain name then you are free to assign sub names in that domain– How to get a domain name coming up
• Example: weather.yahoo.com belongs to yahoo.com which belongs to .com– regulated by global non-profit bodies
Domain name structure
ccTLDs
root (unnamed)
com milgovedu grorgnet fr ukus ......
cornell ustreas second level (sub-)domainslucent
gTLDs
gTLDs= Generic Top Level Domains ccTLDs = Country Code Top Level Domains
Top-level Domains (TLDs)
• Generic Top Level Domains (gTLDs)– .com - commercial organizations– .org - not-for-profit organizations– .edu - educational organizations– .mil - military organizations– .gov - governmental organizations– .net - network service providers– New: .biz, .info, .name, .xxx (nearly..)
• Country code Top Level Domains (ccTLDs)– One for each country
How to get a domain name?
• In 1998, a non-profit corporation, the Internet Corporation for Assigned Names and Numbers (ICANN), was formed to assume responsibility from the US Government
• ICANN authorizes other companies to register domains in com, org and net and new gTLDs
– Network Solutions is the largest; during the transitional period between the US Govt and ICANN it had sole authority to register domains in com, org and net
ICANN and politics..
• Why should a US company control Internet naming?
• Should companies (from whatever country) be able to profit from internet names?– 28th Aug 2006: “ICANN to allow domain registries to charge ‘what the market will bear’ for domain names & renewals”
How to get an IP Address?
• Answer 1: Normally, answer is get an IP address from your upstream provider– This is essential to maintain efficient routing!
• Answer 2: If you need lots of IP addresses then you can acquire your own block of them.
– IP address space is a scarce resource: you must prove you have fully utilized a small block before you can ask for a larger one, and pay $$ (Jan 2002 - $2250/year for a /20 and $18000/year for a /14)
How to get lots of IP Addresses? Internet Registries
• RIPE NCC (Réseaux IP Européens Network Coordination Centre) for Europe, Middle-East, Africa
• APNIC (Asia Pacific Network Information Centre) for Asia and Pacific
• ARIN (American Registry for Internet Numbers) for the Americas, the Caribbean, sub-Saharan Africa
Note: Once again regional distribution is important for efficient routing!
Can also get Autonomous System Numbers (ASNs) from these registries
Are there enough addresses?
• Unfortunately No!
– 32 bits -> about 4 billion unique addresses
– but addresses are assigned in chunks
– ex. Cornell has four chunks of /16 addresses
• ex. 128.84.0.0 to 128.84.255.255
• 128.253.0.0, 128.84.0.0, 132.236.0.0, and 140.251.0.0
• Expanding the address space!
– IPv6 -> 128 bit addresses
– difficult to deploy (requires cooperation and changes to the core of the Internet)
DHCP and NATs
• Dynamic Host Configuration Protocol (DHCP)
– lease IP addresses for short time intervals
– hosts may refresh addresses periodically
– only live hosts need valid IP addresses
• Network Address Translators
– Hide local IP addresses from rest of the world
– only a small number of IP addresses are visible outside
– solves address shortage for all practical purposes
– access is highly restricted
• ex. peer-to-peer communication is difficult
NATs in operation
• Translate addresses when packets traverse through NATs
• Use port numbers to increase number of supportable flows
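The translation step can be sketched as a table keyed by private (address, port) pairs. This is a simplified illustration, not how any particular NAT is built: the class name, the 10.0.0.x private addresses, the public address, and the starting port 50000 are all assumptions for the example.

```python
class Nat:
    """Toy port-translating NAT: many private hosts share one public address."""

    def __init__(self, public_ip: str, first_port: int = 50000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound_map = {}   # (private ip, private port) -> public port
        self.inbound_map = {}    # public port -> (private ip, private port)

    def outbound(self, src_ip: str, src_port: int):
        """Rewrite the source of an outgoing packet, creating a mapping if new."""
        key = (src_ip, src_port)
        if key not in self.outbound_map:
            self.outbound_map[key] = self.next_port
            self.inbound_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound_map[key]

    def inbound(self, dst_port: int):
        """Find the private destination for a reply, or None if unsolicited."""
        return self.inbound_map.get(dst_port)


if __name__ == "__main__":
    nat = Nat("128.84.96.1")
    print(nat.outbound("10.0.0.2", 5000))
    print(nat.outbound("10.0.0.3", 5000))
    print(nat.inbound(50001))
    print(nat.inbound(60000))
```

The last lookup returning nothing is the reason peer-to-peer communication is difficult: an outside host cannot reach a private host that has not sent traffic first.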
DNS: Domain Name System
Domain Name System:
• distributed database implemented in hierarchy of many name servers
• application-layer protocol: hosts, routers, name servers communicate to resolve names (address/name translation)
– Note how a core Internet function is implemented as application-layer protocol
– complexity at network’s “edge”
DNS name serversName server: process
running on a host that processes DNS requests
local name servers:– each ISP, company has
local (default) name server– host DNS query first goes to
local name server
authoritative name server:– can perform name/address
translation for a specific domain or zone
How could we provide this service? Why not centralize DNS?
• single point of failure• traffic volume• distant centralized database• maintenance
doesn’t scale!
• no server has all name-to-IP address mappings
Name Server Zone Structure
root
Structure based onadministrative issues.
lucent
com miledugov grorgnet fr ukus
ustreas
www
irs Zone: subtree with commonadministration authority.
Name Servers (NS)
root
cornelllucent
com ...edugov
ustreas
customs
www
irsIRS NS
Ustreas NSLucent NS
Root NS
Name Servers (NS)
• NSs are duplicated for reliability.
• Each domain must have a primary and secondary.
• Each host knows the IP address of the local NS.
• Each NS knows the IP addresses of all root NSs.
DNS: Root name servers
• contacted by local name server that can not resolve name
• root name server:– Knows the
authoritative name server for main domain
• ~ 60 root name servers worldwide– real-world
application of anycast
Simple DNS example
host surf.eurecom.fr wants IP address of www.cs.cornell.edu
1. Contacts its local DNS server, dns.eurecom.fr
2. dns.eurecom.fr contacts root name server, if necessary
3. root name server contacts authoritative name server, dns.cornell.edu, if necessary (what might be wrong with this?)
requesting hostsurf.eurecom.fr
www.cs.cornell.edu
root name server
authoritative name server dns.cornell.edu
local name serverdns.eurecom.fr
1
23
4
5
6
DNS exampleRoot name server:• may not know
authoritative name server
• may know intermediate name server: who to contact to find authoritative name server
requesting hostsurf.eurecom.fr
www.cs.cornell.edu
root name server
local name serverdns.eurecom.fr
1
23
4
authoritative name serverdns.cs.cornell.edu
intermediate name server
dns.cornell.edu
7
10
56
.edu name server
89
DNS Architecture
• Hierarchical Namespace Management
– domains and sub-domains
– distributed and localized authority
• Authoritative Nameservers
– server mappings for specific sub-domains
– more than one (at least two for failure resilience)
• Caching to mitigate load on root servers
– time-to-live (ttl) used to delete expired cached mappings
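The ttl-based caching above can be sketched as a small cache that forgets a mapping once its time-to-live has passed. This is only an illustration: the class name `DnsCache`, the injectable `clock`, and the address 192.0.2.10 are assumptions made for the example, not part of any real resolver.

```python
import time


class DnsCache:
    """Toy name -> address cache with per-entry expiry (hypothetical, not a real resolver)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be exercised without waiting
        self._entries = {}       # name -> (address, expiry time)

    def put(self, name, address, ttl):
        """Remember a mapping for ttl seconds."""
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        """Return the cached address, or None if it is absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired mapping: delete it
            return None
        return address
```

A local name server that gets a hit here answers without contacting the root servers at all, which is the load reduction the slide refers to.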
DNS: query resolution
iterated query:
• contacted server replies with name of server to contact
• “I don’t know this name, but ask this server”
• Takes burden off root servers
recursive query:
• puts burden of name resolution on contacted name server
• reduces latency
[Figure: requesting host surf.eurecom.fr, local name server dns.eurecom.fr, root name server, .edu name server, intermediate name server dns.cornell.edu, and authoritative name server dns.cs.cornell.edu for www.cs.cornell.edu; messages numbered 1 to 10, some labelled iterated query and others recursive query]
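As a rough sketch of an iterated query, the local name server below keeps following "ask this server" referrals until some server answers. The `SERVERS` table, the server label `edu-ns`, and the address 192.0.2.10 are invented for illustration; real servers hold resource records rather than Python dicts.

```python
# Hypothetical zone data: each server either answers a name or refers to another server.
SERVERS = {
    "root": {"referrals": {"edu": "edu-ns"}, "answers": {}},
    "edu-ns": {"referrals": {"cornell.edu": "dns.cornell.edu"}, "answers": {}},
    "dns.cornell.edu": {
        "referrals": {"cs.cornell.edu": "dns.cs.cornell.edu"},
        "answers": {},
    },
    "dns.cs.cornell.edu": {
        "referrals": {},
        "answers": {"www.cs.cornell.edu": "192.0.2.10"},
    },
}


def ask(server, name):
    """One query to one server: ("answer", ip), ("referral", next_server) or ("error", None)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return ("answer", data["answers"][name])
    # Refer to the server responsible for the longest matching domain suffix.
    matches = [s for s in data["referrals"] if name == s or name.endswith("." + s)]
    if not matches:
        return ("error", None)
    return ("referral", data["referrals"][max(matches, key=len)])


def resolve_iteratively(name, start="root"):
    """The local name server follows the referrals itself; returns (ip, servers contacted)."""
    server, contacted = start, []
    while True:
        contacted.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, contacted
        if kind == "error":
            return None, contacted
        server = value
```

In a recursive query the loop would instead run inside each contacted server, which is why it puts the burden on that server.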
DNS records: More than Name to IP Address
DNS: distributed db storing resource records (RR)
RR format: (name, value, type, ttl)
• Type=A
– name is hostname
– value is IP address
– One we’ve been discussing; most common
• Type=NS
– name is domain (e.g. foo.com)
– value is hostname of authoritative name server for this domain
• Type=CNAME
– name is an alias name for some “canonical” (the real) name
– value is canonical name
• Type=MX
– value is hostname of mailserver associated with name
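A small sketch of the (name, value, type, ttl) format and of following a CNAME to reach an A record. All records here are made up for the example (foo.com hosts and 192.0.2.x addresses); the NS value is written as the server's hostname, as in standard DNS, with a separate A record supplying its address.

```python
from collections import namedtuple

# Field order follows the RR format (name, value, type, ttl).
RR = namedtuple("RR", "name value type ttl")

# Hypothetical records for illustration only.
RECORDS = [
    RR("foo.com", "dns.foo.com", "NS", 86400),
    RR("dns.foo.com", "192.0.2.53", "A", 86400),
    RR("www.foo.com", "web1.foo.com", "CNAME", 3600),
    RR("web1.foo.com", "192.0.2.80", "A", 3600),
    RR("foo.com", "mail.foo.com", "MX", 3600),
]


def lookup(name, rtype):
    """Return the values of all records matching name and type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rtype]


def resolve_a(name, max_aliases=8):
    """Follow CNAME aliases until an A record is found (bounded to avoid alias loops)."""
    for _ in range(max_aliases):
        addresses = lookup(name, "A")
        if addresses:
            return addresses
        aliases = lookup(name, "CNAME")
        if not aliases:
            return []
        name = aliases[0]   # continue with the canonical name
    return []
```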
nslookup
• Use to query DNS servers (not telnet like with http – why?)
• Examples:
– nslookup www.yahoo.com
– nslookup www.yahoo.com dns.cs.cornell.edu
• specify which local nameserver to use
– nslookup -type=mx cs.cornell.edu
• specify record type
PTR Records
• Do reverse mapping from IP address to name
• Why is that hard? Which name server is responsible for that mapping? How do you find them?
• Answer: special root domain, arpa, for reverse lookups
Arpa top level domain
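[Figure: name tree with root; top-level domains com, mil, edu, gov, gr, org, net, fr, uk, us, arpa. Under arpa: in-addr, 128, 30, 33, 1, giving 1.33.30.128.in-addr.arpa. Under org: ietf, www, giving www.ietf.org.]
Want to know machine name for 128.30.33.1? Issue a PTR request for 1.33.30.128.in-addr.arpa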
Why is it backwards?
• Notice that 1.33.30.128.in-addr.arpa is written in order of increasing scope of authority, just like www.cs.foo.edu
• edu has the largest scope of authority; foo.edu less, down to the single machine www.cs.foo.edu
• arpa has the largest scope of authority; in-addr.arpa less, down to the single machine 1.33.30.128.in-addr.arpa (that is, 128.30.33.1)
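The reversal can be sketched in a few lines: split the address into octets, reverse them, and append in-addr.arpa. The helper name `reverse_name` is made up for this example; Python's standard `ipaddress` module offers the same result through the `reverse_pointer` attribute.

```python
import ipaddress


def reverse_name(ipv4):
    """Build the in-addr.arpa name used for a PTR lookup of a dotted-quad IPv4 address."""
    octets = str(ipaddress.IPv4Address(ipv4)).split(".")   # also validates the input
    return ".".join(reversed(octets)) + ".in-addr.arpa"
```

For the address on the slide, `reverse_name("128.30.33.1")` gives the name 1.33.30.128.in-addr.arpa.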
In-addr.arpa domain
• When an organization acquires a domain name, they receive authority over the corresponding part of the domain name space.
• When an organization acquires a block of IP address space, they receive authority over the corresponding part of the in-addr.arpa space.
• Example: Acquire domain berkeley.edu and acquire a class B IP Network ID 128.143
DNS protocol, messages
DNS protocol: query and reply messages, both with same message format
msg header
• identification: 16 bit # for query, reply to query uses same #
• flags:
– query or reply
– recursion desired
– recursion available
– reply is authoritative
– reply was truncated
DNS protocol, messages
• Name, type fields for a query
• RRs in response to query
• records for authoritative servers
• additional “helpful” info that may be used
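A sketch of packing and unpacking the message header with `struct`. The slides list the identification and the flags; the exact bit positions and the four 16-bit count fields (one per message section) are taken from the standard DNS header layout rather than from the slides, and the function names are hypothetical.

```python
import struct

# Flag bits within the 16-bit flags field (standard DNS header layout).
QR = 0x8000   # 0 = query, 1 = reply
AA = 0x0400   # reply is authoritative
TC = 0x0200   # reply was truncated
RD = 0x0100   # recursion desired
RA = 0x0080   # recursion available


def pack_header(ident, flags, qdcount=1, ancount=0, nscount=0, arcount=0):
    """Build the 12-byte header: id, flags, then one count per message section."""
    return struct.pack("!HHHHHH", ident, flags, qdcount, ancount, nscount, arcount)


def unpack_header(data):
    """Decode the first 12 bytes of a DNS message into a dict."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", data[:12])
    return {
        "id": ident,
        "is_reply": bool(flags & QR),
        "authoritative": bool(flags & AA),
        "truncated": bool(flags & TC),
        "recursion_desired": bool(flags & RD),
        "recursion_available": bool(flags & RA),
        "counts": (qd, an, ns, ar),
    }
```

A reply carries the same identification as its query, so a client can match the two by comparing the `id` fields.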
The Transport Layer
Purpose of this layer
• Interface end-to-end applications and protocols
– Turn best-effort IP into a usable interface
• Data transfer between processes:
– Compared to end-to-end IP
• We will look at 2:
– TCP
– UDP
[Figure: protocol stacks (application, transport, network, data link, physical) at the two end hosts, with network, data link, physical stacks at the nodes between them; logical end-end transport between the two end hosts]
UDP
• User Datagram Protocol
• Best effort data delivery between processes
– No frills, bare bones transport protocol
– Packet may be lost, out of order
• Connectionless protocol:
– No handshaking between sender and receiver
– Each UDP datagram handled independently
UDP Functionality
• Multiplexing/Demultiplexing
– Using ports
• Checksums (optional)
– Check for corruption
[Figure: processes P1, P2, P3, P4 exchanging messages M through the application, transport, and network layers; a segment is the application-layer data M plus a segment header (Ht), and the network layer adds Hn; one host is marked receiver]
Multiplexing/Demultiplexing
• Multiplexing:
– Gather data from multiple processes, envelope data with header
– Header has src port, dest port for multiplexing
• Why not process id?
• Demultiplexing:
– Separate incoming data in machine to different applications
– Demux based on sender addr, src and dest port
UDP segment format (each row 32 bits wide):
source port # | dest port #
length | checksum
Application data (message)
Length is in bytes of the UDP segment, including header.
Implementing Ports
• As a message queue
– Append incoming message to the end
– Much like a mailbox file
• If queue full, message can be discarded
• When application reads from socket
– OS removes some bytes from the head of the queue
• If queue empty, application blocks waiting
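The port-as-queue idea might look like the sketch below: one bounded queue per bound port, incoming datagrams appended at the tail, reads taken from the head. `PortTable` and its methods are hypothetical names, and for simplicity it demultiplexes on destination port only and returns None on an empty queue where a real socket read would block.

```python
from collections import deque


class PortTable:
    """Toy demultiplexer: one bounded message queue per bound destination port."""

    def __init__(self, max_queued=4):
        self.max_queued = max_queued
        self.queues = {}    # dest port -> deque of (src_addr, src_port, payload)

    def bind(self, port):
        """An application starts listening on a port."""
        self.queues[port] = deque()

    def deliver(self, src_addr, src_port, dest_port, payload):
        """Demultiplex one incoming datagram; return False if it was discarded."""
        queue = self.queues.get(dest_port)
        if queue is None or len(queue) >= self.max_queued:
            return False    # nobody bound to the port, or queue full: discard
        queue.append((src_addr, src_port, payload))
        return True

    def receive(self, port):
        """Remove and return the message at the head of the queue, or None if empty."""
        queue = self.queues[port]
        return queue.popleft() if queue else None
```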
UDP Checksum
• Over the headers and data
– Ensures integrity end-to-end
– 1’s complement sum of segment contents
• Is optional in UDP
• If checksum is non-zero, and receiver computes another value:
– Silently drop the packet, no error message generated
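A minimal sketch of the 1's complement sum over 16-bit words, with the carry folded back in. It covers only the bytes it is given; a real UDP checksum also covers a pseudo-header with the IP addresses, which is left out here, and the function names are illustrative.

```python
def ones_complement_sum(data):
    """16-bit 1's complement sum of the data, taken as big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"    # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold any carry back into the sum
    return total


def internet_checksum(data):
    """The checksum is the 1's complement (bit inversion) of the sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def verify(data, checksum):
    """Receiver side: recompute and compare; on a mismatch the packet is dropped."""
    return internet_checksum(data) == checksum
```

For example, the eight bytes 00 01 f2 03 f4 f5 f6 f7 sum to 0xddf2, so their checksum is 0x220d.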
UDP Discussion
• Why UDP?
– No delay in connection establishment
– Simple: no connection state
– Small header size
– No congestion control: can blast packets
• Uses:
– Streaming media, DNS, SNMP
– Could add application specific error recovery
TCP
• Transmission Control Protocol
– Reliable, in-order, process-to-process, two-way byte stream
• Different from UDP
– Connection-oriented
– Error recovery: Packet loss, duplication, corruption, reordering
• A number of applications require this guarantee
– Web browsers use TCP
Handling Packet Loss
[Figure: timeline between sender and receiver; the sender sends a message]
There are a number of reasons why the packet may get lost: router congestion, lossy medium, etc.
How does sender know of a successful packet send?
Lost Acks
[Figure: timeline between sender and receiver showing message, ack, and the sender's timeout]
What if packet/ack is lost?
Delayed ACKs
[Figure: timeline between sender and receiver showing message, ack, timeout, and a second copy of the message]
What will happen here? Due to congestion, small timeout, …
Delayed ACKs lead to duplicate packets
Delayed ACKs
[Figure: timeline between sender and receiver showing m1 sent twice, m2, two acks, and two timeouts]
How to solve this scenario?
Insertion of Packets
[Figure: timeline between sender and receiver showing m1, ack1, m2, ack2, and a stray m2’]
m2’ could be from an old expired session!
Message Identifiers
• Each message has <message id, session id>
– Message id: uniquely identifies message in sender
– Session id: unique across sessions
• Message ids detect duplication, reordering
• Session ids detect packet from old sessions
• TCP’s sequence number has similar functionality:
– Initial number chosen randomly
– Unique across packets
– Incremented by length of data bytes
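A sketch of how a receiver could use the two identifiers: anything from another session is ignored, and a repeated message id is acknowledged again but not delivered twice. The `Receiver` class and the tuple used as an ack are assumptions for illustration, not TCP's actual mechanism.

```python
class Receiver:
    """Toy receiver that filters on <message id, session id>."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.seen = set()       # message ids already delivered in this session
        self.delivered = []     # payloads handed to the application, in arrival order

    def on_message(self, session_id, message_id, payload):
        """Return the ack to send back, or None for a packet from another session."""
        if session_id != self.session_id:
            return None                        # stale packet from an old session
        if message_id not in self.seen:        # first copy: deliver it
            self.seen.add(message_id)
            self.delivered.append(payload)
        return ("ack", message_id)             # duplicates are re-acked, not re-delivered
```

Re-acking a duplicate matters: the duplicate usually means the earlier ack was lost or late, so the sender is still waiting for one.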
TCP Packets
TCP segment format (each row 32 bits wide):
source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | rcvr window size
checksum | ptr urgent data
Options (variable length)
application data (variable length)
• sequence and acknowledgement numbers: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• rcvr window size: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
TCP Connection Establishment
[Figure: sender sends (open, seq # x); receiver replies (ack x, seq # y); sender replies (ack y)]
TCP is connection-oriented. Starts with a 3-way handshake. Protects against duplicate SYN packets.
TCP Usage
[Figure: sender and receiver timeline: (open, seq # x); (ack x, seq # y); (ack y) with Data; Data, ACK; Fin, ACK; Fin, ACK]
TCP timeouts
• What is a good timeout period?
– Want to improve throughput without unnecessary transmissions
• Timeout is thus a function of RTT and deviation
NewAverageRTT = (1 - α) OldAverageRTT + α LatestRTT
NewAverageDev = (1 - α) OldAverageDev + α LatestDev
where LatestRTT = (ack_receive_time - send_time), LatestDev = |LatestRTT - AverageRTT|, and α = 1/8, typically.
Timeout = AverageRTT + 4*AverageDev
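The update rules can be tried out with a small estimator that uses a single smoothing weight of 1/8 for both averages, as on the slide. Two choices here are assumptions the slide leaves open: the deviation is measured against the old average, and the average deviation starts at zero. The class name is hypothetical.

```python
class RttEstimator:
    """Running averages of RTT and deviation, combined into a timeout."""

    def __init__(self, first_rtt, alpha=1 / 8):
        self.alpha = alpha
        self.avg_rtt = first_rtt
        self.avg_dev = 0.0      # assumption: no deviation known before the second sample

    def update(self, latest_rtt):
        """Fold in one new RTT sample and return the new timeout."""
        latest_dev = abs(latest_rtt - self.avg_rtt)   # assumption: uses the old average
        self.avg_rtt = (1 - self.alpha) * self.avg_rtt + self.alpha * latest_rtt
        self.avg_dev = (1 - self.alpha) * self.avg_dev + self.alpha * latest_dev
        return self.timeout()

    def timeout(self):
        """Timeout = AverageRTT + 4 * AverageDev."""
        return self.avg_rtt + 4 * self.avg_dev
```

Starting from an average of 100 ms, one sample of 180 ms moves the average to 110 ms and the deviation to 10 ms, so the timeout becomes 150 ms: a single slow sample widens the timeout far more than it shifts the average.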
TCP Windows
• Multiple outstanding packets can increase throughput
TCP Windows
• Can have more than one packet in transit
• Especially over fat pipes, e.g. satellite connection
• Need to keep track of all packets within the window
• Need to adjust window size
[Figure: sender transmits DATA id=17, id=18, id=19, id=20 back to back; receiver returns ACK 17, ACK 18, ACK 19, ACK 20]
TCP Congestion Control
• TCP increases its window size when no packets dropped
• It halves the window size when a packet drop occurs
– A packet drop is evident from the acknowledgements
• Therefore, it slowly builds to the max bandwidth, and hovers around the max
– It doesn’t achieve the max possible though
– Instead, it shares the bandwidth well with other TCP connections
• This linear-increase, exponential backoff in the face of congestion is termed TCP-friendliness
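The increase-then-halve behaviour can be seen in a tiny simulation of one flow. This is a deliberately simplified model, not TCP itself: the window is counted in packets, the link has a fixed `capacity`, and a drop is assumed whenever the window exceeds it. The function name is illustrative.

```python
def aimd(capacity, rounds, window=1):
    """Return the window size used in each round trip for one flow."""
    history = []
    for _ in range(rounds):
        history.append(window)
        if window > capacity:
            window = max(1, window // 2)   # a drop occurred: halve the window
        else:
            window += 1                    # no drop: increase by one packet
    return history
```

With a capacity of 4 packets, ten rounds give the windows 1, 2, 3, 4, 5, 2, 3, 4, 5, 2: the flow climbs to just past the limit, backs off, and climbs again, hovering around the maximum without holding it.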
TCP Window Size
• Linear increase
• Exponential backoff
• Assuming no other losses in the network except those due to bandwidth
[Figure: bandwidth over time, with a Max Bandwidth line]
TCP Fairness
• Want to share the bottleneck link fairly between two flows
[Figure: hosts A and B sharing a bottleneck link to D; plot of bandwidth for Host A against bandwidth for Host B]
TCP Slow Start
• Linear increase takes a long time to build up a window size that matches the link bandwidth*delay
• Most file transactions are not long enough
• Consequently, TCP can spend a lot of time with small windows, never getting the chance to reach a sufficiently large window size
• Fix: Allow TCP to build up to a large window size initially by doubling the window size until first loss
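To get a feel for the difference, the sketch below counts the round trips needed for the window to first reach a target size, growing either by one packet per round trip or by doubling. It assumes no losses and a window that starts at one packet; the function name is made up for the example.

```python
def rounds_to_reach(target, slow_start):
    """Count round trips until the window first reaches target packets (no losses)."""
    window, rounds = 1, 0
    while window < target:
        window = window * 2 if slow_start else window + 1
        rounds += 1
    return rounds
```

Reaching a 64-packet window takes 63 round trips with linear increase but only 6 with doubling, which is why a short transfer benefits so much from the initial exponential phase.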
TCP Slow Start
• Initial phase of exponential increase
• Assuming no other losses in the network except those due to bandwidth
[Figure: bandwidth over time, with a Max Bandwidth line]
TCP Summary
• Reliable ordered message delivery
– Connection oriented, 3-way handshake
• Transmission window for better throughput
– Timeouts based on link parameters
• Congestion control
– Linear increase, exponential backoff
• Fast adaptation
– Exponential increase in the initial phase