T-110.5110 Computer Networks II
Transport Issues
29.9.2008
Prof. Sasu Tarkoma
Contents
•Transport Layer Overview
•Congestion Control
•TCP, TCP improvements, TCP and wireless
•Stream Control Transmission Protocol (SCTP)
•Datagram Congestion Control Protocol (DCCP)
•TLS and DTLS
Transport Layer Overview
•TCP congestion control principles introduced in late 1980s
– Not part of the original transport layer functionality
– Important design factor in today’s Internet protocol development
•Important issues
– Preventing congestion collapse
– Fairness
TCP and UDP
•Transmission Control Protocol (TCP)
– Connection oriented
– RFC 793
•User Datagram Protocol (UDP)
– Connectionless
– RFC 768
Motivation for Congestion Control
•UDP used instead of TCP by applications that prefer timeliness over reliability
•UDP does not have congestion control
•A problem with long-lived flows and traffic intensive flows (streaming video, audio, internet telephony)
•Greater use increases risk of congestion collapse
• Congestion control refers to techniques and mechanisms that either prevent congestion before it happens (open-loop congestion control) or remove congestion after it has happened (closed-loop congestion control)
Congestion Prevention
•Transmission rate must be reduced when congestion is detected
•Responsibility of transport layer, i.e., the sending end host
•Packet loss is assumed to be congestion signal
– No deployed explicit congestion notification scheme
– At most one congestion action / round-trip time
– Burst of packet losses can be indication of same congestion situation
Fairness
•Transport implementations must be fair to other flows
•Transmission rate should be roughly similar to that of TCP
•Components of TCP-friendly congestion control
– Slow-start
– Additive Increase, Multiplicative Decrease (AIMD)
•Retransmission timers relative to round-trip time
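The fairness argument behind AIMD can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions that are not in the slides: two flows share one bottleneck, both see every congestion event in the same round, and the function name and parameters are hypothetical.

```python
def aimd_two_flows(rate_a, rate_b, capacity, rounds, increase=1.0, decrease=0.5):
    """Simulate two AIMD flows sharing one bottleneck; return the final rates."""
    for _ in range(rounds):
        if rate_a + rate_b > capacity:
            # Congestion: both flows back off multiplicatively.
            rate_a *= decrease
            rate_b *= decrease
        else:
            # No congestion: both flows probe for bandwidth additively.
            rate_a += increase
            rate_b += increase
    return rate_a, rate_b


# Flows that start very unequal end up with nearly identical rates.
a, b = aimd_two_flows(rate_a=1.0, rate_b=80.0, capacity=100.0, rounds=500)
print(round(a, 2), round(b, 2))
```

Additive increase keeps the gap between the two rates constant, while each multiplicative decrease halves it, so the rates converge towards an equal share.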
Solutions
• Implement congestion control below UDP: too low
• Above UDP: implement congestion control at application level
– Reinventing the wheel each time
– Complex, might not be done correctly
– New protocol more interoperable than a user-level library
• In transport layer: modification of TCP, UDP, RTP, SCTP
– More complex protocols
– Not general enough
– Introduces a fundamental change
• Current trend: new transport protocols, namely DCCP and SCTP
TCP
•Reliable
•Cumulative acknowledgements
•Fast retransmit / fast recovery
•Reno [RFC 2581], NewReno [RFC 3782]
•Retransmission timeouts [RFC 2988]
•Stream-oriented
– no concept of datagram boundaries
– ideal for transferring files
– transferring series of structured messages more difficult
TCP Services
•Reliable communication between pairs of processes
•Across variety of reliable and unreliable networks and internets
•Two labeling facilities
– Data stream push
• TCP user can require transmission of all data up to push flag
• Receiver will deliver in same manner
• Avoids waiting for full buffers
– Urgent data signal
• Indicates urgent data is upcoming in stream
• User decides how to handle it
TCP Header
Source: William Stallings, Data and Computer Communications, Chapter 17.
TCP Mechanisms
•Connection establishment
– Between ports
– Three way handshake
•Data transfer service
– Stream of octets
– Octets numbered modulo 2^32
– Flow control by credit allocation of number of octets
– Data buffered at sender and receiver
•Connection termination
– Graceful close
– Transport entity sets FIN flag on last segment sent
– Abrupt termination by ABORT primitive
Congestion Control
•RFC 1122, Requirements for Internet hosts
•Retransmission timer management
– Estimate round trip delay by observing pattern of delay
– Set timer to value somewhat greater than estimate
– Simple average
– Exponential average
– RTT Variance Estimation (Jacobson’s algorithm)
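The RTT variance estimation above can be sketched as follows. This is an illustrative sketch using the constants from RFC 2988 (gain 1/8 for the smoothed RTT, 1/4 for the variance, RTO = SRTT + 4 * RTTVAR); the class and method names are hypothetical, and clock granularity is ignored.

```python
class RttEstimator:
    """Smoothed RTT and RTT variance in the style of RFC 2988."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4           # variance multiplier in the RTO formula

    def __init__(self, min_rto=1.0):
        self.srtt = None
        self.rttvar = None
        self.min_rto = min_rto

    def update(self, sample):
        """Feed one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement initialises both estimators.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The variance is updated before the smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.rto()

    def rto(self):
        return max(self.min_rto, self.srtt + self.K * self.rttvar)


est = RttEstimator(min_rto=0.0)  # lower bound disabled to show the raw values
for sample in (0.100, 0.100, 0.300):
    print(round(est.update(sample), 4))
```

A single delayed sample raises the RTO mainly through the variance term, which is why this estimator tolerates jitter better than a plain average.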
Exponential RTO Backoff
•Since a timeout is probably due to congestion (dropped packet or long round trip), a constant RTO is not a good idea
•RTO increased each time a segment is re-transmitted
•RTO = q*RTO
•Commonly q=2
– Binary exponential backoff
Retransmission Mechanism
•TCP receiver acknowledges next sequence number it expects to receive
– If receiver gets a packet out of order, it acknowledges the same sequence number as before
– When sender receives 3 duplicate acknowledgements it considers the first unacknowledged segment lost
•Congestion response: reduce the congestion window by half
– Retransmit the first unacknowledged segment
– If no acknowledgements arrive for time RTO, sender retransmits the first unacknowledged segment
– Reset window to one segment
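The duplicate-acknowledgement rule and the two congestion responses can be sketched as a small state machine. This is a simplified illustration, not a full TCP sender: windows are counted in whole segments, fast recovery is omitted, and the class name is hypothetical.

```python
class DupAckDetector:
    """Tracks cumulative ACKs and triggers fast retransmit on the third duplicate."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, cwnd):
        self.cwnd = cwnd        # congestion window, in segments
        self.last_ack = None    # most recent cumulative ACK number
        self.dup_count = 0

    def on_ack(self, ack_no):
        """Return the segment number to retransmit, or None."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.DUP_ACK_THRESHOLD:
                # Congestion response: halve the window, resend the missing segment.
                self.cwnd = max(1, self.cwnd // 2)
                return ack_no
        else:
            self.last_ack = ack_no
            self.dup_count = 0
        return None

    def on_timeout(self):
        """No ACK within RTO: reset the window and resend the first unacked segment."""
        self.cwnd = 1
        return self.last_ack


sender = DupAckDetector(cwnd=8)
print([sender.on_ack(n) for n in (1, 2, 3, 3, 3, 3)], sender.cwnd)
```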
Karn’s Algorithm
•If a segment is re-transmitted, the ACK arriving may be:
– For the first copy of the segment
• RTT longer than expected
– For second copy
•No way to tell
•Do not measure RTT for re-transmitted segments
•Calculate backoff when re-transmission occurs
•Use backoff RTO until ACK arrives for segment that has not been re-transmitted
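Karn's algorithm combined with exponential backoff (RTO = q*RTO) can be sketched as below. This is a minimal sketch with hypothetical names; as a simplification, an unambiguous ACK resets the RTO to a fixed base value, whereas a real sender would recompute it from its RTT estimator.

```python
class KarnTimer:
    """RTO handling that ignores RTT samples from retransmitted segments."""

    def __init__(self, rto=1.0, q=2.0, max_rto=60.0):
        self.base_rto = rto
        self.rto = rto
        self.q = q              # backoff factor; q = 2 gives binary exponential backoff
        self.max_rto = max_rto  # assumed upper bound on the backed-off RTO
        self.samples = []       # RTT samples that are safe to use

    def on_retransmit(self):
        # Back off each time a segment is retransmitted.
        self.rto = min(self.max_rto, self.q * self.rto)

    def on_ack(self, rtt, was_retransmitted):
        if was_retransmitted:
            # Ambiguous ACK: no RTT sample, keep the backed-off RTO.
            return
        self.samples.append(rtt)
        self.rto = self.base_rto


timer = KarnTimer(rto=1.0)
timer.on_retransmit()
timer.on_retransmit()
timer.on_ack(rtt=0.3, was_retransmitted=True)
print(timer.rto, timer.samples)
```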
Window Management
•Slow start
– awnd = MIN[credit, cwnd]
– Start connection with cwnd=1
– Increment cwnd at each ACK, to some max
•Dynamic window sizing on congestion
– When a timeout occurs
– Set slow start threshold to half current congestion window
• ssthresh=cwnd/2
– Set cwnd = 1 and slow start until cwnd=ssthresh
• Increasing cwnd by 1 for every ACK
– For cwnd >=ssthresh, increase cwnd by 1 for each RTT
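The window rules above can be traced per round trip. The sketch below is illustrative only: it counts cwnd in segments, models "increase cwnd by 1 for every ACK" as doubling per RTT, and assumes a floor of 2 segments for ssthresh, which is not stated on the slide. The function name and its parameters are hypothetical.

```python
def cwnd_trace(rtts, timeout_at, ssthresh=16):
    """Return cwnd (in segments) at the start of each round trip."""
    cwnd = 1
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd)
        if rtt in timeout_at:
            # Timeout: remember half the window, then restart slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: +1 per ACK doubles the window every round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: +1 segment per round trip.
            cwnd += 1
    return trace


print(cwnd_trace(rtts=12, timeout_at={7}, ssthresh=8))
# [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 6]
```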
Congestion Example
Source: http://dpnm.postech.ac.kr/itec522/lecture/Chapter12-3.ppt
Congestion avoidance
Source: http://dpnm.postech.ac.kr/itec522/lecture/Chapter12-3.ppt
TCP timers
Source: http://dpnm.postech.ac.kr/itec522/lecture/Chapter12-3.ppt
TCP Congestion Summary
Source: http://dpnm.postech.ac.kr/itec522/lecture/Chapter12-3.ppt
TCP Summary and Improvements
•Concepts: Congestion window, round-trip time, retransmission timeout, duplicate acknowledgement (triggered by out of order segment)
•Congestion control
– Packet loss as a signal, reduce rate
•Fairness
– Transport implementations must be fair to other flows
•Retransmission mechanism
•Selective acknowledgements (SACK), RFC 2018
– Additional information about “holes” in sequence number space
•Limited transmit & early retransmit, timestamps
TCP Problems
•Minimal information from cumulative acknowledgements
•Problems in environments with frequent packet losses (wireless)
•Small window and packet retransmissions
•May prevent fast retransmit from working
•Retransmission ambiguity -- is ACK for original or retransmit?
•Hinders the round-trip time measurement
•Unnecessary retransmissions
•Unnecessary use of bandwidth (sometimes expensive in wireless)
SACK
• Additional information about “holes” in sequence number space
– TCP option that reports discontinuous blocks of received data
– Sender gets better information about which segments are lost
– Allows more efficient retransmissions
– Without SACK, sender can retransmit only one segment per round-trip time
– With SACK, more retransmissions can be made in a round-trip time
– Allows more efficient tracking of number of outstanding segments
– SACK option specified in RFC 2018
– SACK-based retransmission algorithm specified in RFC 3517
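How a sender derives the "holes" from SACK information can be sketched as follows. This is an illustrative helper with a hypothetical name; it assumes each SACK block is a (left edge, right edge) pair of sequence numbers with the right edge exclusive, and it ignores sequence number wrap-around.

```python
def find_holes(cum_ack, sack_blocks):
    """Return the missing [start, end) ranges below the highest SACKed byte.

    cum_ack     -- next sequence number the receiver expects (cumulative ACK)
    sack_blocks -- (left, right) pairs of received data, right edge exclusive
    """
    holes = []
    expected = cum_ack
    for left, right in sorted(sack_blocks):
        if left > expected:
            # Data between 'expected' and this block has not arrived.
            holes.append((expected, left))
        expected = max(expected, right)
    return holes


# Receiver holds bytes 1500-1999 and 3000-3999 but still expects byte 1000.
print(find_holes(1000, [(3000, 4000), (1500, 2000)]))
# [(1000, 1500), (2000, 3000)]
```

With only the cumulative ACK the sender would learn about the first hole alone; the SACK blocks expose both holes within one round trip.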
Timestamps
•Specified in RFC 1323
•TCP option for sender to include timestamp in every packet
•TCP receiver echoes the timestamp back to sender
•Retransmissions have different timestamp than original
•Allows round-trip time measurement for retransmitted segments
– Not allowed without timestamps
– Allows detection of spurious retransmissions [Ludwig00]
– Allows protection against wrapped sequence numbers
Queue Management
•Simple router implementation drops packet when queue is full
– Lock-out: Sometimes few flows get to dominate most of queue space
– Queue delay: Long packet queues increase transmission delays
•Active Queue Management marks packets before queue is full
– Random Early Detection (RED) [Floyd93]
• Mark a packet at probability P when queue length is more than L
• Marks are distributed more evenly between flows
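The marking rule can be sketched in a simplified form of RED. The assumptions here go beyond the slide: the marking probability rises linearly between two thresholds (`min_th`, `max_th`) up to `max_p`, and the queue length is smoothed with an exponential average. The count-based spacing of marks from [Floyd93] is omitted, and all names and default values are hypothetical.

```python
import random


class RedQueue:
    """Simplified Random Early Detection marking decision."""

    def __init__(self, min_th, max_th, max_p, weight=0.002, rng=None):
        self.min_th = min_th    # below this average length, never mark
        self.max_th = max_th    # at or above this average length, always mark
        self.max_p = max_p      # marking probability just below max_th
        self.weight = weight    # gain of the moving average
        self.avg = 0.0          # smoothed queue length
        self.rng = rng or random.Random()

    def mark_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def on_arrival(self, queue_len):
        """Update the average with the current queue length; return True to mark."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.rng.random() < self.mark_probability()


q = RedQueue(min_th=5, max_th=15, max_p=0.1, weight=1.0)  # weight=1.0: no smoothing
q.on_arrival(10)
print(q.mark_probability())
```

Because each arriving packet is marked independently at random, flows that send more packets receive proportionally more marks.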
Explicit Congestion Notification
•Sender marks a bit in IP header if transport is ECN capable
– Routers indicate congestion with a congestion bit in IP header
– Used with Active Queue Management
– Reduces the number of packet losses
– Transport layer receiver echoes congestion notification to sender
• In transport header
– When receiving notification, sender reduces its transmission rate
– Implemented in many end-hosts, but not too many routers
•Problem: Some devices in network drop IP packets with ECN bits
TCP Evolution
•1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
•1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
•1982: TCP & IP, RFC 793 & 791
•1983: BSD Unix 4.2 supports TCP/IP
•1984: Nagle’s algorithm to reduce overhead of small packets; predicts congestion collapse
•1986: Congestion collapse observed
•1987: Karn’s algorithm to better estimate round-trip time
•1988: Van Jacobson’s algorithms: congestion avoidance and congestion control (most implemented in BSD Tahoe)
SYN Cookies
• Client
– sends SYN packet and ACK number to server
– waits for SYN-ACK from server w/ matching ACK number
• Server
– responds w/ SYN-ACK packet w/ initial SYN-cookie sequence number
– Sequence number is cryptographically generated value based on client address, port, and time.
• Client
– sends ACK to server w/ matching sequence number
• Server
– If ACK is to an unopened socket, server validates returned sequence number as SYN-cookie
– If value is reasonable, a buffer is allocated and socket is opened
[Figure, message sequence: SYN (ack-number); SYN-ACK (seq-number as SYN-cookie, ack-number), no buffer allocated; ACK (seq_number, ack-number + data); SYN-ACK (seq-number, ack-number), TCP buffer allocated]
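The stateless check behind SYN cookies can be sketched as below. This is a minimal sketch under stated assumptions: the cookie is the first 32 bits of an HMAC over client address, port, and a coarse time slot, and the two most recent slots are accepted. `SECRET`, `SLOT_SECONDS`, and the function names are hypothetical, and deployed implementations pack further information into the sequence number, which is omitted here.

```python
import hashlib
import hmac
import time

SECRET = b"server-local secret"   # hypothetical key, known only to the server
SLOT_SECONDS = 64                 # assumed granularity of the time component


def _mac(client_ip, client_port, slot, secret):
    msg = f"{client_ip}|{client_port}|{slot}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")   # fits a 32-bit sequence number


def make_cookie(client_ip, client_port, now=None, secret=SECRET):
    """Initial sequence number for the SYN-ACK; no per-connection state is stored."""
    now = time.time() if now is None else now
    return _mac(client_ip, client_port, int(now // SLOT_SECONDS), secret)


def cookie_is_valid(cookie, client_ip, client_port, now=None, secret=SECRET):
    """Recompute the cookie for the current and previous time slot and compare.

    'cookie' is the sequence number recovered from the client's final ACK.
    """
    now = time.time() if now is None else now
    slot = int(now // SLOT_SECONDS)
    return any(cookie == _mac(client_ip, client_port, s, secret) for s in (slot, slot - 1))


cookie = make_cookie("192.0.2.1", 40000, now=1000.0)
print(cookie_is_valid(cookie, "192.0.2.1", 40000, now=1010.0))
```

Only after `cookie_is_valid` succeeds would the server allocate a buffer and open the socket.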
Stream Control Transmission Protocol (SCTP)
SCTP
•Stream Control Transmission Protocol (SCTP)
•Specified in RFC 2960
•Additional features to TCP
– Preservation of message boundaries
– Support for multiple streams
– Support for multi-homing
•Packets consist of chunks: INIT, SACK, HEARTBEAT, DATA, ABORT, SHUTDOWN, ERROR, and AUTH
•Partial reliability
– Retransmissions until abort
•Extended Socket API (bind(), context data with sendmsg())
•Suitable for signalling traffic
•Challenges with middleboxes
Motivation
•TCP, UDP do not satisfy all application needs
•SCTP evolved from work on IP telephony signaling
– Proposed IETF standard (RFC 2960)
– Like TCP, it provides reliable, full-duplex connections
– Unlike TCP and UDP, it offers new delivery options that are particularly desirable for telephony signaling and multimedia applications
•TCP + features
– Congestion control similar; some optional mechanisms mandatory
– Two basic types of enhancements:
• performance
• robustness
Comparison
Services/Features                       SCTP  TCP   UDP
Full-duplex data transmission           yes   yes   yes
Connection-oriented                     yes   yes   no
Reliable data transfer                  yes   yes   no
Unreliable data transfer                yes   no    yes
Partially reliable data transfer        yes   no    no
Ordered data delivery                   yes   yes   no
Unordered data delivery                 yes   no    yes
Flow and Congestion Control             yes   yes   no
ECN support                             yes   yes   no
Selective acks                          yes   yes   no
Preservation of message boundaries      yes   no    yes
Application data fragmentation          yes   yes   no
Multistreaming                          yes   no    no
Multihoming                             yes   no    no
Protection against SYN flooding attack  yes   no    n/a
Half-closed connections                 no    yes   n/a
Packet format
• Unlike TCP, SCTP provides message-oriented data delivery service
– key enabler for performance enhancements
• Common header; three basic functions:
– Source and destination ports together with the IP addresses
– Verification tag
– Checksum: CRC-32 instead of Adler-32
• followed by one or more chunks
– chunk header that identifies length, type, and any special flags
– concatenated building blocks containing either control or data information
– control chunks transfer information needed for association (connection) functionality and data chunks carry application layer data.
– Current spec: 14 different Control Chunks for association establishment, termination, ACK, destination failure recovery, ECN, and error reporting
• Packet can contain several different chunk types
•Decoupling of reliable and ordered delivery
– Unordered delivery: eliminate head-of-line blocking delay
[Figure: TCP receiver buffer with Chunks 1 to 4; app waits]
• Application Level Framing
• Support for multiple data streams (per-stream ordered delivery)
– Stream sequence number (SSN) preserves order within streams
– no order preserved between streams
– per-stream flow control, per-association congestion control
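Per-stream ordered delivery can be illustrated with a small reassembly sketch. It is a simplification with hypothetical names: each message carries a stream identifier and a stream sequence number (SSN) starting at 0, and fragmentation, TSNs, and SSN wrap-around are ignored.

```python
from collections import defaultdict


class StreamReassembler:
    """Delivers messages in SSN order within each stream, independently per stream."""

    def __init__(self):
        self.next_ssn = defaultdict(int)    # next SSN expected on each stream
        self.pending = defaultdict(dict)    # stream id -> {ssn: payload}

    def receive(self, stream_id, ssn, payload):
        """Buffer one message; return the messages now deliverable on its stream."""
        self.pending[stream_id][ssn] = payload
        delivered = []
        while self.next_ssn[stream_id] in self.pending[stream_id]:
            delivered.append(self.pending[stream_id].pop(self.next_ssn[stream_id]))
            self.next_ssn[stream_id] += 1
        return delivered


r = StreamReassembler()
print(r.receive(1, 1, "a2"))   # stream 1 waits for its SSN 0
print(r.receive(2, 0, "b1"))   # stream 2 is not blocked by the gap on stream 1
print(r.receive(1, 0, "a1"))   # the gap closes and both messages are delivered
```

A loss on one stream delays only that stream, which is the head-of-line blocking benefit compared with a single TCP byte stream.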
Performance
•Application may use multiple logical data streams
– e.g. pictures in a web browser
•Common solution: multiple TCP connections
– separate flow / congestion control, overhead (connection setup/teardown, ..)
Multiple Data Streams
[Figure: chunks from App stream 1 and App stream 2 multiplexed from a TCP sender to a TCP receiver; App 1 waits]
Multihoming
• TCP connection is equivalent to SCTP association
– TCP: 2 IP addresses, 2 port numbers; SCTP: 2 sets of IP addresses, 2 port numbers
• Goal: robustness
– automatically switch hosts upon failure
– eliminates effect of long routing reconvergence time
• TCP: no guarantee for “keepalive” messages when connection idle
• SCTP monitors each destination's reachability via ACKs of data chunks and heartbeat chunks
• SCTP uses multihoming for redundancy, not for load balancing
Association phases
• Association establishment: 4-way handshake
– Host A sends INIT chunk to Host B
– Host B returns INIT-ACK containing a cookie
• information that only Host B can verify
• No memory is allocated at this point (prevents DoS)
– Host A replies with COOKIE-ECHO chunk; may contain A's first data.
– Host B checks validity of cookie; association is established
• Data transfer
– SCTP assigns each chunk a unique Transmission Sequence Number (TSN)
– SCTP peers exchange starting TSN values during association establishment phase
– Message oriented data delivery; fragmented if larger than destination path MTU
– Reliability through acks, retransmissions, and end-to-end checksum
• Association shutdown: 3-way handshake
– SHUTDOWN → SHUTDOWN-ACK → SHUTDOWN-COMPLETE
– Does not allow half-closed connections
Datagram Congestion Control Protocol (DCCP)
Motivation
•Some apps want unreliable, timely delivery
– For example: VoIP
•UDP: no congestion control
•Unresponsive long-lived applications
– endanger others (congestion collapse)
– may hinder themselves (queuing delay, loss, ..)
•Implementing congestion control is difficult
– may require precise timers; should be placed in kernel
DCCP
•Datagram Congestion Control Protocol (DCCP)
•Unreliable datagram-oriented protocol (RFC 4340)
– UDP with congestion control
•Connection-oriented, requires connection state machine
•Congestion control requires ack mechanism and sequence numbers
•Negotiable features and options
– Checksums, congestion control parameters
•Some features: partial checksums, service codes
•Suitable for long-lived non-reliable flows
•Challenges with middleboxes
DCCP Requirements
•DCCP was designed for time-sensitive applications
•Application requirements:
– Choice of congestion control mechanism: TFRC vs. TCP-like
– Buffering control: do not deliver old data
– Low per-packet overhead
•Additional features
– Explicit Congestion Notification (ECN): mark congested packets
– NAT and firewall support: TCP-style explicit connection setup and teardown
DCCP Requirements
• Well-known features from TCP and UDP:
– Port numbers, checksums, sequence numbers (with difficulty), acks (congestion and ECN info), piggybacked acks
– Three-way handshake to set up, two-way with wait to tear down
• New features:
– Negotiate congestion control mechanism and parameters on setup
– Two half-connections (A → B, B → A)
Half connections
• Based on observation that traffic is typically asymmetric
• It follows that separation is useful
– Different routes imply different congestion issues
– Each half connection has own congestion control mechanism and parameters
• Better than two one-way connections
– Works better with firewalls and NAT
– Can piggyback acks with data
DCCP Feature Selection
•Reliable feature selection:
– A: change(f, α)
– B: confirm(f, α) / prefer(f, β)
– [A: confirm(f, β) ]
•Selection for both half-connections done in parallel at startup
•Generic, extensible
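The change / confirm / prefer exchange can be walked through with a toy model. This sketch follows the message names used on the slide rather than any particular revision of the specification; the function, its parameters, and the example values are hypothetical, and retransmission of lost negotiation messages is not modelled.

```python
def negotiate(feature, proposal, a_acceptable, b_acceptable, b_preferred):
    """Return (agreed_value, message_log) for one feature negotiation.

    A proposes a value. B confirms it if acceptable, otherwise answers with
    its preferred value, which A confirms if A can accept it.
    """
    log = [f"A: change({feature}, {proposal})"]
    if proposal in b_acceptable:
        log.append(f"B: confirm({feature}, {proposal})")
        return proposal, log
    log.append(f"B: prefer({feature}, {b_preferred})")
    if b_preferred in a_acceptable:
        log.append(f"A: confirm({feature}, {b_preferred})")
        return b_preferred, log
    return None, log   # no agreement in this simplified model


value, log = negotiate(
    feature="cc",
    proposal="TFRC",
    a_acceptable={"TFRC", "TCP-like"},
    b_acceptable={"TCP-like"},
    b_preferred="TCP-like",
)
print(value)
print("\n".join(log))
```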
Issues with Acknowledgements (ACKs)
• Acks must be at least partially reliable
• TCP-style cumulative acks won’t work, so must ack everything (ack vector)
• But ack state at receiver may grow without bound!
• So sender occasionally acks the receiver’s acks
• Receiver can throw away state for that ack
• Acks take up sequence number space
– Useful: can be used to detect reverse-path congestion
Packet Structure
•Basic packet similar to UDP
•Small (12 bytes)
•Extensible for additional features instead of using a fixed-length flag field
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |           Dest Port           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data Offset  | CCVal | CsCov |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type  |X|# NDP|                Sequence Number                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
“Plug ‘n Play” Congestion Control
•CC mechanism and parameters (both ways) chosen during connection setup
•Currently two mechanisms:
– TFRC (control equation)
– TCP-like (TCP with tweaked parameters)
– Can add more later
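The TFRC "control equation" can be evaluated directly. The sketch below uses the TCP throughput equation as given in the TFRC specification (RFC 3448), with the commonly recommended simplification t_RTO = 4 * RTT as the default; the function and parameter names are hypothetical.

```python
from math import sqrt


def tfrc_rate(s, rtt, p, b=1, t_rto=None):
    """Allowed sending rate in bytes per second from the TCP throughput equation.

    s     -- packet size in bytes
    rtt   -- round-trip time in seconds
    p     -- loss event rate, 0 < p <= 1
    b     -- packets acknowledged by one ACK
    t_rto -- retransmission timeout in seconds (defaults to 4 * rtt)
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt
    denominator = (
        rtt * sqrt(2 * b * p / 3)
        + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2)
    )
    return s / denominator


for loss in (0.001, 0.01, 0.1):
    print(loss, round(tfrc_rate(s=1460, rtt=0.1, p=loss)))
```

The rate falls smoothly as the loss event rate rises, which avoids the abrupt halving of the TCP-like mechanism while remaining comparable to TCP on average.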
Partial checksums
• Checksum covers DCCP header and (optionally) any number of bytes into payload
• Allows delivery of some damaged data
• May be useful on error-prone links (eg. wireless)
• Drawbacks:
– Might conflict with IP-level authentication (e.g. IPSec’s AH)
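The effect of a partial checksum can be demonstrated with the 16-bit ones' complement Internet checksum. This is a simplified sketch: coverage is given as a plain byte count (`covered`, a hypothetical parameter) rather than in the encoding of the CsCov field, and no pseudo-header is included.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over data (padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF


def partial_checksum(header: bytes, payload: bytes, covered: int) -> int:
    """Checksum over the header and only the first 'covered' payload bytes."""
    return internet_checksum(header + payload[:covered])


header = b"\x12\x34\x56\x78"
payload = b"important" + b"tolerant-of-errors"
reference = partial_checksum(header, payload, covered=9)

# Damage outside the covered region goes unnoticed, so the data is still delivered.
damaged_tail = payload[:9] + b"X" + payload[10:]
print(partial_checksum(header, damaged_tail, covered=9) == reference)

# Damage inside the covered region changes the checksum.
damaged_head = b"X" + payload[1:]
print(partial_checksum(header, damaged_head, covered=9) == reference)
```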
When is a packet received?
• TCP: acked packets must be delivered to application
• DCCP: acked packet might be dropped from application’s queue (apps might favour new data over old)
• Ack means received and placed into application queue
Security
• Assumption: hijacker cannot snoop packets
• Start session at random sequence number
• Sliding “valid sequence number” window, hijacker cannot throw in random packet
– If out-of-window packet received, ask sender if it’s correct and tell them what receiver’s window is
– This allows windows to be resynchronised
• ECN nonce in packets to prevent misbehaving receiver
• DoS attacks
– TCP-style init cookies, avoid saving state at the server
TLS and DTLS
Motivation
•Why not use IPsec?
– Better suited for host-host security than application-application security
– Runs in the kernel
– Non-uniform IPsec APIs
– Complicated, interoperability issues
– Key exchange complicated
•TLS is the de facto transport layer security protocol
•Datagram TLS
History
•SSL 1.0
– Internal Netscape design, early 1994?
•SSL 2.0
– Published by Netscape, November 1994
•SSL 3.0
– Designed by Netscape and Paul Kocher, November 1996
•TLS 1.0
– Internet standard based on SSL 3.0, January 1999
– Not interoperable with SSL 3.0
TLS Basics
•TLS consists of two protocols
•Handshake protocol
– Use public-key cryptography to establish a shared secret key between the client and the server
•Record protocol
– Use the secret key established in the handshake protocol to protect communication between the client and the server
TLS Handshake Protocol
•Two parties: client and server
•Negotiate version of the protocol and the set of cryptographic algorithms to be used
– Interoperability between different implementations of the protocol
•Authenticate client and server (optional)
– Use digital certificates to learn each other’s public keys and verify each other’s identity
•Use public keys to establish a shared secret
Handshake Protocol Structure
Client → Server: ClientHello
Server → Client: ServerHello, [Certificate], [ServerKeyExchange], [CertificateRequest], ServerHelloDone
Client → Server: [Certificate], ClientKeyExchange, [CertificateVerify]
Client → Server: switch to negotiated cipher, Finished
Server → Client: switch to negotiated cipher, Finished
Abbreviated Handshake
•The handshake protocol may be executed in an abbreviated form to resume a previously established session
– No authentication, key material not exchanged
– Session resumed from an old state
Datagram Transport Layer Security (DTLS)
•Datagram TLS (DTLS) is a new protocol defined in RFC 4347
•Can be used on top of unreliable transport protocols such as UDP
•OpenSSL
•RFC4347 refers to the TLS specification and specifies only the differences to TLS
•Replay protection with an explicit sequence number that is included in the record header, since message re-ordering, message duplication and message loss are possible
•Protection against Denial-of-Service attacks
•DTLS supports protection against DoS attacks with cookies similar to IKEv2
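Replay protection with an explicit sequence number is commonly implemented as a sliding bitmap window, in the style used by IPsec. The sketch below is illustrative only: the class name and window size are assumptions, and a real receiver would update the window only after the record's MAC has been verified.

```python
class ReplayWindow:
    """Sliding-window replay check over explicit record sequence numbers."""

    def __init__(self, size=64):
        self.size = size
        self.highest = -1   # highest sequence number accepted so far
        self.bitmap = 0     # bit i set means (highest - i) was already seen

    def check_and_update(self, seq):
        """Return True if the record is fresh, False if replayed or too old."""
        if seq > self.highest:
            # Newer than anything seen: slide the window forward.
            shift = seq - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False    # fell out of the window
        if self.bitmap & (1 << offset):
            return False    # duplicate
        self.bitmap |= 1 << offset
        return True


w = ReplayWindow(size=4)
print([w.check_and_update(s) for s in (0, 2, 1, 2, 10, 7, 6, 10)])
# [True, True, True, False, True, True, False, False]
```

Reordered records inside the window are accepted once, which suits an unreliable transport where reordering and duplication are expected.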
Comparison
•Scope of the protection
– TLS operates end-to-end between two applications
– IPSec can operate end-to-end between two hosts or middle-to-middle
•TLS protects only the payload of the application. It does not protect the transport header. It does not protect the IP header.
•IPSec protects the application payload, the transport header and the IP header
•TLS is used to establish a “secure channel”, which can protect several subsequent TCP connections
•IPSec can be used to protect several data flows between two hosts A and B.
Comparison II
• Sequence numbers for replay protection
– Both protocols provide replay protection
– In TLS the sequence number is not carried explicitly, since a reliable transport is assumed with TCP.
– In AH/ESP the sequence number is carried explicitly in the AH/ESP header
– In DTLS, the sequence number is carried explicitly as well, since no reliable transport is assumed
• Protection against DoS attacks
– IKEv2 provides protection against DoS attacks using cookies
– TLS does not provide protection against DoS attacks, since it runs on top of TCP
– DTLS provides DoS protection using cookies, similar to IKEv2
Comparison III
• Administrative issues
– The configuration of IPSec policies on a host or a router requires administrative access (e.g. root)
– TLS, on the contrary, can be used by any application and does not require administrative access
• Issues with firewalls and NATs
– IPSec has incompatibility issues with middleboxes such as firewalls and NATs, since such devices examine and possibly manipulate the IP header and the transport header
– TLS can operate well in the presence of firewalls and NATs (as long as the port numbers used by the application are not blocked by the firewall), since the IP header and the transport header are not modified by TLS