End-to-End Protocols: User Datagram Protocol (UDP) and Transmission Control Protocol (TCP)

Post on 19-Jan-2018




1

End-to-End Protocols

User Datagram Protocol (UDP)
Transmission Control Protocol (TCP)

2

Underlying Best-Effort Network

• Drops messages
• Re-orders messages
• Delivers duplicate copies of a given message
• Limits messages to some finite size
• Delivers messages after an arbitrarily long delay

3

Common End-to-End Services

• Guarantee message delivery
• Deliver messages in the same order they are sent
• Deliver at most one copy of each message
• Support arbitrarily large messages
• Support synchronization
• Allow the receiver to flow control the sender
• Support multiple application processes on each host

4

Transport Protocol

• All end-to-end services are provided by the transport protocol
• Separate layer of the protocol stack
• Conceptually between
  - Application layer
  - IP layer

5

Terminology

• IP
  - Provides computer-to-computer (node-to-node) communication, including between computers on two different networks
  - Source and destination addresses are computers
  - Called machine-to-machine
• Transport protocols
  - Provide application-to-application communication
  - Need an extended addressing mechanism to identify applications (port numbers)
  - Called end-to-end

6

Transport Protocol Functionality

• Identify sending and receiving applications (by using port numbers)
• Optionally provide
  - Reliability
  - Flow control
  - Congestion control
• Note: not all transport protocols provide the above facilities

7

Two Transport Protocols Available

• Transmission Control Protocol (TCP)
• User Datagram Protocol (UDP)
• Major differences
  - Interface to applications
  - Functionality

8

User Datagram Protocol (UDP)

• Provides unreliable transfer
• Requires minimal
  - Overhead
  - Computation
  - Communication
• Best for LAN applications (because a LAN is a very reliable environment)
• Used to support audio/video streaming and similar applications that do not require absolute reliability

9

UDP Details

• Connectionless service paradigm
  - Message-oriented interface
• Each message encapsulated in an IP datagram
• UDP header identifies
  - Sending application
  - Receiving application

10

UDP Detail (cont’d)

• Included in TCP/IP suite
• Supports many protocol ports
• Transports a message from one machine to another
• Does not use acknowledgements
• Does not order incoming messages
• Does not do flow control
• Application program is responsible for handling all problems such as loss, duplication, delay, and out-of-order delivery

11

Simple Demultiplexor (UDP)

• Adds multiplexing
• No flow control
• Endpoints identified by ports
  – servers have well-known ports
  – see /etc/services on Unix
• Optional checksum
  – pseudo header + UDP header + data
• Header format
  – Source port is optional (zero, if not used)
  – Length includes all fields (minimum value is 8)

Header layout (32 bits per row; bit positions 0, 16, 31):
  SrcPort (16 bits) | DstPort (16 bits)
  Length (16 bits)  | Checksum (16 bits)
  Data
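The header layout above can be sketched with Python's `struct` module. This is only an illustration of the four 16-bit fields; the helper names `build_udp_datagram` and `parse_udp_datagram` are hypothetical, and the checksum is left at zero (not computed).

```python
import struct

# Four 16-bit fields in network byte order: SrcPort, DstPort, Length, Checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with an 8-byte UDP header (checksum 0 = not used)."""
    length = UDP_HEADER.size + len(payload)  # Length covers header and data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram):
    """Split a UDP datagram into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


datagram = build_udp_datagram(28900, 53, b"query")
fields = parse_udp_datagram(datagram)
```

With an empty payload the Length field is 8, which matches the minimum value stated on the slide.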

12

UDP Checksum

• Optional: checksum can be used to verify delivery of the datagram to the correct destination
• Pseudoheader
  – Consists of three fields from the IP header (protocol number, source IP address, and destination IP address) plus the UDP length field
• UDP uses the same checksum algorithm as IP
• Note: UDP length is included twice in the calculation (once in the pseudoheader and once in the UDP header)

13

UDP Pseudo-Header

• Used to verify that UDP has reached its destination

• Used during UDP checksum computation

Pseudo-header layout (32 bits per row):
  Source IP Address
  Destination IP Address
  Zero (8 bits) | Proto (8 bits) | UDP Length (16 bits)
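A minimal sketch of the checksum computation over pseudo-header + UDP header + data, assuming IPv4 addresses and protocol number 17 for UDP. The function names are illustrative, and the example addresses are made up.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (as used by IP)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a multiple of 16 bits; padding is not sent
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src_ip: str, dst_ip: str, udp_segment: bytes) -> int:
    """Checksum over pseudo-header + UDP header + data."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 17, len(udp_segment))  # Zero, Proto, UDP Length
    )
    return internet_checksum(pseudo_header + udp_segment)


# Sender: compute with the checksum field set to zero, then fill it in.
segment = struct.pack("!HHHH", 28900, 53, 13, 0) + b"query"
checksum = udp_checksum("10.0.0.1", "10.0.0.2", segment)
filled = segment[:6] + struct.pack("!H", checksum) + segment[8:]

# Receiver: the same computation over the received segment yields zero.
receiver_result = udp_checksum("10.0.0.1", "10.0.0.2", filled)
```

Because the destination IP address is part of the sum, a datagram delivered to the wrong host fails the check even if the UDP header and data are intact.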

14

UDP Protocol Layering

• Operates in the same layer as TCP

Application

Transport

Internet

Network Interface

Physical

15

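UDP Encapsulation

• Entire UDP message is encapsulated in an IP datagram

[Figure: nested encapsulation]
  UDP message: UDP-Hdr | UDP Data Area
  IP datagram: IP-Hdr | IP Data Area (holds the UDP message)
  Frame:       Frame-Hdr | Frame Data Area (holds the IP datagram)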

16

Note: The IP layer is responsible only for transferring data between a pair of hosts on an internet, while the UDP layer is responsible only for differentiating among multiple sources or destinations within one host.

17

UDP Message Queue

[Figure: packets arrive at UDP and are demultiplexed by destination port; each port has its own queue, and each queue is read by one application process.]
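The queue picture can be sketched as a dictionary that maps each bound port to its own queue. This is a toy model, not how any real protocol stack is written; the class and method names are hypothetical.

```python
from collections import deque


class UdpDemultiplexer:
    """Toy model: one queue per bound port, filled as datagrams arrive."""

    def __init__(self):
        self.queues = {}

    def bind(self, port):
        """An application process claims a port and gets that port's queue."""
        self.queues[port] = deque()
        return self.queues[port]

    def deliver(self, dst_port, payload):
        """Place an arriving datagram on the queue for its destination port."""
        queue = self.queues.get(dst_port)
        if queue is None:
            return False  # no application bound to this port: datagram dropped
        queue.append(payload)
        return True


udp = UdpDemultiplexer()
dns_queue = udp.bind(53)
echo_queue = udp.bind(7)
udp.deliver(53, b"query A")
udp.deliver(7, b"ping")
udp.deliver(53, b"query B")
```

After these calls the DNS queue holds two messages and the echo queue holds one; neither application sees the other's traffic.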

18

Identifying An Application

• Cannot extend IP address: no unused bits
• Cannot use OS-dependent quantity: process ID, task number, or job name
• Must work on all computer systems
• Invent new abstraction
  - Used only with TCP/IP
  - Identifies sender and receiver unambiguously
• Technique
  - Each application assigned a unique integer
  - Called protocol port number

19

Protocol Ports

• Server
  - Follows standard
  - Always uses same port number
  - Uses lower port numbers
• Client
  - Obtains unused port from protocol software
  - Uses higher port numbers

20

Protocol Port Example

• Domain name server application is assigned port 53
• Application using DNS obtains port 28900
• UDP datagram sent from application to DNS server has
  - Source port number 28900
  - Destination port number 53
• When DNS server replies, UDP datagram has
  - Source port number 53
  - Destination port number 28900

21

Common Port Assignments

Protocol   Port  Transport  Purpose
Echo       7     TCP/UDP    Used to verify whether two machines can communicate
Daytime    13    TCP/UDP    Provides the current time on the server
FTP-Data   20    TCP        Used to transfer files
FTP        21    TCP        To send ftp commands such as "put" and "get"
TELNET     23    TCP        For interactive, remote command-line sessions
SMTP       25    TCP        To send email between machines
Whois      43    TCP        Directory service for Internet network administrators
Finger     79    TCP        Gets information about a user or users
HTTP       80    TCP        Protocol of the World Wide Web
POP3       110   TCP        Post Office Protocol Version 3, for transfer of email from host to clients
NNTP       119   TCP        Network News Transfer Protocol

22

Transmission Control Protocol (TCP)

• Major transport protocol used in the Internet
• Heavily used
• Completely reliable transfer

23

TCP Features

• Connection-oriented service
• Point-to-point
• Full-duplex communication
• Stream interface
• Stream divided into segments for transmission
• Each segment encapsulated in an IP datagram
• Uses protocol ports to identify applications

24

TCP Overview

• Connection-oriented
• Byte-stream
  – app writes bytes
  – TCP sends segments
  – app reads bytes
• Full duplex
• Flow control: keep sender from overrunning receiver
• Congestion control: keep sender from overrunning network

[Figure: the sending application process writes bytes into the TCP send buffer; TCP transmits them as segments; the receiving TCP places them in its receive buffer, from which the receiving application process reads bytes.]

25

TCP Feature Summary

TCP provides a completely reliable (no data duplication or loss), connection-oriented, full-duplex stream transport service that allows two application programs to form a connection, send data in either direction, and then terminate the connection.

26

Data Link Versus Transport

• Potentially connects many different hosts
  – need explicit connection establishment and termination
• Potentially different RTT
  – need adaptive timeout mechanism
• Potentially long delay in network
  – need to be prepared for arrival of very old packets
• Potentially different capacity at destination
  – need to accommodate different node capacity
• Potentially different network capacity
  – need to be prepared for network congestion

27

Apparent Contradiction

• IP offers best-effort (unreliable) delivery
• TCP uses IP
• TCP provides completely reliable transfer
• How is this possible?

28

Achieving Reliability

• Reliable connection startup
• Reliable data transmission
• Graceful connection shutdown

29

Reliable Data Transmission

• Positive acknowledgement
  - Receiver returns short message when data arrives
  - Called acknowledgement
• Retransmission
  - Sender starts timer whenever message is transmitted
  - If timer expires before acknowledgement arrives, sender retransmits message

30

How Long Should TCP Wait Before Retransmitting?

• Time for acknowledgement to arrive depends on
  - Distance to destination
  - Current traffic conditions
• Multiple connections can be open simultaneously
• Traffic conditions change rapidly

31

Important Point

The delay required for data to reach a destination and an acknowledgement to return depends on traffic in the internet as well as the distance to the destination. Because it allows multiple application programs to communicate with multiple destinations concurrently, TCP must handle a variety of delays that can change rapidly.

32

Solving Retransmission Problem

• Keep estimate of round-trip time on each connection
• Use current estimate to set retransmission timer
• Known as adaptive retransmission
• Key to TCP's success

33

TCP Flow Control

• Receiver
  - Advertises available buffer space
  - Called window
• Sender
  - Can send up to entire window before ack arrives

34

Window Advertisement

• Each acknowledgement carries new window information
  - Called window advertisement
  - Can be zero (called closed window)
• Interpretation: I have received up through X, and can take Y more octets

35

Startup And Shutdown

• Connection startup
  - Must be reliable
• Connection shutdown
  - Must be graceful
• Both are difficult

36

Why Startup/Shutdown Difficult

• Segments can be
  - Lost
  - Duplicated
  - Delayed
  - Delivered out of order
• Either side can crash
• Either side can reboot
• Need to prevent a duplicate "shutdown" message from affecting a later connection

37

TCP's Startup/Shutdown Solution

• Uses three-message exchange
• Known as 3-way handshake
• Necessary and sufficient for
  - Unambiguous, reliable startup
  - Unambiguous, graceful shutdown
• SYN used for startup
• FIN used for shutdown

38

Connection Establishment and Termination

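[Figure: three-way handshake between the active participant (client) and the passive participant (server)]
  1. Client to server: SYN, SequenceNum = x
  2. Server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
  3. Client to server: ACK, Acknowledgment = y + 1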
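The sequence-number arithmetic of the handshake can be sketched as follows. The dictionaries only model the fields named in the figure; `three_way_handshake` is a hypothetical helper, and the initial sequence numbers are arbitrary example values.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged when a connection is opened."""
    # 1. Active participant (client) announces its initial sequence number x.
    syn = {"from": "client", "flags": {"SYN"}, "seq": client_isn}
    # 2. Passive participant (server) announces y and acknowledges x + 1.
    syn_ack = {
        "from": "server",
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": syn["seq"] + 1,
    }
    # 3. Client acknowledges y + 1; both sides now agree on both numbers.
    ack = {"from": "client", "flags": {"ACK"}, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
```

Each acknowledgement is one more than the sequence number it answers because the Acknowledgment field names the next sequence number the sender expects to receive.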

39

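Illustration: Closing a connection (Host 1 closes first)

  1. Host 1 sends FIN + ACK
  2. Host 2 receives the FIN + ACK and sends ACK
  3. Host 2 sends its own FIN + ACK
  4. Host 1 receives the FIN + ACK and sends ACK
  5. Host 2 receives the ACK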

40

TCP Segment Format

• All TCP segments have the same format
  - Data
  - Acknowledgement
  - SYN (startup)
  - FIN (shutdown)
• Segment divided into two parts
  - Header
  - Payload area (zero or more bytes of data)
• Header contains
  - Protocol port numbers to identify sending and receiving applications
  - Bits to specify items such as SYN, FIN, ACK
  - Fields for window advertisement, acknowledgement, etc.

41

TCP Segment Format

Header layout (32 bits per row; bit positions 0, 4, 10, 16, 31):
  SrcPort | DstPort
  SequenceNum
  Acknowledgment
  HdrLen | 0 | Flags | AdvertisedWindow
  Checksum | UrgPtr
  Options (variable)
  Data

• Sequence number specifies where in the stream the data belongs
• Acknowledgement field specifies the sequence number of the first byte of stream data to be received next (implying all previous bytes were received correctly)
• Few segments contain options
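As an illustration of the segment layout, the fixed 20-byte header can be unpacked with `struct`. This sketch assumes the six flag bits occupy the low-order bits of the 16-bit word that also carries HdrLen, with FIN as the least significant bit; `parse_tcp_header` is a hypothetical helper and options are not decoded.

```python
import struct

# SrcPort, DstPort, SequenceNum, Acknowledgment,
# HdrLen/0/Flags, AdvertisedWindow, Checksum, UrgPtr
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment):
    """Decode the fixed header fields and locate the payload."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urg_ptr) = TCP_HEADER.unpack_from(segment)
    header_len_bytes = (offset_flags >> 12) * 4  # HdrLen counts 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if offset_flags & bit}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len_bytes": header_len_bytes, "flags": flags,
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "payload": segment[header_len_bytes:],
    }


# A SYN + ACK header with no options (HdrLen = 5 words) and two data bytes.
header = TCP_HEADER.pack(80, 28900, 300, 101, (5 << 12) | 0x12, 4096, 0, 0)
parsed = parse_tcp_header(header + b"hi")
```

Using HdrLen rather than a fixed offset to find the payload is what lets the few segments that carry options be parsed by the same code.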

42

Segment Format (cont)

• Each connection identified with 4-tuple:
  – (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
• Sliding window + flow control
  – Acknowledgment, SequenceNum, AdvertisedWindow
• Flags (6 bits)
  – SYN, FIN, RESET, PUSH, URG, ACK
• Checksum
  – pseudo header + TCP header + data
• HdrLen
  – Header length in 32-bit words

[Figure: the sender transmits Data (SequenceNum) to the receiver; the receiver returns Acknowledgment + AdvertisedWindow.]

43

Flag/Code Bits in TCP Segment

• 6-bit field
• Tells how to interpret other fields in the header
• RST bit is set to abort the connection in an unexpected situation (e.g., arrival of an unexpected segment)

Bit (left to right)   Meaning if bit set to 1
URG                   Urgent pointer field is valid
ACK                   Acknowledgement field is valid
PSH                   This segment requests a push
RST                   Reset the connection
SYN                   Synchronize sequence numbers
FIN                   Sender has reached end of its byte stream

44

Out of Band Data

• Data to be sent immediately
• Example: interrupting or aborting the program in a remote login session
• TCP option: specify data as urgent
  - Set URG code bit
  - Set the urgent pointer (UrgPtr) specifying where urgent data ends in the segment (urgent data is contained at the front of the segment body)
• TCP tells the application to return to normal mode after consumption of urgent data
• PUSH flag tells the receiving TCP to notify the receiving process about the PUSH operation

45

TCP Checksum Computation

• TCP prepends a pseudo header to the segment
• Appends enough zero bits to make the segment a multiple of 16 bits
• Computes 16-bit checksum over the entire result
• Uses 16-bit arithmetic
• Takes one's complement of the one's complement sum
• Notes:
  - TCP does not count the pseudo header or padding in the segment length
  - TCP does not transmit the pseudo header or padding
  - TCP assumes the checksum field itself is zero before computation

46

Format of Pseudoheader used in TCP computation

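• PROTOCOL field - for IP datagrams carrying TCP, it is 6
• TCP LENGTH field - total length of TCP segment including TCP header

Pseudo-header layout (32 bits per row; bit positions 0, 8, 16, 31):
  SOURCE IP ADDRESS
  DESTINATION IP ADDRESS
  ZERO (8 bits) | PROTOCOL (8 bits) | TCP LENGTH (16 bits)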

47

Purpose of Pseudo Header

• To verify the segment has reached its correct destination (port number and destination IP address)
• IP passes the source and destination IP addresses from the datagram to the receiving TCP
• Receiving TCP performs the same computation

48

Timeout and Retransmission

• Note: TCP software must accommodate both the vast differences in time required to reach various destinations and the changes in time required to reach a given destination as traffic load varies.
• Uses an adaptive retransmission algorithm
  - Records the time at which each segment is sent
  - Records the time at which an acknowledgement arrives for the segment
  - From the two times, it computes a sample round trip time (RTT)

49

Adaptive Formulas

• From the sample RTT and old RTT, it estimates the round-trip time for the next segment
  - Example: RTT = (α * Old_RTT) + ((1 - α) * New_Round_Trip_Sample), where 0 < α < 1
• Calculates a timeout value as a function of the current round trip estimate
  - Example: Timeout = β * RTT, where β > 1 (recommended value for β is 2)
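A short sketch of the weighted-average estimate and the timeout derived from it. The weight 0.875 used as the default here is an assumed example value that satisfies the stated constraint (between 0 and 1); only the factor 2 for the timeout comes from the slide.

```python
def update_rtt(old_rtt, sample, alpha=0.875):
    """Weighted average: alpha close to 1 makes the estimate change slowly."""
    return alpha * old_rtt + (1 - alpha) * sample


def retransmission_timeout(rtt, beta=2.0):
    """Timeout as a constant multiple (greater than 1) of the estimate."""
    return beta * rtt


rtt = 100.0                      # current estimate, in milliseconds
rtt = update_rtt(rtt, 200.0)     # one slow sample moves the estimate a little
timeout = retransmission_timeout(rtt)
```

A single 200 ms sample moves a 100 ms estimate only to 112.5 ms, which shows how the weight damps the effect of one unusual measurement.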

50

Accurate Measurement Of Round Trip Samples

• TCP uses a cumulative acknowledgement
• ACK refers to data received but not to the instance of a specific segment
• In case of retransmission, both the original and the retransmitted segments carry exactly the same data
• TCP cannot tell whether an ACK belongs to the original or the retransmitted segment
• The problem is called acknowledgement ambiguity

51

Questions

• How is TCP going to compute an accurate sample RTT?
• What can happen if it computes RTT from the time of the original transmission? (The timeout grows larger and larger and becomes unstable. Why?)
• What can happen if it computes RTT from the time of the most recent transmission? (The estimate grows smaller and smaller but stabilizes at 1/2 of the correct value, so TCP sends each segment twice even though no loss occurs.)

52

Karn's Algorithm And Timer Backoff

• If the original transmission and the most recent transmission both fail to provide accurate round trip times, what should TCP do?
• Simple solution: avoid ambiguous ACKs altogether; only adjust the estimated round trip for unambiguous ACKs. Known as Karn's algorithm
• It can lead to failure as well. When? How?
  - When TCP sends a segment after a sharp increase in delay and keeps computing the timeout from the existing round-trip estimate. The timeout will be too small and will force a retransmission. Because every ACK is then ambiguous, TCP never updates the estimate and the cycle continues.

53

Remedy

• Sender combines retransmission timeouts with a timer backoff strategy
• Each time it must retransmit a segment, TCP increases the timeout up to an upper bound (larger than the delay along any path in the internet)
• Example implementation: new_timeout = γ * timeout, where typically γ = 2

54

Karn/Partridge Algorithm

• Do not sample RTT when retransmitting
• Double timeout after each retransmission

[Figure: two sender/receiver timelines, each showing an original transmission, a retransmission, and an ACK. In one, SampleRTT is measured from the original transmission to the ACK; in the other, from the retransmission to the ACK.]

55

Final Form of Karn's Algorithm

• When computing the round trip estimate, ignore samples that correspond to retransmitted segments, but use a backoff strategy, and retain the timeout value from a retransmitted packet for subsequent packets until a valid sample is observed.
• Also known as Karn/Partridge algorithm
• Experience shows that Karn's algorithm works well even in networks with high packet loss
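The rule above can be sketched as a small timer object. The class name, the weight 0.875, and the 64-second upper bound are assumptions chosen for illustration; doubling on retransmission and ignoring ambiguous samples follow the slides.

```python
class RetransmitTimer:
    """Round-trip estimate with Karn's rule and exponential timer backoff."""

    def __init__(self, rtt_estimate, alpha=0.875, beta=2.0, max_timeout=64.0):
        self.rtt = rtt_estimate
        self.alpha = alpha
        self.beta = beta
        self.max_timeout = max_timeout  # assumed upper bound, in seconds
        self.timeout = beta * rtt_estimate

    def on_retransmit(self):
        """Back off: double the timeout, up to the upper bound."""
        self.timeout = min(self.timeout * 2, self.max_timeout)

    def on_ack(self, sample_rtt, was_retransmitted):
        """Use the sample only if the acknowledged segment was sent once."""
        if was_retransmitted:
            return  # ambiguous ACK: keep the backed-off timeout unchanged
        self.rtt = self.alpha * self.rtt + (1 - self.alpha) * sample_rtt
        self.timeout = self.beta * self.rtt


timer = RetransmitTimer(rtt_estimate=1.0)   # timeout starts at 2.0 s
timer.on_retransmit()                        # timeout backs off to 4.0 s
timer.on_ack(5.0, was_retransmitted=True)    # ignored: estimate unchanged
backed_off = timer.timeout
timer.on_ack(1.0, was_retransmitted=False)   # valid sample: timeout recomputed
```

Keeping the backed-off timeout until a valid sample arrives is what breaks the cycle in which a stale estimate forces every segment to be retransmitted.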

56

Responding To High Variance In Delay

• Research shows:
  - Computations described above do not adapt to a wide range of variations in delay
• 1989 specification for TCP
  - Estimate both the average round-trip time and the variance
  - Use the estimated variance in place of the constant β
• Can adapt to a wider range of variation in delay and yield substantially higher throughput

57

New Set of Equations

• DIFF = SAMPLE - Old_RTT
• Smoothed_RTT = Old_RTT + δ * DIFF
• DEV = Old_DEV + ρ * (|DIFF| - Old_DEV)
• Timeout = Smoothed_RTT + η * DEV

where
  - DEV is the estimated mean deviation
  - δ is a fraction between 0 and 1 (an inverse power of 2)
  - ρ is a fraction between 0 and 1
  - η is a controlling factor (η = 3; for 4.3BSD it was 2)

58

Example

• Given
  – Old_RTT = 40 ms
  – Sample_RTT = 25 ms
  – Old_DEV = 10 ms
  – δ = 1/4, ρ = 0.4, and η = 3
• Calculate Timeout
• Solution
  – DIFF = 25 - 40 = -15 ms
  – Smoothed_RTT = 40 + (1/4) * (-15) = 36.25 ms
  – DEV = 10 + 0.4 * (15 - 10) = 12 ms
  – Timeout = 36.25 + 3 * 12 = 72.25 ms
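The worked example above can be checked with a few lines of code. The function name and the parameter names `delta`, `rho`, and `eta` are illustrative; their default values are the constants used in the solution (1/4, 0.4, and 3).

```python
def variance_based_timeout(old_rtt, old_dev, sample, delta=0.25, rho=0.4, eta=3):
    """Return (smoothed RTT, mean deviation, timeout), all in the input units."""
    diff = sample - old_rtt
    smoothed_rtt = old_rtt + delta * diff
    dev = old_dev + rho * (abs(diff) - old_dev)
    timeout = smoothed_rtt + eta * dev
    return smoothed_rtt, dev, timeout


smoothed_rtt, dev, timeout = variance_based_timeout(
    old_rtt=40.0, old_dev=10.0, sample=25.0
)
print(f"Smoothed_RTT = {smoothed_rtt:.2f} ms, "
      f"DEV = {dev:.2f} ms, Timeout = {timeout:.2f} ms")
```

Note that although the sample is faster than the old estimate, the timeout stays well above it because the large difference raises the deviation term.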

59

Congestion

• Condition of severe delay
• Caused by an overload of datagrams at one or more switching points (routers)
• Retransmissions aggravate congestion instead of alleviating it. Why?

60

Congestion Collapse

• A condition at which the entire network becomes useless due to congestion (a stalemate situation).

61

How to Avoid Congestion Collapse

• TCP must reduce transmission rates when congestion occurs

• Routers watch queue lengths and use techniques like ICMP source quench to inform hosts that congestion has occurred

62

TCP Standard Techniques To Avoid Congestion

• Remember the size of the receiver's window
• Maintain a second limit, called the congestion window limit, so that:
  Allowed_window = min(receiver_advertisement, congestion_window)
• Use the slow-start algorithm during the recovery stage and the multiplicative decrease algorithm for avoidance of congestion

63

Multiplicative Decrease Congestion Avoidance Algorithm

Upon loss of a segment:
  – Reduce the congestion window by half (down to a minimum of at least one segment)
  – For those segments that remain in the allowed window, back off the retransmission timer exponentially

64

Slow-Start (Additive) Recovery

Whenever starting traffic on a new connection or increasing traffic after a period of congestion, start the congestion window at the size of a single segment and increase the congestion window by one segment each time an acknowledgement arrives (one more segment for each acknowledged one).

Notice that in a favorable situation (no congestion and no loss), the window grows as 1, 2, 4, 8, 16, ..., 2^n segments, doubling once per round trip.

65

Congestion Avoidance Phase

Once the congestion window reaches one half of its original size before congestion, TCP slows down the rate of increment. It increases the congestion window by 1 only if all segments in the window have been acknowledged.
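The interplay of slow start, the slower increase, and the reaction to loss can be sketched per round trip. This is a simplified model under stated assumptions: the window is counted in whole segments, one update happens per round trip, a loss restarts slow start from one segment, and the name `ssthresh` (the threshold at half the window before congestion) is conventional but does not appear on the slides.

```python
def next_cwnd(cwnd, ssthresh, loss):
    """Advance the congestion window by one round trip; sizes in segments."""
    if loss:
        # Remember half of the window in use when congestion was detected,
        # then restart from a single segment.
        return 1, max(cwnd // 2, 1)
    if cwnd < ssthresh:
        # Slow start: one extra segment per ACK doubles the window per RTT.
        return min(cwnd * 2, ssthresh), ssthresh
    # Congestion avoidance: one extra segment once the whole window is ACKed.
    return cwnd + 1, ssthresh


def trace(initial_ssthresh, loss_rounds, rounds):
    """Window size at the start and after each of the given number of rounds."""
    cwnd, ssthresh = 1, initial_ssthresh
    history = [cwnd]
    for round_number in range(1, rounds + 1):
        cwnd, ssthresh = next_cwnd(cwnd, ssthresh, round_number in loss_rounds)
        history.append(cwnd)
    return history


history = trace(initial_ssthresh=16, loss_rounds={7}, rounds=11)
```

In this trace the window doubles up to 16, creeps up by one per round trip, collapses to 1 at the loss, and then doubles again only as far as 9, half of the 18 segments in flight when the loss occurred.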

66

Sliding Window Revisited

• Sending side
  – LastByteAcked <= LastByteSent
  – LastByteSent <= LastByteWritten
  – buffer bytes between LastByteAcked and LastByteWritten
• Receiving side
  – LastByteRead < NextByteExpected
  – NextByteExpected <= LastByteRcvd + 1
  – buffer bytes between LastByteRead and LastByteRcvd

[Figure: on the sending side, the application writes up to LastByteWritten while TCP tracks LastByteAcked and LastByteSent in the send buffer; on the receiving side, the application reads up to LastByteRead while TCP tracks NextByteExpected and LastByteRcvd in the receive buffer.]

67

Flow Control

• Send buffer size: MaxSendBuffer
• Receive buffer size: MaxRcvBuffer
• Receiving side
  – LastByteRcvd - LastByteRead <= MaxRcvBuffer
  – AdvertisedWindow = MaxRcvBuffer - (NextByteExpected - LastByteRead)
• Sending side
  – LastByteSent - LastByteAcked <= AdvertisedWindow
  – EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
  – LastByteWritten - LastByteAcked <= MaxSendBuffer
  – block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes the sender is trying to write
• Always send ACK in response to an arriving data segment
• Persist when AdvertisedWindow = 0
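The window arithmetic above translates directly into code. The function names are hypothetical, and the byte counters in the example are made-up values chosen only to exercise the formulas.

```python
def advertised_window(max_rcv_buffer, next_byte_expected, last_byte_read):
    """Receive buffer space the receiver can still offer to the sender."""
    return max_rcv_buffer - (next_byte_expected - last_byte_read)


def effective_window(advertised, last_byte_sent, last_byte_acked):
    """How many more bytes the sender may transmit right now."""
    return advertised - (last_byte_sent - last_byte_acked)


def writer_must_block(last_byte_written, last_byte_acked, y, max_send_buffer):
    """True if writing y more bytes would overflow the send buffer."""
    return (last_byte_written - last_byte_acked) + y > max_send_buffer


window = advertised_window(max_rcv_buffer=4096,
                           next_byte_expected=3000, last_byte_read=1000)
can_send = effective_window(window, last_byte_sent=5000, last_byte_acked=4000)
blocked = writer_must_block(last_byte_written=9000, last_byte_acked=4000,
                            y=4000, max_send_buffer=8192)
```

Here the receiver advertises 2096 bytes, 1000 of which are already in flight, so the sender may transmit 1096 more; the application's 4000-byte write has to wait because the send buffer would overflow.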

68

Smart Sender/Dumb Receiver Rule

• Problem: suppose
  – Receiver advertises AdvertisedWindow = 0 in an ACK segment
  – Sender is not permitted to send any more segments, and the receiver does not spontaneously send any non-data segment
  – How can the sender discover that the advertised window is no longer 0?
• Solution
  – Sender side persists in sending a segment with 1 byte of data

69

Protection Against Wrap Around

• 32-bit SequenceNum

Bandwidth            Time Until Wrap Around
T1 (1.5 Mbps)        6.4 hours
Ethernet (10 Mbps)   57 minutes
T3 (45 Mbps)         13 minutes
FDDI (100 Mbps)      6 minutes
STS-3 (155 Mbps)     4 minutes
STS-12 (622 Mbps)    55 seconds
STS-24 (1.2 Gbps)    28 seconds

• Not enough for STS-12 and STS-24, because the recommendation is that the sequence number should not wrap around within a 120-second period
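The table entries follow from dividing the size of the 32-bit sequence space (one sequence number per byte) by the link rate. A small sketch, with a hypothetical function name:

```python
def wraparound_seconds(bits_per_second):
    """Time to send 2**32 bytes, i.e. to consume the whole sequence space."""
    return (2 ** 32) * 8 / bits_per_second


t1_hours = wraparound_seconds(1.5e6) / 3600
ethernet_minutes = wraparound_seconds(10e6) / 60
sts12_seconds = wraparound_seconds(622e6)
sts24_seconds = wraparound_seconds(1.2e9)
```

The results agree with the table to within rounding, and the two fastest links come in under the 120-second limit.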

70

Keeping the Pipe Full

• 16-bit AdvertisedWindow
• Assume RTT = 100 ms

Bandwidth            Delay x Bandwidth Product
T1 (1.5 Mbps)        18 KB
Ethernet (10 Mbps)   122 KB
T3 (45 Mbps)         549 KB
FDDI (100 Mbps)      1.2 MB
STS-3 (155 Mbps)     1.8 MB
STS-12 (622 Mbps)    7.4 MB
STS-24 (1.2 Gbps)    14.8 MB
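The products in the table can be reproduced and compared with the largest value a 16-bit AdvertisedWindow can express (65,535 bytes). The function name is illustrative, and KB is taken as 1024 bytes.

```python
MAX_16_BIT_WINDOW = 2 ** 16 - 1  # largest window a 16-bit field can advertise


def pipe_bytes(bits_per_second, rtt_ms):
    """Delay x bandwidth product: bytes in flight needed to keep the pipe full."""
    return bits_per_second * rtt_ms / 8000


t1_kb = pipe_bytes(1.5e6, 100) / 1024
ethernet_kb = pipe_bytes(10e6, 100) / 1024
ethernet_fits = pipe_bytes(10e6, 100) <= MAX_16_BIT_WINDOW
t1_fits = pipe_bytes(1.5e6, 100) <= MAX_16_BIT_WINDOW
```

Of the links listed, only T1 can be kept full with a 16-bit window at this RTT.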

71

TCP Extensions

• Implemented as header options
• Store timestamp in outgoing segments
• Extend sequence space with 32-bit timestamp
• Shift (scale) advertised window

72

Initial Sequence Numbers

• TCP software on both sides agrees on initial sequence numbers
• TCP software in each machine chooses an initial sequence number at random
• Sequence numbers are sent and acknowledged during the handshake

73

TCP State Machine

• TCP operation can be modeled as a finite state machine
• Terminology
  - Passive open command: to wait for a connection from another machine
  - Active open command: to initiate a connection
  - Maximum segment lifetime: maximum time an old segment can remain alive in an internet (120 seconds by current standard)

74

State Transition Diagram

[Figure: TCP state transition diagram. States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK. Arcs are labeled event/action and include Passive open, Active open/SYN, Send/SYN, SYN/SYN + ACK, SYN + ACK/ACK, ACK, Close, Close/FIN, FIN/ACK, ACK + FIN/ACK, and a timeout after two segment lifetimes that leads from TIME_WAIT back to CLOSED.]

75

State Diagram Details

• All states above ESTABLISHED are involved in opening a connection
• All states below ESTABLISHED are involved in closing a connection
• Actual data transfer occurs in the ESTABLISHED state
• A state transition occurs when
  – a segment arrives from the peer or
  – the local application process invokes an operation on TCP
• Each arc is labeled in the form event/action

76

State Diagram Details (Contd)

• Establishing a connection (3-way handshake)
  – Client state transitions:
    • CLOSED -> SYN_SENT -> ESTABLISHED
  – Server state transitions:
    • LISTEN -> SYN_RCVD -> ESTABLISHED
• Three common ways to get a connection from ESTABLISHED state to CLOSED state
  – This side closes first
    • ESTABLISHED -> FIN_WAIT_1 -> FIN_WAIT_2 -> TIME_WAIT -> CLOSED
  – The other side closes first
    • ESTABLISHED -> CLOSE_WAIT -> LAST_ACK -> CLOSED
  – Both sides close at the same time
    • ESTABLISHED -> FIN_WAIT_1 -> CLOSING -> TIME_WAIT -> CLOSED
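The paths listed above can be encoded as a transition table and replayed. This is a partial sketch: it covers only the transitions named on these slides, and the event names (such as `recv_fin` and `timeout_2msl`) are hypothetical labels, not part of any API.

```python
# (current state, event) -> next state; events are illustrative labels.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("CLOSED", "active_open"): "SYN_SENT",
    ("LISTEN", "recv_syn"): "SYN_RCVD",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}


def run(state, events):
    """Follow a sequence of events; an unknown pair raises KeyError."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state


client_open = run("CLOSED", ["active_open", "recv_syn_ack"])
server_open = run("CLOSED", ["passive_open", "recv_syn", "recv_ack"])
this_side_first = run("ESTABLISHED",
                      ["close", "recv_ack", "recv_fin", "timeout_2msl"])
other_side_first = run("ESTABLISHED", ["recv_fin", "close", "recv_ack"])
simultaneous = run("ESTABLISHED",
                   ["close", "recv_fin", "recv_ack", "timeout_2msl"])
```

Only the side that passes through TIME_WAIT has to wait two segment lifetimes before reaching CLOSED; the side that closes second goes straight from LAST_ACK to CLOSED.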

77

TCP Performance

• TCP is a complex protocol
• Common misconception
  - Since it is complex, the code must be cumbersome and inefficient
• But experiments show: the same TCP that operates over the global Internet can deliver 8 Mbps of sustained throughput of user data between two workstations on a 10 Mbps Ethernet.