Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger...

45
Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl

Transcript of Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger...

Page 1: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

Computer Networks GroupUniversität Paderborn

Computer NetworksChapter 8: Transport layer

Holger Karl

Page 2: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 2

Goal of this chapter

We will discuss the mechanisms in addition to congestion control that are required to build a reliable, connection-oriented, end-to-end transport protocol

With TCP as the evident case study, intermixed with the presentation

We will draw heavily on the material from the link layer, as there many similar problems are solved in similar fashion – e.g., sliding window for error control

A brief discussion of other transport protocols complements the chapter

Namely, UDP and RPC

Page 3: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 3

Overview

Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues

Page 4: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 4

Transport protocols – Services to upper layer

Connection-less and connection-oriented transport service Connection management necessary as auxiliary service – setup

and teardown

Reliable or unreliable transport In-order, all packets

Be a good citizen in the network – perform congestion control

Allow several transport endpoints in a single host Demultiplexing

Support different interaction models Byte stream, messages, remote procedure invocation

Page 5: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 5

Addressing

Provide multiple service access points to multiplex several applications

SAPs can identify connections or data flows

E.g., “port numbers” Dynamically allocated Predefined for “well-

known services” – port 80 for Web server

Page 6: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 6

Overview

Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues

Page 7: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 7

Connection establishment

How to establish a joint context, a connection, between sender and receiver?

Note: only relevant in end-system, network layer assumed to be connection-less

Naïve solution: Sender sends a CONNECTION REQUEST Receiver answers by a CONNECTION

ACCEPTED Sender proceeds once that message is

received

CONNECTIONREQUEST

CONN

ACCEPTED

DATA

Page 8: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 8

Failure of naïve solution

Naïve solution fails in realistic networks

Where packets can be lost, stored/reorder, and duplicated

Example failure scenario All packets are duplicated and

delayed Due to congestion, errors, ... and

re-routing, e.g. In a banking application, e.g. Result: two independent

transactions performed, where only one was intended

Problem are delayed duplicates

Transfer money CONNECTIONREQUEST

CONN

ACCEPTED

TRANSFER

OK

OK

CONN

ACCEPTED

??? DROP

OK

??? DROP

OK

Page 9: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 9

Adding an additional handshake

First idea: the sender has to re-confirm to the receiver that it actually wants to set up this connection

! Add a third message to the connection setup phase

Three-way handshake This third message can

already carry data

CONNECTIONREQUEST

CONN

ACCEPTED

CONN CONFIRMED

Page 10: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 10

Is three-way handshake sufficient?

No, it does not protect against delayed duplicates! Problem: If both the connection request and the connection

confirmation are duplicated and delayed, receiver again has no way to ascertain whether this is fresh or an old copy

Solution: Have the sender answer a question that the receiver asks!

Actually: Put sequence numbers into connection request connection acknowledgement, and connection confirmation

Have to be copied by the receiving party to the other side

Connection only established if the correct number is provided

Page 11: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 11

Three-way handshake + sequence numbers

Two examples for critical cases (which are handled correctly):

Connection request appears as an old duplicate:

Connection request & confirmation appear as old duplicates:

Page 12: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 12

Connection setup – further issues

Further problems due to crashing nodes, need some memory

Sequence numbers negotiated here are also used for the following data packets

Terminology for TCP Connection setup – SYN (synchronize) packet Connection accepted – SYN/ACK packet (because both the

previous sequence number is acknowledged and a new sequence number from the receiver is proposed)

Connection confirmation – ACK packet (combined with DATA, as a rule)

Problem: SYN attack – flooding with nonsense SYN packets

Page 13: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 13

Connection identification in TCP

A TCP connection is setup Between a single sender and a single receiver More precisely, between application processes running on these

systems TCP can multiplex several such connections over the network

layer, using the port numbers as Transport SAP identifiers

A TCP connection is thus identified by a four tuple:

(Source Port, Source IP Address, Destination Port, Destination IP Address)

Page 14: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 14

Overview

Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues

Page 15: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 15

Connection release

Once connection context between two peers is established, releasing a connection should be easy

Goal: Only release connection when both peers have agreed that they have received all data and have nothing more to say

I.e., both sides must have invoked a “Close”-like service primitive

It fact, it is impossible Problem: How to be sure that the other peer knows that you know

that it knows that you know … that all data have been transmitted and that the connection can now safely be terminated?

Analogy: Two army problem

Page 16: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 16

Two army problem

Coordinated attack Two armies form up for an attack against each other One army is split into two parts that have to attack together – alone they

will lose Commanders of the parts communicate via messengers who can be

captured

Which rules shall the commanders use to agree on an attack date? Provably unsolvable if the network can loose messages

How to coordinate?

Page 17: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 17

Connection release in practice

Two army problem equivalent to connection release But: when releasing a connection, bigger risks can be

taken

Usual approach: Three-wayhandshake again

Send disconnect request (DR), set timer, wait for DR from peer, acknowledge it

Page 18: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 18

Problem cases for connection release with 3WHS

Lost ACK solved by (optimistic) timer in Host 2

Lost second DR solved by retransmission of first DR

Timer solves (optimistically) case when 2nd DR and ACK are lost

Page 19: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 19

State diagram of a TCP connection

TCP uses three-way handshake for connection setup and release FIN, FIN/ACK for release Complicated by ability to have

asymmetric, have-open connections and data in transit while close is in progress

(Partial) state diagram Expresses both

“server” and “client”aspects

Each has a separate copy

Legend: event/action, segments all caps, local events normal capitalization

CLOSED

LISTEN

SYN_RCVD SYN_SENT

ESTABLISHED

CLOSE_WAIT

LAST_ACKCLOSING

TIME_WAIT

FIN_WAIT_2

FIN_WAIT_1

Passive open Close

Send/SYNSYN/SYN + ACK

SYN + ACK/ACK

SYN/SYN + ACK

ACK

Close/FIN

FIN/ACKClose/FIN

FIN/ACKACK + FIN/ACK Timeout after two segment lifetimes

FIN/ACK

ACK

ACK

ACK

Close/FIN

Close

CLOSED

Active open/SYN

CLOSED

LISTEN

SYN_RCVD SYN_SENT

ESTABLISHED

CLOSE_WAIT

LAST_ACKCLOSING

TIME_WAIT

FIN_WAIT_2

FIN_WAIT_1

Passive open Close

Send/SYNSYN/SYN + ACK

SYN + ACK/ACK

SYN/SYN + ACK

ACK

Close/FIN

FIN/ACKClose/FIN

FIN/ACKACK + FIN/ACK Timeout after two segment lifetimes

FIN/ACK

ACK

ACK

ACK

Close/FIN

Close

CLOSED

Active open/SYN

Page 20: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 20

Overview

Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues

Page 21: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 21

Flow control

Recall: Flow control = prevent a fast sender from overrunning a slow receiver

Similar issue in link and transport layer

In transport layer more complicated Many connections, need to adapt the amount of buffer per

connection dynamically (instead of just simply allocating a fixed amount of buffer space per outgoing link)

Transport layer PDUs can differ widely in size, unlike link layer frames

Network’s packet buffering capability clouds the picture

Page 22: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 22

Flow control – buffer allocation

In order to support outstanding packets, the sender either Has to rely on the receiver to process packets as they come in

(packets must not be reordered) – unrealistic, or Has to assume that the receiver has sufficient buffer space

available

The more buffer, the more outstanding packets Necessary to obtain highly efficient transmission, recall bandwidth-

delay product!

How does sender have buffer assurance? Sender can request buffer space Receiver can tell sender: “I have X buffers available at the

moment” For sliding window protocols: Influence size of sender’s send window

Page 23: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 23

Example of flow control with ACK/permit separation

Arrows show direction of transmission, “…” indicates lost packet Potential deadlock in step 16 when control PDU is lost and not

retransmitted

Page 24: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 24

Flow control – permits and acknowledgements

Distinguish Permits (“Receiver has buffer space, go ahead and send more

data”) Acknowledgements (“Receiver has received certain packets”)

Should be separated in real-world protocols!

Can be combined with dynamically changing buffer space at the received

Due to, e.g., different speed with which the application actually retrieves received data from the transport layer

Example: TCP

Page 25: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 25

Send and receive buffers in TCP

TCP maintains buffer at Sender, to assist in error control Receiver, to store packets not yet retrieved by application or

received out of order Old TCP implementations used GoBack-N, discarded out-of-order

packets

Sending application

LastByteWrittenTCP

LastByteSentLastByteAcked

Receiving application

LastByteReadTCP

LastByteRcvdNextByteExpected

Sender Receiver

Page 26: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 26

TCP flow control: Advertised window

In TCP, receiver can advertise size of its receiving buffer Buffer space occupied:

(NextByteExpected-1) – LastByteRead Maximum buffer space available: MaxRcvdBuffer Advertised buffer space (Advertised window):

MaxRcvdBuffer – ((NextByteExpected-1) – LastByteRead)

Recall: Advertised window limits the amount of data that a sender will inject into the network

TCP sender ensures that LastByteSent – LastByteAcked · AdvertisedWindow

Equvialently:

EffectiveWindow = AdvertisedWindow – (LastByteSent – LastByteAcked)

Page 27: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 27

Nagle’s algorithm – self-clocking and windows

Recall TCP self-clocking: Arrival of an ACK is an indication that new data can be injected into the network

What happens when an ACK for only small amount of data (e.g., 1 byte arrives)?

Send immediately? Network will be burdened by small packets (“silly window syndrome”)

Nagle’s algorithm describes how much data TCP is allowed to send

When application produces data to sendif both available data and advertised window · MSS send a full segmentelse if there is unacked data in flight, buffer new data until

MSS is full else send all the new data now

Page 28: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 28

Overview

Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues

Page 29: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 29

Timeout computations

Timeouts necessary to protect against lost packets Timeouts should reflect round-trip time between sender

and receiver Problem: RTTs can be highly variable, range over several orders

of magnitude Has to be adapted dynamically

Simple approach: Keep a running average of RTTs Computed by an autoregressive model:

EstimatedRTTn = EstimatedRTTn-1 + (1-) RTTSamplen

Timeout = 2 EstimatedRTTas a conservative choice

Parameter smoothes estimation ( = 0.8 … 0.9, typically)

Page 30: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 30

Problem: ACKs do not refer to a transmission!

Simple algorithm described above cannot obtain correct RTT samples if packets have been retransmitted

ACKs refer to data/sequence numbers, not to individual packets!

Two examples: Sender Receiver

Original transmission

ACK

Sam

pleR

TT

Retransmission

Sender Receiver

Original transmission

ACK

Sam

pleR

TT

Retransmission

Solution: Do not take RTT samples for retransmitted packets (Karn/Partridge algorithm)

Page 31: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 31

Problem: Variance not considered

Previous estimation does not consider variance of RTTs If low, estimate is good; no need to set timeout = 2 * RTT If high, be more careful when computing timeout from RTTs;

estimate is likely to be of low quality

Jacobsen/Karels algorithm for new estimation: Differencen = SampleRTTn – EstimatedRTTn-1

EstimatedRTTn = EstimatedRTTn-1 + ( Differencen)

Deviationn = Deviationn-1 + (|Difference|n – Deviationn-1)

Timeoutn = EstimatedRTTn + Deviationn

2 (0,1) a constant, typically = 1, = 4 (in TCP) Deviation is an upper bound of standard deviation, but easier to

compute

Page 32: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 32

Timer and packet loss

What happens after a packet loss? Recall Ethernet behavior: Become more and more careful, back off Same idea here: After packet loss detected by timeout, use

successively larger timeout values Exponential backoff: Double timeout value for each additional

retransmission

The estimation of RTT and “basic” timeout value is performed as described on previous slide

The multiplicative factor for exponential backoff is reset upon ACK arrival

Page 33: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 33

Timeouts & timer granularity

Timeout calculation can suffer from coarse-grained timer resolution in the operating system

Example: some Unixes only have 500 ms resolution

Also: firing a timer heavily depends on relatively precise timer interrupts to be available

Sometimes, timeouts can happen only seconds after a packet has been transmitted – far too sluggish

Page 34: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 34

Overview

Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues

Page 35: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 35

TCP – Summary

TCP consists of Reliable byte stream – error control via GoBack-N

or Selective Repeat (depending on version) Congestion control – window-based, AIMD, slow start, congestion

threshold Flow control – advertised receiver window Connection management – three-way handshake for setup and

teardown

TCP is perhaps the single most complicated and subtle protocol in the internet

Many little details and extensions are not discussed here Interaction of TCP with other layers is more complicated than it

looks (e.g., wireless) because of hidden, implicit assumptions

pRTT

MSSBW

*

*93.0

• max. TCP BandWidth• Max. Segment Size• Round Trip Time• loss probability

Page 36: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 36

TCP fairness & TCP friendliness

TCP attempts to Adjust dynamically to the available bandwidth Fairly share this bandwidth among all connections I.e.: If n connections share a given bottleneck link, each

connection obtains 1/n of its bandwidth

Interaction with other protocols Bottleneck bandwidth depends on load of other protocols as well,

e.g., UDP – which is NOT congestion-controlled I.e., UDP traffic can “push aside” TCP traffic

Consequence: Transport protocols should be TCP friendly They should not consume more bandwidth than a TCP connection

in a comparable situation

Page 37: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 37

TCP header

Options (variable)

Data

Checksum

SrcPort DstPort

HdrLen 0 Flags

UrgPtr

AdvertisedWindow

SequenceNum

Acknowledgment

0 4 10 16 31

Page 38: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 38

Overview

Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues

Page 39: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 39

A simple demultiplexer transport protocol – UDP

An example for an unreliable, datagram protocol: User Datagram Protocol (UDP)

Only real function: (de)multiplex several data flows onto the IP layer and back to the applications

In addition: ensures packet’s correctness Checksum out of UDP header + data + “pseudoheader” (pivotal IP

header fields)

SrcPort DstPort

Checksum Length

Data

0 16 31

UDP header

Page 40: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 40

Remote Procedure Call (RPC)

UDP and TCP have essentially simple semantics: Transport data from A to B

A bit like “goto” in sequential languages: no structure in data flow

More complex interactions can be directly captured in communication primitives

Example: Invocation of a procedure as a primitive, as opposed to two separate gotos

! Remote procedure call Remote object invocation is really the same thing from a

communication perspective

Page 41: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 41

Remote procedure call – Structure

Important goal: transparency to caller/callee Achieved (partially) by stubs

Client Server

Request

Reply

Computing

Blocked

Blocked

Blocked

Caller(client)

Clientstub

RPCprotocol

Returnvalue

Arguments

ReplyRequest

Callee(server)

Serverstub

RPCprotocol

Returnvalue

Arguments

ReplyRequest

Network

Page 42: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 42

Overview

Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues

Page 43: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 43

A few words on performance

Performance is still important … Measuring performance

Measure relevant metrics Make sure that the sample size is big enough Make sure that samples are representative Be careful when using a coarse-grained clock Be sure that nothing unexpected is going on during your tests Caching can wreak havoc with measurements Understand what your are measuring and what is going on Be careful about extrapolating results Use a proper experimental design to finish in one human lifetime

Page 44: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 44

System design for better performance

CPU speed is more important than network speed Or interfaces like NIC $ CPU

Reduce packet count to reduce software overhead Minimize context switches Minimize copying You can buy more bandwidth but not lower delay Avoiding congestion is better than recovering from it Avoid timeouts Optimize for the common case Depending on network: design for speed, not utilization

Think about the right performance metric, it can be, e.g., energy efficiency rather than throughput or delay

Page 45: Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger Karl.

WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 45

Conclusion

Transport protocols can be anything from trivial to highly complex, depending on the purpose they serve

They determine to a large degree the dynamics of a network and – in particular – its stability

It is trivial to build “faster” than TCP protocols, but they are unstable

The interdependencies of various mechanisms in a transport protocols can be very subtle with big consequences

E.g., fairness, coexistence of different TCP versions