UDT: UDP based Data Transfer


Page 1: UDT: UDP based Data Transfer

UDT: UDP based Data Transfer

Yunhong Gu & Robert Grossman
Laboratory for Advanced Computing
University of Illinois at Chicago

Németh Felicián, Tarján Péter

The work is supported by the 2/032/2004 ELTE-BUTE-Ericsson NKFP project on Research and Developments of Tools Supporting Optimal Usage of Heterogeneous Communication Networks

Page 2: UDT: UDP based Data Transfer

02/17/2004 PFLDnet 2004 2

Outline

• Background
• UDT Protocol
• UDT Congestion Control
• Implementation/Simulation Results
• Summary

Page 3: UDT: UDP based Data Transfer

Background

• Distributed data-intensive applications over wide-area optical networks: grid computing, access to bulk scientific data, data mining, high-resolution video, etc.
• Transport protocol support:
  • Efficient and fair bandwidth utilization
  • TCP does not work!

Page 4: UDT: UDP based Data Transfer

Requirements for the New Protocol

• FAST: high utilization of the abundant bandwidth, with either single or multiplexed connections
• FAIR: intra-protocol fairness, independent of RTT
• FRIENDLY: TCP compatibility

Page 5: UDT: UDP based Data Transfer

Use Scenarios

• A small number of sources share abundant bandwidth
• Bulk data transfer
  • Most packets in a UDT session can be packed to the maximum segment size (MSS)
  • MSS can be set by the application; the optimal value is the path MTU

Page 6: UDT: UDP based Data Transfer

UDT History

• 2000: SABUL concept
• 2001: SABUL version 1.0
• 2002: dSABUL
• 2002: SABUL versions 2.0, 2.1, 2.2, 2.3
• 2003: UDT 1.0
• 2004: UDT 1.1 & 1.2
• 2004: UDT 2.0
• 2005: UDT 3.0

Page 7: UDT: UDP based Data Transfer

Papers

• Supporting Configurable Congestion Control in Data Transport Services. Yunhong Gu and Robert L. Grossman. UIC/LAC Technical Report, in submission.
• Experiences in Design and Implementation of a High Performance Transport Protocol. Yunhong Gu, Xinwei Hong, and Robert L. Grossman. SC 2004, Nov 6 - 12, Pittsburgh, PA, USA.
• An Analysis of AIMD Algorithms with Decreasing Increases. Yunhong Gu, Xinwei Hong, and Robert L. Grossman. First Workshop on Networks for Grid Applications (Gridnets 2004), Oct. 29, San Jose, CA, USA.
• Optimizing UDP-based Protocol Implementation. Yunhong Gu and Robert L. Grossman. PFLDNet 2005, Lyon, France, Feb. 2005.
• SABUL: A Transport Protocol for Grid Computing. Yunhong Gu and Robert L. Grossman. Journal of Grid Computing, 2003, Volume 1, Issue 4, pp. 377-386.

Page 8: UDT: UDP based Data Transfer

What’s UDT?

• UDT: UDP based Data Transfer
  • Reliable, application-level, duplex transport protocol over UDP with congestion control
  • Implementation: open source C++ library
• Two orthogonal parts
  • The UDT protocol framework, which can be implemented above UDP with any suitable congestion control algorithm
  • The UDT congestion control algorithm, which can be implemented in any transport protocol, such as TCP

Page 9: UDT: UDP based Data Transfer

Congestion Control Schemes

• Window control sends data in bursts
  • TCP pacing (decreases throughput)
• Rate control can lead to continuous loss
• Rate control + supportive window control

Page 10: UDT: UDP based Data Transfer

Packet Structure

• Data packet
  • Header: 1-bit flag + 31-bit sequence number
• Control packet
  • Header: 1-bit flag + 3-bit type + 12-bit reserved + 16-bit ACK seq. no. + (0 - 32n)-bit control info
  • Type: ACK, ACK2, NAK, Handshake, Keep-alive, and Shutdown
• The actual size of a UDT packet can be ascertained from the UDP header

Page 11: UDT: UDP based Data Transfer

Data Packet

[Figure: data packet layout. First word: flag bit (0) followed by the Packet Sequence Number; then the User Data Payload.]

• Flag bit: 0
• UDT uses a 31-bit, packet-based sequence number ranging from 0 to 2^31 - 1
• The sequence number wraps around when it exceeds the maximum available number
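The wrap-around of a 31-bit sequence number can be illustrated with modular arithmetic. The sketch below is only an illustration: the helper names `seq_next` and `seq_offset` are hypothetical, and the rule that two compared numbers are less than 2^30 apart is an assumption, not something the slides state.

```python
SEQ_MOD = 1 << 31  # sequence numbers range from 0 to 2^31 - 1


def seq_next(seq):
    """Return the sequence number that follows seq, wrapping to 0 after 2^31 - 1."""
    return (seq + 1) % SEQ_MOD


def seq_offset(a, b):
    """Signed distance from a to b.

    Assumption (not from the slides): the two numbers are less than
    2^30 apart, so the shorter way around the circle is the true distance.
    """
    d = (b - a) % SEQ_MOD
    return d - SEQ_MOD if d >= SEQ_MOD // 2 else d


print(seq_next(SEQ_MOD - 1))       # wraps to 0
print(seq_offset(SEQ_MOD - 1, 0))  # 0 is one step after the maximum
```

With a helper like `seq_offset`, "newer than" tests keep working across the wrap point instead of comparing raw integers.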

Page 12: UDT: UDP based Data Transfer

Control Packet

[Figure: control packet layout. First word: flag bit (1), type, reserved, ACK Seq. No.; then the Control Information Field.]

• Flag bit: 1
• Type: 3 bits
  • handshake (000), shutdown (101), keep-alive (001)
  • ACK (010), ACK2 (110), NAK (011)
• UDT uses sub-sequencing: each ACK and its related ACK2 are assigned a unique 16-bit ACK sequence number
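The two header layouts can be sketched as bit packing over one 32-bit word. This is a minimal illustration, not the library's code: the function names are hypothetical, and placing the flag in the most significant bit and using network byte order are assumptions. The field widths and type codes are the ones listed on the slides.

```python
import struct

# 3-bit type codes as listed on the slide.
CTRL_TYPES = {
    "handshake": 0b000, "keep-alive": 0b001, "ACK": 0b010,
    "NAK": 0b011, "shutdown": 0b101, "ACK2": 0b110,
}


def pack_data_header(seq):
    """Flag bit 0 followed by a 31-bit sequence number."""
    if not 0 <= seq < (1 << 31):
        raise ValueError("sequence number must fit in 31 bits")
    return struct.pack("!I", seq)  # assumed: big-endian, flag in the top bit


def pack_control_header(type_name, ack_seq):
    """Flag bit 1, 3-bit type, 12 reserved bits (zero), 16-bit ACK seq. no."""
    if not 0 <= ack_seq < (1 << 16):
        raise ValueError("ACK sequence number must fit in 16 bits")
    word = (1 << 31) | (CTRL_TYPES[type_name] << 28) | ack_seq
    return struct.pack("!I", word)


def parse_header(buf):
    """Return ('data', seq) or ('control', type_code, ack_seq)."""
    (word,) = struct.unpack("!I", buf[:4])
    if word >> 31 == 0:
        return ("data", word & 0x7FFFFFFF)
    return ("control", (word >> 28) & 0x7, word & 0xFFFF)


print(parse_header(pack_data_header(5)))
print(parse_header(pack_control_header("ACK", 7)))
```

Because the header has no length field, a receiver would take the packet length from the UDP datagram, as the Packet Structure slide notes.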

Page 13: UDT: UDP based Data Transfer

Acknowledgements

• Selective acknowledgement (ACK)
  • Generated at a constant interval to send back the largest continuously received sequence number of data packets.
  • The sender sends back an ACK2 to the receiver for each ACK (sub-sequencing).
  • Also carries RTT, packet arrival speed, and estimated link capacity.
• Explicit negative acknowledgement (NAK)
  • Generated as soon as loss is detected.
  • Loss information may be resent if the receiver has not received the retransmission after an increasing interval.
  • Loss information is compressed in the NAK.
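The slide does not say how loss information is compressed. One plausible scheme, shown here purely as an assumption, is to report runs of consecutive lost sequence numbers as (first, last) ranges. The function name `compress_losses` is hypothetical, and sequence-number wrap-around is ignored for brevity.

```python
def compress_losses(lost):
    """Collapse lost sequence numbers into inclusive (first, last) ranges.

    Illustrative only: the actual NAK encoding is not given on the slides.
    """
    ranges = []
    for seq in sorted(set(lost)):
        if ranges and seq == ranges[-1][1] + 1:
            ranges[-1][1] = seq          # extend the current run
        else:
            ranges.append([seq, seq])    # start a new run
    return [tuple(r) for r in ranges]


print(compress_losses([3, 4, 5, 9, 11, 12]))
```

A burst of thousands of consecutive losses then costs two numbers in the NAK rather than one per packet.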

Page 14: UDT: UDP based Data Transfer

Timing

• Packet scheduling timer
  • Tuned by rate control
  • High precision, in CPU clock cycles
  • Implementation depends on self-clocking, packet sending burst
• Rate control timer: triggers rate control
  • RCTP = 0.01 seconds
• ACK timer: triggers acknowledgement
  • ATP = RCTP

Page 15: UDT: UDP based Data Transfer

Timing (cont.)

• NAK timer: triggers negative acknowledgement
  • NTP = RTT
• Retransmission timer: triggers retransmission on time-out and maintains connection status
  • RTP = (exp-count + 1) * RTT + ATP
  • where exp-count is the number of consecutive time-outs
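The timer periods on the two Timing slides reduce to a few lines of arithmetic. The sketch below restates them in seconds; the function names are hypothetical.

```python
RCTP = 0.01  # rate control timer period, seconds
ATP = RCTP   # ACK timer period


def nak_period(rtt):
    """NTP = RTT."""
    return rtt


def retransmission_period(exp_count, rtt, atp=ATP):
    """RTP = (exp-count + 1) * RTT + ATP, where exp_count counts consecutive time-outs."""
    return (exp_count + 1) * rtt + atp


# With a 100 ms RTT the period grows by one RTT per consecutive time-out.
for n in range(3):
    print(n, retransmission_period(n, 0.1))
```

Note that the growth is linear in exp-count, one extra RTT per consecutive time-out.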

Page 16: UDT: UDP based Data Transfer

UDT Architecture

[Figure: UDT architecture. Packet types exchanged between Sender and Receiver: DATA, ACK, ACK2, NAK. Timers shown: packet scheduling timer, ACK timer, NAK timer, retransmission timer, rate control timer.]

Page 17: UDT: UDP based Data Transfer

Congestion Control

• Rate-based congestion control (Rate Control, RC)
  • RC tunes the packet sending period.
  • RC is triggered periodically at the sender side.
  • The RC period is a constant 0.01 seconds.
• Window-based flow control (Flow Control, FC)
  • FC limits the number of unacknowledged packets.
  • FC is triggered on each received ACK at the sender side.

Page 18: UDT: UDP based Data Transfer

Rate Control

• AIMD: the increase parameter is related to the link capacity and the current sending rate; the decrease factor is 1/9, but the rate is not decreased for every loss event.
• Link capacity is probed by packet pairs, which are sampled UDT data packets.
  • Every 16th data packet and its successor are sent back to back to form a packet pair.
  • The receiver applies a median filter to the intervals between the arrival times of each packet pair to estimate the link capacity.
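The receiver-side estimate can be sketched as follows: the median of the packet-pair arrival intervals gives the time to serialize one packet on the bottleneck link, and its inverse is the capacity in packets per second. The function names and the sample values are hypothetical; the slides do not give the filter window size.

```python
from statistics import median


def estimate_capacity_pps(pair_intervals):
    """Estimate link capacity in packets per second.

    pair_intervals: seconds between the arrival of the two packets of each
    packet pair. The median discards outliers caused by cross traffic.
    """
    m = median(pair_intervals)
    if m <= 0:
        raise ValueError("intervals must be positive")
    return 1.0 / m


def to_mbps(pps, mss_bytes):
    """Convert packets per second to Mbps for a given packet size."""
    return pps * mss_bytes * 8 / 1e6


samples = [0.0010, 0.0011, 0.0009, 0.0100, 0.0010]  # one delayed pair
pps = estimate_capacity_pps(samples)
print(pps, to_mbps(pps, 1500))
```

The single delayed pair (0.0100 s) does not move the median, which is the point of using a median filter rather than a mean.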

Page 19: UDT: UDP based Data Transfer

Receiver Based Packet Pair (RBPP)

• Works well on high-speed optical links
• Does not work well in certain cases, such as multi-channel links
  • Underestimation: UDT becomes conservative
  • Overestimation: UDT becomes more aggressive; eventually UDT turns into window-based congestion control

V. Paxson, End-to-End Internet Packet Dynamics, IEEE/ACM Transactions on Networking, Vol. 7, No. 3, pp. 277-292, June 1999.

Page 20: UDT: UDP based Data Transfer

Rate Control (cont.)

1. If the loss rate is greater than 1%, do not increase;

2. The number of packets to be increased in the next RCTP time is:

$$inc = \max\left(10^{\lceil \log_{10}((B - C) \cdot MSS \cdot 8) \rceil} \cdot \beta / MSS,\ 1/MSS\right)$$

where B is the estimated link capacity and C is the current sending rate. Both are in packets or packets per second. MSS is the packet size in bytes. β = 1.5 * 10^-6.

3. Recalculate the packet sending period (STP):

$$STP = \frac{STP \cdot RCTP}{STP \cdot inc + RCTP}$$

An Analysis of AIMD Algorithms with Decreasing Increases. Yunhong Gu, Xinwei Hong, and Robert L. Grossman. First Workshop on Networks for Grid Applications (Gridnets 2004), Oct. 29, San Jose, CA, USA.
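The increase step can be checked numerically with a short sketch. The function names are hypothetical, and returning the floor value 1/MSS when B does not exceed C is an assumption, since the logarithm is undefined there and the slides do not cover that case.

```python
import math

BETA = 1.5e-6
RCTP = 0.01  # seconds


def increase(b_pps, c_pps, mss):
    """Packets to add in the next RCTP, given capacity B and rate C in packets/s."""
    if b_pps <= c_pps:
        return 1.0 / mss  # assumption: fall back to the minimum increase
    spare_bps = (b_pps - c_pps) * mss * 8
    return max(10 ** math.ceil(math.log10(spare_bps)) * BETA / mss, 1.0 / mss)


def next_period(stp, inc, rctp=RCTP):
    """New packet sending period after adding inc packets per RCTP."""
    return (stp * rctp) / (stp * inc + rctp)


mss = 1500
b = 10e9 / (mss * 8)            # 10 Gbps expressed in packets per second
print(increase(b, 0.5 * b, mss))   # 5 Gbps spare
print(increase(b, 0.95 * b, mss))  # 500 Mbps spare
print(next_period(0.001, 10))      # 10 packets per RCTP becomes 20
```

The period update follows from the rate: RCTP / STP packets are sent per interval, so adding inc packets gives a new period of RCTP / (RCTP / STP + inc), which simplifies to the expression in step 3.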

Page 21: UDT: UDP based Data Transfer

Rate Control (cont.)

Increase parameter for B = 10 Gbps, MSS = 1500 bytes:

C (Mbps)          B - C (Mbps)      Increase param. (pkts)
[0, 9000)         (1000, 10000]     10
[9000, 9900)      (100, 1000]       1
[9900, 9990)      (10, 100]         0.1
[9990, 9999)      (1, 10]           0.01
[9999, 9999.9)    (0.1, 1]          0.001
9999.9+           <0.1              0.00067

Page 22: UDT: UDP based Data Transfer

UDT Algorithm

[Figure: sending rate vs. time. Labels: L, f0(α0, p), f1(α1, p), f2(α2, p), f3(α3, p).]

Page 23: UDT: UDP based Data Transfer

Rate Control (cont.)

• Decrease the sending rate by 1/9 (equivalently, multiply the packet sending period by 1.125) only if:
  1. An NAK is received whose last lost sequence number is greater than the largest sequence number when the last decrease occurred; or
  2. The number of loss events since the last decrease has exceeded a threshold, which increases exponentially and is reset when condition 1 is satisfied.
• No data will be sent out for the next RCTP time if a decrease occurs.
  • This helps to clear congestion.
• Over a short period, the loss rate due to congestion is larger than the loss rate due to physical link errors.
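The two decrease conditions can be sketched as a small state machine. Several details here are assumptions rather than facts from the slides: the class and method names are hypothetical, the threshold starts at 2 and doubles (the slides only say it grows exponentially), "largest sequence number" is read as the largest sequence number sent so far, and sequence-number wrap-around is ignored.

```python
class DecreaseController:
    """Decides when a NAK should slow the sender down."""

    def __init__(self, stp, initial_threshold=2):
        self.stp = stp                    # packet sending period, seconds
        self.last_dec_seq = -1            # largest seq. no. at the last decrease
        self.initial_threshold = initial_threshold
        self.threshold = initial_threshold
        self.loss_events = 0              # loss events since the last decrease

    def on_nak(self, last_lost_seq, largest_sent_seq):
        """Process one NAK; return True if the sending period was increased."""
        self.loss_events += 1
        if last_lost_seq > self.last_dec_seq:
            # Condition 1: the loss belongs to a new congestion period.
            self.threshold = self.initial_threshold
            self._decrease(largest_sent_seq)
            return True
        if self.loss_events > self.threshold:
            # Condition 2: too many loss events since the last decrease.
            self.threshold *= 2           # assumed doubling
            self._decrease(largest_sent_seq)
            return True
        return False

    def _decrease(self, largest_sent_seq):
        self.stp *= 1.125                 # rate drops to 8/9 of its old value
        self.last_dec_seq = largest_sent_seq
        self.loss_events = 0


ctl = DecreaseController(stp=0.001)
print(ctl.on_nak(10, 100), ctl.stp)   # new congestion period: decrease
print(ctl.on_nak(50, 120), ctl.stp)   # same period, below threshold: no change
```

Multiplying the period by 1.125 = 9/8 multiplies the rate by 8/9, which is the 1/9 decrease stated on the slide. The sketch omits the pause of one RCTP that follows a decrease.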

Page 24: UDT: UDP based Data Transfer

Flow Control

• BDP: W = W * 0.875 + AS * (RTT + ATP) * 0.125
• AS is the packet arrival speed at the receiver side.
  • The receiver records the packet arrival intervals.
  • AS is calculated from the average of the latest 16 intervals after a median filter.
  • It is carried back within the ACK.
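The window update is an exponentially weighted moving average of the estimated number of packets in flight. In the sketch below the function names are hypothetical, and the specific median filter (dropping intervals more than a factor of 8 away from the median) is an assumption; the slide only says a median filter is applied to the latest 16 intervals.

```python
from statistics import median


def arrival_speed(intervals, window=16, tolerance=8.0):
    """Packet arrival speed (packets/s) from recent arrival intervals in seconds."""
    recent = intervals[-window:]
    m = median(recent)
    # Assumed filter: keep intervals within a factor of `tolerance` of the median.
    kept = [x for x in recent if m / tolerance < x < m * tolerance]
    if not kept:
        return 0.0
    return len(kept) / sum(kept)  # inverse of the average kept interval


def update_window(w, as_pps, rtt, atp=0.01):
    """W = W * 0.875 + AS * (RTT + ATP) * 0.125."""
    return w * 0.875 + as_pps * (rtt + atp) * 0.125


speed = arrival_speed([0.001] * 15 + [1.0])  # one long gap is filtered out
print(speed)
print(update_window(16, speed, rtt=0.09))
```

AS * (RTT + ATP) approximates how many packets arrive before the sender can hear about them, so the smoothed window tracks the bandwidth-delay product.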

Page 25: UDT: UDP based Data Transfer


Slow Start

Flow window starts at 2 and increases to the number of acknowledged packets, until the sender receives an NAK or reaches the maximum window size, when slow start ends.

Packet sending period is 0 during slow start phase and set to the packet arrival interval at the end of the phase.

Slow start only occurs at the beginning of a UDT session.

Page 26: UDT: UDP based Data Transfer

Implementation: Performance

[Figure: throughput (Mbps, 0 to 1000) vs. time (s, 0 to 100) for three flows: to StarLight (40 us RTT), to Canarie (16 ms RTT), and to SARA (110 ms RTT).]

Page 27: UDT: UDP based Data Transfer

Implementation: Intra-protocol Fairness

[Figure: two panels of throughput (Mbps) vs. time (s, 0 to 100) for flows to StarLight (40 us RTT), to Canarie (16 ms RTT), and to SARA (110 ms RTT). Y-axis ranges: 0 to 600 Mbps and 320 to 330 Mbps.]

Page 28: UDT: UDP based Data Transfer

Implementation: TCP Friendliness

[Figure: throughput (Mbps) vs. time (s, 0 to 100) for flows UDT1, UDT2, TCP1, and TCP2.]

Page 29: UDT: UDP based Data Transfer

Implementation: TCP Friendliness (cont.)

[Figure: two panels of throughput (Mbps) vs. time (s, 0 to 100). Series labels: IT1, IT2, ST1, ST2. Legend: without UDT, with UDT.]

Page 30: UDT: UDP based Data Transfer

Implementation: File Transfer

From \ To     StarLight   Canarie   SARA
StarLight     460         505       560
Canarie       440         502       -
SARA          441         -         660

[Figure: testbed with sites Canarie, StarLight, and SARA. Link labels: 1 Gbps / 15.9 ms and 1 Gbps / 110 ms. Disk labels: R 800 Mbps, W 550 Mbps; R 800 Mbps, W 500 Mbps; R 1300 Mbps, W 900 Mbps.]

Page 31: UDT: UDP based Data Transfer

Simulation: UDT Throughput at Different Bandwidth and RTT

[Figure: bandwidth utilization (%, 60 to 100) as a function of RTT (ms, 10^-2 to 10^3) and bandwidth (Mbps, 10^-2 to 10^3).]

Page 32: UDT: UDP based Data Transfer

Simulation: Performance of Concurrent UDT Flows

[Figure: two panels vs. number of concurrent flows (0 to 400) for RTT = 1 ms, 10 ms, and 100 ms: bandwidth utilization (%, 98 to 100) and standard deviation (Mbps, 0 to 0.5).]

Page 33: UDT: UDP based Data Transfer

Simulation: Intra-protocol Fairness

[Figure: two panels of throughput (Mbps) vs. time (s, 0 to 90). Panel titles: Bandwidth = 100 Mbps, RTT = 1 ms; Bandwidth = 1 Gbps, RTT = 100 ms.]

Page 34: UDT: UDP based Data Transfer

Simulation: RTT Independence

[Figure: throughput (Mbps) as a function of time (s) and RTT (ms).]

Page 35: UDT: UDP based Data Transfer

Simulation: TCP Friendliness

[Figure: TCP friendliness (TCP w/ UDT vs. TCP w/ TCP, 0 to 2) as a function of RTT (ms, 10^-2 to 10^3) and bandwidth (Mbps, 10^-2 to 10^3).]

Page 36: UDT: UDP based Data Transfer

Simulation: Convergence/Stability

[Figure: CBR rate (Mbps) and UDT throughput (Mbps) vs. time (s, 0 to 250).]

Page 37: UDT: UDP based Data Transfer

Simulation: Complex Scenario

[Figure: network topology with DropTail nodes. Link capacities (Mbps): 100, 100, 100, 50, 10. Flows are labeled with their IDs.]

Flow ID              1      2      3      4      5      6
Throughput (Mbps)    89.3   90.0   5.18   41.7   50.8   4.78

Page 38: UDT: UDP based Data Transfer

Simulation: Multi-bottleneck

[Figure: topology with nodes A, B, and C. Link labels: 200, 200, x.]

X     0.1     1       10      20      40      60
AB    198.8   189.2   180.1   170.9   152.5   137.6
AC    0.098   0.979   9.955   19.88   39.46   57.70

X     80      100     120     140     160     180
AB    108.4   104.6   100.8   101.3   100.7   100.3
AC    73.49   92.42   98.47   98.04   98.65   99.00

Page 39: UDT: UDP based Data Transfer

Summary

• UDT Protocol
  • Application level, on top of UDP
  • Selective acknowledgement / explicit negative acknowledgement
• UDT Congestion Control
  • Rate Control
    • Bandwidth estimation for fast probing of available bandwidth and fast recovery
    • AIMD for fairness
    • Constant rate control interval
  • Flow Control
    • Dynamic flow window according to packet receiving speed

Page 40: UDT: UDP based Data Transfer

UDT Characteristics

• Good use of available bandwidth
• Application level: no changes to routers or the operating system
• No manual tuning
• Fair and friendly: intra-protocol fairness, TCP friendliness, and RTT independence
• Open source

Page 41: UDT: UDP based Data Transfer

Thank You!

LAC: www.lac.uic.edu

UDT: sourceforge.net/projects/dataspace

Internet Draft: draft-gg-udt-01.txt

Page 42: UDT: UDP based Data Transfer

Q&A

Page 43: UDT: UDP based Data Transfer


Comparison: mostly “Related Works”

x: packet sending rate

Page 44: UDT: UDP based Data Transfer

SYS interval: 10 ms

• Acceptable trade-off between efficiency and fairness
  • Intra-protocol and TCP
• Larger values
  • Less responsive to network change
  • Slower in loss recovery
  • More stable
  • Friendlier to TCP
• Smaller: the opposite

Page 45: UDT: UDP based Data Transfer

Target environment

• Large BDP networks, bulk data transfer
• Speed of the control flow: ~4.5 kBps
• Who uses it?
  • Using Teraflows to Transport Sloan Digital Sky Survey (SDSS) Data: http://sdss.ncdm.uic.edu:8080/sdss/
  • e-VLBI: compares Tsunami, UDT, FAST TCP: http://apan.net/meetings/honolulu2004/materials/engineering/8-APAN2.ppt