1 A Study of Applications for Optical Circuit-Switched Networks Xiuduan Fang May 1, 2006 Supported by NSF ITR-0312376, NSF EIN- 0335190, and DOE DE-FG02-04ER25640 grants


A Study of Applications for Optical Circuit-Switched Networks

Xiuduan Fang
May 1, 2006

Supported by NSF ITR-0312376, NSF EIN-0335190, and DOE DE-FG02-04ER25640 grants

Outline

Introduction
CHEETAH Background
 ― CHEETAH concept and network
 ― CHEETAH end-host software
Analytical Models of GMPLS Networks
Application (App) I: Web Transfer App
App II: Parallel File Transfers
Summary and Conclusions

Introduction

Many optical connection-oriented (CO) testbeds
 ― E.g., CANARIE's CA*net 4, UKLight, and CHEETAH
 ― Primarily designed for e-Science apps
Use Generalized Multiprotocol Label Switching (GMPLS)
 ― Immediate request, call blocking
Motivation: extend these GMPLS networks to millions of users
Problem Statement
 ― What apps are well served by GMPLS networks?
 ― Design apps to use GMPLS networks efficiently

Circuit-switched High-speed End-to-End Transport ArcHitecture (CHEETAH)

Designed as an "add-on" service to the Internet and leverages the services of the Internet

[Figure: CHEETAH concept. Labels: two end hosts, each with NIC I and NIC II; IP routers; Ethernet-SONET gateways; packet-switched Internet; optical circuit-switched CHEETAH network.]

CHEETAH Network

[Figure: CHEETAH network map. Labels: end hosts zelda1 through zelda5, wukong, and mvstu6; Sycamore SN16000 switches; sites ORNL, TN; Atlanta, GA; NC; UVa; CUNY; OC-192 lambda and 1G links; MCNC Catalyst 7600; UVa Catalyst 4948; NCSU M20; Centuar FastIron FESX448; WASH Abilene T640; NYC HOPI Force10; WASH HOPI Force10; CUNY Foundry; CUNY Host. Legend: direct fibers, VLANs, MPLS tunnels.]

CHEETAH End-host Software

[Figure: two end hosts, each running an application over the CHEETAH end-host software (OCS client, routing decision, RSVP-TE client, C-TCP) and TCP/IP, with NIC 1 and NIC 2; the hosts are joined by the Internet and by the CHEETAH network.]

OCS: Optical Connectivity Service
RD: routing decision
RSVP-TE: Resource ReSerVation Protocol-Traffic Engineering
C-TCP: Circuit-TCP


Analytical Models of GMPLS Networks

Problem: what apps are suitable for GMPLS networks?
 ― Measure of suitability: call-blocking probability, Pb; link utilization, U
 ― App properties: per-circuit BW; call-holding time, 1/μ
Assumptions:
 ― Call arrival rate, λ (Poisson process)
 ― Single link
 ― Single class: all apps are of the same type
A link of capacity C; m circuits; per-circuit BW = C/m
m is a measure of high-throughput vs. moderate-throughput
For high-throughput (e.g., e-Science apps), m is small

BW sharing models

Two kinds of apps: whether 1/μ is dependent on C/m
 ― 1/μ is independent of C/m
 ― 1/μ is dependent on C/m
File size distribution, with a shape parameter and a scale parameter k
The Erlang-B formula
Crossover file size

[Figure: one panel per model, each showing sources 1 through N sharing link L of capacity C; the second panel is also labeled RD.]

Numerical Results: 1/μ is independent of C/m

Two equations, four variables
Fix U and m, compute Pb and the offered load ρ
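
One way to read the "two equations" is as the Erlang-B formula for Pb together with the carried-load definition of utilization, U = ρ(1 - Pb)/m, where ρ is the offered load in Erlangs. That reading is an assumption, since the slide does not spell the equations out, but with U set to 0.8 (a value chosen here for illustration) it gives Pb of about 23.6% for m = 10, in line with the figure quoted in the numerical results. A minimal Python sketch with hypothetical function names:

```python
def erlang_b(rho, m):
    """Blocking probability for offered load rho (Erlangs) on m circuits."""
    b = 1.0
    for n in range(1, m + 1):
        b = rho * b / (n + rho * b)  # numerically stable Erlang-B recursion
    return b


def utilization(rho, m):
    """Assumed utilization model: carried load per circuit, U = rho * (1 - Pb) / m."""
    return rho * (1.0 - erlang_b(rho, m)) / m


def solve_for_load(u_target, m, tol=1e-10):
    """Given 0 < u_target < 1 and m, return (rho, Pb) found by bisection on rho."""
    lo, hi = 0.0, float(m)
    while utilization(hi, m) < u_target:  # widen the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if utilization(mid, m) < u_target:
            lo = mid
        else:
            hi = mid
    rho = 0.5 * (lo + hi)
    return rho, erlang_b(rho, m)


if __name__ == "__main__":
    for m in (10, 100, 1000):
        rho, pb = solve_for_load(0.8, m)
        print(f"m={m:5d}  rho={rho:8.2f}  Pb={pb:.4f}")
```

Bisection is used because the carried load grows monotonically with the offered load, so any target utilization below 1 has a single solution.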

Numerical Results: 1/μ is independent of C/m

[Plot: numerical results; annotated point: m=10, Pb=23.62%]

Conclusions: to get high U
 ― Small m (~10): high Pb, thus book-ahead or call queuing
 ― Large m (~1000): high offered load, thus large N
 ― Intermediate m (~100): large 1/μ is preferred

Numerical Results: 1/μ is dependent on C/m, when shape = 1.1 and k = 1.25 MB

Conclusions: to get high U
 ― Small m (~10): high Pb, thus book-ahead or call queuing
 ― As m increases, N does not increase
 ― m=100, to get U>80%, Pb<5%: 6 MB < file size < 29 MB, thus 0.5 s < 1/μ < 2.3 s
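
When the call-holding time depends on the per-circuit rate, it is the time needed to push a file through one circuit of rate C/m. The sketch below checks the m = 100 bounds of roughly 0.5 s and 2.3 s. It assumes a link capacity of C = 10 Gb/s and decimal megabytes; neither is stated on the slide, and the function name is hypothetical.

```python
def holding_time_s(file_size_bytes, link_capacity_bps, m):
    """Time to send one file over a single circuit of rate C/m."""
    per_circuit_bps = link_capacity_bps / m
    return file_size_bytes * 8 / per_circuit_bps


if __name__ == "__main__":
    C = 10e9  # assumed link capacity in bit/s (not given on the slide)
    for size_mb in (6, 29):
        t = holding_time_s(size_mb * 1e6, C, m=100)
        print(f"{size_mb} MB over C/m = {C / 100 / 1e6:.0f} Mb/s takes {t:.2f} s")
```

Under these assumptions the two file sizes map to about 0.48 s and 2.32 s.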

Conclusions for Analysis

Ideal apps require BW on the order of one-hundredth the link capacity as per-circuit rate
Apps where 1/μ is independent of C/m
 ― long call-holding time is preferred
Apps where 1/μ is dependent on C/m
 ― need short call-holding time


APP I: Web Transfer App on CHEETAH

Why web transfer?
 ― Web-based apps are ubiquitous
 ― Based on the previous analysis, m=100 is suitable for CHEETAH
Consists of a software package WebFT
 ― Leverages CGI for deployment without modifying web client and web server software
 ― Integrated with CHEETAH end-host software APIs to allow use of the CHEETAH network in a mode transparent to users

WebFT Architecture

[Figure: WebFT architecture. Web client: web browser (e.g., Mozilla) and WebFT receiver (RSVP-TE API, C-TCP API), with an RSVP-TE daemon. Web server: web server (e.g., Apache), CGI scripts (download.cgi and redirection.cgi), and WebFT sender (OCS API, RD API, RSVP-TE API, C-TCP API), with OCS, RD, and RSVP-TE daemons. Both sides are labeled "CHEETAH end-host software APIs and daemons". The URL, the response, and control messages go via the Internet; data transfers go via a circuit.]

Experimental Testbed for WebFT

zelda3 and wukong: Dell machines, running Linux FC3 and ext2/3, with RAID-0 SCSI disks
RTT between them: 24.7 ms on the Internet path, and 8.6 ms for the CHEETAH circuit
Load Apache HTTP Server 2.0 on zelda3

[Figure: testbed. Labels: zelda3 and wukong, each with NIC I and NIC II; IP routers; Internet; CHEETAH network; Sycamore SN16000 at Atlanta, GA; Sycamore SN16000 at MCNC, NC; NCSU; Atlanta, GA.]

Experimental Results for WebFT

[Figure: the web page to test WebFT]

Test parameters:
 ― Test.rm: 1.6 GB, circuit rate: 1 Gbps
Test results
 ― throughput: 680 Mbps, delay: 19 s
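
The reported delay is consistent with the reported throughput. A quick check, assuming 1 GB means 10^9 bytes and that the delay is dominated by the data transfer itself (both are assumptions, and the function name is hypothetical):

```python
def transfer_time_s(size_bytes, throughput_bps):
    """Seconds needed to move size_bytes at a sustained throughput in bit/s."""
    return size_bytes * 8 / throughput_bps


if __name__ == "__main__":
    size = 1.6e9          # Test.rm, assuming decimal gigabytes
    measured = 680e6      # reported throughput, bit/s
    circuit_rate = 1e9    # reported circuit rate, bit/s
    print(f"time at measured throughput: {transfer_time_s(size, measured):.1f} s")
    print(f"time at full circuit rate:   {transfer_time_s(size, circuit_rate):.1f} s")
    print(f"fraction of circuit rate used: {measured / circuit_rate:.0%}")
```

At 680 Mbps the transfer takes about 18.8 s, which rounds to the 19 s on the slide.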


APP II: Parallel File Transfers on CHEETAH

Motivation: E-Science projects need to share large volumes of data (TB or PB)
Goal: achieve multi-Gb/s throughput
Two factors limit throughput
 ― TCP's congestion-control algorithm
 ― End-host limitations
Solutions to relieve end-host limitations
 ― Single-host solution
 ― Cluster solution, which has two variations
    General case: non-split source file
    Special case: split source file

General-Case Cluster Solution

[Figure: the original source splits the file across Host 1 … Host i … Host n; each Host i transfers its part to Host i'; Host 1' … Host i' … Host n' assemble the file at the original sink.]

Software Tools: GridFTP and PVFS2

GridFTP: a data-transfer protocol on the Grid
 ― Extends FTP by adding features for partial file transfer, multi-streaming and striping
 ― We mainly use the GridFTP striped transfer feature
PVFS: Parallel Virtual File System
 ― An open source implementation of a parallel file system
 ― Stripes a file across multiple I/O servers like RAID0
 ― A second version: PVFS2

GridFTP striped transfer

[Figure: globus-url-copy sends SPAS to the GridFTP server on the receiving front end, which returns a list of host-port pairs; it then sends SPOR <host-port pairs> to the GridFTP server on the sending front end and gets a response to SPOR. Sending data nodes S1 … Sn and receiving data nodes R1 … Rn each hold blocks (Block 1, Block n+1, …) in a parallel file system.]

Sending data nodes initiate data connections to receiving nodes

General-Case Cluster Solution: Design

Steps | Approach | Pros. | Cons.
Splitting & Assembling | GridFTP partial file transfer | | Wastes disk space, performance overhead
Splitting & Assembling | Socket program | Avoids wasting disk space | Performance overhead
Splitting & Assembling | pvfs2-cp | Avoids wasting disk space |
Transferring | GridFTP partial file transfer | | Many independent transfers incurring much overhead to set up and release connections
Transferring | GridFTP striped transfer | A single file transfer |

General-Case Cluster Solution: Implementation

To get a high throughput, we need to make data nodes responsible for data blocks in their local disks
 ― Make PVFS2 and GridFTP have the same stripe pattern
Problems:
 ― PVFS2 1.0.1 does not provide a utility to inspect data distribution
 ― Data connections between sending and receiving nodes are random

[Figure: sending data nodes S1 … Sn and receiving data nodes R1 … Rn, each holding blocks (Block 1, Block n+1, …) in PVFS2.]

Random data connections

[Figure: sending data nodes S1 … Sn and receiving data nodes R1 … Rn, each holding blocks (Block 1, Block n+1, …) in PVFS2, with data connections between the two sides.]


Implementation - Modifications to PVFS2

Goal: know a priori how a file is striped in PVFS2
Use the strace command to trace system calls made by pvfs2-cp
 ― pvfs2-fs-dump gives the (non-deterministic) I/O server order of file distribution
 ― pvfs2-cp ignores the -s option for configuring stripe size
Modify PVFS2 code
 ― For load balance, PVFS2 stripes files starting with a random server: jitter = (rand() % num_io_servers);
 ― Set jitter = -1 to get a fixed order of data distribution
 ― Change the default stripe size (original: 64 KBytes)
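
The effect of fixing the starting server can be illustrated with a simplified round-robin striping model. This is not PVFS2 code: the function names and the `start` parameter are hypothetical, and the model only assumes that block b lands on server (start + b) mod n.

```python
import math


def server_for_block(block, num_servers, start=0):
    """Round-robin placement: block b goes to server (start + b) mod n."""
    return (start + block) % num_servers


def layout(file_size, stripe_size, num_servers, start=0):
    """Map each server index to the list of block indices it stores."""
    num_blocks = math.ceil(file_size / stripe_size)
    placement = {s: [] for s in range(num_servers)}
    for block in range(num_blocks):
        placement[server_for_block(block, num_servers, start)].append(block)
    return placement


if __name__ == "__main__":
    stripe = 64 * 1024  # the original default stripe size noted on the slide
    # With a fixed start, the layout is known before the file is written.
    print(layout(10 * stripe, stripe, num_servers=5, start=0))
    # A random start (as in the unmodified behavior) rotates the same pattern.
    print(layout(10 * stripe, stripe, num_servers=5, start=3))
```

With a fixed start, a transfer tool that uses the same stripe size can predict which node holds each block; with a random start, the pattern is rotated by an amount that is not known in advance.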

Implementation - Modifications to GridFTP

Goal: use a deterministic matching sequence between sending and receiving data nodes
Method: modify the implementation of SPAS and SPOR commands
 ― SPAS: sort the list of host-port pairs based on the IP-address order for receiving data nodes
 ― SPOR: request sending data nodes to initiate data connections sequentially to receiving data nodes
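
The matching idea can be sketched outside GridFTP as follows. This is an illustration only, not the modified GridFTP code: the function names are hypothetical, the addresses are made up, and the sketch assumes sender i is simply paired with the i-th receiver in IP-address order.

```python
import ipaddress


def sort_host_ports(pairs):
    """Order (host, port) pairs by numeric IP address, then by port."""
    return sorted(pairs, key=lambda hp: (ipaddress.ip_address(hp[0]), hp[1]))


def match_nodes(senders, receivers):
    """Pair the i-th sending node with the i-th receiver in sorted order."""
    if len(senders) != len(receivers):
        raise ValueError("this sketch assumes equal numbers of senders and receivers")
    return list(zip(senders, sort_host_ports(receivers)))


if __name__ == "__main__":
    # Hypothetical addresses; a plain string sort would put .10 before .6.
    receivers = [("10.0.0.10", 50002), ("10.0.0.6", 50002), ("10.0.0.9", 50002)]
    senders = ["S1", "S2", "S3"]
    for sender, (host, port) in match_nodes(senders, receivers):
        print(f"{sender} -> {host}:{port}")
```

Sorting on the parsed address rather than the string keeps the order numeric, so the pairing is the same on every run.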

Experimental Results

Conducted on a 22-node cluster, sunfire
Reduced network-and-disk contention
Performance of PVFS2 implementation was poor

Summary and Conclusions

Analytical Models of GMPLS Networks
 ― Ideal apps require BW on the order of one-hundredth the link capacity as per-circuit rate
Application I: Web Transfer Application
 ― Provided deterministic data services to CHEETAH clients on dedicated end-to-end circuits
 ― No modifications to the web client and web server software by leveraging CGI
Application II: Parallel File Transfers
 ― Implemented a general-case cluster solution by using PVFS2 and GridFTP striped transfer
 ― Modified PVFS2 and GridFTP code to reduce network-and-disk contention

Publication List

M. Veeraraghavan, X. Fang, and X. Zheng, "On the suitability of applications for GMPLS networks," submitted to IEEE Globecom 2006

X. Fang, X. Zheng, and M. Veeraraghavan, "Improving web performance through new networking technologies," IEEE ICIW'06, February 23-25, 2006, Guadeloupe, French Caribbean

Future Work

Analytical Models of GMPLS Networks
 ― Multi-class
 ― Multiple links and network models
Application I: Web Transfer Application
 ― Design a Web partial CO transfer to enable non-CHEETAH hosts to use CHEETAH
 ― Connect multiple CO networks to further reduce RTT
Application II: Parallel File Transfers
 ― Test the general-case cluster solution on CHEETAH
 ― Work on PVFS2 or try GPFS to get a high I/O throughput


A Classification of Networks that Reflects Sharing Modes

The flow chart for the WebFT sender

 ― Can the client be reached via the CHEETAH network (OCS)? If no, return failure.
 ― Request a CHEETAH circuit (Routing Decision)? If no, return failure.
 ― Set up a circuit (RSVP-TE client). If setup fails, return failure.
 ― Send the file via C-TCP.
 ― Release the circuit (RSVP-TE client).
 ― Return success.
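
The sender's decision flow can be sketched as a short function. Everything here is hypothetical: the `FakeStack` class and its method names stand in for the OCS, RD, RSVP-TE, and C-TCP modules and are not the CHEETAH end-host software APIs. The sketch also assumes that every "No" or "Fail" branch leads straight to a failure return.

```python
class FakeStack:
    """Stand-in for the end-host modules; records the calls made to it."""

    def __init__(self, reachable=True, use_circuit=True, setup_ok=True):
        self.reachable = reachable
        self.use_circuit = use_circuit
        self.setup_ok = setup_ok
        self.log = []

    def ocs_reachable(self, client):
        self.log.append("ocs")
        return self.reachable

    def rd_request_circuit(self, client):
        self.log.append("rd")
        return self.use_circuit

    def rsvp_te_setup(self, client):
        self.log.append("setup")
        return "circuit-1" if self.setup_ok else None

    def ctcp_send(self, circuit, path):
        self.log.append("send")

    def rsvp_te_release(self, circuit):
        self.log.append("release")


def webft_send(client, path, stack):
    """Return True on success, False if any step in the flow fails."""
    if not stack.ocs_reachable(client):       # OCS: is the client on CHEETAH?
        return False
    if not stack.rd_request_circuit(client):  # RD: should a circuit be requested?
        return False
    circuit = stack.rsvp_te_setup(client)     # RSVP-TE: set up the circuit
    if circuit is None:
        return False
    try:
        stack.ctcp_send(circuit, path)        # C-TCP: transfer the file
    finally:
        stack.rsvp_te_release(circuit)        # always release the circuit
    return True


if __name__ == "__main__":
    ok = FakeStack()
    print(webft_send("client", "Test.rm", ok), ok.log)
    blocked = FakeStack(setup_ok=False)
    print(webft_send("client", "Test.rm", blocked), blocked.log)
```

Releasing the circuit in a `finally` clause is a design choice of this sketch, so that a failed transfer does not leave a circuit held.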

The WebFT Receiver

Integrates with the CHEETAH end-host software modules similar to the WebFT sender.

Runs as a daemon in the background on the client host to avoid manual intervention.

Also provides the WebFT sender a desired circuit rate.


Experimental Results for WebFT


PVFS2 Architecture

Experimental Configuration

Configuration of PVFS2 I/O servers
 ― The 1st PVFS2: sunfire1 through sunfire5
 ― The 2nd PVFS2: sunfire10, and sunfire6 through sunfire9
Configuration of GridFTP servers
 ― Sending front end: sunfire1 with data nodes sunfire1 through sunfire5
 ― Receiving front end: sunfire10 with data nodes sunfire10, sunfire6 through sunfire9
GridFTP striped transfer
   globus-url-copy -vb -dbg -stripe ftp://sunfire1:50001/pvfs2/test_1G ftp://sunfire10:50002/pvfs2/test_1G1 2>dbg1.txt


Four Conditions to Avoid Unnecessary Network-and-disk Contention

Know a priori how data are striped in PVFS2

PVFS2 I/O servers and GridFTP servers run on the same hosts

GridFTP stripes data across data nodes in the same sequence as PVFS2 does across PVFS2 I/O servers

GridFTP and PVFS2 have the same stripe size


The Specific Cluster Solution for TSI

[Figure: cluster layout. Labels: orbitty at NCSU; zelda at ORNL; X1E at ORNL; CHEETAH LAN; Dell 5424 and Dell 5224 switches; zelda1 through zelda5; compute-0-0 through compute-0-19; controller-0 (rudi); controller-1 (orbitty); disk-0-0 through disk-4-0; monitoring host.]

Numerical Results for 1/μ is dependent on C/m

Conclusions: Large m (~1000): does not increase N