1
NSF CHEETAH project: “End-To-End Provisioned Optical Network Testbed for Large-Scale eScience Applications”
Xuan Zheng & Malathi Veeraraghavan, Univ. of Virginia
{xuan, mv}@cs.virginia.edu
2
Team & Acknowledgment
• Team (PI/Co-PIs):
  – Malathi Veeraraghavan, Univ. of Virginia
  – Nagi Rao, Bill Wing, Tony Mezzacappa, ORNL
  – John Blondin, NCSU
  – Ibrahim Habib, CUNY
• UVA funding sources:
  – NSF EIN grant ANI-0335190 “End-To-End Provisioned Optical Network Testbed for Large-Scale eScience Applications”
  – NSF ITR small grant ANI-0312376
  – DOE FG02-04ER25640
3
Outline
• Background
• CHEETAH
  – Concept
  – Testbed and peering
  – Software
  – Applications: GridFTP, web application
• Conclusions
4
Background: eScience application requirements for the network
• Our eScience partner: TSI project
  – High bandwidth end-to-end for terabyte-sized file transfers
  – End-to-end QoS assurance
    • remote visualization
    • remote computational steering
• CHEETAH: Circuit-Switched High-speed End-to-End Transport Architecture
TSI: Terascale Supernova Initiative
QoS: Quality of Service
6
CHEETAH: Circuit-Switched High-speed End-to-End Transport Architecture
• Optical circuit-switched networks: one flavor of bandwidth sharing
  – Provide high-speed, end-to-end circuit connectivity to end hosts on a dynamic, call-by-call basis
• Meets TSI needs
  – High-bandwidth connections for file transfers
  – End-to-end QoS for remote visualization
7
CHEETAH concept
• Use off-the-shelf circuit-based gateways
  – that support GMPLS routing and signaling protocols for dynamic circuit setup/release
• Control-plane functionality is distributed to the network switches
  – enables the creation of large-scale shared connection-oriented (CO) networks
• Requires a software upgrade on end hosts
[Figure: two end hosts connected through two connection-oriented switches, each with its own bandwidth manager; BW-requests are sent toward the switches. Annotations: "Distributed bandwidth management: scales to large networks" and "Complete bandwidth on a link used for one connection".]
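The distributed bandwidth management idea on this slide can be illustrated with a small sketch. It is not the CHEETAH or GMPLS implementation: the class `BandwidthManager`, the function `setup_circuit`, and the Mbps bookkeeping are all hypothetical names chosen for illustration. Each switch admits or rejects a request using only its own link state, and a rejected setup releases whatever earlier hops had reserved.

```python
from dataclasses import dataclass, field


@dataclass
class BandwidthManager:
    """Hypothetical per-switch manager tracking one outgoing link."""
    capacity_mbps: int
    allocations: dict = field(default_factory=dict)  # call_id -> reserved Mbps

    def available(self) -> int:
        return self.capacity_mbps - sum(self.allocations.values())

    def request(self, call_id: str, mbps: int) -> bool:
        """Admit the call only if this link can still carry it."""
        if mbps <= 0 or call_id in self.allocations or mbps > self.available():
            return False
        self.allocations[call_id] = mbps
        return True

    def release(self, call_id: str) -> None:
        self.allocations.pop(call_id, None)


def setup_circuit(path, call_id, mbps):
    """Reserve hop by hop; undo earlier reservations if any hop rejects."""
    reserved = []
    for manager in path:
        if not manager.request(call_id, mbps):
            for earlier in reserved:
                earlier.release(call_id)
            return False
        reserved.append(manager)
    return True
```

Because every decision is local to one switch, no central controller has to know the state of the whole network, which is the scaling argument the slide makes.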
8
CHEETAH as an “add-on” service to the Internet
• Use second NICs at hosts for circuit connectivity, leaving the primary NIC for Internet access
• Two paths available
• Attempt circuit setup; if rejected, fall back to using TCP/IP
• Can talk to non-CHEETAH end hosts
[Figure: End host I and End host II are connected by two paths, one through the connectionless Internet and one through the circuit-switched network.]
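The "attempt circuit setup, if rejected, fall back to TCP/IP" behavior can be sketched as follows. This is a minimal illustration, not the project's code: `transfer` and the three callables passed to it are hypothetical stand-ins for the signaling client and the two transport paths.

```python
def transfer(filename, far_end, try_circuit, send_over_circuit, send_over_tcp):
    """Use the circuit when setup succeeds, else the primary TCP/IP path.

    try_circuit(far_end) -> bool is assumed to perform circuit signaling;
    the two send_* callables are assumed to move the file on each path.
    """
    if try_circuit(far_end):
        send_over_circuit(filename, far_end)
        return "circuit"
    # Setup was rejected (or the far end is not a CHEETAH host):
    # the Internet path through the primary NIC is always available.
    send_over_tcp(filename, far_end)
    return "tcp"
```

Passing the path-specific operations in as callables keeps the fallback logic independent of how signaling and data transfer are actually implemented.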
9
CHEETAH topology & equipment
[Figure: CHEETAH testbed topology. Three PoPs, each labeled with a Sycamore SN16000: Raleigh PoP (MCNC), Atlanta PoP (SoX/SLR), and ORNL PoP. Link labels: OC192 (NLR, SLR), OC192, 5 GbEs. Other labels: hosts attached through Ethernet switches, enterprise networks (NCSU, G. Tech), and a Cray X1.]
SN16000 callouts:
• Control
• Gbps and 10 Gbps Ethernet interface cards
• Time-division multiplexing optical interface card
• Bandwidth manager: dynamic distributed sharing
• Maps GbE to equivalent SONET circuit
10
CHEETAH peering
[Figure: map of networks that CHEETAH peers with. Site labels: Atlanta, Raleigh, ORNL, Chicago, Seattle, Sunnyvale, FNAL, ANL, PNNL, CalTech, SLAC, LBL, NERSC. Network labels: DC dragon network, VORTEX. Virginia sites: Virginia Tech, University of Virginia, George Mason University, Virginia Commonwealth University, College of William and Mary, Old Dominion University. Other labels: McLean, VA; DC PoP; "To CHEETAH Raleigh PoP".]
11
CHEETAH software on end hosts
• Implement CHEETAH software to run on end hosts
• Integrate with host applications
  – applications generate requests for bandwidth as needed
  – SHORT-LIVED: increase sharing
  – Hold circuit for a few seconds/minutes and release
[Figure: end-host software stack. Applications (GridFTP, web server, remote viz. with Ensight) use the end-host CHEETAH software (DNS lookup, routing decision, signaling client). TCP runs over NIC I on the primary TCP/IP path; FRTP runs over NIC II on the end-to-end CHEETAH circuit.]
Callouts:
• Fixed-Rate Transport Protocol (FRTP): designed for circuits
• DNS query: to check if the far-end host is also on CHEETAH
• Routing decision: to check whether to use the TCP/IP path or attempt a CHEETAH circuit setup
• Signaling client: to request a circuit
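The slide does not state what rule the routing decision applies, so the following is only one plausible sketch under stated assumptions: it compares the estimated completion time on each path, charging the circuit for its setup delay. The function name `choose_path`, its parameters, and the comparison itself are hypothetical and are not taken from the CHEETAH software.

```python
def choose_path(file_bytes, far_end_on_cheetah, circuit_mbps,
                setup_delay_s, tcp_mbps_estimate):
    """Return "circuit" or "tcp" (assumed heuristic, not the project's rule).

    far_end_on_cheetah is assumed to come from the DNS query step;
    tcp_mbps_estimate is an assumed throughput guess for the Internet path.
    """
    if not far_end_on_cheetah:
        return "tcp"  # no circuit is possible to a non-CHEETAH host
    bits = file_bytes * 8
    circuit_time = setup_delay_s + bits / (circuit_mbps * 1e6)
    tcp_time = bits / (tcp_mbps_estimate * 1e6)
    return "circuit" if circuit_time < tcp_time else "tcp"
```

Under this heuristic, small transfers stay on TCP/IP because the setup delay dominates, while large transfers justify holding a circuit for a short time.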
12
Transport protocol for end-to-end dedicated circuits
• Requirements & solution:
  – No contention for bandwidth resources in the network during user data flow (bandwidth already reserved)
    • No congestion control
  – Contention at end hosts due to multitasking
    • Flow control: null or window based
  – Reliable transfer: error control
    • Detect/recover from drops in receiver buffer
  – High circuit utilization
    • Keep sending rate fixed to match circuit rate
    • Hence the name Fixed-Rate Transport Protocol (FRTP)
    • Receive rate selection important
• FRTP implementation
  – X. Zheng, A. P. Mudambi, and M. Veeraraghavan, “FRTP: Fixed Rate Transport Protocol -- A modified version of SABUL for end-to-end circuits,” Pathnets2004 Workshop in the Broadnet2004 Conference, Sept. 2004, San Jose, CA.
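"Keep sending rate fixed to match circuit rate" amounts to spacing packets at a constant interval. The sketch below shows only that arithmetic; it is not FRTP or SABUL code, and the function names and the 1500-byte packet size used in the usage note are illustrative assumptions.

```python
def inter_packet_gap_s(packet_bytes, circuit_rate_bps):
    """Seconds between packet departures so the sender matches the circuit rate."""
    if packet_bytes <= 0 or circuit_rate_bps <= 0:
        raise ValueError("packet size and circuit rate must be positive")
    return packet_bytes * 8 / circuit_rate_bps


def send_schedule(num_packets, packet_bytes, circuit_rate_bps, start_s=0.0):
    """Departure times for a fixed-rate sender; no congestion-driven adjustment."""
    gap = inter_packet_gap_s(packet_bytes, circuit_rate_bps)
    return [start_s + i * gap for i in range(num_packets)]
```

For example, 1500-byte packets on a 1 Gbps circuit would leave every 12 microseconds. A real sender would still need the flow and error control listed above, since the receiving host can drop packets when it is busy.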
13
Applications
• GridFTP
• Web applications
14
GridFTP application
• Disk-to-disk transfer considerations
  – Hardware solution: RAID striping
    • expensive solution
  – Split large file into small pieces and store the small files on disks of different hosts in a cluster
    • needs “collaboration” between the transfers
    • not user-friendly
  – GridFTP striping with PVFS2: striping across disks of different hosts of a cluster
    • best solution, but both the GridFTP and PVFS2 code need modifications for use on dedicated circuits
15
GridFTP striped transfer over PVFS2
• PVFS2 (Parallel Virtual File System)
  – Three kinds of roles for nodes in PVFS2
    • Compute node/client: on which applications are run
    • Metadata server: handles metadata operations
    • I/O node/server: stores file data for PVFS2 file systems
  – Stripes a file across multiple servers, like RAID 0
• But PVFS2 stripes files starting with a random I/O server
  – Done in the PINT_cached_config_get_next_io() function in src/common/misc/pint-cached-config.c: jitter = (rand() % num_io_servers);
• Change to the PVFS2 code:
  – Set jitter = -1 to get a fixed order of data distribution
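The effect of a random versus a fixed starting server can be seen in a small model of round-robin striping. This is a simplified illustration, not PVFS2 code: `stripe_layout` and its `start` parameter are hypothetical, and the model assumes plain round-robin placement after the first block.

```python
import random


def stripe_layout(num_blocks, num_io_servers, start=None, rng=random):
    """Return the I/O server index holding each block, in block order.

    start=None mimics a random starting server; an explicit start gives
    a fixed, predictable layout (simplified model, not PVFS2 internals).
    """
    if num_io_servers <= 0:
        raise ValueError("need at least one I/O server")
    if start is None:
        start = rng.randrange(num_io_servers)
    return [(start + block) % num_io_servers for block in range(num_blocks)]
```

With a fixed start, every file places its first block on the same server, so a striped transfer can know in advance which host holds which blocks. With a random start, that mapping differs from file to file.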
16
[Figure: GridFTP striped transfer. globus-url-copy holds a control connection to each side. One side is sent Mode E, SPAS (Listen), which returns a list of host:port pairs, and STOR <FileName>. The other side is sent Mode E, SPOR (Connect), which connects to the host-port pairs, and RETR <FileName>. Hosts A1 and A2 appear on one side and Hosts X1 and X2 on the other; in the ideal case drawn, Blocks 1 and 4 travel between one pair of hosts and Blocks 2 and 5 between the other pair.]
But the current GridFTP does not work in this ideal way. The data channel connections between the sending and receiving sides are arbitrary because the processing of SPAS and SPOR commands is nondeterministic.
• Does not match the dedicated circuit model
• Code being modified
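One way to picture the deterministic behavior the dedicated circuit model needs is a fixed pairing between sending nodes and the listening host:port pairs. The sketch below is hypothetical and does not reflect how GridFTP processes SPAS and SPOR; the function name, the sort-based ordering, and the port numbers in the usage note are assumptions made for illustration.

```python
def pair_data_channels(senders, listeners):
    """Pair the i-th sender with the i-th listener after sorting both lists.

    senders: host names on the sending side.
    listeners: (host, port) tuples as a SPAS-style reply might list them.
    Sorting is an assumed canonical order so that the result does not
    depend on the order in which the entries arrive.
    """
    if len(senders) != len(listeners):
        raise ValueError("each sender needs exactly one listener")
    return dict(zip(sorted(senders), sorted(listeners)))
```

For instance, senders A1 and A2 with listeners ("X1", 50000) and ("X2", 50001) always map A1 to X1 and A2 to X2, whatever order the lists are supplied in, so each circuit carries a known set of blocks.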
17
Web Application
• At the web server side
  – Hyperlink to the file is a CGI script (download.cgi); the filename is embedded in the hyperlink
  – download.cgi is started automatically at the server when the user clicks the hyperlink, which triggers the CHEETAH FT sender
  – CHEETAH FT sender initiates CHEETAH circuit setup by calling the RSVP-TE client
  – CHEETAH FT sender starts data transfer on FRTP/circuit
• At the web client side
  – An RSVP-TE client runs as a daemon to accept the circuit setup request
  – A CHEETAH FT receiver runs as a daemon to receive the user data
[Figure: the web browser (e.g. Mozilla) on the web client sends a URL to the web server (e.g. Apache) and gets a response; the server runs download.cgi, which starts the CHEETAH FT sender. The CHEETAH FT sender and the CHEETAH FT receiver each have FRTP and an RSVP-TE interface. RSVP-TE daemons on both hosts exchange RSVP-TE messages, and the data transfer runs from sender to receiver.]
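The server-side step, where download.cgi takes the filename embedded in the hyperlink and triggers the sender, might look roughly like the sketch below. Everything specific in it is an assumption: the query parameter name `file`, the document root, and the `cheetah_ft_sender` command with its `--dest` flag are hypothetical placeholders, not the project's actual interface.

```python
import posixpath
from urllib.parse import parse_qs


def build_sender_command(query_string, client_host, doc_root="/var/www/files"):
    """Turn a CGI query string into an argument list for a hypothetical sender.

    Rejects names containing path components so a request cannot reach
    files outside doc_root.
    """
    names = parse_qs(query_string).get("file", [])
    if len(names) != 1:
        raise ValueError("expected exactly one 'file' parameter")
    name = posixpath.basename(names[0])
    if name != names[0] or name in ("", ".", ".."):
        raise ValueError("unsafe file name")
    # "cheetah_ft_sender" and "--dest" are placeholders for illustration only.
    return ["cheetah_ft_sender", "--dest", client_host,
            posixpath.join(doc_root, name)]
```

A real CGI script would read the query string and client address from its environment and then launch the sender, which in turn requests the circuit and starts the FRTP transfer as described above.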
18
Conclusions
• End-to-end dedicated connections appear to be the right answer for many eScience applications
  – But many networking problems need to be solved to achieve cost reduction through scaling
• Utilization concerns: bandwidth sharing + FRTP
• Specific concerns of TSI: TB file handling (PVFS2 and GridFTP)
• Web site: http://cheetah.cs.virginia.edu