A STUDY OF APPLICATIONS

  • 7/29/2019 A STUDY OF APPLICATIONS

    A STUDY OF APPLICATIONS

    FOR

    OPTICAL CIRCUIT-SWITCHED NETWORKS

    A Thesis

    Presented to

    the faculty of the School of Engineering and Applied Science

    University of Virginia

    In Partial Fulfillment

    of the requirements for the Degree

    Master of Science

    Computer Science

    by

    Xiuduan Fang

    May 2006

    APPROVAL SHEET

    This thesis is submitted in partial fulfillment of the requirements for the degree of

    Master of Science

    Computer Science

    Xiuduan Fang

    This thesis has been read and approved by the examining committee:

    Malathi Veeraraghavan (Advisor)

    Marty Humphrey (Chair)

    Alfred Weaver

    Accepted for the School of Engineering and Applied Science:

    Dean, School of Engineering and Applied Science

    May 2006

    Abstract

    The networking community has made a significant investment in GMPLS networks, which are

    connection-oriented networks that support dynamic call-by-call bandwidth sharing. Currently,

    GMPLS switches are call blocking and GMPLS control-plane protocols only support immediate

    requests for bandwidth. This thesis first addresses the suitability of different types of

    applications for GMPLS networks. Using the Erlang-B formula, we reason that GMPLS networks

    are well suited for applications in which the required per-circuit bandwidth is on the order of

    one-hundredth of the shared link capacity.

    Then, we propose two applications for CHEETAH, a GMPLS network that we have deployed

    as part of an NSF-sponsored project. The first is a web transfer application, for which we

    design and implement a software package called WebFT. We integrate the CHEETAH end-host

    software modules into WebFT to provide deterministic data-transfer services transparently to users.

    The CHEETAH network provides connection-oriented services in addition to the connectionless

    service offered by the Internet. This add-on design allows the WebFT package to provide normal

    web access to non-CHEETAH clients through the Internet while simultaneously serving CHEETAH

    clients on dedicated circuits. The experiments conducted on the CHEETAH testbed show

    that WebFT can achieve low-variance, end-to-end transfer delays at different circuit rates and low

    transfer delays when high-speed circuits are possible.

    The second application is parallel file transfers on CHEETAH. We identify that two factors

    limit file-transfer throughput on networks with a high bandwidth-delay product: TCP's

    congestion-control algorithm and end-host limitations. We propose a general cluster solution to overcome these

    two factors. The solution uses GridFTP striped transfer and Parallel Virtual File System, version

    2 (PVFS2) to transfer data amongst multiple hosts in parallel over dedicated circuits. To minimize

    end-host network and disk contention, we modify the GridFTP and PVFS2 code so that each pair

    of sending and receiving hosts is responsible only for blocks located on its local disks, which

    results in improved throughput.

    Acknowledgments

    I am indebted to my advisor, Professor Malathi Veeraraghavan, for her consistent guidance and

    support. Professor Veeraraghavan has tirelessly guided me, teaching me how to do research in a

    systematic way. She has spent significant time on improving my writing skills. She has been and

    will always be an excellent role model for me.

    I am also grateful to all the other members in our research group, Dr. Xuan Zheng, Xiangfei

    Zhu, Zhanxiang Huang, Tao Li, and Anant P. Mudambi, for all their help.

    I am especially grateful to my grandmother, my parents, my brother Kevin, and my husband

    Lin for their continuous love and support. Without them, I could not have achieved what I have

    achieved today.

    Finally, this work was carried out under the sponsorship of NSF ITR-0312376, NSF EIN-

    0335190, and DOE DE-FG02-04ER25640 grants.

    Contents

    Acknowledgments

    1 INTRODUCTION

    2 BACKGROUND
        2.1 CO Networking
            2.1.1 CO Networks and GMPLS Control-Plane Protocols
            2.1.2 Existing Switches, Gateways, and Networks
        2.2 CHEETAH Network
            2.2.1 CHEETAH Concept and Network
            2.2.2 CHEETAH End-Host Software

    3 ANALYTICAL MODELS OF GMPLS NETWORKS
        3.1 Bandwidth Sharing Model
            3.1.1 Model for Applications in which Call-Holding Time is Independent of Per-Circuit Bandwidth
            3.1.2 Model for Applications in which Call-Holding Time is Dependent on Per-Circuit Bandwidth
        3.2 Numerical Results
            3.2.1 Applications in which Call-Holding Time is Independent of Per-Circuit Bandwidth
            3.2.2 Applications in which Call-Holding Time is Dependent on Per-Circuit Bandwidth
        3.3 Conclusions

    4 WEB TRANSFER APPLICATION ON CHEETAH
        4.1 WebFT Design
            4.1.1 WebFT Architecture
            4.1.2 CGI Scripts
            4.1.3 The WebFT Sender
            4.1.4 The WebFT Receiver
        4.2 Experimental Testbed and Results
        4.3 Conclusions

    5 PARALLEL FILE TRANSFERS ON CHEETAH
        5.1 Introduction
        5.2 Background
            5.2.1 FTP and GridFTP
            5.2.2 PVFS2
        5.3 The Single-Host Solution
        5.4 The General-Case Cluster Solution
            5.4.1 The Splitting Degree
            5.4.2 Design
            5.4.3 Implementation: Modifications to PVFS2
            5.4.4 Implementation: Modifications to GridFTP
            5.4.5 Experimental Results
        5.5 The Specific Cluster Solution for TSI
        5.6 Conclusions

    6 CONCLUSIONS AND FUTURE WORK
        6.1 Conclusions
        6.2 Future Work

    Bibliography

    List of Figures

    2.1 Distributed call-setup process progressing hop-by-hop
    2.2 CHEETAH concept
    2.3 CHEETAH experimental testbed
    2.4 CHEETAH end-host software
    3.1 Call-based sharing model for any single link of a switch
    3.2 A bandwidth sharing model for file transfers
    3.3 Plots of Pb vs. m for U = 40%, 60%, 80%, and 90%
    3.4 Plots of vs. m and /m vs. m
    3.5 Plots of Pb vs. and U vs. for m = 10, 100, and 1000, N0 = 50 and 100, = 1.1, and k = 1.25 MB
    3.6 Plot of N0 vs. for m = 10, 100, and 1000, U = 60% and 80%, = 1.1, and k = 1.25 MB
    3.7 Plots of N vs. m for U = 40%, 60%, 80%, and 90%
    4.1 WebFT architecture
    4.2 The flow of events from running CGI scripts
    4.3 The flow chart for the WebFT sender
    4.4 CHEETAH testbed for WebFT
    4.5 The web page to test WebFT
    5.1 The single-host solution vs. the general-case cluster solution
    5.2 The model and flow chart of third-party control
    5.3 The model and flow chart of GridFTP striped transfer
    5.4 PVFS system architecture
    5.5 A model of using GridFTP partial file transfer to implement the transferring step
    5.6 A model of using GridFTP striped transfer to implement the transferring step
    5.7 A snippet of pvfs2-fs2.conf, the PVFS2 configuration file on sunfire6
    5.8 A part of the output for pvfs2-fs-dump
    5.9 The content of an s KB file
    5.10 A part of the output for the command more testfile/pvfs2cp2 | grep connect
    5.11 A part of the output of the command more testfile/pvfs2cp2 | grep writev | more
    5.12 The pvfs2-fs-dump output for the test 1000M file
    5.13 A snippet from the file pvfs2cp
    5.14 A part of the output for the strace command
    5.15 A snippet of the source code for PINT_cached_config_get_next_io()
    5.16 The commands to start GridFTP servers on sunfire
    5.17 A part of the debug output for the GridFTP striped transfer
    5.18 The tcptrace outputs for GridFTP striped transfer before we modified GridFTP code
    5.19 The tcptrace outputs for GridFTP striped transfer after we modified GridFTP code
    5.20 The specific cluster solution for TSI

    List of Tables

    2.1 A classification of networks that reflects sharing modes
    4.1 Average throughputs and delays at a variety of circuit rates
    5.1 A summary of possible approaches to implement the general-case cluster solution
    5.2 The logical server numbers for the physical I/O servers
    5.3 The file descriptors and IP addresses for sunfire6 through sunfire10
    5.4 The data-distribution pattern for /pvfs2/test 1000M

    List of Abbreviations

    API application programming interface

    AS autonomous system

    CHEETAH Circuit-switched High-speed End-to-End Transport ArcHitecture

    CGI Common Gateway Interface

    CL connectionless

    CN compute node

    CO connection-oriented

    C-TCP Circuit-TCP

    DNS Domain Name System

    DRAGON Dynamic Resource Allocation via GMPLS Optical Networks

    FTP File Transfer Protocol

    GbE Gigabit Ethernet

    Gb/s gigabit per second

    GB gigabyte

    GFP Generic Framing Procedure

    GMPLS Generalized Multiprotocol Label Switching

    GPFS General Parallel File System

    GSR Gigabit Switch Router

    GT Globus Toolkit

    I/O Input/Output

    ION I/O node

    IP Internet Protocol

    KB kilobyte

    LAN Local Area Network

    LMP Link Management Protocol

    MAN Metropolitan Area Network

    Mb/s megabit per second

    MB megabyte

    MPLS Multiprotocol Label Switching

    MSPP Multi-Service Provisioning Platform

    MTU Maximum Transmission Unit

    NCSU North Carolina State University

    NFS Network File System

    NIC network interface card

    OC Optical Carrier

    OCS Optical Connectivity Service

    ORNL Oak Ridge National Laboratory

    PCIX Peripheral Component Interconnect Extended

    PVFS2 Parallel Virtual File System, version 2

    QoS Quality of Service

    RAID redundant array of inexpensive disks

    RD routing decision

    RSVP-TE Resource ReSerVation Protocol-Traffic Engineering

    RTP Research Triangle Park

    RTT round-trip delay time

    SDM Space Division Multiplexing

    SLR Southern Light Rail

    SNMP Simple Network Management Protocol

    SONET Synchronous Optical Network

    SOX Southern Crossroads

    TB terabyte

    TCP Transmission Control Protocol

    TDM Time Division Multiplexing

    TE traffic engineering

    TSI Terascale Supernova Initiative

    VC virtual circuit

    VLSR Virtual Label Switch Router

    WAN Wide Area Network

    WDM Wavelength Division Multiplexing

    Chapter 1

    INTRODUCTION

    The networking community has made a significant investment in connection-oriented (CO) net-

    working. Allowing the reservation of bandwidth in the form of a dedicated circuit, or virtual circuit

    (VC), through a CO network prior to data transfers, this networking mode is recognized for its

    ability to offer service guarantees at some cost of utilization and fairness.

    A number of optical CO testbeds, some of which use Generalized Multiprotocol Label Switching

    (GMPLS), have been deployed for research and educational purposes. These include CANARIE's

    CA*net 4 [11], OMNInet [34], SURFnet [49], UKLight [55], DOE's UltraScience Net

    [41], Dynamic Resource Allocation via GMPLS Optical Networks (DRAGON) [46], and

    Circuit-switched High-speed End-to-End Transport ArcHitecture (CHEETAH) [13]. Further software

    projects to enable the use of MPLS tunnels across Internet2 [26] and across the Department of

    Energy's ESnet [15] are also underway.

    Most of these networks are primarily designed for large-scale scientific applications. Some of

    these applications require high-bandwidth circuits and long call-holding times. To create large-

    scale circuit or VC networks, we need to extend the usage of these networks beyond scientific

    applications to millions of users. Thus, we need to identify and design more applications to use

    these networks efficiently.

    The first goal of this thesis is to determine what applications are well served by GMPLS net-

    works, which currently only support immediate-request calls. We use the Erlang-B formula to

    analyze the suitability of different types of applications. The study of application suitability for

    GMPLS networks identifies applications suited to these networks in general, and specifically the

    CHEETAH testbed.

    Then, we study two applications for CHEETAH. The first is a web transfer application, where

    we present a solution to improve web performance by leveraging CHEETAH without requiring

    modifications to existing web server and client software. We implement a CGI-based software pack-

    age called WebFT. WebFT is integrated with the CHEETAH end-host software modules to provide

    deterministic data-transfer services transparently to users. With dedicated circuits on CHEETAH,

    WebFT can achieve low-variance, end-to-end transfer delays at different circuit rates and low trans-

    fer delays when high-speed circuits are possible.

    The second application is parallel file transfers on CHEETAH, where we study how to achieve

    multi-Gb/s throughput for bulk data transfers over WANs. We identify two factors that limit

    throughput to hundreds of Mb/s: TCP's congestion-control algorithm and end-host limitations.

    Then, we present a cluster solution over dedicated circuits, using GridFTP striped transfer and Par-

    allel Virtual File System, version 2 (PVFS2) to achieve multiple-host parallelism, and thus, improve

    overall throughput.

    The rest of this thesis is organized as follows. In Chapter 2, we provide background information

    on a class of call-blocking CO networks and the CHEETAH experimental testbed. In Chapter 3, we

    explore the suitability of different types of applications for call-blocking CO networks. In Chap-

    ter 4, we design and implement a software package, called WebFT, to improve web performance

    through CHEETAH. In Chapter 5, we propose a cluster solution using GridFTP striped transfer and

    PVFS2 for parallel file transfers. Finally, we present our conclusions and list future-work items in

    Chapter 6.

    Chapter 2

    BACKGROUND

    In this chapter, we first review different types of GMPLS networks and control-plane protocols. We

    point out that current GMPLS implementations use a call-blocking approach. Then, we briefly

    describe existing equipment and networks in which CO services can be enabled. Finally, we give an

    overview of the CHEETAH network and CHEETAH end-host software because all the work in this thesis has

    been conducted as a part of the CHEETAH project.

    2.1 CO Networking

    Networks are commonly classified by scale into Local Area Networks (LANs), Metropolitan Area

    Networks (MANs), Wide Area Networks (WANs), wireless networks, home networks, and

    internetworks [50]. This classification, however, misses a critical aspect of networking: resource

    sharing. To reflect how resources are shared in networks, Veeraraghavan and Karol gave a

    classification of networks based on both switching type and networking type, as shown in Table 2.1 [56]. In

    this section, we focus on the CO networking mode and, more specifically, on a class of call-blocking

    GMPLS networks.

    2.1.1 CO Networks and GMPLS Control-Plane Protocols

    There are two types of CO networks: packet-switched and circuit-switched (see Table 2.1).

    Table 2.1: A classification of networks that reflects sharing modes

                          | Multiplexing/Switching type
    Networking type       | Circuit-switched               | Packet-switched
    ----------------------+--------------------------------+--------------------------------------
    Connectionless        | Not an option                  | e.g., IP networks; Ethernet networks
    Connection-oriented   | e.g., Telephone network,       | e.g., X.25, ATM, MPLS
                          | SONET/SDH, WDM                 |

    Packet-switched CO networks include:

    - Intserv IP networks [8]

    - Multiprotocol Label Switched (MPLS) [42] and Asynchronous Transfer Mode (ATM) networks

    - IEEE 802.1p and 802.1q Virtual LAN (VLAN) Ethernet switch based networks [25]

    Circuit-switched networks include:

    - Time-Division Multiplexed (TDM) SONET/SDH networks

    - All-optical Wavelength Division Multiplexed (WDM) networks

    - Space-Division Multiplexed (SDM) Ethernet switch based networks (an SDM connection is
      created by mapping two ports into an untagged VLAN)

    The GMPLS control-plane protocols are defined as a common control plane for these differ-

    ent types of CO networks even though their data-plane protocols differ significantly. This common

    control plane consists of:

    1. Link Management Protocol (LMP) [29]

    2. Open Shortest Path First-Traffic Engineering (OSPF-TE) routing protocol [27]

    3. Resource Reservation Protocol-Traffic Engineering (RSVP-TE) signaling protocol [3]

    These three protocols are designed to be implemented in a control processor at each network

    switch. Each of these protocols provides an increasing degree of automation, and a corresponding

    decreasing dependence upon manual network administration. This triple combination serves as an

    excellent basis on which to create large-scale CO networks, in which switches can cooperate in a

    completely automated fashion to respond to requests for end-to-end bandwidth. We consider each

    protocol in a little more detail below, starting with LMP.

    The LMP module is used primarily to establish and manage the control channels between

    adjacent nodes, to discover and verify data-plane connectivity, and to correlate data-plane

    link properties. In GMPLS networks, there can be multiple data-plane links between two adjacent

    nodes, and the control channel can be established on a separate physical link from any of the

    data-plane links. A mechanism is required to automatically discover these data-plane links, verify

    their properties, combine them into a single traffic-engineering (TE) link, and correlate data-plane

    links to the control channel. Thus, LMP contributes to our plug-and-play goal for CO networks by

    minimizing manual administration.

    The OSPF-TE routing protocol software module, located at a switch, enables the switch to

    send topology, reachability, and the loading conditions of its interfaces to other switches, and

    to receive corresponding information from them. This data-dissemination process allows the route

    computation module at the switch to determine the next-hop switch toward which to direct a connection

    setup (this module could be part of the signaling-protocol module or could be used to pre-compute

    routing data before call-setup requests arrive). As a routing protocol, its value in creating

    large-scale connectionless networks has already been demonstrated by the success of the Internet.

    Admittedly, being a link-state protocol, it is only used intra-domain, that is, within the network of an

    organization, referred to as an autonomous system (AS). Even within this intra-domain context, it

    organizes the AS as a two-layer hierarchy, meaning that the AS is partitioned into self-contained

    areas interconnected by a backbone area. In conjunction with the distance-vector based inter-domain

    routing protocol, Border Gateway Protocol (BGP), we have a highly decentralized automated

    mechanism to spread routing information, which was critical to the scaling of the Internet.
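
    One way to picture the route computation that this link-state data enables is a constrained
    shortest-path search: links whose advertised available bandwidth is below the request are
    pruned, and a shortest path is computed over what remains. The sketch below is illustrative
    only. The function name cspf, the dictionary layout of the link database, and the example
    topology are assumptions made for this example, not part of any OSPF-TE implementation.

```python
import heapq


def cspf(links, src, dst, bandwidth):
    """Return a least-cost path from src to dst using only links that can
    carry `bandwidth`, or None if no such path exists.

    links: dict mapping a directed link (u, v) to (cost, available_bw).
    This layout is a hypothetical stand-in for a TE link-state database.
    """
    # Constraint step: keep only links with enough available bandwidth.
    adjacency = {}
    for (u, v), (cost, available_bw) in links.items():
        if available_bw >= bandwidth:
            adjacency.setdefault(u, []).append((v, cost))

    # Shortest-path step: plain Dijkstra over the pruned topology.
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            break
        for v, cost in adjacency.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if dst not in dist:
        return None

    # Walk the predecessor chain back from dst to src.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical four-switch topology: (cost, available bandwidth in Mb/s).
example_links = {
    ("A", "B"): (1, 100),
    ("B", "D"): (1, 100),
    ("A", "C"): (1, 1000),
    ("C", "D"): (2, 1000),
}
print(cspf(example_links, "A", "D", 50))
print(cspf(example_links, "A", "D", 500))
```

    In this toy topology a small request can take the cheaper route through B, while a request
    larger than that route's available bandwidth is steered through C, and a request that no
    link can carry yields no path at all.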

    Finally, an RSVP-TE signaling engine at a switch manages the bandwidth of all the interfaces

    on the switch, and programs the data-plane switch hardware to enable it to forward demultiplexed

    incoming user bits or packets as and when they arrive. Given that dynamic bandwidth sharing in

    CO networks is controlled by the signaling engine, the call-handling performance of this engine is

    critical to the scaling of CO networks. The faster the response times of signaling engines, the lower

    the cost to an application to release and reacquire bandwidth as and when needed. This allows

    applications to hold circuits only for the duration of their communication bursts, which, in turn,

    improves link utilization. The need for high call-handling performance from signaling engines can

    be met with a completely automated and distributed bandwidth-management implementation. This

    will allow for both temporal and spatial scalability (i.e., shorter call-holding times and networks

    with large numbers of switches and hosts).

    An RSVP-TE engine implemented in a control card at a switch executes three steps when it

    receives a connection setup Path message (i.e., a request for bandwidth), as shown in Fig. 2.1.

    [Figure 2.1: Distributed call-setup process progressing hop-by-hop. In the control plane of each
    GMPLS switch, a Path message (BW, D) arriving from the previous switch on the path passes through
    route lookup, bandwidth and label management, and switch fabric configuration, and is then sent
    to the next switch on the path. BW: bandwidth; D: destination address.]

    1. Route computation: Based on the destination address to which the connection is requested

    (D, in the example shown in Fig. 2.1), the RSVP-TE engine determines the next-hop switch

    toward which to route the connection, or a subset of switches on the end-to-end path within

    its area of the domain. Constrained Shortest Path First (CSPF) algorithms can only be executed

    intra-area because of the intra-area scope of bandwidth-related parameters in OSPF-TE

    messages.

    2. Bandwidth and label management: If the switch is in a position to only compute the next-hop

    switch in the route computation phase, then it needs to check if there is sufficient bandwidth

    on a link connected to the next-hop switch. If it performs CSPF to determine a part of the

    end-to-end route (i.e., the subset of switches on the path within its area of the domain), then

    this step of bandwidth management is integrated with the partial route computation. But at

    subsequent switches within the area, this step is required to check if there is sufficient band-

    width available on the link to the next-hop indicated in the partial source route passed within

    the Path signaling message (see Fig. 2.1 for how Path messages travel hop-by-hop). This

    is because local conditions can change between the last routing protocol update, which pro-

    vided the data used in the CSPF computation, and the arrival of the call being set up. Typical

    implementations use a call-blocking approach where calls are simply rejected if sufficient

    bandwidth is not available. Label management is the selection of labels to be used on incoming

    and outgoing switch interfaces. In the data plane, labels can be either explicit

    (e.g., labels used within packet headers in VC networks) or implicit (e.g., time

    slots, wavelengths, or interface identifiers in TDM, WDM, and SDM networks). In the control

    plane, labels are explicit in both types of switches, with the labels identifying the time slots,

    wavelengths, or interface identifiers to be used for the connection across a circuit switch.

    These labels are used in the next step.

    3. Switch fabric configuration: This step is needed to configure the switch fabric to forward

    user data as and when they arrive. This function maps incoming labels associated with input

    interfaces to outgoing labels on appropriate outgoing interfaces. In packet switches, there is

    an additional step to program the scheduler to enable it to serve packets arriving on the VC

    being set up at the requested bandwidth level.
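
    The three steps above can be sketched as a single handler for an incoming Path message. This
    is a minimal illustration under simplifying assumptions, not the RSVP-TE implementation of any
    real switch: the class Switch, its fields, the CallBlocked exception, and handle_path are
    hypothetical names; route lookup is a static next-hop table; and labels are plain integers.

```python
from dataclasses import dataclass, field


class CallBlocked(Exception):
    """Raised when the outgoing link cannot carry the requested bandwidth."""


@dataclass
class Switch:
    name: str
    next_hop: dict   # destination address -> name of next-hop switch
    free_bw: dict    # name of next-hop switch -> free bandwidth on that link (Mb/s)
    next_label: int = 1
    cross_connects: dict = field(default_factory=dict)  # in_label -> (hop, out_label)

    def handle_path(self, dest, bw, in_label):
        """Process a Path message (bw, dest) that arrived with `in_label`."""
        # Step 1: route computation (here, a simple table lookup).
        hop = self.next_hop[dest]

        # Step 2: bandwidth and label management. Call blocking means the
        # request is rejected immediately if bandwidth is insufficient.
        if self.free_bw[hop] < bw:
            raise CallBlocked(f"{self.name}: {bw} Mb/s not available toward {hop}")
        self.free_bw[hop] -= bw
        out_label = self.next_label
        self.next_label += 1

        # Step 3: switch fabric configuration, mapping the incoming label
        # to the outgoing interface and label.
        self.cross_connects[in_label] = (hop, out_label)

        # The caller would forward the Path message to `hop` with `out_label`.
        return hop, out_label


sw = Switch(name="S1", next_hop={"D": "S2"}, free_bw={"S2": 1000})
first = sw.handle_path("D", 600, in_label=7)
try:
    sw.handle_path("D", 600, in_label=8)
    second_blocked = False
except CallBlocked:
    second_blocked = True
```

    The second request is rejected because only 400 Mb/s remain on the link toward S2, which is
    the call-blocking behavior described in step 2. A real engine would also release the
    bandwidth and labels when the call is torn down, which this sketch omits.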

    Fig. 2.1 does not show the rest of the call-setup procedure, namely the continuation of the Path

    message propagation hop-by-hop and the Resv message returning in the opposite direction, which

    implicitly confirms successful connection setup. Detailed procedures are also defined in RSVP-TE

    for call-setup failure.

    As mentioned in step 2, the bandwidth-management procedure implemented in most GMPLS

    switches is based on call blocking. In other words, if the requested bandwidth is not available when

    a call arrives, the call request is rejected. There is support for preemption, but if no existing call is

    preemptable (because of priority levels), then the call is blocked.
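
    Call blocking of this kind is the loss behavior that the Erlang-B formula describes, and it
    is the formula this thesis uses to study application suitability. The sketch below computes
    the blocking probability with the standard recurrence B(E, 0) = 1 and
    B(E, m) = E B(E, m-1) / (m + E B(E, m-1)). It assumes Poisson call arrivals and a link that
    is divided into m equal-rate circuits; the function name erlang_b and the example loads are
    illustrative choices, not values taken from the thesis.

```python
def erlang_b(offered_load, circuits):
    """Blocking probability for `circuits` circuits offered `offered_load` Erlangs.

    Uses the numerically stable recurrence instead of the factorial form.
    """
    if offered_load < 0 or circuits < 0:
        raise ValueError("offered_load and circuits must be non-negative")
    blocking = 1.0  # B(E, 0) = 1: with no circuits, every call is blocked
    for m in range(1, circuits + 1):
        blocking = offered_load * blocking / (m + offered_load * blocking)
    return blocking


# Same offered load per circuit (0.8 Erlang), increasing number of circuits m
# that share the link.
for m in (1, 10, 100):
    print(m, erlang_b(0.8 * m, m))
```

    For a fixed offered load per circuit, the blocking probability falls as the number of
    circuits sharing the link grows, which is the intuition behind favoring applications whose
    per-circuit bandwidth is a small fraction of the link capacity.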

    The counterpart call-queuing model, though analyzed in textbooks [44], is seldom implemented.

    This is because a call traversing multiple links requires a simultaneous allocation of

    bandwidth on all these links. A distributed call-queuing model requires a call (an RSVP-TE Path

    message) to wait in a queue until resources become available at the first switch, and then to join a

    queue at the next switch in a hop-by-hop manner as shown in Fig. 2.1. Resources allocated to a call

    at upstream switches will lie unused while the Path messages are queued at downstream switches.

    Parallelizing this wait time by simultaneously queuing the call at multiple switches will decrease

    wasted bandwidth, but not eliminate it. Therefore, call queuing is seldom implemented.

    The RSVP-TE and OSPF-TE control-plane protocols do not support advance reservations of

    bandwidth. For example, there are no objects defined in RSVP-TE to specify a future start time in

    a Path message. Nor are there parameters defined in OSPF-TE to report future loading conditions

    in the TE link state advertisements. Hence, these GMPLS control-plane protocols only support

    immediate-request or on-demand calls.

    2.1.2 Existing Switches, Gateways, and Networks

    The most common network switches today are Ethernet switches, IP routers and SONET/SDH

    switches. The first two are primarily connectionless packet switches; however, Ethernet switches

    have VLAN capabilities with limited Quality of Service (QoS) support. A VLAN is constructed

    by programming the switch to include two or more ports. It can be tagged or untagged. In tagged

    mode, all Ethernet frames are tagged with a VLAN header that includes a VLAN ID. Frames

    tagged with the same VLAN ID are treated in the same manner; that is, they are forwarded to all

    the ports belonging to that VLAN. An untagged VLAN with two ports is essentially an SDM circuit

    because all Ethernet frames arriving on either port are sent exclusively to the other port. No frames

    arriving on other ports are forwarded to ports in an untagged VLAN. Ethernet switches available

    from Extreme Networks, Dell, Cisco, Intel, Foundry, and Force 10, just to name a few vendors,

    have these capabilities. Thus, the data-plane capabilities required to create circuits or VCs through

    Ethernet switches are now available. However, control-plane software used to set up and release

    circuits dynamically is not implemented within these switches. The Dragon project has developed a

    software module called the Virtual Label Switch Router (VLSR), which implements the RSVP-TE

    and OSPF-TE protocols. It runs on an external Linux host connected to the Ethernet switch [46] and

    manages the bandwidth of the switch. It issues Simple Network Management Protocol (SNMP) [7]

    commands to create the VLANs for admitted connections. With this external software, the Ethernet

    switches become fully equipped CO switches.

    IP routers are equipped with MPLS engines and RSVP-TE signaling software for dynamic

    control of MPLS VCs. Both Cisco and Juniper routers support MPLS.

    SONET/SDH and WDM switches are circuit switches in which time slots and wavelengths

    are respectively mapped from incoming to outgoing interfaces. Some of these switches now

    support RSVP-TE and OSPF-TE control-plane implementations. For example, Sycamore SONET

    switches implement these protocols. Examples of WDM switches that implement GMPLS control-

    plane protocols include Movaz and Calient WDM equipment.

    In addition to supporting pure CO-switching functionality, some of this equipment can be used

    as gateways to interconnect different types of networks. Before describing the gateway functional-

    ity of these pieces of equipment, we establish some terminology.

    We define the term network to consist of switches and endpoints (data-sourcing and sink-

    ing entities) interconnected by shared communication links, on which the sharing (multiplexing)

    mechanism is the same on all links. Further, we define the term switch as an entity in which all

    links (interfaces) support the same (single) form of multiplexing (referred to as switching capabil-

    ity [45]). For example, a SONET switch is one in which all interfaces carry TDM signals formatted

    according to the SONET multiplexing standards, and a SONET network is one in which all the

    switches are SONET switches. Typical endpoints in a SONET network are IP routers with SONET

    line cards; these nodes are endpoints in the SONET network as they source and sink data carried on

    the SONET network.

    We use the term internetwork to denote an interconnection of networks (referred to as multi-

    region networks) [45]. Entities (nodes) that interconnect networks necessarily need the ability to

    support interfaces with different types of multiplexing capabilities, minimally two. We use the term

    gateways to refer to such nodes. An IP router is a gateway in the connectionless Internet with

    different line cards implementing the protocols of the networks to which they are connected. The

    gateway functionality is achieved by the IP implementation within the router examining IP datagram

    headers to determine how to route a packet from an incoming network to an appropriate outgoing

    network. In contrast, gateways in a CO internetwork move data from one network to another using

    circuit or VC techniques. For example, Ethernet cards in a Sycamore SN16000 implement the

    Generic Framing Procedure (GFP) Ethernet-to-SONET encapsulation to map all frames received

    on any of its Ethernet ports into a port on a SONET line card, which connects this gateway node

    to a SONET network. In this scenario, the circuit is a simple SDM circuit. We thus refer to these

    gateways as circuit or VC gateways to contrast them with packet-based IP routers. An example of

    a VC gateway is a Cisco GSR 12008, which supports line cards that can be programmed to map all

    frames arriving on a specific VLAN into an MPLS tunnel set up on one of its other ports. It thus

    interconnects a VLAN-based CO network to an MPLS-based CO network.

    While the data-plane capabilities for extracting data from one type of multiplexed connection

    and sending it on to a different type of multiplexed connection are available, the control-plane capa-

    bilities for controlling such circuits or VCs are not yet standardized, and hence, not implemented.

    Finally, as for current CO network deployments, SONET/SDH and WDM networks are al-

    ready in widespread deployment. However, the dynamic bandwidth provisioning capability sup-

    ported by the GMPLS control-plane protocols, while available on some switches in deployment, is

    not yet made available to users. Similarly, the Abilene backbone of Internet2 and DOE's ESnet have

    routers with built-in MPLS and RSVP-TE capabilities. There are ongoing research projects [22,24]

    to enable the use of dynamically requested VCs through these networks, including CHEETAH [13],

    a SONET based network, and DRAGON [46], a WDM based network. Both CHEETAH and

    DRAGON are call-blocking and immediate-request GMPLS networks.

    2.2 CHEETAH Network

    Our research group has deployed the CHEETAH network as part of an NSF-sponsored project

    proposed to provide high-speed, end-to-end connectivity on a call-by-call basis. In this section, we

    review the CHEETAH concept and the current experimental testbed. We also describe the end-host

    software needed in CHEETAH-connected computers.

    2.2.1 CHEETAH Concept and Network

    CHEETAH is a networking solution to provide end-host applications access to end-to-end CO ser-

    vices, while preserving the connectionless services already available to them via the Internet. In

    other words, CHEETAH is designed as an add-on service to existing Internet connectivity, and

    further, it leverages the services of the latter.

    As shown in Fig. 2.2, end hosts are equipped with two Ethernet Network Interface Cards (NICs).

    The primary NICs (NIC I) in the end hosts are connected to the public Internet through the usual

    LAN Ethernet switches or IP routers, while the secondary NICs (NIC II) are connected to Ethernet

    ports on Ethernet-to-SONET circuit gateways.

    Figure 2.2: CHEETAH concept (each end host has NIC I attached through IP routers to the packet-switched Internet and NIC II attached through an Ethernet-SONET gateway to the optical circuit-switched CHEETAH network)

    Ethernet-to-SONET circuit gateways, in turn, are connected to wide-area SONET circuit-

    switched networks, in which both circuit gateways and pure SONET switches are equipped with

    GMPLS protocols to support call-by-call dynamic bandwidth sharing. End-to-end CHEETAH cir-

    cuits (as shown in the dashed line in Fig. 2.2) are set up dynamically between end hosts with

    RSVP-TE signaling messages being processed at each intermediate gateway or switch in a hop-by-

    hop manner.

    The add-on design of the CHEETAH network brings two benefits:

    1. Connectivity to the Internet allows a CHEETAH end host to communicate with other

    non-CHEETAH hosts on the Internet while it communicates with another CHEETAH end host

    through a dedicated CHEETAH circuit.

    2. Applications can selectively choose to request CHEETAH circuits only when the Internet

    path is estimated to provide a lower service quality than the CHEETAH circuit, and further

    fall back to the Internet path if the CHEETAH circuit-setup attempt fails due to an unavail-

    ability of circuit resources on the CHEETAH network.

    Currently, the CHEETAH network consists of three Ethernet-to-SONET circuit gateways,

    which are Sycamore SN16000 switches, deployed at MCNC in Research Triangle Park (RTP),

    NC, Southern Crossroads (SOX) and Southern Light Rail (SLR) in Atlanta, GA, and Oak Ridge

    National Laboratory (ORNL) in Oak Ridge, TN. The testbed layout is shown in Fig. 2.3. Hosts,

    running Linux, are connected via Gigabit Ethernet (GbE) NICs to the SN16000 switches. The cir-

    cuits, set up and released dynamically, consist of Ethernet segments from the hosts to the switches

    mapped to Ethernet-over-SONET segments between the switches. The GbE signal is mapped to a

    21-OC1 virtually concatenated SONET signal to create an end-to-end 1 Gb/s dedicated circuit.

    Figure 2.3: CHEETAH experimental testbed (Sycamore SN16000 switches with control, OC192, and cross-connect cards at ORNL, TN; SOX/SLR, GA; and MCNC/NCSU, NC; attached hosts zelda1 to zelda5 and wukong; Juniper routers connecting to the Internet)

    2.2.2 CHEETAH End-Host Software

    We have developed a software package for Linux hosts, called CHEETAH end-host software,

    to enable the automatic use of CHEETAH circuits. Wherever possible, our goal is to integrate li-

    braries of this CHEETAH end-host software into application software modules to make CHEETAH

    services transparent to human users.

    The CHEETAH end-host software architecture is shown in Fig. 2.4. The Optical Connectivity

    Service (OCS) client module is used to determine whether the correspondent end host (called

    party) is on the CHEETAH network. It does this by sending a TXT query to a Domain Name

    Server (DNS). The TXT resource record is a generic type supported by DNS to allow users to store

    any data about hosts. The TXT data we store for a CHEETAH end host consist of an indication that

    it is a CHEETAH end host, along with the IP and MAC addresses of the host's secondary NIC.
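    The OCS lookup described above can be sketched as a small parser for the TXT payload. The thesis does not specify the record layout, so the `cheetah=1;ip=...;mac=...` format, the field names, and the function name below are hypothetical and serve only to illustrate the idea; the DNS query itself is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheetahHostInfo:
    ip: str   # IP address of the secondary NIC
    mac: str  # MAC address of the secondary NIC


def parse_ocs_txt(txt: str) -> Optional[CheetahHostInfo]:
    """Parse a hypothetical 'cheetah=1;ip=...;mac=...' TXT payload.

    Returns None when the record does not mark the host as a CHEETAH
    end host or when an address field is missing.
    """
    fields = {}
    for part in txt.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:  # keep only well-formed key=value pairs
            fields[key.lower()] = value.strip()
    if fields.get("cheetah") != "1":
        return None
    if "ip" not in fields or "mac" not in fields:
        return None
    return CheetahHostInfo(ip=fields["ip"], mac=fields["mac"])
```

    A caller would treat a `None` result as "the called party is not on the CHEETAH network" and continue over the Internet path.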

    The routing decision (RD) module answers queries from applications as to whether to attempt

    a circuit setup. It makes these decisions by using collected measurements about the two paths, the

    Internet path and the CHEETAH path, along with the size of the file to be transferred.

    Figure 2.4: CHEETAH end-host software (on each end host: application, OCS client, routing decision, RSVP-TE client, and C-TCP modules above TCP/IP; NIC 1 attaches to the Internet and NIC 2 to the CHEETAH network)
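    A minimal sketch of the kind of decision the RD module makes, together with the fallback to the Internet path when circuit setup fails, is given below. The comparison rule (estimated transfer time on each path), the parameter names, and the `try_circuit_setup` callback are assumptions made for illustration; the measurements the actual module collects are not detailed in this section.

```python
from typing import Callable


def should_request_circuit(file_size_bytes: float,
                           internet_rate_bps: float,
                           circuit_rate_bps: float,
                           setup_delay_s: float) -> bool:
    """Return True if the circuit is expected to finish the transfer sooner."""
    bits = 8.0 * file_size_bytes
    internet_time_s = bits / internet_rate_bps
    # The circuit pays a one-time call-setup delay before data can flow.
    circuit_time_s = setup_delay_s + bits / circuit_rate_bps
    return circuit_time_s < internet_time_s


def choose_path(file_size_bytes: float,
                internet_rate_bps: float,
                circuit_rate_bps: float,
                setup_delay_s: float,
                try_circuit_setup: Callable[[], bool]) -> str:
    """Pick 'circuit' or 'internet', falling back if the setup is blocked."""
    if should_request_circuit(file_size_bytes, internet_rate_bps,
                              circuit_rate_bps, setup_delay_s):
        if try_circuit_setup():  # hypothetical hook for the signaling client
            return "circuit"
    return "internet"
```

    With a 1 s setup delay, a 1 GB file over a 1 Gb/s circuit beats a 100 Mb/s Internet path, while a 10 KB file does not, which is why small files stay on the Internet.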

    The RSVP-TE client module is used to initiate the setup and release of CHEETAH circuits

    [59]. Parameters provided to this module include the secondary NIC IP address of the destination

    to which a circuit is being requested and the desired bandwidth. The Sycamore switches in the

    CHEETAH network receive these RSVP-TE messages, process them and set up circuits if the

    requested bandwidth is available to the specified destination. It is a distributed switch-by-switch

    signaling procedure.

    The Circuit-TCP (C-TCP) module is the transport protocol that we have developed for CHEE-

    TAH circuits [33]. Given that the bandwidth of a dedicated circuit is known before a file transfer

    starts, any changes in the sending rate will either cause the circuit to remain idle or cause the receiver

    buffer to fill up. Since neither option is desirable, we essentially removed the congestion-control

    algorithms of TCP that were designed to keep adjusting the sending rate based on IP network

    conditions in order to create our C-TCP module. Congestion control is disabled selectively,

    only for TCP connections traversing the secondary NIC, which is used for CHEETAH circuits.

    TCP connections traversing the primary NIC connected to the Internet continue using the standard

    TCP code.

    Corresponding to each CHEETAH software module is a library providing application program-

    ming interfaces (APIs) to invoke the services of each module. These libraries are expected to be

    linked into applications using the CHEETAH software and network.

    Chapter 3

    ANALYTICAL MODELS OF GMPLS NETWORKS

    In Chapter 2, we reasoned that GMPLS networks are call-blocking networks that only support

    immediate-request calls. One important question is, what applications, if any, are suitable for GM-

    PLS networks. This chapter addresses this problem. First, we present bandwidth sharing models for

    two types of applications, ones in which the per-circuit bandwidth and mean call-holding time are

    independent and ones in which they are dependent (file transfers). Then, we provide numerical

    results for both models. Finally, we conclude that, for both types of applications, GMPLS networks

    are well suited when the required per-circuit bandwidth is on the order of one-hundredth of the

    shared link capacity.

    3.1 Bandwidth Sharing Model

    The switch model used in our analysis is illustrated in Fig. 3.1, in which calls originating from hosts

    on the N links (e.g., the N Ethernet links connecting hosts to Ethernet interfaces on a gateway)

    share the link capacity C on link L (e.g., the SONET/SDH/WDM/MPLS link out of a gateway).

    We assume that call-setup requests arrive according to a Poisson process with rate $\lambda$, since many

    call-arrival processes observable in practice can be modeled as Poisson processes [44]. Further, we

    assume that call-holding times follow arbitrary distributions with a mean call-holding time denoted

    as $1/\mu$. To understand the types of applications that can be supported on GMPLS circuit-switched

    networks, we make a simplifying assumption that all calls are of the same type; that is, they need

    the same amount of bandwidth. This allows us to treat link L as a link of m circuits, where each

    circuit is of capacity C/m.

    Figure 3.1: Call-based sharing model for any single link of a switch (input links 1, 2, ..., N-1, N share link L of capacity C)

    We ask two questions about the suitability of applications for GMPLS networks:

    1. Are applications that require high-bandwidth circuits more or less desirable than applications

    that require low-bandwidth circuits? (See footnote 1.)

    2. Are applications that generate calls with long mean holding times more or less desirable than

    calls with short mean holding times?

    The first question is related to m, the number of circuits. The larger the per-circuit bandwidth, the

    smaller the m for a given link capacity C. The second question is related to the mean call-holding

    time, $1/\mu$.

    For applications such as remote visualization and video conferencing, the mean holding time is

    independent of the per-circuit bandwidth. On the other hand, for file transfers, commonly identified

    as an application suitable for high-speed circuits [57], m and $1/\mu$ are related. The larger the

    per-circuit bandwidth (the smaller the m), the lower the mean call-holding time, $1/\mu$. We describe

    models for these two cases in the following subsections, respectively.

    3.1.1 Model for Applications in which Call-Holding Time is Independent of Per-

    Circuit Bandwidth

    Given our assumptions, we can model link L as an M/G/m/m system [44]. The call-blocking

    probability in this model is given by the well-known Erlang-B formula:

    $$P_b = \frac{\rho^m / m!}{\sum_{i=0}^{m} \rho^i / i!} \tag{3.1}$$

    where $\rho$, the offered traffic load, is given by $\rho = \lambda/\mu$. Although this is a time-tested model for

    telephony traffic, we found it useful for our current problem of identifying applications suited to

    GMPLS networks.

    Footnote 1: In this chapter, we only use the word "circuits," but the same model and analysis hold for virtual circuits as well.

    Assume that the number of calls per second arriving on each of the N ports that are destined for

    link L is $\lambda'$. Thus, from Fig. 3.1, the aggregate call-arrival rate for link L, $\lambda$, is given by:

    $$\lambda = N \lambda' \tag{3.2}$$

    The utilization of link L, U, is given by:

    $$U = \frac{\rho}{m} (1 - P_b) \tag{3.3}$$
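    As a quick numerical check of (3.1) to (3.3), the sketch below evaluates the formulas term by term. The function names are illustrative, and the direct factorial form is an assumption of convenience: it is suited to small m only, because the powers and factorials grow too large for floating-point arithmetic at large m.

```python
from math import factorial


def erlang_b(rho: float, m: int) -> float:
    """Call-blocking probability P_b from (3.1), evaluated term by term."""
    numerator = rho**m / factorial(m)
    denominator = sum(rho**i / factorial(i) for i in range(m + 1))
    return numerator / denominator


def aggregate_rate(n_ports: int, per_port_rate: float) -> float:
    """Aggregate call-arrival rate on link L from (3.2)."""
    return n_ports * per_port_rate


def utilization(rho: float, m: int) -> float:
    """Utilization of link L from (3.3)."""
    return (rho / m) * (1.0 - erlang_b(rho, m))
```

    For a single circuit offered one Erlang, `erlang_b(1.0, 1)` evaluates to 0.5, so half the calls are blocked and the utilization is also 0.5.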

    3.1.2 Model for Applications in which Call-Holding Time is Dependent on Per-

    Circuit Bandwidth

    File-transfer applications belong in this category. Given that the GMPLS switch operates in a

    call-blocking mode even when used for this category of applications, equations (3.1) to (3.3) apply

    here as well. If file sizes are too small, the overhead incurred in call-setup delay will significantly reduce

    link utilization (since call-setup delays could exceed file-transfer delays). Therefore, Veeraraghavan's

    team [57] proposed using an RD module at end hosts to decide, based on the file size and

    other metrics, whether to request a circuit for a particular file transfer, or whether to simply use the

    Internet connectivity.

    Fig. 3.2 illustrates a model for the file transfer application. We use a settable parameter, the

    crossover file size, $\chi$, to model the behavior of the RD module, wherein files larger than $\chi$ are

    routed to the CO network.

    Figure 3.2: A bandwidth sharing model for file transfers (end hosts 1, 2, ..., N-1, N, each generating file transfers at rate $\lambda_0$ and each with a routing decision (RD) module, share link L of capacity C)

    We assume that file sizes are distributed according to the Pareto distribution with the probability

    density function:

    $$f(x) = \frac{\alpha k^{\alpha}}{x^{\alpha+1}}, \quad x \ge k \tag{3.4}$$

    where $\alpha$ is the shape parameter (the larger the $\alpha$, the higher the probability of small file sizes),

    and k is the scale parameter, denoting the minimum file size. Crovella [14] characterized web file

    sizes as following this distribution and suggested $\alpha$ in the range from 1.0 to 1.3 and a value for k of

    1000 bytes.

    Given that only files larger than $\chi$ are routed to the CO network, using (3.4), we derive the mean

    file size, $E[X \mid X \ge \chi]$, as

    $$E[X \mid X \ge \chi] = \frac{\alpha \chi}{\alpha - 1} \tag{3.5}$$

    We then estimate the mean call-holding time, $1/\mu$, as

    $$\frac{1}{\mu} = T_{prop} + E[T_{emission}] \tag{3.6}$$

    where $T_{prop}$ is the one-way propagation delay, and

    $$E[T_{emission}] = \frac{E[X \mid X \ge \chi]}{C/m} = \frac{\alpha \chi}{\alpha - 1} \cdot \frac{m}{C} \tag{3.7}$$

    By neglecting $T_{prop}$, we can approximate:

    $$\frac{1}{\mu} = \frac{\alpha \chi}{\alpha - 1} \cdot \frac{m}{C} \tag{3.8}$$

    capturing the inter-dependence of m and $1/\mu$. We justify neglecting $T_{prop}$ as follows. $E[T_{emission}]$

    should be larger than $T_{prop}$ because the latter is incurred as part of call-setup delay, and to maintain

    a high link utilization, mean call-setup delay should be much smaller than $E[T_{emission}]$, which means

    that $T_{prop}$ is much smaller than $E[T_{emission}]$.

    From Fig. 3.2, we can derive the call-arrival rate at link L as:

    $$\lambda = N \lambda' = N \lambda_0 P(X \ge \chi) = N \lambda_0 \left(\frac{k}{\chi}\right)^{\alpha} \tag{3.9}$$

    Combining (3.9) with the mean holding time from (3.8), we get

    $$\rho = \frac{\lambda}{\mu} = N \lambda_0 \cdot \frac{\alpha}{\alpha - 1} \cdot \frac{k^{\alpha}}{\chi^{\alpha - 1}} \cdot \frac{m}{C} \tag{3.10}$$
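    The chain from the Pareto file-size model to the offered load can be sketched numerically as follows. The function and parameter names are illustrative. One assumption is made explicit in the code: file sizes are taken in bytes and the link capacity in bits per second, so a factor of 8 is applied, whereas the equations above leave the unit conversion implicit.

```python
def mean_routed_file_size(alpha: float, chi: float) -> float:
    """Mean size of files larger than the crossover size chi (Pareto, alpha > 1)."""
    if alpha <= 1.0:
        raise ValueError("the conditional mean is finite only for alpha > 1")
    return alpha * chi / (alpha - 1.0)


def mean_holding_time(alpha: float, chi: float, m: int, capacity_bps: float) -> float:
    """Mean call-holding time in seconds, neglecting propagation delay."""
    # Assumption: sizes in bytes, capacity in bits/s, hence the factor of 8.
    return 8.0 * mean_routed_file_size(alpha, chi) * m / capacity_bps


def circuit_call_rate(total_rate: float, alpha: float, k: float, chi: float) -> float:
    """Rate of calls routed to the circuit network; total_rate is N times the per-host rate."""
    if chi < k:
        raise ValueError("the crossover size must be at least the minimum file size k")
    return total_rate * (k / chi) ** alpha


def offered_load(total_rate: float, alpha: float, k: float, chi: float,
                 m: int, capacity_bps: float) -> float:
    """Offered load in Erlangs: call rate times mean holding time."""
    return (circuit_call_rate(total_rate, alpha, k, chi)
            * mean_holding_time(alpha, chi, m, capacity_bps))
```

    Because the call rate falls as the crossover size to the power of minus alpha while the holding time grows only linearly, raising the crossover size lowers the offered load whenever alpha exceeds 1.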

    3.2 Numerical Results

    3.2.1 Applications in which Call-Holding Time is Independent of Per-Circuit Band-

    width

    Assume that the link capacity C = 10 Gb/s. This is a reasonable value if the switch is a SONET

    or MPLS switch. For WDM switches, if the number of wavelengths on link L is 100, then a more

    reasonable value for C would be 1 Tb/s because each wavelength is typically engineered to support

    10 Gb/s. We will consider this number later in this chapter. For now, we consider C = 10 Gb/s.

    We study the effect of changing m from 1 to 1000; in other words, the per-circuit bandwidth

    varies inversely from 10 Gb/s down to 10 Mb/s. We obtain numerical results corresponding to four

    different fixed values of U: 40%, 60%, 80%, and 90%. Since we have two equations, (3.1) and (3.3), if

    we fix two parameters, U and m, then the other two variables, $\rho$ and $P_b$, become fixed as well. We

    use an iterative algorithm as follows to obtain these values. First, we observe that for a given m, U

    increases as $\rho$ increases. We also conduct experiments to confirm the observation. Then, we start

    by assigning $\rho = m$ temporarily, and compute the corresponding $P_b$ and U. If the current U is larger

    than the given U, meaning that $\rho$ is too large, we decrease $\rho$ in steps of $\Delta\rho = 0.001$ until the corresponding

    U in the current iteration is smaller than the given U; otherwise, we increase $\rho$ by $\Delta\rho$ until the

    corresponding U in the current iteration is larger than the given U. Next, we compare the current U

    and its neighbor in the previous iteration and keep whichever is closer to the given U. Finally,

    we compute the corresponding $P_b$. Fig. 3.3 plots $P_b$ vs. m.
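    The search described above can be sketched as follows. This is not the stepping procedure of the text: it swaps in bisection, which relies on the same observation that U increases with the offered load, and it evaluates the Erlang-B formula with the standard recursion so that large m does not overflow. Function names and the tolerance are illustrative.

```python
def erlang_b(rho: float, m: int) -> float:
    """Erlang-B blocking probability via the numerically stable recursion."""
    b = 1.0  # blocking with zero circuits
    for n in range(1, m + 1):
        b = rho * b / (n + rho * b)
    return b


def utilization(rho: float, m: int) -> float:
    """Link utilization: carried load divided by the number of circuits."""
    return rho * (1.0 - erlang_b(rho, m)) / m


def solve_load(target_u: float, m: int, tol: float = 1e-9):
    """Find the offered load that yields target_u; return (load, blocking)."""
    if not 0.0 < target_u < 1.0:
        raise ValueError("target utilization must lie strictly between 0 and 1")
    lo, hi = 0.0, float(m)
    while utilization(hi, m) < target_u:  # widen the bracket if needed
        hi *= 2.0
    while hi - lo > tol:  # bisection on the monotone function U(load)
        mid = 0.5 * (lo + hi)
        if utilization(mid, m) < target_u:
            lo = mid
        else:
            hi = mid
    rho = 0.5 * (lo + hi)
    return rho, erlang_b(rho, m)
```

    Calling `solve_load(0.8, 10)` gives a blocking probability of about 0.236, in line with the 23.62% quoted for m = 10 at 80% utilization.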

    Figure 3.3: Plots of $P_b$ vs. m for U = 40%, 60%, 80%, and 90%: (a) m in [1, 100]; (b) m in [101, 1000]

    From Fig. 3.3a, we see that at small values of m, it is hard to achieve high utilization combined

    with low call-blocking probability. Consider m = 10, which corresponds to a per-circuit allocation

    of 1 Gb/s per call (e.g., for HDTV applications). To run the link at an 80% utilization level, the

    corresponding call-blocking probability will be a high 23.62%. In Fig. 3.3b, we show the effect of

    large m, at which both high utilization and low call-blocking probability are achievable.

    The effect of traffic load $\rho$ is not obvious from Fig. 3.3. Therefore, we plot the traffic load $\rho$

    vs. m and $\rho/m$ vs. m in Fig. 3.4. From Fig. 3.4a, we see that $\rho$ should be engineered to be high

    when m is high. We also see that, as m increases, $P_b$ decreases and $\rho/m$ approaches U according to

    (3.3). For example, when U = 60%, $\rho/m$ approaches 0.6, reaching this value when m = 80. Thus,

    $\rho$ is typically close to and less than m when $P_b$ is low (close to 0) and U is high (close to 1). For

    example, at a fixed value of U = 80%, when m = 100, $\rho$ = 80.35 and $P_b$ = 0.4%, and when m = 1000,

    $\rho$ = 800 and $P_b \approx 0$.

    Figure 3.4: Plots of $\rho$ vs. m and $\rho/m$ vs. m (curves for U = 40%, 60%, 80%, and 90%): (a) $\rho$ vs. m; (b) $\rho/m$ vs. m

    From the two graphs (Figs. 3.3 and 3.4) we see that if we want to operate the link at a given

    value of call-blocking probability, and a given value of utilization, the number of circuits, m, and

    traffic load, $\rho$, become fixed. An alternative starting point is that a given application has a fixed

    capacity requirement, which means that m is fixed. If we further assume that $\lambda'$, the call-arrival

    rate per port, and mean call-holding time, $1/\mu$, are intrinsic to the application, then we can only

    adjust the aggregate traffic load by engineering N to achieve a given call-blocking probability or

    utilization. But these graphs show us that once m is set, if m is small, we are highly limited in our

    ability to achieve both high utilization and low call-blocking probability.

    Having understood the influences of all the important variables in this model, $\rho$, m, $P_b$, and U, let

    us now consider three applications. The first application is a high-bandwidth application (m = 10),

    the second, a low-bandwidth application (m = 1000) and finally, an intermediate-level bandwidth

    application (m = 100).

    High-bandwidth applications: When m = 10 (that is, when the application requires a per-circuit

    bandwidth of 1 Gb/s), we can achieve a target 80% utilization only by operating the link at

    a high call-blocking probability of 23.62%. Such a high call-blocking probability could be

    unacceptable to users. We conclude that applications requiring a high per-circuit capacity relative to

    the shared link capacity are unsuitable for the immediate-request call-blocking mode of bandwidth

    sharing offered by GMPLS networks in situations where high utilization and low call-blocking

    probability are important. Since, as discussed in Section 2.1.1, call queuing is not an option, it appears

    that we need a book-ahead mechanism for such applications.

    We then ask whether the above answer is dependent on the mean call-holding time. In other

    words, when m is small, do we require a book-ahead mechanism only if the mean call-holding time

    is large or do we need such a mechanism even if the mean call-holding time is small? For example,

    in a doctor's office, where there are three to four doctors per office (m is 3 or 4), since the mean

    holding times (appointment lengths) are fairly high, on the order of 20-30 minutes, we use a book-

    ahead mechanism. If the mean holding time is on the order of 1-2 minutes (e.g., at a bank teller),

    could an immediate-request approach work? The answer is that it would if there were space to wait.

    In other words, if the queuing system has a buffer in which calls can wait, high-bandwidth calls that have short

    mean holding times could be handled without a reservation system. Unfortunately, as explained in

    Section 2.1.1, queuing models are not suitable for calls. Therefore, for applications that require

    high bandwidth (i.e., m is small, irrespective of the mean call-holding time), our conclusion of

    needing a book-ahead mechanism holds.

    Low-bandwidth applications: At the other extreme, consider large values of m, say m = 500

    to m = 1000. For example, in a video-telephony application with motion-JPEG cameras operating

    at 25 frames/sec (motion-JPEG used instead of MPEG to meet the stringent delay requirements of

    telephony), we could allocate 10 Mb/s on an MPLS-shared 10 Gb/s link, in which case m = 1000.

    At these high values of m, call-blocking probability of almost 0 and utilization levels close to 1 are

    achievable as seen in Fig. 3.3b; however, the required traffic load is high (close to m) as noted in

    our analysis of Fig. 3.4.

    Whether and how such traffic loads can be engineered depends upon the second important

    factor, mean call-holding time. At a traffic load $\rho$ = 500, if the mean call-holding time is small (say

    3 minutes for a video-telephony call, which is the number typically quoted as the mean duration of

    telephony calls), the aggregate call-arrival rate, $\lambda$, needs to be about 2.8 calls/sec. Say on average

    each end host makes 1 call every two hours, which means $\lambda'$ in (3.2) is about 0.5 calls/hour. This

    means that we need N to be 20160 to obtain an aggregate $\rho$ of 500 Erlangs. In other words, we

    need calls from 20160 end hosts to be multiplexed (perhaps through a multi-level hierarchy of

    switches) into the switch shown in Fig. 3.1, destined to share link L's capacity. This is a high level

    of aggregation requiring switches with large numbers of ports. Since line cards (the more the ports,

    the more the line cards) drive up the cost of switches, our conclusion is that to achieve a high

    utilization with low-bandwidth applications that have short durations and low call-arrival rates,

    we need to equip the switch with a large number of line cards to generate sufficient traffic, which

    could be expensive.

    Consider what happens if the mean call-holding time, $1/\mu$, is larger, say 2 hours, and the mean

    call-arrival rate is still low at 1 per 2 hours. This means the number of ports, N, feeding traffic into

    the shared link can be 540. Building switches with this order of line cards is more feasible. We thus

    conclude that the immediate-request, call-blocking mode of bandwidth sharing in GMPLS networks

    can be used for low-bandwidth applications that have relatively long durations and low call-arrival

    rates. There is an upper limit on mean call-holding time, because if it is very large, unless the

    call-arrival rate is very low, $\rho$ will become very large, causing a high call-blocking probability.

    Intermediate-bandwidth applications: Finally, consider an intermediate level, where m is in

    the range of 100. As seen from Fig. 3.3, call-blocking probabilities are very small when m = 100

    even at utilizations of 90%. Now consider the question of mean call-holding times. If we again use

    the video-conferencing application or eScience remote-visualization applications where the per-

    circuit bandwidth is 100 Mb/s on a 10 Gb/s link (which means m = 100), and mean call-holding

    times are in the 2-hour range, the required aggregate call-arrival rate is 40 per hour. If each port of

    the switch offers a load of 1 call per 5 hours, we need N to be 200, which is an acceptable number

    from a switch-cost perspective. Clearly, the higher the mean holding time, the smaller the N, and

    hence, the more preferable the application. This result again is surprising: calls with long holding

    times are preferable to calls with short holding times in a call-blocking mode of operation.
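    The port-count arithmetic used in the examples above follows from the definition of offered load and from (3.2), and can be sketched as a one-line calculation. The function and parameter names are illustrative.

```python
def ports_needed(load_erlangs: float,
                 mean_holding_s: float,
                 per_port_calls_per_s: float) -> float:
    """Number of ports N needed to offer a given load to the shared link."""
    # Aggregate call-arrival rate: load divided by mean holding time.
    aggregate_calls_per_s = load_erlangs / mean_holding_s
    # Each port contributes per_port_calls_per_s, as in (3.2).
    return aggregate_calls_per_s / per_port_calls_per_s


HOUR = 3600.0

# Intermediate-bandwidth example: 80 Erlangs, 2-hour calls, 1 call per 5 hours per port.
n_intermediate = ports_needed(80.0, 2 * HOUR, 1.0 / (5 * HOUR))

# Low-bandwidth example: 500 Erlangs, 3-minute calls, 1 call per 2 hours per port.
n_low = ports_needed(500.0, 180.0, 1.0 / (2 * HOUR))
```

    The first case reproduces N = 200. The second gives 20000 when the aggregate rate is kept unrounded; the figure of 20160 quoted in the text corresponds to rounding that rate to 2.8 calls/sec before multiplying.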

    In summary, applications suitable for present-day GMPLS networks are those in which the

    per-circuit capacity is on the order of 1/100th of the shared link capacity and holding times are on

    the order of tens of minutes or higher.

    3.2.2 Applications in which Call-Holding Time is Dependent on Per-Circuit Bandwidth

    As described in the model in Section 3.1.2, $1/(\mu m)$ is constant if we neglect $T_{prop}$, and hence the

    two questions raised at the start of Section 3.1 seem to reduce to one question. But if we study

    the system at certain fixed values of m, say m = 10, 100, 1000 (as in Section 3.2.1), we have a

    new parameter $\chi$, the crossover file size, with which to manipulate the mean call-holding time $1/\mu$.

    Therefore, in this section, we study the effect of $\chi$ on various metrics, such as $\rho$, $P_b$, U, and $N\lambda_0$,

    which represents the total call-arrival rate for all files whose sizes are greater than k.

    Fig. 3.5 plots the two metrics, $P_b$ and U, against $\chi$ for fixed values of m and $N\lambda_0$. The influence

    of $\chi$ on $\rho$ is interesting because two factors operate in opposing directions. As $\chi$ increases, at a given

    m, the mean call-holding time, $1/\mu$, increases. But from (3.9), we see that $\lambda$ is proportional to $\chi^{-\alpha}$

    and hence decreases as $\chi$ increases. Since $\alpha$ is larger than 1, $\lambda$ decreases at a rate faster than $1/\mu$

    increases. As a result, $\rho$ decreases with increasing $\chi$. Decreasing $\rho$ is the reason why $P_b$ and U drop

    with increasing $\chi$.

    [Figure 3.5: Plots of Pb vs. χ and U vs. χ for m = 10, 100, and 1000, Nλ0 = 50 and 100,
    α = 1.1, and k = 1.25 MB. Panel (a): Pb vs. χ (bytes). Panel (b): U vs. χ (bytes).]
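
    The opposing trends described above can be illustrated numerically. The sketch below is a
    reconstruction from the surrounding text rather than the exact form of (3.9): it assumes
    Pareto-distributed file sizes with shape 1.1 and scale k = 1.25 MB, takes the arrival rate of
    circuit calls to be the total rate multiplied by the probability that a file exceeds the
    crossover file size, and takes the mean holding time to be the mean size of those files
    divided by the per-circuit rate C/m, neglecting Tprop. All names, and the use of 1 MB = 10^6
    bytes, are assumptions made for illustration.

```python
def erlang_b(rho: float, m: int) -> float:
    """Blocking probability for an offered load of rho Erlangs on m circuits."""
    b = 1.0
    for j in range(1, m + 1):
        b = rho * b / (j + rho * b)
    return b


def offered_load(chi: float, m: int, total_rate: float,
                 link_bps: float = 10e9, alpha: float = 1.1, k: float = 1.25e6) -> float:
    """Offered load in Erlangs from calls that carry files larger than chi bytes."""
    tail = (k / chi) ** alpha                   # assumed P(file size > chi), valid for chi >= k
    lam = total_rate * tail                     # arrival rate of calls that use circuits
    mean_size = alpha * chi / (alpha - 1.0)     # mean size of files larger than chi (alpha > 1)
    holding = 8.0 * mean_size / (link_bps / m)  # mean holding time in seconds, no T_prop
    return lam * holding


if __name__ == "__main__":
    m, total_rate = 100, 100.0  # circuits on the link, total calls per second
    for chi_mb in (6, 12, 29):
        rho = offered_load(chi_mb * 1e6, m, total_rate)
        pb = erlang_b(rho, m)
        u = rho * (1.0 - pb) / m
        print(f"chi = {chi_mb:2d} MB: rho = {rho:6.2f}, Pb = {pb:.4f}, U = {u:.3f}")
```

    Under these assumptions the offered load scales as the crossover file size raised to the
    power 1 minus the shape parameter, which is why a shape parameter above 1 makes the load fall
    as the crossover file size grows.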

    In Fig. 3.5, we hold Nλ0 constant. But to see the effect of χ on the required call-arrival
    rate, we plot Nλ0 against χ for a set of given U in Fig. 3.6. From (3.10), we see that Nλ0 is
    proportional to χ^(α-1). Therefore, Nλ0 increases as χ increases. From this set of graphs, we
    see that we should select a smaller χ so that the required Nλ0 is not too large. If Nλ0 is
    large, and the per-host call-arrival rate, λ0, is low, it means that we need to engineer our
    switches with a large number of ports. Another interesting result seen in this set of plots is
    that, unlike the results in Section 3.2.1, where as m is increased, the required traffic load
    increases, here we see in Fig. 3.6 that, as m increases, the required load Nλ0 decreases.

    [Figure 3.6: Plot of Nλ0 vs. χ (bytes) for m = 10, 100, and 1000, U = 60% and 80%, α = 1.1,
    and k = 1.25 MB.]
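
    The required call-arrival rate for a target utilization can be estimated by inverting the
    Erlang-B relation numerically. The sketch below uses bisection on the offered load and then
    converts that load into a total call-arrival rate. It relies on the same assumed Pareto
    file-size model as before (shape 1.1, scale 1.25 MB, mean holding time equal to the mean size
    of files above the crossover size divided by C/m), so it is a reconstruction and not the
    thesis's equation (3.10). All names are hypothetical.

```python
def erlang_b(rho: float, m: int) -> float:
    """Blocking probability for an offered load of rho Erlangs on m circuits."""
    b = 1.0
    for j in range(1, m + 1):
        b = rho * b / (j + rho * b)
    return b


def load_for_utilization(target_u: float, m: int, tol: float = 1e-9) -> float:
    """Offered load at which rho * (1 - Pb) / m equals target_u, for 0 < target_u < 1."""
    def util(rho: float) -> float:
        return rho * (1.0 - erlang_b(rho, m)) / m

    lo, hi = 0.0, float(m)
    while util(hi) < target_u:  # carried load grows with offered load, so widen upward
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if util(mid) < target_u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def required_total_rate(target_u: float, chi: float, m: int,
                        link_bps: float = 10e9, alpha: float = 1.1, k: float = 1.25e6) -> float:
    """Total call-arrival rate (calls/s) needed to reach target_u, under the assumed model."""
    rho = load_for_utilization(target_u, m)
    tail = (k / chi) ** alpha                                      # assumed P(file size > chi)
    holding = 8.0 * alpha * chi / (alpha - 1.0) / (link_bps / m)   # mean holding time, seconds
    return rho / (tail * holding)


if __name__ == "__main__":
    for m in (10, 100, 1000):
        rate = required_total_rate(0.8, 8e6, m)
        print(f"m = {m:4d}: required total rate = {rate:.1f} calls/s")
```

    In this sketch the required rate grows with the crossover file size and shrinks as m grows,
    which is the qualitative behavior described for Fig. 3.6.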

    We further plot Fig. 3.7 to contrast the effects of m on N for non-file-transfer applications
    and file-transfer applications by fixing U and λ0. As shown in Fig. 3.3, ρ increases as m
    increases. For non-file-transfer applications, since m and 1/μ are independent and 1/μ is
    constant, λ and N increase with increasing ρ. We can also derive that the trend of N vs. m is
    the same as that of ρ vs. m (see Fig. 3.4a and Fig. 3.7a). In other words, for m at a small
    value, the curve has a higher slope than that for m at a large value. In particular, for m at
    a high value, the curve has an approximately constant slope of (μU)/λ0 (see Fig. 3.7a). But
    for file-transfer applications, 1/(mμ) is a constant for a fixed χ, C, and α. From (3.10), we
    can see that the trend of N vs. m is the same as that of ρ/m vs. m as shown in Fig. 3.4b. In
    particular, for large m, the curve for N vs. m is flat for a given U (see Fig. 3.7b). Thus,
    for file transfers, we can allocate smaller amounts of bandwidth per call, which means that m
    can be larger to achieve lower Pb and higher U without increasing N if the user can tolerate
    the longer holding time.

    [Figure 3.7: Plots of N vs. m for U = 40%, 60%, 80%, and 90%. Panel (a): N vs. m for
    non-file-transfer applications with λ0 = 0.5 call/s and 1/μ = 0.8 s. Panel (b): N vs. m for
    file-transfer applications with λ0 = 0.5 call/s, α = 1.1, k = 1.25 MB, and χ = 8 MB.]

    Repeating the questions asked in Section 3.2.1, we consider whether high-bandwidth circuits
    can be used for file transfers. We reach the same answer as in Section 3.2.1 if m = 10.
    Fig. 3.5 shows that the call-blocking probability is quite high (at 10% even at large χ) when
    m = 10. Furthermore, Fig. 3.6 shows that a higher Nλ0 load is required to achieve a certain U
    when m = 10 than when m is larger. Therefore, we conclude that high-bandwidth circuits, such
    as m = 10, are not suitable even for the file-transfer application, unless latency
    requirements dictate their use.

    We see from Fig. 3.5 that using low-bandwidth circuits (m = 1000) does not reduce Pb or
    increase U significantly if appropriate values of χ are selected, although it does not
    increase N either (see Fig. 3.7b). Given the natural advantage of lower delay when using a
    lower m for file transfers, we focus the rest of our analysis on the intermediate-bandwidth
    m = 100 case.

    Now we consider the question of what crossover file size, χ, to select when m = 100. From
    Fig. 3.5, we see that χ should be in the range from 6 MB to 29 MB to meet a utilization higher
    than 80% and a call-blocking probability lower than 5%. We observe that χ cannot be too large,
    because if it is, then U decreases and the required call-arrival rate, Nλ0, becomes large as
    seen in Fig. 3.6. On the other hand, if it is too small, then Pb becomes too high.

    To achieve a low call-blocking probability and high utilization, just as we need to choose a
    fairly large m (e.g., m = 100) in Section 3.2.1, here we see the need for a fairly high
    call-arrival rate, Nλ0 (e.g., Nλ0 = 100). At an aggregate value Nλ0 of 100 calls/sec, we also
    see that χ should be in the range from 6 MB to 29 MB. This means that the mean holding time is
    in the range of 0.5 s to 2.3 s since the per-circuit rate is 100 Mb/s when m = 100. These mean
    call-holding times are significantly smaller than the numbers we consider in Section 3.2.1,
    where even a mean call-holding time of 3 minutes results in a need for a large number of
    ports. We see from Fig. 3.5 that lowering Nλ0 can lower utilization significantly. To engineer
    an Nλ0 rate of 100 calls/sec, if λ0 is 1 call every 10 s, we require N to be 1000. This is not
    a small number and requires a cascade of switches to build up this load. For example, if the
    bottleneck link is an enterprise access link, it requires multiple aggregations from switches
    internal to the enterprise, whose links can be run at lower utilization levels, so that the
    aggregate traffic load for the enterprise access link is high enough to achieve a high
    utilization at an acceptable Pb.

    Next, we note that the very low mean call-holding times require high-speed signaling engines
    to reduce call-setup delays so that they approach round-trip propagation delays, and thus, the
    circuit utilization is high. Our work on hardware-accelerated signaling [58] shows the
    feasibility of implementing an RSVP-TE subset in hardware, which reduces per-switch call
    processing delays from the 100 ms range we measured on Sycamore switches to the order of
    microseconds.

    Finally, we note that, although a link capacity of 10 Gb/s is appropriate for SONET/SDH and
    MPLS shared links, it is low for a WDM link. If we assume that the shared link supports 100
    wavelengths, using a typical data rate of 10 Gb/s, link capacity is 1 Tb/s and the per-circuit
    bandwidth is 10 Gb/s. Media-immersive applications could consume such high levels of
    end-to-end capacity (category of applications where the mean call-holding time is independent
    of m), but for the file-transfer application, file sizes should increase significantly to make
    WDM networks with GMPLS control-plane protocols usable for file transfers.

    3.3 Conclusions

    In this chapter, we analyzed the call-blocking mode of operation to determine the types of
    applications suitable for GMPLS networks by dividing them into two categories: those for which
    the per-circuit capacity is independent of the holding time, and those for which these two
    variables are directly related, such as file transfers. We concluded the following for the
    first category. First, applications that require high-bandwidth circuits relative to the link
    capacity (e.g., where the ratio is one-tenth, say 1 Gb/s circuits on a 10 Gb/s link) are not
    suitable. Second, applications that require low-bandwidth circuits but have short holding
    times (on the order of a few minutes) require a high degree of aggregation, leading to
    expenses from large numbers of line cards. Ideal applications require on the order of
    one-hundredth the link capacity as per-circuit rates, and have long holding times. In the
    second category of applications, we found that the first conclusion for the first category
    still holds; however, the second does not because the number of line cards remains almost
    constant for m at a high value. In this category of applications, we also found that calls
    need to have very short call-holding times (on the order of seconds).

    Chapter 4

    WEB TRANSFER APPLICATION ON CHEETAH

    In this chapter, we describe our implementation of a software package, called WebFT, as an
    application for CHEETAH [16]. WebFT accomplishes web transfers across CHEETAH without changing
    existing web client and web server software by integrating the CHEETAH end-host software
    modules into Common Gateway Interface (CGI) and other external modules.

    The main reasons why we chose web transfers as a showcase for CHEETAH are three-fold. First,
    web-based applications have become ubiquitous [19] and there is significant interest in
    improving web performance. Although solutions such as web caching focus on the problems of
    overloaded web servers [9, 17], we focus on improving network performance. Second, according
    to the analysis of Chapter 3, the CHEETAH network can be operated at a low call-blocking
    probability and a high utilization if circuits are on the order of one-hundredth the shared
    link capacity, for example, 100 Mb/s on a 10 Gb/s link, and a circuit of 100 Mb/s is suitable
    for either many small web file transfers or a single bulk web transfer. Third, many new types
    of web-based applications, such as large-file downloads, high-quality video streaming, and
    remote visualization, require high-throughput, low-jitter, and deterministic data transfers.
    These applications need QoS-guaranteed network connectivity. The connectionless sharing mode
    of the current Internet is inadequate to provide such connectivity. We contend that the lack
    of rate-guaranteed network connectivity is hindering these web-based applications from being
    developed and deployed. An answer to this need lies in some of the newer networking
    technologies, for example, CO networking technologies, currently under development and
    deployment. CO networks, such as CHEETAH and DRAGON, allow for the reservation of bandwidth in
    the form of a dedicated circuit or VC through the networks prior to data transfer.

    This chapter determines how we can leverage these new CO technologies to improve the
    performance of web applications. We first describe the WebFT software design and
    implementation. Then, we show our experimental results and reason that WebFT can achieve
    low-variance, end-to-end transfer delays at different circuit rates and low transfer delays
    when high-speed circuits are possible.

    4.1 WebFT Design

    A primary goal of the WebFT software design is to provide deterministic data-transfer services
    to clients connected to a web server via the CHEETAH network. WebFT leverages the coexistence
    of two paths between a web client and a web server, that is, through the Internet and through
    the CHEETAH network. It allows clients that have network connectivity to the circuit-switched
    CHEETAH network to connect to the WebFT server and download web content (e.g., large files or
    streamed video) through dedicated end-to-end circuits, while simultaneously providing normal
    web access to other non-CHEETAH clients through the Internet. The dedicated nature of the
    circuits allows for user data to be streamed unhindered from a web server to a web client via
    the CHEETAH network. This results in low-variance transfer delays.

    Another goal of the WebFT software design is not to impose any special requirements with
    regard to the operating system or the web server or client software packages executed on the
    client and server hosts. We leverage the CGI technology to achieve this goal [32].

    4.1.1 WebFT Architecture

    The WebFT architecture is shown in Fig. 4.1. On the web server side, WebFT includes two CGI
    scripts, download.cgi and redirection.cgi, and a process called WebFT sender. Download.cgi is
    embedded into web pages as a hyperlink, with the name of the file to be served as a parameter.
    When the user clicks the download.cgi hyperlink on the web page through any typical web
    client, the web server receives an HTTP message causing download.cgi to be initiated.
    Download.cgi, in turn, initiates the WebFT sender process, which communicates with the WebFT
    receiver process on the client host to transfer the data from the server side to the client
    side. By leveraging the CGI technology, we avoid requiring any software upgrades to either web
    servers or web browsers.

    [Figure 4.1: WebFT architecture. Web server side: web server (e.g., Apache), CGI scripts
    (download.cgi and redirection.cgi), and the WebFT sender with the OCS, RD, RSVP-TE, and C-TCP
    APIs. Web client side: web browser (e.g., Mozilla) and the WebFT receiver with the RSVP-TE and
    C-TCP APIs. Daemons shown: OCS daemon, RD daemon, and RSVP-TE daemons. The browser and web
    server exchange a URL and a response; control messages travel via the Internet and data
    transfers via a circuit.]

    Integrated into the WebFT sender and receiver are libraries provided with the CHEETAH end-host
    software module described in Section 2.2. Through interaction with the CHEETAH end-host
    software modules, the WebFT sender determines whether to use the Internet path or attempt to
    set up a CHEETAH circuit, and if deemed appropriate, initiates the setup of a circuit. It then
    transfers the user data, and initiates the release of the circuit. If, for some reason, the
    user data cannot be transferred via the CHEETAH network (e.g., the client host is not
    connected to CHEETAH, the file size is too small, which makes it inefficient to use a circuit,
    or bandwidth is not available on the CHEETAH network), the WebFT sender process exits and
    redirection.cgi is invoked to transfer the file via the Internet.

    4.1.2 CGI Scripts

    CGI defines an approach for a web server to interact with external programs, which are often
    referred to as CGI programs or CGI scripts. Fig. 4.2 shows the flow of events while running
    CGI scripts. (This figure is adapted from Writing CGI Applications with Perl by Meltzer and
    Michalski [32].)

    [Figure 4.2: The flow of events from running CGI scripts. The WWW client sends an HTTP request
    to the HTTP web server, which uses CGI to run the CGI scripts (gateway programs) and returns
    an HTTP response to the client.]

    The WebFT package contains two CGI scripts developed in Perl5 on the server side: download.cgi
    and redirection.cgi. On receiving a request from a client, the web server invokes the
    download.cgi script with one input parameter, the requested file name. Download.cgi obtains
    the client's primary IP address by querying the REMOTE_ADDR environment variable. It then
    calls the WebFT sender process and passes the client's primary IP address and the requested
    file name to the WebFT sender process. If the WebFT sender returns indicating a failure to
    transfer the file over the CHEETAH network, download.cgi calls redirection.cgi to initiate a
    normal download of the file via the Internet.
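
    The decision logic of download.cgi can be sketched as follows. The thesis scripts are written
    in Perl5; this sketch uses Python purely for illustration. The sender path, its command-line
    arguments, and the convention that exit status 0 means the transfer used CHEETAH are all
    assumptions and do not describe the actual WebFT interface.

```python
import subprocess

SENDER_PATH = "/usr/local/bin/webft_sender"  # hypothetical install location


def run_sender(client_ip: str, file_name: str) -> bool:
    """Run the sender; exit status 0 is assumed to mean the file went over CHEETAH."""
    result = subprocess.run([SENDER_PATH, client_ip, file_name])
    return result.returncode == 0


def handle_download(environ: dict, file_name: str, sender=run_sender) -> str:
    """Return which path served the request: 'cheetah' or 'redirect'."""
    client_ip = environ["REMOTE_ADDR"]  # set by the web server for every CGI request
    if sender(client_ip, file_name):
        return "cheetah"
    # The sender reported a failure, so fall back to a normal Internet download,
    # which in WebFT is the job of redirection.cgi.
    return "redirect"
```

    Passing the sender as a parameter keeps the fallback decision separate from process
    invocation, so the logic can be exercised without a CHEETAH host.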

    4.1.3 The WebFT Sender

    The WebFT sender is integrated with APIs for the four basic CHEETAH end-host software modules.
    Thus, it interacts with the CHEETAH software daemons, including the OCS daemon, the RD daemon,
    and the RSVP-TE daemon, as shown in Fig. 4.1. The flowchart for the WebFT sender is shown in
    Fig. 4.3. Once the sender is initiated by the download.cgi script, it calls the OCS client
    module to determine whether the client host is reachable via the CHEETAH network. If the
    answer is yes, the OCS client module returns with the IP address and the MAC address of the
    client's secondary NIC (the one connected to the CHEETAH network).

    The WebFT sender then establishes a TCP connection through the host's primary NIC via the
    Internet to the WebFT receiver, which is running as a daemon on a well-known port in the
    client host. Once the TCP connection is successfully established, the receiver sends back a
    desired CHEETAH circuit rate (based on its receiving capability) and a C-TCP listening port
    number for the data transfer on the CHEETAH circuit.

    [Figure 4.3: The flow chart for the WebFT sender. Can the client be reached via the CHEETAH
    network (OCS)? If yes, request a CHEETAH circuit (RD). If yes, set up a circuit (RSVP-TE
    client). If the setup succeeds, send the file via C-TCP, release the circuit (RSVP-TE client),
    and return Success. A No or Fail outcome at any decision returns Failure.]

    Then, the WebFT sender process calls the RD module (passing the client host's primary IP
    address, secondary IP address, the client's desired circuit rate, and file size as arguments)
    to determine whether to attempt a CHEETAH circuit setup. The RD module chooses between the two
    options based on the loading conditions of the two networks (the Internet and the CHEETAH
    circuit-switched network), the round-trip delay time (RTT), and the file size. If it returns a
    decision to attempt a CHEETAH circuit setup, the WebFT sender process calls the RSVP-TE client
    module (passing the client's primary and secondary IP addresses and the circuit rate), asking
    it to initiate circuit setup.

    If the circuit setup is successful, the WebFT sender process calls the C-TCP send()
    subroutine, passing the following arguments: the circuit rate, the client's secondary IP
    address, the C-TCP port number on which the client is ready to accept an incoming C-TCP
    connection on the circuit, and the file name. The C-TCP send() subroutine opens a socket and
    connects to the client through the secondary NIC and the CHEETAH circuit. The file is
    transferred on the dedicated CHEETAH circuit at a rate equal to the circuit rate.

    Once the data transfer is completed, the WebFT sender process invokes the RSVP-TE client APIs
    to initiate release of the CHEETAH circuit. Finally, it returns a Success indication to the
    download.cgi script.

    If, during the above-mentioned procedure, the OCS client module determines that the client
    host does not have CHEETAH connectivity, or the RD module decides that it is better to use the
    Internet path, or the circuit setup initiated by the RSVP-TE client module fails, the WebFT
    sender process immediately returns a Failure indication to the download.cgi script. The
    download.cgi process then calls redirection.cgi to download the file via the Internet as
    mentioned in Section 4.1.2.
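
    The sender procedure described in this section can be summarized in a short control-flow
    sketch. The CHEETAH end-host APIs are represented by injected callables whose names and
    signatures are hypothetical; the real OCS, RD, RSVP-TE, and C-TCP interfaces are not shown in
    this chapter and may differ. For brevity, the OCS stand-in returns only the secondary IP
    address and omits the MAC address.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class EndHostModules:
    """Stand-ins for the CHEETAH end-host APIs; every signature here is assumed."""
    ocs_lookup: Callable[[str], Optional[str]]            # primary IP -> secondary IP or None
    query_receiver: Callable[[str], Tuple[float, int]]    # primary IP -> (rate, C-TCP port)
    rd_use_circuit: Callable[[str, str, float, int], bool]
    setup_circuit: Callable[[str, str, float], bool]
    ctcp_send: Callable[[float, str, int, str], None]
    release_circuit: Callable[[str, str], None]


def webft_send(mods: EndHostModules, client_ip: str, file_name: str, file_size: int) -> bool:
    """Return True (Success) if the file was sent over a circuit, else False (Failure)."""
    secondary_ip = mods.ocs_lookup(client_ip)
    if secondary_ip is None:                    # client is not reachable via CHEETAH
        return False
    rate, port = mods.query_receiver(client_ip)  # control exchange over the Internet path
    if not mods.rd_use_circuit(client_ip, secondary_ip, rate, file_size):
        return False                            # RD prefers the Internet path
    if not mods.setup_circuit(client_ip, secondary_ip, rate):
        return False                            # circuit setup failed
    try:
        mods.ctcp_send(rate, secondary_ip, port, file_name)
    finally:
        mods.release_circuit(client_ip, secondary_ip)  # release even if the send raises
    return True
```

    Each early return corresponds to one of the failure branches in the flow chart, after which
    download.cgi falls back to redirection.cgi.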

    4.1.4 The WebFT Receiver

    To avoid manual intervention, the WebFT receiver is designed to run as a daemon on a
    well-known port in the background on the client host and to process incoming connection
    requests from the WebFT sender automatically. The WebFT receiver is completely independent of
    web browser software, and therefore does not require any modification to the latter. All
    clients connected to the CHEETAH network are configured to run this daemon.

    The WebFT receiver forks a child process to handle each request for a TCP connection from the
    WebFT sender through the primary NIC. The forked WebFT receiver process then creates a TCP
    connection with the WebFT sender to accept the request and sends it a precomputed desired
    circuit rate. The circuit rate is typically computed based on the disk access rate of the
    client host because with today's technology, disk access rate is usually the bottleneck for
    file transfers. The forked WebFT receiver process also sends the listening C-TCP port number
    for the data transfer through the secondary NIC on the CHEETAH circuit.

    The WebFT receiver includes the API libraries associated with the RSVP-TE client and C-TCP
    modules of the CHEETAH end-host software. The RSVP-TE client module API library accepts
    circuit setup requests from the CHEETAH network and the C-TCP module API library accepts
    incoming C-TCP connection requests from the WebFT sender to transfer user data. After a data
    transfer is completed, the forked child process terminates and returns to the parent WebFT
    receiver process.
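
    A minimal sketch of the receiver's control exchange is given below. It covers only the
    step in which a forked child tells the sender its desired circuit rate and C-TCP listening
    port. The one-line text message format, the port numbers, and the rate value are assumptions
    made for illustration; the thesis does not specify the wire format, and the RSVP-TE and C-TCP
    handling is omitted. The forking server class requires a platform that supports fork, such as
    Linux.

```python
import socket
import socketserver

CONTROL_PORT = 7000        # hypothetical well-known port for the receiver daemon
DESIRED_RATE_MBPS = 100.0  # hypothetical precomputed rate (e.g., from disk access rate)
CTCP_PORT = 5001           # hypothetical C-TCP listening port


def send_transfer_parameters(conn: socket.socket, rate_mbps: float, ctcp_port: int) -> None:
    """Receiver side: send the desired rate and C-TCP port as one text line (assumed format)."""
    conn.sendall(f"{rate_mbps} {ctcp_port}\n".encode("ascii"))


def read_transfer_parameters(conn: socket.socket) -> tuple:
    """Sender side: parse the line written by send_transfer_parameters."""
    line = conn.makefile("r", encoding="ascii").readline()
    rate, port = line.split()
    return float(rate), int(port)


class ControlHandler(socketserver.BaseRequestHandler):
    """Runs in a forked child process, one per incoming sender connection."""

    def handle(self) -> None:
        send_transfer_parameters(self.request, DESIRED_RATE_MBPS, CTCP_PORT)


def serve_forever(port: int = CONTROL_PORT) -> None:
    """Run the daemon loop; ForkingTCPServer forks a child for each connection."""
    with socketserver.ForkingTCPServer(("", port), ControlHandler) as server:
        server.serve_forever()
```

    Forking per connection keeps each transfer isolated, so a slow or failed transfer does not
    block the parent from accepting the next request.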

    4.2 Experimental Testbed and Results

    The Linux implementation of WebFT described in the previous section has been tested on the

    CHEETAH experimental testbed. This section presents and discusses these