Master's Project Report HVNS RC3
8/9/2019 Master's Project Report HVNS RC3
Simulating the Performance of Data
Distribution and Retrieval Algorithms
with Hardware Varying Network Simulator
Master's Project
Alexander G. Maskovyak
Rochester Institute of Technology
Golisano College of Computing & Information Sciences
Department of Computer Science
102 Lomb Memorial Drive
Rochester, NY 14623-5608
May 2010
Chair: Hans-Peter Bischof
Reader: Axel Schreiner
Observer: TBA
Contents

Abstract
1. Introduction
   1.1. Motivation
   1.2. Distributed Data
   1.3. Simulation
   1.4. Current Simulators
2. Project Description
   2.1. Terminology
   2.2. Architecture
   2.3. Report Structure
3. Simulation
   3.1. Static versus Dynamic Simulation
   3.2. Deterministic versus Stochastic Simulation
   3.3. Continuous versus Discrete Simulation
   3.4. Analytical versus Agent-Based Simulation
4. Simulation Implementation
   4.1. Simulator
   4.2. Simulatables
   4.3. Operation Bound Simulatables
5. Network Model
   5.1. Node
   5.2. Connection Adaptor
   5.3. Connection Medium
   5.4. Protocol Stack
6. Hardware Model
   6.1. Hardware Computer Node
   6.2. Harddrive
   6.3. Cache
   6.4. Connection Adaptor
7. Distribution and Retrieval Algorithms
   7.1. Operation Model
   7.2. Implementation
   7.3. Client-Managed Distribution Algorithm
   7.4. Server-Managed Distribution Algorithm
8. Configuration
   8.1. Java API
   8.2. HVNSLanguage
   8.3. HVNSL Example Configuration
   8.4. Configuration Directory Structure
   8.5. HVNSL Grammar
9. Algorithm Benchmarking
   9.1. Metrics
   9.2. Logging Implementation
   9.3. Log File Format
   9.4. Generating Logs
10. Simulation Expectations
   10.1. Varying Adaptor Speed
   10.2. Varying Cache Size
   10.3. Varying Cache Speed
   10.4. Varying Server Quantity
   10.5. Varying Redundancy
11. Simulation Results
   11.1. Varying Adaptor Speed
   11.2. Varying Cache Size
   11.3. Varying Cache Speed
   11.4. Varying Server Quantity
   11.5. Varying Redundancy
12. Simulator Comparison
   12.1. ns-2
   12.2. JiST
   12.3. OMNeT++
13. Conclusions
14. Future Work
   14.1. Architecture
   14.2. HVNSL
   14.3. Algorithm Design
   14.4. Benchmarks
References
Computer Science is a science of abstraction: creating the right model for a problem and
devising the appropriate mechanizable techniques to solve it.
- A. Aho and J. Ullman
Abstract
Software-based network simulators are often an essential resource in networking research. Network
simulation affords researchers the ability to test communication protocols and topological designs where
this would otherwise be economically or physically infeasible. A heretofore unexplored avenue for
simulation is the interplay between the computer hardware of devices on a network and the impact this
hardware has on the performance of data distribution algorithms. This line of inquiry is becoming
increasingly important as the amount of data produced begins to exceed the capacity of local storage
and must instead be stored on remote servers and later retrieved.
Hardware Varying Network Simulator (HVNS) was designed to fill this niche. HVNS models how
hardware attributes like disk-read speed, communication link bandwidth, and cache size affect the
performance of one data distribution and retrieval algorithm relative to another. HVNS can demonstrate
when adding functionality like cache utilization or a faster connection adaptor yields no performance
gain because of the hardware properties of the network. HVNS can also be used to test purely theoretical
hardware and its effects, which makes it possible to discriminate between algorithms which may not
currently be viable but may become viable should certain hardware capabilities become available in the
future.
A cursory glance over network-related primary literature demonstrates that Network Simulator version
2 (ns-2) [1] is the de facto standard in discrete event-driven network simulation. It is widely used in
transport, network, and multicast protocol testing for both wired and wireless networks. However, ns-2
and similar simulators model the network at an inappropriate level of abstraction for this domain; for
this purpose they are comparatively slow, hard to learn, and difficult to extend. HVNS was designed to
be easy to use, extend, and configure, and to perform well for this domain. This project examines the
design of HVNS and its configuration language, and provides insight into the effects that hardware can
have on data distribution and retrieval.
Data expands to fill the space available for storage.
- Parkinson's Law of Data
1. Introduction

Data storage and access requirements in the present and near term are going to necessitate the
deployment of distributed hardware solutions and the use of data distribution and retrieval algorithms
in a wide variety of fields, notably scientific research. There is active ongoing research into the
development of these algorithms. Network simulators exist to test network protocols but are not
specifically designed to test the interaction between I/O algorithms and the computer hardware of the
devices in that network topology.
1.1. Motivation

The amount of data generated worldwide is increasing at an exponential rate [2]. Scientific research
using high-performance computers produces a considerable amount of data. The experiments run on
CERN's Large Hadron Collider are expected to generate 2 gigabytes of data every second, with a total
yearly production of 10 to 20 petabytes of information [3][4]. The Wide Field Infrared Camera (WFCAM)
of Cambridge's Astronomical Survey Unit captures approximately 2,200 images of the night sky, which
take up 230 gigabytes of space [5]. It is physically and economically infeasible with current technology
to store and back up this data on a single localized storage medium [6]. Instead, these data are stored
on multiple networked machines and retrieved in intelligent ways to mitigate the costs associated
with non-local access.
1.2. Distributed Data

Distributed file systems allow these data to be stored and accessed across multiple file servers as
though they were local resources. Distributing data incurs performance penalties: additional time is
spent on I/O over the network connection, both for sending the data to be stored initially and for
retrieving it later for an application's or client's use. A variety of techniques for increasing the
performance of file distribution and retrieval algorithms are being explored in research, including the
use of caching, data duplication, and data compression [7].
1.3. Simulation

Software-based network simulators are often an essential resource in networking research. Network
simulation affords researchers the ability to test communication protocols and topological designs where
this would otherwise be economically or physically infeasible. Simulation can similarly be applied to the
study of data distribution and access algorithms, where hardware properties can greatly impact the
performance of one algorithm over another. Simulation also provides an additional benefit aside from
circumventing resource limitations: theoretical hardware models can be constructed to test the impact
of technology that is on the horizon or in development [8].
1.4. Current Simulators

A wide variety of simulators have been deployed in professional and academic settings. Some popular
simulators include ns-2, JiST, and OMNeT++. Network Simulator version 2 (ns-2) is a discrete-event
network simulation environment and the de facto standard used in the field of network research to test,
design, and benchmark network protocols in wired environments. ns-2 offers support for simulating a
full network protocol stack, multicast protocols, and routing. The ns-2 configuration file requires
knowledge of OTcl, a scripting language designed to be embedded in applications [7]. Java in Simulation
Time (JiST) is a prototype for virtual-machine-based simulation in which a Java application's code is
intercepted and transformed into bytecode that contains simulation-time semantics [9]. It is efficient
and allows a developer to avoid the use of a domain-specific simulation language to define a simulation.
OMNeT++ is a discrete-event network simulator [10]. It is an open-source project which provides model
frameworks that can be used to develop domain-specific projects. It is designed for extensibility.
The hardest part of the software task is arriving at a complete and consistent specification, and
much of the essence of building a program is in fact the debugging of the specification.
- F. Brooks
2. Project Description

This project examines the interplay between remote storage-and-retrieval algorithms and the aspects
of the hardware and network topology on which they are deployed. Specifically, the question is under
what circumstances remote read requests can outperform local read requests. One way to answer this
question is via simulation. The approach is to design a custom simulator, model the applicable network
and computational domains, implement several distribution algorithms, and measure their performance
on a variety of network/computational configurations.
2.1. Terminology

This project deals with several computer science domains: simulation, networking, and hardware.

Computer simulation explores how an abstract model's state changes over the course of time. The
model is a representation of an object or grouping of objects as well as their behavior in response to a
variety of events. A simulator is responsible for performing a simulation and is generally responsible for
managing the flow of time and monitoring system state changes. A discrete scheduled-event simulator
manages time as a series of events in an event queue [11]. The simulator handles events scheduled to
occur at some point in time, which can trigger the scheduling of additional events further into the future.
Events are created by the model as a part of its behavior and cause the model to undergo state changes.

An agent-based simulation explicitly models individual entities, as opposed to an analytical model,
which operates with underlying mathematical equations or otherwise abstracts entities into numeric
properties. Agents respond to events based upon a set of internal rules or logic [12]. These responses
can include an alteration to their state and/or the scheduling of future events within the simulator. The
behavior of a state-based agent can be modeled by a Finite State Automaton. Finite State Automata (also
known as Finite State Machines) are graphical models which describe the states of an entity, the
events an entity receives, the behavior of an entity in response to events, and how an entity's state
changes in response to events when it occupies any particular state [13].
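The transition-table view of an FSM can be sketched in Java, the project's implementation language. The `SimpleFsm` class below, its state names, and its storeRequest/storeComplete events are hypothetical illustrations of the concept, not part of HVNS:

```java
import java.util.HashMap;
import java.util.Map;

// A minimal finite-state-machine sketch: an entity occupies one state,
// receives named events, and follows a transition table to its next state.
public class SimpleFsm {
    private String state;
    // Key "currentState/event" maps to the next state.
    private final Map<String, String> transitions = new HashMap<>();

    public SimpleFsm(String initial) { this.state = initial; }

    public void addTransition(String from, String event, String to) {
        transitions.put(from + "/" + event, to);
    }

    // Deliver an event; events with no matching transition leave the state unchanged.
    public String fire(String event) {
        state = transitions.getOrDefault(state + "/" + event, state);
        return state;
    }

    public static void main(String[] args) {
        // Hypothetical storage-server agent: IDLE -> STORING -> IDLE.
        SimpleFsm fsm = new SimpleFsm("IDLE");
        fsm.addTransition("IDLE", "storeRequest", "STORING");
        fsm.addTransition("STORING", "storeComplete", "IDLE");
        System.out.println(fsm.fire("storeRequest"));  // STORING
        System.out.println(fsm.fire("storeComplete")); // IDLE
    }
}
```

An agent built this way can additionally schedule future events inside `fire`, which is the behavior the simulator relies upon.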
A simulation run is the result of a simulator performing these functions on a model with some
initial-state conditions. Typically, the simulation run will produce some form of diagnostic data regarding
the behavior of the system. An interested party will run several simulations with varying initial conditions
and model parameters in order to observe patterns and trends, or to benchmark performance against
some variety of metrics.
Computer networking deals with computational devices which are connected via media to facilitate
communication and the sharing of computational resources like hardware, software, or data. Networks
range in size from the small local area network (LAN) of a family unit, with a handful of devices
connected to a wireless access point, to the Internet, which consists of hundreds of thousands of
networks of computers connected across the globe.
Software applications typically have two roles on a network: client or server. A client role indicates that
a piece of software is performing a request on a piece of software in the server role, which responds to
and services the request. Software communicates across the network via discrete messages which are
known as datagrams, packets, segments, or messages depending upon the level of abstraction being
used. A protocol is an agreed-upon dialogue that two software programs use and that governs their
behavior and interactions in response to messages. Here too, an FSM can be used to describe this
interaction. The protocol stack is a layered series of modular protocols which implements a suite of
computer networking protocols. Each layer of the stack handles one aspect of communication so that
the layer above it does not have to; the layers are said to provide a service to the layer above [14]. The
protocol stack has three basic layers (though there are commonly more, and groupings can differ), with
one protocol each handling media access, message transportation, and applications. The media access
layer handles transformation of information into a form that can be transported by the underlying
medium to which a device is connected. The transportation layer determines how messages find their
way to their intended destination as they pass through a series of devices. The application layer is where
most end-user-centric, desktop applications reside and is the layer of the distribution algorithms
developed for this project. The distribution algorithms make use of virtual hardware to store data and
send messages to one another.
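The layered service model above can be sketched as follows. The `StackSketch` class and the bracketed "headers" are illustrative stand-ins for real protocol framing, not HVNS code; only the three layer names come from the description above:

```java
import java.util.ArrayList;
import java.util.List;

// A minimal protocol-stack sketch: each layer wraps the payload in a
// header on the way down (send) and strips it on the way up (receive).
public class StackSketch {
    private final List<String> layers = new ArrayList<>();

    public StackSketch(String... layerNames) {
        for (String name : layerNames) layers.add(name);
    }

    // Wrap the payload with one header per layer, first-listed layer
    // innermost, so the application layer never sees framing details.
    public String send(String payload) {
        String frame = payload;
        for (String layer : layers) frame = layer + "[" + frame + "]";
        return frame;
    }

    // Strip headers outermost-first, handing the inner frame upward.
    public String receive(String frame) {
        for (int i = layers.size() - 1; i >= 0; i--) {
            String prefix = layers.get(i) + "[";
            if (frame.startsWith(prefix) && frame.endsWith("]")) {
                frame = frame.substring(prefix.length(), frame.length() - 1);
            }
        }
        return frame;
    }

    public static void main(String[] args) {
        StackSketch stack = new StackSketch("application", "transport", "mediaAccess");
        String frame = stack.send("data");
        System.out.println(frame);                // mediaAccess[transport[application[data]]]
        System.out.println(stack.receive(frame)); // data
    }
}
```

The round trip `receive(send(x)) == x` is the "service to the layer above" property in miniature: each layer undoes exactly the work its peer performed.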
The hardware model includes both data storage and communication hardware. A harddrive has large,
long-term storage but generally has slow access times. A cache is much smaller, short-term storage but
has comparatively much faster access times. Caches are used to speed up data retrieval: they store
frequently accessed information so that the fast speed of the cache can be leveraged and the slow
access of the harddrive avoided. When requests are uniform or predictable, the cache can further be
used to store information that has not yet been requested but which may be requested at some point in
the future. Due to the small size of a cache, it is important to make intelligent decisions about what is
stored there; otherwise no speedup over harddrive access can be realized. A connection adaptor is the
gateway to the medium connecting computational devices to one another. Network infrastructure
speeds are rapidly increasing and, in some cases, are much faster than the access times for local storage
[15]. This last observation is the impetus behind this project.
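The cache trade-off described above can be sketched with a least-recently-used eviction policy, one common way to decide what stays in a small cache. The `LruCache` class below is an illustrative assumption for this report, not the cache model HVNS itself implements:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal least-recently-used cache sketch: a fixed-capacity map that
// evicts the least recently accessed entry when a new entry overflows it.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true yields LRU iteration order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict once capacity is exceeded
    }

    public static void main(String[] args) {
        LruCache<String, byte[]> cache = new LruCache<>(2);
        cache.put("blockA", new byte[64]);
        cache.put("blockB", new byte[64]);
        cache.get("blockA");               // touch A, so B becomes eldest
        cache.put("blockC", new byte[64]); // evicts blockB
        System.out.println(cache.containsKey("blockB")); // false
    }
}
```

Every `containsKey` hit here stands in for a fast cache read; every miss would fall through to the slow harddrive, which is exactly the penalty an intelligent placement policy tries to avoid.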
2.2. Architecture

Hardware Varying Network Simulator, or HVNS, is the result of the approach described above. HVNS is
an agent-based, discrete-event simulator. The simulator runs in its own thread with a thread-safe event
priority queue. Simulatables are entities which schedule events with the simulator and handle messages
contained within an event from a sender.
Events are containers that hold a message, an intended receiver, and a time of delivery. Events are held
in the simulator queue ordered first by time and second by a priority flag. The simulator pops events
from the top of its queue, updates its time to match the event's time, and then delivers the message to
the intended simulatable recipient. The simulatable then performs some operations in response to this
event, which may result in the recipient scheduling additional events with the simulator. Events can be
scheduled by any simulatable for any simulatable. Messages from a simulatable to itself are given the
highest priority and are considered control messages that have no transit cost. They are intended to
allow a simulatable to schedule the alteration of its state for a future time and to bootstrap itself for
future work.
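The queue discipline just described (pop the earliest event, advance the clock to its time, deliver the message) can be sketched as follows. The `EventLoopSketch` and `Event` classes are simplified illustrations, not HVNS's actual simulator API; a `StringBuilder` stands in for a simulatable receiver:

```java
import java.util.PriorityQueue;

// A minimal discrete-event loop sketch: events are ordered first by
// delivery time, then by a priority flag (lower value delivered earlier).
public class EventLoopSketch {
    static class Event implements Comparable<Event> {
        final double time;
        final int priority;           // 0 = highest, e.g. control messages
        final String message;
        final StringBuilder receiver; // stand-in for a simulatable

        Event(double time, int priority, String message, StringBuilder receiver) {
            this.time = time; this.priority = priority;
            this.message = message; this.receiver = receiver;
        }

        public int compareTo(Event other) {
            int byTime = Double.compare(time, other.time);
            return byTime != 0 ? byTime : Integer.compare(priority, other.priority);
        }
    }

    private final PriorityQueue<Event> queue = new PriorityQueue<>();
    private double now = 0.0;

    public void schedule(Event e) { queue.add(e); }

    // Pop events in order, advancing the simulation clock as we go.
    public double run() {
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            now = e.time;                             // advance clock to event time
            e.receiver.append(e.message).append(';'); // "deliver" the message
        }
        return now; // final simulation time
    }
}
```

In a full simulator the delivery step would invoke the recipient's handler, which could in turn call `schedule` to add future events before the loop continues.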
The network model consists of nodes directly connected with some set of neighbors via an Ethernet-like
connection medium. Nodes exchange messages as packets over this medium via next-hop routing. The
computational model views nodes as computers, each of which contains a number of components
which can store and retrieve varying amounts of data at varying rates.
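Next-hop routing can be sketched with per-node forwarding tables: each node knows only which neighbor to hand a packet to, and the packet hops node by node. The `NextHopSketch` class and the node names are hypothetical, not taken from HVNS:

```java
import java.util.HashMap;
import java.util.Map;

// Next-hop routing sketch: routes.get(node).get(destination) gives the
// neighbor that node forwards to; a packet hops until it arrives.
public class NextHopSketch {
    private final Map<String, Map<String, String>> routes = new HashMap<>();

    public void addRoute(String node, String destination, String nextHop) {
        routes.computeIfAbsent(node, n -> new HashMap<>()).put(destination, nextHop);
    }

    // Forward a packet from source toward destination, recording the path.
    public String path(String source, String destination) {
        StringBuilder path = new StringBuilder(source);
        String current = source;
        while (!current.equals(destination)) {
            current = routes.get(current).get(destination); // consult the table
            path.append("->").append(current);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        NextHopSketch net = new NextHopSketch();
        net.addRoute("A", "C", "B"); // A reaches C via its neighbor B
        net.addRoute("B", "C", "C"); // B is directly connected to C
        System.out.println(net.path("A", "C")); // A->B->C
    }
}
```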
Distribution algorithms are the primary agents in this model and make use of the computational
facilities to request that data be distributed, stored, and retrieved. One algorithm on one node is
designated as the client. This client is told to execute, which causes it to schedule a bootstrapping
event for itself in the queue. The client will then proceed to request a specified amount of data from its
harddrive and send this information out to select devices on the network for storage. The interval of
time between the first harddrive request and the last harddrive response is considered the baseline
local read-time. Server nodes will begin to store, replicate, and propagate this information across
algorithmically selected devices. Once distributed storage is complete, a message is sent back to the
client node. The client node can then begin requesting the data back from the server nodes. The
interval of time between the first remote request and the reception of the last requested piece of data
is considered the experimental remote read-time. Smaller time intervals indicate better performance.
Ideally, the confluence of the algorithm's design and the attributes of the available hardware will allow
the algorithm's remote read-time to outperform the local read-time.
The simulator environment and the model are configured through a domain-specific language called
HVNSLanguage (HVNSL). This configuration language has relatively few keywords and a fairly
predictable, uniform syntax across the entire configuration space. Because the language is
domain-specific, it can focus on simplifying the configuration of hardware components and the creation
of network topologies.
2.3. Report Structure

This report begins with an examination of simulation theory, with an emphasis placed on the
techniques employed by HVNS. It explores the architecture and interfaces of HVNS, the design decisions
involved, and the benefits and drawbacks of this approach.

The models under simulation are then explored in the same way. The design and abstractions employed
for the network, its protocol stack, and its routing infrastructure are discussed. This is followed by an
examination of the architecture and abstraction behind the computer model, the hardware employed,
and the hooks included for the distribution algorithms. The report next explores the distribution
algorithms themselves, their operation, and the design intent.
HVNSLanguage (HVNSL), the simulator and model configuration language, is then discussed, including
its syntax and its semantics.
The benchmarking apparatus, measurements, and expectations are detailed, and the results of the
measurements are then described. The report analyzes the results in comparison to expectations.
Further analysis compares the simulator architecture itself to competing simulators employed in the
field, including a discussion of the benefits and drawbacks of the various approaches in relation to
HVNS.

The report concludes with an evaluation of the project as a whole. Here the success of HVNS, the
algorithms, and the benchmarking are discussed. Suggestions for future work are then given, including
potential improvements to the simulator, model, configuration language, and testing apparatus. The
lessons learned from the approach taken to this project's development are also included as guidance
for future work.
To dissimulate is to feign not to have what one has. To simulate is to feign to have what one
hasn't. One implies a presence, the other an absence.
- Jean Baudrillard
3. Simulation

In the broadest sense, simulation is the fabrication or imitation of something which is real. An
imitation reproduces some subset of the defining characteristics of the real item. These characteristics
may be behavioral, physical, or intangible. In Computer Science, the preceding abstraction is known as a
model, and the imitated properties are considered the model's state. Computer simulation is generally
concerned with utilizing inputs in tandem with this model to effect change in the model's state over
time (the set of changes may contain as few as one change). How this occurs depends upon the
classification and specific implementation of the model.

Computer simulations can be classified into several types based upon aspects of the simulator or the
model upon which they are being performed.
3.1. Static versus Dynamic Simulation

Simulators possess either static or dynamic models. Static models produce a single solution for a
simulation run. Dynamic systems have models which can assume several states over the course of a
simulation run. Static models are useful for analyzing the relationships between sets of input and
output variables. Dynamic systems are useful for analyzing how a system gets from some arbitrary start
condition to an arbitrary ending condition inside the span of time simulated [16].
3.2. Deterministic versus Stochastic Simulation

Simulators are either deterministic or stochastic. Deterministic simulators will evolve systems in an
identical way across all simulation runs when given the same input conditions. Stochastic simulators will
evolve systems with some degree of variance across simulation runs even when given the same input
conditions. Stochastic simulators rely upon pseudo-random number generators to introduce
randomness into a run [17].
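A useful consequence of using a pseudo-random generator is that a stochastic run can still be reproduced exactly by fixing the generator's seed. The sketch below (in Java, the project's implementation language; the seed values are arbitrary) shows that equal seeds yield identical sequences of draws:

```java
import java.util.Arrays;
import java.util.Random;

// Seeded pseudo-random generation: two generators created with the same
// seed produce identical sequences, so a stochastic simulation run can
// be replayed exactly for debugging or benchmarking.
public class SeededRuns {
    public static long[] draw(long seed, int count) {
        Random rng = new Random(seed);
        long[] values = new long[count];
        for (int i = 0; i < count; i++) values[i] = rng.nextLong();
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(draw(42L, 5), draw(42L, 5))); // true: same seed
        System.out.println(Arrays.equals(draw(42L, 5), draw(43L, 5))); // false: different seed
    }
}
```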
3.3. Continuous versus Discrete Simulation

Simulators are either continuous or discrete. Continuous simulators have models with explicit state
variables whose values are governed by differential-algebraic equations or differential equations.
Periodically (i.e., at some fixed time interval), the simulator will alter its state by solving the equations
to produce values for state assignment. Discrete simulators can alter their variables at only a fixed
number of points in time. Discrete-event simulators are an important subset of discrete simulators;
they operate on a succession of events whose occurrence moves time forward [17].
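As an illustration of the fixed-interval approach, the sketch below advances a single state variable governed by the differential equation dx/dt = -kx using Euler's method. The equation, step size, and class name are arbitrary choices for the example, not part of HVNS:

```java
// Continuous-simulation sketch: the state variable x obeys dx/dt = -k*x,
// and the simulator advances it at a fixed time step dt (Euler's method).
public class ContinuousSketch {
    public static double simulate(double x0, double k, double dt, int steps) {
        double x = x0;
        for (int i = 0; i < steps; i++) {
            x += dt * (-k * x); // evaluate dx/dt at the current state, step forward
        }
        return x;
    }

    public static void main(String[] args) {
        // Exact solution is x(t) = x0 * e^(-k*t); with x0 = 1, k = 1 the
        // value at t = 1 is about 0.3679, which the fixed-step run approaches.
        System.out.println(simulate(1.0, 1.0, 0.001, 1000));
    }
}
```

A discrete-event simulator, by contrast, would skip directly between the instants at which something happens rather than grinding through every fixed interval.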
3.4. Analytical versus Agent-Based Simulation

Simulators have either analytical or agent-based models. Analytical models rely upon a collection of
rules, equations, or functions to determine how the state of the system as a whole is advanced. In
contrast, an agent-based system involves some number of autonomous rule-based entities whose
actions and interactions affect the system. These agents respond to events that occur in the system,
which may alter their internal state and cause them to schedule additional events. The state of the
system comprises the aggregation of the states of all agents and any state variables external to
them [17].
A good simulation, be it a religious myth or scientific theory, gives us a sense of mastery over
experience. To present something symbolically, as we do when we speak or write, is somehow to
capture it, thus making it one's own. But with this appropriation comes the realization that we
have denied the immediacy of reality and that in creating a substitute we have but spun another
thread in the web of our grand illusion.
- Heinz R. Pagels
4. Simulation Implementation
HVNS runs on the JVM and uses Java as its implementation language for the simulator and all
simulatable constructs [18][19]. Java has strong object-oriented properties and an expansive standard
library of classes implementing many design patterns that HVNS leverages (e.g. interfaces, observers,
threads). HVNS makes extensive use of design patterns and object-oriented principles to keep it
extensible and modular. The architecture is designed around interfaces, abstract base classes,
instantiation external to constructors, and factory patterns.
4.1. Simulator
HVNS is a dynamic, deterministic, discrete event-scheduling, agent-based simulator. It is a dynamic
simulator whose model experiences a range of states before the simulation concludes. It is a
deterministic simulator in that it does not introduce randomness into any aspect of its event scheduling
or delivery: the simulator itself neither creates random events nor alters the scheduling of events by
simulatables. It is a discrete event-scheduling simulator which maintains an event queue and execution
threads. Finally, it is an agent-based simulator as it relies upon the goal-based interactions of its
registered simulatables to cause the evolution of its system state.
The simulator and simulation environment are distinct from the network/computational model
employed. This means that HVNS can be made to emulate aspects of the alternative simulation classes
described in Chapter 3 through alterations of the model (HVNS itself can also be sub-classed to produce
these effects). As an example, a discrete simulator can be modeled as a single simulatable which
schedules a "do work" message for itself at a fixed time interval. An analytical approach can be created
by having the aforementioned simulatable maintain the equations and state variables which it
recalculates with every "do work" message. As another example, non-determinism can be introduced
into the simulation via the model being simulated. Simulatables can employ random number generators
in a way that affects if and when they schedule an event, or the way in which they determine the
recipient of an event's message. Alternatively, simulatables could be wrapped inside a parent
simulatable made to intercept events, modify when and if they are to be delivered, and schedule a
series of events to occur at random points in the future.
The simulator runs in its own thread and possesses a thread-safe priority-queue for events. There are
three major portions of the simulator that are discussed here: simulator startup, scheduling events, and
the main event loop.
The startup procedure for HVNS operation and for discrete event simulators in general is depicted in
Figure 4-1:
Simulator Startup

Set ENDING_CONDITION to FALSE
Initialize event queue
Initialize state variables / register agents
Start event thread
Schedule bootstrap event(s)

FIGURE 4-1. PSEUDO-CODE FOR THE SIMULATOR START-UP SEQUENCE.
Simulator startup consists of a few tasks. The ending condition is set to false for the simulator. This is
an important step; otherwise the main event loop would never execute. The ending condition is
implementation specific and largely depends upon the model under consideration. The ending
condition may be to stop at some specific time, when a particular state variable or a derivative of that
variable reaches a value, or when a certain number of events have been processed. The ending
condition must eventually be able to be set to true; otherwise the simulation may proceed indefinitely.
The ending condition for HVNS is dependent upon a "stopped" variable. HVNS depends upon the
model's configuration to tell it when to stop, as otherwise it will wait indefinitely for new events to be
introduced into the queue. The event queue is next initialized so that it is ready to handle the
scheduling of events. The model's state is set, which can include the initialization of state values and/or
the creation and registration of agents (i.e. simulatables). The main event thread is then initialized,
which begins the main event loop where event execution and message delivery occur. Bootstrap
events are then scheduled, allowing the model to evolve.
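The start-up sequence of Figure 4-1 might be sketched as follows. This is an illustrative stand-in, not the actual HVNS source: `Simulator`, `SimEvent`, and the method names are invented for the example.

```java
import java.util.PriorityQueue;

public class Simulator {
    static class SimEvent implements Comparable<SimEvent> {
        final double time;
        SimEvent(double time) { this.time = time; }
        public int compareTo(SimEvent o) { return Double.compare(time, o.time); }
    }

    private final PriorityQueue<SimEvent> queue = new PriorityQueue<>();
    private volatile boolean endingCondition;
    private double now;

    // The start-up sequence of Figure 4-1, step for step.
    public void start() {
        endingCondition = false;              // otherwise the loop would never run
        queue.clear();                        // initialize the event queue
        now = 0;                              // initialize state variables
        // ... register agents (simulatables) here ...
        new Thread(this::eventLoop).start();  // start the event thread
        schedule(new SimEvent(0));            // schedule bootstrap event(s)
    }

    public synchronized void schedule(SimEvent e) { queue.add(e); notifyAll(); }
    public synchronized int pending() { return queue.size(); }

    private void eventLoop() { /* main event loop, Figure 4-3 */ }

    public static void main(String[] args) {
        Simulator s = new Simulator();
        s.start();
        System.out.println(s.pending());  // one bootstrap event is queued
    }
}
```

The bootstrap event scheduled in the last step is what gives the otherwise passive model its first opportunity to act.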
Events and their execution are the main concern of the simulator. Events move the simulation's time
forward. The chronological sequence of events represents the evolution of the model's state over a
simulation run. A simulator without scheduled events cannot evolve because the agents it contains are
passive entities that can only act upon the execution of an event/reception of a message. The bootstrap
events kick off the simulatables to begin reacting to events.
Events are the primary form of communication between a simulatable and the simulator. Discrete
event simulators use events to cause alterations to the model at a specific time. Agent-based simulators
like HVNS use scheduled events to pass messages between agents. The basic scheduling algorithm for
events is shown in Figure 4-2.
Event Scheduling Algorithm

if( event time >= simulator time )
    Add event to queue
    Signal scheduled event

FIGURE 4-2. PSEUDO-CODE FOR EVENT SCHEDULING.
Simulatables call the scheduling function with their event. The simulator will add the event to the queue
so long as the event occurs in the future and not the past; otherwise non-causal events could occur (i.e.
events which depend not just upon past events but also future events). The event queue is a priority
queue which sorts events based upon an event comparator. Generally, events are sorted first in
ascending time-order, and second in descending priority-order. This ensures that events occur in a
chronologically ascending sequence allowing causality to be maintained. The priority-order allows
simulatables to send control messages to themselves that will always be received prior to external
messages. The main event thread is then signaled if it is waiting on an empty queue.
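Assuming events carry a time and a priority field, the causality check and the two-level ordering just described might be sketched in Java as follows. `Event`, `INTERNAL`, and `EXTERNAL` are illustrative names, not the actual HVNS types.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class EventScheduling {
    static final int EXTERNAL = 0, INTERNAL = 1;

    static class Event {
        final double time;
        final int priority;
        Event(double time, int priority) { this.time = time; this.priority = priority; }
    }

    // Sort ascending by time first, then descending by priority, so an
    // INTERNAL control message at time t is delivered before EXTERNAL
    // messages scheduled for the same time.
    static final Comparator<Event> ORDER =
            Comparator.<Event>comparingDouble(e -> e.time)
                      .thenComparing(Comparator.<Event>comparingInt(e -> e.priority).reversed());

    private final PriorityQueue<Event> queue = new PriorityQueue<>(ORDER);
    private double now = 0;

    // Reject events scheduled for the past: they would violate causality.
    public boolean schedule(Event e) {
        if (e.time < now) return false;
        queue.add(e);
        // (signal the main event thread here if it is waiting on an empty queue)
        return true;
    }

    public Event next() { return queue.poll(); }

    public static void main(String[] args) {
        EventScheduling s = new EventScheduling();
        s.schedule(new Event(5, EXTERNAL));
        s.schedule(new Event(5, INTERNAL));
        System.out.println(s.next().priority == INTERNAL);  // prints "true"
    }
}
```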
Main Event Loop

while( ENDING_CONDITION is FALSE )
    while( queue is empty ) { wait for scheduled event }
    Remove event from queue
    Update simulator time to event time
    Execute event (deliver message to recipient)

FIGURE 4-3. PSEUDO-CODE FOR THE SIMULATOR'S MAIN EVENT HANDLING LOOP.
The main event loop, depicted in Figure 4-3, handles the execution of scheduled events and the delivery
of messages. It runs until some ending condition is met, as discussed above. The event loop avoids busy
waiting by waiting on a condition variable when the queue is empty. The event loop wakes upon being
signaled by the scheduling function that a new event has been added to the queue.
An event communicates three important pieces of information to HVNS: the time of delivery, the
message, and the intended recipient of the message. HVNS removes the head element from the queue,
updates its time to match the event's time, and then executes the event by delivering the event's
message to the specified simulatable recipient.
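A single-class sketch of the loop in Figure 4-3, with wait/notify standing in for the condition variable, might look like the following. The names are illustrative; HVNS's real loop runs in its own thread and uses its own event and message types.

```java
import java.util.PriorityQueue;

public class EventLoop {
    interface Recipient { void receive(Object message); }

    static class Event implements Comparable<Event> {
        final double time; final Object message; final Recipient recipient;
        Event(double t, Object m, Recipient r) { time = t; message = m; recipient = r; }
        public int compareTo(Event o) { return Double.compare(time, o.time); }
    }

    private final PriorityQueue<Event> queue = new PriorityQueue<>();
    private double now = 0;
    private boolean stopped = false;

    public synchronized void schedule(Event e) { queue.add(e); notifyAll(); }
    public synchronized void stop() { stopped = true; notifyAll(); }
    public double time() { return now; }

    public synchronized void run() {
        while (!stopped) {                            // ending condition
            while (queue.isEmpty() && !stopped) {     // avoid busy waiting
                try { wait(); } catch (InterruptedException ie) { return; }
            }
            if (stopped) break;
            Event e = queue.poll();                   // remove head event
            now = e.time;                             // advance simulator time
            e.recipient.receive(e.message);           // deliver the message
        }
    }

    public static void main(String[] args) {
        EventLoop loop = new EventLoop();
        StringBuilder log = new StringBuilder();
        loop.schedule(new Event(1, "a", m -> log.append(m)));
        loop.schedule(new Event(2, "b", m -> { log.append(m); loop.stop(); }));
        loop.run();
        System.out.println(log + " @t=" + loop.time());  // prints "ab @t=2.0"
    }
}
```

Note how simulator time jumps directly from one event time to the next; no time passes between events.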
4.2. Simulatables
Simulatables, in aggregate, represent the simulator's model. They are stateful entities (even if only
possessing a single state) which are capable of handling messages delivered to them as the result of an
event occurring. Simulatables are also agents which implement entity-specific logic dictating how
they respond to messages. This response can include a state change, the scheduling of additional
events (and messages to other simulatables), the generation and registration of additional simulatables,
etcetera.
Simulatables are passive entities that live inside the main event loop's thread. They can only ever act if
an event with a message for them is scheduled and subsequently executed. Simulatables use events to
communicate with the simulator. Events are containers for messages. A message would otherwise be a
method call between one simulatable and another. Events allow such a call to be affected by
simulation-imposed limits on operations and the passage of time.
Simulatables receive messages which are labeled as implementing the IMessage interface. The content
of a message is completely dependent upon the implementation and expectations of the particular
simulatable subclass which is receiving the message. Messages are expected to contain information that
allows a simulatable to act in an event-appropriate fashion.
It is useful to use the abstraction that messages are time-delayed method calls. As such, the message
type flag or the message class itself can be used to indicate the method which is to be called. The fields
and/or message accessors provide the required parameters for this method call. If a message contains a
type flag, then it is generally heavyweight as it must provide parameters (even if null or empty) for
every method type. If the class of the message itself indicates the method that is to be called then the
message need only provide the parameters appropriate for that single method. This comes at the cost
of an expensive class type check. Systems with heterogeneous simulatables may implement different
interfaces. To keep messages compact and easy to understand, it is recommended that different
message types be employed for each of these interfaces.
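The class-per-method-call style described above might look like this sketch. `IMessage` is the interface named in the text; the concrete message types and the handler are invented for illustration.

```java
public class MessageDispatch {
    interface IMessage { }

    // Each message class stands in for one method call and carries only
    // that call's parameters.
    static class StoreMessage implements IMessage {
        final int index; final String data;
        StoreMessage(int index, String data) { this.index = index; this.data = data; }
    }
    static class FetchMessage implements IMessage {
        final int index;
        FetchMessage(int index) { this.index = index; }
    }

    private final java.util.Map<Integer, String> store = new java.util.HashMap<>();
    String lastFetched;

    // The class type check selects the behavior: this is the "expensive
    // class type check" mentioned above, traded for compact messages.
    public void handle(IMessage m) {
        if (m instanceof StoreMessage) {
            StoreMessage s = (StoreMessage) m;
            store.put(s.index, s.data);
        } else if (m instanceof FetchMessage) {
            lastFetched = store.get(((FetchMessage) m).index);
        }
    }

    public static void main(String[] args) {
        MessageDispatch d = new MessageDispatch();
        d.handle(new StoreMessage(3, "payload"));
        d.handle(new FetchMessage(3));
        System.out.println(d.lastFetched);  // prints "payload"
    }
}
```

A type-flag design would instead use one message class with a flag field plus parameters for every possible call, which is why the text calls it heavyweight.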
Simulatables, as passive entities, must receive messages in order to act. The bootstrap events
scheduled at the start of a simulation run allow one or more simulatables to perform operations at
simulation onset. Typically, though ultimately dependent on the model, this starts a cascade of event
scheduling which moves the model between states and forward in time. Simulatables, too, can
bootstrap themselves or other simulatables. A simulatable accomplishes this by scheduling a control
message to itself to be received at some point in the future. The event containing this message has its
priority field set to INTERNAL, which is higher than the default EXTERNAL priority and thereby
ensures that it is delivered prior to any external events.
Figure 4-4 depicts an example of this mechanism with a simulatable that wishes to accomplish some w
amount of work in work-state[n] followed by some w' amount of work in work-state[n+1]. The control
messages allow this simulatable to do work in a state until all work is exhausted; it can then
schedule a state change and schedule additional work messages to be completed in the next state.
-
8/9/2019 Master's Project Report HVNS RC3
20/77
P a g e | 20
States: Work[1], Work[2], ..., Work[n]

[receive bootstrap: schedule doWork[1]]
[receive doWork[i] && hasWork[i]: doWork[i]; send doWork[i]]
[receive doWork[i] && !hasWork[i]: send doWork[i+1]]

FIGURE 4-4. FSM OF A SIMULATABLE BOOTSTRAPPING ITS WORK AND STATE CHANGES.
4.3. Operation Bound Simulatables
An operation bound or performance restricted simulatable is a stateful entity with limitations placed on
its ability to respond to, or schedule, events during an interval of time. An operation bound
simulatable's operation is represented by the FSM depicted in Figure 4-5.
States: Fully-Awake, Partially-Awake, Blocked

[Fully-Awake: Get Request: Delegate Response; ops--; refresh_sent_t = t;
 Send(Renewal, t + refresh_sent_t, internal)]
[Partially-Awake: Get Request && ops > 0: Delegate Response; ops--]
[Partially-Awake: Get Request && ops = 0: Send Response; ops--; go to Blocked]
[Blocked: Get Request: Send(request, t + refresh_sent_t, external)]
[Get Refresh: ops = max_ops; return to Fully-Awake]

FIGURE 4-5. THE OPERATION BOUND SIMULATABLE FSM DEMONSTRATES A NODE THAT HAS A LIMITED CAPACITY TO ACT AND SO PUTS OFF
ADDITIONAL WORK FOR A FUTURE TIME WHEN IT WILL HAVE THE CAPACITY TO ACT AGAIN.
-
8/9/2019 Master's Project Report HVNS RC3
21/77
P a g e | 21
Operation bound simulatables possess three basic states which govern their ability to act. These states
include:

- Fully-Awake: the simulatable is able to perform up to its maximum number of operations during a
  time interval.
- Partially-Awake: the simulatable is able to perform some reduced number of operations during
  the current time interval.
- Blocked: the simulatable has exhausted its ability to perform any additional operations during
  the current time interval.
Operation bound simulatables begin in the Fully-Awake state. Fielding a request and/or sending a
response in this state causes the simulatable's allowed operations to be decreased by some amount, a
refresh message to be scheduled, and a transition to the Partially-Awake state.
The simulatable may continue to perform operations while in the Partially-Awake state, with each
operation performed resulting in additional decreases to the simulatable's ability to field further
requests. Once all operational ability has been exhausted, the simulatable enters the Blocked state.
The Blocked state indicates that no operations can be performed. Any requests received in this state
are rescheduled by the simulatable to reoccur during the start of the next time interval. This ensures
that the events can be handled once the simulatable is again able to do so.
The refresh message is contained in an event which is scheduled by and for the simulatable. It is
scheduled to occur at the beginning of the next operation interval, which is governed by the
simulatable's refresh_time variable. The refresh message is sent with the highest priority possible,
ensuring that it is received before any other events can be received by the simulatable. Reception of the
refresh message indicates the start of the next operation interval and that the simulatable may
return to the Fully-Awake state with the ability to perform the maximum number of allowed operations.
If the simulatable has been in the Blocked state and has rescheduled events to occur for this time
interval, these events will be redelivered by the simulator and can be handled normally.
The operation bound simulatable abstracts out performance restriction for subclasses by implementing
a function hook which is called during the Fully-Awake and Partially-Awake states. Subclasses override
this hook method so that they can be the delegate of model-specific functionality. The parent class
takes care of determining if and when delegation can occur. An FSM representing this is depicted in
Figure 4-6.
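The parent-class bookkeeping and the hook might be sketched as follows. The class, field, and method names are invented, and the event plumbing (scheduling the refresh/renewal event, rescheduling blocked requests) is reduced to comments.

```java
public abstract class OperationBound {
    enum State { FULLY_AWAKE, PARTIALLY_AWAKE, BLOCKED }

    private final int maxOps;
    private int ops;
    State state = State.FULLY_AWAKE;

    protected OperationBound(int maxOps) { this.maxOps = maxOps; this.ops = maxOps; }

    // Hook method: subclasses supply the model-specific work.
    protected abstract void delegate(Object request);

    // Returns false when Blocked; the caller would then reschedule the
    // request for the start of the next operation interval.
    public boolean handleRequest(Object request) {
        if (state == State.BLOCKED) return false;
        delegate(request);
        ops--;
        state = (ops > 0) ? State.PARTIALLY_AWAKE : State.BLOCKED;
        // (on the first operation of an interval, the Renewal/refresh
        //  event would be scheduled here with internal priority)
        return true;
    }

    // Reception of the refresh message restores the full ops budget.
    public void refresh() { ops = maxOps; state = State.FULLY_AWAKE; }

    public static void main(String[] args) {
        final int[] served = {0};
        OperationBound node = new OperationBound(2) {
            protected void delegate(Object request) { served[0]++; }
        };
        System.out.println(node.handleRequest("a") && node.handleRequest("b")); // prints "true"
        System.out.println(node.handleRequest("c"));                            // prints "false"
        node.refresh();
        System.out.println(node.handleRequest("c") && served[0] == 3);          // prints "true"
    }
}
```

The subclass never checks the budget itself; the parent decides if and when delegation can occur, which is exactly the division of labor described above.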
-
8/9/2019 Master's Project Report HVNS RC3
22/77
P a g e | 22
[The delegated FSM mirrors Figure 4-5, with the overridden hook method performing the
model-specific work in the Fully-Awake and Partially-Awake transitions.]

FIGURE 4-6. SUBCLASSED OPERATION BOUND SIMULATABLE OVERRIDING HOOK METHOD.
An alternative approach that was explored involved the use of a separate message queue on all
simulatables. Simulatables would control their ability to poll the message queue by sending themselves
a refresh message at the start of their polling operations. A simulatable would return to a blocked state
once it had handled as many messages as allowed by its configuration. The simulatable would
subsequently wake up upon reception of the refresh message and start the entire process over again.
This approach was abandoned since it requires the duplication of existing functionality: the simulator
already possesses a priority queue for events.
An idea is always a generalization, and generalization is a property of thinking. To generalize
means to think.
- Georg Hegel
5. Network Model
The network model represents communication devices which can generate, receive, and propagate
collections of data in a container known as a packet. It is the model on top of which the hardware
model resides. A network is composed of three basic entities: Nodes, Connection Media, and
Connection Adaptors. All network entities use operation bound simulatables as their base, which allows
their performance to be altered.
The network model uses a simplified single addressing scheme whereby a node has a single address
which is shared by all of its connection adaptors. This is as opposed to having separate MAC and
network addresses for each adaptor as would be the case in a real-world network.
It is important to note that the network model entities are all simulatables, specifically operation bound
simulatables. This means that these entities do not directly communicate via method calls during a
simulation. Instead, all communication between network entities occurs through the use of events
containing messages. The message passed to a network entity during an event is roughly equivalent to a
method call on its interface. As an example, when a connection medium is said to propagate a packet to
the connection adaptors to which it is attached, it is not actually calling each adaptor's receive method
and providing it with the packet. Instead, the connection medium is actually scheduling an event for
each of these connection adaptors. This event will contain a message of the type
ConnectionAdaptorReceiveMessage, which holds the packet that the connection adaptor is to
receive and inspect. This ensures that all events are affected by the temporal mechanics of the
simulation environment.
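The event-mediated propagation just described might be sketched as below. The scheduler interface, class names, and transit-time handling are invented stand-ins for the HVNS machinery; the point is that the medium schedules one delayed receive event per attached adaptor rather than calling `receive` directly.

```java
import java.util.ArrayList;
import java.util.List;

public class MediumPropagation {
    interface Scheduler { void schedule(double time, Runnable delivery); }

    static class Adaptor {
        final List<String> received = new ArrayList<>();
        void receive(String packet) { received.add(packet); }
    }

    static class Medium {
        final List<Adaptor> attached = new ArrayList<>();
        final double transitTime;
        Medium(double transitTime) { this.transitTime = transitTime; }

        // One event per adaptor; delivery happens at now + transitTime,
        // so propagation is subject to simulated time, not a direct call.
        void propagate(Scheduler sim, double now, String packet) {
            for (Adaptor a : attached) {
                sim.schedule(now + transitTime, () -> a.receive(packet));
            }
        }
    }

    public static void main(String[] args) {
        Medium m = new Medium(2.0);
        Adaptor a = new Adaptor();
        m.attached.add(a);
        // A toy scheduler that records the delivery time and delivers immediately.
        m.propagate((t, r) -> { System.out.println("deliver at t=" + t); r.run(); }, 10.0, "pkt");
        System.out.println(a.received);  // prints "[pkt]"
    }
}
```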
5.1. Node
A node represents a packet generating/receiving device. Nodes are roughly equivalent to everyday
systems such as personal computers, file servers, phones, etcetera. Nodes are logically connected to
one another through a connection medium. Internally, a node may have several connection adaptors
which physically connect it to several connection mediums. A node sends messages by making use of
its protocol stack.
FIGURE 5-1. DEPICTION OF THE PHYSICAL CONNECTION BETWEEN TWO NODES AND THE RELATIONSHIP OF THE NETWORK ENTITIES TO ONE
ANOTHER.
Nodes in the network model can be logically connected to several other nodes. They also possess a
protocol handler, known as the NetworkProtocolHandler, which can perform routing services to
determine the next hop destination for a packet. These characteristics allow nodes to also emulate the
functionality provided by switches or routers. As such, the network model has been simplified to
exclude explicit implementation of these network objects.
Nodes also provide the transport API which allows algorithms (or any application layer item) to make
use of the protocol stack. They also fulfill the unreliable transport layer role by wrapping application
layer messages into datagrams which are then provided to the network layer for routing.
5.2. Connection Adaptor
A connection adaptor is a packet propagator and sender. It represents any physical layer device which
interfaces with the medium itself. Examples of such devices include network interface cards and
wireless antennas. One or more connection adaptors can be contained within a node.
A connection adaptor must deal with two kinds of events: packets that are outgoing and packets that are
incoming (both with respect to the attached node). A connection adaptor that receives a
packet from its own node (specifically from a protocol handler above it in the stack) sends that packet
out across the connection medium. A connection adaptor that receives a packet from a connection
medium must inspect that packet to determine its intended destination. If the packet's destination
address does not match the connection adaptor's address, then the connection adaptor was not the
intended recipient and the packet is dropped. However, if the addresses match then this connection
adaptor is considered the next hop and the packet must be handled by a protocol handler further up the
protocol stack.
NetworkProtocolHandler is the protocol handler one level further up the stack. It is a network layer
protocol handler that provides routing services. It is responsible for determining the address of the
next-hop node along the path of nodes to the destination.
FIGURE 5-2. THIS FIGURE DEPICTS THE INTERNAL MAKEUP OF A NETWORK OF THREE NODES CONNECTED IN
SERIES. A NODE POSSESSES A CONNECTION ADAPTOR FOR EVERY MEDIUM TO WHICH IT IS CONNECTED. A SINGLE NETWORK
PROTOCOL HANDLER HANDLES ROUTING RESPONSIBILITIES FOR ALL CONNECTION ADAPTORS AND ALGORITHMS INSTALLED ON A NODE.
When NetworkProtocolHandler receives a datagram from a higher level protocol, it interrogates its
routing table to determine the address of the next closest node on the path from the current node to
the destination node. The datagram is encapsulated into a packet and delivered to the connection
adaptor to send out. In some cases, this next hop node may be identical to the destination node. In
other cases, there may be several intermediate nodes that must also receive and route the packet.
NetworkProtocolHandler also handles datagrams received as the payload of a packet from the lower
level protocol. The destination address of the datagram is inspected. If the address matches the
address of the node, then the payload data is removed from the datagram and delivered to the protocol
handler that matches the protocol of the datagram (i.e. the TransportProtocolHandler itself). If the
address does not match the address of the node, then the protocol handler proceeds to act as if the
datagram had been received from the higher level protocol. It determines the next hop address via its
routing table, packages the datagram into a packet with this next hop address, and gives the packet to
the connection adaptor to send across the medium.
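The two-direction routing decision just described might be sketched as follows. Addresses are strings and the routing table maps a destination to its next hop; the class and method names are illustrative, not the actual NetworkProtocolHandler API, and the encapsulation/adaptor hand-off is reduced to comments.

```java
import java.util.HashMap;
import java.util.Map;

public class RoutingSketch {
    final String localAddress;
    final Map<String, String> routingTable = new HashMap<>();
    String sentTo;        // next-hop address of the last packet handed down
    String deliveredUp;   // payload last handed to the transport layer

    RoutingSketch(String localAddress) { this.localAddress = localAddress; }

    // Datagram from a higher protocol: look up the next hop.
    void handleHigher(String destination, String payload) {
        sentTo = routingTable.get(destination);
        // (encapsulate the datagram into a packet addressed to sentTo and
        //  give it to the connection adaptor here)
    }

    // Datagram arriving inside a packet from the lower protocol.
    void handleLower(String destination, String payload) {
        if (destination.equals(localAddress)) {
            deliveredUp = payload;              // for us: hand the payload upward
        } else {
            handleHigher(destination, payload); // not for us: route it onward
        }
    }

    public static void main(String[] args) {
        RoutingSketch b = new RoutingSketch("B");
        b.routingTable.put("C", "C");   // C is directly reachable from B
        b.handleLower("C", "data");     // not ours: forward toward C
        System.out.println(b.sentTo);       // prints "C"
        b.handleLower("B", "data");     // ours: deliver upward
        System.out.println(b.deliveredUp);  // prints "data"
    }
}
```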
FIGURE 5-3. NODE A IS SENDING A PACKET WITH NODE C AS ITS DESTINATION. THE NEXT HOP NODE ALONG THE PATH IS B. WHEN B
RECEIVES THIS PACKET, IT INSPECTS THE DESTINATION ADDRESS OF THE DATAGRAM, DETERMINES THE NEXT HOP ADDRESS, REPACKAGES
THE DATAGRAM INTO A NEW PACKET ADDRESSED TO C, AND FINALLY SENDS IT TO NODE C.
5.3. Connection Medium
A connection medium is a packet duplicating/propagating device. It represents the physical layer /
medium of an actual network, which may be a coaxial cable, twisted pair Ethernet, airwaves, or
otherwise. Depending upon implementation, a connection medium can logically connect multiple nodes
to one another to allow packets to be propagated in a broadcast fashion to devices connected to the
medium. A connection medium is logically connected to a node but physically connected to the node's
connection adaptor. It receives and sends packets to connection adaptors. All connection mediums are
currently reliable and will never drop packets or introduce errors into sent packets. The addition of
unreliable media would necessitate the implementation of a reliable transport protocol.
5.4. Protocol Stack
The protocol stack is an abstraction that describes the collection of protocols installed on a node. It is a
layered series of protocols where the lower level protocol (i.e. the protocol below) provides a service to
the higher level protocol (i.e. the protocol above). Figure 5-4 provides a visual depiction of the protocol
stack in the internet protocol suite called TCP/IP as well as the protocol stack employed by a node.
FIGURE 5-4 MAPPING BETWEEN TCP/IP'S PROTOCOL STACK AND NODE'S PROTOCOL STACK
The traditional TCP/IP protocol suite has 5 layers (Application, Transport, Network, Data Link, and
Physical) and a transport layer API that serves as the glue between an application and the services
provided by the rest of the stack. A node's protocol stack combines several of these layers. The
connection adaptor provides general media access service by handling both the physical and link layer
services. A NetworkProtocolHandler provides networking services by handling routing operations.
TransportProtocolHandler provides unreliable transport services and also provides the transport API to
algorithms. The distribution algorithms sit at the top of the protocol stack and access the services of
the stack via the Transport API (i.e. the IProtocolHandler interface).
A protocol in this model implements the IProtocolHandler interface. IProtocolHandlers handle the
following operations:

- Associate a protocol handler with a protocol name
- Handle packets from the higher level protocol (implementation dependent)
- Handle packets from the lower level protocol (implementation dependent)
All packets implement the IPacket interface. They have a protocol type, source, destination, and
payload. The term used to refer to a packet is dependent upon the layer of the protocol stack. The
transport layer deals with datagrams. The network layer deals with packets. The physical layer deals
with frames. Each layer is responsible for dealing with one type of packet which contains information
pertinent to that layer. The generic term "packet" will be used when referring to a non-specific level of
the protocol stack.
Each layer of the protocol stack encapsulates the packet received from the higher level protocol before
contacting the lower level protocol. Encapsulation adds header (and/or footer) information that is
pertinent to the lower level protocol. The original packet from the higher level protocol is also added as
a payload to the new lower level packet. This is important since it allows the original information to be
retrieved by the corresponding protocol handler on the protocol stack of the destination device.
The typical operation for a protocol handler is as follows. The protocol handler processes a packet. If
the packet was received from a higher protocol, then the next step of the chain is a lower protocol. The
handler encapsulates the original packet as payload inside of a new packet and sends it to the lower
protocol. If the packet was received from a lower protocol, then the next step of the chain is a higher
protocol. The handler removes the payload from the packet and hands this off to the higher level protocol.
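The wrap-on-the-way-down, unwrap-on-the-way-up behavior can be sketched as below. `IPacket` is the interface named in the text; the `Packet` class and the static helpers are invented for illustration.

```java
public class Encapsulation {
    static class Packet {
        final String protocol, source, destination;
        final Object payload;   // either raw data or a higher-layer packet
        Packet(String protocol, String source, String destination, Object payload) {
            this.protocol = protocol; this.source = source;
            this.destination = destination; this.payload = payload;
        }
    }

    // Higher -> lower: add this layer's header fields, keeping the original
    // packet intact as the payload so the peer layer can recover it.
    static Packet encapsulate(String protocol, String src, String dst, Packet higher) {
        return new Packet(protocol, src, dst, higher);
    }

    // Lower -> higher: strip this layer's header and hand the payload up.
    static Packet decapsulate(Packet lower) {
        return (Packet) lower.payload;
    }

    public static void main(String[] args) {
        Packet datagram = new Packet("transport", "A", "C", "app-data");
        Packet packet = encapsulate("network", "A", "B", datagram);
        System.out.println(decapsulate(packet) == datagram);   // prints "true"
        System.out.println(decapsulate(packet).payload);       // prints "app-data"
    }
}
```

The round trip in `main` mirrors Figure 5-5: the datagram that went down Node A's stack is recovered unchanged by the corresponding layer on Node B's stack.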
FIGURE 5-5 THE ENCAPSULATION PROCESS OF A PACKET AS IT MOVES DOWN THE PROTOCOL STACK ON NODE A AND UP THE PROTOCOL
STACK ON NODE B.
Protocol handlers handle one or more protocol types. Protocol handlers maintain references to the next
higher and next lower protocol handlers in the stack. These references are set up during the
initialization of the protocol handlers employed by a node.
Generalization is necessary to the advancement of knowledge; but particularity is indispensable
to the creations of the imagination. In proportion as men know more and think more they look
less at individuals and more at classes. They therefore make better theories and worse poems.
- Thomas B. Macaulay
6. Hardware Model
The hardware model is an abstraction of a computer, its operating system, and several of the
components that impact its performance. This model intersects the network model as it shares a view
of the node and use of the connection adaptor. Distribution algorithms interface with the hardware
model's API to retrieve information, store information, and to make use of the network for
communication. Hardware components, like network components, use operation bound simulatables
as their base, which allows their performance to be altered.
All hardware objects share the following performance-altering properties:

- Transit time
- Maximum allowed operations
- Refresh interval
Some of these were discussed in the context of the operation bound simulatable and how they affect its
state-changing behavior. The discussion here focuses on their meaning and use in a hardware context.
Transit time is the delay associated with sending information from one hardware component to
another. It is used to represent both the time it takes to process a request as well as the time delay
associated with sending a response along the channel between the hardware source and the intended
recipient. Implementation-wise, transit time's value is used when scheduling an event to be received by
another device. The current simulator time plus the transit time is the soonest possible time that a
simulatable may use for scheduling a new event.
Maximum allowed operations affects how many operations can be performed by a piece of hardware
within a given activity interval. Individual hardware components are responsible for implementing a
coherent and reasonable view of what activities qualify as operations (i.e. quantifying the value in
operations of every activity/method). Operations in the context of an adaptor or harddrive correspond
most closely to bandwidth. A harddrive that can perform 10 operations per one unit refresh interval can
possibly send or receive 20 data per unit time. If data is assumed to be a byte and unit time a second,
this corresponds to a bandwidth of 80 megabytes per second. Hardware components keep track of how
many operations they have performed during a time interval and subtract operations as activities are
performed.
The refresh interval value can be thought of as a hardware component's internal clock: the frequency at
which a hardware component can operate. The time between refreshes is a hardware component's
activity interval. The underlying operation bound simulatable uses the refresh interval to reset a
hardware component's operation count, which signifies the beginning of a new activity interval and
allows the hardware component to perform operations once more.
6.1. Hardware Computer Node
The Computer interface provides for the installation of algorithms, harddrives, and caches. It is the
gateway that an algorithm uses to obtain access to hardware services performed by harddrives, caches,
and adaptors.
6.2. Harddrive
Harddrives are large, long-term storage devices. They will generally have slow access times relative to
caches and even connection adaptors. Harddrives store data for distribution algorithms. Harddrive size
will generally be homogeneous across all computer nodes in the network. The important exception is
that the client node must have a harddrive large enough to store the entirety of the data that is to be
distributed to servers on the network so that local-disk read time can be measured.
Harddrives store IData objects which are associated with indices. Harddrives can fetch and store data
from specified indices for a computer node.
6.3. Cache
Caches are small, short-term storage devices. Caches are much smaller than harddrives, but also much
faster. They are optionally installed upon a computer node. Caches represent the use of main memory
to store data to speed up data retrieval. A cache may be used by an algorithm to store information that
it predicts it will need access to in the future. This allows an algorithm to pre-pay the cost of a slow
harddrive access for a piece of data on behalf of a future requester of that data. The algorithm can
retrieve the data from the cache when a request for it is finally made, allowing the requester to
experience the fast cache access time rather than the slow harddrive access time. Ideally, every request
for data could be met with a cache hit, where the prediction for cache storage/data usage was correct
and the cache has the data necessary to field the request. A cache miss occurs when the prediction is
incorrect and the cache does not have the data requested. A cache miss forces the requester to pay
the cost of the cache access time as well as the harddrive access which must then be made.
Caches, like harddrives, store IData objects which are associated with indices. The index associated with
a piece of data is the same whether that data is stored in the cache or on the harddrive.
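A cache of this sort can be sketched with a fixed-capacity map keyed by index. The capacity bound and least-recently-used eviction here are assumptions made for the sketch; the simulator's actual cache policy may differ:

```java
// Illustrative fixed-capacity cache keyed by index. The capacity bound and
// LRU eviction are assumptions for this sketch, not HVNS's documented policy.
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<V> extends LinkedHashMap<Integer, V> {
    private final int capacity;

    BoundedCache(int capacity) {
        super(16, 0.75f, true);   // access-order iteration gives LRU behavior
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
        return size() > capacity; // evict the least-recently-used entry
    }
}
```

Because the map evicts automatically, an algorithm can pre-load predicted indices without tracking capacity itself.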
6.4. Connection Adaptor
Connection adaptors represent network interface cards and are used for communication across the
network. Connection adaptors operate at the packet level, as does the medium to which they are
connected. They transfer full objects without transformation to a separate physical representation (i.e.
bits) across connection media. Connection adaptor speeds are approaching and in some cases
exceeding the transfer rates of local storage like harddrives [15].
"A distributed system is one in which the failure of a computer you didn't even know existed can
render your own computer unusable."
- Leslie Lamport
7. Distribution and Retrieval Algorithms
A distribution and retrieval algorithm (DRA) deals with the logistics of distributing data from a local
client node to a series of nodes remote to it on the network for storage, as well as retrieving this data at
some future point in time. A DRA has the following high-level functions:
- Retrieve data from local storage.
- Set up and manage a storage network of available nodes on the network.
- Retrieve data from remote storage nodes.
Distribution algorithms perform different roles as part of the hardware model, the network model, and
the simulation environment.
DRAs implement the IAlgorithm interface and access the hardware model through installation onto an
entity which implements the IComputer interface, such as HardwareComputerNode objects. This interface
grants a DRA the ability to reference storage devices like the computer's harddrive and cache so that
IData objects can be stored and retrieved locally. In order to use a storage device, a DRA must
understand the interface presented by the storage device as made available through the storage device's
supported messages.
DRAs sit on top of the network protocol stack in the network model. DRAs implement the
IProtocolHandler interface, which connects them to the transport-layer protocol handler. This
connection provides DRAs with the ability to transmit and receive messages, which contain control and
data values stored as a packet's payload, to and from DRAs remote to them in the network.
DRAs are operation bound simulatables in the simulation. As simulation agents they can schedule
events containing messages for other simulatables like harddrives, caches, and the transport protocol
handler.
7.1. Operation Model
DRAs use a client-server approach to communication. One DRA is selected to be the client in the
configuration file which builds the simulation. The other DRAs remain in a passive role until they
volunteer for, and are thereafter selected to perform, server duties.
The simulation begins when the simulator schedules a SET_CLIENT event to the client node which
indicates it has been selected to perform the client role. The client then generates a specified quantity
of data which will be used for both the local and remote read tests. Once these are complete, the client
sends itself a series of bootstrap events to allow it to continue processing and moving through its
behavioral states.
The client's general operation is roughly as follows:
1. Select volunteer(s) and acknowledge their role as server(s).
2. Read data from local storage.
3. Distribute data to server(s).
4. Await server(s) ready signal.
5. Read data from server(s).
The manner in which this is accomplished is implementation specific. Overviews of the two algorithms
designed for this project are detailed in Sections 7.3 and 7.4.
7.2. Implementation
DRAs are implemented using the state design pattern [20]. The state design pattern abstracts out state
values and the unique behaviors associated with those state values into the form of a state object. The
DRA itself is a state-context or state-holder object. The DRA holds a state object to which it
delegates method invocations. The DRA's behavior thus depends upon the behavior/implementation
of the state object it contains. The DRA's behavior can be altered by replacing the state object it holds.
State transitions occur when the state object performs this replacement on its state-holder.
[Figure 7-1 (UML diagram, not reproduced here): an Algorithm class (the state-holder, implementing
IStateHolder) keeps an IState _state field and delegates delegateEvent( Event e ) to it; concrete states
Init, Distribute, and Read implement IState, which declares delegateEvent( Event e ) and
setState( IState s ).]
FIGURE 7-1. UML OF STATE DESIGN PATTERN.
The state design pattern simplifies the creation of distribution algorithms which may have several roles
and several states per role to fulfill. It successfully isolates responsibilities, which creates simpler and
more easily tested code.
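A minimal sketch of this arrangement follows, using the interface and state names from Figure 7-1; the Event class and the method bodies are illustrative, not HVNS source:

```java
// Minimal sketch of the state design pattern as described above.
// Interface and state names follow Figure 7-1; the Event class and the
// transition logic are illustrative assumptions, not HVNS source.
class Event {
    final String type;
    Event(String type) { this.type = type; }
}

interface IState {
    void delegateEvent(Event e);
}

interface IStateHolder {
    void delegateEvent(Event e);
    void setState(IState s);
}

class Algorithm implements IStateHolder {
    private IState _state = new Init(this);

    public void delegateEvent(Event e) { _state.delegateEvent(e); }  // delegate to state
    public void setState(IState s) { _state = s; }
    public String stateName() { return _state.getClass().getSimpleName(); }
}

class Init implements IState {
    private final IStateHolder holder;
    Init(IStateHolder holder) { this.holder = holder; }

    public void delegateEvent(Event e) {
        // A state transition: the state object replaces itself on its holder.
        if (e.type.equals("DO_WORK")) {
            holder.setState(new Distribute(holder));
        }
    }
}

class Distribute implements IState {
    private final IStateHolder holder;
    Distribute(IStateHolder holder) { this.holder = holder; }

    public void delegateEvent(Event e) { /* distribution behavior would go here */ }
}
```

Each state class carries only the behavior for its own stage, which is what isolates responsibilities and keeps each piece independently testable.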
7.3. Client-Managed Distribution Algorithm
The Client-Managed Distribution and Retrieval Algorithm (CMDRA) is a client-server approach that
requires a client which is active in the selection of servers and the fair distribution of data, and which
maintains an index table mapping data indices to server addresses. CMDRA features the use of a cache on
servers to speed up retrieval operations. This section examines the client's operation followed by an
examination of the server(s)' operation. FSMs are provided for both the client operation and the server
operation. The discussion text shares terminology with these diagrams and discusses the highlights of
each state of the FSM.
The operation of CMDRA's client is depicted as an FSM that is split into two parts, beginning in Figure
7-2 and ending in Figure 7-3.
[Figure 7-2 (FSM diagram, not reproduced here): the CMDRA client begins in NullRole and, on receiving
SetClient, broadcasts VolunteerRequest and enters AwaitVolunteers. There it stores volunteer addresses
and sends VolunteerAccepted while more volunteers are needed, rejects surplus ServerVolunteers with
ClientRejectsVolunteer, and counts ServerAcknowledges messages; once no more acknowledgements are
needed it sends DoWork and enters Distribute. In Distribute, each DoWork while data remains triggers an
HD Request; each HD Response sends data to the next server round-robin style; once all server
acknowledgements arrive the client moves on.]
FIGURE 7-2. CMDRA CLIENT FSM (PART 1). FEATURES INCLUDE VOLUNTEER SELECTION AND THE DATA DISTRIBUTION PROCESSES.
Potential clients begin in the NullRole state. They transition into client status once they receive the
SetClient message from the simulation. A client first attempts to locate some arbitrary number of
servers which will each store some fractional slice of the total data to be stored. The client broadcasts a
VolunteerRequest message to obtain these servers. The request message contains a count of the total
amount of data that a server is expected to store. This volunteer request message is disseminated
across all nodes in the network. Servers which are willing and able to store the specified amount of data
send back a ServerVolunteers message. The client accepts a user-defined number of these servers and
rejects the rest.
Once all volunteers have been acknowledged, the client begins data distribution. The client sends data
requests to its harddrive for all indices which are to be sent. The harddrive in turn responds with the
data stored at the index for each request it receives. The client sends the data from each of these
responses to a server which is selected in round-robin fashion. The client keeps track of index
ownership for future retrieval by maintaining a mapping of servers and the indices they hold. Each piece
of data sent is acknowledged by the server as it is successfully stored. Once all data has been sent and
all server acknowledgements have been received, the client proceeds to the server confirmation stage
called ConfirmServerReady.
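The round-robin assignment and the index-to-server table can be sketched as follows; this class is illustrative, not CMDRA's actual implementation:

```java
// Illustrative sketch of CMDRA-style round-robin distribution paired with
// an index-to-server mapping; names are hypothetical, not HVNS source.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RoundRobinDistributor {
    private final List<String> serverAddresses;
    private final Map<Integer, String> indexOwner = new HashMap<>();
    private int next = 0;

    RoundRobinDistributor(List<String> serverAddresses) {
        this.serverAddresses = new ArrayList<>(serverAddresses);
    }

    /** Assign the data at this index to the next server in rotation. */
    public String assign(int index) {
        String server = serverAddresses.get(next);
        next = (next + 1) % serverAddresses.size();
        indexOwner.put(index, server);   // remembered for later retrieval
        return server;
    }

    /** Which server holds the data stored at this index? */
    public String ownerOf(int index) {
        return indexOwner.get(index);
    }
}
```

The same table the client fills while distributing is what lets it later direct each read request to the correct server.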
[Figure 7-3 (FSM diagram, not reproduced here): in ConfirmServerReady, each DoWork while servers remain
unconfirmed sends DataStoreComplete to a server; the client counts ServerReady replies and, once no more
are needed, sends DoWork and enters Read. In Read, each DoWork while data remains to request sends a
ServerDataRequest; valid DataResponse messages are counted, and when no more are needed the client
enters Done and sends SimulationComplete. ServerVolunteers received in any of these states are answered
with ClientRejectsVolunteer.]
FIGURE 7-3. CMDRA CLIENT FSM (PART 2). FEATURES INCLUDE CONFIRMATION OF SERVER READINESS AND THE REMOTE READING
PROCESS.
Inside ConfirmServerReady, the client indicates to each of the servers that it has finished sending it data.
The client then waits for the servers to confirm that they are ready to respond to read requests for that
data. Servers that have finished storing and processing data respond in time. The client enters the Read
state once this has occurred.
Inside Read, the client proceeds to send data requests to each server for the indices it knows that server
possesses. The data received from these requests is compared against the data stored locally to ensure
that it is correct. Once all valid data have been received, the client ends the simulation.
The operation of CMDRA's server is depicted as an FSM in Figure 7-4 and Figure 7-5. Like the client, the
server begins in a passive NullRole state without a specifically defined client or server role.
[Figure 7-4 (FSM diagram, not reproduced here): the CMDRA server begins in NullRole. On a
VolunteerRequest, a server with space marks the request id, sends ServerVolunteers, rebroadcasts the
VolunteerRequest, and enters Volunteered; a server without space ignores the request. From Volunteered,
ClientRejectsVolunteer returns the server to NullRole, while ClientAcceptsVolunteer triggers
ServerAcknowledges and a move to AwaitStorage. In AwaitStorage, each DataStorage message is answered
with ServerAcknowledges; DataStoreComplete triggers ServerReady and DoWork.]
FIGURE 7-4. CMDRA SERVER FSM (PART 1). FEATURES INCLUDE THE VOLUNTEER AND DATA STORAGE PROCESSES.
This changes when a would-be server receives a volunteer request from the client. Servers which have
sufficient capacity and are willing to perform a storage role send a ServerVolunteers message to the
client and proceed into the Volunteered state. They also rebroadcast the volunteer message to other
nodes on the network (up until the time-to-live limit is reached) so that additional servers not directly
connected to the client may also receive the message. Volunteered CMDRAs can either be accepted or
rejected as a volunteer. Would-be servers that are rejected return to NullRole. Would-be servers that
are accepted enter a storage request acceptance state called AwaitStorage and acknowledge this
transition with the client.
Inside AwaitStorage, servers awaiting storage requests field these requests from the client and
place data received into long-term storage on their harddrives. Servers acknowledge every piece of data
received from the client. Eventually, the client will indicate that all data has been sent. The server then
has time to process the data in some way (e.g. place data into cache). Once operations of this
nature complete, the server proceeds to the Service state and confirms that it is now ready to field read
requests.
[Figure 7-5 (FSM diagram, not reproduced here): in the Service state, each ClientDataRequest(index)
triggers a CacheRequest(index). A CacheResponse(data, address) leads to a DataResponse(data, address)
back to the client plus a DoWork(1); a CacheResponse(null, index) leads to an HDRequest(index), whose
HDResponse(data, address) is forwarded as the DataResponse. DoWork(cacheFreespace) events drive the
server to issue further HDRequests that refill the cache as free space appears.]
FIGURE 7-5. CMDRA SERVER FSM (PART 2). FEATURES INCLUDE THE SERVICE PROCESS, WHICH DEMONSTRATES CACHE AND HD STORAGE
AND RETRIEVAL AND THE SERVICING OF CLIENT DATA REQUESTS.
Inside Service, the server is responsible for fielding read requests from the client. The server receives
requests for the data stored at an index. The server first attempts to retrieve this information from the
cache. The cache can respond with data or a null response. Cache data responses are shipped off to the
client. A null response forces the server to request the data from the harddrive. The harddrive's data
response always contains data and can never be null. This data is similarly shipped off to the client.
During this time the server attempts to keep the cache filled with data. Cache hits result in new
requests by the server to the harddrive indicating that the cache needs to be filled with more data (i.e.
data which has not yet been requested).
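The cache-first read path can be sketched as follows; map-backed stand-ins replace HVNS's message-passing cache and harddrive, and all names here are illustrative:

```java
// Illustrative sketch of the cache-first read path described above.
// Map-backed stand-ins replace HVNS's message-passing cache and harddrive;
// the class and field names are hypothetical.
import java.util.HashMap;
import java.util.Map;

class ReadPath {
    private final Map<Integer, String> cache = new HashMap<>();
    private final Map<Integer, String> harddrive;
    int cacheHits = 0;
    int cacheMisses = 0;

    ReadPath(Map<Integer, String> harddrive) {
        this.harddrive = harddrive;
    }

    /** Serve a read request, trying the cache before the harddrive. */
    public String read(int index) {
        String data = cache.get(index);
        if (data != null) {
            cacheHits++;
            return data;                 // fast path: cache hit
        }
        cacheMisses++;
        data = harddrive.get(index);     // slow path: harddrive access
        cache.put(index, data);          // keep the cache warm for next time
        return data;
    }
}
```

A hit costs only the cache lookup, while a miss pays for the cache lookup plus the harddrive access, mirroring the penalty described above.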
7.4. Server-Managed Distribution Algorithm
The Server-Managed Distribution and Retrieval Algorithm (SMDRA) is a client-server approach which
offloads most of the server selection, data distribution, and data mapping to a single primary server which
is responsible for a collection of secondary storage servers. SMDRA features the use of cache as well as
the use of data redundancy to speed up data requests. There are three roles present in this algorithm:
client, primary server, and secondary server(s). This section examines the operation of each in turn.
FSMs are provided depicting the operations of the client and both types of servers. The discussion text
shares terminology with these diagrams and discusses the highlights of each state of the FSM.
The operation of SMDRA's client is depicted in Figure 7-6 and Figure 7-7.
[Figure 7-6 (FSM diagram, not reproduced here): the SMDRA client moves from NullRole to
AwaitFirstVolunteer on SetClient, broadcasting VolunteerRequest. The first ServerVolunteers message is
answered with AcceptedAsPrimary, after which the client enters AwaitVolunteers and relays subsequent
ServerVolunteers messages to the primary server. On ServerReady it sends DoWork and enters Distribute,
where each DoWork while data remains triggers an HD Request and each HD Response sends data to the
primary server; the client counts ServerAcknowledges messages until all have arrived.]
FIGURE 7-6. SMDRA CLIENT FSM (PART 1). FEATURES INCLUDE PRIMARY SERVER SELECTION, VOLUNTEER RELAYING, AND THE DATA
DISTRIBUTION PROCESSES.
Here again, a would-be client begins in the NullRole state until it receives a SetClient message from the
simulation. The client broadcasts a VolunteerRequest message and enters AwaitFirstVolunteer. The
volunteer request message is disseminated across all nodes in the network. Servers which are willing
and able to store the amount of data in a slice send back a ServerVolunteers message.
Inside AwaitFirstVolunteer, the client awaits the first of these messages. The sender of this first
message is selected by the client to be the primary server and is sent an acknowledgement of this role.
The acknowledgement includes information pertinent to the selection of additional volunteers, including
the number of base servers and the amount of data redundancy required.
Inside AwaitVolunteers, the client relays all subsequent server volunteers to the primary server. It does
this until the primary server indicates that it is ready to receive storage requests.
Inside Distribute, the client sends all data to the primary server. It continues sending data until it
exhausts its supply and receives an acknowledgement from the primary server that the data has been
received. The client then proceeds to ConfirmServerReady.
[Figure 7-7 (FSM diagram, not reproduced here): from Distribute the client sends DataStoreComplete and
waits in ConfirmServerReady; once ServerReady arrives and no more are needed, it sends DoWork and enters
Read. In Read, each DoWork while data remains to request sends a ServerDataRequest; valid DataResponse
messages are counted, and when no more are needed the client enters Done and sends SimulationComplete.
ServerVolunteers received in these states are answered with ClientRejectsVolunteer.]
FIGURE 7-7. SMDRA CLIENT FSM (PART 2). FEATURES INCLUDE CONFIRMATION OF SERVER READINESS AND THE REMOTE READING
PROCESS.
The client remains in ConfirmServerReady until the primary server indicates that it has completed its
dissemination of the data. The client can then enter the Read state.
Inside Read, the client sends a data request for every piece of data sent to the primary server. Data
responses will be received from the server (primary or secondary) that has the data and was selected by
the primary server to field the request. The data received is compared against the data stored locally to
ensure that it is correct. Once all valid data have been received, the client ends the simulation.
The operation of SMDRA's primary server is depicted in Figure 7-8 and Figure 7-9.
NullRole
[VolunteerRequest && hasSpace()
Mark request id.
Send ServerVolunteers
Broadcast VolunteerRequest.]
Volunteered
[VolunteerRequest
&& !hasSpace()]
[receive AcceptedAsPrimary
send ServerAcknowledges]
Primary Server FSM
[receive ClientRejectsVolunteer]
AwaitVolunteers
[receive ServerVolunteers
and needMoreVolunteers()
store volunteer address
send AcceptedA