Activity pre-scheduling for run-time optimization of grid workflows


Journal of Systems Architecture 54 (2008) 883–892


Giancarlo Tretola, Eugenio Zimeo *

Research Centre on Software Technology, University of Sannio, Benevento 82100, Italy

Article info

Article history: Received 10 July 2007; Received in revised form 10 December 2007; Accepted 15 January 2008; Available online 7 March 2008.

Keywords: Grid computing; Grid workflow; Run-time optimization; Asynchronous invocation

doi:10.1016/j.sysarc.2008.01.009

* Corresponding author. E-mail address: [email protected] (E. Zimeo).

Abstract

The capability to support resource sharing between different organizations and high-level performance are noteworthy features of grid computing. Applications require significant design effort and complex coordination of resources to define, deploy and execute components on heterogeneous and often unknown resources. A common trend today aims at diffusing workflow management techniques to reduce the complexity of grid systems through model-driven approaches that significantly simplify application design through the composition of distributed services often belonging to different organizations. With this approach, the adoption of efficient workflow enactors becomes a key aspect to improve efficiency through run-time optimizations, so reducing the burden on the developer, who is responsible only for defining the functional aspects of complex applications, since he/she has only to identify the activities that characterize the application and the causal relationships among them. This paper focuses on performance improvements of grid workflows by presenting a new pattern for workflow design that ensures activity pre-scheduling at run-time through a technique that generates fine-grained concurrency with a couple of concepts: asynchronous invocation of services and continuation of execution. The technique is implemented in a workflow enactment service that dynamically optimizes process execution with very limited effort for the application developer.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Grid computing is widely adopted for a variety of applications involving intensive computation and/or massive data manipulation. These applications benefit from the execution on large-scale network computing systems, typically distributed across the Internet and composed of heterogeneous resources [1], able to share computing power and data storage capacity [4]. This potential can be exploited in different research and industrial areas, such as scientific communities that can gain great benefits from collaboration among different teams. Biologists, physicists, and earth scientists use complex applications that need a massive amount of data [2] to achieve scientific application goals [19]. However, not only scientists are interested in grid computing: business processes, engineering, government and multimedia services are all examples of potential grid applications.

Most of these applications are based on the coordination and management of resources and tasks to achieve the application goal. Programming such coordination could be tedious and difficult due to the scale, the variability and the heterogeneity of a grid system. Workflow programming, coming from the field of business process management (BPM), is emerging as an effective paradigm to define and manage processes in distributed environments spanning multiple organizations. The paradigm, in fact, aims at the separation of control logic, which defines the steps of the process to be fulfilled to reach a goal, from application logic, used to manage the resources and to execute the tasks on them. This represents a peculiar feature to easily coordinate resources in a grid environment, as demonstrated by the increasing research focus on this topic and by the existence of the grid workflow forum [19], an open forum about workflow in grid computing.

However, differently from BPM, grid computing is more concerned with performance optimization of applications executed as workflows. This could be obtained by acting at different abstraction layers in the workflow management system (WfMS): at design-time or at run-time. The former regards optimization performed during process definition. The latter may be further divided into three types according to the phase at run-time in which they occur: enactment time, when activities are scheduled for assignment to adequate resources; binding time, i.e., when resources are selected and activities assigned to them for execution; and execution time, when the resources are engaged in executing the scheduled activities [19].

While a lot of research in grid computing has been focused on the optimization of resource selection and local scheduling triggered by the availability of the selected resources, little or no effort has been devoted to optimization at enactment time. At this level, a workflow's sequential constraints between activities could be relaxed to improve concurrency and consequently the degree of parallelism when the workflow is executed in the grid.

[Figure: the WES, layered into Control, Binding and Interaction, takes a process description as input; the Binding layer is assisted by a MatchMaker that consults resource descriptions; the Interaction layer reaches grid resources, and their local schedulers, through a grid middleware. The scheduling, mapping and execution phases are overlaid on the three layers.]

Fig. 1. Grid workflow system architecture.

1 See [18] for the meaning of workflow management terms.


Even though human intervention in the process definition can help to augment concurrency at design-time, automatic optimization is a desirable feature to keep the process description simple. Such optimization should be suggested by declarative labels provided by designers during the inception and design phases, so reducing additional effort. To accomplish this objective we propose a declarative mechanism that allows designers to simplify workflow design by specifying that the execution of some workflow tasks could be anticipated with respect to the causal order specified by a regular transition, moving the actual choices on task scheduling to run-time to improve workflow concurrency. The mechanism is based on two programming concepts required by the execution environment: anticipation and continuation. They could be implemented directly in the workflow engine or provided by a grid middleware exploited by the engine.

This paper discusses the adoption of such a mechanism, presents the architecture of a WfMS that supports these features and analyzes the consequent impact on computation and data transfer for some micro-benchmarks and typical grid workflows. To this end, the paper is organized as follows. In Section 2, a reference architecture of a WfMS is presented with the aim of identifying the aspects of a workflow that can be optimized. Section 3 discusses fine-grained concurrency, the way to achieve it in workflows, the language adopted to describe workflows and activity anticipation, and its impact on the implementation of a prototypical workflow engine. Section 4 evaluates some experimental results to provide a theoretical interpretation of performance. Section 5 analyses related work in grid workflow management. Section 6 concludes this paper and outlines future activities.

2. Background

According to the reference model proposed by the workflow management coalition (WfMC) [17], the operations of a WfMS are divided into two main areas: build-time and run-time. The former regards the design and definition of the process and the abstract definition of the tasks. The latter concerns the enactment of the activities, the selection of the most suitable resource for each activity, and the execution and monitoring of the process. This feature usefully separates the conceptual problem of process definition from the policies and rules of resource management.

The architecture depicted in Fig. 1 presents the main components of a WfMS for a grid environment. It has a central component, the workflow enactment system (WES)¹, that consists of one or more workflow engines that create, manage and execute workflow instances. From an architectural point of view, the WES could be described as a three-layered system. The upper layer (control) is responsible for retrieving the process definition, interpreting it, navigating the corresponding graph, and choosing the activities to be enacted, i.e., it is responsible for the scheduling of the activities characterizing a grid process. The intermediate level (binding) is responsible for the association between a desired functionality and the performer that is in charge of its execution. The binding level is assisted by the MatchMaker, an architectural component able to discover the most appropriate resource for executing a task. The binding operation is conceptually related to the mapping concern of grid systems, in which computing resources are assigned to perform a task. The lower level (interaction) is responsible for performing the connection to the resources for actually performing the tasks. Grid resources are connected to the WfMS via a grid middleware [2], enabling access to them, managing authorization and security and accomplishing many other tasks. Globus Toolkit [22] is an example, but other middleware platforms can be exploited. The resource descriptions are managed by a registry, which stores information (possibly semantic annotations) associated to the functionalities offered by the grid resources.

The architecture described above is aimed at managing grid processes: they are typically described as graphs, where the nodes are the activities to perform and the edges are the transitions between activities. These graphs are able to describe the control flow of the functionalities to be performed in order to achieve the process objectives. As a consequence, the process enactment may be seen in two principal plans: scheduling and mapping, as shown in Fig. 1, which correspond, respectively, to the control and binding layers of a WES. The scheduling defines the enactment time of each activity, so it is concerned with the transition links and the activity nodes in the process description that express the control flow of the process. The mapping is related to the association between an activity and a concrete resource able to perform its execution. In grid computing, mapping represents a characterizing aspect, since it aims at discovering and exploiting the resources, in the network, that ensure a desired quality of service to the grid workflow (a process often performed by meta-schedulers). From the previous analysis, it emerges that run-time optimizations can be achieved by implementing efficient scheduling or mapping techniques. However, while a lot of research has been devoted to mapping techniques, for example the works [3] and [7], scheduling optimization represents an orthogonal plan to improve parallelism and to reduce execution times independently from the available resources. This paper focuses on process scheduling optimization, enhancing the way sequential operations and their relationships are treated. The key point is the anticipation of sequential activities, i.e., the enactment of an activity before its scheduled time. Such improvement could be obtained in two ways: at design-time, modifying the process description, or at run-time, changing the way the process is executed.

3. Workflow fine-grained concurrency

Considering the potential optimizations on the scheduling dimension, this section introduces the concept of fine-grained concurrency in workflow execution, the advantages achievable by implementing it in workflow enactment services, and the overall impact on a workflow management system.

3.1. Concept of fine-grained concurrency

At design-time, fine-grained analysis allows exploiting a great number of situations where, while modelling a workflow process, the designer finds activities that cannot be executed concurrently, for logic constraints or control dependences, and are consequently described with sequential patterns. If the internal structure of an activity were considered at a greater level of detail, we could observe that data dependencies on preceding activities appear in different points of the execution flow, not always at the beginning of the dependent activity. So, even though two activities cannot run concurrently for logical reasons following the well-known and-split pattern [17], this is not sufficient to accept a completely sequential execution, since a partial concurrency is still possible.

To better explain the concept, we consider a simple sequential flow of two activities A and B, the second one presenting a data dependency on the preceding one. The point of dependency could be found at any point of the second activity. As depicted in Fig. 2a, activity B could be considered as a sequence of two sub-activities: B′ and B″. B′ is the subset of B's operations from the start of activity B to the dependence point. B″ is the subset of B's operations from the dependence point to the end.

If during the modelling phase of a workflow we could examine the activities at a greater level of detail, this different nature of B's operations becomes evident. Taking into account such a structure, we could define an equivalent control flow, placing A and B′ in an and-split and then B″ after the join of the two activities. The resulting control flow is depicted in Fig. 2b.

Sub-activity B′ is enacted in parallel with A and, when the and-join [18] is completed, sub-activity B″ is executed.

Fig. 2. (a) Sequential flow and its decomposition; and (b) equivalent flow with fine-grained analysis.

By observing the process, the concept of fine-grained concurrency could be seen as a new control structure between the two extreme cases of sequence and and-split. The sequence behaviour is obtained if the B′ subset is void and B″ = B, while the and-split behaviour is achieved if B′ = B and B″ is void. All the other intermediate situations could be more fittingly modelled with design-time, fine-grained concurrency, whereas in traditional workflow modelling they are described as sequences.

Performing a fine-grained analysis at design-time is not easy in many cases: (1) the activity to be anticipated could have a private implementation; (2) the activity could not be simply decomposed in two subsets, because multiple dependence points exist in the control flow, or it could be too complex and based on other sub-activities to be examined proficiently.

Therefore, a more effective and efficient way is to discover the dependence points at run-time by using a proper technique that simplifies programming. We call this technique fine-grained concurrency at run-time. Fig. 3a shows the traditional way a workflow enactor (E) operates when the workflow description presents a sequence. The activity A starts and runs until its operations are completed, then the result is returned to the workflow enactor and the activity B may be started, receiving the result of A [17]. Fig. 3b shows the sequence enacted by using fine-grained concurrency and considering that the data dependency appears after the activity initialization. Both activities are started at almost the same time. Activity A is invoked and immediately returns a placeholder for the result of its computation. Activity B receives the placeholder as input data, runs until it reaches a point in its flow that requires the actual data, tries to access the placeholder to retrieve the result and, if it is not available yet, the activity is put in a suspended state, waiting for the result. When A completes its elaboration, the actual data replaces the placeholder and B may restart, completing its execution; this is shown in Fig. 3b with the asynchronous data message at the end of A.
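The run-time behaviour just described can be sketched with standard Java futures. This is not the paper's engine, which relies on ProActive future objects (Section 3.4); it is a minimal analogue built on java.util.concurrent, with hypothetical activities A and B and an illustrative dependence point inside B.

```java
import java.util.concurrent.CompletableFuture;

public class AnticipatedSequence {

    // Activity A: produces the data that B depends on.
    static int activityA() {
        // ... long-running computation ...
        return 42;
    }

    // Activity B: its first part (B') is independent of A's result;
    // only the second part (B'') needs the actual data.
    static int activityB(CompletableFuture<Integer> placeholder) {
        int partial = 10;              // B': data-independent operations
        int data = placeholder.join(); // dependence point: stalls here if A is not done
        return partial + data;         // B'': data-dependent operations
    }

    public static void main(String[] args) {
        // Deferred synchronous invocation: A is started asynchronously and
        // immediately returns a placeholder for its future result.
        CompletableFuture<Integer> placeholder =
                CompletableFuture.supplyAsync(AnticipatedSequence::activityA);

        // B is enacted at once with the placeholder as input; A and B'
        // now run concurrently, and B suspends only at the dependence point.
        int result = activityB(placeholder);
        System.out.println(result); // 52
    }
}
```

The enactor never inspects B's internals: B itself reveals its dependence point simply by touching the placeholder, which is what makes the optimization a run-time one.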

The gain is proportional to the execution time of the B activity's operations that do not require the real data, i.e., B′. Those operations could be executed concurrently with the execution of activity A. Considering the point in the B activity's flow that marks the data dependency, it is possible to state that the nearer the point is to the end of B, the greater the performance gain is (best case). Vice versa, if the point is nearer to the beginning of B, the gain is smaller (worst case).
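In symbols, writing $T_A$, $T_{B'}$ and $T_{B''}$ for the durations of A, B′ and B″ (an idealization that ignores enactment and data-transfer overheads), the two enactments of Fig. 3 complete in

```latex
T_{\mathrm{seq}} = T_A + T_{B'} + T_{B''}, \qquad
T_{\mathrm{early}} = \max(T_A, T_{B'}) + T_{B''}
```

so the gain is $T_{\mathrm{seq}} - T_{\mathrm{early}} = \min(T_A, T_{B'})$: it grows with $T_{B'}$ up to a ceiling of $T_A$ (best case, dependence point at the end of B) and vanishes as $T_{B'} \to 0$ (worst case, dependence point at the start of B).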

We utilize anticipation and continuation to remove strict sequential executions of activity sequences in workflows by allowing intermediate results, or symbolic references to them, to be used as preliminary inputs to succeeding activities, thus enabling enactment anticipation. This permits the continuation of workflow graph navigation, evaluating the possibility of starting subsequent activities without waiting for the preceding ones to complete. Such execution allows for enacting activities exploiting fine-grained concurrency at run-time, since the system is able to dynamically find the dependence point. This simplifies the design, since there is no design-time overhead to analyze the activities used in the composed process. The internal structure of the activity to be anticipated is no longer a concern, even if it is too complex or if multiple dependence points exist.

[Figure: in (a), B's data-independent and data-dependent operations run only after A completes; in (b), B's data-independent operations overlap A, B enters a waiting state at the dependence point, and its data-dependent operations resume when A's data arrives.]

Fig. 3. (a) Sequential and (b) anticipated enactments.

(a) XML schema extension:

<xsd:element name="Transition">
  <xsd:complexType>
    …
    <xsd:attribute name="FlowType" type="xsd:string" use="optional"/>
    …
  </xsd:complexType>
</xsd:element>

(c) XPDL description:

<WorkflowProcess Id="Example1">
  <ProcessHeader DurationUnit="S"/>
  <Activities>
    <Activity Id="A"> … </Activity>
    <Activity Id="B"> … </Activity>
  </Activities>
  <Transitions>
    <Transition Id="AB" From="A" To="B" FlowType="early"/>
  </Transitions>
</WorkflowProcess>

Fig. 4. (a) XML schema of the early start pattern; (b) graphical notation of the pattern; and (c) XPDL description of the pattern.

2 Previously called aEngine.

3.2. Data-flow synchronization and optimization

The proposed approach could also improve data transfer during execution without requiring explicit management of the process using a data-flow approach. Data transfer on the network could be optimized by using a placeholder as a symbolic reference. It could act as a substitute for the data and could be forwarded to subsequent activities, using it as a pointer to the data and moving it across the network instead of the data themselves, which are larger. In some activities, the data received by the invoker as a result from a preceding activity could be used only as parameters for invoking subsequent activities. So, if the supporting middleware is able to update the placeholder only where the data is really requested and processed, we can avoid useless data transfers to participants that only receive and pass them on, and deliver the data only to the participants that really need them for elaboration purposes. This allows saving bandwidth and shortening the time needed for data to reach the next participant for processing.
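The saving described above can be illustrated with a hypothetical DataRef placeholder (the name and the transfer counter are ours, for illustration only): intermediate participants forward the reference untouched, and the payload crosses the network only when a participant actually dereferences it.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PlaceholderForwarding {

    // Counts simulated payload transfers across the network.
    static final AtomicInteger transfers = new AtomicInteger();

    // Hypothetical placeholder: a symbolic reference to remotely stored data.
    static class DataRef {
        private final String payload;  // stands in for a large remote dataset
        DataRef(String payload) { this.payload = payload; }
        String resolve() {             // only dereferencing moves the data
            transfers.incrementAndGet();
            return payload;
        }
    }

    // Intermediate activity: receives the reference and merely passes it on,
    // so no payload crosses the network here.
    static DataRef relay(DataRef ref) { return ref; }

    // Final activity: the only participant that really needs the data.
    static int consume(DataRef ref) { return ref.resolve().length(); }

    public static void main(String[] args) {
        DataRef ref = new DataRef("large intermediate result");
        ref = relay(relay(ref));       // two pass-through participants, zero transfers
        int len = consume(ref);
        System.out.println(len + " bytes moved in " + transfers.get() + " transfer(s)");
    }
}
```

However many participants the reference traverses, the payload is fetched exactly once, at the consumer; this is the data-flow optimization that the placeholder makes implicit.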

It is worth noting that with the proposed approach, it is possible to obtain data-flow synchronization without using an explicit data-flow description [19] for process modelling.

3.3. Workflow language to support fine-grained concurrency

The ability to enforce fine-grained concurrency at run-time needs a declarative mechanism to define a sequence as potentially concurrent and then to delegate to the enactment system the discovery of the dependence points.

Even though a lot of languages are currently used to describe workflows in grid computing, in this paper we refer to a de facto standard proposed by the WfMC: XPDL [16] (XML process definition language). The same declarative approach to optimization can be applied to other languages. In this language, the process model is activity-based [6,17], i.e., a process is defined as a set of activities to be executed by the appropriate performer to reach the computational goal.

We extended the XPDL language to support a new workflow pattern, namely the Early Start Pattern, with a new attribute in the element "Transition" [16], modifying the XML schema as follows:

A new optional attribute, FlowType, can be used to characterize a workflow transition. With this attribute, we declare during process definition that two activities connected in sequence could be executed by using the early start pattern, if the value "early" is assigned to the attribute FlowType. "Early" labelling is optional and usable at design-time. Optimizations are the responsibility of the enactment system; no other effort is required of the designer. If the WfMS can interpret the attribute, the early start pattern is executed; otherwise it is ignored by traditional engines that do not support the pattern, which will still execute the sequence correctly in the traditional way. The XPDL in Fig. 4c describes a sequence of two activities A and B (whose graphical notation is shown in Fig. 4b) grouped in an early start pattern using the defined optional attribute (whose XML schema is defined in Fig. 4a).
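The backward-compatibility argument above is easy to check mechanically. A sketch of how an enactor might read the optional attribute with the standard Java DOM API follows; the class and method names are ours, not part of SAWE, and a missing or unrecognized value falls back to plain sequence semantics.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class FlowTypeCheck {

    // Returns true only when a Transition element opts in to the early start
    // pattern; any other value, or a missing attribute, keeps plain sequence
    // semantics, so traditional engines remain compatible.
    static boolean isEarly(Element transition) {
        return "early".equals(transition.getAttribute("FlowType"));
    }

    static Element parseTransition(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element early = parseTransition(
                "<Transition Id=\"AB\" From=\"A\" To=\"B\" FlowType=\"early\"/>");
        Element plain = parseTransition(
                "<Transition Id=\"BC\" From=\"B\" To=\"C\"/>");
        System.out.println(isEarly(early) + " " + isEarly(plain)); // true false
    }
}
```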

3.4. Middleware support for the engine implementation

To support the Early Start Pattern, the enactment system must satisfy the following requirements, which we have defined in our preceding papers [11] and [12]:

1. Invocation of the activities must be asynchronous, returning a placeholder for the result not yet computed (deferred synchronous invocation).

2. Placeholders can be forwarded to subsequent activities as actual parameters to satisfy the activation conditions, so anticipating the activation.

3. Activities that receive the placeholder and try to access the data must be stalled until the data is ready to be used.

4. The forwarded placeholder must be updated as soon as possible for each activity that uses it.
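A minimal placeholder satisfying requirements 1–4 could look like the following wait/notify sketch. The actual system delegates this to ProActive future objects (Section 3.4); the class below is illustrative only.

```java
public class Placeholder<T> {

    private T value;
    private boolean ready = false;

    // Requirement 4: updated as soon as the producing activity completes,
    // waking any consumer stalled in get().
    public synchronized void set(T v) {
        value = v;
        ready = true;
        notifyAll();
    }

    // Requirement 3: a consumer that tries to access the data before it is
    // ready is stalled until the producer calls set().
    public synchronized T get() throws InterruptedException {
        while (!ready) {
            wait();
        }
        return value;
    }

    public static void main(String[] args) throws Exception {
        // Requirement 1: the invocation returns the placeholder immediately;
        // Requirement 2: it can be forwarded to the next activity as an
        // actual parameter before the result exists.
        Placeholder<Integer> p = new Placeholder<>();
        Thread producer = new Thread(() -> {
            // ... the producing activity's computation ...
            p.set(42);
        });
        producer.start();
        System.out.println(p.get()); // 42 (stalls until the producer finishes)
        producer.join();
    }
}
```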

In particular, the control layer must be able to continue analysing the process graph and to satisfy activation conditions using placeholders, i.e., to perform continuation. The interaction layer must be able to asynchronously invoke resources, receive the placeholder and forward it, i.e., to perform anticipation. The supporting middleware platform also has some requirements to satisfy: placeholder restitution and updating are the responsibility of the connection and transmission modules. We chose ProActive [27] to perform this task.

ProActive is a pure Java library for parallel, distributed and concurrent computing that provides a framework based on the active object pattern [10]. An active object is an object with an interface exposing public methods executed by a dedicated control thread. It is designed to improve simplicity and reuse in parallel programming, supporting separation of concerns related to functional and system aspects of programming [27]. ProActive provides asynchronous calls by means of future objects, which act like placeholders. These objects can be forwarded as parameters; they may be controlled to avoid premature accesses that would obtain inconsistent data; they are able to stall the accessing thread when results are not yet available; finally, they are able to wake the stalled threads when results are updated.

To test the validity and extent of our idea, we have implemented an experimental WES, called SAW engine² (semantic and autonomic workflow engine, SAWE), which is compliant with the WfMC abstract reference model [17]. It can execute a process defined in the XPDL language and implements the API to interact with other logical entities, treated as invoked applications [17].

[Figure: SAWE's three layers. The Control layer (Process Control, Activity Control) receives the workflow description; the Binding layer (Resource Manager with LocalRM, WSRM, EJBRM and ProActiveRM adapters) consults the MatchMaker against web service and ProActive service descriptions to retrieve concrete resources; the Interaction layer (Invoker and Resource Interfaces: HumanPerformer RI, RMI RI, Java Reflection RI, Web Services RI, ProActive RI) submits jobs and retrieves results from ProActive resources through the ProActive middleware.]

Fig. 5. Architecture of SAWE.

Fig. 5 shows the detailed current architecture of SAWE. The control layer is able to receive the process description in XPDL (currently BPEL is also interpretable), creates and navigates the process graph, a set of nodes and links that represents the active instances of the process, and chooses the activities that could be executed according to the activation conditions. The binding layer is responsible for associating a concrete resource with an abstract functionality in the process. The functionality may be described using syntactic or semantic annotations. The matchmaker collaborates with the binding layer and is responsible for performing the discovery and selection of the most appropriate resource/functionality to be assigned to an activity for execution. The interaction layer enables remote invocations to implement interactions with the bound resources. At this level, several communication technologies can be used: RMI, Web Services, POJO, HTTP, or ProActive. The interaction with the resources is delegated to a specific adapter for each technology and is managed using a standard resource interface able to communicate with the upper layers independently from the technology.

SAWE is multithreaded, uses asynchronous invocation towards the resources, and is able to anticipate the execution of sequential activities with a deferred blocking mechanism, if a proper middleware is used at the interaction layer. Currently, to perform activity anticipation, the interaction layer uses the asynchronous invocation of remote objects provided by ProActive [27], returning placeholders to the control layer, which is thus able to continue the graph navigation by forwarding them to other schedulable activities. Moreover, we have developed a library that enables asynchronous invocations and continuation also with web services, where a semantic extension has been provided. A preliminary work introducing asynchronous invocation in web services through a client-side approach for interaction management has been presented in [13], whereas an implementation based on WS-Addressing and future objects in web service interactions has been presented in [14].

4. Performance analysis

In order to assess the quantitative and qualitative characterization of the potential performance gain, we conducted some experiments. First, we aimed at identifying a theoretical performance framework; then we observed the behaviour of a real, more complex application.

4.1. Micro-benchmarks analysis

To analyse SAWE's performance and scheduling behaviour, we executed an ‘‘in vitro” experiment. The experiment was conducted without considering the mapping problem, i.e. we were not concerned with performance improvement or degradation linked to resource allocation, computational power, and availability. We consider the problem in the ideal situation of infinite computing resources. With sequential scheduling, the load that activities place on the computing resources may be considered an almost constant value. On the other hand, with the Early Start Pattern the mapping system must be able to follow the increased computational request, due to the larger number of activities concurrently scheduled in order to complete the process in less time. A grid environment is an ideal candidate to ensure mapping support and avoid ‘‘saturation” problems.

In the following we analyze the ideal range of performance improvement obtainable with fine-grained concurrency. The workflow process considered is a simple sequence of two activities: A and B.

The process description in XPDL is the same presented as an example in Section 3.3. In order to quantify the range of performance improvement, we consider two situations discriminated by the point of data dependency in activity B with respect to the result of A. The extent of the performance gain, in fact, depends on the point at which the data dependencies are placed. Activity A produces data that will be used by activity B to perform its task. Activity A is composed of a series of operations on an integer value that require about 10 s to complete. Activity B receives the integer and performs a set of operations that also complete after 10 s. Thus the process under test is a sequence pattern with a data dependency in the second activity. The data transfer time is not significant, given that the integer is a very small piece of data. This kind of process, scheduled in a traditional way, is constrained to start activity A, then suspend the workflow enactment and wait until activity A completes. After that, the engine receives data from the resource and can use them to start the subsequent activity B. The total running time of the process is about the sum of the activities' running times, plus the engine enactment time and the network's data transfer delay. We considered a pure sequential execution and two anticipated ones under two ideal hypotheses: (1) the data dependency is at the start of the activity; (2) the data are needed at the end.

Table 1
Experimental results

Test                                      Time (s)
Pure sequential                           20.506
Anticipated: dependencies at the start    19.784
Anticipated: dependencies at the end      11.689

888 G. Tretola, E. Zimeo / Journal of Systems Architecture 54 (2008) 883–892

The activities are computation-intensive applications, simulated as long running with respect to the total duration of the process, because the elaboration time is greater than the engine operation time and the data transfer time. The averaged test results are shown in Table 1.

The performance gain obtained for the sequence of activities is 3.5% in the first case and 43% in the second case. The gain obtained even in the worst case is due to the overlapping of the computation of the first activity with the transfer of data to the second one. The average gain is about 23.26%.
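The reported percentages follow directly from Table 1, with gain = (T_seq − T_ant) / T_seq:

```python
t_seq = 20.506    # pure sequential (s)
t_start = 19.784  # anticipated, dependency at the start of B (s)
t_end = 11.689    # anticipated, dependency at the end of B (s)

gain_start = (t_seq - t_start) / t_seq
gain_end = (t_seq - t_end) / t_seq
avg = (gain_start + gain_end) / 2
print(f"{gain_start:.1%} {gain_end:.1%} {avg:.2%}")   # -> 3.5% 43.0% 23.26%
```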

4.2. Real application analysis

To perform a real-case analysis, we considered a process suited to solving systems of linear equations, which may be used as an intermediate step in a more complex grid process. The application that we took into account is a process that manipulates data, produced by data sources or other grid processes, provided as matrices. They are processed to solve the associated linear system, producing a result for subsequent elaborations. The linear system is solved through Cramer's method. The determinant of the system matrix A is computed; if the result is not equal to zero, a Solver computes the solutions of the system.
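The logic of the process can be sketched as follows (a minimal, single-machine illustration of Cramer's rule; the real Solver distributes the determinant computations over grid nodes, and the naive Laplace-expansion determinant used here is only suitable for small examples):

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cramer(a, b):
    """Solve a x = b: compute det(a); if nonzero, x_i = det(A_i)/det(a),
    where A_i is a with column i replaced by b."""
    d = det(a)
    if d == 0:
        raise ValueError("singular system")
    n = len(a)
    return [det([[b[r] if c == i else a[r][c] for c in range(n)]
                 for r in range(n)]) / d
            for i in range(n)]

# Example: x + y = 3, x - y = 1  ->  x = 2, y = 1
print(cramer([[1, 1], [1, -1]], [3, 1]))   # -> [2.0, 1.0]
```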

Fig. 6 depicts the process graph of the application and the deployment on the computational resources. One grid resource is responsible for retrieving matrix A and vector b that describe the linear system.

Activities ‘‘Retrieve A” and ‘‘Retrieve b” are described using the Early Start Pattern to point out that the design effort aimed at performance improvement may be greatly reduced while obtaining good performance, thanks to the run-time exploitation of the potential concurrency. A different resource computes the determinant of A. The Solver process is responsible for executing the calculation of the results using a cluster of computational resources. We employed a non-dedicated cluster with 36 nodes, each of them equipped with an Intel Xeon Hyper-Threading processor at 2.98 GHz and 2 GB of memory. The total resources of the cluster are 144 processors, with an available memory of 71.28 GB and a disk space of 2333.5 GB.

Fig. 6. A process to solve a system of linear equations.

The application is described in XPDL and is enacted using SAWE. The transitions expressed with dashed lines in the process graph shown in Fig. 6 are the ones managed with anticipation. The XPDL description of the process is sketched in the same figure.

We used the time of a sequential execution of the application obtained through RMI as a reference time. The computing resources in this case are Java objects able to manipulate the elements of matrices and vectors and to perform the operations, in particular the LU decomposition and the determinant calculation [28,29]. The Solver activity is actually a sub-process that performs the computation of the determinant of A and the allocation of the modified matrices Ai, obtained by substituting each column of A with the b vector, and then computes each determinant concurrently [30].

We chose this example for two main reasons. First of all, it represents a typical pattern of a grid application that retrieves data from a repository to elaborate them with computational resources. Second, it exploits concurrent elaboration, so presenting little room for anticipation other than data-flow optimization and resource allocation overlapping. Also the computing environment represents a worst case for the experimentation: the cluster nodes used are connected with a high-speed dedicated bus, so the communication overhead is minimal, and the overlap of computation and communication is small.

Anyway, our intention was to measure the possible improvements obtainable using anticipation with this process, based only on asynchronous invocation and on the overlapping of object instantiation and computation activities. The benefit obtainable in this case may be considered as the improvement obtainable for a realistic process in an adverse situation. We may consider, for example, that if the same process were executed in a geographically distributed grid, the overlapping of communication time might give a further improvement. Moreover, the pool of resources available for the process was limited to a maximum of 18 nodes, equipped with ProActive and ready to be used.

  <Activities>
    <Activity Id="A" Name="Retrieve A"> … </Activity>
    <Activity Id="B" Name="Retrieve b"> … </Activity>
    <Activity Id="C" Name="Determinant of A"> … </Activity>
    <Activity Id="D" Name="Solve System"> … </Activity>
  </Activities>
  <Transitions>
    <Transition Id="AB" From="A" To="B" FlowType="early"/>
    <Transition Id="BC" From="B" To="C" FlowType="early"/>
    <Transition Id="CD" From="C" To="D" FlowType="early"/>
  </Transitions>


Table 2
Comparison of sequential and anticipated execution times

Dim   Sequential    Anticipated   Sequential    Anticipated   Sequential     Anticipated
      4 nodes (s)   4 nodes (s)   8 nodes (s)   8 nodes (s)   16 nodes (s)   16 nodes (s)
32    0.77861       0.63293       0.64823       0.57320       0.83338        0.77588
64    1.06780       0.92967       0.88592       0.78620       0.86491        0.80758
128   1.64752       1.52466       1.13404       1.01645       1.15792        1.10406
256   5.29799       5.18177       3.48362       3.15447       2.64570        2.52414
512   88.43059      86.87334      55.24335      53.98423      28.91419       27.72004

G. Tretola, E. Zimeo / Journal of Systems Architecture 54 (2008) 883–892 889

We executed the process with different numbers of available resources, considering 6, 10, and 18 nodes, of which 4, 8, and 16 were devoted to computing the Solver sub-process. Each experiment was performed with matrix dimensions growing from 32 to 512, with doubling size steps, in order to have a homogeneous allocation on each node. The obtained results are shown in Table 2, in which the average execution times are measured in seconds. The performance gains expressed as percentages are shown in Table 3, whereas a graphical representation is shown in Fig. 7.
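As a sanity check, each entry of Table 3 follows from Table 2 as gain = (T_seq − T_ant) / T_seq. For instance, for the 4-node configuration:

```python
# (sequential, anticipated) times in seconds, 4-node columns of Table 2
four_nodes = {
    32: (0.77861, 0.63293),
    64: (1.06780, 0.92967),
    512: (88.43059, 86.87334),
}
for dim, (seq, ant) in four_nodes.items():
    print(dim, f"{(seq - ant) / seq:.2%}")   # 18.71%, 12.94%, 1.76%
```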

The result obtained reflects some considerations reported in a previous work [14]: performance gains are inversely proportional to the task duration when the dependence point is near to the start of the process, which means that the process has computational tasks that can be only minimally overlapped with result placeholder anticipation, as in this case. The results are also consistent with the simple model for the performance evaluation introduced in the preceding section: the performance improvement is greater with a lower duration of the process. For matrix dimensions ranging from 32 to 128 the transmission time is greater than the computation time. When the matrix dimension increases, the performance gain

Table 3
Performance gain of anticipated execution

Dim   Gain 4 nodes (%)   Gain 8 nodes (%)   Gain 16 nodes (%)
32    18.71              11.57              6.90
64    12.94              11.26              6.63
128   7.46               10.37              4.65
256   2.19               9.45               4.59
512   1.76               2.28               4.13

Fig. 7. Average performance gain.

decreases, since the overlapping between communication and computation is limited to a small fraction of the overall computation time. Moreover, it is worth noting that with a matrix dimension smaller than 512 the performance gain decreases when the number of computing nodes increases, whereas with a matrix dimension equal to or greater than 512 the performance gain increases with the number of computing nodes. This depends on the kind of benefit derived from the anticipation: when matrices are small, the communication time exceeds the computation time, and so the main benefit is related to the overlapping of communication and computation; when matrices are larger, the communication time becomes negligible and the gain depends on the achievable parallelism between the calculation of the determinants of Ai and the determinant of A. Therefore, the asymptotic gain obtainable with a high number of nodes is about 50%.

However, the great benefit of our technique can be appreciated if we consider using a fine-grained analysis at design-time to ensure the same efficiency as the execution performed autonomously by SAWE. For our analysis we start from the simple workflow described in Fig. 6 and then modify its description.

In the fine-grained analysis, the activities of the original process are analyzed in their internal details and, where possible, the process control flow is reorganized in order to partially overlap the computation. Such analysis produces the workflow shown in Fig. 8, whose complexity is much higher than the complexity of the same workflow described with the Early Start Pattern at the top of the figure. A further improvement in performance could be achieved if a different granularity were considered. In our implementation, we supposed to have a service for calculating the determinant, and consequently our analysis was performed at a medium level of granularity.



Fig. 8. Equivalent process obtained with fine-grained analysis.


If we consider a coarse-grained (black-box) service to calculate the solutions of a system of linear equations, a design-time fine-grained analysis is not possible, and so only a sequential execution becomes possible. Our solution is able to automatically discover data dependencies at run-time and to exploit the internal concurrency of each service implementation, so improving performance without any design effort for eliciting concurrency.

5. Related work

Bonita [15] represents one of the first examples of a design-time approach for implementing activity anticipation. It is a cooperative workflow system, which enacts processes that manage human collaboration. Bonita is able to execute processes in two ways: traditional or flexible. The former executes the process according to the WfMC reference model definition [17], whereas the flexible execution is able to perform activity anticipation, specified through a boolean attribute of the activity itself. When that property is true, the engine may anticipate the activity. The ‘‘anticipable” property can be assigned at design-time or at run-time, but it must be explicitly defined by the process designer at design-time. The engine tries to anticipate the execution of the activities only when they are labelled anticipable. Even though Bonita can anticipate activity execution to improve performance, the control of this feature is delegated to developers, who have to verify the possibility of anticipating some activities and to specify the data to propagate at design-time. Hence, the design-time approach, exemplified by Bonita, creates an additional burden for process designers.

In [6], Manolescu presented the implementation of a WES with object-oriented technology that adopts a run-time approach for anticipation. He introduced the concept of micro-workflow, a lightweight workflow architecture oriented to the development of systems using the object-oriented paradigm and parallel languages' features. The fundamental idea is the exploitation of asynchronous invocation, continuation, and the propagation of future objects, typical features of concurrent languages, in workflow management systems. That work is targeted at workflow system developers who want to use continuation techniques, and at object-oriented developers who would obtain the advantages of a clear separation of control from logic, as stated in the classic work of Kowalski [8], so improving the adaptability of software systems to business evolution. We perceived Manolescu's ideas as useful and powerful techniques to extend in the context of grid workflows.

Huang and Huang [5] have also described a run-time approach in the architecture of the SWFL (service workflow language) workflow engine, a general-purpose system usable both in business and in scientific workflows. Their work introduced multilevel parallelism: a server-level parallelism, based on flexible scheduling inside the engine using multithreading; a flow-level parallelism, using multiple engines and the partitioning of a process into sub-processes; and a message-passing parallelism, based on the MPI standard [9], a technology widely used in scientific computing, adopting an XML extension, the MPFL (message passing flow language). SWFL is coupled with VSCE (visual service composition environment), a graphical tool that eases the definition of a process. Even though the approach seems interesting, it is limited to explicit parallelism, where it is obviously possible.

The Grid Workflow Forum describes the research activities involved in workflow process optimization and performance improvement [19]. Many research projects in grid computing aim to improve the modelling of workflow processes. The key activities are related to the design, configuration, and deployment of grid applications for large-scale, multi-organizational, distributed environments. We briefly recall the fields involved in process optimization. At the language level, the lack of a well-established workflow description language, suitable for describing workflows at a high level and at the same time efficiently usable in different application areas, is fostering a lot of research devoted to the definition of specific description languages to be proposed for standardization. A different research area deals with dynamically changing workflow processes, with the aim of including one or more new execution paths in existing workflows and of mapping activities to the appropriate resources. The performance and reliability of a composite service, defined by a workflow process, can take advantage of the underlying grid technology in identifying the available and most suitable grid service instances to invoke. However, this form of dynamicity and self-adaptation needs to be further and carefully investigated.

Due to the dynamic and heterogeneous nature of the grid, the optimized selection of resources and services is not easy. Indeed, optimized resource selection should allow for discovering on the fly the most suitable component services that can be orchestrated for delivering the desired composed workflow process. However, this is not always possible, because the selection should rely on advanced monitoring and performance analysis of workflows and should be based on QoS criteria that are often not available. Moreover, it should take into account that resources and services can be


available/unavailable at run-time. Hence, optimization and/or failure handling strategies commonly adopted in static environments might not be optimal.

In our research activities, we developed an approach based on two main steps. In the first one, an abstract process is automatically defined using planning techniques and abstract functional descriptions. In the second one, the process is executed by performing dynamic binding of the abstract functionalities to concrete implementations, described also using non-functional annotations. Furthermore, we are introducing in the workflow enactor the ability to self-optimize the process at run-time during the execution. These are hot topics, as demonstrated by the number of research projects that are considering these aspects. Here, we present the most interesting ones.

NextGRID aims at defining an architectural model for grid workflow enactment based on semantic description and dynamic binding [20]. A related project is FreeFluo [23], which is able to dynamically execute web service look-up and discovery. Another important project is K-Wf Grid [21], whose main objective is to support the definition of knowledge-based workflow processes to be executed in grid environments. It is involved in semi-automated process composition, description using Petri Nets, and run-time monitoring and knowledge collection, to be reused in future process management.

Other workflow engines implement run-time dynamic features [19]. GWFE [24] supports a just-in-time scheduling system, thus allowing the resource allocation decision to be performed at run-time in order to react to changing grid environments. The Karajan [25] execution model includes execution elements, which are the tasks to be run, and the events generated during the workflow execution. Pegasus [26] executes workflows that specify the location of data and the execution platforms, optimizing data reuse and transfers.

All of these engines try to optimize workflow execution with different techniques, mainly tied to service selection. None of them addresses the problem of dynamically and automatically improving the intrinsic parallelism of activity enactment. Our contribution aims at improving concurrency, and consequently performance, by transparently exploiting anticipation and continuation with a very limited impact on process modelling.

6. Conclusions and future works

This paper addressed the problem of improving workflow process scheduling to reach high performance in grid environments at a low cost for workflow designers. The proposed system implements techniques derived from concurrent languages: asynchronous calls and symbolic references. In particular, we defined a new workflow pattern, Early Start, that allows the designer to label a set of activities as ‘‘early”, without the need to further analyse the implementation, easing design and simplifying the description of the processes while achieving better performance at run-time. All the operations needed to enact the activities with fine-grained concurrency are performed by the system. It is important to note that such results may be obtained also in situations where optimization techniques based on design-time analysis fail, because the activities' internal structure is not accessible, or is too complex and presents more than one dependence point, so that anticipation at design-time cannot be performed. Moreover, if we consider processes designed using abstract functionalities, the real performer of the activities can be known only when the binding is performed, making any design-time optimization attempt useless.

The experimental results show that the ideal performance gains, in terms of total process time for a sequence of homogeneous activities, are in the range between 3.5% and 43%, which represent the two extreme cases of the ideal performance. An additional experiment, conducted with a real process executed by a non-dedicated cluster of computers, showed that the anticipation technique may give important advantages even in a very adverse case, achieving a performance improvement ranging from 1.76% to 18.71% (with the test-bed used), without any additional effort by the designer of the process.

Fine-grained concurrency at run-time, by means of anticipation, also has contraindications. In fact, anticipating the resource allocation before the availability of the data to process might occupy resources, thus preventing their assignment to other tasks. Therefore, additional research effort is required in this direction.

We plan to continue the evaluation in real environments using real processes and data. We are, moreover, interested in quantifying the performance gain obtainable with an increase of data-flow efficiency. It is also interesting to explore the process simplification that could be obtained by exploiting the data-flow synchronization granted by SAWE.

Finally, we are considering using also WSRF (web services resource framework) [22] as grid middleware, extending its implementation to support symbolic references and their restitution, forwarding, and updating, using our ongoing approach coupled with WS-Resource and WS-Notification. The requirements for the middleware remain the same, and SAWE could be used to realize distributed applications exploiting web service composition.

Acknowledgements

The work described in this paper is framed within the CoreGRID Network of Excellence, funded by the European Commission, and the ArtDeco Project, funded by the Italian Ministry of Research (MIUR). We also thank Ester Giallonardo and Franca Perrina for their support in the implementation of the system and for the fruitful discussions on the topic.

References

[1] K. Krauter, R. Buyya, M. Maheswaran, A taxonomy and survey of grid resource management systems for distributed computing, Software: Practice and Experience (SPE) 32 (2) (2002) 135–164.

[2] J. Yu, R. Buyya, A taxonomy of workflow management systems for grid computing, Journal of Grid Computing (2005).

[3] R. Buyya, M. Murshed, D. Abramson, A deadline and budget constrained cost-time optimization algorithm for scheduling task farming applications on global grids, in: International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, USA, June 2002.

[4] I.T. Foster, C. Kesselman, S. Tuecke, The anatomy of the grid – enabling scalable virtual organizations, in: Euro-Par 2001, Proceedings of the 7th International Euro-Par Conference on Parallel Processing, Manchester, UK, August 28–31, 2001.

[5] Q. Huang, Y. Huang, Workflow engine with multi-level parallelism supports, in: Fourth All Hands Meeting, 19–22 September, 2005.

[6] D.-A. Manolescu, Workflow enactment with continuation and future objects, in: OOPSLA 2002, 2002, pp. 40–51.

[7] N. Ranaldo, E. Zimeo, An economy-driven mapping heuristic for hierarchical master-slave applications in grid systems, in: IEEE IPDPS 06, Rhodes Island, Greece, 2006.

[8] R. Kowalski, Algorithm = logic + control, Communications of the ACM 22 (7) (1979) 424–436.

[9] M. Snir, S. Otto, S. Huss-Lederman, D. Walker, J. Dongarra, MPI – The Complete Reference, The MPI Core, second ed., vol. 1, The MIT Press, 1998.

[10] L. Baduel, F. Baude, D. Caromel, Object-Oriented SPMD, INRIA, CNRS, University of Nice Sophia-Antipolis, 2004.

[11] G. Tretola, E. Zimeo, Workflow fine-grained concurrency with automatic continuations, in: Proceedings of the IEEE IPDPS 06, 20th International Parallel and Distributed Processing Symposium, Rhodes Island, Greece, April 25–29, 2006.

[12] G. Tretola, E. Zimeo, Activity pre-scheduling in grid workflow, in: Proceedings of the 15th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), February 7–9, 2007.

[13] G. Tretola, E. Zimeo, Client-side implementation of dynamic asynchronous invocations for web services, in: Proceedings of the IEEE IPDPS 07, 21st International Parallel and Distributed Processing Symposium, Long Beach, California, USA, March 26–30, 2007.

[14] G. Tretola, E. Zimeo, Extending web services semantics to support asynchronous invocations and continuation, in: Proceedings of the IEEE 2007 International Conference on Web Services (ICWS), Salt Lake City, Utah, USA, July 9–13, 2007.

[15] Bonita, Workflow Cooperative System. <http://bonita.objectweb.org>.

[16] Workflow Management Coalition, XML Process Definition Language, Document Number WfMC TC-1025. <www.wfmc.org>.

[17] Workflow Management Coalition, The Workflow Reference Model, Document Number WfMC TC-1003. <www.wfmc.org>.

[18] Workflow Management Coalition, Terminology and Glossary, Document Number WfMC TC-1011. <www.wfmc.org>.

[19] The Grid Workflow Forum. <www.gridworkflow.org>.

[20] NextGRID. <http://www.nextgrid.org>.

[21] K-Wf Grid. <http://www.kwfgrid.net/>.

[22] Globus Project. <www.globus.org>.

[23] FreeFluo. <http://freefluo.sourceforge.net>.

[24] GWFE. <http://www.gridbus.org/workflow/>.

[25] Karajan. <http://www-unix.globus.org/cog/java/>.

[26] Pegasus. <http://pegasus.isi.edu/>.

[27] ProActive Manual. <http://www-sop.inria.fr/oasis/ProActive>.

[28] E. Isaacson, H.B. Keller, Analysis of Numerical Methods, Wiley, New York, 1966.

[29] L.W. Johnson, R.D. Riess, Numerical Analysis, second ed., Addison-Wesley, Reading, MA, 1982.

[30] J.R. Westlake, A Handbook of Numerical Matrix Inversion and Solution of Linear Equations, Wiley, New York, 1968.

Giancarlo Tretola graduated in Computer Engineering in 2003. He received a Ph.D. in Computer Engineering from the Department of Engineering of the University of Sannio in 2007. His current research interests include Service Oriented Computing, Workflow Management, and Grid Computing.

Eugenio Zimeo graduated in Electronic Engineering at the University of Salerno (Italy) and received the Ph.D. degree in Computer Science from the University of Naples in 1999. Currently he is an assistant professor at the University of Sannio in Benevento (Italy), where he teaches courses on Computer Networks, Web Technology, and Distributed Systems. His primary research interests are in the areas of software architectures and frameworks for distributed systems, high-performance middleware, service-oriented computing and grid computing, wireless sensor networks, and mobile computing. He has published about 60 scientific papers in journals and conferences of the field and leads many large research projects. He is a member of the IEEE Computer Society.
