
Original Article

Concurrent Engineering: Research and Applications
2014, Vol. 22(1) 77–88
© The Author(s) 2013
Reprints and permissions: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/1063293X13516965
cer.sagepub.com

Evaluation of trade-offs between workflow escalation strategies

Yain-Whar Si1, Marlon Dumas2 and Ka-Leong Chan1

Abstract

Workflows in the service industry are subject to exceptional circumstances that affect the ability to complete work in a timely manner. For instance, workflows may need to deal with sudden spikes in customer demand due to a variety of events such as promotional deals, product launches, major news, or natural disasters. Escalation strategies can be incorporated into the design of a workflow so that it can cope with sudden spikes in the number of service requests while mitigating the effects of missed deadlines. In this article, we propose a method for evaluating escalation strategies using simulation technology. The effectiveness of the proposed method is demonstrated on a workflow from an insurance company.

Keywords

Escalation strategy, workflow management, simulation, key performance indicators

Introduction

Business processes in the service industry are vulnerable to sudden increases in demand due to exceptional circumstances. For example, in the insurance industry, severe flooding caused by heavy rain may lead to sudden spikes in insurance claims. In order to cope with such spikes, insurers need to rapidly adapt their operations, for example, by incorporating escalation strategies into their workflows.

In the context of workflow management, an escalation is an action performed when a workflow execution is delayed to the extent that it is not on track to meet its deadline (Van der Aalst et al., 2007). For example, in an insurance claim workflow, an escalation may consist of redeploying staff from other departments into the call centers or requiring less information from callers when lodging insurance claims so that more claims can be recorded. The decision on whether or not to escalate, and how, needs to consider the cost of missing the deadline (e.g. lower customer service standards) and the cost of escalating (e.g. additional staff costs).

Several escalation strategies have been proposed in the literature (Panagos and Rabinovich, 1996, 1998; Van der Aalst et al., 2007), each one with its own trade-offs. In a previous work (Chan et al., 2009), a simulation-based method was presented for evaluating the following four deadline escalation strategies proposed in the literature (Panagos and Rabinovich, 1996, 1998; Van der Aalst et al., 2007):

S1: Alternative path selection (Van der Aalst et al., 2007). An alternative execution path is taken to accelerate the execution in order to meet a deadline (at the detriment of additional cost or reduced quality).

S2: Resource redeployment (Van der Aalst et al., 2007). Resources (workers) are moved from one resource pool to another in order to accelerate certain tasks that may cause some workflow cases to run late.

S3: Data degradation (Van der Aalst et al., 2007). Data entry or data checking steps are performed in a minimalistic manner or postponed to a late stage in order to meet a deadline.

S4: Early escalation (Panagos and Rabinovich, 1996, 1998). Continuous monitoring and prediction algorithms are employed to detect deadline violations as early as possible. When it is detected that a deadline may be missed, a task is triggered to inform a designated actor about the potential deadline violation.

1 Department of Computer and Information Science, University of Macau, Macau, China
2 Institute of Computer Science, University of Tartu, Tartu, Estonia

Corresponding author:
Yain-Whar Si, Department of Computer and Information Science, University of Macau, Av. Padre Tomás Pereira, Taipa, Macau, China.
Email: [email protected]

In this article, we extend the simulation-based method presented in Chan et al. (2009) to deal with seven additional escalation strategies proposed in Van der Aalst et al. (2007), namely, escalation subprocess, task pre-dispatching, overlapping, prioritization, batching, splitting, and deferred data gathering. These strategies are applied to a case study from the insurance industry. Using this case study, we show how to design simulation experiments to evaluate the impact of escalation strategies on key performance indicators (KPIs). In doing so, we illustrate the trade-offs involved when selecting escalation strategies and provide guidelines to deal with these trade-offs.

This article is structured as follows. In section “Background and related work,” we discuss related work on escalation and time management in workflows. In section “Escalation strategies,” we introduce the escalation strategies. In sections “Escalation strategy evaluation method,” “Introducing escalations,” and “Experimental results,” we describe the simulation method and illustrate it via a case study. Finally, we conclude in section “Conclusion.”

Background and related work

Panagos and Rabinovich (1996, 1998) proposed “Dynamic Deadline Adjustment” and “Early Escalation” as complementary strategies to minimize the number of escalations needed during workflow execution and to mitigate their associated costs. Dynamic Deadline Adjustment aims at minimizing the number of escalations by attaching an “expected execution time” to each task, and by continuously monitoring each workflow execution in order to detect delays as soon as possible. When a task takes less time than expected to complete, the difference between the expected and actual execution time is accumulated into a slack time variable. When the slack time is negative, the workflow execution is delayed. This is where the second strategy, Early Escalation, kicks in. In the Early Escalation strategy, an algorithm is used to predict whether a case is going to miss a deadline. When a potential deadline violation is detected, the case is escalated. Escalations are defined as actions executed in parallel to the normal workflow in order to reduce the risk of a deadline violation. Panagos and Rabinovich (1996, 1998) evaluate their strategies using simulation technology, but only from a temporal perspective (without considering resource costs).
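The slack-time bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the algorithm of Panagos and Rabinovich: the function name `track_slack` and the sample durations are hypothetical, and a case is flagged as delayed as soon as its accumulated slack becomes negative.

```python
def track_slack(expected, actual):
    """Accumulate slack (expected - actual) task by task.

    Returns the slack after each task and the index of the first task
    after which the slack is negative (None if the case never falls behind).
    """
    slack = 0.0
    history = []
    first_delay = None
    for i, (exp, act) in enumerate(zip(expected, actual)):
        slack += exp - act
        history.append(slack)
        if slack < 0 and first_delay is None:
            first_delay = i
    return history, first_delay


# Hypothetical expected and observed execution times (seconds) for one case.
expected_times = [30, 200, 320]
actual_times = [20, 260, 300]
history, first_delay = track_slack(expected_times, actual_times)
print(history)      # slack after each task
print(first_delay)  # index of the task after which the case is delayed
```

In this made-up case the first task finishes early and builds up slack, the second task overruns and turns the slack negative, and this is the point at which an early-escalation check would be triggered.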

Van der Aalst et al. (2007) analyze deadline escalation strategies using a so-called 3D approach (Detect, Decide and Do). In this approach, potential deadline violations are first detected, and suitable escalation strategies are then selected and applied. Van der Aalst et al. (2007) evaluate the effectiveness of some sample escalation strategies using simulation experiments. The strategies considered in their study are alternative path selection (performing an alternative task when the execution is delayed), resource redeployment (bringing more resources into the workflow execution), data degradation (requiring less data input in order to move faster), escalation subprocess, task pre-dispatching, overlapping, prioritization, batching, splitting, and deferred data gathering. Similar to Panagos and Rabinovich (1996, 1998), Van der Aalst et al. (2007) only evaluate escalation strategies from the time perspective. In contrast, in this article, we propose a simulation-based method that takes into account the cost of task execution, the cost of resources, and compensation cost. The escalation strategies analyzed in Van der Aalst et al. (2007) and Panagos and Rabinovich (1996, 1998) are summarized in Table 1. In this article, we focus on strategies S5–S11 as the first four strategies are analyzed in our previous work (Chan et al., 2009).

Table 1. Escalation strategies analyzed in Van der Aalst et al. (2007) and Panagos and Rabinovich (1996, 1998).

| Strategy | Workflow time | Escalation cost | Cost (task, resource, compensation) |
|---|---|---|---|
| S1 Alternative path selection (Van der Aalst et al., 2007) | O | ✓ | ✓ |
| S2 Resource redeployment (Van der Aalst et al., 2007) | O | ✓ | ✓ |
| S3 Data degradation (Van der Aalst et al., 2007) | O | ✓ | ✓ |
| S4 Early escalation (Panagos and Rabinovich, 1996, 1998) | O | O | ✓ |
| S5 Escalation subprocess (Van der Aalst et al., 2007) | ✓ | ✓ | ✓ |
| S6 Task pre-dispatching (Van der Aalst et al., 2007) | ✓ | ✓ | ✓ |
| S7 Overlapping (Van der Aalst et al., 2007) | ✓ | ✓ | ✓ |
| S8 Prioritization (Van der Aalst et al., 2007) | ✓ | ✓ | ✓ |
| S9 Batching (Van der Aalst et al., 2007) | ✓ | ✓ | ✓ |
| S10 Splitting (Van der Aalst et al., 2007) | ✓ | ✓ | ✓ |
| S11 Deferred data gathering (Van der Aalst et al., 2007) | ✓ | ✓ | ✓ |


Other work in the field of workflow escalation includes that of Georgakopoulos et al. (2000), who outline an approach to support dynamic changes in workflows in crisis situations (e.g. for rescue operations during natural disasters). Their focus is on enabling decision makers to escalate at runtime by changing the course of the workflow execution as required, while retaining some level of control. In contrast, our work focuses on analyzing the effectiveness and costs of different escalation strategies at design time.

A related topic is that of specifying and analyzing time constraints in workflows. Eder et al. (1999) propose Program Evaluation and Review Technique (PERT)-like techniques for analyzing time constraints attached to workflows. Bettini et al. (2002) propose algorithms for checking time constraint satisfiability at design time, while Chen and Yang (2008) propose techniques for efficiently checking time constraints at runtime. Finally, Rhee et al. (2004) propose a PERT-based technique to calculate critical paths and slack time and a guide to help workers prioritize tasks in order to optimize throughput. These and similar related studies are complementary to ours since we do not deal with checking time constraints or optimizing the overall execution of the workflow, but we focus on evaluating escalation strategies to deal with workflow cases that are likely to miss their deadlines.

Previous work also addresses the issue of determining the (minimum) amount of resources needed in a workflow in order to ensure that temporal constraints are met with a certain probability (Son and Kim, 2001). The reverse analysis is done in Li et al. (2004), where, based on the available resources, estimates of average execution time per workflow instance are derived. This work is complementary to ours: the estimates obtained using such techniques can be used to implement escalation strategies based on resource redeployment.

Escalation strategies

A case is an execution of a workflow model (also called process model). For example, in an insurance claim process, a case is the set of activities performed in order to handle a given insurance claim. An escalation strategy describes actions to be taken when a case or a set of cases are predicted to miss the deadline. In this article, we consider the following escalation strategies.

S5: Escalation subprocess. When a case is predicted to miss a deadline, or has already missed the deadline but is still active, a special subprocess is instantiated. The subprocess is intended to perform actions specifically related to the deadline violation, such as renegotiating a new deadline or performing compensation actions and canceling the case.

S6: Task pre-dispatching. The idea of task pre-dispatching is to “pipeline” the execution of upcoming tasks in the case in order to accelerate the case. In other words, a lookahead is performed to find out which tasks will need to be executed, and wherever possible, preparations for these tasks are triggered, for example, by putting these upcoming tasks on the worklists of the corresponding workers so that they are aware that they will need to perform the task with urgency once it is ready for execution. In the case where the execution of an upcoming task is conditional (e.g. only if a certain branch in the workflow is taken), it may happen that the preparatory actions for this task need to be undone. This means that this strategy might create additional work that may later become unnecessary (thus increasing cost). In this article, we examine the case where only upcoming tasks that definitely need to be performed (as opposed to conditionally) are pre-dispatched.

S7: Overlapping. The main idea is to make two sequential activities execute in parallel as much as possible. This strategy decreases the workflow time by creating concurrency, along the lines of task pre-dispatching. However, the trade-off is that this strategy increases the coordination overhead between tasks and therefore involves additional costs. It also involves allocating resources earlier, and this may have an impact on other cases.

S8: Prioritization. By assigning higher priority to cases that are running late, this strategy allows these cases to be accelerated at the expense of other cases competing for the same resources. The assignment of priority can be driven by various parameters such as the cost of the case and waiting time. The priority value is used to rank work items that are waiting to be assigned to a given resource (i.e. work items corresponding to cases with higher priority are given precedence). This strategy will make some cases with low priority have a longer workflow time because they will spend more time in the waiting queues. In this article, the compensation cost is used as the priority value of a case: the higher the compensation cost of a case, the higher the priority of the work items associated with this case.

S9: Batching. This strategy groups tasks or cases together and assigns them to one resource or a group of resources. It reduces the workflow time by eliminating setup times and handover time, but it also entails that some cases will be waiting for others before being treated in batch.

S10: Splitting. In this escalation strategy, a task assigned to a resource or group of resources is split into smaller tasks that are performed in parallel and assigned to separate resources. Because of parallelism, this strategy can reduce the workflow time. The trade-off is the additional setup time, handover time, setup cost, handover cost, and cost of consolidating the outputs of smaller tasks.

S11: Deferred data gathering. The idea of this strategy is not to gather data until the point when these data are actually needed. Some workflows have tasks early on during the workflow that are meant to gather data from the user or other stakeholders. However, some of these data are not needed straight away but only later in the workflow. Accordingly, data gathering tasks may be accelerated by requiring less data to be gathered, thus speeding up the workflow execution. When data are actually needed, they are gathered by the task that needs the data.
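As an illustration of the ranking rule used for prioritization (S8), the work items waiting for a resource can be ordered with a priority queue keyed on the compensation cost of their case. The sketch below is hypothetical: the case identifiers and costs are made up, the helper name `dispatch_order` is ours, and Python's `heapq` is only one convenient way to express the ranking, since the article does not prescribe a data structure.

```python
import heapq


def dispatch_order(work_items):
    """Return case ids in the order a resource would pick them up.

    work_items: list of (case_id, compensation_cost) in arrival order.
    A higher compensation cost means a higher priority; ties keep
    the arrival order.
    """
    heap = []
    for arrival, (case_id, cost) in enumerate(work_items):
        # heapq is a min-heap, so the cost is negated to pop the largest first.
        heapq.heappush(heap, (-cost, arrival, case_id))
    order = []
    while heap:
        _, _, case_id = heapq.heappop(heap)
        order.append(case_id)
    return order


# Hypothetical waiting queue: (case id, compensation cost).
waiting = [("c1", 125), ("c2", 170), ("c3", 150), ("c4", 170)]
order = dispatch_order(waiting)
print(order)
```

The case with the lowest compensation cost ends up last even though it arrived first, which is the longer waiting time for low-priority cases noted under S8.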

Escalation strategy evaluation method

In this section, we present the simulation-based method for evaluating escalation strategies. The method consists of the following steps: (1) annotate the process model, (2) define normal and escalated scenarios, (3) simulate scenarios and estimate escalation needs, (4) introduce escalations by “perturbing” the workflow models, (5) simulate the workflow model with escalations, and (6) analyze simulation results.

To illustrate the method, we will make reference to an insurance claims handling process model taken from Van der Aalst et al. (2007). The process model is depicted in Figure 1 in the form of an Event-Driven Process Chain (EPC). In the EPC notation, a process model is represented as a directed graph consisting of functions (rounded rectangles) representing tasks; events (hexagons) representing, for example, outcomes of decisions; connectors (circles) representing, for example, points of choice; and resource types (ellipses).1

The EPC model in Figure 1 captures a process in a large Australian insurance company for handling inbound phone calls for lodging different types of insurance claims. Three subprocesses are involved in this process: the back-office subprocess, the Brisbane call center subprocess, and the Sydney call center subprocess. There are three tasks in each call center: “check if sufficient information is available,” “classify the kind of claim,” and “register claim.” Call center agents handle all tasks in each call center. Each case must be handled by a single resource. Ninety call center agents are assigned to each call center. In task “register claim,” one data item is required as input. In normal conditions, approximately 9000 calls are received in each call center. Once the information gathering process in the call center has been completed, a claim moves into the back-office process.

In the back office, there are 150 legal experts, 150 claim handlers, and 5 tasks: (1) determine likelihood of claim, (2) assess claim, (3) initiate payment, (4) advise claimant on reimbursement, and (5) close claim. An insurance claim will be rejected in the call centers or in the back office if the claim does not qualify for reimbursement. Similar to the call centers, each case in the back office is handled by identical resources. In task “determine likelihood of claim,” three data items are required as input. In task “assess claim,” one data item is required as input.

We extend the insurance claim scenario from Van der Aalst et al. (2007) by

• Including an additional task “classify the kind of claim” and a new data item for the task “register claim” for both call centers;

• Introducing three new data items for the task “determine the likelihood of claim” and a new resource called “legal experts” for the first two tasks in the back office.

In the first step of the method, the modeler gathers data to annotate the process model with attributes. These attributes are classified into four dimensions: task, case, resource, and data as discussed below. The problem of assigning values to simulation attributes has been widely studied in the field of business process simulation (Greasley, 2004; Tan and Takakuwa, 2007). Complementary techniques to this problem include the following:

1. Domain expertise derived via interviews or other qualitative data gathering methods (e.g. wide-band Delphi) with domain experts.

2. Sampling experiments, wherein simulation attribute values are derived from observations made on a sample of cases in the workflow during a given time window.

3. Industry benchmarks such as Supply-Chain Operations Reference (SCOR) and American Productivity and Quality Centre (APQC) that provide reference values for KPIs of typical processes in a variety of domains.

4. Input analysis, wherein simulation attributes are derived from past execution data, for example, by analyzing records of task execution times in order to determine the probability distribution of the duration of each task.

5. Sensitivity analysis, wherein a given assignment of values to simulation attributes, obtained via one of the above methods, is tested against reality by running simulations with the assigned attribute values and comparing the simulation results with the observed performance of the “As Is” process.
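To make input analysis (technique 4) concrete, the sketch below estimates the parameter of an exponential distribution from recorded task durations. It is a minimal illustration under stated assumptions: the observations are invented, the helper name `fit_exponential_rate` is ours, and the exponential family is assumed up front rather than selected by a goodness-of-fit test.

```python
from statistics import fmean


def fit_exponential_rate(durations):
    """Maximum-likelihood estimate of the rate of an exponential distribution.

    For exponentially distributed durations, the estimate is simply
    the reciprocal of the sample mean.
    """
    if not durations:
        raise ValueError("at least one observation is required")
    return 1.0 / fmean(durations)


# Hypothetical execution times (seconds) recorded for one task.
observed = [25.0, 40.0, 18.0, 37.0, 30.0]
rate = fit_exponential_rate(observed)
mean_duration = 1.0 / rate
print(rate, mean_duration)
```

The fitted mean can then be used as the average execution time of the task in the simulation model.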


In the case study at hand, attribute values were derived from the case study data provided in Van der Aalst et al. (2007), which were themselves derived from interviews with domain experts. In the case of “data attributes,” values were chosen in a way that is consistent with the data in Van der Aalst et al. (2007), though not directly taken from Van der Aalst et al. (2007).

Attributes of a task

Each task is assigned an average execution time (shown in seconds in Figure 1). In addition, for each task, three attributes are required to assess escalation options: execution cost, average completion time, and latest completion time. Latest completion time is defined as 1.8 times the average completion time. An exponential distribution is used in the simulation for the execution time of every activity. In Table 2, we give initial attribute values for each task in the running example.
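The relation between the two completion-time attributes, and the way execution times are drawn, can be sketched as follows. The average completion times are those listed in Table 2 (the Sydney tasks TS1–TS3 have the same values as TB1–TB3 and are omitted); the dictionary layout, the helper name `sample_execution_time`, and the random seed are illustrative assumptions.

```python
import random

LATEST_FACTOR = 1.8  # latest completion time = 1.8 x average completion time

# Average completion times (seconds) from Table 2.
average_time = {
    "TB1": 30, "TB2": 200, "TB3": 320,
    "TO1": 20, "TO2": 660, "TO3": 120, "TO4": 180, "TO5": 30,
}

# Rounded because the products are whole numbers up to floating-point noise.
latest_time = {task: round(LATEST_FACTOR * avg)
               for task, avg in average_time.items()}


def sample_execution_time(task, rng):
    """Draw one execution time from an exponential distribution
    whose mean is the task's average completion time."""
    return rng.expovariate(1.0 / average_time[task])


rng = random.Random(42)  # fixed seed, chosen arbitrarily for repeatability
sample = sample_execution_time("TO2", rng)
print(latest_time)
print(sample)
```

The computed latest completion times reproduce the last column of Table 2, for example 54 for TB1 and 1188 for TO2.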

Attributes of a case

[Figure 1. The base model. EPC diagram of the two call center subprocesses (Call Centre Brisbane and Call Centre Sydney, tasks TB1–TB3 and TS1–TS3, 9,000 calls per week each) and the back-office subprocess (tasks TO1–TO5), annotated with average task execution times, branching probabilities, resource counts, and data items.]

Each case is assigned a unique identifier as well as a deadline, which is calculated based on the average execution time of the tasks in the critical path of the workflow. In addition, a compensation cost (i.e. the cost of missing the deadline) is assigned to each case. The compensation cost can be a fixed amount, or a value from a certain range, or a function that takes as input the amount of time by which the deadline is missed. In the running example, we assume that escalation cost is uniformly drawn from the range (120–170).
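The three ways of defining a compensation cost listed above (fixed amount, value from a range, function of the lateness) can be expressed as interchangeable policies. This is a sketch rather than the authors' implementation: the function names, the fixed amount, the lateness parameters, and the seed are hypothetical, and only the range 120–170 is taken from the running example.

```python
import random


def fixed_compensation(amount):
    """Same compensation for every late case."""
    return lambda lateness, rng: amount


def range_compensation(low, high):
    """Compensation drawn uniformly from [low, high]."""
    return lambda lateness, rng: rng.uniform(low, high)


def lateness_compensation(base, per_second):
    """Compensation that grows with the time by which the deadline is missed."""
    return lambda lateness, rng: base + per_second * max(lateness, 0.0)


rng = random.Random(7)  # arbitrary seed for repeatability
policies = {
    "fixed": fixed_compensation(140),             # hypothetical amount
    "range": range_compensation(120, 170),        # range of the running example
    "lateness": lateness_compensation(100, 0.5),  # hypothetical parameters
}
lateness = 60.0  # hypothetical: deadline missed by 60 seconds
costs = {name: policy(lateness, rng) for name, policy in policies.items()}
print(costs)
```

Keeping the policies behind one call signature makes it easy to swap them between simulation runs without touching the rest of the model.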

Attributes of a resource

Each (human) resource has three attributes: (1) role: used to describe the responsibility of an employee in the workflow; (2) amount: number of resources; and (3) cost: wage of a resource. Three types of resources (Resource 1, Resource 2, and Resource 3) are defined in the running example. The first type of resources is composed of 180 call center agents. The second type of resources includes 150 legal experts. The third type of resources includes 150 claim handlers. Ninety call center agents are assigned to each call center. Legal experts are assigned to tasks TO1 and TO2 in the back office. Claim handlers are assigned to the remaining tasks TO3, TO4, and TO5 in the back office. The wage of a call center agent is 4000 (per 2 weeks), and the wage of a legal expert and claim handler is 6000 (per 2 weeks). Resource cost represents the cost of utilizing a specific resource for a case.
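As a quick cross-check of these figures, the three resource attributes can be stored in a small record and the total wage bill per 2-week period derived from them. The `Resource` class is an illustrative assumption; the amounts and wages are those quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    role: str    # responsibility of the employee in the workflow
    amount: int  # number of resources of this type
    wage: int    # wage per resource per 2-week period


resources = [
    Resource("call center agent", 180, 4000),
    Resource("legal expert", 150, 6000),
    Resource("claim handler", 150, 6000),
]

# Wage bill per role and in total for one 2-week period.
wage_bill = {r.role: r.amount * r.wage for r in resources}
total_wage_bill = sum(wage_bill.values())
print(wage_bill)
print(total_wage_bill)
```

With these values the call center agents account for 720,000 and each back-office role for 900,000, giving 2,520,000 per 2-week period.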

Data attributes

Each association (task, data object) is annotated with an estimated preparation time, that is, the time required to retrieve and prepare the data for the task in question. In the running example, three data objects (Data1, Data2, and Data3) are required for task “Determine likelihood of claim.” This task can only be executed when all three objects are available. In our experiments, the preparation time of Data1, Data2, and Data3 was 20, 20, and 30 s, respectively.

The second step of the simulation method is to define the scenarios: one normal scenario and one or many escalated scenarios. In the running example, we assume that there are two scenarios: (1) a normal scenario with approximately 9000 cases per 2-week period at each call center and (2) an escalated scenario (storm season) where the number of calls increases to 20,000 cases. Here, we assume a negative exponential distribution, but other distributions can be adopted.
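The two scenarios translate into arrival rates as sketched below. The sketch assumes that calls are spread evenly over the whole 2-week period (24 hours a day), that the 20,000 cases of the storm season refer to the same period, and that the negative exponential distribution applies to the time between consecutive calls; none of these details is spelled out in the text, and the function names are hypothetical.

```python
import random

TWO_WEEKS_SECONDS = 14 * 24 * 3600  # assumption: arrivals around the clock


def mean_interarrival(cases_per_period, period=TWO_WEEKS_SECONDS):
    """Average time (seconds) between two consecutive calls."""
    return period / cases_per_period


def sample_arrivals(cases_per_period, n, rng):
    """First n arrival timestamps with exponentially distributed gaps."""
    rate = cases_per_period / TWO_WEEKS_SECONDS
    t = 0.0
    arrivals = []
    for _ in range(n):
        t += rng.expovariate(rate)
        arrivals.append(t)
    return arrivals


normal_gap = mean_interarrival(9000)   # 134.4 seconds
storm_gap = mean_interarrival(20000)   # 60.48 seconds
arrivals = sample_arrivals(20000, 5, random.Random(1))
print(normal_gap, storm_gap)
print(arrivals)
```

Under these assumptions the storm season more than halves the average gap between calls at a call center.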

The third step of the method is to encode the process model and associated attributes for the normal scenario using a discrete event simulation technique. In this article, we use Colored Petri Nets (CPN; CPN Group, 2013), but other techniques/tools could be used instead (e.g. Arena).

Once the initial process model and its attributes are encoded, we simulate it under each scenario. Table 3 shows the results of simulation in storm season and in normal condition. The results show that the current model is suitable for the normal scenario but not for the storm season. Thus, escalation is needed. Table 3 specifically shows that a bottleneck exists in the back office at task TO1 during storm season.

Introducing escalations

The next step in the simulation method is to perturb the base process model in order to incorporate different escalation strategies. Below, we show how this is done for the escalation strategies previously introduced, using the running example as a basis.

Table 2. Task attribute values for running example.

| Task (Ti) | Task description | Execution cost (CTi) | Average completion time | Latest completion time |
|---|---|---|---|---|
| TB1 | Check if sufficient information is available | CTB1 = 10 | 30 | 54 |
| TB2 | Classify the kind of claim | CTB2 = 10 | 200 | 360 |
| TB3 | Register claim | CTB3 = 10 | 320 | 576 |
| TS1 | Check if sufficient information is available | CTS1 = 10 | 30 | 54 |
| TS2 | Classify the kind of claim | CTS2 = 10 | 200 | 360 |
| TS3 | Register claim | CTS3 = 10 | 320 | 576 |
| TO1 | Determine likelihood of claim | CTO1 = 20 | 20 | 36 |
| TO2 | Assess claim | CTO2 = 30 | 660 | 1188 |
| TO3 | Initiate payment | CTO3 = 17 | 120 | 216 |
| TO4 | Advise claimant on reimbursement | CTO4 = 10 | 180 | 324 |
| TO5 | Close claim | CTO5 = 5 | 30 | 54 |

S5: Escalation subprocess. To apply this strategy, we introduce an additional subprocess to speed up the insurance claim process. Only one escalation task, TS (Compensation-and-cancellation with client), is included in the subprocess. We define the execution cost and average time of TS as 25 and 120, respectively. During the execution of TS, a negotiation is carried out with the client to delay his/her claim request. Since the claim is going to be delayed, appropriate compensation is also offered to the client. We define the compensation cost of TS as 140. After executing TS, the case is considered ended.

In addition, a new prediction task TP is added to determine whether the subprocess should be applied. We define the execution cost and average time of TP as 0 and 5, respectively. The prediction task TP and escalation task TS are inserted in the locations in the workflow where a case is initially assigned to a resource. In Figure 2, TP and TS are inserted at the beginning of the back office. In this model, all tasks in a call center are handled by the same call center agent, and all tasks in the back office are handled by the same claim handler. The workflow after applying Escalation subprocess (S5) is shown in Figure 2.

S6: Task pre-dispatching. To apply this strategy, task TB3 is replaced by TB3P and TB3′, and task TS3 is replaced by TS3P and TS3′. Tasks TB3P and TB2, and likewise tasks TS3P and TS2, are executed simultaneously and handled by different resources. Therefore, this strategy increases the demand for resources and has nearly no influence on the cost and reliability of these alternative tasks (see Table 4).

S7: Overlapping. To apply this strategy, we introduce four tasks TB2′, TS2′, TB3′, and TS3′ to replace TB2, TS2, TB3, and TS3, respectively (see Table 4). Tasks TB2′ and TB3′ are executed in parallel and are handled by different resources, and the same holds for tasks TS2′ and TS3′. This parallelism leads to higher execution cost when compared to the original tasks, due to additional coordination efforts.

S8: Prioritization. We insert an escalation task, namely TC, where priorities are assigned to tasks in the back office. The execution cost and average time of TC are 0 and 10, respectively. During the execution of TC, if the case is running late, the case is assigned a priority according to its compensation cost. Task TC is inserted as the first task in the back-end part of the workflow.

S9: Batching. We introduce an additional escalation task, namely, “Assign every 10 cases to be handled by one resource” (or TA for short) in order to batch the cases. We define the execution cost and average time of TA as 30 and 10, respectively. TA is designed to assign every 10 cases to one resource, and its execution cost therefore represents the cost of batching the cases. Three tasks TO3′, TO4′, and TO5′ are introduced to replace TO3, TO4, and TO5 (see Table 4). Each of these new tasks handles a batch of 10 cases with one resource.

S10: Splitting. Task TO2 is replaced by three tasks TO21, TO22, and TO23 (cf. Table 4) that are executed by three legal experts in parallel, thus leading to a shorter average completion time for the Assess claim step compared to the original task. The setup and handover time of TO21, TO22, and TO23 is set to 30. Taken together, the average completion times of tasks TO21, TO22, and TO23 add up to more than the average completion time of task TO2 because of this additional setup and handover time. The cost of tasks TO21, TO22, and TO23 is also higher than the cost of task TO2 in the original model due to setup and handover costs and the cost of consolidating the outputs of the smaller tasks.

S11: Deferred data gathering. We introduce two new alternative tasks TB2′ and TS2′ to replace TB2 and TS2 (see Table 4). These new tasks are designed to process their job quickly by not gathering information item 1, thus resulting in a shorter average completion time compared to the original tasks. In addition, a task TOA (“Gathering information 1”) is added in the back office in order to gather information item 1 when this information is needed (the probability that this information item is needed is 0.7). The execution cost and average time of TOA are 5 and 120, respectively. Task TOA is performed before task Assess claim and only if information item 1 is required.
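The parameter choices for splitting (S10) and deferred data gathering (S11) can be checked with a little arithmetic. The sketch below is illustrative only: the assumption that the work of TO2 is shared evenly among the three parallel tasks, and the function names, are ours and are not stated in the article. The numeric inputs come from Tables 2 and 4 and from the strategy descriptions.

```python
# Illustrative arithmetic for S10 (splitting) and S11 (deferred data gathering).
# Assumption (not stated in the article): the work of TO2 is divided evenly
# among the parallel tasks, and each part pays the setup/handover time once.

def split_task_time(original_time, parts, setup_handover):
    """Average completion time of one parallel part after splitting."""
    return original_time / parts + setup_handover


def expected_value(probability_needed, value):
    """Expected time or cost of a task that runs only with some probability."""
    return probability_needed * value


# S10: TO2 (average time 660) split three ways, 30 of setup/handover per part.
part_time = split_task_time(660, 3, 30)
summed_effort = 3 * part_time

# S11: TOA (cost 5, time 120) runs only when information item 1 is needed (p = 0.7).
toa_expected_time = expected_value(0.7, 120)
toa_expected_cost = expected_value(0.7, 5)

print(f"S10 average time of each part: {part_time:.0f} (TO2 alone: 660)")
print(f"S10 summed time of the three parts: {summed_effort:.0f}")
print(f"S11 expected extra back-office time per case: {toa_expected_time:.1f}")
print(f"S11 expected extra back-office cost per case: {toa_expected_cost:.1f}")
```

Under these assumptions each part takes 250, which agrees with the value listed in Table 4, while the three parts together occupy resources for 750 time units against 660 for the original task.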

Experimental results

The final step in the proposed method is to analyze the simulation results in terms of KPIs. In the current

Table 3. Initial results in normal condition and in storm season.

Group | Measurement | Normal condition (9000 cases per week) | Storm season condition (20,000 cases per week)
Time | Average workflow time | 1200 | 6436
Time | Waiting time at Brisbane | 0 | 0
Time | Waiting time at Sydney | 0 | 6
Time | Waiting time at back office | 0 | 5917
Cost | Average workflow cost (per week) | 792,000 | 1,740,000
Cost | Resource cost (per week) | 2,520,000 | 2,520,000
Cost | Average compensation cost (per week) | 360,000 | 2,280,000
Cost | Total cost (per week) | 1,152,000 | 4,020,000


experiment, we use average workflow time, workflow cost, compensation cost, and total cost to analyze the strategies. Workflow execution time and costs are two crucial dimensions of the insurance claim process, and thus, KPIs from these dimensions are chosen to measure the effect of escalation strategies. Depending on the user’s requirements, other KPIs could be used for analyzing the trade-off among escalation strategies, including service level (fraction of workflow cases which are completed on or before the deadline), efficiency (ratio of execution time to waiting time in workflow instances), utilized capacity of resources (e.g. utilization of resource “Call Center Agent” in our experiment), average processing time of a resource, and average delay (total number of late days/total number of workflow instances). For instance, a manager of a logistics company might be interested in using KPIs such as service level and efficiency to evaluate the escalation strategies in a delivery process.
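To make the alternative KPIs concrete, the following sketch computes service level, efficiency, and average delay from a list of completed cases. The `Case` record, the sample values, and the handling of zero waiting time are assumptions made for illustration; the article gives only the verbal definitions. Delay is measured in the same time unit as the inputs instead of days.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical record of one completed workflow instance."""
    execution_time: float  # time spent being worked on
    waiting_time: float    # time spent queued for a resource
    deadline: float        # allowed total workflow time

    @property
    def workflow_time(self) -> float:
        return self.execution_time + self.waiting_time


def service_level(cases):
    """Fraction of cases completed on or before their deadline."""
    return sum(c.workflow_time <= c.deadline for c in cases) / len(cases)


def efficiency(cases):
    """Ratio of total execution time to total waiting time."""
    waiting = sum(c.waiting_time for c in cases)
    execution = sum(c.execution_time for c in cases)
    # With no waiting at all the ratio is unbounded; report infinity.
    return float("inf") if waiting == 0 else execution / waiting


def average_delay(cases):
    """Total lateness divided by the number of cases."""
    return sum(max(0.0, c.workflow_time - c.deadline) for c in cases) / len(cases)


# Made-up sample: four cases with the same execution time and deadline.
cases = [
    Case(1200, 0, 2000),
    Case(1200, 600, 2000),
    Case(1200, 1800, 2000),
    Case(1200, 2800, 2000),
]

print(f"service level: {service_level(cases):.2f}")
print(f"efficiency:    {efficiency(cases):.3f}")
print(f"average delay: {average_delay(cases):.1f}")
```

In this sample two of the four cases finish within the deadline, so the service level is 0.5, and the two late cases are late by 1000 and 2000 time units, giving an average delay of 750.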

Time

Figure 3 shows the waiting time of the workflow process under storm conditions. The analysis of waiting time reveals a bottleneck at the back office during storm conditions. Hence, applying escalation strategies at the back office is expected to be more effective than applying them at the call centers. The average workflow time under the various escalation strategies is depicted in Figure 4.

Results indicate that task pre-dispatching (S6), overlapping (S7), prioritization (S8), splitting (S10), and deferred data gathering (S11) do not reduce average workflow time. There are several potential reasons for this phenomenon.

Figure 2. Running example after applying the Escalation subprocess (S5). [Figure not reproduced: workflow model with the Brisbane and Sydney call centres and the back-office tasks TO1 to TO5, with the prediction task TP (5 seconds) and the escalation task TS (120 seconds) inserted at the beginning of the back office.]


• First, the positions in the experimental model where these strategies were applied are not bottlenecks. As for deferred data gathering (S11), the average workflow time at the call centers is decreased by this strategy; however, the number of tasks at the back office is increased, and the pressure and workload at the back-office bottleneck are not addressed. Therefore, deferred data gathering (S11) does not reduce but rather increases the average time for the workflow system.

• The aim of the prioritization (S8) strategy is not to reduce the average workflow time but to reduce compensation cost.

• Task pre-dispatching (S6), overlapping (S7), and splitting (S10) decrease workflow time for cases that are running late, but at the same time, they increase the demand for resources. As resources are already highly utilized during storm conditions, this additional resource demand cannot be handled and leads to higher waiting times, which in turn lead to an increase in average workflow time.
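The last point can be illustrated with a textbook single-server queue. The article relies on simulation, not on this formula; the M/M/1 model, the function name, and the numbers below are assumptions chosen only to show how waiting time reacts when demand rises at a resource that is already busy.

```python
def mm1_waiting_time(arrival_rate, service_time):
    """Mean queueing delay W_q = rho * s / (1 - rho) of an M/M/1 queue.

    arrival_rate: cases arriving per time unit
    service_time: mean time a resource needs per case
    """
    rho = arrival_rate * service_time  # utilization of the resource
    if rho >= 1:
        raise ValueError("utilization >= 1: the queue grows without bound")
    return rho * service_time / (1 - rho)


service_time = 100.0  # hypothetical mean service time
for utilization in (0.5, 0.8, 0.9, 0.95):
    wait = mm1_waiting_time(utilization / service_time, service_time)
    print(f"utilization {utilization:.2f} -> mean wait {wait:.0f}")
```

In this simplified model, moving from 90% to 95% utilization roughly doubles the mean wait, which is in line with the observation that strategies adding resource demand under storm conditions end up increasing average workflow time.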

In Figure 4, escalation subprocess (S5) is the best strategy for reducing the average workflow time, followed by batching (S9). Batching (S9) decreases the setup time and handover time, leading to a reduction in average workflow time. In the case of escalation subprocess (S5), the subprocess is executed when a case is about to miss the deadline. The fact that the execution time of this subprocess is shorter than that of the original task leads to shorter execution times for those cases for which the escalation is applied. At the same time, these shorter execution times lead to lower resource utilization, which has a positive effect on the remaining cases.
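A rough way to see why batching (S9) relieves the back office is to compare the resource time spent per case before and after batching. The sketch below uses the average times from Tables 2 and 4 and the average time of the batching task TA. Reading each batched time as the time one resource spends on a whole batch of 10 cases is our interpretation, and the variable names are illustrative.

```python
BATCH_SIZE = 10

# Task -> (average time per case in Table 2, average time per batch in Table 4)
TASK_TIMES = {
    "TO3": (120, 840),
    "TO4": (180, 1080),
    "TO5": (30, 180),
}
TA_TIME = 10  # batching task TA, executed once per batch


def per_case(batch_time, batch_size=BATCH_SIZE):
    """Share of a batch-level time attributed to a single case."""
    return batch_time / batch_size


original_total = sum(single for single, _ in TASK_TIMES.values())
batched_total = sum(per_case(batch) for _, batch in TASK_TIMES.values()) + per_case(TA_TIME)

for task, (single, batch) in TASK_TIMES.items():
    print(f"{task}: {single} per case unbatched, {per_case(batch):.0f} per case batched")
print(f"resource time per case: {original_total} unbatched, {batched_total:.0f} batched")
```

Under this reading, the three tasks consume 330 time units of resource time per case without batching and 211 with batching (including the share of TA), which frees capacity at the bottleneck.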

Workflow cost

We differentiate between workflow cost, resource cost, and compensation cost. Workflow cost is the total execution cost of all cases during a 1-week period. The execution cost of a case is the sum of the execution cost of tasks involved in that particular case (see Table 3). Figure 5 shows the average workflow cost per week in various conditions. It also shows that the workflow cost in normal condition is significantly lower. As expected, high workflow cost is recorded under storm conditions due to larger numbers of insurance claims. The results

Table 4. Modified tasks and their properties.

Strategy | Task (Ti) | Task description | Task to be replaced | Average cost (CTi) | Average time
S6 | TB3P | Preparation of TB3 | TB3 | CTB3P = 5 | 180
S6 | TB3′ | Register claim | TB3 | CTB3 = 5 | 140
S6 | TS3P | Preparation of TS3 | TS3 | CTS3P = 5 | 180
S6 | TS3′ | Register claim | TS3 | CTS3 = 5 | 140
S7 | TB2′ | Classify the kind of claim | TB2 | CTB2 = 20 | 200
S7 | TB3′ | Register claim | TB3 | CTB3 = 20 | 320
S7 | TS2′ | Classify the kind of claim | TS2 | CTS2 = 20 | 200
S7 | TS3′ | Register claim | TS3 | CTS3 = 20 | 320
S9 | TO3′ | Initiate payment | TO3 | CTO3 = 170 | 840
S9 | TO4′ | Advise claimant on reimbursement | TO4 | CTO4 = 100 | 1080
S9 | TO5′ | Close claim | TO5 | CTO5 = 50 | 180
S10 | TO21 | Assess claim | TO2 | CTO21 = 20 | 250
S10 | TO22 | Assess claim | TO2 | CTO22 = 20 | 250
S10 | TO23 | Assess claim | TO2 | CTO23 = 20 | 250
S11 | TB3′ | Rapid register claim | TB3 | CTB3 = 10 | 200
S11 | TS3′ | Rapid register claim | TS3 | CTS3 = 10 | 200

Figure 3. Average waiting time during storm condition. [Chart not reproduced; it compares the average waiting time at Sydney, at Brisbane, and at the back office.]

Figure 4. Average workflow time of the workflow. [Chart not reproduced.]


also show that overlapping (S7) and splitting (S10) increase the workflow cost. Overlapping (S7) increases the workflow cost due to additional coordination effort. Splitting (S10) increases the workflow cost due to additional setup costs, handover costs, and costs of consolidating the outputs of the smaller tasks. The remaining escalation strategies have no influence on workflow cost since they do not modify the control flow.
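The definition of workflow cost as a sum of per-task execution costs can be sketched as follows. The task costs are those of Table 2 and the overlapping (S7) costs those of Table 4; the two example paths and the helper names are hypothetical, since the actual routing probabilities of the model are not reproduced here.

```python
# Execution cost per task, taken from Table 2.
EXECUTION_COST = {
    "TB1": 10, "TB2": 10, "TB3": 10,
    "TS1": 10, "TS2": 10, "TS3": 10,
    "TO1": 20, "TO2": 30, "TO3": 17, "TO4": 10, "TO5": 5,
}

# Overlapping (S7) replaces the classify and register tasks by
# variants that cost 20 each (Table 4).
S7_OVERRIDES = {"TB2": 20, "TB3": 20, "TS2": 20, "TS3": 20}


def case_cost(path, overrides=None):
    """Execution cost of one case: sum of the costs of the tasks on its path."""
    costs = {**EXECUTION_COST, **(overrides or {})}
    return sum(costs[task] for task in path)


def workflow_cost(paths, overrides=None):
    """Workflow cost: total execution cost of all cases in the period."""
    return sum(case_cost(path, overrides) for path in paths)


# Hypothetical paths: a claim handled end to end via Brisbane, and a
# claim lodged via Sydney that ends after the likelihood check.
full_path = ["TB1", "TB2", "TB3", "TO1", "TO2", "TO3", "TO4", "TO5"]
short_path = ["TS1", "TS2", "TS3", "TO1"]
week = [full_path, short_path]

print("full path, no escalation:", case_cost(full_path))
print("full path, overlapping (S7):", case_cost(full_path, S7_OVERRIDES))
print("two-case workflow cost:", workflow_cost(week))
```

For the end-to-end path the cost rises from 112 to 132 when the S7 task variants are substituted, which is the kind of increase in workflow cost that the results attribute to overlapping.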

Resource cost

All escalation strategies have the same resource cost since these strategies do not bring in additional resources nor do they remove resources. The strategies affect the utilization of resources, but do not alter the set of resources working on the workflow.

Compensation cost

Compensation cost is incurred when workflow cases miss their deadlines. We assume that compensation cost is normally determined when the case starts, but can be renegotiated (cf. Early Escalation strategy). Figure 6 shows the average compensation cost (per week) of the experimental model in various conditions. A comparison of Figures 4 and 6 reveals that the escalation strategies with shorter average workflow time also have lower compensation cost, except in the case of Prioritization (S8). Prioritization (S8) does not shorten workflow time overall. It shortens the workflow time of the cases that are missing their deadline, but in doing so, it takes resources away from other cases, thus pushing up the average execution time of these other cases. As a result, Prioritization (S8) reduces the compensation cost since fewer cases miss their deadline. In other words, prioritization tends to push cases to the level where they use all the time they are allowed to use, but without missing their deadline. Of course, in an environment of high resource utilization, some cases still miss their deadline despite prioritization, and thus some compensation cost is still present.
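The dispatching behaviour behind Prioritization (S8) can be sketched with a priority queue. This is a minimal illustration, not the simulation model used in the article: the class name, the rule that a higher compensation cost is served first, and the rule that on-time cases keep arrival order behind all late cases are assumptions.

```python
import heapq
from itertools import count


class BackOfficeQueue:
    """Hypothetical back-office queue that favours late, costly cases."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker: earlier arrivals first

    def add(self, case_id, running_late, compensation_cost):
        # Only cases that are running late receive a priority;
        # it is taken from their compensation cost.
        priority = compensation_cost if running_late else 0
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._arrival), case_id))

    def next_case(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = BackOfficeQueue()
queue.add("A", running_late=False, compensation_cost=140)
queue.add("B", running_late=True, compensation_cost=100)
queue.add("C", running_late=True, compensation_cost=300)
queue.add("D", running_late=False, compensation_cost=50)

order = [queue.next_case() for _ in range(len(queue))]
print(order)  # ['C', 'B', 'A', 'D']
```

The late cases C and B overtake A and D, which arrived earlier or alongside them. This is the mechanism by which prioritization lowers compensation cost while lengthening the execution time of the cases that are overtaken.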

Total cost

The total cost is the sum of workflow cost, resource cost, and compensation cost. Total costs for the different scenarios are shown in Figure 7. This figure shows that the total cost of applying overlapping (S7) and splitting (S10) exceeds the total cost of the experimental model in storm condition (with no escalation). The reason is that both strategies are unable to decrease the workflow time. Specifically, they are less effective in reducing the number of cases missing their deadlines, and as a result, a large number of clients have to be compensated. Prioritization (S8) has the lowest total cost among all strategies since it has the lowest compensation cost.
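The cost components can be combined as in the sketch below, which uses the baseline figures of Table 3. The function and dictionary names are illustrative. One observation from the arithmetic: the totals listed in Table 3 equal workflow cost plus compensation cost, so the sketch reports both that sum and the sum including resource cost. Because resource cost is the same in every scenario, including it shifts all totals by a constant and does not change how the scenarios compare.

```python
# Weekly cost components from Table 3.
TABLE_3 = {
    "normal": {"workflow": 792_000, "resource": 2_520_000,
               "compensation": 360_000, "listed_total": 1_152_000},
    "storm": {"workflow": 1_740_000, "resource": 2_520_000,
              "compensation": 2_280_000, "listed_total": 4_020_000},
}


def variable_cost(row):
    """Workflow cost plus compensation cost (the part that strategies change)."""
    return row["workflow"] + row["compensation"]


def total_cost(row):
    """Workflow cost plus resource cost plus compensation cost."""
    return row["workflow"] + row["resource"] + row["compensation"]


for condition, row in TABLE_3.items():
    print(f"{condition}: workflow + compensation = {variable_cost(row):,}, "
          f"including resource cost = {total_cost(row):,}, "
          f"listed in Table 3 = {row['listed_total']:,}")
```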

Conclusion

We presented a simulation method to evaluate escalation strategies, and we applied it to a case study. The case study allowed us to assess the relative impact of the escalation strategies on two KPIs: average workflow time and average cost. The relative performance of the escalation strategies on the case study is outlined in Table 5.

The results highlight the following trade-offs:

1. If the level of resource utilization is high for sustained periods of time, escalation strategies that

Figure 5. Comparison on average workflow cost. [Chart not reproduced; categories S5 to S11, series: Escalation, Normal, Storm.]

Figure 6. Compensation cost of workflow process in various conditions. [Chart not reproduced; categories S5 to S11, series: Escalation, Normal, Storm.]

Figure 7. Total cost of workflow process in various conditions. [Chart not reproduced; categories S5 to S11, series: Escalation, Normal, Storm.]


introduce additional resource demand for the purpose of speeding up late cases end up having an adverse effect on average workflow time overall. This is the case for task pre-dispatching (S6), overlapping (S7), and splitting (S10). We can thus hypothesize that these strategies should be deployed only when there is slack in resource utilization; however, an additional study with different levels of resource utilization is needed to validate this hypothesis. Also, we observe that deferred data gathering (S11) reduces the workflow time in the front-end, but adds workload to the back-end, and thus increases waiting time in the back-end. Thus, we hypothesize that deferred data gathering (S11) is more effective when used in conjunction with a Resource Redeployment strategy (i.e. increasing the number of resources at the back-end).

2. Prioritization (S8) does not reduce average workflow time when applied in isolation since this strategy speeds up cases that are running late at the expense of other cases. The strategy aims at decreasing compensation cost, not average workflow time. Accordingly, we hypothesize that this strategy is more effective when combined with a strategy that decreases average workflow time.

3. The choice of locations where escalation strategies are applied is an important factor. The analysis confirms that when escalation strategies are applied to tasks that are not in a bottleneck, they are less effective in terms of workflow time. For instance, task pre-dispatching (S6), which is applied at the call centers, is less effective in decreasing workflow time.

4. Prioritization (S8) has the most visible effects on cost, mainly because of its positive impact on compensation cost. Escalation subprocess (S5) also has a positive effect on compensation cost.

The above observations are to some extent specific to the insurance case study presented in this article and the simulation parameters chosen for the tasks inserted when applying each escalation strategy. While these observations do not necessarily generalize to other case studies or when using different parameters, they highlight trade-offs when selecting and applying escalation strategies. In addition, the evaluation results suggest that applying certain strategies in isolation may not address the escalation problems associated with a given workflow. Instead, analysts should consider combining escalation strategies, taking into account their budget, execution time constraints, and desired service quality standard.

In summary, this article described a method for designing simulation experiments in order to evaluate the impact of various escalation strategies on a given workflow when the workflow is subjected to demand spikes (i.e. a higher-than-usual inter-arrival rate). Through a case study, we illustrated trade-offs involved when selecting escalation strategies, and we provided guidelines to deal with these trade-offs. The method can be applied to evaluate the performance of the presented escalation strategies on a given workflow subjected to different scenarios such as different arrival rates, different numbers of available resources, or different effects of applying a given escalation strategy on execution times and other KPIs. We foresee that the method can be extended to handle other escalation strategies besides those considered in this article, although a validation of this claim is beyond the scope of this article.

Declaration of conflicting interests

The authors declare that there is no conflict of interest.

Funding

This research is funded by the University of Macau under grant RG056/06-07S/SYW/FST, MYRG041(Y1-L1)-FST13-SYW, and by ERDF via the Estonian Centre of Excellence in Computer Science.

Note

1. The choice of EPC is driven by the fact that the case study herein presented is originally captured in this notation. However, the proposed method and findings can be transposed to other modeling notations that share common constructs with EPC such as BPMN and UML Activity Diagrams.

Table 5. Summarized analysis result.

Strategy | Rank based on average flow time | Rank based on total cost
Escalation subprocess (S5) | 1 | 2
Task pre-dispatching (S6) | 4 | 4
Overlapping (S7) | 6 | 7
Prioritization (S8) | 3 | 1
Batching (S9) | 2 | 3
Splitting (S10) | 7 | 6
Deferred data gathering (S11) | 5 | 5

References

Bettini C, Wang XS and Jajodia S (2002) Temporal reasoning in workflow systems. Distributed and Parallel Databases 11: 269–306.

Chan K-L, Si Y-W and Dumas M (2009) Simulation-based evaluation of workflow escalation strategies. In: Proceedings of 2009 IEEE international conference on e-business engineering (ICEBE 2009), Macau, China, 21–23 October, pp. 75–82. New York: IEEE Press.

Chen J and Yang Y (2008) Temporal dependency based checkpoint selection for dynamic verification of fixed-time constraints in grid workflow systems. In: Proceedings of the ACM/IEEE 30th international conference on software engineering (ICSE), Leipzig, 10–18 May, pp. 141–150. Washington, DC: IEEE Computer Society.

CPN Group (2013) CPN Tools Home Page. Aarhus: CPN Group, University of Aarhus.

Eder J, Panagos E and Rabinovich M (1999) Time constraints in workflow systems. In: Proceedings of the 11th international conference on advanced information systems engineering (CAiSE), Heidelberg, 14–18 June, pp. 286–300. Berlin/Heidelberg: Springer.

Georgakopoulos D, Schuster H, Baker D, et al. (2000) Managing escalation of collaboration processes in crisis mitigation situations. In: Proceedings of the 16th international conference on data engineering (ICDE), San Diego, CA, 28 February–3 March, pp. 45–56. Washington, DC: IEEE Computer Society.

Greasley A (2004) A redesign of a road traffic accident reporting system using business process simulation. Business Process Management Journal 10: 635–644.

Li J, Fan Y and Zhou M (2004) Performance modeling and analysis of workflow. IEEE Transactions on Systems Man and Cybernetics, Part A: Systems and Humans 34: 229–242.

Panagos E and Rabinovich M (1996) Escalations in workflow management systems. In: Proceedings of the workshop on databases: active and real-time (DART), Rockville, MD, 12–16 November, pp. 25–29. New York: ACM.

Panagos E and Rabinovich M (1998) Reducing escalation-related costs in WFMS. In: Proceedings of NATO advanced study institute on workflow management systems and interoperability, Istanbul, Turkey, 12–21 August 1997, pp. 107–127. Berlin/Heidelberg: Springer.

Rhee S-H, Bae H and Kim Y (2004) A dispatching rule for efficient workflow. Concurrent Engineering: Research and Applications 12: 305–318.

Son JH and Kim MH (2001) Improving the performance of time-constrained workflow processing. Journal of Systems and Software 58: 211–219.

Tan Y and Takakuwa S (2007) Predicting the impact on business performance of enhanced information system using business process simulation. In: Proceedings of the 39th conference on winter simulation, Washington, DC, 9–12 December, pp. 2203–2211. Piscataway, NJ: IEEE Press.

Van der Aalst WMP, Rosemann M and Dumas M (2007) Deadline-based escalation in process-aware information systems. Decision Support Systems 43: 492–511.

Author biographies

Yain-Whar Si is an Assistant Professor at the University of Macau. His research interests are in the areas of business process management and decision support systems. He is one of the founders of the Data Analytics and Collaborative Computing Group at the University of Macau.

Marlon Dumas is Professor of Software Engineering at the University of Tartu, Estonia, and Research Director at the Software Technology and Applications Competence Centre (STACC). His research interests are in the fields of business process management and service-oriented computing. He has been co-recipient of best paper awards at the ETAPS 2006, BPM 2010 and BPM 2013 conferences.

Ka-Leong Chan obtained his undergraduate degree in Electronic & Information Technology from South China University of Technology in 2002. In 2009, he was awarded his M.Sc. degree in Software Engineering from the University of Macau. His research interests are in the areas of workflow management and electronic commerce systems.
