Graph-Based Methods for the Representation and Analysis of Business Workflows
Amitava Bagchi
Indian Institute of Management Calcutta
ISI_Dec_05 2
References
Mukherjee, Arindam, Sen, Anup K. and Bagchi, Amitava (2004), Information analysis in workflows represented as task-precedence metagraphs, Proc. WITS-2004, Workshop on Information Technology and Systems, Seattle, WA, USA, pp. 32-37.
Mukherjee, Arindam, Sen, Anup K. and Bagchi, Amitava (2005), Representation, Analysis and Verification of Business Processes: A Metagraph-Based Approach, Working Paper WPS-552, Indian Institute of Management Calcutta (http://www.iimcal.ac.in).

Outline
    Business Process & Workflow
    Metagraphs & Information Elements
    Task Precedence Metagraphs (TPMGs)
    Information Analysis: Graphical Algorithm
    Functional & Organizational Perspectives
    Workflow Verification

Objectives
    To describe an AND/OR graph representation scheme for business workflows
    To present a graph traversal algorithm for the analysis of information flow in such workflows
    To extend the above method to task and resource analyses
    To outline how the structural correctness of workflows can be verified

Business Process & Workflow
A business process consists of a set of related tasks in one or more functional areas (such as finance or marketing), which, when performed in any one of several permissible orders, enables an organization to achieve a business goal.
Example: A loan appraisal system used in a bank

Loan Appraisal: Business Process Example

Legend (see fig., p. 6)
    PD: applicant’s property data
    CD: data on comparable properties
    AC: applicant’s account data
    APD: loan application data
    AV: appraised value of property
    CR: applicant’s credit rating
    LA: loan amount
    RLA: revised loan amount
    LR: risk level of loan
    AR, MR, BR: the loan risk level is acceptable, marginally bad, or bad, respectively
    BP: current portfolio of bank’s loans
    RE: bank’s current loan exposure
    YES: the application is approved
    NO: the application is rejected

Business Process & Workflow
Workflow (or Workflow Instance): A specific instance of flow of control in a business process; it is a sub-graph of the given process graph.
In practice, the terms business process and workflow are often used interchangeably.

Workflow Instance 1

Workflow Instance 2

Workflow Instance 3

Business Process Modeling: Existing Approaches
Petri Nets & Related Formalisms
    Petri Nets (van der Aalst & van Hee 2002)
    Workflow Management Coalition (WfMC) Guidelines (http://www.wfmc.org)
Metagraph-Based Formalisms
    Metagraphs (Basu & Blanning 2000, 1999, 1994)

Petri Nets & Related Formalisms
    Main focus is on the precedence relationships between tasks
    Flow of information plays a subsidiary role
    Commercial products such as IBM’s MQSeries have adopted this convention
    Widely used, and quite suitable for engineering applications

Metagraph-Based Formalisms
A metagraph is a directed (hyper-)graph. It can be viewed as a special type of AND/OR graph.
A metagraph is typically small in size (at most a few hundred nodes) and is explicitly available, i.e., the entire graph is supplied as input to a search algorithm. So the expansion of a node just means moving to its immediate successors.
In (implicit) game trees, new nodes actually get added to the graph as they get created.

Metagraph-Based Formalisms
Each node in a metagraph contains one or more information elements (items).
In Information Analysis, an input set of items is supplied at the start, specifying the business information initially available.
Another set of items, called the output set, contains the items that are desired as output.

Metagraph-Based Formalisms
Each arc represents a task that converts one set of items to another set of items.
The objective is to start from the input set of items, perform the tasks in the given order of precedence and derive all the items in the output set.
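
The item-and-task view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the metagraph formalism itself: the names `Task` and `derivable_items` are hypothetical, OR choices are ignored (every task whose inputs are available is allowed to fire), and the input sets given to the two loan tasks are assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A metagraph arc: converts one set of items into another."""
    name: str
    inputs: frozenset
    outputs: frozenset


def derivable_items(tasks, input_set):
    """Return every item reachable from input_set by repeatedly firing tasks."""
    items = set(input_set)
    changed = True
    while changed:
        changed = False
        for task in tasks:
            # A task can fire once all of its input items are available.
            if task.inputs <= items and not (task.outputs <= items):
                items |= task.outputs
                changed = True
    return items


# Assumed fragment of the loan example (the input sets are illustrative).
tasks = [
    Task("calculate credit rating", frozenset({"AC", "APD"}), frozenset({"CR"})),
    Task("calculate appraised value", frozenset({"PD", "CD"}), frozenset({"AV"})),
]
available = derivable_items(tasks, {"AC", "APD", "PD"})
print("CR" in available, "AV" in available)  # AV needs CD, which was not supplied
```

The output set is derivable exactly when it is a subset of the returned item set.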

Metagraph-Based Formalisms
The metagraph convention puts more emphasis on the flow of information, so it has an advantage over Petri Nets for business applications.
However, it is not widely used in practice because it suffers from certain shortcomings.

Metagraph for Loan Evaluation Process
[Figure: metagraph of the loan evaluation process.
Nodes: Account Data (AC), Applicant Data (APD), Credit Rating (CR), Property Data (PD), Comparables Data (CD), Appraised Value (AV), Loan Amount (LA), Bank’s Portfolio (BP), Loan Risk (LR), Risk Exposure (RE), Acceptable Risk (AR), Marginally Bad Risk (MR), Bad Risk (BR), Loan Approved (YES), Loan Rejection (NO).
Tasks: e1 Calculate Credit Rating; e2 Calculate Appraised Value of Property; e4 Calculate Loan Risk; e5 Calculate a New Loan Amount; e6 Calculate Bank’s Risk Exposure; e7 Acceptable Risk Assessment; e8 Marginal Risk Assessment; e9 Approve the Loan; e10 Bad Risk Assessment; e11 Reject the Loan Application.]

Metagraphs
The existing metagraph model for workflows has three main shortcomings:
    Flow of control is not displayed with clarity and the diagram appears cluttered
    The analysis makes use of symbolic matrices, which are not easy to manipulate
    A clear distinction is not always drawn between OR joins & AND joins (or even between OR splits & AND splits)

Task Precedence Metagraphs (TPMGs)
A TPMG is a modified form of metagraph.
It is visually more appealing and is more like an AND/OR graph in appearance.
It is less cluttered, so the flow of control is discerned more easily.
A TPMG is more general than a WfMC graph in that AND & OR splits and joins are not always required to be matched in pairs.

Terminology
    Tasks & Propagation Edges
    Init Nodes & Prop Nodes
    OR Nodes & AND Nodes
    Split Nodes & Join Nodes

Task Precedence Metagraphs (TPMGs)
Edges are of two types:
    Tasks: shown as bold arrows; a task converts the set of items at its start to another set of items, which cannot be obtained from any other task.
    Propagation Edges: shown as lightly drawn arrows; a propagation edge conveys an item from the outgoing end of a task to the incoming end of another task.

Task Precedence Metagraphs (TPMGs)
Nodes are also of two types:
    Init Nodes: an init node has a single outgoing edge, corresponding to a task; it is shown as a bold oval.
    Prop Nodes: a prop node can have multiple outgoing edges, all of which are propagation edges; it is shown as a lightly drawn oval.

Task Precedence Metagraphs (TPMGs)
Init and Prop Nodes
On every directed path, init nodes alternate with prop nodes, i.e., a TPMG is a directed bipartite graph, just like a Petri net.

Task Precedence Metagraphs (TPMGs)
Independently of the init/prop distinction, every node is also either an OR node or an AND node.
An OR node (identified with a + sign) shows alternate paths for flow of control.
An AND node (identified with a • sign) indicates that flow of control takes place along all the edges at the same time.
An OR (or AND) node is either a split node or a join node.

Task Precedence Metagraphs (TPMGs)
Split & Join Nodes
A split node is a node at which multiple paths begin. It is always a prop node.
A join node is a node at which multiple paths end. It is always an init node.

Task Precedence Metagraphs (TPMGs)
However, a TPMG differs from a Petri Net in that every node has an associated subset of labeled items. This underscores the role of business information in a business process.

Information Analysis
Given a workflow, we seek answers to questions of the following type:
    Suppose a set A of items is supplied. Starting from A, can we produce all the items in another given set B?
    Is item ‘a’ essential for producing item ‘b’?
These can be formulated as graph search problems.
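
Both questions can be illustrated with a small forward-chaining sketch. It is not the graph search algorithm of the talk: OR choices are ignored, and "essential" is read here, as an assumption, to mean that item b can no longer be produced once item a is forbidden. The function names and the three tasks (with their input sets) are hypothetical.

```python
def closure(tasks, inputs, forbidden=frozenset()):
    """Items obtainable from inputs when forbidden items may never appear."""
    blocked = set(forbidden)
    items = set(inputs) - blocked
    changed = True
    while changed:
        changed = False
        for needs, gives in tasks:
            new = set(gives) - items - blocked
            if set(needs) <= items and new:
                items |= new
                changed = True
    return items


def can_produce(tasks, a_set, b_set):
    """Question 1: starting from A, can all items in B be produced?"""
    return set(b_set) <= closure(tasks, a_set)


def is_essential(tasks, a_set, item, target):
    """Question 2 (assumed reading): target is producible, but not without item."""
    if target not in closure(tasks, a_set):
        return False
    return target not in closure(tasks, a_set, forbidden={item})


# Illustrative tasks as (needed items, produced items); input sets are assumed.
tasks = [
    ({"AC", "APD"}, {"CR"}),
    ({"PD", "CD"}, {"AV"}),
    ({"CR", "AV", "LA"}, {"LR"}),
]
supplied = {"AC", "APD", "PD", "CD", "LA", "BP"}
print(can_produce(tasks, supplied, {"LR"}))
print(is_essential(tasks, supplied, "CR", "LR"))
print(is_essential(tasks, supplied, "BP", "LR"))
```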

Information Analysis
But a standard AND/OR graph search algorithm such as AO* (Nilsson 1980) is not appropriate for our purpose because a TPMG differs from an AND/OR graph in some ways:
    TPMG: Multiple start nodes
    AND/OR Graph: One start node

Information Analysis
    TPMG: Both AND joins and OR joins
    AND/OR Graph: Only OR joins
    TPMG: Can have directed cycles
    AND/OR Graph: AO* assumes it is cycle-free
Note that a project scheduling network has only AND splits/joins and no OR splits/joins.

Algorithm InfAnalysis
Algorithm InfAnalysis is an iterative graph search algorithm.
Given:
    An explicit TPMG
    An input set of items
    An output (target) set of items
it determines whether all the items in the output set can be derived.

Algorithm InfAnalysis
Algorithm InfAnalysis has some similarities with A* and AO* and makes use of an edge-marking method.
We think of the given TPMG as representing a business process, and the marked solution sub-graph produced by InfAnalysis as a workflow instance.

Algorithm InfAnalysis
Makes use of four lists:
    ITEMSET: initially contains the input set of items; new items get added as nodes get expanded
    TARGET: contains the items desired as output
    FRONTIER: only holds init nodes; initially holds those that have all their items in ITEMSET
    STACK: needed for processing OR nodes; remembers which OR alternative should be processed next

Algorithm InfAnalysis
An active node is an init node in FRONTIER with all its items in ITEMSET.
At each iteration, InfAnalysis looks for an active node in FRONTIER, processes the corresponding task, and updates ITEMSET & FRONTIER.

Algorithm InfAnalysis
If all items in TARGET belong to ITEMSET then a solution has been found (success).
If there is no active node in FRONTIER then the next OR alternative in STACK must be pursued.
If STACK is also empty then failure.

Algorithm InfAnalysis
Thus the algorithm traverses the given TPMG exhaustively, looking for a workflow instance that generates, for the given input set, a set of items that contains the given output set.
When traversing a workflow instance, the edges in the instance get marked (say, by colouring them red).

Algorithm InfAnalysis
When the next workflow instance is examined, the marking at the corresponding OR split node is changed.
The advantage of marking is that each instance need not be traversed from scratch; the work done earlier can be remembered and partly reused.
The algorithm assumes that the TPMG is structurally valid.

Algorithm InfAnalysis
Example: For the loan appraisal process, we want to know whether, given the set of items S = { LA, PD, CD, AC, APD, BP } as input, we can produce the item YES as output.
A graph search algorithm is appropriate for such problems. To keep the algorithm simple, we do not indicate the edge markings.

Algorithm InfAnalysis
initialize ITEMSET, FRONTIER, STACK;
do while (TARGET is not a subset of ITEMSET) {
    if (there is an active node n in FRONTIER) then {
        remove n from FRONTIER;
        expand n, entering its init successors in FRONTIER,
            OR split nodes in STACK, and new items in ITEMSET;
    }
    // else examine next workflow instance
    else if (STACK is not empty) then {
        take next init successor p of OR node m on top of STACK;
        enter p in FRONTIER adding its items to ITEMSET;
        if (m has no other successors) then pop m;
    }
    // no remaining workflow instances
    else { announce “failure”; exit; }
}
announce “success”; exit;
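
The search over workflow instances can be imitated with a brute-force Python sketch. It is a deliberately simplified stand-in, not Algorithm InfAnalysis: there is no edge marking and no STACK, each instance is rebuilt from scratch (the very cost that marking avoids), and an OR split is modelled, as an assumption, as a group of mutually exclusive tasks. The edge names follow the metagraph figure, but their input sets and the function names are hypothetical.

```python
from itertools import product


def forward_chain(tasks, inputs):
    """Fire tasks until no new task can fire; return the items and fired tasks."""
    items, fired = set(inputs), []
    changed = True
    while changed:
        changed = False
        for name, needs, gives in tasks:
            if name not in fired and needs <= items:
                items |= gives
                fired.append(name)
                changed = True
    return items, fired


def inf_analysis(tasks, or_splits, inputs, target):
    """Try each workflow instance (one branch per OR split) in turn.

    Returns the fired task names of the first instance whose items cover
    the target, or None if every instance fails.
    """
    grouped = {name for group in or_splits for name in group}
    options = [sorted(group) for group in or_splits]
    for choice in product(*options):
        # Keep unconditional tasks plus the branch chosen at each OR split.
        allowed = [t for t in tasks if t[0] not in grouped or t[0] in choice]
        items, fired = forward_chain(allowed, inputs)
        if target <= items:
            return fired
    return None


# Simplified loan appraisal process; the input sets of the tasks are assumed.
tasks = [
    ("e1", {"AC", "APD"}, {"CR"}),
    ("e2", {"PD", "CD"}, {"AV"}),
    ("e4", {"CR", "AV", "LA"}, {"LR"}),
    ("e7", {"LR"}, {"AR"}),
    ("e10", {"LR"}, {"BR"}),
    ("e9", {"AR"}, {"YES"}),
    ("e11", {"BR"}, {"NO"}),
]
or_splits = [{"e7", "e10"}]  # the risk is judged acceptable or bad, never both
inputs = {"LA", "PD", "CD", "AC", "APD", "BP"}
print(inf_analysis(tasks, or_splits, inputs, {"YES"}))
print(inf_analysis(tasks, or_splits, inputs, {"YES", "NO"}))
```

The second query fails because no single workflow instance both approves and rejects the loan, which is the behaviour an OR split is meant to capture.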

Algorithm InfAnalysis: Observations
Works correctly on the example shown earlier (TPMG for loan appraisal)
But for more complex TPMGs containing OR split nodes that are not descendants of each other, the STACK must be replaced by a more flexible data structure

Functional Perspective
Queries that relate to the execution of tasks rather than to the flow of information:
    Which other tasks must be completed before a given task t can start?
    If a task t cannot be executed, which other tasks become inoperable?

Functional Perspective
Algorithm InfAnalysis can be modified in a simple way to answer such queries.
For example, to find the tasks that must be completed before task t can start, consider the set S of items contained in the init node that immediately precedes t. Run InfAnalysis with the given inputs and with S as the target set; the required tasks are those in the marked sub-graph.
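
The first query can be sketched as follows. Note the swapped-in technique: instead of running InfAnalysis and reading off the marked sub-graph, this sketch traces backwards from the items that task t needs. It relies on the TPMG property that an item produced by a task cannot be obtained from any other task, ignores OR alternatives, and uses hypothetical names and assumed input sets.

```python
def tasks_before(tasks, inputs, task_name):
    """Names of the tasks that must complete before task_name can start."""
    producer = {item: name for name, _, gives in tasks for item in gives}
    needs_of = {name: needs for name, needs, _ in tasks}
    required, seen = set(), set()
    pending = list(needs_of[task_name])
    while pending:
        item = pending.pop()
        if item in inputs or item in seen:
            continue  # supplied at the start, or already traced
        seen.add(item)
        maker = producer.get(item)
        if maker is None:
            raise ValueError(f"item {item!r} can never be obtained")
        if maker not in required:
            required.add(maker)
            pending.extend(needs_of[maker])
    return required


# One workflow instance of the loan example; the input sets are assumed.
tasks = [
    ("e1", {"AC", "APD"}, {"CR"}),
    ("e2", {"PD", "CD"}, {"AV"}),
    ("e4", {"CR", "AV", "LA"}, {"LR"}),
    ("e7", {"LR"}, {"AR"}),
    ("e9", {"AR"}, {"YES"}),
]
inputs = {"LA", "PD", "CD", "AC", "APD", "BP"}
print(sorted(tasks_before(tasks, inputs, "e9")))
```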

Organizational Perspective
Queries that relate to resources (i.e., the executors of tasks, whether human agents or machines):
    If a resource r is unavailable, some tasks will not get performed. As a result, some other resources might become idle. Which are the resources that will become idle?

Organizational Perspective
Again, Algorithm InfAnalysis can be modified in a simple way to answer such queries.
For example, to determine the other resources that become idle when resource r is unavailable, first find the set T of tasks that r executes. We can determine which other tasks get held up because the tasks in T cannot be executed. This will tell us whether any resources have become completely idle.
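
A minimal sketch of this resource query, assuming plain forward chaining without OR alternatives. The task-to-resource mapping `executor`, the resource names and the input sets are all invented for illustration; the slides do not specify them.

```python
def idle_resources(tasks, executor, inputs, unavailable):
    """Resources left with no executable task once `unavailable` stops working."""

    def runnable(active):
        # Forward chaining: a task runs once all the items it needs exist.
        items, done = set(inputs), set()
        changed = True
        while changed:
            changed = False
            for name, needs, gives in active:
                if name not in done and needs <= items:
                    items |= gives
                    done.add(name)
                    changed = True
        return done

    baseline = runnable(tasks)
    # Drop the set T of tasks executed by the unavailable resource.
    remaining = [t for t in tasks if executor[t[0]] != unavailable]
    still_done = runnable(remaining)
    busy_before = {executor[name] for name in baseline}
    busy_after = {executor[name] for name in still_done}
    return busy_before - busy_after - {unavailable}


tasks = [
    ("e1", {"AC", "APD"}, {"CR"}),
    ("e2", {"PD", "CD"}, {"AV"}),
    ("e4", {"CR", "AV", "LA"}, {"LR"}),
    ("e7", {"LR"}, {"AR"}),
    ("e9", {"AR"}, {"YES"}),
]
executor = {  # hypothetical assignment of tasks to resources
    "e1": "credit officer",
    "e2": "appraiser",
    "e4": "risk analyst",
    "e7": "risk analyst",
    "e9": "manager",
}
inputs = {"LA", "PD", "CD", "AC", "APD", "BP"}
print(sorted(idle_resources(tasks, executor, inputs, "appraiser")))
```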

Temporal Constraints
The control structure of a workflow imposes temporal constraints on tasks. If a task precedes another task, it must be performed earlier.
If temporal information, such as the duration of tasks, is supplied, then issues arise similar to those in project scheduling.
However, the presence of directed cycles in workflows causes additional complications.

Structural Verification
A valid workflow always serves a business goal.
Given a business process W supplied in the form of a TPMG, how do we tell whether W is valid?
To ensure the validity of W, some structural (i.e., syntactic) constraints must be imposed on W.
We now give examples of such structural constraints.

Structural Problem: Deadlock
Deadlock: Caused when an OR split node is nested with an AND join node.
In the figure, only one of the two outgoing edges at the OR split node 2 can be marked at any time. So execution cannot proceed beyond the AND join node 7.
A valid workflow must not have any deadlocks.

Structural Problem: Lack of Synchronization
Lack of Synchronization: Caused when an AND split node is nested with an OR join node.
In the figure, since both the outgoing edges at the AND split node 2 will get marked, the task (7,8) will be executed twice.
A valid workflow must not suffer from lack of synchronization.

Structural Problem: Non-Terminating Cycle
Non-Terminating Cycle: Caused when control cannot exit from a directed cycle.
This problem can be avoided when every directed cycle is well-formed, i.e., it has an OR join node lying on it through which control can enter, and an OR split node lying on it through which control can exit (see loan appraisal example).
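
The well-formedness condition can be checked on a small explicit graph as sketched below. This is not part of TPMG_SYN: it enumerates simple cycles by brute force (reasonable only for graphs of the modest size discussed here), flattens the init/prop distinction into a single `kind` label per node, and reads "control can enter/exit" as "has a predecessor/successor off the cycle". The names and the four-edge example graph are assumptions.

```python
def simple_cycles(succ):
    """Enumerate each simple directed cycle once, starting at its smallest node."""
    cycles = []

    def walk(start, node, path):
        for nxt in succ.get(node, ()):
            if nxt == start:
                cycles.append(list(path))
            elif nxt not in path and nxt > start:
                walk(start, nxt, path + [nxt])

    for start in sorted(succ):
        walk(start, start, [start])
    return cycles


def ill_formed_cycles(succ, kind):
    """Cycles lacking an OR join entry point or an OR split exit point."""
    pred = {}
    for u, vs in succ.items():
        for v in vs:
            pred.setdefault(v, set()).add(u)
    bad = []
    for cycle in simple_cycles(succ):
        on = set(cycle)
        # Entry: an OR join on the cycle with a predecessor outside it.
        entry = any(kind.get(n) == "OR-join" and pred.get(n, set()) - on
                    for n in cycle)
        # Exit: an OR split on the cycle with a successor outside it.
        leave = any(kind.get(n) == "OR-split" and set(succ.get(n, ())) - on
                    for n in cycle)
        if not (entry and leave):
            bad.append(cycle)
    return bad


# Node 2 is entered from 1 and from the loop; node 4 either loops back or exits.
succ = {1: [2], 2: [3], 3: [4], 4: [2, 5]}
kind = {2: "OR-join", 4: "OR-split"}
print(ill_formed_cycles(succ, kind))
print(ill_formed_cycles(succ, {2: "OR-join", 4: "AND-split"}))
```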

Other Structural Errors
Examples of other structural errors that must be eliminated:
    Dangling Nodes: It should be ensured that if a node in a TPMG contains items that are not target items, then the node has a successor task.
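
The dangling-node rule lends itself to a direct check. In this sketch a node counts as having a successor task if it has any outgoing edge, which is a simplifying assumption (in a TPMG a prop node reaches its successor tasks through propagation edges). The node names, their items and the function name are hypothetical.

```python
def dangling_nodes(node_items, edges, target):
    """Nodes that hold a non-target item but have no outgoing edge."""
    has_successor = {src for src, _ in edges}
    return sorted(
        node
        for node, items in node_items.items()
        if (set(items) - set(target)) and node not in has_successor
    )


node_items = {
    "n1": {"LR"},
    "n2": {"AR"},
    "n3": {"YES"},  # holds only a target item, so it may be a sink
    "n4": {"RE"},   # holds a non-target item and leads nowhere
}
edges = [("n1", "n2"), ("n2", "n3")]
print(dangling_nodes(node_items, edges, {"YES"}))
```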

Structural Verification
The structural verification algorithm TPMG_SYN traverses the workflow instances in the given TPMG one by one, looking for structural problems.
As soon as it locates a problem, it terminates with an appropriate error message.
If TPMG_SYN does not find a problem, the given TPMG models a valid business process.

Structural Verification
TPMG_SYN has many similarities with Algorithm InfAnalysis.
TPMG_SYN assumes for convenience that there is one start node and one goal node. If this does not hold for the given TPMG, an AND split node can be added at the top and an OR join node at the bottom.

Algorithm TPMG_SYN
 1  initialize ITEMSET, FRONTIER, STACK; finish = false;
 2  do while (finish == false) {
 3      if (there is a non-goal active node n in FRONTIER) then {
 4          remove n from FRONTIER;
 5          expand n, entering its init successors in FRONTIER,
 6              OR split nodes in STACK, and new items in ITEMSET;
 7      }
 8      else if (there is a goal node in FRONTIER) then
 9          announce “one workflow instance scanned”;
10      else if (STACK is not empty) then {
11          take next init successor p of OR node m on top of STACK;
12          enter p in FRONTIER adding its items to ITEMSET;
13          if (m has no other successors) then pop m;
14      }
15      else finish = true;
16  }
17  announce “the given TPMG is valid”; exit;

TPMG_SYN: Detection of Errors
Illegal cycles and lack of synchronization can both be detected at line 5 when node n is expanded.
We can just check whether, as a result of the expansion, an OR join node has two marked incoming edges. This would be illegal in general, but in some situations it can indicate the presence of a legal directed cycle.

TPMG_SYN: Detection of Errors
Deadlock can be detected at line 8 when a goal node is not found but inactive nodes are present in FRONTIER.
Dangling nodes can also be detected at line 5 when node n is expanded.

Workflow Verification
Note that some semantic constraints are imposed by the meanings of the items contained in the TPMG nodes.
TPMG_SYN, when appropriately modified, can perform certain types of semantic verification of TPMGs.

Workflow Verification
A similar verification procedure can be devised for workflows drawn using Petri Nets or any other WfMC convention.

Thank You!