Liquid process model collections
Liquid Process Model Collections: how to get there
Prof. Marcello La Rosa
BPM Discipline, Information Systems School
Queensland University of Technology
Research
• Technology oriented
• Business oriented
Teaching
• BPM specializations
• Master’s of BPM
Service
• Professional training
• Consultancy
BPM Discipline @ QUT
http://bpm-research-group.org
Each process is varied by product & brand…
End to end insurance process
Source: Guidewire reference models
Product Dev Sales Service Claims
Total number of insurance models: 3,000+
30 variations
500 tasks
Home Motor Commercial Liability CTP / WC
A few years back… Suncorp insurance
Managing large process model collections
versions & variants management
merging
refactoring / standardization
clone detection
Process model repository querying
similarity search
80%
R. Dijkman, M. La Rosa, H. Reijers “Managing large collections of business process models – Current techniques and challenges”, COMIND 2012
[Chart: years 2000 to 2012 on the horizontal axis, values from 0 to 40 on the vertical axis]
The Apromore Initiative (apromore.org)
An open-source, highly scalable, SaaS platform to manage process model collections
M. La Rosa, H. Reijers, W. van der Aalst, R. Dijkman, J. Mendling, M. Dumas, L. Garcia-Banuelos “APROMORE: an advanced process model repository”, EXP.SYS.APP. 2011
“Build awareness”: Understand differences and causes for these differences
“Achieve simplification”: Identify and consolidate common business functions
“Achieve centralisation”: Centralise support for non-core processes across LOBs
“Identify opportunities for partnering”: Make better decisions about the processes you can partner/run in-house
Expected benefits (beginning of the project)
“The tool is great but it would be pretty useless because our process models aren’t great and we know they aren’t great” (*)
“If they [the models] would have been updated all along it would have been worthwhile but now they’re out of date, it’s not really worth the effort of bringing them up to date” (**)
(*) Realistic Suncorp employee
(**) Disillusioned Suncorp employee
The reality (end of the project)
Large process model collections, hard to maintain
Large collections of “dead” process models
The other side of the coin
Vision: what if the collection could self-adapt?
Change is endemic to organizations and continuously affects them:
Requirements change
Environments change
Processes change
If an organization’s processes change, this will be recorded in the system logs
Use process mining techniques to “discover” process changes from logs, and apply these changes to process model collections
Release the full potential of management techniques for process model collections
Let’s get more concrete
event log
live event stream
database
process model
patterns
conformance analysis
process performance…
if A then B
extract process knowledge
A process model collection that is
• Aligned with organizational behavior
• Able to self-adapt to evolving organizational behavior
Solution: “Liquid” process model collection
W.M.P. van der Aalst, M. La Rosa, A.H.M. ter Hofstede, M.T. Wynn, Liquid Business Process Model Collections. In Modeling and Simulation-based Systems Engineering Handbook, 2014
1. Discovering a collection in the first place
2. Coping with evolution
3. Aligning logs with an existing process model collection
Approach: 3 interrelated challenges
Useless!
Event log
Challenge 1: Discovering a collection
Process discovery algorithm
Current situation: discovering “all-in-one” models
a b a b c d e c a f g h
a b a b k d e c h f g h
b c p q p r a k q r s
b c p h p r a k q r s
a x p h y z t t u
Trace clustering
a b a b c d e c a f g h
a b a b k d e c h f g h
b c p q p r a k q r s
b c p h p r a k q r s
a x p h y z t t u
Cluster 1
Cluster 2
Noise
Event log
Process variant 1
Process variant 2
Trace clustering
high complexity & redundancy
Our approach: Slice, Mine and Dice (SMD)
L. Garcia-Banuelos, M. Dumas, M. La Rosa, J. De Weerdt, C.C. Ekanayake. Controlled Automated Discovery of Collections of Business Process Models. Information Systems, 2014
slice the log horizontally per variant
dice the discovered models hierarchically
mine
Process discovery
Trace clustering
Event log
Complexity threshold, e.g. Size ≤ 30
Process model
Slice, Mine and Dice (SMD)
1) Slice
2) Mine
3) Dice
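The slice and mine steps can be sketched as a recursive loop: mine a model, check it against the complexity threshold, and split the log further when the model is too large. This is only a toy illustration. The `mine`, `size` and `split` functions below are hypothetical stand-ins for a real discovery algorithm, complexity measure and trace clustering technique, and the dice step (extracting shared fragments) is not covered.

```python
from typing import List, Sequence, Set, Tuple

Trace = str                    # one character per activity, as in the slide examples
Model = Set[Tuple[str, str]]   # toy model: a set of directly-follows edges


def mine(traces: Sequence[Trace]) -> Model:
    """Hypothetical stand-in for process discovery: collect directly-follows edges."""
    return {(a, b) for t in traces for a, b in zip(t, t[1:])}


def size(model: Model) -> int:
    """Hypothetical complexity measure: number of distinct activities in the model."""
    return len({node for edge in model for node in edge})


def split(traces: Sequence[Trace]) -> List[List[Trace]]:
    """Hypothetical stand-in for trace clustering: cut the sorted log in two halves."""
    ordered = sorted(traces)
    mid = len(ordered) // 2
    return [ordered[:mid], ordered[mid:]]


def slice_and_mine(traces: Sequence[Trace], threshold: int) -> List[Model]:
    """Mine one model per slice, slicing further while a model exceeds the threshold."""
    model = mine(traces)
    if size(model) <= threshold or len(traces) < 2:
        return [model]
    models: List[Model] = []
    for part in split(traces):
        models.extend(slice_and_mine(part, threshold))
    return models


log = ["abc", "abd", "xyz", "xyw"]
models = slice_and_mine(log, threshold=4)
```

With this toy log, the all-in-one model has 8 activities, so the log is sliced once and each half yields a model within the threshold.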
Process discovery
Trace clustering
Event log
Complexity threshold, e.g. Size ≤ 30
Slice, Mine and Dice (SMD)
Process models
Process model M3
Process model M2
A closer look…
F10
F11 F13
F14 F12
M2 RPST of M2
F20
F21 F22
F24 F23
F25
RPST of M3
Refined Process Structure Tree (RPST)
J. Vanhatalo, H. Volzer, J. Koehler: The Refined Process Structure Tree. Data Knowl. Eng., 2009
M3
M2 M3
F14 F25
M. Dumas, L. García-Bañuelos, M. La Rosa, and R. Uba. Fast Detection of Exact Clones in Business Process Model Repositories. Information Systems, 2013
RPSDAG
F10
F11 F13
F14 F12
RPST of M2
F20
F21 F22
F24 F23
F25
RPST of M3
F14
M2
M3
Extracting exact clones
S3
Exact clones:
• SESE
• Non-trivial
• Identical
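A rough sketch of the idea behind exact clone detection: give every fragment of every structure tree a canonical code, and report the codes that occur in more than one model. The nested-tuple tree encoding and the function names are assumptions made for this illustration. This is not the RPSDAG implementation, and clones repeated inside a single model are ignored here.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Node = Tuple[str, tuple]   # (label, children); a task leaf has no children


def task(name: str) -> Node:
    """Build a leaf node for a single task."""
    return (name, ())


def index_fragments(model_id: str, node: Node, index: Dict[tuple, Set[str]]) -> tuple:
    """Compute the canonical code of a fragment and record which model contains it."""
    label, children = node
    code = (label, tuple(index_fragments(model_id, c, index) for c in children))
    if children:  # single tasks are trivial fragments, so they are not indexed
        index[code].add(model_id)
    return code


def exact_clones(models: Dict[str, Node]) -> Dict[tuple, Set[str]]:
    """Return the fragments that occur identically in at least two models."""
    index: Dict[tuple, Set[str]] = defaultdict(set)
    for model_id, root in models.items():
        index_fragments(model_id, root, index)
    return {code: ids for code, ids in index.items() if len(ids) > 1}


shared = ("xor", (task("B"), task("C")))
models = {
    "M2": ("seq", (task("A"), shared)),
    "M3": ("seq", (task("X"), shared)),
}
clones = exact_clones(models)
```

Here the only clone is the `xor` fragment over tasks B and C, shared by M2 and M3.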
F12
F22
M3
Extracting approximate clones
M. La Rosa, M. Dumas, C. Ekanayake, L. Garcia-Banuelos, J. Recker, A.H.M. ter Hofstede, Detecting Approximate Clones in Business Process Model Repositories. Information Systems, 2015
Approximate clones:
• SESE
• Non-trivial
• Similar
• Unrelated
M2
Merging algorithm
Fragment F12 of model M2
Fragment F22 of model M3
Configurable gateway
Configurable label
M. La Rosa, M. Dumas, R. Uba, and R. M. Dijkman. Business Process Model Merging: An Approach to Business Process Consolidation. ACM TOSEM, 2013.
Merging approximate clones
S4
Consolidated process model collection
S4 S3
The result…
Trace clustering
• M. Song, C.W. Gunther, and W.M.P. van der Aalst, Improving Process Mining with Trace Clustering, J. Korean Inst. of Industrial Engineers 34(4), 2008
• R.P.J.C. Bose, W.M.P. van der Aalst, Trace Clustering Based on Conserved Patterns: Towards Achieving Better Process Models, BPM 2009 Workshops
• A.K.A. de Medeiros, A. Guzzo, G. Greco, W.M.P. van der Aalst, A.J.M.M. Weijters, B.F. van Dongen, D. Saccà. Process Mining Based on Clustering: A Quest for Precision, BPM Workshops 2007
Discovery
• A.J.M.M. Weijters, J.T.S. Ribeiro. Flexible Heuristics Miner (FHM), CIDM, 2011.
Evaluation setup
Log         Traces  Events  Event classes  Duplication ratio
Motor       4,293   33,202  292            114
Commercial  4,852   54,134  81             668
BPI 2012    5,312   91,949  36             2,554
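The duplication ratio values are consistent with dividing the number of events by the number of event classes and rounding. That reading is inferred from the numbers on the slide, not stated on it. A quick check:

```python
# (events, event classes) per log, taken from the evaluation setup table
logs = {
    "Motor": (33202, 292),
    "Commercial": (54134, 81),
    "BPI 2012": (91949, 36),
}

# Assumed definition: duplication ratio = events / event classes, rounded
ratios = {name: round(events / classes) for name, (events, classes) in logs.items()}
```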
Evaluation – repository size and number of models
S: Song et al. B: Bose et al. M: de Medeiros et al.
• up to 64% reduction in repository size
• up to 66% reduction in # of top level process models
• up to 120 sub-processes extracted
[Charts: two panels, each over the Motor, Comm and BPI logs; annotated values: 14%, 22%, 66%, 64%]
Evaluation – individual model complexity
[Charts: two panels, each over the Motor, Comm and BPI logs; annotated value: 30%]
concept drift
log at time2 > time1
ABCXEFY
ABCXEY
BCCXE
BCCXE
BCCXEE
ABCXDFY
Challenge 2: Coping with evolution
log at time1
ABCDEFG
ABCDEG
BCCDF
BCCDE
BCCDEE
ABCDDFG
liquid process model collection (currently in use)
intentional changes since last version
process stakeholder
liquid process model collection (consolidated)
liquid process model collection (from new behavior)
non-transient changes since last log
xy
xy
y
A time point when there is a statistically significant difference between the observed process behavior before and after this point
Concept drift in a single process
Example
<A,B,E,F,G> <A,B,C,F,G> <A,B,C,F,G> <A,B,D,F,G> <A,B,E,F,G> <A,B,D,F,G>
Drift
Log
<A,B,E,F,G> <A,B,D,F,G> <A,B,E,F,G> <A,B,D,F,G> <A,B,D,F,G> <A,B,D,F,G>
1. Fully automated
2. Highly scalable (online use)
3. Highly accurate
   - types of drifts detected
   - delay in detecting the drift
4. Explainable
Requirements
• Statistically significant difference in process behavior, i.e. “when are two processes different?”
• Use an appropriate data structure to encode process behavior
  Partial order runs of a process where concurrency is explicitly captured > configuration equivalence
• Process drift = time point when there is a statistically significant difference in the distribution of the runs before and after (for a given time window size)
Our approach: ProDrift
A. Maaradji, M. Dumas, M. La Rosa, A. Ostovar, Fast and accurate business process drift detection. In BPM 2015
1. Starting from an event log, we consider completed traces
2. For each new trace
   • update the concurrency relation
   • transform the trace into a run by encoding the associated concurrency relation
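The two steps above can be sketched as follows. The sketch assumes a simple concurrency oracle (two activities are concurrent if each has been seen directly following the other) and encodes a run as the set of ordered pairs of non-concurrent events. Both choices are assumptions made for illustration and may differ from what ProDrift does internally.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set, Tuple

Event = Tuple[str, int]                        # (activity, occurrence number)
Run = Tuple[FrozenSet[Event], FrozenSet[Tuple[Event, Event]]]


def update_directly_follows(follows: Set[Tuple[str, str]], trace: Iterable[str]) -> None:
    """Record every directly-follows pair observed in the new trace."""
    activities = list(trace)
    follows.update(zip(activities, activities[1:]))


def concurrent(follows: Set[Tuple[str, str]], a: str, b: str) -> bool:
    """Assumed oracle: a and b are concurrent if seen in both orders."""
    return a != b and (a, b) in follows and (b, a) in follows


def trace_to_run(trace: Iterable[str], follows: Set[Tuple[str, str]]) -> Run:
    """Keep the order of two events only when their activities are not concurrent."""
    seen: Counter = Counter()
    events = []
    for activity in trace:
        events.append((activity, seen[activity]))
        seen[activity] += 1
    order = set()
    for i, earlier in enumerate(events):
        for later in events[i + 1:]:
            if not concurrent(follows, earlier[0], later[0]):
                order.add((earlier, later))
    return frozenset(events), frozenset(order)


follows: Set[Tuple[str, str]] = set()
for completed_trace in ["ABCD", "ACBD"]:
    update_directly_follows(follows, completed_trace)

run_1 = trace_to_run("ABCD", follows)
run_2 = trace_to_run("ACBD", follows)
```

Because B and C are observed in both orders, the two traces collapse into the same run, which is the point of working on runs instead of raw traces.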
From a stream of traces to a stream of runs
Stream of traces
Stream of runs
1. Define two juxtaposed sliding windows (reference and detection) over the most recent runs
2. Consider the runs as observations of a categorical variable, one per window
3. Apply the Chi-square test of independence between the two windows
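A minimal sketch of the statistical test in step 3, written in plain Python so that it has no dependencies. Runs are represented by arbitrary hashable labels, the function names are hypothetical, and the p-value is computed with a basic series expansion that is adequate for illustration but not tuned for numerical robustness.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def chi2_sf(stat: float, dof: int) -> float:
    """Upper-tail probability of a chi-square variable with dof degrees of freedom."""
    if stat <= 0 or dof <= 0:
        return 1.0
    a, z = dof / 2.0, stat / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z)
    term = total = 1.0
    for n in range(1, 10000):
        term *= z / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    lower = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total
    return min(1.0, max(0.0, 1.0 - lower))


def window_p_value(reference: Sequence[Hashable], detection: Sequence[Hashable]) -> float:
    """Chi-square test of independence on a 2 x K table (window x distinct run).

    Both windows are assumed to be non-empty.
    """
    ref, det = Counter(reference), Counter(detection)
    categories = set(ref) | set(det)
    n_ref, n_det = len(reference), len(detection)
    total = n_ref + n_det
    stat = 0.0
    for c in categories:
        column = ref[c] + det[c]
        for observed, row in ((ref[c], n_ref), (det[c], n_det)):
            expected = row * column / total
            stat += (observed - expected) ** 2 / expected
    return chi2_sf(stat, len(categories) - 1)


reference_window = ["r1"] * 50 + ["r2"] * 50
detection_window = ["r1"] * 50 + ["r3"] * 50
p_same = window_p_value(reference_window, reference_window)
p_changed = window_p_value(reference_window, detection_window)
```

Identical windows give a p-value of 1, while a window in which one run is replaced by another gives a p-value far below any usual threshold.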
Reference window
Point of the hypothesis test
Detection window
$\pi_{i+1}, \dots, \pi_{i+w}, \pi_{i+w+1}, \dots, \pi_{i+2w}$
Chi-square test of independence
P-value < threshold
Stream of runs
The detection delay d is the distance between the actual drift and the last trace read in order to detect a drift
To avoid sporadic stochastic oscillations of P-value, we have a drift when P-value < threshold for consecutive tests
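The noise filter can be sketched as a counter over successive test results. The threshold of 0.05 and the requirement of 3 consecutive tests are placeholder values chosen for the example, not values given on the slide.

```python
from typing import Iterable, Optional


def first_confirmed_drift(p_values: Iterable[float],
                          threshold: float = 0.05,
                          consecutive: int = 3) -> Optional[int]:
    """Index of the test at which a drift is confirmed, or None if never confirmed."""
    run_length = 0
    for index, p in enumerate(p_values):
        # A single low p-value is treated as noise; only an unbroken streak counts
        run_length = run_length + 1 if p < threshold else 0
        if run_length == consecutive:
            return index
    return None


confirmed_at = first_confirmed_drift([0.9, 0.01, 0.8, 0.01, 0.02, 0.03, 0.5])
```

The isolated low p-value at index 1 is ignored, and the drift is confirmed at index 5, once three consecutive tests fall below the threshold. Requiring a longer streak reduces false alarms but lengthens the detection delay d.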
Detection delay and noise filter
$\pi_{i+2w}$
Actual drift
d
Reference window
Point of the hypothesis test
Detection window
$\pi_{i+1}$, $\pi_{i+w}$, $\pi_{i+w+1}$
Stream of runs
The choice of the window size is critical for drift detection:
• a higher variation needs more observations
• a lower variation needs fewer observations
We use an adaptive window technique to have a more reliable statistical test based on the evolution ratio
Adaptive window
Reference window Detection window
Reference window Detection window
$\pi_{i+1}, \dots, \pi_{i+w}, \pi_{i+w+1}, \dots, \pi_{i+2w}$
Stream of runs
$T_j$
$T_{j+1}$
Implementation in Apromore
Watch the screencast at https://youtu.be/97NLShSMJnQ
We generated a benchmark dataset of 72 logs by simulating a textbook example (loan origination process) using BIMP
Injected 18 different change patterns
For each pattern, we generated 4 logs of different lengths (2,500, 5,000, 7,500 and 10,000 traces)
Evaluation: synthetic dataset
Change patterns from Weber et al.
12 simple change patterns:
+ 6 complex change patterns (3 nested patterns each): IRO, IOR, ORI, OIR, RIO, ROI
Weber, B., Reichert, M., Rinderle-Ma, S.: Change patterns and change support features-enhancing flexibility in process-aware information systems. DKE 66(3), 2008
Drift injection – gold standard
Each drift injected 9 times by composing 10 sublogs
juxtaposition
simulation
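The juxtaposition step can be sketched as concatenating sublogs and recording each boundary as a known drift point. The example assumes the sublogs alternate between a base and a changed process and that each holds 250 traces, which gives the 2,500-trace log size. Equal sublog lengths and the two toy traces are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple


def juxtapose(sublogs: Sequence[Sequence[str]]) -> Tuple[List[str], List[int]]:
    """Concatenate sublogs; every boundary between two sublogs is a drift point."""
    log: List[str] = []
    drift_points: List[int] = []
    for k, sublog in enumerate(sublogs):
        if k > 0:
            drift_points.append(len(log))  # index of the first trace after the drift
        log.extend(sublog)
    return log, drift_points


base = ["ABCD"] * 250      # toy traces simulated from the base model
changed = ["ABXD"] * 250   # toy traces simulated from the altered model
sublogs = [base if k % 2 == 0 else changed for k in range(10)]
log, drift_points = juxtapose(sublogs)
```

Ten sublogs produce nine boundaries, so the gold standard for this log contains nine drift points.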
Time performance: time required to perform a new statistical test
- min: 0.26 ms
- max: 2.3 ms
- mean: 0.5 ms (real time)
Accuracy:
- F-score
- Mean delay
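One way to compute the two accuracy measures from a gold standard is sketched below. The matching rule (an actual drift counts as detected if a detection occurs within `max_delay` traces after it) is an assumption made for this example. The slides do not specify it.

```python
from typing import List, Optional, Sequence, Tuple


def evaluate(actual: Sequence[int], detected: Sequence[int],
             max_delay: int) -> Tuple[float, Optional[float]]:
    """Return (F-score, mean delay) for detected drifts against actual drifts."""
    remaining = sorted(detected)
    delays: List[int] = []
    for drift in sorted(actual):
        # First unused detection that falls within the allowed delay after the drift
        match = next((d for d in remaining if drift <= d <= drift + max_delay), None)
        if match is not None:
            remaining.remove(match)
            delays.append(match - drift)
    true_positives = len(delays)
    if true_positives == 0:
        return 0.0, None
    precision = true_positives / len(detected)
    recall = true_positives / len(actual)
    f_score = 2 * precision * recall / (precision + recall)
    return f_score, sum(delays) / true_positives


f_score, mean_delay = evaluate(actual=[250, 500, 750],
                               detected=[260, 530, 900],
                               max_delay=100)
```

In this example two of the three drifts are detected, with delays of 10 and 30 traces, and the detection at 900 counts as a false positive.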
Evaluation measures
Impact of window size on F-score and mean delay
Impact of window size on mean delay
Impact of adaptive window size on F-score
Impact of adaptive window size on mean delay
• Log from claims management system of a large Australian insurance company
• 4,509 traces, 29,108 events with 12 event classes
Evaluation on real-life log
• Results validated with a business analyst from the insurance company
• Distribution of the number of active cases over log timeline confirms the results
Evaluation on real-life log
How to explain what happened?
Reference window Detection window
$\pi_{i+1}, \dots, \pi_{i+w}, \pi_{i+w+1}, \dots, \pi_{i+2w}$
N.R. van Beest, M. Dumas, L. Garcia-Banuelos, M. La Rosa, Log delta analysis: Interpretable differencing of business process event logs. In BPM 2015
Event structure1 Event structure2
MERGE MERGE
Runs1Runs2
PSP
Before the drift, task “Emit invoice” could be repeated, afterwards not anymore...
lack of accuracy
superfluous activity
missing activity
log:   P Q R  S  E  F - G
model: P Q >> >> >> F X -
Challenge 3: Aligning logs with existing collection
log
PQRSEFG
ABD
ABCDE
PQFXEFG
process model collection
A
D
B C
P
Q
F
X
A
D
B C
E
trace sub-trace
activity
event
A B C D E
A B C D E
full alignment
partial alignment
overall alignment score = ?
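As a small illustration of what an alignment is, the sketch below aligns one trace with one fixed run of a model using dynamic programming over synchronous, log-only and model-only moves, each non-synchronous move costing 1. This is a simplification: a real alignment searches over all runs of the model, and how to aggregate per-trace costs into an overall score for a collection is the open question raised above. The `>>` skip symbol and unit costs are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

SKIP = ">>"  # marks the side that does not move


def align(trace: Sequence[str], run: Sequence[str]) -> Tuple[int, List[Tuple[str, str]]]:
    """Cheapest alignment between a trace and one model run, as (cost, moves)."""
    n, m = len(trace), len(run)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i                      # only log moves
    for j in range(1, m + 1):
        cost[0][j] = j                      # only model moves
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j], cost[i][j - 1]) + 1
            if trace[i - 1] == run[j - 1]:
                best = min(best, cost[i - 1][j - 1])   # synchronous move is free
            cost[i][j] = best
    # Walk back from the end to recover the sequence of moves
    moves: List[Tuple[str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and trace[i - 1] == run[j - 1] and cost[i][j] == cost[i - 1][j - 1]:
            moves.append((trace[i - 1], run[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            moves.append((trace[i - 1], SKIP))          # superfluous activity in the log
            i -= 1
        else:
            moves.append((SKIP, run[j - 1]))            # activity missing from the log
            j -= 1
    moves.reverse()
    return cost[n][m], moves


full_cost, full_moves = align("ABCDE", "ABCDE")
partial_cost, partial_moves = align("ABD", "ABCD")
```

The first call yields a full alignment with cost 0. The second shows a single model-only move for the activity C that the trace skips.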
Wil M. P. van der Aalst, Arya Adriansyah, Boudewijn F. van Dongen: Replaying history on process models for conformance checking and performance analysis. Wiley Interdisc. Rev.: Data Mining and Knowledge Discovery, 2012
Prof. Marcello La Rosa
Academic Director (Corporate engagements)
BPM Discipline, IS School
Science & Engineering Faculty
Queensland University of Technology
@mlr80