Liquid process model collections

Liquid Process Model Collections how to get there Prof. Marcello La Rosa BPM Discipline, Information Systems School Queensland University of Technology

Transcript of Liquid process model collections

Page 1: Liquid process model collections

Liquid Process Model Collections: how to get there

Prof. Marcello La Rosa
BPM Discipline, Information Systems School
Queensland University of Technology

Page 2: Liquid process model collections

Research
• Technology oriented
• Business oriented

Teaching
• BPM specializations
• Master’s of BPM

Service
• Professional training
• Consultancy

BPM Discipline @ QUT
http://bpm-research-group.org

Page 3: Liquid process model collections

Each process is varied by product & brand…

End to end insurance process

Source: Guidewire reference models

Product Dev Sales Service Claims

Total number of insurance models: 3,000+

30 variations

500 tasks

Home Motor Commercial Liability CTP / WC

A few years back… Suncorp insurance

Page 4: Liquid process model collections

Managing large process model collections

versions & variants management

merging

refactoring / standardization

clone detection

Process model repository querying

similarity search

80%

R. Dijkman, M. La Rosa, H. Reijers “Managing large collections of business process models – Current techniques and challenges”, COMIND 2012

[Chart: horizontal axis from 2000 to 2012 in two-year steps, vertical axis from 0 to 40]

Page 5: Liquid process model collections

The Apromore Initiative (apromore.org)
An open-source, highly scalable, SaaS platform to manage process model collections

M. La Rosa, H. Reijers, W. van der Aalst, R. Dijkman, J. Mendling, M. Dumas, L. Garcia-Banuelos “APROMORE: an advanced process model repository”, EXP.SYS.APP. 2011

Page 6: Liquid process model collections

“Build awareness”
Understand differences and causes for these differences

“Achieve simplification”
Identify and consolidate common business functions

“Achieve centralisation”
Centralise support for non-core processes across LOBs

“Identify opportunities for partnering”
Make better decisions about the processes you can partner/run in-house

Expected benefits (beginning of the project)

Page 7: Liquid process model collections

“The tool is great but it would be pretty useless because our process models aren’t great and we know they aren’t great” (*)

“If they [the models] would have been updated all along it would have been worthwhile but now they’re out of date, it’s not really worth the effort of bringing them up to date” (**)

(*) Realistic Suncorp employee
(**) Disillusioned Suncorp employee

The reality (end of the project)

Page 8: Liquid process model collections

Large process model collections, hard to maintain

Large collections of “dead” process models

The other side of the coin

Page 9: Liquid process model collections

Vision: what if the collection could self-adapt?

Change is endemic to organizations and continuously affects them:

Requirements change

Environments change

Processes change

Page 10: Liquid process model collections

If an organization’s processes change, this will be recorded in its system logs

Use process mining techniques to “discover” process changes from logs, and apply these changes to process model collections

Release the full potential of management techniques for process model collections

Let’s get more concrete

[Diagram: process knowledge (process models, patterns such as “if A then B”, conformance analysis, process performance, …) is extracted from an event log, a live event stream or a database]

Page 11: Liquid process model collections

A process model collection that is

• Aligned with organizational behavior

• Able to self-adapt to evolving organizational behavior

Solution: “Liquid” process model collection

W.M.P. van der Aalst, M. La Rosa, A.H.M. ter Hofstede, M.T. Wynn, Liquid Business Process Model Collections. In Modeling and Simulation-based Systems Engineering Handbook, 2014

Page 12: Liquid process model collections

1. Discovering a collection in the first place
2. Coping with evolution
3. Aligning logs with an existing process model collection

Approach: 3 interrelated challenges

Page 13: Liquid process model collections

Useless!

Event log

Challenge 1: Discovering a collection

Process discovery algorithm

Current situation: discovering “all-in-one” models

Page 14: Liquid process model collections

Event log

a b a b c d e c a f g h
a b a b k d e c h f g h
b c p q p r a k q r s
b c p h p r a k q r s
a x p h y z t t u

[Figure: trace clustering partitions the traces of the event log into Cluster 1, Cluster 2 and Noise; Process variant 1 and Process variant 2 are the resulting models]

Trace clustering

high complexity & redundancy

Page 15: Liquid process model collections

Our approach: Slice, Mine and Dice (SMD)

L. Garcia-Banuelos, M. Dumas, M. La Rosa, J. De Weerdt, C.C. Ekanayake. Controlled Automated Discovery of Collections of Business Process Models. Information Systems, 2014

slice the log horizontally per variant

dice the discovered models hierarchically

mine

Page 16: Liquid process model collections

Process discovery

Trace clustering

Event log

Complexity threshold, e.g. Size ≤ 30

Process model

Slice, Mine and Dice (SMD)

> ?

1) Slice

2) Mine

3) Dice
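The slice-and-mine loop on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SMD implementation: `model_size` is a hypothetical stand-in for process discovery (it counts distinct activities instead of mining a model), and halving a sorted cluster is a crude stand-in for trace clustering. The dice step (extracting shared fragments) is not covered here.

```python
def model_size(traces):
    """Hypothetical stand-in for 'mine': the size of the discovered model
    is approximated by the number of distinct activities in the traces."""
    return len({activity for trace in traces for activity in trace})


def slice_and_mine(traces, max_size=30):
    """Split the log until every cluster yields a model within the
    complexity threshold (or cannot be split any further)."""
    pending = [sorted(set(traces))]  # start from the whole log, distinct traces only
    accepted = []
    while pending:
        cluster = pending.pop()
        if model_size(cluster) <= max_size or len(cluster) == 1:
            accepted.append(cluster)  # model is simple enough: keep it
        else:
            # Assumption: halving the sorted cluster replaces real trace clustering.
            middle = len(cluster) // 2
            pending.extend([cluster[:middle], cluster[middle:]])
    return accepted


if __name__ == "__main__":
    log = [
        tuple("a b a b c d e c a f g h".split()),
        tuple("a b a b k d e c h f g h".split()),
        tuple("b c p q p r a k q r s".split()),
        tuple("b c p h p r a k q r s".split()),
        tuple("a x p h y z t t u".split()),
    ]
    for cluster in slice_and_mine(log, max_size=10):
        print(model_size(cluster), [" ".join(t) for t in cluster])
```

With a threshold of 10 activities, the five example traces shown earlier in the deck end up in three clusters, each within the threshold.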

Page 17: Liquid process model collections

Process discovery

Trace clustering

Event log

Complexity threshold, e.g. Size ≤ 30

Slice, Mine and Dice (SMD)

> ?

Process models

Page 18: Liquid process model collections

Process discovery

Trace clustering

Event log

Complexity threshold, e.g. Size ≤ 30

Slice, Mine and Dice (SMD)

> ?

Process models

Page 19: Liquid process model collections

Process model M3

Process model M2

A closer look…

Page 20: Liquid process model collections

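Refined Process Structure Tree (RPST)

[Figure: process models M2 and M3 next to their RPSTs. The RPST of M2 contains fragments F10, F11, F12, F13 and F14; the RPST of M3 contains fragments F20, F21, F22, F23, F24 and F25]

J. Vanhatalo, H. Volzer, J. Koehler: The Refined Process Structure Tree. Data Knowl. Eng., 2009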

Page 21: Liquid process model collections

RPSDAG

[Figure: the RPST of M2 (fragments F10 to F14) and the RPST of M3 (fragments F20 to F25) are combined into one RPSDAG for M2 and M3; fragments F14 and F25 are singled out]

M. Dumas, L. García-Bañuelos, M. La Rosa, and R. Uba. Fast Detection of Exact Clones in Business Process Model Repositories. Information Systems, 2013

Page 22: Liquid process model collections

Extracting exact clones

[Figure: fragment F14, models M2 and M3, and the extracted subprocess S3 (marked +)]

Exact clones:
• SESE
• Non-trivial
• Identical
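The idea of spotting identical fragments across fragment trees can be illustrated with a toy sketch. It assumes each model is already decomposed into a tree of nested fragments and compares fragments by a canonical string code. The tuple encoding, the `and`/`seq` kinds and the model contents are hypothetical, and this is not the RPSDAG data structure itself.

```python
from collections import defaultdict


def index_fragments(fragment, model, index):
    """Return (canonical code, number of tasks) for a fragment and record
    every non-trivial fragment (more than one task) in the index.
    A fragment is a task label (str) or a tuple (kind, child, child, ...)."""
    if isinstance(fragment, str):
        return fragment, 1
    kind, *children = fragment
    codes, size = [], 0
    for child in children:
        code, child_size = index_fragments(child, model, index)
        codes.append(code)
        size += child_size
    if kind != "seq":
        codes.sort()  # branch order does not matter for gateway blocks
    code = f"{kind}({','.join(codes)})"
    if size > 1:
        index[code].append(model)
    return code, size


def exact_clones(models):
    """Map each canonical code that occurs more than once to its models."""
    index = defaultdict(list)
    for name, root in models.items():
        index_fragments(root, name, index)
    return {code: found for code, found in index.items() if len(found) > 1}


if __name__ == "__main__":
    models = {  # hypothetical toy models
        "M2": ("seq", "A", ("and", "C", "D"), "B"),
        "M3": ("seq", "X", ("and", "D", "C"), "Y"),
    }
    print(exact_clones(models))
```

A real implementation would also keep only maximal clones, since every non-trivial child of a clone is itself reported as a clone by this sketch.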

Page 23: Liquid process model collections

Extracting approximate clones

[Figure: fragment F12 of M2 and fragment F22 of M3 are marked with “?”; subprocess S3 (+) appears in both models]

M. La Rosa, M. Dumas, C. Ekanayake, L. Garcia-Banuelos, J. Recker, A.H.M. ter Hofstede, Detecting Approximate Clones in Business Process Model Repositories. Information Systems, 2015

Approximate clones:
• SESE
• Non-trivial
• Similar
• Unrelated

Page 24: Liquid process model collections

Merging algorithm

Fragment F12 of model M2

Fragment F22 of model M3

Configurable gateway

Configurable label

M. La Rosa, M. Dumas, R. Uba, and R. M. Dijkman. Business Process Model Merging: An Approach to Business Process Consolidation. ACM TOSEM, 2013.

Merging approximate clones

S4

Page 25: Liquid process model collections

The result…

Consolidated process model collection

[Figure: the consolidated models together with subprocesses S3 and S4 (marked +)]

Page 26: Liquid process model collections

Trace clustering
• M. Song, C.W. Gunther, and W.M.P. van der Aalst, Improving Process Mining with Trace Clustering, J. Korean Inst. of Industrial Engineers 34(4), 2008
• R.P.J.C. Bose, W.M.P. van der Aalst, Trace Clustering Based on Conserved Patterns: Towards Achieving Better Process Models, BPM 2009 Workshops
• A.K.A. de Medeiros, A. Guzzo, G. Greco, W.M.P. van der Aalst, A.J.M.M. Weijters, B.F. van Dongen, D. Saccà. Process Mining Based on Clustering: A Quest for Precision, BPM Workshops 2007

Discovery
• A.J.M.M. Weijters, J.T.S. Ribeiro. Flexible Heuristics Miner (FHM), CIDM, 2011.

Evaluation setup

Log          Traces   Events   Event classes   Duplication ratio
Motor         4,293   33,202             292                 114
Commercial    4,852   54,134              81                 668
BPI 2012      5,312   91,949              36               2,554
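The duplication ratios in the table are consistent with the number of events divided by the number of event classes, rounded to the nearest integer. That reading is an assumption inferred from the figures, not a definition given on the slide. A quick check:

```python
def duplication_ratio(events, event_classes):
    """Assumed definition: average number of events per event class."""
    return events / event_classes


# (events, event classes) per log, taken from the evaluation setup table
LOGS = {
    "Motor": (33_202, 292),
    "Commercial": (54_134, 81),
    "BPI 2012": (91_949, 36),
}

if __name__ == "__main__":
    for name, (events, classes) in LOGS.items():
        print(name, round(duplication_ratio(events, classes)))
```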

Page 27: Liquid process model collections

Evaluation – repository size and number of models

S: Song et al. B: Bose et al. M: de Medeiros et al.

• up to 64% reduction in repository size
• up to 66% reduction in # of top level process models
• up to 120 sub-processes extracted

[Charts: two panels, each over the Motor, Comm and BPI logs; values called out: 14%, 22%, 66%, 64%]

Page 28: Liquid process model collections

Evaluation – individual model complexity

[Charts: two panels, each over the Motor, Comm and BPI logs; value called out: 30%]

Page 29: Liquid process model collections

Challenge 2: Coping with evolution

log at time1: ABCDEFG, ABCDEG, BCCDF, BCCDE, BCCDEE, ABCDDFG

log at time2 > time1: ABCXEFY, ABCXEY, BCCXE, BCCXE, BCCXEE, ABCXDFY

concept drift

[Diagram: elements shown are the liquid process model collection (currently in use), a process stakeholder, the intentional changes since the last version, the non-transient changes since the last log, the liquid process model collection (from new behavior) and the liquid process model collection (consolidated)]

Page 30: Liquid process model collections
Page 31: Liquid process model collections

A time point when there is a statistically significant difference between the observed process behavior before and after this point

Concept drift in a single process

Page 32: Liquid process model collections

Example

Log

<A,B,E,F,G> <A,B,C,F,G> <A,B,C,F,G> <A,B,D,F,G> <A,B,E,F,G> <A,B,D,F,G>

Drift

<A,B,E,F,G> <A,B,D,F,G> <A,B,E,F,G> <A,B,D,F,G> <A,B,D,F,G> <A,B,D,F,G>

Page 33: Liquid process model collections

1. Fully automated
2. Highly scalable (online use)
3. Highly accurate
   - types of drifts detected
   - delay in detecting the drift
4. Explainable

Requirements

Page 34: Liquid process model collections

• Statistically significant difference in process behavior, i.e. “when are two processes different?”

• Use an appropriate data structure to encode process behavior: partial order runs of a process, where concurrency is explicitly captured (→ configuration equivalence)

• Process drift = time point when there is a statistically significant difference in the distribution of the runs before and after (for a given time window size)

Our approach: ProDrift

A. Maaradji, M. Dumas, M. La Rosa, A. Ostovar, Fast and accurate business process drift detection. In BPM 2015

Page 35: Liquid process model collections

1. Starting from an event log, we consider completed traces
2. For each new trace
   • update the concurrency relation
   • transform trace into run by encoding the associated concurrency relation

From a stream of traces to a stream of runs

Stream of traces

Stream of runs
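The two steps above can be sketched as follows. This sketch assumes an alpha-algorithm-style concurrency relation (two activities are concurrent once each has been observed directly following the other) and encodes a run as its set of events plus the ordered pairs of non-concurrent events. Both choices, and the `RunEncoder` name, are illustrative assumptions rather than the encoding used in ProDrift.

```python
from itertools import combinations


class RunEncoder:
    def __init__(self):
        self.follows = set()  # directly-follows pairs observed so far

    def update(self, trace):
        """Update the concurrency relation with a newly completed trace."""
        self.follows.update(zip(trace, trace[1:]))

    def concurrent(self, a, b):
        return a != b and (a, b) in self.follows and (b, a) in self.follows

    def to_run(self, trace):
        """Encode a trace as a partial order: keep the order between two
        events only if their activities are not concurrent."""
        counts, events = {}, []
        for label in trace:
            counts[label] = counts.get(label, 0) + 1
            events.append((label, counts[label]))  # distinguish repeated labels
        order = frozenset(
            (x, y) for x, y in combinations(events, 2)
            if not self.concurrent(x[0], y[0])
        )
        return frozenset(events), order


if __name__ == "__main__":
    encoder = RunEncoder()
    for trace in ["ABCD", "ACBD"]:
        encoder.update(trace)
    # B and C were seen in both orders, so both traces map to the same run
    print(encoder.to_run("ABCD") == encoder.to_run("ACBD"))
```

In a streaming setting the concurrency relation keeps growing, so the run assigned to an earlier trace may differ from the one it would receive later. The sketch ignores that effect.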

Page 36: Liquid process model collections

1. Define two juxtaposed sliding windows (reference and detection) over the most recent runs

2. Consider the runs as observations of a categorical variable, one per window

3. Apply the Chi-square test of independence between the two windows

Stream of runs: π_{i+1}, …, π_{i+w} (reference window) | π_{i+w+1}, …, π_{i+2w} (detection window)

Point of the hypothesis test

Chi-square test of independence

P-value < threshold
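The three steps above can be sketched in plain Python. The sketch treats each distinct run as one category, builds a 2 × k contingency table from the two windows, and derives the P-value from the chi-square statistic with a simple series expansion. The function names are hypothetical, and a library routine such as SciPy's `chi2_contingency` would normally replace the hand-written P-value code.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Survival function of the chi-square distribution, via the series
    expansion of the regularized lower incomplete gamma function
    (adequate for the moderate statistics of a small example)."""
    if x <= 0:
        return 1.0
    a, half = df / 2.0, x / 2.0
    term = total = 1.0 / a
    for n in range(1, 10_000):
        term *= half / (a + n)
        total += term
        if term < total * 1e-15:
            break
    lower = total * math.exp(-half + a * math.log(half) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def window_p_value(reference, detection):
    """Chi-square test of independence between window and run category."""
    if not reference or not detection:
        raise ValueError("both windows must be non-empty")
    ref, det = Counter(reference), Counter(detection)
    categories = set(ref) | set(det)
    if len(categories) < 2:
        return 1.0  # a single category cannot differ between windows
    total = len(reference) + len(detection)
    statistic = 0.0
    for category in categories:
        column = ref[category] + det[category]
        for observed, row in ((ref[category], len(reference)),
                              (det[category], len(detection))):
            expected = row * column / total
            statistic += (observed - expected) ** 2 / expected
    return chi2_sf(statistic, len(categories) - 1)


def sliding_p_values(runs, w):
    """P-value after each run, once 2w runs are available."""
    for end in range(2 * w, len(runs) + 1):
        yield window_p_value(runs[end - 2 * w:end - w], runs[end - w:end])


if __name__ == "__main__":
    stream = ["r1", "r2"] * 50 + ["r1", "r3"] * 50
    print(min(sliding_p_values(stream, w=50)))
```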

Page 37: Liquid process model collections

The detection delay d is the distance between the actual drift and the last trace read in order to detect a drift

To filter out sporadic stochastic oscillations of the P-value, a drift is reported only when P-value < threshold for consecutive tests

Detection delay and noise filter

[Figure: stream of runs π_{i+1}, …, π_{i+w} (reference window) and π_{i+w+1}, …, π_{i+2w} (detection window); the actual drift and the detection delay d are marked relative to the point of the hypothesis test]
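The noise filter and the detection delay described on this slide can be sketched as follows. The number of consecutive tests is not given on the slide, so the `consecutive` parameter and its default are assumptions, as are the function names.

```python
def first_drift(p_values, threshold=0.05, consecutive=3):
    """Index of the test at which the P-value has stayed below the
    threshold for `consecutive` tests in a row, or None if never."""
    streak = 0
    for index, p_value in enumerate(p_values):
        streak = streak + 1 if p_value < threshold else 0
        if streak == consecutive:
            return index
    return None


def detection_delay(actual_drift, last_trace_read):
    """Distance between the actual drift and the last trace read
    in order to detect it (both given as trace positions)."""
    return last_trace_read - actual_drift


if __name__ == "__main__":
    p_values = [0.5, 0.01, 0.6, 0.01, 0.02, 0.03, 0.9]
    print(first_drift(p_values))        # the isolated dip at index 1 is ignored
    print(detection_delay(1000, 1042))  # hypothetical trace positions
```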

Page 38: Liquid process model collections

The choice of the window size is critical for drift detection:
• a higher variation needs more observations
• a lower variation needs fewer observations

We use an adaptive window technique to have a more reliable statistical test based on the evolution ratio

Adaptive window

[Figure: stream of runs π_{i+1}, …, π_{i+2w} with the reference and detection windows drawn at two different sizes, at times T_j and T_{j+1}]

Page 39: Liquid process model collections

Implementation in Apromore
Watch the screencast at https://youtu.be/97NLShSMJnQ

Page 40: Liquid process model collections

We generated a benchmark dataset of 72 logs by simulating a textbook example (loan origination process) using BIMP

Injected 18 different change patterns

For each pattern, we generated 4 logs of different lengths (2,500, 5,000, 7,500 and 10,000 traces)

Evaluation: synthetic dataset

Page 41: Liquid process model collections

Change patterns from Weber et al.

12 simple change patterns:

+ 6 complex change patterns (3 nested patterns each): IRO, IOR, ORI, OIR, RIO, ROI

Weber, B., Reichert, M., Rinderle-Ma, S.: Change patterns and change support features-enhancing flexibility in process-aware information systems. DKE 66(3), 2008

Page 42: Liquid process model collections

Drift injection – gold standard

Each drift injected 9 times by composing 10 sublogs

juxtaposition

simulation
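Juxtaposing 10 sublogs produces 9 boundaries, which serve as the gold-standard drift points. The sketch below assumes that the sublogs alternate between traces simulated from the base model and from the modified model and that all sublogs have equal size. Both are assumptions, and `compose_log` is a hypothetical helper.

```python
from itertools import islice


def compose_log(base_traces, modified_traces, total_traces, n_sublogs=10):
    """Juxtapose alternating sublogs and return the composed log together
    with the positions at which a drift was injected."""
    size = total_traces // n_sublogs
    sources = (iter(base_traces), iter(modified_traces))
    log, drift_points = [], []
    for k in range(n_sublogs):
        if k > 0:
            drift_points.append(len(log))  # boundary between two sublogs
        log.extend(islice(sources[k % 2], size))
    return log, drift_points


if __name__ == "__main__":
    base = ["ABCD"] * 1250      # placeholder traces from the base model
    modified = ["ABXD"] * 1250  # placeholder traces from the modified model
    log, drifts = compose_log(base, modified, total_traces=2500)
    print(len(log), drifts)
```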

Page 43: Liquid process model collections

Time performance: time required to perform a new statistical test
- min: 0.26ms
- max: 2.3ms
- mean: 0.5ms (real time)

Accuracy:
- F-score
- Mean delay

Evaluation measures
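One way to compute the two accuracy measures from a gold standard is sketched below. The matching rule is an assumption not stated on the slide: a detection counts as a true positive if it falls within `tolerance` traces after an actual drift, and each actual drift is matched at most once.

```python
def evaluate(actual, detected, tolerance):
    """Return (F-score, mean delay) of detected drift positions against
    the actual drift positions, under the assumed matching rule."""
    unmatched = sorted(detected)
    delays = []
    for drift in sorted(actual):
        match = next(
            (d for d in unmatched if drift <= d <= drift + tolerance), None
        )
        if match is not None:
            unmatched.remove(match)
            delays.append(match - drift)
    true_positives = len(delays)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return 0.0, None
    f_score = 2 * precision * recall / (precision + recall)
    return f_score, sum(delays) / true_positives


if __name__ == "__main__":
    print(evaluate(actual=[250, 500, 750], detected=[260, 520, 900], tolerance=100))
```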

Page 44: Liquid process model collections

Impact of window size on F-score and mean delay

Page 45: Liquid process model collections

Impact of window size on mean delay

Page 46: Liquid process model collections

Impact of adaptive window size on F-score

Page 47: Liquid process model collections

Impact of adaptive window size on mean delay

Page 48: Liquid process model collections

• Log from claims management system of a large Australian insurance company

• 4,509 traces, 29,108 events with 12 event classes

Evaluation on real-life log

Page 49: Liquid process model collections

• Results validated with a business analyst from the insurance company

• Distribution of the number of active cases over log timeline confirms the results

Evaluation on real-life log

Page 50: Liquid process model collections

How to explain what happened?

[Figure: the runs in the reference window (Runs1) and in the detection window (Runs2) of the stream π_{i+1}, …, π_{i+2w} are each merged (MERGE) into an event structure (Event structure1, Event structure2); a PSP is built from the two event structures]

N.R. van Beest, M. Dumas, L. Garcia-Banuelos, M. La Rosa, Log delta analysis: Interpretable differencing of business process event logs. In BPM 2015

Before the drift, task “Emit invoice” could be repeated; after the drift, it no longer can.

Page 51: Liquid process model collections

Challenge 3: Aligning logs with existing collection

log: PQRSEFG, ABD, ABCDE, PQFXEFG

[Figure: the traces of the log are aligned against the models of the process model collection, which are built from activities A, B, C, D, E and P, Q, F, X. Legend: trace, sub-trace, activity, event. Annotations: full alignment, partial alignment, lack of accuracy, superfluous activity, missing activity]

Alignments shown (top row: log, bottom row: model):

A  B  C  D  E
A  B  C  D  E

P  Q  R  S  E  F  -  G
P  Q  >> >> >> F  X  -

overall alignment score = ?

Wil M. P. van der Aalst, Arya Adriansyah, Boudewijn F. van Dongen: Replaying history on process models for conformance checking and performance analysis. Wiley Interdisc. Rew.: Data Mining and Knowledge Discovery, 2012
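The notion of an alignment can be illustrated with a small dynamic-programming sketch. It makes a strong simplification: the model is represented by one of its execution sequences rather than by the model itself, whereas real conformance checking searches over the model's behavior. Each non-synchronous move costs 1, and `>>` marks the side that does not move. It does not answer how to score alignments across a whole collection.

```python
SKIP = ">>"


def align(trace, model_run):
    """Return (cost, moves) of a cheapest alignment, where cost is the
    number of log-only and model-only moves."""
    n, m = len(trace), len(model_run)
    # cost[i][j]: cheapest alignment of trace[i:] with model_run[j:]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            options = []
            if i < n:
                options.append(1 + cost[i + 1][j])  # move on log only
            if j < m:
                options.append(1 + cost[i][j + 1])  # move on model only
            if i < n and j < m and trace[i] == model_run[j]:
                options.append(cost[i + 1][j + 1])  # synchronous move
            cost[i][j] = min(options)
    moves, i, j = [], 0, 0
    while i < n or j < m:  # walk back along one cheapest path
        if (i < n and j < m and trace[i] == model_run[j]
                and cost[i][j] == cost[i + 1][j + 1]):
            moves.append((trace[i], model_run[j]))
            i, j = i + 1, j + 1
        elif i < n and cost[i][j] == 1 + cost[i + 1][j]:
            moves.append((trace[i], SKIP))
            i += 1
        else:
            moves.append((SKIP, model_run[j]))
            j += 1
    return cost[0][0], moves


if __name__ == "__main__":
    for trace, run in [("ABCDE", "ABCDE"), ("ABD", "ABCDE"), ("PQRSEFG", "PQFX")]:
        print(trace, run, align(trace, run))
```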

Page 52: Liquid process model collections

Prof. Marcello La Rosa
Academic Director (Corporate engagements)

BPM Discipline, IS School
Science & Engineering Faculty

Queensland University of Technology

[email protected]

@mlr80