A Company’s Digital Twin€¦ · Explorers, Variants Explorer, Standard KPI's ... Category...

A Company’s Digital TwinTUM Data Innovation Lab

Team: Sagarika Kathuria, Pooreumoe Kim,Frederik Wenkel, Jieyi Zhang

Mentors: Simon Brand, Anton Kurz, Sebastian Rossner

Co-Mentor: Laure Vuaille

Project Leader: Dr. Ricardo Acevedo Cabra

Supervisor: Prof. Dr. Massimo Fornasier

Our Team

Sagarika KathuriaMathematics in Data Science

Pooreumoe KimData Engineering and Analytics

Frederik WenkelMathematics

Jieyi ZhangData Engineering and Analytics

AGENDA

Motivation

Digital Twin Concept

Process Mining at Celonis

Methodology

Specific Use Case at Celonis

Summary & Outlook

AGENDA01

Motivation

Methodology

Summary & Outlook

Company as a Complex System

Resources Market

Company

➔ Company’s future performance?➔ Forecasting systems

➔ Sales forecasting, predictions for volatile resources

➔ “Patchwork” of forecasts➔ No complete picture

Competitors

AGENDA

Motivation

Methodology

Summary & Outlook

Major characteristics

➔ Similar behavior to original organization

➔ Family resemblance

➔ No exact copy➔ Digital Twin can be

raised differently

AGENDA

Motivation

Methodology

Summary & Outlook

➔ Every process leaves a digital footprint in the company’s environment

➔ Celonis Process Mining enables an organization to structure its raw data of a process into a data model

➔ The main object : Activity Table

➔ Consists of a number of activities represented by cases

➔ Each case has its own case key to distinguish different cases

➔ Each Activity has an event time,a sorting and a category

ACTIVITY TABLE

➔ Analysis in Celonis Data Mining:

➔ Helps visualize and understand the data using tools like Process Query Language(PQL) and Key Performance Indicators (KPI’s)

➔ Helps identify bottlenecks and inefficiencies within the process

➔ Consists of various objects like: Process Explorers, Variants Explorer, Standard KPI's

➔ Customizable with tables, charts, diagrams and custom KPI’s ANALYSIS

AGENDA

Motivation

Methodology

Summary & Outlook

Explorative Data Analysis

➔ Input Dataset : Celonis Happyfox IT Service Management

➔ Happyfox - a Saas Platform, offers help deskticketing system for approx. 12000 companies.

➔ Celonis Happyfox ITSM Data Model:

➔ Three data tables which contains the information for each ticket - _CEL_ITSM_happyfox_ACTIVITIES, Tickets, Updates

➔ Data stored in the form of string, boolean, number and date time values

➔ Data accessed using PQL (Process Query Language) in Celonis and Python API.

➔ ITSM Ticket Processing

➔ Each ticket - has its own process - each process type called a Variant

➔ Happy Path - variant that happens most frequently

➔ Throughput time -

= tsact

− tsact_previous

Ttotal_tp

= tslast_act

− tsfirst_act

➔ Total Variants : 1850 , 20% cases follow Happy Path

➔ Throughput Time Analysis

➔ 2 cases - tickets created in 2017 & 2018

➔ Median - close to 0 daysMax Throughput Time - 276 days

➔ Digital Twin model considers outliers as well

Throughput Time (days) 15/37

Throughput Time (days)

Throughput Time Analysis

➔ Kolmogorov–Smirnov(KS) Test

➔ Tests the distance between the empirical distribution function of the data and the cumulative distribution function (CDF) of the reference distribution.

➔ H0 = Two distributions follow the same

distribution

➔ H0 holds if p-value > 0.05

➔ Need non-parametric methods to simulate the Data such as Bins Method, Kernel Density Estimation

Distribution p-value Is passed

Poisson Distribution

4.675e-10 No

Exponential Distribution

0.0 No

Birnbaum-Saunders

0.0 No

KS Test Results

➔ Category Based Analysis

➔ Total Categories - 21

➔ Categories of Importance - 5

➔ Digital Twin simulates categories in cases based on their percentages

Count of Tickets based on Categories

➔ Ticket Number Analysis

➔ Total Tickets - 12000

➔ Dickey Fuller Test - Statistical test for checking stationarity.

H0 - Series is non-stationary

p-value : 0.091Cannot reject H

0as p-value > 0.05

➔ Use Transformations like log

➔ Estimate trend and seasonality to predict number of tickets using Time Series Methods

Count of Tickets Based on Creation Date

General Simulation Approach

Activity Simulation

Category Simulation

Event Time Simulation

Output

➔ First goal: Family resemblance

➔ Second goal: Raise twin differently

Activity Simulation

➔ First building block of simulation➔ Importance for subsequent blocks

➔ Measure of precision: Occurrences of most frequent activity flows from input data in output

➔ According to empirical observations from input

➔ Markov Process

➔ Manageable matrix representation➔ No dependencies captured on activity

history

Activity Simulation

➔ Preprocessing yields improvement➔ Treat activity flows with different starting activities separately

➔ Linear Additive Markov Process (LAMP) instead of ordinary Markov Process

➔ Parameters w, P have to be learned minimizing the following negative log likelihood

Category Simulation

➔ Category assignment to every activity flow from activity simulation

➔ Starting category assignment➔ Markov Process for category changes Category 1

Category 2

The number of Tickets Prediction

❏ Train time series models from training set

❏ Data Separation

❏ ARIMA, Holt Winter’s ...

❏ Compare the prediction result to test set

❏ Measure errors

The number of Tickets Prediction

➔ Choose Holt Winter’s Model

Throughput Time Simulation: KDE

❏ Kernel Density Estimation(KDE):Non-parametric way to estimate the probability density function(PDF) of a random variable.

❏ Two important parameters

h : Bandwidth K(.) : Kernel function

Throughput Time Simulation: KS-Test

❏ Kolmogorow-Smirnow-Test (KS-Test)

❏ p-value > 0.05:

Two sets have same distribution

The higher, the more identical

Throughput timeFrom <Status: New> to <Change assignee>

H0: FX(x) = FY(x)H1: FX(x) ≠ FY(x)

Throughput Time Simulation: Kernel Selection

23/37➔ Choose Uniform Kernel

❏ Simulating power(KS-Test)?

❏ Identical

❏ Table 7 (Doc)

❏ Negative value generation?

❏ Except Gaussian, no negative if

bandwidth <= 0.1

❏ Table 8 (Doc)

❏ Internal sampling method?

❏ Gaussian & Uniform

Throughput Time Simulation: Bandwidth Selection

❏ Three methods for bandwidth:❏ Constant bandwidth (0.1)

❏ Gridsearch

❏ Gridsearch with Cross validation

❏ Analysis of variance(ANOVA)

Comparison of running times

ANOVA result: cannot reject H0

➔ Choose Constant Bandwidth(0.1)

H0 : µ1 = µ2 = µ3

H1 : µi ≠ µj

Digital Twin: Integration of the Methodologies

❏ Activity Simulation❏ Category Simulation❏ Number of Cases❏ Throughput Time

➔ Generates Virtual Activity Table

Model Validation

ModelMarkov Process

KDE (Bandwidth=0.1)

HoltWinter(Period = 7 days)

M1 2 months 2 months All

M2 2 months 2 months 4 months

M3 1 month 1 month 4 months

● Training Data

● Cross Validation

Model Validation

● Simulated Values Validation (Prediction for May 2018)

Cases per day Events per Day Avg Total Throughput time

Trimmed Avg Total Throughput Time

Sample Size

rel. Error M1 31.25% 28.57% 15.52% 16.05% 21.45%

rel. Error M2 25.00% 32.38% 11.20% 11.11% 23.55%

rel. Error M3 18.75% 20.95% 18.10% 14.81% 25.45%

● Simulated Values Validation (Prediction for June 2018)

Trimmed Avg Total Throughput Time

Sample Size

rel. Error M1 45.16% 36.99% 43.53% 33.33% 13.40%

rel. Error M2 38.71% 30.64% 42.35% 31.82% 14.87%

rel. Error M3 36.67% 27.75% 28.24% 18.18% 14.87%

Model Validation

● Model 3 Total Throughput Time Distribution (2018-05) Activity Frequency Validation (2018-05)

Create Ticket 21.34% 21.36%

Status: New 16.46% 17.95%

Change assignee 12.20% 9.40%

Status: Open 11.59% 12.82%

Status: Closed 10.37% 9.40%

Change Category 9,76% 9.40%

Status: Solved 7.93% 6.84%

Status: On Hold 5.49% 6.84%

Change Priority 4.89% 6.84%

Activity FrequencyReal Data

FrequencyPrediction

Rel. ErrorIn %

Real Data Total Throughput Time (in Days)

Prediction Total Throughput Time (in Days)

AGENDA

Motivation

Methodology

Summary & Outlook

First Level Service Automation

Reduce the throughput time of steps which belong to first level service to zero and simulate the cases.

Simulated Value for May 2018

Trimmed Avg Total Throughput

Sample Size

Percentage of Change -6.25% -4.76% -68.10% -78.75% -24.09%

What if I buy a chatbot and automate the first-level service?

How does this affect my throughput time?How much people could I reallocate?...

First Level Service Automation

AutomatedReal World

● More tickets solved within a short time

● Average throughput time overall reduced

● Same Happy Paths

Tickets Created in May 2018 (Celonis Process Explorer)

AGENDA

Motivation

Methodology

Summary & Outlook

Summary and Outlook

❖ Process mining and Digital Twin Model

❖ Happyfox ITSM Data Model

❖ Data exploration

❖ Introduction of the techniques applied

❖ Model training and validation

❖ Application of Digital Twin

❖ More sophisticated model

❖ More what-if questions

➢ Number of tickets change

➢ 24/7 customer support

❖ More input data

❖ ...

Summary Future Works

Thank you!

A Company’s Digital Twin€¦ · Explorers, Variants Explorer, Standard KPI's ... Category...

Documents

Transcript of A Company’s Digital Twin€¦ · Explorers, Variants Explorer, Standard KPI's ... Category...

Terrorism Simulation Achieve your group goal to win.

KPI's Training Pack

7 Important KPI's for ServiceDesk

Retail Industry KPI's : Theme

FY13 KPI's Colleague Pack

SGS Supply Chain Solution - Dashboard KPI's Paper

A 3373 KEY PERFORMANCE INDICATORS (KPI's) TO BE …

Network Management KPI's

Doelwaarden op bedrijfsniveau voor de KPI's binnen de ...

International Benchmarking and Financial KPI's

Aligning Key Enterprise Risk to Strategic Initiatives ...vashrm.org/images/meeting/091417/speaker_2_1015_1115_howard_felt… · KPI's – KRI's KPI's: Identify underperforming aspects

005 benchmarking-kpi's

Managing with KPI's and KRI's

Benchmarking KPI's für Intrapreneure

Business Finance KPI's - tutorial

An analysis of crowdfunded projects: KPI's to success

Benchmarking kpi's

easyFairs 2011 Monitoring & KPI's in der Instandhaltung · ThyssenKrupp Xervon Powering Plant Performance easyFairs 2011 - Monitoring & KPI's in der Instandhaltung, Dr.-Ing. Marcus

The KPI's of job board Success

KPI's Analysis