Data Science: Better processes and products thanks to (Big) data

Wil van der Aalst www.vdaalst.com @wvdaalst

www.processmining.org

Data Science Center Eindhoven

http://www.tue.nl/dsce/

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

DSC/e: Competences and Research Programs

28 groups and 420+ people involved

Context: Why are we using data science, does it have the intended effect, and will people accept it?

Analysis: How to turn data into real value (models, answers/decisions, and visualizations/insights)?

Enabling technologies: How to get the data and deal with computational/infrastructural challenges (big data and hard questions)?

Probability and Statistics

Stochastic Networks

Data Mining

Process Mining

Visualization

Large-Scale Distributed Systems

Data-Intensive Algorithms

Data-Driven Operations Management

Data-Driven Innovation and Business

Human and Social Analytics

Privacy, Security, Ethics, and Governance

Internet of Things

[RP1] Process Analytics: Improving Service While Cutting Costs

[RP2] Customer Journey: Correlating Events to Learn and Influence Customer Behavior

[RP3] Smart Maintenance & Diagnostics: Safeguarding Availability

[RP4] Quantified Self: Improving Performance and Well-Being

[RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science

[RP6] Smart Cities: Ensuring Safety and Convenience for Citizens

[RP7] Smart Grids: Data Intensive Infrastructures

Data Science Flagship (Philips & DSC/e)

• 4 Strategic topics

• 4 TU/e departments

• 16 PhD students

• 30 Data science specialists

1. Data Driven Value Propositions

2. Healthcare Smart Maintenance

3. Optimizing Healthcare Workflows

4. Continuous Personal Health

“Data Science University” in Den Bosch

Process Mining: On the interface between process science and data science

As generic as a spreadsheet!

Spreadsheet: Killer App for early computers

• VisiCalc (killer app for Apple II, Oct. 1979)

• Lotus 1-2-3 (killer app for IBM PC, 1983)

• Microsoft Excel (1985)

Spreadsheet: Static data

[Figure: spreadsheet screenshot; annotations: fact, derived, 31 items sold, total value, average, distribution]

How to analyze operational processes?

Process Mining: Spreadsheet for behavior

• Input: events (“things that have happened”)

• Mandatory per event:

− case identifier

− activity name

− timestamp/date

• Optional:

− resource

− transaction type

− costs

− …

[Figure: event table with columns case identifier, activity name, timestamp, resource; row = event]
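The input format described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the format of any particular tool: the key names `case_id`, `activity`, `timestamp`, and `resource`, the helper `group_into_traces`, and the sample events are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Attribute names are hypothetical; the slide only says that a case identifier,
# an activity name, and a timestamp are mandatory for every event.
MANDATORY = ("case_id", "activity", "timestamp")

def group_into_traces(events):
    """Return {case_id: [activity, ...]} with each case ordered by timestamp."""
    by_case = defaultdict(list)
    for event in events:
        missing = [key for key in MANDATORY if key not in event]
        if missing:
            raise ValueError(f"event lacks mandatory attribute(s): {missing}")
        by_case[event["case_id"]].append(event)
    return {
        case_id: [e["activity"] for e in sorted(rows, key=lambda e: e["timestamp"])]
        for case_id, rows in by_case.items()
    }

# Made-up events: one row per event, "resource" is optional.
events = [
    {"case_id": "c1", "activity": "register", "timestamp": datetime(2015, 1, 5, 9, 0), "resource": "Ann"},
    {"case_id": "c2", "activity": "register", "timestamp": datetime(2015, 1, 5, 9, 30)},
    {"case_id": "c1", "activity": "decide", "timestamp": datetime(2015, 1, 6, 14, 0), "resource": "Bob"},
    {"case_id": "c1", "activity": "check", "timestamp": datetime(2015, 1, 5, 11, 0), "resource": "Bob"},
]
traces = group_into_traces(events)
print(traces)
```

Grouping rows by case and ordering them by timestamp is what turns a flat table into behavior: each case becomes a sequence of activities that later analyses can work on.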

Process Mining: Spreadsheet for behavior

208 cases, 5987 events, 74 activities

Process Mining: Spreadsheet for behavior

Batching for activities “opstellen eindnota” (prepare final invoice) and “archiveren” (archive)

Process Discovery

[Figure: desire line; slide label: Loesje van der Aalst]

Process Mining: Spreadsheet for behavior

Process discovery: NO modeling needed!
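One simple way to see how a model can be derived from events alone is to count which activity directly follows which within each case. The sketch below builds such a directly-follows graph. It is only one basic discovery technique and not necessarily what the tool shown on the slide uses; the function name `directly_follows` and the sample traces are made up for illustration.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

# Hypothetical traces: one list of activity names per case, in time order.
traces = [
    ["register", "check", "decide"],
    ["register", "check", "recheck", "decide"],
    ["register", "check", "decide"],
]
dfg = directly_follows(traces)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

The resulting edges and frequencies can be drawn as a graph, which gives a first process picture without anyone modeling the process by hand.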

Conformance Checking

[Figure: event data and process model]

Conformance Checking

[Figure labels: desire line, very safe system]

Process Mining: Spreadsheet for behavior

Conformance checking: does the event data match the model (discovered or hand-made)?

Process Mining: Spreadsheet for behavior

Conformance checking: fitness of 93.5%

Process Mining: Spreadsheet for behavior

Conformance checking: final inspection is skipped 40 times

Process Mining: Spreadsheet for behavior

Conformance checking

• move on model (something should have happened, but did not)

• move on log (something happened that should not have happened)
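The two kinds of deviation can be illustrated for the simplest possible case: a model that is just one fixed sequence of activities. The sketch below uses Python's `difflib.SequenceMatcher` to line up the model with an observed trace. This is a strong simplification of real alignment-based conformance checking, which handles models with choices, loops, and concurrency; the function name `classify_moves` and the activity names (other than the skipped final inspection mentioned earlier) are assumptions.

```python
from difflib import SequenceMatcher

def classify_moves(model_sequence, trace):
    """Line up a purely sequential model with a trace and report deviations.

    Returns (moves_on_model, moves_on_log):
    - moves_on_model: activities the model expects but the trace lacks
    - moves_on_log: activities in the trace that the model does not allow there
    """
    moves_on_model, moves_on_log = [], []
    matcher = SequenceMatcher(None, model_sequence, trace, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            moves_on_model.extend(model_sequence[i1:i2])
        if tag in ("insert", "replace"):
            moves_on_log.extend(trace[j1:j2])
    return moves_on_model, moves_on_log

# Hypothetical sequential model and two deviating traces.
model = ["register", "check", "final inspection", "ship"]
skipped = classify_moves(model, ["register", "check", "ship"])
extra = classify_moves(model, ["register", "check", "call customer", "final inspection", "ship"])
print(skipped)
print(extra)
```

Counting such moves over all cases is what yields statements like "final inspection is skipped 40 times".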

Process Mining: Spreadsheet for behavior

Performance analysis: average flow time is 1.92 months

[Figure label: bottleneck]

NO modeling needed!
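Flow time follows directly from the timestamps: for each case, take the time between its first and its last event, then average over all cases. The sketch below assumes events are given as `(case_id, activity, timestamp)` tuples; the function name `flow_times` and the sample data are invented for the example.

```python
from datetime import datetime, timedelta

def flow_times(events):
    """Return {case_id: time between the first and last event of that case}."""
    first, last = {}, {}
    for case_id, _activity, ts in events:
        first[case_id] = min(first.get(case_id, ts), ts)
        last[case_id] = max(last.get(case_id, ts), ts)
    return {case_id: last[case_id] - first[case_id] for case_id in first}

# Made-up events: (case id, activity name, timestamp).
events = [
    ("c1", "register", datetime(2015, 1, 1)),
    ("c1", "decide", datetime(2015, 1, 11)),
    ("c2", "register", datetime(2015, 1, 2)),
    ("c2", "decide", datetime(2015, 1, 22)),
]
per_case = flow_times(events)
average = sum(per_case.values(), timedelta()) / len(per_case)
print(average)
```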

Process Mining: Spreadsheet for behavior

Performance analysis: waiting time of 15.74 days

NO modeling needed!
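Waiting time can be measured when events carry a transaction type, one of the optional attributes listed earlier. The sketch below assumes each event is a `(case_id, activity, transaction, timestamp)` tuple with transaction values `"start"` and `"complete"`, and defines the waiting time before an activity as the gap between the previous completion in the same case and the start of that activity. Both the tuple layout and this definition are assumptions for illustration; tools may define waiting time differently.

```python
from datetime import datetime, timedelta

def waiting_times(events, activity):
    """Gaps between the previous completion in a case and the start of `activity`."""
    waits = []
    last_complete = {}  # case id -> timestamp of the most recent "complete" event
    for case_id, act, transaction, ts in sorted(events, key=lambda e: e[3]):
        if act == activity and transaction == "start" and case_id in last_complete:
            waits.append(ts - last_complete[case_id])
        if transaction == "complete":
            last_complete[case_id] = ts
    return waits

# Made-up events: (case id, activity name, transaction type, timestamp).
events = [
    ("c1", "register", "complete", datetime(2015, 1, 1, 12, 0)),
    ("c2", "register", "complete", datetime(2015, 1, 2, 9, 0)),
    ("c1", "check", "start", datetime(2015, 1, 3, 12, 0)),
    ("c1", "check", "complete", datetime(2015, 1, 3, 15, 0)),
    ("c2", "check", "start", datetime(2015, 1, 6, 9, 0)),
]
waits = waiting_times(events, "check")
average_wait = sum(waits, timedelta()) / len(waits)
print(average_wait)
```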

Process Mining: Spreadsheet for behavior

Animating reality with real cases

NO modeling needed!

Process Mining: Spreadsheet for behavior

Animating reality: 16 cases are queueing

Deviations: Where? Why?

[Figure labels: time, costs]

How to get started?

• Event Data

• Process Mining Tools

• Data Science Mindset

Starting point for process mining: Event data

patient  activity           timestamp          doctor      age  cost
5781     make X-ray         [email protected]  Dr. Jones   45   70.00
5541     blood test         [email protected]  Dr. Scott   61   40.00
5833     blood test         [email protected]  Dr. Scott   24   40.00
5781     blood test         [email protected]  Dr. Scott   45   40.00
5781     CT scan            [email protected]  Dr. Fox     45   1200.00
5833     surgery            [email protected]  Dr. Scott   24   2300.00
5781     handle payment     [email protected]  Carol Hope  45   0.00
5541     radiation therapy  [email protected]  Dr. Jones   61   140.00
5541     radiation therapy  [email protected]  Dr. Jones   61   140.00
…        …                  …                  …           …    …

Column roles: patient = case id; activity = activity name; timestamp = timestamp; doctor = resource; age and cost = other data

How to get started?

• Event Data

• Process Mining Tools

• Data Science Mindset

Process Mining Software

900+ plug-ins available covering the whole process mining spectrum

How to get started?

• Event Data

• Process Mining Tools

• Data Science Mindset

Process Mining: Data Science in Action

43,000 + 25,000 people joined!

Starts again on October 7th, 2015! Register via https://www.coursera.org/course/procmin

Conclusion

http://www.tue.nl/dsce/

Get started today!

[Figure: process mining as a “spreadsheet for behavior”, positioned between data-oriented analysis (data mining, machine learning, business intelligence) and process model analysis (simulation, verification, optimization, gaming, etc.), covering performance-oriented and compliance-oriented questions, problems and solutions]