Representing and Visualizing Mined Artful Processes in MailOfMine

Claudio Di Ciccio, Massimo Mecella, Tiziana Catarci

HCI-KDD @ USAB 2011: Information Quality in e-Health

Graz, November 24th, 2011

Motivation (1)

Artful processes and knowledge workers

• Artful processes [HillEtAl06]

– informal processes typically carried out by those people whose work is mental rather than physical (managers, professors, researchers, engineers, etc.)

• “knowledge workers” [ACTIVE09]

• Knowledge workers create artful processes “on the fly”

• Though artful processes are frequently repeated, they are not exactly reproducible, even by their originators, nor can they be easily shared.

Representing and Visualizing Mined Artful Processes in MailOfMine, Claudio Di Ciccio (DIS, SAPIENZA – Università di Roma)

Motivation (2)

E-mail conversations

• In collaborative contexts, knowledge workers share their information and outcomes with other knowledge workers

– E.g., a software development manager

• Typically, by means of several e-mail conversations

– E-mail conversations are actual traces of running processes that knowledge workers adhere to

Motivation (3)

Processes from e-mail conversations

• From the collection of e-mail messages, you can extract the processes that lie behind them

– Related e-mail conversations are traces of their runs

• Valuable advantages for users

– Automated discovery of formal representations

• with no effort for knowledge workers

– Tidy organization for naïve best practices kept only in mind

– Opportunity to share and compare the knowledge on methodologies

– Automated discovery of bottlenecks, delays, structural defects

• from the analysis of previous runs

• E-mail conversations are a kind of semi-structured text

– this approach is not tied to electronic mail alone

• it can be extended to the analysis of other semi-structured texts

Motivation (4)

Some areas of applicability

• Personal information management (PIM)

– how to organize one’s own activities, contacts, etc. through the usage of software [CatarciEtAl07, ACTIVE09]

• Information warfare

– in supporting anti-crime intelligence agencies

• Enterprise engineering

– for knowledge-heavy industries, where preserving documents making up product data is not enough [SmVortex, Heutelbeck11]

• eHealth

– for the automatic discovery of medical treatment procedures on top of patient health records

The approach

How to represent and infer artful processes

• Representation

– Declarative workflows [vanDerAalstEtAl09] for representing artful processes

– Regular grammars to express declarative workflow constraints

• Mining

– Object Matching [ZardettoEtAl10] for

• clustering e-mail conversations

• finding the matching between activity and task instances

– Regular expression mining [GarofalakisEtAl99] for inferring constraints

– Supervised learning to group activities into processes

– Text mining information extraction to determine tasks out of e-mail messages [CohenEtAl04, SakuraiEtAl05]

Algorithm (1)

From the e-mail archive to key parts

[Diagram: Mail archive → Mail Database → Conversations → Key Parts]

• Multi-format mail storage, plug-in based crawlers

• [ZardettoEtAl10]-based clustering algorithm

• [CarvalhoEtAl04]-based filter
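
The first stage can be illustrated with a minimal Python sketch. It is not the MailOfMine implementation: the slides rely on a [ZardettoEtAl10]-based clustering algorithm and a [CarvalhoEtAl04]-based filter, whereas this sketch swaps in two much simpler stand-ins, namely grouping messages by normalized subject line and dropping quoted reply lines and the signature block. The message shape `(subject, body)` and the names `normalize_subject`, `group_conversations` and `key_part` are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical message shape: (subject, body). Not MailOfMine's data model.
REPLY_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip reply/forward prefixes and case so a thread shares one key."""
    return REPLY_PREFIX.sub("", subject).strip().lower()


def group_conversations(messages):
    """Stand-in for the clustering step: group bodies by normalized subject."""
    conversations = defaultdict(list)
    for subject, body in messages:
        conversations[normalize_subject(subject)].append(body)
    return dict(conversations)


def key_part(body):
    """Stand-in for the filter step: drop quoted lines and the signature."""
    kept = []
    for line in body.splitlines():
        if line.rstrip() == "--":  # conventional signature delimiter
            break
        if line.lstrip().startswith(">"):  # text quoted from an earlier message
            continue
        kept.append(line)
    return "\n".join(kept).strip()
```

Subject-based grouping is only a rough proxy for a conversation: it merges unrelated threads that share a subject and splits threads whose subject was edited, which is the kind of case an object-matching approach is meant to handle better.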

Algorithm (2)

From key parts to the activities

[Diagram labels: Activity indicium; Tasks; Key Parts Concatenation; Activities]

[Diagram steps: [ZardettoEtAl10]-based; [GarofalakisEtAl99]-based pattern miner; [CohenEtAl04, SakuraiEtAl05]-based task extractor]

Algorithm (3)

From activities to the processes

[Diagram labels: [GarofalakisEtAl99]-based pattern miner; Supervised learning; Process indicium; Process]

On the visualization of processes

The imperative model

• Represents the whole process at once

• The most used notation is based on a subclass of Petri Nets (namely, the Workflow Nets)

On the visualization of processes

The declarative model

• Rather than using a procedural language for expressing the allowed sequence of activities, it is based on the description of workflows through the usage of constraints

• the idea is that every task can be performed, except the ones which do not respect such constraints

[Figure: two constraint examples]

• If A is performed, B must be performed, no matter whether before or afterwards (responded existence)

• Whenever B is performed, C must be performed afterwards, and B cannot be repeated until C is done (alternate response)

The notation here is based on [vanDerAalstEtAl06] (DecSerFlow)
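
Since the approach expresses declarative constraints through regular grammars, the two constraints on this slide can be sketched as regular expressions over traces. The following Python sketch is an illustration under stated assumptions, not the MailOfMine encoding: each task is written as a single lowercase character so that a trace becomes a plain string, the two patterns are one possible rendering of the informal definitions above, and the names `CONSTRAINTS` and `violated` are hypothetical.

```python
import re

# Assumption: each task is one character, so a trace is a plain string.
CONSTRAINTS = {
    # If a occurs anywhere, b occurs too, either before or after it.
    "responded_existence(a, b)": re.compile(r"[^a]*|.*a.*b.*|.*b.*a.*"),
    # Every b is followed by a c, with no other b in between.
    "alternate_response(b, c)": re.compile(r"[^b]*(b[^b]*c[^b]*)*"),
}


def violated(trace):
    """Return the names of the constraints that the complete trace breaks."""
    return [
        name
        for name, pattern in CONSTRAINTS.items()
        if pattern.fullmatch(trace) is None
    ]
```

A trace is legal when `violated` returns an empty list. Any task not mentioned by a constraint may appear freely, which reflects the declarative idea that everything is allowed unless a constraint forbids it.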

On the visualization of processes

Imperative vs. declarative

[Figure panels: Imperative; Declarative]

Declarative models work better in the presence of a partial specification of the process scheme

Representation of artful processes

An example of expected outcome

[Figure labels: Existence constraints; Relation constraints; Tasks]

Notation based on [vanDerAalstEtAl06] (DecSerFlow)

On the visualization of processes

An example of DecSerFlow [vanDerAalstEtAl06] notation

[Figure callouts: “No, it is not the initial action”; “You could even start from here”]

• You might want to run a legal trace like this:

⟨ a3, a3, a3, a2, a2, a3, a4, a5, a6, a7, a6, a5 ⟩

• What we want to state here is that such a notation is probably not quite intuitive

On the visualization of processes

Our proposal

• We do not consider a static graph-based global representation alone to be the most suitable solution.

• A graphical representation, easy to understand at first glance, must be used.

• Idea:

– when presenting the process schema (static view):

1) a local view on tasks/activities, showing related constraints only;

2) a global view on the process, either:

a) basic (less information, fewer symbols), or

b) extended (more information, more symbols, extending (a));

– (2) can work as a kind of navigation map for (1)

– when presenting the running instance (dynamic view):

1) a dynamic interactive trace representation diagram, based on the local static view notation.

On the visualization of processes

Introducing the new local view: the rationale

On the visualization of constraints

The static local view: some examples

On the representation of processes

The static global view

[Figure panels: Basic; Extended]

A GUI sketch

Local and global views together

On the representation of constraints

Dynamic view

References

Cited articles and resources, in order of appearance

• [BranderEtAl11] Brander, S., Hinkelmann, K., Hu, B., Martin, A., Riss, U.V., Thönssen, B., Witschel, H.F.: Refining process models through the analysis of informal work practice. In: BPM, LNCS 6896, 116–131. Springer (2011)

• [HillEtAl06] Hill, C., Yates, R., Jones, C., Kogan, S.L.: Beyond predictable workflows: Enhancing productivity in artful business processes. IBM Systems Journal 45(4), 663–682 (2006)

• [ACTIVE09] Warren, P., Kings, N., et al.: Improving knowledge worker productivity - the active integrated approach. BT Technology Journal 26(2), 165–176 (2009)

• [CatarciEtAl07] Catarci, T., Dix, A., Katifori, A., Lepouras, G., Poggi, A.: Task-centred information management. Proc. DELOS Conference, LNCS 4877 (2007)

• [SmVortex] Smart vortex – management and analysis of massive data streams to support large-scale collaborative engineering projects. FP7 IP Project: http://www.smartvortex.eu/

• [Heutelbeck11] Heutelbeck, D.: Preservation of enterprise engineering processes by social collaboration software (2011), personal communication, to appear in Proc. COLLIN2011 - 2nd Symposium on Collective Intelligence

• [vanDerAalstEtAl09] van der Aalst, W.M.P., Pesic, M., Schonenberg, H.: Declarative workflows: Balancing between flexibility and support. Computer Science - R&D 23(2), 99–113 (2009)

• [ZardettoEtAl10] Zardetto, D., Scannapieco, M., Catarci, T.: Effective automated object matching. Proc. ICDE 2010

• [CohenEtAl04] Cohen, W.W., Carvalho, V.R., Mitchell, T.M.: Learning to classify email into “speech acts”. Proc. EMNLP 2004

• [SakuraiEtAl05] Sakurai, S., Suyama, A.: An e-mail analysis method based on text mining techniques. Appl. Soft Comput. 6(1), 62–71 (2005)

• [CarvalhoEtAl04] Carvalho, V.R., Cohen, W.W.: Learning to extract signature and reply lines from email. Proc. CEAS 2004

• [GarofalakisEtAl99] Garofalakis, M.N., Rastogi, R., Shim, K.: SPIRIT: Sequential pattern mining with regular expression constraints. Proc. VLDB 1999

• [vanDerAalstEtAl06] van der Aalst, W.M.P., Pesic, M.: DecSerFlow: Towards a truly declarative service flow language. Proc. WS-FM 2006

• [Pnueli77] Pnueli, A.: The Temporal Logic of Programs. Proc. 18th Annual Symposium on Foundations of Computer Science (FOCS), 1977