

Information support for knowledge-intensive business processes -- combining workflows with document analysis and information retrieval

Andreas Abecker, Ansgar Bernardi, Heiko Maus, Claudia Wenzel

German Research Center for Artificial Intelligence (DFKI) GmbH

P.O. Box 2080, D-67608 Kaiserslautern, Germany

Phone: +49 631 205 3582, Fax: +49 631 205 3210

{aabecker,bernardi,maus,sintek,wenzel}@dfki.uni-kl.de

Abstract

Explicit modeling of business processes and their enactment in workflow systems have proved to be valuable in increasing the efficiency of work in organizations. We argue that enacted business processes, that is, workflow management systems (WfMS), form a solid basis for adequate information support in complex and knowledge-intensive business processes.

To support this claim we demonstrate results from two different projects: The VirtualOffice approach employs workflow context information to support high-precision document analysis and understanding in standard office settings; the combination of workflow context and document analysis techniques allows for the automatic handling of incoming paper mail with respect to the appropriate workflows. The KnowMore approach focuses on the support of people who work on knowledge-intensive tasks by automatic delivery of relevant and goal-specific information; the context of the workflow, an extended process model, and a detailed modeling of information sources are combined to this end. Both approaches show ways to proceed from workflow systems towards active knowledge management.

Introduction

Workflow management is a widespread technology for automating structured business processes. Today it is mainly used to coordinate complex processes where many activities must be scheduled and dispatched among many possible agents. Further support comes from an integrated handling of application programs used in the process chain and a streamlined passing of application data and electronic documents flowing between different process steps.

As today’s complex business processes rely on intensive information exchange with the company’s environment, they are document-driven by nature: Employees deal with and react to information and knowledge transferred by and embedded in all kinds of documents, including forms, letters, books, manuals, and records, either electronic or paper-based. Which knowledge and information needs to be applied to perform the activities in a given process step is typically not represented in the WfMS. Nevertheless, the material is accessible somewhere in the company’s information space, enabling the employees to do their jobs. Consequently, one would like the WfMS to automatically offer access to relevant knowledge sources, or even to directly ’pump’ information items extracted from incoming documents to the appropriate places in the data models of the actual workflow instance. This vision requires an extensive exchange of information items and suitable semantic annotations between workflow, application, and information space. To realize this, the WfMS should possess interfaces for exchanging knowledge items with the surrounding support environment, able to bridge between different conceptualizations and data models.

Copyright © 2000, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Unfortunately, such considerations are not addressed by current WfMS approaches ((Georgakopoulos, Hornick, & Sheth 1995), (Alonso et al. 1997)). Contemporary systems exhibit only a very thin interface for data exchange with other office applications such as text processing applications or corporate databases.

In order to overcome this limitation, this paper presents two different solutions which describe how a true knowledge transfer between business processes and their surrounding information space can be established. Both approaches focus on process-embedded information delivery from documents and fit into a common description frame which is sketched in Figure 1.

The WfMS represents running business processes by workflow instances and executes them by means of a workflow engine. The second system is the mediator system, which we call the information provider.

The information provider gets an information request and some additional context information from the WfMS. It accesses the documents by some kind of document index. This index may consist of an inverted index file as required for information retrieval tasks, but it might be any more sophisticated document model as well. The retrieved information is handed over to the WfMS. So the interface between the two systems comprises three different kinds of information: the information request, the context, and the retrieved information.
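The interface of the description frame can be pictured as a small data model. The following Python sketch is illustrative only: the names `InformationRequest`, `InformationProvider`, and `answer` are assumptions introduced here, the inverted index is merely the simplest document index the text allows, and treating context values as extra search terms is one possible use of context, not the method of either project.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InformationRequest:
    """What the WfMS sends: the request itself plus workflow context."""
    query_terms: list[str]
    context: dict[str, str] = field(default_factory=dict)


class InformationProvider:
    """Hypothetical mediator between the WfMS and the documents."""

    def __init__(self, documents: dict[str, str]):
        self.documents = documents
        # Inverted index: term -> ids of the documents containing it.
        self.index: dict[str, set[str]] = defaultdict(set)
        for doc_id, text in documents.items():
            for term in text.lower().split():
                self.index[term].add(doc_id)

    def answer(self, request: InformationRequest) -> list[str]:
        """Return ids of matching documents, best match first.

        Assumption: context values simply act as additional search terms.
        """
        terms = [t.lower() for t in request.query_terms]
        terms += [w for v in request.context.values() for w in v.lower().split()]
        hits: dict[str, int] = defaultdict(int)
        for term in terms:
            for doc_id in self.index.get(term, ()):
                hits[doc_id] += 1
        # More matching terms first; ties broken by document id.
        return sorted(hits, key=lambda d: (-hits[d], d))


provider = InformationProvider({
    "d1": "invoice from smartlab for printer toner",
    "d2": "order confirmation from smartlab",
    "d3": "travel expense form",
})
request = InformationRequest(["invoice"], {"sender": "smartlab"})
result = provider.answer(request)
```

Here the request, the context, and the returned list correspond to the three kinds of information exchanged between the two systems.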

Central to all our considerations is the notion of context. Since the context required highly depends on the actual information request, the common description frame does not answer the questions of where the necessary context comes from and what it consists of. We argue that context delivery may require different system architectures. This statement will be clarified by introducing two different solutions instantiating the general description frame: The VirtualOffice project (see also (Wenzel 1998)) integrates paper-based information into any kind of workflow activity, while the KnowMore project (see also (Abecker et al. 1998)) aims at supporting so-called knowledge-intensive tasks by pro-active document delivery. After a review of some related work, we conclude with some general insights on possible solutions to the questions of how semantically enriched interfaces can be established and how context information can be dealt with practically.

From: AAAI Technical Report SS-00-03. Compilation copyright © 2000, AAAI (www.aaai.org). All rights reserved.

[Figure 1 diagram; recoverable labels: workflow management system, workflow instances, workflow engine, information provider, documents, information, context]

Figure 1: Information Support for Workflows: A Common Description Frame

The VirtualOffice Scenario: Information Support by Paper Documents

Although the paperless office has been a buzzword for many years now, it still has not come into reach. On the contrary, the enactment of business processes by administrative workflows has even complicated the integration of paper documents since such workflows require electronic representations of all documents involved.

Such workflows are characterized by heterogeneous documents which belong to one common process and arrive in a chronological order. A typical example is an insurance process with initial applications for contracts, changes in the policies, annual invoices, and damage claims. Another example is a business trip, where the traveller has to fill out an application, the application must be confirmed, some invoices, e.g., for plane tickets, have to be paid in advance and, finally, several receipts must be accounted for. Certainly, there are a lot of similar examples, but within this paper we will use a purchasing process within a company as a basis for our examples.

The VirtualOffice project tackles the above-mentioned problem of integrating paper-based information into workflows. In order to achieve a solution, a system for document analysis and understanding (DAU) is used as a workflow application which even benefits from this integration: As stated in (Baumann et al. 1997a), context information from business processes improves the analysis of business letters and is therefore a crucial point of our research.

In order to give an impression of how document analysis and understanding works, Figure 2 shows typical analysis steps and results. Usually, DAU systems are divided into two parts: The analysis front-end of the system works as a provider of uniform information about the characters appearing in a scanned document along with additional logical attributes. This information is interpreted by the system’s document understanding components, which form the back-end.

The analysis of a scanned document starts with low-level image preprocessing such as skew angle adjustment and upside-down detection. Afterwards, segmentation divides the document into geometrically connected components and identifies segments of characters, words, lines, and blocks. Then, text recognition explores the captured text segments, generates character hypotheses, and merges them into word hypotheses. Structure classification takes this given geometric structure to hypothesize the so-called logical objects of a document, e.g., title, author, chapter, etc.

Within document understanding, the generated word hypotheses are validated by dictionary look-up in the text-based preprocessing step. Now, the content-based part of analysis is invoked. First, the message type of a document has to be derived (e.g., confirmation of order by supplier xyz). This information is used to start a more in-depth analysis for the extraction of reference data, products, the date of the letter, and so on. For more information on DAU, please refer to (Baumann et al. 1997b).
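The dictionary look-up step can be made concrete with a toy example. This is a minimal sketch under assumptions made here, not the DAU system's actual procedure: each word position carries a list of `(word, confidence)` candidates, and the function `validate_word_hypotheses` (a hypothetical name) keeps the most confident candidate that occurs in the dictionary.

```python
def validate_word_hypotheses(hypotheses, dictionary):
    """Pick, for each word position, the best candidate found in the dictionary.

    hypotheses: list of candidate lists; each candidate is (word, confidence).
    Returns a list of words, with None where no candidate could be validated.
    """
    validated = []
    for candidates in hypotheses:
        # Keep only candidates that are dictionary words.
        known = [(conf, word) for word, conf in candidates
                 if word.lower() in dictionary]
        # max() compares confidence first, so the most confident one wins.
        validated.append(max(known)[1] if known else None)
    return validated


dictionary = {"invoice", "number", "order"}
ocr_output = [
    [("lnvoice", 0.61), ("invoice", 0.58)],  # recognizer confused "i" and "l"
    [("nurnber", 0.70), ("number", 0.66)],   # "rn" read instead of "m"
    [("Xq7", 0.40)],                         # no dictionary word available
]
words = validate_word_hypotheses(ocr_output, dictionary)
```

Positions that cannot be validated are left open rather than guessed, so that later content-based steps can decide how to treat them.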

The basic scenario of our integration of DAU into WfMSs fits well into the general architecture presented in Figure 1: The DAU system represents an information provider and a company’s mail box represents the collection of documents involved. The task which is requested by a workflow instance represents an information need of this particular instance. In order to support DAU in answering this request, the workflow instance transfers context information to the DAU system. Having finished the task, the DAU system hands over the data requested and consequently satisfies the information need of the workflow instance.

[Figure 2 diagram; recoverable column titles: Input, Analysis Steps, Results of Document Analysis, Results of Document Understanding. The DAU system is shown with a front-end and a back-end.]

Figure 2: Document Analysis and Understanding: Steps and Results.

Within VirtualOffice, two different kinds of tasks are distinguished: The process identification task is stated whenever workflows inform DAU that they are waiting for particular documents. In this case, DAU analyses incoming documents in order to determine the correct workflow instance to which the document has to be assigned. This scenario is discussed in the following section and shown in Figure 3. The second task, which we call information extraction, is stated when workflows request information contained in particular documents. Then, DAU analyses the document in order to deliver the information requested. This scenario is the subject of section 2.2 and displayed in Figure 4.

Solving the Task of Process Identification

Process identification, i.e., assigning an incoming document to the corresponding workflow instance, is basically a match of information contained in documents with the corresponding data available in workflow instances. To achieve this, the relevant data in workflow instances have to be collected for two different reasons:

First of all, they build one kind of input for the instance match. Imagine there are a lot of open instances and a new document has to be assigned to them. In this case, the more data we have about the instances and the expected document (e.g., document type, sender, products mentioned, and even possible references to other events within the same instance), the more accurately the match can be accomplished. On the other hand, these same data specify the analysis task for DAU since they define which items DAU has to search for.

The resulting scenario is an instantiation of our general scenario and shown in Figure 3. In this scenario, several activities within different workflow instances state the current context of these instances whenever new context is available. All these workflow context data are transferred to the context pool, a database available outside the WfMS.

There, the context is stored until an event occurs in a workflow instance which implies a response by a document (shown by hatched oval activities), i.e., the workflow has an information need which will be satisfied by an incoming document related to the event or workflow. This information need is stated by handing over the request of "process identification" to the DAU system and by formulating its context by means of a so-called expectation. An expectation describes content and meaning of the expected document, additional information needs such as a list of data which has to be extracted, and some administrative data in order to identify the workflow after a process determination was successful. In order to allow the DAU to interpret the expectation, it is connected to the DAU’s document ontology, which describes structure and content of the documents of the domain under consideration. Such an expectation is generated by using the context pool as input for a rule interpreter. The rule interpreter uses transformation rules which relate the context information of the context pool to possible contents of an expected document.

Thus, these rules transform the data scheme used in the context pool into the DAU domain ontology. For instance, a rule states that the receiver of the (already known) order is the sender of the expected corresponding invoice.
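The order/invoice rule just mentioned can be written down as a simple mapping from context-pool keys to slots of the expected document. The sketch below is only an illustration: the key names (`order.receiver`, `order.products`), the slot names, and the function `generate_expectation` are assumptions made here, and a real rule interpreter would support far richer rules than one-to-one renaming.

```python
# Context pool entries for one workflow instance (hypothetical field names).
context_pool = {
    "instance_id": "wf-17",
    "order.receiver": "SmartLab",
    "order.products": ["toner", "paper"],
}

# Each rule maps a context-pool key to a slot of the expected document.
# The first rule encodes the example from the text: the receiver of the
# order is the sender of the expected invoice.
TRANSFORMATION_RULES = [
    ("order.receiver", "sender"),
    ("order.products", "products"),
]


def generate_expectation(context, rules, message_type):
    """Build an expectation in (assumed) document-ontology terms."""
    expectation = {
        "message_type": message_type,
        "instance_id": context["instance_id"],  # administrative data
    }
    for source_key, target_slot in rules:
        if source_key in context:
            expectation[target_slot] = context[source_key]
    return expectation


expectation = generate_expectation(context_pool, TRANSFORMATION_RULES, "invoice")
```

Keeping the rules as data rather than code mirrors the idea that they are specified between the two systems and can change without touching either one.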

After generation, the expectation is stored at the information provider’s (namely the DAU system’s) side in a so-called expectation set. Since the expectation set stores all expectations of all processes, it contains the whole workflow context relevant from the DAU point of view. Access to the expectation set is triggered by any incoming document for which DAU has to perform a process identification. Note that the identification of the corresponding process takes all expectations of all workflow instances into account. Consequently, this task can be seen as inherent to the business letter domain. The result of the process identification task is a unique process identifier and, of course, the name of the document which is assigned to the process. The expectation of the corresponding instance is deleted after the process identification has been verified within the instance.

[Figure 3 diagram; recoverable labels: workflow management system, workflow instances, workflow engine, document analysis and understanding, process identifier, context collection, transformation, incoming business letter]

Figure 3: Information Support by Process Identification for Paper Documents.

This ends the process identification scenario. Data kept within the context pool are deleted when the corresponding process instance has come to an end.
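Matching an analysed document against all open expectations can be sketched as a scoring problem. The following is a deliberately simple stand-in, not the matching procedure used in VirtualOffice: the slot names, the scoring by fraction of confirmed slots, and the `threshold` parameter are all assumptions made for illustration.

```python
def match_score(expectation, analysis_result):
    """Fraction of expected slots that the analysed document confirms."""
    slots = [k for k in expectation if k != "instance_id"]
    if not slots:
        return 0.0
    confirmed = sum(1 for k in slots if analysis_result.get(k) == expectation[k])
    return confirmed / len(slots)


def identify_process(expectation_set, analysis_result, threshold=0.6):
    """Return the instance id of the best-matching expectation, or None."""
    best = max(expectation_set,
               key=lambda e: match_score(e, analysis_result),
               default=None)
    if best is None or match_score(best, analysis_result) < threshold:
        return None  # no expectation fits well enough
    return best["instance_id"]


# All expectations of all workflow instances are considered together.
expectation_set = [
    {"instance_id": "wf-17", "message_type": "invoice", "sender": "SmartLab"},
    {"instance_id": "wf-23", "message_type": "invoice", "sender": "PaperCo"},
    {"instance_id": "wf-31", "message_type": "confirmation", "sender": "SmartLab"},
]
letter = {"message_type": "invoice", "sender": "PaperCo"}
assigned = identify_process(expectation_set, letter)
```

Returning `None` below the threshold leaves room for manual assignment when no open instance plausibly waits for the document.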

Solving the Task of Information Extraction

Information support by incoming paper documents is not complete by only handing over the document to the corresponding workflow instance. Instead, the relevant information contained in the document should be transferred into an electronic representation. This is formulated by a second task called information extraction. The time for stating such an information extraction task is not restricted to process identification only. On the contrary, a workflow can also state a separate information extraction task after a document has already been assigned.

Figure 4 provides a conceptual view of the information extraction scenario: In this context, only one workflow instance is of further interest. Within this instance, the context pool is still updated after a process identification request whenever new context is available. If an activity states an information extraction request, a new expectation is generated as described above. The request now includes a list of identifiers according to the business letter ontology which have to be extracted from the attached business letter. In contrast to our process identification scenario, the DAU now has not only the raw paper image at its disposal, but it can also access previous analysis results for this document image such as OCR and process identification results. This is accomplished by using an administrative document index.

Implementation Issues

One requirement of the VirtualOffice project was to use commercial WfMSs. Therefore, we have to face some conceptual restrictions, e.g., it is impossible to extend the workflow model in order to provide the necessary context information. Thus, our solution uses the interfaces as provided by WfMSs and adds the concept of workflow context data. This allows us to incorporate any WfMS which is based on the WfMC standard for Interfaces 2/3 (Workflow Management Coalition 1998).

In order to extend a conventional workflow so that it can obtain automatic information support from paper documents, the following working steps have to be accomplished: At the workflow side, the collection of context has to be modeled at build time of the workflow by including some actions which retrieve workflow context data and store them in the context pool. Furthermore, the workflow has to be extended with activities which state expectations and handle documents after their assignment. At the information provider’s side, the ontology for the specification of DAU tasks and the domain ontology for documents have to be made accessible. Furthermore, DAU techniques need an interface to use information stored in the expectation set. Between the two systems, a set of transformation rules has to be specified.

[Figure 4 diagram; recoverable labels: workflow management system, workflow instances, document analysis and understanding, context update, transformation, attached business letter, document index]

Figure 4: Information Support by Information Extraction from Paper Documents.

The KnowMore Scenario: Active Information Support by Workflow Integration

The VirtualOffice scenario emphasized the use of workflow-context information for improved document analysis and information extraction. Our second scenario has a different focus: The KnowMore approach aims at supporting a person working on some knowledge-intensive task by actively, i.e., without an explicit, detailed request by the employee, delivering context-sensitive and relevant information.

The KnowMore Scenario in an Example

To illustrate this approach we consider a snippet from a rather simple process: managing a contact with a potential customer at our research institute. After the initial contact (e.g., a telephone call), the relevant topics of interest are identified (for instance, specific technologies or tools potentially useful for the customer, or former projects dealing with issues similar to the customer’s problems) and appropriate information material is selected (for example, a technology whitepaper, a brochure about a specific tool, or a project flyer). Having done this, a nice information package can be sent to the potential customer, whose reaction will then determine the further steps (e.g., arranging a meeting, or terminating the process due to lack of interest). Most of these activities can be considered knowledge-intensive. It is easy to imagine useful support for these activities: When selecting the information material to be sent, active suggestions from the system would be helpful, provided that the system takes into account the information from the activities done so far, e.g., the selected topics. An automatic, context-aware archiving of the results is useful when a similar process is started at a different time and/or location: The initial contact will then profit from information about earlier contacts with the same company or about similar cases.

Figure 5 shows a screenshot of our experimental system prototype. On the left, in the background, we see an editor window of the workflow application used to indicate relevant information material. To do this it is necessary to fill the text fields in the input mask. The KnowMore system provides support in the following way:

When the workflow engine starts this activity, the system takes the information needs associated with the activity and finds out whether some element of the organizational memory (OM) is relevant to this task (i.e., whether there is some material which is relevant with respect to the topics identified). This suggested decision value is inserted in the user input mask, offering a proposed solution (in the example, the suggestion comprises documents about the DFKI, a paper about corporate memories, and some material on the ESB system, which is an example of an organizational memory). The suggested information elements are ordered according to their relevance computed by the retrieval function as well as to a predefined order based upon their information type. The information elements are then offered to the user as hyperlinks in the KnowMore information browser (we use standard web browsers with Java applets for the user interface).

The user is free to accept or dismiss the suggestions or to select different material according to personal knowledge. Whatever the choice, the system keeps track of the solutions and the workflow, and records the results automatically together with the relevant context information. If, sometime later, further material is to be selected, the system remembers the results of the earlier steps and modifies the suggestions accordingly. Thus the system actively supports the user by providing context-specific relevant information.
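The ordering of suggestions by information type and relevance can be expressed as a two-part sort key. The text does not say how the two criteria are combined, so this sketch assumes that the predefined type order takes precedence and relevance breaks ties within a type; the type names, relevance values, and titles are made-up example data.

```python
# Predefined order of information types (an assumed example ordering).
TYPE_ORDER = {"brochure": 0, "whitepaper": 1, "paper": 2}


def order_suggestions(items):
    """Sort by information type first, then by descending relevance."""
    return sorted(
        items,
        key=lambda it: (
            TYPE_ORDER.get(it["type"], len(TYPE_ORDER)),  # unknown types last
            -it["relevance"],
        ),
    )


suggestions = order_suggestions([
    {"title": "Corporate memories", "type": "paper", "relevance": 0.9},
    {"title": "DFKI overview", "type": "brochure", "relevance": 0.6},
    {"title": "ESB system", "type": "whitepaper", "relevance": 0.8},
    {"title": "DFKI projects", "type": "brochure", "relevance": 0.7},
])
titles = [s["title"] for s in suggestions]
```

Swapping the two components of the key would instead rank primarily by relevance and use the type order only to break ties.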

Solving the Task of Active and Context-Aware Information Supply

In order to provide the services described, the KnowMore approach combines an extended workflow model with sophisticated information agents. As illustrated in Figure 6, the workflow actively asks to retrieve relevant documents. To this end, specific modeling activities are needed at process definition time, which facilitate the adequate enactment at runtime.

Process Definition Time

Model business processes. Model the overall business process with a conventional BPM or workflow tool.

Extended modeling of knowledge-intensive tasks. The knowledge-intensive tasks (KIT) among the activities of a particular process require special attention. In order to facilitate the intended active information support, the KIT representation -- illustrated in Figure 6 as a hatched square -- extends the conventional description of a workflow activity by a support specification describing the respective information needs as generic queries or query schemes, together with the information agent responsible for their processing. At runtime, the agent’s processing of the instantiated queries will deliver relevant information which helps to execute the activity in question.

Figure 5: Active Support by Suggesting Relevant Information Material.

Attach extended information flow. In order to instantiate the query schemata at runtime, thus exploiting situation-specific knowledge and context parameters, the retrieval process must have access to the workflow parameters. These parameters are represented in variables which are handled by the workflow environment. As they typically go beyond what is modeled in a conventional workflow specification, we talk about KIT variables, although there is no conceptual difference between them and conventional workflow variables. They simply describe the information flow between tasks in the workflow, and are the communication channel between workflow and information retrieval agents. In Figure 6, hatched triangles indicate additional KIT variables.

To enable the necessary reasoning for intelligent retrieval, the KIT variables must be embedded into a domain ontology (essentially, this means that their values must be of a type defined as an ontology concept).

In summary, the KIT variables partially model the data flow in the process and represent the relevant context of the knowledge-intensive activities at runtime.
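A KIT description with query schemes that are instantiated from KIT variables might look as follows. This is a sketch under assumptions made here: the class `KITDescription`, its fields, the agent name, and the use of `$variable` placeholders from Python's `string.Template` are illustrative choices, not the representation used in the KnowMore prototype.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class KITDescription:
    """Hypothetical support specification attached to a workflow activity."""
    activity: str
    query_schemes: list[str]  # generic queries with $variable placeholders
    agent: str                # name of the responsible information agent


def instantiate_queries(kit, kit_variables):
    """Fill the query schemes with the current values of the KIT variables."""
    # substitute() raises KeyError if a referenced variable has no value yet.
    return [Template(q).substitute(kit_variables) for q in kit.query_schemes]


kit = KITDescription(
    activity="select information material",
    query_schemes=["material about $topic", "earlier contacts with $company"],
    agent="material-retrieval-agent",
)
queries = instantiate_queries(
    kit, {"topic": "organizational memory", "company": "ACME"}
)
```

The query schemes are fixed at process definition time, while the variable values only become available at runtime, which is the division of labour the text describes.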

Model information sources. The information which is actively offered to support particular tasks is taken from a variety of knowledge sources available in an organizational memory. These sources are of different nature, resulting in different structures, access methods, and contents. To enable precise-content retrieval from the heterogeneous sources, a powerful representation scheme for uniform knowledge description is needed. To this end, structure and metadata, information content, and information context are modeled on the basis of formal ontologies. The document index in Figure 6 is thus realized as a set of descriptions modeling the information sources and facilitating an ontology-based access and retrieval.

Process Enactment Time

Enact knowledge service processes. Whenever a knowledge-intensive activity is reached during workflow enactment, the workflow engine not only starts this activity, but in addition initiates an appropriate information agent (which is specified in the KIT description).

The information agent performs the actual retrieval of relevant information from the information sources. It relies on domain knowledge -- available in a domain ontology -- to realize an extended, ontology-based information retrieval, and utilizes the context information from the ongoing workflow -- found in the instantiated KIT variables -- in order to determine relevant information. Traversing the formal ontologies according to specified search heuristics (Liao et al. 1999), the information agent is able to extend and refine the given queries and to reason about the relevance of available information items.
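Extending a query by traversing an ontology can be illustrated with a depth-limited walk over a concept hierarchy. The sketch below does not reproduce the search heuristics of (Liao et al. 1999); the breadth-first strategy, the `max_depth` parameter, and the toy ontology are assumptions chosen only to show the general idea.

```python
from collections import deque

# Toy domain ontology: concept -> more specific concepts (assumed data).
ONTOLOGY = {
    "knowledge management": ["organizational memory", "workflow"],
    "organizational memory": ["corporate memory"],
    "workflow": [],
    "corporate memory": [],
}


def expand_query(concept, ontology, max_depth=1):
    """Breadth-first expansion of a query concept to its sub-concepts."""
    expanded = [concept]
    queue = deque([(concept, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not descend below the depth limit
        for child in ontology.get(current, []):
            if child not in expanded:
                expanded.append(child)
                queue.append((child, depth + 1))
    return expanded


narrow = expand_query("knowledge management", ONTOLOGY, max_depth=1)
wide = expand_query("knowledge management", ONTOLOGY, max_depth=2)
```

A larger depth limit trades precision for recall, which is one simple way an agent could extend or refine a query depending on how many results the first attempt yields.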

Realize context-aware information storage. Whenever a workflow activity results in the creation of an information item worth preserving, an appropriate information agent takes into account the current context of the process, available via KIT variables and access to the workflow control data. Thus the information is automatically linked to its creation context and can be retrieved accordingly, should the need arise.

[Figure 6 labels: workflow management system, workflow instances, workflow engine, KIT descriptions, context variables, context, "retrieve documents", information retrieval, document index, documents]

Figure 6: Information Support by Information Retrieval for Knowledge Intensive Tasks.
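As a minimal sketch of context-aware storage, an information item can be stored together with a snapshot of its creation context and later retrieved by that context. The classes `StoredItem` and `OrganizationalMemory` and the example keys are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredItem:
    content: str
    context: Dict[str, str]  # snapshot of KIT variables and workflow control data


@dataclass
class OrganizationalMemory:
    items: List[StoredItem] = field(default_factory=list)

    def store(self, content: str, kit_variables: Dict[str, str],
              control_data: Dict[str, str]) -> StoredItem:
        # Merge into a new dict so later workflow changes do not alter the record.
        context = {**control_data, **kit_variables}
        item = StoredItem(content, context)
        self.items.append(item)
        return item

    def find(self, **query: str) -> List[StoredItem]:
        """Return items whose creation context matches every key/value given."""
        return [item for item in self.items
                if all(item.context.get(k) == v for k, v in query.items())]


memory = OrganizationalMemory()
memory.store("Notes on delivery delays",
             kit_variables={"product": "Printer"},
             control_data={"process": "purchase", "activity": "select_supplier"})
hits = memory.find(activity="select_supplier", product="Printer")
```

Because the context is copied at storage time, the item stays linked to the state of the process in which it was created.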

Implementation Issues

The concepts described have been implemented taking into account the standard architecture of a WfMS as promoted by the Workflow Management Coalition (Workflow Management Coalition 1999). The KIT descriptions are represented by sets of workflow variables and thus handled by the workflow engine. The worklist handler is responsible for invoking both the relevant activities and the appropriate information agents specified in the KIT descriptions. Although the KnowMore prototype uses a self-developed workflow engine, this approach is expected to be compatible with a large number of existing and commercially available workflow implementations.
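The division of labor described above can be sketched as follows. The variable names `kit.agent` and `kit.var.*`, the agent registry, and the function `start_work_item` are assumptions made for this illustration. They are neither part of the WfMC interfaces nor of the KnowMore prototype.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, str]


def supplier_agent(context: Context) -> List[str]:
    """Hypothetical information agent; a real one would query the memory."""
    return [f"supplier evaluations for {context['product']}"]


# Hypothetical registry: agent name used in a KIT description -> agent.
AGENTS: Dict[str, Callable[[Context], List[str]]] = {"supplier_agent": supplier_agent}

KIT_AGENT = "kit.agent"      # assumed workflow variable naming the agent
KIT_VAR_PREFIX = "kit.var."  # assumed prefix marking context variables


def start_work_item(activity: Callable[[Context], str],
                    variables: Context) -> Tuple[str, List[str]]:
    """Start the activity and, if a KIT description exists, its information agent."""
    offered: List[str] = []
    agent_name = variables.get(KIT_AGENT)
    if agent_name is not None:
        # The agent only sees the context variables, without the prefix.
        context = {name[len(KIT_VAR_PREFIX):]: value
                   for name, value in variables.items()
                   if name.startswith(KIT_VAR_PREFIX)}
        offered = AGENTS[agent_name](context)
    return activity(variables), offered


def select_supplier(variables: Context) -> str:
    return "activity 'select supplier' started"


result, offered = start_work_item(
    select_supplier,
    {"kit.agent": "supplier_agent", "kit.var.product": "Printer", "order_no": "17"},
)
```

Since the KIT description lives in ordinary workflow variables, an activity without such variables is started unchanged and no agent is invoked.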

Related Work

Document analysis and understanding has reached a point where the resulting systems are successfully put on the market. But these DAU systems are mostly standalone applications with only one task, e.g., extracting all data from a form. Furthermore, the context used is static, e.g., given by predefined keywords. Up to now, the use of dynamic context as provided by business processes is still a unique feature of the VirtualOffice project (Wenzel 1998). Although commercial solutions provide an integration of document analysis and imaging into WfMSs (e.g., FormsRec AIDA solution from ICR¹), they only offer an isolated analysis of documents, i.e., they do not consider any open workflow instances waiting for a particular document. For example, the COI IntelliDoc solution² classifies documents according to a set of predefined keywords. Afterwards, the documents are routed to corresponding WfMS worklists to which the keywords are assigned.

¹ http://www.icr.de
² http://www.coi.de

Concerning Organizational Memory based support for running workflows, there are two interesting related projects. (Staab & Schnurr 1999) present a project conducted at the University of Karlsruhe which is very close in spirit to the KnowMore approach. Compared to our system, they explore in more depth the inferential power of ontology-based retrieval on top of the Ontobroker software (Fensel et al. 1998). They introduce the notion of context-based views for coupling workflow and retrieval, which is the analogue to our information needs. The approach presented seems to be more developed than ours in some respects, for instance concerning XML-based document representations or a deeper conceptual integration of workflow and service process. However, they make strong assumptions about the underlying ontologically annotated Intranet information sources, and it is not completely clear how far their implementation has progressed. Another very interesting approach is the one by (Kaathoven et al. 1999), who describe the mechanism of knowledge flow chunks for synchronizing a conventional WfMS scheduling activities with a concurrently running OM-based task support system assisting in enacting specific activities. Their general considerations about conventional WfMS machinery and external knowledge services complement our interfacing ideas. However, their work also seems to be still at the conceptual level.

Conclusion

We demonstrated that the combination of business processes and their enactment in WfMSs with the information space of an organizational memory can lead to effective user support. Two different approaches have been presented to illustrate this claim:

The VirtualOffice project basically aims at improving the company's information logistics by a smoother transfer of information contained in paper documents into the electronic workflows. Thus its service can be applied virtually throughout the whole process and in each process which has interfaces to the external environment of the company where media breaks occur. It supports the business process object level where paper documents are an integral element.

The KnowMore project is not so much focused on the object-level optimization of whole and arbitrary workflows. Instead, it concentrates on knowledge-intensive core processes of a business, and knowledge-intensive tasks at the heart of those processes. The processes themselves are not improved; instead, a meta-level is added, an additional knowledge-service process which helps the user working on her tasks.

Both examples emphasize the use of the business process models and their enactment in the WfMS as an indispensable source of context information. Furthermore, we observe that successful coupling of workflow and knowledge-intensive application requires some formal semantics as the basis of communication. That is, we need a mapping from the context information modeled in the workflow to the ontology of the information provider. The examples illustrate two principal approaches to this end:

• When the workflow modeling is extended especially with respect to the support of knowledge-intensive activities, it might be possible to consider the available ontology already at process modeling time. The resulting KIT representation is then well-grounded in the ontology of the information provider.

• On the other hand, if the already-defined workflow is to be left untouched, it is possible to add the necessary semantics by defining transformation rules which map the terms already used in the workflow model to the ontology of the information provider.
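The second approach, transformation rules on top of an untouched workflow model, might look like the following minimal sketch. The rule table `RULES`, the workflow variable names, and the function `to_ontology` are hypothetical examples, not rules from either project.

```python
from typing import Callable, Dict, Tuple

# Hypothetical rules: workflow variable name -> (ontology concept, value normalizer).
RULES: Dict[str, Tuple[str, Callable[[str], str]]] = {
    "cust_no": ("Customer", str.strip),
    "art": ("Product", str.lower),
}


def to_ontology(workflow_data: Dict[str, str]) -> Dict[str, str]:
    """Map workflow terms to ontology concepts; terms without a rule are skipped."""
    mapped: Dict[str, str] = {}
    for name, value in workflow_data.items():
        rule = RULES.get(name)
        if rule is None:
            continue  # no rule: the term stays invisible to the information provider
        concept, normalize = rule
        mapped[concept] = normalize(value)
    return mapped


context = to_ontology({"cust_no": " 4711 ", "art": "Printer", "internal_flag": "x"})
```

The workflow definition itself remains unchanged. Only the rule table needs to be maintained when the workflow vocabulary or the provider's ontology evolves.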

To effectively access the workflow context information, it seems useful to treat the workflow instances as first-class citizens in the information world, as demonstrated:

• The explicit modeling of context information in KIT variables extends the data flow model in the workflow. This is suitable if the workflow engine controls when and what is transferred outside.

• The handling of a separate context pool is suitable to deal with situations where a large number of workflow instances provide volatile context information which is required by activities outside of the workflow's influence. The continuous collection of context information and its organization according to the needs of the supporting application (e.g., the DAU component) even makes it possible to handle context information which the WfMS only creates temporarily and which otherwise might not be available when needed.
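The idea of a separate context pool can be illustrated by a minimal sketch in which workflow instances publish what they are waiting for, and a document analysis component matches extracted fields against the pool. The classes `ContextUnit` and `ContextPool`, the field names, and the simple field-counting match are assumptions for illustration, not the VirtualOffice implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ContextUnit:
    instance_id: str
    expected: Dict[str, str]  # e.g., document type, sender, reference number


class ContextPool:
    def __init__(self) -> None:
        self._units: Dict[str, ContextUnit] = {}

    def publish(self, unit: ContextUnit) -> None:
        """Called when a workflow instance starts waiting for a document."""
        self._units[unit.instance_id] = unit

    def withdraw(self, instance_id: str) -> None:
        """Called when the instance no longer waits, e.g., after a match."""
        self._units.pop(instance_id, None)

    def match(self, extracted: Dict[str, str]) -> Optional[str]:
        """Return the waiting instance with the most agreeing fields, if any agree."""
        best_id, best_score = None, 0
        for unit in self._units.values():
            score = sum(1 for key, value in unit.expected.items()
                        if extracted.get(key) == value)
            if score > best_score:
                best_id, best_score = unit.instance_id, score
        return best_id


pool = ContextPool()
pool.publish(ContextUnit("wf-1", {"type": "invoice", "sender": "ACME", "order_no": "17"}))
pool.publish(ContextUnit("wf-2", {"type": "invoice", "sender": "Globex", "order_no": "23"}))
target = pool.match({"type": "invoice", "sender": "Globex", "order_no": "23"})
```

Because the pool is maintained outside the workflow engine, the matching component can consult it at any time, even for context that the WfMS itself holds only temporarily.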

In summary, the extension of the workflow scenario is a suitable way to provide active and extended services and to transform available information into process-oriented actionable knowledge. Some of the sketched solutions still seem somewhat clumsy. However, our observations indicate directions for further work, which might result in suitable adaptations of future WfMSs in order to ease the utilization of workflow context information.

Acknowledgement The two projects described have been funded by the German Federal Ministry for Education and Research (Bundesministerium für Bildung und Forschung, bmb+f) under grants 01 IW 804 (KnowMore) and 01 IW 807 (VirtualOffice), respectively.

References

Abecker, A.; Bernardi, A.; Hinkelmann, K.; Kühn, O.; and Sintek, M. 1998. Toward a technology for organizational memories. IEEE Intelligent Systems.

Alonso, G.; Agrawal, D.; Abbadi, A. E.; and Mohan, C. 1997. Functionality and limitations of current workflow management systems. IEEE Expert.

Baumann, S.; Malburg, M.; auf'm Hofe, H. M.; and Wenzel, C. 1997a. From paper to a corporate memory: A first step. In KI-97 Workshop on Knowledge-Based Systems for Knowledge Management in Enterprises.

Baumann, S.; Ali, M. B. H.; Dengel, A.; Jäger, T.; Malburg, M.; Weigel, A.; and Wenzel, C. 1997b. Message extraction from printed documents - a complete solution. In Int. Conference on Document Analysis and Recognition (ICDAR 97), Ulm, Germany.

Fensel, D.; Decker, S.; Erdmann, M.; and Studer, R. 1998. Ontobroker: The very high idea. In 11th Int. Florida AI Research Symposium (FLAIRS-98).

Georgakopoulos, D.; Hornick, M.; and Sheth, A. 1995. An overview of workflow management: From process modeling to workflow automation infrastructure. In Distributed Databases 3. Kluwer Academic Publishers.

Kaathoven, R. v.; Jeusfeld, M. A.; Staudt, M.; and Reimer, U. 1999. Organizational memory supported workflow management. In Scheer, A.-W., and Nüttgens, M., eds., Electronic Business Engineering - 4. Internationale Tagung Wirtschaftsinformatik, Saarbrücken. Physica Verlag, Heidelberg.

Liao, M.; Hinkelmann, K.; Abecker, A.; and Sintek, M. 1999. A competence knowledge base system for the organizational memory. In XPS-99 - 5. Deutsche Tagung Wissensbasierte Systeme, Würzburg, Germany, number 1570 in LNAI. Berlin, Heidelberg, New York: Springer-Verlag.

Staab, S., and Schnurr, H.-P. 1999. Knowledge and business processes: Approaching an integration. In OM '99 - Int. Workshop on Knowledge Management and Organizational Memory (IJCAI-99).

Wenzel, C. 1998. Integrating information extraction into workflow management systems. In Natural Language and Information Systems Workshop (NLIS/DEXA 98), Vienna, Austria.

Workflow Management Coalition. 1998. Workflow Client Application Application Programming Interface (Interface 2 & 3) Specification. WFMC-TC-1009, Version 2.0.

Workflow Management Coalition. 1999. Terminology and Glossary. WFMC-TC-1011, Version 3.0.
