Common Language Resources and Technology Infrastructure
www.clarin.eu

Web services and workflow creation

2010-07-01 Internal Version: 2
Editors: Marc Kemps-Snijders


Web services and workflow creation

CLARIN-2010-D2R-7b

EC FP7 project no. 212230

Deliverable: D2R-7b - Deadline: T36

Responsible: Peter Wittenburg

Contributing Partners: MPI
Contributing Members:

© all rights reserved by MPI for Psycholinguistics on behalf of CLARIN


Scope of the Document

This document describes the goals and requirements of web services and workflow systems that could be used by all CLARIN members and beyond, i.e. a functioning system could be used by other communities as well. Step by step, all CLARIN centers will need to introduce these requirements in their operational environments to arrive at a proper landscape of resources, services and tools in which various instances can and will be created and operated at various places. This document will be discussed in the appropriate working groups and in the Executive Board. It will be subject to regular adaptation depending on the progress in CLARIN.

CLARIN References

• Centers Types, CLARIN-2008-1, February 2009
• Persistent and Unique Identifiers, CLARIN-2008-2, February 2009
• Centers, CLARIN-2008-3, February 2009
• Language Resource and Technology Federation, CLARIN-2008-4, February 2009
• Metadata Infrastructure for Language Resource and Technology, CLARIN-2008-5, February 2009
• Report on web services, CLARIN-2008-6, March 2009
• Requirement specification web services and workflow Systems, CLARIN-2009-1, June 2009
• Integration of LR into web service infrastructure, CLARIN-2009-D5R-3a, December 2009


Contents

Executive Summary
1. Introduction
2. Terminology
    2.1. Definitions
    2.2 Acronyms
    2.2 Related Documents
3. Goals and activities
4. Architecture overview
    4.1 Introduction
    4.2 Wrapper architecture
    4.3 CSB architecture
5. Models
    5.1 Class model
    5.2 Sequence diagram
6. Interface
7. Algorithms
8. Unit level test cases
9. Bibliography


Executive Summary

This document specifies a detailed design for the part of the CLARIN infrastructure that deals with web service invocations and with metadata and provenance information. It identifies all collaborating components and subsystems and provides the baseline information needed for the implementation process. This document builds on the earlier Web Service and Workflow Requirements specification document [CLARIN-2009-1] and takes into account the following requirements that have been laid down there:

1. The CLARIN infrastructure is an open infrastructure and must make it easy to integrate existing web services into the CLARIN landscape.
2. The CLARIN infrastructure must be able to support multiple communication protocols. SOAP and REST are considered the main communication protocols for web services, although the infrastructure should be open to extension to other protocols if the need arises.
3. The CLARIN infrastructure must be able to support multiple transport protocols. Although the current web services primarily communicate via HTTP, the need may arise to support other protocols as well. An example of this is FTP for file transfer.
4. Each web service must be registered at a CLARIN compliant web service registry. All service metadata is to be registered according to the CLARIN MetaData Infrastructure (CMDI) guidelines, that is, metadata is to be provided in stand-off XML documents using the CLARIN MetaData model. This is obligatory in CLARIN for all activities.
5. Functional metadata of resources and services is to be described through the data categories specified in the metadata profile of ISOcat. Amongst other things, this will cater for profile matching.
6. The result of each web service invocation must be associated with CLARIN metadata. This metadata is to be provided in an automated manner through the metadata component. CLARIN must make a standard basic metadata component available which is capable of delivering at least the basic level of required CLARIN metadata, to reduce the amount of work needed to integrate existing web services into the CLARIN infrastructure.
7. CLARIN must ensure that users have access to temporary workspaces where they can carry out their operations and store the temporary data they create.
8. Each web service invocation needs to be associated with provenance data. Provenance data must be stored through a provenance component. CLARIN must make a basic standard component available which is capable of storing provenance data in a uniform manner, to reduce the amount of work needed to integrate existing web services into the CLARIN infrastructure.
12. In the first instance a wrapper functionality will be provided within CLARIN for test purposes and prototyping. CLARIN will, however, also need to study the CSB solution, which will consist of action code, configuration files and library dependencies. This may be packaged as a single deployable unit. The packaging and deployment strategy needs to be worked out.


1. Introduction

This document describes the detailed design of all components associated with the invocation of web services and with the generation of metadata and provenance data. It takes into account the requirements described in the Web Services and Workflow requirements document [CLARIN-2009-1].

Every resource in CLARIN is described using metadata following the CMDI recommendation. Similarly, each web service in CLARIN is described using CMDI¹, using a specialized metadata profile for web services. Each resource that is produced as a result of a web service invocation, as happens in NLP pipelines, must be associated with appropriate metadata and is initially stored in a user’s private workspace. Association with metadata is key to the ease of use of the CLARIN centers: it allows users to quickly publish their resources to a designated center without having to create metadata from scratch. Provenance data is needed for reproducibility and traceability of the results.

Two possible architectures are discussed, wrapper and CSB, both of which were introduced in the aforementioned CLARIN requirements document. They have in common that both play the role of the ServiceDelegate introduced in this document.

¹ http://www.clarin.eu/wp2/wg-26/wg-26-documents/cmdi-profile-for-web-services


2. Terminology

2.1. Definitions

AAI [Stanica 2006]
Authentication and Authorization infrastructure. An infrastructure that provides Authentication and Authorization Services. The minimum service components include Identity and Privilege Management with respect to users and resources.

Archive [CiTER]
Repository dedicated to the long-term preservation of the associated data.

IP [Stanica 2006]
Identity provider. An entity in an AAI that performs Identity Management.

Metadata [Guenter 2004]
Structured information that describes, explains, locates, and otherwise makes it easier to retrieve and use an information resource.

Metadata registry [Guenter 2004]
Registry. A formal system for the documentation of the element sets, descriptions, semantics, and syntax of one or more metadata schemes.

Provenance data
Information that provides a traceable record of the origin and source of a resource.

Resource [Berners-Lee 2005]
The term "resource" is used in a general sense for whatever might be identified by a URI. Familiar examples include an electronic document, an image, a source of information with a consistent purpose (e.g., "today's weather report for Los Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a collection of other resources. A resource is not necessarily accessible via the Internet; e.g., human beings, corporations, and bound books in a library can also be resources. Likewise, abstract concepts can be resources, such as the operators and operands of a mathematical equation, the types of a relationship (e.g., "parent" or "employee"), or numeric values (e.g., zero, one, and infinity).

Repository [CiTER]
Facility that provides reliable access to managed digital resources.

SOA [Mackenzie 2006]
Service Oriented architecture. A paradigm for organizing and utilizing distributed capabilities that may be under the control of different ownership domains. It provides a uniform means to offer, discover, interact with and use capabilities to produce desired effects consistent with measurable preconditions and expectations.

SP [Stanica 2006]
Service provider. An entity that provides access to a service based on federated authentication.

Web service [Brown 2004]
A web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format.

Workflow [Wulong 2001]
Workflow is a term used to describe the tasks, procedural steps, organizations or people involved, required input and output information, and tools needed for each step in a business process.

2.2 Acronyms

Reference | Abbreviation of | Link
[APA] | Alliance for Permanent Access | http://www.alliancepermanentaccess.eu
[ARK] | Archival Resource Key | http://www.cdlib.org/inside/diglib/ark/
[BPEL] | Business Process Execution Language | http://en.wikipedia.org/wiki/WS-BPEL
[CGN] | Corpus Gesproken Nederlands | http://lands.let.kun.nl/cgn/
[CLARIN] | Common Language Resources and Technology Infrastructure | http://www.clarin.eu
[DAM-LR] | Distributed Access Management for Language Resources | http://www.dam-lr.eu/
[DC] | Dublin Core | http://dublincore.org/
[DCAM] | | http://dublincore.org/documents/abstract-model/
[DCR] | Data Category Registry |
[DC-DS-XML] | | http://dublincore.org/documents/dc-ds-xml/
[DC-TEXT] | | http://dublincore.org/documents/dc-text/
[DEISA] | Distributed European Infrastructure for Supercomputing Applications | http://www.deisa.eu/
[DFKI] | Deutsches Forschungszentrum für Künstliche Intelligenz | http://www.language-archives.org/archive/dfki.de
[DFKITR] | Deutsches Forschungszentrum für Künstliche Intelligenz Tool Registry | http://registry.dfki.de/
[DFN] | Deutsches Forschungsnetz | http://www.dfn.de/
[DOBES] | Dokumentation Bedrohter Sprachen | http://www.mpi.nl/dobes
[DOI] | Digital Object Identifier | http://www.doi.org/
[D-SPIN] | Deutsche Sprachressourcen-Infrastruktur | http://www.sfs.uni-tuebingen.de/dspin/
[EAD] | Encoded Archival Description | http://en.wikipedia.org/w/index.php?title=Encoded_Archival_Description&oldid=250469911
[ebXML] | e-business XML | http://www.ebxml.org/
[EGEE] | Enabling Grids for E-sciencE | http://www.eu-egee.org/
[EGI] | European Grid Initiative | http://web.eu-egi.eu/
[EAGLES] | Expert Advisory Group on Language Engineering Standards | http://www.ilc.cnr.it/EAGLES/home.html
[e-IRG] | e-Infrastructure Reflection Group | http://www.e-irg.eu/
[ELDA UC] | Universal Catalogue | http://universal.elra.info/
[ENABLER] | | http://www.ilsp.gr/enabler/
[ESF] | European Science Foundation Second Learner Study | http://books.google.de/books?id=g292tXMX4tgC&pg=PA1&lpg=PA1&dq=esf+Second+learner+perdue&source=bl&ots=WKi3GUQQP6&sig=n7QSWy3StXvD06nMfAzY7GBbm9w&hl=de&sa=X&oi=book_result&resnum=3&ct=result#PPP1,M1
[FIDAS] | Fieldwork Data Sustainability Project | http://www.apsr.edu.au/fidas/fidas_report.pdf
[HS] | Handle System | http://www.handle.net/
[ICONCLASS] | | http://en.wikipedia.org/wiki/Iconclass62
[INTERA] | Integrated European language data Repository Area | http://www.mpi.nl/intera/
[ISOcat] | | http://www.isocat.org
[LAF] | Linguistic Annotation Framework |
[LIRICS] | Linguistic Infrastructure for Interoperable Resources and Systems | http://lirics.loria.fr/
[LMF] | Lexical Markup Framework | http://www.lexicalmarkupframework.org
[MAF] | Morphosyntactic Annotation Framework |
[METATAG] | | http://en.wikipedia.org/w/index.php?title=Meta_element&oldid=256779491
[METS] | Metadata Encoding and Transmission Standard | http://en.wikipedia.org/wiki/METS
[MILE] | | http://www.mileproject.eu/
[MPEG7] | | http://en.wikipedia.org/w/index.php?title=MPEG-7&oldid=241494600
[NLSR] | | http://registry.dfki.de/
[OAIS] | Open Archival Information System | http://en.wikipedia.org/wiki/Open_Archival_Information_System
[OASIS] | Organization for the Advancement of Structured Information Standards | http://www.oasis-open.org/
[ODD] | One Document Does all | http://www.tei-c.org/wiki/index.php/ODD
[OLAC] | Open Language Archives Community | http://www.language-archives.org/
[PMH] | Protocol for Metadata Harvesting | http://www.openarchives.org/pmh/
[REST] | Representational State Transfer | http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
[SCHEMAS] | | http://www.schemas-forum.org/
[SHIBBOLETH] | Shibboleth | http://shibboleth.internet2.edu/
[SIMPLESAML] | SimpleSAMLphp | http://rnd.feide.no/simplesamlphp
[SOAP] | Simple Object Access Protocol | http://www.w3.org/TR/soap12-part1/
[SRU] | Search/Retrieve via URL | http://www.loc.gov/standards/sru/
[SRW] | Search/Retrieve Web Service | http://en.wikipedia.org/wiki/Search/Retrieve_Web_Service
[SYNAF] | Syntactic Annotation Framework |
[TEI] | Text Encoding Initiative | http://www.tei-c.org/
[UDDI] | Universal Description Discovery and Integration | http://www.oasis-open.org/committees/uddi-spec/doc/tcspecs.htm
[WADL] | Web Application Description Language | https://wadl.dev.java.net/wadl20090202.pdf
[WSDL] | Web Services Description Language | http://www.w3.org/TR/wsdl20
[Z39.50] | | http://en.wikipedia.org/wiki/Z39.50


2.2 Related Documents

[CLARIN_NEWS_4] | CLARIN news letter Dec 2008 | http://www.clarin.eu/filemanager/active?fid=231
[CLARIN_WS_NOTE] | CLARIN note on web services | http://www.clarin.eu/filemanager/active?fid=270
[D-SPIN_PRES] | D-SPIN workshop report and Presentations | http://www.clarin.eu/wp2/wg-26/wg-26documents/web-service-presentations-at-the-wp2-workshop-in-oxford
[CLARIN_MD_SHRT] | CLARIN Component Metadata Shortguide | http://www.clarin.eu/files/metadata-CLARIN-ShortGuide.pdf
[CLARIN-2008-1] | Centers Types, CLARIN-2008-1, February 2009 | http://www.clarin.eu/files/wg2-1-center-types-doc-v5.pdf
[CLARIN-2008-2] | Persistent and Unique Identifiers, CLARIN-2008-2, February 2009 | http://www.clarin.eu/files/wg2-2-pid-doc-v4.pdf
[CLARIN-2008-3] | Centers, CLARIN-2008-3, February 2009 | http://www.clarin.eu/files/wg2-1-centers-doc-v8.pdf
[CLARIN-2008-4] | Language Resource and Technology Federation, CLARIN-2008-4, February 2009 | http://www.clarin.eu/files/wg2-2-federation-doc-v6.pdf
[CLARIN-2008-5] | Metadata Infrastructure for Language Resource and Technology, CLARIN-2008-5, February 2009 | http://www.clarin.eu/files/wg2-4-metadata-doc-v5.pdf
[CLARIN-2008-6] | Report on web services, CLARIN-2008-6, March 2009 | http://www.clarin.eu/files/state-report-WP2-6-v2.pdf


3. Goals and activities

CLARIN is devoted to building a persistent, integrated and interoperable infrastructure that facilitates access to, and the combination of, language resources and tools/web services for researchers who work with language material in some form, in particular in the humanities and social sciences. To achieve this goal CLARIN has started a number of different activities.

At the integration level, all technologies that allow users to make resources visible and to bring them together are in focus: federation technologies, persistent identifiers, metadata, registries, workspaces and portals. At the interoperability level we are faced with the problems of exploiting the virtually integrated resources and tools/services, i.e. issues such as structural and semantic interoperability need to be solved to allow users, for example, to build and execute short and long chains of operations, which we will call workflow chains.

All integration aspects that are largely independent of the internal structure and semantics of resources and tools have already been dealt with in other requirements specification documents and are currently being worked out in prototypical form. They are normative for all CLARIN activities. This design document deals with the interoperability aspects as they occur in service oriented architectures. Two examples of such scenarios are:

• a structured search operating across a virtual collection of heterogeneously tagged and encoded annotations requires a mapping between the related tag and encoding sets;

• the creation of a workflow chain of operations requires that the input resource(s) of each operation provide the format and encoding that the operation can work on.
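The second scenario can be illustrated with a minimal sketch, assuming that each operation declares a single input and a single output format. The `Step` class, the `find_mismatches` function and all format labels are hypothetical names chosen for illustration; they are not part of any CLARIN specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One operation in a workflow chain (all field values are hypothetical)."""
    name: str
    input_format: str   # format/encoding the operation can consume
    output_format: str  # format/encoding the operation produces


def find_mismatches(input_format: str, chain: List[Step]) -> List[str]:
    """Return one message per step whose expected input differs from what precedes it."""
    problems = []
    current = input_format
    for step in chain:
        if step.input_format != current:
            problems.append(
                f"{step.name} expects {step.input_format!r} but receives {current!r}"
            )
        # Whatever the step produces becomes the input of the next step.
        current = step.output_format
    return problems


chain = [
    Step("tokenizer", "text/plain", "tokens+xml"),
    Step("pos-tagger", "tokens+xml", "pos+xml"),
    Step("parser", "lemmas+xml", "trees+xml"),
]
for message in find_mismatches("text/plain", chain):
    print(message)
```

In this sketch the parser step is reported because nothing earlier in the chain produces the format it expects; in practice such a gap would have to be closed by an additional service or a format transformation.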

This document specifies the design of a number of key components in the service oriented architecture that CLARIN is envisaging. This includes aspects that have to do with resources, service invocation, metadata and provenance. While work package 5 will deal with these issues in detail from the linguistic point of view, work package 2 focuses on the computer science aspects. The design aspects introduced in this document are generic in nature and as such not tied to specific linguistic encoding formats. The aspects covered in this document will, however, need to be revised continuously against any further requirements that come out of work package 2 or 5, which may result in updated versions of this document.


4. Architecture overview

4.1 Introduction

The CLARIN requirements specification for web services and workflow [CLARIN-2009-1] provides a description of the interactions involved in web service invocation. The following characteristics are considered of importance:

• Invocation of web services is considered to be data driven, i.e. users will generally supply a resource they want to invoke a service on and are interested in (part of) the information that results from the service invocation.

• In CLARIN all resources are associated with separate metadata which may link to provenance data if resources are the result of previous processing steps.

• After service invocation a new resource is created which is again to be associated with appropriate metadata.

• After service invocation the metadata description of a resource must contain appropriate provenance information containing sufficient information to allow reinvocation of the service in the future.

Gathering metadata and provenance data is a crucial element of service invocation, since it is what yields well described, traceable resources.

The ability to generate metadata and provenance data for service requests relies on the following components being part of the CLARIN infrastructure:

• Metadata generation component: generates metadata for the generated resources
• Provenance data generation component: generates provenance data for the service invocation
• Transformers: transform data between different resource formats, tag sets and annotation level types.

Transformers
From the CLARIN deliverable ‘Integration of LR into web service infrastructure’ it becomes clear that there is a wide spread in output formats for the services made available within CLARIN. Proprietary formats continue to be developed, even within the CLARIN framework, and a quick convergence to standard pivot formats cannot be expected in the near future. Although the development of the proprietary TCF format in the D-Spin project has produced some interesting results in terms of integrating services from a number of partners, there are currently no signs of widespread adoption of this strategy. As a consequence, point to point conversions present the only viable transformation scenario for the upcoming construction phase.

Although this may not seem problematic given the number of services currently available, it adds significantly to the overhead and development effort needed to prepare resources generated by one service for use as input to another, and this effort will continue to rise as the number of services increases. To make one service interoperable with all others, a transformer must be produced for each service capable of producing resources that the service can consume, and for each service that may consume its results. As presented in the CLARIN Requirements for web services and workflow deliverable [CLARIN-2009-1], use of a pivot model greatly reduces the number of transformers needed to support all possible transformation scenarios in the service infrastructure.

For the purpose of this document, transformers are considered either to be individual web services, meaning that they behave as any web service and are subject to the metadata and provenance strategies, or to be integrated into a web service, in which case the transformation is carried out as part of the web service invocation. Given this, transformers do not need to be considered specifically in this document.

The component diagram in Figure 1 shows the structural relations between the various components that play a role in this part of the infrastructure.
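To make the difference between the two transformation strategies concrete, the following sketch counts transformers under a simplifying assumption that is not stated in the source: there are n mutually distinct formats, every format may need to be converted into every other one, and each transformer works in one direction only. The function names are hypothetical.

```python
def point_to_point_transformers(n_formats: int) -> int:
    """One directed transformer for every ordered pair of distinct formats."""
    return n_formats * (n_formats - 1)


def pivot_transformers(n_formats: int) -> int:
    """One transformer into the pivot format and one out of it, per format."""
    return 2 * n_formats


# Compare both strategies for a growing number of formats.
for n in (3, 5, 10, 20):
    print(n, point_to_point_transformers(n), pivot_transformers(n))
```

Under these assumptions the point to point count grows quadratically while the pivot count grows linearly, so the two strategies cost the same at three formats and the pivot model requires fewer transformers from four formats onwards.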


Figure 1: Component diagram for web service invocations.

Each component may both provide and require interfaces to other components. An interface is an abstraction of one or more methods and zero or more attributes that should define a cohesive set of behaviors. Provided interfaces are modeled using the lollipop notation and required interfaces are modeled using the socket notation. ServiceDelegate The ServiceDelegate acts as a service abstraction. It provides and abstraction and thus hides the implementation of the WebService and other associated components to external clients. It also reduces the coupling between the client and all components involved. The ServiceDelegate receives messages from external clients, invokes the intended web service on their behalf, takes care that for the resulting resource(s) appropriate metadata is generated and that relevant provenance data is generated. The communication diagram below displays the interactions between the components in more detail. PIDResolver The PIDResolver is capable of returning the location of a (metadata) resource of a given PID, which is assumed to be a URI in the figure above. This is a standard web service that must be provided by any of the PID providers such as the centers associated with EPIC (European Persistent Identifier Consortium). EPIC will use handles which may be resolved using the resolver located at http://hdl.handle.net Resource repository This component describes a Resource Repository which within CLARIN will typically be of Type A, B or C. For centers of type C the resources must be accessible through the metadata descriptions, but access to the resources may only be descriptive in nature [CLARIN-2008-1]. For the Resource Repository described in this document it is assumed however that resources are machine accessible through the locations described in the metadata description which in some cases limits the interaction possibilities for the C type centers. 
The Resource Repository will thus provide an machine interface for accessing and retrieving resources. Access to the resource may be restricted due to authorization restrictions set on the resource. These are maintained by the resource provider and may be determined by the resource provider on the basis of the authentication details of the user provided upon login to the CLARIN infrastructure. Workspace repository The role of the Workspace repository component is to store resources that are associated with a user’s ‘work in progress’ and as such is not part of any of the archives yet. These resources may be transient in nature or may in time be persisted as part of any of the center’s archives in the infrastructure. Metadata mut be

Page 15: 2R-7b Web Services and workflow creating-v2 · WG2.7 Web services and workflow creation 9 Workflow[Wulong 2001] Workflow is a term used to describe the tasks, procedural steps, organizations

Common Language Resources and Technology Infrastructure

WG2.7 Web services and workflow creation 15

associated with each of the resources in the Workspace repository. The Workspace repository may also be associated with other types of information related to a researcher’s private workspace, such as tool preferences. Detailed design of the Workspace component is however not within the scope of this document. As an initial approach it is assumed that one workspace will be assigned to each user, so that the workspace may be associated with the identity of the user, which is established upon login to the infrastructure.

Metadata repository
The Metadata repository represents the central repository where all metadata records for all centers are harvested. As described in [CLARIN-2008-5], these records will be harvested as CMDI metadata records through the OAI-PMH protocol. Metadata documents are freely accessible by any requesting party.

Metadata component
The purpose of the Metadata component is to generate metadata for the resources which are produced through service invocations. Given a single resource serving as input for a service invocation, the Metadata component must decide whether to generate a brand new metadata description for the resulting resource or to reuse and extend the metadata description of the original resource. The distinction is based on the metadata of the WebService, through the /AnnotationStandoff/ data category (see figure 3). This data category indicates whether the annotation created by the service is produced in an inline or a standoff manner, with the understanding that standoff implies that the resulting annotation is stored as a separate resource. In case of an inline annotation the metadata of the original resource may be preserved and extended with information relevant for the service invocation, such as the /AnnotationLevelType/ (see footnote 2) produced by the service.

Provenance component
The purpose of the Provenance component is to provide information on the source of the resource, i.e. all relevant information that is used to trace the origin of the resource and its methods of production. Provenance data accounts for proper acknowledgement of the sources of the resource and must record sufficient information to allow the resource to be reproduced. For web services the minimum information to be recorded is which service and method are invoked and the parameters with which the service method is invoked.

Web Service
This component describes a generic Web Service which, in real life, may be a corpus cleaning tool, a POS tagger or any other type of web service that is available within the CLARIN infrastructure. The metadata descriptions of these web services are stored in the Metadata repository and are described in CMDI. Figure 3 displays the structure of the technical metadata of the proposed CMDI profile. This CMDI profile may be further extended with descriptive elements intended for human end users, such as the organization responsible for the web service, contact persons or descriptions of the functionality of the web service. The CMDI profile in figure 3 has been validated against the D-Spin metadata descriptions, which makes it possible to provide a profile matching strategy as currently implemented in the D-Spin project. The Metadata component described above relies heavily on this profile, since it uses the profile to extract information on the resources generated by the web service.
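As a rough illustration of the minimum provenance information named for the Provenance component above (service, method and parameters), the following Python sketch models one record. The class name, field names and PID values are assumptions made for this example only; the document does not prescribe a storage format at this point.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class ProvenanceRecord:
    """Minimum provenance for one service invocation (hypothetical layout)."""
    service_pid: str  # PID of the web service's metadata description
    operation: str    # service method that was invoked
    parameters: dict = field(default_factory=dict)  # name -> value or metadata PID

    def to_json(self) -> str:
        # Sorted keys give a stable serialization, so records can be compared.
        return json.dumps(asdict(self), sort_keys=True)


# Placeholder PIDs; real PIDs would be issued by the infrastructure.
record = ProvenanceRecord(
    service_pid="hdl:0000/example-tokenizer",
    operation="tokenize",
    parameters={"input": "hdl:0000/example-text", "language": "nl"},
)
print(record.to_json())
```

Recording the metadata PID of an input resource rather than its content keeps the record small and still allows the same invocation to be repeated later.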

2 The /AnnotationLevelType/ specifies the types of annotation levels provided by the resource.

Figure 2: Collaboration diagram for web service invocations.

The collaboration pattern between these components is as follows:

1. The Web Service client wants to invoke a WebService and sends a message to the ServiceDelegate to invoke the service on its behalf. The message contains basic information on the web service to invoke, the operation, parameter information and possibly a reply address in case the request is an asynchronous one. The message format can be based on the WS-Addressing [WSAddressing] specification.

2. The WebService to invoke is encoded as a PID pointing to the metadata description of the web service. The PIDResolver will resolve this PID to the location of the metadata description. The PIDResolver will in fact be invoked multiple times during the process, that is, whenever a PID needs to be resolved. For reasons of clarity the interaction with the PIDResolver is shown only once in this diagram.

3. The metadata for the WebService can be retrieved using the location returned by the PIDResolver. Metadata descriptions of web services will always be stored in the central metadata repository. Web services will be described using the CMDI metadata specification for web services as shown in figure 3. From this metadata description the ServiceDelegate can extract the list of parameters that are expected for the operation to perform. Parameter specifications may contain references to resources; these will contain TechnicalMetadata information describing the types of resource that are expected. In the message that is sent by the client, references to resources are not made directly, but rather through the metadata PID of the resource. This allows the ServiceDelegate to retrieve the associated metadata of the resource and perform a last-minute check of the TechnicalMetadata characteristics of the resource against those which are expected for the given parameters.

Figure 3: CMDI metadata profile for web services.

4. From the message the PIDs of the metadata descriptions of resources will be extracted. The PIDResolver is used to provide the exact locations of the metadata descriptions for each of the resources. Resource metadata and the associated resources may be obtained either from the metadata repository and resource repository or from the Workspace repository, depending on whether the resource comes from a recognized resource center or from a user’s private workspace.
A1. The metadata resides in a CLARIN center. The metadata repository is requested to provide the metadata of the resource.
A2. The resource PID is read from the metadata, the PIDResolver is contacted to obtain the location of the resource and the resource is requested from the Resource repository.
Alternatively:
B1. The metadata resides in a user’s workspace and is requested from that workspace.
B2. The resource PID is read from the metadata, the PIDResolver is contacted to obtain the location of the resource and the resource is requested from the Workspace repository.
Metadata and resources are thus retrieved in a uniform manner using the location information supplied by the PIDResolver.

5. The endpoint of the WebService is extracted from the service’s metadata. For each of the parameters, the metadata PIDs of the resources are substituted by the contents of the corresponding resource. Next, the service is invoked and the result is returned to the ServiceDelegate.

6. The ServiceDelegate requests the Metadata component to compose a metadata description for a resulting resource. To construct it, the Metadata component will use the WebService metadata description, which provides TechnicalMetadata for the output parameters. Metadata construction is based on the following principle. If the TechnicalMetadata for a resulting resource (of a parameter) indicates, through the /annotationStandoff/ data category, that the resource is a standoff annotation stored in a separate resource, then a brand new metadata description for the resource is generated. The TechnicalMetadata component is copied in its entirety into the new metadata description. If /annotationStandoff/ indicates that the annotations are added to the same file, then the metadata description of the original resource is taken and only those information fragments from the TechnicalMetadata are added that do not already appear in the resource’s original TechnicalMetadata section. This in particular pertains to the /AnnotationLevel/ information (see figure 4), which encodes which linguistic concepts are present in the annotation file. For a tokenized document the TechnicalMetadata might look like:

Figure 4: Example TechnicalMetadata specification

7. The Provenance component is instructed to compose the provenance data for the resulting resource. This may consist of a copy of the original message that was sent by the client. This message contains all necessary information to invoke the same procedure again at a later stage.

8. Once the metadata and provenance data for the resulting resource have been obtained, the metadata is stored in the user’s private workspace.

9. The resulting resource is also stored in the user’s private workspace. Finally, the result as obtained from the WebService component is returned to the Web service client.
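The metadata construction rule of step 6 can be sketched in Python as follows. This is a minimal illustration, not the CLARIN implementation: metadata descriptions are modelled as plain dictionaries instead of CMDI documents, and the function name, the key names and the annotation level values ("text", "token") are assumptions.

```python
import copy


def compose_result_metadata(output_technical_md, original_md=None):
    """Build the metadata description for a resource produced by a service.

    output_technical_md: TechnicalMetadata of the service's output parameter,
        modelled as a dict (an assumption of this sketch).
    original_md: metadata description of the input resource, or None.
    """
    if output_technical_md.get("annotationStandoff", False):
        # Standoff: the annotation is a separate resource, so it receives a
        # brand new description with the TechnicalMetadata copied in full.
        return {"TechnicalMetadata": copy.deepcopy(output_technical_md)}

    if original_md is None:
        raise ValueError("inline annotation needs the original metadata")

    # Inline: start from the original description and add only what is missing.
    result = copy.deepcopy(original_md)
    technical = result.setdefault("TechnicalMetadata", {})
    for key, value in output_technical_md.items():
        if key == "AnnotationLevel":
            levels = technical.setdefault("AnnotationLevel", [])
            for level in value:
                if level not in levels:
                    levels.append(level)
        elif key not in technical:
            technical[key] = copy.deepcopy(value)
    return result


original = {
    "Title": "Sample text",
    "TechnicalMetadata": {"mimeType": "text/xml", "AnnotationLevel": ["text"]},
}
tokenizer_output = {
    "annotationStandoff": False,
    "mimeType": "text/xml",
    "AnnotationLevel": ["text", "token"],
}

inline = compose_result_metadata(tokenizer_output, original)
standoff = compose_result_metadata({**tokenizer_output, "annotationStandoff": True})
```

In the inline case the descriptive part of the original description (here only a title) is preserved and the annotation levels are merged; in the standoff case nothing is inherited from the input resource.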

The primary components of focus for this design are the ServiceDelegate, the Metadata component and the Provenance component. Although the interfaces through which they interact with other components are subject of this design, and this design places requirements on those interfaces, the detailed design of the other components falls outside the scope of this document.

4.2 Wrapper architecture
One way to implement the ServiceDelegate component described above is to create a wrapper that encapsulates the web service. This means that for each web service an individual wrapper must be created, which must itself be exposed as a web service. There is no direct interaction between clients and the encapsulated web service; clients interact only through the exposed wrapper interface. It is therefore the wrapper that is described in the web service registry, rather than the native service.

4.3 CSB architecture
The CSB architecture is a CLARIN-specific implementation of an Enterprise Service Bus (ESB). The term Enterprise Service Bus refers here to the middleware design pattern, but also to ESB products such as JBossESB or Mule. The CSB will play the role of the ServiceDelegate as described in this document; the Provenance and Metadata components are inserted into the CSB. IBM defines the ESB pattern as follows [Hutch 2005]:

In the ESB pattern, rather than interacting directly, participants in a service interaction communicate through a bus that provides virtualization and management features that implement and extend the core definition of SOA. The IBM ESB pattern provides virtualization of:

• Location and identity: Participants need not know the location or identity of other participants. For example, requesters don't need to be aware that a request could be serviced by any of several providers. Service providers can be added or removed without disruption.

• Interaction protocol: Participants need not share the same communication protocol or interaction style. A request expressed as SOAP/HTTP may be serviced by a provider that only understands Java Remote Method Invocation (RMI).

• Interface: Requesters and providers don't need to agree on a common interface. The ESB reconciles differences by transforming request messages into a form expected by the provider.

• Qualities of (Interaction) Service (QoS): Participants declare their QoS requirements, including performance and reliability, authorization of requests, encryption/decryption of message contents, automatic auditing of service interactions, and how their requests should be routed (such as to available implementations based on workload distribution criteria). Policies that describe the QoS requirements and capabilities of requesters and providers may be fulfilled by services themselves or by the ESB compensating for mismatches.

Figure 5: IBM's basic ESB pattern.

In the initial CLARIN implementation not all of the points in IBM’s definition will be covered at once. Functionality will be added to the ESB as CLARIN needs arise. ‘Off the shelf’ ESB products such as JBossESB provide standard web service invocation mechanisms that are able to invoke both SOAP services and HTTP end points. These are however configured to contact specific end points which are set in the ESB configuration. To be usable for CLARIN these mechanisms need to be customized to allow invocation of any service in the CLARIN registry without the need to reconfigure the ESB for each service individually. It is expected that the invocation mechanisms can be easily modified to work with the CLARIN registry.

Since the ESB exposes its own end point and as such is not tied to any particular web service, contrary to the wrapper described above, all information on the web service to invoke must be carried in a standard message format. WS-Addressing provides transport-neutral mechanisms to address web services and messages. An example fragment is shown below.

Figure 6: Example fragment WS-Addressing

Using the information carried in the message, the CSB is able to locate the requested service (as indicated in the To field) in the service registry and invoke the requested action (see the Action field). Optionally, a ReplyTo field may be used for asynchronous messaging.
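A minimal Python sketch of how a bus could read these fields and look up the operation is given below. It assumes the namespace of the 2004 WS-Addressing W3C Member Submission cited in this document; the header content, the PIDs, the URLs, the unqualified Header element and the dictionary-based registry are all placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the August 2004 WS-Addressing submission.
WSA = "http://schemas.xmlsoap.org/ws/2004/08/addressing"

# Hypothetical header block; a real message would sit inside a SOAP envelope.
HEADER = f"""
<Header xmlns:wsa="{WSA}">
  <wsa:MessageID>urn:uuid:00000000-0000-0000-0000-000000000001</wsa:MessageID>
  <wsa:To>hdl:0000/example-service</wsa:To>
  <wsa:Action>hdl:0000/example-service#tokenize</wsa:Action>
  <wsa:ReplyTo>
    <wsa:Address>http://client.example.org/callback</wsa:Address>
  </wsa:ReplyTo>
</Header>
"""


def read_addressing(header_xml):
    """Extract the routing fields from a WS-Addressing header block."""
    root = ET.fromstring(header_xml.strip())
    ns = {"wsa": WSA}

    def text(path):
        node = root.find(path, ns)
        return node.text.strip() if node is not None and node.text else None

    return {
        "message_id": text("wsa:MessageID"),
        "to": text("wsa:To"),
        "action": text("wsa:Action"),
        "reply_to": text("wsa:ReplyTo/wsa:Address"),  # None when absent
    }


def dispatch(header_xml, registry):
    """Look up the operation named by To and Action in a toy registry."""
    fields = read_addressing(header_xml)
    service = registry[fields["to"]]  # keyed by the service's metadata PID
    return service[fields["action"]], fields["reply_to"]


# Toy registry: one service offering one operation (whitespace tokenization).
registry = {
    "hdl:0000/example-service": {
        "hdl:0000/example-service#tokenize": str.split,
    }
}
operation, reply_to = dispatch(HEADER, registry)
print(operation("a small example"), reply_to)
```

Because the service is selected from the message content, no per-service configuration of the bus is needed, which is the property the CSB requires.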

5. Models
This section describes all classes and their relations.

5.1 Class model

Figure 7: Class diagram for ServiceDelegate

The class diagram shows the relations of the ServiceDelegate to its surrounding components. The classes ServiceDelegate, Metadata component and Provenance component and the interfaces IMetadata component and IProvenance component are all part of the same subsystem. The other classes are external systems which are used by the ServiceDelegate. The interfaces shown here describe the types of interaction with these systems in more detail and provide the only dependencies on these systems. The ServiceDelegate accesses the WebService to invoke the requested service operations on behalf of the client. Interaction with the Metadata repository and Resource repository is based on access to metadata and resources using their resolved PID URIs. Interaction with the PIDResolver only requires access to a method that is capable of resolving a PID to a URI, i.e. the location of a metadata document or resource. Interaction with the Workspace repository only requires that it is possible to store metadata documents and resources there. In addition, access to these may be requested from the Workspace repository in the same manner as indicated for the Metadata repository and Resource repository, to access and use resources for further processing. Interfaces are described in more detail in chapter 6.

5.2 Sequence diagram
The diagram below shows the sequence of interactions between the various components.

Figure 8: Sequence diagram for ServiceDelegate interaction

The individual steps for this sequence diagram are as follows:

1. The Web service client invokes the ServiceDelegate. The ServiceDelegate assesses which service to invoke. In case a wrapper architecture is used, the ServiceDelegate will be targeted at one specific web service; in the CSB approach the web service to invoke must be obtained from the contents of the message sent to the CSB. In case WS-Addressing is used to encode this in the message format, the To element will contain the PID pointing to the web service. In either case, however, the next few steps are dedicated to retrieving the metadata description for the service.

2. The location of the metadata document for the web service to be invoked is retrieved from the PIDResolver.

3. With the location of the metadata document, the metadata document is requested from the Metadata repository.

4. In this step the parameter information is gathered. For a wrapper architecture the parameters can be exposed directly as parameters of the external wrapper service interface. For the CSB they must be extracted from the message sent by the Web service client. Each of the parameters must also be described in the Parameter components of the CMDI Web Service profile obtained in step 3. In the next steps these parameters are evaluated further.

5. If a parameter is associated with a resource then the CMDI Web Service profile will contain a TechnicalMetadata component describing the resource characteristics that are expected for the given parameter. This may be matched against the corresponding parameter sent by the Web service client. Resources are however not to be sent directly by the client. Rather, the PIDs of the resources’ metadata are sent. In this step, the location of the metadata document for each of these resources is requested through the PIDResolver.

6. The metadata documents for the resources are retrieved here. These metadata documents each contain a TechnicalMetadata component as well. This allows the ServiceDelegate to check whether the resource requirements for operating the web service are met by the resources. If, for example, a web service specifies that it expects a document with /characterEncoding/ = 'UTF-8' and /mimeType/ = 'text/xml', then these data categories should at least be represented in the TechnicalMetadata of the resource’s metadata description.

7. Next, the WebService itself is invoked and the result is returned. The result is captured by the ServiceDelegate.

8. From the output parameter descriptions in the WebService metadata, those descriptions are evaluated which indicate the presence of a resource. These may be matched with the result obtained in the previous step to identify the resources in the return message. It is assumed at this point that the full resource is part of the return message, although this may be extended to include URIs, for example. If a web service employs this strategy then it must be mentioned in the WebService’s metadata. If the TechnicalMetadata of the output parameter indicates that it uses an /annotationStandoff/ strategy, meaning that resources are annotated in a standoff manner in a separate file, then for each of the resources identified in the result a new metadata document may be created. The TechnicalMetadata characterizing each of these resources can be duplicated in full from the corresponding TechnicalMetadata descriptions of the WebService’s output parameters. If the output parameter indicates that it is not standoff, then the metadata of the input resource can be extended with the TechnicalMetadata information specified for the output parameter. There may be cases, however, where multiple resources serve as input for a web service and an output parameter declares that it is not /annotationStandoff/. Here, a decision needs to be made as to which of the input resources will serve as the basis, i.e. to which resource the annotations are added. In these cases the relation between input and output parameter must be expressed explicitly, to make it possible to determine which of the metadata documents of the input resources is to be used as the basis of the metadata document of the resulting resource.

9. The resulting resources are stored in the Workspace repository. As a consequence each will be assigned a PID which is added to the metadata description of the resources.

10. The metadata documents of the resulting resources are stored in the Workspace repository. Each of these will be assigned a PID which will be gathered.

11. One of the last steps in the process involves the creation of the necessary provenance data. The most important pieces of information to be maintained are the service identifier and the parameters with which the WebService was invoked. As indicated before, the service identifier corresponds to the WebService’s metadata PID. For a wrapper this refers to the metadata description of the web service it encapsulates; in the CSB architecture it may be obtained from the message information sent to the ESB. To ensure consistent encoding of the provenance information, the service identifier and parameter information are to be encoded in the same way for both architectures. It makes sense to use the same encoding that is used in the CSB architecture to transfer the message from the Web service client to the CSB, which serves as the ServiceDelegate in this case. WS-Addressing appears to be usable for this, which means that the provenance data will consist of a WS-Addressing fragment describing the input. It also makes sense to store the result information as part of the provenance data. When added to the metadata of a generated resource it provides information on the results of all the output parameters that were produced using the web service, not only the parameter which produced this resource. This makes it possible not only to determine whether a resource may be reproduced using a specific service, but also to determine whether other output parameters produced by the service are still the same. The format of this information can again be derived from the format used to store the provenance input information, e.g. WS-Addressing, but is more limited in scope. Here, only parameter information needs to be stored. The ReplyTo field of the original message already identifies the receiver of the result. Another aspect is that resources are to be referred to by their metadata PID in the provenance data. If complete resources are passed back to the client as part of the result, the amount of information stored as part of the provenance data would otherwise become enormous. At the same time, the resulting resources have already been complemented with the necessary metadata and stored in the user’s private workspace. The provenance data for the output parameters therefore remains limited in size by using these PIDs. If a user at some point decides to remove a resource from their workspace, the newly generated results of a future reinvocation can no longer be compared to the original resource. If reproducibility of results is an issue, these resources must be maintained persistently, for example by moving them into an established archive that is part of the CLARIN infrastructure. Finally, the provenance data may be extended with additional information, for example QoS (Quality of Service) information such as the execution time measured by the ServiceDelegate, or information on exceptions raised during service invocation.

12. After the provenance data has been added to the metadata of each of the resources, the metadata documents are stored again in the Workspace repository. At this point no version control on the metadata documents is expected. It is also assumed here that provenance data is embedded in the metadata document. Alternatively, an external provenance store may be used and a PID for locating the provenance data may be added to the metadata document.

13. The result is returned to the Web service client. Note that the result returned to the client is the same as if the client had invoked the WebService directly.
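The requirement check described in step 6 can be illustrated with a small Python function. This is only a sketch: a TechnicalMetadata component is flattened to a dictionary from data category name to value, matching is by exact string equality, and the function name and the sample values other than those of step 6 are assumptions.

```python
def unmet_requirements(expected, actual):
    """Return the data categories for which a resource misses the expectation.

    expected: TechnicalMetadata required by the service parameter.
    actual:   TechnicalMetadata found in the resource's metadata description.
    The result maps each failing category to (expected value, found value).
    """
    return {
        category: (wanted, actual.get(category))
        for category, wanted in expected.items()
        if actual.get(category) != wanted
    }


# Expectation taken from the example in step 6.
expected = {"characterEncoding": "UTF-8", "mimeType": "text/xml"}

# Two hypothetical resources: one suitable, one not.
suitable = {"characterEncoding": "UTF-8", "mimeType": "text/xml", "size": "12 kB"}
unsuitable = {"characterEncoding": "ISO-8859-1"}

print(unmet_requirements(expected, suitable))
print(unmet_requirements(expected, unsuitable))
```

An empty result means the service may be invoked; a non-empty one gives the ServiceDelegate enough detail to reject the request before the service is contacted. Extra data categories on the resource are ignored, in line with the "at least" wording of step 6.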

In the sequence diagram shown above it is assumed that the interaction is synchronous in nature. In many cases, however, the interaction is expected to be asynchronous. This may be dealt with using appropriate callback handlers exposed by the ServiceDelegate, which process the incoming (asynchronous) response, perform steps 9 through 12 and return the result to a callback handler exposed by the Web service client. A more detailed discussion of this may be found in chapter 7.

6. Interface
This section describes the implementation of the interfaces.

IPIDResolver
The IPIDResolver interface describes an interface that makes it possible to resolve a specified PID to a (URI) location. Although it is realized at this stage that the signature of the interface may not correspond directly to the ...

IResourceStorage
The IResourceStorage interface provides the necessary functionality to store a resource. The only component to which this applies is the Workspace repository, which will store a resource in a user’s private workspace. Once AAI is in place the user’s identity can be obtained, but this currently remains an open issue.

IMetadataStorage
The IMetadataStorage interface provides the necessary functionality to store metadata documents. The only component to which this applies is the Workspace repository. As with resources, metadata documents are stored in the user’s private workspace, which depends upon the possibility to retrieve the user’s identity.

IMetadata component
The IMetadata component interface provides the interface that will construct metadata documents for created resources. The information that needs to be passed is the WebService’s metadata description and, if the /annotationStandOff/ mode of the WebService is false, the metadata descriptions of the original resources involved as parameters in the process.

IProvenance component
The IProvenance component interface provides the necessary functionality to construct provenance data based on the message content supplied by the client and the result of the service invocation.
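The interfaces above are described only in prose. One possible rendering in Python is sketched below using typing.Protocol. All method names, parameter names and return types are assumptions made for illustration, as is the toy in-memory workspace with its made-up PID scheme; none of them are defined by this document.

```python
from typing import Dict, List, Optional, Protocol


class IPIDResolver(Protocol):
    def resolve(self, pid: str) -> str:
        """Return the (URI) location a PID points to."""
        ...


class IResourceStorage(Protocol):
    def store_resource(self, user: str, content: bytes) -> str:
        """Store a resource in the user's workspace and return its PID."""
        ...


class IMetadataStorage(Protocol):
    def store_metadata(self, user: str, document: dict) -> str:
        """Store a metadata document in the user's workspace, return its PID."""
        ...


class IMetadataComponent(Protocol):
    def compose(self, service_md: dict,
                original_mds: Optional[List[dict]] = None) -> dict:
        """Construct the metadata document for a created resource."""
        ...


class IProvenanceComponent(Protocol):
    def compose(self, request_message: dict, result: dict) -> dict:
        """Construct provenance data from the client message and the result."""
        ...


class InMemoryWorkspace:
    """Toy Workspace repository satisfying both storage interfaces."""

    def __init__(self) -> None:
        self.items: Dict[str, object] = {}

    def _store(self, user: str, item: object) -> str:
        pid = f"ws:{user}/{len(self.items) + 1}"  # made-up PID scheme
        self.items[pid] = item
        return pid

    def store_resource(self, user: str, content: bytes) -> str:
        return self._store(user, content)

    def store_metadata(self, user: str, document: dict) -> str:
        return self._store(user, document)


def save_result(resources: IResourceStorage, metadata: IMetadataStorage,
                user: str, content: bytes, document: dict) -> str:
    """Store a resource, record its PID in the metadata, store the metadata."""
    resource_pid = resources.store_resource(user, content)
    return metadata.store_metadata(user, {**document, "ResourcePID": resource_pid})


workspace = InMemoryWorkspace()
metadata_pid = save_result(workspace, workspace, "alice",
                           b"<text>hello</text>", {"mimeType": "text/xml"})
```

Because the ServiceDelegate would depend only on these small interfaces, the Workspace repository can be replaced by a stub in tests, as in the skeleton components of chapter 8.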

7. Algorithms
This section describes in more detail the algorithms which have not already been described elsewhere in the document.

Asynchronous messaging
The ability to handle asynchronous messaging is one of the key requirements for the ServiceDelegate. Assuming WS-Addressing is used as the message format, an example excerpt from a message is shown below.

Here, the MessageID provides a message identifier that may be used to uniquely identify the message. This is supplied by the client. The From field provides the source reference point and indicates where the message came from. The To field provides the destination URI and indicates where the message should go to. Here, as well as in the From field, the URI (hdl:1839/..) points to the metadata document of the service. Action conveys the action related to the message. This can be used to identify the operation to be invoked upon receiving a request message and here points to the Operation section in the service’s metadata document. The ReplyTo field provides the reply end point reference and indicates where the reply message is to be sent. If it is not present, the message is usually sent back to the endpoint the request originated from. When the ServiceDelegate receives this message it will call the service identified in the To and Action fields. It will insert a callback service of its own to allow for post processing and identify itself as the caller in the From field. The resulting message may look like this:

The MessageID in this message represents an internal message ID maintained by the ServiceDelegate which allows it to retrieve the original message. Additionally, a RelatesTo field may be added to this message to convey the ID of the original message, along with the relationship type, but this is optional. The WebService being called will return the result to the ServiceDelegate on the end point reference that the ServiceDelegate exposes in the ReplyTo field. When the ServiceDelegate receives the result, it will contain the message ID sent earlier, allowing for retrieval of the original message, which may have been persisted in a local datastore. In general, asynchronous messages should be stored persistently since the response time is not known in advance. After the necessary post processing for metadata and provenance data generation, the recipient of the result as specified in the original message’s ReplyTo field can be contacted with the result.
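The message correlation described above can be sketched in Python as follows. Messages are modelled as dictionaries keyed by WS-Addressing field names instead of XML, pending messages are held in memory although the text recommends persistent storage, and the class name, method names, PIDs and URLs are assumptions of this example.

```python
import uuid


class AsyncServiceDelegate:
    """Toy delegate that rewrites addressing fields and correlates replies."""

    def __init__(self, own_pid, callback_address):
        self.own_pid = own_pid
        self.callback_address = callback_address
        # Internal MessageID -> original message. A real system would persist
        # this, since the response time is not known in advance.
        self.pending = {}

    def forward(self, message):
        """Build the message sent on to the WebService."""
        internal_id = f"urn:uuid:{uuid.uuid4()}"
        self.pending[internal_id] = message
        return {
            "MessageID": internal_id,
            "RelatesTo": message["MessageID"],  # optional, per the text
            "From": self.own_pid,               # the delegate is the caller
            "To": message["To"],
            "Action": message["Action"],
            "ReplyTo": self.callback_address,   # the delegate's own callback
            "Body": message.get("Body"),
        }

    def on_reply(self, reply):
        """Handle the service's reply and address the result to the client."""
        original = self.pending.pop(reply["RelatesTo"])
        # Metadata and provenance generation would take place here.
        return {
            # Fall back to the sender when no ReplyTo was supplied.
            "To": original.get("ReplyTo", original["From"]),
            "RelatesTo": original["MessageID"],
            "Body": reply["Body"],
        }


delegate = AsyncServiceDelegate("hdl:0000/delegate",
                                "http://delegate.example.org/callback")
request = {
    "MessageID": "urn:uuid:client-message-1",
    "From": "http://client.example.org/",
    "To": "hdl:0000/example-service",
    "Action": "hdl:0000/example-service#tokenize",
    "ReplyTo": "http://client.example.org/callback",
    "Body": "a small example",
}
forwarded = delegate.forward(request)
service_reply = {"RelatesTo": forwarded["MessageID"], "Body": ["a", "small", "example"]}
final = delegate.on_reply(service_reply)
```

The reply from the service carries the delegate's internal message ID, which is what allows the original request, and with it the client's ReplyTo address, to be recovered.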

8. Unit level test cases
This section provides a description of unit level test cases. The system can be tested against a skeleton web service which takes an arbitrary document as input and simply returns the document as is; this service acts as the WebService component described in this document. The PIDResolver can also be implemented as a simple skeleton resolver which only returns the PID. The PID for testing is taken to be a file location identifying the metadata document or resource. The metadata documents for the resources and service are prepared by hand. These should at a minimum include the TechnicalMetadata information. The system can be tested by sending small messages to the ServiceDelegate, which subsequently invokes the interaction patterns that have been discussed in this document. Special attention is given here to the /annotationStandoff/ data category of the WebService’s TechnicalMetadata, which controls the metadata creation process.
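The skeleton components described above might be set up as in the following Python sketch, which uses the standard unittest module. The function and class names are invented for illustration, the delegate is reduced to a single function, and an in-memory dictionary keyed by file location stands in for documents on disk.

```python
import unittest


def echo_service(document):
    """Skeleton WebService: returns the input document as is."""
    return document


def skeleton_resolver(pid):
    """Skeleton PIDResolver: the PID already is the (file) location."""
    return pid


def invoke(service, resolver, store, resource_pid):
    """Tiny stand-in for the ServiceDelegate: resolve, fetch, invoke."""
    location = resolver(resource_pid)
    return service(store[location])


class SkeletonTests(unittest.TestCase):
    def setUp(self):
        # Hand-prepared test document, keyed by a placeholder file location.
        self.store = {"/tmp/example/doc.xml": "<text>hello</text>"}

    def test_resolver_returns_pid(self):
        self.assertEqual(skeleton_resolver("/tmp/example/doc.xml"),
                         "/tmp/example/doc.xml")

    def test_document_is_returned_unchanged(self):
        result = invoke(echo_service, skeleton_resolver, self.store,
                        "/tmp/example/doc.xml")
        self.assertEqual(result, "<text>hello</text>")


# Run the suite programmatically so the script does not exit afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SkeletonTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Further test cases would vary the /annotationStandoff/ value in the hand-prepared service metadata and check which kind of metadata document is produced.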

9. Bibliography [ASV09a] Natural Language Processing Department: Projekt Deutscher Wortschatz: WebServices. URL: http://wortschatz.uni-leipzig.de/Webservices/, 2009.

[ASV09b] Natural Language Processing Department: About NLP - Abteilung Automatische Sprachverarbeitung – Methods. URL: http://www.asv.informatik.uni-leipzig.de/en/asv/methoden, 2009. [Bentivogli et al, 2004] Bentivogli, L., Forner, P., Magnini, B., Pianta, E. (2004). Revising WordNet Domains

Hierarchy: Semantics, Coverage, and Balancing. In Proceedings of COLING 2004 Workshop on "Multilingual Linguistic Resources". Geneva, Switzerland, pp. 101--108.

[Berners-Lee 2005] Berners-Lee, T, et al., "Uniform Resource Identifier (URI): Generic Syntax", IETF RFC 3986,January 2005, http://tools.ietf.org/rfc/rfc3986.txt [Box 2001]Box, Don (2001).A Brief History of SOAP, http://webservices.xml.com/pub/a/ws/2001/04/04/soap.html [Brown 2004] Brown, A. and Haas, H. (2004). Web Services Glossary. W3C working group note. http://www.w3.org/TR/ws-gloss/ [Broeder 2008] Broeder D., Declerck T, Hinrichs E., Piperidis S., Romary L., Calzolari N., Wittenburg P. Foundation of a Component-based Flexible Registry for Language Resources and Technology, In proceedings of LREC 2008, Marakech, Morocco, http://www.lrec-conf.org/proceedings/lrec2008/

[Bue09] Büchler, M.: Medusa Release Homepage. http://www.eaqua.net/medusa/, 2005-9.

[CiTER] Citation of Electronic Resources, ISO Draft (2008)

[Calzolari and Monachini, 1995] Calzolari, N. & Monachini, M. EAGLES proposal for morphosyntactic standards: In view of a ready-to-use package, Literary and Linguistic Computing. OUP, 1995.

[Esuli & Sebastiani, 2006] Esuli, A., Sebastiani, F. (2006). SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining. In Proceedings of LREC 2006. Genoa, Italy, pp. 417-422.

[Fielding 2000] Fielding, R.T. (2000). Architectural Styles and the Design of Network-based Software Architectures, PhD Thesis, http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

[Hadley 2009] Hadley, M.J., Web Application Description Language (WADL), https://wadl.dev.java.net/wadl20090202.pdf

[Hutch 2005] Hutchinson B., Schmidt M., Wolfson D., Stockton M., SOA programming model for implementing Web services, Part 4: An introduction to the IBM Enterprise Service Bus, July 26th 2005, http://www.ibm.com/developerworks/library/ws-soa-progmodel4/

[Ion, 2007] Ion, R. (2007). Word Sense Disambiguation Methods Applied to English and Romanian. PhD Thesis, Romanian Academy, Bucharest 2007.

[Mackenzie 2006] MacKenzie C.M., Laskey K., McCabe F., Brown P.F., Metz R., Hamilton B.A. OASIS Reference Model for Service Oriented Architecture 1.0, August 2006, http://www.oasis-open.org/committees/download.php/19679/soa-rm-cs.pdf

[Mel’čuk, 1988] Mel’čuk, I.A. (1988). Dependency Syntax: Theory and Practice. Albany, NY: State University of New York Press.


[Niles & Pease, 2001] Niles I., Pease, A. (2001). Towards a Standard Upper Ontology. In Proceedings of the 2nd International Conference on Formal Ontology in Information Systems (FOIS-2001). Ogunquit, Maine, pp. 2-9.

[Schaefer 2006] Schäfer, U. (2006). Middleware for creating and combining multi-dimensional NLP markup. In Proceedings of the EACL-2006 workshop on multi-dimensional markup in natural language processing. Trento, Italy. http://www.dfki.de/dfkibib/publications/docs/hognlpxml2006.pdf

[Stanica 2006] Stanica M., Wiberg T., Wierenga K., Winter S., Rauschenbach J., JRA5 Glossary of Terms - Second Edition - update of DJ5.1.1

[Takase 2008] Takase T., Makino S., Kawanaka S., Ueno K., Ferris C. and Ryman A. (2008). Definition languages for RESTful Web services: WADL vs. WSDL 2.0, http://download.boulder.ibm.com/ibmdl/pub/software/dw/specs/ws-wadlwsdl/WADLWSDLpaper20080621.pdf

[Tufiş, 1999] Tufiş, D. (1999). Tiered Tagging and Combined Classifiers. In F. Jelinek, E. Nöth (Eds.), Text, Speech and Dialogue, Lecture Notes in Artificial Intelligence. Springer, pp. 28-33.

[Winer 1999] Winer D., XML-RPC Specification, http://www.xmlrpc.com/spec

[WSAddressing] Don Box et al., 2004, http://www.w3.org/Submission/ws-addressing/

[Wulong 2001] Wulong T., 2001, http://searchcio.techtarget.com/sDefinition/0,,sid182_gci213384,00.html

[Yuret, 1998] Yuret, D. (1998). Discovery of linguistic relations using lexical attraction. PhD thesis, Department of Computer Science and Electrical Engineering, MIT.
