Web Services Discovery

Advanced Distributed Computing 1 Lecture 8b Web Services: Composition & Discovery Zhou Shuigeng June 1, 2007


Advanced Distributed Computing 1

Lecture 8b

Web Services: Composition & Discovery

Zhou Shuigeng

June 1, 2007

References

- B. Benatallah, M. Dumas, Q. Sheng, A. Ngu. Declarative Composition and Peer-to-Peer Provisioning of Dynamic Web Services. In Proc. of ICDE 2002.
- D. Chakraborty, et al. DReggie: semantic service discovery for M-commerce applications. 2001.
- D. Chakraborty, et al. GSD: a novel group-based services discovery protocol for MANETS. 2002.
- M. Klein and A. Bernstein. Searching for services on the semantic Web using process ontologies. In The Emerging Semantic Web. Isabel C., Decker S., Euzenat J., and McGuinness D., Eds. Amsterdam: IOS Press, pp. 159-172.
- C. Schmidt and M. Parashar. A peer-to-peer approach to Web services discovery. Technical report, Department of Electrical and Computer Engineering, Rutgers University, 2004.

Declarative Composition and Peer-to-Peer Provisioning of Dynamic Web Services (ICDE 2002)

B. Benatallah, M. Dumas, Q. Sheng, A. Ngu

The University of New South Wales
Queensland University of Technology

Services Composition Requirements

- Fast composition
  - Composing Web services is largely ad hoc, time-consuming, and requires low-level programming
- Scalable composition
  - Involves a large number of services, integrated temporarily
- Distributed execution
  - A centralized execution model incurs severe scalability, availability, and security problems

Contributions

- SELF-SERV (compoSing wEb accessibLe inFormation & buSiness sERVices): a framework for dynamic and peer-to-peer provisioning of Web services
  - A declarative language for composing services based on statecharts
  - A concept of service communities to architect the composition of a potentially large number of dynamic services
  - A peer-to-peer service execution model

Types of Services

- SELF-SERV distinguishes 3 types of services
  - Elementary services
  - Composite services
  - Service communities
- A Web service is specified by
  - an identifier (e.g., URL),
  - a set of attributes, and
  - a set of operations

Elementary Services

- An individual Internet-accessible application (e.g., a Java program) that does not rely on another Web service to fulfill user requests
- Operations of an elementary service are realized in terms of calls to proprietary/legacy applications
- Each operation of an elementary service is associated with a translator, which is mainly used to map the SELF-SERV operation into the format understood by the underlying proprietary/legacy applications

Composite Services

- Composite service operations are expressed as a composition of operations of other Web services using statecharts
- Advantages of statecharts
  - They have formal semantics
  - They are a standard process modeling language
  - They offer most control-flow constructs

Statechart (1)

- A statechart is made up of states and transitions
  - Transitions are labeled by ECA (event-condition-action) rules
  - States can be basic or compound
    - A basic state corresponds to the execution of a service, whether elementary or composite
    - Compound states contain one or several entire statecharts within them
      - An OR-state contains a single statechart
      - An AND-state contains several statecharts (separated by dashed lines) which are intended to be executed concurrently
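The state and transition structure described on this slide can be sketched as a small data model. This is only an illustration, not SELF-SERV's actual representation: the class names, the `kind` field, and the travel-related state and service names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Transition:
    source: str
    target: str
    event: Optional[str] = None      # E: triggering event
    condition: Optional[str] = None  # C: guard, kept as plain text here
    action: Optional[str] = None     # A: action, kept as plain text here


@dataclass
class State:
    name: str
    kind: str = "basic"              # "basic", "or" or "and"
    service: Optional[str] = None    # operation invoked by a basic state
    regions: List["Statechart"] = field(default_factory=list)


@dataclass
class Statechart:
    states: List[State] = field(default_factory=list)
    transitions: List[Transition] = field(default_factory=list)


def well_formed(state: State) -> bool:
    """Check the basic / OR / AND rules stated on the slide."""
    if state.kind == "basic":
        return state.service is not None and not state.regions
    if state.kind == "or":
        return len(state.regions) == 1
    if state.kind == "and":
        return len(state.regions) >= 2
    return False


def basic_states(chart: Statechart) -> List[str]:
    """Collect the names of all basic states, descending into compound states."""
    names = []
    for state in chart.states:
        if state.kind == "basic":
            names.append(state.name)
        else:
            for region in state.regions:
                names.extend(basic_states(region))
    return names


# Hypothetical travel example: two bookings run concurrently, then a car rental.
flight = Statechart(states=[State("BookFlight", service="FlightBooking.book")])
hotel = Statechart(states=[State("BookHotel", service="Accommodation.book")])
booking = State("Booking", kind="and", regions=[flight, hotel])
rental = State("RentCar", service="CarRental.rent")
trip = Statechart(
    states=[booking, rental],
    transitions=[Transition("Booking", "RentCar", condition="hotel far from attraction")],
)
```

Each basic state found by `basic_states` is a point where a service would be invoked at run time.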

Statechart (2)

- An operation of a service can be seen as having input parameters, output parameters, and consumed and produced events
- Internal variables
  - A data item that affects the outcome of a service execution but is neither an input nor an output parameter of the operation that the statechart implements

Composite Services: An example (1)

Composite Services: An example (2)

[Figure: Signature of the operation PrepareTrip of CTS, and signatures of the service operations that it invokes]

Composite Services: An example (3)

[Figure: Invocation table of CTS::prepareTrip]

Service Communities

- A service community is an aggregator of service offers with a unified interface
- The registration of a service with a community requires the specification of mappings between the operations of the service and those of the community
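A minimal sketch of the idea of a community with a unified interface and per-member operation mappings follows. The class, the method names, and the "first registered member wins" selection policy are assumptions for illustration; the slides do not say how a community chooses among its members.

```python
class ServiceCommunity:
    """Aggregates member services behind one set of community operations."""

    def __init__(self, operations):
        self.operations = set(operations)
        self.members = {}  # service id -> (operation mapping, implementation table)

    def register(self, service_id, mapping, implementation):
        # mapping: community operation name -> the member's own operation name
        unknown = set(mapping) - self.operations
        if unknown:
            raise ValueError(f"not community operations: {sorted(unknown)}")
        self.members[service_id] = (mapping, implementation)

    def invoke(self, operation, *args):
        # Placeholder policy (an assumption): delegate to the first registered
        # member that supports the requested community operation.
        for service_id, (mapping, implementation) in self.members.items():
            if operation in mapping:
                return service_id, implementation[mapping[operation]](*args)
        raise LookupError(f"no member offers {operation}")


# Hypothetical accommodation community with two members whose native
# operation names differ from the community's "searchRoom".
hotels = ServiceCommunity(["searchRoom"])
hotels.register("hotelA", {"searchRoom": "findRooms"},
                {"findRooms": lambda city: [f"{city}-A1"]})
hotels.register("hotelB", {"searchRoom": "query"},
                {"query": lambda city: [f"{city}-B7"]})
chosen, rooms = hotels.invoke("searchRoom", "Sydney")
```

A composite service can then refer to the community operation `searchRoom` without knowing which member will serve the request.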

Peer-to-Peer Provisioning of Web Services: Overview

- Each state ST appearing in a composite service specification is represented by a state coordinator, which is responsible for deciding
  - when a state within a statechart should be entered,
  - what should be done after the state is entered,
  - when the state should be exited, and
  - what should be done after the state is exited
- The knowledge a coordinator needs to answer these questions at runtime is statically extracted from the statechart describing the composite service operation and represented in the form of routing tables

Coordinator Actions (1)

- Receiving notifications of completion from other state coordinators and determining from these notifications when state ST should be entered
- Invoking the service labeling ST whenever all the preconditions for entering ST are met. This invocation is done by sending a message to the service’s wrapper and waiting for a reply

Coordinator Actions (2)

- Notifying the coordinators of the states that may need to be entered next that the execution has completed
- While state ST is active, the coordinator monitors possible events and responds accordingly
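The coordinator behaviour described on these slides can be sketched as follows. This is a simplified single-process illustration under several assumptions: the routing table is reduced to a list of alternative precondition sets and a list of guarded successor states, notifications are delivered by direct method calls rather than network messages, and event monitoring is omitted. All names are hypothetical.

```python
class Coordinator:
    """Decides when its state is entered and whom to notify on exit."""

    def __init__(self, state, preconditions, postprocessing, invoke):
        self.state = state
        self.preconditions = preconditions    # alternative sets of predecessor states
        self.postprocessing = postprocessing  # list of (guard, next state)
        self.invoke = invoke                  # stands in for the service wrapper
        self.received = set()

    def on_notification(self, sender, variables, peers):
        self.received.add(sender)
        # Enter the state once any one precondition set is fully satisfied.
        if not any(required <= self.received for required in self.preconditions):
            return
        self.received.clear()
        variables = {**variables, **self.invoke(variables)}
        # Notify the coordinators of the states that may be entered next.
        for guard, target in self.postprocessing:
            if guard(variables):
                peers[target].on_notification(self.state, variables, peers)


log = []


def service(name):
    def run(variables):
        log.append(name)
        return {name: "done"}
    return run


def always(variables):
    return True


# Hypothetical chart: Start, then A and B, then C once both A and B completed.
peers = {
    "Start": Coordinator("Start", [set()], [(always, "A"), (always, "B")], service("Start")),
    "A": Coordinator("A", [{"Start"}], [(always, "C")], service("A")),
    "B": Coordinator("B", [{"Start"}], [(always, "C")], service("B")),
    "C": Coordinator("C", [{"A", "B"}], [], service("C")),
}
peers["Start"].on_notification("client", {}, peers)
```

State C is entered only after notifications from both A and B have arrived, which is how an AND-join would show up in a routing table under this simplification.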

Composite Service Execution (1)

- A composite service execution is orchestrated through
  - peer-to-peer message exchanges between the coordinators of the states of the service’s description, and
  - message exchanges between the coordinators and the wrappers

Composite Service Execution (2)

- The messages exchanged between the coordinators are called control-flow notifications
- A (control-flow) notification sent by a coordinator C1 to a coordinator C2 expresses the fact that the execution of the state represented by C1 has completed, and that C1 believes that the state represented by C2 needs to be entered
- The notification message contains the input parameters of the composite service execution, as well as the up-to-date values of all the internal variables of the statechart that C1 needs to transmit to C2

Composite Service Execution (3)

- Messages exchanged between the coordinator of a state and the wrapper of the service labeling this state are called service invocations/completions
- A service invocation message contains the name of the service operation that is being invoked, as well as the values of the input parameters
- A service completion message contains the values of the return parameters

CS Execution Example

[Figure: Interactions between the coordinators and the wrappers during an execution of CTS]

Prototype

Web services: actors, objects and operations

Web services programming stack

Searching for services on the semantic Web using process ontologies (1st Semantic-Web Working Symposium, 2002)

M. Klein and A. Bernstein

MIT & NYU

Current Technologies (1)

- Keyword-based approach
  - Low precision & low recall
- Frame-based approach
  - e.g. Jini, UDDI, Salutation
  - Low recall & high precision

Current Technologies (2)

- Deductive retrieval approach
  - High recall & high precision
  - Faces practical difficulties
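The comparison of these approaches rests on the standard retrieval measures: precision is the fraction of retrieved services that are relevant, and recall is the fraction of relevant services that are retrieved. The small sketch below computes both; the service identifiers and result sets are made-up illustrations, not data from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query result."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: four services actually satisfy the request.
relevant = {"s1", "s2", "s3", "s4"}

# A strict match that returns only one, correct, service:
# everything returned is right, but most relevant services are missed.
strict = precision_recall({"s1"}, relevant)

# A loose keyword match that returns mostly irrelevant services.
loose = precision_recall({"s1", "s5", "s6", "s7"}, relevant)
```

Here `strict` shows the high-precision, low-recall pattern the slides attribute to frame-based matching, while `loose` is low on both measures.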

Current Technologies (3)

[Figure: The state of the art in service discovery]

Process Ontologies Based Approach (1)

- Capturing the function(s) of a service as a process model
- Indexing the service model by placing it and all its components into the appropriate sections of the ontology
- Expressing queries as (partial) process models
- Finding all the services whose process models match that of the query, using the semantic relationships encoded in the process ontology

Process Ontologies Based Approach (2)

Define Process Ontology (1)

- Major concepts of process knowledge
  - Process attributes: textual description, performance values, pre-/post-/during-conditions
  - Decomposition: process -> activities -> sub-activities
  - Ports: I/O behavior of an activity / process interface
  - Dependencies: dependency -> coordination mechanism -> managing resource flows
  - Exceptions
  - Specialization: type taxonomies, bundles

Define Process Ontology (2)

- An activity is defined by specifying its decomposition (i.e., the activities it contains), its interface (as defined by the ports it contains), the dependencies between its sub-activities, and the attributes defined for each of those entities
- Each activity can be linked to the kinds of exceptions it can face, and these exceptions can be linked in turn to the processes, if any, used to handle them
- Every type of entity has its own specialization hierarchy into which it can be placed, making this a fully typed process description approach

Define Process Ontology (3)

[Figure: Partial meta-model for process ontology]

Define Process Ontology (4)

[Figure: Specialization hierarchy for grant loan]

Index Services (1)

- Services are indexed into the already constructed process ontology
- Indexing a service comes down to placing the associated process model, and all of its components (attributes, ports, dependencies, subtasks and exceptions), in the appropriate place in the ontology

Index Services (2)

[Figure: The loan selling service process model]

Index Services (3)

[Figure: The mortgage selling service process model]

Define Queries (1)

- The process query language (PQL) is used for defining queries
- PQL queries are built up as combinations of two types of clauses
  - Entity <entity> isa <entity type> :test <predicate>
  - Relationship <source entity> <relationship type> <target entity> :test <predicate>

Define Queries (2)

[Figure: A query for mortgage services]

Find Matches

- The clauses in the PQL query are tried in order, each clause executed in the variable binding environment accumulated from the previous clauses. The sets of bindings that survive to the end represent the matching services
- Using semantics-preserving query mutation operators
  - allow a type specification to be more general
  - relax the constraints on a parameter value
  - ……
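The clause-by-clause matching described here can be sketched with a toy in-memory ontology. This is not the PQL implementation: the tuple encoding of clauses, the `?`-prefixed variables, the tiny ontology, and the `generalise` helper are all assumptions chosen to show how binding environments accumulate and how one mutation operator widens a query.

```python
# Hypothetical ontology fragment: type -> parent type, entity -> type, relationships.
ISA = {"SellMortgage": "SellLoan", "SellLoan": "Sell", "SellCar": "Sell", "Money": "Resource"}
ENTITIES = {"acme-mortgage": "SellMortgage", "bob-cars": "SellCar",
            "acme-out": "Money", "bob-out": "Money"}
RELATIONS = [("acme-mortgage", "has-port", "acme-out"), ("bob-cars", "has-port", "bob-out")]


def is_a(entity_type, ancestor):
    while entity_type is not None:
        if entity_type == ancestor:
            return True
        entity_type = ISA.get(entity_type)
    return False


def bind(env, term, value):
    """Bind a ?variable, or check a constant or an already bound variable."""
    if term.startswith("?"):
        if term in env:
            return env[term] == value
        env[term] = value
        return True
    return term == value


def step(clause, env):
    """Yield every extension of env that satisfies one clause."""
    if clause[0] == "entity":
        _, var, entity_type, test = clause
        for name, actual in ENTITIES.items():
            new = dict(env)
            if bind(new, var, name) and is_a(actual, entity_type) and test(name):
                yield new
    else:
        _, source, relation, target, test = clause
        for s, r, t in RELATIONS:
            new = dict(env)
            if r == relation and bind(new, source, s) and bind(new, target, t) and test(s, t):
                yield new


def match(clauses):
    envs = [{}]
    for clause in clauses:  # clauses are tried in order
        envs = [new for env in envs for new in step(clause, env)]
    return envs


def generalise(clause):
    """Mutation operator: replace an entity clause's type by its parent type."""
    _, var, entity_type, test = clause
    return ("entity", var, ISA.get(entity_type, entity_type), test)


def anything(*args):
    return True


query = [("entity", "?p", "SellLoan", anything),
         ("relationship", "?p", "has-port", "?port", anything),
         ("entity", "?port", "Money", anything)]
exact = [env["?p"] for env in match(query)]
relaxed = [env["?p"] for env in match([generalise(query[0])] + query[1:])]
```

Generalising the first clause from `SellLoan` to `Sell` lets the second service survive to the end, which is the effect a type-relaxing mutation is meant to have.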

DReggie: semantic service discovery for M-commerce applications (SRDS 2001 Workshop)

GSD: a novel group-based services discovery protocol for MANETS (MWCN 2002)

D. Chakraborty, et al.

Computer Science & Electrical Engineering
University of Maryland, Baltimore County

The Research Goal

- Semantic service discovery in mobile, ad hoc environments where
  - a priori information about and/or descriptions of the services are, more often than not, unavailable (distributed)
  - the same service at different sites may implement different interfaces

The Approaches

- DReggie: using DAML as the language to represent and reason about the capabilities and functionality of different services
- GSD: peer-to-peer caching of service advertisements and group-based intelligent forwarding of service requests, to achieve efficient network bandwidth usage as well as flexibility in the service matching process

Existing SD Infrastructure (1)

- Service Location Protocol (SLP)
  - Language independence
  - Agent-oriented
    - User Agents (UAs): discovering DAs, acquiring service handles on behalf of end-users
    - Service Agents (SAs): advertising service handles to DAs
    - Directory Agents (DAs): collecting service handles and maintaining the directory of advertised services
    - Multicasting for service registration and discovery, and unicasting for service discovery responses from DAs / SAs
  - Relying on predefined service attributes

Existing SD Infrastructure (2)

- Jini: distributed service-oriented architecture
  - A collection of services forms a Jini federation (JF)
  - The Jini Lookup Service (JLS) maintains the dynamic information of available services in the JF
  - When a Jini service wants to join a Jini federation, it first discovers one or more JLSs. The service then uploads its service proxy to the JLS
  - A service client contacts the original service and invokes methods on it by using the proxy

Existing SD Infrastructure (3)

- Universal Plug and Play (UPnP)
  - Works at, and defines standards for, lower-layer network protocols
  - Uses the Simple Service Discovery Protocol (SSDP) for discovering services
  - When a service joins the network, it transmits an announcement to indicate its presence
  - When a client wants to discover a service, it multicasts a request or contacts the target URL directly

Existing SD Infrastructure (4)

- Salutation: an open-standard service discovery protocol that is communication-, OS- and platform-independent
  - Provides methods for describing & advertising services’ capabilities
  - The Salutation Lookup Manager (SLM) functions as a service broker for services in the network. The SLM can classify the services based on their meaningful functionality
  - Services are discovered by the SLM based on a comparison of the required service types with the service types stored in the SLM directory

Existing SD Infrastructure (5)

- Drawbacks of the existing techniques
  - Lack of rich representation
    - They lack expressive languages, representations and tools that can represent a broad range of service descriptions and support reasoning about the functionality and capabilities of the services
  - Lack of constraint specification and inexact matching
    - Service functionality is described at the syntax level or object level
  - Lack of ontology support

DAML: DARPA Agent Markup Language

- Enables the transformation of the human-oriented Web into the semantic Web
- Provides machines not only with the capability to read data but also to interpret and make inferences from the data
- Built upon XML and RDFS

DReggie: overview

- Aims to enhance the matching mechanisms in Jini and other service discovery systems. The heart of DReggie is an enhanced JLS enabling smart discovery of Jini services
- The DReggie lookup server consists of a simple Java-based matching module (performing attribute matching and limited constraint-based matching) and an advanced Prolog-based reasoning engine
- Enables service discovery systems to perform matching based on semantic information (capabilities, functionality, portability and system requirements)

DReggie: implementation (1)

- DAML representation of services
  - DAML is used as a language to represent the semantic description of the capabilities and requirements of m-services
  - A DAML ontology was created to describe m-services in terms of their functionality, capability, platform requirements and other attributes

DReggie: implementation (2)

- Prolog reasoner for semantic discovery
  - The Prolog reasoner’s knowledge base includes
    - The parsed service ontology, in a form that the Prolog engine uses to perform matching, loaded into the knowledge base
    - DAML profiles of services
    - DAML relationship rules (e.g., inverse, disjoint etc.)
  - The service profile and the parsed service query are matched based on relationships between attributes and their values

GSD: service description

- Group-based semantic service description
  - DAML+OIL is used to define an ontology to describe services/resources
  - Extends the DReggie ontology
  - Services are grouped primarily based on service functionality
  - The generic class service is functionally classified into two main sub-classes, hardware and software service, and each sub-class is further classified into more specific services

GSD: discovery protocol (1)

- Service advertisements
- Peer-to-peer caching
- Request routing

GSD: discovery protocol (2)

- Service advertisements
  - Service advertisements are initiated by each individual node that hosts one or more services
  - Each service has a corresponding service description using a DAML-based ontology
  - After every ADV_TIME_INTERVAL, every node sends a list of the services it has to all the nodes within ADV_DIAMETER hops

    <Packet-type, Source-Address, Service-Description, Service-Groups, Other-Groups, Lifetime, ADV_DIAMETER>

GSD: discovery protocol (3)

- Peer-to-peer caching
  - Each node maintains a finite cache to store the service descriptions of remote services
  - Each entry in the service cache has the following format:

    <Packet-type, local, Service-Description, Service-Groups, Other-Groups, Lifetime>

GSD: discovery protocol (4)

- Request routing
  - A service request is described using a DAML-based ontology
  - A service request is first matched against the services in the local cache
  - Services are matched based on service groups, inputs/outputs, functional similarity, service capabilities etc.

GSD: discovery protocol (5)

- Request routing (cont’d)
  - If a match is not found in the local cache, a service request is constructed

    <Packet-type, BroadcastId, Service-Description, Request-Groups, Source-Address, Last-Address, Hop-Count>

  - Using the Other-Groups information in each entry of the service cache, a set of neighbors can be selected for forwarding the service request

GSD: discovery protocol (6)

- Request routing (cont’d)
  - Request forwarding algorithm

    if (Hop-Count > 0) then {
      for (each entry in Service Cache) do {
        if any one of the request groups G1…Gn belongs to Other-Groups then {
          Decrease Hop-Count by 1
          Forward the request to the node
        }
      }
      if (the request was never forwarded) then {
        Decrease Hop-Count by 1
        Broadcast the service request to the neighboring nodes
      }
    }
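A runnable reading of the forwarding algorithm is sketched below. Two points are assumptions rather than something the slides state: each cache entry is taken to carry the address of the node that advertised it (`source_address`), and "Decrease Hop-Count by 1" is read as "every forwarded copy carries the incoming hop count minus one". Only the fields needed for the forwarding decision are modelled.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class CacheEntry:
    source_address: str            # assumed: the node that advertised this entry
    other_groups: FrozenSet[str]   # groups that node has seen in its vicinity


@dataclass(frozen=True)
class ServiceRequest:
    broadcast_id: int
    request_groups: FrozenSet[str]
    source_address: str
    last_address: str
    hop_count: int


def forward(request: ServiceRequest, me: str, cache: List[CacheEntry],
            neighbors: List[str]) -> List[Tuple[str, ServiceRequest]]:
    """Return the (destination, request copy) pairs this node would send."""
    if request.hop_count <= 0:
        return []
    outgoing = replace(request, last_address=me, hop_count=request.hop_count - 1)
    # Selective forwarding: nodes whose Other-Groups overlap the request groups.
    targets = []
    for entry in cache:
        if request.request_groups & entry.other_groups and entry.source_address not in targets:
            targets.append(entry.source_address)
    # Fallback: the request was never forwarded, so broadcast to all neighbors.
    if not targets:
        targets = list(neighbors)
    return [(target, outgoing) for target in targets]


# Hypothetical node "n0" with two cached advertisements.
cache = [CacheEntry("n1", frozenset({"Printer", "Scanner"})),
         CacheEntry("n2", frozenset({"Storage"}))]
request = ServiceRequest(7, frozenset({"Printer"}), "n9", "n9", hop_count=2)
selective = forward(request, "n0", cache, ["n1", "n2", "n3"])
fallback = forward(replace(request, request_groups=frozenset({"Display"})),
                   "n0", cache, ["n1", "n2", "n3"])
```

The selective case sends a single copy towards `n1`, while the fallback case degrades to a broadcast to every neighbor.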

A Peer-to-Peer Approach to Web Service Discovery (World Wide Web Journal, Vol. 7, Issue 2, June 2004)

C. Schmidt and M. Parashar

The Applied Software Systems Lab
Dept. of Electrical and Computer Engineering
Rutgers University

Outline

- Introduction
- System Architecture & Design
- Experimental Evaluation

Introduction

- The goal
  - Building dynamic, scalable, decentralized registries with real-time and flexible search capabilities, to support Web services discovery
- Techniques
  - Keyword-based Web services description
  - Chord as the overlay network
  - Hilbert Space-Filling Curves (SFC) are used for
    - reducing the dimensionality of the data
    - locality-preserving indexing of the data

Architecture

- Five major components
  - Hilbert SFC: a locality-preserving mapping that maps data elements to indices
  - Chord: an overlay network topology
  - A mapping from indices to nodes in the overlay network
  - Load balancing mechanisms
  - A query engine for routing and efficiently resolving keyword queries using successive refinements and pruning

Locality Preserving Mapping

- Each Web service is represented by a sequence of keywords
- A set of selected keywords forms a multidimensional keyword space in which services are points
- Locality-preserving mapping: mapping two services of similar semantics to two close points in the keyword space

Hilbert Space-Filling Curve

- An SFC is a continuous mapping from a d-dimensional space to a 1-dimensional space

  $f : N^d \rightarrow N$

  - The d-dimensional space is viewed as a d-dimensional cube, which is mapped onto a line such that the line passes once through each point in the volume of the cube, entering and exiting the cube only once. Using this mapping, a point in the cube can be described by its spatial coordinates, or by the length along the line, measured from one of its ends
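For intuition, the mapping from coordinates to curve position can be sketched in two dimensions with the widely used iterative Hilbert-curve conversion. This is a sketch under stated assumptions: the paper works with a d-dimensional keyword space, whereas the code below handles only d = 2 on an n-by-n grid with n a power of two, and how keywords become integer coordinates is left out entirely.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its curve position."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Walk a 4 x 4 grid in curve order.
cells = [(x, y) for x in range(4) for y in range(4)]
path = sorted(cells, key=lambda cell: hilbert_index(4, *cell))
```

Consecutive cells in `path` are always grid neighbors, which is the locality property the index relies on: points that are close along the curve are also close in the keyword space.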

Hilbert Space-Filling Curve

[Figure: Space-filling curve approximations for d = 2. n = 2: (a) 1st-order approximation, (b) 2nd-order approximation. n = 3: (c) 1st-order approximation, (d) 2nd-order approximation]

Mapping Indices to Peers

- Mapping the 1-dimensional index space onto Chord
  - Each node stores the keys that map to the segment of the curve between itself and the predecessor node
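The rule that a node owns the segment between its predecessor and itself is the usual Chord successor rule, and it can be sketched with a sorted list of node identifiers. The identifiers and keys below are made-up values in a small 0..63 index space, and routing through finger tables is not modelled.

```python
import bisect


def successor(node_ids, key):
    """Return the first node identifier at or after key, wrapping around the ring.

    node_ids must be sorted in ascending order.
    """
    i = bisect.bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]


# Hypothetical ring with four nodes in a 0..63 index space.
node_ids = [8, 20, 45, 60]
owner = {key: successor(node_ids, key) for key in (0, 8, 9, 44, 61)}
```

Keys 9 to 20 land on node 20, and keys past the last node (61 here) wrap around to node 8.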

The Query Engine

- Query processing: processing a query consists of two steps
  - translating the keyword query to relevant clusters of the SFC-based index space, and
  - querying the appropriate nodes in the overlay network for target data

The Query Engine

- Query optimization: recursively refining the query

The Query Engine

- Balancing load (1)
  - Load balancing at node join
    - The incoming node generates several identifiers and sends multiple join messages using these identifiers. Nodes that are logical successors of these identifiers respond reporting their load. The new node uses the identifier that will place it in the most loaded part of the network
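The join-time policy can be sketched as follows. The candidate identifiers are passed in explicitly to keep the example deterministic (in the described scheme the incoming node generates them itself), and the load figures are invented; a dictionary lookup stands in for the join messages and load reports.

```python
import bisect


def successor(node_ids, key):
    """First node identifier at or after key on the ring (node_ids sorted)."""
    i = bisect.bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]


def choose_join_id(candidates, loads):
    """Pick the candidate whose logical successor reports the highest load.

    loads maps existing node identifiers to their reported load.
    """
    node_ids = sorted(loads)
    return max(candidates, key=lambda c: loads[successor(node_ids, c)])


# Hypothetical ring: node 30 holds far more keys than its peers.
loads = {10: 5, 30: 50, 50: 7}
join_id = choose_join_id([3, 22, 41], loads)
```

Joining at identifier 22 places the new node just before the heavily loaded node 30, so it takes over part of that node's segment.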

The Query Engine

- Balancing load (2)
  - Load balancing at runtime
    - Periodically running a local load balancing algorithm between a few neighboring nodes
      - Neighboring nodes exchange information about their loads, and the most loaded nodes give a part of their load to their neighbors
    - Using virtual nodes
      - In this algorithm, each physical node houses multiple virtual nodes. The load at a physical node is the sum of the load of its virtual nodes. When the load on a virtual node goes above a threshold, the virtual node is split into two or more virtual nodes. If the physical node is overloaded, one or more of its virtual nodes can migrate to less loaded physical nodes (neighbors or fingers)
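The virtual-node scheme can be illustrated with a deliberately simplified model in which a virtual node is just a number (its load). The split-in-half rule, the choice to migrate the largest virtual node to the least loaded neighbor, and the `threshold` and `capacity` parameters are assumptions made for this sketch; the slides only state that splitting and migration happen.

```python
class PhysicalNode:
    def __init__(self, name, virtual_loads):
        self.name = name
        self.virtual = list(virtual_loads)  # one entry per virtual node

    @property
    def load(self):
        # The load at a physical node is the sum of its virtual nodes' loads.
        return sum(self.virtual)


def split_overloaded(node, threshold):
    """Split every virtual node whose load exceeds the threshold into two."""
    result = []
    for load in node.virtual:
        if load > threshold:
            half = load // 2
            result.extend([half, load - half])
        else:
            result.append(load)
    node.virtual = result


def migrate(node, neighbors, capacity):
    """Move virtual nodes to less loaded neighbors while the node is overloaded."""
    while node.load > capacity and len(node.virtual) > 1:
        target = min(neighbors, key=lambda n: n.load)
        moved = max(node.virtual)
        if target.load + moved > capacity:
            break  # no neighbor can absorb the virtual node without overloading
        node.virtual.remove(moved)
        target.virtual.append(moved)


# Hypothetical neighborhood of three physical nodes.
a = PhysicalNode("A", [30, 8])
b = PhysicalNode("B", [5])
c = PhysicalNode("C", [12])
split_overloaded(a, threshold=20)
migrate(a, [b, c], capacity=25)
```

After the split and one migration, node A's load drops below the capacity while B absorbs one of the halves.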

Experimental Evaluation

- Evaluating the query engine
  - Performance is measured in terms of
    - the number of nodes that participate in a query
    - the number of messages required to process a query
    - the number of nodes where matches are found
- Evaluation cases
  - the number of data elements stored in the P2P system grows with the size of the system
  - the size of the system remains constant while the number of stored data elements increases
  - the number of data elements is kept constant and the system size is increased


The End

Thanks!