Web Services Discovery
Advanced Distributed Computing 1
Lecture 8b
Web Services: Composition & Discovery
Zhou Shuigeng
June 1, 2007
References
  B. Benatallah, M. Dumas, Q. Sheng, A. Ngu. Declarative Composition and Peer-to-Peer Provisioning of Dynamic Web Services. In Proc. of ICDE 2002.
  D. Chakraborty, et al. DReggie: semantic service discovery for M-commerce applications. 2001.
  D. Chakraborty, et al. GSD: a novel group-based services discovery protocol for MANETS. 2002.
  M. Klein and A. Bernstein. Searching for services on the semantic Web using process ontologies. In The Emerging Semantic Web. Cruz I., Decker S., Euzenat J., and McGuinness D., Eds. Amsterdam: IOS Press, pp. 159-172.
  C. Schmidt and M. Parashar. A peer-to-peer approach to Web services discovery. Technical report, Department of Electrical and Computer Engineering, Rutgers University, 2004.
Declarative Composition and Peer-to-Peer Provisioning of Dynamic Web Services (ICDE 2002)
B. Benatallah, M. Dumas, Q. Sheng, A. Ngu
The University of New South Wales
Queensland University of Technology
Services Composition Requirements
  Fast composition
    Web service composition is largely ad hoc, time-consuming, and requires low-level programming
  Scalable composition
    Involves many services, which are integrated only temporarily
  Distributed execution
    A centralized execution model incurs severe scalability, availability, and security problems
Contributions
  SELF-SERV (compoSing wEb accessibLe inFormation & buSiness sERVices): a framework for dynamic and peer-to-peer provisioning of Web services
  A declarative language for composing services based on statecharts
  A concept of service communities to architect the composition of a potentially large number of dynamic services
  A peer-to-peer service execution model
Types of Services
  SELF-SERV distinguishes 3 types of services
    Elementary services
    Composite services
    Service communities
  A Web service is specified by
    an identifier (e.g., URL),
    a set of attributes, and
    a set of operations
Elementary Services
  An individual Internet-accessible application (e.g., a Java program) that does not rely on another Web service to fulfill user requests
  Operations of an elementary service are realized as calls to proprietary/legacy applications
  Each operation of an elementary service is associated with a translator, which maps the SELF-SERV operation into the format understood by the underlying proprietary/legacy application
Composite Services
  Composite service operations are expressed as a composition of operations of other Web services using statecharts
  Advantages of statecharts
    Have formal semantics
    A standard process modeling language
    Offer most control flow constructs
Statechart (1)
  A statechart is made up of states and transitions
    Transitions are labeled by ECA (event-condition-action) rules
    States can be basic or compound
  A basic state corresponds to the execution of a service, whether elementary or composite
  Compound states contain one or several entire statecharts within them
    An OR-state contains a single statechart
    An AND-state contains several statecharts (separated by dashed lines) which are intended to be executed concurrently
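The structure described on this slide can be sketched as a small data model. This is only an illustration, not SELF-SERV's actual representation: the class names, the field names, and the services `FlightBooking`, `HotelBooking`, and `CarRental` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Transition:
    source: str
    target: str
    event: Optional[str] = None      # E: triggering event, if any
    condition: str = "true"          # C: guard, kept as plain text in this sketch
    action: Optional[str] = None     # A: action performed when the transition fires


@dataclass
class BasicState:
    name: str
    service: str                     # service (elementary or composite) executed in this state
    operation: str


@dataclass
class Statechart:
    states: List["State"] = field(default_factory=list)
    transitions: List[Transition] = field(default_factory=list)


@dataclass
class OrState:
    name: str
    chart: Statechart                # exactly one nested statechart


@dataclass
class AndState:
    name: str
    regions: List[Statechart]        # nested statecharts meant to run concurrently


State = Union[BasicState, OrState, AndState]


def basic_states(chart: Statechart) -> List[str]:
    """Collect the names of all basic states, descending into compound states."""
    names: List[str] = []
    for state in chart.states:
        if isinstance(state, BasicState):
            names.append(state.name)
        elif isinstance(state, OrState):
            names.extend(basic_states(state.chart))
        else:
            for region in state.regions:
                names.extend(basic_states(region))
    return names


# Hypothetical trip-planning chart: book a flight, then book hotel and car concurrently.
trip = Statechart(
    states=[
        BasicState("flight", "FlightBooking", "book"),
        AndState("stay", regions=[
            Statechart([BasicState("hotel", "HotelBooking", "book")]),
            Statechart([BasicState("car", "CarRental", "rent")]),
        ]),
    ],
    transitions=[Transition("flight", "stay", condition="flight booked")],
)
```

Each basic state found by `basic_states(trip)` is a state that would be labeled with one service invocation.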
Statechart (2)
  An operation of a service can be seen as having input parameters, output parameters, and consumed and produced events
  Internal variables
    A data item that affects the outcome of a service execution but is neither an input nor an output parameter of the operation that the statechart implements
Composite Services: An example (2)
  Signature of the operation PrepareTrip of CTS, and signatures of the service operations that it invokes
Composite Services: An example (3)
  Invocation table of CTS::prepareTrip
Service Communities
  A service community is an aggregator of service offers with a unified interface
  The registration of a service with a community requires the specification of mappings between the operations of the service and those of the community
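A minimal sketch of the registration idea follows, assuming a community is just a set of operation names and each member supplies a mapping from community operations to its own callables. The class `ServiceCommunity`, the "first registered member wins" selection policy, and the names `bookRoom` and `HotelA` are all illustrative assumptions, not part of SELF-SERV.

```python
class ServiceCommunity:
    """Unified interface over several member services (illustrative only)."""

    def __init__(self, operations):
        self.operations = set(operations)
        self.members = {}  # member name -> {community operation -> member callable}

    def register(self, service_name, mappings):
        # A member may only map operations that the community actually exposes.
        unknown = set(mappings) - self.operations
        if unknown:
            raise ValueError(f"not community operations: {sorted(unknown)}")
        self.members[service_name] = dict(mappings)

    def invoke(self, operation, *args):
        # Assumed policy: delegate to the first registered member offering the operation.
        for name, mapping in self.members.items():
            if operation in mapping:
                return name, mapping[operation](*args)
        raise LookupError(f"no member offers {operation!r}")


community = ServiceCommunity(["bookRoom"])
community.register("HotelA", {"bookRoom": lambda city: f"HotelA room reserved in {city}"})
booking = community.invoke("bookRoom", "Paris")
```

A real community would choose among members at runtime by some selection policy; the fixed order here only keeps the sketch short.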
Peer-to-Peer Provisioning of Web Services: Overview
  Each state ST appearing in a composite service specification is represented by a state coordinator, which is responsible for deciding
    when a state within a statechart should be entered,
    what should be done after the state is entered,
    when the state should be exited, and
    what should be done after the state is exited
  The knowledge a coordinator needs to answer these questions at runtime is statically extracted from the statechart describing the composite service operation and represented in the form of routing tables
Coordinator Actions (1)
  Receiving notifications of completion from other state coordinators and determining from these notifications when state ST should be entered
  Invoking the service labeling ST whenever all the preconditions for entering ST are met. This invocation is done by sending a message to the service’s wrapper and waiting for a reply
Coordinator Actions (2)
  Notifying the coordinators of the states that may need to be entered next that the execution has completed
  While state ST is active, the coordinator monitors possible events and responds accordingly
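The coordinator actions listed on these two slides can be sketched as follows. This is a simplified reading, not the SELF-SERV implementation: the routing table is reduced to a set of predecessor states (preconditions) and a list of successor coordinators, the wrapper is a plain callable, and event monitoring is left out. All names are hypothetical.

```python
class StateCoordinator:
    def __init__(self, state, preconditions, successors, wrapper):
        self.state = state
        self.preconditions = set(preconditions)  # states whose completion must be notified first
        self.successors = list(successors)       # coordinators to notify when this state completes
        self.wrapper = wrapper                   # stand-in for the service wrapper
        self.received = set()
        self.variables = {}

    def notify(self, sender, variables):
        """Handle a completion notification from another coordinator."""
        self.received.add(sender)
        self.variables.update(variables)
        if not self.preconditions <= self.received:
            return                               # keep waiting for the other notifications
        self.received.clear()
        outputs = self.wrapper(dict(self.variables))   # invoke the service and wait for the reply
        self.variables.update(outputs)
        for successor in self.successors:
            successor.notify(self.state, dict(self.variables))


trace = []


def make_wrapper(name):
    def wrapper(variables):
        trace.append(name)
        return {name: "done"}
    return wrapper


# Hypothetical two-state chain: "flight" must complete before "hotel" is entered.
hotel = StateCoordinator("hotel", {"flight"}, [], make_wrapper("hotel"))
flight = StateCoordinator("flight", {"start"}, [hotel], make_wrapper("flight"))
flight.notify("start", {"customer": "alice"})
```

No central scheduler appears anywhere: each coordinator decides locally when to enter its state and whom to notify afterwards.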
Composite Service Execution (1)
  A composite service execution is orchestrated through
    peer-to-peer message exchanges between the coordinators of the states of the service’s description, and
    message exchanges between the coordinators and the wrappers
Composite Service Execution (2)
  The messages exchanged between the coordinators are called control-flow notifications
  A (control-flow) notification sent by a coordinator C1 to a coordinator C2 expresses the fact that the execution of the state represented by C1 has completed, and that C1 believes that the state represented by C2 needs to be entered
  The notification message contains the input parameters of the composite service execution, as well as the up-to-date values of all the internal variables of the statechart that C1 needs to transmit to C2
Composite Service Execution (3)
  Messages exchanged between the coordinator of a state and the wrapper of the service labeling this state are called service invocations/completions
  A service invocation message contains the name of the service operation that is being invoked, as well as the values of the input parameters
  A service completion message contains the values of the return parameters
CS Execution Example
Interactions between the coordinators and the wrappers during an execution of CTS
Searching for services on the semantic Web using process ontologies (1st Semantic-Web Working Symposium, 2002)
M. Klein and A. Bernstein
MIT & NYU
Current Technologies (1)
  Keyword-based approach
    Low precision & low recall
  Frame-based approach
    e.g. Jini, UDDI, Salutation
    Low recall & high precision
Current Technologies (2)
  Deductive retrieval approach
    High recall & high precision
    Faces practical difficulties
Process Ontologies Based Approach (1)
  Capturing the function(s) of a service as a process model
  Indexing the service model by placing it and all its components into the appropriate sections of the ontology
  Expressing queries as (partial) process models
  Finding all the services whose process models match that of the query using the semantic relationships encoded in the process ontology
Define Process Ontology (1)
  Major concepts of process knowledge
    Process attributes: textual description, performance values, pre-/post-/during-conditions
    Decomposition: process -> activities -> sub-activities
    Ports: I/O behavior of an activity / process interface
    Dependencies: dependency -> coordination mechanism -> managing resource flows
    Exceptions
    Specialization: type taxonomies, bundles
Define Process Ontology (2)
  An activity is defined by specifying its decomposition (i.e., the activities it contains), its interface (as defined by the ports it contains), the dependencies between its sub-activities, and the attributes defined for each of those entities
  Each activity can be linked to the kinds of exceptions it can face, and these exceptions can be linked in turn to the processes, if any, used to handle them
  Every type of entity has its own specialization hierarchy into which it can be placed, making this a fully-typed process description approach
Define Process Ontology(3)
Partial meta-model for process ontology
Define Process Ontology(4)
Specialization hierarchy for grant loan
Index Services (1)
  Services are indexed into the already constructed process ontology
  Indexing a service comes down to placing the associated process model and all of its components (attributes, ports, dependencies, subtasks and exceptions) in the appropriate place in the ontology
Define Queries (1)
  The process query language (PQL) is used for defining queries
  PQL queries are built up as combinations of two types of clauses
    Entity <entity> isa <entity type> :test <predicate>
    Relationship <source entity> <relationship type> <target entity> :test <predicate>
Find Matches
  The clauses in the PQL query are tried in order, each clause executed in the variable binding environment accumulated from the previous clauses. The sets of bindings that survive to the end represent the matching services
  Using semantics-preserving query mutation operators, e.g.
    allow a type specification to be more general
    relax the constraints on a parameter value
    ……
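The clause-by-clause matching described on this slide can be illustrated with a toy evaluator. It is a sketch under strong assumptions, not PQL itself: the ontology is three small hypothetical tables, the `:test` predicates are omitted, and the entity and type names (`svc1`, `SellLoan`, `CreditCheck`, and so on) are invented for the example.

```python
# Hypothetical toy ontology: entity -> type, type -> parent type, relationship triples.
ENTITY_TYPE = {"svc1": "SellLoan", "svc2": "SellBook", "p1": "CreditCheck"}
PARENT = {"SellLoan": "Sell", "SellBook": "Sell", "Sell": "Process", "CreditCheck": "Process"}
RELATIONS = {("svc1", "has-subtask", "p1")}


def isa(entity, wanted):
    """True if the entity's type equals `wanted` or specializes it."""
    current = ENTITY_TYPE.get(entity)
    while current is not None:
        if current == wanted:
            return True
        current = PARENT.get(current)
    return False


def entity_clause(var, wanted):
    def run(bindings):
        if var in bindings:  # variable already bound: act as a filter
            return [bindings] if isa(bindings[var], wanted) else []
        return [{**bindings, var: e} for e in ENTITY_TYPE if isa(e, wanted)]
    return run


def relationship_clause(src, rel, dst):
    def run(bindings):
        out = []
        for s, r, d in sorted(RELATIONS):
            if r != rel:
                continue
            # Respect bindings made by earlier clauses; unbound variables match anything.
            if bindings.get(src, s) != s or bindings.get(dst, d) != d:
                continue
            out.append({**bindings, src: s, dst: d})
        return out
    return run


def match(clauses):
    environments = [{}]
    for clause in clauses:  # clauses are tried in order
        environments = [new for env in environments for new in clause(env)]
    return environments     # binding sets that survive to the end


query = [
    entity_clause("?s", "Sell"),
    relationship_clause("?s", "has-subtask", "?t"),
    entity_clause("?t", "CreditCheck"),
]
results = match(query)
```

In this setting, the "more general type" mutation amounts to replacing `"Sell"` in the first clause with its parent `"Process"`, which can only enlarge the set of surviving bindings.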
DReggie: semantic service discovery for M-commerce applications (SRDS 2001 Workshop)
GSD: a novel group-based services discovery protocol for MANETS (MWCN 2002)
D. Chakraborty, et al.
Computer Science & Electrical Engineering
University of Maryland, Baltimore County
The Research Goal
  Semantic service discovery in mobile, ad hoc environments where
    a priori information about and/or descriptions of the services are, more often than not, unavailable (distributed)
    the same service at different sites may implement different interfaces
The Approaches
  DReggie: using DAML as the language to represent and reason about the capabilities and functionality of different services
  GSD: peer-to-peer caching of service advertisements and group-based intelligent forwarding of service requests, to achieve efficient use of network bandwidth as well as flexibility in the service matching process
Existing SD Infrastructure (1)
  Service Location Protocol (SLP)
    Language independence
    Agent-oriented
      User Agents (UAs): discovering DAs, acquiring service handles on behalf of end users
      Service Agents (SAs): advertising service handles to DAs
      Directory Agents (DAs): collecting service handles and maintaining the directory of advertised services
    Multicasting for service registration and discovery, and unicasting for service discovery responses from DAs / SAs
    Relying on predefined service attributes
Existing SD Infrastructure (2)
  Jini: distributed service-oriented architecture
    A collection of services forms a Jini federation (JF)
    The Jini Lookup Service (JLS) maintains dynamic information about the available services in the JF
    When a Jini service wants to join a Jini federation, it first discovers one or more JLSs. The service then uploads its service proxy to the JLS
    A service client contacts the original service and invokes methods on it by using the proxy
Existing SD Infrastructure (3)
  Universal Plug and Play (UPnP)
    Works and defines standards at the lower-layer network protocols
    Uses the Simple Service Discovery Protocol (SSDP) for discovering services
    When a service joins the network, it transmits an announcement to indicate its presence
    When a client wants to discover a service, it multicasts a request or contacts the target URL directly
Existing SD Infrastructure (4)
  Salutation: an open-standard service discovery protocol that is independent of communication technology, OS, and platform
    Provides methods for describing & advertising services’ capabilities
    The Salutation Lookup Manager (SLM) functions as a service broker for services in the network. The SLM can classify the services based on their meaningful functionality
    Services are discovered by the SLM based on a comparison of the required service types with the service types stored in the SLM directory
Existing SD Infrastructure (5)
  Drawbacks of the existing techniques
    Lack of rich representation
      They lack expressive languages, representations, and tools that can represent a broad range of service descriptions and support reasoning about the functionality and capabilities of the services
    Lack of constraint specification and inexact matching
      Service functionality is described at the syntax level or object level
    Lack of ontology support
DAML: DARPA Agent Markup Language
  Enables the transformation of the human-oriented Web into the semantic Web
  Provides machines not only with the capability to read data but also to interpret and make inferences from the data
  Built upon XML and RDFS
DReggie: overview
  Aims to enhance the matching mechanisms in Jini and other service discovery systems. The heart of DReggie is an enhanced JLS enabling smart discovery of Jini services
  The DReggie lookup server consists of a simple Java-based matching module (performing attribute matching and limited constraint-based matching) and an advanced Prolog-based reasoning engine
  Enables service discovery systems to perform matching based on semantic information (capabilities, functionality, portability and system requirements)
DReggie: implementation (1)
  DAML representation of services
    DAML is used as a language to represent the semantic description of the capabilities and requirements of m-services
    A DAML ontology was created to describe m-services in terms of their functionality, capability, platform requirements and other attributes
DReggie: implementation (2)
  Prolog reasoner for semantic discovery
    The Prolog reasoner’s knowledge base includes
      The parsed service ontology, in a form that the Prolog engine uses to perform matching, loaded into the knowledge base
      DAML profiles of services
      DAML relationship rules (e.g., inverse, disjoint etc.)
    Service profiles and parsed service queries are matched based on relationships between attributes and their values
GSD: service description
  Group-based semantic service description
    DAML+OIL is used to define an ontology to describe services/resources
    Extended DReggie ontology
    Services are grouped primarily by service functionality
    The generic class Service is functionally classified into two main sub-classes, hardware and software services, and each sub-class is further classified into more specific services
GSD: discovery protocol (1)
  Service advertisements
  Peer-to-peer caching
  Request routing
GSD: discovery protocol (2)
  Service advertisements
    Service advertisements are initiated by each individual node that hosts one or more services
    Each service has a corresponding service description using a DAML-based ontology
    Every ADV_TIME_INTERVAL, each node sends a list of the services it hosts to all the nodes within ADV_DIAMETER hops
    Advertisement packet format:
      <Packet-type, Source-Address, Service-Description, Service-Groups, Other-Groups, Lifetime, ADV_DIAMETER>
GSD: discovery protocol (3)
  Peer-to-peer caching
    Each node maintains a finite cache to store service descriptions of remote services
    Each entry in the service cache has the following format:
      <Packet-type, local, Service-Description, Service-Groups, Other-Groups, Lifetime>
GSD: discovery protocol (4)
  Request routing
    A service request is described using a DAML-based ontology
    A service request is first matched against the services in the local cache
    Services are matched based on service groups, inputs/outputs, functional similarity, service capabilities, etc.
GSD: discovery protocol (5)
  Request routing (cont’d)
    If a match is not found in the local cache, a service request is constructed:
      <Packet-type, BroadcastId, Service-Description, Request-Groups, Source-Address, Last-Address, Hop-Count>
    Using the Other-Groups information in each entry of the service cache, a set of neighbors can be selected for forwarding the service request
GSD: discovery protocol (6)
  Request routing (cont’d)
    Request forwarding algorithm
      if (Hop-Count > 0) then {
        for (each entry in Service Cache) do {
          if any one of the request groups G1…Gn belongs to Other-Groups then {
            Decrease Hop-Count by 1
            Forward the request to the node
          }
        }
        if (the request was never forwarded) then {
          Decrease Hop-Count by 1
          Broadcast the service request to the neighboring nodes
        }
      }
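The forwarding rule above can be sketched in Python. Two points are my assumptions rather than statements from the slides: Hop-Count is decremented once per forwarding step (not once per matching cache entry), and each cache entry carries a hypothetical `node` field naming the advertiser to forward to. The dictionary keys are likewise illustrative.

```python
def forward_request(request, service_cache, neighbors):
    """Return the (node, request) messages a node would send for this request."""
    if request["hop_count"] <= 0:
        return []                                   # hop budget exhausted: drop the request

    outgoing = dict(request, hop_count=request["hop_count"] - 1)

    # Selective forwarding: pick cached advertisers whose Other-Groups
    # mention at least one of the requested groups.
    targets = []
    for entry in service_cache:
        if set(request["request_groups"]) & set(entry["other_groups"]):
            if entry["node"] not in targets:
                targets.append(entry["node"])

    # No promising direction known: fall back to broadcasting to all neighbors.
    if not targets:
        targets = list(neighbors)

    return [(node, outgoing) for node in targets]


cache = [
    {"node": "B", "other_groups": {"printer"}},
    {"node": "C", "other_groups": {"storage"}},
]
selective = forward_request({"request_groups": {"printer"}, "hop_count": 2}, cache, ["B", "C", "D"])
broadcast = forward_request({"request_groups": {"camera"}, "hop_count": 2}, cache, ["B", "C", "D"])
```

Here `selective` goes only to B, while `broadcast` reaches every neighbor because no cache entry hints at where a camera service might be.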
A Peer-to-Peer Approach to Web Service Discovery (World Wide Web Journal, Vol. 7, Issue 2, June 2004)
C. Schmidt and M. Parashar
The Applied Software Systems Lab
Dept. of Electrical and Computer Engineering
Rutgers University
Outline
  Introduction
  System Architecture & Design
  Experimental Evaluation
Introduction
  The goal
    Building dynamic, scalable, decentralized registries with real-time and flexible search capabilities, to support Web services discovery
  Techniques
    Keyword-based Web service description
    Chord as the overlay network
    Hilbert space-filling curves (SFCs) are used for
      reducing the dimensionality of the data
      locality-preserving indexing of the data
Architecture
  Five major components
    Hilbert SFC: a locality-preserving mapping that maps data elements to indices
    Chord: an overlay network topology
    A mapping from indices to nodes in the overlay network
    Load balancing mechanisms
    A query engine for routing and efficiently resolving keyword queries using successive refinements and pruning
Locality Preserving Mapping
  Each Web service is represented by a sequence of keywords
  A set of selected keywords forms a multidimensional keyword space in which services are points
  Locality-preserving mapping: maps two services with similar semantics to two close points in the keyword space
Hilbert Space-Filling Curve
  An SFC is a continuous mapping from a d-dimensional space to a 1-dimensional space: $f: N^d \rightarrow N$
  The d-dimensional space is viewed as a d-dimensional cube, which is mapped onto a line such that the line passes once through each point in the volume of the cube, entering and exiting the cube only once. Using this mapping, a point in the cube can be described by its spatial coordinates, or by the length along the line, measured from one of its ends
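For d = 2 the mapping can be computed with the well-known iterative Hilbert index algorithm. The sketch below shows that textbook algorithm, not the authors' code; the function name `xy2d` and the restriction to a square grid whose side `n` is a power of two are assumptions of the example.

```python
def xy2d(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert curve index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)   # which quadrant, in curve order
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# First-order curve on a 2x2 grid visits (0,0), (0,1), (1,1), (1,0) in that order.
order_2x2 = sorted(((x, y) for x in range(2) for y in range(2)), key=lambda p: xy2d(2, *p))
```

Cells that are consecutive along the curve are always adjacent in the grid, which is the locality property the index relies on; the converse holds only approximately.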
Hilbert Space-Filling Curve
  Space-filling curve approximations for d = 2, n = 2: (a) 1st order approximation; (b) 2nd order approximation; and for d = 2, n = 3: (c) 1st order approximation; (d) 2nd order approximation
Mapping Indices to Peers
  Mapping the 1-dimensional index space onto Chord
    Each node stores the keys that map to the segment of the curve between itself and its predecessor node
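The ownership rule on this slide is the usual Chord successor rule: a key belongs to the first node whose identifier is greater than or equal to the key, wrapping around the ring. A minimal sketch, in which the function name, the ring size of 128, and the node identifiers are illustrative assumptions:

```python
from bisect import bisect_left


def responsible_node(node_ids, key, ring_size):
    """Return the identifier of the node that stores `key` on the ring."""
    ids = sorted(node_ids)
    position = bisect_left(ids, key % ring_size)   # first node id >= key
    return ids[position % len(ids)]                # past the last node: wrap to the first


nodes = [10, 40, 90]
owners = [responsible_node(nodes, key, 128) for key in (5, 10, 11, 95)]
```

Node 40, for instance, owns the curve segment (10, 40], that is, everything after its predecessor up to and including its own identifier.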
The Query Engine
  Query processing: processing a query consists of two steps
    translating the keyword query to relevant clusters of the SFC-based index space, and
    querying the appropriate nodes in the overlay network for target data
The Query Engine
  Balancing load (1)
    Load balancing at node join
      The incoming node generates several identifiers and sends multiple join messages using these identifiers. Nodes that are logical successors of these identifiers respond by reporting their load. The new node uses the identifier that will place it in the most loaded part of the network
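The join-time choice can be sketched as below. The function and parameter names are hypothetical, candidate identifiers are passed in explicitly (a real node would generate them randomly), and the load figures are made up for the example.

```python
from bisect import bisect_left


def choose_join_id(node_loads, candidates):
    """Pick the candidate identifier whose successor reports the highest load.

    node_loads: dict mapping existing node identifier -> reported load.
    candidates: identifiers the joining node generated for itself.
    """
    ids = sorted(node_loads)

    def successor(identifier):
        position = bisect_left(ids, identifier)
        return ids[position % len(ids)]            # wrap around the ring

    # Joining just before the most loaded successor takes over part of its segment.
    return max(candidates, key=lambda c: node_loads[successor(c)])


loads = {10: 5, 40: 50, 90: 7}
chosen = choose_join_id(loads, [3, 20, 70])
```

With these numbers the new node joins at 20, in front of node 40, and so relieves the node that reported the highest load.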
The Query Engine
  Balancing load (2)
    Load balancing at runtime
      Periodically running a local load balancing algorithm between a few neighboring nodes
        Neighboring nodes exchange information about their loads, and the most loaded nodes give a part of their load to their neighbors
      Using virtual nodes
        Each physical node houses multiple virtual nodes. The load at a physical node is the sum of the loads of its virtual nodes. When the load on a virtual node goes above a threshold, the virtual node is split into two or more virtual nodes. If the physical node is overloaded, one or more of its virtual nodes can migrate to less loaded physical nodes (neighbors or fingers)
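A rough sketch of the virtual-node scheme follows. It simplifies in several ways that are my own assumptions: loads are plain integers, an overloaded virtual node is split into exactly two halves, and migration targets are chosen among all physical nodes rather than only neighbors or fingers. The thresholds and node names are invented.

```python
def rebalance(physical, split_threshold, node_capacity):
    """physical: dict name -> list of virtual-node loads. Modified in place."""
    # Step 1: split every virtual node whose load exceeds the threshold.
    for name in list(physical):
        split = []
        for load in physical[name]:
            if load > split_threshold:
                half = load // 2
                split.extend([half, load - half])
            else:
                split.append(load)
        physical[name] = split

    # Step 2: overloaded physical nodes hand virtual nodes to the least loaded node.
    for name in list(physical):
        while sum(physical[name]) > node_capacity and len(physical[name]) > 1:
            target = min(physical, key=lambda n: sum(physical[n]))
            vnode = max(physical[name])
            # Stop if moving would not leave the target lighter than the source is now.
            if target == name or sum(physical[target]) + vnode >= sum(physical[name]):
                break
            physical[name].remove(vnode)
            physical[target].append(vnode)
    return physical


cluster = rebalance({"A": [12, 3], "B": [2]}, split_threshold=10, node_capacity=8)
```

In this run the virtual node with load 12 on A is split into two of load 6, and one of them migrates to B.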
Experimental Evaluation
  Evaluating the query engine
    Performance is measured in terms of
      the number of nodes that participate in a query
      the number of messages required to process a query
      the number of nodes where matches are found
  Evaluation cases
    the number of data elements stored in the P2P system grows with the size of the system
    the size of the system remains constant while the number of stored data elements increases
    the number of data elements is kept constant and the system size is increased