[IEEE 2014 4th IEEE International Conference on Information Science and Technology (ICIST) -...

Semantic Matching for User Focused Information Services Selection*

Kexue Dai and Qiang Li
NO.4 Department, Air Force Early Warning Academy, Wuhan, Hubei Province, China

Xinrui Wan and Ling Jin
Huangpi NCO School, Air Force Early Warning Academy, Wuhan, Hubei Province, China

Yinbo Shao
Scientific Research Department, Air Force Early Warning Academy, Wuhan, Hubei Province, China

* This work is supported by the academy grant #2013QNCX0107.

Abstract – To select user focused information services from a service-oriented information system semantically and automatically, this paper proposes a matching approach that describes and matches the semantics of information services using WSDL-S. The approach uses seven user-weighted QoS factors to match requested services against advertised services for more accurate selection. The similarity between the request and the advertisement is computed with a semantic ontology. Advertisements of similar categories may be grouped, and irrelevant information services may be eliminated from the result set, so that user focused information services can be selected semantically by the system.

Index Terms - Information system; Information service; Semantic matching; User focused.

I. INTRODUCTION

Metadata is conventionally used to describe the capabilities of information services. However, simple metadata provides only an insignificant improvement in information system integration, and it gives user queries little support for integrating information systems. Fundamental research on information services is moving towards the construction of a semantic grid, which offers problem solving services and knowledge services based on semantics [1]. Semantic information services can cooperate flexibly with each other and can be processed automatically on a large scale. In such a system, facilities are provided to both the system user and the service provider to make service selection easier.

Reasoning about the similarity of semantic services is a challenging research area. The matching approach proposed in this paper uses Web Services Description Language-Semantics (WSDL-S) [2] to add semantic concepts to the information service. Knowledge can be inferred from the description and used to find closely related services. Modelling non-functional semantics via mechanisms like WS-Policy [3] allows for better selection of partners. User-defined weighted Quality of Service (QoS) factors are used in this paper to select the user focused services.

The remainder of this paper is arranged as follows: related work is discussed in Section II. The core of the matching system is described in Section III. Section IV details the weighted QoS factors used for service matching, and Section V illustrates the matching approach with examples. The advantages of the system and directions for further research are summarized in Section VI.

II. RELATED WORK

Mechanisms for discovering information services have been defined and implemented in the Monitoring and Discovery System (MDS) of the Globus Toolkit. However, MDS supports symmetric and attribute based service matching. The Web Services Description Language (WSDL) is used to define and describe the communication mechanism with web services. WSDL is an Extensible Markup Language (XML) format for describing services [4]. It provides a way to describe web services according to their functional and non-functional behaviour.

However, WSDL lacks the ability to describe the semantics of an information service. WSDL descriptions endowed with semantics, called WSDL-S, are used in [5, 6]. In [7], a semantic web service discovery approach based on the Latent Semantic Analysis technique is proposed. The approach analyses the relationships between a set of documents and their terms and produces a set of concepts related to the documents and their terms.

A Web Service Modelling Ontology (WSMO) based semantic web service discovery model, which introduces several matching techniques, is proposed in [8]. The Ontology Web Language for Services (OWL-S) is a semantic description language. It can make web services more meaningful by using ontology [4]. However, OWL-S based matching algorithms consider input parameters and output parameters, so they may retrieve irrelevant services.

Quality criteria may be defined differently in different domains. In the context of web services, quality criteria are the things that affect the capability of a web service, and they are defined as a set of criteria such as performance or availability. In [9], non-functional QoS parameters are used for resource discovery. However, the functional quality of service can also be used to discover and compose interoperable web services [10]. This paper considers the QoS parameters suggested by [11]: CPU cycles, disk space, throughput, memory space, response time, bandwidth and reliability.

978-1-4799-4808-6/14/$31.00 ©2014 IEEE

Semantic web service matching can be computed with a model checking approach [12] or a semantic distance-based approach [13]. In the approach proposed in this paper, information services are described with WSDL-S, and the QoS factors are then matched if multiple advertised services satisfy the user's request. The service requester may associate a weight with each factor defined by the matching system. The service advertisements are grouped into categories by a proposed clustering mechanism for faster selection of user focused services.

III. SYSTEM MODULES

The system has two parts, namely the information service description and the user focused service selection.

A. Information Service Description

The description part generates semantic descriptions of information services with WSDL-S. The information service provider describes an information service according to the rules of WSDL-S and then creates a keyword based description file. Such a description can only be used to select information services by keyword, and its semantic annotations lack expressiveness.

Generally, the concepts of a particular domain can be described semantically with an ontology, and OWL is used to create the ontology [14]. OWL has some limitations for describing services. OWL-S is a semantic extension of OWL with which the performance of information services can be described. However, it is difficult to convert the WSDL description of an information service into an OWL-S description.

Therefore, WSDL-S is a good choice for describing information services. It attaches semantics to the concepts involved in an information service, so that advertised services are created semantically. In this research work, the concepts include the data used in the service, the input and output parameters of the service, and the functionality of the service [15, 16]. This approach integrates both non-semantic and semantic data in the service description itself.

Describing a service semantically with annotations makes the service machine understandable and helps to select a semantic service according to the user's goal or specification. Both functional and non-functional aspects are considered when annotating a web service, so that a user focused service can be searched for efficiently. The annotated elements include the operations performed by the service and the messages, data types and communication protocols used by the service.

Domain ontology is used to compute the similarity between the advertised and the requested concepts. All possible concepts are described semantically with the domain ontology. Each concept and its sub-concepts represent some knowledge present in the service descriptions. For example, a concept ontology can represent the data type of an output parameter of a service.

The mechanism of the proposed system gives a faster selection of user focused information services. It clusters similar WSDL-S descriptions into groups and stores them in a UDDI registry. The functionality based mechanism also gives each cluster a service offer and a unique index. This grouping eliminates irrelevant information services and minimizes the number of advertisements that have to be examined, so the request can be compared quickly against the advertisements to select the user focused service.

B. User Focused Service Selection

To find the information service that the user focuses on, the selection module compares the requested service against the advertised services by their similarity and returns the service that satisfies the user requirements. The matching approach first identifies the suitable cluster, which contains the services related to the requested functionality. Once a suitable cluster is found, the approach retrieves the advertised services from it one by one and compares their other parameters with the request.
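The two-step selection described above (pick the cluster for the requested functionality, then compare its advertisements one by one) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Advertisement` class, the function names and the exact-match comparison of input and output concepts are hypothetical stand-ins, whereas the paper computes similarity with a domain ontology and stores clusters in a UDDI registry.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Advertisement:
    """Hypothetical, simplified view of one advertised service."""
    name: str
    functionality: str       # label of the cluster the service belongs to
    inputs: FrozenSet[str]   # concepts the service consumes
    outputs: FrozenSet[str]  # concepts the service produces


def build_clusters(ads: List[Advertisement]) -> Dict[str, List[Advertisement]]:
    """Group advertisements by functionality (stand-in for the clustering step)."""
    clusters: Dict[str, List[Advertisement]] = {}
    for ad in ads:
        clusters.setdefault(ad.functionality, []).append(ad)
    return clusters


def candidates(
    clusters: Dict[str, List[Advertisement]],
    functionality: str,
    inputs: FrozenSet[str],
    outputs: FrozenSet[str],
) -> List[Advertisement]:
    """Pick the cluster for the requested functionality, then filter it one by one."""
    matched = []
    for ad in clusters.get(functionality, []):
        # Assumption: exact concept matching replaces ontology-based similarity.
        # The request must supply every input the service needs, and the
        # service must produce every output the request asks for.
        if ad.inputs <= inputs and outputs <= ad.outputs:
            matched.append(ad)
    return matched
```

Services in other clusters are never inspected, which is where the reduction in the number of comparisons comes from.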

In this way, irrelevant services are eliminated from the set being compared, which results in a more accurate set of user focused services. However, different users may have different QoS requirements. For instance, reliability and cost may be of concern in addition to inputs, outputs and functions.

For this reason, choosing the most appropriate service that satisfies the requirements is difficult, especially when multiple services match the requested service. Much research has tried to solve this problem. However, it is not easy to design a generic QoS based approach, especially when the user wants to vary the QoS requirements. The matching approach proposed in this paper attempts to solve this problem by using several QoS factors and associating them with user-defined weights.

Users of the system can specify the priorities of their needs as weights on the supported QoS factors. For instance, a user may want a service provider with high reliability rather than the one that processes the request fastest. In this circumstance, the user gives the reliability factor a higher weight than the others. Other factors can be prioritized in the same way. The overall QoS is computed from the weighted QoS values, and the module then selects the most appropriate service.

IV. WEIGHTED FACTORS

The QoS of a service request is modelled with seven factors, namely CPU cycles (CC), memory buffers (MB), disk space (DS), throughput (TH), network bandwidth (NB), reliability (RE) and response time (RT). These factors are computed with the following formulas:

CC = 1 - (RCC / ACC)

MB = 1 - (RMB / AMB)

DS = 1 - (RDS / ADS)

TH = 1 - (RTH / ATH)

NB = 1 - (RNB / ANB)

RE = 1 - (RRE / ARE)

RT = 1 - (RRT / ART)

where RCC is the minimum number of CPU cycles required by the user, ACC is the number of CPU cycles available at the provider, and so on. For a given factor, the provider that offers the highest QoS is therefore the one with the smallest R/A and hence the largest 1 - (R/A).

With any network monitoring tool, it is easy to obtain the parameters that represent available resources, such as ACC, from the service provider. If the value of any factor is negative, the availability is lower than the minimum requirement, which means that the service does not fit the requirements. In this circumstance, the user may make some adjustments.
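As a rough illustration of the factor formulas, the sketch below computes 1 - (R / A) for each of the seven factors and lists the factors whose value is negative, that is, where the available amount is below the required minimum. The function names and the dictionary layout are assumptions made for this example and do not come from the paper.

```python
from typing import Dict, List

# Abbreviations of the seven QoS factors used in the paper.
FACTORS = ("CC", "MB", "DS", "TH", "NB", "RE", "RT")


def qos_factor(required: float, available: float) -> float:
    """Return 1 - (R / A) for a single QoS factor."""
    if available <= 0:
        raise ValueError("available amount must be positive")
    return 1.0 - required / available


def qos_factors(required: Dict[str, float], available: Dict[str, float]) -> Dict[str, float]:
    """Compute all seven factors from required (R) and available (A) values."""
    return {name: qos_factor(required[name], available[name]) for name in FACTORS}


def unmet_factors(factors: Dict[str, float]) -> List[str]:
    """List the factors whose value is negative, i.e. where A is below R."""
    return [name for name, value in factors.items() if value < 0]
```

A provider for which `unmet_factors` returns a non-empty list does not fit the request unless the user relaxes the corresponding requirement.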

For a service provider, reliability is the ratio between the number of service requests it has successfully satisfied (NoS) and the total number of service requests made to it (ToS).

Reliability = NoS / ToS (1)
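Equation (1) translates directly into a small helper. The sketch below is a minimal version with input checks; the function name and the choice to reject a provider that has not yet received any request are assumptions of this example.

```python
def reliability(satisfied: int, total: int) -> float:
    """Return NoS / ToS, the share of requests the provider satisfied."""
    if total <= 0:
        # Assumption: reliability is undefined before the first request.
        raise ValueError("total number of requests must be positive")
    if not 0 <= satisfied <= total:
        raise ValueError("satisfied requests must lie between 0 and total")
    return satisfied / total
```

For instance, a provider that satisfied 45 of 50 requests has a reliability of 45 / 50 = 0.9.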

All these QoS parameters can be obtained periodically from the service provider and then provided as the inputs of the matching system.

The service requester gives the parameters required by the service request such as RCC. The priorities of the QoS in terms of weights (W) are also specified by the service requester. Weights are very important for service selection.

When the capabilities required by the user are satisfied by more than one service, the most suitable service is selected on the basis of the weights. The function of weighted QoS parameters shown below is used to compute the overall QoS requested by the user.

QoS = Σ (w1*CC, w2*NB, w3*DS, w4*MB, w5*RE, w6*TH, w7*RT) (2)

where wi is the weight of the i-th factor and Σwi = (number of QoS factors) * n, with n = 2, 3, 4, .... The total weight Σwi is thus a multiple of the number of QoS factors, and it is the maximum amount of weight that the user can distribute among the QoS factors depending on the requirements.

Since seven QoS factors are considered in our system, Σwi is a multiple of seven. The greater n is, the clearer the distinction between the weights of the factors: the factors can be assigned clearly different weights, which results in a more accurate specification of priority.
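The weight budget can be made concrete with a short sketch. It assumes that the user spends the whole budget of 7 * n, as the weights in the worked example of the paper do; a looser check (sum not exceeding the budget) would also be compatible with the wording. All function names are hypothetical.

```python
from typing import List, Sequence

NUM_FACTORS = 7  # CC, NB, DS, MB, RE, TH, RT


def total_weight(n: int = 2) -> int:
    """Total weight to distribute: number of QoS factors times n (n >= 2)."""
    if n < 2:
        raise ValueError("n must be 2 or greater")
    return NUM_FACTORS * n


def default_weights(n: int = 2) -> List[int]:
    """Spread the budget equally, which gives every factor the weight n."""
    return [n] * NUM_FACTORS


def validate_weights(weights: Sequence[float], n: int = 2) -> None:
    """Raise ValueError unless the weights are a valid split of the budget."""
    if len(weights) != NUM_FACTORS:
        raise ValueError("one weight per QoS factor is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must not be negative")
    # Assumption: the whole budget has to be used.
    if sum(weights) != total_weight(n):
        raise ValueError("weights must sum to the budget of 7 * n")
```

With n = 2 the budget is 14, so both the equal split [2, 2, 2, 2, 2, 2, 2] and a prioritized split such as [4, 3, 2, 1, 2, 1, 1] pass the check.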

V. EXAMPLE ILLUSTRATION

This section illustrates the effect of the approach in selecting the best service. The value of n is set to 2 by default, and hence Σwi = 14. To examine the effect of the default weights as well as of weights adjusted by the user, the values of the QoS factors for two services S1 and S2 are assumed. The service selections in the two following cases differ clearly in their accuracy.

Let SSn be the set of QoS factor values of the service Sn, SSn = {CC, NB, DS, MB, RE, TH, RT}. The values of the sets SS1 and SS2 for S1 and S2 are assumed as follows:

SS1 = {0.1, 0.3, 0.4, 0.5, 0.7, 0.4, 0.6}

SS2 = {0.8, 0.1, 0.5, 0.4, 0.2, 0.1, 0.1}

A. Case (i)

In this case, no priority among the QoS factors is specified, so the system distributes Σwi equally over all QoS factors. The overall QoS for service S1 and service S2 can be computed with (2) as:

Overall QoS = Σ (SSn * W)

The score for S1 is shown in Table I:

TABLE I

THE SCORE FOR S1 IN CASE (I)

Factor   CC    NB    DS    MB    RE    TH    RT
SS1      0.1   0.3   0.4   0.5   0.7   0.4   0.6
W        2     2     2     2     2     2     2
SS1*W    0.2   0.6   0.8   1     1.4   0.8   1.2

Overall QoS = Σ (SS1*W) = 6.0

The score for S2 is similarly shown in Table II:

TABLE II

THE SCORE FOR S2 IN CASE (I)

Factor   CC    NB    DS    MB    RE    TH    RT
SS2      0.8   0.1   0.5   0.4   0.2   0.1   0.1
W        2     2     2     2     2     2     2
SS2*W    1.6   0.2   1     0.8   0.4   0.2   0.2

Overall QoS = Σ (SS2*W) = 4.4

The overall QoS obtained for S1 is clearly greater than that of S2. This illustrates that if the user does not prioritize the requirements, so that the QoS factors carry the default weights, service S1 is selected as the best service for the user's requirements. The weights of the QoS factors have no effect on the service selection in this case.
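The arithmetic of Case (i) can be reproduced with a few lines of Python. The sketch below multiplies each factor value by its weight and sums the products, using the factor values assumed for S1 and S2 and the default weight of 2 per factor; the function and variable names are chosen for this example only.

```python
from typing import List, Sequence


def weighted_terms(factors: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Per-factor products SSn * W (the last row of each score table)."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return [f * w for f, w in zip(factors, weights)]


def overall_qos(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Overall QoS: the sum of the weighted factor values."""
    return sum(weighted_terms(factors, weights))


# Factor order: CC, NB, DS, MB, RE, TH, RT
SS1 = [0.1, 0.3, 0.4, 0.5, 0.7, 0.4, 0.6]
SS2 = [0.8, 0.1, 0.5, 0.4, 0.2, 0.1, 0.1]
DEFAULT_W = [2] * 7  # the budget of 14 spread equally over seven factors

for name, values in (("S1", SS1), ("S2", SS2)):
    # Rounding hides floating-point noise in the sums.
    print(name, round(overall_qos(values, DEFAULT_W), 1))
# prints "S1 6.0" and then "S2 4.4"
```

Because every factor has the same weight here, the ranking is the same as the ranking by the plain sum of the factor values.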

B. Case (ii)

In this case, the user gives CC and NB higher priority than the other factors. With the same set of weights applied to the QoS factors of service S1 and service S2, the overall QoS is computed as below.

The score for S1 in case (ii) is shown in Table III:

TABLE III

THE SCORE FOR S1 IN CASE (II)

Factor   CC    NB    DS    MB    RE    TH    RT
SS1      0.1   0.3   0.4   0.5   0.7   0.4   0.6
W        4     3     2     1     2     1     1
SS1*W    0.4   0.9   0.8   0.5   1.4   0.4   0.6

Overall QoS = Σ (SS1*W) = 5.0

The score for S2 in case (ii) is shown in Table IV:

TABLE IV

THE SCORE FOR S2 IN CASE (II)

Factor   CC    NB    DS    MB    RE    TH    RT
SS2      0.8   0.1   0.5   0.4   0.2   0.1   0.1
W        4     3     2     1     2     1     1
SS2*W    3.2   0.3   1.0   0.4   0.4   0.1   0.1

Overall QoS = Σ (SS2*W) = 5.5

In this case, the overall QoS obtained for S1 is clearly smaller than that of S2. When the QoS factors are associated with different weights for prioritized requirements, the matching system infers that S2, not S1, is the user focused service. The weights therefore affect the final selection of information services.
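The change of ranking between the two cases can be checked with a small selection routine. This is a sketch under the assumption that the best service is simply the one with the highest overall QoS; `select_best` and the dictionary of services are names introduced for this example.

```python
from typing import Dict, Sequence


def overall_qos(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Sum of the weighted factor values."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(f * w for f, w in zip(factors, weights))


def select_best(services: Dict[str, Sequence[float]], weights: Sequence[float]) -> str:
    """Return the name of the service with the highest overall QoS."""
    scores = {name: overall_qos(values, weights) for name, values in services.items()}
    return max(scores, key=lambda name: scores[name])


# Factor order: CC, NB, DS, MB, RE, TH, RT
SERVICES = {
    "S1": [0.1, 0.3, 0.4, 0.5, 0.7, 0.4, 0.6],
    "S2": [0.8, 0.1, 0.5, 0.4, 0.2, 0.1, 0.1],
}
DEFAULT_W = [2, 2, 2, 2, 2, 2, 2]  # Case (i): no priorities
USER_W = [4, 3, 2, 1, 2, 1, 1]     # Case (ii): CC and NB prioritized

print(select_best(SERVICES, DEFAULT_W))  # S1
print(select_best(SERVICES, USER_W))     # S2
```

S2 overtakes S1 in Case (ii) mainly because its high CC value (0.8) is multiplied by the largest weight.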

The above illustrations indicate that the proposed matching system gives the user more flexibility in selecting the focused services by specifying different requirements.

VI. CONCLUSION

Quality is the measure of how well a particular service satisfies the requirements of the user. Quality criteria can be classified into groups from different perspectives, such as performance, safety and cost.

In this paper, a similarity matching system is proposed which enables users to semantically describe and select their focused information services. The matching approach, based on seven factors, provides a flexible selection mechanism with which the user can control the result set. The factors considered are CPU cycles, memory buffers, disk space, throughput, network bandwidth, reliability and response time. With effective clustering techniques, advertisements of similar categories may be grouped and irrelevant information services may be eliminated from the result set.

More factors have to be taken into consideration for the system to be useful. The quality criteria can be studied further to improve relevancy with task specific QoS parameters or with more internet specific QoS parameters such as availability, accessibility and security.

ACKNOWLEDGMENT

The research is funded by the foundation of the academy (2013QNCX0107). Sincere thanks are also expressed to all authors of the references.

REFERENCES

[1] W. Tian and H. Qi, “An e-learning semantic grid for life science education,” Computing and Informatics, vol. 27, pp. 53–72, 2008.

[2] A. Rama, F. Joel, M. John, N. Meenakshi, S. Marc-Thomas, S. Amit, and V. Kunal, University of Georgia, IBM Software Group, “Web service semantics - WSDL-S,” 2005.

[3] V. Kunal, A. Rama, and G. Richard, “Semantic matching of web service policies,” Second Int. Workshop on Semantic and Dynamic Web Processes, Orlando, Florida, pp. 79-90, 2005.

[4] C. Roberto, “Web Services Description Language (WSDL) Version 2.0: Core Language,” Available at: http://www.w3.org/TR/wsdl20/ 2007.

[5] K. Verma, J. Miller, and P. Rajasekaran, “Enhancing web service descriptions using WSDL-S,” Tech. Report at EclipseCon, Burlingame, CA, 2006.

[6] J. Cardoso, J. A. Miller, and S. Emani, “Web services discovery utilizing semantically annotated WSDL, reasoning web,” 4th Int. Summer School Tutorial Lectures, Springer-Verlag, Berlin, Heidelberg, 2008.

[7] W. Chen, E. Chang, and A. Aitken, “An empirical approach for semantic web service discovery,” presented in 19th Australian Conf. on Software Engineering, IEEE 2008.

[8] H. Li, X. Du, and X. Tian, “A WSMO based semantic web service discovery framework in heterogeneous ontologies environment,” presented in KSEM, Springer Link 2007.

[9] Y. Huang and N. Venkatasubramanian, “QoS-based resource discovery in intermittently available environments,” in Proc. of the 11th IEEE Int. Symposium on High Performance Distributed Computing, 2002.

[10] B. Jeong, H. Cho, and C. Lee, “On the functional quality of service to discover and compose interoperable web services,” Expert Systems with Application, vol.36, no.3, pp.5411-5418, 2009.

[11] T. S. Somasundaram, R. A. Balachandar, V. Swaminathan, A. Kumar, and V. Paramasivan, “Semantic description and discovery of grid services using matchmaking algorithm,” Advanced Computing and Communications, 2009.

[12] G. Akin, and Y. Pinar, “Semantic matchmaking of web services using model checking,” in Proc. of the 7th int. joint conf. on Autonomous agents and multiagent systems, Estoril, Portugal, 2008.

[13] T. A. Farrag, A. I. Saleh, and H. A. Ali, “Semantic web services matchmaking: semantic distance-based approach,” Computers and Electrical Engineering, vol.39, no.2, pp.497-511, 2013.

[14] J. Euzenat and P. Shvaiko, Ontology Matching, Springer, 2007.

[15] A. Formica, “Concept similarity by evaluating information contents and feature vectors: a combined approach,” Communications of the ACM, vol.52, no.3, pp.145-149, 2009.

[16] A. Formica, M. Missikoff, E. Pourabbas, and F. Taglino, “Semantic search for matching user requests with profiled enterprises,” Computers in Industry, vol.64, no.3, pp.191-202, 2013.
