An Ontology Based Resource Scheduling Scheme in Grids


Hamed Vahdat-Nejad, Nasser Nemat-Bakhsh
Computer Engineering Department, University of Isfahan, Iran


Abstract

Traditional schedulers in the grid environment do not support semantic description of grid resources and jobs. They do exact matching between resource requesters and providers. These mechanisms are too restrictive because there is no prior agreement on how a resource or a service or a job will be represented. In this paper, we propose an extensible scheduler that uses the semantic description of entities related to scheduling. More precisely, the scheduler uses the semantic description of resources and jobs by means of ontology. Furthermore, it exploits a semantic matching scheme to perform scheduling. Experimental results show the feasibility and effectiveness of the proposed scheduler.

Keywords: Grid computing; Ontology; Scheduling; Semantic matching

1. Introduction

A grid computing infrastructure is a collection of resources connected by a network, in which resource advertisement, discovery, and sharing are made possible by means of appropriate software [1]. Resource management and scheduling is an important issue in grid computing. Although scheduling is already a difficult, NP-complete problem in clusters of workstations, it is more difficult in grids because of the following factors:

Dynamic nature of the grid;
Numerous available resources;
Several types of resources such as computational, storage, catalog, network, sensors, labs;
Several options for scheduling;
Difficulty of knowing the capabilities of resources in advance;
Automation;
Heterogeneity;
Site autonomy;
Different site policies.

In summary, scheduling a single resource in this environment is already a challenging task, and scheduling several independent resources to execute a complex workflow is hardly possible with today's grid middleware [2]. On the other hand, a scheduling decision is nothing more than a matching between application requirements and resource descriptions. Existing methods for describing resources and application requirements in the grid are highly constrained. Traditional resource matching is symmetric and attribute-based [7]: the attribute values advertised by resources are compared and matched with those required by applications and jobs. For the comparison to be meaningful, the resource consumer and provider must agree upon the names and values of the attributes. In addition, the exact matching and the coordination it requires between resource providers and consumers make the system inflexible and difficult to extend [7]. Finally, the scheduling process demands specific knowledge from grid users and system administrators. Users need this knowledge to find and select appropriate resources, while administrators have to manually provide resource knowledge in a format consumable by users. In a highly dynamic environment such as a grid, where resources come and go, it is important and desirable to automate the matching between resource descriptions and application requirements. Delegating this process from humans to machines raises the need to provide the knowledge of resources and jobs in a machine-processable format [2].
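The brittleness of symmetric, attribute-based matching can be illustrated with a minimal Python sketch. The attribute names (os, cpus, memory_mb) and the function exact_match are hypothetical and are not taken from any grid middleware; the sketch only shows that a match succeeds when both sides use identical names and identical values, and fails otherwise.

```python
def exact_match(job_requirements, resource_attributes):
    """Return True only if every required attribute appears in the
    resource advertisement under the same name with the same value."""
    return all(
        resource_attributes.get(name) == value
        for name, value in job_requirements.items()
    )


# Hypothetical request and advertisements (illustrative names and values).
job = {"os": "UNIX", "cpus": 4}
same_vocabulary = {"os": "UNIX", "cpus": 4, "memory_mb": 2048}
more_specific_value = {"os": "Linux", "cpus": 4}         # Linux is a kind of UNIX
different_name = {"operating_system": "UNIX", "cpus": 4}  # same meaning, other name

print(exact_match(job, same_vocabulary))       # matches
print(exact_match(job, more_specific_value))   # rejected: value differs
print(exact_match(job, different_name))        # rejected: attribute name differs
```

The second and third resources could run the job, yet both are rejected; this is the kind of restriction that a semantic description of resources and jobs is meant to remove.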

2008 International Conference on Computer and Electrical Engineering

978-0-7695-3504-3/08 $25.00 © 2008 IEEE
DOI 10.1109/ICCEE.2008.178


In this paper, we propose a new approach to perform scheduling using an ontology-based scheme. The goal is to use the semantic description of entities related to scheduling. Unlike the traditional schedulers, which exploit a symmetric way for describing resource/job properties, separate ontologies are developed to describe resource properties and job requirements, independently. In addition, the proposed scheduling ontology scheme exploits a semantic matching instead of exact matching between resources and jobs. Finally, it can be easily extended by adding new vocabularies and extending the ontologies. When the ontology is built and shared among resource brokers, user agents, and schedulers, the process of finding and allocating resources for a submitted job becomes easier.

2. Background to ontology

In recent years the development of ontologies has moved from the realm of artificial-intelligence laboratories to the desktops of domain experts. An ontology is a formal, unambiguous description of the concepts in a domain of interest (classes), the properties of each concept (slots), and the restrictions on slots (facets). Ontologies are used to provide meta-information for describing data semantics [8]. An ontology offers a shared understanding of a domain of interest to users and agents, and facilitates communication between them in a machine-processable format [9]. It is composed of a set of concepts and the relationships among them, and can be used in information retrieval to deal with user queries [10]. Furthermore, it can be used as a tool for constructing knowledge bases: an ontology together with a set of individual instances of its classes constitutes a knowledge base.

Often an ontology of a domain is not a goal in itself. Developing an ontology is similar to defining a set of data and their structure for other programs to use. Problem-solving methods, domain-independent applications, and software agents use ontologies and knowledge bases built from ontologies as data. In day-to-day human interactions we assume shared understanding and shared values, so we pay no attention to the ontology problem. In a grid environment, however, where resource scheduling is an important issue, sharing a common ontology becomes a significant matter.

RDF(S) is an ontology and knowledge representation language, but it is quite primitive. OIL [11] and DAML+OIL [12] are more recent proposals for ontology representation languages. DAML+OIL is based on the original OIL language but differs from it in a number of ways, and it offers better interoperability at the semantic level. DAML+OIL extends the basic primitives of RDF(S) to provide a more expressive ontology modeling language and some simple terms for creating inferences [13].

Protégé-2000 [3] is the latest version of the Protégé line of tools, created by the Stanford Medical Informatics group at Stanford University [4]. Protégé is an ontology editor that supports RDF Schema. It provides an integrated environment for editing ontologies and instances. It conceals the ontology language from ontology developers and lets them work with high-level concepts, which leads to rapid ontology development [7].

3. Semantic Scheduling strategy

In this section, we define the proposed semantic scheduling (SS) scheme as well as the Grid environment for which the scheduling strategy was designed. A Grid computing environment consists of many diverse machine types, software, disks, and networks. In the assumed Grid environment, individual desktops, servers, clusters, and even large multiprocessor systems can provide resources. Distinct resources offer varying amounts of CPU FLOPs, memory, hard disk space, and bandwidth. In the proposed approach, separate ontologies are developed for describing resource properties and job requirements. Once these ontologies are shared among user agents, resource brokers, and schedulers, scheduling can be done in an automatic and asymmetric manner, and users are relieved of having to agree upon the attribute names of different resources and jobs. Each scheduler has a database that contains the information about services and jobs. A service provider registers its service in the scheduler database, and the scheduler creates a new instance in the service description ontology. A service requester (Grid user) submits a job by sending its description to the scheduler, and the scheduler creates an instance in the job requirement ontology. Upon receiving a job description, the scheduler calls the scheduling program, which returns the service that best matches the job requirements.
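The register/submit/match workflow described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class Scheduler, its method names, the attribute names (cpus, memory_gb), and the scoring function headroom are all hypothetical, and plain dictionaries stand in for ontology instances.

```python
from dataclasses import dataclass, field


def headroom(requirements, description):
    """Hypothetical score: the smallest offered/required ratio over all
    required attributes, or 0.0 if any requirement is not met."""
    ratios = [description.get(attr, 0) / needed
              for attr, needed in requirements.items()]
    worst = min(ratios, default=1.0)
    return worst if worst >= 1 else 0.0


@dataclass
class Scheduler:
    """Toy scheduler database holding service and job instances."""
    services: dict = field(default_factory=dict)
    jobs: dict = field(default_factory=dict)

    def register_service(self, name, description):
        # Provider side: create a new service description instance.
        self.services[name] = description

    def submit_job(self, name, requirements, score=headroom):
        # Requester side: create a new job requirement instance, then
        # return the best-scoring registered service, or None if none fits.
        self.jobs[name] = requirements
        ranked = [(score(requirements, desc), svc)
                  for svc, desc in self.services.items()]
        ranked = [(s, svc) for s, svc in ranked if s > 0]
        return max(ranked)[1] if ranked else None


scheduler = Scheduler()
scheduler.register_service("cluster-a", {"cpus": 8, "memory_gb": 16})
scheduler.register_service("desktop-b", {"cpus": 2, "memory_gb": 4})
scheduler.register_service("server-c", {"cpus": 1, "memory_gb": 32})
best = scheduler.submit_job("job-1", {"cpus": 2, "memory_gb": 4})
print(best)
```

The scoring function is passed in as a parameter so that a semantic matcher could replace the numeric placeholder without changing the registration and submission steps.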

3.1 Methodology

We developed the ontologies for the resource and job requirement domains using Protégé-2000 [3]. We chose Protégé because of its extensible architecture, which makes it easy to create and integrate new extensions [4]. These extensions allow the implementation of applications that use Protégé-2000 ontologies. We have implemented two extensible ontologies to show the feasibility of the approach; they can be extended for new resource types. The resource ontology mainly consists of a root class, resource, and six subclasses (Figure 1): computational, memory, hard disk, network, software, and operating system. Each class has some attributes, and some classes have further subclasses. For example, the attributes of the class computational consist of the number of available CPUs, clock speed, operating system, and type. The type attribute shows how the machines are connected for nodes with more than one CPU, i.e. either tightly or loosely coupled. Figure 2 shows the hierarchical structure of part of the ontology (the subclass structure of the class Operating system), which has several concepts and the "is a" relationship. The Operating system node has three subnodes: UNIX, Sun, and Windows. The UNIX and Windows subnodes each have three further subnodes, and the same is true for the Linux subnode. Afterward, a variety of instances are created for each class, and each resource is instantiated.

Fig 1. The class hierarchy of resource ontology

Fig 2. The hierarchical structure of the Operating system concept

The job requirement ontology has job requirement as its root class. This class has several attributes, including name, owner, and type. There are also six subclasses, as shown in Figure 3. When a user submits a job, the scheduler creates an instance in the job requirement ontology. The scheduler is implemented using the Protégé-2000 Java API. It accesses the Protégé ontologies by loading the Protégé project files that were written for the resource and job requirement ontologies. The scheduler is an extension of our previous works [5], [6], which has been equipped with the ontologies and semantic matching. When a new job is submitted, the scheduler is triggered and allocates the highest-ranked site to the arriving job.

Fig 3. The class hierarchy of job requirement ontology

3.2 Semantic matching example

An example illustrates the scheduler's ability to perform semantic matching. When a job is submitted, an instance of the job's requirements is created. We say that a job's required operating system is compatible with the operating system of a resource if the two are the same or if the resource OS is a subnode of the job's required OS in the ontology. For example, if a job's required OS is specified as UNIX, it is compatible with the OS of any node specified as UNIX or as any subnode of UNIX in the hierarchy, including IRIX, AIX, and Linux, as well as Redhat, Fedora, and Debian.
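The compatibility rule above amounts to a subsumption test over the "is a" hierarchy. The following Python sketch is one way to express it, assuming the hierarchy is stored as a child-to-parent map; the names IS_A, ancestors, and os_compatible are hypothetical, and the scheduler described in this paper uses Protégé ontologies rather than a dictionary. The three Windows subnodes are not named in the text, so they are left out.

```python
# Child -> parent ("is a") links taken from the hierarchy described in the text.
# The three Windows subnodes are not named there, so they are omitted.
IS_A = {
    "UNIX": "Operating system",
    "Sun": "Operating system",
    "Windows": "Operating system",
    "IRIX": "UNIX",
    "AIX": "UNIX",
    "Linux": "UNIX",
    "Redhat": "Linux",
    "Fedora": "Linux",
    "Debian": "Linux",
}


def ancestors(concept):
    """Yield the concept itself, then each ancestor up to the root."""
    while concept is not None:
        yield concept
        concept = IS_A.get(concept)


def os_compatible(required_os, resource_os):
    """True if the resource OS equals the required OS or lies below it."""
    return required_os in ancestors(resource_os)


for resource_os in ("UNIX", "AIX", "Debian", "Windows"):
    print(resource_os, os_compatible("UNIX", resource_os))
```

The test is deliberately asymmetric: a resource running Debian satisfies a job that asks for UNIX, but a resource advertised only as UNIX does not satisfy a job that asks for Debian.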

3.3 Experiments

To generate jobs, we modeled the running time of a job as a random variable and the arrival of jobs as a Poisson process with a parameter of 6 seconds. Figure 3 shows the results of the simulation as a plot of makespan versus time for the submitted jobs. Makespan is defined as the time interval between submitting a job and completing its execution. Accordingly, we showed the necessity and feasibility of using ontology and knowledge in grid middleware, and especially in grid scheduling. We believe that using ontology in the grid middleware layer is a suitable solution for integrating different grids and automating the scheduling problem.

Figure 3. Makespan versus time plot
[Plot omitted; the vertical axis ticks run from 0 to 350 and the horizontal axis ticks from 2 to 82.]
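The job-generation setup of Section 3.3 can be approximated with a short Python sketch. It is not the authors' simulator, and several details are assumptions: the Poisson parameter is read as a mean inter-arrival time of 6 seconds, running times are drawn uniformly between 1 and 10 seconds, and a single resource serves jobs in arrival order. Makespan follows the definition given in the text, the interval between submitting a job and completing it.

```python
import random


def simulate(num_jobs, mean_interarrival=6.0, max_runtime=10.0, seed=0):
    """Return (submit_time, makespan) pairs for jobs served in arrival
    order on one resource. All parameters are illustrative assumptions."""
    rng = random.Random(seed)
    clock = 0.0      # submission time of the current job
    free_at = 0.0    # time at which the resource next becomes idle
    results = []
    for _ in range(num_jobs):
        # Exponential gaps between arrivals give a Poisson arrival process.
        clock += rng.expovariate(1.0 / mean_interarrival)
        runtime = rng.uniform(1.0, max_runtime)
        start = max(clock, free_at)   # wait if the resource is still busy
        free_at = start + runtime
        results.append((clock, free_at - clock))
    return results


for submit_time, makespan in simulate(5):
    print(f"submitted at {submit_time:6.2f} s, makespan {makespan:5.2f} s")
```

Plotting the second value of each pair against the first would give a makespan versus time curve of the same kind as the figure, though the numbers depend entirely on the assumed distributions.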


References

[1] Michael Walker, A Framework for Effective Scheduling of Data-Parallel Applications in Grid Systems, Master's thesis, University of Virginia, 2001.
[2] Wieder, Bringing knowledge to middleware - Grid scheduling ontology, CoreGRID technical report, 2005.
[3] http://protege.stanford.edu.
[4] Gomez-Perez, et al., Ontological engineering, chapter 5, Springer, 2003.
[5] Hamed Vahdat-Nejad, et al., Distributed resource scheduling in grid computing using fuzzy approach, in Proceedings of the third international conference on information and knowledge technology, IKT2007, Iran.
[6] Hamed Vahdat-Nejad, et al., A new fuzzy algorithm for global job scheduling in multiclusters and grids, in Proceedings of CIMSA 2007, IEEE.
[7] Tangmunarunkit, et al., Ontology-based resource matching in the grid - The grid meets the semantic web, Information science institute, University of California.
[8] D. Fensel, Ontologies: A silver bullet for knowledge management and electronic commerce, Springer, 2003.
[9] H. Stuckenschmidt, Ontology-based information sharing in weakly structured environments, Ph.D. thesis, AI Department, Vrije University Amsterdam, 2002.
[10] FIPA 2001, Foundation for intelligent physical agents, FIPA Ontology Service Specification, http://www.fipa.org/specs/fipa00086/XC00086D.html.
[11] D. Fensel, et al., OIL in a nutshell, Proceedings of EKAW-2000, LNAI, 2000.
[12] DAML, Darpa Agent Markup Language Program, http://www.daml.org.
[13] Simone A. Ludwig, S.M.S. Reyhani, Introduction of semantic matchmaking to Grid computing, Journal of parallel and distributed computing, 2005.
