A Dynamic Ontology Management System for the Semantic Web

Anne Yun-An Chen, Shan Gao, Dennis McLeod

Department of Computer Science, University of Southern California, 941 W. 37th Place, Los Angeles, CA 90089-0781, U.S.A.

{yunanche, mcleod, sgao}@usc.edu

Abstract. The semantic web is an extension of the current web [1][2] in which data are well defined with semantics that software agents can comprehend. Ontologies provide a solid base for representing metadata in the semantic web environment. Several web-compatible languages and technologies [4][5][6] have been developed to support semantic knowledge management for ontologies. Current ontology management systems, such as Protégé [12], allow only one relation type and thereby provide insufficient support for other web technologies. Such overly simplified models lack the required expressiveness of semantics. To solve these problems, we propose a dynamic ontology management system. The system implements a data model enclosing sufficient semantics, integrates ontologies developed with different web-based technologies, and incorporates other data resources for ontology refinement. The performance of the ontology management system is being enhanced to require less human involvement.

1 Introduction

The semantic web is defined as a data representation that reveals the semantics of the data on the world wide web [7]. Since the first release of the Resource Description Framework (RDF) working draft in 1997 [8], many web-based languages and technologies have been proposed to represent data on the web and to manage the metadata. RDF [5] describes resources in terms of triples: representations of subject, property, and object. The DARPA Agent Markup Language (DAML) [6] was developed as an extension of XML and RDF. The Ontology Inference Layer (OIL) [9] was introduced later, and DAML+OIL [10] was developed based on W3C standards as a semantic markup language. The Web Ontology Language (OWL) [11] goes one step further in that the content of the metadata can be processed if needed. The eXtensible Markup Language (XML) has been an essential component of these web-compatible technologies. Ontologies can be described with these technologies in the web environment for machines to read and understand.

Although XML is a common component and several ontology management systems have been designed and developed, ontology developers tend to design their own ontologies using different technologies without reconciling them with other existing ontologies. This not only costs ontology developers unnecessary time and effort but also results in fragmentary ontologies. One reason is that there isn't a complete corpus which can be easily accessed. There are many ontologies developed using RDF, DAML+OIL, or OWL. A system which integrates ontologies built with RDF, DAML+OIL, and OWL will facilitate the establishment of the corpus and benefit all ontology developers.

One well-developed ontology management system is Protégé [12]. Although Protégé provides friendly user interfaces for ontology developers to create domain ontologies and manage the instances, it has several deficiencies. First, there is only one hierarchical relation type, which is too simplified to describe the world with precise semantics. Furthermore, the ontologies developed in Protégé are not portable and not universally understandable: although Protégé provides several domain ontologies for downloading, ontology developers must first install Protégé itself. Finally, although Protégé now provides support for RDF Schema, no integration of the schemas is afforded.

In order to facilitate the corpus establishment and to provide universally understandable schemas, we propose a dynamic ontology management system, which integrates the metadata statements from several resources and updates the ontologies spontaneously. The intentions of our system design are:

1. Transform schemas using a data model with sufficient semantics to prevent information loss
2. Integrate different schemas to establish an ontology corpus for ontology developers
3. Provide the base of ontologies for ontology developers to mature the domain ontologies for both the individual user and the corpus
4. Build a skeleton to unite the heterogeneous data resources
5. Afford a simple and accessible environment for both machines and human beings; ensure compatibility with web services

Our system performs the following tasks: facilitating domain ontology creation and updates, learning new concepts, relations, and patterns among the metadata, and providing the base for the semantic wrapping of information sources. The Classified Interrelated Object Model (CIOM) [13] is implemented in our system in order to map metadata to a well-defined schema. CIOM provides abundant semantics to map the schemas generated with other languages without information loss and to describe the domain knowledge more precisely, in order to assist future decision-making and problem-solving applications.

2 Background

2.1 Classified Interrelated Object Model

Classified Interrelated Object Model (CIOM) is a subset of SDM [14]. With the basic structures, operations, and constraints of SDM, CIOM intends to model the real world and capture the semantics of the application environment. Collections of objects are represented as classes. The properties of a class are inheritable.

Classes. A class is defined as a collection of objects (information units). This collection is a logically homogeneous set of one type of objects. The objects of a class are said to be the members of the class. A minimal sketch of these constructs follows this list.

- Class Name. The name of a class is used to distinguish the class from others. A class name can be any string of symbols that uniquely represents the nature of the class.
- Built-In Classes. Several built-in classes are predefined in CIOM. They are often used as the value classes for basic attributes. For example, STRINGS is the class of all strings over the set of alphanumeric characters, NUMBERS is the class of all numeric values (integers, real, or complex numbers), and BOOLEANS is the class of the two values "True" and "False".
- Member Attributes. Member attributes are the common aspects of the members of a class. A member attribute describes every member of the class. Each attribute has a name that uniquely identifies it among the attributes of the same class. As with a class name, an attribute name can be any string of symbols. The value class of an attribute is the set of all possible values that apply to the attribute; this set can be any built-in class or any user-defined class. In a CIOM schema, attributes always have inverses and are usually shown in pairs. In each pair, each attribute is said to be the inverse of the other; this relation is specified symmetrically. The cardinality of attributes can be classified into three categories: one-to-one, one-to-many, and many-to-many. In CIOM, an attribute can be specified as non-null. In order to retrieve the needed information, attributes can be concatenated.
- Class Attributes. CIOM supports aggregate functions by means of class attributes. An attribute can be defined as the number of members in one class (similar to the count() function in relational SQL). An attribute can also be defined as the maximum, minimum, average, or sum over the members of one class, similar to the max(), min(), avg(), and sum() functions in SQL.
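To make the structures above concrete, the following is a minimal, illustrative Python sketch of CIOM-style classes, member attributes with inverses and cardinality, and aggregate class attributes. The names CIOMClass and Attribute are hypothetical and not part of any published CIOM implementation.

```python
class Attribute:
    """A member attribute with a value class, cardinality, and optional inverse."""
    def __init__(self, name, value_class, cardinality="one-to-one",
                 non_null=False, inverse=None):
        self.name = name
        self.value_class = value_class    # built-in (STRINGS, NUMBERS, ...) or user-defined
        self.cardinality = cardinality    # one-to-one, one-to-many, or many-to-many
        self.non_null = non_null
        self.inverse = inverse            # attributes are specified in symmetric pairs


class CIOMClass:
    """A class: a logically homogeneous collection of member objects."""
    def __init__(self, name):
        self.name = name
        self.member_attributes = {}
        self.members = []

    def add_attribute(self, attribute):
        self.member_attributes[attribute.name] = attribute

    # Class attributes correspond to aggregate functions over the members,
    # analogous to count()/max()/min()/avg()/sum() in relational SQL.
    def count(self):
        return len(self.members)

    def aggregate(self, attribute_name, fn):
        return fn(getattr(m, attribute_name) for m in self.members)
```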

Subclasses. A subclass denotes specialization, viz. its membership is a subset of the members of its parent class. For example, Men and Women are two subclasses of the class People. There are two ways to define a subclass: by operation or by predicate. For a predicate-defined subclass, its membership is determined by conditions upon one or more attributes; the conditions are called predicates. If a member's attribute values satisfy all conditions, the member is automatically added to the subclass. For operationally defined subclasses, membership is not automatically determined by any predicate but decided by user operations (adding/inserting members). A small sketch of the subclass operations follows this list.

- Class Intersection. An intersection subclass contains the members in both of the parent classes involved in the intersection. For type compatibility, the classes involved in this operation must both be subclasses of some common parent, directly or indirectly.
- Class Union. A union subclass contains the members in either of the parent classes involved in the union operation.
- Class Difference. A difference subclass contains the members of one parent class that are not in the other class. Note that the difference operator is not symmetric, i.e., A-B ≠ B-A.
- Attribute Inheritance. A subclass, whether predicate defined or operationally defined, automatically inherits all attributes from its parent class. These attributes need not be shown on the subclass, but they exist implicitly. For subclasses defined by set operators, attribute inheritance is more complex: a subclass formed by intersection of parents inherits all attributes of both parents; a subclass formed by union of parents inherits only those attributes that are common to both parents; a subclass formed by difference of parents inherits all attributes of the parent on the left side of the operator.
- Mutual Exclusiveness and Collective Exhaustiveness. For two or more subclasses, mutual exclusiveness means that no member belongs to more than one of the subclasses. For example, since a person cannot be both a male and a female at the same time, the subclasses "male" and "female" are mutually exclusive. Collective exhaustiveness means that any member of the parent class must belong to at least one of the subclasses. For example, if a person must be either a male or a female, then the subclasses "male" and "female" are collectively exhaustive.
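The sketch below, again purely illustrative and reusing the hypothetical member lists from the previous sketch, shows predicate-defined membership and the three set-operator subclasses with the inheritance conventions described above.

```python
def predicate_subclass_members(parent, predicate):
    """A predicate-defined subclass: members satisfying the predicate are
    automatically included."""
    return [m for m in parent.members if predicate(m)]

def intersection_members(a, b):
    # An intersection subclass inherits all attributes of both parents.
    return [m for m in a.members if m in b.members]

def union_members(a, b):
    # A union subclass inherits only the attributes common to both parents.
    combined = list(a.members)
    return combined + [m for m in b.members if m not in combined]

def difference_members(a, b):
    # A - B is not symmetric; attributes come from the left-hand parent.
    return [m for m in a.members if m not in b.members]
```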

Grouping Classes. A grouping class defines a class of classes. It is a second order class in that its members are of higher-order object type than those of the underlying classes. The members of a grouping class are viewed as classes themselves.

2.2 Universal Data Format

Many data sources (databases) have been represented by the relational model since the model was introduced in 1970 by Ted Codd [15]. Those data sources are not easily accessed by web service stubs which do not understand the schemas behind them, because there is no universal standard format for storing schemas. In order to achieve interoperability for those data sources, a semantic transformation of the schema is essential. This semantic transformation of the schema is required to be web-based and compatible with web service technologies. The use of web services to invoke applications among servers and clients has increased rapidly. Client-side software systems can interact with web services if these systems use XML-based languages conveyed by internet protocols as required [16]. These software systems make requests of message exchanges, file transfers, and data retrievals to the servers. Although data retrieval using web services has been practically operated on the world wide web, there are limitations to applying web services on most heterogeneous databases. Every commercial database management system may have its own format for describing data schemas, and few commercial database management systems can transform database schemas into formats acceptable for the world wide web. The eXtensible Markup Language (XML) is one of the data formats extensively accepted on the world wide web. If database schemas are described by a DTD for XML or by XML Schema, the databases can be accessed more easily in the world wide web environment. Several XML-based database systems have been experimentally or theoretically designed to be compatible with the architecture of web services and to enhance the performance of data retrieval [17]. It would cost too many resources to migrate all the data from traditional database systems to XML-based database systems. Therefore, it is important to design a metadata management system to convert the schemas stored in traditional database management systems to a universal data format. Machines can use the transformed schemas as mediators to understand the traditional database schemas. XML Schema extends the data definition capabilities of XML and includes more semantic definitions. XML Schema is machine-accessible and web-compatible. XML Schema statements are sufficient to describe database schemas and to serve as the universal data format.
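As a hedged illustration of this idea (not the system's actual implementation), the sketch below emits XML Schema statements for a hypothetical relational table described as a list of column tuples; the table_to_xml_schema helper and the type-mapping table are assumptions made for the example.

```python
# Assumed mapping from a few common SQL types to XML Schema built-in types.
RELATIONAL_TO_XSD = {"VARCHAR": "xs:string", "INTEGER": "xs:integer",
                     "FLOAT": "xs:double", "DATE": "xs:date"}

def table_to_xml_schema(table_name, columns):
    """columns: list of (column_name, sql_type, nullable) tuples."""
    lines = ['<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">',
             f'  <xs:element name="{table_name}">',
             '    <xs:complexType>',
             '      <xs:sequence>']
    for name, sql_type, nullable in columns:
        xsd_type = RELATIONAL_TO_XSD.get(sql_type.upper(), "xs:string")
        min_occurs = "0" if nullable else "1"
        lines.append(f'        <xs:element name="{name}" type="{xsd_type}" '
                     f'minOccurs="{min_occurs}"/>')
    lines += ['      </xs:sequence>', '    </xs:complexType>',
              '  </xs:element>', '</xs:schema>']
    return "\n".join(lines)

# Example with a hypothetical earthquake table.
print(table_to_xml_schema("Earthquake",
                          [("id", "INTEGER", False),
                           ("magnitude", "FLOAT", True),
                           ("region", "VARCHAR", True)]))
```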

3 System Architecture

The dynamic ontology management system not only integrates the available ontologies but also builds up its own schema databases. The most significant component in the system is the rule-based agent, which has three major functions: schema transformation, adaptation, and generation. In the first design of the dynamic ontology management system, a manual editing tool is included as one component. The purpose of this editing tool is to update the schemas and to assist schema creation when the rule-based agent does not have a complete knowledge base. The system architecture is shown in Figure 1.

Fig. 1. The system architecture of the dynamic ontology management system for the semantic web

The dynamic ontology management system is designed to fulfill three important tasks, and the system structure is suitable for achieving them:

1. To provide the base for the semantic wrapping of information sources: The system includes one agent which produces the web-compatible, application-accessible, and machine-understandable data format, XML Schema, as its result. There are three groups of data sources for the system, and this agent manages all types of data by itself. The first group is database schemas, the second is results of topic mining, and the third is existing schemas built with web technologies or languages. CIOM, implemented as a portion of the rule-based agent, is a solid base for the semantic wrapping because of the sufficient semantics provided by the model. The information in the original RDF schemas, DAML+OIL ontologies, and other schemas is preserved in the schema adaptation process. The production of XML Schema statements is implemented as one reusable function for schema transformations, generations, and adaptations.

2. To learn new concepts, relations, and patterns among the metadata: The ontology integrator in our system is able to learn new knowledge from different data sources and decide what to preserve, to compromise, and to adjust. It collects concepts, relations, constraints, and patterns from ontologies which were defined in the same domain but by different developers. The learning process implemented in the integrator involves many computations to determine the mapping among the ontologies in order to comprise those into one concluded representation. Initially, a manually edited synonym list is provided to the integrator. The intelligence implemented in the integrator compares the hierarchical relations to determine the new synonyms and decides on the necessity of the concepts and relations from a fresh data source. Possible redundancies of the concepts, relations, and constraints are thereby avoided.

3. To facilitate domain ontology creation and updates: Our system is designed to generate an ontology corpus for all ontology developers. The whole system architecture is constructed in order to facilitate domain ontology management. The rule-based agent generates schemas from many data sources to enrich the references of the ontology corpus. The ontology integrator compromises these schemas to create and update domain ontologies. The well-defined computation method ensures information preservation and redundancy elimination for the final domain ontology representations. The designs of the computation methods for ontology integration are discussed in Section 5.

3.1 Semantic Sufficiency

The core of our dynamic ontology management system is CIOM. All the elements in an ontology can be mapped to the CIOM schema. The ontology is represented in a hierarchical structure [19]. Concepts in the ontology are mapped into classes and instances in CIOM; relations in the ontology are mapped into attributes, inheritances, memberships, and other descriptions of relations in CIOM. The representation from ontology to CIOM is shown in Figure 2, and a small sketch of this mapping follows.
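The following is a minimal sketch of that mapping, assuming a hypothetical representation of the ontology as concept names and (source, relation, target) triples; hierarchical "is-a" relations become subclass links, memberships become instances, and the remaining relations become member attributes.

```python
def ontology_to_ciom(concepts, relations):
    """concepts: iterable of concept names;
    relations: iterable of (source, kind, target) triples."""
    classes = {name: {"attributes": {}, "subclasses": [], "instances": []}
               for name in concepts}
    for source, kind, target in relations:
        if kind == "is-a":              # hierarchical relation -> inheritance
            classes[target]["subclasses"].append(source)
        elif kind == "instance-of":     # membership
            classes[target]["instances"].append(source)
        else:                           # other relations -> member attributes
            classes[source]["attributes"][kind] = target
    return classes
```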

Since CIOM is an object-based model, it is very easy to express the concepts and relations of a candidate schema with XML Schema, because both the structure of XML Schema and the relationship types of CIOM are hierarchical. The mapping process is direct. An example of the mapping between XML Schema and CIOM is shown in Figure 3.

Fig. 2. Representing an ontology using CIOM conceptual Schema

3.2 Schema Processing

Schema Transformation. XML Schema is an appropriate choice for representing the results of schema transformation because of its abundant semantics. Schemas are the descriptions of databases: a database is a collection of related data [6], and the conceptual schema itself can be considered metadata. The question is whether the universal data format can represent the complete information in the schema. There are four widely used data models in traditional commercial database management systems: the relational, network, hierarchical, and object-oriented data models [18]. Attributes in relational models and data items in network data models are easily expressed by elements in XML Schema. The data types of relational data models and the formats of network data models can be transformed to the data types of XML Schema; there are more than 40 data types defined in XML Schema, and users are allowed to define new data types. The relationships and cardinality in both models can be mapped to the attributes of the elements in XML Schema files. The structure of XML Schema is hierarchical, and the properties of elements can be inherited. The schemas in hierarchical and object-oriented data models can be transformed into XML Schema with fewer adjustments.

Fig. 3-1. XML Schema statements representing a partial ontology in the earthquake science domain

Fig. 3-2. A diagram of the CIOM schema matched to the XML Schema statements

The first API in the system is developed to communicate with commercial database management systems in order to retrieve the database schemas. The detector checks the communication rules and calls a Database Connectivity driver to establish the connection. The bytes of the file are read in, and the transformation begins. Once the schema has been read into the API, schema transformation is performed in the next phase. The rule-based agent generates the description of the database schema using XML Schema: attributes of the original database schema are transformed into elements in XML Schema files, and the relationships and cardinality are mapped to the attributes of those elements.
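A hedged sketch of this pipeline is shown below, using Python's standard sqlite3 driver as a stand-in for a commercial Database Connectivity driver and reusing the hypothetical table_to_xml_schema helper sketched in Section 2.2; the real system targets commercial database management systems and its driver interface is not reproduced here.

```python
import sqlite3

def transform_database(db_path, table_name):
    # 1. Establish the connection through the connectivity driver.
    connection = sqlite3.connect(db_path)
    try:
        # 2. Read the stored schema for the requested table.
        columns = []
        for _cid, name, sql_type, notnull, _default, _pk in \
                connection.execute(f"PRAGMA table_info({table_name})"):
            columns.append((name, sql_type, notnull == 0))
        # 3. Hand the retrieved schema to the rule-based transformation,
        #    which emits XML Schema statements (see the earlier sketch).
        return table_to_xml_schema(table_name, columns)
    finally:
        connection.close()
```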

Schema Adaptation. In schema adaptation, the rule-based agent reads in the schemas and ontologies generated with other web-based technologies and languages, generates the class inheritance hierarchy, and maintains the instances and their attributes.

3.3 Web Compatibility and Accessibility

XML Schema. The products of the rule-based agent are XML Schema statements. The reasons for this design are that XML Schema is a universal data format and that it encloses abundant semantics. XML Schema extends the possibilities of data definition for XML. In current web technologies, XML makes application-to-application communication on the web easier because of its universality. One category of important interfaces for the web environment, web services, is based on lightweight protocols to decrease the communication time between two applications. The well-defined representation of data and web services results in better performance of software agents on the web. Applications such as web services require a universal data format which is readable by machines in the web environment. The XML Schema statements produced by our system not only provide the well-defined data representation but also afford the universal format for the web environment.

Browser-based UI. The design of the browser-based user interface gives users a friendly interface to browse, search, and access the XML Schema files, ontologies, and CIOM conceptual schemas. The retrieval process is described in Figure 4. Ontology developers can easily access the management tools on the web and refer to the schemas which they request.

Fig. 4. Schema retrieval procedures in the dynamic ontology management system

4 Rule-Based Agent

Unlike a basic knowledge-based agent, our rule-based agent executes an iteration of actions. Schema generation, transformation, and adaptation are the mandatory tasks in the system; each is a schema mapping process. Some functions, such as iterations of sentence interpretation and schema production, are reusable for all three mapping processes; this is the benefit of implementing all mapping processes in one agent. The original schemas from different data resources may be saved in different file formats. For each type of data source, the mapping rules are required to be generated in advance. The knowledge base stores the mapping rules, and the decision of the action is delineated for each schema mapping process. The knowledge base behind the rule-based agent is manually updated; its construction involves manual editing. In the following, the processes of schema production are briefly discussed.
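A minimal sketch of this dispatch is given below, assuming a hypothetical in-memory knowledge base keyed by (process, source type); the rule strings and the apply_rule placeholder are illustrative only and do not reproduce the system's actual rules.

```python
# Illustrative placeholder rules, not the system's manually edited rule sets.
KNOWLEDGE_BASE = {
    ("transformation", "relational"): ["map attribute -> xs:element",
                                       "map SQL type -> XSD type"],
    ("adaptation", "rdf-schema"):     ["map rdfs:Class -> xs:complexType"],
    ("generation", "topic-mining"):   ["map cluster -> nested xs:element"],
}

def apply_rule(rule, statements):
    # Placeholder for the reusable sentence-interpretation / schema-production
    # functions; a real implementation would rewrite the statements here.
    return statements

class RuleBasedAgent:
    def perceive(self, process, source_type):
        """Fetch the complete rule set for this request and data source type."""
        rules = KNOWLEDGE_BASE.get((process, source_type))
        if rules is None:
            raise ValueError("no mapping rules; the knowledge base is "
                             "extended manually for each new source type")
        return rules

    def act(self, process, source_type, source_schema):
        statements = source_schema
        for rule in self.perceive(process, source_type):
            statements = apply_rule(rule, statements)
        return statements
```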

4.1 Schema Transformations

One function of the rule-based agent is to transform database schemas stored in commercial database management systems into XML Schema statements. The rule-based agent performs as follows: if the agent perceives a request for schema transformation of a database schema of one certain type of data model, it requests all rules that map the original schema to XML Schema according to the type of the data model. Then the agent starts to transform the schema with the complete rule set returned.

4.2 Schema Adaptations

The schema adaptation process handles many data sources and produces the basic knowledge for the ontology corpus. The agent executes the process as follows: if the rule-based agent receives a request for schema adaptation of one certain type of schema, it requests all rules that map the original statements to XML Schema and starts to adapt the schema with the complete rule set returned.

4.3 Schema Generation

Schema generation manages the results from topic mining [20]. Topic mining provides information from an event-based point of view: it hierarchically clusters collections of data and associates topical terms with each cluster according to a topic. Although the data are hierarchically clustered, the associations among the clusters and topic terms are not directly mapped to the hierarchical structure of XML Schema and ontologies; the agent has to modify the associations to suit the other type of structure. The agent accomplishes the task in the following steps: if the agent perceives a request for schema generation based on the results of topic mining, it requests all rules that introduce the relations among the clusters and terms. Then the agent starts to generate the schema with the complete rule set returned.
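The sketch below illustrates the idea under the assumption that a topic-mining result can be treated as a (topic, subclusters) tree; the recursion flattens that tree into nested XML Schema elements. The cluster representation and example are hypothetical.

```python
def cluster_to_elements(cluster, depth=2):
    """cluster: (topic, [subclusters]); returns nested xs:element lines."""
    topic, children = cluster
    pad = " " * (depth * 2)
    if not children:
        return [f'{pad}<xs:element name="{topic}" type="xs:string"/>']
    lines = [f'{pad}<xs:element name="{topic}">',
             f'{pad}  <xs:complexType><xs:sequence>']
    for child in children:
        lines += cluster_to_elements(child, depth + 2)
    lines += [f'{pad}  </xs:sequence></xs:complexType>', f'{pad}</xs:element>']
    return lines

# Hypothetical topic-mining output for an earthquake-related news stream.
example = ("earthquake", [("magnitude", []), ("fault", [("strike_slip", [])])])
print("\n".join(cluster_to_elements(example)))
```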

5 Ontology Integration and Interconnection

5.1 Methodology

The ontology integrator collects concepts, relations, and constraints from ontologies developed in the same domain but by different developers. It is designed to learn new knowledge from different data sources and to decide what to preserve, to compromise, and to adjust. In order to determine what is to be comprised into one concluded representation, a computation method including data mining techniques or pattern matching methods is required. Some data mining technologies specifically designed for hierarchical structures have been experimentally and practically developed [22][23], but those are only used to analyze the similarities between two small hierarchical structures which are portions of the same parent structure. These data mining technologies are not suitable for ontology integration. On the other hand, tree matching algorithms [24][25][26] are more suitable for ontology integration. The first barrier to applying tree matching algorithms is that different ontology developers may label the same concept with different syntax, which causes difficulties in the matching processes. The solution is to import a synonym list; its implementation is discussed in the following section. The second barrier is that different domain ontologies may have different scales. Therefore, a pre-procedure is designed to equalize the semantic scales in order to perform the tree matching algorithm. This pre-procedure is designed to be suitable for equalizing several candidate domain ontologies in one computation, and it handles three different cases in the equalizing process. Before performing the equalizing process, a search for the synonyms of the nodes and translations of the concept names is required in order to compare the scales of two ontologies. The case handling is described below, followed by a small sketch of the procedure:

- Case 1: If the candidate ontology has a larger scale than the current ontology, a subtree with the same scale as the current ontology is extracted from the candidate ontology for the further computation. The concept nodes of the candidate ontology excluding the ones in the subtree are kept in another queue for the insertion-only voting mechanism, together with other queued nodes from large-scale candidate ontologies.

- Case 2: If the candidate ontology has the same scale as the current ontology, the candidate ontology awaits the further computation directly.

- Case 3: If the candidate ontology has a smaller scale than the current ontology, a subtree with the same scale as the candidate ontology is extracted from the current ontology. The computation of the edit operations is based on the candidate ontology and the subtree of the current ontology. The results of the edit operations are considered in the voting process; the candidate ontology stands for one vote only for the edit operations within its scale.
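A hedged sketch of this case handling follows. It assumes an ontology is represented as a (label, children) pair and that "scale" means the number of concept nodes; the extraction strategy used here (a preorder prefix) is a crude stand-in for the paper's extraction procedure, which is not specified in detail.

```python
def size(node):
    label, children = node
    return 1 + sum(size(child) for child in children)

def all_labels(node):
    label, children = node
    return [label] + [x for child in children for x in all_labels(child)]

def extract_subtree(node, budget):
    """Keep a preorder prefix of at most `budget` nodes; return the pruned
    tree and the labels of the nodes that were dropped."""
    dropped, remaining = [], [budget]

    def copy(n):
        label, children = n
        remaining[0] -= 1
        kept_children = []
        for child in children:
            if remaining[0] > 0:
                kept_children.append(copy(child))
            else:
                dropped.extend(all_labels(child))
        return (label, kept_children)

    return copy(node), dropped

def equalize(current, candidate, deferred_queue):
    cur_n, cand_n = size(current), size(candidate)
    if cand_n > cur_n:
        # Case 1: shrink the candidate; queue the cut-off concept nodes for
        # the insertion-only voting round.
        subtree, leftover = extract_subtree(candidate, cur_n)
        deferred_queue.extend(leftover)
        return current, subtree
    if cand_n == cur_n:
        # Case 2: the scales already match.
        return current, candidate
    # Case 3: shrink the current ontology to the candidate's scale; the
    # candidate counts as one vote only within that scale.
    subtree, _ = extract_subtree(current, cand_n)
    return subtree, candidate
```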

In tree matching algorithms, the edit distance, defined as the minimum edit cost, must be determined in order to perform the edit operations. The edit operations fall into two categories: node insertion and node deletion (relabeling is not considered here because matching against the synonym list has been done in advance). The mapping cost for two semantics-equal trees/subtrees T and T' is defined as follows:

$$c(M) = \sum_{i \in \{T[\mathrm{node}]\}} c(T[i] \to \lambda) + \sum_{j \in \{T'[\mathrm{node}]\}} c(\lambda \to T'[j]) \qquad (1)$$

[25][26], where M is the mapping of T and T' (excluding the leaf nodes), c is the cost function for a sequence of edit operations, and λ is the null node.

The goal is to minimize the cost of the mapping. The implementation in the system establishes a cost table in order to compute the minimal cost. In the worst case, the computational complexity of each computation is the number of nodes in the larger tree raised to the power of four.
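The sketch below only evaluates the cost of equation (1) for one given candidate mapping, assuming unit insertion and deletion costs and node sets from which leaf nodes have already been removed; the system's cost-table minimization over all admissible mappings is not reproduced here.

```python
def mapping_cost(t_nodes, t_prime_nodes, mapping,
                 delete_cost=1, insert_cost=1):
    """t_nodes, t_prime_nodes: sets of (non-leaf) node labels of T and T';
    mapping: set of (node_in_T, node_in_T_prime) pairs."""
    mapped_t = {a for a, _ in mapping}
    mapped_t_prime = {b for _, b in mapping}
    # Unmapped nodes of T are deleted (T[i] -> lambda); unmapped nodes of T'
    # are inserted (lambda -> T'[j]): exactly the two sums of equation (1).
    deletions = [n for n in t_nodes if n not in mapped_t]
    insertions = [n for n in t_prime_nodes if n not in mapped_t_prime]
    return (delete_cost * len(deletions) + insert_cost * len(insertions),
            deletions, insertions)

# Hypothetical example: cost == 3, since "magnitude" and "region" are deleted
# and "epicenter" is inserted.
cost, dels, ins = mapping_cost({"fault", "magnitude", "region"},
                               {"fault", "epicenter"},
                               {("fault", "fault")})
```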

In order to apply tree matching algorithms to ontology integration, we introduce a voting mechanism on semantics-equal ontologies to avoid biased information. The voting mechanism is defined as follows: in the ith operation of the ontology integration, n new candidate schemas (n >= 1 and n is an odd number), the ontology resulting from the (i-2)th operation, and the ontology resulting from the (i-3)th operation are matched to the current ontology individually. Each match results in a set of nodes (excluding leaf nodes) to be inserted or deleted. If one certain edit operation gains more than half of the votes from the n+2 matches, the edit operation is taken into action, either insertion or deletion.

The two previous results of ontology integration are cached and included in the voting process in order to decrease the effect of occasional biased node edit operations on the final domain ontology. Furthermore, the leaf nodes are excluded from the voting process, because the leaf nodes represent very specific concepts (e.g. the instances of their parent nodes), and those should be preserved in every integration. A small sketch of this voting mechanism follows.
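A minimal sketch of the voting step, assuming each match has already been reduced to a set of ('insert' | 'delete', concept) pairs with leaf-node edits filtered out; the concept labels in the example are hypothetical.

```python
from collections import Counter

def vote_on_edits(edit_sets):
    """edit_sets: one set of ('insert'|'delete', concept_label) pairs per
    match (n candidate schemas plus the 2 cached integration results)."""
    votes = Counter(op for edits in edit_sets for op in edits)
    threshold = len(edit_sets) / 2
    # An edit operation is applied only if it wins a strict majority.
    return [op for op, count in votes.items() if count > threshold]

# Example with n = 3 candidates and 2 cached results (5 votes in total):
approved = vote_on_edits([
    {("insert", "aftershock")}, {("insert", "aftershock")},
    {("insert", "aftershock"), ("delete", "tremor")},
    {("delete", "tremor")}, set(),
])
# ("insert", "aftershock") has 3 of 5 votes and is applied;
# ("delete", "tremor") has only 2 and is discarded.
```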

5.2 Synonym List

The synonym list is developed in order to determine the equality of semantics in the tree matching process. There are two different stages in the application of the synonym list. The first stage requires knowledge from domain experts; in the second stage, the intelligent agent is able to learn based on the knowledge available in the knowledge base. The two stages are described below:

- Stage 1 (phase of consumption): At the beginning of domain ontology construction, the information in the synonym list is consumed to substitute the syntax in the candidate schemas for the ontology integration process. In this stage, the synonym list requires manual editing by domain experts.
- Stage 2 (phase of feedback): When the structure of the domain ontology is mature enough to contain a sub-structure which is an exact match for the candidate schema, the ontology integrator is able to feed back new synonyms by comparing the syntax and extracting the different one.

Each synonym corresponds to one concept in the domain ontology stored in the database; the cardinality ratio of this binary relationship is many-to-one. A small sketch of this synonym substitution and feedback follows.
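The following sketch illustrates the many-to-one synonym list and the two stages described above; the entries and helper names are hypothetical.

```python
# Many synonym labels map to a single stored concept (many-to-one).
SYNONYMS = {
    "quake": "earthquake",
    "temblor": "earthquake",
    "epicentre": "epicenter",
}

def canonicalize(label):
    """Stage 1: substitute candidate-schema syntax with the stored concept name."""
    return SYNONYMS.get(label.lower(), label.lower())

def learn_synonym(candidate_label, matched_concept):
    """Stage 2: when an exact structural match is found but the labels differ,
    record the candidate label as a new synonym of the stored concept."""
    if canonicalize(candidate_label) != matched_concept:
        SYNONYMS[candidate_label.lower()] = matched_concept
```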

6 Conclusion

Current ontology developers establish domain ontologies with different tools and web technologies. In order to integrate a large volume of domain ontologies for effective utilization and convenient reference, a dynamic ontology management system is proposed. The dynamic ontology management system provides the base for the semantic wrapping of information sources, learns new concepts, relationships, and patterns among the metadata, and facilitates domain ontology creation and updates. Many ontology management systems are implemented with limited semantics; we propose to implement a data model, CIOM, as the core of our ontology management system. CIOM includes abundant semantic representations to model the real world.

An essential feature is the rule-based agent, which manages the schema transformation, adaptation, and generation. The rules in the knowledge base are sufficient but not redundant. The schema transformation wraps the database schemas with semantics; the database schemas are transformed into XML Schema files. The schema adaptation takes different types of schemas as input and alters them into useful data sources for ontology updates. The integrator merges the schemas from the same domain and computes the new domain ontology. The tasks performed by these components indicate that the dynamic ontology management system will be able to operate spontaneously, without human effort involved, in the near future.

An important issue in the system design is the examination and implementation of the algorithm behind the integrator. How to define the necessity of any concept and relation, and what to preserve during every integration and update of the ontology, need to be considered during the algorithm design. A voting mechanism is introduced to assist ontology updates and to reduce the possibility of bias. Currently, experts from two domains, journalism and earthquake science, support the development of domain ontologies and synonym lists. The experiments to evaluate the performance of our system will first be designed to manage ontologies in the journalism and earthquake science domains. A domain ontology for earthquake science has been developed based on CIOM; the information on the concepts and the relations among them is stored in a commercial database management system. Currently, there are 720 concepts stored for the earthquake science domain ontology. The synonym list associated with this domain ontology is being developed.

References

1. Berners-Lee, T., Hendler, J., Lassila, O.: The Semantic Web. Scientific American (May 2001)
2. http://www.semanticweb.org/SWWS/
3. Jewell, E., Abate, F. (eds.): The New Oxford American Dictionary. Oxford University Press (Dec 2001)
4. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E. (eds.): Extensible Markup Language (XML) 1.0, Second Edition. World Wide Web Consortium (6 October 2000)
5. Thompson, H.S., Beech, D., Maloney, M., Mendelsohn, N. (eds.): XML Schema Part 1: Structures. World Wide Web Consortium Recommendation (2 May 2001) http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
6. http://www.daml.org/about.html
7. Berners-Lee, T., Swick, R.: The Semantic Web. Presentation at WWW9, Amsterdam (May 16, 2000)
8. Lassila, O., Swick, R.: Resource Description Framework (RDF) Model and Syntax Specification. W3C Recommendation (Feb 22, 1999) http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
9. Bechhofer, S., Broekstra, J., Decker, S., Erdmann, M., Fensel, D., Goble, C., van Harmelen, F., Horrocks, I., Klein, M., McGuinness, D., Motta, E., Patel-Schneider, P., Staab, S., Studer, R.: An Informal Description of Standard OIL and Instance OIL. Tech. rep., University of Manchester (2000)
10. http://www.daml.org/2001/03/reference
11. Smith, M., Welty, C., McGuinness, D.L.: OWL Web Ontology Language Guide. W3C Proposed Recommendation (15 December 2003)
12. Noy, N.F., Sintek, M., Decker, S., Crubezy, M., Fergerson, R.W., Musen, M.A.: Creating Semantic Web Contents with Protege-2000. IEEE Intelligent Systems 16(2):60-71 (2001)
13. McLeod, D., Gao, S.: Classified Interrelated Object Model. Class material (2002)
14. Hammer, M., McLeod, D.: Database Description with SDM: A Semantic Database Model. ACM TODS 6(3):351-386 (1981)
15. Codd, E.F.: A Relational Model of Data for Large Shared Data Banks. CACM 13(6) (June 1970)
16. Austin, D., Barbir, A., Ferris, C., Garg, S.: Web Services Architecture Requirements. W3C Working Draft (14 November 2002) http://www.w3.org/TR/wsa-reqs
17. Graves, M.: Designing XML Databases. Prentice Hall, 1st edition (2001)
18. Lécluse, C., Richard, P., Velez, F.: O2, an Object-Oriented Data Model. In: Zdonik, S.B., Maier, D. (eds.): Readings in Object-Oriented Database Systems. Morgan Kaufmann, San Mateo, CA (1990) 227-241
19. Khan, L., McLeod, D., Hovy, E.: Retrieval Effectiveness of an Ontology-Based Model for Information Selection. The VLDB Journal (2003)
20. Chung, S., McLeod, D.: Dynamic Topic Mining from News Stream Data. The 2nd International Conference on Ontologies, Databases, and Application of Semantics for Large Scale Information Systems (2003)
21. http://imsc-dmim.usc.edu:9091/ciomplus/
22. Shahabi, C., Zarkesh, A., Adibi, J., Shah, V.: Knowledge Discovery from Users Web-Page Navigation. IEEE RIDE (1997)
23. Ganesan, P., Garcia-Molina, H., Widom, J.: Exploiting Hierarchical Domain Structure to Compute Similarity. ACM Transactions on Information Systems 21(1):64-93 (Jan 2003)
24. Jung, J.J., Jeong-Seob Y., Geun-Sik J.: Collaborative Information Filtering by Using Categorized Bookmarks on the Web. In: Bartenstein, O., Geske, U., Hannebauer, M., Yoshie, O. (eds.): Web Knowledge Management and Decision Support. 14th International Conference on Applications of Prolog, INAP 2001, Revised Papers. Lecture Notes in Artificial Intelligence, Vol. 2543. Springer-Verlag, Berlin (2003) 237-250
25. Wang, J., Zhang, K., Jeong, K., Shasha, D.: A System for Approximate Tree Matching. IEEE Transactions on Knowledge and Data Engineering 6 (1994) 559-570
26. Apostolico, A., Galil, Z. (eds.): Approximate Tree Pattern Matching (Ch. 14). In: Pattern Matching Algorithms. Oxford University Press (1997)
27. Hammer, J., McLeod, D.: An Approach to Resolving Semantic Heterogeneity in a Federation of Autonomous, Heterogeneous Database Systems. Int. J. Cooperative Inf. Syst. 2(1):51-83 (1993)
28. Kahng, J., McLeod, D.: Dynamic Classification Ontologies for Discovery in Cooperative Federated Databases. CoopIS (1996) 26-35