Issues in Ontology-based Information integration By Zhan Cui, Dean Jones and Paul O’Brien.
Issues in Ontology-based
Information integration
By Zhan Cui, Dean Jones and Paul O’Brien
E-commerce
Requirement:
• Retrieve and Integrate information from multiple resources.
• Details of the resources are hidden from users.
Obstacles:
• Understand queries.
• Determine resources.
• Integrate information.
Four levels in interoperability problems
1. System level: incompatible hardware and operating system.
2. Syntactic level: different language and data representations.
3. Structural level: different data models.
4. Semantic level: the meaning of terms.
e.g., synonyms.
Many technologies, such as CORBA and DCOM, address problems at the first three levels.
XML vs. Semantic Heterogeneity Problems
XML:
• Can solve problems at the schema level.
• Provides a common syntax for exchanging heterogeneous information.
• Some standards for e-commerce:
e.g., ebXML.
XML:
• Cannot solve problems at the semantic level.
• Terminology may not be consistent within one file or across a set of files.
Solution
• Formally specify the meaning of the terminology of each system
(Formal ontology)
• Define a translation between each system's terminology and an intermediate terminology.
(Ontology mapping)
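The idea of translating through an intermediate terminology can be sketched as follows. This is only an illustration: the system names (shop_a, shop_b) and all terms are hypothetical and do not come from the slides.

```python
# Hypothetical term mappings; none of these names come from the slides.
# Each system maps its own terms onto a shared intermediate vocabulary.
TO_SHARED = {
    "shop_a": {"client": "customer", "cost": "price"},
    "shop_b": {"buyer": "customer", "amount": "price"},
}


def translate(term: str, source: str, target: str) -> str:
    """Translate a term from one system to another via the shared vocabulary."""
    shared = TO_SHARED[source][term]
    # Invert the target system's mapping to go from shared term to local term.
    from_shared = {v: k for k, v in TO_SHARED[target].items()}
    return from_shared[shared]


print(translate("client", "shop_a", "shop_b"))  # buyer
```

With an intermediate terminology, each of n systems needs one mapping to the shared vocabulary, rather than one mapping to each of the other n - 1 systems.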
Outline
• Issues in Resolving Semantic Heterogeneity
• Description of DOME
• DOME Demonstrator
• Conclusion
Issues in Resolving Semantic Heterogeneity
--- Developing ontologies
Formal ontology:
• A formal ontology consists of definitions of terms.
• It usually includes concepts with associated attributes, relationships and constraints defined between the concepts and instances of concepts.
Formal ontologies come in different types for different purposes:
• Resource ontologies: define the terminology used by specific information resources.
• Personal ontologies: define the terminology of a user or some group of users.
• Shared ontologies: the common terminology between a number of different systems.
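A minimal sketch of how such an ontology might be held in memory, assuming a simple model of concepts, typed attributes and a single "is-a" relationship. The class names and the Product/Book example are hypothetical; constraints and instances are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    name: str
    attributes: dict = field(default_factory=dict)  # attribute name -> type
    parent: Optional["Concept"] = None              # an "is-a" relationship


@dataclass
class Ontology:
    kind: str  # "resource", "personal", or "shared"
    concepts: dict = field(default_factory=dict)

    def add(self, concept: Concept) -> None:
        self.concepts[concept.name] = concept

    def is_a(self, name: str, ancestor: str) -> bool:
        """Follow parent links to test whether `name` specialises `ancestor`."""
        node = self.concepts.get(name)
        while node is not None:
            if node.name == ancestor:
                return True
            node = node.parent
        return False


# Hypothetical shared ontology with two concepts.
shared = Ontology("shared")
product = Concept("Product", {"price": float})
book = Concept("Book", {"isbn": str}, parent=product)
shared.add(product)
shared.add(book)
print(shared.is_a("Book", "Product"))  # True
```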
Issues in Resolving Semantic Heterogeneity
--- Developing ontologies
The best approach to developing an ontology is usually determined by its eventual purpose.
For example:
• Resource ontologies: bottom-up approach.
• Shared ontologies: top-down approach.
Issues in Resolving Semantic Heterogeneity
--- Mapping Between Ontologies
• Human intervention is necessary.
• Some tools are helpful: mediator systems, mapping libraries and conversion functions.
• Mappings are not always exact, so information can be lost. This is unacceptable for e-commerce.
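One way information gets lost is when several terms in a resource ontology map onto a single term in the shared ontology, so the distinction cannot be recovered afterwards. The sketch below detects such collapses; the mapping and its terms are hypothetical.

```python
# Hypothetical mapping from a resource's terms to a shared ontology's terms.
RESOURCE_TO_SHARED = {
    "hardback": "book",
    "paperback": "book",
    "cd": "music",
}


def lossy_terms(mapping: dict) -> dict:
    """Return shared terms that several resource terms collapse into."""
    inverse: dict = {}
    for source, target in mapping.items():
        inverse.setdefault(target, []).append(source)
    return {t: s for t, s in inverse.items() if len(s) > 1}


print(lossy_terms(RESOURCE_TO_SHARED))  # {'book': ['hardback', 'paperback']}
```

A check like this only flags candidates; deciding whether the loss matters still needs the human intervention mentioned above.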
Issues in Resolving Semantic Heterogeneity
--- Ontologies and Resource Information
• How to choose resources?
• It is necessary for resources to describe themselves: resource ontologies.
• Personal ontologies are important for the system to understand queries exactly.
• Many issues in locating resources:
e.g., users may prefer one resource over another.
Issues in Resolving Semantic Heterogeneity
--- Ontologies and Database Schemas
Schema vs. Ontology
The main difference is in their purpose.
• A schema is developed in order to model some data.
• An ontology is developed to define the meaning of terms.
A resource has a formal ontology, while its data are stored in a database according to a schema. A mapping between the formal ontology and the resource schema is therefore necessary.
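A mapping between an ontology and a schema can be sketched as a lookup from concepts and attributes to tables and columns, which is then used to rewrite a query phrased in ontology terms. The table and column names are hypothetical, and the "?" placeholders assume a driver that uses that parameter style, such as Python's sqlite3.

```python
# Hypothetical mapping: ontology concept/attribute -> table/column.
SCHEMA_MAP = {
    "Customer": {
        "table": "cust_tbl",
        "attributes": {"name": "c_nm", "city": "c_city"},
    },
}


def to_sql(concept, wanted, where):
    """Build a parameterised SELECT from a query phrased in ontology terms."""
    entry = SCHEMA_MAP[concept]
    cols = entry["attributes"]
    # Alias each column back to its ontology name so results use ontology terms.
    select = ", ".join(f"{cols[a]} AS {a}" for a in wanted)
    sql = f"SELECT {select} FROM {entry['table']}"
    if where:
        sql += " WHERE " + " AND ".join(f"{cols[a]} = ?" for a in where)
    return sql, list(where.values())


sql, params = to_sql("Customer", ["name"], {"city": "Ipswich"})
print(sql)     # SELECT c_nm AS name FROM cust_tbl WHERE c_city = ?
print(params)  # ['Ipswich']
```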
Issues in Resolving Semantic Heterogeneity
--- Entity Correspondence
• Many resources may be relevant to a single query.
• Information has to be integrated to answer the query.
• Correspondences must be constructed between entities across resources.
• Key attributes can be used to build these correspondences.
• It is hard to determine whether information from different resources refers to the same entity.
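Building correspondences from a key attribute can be sketched as a merge of records that share the same key value. The record layout and the "isbn" key are hypothetical. The sketch assumes key values are already written identically in both resources, which is exactly the part that is hard in practice.

```python
def integrate(records_a, records_b, key):
    """Merge records from two resources that share the same key value."""
    merged = {r[key]: dict(r) for r in records_a}
    for r in records_b:
        # On conflicting attributes, the second resource's value wins here;
        # a real system would need an explicit conflict policy.
        merged.setdefault(r[key], {}).update(r)
    return list(merged.values())


catalogue = [{"isbn": "123", "title": "Ontologies"}]
stock = [{"isbn": "123", "price": 20.0}, {"isbn": "456", "price": 5.0}]
print(integrate(catalogue, stock, "isbn"))
```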
DOME Overview
• Ontology-based techniques.
• Designed for data reuse and knowledge sharing.
• Retrieve information from multiple resources to answer queries.
• Present results in a consistent way.
DOME (Domain Ontology Management Environment)
The DOME architecture
DOME Overview
--- Engineering client
• Develop and administer a DOME system.
• Extract ontologies (semi-automatically) from legacy systems to define ontologies and mappings.
• Allow engineers to select the best development approach: top-down or bottom-up.
• Let engineers define mappings between resource ontologies and shared ontologies, between resources and shared ontologies, and between database schemas and resource ontologies.
DOME Overview
--- Ontology server
• Store ontologies defined using the engineering client.
• Allow users to access shared ontologies, resource ontologies and application ontologies.
• Access is through an OKBC interface.
• Ontologies are implemented in the description logic CLASSIC, which can store ontologies and perform inference.
DOME Overview
--- User client
• Interface for accessing the system.
• Query the information space.
• Load and browse ontologies.
• Queries and results use the same terminology.
DOME Overview
--- Mapping server
• Store mappings between ontologies.
• Store generic conversion functions.
• Use a declarative syntax.
• Can be queried by the query engine.
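The slides do not show DOME's mapping syntax. As a rough illustration of what declarative mappings combined with generic conversion functions could look like, here is a sketch in which every name (the terms, the conversion names, apply_mappings) is hypothetical.

```python
# Generic conversion functions, registered by name (all hypothetical).
CONVERSIONS = {
    "identity": lambda v: v,
    "pence_to_pounds": lambda v: v / 100,
}

# Declarative mappings: (source term, target term, conversion name).
MAPPINGS = [
    ("cost_pence", "price", "pence_to_pounds"),
    ("title", "name", "identity"),
]


def apply_mappings(record, mappings=MAPPINGS):
    """Rewrite a record into target terms, converting values where needed."""
    out = {}
    for source, target, conv in mappings:
        if source in record:
            out[target] = CONVERSIONS[conv](record[source])
    return out


print(apply_mappings({"cost_pence": 1250, "title": "Ontologies"}))
# {'price': 12.5, 'name': 'Ontologies'}
```

Keeping the mappings as data rather than code is what allows another component, such as a query engine, to look them up.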
DOME Overview
--- Wrappers
• Most interaction between a resource and the DOME network occurs via wrappers.
• Translate queries between DOME and resources.
• Translate information into the terminology of the particular resource.
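A wrapper's translating role can be sketched as below. This is not DOME's wrapper interface, which the slides do not describe; the Wrapper class, its term map and the in-memory rows standing in for a resource are all assumptions.

```python
class Wrapper:
    """Translates between shared-ontology terms and one resource's terms."""

    def __init__(self, term_map, rows):
        self.to_local = term_map  # shared term -> resource term
        self.to_shared = {v: k for k, v in term_map.items()}
        self.rows = rows          # stand-in for the real resource

    def query(self, conditions):
        # Translate the query into the resource's terminology.
        local = {self.to_local[k]: v for k, v in conditions.items()}
        hits = [r for r in self.rows
                if all(r.get(k) == v for k, v in local.items())]
        # Translate each result back into the shared terminology.
        return [{self.to_shared[k]: v for k, v in r.items()} for r in hits]


w = Wrapper(
    {"customer": "cust", "city": "town"},
    [{"cust": "Ann", "town": "Ipswich"}, {"cust": "Bob", "town": "London"}],
)
print(w.query({"city": "Ipswich"}))  # [{'customer': 'Ann', 'city': 'Ipswich'}]
```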
DOME Overview
--- Resource Directory
• Let the system know which resources are available and what they contain.
• Store the directories and descriptions of resources.
DOME Overview
--- Query engine
• Obtain a list of currently available and relevant resources from the resource directory.
• Decompose the query into sub-queries.
• Send the sub-queries to the resources.
• Translate queries from the ontology of the query to that of the relevant resource.
• Integrate results.
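The query engine's steps can be sketched end to end as follows. The directory contents, resource names and data are hypothetical, translation between ontologies is skipped by assuming every resource already answers in shared terms, and "sending" a sub-query is simulated by reading an in-memory list.

```python
# Hypothetical resource directory: resource name -> attributes it can answer.
DIRECTORY = {
    "crm": {"customer", "city"},
    "billing": {"customer", "balance"},
}

# Stand-in data for each resource, already in shared-ontology terms.
DATA = {
    "crm": [{"customer": "Ann", "city": "Ipswich"}],
    "billing": [{"customer": "Ann", "balance": 42}],
}


def decompose(wanted, directory):
    """Split the requested attributes into one sub-query per relevant resource."""
    plan = {}
    for name, offered in directory.items():
        part = wanted & offered
        if part:
            plan[name] = part
    return plan


def answer(wanted, key):
    """Run each sub-query and integrate the rows on a shared key attribute."""
    results = {}
    for name, attrs in decompose(wanted | {key}, DIRECTORY).items():
        for row in DATA[name]:  # simulated "send the sub-query"
            sub = {a: row[a] for a in attrs}
            results.setdefault(row[key], {}).update(sub)
    return list(results.values())


print(answer({"city", "balance"}, "customer"))
```

Here the key attribute doubles as the join condition, which ties this step to the entity correspondence problem described earlier.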
The DOME architecture
DOME Demonstrator
• Based on a database of marketing scenario.
• DOME controls mapping and limits resources.
Conclusions
DOME:
• Solves information queries at the semantic level using formal ontologies and ontology mappings.
• Provides an integrated view of networked heterogeneous databases.
• Allows a user to select and browse definitions of terminologies.
Comments
• General description.
• There are no details or experiments.
• No new technique is introduced.