On Languages for the Specification of Integrity Constraints in Spatial Conceptual Models
Mehrdad Salehi, Yvan Bédard, Mir Abolfazl Mostafavi, Jean Brodeur
Center for Research in Geomatics (CRG), Department of Geomatics Sciences, Laval University, Canada
ER 2007 Workshop on Semantics and Conceptual Issues in GIS (SeCoGIS)
Presentation Plan
- The role of spatial integrity constraints in spatial data quality
- Definition of spatial integrity constraints
- Classification of languages for the specification of spatial integrity constraints at the conceptual level, with examples
  - Natural language
  - Visual language
  - First-order logic language
  - Hybrid language
- Comparison of languages
- Conclusion and on-going/future work
The Role of SIC in Spatial Data Quality
Spatial Data Quality
- Internal Data Quality
  - Completeness
  - Positional Accuracy
  - Temporal Accuracy
  - Thematic Accuracy
  - Logical Consistency
- External Data Quality: deals with fitness for use of the data
SIC carry semantic information of a database application and are used to preserve logical consistency in spatial databases.
Spatial Integrity Constraints
A spatial IC (SIC) defines mandatory, allowed, and unacceptable spatial relationships and values, sometimes in relation to other specific attribute values, geometric feature shapes, or specific relationships, or for given areas of validity.
Simple examples of SIC:
- Topological IC: based on topological properties and relationships
  - Each building must be represented by a closed area.
  - Two buildings do not overlap.
- Metric IC: based on metric properties and relationships
  - The area of a house must be more than 100 square meters.
  - The distance between a school and a gas station must be more than 30 meters.
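As a rough illustration of how such topological and metric IC could be checked, here is a minimal Python sketch. It assumes footprints are simplified to axis-aligned rectangles; the `Rect` class and all function names are hypothetical and are not part of any language discussed in this presentation.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: a deliberately simplified stand-in for a footprint."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def area(self) -> float:
        return (self.xmax - self.xmin) * (self.ymax - self.ymin)


def overlaps(a: Rect, b: Rect) -> bool:
    """True when the interiors of a and b share at least one point."""
    return a.xmin < b.xmax and b.xmin < a.xmax and a.ymin < b.ymax and b.ymin < a.ymax


def distance(a: Rect, b: Rect) -> float:
    """Shortest distance between two rectangles (0 if they touch or overlap)."""
    dx = max(a.xmin - b.xmax, b.xmin - a.xmax, 0.0)
    dy = max(a.ymin - b.ymax, b.ymin - a.ymax, 0.0)
    return hypot(dx, dy)


def buildings_do_not_overlap(b1: Rect, b2: Rect) -> bool:
    """Topological IC: two buildings do not overlap."""
    return not overlaps(b1, b2)


def house_area_ok(house: Rect, min_area: float = 100.0) -> bool:
    """Metric IC: the area of a house must be more than 100 square meters."""
    return house.area() > min_area


def school_station_distance_ok(school: Rect, station: Rect, min_distance: float = 30.0) -> bool:
    """Metric IC: a school and a gas station must be more than 30 meters apart."""
    return distance(school, station) > min_distance
```

In a real spatial database these checks would normally be delegated to the geometry engine; the sketch only shows that each example IC reduces to a boolean predicate over geometries.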
Integrity constraints (IC) are assertions that restrict the data values that may appear in the database to prevent insertions of incorrect data.
Database Design Process
- Traditional database design process: Conceptual Model, Implementation Model
- OMG's MDA approach: CIM, PIM, PSM
  - CIM (Computation Independent Model): an end-user's view of the data, independent of the implementation
  - PIM (Platform Independent Model): a developer's view of the data on a family of platforms (e.g., OLAP servers)
  - PSM (Platform Specific Model): a developer's view of the data for a specific software package (e.g., Oracle)
Definition of SIC
- SIC convey essential semantic information of database applications.
- It is necessary to define SIC at all levels of a spatial database design process.
- Each database design level requires its own SIC specification language (spatial ICSL):
  - Conceptual level: SIC must first be defined in a language understandable to database users.
  - Implementation level: SIC are then translated into a DDL or a programming language to be understandable to a computer.
- This presentation focuses on the spatial ICSL at the conceptual level.
Example:
- SIC at the Conceptual Model: Road disjoint Building
- SIC at the Implementation Model: Disjoint (Road.geometry, Building.geometry)
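The step from the conceptual statement to the implementation-level expression can be sketched as a small translation function. This is only an illustration: it assumes the conceptual SIC always follows the three-word pattern "Class1 relation Class2", and the function name `to_implementation` is hypothetical.

```python
def to_implementation(conceptual: str) -> str:
    """Translate '<Class1> <relation> <Class2>' into
    '<Relation> (<Class1>.geometry, <Class2>.geometry)'.

    Assumption: the conceptual SIC is exactly three whitespace-separated words.
    """
    parts = conceptual.split()
    if len(parts) != 3:
        raise ValueError(f"expected '<Class1> <relation> <Class2>', got: {conceptual!r}")
    left, relation, right = parts
    return f"{relation.capitalize()} ({left}.geometry, {right}.geometry)"


print(to_implementation("Road disjoint Building"))
```

The point of the sketch is that this translation is mechanical and done once, whereas the conceptual form is what database users read and write.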
Classification of the Spatial ICSL at the Conceptual Level
We categorize the existing spatial ICSL at the conceptual level into:
- Natural languages
  - Free natural languages
  - Controlled natural languages
- Visual languages
- First-order logic languages
- Hybrid languages
  - Visual hybrid languages
  - Natural hybrid languages
Natural Languages
- People use natural languages for their daily communications.
- They are the easiest languages for a database client to express SIC.

1. Free Natural Languages:
- are natural languages without additional limits on their syntax and semantics
- support a rich vocabulary
- are sometimes ambiguous or used too loosely
  - Several words may bear the same semantics
  - Words may have several meanings depending on the context
  - Loose usage of restrictive terms (and, or, must, can, ...)

2. Controlled Natural Languages:
- are subsets of natural languages whose syntax and semantics are restricted
- are proposed to overcome the ambiguity of free natural languages
Natural Languages
Examples of controlled natural languages:
- Ubeda and Egenhofer approach (1997): (Entity Class1, Topological Relation, Entity Class2, Quantifier)
  - Topological Relation: topological relationships of the extended 9I model (e.g., inside, cross)
  - Quantifier: forbidden, at least n times, at most n times, exactly n times
  - Example: (Road, Cross, Building, Forbidden)
- Vallieres et al. approach (2006): Objects Class1 + Topological Relation + Objects Class2 + [-,-]
  - Topological Relation: 8 topological relations extended by three notions: tangent, border, strict
  - [-,-]: Cardinality
  - Example: Road Segment Touch-Tangent Road Segment [1,2]
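A quadruple of the form (Entity Class1, Topological Relation, Entity Class2, Quantifier) can be read as a small data structure plus a counting rule. The Python sketch below is one possible reading, not the authors' implementation. It assumes the quantifier is evaluated per object of the first class, and it uses a toy relation on 1-D intervals in place of a real topological operator. All names (`Quantifier`, `violations`, `intervals_cross`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Interval = Tuple[float, float]  # toy 1-D "geometry": (start, end)


@dataclass(frozen=True)
class Quantifier:
    kind: str   # "forbidden", "at_least", "at_most", or "exactly"
    n: int = 0  # ignored when kind == "forbidden"

    def accepts(self, count: int) -> bool:
        """True when `count` related objects satisfy this quantifier."""
        if self.kind == "forbidden":
            return count == 0
        if self.kind == "at_least":
            return count >= self.n
        if self.kind == "at_most":
            return count <= self.n
        if self.kind == "exactly":
            return count == self.n
        raise ValueError(f"unknown quantifier kind: {self.kind}")


def intervals_cross(a: Interval, b: Interval) -> bool:
    """Toy stand-in for a topological relation: the interiors of a and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


def violations(
    class1: Sequence[Interval],
    class2: Sequence[Interval],
    relation: Callable[[Interval, Interval], bool],
    quantifier: Quantifier,
) -> List[Interval]:
    """Objects of class 1 whose number of related class-2 objects breaks the quantifier."""
    bad = []
    for a in class1:
        count = sum(1 for b in class2 if relation(a, b))
        if not quantifier.accepts(count):
            bad.append(a)
    return bad


# (Road, Cross, Building, Forbidden) on toy data
roads = [(0.0, 10.0), (20.0, 30.0)]
buildings = [(5.0, 7.0), (40.0, 50.0)]
print(violations(roads, buildings, intervals_cross, Quantifier("forbidden")))
```

A cardinality such as [1,2] in the second approach could be expressed in the same way by combining an "at least 1" and an "at most 2" check.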
Visual Languages
- Employ graphical and image notations
- Database end-users must
  - learn the semantics of every visual construct
  - understand very well the very specific context of its usage
- Several ambiguities and unintended meanings can emerge
Example of a visual spatial ICSL: Pizano et al. (1989)
- In this language, pictures termed "constraint pictures" show unacceptable database states.
- Example constraint picture: cars and people cannot be inside a crosswalk simultaneously.
First-Order Logic Language
- Supports precise semantics and syntax
- However, using and understanding this language requires a mathematical background
- Database end-users do not necessarily have a mathematical background

Example of FOL for expressing SIC: Hadzilacos and Tryfona (1992)
The syntax of this language is structured as:
- Atomic topological formulae consisting of:
  - Binary topological relations between objects
  - Geometric operators over objects
  - Comparisons between attributes of objects
- Negation, conjunction, disjunction, and universal and existential quantifications
Example of SIC: "A Road and a Building are disjoint".
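In generic first-order notation, this constraint could be written roughly as follows. The predicate names Road, Building, and disjoint are illustrative assumptions, and the formula is not necessarily in the exact syntax of Hadzilacos and Tryfona.

```latex
\forall r \, \forall b \;
  \bigl( \mathit{Road}(r) \land \mathit{Building}(b)
         \rightarrow \mathit{disjoint}(r, b) \bigr)
```

The formula combines a binary topological relation (disjoint) with universal quantification and a logical connective, which are among the building blocks listed for this language.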
Hybrid Languages
- Are not purely natural, visual, or logical; instead, they are a combination of these
- Depending on the dominant part of a language, they are:

1. Visual hybrid languages
- The main part consists of visual symbols
- Visual constructs are enriched by a limited number of natural language descriptions

2. Natural hybrid languages
- The dominant part is a natural language
- Complementary components are visual pictograms or symbols
Hybrid Languages
1. Example of a visual hybrid language
- There is no visual hybrid spatial ICSL.
- However, spatio-temporal conceptual modeling languages (e.g., Perceptory)
  - specify a number of SIC in the conceptual schema
    - this contradicts the conciseness rule of conceptual schemas
  - are mostly limited to constraints on spatial relations
  - leave the remaining SIC to be defined by a specific spatial ICSL
Example: A Roundabout is crossed by at least one Route
Hybrid Languages
2. Natural hybrid languages
2.1. Example of a natural hybrid language with pictograms: Normand (1999)
- A language for defining SIC in the data dictionary; it includes:
  - three pictograms for point, line, and polygon
  - topological relations based on the ISO 9I model
- Defines topological and metric IC on the relationships between objects
- Supports multiple geometries
- Expresses a complex SIC for an object in tabular form
Hybrid Languages
2. Natural hybrid languages
2.2. Example of a natural hybrid language with symbols: Spatial OCL (Kang et al. 2004)
- Extends OCL, the ICSL that accompanies UML, by adding:
  - basic geometric primitives (e.g., point) to the OCL meta-model
  - 9I topological relations (e.g., overlap) to the OCL operators
- Specifies topological IC
Example: "A Building is disjoint from a Road":

context Building inv:
  Road.allInstances()->forAll(R | R.geometry->Disjoint(self.geometry))

Is it really a natural hybrid language?
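For readers less familiar with OCL, the invariant above can be paraphrased in Python as a universally quantified check over all Road instances. This is a sketch under simplifying assumptions: geometries are modelled as sets of occupied grid cells, and `Feature`, `disjoint`, and `building_invariant` are hypothetical names, not part of Spatial OCL.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

Cell = Tuple[int, int]


@dataclass(frozen=True)
class Feature:
    name: str
    geometry: FrozenSet[Cell]  # toy geometry: the set of grid cells the feature occupies


def disjoint(g1: FrozenSet[Cell], g2: FrozenSet[Cell]) -> bool:
    """Toy counterpart of the Disjoint operator: no shared cell."""
    return g1.isdisjoint(g2)


def building_invariant(building: Feature, all_roads: Iterable[Feature]) -> bool:
    """Mirrors: Road.allInstances()->forAll(R | R.geometry->Disjoint(self.geometry))."""
    return all(disjoint(road.geometry, building.geometry) for road in all_roads)


roads = [Feature("r1", frozenset({(0, 0), (1, 0), (2, 0)}))]
ok_building = Feature("b1", frozenset({(0, 2), (1, 2)}))
bad_building = Feature("b2", frozenset({(2, 0), (2, 1)}))
print(building_invariant(ok_building, roads), building_invariant(bad_building, roads))
```

As with forAll in OCL, the check holds vacuously when there are no Road instances.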
Comparing Spatial ICSL
Why compare spatial ICSL?
- We are not aiming to find the best spatial ICSL (if such a thing is possible!).
- We are revisiting our past practices (i.e., is a natural hybrid language still the best ICSL for the conceptual level?).
- Our goal is to summarize the potential avenues for developing ICSLs for spatio-temporal databases and spatial datacubes.
Comparison Criteria
1. Expressiveness:
- Semantic quality: correspondence between the meaning of the IC and the concepts supported by a spatial ICSL
- Syntactic quality: degree to which the rules of the spatial ICSL govern the structure of expressions
- Richness: capability to express the needed elements of SIC
- Inherence: precision of an ICSL, i.e., how directly it gets to the point and focuses on the essential aspects of SIC
2. Pragmatics:
- Usability of the spatial ICSL by database end-users
- Facility to translate spatial IC into technical languages
In our context, the former pragmatic quality has priority over the latter.
Comparing Spatial ICSL
- Three values (Good, Medium, and Weak) are used to rank the languages.
- The values represent our opinion, based on a literature study and 20 years of experience in spatial database modeling and development.
- Natural is the way to go for the conceptual level.
Conclusions
- Spatial IC convey important semantic information of applications.
- They must first be defined at the conceptual level for database end-users.
- We presented a classification of spatial ICSL at the conceptual level: natural, visual, first-order logic, and hybrid languages.
- In our opinion, controlled natural languages and natural hybrid languages with pictograms are good candidates.
On-Going and Future Work
- We are currently working on a classification of IC in spatio-temporal database applications.
- This classification provides the basic constructs to build an ICSL for spatio-temporal databases and spatial datacubes.
- We will build an ICSL for spatial datacubes based on:
  - the results of the classification of IC
  - spatial datacube vocabulary (e.g., Dimension and Measure)
  - the candidate languages resulting from the current research work
* The relationship between spatial data quality and SIC: IC are defined to improve internal data quality, more precisely logical consistency.
* What is an IC? What is a SIC? Different examples of SIC.
* Here we explain what a conceptual model and an implementation model mean for us by comparing them with regard to the MDA approach.
* The goal of this slide is to show that SIC can be defined at the different levels of database design (introduced in the previous slide). However, each level requires a different language, aimed at different target users.
* A global view of the languages that exist for the definition of IC in spatial conceptual models.
* There are two types of natural languages. Here we introduce them and give their characteristics.
* We present two research works that employ controlled natural languages for representing SIC.
* The characteristics of visual languages are given. An example of a visual IC specification language is presented.
The aim of visual languages is to create a language that is easy to use and perceive. However, representing all the information of SIC with a visual language requires end-users to learn the meaning of every visual construct of the language and to understand very well the very specific context of its usage, which takes considerable effort. In spite of such efforts, several ambiguities and unintended meanings can emerge from such representations and lead to false integrity constraints (e.g., "cars cannot pass people on their left"; this false IC is also context-sensitive, versus "cars cannot pass people on their right" in British-influenced countries, where the driver's seat is on the right).
* The characteristics of FOL are described. An example of a spatial ICSL is given.
* Here we define what hybrid languages are. Then we divide them into two categories, visual hybrid languages and natural hybrid languages, and define each of them. Natural hybrid languages are themselves of two categories.
* There is no visual hybrid spatial ICSL. However, modeling languages (which are mostly visual hybrid) can express a limited number of SIC. An example of a visual hybrid language (Perceptory) is given.
* An example of a natural hybrid spatial ICSL is given.
* Spatial OCL is an example of a natural hybrid language for representing SIC. They have used the OCL of UML for defining SIC by including geometric primitives and topological relations. This work has been done by the research group at Clermont-Ferrand.
* We have to clarify why we would like to compare the languages: our goal is to distinguish the potential ways of developing an ICSL for spatio-temporal databases as well as spatial datacubes. It does not seem meaningful to introduce a single best language.
To compare the languages, we used two main criteria (Expressiveness and Pragmatics).
For Pragmatics, we considered the usability of the language by database end-users rather than the facility to translate the language. The reason is that it is really challenging to teach programming code to each database end-user. Moreover, these persons may be replaced by others, and we would then have to teach the programming language again to the new users. The translation into technical languages, however, is done only once.
* Here we compare the languages based on the two criteria introduced in the previous slide.
* We explain why we categorized the languages and compared them: to have a candidate language for expressing IC in spatial datacubes.
The classification of IC that we are working on will be integrated with this research to finally propose an ICSL for spatial datacubes.
You can say: the IC in spatio-temporal databases (spatial datacubes can also be considered spatio-temporal) are complex. This complexity emerges particularly when one attempts to define them formally. A classification of the different types of IC in these databases would facilitate the task of defining them. We have built this classification based on the concepts that appear in their definitions, in order to capture the constructs needed to build an ICSL for spatial datacubes and spatio-temporal databases. Examples: spatial-only, temporal-only, spatio-temporal-only, thematic, spatial, temporal, spatio-temporal. Mehrdad says: these are different classes of IC.