METAXPath

DC2001 Conference - Tokyo. METAXPath. Curtis Dyreson, E.E. and Computer Science, Washington State University, USA. Michael Böhlen and Christian S. Jensen, Computer Science, Aalborg University, Denmark. Nykredit Center for Database Research, Aalborg University, Denmark.


Transcript of METAXPath

Page 1: METAXPath

DC2001 Conference - Tokyo

METAXPath

Curtis Dyreson
E.E. and Computer Science, Washington State University, USA

Michael Böhlen and Christian S. Jensen
Computer Science, Aalborg University, Denmark

Nykredit Center for Database Research, Aalborg University, Denmark

Page 2: METAXPath

Outline

• Data
  Data model
    XML
  Query language
    XPath

• Metadata
  METAXPath

• Future work

Page 3: METAXPath

An XML Database Architecture

[Figure: a client (HTTP browser), an HTTP server, and a database holding XML data and metadata.]

Page 4: METAXPath

Database Data Model Evolution

60s - Hierarchical data model
70s - Network data model
80s - Relational data model
90s - Object-oriented data model
00s - Unstructured/semistructured/XML

Innovators
  Unstructured data models (UPenn)
  UnQL/Strudel (AT&T)
  OEM and Lore (Stanford)
  XML (W3C)

Page 5: METAXPath

Object Exchange Model (OEM)

• Heterogeneous OODBs
  Exchange objects
  Text description

[Figure: "my database" and "your database" exchange objects (object 1, object 2) as text (XML).]

Page 6: METAXPath

Object Representation in XML

• Use names and values
• Ignore types
• &X denotes object X

    // A person class
    class Person {
        String name;
        int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    // A person object
    Person joe = new Person("Joe Doe", 25);

As attributes:

    <person id="&1" name="Joe Doe" age="25" />

As child elements:

    <person id="&1">
      <name>Joe Doe</name>
      <age>25</age>
    </person>

    <!ATTLIST person id ID #REQUIRED>
    <!ELEMENT person (name, age)>
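The mapping from an object to the element form can be sketched in a few lines. This is only an illustration, not the authors' tooling: the `to_element` helper and the Python `Person` dataclass are hypothetical stand-ins for the Java class on the slide. Note that a real serializer must escape the `&` of the `&1` notation, so the attribute is written as `&amp;1`.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

@dataclass
class Person:
    name: str
    age: int

def to_element(obj, oid):
    """Hypothetical helper: one child element per field, types ignored."""
    elem = ET.Element(type(obj).__name__.lower(), {"id": oid})
    for f in fields(obj):
        child = ET.SubElement(elem, f.name)
        child.text = str(getattr(obj, f.name))  # every value becomes text
    return elem

xml_text = ET.tostring(to_element(Person("Joe Doe", 25), "&1"), encoding="unicode")
print(xml_text)
```

The field names become element names and the values become text, which is the "use names and values, ignore types" rule in practice.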

Page 7: METAXPath

XML (XPath) Data Model

• Each element or attribute is a node
• Edges indicate nesting
• Nodes contain information
• Tree is ordered

XML

    <person id="&1" name="Joe"> <age>25</age> </person>

XPath

[Figure: the XPath tree for this document. Nodes: root; person element; attributes id="&1" and name="Joe"; text node "\n"; age element; text node "25"; text node "\n".]

Page 8: METAXPath

Semistructured Data Model

• Each element or attribute is a node
• Edges indicate nesting
• Edges are labeled

XML

    <person id="&1" name="Joe"> <age>25</age> </person>

Semistructured

[Figure: an edge labeled person leads to node &1; from &1, an edge labeled name leads to Joe and an edge labeled age leads to 25.]

Page 9: METAXPath

Data Models Compared

Semistructured
• Insensitive to text order, whitespace, attributes vs. elements
• Directed graph (many roots, can contain cycles)

XPath
• Captures text order, whitespace, attributes and elements
• A tree (single root, no cycles)

[Figure: the XPath tree and the semistructured graph for the person example, side by side.]

Page 10: METAXPath

Outline

• Data
  Data model
    XML
  Query language
    XPath

• Metadata
  XML - METAXPath

• Future work

Page 11: METAXPath

XPath

• W3C Recommendation (1999)
  Used in XQuery, XSLT, and XPointer
  Language for selecting locations in an XML document

• Query
  Sequence of location steps separated by ‘/’
  Location step
    axis::node_test [predicate1]…[predicateN]
  Evaluated with respect to a context node
  Results in a node-set (actually a list of nodes!)
  Step continues from nodes reached in previous step

Page 12: METAXPath

Descendant Axis Example

[Figure: example document tree illustrating the descendant axis. Nodes: root; comment "This…"; person element; name element; first element (text "Susan"); last element (text "Douglas"); dateOfBirth element; month element (text "January"); year element (text "1981"); attributes initial="S" and SSN="99…".]

Page 13: METAXPath

Axes that Partition a Tree

• Ancestor, descendant, following, preceding, and self partition a tree (ignoring attribute and namespace nodes).

[Figure: a tree divided into five regions labeled ancestor, preceding, self, following, and descendant.]
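The partition property can be checked mechanically. The sketch below is a simplification that only considers element nodes (ElementTree does not expose text or comment nodes as tree nodes); the `partition_axes` helper is hypothetical, and the sample document is modeled loosely on the example tree.

```python
import xml.etree.ElementTree as ET

def partition_axes(root, node):
    """Split the elements of the tree into the five axes relative to `node`."""
    order = list(root.iter())                        # elements in document order
    parent = {child: p for p in order for child in p}

    ancestors = []
    p = parent.get(node)
    while p is not None:
        ancestors.append(p)
        p = parent.get(p)

    descendants = list(node.iter())[1:]              # iter() yields `node` itself first
    i = order.index(node)
    preceding = [n for n in order[:i] if n not in ancestors]
    following = [n for n in order[i + 1:] if n not in descendants]
    return {"self": [node], "ancestor": ancestors, "descendant": descendants,
            "preceding": preceding, "following": following}

doc = ET.fromstring(
    "<person><name><first>Susan</first><last>Douglas</last></name>"
    "<dateOfBirth><month>January</month><year>1981</year></dateOfBirth></person>"
)
parts = partition_axes(doc, doc.find("name/last"))
for axis, nodes in parts.items():
    print(axis, [n.tag for n in nodes])
```

Every element lands in exactly one of the five lists, so the list lengths add up to the number of elements in the document.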

Page 14: METAXPath

XPath Node Test and Predicates

• Each node in result-set must pass node test
  Is this an element node named person?
    person
  Is this an element node?
    *

• Predicates are further tests (about other nodes)
  Does node have an ssn attribute?
    [attribute::ssn]
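As a rough illustration, Python's standard `xml.etree.ElementTree` supports a subset of XPath in abbreviated syntax only, so the predicate `[attribute::ssn]` is written `[@ssn]` here. The sample document and its `people` and `pet` elements are made up for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<people>"
    '<person ssn="234"><name>Ichiro</name></person>'
    "<person><name>Susan</name></person>"
    "<pet><name>Rex</name></pet>"
    "</people>"
)

# Node test by name: child elements named person
named = [e.tag for e in doc.findall("person")]

# Node test *: any child element
any_element = [e.tag for e in doc.findall("*")]

# Predicate: person elements that have an ssn attribute
with_ssn = [e.get("ssn") for e in doc.findall("person[@ssn]")]

print(named, any_element, with_ssn)
```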

Page 15: METAXPath

Example

    /child::person/child::*/child::last

[Figure: the example document tree, followed by the nodes shown at each step of the evaluation: root; person element; name element, "This…" comment, and dateOfBirth element; last element.]

Page 16: METAXPath

XPath Examples

• The dateOfBirth children of person nodes
    /descendant::person/child::dateOfBirth

• The last text node
    /descendant::text()[position()=last()]

Page 17: METAXPath

Abbreviated Syntax

• Think of file path specifications in Unix

• Year child of dateOfBirth
    child::dateOfBirth/child::year
    dateOfBirth/year

• name siblings
    parent::*/child::name
    ../name

• All year nodes
    /descendant-or-self::node()/child::year
    //year
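The abbreviated forms can be tried with Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath and requires `.//` instead of a leading `//` when searching from an element. The `graduation` element is a hypothetical addition, included only so that the two queries return different results.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    "<person>"
    "<name><first>Susan</first><last>Douglas</last></name>"
    "<dateOfBirth><month>January</month><year>1981</year></dateOfBirth>"
    "<graduation><year>2003</year></graduation>"   # hypothetical extra element
    "</person>"
)

# dateOfBirth/year is short for child::dateOfBirth/child::year
birth_years = [y.text for y in person.findall("dateOfBirth/year")]

# .//year reaches every year element below the context node
all_years = [y.text for y in person.findall(".//year")]

print(birth_years, all_years)
```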

Page 18: METAXPath

Outline

• Data
  Data model
    XML
  Query language
    XPath

• Metadata
  XML - METAXPath

• Future work

Page 19: METAXPath

Metadata

• Database metadata
  Schema, security, transaction time (versions)

• Web metadata
  Author, language, subject, privacy

• Web metadata recommendations
  RDF, RDD, P3P

• Features
  Descriptive, but also exclusionary
  Irregular
  Multiple
  Ad-hoc

Page 20: METAXPath

A Movie Database

• Movie data
  Bruce Willis stars in Colour of Night.
  Colour of Night premiered 1/Jul/1995.

• Publication meta-data
  language: English
  URL: http://www.auc.dk
  publication date: 2/Apr/1997
  privacy/security: ‘over 18’
  publication history: v1.2, modified 31/Jul/1998
  subject: Film, Suspense, Thriller
  namespace: http://www.auc.dk/movieDataDTD.xml

Page 21: METAXPath

Movie Database Queries

• Metadata only
  Retrieve information published at Danish web sites.

• Metadata compared to data
  Find reviews published in the first week of the movie’s release.

• Metadata and data, but independent
  Get suspense films starring Bruce Willis.

Page 22: METAXPath

Properties of a Metadata Data Model

• Goal: Same query language for data and metadata
  User learns “one” language
  Compiler/optimization reuse

• Challenges:
  Data and metadata in different dataspaces
    Query on data should not accidentally query metadata
  Meta-metadata
    Metadata for metadata
  Metadata has semantics
  Data with/without metadata

Page 23: METAXPath

METAXPath Data Model

• Data model
  Reuse XPath data model
  Meta attribute points to metadata tree
  “Right angle” data model

• Features
  Minimal extension of XPath
  Backwards-compatible

Page 24: METAXPath

Example

• Data
    <?xml version="1.0"?>
    <person ssn="234">
      <name>Ichiro</name>
    </person>

• URL metadata
    <source URL="www.wsu.edu/p.htm"/>

• Language metadata of person element
    <language>English</language>

• Author meta-metadata - language metadata author
    <author name="Suzuki"/>

Page 25: METAXPath

    <?xml version="1.0"?>
    <person ssn="234">
      <name>Ichiro</name>
    </person>

[Figure: the XPath data model tree for this document. Each node records a Type, a Value, and (for elements) Attributes: a root node; a person element with attributes {(ssn, 234)}; text nodes "\n\t" and "\n"; a name element with attributes {}; and a text node "Ichiro".]

Page 26: METAXPath

    <source URL="www.wsu.edu/p.htm"/>

[Figure: the data tree from the previous page, now with a Meta property on its root node. Meta points to a metadata tree: a root node with a source element whose attributes are {(URL, www.wsu.edu/p.htm)}.]

Page 27: METAXPath

    <language>English</language>

[Figure: the data tree and the source metadata tree from the previous page, plus a Meta property on the person element. It points to a second metadata tree: a root node, a language element with attributes {}, and a text node "English".]

Page 28: METAXPath

    <author name="Suzuki"/>

[Figure: the data tree with the source and language metadata trees from the previous page, plus a Meta property on the language metadata. It points to a meta-metadata tree: a root node with an author element whose attributes are {(name, Suzuki)}.]

Page 29: METAXPath

Sharing and Excluding Metadata

• Meta property points to metadata for a node
  Shared pointers ==> shared metadata

• To share with child
  Copy pointer

• To exclude from child
  Duplicate excluded portion
  Copy remaining shared pointers
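A minimal sketch of the pointer scheme, assuming a toy `Node` class with a `meta` field. The class, the helper names, and the choice to duplicate only the metadata root are all assumptions made for illustration; the slides do not spell out how much of the metadata tree is duplicated.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(eq=False)
class Node:
    type: str                                   # "root", "element", or "text"
    value: str = ""
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None               # pointer to a metadata tree root

def share_with_children(node):
    """Share metadata by copying the pointer, not the metadata tree."""
    for child in node.children:
        child.meta = node.meta
        share_with_children(child)

def exclude_meta(meta_root):
    """Duplicate only the metadata root and drop its own Meta pointer.
    The child pointers are copied, so the subtrees below stay shared."""
    return Node(meta_root.type, meta_root.value, list(meta_root.children), None)

# Meta-metadata: the language metadata was authored by Suzuki
author = Node("root", children=[Node("element", "author")])
language = Node("root", meta=author,
                children=[Node("element", "language", [Node("text", "English")])])

text = Node("text", "Ichiro")
name = Node("element", "name", [text])
person = Node("element", "person", [name], meta=language)

share_with_children(person)            # name and text now share the language tree
text.meta = exclude_meta(language)     # "Ichiro" keeps the language, not the author
```

After the last line, `name.meta.meta` still reaches the author tree while `text.meta.meta` is `None`, and both still see the same language element.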

Page 30: METAXPath

Share metadata with descendants

[Figure: the data tree (root; person element with attributes {(ssn, 234)}; name element; text nodes "\n\t", "\n", and "Ichiro"), in which every node now has a Meta property, together with the source, language, and author metadata trees.]

Page 31: METAXPath

Ichiro text not authored by Suzuki

[Figure: the same data tree and the source, language, and author metadata trees, with a Meta property on every node. One more root node appears than on the previous page, so that the "Ichiro" text node is excluded from the author metadata.]

Page 32: METAXPath

METAXPath Queries

• XPath plus level shift operation
  meta axis
  ^ in abbreviated syntax

• Example - Locate data nodes with URL metadata of p.htm
    /descendant-or-self::*[meta::*/child::source[attribute::URL="p.htm"]]
  In abbreviated syntax
    //*[^source[@URL="p.htm"]]

• Example - Locate the URL metadata
    //*^source/@URL

• Example - Locate data that has metadata authored by Suzuki (meta-metadata)
    //*[^//*^author[@name="Suzuki"]]
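To make the level shift concrete, here is a toy evaluator for the first example query. It is a sketch under stated assumptions, not the METAXPath implementation: the `Node` class is hypothetical, and the `meta` step is assumed to move from a data node to the root of its metadata tree, whose `source` children are then tested.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(eq=False)
class Node:
    type: str                                   # "root", "element", or "text"
    value: str = ""
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None               # assumed: points at a metadata root

def descendant_or_self(node):
    yield node
    for child in node.children:
        yield from descendant_or_self(child)

def has_source(node, url):
    """Predicate [^source[@URL=url]]: shift to the metadata level, test source."""
    if node.meta is None:
        return False
    return any(c.type == "element" and c.value == "source" and c.attrs.get("URL") == url
               for c in node.meta.children)

def query(root, url):
    """//*[^source[@URL=url]], restricted to element nodes because of the * test."""
    return [n for n in descendant_or_self(root)
            if n.type == "element" and has_source(n, url)]

source_tree = Node("root", children=[Node("element", "source", {"URL": "www.wsu.edu/p.htm"})])
name = Node("element", "name", children=[Node("text", "Ichiro")])
person = Node("element", "person", {"ssn": "234"}, [name], meta=source_tree)
root = Node("root", children=[person])

print([n.value for n in query(root, "www.wsu.edu/p.htm")])
```

A plain data query never follows `meta`, which is how this layout keeps queries on data from accidentally touching metadata.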

Page 33: METAXPath

Outline

• Data
  Data model
    XML
  Query language
    XPath

• Metadata
  XML - METAXPath

• Future work

Page 34: METAXPath

Metadata Semantics

• Transaction time example

[Figure: a graph with nodes &1, &2, &3 and the values "Color of Night" and "Colour of Night". Edge labels: (name: reviewed, trans. time: [1/Sep/1999 - uc]); (name: movie); (name: title, trans. time: [2/Apr/1997 - 31/Jul/1998]); (name: title, trans. time: [1/Aug/1998 - uc]). One route through the graph is marked "Not a path!".]

Page 35: METAXPath

AUCQL Collapse Example

• Property collapse for name is concatenation, for trans. time it is temporal intersection.

[Figure: the graph from the transaction time example with two collapsed paths: (name: reviewed.movie.title, trans. time: [1/Sep/1999 - uc]) and (name: reviewed.movie.title, trans. time: undefined).]
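The two collapse rules can be sketched as follows. This is an illustration rather than AUCQL itself: "uc" is modeled as the largest representable date, the movie edge (which shows no transaction time) is assumed to be always valid, and `None` stands in for "undefined".

```python
from datetime import date

UC = date.max                 # assumption: "uc" modeled as the largest date
ALWAYS = (date.min, UC)       # assumption: an edge without a trans. time is always valid

def intersect(a, b):
    """Temporal intersection of two closed intervals, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None

def collapse(path):
    """Collapse a path of (name, interval) edges into a single edge."""
    name = ".".join(label for label, _ in path)       # names are concatenated
    time = ALWAYS
    for _, interval in path:
        time = intersect(time, interval) if time is not None else None
    return name, time

reviewed = ("reviewed", (date(1999, 9, 1), UC))
movie = ("movie", ALWAYS)
title_new = ("title", (date(1998, 8, 1), UC))
title_old = ("title", (date(1997, 4, 2), date(1998, 7, 31)))

current = collapse([reviewed, movie, title_new])
stale = collapse([reviewed, movie, title_old])
print(current, stale)
```

The path through the older title has an empty intersection, which corresponds to the "undefined" transaction time in the example.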

Page 36: METAXPath

AUCQL Additional Operations

• Coalesce - compute a distributed property value

[Figure: two edges between &1 and &2: (name: review, security! developer, trans. time: [1/Jul/1999 - 15/Jul/1999]) and (name: review, security! subscriber, trans. time: [16/Jul/1999 - uc]). Coalesced value: trans. time: [1/Jul/1999 - uc].]
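One plausible reading of the coalesce example is the usual temporal coalescing of adjacent or overlapping intervals. The sketch below assumes day granularity and models "uc" as the largest representable date; the `coalesce` function is hypothetical and is not taken from AUCQL.

```python
from datetime import date, timedelta

UC = date.max                 # assumption: "uc" modeled as the largest date
ONE_DAY = timedelta(days=1)   # assumption: day granularity

def coalesce(intervals):
    """Merge closed intervals that overlap or are adjacent."""
    merged = []
    for start, end in sorted(intervals):
        # Check for UC first so that UC + ONE_DAY is never computed
        if merged and (merged[-1][1] == UC or start <= merged[-1][1] + ONE_DAY):
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

developer = (date(1999, 7, 1), date(1999, 7, 15))
subscriber = (date(1999, 7, 16), UC)
whole = coalesce([subscriber, developer])
print(whole)
```

With a gap between the two intervals the function would return both unchanged, so the single merged interval depends on 15/Jul and 16/Jul being adjacent days.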

Page 37: METAXPath

Thin Layer Implementation

[Figure: pipeline with the components METAXPath query, METAXPath Compiler, Metadata encoding, XPath query, XPath Compiler, DB, and result.]

Page 38: METAXPath

Prototype Implementation

[Figure: prototype architecture with the components METAXPath query, METAXPath Compiler, Evaluation Tree, Query Evaluation Engine, Database API, DBM, and result, plus XML and RDF input, an XML Parser, and Indexing. Two components are labeled Perl.]

Page 39: METAXPath

Summary

• METAXPath website
  http://www.eecs.wsu.edu/~cdyreson/pub/MetaXPath

• AUCQL website
  VLDB ‘99
  Implemented research prototype
  Free, downloadable, Unix environment
  http://www.eecs.wsu.edu/~cdyreson/pub/AUCQL
  Interactive query engine
  Tutorials