Querying along XLinks in XPath/XQuery: Situation, Applications, Perspectives

Erik Behrends, Oliver Fritzen, Wolfgang May
Institut für Informatik, Universität Göttingen, Germany
{behrends|fritzen|may}@informatik.uni-goettingen.de

QLQP - Query Languages and Query Processing, München, 31.3.2006




Situation

- focus on the database aspect of XML
- autonomous XML sources on the Web
- each source provides data + external schema
- links between the sources

Different Scenarios

- own sources reference other sources
- other sources use my data
- what is the external schema?
- queries against the own document must be "forwarded"
- data restructuring: distribute data over several instances
- stay conformant with the same external schema
- transparent for the users


W3C XLink & XPointer

XLink: language for defining links between XML documents
- arbitrary elements can serve as links
- arbitrarily many sources can be connected (extended links)
- arbitrary parts (XML fragments) can be referenced by XPointers: url#xpointer(xpath-expr)

A simple XLink (we do not consider extended links here):

    <linkelement xlink:type="simple"
                 xlink:href="url#xpointer(xpath-expr)">
      contents
    </linkelement>

XLink and browsing: covered by predefined xlink attributes (e.g., xlink:show, xlink:actuate).

- Data model for documents connected by XLinks?
- How to process XLinks during querying?
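As a small illustration of the url#xpointer(xpath-expr) form, the following Python sketch splits an href into its document URL and the embedded XPath expression. The names XLinkTarget and split_xlink_href are hypothetical and not part of XLink or XPointer; only the xpointer(...) scheme used on this slide is handled.

```python
from typing import NamedTuple, Optional


class XLinkTarget(NamedTuple):
    url: str               # document part before '#'
    xpath: Optional[str]   # expression inside xpointer(...), or None


def split_xlink_href(href: str) -> XLinkTarget:
    """Split 'url#xpointer(xpath-expr)' into its URL and XPath parts."""
    url, sep, fragment = href.partition("#")
    if not sep:
        return XLinkTarget(url, None)        # plain URL: whole document
    prefix, suffix = "xpointer(", ")"
    if fragment.startswith(prefix) and fragment.endswith(suffix):
        # Strip only the outer wrapper; the expression itself may contain
        # parentheses, e.g. id('D').
        return XLinkTarget(url, fragment[len(prefix):-len(suffix)])
    raise ValueError(f"unsupported fragment: {fragment!r}")


target = split_xlink_href(
    "http://bar.org/cities-B.xml#xpointer(//city[name='Brussels'])")
print(target.url)    # http://bar.org/cities-B.xml
print(target.xpath)  # //city[name='Brussels']
```

Stripping the wrapper from both ends, rather than cutting at the first closing parenthesis, keeps expressions such as id('D') intact.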


Simple Links – Example (MONDIAL Database)

    <!-- http://foo.com/countries.xml -->
    <countries>
      <country car_code="B" area="30510">
        <name>Belgium</name>
        <population>10170241</population>
        <capital xlink:type="simple" xlink:href=
          "http://bar.org/cities-B.xml#xpointer(//city[name='Brussels'])"/>
        <cities xlink:type="simple" xlink:href=
          "http://bar.org/cities-B.xml#xpointer(//city)"/>
        ...
      </country>
      ...
    </countries>

    <!-- http://bar.org/cities-B.xml -->
    <cities>
      <city>
        <name>Brussels</name>
        <population>951580</population>
        ...
      </city>
      <city>
        <name>Antwerp</name>
        <population>459072</population>
        ...
      </city>
      ...
    </cities>


Simple Links

Similar to the HTML <a href="..."> construct.

Capitals of countries:

    <!ELEMENT country (... capital ...)>
    <!ELEMENT capital EMPTY>
    <!ATTLIST capital
      xlink:type (simple|extended|locator|arc) #FIXED "simple"
      xlink:href CDATA #REQUIRED>

    <country code="B">
      <capital xlink:href="file:cities-B.xml#xpointer(//city[name='Brussels'])"/>
      ...
    </country>

query: //country[@code="B"]/capital/??/population

[Figure: the link element references url#xpath-expr, which selects nodes in the document at url.]


Querying along Links

W3C XML Query (XQuery) Requirements (2001):
"the XML Query Data Model MUST include support for references, including both references within an XML document and references from one XML document to another".

XPointer and XLink:
- specify how to express versatile inter-document links in XML
- the XLink specification is tailored to browsing, not to querying

There is not yet an official W3C proposal...
- ...how to add link semantics to the actual data model (e.g., the XML Query Data Model)
- ...how to express/process queries through links (note: there is no way in XQuery, not even with user-defined functions!)
- ...for evaluation strategies.


Querying along Links

Restricted "shorthand" pointers based on ID attributes
(similar to anchors in HTML: href="http://www.foo.com#id")

    (: adapted from [Lehner, Schoening: XQuery] :)
    (: XPointer of the form http://.../country.xml#xpointer(id('D')) :)

    declare namespace fu = "http://www.example.org/functions";

    declare function fu:follow-xlink($href as xs:string) as item()*
    {
      let $docValue := fn:substring-before($href, "#")
      let $x := fn:substring-after($href, "#xpointer(id('")
      let $idValue := fn:substring-before($x, "'))")
      return fn:doc($docValue)/fn:id($idValue)
      (: evaluates fn:doc("http://.../country.xml")/fn:id('D') :)
    };

- reduces XPointer evaluation to ID evaluation
- does not cover the general case


Querying along Links

General case (XPointer contains any XPath expression):

    (: adapted from [Lehner, Schoening: XQuery] :)
    (: XPointer of the form http://.../country.xml#xpointer(//country[@code='D']) :)

    declare namespace fu = "http://www.example.org/functions";

    declare function fu:follow-xlink($href as xs:string) as item()*
    {
      let $docValue := fn:substring-before($href, "#")
      let $x := fn:substring-after($href, "#xpointer(")
      let $path := fn:substring-before($x, ")")
      return fn:doc($docValue)/$path
      (: should evaluate fn:doc("http://.../country.xml")//country[@code='D'] :)
    };

Syntax not allowed: XQuery cannot evaluate the string in $path as a path expression.


Querying along Links

With the Saxon extension function saxon:evaluate():

    (: adapted from [Lehner, Schoening: XQuery] :)
    (: XPointer of the form http://.../country.xml#xpointer(//country[@code='D']) :)

    declare namespace fu = "http://www.example.org/functions";
    declare namespace saxon = "http://saxon.sf.net/";
    declare namespace xlink = "http://www.w3.org/1999/xlink";

    declare function fu:follow-xlink($href as xs:string) as item()*
    {
      let $docValue := fn:substring-before($href, "#")
      let $x := fn:substring-after($href, "#xpointer(")
      let $path := fn:substring-before($x, ")")
      return fn:doc($docValue)/saxon:evaluate($path)
      (: evaluates fn:doc("http://.../country.xml")/saxon:evaluate('//country[...]') :)
    };

Query:

    doc("http://.../countries.xml")//country[@car_code="B"]/
      capital/fu:follow-xlink(@xlink:href)/population
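For readers without an XQuery processor at hand, the same explicit dereferencing step can be sketched in Python with the standard library. This is an analogy under stated assumptions, not the authors' implementation: the DOCUMENTS dictionary is a hypothetical stand-in for fetching documents from the Web, follow_xlink mirrors fu:follow-xlink only for simple xpointer(...) fragments, and ElementTree supports just a subset of XPath.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical in-memory stand-in for the two MONDIAL documents on the Web.
DOCUMENTS = {
    "http://foo.com/countries.xml": """
      <countries xmlns:xlink="http://www.w3.org/1999/xlink">
        <country car_code="B">
          <name>Belgium</name>
          <capital xlink:type="simple"
            xlink:href="http://bar.org/cities-B.xml#xpointer(//city[name='Brussels'])"/>
        </country>
      </countries>""",
    "http://bar.org/cities-B.xml": """
      <cities>
        <city><name>Brussels</name><population>951580</population></city>
        <city><name>Antwerp</name><population>459072</population></city>
      </cities>""",
}


def load(url):
    """Parse the document stored under url and return its root element."""
    return ET.fromstring(DOCUMENTS[url])


def follow_xlink(href):
    """Return the elements addressed by 'url#xpointer(xpath-expr)'."""
    url, _, fragment = href.partition("#")
    path = fragment[len("xpointer("):-1]
    # ElementTree needs a relative path, so '//city' becomes './/city'.
    return load(url).findall("." + path)


countries = load("http://foo.com/countries.xml")
populations = [
    city.findtext("population")
    for capital in countries.findall(".//country[@car_code='B']/capital")
    for city in follow_xlink(capital.get(XLINK_HREF))
]
print(populations)  # ['951580']
```

As in the XQuery version, the query author has to know that capital is a link and call the dereferencing function explicitly.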


Querying Linked Documents

One possible solution: extend the abstract XML data model with a linking construct
- an additional explicit navigation operator is required (like dereferencing IDREF attributes with the id() function)

    doc("http://.../countries.xml")//country[@car_code="B"]/
      capital/fu:follow-xlink(@xlink:href)/population

⇒ Here, the user has to be aware of the XLinks:
- queries have to respect XLinks
- XLinks are not part of the application data


Transparent Links

Regard link elements as transparent:

    <country code="B">
      <capital xlink:type="simple"
               href="file:cities-B.xml#//city[name='Brussels']"
               attributes of Brussels>
        contents of Brussels
      </capital>
      ...
    </country>

query: //country[@code="B"]/capital/population


Transparent Links

Our abstract data model makes the links transparent:
- each link can be seen as a view definition
- integrates remote schemata
- embedded views implicitly become subtrees, while XLinks are resolved silently
- can be queried by standard XPath expressions:

    doc("http://.../countries.xml")//country[@car_code="B"]/
      capital/population

⇒ We introduced a namespace dbxlink with additional directives:
- specification of modeling, evaluation and caching modes
- no changes needed in XPath/XQuery and XLink


Modeling Switches = Integration Mapping

(a) Mapping of the target
(b) Mapping of the XLink element and adding the result

    { keep-body, drop-element, group-in-element, duplicate-element, make-attribute }
      × { insert-nodes, insert-bodies }

    <linkelement xlink:href="xpointer" attributes
                 dbxlink:transparent="left-hand-directive right-hand-directive">
      content
    </linkelement>
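To make the format of the dbxlink:transparent attribute concrete, here is a minimal Python sketch that validates a value against the two directive sets listed on this slide. The function name parse_transparent is hypothetical. The slide's × notation suggests that every left-hand directive can be paired with every right-hand one; whether each pairing is meaningful in practice is not stated here, so that part is an assumption.

```python
from itertools import product

# Directive names as listed on the slide.
LEFT_DIRECTIVES = ("keep-body", "drop-element", "group-in-element",
                   "duplicate-element", "make-attribute")
RIGHT_DIRECTIVES = ("insert-nodes", "insert-bodies")


def parse_transparent(value):
    """Split a dbxlink:transparent value into (left, right) directives."""
    parts = value.split()
    if len(parts) != 2:
        raise ValueError(f"expected two directives, got {value!r}")
    left, right = parts
    if left not in LEFT_DIRECTIVES or right not in RIGHT_DIRECTIVES:
        raise ValueError(f"unknown directive in {value!r}")
    return left, right


# Assumption: the cross product enumerates all admissible pairs.
ALL_COMBINATIONS = list(product(LEFT_DIRECTIVES, RIGHT_DIRECTIVES))

print(parse_transparent("keep-body insert-bodies"))
print(len(ALL_COMBINATIONS))
```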


Modeling – Example (MONDIAL Database)

    <!-- http://foo.com/countries.xml -->
    <countries>
      <country car_code="B" area="30510">
        <name>Belgium</name>
        <capital xlink:type="simple" xlink:href=
          "http://bar.org/cities-B.xml#xpointer(//city[name='Brussels'])"
          dbxlink:transparent="keep-body insert-bodies"/>
        <cities xlink:type="simple" xlink:href=
          "http://bar.org/cities-B.xml#xpointer(//city)"
          dbxlink:transparent="drop-element insert-nodes"/>
      </country>
    </countries>

    <!-- bar.org/cities-B.xml -->
    <cities>
      <city>
        <name>Brussels</name>
        <population>951580</population>
        ...
      </city>
      <city>
        <name>Antwerp</name>
        <population>459072</population>
        ...
      </city>
      ...
    </cities>


Modeling – Example (MONDIAL Database)

    <country car_code="B">
      <name>Belgium</name>
      <capital>
        <name>Brussels</name>
        <population>951580</population>
      </capital>
      <city>
        <name>Brussels</name>
        <population>951580</population>
      </city>
      <city>
        <name>Antwerp</name>
        <population>459072</population>
      </city>
    </country>

- query virtual integrated view: //country[@car_code="B"]/capital/population
- view created on-the-fly for queries
- reduction to relevant links
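The integrated view above can be reproduced with a short Python sketch that applies the two directive pairs used in the example. It is a simplified illustration, not the eXist extension: only "keep-body insert-bodies" and "drop-element insert-nodes" are handled, the target document is held in memory, the names resolve and expand are hypothetical, and the dbxlink namespace URI is a placeholder because the slides do not give it.

```python
import copy
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"
DBXLINK = "{http://example.org/dbxlink}"   # placeholder namespace URI

CITIES = ET.fromstring(
    "<cities>"
    "<city><name>Brussels</name><population>951580</population></city>"
    "<city><name>Antwerp</name><population>459072</population></city>"
    "</cities>")

COUNTRY = ET.fromstring(
    '<country xmlns:xlink="http://www.w3.org/1999/xlink"'
    ' xmlns:dbxlink="http://example.org/dbxlink" car_code="B">'
    "<name>Belgium</name>"
    '<capital xlink:href="cities-B.xml#xpointer(//city[name=\'Brussels\'])"'
    ' dbxlink:transparent="keep-body insert-bodies"/>'
    '<cities xlink:href="cities-B.xml#xpointer(//city)"'
    ' dbxlink:transparent="drop-element insert-nodes"/>'
    "</country>")


def resolve(href):
    """Evaluate the XPointer against the single in-memory target document."""
    path = href.partition("#xpointer(")[2][:-1]
    return [copy.deepcopy(node) for node in CITIES.findall("." + path)]


def expand(parent):
    """Return a copy of parent in which its direct link children are resolved."""
    view = ET.Element(parent.tag, dict(parent.attrib))
    view.text = parent.text
    for child in parent:
        mode = child.get(DBXLINK + "transparent")
        if mode is None:
            view.append(copy.deepcopy(child))      # ordinary child: copy as is
            continue
        targets = resolve(child.get(XLINK + "href"))
        if mode == "keep-body insert-bodies":
            shell = ET.SubElement(view, child.tag)  # link element is kept
            for target in targets:
                shell.extend(list(target))          # only the target's contents
        elif mode == "drop-element insert-nodes":
            view.extend(targets)                    # link element is replaced
        else:
            raise NotImplementedError(mode)
    return view


view = expand(COUNTRY)
print(view.findtext("capital/population"))             # 951580
print([c.findtext("name") for c in view.findall("city")])
```

After expansion, the plain path capital/population works without any reference to XLink, which is the point of the transparent model.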


Modeling

Applications:

- Data integration: building (virtual) XML documents by combining autonomous sources. Sometimes: given target DTD/XML Schema.
- Splitting an original XML document into a distributed database: keep the external schema unchanged:
  - the virtual model of the linked documents should be valid w.r.t. the original DTD,
  - all queries against the root document still yield the same answers as before.

⇒ cutting not only at (sub)elements, but also at attributes.


Related Work

ActiveXML [Abiteboul et al., VLDB 02 etc.]
- general approach for invoking arbitrary Web Services that return XML (as embedded views),
- <axml:call> elements,
- no special mapping/transformation functionality.

dbxlink via Active XML

Provide a service that answers XPointers and does the transformation mapping:

    <axml:call href="http://.../dbxlink-servlet">
      <capital xlink:type="simple"
               xlink:href="http://.../cities-B.xml#xpointer(//city[name='Brussels'])"
               dbxlink:transparent="keep-body insert-nodes"/>
    </axml:call>

make-attribute not possible.


Active XML via dbxlink

    <call xlink:type="simple"
          xlink:href="service-url"
          dbxlink:transparent="drop-element insert-nodes"/>

- no parameters for the call allowed (except in the URL).

⇒ dbxlink is suitable not only for XPointer-XML views, but also for calling Web Services.


Summary

Basic Functionality of XLink

XML Requirements "fulfilled":
- Data Model: seamless integration of view definitions into the database,
- transparent semantics,
- querying in XPath (adaptation of internal evaluation).

Implementation

- extension to the eXist [http://exist-db.org] XML database system.
- handling of cyclic instances.
- different evaluation modes (query, data and hybrid shipping).
- caching according to evaluation strategies.


Optimizations and Perspectives

- Resource descriptions (path indexes, XML Schema): check a priori whether the view can contribute to the answer of the remaining query.
- Evaluate the view already as a "projected document" [Marian, Simeon VLDB03] w.r.t. the remaining query.
- Parallel evaluation of views during querying.
- Caching: utilize query containment for similar XPath expressions used in XPointers.
- Arcs and link bases.


Thank you!

Questions?