Lecture 5: XML and XQuery
1
Lecture 5: XML and XQuery
2
Semistructured Data
Another data model, based on trees.
Motivation: flexible representation of data. Often, data comes from multiple sources with differences in notation, meaning, etc.
Motivation: sharing of documents among systems and databases.
3
Graphs of Semistructured Data
Nodes = objects.
Labels on arcs (attributes, relationships).
Atomic values at leaf nodes (nodes with no arcs out).
Flexibility: no restriction on:
  Labels out of a node.
  Number of successors with a given label.
4
Example: Data Graph
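[Figure: data graph. The root has beer and bar arcs leading to objects. Other arc labels are manf, servedAt, name, addr, prize, year, and award. Leaf values are Bud, A.B., Gold, 1995, Maple, Joe’s, and M’lob. Callouts mark the bar object for Joe’s Bar and the beer object for Bud, and one callout reads “Notice a new kind of data.”]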
5
XML
XML = Extensible Markup Language.
While HTML uses tags for formatting (e.g., “italic”), XML uses tags for semantics (e.g., “this is an address”).
Key idea: create tag sets for a domain (e.g., genomics), and translate all data into properly tagged XML documents.
6
Well-Formed and Valid XML
Well-Formed XML allows you to invent your own tags. Similar to labels in semistructured data.
Valid XML involves a DTD (Document Type Definition), a grammar for tags.
7
Well-Formed XML
Start the document with a declaration, surrounded by <?xml … ?>.
Normal declaration is:
  <?xml version = "1.0" standalone = "yes" ?>
“Standalone” = “no DTD provided.”
The balance of the document is a root tag surrounding nested tags.
8
Tags
Tags, as in HTML, are normally matched pairs, as <FOO> … </FOO> .
Tags may be nested arbitrarily. XML tags are case sensitive.
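The matching and case rules can be checked mechanically. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree parser as a stand-in for any XML parser. The helper name is_well_formed and the FOO/BAR snippets are illustrative and not part of the lecture.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


matched = "<FOO><BAR>x</BAR></FOO>"
wrong_case = "<FOO><BAR>x</bar></FOO>"   # </bar> does not close <BAR>
bad_nesting = "<FOO><BAR>x</FOO></BAR>"  # tags overlap instead of nesting

print(is_well_formed(matched), is_well_formed(wrong_case), is_well_formed(bad_nesting))
```

Only the first snippet parses, because the other two break the matched-pair rule.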
9
Example: Well-Formed XML
<?xml version = "1.0" standalone = "yes" ?>
<BARS>
  <BAR>
    <NAME>Joe’s Bar</NAME>
    <BEER>
      <NAME>Bud</NAME>
      <PRICE>2.50</PRICE>
    </BEER>
    <BEER>
      <NAME>Miller</NAME>
      <PRICE>3.00</PRICE>
    </BEER>
  </BAR>
  <BAR> …
</BARS>
Callouts mark a NAME subobject and a BEER subobject.
10
XML and Semistructured Data
Well-Formed XML with nested tags is exactly the same idea as trees of semistructured data.
We shall see that XML also enables nontree structures, as does the semistructured data model.
11
Example
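The <BARS> XML document is:
[Figure: tree rooted at BARS, whose children are BAR nodes. The first BAR has a NAME child with text “Joe’s Bar” and two BEER children. Each BEER has a NAME and a PRICE child: Bud at 2.50 and Miller at 3.00. The remaining BAR nodes are elided.]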
12
DTD Structure
<!DOCTYPE <root tag> [
  <!ELEMENT <name> (<components>)>
  . . . more elements . . .
]>
13
DTD Elements
The description of an element consists of its name (tag), and a parenthesized description of any nested tags.
  Includes order of subtags and their multiplicity.
Leaves (text elements) have #PCDATA (Parsed Character DATA) in place of nested tags.
14
Example: DTD
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*)>
  <!ELEMENT BAR (NAME, BEER+)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT BEER (NAME, PRICE)>
  <!ELEMENT PRICE (#PCDATA)>
]>
A BARS object has zero or more BAR’s nested within.
A BAR has one NAME and one or more BEER subobjects.
A BEER has a NAME and a PRICE.
NAME and PRICE are text.
15
Element Descriptions
Subtags must appear in order shown.
A tag may be followed by a symbol to indicate its multiplicity.
  * = zero or more.
  + = one or more.
  ? = zero or one.
Symbol | can connect alternative sequences of tags.
16
Example: Element Description
A name is an optional title (e.g., “Prof.”), a first name, and a last name, in that order, or it is an IP address:
<!ELEMENT NAME (
(TITLE?, FIRST, LAST) | IPADDR
)>
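An element description behaves like a regular expression over the sequence of child tags. The following is a rough sketch in Python that hand-translates the NAME description above into a regex and tests a few elements against it. This is not how a validating parser works internally. The sample names, the IP address, and the helper matches_name_model are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# (TITLE?, FIRST, LAST) | IPADDR, written over child tags that each end in a space.
NAME_MODEL = re.compile(r"(?:TITLE )?FIRST LAST |IPADDR ")


def matches_name_model(elem: ET.Element) -> bool:
    """Check the order and multiplicity of elem's children against NAME_MODEL."""
    child_tags = "".join(child.tag + " " for child in elem)
    return NAME_MODEL.fullmatch(child_tags) is not None


with_title = ET.fromstring("<NAME><TITLE>Prof.</TITLE><FIRST>Ann</FIRST><LAST>Lee</LAST></NAME>")
no_title = ET.fromstring("<NAME><FIRST>Ann</FIRST><LAST>Lee</LAST></NAME>")
ip_only = ET.fromstring("<NAME><IPADDR>10.0.0.1</IPADDR></NAME>")
wrong_order = ET.fromstring("<NAME><LAST>Lee</LAST><FIRST>Ann</FIRST></NAME>")

for candidate in (with_title, no_title, ip_only, wrong_order):
    print(matches_name_model(candidate))
```

The first three elements satisfy the description and the last does not, because subtags must appear in the order shown.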
17
Use of DTD’s
1. Set standalone = "no".
2. Either:
  a) Include the DTD as a preamble of the XML document, or
  b) Follow DOCTYPE and the <root tag> by SYSTEM and a path to the file where the DTD can be found.
18
Example (a)
<?xml version = "1.0" standalone = "no" ?>
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*)>
  <!ELEMENT BAR (NAME, BEER+)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT BEER (NAME, PRICE)>
  <!ELEMENT PRICE (#PCDATA)>
]>
<BARS>
  <BAR>
    <NAME>Joe’s Bar</NAME>
    <BEER><NAME>Bud</NAME> <PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME> <PRICE>3.00</PRICE></BEER>
  </BAR>
  <BAR> …
</BARS>
The DOCTYPE block is the DTD; the BARS element that follows it is the document.
19
Example (b)
Assume the BARS DTD is in file bar.dtd.
<?xml version = "1.0" standalone = "no" ?>
<!DOCTYPE BARS SYSTEM "bar.dtd">
<BARS>
  <BAR>
    <NAME>Joe’s Bar</NAME>
    <BEER><NAME>Bud</NAME> <PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME> <PRICE>3.00</PRICE></BEER>
  </BAR>
  <BAR> …
</BARS>
The DOCTYPE line says: get the DTD from the file bar.dtd.
20
Attributes
Opening tags in XML can have attributes.
In a DTD,
  <!ATTLIST E . . . >
declares an attribute for element E, along with its datatype.
21
Example: Attributes
Bars can have an attribute kind, a character string describing the bar.
<!ELEMENT BAR (NAME, BEER*)>
<!ATTLIST BAR kind CDATA #IMPLIED>
CDATA: character string type; no tags.
#IMPLIED: the attribute is optional. The opposite is #REQUIRED.
22
Example: Attribute Use
In a document that allows BAR tags, we might see:
<BAR kind = "sushi">
  <NAME>Akasaka</NAME>
  <BEER>
    <NAME>Sapporo</NAME>
    <PRICE>5.00</PRICE>
  </BEER>
  ...
</BAR>
Note that attribute values are quoted.
23
ID’s and IDREF’s
Attributes can be pointers from one object to another.
Compare to HTML’s NAME = "foo" and HREF = "#foo".
Allows the structure of an XML document to be a general graph, rather than just a tree.
24
Creating ID’s
Give an element E an attribute A of type ID.
When using tag <E> in an XML document, give its attribute A a unique value.
Example:
  <E A = "xyz">
25
Creating IDREF’s
To allow objects of type F to refer to another object with an ID attribute, give F an attribute of type IDREF.
Or, let the attribute have type IDREFS, so the F-object can refer to any number of other objects.
26
Example: ID’s and IDREF’s
Let’s redesign our BARS DTD to include both BAR and BEER subelements.
Both bars and beers will have ID attributes called name.
Bars have SELLS subobjects, consisting of a number (the price of one beer) and an IDREF theBeer leading to that beer.
Beers have attribute soldBy, which is an IDREFS leading to all the bars that sell it.
27
The DTD
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*, BEER*)>
  <!ELEMENT BAR (SELLS+)>
    <!ATTLIST BAR name ID #REQUIRED>
  <!ELEMENT SELLS (#PCDATA)>
    <!ATTLIST SELLS theBeer IDREF #REQUIRED>
  <!ELEMENT BEER EMPTY>
    <!ATTLIST BEER name ID #REQUIRED>
    <!ATTLIST BEER soldBy IDREFS #IMPLIED>
]>
Bar elements have name as an ID attribute and have one or more SELLS subelements.
SELLS elements have a number (the price) and one reference to a beer.
Beer elements have an ID attribute called name, and a soldBy attribute that is a set of Bar names.
EMPTY is explained next.
28
Example Document
<BARS>
  <BAR name = "JoesBar">
    <SELLS theBeer = "Bud">2.50</SELLS>
    <SELLS theBeer = "Miller">3.00</SELLS>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
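To see how ID and IDREF attributes turn the tree into a graph, one can resolve the references by hand. The following is a minimal sketch in Python using xml.etree.ElementTree, which does not read the DTD, so the code simply assumes that name is the ID attribute. The SuesBar entry, its price, and the Miller BEER element are made up to complete the elided parts of the example.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the example document; the SuesBar entry is hypothetical.
DOC = """
<BARS>
  <BAR name="JoesBar">
    <SELLS theBeer="Bud">2.50</SELLS>
    <SELLS theBeer="Miller">3.00</SELLS>
  </BAR>
  <BAR name="SuesBar">
    <SELLS theBeer="Bud">2.75</SELLS>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar SuesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>
"""
root = ET.fromstring(DOC)

# The ID attribute (name) gives every BAR and BEER a unique key.
by_id = {elem.get("name"): elem for elem in root}

# IDREFS: soldBy is a space-separated list of IDs.
bud = by_id["Bud"]
bud_sellers = [by_id[ref].get("name") for ref in bud.get("soldBy").split()]

# IDREF: theBeer on a SELLS element points at exactly one BEER.
first_sells = by_id["JoesBar"].find("SELLS")
target = by_id[first_sells.get("theBeer")]

print(bud_sellers, target.tag, target.get("name"))
```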
29
Empty Elements
We can do all the work of an element in its attributes. Like BEER in previous example.
Another example: SELLS elements could have attribute price rather than a value that is a price.
30
Example: Empty Element
In the DTD, declare:
  <!ELEMENT SELLS EMPTY>
    <!ATTLIST SELLS theBeer IDREF #REQUIRED>
    <!ATTLIST SELLS price CDATA #REQUIRED>
Example use:
  <SELLS theBeer = "Bud" price = "2.50"/>
Note the exception to the “matching tags” rule: the trailing /> closes the element.
31
XPath
Path Expressions
Conditions
32
Paths in XML Documents
XPath is a language for describing paths in XML documents.
Really think of the semistructured data graph and its paths.
33
Example DTD
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*, BEER*)>
  <!ELEMENT BAR (PRICE+)>
    <!ATTLIST BAR name ID #REQUIRED>
  <!ELEMENT PRICE (#PCDATA)>
    <!ATTLIST PRICE theBeer IDREF #REQUIRED>
  <!ELEMENT BEER EMPTY>
    <!ATTLIST BEER name ID #REQUIRED>
    <!ATTLIST BEER soldBy IDREFS #IMPLIED>
]>
34
Example Document
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
35
Path Descriptors
Simple path descriptors are sequences of tags separated by slashes (/).
If the descriptor begins with /, then the path starts at the root and has those tags, in order.
If the descriptor begins with //, then the path can start anywhere.
36
Value of a Path Descriptor
Each path descriptor, applied to a document, has a value that is a sequence of items.
An item is an atomic value or a node.
A node is matching tags and everything in between. I.e., a node of the semistructured graph.
37
Example: /BARS/BAR/PRICE
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
/BARS/BAR/PRICE describes the set with these two PRICE elements as well as the PRICE elements for any other bars.
38
Example: //PRICE
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
//PRICE describes the same PRICE elements, but only because the DTD forces every PRICE to appear within a BARS and a BAR.
39
Wild-Card *
A star (*) in place of a tag represents any one tag.
Example: /*/*/PRICE represents all price objects at the third level of nesting.
40
Example: /BARS/*
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
/BARS/* captures all BAR and BEER elements, such as these.
41
Attributes
In XPath, we refer to attributes by prepending @ to their name.
Attributes of a tag may appear in paths as if they were nested within that tag.
42
Example: /BARS/*/@name
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
/BARS/*/@name selects all name attributes of immediate subelements of the BARS element.
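The path descriptors seen so far can be tried out on a small document. The following is a sketch in Python, assuming xml.etree.ElementTree, which supports only a subset of XPath: paths are written relative to the BARS root element, and attribute steps such as @name are emulated with get(). The document is a trimmed copy of the example with the elided parts left out.

```python
import xml.etree.ElementTree as ET

DOC = """
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar"/>
</BARS>
"""
root = ET.fromstring(DOC)  # root is the BARS element itself

# /BARS/BAR/PRICE: start at the root and follow the tags in order.
abs_prices = [p.text for p in root.findall("BAR/PRICE")]

# //PRICE: PRICE elements at any depth.
any_prices = [p.text for p in root.iter("PRICE")]

# /BARS/*: every immediate subelement of BARS.
children = [e.tag for e in root.findall("*")]

# /BARS/*/@name: no attribute step in ElementTree, so read the attribute directly.
names = [e.get("name") for e in root.findall("*") if e.get("name") is not None]

print(abs_prices, any_prices, children, names)
```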
43
Selection Conditions
A condition inside […] may follow a tag.
If so, then only paths that have that tag and also satisfy the condition are included in the result of a path expression.
44
Example: Selection Condition
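/BARS/BAR/PRICE[. < 2.75]
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
The condition that the PRICE be < $2.75 makes the Bud price but not the Miller price satisfy the path descriptor. Here . denotes the current PRICE element. By contrast, /BARS/BAR[PRICE < 2.75]/PRICE tests the BAR, so it would select both of Joe’s prices.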
45
Example: Attribute in Selection
/BARS/BAR/PRICE[@theBeer = "Miller"]
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
Now, the Miller PRICE element is selected, along with any other prices for Miller.
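Where a condition sits in the path determines which elements are filtered. The following is a sketch in Python under the assumption that xml.etree.ElementTree is the only tool available: it supports the attribute test [@theBeer='Miller'] directly, but not numeric comparisons, so those are written as ordinary Python filters. The document is a trimmed copy of the example.

```python
import xml.etree.ElementTree as ET

DOC = """
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
</BARS>
"""
root = ET.fromstring(DOC)

# /BARS/BAR/PRICE[@theBeer = "Miller"]: condition on an attribute of PRICE.
miller = [p.text for p in root.findall("BAR/PRICE[@theBeer='Miller']")]

# Condition attached to PRICE: keep only the prices that are themselves < 2.75.
price_level = [p.text for p in root.findall("BAR/PRICE") if float(p.text) < 2.75]

# Condition attached to BAR: keep bars having SOME price < 2.75,
# then return ALL prices of those bars.
bar_level = [
    p.text
    for bar in root.findall("BAR")
    if any(float(q.text) < 2.75 for q in bar.findall("PRICE"))
    for p in bar.findall("PRICE")
]

print(miller, price_level, bar_level)
```

The last two lists differ: a condition on BAR is satisfied by Joe’s Bar as a whole, so both of its prices come back.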
46
Axes
In general, path expressions allow us to start at the root and execute steps to find a sequence of nodes at each step.
At each step, we may follow any one of several axes.
The default axis is child:: --- go to all the children of the current set of nodes.
47
Example: Axes
/BARS/BEER is really shorthand for /BARS/child::BEER .
@ is really shorthand for the attribute:: axis.
  Thus, /BARS/BEER[@name = "Bud"] is shorthand for /BARS/BEER[attribute::name = "Bud"]
48
More Axes
Some other useful axes are:
1. parent:: = parent(s) of the current node(s).
2. descendant-or-self:: = the current node(s) and all descendants.
  Note: // is really shorthand for this axis.
3. ancestor::, ancestor-or-self::, etc.
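Each axis is a different way of stepping from a node to related nodes. The following is a rough sketch in Python that emulates four axes on a trimmed copy of the example document. xml.etree.ElementTree elements do not store a parent pointer, so the parent map and the ancestors helper are built by hand and are illustrative only.

```python
import xml.etree.ElementTree as ET

DOC = """
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar"/>
</BARS>
"""
root = ET.fromstring(DOC)

# child:: axis: the immediate subelements.
child_tags = [c.tag for c in root]

# descendant-or-self:: axis: the node itself plus everything below it.
desc_or_self = [e.tag for e in root.iter()]

# parent:: axis: build a child -> parent map once.
parent_of = {child: parent for parent in root.iter() for child in parent}
price_parents = [parent_of[p].get("name") for p in root.iter("PRICE")]


def ancestors(node):
    """ancestor:: axis: walk the parent map up to the root."""
    chain = []
    while node in parent_of:
        node = parent_of[node]
        chain.append(node.tag)
    return chain


first_price_ancestors = ancestors(root.find("BAR/PRICE"))
print(child_tags, desc_or_self, price_parents, first_price_ancestors)
```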
49
XQuery
Values
FLWR Expressions
Other Expressions
50
XQuery
XQuery extends XPath to a query language that has power similar to SQL.
XQuery is an expression language.
  Like relational algebra: any XQuery expression can be an argument of any other XQuery expression.
  Unlike RA, with the relation as the sole datatype, XQuery has a subtle type system.
51
The XQuery Type System
1. Atomic values: strings, integers, etc.
  Also, certain constructed values like true(), date("2004-09-30").
2. Nodes.
  Seven kinds.
  We’ll only worry about four, on next slide.
52
Some Node Types
1. Element Nodes are like nodes of semistructured data.
  Described by !ELEMENT declarations in DTD’s.
2. Attribute Nodes are attributes, described by !ATTLIST declarations in DTD’s.
3. Text Nodes = #PCDATA.
4. Document Nodes represent files.
53
Example Document
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
54
Example Nodes
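[Figure: node tree for the example document. Element nodes: BARS, with children BAR and BEER; BAR has two PRICE children. Attribute nodes: name = "JoesBar" on BAR; theBeer = "Bud" and theBeer = "Miller" on the PRICE elements; name = "Bud" and soldBy = "…" on BEER. Text nodes: 2.50 and 3.00. Color legend: green = element, gold = attribute, purple = text.]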
55
Document Nodes
Form: document("<file name>").
Establishes a document to which a query applies.
Example:
  document("/usr/ullman/bars.xml")
56
FLWR Expressions
1. One or more for and/or let clauses.
2. Then an optional where clause.
3. A return clause.
57
Semantics of FLWR Expressions
Each for creates a loop. let produces only a local definition.
At each iteration of the nested loops, if any, evaluate the where clause.
If the where clause returns TRUE, invoke the return clause, and append its value to the output.
58
FOR Clauses
for <variable> in <expression>, . . .
Variables begin with $.
A for-variable takes on each item in the sequence denoted by the expression, in turn.
Whatever follows this for is executed once for each value of the variable.
59
Example: FOR
for $beer in document("bars.xml")/BARS/BEER/@name
return
  <BEERNAME> {$beer} </BEERNAME>
$beer ranges over the name attributes of all beers in our example document.
Result is a list of tagged names, like
  <BEERNAME>Bud</BEERNAME> <BEERNAME>Miller</BEERNAME> . . .
The braces mean: “Expand the enclosed string by replacing variables and path exps. by their values.”
60
LET Clauses
let <variable> := <expression>, . . .
Value of the variable becomes the sequence of items defined by the expression.
Note let does not cause iteration; for does.
61
Example: LET
let $d := document("bars.xml")
let $beers := $d/BARS/BEER/@name
return
  <BEERNAMES> {$beers} </BEERNAMES>
Returns one element with all the names of the beers, like:
  <BEERNAMES>Bud Miller …</BEERNAMES>
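The difference between for and let is the difference between looping over a sequence and naming it. The following is a loose Python analogy, not XQuery. It assumes xml.etree.ElementTree and a trimmed document containing only two beers, and it builds the output as plain strings.

```python
import xml.etree.ElementTree as ET

DOC = """
<BARS>
  <BEER name="Bud" soldBy="JoesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>
"""
root = ET.fromstring(DOC)
beer_names = [b.get("name") for b in root.findall("BEER")]

# for: the return clause runs once per binding, giving one item per beer.
for_result = [f"<BEERNAME>{name}</BEERNAME>" for name in beer_names]

# let: the variable is bound once to the whole sequence, giving one item.
let_result = f"<BEERNAMES>{' '.join(beer_names)}</BEERNAMES>"

print(for_result)
print(let_result)
```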
62
Following IDREF’s
XQuery (but not XPath) allows us to use paths that follow attributes that are IDREF’s.
If x denotes a sequence of one or more IDREF’s, then x =>y denotes all the elements with tag y whose ID’s are one of these IDREF’s.
63
Example
Find all the beer elements where the beer is sold by Joe’s Bar for less than 3.00.
Strategy:
1. $beer will for-loop over all beer elements.
2. For each $beer, let $joe be either the Joe’s-Bar element, if Joe sells the beer, or the empty sequence if not.
3. Test whether $joe sells the beer for < 3.00.
64
Example: The Query
let $d := document("bars.xml")
for $beer in $d/BARS/BEER
let $joe := $beer/@soldBy=>BAR[@name="JoesBar"]
let $joePrice := $joe/PRICE[@theBeer=$beer/@name]
where $joePrice < 3.00
return <CHEAPBEER> {$beer} </CHEAPBEER>
$joe: attribute soldBy is of type IDREFS. Follow each ref to a BAR and check if its name is Joe’s Bar.
$joePrice: find that PRICE subelement of the Joe’s Bar element that represents whatever beer is currently $beer.
where: only pass the values of $beer, $joe, $joePrice to the RETURN clause if the string inside the PRICE element $joePrice is < 3.00.
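The strategy of the query can be traced step by step in ordinary code. The following is a sketch in Python, assuming xml.etree.ElementTree and a hand-built ID index in place of the => dereference. The Miller BEER element is added here for illustration so that the loop has more than one beer to test.

```python
import xml.etree.ElementTree as ET

DOC = """
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>
"""
root = ET.fromstring(DOC)
by_id = {elem.get("name"): elem for elem in root}  # name is the ID attribute

cheap = []
for beer in root.findall("BEER"):  # for $beer in $d/BARS/BEER
    # $beer/@soldBy=>BAR: follow each IDREF, keep only BAR elements.
    sellers = [by_id[ref] for ref in beer.get("soldBy", "").split()]
    # [@name="JoesBar"]: either a one-element list or the empty sequence.
    joe = [b for b in sellers if b.tag == "BAR" and b.get("name") == "JoesBar"]
    # $joe/PRICE[@theBeer=$beer/@name]
    joe_prices = [
        float(p.text)
        for bar in joe
        for p in bar.findall("PRICE")
        if p.get("theBeer") == beer.get("name")
    ]
    # where $joePrice < 3.00 (true if some price qualifies)
    if any(price < 3.00 for price in joe_prices):
        cheap.append(beer.get("name"))

print(cheap)
```

Miller is excluded because Joe’s price for it is exactly 3.00, not less.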
65
Order-By Clauses
FLWR is really FLWOR: an order-by clause can precede the return.
Form: order by <expression>
  With optional ascending or descending.
The expression is evaluated for each output element.
  Determines placement in output sequence.
66
Example: Order-By
List all prices for Bud, lowest first.
let $d := document("bars.xml")
for $p in $d/BARS/BAR/PRICE[@theBeer="Bud"]
order by $p
return $p
67
Predicates
Normally, conditions imply existential quantification.
Example: /BARS/BAR[@name] means “all the bars that have a name.”
Example: /BARS/BAR[@name="JoesBar"]/PRICE = /BARS/BAR[@name="SuesBar"]/PRICE means “Joe and Sue have at least one price in common.”
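Existential quantification means a comparison between two sequences succeeds as soon as one pair of items satisfies it. The following is a small Python sketch of that reading, assuming xml.etree.ElementTree. The SuesBar prices and the prices helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BAR name="SuesBar">
    <PRICE theBeer="Bud">2.75</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
</BARS>
"""
root = ET.fromstring(DOC)


def prices(bar_name):
    """All prices of the bar with the given name, as numbers."""
    return [float(p.text) for p in root.findall(f"BAR[@name='{bar_name}']/PRICE")]


joe, sue = prices("JoesBar"), prices("SuesBar")

# "=" between sequences: true if SOME pair of items is equal.
some_equal = any(j == s for j in joe for s in sue)
# "!=" is existential too, so both can hold at once.
some_unequal = any(j != s for j in joe for s in sue)

# /BARS/BAR[@name]: bars for which a name attribute exists.
named_bars = [b.get("name") for b in root.findall("BAR[@name]")]

print(some_equal, some_unequal, named_bars)
```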
68
Path Expression Examples
Doc =
[Figure: bibliography data graph rooted at &o1 (Bib). Object identifiers shown include &o12, &o24, &o29, &o43 through &o52, &o70, &o71, &96, &243, &206, and &25. Arc labels include paper, book, references, author, title, year, http, publisher, page, firstname, lastname, first, and last. Leaf values include “Serge”, “Abiteboul”, 1997, “Victor”, “Vianu”, 122, and 133.]
Bib/paper = <&o12,&o29>
Bib/book/publisher = <&o51>
Bib/paper/author/lastname = <&o71,&206>
Note that order of elements matters!
69
FOR vs. LET: Example
FOR $x IN document("bib.xml")/bib/book
RETURN <result> { $x } </result>
Returns:
  <result> <book>...</book></result>
  <result> <book>...</book></result>
  <result> <book>...</book></result>
  ...
LET $x := document("bib.xml")/bib/book
RETURN <result> { $x } </result>
Returns:
  <result> <book>...</book> <book>...</book> <book>...</book> ... </result>
70
XQuery Example 1
Find all book titles published after 1995:
FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title
Result:
  <title> abc </title>
  <title> def </title>
  <title> ghi </title>
71
XQuery Example 2
For each author of a book by Morgan Kaufmann, list all books she published:
FOR $a IN distinct(document("bib.xml")/bib/book[publisher="Morgan Kaufmann"]/author)
RETURN <result>
         { $a,
           FOR $t IN document("bib.xml")/bib/book[author=$a]/title
           RETURN $t }
       </result>
distinct = a function that eliminates duplicates (after converting inputs to atomic values)
72
Results for Example 2
<result>
  <author>Jones</author>
  <title> abc </title>
  <title> def </title>
</result>
<result>
  <author> Smith </author>
  <title> ghi </title>
</result>
Observe how nested structure of result elements is determined by the nested structure of the query.
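The nesting can be mirrored with an outer loop over distinct authors and an inner loop over their titles. The following is a Python sketch under stated assumptions: the bib.xml content below is invented to reproduce the result shown above, and dict.fromkeys stands in for distinct.

```python
import xml.etree.ElementTree as ET

# Hypothetical bib.xml content chosen to match the result on this slide.
BIB = """
<bib>
  <book><publisher>Morgan Kaufmann</publisher><author>Jones</author><title>abc</title></book>
  <book><publisher>Morgan Kaufmann</publisher><author>Jones</author><title>def</title></book>
  <book><publisher>Morgan Kaufmann</publisher><author>Smith</author><title>ghi</title></book>
  <book><publisher>Other Press</publisher><author>Lee</author><title>xyz</title></book>
</bib>
"""
root = ET.fromstring(BIB)
books = root.findall("book")

# Outer FOR: distinct authors of Morgan Kaufmann books, in first-seen order.
mk_authors = list(dict.fromkeys(
    a.text
    for book in books
    if book.findtext("publisher") == "Morgan Kaufmann"
    for a in book.findall("author")
))

# Inner FOR: one nested title list per author, giving one <result> per author.
results = []
for author in mk_authors:
    titles = [
        b.findtext("title")
        for b in books
        if author in [a.text for a in b.findall("author")]
    ]
    results.append((author, titles))

print(results)
```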
73
XQuery Example 3
count = (aggregate) function that returns the number of elements
<big_publishers>
  { FOR $p IN distinct(document("bib.xml")//publisher)
    LET $b := document("bib.xml")/bib/book[publisher = $p]
    WHERE count($b) > 100
    RETURN $p }
</big_publishers>
For each publisher p:
  Let the list of books published by p be b.
  Count the number of books in b, and return p if the count is > 100.
74
XQuery Example 4
Find books whose price is larger than average:
LET $a := avg(document("bib.xml")/bib/book/price)
FOR $b IN document("bib.xml")/bib/book
WHERE $b/price > $a
RETURN $b
75
Collections in XQuery
Ordered and unordered collections
  /bib/book/author = an ordered collection
  distinct(/bib/book/author) = an unordered collection
Examples:
  LET $a := /bib/book
    $a is a collection; a statement iterates over all books in the collection
  $b/author
    also a collection (several authors...)
  RETURN <result> { $b/author } </result>
    Returns a single collection!
    <result> <author>...</author> <author>...</author> <author>...</author> ... </result>
However:
76
Collections in XQuery
What about collections in expressions?
  $b/price                 list of n prices
  $b/price * 0.7           list of n numbers??
  $b/price * $b/quantity   list of n x m numbers??
    Valid only if the two sequences have at most one element
    Atomization
  $book1/author eq "Kennedy"   Value Comparison
  $book1/author = "Kennedy"    General Comparison
77
Sorting in XQuery
<publisher_list>
  { FOR $p IN distinct(document("bib.xml")//publisher)
    ORDER BY $p
    RETURN <publisher>
             <name> { $p/text() } </name>
             { FOR $b IN document("bib.xml")//book[publisher = $p]
               ORDER BY $b/price DESCENDING
               RETURN <book> { $b/title, $b/price } </book> }
           </publisher> }
</publisher_list>
78
Conditional Expressions: If-Then-Else
FOR $h IN //holding
ORDER BY $h/title
RETURN <holding>
         { $h/title,
           IF ($h/@type = "Journal")
           THEN $h/editor
           ELSE $h/author }
       </holding>
79
Existential Quantifiers
FOR $b IN //book
WHERE SOME $p IN $b//para SATISFIES
contains($p, "sailing")
AND contains($p, "windsurfing")
RETURN $b/title
80
Universal Quantifiers
FOR $b IN //book
WHERE EVERY $p IN $b//para SATISFIES
contains($p, "sailing")
RETURN $b/title
81
Other Stuff in XQuery
Before and After
  for dealing with order in the input
Filter
  deletes some edges in the result tree
Recursive functions
Namespaces
References, links …
Lots more stuff …
82
Appendix
XML Schema and XQuery Data Model
83
XML Schema
Includes primitive data types (integers, strings, dates, etc.)
Supports value-based constraints (integers > 100)
User-definable structured types
Inheritance (extension or restriction)
Foreign keys
Element-type reference constraints
84
Sample XML Schema
<schema version="1.0" xmlns="http://www.w3.org/1999/XMLSchema">
  <element name="author" type="string" />
  <element name="date" type="date" />
  <element name="abstract">
    <type> … </type>
  </element>
  <element name="paper">
    <type>
      <attribute name="keywords" type="string"/>
      <element ref="author" minOccurs="0" maxOccurs="*" />
      <element ref="date" />
      <element ref="abstract" minOccurs="0" maxOccurs="1" />
      <element ref="body" />
    </type>
  </element>
</schema>
85
XML-Query Data Model
Describes XML data as a tree
Node ::= DocNode | ElemNode | ValueNode | AttrNode | NSNode | PINode | CommentNode | InfoItemNode | RefNode
http://www.w3.org/TR/query-datamodel/2/2001
86
XML-Query Data Model
Element node (simplified definition):
  elemNode : (QNameValue, {AttrNode}, [ElemNode | ValueNode]) -> ElemNode
QNameValue means “a tag name”
Reads: “Give me a tag, a set of attributes, a list of elements/values, and I will return an element”
87
XML Query Data Model
Example:
<book price = "55" currency = "USD">
  <title> Foundations … </title>
  <author> Abiteboul </author>
  <author> Hull </author>
  <author> Vianu </author>
  <year> 1995 </year>
</book>
book1 = elemNode(book, {price2, currency3}, [title4, author5, author6, author7, year8])
price2 = attrNode(…) /* next */
currency3 = attrNode(…)
title4 = elemNode(title, string9)
…
89
XQuery Values
Item = node or atomic value.
Value = ordered sequence of zero or more items.
Examples:
1. () = empty sequence.
2. ("Hello", "World")
3. ("Hello", <PRICE>2.50</PRICE>, 10)
90
Nesting of Sequences Ignored
A value can, in principle, be an item of another value.
But nested list structures are expanded.
Example: ((1,2),(),(3,(4,5))) = (1,2,3,4,5) = 1,2,3,4,5.
Important when values are computed by concatenating other values.
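The flattening rule is easy to state as a recursive function. The following is a minimal sketch in Python in which nested tuples stand in for XQuery sequences. The function name flatten is illustrative.

```python
def flatten(value):
    """Expand nested tuples into one flat list of items."""
    if isinstance(value, tuple):
        items = []
        for member in value:
            items.extend(flatten(member))  # splice the members in, do not nest
        return items
    return [value]  # a single item is a sequence of length one


nested = ((1, 2), (), (3, (4, 5)))
print(flatten(nested))
```

Empty sequences vanish and a single item behaves like a one-item sequence, which is why concatenating values never produces nested structure.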
91
Effective Boolean Values
The effective boolean value (EBV) of an expression is:
1. The actual value if the expression is of type boolean.
2. FALSE if the expression evaluates to 0, "" [the empty string], or () [the empty sequence].
3. TRUE otherwise.
92
EBV Examples
1. @name="JoesBar" has EBV TRUE or FALSE, depending on whether the name attribute is "JoesBar".
2. /BARS/BAR[@name="GoldenRail"] has EBV TRUE if some bar is named the Golden Rail, and FALSE if there is no such bar.
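The three EBV rules translate into a short function. The following is a sketch in Python that follows only the simplified rules on these slides, not every case in the XQuery specification. Python lists and tuples stand in for sequences, and the name ebv is illustrative.

```python
def ebv(value):
    """Effective boolean value, following the three rules listed above."""
    if isinstance(value, bool):
        return value                 # rule 1: booleans are their own EBV
    if isinstance(value, (int, float)):
        return value != 0            # rule 2: zero is FALSE
    if isinstance(value, (str, tuple, list)):
        return len(value) > 0        # rule 2: "" and () are FALSE
    return True                      # rule 3: everything else is TRUE


# A path expression that finds no bar yields the empty sequence.
no_such_bar = []
some_bar = ["<BAR name='GoldenRail'/>"]
print(ebv(no_such_bar), ebv(some_bar), ebv(0), ebv(""), ebv("0"))
```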
93
Boolean Operators
E1 and E2, E1 or E2, not(E), if (E1) then E2 else E3 apply to any expressions.
  Take EBV’s of the expressions first.
Example: not(3 eq 5 or 0) has value TRUE.
Also: true() and false() are functions that return values TRUE and FALSE.
94
Quantifier Expressions
some $x in E1 satisfies E2
1. Evaluate the sequence E1.
2. Let $x (any variable) be each item in the sequence, and evaluate E2.
3. Return TRUE if E2 has EBV TRUE for at least one $x.
Analogously:
  every $x in E1 satisfies E2
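The two quantifiers correspond closely to Python's any and all. The following sketch uses made-up paragraph strings to show the correspondence, including the behavior on an empty sequence.

```python
# Hypothetical paragraph texts standing in for the items of E1.
paras = ["sailing and windsurfing", "sailing only"]

# some $p in paras satisfies (contains sailing and contains windsurfing)
some_both = any("sailing" in p and "windsurfing" in p for p in paras)

# every $p in paras satisfies contains windsurfing
every_windsurfing = all("windsurfing" in p for p in paras)

# On an empty sequence, "some" is false and "every" is true.
empty = []
some_on_empty = any("sailing" in p for p in empty)
every_on_empty = all("sailing" in p for p in empty)

print(some_both, every_windsurfing, some_on_empty, every_on_empty)
```

The last case matters in practice: an every test succeeds for a sequence with no items at all.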
95
Document Order
Comparison by document order: << and >>.
Example: $d/BARS/BEER[@name="Bud"] << $d/BARS/BEER[@name="Miller"] is true iff the Bud element appears before the Miller element in the document $d.
96
Set Operators
union, intersect, except operate on sequences of nodes.
  Meanings analogous to SQL.
  Result eliminates duplicates.
  Result appears in document order.
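Both properties, duplicate elimination and document order, can be seen in a small emulation. The following is a sketch in Python, assuming xml.etree.ElementTree elements as nodes compared by identity. The helper names (in_document_order, union, intersect, except_) are illustrative.

```python
import xml.etree.ElementTree as ET

DOC = """
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
</BARS>
"""
root = ET.fromstring(DOC)

# Position of every node in document order.
doc_pos = {elem: i for i, elem in enumerate(root.iter())}


def in_document_order(nodes):
    """Drop duplicate nodes and sort the rest by document position."""
    return sorted(set(nodes), key=doc_pos.__getitem__)


def union(a, b):
    return in_document_order(list(a) + list(b))


def intersect(a, b):
    keep = set(b)
    return in_document_order(n for n in a if n in keep)


def except_(a, b):
    drop = set(b)
    return in_document_order(n for n in a if n not in drop)


all_prices = root.findall("BAR/PRICE")
bud_prices = root.findall("BAR/PRICE[@theBeer='Bud']")

# Even with a reversed, overlapping input the result is deduplicated and ordered.
union_texts = [p.text for p in union(list(reversed(all_prices)), bud_prices)]
intersect_texts = [p.text for p in intersect(all_prices, bud_prices)]
except_texts = [p.text for p in except_(all_prices, bud_prices)]
print(union_texts, intersect_texts, except_texts)
```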
97
Other Operators
Use Fortran comparison operators to compare atomic values only.
  eq, ne, gt, ge, lt, le.
Arithmetic operators: +, -, *, div, idiv, mod.
  Apply to any expressions that yield arithmetic or date/time values.
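The div, idiv, and mod operators do not all behave like their nearest Python counterparts on negative numbers. The following sketch reflects my reading of the XQuery 1.0 definitions, namely that idiv truncates toward zero and mod takes the sign of the dividend. The helper names xq_idiv and xq_mod are illustrative and cover integers only.

```python
def xq_idiv(a: int, b: int) -> int:
    """Integer division that truncates toward zero (Python's // floors instead)."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


def xq_mod(a: int, b: int) -> int:
    """Remainder that takes the sign of the dividend a."""
    return a - b * xq_idiv(a, b)


# div on integers is ordinary division: 7 div 2 is 3.5.
print(7 / 2, xq_idiv(7, 2), xq_mod(7, 2))
# The difference from Python shows up with a negative dividend.
print(xq_idiv(-7, 2), -7 // 2, xq_mod(-7, 2), -7 % 2)
```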