1 XML & related technologies. 2 Outline zMarkup Language yXML yDTD zAPI for XML yDOM ySAX zRelated...


XML & related technologies


Outline

Markup language: XML, DTD

APIs for XML: DOM, SAX

Related technologies: Namespaces, XPath, XLink, XSL

Query languages for XML: Quilt


A critique of HTML

Extraordinarily flexible, but low on structure

Fixed tag set (vocabulary)
No automatic validation
Unreliable use of the syntax
Nobody uses the data model


How XML solves this

Define your own tags (vocabulary)
Validate against the definition
Error handling and strict definition of the syntax
Smaller and simpler than SGML
Standardized APIs for working with it
A data model specification is coming


XML background

A subset of SGML

Simplifies SGML by:
  leaving out many syntactical options and variants
  leaving out some DTD features
  leaving out some troublesome features

Recommendation approved by the W3C


A simple and complete element:

<address>

<street> 33, Terry Dr.</street>

<city> Morristown </city>

</address>

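Elements

(Diagram callouts: the tags are the markup; the text between them is the content; <address> is a start tag and </address> is an end tag.)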


Elements

Attach a meaning to a piece of a document

Have an element type (e.g. "example", "name"), represented in the markup by a tag.

Can be nested at any depth


Elements

Can contain:

other elements (sub-elements)

<address>
  <street> 33, Terry Dr.</street>
  <city> Morristown </city>
</address>

text (data content)

<street> 33, Terry Dr.</street>

a combination of them (mixed content)

<par>Today, <date>05-06-2000</date> Mr. <name>Bill Gates</name> is in California to talk to ... </par>
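As a rough illustration of mixed content, the sketch below reads the <par> example with Python's standard xml.etree.ElementTree module. The variable names are illustrative only. In this API the text that follows a sub-element is stored on that sub-element's tail attribute.

```python
import xml.etree.ElementTree as ET

# The mixed-content example from the slide, with <name> properly closed.
par = ET.fromstring(
    "<par>Today, <date>05-06-2000</date> Mr. <name>Bill Gates</name>"
    " is in California to talk to ... </par>"
)

# Text that appears before the first sub-element is stored on the parent.
leading_text = par.text

# Each sub-element keeps its own content in .text and the text that
# follows it (up to the next tag) in .tail.
pieces = [(child.tag, child.text, child.tail) for child in par]

# itertext() walks all character data in document order.
flat = "".join(par.itertext())

print(leading_text)
print(pieces)
print(flat)
```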


Document Element

It is the outermost element, containing all the other elements in a document

example:

<employee> … </employee>

It must always exist


Empty Elements

Elements without content
They do not have a separate end tag
They use a special form of the start tag, ending in />

example:

<medical-dossier …/>


Attributes

Used to annotate the element with extra information

Always attached to start tags:

<el-name attr-name1="v1" ... attr-nameN="vN"> … </el-name>

Elements can have any number of attributes, but all with distinct names


<Orders>

<SalesOrder SONumber="12345">

<Customer CustNumber="543">

<CustName>ABC Industries</CustName>

<Street>123 Main St.</Street>

<City>Chicago</City>

<State>IL</State>

<PostCode>60609</PostCode>

</Customer>

<OrderDate>981215</OrderDate>

<Line LineNumber="1">

<Part PartNumber="123">

<Description> Turkey wrench: Stainless steel,

one piece construction, lifetime guarantee.

</Description>

<Price>9.95</Price>

</Part>

<Quantity>10</Quantity>

</Line>

</SalesOrder>

</Orders>
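To show how an application might read the attributes and element content of such an order, here is a minimal sketch using Python's standard xml.etree.ElementTree. The document is a trimmed copy of the example above, and the variable names and the line-total computation are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the sales order shown above.
order_xml = """
<Orders>
  <SalesOrder SONumber="12345">
    <Customer CustNumber="543">
      <CustName>ABC Industries</CustName>
    </Customer>
    <Line LineNumber="1">
      <Part PartNumber="123"><Price>9.95</Price></Part>
      <Quantity>10</Quantity>
    </Line>
  </SalesOrder>
</Orders>
"""

orders = ET.fromstring(order_xml)
sales_order = orders.find("SalesOrder")

# Attributes are read from the element that carries them.
so_number = sales_order.get("SONumber")
cust_number = sales_order.find("Customer").get("CustNumber")

# Element content is read with a path relative to the current element.
cust_name = sales_order.findtext("Customer/CustName")

# Illustrative computation: price times quantity, summed over all lines.
line_total = sum(
    float(line.findtext("Part/Price")) * int(line.findtext("Quantity"))
    for line in sales_order.findall("Line")
)

print(so_number, cust_number, cust_name, line_total)
```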


An XML document

<?xml version="1.0"?>
<books>
  <book>
    <entry isbn="1-55860-622-X">
      <title>Data on the Web:...</title>
      <publisher>Morgan Kaufmann</publisher>
    </entry>
    <author> Serge Abiteboul</author>
    <author> Peter Buneman</author>
    <author> Dan Suciu</author>
    <bookRef to="0-201-53771-O 1-55860-463-4"/>
    <articleLink href="http://…/articles.xml#id(Abi97)"/>
  </book>
  <book>
    <entry isbn="0-201-53771-O">
      <title>Foundation of Databases</title>
      <publisher>Addison Wesley</publisher>
    </entry>
    <author> Serge Abiteboul</author>
    ...
  </book>
  ...
</books>


Elements Vs Attributes

Do I use an element or an attribute to store semantic information?

An element, when:
  I need a fast search process
  it is visible to everyone
  it is relevant to the meaning of the document

An attribute, when:
  it is a choice
  it is visible only to the system
  it is not relevant to the meaning of the document


Other Stuff

Processing instructions, used mainly for extension purposes (<?target data?>)

Comments (<!-- … -->)

Character references (&#163;)

Entities:
  named files or pieces of markup
  can be referred to recursively, inserted at the point of reference


Document Types

Basic idea: we need a type associated with a document, just like objects and values

A document type is a class of documents with similar structure and semantics

Examples: slide presentations, journal articles, meeting agendas, method calls, etc.


DTDs

DTDs provide a standardized means for declaratively describing the structure of a document type

This means:
  which (sub-)elements an element can contain
  whether it can contain text or not
  which attributes it can have
  some typing and defaulting of attributes


DTD

A DTD can be:
  Internal: the DTD is in the document
  External: the DTD is in an external file and included in the document
  Mixed: part in the document and part outside

A DTD is logically composed of 2 parts:
  Element Type Definition
  Attribute List Declaration


Element Type Definition

The element type definition specifies:
  structure of the document
  allowed contents (content model)
  allowed attributes (by means of attribute list declarations)


Element Type Definition

• The following are examples of possible declarations:

<!ELEMENT A (B*, C, D?)>
<!ELEMENT A (B | C+)>
<!ELEMENT A (#PCDATA)>
<!ELEMENT A EMPTY>
<!ELEMENT A ANY>
<!ELEMENT A (#PCDATA | B | C)*>
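A content model such as (B*, C, D?) is a regular expression over the names of the child elements. The sketch below is not a DTD validator. It hand-translates that one declaration into a Python regular expression and checks the children of an element against it. The function name and the translation are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical hand translation of <!ELEMENT A (B*, C, D?)>.
# Each child name is followed by one space so names cannot run together.
CONTENT_MODEL_A = re.compile(r"(B )*C (D )?")


def children_match(xml_text, model):
    """Return True if the child element names of the root fit the model."""
    root = ET.fromstring(xml_text)
    names = "".join(child.tag + " " for child in root)
    return model.fullmatch(names) is not None


ok = children_match("<A><B/><B/><C/></A>", CONTENT_MODEL_A)
too_many_d = children_match("<A><C/><D/><D/></A>", CONTENT_MODEL_A)
missing_c = children_match("<A><B/></A>", CONTENT_MODEL_A)
print(ok, too_many_d, missing_c)
```

A validating parser performs this kind of check for every element, using the declarations in the DTD rather than a hand-written pattern.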


Attribute-List Declarations

It is the list of allowed attributes for each element.

For each attribute: name, type, and other information.

Attribute types. Three groups:
  string types (CDATA)
  tokenized types (ID, IDREF, IDREFS, ...)
  enumerated types (like the ones in Pascal)


Attribute-List Declarations

• <!ELEMENT A (#PCDATA)>

<!ATTLIST A a CDATA #IMPLIED>
<!ATTLIST A a CDATA #IMPLIED b CDATA #REQUIRED>
<!ATTLIST A a CDATA "aaa">
<!ATTLIST A a CDATA #FIXED "aaa">
<!ATTLIST A a (aaa|bbb) "aaa">
<!ATTLIST A id ID #REQUIRED>
<!ATTLIST A ref IDREF #IMPLIED>

A default value is written either on its own or after #FIXED; it cannot be combined with #IMPLIED or #REQUIRED.


A DTD

<!DOCTYPE Orders [
<!ELEMENT Orders (SalesOrder)+>
<!ELEMENT SalesOrder (Customer,OrderDate,Line*)>
<!ELEMENT Customer (CustName,Street,City,State,PostCode,tel*)>
<!ELEMENT CustName (#PCDATA)>
<!ELEMENT Street (#PCDATA)>
<!ELEMENT City (#PCDATA)>
<!ELEMENT State (#PCDATA)>
<!ELEMENT PostCode (#PCDATA)>
<!ELEMENT tel (#PCDATA)>
<!ELEMENT OrderDate (#PCDATA)>
<!ELEMENT Line (Part,Quantity)>
<!ELEMENT Part (Description,Price)>
<!ELEMENT Quantity (#PCDATA)>
<!ELEMENT Description (#PCDATA)>
<!ELEMENT Price (#PCDATA)>
<!ATTLIST SalesOrder SONumber CDATA #REQUIRED>
<!ATTLIST Customer CustNumber CDATA #REQUIRED>
<!ATTLIST Line LineNumber CDATA #IMPLIED>
<!ATTLIST Part PartNumber CDATA #REQUIRED>
]>


A DTD

<!DOCTYPE books [
<!ELEMENT books (book)+>
<!ELEMENT book (entry, author+, bookRef, articleLink*)>
<!ELEMENT entry (title, publisher)>
<!ELEMENT bookRef EMPTY>
<!ELEMENT articleLink EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ATTLIST entry isbn ID #REQUIRED>
<!ATTLIST bookRef to IDREFS #IMPLIED>
<!ATTLIST articleLink
  xmlns:xlink CDATA #FIXED "http://w3c.org/xlink"
  xlink:type  CDATA #FIXED "simple"
  xlink:href  CDATA #REQUIRED>
]>


Well-formed and Valid Docs

A document is well-formed if it follows the grammar rules provided by W3C.

A document is valid if it conforms to a DTD which specifies the allowed structure of the document
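The well-formedness half of this distinction can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree, which is a non-validating parser: it reports syntax errors but does not check a document against a DTD. The helper name is_well_formed is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


properly_nested = is_well_formed("<address><city>Morristown</city></address>")
unclosed_tag = is_well_formed("<address><city>Morristown</address>")
two_roots = is_well_formed("<person/><webSite/>")
print(properly_nested, unclosed_tag, two_roots)
```

Checking validity additionally needs a validating parser and the DTD itself, which this sketch does not attempt.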


Uses of XML Entities

Physical partition: size, reuse, "modularity", … (both XML docs & DTDs)

Non-XML data: unparsed entities (binary data)

Non-standard characters: character entities

Shorthand for phrases & markup: entities are effectively macros


Types of Entities

Internal (to a doc) vs. External ( use URI)

General (in XML doc) vs. Parameter (in DTD)

Parsed (XML) vs. Unparsed (non-XML)


Entities & Physical Structure

A logical element can be split into multiple physical entities

Mylife.xml: DTD... <mylife> [Chap1.xml] [Chap2.xml] </mylife>

Chap1.xml: <teen>yada yada</teen>

Chap2.xml: <adult>blah blah..</adult>


External Text Entities

External text entity declaration (in the DTD; the system identifier is a URL):

<!ENTITY chap1 SYSTEM "http://...chap1.xml">

Entity reference (in the XML):

<mylife> &chap1; &chap2;</mylife>

Logically equivalent to inlining the file contents:

<mylife> <teen>yada yada</teen><adult> blah blah</adult> </mylife>


Internal Text Entities

Internal text entity declaration (in the DTD):

<!ENTITY WWW "World Wide Web">

Entity reference (in the XML):

<p>We all use the &WWW;.</p>

Logically equivalent to the replacement text actually appearing:

<p>We all use the World Wide Web.</p>
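A parser expands internal text entities while reading the document, so the application only sees the replacement text. The following sketch uses Python's standard xml.etree.ElementTree with the declaration placed in an internal DTD subset. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal DTD subset and referenced in the body.
doc = """<!DOCTYPE p [
  <!ENTITY WWW "World Wide Web">
]>
<p>We all use the &WWW;.</p>"""

p = ET.fromstring(doc)

# The reference &WWW; has already been replaced by the parser.
expanded = p.text
print(expanded)
```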


Unparsed (& "Binary") Entities

NOTATION declaration (helper application), in the DTD:

<!NOTATION ps SYSTEM "ghostview.exe">

Declare an external and unparsed entity, in the DTD:

<!ENTITY fusion SYSTEM "http://... fusion.ps" NDATA ps>

Declare the attribute type to be ENTITY, in the DTD:

<!ATTLIST fullPaper source ENTITY #REQUIRED>

Element with an ENTITY attribute, in the XML:

<fullPaper source="fusion"/>


Processing XML

Non-validating parser: checks that XML doc is syntactically well-formed

Validating parser: checks that XML doc is also valid w.r.t. a given DTD

Parsing yields tree/object representation: Document Object Model (DOM) API

Or a stream of events (open/close tag, data): Simple API for XML (SAX)


API for handling XML Documents

DOM, SAX


DOM Structure Model and API

hierarchy of Node objects: document, element, attribute, text, comment, ...

language-independent programming DOM API:
  get... first/last child, prev/next sibling, childNodes
  insertBefore, replace
  getElementsByTagName
  ...

alternative event-based SAX API (Simple API for XML):
  does not build a parse tree (reports events when encountering begin/end tags)
  for (partially) parsing very large documents
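The DOM operations listed above can be tried with Python's standard xml.dom.minidom, one of several DOM implementations. This is a small sketch on the address example; the added <state> element and its value are invented purely to demonstrate insertBefore.

```python
from xml.dom.minidom import parseString

dom = parseString(
    "<address><street>33, Terry Dr.</street><city>Morristown</city></address>"
)
address = dom.documentElement

# Navigation: first/last child and next sibling.
first = address.firstChild            # the <street> element
last = address.lastChild              # the <city> element
same_node = first.nextSibling is last

# Lookup by tag name anywhere in the document.
cities = dom.getElementsByTagName("city")
city_text = cities[0].firstChild.data

# Manipulation: create a new element and insert it before <city>.
state = dom.createElement("state")    # hypothetical element, for illustration
state.appendChild(dom.createTextNode("NJ"))
address.insertBefore(state, last)

child_names = [node.nodeName for node in address.childNodes]
print(same_node, city_text, child_names)
```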


DOM Summary

Object-Oriented approach to traverse the XML node tree

Automatic processing of XML docs

Operations for manipulating XML tree

Manipulation & Updating of XML on client & server

Database interoperability mechanism


SAX Event-Based API

Pros:
  The whole file doesn’t need to be loaded into memory
  XML stream processing
  Simple and fast
  Allows you to ignore less interesting data

Cons:
  limited expressive power (query/update) when working on streams
  => application needs to build (some) parse tree when necessary
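The event-based style can be sketched with Python's standard xml.sax module. The handler below keeps only the author names and ignores everything else, so it holds just the state the application needs, as the cons above suggest. The class name AuthorCollector and the sample input are assumptions made for illustration.

```python
import xml.sax


class AuthorCollector(xml.sax.ContentHandler):
    """Collects the text of every <author> element; other data is ignored."""

    def __init__(self):
        super().__init__()
        self.authors = []
        self._buffer = None   # not None only while inside an <author>

    def startElement(self, name, attrs):
        if name == "author":
            self._buffer = []

    def characters(self, content):
        # Character data may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "author":
            self.authors.append("".join(self._buffer).strip())
            self._buffer = None


handler = AuthorCollector()
xml.sax.parseString(
    b"<books><book><title>Data on the Web</title>"
    b"<author> Serge Abiteboul</author><author> Peter Buneman</author>"
    b"</book></books>",
    handler,
)
print(handler.authors)
```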


Related XML technologies

Namespace, Xlink, Xpath, XSL


Namespace


Namespace

Namespaces make it possible to declare a set of names whose meaning is unambiguous, i.e. everyone agrees on their meaning.

In other words, namespaces make it possible to distinguish two elements that have the same name but different meanings

An element or attribute name is a combination of 2 parts: prefix:name


Example: Namespace

Without namespaces:

<person>
  <name>Rosalie Panelli</name>
  <address>33 Terry Dr.</address>
</person>
<webSite>
  <name>XML Italia</name>
  <address>http://www.xml.it</address>
</webSite>

With namespaces:

<person xmlns:person="http://namespaces.xml.it/person">
  <person:name>Rosalie Panelli</person:name>
  <person:address>33 Terry Dr.</person:address>
</person>
<webSite xmlns:webSite="http://namespaces.xml.it/webSite">
  <webSite:name>XML Italia</webSite:name>
  <webSite:address>http://www.xml.it</webSite:address>
</webSite>
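To see how a namespace-aware parser tells the two name elements apart, the sketch below parses the example with Python's standard xml.etree.ElementTree. The <directory> wrapper element and the short prefixes p and w are assumptions added so that the fragment has a single root and can be queried; the namespace URIs are the ones from the example.

```python
import xml.etree.ElementTree as ET

doc = """
<directory>
  <person xmlns:person="http://namespaces.xml.it/person">
    <person:name>Rosalie Panelli</person:name>
    <person:address>33 Terry Dr.</person:address>
  </person>
  <webSite xmlns:webSite="http://namespaces.xml.it/webSite">
    <webSite:name>XML Italia</webSite:name>
    <webSite:address>http://www.xml.it</webSite:address>
  </webSite>
</directory>
"""
root = ET.fromstring(doc)

# Prefixes used in queries are local to the application; only the URI matters.
ns = {
    "p": "http://namespaces.xml.it/person",
    "w": "http://namespaces.xml.it/webSite",
}
person_name = root.findtext("person/p:name", namespaces=ns)
site_name = root.findtext("webSite/w:name", namespaces=ns)

# Internally each name is the pair (namespace URI, local name).
expanded_tag = root.find("person")[0].tag
print(person_name, site_name, expanded_tag)
```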


Namespace declaration

At present, to validate a document that uses a namespace against a DTD, the prefixes must be included in the element declarations:

<!ELEMENT person (person:name, person:address)>
<!ATTLIST person xmlns:person CDATA #FIXED "http://namespaces.xml.it/person">
<!ELEMENT person:name (#PCDATA)>
<!ELEMENT person:address (#PCDATA)>

Note: the address http://namespaces.xml.it/person might be dangling, i.e. it need not point to any existing resource


Xlink


Xlink

Only internal links can be represented by means of ID/IDREF(S) attributes

XLink is a language for defining links among documents (external links) through XLink elements

Unidirectional links can be defined as in HTML, but so can other kinds of links

Based on a namespace defined by the W3C for this purpose


The Xlink Namespace

It consists of the following attributes: type, href, role, title, show, actuate, from, to

By means of these attributes it is possible to describe the different kinds of links


Optional and Required Attributes

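(R = required, O = optional, - = not used)

Attribute   SIMPLE  EXTENDED  RESOURCE  LOCATOR  ARC  TITLE
TYPE        R       R         R         R        R    R
HREF        O       -         -         R        -    -
ROLE        O       O         O         O        O    -
TITLE       O       O         O         O        O    -
SHOW        O       -         -         -        O    -
ACTUATE     O       -         -         -        O    -
FROM        -       -         -         -        O    -
TO          -       -         -         -        O    -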


Description of attributes

Type: the type of the XLink element

Href: the URI of the linked resource

Role: a description of the link, used by the application

Title: a description of the link, used by the human user

Show: specifies the behavior of the application when the link is traversed

Actuate: specifies when the behavior selected by the show attribute must be triggered

From/To: specify the role attributes of the sources and of the targets in an extended link


A simple link

<dsi xmlns:xlink="http://www.w3c.org/"
     xlink:type="simple"
     xlink:href="http://www.dsi.unimi.it"
     xlink:show="new"
     xlink:actuate="onRequest"
     xlink:role="DSI"
     xlink:title="Dipartimento di Scienze dell’Info..">
  Dipartimento di Scienze dell’Informazione
</dsi>


Actuate and Show

By means of these attributes it is possible to specify when a particular link should be traversed, and how the application should behave when the link is actually traversed


The values of show

new: when the link is traversed, the resource is loaded in a new page

replace: when the link is traversed, the target resource replaces the current page

embed: the resource is included in the current document

undefined: the application is free to apply the behavior it likes


The values of actuate

onLoad: the link is traversed when the document is loaded into the application

onRequest: the link is traversed when the user explicitly requests it

undefined: the application traverses the link as it likes


The type of links

Simple: a link is simple when it connects two resources (one local, the other remote)

Extended: a link is extended when it connects several resources (both local and remote)


The simple links

They are similar to those of HTML:

<html>
...
<a href="http://www.dsi.unimi.it/Index.html"> Computer Science Dept </a>
...
</html>


Example: Xlink declaration

<!ELEMENT a (#PCDATA)>
<!ATTLIST a
  xmlns:xlink   CDATA         #FIXED "http://www.w3c.org"
  xlink:type    (simple)      #FIXED "simple"
  xlink:href    CDATA         #REQUIRED
  xlink:show    (new|replace) #FIXED "replace"
  xlink:actuate (onRequest)   #FIXED "onRequest"
  xlink:role    CDATA         #IMPLIED
  xlink:title   CDATA         #IMPLIED
>


Example: Xlink Instance

<xml>
...
<a xlink:href="http://www.disi.unige.it/Index.html"> Computer Science Dept </a>
...
</xml>


Xpath


Xpath

It is a language for writing path expressions over the hierarchical structure of XML documents

Based on the concept of context node. A context node is the node from which an expression is evaluated

Examples:
  /books/book/author
  /child::books/child::book/child::author


Xpath: Location Path

The central construct is the location path, which is a sequence of location steps separated by /.

A location step is evaluated with respect to some context, resulting in a set of nodes.

A location path is evaluated compositionally, left-to-right, starting with some initial context.

Each node resulting from the evaluation of one step is used as context for evaluation of the next, and the results are combined together.
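The compositional evaluation described above can be sketched in a few lines of Python. This is not an XPath engine: it handles only child steps with a name test, and the function name evaluate_child_path and the synthetic document node are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def evaluate_child_path(root, step_names):
    """Evaluate /name1/name2/... left to right using only the child axis."""
    # Stand-in for the document node, whose only child is the root element.
    document = ET.Element("#document")
    document.append(root)

    context = [document]                 # initial context
    for name in step_names:
        next_context = []
        for node in context:             # each result node becomes a context
            next_context.extend(child for child in node if child.tag == name)
        context = next_context           # results of the step are combined
    return context


books = ET.fromstring(
    "<books><book><author>A</author><author>B</author></book>"
    "<book><author>C</author></book></books>"
)
authors = evaluate_child_path(books, ["books", "book", "author"])
author_texts = [a.text for a in authors]
print(author_texts)
```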


Location steps

A location step has the form

axis :: node-test [ predicate ]

Example: child::section[position()<6]/descendant::cite/attribute::href

selects all href attributes in cite elements in the first 5 sections of an article.


Axes


Node tests

Testing by node type:
  text()                     character data nodes
  comment()                  comment nodes
  processing-instruction()   processing instruction nodes
  node()                     all nodes (not including attributes and namespace declarations)

Testing by node name:
  name   nodes with that name
  *      any node of the axis's principal node type (any element, on most axes)


Predicates

Predicates are expressions coerced to type boolean

A predicate filters a node-set by evaluating the predicate expression on each node in the set, with that node as the context node

Different predefined predicates are available

Predicates can be conjunctions, disjunctions, or negations of other predicates


Examples of Predicates

position() returns the position of the node being tested within the current node-set
  ex: author[position()=1], abbreviated author[1]

last() returns the size of the current node-set, so author[last()] selects the last author
  ex: author[last()]

id(value) returns the element whose ID is equal to "value"
  ex: entry[id("1-55860-622-X")]

text() selects the text content of a node
  ex: author[text()="Serge Abiteboul"]


Abbreviations

Normal syntax                    Abbreviation
child::                          (nothing)
attribute::                      @
/descendant-or-self::node()/     //
self::node()                     .
parent::node()                   ..

Syntactic sugar: convenient notation for common situations


Examples

/books/book/author
//author
//author[1]
//book[entry/@isbn = "1-5586.." or author = "..."]
//book[author][last()>2]
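Some of these expressions can be tried with Python's standard xml.etree.ElementTree, which supports only a limited subset of XPath (no `or`, no functions other than last(), and paths relative to an element rather than absolute ones). The sketch below therefore rewrites the examples in that subset; the sample document is a shortened, assumed version of the books example.

```python
import xml.etree.ElementTree as ET

books = ET.fromstring("""
<books>
  <book>
    <entry isbn="1-55860-622-X"><title>Data on the Web</title></entry>
    <author>Serge Abiteboul</author>
    <author>Peter Buneman</author>
    <author>Dan Suciu</author>
  </book>
  <book>
    <entry isbn="0-201-53771-O"><title>Foundation of Databases</title></entry>
    <author>Serge Abiteboul</author>
  </book>
</books>
""")

# /books/book/author, written relative to the <books> root element.
all_authors = books.findall("./book/author")

# //author[1]: every author that is the first author child of its parent.
first_authors = books.findall(".//author[1]")

# Attribute test, as in entry/@isbn = "...".
by_isbn = books.findall(".//book/entry[@isbn='1-55860-622-X']")

# Child-content test, as in book[author = "..."].
with_serge = books.findall(".//book[author='Serge Abiteboul']")

print(len(all_authors), [a.text for a in first_authors],
      len(by_isbn), len(with_serge))
```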


XSL: eXtensible Stylesheet Language


Why Stylesheets?

XML is not a fixed tag set (like HTML) and has no (application) semantics

XML markup does not (usually) include formatting information

Reuse: the same content can look different in different contexts

Multiple output formats: different media (paper, online), different sizes (manuals, reports), different classes of output devices (palmtops, workstations, cellular phones)


Why Style sheets? (Cont.)

Standardized styles: corporate style sheets can be applied to the content at any time

Freedom from style issues for content authors: technical writers need not be concerned with layout issues because the correct style can be applied later

Therefore there must be something in addition to the XML document that provides information on how to present or otherwise process the XML


XSL

XSL stands for eXtensible Stylesheet Language

XSL is currently in preparation by the W3C

XSL is a style sheet language that addresses the previous issues


XSL - More than a Style Sheet

XSL consists of two parts:
  a method for transforming XML documents
  a method for formatting XML documents


Transforming XML documents

An input document can be transformed into another document by:
  generating constant text
  suppressing content
  moving text (e.g. exchanging the order of the first and last name)
  duplicating text (e.g. copying titles to make a table of contents)
  sorting
  more complex transformations that “compute” new information in terms of the existing information


Formatting XML documents

A description of how to present the transformed information (i.e. specification of what properties to associate to each of the various parts of the transformed info)

This includes:
  Specification of the general screen or page layout
  Assignment of the transformed content into basic “content container types” (e.g. lists, paragraphs, inline text)
  Specification of formatting properties (spacing, margins, alignment, fonts, etc.) for each resulting “container”


Components of XSL

The full XSL language logically consists of 3 component languages, which are described in 3 W3C Recommendations:
  XPath
  XSLT: XSL Transformations, a language for describing how to transform one XML document into another
  XSL: Extensible Stylesheet Language, i.e. XSLT plus a description of a set of Formatting Objects and Formatting Properties


XSLT Processing Model

(Diagram: XML source tree + XSLT stylesheet => Transformation => result tree (XML, HTML, text…))


XSLT Elements

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  root element of an XSLT stylesheet "program"

<xsl:template match=pattern name=qname priority=number mode=qname> ...template... </xsl:template>
  declares a rule: (pattern => template)

<xsl:apply-templates select=node-set-expression mode=qname>
  applies templates to the selected children (default = all); the mode attribute is optional

<xsl:call-template name=qname>


XSLT Processing Model

XSL stylesheet: a collection of template rules
template rule: (pattern => template)
main steps:
  match the pattern against the source tree (patterns are expressed in XPath)
  instantiate the template (replace the current node “.” by the template in the result tree)
  select further nodes for processing (expressed in XPath)
control can be a mix of:
  recursive processing (<xsl:apply-templates> ...)
  program-driven processing (<xsl:for-each> ...)


Template Rule: Example

<xsl:template match="product">
  <table> <xsl:apply-templates select="sales/domestic"/> </table>
  <table> <xsl:apply-templates select="sales/foreign"/> </table>
</xsl:template>

The match attribute holds the pattern; the content of the element is the template.

(i) match pattern: process <product> elements
(ii) instantiate template: replace each <product> element with two HTML tables
(iii) select the <product> grandchildren (“sales/domestic”, “sales/foreign”) for further processing
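To make step (iii) concrete, the relative paths from the select attributes can be tried with Python's standard xml.etree.ElementTree, whose findall accepts a limited XPath subset. This is only a sketch: the content of the <product> element below is invented for the example, and only the element names come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical source fragment; only the element names come from the slide.
product = ET.fromstring(
    "<product>"
    "<sales><domestic>10</domestic><foreign>7</foreign></sales>"
    "<sales><domestic>3</domestic></sales>"
    "</product>"
)

# A relative path is evaluated from the current node, here <product>,
# so "sales/domestic" reaches the grandchildren of <product>.
domestic = product.findall("sales/domestic")
foreign = product.findall("sales/foreign")

print([e.text for e in domestic])
print([e.text for e in foreign])
```

In a real stylesheet the selected nodes would then be matched against the template rules again, which is what makes the processing recursive.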


An XML document

<?xml version="1.0"?>
<books>
  <book category="reference">
    <author>Nigel Rees</author>
    <title>Sayings of the Century</title>
    <price>8.95</price>
  </book>
  <book category="fiction">
    <author>Evelyn Waugh</author>
    <title>Sword of Honour</title>
    <price>12.99</price>
  </book>
  <book category="fiction">
    <author>Herman Melville</author>
    <title>Moby Dick</title>
    <price>8.99</price>
  </book>
  <book category="fiction">
    <author>J. R. R. Tolkien</author>
    <title>The Lord of the Rings</title>
    <price>22.99</price>
  </book>
</books>


An XSL stylesheet

<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="books">
    <html><body>
      <h1>A list of books</h1>
      <table width="640"><xsl:apply-templates/></table>
    </body></html>
  </xsl:template>
  <xsl:template match="book">
    <tr><td><xsl:number/></td><xsl:apply-templates/></tr>
  </xsl:template>
  <xsl:template match="author | title | price">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
</xsl:stylesheet>


The result (textual version)

<html>
  <body>
    <h1>A list of books</h1>
    <table width="640">
      <tr> <td>1</td> <td>Nigel Rees</td> <td>Sayings of the Century</td> <td>8.95</td> </tr>
      <tr> <td>2</td> <td>Evelyn Waugh</td> <td>Sword of Honour</td> <td>12.99</td> </tr>
      …
    </table>
  </body>
</html>
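The way the three template rules cooperate can be imitated in plain Python. The sketch below is not an XSLT processor: the function name apply_templates is borrowed from the stylesheet for readability, the dispatch is hard-coded on element names, only the <table> part of the output is built, and the source is shortened to the first two books.

```python
import xml.etree.ElementTree as ET

SOURCE = """<books>
  <book category="reference"><author>Nigel Rees</author><title>Sayings of the Century</title><price>8.95</price></book>
  <book category="fiction"><author>Evelyn Waugh</author><title>Sword of Honour</title><price>12.99</price></book>
</books>"""


def apply_templates(node, position=1):
    """Dispatch on the element name, one branch per template rule."""
    if node.tag == "books":
        table = ET.Element("table", width="640")
        for i, child in enumerate(node, start=1):
            table.append(apply_templates(child, i))
        return table
    if node.tag == "book":
        row = ET.Element("tr")
        # The position among siblings stands in for <xsl:number/>.
        ET.SubElement(row, "td").text = str(position)
        for child in node:
            row.append(apply_templates(child))
        return row
    # author | title | price: copy the text value into a cell.
    cell = ET.Element("td")
    cell.text = node.text
    return cell


table = apply_templates(ET.fromstring(SOURCE))
rows = [[td.text for td in tr] for tr in table]
print(rows)
```

Each branch returns a piece of the result tree and recurses into the children, which mirrors the recursive style of control driven by <xsl:apply-templates/>.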


The result (shown in a browser)


Query Languages for XML


Querying XML

Different XML QL paradigms depending on the community:
  (relational, OO, semistructured) database perspective: Lorel, YaTL, XML-QL, XMAS, FLORA/FLORID, ...
  document processing perspective: XQL, XSL(T), XPath, ...
  functional programming perspective: QLs with structural recursion, …

Patching desirable features together: Quilt


Important QL Features (DB Perspective)

typical parts of a query:
  (match) pattern (selects parts of the source XML tree without looking at data)
  filter condition (selects further, now looking at the data)
  answer construction (putting the results together, possibly reordered, grouped, etc.)

reordering based on nested queries, grouping, sorting

tag variables, path expressions for defining the patterns without requiring knowledge of the DTD


Querying XML

No "official" W3C XML QL yet (but bits and pieces)
numerous quite different XML QLs are popping up
some XML QL overviews, comparisons, and resources:

  XML Query Languages: Experiences and Exemplars (co-authored by several XML QL gurus)
    http://www-db.research.bell-labs.com/user/simeon/xquery.html
  XML and Query Languages (Oasis Cover Pages)
    http://www.oasis-open.org/cover/xmlQuery.html
  Comparative Analysis of Five XML Query Languages (A. Bonifati, S. Ceri)
    http://www.acm.org/sigmod/record/issues/0003/bonifati.pdf.gz
  A Data Model and Algebra for XML Query (Philip Wadler et al., “functional (Haskell) perspective”)
    http://cm.bell-labs.com/cm/cs/who/wadler/topics/xml.html
  XML-QL vs XSLT queries (Geert Jan Bex and Frank Neven)
    http://alpha.luc.ac.be/~gjb/xml-ql2xslt.html

children of the “(semistructured) database(s) crowd”: XML-QL, YaTL, Lorel, …

… from the “functional crowd”:

… from the “document processing folks”: XQL, XSL(T), XPath, ...

XPath: W3C Recommendation
  Powerful pattern language for selecting parts of XML docs
  Used by XSL(T), XPointer, and XQL

XQL: based on XPath
  Browser: IE5
  XML DBs: Excelon, Tamino, Perl, …


Quilt

Source: Daniela Florescu, Jérôme Siméon, VLDB 2000

Goals:
  Put together the most effective features of several existing and proposed query languages
  Design a small, clean, implementable language

Antecedents:
  XPath and XQL
  XML-QL by A. Deutsch, M. Fernandez, D. Florescu, A. Levy, D. Suciu
  SQL and OQL


Antecedents: XPath and XQL

Closely related languages for navigating in a hierarchy

A path expression is a series of steps
Each step moves along an axis (children, ancestors, attributes, etc.) and may apply a predicate

XQL has some additional operators: BEFORE, AFTER
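A small sketch of steps, axes and predicates, using the limited XPath subset that Python's standard xml.etree.ElementTree supports (it is not a full XPath implementation, and it has no counterpart for the XQL operators BEFORE and AFTER). The sample data reuses element names from the book list shown earlier; the selection of books is arbitrary.

```python
import xml.etree.ElementTree as ET

books = ET.fromstring(
    '<books>'
    '<book category="reference"><title>Sayings of the Century</title><price>8.95</price></book>'
    '<book category="fiction"><title>Moby Dick</title><price>8.99</price></book>'
    '<book category="fiction"><title>Sword of Honour</title><price>12.99</price></book>'
    '</books>'
)

# Step 1: child axis to <book>, filtered by an attribute predicate.
# Step 2: child axis to <title>.
fiction_titles = [t.text for t in books.findall("book[@category='fiction']/title")]

# A predicate can also test the text of a child element.
priced_899 = [b.findtext("title") for b in books.findall("book[price='8.99']")]

# In this API the attribute axis corresponds to Element.get.
categories = [b.get("category") for b in books.findall("book")]

print(fiction_titles, priced_899, categories)
```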


Antecedents: XML-QL

The WHERE clause binds variables according to a pattern; the CONSTRUCT clause generates the output document

WHERE
  <part pno=$pno> $pname </> in "parts.xml",
  <supp sno=$sno> $sname </> in "supp.xml",
  <sp pno=$pno sno=$sno> $numofpart </> in "sp.xml"
CONSTRUCT
  <purchase>
    <partname> $pname </>
    <suppname> $sname </>
    <numofparts> $numofpart </>
  </purchase>
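The query is a three-way join on the shared variables $pno and $sno. A hedged Python sketch of the same idea follows; the three documents are inlined with invented sample values and invented root element names instead of being read from parts.xml, supp.xml and sp.xml, and the dictionary lookups assume that part and supplier numbers are unique.

```python
import xml.etree.ElementTree as ET

# Invented sample data; the root element names are assumptions.
parts = ET.fromstring('<parts><part pno="p1">bolt</part><part pno="p2">nut</part></parts>')
supps = ET.fromstring('<supps><supp sno="s1">Acme</supp></supps>')
sps = ET.fromstring('<sps><sp pno="p1" sno="s1">300</sp><sp pno="p2" sno="s1">50</sp></sps>')

# WHERE: bind $pname and $sname, keyed by the join variables.
pname = {p.get("pno"): p.text for p in parts.iter("part")}
sname = {s.get("sno"): s.text for s in supps.iter("supp")}

purchases = []
for sp in sps.iter("sp"):
    pno, sno = sp.get("pno"), sp.get("sno")
    if pno in pname and sno in sname:  # all three patterns agree on $pno and $sno
        # CONSTRUCT: one <purchase> per successful binding.
        purchase = ET.Element("purchase")
        ET.SubElement(purchase, "partname").text = pname[pno]
        ET.SubElement(purchase, "suppname").text = sname[sno]
        ET.SubElement(purchase, "numofparts").text = sp.text
        purchases.append(purchase)

result = [ET.tostring(p, encoding="unicode") for p in purchases]
print(result)
```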


Antecedents: SQL and OQL

SQL and OQL are database query languages
SQL derives a table from other tables by a stylized series of clauses: SELECT-FROM-WHERE

OQL is a functional language:
  a query is an expression
  Expressions can take several forms
  Expressions can be nested and combined
  SELECT-FROM-WHERE is one form of OQL expression


An example document: bib.xml

<bib>
  <book>
    <title>…</title>
    <author>…</author>
    ...
    <publisher>…</publisher>
    <year>…</year>
    <price>…</price>
  </book>
  ….
</bib>


Some examples of queries

“Find all the books published in 1998 by Jackson”

FOR $b IN document("bib.xml")//book
WHERE $b/year = "1998" AND $b/publisher = "Jackson"
RETURN $b SORTBY(author, title)

“Find the titles of books that have no authors”

<orphan_books>
  FOR $b IN document("bib.xml")//book
  WHERE empty($b/author)
  RETURN $b/title
</orphan_books>
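For comparison, both queries can be approximated with Python's standard xml.etree.ElementTree. This is a sketch under assumptions: the content of bib.xml is invented and inlined, and findtext only looks at the first <author> of a book, which is a simplification of the SORTBY.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for bib.xml.
bib = ET.fromstring("""<bib>
  <book><title>B</title><author>Zed</author><publisher>Jackson</publisher><year>1998</year></book>
  <book><title>A</title><author>Abe</author><publisher>Jackson</publisher><year>1998</year></book>
  <book><title>C</title><author>Abe</author><publisher>Other</publisher><year>1998</year></book>
  <book><title>D</title><publisher>Jackson</publisher><year>2000</year></book>
</bib>""")

# FOR $b IN //book  WHERE year = "1998" AND publisher = "Jackson"
selected = [
    b for b in bib.iter("book")
    if b.findtext("year") == "1998" and b.findtext("publisher") == "Jackson"
]
# SORTBY(author, title)
selected.sort(key=lambda b: (b.findtext("author", ""), b.findtext("title", "")))
titles_1998 = [b.findtext("title") for b in selected]

# WHERE empty($b/author)  RETURN $b/title, wrapped in <orphan_books>
orphan_books = ET.Element("orphan_books")
for b in bib.iter("book"):
    if b.find("author") is None:
        orphan_books.append(b.find("title"))

print(titles_1998, [t.text for t in orphan_books])
```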


Some examples of queries (cont.)

“Invert the hierarchy from publishers inside books to books inside publishers”

FOR $p IN distinct(//publisher)
RETURN
  <publisher>
    <name> $p/text() </name>,
    FOR $b IN //book[publisher = $p]
    RETURN <book> $b/title, $b/price </book> SORTBY(price DESCENDING)
  </publisher> SORTBY(name)
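The inversion amounts to grouping books by publisher. A minimal Python sketch, assuming invented sample data and assuming that prices should be compared numerically (the slide does not say how SORTBY compares values):

```python
import xml.etree.ElementTree as ET

# Invented sample data in the shape of bib.xml.
bib = ET.fromstring(
    "<bib>"
    "<book><title>T1</title><publisher>Beta</publisher><price>10.00</price></book>"
    "<book><title>T2</title><publisher>Alpha</publisher><price>5.00</price></book>"
    "<book><title>T3</title><publisher>Beta</publisher><price>30.00</price></book>"
    "</bib>"
)

# distinct(//publisher), with the outer SORTBY(name) applied up front.
names = sorted({p.text for p in bib.iter("publisher")})

inverted = []
for name in names:
    pub = ET.Element("publisher")
    ET.SubElement(pub, "name").text = name
    # //book[publisher = $p], then SORTBY(price DESCENDING)
    books = [b for b in bib.iter("book") if b.findtext("publisher") == name]
    books.sort(key=lambda b: float(b.findtext("price")), reverse=True)
    for b in books:
        book = ET.SubElement(pub, "book")
        ET.SubElement(book, "title").text = b.findtext("title")
        ET.SubElement(book, "price").text = b.findtext("price")
    inverted.append(pub)

summary = [
    (p.findtext("name"), [b.findtext("title") for b in p.findall("book")])
    for p in inverted
]
print(summary)
```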


Some examples of queries (cont.)

“Find the description and average price of each red part that has at least 10 orders”

FOR $p IN document("parts.xml")//part[color = "Red"]
LET $o := document("orders.xml")//order[partno = $p/partno]
WHERE count($o) >= 10
RETURN
  <important_red_part>
    $p/description,
    <avg_price> avg($o/price) </avg_price>
  </important_red_part>


Quilt data model

A document in Quilt is a tree
In order to model parts of a document and collections of documents, Quilt supports the concept of a forest of trees

[Figure: two node diagrams, one labeled “a document” and one labeled “a forest”; E denotes an element, T denotes a text node, A denotes an attribute]


Quilt Expressions

Like OQL, Quilt is a functional language (a query is an expression, and expressions can be composed)

Some types of Quilt expressions:
  A path expression (using XPath syntax):
    document("bids.xml")//bid[itemno = "47"]/bid_amount
  An expression using operators and functions:
    ($x + $y) * foo($z)
  An element constructor:
    <bid>
      <userid> $u </userid>
      <bid_amount> $a </bid_amount>
    </bid>
  A “FLWR” expression


Expression results

The result of evaluating a Quilt expression can be:
  an XML document (i.e. a tree)
  a set of elements (i.e. a forest)
  a primitive value (e.g. a string)


A FLWR Expression

A FLWR expression binds some variables, applies a predicate, and constructs a new result

FOR … LET … WHERE … RETURN …

[Figure: syntax diagram built from the FOR clause, LET clause, WHERE clause and RETURN clause]


FOR clause

FOR is used for iterating over one or more collections

Each expression evaluates to a collection of nodes

The FOR clause produces many binding-tuples from the Cartesian product of these collections

In each tuple, the value of each variable is one node and its descendants

The order of the tuples preserves document order unless some expression contains a non-order-preserving function such as distinct()

Syntax: FOR var IN expr (further "var IN expr" parts may follow, separated by ",")
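The binding-tuples of a FOR clause over two collections can be pictured with itertools.product. This sketch assumes two independent expressions and invented sample data; it does not cover a later expression that refers to an earlier variable.

```python
from itertools import product
import xml.etree.ElementTree as ET

# Invented sample data.
doc = ET.fromstring(
    "<db>"
    "<part>bolt</part><part>nut</part>"
    "<supp>Acme</supp><supp>Best</supp><supp>Core</supp>"
    "</db>"
)

# FOR $p IN //part, $s IN //supp
part_nodes = doc.findall(".//part")  # each list is in document order
supp_nodes = doc.findall(".//supp")

# One binding-tuple per element of the Cartesian product;
# in each tuple every variable is bound to exactly one node.
binding_tuples = [{"p": p, "s": s} for p, s in product(part_nodes, supp_nodes)]
pairs = [(t["p"].text, t["s"].text) for t in binding_tuples]
print(pairs)
```

With two parts and three suppliers this yields six tuples, and the first variable changes slowest, so document order is kept within each collection.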


LET clause

LET is used for binding variables (without iteration)

A LET clause produces one binding for each variable (therefore it does not affect the number of binding-tuples)

The variable is bound to the value of the expression, which may contain many nodes (i.e. the set of nodes returned by evaluating the expression is linked into a forest of nodes)

document order is preserved among the nodes in each bound collection, unless some expression contains a non-order-preserving function such as distinct()

Syntax: LET var := expr (further "var := expr" parts may follow, separated by ",")


WHERE clause

Applies a predicate to the tuples of bound variables
Retains only tuples that satisfy the predicate
Preserves order of tuples, if any
May contain AND, OR, NOT
Applies scalar conditions to scalar variables:
  $color = "Red"
Applies set conditions to variables bound to sets:
  avg($emp/salary) > 1000

WHERE boolean-expression


RETURN clause

Constructs the result of the FLWR expression
Executed once for each tuple of bound variables
Preserves the order of tuples, if any, or can impose a new order using a SORTBY clause
Often uses an element constructor:
  <item>
    $item/itemno,
    <avg_bid> avg($b/bid_amount) </avg_bid>
  </item> SORTBY(itemno)

RETURN expr
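Putting the four clauses together, a FLWR expression behaves like a small pipeline: bind, bind, filter, construct. The Python sketch below follows that shape for the average-bid example; the function name flwr, the min_bids parameter and the sample bids are all invented for illustration, and the sort on itemno is applied numerically up front as an assumption.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Invented stand-in for a document of bids.
bids = ET.fromstring(
    "<bids>"
    "<bid><itemno>47</itemno><bid_amount>10</bid_amount></bid>"
    "<bid><itemno>12</itemno><bid_amount>40</bid_amount></bid>"
    "<bid><itemno>47</itemno><bid_amount>30</bid_amount></bid>"
    "</bids>"
)


def flwr(min_bids=1):
    items = []
    # FOR: one binding-tuple per distinct item number (sorted here, as SORTBY would).
    for itemno in sorted({e.text for e in bids.iter("itemno")}, key=int):
        # LET: bind the whole collection of matching bids, without iterating over it.
        b = [x for x in bids.iter("bid") if x.findtext("itemno") == itemno]
        # WHERE: keep the tuple only if the predicate holds.
        if len(b) < min_bids:
            continue
        # RETURN: build one <item> element per surviving tuple.
        item = ET.Element("item")
        ET.SubElement(item, "itemno").text = itemno
        avg = mean(float(x.findtext("bid_amount")) for x in b)
        ET.SubElement(item, "avg_bid").text = str(avg)
        items.append(item)
    return items


result = [(i.findtext("itemno"), i.findtext("avg_bid")) for i in flwr()]
print(result)
```

Calling flwr(min_bids=2) keeps only item 47, which shows that the WHERE step reduces the number of tuples while LET does not change it.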