XML Techno

8/8/2019 XML Techno

http://slidepdf.com/reader/full/xml-techno 1/40

A Document Type Definition (DTD) defines the

legal building blocks of an XML document. It

defines the document structure with a list of

legal elements and attributes.

A DTD can be declared inline inside an XML

document, or as an external reference.

8/8/2019 XML Techno


Internal DTD Declaration

If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPEdefinition with the following syntax:

<!DOCTYPE root-element [element-declarations]> Example XML document with aninternal DTD:

<?xml version="1.0"?><!DOCTYPE note [<!ELEMENT note (to,from,heading,body)><!ELEMENT to (#PCDATA)><!ELEMENT from (#PCDATA)><!ELEMENT heading (#PCDATA)><!ELEMENT body (#PCDATA)>]>

<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend</body></note>

8/8/2019 XML Techno


The DTD above is interpreted like this:

!DOCTYPE note defines that the root element of thisdocument is note

!ELEMENT note defines that the note element contains

four elements: "to,from,heading,body" !ELEMENT to defines the to element to be of type

"#PCDATA"

!ELEMENT from defines the from element to be of type"#PCDATA"

!ELEMENT heading defines the heading element to be of type "#PCDATA"

!ELEMENT body defines the body element to be of type"#PCDATA"

8/8/2019 XML Techno


External DTD Declaration

If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with thefollowing syntax:

<!DOCTYPE root-element SYSTEM "filename">

This is the same XML document as in previous case, but with an external DTD (Open it

, and selectview source):

<?xml version="1.0"?><!DOCTYPE note SYSTEM "note.dtd"><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body>

</note> And this is the file "note.dtd" which contains the DTD:

<!ELEMENT note (to,from,heading,body)>

<!ELEMENT to (#PCDATA)><!ELEMENT from (#PCDATA)><!ELEMENT heading (#PCDATA)><!ELEMENT body (#PCDATA)>

8/8/2019 XML Techno


Why Use a DTD?

With a DTD, each of your XML files can carry adescription of its own format.

With a DTD, independent groups of people canagree to use a standard DTD for interchangingdata.

Your application can use a standard DTD to verify

that the data you receive from the outside worldis valid.

You can also use a DTD to verify your own data.

8/8/2019 XML Techno


The Building Blocks of XML Documents

Seen from a DTD point of view, all XMLdocuments (and HTML documents) are made up

by the following building blocks: Elements

Attributes

Entities PCDATA

CDATA

8/8/2019 XML Techno


Elements

Elements are the main building blocks of both XML and HTML documents.

Examples of HTML elements are "body" and "table". Examples of XML elementscould be "note" and "message". Elements can contain text, other elements, or beempty. Examples of empty HTML elements are "hr", "br" and "img".

Examples: <body>some text</body>

<message>some text</message>Attributes

Attributes provide extra information about elements.

Attributes are always placed inside the opening tag of an element. Attributes

always come in name/value pairs. The following "img" element has additionalinformation about a source file:

<img src="computer.gif" /> The name of the element is "img". The name of theattribute is "src". The value of the attribute is "computer.gif". Since the elementitself is empty it is closed by a " /".

8/8/2019 XML Techno


Entities

Some characters have a special meaning in XML, like the less than sign (<) that defines the start of an XML tag.

Most of you know the HTML entity: " ". This "no-breaking-space" entity is used in HTML toinsert an extra space in a document. Entities are expanded when a document is parsed by an XMLparser.

The following entities are predefined in XML:

Entity References Character < < > > & & " " ' ' PCDATA

PCDATA means parsed character data.

Think of character data as the text found between the start tag and the end tag of an XML element.

PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser forentities and markup.

Tags inside the text will be treated as markup and entities will be expanded.

However, parsed character data should not contain any &, <, or > characters; these need to be

represented by the & < and > entities, respectively. CDATA

CDATA means character data.

CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated asmarkup and entities will not be expanded.

8/8/2019 XML Techno


DTD - Elements

« Previous Next Chapter » In a DTD, elements are declared with an ELEMENT declaration.

Declaring Elements

In a DTD, XML elements are declared with an element declaration with the following syntax:

<!ELEMENT element-name category>or<!ELEMENT element-name (element-content)>Empty Elements

Empty elements are declared with the category keyword EMPTY:

<!ELEMENT element-name EMPTY>

Example:

<!ELEMENT br EMPTY>

XML example:

<br />

8/8/2019 XML Techno


Elements with Parsed Character Data

Elements with only parsed character data are declared with #PCDATAinside parentheses:

<!ELEMENT element-name (#PCDATA)>

Example:

<!ELEMENT from (#PCDATA)>Elements with any Contents

Elements declared with the category keyword ANY, can contain anycombination of parsable data:

<!ELEMENT element-name ANY>Example:

<!ELEMENT note ANY>

8/8/2019 XML Techno


Elements with Children (sequences)

Elements with one or more children are declared with the name of the children elements insideparentheses:

<!ELEMENT element-name (child1)>or<!ELEMENT element-name (child1,child2,...)>

Example:

<!ELEMENT note (to,from,heading,body)>When children are declared in a sequence separated bycommas, the children must appear in the same sequence in the document. In a full declaration, thechildren must also be declared, and the children can also have children. The full declaration of the"note" element is:

<!ELEMENT note (to,from,heading,body)><!ELEMENT to (#PCDATA)><!ELEMENT from (#PCDATA)>

<!ELEMENT heading (#PCDATA)><!ELEMENT body (#PCDATA)>Declaring Only One Occurrence of an Element

<!ELEMENT element-name (child-name)>

8/8/2019 XML Techno


Example:

<!ELEMENT note (message)> The example above declares that the child element"message" must occur once, and only once inside the "note" element.

Declaring Minimum One Occurrence of an Element

<!ELEMENT element-name (child-name+)>

Example:

<!ELEMENT note (message+)> The + sign in the example above declares that thechild element "message" must occur one or more times inside the "note" element.

Declaring Zero or More Occurrences of an Element

<!ELEMENT element-name (child-name*)>

Example:

<!ELEMENT note (message*)> The * sign in the example above declares that thechild element "message" can occur zero or more times inside the "note" element.

8/8/2019 XML Techno


Declaring Zero or One Occurrences of an Element

<!ELEMENT element-name (child-name?)>

Example:

<!ELEMENT note (message?)> The ? sign in the example above declares that the child element"message" can occur zero or one time inside the "note" element.

Declaring either/or Content Example:

<!ELEMENT note (to,from,header,(message|body))> The example above declares that the "note"element must contain a "to" element, a "from" element, a "header" element, and either a"message" or a "body" element.

Declaring Mixed Content

Example:

<!ELEMENT note (#PCDATA|to|from|header|message)*> The example above declares that the"note" element can contain zero or more occurrences of parsed character data, "to", "from","header", or "message" elements.

8/8/2019 XML Techno


In a DTD, attributes are declared with an ATTLIST declaration.

Declaring Attributes

An attribute declaration has the following syntax:

<!ATTLIST element-name attribute-name attribute-type default-

value>

DTD example:

<!ATTLIST payment type CDATA "check">

XML example:

<payment type="check" />

8/8/2019 XML Techno


Type Description

CDATA The value is character data

(en1|en2|..) The value must be one from an enumerated list

ID The value is a unique id

IDREF The value is the id of another elementIDREFS The value is a list of other ids

NMTOKEN The value is a valid XML name

NMTOKENS The value is a list of valid XML names

ENTITY The value is an entity

ENTITIES The value is a list of entities

NOTATION The value is a name of a notation

xml: The value is a predefined xml value

Value Explanation

value The default value of the attribute

#REQUIRED The attribute is required

#IMPLIED The attribute is not required

#FIXED value The attribute value is fixed

The attribute-type can be one of the following:

The default-value can be one of the following:

8/8/2019 XML Techno


A Default Attribute Value

DTD:<!ELEMENT square EMPTY><!ATTLIST square width CDATA "0">

Valid XML:<square width="100" />

In the example above, the "square" element is defined to be an empty element with a "width" attribute of type

CDATA. If no width is specified, it has a default value of 0. #REQUIRED

Syntax

<!ATTLIST element-name attribute-name attribute-type #REQUIRED>

Example

DTD:<!ATTLIST person number CDATA #REQUIRED>

Valid XML:<person number="5677" />

Invalid XML:<person />

Use the #REQUIRED keyword if you don't have an option for a default value, but still want to force the attribute tobe present.

8/8/2019 XML Techno


#IMPLIED

Syntax

<!ATTLIST element-name attribute-name attribute-type #IMPLIED>

Example

DTD:<!ATTLIST contact fax CDATA #IMPLIED>

Valid XML:<contact fax="555-667788" />

Valid XML:<contact />

Use the #IMPLIED keyword if you don't want to force the author to include an attribute, and you don't have an option for a default value.

#FIXED

Syntax

<!ATTLIST element-name attribute-name attribute-type #FIXED "value">

Example

DTD:<!ATTLIST sender company CDATA #FIXED "Microsoft">

Valid XML:<sender company="Microsoft" />

Invalid XML:<sender company="W3Schools" />

Use the #FIXED keyword when you want an attribute to have a fixed value without allowing the author to change it. If an author includes anothervalue, the XML parser will return an error.

8/8/2019 XML Techno


Enumerated Attribute Values

Syntax

<!ATTLIST element-name attribute-name (en1|en2|..) default-value>

Example DTD:

<!ATTLIST payment type (check|cash) "cash">

XML example:<payment type="check" />

or<payment type="cash" />

Use enumerated attribute values when you want the attribute valueto be one of a fixed set of legal values.

8/8/2019 XML Techno


XML Elements vs. Attributes

Use of Elements vs. Attributes

Data can be stored in child elements or in attributes.

Take a look at these examples:

<person sex="female"><firstname>Anna</firstname>

<lastname>Smith</lastname></person><person><sex>female</sex><firstname>Anna</firstname><lastname>Smith</lastname>

</person> In the first example sex is an attribute. In the last, sex is a child

element. Both examples provide the same information. There are no rules about when to use attributes, and when to use child

elements. My experience is that attributes are handy in HTML, but in XMLyou should try to avoid them. Use child elements if the information feelslike data.

8/8/2019 XML Techno


DTD - Entities

Entities are variables used to define shortcuts to standard text or specialcharacters.

Entity references are references to entities

Entities can be declared internal or external

An Internal Entity Declaration

Syntax <!ENTITY entity-name "entity-value"> Example

DTD Example:

<!ENTITY writer "Donald Duck."><!ENTITY copyright "CopyrightW3Schools.">

XML example:

<author>&writer;&copyright;</author>Note: An entity has three parts:an ampersand (&), an entity name, and a semicolon (;).

8/8/2019 XML Techno


An External Entity Declaration

Syntax

<!ENTITY entity-name SYSTEM "URI/URL"> Example

DTD Example:

<!ENTITY writer SYSTEM"http://www.w3schools.com/entities.dtd"><!ENTITY copyright SYSTEM"http://www.w3schools.com/entities.dtd">

XML example:

<author>&writer;&copyright;</author>

8/8/2019 XML Techno


DOM

What is the DOM?

The DOM is a W3C (World WideWeb Consortium) standard.

The DOM defines a standard for accessing documents like XML and HTML:

"The W3C Document Object Model (DOM) is a platform and language-neutral interface that allows programs and scripts to dynamically

access and update the content, structure, and style of a document."

The DOM is separated into 3 different parts / levels:

Core DOM - standard model for any structured document

XML DOM - standard model for XML documents HTML DOM - standard model for HTML documents

The DOM defines the objects and properties of all document elements, and the methods (interface) to access them.

What is the HTML DOM?

The HTML DOM defines the objects and properties of all HTML elements, and the methods (interface) to access them.

If you want to study the HTML DOM, find the HTML DOM tutorial on our Home page.

What is the XML DOM?

The XML DOM is:

A standard object model for XML

A standard programming interface for XML

Platform- and language-independent

AW3C standard

The XML DOM defines the objects and properties of all XML elements, and the methods (interface) to access them.

In other words: The XML DOM is a standard for how to get, change, add, or delete XML elements.

8/8/2019 XML Techno


In the DOM, everything in an XML document is a node.

DOM Nodes

According to the DOM, everything in an XML document

is a node. The DOM says:

The entire document is a document node

Every XML element is an element node

The text in the XML elements are text nodes

Every attribute is an attribute node

Comments are comment nodes

8/8/2019 XML Techno


DOM Example

Look at the following XML file :

<?xml version="1.0" encoding="ISO-8859-1"?><bookstore><book category="cooking"><title lang="en">Everyday Italian</title><author>Giada De Laurentiis</author><year>2005</year><price>30.00</price>

</book><book category="children">

<title lang="en">Harry Potter</title><author>J K. Rowling</author><year>2005</year><price>29.99</price>

</book><book category="web"><title lang="en">XQuery Kick Start</title><author>James McGovern</author><author>Per Bothner</author><author>Kurt Cagle</author><author>James Linn</author><author>Vaidyanathan Nagarajan</author><year>2003</year><price>49.99</price>

</book><book category="web" cover="paperback"><title lang="en">Learning XML</title><author>Erik T. Ray</author><year>2003</year><price>39.95</price>

</book></bookstore>

8/8/2019 XML Techno


The root node in the XML above is named <bookstore>. All othernodes in the document are contained within <bookstore>.

The root node <bookstore> holds four <book> nodes.

The first <book> node holds four nodes: <title>, <author>, <year>,and <price>, which contains one text node each, "Everyday Italian","Giada De Laurentiis", "2005", and "30.00".

Text is Always Stored in Text Nodes

A common error in DOM processing is to expect an element nodeto contain text.

However, the text of an element node is stored in a text node.

In this example: <year>2005</year>, the element node <year>,holds a text node with the value "2005".

"2005" is not the value of the <year> element!

8/8/2019 XML Techno


CSS Introduction

What is CSS?

CSS stands for Cascading Style Sheets

Styles define how to display HTML elements

Styles were added to HTML 4.0 to solve a problem

External Style Sheets can save a lot of work

External Style Sheets are stored in CSS files

CSS Demo

An HTML document can be displayed with different styles:

Styles Solved a Big Problem

HTML was never intended to contain tags for formatting a document.

HTML was intended to define the content of a document, like:

<h1>This is a heading</h1>

<p>This is a paragraph.</p>

When tags like <font>, and color attributes were added to the HTML 3.2 specification, it started a nightmare for web developers.

Development of large web sites, where fonts and color information were added to every single page, became a long and expensive

process.

To solve this problem, the WorldWide Web Consortium (W3C) created CSS.

In HTML 4.0, all formatting could be removed from the HTML document, and stored in a separate CSS file.

All browsers support CSS today.

8/8/2019 XML Techno


CSS Syntax

A CSS rule has two main parts: a selector, and one or more declarations:

The selector is normally the HTML element you want to style.

Each declaration consists of a property and a value.

The property is the style attribute you want to change. Each property has a value.

CSS Example CSS declarations always ends with a semicolon, and declaration groups are

surrounded by curly brackets:

p {color:red;text-align:center;} To make the CSS more readable, you can put onedeclaration on each line, like this:

Example

p

{color:red;text-align:center;}

8/8/2019 XML Techno


CSS Comments

Comments are used to explain your code, and may help you whenyou edit the source code at a later date. Comments are ignored bybrowsers.

A CSS comment begins with "/*", and ends with "*/", like this:

/*This is a comment*/p{text-align:center;/*This is another comment*/color:black;font-family:arial;}

8/8/2019 XML Techno


XSD

What is an XML Schema?

The purpose of an XML Schema is to define the legalbuilding blocks of an XML document, just like a DTD.

An XML Schema:

defines elements that can appear in a document

defines attributes that can appear in a document

defines which elements are child elements

defines the order of child elements

defines the number of child elements

defines whether an element is empty or can includetex

8/8/2019 XML Techno


Why Use XML Schemas?

XML Schemas Support Data Types

One of the greatest strength of XML Schemas is thesupport for data types.

With support for data types: It is easier to describe allowable document content

It is easier to validate the correctness of data

It is easier to work with data from a database

It is easier to define data facets (restrictions on data)

It is easier to define data patterns (data formats)

It is easier to convert data between different data types

8/8/2019 XML Techno


XML Schemas use XML Syntax

Another great strength about XML Schemas is that they are written in XML.

Some benefits of that XML Schemas are written in XML:

You don't have to learn a new language

You can use your XML editor to edit your Schema files

You can use your XML parser to parse your Schema files

You can manipulate your Schema with the XML DOM You can transform your Schema with XSLT

XML Schemas Secure Data Communication

When sending data from a sender to a receiver, it is essential that both parts have the same"expectations" about the content.

With XML Schemas, the sender can describe the data in a way that the receiver will understand.

A date like: "03-11-2004" will, in some countries, be interpreted as 3.November and in othercountries as 11.March.

However, an XML element with a data type like this: <date type="date">2004-03-11</date>

ensures a mutual understanding of the content, because the XML data type "date" requires theformat "YYYY-MM-DD".

8/8/2019 XML Techno


XML Schemas are Extensible

XML Schemas are extensible, because they are written in XML.

With an extensible Schema definition you can:

Reuse your Schema in other Schemas

Create your own data types derived from the standard types

Reference multiple schemas in the same document

Well-Formed is not Enough

A well-formed XML document is a document that conforms to the XML syntax rules, like: it must begin with the XML declaration

it must have one unique root element

start-tags must have matching end-tags

elements are case sensitive

all elements must be closed

all elements must be properly nested

all attribute values must be quoted

entities must be used for special characters

Even if documents are well-formed they can still contain errors, and those errors can have serious consequences. Think of the following situation: you order 5 gross of laser printers, instead of 5 laser printers. With XML Schemas,

most of these errors can be caught by your validating software.

8/8/2019 XML Techno


XSL Languages

It Started with XSL

XSL stands for EXtensible Stylesheet Language.

TheWorldWideWeb Consortium (W3C) started to develop XSL because there was a need for anXML-based Stylesheet Language.

CSS = Style Sheets for HTML

HTML uses predefined tags, and the meaning of each tag is well understood.

The <table> tag in HTML defines a table - and a browser knows how to display it.

Adding styles to HTML elements are simple. Telling a browser to display an element in a special fontor color, is easy with CSS.

XSL = Style Sheets for XML

XML does not use predefined tags (we can use any tag-names we like), and therefore the meaningof each tag is not well understood.

A <table> tag could mean an HTML table, a piece of furniture, or something else - and a browserdoes not know how to display it.

XSL describes how the XML document should be displayed!

XSL - More Than a Style Sheet Language

XSL consists of three parts:

XSLT - a language for transforming XML documents

XPath - a language for navigating in XML documents

XSL-FO - a language for formatting XML documents

8/8/2019 XML Techno


XSLT Introduction

XSLT is a language for transforming XML documentsinto XHTML documents or to other XML documents.

XPath is a language for navigating in XML documents.

XSLT stands for XSL Transformations XSLT is the most important part of XSL

XSLT transforms an XML document into another XMLdocument

XSLT uses XPath to navigate in XML documents XSLT is a W3C Recommendation

8/8/2019 XML Techno


XSLT = XSL Transformations

XSLT is the most important part of XSL.

XSLT is used to transform an XML document into another XML document, or another type of document that is recognized by a browser, like HTML and XHTML. Normally XSLT does this bytransforming each XML element into an (X)HTML element.

With XSLT you can add/remove elements and attributes to or from the output file. You can alsorearrange and sort elements, perform tests and make decisions about which elements to hide and

display, and a lot more. A common way to describe the transformation process is to say that XSLT transforms an XML

source-tree into an XML result-tree.

XSLT Uses XPath

XSLT uses XPath to find information in an XML document. XPath is used to navigate throughelements and attributes in XML documents.

If you want to study XPath first, please read our XPath Tutorial.

How Does it Work?

In the transformation process, XSLT uses XPath to define parts of the source document that shouldmatch one or more predefined templates. When a match is found, XSLT will transform the matchingpart of the source document into the result document.

XSLT is a W3C Recommendation

XSLT became a W3C Recommendation 16. November 1999.

8/8/2019 XML Techno


XPATH

What is XPath?

XPath is a syntax for defining parts of an XML document

XPath uses path expressions to navigate in XML documents

XPath contains a library of standard functions

XPath is a major element in XSLT

XPath is a W3C recommendation

XPath Path Expressions

XPath uses path expressions to select nodes or node-sets in an XML document. These path expressions look verymuch like the expressions you see when you work with a traditional computer file system.

XPath Standard Functions

XPath includes over 100 built-in functions. There are functions for string values, numeric values, date and timecomparison, node and QName manipulation, sequence manipulation, Boolean values, and more.

XPath is Used in XSLT

XPath is a major element in the XSLT standard. Without XPath knowledge you will not be able to create XSLTdocuments.

You can read more about XSLT in our XSLT tutorial.

XQuery and XPointer are both built on XPath expressions. XQuery 1.0 and XPath 2.0 share the same data modeland support the same functions and operators.

You can read more about XQuery in our XQuery tutorial.

XPATH is a W3C Recommendation

XPath became a W3C Recommendation 16. November 1999.

XPath was designed to be used by XSLT, XPointer and other XML parsing software.

8/8/2019 XML Techno


XPath Terminology

Nodes

In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document nodes.

XML documents are treated as trees of nodes. The topmost element of the tree is called the root element.

Look at the following XML document:

<?xml version="1.0" encoding="ISO-8859-1"?>

<bookstore><book><title lang="en">Harry Potter</title><author>J K. Rowling</author>

<year>2005</year><price>29.99</price>

</book></bookstore> Example of nodes in the XML document above:

<bookstore> (root element node)

<author>J K. Rowling</author> (element node)

lang="en" (attribute node) Atomic values

Atomic values are nodes with no children or parent.

Example of atomic values:

J K. Rowling

"en" Items

Items are atomic values or nodes.

8/8/2019 XML Techno


Relationship of Nodes

Parent

Each element and attribute has one parent.

In the following example; the book element is the parent of the title, author, year, and price:

<book><title>Harry Potter</title><author>J K. Rowling</author>

<year>2005</year><price>29.99</price>

</book> Children

Element nodes may have zero, one or more children.

In the following example; the title, author, year, and price elements are all children of the bookelement:

<book><title>Harry Potter</title>

<author>J K. Rowling</author><year>2005</year><price>29.99</price>

</book> Siblings

Nodes that have the same parent.

8/8/2019 XML Techno


In the following example; the title, author, year, and price elements are all siblings:

<book><title>Harry Potter</title><author>J K. Rowling</author><year>2005</year><price>29.99</price>

</book>Ancestors

A node's parent, parent's parent, etc.

In the following example; the ancestors of the title element are the book element and the bookstore element:

<bookstore>

<book><title>Harry Potter</title><author>J K. Rowling</author><year>2005</year><price>29.99</price>

</book>

</bookstore> Descendants

A node's children, children's children, etc.

In the following example; descendants of the bookstore element are the book, title, author, year, and price elements:

<bookstore>

<book>

<title>Harry Potter</title><author>J K. Rowling</author><year>2005</year><price>29.99</price>

</book>

</bookstore>

8/8/2019 XML Techno


SAX

SAX is the Simple API for XML, originally a Java-only API. SAX wasthe first widely adopted API for XML in Java, and is a de factostandard. The current version is SAX 2.0.1, and there are versionsfor several programming language environments other than Java.

SAX has recently switched over to the SourceForge project

infrastructure. The intent is to continue the open development andmaintainence process for SAX (no NDAs required) while making iteasier to track open SAX issues outside of the high-volume xml-devlist.

Project resources include archived mailing lists and a downloadarea. See the Project Page link for full information about project

facilities which are being used, as well as news announcements.Use the [email protected] mailing list to discussproblems that come up when you're trying to use SAX.

XML Techno

Documents

Transcript of XML Techno