XML Validation II Schemas Robin Burke ECT 360. Outline Namespaces Documents Data types XML Schemas...


XML Validation II: Schemas

Robin Burke

ECT 360


Outline

Namespaces
Documents
Data types
XML Schemas
    Elements
    Attributes
    Derived data types

RELAX NG


XML so far

Languages defined by DTDs
    names assigned by designers
OK for standalone systems
Doesn't have
    the ability to handle naming conflicts
    the ability to partition work among different developers


Namespaces

A way to identify a set of labels
    element / attribute names
    attribute values
as belonging to a particular application


Example

recordings
    title
    artist
        • group | artist-name+
    date
    label
artworks
    title
    artist
    date
    exhibit
books
    title
    author
    date
    publisher


Problem

Want to create a list of items related to 50s Beat-era culture
    includes music, art, literature
Could create a new DTD
    better to reuse existing ones


Namespace idea

Associate a short prefix with an application
    Schema or DTD
Use the prefix with a colon to "qualify" names
    music:artist
    art:artist
    book:author


Namespace idea, cont'd

A namespace is an association between
    a set of names
    a unique identifier (URI)
    a prefix used to identify them


Namespace declaration

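Standalone
    <?xml:namespace ns="http://bookpeople.com/book" prefix="book"?>
    (processing-instruction form from an early working draft; the final Namespaces in XML recommendation keeps only the attribute form)
Part of element
    <html xmlns="http://www.w3.org/1999/xhtml">
        in this case, no prefix
    <book xmlns:book="http://bookpeople.com/book">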
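
As a hedged illustration of the prefix idea, the document below sketches a Beat-era list that mixes names from three vocabularies, so that music:artist and art:artist no longer collide. Only the book URI comes from the slides; the music and art URIs, the beat-list root element, the child element names, and the sample entries are assumptions made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical document: only the book namespace URI appears in the slides;
     the music and art URIs and the beat-list element are invented here. -->
<beat-list xmlns:music="http://example.com/music"
           xmlns:art="http://example.com/art"
           xmlns:book="http://bookpeople.com/book">
  <music:recording>
    <music:title>Kind of Blue</music:title>
    <music:artist>
      <music:artist-name>Miles Davis</music:artist-name>
    </music:artist>
  </music:recording>
  <art:artwork>
    <art:title>Autumn Rhythm</art:title>
    <!-- same local name as music:artist, but a different qualified name -->
    <art:artist>Jackson Pollock</art:artist>
  </art:artwork>
  <book:book>
    <book:title>On the Road</book:title>
    <book:author>Jack Kerouac</book:author>
  </book:book>
</beat-list>
```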


Namespace URI

Not necessarily a URL
    there need not be any resource at the given location
    just a unique identifier
URL-like identifiers are good
    associated with an organization
    must be unique on the Internet


Example

DTDs
Document
Problem
    how to import the namespaces?


Solution

Fully-qualified names everywhere
    yuk!

DTDs & namespaces don't work well together


XML so far

Languages defined by DTDs
    contain text elements
    string attributes
OK for text documents
Not enough for
    Databases
    Business process integration


Other DTD problems

Not XML
    different syntax
    different processor

No support for namespaces


Solution

Write language definition in XML
Allow more control over document contents
XML document becomes a complex data type
XML language definition becomes a complex data type specification


XML Schema

Always a separate document
    no internal option
Written in XML
    very verbose

Can be complex


Schemas and namespaces

A schema uses elements from one application
        • the XML Schema language
    to define another
Namespaces are necessary
Namespaces apply to elements
    not values
Namespace of element assumed to apply to attributes
    can have attributes from different namespaces
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">


Example 1, XML

<grades assignment="Homework 1">
    <grade>
        <student id="1234-12345">Jane Doe</student>
        <assigned-grade>A</assigned-grade>
    </grade>
    <grade>
        <student id="5432-54321">John Doe</student>
        <assigned-grade>B</assigned-grade>
    </grade>
</grades>


Example 1, DTD

<!ELEMENT grades (grade*)>

<!ATTLIST grades assignment CDATA #IMPLIED>

<!ELEMENT grade (student, assigned-grade)>

<!ELEMENT student (#PCDATA)>

<!ATTLIST student

id CDATA #REQUIRED>

<!ELEMENT assigned-grade (#PCDATA)>


Data types

grades
    a collection of items of type grade
    can never have more than 40 students
grade
    a structure containing a student and an assigned grade
student
    a structure containing an id and some text
    probably should constrain the student id
assigned-grade
    is text
    probably should constrain to A-D, F, I


Built-in types

Part of the schema language
Base types
    19 fundamental types
    Examples: string, decimal
Derived types
    25 more types that use the base types
    Examples: ID, positiveInteger


Built-in types, cont'd


To declare an element

<xs:element name="assigned-grade" type="xs:string" />

Equivalent to
    <!ELEMENT assigned-grade (#PCDATA)>


Simple data type

A renaming of an existing data type
    <xs:element name="assigned-grade" type="xs:string" />
Or a restriction of an existing type
    strings beginning with "D"
    more on this later


Complex datatype

<xs:element name="name">

<xs:complexType>

compositor

element declarations

attribute declarations

</xs:complexType>

</xs:element>


Compositor

sequence
choice
all


Sequence compositor

like "," in DTD
DTD
    <!ELEMENT foo (bar, baz)>
Schema
    <xs:element name="foo">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="bar" />
                <xs:element ref="baz" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>


Elements in sequences

Can specify optional / # of occurrences
?
    <xs:element ref="bar" minOccurs="0" />
*
    <xs:element ref="bar" minOccurs="0" maxOccurs="unbounded" />
+
    <xs:element ref="bar" minOccurs="1" maxOccurs="unbounded" />
What about...
    <xs:element ref="bar" minOccurs="2" maxOccurs="4" />


Choice compositor

like "|" in DTD
DTD
    <!ELEMENT foo (bar | baz)>
Schema
    <xs:element name="foo">
        <xs:complexType>
            <xs:choice>
                <xs:element ref="bar" />
                <xs:element ref="baz" />
            </xs:choice>
        </xs:complexType>
    </xs:element>


All compositor

no simple DTD equivalent
DTD
    <!ELEMENT foo ( (bar, baz) | (baz, bar) )>
Schema
    <xs:element name="foo">
        <xs:complexType>
            <xs:all>
                <xs:element ref="bar" />
                <xs:element ref="baz" />
            </xs:all>
        </xs:complexType>
    </xs:element>


Nesting

Compositors can be combined
DTD
    <!ELEMENT foo ( (bar | baz), (thud | grunt) )>
Schema
    <xs:element name="foo">
        <xs:complexType>
            <xs:sequence>
                <xs:choice>
                    <xs:element ref="bar" />
                    <xs:element ref="baz" />
                </xs:choice>
                <xs:choice>
                    <xs:element ref="thud" />
                    <xs:element ref="grunt" />
                </xs:choice>
            </xs:sequence>
        </xs:complexType>
    </xs:element>


Example

<!ELEMENT grades (grade*)>
<!ELEMENT grade (student, assigned-grade)>
<!ELEMENT student (#PCDATA)>
<!ELEMENT assigned-grade (#PCDATA)>
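
One possible translation of this DTD into XML Schema, as a minimal sketch rather than the schema used in class. It assumes no target namespace and uses global element declarations with ref, following the style of the compositor slides; attributes are left out because this DTD fragment declares none.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- grades (grade*) -->
  <xs:element name="grades">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="grade" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- grade (student, assigned-grade) -->
  <xs:element name="grade">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="student"/>
        <xs:element ref="assigned-grade"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- (#PCDATA) content maps to xs:string -->
  <xs:element name="student" type="xs:string"/>
  <xs:element name="assigned-grade" type="xs:string"/>
</xs:schema>
```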


Local naming

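Suppose we want to reuse an element name
    different place in the structure
Example
    <!ELEMENT url-catalog (link*)>
    <!ELEMENT link (link, description?)>
        • not a workable DTD: each element name gets one global declaration, so the inner link cannot mean something different
schema?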
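
A hedged sketch of how a schema can handle this with local element declarations: the outer link is declared inside url-catalog and the inner link inside the outer one, so the two can have different types. The choice of xs:anyURI for the inner link and xs:string for description is an assumption, since the slide does not say what they contain.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="url-catalog">
    <xs:complexType>
      <xs:sequence>
        <!-- outer link: a structure, declared locally -->
        <xs:element name="link" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- inner link: same name, different (assumed) type -->
              <xs:element name="link" type="xs:anyURI"/>
              <xs:element name="description" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```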


Using namespaces

Schema must say
    to use schema namespace
    what namespace it is defining
        • targetNamespace
Document must say
    that it is using the Schema Instance namespace
    what namespace(s) it is using
    what prefix(es) are used
    where to find the relevant schemas
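
A minimal sketch of both halves, under stated assumptions: the namespace URI http://example.com/grades and the file name grades.xsd are hypothetical, and the schema declares only one element to keep the example short. First the schema, which uses the XML Schema namespace and names the namespace it defines:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- grades.xsd (hypothetical file name) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/grades"
           xmlns="http://example.com/grades"
           elementFormDefault="qualified">
  <xs:element name="assigned-grade" type="xs:string"/>
</xs:schema>
```

Then a document that declares the Schema Instance namespace, the namespace it uses (here as the default, so no prefix), and where to find the schema:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- xsi:schemaLocation holds pairs: namespace URI, then schema location -->
<assigned-grade xmlns="http://example.com/grades"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://example.com/grades grades.xsd">A</assigned-grade>
```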


Multi-schema documents

Possible to validate multi-schema documents

Must use any element to import namespace
    can't restrict to certain elements
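
Assuming the "any element" on this slide refers to the xs:any wildcard, a container for the Beat-era list might be sketched as follows. The beat-list element and its namespace URI are hypothetical. The wildcard accepts any element from a namespace other than the target namespace, which is why it cannot single out particular elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/beat"
           elementFormDefault="qualified">
  <xs:element name="beat-list">
    <xs:complexType>
      <xs:sequence>
        <!-- ##other: elements from namespaces other than the target namespace;
             strict: the validator must find a declaration for each of them -->
        <xs:any namespace="##other" processContents="strict"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```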


Attributes

DTD attribute types
    CDATA, enumeration, token
Schema
    can be any of the basic or derived types
    can also be user-defined types
Declaration
    <xs:attribute name="x" type="xs:string" use="optional" default="abc" />
    (a default value is only allowed when use is optional)


Attribute declaration

Part of complex type
    follows compositor
    (one exception)
Declaration
    <xs:attribute name="foo" type="xs:positiveInteger" />
What if the attribute is a more complex type itself?
    we'll get to that


Example

grades element?
    add homework attribute
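
One way the exercise might come out, as a sketch only. It assumes the "homework attribute" is the assignment attribute from Example 1, and that grade is declared globally elsewhere in the schema.

```xml
<xs:element name="grades" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="grade" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- attribute declarations follow the compositor;
         the DTD's #IMPLIED corresponds to use="optional" -->
    <xs:attribute name="assignment" type="xs:string" use="optional"/>
  </xs:complexType>
</xs:element>
```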


Exception: simple content

If an element has "simple content"
    no compositor used
    instead simpleContent element
        • and extension to declare type of the content


Example

<!ELEMENT student (#PCDATA)>
<!ATTLIST student
    id CDATA #REQUIRED>

<xs:element name="student">
    <xs:complexType>
        <xs:simpleContent>
            <xs:extension base="xs:string">
                <xs:attribute name="id" type="xs:string" use="required"/>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>
</xs:element>


How to read this

student is a complex type
    it is not simply a renaming of an existing type
its content is simple
    being of only one type
        • string
    but with an attribute
        • id of type string which is required


Standalone types

A type can stand outside of an element definition
    must have a name

<xs:complexType name="bar-n-baz">

<xs:sequence>

<xs:element ref="bar" />

<xs:element ref="baz" />

</xs:sequence>

</xs:complexType>

Used in element definition
    <xs:element name="foo" type="bar-n-baz" />


Deriving types

DTDs do not allow type restrictions
    beyond enumeration, CDATA, token
        • for attributes
    PCDATA
        • for content
Schemas have built-in types
    also capability to create your own


Derivation operations

list
    sequence of values
union
    combine two types
        • allowing either
restriction
    placing limits on the legal values


List

<xs:element name="partList">

<xs:simpleType>

<xs:list itemType="partNo" />

</xs:simpleType>

</xs:element>

<partList>PN334-04 PN223-89 PQ1112-03</partList>

Must be separated by spaces
probably more useful to do this with document structure
    partList -> partNo*
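
The list example refers to a partNo type that the slide does not show. A hypothetical definition is sketched below; the pattern is only a guess inferred from the three sample values (two capital letters, three or four digits, a hyphen, two digits).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- assumed shape of a part number, e.g. PN334-04 or PQ1112-03 -->
  <xs:simpleType name="partNo">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}[0-9]{3,4}-[0-9]{2}"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- partList content: part numbers separated by whitespace -->
  <xs:element name="partList">
    <xs:simpleType>
      <xs:list itemType="partNo"/>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```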


Union

Allows data of either type to be used
Example

<xs:simpleType name="xs:integer">

<xs:union memberTypes="xs:negativeInteger xs:nonNegativeInteger" />

</xs:simpleType>

Bogus!
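
A less contrived union, as a sketch: a value that may be either a numeric score or a letter grade. The type names scoreOrLetter and letterGrade are hypothetical and do not appear elsewhere in the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="letterGrade">
    <xs:restriction base="xs:string">
      <xs:enumeration value="A"/>
      <xs:enumeration value="B"/>
      <xs:enumeration value="C"/>
      <xs:enumeration value="D"/>
      <xs:enumeration value="F"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- accepts 87 as well as B: a value is valid if either member type accepts it -->
  <xs:simpleType name="scoreOrLetter">
    <xs:union memberTypes="xs:nonNegativeInteger letterGrade"/>
  </xs:simpleType>
</xs:schema>
```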


Restriction

Most useful
Allows the designer to state exactly what values are legal
    prices must be non-negative
    SSN must follow a certain pattern
    in-stock must be yes or no
    etc.
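
The three rules above could be written roughly as follows. This is a sketch: the type names price, ssn, and inStock are hypothetical, and the SSN pattern assumes the usual 3-2-4 digit layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- prices must be non-negative -->
  <xs:simpleType name="price">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- SSN must follow a certain pattern (assumed: ddd-dd-dddd) -->
  <xs:simpleType name="ssn">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{3}-[0-9]{2}-[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- in-stock must be yes or no -->
  <xs:simpleType name="inStock">
    <xs:restriction base="xs:string">
      <xs:enumeration value="yes"/>
      <xs:enumeration value="no"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```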


Restriction, cont'd

Restrict a base type
    according to "facets"

Different facets available for different data types


Facets


Example: enumeration

<xs:simpleType name="grade">
    <xs:restriction base="xs:string">
        <xs:enumeration value="A"/>
        <xs:enumeration value="B"/>
        <xs:enumeration value="C"/>
        <xs:enumeration value="D"/>
        <xs:enumeration value="F"/>
        <xs:enumeration value="I"/>
    </xs:restriction>
</xs:simpleType>


Example: numeric

<xs:simpleType name="drinkingAge">
    <xs:restriction base="xs:positiveInteger">
        <xs:minInclusive value="21"/>
    </xs:restriction>
</xs:simpleType>


Example: pattern

Regular expressions again
    derived from Perl

<xs:simpleType>

<xs:restriction base="xs:string">

<xs:pattern value="([A-D]|F|I)(\+|\-)?" />

</xs:restriction>

</xs:simpleType>


Extended example

Complete schema for grades
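
The transcript does not include the schema shown at this point. The following is one possible complete schema, assembled from the constraints listed on the "Data types" slide; it is a sketch, not the schema from the lecture. The type names gradeLetter and studentId, and the id pattern (inferred from the sample ids "1234-12345" and "5432-54321"), are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- assigned-grade: constrain to A-D, F, I -->
  <xs:simpleType name="gradeLetter">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-DFI]"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- student id: four digits, hyphen, five digits (assumed format) -->
  <xs:simpleType name="studentId">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{4}-[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="grades">
    <xs:complexType>
      <xs:sequence>
        <!-- never more than 40 students -->
        <xs:element name="grade" minOccurs="0" maxOccurs="40">
          <xs:complexType>
            <xs:sequence>
              <!-- simple content (the name) plus a required id attribute -->
              <xs:element name="student">
                <xs:complexType>
                  <xs:simpleContent>
                    <xs:extension base="xs:string">
                      <xs:attribute name="id" type="studentId" use="required"/>
                    </xs:extension>
                  </xs:simpleContent>
                </xs:complexType>
              </xs:element>
              <xs:element name="assigned-grade" type="gradeLetter"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="assignment" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The Example 1 document from earlier in the deck should validate against this sketch, since both ids match the pattern and both grades are single letters in the allowed set.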


RELAX NG

XML Schemas are big
    a lot of the page consists of
        • < > /
        • repeated element names
RELAX NG
    created as an alternate validation language
    compact, non-XML syntax
        • also XML syntax


Example

element grades

{ element grade

{ element student { text },

element assigned-grade { text }

}*

}

Equivalent to

<!ELEMENT grades (grade*)>

<!ELEMENT grade (student, assigned-grade)>

<!ELEMENT student (#PCDATA)>

<!ELEMENT assigned-grade (#PCDATA)>
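
For comparison, the same grammar in RELAX NG's XML syntax might look like this. It is a sketch of the standard correspondence: the compact `*` becomes zeroOrMore and `{ text }` becomes a text element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<element name="grades" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- element grade { ... }* -->
  <zeroOrMore>
    <element name="grade">
      <element name="student">
        <text/>
      </element>
      <element name="assigned-grade">
        <text/>
      </element>
    </element>
  </zeroOrMore>
</element>
```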


Attributes

element grades
{ element grade
  { element student
    { text,
      attribute id { text }
    },
    element assigned-grade { text }
  }*,
  attribute assignment { text }
}


Types

instead of { text }
    use appropriate built-in data type
    attribute age { xsd:positiveInteger }
facets
    qualify with name / value pair
    attribute drinkingAge
    { xsd:positiveInteger
      { minInclusive = "21" }
    }


What does this one say?

element grade
{ element student ....,
  ( element assigned-grade
    { xsd:string { pattern = "([A-D](\+|\-)?|F)" } }
    |
    ( element assigned-grade { "I" },
      element reason { text }
    )
  )
}


The point

A schema language has two purposes
    lets the language designer state a design
    lets the system validate documents against that design
Any language that serves these purposes can be used


Validation languages

DTD
    SGML holdover
    ugly
    a dead-end
    fairly simple to express
Schema
    complete
    extensible
    baroque
    unreadable
RELAX NG
    readable
        • esp. compact syntax
    more expressive than Schema
    fewer tools


Homework #3

Create a schema for "Grills.xml"
Generate a schema for your "books.xml" file
    using XML Spy's "generate" feature
    edit generated schema


Next week

CSS
SVG
    an XML application for generating graphics
online reading