Relax NG, a Schema Language for XML

Post on 20-Jan-2015

Here is a presentation on RELAX NG, a schema language for XML, that I gave at OSCON 2003.

Transcript of Relax NG, a Schema Language for XML

RELAX NG: A Schema Language for XML

Michael Fitzgerald

mike@wyeast.net

Wy’east Communications

9 July 2003 Slide 2

Introductions

A brief, swift, technical overview of RELAX NG

Some comparisons with DTDs and W3C XML Schema

As John Cowan has said: “Once RELAX NG crosses the ‘blood-brain barrier,’ you can never go back!”

9 July 2003 Slide 3

What Is RELAX NG?

RELAX NG is a schema language for XML

A schema language for XML describes constraints for a vocabulary beyond ordinary XML syntax

RELAX NG is simple, intuitive, elegant, easy to use and learn, and has a foundation in finite tree automata

RELAX NG is being forwarded as part of ISO/IEC 19757 Document Schema Definition Languages or DSDL (see http://www.dsdl.org)

9 July 2003 Slide 4

When & Who?

Version 1.0 specs were developed by the RELAX NG technical committee at OASIS between April and December 2001

Merges Murata Makoto’s RELAX and James Clark’s TREX

James Clark is chair of the RELAX NG technical committee

9 July 2003 Slide 5

An Elegant Alternative

Murata Makoto coauthored the RELAX NG tutorial and specification with James Clark

Offers an XML as well as a compact, non-XML syntax

Committed to simplicity, modularity, composability

No side effects, no PSVI

An alternative to the dominant W3C XML Schema

9 July 2003 Slide 6

XML Schema Competition?

Is RELAX NG poised to threaten XML Schema’s dominance? No.

Will RELAX NG replace XML Schema? No.

Will RELAX NG continue to attract seasoned schema developers based on word of mouth? Probably.

9 July 2003 Slide 7

A Few Preliminaries

Patterns describe content and structure of instances

An instance of a schema is a document that complies with that schema

Most RELAX NG patterns can act as a document element for the schema

Structure namespace also denotes version: http://relaxng.org/ns/structure/1.0

9 July 2003 Slide 8

DTDs & RELAX NG

RELAX NG is an evolution of the DTD

Elements and other structures are defined, not declared

No concept of associating a schema with an instance, as with a document type declaration, as in:

<!DOCTYPE date SYSTEM "date.dtd">

9 July 2003 Slide 9

Element Definitions

The <element> element

Must have a name attribute or a <name> child element

In compact syntax, defined with an element keyword

XML Schema likewise has an <xs:element> element

9 July 2003 Slide 10

For Example, Elements & Schemas

Following is an element definition which is also a complete schema in XML syntax (date.rng):

<element name="date" xmlns="http://relaxng.org/ns/structure/1.0">

<text/>

</element>

Compact syntax, namespace assumed by default (date.rnc):

element date { text }

9 July 2003 Slide 11

Validating with Jing

Instance (date.xml):

<date>2003-07-09</date>

Jing is a multi-platform RELAX NG validator, written by James Clark, in Java

Validating the element examples:

jing date.rng date.xml

jing -c date.rnc date.xml

9 July 2003 Slide 12

Other RELAX NG Tools

James Clark’s Trang, a schema translator (http://thaiopensource.com/relaxng/trang.html)

Sun’s Multi-schema Validator or MSV (http://wwws.sun.com/software/xml/developers/multischema/)

Asami Tomoharu’s Relaxer schema compiler (http://www.relaxer.org)

9 July 2003 Slide 13

Adding an Attribute

Content models formed by simple nesting

The <attribute> element is an example

In XML syntax, <text/> is assumed as a child of <attribute> and can be left out

Compact syntax, however, requires the text keyword

XML Schema likewise uses an <xs:attribute> element

9 July 2003 Slide 14

Attributes Examples

XML syntax (att.rng):

<element name="date" xmlns="http://relaxng.org/ns/structure/1.0">
  <attribute name="type"/>
  <text/>
</element>

Compact syntax (att.rnc):

element date { attribute type { text }, text }

Match (att.xml):

<date type="ISO">2003-07-09</date>

9 July 2003 Slide 15

Empty Elements

Empty elements in XML may have attributes but no text or child element content

Element definitions may not be entirely empty

If no attributes are defined, must use <empty/> or empty in the content of the element definition

XML Schema signals empty content by the absence of a content model

9 July 2003 Slide 16

Empty Elements Examples

XML syntax (empty.rng):

<element name="br" xmlns="http://relaxng.org/ns/structure/1.0">
  <empty/>
</element>

Compact (empty.rnc):

element br {empty}

XML syntax (image.rng):

<element name="image" xmlns="http://relaxng.org/ns/structure/1.0">
  <attribute name="source"/>
</element>

Compact (image.rnc):

element image {attribute source {text}}

Match empty.xml, image.xml

9 July 2003 Slide 17

Namespaces

In RNG, the ns attribute defines the namespace that the pattern should match

ns namespace matches a namespace defined in the instance with xmlns

ns is inherited by child elements

xmlns declares, ns defines the matching namespace

Compact syntax uses default and namespace keywords

9 July 2003 Slide 18

Namespaces Examples

XML syntax (ns.rng):

<element name="date" ns="http://www.wyeast.net/date" xmlns="http://relaxng.org/ns/structure/1.0">
  <text/>
</element>

Compact (ns.rnc):

default namespace = "http://www.wyeast.net/date"
element date {text}

Match (ns.xml):

<date xmlns="http://www.wyeast.net/date">2003-07-09</date>

9 July 2003 Slide 19

Occurrence Constraints

<optional> is equivalent to ? (zero or one) in DTDs

<oneOrMore> is equivalent to + in DTDs

<zeroOrMore> is equivalent to * in DTDs

? and + and * work in RNC

RELAX NG does not have minOccurs and maxOccurs equivalents
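
Without minOccurs and maxOccurs, a bounded count has to be spelled out by repeating the pattern. A minimal sketch, assuming a hypothetical rule that a dates element holds two or three date children (the rule is an illustration, not part of the original examples):

```xml
<element name="dates" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Two required occurrences give a lower bound of two. -->
  <element name="date"><text/></element>
  <element name="date"><text/></element>
  <!-- One optional occurrence gives an upper bound of three. -->
  <optional>
    <element name="date"><text/></element>
  </optional>
</element>
```

In compact syntax the same idea reads: element dates { element date {text}, element date {text}, element date {text}? }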

9 July 2003 Slide 20

One or More Examples

XML syntax (dates.rng):

<element name="dates" xmlns="http://relaxng.org/ns/structure/1.0">
  <oneOrMore>
    <element name="date"><text/></element>
  </oneOrMore>
</element>

Compact (dates.rnc):

element dates {element date {text}+}

Match (dates.xml):

<dates>
  <date>2003-07-07</date>
  <date>2003-07-08</date>
  <date>2003-07-09</date>
</dates>

9 July 2003 Slide 21

choice & group

<choice> matches instances with any one of its children

Compact uses | as in DTDs

<group> matches instances with all of its children

Compact uses () as in DTDs

XML Schema also uses xs:choice and xs:group

9 July 2003 Slide 22

choice & group Examples

XML syntax (instant.rng):

<element name="instant" xmlns="http://relaxng.org/ns/structure/1.0">
  <choice>
    <group>
      <element name="date"><text/></element>
      <element name="time"><text/></element>
    </group>
    <element name="date-time"><text/></element>
  </choice>
</element>

Compact (instant.rnc):

element instant {
  (element date {text}, element time {text})
  | element date-time {text}
}

9 July 2003 Slide 23

Definitions

Create named definitions with <define>

Refer to a named definition with <ref>

Definition names do not conflict with the names of elements or attributes

Similar to <complexType> in XML Schema

9 July 2003 Slide 24

Definition Examples

XML syntax (see def.rng):

<define name="date">
  <element name="date">
    <element name="year"><text/></element>
    <element name="month"><text/></element>
    <element name="day"><text/></element>
  </element>
</define>

Compact syntax (see def.rnc):

date = element date {
  element year {text},
  element month {text},
  element day {text}
}

9 July 2003 Slide 25

Grammar

If <define> elements are used, both <grammar> and <start> must be used as well

<grammar> becomes root element for schema

<start> indicates document element in the instance (similar to what DOCTYPE does)

9 July 2003 Slide 26

grammar, start & ref Example

XML syntax (def.rng/.rnc with def.xml):

<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="date"/>
  </start>
  <define name="date">
    <element name="date">
      <element name="year"><text/></element>
      <element name="month"><text/></element>
      <element name="day"><text/></element>
    </element>
  </define>
</grammar>

9 July 2003 Slide 27

Datatypes

RELAX NG supports external datatype libraries, namely XML Schema datatypes

The datatypeLibrary attribute indicates the namespace for the datatype library (inherited)

The <data> element names a datatype with its type attribute

<param>, a child element of <data>, indicates facets of the datatype per XML Schema

9 July 2003 Slide 28

Datatypes in Compact Syntax

The datatype library is automatically declared for XML Schema datatypes

The xsd prefix is required unless the XML Schema datatypes library is redeclared under another prefix

Parameters are defined with literal strings

9 July 2003 Slide 29

Datatype Examples

XML syntax (year.rng with year.xml):

<element name="year" xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <data type="gYear">
    <param name="minInclusive">2002</param>
    <param name="maxInclusive">2005</param>
  </data>
</element>

Compact syntax (year.rnc also with year.xml):

element year {
  xsd:gYear {minInclusive="2002" maxInclusive="2005"}
}

9 July 2003 Slide 30

Enumerations

In DTDs, enumerations are only possible in attributes, defined in a DTD like this:

<!ATTLIST week day (m|w|f) #REQUIRED>

Enumerations possible in RELAX NG in both elements and attributes

RELAX NG uses the <value> element in XML syntax, literals separated by | in compact syntax

XML Schema uses the <xs:enumeration> facet element, which is a child of <xs:restriction>, which is a child of <xs:simpleType>
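
As a sketch of the attribute case, the DTD declaration above could be carried over to RELAX NG roughly as follows. The content of the week element is an assumption, since the DTD fragment does not show it; empty content is used here.

```xml
<element name="week" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- An attribute is required unless wrapped in <optional>, matching #REQUIRED. -->
  <attribute name="day">
    <choice>
      <value>m</value>
      <value>w</value>
      <value>f</value>
    </choice>
  </attribute>
  <!-- Assumption: week has no text or child elements. -->
  <empty/>
</element>
```

An instance such as <week day="w"/> would match; <week day="t"/> would not.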

9 July 2003 Slide 31

Enumeration Examples

XML syntax (day.rng with day.xml):

<element name="day" xmlns="http://relaxng.org/ns/structure/1.0">
  <choice>
    <value>m</value>
    <value>w</value>
    <value>f</value>
  </choice>
</element>

Compact syntax (day.rnc also with day.xml):

element day { "m" | "w" | "f" }

9 July 2003 Slide 32

Lists

A list is a whitespace-separated sequence of tokens

RELAX NG uses the <list> element to define a list; it contains one or more <data> elements

Compact syntax uses the list keyword followed by a comma-separated list of types

9 July 2003 Slide 33

More on Lists

Can use occurrence constraints such as <optional> or ?, <oneOrMore> or +, <zeroOrMore> or *

DTDs use NMTOKENS, IDREFS, and ENTITIES, but for attributes only

XML Schema uses <xs:list>, a child of <xs:simpleType>, which can be constrained by facets
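
A minimal sketch of an occurrence constraint inside a list, assuming a hypothetical numbers element that holds one or more floats (the element name is an illustration only):

```xml
<element name="numbers" xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <list>
    <!-- Any number of whitespace-separated floats, but at least one. -->
    <oneOrMore>
      <data type="float"/>
    </oneOrMore>
  </list>
</element>
```

An instance such as <numbers>1.0 2.5 -3.75</numbers> would match. In compact syntax: element numbers { list { xsd:float+ } }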

9 July 2003 Slide 34

List Examples

XML syntax (vertex.rng with vertex.xml):

<element name="vertex" xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <list>
    <data type="float"/>
    <data type="float"/>
    <data type="float"/>
  </list>
</element>

Compact syntax (vertex.rnc also with vertex.xml):

element vertex {list {xsd:float, xsd:float, xsd:float}}

9 July 2003 Slide 35

Interleave

In RELAX NG, you can define a pattern where elements may appear in any order with <interleave>

Restores SGML’s & connector (both in either order)

Uses a non-deterministic content model, which DTDs and XML Schema forbid

Can come close with ANY in DTDs and <xs:all> or <xs:choice> in XML Schema

Can use occurrence constraints in <interleave>

9 July 2003 Slide 36

Interleave Examples

XML syntax (name.rng with name.xml):

<element name="name" xmlns="http://relaxng.org/ns/structure/1.0">
  <interleave>
    <element name="family"><text/></element>
    <oneOrMore>
      <element name="given"><text/></element>
    </oneOrMore>
  </interleave>
</element>

Compact syntax (name.rnc with name.xml):

element name { element family {text} & element given {text}+ }

9 July 2003 Slide 37

Mixed Content Model

Mixed content models allow child elements to appear in any order and to be mixed with text

In DTDs, the model is (see mixed.dtd):

<!ELEMENT name (#PCDATA | family | given)*>

RELAX NG XML syntax uses <mixed> element

9 July 2003 Slide 38

More on Mixed Content

<mixed> is syntactic sugar for <interleave> with <text/>

Compact syntax uses the mixed keyword with comma-separated patterns

XML Schema uses the boolean attribute mixed on either <xs:complexType> or <xs:complexContent>
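
To make the sugar concrete, here is a sketch of a name element written without <mixed>. It should be equivalent to wrapping the same family and given patterns in <mixed>, because several children of <mixed> are treated as a group.

```xml
<element name="name" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Text may appear before, between, and after the grouped elements. -->
  <interleave>
    <text/>
    <group>
      <element name="family"><text/></element>
      <element name="given"><text/></element>
    </group>
  </interleave>
</element>
```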

9 July 2003 Slide 39

Mixed Examples

XML syntax (mixed.rng with instance mixed.xml):

<element name="name" xmlns="http://relaxng.org/ns/structure/1.0">
  <mixed>
    <element name="family"><text/></element>
    <element name="given"><text/></element>
  </mixed>
</element>

Compact syntax (mixed.rnc with instance mixed.xml):

element name {mixed { element family {text}, element given {text} }}

9 July 2003 Slide 40

Grammars

The <externalRef> element references an external pattern via a required href attribute

You can combine definitions with the combine attribute on <define>: a value of interleave (&= in compact syntax) matches in any order, and a value of choice (|= in compact syntax) matches any one of the alternatives

The <include> element merges grammars together (has an href attribute)
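
A minimal sketch showing the three features side by side. The file names date-base.rng and time.rng, and the names event and when, are hypothetical and chosen only for illustration.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Assumption: date-base.rng is a hypothetical grammar that defines
       "date" and has no <start> of its own. -->
  <include href="date-base.rng"/>

  <start>
    <element name="event">
      <ref name="when"/>
    </element>
  </start>

  <define name="when">
    <ref name="date"/>
  </define>

  <!-- Same name plus combine="choice": "when" now matches either definition.
       Assumption: time.rng is a hypothetical schema holding a single pattern. -->
  <define name="when" combine="choice">
    <externalRef href="time.rng"/>
  </define>
</grammar>
```

At most one of the same-named definitions may omit the combine attribute, and the others must all use the same value.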

9 July 2003 Slide 41

More on Grammars

You can nest definitions in an <include> element; each one overrides the included definition with the same name

You can also use <notAllowed/> when merging grammars, forcing the including grammar to redefine the not-allowed definitions

Grammars can be nested; <parentRef> references or escapes to the parent grammar in a nested grammar, allowing you to redefine a child grammar using a parent, a sort of import
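
A sketch of overriding a definition during inclusion, assuming a hypothetical date-base.rng whose year definition accepts any text; the override narrows it to an XML Schema gYear.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- Assumption: date-base.rng is a hypothetical grammar that defines
       "date" and "year" and has no <start> of its own. -->
  <include href="date-base.rng">
    <!-- Replaces the definition of "year" supplied by date-base.rng. -->
    <define name="year">
      <element name="year">
        <data type="gYear"/>
      </element>
    </define>
  </include>
  <start>
    <ref name="date"/>
  </start>
</grammar>
```

If the included file had instead defined year as <notAllowed/>, that definition could never match, so an override like this one (or a combine with choice) would be needed before year could appear in an instance.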

9 July 2003 Slide 42

Name Classes

Name classes allow you to include or exclude whole classes of names in a pattern

<name> allows a given name

<anyName/> allows any name

<nsName> allows names from a given namespace

The <except> element removes names from a class
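
A sketch that puts the four name classes together, assuming a hypothetical extension element that accepts child elements with any name except those in the http://www.wyeast.net/date namespace and the unqualified name private:

```xml
<element name="extension" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <!-- No name attribute here: the name class is given as a child element. -->
    <element>
      <anyName>
        <except>
          <nsName ns="http://www.wyeast.net/date"/>
          <name>private</name>
        </except>
      </anyName>
      <text/>
    </element>
  </zeroOrMore>
</element>
```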

9 July 2003 Slide 43

Schematron & RELAX NG

Schematron is an assertion-based, rather than a grammar-based schema language

Uses path expressions (XSLT and XPath)

The “feather duster” that can reach corners of your instances, where grammars can’t (Rick Jelliffe)

Co-occurrence constraints possible

Can embed Schematron in RELAX NG
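
A rough sketch of an embedded co-occurrence constraint, assuming the Schematron 1.5 namespace and a hypothetical count attribute that must equal the number of date children. A RELAX NG validator treats the sch: elements as foreign and ignores them, so this assumes a separate step that extracts and runs the Schematron rules.

```xml
<element name="dates" xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:sch="http://www.ascc.net/xml/schematron">
  <!-- Foreign elements: checked by Schematron, not by RELAX NG. -->
  <sch:pattern name="date count">
    <sch:rule context="dates">
      <sch:assert test="count(date) = @count">The count attribute must equal the number of date children.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <!-- The grammar itself cannot relate the attribute value to the child count. -->
  <attribute name="count"/>
  <oneOrMore>
    <element name="date"><text/></element>
  </oneOrMore>
</element>
```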

9 July 2003 Slide 44

Annotations

You can use elements and attributes from other vocabularies in RELAX NG, such as XHTML

<a:documentation> is a special documentation element defined in the RELAX NG DTD compatibility spec (http://www.oasis-open.org/committees/relax-ng/compatibility.html)

<div> provides a place to add foreign attributes for documentation purposes
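
A minimal sketch of both kinds of annotation on a date element. The documentation text is made up for illustration; the a: namespace is the one defined by the DTD compatibility spec.

```xml
<element name="date" xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
    xmlns:html="http://www.w3.org/1999/xhtml">
  <!-- The documentation element comes first among the children. -->
  <a:documentation>A calendar date, for example 2003-07-09.</a:documentation>
  <!-- An element from another vocabulary, ignored during validation. -->
  <html:p>Dates are written <html:em>year-month-day</html:em>.</html:p>
  <text/>
</element>
```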

9 July 2003 Slide 45

RELAX NG Resources

RELAX NG: http://www.relaxng.org

OASIS: http://www.oasis-open.org

RELAX: http://www.xml.gr.jp/relax/

James Clark, TREX, Jing, Trang: http://www.thaiopensource.com

Design of RELAX NG: http://www.thaiopensource.com/relaxng/design.html

Conclusion

Thanks for your time!

Comments or questions?

mike@wyeast.net