XML Schema Neminath Simmachandran CS 486 – Spr’01.

Overview:

• XML, a brush up

• Intro to Schemas

• Namespaces

• Elements, Attributes & Content model

• Summary

• History

XML Brush up

• XML is a meta-markup language: a language for defining markup languages.

• A markup language uses tags embedded directly in the text to describe the various pieces and parts of that text.

• A Document Type Definition (DTD) describes a set of tags & attributes.

• The DTD sets the rules by which its associated document must play.

Role of the DTD

• Element Declarations: Each one specifies a single markup element.

Eg: <!ELEMENT book ANY> declares an element ‘book’ (ANY puts no restriction on its content).

• Attribute list: This declares a set of attributes for a specific element.

Eg: <!ATTLIST book class (fiction|horror) #IMPLIED>

• Content Model: This is part of an element declaration and describes what kind of content can be nested within an element.

Types: Data, Element, Mixed content.

Eg: <!ELEMENT book (title, author, publisher, isbn)>. Here title, author, publisher & isbn are elements that must all appear, in that order, within the ‘book’ element.
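The sequence content model for ‘book’ can be sketched in Python as a simple order check. This is only an illustration of what the rule means, not a DTD validator; the function name matches_book_model is made up for this example.

```python
import xml.etree.ElementTree as ET

# Child order required by the content model (title, author, publisher, isbn).
BOOK_MODEL = ["title", "author", "publisher", "isbn"]

def matches_book_model(book_xml):
    """Return True when the children of 'book' are exactly the model, in order."""
    book = ET.fromstring(book_xml)
    return book.tag == "book" and [child.tag for child in book] == BOOK_MODEL
```

A ‘book’ whose children are out of order, or that is missing one of the four, fails the check.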

Role of the DTD (cont.)

• Entity Declaration: This creates an entity, which is essentially an alias that associates a unique name with a group of data.

Eg: <!ENTITY XML "Extensible Markup Language">, after which &XML; in the document expands to the full phrase.

Role of the Document

• The document uses the markup and guidelines specified in the DTD to describe content.

• Structure:

- Prolog

- Document element

- Elements

- Attributes

- Content

- Comment

- Processing instructions

DTD - Limitations

DTDs call for elements to consist of one of three things:

• A text string

• A text string with other child elements mixed together

• A set of child elements

Also,

• DTDs do not use XML syntax, so XML parsers cannot easily parse them into their component parts

• They have a very primitive system of data types

• They are not modular, so it's not easy to reuse parts of a DTD

• They are not easily extensible.

Intro to Schemas

• Schemas are themselves XML documents with markup, elements, attributes and comments.

• The XML Schema system aims to provide a rich grammatical structure for XML documents that overcomes the limitations of the DTD.

To illustrate the power of the XML Schema mechanism, let us take the example below,

An XML document fragment,

<InvoiceNo>123456789</InvoiceNo>

<ProductID>J123456</ProductID>

Intro to Schemas (cont.)

DTD fragment describing elements in the fragment above,

<!ELEMENT InvoiceNo (#PCDATA)>

<!ELEMENT ProductID (#PCDATA)>

XML Schema fragment describing elements in the XML fragment,

<element name='InvoiceNo' type='positive-integer'/>

<element name='ProductID' type='ProductCode'/>

<simpleType name='ProductCode' base='string'>
  <pattern value='[A-Z]{1}\d{6}'/>
</simpleType>
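What the schema checks here and the DTD cannot is sketched below in Python. It reads the pattern as [A-Z]{1}\d{6} (one uppercase letter followed by six digits) and treats positive-integer as "decimal digits with a value above zero", which is a simplification. The function names are made up for this example.

```python
import re

# Python rendering of the ProductCode pattern facet. Schema patterns must match
# the whole value, so re.fullmatch is the closest equivalent.
PRODUCT_CODE = re.compile(r"[A-Z]{1}\d{6}")

def is_product_code(text):
    return PRODUCT_CODE.fullmatch(text) is not None

def is_positive_integer(text):
    """Simplified positive-integer check: ASCII digits only, value above zero."""
    return text.isascii() and text.isdigit() and int(text) > 0
```

With the DTD, both InvoiceNo and ProductID are just #PCDATA, so a value such as ‘j12’ would pass unnoticed.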

Namespaces

A given XML Schema defines a set of new names, such as the names of elements, types, attributes and attribute groups, whose definitions and declarations are written in the schema.

Why namespaces?

A document can use names from different schemas. Namespaces enable us to distinguish between declarations and definitions from different vocabularies.

XML namespaces form a mechanism for avoiding name conflicts in XML documents. A namespace itself has a fixed but arbitrary name that must follow URI syntax (in practice it usually looks like a URL).

Namespaces (cont.)

Namespace:

• Target Namespace: the namespace to which the names defined in a schema belong.

• Source Namespace: a namespace other than the target whose names the schema's definitions & declarations refer to.

Eg: In the following piece of Schema

<element name='InvoiceNo' type='positive-integer'/>

<element name='ProductID' type='ProductCode'/>

<simpleType name='ProductCode' base='string'>
  <pattern value='[A-Z]{1}\d{6}'/>
</simpleType>

InvoiceNo, ProductID & ProductCode belong to the ‘target namespace’, which can be assigned an arbitrary name that follows URL syntax.

Namespaces (cont.)

Eg: ( with namespace)

Fragment code: 1

<xsd:schema targetNamespace='http://www.SampleStore.com/Account'
            xmlns:xsd='http://www.w3.org/1999/XMLSchema'
            xmlns:ACC='http://www.SampleStore.com/Account'>
  <xsd:element name='InvoiceNo' type='xsd:positive-integer'/>
  <xsd:element name='ProductID' type='ACC:ProductCode'/>
  <xsd:simpleType name='ProductCode' base='xsd:string'>
    <xsd:pattern value='[A-Z]{1}\d{6}'/>
  </xsd:simpleType>
</xsd:schema>

Namespaces (cont.)

Eg: ( with namespace) / Fragment code: 1

In this example,

Target namespace: http://www.SampleStore.com/Account, which contains InvoiceNo, ProductID & ProductCode.

Source namespace: http://www.w3.org/1999/XMLSchema, which has schema, element, simpleType, pattern, string & positive-integer.

Also, the source namespace has been abbreviated as ‘xsd’ through an ‘xmlns’ declaration.
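How a namespace-aware parser sees fragment 1 can be sketched with Python's standard ElementTree module. The parser replaces each prefix with the namespace name it stands for, written as {namespace}localname. The helper declared_names is made up for this example, and the schema text is a shortened copy of fragment 1.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/1999/XMLSchema"
ACC_NS = "http://www.SampleStore.com/Account"

SCHEMA = f"""<xsd:schema targetNamespace='{ACC_NS}'
            xmlns:xsd='{XSD_NS}'
            xmlns:ACC='{ACC_NS}'>
  <xsd:element name='InvoiceNo' type='xsd:positive-integer'/>
  <xsd:element name='ProductID' type='ACC:ProductCode'/>
</xsd:schema>"""

def declared_names(schema_text):
    """List the element names a schema declares, qualified by its target namespace."""
    root = ET.fromstring(schema_text)
    target = root.get("targetNamespace")
    elements = root.findall(f"{{{XSD_NS}}}element")
    return [f"{{{target}}}{el.get('name')}" for el in elements]
```

The ‘xsd’ prefix itself never reaches the application; only the namespace name does, which is why any other prefix bound to the same name would work equally well.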

Namespaces (cont.)

Eg: with multiple source namespace - Fragment code: 2

<schema targetNamespace='http://www.SampleStore.com/Account'
        xmlns='http://www.w3.org/1999/XMLSchema'
        xmlns:ACC='http://www.SampleStore.com/Account'
        xmlns:PART='http://www.PartnerStore.com/PartsCatalog'>
  <import namespace='http://www.PartnerStore.com/PartsCatalog'
          schemaLocation='http://www.ProductStandards.org/repository/alpha.xsd'/>
  <element name='InvoiceNo' type='positive-integer'/>
  <element name='ProductID' type='ACC:ProductCode'/>
  <simpleType name='ProductCode' base='string'>
    <pattern value='[A-Z]{1}\d{6}'/>
  </simpleType>
  <element name='stickyGlue' type='PART:SuperGlueType'/>
</schema>

Elements, Attributes & Content Model

• Element: It has a name and a content model (defined by its type)

• Type: Simple: cannot have elements or attributes in its value

Complex: can embed elements / associate attributes

• There is a major distinction between definitions, which create new types (both simple and complex), and declarations, which enable elements and attributes with specific names and types (both simple and complex) to appear in document instances

Eg: Element declarations using built-in simple types,

<element name='age' type='integer'/>

<element name='price' type='decimal'/>

Elements, Attributes & Content Model (cont.)

Complex type:

Eg:

<element name='price'>
  <complexType base='decimal' derivedBy='extension'>
    <attribute name='currency' type='string'/>
  </complexType>
</element>

In an XML instance document, we can write,

<price currency='US'>45.50</price>

Elements, Attributes & Content Model (cont.)

A comparison of complex data types in DTD and XML Schema:

XML document,

<Book>
  <Title>Cool XML</Title>
  <Author>Cool Guy</Author>
</Book>

DTD,

<!ELEMENT Book (Title, Author)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Author (#PCDATA)>

XML Schema,

<element name='Book' type='BookType'/>
<complexType name='BookType'>
  <element name='Title' type='string'/>
  <element name='Author' type='string'/>
</complexType>

Elements, Attributes & Content Model (cont.)

Constraints: Schema offers greater flexibility than DTD for expressing constraints on the content model.

Two constraints that are predefined in XML Schema are minOccurs and maxOccurs.

Let's study this with an example,

<element name='Title' type='string'/>
<element name='Author' type='string'/>
<element name='Book'>
  <complexType>
    <element ref='Title' minOccurs='0'/>
    <element ref='Author' maxOccurs='2'/>
  </complexType>
</element>
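In the Book declaration above, an omitted minOccurs or maxOccurs defaults to 1, so Title may appear 0 or 1 times and Author 1 or 2 times. The Python sketch below checks only those counts, not the order of the children; the function name occurrence_errors is made up for this example.

```python
import xml.etree.ElementTree as ET

# (min, max) occurrence bounds read off the Book declaration.
BOOK_BOUNDS = {"Title": (0, 1), "Author": (1, 2)}

def occurrence_errors(book_xml):
    """Return one message per child element whose count is out of bounds."""
    book = ET.fromstring(book_xml)
    errors = []
    for name, (low, high) in BOOK_BOUNDS.items():
        count = len(book.findall(name))
        if not low <= count <= high:
            errors.append(f"{name}: found {count}, expected {low}..{high}")
    return errors
```

A Book with no Title is accepted, while a Book with no Author or with three Authors is reported.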

Delving into Simple Types

• Simple types like ‘string’ & ‘decimal’ are built into XML Schema, while others are derived from the built-ins.

• New simple types can be defined by restricting an existing simple type.

Eg: A new integer type whose value ranges from 10000 to 99999,

<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="10000"/>
    <xsd:maxInclusive value="99999"/>
  </xsd:restriction>
</xsd:simpleType>
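The effect of the two facets on myInteger can be sketched in Python as below. The check first confirms the text looks like an integer (optional sign, then digits) and then applies the inclusive bounds. The function name is made up for this example.

```python
import re

# Lexical form of an integer: optional sign followed by decimal digits.
INTEGER = re.compile(r"[+-]?[0-9]+")

def is_my_integer(text):
    """True when text is an integer within minInclusive 10000 and maxInclusive 99999."""
    text = text.strip()
    if INTEGER.fullmatch(text) is None:
        return False
    return 10000 <= int(text) <= 99999
```

Both bounds are inclusive, so 10000 and 99999 themselves are valid.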

Delving into Simple Types (cont.)

List Type:

• XML Schema has the concept of a list type.

• A list type is a sequence of atomic values (integer, string, etc.), so each item in the sequence is meaningful on its own.

• XML Schema has three built-in list types: NMTOKENS, IDREFS, and ENTITIES.

• New list types can be created by derivation from atomic types,

<xsd:simpleType name="listOfMyIntType">
  <xsd:list itemType="xsd:integer"/>
</xsd:simpleType>

And an element that conforms to listOfMyIntType is:

<listOfMyInt>203 7 9707 30</listOfMyInt>
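A list value is whitespace-separated, so reading listOfMyInt comes down to splitting the text and converting each item. The Python sketch below shows this; the function name parse_int_list is made up for this example.

```python
import xml.etree.ElementTree as ET

def parse_int_list(text):
    """Split a list-type value on whitespace and convert each item to int."""
    return [int(item) for item in text.split()]

element = ET.fromstring("<listOfMyInt>203 7 9707 30</listOfMyInt>")
values = parse_int_list(element.text)
```

Each of the four items is an integer in its own right, which is what sets a list type apart from a plain string containing spaces.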

Delving into Simple Types (cont.)

Union Type: This type allows an element or attribute value to be one or more instances of one type drawn from the union of multiple atomic and list types.

Eg: This example creates a union type for representing American states as letter abbreviations (string) or lists of numeric codes. The zipUnion union type is built from one atomic type and one list type,

<xsd:simpleType name="zipUnion">
  <xsd:union memberTypes="xsd:string listOfMyIntType"/>
</xsd:simpleType>

Some valid instances of an element, say ‘state’, of type ‘zipUnion’ are,

<state>WV</state>

<state>26505 26504 26599</state>
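A union value is checked against its member types in the order they are listed, and the first one that accepts it wins. Since a plain string accepts any text, the Python sketch below assumes a narrower first member, a two-letter uppercase code, so that both branches can be seen. That narrower type and the function name are assumptions made for this example, and list items are limited to unsigned digits for brevity.

```python
import re

# Hypothetical first member type: a two-letter uppercase state code.
STATE_CODE = re.compile(r"[A-Z]{2}")

def classify_zip_union(text):
    """Try each member type in declaration order; the first match wins."""
    value = text.strip()
    if STATE_CODE.fullmatch(value):
        return ("state-code", value)
    items = value.split()
    if items and all(item.isascii() and item.isdigit() for item in items):
        return ("int-list", [int(item) for item in items])
    raise ValueError(f"not a valid zipUnion value: {text!r}")
```

A value that fits neither member, such as a mix of words and numbers, is rejected.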

Element Content

Elements can contain,

• other elements

• only a simple type of value

• attributes & other elements

• attributes, but only a simple type of value

• other elements + character content

• no content

Element Content (cont.)

With attribute & simple value:

We are trying to define a Schema that will support this,

<internationalPrice currency="EUR">423.46</internationalPrice>

Solution,

<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currency" type="xsd:string" />
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>
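From the application's side, an element of this kind carries two things: a simple value in its content and a string in its attribute. A minimal Python sketch of reading both follows, treating the content as a decimal number; the function name read_price is made up for this example.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

def read_price(xml_text):
    """Return (amount, currency) from an element with simple content."""
    element = ET.fromstring(xml_text)
    if len(element) != 0:
        raise ValueError("simple content cannot contain child elements")
    return Decimal(element.text.strip()), element.get("currency")

price = read_price('<internationalPrice currency="EUR">423.46</internationalPrice>')
```

Decimal is used rather than float so that 423.46 is kept exactly as written.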

Element Content (cont.)

Sub-elements + Character content:

Construct a schema where character data can appear alongside sub-elements and character data is not confined to the deepest sub-element.

We are trying to define this,

<letterBody>
  <salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
  . ....
</letterBody>

Solution,

The elements appearing are declared normally. To enable character data to appear between the child-elements of letterBody, the ‘mixed’ attribute on the type definition is set to true.
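To see what mixed content looks like to a program, the Python sketch below walks the salutation element and lists the character data and child elements in document order. In ElementTree, text before the first child is held in .text, and text after a child is held in that child's .tail. The function name text_pieces is made up for this example.

```python
import xml.etree.ElementTree as ET

SALUTATION = "<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>"

def text_pieces(xml_text):
    """List character data and child elements in document order."""
    element = ET.fromstring(xml_text)
    pieces = [("text", element.text or "")]
    for child in element:
        pieces.append((child.tag, child.text or ""))
        pieces.append(("text", child.tail or ""))
    return pieces
```

For the salutation this yields the text ‘Dear Mr.’, then the name element, then the text ‘.’, showing character data on both sides of a sub-element.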

Annotations

We can annotate schema for the benefit of both human readers and applications.

Elements available for annotation are annotation, documentation & appinfo.

Eg:

<xsd:annotation>
  <xsd:documentation xml:lang="en">
    info to user goes here..
  </xsd:documentation>
</xsd:annotation>

Content Models

• Content model for an <elementType> can be specified by using <element> to refer to other elementTypes.

• XML Schema enables a group of elements to be defined and named, so that the group can be used to build up the content models of complex types.

To illustrate, we take a PurchaseOrderType definition, which gives info about the shipping and billing addresses for a purchase. We use a choice between two alternatives, so that purchase orders may contain either separate shipping and billing addresses, or a single address for those cases in which both addresses are the same,

Content Models (cont.)

<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:choice>
      <xsd:group ref="shipAndBill" />
      <xsd:element name="singleUSAddress" type="USAddress" />
    </xsd:choice>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items" type="Items" />
  </xsd:sequence>
  <xsd:attribute name="orderDate" type="xsd:date" />
</xsd:complexType>

<xsd:group name="shipAndBill">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress" />
    <xsd:element name="billTo" type="USAddress" />
  </xsd:sequence>
</xsd:group>
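The choice at the start of PurchaseOrderType means an order begins with either shipTo followed by billTo, or a single singleUSAddress. The Python sketch below reports which branch an instance uses. It looks only at the leading children and does not check comment or items; the element name purchaseOrder and the function name are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def address_style(order_xml):
    """Report which branch of the choice a purchase order uses."""
    tags = [child.tag for child in ET.fromstring(order_xml)]
    if tags[:2] == ["shipTo", "billTo"]:
        return "shipAndBill"
    if tags[:1] == ["singleUSAddress"]:
        return "singleUSAddress"
    raise ValueError("order must start with shipTo+billTo or singleUSAddress")
```

An order that starts with billTo alone matches neither branch and is rejected.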

Attribute Groups

• Attributes are used to provide information about each item by adding attribute declarations to the element type definition.

• We can also create a named attribute group containing all the desired attributes of an element, and reference this group by name in the element declaration.

Extensibility

XML Schema provides three types of element models with regards to extensibility,

• In the open model, the content and attributes that have been declared for the element are required, but other content and attributes can be present. Authors can add their own attributes and elements to XML documents.

• The refinable model requires the content and attributes that have been declared for the element, and allows for those that have been explicitly declared in the refined sub-types.

• The closed model reflects the status quo of DTD, where additional child elements and attributes not in the element declaration are not allowed to be present.

Summary

• New language to describe content & structure of XML documents

• In addition to all the capabilities of DTD it provides,

• Built-in data types as well as user defined data types

• Element occurrence constraints

• Export / import mechanism for schema constructs

• Extensibility

• Refinement, where elements can use the schema constraint of other elements

History

• Work started early in 1998

• Requirements document by Feb 1999

• Working draft by May 1999

• Latest Proposed Recommendation (standard) submitted as recently as Feb 2001.

• Most browsers, with the exception of IE, do not support XML Schema

• The XML Schema implementation provided with Internet Explorer 5 focuses on syntactic schemas, without support for inheritance or other object-oriented design features.