05 - XML - Schemas


  • 8/3/2019 05 - XML - Schemas

    1/208

    eXtensible Markup Language
    Lecturer: Phan Vo Minh Thang, MSc.

    XML Schemas
    http://www.w3.org/TR/xmlschema-0/ (Primer)
    http://www.w3.org/TR/xmlschema-1/ (Structures)
    http://www.w3.org/TR/xmlschema-2/ (Datatypes)
    Thanks to Roger L. Costello


    XML Lectures Notes XML Schemas


    Purpose of XML Schemas (and DTDs)

    Specify:
    - the structure of instance documents: "this element contains these elements, which contain these other elements, etc."
    - the datatype of each element/attribute: "this element shall hold an integer in the range 0 to 12,000" (DTDs don't do too well at specifying datatypes like this)

    Motivation for XML Schemas

    People are dissatisfied with DTDs:
    - It's a different syntax. You write your XML (instance) document using one syntax and the DTD using another syntax --> bad, inconsistent.
    - Limited datatype capability. DTDs support a very limited capability for specifying datatypes. You can't, for example, express "I want the element to hold an integer with a range of 0 to 12,000".
    - Desire for a set of datatypes compatible with those found in databases. DTDs support 10 datatypes; XML Schemas support 44+ datatypes.

    Highlights of XML Schemas

    XML Schemas are a tremendous advancement over DTDs:
    - Enhanced datatypes
      - 44+ versus 10
      - Can create your own datatypes. Example: "This is a new type based on the string type and elements of this type must follow this pattern: ddd-dddd, where 'd' represents a digit".
    - Written in the same syntax as instance documents: less syntax to remember
    - Object-oriented'ish: can extend or restrict a type (derive new type definitions on the basis of old ones)
    - Can express sets, i.e., can define the child elements to occur in any order
    - Can specify element content as being unique (keys on content) and uniqueness within a region
    - Can define multiple elements with the same name but different content
    - Can define elements with nil content
    - Can define substitutable elements, e.g., the "Book" element is substitutable for the "Publication" element.

    Let's Get Started!

    - Convert the BookStore.dtd (next page) to the XML Schema syntax.
      - For this first example we will make a straight, one-to-one conversion, i.e., Title, Author, Date, ISBN, and Publisher will hold strings, just as is done in the DTD.
      - We will gradually modify the XML Schema to use stronger types.


    BookStore.dtd

    [Figure: the DTD vocabulary (ELEMENT, ATTLIST, #PCDATA, ID, NMTOKEN, ENTITY, CDATA) is used to define your new vocabulary (BookStore, Book, Title, Author, Date, ISBN, Publisher).]

    This is the vocabulary that DTDs provide to define your new vocabulary.

    [Figure: the XML Schema vocabulary (element, complexType, schema, sequence, string, integer, boolean) in the http://www.w3.org/2001/XMLSchema namespace is used to define your new vocabulary (BookStore, Book, Title, Author, Date, ISBN, Publisher) in the http://www.books.org namespace (targetNamespace).]

    This is the vocabulary that XML Schemas provide to define your new vocabulary.

    One difference between XML Schemas and DTDs is that the XML Schema vocabulary is associated with a name (namespace). Likewise, the new vocabulary that you define must be associated with a name (namespace). With DTDs neither set of vocabulary is associated with a name (namespace) [because DTDs pre-dated namespaces].

    BookStore.xsd (see example01)

    xsd = XML Schema Definition

    (explanations on succeeding pages)
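    A minimal sketch of what such a schema might look like, assuming a straight one-to-one conversion (BookStore holds one or more Book elements; Book holds Title, Author, Date, ISBN, and Publisher, all strings). The xsd: prefix and the occurrence values on Book are assumptions for illustration, not a copy of example01.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
    <!-- BookStore holds one or more Book elements (assumed) -->
    <xsd:element name="BookStore">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <!-- Book holds the five child elements, in this order -->
    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="Title"/>
                <xsd:element ref="Author"/>
                <xsd:element ref="Date"/>
                <xsd:element ref="ISBN"/>
                <xsd:element ref="Publisher"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <!-- Leaf elements all hold strings in this first version -->
    <xsd:element name="Title" type="xsd:string"/>
    <xsd:element name="Author" type="xsd:string"/>
    <xsd:element name="Date" type="xsd:string"/>
    <xsd:element name="ISBN" type="xsd:string"/>
    <xsd:element name="Publisher" type="xsd:string"/>
</xsd:schema>
```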


    All XML Schemas have "schema" as the root element.

    The elements and datatypes that are used to construct schemas
    - schema
    - element
    - complexType
    - sequence
    - string
    come from the http://.../XMLSchema namespace.

    XMLSchema Namespace

    [Figure: the http://www.w3.org/2001/XMLSchema namespace contains element, complexType, schema, sequence, string, integer, boolean.]

    Indicates that the elements defined by this schema
    - BookStore
    - Book
    - Title
    - Author
    - Date
    - ISBN
    - Publisher
    are to go in the http://www.books.org namespace.

    Book Namespace (targetNamespace)

    [Figure: the http://www.books.org namespace (targetNamespace) contains BookStore, Book, Title, Author, Date, ISBN, Publisher.]

    This is referencing a Book element declaration. The Book in what namespace? Since there is no namespace qualifier it is referencing the Book element in the default namespace, which is the targetNamespace! Thus, this is a reference to the Book element declaration in this schema.

    The default namespace is http://www.books.org, which is the targetNamespace!

    elementFormDefault="qualified" is a directive to any instance documents which conform to this schema: any elements used by the instance document which were declared in this schema must be namespace qualified.

    Referencing a schema in an XML instance document

    [Instance document excerpt, with values: My Life and Times; Paul McCartney; July, 1998; 94303-12021-43892; McMillin Publishing; ...]

    1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from the http://www.books.org namespace.
    2. Second, with schemaLocation tell the schema-validator that the http://www.books.org namespace is defined by BookStore.xsd (i.e., schemaLocation contains a pair of values).
    3. Third, tell the schema-validator that the schemaLocation attribute we are using is the one in the XMLSchema-instance namespace.
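    The three steps can be sketched as an instance document along the following lines. The element values are the ones that appear on the slide; which element holds which value, and the xsi prefix, are assumptions for illustration.

```xml
<?xml version="1.0"?>
<!-- Step 1: default namespace declaration.
     Step 3: bind the xsi prefix to the XMLSchema-instance namespace.
     Step 2: schemaLocation pairs the namespace with the schema file. -->
<BookStore xmlns="http://www.books.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.books.org BookStore.xsd">
    <Book>
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>July, 1998</Date>
        <ISBN>94303-12021-43892</ISBN>
        <Publisher>McMillin Publishing</Publisher>
    </Book>
    <!-- further Book elements are elided on the slide -->
</BookStore>
```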

    XMLSchema-instance Namespace

    [Figure: the http://www.w3.org/2001/XMLSchema-instance namespace contains schemaLocation, type, noNamespaceSchemaLocation, nil.]

    Referencing a schema in an XML instance document

    [Figure: BookStore.xml, with schemaLocation="http://www.books.org BookStore.xsd", uses elements from namespace http://www.books.org; BookStore.xsd, with targetNamespace="http://www.books.org", defines elements in namespace http://www.books.org.]

    A schema defines a new vocabulary. Instance documents use that new vocabulary.

    Note multiple levels of checking

    BookStore.xml --> BookStore.xsd --> XMLSchema.xsd (schema-for-schemas)

    - Validate that the XML document conforms to the rules described in BookStore.xsd.
    - Validate that BookStore.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas.

    Default Value for minOccurs and maxOccurs

    - The default value for minOccurs is "1".
    - The default value for maxOccurs is "1".

    A declaration that states both values explicitly and one that omits them are equivalent!
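    As a small sketch of the equivalence (the Book, Title, and Author names are borrowed from the BookStore example; the no-namespace wrapper schema is an assumption to keep the fragment self-contained), the two local declarations below carry the same occurrence constraints:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <!-- Occurrence constraints written out explicitly -->
                <xsd:element name="Title" type="xsd:string" minOccurs="1" maxOccurs="1"/>
                <!-- Both attributes omitted, so both default to 1 -->
                <xsd:element name="Author" type="xsd:string"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```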

    Do Lab1

    Qualify XMLSchema, Default targetNamespace

    In the first example, we explicitly qualified all elements from the XML Schema namespace. The targetNamespace was the default namespace.

    [Figure: two vocabularies side by side: http://www.books.org (targetNamespace) with BookStore, Book, Title, Author, Date, ISBN, Publisher; and http://www.w3.org/2001/XMLSchema with element, complexType, schema, sequence, string, integer, boolean.]

    Default XMLSchema, Qualify targetNamespace

    Alternatively (equivalently), we can design our schema so that XMLSchema is the default namespace.

    [Figure: the same two vocabularies: http://www.books.org (targetNamespace) with BookStore, Book, Title, Author, Date, ISBN, Publisher; and http://www.w3.org/2001/XMLSchema with element, complexType, schema, sequence, string, integer, boolean.]

    (see example02)

    Note that http://.../XMLSchema is the default namespace. Consequently, there are no namespace qualifiers on
    - schema
    - element
    - complexType
    - sequence
    - string

    Here we are referencing a Book element. Where is that Book element defined? In what namespace? The bk: prefix indicates what namespace this element is in. bk: has been set to be the same as the targetNamespace.
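    A sketch of this alternative design, assuming the same BookStore content model as before: the XML Schema namespace is the default, so schema, element, complexType, sequence, and string carry no prefix, while references to our own elements are qualified with bk:. This is an illustration under those assumptions, not a copy of example02.

```xml
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.books.org"
        xmlns:bk="http://www.books.org"
        elementFormDefault="qualified">
    <element name="BookStore">
        <complexType>
            <sequence>
                <!-- bk: is bound to the targetNamespace, so this is our Book -->
                <element ref="bk:Book" maxOccurs="unbounded"/>
            </sequence>
        </complexType>
    </element>
    <element name="Book">
        <complexType>
            <sequence>
                <element ref="bk:Title"/>
                <element ref="bk:Author"/>
                <element ref="bk:Date"/>
                <element ref="bk:ISBN"/>
                <element ref="bk:Publisher"/>
            </sequence>
        </complexType>
    </element>
    <!-- string is unprefixed: it comes from the default (XMLSchema) namespace -->
    <element name="Title" type="string"/>
    <element name="Author" type="string"/>
    <element name="Date" type="string"/>
    <element name="ISBN" type="string"/>
    <element name="Publisher" type="string"/>
</schema>
```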

    bk: References the targetNamespace

    [Figure: the bk prefix points at http://www.books.org (targetNamespace), which contains BookStore, Book, Title, Author, Date, ISBN, Publisher; http://www.w3.org/2001/XMLSchema contains element, complexType, schema, sequence, string, integer, boolean.]

    Consequently, bk:Book refers to the Book element in the targetNamespace.

    Do Lab1.1

    Inlining Element Declarations

    In the previous examples we declared an element and then we ref'ed to that element declaration. Alternatively, we can inline the element declarations.

    On the following slide is an alternate (equivalent) way of representing the schema shown previously, using inlined element declarations.

    (see example03)

    Note that we have moved all the element declarations inline, and we are no longer ref'ing to the element declarations. This results in a much more compact schema!

    This way of designing the schema - by inlining everything - is called the Russian Doll design.
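    A sketch of the Russian Doll style, assuming the same BookStore content model and the xsd: prefix used earlier; every declaration is nested inside its parent and nothing is ref'ed:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
    <xsd:element name="BookStore">
        <xsd:complexType>
            <xsd:sequence>
                <!-- Book is declared inline, with an anonymous complexType -->
                <xsd:element name="Book" maxOccurs="unbounded">
                    <xsd:complexType>
                        <xsd:sequence>
                            <xsd:element name="Title" type="xsd:string"/>
                            <xsd:element name="Author" type="xsd:string"/>
                            <xsd:element name="Date" type="xsd:string"/>
                            <xsd:element name="ISBN" type="xsd:string"/>
                            <xsd:element name="Publisher" type="xsd:string"/>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```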


    Do Lab 2

    (see example03)

    Anonymous types (no name)

    The following slide shows an alternate (equivalent) schema which uses a named complexType.

    (see example04)

    Named type

    The advantage of splitting out Book's element declarations and wrapping them in a named type is that now this type can be reused by other elements.
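    A sketch of the named-type variant, assuming the same content model. The type name BookPublication is taken from a later slide in this deck; its use here is an illustration, not a copy of example04.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
    <xsd:element name="BookStore">
        <xsd:complexType>
            <xsd:sequence>
                <!-- Book points at the named type instead of inlining it -->
                <xsd:element name="Book" type="BookPublication" maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <!-- A named type: any other element may also declare type="BookPublication" -->
    <xsd:complexType name="BookPublication">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string"/>
            <xsd:element name="Date" type="xsd:string"/>
            <xsd:element name="ISBN" type="xsd:string"/>
            <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>
```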

    Element A references the complexType foo.

    is equivalent to:

    Element A has the complexType definition inlined in the element declaration.
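    The two equivalent forms might be sketched as follows. The names A and foo come from the slide; the child elements B and C are hypothetical placeholders. First, element A referencing the named complexType foo:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="A" type="foo"/>
    <xsd:complexType name="foo">
        <xsd:sequence>
            <xsd:element name="B" type="xsd:string"/>
            <xsd:element name="C" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>
```

    Second, the same content model with the complexType definition inlined:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="A">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="B" type="xsd:string"/>
                <xsd:element name="C" type="xsd:string"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```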

    but not Both!

    An element declaration can have a type attribute, or a complexType child element, but it cannot have both a type attribute and a complexType child element.

    Summary of Declaring Elements (two ways to do it)

    1. Declare the element with a type attribute.
    2. Declare the element with an inlined complexType child element.

    In both forms:
    - type: a simple type (e.g., xsd:string) or the name of a complexType (e.g., BookPublication)
    - minOccurs: a nonnegative integer
    - maxOccurs: a nonnegative integer or "unbounded"

    Note: minOccurs and maxOccurs can only be used in nested (local) element declarations.
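    A sketch showing both ways side by side as local declarations, so that minOccurs and maxOccurs are allowed. The element names Order, Note, Item, and Name are hypothetical and chosen only for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="Order">
        <xsd:complexType>
            <xsd:sequence>
                <!-- Way 1: a type attribute naming a simple type -->
                <xsd:element name="Note" type="xsd:string" minOccurs="0" maxOccurs="1"/>
                <!-- Way 2: an inlined complexType child, no type attribute -->
                <xsd:element name="Item" minOccurs="1" maxOccurs="unbounded">
                    <xsd:complexType>
                        <xsd:sequence>
                            <xsd:element name="Name" type="xsd:string"/>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```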

    Defining the Date element to be of type string is unsatisfactory (it allows any string value to be input as the content of the Date element, including non-date strings).

    We would like to constrain the allowable content that Date can have. Modify the BookStore schema to restrict the content of the Date element to just date values (actually, year values. See next two slides).

    Similarly, constrain the content of the ISBN element to content of this form: d-ddddd-ddd-d or d-ddd-ddddd-d or d-dd-dddddd-d, where 'd' stands for 'digit'.

    The date datatype

    - A built-in datatype (i.e., schema validators know about this datatype)
    - This datatype is used to represent a specific day (year-month-day)
    - Elements declared to be of type date must follow this form: CCYY-MM-DD
      - range for CC is: 00-99
      - range for YY is: 00-99
      - range for MM is: 01-12
      - range for DD is:
        01-28 if month is 2
        01-29 if month is 2 and the year is a leap year
        01-30 if month is 4, 6, 9, or 11
        01-31 if month is 1, 3, 5, 7, 8, 10, or 12
    - Example: 1999-05-31 represents May 31, 1999

    The gYear datatype

    - A built-in datatype (Gregorian calendar year)
    - Elements declared to be of type gYear must follow this form: CCYY
      - range for CC is: 00-99
      - range for YY is: 00-99
    - Example: 1999 indicates the year 1999

    (see example05)

    [Schema excerpt: the schema element carries targetNamespace="http://www.books.org", xmlns="http://www.books.org", and elementFormDefault="qualified".]

    - Here we are defining a new (user-defined) datatype, called ISBNType.
    - Declaring Date to be of type gYear, and ISBN to be of type ISBNType (defined above).

    <xsd:simpleType name="ISBNType">

    "I hereby declare a new type called ISBNType. It is a restricted form of the string type. Elements declared of this type must conform to one of the following patterns:
    - First Pattern: 1 digit followed by a dash followed by 5 digits followed by another dash followed by 3 digits followed by another dash followed by 1 more digit, or
    - Second Pattern: 1 digit followed by a dash followed by 3 digits followed by another dash followed by 5 digits followed by another dash followed by 1 more digit, or
    - Third Pattern: 1 digit followed by a dash followed by 2 digits followed by another dash followed by 6 digits followed by another dash followed by 1 more digit."

    These patterns are specified using Regular Expressions. In a few slides we will see more of the Regular Expression syntax.
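    The three patterns described above can be sketched as one simpleType with three pattern facets, followed by the Date and ISBN declarations that use the stronger types. The exact regular expressions are an assumption derived from the prose description, not a copy of example05.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
    <xsd:simpleType name="ISBNType">
        <xsd:restriction base="xsd:string">
            <!-- d-ddddd-ddd-d -->
            <xsd:pattern value="\d{1}-\d{5}-\d{3}-\d{1}"/>
            <!-- d-ddd-ddddd-d -->
            <xsd:pattern value="\d{1}-\d{3}-\d{5}-\d{1}"/>
            <!-- d-dd-dddddd-d -->
            <xsd:pattern value="\d{1}-\d{2}-\d{6}-\d{1}"/>
        </xsd:restriction>
    </xsd:simpleType>
    <!-- Date now holds a year; ISBN must match one of the patterns -->
    <xsd:element name="Date" type="xsd:gYear"/>
    <xsd:element name="ISBN" type="ISBNType"/>
</xsd:schema>
```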


    The vertical bar means "or"
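    As a sketch of the same idea, the three ISBN alternatives can be folded into a single pattern facet, with the vertical bar separating them. The regular expression is an assumption built from the three forms d-ddddd-ddd-d, d-ddd-ddddd-d, and d-dd-dddddd-d.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:simpleType name="ISBNType">
        <xsd:restriction base="xsd:string">
            <!-- One facet; each "|" separates an alternative -->
            <xsd:pattern value="\d{1}-\d{5}-\d{3}-\d{1}|\d{1}-\d{3}-\d{5}-\d{1}|\d{1}-\d{2}-\d{6}-\d{1}"/>
        </xsd:restriction>
    </xsd:simpleType>
</xsd:schema>
```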

    When do you use the complexType element and when do you use the simpleType element?

    - Use the complexType element when you want to define child elements and/or attributes of an element.
    - Use the simpleType element when you want to create a new type that is a refinement of a built-in type (string, date, gYear, etc).

    Primitive Datatypes (atomic, built-in), each with an example value or lexical format:

    - string: "Hello World"
    - boolean: {true, false, 1, 0}
    - decimal: 7.08
    - float: 12.56E3, 12, 12560, 0, -0, INF, -INF, NaN
    - double: 12.56E3, 12, 12560, 0, -0, INF, -INF, NaN
    - duration: P1Y2M3DT10H30M12.3S
    - dateTime: format: CCYY-MM-DDThh:mm:ss
    - time: format: hh:mm:ss.sss
    - date: format: CCYY-MM-DD
    - gYearMonth: format: CCYY-MM
    - gYear: format: CCYY
    - gMonthDay: format: --MM-DD

    Note: 'T' is the date/time separator
    INF = infinity
    NaN = not-a-number

    Primitive Datatypes (atomic, built-in), continued:

    - gDay: format: ---DD (note the 3 dashes)
    - gMonth: format: --MM
    - hexBinary: a hex string
    - base64Binary: a base64 string
    - anyURI: http://www.xfront.com
    - QName: a namespace qualified name
    - NOTATION: a NOTATION from the XML spec

    Derived types (each a subtype of a primitive datatype):

    - normalizedString: a string without tabs, line feeds, or carriage returns
    - token: string w/o tabs, l/f, leading/trailing spaces, consecutive spaces
    - language: any valid xml:lang value, e.g., EN, FR, ...
    - IDREFS: must be used only with attributes
    - ENTITIES: must be used only with attributes
    - NMTOKEN: must be used only with attributes
    - NMTOKENS: must be used only with attributes
    - Name
    - NCName: part (no namespace qualifier)
    - ID: must be used only with attributes
    - IDREF: must be used only with attributes
    - ENTITY: must be used only with attributes
    - integer: 456
    - nonPositiveInteger: negative infinity to 0

    Built-in Datatypes (cont.)

    Derived types (each a subtype of a primitive datatype):

    - negativeInteger: negative infinity to -1
    - long: -9223372036854775808 to 9223372036854775807
    - int: -2147483648 to 2147483647
    - short: -32768 to 32767
    - byte: -128 to 127
    - nonNegativeInteger: 0 to infinity
    - unsignedLong: 0 to 18446744073709551615
    - unsignedInt: 0 to 4294967295
    - unsignedShort: 0 to 65535
    - unsignedByte: 0 to 255
    - positiveInteger: 1 to infinity

    Note: the following types can only be used with attributes (which we will discuss later): ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, ENTITY, and ENTITIES.

    Do Lab 3

    A new datatype can be defined from an existing datatype (called the "base" type) by specifying values for one or more of the optional facets for the base type.

    Example. The string primitive datatype has six optional facets:
    - length
    - minLength
    - maxLength
    - pattern
    - enumeration
    - whiteSpace (legal values: preserve, replace, collapse)

    Specifying Facet Values

    1. This creates a new datatype called 'TelephoneNumber'.
    2. Elements of this type can hold string values,
    3. But the string length must be exactly 8 characters long and
    4. The string must follow the pattern: ddd-dddd, where 'd' represents a 'digit'.

    (Obviously, in this example the regular expression makes the length facet redundant.)
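    A sketch of a definition matching the four points above. The wrapper schema with no targetNamespace is an assumption to keep the fragment self-contained.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <!-- 1. a new datatype called TelephoneNumber -->
    <xsd:simpleType name="TelephoneNumber">
        <!-- 2. based on string -->
        <xsd:restriction base="xsd:string">
            <!-- 3. exactly 8 characters long -->
            <xsd:length value="8"/>
            <!-- 4. three digits, a dash, four digits -->
            <xsd:pattern value="\d{3}-\d{4}"/>
        </xsd:restriction>
    </xsd:simpleType>
</xsd:schema>
```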

    This creates a new type called shape. An element declared to be of this type must have either the value circle, or triangle, or square.
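    A sketch of such a type using enumeration facets, assuming string as the base type:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:simpleType name="shape">
        <xsd:restriction base="xsd:string">
            <!-- A value is valid if it equals any one of these -->
            <xsd:enumeration value="circle"/>
            <xsd:enumeration value="triangle"/>
            <xsd:enumeration value="square"/>
        </xsd:restriction>
    </xsd:simpleType>
</xsd:schema>
```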

    The integer datatype has 8 optional facets:
    - totalDigits
    - pattern
    - whiteSpace
    - enumeration
    - maxInclusive
    - maxExclusive
    - minInclusive
    - minExclusive

    This creates a new datatype called 'EarthSurfaceElevation'. Elements declared to be of this type can hold an integer. However, the integer is restricted to have a value between -1290 and 29035, inclusive.
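    A sketch of a definition with these bounds, using the minInclusive and maxInclusive facets:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:simpleType name="EarthSurfaceElevation">
        <xsd:restriction base="xsd:integer">
            <!-- Both bounds are themselves allowed values -->
            <xsd:minInclusive value="-1290"/>
            <xsd:maxInclusive value="29035"/>
        </xsd:restriction>
    </xsd:simpleType>
</xsd:schema>
```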

    Facets:
    - length
    - minLength
    - maxLength
    - pattern
    - enumeration
    - minInclusive
    - maxInclusive
    - minExclusive
    - maxExclusive
    ...

    Sources:
    - string
    - boolean
    - decimal
    - float
    - double
    - duration
    - dateTime
    - time
    ...

    Multiple facets: "and" them together, or "or" them together?

    An element declared to be of type TelephoneNumber must be a string of length=8 and the string must follow the pattern: 3 digits, dash, 4 digits.

    An element declared to be of type shape must be a string with a value of either circle, or triangle, or square.

    Patterns, enumerations => "or" them together
    All other facets => "and" them together

    Thus far we have created a simpleType using one of the built-in datatypes as our base type.

    However, we can create a simpleType that uses another simpleType as the base. See next slide.


    This simpleType uses EarthSurfaceElevation as its base type.
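    A sketch of such a derivation. The derived type's name (BostonAreaSurfaceElevation) and its bounds (0 to 120) are hypothetical values chosen for illustration; EarthSurfaceElevation is repeated so the fragment stands on its own.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:simpleType name="EarthSurfaceElevation">
        <xsd:restriction base="xsd:integer">
            <xsd:minInclusive value="-1290"/>
            <xsd:maxInclusive value="29035"/>
        </xsd:restriction>
    </xsd:simpleType>
    <!-- Hypothetical derived type: narrows the range of its base type -->
    <xsd:simpleType name="BostonAreaSurfaceElevation">
        <xsd:restriction base="EarthSurfaceElevation">
            <xsd:minInclusive value="0"/>
            <xsd:maxInclusive value="120"/>
        </xsd:restriction>
    </xsd:simpleType>
</xsd:schema>
```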

    Sometimes when we define a simpleType we want to require that one (or more) facet have an unchanging value. That is, we want to make the facet a constant.

    simpleTypes which derive from this simpleType may not change this facet.
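    A sketch of a fixed facet, using the fixed="true" attribute that XML Schema allows on facets. The type names ClassSize and SmallClassSize and all the numeric values are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:simpleType name="ClassSize">
        <xsd:restriction base="xsd:nonNegativeInteger">
            <!-- fixed="true": derived types may not change this value -->
            <xsd:minInclusive value="10" fixed="true"/>
            <xsd:maxInclusive value="60"/>
        </xsd:restriction>
    </xsd:simpleType>
    <!-- Allowed: narrows maxInclusive and leaves the fixed facet alone.
         Restating minInclusive with a different value here would be an error. -->
    <xsd:simpleType name="SmallClassSize">
        <xsd:restriction base="ClassSize">
            <xsd:maxInclusive value="40"/>
        </xsd:restriction>
    </xsd:simpleType>
</xsd:schema>
```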

    Error! Cannot change the value of a fixed facet!

    Example. Create a schema element declaration for an elevation element. Declare the elevation element to be an integer with a range -1290 to 29035. An example value is 5240.

    Here's one way of declaring the elevation element:
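    One way to do this, sketched under the assumption that a named simpleType is defined first and the element then refers to it by name. The type name EarthSurfaceElevation is reused from the earlier slide.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:simpleType name="EarthSurfaceElevation">
        <xsd:restriction base="xsd:integer">
            <xsd:minInclusive value="-1290"/>
            <xsd:maxInclusive value="29035"/>
        </xsd:restriction>
    </xsd:simpleType>
    <!-- An instance such as <elevation>5240</elevation> would be valid -->
    <xsd:element name="elevation" type="EarthSurfaceElevation"/>
</xsd:schema>
```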

    Here's an alternative method for declaring elevation:

    The simpleType definition is defined inline; it is an anonymous simpleType definition.

    The disadvantage of this approach is that this simpleType may not be reused by other elements.
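    A sketch of the inline alternative, with the same bounds as before; the simpleType has no name and lives inside the element declaration:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="elevation">
        <!-- Anonymous simpleType: usable only by this element -->
        <xsd:simpleType>
            <xsd:restriction base="xsd:integer">
                <xsd:minInclusive value="-1290"/>
                <xsd:maxInclusive value="29035"/>
            </xsd:restriction>
        </xsd:simpleType>
    </xsd:element>
</xsd:schema>
```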

    Summary of Declaring Elements (three ways to do it)

    The annotation element carries comments for humans and/or for programs.
    - Use documentation for providing a comment to humans
    - Use appinfo for providing a comment to programs
    - The content is any well-formed XML

    Note that annotations have no effect on schema validation.

    The following constraint is not expressible with XML Schema: The value of element A should be greater than the value of element B. So, we need to use a separate tool (e.g., Schematron) to check this constraint. We will express this constraint in the appinfo section (below).

    A should be greater than B
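    A sketch of how such an annotation might be written. The assert element inside appinfo and its test attribute are hypothetical, loosely modelled on Schematron-style rules; a schema validator ignores them, and a separate tool would have to read and enforce them.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:annotation>
        <!-- For humans -->
        <xsd:documentation>
            The value of element A should be greater than the value of element B.
        </xsd:documentation>
        <!-- For programs: hypothetical rule markup, any well-formed XML is allowed -->
        <xsd:appinfo>
            <assert test="A &gt; B">A should be greater than B</assert>
        </xsd:appinfo>
    </xsd:annotation>
    <xsd:element name="A" type="xsd:integer"/>
    <xsd:element name="B" type="xsd:integer"/>
</xsd:schema>
```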

    You cannot put annotations at just any random location in the schema.

    Here are the rules for where an annotation element can go:
    - annotations may occur before and after any global component
    - annotations may occur only at the beginning of non-global components

    [Schema excerpt (targetNamespace="http://www.books.org", xmlns="http://www.books.org", elementFormDefault="qualified") with callouts: can put annotations only at these locations.]

    Suppose that you want to annotate, say, the Date element declaration. What do we do? See next page ...

    This is how to annotate the Date element! Inline the annotation within the Date element declaration.
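    A sketch of an inlined annotation, assuming Date is still declared as a string inside Book; the wording of the documentation text is hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Title" type="xsd:string"/>
                <!-- The annotation is the first child of the declaration it describes -->
                <xsd:element name="Date" type="xsd:string">
                    <xsd:annotation>
                        <xsd:documentation>The date the book was published.</xsd:documentation>
                    </xsd:annotation>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```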

    In the previous example we showed the documentation element with no attributes. Actually, it can have two attributes:
    - source: this attribute contains a URL to a file which contains supplemental information
    - xml:lang: this attribute specifies the language that the documentation was written in

    In the previous example we showed the appinfo element with no attributes. Actually, it can have one attribute:
    - source: this attribute contains a URL to a file which contains supplemental information

    Let's back up for a moment and look at XML Schemas from a "big picture" point of view.

    [Diagram: a typical program consists of code to check the structure and content (datatype) of the data, and code to actually do the work.]

    "In a typical program, up to 60% of the code is spent checking the data!"
    - source unknown

    Continued -->

    [Diagram: the code to check the structure and content of the data is removed, leaving the code to actually do the work.]

    If your data is structured as XML, and there is a schema, then you can hand the data-checking task off to a schema validator.

    Thus, your code is reduced by up to 60%!!!

    Big $$ savings!

    [Diagram: a Supplier and a Consumer exchange a P.O. A Schema Validator checks the P.O. against the P.O. Schema (held at a third-party, neutral web site); once "P.O. is okay", the P.O. is passed to the software that processes it.]

    Data Model
    With XML Schemas you specify how your XML data will be organized, and the datatypes of your data. That is, with XML Schemas you model how your data is to be represented in an instance document.

    A Contract
    Organizations agree to structure their XML documents in conformance with an XML Schema. Thus, the XML Schema acts as a contract between the organizations.

    A rich source of metadata
    An XML Schema document contains lots of data about the data in the XML instance documents, such as the datatype of the data, the data's range of values, and how the data is related to another piece of data (parent/child, sibling relationship), i.e., XML Schemas contain metadata.

    The most obvious use of XML Schemas is to validate your data (so that you don't have to write code to do it).

    However, there are many other uses for XML Schemas. Schemas are a wonderful source of metadata.

    Truly, your imagination is the only limit on their usefulness.

    On the next slide I show how to use the metadata provided by XML Schemas to create a GUI. The slide after that shows how to automatically generate an API using the metadata in XML Schemas. Following that is a slide showing how to create a "smart editor" using XML Schemas.

    [Diagram: a GUI Builder reads the P.O. Schema and produces a P.O. HTML page for the Supplier's Web Server.]

  • 8/3/2019 05 - XML - Schemas

    82/208


    [Figure: an API Builder reads the P.O. Schema and generates a P.O. API.]

  • 8/3/2019 05 - XML - Schemas

    83/208


    [Figure: the P.O. Schema feeds a Smart Editor (e.g., XML Spy).]

    The smart editor helps you build your instance documents. For example, it pops up a menu showing you what is valid next. It knows this by looking at the XML Schema!

  • 8/3/2019 05 - XML - Schemas

    84/208


    [Figure: uses of an XML Schema: validate XML documents, automatic GUI generation, automatic API generation, Smart Editor, Semantic Web???]

    Do Lab 4

  • 8/3/2019 05 - XML - Schemas

    85/208


    Recall that the string datatype has a pattern facet. The value of a pattern facet is a regular expression.

    Below are some examples of regular expressions:

    Regular Expression      Example
    - Chapter \d            - Chapter 1
    - a*b                   - b, ab, aab, aaab, ...
    - [xyz]b                - xb, yb, zb
    - a?b                   - b, ab
    - a+b                   - ab, aab, aaab, ...
    - [a-c]x                - ax, bx, cx

  • 8/3/2019 05 - XML - Schemas

    86/208


    Regular Expression      Example
    - [a-c]x                - ax, bx, cx
    - [-ac]x                - -x, ax, cx
    - [ac-]x                - ax, cx, -x
    - [^0-9]x               - any non-digit char followed by x
    - \Dx                   - any non-digit char followed by x
    - Chapter\s\d           - Chapter followed by a blank followed by a digit
    - (ho){2} there         - hoho there
    - (ho\s){2} there       - ho ho there
    - .abc                  - any (one) char followed by abc
    - (a|b)+x               - ax, bx, aax, bbx, abx, bax, ...

  • 8/3/2019 05 - XML - Schemas

    87/208


    Regular Expression      Example
    - a{1,3}x               - ax, aax, aaax
    - a{2,}x                - aax, aaax, aaaax, ...
    - \w\s\w                - word character (any character other than punctuation, separators, and control characters, so a dash is not included) followed by a whitespace character followed by a word character
    - [a-zA-Z-[Ol]]*        - A string comprised of any lower and upper case letters, except "O" and "l"
    - \.                    - The period "." (Without the backward slash the period means "any character")

  • 8/3/2019 05 - XML - Schemas

    88/208


    Regular Expression      Meaning
    - \t                    - tab
    - \\                    - The backward slash \
    - \|                    - The vertical bar |
    - \-                    - The hyphen -
    - \^                    - The caret ^
    - \?                    - The question mark ?
    - \*                    - The asterisk *
    - \+                    - The plus sign +
    - \{                    - The open curly brace {
    - \}                    - The close curly brace }
    - \(                    - The open paren (
    - \)                    - The close paren )
    - \[                    - The open square bracket [
    - \]                    - The close square bracket ]

  • 8/3/2019 05 - XML - Schemas

    89/208


    Regular Expression      Meaning
    - \p{Lu}                - An uppercase letter, from any language
    - \p{Ll}                - A lowercase letter, from any language
    - \p{N}                 - A number - Roman, fractions, etc
    - \p{Nd}                - A digit from any language
    - \p{P}                 - A punctuation symbol
    - \p{Sc}                - A currency sign, from any language

    "currency sign from any language, followed by one or more digits from any language, optionally followed by a period and two digits from any language"
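
    The quoted description can be written with the category escapes from the table. The following is a minimal sketch, not the slide's own listing: the type name Money is an assumption, and the pattern is one plausible reading of the description.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- Hypothetical type name; the pattern follows the description above -->
      <xsd:simpleType name="Money">
        <xsd:restriction base="xsd:string">
          <!-- currency sign, one or more digits, optional period plus two digits -->
          <xsd:pattern value="\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:schema>
    ```

    Under this pattern a value needs a leading currency sign, so $45.99 is accepted while a bare number is not.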

    $45.99

    300

  • 8/3/2019 05 - XML - Schemas

    90/208


    [1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]

    - [1-9]?[0-9]           - 0 to 99
    - 1[0-9][0-9]           - 100 to 199
    - 2[0-4][0-9]           - 200 to 249
    - 25[0-5]               - 250 to 255

    This regular expression restricts a string to have values between 0 and 255. Such a R.E. might be useful in describing an IP address ...

  • 8/3/2019 05 - XML - Schemas

    91/208


    Datatype for representing IP addresses. Examples: 129.83.64.255, 64.128.2.71, etc.

    This datatype restricts each field of the IP address to have a value between zero and 255, i.e., [0-255].[0-255].[0-255].[0-255]

    Note: in the value attribute (above) the regular expression has been split over two lines. This is for readability purposes only. In practice the R.E. would all be on one line.
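
    A datatype of this kind might look like the following sketch. The type name IPAddress is an assumption; the pattern reuses the 0 to 255 expression from the previous slide, repeated for four dot-separated fields and kept on one line.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- Hypothetical type name; each field is limited to 0..255 -->
      <xsd:simpleType name="IPAddress">
        <xsd:restriction base="xsd:string">
          <xsd:pattern value="(([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:schema>
    ```

    The first group matches a field followed by a literal period three times; the final group matches the last field without a trailing period.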

  • 8/3/2019 05 - XML - Schemas

    92/208


    Consider this XML document: (line break characters are explicitly shown)

    \r\n

    \r\n

    This is a \r\n

    simple paragraph. What \r\n

    do you think of it? \r\n

    \r\n

    When an XML parser reads in this document it "normalizes" ALL line breaks. Thus, after normalization the XML document looks like this:

    \n

    \n

    This is a \n

    simple paragraph. What \n

    do you think of it? \n

    \n

    Consequence: you don't have to be concerned about different platforms using different line break characters since all XML documents will have their line break characters normalized to \n regardless of the platform. (So, if you're writing an XML Schema regular expression you can simply use \n to indicate a line break, regardless of the platform.)

    2. The xml:space="preserve" attribute has no impact on line break normalization.

    3. Suppose that you want a line break character in your XML document, other than \n. For example, suppose that you want \r in your XML document. By default, it would get normalized to \n. To prevent this, use a character reference:
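
    As an illustration, a carriage return can be written as the character reference &#xD;. The element name paragraph below is an assumption made for this sketch.

    ```xml
    <?xml version="1.0"?>
    <!-- &#xD; is a carriage return written as a character reference;
         it is expanded after line-break normalization, so it survives as \r -->
    <paragraph>This is a&#xD;simple paragraph.</paragraph>
    ```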

    Do Lab 5

  • 8/3/2019 05 - XML - Schemas

    93/208


    We can do a form of subclassing complexType definitions. We call this "derived types".

    - derive by extension: extend the parent complexType with more elements
    - derive by restriction: create a type which is a subset of the base type. There are two ways to subset the elements:
      - redefine a base type element to have a restricted range of values, or
      - redefine a base type element to have a more restricted number of occurrences.

  • 8/3/2019 05 - XML - Schemas

    94/208

    Note that BookPublication extends the Publication type, i.e., we are doing Derive by Extension

    (see example06)

  • 8/3/2019 05 - XML - Schemas

    95/208


    Elements declared to be of type BookPublication will have 5 child elements - Title, Author,

    Date, ISBN, and Publisher. Note that the elements in the derived type are appended to the

    elements in the base type.
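
    A minimal sketch of derive by extension, assuming string-typed elements and the occurrence counts shown (neither is taken from the slide):

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- Base type: Title, Author, Date -->
      <xsd:complexType name="Publication">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string" maxOccurs="unbounded"/>
          <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
          <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
      </xsd:complexType>

      <!-- Derived type: ISBN and Publisher are appended to the base content -->
      <xsd:complexType name="BookPublication">
        <xsd:complexContent>
          <xsd:extension base="Publication">
            <xsd:sequence>
              <xsd:element name="ISBN" type="xsd:string"/>
              <xsd:element name="Publisher" type="xsd:string"/>
            </xsd:sequence>
          </xsd:extension>
        </xsd:complexContent>
      </xsd:complexType>
    </xsd:schema>
    ```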

  • 8/3/2019 05 - XML - Schemas

    96/208


    [Figure: Publication contains Title, Author, Date; BookPublication contains Title, Author, Date, ISBN, Publisher.]

  • 8/3/2019 05 - XML - Schemas

    97/208


    [Figure: BookPublication "extends" the Title, Author, Date content with ISBN and Publisher.]

    Do Lab 6

  • 8/3/2019 05 - XML - Schemas

    98/208


    Elements of type SingleAuthorPublication will have 3 child elements - Title, Author, and Date.

    However, there must be exactly one Author element.

    Note that in the restriction type you must repeat all the declarations from the base type (except when

    the base type has an element with minOccurs="0" and the subtype wishes to delete it. See next slide).
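
    A minimal sketch of derive by restriction. The base type's element types and occurrence counts are assumptions; the point is that every declaration is repeated and Author is narrowed to exactly one occurrence.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- Assumed base type: one or more Title, one or more Author, one Date -->
      <xsd:complexType name="Publication">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string" maxOccurs="unbounded"/>
          <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
          <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
      </xsd:complexType>

      <!-- Restriction: all declarations are repeated; Author now occurs exactly once
           (minOccurs and maxOccurs both default to 1) -->
      <xsd:complexType name="SingleAuthorPublication">
        <xsd:complexContent>
          <xsd:restriction base="Publication">
            <xsd:sequence>
              <xsd:element name="Title" type="xsd:string" maxOccurs="unbounded"/>
              <xsd:element name="Author" type="xsd:string"/>
              <xsd:element name="Date" type="xsd:gYear"/>
            </xsd:sequence>
          </xsd:restriction>
        </xsd:complexContent>
      </xsd:complexType>
    </xsd:schema>
    ```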

  • 8/3/2019 05 - XML - Schemas

    99/208


    Note that in this subtype we have eliminated the Author element, i.e., the subtype is just comprised of an unbounded number of Title elements followed by a single Date element.

    If the base type has an element with minOccurs="0", and the subtype wishes to not have that element, then it can simply leave it out.

  • 8/3/2019 05 - XML - Schemas

    100/208


    I simply show the delta (i.e., show those declarations that are changed)?

    What's the advantage of doing derived by restriction if I have to repeat everything? I'm certainly not saving on typing.

    Answer:

    Even though you have to retype everything in the base type, there are advantages to explicitly associating a type with a base type. In a few slides we will see element substitution - the ability to substitute one element for another. A restriction of element substitution is that the substituting element must have a type that derives from the type of the element it is substituting. Thus, it is beneficial to link the types.

    Also, later we will see that an element's content model may be substituted by the content model of derived types. Thus, the content of an element that has been declared to be of type Publication can be substituted with SingleAuthorPublication content, since SingleAuthorPublication derives from Publication. We will discuss this type substitutability in detail later.

  • 8/3/2019 05 - XML - Schemas

    101/208


    ... derivations.

    Rationale: "For example, I may create a complexType and make it publicly available for others to use. However, I don't want them to extend it with their proprietary extensions or subset it to remove, say, copyright information." (Jon Cleaver)

    Publication cannot be extended nor restricted

    Publication cannot be restricted

    Publication cannot be extended
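
    The three cases above correspond to the values of the final attribute on a complexType. A minimal sketch, with an assumed content model:

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- final="#all": Publication cannot be extended nor restricted.
           Alternatives: final="restriction" (cannot be restricted),
                         final="extension"   (cannot be extended). -->
      <xsd:complexType name="Publication" final="#all">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
    ```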

  • 8/3/2019 05 - XML - Schemas

    102/208


    ... declared are those that have a representation in an XML instance document.

    You define components that are used just within the schema document(s). Schema components that are defined are those that have no representation in an XML instance document.

    Declarations:
    - element declarations
    - attribute declarations

    Definitions:
    - type (simple, complex) definitions
    - attribute group definitions
    - model group definitions

  • 8/3/2019 05 - XML - Schemas

    103/208


    ... children of

    Local element declarations, local type definitions:

    These are element declarations/type definitions that are nested within other elements/types.

  • 8/3/2019 05 - XML - Schemas

    104/208


    Global type definition

    Global element declaration

    Local element declarations

    Local type definition
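
    The four labels can be pictured on a small schema like the one below. This is a sketch only; the element and type names are assumptions.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <!-- Global type definition: an immediate child of <schema> -->
      <xsd:complexType name="Publication">
        <xsd:sequence>
          <!-- Local element declaration: nested within a type -->
          <xsd:element name="Title" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>

      <!-- Global element declaration: an immediate child of <schema> -->
      <xsd:element name="Book">
        <!-- Local type definition: anonymous, nested within an element -->
        <xsd:complexType>
          <xsd:sequence>
            <!-- Local element declaration -->
            <xsd:element name="Author" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>

    </xsd:schema>
    ```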

  • 8/3/2019 05 - XML - Schemas

    105/208


    Answer: only global elements/types can be referenced (i.e., reused). Thus, if an element/type is local then it is effectively invisible to the rest of the schema (and to other schemas).

  • 8/3/2019 05 - XML - Schemas

    106/208


    In Boston we use the words "T" and "subway" interchangeably. For example, "we took the T into town", or "we took the subway into town".

    Thus, "T" and "subway" are substitutable. Which one is used may depend upon what part of the state you live in, what mood you're in, or any number of factors.

    We would like to be able to express this substitutability in XML Schemas.

    That is, we would like to be able to declare in a schema an element called "subway", an element called "T", and state that "T" may be substituted for "subway". Instance documents can then use either <subway> or <T>, depending on their preference.

  • 8/3/2019 05 - XML - Schemas

    107/208


    ... head element.

    subway is the head element

    T is substitutable for subway

    So what's the big deal?

    - Anywhere a head element can be used in an instance document, any member of the substitutionGroup can be substituted!
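
    A minimal sketch of such declarations. Only the names subway and T and the type xsd:string come from the slides; the rest is the standard substitutionGroup syntax.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- subway is the head element -->
      <xsd:element name="subway" type="xsd:string"/>
      <!-- T names subway as its substitutionGroup, so T may appear wherever subway is allowed -->
      <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/>
    </xsd:schema>
    ```

    With these declarations, a content model that references subway accepts either <subway>Red Line</subway> or <T>Red Line</T>.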

  • 8/3/2019 05 - XML - Schemas

    108/208


    Red Line

    Instance doc:

    Red Line

    Alternative instance doc (substitute T for subway):

    This example shows the <subway> element being substituted with the <T> element.

  • 8/3/2019 05 - XML - Schemas

    109/208


    is shown a Spanish version of the element.

  • 8/3/2019 05 - XML - Schemas

    110/208


    Red Line

    Schema:

    Instance doc:

    Linea Roja

    Alternative instance doc (customized for our Spanish clients):

  • 8/3/2019 05 - XML - Schemas

    111/208


    If the type of a substitutionGroup element is the same as the head element then you can omit it (the type).

    In our subway example we could have omitted the type attribute in the declaration of the T element since it is the same as subway's type (xsd:string).

  • 8/3/2019 05 - XML - Schemas

    112/208


    This type must be the same as "xxx" or,

    it must be derived from "xxx".

  • 8/3/2019 05 - XML - Schemas

    113/208


  • 8/3/2019 05 - XML - Schemas

    114/208


    PublicationType

    BookType MagazineType

    In order for Book and Magazine to be in a substitutionGroup with Publication, their type (BookType and MagazineType, respectively) must be the same as, or derived from, Publication's type (PublicationType).

  • 8/3/2019 05 - XML - Schemas

    115/208


    Illusions: The Adventures of a Reluctant Messiah

    Richard Bach1977

  • 8/3/2019 05 - XML - Schemas

    116/208


    0-440-34319-4

    Dell Publishing Co.

    Natural Health

    1999

    The First and Last Freedom

    J. Krishnamurti

    1954

    0-06-064831-7

    Harper & Row

    can contain any element in the substitutionGroup with Publication!

  • 8/3/2019 05 - XML - Schemas

    117/208


  • 8/3/2019 05 - XML - Schemas

    118/208


    Red Line

    Instance doc:

    Red Line

    Not allowed!

    Note: there is no error in declaring T to be substitutable with subway.

    The error occurs only when you try to do substitution in the instance document.

  • 8/3/2019 05 - XML - Schemas

    119/208


    1. Transitive: if element A can substitute for element B, and element B can substitute for element C, then element A can substitute for element C.

    A --> B --> C then A --> C

    2. Non-symmetric: if element A can substitute for element B, it is not the case that element B can substitute for element A.

    Do Lab 7

  • 8/3/2019 05 - XML - Schemas

    120/208


  • 8/3/2019 05 - XML - Schemas

    121/208


    <!ATTLIST Book
        Category (autobiography | non-fiction | fiction) #REQUIRED
        InStock (true | false) "false"
        Reviewer CDATA " ">

    BookStore.dtd

  • 8/3/2019 05 - XML - Schemas

    122/208


    (see example07)

    InStock (true | false) "false"

    Reviewer CDATA " "

    Category (autobiography | non-fiction | fiction) #REQUIRED

  • 8/3/2019 05 - XML - Schemas

    123/208


    "Instance documents are required to have the Category attribute(as indicated by use="required"). The value of Category must be

    either autobiography, non-fiction, or fiction (as specified by the

    enumeration facets)."

    Note: attributes can only have simpleTypes (i.e., attributes cannot

    have child elements).
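
    The three DTD attributes could be declared in XML Schema roughly as follows. This is a sketch: the Book content model is reduced to a single assumed Title element, and the attribute types are inferred from the DTD shown earlier.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="Book">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
          </xsd:sequence>
          <!-- Required attribute restricted to three values by enumeration facets -->
          <xsd:attribute name="Category" use="required">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="autobiography"/>
                <xsd:enumeration value="non-fiction"/>
                <xsd:enumeration value="fiction"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:attribute>
          <!-- Optional attributes with default values -->
          <xsd:attribute name="InStock" type="xsd:boolean" default="false"/>
          <xsd:attribute name="Reviewer" type="xsd:string" default=" "/>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
    ```

    Attribute declarations are placed after the content model (here the sequence) inside the complexType.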

  • 8/3/2019 05 - XML - Schemas

    124/208


    [Figure: anatomy of an attribute declaration: name, type (xsd:string, xsd:integer, xsd:boolean, ...), use (required, optional, prohibited), and default or fixed value.]

    The "use" attribute must be optional if you use default or fixed.

  • 8/3/2019 05 - XML - Schemas

    125/208


    When declaring a global attribute do not specify a "use"

  • 8/3/2019 05 - XML - Schemas

    126/208


    Local attribute declaration. Use the "use" attribute here.

    Global attribute declaration. Must NOT have a "use" ("use" only makes sense in the context of an element).

  • 8/3/2019 05 - XML - Schemas

    127/208


    attributeGroup.

  • 8/3/2019 05 - XML - Schemas

    128/208


    (see example08)

  • 8/3/2019 05 - XML - Schemas

    129/208


    ... they are defined (nested) within.

    "bar and boo are attributes of foo"

  • 8/3/2019 05 - XML - Schemas

    130/208


    These attributes apply to the element they are nested within (Book). That is, Book has three attributes - Category, InStock, and Reviewer.

    Do Lab 8.a

  • 8/3/2019 05 - XML - Schemas

    131/208


    Example. Consider this:

    5440

    The elevation element has these two constraints:

    - it has a simple (integer) content
    - it has an attribute called units

    How do we declare elevation? (see next slide)

  • 8/3/2019 05 - XML - Schemas

    132/208


    1. elevation contains an attribute.
       - therefore, we must use <complexType>

    2. However, elevation does not contain child elements (which is what we generally use <complexType> to indicate). Instead, elevation contains simpleContent.

    3. We wish to extend the simpleContent (an integer) ...

    4. with an attribute.
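
    The four steps can be sketched as follows. The attribute's type (xsd:string) and use="required" are assumptions for illustration.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="elevation">
        <xsd:complexType>                              <!-- 1. has an attribute -->
          <xsd:simpleContent>                          <!-- 2. no child elements -->
            <xsd:extension base="xsd:integer">         <!-- 3. extend the integer content -->
              <xsd:attribute name="units" type="xsd:string" use="required"/>  <!-- 4. with an attribute -->
            </xsd:extension>
          </xsd:simpleContent>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
    ```

    An instance such as <elevation units="feet">5440</elevation> would conform to this declaration.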

  • 8/3/2019 05 - XML - Schemas

    133/208


    Let's restrict elevation to hold an integer with a range 0 - 12,000 and let's restrict units to hold either the string "feet" or the string "meters".
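
    One way to express both restrictions is to define two named simpleTypes and use them in the elevation declaration. The type names elevationType and unitsType are assumptions made for this sketch.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- Integer content limited to 0..12000 -->
      <xsd:simpleType name="elevationType">
        <xsd:restriction base="xsd:integer">
          <xsd:minInclusive value="0"/>
          <xsd:maxInclusive value="12000"/>
        </xsd:restriction>
      </xsd:simpleType>

      <!-- units limited to "feet" or "meters" -->
      <xsd:simpleType name="unitsType">
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="feet"/>
          <xsd:enumeration value="meters"/>
        </xsd:restriction>
      </xsd:simpleType>

      <xsd:element name="elevation">
        <xsd:complexType>
          <xsd:simpleContent>
            <xsd:extension base="elevationType">
              <xsd:attribute name="units" type="unitsType" use="required"/>
            </xsd:extension>
          </xsd:simpleContent>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
    ```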

  • 8/3/2019 05 - XML - Schemas

    134/208


  • 8/3/2019 05 - XML - Schemas

    135/208


    An alternative formulation of the above shapes example is to inline the simpleType definition:

  • 8/3/2019 05 - XML - Schemas

    136/208


    An alternate formulation of the above Person example is to create a named complexType and then use that type:

  • 8/3/2019 05 - XML - Schemas

    137/208


  • 8/3/2019 05 - XML - Schemas

    138/208


  • 8/3/2019 05 - XML - Schemas

    139/208


    Example. Large, green, sour

  • 8/3/2019 05 - XML - Schemas

    140/208


    X must be a complexType

    Y must be a simpleType

    versus

    Do Lab 8.b,

    8.c

  • 8/3/2019 05 - XML - Schemas

    141/208


    element declarations, no attribute declarations allowed!

  • 8/3/2019 05 - XML - Schemas

    142/208


    An example showing the use of the element

  • 8/3/2019 05 - XML - Schemas

    143/208


    ...

    Cannot inline the group definition. Instead, you must use a ref here and define the group globally.

  • 8/3/2019 05 - XML - Schemas

    144/208


    xmlns= http://www.travel.org

    elementFormDefault="qualified">

    (see example10)

    Note: the choice is an exclusive-or, that is, transportation can contain

    only one element - either train, or plane, or automobile.
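
    A minimal sketch of such a choice. The element names come from the note above; the string types are an assumption, and the namespace attributes are omitted to keep the fragment short.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="transportation">
        <xsd:complexType>
          <!-- Exactly one of the three child elements may appear -->
          <xsd:choice>
            <xsd:element name="train" type="xsd:string"/>
            <xsd:element name="plane" type="xsd:string"/>
            <xsd:element name="automobile" type="xsd:string"/>
          </xsd:choice>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
    ```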

  • 8/3/2019 05 - XML - Schemas

    145/208


    xmlns= http://www.binary.org

    elementFormDefault="qualified">

    XML Schema:

    Notes:

    1. An element can fix its value, using the fixed attribute.

    2. When you don't specify a value for minOccurs, it defaults to "1". Same for maxOccurs. See the last example (transportation) where we used a <choice> element with no minOccurs or maxOccurs.

    (see example 11)

  • 8/3/2019 05 - XML - Schemas

    146/208


    0

    or equivalently:

    red

    or equivalently:

  • 8/3/2019 05 - XML - Schemas

    147/208


    XML Schema:

  • 8/3/2019 05 - XML - Schemas

    148/208


    p://www. . g

    elementFormDefault="qualified">

    XML Schema:

    <all> means that Book must contain all five child elements, but they may occur in any order.
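
    A minimal sketch of a Book declaration using all. The five child names are taken from the earlier Book examples; the string types are assumptions.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="Book">
        <xsd:complexType>
          <!-- All five children are required, in any order -->
          <xsd:all>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string"/>
            <xsd:element name="Date" type="xsd:string"/>
            <xsd:element name="ISBN" type="xsd:string"/>
            <xsd:element name="Publisher" type="xsd:string"/>
          </xsd:all>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
    ```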

    (see example 12)

  • 8/3/2019 05 - XML - Schemas

    149/208


    The <all> element cannot be nested within either <sequence>, <choice>, or another <all>.

    The contents of <all> must be just elements. It cannot contain <sequence> or <choice>.

    Do Lab 9

    DTD:

  • 8/3/2019 05 - XML - Schemas

    150/208


    targetNamespace http://www.photography.org

    xmlns="http://www.photography.org"

    elementFormDefault="qualified">

    Schema:

    Instance

    doc (snippet):

    Do Lab 10

    (see example 13)

  • 8/3/2019 05 - XML - Schemas

    151/208


    Consequences of having no namespace

    1. In the instance document don't namespace qualify the elements.

    2. In the instance document, instead of using schemaLocation use noNamespaceSchemaLocation.

  • 8/3/2019 05 - XML - Schemas

    152/208


    (see example14)

    My Life and Times

    Paul McCartney

  • 8/3/2019 05 - XML - Schemas

    153/208


    1998

    1-56592-235-2

    McMillin Publishing

    (see example14)

    1. Note that there is no default namespace declaration. So, none of the elements are

    associated with a namespace.

    2. Note that we do not use xsi:schemaLocation (since it requires a pair of values - a namespace

    and a URL to the schema for that namespace). Instead, we use xsi:noNamespaceSchemaLocation.
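
    An instance document of this kind might look like the sketch below. The element names and the schema file name BookStore.xsd are assumptions; the data values are the ones shown above.

    ```xml
    <?xml version="1.0"?>
    <!-- No default namespace declaration: the elements are in no namespace -->
    <BookStore xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="BookStore.xsd">
      <Book>
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>1998</Date>
        <ISBN>1-56592-235-2</ISBN>
        <Publisher>McMillin Publishing</Publisher>
      </Book>
    </BookStore>
    ```

    Note that xsi:noNamespaceSchemaLocation takes a single URL, whereas xsi:schemaLocation takes namespace/URL pairs.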

  • 8/3/2019 05 - XML - Schemas

    154/208


    Validation can apply to the entire XML instance document, or to a single element.

    My Life and TimesPaul McCartney

    1998

    1-56592-235-2

    Macmillan Publishing

    Validating against

    two schemas

    The elements are

    defined in Book.xsd, and

  • 8/3/2019 05 - XML - Schemas

    155/208


    Illusions: The Adventures of a Reluctant Messiah

    Richard Bach1977

    0-440-34319-4

    Dell Publishing Co.

    The First and Last Freedom

    J. Krishnamurti

    19540-06-064831-7

    Harper & Row

    John Doe

    123-45-6789

    Sally Smith

    000-11-2345

    Library.xml

    (see example 15)

    The Book elements are defined in Book.xsd, and the Employee elements are defined in Employee.xsd.

    The other three elements are not defined in any schema!

    1. A schema validator will validate each Book element against Book.xsd.

    2. It will validate each Employee element against Employee.xsd.

    3. It will not validate the other elements.

  • 8/3/2019 05 - XML - Schemas

    156/208


    xsv performs lax validation. Thus, it will accept the instance document on the previous slide (but it will note validation="lax" in its output).

    All the other validators do strict validation. Consequently, they will reject the instance document on the previous slide.

  • 8/3/2019 05 - XML - Schemas

    157/208


    schema (i.e., the schema that is doing the include)

    The net effect of include is as though you had typed all the definitions directly into the containing schema.

    LibraryBook.xsd LibraryEmployee.xsd

    Library.xsd
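
    A sketch of how Library.xsd might include the other two schema documents. The namespace URI and the element names Library, Book, and Employee are assumptions; include requires the included schemas to have the same targetNamespace as the including schema (or none).

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                targetNamespace="http://www.library.org"
                xmlns="http://www.library.org"
                elementFormDefault="qualified">

      <xsd:include schemaLocation="LibraryBook.xsd"/>
      <xsd:include schemaLocation="LibraryEmployee.xsd"/>

      <xsd:element name="Library">
        <xsd:complexType>
          <xsd:sequence>
            <!-- These refs point at element declarations in the included schemas -->
            <xsd:element ref="Book" maxOccurs="unbounded"/>
            <xsd:element ref="Employee" maxOccurs="unbounded"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
    ```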


  • 8/3/2019 05 - XML - Schemas

    158/208


    Library.xsd (see example 16)

    These are

    referencing

    element

    declarations

    in the otherschemas.

    Nice!

  • 8/3/2019 05 - XML - Schemas

    159/208


    Chameleon components.

  • 8/3/2019 05 - XML - Schemas

    160/208


    Product.xsd (see example17)

    Note that this schema has no targetNamespace!

  • 8/3/2019 05 - XML - Schemas

    161/208


    Company.xsd (see example17)

    This schema <include>s Product.xsd. Thus, the components in Product.xsd are namespace-coerced to the company targetNamespace. Consequently, we can reference those components just as though they had originally been declared in a schema with the same targetNamespace.

  • 8/3/2019 05 - XML - Schemas

    162/208


    [Figure: A.xsd (Namespace A), B.xsd (Namespace B), C.xsd]

  • 8/3/2019 05 - XML - Schemas

    163/208


    Camera.xsd

    Nikon.xsd Olympus.xsd Pentax.xsd

    Nikon.xsd

  • 8/3/2019 05 - XML - Schemas

    164/208


    targetNamespace= http://www.olympus.com

    xmlns="http://www.olympus.com"

    elementFormDefault="qualified">

    Olympus.xsd

    Pentax.xsd

    These import elements give us access to the components in these other schemas.

  • 8/3/2019 05 - XML - Schemas

    165/208

    schemaLocation Olympus.xsd />

    Camera.xsd (see example 18)

    Here I am using the body_type that is defined in the Nikon namespace.
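
    A sketch of how Camera.xsd might import and use a component from another namespace. The camera and Nikon namespace URIs and the element names are assumptions; body_type and the file name Nikon.xsd come from the slides.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                targetNamespace="http://www.camera.org"
                xmlns:nikon="http://www.nikon.com"
                elementFormDefault="qualified">

      <!-- import brings in components from a different namespace -->
      <xsd:import namespace="http://www.nikon.com" schemaLocation="Nikon.xsd"/>

      <xsd:element name="camera">
        <xsd:complexType>
          <xsd:sequence>
            <!-- body_type is defined in the imported Nikon namespace -->
            <xsd:element name="body" type="nikon:body_type"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
    ```

    Unlike include, import is used when the other schema has a different targetNamespace, which is why the reference is qualified with the nikon prefix.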

  • 8/3/2019 05 - XML - Schemas

    166/208


    Nikon.xsd

    http://www.olympus.com

    Olympus.xsd

    http://www.pentax.com

    Pentax.xsd">

    Ergonomically designed casing for easy handling

    300mm

    1.2

    1/10,000 sec to 100 sec

    Olympus, and Pentax namespaces.

    Camera.xml (see example 18)

  • 8/3/2019 05 - XML - Schemas

    167/208


    The next slide shows the non-redundant version.

    Ergonomically designed casing for easy handling

  • 8/3/2019 05 - XML - Schemas

    168/208


    300mm

    1.2

    1/10,000 sec to 100 sec

    Camera.xml (non-redundant version)

  • 8/3/2019 05 - XML - Schemas

    169/208


    Do Labs

    11.a, 11.b,

    11.c

  • 8/3/2019 05 - XML - Schemas

    170/208


    Example: For a document containing a Lottery drawing we might have

    12 49 37 99 20 67

    How do we declare the element Numbers ...

    (1) To contain a list of integers, and

    (2) Each integer is restricted to be between 1 and 99, and

    (3) The total number of integers in the list is exactly six.

    July 1


  • 8/3/2019 05 - XML - Schemas

    171/208


    21 3 67 8 90 12

    July 8

    55 31 4 57 98 22

    July 15

    70 77 19 35 44 11

    Lottery.xml (see example19)

  • 8/3/2019 05 - XML - Schemas

    172/208


    Lottery.xsd

  • 8/3/2019 05 - XML - Schemas

    173/208


    Restrict the numbers to maxInclusive value="99"

    d l th l "6"/

  • 8/3/2019 05 - XML - Schemas

    174/208


    Lottery.xsd (see example19)

  • 8/3/2019 05 - XML - Schemas

    175/208


    NumbersList is a list where the type of each item is OneToNinetyNine. LotteryNumbers restricts NumbersList to a length of six (i.e., an element declared to be of type LotteryNumbers must hold a list of numbers, between 1 and 99, and the length of the list must be exactly six).
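
    The three types described above can be sketched as follows. The type names come from the slide text; the choice of xsd:positiveInteger as the base for OneToNinetyNine is an assumption.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- An integer between 1 and 99 -->
      <xsd:simpleType name="OneToNinetyNine">
        <xsd:restriction base="xsd:positiveInteger">
          <xsd:maxInclusive value="99"/>
        </xsd:restriction>
      </xsd:simpleType>

      <!-- A whitespace-separated list of such integers -->
      <xsd:simpleType name="NumbersList">
        <xsd:list itemType="OneToNinetyNine"/>
      </xsd:simpleType>

      <!-- The list restricted to exactly six items -->
      <xsd:simpleType name="LotteryNumbers">
        <xsd:restriction base="NumbersList">
          <xsd:length value="6"/>
        </xsd:restriction>
      </xsd:simpleType>

      <xsd:element name="Numbers" type="LotteryNumbers"/>
    </xsd:schema>
    ```

    With this schema, <Numbers>12 49 37 99 20 67</Numbers> is valid, while a list of five numbers or a list containing 100 is not.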

  • 8/3/2019 05 - XML - Schemas

    176/208


    Alternatively,

    This is read as: "We are creating a new type called LotteryNumbers.

    It is a restriction. At this point we can either use the base

    attribute or a simpleType child element to indicate the type that

    we are restricting (you cannot use both the base attribute and the

    simpleType child element). We want to restrict the type that is a

    list of OneToNinetyNine. We will restrict that type to a length of 6."
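
    The alternative reading above corresponds to a restriction with an inlined simpleType child instead of a base attribute. A minimal sketch, with OneToNinetyNine defined as an assumed restriction of xsd:positiveInteger:

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:simpleType name="OneToNinetyNine">
        <xsd:restriction base="xsd:positiveInteger">
          <xsd:maxInclusive value="99"/>
        </xsd:restriction>
      </xsd:simpleType>

      <xsd:simpleType name="LotteryNumbers">
        <!-- No base attribute: the type being restricted is given inline -->
        <xsd:restriction>
          <xsd:simpleType>
            <xsd:list itemType="OneToNinetyNine"/>
          </xsd:simpleType>
          <xsd:length value="6"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:schema>
    ```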

  • 8/3/2019 05 - XML - Schemas

    177/208


    i.e., lists only apply to simpleTypes

    In the instance document, you must separate each item in a list with white space (blank space, tab, or carriage return).

    The only facets that you may use with a list type are:

    length: use this to specify the length of the list

    minLength: use this to specify the minimum length of the list

    maxLength: use this to specify the maximum length of the list

    enumeration: use this to specify the values that the list may have

    pattern: use this to specify the values that the list may have

    Do Lab 11.d

  • 8/3/2019 05 - XML - Schemas

    178/208


    [Figure: simpleType 1 and simpleType 2 are combined into a union type (simpleType 1 + simpleType 2).]

    Note: you can create a union of more than just two simpleTypes.
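
    A minimal sketch of a union. The type names SmallNumber, SizeWord, and Size are hypothetical and are not taken from the family reunion example that follows.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- Hypothetical member type 1 -->
      <xsd:simpleType name="SmallNumber">
        <xsd:restriction base="xsd:positiveInteger">
          <xsd:maxInclusive value="20"/>
        </xsd:restriction>
      </xsd:simpleType>

      <!-- Hypothetical member type 2 -->
      <xsd:simpleType name="SizeWord">
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="small"/>
          <xsd:enumeration value="large"/>
        </xsd:restriction>
      </xsd:simpleType>

      <!-- A value of type Size is valid if it is valid against either member type -->
      <xsd:simpleType name="Size">
        <xsd:union memberTypes="SmallNumber SizeWord"/>
      </xsd:simpleType>
    </xsd:schema>
    ```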

  • 8/3/2019 05 - XML - Schemas

    179/208


    Cont. -->

  • 8/3/2019 05 - XML - Schemas

    180/208


    Cont. -->

  • 8/3/2019 05 - XML - Schemas

    181/208

    eXtensible Markup LanguageLectu rer : Phan Vo Minh Than g MSc.

    Y2KFamilyReunion.xsd (see example 20)

    Mary

    Pat

    Patti


    Patti

    Christopher

    Elizabeth

    Judy

    Peter

    Tom

    Cheryl

    Marc

    Joe

    Roger

    Y2KFamilyReunion.xml (see example 20)


    Version 2 of Y2KFamilyReunion.xsd (see example 21)


    simpleTypes

    The disadvantage of creating the union type in this manner is that none of the simpleTypes are reusable.
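    The point about reuse can be seen in a sketch where the member types are declared inline inside the union. The type name Size and its members are hypothetical, not the ones in Y2KFamilyReunion.xsd.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    <xsd:simpleType name="Size">
        <xsd:union>
            <!-- Anonymous member 1: cannot be referenced from anywhere else -->
            <xsd:simpleType>
                <xsd:restriction base="xsd:positiveInteger">
                    <xsd:maxInclusive value="9"/>
                </xsd:restriction>
            </xsd:simpleType>
            <!-- Anonymous member 2: likewise not reusable -->
            <xsd:simpleType>
                <xsd:restriction base="xsd:string">
                    <xsd:enumeration value="small"/>
                    <xsd:enumeration value="large"/>
                </xsd:restriction>
            </xsd:simpleType>
        </xsd:union>
    </xsd:simpleType>

</xsd:schema>
```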


    Alternatively,


    defined in the schema for schemas, but it gives you the idea of how the schema-for-schemas might implement it)


    (see example 22)


    2. simpleType that uses another simpleType as the base type:
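    A minimal sketch of this form, assuming two hypothetical types: Percentage restricts a built-in type, and PassingPercentage then uses Percentage as its base.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    <!-- A simpleType derived from a built-in type -->
    <xsd:simpleType name="Percentage">
        <xsd:restriction base="xsd:integer">
            <xsd:minInclusive value="0"/>
            <xsd:maxInclusive value="100"/>
        </xsd:restriction>
    </xsd:simpleType>

    <!-- A simpleType whose base is the simpleType above;
         the new facet must stay within the range of the base -->
    <xsd:simpleType name="PassingPercentage">
        <xsd:restriction base="Percentage">
            <xsd:minInclusive value="50"/>
        </xsd:restriction>
    </xsd:simpleType>

</xsd:schema>
```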


    4. An alternate form of the above, where the list's datatype is specified using an inlined simpleType:
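    A minimal sketch of this form: the list has no itemType attribute, and the item type is given as an inline simpleType child instead. The name PercentageList is hypothetical.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    <xsd:simpleType name="PercentageList">
        <xsd:list>
            <!-- Inline item type: each list item is an integer from 0 to 100 -->
            <xsd:simpleType>
                <xsd:restriction base="xsd:integer">
                    <xsd:minInclusive value="0"/>
                    <xsd:maxInclusive value="100"/>
                </xsd:restriction>
            </xsd:simpleType>
        </xsd:list>
    </xsd:simpleType>

</xsd:schema>
```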


    6. An alternate form of the above, where the datatype UnboundedType is specified using an inline simpleType:
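    A sketch of what this form might look like, assuming the union combines xsd:nonNegativeInteger with a single-value enumeration. The type name MaxOccursType is hypothetical, and this is not how the schema-for-schemas actually defines it.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    <xsd:simpleType name="MaxOccursType">
        <!-- One member is named via memberTypes, the other is inline -->
        <xsd:union memberTypes="xsd:nonNegativeInteger">
            <!-- Inline stand-in for UnboundedType: the single word "unbounded" -->
            <xsd:simpleType>
                <xsd:restriction base="xsd:string">
                    <xsd:enumeration value="unbounded"/>
                </xsd:restriction>
            </xsd:simpleType>
        </xsd:union>
    </xsd:simpleType>

</xsd:schema>
```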


    Now an instance document author can optionally extend (after <Publisher>) the content of <Book> elements with any element.
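    A minimal sketch of a content model opened up in this way. The child element names are inferred from the BookStore examples in these notes, and the xsd:string types are assumptions.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Title" type="xsd:string"/>
                <xsd:element name="Author" type="xsd:string"/>
                <xsd:element name="Date" type="xsd:string"/>
                <xsd:element name="ISBN" type="xsd:string"/>
                <xsd:element name="Publisher" type="xsd:string"/>
                <!-- Wildcard: zero or one extra element may follow Publisher -->
                <xsd:any minOccurs="0"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>

</xsd:schema>
```

    Because processContents defaults to strict, a validator must be able to find a declaration for whatever element fills the wildcard; processContents="lax" or "skip" relaxes that requirement.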


    SchemaRepository.xsd (see example 23)

    Suppose that the instance document author discovers this schema repository, and wants to extend his/her <Book> elements with an element declared in it. He/she can do so! Thus, the instance document will be extended with an element never anticipated by the schema author.

    My Life and Times

    Paul McCartney

    1998

    94303-12021-43892

    This instance document uses components from two different schemas.


    McMillin Publishing

    Roger

    Costello

    Illusions: The Adventures of a Reluctant Messiah

    Richard Bach

    1977

    0-440-34319-4

    Dell Publishing Co.
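    The element tags of the instance document did not survive in this transcript. The sketch below shows how one such Book might look when it draws on two schemas. Both namespace URIs and the names Reviewer, Name, First and Last are assumptions made for illustration; only the text values come from the lines above.

```xml
<?xml version="1.0"?>
<!-- Default namespace: the book schema (assumed URI).
     Prefix rep: the schema repository (assumed URI). -->
<Book xmlns="http://www.example.org/books"
      xmlns:rep="http://www.example.org/repository">
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <ISBN>94303-12021-43892</ISBN>
    <Publisher>McMillin Publishing</Publisher>
    <!-- Extension element, declared in the repository schema and
         admitted by the wildcard at the end of the Book content model -->
    <rep:Reviewer>
        <rep:Name>
            <rep:First>Roger</rep:First>
            <rep:Last>Costello</rep:Last>
        </rep:Name>
    </rep:Reviewer>
</Book>
```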


    namespace set to a namespace URI: allows a new element, provided it's from the specified namespace.

    Note: you can specify a list of namespaces, separated by a blank space. One of the namespaces can be ##targetNamespace (see next).

    namespace="##targetNamespace": allows a new element, provided it's from the namespace that the schema is defining.

    namespace="##any": allows an element from any namespace. This is the default.

    namespace="##local": the new element must come from no namespace.
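    The four settings can be compared side by side in a sketch like the following. The namespace URIs and the complexType names are hypothetical; each type holds a single wildcard so that the variants do not interfere with one another.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/open">

    <!-- Only elements from the listed namespace are accepted -->
    <xsd:complexType name="FromOneNamespace">
        <xsd:sequence>
            <xsd:any namespace="http://www.example.org/other" minOccurs="0"/>
        </xsd:sequence>
    </xsd:complexType>

    <!-- Only elements from this schema's own target namespace -->
    <xsd:complexType name="FromTargetNamespace">
        <xsd:sequence>
            <xsd:any namespace="##targetNamespace" minOccurs="0"/>
        </xsd:sequence>
    </xsd:complexType>

    <!-- Elements from any namespace (same as omitting the attribute) -->
    <xsd:complexType name="FromAnyNamespace">
        <xsd:sequence>
            <xsd:any namespace="##any" minOccurs="0"/>
        </xsd:sequence>
    </xsd:complexType>

    <!-- Only elements that are in no namespace -->
    <xsd:complexType name="FromNoNamespace">
        <xsd:sequence>
            <xsd:any namespace="##local" minOccurs="0"/>
        </xsd:sequence>
    </xsd:complexType>

</xsd:schema>
```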


    Now an instance document author can add any number of attributes onto a <Book> element (as well as extend the element content).
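    A minimal sketch of a declaration that is open for both elements and attributes. The child elements are trimmed to two to keep it short, and their types are assumptions.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Title" type="xsd:string"/>
                <xsd:element name="Publisher" type="xsd:string"/>
                <!-- Element wildcard: optional extension after Publisher -->
                <xsd:any minOccurs="0"/>
            </xsd:sequence>
            <!-- Attribute wildcard: must follow the content model -->
            <xsd:anyAttribute/>
        </xsd:complexType>
    </xsd:element>

</xsd:schema>
```

    As with xsd:any, processContents defaults to strict on xsd:anyAttribute, so each added attribute needs a global declaration that the validator can locate.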


    SchemaRepository.xsd (see example 24)

    Suppose that the instance document author discovers this schema, and wants to extend his/her <Book> elements with an id attribute. He/she can do so! Thus, the instance document will be extended with an attribute never anticipated by the schema author.

    My Life and Times

    Paul McCartney

    1998

    1-56592-235-2

    McMillin Publishing


    Roger

    Costello

    Illusions: The Adventures of a Reluctant Messiah

    Richard Bach

    1977

    0-440-34319-4

    Dell Publishing Co.

    BookStore.xml (see example 24)


    previous schemas where the content of all our elements was always fixed and static.

    We are empowering the instance document author with the ability to define what data makes sense to him/her!


    namespace set to a namespace URI: allows new attributes, provided they're from the specified namespace.

    Note: you can specify a list of namespaces, separated by a blank space. One of the namespaces can be ##targetNamespace (see next).

    namespace="##targetNamespace": allows new attributes, provided they're from the namespace that the schema is defining.

    namespace="##any": allows any attributes. This is the default.

    namespace="##local": allows any unqualified attributes (i.e., the attributes come from no namespace).


    after every element. Further, it is allowing for attribute expansion on every element. Truly, this is the ultimate in openness!
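    A sketch of this kind of fully open content model, under stated assumptions: the target namespace URI is hypothetical, only two child elements are shown, and only Book itself is opened for attributes to keep the example short.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/books"
            elementFormDefault="qualified">

    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Title" type="xsd:string"/>
                <!-- Extension point after Title -->
                <xsd:any namespace="##other" minOccurs="0" maxOccurs="unbounded"/>
                <xsd:element name="Author" type="xsd:string"/>
                <!-- Extension point after Author -->
                <xsd:any namespace="##other" minOccurs="0" maxOccurs="unbounded"/>
            </xsd:sequence>
            <!-- Any attribute may be added to Book -->
            <xsd:anyAttribute/>
        </xsd:complexType>
    </xsd:element>

</xsd:schema>
```

    The wildcards use ##other rather than ##any so that a validator can always tell an extension element apart from the declared element that follows it; a ##any wildcard placed directly before Author would make the content model ambiguous.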


    With this schema we are allowing instance documents to extend only at the end of Book's content model.


    get information about these new capabilities to its vendors ASAP. Further, they will have little motivation to wait for the next meeting of the cellphone community to consider upgrades to cellphone.xsd. They need results NOW. How does open content help? That is described next.

    Suppose that the cellphone schema is declared "open". Immediately NOKIA can extend its instance documents to incorporate data about the new features. How does this change impact the vendor applications that receive the instance documents? The answer is - not at all. In the worst case, the vendor's application will simply skip over the new elements. More likely, however, the vendors are showing the cellphone features in a list box and these new features will be automatically captured with the other features. Let's stop and think about what has just been described. Without modifying the cellphone schema and without touching the vendor's applications, information about the new NOKIA features has been instantly disseminated to the marketplace! Open content in the cellphone schema is the enabler for this rapid dissemination.


    those vendors do not need to upgrade in lock-step

    To wrap up this example, suppose that several months later the cellphone community reconvenes to discuss enhancements to the schema. The new features that NOKIA first introduced into the marketplace are then officially added into the schema. Thus completes the cycle. Changes to the instance documents have driven the evolution of the schema.

    Do Lab 12


    relationships between the elements

    Use of strong typing will capture much of the data content

    The annotations can capture definitions and other explanatory information

    The structure of the "definitions" will always be consistent with the structure used in the schema since they are linked

    Since the schema itself is an XML document, we can use XSLT to extract the annotations and transform the "semantic" information into a format suitable for human consumption
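    A minimal sketch of a definition stored next to the declaration it describes, using the standard xsd:annotation and xsd:documentation elements. The element name Elevation and the wording of the definition are hypothetical.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    <xsd:element name="Elevation" type="xsd:integer">
        <!-- The definition travels with the declaration, so the two
             cannot drift apart structurally -->
        <xsd:annotation>
            <xsd:documentation>
                Height of the surface above sea level, in metres.
            </xsd:documentation>
        </xsd:annotation>
    </xsd:element>

</xsd:schema>
```

    Since the annotation is ordinary XML inside the schema, a stylesheet can select the xsd:documentation elements and render them as a glossary.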


    elements

    Capture data semantics: use the XML annotation element to capture data definitions; use data attributes to capture meta-data

    Make data, meta-data and semantics accessible to a wide variety of clients: use XSL to transform this XML-tagged data to make it readable to humans or other computers

    Ensure parallelism between data structure definition and data semantics: consistency is ensured by having both be part of the same XML schema


    Number of slides: 208 Updated date: 12/02/2006

    Contact: Mr.Phan Vo Minh Thang([email protected])