XML Technology in E-Commerce 2001, Lecture 2: Logical and Physical Structure, Validity, DTD, XML Schema


XML Technology in E-Commerce

Lecture 2

Logical and Physical Structure, Validity, DTD, XML Schema


Lecture Outline

• Logical and Physical Structure of XML Documents;
• Validity;
• DTD
  – Element declarations;
  – Attribute declarations;
• XML Schema
  – Element and Attribute declarations;
  – Simple type definitions;
  – Complex type definitions;


Logical and Physical Structure

• By definition, each XML document has a logical and a physical structure;
• Markup is used to describe both structures;
• The two structures must be properly nested according to the rules of the XML specification;

See “Logical and Physical Structure of XML Documents”


Logical Structure

• An XML Document is an information item;
• Document Logical Structure: represents the information in the way it is perceived by the user (application);


Physical Structure

• An XML Document is also a physical entity;
• The content that we logically perceive can be distributed across several physical entities. They form the physical structure:

Entity 1:
<students>
</students>

Entity 2:
<student> John Smith</student>

Entity 3:
<student> John Smith Jr.</student>

Logical View:
<students>
  <student> John Smith </student>
  <student> John Smith Jr. </student>
</students>
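One way to picture how the three entities combine is with external parsed entities declared in the internal DTD subset. This is only a sketch: the file names student1.xml and student2.xml are hypothetical, and each file is assumed to hold one of the <student> elements shown above.

```xml
<?xml version="1.0"?>
<!-- Entity 1: the document entity. File names below are assumptions. -->
<!DOCTYPE students [
  <!-- Entity 2: assumed to contain <student> John Smith</student> -->
  <!ENTITY student1 SYSTEM "student1.xml">
  <!-- Entity 3: assumed to contain <student> John Smith Jr.</student> -->
  <!ENTITY student2 SYSTEM "student2.xml">
]>
<students>
  &student1;
  &student2;
</students>
```

After a parser that reads external entities has expanded both references, the application sees the single tree shown in the logical view.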


Valid XML Documents

• Well-formedness constraints say nothing about element and attribute names and types, or about the structure of the instance document;
• Validity constraints specify element and attribute names and types, and the document structure;
• Validation can be DTD-based or Schema-based;
• Parsers:
  – Non-validating parsers: check documents for well-formedness only;
  – Validating parsers: check documents for both well-formedness and validity constraints;
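The difference between the two kinds of constraint can be seen in a small document that is well-formed but not valid. The element names here are illustrative only and do not come from the lecture.

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
  <!ELEMENT person (name)>
  <!ELEMENT name (#PCDATA)>
]>
<!-- Well-formed: every tag is closed and properly nested. -->
<!-- Not valid: <nickname> is not declared and is not allowed inside <person>. -->
<person>
  <name>John Smith</name>
  <nickname>Johnny</nickname>
</person>
```

A non-validating parser should accept this document, while a validating parser should report the undeclared nickname element.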


DTD Validation


DTD

• DTD - Document Type Definition;
• A DTD is a grammar for a class of XML documents;
• Document Type Declaration:
  – Contains, or points to, the markup declarations that form the DTD for an XML document;
  – External subset:
    <!DOCTYPE root SYSTEM "myDTD.dtd">
  – Internal subset:
    <!DOCTYPE root [
      ……markup declarations………
    ]>
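A document type declaration may combine both forms. The sketch below assumes that myDTD.dtd declares the root element with character data content; the company entity is a hypothetical addition made in the internal subset.

```xml
<?xml version="1.0"?>
<!-- External subset: myDTD.dtd (assumed to declare <root> as #PCDATA). -->
<!-- Internal subset: one extra, hypothetical entity declaration. -->
<!DOCTYPE root SYSTEM "myDTD.dtd" [
  <!ENTITY company "Example Ltd.">
]>
<root>&company;</root>
```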


DTD: Markup Declarations

• Element type declarations;
• Attribute list declarations;
• Entity declarations - declare the entities that form the document physical structure. See “Logical and Physical Structure of XML Documents”;
• Notation declarations;

The Document Type Declaration can also contain Processing Instructions and Comments
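The four kinds of markup declaration can appear together in one internal subset. The names in this sketch (catalog, logo, companyLogo, logo.gif) are made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE catalog [
  <!-- Comments are allowed between declarations. -->
  <!-- Element type declarations -->
  <!ELEMENT catalog (logo)>
  <!ELEMENT logo EMPTY>
  <!-- Attribute list declaration -->
  <!ATTLIST logo src ENTITY #REQUIRED>
  <!-- Notation declaration -->
  <!NOTATION gif SYSTEM "image/gif">
  <!-- Entity declaration (an unparsed entity that uses the notation) -->
  <!ENTITY companyLogo SYSTEM "logo.gif" NDATA gif>
]>
<catalog>
  <logo src="companyLogo"/>
</catalog>
```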


DTD: Element Type Declaration

Specifies the element type and content:
<!ELEMENT Name contentSpec>

Element’s Content:
– Empty:
  <!ELEMENT homepage EMPTY>
– Any:
  <!ELEMENT container ANY>
– Only elements (element content);
– Mixed;


DTD: Element’s Content - Content Model

• Content Model Building Blocks:
  – Choice:
    (p | list | table | form)
  – Sequence:
    (street, zip, city, country)
  – Occurrence Specifiers: ? (zero or one), + (one or more), * (zero or more)
• Example:
  <!ELEMENT person (name, address+, homepage?, note*)>

See also Deitel 6.4.1, page 139
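To see the person content model at work, here is a sketch of a complete document. The declarations for name, address, homepage and note are assumptions added so that the example stands on its own.

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
  <!ELEMENT person (name, address+, homepage?, note*)>
  <!-- The four declarations below are assumed for illustration. -->
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT address (#PCDATA)>
  <!ELEMENT homepage EMPTY>
  <!ELEMENT note (#PCDATA)>
]>
<person>
  <name>John Smith</name>
  <!-- address+ : at least one, here two -->
  <address>12 Main Street</address>
  <address>P.O. Box 7</address>
  <!-- homepage? : omitted; note* : one occurrence -->
  <note>Prefers e-mail.</note>
</person>
```

Swapping name and address, or leaving out every address, would make the document invalid.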


DTD: Element’s Content - Mixed Content

• Elements with mixed content can contain character data interleaved with other elements, or character data only:
  <!ELEMENT note (#PCDATA | em | strong | abbr)*>
  <!ELEMENT p (#PCDATA | em | i | b | a | ul)*>
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
• Other examples - Deitel 6.4.2, page 143
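A short instance of the note declaration shows text and child elements freely interleaved. The declarations for em, strong and abbr are assumed here to be plain character data.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA | em | strong | abbr)*>
  <!-- Assumed declarations for the child elements. -->
  <!ELEMENT em (#PCDATA)>
  <!ELEMENT strong (#PCDATA)>
  <!ELEMENT abbr (#PCDATA)>
]>
<note>Call <strong>before</strong> noon, <em>not</em> after. Send the <abbr>XML</abbr> file.</note>
```

The children may appear in any order and any number of times, which is the limitation the DTD summary points out.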


DTD: Attribute List Declaration

• Attributes are always associated with a particular element;
• Attribute list declaration format:
  <!ATTLIST elName
    attrName1 attrType1 attrDefault1
    attrName2 attrType2 attrDefault2
    ………………………………… >
• Attribute types:
  – String type;
  – Tokenized type;
  – Enumerated type;


DTD: Attribute Declarations - Attribute Types

• String type:
  <!ATTLIST person age CDATA #REQUIRED>
• Tokenized types:
  – ID, IDREF, IDREFS (Deitel 6.6.1 page 147);
  – ENTITY, ENTITIES (Deitel 6.6.1 page 150, “Logical and Physical Structure of XML Documents”);
  – NMTOKEN, NMTOKENS (Deitel 6.6.1 page 152);
  <!ATTLIST person id ID #REQUIRED>
• Enumerated type:
  <!ATTLIST person gender (M | F) #IMPLIED>
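The three attribute declarations above can be merged into one ATTLIST and exercised in a small document. The people wrapper element and the manager attribute (type IDREF) are hypothetical additions.

```xml
<?xml version="1.0"?>
<!DOCTYPE people [
  <!ELEMENT people (person+)>
  <!ELEMENT person (#PCDATA)>
  <!ATTLIST person
    id      ID      #REQUIRED
    age     CDATA   #REQUIRED
    gender  (M | F) #IMPLIED
    manager IDREF   #IMPLIED>
]>
<people>
  <!-- Each id value must be unique within the document. -->
  <person id="p1" age="45" gender="M">John Smith</person>
  <!-- manager must match the ID of some element in the document. -->
  <person id="p2" age="23" manager="p1">John Smith Jr.</person>
</people>
```

Because age is CDATA, a value such as age="old" would also be valid; a DTD cannot require a number here.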


DTD: Attribute Declarations - Attribute Defaults

Provide information about the attribute’s presence:
• #REQUIRED: the attribute must always be present.
• #IMPLIED: the attribute may be absent. There is no default value.
• Default value: used when the attribute is absent. With #FIXED, the attribute, if present, must have exactly the given value:
  <!ATTLIST list type (ol|ul) "ul">
  <!ATTLIST list type (ol|ul) #FIXED "ul">
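The effect of a default value is easiest to see in an instance. The lists wrapper element in this sketch is an assumption; the ATTLIST is the first one from the slide.

```xml
<?xml version="1.0"?>
<!DOCTYPE lists [
  <!ELEMENT lists (list*)>
  <!ELEMENT list EMPTY>
  <!ATTLIST list type (ol|ul) "ul">
]>
<lists>
  <!-- type is absent, so the parser supplies the default value "ul". -->
  <list/>
  <!-- type is given explicitly and overrides the default. -->
  <list type="ol"/>
</lists>
```

With the #FIXED variant of the declaration, the second list element would be invalid, because its value differs from "ul".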


Summary on DTD validation

• A DTD is a grammar that specifies element and attribute names and types;
• A DTD contains declarations for the Entities and Notations that are used in the document physical structure (see “Logical and Physical Structure of XML Documents”);
• Mixed content models cannot constrain the order of sub-elements;
• The set of attribute value types doesn’t include primitive data types such as integer, date, time, etc.

Demo - DTD validation with XML Spy

Read: Deitel 6, “Logical and Physical Structure of XML Documents”

Assignment: Deitel Ex 6.6 and Ex 6.7, page 164


Lecture Outline

• Logical and Physical Structure of XML Documents;
• Validity;
• DTD
  – Element declarations;
  – Attribute declarations;
• XML Schema
  – Element and Attribute declarations;
  – Simple type definitions;
  – Complex type definitions;


Schema Validation


XML Schema

• An XML Schema constrains the structure of XML documents and the names and types of their elements and attributes;
• There are several schema proposals. We will discuss W3C XML Schema;
• The Schema specification defines an abstract data model for schemas and the corresponding XML representation;
• A schema is a set of components;
• There are 13 schema components divided into three groups:
  – Primary components;
  – Secondary components;
  – Helper components;


Schema: XML Representation

• schema element:
  <xs:schema xmlns:xs="http://www.w3.org/2000/10/XMLSchema" version="1.0">
    <xs:attribute ……>
  </xs:schema>
• Current namespace URI (30 March, no support in XML Spy 3.5): http://www.w3.org/2001/XMLSchema
• Components:
  – Element declarations;
  – Attribute declarations;
  – Simple type definitions;
  – Complex type definitions;
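A minimal schema document containing one component of each kind might look as follows. This sketch uses the 2001 namespace URI (not the 2000/10 one supported by XML Spy 3.5), and all names in it (person, personType, ageType, id) are illustrative.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.0">

  <!-- Element declaration -->
  <xs:element name="person" type="personType"/>

  <!-- Attribute declaration (global, so it can be referenced with ref) -->
  <xs:attribute name="id" type="xs:ID"/>

  <!-- Simple type definition -->
  <xs:simpleType name="ageType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Complex type definition -->
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="age" type="ageType"/>
    </xs:sequence>
    <xs:attribute ref="id" use="required"/>
  </xs:complexType>

</xs:schema>
```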


Schema: Element Declaration

• Syntax:
  <element name="myElement" type="myType"/>
  <element ref="myElement"/>
• Occurrence: minOccurs and maxOccurs attributes
  <element ref="myElement" minOccurs="2" maxOccurs="12"/>
  <element ref="myElement" minOccurs="0" maxOccurs="unbounded"/>
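As a sketch of how name, ref and the occurrence attributes fit together, the schema below re-expresses part of the DTD person example (name, address+, note*). The xs prefix and the 2001 namespace URI are assumptions; the slide shows the declarations without a prefix.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global declarations that can be referenced with ref -->
  <xs:element name="address" type="xs:string"/>
  <xs:element name="note" type="xs:string"/>

  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <!-- Local declaration; minOccurs and maxOccurs both default to 1 -->
        <xs:element name="name" type="xs:string"/>
        <!-- Corresponds to address+ in the DTD -->
        <xs:element ref="address" maxOccurs="unbounded"/>
        <!-- Corresponds to note* in the DTD -->
        <xs:element ref="note" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```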


Schema: Attribute Declaration (1)

• Syntax:
  <attribute name="myAttr" type="myAttrType"/>
  <attribute ref="myAttr"/>
• Defaults: use and value attributes
  <attribute ref="myAttr" use="required"/>
  <attribute ref="myAttr" use="default" value="37"/>
  <attribute ref="myAttr" use="fixed" value="37"/>


Schema: Attribute Declaration (2)

Changes in the syntax of attribute occurrence constraints (made on 30 March, currently not supported by XML Spy 3.5)

• Defaults: use, default, fixed attributes
  <attribute ref="myAttr" use="required"/>
  <attribute ref="myAttr" use="optional" default="37"/>
  <attribute ref="myAttr" fixed="37"/>
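In the newer syntax, the three cases could be declared on one element as sketched below. The element and attribute names are hypothetical, and the 2001 namespace URI is assumed.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="list">
    <xs:complexType>
      <!-- Must always be present -->
      <xs:attribute name="id" type="xs:ID" use="required"/>
      <!-- Optional; the value 37 is assumed when the attribute is absent -->
      <xs:attribute name="size" type="xs:integer" default="37"/>
      <!-- Optional; if present, the value must be exactly 1.0 -->
      <xs:attribute name="version" type="xs:string" fixed="1.0"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Unlike the DTD CDATA type, the size attribute here is checked as an integer.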


Schema: Type Definitions

• XML Schema provides two kinds of type definition:
  – Simple types - specify constraints on strings that can be used as values of attributes and of elements with only character data content;
  – Complex types - specify the attributes and content model of document elements;
• Type definition hierarchy:
  – Types defined by restriction;
  – Types defined by extension;
  – Root type - anyType;


Schema: Simple Types

• Usage - for attribute values and for the content of elements that have no attributes and no children:
  <phone>222-33-22-444-1</phone>
  <age>23</age>
• A set of built-in simple datatypes is defined in the XML Schema Part 2: Datatypes specification (see XML Schema Primer, Appendix B, Table b1.a);
• Each simple type is a restriction of another simple type;


Schema: Simple Type Definition

• Syntax:
  <simpleType name="mySimpleType">
    content: (restriction | union | list)
  </simpleType>
• Restrictions:
  <simpleType name="mySimpleType">
    <restriction base="integer">
      <minInclusive value="25"/>
      <maxInclusive value="100"/>
    </restriction>
  </simpleType>
• Facets (see XML Schema Primer, Appendix B);
• List and Union Types (see XML Schema Primer 2.3.1 and 2.3.2);
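The list and union forms are only referenced above, so here is a hedged sketch of all three derivation methods side by side. The type names are hypothetical, and the 2001 namespace URI is assumed.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Restriction: integers from 25 to 100 -->
  <xs:simpleType name="ageType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="25"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Restriction with an enumeration facet: the single string "unknown" -->
  <xs:simpleType name="unknownType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="unknown"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- List: whitespace-separated ageType values, e.g. "25 40 99" -->
  <xs:simpleType name="ageListType">
    <xs:list itemType="ageType"/>
  </xs:simpleType>

  <!-- Union: either an ageType value or the string "unknown" -->
  <xs:simpleType name="ageOrUnknownType">
    <xs:union memberTypes="ageType unknownType"/>
  </xs:simpleType>

  <xs:element name="ages" type="ageListType"/>
  <xs:element name="age" type="ageOrUnknownType"/>

</xs:schema>
```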


Schema: Complex Types

• A complex type definition contains a set of attribute declarations and a content model that specify the attributes and content of a set of elements;
• A complex type can be:
  – a restriction of another complex type;
  – an extension of a simple or complex type;
  – a restriction of the anyType type;
• The extension mechanism adds content parts at the end of the content model of the base definition and/or adds new attribute declarations;
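Extension of a complex type can be sketched as follows. The address types are made-up names for illustration, and the 2001 namespace URI is assumed.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base definition -->
  <xs:complexType name="addressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Extension: country is appended after street and city,
       and a new attribute declaration is added -->
  <xs:complexType name="internationalAddressType">
    <xs:complexContent>
      <xs:extension base="addressType">
        <xs:sequence>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="lang" type="xs:language"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

The effective content model of the derived type is street, city, country, in that order.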


Schema: Complex Type Definition

Elements with text-only content and attributes. Extension of simple types:

<height units="m">125</height>

<complexType name="measurement">
  <simpleContent>
    <extension base="decimal">
      <attribute name="units" type="string"/>
    </extension>
  </simpleContent>
</complexType>

<element name="height" type="measurement"/>


Schema: Element Content Model

• Model Group Elements (see XML Schema Primer 2.7):
  – sequence;
  – choice;
  – all;
  – group;
• Mixed Content:
  <complexType name="noteType" mixed="true">
    <choice maxOccurs="unbounded">
      <element name="em" type="string"/>
      <element name="b" type="string"/>
      <element name="i" type="string"/>
    </choice>
  </complexType>
• Empty Elements (see XML Schema Primer 2.5.3)
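The sequence and choice model groups and an empty element can be combined as in the sketch below. The contact vocabulary is hypothetical, and the 2001 namespace URI is assumed.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="contact">
    <xs:complexType>
      <!-- sequence: children must appear in this order -->
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <!-- choice: exactly one of phone or email -->
        <xs:choice>
          <xs:element name="phone" type="xs:string"/>
          <xs:element name="email" type="xs:string"/>
        </xs:choice>
        <!-- Empty element: a complex type with attributes but no content -->
        <xs:element name="homepage" minOccurs="0">
          <xs:complexType>
            <xs:attribute name="href" type="xs:anyURI" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```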


Schema: Additional Features

• Anonymous Types (Primer 2.4);
• Attribute Groups (Primer 2.8);
• Namespace (Primer 3.1);
• Deriving types by extension (Primer 4.2);
• Schema modularization (Primer 4.1);
• Annotations (Primer 2.6);
• Relating schema and document instances (Primer 5.6, Deitel 7.6)

Demo: Schema validation with XML Spy
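For the last item, one common way to relate an instance to a schema without a target namespace is the xsi:noNamespaceSchemaLocation attribute. In this sketch the file name measurement.xsd is hypothetical, and the 2001 instance namespace is assumed; tools based on the 2000/10 draft use a different URI.

```xml
<?xml version="1.0"?>
<!-- The schema location is a hint to the processor, not a requirement. -->
<height xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="measurement.xsd"
        units="m">125</height>
```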


Summary on XML Schema

• Expressed in XML;
• Based on the explicit notion of types for elements and attribute values;
• Provides namespace control;
• Uses extension and restriction for type derivation;
• Lacks support for entities;

Read: Deitel 7, XML Schema Primer (24.10.2000 version)

Skip: Deitel 7.3..7.5, Primer 5.1..5.3, 5.5

Assignment: Write a schema for planner.xml (Deitel 5.9, page 126) and compare with the syntax in Deitel 7.7. Validate with XML Spy. Use Chapter 2 and Appendix B from the Primer, Deitel 7.6