AdvXML_Lecture01_XML-Namspace

XML, NAMESPACE Lecture 1 Advanced XML 1

Transcript of AdvXML_Lecture01_XML-Namspace

Page 1: AdvXML_Lecture01_XML-Namspace

XML, NAMESPACE

Lecture 1

Advanced XML 1

Page 2: AdvXML_Lecture01_XML-Namspace

Objectives

• Introduction to XML

– Outline the features of markup languages and list their drawbacks

– Define and describe XML

– State the benefits and scope of XML

• Exploring XML

– Describe the structure of an XML document

– Explain the lifecycle of an XML document


Page 3: AdvXML_Lecture01_XML-Namspace

Objectives

• Exploring XML (cont'd)

– State the functions of editors for XML and list the popularly used editors

– State the functions of parsers for XML and list the names of commonly used parsers

– State the functions of browsers for XML and list commonly used browsers

Page 4: AdvXML_Lecture01_XML-Namspace

Objectives

• Working with XML

– Explain the steps for building an XML document

– Define what is meant by well-formed XML

• XML Syntax

– State and describe the use of comments and processing instructions in XML

– Classify character data that is written between tags

– Describe entities, DOCTYPE declarations and attributes


Page 5: AdvXML_Lecture01_XML-Namspace

Objectives

• Describe namespaces

– Define XML namespaces

– Work with namespace syntax

• Problems posed by prefixes

• Placing attributes in a namespace

• Default namespaces

• Overriding default namespaces

Page 6: AdvXML_Lecture01_XML-Namspace

History of Markup

[Figures: documents recorded using paper and pen; typesetters formatting documents; tools used by typesetters to format a document]

Page 7: AdvXML_Lecture01_XML-Namspace

Markup Language

• A markup language defines the rules that help to add meaning to the content and structure of documents.

• Markup is classified as:

– Stylistic markup: determines the presentation of the document

– Structural markup: defines the structure of the document

– Semantic markup: describes the meaning of the content of the document

Page 8: AdvXML_Lecture01_XML-Namspace

SGML

• Generalized Markup Language (GML) is a system for formatting documents.

• GML was refined and standardized as Standard Generalized Markup Language (SGML).

• SGML is the parent of many markup languages, including HTML and XML.

Page 9: AdvXML_Lecture01_XML-Namspace

Features of SGML

• SGML is a metalanguage for describing markup languages; it allows authors to create their own tags that relate to their content.

• It needs a separate file, the Document Type Definition (DTD), that contains all the rules of the language so that documents can be interpreted.

• An SGML application is a markup language derived from SGML.

Page 10: AdvXML_Lecture01_XML-Namspace

Introduction to HTML

• HTML is a markup language.

• Using HTML tags and elements, we can:

– Control the appearance of the page and the content

– Publish online documents and retrieve online information using the links inserted in the HTML document

– Create online forms. These forms can be used to collect information about the user, conduct transactions, and so on

Page 11: AdvXML_Lecture01_XML-Namspace

HTML

• HTML is the best-known markup language derived from SGML.

• It was created to mark up technical papers so that they could be transferred across different platforms for the scientific community.

• It is now also used by non-scientific users who are concerned about their document's presentation.

Page 12: AdvXML_Lecture01_XML-Namspace

Drawbacks of HTML

• Fixed tag set

• Presentation technology does not relate to the contents

• It is flat

• Clogging

• HTML is not international

• Data interchange is difficult

• Does not have a robust linking mechanism

• HTML is not reusable


Page 13: AdvXML_Lecture01_XML-Namspace

HTML and XML code Examples

HTML Code

<UL>
  <LI> TOM CRUISE
  <UL>
    <LI> CLIENT ID : 100
    <LI> COMPANY : XYZ Corp.
    <LI> Email : [email protected]
    <LI> Phone : 3336767
    <LI> Street Address : 25th St.
    <LI> City : Toronto
    <LI> State : Toronto
    <LI> Zip : 20056
  </UL>
</UL>

XML Code

<Details>
  <CONTACT>
    <PERSON_NAME>TOM CRUISE </PERSON_NAME>
    <ID> 100 </ID>
    <Company>XYZ Corp. </Company>
    <Email>[email protected]</Email>
    <Phone> 3336767</Phone>
    <Street> 25th St.</Street>
    <City>Toronto</City>
    <State>Toronto</State>
    <ZIP> 20056</ZIP>
  </CONTACT>
</Details>

Page 14: AdvXML_Lecture01_XML-Namspace

XML -1

• XML stands for Extensible Markup Language.

• It was designed to overcome many of the drawbacks of HTML.

• It allows users to define their own set of tags, and also makes it possible for others (people or programs) to understand them.

• It is more flexible than HTML.

• It inherits features of SGML and combines them with features of HTML.

• It is a simplified subset of SGML.

Page 15: AdvXML_Lecture01_XML-Namspace

XML -2

• XML is a metalanguage: it can be used to describe other markup languages.

• The data contained in an XML file can be displayed in different ways.

• It can also be offered to other applications for further processing.

• Style sheets help transform structured data into different HTML views. This enables data to be displayed on different browsers.


Page 16: AdvXML_Lecture01_XML-Namspace

XML Architecture - 1

• XML supports three-tier architecture for handling and manipulating data.

• It can be generated from existing databases using a scalable three-tier model.

• XML tags represent the logical structure of data that can be interpreted and used in various ways by different applications.

• The middle-tier is used to access multiple databases and translate data into XML.
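
The middle-tier role described above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not the lecture's own implementation: the contact table, its columns, the CONTACTS/CONTACT/NAME/CITY element names and the rows_to_xml helper are all hypothetical, and an in-memory SQLite database stands in for the data tier.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(connection):
    """Middle tier: read rows from the data tier and translate them into XML."""
    root = ET.Element("CONTACTS")
    cursor = connection.execute("SELECT id, name, city FROM contact ORDER BY id")
    for contact_id, name, city in cursor:
        contact = ET.SubElement(root, "CONTACT", {"id": str(contact_id)})
        ET.SubElement(contact, "NAME").text = name
        ET.SubElement(contact, "CITY").text = city
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


# Data tier: an in-memory database stands in for a real one (assumption).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
connection.execute("INSERT INTO contact (id, name, city) VALUES (100, 'TOM CRUISE', 'Toronto')")

xml_text = rows_to_xml(connection)
print(xml_text)
```

The client tier would receive xml_text and decide how to present it; the XML itself carries only the logical structure of the data.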


Page 17: AdvXML_Lecture01_XML-Namspace

XML Architecture -2


Page 18: AdvXML_Lecture01_XML-Namspace

XML – A Universal data format

• HTML is a single markup language, but XML is a family of markup languages.

• Any type of data can be easily defined in XML.

• XML is popular because it supports a wide range of applications and is easy to use.

• XML has a structured data format, which allows it to store complex data


Page 19: AdvXML_Lecture01_XML-Namspace

Benefits of XML

• The three-tier architecture offers easier scalability and better security.

• The benefits of XML are classified into the following:

– Business benefits

– Technological benefits

Page 20: AdvXML_Lecture01_XML-Namspace

Business Benefits

• Information sharing:

– Allows businesses to define data formats in XML

– Provides tools to read, write and transform data between XML and other formats

• XML inside a single application:

– Powerful, flexible and extensible language

• Content delivery:

– Supports different users and channels, like digital TV, phone, web and multimedia kiosks

Page 21: AdvXML_Lecture01_XML-Namspace

Business Benefits

• Other Benefits:

– Data Independence

– Easier to parse

– Reducing Server Load

– Easier to create

– Web Site Content

– Remote Procedure Calls

– e-Commerce


Page 22: AdvXML_Lecture01_XML-Namspace

Technological Benefits

• Re-use of data

• Separation of data and presentation

• Extensibility

• Semantic information

Page 23: AdvXML_Lecture01_XML-Namspace

XML Document Structure


Page 24: AdvXML_Lecture01_XML-Namspace

XML Document Structure


Page 25: AdvXML_Lecture01_XML-Namspace

XML Document Structure

• An XML document is composed of sets of "entities" identified by unique names.

• All documents begin with a root or document entity.

• An entity is a named unit of storage, such as a file or a piece of replacement text, that can be referenced from the document.

• Documents are logically composed of declarations, elements, comments, character references, and processing instructions.

Page 26: AdvXML_Lecture01_XML-Namspace

Well formed and Valid Documents

• An XML document is considered well formed if it satisfies a minimum set of requirements defined in the XML 1.0 specification.

• The requirements ensure that the correct language terms are used in the right manner.

• A valid XML document is a well-formed XML document that also conforms to the rules of a Document Type Definition (DTD).

• A DTD defines the rules that the markup in an XML document must follow.

Page 27: AdvXML_Lecture01_XML-Namspace

XML Document Life cycle

• Important components of the XML document life cycle:

– Editors

– Parsers

– Browsers

Page 28: AdvXML_Lecture01_XML-Namspace

Editors

• The main functions that editors provide:

– Add opening and closing tags to the code

– Check for validity of XML

– Verify XML against a DTD/Schema

– Perform series of transforms over a document

– Color the XML Syntax

– Display the line numbers

– Present the content and hide the code

– Auto-complete words

Page 29: AdvXML_Lecture01_XML-Namspace

Editors

• Popular editors include:

– Oxygen

– XML Writer

– XML Spy

– XML Pro

– XML Mind

– XMetal


Page 30: AdvXML_Lecture01_XML-Namspace

Parsers - 1

• Parsers help the computer interpret an XML file.

<?xml version="1.0"?> <nxn> </nxn>

[Figure: an editor with the XML document; the XML document parsed by the parser; the parsed document viewed in the browser]

• There are two types of parsers:

– Non-validating parsers

– Validating parsers

Page 31: AdvXML_Lecture01_XML-Namspace

Parsers - 2

[Figure: the parser loads the XML file and other related files (like a DTD file), checks whether the XML document is well formed and valid, and produces a data tree]
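
The data tree that a parser produces can be inspected with a short Python sketch. It uses the standard library's xml.etree.ElementTree, which is a non-validating parser, so it checks well-formedness only and does not read a DTD. The describe helper is a hypothetical name introduced for this illustration, and the sample document is a shortened version of the contact example from earlier in the lecture.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<Details>
  <CONTACT>
    <PERSON_NAME>TOM CRUISE</PERSON_NAME>
    <City>Toronto</City>
  </CONTACT>
</Details>"""


def describe(element, depth=0):
    """Return one indented line per element, walking the tree depth first."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(xml_text)   # the parser builds the data tree here
tree_lines = describe(root)
print("\n".join(tree_lines))
```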

Page 32: AdvXML_Lecture01_XML-Namspace

Parsers - 3

• Commonly used parsers are:

• Crimson

• Xerces

• Oracle XML Parser

• JAXP (Java API for XML Processing)

• MSXML


Page 33: AdvXML_Lecture01_XML-Namspace

Browsers

• Commonly used web browsers are as follows:

• Netscape

• Mozilla

• Internet Explorer

• Firefox

• Opera


Page 34: AdvXML_Lecture01_XML-Namspace

Data vs. Markup

<NAME> Tom Cruise </NAME>

• Markup: the tags <NAME> and </NAME>

• Data: the text Tom Cruise

Page 35: AdvXML_Lecture01_XML-Namspace

Creating an XML Document

• To create an XML document:

– State an XML declaration

– Create a root element

– Create the XML code

– Verify the document
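
The four steps above can be sketched with Python's standard xml.etree.ElementTree module. This is only one possible way to follow them: the BOOK root element comes from the lecture's later example, while the TITLE element and its text are placeholder content added for illustration.

```python
import xml.etree.ElementTree as ET

# Step 2: create the root element (BOOK follows the lecture's example).
book = ET.Element("BOOK")

# Step 3: create the XML code. TITLE and its text are placeholder content.
title = ET.SubElement(book, "TITLE")
title.text = "Advanced XML"

# Step 1 happens at serialization time: xml_declaration=True asks
# ElementTree to emit the XML declaration (Python 3.8 or later).
document = ET.tostring(book, encoding="utf-8", xml_declaration=True)
print(document.decode("utf-8"))

# Step 4: verify the document by parsing it again; a document that is
# not well formed would raise xml.etree.ElementTree.ParseError here.
parsed = ET.fromstring(document)
```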


Page 36: AdvXML_Lecture01_XML-Namspace

Creating an XML Document


Page 37: AdvXML_Lecture01_XML-Namspace

Stating an XML Declaration

• Syntax

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

• The 'encoding' and 'standalone' attributes are optional; only the version number is mandatory. When present, they must appear in the order version, encoding, standalone.

• 'standalone' states whether the document relies on external markup declarations, such as an external DTD

• 'encoding' specifies the character encoding used by the author

• If the XML declaration is omitted, version 1.0 is assumed

Page 38: AdvXML_Lecture01_XML-Namspace

Creating a Root Element

• There can only be one root element

• It describes the function of the document

• Every XML document must have a root element

Example

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<BOOK>
</BOOK>

Page 39: AdvXML_Lecture01_XML-Namspace

Creating the XML Code -1

• It is the process of creating our own elements and attributes as required by our application.

• Elements are the basic units of XML content.

• Tags mark where an element starts and ends; the user agent acts on the content enclosed between the start tag and the end tag.

Parts of an element

<TITLE> FPT University </TITLE>

– Opening tag: <TITLE>

– Content: FPT University

– Closing tag: </TITLE>

– Element: the opening tag, the content and the closing tag together

Page 40: AdvXML_Lecture01_XML-Namspace

Creating the XML Code -2

• Rules that govern elements:

– At least one element is required

– XML tags are case sensitive

– End the tags correctly

– Nest tags properly

– Use legal tag names

– Length of markup names

– Define Valid Attributes
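
Several of the rules above can be checked with a parser. The sketch below uses Python's standard xml.etree.ElementTree, a non-validating parser, so it only reports whether a document is well formed. The is_well_formed helper and the sample snippets are hypothetical and written for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "correct": "<BOOK><TITLE>XML</TITLE></BOOK>",
    "no element": "",                                       # at least one element is required
    "case mismatch": "<BOOK><Title>XML</TITLE></BOOK>",     # tags are case sensitive
    "tag not ended": "<BOOK><TITLE>XML</BOOK>",             # end the tags correctly
    "bad nesting": "<BOOK><B><I>XML</B></I></BOOK>",        # nest tags properly
    "illegal tag name": "<BOOK><1TITLE>XML</1TITLE></BOOK>",  # names cannot start with a digit
    "unquoted attribute": "<BOOK count=8></BOOK>",          # attribute values must be quoted
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(name, "->", "well formed" if ok else "rejected")
```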


Page 41: AdvXML_Lecture01_XML-Namspace

Verify the document

• The document should follow the XML rules; otherwise it will not be read by the browser or by any other XML reader

Page 42: AdvXML_Lecture01_XML-Namspace

Comments

• A comment is information meant for the human reader; the processor ignores it.

• Syntax

<!-- Write the comment here -->

Example

<!-- don't show these
<NAME>KATE WINSLET</NAME>
<NAME>NICOLE KIDMAN</NAME>
<NAME>ARNOLD</NAME>
-->
<NAME>TOM CRUISE</NAME>

The example given will display only the name TOM CRUISE; the other names are treated as comments.
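
How a parser treats the commented-out names can be seen in a small Python sketch using the standard xml.etree.ElementTree module, which discards comments by default. The NAMES root element is an assumption added so that the fragment forms a complete document.

```python
import xml.etree.ElementTree as ET

# NAMES is a hypothetical root element wrapped around the lecture's fragment.
xml_text = """<NAMES>
<!-- don't show these
<NAME>KATE WINSLET</NAME>
<NAME>NICOLE KIDMAN</NAME>
<NAME>ARNOLD</NAME>
-->
<NAME>TOM CRUISE</NAME>
</NAMES>"""

root = ET.fromstring(xml_text)
# Only elements outside the comment reach the application.
names = [element.text for element in root.iter("NAME")]
print(names)
```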

Page 43: AdvXML_Lecture01_XML-Namspace

Processing Instruction

• A processing instruction is a bit of information meant for the application using the XML document.

• The parser passes these instructions directly to the application.

• The XML declaration uses the same <? ... ?> syntax, although the XML specification does not class it as a processing instruction.

<?xml-stylesheet type="text/xsl"?>

– Name of application (target): xml-stylesheet

– Instruction information: type="text/xsl"

Page 44: AdvXML_Lecture01_XML-Namspace

Character Data

• The text between the start and end tags is defined as 'character data'.

• Character data may consist of any legal Unicode characters.

• Character data is classified into:

– PCDATA

– CDATA

Page 45: AdvXML_Lecture01_XML-Namspace

PCDATA

• It stands for parsed character data.

• PCDATA is text that will be parsed by a parser.

• Tags inside the text will be treated as markup and entities will be expanded.

Predefined entities

Entity Name    Character
&lt;           <
&gt;           >
&amp;          &
&quot;         "
&apos;         '
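
The behaviour of PCDATA can be observed with Python's standard xml.etree.ElementTree parser: the predefined entities are expanded and a tag inside the text becomes a child element. NOTE, B and the text used here are hypothetical names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# NOTE and B are hypothetical element names used only for this sketch.
xml_text = "<NOTE>&lt;NAME&gt; Tom &amp; &quot;Jerry&quot; &apos;05&apos;<B>bold</B></NOTE>"

root = ET.fromstring(xml_text)
expanded_text = root.text        # entities replaced by the characters they stand for
child_tags = [child.tag for child in root]   # <B> was treated as markup
print(expanded_text)
print(child_tags)
```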

Page 46: AdvXML_Lecture01_XML-Namspace

CDATA

• CDATA stands for character data.

• The parser does not parse the content of a CDATA section as markup.

• CDATA sections make it convenient to include large blocks of text containing special characters.

• The character string ]]> is not allowed within a CDATA block, as it signals the end of the CDATA block.

Example

<SAMPLE>
<![CDATA[
<DOCUMENT>
<NAME>TOM CRUISE</NAME>
<EMAIL>[email protected]</EMAIL>
</DOCUMENT>
]]>
</SAMPLE>
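
The effect of a CDATA section can be checked with a short Python sketch using the standard xml.etree.ElementTree parser. The sample is a shortened version of the SAMPLE example: the tag inside the CDATA section arrives as plain text and no NAME element is created.

```python
import xml.etree.ElementTree as ET

xml_text = "<SAMPLE><![CDATA[<NAME>TOM CRUISE</NAME>]]></SAMPLE>"

root = ET.fromstring(xml_text)
cdata_text = root.text          # the literal characters, angle brackets included
child_count = len(root)         # no child elements were created
print(cdata_text)
```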

Page 47: AdvXML_Lecture01_XML-Namspace

Entities

• Entities are used to avoid typing long pieces of text repeatedly within a document.

• There are two categories of entities:

– General entities

Syntax

<!ENTITY ADDRESS "text that is to be represented by an entity">

– Parameter entities

Syntax

<!ENTITY % ADDRESS "text that is to be represented by an entity">

Page 48: AdvXML_Lecture01_XML-Namspace

Entities


Page 49: AdvXML_Lecture01_XML-Namspace

Examples of Entities

An example of Parameter entities

< CLIENT = "&FPT;" PRODUCT = "&PRODUCT_ID;" QUANTITY = "15">

• Entity reference

– Syntax

%PARAMETER_ENTITY_NAME;

– Example

%address;

An example of a General entity

<!ENTITY full_address "My Address 12 Tenth Ave. Suite 12 Paris, France">

• Entity reference

– Syntax

&ENTITY_NAME;

– Example

&full_address;

Page 50: AdvXML_Lecture01_XML-Namspace

The DOCTYPE declarations

• The <!DOCTYPE [..]> declaration follows the XML declaration in an XML document.

• Syntax

<?xml version="1.0"?>
<!DOCTYPE myDoc [
...declare the entities here....
]>
<myDoc>
...body of the document....
</myDoc>

Example

<!DOCTYPE CUSTOMERS [
<!ENTITY firstFloor "15 Downing St Floor 1">
<!ENTITY secondFloor "15 Downing St Floor 2">
<!ENTITY thirdFloor "15 Downing St Floor 3">
]>
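
To show general entities declared in a DOCTYPE being expanded, here is a minimal Python sketch with the standard xml.etree.ElementTree parser. The entity declarations are taken from the CUSTOMERS example; the CUSTOMER elements in the body are hypothetical, added so that there is something to reference the entities.

```python
import xml.etree.ElementTree as ET

# The CUSTOMER elements are hypothetical; the entity declarations are the lecture's.
xml_text = """<?xml version="1.0"?>
<!DOCTYPE CUSTOMERS [
<!ENTITY firstFloor "15 Downing St Floor 1">
<!ENTITY secondFloor "15 Downing St Floor 2">
]>
<CUSTOMERS>
<CUSTOMER>&firstFloor;</CUSTOMER>
<CUSTOMER>&secondFloor;</CUSTOMER>
</CUSTOMERS>"""

root = ET.fromstring(xml_text)
# Each &name; reference is replaced by the text declared for that entity.
addresses = [customer.text for customer in root.findall("CUSTOMER")]
print(addresses)
```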

Page 51: AdvXML_Lecture01_XML-Namspace

Attributes

• An attribute gives information about an element.

• Attributes are embedded in the element start tag.

• An attribute consists of an attribute name and an attribute value.

Example

<TV count="8">SONY</TV>

<LAPTOP count="10">IBM</LAPTOP>

Page 52: AdvXML_Lecture01_XML-Namspace

XML Namespaces - 1

• Two or more applications on the Internet may have some element names in common. Namespaces help avoid the ambiguity that may arise.

• Namespaces also make it possible to combine documents from different sources and to identify which elements or attributes come from which source.

• A namespace name is only an identifier: it does not instruct the user agent to fetch a DTD, and validation against a DTD is declared separately with DOCTYPE.

Page 53: AdvXML_Lecture01_XML-Namspace

XML Namespaces - 2

• A URI (Uniform Resource Identifier) is used to identify namespaces in XML.

• URIs include Uniform Resource Names (URNs) and Uniform Resource Locators (URLs).

• A URL contains the reference for a document or an HTML page on the web.

• A URN is a persistent, location-independent name that identifies an Internet resource.

Page 54: AdvXML_Lecture01_XML-Namspace

Needs of a Namespace

• Namespaces are used to overcome the conflicts that arise when DTDs are reused and extended.

• Namespaces help standardize and uniquely brand elements and attributes.

• Namespaces use a URI purely as a unique name; the URI is not required to point to a DTD, and the parser does not retrieve it to check validity.

• Namespaces ensure that element names do not conflict and clarify their origins.

Page 55: AdvXML_Lecture01_XML-Namspace


Needs of a Namespace

Page 56: AdvXML_Lecture01_XML-Namspace


Syntax for Namespace

Page 57: AdvXML_Lecture01_XML-Namspace

Syntax for Namespace

• A prefix is associated with the URI that is used as the namespace name.

• Syntax

xmlns:[prefix]="[URI of namespace]"

– xmlns is a reserved attribute name prefix

• Example

xmlns:ins="http://www.fpt.edu.vn"

– A namespace needs to be declared before it is used

– It is usually declared in the root element of the document, although any element may carry the declaration
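
A prefixed namespace declaration can be explored with Python's standard xml.etree.ElementTree, which reports every element name together with its namespace URI in the form {URI}name. In this sketch the ins prefix and its URI come from the example above; the institute and name elements, the mail prefix and the http://example.org/mail URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# institute, name, the mail prefix and its URI are assumptions for illustration.
xml_text = """<ins:institute xmlns:ins="http://www.fpt.edu.vn"
               xmlns:mail="http://example.org/mail">
  <ins:name>FPT University</ins:name>
  <mail:name>Admissions mailbox</mail:name>
</ins:institute>"""

root = ET.fromstring(xml_text)

# Two elements share the local name "name" but stay distinct by namespace.
tags = [child.tag for child in root]
print(tags)

# Lookups map a prefix of our choosing to the namespace URI.
namespaces = {"ins": "http://www.fpt.edu.vn"}
institute_name = root.find("ins:name", namespaces).text
print(institute_name)
```

Note that the parser compares the URI, not the prefix, so a different prefix bound to the same URI would select the same elements.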

Page 58: AdvXML_Lecture01_XML-Namspace

Attributes and Namespaces

• An attribute without a prefix is not in any namespace; the default namespace of its element does not apply to it. An attribute is placed in a namespace by giving it a prefix.

• We can also incorporate attributes from two domains:

<sample
    xmlns="http://www.fpt.edu.vn"
    xmlns:tea_batch="http://www.tea.org">
  <batch-list>
    <batch type="thirdbatch">Evening Batch</batch>
    <batch tea_batch:type="thirdbatch">Tea batch III</batch>
    <batch>Afternoon Batch</batch>
  </batch-list>
</sample>
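
The sample document above can be parsed to see how a default namespace and a prefixed attribute are reported. This Python sketch uses the standard xml.etree.ElementTree module; it is an illustration of parser behaviour rather than part of the original slides.

```python
import xml.etree.ElementTree as ET

xml_text = """<sample xmlns="http://www.fpt.edu.vn"
        xmlns:tea_batch="http://www.tea.org">
  <batch-list>
    <batch type="thirdbatch">Evening Batch</batch>
    <batch tea_batch:type="thirdbatch">Tea batch III</batch>
    <batch>Afternoon Batch</batch>
  </batch-list>
</sample>"""

root = ET.fromstring(xml_text)

# Unprefixed elements take the default namespace declared with xmlns="...".
batches = list(root.iter("{http://www.fpt.edu.vn}batch"))

# Unprefixed attributes stay outside any namespace; prefixed ones carry the URI.
attribute_names = [sorted(batch.attrib) for batch in batches]
print(root.tag)
print(attribute_names)
```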

Page 59: AdvXML_Lecture01_XML-Namspace

Namespace Application

• The XSL syntax makes use of namespaces to identify both its own tags and the formatting vocabulary tags.

• Elements with the xsl: prefix are in the http://www.w3.org/TR/WD-xsl namespace.

• Elements with the fo: prefix are in the http://www.w3.org/TR/WD-xsl/FO namespace.

• These are the working-draft namespaces; the XSLT 1.0 Recommendation uses http://www.w3.org/1999/XSL/Transform.

• XSL is written in XML syntax and uses tags, elements, and attributes.

Page 60: AdvXML_Lecture01_XML-Namspace

Namespace Example

<book xmlns:html="http://www.w3.org/TR/WD-xsl/FO">
  <index>
    <chapter>this is chapter 1</chapter>
    <html:br/>
    <chapter>this is chapter 1</chapter>
  </index>
</book>

Page 61: AdvXML_Lecture01_XML-Namspace


Default Namespace

Page 62: AdvXML_Lecture01_XML-Namspace


Override Default Namespace

Page 63: AdvXML_Lecture01_XML-Namspace

Summary-1

• A markup language defines a set of rules that adds meaning to the content and structure of documents

• XML is extensible, which means that we can define our own set of tags, and make it possible for other parties (people or programs) to know and understand these tags. This makes XML much more flexible than HTML

• XML inherits features from SGML and includes the features of HTML. XML can be generated from existing databases using a scalable three-tier model. XML-based data does not contain information about how data should be displayed

• An XML document is composed of a set of “entities” identified by unique names


Page 64: AdvXML_Lecture01_XML-Namspace

Summary-2

• A well-formed document is one that conforms to the basic rules of XML; a valid document is a well-formed document that conforms to the rules of a DTD (Document Type Definition)

• The parser helps the computer to interpret an XML file

• Steps involved in the building of an XML document are:

– Stating an XML declaration

– Creating a root element

– Creating the XML code

– Verifying the document

• Character data is classified into PCDATA and CDATA


Page 65: AdvXML_Lecture01_XML-Namspace

Summary-3

• Entities are used to avoid typing long pieces of text repeatedly in a document. The two types of entities are:

– General entities

– Parameter entities

• The <!DOCTYPE […]> declaration follows the XML declaration in an XML document.

• An attribute gives information about an element
