Presentation 6: Introduction to XML and related technologies – for use with SOAP / WSDL = Web services

Outline

• Why an XML presentation?
• W3C & legacy of XML – ultra short
• XML markup and Namespaces
• DTDs
• XML Schemas
• DOM/SAX
• The SOAP connection

Why an XML presentation?

• Because SOAP, WSDL & UDDI are based on XML technologies
• A future “home-brew” framework based on XML?
• Important to understand how the APIs work
  • Parsing mechanisms
    • DOM
    • SAX
  • Why it's slow ;)

W3C & the legacy of XML

• World Wide Web Consortium (W3C)
• Founded 1994
• Standardization on the Internet
• First chairman: Tim Berners-Lee
• Member organizations submit proposals
• Ensures standardization of WWW technologies
  • Like: XHTML, XML, XSL, CSS, SOAP etc.
• Members: Microsoft, IBM, Sun, Oracle and many others
• http://www.w3c.org
• Legacy:
  • Standard Generalized Markup Language (SGML)
  • Same legacy as HTML

XML markup

• eXtensible Markup Language
• XML is based on SGML (a subset of it)
• Like SGML, it is for data & structure, not layout (as HTML is)
• XML targets the Internet – but is also being used for application exchange formats (Open Office, XMI) – like CSVs
• XML is a W3C Recommendation
• Structure decided by DTD or Schema (more later)
• Widespread support for XML

Presenting XML documents

• First standalone XML document and its components
• Note: XML documents are “well-formed”

• Please visit http://www.w3schools.com/default.asp for in-depth examples of XML usage

Ingeniørhøjskolen i Århus

Article.xml

<?xml version = "1.0"?>

<!-- Fig. 20.1: article.xml -->
<!-- Article structured with XML -->

<article>

   <title>Simple XML</title>

   <date>September 19, 2001</date>

   <author>
      <firstName>Tem</firstName>
      <lastName>Nieto</lastName>
   </author>

   <summary>XML is pretty easy.</summary>

   <content>Once you have mastered XHTML, XML is easily
      learned. You must remember that XML is not for
      displaying information but for managing information.
   </content>

</article>

Optional XML declaration.

Element article is the root element.

Elements title, date, author, summary and content are child elements of article.
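To make “well-formed” concrete, here is a minimal sketch in Python, chosen only because its standard library ships an XML parser. The helper name is_well_formed and the two sample strings are made up for illustration. A non-validating parser accepts a document only if it has a single root element and all tags are properly nested and closed.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

GOOD = """<article>
  <title>Simple XML</title>
  <author><firstName>Tem</firstName><lastName>Nieto</lastName></author>
</article>"""

# <author> is closed before <lastName>: the tags are not properly nested
BAD = "<article><author><lastName>Nieto</author></lastName></article>"

print(is_well_formed(GOOD))  # True
print(is_well_formed(BAD))   # False
```

Note that this says nothing about validity: the parser does not check which elements are allowed, only that the markup is syntactically sound.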

Browser displaying XML (unformatted)

IE5.5 displaying article.xml.

Use of XML Namespaces

• XML namespaces are used to avoid naming conflicts
• When several different elements are involved
• <book> isn't always a book
• Keyword “xmlns”

Namespace.xml

<?xml version = "1.0"?>

<!-- Fig. 20.4 : namespace.xml -->
<!-- Demonstrating Namespaces -->

<text:directory xmlns:text = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">

   <text:file filename = "book.xml">
      <text:description>A book list</text:description>
   </text:file>

   <image:file filename = "funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width = "200" height = "100"/>
   </image:file>

</text:directory>

Keyword xmlns creates two namespace prefixes, text and image.

URIs (Uniform Resource Identifiers) ensure that a namespace is unique.
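How a parser sees the two file elements can be sketched with Python's standard ElementTree module (Python and the variable names are this sketch's assumptions, not part of the slides). The parser replaces each prefix with its namespace URI, so the two elements end up with different names even though both are written as file.

```python
import xml.etree.ElementTree as ET

NAMESPACE_XML = """<text:directory xmlns:text="urn:deitel:textInfo"
                xmlns:image="urn:deitel:imageInfo">
  <text:file filename="book.xml">
    <text:description>A book list</text:description>
  </text:file>
  <image:file filename="funny.jpg">
    <image:description>A funny picture</image:description>
    <image:size width="200" height="100"/>
  </image:file>
</text:directory>"""

root = ET.fromstring(NAMESPACE_XML)

# ElementTree expands each prefix to "{namespace-URI}local-name"
file_tags = [child.tag for child in root]
print(file_tags)

# Look up elements by prefix via an explicit prefix-to-URI mapping
ns = {"text": "urn:deitel:textInfo", "image": "urn:deitel:imageInfo"}
text_file = root.find("text:file", ns)
image_file = root.find("image:file", ns)
print(text_file.get("filename"), image_file.get("filename"))
```

The prefixes themselves carry no meaning; only the URIs they are bound to distinguish the elements.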

Defaultnamespace.xml

<?xml version = "1.0"?>

<!-- Fig. 20.5 : defaultnamespace.xml -->
<!-- Using Default Namespaces -->

<directory xmlns = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">

   <file filename = "book.xml">
      <description>A book list</description>
   </file>

   <image:file filename = "funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width = "200" height = "100"/>
   </image:file>

</directory>

Default namespace.

Element file uses the default namespace.

Element image:file uses the namespace prefix image.
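That a default namespace is only a shorthand can be checked with a small Python sketch (standard library only; the helper expanded_names and the shortened sample documents are assumptions made for illustration). The prefixed and the unprefixed spelling produce identical element names once the parser has resolved them.

```python
import xml.etree.ElementTree as ET

PREFIXED = """<text:directory xmlns:text="urn:deitel:textInfo">
  <text:file filename="book.xml"/>
</text:directory>"""

DEFAULTED = """<directory xmlns="urn:deitel:textInfo">
  <file filename="book.xml"/>
</directory>"""

def expanded_names(xml_text):
    """List every element's expanded name in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]

print(expanded_names(PREFIXED))
print(expanded_names(DEFAULTED))
print(expanded_names(PREFIXED) == expanded_names(DEFAULTED))  # True
```

The default namespace applies to elements only; an unprefixed attribute such as filename is not placed in it.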

DTDs

• Document Type Definition
• Written in Extended Backus-Naur Form
• Defines how an XML document is structured
  • Required elements
  • Nesting of elements
• Does not define types or behavior
• If a DTD is used – some parsers can decide if an XML document is “valid” – which is more than just “well-formed”

Letter.dtd

<!-- Fig. 20.4: letter.dtd -->
<!-- DTD document for letter.xml -->

<!ELEMENT letter ( contact+, salutation, paragraph+,
   closing, signature )>

<!ELEMENT contact ( name, address1, address2, city, state,
   zip, phone, flag )>
<!ATTLIST contact type CDATA #IMPLIED>

<!ELEMENT name ( #PCDATA )>
<!ELEMENT address1 ( #PCDATA )>
<!ELEMENT address2 ( #PCDATA )>
<!ELEMENT city ( #PCDATA )>
<!ELEMENT state ( #PCDATA )>
<!ELEMENT zip ( #PCDATA )>
<!ELEMENT phone ( #PCDATA )>
<!ELEMENT flag EMPTY>
<!ATTLIST flag gender (M | F) "M">

<!ELEMENT salutation ( #PCDATA )>
<!ELEMENT closing ( #PCDATA )>
<!ELEMENT paragraph ( #PCDATA )>
<!ELEMENT signature ( #PCDATA )>

The ELEMENT element type declaration defines the rules for element letter.

The plus sign (+) occurrence indicator specifies that the DTD allows one or more occurrences of an element. (2 contacts in our example)

The contact element definition specifies that element contact contains child elements name, address1, address2, city, state, zip, phone and flag— in that order.
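The content model ( contact+, salutation, paragraph+, closing, signature ) reads much like a regular expression over child element names. The Python sketch below translates it by hand into a regex and checks only the direct children of letter. It is an illustration of the idea, not a DTD validator: Python's standard library parser does not validate against DTDs, and the names LETTER_MODEL and matches_letter_model are invented here.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of: ( contact+, salutation, paragraph+, closing, signature )
# Each child name is followed by a space so names cannot run together.
LETTER_MODEL = re.compile(
    r"(contact )+salutation (paragraph )+closing signature "
)

def matches_letter_model(xml_text: str) -> bool:
    """Check only the order and count of <letter>'s direct children."""
    root = ET.fromstring(xml_text)
    if root.tag != "letter":
        return False
    child_names = "".join(child.tag + " " for child in root)
    return LETTER_MODEL.fullmatch(child_names) is not None

OK = ("<letter><contact/><contact/><salutation/><paragraph/>"
      "<closing/><signature/></letter>")
NO_CONTACT = ("<letter><salutation/><paragraph/>"
              "<closing/><signature/></letter>")

print(matches_letter_model(OK))          # True
print(matches_letter_model(NO_CONTACT))  # False
```

Both sample documents are well-formed; only the first satisfies the content model for letter, which is the difference between “well-formed” and “valid” in miniature.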

The ATTLIST attribute-list declaration defines an attribute (i.e., type) for the contact element.

Keyword #IMPLIED specifies that the type attribute is optional and has no default value: if a contact element omits it, the application decides how to handle the absence, and the document is still valid.

Keyword #PCDATA specifies that the element can contain parsed character data (i.e., text).

Assignment: 5 min. – make a Letter XML document that is:
- Well-formed (how would an XML validator check this?)
- Valid (how would an XML validator check this?)

Letter.xml

<?xml version = "1.0"?>

<!-- Fig. 20.3: letter.xml -->
<!-- Business letter formatted with XML -->

<!DOCTYPE letter SYSTEM "letter.dtd">

<letter>

   <contact type = "from">
      <name>John Doe</name>
      <address1>123 Main St.</address1>
      <address2></address2>
      <city>Anytown</city>
      <state>Anystate</state>
      <zip>12345</zip>
      <phone>555-1234</phone>
      <flag gender = "M"/>
   </contact>

   <contact type = "to">
      <name>Joe Schmoe</name>
      <address1>Box 12345</address1>
      <address2>15 Any Ave.</address2>
      <city>Othertown</city>
      <state>Otherstate</state>
      <zip>67890</zip>
      <phone>555-4321</phone>
      <flag gender = "M"/>
   </contact>

   <salutation>Dear Sir:</salutation>

   <paragraph>It is our privilege to inform you about our new
      database managed with XML. This new system allows
      you to reduce the load of your inventory list server by
      having the client machine perform the work of sorting
      and filtering the data.</paragraph>
   <closing>Sincerely</closing>
   <signature>Mr. Doe</signature>

</letter>

XML Schema

• DTD works OK – but
  • It is written in Extended Backus-Naur Form – why not use XML?
  • Cannot declare the type of an element
    • <amount>hundrede kr</amount>
    • Could cause problems
  • Several other problems
• W3C XML Schema
  • Uses XML to describe the structure of XML documents …
  • Possible to give type information to XML definitions
  • Not supported by all parsers yet
  • Will live alongside DTDs for a while
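The typing problem can be sketched as follows. With a DTD, amount can only be declared as #PCDATA, so a document containing “hundrede kr” (Danish for “hundred kroner”) is just as valid as one containing a number, and the application has to check the type itself. The Python helper read_amount below is a hypothetical stand-in for that application-side check; with XML Schema the element could instead be declared with a numeric type such as xsd:decimal and a schema-aware validator would reject the text.

```python
from decimal import Decimal, InvalidOperation
import xml.etree.ElementTree as ET

def read_amount(xml_text: str):
    """Return the <amount> content as a Decimal, or None if it is not numeric."""
    text = ET.fromstring(xml_text).text or ""
    try:
        return Decimal(text.strip())
    except InvalidOperation:
        return None

print(read_amount("<amount>100.50</amount>"))       # 100.50
print(read_amount("<amount>hundrede kr</amount>"))  # None
```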

Book.xsd

<?xml version = "1.0"?>

<!-- Fig. 20.8 : book.xsd -->
<!-- Simple W3C XML Schema document -->

<xsd:schema xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
   xmlns:deitel = "http://www.deitel.com/booklist"
   targetNamespace = "http://www.deitel.com/booklist">

   <xsd:element name = "books" type = "deitel:BooksType"/>

   <xsd:complexType name = "BooksType">
      <xsd:sequence>
         <xsd:element name = "book" type = "deitel:BookType"
            minOccurs = "1" maxOccurs = "unbounded"/>
      </xsd:sequence>
   </xsd:complexType>

   <xsd:complexType name = "BookType">
      <xsd:sequence>
         <xsd:element name = "title" type = "xsd:string"/>
      </xsd:sequence>
   </xsd:complexType>

</xsd:schema>

Element element defines an element to be included in the XML document structure.

A BookType has an element named title of type “xsd:string”, which is defined in the namespace “http://www.w3.org/2001/XMLSchema”.

How to use XML?

• Need a parser (or a parser API) to access XML (as with CSV)
• Two commonly used methods:
  • DOM (Document Object Model)
    • W3C Recommendation
    • Makes a tree structure representation of an XML document in memory
  • SAX (Simple API for XML)
    • Supported by different vendors
    • Parses the document sequentially, as a stream, and sends events to subscribers
    • Needs to parse every time access to the XML document is needed
• DOM is better for
  • Quick random-access reads or updates of the XML (like a WWW browser – BOM)
  • But: slow to load the XML document (needs all of it)
  • But: requires a lot of memory (must hold the entire XML document in memory)
• SAX is better for
  • Applications subscribing to certain parts of the XML (event subscription)
  • But: slow for random access to the XML document (must parse every time)
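The event-subscription style of SAX can be sketched with Python's standard xml.sax module (Python, the class name TitleHandler and the shortened sample document are this sketch's assumptions). The handler subscribes to title elements only; the parser streams through the document and never builds a tree in memory.

```python
import xml.sax

class TitleHandler(xml.sax.ContentHandler):
    """Subscribe to <title> elements only and ignore everything else."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self.titles.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so append rather than assign
        if self.in_title:
            self.titles[-1] += content

    def endElement(self, name):
        if name == "title":
            self.in_title = False

ARTICLE = b"""<article>
  <title>Simple XML</title>
  <date>September 19, 2001</date>
  <summary>XML is pretty easy.</summary>
</article>"""

handler = TitleHandler()
xml.sax.parseString(ARTICLE, handler)
print(handler.titles)  # ['Simple XML']
```

To read a different element later, the document has to be parsed again with another handler, which is the random-access cost listed above.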

What is DOM

• DOM: Document Object Model
• http://www.w3.org/TR/2003/REC-DOM-Level-2-HTML-20030109/
• W3C definition:
  • Standard for accessing structured documents
  • Core DOM used with XML
  • HTML DOM used with HTML
  • Representation of a document as an object tree structure
  • Provides a uniform interface for programming and scripting languages
  • APIs available for JavaScript, Java, C++, C# etc.

DOM Tree Structure

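• Tree structure of an XML document (left)
• … or HTML (right)

[Figure: DOM tree of an HTML fragment: document → … → table → tbody → tr (three siblings) → td (three siblings) → text, corresponding to the markup <table> <tbody> <tr> <td> text </td> ….]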

Example – using DOM on Article.xml

• We have looked at Article.xml
• We will:
  • Look at the Article.xml document again
  • Look at the tree structure formed by loading it into a DOM
  • Use JavaScript to work on it

<?xml version = "1.0"?>

<!-- Fig. 20.1: article.xml -->
<!-- Article structured with XML -->

<article>

   <title>Simple XML</title>

   <date>September 19, 2001</date>

   <author>
      <firstName>Tem</firstName>
      <lastName>Nieto</lastName>
   </author>

   <summary>XML is pretty easy.</summary>

   <content>XML is easily
      learned. You must remember that XML is not for
      displaying information but for managing information.
   </content>

</article>

XML document – Article.xml

DOM Methods

article
   title
   date
   author
      firstName
      lastName
   summary
   content

Tree structure for article.xml.
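The same tree can be produced by loading the document into a DOM and walking it. The sketch below uses Python's standard xml.dom.minidom rather than the JavaScript of the slides; the function name element_tree_lines and the shortened content text are assumptions made for illustration.

```python
from xml.dom import minidom

ARTICLE = """<article>
  <title>Simple XML</title>
  <date>September 19, 2001</date>
  <author>
    <firstName>Tem</firstName>
    <lastName>Nieto</lastName>
  </author>
  <summary>XML is pretty easy.</summary>
  <content>XML is easily learned.</content>
</article>"""

def element_tree_lines(node, depth=0):
    """Return one indented line per element, depth-first."""
    lines = ["   " * depth + node.nodeName]
    for child in node.childNodes:
        # Skip text nodes; only elements appear in the outline
        if child.nodeType == child.ELEMENT_NODE:
            lines.extend(element_tree_lines(child, depth + 1))
    return lines

document = minidom.parseString(ARTICLE)
print("\n".join(element_tree_lines(document.documentElement)))
```

Because the whole document is held in memory, the tree can be walked as often as needed without parsing again.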

DOMExample.html

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<!-- Fig. 20.15 : DOMExample.html -->
<!-- DOM with JavaScript -->

<head>
   <title>A DOM Example</title>
</head>

<body>

<script type = "text/javascript" language = "JavaScript">
<!--
   var xmlDocument = new ActiveXObject( "Microsoft.XMLDOM" );

   xmlDocument.load( "article.xml" );

   // get the root element
   var element = xmlDocument.documentElement;

   document.writeln(
      "<p>Here is the root node of the document: " +
      "<strong>" + element.nodeName + "</strong>" +
      "<br />The following are its child elements:" +
      "</p><ul>" );

   // traverse all child nodes of root element
   for ( var i = 0; i < element.childNodes.length; i++ ) {
      var curNode = element.childNodes.item( i );

      // print node name of each child element
      document.writeln( "<li><strong>" + curNode.nodeName
         + "</strong></li>" );
   }

   document.writeln( "</ul>" );

   // get the first child node of root element
   var currentNode = element.firstChild;

   document.writeln( "<p>The first child of root node is: " +
      "<strong>" + currentNode.nodeName + "</strong>" +
      "<br />whose next sibling is:" );

   // get the next sibling of first child
   var nextSib = currentNode.nextSibling;

   document.writeln( "<strong>" + nextSib.nodeName +
      "</strong>.<br />Value of <strong>" +
      nextSib.nodeName + "</strong> element is: " );

   var value = nextSib.firstChild;

   // print the text value of the sibling
   document.writeln( "<em>" + value.nodeValue + "</em>" +
      "<br />Parent node of <strong>" + nextSib.nodeName +
      "</strong> is: <strong>" +
      nextSib.parentNode.nodeName + "</strong>.</p>" );
// -->
</script>

</body>
</html>

Instantiate a Microsoft XML Document Object Model object and assign it to reference xmlDocument.

Method load loads article.xml (Fig. 20.1) into memory.

Property documentElement corresponds to the root element in the document (i.e., article).

Program Output

[Figure: browser output of DOMExample.html]
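Since DOMExample.html relies on the Microsoft-specific ActiveXObject, the same navigation steps (root element, child list, first child, next sibling, text value, parent) are sketched here with Python's standard xml.dom.minidom. This is an assumed equivalent, not the original program: the helper element_children is invented, the sample document is shortened, and because minidom keeps whitespace-only text nodes between elements the sketch filters them out before navigating.

```python
from xml.dom import minidom

ARTICLE = """<article>
  <title>Simple XML</title>
  <date>September 19, 2001</date>
  <summary>XML is pretty easy.</summary>
</article>"""

def element_children(node):
    """Child elements only; minidom also keeps whitespace text nodes."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]

root = minidom.parseString(ARTICLE).documentElement
children = element_children(root)

first = children[0]            # first child element of the root
next_sib = children[1]         # its next sibling element
value = next_sib.firstChild    # the text node inside <date>

print(root.nodeName)                      # article
print([c.nodeName for c in children])     # ['title', 'date', 'summary']
print(first.nodeName, next_sib.nodeName)  # title date
print(value.nodeValue)                    # September 19, 2001
print(next_sib.parentNode.nodeName)       # article
```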

The SOAP Connection

• SOAP, WSDL and UDDI use:
  • XML
  • Namespaces
  • and Schemas
• Original idea behind Web services
  • Connection through the Internet
  • Makes good sense to use XML – a W3C child
  • Everyone loves W3C
  • Practical solutions “that work”
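How SOAP leans on XML and namespaces can be sketched with a minimal envelope. The envelope namespace below is the SOAP 1.1 one; the service namespace urn:example:articles and the getArticle request are hypothetical and exist only for this illustration. The Python code uses the standard ElementTree module to pick the request out of the Body.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
APP_NS = "urn:example:articles"                        # hypothetical service namespace

ENVELOPE = f"""<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <getArticle xmlns="{APP_NS}">
      <title>Simple XML</title>
    </getArticle>
  </soap:Body>
</soap:Envelope>"""

root = ET.fromstring(ENVELOPE)
ns = {"soap": SOAP_NS, "app": APP_NS}

# The envelope and the payload live in different namespaces
body = root.find("soap:Body", ns)
request = body.find("app:getArticle", ns)
print(request.find("app:title", ns).text)  # Simple XML
```

The namespaces keep the protocol's own elements (Envelope, Body) apart from the application's payload, which is the naming-conflict problem from the namespace slides in practice.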