The XML Document Object Model (DOM) Aug’10 – Dec ’10.


Transcript of The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Page 1: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The XML Document Object Model (DOM)

Aug’10 – Dec ’10

Page 2: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Introduction to XML DOM

The XML DOM is used by programmers as a way to manipulate the content of an XML document.

This chapter covers the following:

❑ The purpose of the XML Document Object Model

❑ How the DOM specification was developed at the W3C

❑ Important XML DOM interfaces and objects

❑ How to add and delete elements and attributes from a DOM and manipulate a DOM tree in other ways

Page 3: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Purpose of the XML DOM

The XML DOM provides an interface to:

❑ create XML documents

❑ navigate them

❑ add, modify, or delete parts of XML documents while they are held in memory

Page 4: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

XML Parsing

Parsing means taking a stream of characters and producing an internal representation conforming to a predetermined structure. The result is an in-memory model of the XML, known as the XML DOM.

This raises issues when serializing an XML DOM.

Some features of the input are lost after parsing, such as:

❑ the XML declaration and its specified encoding

❑ whether attributes were quoted with ' ' or " "
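The loss of surface detail can be seen in a small sketch. It uses Python's xml.dom.minidom as a stand-in for the MSXML parser used in these slides (an assumption made so the example runs outside Internet Explorer); the Book element and its title attribute are made up for illustration.

```python
from xml.dom import minidom

# Two inputs that differ only in surface syntax: attribute quote style,
# the XML declaration, and empty-element notation.
source_a = "<?xml version='1.0' encoding='UTF-8'?><Book title='DOM'></Book>"
source_b = '<Book title="DOM"/>'

doc_a = minidom.parseString(source_a)
doc_b = minidom.parseString(source_b)

# Serializing the document element gives the same text for both inputs,
# because the parsed tree no longer records those surface details.
serialized_a = doc_a.documentElement.toxml()
serialized_b = doc_b.documentElement.toxml()
print(serialized_a)
print(serialized_a == serialized_b)
```

Other DOM implementations may choose a different quote style or empty-element form when serializing, but the original choice is not recoverable from the tree.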

Page 5: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Two differently constructed documents, once parsed, can yield the same Infoset (XML Information Set).

Serialization is the process of storing an object's state to a permanent form, such as a file or a database, or converting it to a form that can be transmitted between machines. Deserialization is the opposite of serialization.

The DOM can also be created directly in memory using the appropriate DOM methods, such as createElement() and appendChild().
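As a minimal sketch of building a DOM directly in memory, the following uses Python's xml.dom.minidom rather than MSXML (an assumption for runnability). The Book and Chapter names are illustrative.

```python
from xml.dom import minidom

# Create an empty document whose document element is <Book>.
impl = minidom.getDOMImplementation()
doc = impl.createDocument(None, "Book", None)
book = doc.documentElement

# Build a <Chapter> element with a Text child and attach it to <Book>.
chapter = doc.createElement("Chapter")
chapter.appendChild(doc.createTextNode("This is Chapter 1"))
book.appendChild(chapter)

# Serialize the tree that was built without parsing any input.
built_xml = book.toxml()
print(built_xml)
```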

Page 6: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

DOM concepts

The XML DOM representation is equivalent to a hierarchical, treelike structure consisting of nodes.

❑ The XML document itself is represented by the Document node.

❑ The DOM Document node is at the apex of the tree.

❑ The Document node differs significantly from an XPath root node.

An XML parser checks for well-formedness and, optionally, validity of the document.

The XML DOM may then be constructed as an in-memory representation of the XML document.

Page 7: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

DOM Document

Page 8: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Serialized version

<documentElement>

<firstChildElement attributeNode="attributeValue">Text Node</firstChildElement>

<secondChildElement>Text Node</secondChildElement>

<!-- Example comment -->

</documentElement>
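To see the node tree behind this serialized form, the sketch below parses it and lists each node's nodeType number and nodeName. It assumes Python's xml.dom.minidom in place of MSXML; describe() is a hypothetical helper, and the XML is written without line breaks so that no white-space Text nodes appear.

```python
from xml.dom import minidom

xml_text = (
    '<documentElement>'
    '<firstChildElement attributeNode="attributeValue">Text Node</firstChildElement>'
    '<secondChildElement>Text Node</secondChildElement>'
    '<!-- Example comment -->'
    '</documentElement>'
)

def describe(node, depth=0):
    """Return one indented line per node: its nodeType number and nodeName."""
    lines = ["  " * depth + f"{node.nodeType} {node.nodeName}"]
    for child in node.childNodes:
        lines.extend(describe(child, depth + 1))
    return lines

tree_lines = describe(minidom.parseString(xml_text))
print("\n".join(tree_lines))
```

The attribute does not show up in the listing because Attr nodes are not child nodes of their element.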

Page 9: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Interfaces and Objects

An interface is a more abstract concept than an object:

❑ general concept: interface

❑ specific instance: object

An interface describes the properties and behavior (methods) of a class of objects.

The Document interface is defined in the XML DOM and includes, for example, the documentElement property. A Document node implements the Document interface.

An interface and the objects that implement it have the same properties and methods.

Page 10: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The Document Object Model at the W3C

The main page for the DOM specification is www.w3.org/DOM/

The XML DOM is a logical model of an XML document.

XML DOM Level 1 provides an interface; the implementation is left to the creators of each DOM implementation.

A shared interface improves productivity. However, DOM Level 1 failed to provide a universal interface:

❑ it did not include a way to create an XML document

❑ for that step it provided no common interface, only proprietary ones

DOM Level 1 specified language bindings for Java and ECMAScript.

For very large XML documents, the Simple API for XML (SAX) or .NET's XmlReader is preferred.

Page 11: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

DOM Level 2 and Level 3

DOM Level 2 added some new functionality, including support for namespaced elements.

The DOM Level 2 specification documents and their locations can be found at www.w3.org/TR/DOM-Level-2-Core/

In 2004, DOM Level 3 was finalized. It added standards for:

❑ URI handling

❑ namespace resolution

❑ how the DOM maps to the XML Infoset

www.w3.org/TR/DOM-Level-3-Core/

Page 12: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

XML DOM Implementations

An implementation provides all interfaces described in a particular level of the DOM specification and is free to provide additional interfaces.

Two Ways to View DOM Nodes

1. As a hierarchy of Node objects

2. As a tree whose root is a Document node (or object) and whose descendant nodes are objects of different specialized types

Page 13: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Overview of the XML DOM

The root of the DOM hierarchy is always a Document node.

The child nodes of the Document node can be a DocumentType node, an Element node, ProcessingInstruction nodes, and Comment nodes.

A fragment of an XML document is represented by a DocumentFragment node.

The child nodes of a DocumentFragment or Element node can be:

❑ Element nodes

❑ Comment, ProcessingInstruction, Text, CDATASection, and EntityReference nodes
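The kinds of node that can sit directly under the Document node can be listed with a short sketch. It assumes Python's xml.dom.minidom instead of MSXML; the DOCTYPE name, processing instruction, and comment text are invented for illustration.

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<!DOCTYPE Book>'
    '<?render mode="draft"?>'
    '<!-- front matter -->'
    '<Book><Chapter>One</Chapter></Book>'
)

# Map the numeric nodeType codes to the interface names used in the slides.
names = {
    Node.DOCUMENT_TYPE_NODE: "DocumentType",
    Node.PROCESSING_INSTRUCTION_NODE: "ProcessingInstruction",
    Node.COMMENT_NODE: "Comment",
    Node.ELEMENT_NODE: "Element",
}
child_kinds = [names.get(child.nodeType, str(child.nodeType)) for child in doc.childNodes]
print(child_kinds)
```

The XML declaration itself is not a node, so it does not appear among the children.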

Page 14: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

An attribute is represented by an Attr node, which is associated with an Element node but is not considered to be its child (compare with XPath attributes).

Entity and EntityReference nodes can have these child nodes: Element, Comment, ProcessingInstruction, Text, CDATASection, and EntityReference.
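The relationship between an Attr node and its element can be checked with a small sketch, again assuming Python's xml.dom.minidom rather than MSXML. The Chapter element and number attribute are illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString('<Chapter number="1">Intro</Chapter>')
chapter = doc.documentElement
attr = chapter.getAttributeNode("number")

# Only the Text node is a child; the Attr node is not in childNodes.
child_names = [child.nodeName for child in chapter.childNodes]
print(child_names)

# The Attr has no parent in the tree, but it knows its owning element.
print(attr.parentNode)
print(attr.ownerElement is chapter)
```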

Page 15: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Tools

MSXML (Microsoft XML Core Services)

- a set of services that allow applications written in JScript, VBScript, and Microsoft development tools to build Windows-native XML-based applications

Internet Explorer 5.0 or above: to run the DOM examples

Microsoft JScript: to manipulate the XML DOM

Page 16: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

MSXML3

msxmltest.html

Usage

new ActiveXObject

- creates the DOM object using MSXML3

loadXML

- loads XML supplied as a string into the XML DOM object

document.write

- displays text on the web page

alert

- displays text in a pop-up window

Page 17: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Navigating to the document element

Once the XML document is loaded into the DOM object, the documentElement can be accessed as follows.

Document object

- represents the entire XML document

- is the root of the document tree and gives primary access to the document's data

documentElement property

- returns the root element node of the document

- objXMLDOM.documentElement.nodeName

- the syntax uses the period character to indicate properties or methods of an object

Myxmltest2.html

Page 18: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Node Object

One way of viewing nodes in an XML DOM is as specializations of the Node object.

The Node object has properties and methods that are also available on all other types of XML DOM node.

XML DOM programming consists of:

- retrieving and setting some of these properties directly

- or using the methods defined in the interface to manipulate the object that instantiates the interface, or related objects

Page 19: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Node Object

The Node object of DOM Level 2 has 14 properties:

❑ attributes—This is a read-only property whose value is a NamedNodeMap object.

❑ childNodes—This is a read-only property whose value is a NodeList object.

❑ firstChild—This is a read-only property whose value is a Node object.

❑ lastChild—This is a read-only property whose value is a Node object.

❑ localName—This is a read-only property that is a String.


Page 20: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Node Object

❑ namespaceURI—This is a read-only property whose value is a String.

❑ nextSibling—This is a read-only property whose value is a Node object.

❑ nodeName—This is the name of the node, if it has one, and its value is a String type.

❑ nodeType—This is a read-only property that is of type number. The number value of the nodeType property maps to the names of the node types mentioned earlier.

❑ nodeValue—This property is of type String. When the property is being set or retrieved, a DOMException can be raised.

❑ ownerDocument—This is a read-only property whose value is a Document object.


Page 21: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Node Object

❑ parentNode—This is a read-only property whose value is a Node object.

❑ prefix—This property is a String. When the property is being set, a DOMException can be raised.

❑ previousSibling—This is a read-only property whose value is a Node object.

Depending on the particular node object, some properties made available by the Node interface may have no useful retrievable value.

For example, a Document object does not have a parent node, and a Comment node has no attributes or child nodes.

Only Attr, Text, CDATASection, Comment, and ProcessingInstruction nodes have a non-null nodeValue.
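A short sketch shows how nodeType, nodeName, and nodeValue vary by node kind. It assumes Python's xml.dom.minidom instead of MSXML, where a null value appears as None; the Book element and its content are illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString('<Book id="b1">Text<!--note--></Book>')
book = doc.documentElement

# One node of each kind: Document, Element, Attr, Text, Comment.
nodes = [doc, book, book.getAttributeNode("id"), book.firstChild, book.lastChild]

# Collect (nodeType, nodeName, nodeValue) for each node.
summary = [(n.nodeType, n.nodeName, n.nodeValue) for n in nodes]
for row in summary:
    print(row)
```

The Document and Element rows carry no nodeValue, while the Attr, Text, and Comment rows do.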

Page 22: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Exploring Child Nodes

objXMLDOM.documentElement.firstChild.nodeName

- Uses the documentElement and firstChild properties to retrieve the name of the node that is the first child of the document element.

objXMLDOM.documentElement.firstChild.firstChild.nodeValue

- Retrieves the value of the first child of the first child of the document element node.

ChildNodes.html
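The same two property chains can be tried in a sketch that assumes Python's xml.dom.minidom in place of the objXMLDOM object from MSXML. The sample document is illustrative and contains no white space between elements.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<Book><Chapter>This is Chapter 1</Chapter><Chapter>This is Chapter 2</Chapter></Book>"
)

# Name of the first child of the document element.
first_child_name = doc.documentElement.firstChild.nodeName

# Value of the Text node inside that first child.
first_text = doc.documentElement.firstChild.firstChild.nodeValue

print(first_child_name)
print(first_text)
```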

Page 23: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Methods of the Node Object

❑ appendChild(newChild)—This method returns a Node object. The newChild argument is a Node object. This method can raise a DOMException object.

❑ cloneNode(deep)—This method returns a Node object. The deep argument is a Boolean value. If true, then all nodes underneath this node are also copied; otherwise, only the node itself.

❑ hasAttributes()—This method returns a Boolean value. It has no arguments.

❑ hasChildNodes()—This method returns a Boolean value. It has no arguments.

❑ insertBefore(newChild, refChild)—This method returns a Node object. The newChild and refChild arguments are each Node objects. This method can raise a DOMException object.


Page 24: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Methods of the Node Object

❑ isSupported(feature, version)—This method returns a Boolean value. The feature and version arguments are each String values.

❑ normalize()—This method has no return value and takes no arguments.

❑ removeChild(oldChild)—This method returns a Node object. The oldChild argument is a Node object. This method can raise a DOMException object.

❑ replaceChild(newChild, oldChild)—This method returns a Node object. The newChild and oldChild arguments are each Node objects. This method can raise a DOMException object.

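Several of these methods can be exercised together in a sketch. It assumes Python's xml.dom.minidom rather than MSXML, and the Preface element is a made-up example.

```python
from xml.dom import minidom

doc = minidom.parseString("<Book><Chapter>One</Chapter><Chapter>Two</Chapter></Book>")
book = doc.documentElement
first = book.firstChild

# cloneNode(True) copies the whole subtree; cloneNode(False) copies only the node.
deep_copy = first.cloneNode(True)
shallow_copy = first.cloneNode(False)

# replaceChild() swaps in a new node and returns the node that was replaced.
replacement = doc.createElement("Preface")
replaced = book.replaceChild(replacement, first)

# normalize() merges adjacent Text nodes into a single Text node.
replacement.appendChild(doc.createTextNode("Read "))
replacement.appendChild(doc.createTextNode("me"))
count_before = len(replacement.childNodes)
replacement.normalize()
count_after = len(replacement.childNodes)

result_xml = book.toxml()
print(result_xml)
```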

Page 25: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Loading an XML Document

loadXML() – supply literal characters equivalent to a well-formed XML document

load() – load an existing XML document

var objXMLDOM = new ActiveXObject("Msxml2.DOMDocument.3.0");

- A DOM Document node is created with no descendant nodes.

objXMLDOM.load("C:\\SimpleDoc.xml");

- The XML document SimpleDoc.xml is loaded, its XML is parsed, and the appropriate node tree is created inside the objXMLDOM object.

SimpleDoc.xml
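The difference between loading from a file and loading from a string can be sketched in Python's xml.dom.minidom, used here as a stand-in for MSXML. In this sketch parse() plays the role of load() and parseString() the role of loadXML(); the temporary file is a hypothetical substitute for SimpleDoc.xml.

```python
import os
import tempfile
from xml.dom import minidom

# Write a small stand-in for SimpleDoc.xml to a temporary file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False, encoding="utf-8"
) as handle:
    handle.write("<Book><Chapter>This is Chapter 1</Chapter></Book>")
    path = handle.name

try:
    # Analogous to load(): read and parse an existing XML file.
    from_file = minidom.parse(path)
finally:
    os.remove(path)

# Analogous to loadXML(): parse XML supplied as literal text.
from_text = minidom.parseString("<Book/>")

print(from_file.documentElement.nodeName)
print(from_text.documentElement.hasChildNodes())
```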

Page 26: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Deleting a Node

var objToBeDeleted = objXMLDOM.documentElement.firstChild;

If the XML document is:

<Book>

<Chapter>This is Chapter 1 </Chapter>

<Chapter>This is Chapter 2 </Chapter>

<Chapter>This is Chapter 3 </Chapter>

</Book>

objXMLDOM.documentElement.removeChild(objToBeDeleted);

alert(objXMLDOM.xml);

This deletes the first of the three Chapter element nodes in the document.

DeleteNode.html
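The same deletion can be sketched with Python's xml.dom.minidom, assumed here in place of MSXML. The document is written without line breaks so that firstChild is the first Chapter element rather than a white-space Text node.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<Book>"
    "<Chapter>This is Chapter 1 </Chapter>"
    "<Chapter>This is Chapter 2 </Chapter>"
    "<Chapter>This is Chapter 3 </Chapter>"
    "</Book>"
)

# Select the first child of the document element and remove it.
to_be_deleted = doc.documentElement.firstChild
removed = doc.documentElement.removeChild(to_be_deleted)

# removeChild() returns the detached node; the tree now has two chapters.
remaining_xml = doc.documentElement.toxml()
print(remaining_xml)
```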

Page 27: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Adding new nodes

createTextNode()

- creates a new Text node

createElement()

- creates a new Element node

appendChild()

- adds a node as a child of an element node

- if the node already has child nodes, appendChild() adds the new node after the existing children

insertBefore()

- inserts a new node before an existing child node

AddNode.html
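A minimal sketch of where appendChild() and insertBefore() place new nodes, assuming Python's xml.dom.minidom instead of MSXML. The chapter texts are illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString("<Book><Chapter>This is Chapter 2</Chapter></Book>")
book = doc.documentElement

# appendChild() adds the new element after the existing children.
chapter3 = doc.createElement("Chapter")
chapter3.appendChild(doc.createTextNode("This is Chapter 3"))
book.appendChild(chapter3)

# insertBefore() places the new element ahead of a chosen existing child.
chapter1 = doc.createElement("Chapter")
chapter1.appendChild(doc.createTextNode("This is Chapter 1"))
book.insertBefore(chapter1, book.firstChild)

# Read the chapter texts back in document order.
texts = [chapter.firstChild.nodeValue for chapter in book.childNodes]
print(texts)
```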

Page 28: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Effect of text nodes

The xml:space attribute preserves white space.

<Book xml:space="preserve">

<Chapter>This is chapter 1</Chapter>

<Chapter>This is chapter 1 </Chapter>

<Chapter>This is chapter 1 </Chapter>

</Book>

objXMLDOM.documentElement.childNodes.length returns 7: the three Chapter elements plus four white-space Text nodes.

If xml:space="preserve" is not specified, the number of child nodes returned is 3.

WhiteSpace.html
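The count of 7 versus 3 can be reproduced in a sketch that assumes Python's xml.dom.minidom rather than MSXML. Unlike the MSXML behavior described on this slide, minidom keeps white-space Text nodes whether or not xml:space is present, so the sketch counts the element children separately.

```python
from xml.dom import Node, minidom

xml_text = """<Book>
  <Chapter>This is chapter 1</Chapter>
  <Chapter>This is chapter 2</Chapter>
  <Chapter>This is chapter 3</Chapter>
</Book>"""

children = minidom.parseString(xml_text).documentElement.childNodes

# All child nodes, including the white space between the elements.
all_count = children.length

# Only the Chapter elements.
element_count = sum(1 for node in children if node.nodeType == Node.ELEMENT_NODE)

# Text nodes that contain nothing but white space.
whitespace_count = sum(
    1 for node in children
    if node.nodeType == Node.TEXT_NODE and node.data.strip() == ""
)
print(all_count, element_count, whitespace_count)
```

Code that walks childNodes should therefore not assume that every child is an element.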

Page 29: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The NamedNodeMap Object

A named node map is an unordered set of objects.

The attributes property of the Node object is a NamedNodeMap object.

The NamedNodeMap object has a single property, length, which is a Number value.

The value of the length property indicates how many nodes are in the named node map.

The NamedNodeMap object has 7 methods:

❑ getNamedItem(name)—This method returns a Node object. The name argument is a String value.

❑ getNamedItemNS(namespaceURI, localName)—This method returns a Node object. The namespaceURI and localName arguments are String values.


Page 30: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The NamedNodeMap Object

❑ item(index)—This method returns a Node object. The index argument is a Number value.

❑ removeNamedItem(name)—This method returns a Node object. The name argument is a String value. This method can raise a DOMException object if the item doesn’t exist.

❑ removeNamedItemNS(namespaceURI, localName)—This method returns a Node object. The namespaceURI and localName arguments are String values. This method can raise a DOMException object if the item does not exist.

❑ setNamedItem(node)—This method returns a Node object. The node argument is a new Node. This method can raise a DOMException object.

❑ setNamedItemNS(node)—This is the same as setNamedItem except it handles namespaced nodes.

Page 31: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

Adding and Removing Attributes

The NamedNodeMap interface is used to alter the values of attributes.

objXMLDOM.documentElement.lastChild.attributes;

createAttribute()

removeNamedItem()

setNamedItem()

ChangeAttributes.html
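Adding, changing, and removing attributes through a NamedNodeMap can be sketched as follows, assuming Python's xml.dom.minidom in place of MSXML. The attribute names number, status, and reviewed are made up for illustration.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<Book><Chapter number="1" status="draft">Intro</Chapter></Book>'
)
attrs = doc.documentElement.lastChild.attributes  # a NamedNodeMap

# Add an attribute: create an Attr node, give it a value, then store it.
reviewed = doc.createAttribute("reviewed")
reviewed.value = "yes"
attrs.setNamedItem(reviewed)

# Change an existing attribute through its Attr node.
attrs.getNamedItem("number").value = "2"

# Remove an attribute by name; the removed Attr node is returned.
removed = attrs.removeNamedItem("status")

# Read the map back using length and item(); sorted because the map is unordered.
attr_pairs = sorted(
    (attrs.item(i).name, attrs.item(i).value) for i in range(attrs.length)
)
print(attr_pairs)
```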

Page 32: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The NodeList Object

A NodeList is a list of nodes.

The childNodes property of the Node object has a value that is a NodeList.

A NodeList object can be used to process all child nodes of a specified node.

The NodeList object has one property, length, which is a read-only property of type Number.

The NodeList object has one method, item(). It takes a single argument, which is a Number value, and returns a Node object. The index starts from 0, so item(3) returns the fourth child node.
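Processing every child through length and item() can be sketched like this, assuming Python's xml.dom.minidom rather than MSXML. The four chapters are illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<Book><Chapter>One</Chapter><Chapter>Two</Chapter>"
    "<Chapter>Three</Chapter><Chapter>Four</Chapter></Book>"
)
chapters = doc.documentElement.childNodes  # a NodeList

# Visit each child by index, from 0 up to length - 1.
titles = [chapters.item(i).firstChild.nodeValue for i in range(chapters.length)]

# item(3) is the fourth child because the index starts from 0.
fourth = chapters.item(3).firstChild.nodeValue

# An index past the end of the list yields no node.
missing = chapters.item(10)
print(titles, fourth, missing)
```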

Page 33: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The DOMException Object

When an error occurs, an exception is thrown, which is caught by the exception handler.

E.g.:

- incorrect syntax

- a property or method name specified wrongly

- trying to change the value of a read-only property

DOMException.html
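A sketch of catching a DOM exception, assuming Python's xml.dom.minidom in place of MSXML with JScript try/catch. The two failing operations are chosen for illustration and are not taken from DOMException.html.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<Book><Chapter>One</Chapter></Book>")
book = doc.documentElement
stray = doc.createElement("Appendix")  # created but never attached to <Book>

errors = []

try:
    book.removeChild(stray)  # stray is not a child of <Book>
except xml.dom.DOMException as exc:
    errors.append(type(exc).__name__)

try:
    text_node = book.firstChild.firstChild
    text_node.appendChild(stray)  # a Text node cannot have children
except xml.dom.DOMException as exc:
    errors.append(type(exc).__name__)

print(errors)
```

Catching the DOMException base class handles both failures, and the specific subclass names say what went wrong.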

Page 34: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The Document Interface

The Document interface has three properties:

❑ documentElement—This read-only property returns an Element object.

❑ doctype—This read-only property is a DocumentType object, corresponding to a DOCTYPE declaration, if present, in the XML document.

❑ implementation—This read-only property is a DOMImplementation object.

Page 35: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The Document Interface

The Document interface has 14 methods:

❑ createAttribute(name)—This method returns an Attr object. The name argument is a String value. This method can raise a DOMException object.

❑ createAttributeNS(namespaceURI, qualifiedName)—This method returns an Attr object. The namespaceURI and qualifiedName arguments are String values. This method can raise a DOMException object if the name contains an invalid character.

❑ createCDATASection(data)—This method returns a CDATASection object. The data argument is a String value.

❑ createComment(data)—This method returns a Comment object. The data argument is a String value.

❑ createDocumentFragment()—This method takes no argument and returns a DocumentFragment object.


Page 36: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The Document Interface

❑ createElement(tagName)—This method returns an Element object. The tagName argument is a String value. This method can raise a DOMException object if the name contains an invalid character.

❑ createElementNS(namespaceURI, qualifiedName)—This method returns an Element object. The namespaceURI and qualifiedName arguments are String values. This method can raise a DOMException object.

❑ createEntityReference(name)—This method returns an EntityReference object. The name argument is a String value. This method can raise a DOMException object if the name contains an invalid character.

❑ createProcessingInstruction(target, data)—This method returns a ProcessingInstruction object. The target and data arguments are each of type String. This method can raise a DOMException object if the target contains an invalid character.

❑ createTextNode(data)—This method returns a Text object. The data argument is a String value.

Page 37: The XML Document Object Model (DOM) Aug’10 – Dec ’10.

The Document Interface

❑ getElementById(elementId)—This method returns an Element object. The elementId argument is a String value.

❑ getElementsByTagName(tagname)—This method returns a NodeList object. The tagname argument is a String value.

❑ getElementsByTagNameNS(namespaceURI, localName)—This method returns a NodeList object. The namespaceURI and localName arguments are String values.

❑ importNode(importedNode, deep)—This method returns a Node object. The importedNode argument is a Node object. The deep argument is a Boolean value. This method can raise a DOMException object.

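A few of the Document methods can be combined in one sketch, assuming Python's xml.dom.minidom rather than MSXML. The Index and Appendix elements are hypothetical.

```python
from xml.dom import minidom

doc = minidom.parseString("<Book><Chapter>One</Chapter><Chapter>Two</Chapter></Book>")

# getElementsByTagName() returns a NodeList of the matching elements.
chapters = doc.getElementsByTagName("Chapter")

# createDocumentFragment() makes a container; appending the fragment
# moves its children into the target node.
fragment = doc.createDocumentFragment()
fragment.appendChild(doc.createComment(" end of chapters "))
fragment.appendChild(doc.createElement("Index"))
doc.documentElement.appendChild(fragment)

# importNode() copies a node from another document into this one.
other = minidom.parseString("<Appendix>Extra</Appendix>")
imported = doc.importNode(other.documentElement, True)
doc.documentElement.appendChild(imported)

final_xml = doc.documentElement.toxml()
print(final_xml)
```

The imported copy belongs to the target document, while the original node stays in its own document.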