Sheet 1: XML Technology in E-Commerce 2001, Lecture 3

XML Technology in E-Commerce

Lecture 3

DOM and SAX

Sheet 2: Lecture Outline

• General Model for XML Processing;
• Document Object Model (DOM):
  – Logical Model;
  – DOM Interfaces;
  – Example;
• Simple API for XML (SAX):
  – Parser Architecture;
  – Events;
  – Java Classes and Interfaces;
  – Example. Error Handling;
• Summary.

Sheet 3: General Model for XML Processing

[Figure: general model for XML processing. Legend: DOM - Document Object Model; SAX - Simple API for XML]

Sheet 4: DOM

• DOM defines:
  – a logical model for XML documents;
  – platform- and language-independent application programming interfaces for manipulating that model;
• DOM allows:
  – accessing document content;
  – modifying document content;
  – creating new documents in memory;
• DOM homepage: http://www.w3.org/DOM/
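As an illustration of "creating new documents in memory", here is a minimal sketch. It assumes the JAXP `DocumentBuilder.newDocument()` entry point found in current JDKs; the class name `NewDocumentSketch` and the element names are illustrative only and not part of the lecture demo.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class: builds a small DOM tree without reading any file.
class NewDocumentSketch {

    static Document buildCompetition() throws ParserConfigurationException {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();      // empty document node

        Element root = doc.createElement("competition");
        doc.appendChild(root);                     // becomes the document element

        Element results = doc.createElement("results");
        root.appendChild(results);

        // Nodes are always created by the owning Document, then attached.
        Element name = doc.createElement("name");
        name.appendChild(doc.createTextNode("John Smith"));
        results.appendChild(name);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildCompetition();
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Creating a node and attaching it are separate steps: `createElement` and `createTextNode` only produce detached nodes, and `appendChild` places them in the tree.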

Sheet 5: DOM Logical Model

XML Document:

    <competition>
      <results>
        <name>John Smith</name>
        <name>Derek Warwick</name>
        <name>Mik Douglas</name>
      </results>
      <photos>
        <img src="img1.gif"/>
        <img src="img2.gif"/>
      </photos>
    </competition>

DOM Tree Structure:

    competition
      results
        name: John Smith
        name: D. Warwick
        name: M. Douglas
      photos
        img
        img

An XML document is a set of nodes that form a tree structure. There are different node types: for elements, attributes, text content, etc.
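To make the node types concrete, the following sketch parses the competition document from this sheet and counts nodes by type. It is only an illustration: the class name `NodeWalkSketch` is made up, the document is passed as a string without whitespace between tags (so no whitespace-only text nodes appear), and the JAXP parser bundled with a current JDK is assumed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class: walks a DOM tree recursively through the generic Node interface.
class NodeWalkSketch {

    // Counts the nodes of the given type in the subtree rooted at 'node'.
    // Attributes are not children of their element, so this walk never visits them.
    static int count(Node node, short type) {
        int total = (node.getNodeType() == type) ? 1 : 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            total += count(children.item(i), type);
        }
        return total;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<competition><results><name>John Smith</name>"
            + "<name>Derek Warwick</name><name>Mik Douglas</name></results>"
            + "<photos><img src=\"img1.gif\"/><img src=\"img2.gif\"/></photos>"
            + "</competition>";
        Document doc = parse(xml);
        System.out.println("elements:   " + count(doc, Node.ELEMENT_NODE));
        System.out.println("text nodes: " + count(doc, Node.TEXT_NODE));
    }
}
```

The `src` attributes do not show up in the walk because attribute nodes hang off their element through `getAttributes()` rather than `getChildNodes()`.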

Sheet 6: DOM Programming Interfaces

• DOM interfaces are defined in the Interface Definition Language (IDL);
• There are bindings for different languages: JavaScript, C++, Python, Java.

Sheet 7: DOM Interface Hierarchy

[Figure: the more important interfaces defined in the Java package org.w3c.dom]

Sheet 8: DOM Structure Model

Sheet 9: DOM Interface Details

• DOM provides two groups of interfaces:
  – Generic: Node, NodeList, NamedNodeMap;
  – Specialized: Node subinterfaces for elements, attributes, text nodes, etc.
• Interfaces:
  – Node: Deitel 8.5, Fig. 8.7, page 201;
  – Document: Deitel 8.5, Fig. 8.5, page 200;
  – Element: Deitel 8.5, Fig. 8.9, page 200;
  – Attr;
  – Text;
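The difference between the generic and the specialized interfaces can be sketched as follows. The example reads the `src` attributes of the `img` elements from the competition document through the generic `NodeList` and `NamedNodeMap`, and uses the specialized `Element` and `Attr` interfaces for the tag name and attribute value. The class name `InterfacesSketch` and the output format are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class: mixes generic (Node, NodeList, NamedNodeMap)
// and specialized (Element, Attr) DOM interfaces.
class InterfacesSketch {

    static List<String> imageSources(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        List<String> sources = new ArrayList<>();
        NodeList imgs = doc.getElementsByTagName("img");   // generic, ordered list
        for (int i = 0; i < imgs.getLength(); i++) {
            Node img = imgs.item(i);                       // generic view of the element
            NamedNodeMap attrs = img.getAttributes();      // generic, unordered, keyed by name
            Attr src = (Attr) attrs.getNamedItem("src");   // specialized attribute node
            String tag = ((Element) img).getTagName();     // specialized element view
            sources.add(tag + "@src=" + src.getValue());
        }
        return sources;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<photos><img src=\"img1.gif\"/><img src=\"img2.gif\"/></photos>";
        for (String s : imageSources(xml)) {
            System.out.println(s);
        }
    }
}
```

The same value is also reachable in one call with `((Element) img).getAttribute("src")`; the generic route is shown here only to exercise `NamedNodeMap`.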

Sheet 10: DOM Demo

• Demo: example in Fig. 8.10, Deitel 8.5, page 202;
• Tools:
  – Java 1.2.2: http://java.sun.com/products/jdk/1.2/
  – Java API for XML Processing (JAXP) 1.0.1: http://java.sun.com/xml/archive.html
  – Classes: jaxp.jar and parser.jar;
• Demo files:
  – ReplaceText.java;
  – MyErrorHandler.java;
  – intro.xml;

Sheet 11: DOM Demo Explained (1)

• Importing packages:

    import java.io.*;                      // File and FileOutputStream, used in the later steps
    import org.w3c.dom.*;                  // DOM interfaces
    import org.xml.sax.*;                  // SAX exceptions and error handling
    import javax.xml.parsers.*;            // JAXP factory and builder
    import com.sun.xml.tree.XmlDocument;   // implementation-specific class, used for saving

• Instantiating the parser. DOM does not specify how a parser is instantiated, so this is an implementation-specific detail:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating( true );
    DocumentBuilder builder = factory.newDocumentBuilder();

Sheet 12: DOM Demo Explained (2)

• Loading and parsing the XML file:

    Document document = builder.parse( new File( "intro.xml" ) );

• Getting the root element (myMessage):

    Node root = document.getDocumentElement();

• Casting the root to the Element type:

    Element myMessageNode = ( Element ) root;

• Finding the message elements:

    NodeList messageNodes = myMessageNode.getElementsByTagName( "message" );

• Getting the first message element:

    Node message = messageNodes.item( 0 );

Sheet 13: DOM Demo Explained (3)

• Creating new text content and replacing the old one:

    Text newText = document.createTextNode( "New Changed Message!!" );
    Text oldText = ( Text ) message.getChildNodes().item( 0 );
    message.replaceChild( newText, oldText );

• Writing the changed document to a new file. DOM does not specify how to save the DOM structure, so this is an implementation-specific detail:

    ( ( XmlDocument ) document ).write( new FileOutputStream( "intro1.xml" ) );
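Because `XmlDocument` is specific to one parser implementation, the demo's replace-and-save steps may not compile against other parsers. The sketch below shows one possible alternative, not the book's code: it repeats the replacement and serializes through the identity `Transformer` from `javax.xml.transform` (available from JAXP 1.1 onward). The class name `ReplaceTextSketch` and the sample input, a `myMessage` root with one `message` child, are assumptions about what intro.xml looks like.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class: same replacement as the demo, but saved without XmlDocument.
class ReplaceTextSketch {

    static String replaceFirstMessage(String xml, String replacement) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Locate the first <message> element and swap its first child (the text node).
        Node message = document.getElementsByTagName("message").item(0);
        Node oldText = message.getFirstChild();
        message.replaceChild(document.createTextNode(replacement), oldText);

        // A Transformer created without a stylesheet copies its input unchanged,
        // which makes it usable as a serializer for the DOM tree.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed shape of intro.xml; the real file may differ.
        String xml = "<myMessage><message>Welcome to XML!</message></myMessage>";
        System.out.println(replaceFirstMessage(xml, "New Changed Message!!"));
    }
}
```

To write to a file instead of a string, the `StreamResult` can wrap a `java.io.File` or an output stream.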

Sheet 14: DOM Specification Levels

• DOM Level 1 (discussed here);
• DOM Level 2:
  – Namespace support;
  – Stylesheets interface;
  – Model for events;
  – Views, Range and Traversal interfaces;
• DOM Level 3 (work in progress):
  – Loading and saving documents;
  – Model for DTD and Schema;
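As a small illustration of the Level 2 namespace support, the sketch below selects elements by namespace URI and local name with `getElementsByTagNameNS`. It assumes a current JDK parser; the class name `NamespaceSketch` and the `urn:a` / `urn:b` namespaces are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class: namespace-aware lookup, a DOM Level 2 feature.
class NamespaceSketch {

    static int countInNamespace(String xml, String uri, String localName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);   // JAXP factories are not namespace-aware by default
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // Matches on (namespace URI, local name); the prefix used in the source is irrelevant.
        return doc.getElementsByTagNameNS(uri, localName).getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<r xmlns:a=\"urn:a\" xmlns:b=\"urn:b\">"
            + "<a:item/><b:item/><a:item/></r>";
        System.out.println(countInNamespace(xml, "urn:a", "item"));
    }
}
```

A Level 1 style `getElementsByTagName("item")` would not distinguish the two vocabularies, which is the gap the Level 2 methods close.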

Sheet 15: Lecture Outline

• General Model for XML Processing;
• Document Object Model (DOM):
  – Logical Model;
  – DOM Interfaces;
  – Example;
• Simple API for XML (SAX):
  – Parser Architecture;
  – Events;
  – Java Classes and Interfaces;
  – Example. Error Handling;
• Summary.

Sheet 16: SAX

• SAX: Simple API for XML;
• Developed by the members of the XML-DEV mailing list in 1998;
• SAX is event-based:
  – The parser reports parsing events: start and end of the document, start and end of an element, errors, etc.
  – When an event occurs, the parser invokes a method on an event handler;
  – The application handles the events accordingly;
• SAX home page: http://www.megginson.com/SAX/
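The event model can be sketched with a handler that simply records each callback in the order the parser makes it. This is an illustration under assumptions: it uses `DefaultHandler` from SAX 2.0 (the replacement for the `HandlerBase` class, which SAX 2.0 deprecates), and the class name `EventLogSketch` is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: logs SAX events in the order the parser reports them.
class EventLogSketch extends DefaultHandler {

    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() { events.add("startDocument"); }

    @Override
    public void endDocument() { events.add("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    // Parses the given XML text and returns the recorded event names.
    static List<String> log(String xml) throws Exception {
        EventLogSketch handler = new EventLogSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : log("<a><b/></a>")) {
            System.out.println(event);
        }
    }
}
```

The parser drives the control flow here; the application never asks for the next node, it only reacts to callbacks.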

Sheet 17: SAX Parser Architecture

[Figure: XML Source, SAX Parser, and Application, connected through Document Handler, Error Handler, DTD Handler and Entity Resolver]

DocumentHandler, ErrorHandler, DTDHandler and EntityResolver are interfaces that the Application can implement.
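Each of the four interfaces corresponds to one registration call on the parser. The sketch below shows this with the SAX 2.0 API found in current JDKs, where `XMLReader` takes the place of the SAX 1.0 `Parser` and `ContentHandler` takes the place of `DocumentHandler`. The class name `HandlerRegistrationSketch`, the `example.invalid` DTD URL, and the choice to answer every entity request with an empty source are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical application class. DefaultHandler already implements
// ContentHandler, ErrorHandler, DTDHandler and EntityResolver with no-op defaults.
class HandlerRegistrationSketch extends DefaultHandler {

    final List<String> resolved = new ArrayList<>();   // system ids the parser asked for
    final List<String> elements = new ArrayList<>();   // element names seen

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        resolved.add(systemId);
        // Supply an empty entity instead of letting the parser fetch the URL.
        return new InputSource(new StringReader(""));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elements.add(qName);
    }

    static HandlerRegistrationSketch run(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        HandlerRegistrationSketch app = new HandlerRegistrationSketch();
        reader.setContentHandler(app);   // document events
        reader.setErrorHandler(app);     // warnings, errors, fatal errors
        reader.setDTDHandler(app);       // notations and unparsed entities
        reader.setEntityResolver(app);   // external entities such as the DTD
        reader.parse(new InputSource(new StringReader(xml)));
        return app;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE a SYSTEM \"http://example.invalid/a.dtd\"><a/>";
        HandlerRegistrationSketch app = run(xml);
        System.out.println("resolved: " + app.resolved);
        System.out.println("elements: " + app.elements);
    }
}
```

An application only needs to register the handlers it cares about; unregistered ones fall back to the parser's default behaviour.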

Sheet 18: SAX Events

Sheet 19: SAX DocumentHandler Interface

• Java package org.xml.sax;
• DocumentHandler interface. The more important methods (each is also declared to throw SAXException):

    public abstract void startDocument()
    public abstract void endDocument()
    public abstract void startElement(String name, AttributeList atts)
    public abstract void endElement(String name)
    public abstract void characters(char ch[], int start, int length)
    public abstract void processingInstruction(String target, String data)
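A typical use of these callbacks is to build an application-specific structure while the document streams past. The sketch below collects the text of every `name` element into a list. It is written against the SAX 2.0 `DefaultHandler`, whose `startElement` and `endElement` carry extra namespace parameters compared with the SAX 1.0 signatures listed on this sheet; the class name `NameCollector` is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: gathers the character data of each <name> element.
class NameCollector extends DefaultHandler {

    private final List<String> names = new ArrayList<>();
    private StringBuilder current;   // non-null only while inside a <name> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("name".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one run of text in several chunks, so append
        // rather than assign. Only the slice [start, start + length) is valid.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("name".equals(qName)) {
            names.add(current.toString());
            current = null;
        }
    }

    static List<String> collect(String xml) throws Exception {
        NameCollector handler = new NameCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<results><name>John Smith</name><name>Derek Warwick</name></results>";
        System.out.println(collect(xml));
    }
}
```

The handler has to keep its own state (`current`) between callbacks, because SAX itself remembers nothing about where in the document it is.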

Sheet 20: SAX Demo

• Demo: example in Fig. 9.3, Deitel 9.6, page 235;
• Tools:
  – Java 1.2.2: http://java.sun.com/products/jdk/1.2/
  – Java API for XML Processing (JAXP) 1.0.1: http://java.sun.com/xml/archive.html
  – Classes: jaxp.jar and parser.jar;
• Demo files:
  – Tree.java;
  – Sample XML documents;

Sheet 21: SAX Demo Explained (1)

• Importing packages:

    import java.io.File;   // needed for new File( ... ) when the parse is started
    import org.xml.sax.*;
    import javax.xml.parsers.SAXParserFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;

• Class HandlerBase:
  – Provides a default implementation of the four handler interfaces. Applications usually extend it and override some of its methods:

    public class Tree extends HandlerBase

  – The Tree class overrides the methods from the DocumentHandler interface;
  – The handler is registered with the parser before parsing;

Sheet 22: SAX Demo Explained (2)

• Instantiating the factory:

    SAXParserFactory saxFactory = SAXParserFactory.newInstance();
    saxFactory.setValidating( validate );

• Obtaining the parser and starting the parse:

    SAXParser saxParser = saxFactory.newSAXParser();
    saxParser.parse( new File( args[1] ), new Tree() );

Sheet 23: SAX Error Handling

• Three error types:
  – Fatal errors: usually violations of well-formedness constraints. The parser must stop processing;
  – Errors: usually violations of validity rules;
  – Warnings: related to the DTD;
• Errors are handled by implementing the ErrorHandler interface;
• The Tree class overrides the default implementation of the methods for warnings and errors;
• The same mechanism is used with DOM parsers;
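The three error levels and the remark that DOM parsers use the same mechanism can be sketched together. The handler below records which callback was invoked and is registered on a validating JAXP `DocumentBuilder`. It is not the MyErrorHandler.java from the demo; the class name `CollectingErrorHandler` and the sample documents are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical class: records the severity of every problem the parser reports.
class CollectingErrorHandler implements ErrorHandler {

    final List<String> kinds = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) { kinds.add("warning"); }

    @Override
    public void error(SAXParseException e) {
        kinds.add("error");              // validity problem: parsing continues
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        kinds.add("fatal");              // well-formedness problem
        throw e;                         // stop processing
    }

    // Parses with a validating DOM builder and returns the handler that was used.
    static CollectingErrorHandler check(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        CollectingErrorHandler handler = new CollectingErrorHandler();
        builder.setErrorHandler(handler);   // same SAX interface, used by a DOM parser
        try {
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Already recorded by fatalError; nothing more to do in this sketch.
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)> <!ELEMENT b EMPTY>]>";
        System.out.println(check(dtd + "<a><b/></a>").kinds);   // valid
        System.out.println(check(dtd + "<a><c/></a>").kinds);   // invalid
        System.out.println(check(dtd + "<a><b></a>").kinds);    // not well-formed
    }
}
```

Whether an `error` should abort processing is the application's decision; this sketch lets parsing continue and only aborts on fatal errors.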

Sheet 24: SAX 2.0

• Main changes:
  – Namespace support;
  – Introduction of a filter mechanism;
  – Interface DocumentHandler is replaced by ContentHandler;
  – New exception classes;
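The filter mechanism can be illustrated with `XMLFilterImpl` from `org.xml.sax.helpers`, which sits between the parser and the application's `ContentHandler` and forwards events by default. The sketch below suppresses the start and end events of one element name. The class name `DropElementFilter` is made up, and the filter is deliberately simplistic: children and text of a dropped element would still pass through, so it is only suitable for empty elements such as `img`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: hides elements with a given name from the downstream handler.
class DropElementFilter extends XMLFilterImpl {

    private final String dropped;

    DropElementFilter(XMLReader parent, String dropped) {
        super(parent);
        this.dropped = dropped;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (!dropped.equals(qName)) {
            super.startElement(uri, localName, qName, atts);   // forward downstream
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (!dropped.equals(qName)) {
            super.endElement(uri, localName, qName);
        }
    }

    // Returns the element names that reach the application after filtering.
    static List<String> elementNames(String xml, String dropped) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        DropElementFilter filter = new DropElementFilter(parser, dropped);
        final List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) {
                seen.add(qName);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));   // the filter drives its parent
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<photos><img src=\"img1.gif\"/><img src=\"img2.gif\"/></photos>";
        System.out.println(elementNames(xml, "img"));
    }
}
```

Because a filter is itself an `XMLReader`, filters can be chained, each one wrapping the previous as its parent.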

Sheet 25: SAX and DOM Comparison

• DOM:
  – maintains an internal structure for the document;
  – possibly high memory usage for large documents;
  – enables traversing;
• SAX:
  – doesn't maintain an internal structure;
  – enables building of a custom structure;
  – low memory usage;
  – usually faster than DOM;
  – traversing is impossible without an internal structure;
• Usually a DOM implementation is built on top of a SAX parser;

Sheet 26: Summary

• Two approaches to XML processing:
  – Tree-based (DOM);
  – Event-based (SAX);
• Tools:
  – JDK 1.2.2;
  – JAXP 1.0.1 (used in the book);
  – JAXP 1.1 is also available;
  – See also http://xml.apache.org;

Read: Deitel 8, 9

Assignment: Modify the case study in Deitel 8.8. In the new version, the query should be based only on year, month and day (time is excluded). Add new functionality for making a new appointment for a meeting on the found day at a specified time.

For a more detailed explanation and some hints, see the course site.