Intro to XML Originally Presented by Clifford Lemoine Modified by Box.

Page 1: Intro to XML

Originally presented by Clifford Lemoine

Modified by Box

Page 2: Introduction to XML

- Review of XML
- What is different about XML parsing?
- Simple example program
- Wrap-up
- References

Page 3: Quick XML Review

- XML: wave of the future
- Method of representing data
- Differs from HTML by storing and representing data instead of displaying or formatting data
- Tags similar to HTML tags, only they are user-defined
- Follows a small set of basic rules
- Stored as a plain text file (UTF-8 unless another encoding is declared), so documents are easy to move between systems

Page 4: Quick XML Review

Syntax
- An XML document should begin with an XML declaration (optional in XML 1.0, but recommended):
    <?xml version="1.0" ?>
- An XML document may or may not have a DTD (Document Type Definition) or Schema:
    <!DOCTYPE catalog>

Page 5: Quick XML Review

Syntax cont.
- Every element has a start and end tag, with optional attributes:
    <catalog version="1.0"> … </catalog>
- If an element does not contain any data (or elements) nested within, the closing tag can be merged with the start tag like so:
    <catalog version="1.0"/>

Page 6: Quick XML Review

Syntax cont.
- Elements must be properly nested
- The outermost element is called the root element
- An XML document that follows the basic syntax rules is called well-formed
- An XML document that is well-formed and conforms to a DTD or Schema is called valid
- Once again, XML documents do not always require a DTD or Schema, but they must be well-formed
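The well-formedness rule above can be checked mechanically. The following is a minimal sketch, assuming the JDK's built-in JAXP parser rather than any library named in the slides; the class name `WellFormedCheck` is hypothetical. It only tests well-formedness and does not validate against a DTD or Schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original slides.
class WellFormedCheck {

    /** Returns true if the text parses as well-formed XML (no DTD/Schema validation). */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a basic syntax rule was broken
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<catalog><book/></catalog>"));        // true
        System.out.println(isWellFormed("<catalog><book></catalog></book>")); // false: bad nesting
    }
}
```

A document that passes this check may still be invalid with respect to a particular DTD or Schema; validity is a separate, stricter test.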

Page 7: Quick XML Review

Sample XML files
- Catalog.xml

Page 8: Sample XML file (Catalog.xml)

    <?xml version="1.0"?>
    <catalog library="somewhere">
      <book>
        <author>John Doe</author>
        <title>Title 1</title>
      </book>
      <book>
        <author>Phill Smith</author>
        <title>His One Book</title>
      </book>
      <magazine>
        <name>PC Mag</name>
        <article page="17">
          <headline>Second Headline</headline>
        </article>
      </magazine>
    </catalog>

Page 9: Let’s work on our project input

- Back to our original class diagram (network configuration)
- Define your input: xml\net.xml

Page 10: What is an XML Parser?

A program or module that checks that a document is well-formed and provides the ability to manipulate XML data elements:
- Navigate through the XML document (DOM or SAX)
- Extract or query data elements
- Add/delete/modify data elements
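The three capabilities listed above (navigate, query, modify) can be sketched with the DOM API that ships with the JDK. This is an illustration only, assuming a catalog shaped like the Catalog.xml sample; the class `DomEditDemo` and its method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original slides.
class DomEditDemo {

    /** Parses XML text into an in-memory DOM tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Adds a <book> with <author> and <title> children under the root element. */
    static void addBook(Document doc, String author, String title) {
        Element book = doc.createElement("book");
        Element authorEl = doc.createElement("author");
        authorEl.setTextContent(author);
        Element titleEl = doc.createElement("title");
        titleEl.setTextContent(title);
        book.appendChild(authorEl);
        book.appendChild(titleEl);
        doc.getDocumentElement().appendChild(book);
    }

    /** Removes every element with the given tag name; returns how many were removed. */
    static int removeAll(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        int removed = 0;
        // The NodeList is live, so keep removing the first item until none remain.
        while (nodes.getLength() > 0) {
            Node node = nodes.item(0);
            node.getParentNode().removeChild(node);
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        Document doc = parse("<catalog><book><author>John Doe</author>"
                + "<title>Title 1</title></book><magazine><name>PC Mag</name></magazine></catalog>");
        // Navigate and query: text of the first <title> element.
        System.out.println(doc.getElementsByTagName("title").item(0).getTextContent());
        addBook(doc, "Phill Smith", "His One Book");   // add
        removeAll(doc, "magazine");                     // delete
        System.out.println(doc.getElementsByTagName("book").getLength() + " books");
    }
}
```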

Page 11: XML Parsing

- DOM (Document Object Model)
  - Reads the whole document and builds a DOM tree
- Simple API for XML = SAX
  - SAX is an event-based parsing method
  - Reads an XML document, firing (or calling) callback methods when certain events are found (e.g. start/end tags with their attributes, character data, etc.)
- Pull parser (not covered further here)
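To make the event-based model concrete, here is a minimal sketch of a SAX handler using the parser bundled with the JDK. The class names `SaxTitleDemo` and `TitleHandler` are hypothetical, and the input is assumed to look like the Catalog.xml sample. The parser calls `startElement`, `characters`, and `endElement` as it reads; no tree is built.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original slides.
class SaxTitleDemo {

    /** Collects the text of every <title> element as SAX events arrive. */
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a <title>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("title".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one run of text, so append.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName)) {
                titles.add(current.toString());
                current = null;
            }
        }
    }

    /** Runs a single SAX pass over the text and returns the collected titles. */
    static List<String> titles(String xml) {
        try {
            TitleHandler handler = new TitleHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles("<catalog><book><title>Title 1</title></book>"
                + "<book><title>His One Book</title></book></catalog>"));
    }
}
```

The handler has to keep its own state (`current`) between callbacks, because SAX itself remembers nothing about what it has already read.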

Page 12: DOM vs. SAX Parsing?

- Unlike DOM (Document Object Model), SAX does not store information in an internal tree structure
- Because of this, SAX is able to parse huge documents (think gigabytes) without having to allocate large amounts of system resources
- Really great if the amount of data you’re looking to process is relatively large (no waste of memory on tree)
- If processing is built as a pipeline, you don’t have to wait for the data to be converted to an object; you can go to the next process once it clears the preceding callback method

Page 13: DOM vs. SAX Parsing?

- Most limitations are the programmer’s problem, not the API’s
- SAX does not allow random access to the file; it proceeds in a single pass, firing events as it goes
- Makes it hard to implement cross-referencing in XML (ID and IDREF) as well as complex searching routines
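For contrast, a search that is awkward in a single SAX pass is short once the whole tree is in memory. The sketch below uses the JDK's DOM parser together with its XPath API (`javax.xml.xpath`), neither of which is mentioned in the slides; the class `XPathSearchDemo` is hypothetical and the data mirrors the Catalog.xml sample.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original slides.
class XPathSearchDemo {

    /** Builds a DOM tree from the text and returns the string value of the XPath expression. */
    static String find(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("query failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog><magazine><name>PC Mag</name>"
                + "<article page=\"17\"><headline>Second Headline</headline></article>"
                + "</magazine></catalog>";
        // Look up a headline by page number, then a page number by headline.
        System.out.println(find(xml, "/catalog/magazine/article[@page='17']/headline"));
        System.out.println(find(xml, "//article[headline='Second Headline']/@page"));
    }
}
```

The price is the one the slides describe: the entire document is held in memory, so this approach suits moderate-sized inputs rather than gigabyte-scale files.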

Page 14: XML Parser implementations

- Apache.org Xerces package at http://xml.apache.org/
- JDOM.org jdom package at www.jdom.org

Page 15: Simple Example Program

WarReader.java (SAXBuilder, Document, and Element are JDOM classes; in JDOM 1.x they live in org.jdom.input and org.jdom)
- Build document from XML file:
    SAXBuilder builder = new SAXBuilder();
    Document doc = builder.build(new File(filename));
- Get the root element (node):
    Element root = doc.getRootElement();
- Get children of the root:
    List servlets = root.getChildren("servlet");
- Iterate through each child and extract more detailed info

xml\WarReader.java
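WarReader.java itself is not reproduced in this transcript. As a rough illustration of the same four steps, here is a sketch that swaps JDOM for the DOM API bundled with the JDK, so it runs without extra libraries. The class `ServletLister` is hypothetical, and the child element names `servlet-name` and `servlet-class` are assumptions based on the usual web.xml layout, not taken from the slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for WarReader.java, using JDK DOM instead of JDOM.
class ServletLister {

    /** Returns "name -> class" for each <servlet> that is a direct child of the root. */
    static List<String> listServlets(String xml) {
        try {
            // Step 1: build the document.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Step 2: get the root element.
            Element root = doc.getDocumentElement();
            // Step 3: get the root's children and keep only <servlet> elements.
            List<String> result = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node.getNodeType() == Node.ELEMENT_NODE
                        && "servlet".equals(node.getNodeName())) {
                    // Step 4: extract more detailed info from each child.
                    Element servlet = (Element) node;
                    result.add(text(servlet, "servlet-name") + " -> "
                            + text(servlet, "servlet-class"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Text of the first descendant with the given tag, or "" if there is none. */
    private static String text(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        System.out.println(listServlets("<web-app><servlet>"
                + "<servlet-name>hello</servlet-name>"
                + "<servlet-class>demo.Hello</servlet-class>"
                + "</servlet></web-app>"));
    }
}
```

JDOM's `getChildren("servlet")` returns direct children only, which is why the loop filters the root's child nodes instead of calling `getElementsByTagName` on the whole document.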

Page 16: Using XML for your Final Term Project

- Each team spends 10 mins to come up with a data structure and its XML representation
- xml\NetReader.java
- Demonstration here

Page 17: References

- Gittleman, Art. Advanced Java: Internet Applications (Second Edition). Scott Jones Publishers. El Granada, California. 2002. pp. 504-511.
- "JDOM Makes XML Easy" slides from JavaOne 2002. http://www.servlets.com/speaking/descriptions.html#jdom
- Janert, Phillip K. "Simple XML Parsing with SAX and DOM." http://www.onjava.com/pub/a/onjava/2002/06/26/xml.html Published June 26, 2002. Accessed February 10, 2003.
- Wati, Anjini. "E-Catalog for a Small to Medium Enterprise." http://ispg.csu.edu.au/subjects/itc594/reports/Tr-005.doc Accessed February 10, 2003.