
Java WWW Week 10

Version 2.1 Mar 2008

Slide [email protected]

Java (JSP) and XML

Format of lecture:

What is XML?

A sample XML file…

How to use XML with JSP

Example code for parsing an XML file in JSP


What is XML?

XML is principally concerned with the description and structure of data (content and presentation are separate)

Traditional methods of data storage and exchange employ a variety of schemes

Electronically, these usually take the form of simple text files (Comma Separated Values etc.) or binary files

Both methods have their own advantages and disadvantages

CSV files contain data that can be easily read but lack a description of their own format

Binary files contain data and a description of its format (as a Word document does) but are proprietary schemes that require specific applications to read them


What is XML?

XML stands for eXtensible Markup Language

Developed by World Wide Web Consortium (W3C)

Aims to provide the best of both worlds

Stores data in an easy to read text format

Also contains a description of the data

Open standard - looks similar to HTML (large user base) except the extensible nature of XML allows for the creation of user-defined tags

It is not a replacement for HTML (although it may eventually supplant it)


What is XML?

XML offers a standard method for storing structured data

Language independence (English, Chinese etc.)

Hierarchical structure allows for simple and efficient querying/parsing of the document

Simple data interchange between applications and/or distributed objects


What is XML?

Can improve upon and replace existing EDI (Electronic Data Interchange) solutions such as EDIFACT

Websites gain by having content and presentation separate

A site could be developed purely in XML

Cost benefits:

No need for private EDI networks: the Internet is used as the exchange medium

Reduced time to implement and reduced maintenance


Illustrative Example

Paper-based or Simple Text File

Jonathan Westlake
Staffordshire University
[email protected]

Comma Separated Values

Jonathan,Westlake,Staffordshire University,[email protected]

Binary: a string of 0s and 1s

0101 1001 0111 0110 0110 0001 0110 1110 0010 0000 etc.


Illustrative Example

Simple ‘Freeform’ (i.e. without an XML schema) XML File

<contact>
  <name>
    <firstname>Jonathan</firstname>
    <lastname>Westlake</lastname>
  </name>
  <workplace>Staffordshire University</workplace>
  <email>[email protected]</email>
</contact>


How to use XML with JSP

XML is a big topic! We are just going to look at one of the more fundamental aspects of XML

An XML parser checks that your XML document is syntactically correct (well-formed); a validating parser also checks that it follows the structure declared in its DTD or schema (valid)

Once an XML document is parsed, the information it contains is accessible inside our web application

There are two common ways of parsing an XML document: Simple API for XML (SAX) <reference link> and Document Object Model (DOM) <reference link>
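As a rough illustration of the well-formedness check, the sketch below parses XML held in a string with the JDK's built-in DOM parser and reports whether parsing succeeded. The class name WellFormedCheck and the method isWellFormed are hypothetical names made up for this example, and no schema validation is attempted here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class WellFormedCheck {

    // Hypothetical helper: returns true if the text parses as well-formed XML.
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser throws a SAXException when the document is not well-formed
            // (the default parser may also print the error to standard error).
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory string with a default parser.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Matching start and end tags: well-formed
        System.out.println(isWellFormed("<contact><workplace>Staffordshire University</workplace></contact>"));
        // The <workplace> element is never closed: not well-formed
        System.out.println(isWellFormed("<contact><workplace>Staffordshire University</contact>"));
    }
}
```

The first call should report true and the second false, because the second string leaves an element unclosed.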


SAX Parser

Event driven: an event is triggered each time the parser encounters a start tag or an end tag (and also for the character data between tags)
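A minimal sketch of the event-driven idea, using the SAX parser that ships with the JDK. The class SaxEventsDemo and the method collectEvents are hypothetical names for this illustration; the handler simply records each start-tag and end-tag event in the order the parser reports them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventsDemo {

    // Hypothetical helper: returns the start/end tag events in the order they fire.
    public static List<String> collectEvents(String xml) {
        final List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName);
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<contact><workplace>Staffordshire University</workplace></contact>";
        for (String event : collectEvents(xml)) {
            System.out.println(event);
        }
    }
}
```

Unlike DOM, nothing is kept in memory unless the handler chooses to store it, which is why SAX suits large documents that only need a single pass.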


DOM Parser

A tree representing the document is built in memory.


DOM Tree Structure

For our contacts example

contacts
└---contact
    |---name
    |   |---firstname
    |   |   └---Text
    |   └---lastname
    |       └---Text
    |---workplace
    |   └---Text
    └---email
        └---Text
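To see this tree from code, the sketch below builds a DOM from a cut-down version of the contacts example (the email element is left out) and walks it recursively, printing one line per node. DomTreeDemo, describe and outline are hypothetical names chosen for this illustration. The XML is written without whitespace between tags, since a non-validating parser would otherwise add whitespace-only Text nodes to the tree.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomTreeDemo {

    // Recursively append one line per node, indented by its depth in the tree.
    static void describe(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append("Text: ").append(node.getNodeValue()).append('\n');
        } else {
            out.append(node.getNodeName()).append('\n');
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            describe(children.item(i), depth + 1, out);
        }
    }

    // Hypothetical helper: parse the XML text and return an indented outline of its DOM tree.
    public static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            describe(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<contacts><contact>"
                + "<name><firstname>Jonathan</firstname><lastname>Westlake</lastname></name>"
                + "<workplace>Staffordshire University</workplace>"
                + "</contact></contacts>";
        System.out.print(outline(xml));
    }
}
```

Note that the text of each element is held in a separate Text child node, exactly as in the diagram.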


Standard DOM Parsing

Searching for a contact based on their email address

Create a Document object

Load the XML file into the Document

Using iterations over the Document nodes, you can access any of the values or attributes that are stored in the XML file

When you find the record that contains the email address that you are looking for, do something with the information

Standard DOM Parsing

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Get a factory object (there are many ways to build a Document; the factory lets us choose)
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();

// Ask the factory for parsers that validate the XML data against an XML Schema
docBuilderFactory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");
docBuilderFactory.setValidating(true);
docBuilderFactory.setNamespaceAware(true);

// Get the DocumentBuilder object that we will use to build the DOM tree in memory
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();

// Load the XML file into a new Document object
Document doc = docBuilder.parse(new File("myfile.xml"));

// Does a bit of extra DOM processing (merges adjacent text nodes and removes empty ones)
doc.normalize();

Standard DOM Parsing

// Get every element in the document, in document order
NodeList nodelist = doc.getElementsByTagName("*");
String firstname = null, lastname = null, workplace = null, email = null;

for (int index = 0; index < nodelist.getLength(); index++) {
    Node node = nodelist.item(index);
    String name = node.getNodeName();

    // getNodeValue() is null for an element, so read its text with getTextContent()
    if (name.equals("firstname")) firstname = node.getTextContent();
    if (name.equals("lastname")) lastname = node.getTextContent();
    if (name.equals("workplace")) workplace = node.getTextContent();
    if (name.equals("email")) {
        email = node.getTextContent();
        // email is the last field of a contact, so the record is complete at this point
        if (email.equals("[email protected]")) {
            // displayRecord() is assumed to be defined elsewhere in the application
            displayRecord(firstname, lastname, workplace, email);
            break;
        }
    }
}


Standard DOM Parsing Problems

The previous procedure is OK if you have a fairly ‘flat’ XML structure that does not contain many different node types

If you have many contacts stored then it may be very slow to iterate through the nodes until you find the contact you are looking for

For more complex XML documents, you can’t use a simple for loop to iterate

You end up with code that looks more like…


Not Nice!

Element root = doc.getDocumentElement();
Node configNode = root.getFirstChild();
NodeList childNodes = configNode.getChildNodes();
for (int childNum = 0; childNum < childNodes.getLength(); childNum++) {
    if (childNodes.item(childNum).getNodeType() == Node.ELEMENT_NODE) {
        Element child = (Element) childNodes.item(childNum);
        if (child.getTagName().equals("header")) {
            // Do something with the header
            System.out.print("Got a header!\n");
        }
    }
}


XPath

XPath (XML Path Language) is a terse (short, non-XML) syntax for addressing portions of an XML document

A typical XPath expression is a Location Path consisting of a string of element or attribute qualifiers separated by forward slashes ("/"), similar in appearance to a file system path

E.g. this gets the email field of the first contact:

//contact[1]/email

NB would you expect to have seen //contact[0]/...?
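XPath counts positions from 1, not 0, so a predicate of [0] never matches anything. The sketch below tries this out with the XPath support built into the JDK. The class XPathIndexDemo, the method select and the two example.com addresses are made up for this illustration and are not part of the lecture's contacts file.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathIndexDemo {

    // Made-up sample data: two contacts with placeholder addresses.
    static final String XML =
            "<contacts>"
            + "<contact><email>first@example.com</email></contact>"
            + "<contact><email>second@example.com</email></contact>"
            + "</contacts>";

    // Hypothetical helper: evaluate an XPath expression against the sample XML
    // and return the result as a string (empty if nothing matches).
    public static String select(String expression) {
        XPath xPath = XPathFactory.newInstance().newXPath();
        try {
            return xPath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("//contact[1]/email"));           // email of the first contact
        System.out.println(select("//contact[2]/email"));           // email of the second contact
        System.out.println(select("//contact[0]/email").isEmpty()); // no contact has position 0
    }
}
```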


Using XPath

We can use the XPath syntax to retrieve any information we like from the Document

XPath is a new(ish) specification and initially it was only available via Java extensions

As of version 1.5 of the JDK, Java now natively supports XPath

Search Using XPath

// create the XPath and initialize it
XPath xPath = XPathFactory.newInstance().newXPath();

// now execute the XPath select statement to get the contact that matches the email address
Node contact = (Node) xPath.evaluate("//contact[email='[email protected]']", doc, XPathConstants.NODE);

// contact is null when no record matches
if (contact != null) {
    // evaluate relative paths against the matching contact to read each field as a String
    String firstname = xPath.evaluate("name/firstname", contact);
    String lastname = xPath.evaluate("name/lastname", contact);
    String workplace = xPath.evaluate("workplace", contact);
    String email = xPath.evaluate("email", contact);
}


XMLHelper.java

package xmlhelper;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

So JSP has access to a set of packages which include DOM and XPath


getNodeListXPath()

public static NodeList getNodeListXPath(String expression, Document target) throws XPathExpressionException {
    // create the XPath and initialize it
    XPath xPath = XPathFactory.newInstance().newXPath();

    // now execute the XPath select statement
    NodeList nodeList = (NodeList) xPath.evaluate(expression, target, XPathConstants.NODESET);

    // return the resulting list of nodes
    return nodeList;
}


getBooleanXPath()

public static boolean getBooleanXPath(String expression, Document target) throws XPathExpressionException {
    // create the XPath and initialize it
    XPath xPath = XPathFactory.newInstance().newXPath();

    // now execute the XPath select statement
    Boolean nodeBoolean = (Boolean) xPath.evaluate(expression, target, XPathConstants.BOOLEAN);

    // return the resulting boolean
    return nodeBoolean.booleanValue();
}


getNumberXPath()

public static double getNumberXPath(String expression, Document target) throws XPathExpressionException {
    // create the XPath and initialize it
    XPath xPath = XPathFactory.newInstance().newXPath();

    // now execute the XPath select statement
    Double nodeNumber = (Double) xPath.evaluate(expression, target, XPathConstants.NUMBER);

    // return the resulting number
    return nodeNumber.doubleValue();
}


getStringXPath()

public static String getStringXPath(String expression, Document target) throws XPathExpressionException {
    // create the XPath and initialize it
    XPath xPath = XPathFactory.newInstance().newXPath();

    // now execute the XPath select statement
    String nodeText = (String) xPath.evaluate(expression, target, XPathConstants.STRING);

    // return the resulting text
    return nodeText;
}
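The four helper methods differ only in the return type they request. As a self-contained sketch of what each type gives back, the example below makes the same kind of evaluate call with each XPathConstants value. XPathTypesDemo, its evaluate method and the second contact in the sample data are made up for this illustration; it does not call XMLHelper itself.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathTypesDemo {

    // Sample data: the lecture's contact plus one made-up contact.
    static final String XML =
            "<contacts>"
            + "<contact><name><firstname>Jonathan</firstname><lastname>Westlake</lastname></name>"
            + "<workplace>Staffordshire University</workplace></contact>"
            + "<contact><name><firstname>Sample</firstname><lastname>Person</lastname></name>"
            + "<workplace>Example College</workplace></contact>"
            + "</contacts>";

    // Hypothetical helper: parse the sample XML and evaluate the expression
    // with the requested return type (NODESET, BOOLEAN, NUMBER or STRING).
    public static Object evaluate(String expression, QName returnType) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, doc, returnType);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        NodeList contacts = (NodeList) evaluate("//contact", XPathConstants.NODESET);
        Boolean found = (Boolean) evaluate(
                "//contact[workplace='Staffordshire University']", XPathConstants.BOOLEAN);
        Double count = (Double) evaluate("count(//contact)", XPathConstants.NUMBER);
        String lastname = (String) evaluate("//contact[1]/name/lastname", XPathConstants.STRING);

        System.out.println(contacts.getLength()); // number of matching nodes
        System.out.println(found);                // whether any contact matched the predicate
        System.out.println(count);                // count() comes back as a Double
        System.out.println(lastname);             // text of the first contact's lastname
    }
}
```

Requesting BOOLEAN for a path expression is a convenient existence test: a non-empty result converts to true and an empty one to false.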


Summary

XML is important because:

It offers neutral data exchange

It can be built into a web application

It can be searched for content using XPath

Java (and therefore JSP) can use XML and XPath

Used widely in enterprise-scale information systems

There is no lecture next week; instead, the first revision session will be held in preparation for the module class test (short exam)