8/7/2019 2 WHAT is XML
WHAT is XML
Rakesh Kumar Rai
Lecturer, I.T. Dept., G.C.E.T., Gr. Noida
What is XML
XML is a text-based markup language that is fast becoming the standard for data interchange on the Web. As with HTML, you identify data using tags (identifiers enclosed in angle brackets, like this: <...>). Collectively, the tags are known as "markup".
What is XML cont.
But unlike HTML, XML tags identify the data, rather than specifying how to display it. Where an HTML tag says something like "display this data in bold font" (<b>...</b>), an XML tag acts like a field name in your program. It puts a label on a piece of data that identifies it (for example: <message>...</message>).
Example Detail
Throughout this tutorial, we use boldface text to highlight things we want to bring to your attention. XML does not require anything to be in bold! The tags in this example identify the message as a whole, the destination and sender addresses, the subject, and the text of the message. As in HTML, the <to> tag has a matching end tag: </to>. The data between the tag and its matching end tag defines an element of the XML data. Note, too, that the content of the <to> tag is entirely contained within the scope of the <message>..</message> tag. It is this ability for one tag to contain others that gives XML its ability to represent hierarchical data structures.
Example Detail
Once again, as with HTML, white space is essentially irrelevant, so you can format the data for readability and yet still process it easily with a program. Unlike HTML, however, in XML you could easily search a data set for messages containing "cool" in the subject, because the XML tags identify the content of the data, rather than specifying its representation.
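The search described above can be sketched in a few lines of Java. This is only an illustration under assumptions: the data set, the element names (messages, message, to, from, subject, text), and the addresses are hypothetical, and the DOM API from the JDK is used here simply because it keeps the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SubjectSearch {
    // Hypothetical data set; element names and addresses are illustrative only.
    static final String MESSAGES =
          "<messages>"
        + "  <message><to>you@example.com</to><from>me@example.com</from>"
        + "    <subject>XML is really cool</subject><text>How many ways is XML cool?</text></message>"
        + "  <message><to>you@example.com</to><from>me@example.com</from>"
        + "    <subject>Lunch</subject><text>Pizza is cool too</text></message>"
        + "</messages>";

    /** Counts the subject elements whose text contains the given word. */
    static int countSubjectsContaining(String xml, String word) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Only <subject> elements are inspected; "cool" in a <text> element is not counted.
            NodeList subjects = doc.getElementsByTagName("subject");
            int hits = 0;
            for (int i = 0; i < subjects.getLength(); i++) {
                if (subjects.item(i).getTextContent().contains(word)) {
                    hits++;
                }
            }
            return hits;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countSubjectsContaining(MESSAGES, "cool"));
    }
}
```

Because the tags label the content, the word "cool" in the body of the second message does not affect the result; only the subject elements are examined.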
The XML Prolog
An XML file normally starts with a prolog. The minimal prolog declares that the given document is XML.
version: specifies the version of XML used in the document. It is not optional within the declaration.
encoding: identifies the character set used to encode the data. ISO-8859-1 is the Western European and English language character set.
The XML Prolog
standalone: tells whether or not this document references an external entity or an external data type specification (see below). If there are no external references, then "yes" is appropriate.
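As a rough illustration of the three prolog attributes (version, encoding, standalone), the following Java sketch parses a small document and reads the values back through the DOM Level 3 methods on org.w3c.dom.Document. The sample document and the class name PrologDemo are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class PrologDemo {
    // Hypothetical document whose prolog sets all three attributes.
    static final String XML =
          "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" standalone=\"yes\"?>\n"
        + "<greeting>hello</greeting>";

    /** Parses the text as bytes so the parser honours the declared encoding. */
    static Document parse(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println("version    = " + doc.getXmlVersion());
        System.out.println("encoding   = " + doc.getXmlEncoding());
        System.out.println("standalone = " + doc.getXmlStandalone());
    }
}
```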
DTD (Document Type Definition)
A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes.
A DTD can be declared inline inside an XML document, or as an external reference.
DTD cont.
The DTD specification is actually part of the XML specification, rather than a separate entity. On the other hand, it is optional: you can write an XML document without it. A DTD specifies the kinds of tags that can be included in your XML document, and the valid arrangements of those tags. You can use the DTD to make sure you don't create an invalid XML structure. You can also use it to make sure that the XML structure you are reading (or that got sent over the net) is indeed valid.
DTD cont.
Unfortunately, it is difficult to specify a DTD for a complex document in such a way that it prevents all invalid combinations. The DTD can exist at the front of the document, as part of the prolog. It can also exist as a separate entity.
DTD (Internal DTD Declaration)
If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
DTD Example (Internal DTD declaration)
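An internal DTD declaration can be tried out with a short Java sketch like the one below. It is an illustration only: the element names (games, cricket, hockey) and the class name InternalDtdDemo are assumptions, and validation is switched on through the JDK's DocumentBuilderFactory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class InternalDtdDemo {
    // Internal DTD wrapped in a DOCTYPE definition (element names are illustrative).
    static final String DOCTYPE =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE games [\n"
        + "  <!ELEMENT games (cricket, hockey)>\n"
        + "  <!ELEMENT cricket (#PCDATA)>\n"
        + "  <!ELEMENT hockey (#PCDATA)>\n"
        + "]>\n";
    static final String VALID_BODY = "<games><cricket>a</cricket><hockey>b</hockey></games>";
    static final String INVALID_BODY = "<games><hockey>b</hockey></games>";

    /** Returns the validation problems reported for DOCTYPE + body (empty if valid). */
    static List<String> validate(String body) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // collect validation errors
                }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        } catch (Exception e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println("valid body:   " + validate(VALID_BODY));
        System.out.println("invalid body: " + validate(INVALID_BODY));
    }
}
```

The second body omits the cricket element required by the content model, so the parser reports at least one validation error for it.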
DTD Example (External DTD declaration)
There will be two files:
1. .dtd file (contains the DTD declarations)
2. .xml file (contains the reference to the .dtd file)
DTD file
<!ELEMENT games (cricket,hockey,boxing,shooting)>
XML file
<games>
<cricket>game of 11 players</cricket>
<hockey>game of 11 players</hockey>
<boxing>game of 11 players</boxing>
<shooting>game of 11 players</shooting>
</games>
SAX
You can also think of this standard as the "serial access" protocol for XML. This is the fast-to-execute mechanism you would use to read and write XML data in a server, for example. This is also called an event-driven protocol, because the technique is to register your handler with a SAX parser, after which the parser invokes your callback methods whenever it sees a new XML tag (or encounters an error, or wants to tell you anything else).
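The event-driven idea can be sketched as follows: a handler is registered with a SAX parser, and the parser calls back into it for every start and end tag, in document order. The class name SaxEventsDemo and the sample document are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventsDemo {
    static final String SAMPLE = "<games><cricket>game of 11 players</cricket></games>";

    /** Parses the XML and returns the callbacks in the order the parser made them. */
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler); // callbacks fire here
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events(SAMPLE));
    }
}
```

Nothing is kept in memory beyond what the handler chooses to record, which is what makes this style of access serial.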
The Simple API for XML (SAX) APIs
The basic outline of the SAX parsing APIs is shown at right. To start the process, an instance of the SAXParserFactory class is used to generate an instance of the parser.
The parser wraps an XMLReader object (referred to in these slides as the SAXReader). When the parser's parse() method is invoked, the reader invokes one of several callback methods implemented in the application.
Those methods are defined by the interfaces ContentHandler, ErrorHandler, DTDHandler, and EntityResolver.
What is parsing
Parsing is the mechanism used to read the XML document. It is also responsible for checking well-formedness and correctness.
JAXP: Java API for XML Parsing
It provides a common interface for creating and using the standard SAX and DOM APIs in Java, regardless of which vendor's implementation is actually being used.
SAX overview
The Simple API for XML (SAX) APIs cont.
1. SAXParserFactory
A SAXParserFactory object creates an instance of the parser determined by the system property javax.xml.parsers.SAXParserFactory.
2. SAXParser
The SAXParser class defines several kinds of parse() methods. In general, you pass an XML data source and a DefaultHandler object to the parser, which processes the XML and invokes the appropriate methods in the handler object.
The Simple API for XML (SAX) APIs cont.
3. SAXReader (the XMLReader interface)
The SAXParser wraps a SAXReader, which in the API is an org.xml.sax.XMLReader. Typically, you don't care about that, but every once in a while you need to get hold of it using SAXParser's getXMLReader(), so you can configure it. It is the SAXReader which carries on the conversation with the SAX event handlers you define.
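As one possible illustration of why you might reach for the underlying reader, the sketch below obtains it with getXMLReader(), switches on the standard SAX2 namespaces feature, and installs a content handler directly on the reader. The class name XmlReaderConfigDemo, the sample document, and the urn:example:games namespace are assumptions for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class XmlReaderConfigDemo {
    // Hypothetical namespaced document.
    static final String SAMPLE =
        "<g:games xmlns:g=\"urn:example:games\"><g:cricket/></g:games>";

    /** Returns "{namespace}localName" for every element, as reported by the reader. */
    static List<String> qualifiedNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Standard SAX2 feature: report namespace URIs and local names.
            reader.setFeature("http://xml.org/sax/features/namespaces", true);
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    names.add("{" + uri + "}" + localName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(qualifiedNames(SAMPLE));
    }
}
```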
The Simple API for XML (SAX) APIscont.
Default HandlerDefaultHandler implements the Content Handler,ErrorHandler,DTDHandler, and EntityResolverinterfaces (with null methods), so you canoverride only the ones you're interested in.Content Handler Methods like startDocument, endDocument, startElement, andendElement are invoked when an XML tag isrecognized. This interface also defines methods
characters and processing Instruction, which areinvoked when the parser encounters the text inan XML element or an inline processinginstruction, respectively.
The Simple API for XML (SAX) APIs cont.
ErrorHandler
Methods error, fatalError, and warning are invoked in response to various parsing errors. The default error handler throws an exception for fatal errors and ignores other errors (including validation errors). That's one reason you need to know something about the SAX parser, even if you are using the DOM. Sometimes, the application may be able to recover from a validation error. Other times, it may need to generate an exception. To ensure the correct handling, you'll need to supply your own error handler to the parser.
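The difference between the default behaviour and a custom error handler can be sketched as below. The document, the element names, and the class names (StrictErrorsDemo, StrictHandler) are assumptions for illustration; the handler simply rethrows validation errors instead of ignoring them.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class StrictErrorsDemo {
    // Well-formed document that violates its own internal DTD (names are illustrative).
    static final String INVALID =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE note [ <!ELEMENT note (to)> <!ELEMENT to (#PCDATA)> ]>\n"
        + "<note><from>me</from></note>";

    /** Handler that escalates validation errors instead of ignoring them. */
    static class StrictHandler extends DefaultHandler {
        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    /** Returns true if the validating parse finished without a SAXException. */
    static boolean parses(String xml, DefaultHandler handler) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (SAXException e) {
            return false; // the handler (or a fatal error) stopped the parse
        } catch (Exception e) {
            throw new IllegalStateException("parser could not be set up or input read", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default handler: " + parses(INVALID, new DefaultHandler()));
        System.out.println("strict handler:  " + parses(INVALID, new StrictHandler()));
    }
}
```

With the plain DefaultHandler the validation error goes unnoticed and the parse completes; with the strict handler the same document is rejected.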
The Simple API for XML (SAX) APIs cont.
DTDHandler
Defines methods you will generally never be called upon to use. Used when processing a DTD to recognize and act on declarations for an unparsed entity.
EntityResolver
The resolveEntity method is invoked when the parser must identify data identified by a URI. In most cases, a URI is simply a URL, which specifies the location of a document, but in some cases the document may be identified by a URN: a public identifier, or name, that is unique in the web space. The public identifier may be specified in addition to the URL. The EntityResolver can then use the public identifier instead of the URL to find the document, for example to access a local copy of the document if one exists.
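The "local copy" idea can be sketched with an EntityResolver that recognises a public identifier and serves the DTD from memory instead of fetching the URL. Everything named here is hypothetical: the public identifier, the example.invalid URL, the DTD text, and the class name LocalDtdResolverDemo.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LocalDtdResolverDemo {
    // Hypothetical public identifier and local copy of the DTD.
    static final String PUBLIC_ID = "-//Example//DTD Games 1.0//EN";
    static final String LOCAL_DTD =
        "<!ELEMENT games (cricket)>\n<!ELEMENT cricket (#PCDATA)>\n";

    // The system identifier points at a host that does not exist on purpose.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE games PUBLIC \"" + PUBLIC_ID + "\" \"http://example.invalid/games.dtd\">\n"
        + "<games><cricket>game of 11 players</cricket></games>";

    /** Parses the document, resolving the DTD locally, and returns the root element name. */
    static String rootName(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver((publicId, systemId) ->
                PUBLIC_ID.equals(publicId)
                    ? new InputSource(new StringReader(LOCAL_DTD)) // serve the local copy
                    : null);                                        // otherwise use default lookup
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName(XML));
    }
}
```

Returning null from the resolver tells the parser to fall back to its normal handling of the system identifier.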
Packages for SAX
1. org.xml.sax
Defines the SAX interfaces. The name "org.xml" is the package prefix that was settled on by the group that defined the SAX API.
2. org.xml.sax.ext
Defines SAX extensions that are used when doing more sophisticated SAX processing, for example, to process a document type definition (DTD) or to see the detailed syntax for a file.
Packages for SAX
3. org.xml.sax.helpers
Contains helper classes that make it easier to use SAX, for example, by defining a default handler that has null methods for all of the interfaces, so you only need to override the ones you actually want to implement.
Packages for SAX
4. javax.xml.parsers
Defines the SAXParserFactory class, which returns the SAXParser. Also defines exception classes for reporting errors.
Example of SAX (XMLTest.java)
import java.io.File;
import java.util.Hashtable;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class XMLTest {
    public static void main(String[] args) {
        try {
            // Turn the file name given on the command line into a file: URI
            String xmlResource = new File(args[0]).toURI().toString();
            // Get an instance of SAXParserFactory
            SAXParserFactory spf = SAXParserFactory.newInstance();
            // Get a SAXParser instance from the factory
            SAXParser sp = spf.newSAXParser();
            // Create an instance of our handler
            SAXHandler handler = new SAXHandler();
            // Parse the resource; the parser calls our SAXHandler whenever a SAX event occurs
            sp.parse(xmlResource, handler);
            // After the resource is parsed, get the resulting table
            Hashtable<String, String> cfgTable = handler.getTable();
            System.out.println("ID==" + cfgTable.get("ID"));
            System.out.println("DES==" + cfgTable.get("DESCRIPTION"));
            System.out.println("PRICE==" + cfgTable.get("PRICE"));
            System.out.println("QUANTITY==" + cfgTable.get("QUANTITY"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Example of SAX (SAXHandler.java)
import java.util.Hashtable;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXHandler extends DefaultHandler {
    private Hashtable<String, String> table = new Hashtable<>();
    private String currentElement;
    private String currentValue;

    public void setTable(Hashtable<String, String> table) {
        this.table = table;
    }

    public Hashtable<String, String> getTable() {
        return table;
    }

    // Remember which element we are in and start with an empty value
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        currentElement = qName;
        currentValue = "";
    }

    // Text may arrive in several chunks, so append rather than overwrite
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        currentValue += new String(ch, start, length);
    }

    // Store the collected text under the element name when the element closes
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equals(currentElement)) {
            table.put(currentElement, currentValue);
        }
    }
}
DOM: Document Object Model
The Document Object Model protocol converts an XML document into a collection of objects in your program. You can then manipulate the object model in any way that makes sense. This mechanism is also known as the "random access" protocol, because you can visit any part of the data at any time. You can then modify the data, remove it, or insert new data. For more information, see the W3C DOM specification.
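The three operations named above (modify, remove, insert) might look like this with the JDK's DOM API. The sample document, the priority element, and the class name DomEditDemo are assumptions made for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomEditDemo {
    // Hypothetical message document.
    static final String SAMPLE =
        "<message><from>me@example.com</from><subject>hi</subject></message>";

    /** Parses the XML, then modifies, removes, and inserts nodes in the tree. */
    static Document edit(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Modify: change the text of the first <subject>.
            Node subject = doc.getElementsByTagName("subject").item(0);
            subject.setTextContent("XML is really cool");

            // Remove: drop the first <from> element.
            Node from = doc.getElementsByTagName("from").item(0);
            from.getParentNode().removeChild(from);

            // Insert: append a new <priority> element to the root.
            Element priority = doc.createElement("priority");
            priority.setTextContent("high");
            doc.getDocumentElement().appendChild(priority);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = edit(SAMPLE);
        System.out.println(doc.getDocumentElement().getLastChild().getNodeName());
    }
}
```

Because the whole document is held in memory, the nodes can be visited and changed in any order.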
DOM cont.
The XML DOM (Document Object Model) defines a standard way for accessing and manipulating XML documents. The DOM presents an XML document as a tree structure, with elements, attributes, and text as nodes.
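To make the tree structure concrete, the sketch below walks a small document and prints one line per element, attribute, and text node, indenting by depth. The sample document and the class name DomTreeDemo are assumptions for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomTreeDemo {
    // Hypothetical document with one attribute and one text node.
    static final String SAMPLE = "<note id=\"1\"><to>Tove</to></note>";

    /** Returns an indented outline of the element, attribute, and text nodes. */
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), "", out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
    }

    private static void walk(Node node, String indent, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.append(indent).append("element ").append(node.getNodeName()).append('\n');
            // Attributes hang off the element but are not among its child nodes.
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node a = attrs.item(i);
                out.append(indent).append("  attribute ")
                   .append(a.getNodeName()).append('=').append(a.getNodeValue()).append('\n');
            }
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(indent).append("text ").append(node.getNodeValue()).append('\n');
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, indent + "  ", out);
        }
    }

    public static void main(String[] args) {
        System.out.print(outline(SAMPLE));
    }
}
```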
DOM
You use the javax.xml.parsers.DocumentBuilderFactory class to get a DocumentBuilder instance, and use that to produce a Document (a DOM) that conforms to the DOM specification. The builder you get, in fact, is determined by the system property javax.xml.parsers.DocumentBuilderFactory, which selects the factory implementation that is used to produce the builder.
DOM Packages
org.w3c.dom
Defines the DOM programming interfaces for XML (and, optionally, HTML) documents, as specified by the W3C.
DOM Packages
javax.xml.parsers
Defines the DocumentBuilderFactory class and the DocumentBuilder class, which returns an object that implements the W3C Document interface. The factory that is used to create the builder is determined by the javax.xml.parsers.DocumentBuilderFactory system property, which can be set from the command line or overridden when invoking the newInstance method. This package also defines the ParserConfigurationException class for reporting errors.
DOM Example
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class TransformationApp03 {
    static Document document;

    public static void main(String[] argv) {
        if (argv.length != 1) {
            System.err.println("Usage: java TransformationApp03 filename");
            System.exit(1);
        }
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // factory.setValidating(true);
        try {
            File f = new File(argv[0]);
            DocumentBuilder builder = factory.newDocumentBuilder();
            document = builder.parse(f);
            // Get the first <slide> element in the DOM
            NodeList list = document.getElementsByTagName("slide");
            Node node = list.item(0);
            // Use a Transformer to write that subtree to standard output
            TransformerFactory tFactory = TransformerFactory.newInstance();
            Transformer transformer = tFactory.newTransformer();
            DOMSource source = new DOMSource(node);
            StreamResult result = new StreamResult(System.out);
            transformer.transform(source, result);
        } catch (TransformerConfigurationException tce) {
            // Error generated by the transformer factory
            System.out.println("\n** Transformer Factory error");
            System.out.println("   " + tce.getMessage());
            // Use the contained exception, if any
            Throwable x = tce;
            if (tce.getException() != null) x = tce.getException();
            x.printStackTrace();
        } catch (TransformerException te) {
            // Error generated by the parser
            System.out.println("\n** Transformation error");
            System.out.println("   " + te.getMessage());
            // Use the contained exception, if any
            Throwable x = te;
            if (te.getException() != null) x = te.getException();
            x.printStackTrace();
        } catch (SAXException sxe) {
            // Use the contained exception, if any
            Exception x = sxe;
            if (sxe.getException() != null) x = sxe.getException();
            x.printStackTrace();
        } catch (ParserConfigurationException pce) {
            // Parser with specified options can't be built
            pce.printStackTrace();
        } catch (IOException ioe) {
            // I/O error
            ioe.printStackTrace();
        }
    }
}
DTD: Document Type Definition
A DTD specifies the kinds of tags that
can be included in your XML document,
and the valid arrangements of those tags.
You can use the DTD to make sure you
don't create an invalid XML structure. You
can also use it to make sure that the XML
structure you are reading (or that got sent
over the net) is indeed valid.
DTD cont.
A DTD makes it possible to validate the structure of relatively simple XML documents, but a DTD can't restrict the content of elements, and it can't specify complex relationships. For example, it is impossible to specify with a DTD that a <heading> for a <book> must have both a <title> and an <author>, while a <heading> for a <chapter> only needs a <title>. In a DTD, you only get to specify the structure of the <heading> element one time. There is no context-sensitivity.
DTD cont.
This issue stems from the fact that a DTD specification is not hierarchical. For a mailing address that contained several "parsed character data" (PCDATA) elements, for example, the DTD might look something like this: