eXtensible Markup Language APIs in Java 1.6 - Simple and efficient XML parsing using Java language
Introduction
XML APIs in Java
Capabilities and performance comparison
CASE STUDY: Parsing Really Simple Syndication (RSS) doc
What next? Alternatives to APIs, Java SE 7.0 features
Summary
Further reading...
eXtensible Markup Language APIs in Java 1.6
Simple and efficient XML parsing using Java language
Wojciech Podgorski
http://podgorski.wordpress.com
April 8, 2008
Presentation outline
1 Introduction
    What is parsing
    Different ways of parsing documents
2 XML APIs in Java
    SAX
    DOM
    StAX
3 Capabilities and performance comparison
4 CASE STUDY: Parsing Really Simple Syndication (RSS) doc
5 What next? Alternatives to APIs, Java SE 7.0 features
6 Summary
7 Further reading...
Parsing definition
Parsing, more formally called syntactic analysis, is the process of analyzing a sequence of tokens to determine grammatical structure with respect to a given formal grammar.
Source: http://en.wikipedia.org/wiki/Parsing
We can distinguish three main models of parsing XML documents. They differ in the mechanism used to traverse between nodes and in the way XML data is processed. These models are:
SAX - Simple API for XML
DOM - Document Object Model
StAX - Streaming API for XML
That’s not all! There are other approaches, which won’t be described in this presentation.
JAXB - Java Architecture for XML Binding
Technology providing the ability to marshal Java objects into XML and the reverse, i.e. to unmarshal XML elements back into Java objects. It works on top of another parser (mostly streaming parsers).
Javolution
Library providing a real-time StAX-like implementation which does not force object creation and has a smaller effect on memory footprint and garbage collection, using e.g. lookup tables for retrieving and reusing data.
VTD-XML - Virtual Token Descriptor for XML
Collection of efficient processing technologies, centered around a non-extractive, 'document-centric' parsing technique called VTD. Supports random access and XPath.
SAX as a processing model
SAX should first be considered a specific processing mechanism rather than a simple API. SAX represents an event-driven architecture: the parser performs an operation each time a particular event occurs.
To handle these occurrences, the user defines a number of callback methods, which are called when the parser encounters the corresponding part of the document.
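The callback idea can be sketched as follows. This is a minimal illustration, not code from the presentation: the class name SaxCallbacksSketch and the sample XML are assumptions, and the handler only records the events it receives, using the JAXP parser bundled with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: records the events reported by the parser.
final class SaxCallbacksSketch {

    /** Parses the XML text and returns the sequence of events seen by the callbacks. */
    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("start:" + qName);   // called for every opening tag
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text); // called for character data
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);     // called for every closing tag
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<food><name>Avocado Dip</name></food>")) {
            System.out.println(event);
        }
    }
}
```

The application never asks for the next element; it only reacts when the parser calls one of the methods it has overridden.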
Figure: Top-down parsing in SAX API
In Java, the SAX API is a collection of classes and interfaces, some of which are to be implemented when constructing an XML parser. The package containing this collection is:
org.xml.sax.*
Figure: org.xml.sax.* package class diagram
Basic class structure
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Declare document URI
String xmlURI = "http://example.com/report.xml";

// Create reader instance
XMLReader reader = XMLReaderFactory.createXMLReader();

// Set implementation class of ContentHandler
reader.setContentHandler(new MyContentHandler());

// Resolve document source
InputSource inputSource = new InputSource(xmlURI);

// Parse document
reader.parse(inputSource);
Different SAX implementations
// The three snippets are alternatives; use one of them.

// Xerces implementation
XMLReader reader =
    new org.apache.xerces.parsers.SAXParser();

// JAXP implementation (the factory must be instantiated first,
// and the XMLReader is obtained from the SAXParser)
SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
XMLReader reader = parser.getXMLReader();

// Piccolo implementation
XMLReader reader = new com.bluecast.xml.Piccolo();
Other SAX features
SAX provides a number of interfaces for correct data handling. Some of them process not only the content of the document, but also its structure.
Interfaces such as:
ErrorHandler
EntityResolver
DTDHandler
also analyze the structure of the document: ErrorHandler receives warnings and errors, EntityResolver resolves references to external entities, and DTDHandler receives notation and unparsed entity declarations.
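As an illustration of one of these interfaces, the sketch below registers an ErrorHandler and reports where a broken document was rejected. The class name SaxErrorHandlerSketch, the returned messages and the sample documents are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: uses an ErrorHandler to detect broken documents.
final class SaxErrorHandlerSketch {

    /** Returns "well-formed" or a short message with the line of the first problem. */
    static String check(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler()); // content is ignored here
        reader.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // warnings do not stop parsing in this sketch
            }

            public void error(SAXParseException e) throws SAXException {
                throw e; // recoverable errors (e.g. validity) are treated as failures
            }

            public void fatalError(SAXParseException e) throws SAXException {
                throw e; // the document is not well-formed
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return "well-formed";
        } catch (SAXParseException e) {
            return "rejected at line " + e.getLineNumber();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(check("<a><b/></a>"));
        System.out.println(check("<a><b></a>")); // <b> is never closed
    }
}
```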
Advanced SAX features I
The SAX API is considered a very flexible solution, mainly because it can be configured through properties and features.
void setProperty(String propertyID, Object value);
void setFeature(String featureID, boolean state);
Properties and features modify parser behaviour while processing a document. For example, we can turn on validation of the document against the DTD or schema related to it (checking that the document is well-formed is always performed).
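A small sketch of setting a feature, assuming the JAXP parser shipped with the JDK. It uses the standard SAX feature identifier http://xml.org/sax/features/namespaces; the class name SaxFeatureSketch and the sample document are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: turns on namespace processing through a SAX feature.
final class SaxFeatureSketch {

    // Standard SAX2 feature identifier for namespace processing
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    /** Returns the root element name in the form {namespaceURI}localName. */
    static String rootName(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature(NAMESPACES, true); // must be set before parsing starts
        final String[] result = new String[1];
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (result[0] == null) {
                    result[0] = "{" + uri + "}" + localName;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootName("<m:menu xmlns:m=\"urn:example:menu\"/>"));
    }
}
```

An unknown feature identifier makes setFeature throw SAXNotRecognizedException, so the call also doubles as a capability check.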
Advanced SAX features II
Among many other interesting SAX features, one is very important and radically extends SAX capabilities. The XMLFilter interface allows us to create a cascade of filters on top of a parser, each responsible for a different processing operation, so that several operations are applied in a single pass over the document.
Figure: Cascade processing using XMLFilter interface
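One possible shape of such a cascade, sketched with the XMLFilterImpl helper class from org.xml.sax.helpers. The two filters (RenameFilter, UpperCaseFilter), the class name SaxFilterCascadeSketch and the sample document are assumptions chosen only to show how events flow from the parser through each stage to the final handler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example class: parser -> RenameFilter -> UpperCaseFilter -> handler.
final class SaxFilterCascadeSketch {

    /** First stage: renames one element, passes everything else through. */
    static final class RenameFilter extends XMLFilterImpl {
        private final String from;
        private final String to;

        RenameFilter(XMLReader parent, String from, String to) {
            super(parent);
            this.from = from;
            this.to = to;
        }

        private String map(String qName) {
            return qName.equals(from) ? to : qName;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            super.startElement(uri, localName, map(qName), atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            super.endElement(uri, localName, map(qName));
        }
    }

    /** Second stage: upper-cases element names. */
    static final class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            super.startElement(uri, localName, qName.toUpperCase(), atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            super.endElement(uri, localName, qName.toUpperCase());
        }
    }

    /** Returns the element names as seen by the handler at the end of the cascade. */
    static List<String> elementNames(String xml) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        XMLReader cascade = new UpperCaseFilter(new RenameFilter(parser, "food", "item"));
        final List<String> names = new ArrayList<String>();
        cascade.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        });
        cascade.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<menu><food/><food/></menu>"));
    }
}
```

Because every filter is itself an XMLReader, the application code talks to the outermost filter exactly as it would talk to a plain parser.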
What SAX cannot do... I
Q: Why do we need other mechanisms, if SAX is so good?
A: SAX has some serious limitations due to its sequential data access.
What SAX cannot do... II
SAX parses data from beginning to end and does not allow going back. It also has some other limitations:
it is unable to modify content or structure of document
it cannot access specific or random elements
it cannot access sibling elements
it is not serializable
So it may seem that SAX is useless. THAT’S NOT TRUE! (see the comparison section). Every issue mentioned above can be resolved by a complement to SAX...
DOM as a processing model
The Document Object Model is based on a completely different idea. It does not parse the document and react to specific events (though it is able to); instead, it builds a tree based on the document's structure and stores it in memory as an object.
Because of this, every node in the tree is always available and can be accessed later, many times. Moreover, the structure stored in memory can easily be transformed in many ways.
DOM architecture I
DOM, in contrast to SAX, is a standard developed by W3C (1). Due to standardization it has a strict architecture divided into levels, each containing required and optional modules.
To claim support for a level, an application must implement all the requirements of the claimed level and the levels below it. There are 3 levels; the newest (DOM 3) was published in 2004 and is the current release of the DOM specification.
Every level has its Core module, which is the base for the other modules (figure).
(1) A reference to the standard can be found on the W3C site.
Figure: Document Object Model architecture (Adapted from original W3C specification)
In Java, DOM has a different structure than SAX. Almost every interface representing the Document Object Model extends the org.w3c.dom.Node interface.
Such a framework allows very simple data manipulation and traversal between the nodes contained in the tree structure. It is essential to understand how elements are stored in the tree (figure).
For example, if we want to read text data from element A, we should get its child node containing the text, not extract the value of element A itself.
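The point about text nodes can be checked with a few lines of code. The sketch below is illustrative only: the class name DomTextNodeSketch and the one-element document are assumptions, and the parser is the JAXP DocumentBuilder shipped with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class: element text lives in a child text node.
final class DomTextNodeSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Reads the text of an element through its first child, if that child is a text node. */
    static String textOf(Element element) {
        Node child = element.getFirstChild();
        if (child != null && child.getNodeType() == Node.TEXT_NODE) {
            return child.getNodeValue();
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        Element a = parse("<A>some text</A>").getDocumentElement();
        // The element node itself carries no value...
        System.out.println(a.getNodeValue());
        // ...the text is stored in its child text node.
        System.out.println(textOf(a));
    }
}
```

The DOM Level 3 convenience method getTextContent() hides this detail by concatenating the text of all descendants.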
Figure: org.w3c.dom.* package class diagram From [1]
Basic class structure using Java implementation
String docURI = "http://example.org/nutrition.xml";
// get new DocumentBuilderFactory
DocumentBuilderFactory docBuilderFactory =
    DocumentBuilderFactory.newInstance();
// get new DocumentBuilder
DocumentBuilder docBuilder =
    docBuilderFactory.newDocumentBuilder();
// initialize document with null
Document doc = null;
// parse document
doc = docBuilder.parse(docURI);
// extract root element and
// normalize whole tree (optional)
doc.getDocumentElement().normalize();
Accessing elements
NodeList elements = null;
// get "food" elements
elements = doc.getElementsByTagName("food");
for (int i = 0; i < elements.getLength(); i++)
{
    // find "Avocado Dip": getNodeName() would only return "food",
    // so look at the text inside the element instead
    String foodText = elements.item(i).getTextContent();
    if (foodText.contains("Avocado Dip"))
    {
        NodeList l = elements.item(i).getChildNodes();
        for (int j = 0; j < l.getLength(); j++)
            // print out calories
            if (l.item(j).getNodeName().equals("calories"))
                System.out.println(l.item(j).getTextContent());
    }
}
Modifying elements
...
if (l.item(j).getNodeName().equals("calories"))
{
    // parse the text content (a String cannot be cast to Integer)
    int cal = Integer.parseInt(l.item(j).getTextContent().trim());
    // if food avocado dip has more than 300 cal.
    if (cal > 300)
    {
        Node avocadoDip = l.item(j).getParentNode();
        // replace it with low fat food
        Element newFood = doc.createElement("LowFatFood");
        // replaceChild must be called on the parent of the replaced node
        avocadoDip.getParentNode().replaceChild(newFood, avocadoDip);
    }
}
Different DOM implementations
// Xerces DOM implementation
DOMParser p = new org.apache.xerces.parsers.DOMParser();
p.parse(new InputSource(xmlURI));
Document doc = p.getDocument();

// JDOM implementation (SAXBuilder builds the JDOM tree from a URI;
// DOMBuilder converts an existing org.w3c.dom.Document)
SAXBuilder builder = new org.jdom.input.SAXBuilder();
Document d = builder.build(xmlURI);
// it's org.jdom.Document not org.w3c.dom.Document!

// dom4j implementation
SAXReader reader = new org.dom4j.io.SAXReader();
Document document = reader.read(xmlURI);
// it's org.dom4j.Document not org.w3c.dom.Document!
Advanced DOM features I
DOM provides many advanced functionalities through modules specified in the standard (mainly level 3 modules). Some of them:
MutationEvents module provides methods for listening for changes
LS, LS-Async modules provide methods for various kinds of serialization
Validation module provides methods for real-time validation
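As an illustration of the LS (Load and Save) module, the sketch below serializes a small in-memory document to a string. It assumes the DOM implementation bundled with the JDK supports LS 3.0; the class name DomLsSketch and the sample element are made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical example class: serializes a DOM tree with the Load and Save module.
final class DomLsSketch {

    /** Builds a tiny document in memory. */
    static Document sample() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element food = doc.createElement("food");
        food.setAttribute("name", "Avocado Dip");
        doc.appendChild(food);
        return doc;
    }

    /** Serializes the document to a string without the XML declaration. */
    static String serialize(Document doc) {
        // Ask the implementation for its LS feature and use it as a serializer factory
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sample()));
    }
}
```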
Advanced DOM features II
When using a particular API, it is important to check which modules, and in which versions, are implemented. To do this, we can use the following method of org.w3c.dom.DOMImplementation:
boolean hasFeature(String feature, String version);
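A minimal sketch of such a check against the DOM implementation returned by the JDK's DocumentBuilder. The class name DomFeatureCheckSketch is an assumption, and the answers for modules other than Core depend on the implementation in use.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;

// Hypothetical example class: asks the DOM implementation which modules it supports.
final class DomFeatureCheckSketch {

    static boolean supports(String feature, String version) throws Exception {
        DOMImplementation impl = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation();
        return impl.hasFeature(feature, version);
    }

    public static void main(String[] args) throws Exception {
        // Results other than Core depend on the parser found on the classpath
        System.out.println("Core 2.0: " + supports("Core", "2.0"));
        System.out.println("LS 3.0: " + supports("LS", "3.0"));
        System.out.println("MutationEvents 2.0: " + supports("MutationEvents", "2.0"));
    }
}
```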
Streaming API for XML - different approach
The third approach to processing XML data is based on the idea of treating incoming information about events as a stream.
Streaming API for XML uses a technique called pull parsing, which provides sequential access to the document by adapting the iterator design pattern. The association with java.util.Iterator is not accidental, because part of the API (XMLEventReader) extends this interface.
StAX architecture
StAX in Java is divided into two (theoretically) separate APIs:
cursor API, represented by the XMLStreamReader and XMLStreamWriter interfaces. Regarded as the fastest and most efficient solution.
event API, represented by the XMLEventReader and XMLEventWriter interfaces. Regarded as a simple and flexible solution.
Both are specified in JSR 173 and contained in javax.xml.stream.*
Difference from SAX event-driven architecture
The common view that the StAX API is similar to SAX is wrong.
The SAX architecture provides a number of interfaces to handle incoming events, which the parser pushes to the application. The StAX event API provides methods for iterating through the event stream, so the application pulls each event and handles the specific occurrences itself.
Moreover, StAX is a symmetric read/write API, which also allows elements to be modified and stored.
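The write side of StAX can be sketched with the cursor API as follows. The class name StaxWriterSketch and the food/name/calories element names are assumptions used only for illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class: produces XML with the cursor-style writer.
final class StaxWriterSketch {

    /** Writes a small <food> element and returns it as a string. */
    static String writeFood(String name, int calories) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("food");
        writer.writeStartElement("name");
        writer.writeCharacters(name);                     // special characters are escaped
        writer.writeEndElement();                         // closes <name>
        writer.writeStartElement("calories");
        writer.writeCharacters(String.valueOf(calories));
        writer.writeEndElement();                         // closes <calories>
        writer.writeEndElement();                         // closes <food>
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeFood("Avocado Dip", 110));
    }
}
```

Reading with XMLStreamReader and writing with XMLStreamWriter mirror each other, which is what makes it practical to copy a stream while changing selected elements on the way.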
Basic class structure
/* Creating readers... */

// creating input factory ("if" is a reserved word, so the factory needs another name)
String xmlURI = "http://example.org/nutrition.xml";
XMLInputFactory factory = XMLInputFactory.newInstance();

// cursor API reader; the document is opened from the URI
// (a StringReader over xmlURI would parse the URI text itself)
XMLStreamReader cur =
    factory.createXMLStreamReader(new URL(xmlURI).openStream());
// event API reader; each reader needs its own input stream
XMLEventReader event =
    factory.createXMLEventReader(new URL(xmlURI).openStream());
Identifying events I
The main issue when using StAX is how to identify the event which has just occurred. There are many ways to do that; the simplest is to check the integer constant connected with an event (cursor API). The constants are declared in the XMLStreamConstants interface (2). For example:
1 - START_ELEMENT
2 - END_ELEMENT
3 - PROCESSING_INSTRUCTION
And so on...
(2) https://java.sun.com/webservices/docs/1.5/api/javax/xml/stream/XMLStreamConstants.html
Accessing elements by iterator I (cursor API)
int startElem = XMLStreamConstants.START_ELEMENT;
// while there is next event
while (cur.hasNext())
{
    // catch event type
    int eventType = cur.next();
    System.out.println(eventType);
    // if event type is START_ELEMENT print the element's name;
    // its text content arrives as separate CHARACTERS events
    // (getElementText() would throw for elements with child elements)
    if (eventType == startElem)
        System.out.println(cur.getLocalName());
    else if (eventType == XMLStreamConstants.CHARACTERS)
        System.out.println(cur.getText());
}
Identifying events II
In the event API, identifying events is a bit different. XMLEventReader provides the methods:
XMLEvent nextEvent();
boolean hasNext();
So, to identify the caught event, we must analyse the XMLEvent object returned by the first method. Once again there are a few ways to do that. The method returning the event type can be called:
int getEventType();
Or we can test whether the event is of a certain type with one of the "is" methods (isStartElement(), isCharacters(), ...).
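The "is" methods can be used as in the sketch below, which labels each event of a small document. The class name StaxEventTypeSketch, the labels and the sample XML are assumptions; the coalescing property is set only so that the text arrives as a single Characters event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.XMLEvent;

// Hypothetical example class: identifies events with the "is"/"as" methods of XMLEvent.
final class StaxEventTypeSketch {

    /** Returns a label for every element and text event in the document. */
    static List<String> describe(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
        List<String> labels = new ArrayList<String>();
        while (reader.hasNext()) {
            XMLEvent e = reader.nextEvent();
            if (e.isStartElement()) {
                labels.add("START " + e.asStartElement().getName().getLocalPart());
            } else if (e.isCharacters()) {
                labels.add("TEXT " + e.asCharacters().getData());
            } else if (e.isEndElement()) {
                labels.add("END " + e.asEndElement().getName().getLocalPart());
            }
            // StartDocument and EndDocument events are skipped here
        }
        reader.close();
        return labels;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe("<name>Avocado Dip</name>"));
    }
}
```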
Accessing elements by iterator II (event API)
// while there is next event
while (event.hasNext())
{
    XMLEvent e = event.nextEvent();
    // identify event by casting!
    if (e instanceof StartElement)
    {
        // cast event to specific element
        StartElement se = (StartElement) e;
        QName name = se.getName();
        // print element name
        System.out.println(name.getLocalPart());
    }
}
Advanced iteration methods
Both StAX APIs provide more advanced iteration methods.
// in both readers (the XMLStreamReader version returns int)
XMLEvent nextTag();
// only in XMLEventReader
XMLEvent peek();
// only in XMLStreamReader
void require(int type, String nsURI, String localN);
The first method advances past insignificant events (whitespace, comments, processing instructions) until it reaches the start or end of an element. The second lets us inspect the next event without moving forward. The third checks that the current event has the expected type, namespace and local name, and throws an XMLStreamException if it does not. All of these methods are well documented and should be reviewed by the reader.
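The three methods can be tried out in a short sketch. The class AdvancedIteration, its two helper methods and the sample documents are assumptions made for illustration; the StAX calls themselves are the standard ones listed above.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, used only for illustration.
class AdvancedIteration {

    // Cursor API: nextTag() and require().
    // Returns the name of the tag that follows the root start tag.
    static String firstChildOf(String xml, String expectedRoot) {
        try {
            XMLStreamReader cur = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            cur.nextTag(); // from START_DOCUMENT to the root start tag
            // a null namespace URI means "do not check the namespace"
            cur.require(XMLStreamConstants.START_ELEMENT, null, expectedRoot);
            cur.nextTag(); // skips whitespace, stops at the next start or end tag
            String name = cur.getLocalName();
            cur.close();
            return name;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("unexpected document", e);
        }
    }

    // Event API: peek() looks at the next event without consuming it.
    // Returns the text directly inside the first matching element,
    // "" if it has no leading text, or null if it is not found.
    static String textOf(String xml, String element) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            String result = null;
            while (reader.hasNext() && result == null) {
                XMLEvent e = reader.nextEvent();
                if (e.isStartElement()
                        && e.asStartElement().getName().getLocalPart().equals(element)) {
                    XMLEvent ahead = reader.peek();
                    result = (ahead != null && ahead.isCharacters())
                            ? reader.nextEvent().asCharacters().getData()
                            : "";
                }
            }
            reader.close();
            return result;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<nutrition>\n  <food><name>Chocolate ice cream</name></food>\n</nutrition>";
        System.out.println(firstChildOf(xml, "nutrition"));
        System.out.println(textOf(xml, "name"));
    }
}
```

Because require() throws when the root element has a different name, it works as a cheap sanity check before the real processing starts.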
EventFilters and StreamFilters I
The StAX API allows us to create filtered readers. It is not necessary to write complex stream handlers to process specific events. All that needs to be done is to implement one (or both) of two interfaces, each containing a single method.
Interfaces:
EventFilter (for XMLEventReader)
StreamFilter (for XMLStreamReader)
Methods:
public boolean accept(XMLEvent event)          // EventFilter
public boolean accept(XMLStreamReader reader)  // StreamFilter
EventFilters and StreamFilters II
Implementing a filter is simple:
public class CharFilter implements EventFilter
{
    public boolean accept(XMLEvent event)
    {
        return (event.getEventType() ==
                XMLStreamConstants.CHARACTERS);
    }
}
The filter above accepts only character events; all other events are skipped.
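A filter takes effect once it is attached to a reader through XMLInputFactory.createFilteredReader(). The following is a minimal sketch of that step; the class FilterDemo, the method textOnly and the sample document are hypothetical, and the filter is written as an anonymous class equivalent to the character filter shown above.

```java
import java.io.StringReader;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, used only for illustration.
class FilterDemo {

    // Concatenates all character data of the document, ignoring markup events.
    static String textOnly(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLEventReader raw = factory.createXMLEventReader(new StringReader(xml));

            // Accept character events only.
            EventFilter charsOnly = new EventFilter() {
                public boolean accept(XMLEvent event) {
                    return event.isCharacters();
                }
            };

            // The filtered reader hands out only the accepted events.
            XMLEventReader filtered = factory.createFilteredReader(raw, charsOnly);
            StringBuilder text = new StringBuilder();
            while (filtered.hasNext()) {
                text.append(filtered.nextEvent().asCharacters().getData());
            }
            filtered.close();
            return text.toString();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOnly("<food><name>Chocolate</name><price>4</price></food>"));
    }
}
```

The loop body no longer needs any type checks, because every event that reaches it has already passed the filter.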
Writing elements I
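StAX, as a symmetric API that handles both input and output, is also able to write XML data. It provides two interfaces for that:
XMLEventWriter (extends XMLEventConsumer)
XMLStreamWriter
The basic difference between them is that XMLEventWriter offers less functionality: it only adds ready-made XMLEvent objects, while XMLStreamWriter has a separate write method for each XML construct.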
Writing elements II
// using XMLStreamWriter
OutputStream console = System.out;
XMLOutputFactory of = XMLOutputFactory.newInstance();
XMLStreamWriter sw = of.createXMLStreamWriter(console);
sw.writeStartDocument("1.0");
// create a document with one meal
sw.writeStartElement("nutrition");
sw.writeStartElement("food");
sw.writeStartElement("name");
sw.writeCharacters("Chocolate ice cream");
sw.writeEndElement();
sw.writeEndElement();
sw.writeEndElement();
sw.writeEndDocument();
// push any buffered output to the console
sw.flush();
Writing elements III
// the same using XMLEventWriter
OutputStream console = System.out;
XMLEventFactory xef = XMLEventFactory.newInstance();
XMLOutputFactory of = XMLOutputFactory.newInstance();
XMLEventWriter ew = of.createXMLEventWriter(console);
ew.add(xef.createStartDocument("UTF-8", "1.0"));
// empty strings mean "no prefix, no namespace"
ew.add(xef.createStartElement("", "", "nutrition"));
ew.add(xef.createStartElement("", "", "food"));
ew.add(xef.createStartElement("", "", "name"));
ew.add(xef.createCharacters("Chocolate ice cream"));
// createEndElement() needs the element name as well
ew.add(xef.createEndElement("", "", "name"));
ew.add(xef.createEndElement("", "", "food"));
ew.add(xef.createEndElement("", "", "nutrition"));
ew.add(xef.createEndDocument());
// push any buffered output to the console
ew.flush();
XmlPull
XmlPull is the ancestor of StAX. Although StAX has become a popular standard for parsing XML data, XmlPull has not been retired. Because it is so lightweight (the JAR file is only 9 kB), XmlPull is well suited to devices with limited memory and is often used in developing mobile applications.
http://www.xmlpull.org/
Comparing capabilities I
Developing an application that processes XML data always involves choosing a parser.
Selecting the proper API is essential to the success of the project, although the choice is not an easy task. Before making a decision, ask yourself a few questions:
What needs to be done (using the parser)?
Is the application platform-dependent? If so, what's the platform?
Is it a distributed system?
Comparing capabilities II
Benchmarks I
Figures: From http://piccolo.sourceforge.net/bench.html
Benchmarks II
Figures: From http://piccolo.sourceforge.net/bench.html
Benchmarks III
Figures: From http://www.xml.com/lpt/a/1702
Benchmarks IV
Figure: From: http://www.ximpleware.com/benchmark1.html
CASE STUDY: Parsing Really Simple Syndication documents
RSS definition
RSS is a family of Web feed formats used to publish frequently updated content. An RSS document (which is called a "feed", "web feed" or "channel") contains either a summary of content from an associated web site or the full text, stored as XML. RSS makes it possible for people to keep up with web sites in an automated manner that can be piped into applications or filtered displays.
Source: http://en.wikipedia.org/wiki/RSS
The initials "RSS" are used to refer to the following formats:
Really Simple Syndication (RSS 2.0)
RDF Site Summary (RSS 1.0 and RSS 0.90)
Rich Site Summary (RSS 0.91)
When creating a solution for reading or writing RSS documents, we must remember that RSS is not a formal standard and has no XML Schema document (or DTD) describing its structure! The only reference can be found at:
http://www.rssboard.org/rss-specification
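Since no schema is available to validate against, a reader has to rely on the element names from the specification and tolerate everything else. The following is a minimal sketch of that idea using the StAX cursor API. It is not the jNivo plugin code; the class RssTitles, the method itemTitles and the sample feed are assumptions for illustration, and it handles only the rss/channel/item/title layout of RSS 2.0.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, used only for illustration.
class RssTitles {

    // Collects the <title> of every <item>, skipping the channel's own title
    // and any elements it does not know.
    static List<String> itemTitles(String rss) {
        List<String> titles = new ArrayList<String>();
        try {
            XMLStreamReader cur = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(rss));
            boolean inItem = false;
            while (cur.hasNext()) {
                int type = cur.next();
                if (type == XMLStreamConstants.START_ELEMENT) {
                    String name = cur.getLocalName();
                    if ("item".equals(name)) {
                        inItem = true;
                    } else if (inItem && "title".equals(name)) {
                        // titles are expected to be text-only;
                        // getElementText() throws if child elements appear
                        titles.add(cur.getElementText().trim());
                    }
                } else if (type == XMLStreamConstants.END_ELEMENT
                        && "item".equals(cur.getLocalName())) {
                    inItem = false;
                }
            }
            cur.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not a readable feed", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        String feed = "<rss version=\"2.0\"><channel><title>Feed</title>"
                + "<item><title>First</title></item>"
                + "<item><title>Second</title></item>"
                + "</channel></rss>";
        System.out.println(itemTitles(feed));
    }
}
```

Unknown elements are simply ignored, which keeps the reader usable with feeds that add extension elements.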
The Code: Presenting jNivo RSS Exterior Plugin v.0.1
Every API presented so far can be seen as difficult to learn and use. That is partly true: the XML APIs in Java have rather difficult syntax, and hundreds of classes and interfaces that have to be handled to process XML data.
Another issue is that there are several standards:
javax.xml.stream.* (StAX, JSR-173)
org.w3c.dom.* (DOM standard)
org.xml.sax.* (SAX standard)
JAXP
Mark Reinhold³ suggested a different way of expressing XML in the Java language⁴.
Built in: java.lang.String "foo"
New type: java.lang.XML <foo/> (syntax!)
New package: java.lang.xml.* (XML literals!)
³ Chief Engineer for the Java Platform, Standard Edition, at Sun Microsystems.
⁴ Java Technical Session 3441 (TS-3441)
Proposed syntax I
Figure: From [3]
Proposed syntax II
Figure: From [3]
Much more...
Clearly, the new syntax is not just syntactic sugar: it helps enforce the proper structure of the document and prevents instructions from being issued in the wrong order.
Mark Reinhold also proposed:
datatype coders
collections
hybrid event/tree API
accessing by XPath
And more! His blog:
http://blogs.sun.com/mr/
Three different approaches to XML parsing
SAX - keywords: event-based, callback model, fast, cannot modify structure, interface-based API
DOM - keywords: builds tree in memory, divided into modules, rather slow, can generate and modify documents
StAX - keywords: pull parsing, events pulled from the stream, consistent code!, can be used on mobile devices (XmlPull)
RSS parsing? It is difficult to decide on a parsing model; the most efficient choice is an already implemented API, for example ROME
http://rome.dev.java.net
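To make the three models from this summary concrete, here is a minimal sketch that solves one small task, counting the elements of a document, once with each API. The class ThreeApproaches, its method names and the sample document are illustrative assumptions; the parser calls are standard JAXP and StAX API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for illustration.
class ThreeApproaches {

    // SAX: the parser pushes events into our callback.
    static int countWithSax(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes atts) {
                            count[0]++;
                        }
                    });
        } catch (Exception e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
        return count[0];
    }

    // DOM: the whole tree is built in memory first, then queried.
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("*").getLength(); // "*" matches every element
        } catch (Exception e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    // StAX: our own loop pulls events from the stream.
    static int countWithStax(String xml) {
        try {
            XMLStreamReader cur = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (cur.hasNext()) {
                if (cur.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            cur.close();
            return count;
        } catch (Exception e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<nutrition><food><name>Chocolate ice cream</name></food></nutrition>";
        System.out.println(countWithSax(xml) + " " + countWithDom(xml) + " " + countWithStax(xml));
    }
}
```

The SAX version needs a callback object with state held outside the handler method, the DOM version is the shortest but keeps the full document in memory, and the StAX version keeps control flow in ordinary loop code.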
[1] Brett McLaughlin, Justin Edelson. Java & XML. O'Reilly Media, 3rd edition, 1 December 2006.
[2] Cay S. Horstmann, Gary Cornell. Core Java, Volume II: Advanced Features. Prentice Hall PTR, 8th edition, 7 April 2008.
[3] Mark Reinhold. Integrating XML into the Java Programming Language, TS-3441. http://developers.sun.com/learning/javaoneonline/sessions/2006/TS-3441/index.htm
[4] Jurgen Salecker. Hybrid Parser Architectural Pattern. http://developerlife.com/tutorials/?p=53
[5] Various APIs documentation. For starters it's good to search Wikipedia...
Xerces 2 Java Parser http://xerces.apache.org/xerces2-j/
JAXP reference implementation https://jaxp.dev.java.net/
XOM - XML Object Model http://www.xom.nu/
JDOM - Java Document Object Model http://www.jdom.org/
StAX - Streaming API for XML http://stax.codehaus.org/
VTD-XML - new way of processing XML http://vtd-xml.sourceforge.net/
And others...
Why?...
Questions?
What if?...
THANK YOU