Using XML In ViPER and More
XML
• Direct access to information, without worrying about parsing.
• XML Information Set
  – XML provides a way to access information independent of access method, format, etc.
  – XML is just a serialization of a set of information arranged in a tree.
ViPER Tree
• viper
  – config
    • descriptor
    • descriptor
  – data
    • sourcefile filename="file.mpg"
      – file
      – object
ViPER File Format
<?xml version="1.0" encoding="UTF-8"?>
<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config>
    <descriptor name="Information" type="FILE">
      <attribute name="SOURCEDIR" dynamic="false" type="svalue"/>
    </descriptor>
  </config>
  <data>
    <sourcefile filename="comm-001_00001.jpg">
      <file name="Information" id="0" framespan="0:0">
        <attribute name="SOURCEDIR">
          <data:svalue value="/fs/lampa/FaceTextDB/JPEG/advertisements"/>
        </attribute>
      </file>
    </sourcefile>
  </data>
</viper>
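To illustrate the point that the file is just a serialized tree, here is a minimal Python sketch that parses the sample document and lists its elements with indentation. It uses only the standard library's xml.etree.ElementTree; the helper names local_name and outline are made up for this illustration and are not part of ViPER.

```python
import xml.etree.ElementTree as ET

# The sample ViPER file from the slide, embedded so the sketch is self-contained.
SAMPLE_DOC = """<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config>
    <descriptor name="Information" type="FILE">
      <attribute name="SOURCEDIR" dynamic="false" type="svalue"/>
    </descriptor>
  </config>
  <data>
    <sourcefile filename="comm-001_00001.jpg">
      <file name="Information" id="0" framespan="0:0">
        <attribute name="SOURCEDIR">
          <data:svalue value="/fs/lampa/FaceTextDB/JPEG/advertisements"/>
        </attribute>
      </file>
    </sourcefile>
  </data>
</viper>"""


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def outline(elem, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + local_name(elem.tag)]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE_DOC)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The printed outline has the same shape as the ViPER Tree slide: viper at the top, config and data below it, and the descriptor data nested under sourcefile.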
Accessing Via XPath
• Get data from a specific file
  – /viper/data/sourcefile[@filename="f.mpg"]
    • Gets the sourcefile node
  – //sourcefile[@filename="f.mpg"]//bbox
    • Gets all bbox nodes
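The two queries above can be tried from Python as a rough sketch. The standard library's ElementTree supports only a subset of XPath, and because ViPER files declare namespaces, each step needs a prefix bound to the namespace URI. The prefixes v and d, the "Face" descriptor, and the bbox attributes x, y, width, and height are assumptions chosen for this example rather than taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical ViPER document with one bounding box, for illustration only.
XPATH_DOC = """<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config>
    <descriptor name="Face" type="OBJECT">
      <attribute name="LOCATION" dynamic="false" type="bbox"/>
    </descriptor>
  </config>
  <data>
    <sourcefile filename="f.mpg">
      <object name="Face" id="0" framespan="1:10">
        <attribute name="LOCATION">
          <data:bbox x="10" y="20" width="30" height="40"/>
        </attribute>
      </object>
    </sourcefile>
    <sourcefile filename="g.mpg"/>
  </data>
</viper>"""

# Prefixes are local to this script; only the namespace URIs have to match the file.
NS = {
    "v": "http://lamp.cfar.umd.edu/viper",
    "d": "http://lamp.cfar.umd.edu/viperdata",
}

viper = ET.fromstring(XPATH_DOC)

# Counterpart of /viper/data/sourcefile[@filename="f.mpg"], relative to the root element.
matches = viper.findall("v:data/v:sourcefile[@filename='f.mpg']", NS)

# Counterpart of //sourcefile[@filename="f.mpg"]//bbox.
boxes = viper.findall(".//v:sourcefile[@filename='f.mpg']//d:bbox", NS)

for box in boxes:
    print(box.get("x"), box.get("y"), box.get("width"), box.get("height"))
```

A full XPath 1.0 engine would accept the absolute paths as written on the slide, provided the namespace prefixes are registered with it first.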
Matlab with Java
% Add xerces.jar to classpath.txt (find it using 'which classpath.txt').
% MATLAB must be restarted after changing classpath.txt.
import org.apache.xerces.parsers.* org.w3c.dom.*;
import java.lang.String org.xml.sax.*;

input = InputSource('C:\MATLAB6p1\work\advertisements.xml');
parser = DOMParser;
% Turn off XML Schema validation.
parser.setFeature('http://apache.org/xml/features/validation/schema', 0);
parser.parse(input);
doc = parser.getDocument;
sfs = doc.getElementsByTagName('sourcefile');

% Collect the filename attribute of every sourcefile element.
files = cell(sfs.getLength, 1);
i = 0;
while i < sfs.getLength
    % DOM node lists are 0-based; MATLAB cell arrays are 1-based.
    fileattr = sfs.item(i).getAttributes.getNamedItem('filename');
    i = i + 1;
    files{i} = char(fileattr.getValue);
end
Perl
use XML::LibXML;

# @datafiles and $attribType are assumed to be set earlier in the script.
my $parser = XML::LibXML->new();
my $tree = $parser->parse_file($datafiles[0]);
my $root = $tree->getDocumentElement;
foreach my $source ($root->findnodes('sourcefile')) {
    my $image = $source->findvalue('@filename');
    foreach my $d ($source->findnodes('content|object')) {
        my ($startFrame, $endFrame) = split(/:/, $d->findvalue('@framespan'));
        foreach my $shape ($d->findnodes(lc($attribType))) {
            $orig_x = $shape->findvalue('@x');
            $orig_y = $shape->findvalue('@y');
            # ...
        }
    }
}
C with libxml2
#include <libxml/xmlmemory.h>
#include <libxml/parser.h>

/* ... inside a function that returns a pointer;
   parseConfig and parseData are defined elsewhere ... */
xmlDocPtr doc = xmlParseFile("truth.xml");
if (doc == NULL)
    return NULL;
xmlNodePtr cur = xmlDocGetRootElement(doc);
xmlNsPtr viperns = xmlSearchNsByHref(doc, cur,
        (const xmlChar *) "http://lamp.cfar.umd.edu/viper");
cur = cur->xmlChildrenNode;
while (cur != NULL) {
    /* xmlStrcmp compares xmlChar strings and returns 0 on a match. */
    if (!xmlStrcmp(cur->name, (const xmlChar *) "config") && cur->ns == viperns)
        parseConfig(doc, viperns, cur);
    else if (!xmlStrcmp(cur->name, (const xmlChar *) "data") && cur->ns == viperns)
        parseData(doc, viperns, cur);
    cur = cur->next;
}
xmlFreeDoc(doc);
xmlCleanupParser();
XML Databases
• Uses existing tools to access persistent data
  – DOM and XPath
  – XQuery and XUpdate
• Many different implementations
  – Open Source: Apache Xindice, eXist
  – Proprietary: TextML, X-Hive
  – Relational: MS SQL, Oracle
XSL:Transformations
• The idea is to look at the incoming data as a tree, using XPath, and select various nodes to copy to the output.
• While the output does not have to be XML, both the input and the stylesheet itself must be well-formed XML.
• On system 7, ‘testXSLT’ runs stylesheets.
XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gtf="http://lamp.cfar.umd.edu/viper"
    xmlns:data="http://lamp.cfar.umd.edu/viperdata"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <xsl:strip-space elements="gtf:viper"/>
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <!-- continued -->
XSLT
  <xsl:template match="/gtf:viper">
    <xsl:text>#VIPER_VERSION_3.01
</xsl:text>
    <xsl:apply-templates select="*"/>
  </xsl:template>
  <xsl:template match="//gtf:sourcefile[starts-with(@filename, 'comm-001')]">
    <xsl:value-of select="@filename"/>
    <xsl:text>
</xsl:text>
  </xsl:template>
</xsl:stylesheet>
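For comparison, the selection this stylesheet performs can be written directly in Python. This is a sketch of the same idea (walk the tree, pick the sourcefile nodes whose filename starts with 'comm-001', and write plain text), not an XSLT processor. The function name list_sourcefiles and the second file name in the sample input are made up for this example.

```python
import xml.etree.ElementTree as ET

VIPER_NS = "http://lamp.cfar.umd.edu/viper"


def list_sourcefiles(xml_text, prefix="comm-001"):
    """Return a version header followed by one matching filename per line."""
    doc_root = ET.fromstring(xml_text)
    out = ["#VIPER_VERSION_3.01"]
    # iter() visits every element in document order, like the '//' axis in XPath.
    for sourcefile in doc_root.iter("{%s}sourcefile" % VIPER_NS):
        name = sourcefile.get("filename", "")
        if name.startswith(prefix):
            out.append(name)
    return "\n".join(out) + "\n"


# Hypothetical input: only the first sourcefile matches the prefix.
SAMPLE_INPUT = """<viper xmlns="http://lamp.cfar.umd.edu/viper">
  <config/>
  <data>
    <sourcefile filename="comm-001_00001.jpg"/>
    <sourcefile filename="news-002_00001.jpg"/>
  </data>
</viper>"""

report = list_sourcefiles(SAMPLE_INPUT)
print(report, end="")
```

The stylesheet expresses the same filter declaratively, which keeps the selection rule in one XPath expression instead of a loop.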
CSS-1
• Supported in the majority of browsers in use today.
• Basic styling. Hopefully will reduce reliance on HTML tables as a way to lay out web pages.
<style type="text/css">
p {
  font-size: 12pt;
  line-height: 18pt;
}
p:first-letter {
  font-size: 200%;
  float: left;
}
</style>
CSS-2
• Added support for pagination, including widow and orphan control, page breaks, and margins.
• Aural style sheets for voice browsing.
• Can be applied directly to XML.
• Possible to do some multi-column layout.
CSS-3
• Modularized
• Through Ruby, support for Japanese, Arabic, etc.
• Multi-column layout
• Support for other W3C specs, like
  – SVG
  – MathML
  – SMIL
XSL:FO
• Basically, the idea is to put CSS-2 in an XML dialect, and use XPath and other XML technologies to make printed media look nice.
• Extremely verbose; designed to be generated from semantic markup.
  – However, its lack of semantics leads Opera CTO Håkon Lie to call formatting objects “harmful.”
• Additions include footnotes, hyphenation, odd/even pages, citations for indices and tables of contents.
• RenderX, Apache FOP
Defining an XML Dialect
• Document Type Definitions
  – Simple, BNF-style definition of tags, attributes, and how they may be arranged.
• Schema
  – XML-based replacement for non-XML DTDs.
  – Complex.
  – Defines data types, and associates them with tag names.
• Rule-based constraints
  – Schematron
ViPER Schema
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://lamp.cfar.umd.edu/viper"
    xmlns:viper="http://lamp.cfar.umd.edu/viper"
    elementFormDefault="qualified">
  <xsd:element name="viper" type="viper:viperType"/>
  <xsd:complexType name="viperType">
    <xsd:sequence>
      <xsd:element name="config" type="viper:configType"/>
      <xsd:element name="data" type="viper:dataType" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
ViPER Data Schema
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://lamp.cfar.umd.edu/viperdata"
    xmlns:viperdata="http://lamp.cfar.umd.edu/viperdata"
    xmlns:viper="http://lamp.cfar.umd.edu/viper"
    elementFormDefault="qualified">
  <xsd:import namespace="http://lamp.cfar.umd.edu/viper" schemaLocation="file:viper.xsd"/>
  <xsd:element name="point" substitutionGroup="viper:null">
    <xsd:complexType>
      <xsd:complexContent>
        <xsd:extension base="viper:descriptorAttributeData">
          <xsd:attribute name="x" type="xsd:integer"/>
          <xsd:attribute name="y" type="xsd:integer"/>
        </xsd:extension>
      </xsd:complexContent>
    </xsd:complexType>
  </xsd:element>
MPEG-7
• Based on XML-Schema.
• Extensions to deal better with video type data, including matrix data types, etc.
• Designed to work with any level of description, from low level to high.
• W3C has only a working draft for DOM access to schemas, so using generic MPEG-7 documents is currently difficult.
Resources
• www.xml.com
  – O'Reilly's XML resource
• www.w3.org
  – The standards themselves, and lots of good links to implementations.
• xml.apache.org
  – DOM, SAX, and XSLT for C and Java
• xmlsoft.org
  – libxml creators
• msdn.microsoft.com/xml
  – MS-XML parser is the one to use on Windows.
• mpeg.telecomitalialab.com
  – MPEG-7 Working Group
• pyxml.sourceforge.net
  – Using XML with Python.
• okmij.org/ftp/Scheme/xml.html
  – Using XML with Scheme.