
Using XML in ViPER and More

XML

• Direct access to information, without worrying about parsing.

• XML Information Set
  – XML provides a way to access information independent of access method, format, etc.
  – XML is just a serialization of a set of information arranged in a tree.
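To make the "serialization of a tree" point concrete, here is a minimal Python sketch using only the standard library. The two strings and the helper `as_tree` are made up for illustration, and the comparison is a rough stand-in for the formal Information Set rather than an implementation of it: it ignores quoting style, attribute order, and whitespace-only layout.

```python
import xml.etree.ElementTree as ET

def as_tree(elem):
    """Reduce an element to (tag, attributes, text, children), ignoring layout."""
    text = (elem.text or "").strip()
    return (elem.tag, dict(elem.attrib), text, [as_tree(child) for child in elem])

# Two different serializations of the same information (hypothetical data).
compact = '<sourcefile filename="f.mpg"><file id="0" name="Information"/></sourcefile>'
spread = """<sourcefile
    filename='f.mpg'>
  <file name='Information'
        id='0'></file>
</sourcefile>"""

# The byte streams differ, but the trees they describe compare equal.
same = as_tree(ET.fromstring(compact)) == as_tree(ET.fromstring(spread))
print(same)
```

A program that works on the parsed tree never has to care which of the two forms was on disk.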

ViPER Tree

• viper
  – config
    • descriptor
    • descriptor
  – data
    • sourcefile filename="file.mpg"
      – file
      – object

ViPER File Format

<?xml version="1.0" encoding="UTF-8"?>
<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config>
    <descriptor name="Information" type="FILE">
      <attribute name="SOURCEDIR" dynamic="false" type="svalue"/>
    </descriptor>
  </config>
  <data>
    <sourcefile filename="comm-001_00001.jpg">
      <file name="Information" id="0" framespan="0:0">
        <attribute name="SOURCEDIR">
          <data:svalue value="/fs/lampa/FaceTextDB/JPEG/advertisements"/>
        </attribute>
      </file>
    </sourcefile>
  </data>
</viper>
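The file above maps directly onto the tree from the previous slide. As a rough sketch, the following Python (standard library only) parses the same document and prints its element outline. The helper names `local_name` and `outline` are made up for this example, and the XML declaration is left out of the embedded string for brevity.

```python
import xml.etree.ElementTree as ET

# The document from the slide, minus the XML declaration.
VIPER_DOC = """<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config>
    <descriptor name="Information" type="FILE">
      <attribute name="SOURCEDIR" dynamic="false" type="svalue"/>
    </descriptor>
  </config>
  <data>
    <sourcefile filename="comm-001_00001.jpg">
      <file name="Information" id="0" framespan="0:0">
        <attribute name="SOURCEDIR">
          <data:svalue value="/fs/lampa/FaceTextDB/JPEG/advertisements"/>
        </attribute>
      </file>
    </sourcefile>
  </data>
</viper>"""

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def outline(elem, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + local_name(elem.tag)]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(VIPER_DOC))
print("\n".join(tree_lines))
```

Each level of indentation in the output corresponds to one level of nesting in the file.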

Accessing Via XPath

• Get data from a specific file
  – /viper/data/sourcefile[@filename="f.mpg"]
    • Gets the sourcefile node
  – //sourcefile[@filename="f.mpg"]//bbox
    • Gets all bbox nodes
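A minimal sketch of both queries in Python, using the limited XPath subset in the standard library's `xml.etree.ElementTree`. The sample document, the `Face` descriptor, and the helper names are assumptions made up for this example. Because ViPER files declare a default namespace, the element names have to be written with a prefix (`v:` and `d:` here, chosen arbitrarily) that is mapped to the namespace URIs.

```python
import xml.etree.ElementTree as ET

# Hypothetical ViPER data with one bounding box, for illustration only.
VIPER_XML = """<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config>
    <descriptor name="Face" type="OBJECT">
      <attribute name="LOCATION" dynamic="false" type="bbox"/>
    </descriptor>
  </config>
  <data>
    <sourcefile filename="f.mpg">
      <object name="Face" id="0" framespan="1:10">
        <attribute name="LOCATION">
          <data:bbox x="10" y="20" width="30" height="40"/>
        </attribute>
      </object>
    </sourcefile>
    <sourcefile filename="g.mpg"/>
  </data>
</viper>"""

# Prefixes are local to this script; only the URIs have to match the file.
NS = {
    "v": "http://lamp.cfar.umd.edu/viper",
    "d": "http://lamp.cfar.umd.edu/viperdata",
}

def find_sourcefile(root, filename):
    """Rough equivalent of /viper/data/sourcefile[@filename="..."]."""
    return root.find(f"v:data/v:sourcefile[@filename='{filename}']", NS)

def find_bboxes(root, filename):
    """Rough equivalent of //sourcefile[@filename="..."]//bbox."""
    return root.findall(f".//v:sourcefile[@filename='{filename}']//d:bbox", NS)

root = ET.fromstring(VIPER_XML)   # root is the <viper> element
node = find_sourcefile(root, "f.mpg")
boxes = find_bboxes(root, "f.mpg")
print(node.get("filename"), [(b.get("x"), b.get("y")) for b in boxes])
```

ElementTree paths are evaluated relative to the root element, which is why the first query starts at `v:data` rather than at `/viper`.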

Matlab with Java

% Add xerces.jar to classpath.txt (find using 'which classpath.txt').
% Matlab must be restarted after changing it.
import org.apache.xerces.parsers.* org.w3c.dom.*;
import java.lang.String org.xml.sax.*;

input = InputSource('C:\MATLAB6p1\work\advertisements.xml');
parser = DOMParser;
% Turn off schema validation.
parser.setFeature('http://apache.org/xml/features/validation/schema', 0);
parser.parse(input);
doc = parser.getDocument;
sfs = doc.getElementsByTagName('sourcefile');

% Collect the filename attribute of every sourcefile element.
files = cell(sfs.getLength, 1);
i = 0;
while i < sfs.getLength
    fileattr = sfs.item(i).getAttributes.getNamedItem('filename');
    i = i + 1;   % DOM items are 0-based, Matlab cells are 1-based
    files{i} = char(fileattr.getValue);   % convert java.lang.String to a Matlab string
end
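The DOM calls used above (`getElementsByTagName`, `item`, the list length) are part of the W3C DOM, so the same loop carries over to other languages almost unchanged. A minimal sketch with Python's standard `xml.dom.minidom`, where the sample document, its second filename, and the function name `source_filenames` are made up for illustration:

```python
from xml.dom.minidom import parseString

# Hypothetical ViPER file with two source files.
SAMPLE = """<viper xmlns="http://lamp.cfar.umd.edu/viper">
  <config/>
  <data>
    <sourcefile filename="comm-001_00001.jpg"/>
    <sourcefile filename="comm-001_00002.jpg"/>
  </data>
</viper>"""

def source_filenames(xml_text):
    """Collect the filename attribute of every sourcefile element."""
    doc = parseString(xml_text)
    sfs = doc.getElementsByTagName("sourcefile")
    # item(i) and length mirror the DOM NodeList interface.
    return [sfs.item(i).getAttribute("filename") for i in range(sfs.length)]

files = source_filenames(SAMPLE)
print(files)
```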

Perl

use XML::LibXML;

# @datafiles and $attribType are set elsewhere in the script.
my $parser = XML::LibXML->new();
my $tree = $parser->parse_file($datafiles[0]);
my $root = $tree->getDocumentElement;
foreach my $source ($root->findnodes('sourcefile')) {
    my $image = $source->findvalue('@filename');
    foreach my $d ($source->findnodes('content|object')) {
        # framespan has the form "start:end"
        ($startFrame, $endFrame) = split(/:/, $d->findvalue('@framespan'));
        foreach my $shape ($d->findnodes(lc($attribType))) {
            $orig_x = $shape->findvalue('@x');
            $orig_y = $shape->findvalue('@y');
            # ...
        }
    }
}

C with libxml2

#include <libxml/xmlmemory.h>
#include <libxml/parser.h>

/* ... */
xmlDocPtr doc = xmlParseFile("truth.xml");
if (doc == NULL)
    return NULL;
xmlNodePtr cur = xmlDocGetRootElement(doc);
xmlNsPtr viperns = xmlSearchNsByHref(doc, cur,
        (const xmlChar *) "http://lamp.cfar.umd.edu/viper");
cur = cur->xmlChildrenNode;
while (cur != NULL) {
    /* xmlStrcmp compares xmlChar strings and returns 0 on a match. */
    if (!xmlStrcmp(cur->name, (const xmlChar *) "config") && cur->ns == viperns)
        parseConfig(doc, viperns, cur);
    else if (!xmlStrcmp(cur->name, (const xmlChar *) "data") && cur->ns == viperns)
        parseData(doc, viperns, cur);
    cur = cur->next;
}
xmlCleanupParser();

XML Databases

• Uses existing tools to access persistent data
  – DOM and XPath
  – XQuery and XUpdate
• Many different implementations
  – Open Source: Apache Xindice, eXist
  – Proprietary: TextML, X-Hive
  – Relational: MS SQL, Oracle

XSL:Transformations

• The idea is to look at the incoming data as a tree, using XPath, and select various nodes to copy to the output.

• While the output does not have to be XML, both the input and the stylesheet itself must be well-formed XML.

• On system 7, ‘testXSLT’ runs stylesheets.

XSLT

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gtf="http://lamp.cfar.umd.edu/viper"
    xmlns:data="http://lamp.cfar.umd.edu/viperdata"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <xsl:strip-space elements="gtf:viper"/>
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <!-- continued -->

XSLT

  <xsl:template match="/gtf:viper">
    <xsl:text>#VIPER_VERSION_3.01
</xsl:text>
    <xsl:apply-templates select="*"/>
  </xsl:template>
  <xsl:template match="//gtf:sourcefile[starts-with(@filename, 'comm-001')]">
    <xsl:value-of select="@filename"/>
    <xsl:text>
</xsl:text>
  </xsl:template>
</xsl:stylesheet>
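For readers without an XSLT processor at hand, the effect of the stylesheet can be approximated in a few lines of Python. This is a plain-Python analogue, not XSLT: it walks the tree, selects the `sourcefile` nodes whose `filename` starts with `comm-001`, and copies that attribute to the output after the version header. The sample input and the function name `transform` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree spells namespaced tags as "{uri}localname".
GTF = "{http://lamp.cfar.umd.edu/viper}"

# Hypothetical input with one file that should be filtered out.
SAMPLE = """<viper xmlns="http://lamp.cfar.umd.edu/viper">
  <config/>
  <data>
    <sourcefile filename="comm-001_00001.jpg"/>
    <sourcefile filename="comm-002_00001.jpg"/>
    <sourcefile filename="comm-001_00002.jpg"/>
  </data>
</viper>"""

def transform(xml_text, prefix="comm-001"):
    """Emit the header line, then one line per matching sourcefile."""
    root = ET.fromstring(xml_text)
    lines = ["#VIPER_VERSION_3.01"]
    for sf in root.iter(GTF + "sourcefile"):
        name = sf.get("filename", "")
        if name.startswith(prefix):   # plays the role of starts-with(@filename, ...)
            lines.append(name)
    return "\n".join(lines) + "\n"

output = transform(SAMPLE)
print(output, end="")
```

The stylesheet expresses the same selection declaratively, which is the main attraction of XSLT when the rules grow beyond a single filter.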

CSS-1

• Supported in the majority of browsers in use today.

• Basic styling. Hopefully will reduce reliance on HTML tables as a way to lay out web pages.

<style type="text/css">
p {
  font-size: 12pt;
  line-height: 18pt;
}
p:first-letter {
  font-size: 200%;
  float: left;
}
</style>

CSS-2

• Added support for pagination, including widow and orphan control, page breaks, and margins.

• Aural style sheets for voice browsing.

• Can be applied directly to XML.

• Possible to do some multi-column layout.

CSS-3

• Modularized
• Through Ruby, support for Japanese, Arabic, etc.
• Multi-column layout
• Support for other W3C specs, like
  – SVG
  – MathML
  – SMIL

XSL:FO

• Basically, the idea is to put CSS-2 in an XML dialect, and use XPath and other XML technologies to make printed media look nice.
• Extremely verbose – designed to be generated from semantic markup.
  – However, its lack of semantics leads Opera CTO Håkon Wium Lie to call formatting objects “Harmful.”
• Additions include footnotes, hyphenation, odd/even pages, citations for indices and tables of contents.
• Implementations: RenderX, Apache FOP

Defining an XML Dialect

• Document Type Definitions
  – Simple, BNF-style definition of tags, attributes, and how they may be arranged.
• Schema
  – XML-based replacement for non-XML DTDs.
  – Complex.
  – Define data types, and associate them with tag names.
• Rule-based constraints
  – Schematron

ViPER Schema

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://lamp.cfar.umd.edu/viper"
    xmlns:viper="http://lamp.cfar.umd.edu/viper"
    elementFormDefault="qualified">
  <xsd:element name="viper" type="viper:viperType"/>
  <xsd:complexType name="viperType">
    <xsd:sequence>
      <xsd:element name="config" type="viper:configType"/>
      <xsd:element name="data" type="viper:dataType" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

ViPER Data Schema

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://lamp.cfar.umd.edu/viperdata"
    xmlns:viperdata="http://lamp.cfar.umd.edu/viperdata"
    xmlns:viper="http://lamp.cfar.umd.edu/viper"
    elementFormDefault="qualified">
  <xsd:import namespace="http://lamp.cfar.umd.edu/viper" schemaLocation="file:viper.xsd"/>
  <xsd:element name="point" substitutionGroup="viper:null">
    <xsd:complexType>
      <xsd:complexContent>
        <xsd:extension base="viper:descriptorAttributeData">
          <xsd:attribute name="x" type="xsd:integer"/>
          <xsd:attribute name="y" type="xsd:integer"/>
        </xsd:extension>
      </xsd:complexContent>
    </xsd:complexType>
  </xsd:element>

MPEG-7

• Based on XML-Schema.
• Extensions to deal better with video type data, including matrix data types, etc.
• Designed to work with any level of description, from low level to high.
• W3C has only a working draft for DOM access to schemas, so using generic MPEG-7 documents is currently difficult.

Resources

• www.xml.com
  – O'Reilly's XML resource
• www.w3.org
  – The standards themselves, and lots of good links to implementations.
• xml.apache.org
  – DOM, SAX, and XSLT for C and Java
• xmlsoft.org
  – libxml creators
• msdn.microsoft.com/xml
  – MS-XML parser is the one to use on Windows.
• mpeg.telecomitalialab.com
  – MPEG-7 Working Group
• pyxml.sourceforge.net
  – Using XML with Python.
• okmij.org/ftp/Scheme/xml.html
  – Using XML with Scheme.