All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com

  • date post

    21-Dec-2015

Page 1: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

All Your XML

Sandeepan Banerjee
Oracle Corporation

Sandeepan.Banerjee@Oracle.com

One Management System for All Your Data

Complete, Integrated, Robust, Scalable, Secure, Available on all platforms

Oracle interMedia: Multimedia management
Oracle Locator & Spatial: Location and Proximity Searching
Extensibility Framework: Chemical, Genetic, Engineering, …
XML DB: Integrated Native XML Database
Oracle Text & Ultra Search: Text management and search
Relational: Characters, Numbers and Dates
Oracle Collaboration Suite: Unified Messaging and Files

What is Oracle XML DB?

Database support for the XML data model
– XMLType, XMLSchema, DOM Fidelity, XPath, …
Hierarchical organization of the data
– WebDAV compliant with indexing for fast access
Transparent storage optimizations
Query Language: SQL/XML (SQL2003) and XQuery

Oracle XML DB Features

New XMLType
– XMLSchema Support
– Object-Relational Storage maintaining DOM fidelity; CLOB storage with document fidelity
– XML-specific memory management for better scalability and performance: lazily loaded virtual DOM, Schema caching
– Built-in XML operators for SQL/XML interchangeability
– XPath Search in the server, and piecewise update of XML via XPath
– XSL Transforms in the server
– Enhanced XML Views for creating your own representations of XML
New XML Repository
– FTP, WebDAV, HTTP protocol servers to move XML content in and out
– ‘Foldering’ and Repository view over XML Content including access control; Hierarchical Index, SQL Versioning
– SQL Repository Search

Common XML Architectures

Enforce XML Schema
Manage Content & Data Uniformly
Eliminate extra moving parts
Get better performance & scalability
Make DB applications Standards-based

XML DB Architecture: Application Development

[Diagram labels: .NET Client, OCI Client, Java Client; Oracle*Net; JDBC/OCI/.NET; Oracle XML DB: XML Schema Cache, XML/DOM Parser, XSL Transforms, SQL, Repository, XMLType Views/Tables; Text Index, Path Index, B-Tree, Bitmap Index]

XML DB Architecture: Content Oriented Access

[Diagram labels: HTTP Client, FTP Client, WebDAV Client; HTTP, FTP, WebDAV; Protocol Handlers; Oracle XML DB: Repository, XMLType Views/Tables; Text Index, Path Index, B-Tree, Bitmap Index]

XML DB Architecture: Data & Content Integration

[Diagram labels: J2EE; JVM; XML Query; Query & Results Cache; Oracle XDS (JCA-based) Connectivity; Web Services & Data Sources; Oracle XML DB: XML Query, SQL, Repository, XMLType Views; Text Index, B-Tree, Bitmap Index; Oracle HO Connectivity; Relational DBs, File Systems, Web Sources]

XML DB Architecture: Web Services

[Diagram labels: Client Application; Web Services Directory (UDDI); 9iAS SOAP Server; Web Service; Oracle XML DB; Publish, Discover, Invoke; SOAP; WSDL; HTTP/Oracle*Net]

XML DB Architecture: XML Messaging

[Diagram labels: Messaging Application; XML Document; EnQueue, DeQueue; Transform; XMLType Table; Oracle Streams; IDAP/JMS/AQ; Oracle*Net; Oracle XML DB (shown twice)]

Classes of XML Applications

Managing Structured Documents
– Well-formed templated business documents, e.g. Purchase Orders, Phone Bills, …
Managing Unstructured Documents (“Content”)
– Documents, Messages, Instructions
– Life-cycle management of these assets under multiple channels
Querying over integrated, normalized data from diverse sources

1. Managing Structured Documents

Relational storage remains the “right” way to store highly structured data
– “Extended-relational” or “object-relational” mechanisms have been available for some time to deal with ordering, recursion, substitution, etc.
As an XML programmer, you do not want to think about “tables”
– A hierarchical data model is what you want to manipulate
XML DB’s XMLType is about preserving the XML paradigm while getting the benefits of relational performance and scalability

DEMONSTRATION

Structured XML


Object Relational DB Basics

Object types, Collection types, Object References, LOBs

XML Schema Construct               Object Construct
Complex type                       Object type
Local complex type                 Embedded object type
Complex type with maxOccurs > 1    Collection type
Derived complex type               Subtype

XMLType

Native data type for XML
Used to define columns of tables and views, arguments to stored procedures, etc.
XML-specific methods and operators for
– Querying and extracting XML using XPath
– Transforming XML using XSLT
– Validating XML using XML Schema
Multiple Storage Options
– Unstructured Storage in CLOB
– Structured Storage into object-relational rows and columns
– Hybrid Storage
Maintains application transparency to physical storage choice

XML DB and XML Schema

XML Schema controls all aspects of processing
– Storage mappings
– In-memory representations
– Language Bindings
XML Schema Registration Process
– Associates XML Schema with URL
– Generates Object types
– Creates default tables
XMLType column can be constrained to a global element of registered schema

XML Schema Example

<schema targetNamespace="http://www.oracle.com/PO.xsd"
        xmlns:po="http://www.oracle.com/PO.xsd"
        elementFormDefault="qualified"
        xmlns="http://www.w3.org/2001/XMLSchema">
  <complexType name="PurchaseOrderType">
    <sequence>
      <element name="PONum" type="decimal"/>
      <element name="Company">
        <simpleType>
          <restriction base="string">
            <maxLength value="100"/>
          </restriction>
        </simpleType>
      </element>

XML Schema Example (contd)

      <element name="Item" maxOccurs="1000">
        <complexType>
          <sequence>
            <element name="Part">
              <simpleType>
                <restriction base="string">
                  <maxLength value="1000"/>
                </restriction>
              </simpleType>
            </element>
            <element name="Price" type="float"/>
          </sequence>
        </complexType>
      </element>
    </sequence>
  </complexType>
  <element name="PurchaseOrder" type="po:PurchaseOrderType"/>
</schema>

Generated Object Types

TYPE "Item_T" (part varchar2(1000), price number);

TYPE "Item_COLL" AS VARRAY(1000) OF "Item_T";

TYPE "PurchaseOrderType_T" (ponum number, company varchar2(100), item "Item_COLL");

Generated Tables

TABLE po_tab OF XMLTYPE
  XMLSCHEMA "http://www.oracle.com/PO.xsd" ELEMENT "PurchaseOrder"
  VARRAY(item) STORE AS item_tab;

XML Document Example

<PurchaseOrder xmlns="http://www.oracle.com/PO.xsd"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://www.oracle.com/PO.xsd http://www.oracle.com/PO.xsd">
  <PONum>1001</PONum>
  <Company>Oracle Corp</Company>
  <Item>
    <Part>9i Doc Set</Part>
    <Price>2550</Price>
  </Item>
  <Item>
    <Part>8i Doc Set</Part>
    <Price>350</Price>
  </Item>
</PurchaseOrder>

XML Storage – PO_TAB and ITEM_TAB

PO_TAB
Row ID    ponum    company
1         1001     Oracle Corp

ITEM_TAB
Parent Row ID    Array Index    part          price
1                1              9i Doc Set    2550
1                2              8i Doc Set    350
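
The two tables above can be reproduced with a small sketch. This is only an illustration of the shredding idea in Python, not Oracle's implementation; the function name `shred` and the tuple layouts are assumptions chosen to mirror the columns shown on the slide.

```python
import xml.etree.ElementTree as ET

NS = {"po": "http://www.oracle.com/PO.xsd"}

DOC = """<PurchaseOrder xmlns="http://www.oracle.com/PO.xsd">
  <PONum>1001</PONum>
  <Company>Oracle Corp</Company>
  <Item><Part>9i Doc Set</Part><Price>2550</Price></Item>
  <Item><Part>8i Doc Set</Part><Price>350</Price></Item>
</PurchaseOrder>"""


def shred(xml_text, row_id):
    """Split one PurchaseOrder into a parent row and ordered child rows."""
    root = ET.fromstring(xml_text)
    # Single-valued elements become columns of the parent row.
    parent = (
        row_id,
        int(root.findtext("po:PONum", namespaces=NS)),
        root.findtext("po:Company", namespaces=NS),
    )
    # Each repeated Item becomes one child row: (parent key, array index, part, price).
    children = [
        (
            row_id,
            index,
            item.findtext("po:Part", namespaces=NS),
            float(item.findtext("po:Price", namespaces=NS)),
        )
        for index, item in enumerate(root.findall("po:Item", NS), start=1)
    ]
    return parent, children


po_row, item_rows = shred(DOC, row_id=1)
```

The array index is what lets document order be rebuilt from rows that a relational table stores without any inherent order.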

Structured Storage

Attributes and single-valued elements
– Stored as columns in single row
– SQL data types correspond to XML Schema types
– SQL constraints correspond to XML Schema constraints
Multi-valued elements (collections) stored in separate nested tables
– One row per item in collection
– Nested table row stores parent key
– Array Index column stores the position information
  Number column – uses full range of floating points
  Elements inserted in the middle of the collection get (previous_array_index + next_array_index)/2
Supports multiple levels of nesting
– Embedded object types
– Embedded collection types with multiple nested tables
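
The midpoint rule for the Array Index column can be sketched as follows. This is a simplified Python model under stated assumptions: rows are `(array_index, value)` pairs, the helper names are hypothetical, and the sample values are made up for illustration.

```python
def index_between(previous_index, next_index):
    """Array index for an element placed between two existing neighbours."""
    return (previous_index + next_index) / 2


def insert_between(rows, position, value):
    """Insert value before rows[position] without renumbering other rows.

    rows is a list of (array_index, value) pairs sorted by array_index;
    position must satisfy 0 < position < len(rows).
    """
    new_index = index_between(rows[position - 1][0], rows[position][0])
    rows.append((new_index, value))      # physical placement does not matter
    rows.sort(key=lambda row: row[0])    # reading back orders by array_index
    return new_index


items = [(1.0, "9i Doc Set"), (2.0, "8i Doc Set")]
first = insert_between(items, 1, "Doc Set A")    # lands between 1.0 and 2.0
second = insert_between(items, 1, "Doc Set B")   # lands between 1.0 and 1.5
```

Because the index is a floating-point number, an insert in the middle touches one row only; the existing rows keep their index values.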

DOM Fidelity

Structured storage guarantees DOM fidelity
– No whitespace fidelity
System binary attribute in object types
– SYS_XDBPD$
PD attribute stores non-relational information
– Ordering of elements
– Comments
– Processing Instructions
– Namespace declarations
– Prefix information
– Mixed content – text nodes that are intermixed with elements are stored in the system column

Do you need it?

Can disable on a type-by-type basis using annotation xdb:maintainDOM="false"
Location of Namespace declarations
No Comments or Processing Instructions
Do not care about empty vs. missing elements
Elements in an All and Choice will be ordered as per the Schema, not the instance
Not worried about defaults
Type does not allow Mixed text

Type defn with DOM Fidelity

desc LINEITEMS_T
LINEITEMS_T is NOT FINAL
Name          Null?    Type
------------- -------- ------------------
SYS_XDBPD$             XDB.XDB$RAW_LIST_T
LINEITEM               LINEITEM_V

desc LINEITEM_V
LINEITEM_V VARRAY(2147483647) OF LINEITEM_T
LINEITEM_T is NOT FINAL
Name          Null?    Type
------------- -------- ------------------
SYS_XDBPD$             XDB.XDB$RAW_LIST_T
ITEMNUMBER             NUMBER(38)
DESCRIPTION            VARCHAR2(256 CHAR)
PART                   PART_T

Type defn without DOM Fidelity

desc LINEITEMS_T
LINEITEMS_T is NOT FINAL
Name          Null?    Type
------------- -------- ------------------
LINEITEM               LINEITEM_V

desc LINEITEM_V
LINEITEM_V VARRAY(2147483647) OF LINEITEM_T
LINEITEM_T is NOT FINAL
Name          Null?    Type
------------- -------- ------------------
ITEMNUMBER             NUMBER(38)
DESCRIPTION            VARCHAR2(256 CHAR)
PART                   PART_T

xdb:maintainDOM

<xs:complexType name="LineItemsType" xdb:SQLType="LINEITEMS_T" xdb:maintainDOM="false">
  <xs:sequence>
    <xs:element name="LineItem" type="LineItemType" maxOccurs="unbounded"
                xdb:SQLName="LINEITEM" xdb:SQLCollType="LINEITEM_V"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="LineItemType" xdb:SQLType="LINEITEM_T" xdb:maintainDOM="false">
  <xs:sequence>
    <xs:element name="Description" type="DescriptionType" xdb:SQLName="DESCRIPTION"/>
    <xs:element name="Part" type="PartType" xdb:SQLName="PART"/>
  </xs:sequence>
  <xs:attribute name="ItemNumber" type="xs:integer" xdb:SQLName="ITEMNUMBER" xdb:SQLType="NUMBER"/>
</xs:complexType>

DOM Fidelity

[Chart: Insert Time and Size on Disk (scale 0 to 1) for Files, DOM Fidelity, and No Fidelity]

XDB Attributes in XML Schema

Mapping details captured as new attributes within the XML Schema
– Oracle attributes use namespace: http://xmlns.oracle.com/xdb
Input XML Schema does not need to be annotated
– Default assumptions for all XDB attributes
Input XML Schema can explicitly specify XDB attributes
– SQLName: Name of column or type attribute
– SQLType: Name of object type
– SQLCollType: Name of collection type
– MaintainOrder: Permits turning off order maintenance

Hybrid Storage of XML

Two ends of storage spectrum
– Store entire content in a single LOB
– Full shredding of XML data
Hybrid Storage Options
– Combination of shredding and LOB storage
– xdb:SQLType="CLOB" applied to <complexType>
– CLOB storage ideal for fragments that are not intended for query or partial updates
– Example: XHTML content embedded within resource descriptors (metadata)
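
A rough Python sketch of the hybrid idea follows: leaf elements are shredded into scalar columns, while a nominated fragment is kept whole as text, the way a CLOB would hold it. The `Resource` document, its element names, and the `hybrid_store` helper are all hypothetical and only illustrate the split.

```python
import xml.etree.ElementTree as ET

RESOURCE = """<Resource>
  <Name>report.html</Name>
  <Owner>SCOTT</Owner>
  <Body><p>Quarterly <b>summary</b></p></Body>
</Resource>"""


def hybrid_store(xml_text, lob_elements):
    """Return one row: scalar columns for leaf elements, raw XML text for LOB ones."""
    row = {}
    for child in ET.fromstring(xml_text):
        if child.tag in lob_elements:
            # Keep the whole fragment serialized, without shredding its content.
            row[child.tag] = ET.tostring(child, encoding="unicode").strip()
        else:
            row[child.tag] = child.text
    return row


row = hybrid_store(RESOURCE, lob_elements={"Body"})
```

Queries on `Name` or `Owner` can use ordinary columns, while `Body` is only ever read or replaced as a unit.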

Querying XMLType

XPath based operators
existsNode
– Boolean operator
– Checks for existence of node identified by XPath
extract
– Extracts a fragment identified by XPath
extractValue
– Retrieves the raw value of leaf node identified by XPath
Namespace Aware
ANSI SQL/XML Standards effort

Query Rewrite

Automatic rewrite of XPaths during query compilation
Rewritten query directly accesses underlying relational columns
Introduces joins with nested tables
Enables use of indexes

Query Rewrite - Example

Original Query

SELECT extractValue(value(p), '/PurchaseOrder/PONum')
FROM po_tab p
WHERE existsNode(value(p), '/PurchaseOrder[Company="Oracle"]') = 1;

Rewritten Query

SELECT p.ponum
FROM po_tab p
WHERE p.company = 'Oracle';
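
To see why the two forms are interchangeable, the sketch below evaluates the same question both ways over two sample documents: once by parsing each document and walking the path, and once with plain SQL over shredded columns. It uses Python with SQLite purely as a stand-in; the sample documents and function names are assumptions, and nothing here reflects how Oracle performs the rewrite internally.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOCS = [
    "<PurchaseOrder><PONum>1001</PONum><Company>Oracle</Company></PurchaseOrder>",
    "<PurchaseOrder><PONum>1002</PONum><Company>Acme</Company></PurchaseOrder>",
]


def xpath_query(docs):
    """Functional evaluation: parse every document and apply the path."""
    result = []
    for text in docs:
        root = ET.fromstring(text)
        if root.findtext("Company") == "Oracle":      # /PurchaseOrder[Company="Oracle"]
            result.append(int(root.findtext("PONum")))  # /PurchaseOrder/PONum
    return result


def relational_query(docs):
    """Rewritten evaluation: query the shredded columns directly."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE po_tab (ponum INTEGER, company TEXT)")
    for text in docs:
        root = ET.fromstring(text)
        db.execute(
            "INSERT INTO po_tab VALUES (?, ?)",
            (int(root.findtext("PONum")), root.findtext("Company")),
        )
    rows = db.execute("SELECT ponum FROM po_tab WHERE company = 'Oracle'")
    return [ponum for (ponum,) in rows]
```

The relational form never parses XML at query time, which is what allows ordinary indexes on the underlying columns to be used.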

Query Rewrite Examples

XPath Expression                     Rewrite Logic
Simple XPath expressions             Traversals of object attributes, 1 or more levels
  /PurchaseOrder/Company
  /PurchaseOrder/@poDate
Collection traversals                Joins with appropriate nested tables
  /PurchaseOrder/LineItem/Part
Predicates                           Relational predicates
  [Company="Oracle"]
List indexes                         Operator to access i-th item of collection;
  LineItem[1]                        uses index on array_index column
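
Two of the rewrite rules above, collection traversal as a join and a list index as a lookup on the array index, can be sketched with SQLite from Python. The table and column names (`po_tab`, `item_tab`, `parent_row_id`, `array_index`) are assumptions modelled on the storage slide, and the SQL is ordinary SQLite, not what Oracle generates.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE po_tab   (row_id INTEGER PRIMARY KEY, ponum INTEGER, company TEXT);
CREATE TABLE item_tab (parent_row_id INTEGER, array_index REAL, part TEXT);
CREATE INDEX item_pos ON item_tab (parent_row_id, array_index);
INSERT INTO po_tab   VALUES (1, 1001, 'Oracle');
INSERT INTO item_tab VALUES (1, 1, '9i Doc Set'), (1, 2, '8i Doc Set');
""")

JOIN = "FROM po_tab p JOIN item_tab i ON i.parent_row_id = p.row_id "


def parts(company):
    """/PurchaseOrder[Company=...]/LineItem/Part as a join with the nested table."""
    rows = db.execute(
        "SELECT i.part " + JOIN + "WHERE p.company = ? ORDER BY i.array_index",
        (company,),
    )
    return [part for (part,) in rows]


def nth_part(ponum, n):
    """/PurchaseOrder/LineItem[n]/Part: the n-th child row in array_index order."""
    row = db.execute(
        "SELECT i.part " + JOIN
        + "WHERE p.ponum = ? ORDER BY i.array_index LIMIT 1 OFFSET ?",
        (ponum, n - 1),
    ).fetchone()
    return row[0] if row else None
```

Ordering by `array_index` is what preserves document order in the result.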

Updating XMLType

Updating entire XML document
Partial Update
– Uses UpdateXML() operator
– XPath identifies element or attribute to be updated
– New value for the updated node is specified
– Rewritten to directly update underlying column(s)
Similar mechanisms for
– Inserting new nodes
– Deleting node(s)

Update Rewrite - Example

Original Statement

UPDATE po_tab p
SET value(p) = updateXML(value(p), '/PurchaseOrder/PONum/text()', 9999)
WHERE existsNode(value(p), '/PurchaseOrder[Company="Oracle"]') = 1;

Rewritten Statement

UPDATE po_tab p
SET p.ponum = 9999
WHERE p.company = 'Oracle';

XMLType Views

Provide an XML view of relational data
Good evolutionary strategy
– New XML apps on XML abstraction of existing data
SQL/XML Standard operators used to generate XML
Views can generate schema based XML
Insert / Update / Delete operations via ‘Instead of’ Triggers
Queries over XML views are rewritten to directly access underlying relational columns

XMLType View - Example

CREATE VIEW po_view OF XMLTYPE
  XMLSCHEMA "po.xsd" ELEMENT "PurchaseOrder" AS
SELECT XMLElement("PurchaseOrder",
         XMLForest(p.ponum "PONum", p.company "Company"),
         (SELECT XMLAgg(
                   XMLElement("Item",
                     XMLForest(i.part "Part", i.price "Price")))
          FROM items_rel_tab i
          WHERE i.po_id = p.id))
FROM po_rel_tab p;
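
What the view computes can be imitated outside the database. The Python sketch below builds the same nested document from two relational tables held in SQLite; the `forest` helper loosely mimics XMLForest (one child per non-null column) and the inner loop plays the role of XMLAgg. The sample rows and helper names are assumptions for illustration only.

```python
import sqlite3
import xml.etree.ElementTree as ET

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE po_rel_tab    (id INTEGER, ponum INTEGER, company TEXT);
CREATE TABLE items_rel_tab (po_id INTEGER, part TEXT, price INTEGER);
INSERT INTO po_rel_tab    VALUES (1, 1001, 'Oracle Corp');
INSERT INTO items_rel_tab VALUES (1, '9i Doc Set', 2550), (1, '8i Doc Set', 350);
""")


def forest(parent, **columns):
    """Add one child element per non-null column value."""
    for name, value in columns.items():
        if value is not None:
            ET.SubElement(parent, name).text = str(value)


def po_view():
    """Build one PurchaseOrder document per row of po_rel_tab."""
    documents = []
    for po_id, ponum, company in db.execute("SELECT id, ponum, company FROM po_rel_tab"):
        po = ET.Element("PurchaseOrder")
        forest(po, PONum=ponum, Company=company)
        items = db.execute(
            "SELECT part, price FROM items_rel_tab WHERE po_id = ? ORDER BY rowid",
            (po_id,),
        )
        for part, price in items:  # aggregate the child rows under the parent
            forest(ET.SubElement(po, "Item"), Part=part, Price=price)
        documents.append(ET.tostring(po, encoding="unicode"))
    return documents
```

The relational tables stay the system of record; the XML is generated on demand, which is the evolutionary path the XMLType view offers.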

Complex XML Schemas

Cyclic Definitions
– Object types created with references
– Table row contains reference to other rows stored in same or different table
– Keys to document and parent rows stored in nested rows
Complex type derivation
– Mapped to object type inheritance
Wildcards
– Mapped to CLOB attributes

Indexing XMLType

Multiple index types
B-Tree and bitmap indexes
Function-based indexes
– Create index on specific XPath expressions
Text indexes
– Inverted lists provide section-based search
– Also support keyword based search within textual content


Performance Metrics

Summary: Native XML Data Type

Abstraction for storing XML
– Native Server data type
– Use as Table, Column, PL/SQL variable
– Supports constraints and referential integrity
– Structured and Unstructured storage options
XML Specific methods enable
– XPath based Navigation and Searching of XML content
– XPath based manipulation and update of XML content
– Server based XSLT Transformation
– XMLSchema validation

Summary: XML Schema

Validation of instance Documents
Basis for Structured Storage of XMLType
– XML shredded and stored as SQL Objects
– DOM Fidelity
– Optimized Collection Management
– B-Tree indexes over collections
– Query Re-write of XPath expressions
– Partial Update
– Lazily Loaded Virtual DOM
– Object model automatically derived from XML Schema

Summary: SQL / XML Interoperability

XML generation from SQL Queries
– Generate an XML document from a SELECT statement
– Support for generating complex documents
– XMLType views provide XML access to relational content
SQL query and update of XML Content
– XPath based extraction of XML Content (SELECT List)
– XPath based query of XML Content (WHERE Clause)
– XPath based update of XML content
– XPath based relational views over XML content

Summary: API

Higher Level API for working with XML with full support for
– Generation
– Storage and Retrieval
– Indexing, Searching
– Query and Update
– Transformation

Summary: Generating XML

SQL XML operators make it easy to generate XML from relational data
– Result set of SQL Query is XML document
XMLType Views allow persistent XML access to relational data
Content of XMLType view can be exposed as a virtual document
– Direct access via HTTP / WebDAV or FTP

2. Managing Unstructured Data

More and more content is being produced as XML (Microsoft Word, Corel XMetal, Arbortext Epic, …)
– Markup improves search, processing, organization, …
XML DB’s Repository enables XML document content to be stored as ‘files’ in ‘folders’ without losing strong management, queryability, unbreakable security, etc.
XML is doing for unstructured data what Relational did for structured: create a standard way to store, query and manage unstructured data

Managing Unstructured Data with Oracle XML DB

XML data model and APIs familiar to Content Developers
Integrated Repository
– WebDAV compliant
– XPath index for fast traversal of foldering hierarchies
– SQL Queryable
Integrated Text Processing
– Optimizations such as “tag aware” search

Using XML for Unstructured Content

Measurable results:
– Authoring Cost Savings: 50%
– Publishing Time Savings: 95%
– Translation Savings: 30%
– Reduced Content Maintenance: 20X
Soft results:
– Increase information quality
– Support knowledge management initiatives
– Improve information freshness
– Enhance information accuracy
– Greater content interactivity
– Improve content flexibility
– Increase customer satisfaction
– Expand customer retention

Source:

DEMONSTRATION

Unstructured XML


Oracle & Corel Integration

Oracle XML DB Repository

XML Repository based on the IETF DAV specification
– File / Folder metaphor for storing and managing content
– ACL based access control
– Basic versioning support
Supports WebDAV, HTTP and FTP protocols
– Access and update content using standard tools
Full SQL access and update
Programmatic access via multiple APIs
Hierarchical Index
– Patented, high performance folder-traversal operations and queries

XML Searching

The CONTAINS SQL function is an extension to SQL, and can be used in any query.

The ora:contains XPath function is an extension to XPath, and can be used in any call to existsNode, extract or extractValue.

SELECT ID
FROM PURCHASE_ORDERS
WHERE CONTAINS( DOC, 'electric INPATH (/purchaseOrder/items/item/comment)' ) > 0;

SELECT ID
FROM PURCHASE_ORDERS_xmltype
WHERE existsNode( DOC,
        '/purchaseOrder/comment[ora:contains(text(), "($lawns AND wild) OR flamingo")>0]',
        'xmlns:ora="http://xmlns.oracle.com/xdb"' ) = 1;

CONTAINS vs ora:contains()

The CONTAINS SQL function:
– needs a CONTEXT index to run; if there is no index, you get an error
– does an indexed search, and is generally very fast
– returns a score (via the score operator)
– can restrict a search using both Full Text and XML structure
– restricts a search based on documents (rows in a table), not nodes
– cannot be used for XML structure-based projection (pulling out parts of an XML document)
The ora:contains XPath function:
– does not need an index to run, so it is very flexible; it separates application logic from storing and indexing considerations
– may do an unindexed search, so it may be resource-intensive; alternatively, may also get rewritten to use an index
– does not return a score
– can restrict a search using Full Text in an XPath expression
– can be used for XML structure-based projection (pulling out parts of an XML document)
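
The trade-off between the two functions is essentially indexed lookup versus scanning. The Python sketch below models that difference with a toy inverted list; it says nothing about Oracle Text internals, and the sample documents and function names are assumptions.

```python
import re
from collections import defaultdict

DOCS = {
    1: "electric mower for wild lawns",
    2: "pink flamingo ornament",
    3: "garden hose",
}


def tokens(text):
    """Lower-cased words of a document."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Inverted list: word -> ids of the documents (rows) that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokens(text):
            index[word].add(doc_id)
    return index


def indexed_search(index, word):
    """Index-only search: fast, but raises if no index has been built."""
    if index is None:
        raise LookupError("no index on this column")
    return sorted(index.get(word, set()))


def scan_search(docs, word):
    """Unindexed search: needs no index, but examines every document."""
    return sorted(doc_id for doc_id, text in docs.items() if word in tokens(text))


index = build_index(DOCS)
```

Both searches return the same document ids; they differ in cost and in whether an index must exist before the query runs.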

More Search Examples

SELECT ID
FROM PURCHASE_ORDERS_xmltype
WHERE existsNode( DOC,
        '/purchaseOrder/comment[ora:contains(text(), "($lawns AND wild) OR flamingo")>0]',
        'xmlns:ora="http://xmlns.oracle.com/xdb"' ) = 1;

SELECT extract( DOC,
         '/purchaseOrder/items/item/comment[ora:contains(text(), "electric")>0]',
         'xmlns:ora="http://xmlns.oracle.com/xdb"' ) "Comment"
FROM PURCHASE_ORDERS_xmltype
WHERE CONTAINS( DOC, 'electric INPATH (/purchaseOrder/items/item/comment)' ) > 0;

SELECT ID
FROM PURCHASE_ORDERS
WHERE CONTAINS( DOC, '(electric INPATH (//comment)) INPATH (/purchaseOrder/items)' ) > 0;

SELECT ID
FROM PURCHASE_ORDERS
WHERE CONTAINS( DOC, 'electric INPATH (//items/item[@partNum="872-AA"]/comment)' ) > 0;

SELECT ID
FROM PURCHASE_ORDERS
WHERE CONTAINS( DOC, 'electric INPATH (/*/*/item[.//comment])' ) > 0;

Query Rewrite

A query with existsNode, extract or extractValue, where the XPath includes ora:contains, may be considered for Query Rewrite iff:
– the XML is Schema-based
– the first argument to ora:contains (text_input) is either a single text node whose parent node maps to a column, or an attribute that maps to a column. The column must be a single relational column (possibly in a nested table).
The rewritten query will use a CONTEXT index iff:
– there is a CONTEXT index on the column that text_input's parent node (or attribute node) maps to
– the ora:contains policy exactly matches the index choices of the CONTEXT index
– the CONTEXT index was built with the TRANSACTIONAL keyword in the PARAMETERS string

3. XML Query-based Integration

Why XQuery?
– Declarative way to query XML documents
Why Java?
– Run in mid-tier or database
– Future server implementation in C
Why XML Database?
– Native XML storage
– XML data management
– Performance optimizations
– SQL/XML or XQuery depending on data
Status
– OTN downloads (pending W3C standard finalization in ’04)

[Diagram labels: iAS J2EE™ Platform with XQuery Engine; XML DB with Server JVM and XQuery Engine]

What is XQuery?

New W3C language to query XML
Evolution from XPath – path navigation
Analogous to SQL
– SQL for structured storage (relational)
– XQuery for structured (schema)/unstructured
Text extensions – work in progress

XQuery example

<PurchaseOrder pono="100">
  <Customer>SCOTT</Customer>
  <PODate>12-Sep-2003</PODate>
  <ShipAddr street="211,Redwood Lane" city="Seattle" state="CA"/>
  <LineItem ItemNo="232" Item="Box" Quantity="200"/>
  <LineItem ItemNo="344" Item="Card" Quantity="3"/>
  <LineItem ItemNo="333" Item="Pen" Quantity="33"/>
</PurchaseOrder>

XQuery example - query

Search all of Scott's purchase orders for line items with Quantity > 20

<ScottLargeItems> {
  for $i in doc("scott.xml")/PurchaseOrder/LineItem
  where $i/@Quantity > 20
  return <LargeItem>{$i/@Item}</LargeItem>
} </ScottLargeItems>

Result:

<ScottLargeItems>
  <LargeItem Item="Box"/>
  <LargeItem Item="Pen"/>
</ScottLargeItems>
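
For readers more used to procedural code, the same for/where/return logic can be written out by hand. This Python sketch is only an analogy to the query above, not an XQuery engine; the `large_items` function name and the `threshold` parameter are assumptions.

```python
import xml.etree.ElementTree as ET

SCOTT_XML = """<PurchaseOrder pono="100">
  <Customer>SCOTT</Customer>
  <LineItem ItemNo="232" Item="Box" Quantity="200"/>
  <LineItem ItemNo="344" Item="Card" Quantity="3"/>
  <LineItem ItemNo="333" Item="Pen" Quantity="33"/>
</PurchaseOrder>"""


def large_items(xml_text, threshold=20):
    """Return a ScottLargeItems element holding items above the threshold."""
    source = ET.fromstring(xml_text)
    result = ET.Element("ScottLargeItems")
    for line_item in source.findall("LineItem"):            # for $i in .../LineItem
        if int(line_item.get("Quantity")) > threshold:      # where $i/@Quantity > 20
            # return <LargeItem>{$i/@Item}</LargeItem>: copy the Item attribute
            ET.SubElement(result, "LargeItem", Item=line_item.get("Item"))
    return result


selected = [element.get("Item") for element in large_items(SCOTT_XML)]
```

The declarative query states the same three steps in one expression and leaves the iteration strategy to the engine.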

Differences from SQL

Navigation-oriented (using XPath expressions)
Different type system (XMLSchema based simple types)
Identity-based (XML Node identities and document order)
Namespace aware name-resolution (functions, variables, element creation)
Row based versus Item based
Results are heterogeneous sequences
Does not have all SQL extensions (e.g., OLAP, Full-Text, …)

What’s really new?

New Data Model
Shared with XSLT 2.0 / XPath 2.0
Input & Output – XQuery data model
Type system based on XMLSchema
Supports un-typed and typed values
Supports construction of XML
Has prolog with static/dynamic context

XQuery Data Model

XML -> Infoset -> PSVI -> Data model
Everything is a sequence
– Sequence is list of “Item”s
– Items can be
  XML nodes (document, PI, comment, attributes)
  Simple content (number, date)
Can be typed (XMLSchema) or untyped

XQuery Sequence

Virtual container of data model items
Can be homogeneous or heterogeneous
Example Sequence

( "ABC"                                             (atomic value of type xs:string),
  <a>33.2</a>                                       (element a of type xs:float),
  @foo:x = 33                                       (attribute foo:x of type xs:integer),
  <ShipAddr> <street>... </street> ... </ShipAddr>  (element ShipAddr of type foo:Address) )

XQuery Expressions

Everything is an “expression”.
Inputs defined in the language
– doc() – takes in URI, returns data model
– collection() – URI input, returns collection
– bind variables – external variable input
Function – recursive, named, typed
Defined F&O – lots, shared with XPath
Modules – similar to PL/SQL packages

XML construction

Element, Attribute and other constructors
– Direct constructors
  <Emp name="{/emp/ename}">21</Emp>
– Computed constructor
  element Emp { attribute name {/emp/ename}, text {"21"} }
Sequence constructor
  ( "a", "b", 1 to 10 ) – sequence of strings, integers

XQuery Type system

XMLSchema primitive and complex types
Query can be statically typed (conformance level)
Dynamic types can be more specific
Validation is implicit – lax, strict, skip

XQuery path language

XPath 2.0 – new path language (subset)
Type aware unlike XPath 1.0
New operators
– Node compares – is, isnot
– Value compares – eq, gt, lt
– Existential compares – <, =, >
– Range searches, Satisfies, etc.
Ordering maintained – unordered() hint
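
The difference between value compares and existential compares is easiest to see on sequences. The Python sketch below models sequences as lists and ignores atomization and type promotion, so it is a simplification of the language rules rather than a faithful implementation; the function names are assumptions.

```python
def general_eq(left, right):
    """Existential '=': true if any item on the left equals any item on the right."""
    return any(a == b for a in left for b in right)


def value_eq(left, right):
    """Value comparison 'eq': operands must be single items.

    An empty operand yields an empty result (modelled here as None);
    an operand with more than one item is a type error.
    """
    if not left or not right:
        return None
    if len(left) > 1 or len(right) > 1:
        raise TypeError("eq requires at most one item on each side")
    return left[0] == right[0]


any_match = general_eq([1, 2, 3], [3, 4])   # some pair is equal
no_match = general_eq([1, 2], [3])          # no pair is equal
single = value_eq([3], [3])                 # both sides are single items
```

This is why `$i/@Quantity > 20` works even when a path returns several nodes, while `gt` would reject such an operand.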

Java standards

JSR 225 (XQJ) – new JSR in the works
IBM, BEA, DataDirect, Software AG, Ipedo
Define standard for access in Java
Early draft Q1 ’04?
Final draft sometime in summer ’04?
Taking good aspects of JDBC

Example

Preliminary API (warning! subject to change)

import javax.xml.xquery.*;

XQConnection con = Datasource.getconnection(jdbcco);
XQExpression expr = con.prepareExpr("for $i in doc..");
XQItemSet items = expr.executeQuery();

while (items.next()) {
    if (items.isOfType(XQTypes.StringType)) {
        String str = items.getString();
    }
}

Page 82: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

XQJ – main goals

Get/Set Static and Dynamic context
Get/Set External variables
Prepare XQuery Expressions
Provide Java classes for data model
Provide all types of access (SAX, StAX)
Metadata and type support
Output of one query is input of another

Page 83: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

SQL/XML standards

Part of ANSI/ISO SQL2003
XML datatype will support XQuery data model
XQuery() function to be part of SQL
XMLTable function to explode XML to relational values – using XQuery
Define relational mapping to XQuery
– Provide XQuery functions to map tables
– Provide mapping from SQL to data model

Page 84: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

XQuery Support in Oracle

Standalone Java query engine
– 100% Java
– Integrated into mid-tier, can run in server-tier (integrated with XML DB)
– Interoperate with XSLT/XPath

XMLDB integrated database engine
– SQL/XML standard support
– Optimized queries – rewrite to SQL

Page 85: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

XQuery Java implementation

XQuery or XQueryX input
Extensible function implementation
Compiles into rowsource-like structures
Integrate with SQL data (view function)
– Uses XSU for relational data

Optimization – push XQuery to XMLDB
Can run in mid-tier or server-tier JVM

Page 86: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

XQuery Java Architecture

[Diagram: XQuery Java architecture. Labels recovered from the figure: XQuery and XQueryX input; PARSER; XQueryX; COMPILER; Execution Tree; Plan in XML; EXECUTOR; Datasources; XQJ API; XML.]

Page 87: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Oracle’s XML Data Services

XDS provides common XQuery services
– Caching – cache in Java/XMLDB
– View text storage, functional definitions
– Web services for XQuery
– Common XML Adapter framework

XQuery – the integration language
Caching in XMLDB – query pushdown
JNDI tree used to discover datasource

Page 88: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

XDS Architecture

[Diagram: XDS architecture. Labels recovered from the figure: Applications using XDS (e.g. Portals, Reports etc.); XDS Client APIs; EJB; JSP Tags; Web Service; XML Data Synthesis; Cached XML Data Source; Query Builder Tool; Meta-data Repository; Oracle Enterprise Manager; XML DB; JCache; Filesystem; J2EE Security Framework; XDS Cache; Security; XDS Caching Service; XML Data source adaptors; CCI-XML; Web Services; J2CA; EAI; SAP; Oracle Apps; JMS; RDBMS; Files; HTTP; Web Cache; Java Functions; XQuery Subsystem; XQ4J/JXQI; XQuery Engine; XMLDataSource modules; XQuery Result; In-Memory; Stored Query.]

Page 89: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Example – XDS usage

User registers webservice as datasource
XDS creates an XQuery module automatically
User Query for querying webservice

import module namespace wss="datasrc/stockws";
for $i in wss:getcompanies()
return wss:get_stock_price($i/name)

wss – namespace prefix for the loaded module
JNDI lookup to get datasource implementation
XDS adapters implement datasource

Page 90: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Example – Query pushdown

Query against XML DB

for $i in view("PurchaseOrder")/ROW
where $i/City = 'Fresno'
return $i/PoNo

With rewrite – equivalent to

for $i in sql("select extract(p, 'PoNo') from purchaseorder p where existsnode(p, '[City="Fresno"]') = 1")
return $i

Page 91: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Server-tier XML Query

Query SQL tables/views
Query XMLType tables
Query data in repository
Query URLs (using UriType in DB)
Result is XMLType
Rewrite when possible

Page 92: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

XQuery Database Architecture

[Diagram: XQuery database architecture. Labels recovered from the figure: XQuery; XQuery rewrite/Execution engine (“To rewrite or not to rewrite”); XMLDB Repository; SQL tables/views.]

Page 93: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Example: SQL integration

SQL tables (get all employees earning 150K+)

select XQuery('
  for $i in view("SCOTT.EMP")/ROW
  where $i/SALARY > 150000
  return $i/EMPNO')
from dual;

<EMPNO>2100</EMPNO>
<EMPNO>344</EMPNO>

Page 94: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Example – XMLType integration

XMLType tables as input

select p.XQuery('<PO pono="{/PoNo}"/>')
from purchaseorder p
where p.XQuery('/PurchaseOrder[ShipAddr/City="Fresno"]')
  is not null;

After rewrite

select XMLElement("PO",
         XMLAttributes(p.xmldata."PONo"))
from purchaseorder p
where p.xmldata."ShipAddr"."City" = 'Fresno'
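For readers who want to try the path predicate from this example outside the database, the sketch below evaluates it with the XPath 1.0 engine that ships in the JDK. This is only a stand-in for illustration: it is not Oracle's XQuery engine and performs no rewrite. The sample document is invented, and placing PoNo directly under PurchaseOrder is an assumption, as are the class and method names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class PredicateSketch {
    // Returns the PoNo text if the purchase order ships to Fresno,
    // or an empty string if the predicate selects nothing.
    public static String poNoIfFresno(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Same predicate as on the slide, followed by a step to PoNo (assumed child element).
        String expr = "/PurchaseOrder[ShipAddr/City=\"Fresno\"]/PoNo";
        try {
            return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample data for illustration only.
        String po = "<PurchaseOrder><PoNo>1001</PoNo>"
                  + "<ShipAddr><City>Fresno</City></ShipAddr></PurchaseOrder>";
        System.out.println(poNoIfFresno(po)); // prints 1001
    }
}
```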

Page 95: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Example – Repository integration

Query files from repository
– doc() queries files from repository (rewrite)
– collection() maps to directories

select XQuery('
  for $i in doc("/public/foo.xml")
  return $i')
from dual;

<FOO> ....</FOO>

Page 96: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Datasources

Enables arbitrary input sources
– files, cache, JCA datasources

xmldatasrc – Oracle language addition
Datasource API
– initialize
– describe
– execute
– fetch
– bind (an existing DOM)
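One way to picture the initialize / describe / execute / fetch lifecycle is the Java sketch below. The slide only names the operations, so the interface, its signatures, and the toy in-memory implementation are all hypothetical and are not Oracle's actual Datasource API; bind is left out for brevity.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DatasourceSketch {
    // Hypothetical interface mirroring four of the operations named on the slide.
    public interface XmlDatasource {
        void initialize(Map<String, String> config);
        String describe();
        void execute(String request);
        String fetch(); // next XML fragment, or null when the result is exhausted
    }

    // Toy implementation over a fixed list of XML fragments (illustration only).
    public static class InMemoryDatasource implements XmlDatasource {
        private final List<String> fragments;
        private String name = "unnamed";
        private Iterator<String> cursor = Collections.emptyIterator();

        public InMemoryDatasource(List<String> fragments) {
            this.fragments = fragments;
        }

        public void initialize(Map<String, String> config) {
            name = config.getOrDefault("name", "unnamed");
        }

        public String describe() {
            return name + " (" + fragments.size() + " fragments)";
        }

        // "Executes" a request by selecting the fragments that contain the given text.
        public void execute(String request) {
            cursor = fragments.stream().filter(f -> f.contains(request)).iterator();
        }

        public String fetch() {
            return cursor.hasNext() ? cursor.next() : null;
        }
    }

    public static void main(String[] args) {
        XmlDatasource ds = new InMemoryDatasource(
                List.of("<City>Fresno</City>", "<City>Reno</City>"));
        ds.initialize(Map.of("name", "cities"));
        System.out.println(ds.describe()); // prints cities (2 fragments)
        ds.execute("Fresno");
        for (String f = ds.fetch(); f != null; f = ds.fetch()) {
            System.out.println(f); // prints <City>Fresno</City>
        }
    }
}
```

Keeping fetch separate from execute lets a caller pull results one fragment at a time instead of materializing everything up front.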

Page 97: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

D E M O N S T R A T I O N

XML Query

Page 98: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

More Information

Introduction to XML DB
– http://www.oracle.com/ip/index.html?xmldbtp_intro.html

Oracle Technology Network XML Page
– http://otn.oracle.com/tech/xml/index.html

Oracle Technology Network XML DB Page
– http://otn.oracle.com/tech/xml/xmldb/index.html

Oracle Technology Network XML Querying Page
– http://otn.oracle.com/tech/xml/xmldb/htdocs/querying_xml.html

Sample XML DB Applications
– http://otn.oracle.com/sample_code/tech/xml/xmldb/index.html

Page 99: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.

Q & A

Q U E S T I O N S

A N S W E R S

Page 100: All Your XML Sandeepan Banerjee Oracle Corporation Sandeepan.Banerjee@Oracle.com.