Post on 05-Apr-2018
7/31/2019 D15 - What's New for PureXML in DB2 9.7
Session: D15
What's new for pureXML in DB2 9.7
Matthias Nicola, IBM Silicon Valley Lab
May 15, 2009, 12:30 pm to 01:30 pm
Platform: DB2 for Linux, UNIX, and Windows
Key Points
Learn about new pureXML capabilities planned for DB2 and how you can benefit from them.
Learn new ways for managing, partitioning, and scaling the XML data that you accumulate.
Learn how to support and exploit XML in data warehouses.
Learn how you can query XML data with plain SQL queries, without any XPath or XQuery involved!
Learn how IBM might respond to specific feature requests from DBAs and application developers.
Agenda
Recap: pureXML in DB2 9 and 9.5
Admin functions for the DBA
Compressing XML Data and Indexes
SQL Access to XML Data
Partitioning and Clustering with XML Data
  XML in Range-Partitioned Tables
  XML in MDC Tables
  XML in Partitioned Databases (DPF)
XML in User-Defined Functions
Online CREATE and REORG of XML Indexes
Bulk Decomposition
Recap: pureXML in DB2 9.1 and 9.5

create table customer (cid integer, info XML)

insert into customer (cid, info) values (?,?)

select cid, info from customer

select xmlquery('$INFO/customer/name')
from customer
where cid > 1234
  and xmlexists('$INFO/customer/addr[zip = 95123]')

xquery for $i in db2-fn:xmlcolumn("CUSTOMER.INFO")/customer
       where $i/addr/zip = 95123
       return $i/name
Recap: pureXML in DB2 9.1 and 9.5

create index idx1 on customer(info)
generate key using xmlpattern '/customer/addr/zip' as sql varchar(5)

update customer
set info = ?
where ...

update customer
set info = xmlquery('copy $new := $INFO
                     modify do replace value of $new/customer/addr/zip with 95141
                     return $new')
where ...;

Plus: XML Schema Support, Utilities, Shredding, XSLT, etc.
Recap: XML Storage in DB2 9.1
[Diagram: a base table with columns ID and DOC (XML) and sample rows PR28, ACC, PR27 in the DAT Object (rel. Data); the Regions Index in the INX Object (Indexes); the XML documents in the XDA Object (XML Data)]
The regions index facilitates access to document regions in the XML data area.
Like LOBs, XML data is stored separately from the base table.
Unlike LOBs, XML data is buffered in the buffer pool.
DB2 9.5: Base Table Inlining for small docs
[Diagram: a base table with columns ID and DOC (XML) and sample rows PR28, ACC, PR27 in the DAT Object (rel. Data); the Regions Index in the INX Object (Indexes); the XDA Object (XML Data)]
Documents that are small enough can be stored in the base table, and can be compressed!
XDA cannot be compressed in DB2 9.5.
Admin Functions for Inlining

ADMIN_IS_INLINED(xmlcol)
  1, if the document in the current row is inlined.
  0, if the document in the current row is not inlined.

ADMIN_EST_INLINE_LENGTH(xmlcol)
  Inline length that would allow the XML document in the current row to be inlined (estimated value).
  -1, if the document is too large to be inlined.
  -2, for documents inserted in previous versions of DB2.

Both functions return NULL if the XML column is NULL.

CREATE TABLE customer(id int, xmlcol XML INLINE LENGTH 1000);
Admin Functions for Inlining

SELECT count(xmlcol) AS total,
       sum(ADMIN_IS_INLINED(xmlcol)) AS inlined
FROM customer;

TOTAL       INLINED
----------- -----------
          6           2

  1 record(s) selected.

2 out of 6 documents are inlined.

SELECT id, ADMIN_IS_INLINED(xmlcol) AS inlined
FROM customer;

ID          INLINED
----------- -----------
       1000           1
       1001           0
       1002           1
       1003           0
       1004           0
       1005           0

  6 record(s) selected.
Admin Functions for Inlining

SELECT id, ADMIN_IS_INLINED(xmlcol) AS inlined,
       ADMIN_EST_INLINE_LENGTH(xmlcol) AS inline_length
FROM customer;

ID          INLINED     INLINE_LENGTH
----------- ----------- -------------
       1000           1           770   <- Is inlined, uses 770 bytes.
       1001           0          2345
       1002           1           796
       1003           0          1489   <- Not inlined, requires inline length > 1489
       1004           0          1910
       1005           0            -1   <- Too large to be inlined for the given page size

  6 record(s) selected.
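One way to act on these estimates is sketched below. It assumes the customer table from the slides and an illustrative new inline length of 2400 bytes, chosen only because it exceeds the largest positive estimate (2345) in the sample output; whether that value fits depends on the page size of the table. Rows that already exist typically stay in the XDA object until they are updated or the table is reorganized, so check the documentation for your release before relying on this.

```sql
-- Largest estimated inline length among documents that could be inlined.
-- Negative results are excluded: -1 means too large for the page size,
-- -2 means the document was inserted by an earlier DB2 version.
SELECT MAX(ADMIN_EST_INLINE_LENGTH(xmlcol)) AS max_needed
FROM customer
WHERE ADMIN_EST_INLINE_LENGTH(xmlcol) > 0;

-- Raise the inline length of the XML column.
-- 2400 is an assumed value for illustration, not a recommendation.
ALTER TABLE customer ALTER COLUMN xmlcol SET INLINE LENGTH 2400;

-- Re-check how many documents are inlined afterwards.
SELECT COUNT(xmlcol) AS total,
       SUM(ADMIN_IS_INLINED(xmlcol)) AS inlined
FROM customer;
```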
XML Data & Index Compression
Compression of DAT Object
Compression of XDA Object*
Separate compression dictionaries for DAT and XDA*
Compression of any user-defined index*
No compression of regions index or MDC block indexes

CREATE TABLE customer(id int, xmlcol XML) COMPRESS YES;

*see next slide for details
[Diagram: a base table with columns ID and DOC (XML) and sample rows PR28, ACC, PR27 in the DAT Object (rel. Data); the Regions Index in the INX Object (Indexes); the XDA Object (XML Data)]
XML Data & Index Compression
XDA compression is not available for XML columns that were created in a previous version of DB2.
  Move the XML data to a new XML column first, e.g. with an online table move (SYSPROC.ADMIN_MOVE_TABLE).
Default: an index is compressed if its table is compressed.
  But index compression can be controlled separately:
  ALTER INDEX myxmlidx COMPRESS YES;
  CREATE INDEX myxmlidx2 ON ... COMPRESS NO;
REORG RESETDICTIONARY rebuilds the dictionary for relational data only.
REORG LONGLOBDATA RESETDICTIONARY rebuilds the dictionary for XML and relational data.
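A minimal sketch of how these statements might be combined, assuming the customer table and the myxmlidx index from the slides and the DB2ADMIN schema used elsewhere in this deck. The empty-string arguments to ADMIN_MOVE_TABLE are an assumption that asks the procedure to keep the existing table spaces and column definitions; REORG is a DB2 command, so it is run from the CLP rather than as plain SQL.

```sql
-- Move a table whose XML column was created in an earlier DB2 version,
-- so that the XML data ends up in a new column that supports XDA compression.
-- Empty strings keep the current settings (assumed sufficient for a simple move).
CALL SYSPROC.ADMIN_MOVE_TABLE(
  'DB2ADMIN', 'CUSTOMER',   -- schema and table name
  '', '', '',               -- data, index, and LOB table spaces
  '', '', '',               -- MDC columns, partitioning key, data partitions
  '',                       -- column definitions
  '',                       -- options
  'MOVE');                  -- run all phases of the move in one call

-- Enable compression for the table (relational and XML data).
ALTER TABLE customer COMPRESS YES;

-- Rebuild the dictionaries for both relational (DAT) and XML (XDA) data.
REORG TABLE customer LONGLOBDATA RESETDICTIONARY;

-- Control compression of one index independently of the table.
ALTER INDEX myxmlidx COMPRESS YES;

-- An altered index is expected to need a rebuild before the change takes effect.
REORG INDEXES ALL FOR TABLE customer;
```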
XDA Compression saves 60% to 80% of the storage space
[Chart: XDA Compression Ratio for Various XML Data Sets. Horizontal bars on a 0% to 100% axis for the data sets XBench, TPoX, DITA, Customer C*, Customer B*, Customer A*. Bar values shown: 74%, 61%, 63%, 77%, 77%, 63%.]
*data sets from DB2 customers
Doc size annotations on the chart: 10MB to 100MB; 2KB to 20KB; 20KB to 10MB; 10KB to 100KB.
SQL Access to Relational Data
[Diagram: SQL queries run against a relational XMLTABLE view that is defined over XML data; sample document values: John Doe, 344]
Create a relational view over XML data, then use plain old SQL queries against that view.
Problem in DB2 9.5: SQL predicates cannot use XML indexes on the source data, which results in table scans.
Now the problem is solved!
SQL Access to XML Data

create table dept (doc XML);

[Sample XML data in dept.doc: employee John Doe, office 344; employee Peter Pan, office 216]

CREATE VIEW empview(empid, firstname, lastname, office)
AS SELECT X.*
   FROM dept,
        XMLTABLE('$DOC/dept/employee'
          COLUMNS
            empid     INTEGER     PATH '@id',
            firstname VARCHAR(30) PATH 'name/first',
            lastname  VARCHAR(30) PATH 'name/last',
            office    INTEGER     PATH 'office') AS X

empid  firstname  lastname  office
-----  ---------  --------  ------
901    John       Doe       344
902    Peter      Pan       216

select lastname, office
from empview
where empid = 901;

create index idx1 on dept(doc) generate keys using
xmlpattern '/dept/employee/@id' as sql double;
Data Warehousing and XML
1. Accumulating large amounts of XML in operational systems?
   Need to analyze and warehouse that data eventually, even without shredding.
2. Looking for ways to improve flexibility of existing warehouses?
   XML columns for flexible dimensions.
3. Need to ingest XML data into a relational warehouse more efficiently?
All pureXML features available with DPF.
A relational Star Schema
STORE: STOREKEY, STORE_NUMBER, CITY, STATE, DISTRICT, REGION
PERIOD: PERKEY, CALENDAR_DATE, DAY_OF_WEEK, WEEK, PERIOD, YEAR, HOLIDAY_FLAG, WEEK_ENDING_DATE, MONTH
CUSTOMER: CUSTKEY, NAME, ADDRESS, C_CITY, C_STATE, ZIP, PHONE, AGE_LEVEL, AGE_LEVEL_DESC, INCOME_LEVEL, INCOME_LEVEL_DESC, MARITAL_STATUS, GENDER, DISCOUNT
PRODUCT: PRODKEY, UPC_NUMBER, PACKAGE_TYPE, FLAVOR, FORM, CATEGORY, SUB_CATEGORY, CASE_PACK, PACKAGE_SIZE, ITEM_DESC, P_PRICE, CATEGORY_DESC, P_COST, SUB_CATEGORY_DESC
PROMOTION: PROMOKEY, PROMOTYPE, PROMODESC, PROMOVALUE, PROMOVALUE2, PROMO_COST
DAILY_SALES: PERKEY, PRODKEY, STOREKEY, CUSTKEY, PROMOKEY, QUANTITY_SOLD, EXTENDED_PRICE, EXTENDED_COST, SHELF_LOCATION, SHELF_NUMBER, START_SHELF_DATE, SHELF_HEIGHT, SHELF_WIDTH, (more)
DAILY_FORECAST: PERKEY, STOREKEY, PRODKEY, QUANTITY_FORECAST, EXTENDED_PRICE_FORECAST, EXTENDED_COST_FORECAST
A relational Star Schema ... extended with XML
[Diagram: the same star schema as on the previous slide (STORE, PERIOD, CUSTOMER, PRODUCT, PROMOTION, DAILY_SALES, DAILY_FORECAST), with a DOC XML column added to six of the tables]
XML in DPF, RP, MDC Tables
All of the following have to be relational columns:
  DPF distribution key
  Range partitioning key
  MDC clustering columns
XML column is payload in DPF, RP, MDC table
  Cannot distribute, partition, or organize by XML values
  Can extract XML values into relational columns, then use those to distribute, partition, or organize
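The last point can be sketched as follows for a range-partitioned table. The table names orders and order_staging, the order_date column, and the /order/@date path are all hypothetical and chosen only for illustration; the same pattern would apply to a DPF distribution key or MDC dimension columns.

```sql
-- Hypothetical target table: order_date is a relational column whose value
-- comes from the XML document, so it can serve as the range-partitioning key
-- while the XML column doc remains payload.
CREATE TABLE orders (
  order_date DATE NOT NULL,
  doc        XML
)
PARTITION BY RANGE (order_date)
  (STARTING FROM ('2009-01-01') ENDING AT ('2009-12-31') EVERY (3 MONTHS));

-- Extract the key value from each document while inserting.
-- order_staging (with an XML column doc) and the /order/@date path are assumptions.
INSERT INTO orders (order_date, doc)
SELECT XMLCAST(XMLQUERY('$d/order/@date' PASSING s.doc AS "d") AS DATE),
       s.doc
FROM order_staging s;
```

Keeping the extracted value in its own column also lets ordinary SQL predicates on order_date eliminate partitions without touching the XML data.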
Indexes in MDC and RP Tables
Both MDC block indexes and XML indexes can be used in the same query
  Index AND-ing plans of block indexes and XML indexes!
Range partitioned tables:
  Relational indexes can be local (partitioned) or global (non-partitioned) indexes
  XML Regions Index is always a local index
  User-defined XML Indexes are (for now) global indexes
XML in User Defined Functions
XML Data Type allowed for parameters and variables in UDFs
UDFs can manipulate XML data without XML parsing
You can encapsulate XML operations in a UDF
  Extract XML element or attribute values
  Update selected elements or attributes
  Use table UDFs to produce relational tables from XML documents
  etc.
Scalar UDF with XML

CREATE FUNCTION getname(doc XML)
RETURNS VARCHAR(25)
BEGIN ATOMIC
  RETURN XMLCAST(XMLQUERY('$d/customerinfo/name' PASSING doc AS "d")
                 AS VARCHAR(25));
END#

SELECT getname(info) AS name
FROM customer
WHERE cid = 1002#

NAME
-------------------------
Jim Noodle

  1 record(s) selected.
Table UDF with XML

CREATE FUNCTION getphone(doc XML)
RETURNS TABLE(type VARCHAR(10), number VARCHAR(20))
BEGIN ATOMIC
  RETURN
    SELECT type, number
    FROM XMLTABLE('$d/customerinfo/phone' PASSING doc AS "d"
           COLUMNS
             type   VARCHAR(10) PATH '@type',
             number VARCHAR(20) PATH '.');
END#

SELECT cid, p.type, p.number
FROM customer, TABLE(getphone(info)) p
WHERE cid = 1004#

CID              TYPE       NUMBER
---------------- ---------- --------------------
            1004 work       905-555-4789
            1004 home       416-555-3376

  2 record(s) selected.
Online CREATE and REORG of XML Indexes

create table customer (cid integer, info XML);

create index idx1 on customer(info) generate key using
xmlpattern '/customerinfo/addr/zip' as sql varchar(5);
  Writes are not blocked.

reorg indexes all for table customer
allow write access;
  Writes are not blocked.
Bulk Decomposition
New: Shred all documents identified by a query
LOAD first, then use CLP command or SP call to shred:

DECOMPOSE XML DOCUMENTS IN 'SELECT cid, info FROM customer'
XMLSCHEMA db2admin.customerxsd
MESSAGES /home/matthias/errorreport.xml;

CALL XDB_DECOMP_XML_FROM_QUERY('DB2ADMIN', 'CUSTOMERXSD',
     'SELECT cid, info FROM customer',
     0, 0, 0, NULL, NULL, 1,
     :numInput, :numDecomposed, :errorreportBuf);
Summary
New admin functions to check inlining
Can compress all your XML Data and Indexes
SQL Access to XML Data via XMLTABLE Views
XML in the Warehouse
XML in DPF, MDC, and Range-Partitioned Tables
XML in User-Defined Functions
Online CREATE and REORG of XML Indexes
Bulk Decomposition
Comprehensive coverage of pureXML in DB2 for Linux, UNIX, Windows and DB2 for z/OS
Available in July
Available for pre-order now: http://tinyurl.com/pureXML
Further Reading
"pureXML in DB2 9: Which way to query your XML Data?"
  http://www.ibm.com/developerworks/db2/library/techarticle/dm-0606nicola/
"Query XML data that contains namespaces"
  http://www.ibm.com/developerworks/db2/library/techarticle/dm-0611saracco/
"XMLTABLE by Example", Part 1 & 2
  http://www.ibm.com/developerworks/db2/library/techarticle/dm-0708nicola/
  http://www.ibm.com/developerworks/db2/library/techarticle/dm-0709nicola/
"Update XML in DB2 9.5"
  http://www.ibm.com/developerworks/db2/library/techarticle/dm-0710nicola/
DB2 Documentation & Resources
  http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/index.jsp
  http://www.ibm.com/developerworks/wikis/display/db2xml/Technical+Papers+and+Articles
"15 best practices for pureXML performance in DB2 9"
  http://www.ibm.com/developerworks/db2/library/techarticle/dm-0610nicola/
"Performance of DB2 9 pureXML vs. CLOB and shredded XML storage"
  http://www.ibm.com/developerworks/db2/library/techarticle/dm-0612nicola/
"XML Database Benchmark: Transaction Processing over XML (TPoX)"
  http://tpox.sourceforge.net/
  http://tpox.sourceforge.net/Sigmod2007_TPoX.pdf
Matthias Nicola, IBM Silicon Valley Lab
mnicola@us.ibm.com
Session: D15
What's new for DB2 pureXML