Post on 06-Oct-2018
© 2010 IBM Corporation, June 18, 2010
Resource Optimization: Compression, Space Reclamation, Scan Sharing
Tim Vincent, DB2 LUW Chief Architect
Lower Storage Costs with Deep Compression
Best in industry
Minimize storage costs
Improve performance
Easy to implement
Advances with:
– Index compression
– Temp space compression
– XML compression
[Chart: DB2 9 compared with "Other"; bar callouts read 1.5 times better, 3.3 times better, 2.0 times better, and 8.7 times better]
“With DB2 9, we’re seeing compression rates up to 83% on the Data Warehouse. The projected cost savings are more than $2 million initially with ongoing savings of $500,000 a year.” - Michael Henson
“We are saving anywhere between 60 to 65% in storage and we’ve actually found the performance has improved” - Bashir Khan
How Compression Works
Compression looks for repeating patterns across the entire table
– When a pattern is found, the string is replaced with a 12-bit symbol
– Symbols are stored in a dictionary for fast lookup
Data resides compressed on pages (both on disk and in memory)
– Significant I/O bandwidth savings: better performance
– Significant memory savings: more efficient memory utilization
Example table:
Name         Dept  Salary  City    Province  Postal_Code
Zikopoulos   510   56105   Whitby  ONT       L4N5R4
Katsopoulos  500   82475   Whitby  ONT       L4N5R4

Dictionary:
01  opoulos
02  Whitby ONT L4N5R4
…   …

Uncompressed rows:
Zikopoulos 510 56105 Whitby ONT L4N5R4 | Katsopoulos 500 82475 Whitby ONT L4N5R4 …
Compressed rows:
Zik (01) 510 56105 (02) | Kats (01) 500 82475 (02) …

Unique to DB2
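The substitution in the example above can be sketched in a few lines of Python. This is a toy model only: the dictionary is written by hand (DB2 builds its own from the table data), symbols are shown as `(01)`-style tags rather than packed 12-bit codes, and `compress_row` and `expand_row` are names made up for this illustration.

```python
# Toy model of dictionary-based row compression (not DB2's actual algorithm).
# The dictionary maps a symbol to a repeating pattern, as in the slide example.
DICTIONARY = {
    "01": "opoulos",
    "02": "Whitby ONT L4N5R4",
}

def compress_row(row, dictionary):
    """Replace every dictionary pattern found in the row with its symbol tag."""
    # Substitute longer patterns first so a short pattern cannot split a long one.
    for symbol, pattern in sorted(dictionary.items(), key=lambda kv: -len(kv[1])):
        row = row.replace(pattern, "(" + symbol + ")")
    return row

def expand_row(row, dictionary):
    """Reverse the substitution. Assumes raw data never contains a '(NN)' tag."""
    for symbol, pattern in dictionary.items():
        row = row.replace("(" + symbol + ")", pattern)
    return row

rows = [
    "Zikopoulos 510 56105 Whitby ONT L4N5R4",
    "Katsopoulos 500 82475 Whitby ONT L4N5R4",
]
compressed = [compress_row(r, DICTIONARY) for r in rows]
restored = [expand_row(c, DICTIONARY) for c in compressed]

for original, packed in zip(rows, compressed):
    print(len(original), "->", len(packed), ":", packed)
```

Because both rows share the same dictionary, the repeated city, province and postal code collapse to a single symbol in each row, which is where table-wide pattern matching pays off.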
Compression Improvements
– Multiple algorithms for automatic index compression
– Automatic compression for temporary tables
– Compression of large objects and XML
– Replication of compressed tables
Slide callout (shown twice): Unique in the industry
[Diagram labels: Table, Order By, Temp Table, Temp; Log, db2ReadLog API, Dictionary, compressed user data in logs, uncompressed user data in logs]
Index Compression
Algorithms implemented by the database engine (under the covers):
– RID list compression, prefix compression, and variable slot directory
– Applies to all indexes except: catalog indexes, MDC block indexes, XML path indexes and meta indexes, index specifications
Activated:
– When row compression is activated on a table
– CREATE INDEX with the new “COMPRESS YES” option
– ALTER INDEX COMPRESS [YES|NO] statement, followed by an index reorg
Savings:
– ADMIN_GET_INDEX_COMPRESS_INFO to estimate compression savings for an uncompressed index
– COMPRESS and PCTPAGESSAVED in the SYSINDEXES catalog show whether an index is defined as compressed and the percentage of pages saved, respectively
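Of the algorithms named on this slide, prefix compression is the easiest to picture: index keys are kept in sorted order, so neighbouring keys often share a leading run of characters that can be stored once. The sketch below illustrates that idea only and does not reflect DB2's on-page format; the sample keys and the helper names `compress_keys` and `expand_keys` are invented for the example.

```python
import os

def compress_keys(sorted_keys):
    """Store the prefix shared by all keys once, plus each key's remaining suffix."""
    if not sorted_keys:
        return "", []
    # os.path.commonprefix compares character by character, which is what we want.
    prefix = os.path.commonprefix(sorted_keys)
    return prefix, [key[len(prefix):] for key in sorted_keys]

def expand_keys(prefix, suffixes):
    """Rebuild the full keys from the shared prefix and the stored suffixes."""
    return [prefix + suffix for suffix in suffixes]

# Hypothetical index keys, already in index (sorted) order.
keys = sorted(["CUST-2010-0001", "CUST-2010-0002", "CUST-2010-0137"])
prefix, suffixes = compress_keys(keys)

raw_size = sum(len(k) for k in keys)
packed_size = len(prefix) + sum(len(s) for s in suffixes)
print("prefix:", prefix, "suffixes:", suffixes)
print("characters stored:", raw_size, "->", packed_size)
```

The saving grows with the length of the shared prefix and the number of keys per page, which is one reason multi-column indexes with low-cardinality leading columns tend to compress well.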
Index Compression: Early Customer Results (© IBM 2009, © SAP 2009)
DB2 9.7 Early Customer Savings

Customer                                                Database Size  Data Compression Ratio  Index Compression Ratio  Total Database Saving
Energy delivery company, USA                            62 GB          --                      52%                      --
World leading construction machinery manufacturer, USA  725 GB         72%                     49%                      68%
Medtronic, USA                                          3.6 TB         --                      65%                      --
T-Systems, Germany                                      500 GB         60%                     73%                      65%
Insurance company, Germany                              176 GB         --                      50%                      --
Global consumer and commercial product marketer, USA    1.4 TB         58%                     49%                      56%
John Deere, China                                       --             --                      58%                      --
Haier Group, China                                      --             --                      52%                      --

Row groups on the slide: BW Systems, ERP Systems
Temp Table Compression
Compression of temporary tables aims to:
– Reduce the amount of temporary disk space required
– Have no performance penalty as a result of the extra processing required for row compression
Applicable to user temporary tables (DGTT/CGTT) and system temps (sorts, MGJN, NLJN, utilities, …)
If Deep Compression is licensed, temporary tables are compressed by default
– No additional action is required by the user. DB2 evaluates the query and applies compression where appropriate.
db2pd reports on temp tablespace usage
Deep Compression: Warehouse Results
Customer POC
– Compressed data from 15.3 to 7.9 TB
– Table compression rates were between 80% and 85%
– Aggregate build throughput improved 15%
– Query response time decreased 23%
Deep Compression: Warehouse Results
System CPU decreased from 16.5% to 12.3%
Wait time decreased from 23.9% to 5.7%
User CPU increased from 30.7% to 53%, BUT this is the combination of:
– Increased throughput on aggregate build (15%)
– Decreased response time (23%)
– Reduction in wait time
– Compress/uncompress
Temp Compression: Measurements
[Chart: Space Savings for TPC-DS Queries with Temp Compression. Size (gigabytes): total bytes stored without temp compression 78.3, with temp compression 50.2. Callout: 56% less space. * Lower is better]
[Chart: Elapsed Time for TPC-DS Queries with Temp Compression. Minutes: runtime without temp compression 183.98, with temp compression 175.56. Callout: 5% faster. * Lower is better]
[Chart: TPC-DS CPU Analysis for Temp Compression. Series: I/O Wait, User CPU; categories: Baseline, Index Compression; values as extracted: 39.26, 46.50, 22.19, 14.61. Callout: effective CPU usage]
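Recomputing the callouts from the bar values is a quick sanity check on the TPC-DS temp compression charts. The sketch below assumes that 78.3 and 50.2 are the sizes in gigabytes without and with temp compression, and that 183.98 and 175.56 are the corresponding runtimes in minutes, as the chart labels suggest; `percent_reduction` is a helper name introduced here.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0

# Bar values as read from the charts (assumed mapping: without -> with).
space = percent_reduction(78.3, 50.2)        # temp space, gigabytes
elapsed = percent_reduction(183.98, 175.56)  # elapsed time, minutes

print(f"temp space reduction:   {space:.1f}%")
print(f"elapsed time reduction: {elapsed:.1f}%")
```

The elapsed-time result rounds to the 5% callout. The space result computed this way is about 36%, so the 56% callout presumably rests on a different basis than these two bar values.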
Simple Space Reclamation
New tablespace format to allow automated extent remapping
Allow extents that are not assigned to any object (e.g. table, index) to be used by other tablespaces
ALTER TABLESPACE REDUCE … XXX | MAX
All new tablespaces will have this format
Storage in an MDC table is tracked through a ‘block map’
– Records which extents have data and which don’t
– When a block is emptied, the storage remains with the table and is available for later reuse by that table
New option on the REORG TABLE command to reclaim these empty blocks/extents without reorganizing the table
REORG TABLE <mdc table> RECLAIM EXTENTS ON [table partition clause] ALLOW WRITE ACCESS | ALLOW READ ACCESS | ALLOW NO ACCESS
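The block map behaviour described on this slide can be pictured with a small state model. This is a conceptual sketch, not DB2's data structure: the three states and the `BlockMap` class with its method names are assumptions made for illustration.

```python
IN_USE, EMPTY, RELEASED = "in use", "empty", "released"

class BlockMap:
    """Toy MDC block map: one state per block owned by the table."""

    def __init__(self, n_blocks):
        self.state = [IN_USE] * n_blocks

    def empty_block(self, index):
        # An emptied block stays with the table and can be reused only by it.
        self.state[index] = EMPTY

    def reusable_by_table(self):
        return self.state.count(EMPTY)

    def reclaim_extents(self):
        """Model of the reclaim step: hand every empty block back for wider use."""
        released = 0
        for index, block_state in enumerate(self.state):
            if block_state == EMPTY:
                self.state[index] = RELEASED
                released += 1
        return released

block_map = BlockMap(6)
block_map.empty_block(1)
block_map.empty_block(4)
held_before = block_map.reusable_by_table()
released = block_map.reclaim_extents()
print("empty blocks held by table:", held_before, "released:", released)
```

The point of the model is the distinction between "empty but still owned by the table" and "released": without the reclaim step, emptied blocks reduce nothing outside the table itself.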
DB2 Compresses all Aspects of the Database
[Chart: percentage of original database size (segments labelled 15%, 30%, 55%), comparing the Oracle Exadata compression range with the DB2 compression range. Ratio labels as extracted: 10x?, 2x?, 1.5x?, 3x, 2.5x, 4x, 2x?, 1.2x?, 1x, 3x?, 1.5x?, 1x, 4x, 3x, 7x]
Automatic Storage Migration Support
ALTER DATABASE command for non-automatic-storage (non-AS) databases
Allow existing tablespaces to grow into automatic storage containers
– ALTER TABLESPACE <table_space_name> MANAGED BY AUTOMATIC STORAGE
– Existing containers can no longer be altered
Support redirected tablespace restore to an AS tablespace
– RESTORE DB <dbname> REDIRECT
– SET TABLESPACE CONTAINERS FOR <tablespaceID> USING AUTOMATIC STORAGE
REBALANCE support after a new path is added to the database
– Allows existing tablespaces to use the new path
Ability to DROP a path from an automatic storage database
– Can be used to migrate to new containers
Scan Sharing for DB2
Scan Sharing Performance Test
TPCH Q1: CPU intensive, slow query on the Lineitem table using a table scan
TPCH Q6: IO intensive, fast query on the Lineitem table using a table scan
Test scenario: queries executed in parallel in the following sequence
[Diagram: Q1, Q6, Q1, Q6 on a timeline marked 30 secs, 60 secs, 90 secs]
Results: 34% improvement in end-to-end timing
[Chart: CPU Usage. % time spent in User, System, Idle, IO Wait; Base vs. Scan Sharing]
[Chart: cumulative reads (millions) over time; Scan Sharing vs. Base]
Reads on a disk: 42% reduction
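Why overlapping table scans can save disk reads is easy to show with a toy simulation. The model below is an assumption-laden sketch rather than DB2's implementation: every scan reads one page per tick, scans on the same page in the same tick cost a single physical read, and a shared scan joins at the current position of an ongoing scan and wraps around to pick up the pages it missed. The function name and the numbers are illustrative and unrelated to the measurements above.

```python
def physical_reads(n_pages, start_ticks, share):
    """Count disk page reads for table scans that each read one page per tick."""
    pending = sorted(start_ticks)
    active = []  # each entry: [next_page, pages_remaining]
    reads = 0
    tick = 0
    while pending or active:
        # Start every scan whose start time has arrived.
        while pending and pending[0] <= tick:
            pending.pop(0)
            # A shared scan joins at the position of an ongoing scan;
            # otherwise every scan starts at page 0.
            start = active[0][0] if (share and active) else 0
            active.append([start, n_pages])
        # Scans sitting on the same page in this tick cost one physical read.
        reads += len({scan[0] for scan in active})
        for scan in active:
            scan[0] = (scan[0] + 1) % n_pages  # wrap around at the end of the table
            scan[1] -= 1
        active = [scan for scan in active if scan[1] > 0]
        tick += 1
    return reads

base = physical_reads(100, [0, 30], share=False)
shared = physical_reads(100, [0, 30], share=True)
print("base reads:", base, "shared reads:", shared)
```

With a 100-page table and a second scan starting 30 ticks after the first, the model yields 200 reads without sharing and 130 with it: the second scan rides along for 70 pages and only reads the first 30 pages on its own.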