1

juliandyke.com

© 2005 Julian Dyke

Data Segment Compression

Julian Dyke

Independent Consultant

Web Version

2

Agenda
1. Introduction
2. What is data compression?
3. Data segment compression
   1. Functionality
   2. Syntax
   3. Implementation
   4. Performance
4. Conclusion

3

What is data compression?
Data is
  compressed when it is written to a block
  decompressed when it is read from the block
Compression requires
  less data storage to hold the compressed data
  more CPU to compress and decompress the data

4

Why use data compression?
Data compression
  Increases number of rows in each block
  Reduces number of blocks required to store data
  For a full table scan reduces number of logical (and probably physical) I/Os required
  For a table access by ROWID increases probability that block is already in buffer cache
  Improves buffer cache hit ratio
  Potentially reduces backup and recovery times

5

When does Oracle use compression?
Oracle compresses some data types including
  VARCHAR2
  NUMBER
  RAW
Oracle does not compress
  DATE
  CHAR
Compression is often achieved by
  using length byte(s)
  trimming unused characters/bytes

6

When does Oracle use compression?
Oracle also compresses
  Length bytes in table blocks
  Length bytes in index blocks
  NULL values
  NULL values at end of each row
  Index branch blocks (suffix compressed)
  Index leaf blocks (optionally prefix compressed)
In addition some data structures implicitly compress data
  IOTs
  Index Clusters

7

Data Segment Compression
Introduced in Oracle 9.2
Intended for
  DSS environments
  Read-only tables
Not intended for
  OLTP environments
  Environments with any DML activity subsequent to data loading
Data is compressed at block level
Direct path load must be used

8

Restrictions
Data segment compression cannot be used with
  IOTs
  IOT overflow segments
  IOT mapping tables
  Index clustered tables
  Hash clustered tables
  Hash partitions
  Hash / list subpartitions
  External Tables

9

Block Level Compression
Compression is applied at block level
Blocks will only be compressed if
  data is sufficiently large to fill the block
  rows have low enough cardinality
Columns will be reordered within each block to achieve optimal compression ratios
A segment may contain
  compressed and uncompressed blocks
  blocks compressed on different columns

10

Direct Path Load
Direct path load bypasses much of the work done by conventional load
Direct path load
  reserves extents from above HWM
  formats rows into blocks
  inserts blocks back into table
  adjusts HWM
No other transactions can be active on the table whilst load is in progress

11

Direct Path Load
In Oracle 9.2 the following statements can use direct path loads
  CREATE TABLE AS SELECT
  INSERT /*+ APPEND */
  ALTER TABLE MOVE
In addition the following features can use direct path loads
  Materialized View Refresh
  SQL*Loader
  Online reorganisation

12

Creating New Tables
Tables are compressed using COMPRESS clause
  CREATE TABLE t1
  (
    c01 NUMBER,
    c02 VARCHAR2(30)
  )
  COMPRESS;
Default is for tables to be uncompressed
This is equivalent to using the NOCOMPRESS clause

13

Successful Compression
If conditions are met then these statements create compressed data blocks
  CREATE TABLE t2 COMPRESS AS
  SELECT * FROM t1;

  CREATE TABLE t2 COMPRESS AS
  SELECT * FROM t1 WHERE ROWNUM < 1;
  INSERT /*+ APPEND */ INTO t2
  SELECT * FROM t1;

14

Unsuccessful Compression
These statements will not create compressed data blocks
  CREATE TABLE t2 AS
  SELECT * FROM t1 WHERE ROWNUM < 1;
  INSERT INTO t2 VALUES ('DBA_TABLES',46);

  CREATE TABLE t2 COMPRESS AS
  SELECT * FROM t1 WHERE ROWNUM < 1;
  INSERT INTO t2
  SELECT * FROM t1;

15

Altering Existing Tables
Compression can be specified for an existing table
  ALTER TABLE t1 COMPRESS;
Existing blocks are not compressed
New blocks will be compressed if direct path load is used
Similarly
  ALTER TABLE t1 NOCOMPRESS;
disables compression for new data blocks, but does not change existing data blocks

16

Moving Existing Tables
Tables can be moved using
  ALTER TABLE t1 MOVE COMPRESS;
This command
  Creates a new segment
  Uses direct load to copy and compress blocks
  Drops old segment

17

Data Dictionary Views
Not modified in Oracle 9.2
Data segment compression flag is missing from
  ALL_TABLES
  DBA_TABLES
  USER_TABLES
Data segment compression is recorded by setting a bit in TAB$.SPARE1
Affects Håkan factor – maximum number of rows that can be held on block

18

Data Dictionary Views
In Oracle 9.2.0.1 the following will list all compressed tables in the database
  SELECT
    u.name AS owner,
    o.name AS table_name
  FROM
    sys.tab$ t,
    sys.obj$ o,
    sys.user$ u
  WHERE BITAND (t.spare1, 131072) = 131072
  AND o.obj# = t.obj#
  AND o.owner# = u.user#;

19

Tablespace Defaults
Data segment compression can be specified at tablespace level
  CREATE TABLESPACE ts01 DEFAULT COMPRESS;
All new objects created will have compression enabled
Data segment compression can also be specified for existing tablespaces
  ALTER TABLESPACE ts01 DEFAULT COMPRESS;

20

Data Dictionary Views
In Oracle 9.2.0.1 the DBA_TABLESPACES view was not updated to include data segment compression
Use the following to identify tablespaces with compression enabled
  SELECT name FROM sys.ts$
  WHERE BITAND (flags, 64) = 64;
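Both dictionary queries above use BITAND to test a single flag bit: 131072 (bit 17) in TAB$.SPARE1 and 64 (bit 6) in TS$.FLAGS. As a minimal sketch of that test outside the database, the Python below applies the same mask logic to plain integers. The function name and the sample values are hypothetical; only the two mask constants come from the slides.

```python
# Mask values taken from the slides; everything else is illustrative.
TAB_COMPRESS_MASK = 131072  # bit 17 of TAB$.SPARE1
TS_COMPRESS_MASK = 64       # bit 6 of TS$.FLAGS


def has_flag(value: int, mask: int) -> bool:
    """Equivalent of BITAND(value, mask) = mask in the dictionary queries."""
    return (value & mask) == mask


# Hypothetical SPARE1 values: other bits may be set alongside the flag.
sample_spare1 = [0, 131072, 131072 + 736, 736]
compressed = [v for v in sample_spare1 if has_flag(v, TAB_COMPRESS_MASK)]
print(compressed)
```

Comparing the masked result with the mask itself, rather than with a non-zero value, mirrors the form used in the SQL and keeps the test correct if a multi-bit mask is ever substituted.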

21

Partitioned Tables
Data segment compression can also be used with range or list partitioned tables
  CREATE TABLE t1 (c01 NUMBER, c02 VARCHAR2(200))
  PARTITION BY RANGE (c01)
  (
    PARTITION p1 VALUES LESS THAN (100),
    PARTITION p2 VALUES LESS THAN (200)
  )
  COMPRESS;
Oracle 9.2 cannot compress hash or composite partitioned tables

22

Partitioned Tables
Can also create a table with some partitions compressed and others uncompressed
  CREATE TABLE t1 (c01 NUMBER, c02 VARCHAR2(200))
  PARTITION BY RANGE (c01)
  (
    PARTITION p1 VALUES LESS THAN (100) COMPRESS,
    PARTITION p2 VALUES LESS THAN (200)
  )
  COMPRESS;
Compression can also be specified for new partitions added to an existing table
  ALTER TABLE t1
  ADD PARTITION p3
  VALUES LESS THAN (300) COMPRESS;

23

Partitioned Tables
Existing partitions can be specified as compressed/uncompressed using
  ALTER TABLE t1 MODIFY PARTITION p1 COMPRESS;
  ALTER TABLE t1 MODIFY PARTITION p1 NOCOMPRESS;
These do not affect existing rows
An existing uncompressed partition can be compressed using
  ALTER TABLE t1 MOVE PARTITION p1 COMPRESS;
This creates new segment, copies and compresses all the rows, and drops old segment

24

Data Dictionary Views
DBA_PART_TABLES.DEF_COMPRESSION contains
  NONE      Compression never enabled
  ENABLED   Compression enabled at table level
  DISABLED  Compression has been enabled at table level and subsequently disabled
  N/A       Partitioned IOT
DBA_TAB_PARTITIONS.COMPRESSION contains
  ENABLED   Compression enabled for partition
  DISABLED  Otherwise

25

Nested Tables
Compression can be specified for storage table of a nested table
  CREATE TABLE t1 (c1 NUMBER, c2 TY2)
  NESTED TABLE c2 STORE AS t2 COMPRESS;
In Oracle 9.2 DBA_NESTED_TABLES has not been updated to indicate that the storage table has been compressed

26

Materialized Views
Compression can be specified for materialized views
  CREATE MATERIALIZED VIEW mv1
  COMPRESS
  BUILD IMMEDIATE
  ENABLE QUERY REWRITE
  AS
  SELECT c1, c2, SUM (c3)
  FROM t1
  GROUP BY c1, c2;

27

Materialized Views
Compression can also be specified for existing materialized views
  ALTER MATERIALIZED VIEW mv1 COMPRESS;
Data will be compressed the next time the materialized view is refreshed, e.g.
  EXECUTE dbms_mview.refresh ('MV1');

28

SQL*Loader
SQL*Loader can create data segment compressed blocks using direct path loads
Specified using the parameter
  DIRECT = TRUE
Conventional loads using SQLLDR do not generate compressed blocks

29

PCTFREE
Default value of PCTFREE for compressed tables is 0
Can be overridden manually e.g.
  CREATE TABLE t1 (c01 NUMBER)
  COMPRESS PCTFREE 10;
In general the default behaviour is preferable

30

Compression Ratios
Compression ratios vary with
  number of rows
  number of columns
  cardinality of rows
Compression ratios can be improved by sorting table on low cardinality columns prior to loading
Can also be improved by using larger block sizes

31

Compression Ratios
For example – loading SALES table from sales history demo schema
  $ORACLE_HOME/demo/schema/sales_history

  Block Size  Uncompressed Size (Blocks)  Compressed Size (Blocks)  Ratio %
  2048        18777                       13433                     71.5
  4096        8983                        6106                      68.0
  8192        4398                        2850                      64.7
  16384       2179                        1353                      62.0

Table contains 1016271 rows
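The Ratio % column appears to be the compressed size expressed as a percentage of the uncompressed size; that reading is an assumption, since the slide does not state the formula. The short Python sketch below recomputes it from the block counts in the table. Under that assumption the recomputed values agree with the slide's figures to within about 0.1 of a percentage point.

```python
# (block size, uncompressed blocks, compressed blocks) from the table above
measurements = [
    (2048, 18777, 13433),
    (4096, 8983, 6106),
    (8192, 4398, 2850),
    (16384, 2179, 1353),
]


def ratio_percent(uncompressed: int, compressed: int) -> float:
    """Compressed size as a percentage of uncompressed size (assumed formula)."""
    return 100.0 * compressed / uncompressed


ratios = {bs: ratio_percent(u, c) for bs, u, c in measurements}
for block_size, pct in ratios.items():
    print(f"{block_size:>6}  {pct:.1f}%")
```

The ratio falls as the block size grows, which is consistent with the earlier point that larger blocks can improve compression.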

32

Implementation
Each compressed block contains two tables
  Symbol table – contains one row for each individual column value or set of column values
  Row table – one row for each row in block
Each column in row table can be a
  reference to the symbol table if column is compressed
  column value if column is not compressed
Compression is performed at block-level only – no inter-block references

33

Data Segment Compression
Block layout (component and size)
  Block Common Header      20 bytes
  Transaction Header       24 bytes + 24 bytes per ITL entry
  Data Header              14 bytes
  Compressed Data Header   16 bytes (variable)
  Table directory          8 bytes
  Row directory            2 bytes per row
  Free Space
  Symbol Table
  Row Data
  Tail                     4 bytes

34

Compressed Block Header
Compressed blocks include an extra header
Header length is variable
Depends on
  number of columns
  order in which they are compressed
Example of header from block dump
  r0_9ir2=0x0
  mec_kdbh9ir2=0x1
  r1_9ir2=0x0
            76543210
  flag_9ir2=------OC
  fcls_9ir2[5]={ 0 32768 32768 32768 32768 }
  perm_9ir2[5]={ 0 1 4 2 3 }

35

Length Bytes
Column length bytes behave differently in compressed blocks
  <= 200            reference (single column values)
                    column count (multi-column values)
  > 200 AND < 250   length is value - 200
  250 (0xFA)        length is contained in next two bytes
  251 (0xFB)        reference is contained in next two bytes

36

Length Bytes
Examples
  Byte(s) - Hex  Byte(s) - Decimal  Value
  C9             201                1
  CA             202                2
  CB             203                3
  CC             204                4
  F8             248                48
  F9             249                49
  FA 00 32       250 00 50          50
  FA 00 33       250 00 51          51
  FA 0F 9F       250 15 159         3999
  FA 0F A0       250 15 160         4000
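The length byte rules and the examples above can be expressed as a small decoder. The following Python is a sketch only: the function name and the returned tuple shape are hypothetical, and the two-byte values are assumed to be big-endian because that is what the examples imply (FA 0F A0 gives 4000). Values above 251 are not described on the slides, so the sketch rejects them.

```python
from typing import Tuple


def decode_length_byte(data: bytes) -> Tuple[str, int, int]:
    """Decode the leading length byte(s) of a column in a compressed block.

    Returns (kind, value, bytes_consumed). kind is "ref_or_count",
    "length" or "reference", following the rules on the slide.
    """
    if not data:
        raise ValueError("no data")
    first = data[0]
    if first <= 200:
        # Reference (single column) or column count (multi-column)
        return ("ref_or_count", first, 1)
    if first < 250:
        return ("length", first - 200, 1)
    if len(data) < 3:
        raise ValueError("two further bytes expected")
    if first == 250:  # 0xFA: length held in the next two bytes
        return ("length", int.from_bytes(data[1:3], "big"), 3)
    if first == 251:  # 0xFB: reference held in the next two bytes
        return ("reference", int.from_bytes(data[1:3], "big"), 3)
    raise ValueError(f"length byte {first:#x} is not covered by the slides")


for hex_bytes in ("C9", "F9", "FA 00 32", "FA 0F A0"):
    print(hex_bytes, decode_length_byte(bytes.fromhex(hex_bytes)))
```

A single byte covers lengths 1 to 49, which is why the examples switch to the three-byte form at length 50.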

37

Example
Monaco Grand Prix Winners 1993-2002
  Year  Driver              Team
  1993  Ayrton Senna        McLaren
  1994  Michael Schumacher  Benetton
  1995  Michael Schumacher  Benetton
  1996  Olivier Panis       Ligier
  1997  Michael Schumacher  Ferrari
  1998  Mika Hakkinen       McLaren
  1999  Michael Schumacher  Ferrari
  2000  David Coulthard     McLaren
  2001  Michael Schumacher  Ferrari
  2002  David Coulthard     McLaren

38

Example - Uncompressed Data Block
Row Data
  2001  Michael Schumacher  Ferrari
  2002  David Coulthard     McLaren
  1999  Michael Schumacher  Ferrari
  1997  Michael Schumacher  Ferrari
  1998  Mika Hakkinen       McLaren
  1993  Ayrton Senna        McLaren
  1994  Michael Schumacher  Benetton
  1996  Olivier Panis       Ligier
  1995  Michael Schumacher  Benetton
  2000  David Coulthard     McLaren

39

Example - Compressed Data Block
Symbol Table (entry, value, number of references)
  0  David Coulthard     2
  1  Ferrari             3
  2  Michael Schumacher  5
  3  Benetton            2
  4  McLaren             4
Row Data (numbers are references to symbol table entries)
  1993  Ayrton Senna   4
  1996  Olivier Panis  Ligier
  1994  2              3
  1995  2              3
  1997  2              1
  2002  0              4
  2001  2              1
  2000  0              4
  1999  2              1
  1998  Mika Hakkinen  4
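The idea behind the compressed block in the Monaco example can be illustrated with a simplified sketch: values that occur more than once within a block go into a symbol table, and rows hold references to it. The Python below is not Oracle's algorithm. It ignores multi-column symbols, column reordering and the on-disk format, and its function names and symbol numbering are hypothetical.

```python
from collections import Counter
from typing import List, Tuple, Union

Row = Tuple[str, str, str]
Cell = Union[int, str]  # int = symbol table reference, str = literal value

WINNERS: List[Row] = [
    ("1993", "Ayrton Senna", "McLaren"),
    ("1994", "Michael Schumacher", "Benetton"),
    ("1995", "Michael Schumacher", "Benetton"),
    ("1996", "Olivier Panis", "Ligier"),
    ("1997", "Michael Schumacher", "Ferrari"),
    ("1998", "Mika Hakkinen", "McLaren"),
    ("1999", "Michael Schumacher", "Ferrari"),
    ("2000", "David Coulthard", "McLaren"),
    ("2001", "Michael Schumacher", "Ferrari"),
    ("2002", "David Coulthard", "McLaren"),
]


def compress_block(rows: List[Row]) -> Tuple[List[str], List[Tuple[Cell, ...]]]:
    """Replace values occurring more than once in the block with references."""
    counts = Counter(value for row in rows for value in row)
    symbols = sorted(v for v, n in counts.items() if n > 1)
    index = {v: i for i, v in enumerate(symbols)}
    packed = [tuple(index.get(v, v) for v in row) for row in rows]
    return symbols, packed


def decompress_block(
    symbols: List[str], packed: List[Tuple[Cell, ...]]
) -> List[Tuple[str, ...]]:
    """Resolve references back to their symbol table values."""
    return [
        tuple(symbols[c] if isinstance(c, int) else c for c in row)
        for row in packed
    ]


symbols, packed = compress_block(WINNERS)
print(symbols)
print(packed[1])
```

Values that appear only once, such as the years and the one-off winners, stay in the row as literals, which matches the way unique values remain in the row data in the slide.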

40

Performance
Performance tests
  Cost of inserting compressed rows
  Cost of selecting compressed rows
Tested using
  Oracle 9.2.0.1
  Sun Ultra Enterprise 450 – 4 CPUs
  8192 byte block size
Test data adapted from sales history demo
  $ORACLE_HOME/demo/schema/sales_history
SALES table contains 1016271 rows

41

Inserting Compressed Rows
Loading the entire file into an empty table
  Compressed  Blocks  Elapsed Time (Secs)  CPU Time (Secs)
  No          4398    31.77                4.13
  Yes         2850    71.08                43.86
Compressed data is 35% smaller
Reduction in logical and physical I/O more than offset by increase in CPU time to compress blocks
Statistics from V$SYSSTAT
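To put the load figures above in proportion, the sketch below derives the space saving and the relative elapsed and CPU cost from the numbers in the table. The variable names are illustrative only. On these figures, a saving of roughly 35% in blocks costs a little over twice the elapsed time and a little over ten times the CPU time.

```python
# Figures from the load test table above.
uncompressed = {"blocks": 4398, "elapsed_s": 31.77, "cpu_s": 4.13}
compressed = {"blocks": 2850, "elapsed_s": 71.08, "cpu_s": 43.86}

space_saving = 1 - compressed["blocks"] / uncompressed["blocks"]
elapsed_factor = compressed["elapsed_s"] / uncompressed["elapsed_s"]
cpu_factor = compressed["cpu_s"] / uncompressed["cpu_s"]

print(f"space saving:   {space_saving:.1%}")
print(f"elapsed factor: {elapsed_factor:.2f}x")
print(f"CPU factor:     {cpu_factor:.2f}x")
```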

42

Selecting Compressed Rows
Selecting all rows from table
  SELECT SUM (quantity_sold) FROM sales;

  Compressed  Blocks  Elapsed Time (Secs)  CPU Time (Secs)
  No          4398    3.41                 2.77
  Yes         2850    3.78                 3.53
Reduction in logical and physical I/O more than offset by increase in CPU time to decompress blocks
Statistics from trace file

43

Caveat
Updating rows is VERY expensive
Rows must be decompressed before they can be updated
In this example the ALL_OBJECTS view contained 7253 rows
  CREATE TABLE t1 PCTFREE 0 AS
  SELECT owner, object_name, subobject_name, object_id
  FROM all_objects;
creates a 28 block table
The same statement with a COMPRESS clause creates a 23 block table

44

Caveat
The statement
  UPDATE t2 SET owner = owner;
decompresses all blocks
After the update statement is executed the table contains 79 blocks
Once blocks are decompressed, rollback will not recompress them
Use read-only tablespaces to prevent inadvertent updates
Deletes do not display this characteristic
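The block counts quoted on the two Caveat slides show how much the update inflates the table. The sketch below simply divides the figures from the slides; the variable names are illustrative. After the update the table is well over three times the size of the compressed original, and it is also larger than the 28 block uncompressed version.

```python
# Block counts quoted on the two Caveat slides.
uncompressed_blocks = 28
compressed_blocks = 23
after_update_blocks = 79

growth_vs_compressed = after_update_blocks / compressed_blocks
growth_vs_uncompressed = after_update_blocks / uncompressed_blocks

print(f"vs compressed table:   {growth_vs_compressed:.2f}x")
print(f"vs uncompressed table: {growth_vs_uncompressed:.2f}x")
```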

45

Conclusions
DSS / read only feature
Good compression ratios
Only works with direct path load
High CPU usage
High elapsed times
Updates are disproportionately expensive
Documentation is weak
Data dictionary views need enhancing

46

Thank you for your interest
For more information and to provide feedback please contact me
My e-mail address is: [email protected]
My website address is: www.juliandyke.com