
Table Partitioning For Maintenance and Performance

Jarmo Nieminen, System Engineer, Principal
Gus Bjorklund, Wizard


© 2015 Progress Software Corporation. All rights reserved.

What language are we going to speak today?

“Saari, saari, heinäsaari; heinäsaaren neito.”

“Island, island, grassy island; grassy island's maiden.”

“Ö, ö, hö ö; hö ö mö”


Agenda

Partition design considerations

• Partition definition setup, not physical layout

Some internals of Table Partitioning in OpenEdge

Performance Impact (maintenance and runtime)

• Your configuration matters

• Yes, we have numbers


Is it Right for You?

Do you require 24/7 uptime for your large business critical OpenEdge application?

Does maintaining your home-grown archiving system take up too much of your time?

Do you enjoy maintaining your OpenEdge database during weekends and holidays?

How important are performance and response time SLAs for you?


Data Access: Partitioning types

List Partitioning

• Order Table split into Northern Region, Western Region, Southern Region

OR

Range Partitioning

• Order Table split at 12/31/2011, 12/31/2013, 12/31/2015

OR

Sub-partitioning (up to 15 levels!)

• Order Table split by region (Northern, Western, Southern), then by date (12/31/2011, 12/31/2013, 12/31/2015) within each region


What to Partition: Data Organization

Look for a “well known” grouping by “static data value”

• Known at creation time, changes infrequently

List Partitioning

Data organized geographically or grouped by specific entities

• Exact match

• Country, region, company, division

• Why or why not Sales-Rep?

Consider number of unique data values

• 32,765 max defined partitions per table

For best performance: Spread the data out

[Diagram: List Partitioning. Order Table split into Northern Region, Western Region, Southern Region]
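The exact-match rule for list partitioning can be pictured with a small lookup. This is only a sketch in Python, not OpenEdge syntax. The partition names (`order_north` and so on) and both helper functions are hypothetical; only the three regions and the 32,765 limit come from the slide.

```python
# Illustrative sketch only; partition names below are hypothetical.
MAX_PARTITIONS_PER_TABLE = 32765  # limit quoted on the slide

# One partition per exact column value (list partitioning).
LIST_PARTITIONS = {
    "Northern Region": "order_north",
    "Western Region": "order_west",
    "Southern Region": "order_south",
}


def list_partition_for(region):
    """Return the partition for a region; the value must match exactly."""
    try:
        return LIST_PARTITIONS[region]
    except KeyError:
        raise ValueError("no partition defined for region %r" % region) from None


def too_many_values(distinct_values):
    """True if a column has more distinct values than one table can hold as partitions."""
    return distinct_values > MAX_PARTITIONS_PER_TABLE
```

A column such as Sales-Rep can be checked the same way: count its distinct values and consider how often a row's value changes before using it as a list key.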


What to Partition: Data Organization

Range Partitioning

Data organized by ranges of values

• Range rather than single value to identify a group of data

• Date (by year is most typical)

– Usage: Calendar year, fiscal year, quarter?

– Order-date vs ship-date?

– Consider effect on index choice

• Alphabetic or numeric range

– Product code

– Usage vs Balance: Group related products, balance A-Z spread

For best performance: Spread the data out

[Diagram: Range Partitioning. Order Table split at 12/31/2011, 12/31/2013, 12/31/2015]
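A range lookup differs from a list lookup in that a value is matched against boundaries rather than compared for equality. The sketch below is Python, not OpenEdge syntax. It uses the three dates from the diagram and assumes each date is an inclusive upper bound of its partition; the function name and zero-based partition numbering are illustrative.

```python
from bisect import bisect_left
from datetime import date

# Boundaries from the slide; treating each as an inclusive upper bound is an assumption.
UPPER_BOUNDS = [date(2011, 12, 31), date(2013, 12, 31), date(2015, 12, 31)]


def range_partition_for(order_date):
    """Return the index of the first partition whose upper bound is >= order_date."""
    idx = bisect_left(UPPER_BOUNDS, order_date)
    if idx == len(UPPER_BOUNDS):
        raise ValueError("no partition defined for %s" % order_date.isoformat())
    return idx
```

With these bounds, an order dated 12/31/2011 lands in the first partition and one dated 01/01/2012 in the second.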


What to Partition: Data Organization

Sub-partitioning

Sub-partitioning candidate?

• Can you include another column (or add one)?

• By region by order-date

For best performance:

• Sub-partition AND spread the data out

[Diagram: Sub-partitioning. Order Table split by region (Western, Southern, Northern), each region split at 12/31/2011, 12/31/2013, 12/31/2015]
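Sub-partitioning "by region by order-date" combines the two lookups: an exact match on region, then a range match on the date. The Python sketch below is illustrative only. The partition numbering scheme is hypothetical, and treating the dates as inclusive upper bounds is an assumption.

```python
from bisect import bisect_left
from datetime import date

# Values from the diagram; ordering and numbering are illustrative.
REGIONS = ["Northern Region", "Western Region", "Southern Region"]
UPPER_BOUNDS = [date(2011, 12, 31), date(2013, 12, 31), date(2015, 12, 31)]


def sub_partition_for(region, order_date):
    """Map (region, order_date) to one of 3 x 3 = 9 partitions."""
    region_idx = REGIONS.index(region)                    # exact match (list level)
    range_idx = bisect_left(UPPER_BOUNDS, order_date)     # boundary match (range level)
    if range_idx == len(UPPER_BOUNDS):
        raise ValueError("no partition defined for %s" % order_date.isoformat())
    return region_idx * len(UPPER_BOUNDS) + range_idx
```

Three regions and three date ranges give nine partitions, so concurrent work in different regions or date ranges lands in different partitions.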


Increasing Concurrency With Table Partitioning

No partitioning: Order table data in 1 storage area

• Table data all in one physical storage area

[Diagram: storage areas A7, A8, A9; Users #1 to #4 all work in area A7]

Create order. Assign Order-date = TODAY region = "NorthEast".

Create order. Assign Order-date = TODAY region = "SouthEast".


Increasing Concurrency With Table Partitioning

Range Partitioning by Order-Date

• “Current” data in one physical storage area

[Diagram: Partitions 1, 2, 3 in storage areas A7, A8, A9; Users #1 to #4 all work in area A7]

Create order. Assign Order-date = TODAY.

Create order. Assign Order-date = TODAY.


Increasing Concurrency With Table Partitioning

Range Partitioning by Order-Date

• “Current” data in one physical storage area

[Diagram: Partitions 1, 2, 3 in storage areas A7, A8, A9; Users #1 to #4 all work in area A7]

Range partitioning by Product-Code

• Table data across physical storage areas

[Diagram: Partitions 1, 2, 3 in storage areas A7, A8, A9; Users #1 to #4 shown against Partitions 1, 2, 3]

Create order. Assign Order-date = TODAY Product-Code = "D100".

Create order. Assign Order-date = TODAY Product-Code = "A50".


Increasing Concurrency With Table Partitioning

List partitioning by Region

• Table data across physical storage areas

[Diagram: Partitions 1, 2, 3 in storage areas A7, A8, A9; Users #1 to #4 shown against Partitions 1, 2, 3]

Create order. Assign Order-date = TODAY region = "NorthEast".

Create order. Assign Order-date = TODAY region = "SouthEast".


Increasing Concurrency With Table Partitioning

List partitioning by Region, compared with sub-partitioning by Region & Order-Date

• Table data across physical storage areas in both schemes

[Diagram, shown once per scheme: Partitions 1, 2, 3 in storage areas A7, A8, A9; Users #1 to #4 shown against Partitions 1, 2, 3]

Create order. Assign Order-date = TODAY region = "NorthEast".

Create order. Assign Order-date = TODAY region = "SouthEast".


DOES IT REALLY MATTER?

LET’S LOOK AT SOME NUMBERS…


Physical Characteristics

Type II Areas

• Data and index separated

– 8 KB block size with cluster sizes of 512 (data) and 64 (index)

• All partitions in separate areas

• Areas of proportional fixed sizes with matching db extents

Data

• Average record size 257, all same RPB (32)

– Might be interesting to show per partition RPB tuning

• 50,000 to 10,000,000 records per run (based on # users)

• 3 Global indexes and 2 local indexes

Recovery

• 8KB block with 128 MB cluster size


Testing performed

Scale users

• 1, 2, 5, 10, 25, 50, 100, 200

• Avoid application side conflicts

• Monitor internal resource conflicts

Operations executed

• Basic Create, Read, Delete

Vary transaction scope

• 10, 100, 500 records per transaction

Vary partitioning scheme

• No partitioning

• Range partitioning on {order-date}

• Sub-partitioning on {region(9), order-date}


Modified Server Parameters

Buffer pool

• -B 50000 -lruskips 250

Lock Table

• -L 100000 -lkwtmo 3600

Transaction

• -TXERetryLimit 1000

BI

• -bibufs 4000 -bwdelay 20

Latching

• -spin 50000 -napmax 10

Page writers: 1 BIW, 3 APWs


Other Test Information

Machine Stats

• 16 sparcv9 processors operating at 3600 MHz

• Memory size: 32768 Megabytes

Dbanalys performed before and after each activity

Database recreated with same .st file for each run

Variation across runs: ±1%

These tests were run with the best intentions

• There are some additional areas Progress needs to investigate

• As always, YMMV


Sub-partitioning on region AND order-date

Txn size of 10

[Chart: # Users vs % difference. Negative values indicate a loss for TP.]

Writes and deletes perform significantly better with this sub-partitioning scheme

Big jump starts at 25 & 50

No improvement for “Isolated” Read activity


Sub-partitioning on region AND order-date

Deletes fall off with increased txn size

Big jump starts at 25

Reads remain flat


Sub-partitioning on region AND order-date

Deletes fall off with increased txn size

Writes improve with increased txn size

Reads remain flat


Data Location Mapping Overhead

Table data now across physical storage areas

[Diagram: Partitions 1 to 4 in storage areas A7, A8, A9, A10. Labels: Table # + Column Value, Area # and Record data, Object Mapping, Partition Mapping]

Partition mapping via “special” _partition-policy-detail (ppd) index

• One additional index lookup per record created

Create order. Assign region = "NorthEast" order-date = TODAY.


Why the big difference?

For the 100 user “write” example (117% runtime performance improvement):

Note: 3-level index for non-partitioned and global, only 2 levels for partitioned indexes

Stat | Base | Partition | Delta | Waits Delta
DB Buf I Lock | 24,965,752 | 24,219,272 | 746,480 | 48,485,873
DB Buf S Lock | 49,845,176 | 43,137,912 | 6,707,264 | 3,669,065
Find index entry | 100 | 5,000,100 | -5,000,000 | 0
BUF Latch | 325,197,441 | 230,610,801 | 94,586,640 | 2,488,178
MTX Latch | 31,877,745 | 31,280,314 | 597,431 | -443,931
TXQ | 60,339,744 | 60,930,902 | -591,158 | -71,178
Latch timeouts | 3,205,140 | 1,088,120 | 2,117,020 |
Resource waits | 64,643,103 | 6,940,906 | 57,702,197 |
Extends | 44,224 | 48,224 | |
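The Delta column in the table above works out as Base minus Partition, so a positive value means the partitioned run did less of that activity. A minimal Python sketch, using four rows copied from the slide, recomputes the deltas and expresses one of them as a percentage of the base. The function names are illustrative, and the percentage is not a figure from the slide.

```python
# (base, partition) pairs copied from the 100-user "write" table.
ROWS = {
    "DB Buf I Lock": (24_965_752, 24_219_272),
    "DB Buf S Lock": (49_845_176, 43_137_912),
    "BUF Latch": (325_197_441, 230_610_801),
    "Resource waits": (64_643_103, 6_940_906),
}


def delta(base, partition):
    """Delta as used in the table: base run minus partitioned run."""
    return base - partition


def pct_reduction(base, partition):
    """Delta expressed as a percentage of the base value."""
    return 100.0 * (base - partition) / base


for stat, (base, part) in ROWS.items():
    print(stat, delta(base, part), round(pct_reduction(base, part), 1))
```

On these figures, resource waits drop by roughly nine tenths, which is the largest relative change among the four rows.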


Data Location Mapping Overhead

Table data now across physical storage areas

[Diagram: Partitions 1 to 4 in storage areas A7, A8, A9, A10. Labels: Table # + Column Value, Area # and Record data, Object Mapping, Partition Mapping]

Partition mapping via “special” ppd index

• One additional index lookup per partition traversed

• Query spanning 3 partitions requires only 3 additional partition index lookups

For each order where region = "NorthEast" and order-date > 01/01/2013: end.
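The difference between "per record created" and "per partition traversed" can be sketched by counting the extra mapping lookups in each case. The Python below is illustrative only. It reuses the 2011/2013/2015 boundaries from the earlier diagrams as assumed inclusive upper bounds, and the function names are hypothetical.

```python
from bisect import bisect_right
from datetime import date

# Assumed inclusive upper bounds, borrowed from the range partitioning diagrams.
UPPER_BOUNDS = [date(2011, 12, 31), date(2013, 12, 31), date(2015, 12, 31)]


def partitions_for_dates_after(lower):
    """Indexes of the partitions that can hold rows with order-date > lower."""
    first = bisect_right(UPPER_BOUNDS, lower)
    return list(range(first, len(UPPER_BOUNDS)))


def extra_lookups_for_query(lower):
    """Reads: one extra mapping lookup per partition traversed."""
    return len(partitions_for_dates_after(lower))


def extra_lookups_for_creates(records_created):
    """Creates: one extra mapping lookup per record created."""
    return records_created
```

Under these assumptions a query for order-date > 01/01/2013 touches two partitions and costs two extra lookups, however many rows it returns. The 5,000,000 additional "Find index entry" operations in the write statistics are consistent with one lookup per created record, although the slides do not state the record count for that run.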


Why so little difference?

For the 100 user “read” example (1.8% runtime performance improvement):

This is an isolated test case

• Very little conflict for read activity

Real world scenarios will have mixed activity introducing concurrency issues

• May be resolved with partitioning

Stat | Base | Partition | Delta | Waits Delta
Index operations | 25,000,100 | 25,000,200 | -100 | 0
DB Buf S Lock | 26,059,284 | 26,044,498 | 14,786 | 6,816
BHT Latch | 25,872,927 | 29,698,636 | -3,825,709 | 29,532
BUF Latch | 81,084,554 | 81,295,318 | -210,764 | -3,519
Latch timeouts | 57,193 | 31,327 | 25,866 |
Resource waits | 7,172 | 356 | 6,816 |


Data Location Mapping Overhead

Table data now across physical storage areas

[Diagram: Partitions 1 to 4 in storage areas A7, A8, A9, A10. Labels: Table # + Column Value, Area # and Record data, Object Mapping, Partition Mapping]

Partition mapping via “special” ppd index

• One additional index lookup per partition traversed

• Query spanning 3 partitions requires only 3 additional partition index lookups

For each order where region = "NorthEast" and order-date > 01/01/2013: DELETE order. end.


Why the big difference?

For the 100 user “delete” example (126% runtime performance improvement):

Partitioning has more Shared Buffer activity with fewer waits.

Latch timeout and resource wait differences are significant

Stat | Base | Partition | Delta | Waits Delta
DB Buf I Lock | 49,764,712 | 39,622,400 | 10,142,312 | 28,235,912
DB Buf S Lock | 135,589,472 | 172,297,568 | -36,708,096 | 45,324,198
BHT Latch | 308,628,367 | 261,774,931 | 46,853,436 | 471,875
BUF Latch | 527,202,766 | 516,930,247 | 10,272,519 | 2,955,964
Latch timeouts | 5,200,069 | 1,849,659 | 3,350,410 |
Resource waits | 120,187,904 | 46,963,709 | 73,224,195 |


Numbers from a “bad” configuration

Range Partitioning by Order-Date

• “Current” data in one physical storage area

[Diagram: Partitions 1, 2, 3 in storage areas A7, A8, A9; Users #1 to #4 all work in area A7]

Create order. Assign Order-date = TODAY.

Create order. Assign Order-date = TODAY.


Poorly designed partitioning scheme (order-date only)

Write performance is pretty bad for 25-100 users

Expect MUCH flatter write performance

Improves with 200 users

Reads remain flat

Same for other txn sizes


All the overhead without improved concurrency

For the 100 user “write” example (66% runtime performance loss):

Waits are not significantly different, but extends, index activity, and buffer activity are MUCH higher.

• Test case was broken: data in wrong area

Stat | Base | Partition | Delta | Waits Delta
DB Buf I Lock | 24,965,752 | 24,987,122 | -21,370 | 1,135,820
DB Buf S Lock | 49,845,176 | 54,496,236 | -4,651,060 | 17,760
Find index entry | 100 | 5,000,100 | -5,000,000 | 0
BUF Latch | 325,197,441 | 332,561,860 | -7,364,419 | 268,565
MTX Latch | 31,877,745 | 31,931,589 | -53,844 | 9,455
TXQ | 60,339,744 | 60,384,584 | -44,840 | -346
Latch timeouts | 3,205,140 | 2,894,719 | 310,421 |
Resource waits | 64,643,103 | 63,340,388 | 1,302,715 |
Extends | 44,224 | 173,760 | |


Poorly designed partitioning scheme (order-date only)

Much flatter response

No 200 user results

Further investigation:

• There is variation in reads and writes from the previous run.

• Stats show the same as well

From small gain down to ~7% loss depending on the operation & # users


All the overhead without improved concurrency

For the 100 user “write” example (0.38% runtime performance loss):

Now have similar stats

• Index activity and X Buffer locks are the only real standouts

Stat | Base | Partition | Delta | Waits Delta
DB Buf I Lock | 24,942,392 | 24,953,548 | -11,156 | 347,384
DB Buf S Lock | 49,868,240 | 55,152,964 | -5,284,724 | -16,267
DB Buf X Lock | 41,941,720 | 42,181,992 | -240,272 | -324,844
Find index entry | 100 | 5,000,100 | -5,000,000 | 0
BUF Latch | 321,989,710 | 333,313,379 | -11,323,669 | 48,054
MTX Latch | 31,931,319 | 31,944,574 | -13,255 | -4,037
Latch timeouts | 3,166,505 | 3,132,653 | 33,852 |
Resource waits | 61,308,496 | | 6,865 |
Extends | 3,392 | 10,880 | -7,488 |


All the overhead without improved concurrency

For the 100 user “read” example (0.31% runtime performance loss):

Very little difference in activity

Index operations indicate use of “Global Index” vs local.

• Probably should rerun with different query to force comparative local index lookup

Stat | Base | Partition | Delta | Waits Delta
Index operations | 25,000,100 | 25,000,100 | 0 | 0
DB Buf S Lock | 26,782,308 | 26,503,194 | 279,114 | 372
BHT Latch | 25,863,634 | 26,933,298 | -1,069,664 | 1,428
BUF Latch | 82,393,823 | 82,089,754 | 304,069 | -790
Latch timeouts | 59,747 | 58,588 | 1,159 |
Resource waits | 3,893 | 3,521 | 372 |


All the overhead without improved concurrency

For the 100 user “delete” example (4.56% runtime performance loss):

TP experiences more activity

And now more waiting - NOTE: Buffer intent and share locks

Indexes all identical levels

Stat | Base | Partition | Delta | Waits Delta
DB Buf I Lock | 49,708,524 | 49,747,972 | -39,448 | -2,270,296
DB Buf S Lock | 133,305,600 | 133,141,072 | 164,528 | -2,075,716
BHT Latch | 292,113,659 | 296,430,870 | -4,317,211 | -57,465
BUF Latch | 508,107,788 | 511,003,173 | -2,895,385 | -132,237
Latch timeouts | 4,670,032 | 4,857,119 | -187,087 |
Resource waits | 106,292,224 | 110,639,104 | -4,346,880 |


Maintenance Operations

10 GB Table (90 million records)

Binary Dump / load

Index rebuild

• Local

• Global


Binary Load Performance

Important to note:

Concurrent load for TP does NOT foul loading in dump order

• Multiple users insert on different allocation chains – one per partition

– Regardless of whether the partitions are in the same area or not!

Concurrent load for non-TP fouls loading in dump order

• Multiple users inserting on SAME allocation chains

• “Logical” scatter of data is re-introduced during load


Binary Load

Single user slower than expected

During TP load noticed high DBSI contention

• Due to global index support.

Operation | Non-TP Table | TP Entire Table | % Difference
Binary Load -1 | 24m35.698s | 30m0.474s | -22.90
Binary Load -1 -i | 24m36.540s | 30m14.441s | -21.95
Binary Load -n 9 ** | 24m59.249s | 14m10.843s | 76.15
Binary Load -n 9 (w/apw,biw) | 17m3.913s | 11m45.879s | 45.04
Binary Load -n 9 -i ** | 17m30.530s | 6m41.232s | 162.09
Binary Load -n 9 -i (w/apw,biw) | 16m53.992s | 6m34.930s | 150.74
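To compare the timings directly, the "XmY.YYYs" values can be converted to seconds and divided. The Python sketch below reports a plain ratio of non-TP time to TP time. The slides do not state the formula behind the % Difference column, so this does not attempt to reproduce that column, and the helper names are illustrative.

```python
import re

_TIME = re.compile(r"(\d+)m(\d+(?:\.\d+)?)s")


def to_seconds(text):
    """Convert a timing such as '24m59.249s' to seconds."""
    match = _TIME.fullmatch(text)
    if match is None:
        raise ValueError("unrecognized timing: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))


def speedup(non_tp, tp):
    """Ratio of non-TP time to TP time; above 1.0 means the TP run was faster."""
    return to_seconds(non_tp) / to_seconds(tp)


# Two rows from the table: single-user load and 9-way concurrent load.
print(round(speedup("24m35.698s", "30m0.474s"), 2))
print(round(speedup("24m59.249s", "14m10.843s"), 2))
```

A ratio below 1.0 for the single-user row and above 1.0 for the concurrent row matches the sign of the slide's percentages for those rows.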


Index Rebuild

Modest performance loss running offline

• All TP indexes built – only need to build local indexes after binary load

• No real increased concurrency

Running online in parallel for all local indexes is a huge win

• Very small amount of BI activity when run online

Operation | Non-TP Table | TP Entire Table | % Difference
Idxbuild off-line | 16m37s | 18m26s | -10.93
Idxbuild on-line (2 local) | (9m15.5) | 4m49s | 92.04
Idxbuild on-line -i (2 local) | (9m15.5) | 4m53s | 89.42


Binary Dump

Operation | Non-TP Table | TP Entire Table | % Difference
Binary Dump -1 | 5m13.499s | 3m21.848s | 54.95
Binary Dump -n w/1 exe | 6m10.337s | 3m43.359s | 65.92
Binary Dump -n w/9 exes | 2m30.658s | 1m11.738s | 111.27
Binary Dump threaded w/1 exe | 2m37.818s | 3m40.989s | -125.51
Binary Dump threaded w/9 exes | 2m28.074s | 1m10.527s | 111.43
Binary Dump specified threaded w/9 exes | 2m27.048s | 1m10.050s | 110.00


Summary

Partition based on data grouping

Spread data across partitions

Create and delete performance improvements

• Significant for well designed partition schemes

Isolated read performance mostly unaffected by partitioning

Maintenance performance improvements also significant
