This work is by Kendra Little and is licensed under a Creative Commons...

42
This work is by Kendra Little and is licensed under a Creative Commons Attribution- NonCommercial-NoDerivs 3.0 Unported License Kendra Little Introduction to SQL Server Partitioning

Transcript of This work is by Kendra Little and is licensed under a Creative Commons...

Page 2: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

About Kendra

Page 3: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Index

1. A sample case.2. What is partitioning?3. When is partitioning helpful?4. What’s the fine print?5. Revisiting our sample case.

You are here

Page 4: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Should this client use partitioning?

Page 5: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Index

1. A sample case.2. What is partitioning?3. When is partitioning helpful?4. What’s the fine print?5. Revisiting our sample case.

You are here

Page 6: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

All tables have at least one partition.“In SQL Server, all tables and indexes in a database are considered partitioned, even if they are made up of only one partition. Essentially, partitions form the basic unit of organization in the physical architecture of tables and indexes. This means that the logical and physical architecture of tables and indexes comprised of multiple partitions mirrors that of single-partition tables and indexes.” …Partitioned Table and Index Concepts (msdn)

One Partition

Page 7: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

“Partitioning” actually means “horizontal partitioning”

Horizontal partitioning takes groups of rows in a single table

and allocates them in semi-independent physical sections.

SQL Server’s horizontal partitioning is RANGE based.

Page 8: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Horizontal ranges are based on a partition key.A single column in the table.

Just one!Use a computed column if you must, but make sure it

performs well as a criterion and works for joins.Typically a date or integer valueConsider:

A column you will join onA column you can always use as a criterion

I must choose wisely.

Page 9: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Ranges of data are defined by a partition function which uses the key.The partition function defines your boundary points and can use either RANGE LEFT or RIGHT.

LEFT: the first value is an UPPER boundary point in partition #1

RIGHT: the first value is a LOWER boundary point in partition #2

Keep to the right. It’s

easier.

Page 10: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

RIGHT based partition function for Doll Orders keyed on OrderDate

1/1/2008

1/1/2009

1/1/2010

1/1/2011

Partition 1

Partition 2

Partition 3

Partition 4

Partition 5

Page 11: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

RIGHT based partition function keyed on PartName (effectively LIST)

Boundary Point 1: BODY

Boundary Point 2: SHOE

Partition 1

Partition 2

Partition 3

Question: how do we get rows into Partition 1?

Page 12: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Filegroups are mapped to the partition function using a partition scheme.

1/1/2008

1/1/2009

1/1/2010

1/1/2011

Partition 1: Compressed

Partition 2:Compressed

Partition 3

Partition 4

Partition 5

Slow, Read-only FG_A

FG_B

FG_C

FG_D

Page 13: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Objects are created on the partition scheme.

Table(and

indexes)

• Created on partition scheme.

Partition

Scheme

• Maps partitions defined by the partition function to physical filegroups

Partition

Function

• Boundary points• Defines ranges• Define an algorithm the engine will use to know where to put rows

Page 14: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Indexes can be created on the partition scheme. Or not.

• Located on your partitioning scheme (or an identical partitioning scheme)• Must contain the partitioning key. • If the partitioning key is not specified, it will be added for you. Note: this

affects your primary key for the table!• Indexes are aligned by default unless it is otherwise specified at creation time.• Perform better for aggregations and when partition elimination can be used.

Aligned Indexes

• Physically located elsewhere- either non partitioned or on a non-identical partitioning scheme

• May perform better with single-record lookup• Allow unique indexes (because they do not have to contain the partitioning

key)• However, the presence of these preclude partition-switching!

Non-aligned indexes

Page 15: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

SwitchingRequires all indexes to be aligned.Compatible with filtered indexesData may be switched in or out only within the same

filegroup.Is a metadata-only operation requiring a schema

modification lock. This can be blocked by DML operations, which require a schema stability lock.

Is an exceptionally fast way to load or remove a large amount of data from a table!

Page 16: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Creating the partition function

Our hero.

Page 17: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Creating filegroups

We left the Primary FG default on purpose!

Page 18: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Creating the partition schemeThe partition scheme can map each partition to a specific filegroup, or all partitions to the PRIMARY filegroup. Where the

rubber meets the

road.

Page 19: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Query FGs mapped to the partition function via the partition scheme

This gets a little

complicated.

Page 20: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Creating a table on the partition scheme and add some rows.

A partitioned heap: you can totally

do that.

Page 21: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Let’s have a look at that heap.

We’ll use this query again, but not show it on every slide for

obvious reasons.

Page 22: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Adding indexes

Someone’s not in line.

Page 23: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Notice that aligned indexes always have the clustering key

That’s not usually there!

Page 24: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Adding another partition

We now have a full staging

table and empty

partition on dailyFG4

Page 25: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Switching in!Don’t forget to drop

ordersDaily20101230: your staging table is

still there, it’s just empty now.

And you’re gonna have to rebuild that

non-aligned NC if you want it back.

Page 26: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Index

1. A sample case.2. What is partitioning?3. When is partitioning helpful?4. What’s the fine print?5. Revisiting our sample case.

You are here

Page 27: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Is maintenance a significant problem for availability?

YES

• Partitioning may be what you are looking for.

• Keep checking other factors.

NO

• You may have other reasons to partition, but one of its big benefits is to help with this.

Maintenanceincludes index rebuilds, loading data, and deleting data.

Page 28: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Are query patterns defined by regions?

YES

• Finding regions of data which are queried together and have a good partitioning key is important to good query performance.

• This is the basis of partition elimination.

NO

• You may not have a good partitioning key.

• Keep looking at the query patterns for your workload and evaluating different partitioning keys.

Data regions may be dates, integers, codes

Page 29: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Can applications and queries be optimized for partitioning?

YES

• This means you will be able to rewrite some queries and procedures as needed to take advantage of partition elimination.

NO

• If you do not have the ability to tune user and application queries, some will likely perform very poorly.

Some assembly required.

Page 30: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Do you have resources to support the partitioned system?

• Can your disk configuration be optimized?• Is enough buffer pool available for what

will need to be read into memory concurrently?

• Will you be able to tune and configure parallelism appropriately for the workload?

• Do you have a system you can test with a production-like workload, or a suitable rollback plan?

Page 31: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Index

1. A sample case.2. What is partitioning?3. When is partitioning helpful?4. What’s the fine print?5. Revisiting our sample case.

You are here

Page 32: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Editions with partitioning

Enterprise Datacenter Developer Evaluation

Page 33: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Support for HOW MANY partitions?

15,000 partitions are available in SQL 2008 with SP2 applied

SQL Server 2005, 2008, and 2008 R2 (for now) are limited to 1,000 partitions. This is less than 3 years for daily partitioning.

What problems could happen with lots of partitions?

Page 34: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

ParallelismIn 2005, a query touching more than one partition

typically had only one thread per partition.In 2008, the Partitioned Table Parallelism

improvement allows multiple threads to be used on each partition for parallel plans.

Partition 1! Partition

1!Partition

2!Partition

2!Partition

3!

Partition 3!

Page 35: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Lock escalation AUTOLock escalation can be set to AUTO for a table. If the

table is partitioned, locks will escalate to the partition level rather than the table level.

What’s awesome: greater concurrency!

Partition level deadlocks are not awesome. Test your workload (like with any feature).

Page 36: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Partition aware seeksIn SQL 2008, the optimizer has been made more

clever and has a greater chance at achieving partition elimination. This has been done by:Changing the internal representation of a partitioned

table to be more optimized for seeking on the PartitionID (even when the table’s CX is on another column)

A “skip scan” operation has been added to allow the optimizer greater flexibility.

More optimized optimizin.

Page 37: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Be careful with your statisticsStatistics are not maintained per partition, they are

maintained for the entire index or column. Since there is a limit to the number of steps in the histogram, the statistics can become invalid, and on very large tables may take a long time to update.

Filtered statistics can be used to help with this in 2008: you can create new filtered statistics for your new partition.

This sounds like work.

Page 38: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Index rebuilds and compressionIndividual partitions cannot be rebuilt online.The entirety of a partitioned index can be rebuilt

online.Individual partitions can be compressed.

For fact tables with archive data, older partitions can be be rebuilt once with compression. Their filegroups can then be made read-only.

I’d better check my maintenance jobs.

Page 39: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Switching Feature CompatibilityWorks with replication in 2008 and later

Some subscribers can have the partitioning scheme, others don’t have to

This means you can have some subscribers on Standard.Works with Change Data Capture (with some special

steps)Does not work with Change Tracking

@SQLFool replicates her partitioned tables, check out her blog.

Page 40: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

Index

1. A sample case.2. What is partitioning?3. When is partitioning helpful?4. What’s the fine print?5. Revisiting our sample case. You are here

Page 41: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

So, should this client use partitioning?

Page 42: This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported LicenseCreative Commons Attribution-NonCommercial-NoDerivs.

This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License

Resources/ ContactThere is a very large amount of documentation online for horizontal table partitioning. Get my recommendations here:

http://littlekendra.com/resources/partition/

This presentation would not have been possibly without whitepapers and blogs by Kimberly Tripp, Michelle Ufford, and Ron Talmage.

• Twitter: @kendra_little• Email: [email protected]• LinkedIn: http://www.linkedin.com/in/kendralittle