Rasmus Reinholdt

Dos and don’ts of Columnstore indexes

MCITP(BI), MCSE(BI), Managing Consultant, Rehfeld Partners A/S

Agenda

• The basis of xVelocity in-memory technology
  • What’s it all about
  • The compression methods (RLE / Dictionary encoding)
• Maximizing columnstore performance
• Columnstore limitations and how to work around them
  • Columnstore index best practices
  • The limitations of columnstore indexes
  • Working around the limitations
  • How to load data for a columnstore indexed table
• Questions

Demo
• Just to get your attention

The basis of xVelocity in-memory technology

Row storage

Column storage

What’s it all about

• Columnstore indexes can speed up some queries by a factor of 10X to 100X on the same hardware depending on the query and data. These key things make columnstore-based query processing so fast:

• The columnstore index itself stores data in highly compressed format, with each column kept in a separate group of pages.  

• There is a highly efficient, vector-based query execution method called "batch processing" that works with the columnstore index.

• Segment elimination can skip large chunks of data to speed up scans.

• The storage engine pushes filters down into the scans of data.
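To make the column-wise layout concrete, here is a minimal Python sketch. It is not how SQL Server stores pages; the table, the column names and the helper `to_columns` are all hypothetical. It only illustrates why an aggregate over one column touches less data when each column is kept separately.

```python
# Hypothetical fact rows: (date_key, store_key, amount).
rows = [
    (20130101, 1, 10.0),
    (20130101, 2, 20.0),
    (20130102, 1, 5.0),
    (20130102, 2, 15.0),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a column-wise layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


columns = to_columns(rows, ["date_key", "store_key", "amount"])

# Row layout: SUM(amount) has to walk every full row.
total_from_rows = sum(row[2] for row in rows)

# Column layout: the same aggregate reads only the "amount" column.
total_from_columns = sum(columns["amount"])
```

Both totals are the same; the difference is that the column layout never reads `date_key` or `store_key` to compute it.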

Demo
• How to create a columnstore index

xVelocity engine

• The columnstore index uses the xVelocity in-memory engine, also known from the SSAS Tabular model
  • Storage
  • Compression
  • Batch mode processing

Compression

• xVelocity compresses data by a factor of 4 to 200
• Two methods
  • Run-length encoding (RLE)
  • Dictionary encoding

RLE

(Figure: Run-length encoding)
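Run-length encoding can be sketched in a few lines of Python. This is an illustration of the general technique, not the xVelocity implementation; the function names and the sample column are hypothetical. Each run of equal adjacent values is stored once together with its length, so sorted or low-cardinality columns compress best.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal adjacent values into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in pairs for _ in range(length)]


# Hypothetical column values; sorting first gives longer runs and fewer pairs.
quarter = ["Q1", "Q1", "Q1", "Q2", "Q2", "Q1", "Q3"]
encoded_unsorted = rle_encode(quarter)         # 4 pairs
encoded_sorted = rle_encode(sorted(quarter))   # 3 pairs
```

The unsorted column needs four pairs while the sorted one needs three, which hints at why the order in which data is loaded affects compression.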

(Figure: Dictionary encoding)
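Dictionary encoding can be sketched the same way. Again this is a generic illustration under assumed names (`dictionary_encode`, `dictionary_decode`, and the sample `city` column are hypothetical), not the engine's actual format. Each distinct value is stored once in a dictionary, and the column itself holds only small integer ids.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer id.

    Returns (dictionary, ids), where dictionary[i] is the value for id i.
    Ids are assigned in order of first appearance.
    """
    ids_by_value = {}
    ids = []
    for value in values:
        if value not in ids_by_value:
            ids_by_value[value] = len(ids_by_value)
        ids.append(ids_by_value[value])
    dictionary = list(ids_by_value)  # dicts keep insertion order (Python 3.7+)
    return dictionary, ids


def dictionary_decode(dictionary, ids):
    """Look each id up in the dictionary to rebuild the original column."""
    return [dictionary[i] for i in ids]


# Hypothetical string column with few distinct values.
city = ["Aarhus", "Copenhagen", "Aarhus", "Odense", "Copenhagen"]
dictionary, ids = dictionary_encode(city)
```

The id list contains only integers, so it can in turn be run-length encoded; the fewer distinct values a column has, the smaller the dictionary stays.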

Demo
• CPU / RAM usage during index rebuild

Maximizing columnstore performance

The key to getting the best performance is to make sure your queries process the vast majority of the data in batch mode.

Ensuring Batch Mode Query Execution

• Parallelism (DOP >= 2) is required to get batch processing
• Do not use:
  • Outer join
  • IN and EXISTS
  • NOT IN
  • Scalar aggregate
  • UNION ALL (if not using SP1)
  • Nonclustered B-tree indexes
  • Unsupported data types
• Use the HASH JOIN hint to avoid nested loop joins and force batch processing

Demo
• What to look out for

Segment elimination

• A segment is a group of rows, from 1 million to 8 million rows depending on the application.
• Order the data by date to ensure segment elimination on date
• SQL Server eliminates segments based on the min and max data id of each segment when scanning the index
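The effect of ordering on segment elimination can be sketched in Python. This is a simplified model, not SQL Server's implementation: the helpers `build_segments` and `scan_range` are hypothetical, segments hold 4 rows instead of millions, and the sketch compares raw values where the engine compares the min and max data ids stored per segment.

```python
def build_segments(values, segment_size):
    """Split a column into fixed-size segments and record each segment's min and max."""
    segments = []
    for start in range(0, len(values), segment_size):
        chunk = values[start:start + segment_size]
        segments.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return segments


def scan_range(segments, low, high):
    """Return the values in [low, high] and the number of segments actually read."""
    hits, segments_read = [], 0
    for seg in segments:
        if seg["max"] < low or seg["min"] > high:
            continue  # eliminated using the min/max metadata only
        segments_read += 1
        hits.extend(v for v in seg["values"] if low <= v <= high)
    return hits, segments_read


# Hypothetical date keys, loaded in arbitrary order versus ordered by date.
date_keys = [20130105, 20130101, 20130109, 20130103,
             20130107, 20130102, 20130108, 20130104]
unsorted_segments = build_segments(date_keys, 4)
sorted_segments = build_segments(sorted(date_keys), 4)

hits_unsorted, read_unsorted = scan_range(unsorted_segments, 20130101, 20130103)
hits_sorted, read_sorted = scan_range(sorted_segments, 20130101, 20130103)
```

With the unordered load, both segments overlap the requested date range and both must be read. With the ordered load, the second segment's minimum is already above the range, so it is skipped without looking at its rows.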

Columnstore limitations and how to work around them

Following these dos will help you get the most out of columnstore indexes.

DOs
• Put columnstore indexes on large tables only
• Include every column of the table in the columnstore index
• Structure your queries as star joins with grouping and aggregation as much as possible
• All join columns must have the NOT NULL attribute set
• Use surrogate keys and dimension tables; do not use strings in the fact table

Similarly, here is a list of operators to avoid. I will show you how to work around them in a few minutes.

Do not use
• Outer join
• IN
• EXISTS
• NOT IN
• Scalar aggregate
• String filtering

Demo
• How to work around the limitations

Update strategies

• Disable / enable the index
• Partition switching

SQL 2014 – what’s new?

• Columnstore indexes became updatable
• A clustered columnstore index

Evaluation

Create a text message on your phone and send it to 1919 with the content:

DB401 5 5 5 I liked it a lot

The fields are, in order: session code, Rasmus' performance (1 to 5), match of technical level (1 to 5), relevance (1 to 5), comments (optional).

Evaluation scale: 1 = Very bad, 2 = Bad, 3 = Relevant, 4 = Good, 5 = Very good

Questions:
• Speaker performance
• Relevance according to your work
• Match of technical level according to published level
• Comments

© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Thank you