SQL Server 2016 Operational Analytics

Posted on 19-Jan-2016



Strategic sponsors

Silver sponsors

Łukasz Grala Microsoft MVP Data Platform | MCT | MCSE

• Architect - Mentor Data Platform & Business Intelligence Solutions

• Trainer Data Platform and Business Intelligence

• University Lecturer

• Author Webcasts and Publications

• Microsoft MVP Data Platform

• Leader PLSSUG Poznań

• PhD student at Poznan University of Technology, Faculty of Computing Science (topics: database and data warehouse architecture, data mining, machine learning)

lukasz@grala.biz lukasz@sqlexpert.pl

Marcin Szeliga

• Data Philosopher

• BI Expert and Consultant

• Data Platform Architect

• 20 years of experience with SQL Server

• Ph.D. Candidate at Politechnika Śląska

• marcin@sqlexpert.pl

Operational Database Management Systems

Data Warehouse Database Management Systems

Business Intelligence and Analytics Platforms

x86 Server Virtualization

Cloud Infrastructure as a Service

Enterprise Application Platform as a Service

Public Cloud Storage

Leader in 2014 for Gartner Magic Quadrants

Microsoft platform leads the way on-premises and cloud

Hyperscale cloud

Deeper insights across data

Do more. Achieve more.

SQL Server 2016 improvements

Performance
• Operational analytics: insights on operational data; works with in-memory OLTP and disk-based OLTP
• In-memory OLTP enhancements: greater T-SQL surface area, terabytes of memory supported, and greater number of parallel CPUs
• Query data store: monitor and optimize query plans
• Native JSON: expanded support for JSON data
• Temporal database support: query data as points in time

Security
• Always Encrypted: sensitive data remains encrypted at all times with ability to query
• Row-level security: apply fine-grained access control to table rows
• Dynamic data masking: real-time obfuscation of data to prevent unauthorized access
• Other enhancements
  • Audit success/failure of database operations
  • TDE support for storage of in-memory OLTP tables
  • Enhanced auditing for OLTP with ability to track history of record changes

Availability
• Enhanced AlwaysOn
  • Three synchronous replicas for auto failover across domains
  • Round robin load balancing of replicas
  • Automatic failover based on database health
  • DTC for transactional integrity across database instances with AlwaysOn
  • Support for SSIS with AlwaysOn

Scalability
• Enhanced database caching: cache data with automatic, multiple TempDB files per instance in multi-core environments


Mission-critical performance

What does operational mean?

• Refers to operational workload (i.e. OLTP)
• Examples:
  • Enterprise Resource Planning (ERP): inventory, orders, sales
  • Machine data: data from machine operations on the factory floor
  • Online stores (e.g. Amazon, Expedia)
  • Stock/security trades
• Mission critical
  • No downtime (high availability): downtime has an impact on revenue
  • Low latency and high transaction throughput

What does analytics mean?

• Analytics
  • Studying past data (e.g. operational, social media) to identify potential trends
  • Analyzing the effects of certain decisions or events (e.g. an ad campaign)
  • Analyzing past/current data to predict outcomes (e.g. credit score)
• Goals
  • Enhance the business by gaining knowledge to make improvements or changes

Source: MIT Sloan Management Review

Traditional BI architecture

[Diagram: Presentation Layer and Application Tier (IIS Server) over a SQL Server database; ETL (hourly, daily, weekly) to a SQL Server relational DW database; SQL Server Analysis Server; BI and analytics]

Key issues
• Complex implementation
• Requires two servers (CapEx and OpEx)
• Data latency in analytics
• More businesses demand/require real-time analytics

Minimizing data latency for analytics

[Diagram: Presentation Layer and Application Tier (IIS Server) over a SQL Server database, with analytics-specific indexes added; SQL Server Analysis Server; BI and analytics]

Benefits
• No data latency
• No ETL
• No separate DW

Challenges
• Analytics queries are resource intensive and can cause blocking
• How to minimize impact on operational workload
• Sub-optimal execution of analytics on relational schema

This is OPERATIONAL ANALYTICS

SQL Server 2016


Quick Recap: Columnstore Index

Data stored as rows
• Ideal for OLTP
• Efficient operation on small set of rows

Data stored as columns (C1, C2, C3, C4, C5)
• Ideal for DW workload
• Improved compression: data from the same domain compresses better
• Reduced I/O: fetch only the columns needed
• Improved performance: more data fits in memory; optimized for CPU utilization


Clustered Columnstore Performance: TPC-H


Operational Analytics with columnstore index

Key points
• Create an updateable nonclustered columnstore index (NCCI) for analytics queries
• Drop all other indexes that were created for analytics
• No application changes
• Columnstore index is maintained just like any other index
• Query optimizer will choose the columnstore index where needed

[Diagram: relational table (clustered index/heap) with B-tree index and a nonclustered columnstore index (NCCI) that has a delete bitmap and delta rowgroups]
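The key points on this slide can be sketched in T-SQL. This is a minimal sketch for SQL Server 2016, not the presenters' demo: the table dbo.Orders, its columns, and the index name NCCI_Orders are hypothetical names chosen for illustration.

```sql
-- Hypothetical operational (OLTP) table; names and types are assumptions.
CREATE TABLE dbo.Orders (
    order_id     INT           NOT NULL PRIMARY KEY CLUSTERED,
    customer_id  INT           NOT NULL,
    order_date   DATE          NOT NULL,
    order_status VARCHAR(20)   NOT NULL,
    quantity     INT           NOT NULL,
    unit_price   DECIMAL(10,2) NOT NULL
);

-- Updateable nonclustered columnstore index for analytics queries.
-- The OLTP workload keeps using the clustered B-tree index.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
ON dbo.Orders (customer_id, order_date, order_status, quantity, unit_price);

-- A typical analytics query; the application code does not change,
-- and the optimizer decides whether to scan the columnstore index.
SELECT customer_id,
       SUM(quantity * unit_price) AS revenue
FROM dbo.Orders
GROUP BY customer_id;
```

Inserts, updates, and deletes against dbo.Orders maintain the columnstore index automatically, which is the maintenance cost that the following slides measure.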


Minimizing CSI overhead

Key points
• Create the columnstore index only on cold data, using a filter predicate to minimize maintenance
• Analytics queries access both columnstore and 'hot' data transparently
• Example (order management application): CREATE NONCLUSTERED COLUMNSTORE INDEX … WHERE order_status = 'SHIPPED'

[Diagram: DML operations on the HOT rows of a relational table (clustered index/heap) with B-tree index; filtered nonclustered columnstore index (NCCI) with delete bitmap and delta rowgroups]
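A fuller version of the filtered-index example might look like the sketch below. The slide elides the index name and column list, so everything filled in here (table dbo.CustomerOrders, its columns, the index name) is a hypothetical assumption and not the presenters' code.

```sql
-- Hypothetical order management table; names and types are assumptions.
CREATE TABLE dbo.CustomerOrders (
    order_id     INT           NOT NULL PRIMARY KEY CLUSTERED,
    customer_id  INT           NOT NULL,
    order_date   DATE          NOT NULL,
    order_status VARCHAR(20)   NOT NULL,
    quantity     INT           NOT NULL,
    unit_price   DECIMAL(10,2) NOT NULL
);

-- Filtered NCCI: only rows that have reached the final 'SHIPPED' status
-- (cold data) are stored in the columnstore, so frequent changes to
-- open orders do not touch the index.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_CustomerOrders_Shipped
ON dbo.CustomerOrders (customer_id, order_date, order_status, quantity, unit_price)
WHERE order_status = 'SHIPPED';

-- The analytics query is written against the whole table; rows that do
-- not match the filter are read from the row store.
SELECT order_status,
       COUNT(*)      AS orders,
       SUM(quantity) AS units
FROM dbo.CustomerOrders
GROUP BY order_status;
```

The choice of filter column matters: it should identify rows that are no longer expected to change, otherwise rows keep moving into the index and the maintenance saving is lost.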


Operational Analytics with columnstore on In-Memory Tables

• No explicit delta rowgroup
  • Rows (tail) not in columnstore stay in the in-memory OLTP table
  • No columnstore index overhead when operating on the tail
  • Background task migrates rows from tail to columnstore in chunks of 1 million rows not changed in the last 1 hour
• Deleted Rows Table (DRT): tracks deleted rows
• Columnstore data fully resident in memory
• Persisted together with operational data
• No application changes required

[Diagram: in-memory OLTP table with hash index and range index; updateable CCI with DRT; hot tail acting like a delta rowgroup]
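As a rough illustration of this layout, a memory-optimized table with a hash index, a range index, and a clustered columnstore index can be declared in one statement in SQL Server 2016. The sketch assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup; the table, column, and index names and the bucket count are hypothetical.

```sql
-- Assumes the current database already contains a MEMORY_OPTIMIZED_DATA filegroup.
-- All names and the bucket count below are illustrative assumptions.
CREATE TABLE dbo.OrdersInMem (
    order_id    INT           NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576), -- hash index
    customer_id INT           NOT NULL,
    order_date  DATE          NOT NULL,
    quantity    INT           NOT NULL,
    unit_price  DECIMAL(10,2) NOT NULL,
    INDEX ix_OrdersInMem_order_date NONCLUSTERED (order_date),       -- range index
    INDEX cci_OrdersInMem CLUSTERED COLUMNSTORE                      -- updateable CCI
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- Analytics query in interop (interpreted T-SQL) mode.
SELECT customer_id,
       SUM(quantity * unit_price) AS revenue
FROM dbo.OrdersInMem
GROUP BY customer_id;
```

DURABILITY = SCHEMA_AND_DATA is used because the columnstore is persisted together with the operational data, as the slide states.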

Query processing

Demo time

Performance improvements

Scan type                  Elapsed time (s)   Speedup
Row store scan, interop    44.441
Row store scan, native     28.445             1.6x
CSI scan, interop          0.802              55.4x
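The measurements above appear to come from in-memory tables (interop versus natively compiled access). A simpler row store versus columnstore comparison can be tried on a disk-based table with a query hint, as in the sketch below. It is not the presenters' benchmark; the table dbo.SalesFacts and its columns are hypothetical, and the table needs to be loaded with a realistic number of rows before the timings mean anything.

```sql
-- Hypothetical fact table; names and types are assumptions.
CREATE TABLE dbo.SalesFacts (
    sale_id    BIGINT        NOT NULL PRIMARY KEY CLUSTERED,
    product_id INT           NOT NULL,
    quantity   INT           NOT NULL,
    amount     DECIMAL(12,2) NOT NULL
);

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_SalesFacts
ON dbo.SalesFacts (product_id, quantity, amount);

-- (Load test data here before timing the two queries.)

SET STATISTICS TIME ON;

-- Row store scan: the hint tells the optimizer to ignore the NCCI.
SELECT product_id, SUM(amount) AS total_amount
FROM dbo.SalesFacts
GROUP BY product_id
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);

-- Same query without the hint: the optimizer may use the NCCI.
SELECT product_id, SUM(amount) AS total_amount
FROM dbo.SalesFacts
GROUP BY product_id;

SET STATISTICS TIME OFF;
```

The elapsed times reported for the two statements give a speedup figure comparable in spirit to the last column of the table.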

Insert, Update, Delete costs and query time

Operation             Elapsed time (s)   Elapsed time (s)   Increase %   Increase %
                      with CSI           no CSI             update       query
CSI scan, interop     0.802                                              BASE
Insert 400 000 rows   53.5               47.8               11.9%
CSI scan, interop     0.869                                              8.4%
Update 400 000 rows   42.4               28.9               46.7%
CSI scan, interop     1.181                                              47.3%
Delete 400 000 rows   38.3               30.5               25.6%
CSI scan, interop     1.231                                              53.5%

Single thread insert and update

Operation      Rows affected   Row store (s)   Secondary CSI (s)   Primary CSI (s)
1000 updates   10 000          0.893           1.400               6.866
10% insert     18M             233.9           566                 291.4
2% update      3.96M           123.2           314.3               275.9

Single thread scan

                     Millions of rows   Row store   Secondary CSI   Primary CSI
New built            180                99.1        4.7             1.71
After 1000 updates   180                99.4        5.4             1.75
After 10% inserts    198                108.7       14.5            9.5
After 2% updates     198                109.5       16.8            10.0

Comparing performance

Operation                 Billions of values     Billions of values   Speedup
                          per second (no SIMD)   per second (SIMD)
Bit unpacking 6 bits      2.08                   11.55                5.55x
Bit unpacking 12 bits     1.91                   9.76                 5.11x
Bit unpacking 21 bits     1.96                   5.29                 2.70x
Compaction 32 bits        1.24                   6.70                 5.40x
Range predicate 16 bits   0.94                   11.42                5.06x
Sum 16 bit values         2.86                   14.46                5.06x
128-bit bitmap filter     0.97                   11.42                11.77x
64KB bitmap filter        1.01                   2.37                 2.35x

Query performance (1)

Q1-Q4: select count(*) from LINEITEM where <predicate>

Predicate                                          Duration       Duration       Speedup   Billions of rows
                                                   SQL2014 (ms)   SQL2016 (ms)             per second
L_ORDERKEY = 235236                                220            140            1.57x     12.9
L_QUANTITY = 1900                                  664            68             9.76x     26.5
L_SHIPMODE='AIR'                                   694            147            4.72x     12.2
L_SHIPDATE between '01.01.1997' and '01.01.1998'   512            87             5.89x     20.7

Query performance (2)

Q5-Q6: select count(*) from PARTSUPP where <predicate>

Predicate                                           Duration       Duration       Speedup   Billions of rows
                                                    SQL2014 (ms)   SQL2016 (ms)             per second
PS_AVAILQTY < 10                                    50             27             1.85x     8.9
PS_AVAILQTY = 10                                    45             15             3.00x     16

Q7-Q8: select <aggregates> from LINEITEM

Aggregation                                         Duration       Duration       Speedup   Billions of rows
                                                    SQL2014 (ms)   SQL2016 (ms)             per second
avg(L_DISCOUNT)                                     1272           196            6.49x     9.1
avg(L_DISCOUNT), min(L_ORDERKEY), max(L_ORDERKEY)   1978           356            5.56x     5.1
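Written out in full, the LINEITEM query templates from the two tables above expand to the statements below. The table definition is a cut-down stand-in containing only the columns the queries reference. The column types, the clustered columnstore index name, and the ISO date literals (in place of the slide's '01.01.1997' notation) are assumptions; the actual schema and data volume used by the presenters are not given.

```sql
-- Minimal stand-in for the TPC-H LINEITEM table (assumed column types).
CREATE TABLE dbo.LINEITEM (
    L_ORDERKEY BIGINT        NOT NULL,
    L_QUANTITY DECIMAL(15,2) NOT NULL,
    L_DISCOUNT DECIMAL(15,2) NOT NULL,
    L_SHIPMODE CHAR(10)      NOT NULL,
    L_SHIPDATE DATE          NOT NULL
);

CREATE CLUSTERED COLUMNSTORE INDEX cci_LINEITEM ON dbo.LINEITEM;

-- Q1-Q4: select count(*) from LINEITEM where <predicate>
SELECT COUNT(*) FROM dbo.LINEITEM WHERE L_ORDERKEY = 235236;
SELECT COUNT(*) FROM dbo.LINEITEM WHERE L_QUANTITY = 1900;
SELECT COUNT(*) FROM dbo.LINEITEM WHERE L_SHIPMODE = 'AIR';
SELECT COUNT(*) FROM dbo.LINEITEM
WHERE L_SHIPDATE BETWEEN '19970101' AND '19980101';

-- Q7-Q8: select <aggregates> from LINEITEM
SELECT AVG(L_DISCOUNT) FROM dbo.LINEITEM;
SELECT AVG(L_DISCOUNT), MIN(L_ORDERKEY), MAX(L_ORDERKEY) FROM dbo.LINEITEM;
```

All of these are simple filters and aggregates over single columns, the kind of scan-heavy work that the SIMD figures in the "Comparing performance" table apply to.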

Availability Groups as data warehouse

Key points
• Mission-critical operational workloads are typically configured for high availability using AlwaysOn Availability Groups
• You can offload analytics to a readable secondary replica

[Diagram: AlwaysOn Availability Group with one primary replica and three secondary replicas]
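Offloading analytics to a readable secondary is mostly a configuration task. The fragment below is a sketch that assumes an availability group already exists. The group name AG_Sales, the replica names SQLNODE1 and SQLNODE2, and the routing URL are hypothetical placeholders, and the statements are run on the primary replica.

```sql
-- Allow read-only connections on the secondary replica (hypothetical names).
ALTER AVAILABILITY GROUP [AG_Sales]
MODIFY REPLICA ON N'SQLNODE2'
WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));

-- Tell the availability group where read-only clients can reach that replica.
ALTER AVAILABILITY GROUP [AG_Sales]
MODIFY REPLICA ON N'SQLNODE2'
WITH (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://sqlnode2.example.com:1433'));

-- While SQLNODE1 is primary, route read-intent connections to SQLNODE2.
ALTER AVAILABILITY GROUP [AG_Sales]
MODIFY REPLICA ON N'SQLNODE1'
WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST = (N'SQLNODE2')));
```

Analytics clients then connect through the availability group listener with ApplicationIntent=ReadOnly in the connection string, while the operational application keeps writing to the primary.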

Minimizing data latency for analytics

[Diagram: Presentation Layer and Application Tier (IIS Server) over a SQL Server database, with analytics-specific indexes added; SQL Server Analysis Server; BI and analytics; label: High-end Server Hardware]

SSAS Enterprise Readiness: Tabular

New DirectQuery

DirectQuery for Oracle, Teradata, APS

DirectQuery support for MDX queries (Excel tools)
