New for Developers in SQL Server 2008

Introduction to New T-SQL Programmability Features in SQL Server 2008

Aaron Shilo, Database Consultant Oracle & MS Sql Server Certified

050-4477117 * [email protected] * www.dbconsultant.co.il

Who Am I?

Married + 2.5 children

A DBA for 10 years.

Oracle and SQL Server Certified Professional.

Former CTO at John Bryce Training.

Lead my own consulting business.

Advisor to Tapuz.co.il / BezeqInt / Lavie Time-tec and more

Topics

Data Compression

Table value constructor support through the VALUES clause

New date and time data types and functions

The HIERARCHYID data type

Table types and table-valued parameters

The MERGE statement, grouping sets enhancements

Sparse columns

Filtered indexes

Change data capture

Resource Governor


Compression

SQL Server 2008 supports both row and page compression for both tables and indexes.

Data compression can be configured for the following database objects:

A whole table that is stored as a heap.

A whole table that is stored as a clustered index.

A whole nonclustered index.

A whole indexed view.

For partitioned tables and indexes, the compression option can be configured for each partition, and the various partitions of an object do not have to have the same compression setting.
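As a rough sketch of how these options might be applied, the script below creates a small table, estimates the savings, and then rebuilds the table and one index with different compression settings. The table dbo.CompressionDemo and the index name are hypothetical and exist only for illustration; the script assumes an edition that supports compression.

```sql
-- Hypothetical demo table (not part of the original presentation)
CREATE TABLE dbo.CompressionDemo
(
  id   INT       NOT NULL CONSTRAINT PK_CompressionDemo PRIMARY KEY CLUSTERED,
  name CHAR(100) NOT NULL,
  qty  INT       NOT NULL
);

CREATE NONCLUSTERED INDEX IX_CompressionDemo_name
  ON dbo.CompressionDemo(name);

-- Estimate how much space page compression would save before applying it
EXEC sp_estimate_data_compression_savings
  @schema_name      = 'dbo',
  @object_name      = 'CompressionDemo',
  @index_id         = NULL,
  @partition_number = NULL,
  @data_compression = 'PAGE';

-- Page-compress the table (its clustered index)
ALTER TABLE dbo.CompressionDemo
  REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Row-compress the nonclustered index; each index has its own setting
ALTER INDEX IX_CompressionDemo_name ON dbo.CompressionDemo
  REBUILD WITH (DATA_COMPRESSION = ROW);

-- Check the current setting per index and partition
SELECT index_id, partition_number, data_compression_desc
FROM sys.partitions
WHERE object_id = OBJECT_ID('dbo.CompressionDemo');

DROP TABLE dbo.CompressionDemo;
```

For a partitioned object, the REBUILD clause also accepts PARTITION = n, which is how a single partition can be given its own setting.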


Considerations And Limitations

Compression is available only in the SQL Server 2008 Enterprise and Developer editions.

Compression can allow more rows to be stored on a page, but does not change the

maximum row size of a table or index.

A table cannot be enabled for compression when the maximum row size plus the compression overhead exceeds the maximum row size of 8060 bytes.

For example, a table that has the columns c1 char(8000) and c2 char(53) cannot be compressed because of the additional compression overhead. When the vardecimal storage format is used, the row-size check is performed when the format is enabled. For row and page compression, the row-size check is performed when the object is initially compressed, and then checked as each row is inserted or modified.

Compression enforces the following two rules:

An update to a fixed-length type must always succeed.

Disabling data compression must always succeed. Even if the compressed row fits on the page (that is, it is less than 8060 bytes), SQL Server prevents updates that would not fit in the row when it is uncompressed.


Page Compression

Compressing the leaf level of tables and indexes with page compression consists of three

operations in the following order:

Row compression

Prefix compression

Dictionary compression


Prefix Compression

For each page that is being compressed, prefix compression uses the following steps:

For each column, a value is identified that can be used to reduce the storage space for

the values in each column.

A row that represents the prefix values for each column is created and stored in the

compression information (CI) structure that immediately follows the page header.

The repeated prefix values in the column are replaced by a reference to the

corresponding prefix. If the value in a row does not exactly match the selected prefix

value, a partial match can still be indicated.


Prefix Compression

The following illustration shows a sample page of a table before prefix compression.

The following illustration shows the same page after prefix compression. The prefix is

moved to the header, and the column values are changed to references to the prefix.


Dictionary Compression

After prefix compression has been completed, dictionary compression is applied.

Dictionary compression searches for repeated values anywhere on the page, and stores

them in the CI area. Unlike prefix compression, dictionary compression is not restricted to

one column. Dictionary compression can replace repeated values that occur anywhere

on a page. The following illustration shows the same page after dictionary compression.


Row Compression

Enabling compression only changes the physical storage format of the data that is

associated with a data type but not its syntax or semantics. Application changes are not

required when one or more tables are enabled for compression. The new record storage

format has the following main changes:

It reduces the metadata overhead that is associated with the record. This metadata is

information about columns, their lengths and offsets. In some cases, the metadata

overhead might be larger than the old storage format.

It uses variable-length storage format for numeric types (for example integer, decimal,

and float) and the types that are based on numeric (for example datetime and money).

It stores fixed character strings by using variable-length format by not storing the blank

characters.


demo


Table Value Constructor Support through the VALUES Clause

SQL Server 2008 introduces support for table value constructors through the VALUES

clause. You can now use a single VALUES clause to construct a set of rows. One use of this

feature is to insert multiple rows based on values in a single INSERT statement:

INSERT INTO dbo.Customers(custid, companyname, phone, address) VALUES

(1, 'cust 1', '(111) 111-1111', 'address 1'),

(2, 'cust 2', '(222) 222-2222', 'address 2'),

(3, 'cust 3', '(333) 333-3333', 'address 3'),

(4, 'cust 4', '(444) 444-4444', 'address 4'),

(5, 'cust 5', '(555) 555-5555', 'address 5');

Note that even though no explicit transaction is defined here, this INSERT statement is

considered an atomic operation. So if any row fails to enter the table, the entire INSERT

operation fails.


A table value constructor can be used to define table expressions such as derived tables and CTEs, and can be used where table expressions are allowed (such as in the FROM clause of a SELECT statement or as the source table in a MERGE statement). The following example demonstrates using the VALUES clause to define a derived table in the context of an outer SELECT statement:

SELECT * FROM

(VALUES

(1, 'cust 1', '(111) 111-1111', 'address 1'),

(2, 'cust 2', '(222) 222-2222', 'address 2'),

(3, 'cust 3', '(333) 333-3333', 'address 3'),

(4, 'cust 4', '(444) 444-4444', 'address 4'),

(5, 'cust 5', '(555) 555-5555', 'address 5')

) AS C(custid, companyname, phone, address);

The outer query can operate on this table expression like any other table expression, including joins, filtering, grouping, and so on.
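To illustrate the point about operating on such a table expression, here is a small sketch that wraps a VALUES-based derived table in a CTE and then filters and groups it. The city column and its values are made up for the example.

```sql
-- A CTE built on a table value constructor; no base table is needed
WITH C AS
(
  SELECT custid, companyname, city
  FROM (VALUES
          (1, 'cust 1', 'city A'),
          (2, 'cust 2', 'city B'),
          (3, 'cust 3', 'city A')
       ) AS V(custid, companyname, city)   -- column names are assigned here
)
SELECT city, COUNT(*) AS numcusts
FROM C
WHERE custid > 0          -- ordinary filtering works as on any table
GROUP BY city             -- and so does grouping
ORDER BY city;
```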


Demo


Date and Time Data Types

Before SQL Server 2008, date and time improvements were probably at the top of the list of the most requested improvements for SQL Server, especially the request for separate date and time data types, but also for general enhanced support for temporal data.

SQL Server 2008 introduces four new date and time data types: DATE, TIME, DATETIME2, and DATETIMEOFFSET.

The four new date and time data types provide a split between date and time, support for a

larger date range, improved accuracy, and support for a time zone element.

The DATE and TIME data types split the date and time, which in previous versions were consolidated.

The DATETIME2 data type is an improved version of DATETIME, providing support for a larger date

range and better accuracy.

The DATETIMEOFFSET data type is similar to DATETIME2 with the addition of a time zone component.

Table 1 describes the new data types, showing their storage in bytes, date-range support, accuracy, and recommended entry format for literals.

Table 1

Data Type | Storage (bytes) | Date Range | Accuracy | Recommended Entry Format | Example

DATE | 3 | January 1, 0001, through December 31, 9999 (Gregorian calendar) | 1 day | 'YYYY-MM-DD' | '2009-02-12'

TIME | 3 to 5 | N/A | 100 nanoseconds | 'hh:mm:ss.nnnnnnn' | '12:30:15.1234567'

DATETIME2 | 6 to 8 | January 1, 0001, through December 31, 9999 | 100 nanoseconds | 'YYYY-MM-DD hh:mm:ss.nnnnnnn' | '2009-02-12 12:30:15.1234567'

DATETIMEOFFSET | 8 to 10 | January 1, 0001, through December 31, 9999 | 100 nanoseconds | 'YYYY-MM-DD hh:mm:ss.nnnnnnn[+|-]hh:mm' | '2009-02-12 12:30:15.1234567 +02:00'


Note that the format 'YYYY-MM-DD' is language neutral for the new data types, but it is language dependent for the DATETIME and SMALLDATETIME data types. The language-neutral format for those data types is 'YYYYMMDD'.

The three new types that contain a time component (TIME, DATETIME2, and DATETIMEOFFSET) enable you to specify the fractional seconds precision in parentheses following the type name. The default is 7, meaning 100 nanoseconds. If you need a different precision, such as three digits for milliseconds, you must specify it explicitly: DATETIME2(3).

The following code shows an example of using the new types:

DECLARE

@d AS DATE = '2009-02-12',

@t AS TIME = '12:30:15.1234567',

@dt2 AS DATETIME2 = '2009-02-12 12:30:15.1234567',

@dto AS DATETIMEOFFSET = '2009-02-12 12:30:15.1234567 +02:00';

SELECT @d AS [@d], @t AS [@t], @dt2 AS [@dt2], @dto AS [@dto];
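The sketch below extends the example with an explicit precision and with some of the date and time functions that accompany the new types. The variable names and literal values are chosen only for illustration.

```sql
-- Explicit fractional-seconds precision: 3 digits (milliseconds)
DECLARE @dt2 AS DATETIME2(3) = '2009-02-12 12:30:15.1234567';
DECLARE @dto AS DATETIMEOFFSET = '2009-02-12 12:30:15.1234567 +02:00';

-- Split a combined value into its date and time parts
SELECT
  @dt2                  AS dt2_ms,
  CAST(@dt2 AS DATE)    AS date_part,
  CAST(@dt2 AS TIME(0)) AS time_part;

-- Higher-precision counterparts of GETDATE() and GETUTCDATE()
SELECT
  SYSDATETIME()       AS now_local,
  SYSUTCDATETIME()    AS now_utc,
  SYSDATETIMEOFFSET() AS now_with_offset;

-- Re-express a value at another offset, or attach an offset to a DATETIME2
SELECT
  SWITCHOFFSET(@dto, '-05:00')     AS same_instant_other_offset,
  TODATETIMEOFFSET(@dt2, '+02:00') AS dt2_with_offset;
```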


HIERARCHYID Data Type

The new HIERARCHYID data type in SQL Server 2008 is a system-supplied CLR UDT that can

be useful for storing and manipulating hierarchies.

This type is internally stored as a VARBINARY value that represents the position of the current node in the hierarchy (both in terms of parent-child position and position among siblings).

You can perform manipulations on the type by using either Transact-SQL or client APIs to invoke methods exposed by the type.
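As a minimal sketch of invoking the type's methods from Transact-SQL, the batch below builds a four-node hierarchy in a table variable and queries one subtree. The employee names and variable names are invented for the example; note that the method names are case sensitive.

```sql
DECLARE @Employees TABLE
(
  hid     hierarchyid NOT NULL PRIMARY KEY,
  empname VARCHAR(25) NOT NULL
);

-- Build node positions: a root, two children, and one grandchild
DECLARE @root       AS hierarchyid = hierarchyid::GetRoot();
DECLARE @child1     AS hierarchyid = @root.GetDescendant(NULL, NULL);
DECLARE @child2     AS hierarchyid = @root.GetDescendant(@child1, NULL); -- placed after @child1
DECLARE @grandchild AS hierarchyid = @child1.GetDescendant(NULL, NULL);

INSERT INTO @Employees(hid, empname) VALUES
  (@root,       'emp 1'),
  (@child1,     'emp 2'),
  (@child2,     'emp 3'),
  (@grandchild, 'emp 4');

-- Return the subtree of @child1 (a node counts as its own descendant)
SELECT hid.ToString() AS path, hid.GetLevel() AS lvl, empname
FROM @Employees
WHERE hid.IsDescendantOf(@child1) = 1
ORDER BY hid;
```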


demo


Table Types and Table-Valued Parameters

SQL Server 2008 introduces table types and table-valued parameters that help abbreviate your code

and improve its performance. Table types allow easy reuse of table definition by table variables, and

table-valued parameters enable you to pass a parameter of a table type to stored procedures and

functions.

Table Types

Table types enable you to save a table definition in the database and use it later to define table

variables and parameters to stored procedures and functions. Because table types let you reuse a table

definition, they ensure consistency and reduce chances for errors.

You use the CREATE TYPE statement to create a new table type.

Table-Valued Parameters

You can now use table types as the types for input parameters of stored procedures and functions.

Currently, table-valued parameters are read only, and you must define them as such by using the

READONLY keyword.

A common scenario where table-valued parameters are very useful is passing an “array” of keys to a

stored procedure. Before SQL Server 2008, common ways to meet this need were based on dynamic

SQL, a split function, XML, and other techniques. The approach using dynamic SQL involved the risk of

SQL Injection and did not provide efficient reuse of execution plans. Using a split function was

complicated, and using XML was complicated and nonrelational.

In SQL Server 2008, you simply pass the stored procedure a table-valued parameter. There is no risk of SQL

Injection, and there is opportunity for efficient reuse of execution plans.


SQL Server 2008 also enhances client APIs to support defining and populating table-

valued parameters. Table-valued parameters are treated internally like table variables.

Their scope is the batch (procedure, function). They have several advantages in some

cases over temporary tables and other alternative methods:

They are strongly typed.

SQL Server does not maintain distribution statistics (histograms) for them; therefore, they do

not cause recompilations.

They are not affected by a transaction rollback.

They provide a simple programming model.
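The "array of keys" scenario described above might look roughly like the following sketch. All object names (dbo.OrderIDs, dbo.DemoOrders, dbo.GetDemoOrders) are hypothetical, and GO is the batch separator used by tools such as SSMS and sqlcmd rather than a T-SQL statement.

```sql
-- A reusable table type holding an ordered list of keys
CREATE TYPE dbo.OrderIDs AS TABLE
(
  pos     INT NOT NULL PRIMARY KEY,
  orderid INT NOT NULL UNIQUE
);
GO

-- Hypothetical table to query
CREATE TABLE dbo.DemoOrders
(
  orderid INT NOT NULL PRIMARY KEY,
  custid  INT NOT NULL
);
INSERT INTO dbo.DemoOrders(orderid, custid) VALUES
  (10248, 1), (10249, 2), (10250, 1), (10251, 3);
GO

-- The table-valued parameter must be declared READONLY
CREATE PROC dbo.GetDemoOrders(@T AS dbo.OrderIDs READONLY)
AS
SELECT O.orderid, O.custid
FROM dbo.DemoOrders AS O
  JOIN @T AS K
    ON O.orderid = K.orderid
ORDER BY K.pos;
GO

-- Caller: fill a variable of the table type and pass it as the parameter
DECLARE @MyOrderIDs AS dbo.OrderIDs;
INSERT INTO @MyOrderIDs(pos, orderid) VALUES
  (1, 10248), (2, 10250), (3, 10249);

EXEC dbo.GetDemoOrders @T = @MyOrderIDs;
GO

-- Cleanup
DROP PROC dbo.GetDemoOrders;
DROP TABLE dbo.DemoOrders;
DROP TYPE dbo.OrderIDs;
```

No string concatenation is involved, so the procedure text stays fixed regardless of how many keys are passed.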


demo


MERGE Statement

The new MERGE statement is a standard statement that combines INSERT, UPDATE, and DELETE actions as a single atomic operation based on conditional logic. Besides being performed as an atomic operation, the MERGE statement is more efficient than applying those actions individually.

The statement refers to two tables: a target table specified in the MERGE INTO clause and a source table specified in the USING clause. The target table is the target for the modification, and the source table data can be used to modify the target.

The semantics (as well as optimization) of a MERGE statement are similar to those of an outer join. You specify a predicate in the ON clause that defines which rows in the source have matches in the target, which rows do not, and which rows in the target do not have a match in the source. You have a clause for each case that defines which action to take: WHEN MATCHED THEN, WHEN NOT MATCHED [BY TARGET] THEN, and WHEN NOT MATCHED BY SOURCE THEN. Note that you do not have to specify all three clauses, but only the ones you need.

As with other modification statements, the MERGE statement also supports the OUTPUT clause, which enables you to return attributes from the modified rows. As part of the OUTPUT clause, you can invoke the $action function, which returns the action that modified the row ('INSERT', 'UPDATE', 'DELETE').
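A small sketch of the three clauses and the OUTPUT clause working together is shown below. The temporary tables and their contents are made up for the example; the alias merge_action is likewise an arbitrary choice.

```sql
CREATE TABLE #Customers
(
  custid      INT         NOT NULL PRIMARY KEY,
  companyname VARCHAR(25) NOT NULL
);
CREATE TABLE #CustomersStage
(
  custid      INT         NOT NULL PRIMARY KEY,
  companyname VARCHAR(25) NOT NULL
);

INSERT INTO #Customers(custid, companyname) VALUES
  (1, 'cust 1'), (2, 'cust 2'), (3, 'cust 3');
INSERT INTO #CustomersStage(custid, companyname) VALUES
  (2, 'cust 2 (renamed)'), (3, 'cust 3'), (6, 'cust 6');

MERGE INTO #Customers AS TGT
USING #CustomersStage AS SRC
  ON TGT.custid = SRC.custid
WHEN MATCHED THEN                    -- row exists in both: update it
  UPDATE SET TGT.companyname = SRC.companyname
WHEN NOT MATCHED BY TARGET THEN      -- row only in source: insert it
  INSERT (custid, companyname)
  VALUES (SRC.custid, SRC.companyname)
WHEN NOT MATCHED BY SOURCE THEN      -- row only in target: delete it
  DELETE
OUTPUT $action          AS merge_action,
       deleted.custid   AS old_custid,
       inserted.custid  AS new_custid;  -- MERGE must end with a semicolon

DROP TABLE #Customers;
DROP TABLE #CustomersStage;
```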


Demo


Grouping Sets

SQL Server 2008 introduces several extensions to the GROUP BY clause that enable you to

define multiple groupings in the same query. These extensions are: the GROUPING SETS,

CUBE, and ROLLUP subclauses of the GROUP BY clause and the GROUPING_ID function.

The new extensions are standard and should not be confused with the older, nonstandard

CUBE and ROLLUP options.
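To make the new subclauses concrete, here is a sketch that defines several groupings in one query and then uses the new ROLLUP syntax. The table variable and its data are invented for illustration.

```sql
DECLARE @Orders TABLE
(
  orderid INT     NOT NULL PRIMARY KEY,
  custid  CHAR(1) NOT NULL,
  empid   INT     NOT NULL,
  qty     INT     NOT NULL
);

INSERT INTO @Orders(orderid, custid, empid, qty) VALUES
  (1, 'A', 1, 10),
  (2, 'A', 2, 20),
  (3, 'B', 1, 30),
  (4, 'B', 2, 40);

-- Four groupings in one query; GROUPING_ID identifies which one a row belongs to
-- (0 = custid+empid, 1 = custid only, 2 = empid only, 3 = grand total)
SELECT custid, empid, SUM(qty) AS totalqty,
       GROUPING_ID(custid, empid) AS grp_id
FROM @Orders
GROUP BY GROUPING SETS
(
  (custid, empid),
  (custid),
  (empid),
  ()
);

-- New ROLLUP subclause: (custid, empid), (custid), and ()
-- The older nonstandard form would be: GROUP BY custid, empid WITH ROLLUP
SELECT custid, empid, SUM(qty) AS totalqty
FROM @Orders
GROUP BY ROLLUP(custid, empid);
```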


Demo


Sparse Columns

Sparse columns are columns that are optimized for the storage of NULLs.

To define a column as sparse, specify the SPARSE attribute as part of the column definition.

Sparse columns consume no storage for NULLs, even with fixed-length types; however,

when a column is marked as sparse, storage of non-NULL values becomes more expensive

than usual. Therefore, you should define a column as sparse only when it will store a large

percentage of NULLs. SQL Server Books Online provides recommendations for the

percentage of NULLs that justify making a column sparse for each data type.

Querying and manipulation of sparse columns is the same as for regular columns, with one exception described later in this presentation.
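A minimal sketch of defining and querying sparse columns follows. The table dbo.SparseDemo and its columns are hypothetical.

```sql
CREATE TABLE dbo.SparseDemo
(
  keycol INT         NOT NULL PRIMARY KEY,
  col1   VARCHAR(20) NOT NULL,
  col2   INT         SPARSE NULL,   -- sparse columns must allow NULLs
  col3   CHAR(10)    SPARSE NULL
);

INSERT INTO dbo.SparseDemo(keycol, col1, col2, col3) VALUES
  (1, 'a', NULL, NULL),
  (2, 'b', NULL, NULL),
  (3, 'c', 42,   NULL);

-- Sparse columns are queried like any other column
SELECT keycol, col1, col2, col3
FROM dbo.SparseDemo
WHERE col2 IS NOT NULL;

-- The catalog shows which columns are sparse
SELECT name, is_sparse
FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.SparseDemo');

DROP TABLE dbo.SparseDemo;
```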


Demo


Filtered Indexes

A filtered index is an optimized nonclustered index, especially suited to cover queries that

select from a well-defined subset of data. It uses a filter predicate to index a portion of

rows in the table.

A well-designed filtered index can improve query performance, reduce index

maintenance costs, and reduce index storage costs compared with full-table indexes.


Filtered Indexes

Improved query performance and plan quality

A well-designed filtered index improves query performance and execution plan quality because it is smaller than a full-table nonclustered index and has filtered statistics. The filtered statistics are more accurate than full-table statistics because they cover only the rows in the filtered index.

Reduced index maintenance costs

An index is maintained only when data manipulation language (DML) statements affect the data in the index. A filtered index reduces index maintenance costs compared with a full-table nonclustered index because it is smaller and is only maintained when the data in the index is affected. It is possible to have a large number of filtered indexes, especially when they contain data that is affected infrequently. Similarly, if a filtered index contains only the frequently affected data, the smaller size of the index reduces the cost of updating the statistics.

Reduced index storage costs

Creating a filtered index can reduce disk storage for nonclustered indexes when a full-table index is not necessary. You can replace a full-table nonclustered index with multiple filtered indexes without significantly increasing the storage requirements.


Filtered Indexes cont’

Design Considerations:

In order to design effective filtered indexes, it is important to understand what queries your

application uses and how they relate to subsets of your data. Some examples of data that

have well-defined subsets are columns with mostly NULL values, columns with

heterogeneous categories of values and columns with distinct ranges of values. The

following design considerations give a variety of scenarios for when a filtered index can

provide advantages over full-table indexes.

Filtered Indexes for Subsets of Data:

When a column only has a small number of relevant values for queries, you can create a

filtered index on the subset of values. For example, when the values in a column are

mostly NULL and the query selects only from the non-NULL values, you can create a

filtered index for the non-NULL data rows. The resulting index will be smaller and cost less to

maintain than a full-table nonclustered index defined on the same key columns.
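The mostly-NULL scenario above could be sketched as follows. The table dbo.FilteredDemo, its columns, and the index name are assumptions made for the example.

```sql
CREATE TABLE dbo.FilteredDemo
(
  id      INT  NOT NULL PRIMARY KEY,
  enddate DATE NULL,          -- assumed to be NULL for most rows
  qty     INT  NOT NULL
);

-- Index only the rows where enddate is known
CREATE NONCLUSTERED INDEX IX_FilteredDemo_enddate
  ON dbo.FilteredDemo(enddate)
  INCLUDE (qty)
  WHERE enddate IS NOT NULL;

-- A query whose predicate falls inside the filter can be covered by the index
SELECT id, enddate, qty
FROM dbo.FilteredDemo
WHERE enddate IS NOT NULL
  AND enddate >= '20090101';

-- The catalog exposes the filter definition
SELECT name, has_filter, filter_definition
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.FilteredDemo');

DROP TABLE dbo.FilteredDemo;
```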


Demo


Change Data Capture

Change data capture is a new mechanism in SQL Server 2008 that enables you to easily

track data changes in a table. The changes are read by a capture process from the

transaction log and recorded in change tables. Those change tables mirror the columns

of the source table and also contain metadata information that can be used to deduce

the changes that took place. Those changes can be consumed in a convenient relational

format through TVFs.

An extract, transform, and load (ETL) process in SQL Server Integration Services that

applies incremental updates to a data warehouse is just one example of an application

that can benefit from change data capture.
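The following is a rough sketch of enabling change data capture and reading changes through the generated table-valued function. It assumes a test database, an edition that supports the feature, sufficient permissions, and a running SQL Server Agent (the capture process runs as an Agent job). The table dbo.CdcDemo is hypothetical, and the capture instance name dbo_CdcDemo is the default derived from the schema and table names.

```sql
-- Enable CDC for the current database, then for one table
EXEC sys.sp_cdc_enable_db;
GO

CREATE TABLE dbo.CdcDemo
(
  id  INT         NOT NULL PRIMARY KEY,
  val VARCHAR(20) NOT NULL
);
GO

EXEC sys.sp_cdc_enable_table
  @source_schema = N'dbo',
  @source_name   = N'CdcDemo',
  @role_name     = NULL;      -- NULL: no gating role for reading changes
GO

-- Make some changes to capture
INSERT INTO dbo.CdcDemo(id, val) VALUES (1, 'first'), (2, 'second');
UPDATE dbo.CdcDemo SET val = 'changed' WHERE id = 2;
DELETE FROM dbo.CdcDemo WHERE id = 1;
GO

-- The capture process reads the log asynchronously, so allow it some time
WAITFOR DELAY '00:00:10';

DECLARE @from_lsn AS BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_CdcDemo');
DECLARE @to_lsn   AS BINARY(10) = sys.fn_cdc_get_max_lsn();

-- __$operation: 1 = delete, 2 = insert, 4 = update (after image)
SELECT __$operation, __$start_lsn, id, val
FROM cdc.fn_cdc_get_all_changes_dbo_CdcDemo(@from_lsn, @to_lsn, N'all');
```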


Demo


Resource Governor

Resource Governor is a new technology in SQL Server 2008 that enables you to manage

SQL Server workload and resources by specifying limits on resource consumption by

incoming requests. In the Resource Governor context, workload is a set of similarly sized

queries or requests that can, and should be, treated as a single entity. This is not a

requirement, but the more uniform the resource usage pattern of a workload is, the more

benefit you are likely to derive from Resource Governor. Resource limits can be

reconfigured in real time with minimal impact on workloads that are executing.

In an environment where multiple distinct workloads are present on the same server,

Resource Governor enables you to differentiate these workloads and allocate shared

resources as they are requested, based on the limits that you specify. These resources are

CPU and memory.


Types of Resource Issues

Resource Governor is designed to address the following types of resource issues which are commonly

found in a database environment:

Run-away queries on the server. In this scenario a resource intensive query can take up most or all of the

server resources.

Unpredictable workload execution. In this scenario concurrent applications on the same server have

workloads of different size and type. For example, two data warehouse applications or a mix of OLTP and

data warehouse applications. These applications are not isolated from each other and the resulting

resource contention causes unpredictable workload execution.

Setting workload priority. In this scenario one workload is allowed to proceed faster than another or is

guaranteed to complete if there is resource contention. Resource Governor enables you to assign a

relative importance to workloads.

All of the preceding scenarios require the ability to differentiate workloads in some way. Resource Governor

provides:

The ability to classify incoming connections and route their workloads to a specific group.

The ability to monitor resource usage for each workload in a group.

The ability to pool resources and set pool-specific limits on CPU usage and memory allocation. This

prevents or minimizes the probability of run-away queries.

The ability to associate grouped workloads with a specific pool of resources.

The ability to identify and set priorities for workloads.
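As a sketch of how these capabilities fit together, the script below creates a pool, a workload group that uses it, and a classifier function that routes connections by application name. The names ReportPool, ReportGroup, and dbo.DemoClassifier, the limits, and the '%Report%' pattern are all illustrative assumptions; the classifier function has to be created in the master database.

```sql
USE master;
GO

-- A pool with limits on CPU and memory
CREATE RESOURCE POOL ReportPool
WITH
(
  MAX_CPU_PERCENT    = 20,
  MIN_MEMORY_PERCENT = 10,
  MAX_MEMORY_PERCENT = 20
);

-- A workload group with per-request limits, associated with the pool
CREATE WORKLOAD GROUP ReportGroup
WITH
(
  IMPORTANCE               = LOW,
  REQUEST_MAX_CPU_TIME_SEC = 60,
  MAX_DOP                  = 1
)
USING ReportPool;
GO

-- Classifier: decides which group each incoming connection belongs to
CREATE FUNCTION dbo.DemoClassifier()
RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
  DECLARE @grp AS sysname = N'default';
  IF APP_NAME() LIKE N'%Report%'
    SET @grp = N'ReportGroup';
  RETURN @grp;
END;
GO

-- Register the classifier and apply the configuration
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.DemoClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO

-- Monitoring: current statistics per pool and per group
SELECT name, total_cpu_usage_ms FROM sys.dm_resource_governor_resource_pools;
SELECT name, total_request_count FROM sys.dm_resource_governor_workload_groups;
```

Connections that the classifier does not route elsewhere land in the default group, which uses the default pool.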


Resource Governor – Workloads

Ability to differentiate workloads

e.g. app_name, login

Per-request limits

Max memory %

Max CPU time

Grant timeout

Max Requests

Resource monitoring

Memory, CPU, Threads, …

[Diagram: example workloads: Backup, Admin Tasks, OLTP Activity, Ad-hoc Reports, Executive Reports]


Resource Governor – Importance

A workload can have an importance label

Low

Medium

High

Gives resource allocation preference to workloads based on importance

[Diagram: SQL Server resources (memory, CPU, threads, ...) shared by an Admin Workload (Backup, Admin Tasks), an OLTP Workload (OLTP Activity), and a Report Workload (Ad-hoc Reports, Executive Reports); one workload is labeled High]


Resource Governor – Pools

Resource pool: A virtual subset of physical database engine resources

Provides controls to specify

Min Memory %

Max Memory %

Min CPU %

Max CPU %

Max DOP

Resource monitoring

Up to 20 resource pools

[Diagram: the Admin, OLTP, and Report workloads mapped to an Admin Pool and an Application Pool inside SQL Server; limits shown on the pools: Min Memory 10%, Max Memory 20%, Max CPU 20%, and Max CPU 90%]


Resource Governor

Putting it all together

Workloads are mapped to Resource Pools (n : 1)

Online changes of groups/pools

SQL Server 2005 = default group + default pool

Main Benefit

Prevent run-away queries



All sample files, demos, and the PPT:

www.dbconsultant.co.il

THANK YOU!

