Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best...

51
EMC Solutions Group Abstract This document describes various best practices for deploying Microsoft SQL Server with EMC ® VNXseries storage arrays. It distills the results of extensive testing performed by EMC Strategic Solutions Engineering into actionable best practices that can easily be implemented. December, 2011 EMC SOLUTIONS FOR MICROSOFT SQL SERVER WITH EMC VNX SERIES Applied Best Practices Guide

Transcript of Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best...

Page 1: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions Group

Abstract

This document describes various best practices for deploying Microsoft SQL Server with EMC® VNX™ series storage arrays. It distills the results of extensive testing performed by EMC Strategic Solutions Engineering into actionable best practices that can easily be implemented.

December, 2011

EMC SOLUTIONS FOR MICROSOFT SQL SERVER WITH EMC VNX SERIES Applied Best Practices Guide

Page 2: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

2

Copyright © 2011 EMC Corporation. All rights reserved.

Published December, 2011

EMC believes the information in this publication is accurate of its publication date. The information is subject to change without notice.

The information in this publication is provided “as is”. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

VMware, ESX, VMware vCenter, vMotion, and VMware vSphere are registered trademarks or trademarks of VMware, Inc., in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.

EMC Solutions for Microsoft SQL Server with EMC VNX Series

Applied Best Practices Guide

Part Number h8307

Page 3: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

3

Contents

Chapter 1 EMC VNX Series for SQL Server ............................................ 15

Introduction to EMC VNX Series .................................................................... 16

Software suites available ..................................................................................... 16

Software packs available ..................................................................................... 16

Improved system performance ............................................................................. 17

Efficient storage capacity management ................................................................ 17

Continuous availability ........................................................................................ 17

Simple management of storage assets ................................................................. 17

Tight integration with virtual environments .......................................................... 18

Chapter 2 Microsoft SQL Server Best Practices .................................... 19

SQL Server planning considerations ............................................................. 20

Consider high availability technologies for your SQL Server ................................. 20

Consider table and index partitioning .................................................................. 20

Use failover-aware applications ........................................................................... 21

Use Microsoft SQL Server Best Practice Analyzer (BPA) to identify potential issues in the database environment ................................................................................... 21

Make SQL Server part of an Active Directory domain ............................................ 21

Enable SQL Server to keep pages in memory ........................................................ 22

Enable Windows fast file initialization .................................................................. 22

Set the database file sizes and autogrow increments appropriately ..................... 23

Set Autoshrink to OFF on data and log files .......................................................... 24

Use defaults for processors and memory ............................................................. 24

Set the Windows NT File System (NTFS) allocation unit to 64 KB ........................... 24

Do not exceed 80 percent utilization of LUNs ....................................................... 24

Only use hardware that is approved by Microsoft ................................................. 24

When using MSCS, reboot the passive node occasionally .................................... 25

Use quick format for thin pool LUNs on Windows Server 2008 ............................. 25

Chapter 3 Storage Best Practices ........................................................ 27

Storage best practices overview .................................................................... 28

Plan storage layouts for performance and capacity .............................................. 28

Use SQLIOSim.exe to validate storage configuration ............................................ 28

Page 4: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Contents

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

4

Consider Tempdb performance ............................................................................ 28

Size tempdb appropriately ................................................................................... 29

For optimal recovery place database and log files on separate spindles ............... 29

Plan database filegroups based on workload ....................................................... 30

File group planning .............................................................................................. 30

Tempdb ............................................................................................................... 30

User databases .................................................................................................... 30

Log files ............................................................................................................... 30

Determine whether a pool-based LUN or a RAID group LUN is appropriate for the SQL deployment.......................................................................................................... 30

Plan storage operations for minimal disruption .................................................... 32

Do not change the default owner of the pool LUN once it is provisioned ............... 32

Balance SQL Server database and log LUNs between storage processors ............. 33

Name the LUNs for quick identification ................................................................ 33

Align Windows disk partitions for best performance ............................................. 34

Chapter 4 Network Best Practices ........................................................ 35

Network best practices overview ................................................................... 36

Use a dedicated VLAN for cluster heartbeat connectivity ...................................... 36

Use Powerpath for path failover and high-availability ........................................... 36

Use the latest verified HBA / NIC driver ................................................................ 36

Plan SQL Server network connectivity for high speed links ................................... 36

Set network speed and duplexing ........................................................................ 36

Plan for network high availability ......................................................................... 37

Chapter 5 Protection Best Practices ..................................................... 39

SQL Server protection best practices overview .............................................. 40

Use array based replication to offload SQL Server ................................................ 40

Database mirroring .............................................................................................. 41

If clustering is used at the principal site, do not use a witness server ................... 41

Plan for high I/O levels at the mirror site for database mirroring ........................... 41

Verify that interdatabase consistency is not required before implementing database mirroring .............................................................................................................. 41

Methods to sync SQL Server objects if database mirroring is in use ..................... 41

For point-in-time recovery, use database log backups .......................................... 42

When possible, schedule backups for minimal disruption ................................... 42

Do a full backup when you change the database recovery model ......................... 42

Increasing LUN Prefetch settings can improve restore performance ...................... 42

Using VNX file deduplication and compression feature reduces the storage footprint for SQL backups ................................................................................................... 42

Chapter 6 Virtualization Best Practices ................................................ 43

Virtualization best practices overview ........................................................... 44

Align new storage volumes on virtualized servers ................................................ 44

Page 5: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Contents

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

5

Use Virtualization Management Servers ............................................................... 44

Make Virtualization Management Servers highly available ................................... 44

Be aware of virtual machine time considerations ................................................. 44

Chapter 7 FAST Best Practices ............................................................. 45

FAST Suite best practices overview ............................................................... 46

Enable FAST cache on highly active SQL Server databases ................................... 46

For FAST Cache, allow sufficient time for the cache to warm-up ............................ 47

Do not enable FAST Cache for Log LUNs ............................................................... 47

Consider FAST VP for autotiering of data on large SQL databases ......................... 47

Appendix A Performance Monitoring and Tuning .................................... 49

Performance considerations ......................................................................... 50

Windows Performance Monitor ..................................................................... 50

RAID group planning ..................................................................................... 53

RAID level attributes ............................................................................................. 53

Estimating required performance .................................................................. 55

Calculating disk spindle requirements .......................................................... 57

Page 6: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

7

Figures

Figure 1. Microsoft Baseline Configuration Analyzer 2.0 .................................... 21 Figure 2. Select local security policy ................................................................. 22 Figure 3. Perform management tasks ................................................................ 23 Figure 4. Performance comparison .................................................................... 32 Figure 5. Naming LUNs ...................................................................................... 33 Figure 6. RAID overhead effect for random I/O .................................................. 54

Page 7: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

9

Tables

Table 1. Terminology ....................................................................................... 12 Table 2. RAID group LUNs vs. pool-based LUNs ................................................ 31 Table 3. Windows Performance Monitor counters ............................................ 50 Table 4. RAID level performance characteristics ............................................... 54 Table 5. I/O types and performance attributes ................................................. 55 Table 6. Number of spindles required for a series of sample workloads ........... 58

Page 8: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

11

Preface

About this document

Microsoft SQL Server is a relational database management system (RDBMS) that is widely deployed to store, retrieve, and manage application data. Because it is used with a range of applications and each application has different requirements for performance, sizing, availability, recoverability, manageability, etc, it is very important to understand these factors and plan accordingly when deploying SQL Server.

Storage is an important part in any SQL Server deployment. EMC VNX Series unified storage solutions can help in assessment, design, implementation, and management while lowering the implementation risks and costs of an SQL Server environment. The new EMC VNX Series delivers uncompromising scalability and flexibility while providing market-leading simplicity and efficiency to minimize total cost of ownership. SQL Server deployments can benefit from VNX Series features such as:

Improved system performance by extending the cache using Enterprise Flash Drives (EFDs) with EMC FAST Cache.

Fully Automated Storage Tiering for Virtual Pools (FAST VP)

More efficient storage capacity management with Virtual Provisioning™.

Block compression and file deduplication.

Local and remote data protection suites for protection against local failures, outages, and disasters.

This document highlights how the EMC VNX Series unified storage solutions benefit a Microsoft SQL Server database environment. It presents a series of best practices that were discovered and validated during the testing of solutions for using Microsoft SQL Server with EMC VNX Series unified storage.

Audience

This document is intended for IT administrators, database administrators, data architects, and system engineers who have an interest in implementing Microsoft SQL Server on EMC VNX Series Unified Storage systems.

The document is designed for a reader who has a general knowledge of Microsoft SQL Server and EMC Unified storage systems and features.

Scope

It is important to plan a Microsoft SQL Server deployment for scalability while maintaining acceptable performance, high availability, and an efficient mechanism for disaster recovery.

Page 9: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Preface

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

12

This document can be applied in many situations, and is meant to be a collection of recommendations for Microsoft SQL Server storage configuration on EMC VNX.

Note The recommendations in this document have been derived mostly from running a TPCC-like workload. A TPCC-like workload is meant to be representative of an OLTP workload. However, no two workloads are the same. The only way to be sure of the effect of a change in an environment is to test the change in that environment. Microsoft SQL Server is an environment, not an application, so the databases it hosts can vary widely. Therefore, it is possible that a recommendation may help in one environment but have a minimal or negative effect in another. Using best practices is not a valid substitute for proper planning, design, and architecture.

Terminology

Table 1. Terminology

Term Definition

EFD

(Enterprise Flash Drives)

A data storage device that uses nonvolatile semiconductor NAND Flash memory to store data.

Storage pool A container that holds a number of disk drives of the same RAID type.

iSCSI

(Internet SCSI)

A protocol for sending SCSI packets over TCP/IP networks.

LUN

(Logical unit number)

The identifying numbers of a SCSI or iSCSI object that processes SCSI commands. The LUN is the last part of the SCSI address for a SCSI object. The LUN is an ID for the logical unit, but the term is sometimes used to refer to the logical unit itself.

RAID

(Redundant array of inexpensive disks)

A storage method where the data is stored on multiple disk drives to maximize performance and storage capacities, and to provide redundancy and fault tolerance.

RAID 1 RAID method that provides data integrity by mirroring (copying) data to another disk. This RAID type provides the greatest assurance of data integrity at the greatest cost in disk space.

RAID 10 RAID method that provides data integrity by mirroring data to another disk. This method provides better performance than RAID 1 by striping data across mirrored disk pairs.

RAID 5 RAID method where data is written across disks in large stripes. Parity information is stored so that data can be reconstructed if needed. One disk can fail without data loss. Performance is faster for reads but slower for writes.

RAID group A set of physical disks of a particular RAID type on which one or more LUNs are bound. Each RAID group supports only the RAID type of the first LUN bound on it, and any other LUNs bound on it must have the same RAID type. LUNs are distributed equally across all disks in the RAID group.

Page 10: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Preface

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

13

Term Definition

SP

(Storage processor)

A circuit board with memory modules and control logic that manages the storage-system I/O between the host system Fibre Channel adapter and the disk modules.

System database A database installed as part of Microsoft SQL Server. The system databases include master, model, msdb, and tempdb.

User database A non-system database placed on the server after Microsoft SQL Server installation. Examples include an OLTP application database or data warehouse.

VI client The VMware Virtual Infrastructure native client that allows users to administer VMware Virtual Infrastructure functions.

Page 11: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

15

Chapter 1 EMC VNX Series for SQL Server

This chapter presents the following topics:

Introduction to EMC VNX Series .................................................................. 16

Software suites available .............................................................................. 16

Software packs available .............................................................................. 16

Improved system performance ...................................................................... 17

Efficient storage capacity management ......................................................... 17

Continuous availability ................................................................................. 17

Simple management of storage assets ......................................................... 17

Tight integration with virtual environments ................................................... 18

Page 12: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC VNX Series for SQL Server

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

16

Introduction to EMC VNX Series VNX series delivers uncompromising scalability and flexibility for the midtier while providing market-leading simplicity and efficiency to minimize total cost of ownership. Customers can benefit from the new VNX features such as:

Next-generation unified storage, optimized for virtualized applications

Extended cache using Flash drives with FAST Cache and Fully Automated Storage Tiering for Virtual Pools (FAST VP) that can be optimized for the highest system performance and lowest storage cost simultaneously on both block and file

Multiprotocol support for file, block, and object with object access through Atmos™ Virtual Edition (Atmos VE)

Simplified management with EMC Unisphere™ for a single management interface for all NAS, SAN, and replication needs

Up to three times improvement in performance with the latest Intel Xeon multicore processor technology, optimized for Flash

6 gigabit/s SAS back end with the latest drive technologies supported:

3. 5” 100 GB and 200 GB Flash, 3.5” 300 GB, and 600 GB 15k or 10k rpm SAS, and 3.5”, 1 TB, 2 TB, and 3 TB 7.2k rpm NL-SAS

2. 5” 100 GB and 200 GB Flash, 300 GB, 600 GB and 900 GB 10k rpm SAS

Expanded EMC UltraFlex™ I/O connectivity—Fibre Channel (FC), Internet Small Computer System Interface (iSCSI), Common Internet File System (CIFS), Network File System (NFS) including parallel NFS (pNFS), Multi-Path File System (MPFS), and Fibre Channel over Ethernet (FCoE) connectivity for converged networking over Ethernet

The VNX series includes five new software suites and three new software packs, making it easier and simpler to attain the maximum overall benefits

Software suites available VNX FAST Suite—Automatically optimizes for the highest system performance and the

lowest storage cost simultaneously (FAST VP is not part of the FAST Suite for the VNX5100).

VNX Local Protection Suite—Practices safe data protection and repurposing.

VNX Remote Protection Suite—Protects data against localized failures, outages and disasters.

VNX Application Protection Suite—Automates application copies and proves compliance.

VNX Security and Compliance Suite—Keeps data safe from changes, deletions, and malicious activity.

Software packs available VNX Total Efficiency Pack—Includes all five software suites (not available for the

VNX5100).

VNX Total Protection Pack—Includes local, remote and application protection suites

VNX Total Value Pack—Includes all three protection software suites and the Security and Compliance Suite (the VNX5100 exclusively supports this package).

Page 13: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC VNX Series for SQL Server

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

17

Improved system performance The VNX series next-generation storage platform is powered by the quad-core Intel Xeon 5600 series with a 6-Gb/s SAS drive backend. It delivers demonstrable performance improvements over previous generation mid-tier storage systems. The VNX Series is designed to take advantage of the latest innovations in Flash drive technology to maximize performance and efficiency and minimize the cost per gigabyte.

Efficient storage capacity management EMC Virtual Provisioning provides pool-based storage provisioning by implementing pool LUNs that can be either thin or thick. Thin LUNs provide on-demand storage that maximizes storage utilization by allocating storage as it is needed and minimizes over-provisioning of storage capacity to reduce the total cost of ownership (TCO). Thick LUNs provide a way to pre-allocate and dedicate that space for a specific application. Virtual Provisioning also simplifies storage management tasks with the ability to expand pool LUNs (both thin and thick) with a few button clicks. The underlying pools can also be expanded by adding drives nondisruptively when additional storage space is required. This reduces the time and effort required to provision additional storage, and avoids upfront provisioning of storage that may not be needed initially.

The VNX series has built-in features to help ensure redundant or inactive data does not consume valuable storage resources. Block compression, intended for relatively inactive LUNs such as backup copies and static data repositories, automatically compresses data. This allows customers to recapture capacity and reduce the data footprint by up to 50 percent. File-level de-duplication/compression reduces disk space requirements by up to 50 percent by selectively compressing and de-duplicating inactive files. There is minimal system performance overhead because these features operate as background tasks.

Continuous availability The VNX series is architected to provide five 9s availability in mission critical business environments. VNX availability and redundancy features include:

Up to 12.8 GB of mirrored write cache, where each storage processor contains both primary cached data for its LUNs, and a secondary copy of the cache for its peer storage processor.

Battery backup to allow for an orderly shutdown and cache de-staging to the vault disks to ensure data protection in the event of a power failure.

RAID protection levels 0, 1, 1/0, 3, 5 and 6 – all of which can co-exist in the same array simultaneously to match different protection requirements.

Proactive hot sparing enhances system robustness to deliver maximum reliability and availability.

Redundant data paths, power supplies, drive connections, and storage processors- all with non-disruptive field replacement capabilities.

Continuous system monitoring, call-home notification, and advanced remote diagnostics.

Simple management of storage assets EMC Unisphere makes it easy to manage VNX systems from anywhere with a simple, integrated user interface for distributed storage environments. The Unisphere dashboard is a single screen for at-a-glance management and reporting. Unisphere allows administrators to

Page 14: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC VNX Series for SQL Server

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

18

gain instant and actionable knowledge about the status of the entire environment. Unisphere provides single sign-on functionality to automatically discover all VNX, EMC CLARiiON, EMC Celerra, and EMC Recoverpoint SE installations in the environment for seamless configuration.

Tight integration with virtual environments The VNX Series is the ideal mid-tier system for virtualized SQL database environments. Whether the customer environment is VMware, Microsoft Hyper-V, or Xen-based, VNX is fully certified for all supported protocols to ensure successful deployments of virtualized infrastructures through all phases of implementation.

Tight integration of Unisphere with VMware vCenter through VAAI (vStorage APIs for Array Integration) provides both Storage and Virtualization administrators with visibility into the entire environment-end-to-end. Each administrator can use a familiar management interface to view both virtual and physical resources, transparently provision storage, integrate replication, and access and offload all storage functions to the array.

EMC Virtual Storage Integrator (VSI) provides a way to monitor and manage EMC VNX systems in virtual environments. EMC VSI for Hyper-V can be installed on Microsoft System Center Virtual Manager and provides administrator the ability to manage both Hyper-V and EMC VNX from one administrative console. Similarly, EMC VSI for VMware vSphere can be installed in VMware vSphere client to manage both VMware and EMC VNX form one administrative console.

Page 15: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

19

Chapter 2 Microsoft SQL Server Best Practices

This chapter presents the following topics:

SQL Server planning considerations ........................................................... 20

Consider high availability technologies for your SQL Server .......................... 20

Consider table and index partitioning ........................................................... 20

Use failover-aware applications .................................................................... 21

Use Microsoft SQL Server Best Practice Analyzer (BPA) to identify potential issues in the database environment ............................................................. 21

Make SQL Server part of an Active Directory domain ..................................... 21

Enable SQL Server to keep pages in memory ................................................ 22

Enable Windows fast file initialization .......................................................... 22

Set the database file sizes and autogrow increments appropriately ............. 23

Set Autoshrink to OFF on data and log files ................................................... 24

Use defaults for processors and memory ...................................................... 24

Set the Windows NT File System (NTFS) allocation unit to 64 KB ................... 24

Do not exceed 80 percent utilization of LUNs ................................................ 24

Only use hardware that is approved by Microsoft ......................................... 24

When using MSCS, reboot the passive node occasionally............................. 25

Use quick format for thin pool LUNs on Windows Server 2008 ...................... 25

Page 16: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Microsoft SQL Server Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

20

SQL Server planning considerations Good planning prior to deployment is crucial for a successful SQL Server environment. The SQL Server environment should be scalable, highly available, predictable, and manageable during daily operations.

The following best practices give important guidelines for planning SQL Server settings, subsequent chapters cover other aspects of the environment.

Consider high availability technologies for your SQL Server SQL Server is a mission-critical environment, and requires high-availability. Clustering technologies help protect the SQL Server environment from any network or hardware failures on the server.

Microsoft clusters provide a highly available environment to protect against Microsoft SQL Server failures due to hardware, operating systems, device drivers, or applications. If one node in a cluster fails, the service fails over to another node, which provides service in place of the failed node. Getting Started with SQL Server 2008 R2 Failover Clustering available from Microsoft TechNet provides more information about Microsoft Clustering.

VMware clustering capabilities also provide high availability for SQL Server virtual machines by monitoring server failure and automatically starting them on another server if necessary. VMware HA clusters provides uniform, cost-effective failover protection against hardware and operating system failures. Availability Guide for Deploying SQL Server on VMware® vSphere on the VMware website provides more information about VMware HA.

For the highest level of protection against server failure, deploy SQL Server in a cluster environment.

Consider table and index partitioning Partitioning breaks up database objects by allowing subsets of data to reside on separate file groups. In a SQL environment, this ability is beneficial in several ways:

Improved manageability: Partitioning makes large tables or indexes more manageable. Maintenance operations can be performed on subsets of data, and target only the required data that instead of the whole table. This shortens maintenance windows, improves backup coverage, and reduces backup storage requirements.

Reduced costs: Some environments have very large tables that contain historical data. This data may be less valuable than current data, but is still required. Partioning allows the movement of individual sections of the table to lower-cost storage without impacting the whole table Consider FAST VP for autotiering of data on large SQL databases provides more information about additional cost reduction.

Improved availability: Partitioning enables database partial availability, which can reduce downtime due to planned and unplanned events. It also enables partial restores to recover a section of a table without impacting the rest of the table.

Table and index partitioning improves manageability, reduces costs, and improves availability, but partitioning must be planned and implemented based on the specific environment requirements. Improper planning or implementation may result in negative impacts on the SQL environment.

Partitioned Table and Index Strategies Using SQL Server 2008 available on Microsoft TechNet provides more information about partitioning tables and indexes.

Page 17: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Microsoft SQL Server Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

21

Use failover-aware applications When a Microsoft SQL Server failover occurs with MSCS / Failover clustering, Database Mirroring, or other technologies, all database connections are lost and “in-flight” transactions are rolled back. To minimize data loss, all applications should be failover-aware, and have reconnect/retry logic. Therefore, in a failover situation, the application will attempt to reconnect, and retry the interrupted transaction once the connection is re-established.

Use Microsoft SQL Server Best Practice Analyzer (BPA) to identify potential issues in the database environment

After installing the Microsoft SQL Server, run the Microsoft SQL Server Best Practice Analyzer (BPA) tool to identify any potential issues in the database environment. Microsoft SQL Server Best Practice Analyzer (BPA) is diagnostic tool that gathers information from the Microsoft Windows and SQL Server configuration settings. This tool determines the overall health of the SQL Server environment by verifying important settings are configured according to Microsoft-recommended best practices, indicates settings that are different from the recommended practices, reports potential problems, and suggests solutions.

Figure 1. Microsoft Baseline Configuration Analyzer 2.0

The Microsoft SQL Server 2008 R2 Best Practices Analyzer page at the Microsoft Download Center provides more information about the SQL Server 2008 R2 Best Practices Analyzer

The Microsoft SQL Server 2005 Best Practices Analyzer page at the Microsoft Download Center provides more information about the SQL Server 2005 Best Practice Analyzer.

Make SQL Server part of an Active Directory domain Security and account management in Active Directory is more secure, mature, and robust than SQL Server user authentication. Active Directory also allows centralized account administration for the entire environment to reduce administration overhead. Therefore, it is highly recommended to make SQL Server part of Active Directory domain and use Windows authentication mode. Choosing an Authentication Mode available from Microsoft TechNet, provides more information.

Page 18: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Microsoft SQL Server Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

22

Use SQL Server as a domain controller only in certain unusual circumstances. The SQL Server performance can reduce if SQL Server is a domain controller.

Enable SQL Server to keep pages in memory Microsoft SQL Server dynamically allocates and deallocates memory based on the current state of the server to prevent memory pressure and swapping. However, if a process tries to use a substantial amount of memory, SQL Server may not be able to respond quickly enough and the operating system may swap some SQL Server memory to the disk. However, in response to the newly created memory pressure, it is likely the memory that was swapped to the disk contains the memory that SQL Server will soon deallocate to decrease its memory use.

It is recommended to prevent SQL Server from swapping memory. This is known as “Locking pages in RAM”. To lock pages in RAM, select the “Lock pages in memory” policy on the account that is running the Microsoft SQL Server service. How to: Enable the Lock Pages in Memory Option (Windows) available on the Microsoft website, provides more information about locking pages in memory.

Enable Windows fast file initialization When Microsoft SQL Server creates or expands a file, the file must be initialized. Earlier versions of SQL Server had only one option to initialize the space by writing all zeros. This causes a substantial performance impact on the storage system. Microsoft SQL Server 2005 and later versions support fast file initialization, which sets a file end pointer. This operation is nearly instantaneous and minimizes the performance impact. To enable Fast file initialization at the OS level, add the “Perform volume maintenance tasks” policy to the account that is running the Microsoft SQL Server service. By default, this policy is granted to administrators.

To enable fast file initialization on a SQL Server administrator account, complete the following steps:

1. Navigate to Administrative Tools > Local Security Policy. The Local Security Policy page

appears.

2. Select User Right Assignment under Local Policies.

Figure 2. Select local security policy

Page 19: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Microsoft SQL Server Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

23

3. Double-click Perform volume management tasks. The Perform volume management properties dialog box appears.

Figure 3. Perform management tasks

4. Click Add User or Group. The Add User window appears.

5. Add the SQL Server administrator account.

6. Restart the SQL services. The Fast file initialization is enabled automatically.

Database File Initialization available from Microsoft TechNet provides more information.

Set the database file sizes and autogrow increments appropriately Microsoft SQL Server supports the ability to automatically grow both data and log files. However, do not consider this as a method for database sizing. It is a best practice to set the file sizes appropriately and grow them manually during minimal system use, on a planned basis. Use Autogrowth only as a safe option to prevent the files from becoming full and making the database read-only when substantial, unpredicted growth occurs.

When database files are expanded, there is an impact on performance. This impact is minimized but not eliminated through fast file initialization (Enable Windows fast file initialization provides more information). Additionally, set the file autogrowth increments considering that the time taken for the growth to occur is short enough to minimize its impact on performance, but large enough to prevent many small allocations that provide file fragmentation. An adequate increase in file size that prevents fragmentation usually impacts the performance of the database. Hence, it is recommended to change file sizes during periods of lower activity on the database.

Log files have an additional issue because there are virtual log files within a physical log file, and a virtual log file cannot span file growth increments. Thus, if the log file is set to grow in 1 MB increments, the virtual log file will not exceed 1 MB. This will have a performance impact due to the impact of file expansion. This limit may also make certain transactions impossible to complete.

Page 20: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Microsoft SQL Server Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

24

Because of these performance implications, it is recommended to use an absolute growth increment (MB or GB) for all the files instead of a percentage growth.

Set Autoshrink to OFF on data and log files Autoshrink is a simple mechanism to reduce the size of the database and log file space allocated on the disks. This feature does not consider other aspects of the SQL Server before reducing the size. In many cases, causes significant performance problems. Therefore, it is more beneficial to turn off this feature on the SQL Server. Considerations for the "autogrow" and "autoshrink" settings in SQL Server on the Microsoft Support Knowledgebase provides more information.

Use defaults for processors and memory When Microsoft SQL Server is first installed, most of its tunable parameters are set to automatic. It is recommended to retain the default values of these parameters on the server dedicated for SQL Server. Change the default parameters if there are other workloads running on the same server or if there are issues due to the default values.

By default, SQL Server runs at a standard priority and ensures that all processors in the system are available for use. Also, SQL Server uses memory based on the requirement until a memory pressure starts. If other processes start consuming memory, SQL Server decreases its memory footprint appropriately to decrease the possibility of a memory swap.

Set the Windows NT File System (NTFS) allocation unit to 64 KB When NTFS volume allocates space for SQL Server data files, the smallest chunk of data it can allocate is the allocation unit size. SQL Server stores one NTFS data volume as extent. An extent is eight physically contiguous SQL Server pages, which is 64 KB.

To format a new NTFS volume using Disk Administrator for Microsoft SQL Server data files, Microsoft and EMC recommends using a 64 KB block size. SQL Server Best Practices Article available from Microsoft TechNet provides more information.

Do not exceed 80 percent utilization of LUNs If NTFS volume needs additional space and that space is not available, NTFS performance may degrade. This situation may spur additional performance degradation by creating excessive disk fragmentation.

If the LUN that is servicing Microsoft SQL Server reaches 80 percent utilized drive capacity, perform at least one of following:

Remove unnecessary data from the disk.

Move some of the data to disks with more space.

Add more disk space.

This recommendation provides protection against application failure if there is an unexpected growth in the database. Therefore, for best performance and protection against unexpected growth, the utilized drive (NTFS formatted) capacity must not exceed 80 percent.

Only use hardware that is approved by Microsoft Using hardware that is on the Windows Hardware Compatibility List (WHCL) decreases the possibility of compatibility problems and increases the level of support that Microsoft will provide if a problem occurs.

Note In the case of using Windows on Hypervisors, use only hardware that is validated by Microsoft Server Virtualization Validation Program (SVVP).

Page 21: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Microsoft SQL Server Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

25

When using MSCS, reboot the passive node occasionally Configuration changes such as disk configuration changes may not be detected by the passive node until a reboot is completed. If the passive node has not detected these changes, then a cluster failover may not succeed.

Use quick format for thin pool LUNs on Windows Server 2008 It is important to use quick format option (/Q) while formatting any thin LUN provisioned from EMC VNX to the Windows Server 2008 host. When a non-quick format option is used, every block on the file system is overwritten with zeros. From virtual provisioning perspective, this causes a thin LUN to become fully allocated within its respective thin pool. A quick format allocates only a small amount of space within the pool.

However, it is not mandatory to use this option in Windows 2003. Even when a format command is executed without the quick format option, the actual allocation does not differ. The non-quick format operation simply adds an additional scan of the disk device in search of media errors.

Page 22: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

27

Chapter 3 Storage Best Practices

This chapter presents the following topics:

Storage best practices overview ................................................................. 28

Plan storage layouts for performance and capacity ....................................... 28

Use SQLIOSim.exe to validate storage configuration..................................... 28

Consider Tempdb performance ..................................................................... 28

Size tempdb appropriately ............................................................................ 29

For optimal recovery place database and log files on separate spindles ....... 29

Plan database filegroups based on workload ............................................... 30

File group planning ....................................................................................... 30

Tempdb ......................................................................................................... 30

User databases ............................................................................................. 30

Log files ........................................................................................................ 30

Determine whether a pool-based LUN or a RAID group LUN is appropriate for the SQL deployment ...................................................................................... 30

Plan storage operations for minimal disruption ............................................ 32

Do not change the default owner of the pool LUN once it is provisioned ....... 32

Balance SQL Server database and log LUNs between storage processors ..... 33

Name the LUNs for quick identification ......................................................... 33

Align Windows disk partitions for best performance ..................................... 34

Page 23: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Storage Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

28

Storage best practices overview A high-performance storage subsystem is crucial for a successful SQL Server deployment. Plan the storage configuration ahead of time to meet the performance and capacity requirement of current and future growth. This chapter details recommendations for the configuration of EMC VNX platforms to use with Microsoft SQL Server.

Plan storage layouts for performance and capacity The storage subsystem is one of the most critical components of the SQL Server environment. The storage system must be properly sized for current and future SQL Server workloads to maintain optimum performance. A common error during storage planning for Microsoft SQL Server is designing for storage capacity only. Over the past 20 years, increases in storage capacity have outpaced increases in performance by 1000:1 in disk drives. Performance, not storage capacity is usually the limiting factor for disk drives.

Three aspects must be considered, in the priority shown below during storage planning:

1. Size the storage system to meet the performance need

2. Size the storage system to meet the capacity need.

3. Select the appropriate disk and RAID types.

Chapter 3 Storage best practices provides information about additional storage planning guidelines.

Use SQLIOSim.exe to validate storage configuration All changes to the storage system must be validated before using them with SQL Server to reduce the possibility of errors. This includes changes to the disk array, storage network, or any software on the I/O path.

The Microsoft SQLIOSim utility validates the reliability and integrity of the storage subsystem. SQLIOSim simulates the following typical user and system SQL Server activities such as, read, write, checkpoint, backup, sort, and read-ahead operations on the storage system.

It is also designed to read pages of written data and validate them with a checkum algorithm. SQLIOSim is not a benchmarking tool for storage planning. It is an IO subsystem validation tool.

Microsoft SQLIOSim does not require SQL Server to be installed on the system. Download and install SQLIOSim.exe from Microsoft to validate the storage subsystem for errors prior to installing SQL Server or deploying a database. How to use the SQLIOSim utility to simulate SQL Server activity on a disk subsystem available on Microsoft website provides more information about SQLIOSIM.

Note: SQLIOSIM validation will inflate thin storage pools

Consider Tempdb performance Tempdb is a global resource system database available to all users connected to a particular instance of the SQL Server. It is used by all databases within the instance to hold all temporary data, such as temporary user objects that are explicitly created, and internal objects created by SQL Server database engine. Tempdb can be very IO intensive.

When tempdb and user databases are placed on same set of physical spindles, it can cause disk contention and degrade overall SQL Server performance.

Page 24: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Storage Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

29

EMC recommends monitoring tempdb and placing it on EMC VNX with appropriate storage sizing. The storage sizing can follow the same principles as designing storage for user databases.

Size tempdb appropriately Each time the Microsoft SQL Server service is started, tempdb is dropped and re-created with its initial parameters. If tempdb is initially 128 MB and autogrows to 4 GB during operations, it will be 128 MB again after a restart. Database performance will suffer as tempdb autogrows to 4 GB again. Size tempdb appropriately for the environment to minimize the performance impact:

Start with a reasonable size tempdb for the size of databases that are in the same

SQL Server instance.

For example, a 1 GB tempdb is a reasonable starting point for instance databases with a cumulative total size between 10 GB and 100 GB, but not for 1 TB. A general rule is to set tempdb to be between 1% and 10% of the cumulative total size of all the instance databases

Set a valid autogrow increment allows tempdb to grow without heavy fragmentation.

Set the autogrow to 10 percent to 20 percent of the initial tempdb size. Do not use a percentage for the growth parameter; calculate the growth in MB to correspond to the percentage. Verify fast file initialization is enabled. Enable Windows fast file initialization and Set the database file sizes and autogrow increments appropriately provide more information.

Periodically verify the size and utilization of the tempdb database to observe database growth.

Reset the size of the tempdb database to something close to its current size before a shutdown.

If the tempdb database from the previous example grew from 1 GB to 5 GB, reset it to start at 5 GB, unless the new size is obviously excessive. For example, if the total size of the user databases was 10 GB, and tempdb was 15 GB, this seems excessive. It is possible for an odd set of scenarios to come together to cause uncharacteristic tempdb growth. If this happens, set the starting size to something smaller than the current size. If tempdb repeatedly grows larger than expected, it is possible the initial size allocated for tempdb is too small. From here, a DBA could diagnose what is causing the excessive growth and then determine if it is valid, or if anything needs tuning.

Place tempdb on its own set of spindles.

Consider Tempdb performance, Appendix A, and Optimizing tempdb Performance available in the Microsoft MSDN Library, provide more information.

For optimal recovery place database and log files on separate spindles Every SQL Server transaction is written to the log files, and then written to the data files. If database data files and log files are on separate physical spindles, it prevents the loss of the log and data files at the same time due to loss of multiple drives.

If the data files are lost, restore them from a backup and play back the latest log files from the other set of spindles.

This best practice provides an additional recovery mechanism, but testing shows that there is little to no performance benefit on EMC VNX.

Page 25: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Storage Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

30

Plan database filegroups based on workload SQL Server data files are divided into primary and secondary files. There is one primary file, and one or more optional secondary files. Filegroups help administrators organize and store data files in different locations. Since filegroups can be accessed in parallel, place each filegroup on a different set of spindles to improve SQL Server database performance Filegroups also allow the SQL Server database to be restored in stages by piecemeal restore. File group planning provides more information.

In the planning stage, organize SQL Server data files with similar performance and protection needs into a filegroup. Using Files and Filegroups in the Microsoft MSDN Library provides more information.

File group planning The number of data files per filegroup (or database, if a single filegroup is used) is dependent on a number of factors.

Tempdb In SQL Server 2000, the recommendation for tempdb was to have one file for every CPU core. Therefore, if you have a machine with four CPU sockets and use dual-core CPUs, tempdb should be broken into eight files. This is primarily because of an issue with contention in the GAM and SGAM areas of the files. Contention has been decreased substantially in SQL Server 2005 and later versions, but it is still a possibility because tempdb is used more heavily than in previous versions, because of new features such as row versioning. There is a cost associated in having multiple files per filegroup. This is because SQL Server stripes data across all files in a given filegroup (proportional fill) and accesses all the files simultaneously. If the files are located on LUNs that are on the same RAID group (set of spindles), head movement is induced, which increases latency and decreases throughput. Therefore, the decreased contention of multiple files must be balanced against the increased I/O load. A good midway starting point is to split tempdb into a number of files equal to half the number of CPU cores.

Concurrency enhancements for the tempdb database and "Possible thread starvation detected" message may be reported in the SQL Server error log when all worker threads for a particular UMS scheduler are in a wait state in SQL Server in the Microsoft Support Knowledgebase discuss this in more detail.

User databases For user databases, a similar decision must be made and similar criteria should be used. However, the contention of user databases is usually much lower. Therefore, the administrator must start with a smaller number of files (for example, one or two) and then increase, if necessary.

Log files Increasing the number of files available to a SQL Server database does absolutely nothing for performance. If a database has two log files, Microsoft SQL Server will fill the first log file before beginning to use the second log file. Therefore, the only use of a second log file is to expand a database’s logs to a new volume.

Determine whether a pool-based LUN or a RAID group LUN is appropriate for the SQL deployment

With EMC VNX platforms, there is flexibility to select either virtually provisioned pool LUNs (thick LUNs and thin LUNs) or RAID group (traditional) LUNs to provision storage.

Pool-based provisioning provides several functional benefits compared to traditional provisioning. Some of the benefits include Fully Automated Storage Tiering (FAST), efficient

Page 26: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Storage Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

31

space utilization with thin provisioning, simple and easy management, and compression. A pool LUN can either be thick or thin. It is possible to use a thin LUN to present more storage to an application than to allocate it physically. The physical storage is assigned to the server in a capacity-on-demand fashion from a shared pool. With thick LUNs, all the storage is reserved at the time of creation. Therefore, the host-reported capacity is equal to the actual storage allocated.

Table 2 compares RAID group and pool-based LUNs.

Table 2. RAID group LUNs vs. pool-based LUNs

Feature RAID group LUN Pool-based LUN

Maximum drives 16 per RAID group All drives in the array (except vault drives and hot spares)

Drive types Single drive type per RAID group

Multiple drive types can be mixed in a pool

FAST Not supported Supported

Virtual Provisioning Not supported Supported with thin LUNs

Space utilization Poor Excellent with thin LUNs

LUN expansion Possible with MetaLUN, complex

Simple, nondisruptive, performed easily

Compression Not supported Supported with thin LUNs

Performance High Thick LUN - High but less than RAID group LUN Thin LUN - Less compared to Thick LUN

Because of the many functional benefits offered by the pool LUNs, it is recommended to use pool-based LUNs, either thick or thin according to the requirements of the environment. However, if milliseconds of performance is critical in the environment and the performance requirements outweigh the functional benefits, use the RAID group LUNs because they provide the best performance.

EMC CLARiiON Virtual Provisioning—Applied Technology white paper available on the EMC Online Support website, provide more information on pool LUNs and RAID group LUNs.

Figure 4 shows the performance difference between the thick pool LUNs, thin pool LUNs, compressed thin LUNs, and RAID group LUNs. These results are from the validated solution for Microsoft SQL Server 2008 using EMC VNX5700 and Windows Server 2008. The workload used in these tests is SQL OLTP (TPC-C like).

Page 27: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Storage Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

32

Figure 4. Performance comparison

In the tested environment, the RAID group LUNs performed nearly 16 percent better than thick pool LUNs. Thin pool LUNs showed minor decrease in performance when compared to thick LUNs, while compressed LUNs showed a marginal 4.5 percent decline.

Plan storage operations for minimal disruption Some operations on the storage array can consume resources and may cause an impact on the production system if executed during a heavy production load. For this reason, schedule storage-based operations that may consume resources during non-peak hours to minimize the potential for such occurrences

Do not change the default owner of the pool LUN once it is provisioned Changing a pool LUN’s ownership may adversely affect the performance because even after changing the ownership, a pool LUN’s private information remains under the control of the original owning storage processor (SP). This results in both SPs being used to handle the I/Os. Involving both SPs increases the time and resource used to complete an I/O of a particular LUN.

If a host path failure results in some LUNs trespassing in a shared pool, repair the failure quickly and return the ownership of those trespassed LUNs to their default SP. The ALUA feature of CLARiiON will eventually link both the trespassed pool LUNs and their private information under the sole control of the new owning SP. However, response time will be higher until this occurs.

To change the SP ownership after the LUN is created, use LUN migration to migrate the LUN to a new LUN with the required SP. The EMC CLARiiON Asymmetric Active/Active feature—white paper available on the EMC Online Support website provides more details on the ALUA feature.

Page 28: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Storage Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

33

Balance SQL Server database and log LUNs between storage processors When one EMC VNX storage processor is overloaded, performance of the LUNs owned by that SP is dregaded due to resource contention. To provide optimum performance, LUNs that contain SQL Server database and log data must be balanced between the SPs so that a single SP is not overloaded. This provides optimum performance to the SQL Server storage environment.

Allocating LUN ownership must be based on LUN resource utilization rather than the number of LUNs. For more details on how to change LUN ownership, refer to Do not change the default owner of the pool LUN once it is provisioned.

Name the LUNs for quick identification Naming the LUNs enables quick identification as shown in the following example. This is more useful when several LUNs are created in the storage array for different applications.

Figure 5. Naming LUNs

Proper naming convention enables users to quickly identify and fix any potential errors.

Page 29: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Storage Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

34

Align Windows disk partitions for best performance When Windows create an NTFS volume on top of the EMC VNX LUN, it always creates an internal structure called master boot record (MBR) at the beginning of the LUN. The MBR uses hidden sectors of the LUN and it is default to 63 sectors. When a Windows partition is created, it is created starting at the 64th sector. This misaligns the partition with the storage system, which can cause the I/O operation to straddle stripe element boundaries. This causes additional I/O to the storage system and can cause significant reduction in performance of the SQL Server I/O.

Note: Using diskpar and diskpart to Align Partitions on Windows Basic and Dynamic Disks — white paper available on the EMC online support website, and Disk Partition Alignment Best Practices for SQL Server in the Microsoft MSDN Library provide more information.

Therefore, it is recommended to align the drives so that the performance improves by as much as 40 percent when Diskpart is used on the LUNs prior to SQL Server deployment. EMC recommends aligning the drives (with 128 sectors) with the CLARiiON stripe size of 64 KB. Windows Server 2008 aligns disk partitions out-of-the box, however, this has to be accomplished separately for the earlier versions of Windows operating systems.

The following example uses diskpart on disk 4:

C:\>Diskpart

Microsoft DiskPart version 5.2.3790.1830

Copyright (C) 1999-2001 Microsoft Corporation.

On computer: JC27Q91X32

DISKPART> list disk

Disk ### Status Size Free Dyn Gpt

-------- ---------- ------- ------- --- ---

Disk 1 Online 136 GB 112 GB

Disk 2 Online 267 GB 0 B

Disk 3 Online 267 GB 0 B

Disk 4 Online 600 GB 600 GB

DISKPART> select disk 4

Disk 4 is now the selected disk.

DISKPART> create partition primary align=64

DISKPART succeeded in creating the specified partition.

Use Microsoft Disk Manager to select the drive letter or mount point to associate with the corresponding LUN. After selecting this information, format the NTFS drive at 64 KB Allocation Unit Size (Set the Windows NT File System (NTFS) allocation unit to 64 KB provides more information) for database data and log files.

Page 30: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

35

Chapter 4 Network Best Practices

This chapter presents the following topics:

Network best practices overview ................................................................ 36

Use a dedicated VLAN for cluster heartbeat connectivity .............................. 36

Use Powerpath for path failover and high-availability ................................... 36

Use the latest verified HBA / NIC driver ......................................................... 36

Plan SQL Server network connectivity for high speed links ........................... 36

Set network speed and duplexing ................................................................. 36

Plan for network high availability .................................................................. 37

Page 31: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Network Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

36

Network best practices overview A fast, reliable network infrastructure is a critical component of a successful SQL Server deployment. Plan the network configuration ahead of time to meet the performance and capacity requirement of current and future growth. This section details recommendations for the network configuration to use with Microsoft SQL Server.

Use a dedicated VLAN for cluster heartbeat connectivity In MSCS heartbeat is used for communication between cluster nodes to pass information about health and state of their resources. For MSCS cluster to function properly, a reliable heartbeat communication is important. In the Windows server versions prior to Windows 2008, the heartbeat communication is sent as a broadcast. If MSCS is used with earlier versions of Windows 2008, it is recommended to physically isolate the cluster heartbeat network from other networks for reliability and security. For example, in a two-way cluster, it is common to use a crossover cable between the two machines as the heartbeat network. Recommended private "Heartbeat" configuration on a cluster server on the Microsoft Support Knowledgebase provides more information.

Use Powerpath for path failover and high-availability When multiple network paths are configured between SQL Server and EMC VNX platforms, load-balancing I/O loads across multiple network path to multiple VNX storage ports avoids storage bottlenecks. It also allows redundant connections for high availability.

EMC Powerpath® provides load-balancing for SQL storage IOs between active SQL Server network paths and storage ports. It also monitors dataflow through all active paths. If a path fails, it removes the failed path and reroutes all data access through an alternate active path until the failed path is repaired and restored. The EMC PowerPath® and PowerPath/VE Version 5.5 and Minor Releases Release Notes available from the EMC online support website provides more information.

Use the latest verified HBA / NIC driver HBA/NIC vendors periodically release new drivers to provide enhancements and bug fixes. EMC recommends installing the latest vendor HBA / NIC drivers that have been validated for use with the Windows version used in your environment for maximum performance and stability.

Note For Windows on VMware ESX, use the latest VMware-approved drivers.

Plan SQL Server network connectivity for high speed links High-speed network connectivity between the server and storage array is required for a high-performance SQL Server environment. Use 1/10 GbE switches with virtual LAN (VLAN) capability to segment different kinds of traffic. Design a network topology with a minimum number of hops for lower latency. Use CAT6 cables for maximum performance and reliability. The EMC VNX series has 1/10 GbE connectivity for very high network throughput SQL Server environment.

Set network speed and duplexing Once the setup is complete and the GbE infrastructure is verified, the switch ports and NIC ports should be configured to 1/10 Gb/s and FULL duplex. During setup it may be necessary to use AUTO settings to ensure everything works properly in a new environment, however, the proper speed and duplex settings should be set explicitly for production systems.

Page 32: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Network Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

37

Plan for network high availability One common oversight is to provide for high availability at the server level using clusters and at the storage level using RAID, but not designing a highly-available network. Plan the network for high-availability to ensure uninterrupted communication between systems in your environment. This includes redundant switches and paths as well as redundant HBA/NIC ports and cards.

Page 33: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

39

Chapter 5 Protection Best Practices

This chapter presents the following topics:

SQL Server protection best practices overview ............................................ 40

Use array based replication to offload SQL Server ......................................... 40

Database mirroring ....................................................................................... 41

If clustering is used at the principal site, do not use a witness server ........... 41

Plan for high I/O levels at the mirror site for database mirroring ................... 41

Verify that interdatabase consistency is not required before implementing database mirroring........................................................................................ 41

Methods to sync SQL Server objects if database mirroring is in use .............. 41

For point-in-time recovery, use database log backups .................................. 42

When possible, schedule backups for minimal disruption ............................ 42

Do a full backup when you change the database recovery model ................. 42

Increasing LUN Prefetch settings can improve restore performance .............. 42

Using VNX file deduplication and compression feature reduces the storage footprint for SQL backups ............................................................................. 42

Page 34: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Protection Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

40

SQL Server protection best practices overview A protection mechanism is needed on SQL Server to meet appropriate recovery point objective (RPO) and recovery type objective (RTO) requirements in an environment. Different solutions are needed for protecting against hardware failures, data corruption, human errors, and natural disasters. EMC provides several solutions to address these needs by utilizing the SQL Server database mirroring feature along with EMC VNX snap, clone, and replication features to meet customer Service Level Agreements (SLAs).

Use array based replication to offload SQL Server SQL Server works well when server resources are dedicated to it. When a protection mechanism is implemented in the SQL environment, it will take some server resources from SQL Server and may cause performance problems. EMC provides two array-based replication solutions to offload database replication operations to the EMC VNX system: RecoverPoint and MirrorView.

RecoverPoint replicates data to protect the SQL Server environment from disaster. It provides three options:

Local recovery protection (CDP) provides synchronous protection by capturing every transaction in a database and simultaneously writing it to a secondary storage location.

Continuous remote replication (CRR) is an asynchronous protection that can replicate data across any distance.

Concurrent local and remote data protection (CLR) which combines the CDP and CRR replication methods to provide local and remote protection for an SQL Server environment.

RecoverPoint scales well, and can be implemented on very large SQL Server environments. EMC RecoverPoint Replicating Microsoft SQL Server Technical Notes, available from the EMC Online Support website, provides more information.

MirrorView replicates SQL Server database LUNs to remote locations for disaster recovery. MirrorView replication is transparent to the host. If the production host or the production storage system fails, the remote replication facilities failover to the secondary mirror image.

MirrorView software offers two complementary mirroring products:

MirrorView/S can synchronously mirror data images of production host LUNs to secondary storage at a remote site in real time. This offers zero data loss if there is a failure at the production site.

MirrorView/A offers long-distance replication based on a periodic incremental update model. It periodically updates the remote copy of the data with all the changes that occurred on the local copy since the last update. This can result in data loss if there is a failure at the production site.

MirrowView works well in small to medium size SQL Server environments. EMC Business Continuity for Microsoft SQL Server 2008: Enabled by EMC CLARiiON and EMC MirrorView /A—Applied Technology, available from the EMC Online Support website, provides more information about MirrorView.

Page 35: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Protection Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

41

Database mirroring Database mirroring is the built-in high-availability feature of Microsoft SQL Server to protect a database instance from a catastrophic failure. Database mirroring provides for two identical copies of an SQL Server instance on two separate SQL Servers. The data is synchronized by replicating transaction logs from the active (principal) server to the passive (mirror) server. Database Mirroring Overview available in the Microsoft MSDN library provides more information about database mirroring.

Consider the following best practices when using database mirroring.

If clustering is used at the principal site, do not use a witness server In High Availability mode (Synchronous with a witness) with database mirroring, using a Microsoft cluster VMHA failover cluster or Hyper-V cluster, it is likely that the database will fail over to the mirror before the cluster failover can complete. Therefore, when using MSCS or VMHA, EMC recommends that only High Protection (Synchronous without a witness) or High Performance (Asynchronous) be used. Database Mirroring and Failover Clustering, available from Microsoft TechNet, provides more information about databases in clustering environments.

Plan for high I/O levels at the mirror site for database mirroring The method that database mirroring uses to commit transactions at the mirror site causes substantially more write I/O than occurs on the principal. Therefore, depending on the data load, more storage resources may be required at the mirror site than at the principal. Some tests indicate that the mirror site might need to handle as much as four times the level of I/O as the principal site. Some tests indicate the mirror site might need to handle as much as four times the level of I/O as the principal site.

EMC recommends monitoring the mirror site for performance bottlenecks that might impact data protection and recovery plans. EMC Business Continuity for Microsoft SQL Server: Enabled by SQL DB Mirroring Celerra Unified Storage Platforms Using iSCSI—Applied Technology, available from the EMC Online Support website, provides more information about database test results.

Verify that interdatabase consistency is not required before implementing database mirroring

Database mirroring maintains only intradatabase consistency. There is no mechanism in database mirroring to maintain consistency among multiple databases. If an environment requires interdatabase consistency, database mirroring as implemented in Microsoft SQL Server 2005 and SQL Server 2008 is not recommended for data protection in that environment. Microsoft SQL Server 2005 Database Mirroring – Applied Technology Guide, available from EMC.com, provides more information about interdatabase consistency.

Methods to sync SQL Server objects if database mirroring is in use Database mirroring operates only within single databases, and cannot be used for system databases like master database. Therefore, a separate mechanism is required to keep objects like user accounts, jobs, and security assignments above the database level, and to keep the system databases in sync between the principal and mirror systems. This can be done with an SQL Server job that runs at regular intervals or with third-party tools. It is important to sync dependent objects to completely fail over the entire environment. Database Mirroring in SQL Server 2005, available from Microsoft TechNet, provides more information about protecting SQL Server objects.

Page 36: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Protection Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

42

For point-in-time recovery, use database log backups A point-in-time recovery means rolling back an SQL Server database to its state as of a specified time. A full database backup combined with a chain of log backups allows the restoration of a database to a given point in time, down to individual transactions. This is the highest level of granularity possible with Microsoft SQL Server.

The full backup may be taken with the SQL Server’s native backup functionality, a third-party tool, or a snapshot tool like EMC Replication Manager (RM). RM cannot perform transaction log (in SQL Server terms) backups. To achieve point-in-time recoverability, RM full backups need to be combined with SQL Server log backups. Restoring a Database to a Point Within a Backup, available from the Microsoft MSDN Library, provides more information about point-in-time recovery.

When possible, schedule backups for minimal disruption When a backup is in progress, Microsoft SQL Server creates a checkpoint to flush all dirty pages to disk. When this is done on a machine with a large amount of RAM (possibly most of which is dirty pages) that is also under a heavy I/O load, the backup may take substantially longer — sometimes as much as two to 20 times longer. Whenever possible, schedule backups for times when the system is under a light load. EMC also recommends taking backup overhead into account during the storage design process.

Do a full backup when you change the database recovery model When the recovery model changes from Full to simple or bulk logged, and back to Full, the change to Full only completes after performing a full database backup. A database that is changed from Simple to Full may lose data if the backup is taken before the change in recovery model, instead of after the change.

A database does not maintain a log in Full recovery mode until a full backup is complete. The database will remain in Full recovery mode only as long as nothing is done to break the log chain. If the command Backup Log With Truncate Only is run, the database log no longer operates in Full recovery mode because the log chain is broken. The only way to bring the log back into Full recovery mode is to take another full database backup. Recovery Model Overview, available from the Microsoft MSDN Library provides more information about recovery modes and considerations.

Increasing LUN Prefetch settings can improve restore performance The prefetch option allows the VNX system to cache some blocks beyond the current read block. This significantly improves the restore performance since it is sequential in nature.

EMC CLARiiON Best Practices for Performance and Availability: Release 30.0 Firmware Update—Applied Best Practices, available on EMC Online Support, provides more information about prefetch settings for sequential IO.

Increasing the prefetch may hurt performance during non-sequential operations. Use a backup LUN for sequential operations only, to maximize the benefit of an increased prefetch setting.

Using VNX file deduplication and compression feature reduces the storage footprint for SQL backups

Enabling dedulication on the underlying file system reduces the storage space on a CIFS share required for SQL backups. The compression features reduce the size of the SQL backup data and save backup space on the VNX file system

However, enabling the de-duplication on the backup file system slightly increases the restore time, as the restore involves re-duplicating the files.

Page 37: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

43

Chapter 6 Virtualization Best Practices

This chapter presents the following topics:

Virtualization best practices overview ........................................................ 44

Align new storage volumes on virtualized servers ......................................... 44

Use Virtualization Management Servers ........................................................ 44

Make Virtualization Management Servers highly available ........................... 44

Be aware of virtual machine time considerations .......................................... 44

Page 38: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Virtualization Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

44

Virtualization best practices overview Due to rapid growth in server technology, the cost of processing power is decreasing rapidly. Currently most server environments are underutilized, or have to be on separate servers in order for SQL Server applications to work properly. Virtualization provides the ability to optimize datacenter resources by consolidating server resources down to a few physical servers. This reduces power consumptions, saves space, increases manageability and flexibility, and introduces new high- availability options. Microsoft supports SQL Server virtualization on any hypervisor approved by the Server Virtualization Validation Program (SVVP).

The following best practices are related to SQL Server virtualization on VMware ESX or Microsoft Hyper-v servers.

Align new storage volumes on virtualized servers To virtualize a server, there are additional layers of abstraction to align. All layers between the host application and the storage must be aligned, including NFS, NTFS, and VMware VMFS. Follow the instructions in Align Windows disk partitions for best performance. Performance decreases if any layer in the stack is out of alignment.

It is recommended to create VMware VMFS volumes from the VMware VI client, because they automatically align as they are created.

Use Virtualization Management Servers Virtualization Management Servers simplify virtual data center management, and provide additional features for the administrator. Virtualization management Servers are important for the proper functioning of several virtual machine features such as VMHA, DRS, VMotion, and live migration. Also, virtualization management Server adds ease of use to the following features and many others:

Virtual machine template creation and deployment throughout the data center

Historical and real-time performance tracking of Hyper-V, ESX hosts, and virtual machines

Tracking of VM-to-host mapping

Initialization of new storage, including alignment of VMFS volumes

Make Virtualization Management Servers highly available Virtualization Management Servers are the most critical component of a virtualization infrastructure. Without them, it is possible to lose most Virtualization server functionality, including DRS, VMHA, VMotion, live migration, and starting a virtual machine. It is important for the Virtualization Management Servers to be highly available. The easiest, most reliable way to accomplish this is to run the Virtualization Management Servers on physical machines in an MSCS cluster. The SQL Server database utilized by the Virtualization Management Servers must also be highly-available.

Be aware of virtual machine time considerations Time measurements on virtual machines can be skewed due to the nature of hypervisor-based virtualization. Refer to all relevant documentation for the hypervisor to understand how it impacts the environment. For example, VMware has a guide on timekeeping called Timekeeping in VMware Virtual Machines available from the VMware website.

Page 39: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

45

Chapter 7 FAST Best Practices

This chapter presents the following topics:

FAST Suite best practices overview ............................................................. 46

Enable FAST cache on highly active SQL Server databases ........................... 46

For FAST Cache, allow sufficient time for the cache to warm-up .................... 47

Do not enable FAST Cache for Log LUNs ........................................................ 47

Consider FAST VP for autotiering of data on large SQL databases ................. 47

Page 40: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

FAST Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

46

FAST Suite best practices overview The EMC Fully Automated Storage Tiering (FAST) suite is an advanced software feature that is enabled to provide greater flexibility to manage the increase in performance and capacity requirement of the SQL Server environment. EMC FAST suite makes use of the Solid State Drive (SSD) drives, SAS, and NL-SAS storage configuration to balance performance and storage needs. There are two type of FAST suites—FAST cache and Fully Automated Storage Tiering for Virtual Pools (FAST VP).

FAST VP provides policy-based autotiering solutions to move data between SSD, SAS, and NL-SAS drives. FAST VP efficiently utilizes storage tiering to lower total cost of ownership (TCO) for the customers by moving cold datasets to high capacity drives such as NL-SAS and to increase performance by placing most frequently accessed data on high performance drives such as SSD.

Note: This tiering of data happens automatically and transparently on SQL Server.

FAST Cache uses SSDs to act as an additional cache between dynamic random access memory (DRAM) of storage system and storage disks. This significantly reduces the service responsiveness of SQL data, which is mostly random. FAST Cache identifies the most frequently accessed data and copies to the SSD automatically and nondisruptively for quicker access. This operation is also automatic and completely transparent to SQL Server.

Enable FAST cache on highly active SQL Server databases EMC FAST Cache increases the storage system cache by extending the functionality of DRAM cache by mapping frequently accessed data to SSD. FAST Cache capacities range from 73 GB to 2 TB, which is considerably larger than the available DRAM cache of the existing storage systems. If a particular chunk of data is accessed frequently by the user application, that chunk will be automatically promoted into the FAST Cache by copying from hard disk drives to the Flash drives. Subsequent access to the same chunk is serviced at Flash drive response times, thus boosting the performance of the storage system.

Analyze the SQL workload characteristics before deciding to use FAST Cache in the storage system. FAST Cache is not equally helpful in all kinds of I/O profiles. For example, sequential streams with large I/O size may not access the same data chunk multiple times, which is required for FAST Cache promotions. It is most suitable for high I/O-intensive random workloads with small working sets. Typically, OLTP database have similar behavior and it can be greatly benefit from FAST Cache to improve performance and response time. Monitor SQL Server database file groups and enable FAST Cache on the highly active storage pools where such data is located.

EMC CLARiiON and Celerra Unified FAST Cache—A Detailed Review, available on the EMC Online Support website, provides more details on FAST Cache design concepts, planning, and usage guidelines.

Testing suggests that the inclusion of FAST Cache results in a 300 percent increase in transactions per second (TPS) for a SQL OLTP workload using the same number of hard disk drives in the backend.

The EMC Unified Storage for Microsoft SQL Server 2008—Enabled by EMC CLARiiON and EMC FAST Cache—Proven Solutions Guide, available on EMC Online Support, provides more details on the testing and the results.

The EMC Unified Storage for Microsoft SQL Server 2008—Enabled by EMC CLARiiON and EMC FAST Cache—Reference Architecture, available on EMC Online Support, provides more details on how to build a solution with EMC FAST Cache.

Page 41: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

FAST Best Practices

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

47

For FAST Cache, allow sufficient time for the cache to warm-up When the FAST Cache is enabled and empty, it takes some time before the full benefit of FAST Cache is observed. This time is called warm-up time during which the FAST Cache promotions happen. This also happens when the working dataset of the application has changed dramatically, and the current FAST Cache data is no longer being referenced. During this time, response time to the application is the same as the response time of the hard disk drives (HDDs) and the FAST Cache hit rate is less.

The warm-up time depends mostly on the type and number of backend HDDs, size of FAST Cache, and size of the working set. EMC Unisphere has several FAST Cache counters that can be monitored to obtain its optimum utilization to get the warm-up time. EMC CLARiiON, Celerra Unified, and VNX FAST Cache—A Detailed Review, available on EMC.com, provides more information.

Do not enable FAST Cache for Log LUNs SQL Server log files are sequential-write operations. Log writes are rarely reread during SQL operation. This I/O pattern makes the log files not a good candidate to enable FAST Cache. Testing shows that enabling FAST Cache for SQL database log LUNs does not provide any additional benefit, and reduces the amount of FAST Cache available for other workloads. The log file I/O can be managed by array-write cache and SAS drives as effectively as FAST Cache.

Consider FAST VP for autotiering of data on large SQL databases According to the industry analysts, 60 to 80 percent of operational database data is inactive and increases as the size of the database grows. The low-cost, high-capacity spinning drives are an ideal choice for the inactive data. However, the high performance drives are suitable for the frequently accessed data. Classifying and storing the right data in the right tier manually is a complex and challenging task, and mostly requires down time when the data is moved.

EMC FAST VP moves frequently accessed data to a faster physical storage within a pool and moves infrequently accessed data to a less-expensive physical storage within the pool automatically.

Consider the following aspects when using FAST VP:

Control when FAST VP can move data to prevent any impact on host I/O requests during known periods of high system usage.

Improve the overall system performance without investing in additional high-performance physical drives.

Apply FAST VP to any or all pool-based LUNs on a storage system.

Understanding Microsoft SQL Server OLTP Performance Characterization for EMC VNX Series, available on the EMC Online Support website provides more details about using FAST VP with SQL Server.

FAST VP operates in the background, while the LUNs are online and available for host access. The data movement speed can be controlled to minimize the impact on overall system performance.

Note: FAST Cache and FAST VP can be implemented together in a SQL database environment to improve performance, achieve higher capacity utilization, and lower the power and cooling requirements. FAST Cache boosts the performance immediately for random-access patterns and burst prone data, while FAST VP allows the storage system to move inactive data to a lower-cost storage tier. FAST Cache operates in 64 KB granularity and FAST VP operates with 1 GB chunks.

Page 42: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

49

Appendix A Performance Monitoring and Tuning

This appendix presents the following topics:

Performance considerations ....................................................................... 50

Windows Performance Monitor ................................................................... 50

RAID group planning .................................................................................. 53

Estimating required performance ............................................................... 55

Page 43: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

50

Performance considerations Server memory, and the I/O subsystem setup (number of physical disks, RAID level, paths to the storage) impact the SQL storage subsystem.

Server memory is the primary cache for SQL Server. In general, the more memory available for database caching, the fewer I/O operations the storage subsystem needs to service. The number of physical disk drives and the RAID type (where the database data files and log files will be stored) determine the sustainable I/O rate the database can use without exceeding acceptable latencies. As long as the storage connection bandwidth is not exceeded, adding more drives generally increases the I/O rate. The proper RAID type also benefits the number of sustained I/Os the system can handle. Additional storage connection paths or additional storage arrays can relieve the load on saturated storage components. Monitor and tune the SQL Server database installations as a standard practice in all deployments.

There are several tools and methodologies to help monitor and tune performance, including SQLTrace (SQL Server Profiler is the GUI), Dynamic Management Views (DMVs), and Windows Performance Monitor (Perfmon). This section provides detailed information about some of the most common Perfmon counters.

Windows Performance Monitor Perfmon allows administrators to view and collect realtime performance counter information on a wide variety of operating system, SQL Server, and hardware components. Table 3 provides a brief description of the counters.

Note: Some Windows counters may not be correct on a virtual machine. Any counters that rely on timing based on processor ticks are skewed and unreliable.

Table 3. Windows Performance Monitor counters

Counter name

Performance object Description

% Processor Time

Processor An indication of how busy the system processors are. If the processors are very busy, the I/O system is not likely to impact the overall system performance. For Intel-based systems with hyperthreading turned on, this counter is no longer accurate. For the Intel-based systems, if this counter is hovering near 50%, the machine is likely processor-bound (the processor is the bottleneck).

Pages / sec Memory This is one of the key indicators of operating system-level memory pressure. Some level of occasional paging is normal for most systems; however, sustained periods of substantial paging, or if the paging occurs during periods of poor performance, there is likely a memory shortage.

% Idle time PhysicalDisk or LogicalDisk

An indication of how much time a given disk/volume is not busy. If it is busy almost all the time (near 0%), the disk/volume may be a bottleneck.

Page 44: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

51

Counter name

Performance object Description

Avg. Disk Queue length

PhysicalDisk or LogicalDisk

Average number of read and write requests outstanding, over a sample interval, on the disk/volume. Large queues and high disk-busy times usually indicate performance bottlenecks. With RAID arrays, this number can be much higher than for a single physical disk drive, because a RAID array is composed of many physical disks. Keep the queue length to less than two per physical disk in the RAID array.

Current Disk Queue length

PhysicalDisk or LogicalDisk

The instantaneous number of outstanding requests on the disk, at the exact moment of sampling. This is useful for tracking down temporal hot spots. If the disk queue varied between 0 and 128 during a sample interval, then this counter could be anything from 0 to 128, even though the average over the sample period might be 16. For example, if the disk queue length samples were (0, 0, 0, 128), the average disk queue length is 32, but the current disk queue length is 128.

Avg. Disk sec / transfer

PhysicalDisk or LogicalDisk

Average response times, in milliseconds (ms), for disk read and write operations. Typical values of fewer than 10 ms are very good. Values consistently greater than 20 ms may indicate a problem.

Avg. Disk sec / Read

PhysicalDisk or LogicalDisk

Average response time (ms) for read operations. This is useful for further isolating general response time issues. An average of less than 20 ms is desirable.

Avg. Disk sec / Write

PhysicalDisk or LogicalDisk

Average response time (ms) for write operations. This is useful for further isolating general response time issues. An average of less than 10 ms is desirable.

Disk Bytes / sec

PhysicalDisk or LogicalDisk

Number of bytes transferred to or from the disk/volume per second.

Disk Transfers/sec

PhysicalDisk or LogicalDisk

Number of transfers to or from the disk/volume, regardless of transfer size. Also known as IOPS.

Avg. Disk Bytes / Transfer

PhysicalDisk or LogicalDisk

A measure of the relative I/O composition of the system. This is an average, but on disks/volumes that exclusively contain database data files, it will trend towards 8 KB for most random data workloads. If the value is significantly higher than 8 KB, the workload may be more sequential in nature and may benefit from additional caching. For disks/volumes that contain database log files, this value varies substantially depending on the workload.

Page 45: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

52

Counter name

Performance object Description

Buffer Cache Hit Ratio

SQLServer: Buffer Manager

This object is useful for helping determine whether SQL Server has sufficient memory available to it. Values of 98% or higher are excellent, 94% or higher are acceptable, and lower values are either an indication of insufficient memory or an extremely random data workload.

Buffer Cache Hit Ratio is approximately (Page Lookups per second – Page Reads per second) divided by Page Lookups per second. ReadAheads are not considered a cache miss, since they are not immediately requested by a query processor; however they are disk reads, and they make the Buffer Cache Hit Ratio misleading. For example, if all pages needed from the disk were prefetched by the read ahead manager, then the Buffer Cache Hit Ratio would be at or near 100%.This is only semi-true, as a Buffer Cache Hit Ratio of 100% would imply no physical reads occurring. Therefore, it is not uncommon to use an alternate calculation that takes ReadAheads into account, such as (Page Lookups per second – (Page Reads per second + ReadAhead Pages per second)) divided by Page Lookups per second.

Page Life Expectancy

SQLServer: Buffer Manager

This object is useful for helping determine whether SQL Server has sufficient memory available to it. Values of less than 300 (5 minutes) are usually an indication of insufficient memory.

Page Lookups / sec

SQLServer: Buffer Manager

Number of requests to find a page in the buffer pool. This counter includes pages requested from the buffer pool, pages not found, and pages read in from disk (Page Read).

Page Reads / sec

SQLServer: Buffer Manager

Number of physical database page reads issued. Page reads in this counter are a direct result of needing the page to finish the execution of a query.

ReadAhead Pages / sec

SQLServer: Buffer Manager

Number of pages read in anticipation of use. This is done by the ReadAhead manager, and is also termed a prefetch. If SQL Server sees certain types of I/O activity, it will try to get the data from disk before it is actually requested. For example, for a serial scan of a table, rather than reading in individual pages (8 KB I/Os), the ReadAhead manager will begin prefetching entire extents (64 KB I/Os), or several contiguous extents (128, 256, 512 KB, and so on), as a single I/O request, to increase the efficiency of the I/O subsystem and improve Query response time.

Page 46: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

53

RAID group planning RAID groups have two resources that can be consumed by an application workload: storage capacity and performance capacity. It is possible to have a RAID group fully utilize its storage capacity, but under-utilize its performance capacity. The more common scenario is for a RAID group to fully utilize its performance capacity, while using only a portion of its storage capacity. Since increases in the storage capacity of disk spindles has dramatically outpaced increases in performance capacity, one of the most common errors encountered in SQL Server deployments is to find a RAID group designed for capacity instead of performance.

When designing a RAID group, the first thing to consider is the level of performance that is needed, then verify if the needed capacity is available. The methods of designing a RAID group discussed here take a conservative view of RAID group design and do not rely on features such as caching or prefetching for sustained performance.

RAID level attributes There are four primary RAID levels that offer fault tolerance: RAID 1, RAID 5, RAID 6, and RAID 1 with striping (RAID 10).

Note: RAID 0 is not recommended because it does not provide fault tolerance. Any spindle failure in a RAID 0 group will render the entire RAID group unusable.

The discussion of the various RAID levels is based in capacity, where RAID 1 and 10 are thought of as 2n, or needing twice the number of disks to store a given amount of information when compared with a non-RAID protected system; and RAID 5 and RAID 6 are thought of as n+1 and n+2, or needing one or two more spindles than is required outside of a RAID implementation. These facts are true, but are seldom the most important part of a design discussion. The most common limitation of a disk array for SQL Server is its performance capacity, not storage capacity.

Each RAID level has a performance impact based on the type of fault tolerance the RAID level implements. This performance impact is normally seen only during writes. For example, RAID 1 and 10 both use mirroring, therefore everything written to a given spindle is also written to its partner mirror. For this reason, each logical write I/O issued by the host server turns into two physical write I/Os inside the storage array. RAID 1 and 10 are thought of as having a 2x write penalty.

RAID 5 has a 4x write penalty. The specifics of this calculation are out of scope for this document, but are easily obtainable from trusted sources. Each logical I/O request from the host is broken down into four physical I/Os:

Read data disk being written

Read parity disk for the stripe parity value

Write new data to data disk

Write new stripe parity to the parity disk

RAID 6 has dual parity and a 6x write penalty. The logical I/O from the host is similar to RAID 5, but uses two parity disks instead of one. Therefore, it has one additional read and write compared to RAID 5 for every write operation.

RAID protection impacts the usable storage capacity, and performance capacity of a disk array. The impact is determined by the RAID level selected for the array.

Page 47: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

54

As the percentage of writes increases from 0 percent (read only) to 100 percent (write only), the gap in performance between RAID 1 or 10, RAID 5, and RAID 6 widens rapidly. The following graph shows the effects of RAID level on different % write / % read workloads.

Figure 6. RAID overhead effect for random I/O

Table 4 presents additional details about the four RAID levels that are being discussed.

Table 4. RAID level performance characteristics

RAID level Random Serial Read Write

11 Good1 Good1 Good1 Good1

52 Moderate Good Excellent Poor3

6 Poor Moderate Good Poor

10 Excellent Excellent Excellent Better1

Notes:

1. By definition all RAID 1 groups are limited to two drives. This places a distinct upper limit on their potential performance. RAID 10 is a method for striping data across multiple RAID 1 groups to avoid this limit.

2. RAID 5 takes a substantial performance impact when a drive fails and needs to be rebuilt. Take this into account when planning.

3. Although RAID 5 writes perform poorly, because of the 4x penalty, there is a rare case called a “full stripe write” where RAID 5 writes can actually outperform RAID 1 or RAID 10. This occurs when the write is aligned with the stripe and is the exact same width as a full stripe. For example, if each disk held 32 KB per stripe and a 4+1 array was created, then a full stripe write would be 128 KB in size and be aligned to create one full stripe across the disks. Improved performance cannot be guaranteed unless proven through testing.

Page 48: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

55

Estimating required performance One often misunderstood facet of Microsoft SQL Server is that it is not an application. It is an environment that houses databases of various types and attributes. Performance characteristics can vary substantially from one database to another. It is not possible to discuss how Microsoft SQL Server will perform in general for all possible workloads, but it is possible to discuss the performance of a given database when the workload characteristics are defined. Databases are usually broken down into two general classes, Online Transaction Processing (OLTP) and Data Warehouse (DW).

The usual attributes of the I/O patterns involved with each type of database, as well as tempdb, are discussed Table 5.

Table 5. I/O types and performance attributes

File type Performance attributes

User Database Data File (OLTP)

The database data file for most OLTP-type applications usually has the following characteristics:

Smaller I/Os

Random I/Os

High percentage of writes compared to reads

Usually not a very large database (aged data is usually archived to a data warehouse)

The required performance can usually be achieved with fewer spindles using RAID 10, rather than RAID 5 or RAID 6.

User Database Data File (Data Warehouse)

The database data file for most OLAP-type applications usually has the following characteristics:

Larger I/Os

Serial I/Os

Low percentage of writes compared to reads, sometimes read-only

Usually a very large database

RAID 5 will usually provide adequate performance and much more usable space for a given number of spindles.

Page 49: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

56

File type Performance attributes

Database Log File

The database log file(s) for all databases have the following characteristics:

Smaller I/Os (some multiple of 512 bytes).

Highly serialized I/Os.

Almost exclusively writes, with occasional reads during large rollbacks or log backups.

Size is dependent upon several factors and difficult to predict without specific details about the database workload and protection method.

A log file is the single most important piece of information for database recovery from a crash or database restore.

Every transaction that modifies data is limited by log write speed.

Log files are critical for performance and recoverability, so RAID 10 is the recommended standard for database logs. There are times when RAID 5 may provide adequate performance (because of full stripe writes), but if a drive fails, RAID 5 performance will likely drop below needed levels. RAID 5 also cannot survive a double drive fault, while it is possible that RAID 10 will survive such a failure.

tempdb Data File

The database data file(s) for tempdb usually has the following characteristics:

Smaller or larger I/Os, depending upon usage, but many times it is larger I/Os.

Serial or random I/Os, although a given workload might be somewhat serial, many workloads running simultaneously may give tempdb more of a random I/O appearance.

Usually a near 50/50 split of writes and reads.

Size can vary wildly.

Based on the unpredictable nature of tempdb combined with its usually large percentage of writes, RAID 10 will usually provide the best performance for a given number of spindles.

Remember these are general characteristics and that a specific user database might generate an I/O workload that varies substantially from those presented. The only real way to determine the I/O performance needs of a given database is test the database.

For example, in an OLTP-type database, it is critical to know what level of I/O performance is required from the data and log RAID groups in terms of IOPS. SQL Server has many inherent buffering algorithms to try to decrease I/O levels. The efficiency of these algorithms is entirely dependent on the database and workload.

To get accurate performance estimates, run tests with as close to “real world” conditions as possible. During these tests, Performance Monitor Logs can be used to capture the characteristics (reads per second and writes per second) of the volumes used to store database files. This information can be used to do an initial RAID group design.

Note

Do not use IOPS counters averaged over time as the basis for RAID group design. EMC recommends identifying the 90th percentile of the IOPS samples and

Page 50: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

57

designing for that performance level. This allows the system to respond well to spikes in demand.

The IOPS requirements for a RAID group should be calculated independently for reads and writes

Calculating disk spindle requirements Once the read and write IOPS are known, they can be plugged in to the following formula:

Note The spindle count may need to be adjusted to conform to the requirements for the selected RAID level. For example, a seven-spindle RAID 10 set is not possible. In such a case an eight-spindle RAID 10 set is required.

The last variable is the number of IOPS that a given rotational speed spindle can support.

This number is harder to compute than the others because it has a very broad range of possibilities. Primarily, the number of IOPS a spindle can support is derived either by observation of workloads, or computed mathematically with certain assumptions made. The primary factor that influences the number of IOPS a spindle can support is the distance between where the current I/O is occurring and where the next I/O will occur. This is referred to as the “locality of reference.”

Since it is impossible to know the locality of reference ahead of time in all non-serial workloads, certain assumptions are common to estimate how many IOPS a spindle is capable of maintaining. These assumptions usually involve taking an average of the minimum distance and the maximum distance.

Note The exact math involved in making these assumption and estimates is beyond the scope of this document. It is generally accepted that a 15,000 rpm disk spindle can support approximately 180 IOPS under a variety of common workloads and conditions while maintaining acceptable latency.

The following table shows estimated spindle counts for a few example workloads, using the formula supplied above. In almost every example, the number of spindles required for RAID 10 to hit a certain performance level is lower than for RAID 5 or RAID 6. For non-read-only workloads, RAID 10 is less expensive than RAID 5 or RAID 6 because RAID 10 requires fewer spindles.

Page 51: Applied Best Practices Guide: EMC Solutions for Microsoft ... · PDF fileApplied Best Practices Guide . ... VMware vCenter, vMotion, and VMware vSphere are registered trademarks or

Performance Monitoring and Tuning

EMC Solutions for Microsoft SQL Server with EMC VNX Series Applied Best Practices Guide

58

Table 6 shows the number of spindles that would be needed for various workloads using RAID 5, RAID 6 or RAID 10.

Table 6. Number of spindles required for a series of sample workloads

Total IOPS

%Read %Write Read IOPS

Write IOPS

RAID 5 RAID 6 RAID 10

1000 100 0 1000 0 6 6 6

1000 75 25 750 250 10 14 8

1000 50 50 500 500 14 20 10

1000 25 75 250 750 19 28 10

2000 100 0 2000 0 12 12 12

2000 75 25 1500 500 20 26 14

2000 50 50 1000 1000 28 40 18

2000 25 75 500 1500 37 54 20

After the initial RAID group configuration is set, EMC recommends testing it against specific workload conditions to verify it meets performance requirements.

The requirements above are based on a RAID group with a single LUN. If multiple LUNs exist on a given RAID group, use the aggregate sum of the required IOPS performance from all LUNs on that RAID group in the previous equation.