DBAs Behaving Badly... Worst Practices for Database Administrators, Rod Colledge

1. DBAs Behaving Badly
Worst Practices for Database Administrators

2. About the writer
Rod Colledge
Independent SQL Consultant
Based in Brisbane, Australia
Web; www.sqlCrunch.com
Blog; www.rodcolledge.com
MVP Deep Dives Book
Twitter @rodcolledge
linkedin.com/in/rodcolledge

6. About us.
Dubi Lebel
DBA Dubi Behind All
D.B.A Don't Bother Asking
Shahar Bar
SQL Consultant and CEO at Valinor

7. Session Overview
Disaster Recovery (DR) Planning
Backup & Restore
Change Control
Storage Configuration
File Configuration
Indexing
Administration Techniques

8. Disaster Recovery (DR) Planning

9. 1 / 20; Not having SLAs
SLAs provide context for everything. e.g.;
Database available 24/7 @ 99.999% uptime
Zero data loss
Sub-second response time
Use option papers during SLA negotiations

10. SLA Option Papers

11. 2 / 20; Not having/testing DR plans
Do you have DR Plans?
How do you know your plans will work?
DR fire drills
All/new DBAs trained in recovery procedures?
Location of recovery documents & scripts?
Documents/scripts up to date?

13. 3 / 20; Narrow definition of disaster
Types of disasters;
Complete environmental destruction
Air conditioning failure
Disk crash
Accidentally dropping a table/database
Security breach; what data was accessed?
The next disaster will be unanticipated. Are your DR plans pessimistic enough?

14. Backup & Restore

15. argh! ... who would have thought we needed backups?

16. 4 / 20; Not Taking Backups
Huh?
Less obvious variations;
File system backups only
No transaction log backups
SAN snapshots; recoverability?

17. 5 / 20; Not Verifying Backups
How do you know they worked?
Verification options
RESTORE VERIFYONLY FROM
Restore to a Reporting Server
Log shipping (log backup verification)
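A minimal sketch of the first verification option, assuming a backup file at a hypothetical path (the path and file name below are illustrative, not taken from the slides). Note that RESTORE VERIFYONLY checks that the backup set is complete and readable; it does not prove the data inside is consistent, so a periodic test restore remains the stronger check.

```sql
-- Hypothetical path; substitute your own backup location.
-- WITH CHECKSUM validates page checksums while the backup is written
-- and stores a checksum over the backup itself.
BACKUP DATABASE AdventureWorks2008
TO DISK = 'G:\Backups\AdventureWorks2008_full.bak'
WITH CHECKSUM;

-- Confirm the backup set is complete and readable,
-- re-verifying the checksums written above.
RESTORE VERIFYONLY
FROM DISK = 'G:\Backups\AdventureWorks2008_full.bak'
WITH CHECKSUM;
```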

18. 6 / 20; Designing for Backups only
Design for restoration!
What is the data loss exposure?
How long will the recovery take?
Script, test & document various restore scenarios
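One restore scenario worth scripting ahead of time is a point-in-time recovery. The sketch below assumes one full backup followed by two log backups; every file name and the STOPAT time are hypothetical placeholders.

```sql
-- All paths and the STOPAT time are hypothetical.
-- 1. Restore the full backup but leave the database in the restoring state.
RESTORE DATABASE AdventureWorks2008
FROM DISK = 'G:\Backups\AdventureWorks2008_full.bak'
WITH NORECOVERY, REPLACE;

-- 2. Apply the log backups in sequence.
RESTORE LOG AdventureWorks2008
FROM DISK = 'G:\Backups\AdventureWorks2008_log1.trn'
WITH NORECOVERY;

-- 3. Stop just before the unwanted change, then bring the database online.
RESTORE LOG AdventureWorks2008
FROM DISK = 'G:\Backups\AdventureWorks2008_log2.trn'
WITH STOPAT = '2009-11-03T14:30:00', RECOVERY;
```

Timing a run of this script against a test server gives a measured answer to "How long will the recovery take?".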

19. Backup Compression
BACKUP DATABASE AdventureWorks2008
TO DISK = 'G:\SQL Backup\AWorks.bak'
WITH COMPRESSION

20. Change Control

21. 7 / 20; Insufficient Test Environments

22. 8 / 20; No Performance Baseline

23. 9 / 20; No Standard Build/Change Log
Without a change log, how can you answer;
Why is something different?
Who made the change?
When was the change made?
Was the change successful?
What will happen if the change is rolled back?

24. Policy Based Management

25. Configuration File.ini

26. Demo; Configuration Changes Report

27. Storage Configuration

28. 10 / 20; Capacity-Centric Design
200GB database; how many 73GB disks?
Capacity-centric;
200 / 73 = 2.7, so 3 disks
Performance-centric;
(reads per sec + (writes per sec * RAID write penalty)) / IOPS per disk
(1200 + (400 * 2)) / 125 = 16 disks!
~ 1.1TB raw, or roughly 500GB usable after RAID
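The two sizing approaches above can be worked through as a short calculation. The figures are the ones on the slide; the variable names are illustrative, and the write penalty of 2 assumes mirrored RAID (RAID 1 or RAID 10).

```sql
-- Workload and disk figures from the slide; variable names are illustrative.
DECLARE @db_size_gb         decimal(10,2) = 200,
        @disk_size_gb       decimal(10,2) = 73,
        @reads_per_sec      int = 1200,
        @writes_per_sec     int = 400,
        @raid_write_penalty int = 2,    -- assumed: mirrored RAID (RAID 1/10)
        @iops_per_disk      int = 125;

SELECT
    -- Capacity-centric: just enough disks to hold the data.
    CEILING(@db_size_gb / @disk_size_gb) AS capacity_centric_disks,
    -- Performance-centric: enough spindles to service the I/O load.
    CEILING((@reads_per_sec + @writes_per_sec * @raid_write_penalty) * 1.0
            / @iops_per_disk)            AS performance_centric_disks;
```

With these inputs the query gives 3 disks for the capacity-centric design and 16 for the performance-centric one, matching the slide.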

29. Preface: Many Factors Affect Disk I/O Perf
There are myriad best practices & considerations for optimal disk I/O subsystem performance.
Be mindful of factors such as:
RAID level
File allocation unit size
Number, size, & speed of disks
Configuration & capacity of HBAs & fabric switches
Consider increasing HBA Queue Depth
Network bandwidth
Cache on disk, controllers, & SAN
Whether disks are dedicated, shared, or virtualized
Bus speed
Number of paths from disk I/O subsystem to server
Driver versions for all components
Stripe size
Stripe unit size
Workload

30. HDD Architecture: 3-D
This image is from a contemporary & otherwise excellent document, but it represents disks as they were over two decades ago!
The disk deities at Microsoft won't allow me to perpetuate such myths.
Graphics source: Veritas Storage Foundation 5.0 for Windows Best Practices for Storage Management
http://eval.symantec.com/mktginfo/enterprise/white_papers/ent-whitepaper_vsfw_5.0_best_practices_for_storage_mgmt_02-2007.en-us.pdf

31. Partition Alignment Graphic: NTFS 4KB Cluster: Default vs. Aligned RAID Array ***This has CONTEMPORARY RELEVANCE***
This is a very simplified graphic
Contemporary relevance
Corresponds to default NTFS file allocation unit of 4KB
Given common 64KB stripe unit size
See the Notes for details
Graphics Source: Jimmy May

32. Partition Alignment Graphic: RAID Array: Default vs. Optimized for SQL Server ***This has CONTEMPORARY RELEVANCE***
This is a very simplified graphic
Mark Licata, Senior Technology Architect
The worst scenario? Random operations using 64K IO and 64K chunk size. One sector off and you are hitting two disks for every IO thus halving the random performance potential.
Note: On a RAID array this means accessing two different stripe units on two separate disks.
Graphics Source: Jimmy May
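The alignment problem comes down to simple arithmetic: a partition is aligned when its starting offset is an exact multiple of the stripe unit size. The sketch below uses the 64KB stripe unit mentioned above. The two offsets are background facts rather than slide content: 63 sectors of 512 bytes is the default starting offset on Windows Server 2003 and earlier, and 1024KB is the default on Windows Server 2008.

```sql
DECLARE @stripe_unit_bytes int = 64 * 1024;   -- 64KB stripe unit

SELECT p.description,
       p.offset_bytes,
       p.offset_bytes % @stripe_unit_bytes AS remainder_bytes,
       CASE WHEN p.offset_bytes % @stripe_unit_bytes = 0
            THEN 'aligned'
            ELSE 'unaligned'
       END AS alignment
FROM (VALUES
        ('63 sectors x 512 bytes', 63 * 512),      -- 32,256 bytes
        ('1024KB',                 1024 * 1024)    -- 1,048,576 bytes
     ) AS p(description, offset_bytes);
```

The 63-sector offset leaves a non-zero remainder, so some I/Os straddle two stripe units; the 1024KB offset divides evenly.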

33. 11 / 20; Using Unaligned Partitions

34. Which of the following RAID levels is not a good choice for write-intensive DBs?
RAID-0
RAID-1
RAID-5
RAID-10

35. File Configuration

36. 12 / 20; Relying on Autogrowth

37. 13 / 20; Shrinking Files

38. 14 / 20; Full recovery + no log backups
When are records removed from the t-log file?
Full recovery model; ONLY after t-log backup
Simple recovery model; On checkpoint
When to use full recovery model?
When point in time recovery is required
Backup the log file!
Take care when moving DBs from/to production
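A minimal sketch for spotting this problem, assuming nothing beyond the standard catalog views. A database in the full recovery model whose log is waiting on LOG_BACKUP is a candidate for a runaway log file. The backup path is a hypothetical placeholder.

```sql
-- Which databases use the full recovery model,
-- and what is currently preventing log truncation?
SELECT name,
       recovery_model_desc,
       log_reuse_wait_desc      -- 'LOG_BACKUP' means a log backup is needed
FROM sys.databases
WHERE recovery_model_desc = 'FULL';

-- Back up the log so the inactive portion can be reused (hypothetical path).
BACKUP LOG AdventureWorks2008
TO DISK = 'G:\Backups\AdventureWorks2008_log.trn';
```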

39. Indexing

40. 15 / 20; Too many/not enough indexes
Small dev DB promoted to production (not enough)
Loaded with unused indexes (too many)
Watch for duplicate or overlapping indexes
DMVs to the rescue
sys.dm_db_missing_index_%
sys.dm_db_index_usage_stats
sys.dm_db_index_physical_stats
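As one illustration of the usage DMV, the query below lists nonclustered indexes in the current database that have not been read since the counters were last reset. It is a sketch, not the demo from the session. Usage statistics are cleared when the instance restarts, so treat the results as a starting point for investigation rather than a drop list.

```sql
-- Nonclustered indexes with no seeks, scans or lookups recorded,
-- ordered by how much write overhead they are costing.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       s.user_seeks,
       s.user_scans,
       s.user_lookups,
       s.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id   = i.object_id
      AND s.index_id    = i.index_id
      AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.index_id > 1            -- skip heaps and clustered indexes
  AND COALESCE(s.user_seeks + s.user_scans + s.user_lookups, 0) = 0
ORDER BY s.user_updates DESC;
```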

41. Demo; Indexing

42. 16 / 20; Inappropriate index maintenance
Code in Books Online: sys.dm_db_index_physical_stats
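A simplified sketch in the spirit of the Books Online example, not a copy of it. It applies the commonly cited Books Online guidance of reorganizing between 5% and 30% fragmentation and rebuilding above 30%; the 1000-page floor is an assumed cut-off for ignoring small indexes.

```sql
-- Suggest an action per index based on fragmentation in the current database.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name                    AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.page_count,
       CASE
           WHEN ps.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
           WHEN ps.avg_fragmentation_in_percent > 5  THEN 'REORGANIZE'
           ELSE 'none'
       END AS suggested_action
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.index_id > 0            -- ignore heaps
  AND ps.page_count > 1000       -- assumed threshold: skip small indexes
ORDER BY ps.avg_fragmentation_in_percent DESC;
```

Acting on thresholds like these, instead of rebuilding every index on a schedule, is the difference between appropriate and inappropriate maintenance.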

43. 17 / 20; Update stats after index rebuild

44. Administration Techniques

45. 18 / 20; Manual Administration
Automation enables more things to be achieved with fewer mistakes in a given amount of time

46. 19 / 20; Not defining alerts
Manage by exception
SQL Agent Alerts;
Job failures
Performance conditions
High severity errors (level 19 +)
What about error 825 (level 10) ?
http://www.karaszi.com/SQLServer/util_agent_alerts.asp
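A minimal sketch of two such alerts using msdb's sp_add_alert. The alert names are illustrative. Error 825 needs its own alert because it is logged at severity 10 and so is never caught by a "severity 19 and above" rule.

```sql
USE msdb;
GO

-- Alert on any severity 19 error (repeat for 20 through 25).
EXEC dbo.sp_add_alert
     @name = N'Severity 19 errors',            -- illustrative name
     @severity = 19,
     @delay_between_responses = 60,            -- seconds
     @include_event_description_in = 1;        -- include error text in e-mail

-- Alert on error 825 (read succeeded only after retry).
-- When @message_id is supplied, @severity must be 0.
EXEC dbo.sp_add_alert
     @name = N'Error 825 - read retry',        -- illustrative name
     @message_id = 825,
     @severity = 0,
     @delay_between_responses = 60,
     @include_event_description_in = 1;
```

An alert notifies nobody until it is linked to an operator, for example with sp_add_notification.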

47. 20 / 20; No task lists/check lists

48. Demo; Administration techniques

49. Summary
Be cautiously pessimistic
Design backups from a restore perspective
Establish & maintain performance baselines
Validate the I/O chain
Use a performance-centric design
Don't rely on all out-of-the-box settings
Understand the indexing DMVs
Automate & manage by exception

50. [email protected]

51. Complete the Evaluation Form & Win!
You could win a Dell Mini Netbook every day just for handing in your completed form! Each session form is another chance to win!
Pick up your Evaluation Form:
Within each presentation room
At the PASS Booth near registration area
Drop off your completed Form:
Near the exit of each presentation room
At the PASS Booth near registration area
Sponsored by Dell

52. Thank you
for attending this session and the 2009 PASS Summit in Seattle
