1. DBAs Behaving Badly
Worst Practices for Database Administrators
2. About the writer
Rod Colledge
Independent SQL Consultant
Based in Brisbane, Australia
Web; www.sqlCrunch.com
Blog; www.rodcolledge.com
MVP Deep Dives Book
Twitter @rodcolledge
linkedin.com/in/rodcolledge
6. About us.
Dubi Lebel
DBA Dubi Behind All
D.B.A Don't Bother Asking
Shahar Bar
SQL Consultant and CEO at Valinor
9. 1 / 20; Not having SLAs
SLAs provide context for everything. e.g.;
Database available 24/7 @ 99.999% uptime
Zero data loss
Sub-second response time
Use option papers during SLA negotiations
10. SLA Option Papers
11. 2 / 20; Not having/testing DR plans
Do you have DR Plans?
How do you know your plans will work?
DR fire drills
All/new DBAs trained in recovery procedures?
Location of recovery documents & scripts?
Documents/scripts up to date?
13. 3 / 20; Narrow definition of disaster
Types of disasters;
Complete environmental destruction
Air conditioning failure
Disk crash
Accidentally dropping a table/database
Security breach; what data was accessed?
The next disaster will be unanticipated. Are your DR plans
pessimistic enough?
14. Backup & Restore
15. argh! ... who would have thought we needed
backups?
16. 4 / 20; Not Taking Backups
Huh?
Less obvious variations;
File system backups only
No transaction log backups
SAN snapshots: recoverability?
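One way to spot the less obvious variations is to ask msdb what SQL Server itself has recorded. The query below is a minimal sketch, assuming backups are taken through SQL Server (or a tool that registers them in msdb); it lists the most recent full and log backup per database, so databases with no native backups, or with no log backups, stand out.

```sql
-- Most recent full ('D') and transaction log ('L') backup per database.
-- A NULL means msdb has no record of that backup type for the database.
SELECT d.name,
       d.recovery_model_desc,
       MAX(CASE WHEN b.type = 'D' THEN b.backup_finish_date END) AS last_full_backup,
       MAX(CASE WHEN b.type = 'L' THEN b.backup_finish_date END) AS last_log_backup
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
WHERE d.name <> 'tempdb'          -- tempdb cannot be backed up
GROUP BY d.name, d.recovery_model_desc
ORDER BY last_full_backup;        -- NULLs (never backed up) sort first
```

Backup history in msdb can be purged, so an empty result for an old database is a prompt to investigate rather than proof that no backup exists.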
17. 5 / 20; Not Verifying Backups
How do you know they worked?
Verification options
RESTORE VERIFYONLY FROM
Restore to a Reporting Server
Log shipping (log backup verification)
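The first two verification options can be sketched as follows. The file paths, the AWorks_RestoreTest database name, and the logical file names are assumptions for illustration only; RESTORE FILELISTONLY reports the real logical names for a given backup.

```sql
-- Take the backup with page checksums so corruption is detected at backup time.
BACKUP DATABASE AdventureWorks2008
TO DISK = 'X:\Backup\AWorks_verify_demo.bak'      -- hypothetical path
WITH CHECKSUM, INIT;

-- Quick check: confirms the backup set is complete and readable.
-- It does not restore anything, so it is not proof of recoverability.
RESTORE VERIFYONLY
FROM DISK = 'X:\Backup\AWorks_verify_demo.bak'
WITH CHECKSUM;

-- Stronger check: restore under a different name, then run CHECKDB.
-- List the logical file names first; add one MOVE per file reported.
RESTORE FILELISTONLY
FROM DISK = 'X:\Backup\AWorks_verify_demo.bak';

RESTORE DATABASE AWorks_RestoreTest               -- hypothetical name
FROM DISK = 'X:\Backup\AWorks_verify_demo.bak'
WITH MOVE 'AdventureWorks2008_Data' TO 'X:\RestoreTest\AWorks_RestoreTest.mdf',  -- assumed logical name
     MOVE 'AdventureWorks2008_Log'  TO 'X:\RestoreTest\AWorks_RestoreTest.ldf',  -- assumed logical name
     RECOVERY;

DBCC CHECKDB (AWorks_RestoreTest) WITH NO_INFOMSGS;
```

Restoring to a reporting server gives the same assurance while also putting the restored copy to use.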
18. 6 / 20; Designing for Backups only
Design for restoration!
What is the data loss exposure?
How long will the recovery take?
Script, test & document various restore scenarios
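As an illustration of one scripted restore scenario, the sketch below recovers a database to a point just before an accidental change. All paths, file names, and the STOPAT time are hypothetical; the point is that the sequence is written down and rehearsed before it is needed.

```sql
-- 0. Back up the tail of the log so nothing after the last log backup is lost.
--    NORECOVERY leaves the database in the restoring state.
BACKUP LOG AdventureWorks2008
TO DISK = 'X:\Backup\AWorks_tail.trn'             -- hypothetical path
WITH NORECOVERY;

-- 1. Restore the most recent full backup, but do not bring the database online.
RESTORE DATABASE AdventureWorks2008
FROM DISK = 'X:\Backup\AWorks_full.bak'
WITH NORECOVERY;

-- 2. Restore each log backup in sequence.
RESTORE LOG AdventureWorks2008
FROM DISK = 'X:\Backup\AWorks_log_01.trn'
WITH NORECOVERY;

-- 3. Restore the tail, stopping just before the mistake, and bring it online.
RESTORE LOG AdventureWorks2008
FROM DISK = 'X:\Backup\AWorks_tail.trn'
WITH STOPAT = '2009-11-03T14:29:00', RECOVERY;    -- hypothetical time
```

Timing a rehearsal of this script answers "how long will the recovery take?", and the gap between log backups answers "what is the data loss exposure?".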
19. Backup Compression
BACKUP DATABASE AdventureWorks2008
TO DISK = 'G:\SQL Backup\AWorks.bak'
WITH COMPRESSION
20. Change Control
21. 7 / 20; Insufficient Test Environments
22. 8 / 20; No Performance Baseline
23. 9 / 20; No Standard Build/Change Log
Without a change log, how can you answer;
Why is something different?
Who made the change?
When was the change made?
Was the change successful?
What will happen if the change is rolled back?
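A change log does not need special tooling. The table below is a minimal sketch of one possible shape, with one column per question above; the table name, column names, and sample row are hypothetical and not taken from the presentation.

```sql
-- Hypothetical change log table: who, when, what, why, outcome, rollback.
CREATE TABLE dbo.ChangeLog (
    ChangeId     int IDENTITY(1,1) PRIMARY KEY,
    ServerName   sysname        NOT NULL DEFAULT (@@SERVERNAME),
    ChangedBy    sysname        NOT NULL DEFAULT (SUSER_SNAME()),  -- who made the change
    ChangedAt    datetime       NOT NULL DEFAULT (GETDATE()),      -- when it was made
    Description  nvarchar(1000) NOT NULL,                          -- what is different
    Reason       nvarchar(1000) NOT NULL,                          -- why it was changed
    Succeeded    bit            NULL,                              -- NULL until verified
    RollbackPlan nvarchar(1000) NULL                               -- effect of and steps for rollback
);

-- Example entry (illustrative values only).
INSERT INTO dbo.ChangeLog (Description, Reason, Succeeded, RollbackPlan)
VALUES (N'Set max server memory to 12288 MB',
        N'Leave memory for the OS after paging was observed',
        1,
        N'Reset max server memory to the previous value with sp_configure');
```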
24. Policy Based Management
25. Configuration File.ini
26. Demo; Configuration Changes Report
27. Storage Configuration
28. 10 / 20; Capacity-Centric Design
200GB database: how many 73GB disks?
Capacity-centric;
200 / 73 = 2.7, rounded up to 3 disks
Performance-centric;
(reads per sec + (writes per sec * RAID write penalty)) / IOPS per disk
(1200 + (400 * 2)) / 125 = 16 disks!
~ 1.1TB raw, or over 500GB usable after RAID
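The two sizing approaches can be reproduced with the slide's own numbers. This sketch reads the "RAID" factor as the RAID write penalty (2, as for mirrored levels such as RAID 10) and "IOPS" as the I/Os per second a single disk can sustain; both readings are assumptions, and the variable names are illustrative.

```sql
-- Inputs from the slide; decimal types avoid integer division.
DECLARE @db_size_gb     decimal(10,2) = 200;
DECLARE @disk_size_gb   decimal(10,2) = 73;
DECLARE @reads_per_sec  decimal(10,2) = 1200;
DECLARE @writes_per_sec decimal(10,2) = 400;
DECLARE @raid_penalty   decimal(10,2) = 2;    -- assumed: physical writes per logical write
DECLARE @iops_per_disk  decimal(10,2) = 125;  -- assumed: sustainable IOPS per disk

SELECT CEILING(@db_size_gb / @disk_size_gb) AS capacity_centric_disks,      -- 3
       CEILING((@reads_per_sec + @writes_per_sec * @raid_penalty)
               / @iops_per_disk)            AS performance_centric_disks;   -- 16
```

Sixteen 73GB disks give 1168GB raw, which is where the "~1.1TB" figure comes from; capacity becomes a by-product of meeting the I/O requirement rather than the design target.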
29. Preface: Many Factors Affect Disk I/O Perf
There are myriad best practices & considerations for optimal
disk I/O subsystem performance.
Be mindful of factors such as:
RAID level
File allocation unit size
Number, size, & speed of disks
Configuration & capacity of HBAs & fabric switches
Consider increasing HBA Queue Depth
Network bandwidth
Cache on disk, controllers, & SAN
Whether disks are dedicated, shared, or virtualized
Bus speed
Number of paths from disk I/O subsystem to server
Driver versions for all components
Stripe size
Stripe unit size
Workload
30. HDD Architecture: 3-D
This image is from a contemporary & otherwise excellent
document, but it represents disks as they were over two decades
ago!
The disk deities at Microsoft won't allow me to perpetuate such
myths.
Graphics source: Veritas Storage Foundation 5.0 for Windows Best
Practices for Storage Management
http://eval.symantec.com/mktginfo/enterprise/white_papers/ent-whitepaper_vsfw_5.0_best_practices_for_storage_mgmt_02-2007.en-us.pdf
31. Partition Alignment Graphic: NTFS 4KB Cluster: Default vs.
Aligned RAID Array ***This has CONTEMPORARY RELEVANCE***
This is a very simplified graphic
Contemporary relevance
Corresponds to default NTFS file allocation unit of 4KB
Given common 64KB stripe unit size
See the Notes for details
Graphics Source: Jimmy May
32. Partition Alignment Graphic: RAID Array: Default vs.
Optimized for SQL Server ***This has CONTEMPORARY
RELEVANCE***
This is a very simplified graphic
Mark Licata, Senior Technology Architect:
"The worst scenario? Random operations using 64K IO and 64K chunk
size. One sector off and you are hitting two disks for every IO
thus halving the random performance potential."
Note: On a RAID array this means accessing two different stripe
units on two separate disks.
Graphics Source: Jimmy May
33. 11 / 20; Using Unaligned Partitions
34. Which of the following RAID levels is not a good choice for
write-intensive DBs?
RAID-0
RAID-1
RAID-5
RAID-10
35. File Configuration
36. 12 / 20; Relying on Autogrowth
37. 13 / 20; Shrinking Files
38. 14 / 20; Full recovery + no log backups
When is space in the t-log file freed for reuse?
Full recovery model; ONLY after a t-log backup
Simple recovery model; On checkpoint
When to use the full recovery model?
When point-in-time recovery is required
Back up the log file!
Take care when moving DBs from/to production
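A quick check for this worst practice, as a sketch: find databases whose log cannot be reused because it is waiting on a log backup, then either back up the log or, if point-in-time recovery is not required, reconsider the recovery model. The backup path is hypothetical.

```sql
-- Databases in FULL recovery showing LOG_BACKUP here need a log backup
-- before the space in their log can be reused.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
ORDER BY name;

-- Log size and percentage used for every database on the instance.
DBCC SQLPERF(LOGSPACE);

-- Back up the log (normally a scheduled job, not a one-off).
BACKUP LOG AdventureWorks2008
TO DISK = 'X:\Backup\AWorks_log.trn';   -- hypothetical path

-- Alternative, only if point-in-time recovery is NOT required:
-- ALTER DATABASE AdventureWorks2008 SET RECOVERY SIMPLE;
```

A database copied from production into a test environment often keeps the full recovery model without any log backup job, which is one way the log grows unnoticed.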
39. Indexing
40. 15 / 20; Too many/not enough indexes
Small dev DB promoted to production (not enough)
Loaded with unused indexes (too many)
Watch for duplicate or overlapping indexes
DMVs to the rescue
sys.dm_db_missing_index_%
sys.dm_db_index_usage_stats
sys.dm_db_index_physical_stats
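The first two DMV families can be queried along these lines. This is a sketch rather than a tuning tool: usage and missing-index statistics are reset when the instance restarts, so the results are only meaningful after a representative period of uptime, and missing-index suggestions still need review for overlap with existing indexes.

```sql
-- Too many: nonclustered indexes with no reads recorded since the last restart.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       ISNULL(s.user_updates, 0) AS user_updates   -- maintenance cost with no read benefit
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id   = i.object_id
      AND s.index_id    = i.index_id
      AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.index_id > 1                               -- skip heaps and clustered indexes
  AND ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0) + ISNULL(s.user_lookups, 0) = 0
ORDER BY user_updates DESC;

-- Not enough: indexes the optimizer wished it had, roughly ranked by benefit.
SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       gs.user_seeks,
       gs.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
       ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS gs
       ON gs.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY gs.user_seeks * gs.avg_total_user_cost * gs.avg_user_impact DESC;
```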
41. Demo; Indexing
42. 16 / 20; Inappropriate index maintenance
Code in Books Online: sys.dm_db_index_physical_stats
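The Books Online sample is built around a fragmentation query like the one sketched below. The 5% and 30% thresholds follow the general guidance in Books Online; the 1000-page floor is an assumption added here to skip small indexes, where fragmentation figures mean little.

```sql
-- Fragmentation per index in the current database, with a suggested action.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name                    AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.page_count,
       CASE WHEN ps.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
            WHEN ps.avg_fragmentation_in_percent > 5  THEN 'REORGANIZE'
            ELSE 'none'
       END AS suggested_action
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
       ON i.object_id = ps.object_id
      AND i.index_id  = ps.index_id
WHERE ps.index_id > 0              -- skip heaps
  AND ps.page_count >= 1000        -- assumed floor; tune to the environment
ORDER BY ps.avg_fragmentation_in_percent DESC;
```

Acting only on indexes that exceed a threshold avoids the blanket "rebuild everything nightly" job, which is the usual form of inappropriate index maintenance.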
43. 17 / 20; Update stats after index rebuild
44. Administration Techniques
45. 18 / 20; Manual Administration
Automation lets you achieve more, with fewer mistakes, in a given
amount of time
46. 19 / 20; Not defining alerts
Manage by exception
SQL Agent Alerts;
Job failures
Performance conditions
High severity errors (level 19 +)
What about error 825 (level 10)?
http://www.karaszi.com/SQLServer/util_agent_alerts.asp
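Alerts for both cases can be created with the SQL Agent stored procedures in msdb. The operator name, e-mail address, and alert names below are hypothetical; the sketch shows one severity alert and one alert for error 825, which would be missed by a severity-only rule because it is raised at level 10.

```sql
USE msdb;
GO

-- Hypothetical operator to notify.
EXEC dbo.sp_add_operator
     @name = N'DBA Team',
     @email_address = N'dba-team@example.com';

-- High severity errors: repeat this alert for each severity from 19 to 25.
EXEC dbo.sp_add_alert
     @name = N'Severity 19 errors',
     @message_id = 0,
     @severity = 19,
     @delay_between_responses = 900,          -- seconds between repeat notifications
     @include_event_description_in = 1;       -- include error text in e-mail

-- Error 825: a read succeeded only after retrying, an early sign of I/O trouble.
EXEC dbo.sp_add_alert
     @name = N'Error 825: read-retry required',
     @message_id = 825,
     @severity = 0,                            -- must be 0 when @message_id is set
     @delay_between_responses = 900,
     @include_event_description_in = 1;

-- Send both alerts to the operator by e-mail (notification method 1).
EXEC dbo.sp_add_notification
     @alert_name = N'Severity 19 errors',
     @operator_name = N'DBA Team',
     @notification_method = 1;

EXEC dbo.sp_add_notification
     @alert_name = N'Error 825: read-retry required',
     @operator_name = N'DBA Team',
     @notification_method = 1;
```

E-mail notifications also depend on Database Mail being configured and enabled for SQL Server Agent.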
47. 20 / 20; No task lists/check lists
48. Demo; Administration techniques
49. Summary
Be cautiously pessimistic
Design backups from a restore perspective
Establish & maintain performance baselines
Validate the I/O chain
Use a performance-centric design
Don't rely on out-of-the-box settings
Understand the indexing DMVs
Automate & manage by exception
52. Thank you
for attending this session and the 2009 PASS Summit in
Seattle