DBAs Behaving Badly... Worst Practices for Database Administrators, Rod Colledge


Transcript of DBAs Behaving Badly... Worst Practices for Database Administrators, Rod Colledge

  • 1. DBAs Behaving Badly
    Worst Practices for Database Administrators
  • 2. About the writer
    Rod Colledge
    Independent SQL Consultant
    Based in Brisbane, Australia
    Web; www.sqlCrunch.com
    Blog; www.rodcolledge.com
    MVP Deep Dives Book
    Twitter @rodcolledge
    linkedin.com/in/rodcolledge
  • 3. About us.
    Dubi Lebel
    DBA
  • 4. About us.
    Dubi Lebel
    DBA Dubi Behind All
  • 5. About us.
    Dubi Lebel
    DBA Dubi Behind All
    D.B.A
  • 6. About us.
    Dubi Lebel
    DBA Dubi Behind All
    D.B.A Don't Bother Asking
    Shahar Bar
    SQL Consultant and CEO at Valinor
  • 7. Session Overview
    Disaster Recovery (DR) Planning
    Backup & Restore
    Change Control
    Storage Configuration
    File Configuration
    Indexing
    Administration Techniques
  • 8. Disaster Recovery (DR) Planning
  • 9. 1 / 20; Not having SLAs
    SLAs provide context for everything, e.g.:
    Database available 24/7 @ 99.999% uptime
    Zero data loss
    Sub-second response time
    Use option papers during SLA negotiations
  • 10. SLA Option Papers
  • 11. 2 / 20; Not having/testing DR plans
    Do you have DR Plans?
    How do you know your plans will work?
    DR fire drills
    All/new DBAs trained in recovery procedures?
    Location of recovery documents & scripts?
    Documents/scripts up to date?
  • 12. 2 / 20; Not having/testing DR plans
  • 13. 3 / 20; Narrow definition of disaster
    Types of disasters;
    Complete environmental destruction
    Air conditioning failure
    Disk crash
    Accidentally dropping a table/database
    Security breach; what data was accessed?
    The next disaster will be unanticipated. Are your DR plans pessimistic enough?
  • 14. Backup & Restore
  • 15. argh! ... who would have thought we needed backups?
  • 16. 4 / 20; Not Taking Backups
    Huh?
    Less obvious variations;
    File system backups only
    No transaction log backups
    SAN snapshots: recoverability?
  • 17. 5 / 20; Not Verifying Backups
    How do you know they worked?
    Verification options
    RESTORE VERIFYONLY FROM
    Restore to a Reporting Server
    Log shipping (log backup verification)
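
    A minimal T-SQL sketch of the first verification option. The backup path and file name are hypothetical, not from the slides. RESTORE VERIFYONLY confirms that the backup set is complete and readable; it does not prove that the database will restore cleanly, so the restore-based options listed above remain the stronger test.

    ```sql
    -- Take the backup WITH CHECKSUM so page checksums are validated
    -- during the backup and a checksum is stored for the backup itself.
    BACKUP DATABASE AdventureWorks2008
    TO DISK = N'G:\SQL Backup\AWorks_verify.bak'   -- hypothetical path
    WITH CHECKSUM;

    -- Re-read the backup and re-check the checksums without restoring any data.
    RESTORE VERIFYONLY
    FROM DISK = N'G:\SQL Backup\AWorks_verify.bak'  -- same hypothetical path
    WITH CHECKSUM;
    ```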
  • 18. 6 / 20; Designing for Backups only
    Design for restoration!
    What is the data loss exposure?
    How long will the recovery take?
    Script, test & document various restore scenarios
  • 19. Backup Compression
    BACKUP DATABASE AdventureWorks2008
    TO DISK = 'G:\SQL Backup\AWorks.bak'
    WITH COMPRESSION
  • 20. Change Control
  • 21. 7 / 20; Insufficient Test Environments
  • 22. 8 / 20; No Performance Baseline
  • 23. 9 / 20; No Standard Build/Change Log
    Without a change log, how can you answer;
    Why is something different?
    Who made the change?
    When was the change made?
    Was the change successful?
    What will happen if the change is rolled back?
  • 24. Policy Based Management
  • 25. Configuration File.ini
  • 26. Demo; Configuration Changes Report
  • 27. Storage Configuration
  • 28. 10 / 20; Capacity-Centric Design
    200GB database: how many 73GB disks?
    Capacity-centric:
    200 / 73 = 2.7, rounded up to 3 disks
    Performance-centric:
    (reads per sec + (writes per sec * RAID write penalty)) / IOPS per disk
    (1200 + (400 * 2)) / 125 = 16 disks!
    ~ 1.1TB raw, or ~ 500GB after RAID
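
    The two sizing approaches above can be reproduced with a few variables, which makes it easy to plug in a measured workload. This is only a sketch: the variable names are illustrative, and reading the slide's factor of 2 as a RAID 10 write penalty is an assumption.

    ```sql
    -- Figures taken from the slide
    DECLARE @db_size_gb         int = 200;
    DECLARE @disk_size_gb       int = 73;
    DECLARE @reads_per_sec      int = 1200;
    DECLARE @writes_per_sec     int = 400;
    DECLARE @raid_write_penalty int = 2;    -- assumed: RAID 10, two physical writes per logical write
    DECLARE @iops_per_disk      int = 125;

    SELECT
        -- Capacity-centric: enough disks to hold the data
        CEILING(@db_size_gb * 1.0 / @disk_size_gb) AS disks_for_capacity,
        -- Performance-centric: enough disks to serve the I/O load
        CEILING((@reads_per_sec + @writes_per_sec * @raid_write_penalty) * 1.0
                / @iops_per_disk) AS disks_for_performance;
    ```

    With the slide's numbers this returns 3 disks for capacity and 16 for performance.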
  • 29. Preface: Many Factors Affect Disk I/O Perf
    There are myriad best practices & considerations for optimal disk I/O subsystem performance.
    Be mindful of factors such as:
    RAID level
    File allocation unit size
    Number, size, & speed of disks
    Configuration & capacity of HBAs & fabric switches
    Consider increasing HBA Queue Depth
    Network bandwidth
    Cache on disk, controllers, & SAN
    Whether disks are dedicated, shared, or virtualized
    Bus speed
    Number of paths from disk I/O subsystem to server
    Driver versions for all components
    Stripe size
    Stripe unit size
    Workload
  • 30. HDD Architecture: 3-D
    This image is from a contemporary & otherwise excellent document, but it represents disks as they were over two decades ago!
    The disk deities at Microsoft won't allow me to perpetuate such myths.
    Graphics source: Veritas Storage Foundation 5.0 for Windows Best Practices for Storage Management
    http://eval.symantec.com/mktginfo/enterprise/white_papers/ent-whitepaper_vsfw_5.0_best_practices_for_storage_mgmt_02-2007.en-us.pdf
  • 31. Partition Alignment Graphic: NTFS 4KB Cluster: Default vs. Aligned RAID Array ***This has CONTEMPORARY RELEVANCE***
    This is a very simplified graphic
    Contemporary relevance
    Corresponds to default NTFS file allocation unit of 4KB
    Given common 64KB stripe unit size
    See the Notes for details
    Graphics Source: Jimmy May
  • 32. Partition Alignment Graphic: RAID Array: Default vs. Optimized for SQL Server ***This has CONTEMPORARY RELEVANCE***
    This is a very simplified graphic
    Mark Licata, Senior Technology Architect
    The worst scenario? Random operations using 64K IO and 64K chunk size. One sector off and you are hitting two disks for every IO thus halving the random performance potential.
    Note: On a RAID array this means accessing two different stripe units on two separate disks.
    Graphics Source: Jimmy May
  • 33. 11 / 20; Using Unaligned Partitions
  • 34. Which of the following RAID levels is not a good choice for write-intensive DBs?
    RAID-0
    RAID-1
    RAID-5
    RAID-10
  • 35. File Configuration
  • 36. 12 / 20; Relying on Autogrowth
  • 37. 13 / 20; Shrinking Files
  • 38. 14 / 20; Full recovery + no log backups
    When are records removed from the t-log file?
    Full recovery model; ONLY after t-log backup
    Simple recovery model; On checkpoint
    When to use full recovery model?
    When point in time recovery is required
    Back up the log file!
    Take care when moving DBs from/to production
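
    One way to spot this worst practice is to list the databases in the full recovery model alongside their most recent log backup. The sketch below uses sys.databases and the backup history in msdb; the backup path in the final statement is hypothetical.

    ```sql
    -- Databases in FULL recovery, what is currently preventing log reuse,
    -- and when the log was last backed up (NULL = no log backup on record).
    SELECT d.name,
           d.log_reuse_wait_desc,            -- 'LOG_BACKUP' means the log is waiting on a log backup
           MAX(b.backup_finish_date) AS last_log_backup
    FROM sys.databases AS d
    LEFT JOIN msdb.dbo.backupset AS b
           ON b.database_name = d.name
          AND b.type = 'L'                   -- 'L' = transaction log backup
    WHERE d.recovery_model_desc = 'FULL'
    GROUP BY d.name, d.log_reuse_wait_desc
    ORDER BY last_log_backup;

    -- Back up the log so the inactive portion can be reused.
    BACKUP LOG AdventureWorks2008
    TO DISK = N'G:\SQL Backup\AWorks_log.trn';   -- hypothetical path
    ```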
  • 39. Indexing
  • 40. 15 / 20; Too many/not enough indexes
    Small dev DB promoted to production (not enough)
    Loaded with unused indexes (too many)
    Watch for duplicate or overlapping indexes
    DMVs to the rescue
    sys.dm_db_missing_index_%
    sys.dm_db_index_usage_stats
    sys.dm_db_index_physical_stats
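
    As a rough illustration of how these DMVs are queried, the sketch below lists nonclustered indexes in the current database by how little they are read, then the missing-index suggestions the optimizer has recorded. Both result sets are hints rather than instructions: usage counters reset when the instance restarts, and the ordering expression in the second query is just one common heuristic.

    ```sql
    -- Candidate unused indexes: maintained on writes but rarely or never read.
    SELECT OBJECT_NAME(i.object_id) AS table_name,
           i.name                   AS index_name,
           ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0)
               + ISNULL(s.user_lookups, 0) AS total_reads,
           ISNULL(s.user_updates, 0)       AS total_writes
    FROM sys.indexes AS i
    LEFT JOIN sys.dm_db_index_usage_stats AS s
           ON s.object_id   = i.object_id
          AND s.index_id    = i.index_id
          AND s.database_id = DB_ID()
    WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
      AND i.index_id > 1                 -- nonclustered indexes only
      AND i.is_primary_key = 0
      AND i.is_unique_constraint = 0
    ORDER BY total_reads, total_writes DESC;

    -- Missing-index suggestions, highest estimated benefit first.
    SELECT d.[statement] AS table_name,
           d.equality_columns,
           d.inequality_columns,
           d.included_columns,
           gs.user_seeks,
           gs.avg_user_impact
    FROM sys.dm_db_missing_index_details AS d
    JOIN sys.dm_db_missing_index_groups AS g
          ON g.index_handle = d.index_handle
    JOIN sys.dm_db_missing_index_group_stats AS gs
          ON gs.group_handle = g.index_group_handle
    ORDER BY gs.user_seeks * gs.avg_user_impact DESC;
    ```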
  • 41. Demo; Indexing
  • 42. 16 / 20; Inappropriate index maintenance
    Code in Books Online: sys.dm_db_index_physical_stats
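
    This is not the Books Online script itself, only a short sketch of the idea behind it: measure fragmentation first, then choose the lightest maintenance action that fits. The 5% and 30% thresholds are the commonly cited Books Online starting points, and the 1000-page floor is an assumption used to skip small indexes.

    ```sql
    SELECT OBJECT_NAME(ps.object_id) AS table_name,
           i.name                    AS index_name,
           ps.page_count,
           ps.avg_fragmentation_in_percent,
           CASE
               WHEN ps.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
               WHEN ps.avg_fragmentation_in_percent > 5  THEN 'REORGANIZE'
               ELSE 'none'
           END AS suggested_action
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
    JOIN sys.indexes AS i
          ON i.object_id = ps.object_id
         AND i.index_id  = ps.index_id
    WHERE ps.index_id > 0          -- skip heaps
      AND ps.page_count > 1000     -- assumed floor: ignore small indexes
    ORDER BY ps.avg_fragmentation_in_percent DESC;
    ```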
  • 43. 17 / 20; Update stats after index rebuild
  • 44. Administration Techniques
  • 45. 18 / 20; Manual Administration
    Automation enables more things to be achieved with fewer mistakes in a given amount of time
  • 46. 19 / 20; Not defining alerts
    Manage by exception
    SQL Agent Alerts;
    Job failures
    Performance conditions
    High severity errors (level 19 +)
    What about error 825 (level 10)?
    http://www.karaszi.com/SQLServer/util_agent_alerts.asp
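
    Because error 825 is only severity 10, an alert on high-severity errors will never catch it; it needs an alert on the message number. A minimal sketch using the SQL Agent procedures in msdb follows. The alert name, the 15-minute delay, and the operator name are assumptions, and the operator must already exist.

    ```sql
    -- Alert on message 825 (a read that succeeded only after retrying).
    EXEC msdb.dbo.sp_add_alert
        @name                         = N'Error 825: read-retry required',  -- hypothetical name
        @message_id                   = 825,
        @severity                     = 0,     -- must be 0 when @message_id is supplied
        @enabled                      = 1,
        @delay_between_responses      = 900,   -- assumed: at most one response per 15 minutes
        @include_event_description_in = 1;     -- include the error text in e-mail

    -- Send the alert to an existing operator.
    EXEC msdb.dbo.sp_add_notification
        @alert_name          = N'Error 825: read-retry required',
        @operator_name       = N'DBA Team',    -- hypothetical operator
        @notification_method = 1;              -- 1 = e-mail
    ```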
  • 47. 20 / 20; No task lists/check lists
  • 48. Demo; Administration techniques
  • 49. Summary
    Be cautiously pessimistic
    Design backups from a restore perspective
    Establish & maintain performance baselines
    Validate the I/O chain
    Use a performance-centric design
    Don't rely on out-of-the-box settings
    Understand the indexing DMVs
    Automate & manage by exception
  • 52. Thank you
    for attending this session and the 2009 PASS Summit in Seattle