2013_544_Sagrillo_ppt
Be a Hero with your DBA: Database Performance Tuning for System Admins and IT Architects
Randal Sagrillo
Session #544
Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
Program Agenda
Scope and Method
Tools
Examples
Next Steps
NOTE: I ASSUME SQL, SCHEMA, INSTANCE AS TUNED AS THEY CAN GET!
The Life of a Systems Architect Isn’t Easy
And Not Much Better for DBAs and SysAdmins
More Users, More Data, More Transactions, More Complexity, More Hardware, More Software, More Data Centers
Lower Performance
It is All About I/O: Logical I/O
Faster CPUs Usually Mean Faster Memory and More Memory
[Figure: Database Size (Relational Data) and Working Set Size compared with DB Memory Size; Query/DML time split between I/O and CPU under Logical I/O, with the labels 20% and 80%.]
It is All About I/O: Physical I/O
Faster CPUs Do Not Help Physical-I/O-Bound Databases
[Figure: Database Size (Relational Data) and Working Set Size (Relational Data) compared with DB Memory Size; Query/DML time split between I/O and CPU under Physical I/O, with the labels 80% and 20%.]
Enterprise Application Issues
Batch job duration too long
Reporting/ad hoc query too long
OLTP transaction times too long (Business value)
Or not high enough OLTP rate (Operational value)
Typical Storage Bottlenecks
Maximum IOPS delivered
– Talked about the most, but least important for enterprise apps
– Really measures concurrency
Maximum data rate delivered
– Really measures channel and disk bandwidth
Shortest service time delivered
– Usually most important for databases
I/O Supply vs. Demand
[Figure: I/O supply versus demand, compared in IOPS, MB/sec, and milliseconds.]
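The remark that maximum IOPS "really measures concurrency" follows from Little's Law: at a fixed service time, the achievable I/O rate is the number of outstanding I/Os divided by that service time. The sketch below illustrates the relationship in Python; the function names, the 10 ms service time, and the 8 KB I/O size are illustrative assumptions, not figures from the slides.

```python
def iops_at_concurrency(outstanding_ios: float, service_time_ms: float) -> float:
    """Little's Law rearranged: throughput = concurrency / service time."""
    if service_time_ms <= 0:
        raise ValueError("service_time_ms must be positive")
    return outstanding_ios * 1000.0 / service_time_ms


def data_rate_mb_per_sec(iops: float, io_size_kb: float) -> float:
    """Bandwidth implied by an I/O rate at a fixed I/O size."""
    return iops * io_size_kb / 1024.0


if __name__ == "__main__":
    # Hypothetical workload: 8 KB reads with a 10 ms service time.
    for outstanding in (1, 8, 32):
        iops = iops_at_concurrency(outstanding, service_time_ms=10.0)
        mb_s = data_rate_mb_per_sec(iops, io_size_kb=8.0)
        print(f"{outstanding:>2} outstanding I/Os -> {iops:,.0f} IOPS, {mb_s:.1f} MB/sec")
```

A single session issuing one read at a time sees only the service time, so a higher IOPS ceiling does not shorten its response time; only a shorter service time does.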
Performance Methodology
Performance below expectation, variance, degradation over time, etc.
Identify SLAs
Systemic analysis
Minimize scope
Tuning tips
Document and apply best practices
Tools to Identify Database Performance Issues
Database performance view gives more insight than just OS view
OS tools
– mpstat, iostat, strace, truss
– DTrace, SWAT
– Very powerful, expert tools
– But hard to estimate impact/relevance to database performance
‘Free’ DB Utilities
– SQL tracing/tkprof
– Statspack: PL/SQL code, since Oracle 8i, download
Licensed Tools
– Oracle Tuning Pack: SQL tuning
– Oracle Diagnostic Pack:
  Automatic Database Diagnostic Monitor (ADDM)
  Active Session History (ASH)
  Automatic Workload Repository (AWR)
AWR & Statspack: First Things
Top of the reports: What is the Environment? How Long?
AWR & Statspack: Most Important Things
How much would faster CPU execution help here?
Database I/O Bottlenecks: Wait Events
Typical I/O wait types, foreground
– db file sequential read: wait for a single-block read from disk into the database buffer cache
– log file sync: waiting for background write of log data to complete
– db file scattered read: wait for multi-block read into buffer cache
– read by other session: another session waiting for block above
– direct path read: read bypassing buffer cache directly into PGA
Typical I/O wait types, background
– log file parallel write: write log data (typically to NVRAM) from LGWR
– db file parallel write: write to tables async from DBWR(s)
– log file sequential read: to build archive log, Data Guard
– log archive I/O, RMAN, etc.
Note: These Are OFF-CPU Events!
Example #1: Online Payment Processing
Operational Objective: reduce I/O burden on large multi-million dollar storage system.
Deployment platform and topology:
– SPARC T-Series, Oracle Solaris
– Oracle Real Application Clusters (RAC)
‘db file sequential read’: The ‘Poster Child’ Off-CPU Wait Event
‘db file sequential read’ Before Optimization
Much More I/O Wait Time Than Real Time
Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~
Event                             Waits  Time (s)  Avg wait (ms)  % Total Call Time  Wait Class
----------------------------  ---------  --------  -------------  -----------------  ----------
db file sequential read       3,189,229    34,272             11               67.8  User I/O
CPU time                                   11,332                              22.4
log file sync                 2,247,374     4,612              2                9.1  Commit
gc cr grant 2-way             1,365,247       793              1                1.6  Cluster
enq: TX - index contention      140,257       720              5                1.5  Concurrenc
Analysis:
– ‘db file sequential read’: about 3,500 IOPS, 10.75 ms average
– 15 minute snapshots ‘under load’: but 9.5 hours of disk waits!
– 77 minutes of commit time
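The analysis figures can be re-derived from the table with a few divisions. The sketch below does that in Python; the wait counts and wait times are copied from the table, while the function names and the assumption that the table covers exactly one 15-minute (900-second) snapshot are mine.

```python
SNAPSHOT_SECONDS = 15 * 60  # assumption: the table covers one 15-minute snapshot

# Copied from the 'Before Optimization' table.
READ_WAITS = 3_189_229          # db file sequential read, waits
READ_WAIT_SECONDS = 34_272      # db file sequential read, total wait time
LOG_SYNC_WAIT_SECONDS = 4_612   # log file sync, total wait time


def rate_per_second(count: float, elapsed_seconds: float) -> float:
    """Events per second of wall-clock time."""
    return count / elapsed_seconds


def average_wait_ms(total_wait_seconds: float, waits: float) -> float:
    """Mean wait per event, in milliseconds."""
    return total_wait_seconds / waits * 1000.0


if __name__ == "__main__":
    print(f"read IOPS:      {rate_per_second(READ_WAITS, SNAPSHOT_SECONDS):,.0f}")
    print(f"avg read wait:  {average_wait_ms(READ_WAIT_SECONDS, READ_WAITS):.2f} ms")
    print(f"disk wait time: {READ_WAIT_SECONDS / 3600:.1f} hours")
    print(f"commit time:    {LOG_SYNC_WAIT_SECONDS / 60:.0f} minutes")
```

Wait time summed across all sessions can far exceed the snapshot length, which is why 9.5 hours of disk waits fit into a 15-minute window.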
11gR2 Database Smart Flash Cache
If I Cannot Add DRAM or Increase SGA
Acts as Level 2 Buffer Cache (SGA holds pointers)
Clean cache! Almost changes physical read I/O to logical I/O
Rule of sizing: 2x – 10x SGA size. See Buffer Pool Advisory to narrow estimate
Best accelerates read-intensive workloads
[Figure: Buffer Cache backed directly by Storage, compared with Buffer Cache backed by Database Smart Flash Cache and then Storage; arrows labeled Few I/Os and Many I/Os.]
Database Smart Flash Cache Setup
Two Simple Steps
Aggregate Flash LUNs into ONE file
– ASM preferred
– Concatenate. No mirroring - it is a cache!
Set two init.ora parameters
– db_flash_cache_file = <+flashdg/FlashCacheFile>
  Path to flash file/raw aggregation/metadevice
– db_flash_cache_size = <flash file size>
  Level 2 buffer cache size: amount of flash file to use
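As a rough illustration of the two steps, the Python sketch below applies the 2x to 10x sizing rule quoted earlier in the deck and renders the two init.ora lines. The helper names, the 20 GB SGA, and the 190 GB flash file are hypothetical; only the two parameter names and the example disk-group path come from the slides. Treat the output as a starting point to check against the Buffer Pool Advisory, not as a recommended setting.

```python
from typing import List, Tuple


def flash_cache_size_range_gb(sga_gb: float) -> Tuple[float, float]:
    """Rule of thumb from the slides: flash cache of 2x to 10x the SGA size."""
    return 2 * sga_gb, 10 * sga_gb


def flash_cache_init_params(flash_file: str, flash_size_gb: int) -> List[str]:
    """Render the two init.ora lines that enable Database Smart Flash Cache."""
    return [
        f"db_flash_cache_file = '{flash_file}'",
        f"db_flash_cache_size = {flash_size_gb}G",
    ]


if __name__ == "__main__":
    sga_gb = 20           # hypothetical SGA size
    flash_size_gb = 190   # hypothetical flash file size
    low, high = flash_cache_size_range_gb(sga_gb)
    if not low <= flash_size_gb <= high:
        print(f"note: {flash_size_gb} GB is outside the {low:.0f}-{high:.0f} GB rule of thumb")
    for line in flash_cache_init_params("+flashdg/FlashCacheFile", flash_size_gb):
        print(line)
```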
RAC Considerations for Smart Flash Cache
RAC scaling generally held
Eliminates physical I/O if block is in any node’s Buffer Cache
But only checks blocks in local node’s Flash Cache file
Flash Cache is Not Shared!
[Figure: Two Node RAC Example. Each node has its own Buffer Cache and Database Smart Flash Cache; the nodes are connected through the Global Buffer Cache (LMS) and Shared Storage.]
Example #1 After Optimizing with Flash Cache
Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~
Event                               Waits  Time (s)  Avg wait (ms)  % Total Call Time  Wait Class
------------------------------  ---------  --------  -------------  -----------------  ----------
CPU time                                     11,353                              57.6
log file sync                   1,434,247     6,587              3               33.4  Commit
flash cache single block read   4,221,599     2,284              1               21.3  User I/O
buffer busy waits                 723,807     1,502            329                3.3  Concurrenc
db file sequential read            22,727       182              8                0.9  User I/O
Results: 140x reduction in ‘db file sequential read’ waits with ~190GB of Flash!
– Average Flash Cache read time 540us vs 10.75ms: 20X quicker.
– Transaction and commit rate also went up over 40%!
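The 140x and 20X figures are ratios between the 'before' and 'after' AWR tables. A small Python check, using only the wait counts and wait times printed in those two tables (the constant and function names are mine):

```python
# 'db file sequential read' before optimization (Example #1, first AWR table).
BEFORE_DISK_WAITS = 3_189_229
BEFORE_DISK_WAIT_SECONDS = 34_272

# After optimization (Example #1, second AWR table).
AFTER_DISK_WAITS = 22_727
FLASH_WAITS = 4_221_599
FLASH_WAIT_SECONDS = 2_284


def disk_read_reduction() -> float:
    """How many times fewer single-block disk reads were waited on."""
    return BEFORE_DISK_WAITS / AFTER_DISK_WAITS


def flash_read_microseconds() -> float:
    """Average wait for a flash cache single block read, in microseconds."""
    return FLASH_WAIT_SECONDS / FLASH_WAITS * 1_000_000


def flash_speedup() -> float:
    """Average disk read wait (before) divided by average flash read wait (after)."""
    before_us = BEFORE_DISK_WAIT_SECONDS / BEFORE_DISK_WAITS * 1_000_000
    return before_us / flash_read_microseconds()


if __name__ == "__main__":
    print(f"disk read reduction: {disk_read_reduction():.0f}x")
    print(f"flash read time:     {flash_read_microseconds():.0f} us")
    print(f"flash speedup:       {flash_speedup():.1f}x")
```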
Example #2: Bank Processing
Bank Objective: reduce ‘proccode’ response time
Deployment topology:
– Solaris Capped Containers Zones
– SPARC T-Series
– 2GB Buffer Cache
Not Always About Killer IOPS Rate!
Example #2: Top Wait Events Before
295 IOPS foreground reads
– 1,062,961 waits (I/Os) / 60.13 minutes (3607.8 seconds)
Propose to add ~90GB of Flash
Adding IOPS Supply Will Not Help Much Here
Example #2: Top Wait Events After Optimized
Average 507us wait from flash cache
– 114 seconds / 224,808 waits = 0.000507 sec/wait response time
Proccode response time cut by more than half!
Fewer index reads needed
20 Times Shorter I/O Wait Times with Flash
‘db file sequential read’ Summary
Two-thirds of OLTP databases have this as the majority wait event.
– Even some data warehouses!
Use Buffer Pool Advisory to determine how much more cache is needed
– If you can add/reallocate memory to the DB servers: GREAT!
20x reduction in storage response time is common with flash vs HDD in arrays.
2x improvement in SLA typical when I/O bound.
– 5x and higher improvements seen.
How to Make This Go Away
Example #3: Batch Processing (SAP)
Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~
Event                               Waits  Time (s)  Avg wait (ms)  % Total Call Time  Wait Class
----------------------------  -----------  --------  -------------  -----------------  ----------
db file sequential read       109,123,471   593,577              5               40.0  User I/O
log file sync                   1,818,523   559,444            308               37.7  Commit
CPU time                                    344,454                              23.2
db file parallel write          1,444,242    35,970             25                2.4  System I/O
log file parallel write           775,249    17,371             22                1.2  System I/O
Analysis:
– 473 minute (~8 hour) snapshot
– With 290 minutes (~5 hours) of logging time, but over 155 hours commit time!!
– Nearly 1/3 second per batch commit. 1.8M commits (~64/second)
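The analysis numbers for this example follow directly from the table and the 473-minute snapshot length. A short Python sketch of the arithmetic (inputs copied from the slide; the names are mine):

```python
SNAPSHOT_MINUTES = 473

# Copied from the Example #3 AWR table.
LOG_FILE_SYNC_WAITS = 1_818_523            # one wait per commit
LOG_FILE_SYNC_SECONDS = 559_444            # total commit wait time
LOG_FILE_PARALLEL_WRITE_SECONDS = 17_371   # total LGWR write time


def commit_hours() -> float:
    """Total time sessions spent waiting on commits, in hours."""
    return LOG_FILE_SYNC_SECONDS / 3600


def seconds_per_commit() -> float:
    """Average commit wait, in seconds."""
    return LOG_FILE_SYNC_SECONDS / LOG_FILE_SYNC_WAITS


def commits_per_second() -> float:
    """Commit rate over the whole snapshot."""
    return LOG_FILE_SYNC_WAITS / (SNAPSHOT_MINUTES * 60)


def log_write_share_of_commit_time() -> float:
    """Fraction of commit wait time that is actual log writing."""
    return LOG_FILE_PARALLEL_WRITE_SECONDS / LOG_FILE_SYNC_SECONDS


if __name__ == "__main__":
    print(f"logging time:     {LOG_FILE_PARALLEL_WRITE_SECONDS / 60:.0f} minutes")
    print(f"commit time:      {commit_hours():.1f} hours")
    print(f"per commit:       {seconds_per_commit():.3f} s")
    print(f"commit rate:      {commits_per_second():.0f}/s")
    print(f"log write share:  {log_write_share_of_commit_time():.1%}")
```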
Example #3 Analysis
While log write time is slow (22 ms average), log writing time is only 3% of commit time!
– Log write time is only 1.2% of entire AWR report, but STILL top 5!
– More important: 5 hours of single-process I/O in 8 hours real time!
Solution: Scheduling (LGWR priority!)
– Processor binding improved commit times 4X!
Follow on: work to improve storage subsystem write response times
Only One Log Writer Process per Instance
Example #4: Telecommunications
Communications Objective: Reduce commit time.
More Logging
Example #4 Analysis
Archiving each redo log write after every commit… IN A SQL*LOADER ENVIRONMENT!
– No direct load
This was a stress test…
‘log file sync/log file parallel write’ Summary
~20% of OLTP databases have this as a majority wait event
– Also a common wait bottleneck in batch environments
While improving log write (I/O wait) time will help, it is usually easier to improve scheduling
– Processor binding or ‘Critical Threads’ feature
ALWAYS compare total LGWR log write time to real time (snap) duration!
– As serialized single-process (LGWR) time approaches all available real time, there is no more room to schedule: THEN you need to speed up log write response times!
– Separate log device, HBAs, channel paths, LUNs, LUN cache, etc.
– Log-optimized storage: NVRAM, SPARC SuperCluster, Exadata
Managing Commit Time to Applications for Batch and OLTP
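The "always check" rule can be turned into a one-line ratio: total ‘log file parallel write’ time divided by snapshot duration gives the fraction of real time the single LGWR process spent writing. The Python sketch below applies it to the Example #3 figures; the 0.8 cut-off is a hypothetical threshold chosen for illustration and is not stated in the slides.

```python
def lgwr_busy_fraction(log_write_seconds: float, snapshot_seconds: float) -> float:
    """Fraction of wall-clock time the single LGWR process spent in log writes."""
    if snapshot_seconds <= 0:
        raise ValueError("snapshot_seconds must be positive")
    return log_write_seconds / snapshot_seconds


def first_step(busy_fraction: float, saturation_threshold: float = 0.8) -> str:
    """Pick a first tuning step. The default threshold is an assumption."""
    if busy_fraction >= saturation_threshold:
        return "speed up log write response times (storage)"
    return "improve LGWR scheduling (priority, processor binding)"


if __name__ == "__main__":
    # Example #3: 17,371 s of 'log file parallel write' in a 473-minute snapshot.
    busy = lgwr_busy_fraction(17_371, 473 * 60)
    print(f"LGWR busy {busy:.0%} of real time -> {first_step(busy)}")
```

With LGWR writing for roughly 61% of the snapshot there was still scheduling headroom, which is consistent with processor binding being tried before storage changes in that example.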
Additional Resources: Performance/AWR
Oracle® Database Concepts 11g R2 (esp Chapter 14)
http://docs.oracle.com/cd/E11882_01/server.112/e25789/toc.htm
Oracle® Database Performance Tuning Guide 11g R2 (esp Chapter 10)
http://docs.oracle.com/cd/E11882_01/server.112/e16638/toc.htm
Oracle® Database Licensing Information 11g R2 (esp Chapter , Diagnostic Pack)
http://docs.oracle.com/cd/E11882_01/license.112/e10594/toc.htm
Additional Resources: Statspack
Statspack Overview
http://www.orafaq.com/wiki/Statspack
Statspack Installation (Last used in 9i)
http://docs.oracle.com/cd/B10501_01/server.920/a96533/statspac.htm#27255
Using Statspack with Oracle 11g
http://myoracleworld.hobby-electronics.net/DB-statspack.html
Learn More About Oracle Optimized Solutions
Access To Webcasts, Videos, Whitepapers, Blogs, and More
http://oracle.com/optimizedsolutions
Check out Oracle Optimized Solution for Oracle Database