Oracle Performance Tuning Fundamentals
Carlos Sierra
Enkitec (c) 2014 2
Carlos Sierra
• Oracle Performance and SQL Tuning
• Consultant/Developer/DBA
• eDB360 and eAdam
• SQLT and SQLHC
• Exadata
Oracle Performance Tuning Fundamentals
• Motivation
• Times and Metrics
• Wait Events
• Statistics
• Some Myths
Why Performance Tuning?
• Performance is about Time
  – Comply with Service Level Agreements (SLA)
  – Affects User Experience
  – Restricts Scalability
Why is Performance Tuning hard?
• Concurrency
  – More than one user
• Dynamism
  – Constant Change
• Complexity
  – Too many knobs and software layers
Performance Tuning Approaches
• “Top-down”
• “Bottom-up”
• “Holistic”
• “Random” a.k.a. “Trial and Error”
• “Silver Bullets” a.k.a. “One Size Fits All”
Back to Basics
• Understand your Operating System
• Understand your Database
• Understand your Business
• Understand your Users
• Understand your Options
System Resources
• CPU
• Memory
• Disk Space
• Throughput
  – IOPS
  – MBPS
OS Performance Monitoring Tools
• sar
• top and htop
• mpstat and vmstat
• iotop and iostat
• dtrace and strace
OS Tools Strategy
• Define your own subset of Tools to use
• Learn them well
• Create some Scripts or use a GUI
• Correlate DB Performance to OS Monitoring
• Work with your System Administrator
OS Statistics within Oracle
• Provide a high-level view of the OS
• Cumulative Metrics
• Views
  – V$OSSTAT
  – DBA_HIST_OSSTAT
Oracle Architecture
• Database
  – Data Files + Temp Files + Redo Log + Control Files
• Instance
  – SGA + PGA + Processes
• RAC
Service and Wait Times
• Service Time ≈ Time on CPU (productive)
• Wait Time (overhead)
  – Non-idle (actively waiting)
    • Associated with the DB
  – Idle (non-productive from DB perspective)
    • Outside the DB
Timed Events
• On CPU + Non-idle Wait Events
• There are hundreds of Wait Events
  – Wait Events are rolled up into 12 Wait Classes
• Wait Events represent System Calls
Elapsed Time
• Time between two measurements
  – Usually between two snapshots
• Same as Wall Clock Time
• Foundation of User Experience
Database Time
• Time in User Calls
  – Includes CPU Time and non-idle Waits
  – Excludes background processes
  – Excludes idle Waits
• Can be higher than Elapsed Time
  – For example, when 3 users perform work concurrently
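A minimal sketch of why Database Time can exceed Elapsed Time, assuming three hypothetical foreground sessions that are each busy for a whole 10-minute window. The function name and the per-session figures are illustrative only and do not come from the slides.

```python
def db_time_seconds(sessions):
    """Sum CPU time and non-idle wait time over foreground sessions.

    Each session is a (cpu_seconds, non_idle_wait_seconds) pair. Idle waits
    and background processes are left out, matching the definition above.
    """
    return sum(cpu + wait for cpu, wait in sessions)


# Hypothetical 10-minute window with three users working the whole time.
elapsed_seconds = 10 * 60
sessions = [(400, 200), (350, 250), (100, 500)]

total_db_time = db_time_seconds(sessions)
print(f"DB Time: {total_db_time} s, Elapsed Time: {elapsed_seconds} s")
```

Each session contributes up to one second of Database Time per second of wall clock, so three concurrently busy sessions accumulate roughly three times the Elapsed Time.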
CPU Time
• a.k.a. “DB CPU” or “CPU used by this session”
• Time spent on the CPU by the database
• It does not include time waiting for CPU
Database CPU Time Ratio
• CPU Time as a percentage of DB Time
• Example:
  – DB Time is 3 hours
  – CPU Time is 2 hours
  – Database CPU Time Ratio is 100 * 2 / 3 ≈ 67%
• Note: DB CPU Time Ratio does not tell much on its own
CPU Count
• Number of CPUs available to Database
• Parameter CPU_COUNT
  – V$SYSTEM_PARAMETER2
• Influences CBO, PX, RM and Log Buffer
• Default of 0 means: dynamically get OS value
  – OS Number of CPUs reported
Number of CPUs
• As reported by OS
• Visible through V$OSSTAT
  – STAT_NAME = 'NUM_CPUS'
• CPU_COUNT and NUM_CPUS could have the same value
  – But they don’t have to
CPU Capacity
• Database
  – V$SYSTEM_PARAMETER2
  – CPU_COUNT
• Host
  – V$OSSTAT (or sar)
  – NUM_CPUS
DB CPU Utilization
• Percent of CPU in use over CPU capacity
• DB CPU in use
  – DB CPU Time / Elapsed Time
• DB CPU Utilization
  – 100 * (DB CPU Time / Elapsed Time) / CPU_COUNT
DB CPU Utilization Example
• DB CPU Time = 5400 seconds
• Elapsed Time = 30 minutes
• CPU_COUNT = 8
• DB CPU in use = (5400 / 60) / 30 = 3
• DB CPU Utilization = 100 * 3 / 8 = 37.5%
• 37.5% of the CPUs available to database
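The arithmetic of this example can be sketched as a small helper. The function and argument names are illustrative; the inputs are the slide's own figures, with DB CPU Time in seconds and Elapsed Time in minutes.

```python
def db_cpu_utilization(db_cpu_seconds, elapsed_minutes, cpu_count):
    """Return (CPUs in use, percent of CPU_COUNT in use).

    DB CPU Time is converted to minutes so that both times share a unit
    before dividing by the elapsed time.
    """
    if elapsed_minutes <= 0 or cpu_count <= 0:
        raise ValueError("elapsed_minutes and cpu_count must be positive")
    cpus_in_use = (db_cpu_seconds / 60) / elapsed_minutes
    return cpus_in_use, 100 * cpus_in_use / cpu_count


in_use, pct = db_cpu_utilization(5400, 30, 8)
print(f"DB CPU in use: {in_use:.1f}, DB CPU Utilization: {pct:.1f}%")
```

The same helper applies to the quiz slides: plug in the DB CPU Time, Elapsed Time and CPU_COUNT from the report under review.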
Host CPU Utilization
• 100 * Busy / (Busy + Idle)
• STAT_NAME from V$OSSTAT
  – BUSY_TIME
  – IDLE_TIME
Host CPU Utilization Example
• V$OSSTAT.STAT_NAME and VALUE
  – BUSY_TIME = 98683306
  – IDLE_TIME = 4282578783
• Host CPU Utilization
  – 100 * 98683306 / (98683306 + 4282578783)
  – ≈ 2.25%
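A short sketch of the same calculation. Because the V$OSSTAT metrics are cumulative, the raw values describe the whole period since the counters started. For a specific interval, one reasonable approach (an assumption here, not something the slide spells out) is to apply the formula to the difference between two readings. The function names and the two hypothetical readings are illustrative.

```python
def host_cpu_utilization(busy_time, idle_time):
    """Percent of host CPU time spent busy: 100 * busy / (busy + idle)."""
    total = busy_time + idle_time
    if total <= 0:
        raise ValueError("busy_time + idle_time must be positive")
    return 100 * busy_time / total


def host_cpu_utilization_between(begin, end):
    """Utilization for the interval between two cumulative readings.

    `begin` and `end` are dicts keyed by STAT_NAME (BUSY_TIME, IDLE_TIME).
    """
    return host_cpu_utilization(
        end["BUSY_TIME"] - begin["BUSY_TIME"],
        end["IDLE_TIME"] - begin["IDLE_TIME"],
    )


# Figures from the slide (cumulative values).
cumulative_pct = host_cpu_utilization(98683306, 4282578783)
print(f"Cumulative host CPU utilization: {cumulative_pct:.2f}%")

# Hypothetical pair of readings to illustrate the interval calculation.
begin = {"BUSY_TIME": 1000, "IDLE_TIME": 9000}
end = {"BUSY_TIME": 1600, "IDLE_TIME": 9400}
interval_pct = host_cpu_utilization_between(begin, end)
print(f"Interval host CPU utilization: {interval_pct:.1f}%")
```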
CPU Utilization Concern
• When > 70%
  – Expect “lag” affecting User Experience
  – May or may not recover depending on workload
• When > 95%
  – Most reported times become distorted
  – Hard to recover without suspending workload
What burns lots of CPU?
• Buffer Gets
• Parsing
• Logon storm
• Mutex Waits
• Note: Use ASH and look for “ON CPU”
Quiz: Compute DB CPU Utilization
• DB CPU in use: (9,641 / 60) / 10.03 = 16
• DB CPU Utilization: 100 * 16 / 98 = 16.3%
Quiz: Compute Host CPU Utilization
• Host CPU Utilization: 100 * BUSY / (BUSY + IDLE)
• 100 * 1,932,526 / (1,932,526 + 7,722,855) = 20%
Quiz: Compute DB CPU Utilization
• DB CPU in use: (109,778 / 60) / 120.36 = 15.2
• DB CPU Utilization: 100 * 15.2 / 16 = 95%
Quiz: Compute Host CPU Utilization
• Host CPU Utilization: 100 * BUSY / (BUSY + IDLE)
• 100 * 11,274,354 / (11,274,354 + 228,971) = 98%
What is a Wait Event?
• V$EVENT_NAME
• 1,152 on 11.2.0.3
• P1, P2, P3 Parameters
  – Oracle Database Reference
    • Appendix C: Oracle Wait Events
• 12 Wait Classes
Non-idle Waits
• Overhead
• Actively Waiting inside the database
• Examples
  – Reading a Block from Disk
  – Index Rebuild
  – Application Row level Lock
Idle Waits
• Overhead
• Inactive
• Waiting for work
• Outside the database
Performance Statistics
• Wait Events refer to Time and Times
• Statistics refer to Counters
  – How many of “X” so far (from Instance startup)
  – Examples
    • Sorts
    • Consistent Gets
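Since counters accumulate from Instance startup, a single reading says little about current activity. A common way to use them is to take two readings and work with the difference. The sketch below assumes the counters have already been fetched into plain dictionaries keyed by statistic name; the helper name and the sample values are hypothetical.

```python
def counter_deltas(begin, end, elapsed_seconds):
    """Return {name: (delta, per_second_rate)} for counters in both readings."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    result = {}
    for name, end_value in end.items():
        if name in begin:
            delta = end_value - begin[name]
            result[name] = (delta, delta / elapsed_seconds)
    return result


# Hypothetical readings taken 10 minutes apart.
begin = {"consistent gets": 1_000_000, "sorts (memory)": 5_000}
end = {"consistent gets": 1_600_000, "sorts (memory)": 5_300}

deltas = counter_deltas(begin, end, elapsed_seconds=600)
for name, (delta, rate) in deltas.items():
    print(f"{name}: {delta} in the interval, {rate:.1f} per second")
```

A negative delta would suggest the counters were reset between the two readings (for example by an instance restart), in which case the pair should be discarded.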
Statistics a.k.a. Counters
• V$STATNAME
• 638 Counters on 11.2.0.3
• Description
  – Oracle Database Reference
    • Appendix E: Statistics Descriptions
• 8 Statistics Classes
Using Waits and Statistics
• Waits answer
  – What, When, Where, Who, How Long
• Statistics help with
  – Why
Session Type and State
• Foreground Type
  – User Session
• Background Type
  – Database Processes
• State
  – On CPU or in a Waiting State
System Time Model (1)
• V$SYS_TIME_MODEL
• V$SESS_TIME_MODEL
• Cumulative Time with no wrapping
• Tree with 19 nodes
  – Two root nodes
System Time Model (2)
• Two root nodes
  – DB (Elapsed) Time
  – Background Elapsed Time
  – Notes:
    • Children do not necessarily add up to the parent
    • Children are not necessarily exclusive
    • The union of children does not cover the whole of the parent
System Time Model (3)
1) background elapsed time
  2) background cpu time
    3) RMAN cpu time (backup/restore)
1) DB time
  2) DB CPU
  2) connection management call elapsed time
  2) sequence load elapsed time
  2) sql execute elapsed time
  2) parse time elapsed
    3) hard parse elapsed time
      4) hard parse (sharing criteria) elapsed time
        5) hard parse (bind mismatch) elapsed time
    3) failed parse elapsed time
      4) failed parse (out of shared memory) elapsed time
  2) PL/SQL execution elapsed time
  2) inbound PL/SQL rpc elapsed time
  2) PL/SQL compilation elapsed time
  2) Java execution elapsed time
  2) repeated bind elapsed time
Average Active Sessions (AAS)
• Common unit to measure Performance
• What is an “Active Session”?
  – One “On CPU” or on a “Non-idle Wait Event”
• Two ways to compute AAS
  1. Using count of Active Sessions on a Snapshot
  2. Database Time divided by Wall Clock Time
AAS
WORKLOAD REPOSITORY report for
DB Name         DB Id    Instance     Inst Num Startup Time    Release     RAC
------------ ----------- ------------ -------- --------------- ----------- ---
XXX           1319103893 XXX                 1 10-Apr-14 14:55 11.2.0.3.0  NO

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
xxxxxxxx         AIX-Based Systems (64-bit)         16     4              72.00

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:       139 11-Apr-14 15:00:58       217     144.9
  End Snap:       140 11-Apr-14 15:15:58       218     145.7
   Elapsed:               15.01 (mins)
   DB Time:               83.46 (mins)

Cache Sizes                       Begin        End
~~~~~~~~~~~                  ---------- ----------
               Buffer Cache:    23,360M    23,360M  Std Block Size:         8K
           Shared Pool Size:     1,664M     1,664M      Log Buffer:    11,848K

Load Profile              Per Second    Per Transaction   Per Exec   Per Call
~~~~~~~~~~~~         ---------------    --------------- ---------- ----------
      DB Time(s):                5.6                1.7       0.02       0.11
       DB CPU(s):                0.2                0.1       0.00       0.00
       Redo size:          238,783.6           72,491.8
   Logical reads:            2,300.9              698.5
   Block changes:              626.3              190.2
  Physical reads:                3.9                1.2
 Physical writes:               28.8                8.7
      User calls:               48.4               14.7
          Parses:               14.3                4.3
     Hard parses:                0.0                0.0
W/A MB processed:                0.1                0.0
          Logons:                0.1                0.0
        Executes:              317.3               96.3
       Rollbacks:                1.1                0.3
    Transactions:                3.3
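Both ways of computing AAS can be sketched in a few lines. The first call uses the DB Time and Elapsed figures from the report above; the list of per-snapshot active-session counts in the second is hypothetical, as are the function names.

```python
def aas_from_db_time(db_time_minutes, elapsed_minutes):
    """AAS as Database Time divided by Wall Clock Time (same unit for both)."""
    if elapsed_minutes <= 0:
        raise ValueError("elapsed_minutes must be positive")
    return db_time_minutes / elapsed_minutes


def aas_from_samples(active_session_counts):
    """AAS as the mean number of active sessions seen across snapshots."""
    if not active_session_counts:
        raise ValueError("at least one snapshot is required")
    return sum(active_session_counts) / len(active_session_counts)


# DB Time and Elapsed as printed in the report header above.
aas = aas_from_db_time(83.46, 15.01)
print(f"AAS from DB Time: {aas:.1f}")

# Hypothetical counts of active sessions seen in five snapshots.
sampled = aas_from_samples([4, 6, 5, 7, 6])
print(f"AAS from snapshots: {sampled:.1f}")
```

The first result rounds to the "DB Time(s)" Per Second value in the Load Profile, which is the same ratio expressed in seconds of DB Time per second of wall clock.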
Basic DB Performance Collection
• System-wide Waits
  – Cumulative
• Session Waits
  – Cumulative and Current
• Session and System-wide Statistics Counters
  – Cumulative
Key Dynamic Views
• V$SESSION_WAIT
  – Current or Last Wait (session specific)
• V$SESSION_EVENT
  – Total Waits for Events (session specific)
• V$SYSTEM_EVENT
  – Total Waits for Events (system-wide)
Real-life use of Dynamic Views
• SQL*Plus Scripts
  – Your own or Tanel Poder’s “snapper.sql”
• Current and legacy Tools
  – OEM, AWR, ADDM, ASH, Statspack, bstat/estat
• Other Scripts and Tools
Automatic Workload Repository (AWR)
• Requires Oracle Diagnostics Pack License
• 112 DBA_HIST Views on 11.2.0.3
• Snapshots
  – One hour interval (default)
  – Eight days retention (default on 11g)
Active Session History (ASH)
• Requires Oracle Diagnostics Pack License
• Multi-dimensional
• V$ACTIVE_SESSION_HISTORY
  – One snapshot (sample_id) every second
• DBA_HIST_ACTIVE_SESS_HISTORY
  – One sample_id out of every 10
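Given these sampling rates, a common estimation technique (an approximation, not an exact measurement) treats each ASH row as standing for one sampling interval of Database Time: about 1 second per row in V$ACTIVE_SESSION_HISTORY and about 10 seconds per row in DBA_HIST_ACTIVE_SESS_HISTORY. The sketch below assumes the row count has already been obtained; the names and the sample figures are hypothetical.

```python
IN_MEMORY_SAMPLE_SECONDS = 1   # V$ACTIVE_SESSION_HISTORY: one sample per second
AWR_SAMPLE_SECONDS = 10        # DBA_HIST_ACTIVE_SESS_HISTORY: one of every 10 samples


def estimated_db_time_seconds(sample_rows, seconds_per_sample):
    """Estimate Database Time: each row is one active session in one sample."""
    return sample_rows * seconds_per_sample


def estimated_aas(sample_rows, seconds_per_sample, elapsed_seconds):
    """Estimate Average Active Sessions over a window of elapsed_seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return estimated_db_time_seconds(sample_rows, seconds_per_sample) / elapsed_seconds


# Hypothetical: 500 rows found in DBA_HIST_ACTIVE_SESS_HISTORY for a 15-minute window.
db_time = estimated_db_time_seconds(500, AWR_SAMPLE_SECONDS)
aas_estimate = estimated_aas(500, AWR_SAMPLE_SECONDS, elapsed_seconds=15 * 60)
print(f"Estimated DB Time: {db_time} s, estimated AAS: {aas_estimate:.2f}")
```

Because this is a sampled estimate, it is more trustworthy over longer windows and larger row counts than for short-lived activity.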
Warning
• Take “Rules of Thumb” with a pinch of salt and do not confuse them with “Myths”
• Rule of Thumb
  “a principle with broad application that is not intended to be strictly accurate or reliable for every situation”
• Myth
  “a widely held but false belief or idea”
Some Myths (1)
• Change nothing and Performance will remain the same (i.e. Freeze CBO Statistics)
• Parallelize as much as possible and everything will run faster
• Improve Buffer Hit Ratio and Performance will improve
Some Myths (2)
• An Index Access operation is better than a Full Table Scan
• Placing Tables and Indexes on separate Tablespaces provides better performance
• Reorganize all Indexes periodically for better performance
Some Myths (3)
• Upgrading to faster CPUs results in better performance
• Allocate more Memory to Oracle and Processes will run faster
• Segments (Tables or Indexes) with one Extent perform better
Some Myths (4)
• SQL cannot be modified (canned application)
• A truncated SQL Trace cannot be used for Diagnostics
• Top-5 Wait Events show the root cause of the poor Performance
• Silver Bullets are “fixes”
Performance Methodology in a Nutshell
• Listen to the voice of the Business
• Collect Diagnostics and compare to Baselines
• Identify Pain and Resource Bottlenecks
• Mitigate Pain by addressing Bottlenecks
  – Reach out when needed
  – Learn and document
Conclusions
• User Experience should be the driver
  – Authentic Business Requirements
• Question Everything
  – Apply Scientific Method: Test, Prove or Debunk
• Balance broad and deep analysis
• Diagnose with sound Scripts and Tools
References
• Oracle Database Reference 11g Release 2
• Oracle Database Concepts 12c Release 1
• Snapper - Tanel Poder
  – http://blog.tanelpoder.com/files/scripts/snapper4.sql
Contact Information
• [email protected]
• carlos-sierra.net
• @csierra_usa