2013_544_Sagrillo_ppt

Copyright © 2013, Oracle and/or its affiliates. All rights reserved. 1

Be a Hero with your DBA: Database Performance Tuning for System Admins and IT Architects

Randal Sagrillo

Session #544

Program Agenda

Scope and Method

Tools

Examples

Next Steps

NOTE: I ASSUME SQL, SCHEMA, AND INSTANCE ARE AS TUNED AS THEY CAN GET!

The Life of a Systems Architect Isn’t Easy

And Not Much Better for DBAs and SysAdmins

More Users

More Data

More Transactions

More Complexity

More Hardware

More Software

More Data Centers

Lower Performance

It is All About I/O: Logical I/O

Faster CPUs Usually Mean Faster Memory and More Memory

[Figure: Database Size (Relational Data), Working Set Size, and DB Memory Size compared; Query/DML time split into I/O (20%) and CPU (80%), with the I/O being logical I/O]

It is All About I/O: Physical I/O

Faster CPUs Do Not Help Physical-I/O Bound Databases

[Figure: Database Size (Relational Data), Working Set Size (Relational Data), and DB Memory Size compared; Query/DML time split into I/O (80%) and CPU (20%), with the I/O being physical I/O]
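The two I/O slides imply a bound on what a faster CPU can deliver. The following is a minimal sketch, assuming the 80%/20% figures are the shares of query/DML time spent on CPU versus waiting on I/O. The function name and the 2x CPU speedup are illustrative and do not come from the slides.

```python
def overall_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Amdahl-style estimate: only the CPU share of elapsed time shrinks."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    io_fraction = 1.0 - cpu_fraction  # off-CPU (I/O wait) time is unchanged
    return 1.0 / (io_fraction + cpu_fraction / cpu_speedup)


# Assumed reading of the slides: 80% CPU when logical-I/O bound,
# 20% CPU when physical-I/O bound. The 2x CPU speedup is hypothetical.
logical_bound = overall_speedup(cpu_fraction=0.80, cpu_speedup=2.0)
physical_bound = overall_speedup(cpu_fraction=0.20, cpu_speedup=2.0)

print(f"logical-I/O bound:  {logical_bound:.2f}x faster overall")
print(f"physical-I/O bound: {physical_bound:.2f}x faster overall")
```

Under these assumptions the physical-I/O bound case gains far less from the same CPU upgrade, which is the point the second slide makes.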

Enterprise Application Issues

Batch job duration too long

Reporting/ad hoc query too long

OLTP transaction times too long (Business value)

Or not high enough OLTP rate (Operational value)

Typical Storage Bottlenecks

Maximum IOPS delivered
– Talked about the most, but least important for enterprise apps
– Really measures concurrency

Maximum data rate delivered
– Really measures channel and disk bandwidth

Shortest service time delivered
– Usually most important for databases

I/O Supply vs. Demand

[Figure: Demand vs. Supply, compared in IOPS, MB/sec, and milliseconds]
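One way to read "IOPS really measures concurrency" is through Little's Law: average I/Os in flight = IOPS x average service time. The sketch below uses the round figures quoted in Example #1 (3500 IOPS at 10 ms). The function names and the 0.5 ms service time are illustrative assumptions.

```python
def ios_in_flight(iops: float, service_time_ms: float) -> float:
    """Little's Law: average outstanding I/Os = arrival rate x service time."""
    return iops * (service_time_ms / 1000.0)


def iops_at_concurrency(concurrency: float, service_time_ms: float) -> float:
    """IOPS the same number of outstanding I/Os could drive at another latency."""
    return concurrency / (service_time_ms / 1000.0)


# Round figures from Example #1: 3500 IOPS at a 10 ms average wait.
in_flight = ios_in_flight(iops=3500, service_time_ms=10)

# Hypothetical: the same sessions against storage with a 0.5 ms service time.
potential_iops = iops_at_concurrency(in_flight, service_time_ms=0.5)

print(f"average I/Os in flight: {in_flight:.0f}")
print(f"IOPS the same concurrency could drive at 0.5 ms: {potential_iops:.0f}")
```

The demand for concurrency stays the same; what shortens the database's wait is the service time, which is why the slide ranks it as most important.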

Performance Methodology

Performance below expectation, variance, degradation over time, etc.

Identify SLAs

Systemic analysis

Minimize scope

Tuning tips and apply best practices

Document

Tools to Identify Database Performance Issues

Database performance view gives more insight than just OS view

OS tools
– mpstat, iostat, strace, truss
– DTrace, SWAT
– Very powerful, expert tools
– But hard to estimate impact/relevance to database performance

‘Free’ DB Utilities
– SQL tracing/tkprof
– Statspack: PL/SQL code, since Oracle 8i, download

Licensed Tools
– Oracle Tuning Pack: SQL tuning
– Oracle Diagnostic Pack:
  – Automatic Database Diagnostic Monitor (ADDM)
  – Active Session History (ASH)
  – Automatic Workload Repository (AWR)

AWR & Statspack: First Things

Top of the reports: What is the Environment? How Long?

AWR & Statspack: Most Important Things

How much would faster CPU execution help here?

Database I/O Bottlenecks: Wait Events

Typical I/O wait types, foreground
– db file sequential read: disk to database buffer cache wait
– log file sync: waiting for background write of log data to complete
– db file scattered read: wait for multi-block read into buffer cache
– read by other session: waiting for a block that another session is already reading
– direct path read: read bypassing buffer cache directly into PGA

Typical I/O wait types, background
– log file parallel write: write log data (typically to NVRAM) from LGWR
– db file parallel write: write to tables async from DBWR(S)
– log file sequential read: to build archive log, DataGuard
– log archive I/O, RMAN, etc.

Note These Are OFF CPU Events!

Example #1: Online Payment Processing

Operational Objective: reduce I/O burden on large multi-million dollar storage system.

Deployment platform and topology:
– SPARC T-Series, Oracle Solaris
– Oracle Real Application Clusters (RAC)

‘db file sequential read’: The ‘Poster Child’ Off-CPU Wait Event

‘db file sequential read’ Before Optimization

Much More I/O Wait Time Than Real Time

Top 5 Timed Foreground Events                           Avg  %Total
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                          wait    Call
Event                                Waits  Time (s)   (ms)    Time  Wait Class
------------------------------ ----------- --------- ------ -------  ----------
db file sequential read          3,189,229    34,272     11    67.8  User I/O
CPU time                                      11,332           22.4
log file sync                    2,247,374     4,612      2     9.1  Commit
gc cr grant 2-way                1,365,247       793      1     1.6  Cluster
enq: TX - index contention         140,257       720      5     1.5  Concurrenc

Analysis:
– ‘db file sequential read’: 3500 IOPS, 10ms average
– 15 minute snapshots ‘under load’: but 9.5 hours of disk waits!
– 77 minutes of commit time
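The analysis figures can be reproduced from the table with a few lines of arithmetic. This is a sketch, assuming the report covers a single 15-minute (900-second) snapshot interval; the helper name `event_stats` is illustrative.

```python
SNAPSHOT_SECONDS = 15 * 60  # assumed: one 15-minute snapshot interval


def event_stats(waits: int, wait_seconds: float, elapsed_seconds: float) -> dict:
    """Derive rate and latency figures from one row of the Top 5 table."""
    return {
        "waits_per_second": waits / elapsed_seconds,
        "avg_wait_ms": wait_seconds / waits * 1000.0,
        "wait_minutes": wait_seconds / 60.0,
        "wait_hours": wait_seconds / 3600.0,
    }


# Rows copied from the table above.
seq_read = event_stats(3_189_229, 34_272, SNAPSHOT_SECONDS)
log_sync = event_stats(2_247_374, 4_612, SNAPSHOT_SECONDS)

print(f"db file sequential read: {seq_read['waits_per_second']:.0f} IOPS, "
      f"{seq_read['avg_wait_ms']:.2f} ms average, "
      f"{seq_read['wait_hours']:.1f} hours of waits")
print(f"log file sync: {log_sync['wait_minutes']:.0f} minutes of commit time")
```

Accumulated wait time far exceeds the wall-clock interval because many sessions wait concurrently.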

11gR2 Database Smart Flash Cache

Acts as Level 2 Buffer Cache (SGA holds pointers)

Clean Cache! Almost changes physical read I/O to logical I/O

Rule of sizing: 2x – 10x SGA size. See Buffer Pool Advisory to narrow estimate

Best accelerates read intensive workloads

If I Cannot Add DRAM or Increase SGA

[Figure: Buffer Cache backed directly by Storage, compared with Buffer Cache backed by Database Smart Flash Cache in front of Storage; arrows labeled ‘Many I/O’s’ and ‘Few I/O’s’]

Database Smart Flash Cache Setup

Aggregate Flash LUNs into ONE file
– ASM preferred
– Concatenate. No mirroring - it is a cache!

Set two init.ora parameters
– db_flash_cache_file = <+flashdg/FlashCacheFile>
  Path to flash file/raw aggregation/metadevice
– db_flash_cache_size = <flash file size>
  Level 2 buffer cache size: amount of flash file to use

Two Simple Steps
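The sizing rule and the two parameters can be combined in a small helper. This is a sketch only: the 32 GB SGA is a hypothetical value, the helper names are illustrative, and the 190 GB cache size echoes the amount of flash reported in Example #1. The file path is the placeholder shown on the slide.

```python
def flash_cache_size_range_gb(sga_gb: float) -> tuple[float, float]:
    """Slide's rule of thumb: flash cache of 2x to 10x the SGA size."""
    if sga_gb <= 0:
        raise ValueError("sga_gb must be positive")
    return 2 * sga_gb, 10 * sga_gb


def init_ora_lines(cache_file: str, cache_size_gb: int) -> list[str]:
    """Format the two init.ora entries named on the slide."""
    return [
        f"db_flash_cache_file = '{cache_file}'",
        f"db_flash_cache_size = {cache_size_gb}G",
    ]


# Hypothetical 32 GB SGA; narrow the range with the Buffer Pool Advisory.
low_gb, high_gb = flash_cache_size_range_gb(32)
print(f"candidate flash cache size: {low_gb:.0f} GB to {high_gb:.0f} GB")

# Placeholder path from the slide; 190 GB is the figure from Example #1.
lines = init_ora_lines("+flashdg/FlashCacheFile", 190)
print("\n".join(lines))
```

The rule of thumb only brackets the size; the advisory data decides where in that range the cache should land.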

RAC Considerations for Smart Flash Cache

RAC scaling generally held

Eliminates physical I/O if block is in any node’s Buffer Cache

But only checks blocks in local node’s Flash Cache File

Flash Cache is Not Shared!

[Figure: Two Node RAC Example. Each node has its own Buffer Cache and Database Smart Flash Cache; the Buffer Caches are linked by the Global Buffer Cache (LMS) in front of Shared Storage]

Example #1 After Optimizing with Flash Cache

Top 5 Timed Foreground Events                           Avg  %Total
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                          wait    Call
Event                                Waits  Time (s)   (ms)    Time  Wait Class
------------------------------ ----------- --------- ------ -------  ----------
CPU time                                      11,353           57.6
log file sync                    1,434,247     6,587      3    33.4  Commit
flash cache single block read    4,221,599     2,284      1    21.3  User I/O
Buffer busy waits                  723,807     1,502    329     3.3  Concurrenc
db file sequential read             22,727       182      8      .9  User I/O

Results: 140x reduction in ‘db file sequential reads’ with ~190GB of Flash!

– Average Flash Cache Read time 540us vs 10.75ms: 20X quicker.

– Transaction and commit rate also went up over 40%!
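The 140x and 20x figures follow from the ‘db file sequential read’ and ‘flash cache single block read’ rows of the before and after tables. A sketch of the arithmetic, with wait counts and times copied from those rows (variable names are illustrative):

```python
# 'db file sequential read' before optimization (waits, seconds waited).
before_waits, before_seconds = 3_189_229, 34_272
# 'db file sequential read' after optimization.
after_waits = 22_727
# 'flash cache single block read' after optimization.
flash_waits, flash_seconds = 4_221_599, 2_284

read_reduction = before_waits / after_waits
disk_read_ms = before_seconds / before_waits * 1000.0
flash_read_ms = flash_seconds / flash_waits * 1000.0
latency_ratio = disk_read_ms / flash_read_ms

print(f"reduction in disk reads: {read_reduction:.0f}x")
print(f"average disk read:  {disk_read_ms:.2f} ms")
print(f"average flash read: {flash_read_ms * 1000:.0f} us")
print(f"flash is about {latency_ratio:.0f}x quicker")
```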

Example #2: Bank Processing

Bank Objective: reduce ‘proccode’ response time

Deployment topology:

– Solaris Capped Containers Zones

– SPARC T-Series

– 2GB Buffer Cache

Not Always About Killer IOPS Rate!

Example #2: Top Wait Events Before

295 IOPS foreground reads
– 1,062,961 waits (I/Os) / 60.13 minutes (3607.8 seconds)

Propose to add ~90GB of Flash

Adding IOPS Supply Will Not Help Much Here

Example #2: Top Wait Events After Optimized

Average 507us wait from flash cache
– 114 seconds / 224,808 waits = .000507 sec/wait RT

Proccode response time more than halved!

Fewer index reads needed

20 Times Shorter I/O Wait Times with Flash
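Both Example #2 figures are plain ratios of numbers quoted on the slides. A brief sketch, with illustrative variable names:

```python
# Before: foreground read waits over the 60.13-minute snapshot.
read_waits = 1_062_961
snapshot_seconds = 60.13 * 60  # 3607.8 seconds, as stated on the slide

# After: total flash cache wait time and wait count.
flash_wait_seconds = 114
flash_waits = 224_808

foreground_iops = read_waits / snapshot_seconds
flash_wait_us = flash_wait_seconds / flash_waits * 1_000_000

print(f"foreground read rate: {foreground_iops:.0f} IOPS")
print(f"average flash cache wait: {flash_wait_us:.0f} us")
```

At roughly 295 IOPS the storage is nowhere near an IOPS limit, which is why the gain comes from the shorter wait per read and not from added IOPS supply.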

‘db file sequential read’ Summary

2/3rds of OLTP databases have this as the majority wait event.
– Even some data warehouses!

Use Buffer Pool Advisory to determine how much more cache is needed
– If you can add/reallocate memory to the DB servers: GREAT!

20x reduction in storage response time is common with flash vs HDD in arrays.

2x improvement in SLA is typical when I/O bound.
– 5x and higher improvements seen.

How to Make This Go Away

Example #3: Batch Processing (SAP)

Top 5 Timed Foreground Events                           Avg  %Total
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                          wait    Call
Event                                Waits  Time (s)   (ms)    Time  Wait Class
------------------------------ ----------- --------- ------ -------  ----------
db file sequential read        109,123,471   593,577      5    40.0  User I/O
log file sync                    1,818,523   559,444    308    37.7  Commit
CPU time                                     344,454           23.2
db file parallel write           1,444,242    35,970     25     2.4  System I/O
log file parallel write            775,249    17,371     22     1.2  System I/O

Analysis:
– 473 minute (~8 hour) snapshot
– With 290 minutes (~5 hours) of logging time, but over 155 hours commit time!!
– Nearly 1/3 second per batch commit. 1.8M commits (~64/second)
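The commit figures in this analysis can be checked against the ‘log file sync’ and ‘log file parallel write’ rows. A sketch, taking the 473-minute snapshot length from the analysis (variable names are illustrative):

```python
snapshot_seconds = 473 * 60          # 473-minute snapshot

log_sync_waits = 1_818_523           # 'log file sync' waits (one per commit wait)
log_sync_seconds = 559_444           # total time sessions waited on commits
log_write_seconds = 17_371           # 'log file parallel write' time (LGWR)

seconds_per_commit = log_sync_seconds / log_sync_waits
commits_per_second = log_sync_waits / snapshot_seconds
commit_hours = log_sync_seconds / 3600.0
logging_minutes = log_write_seconds / 60.0

print(f"commit wait: {seconds_per_commit:.3f} s per commit")
print(f"commit rate: {commits_per_second:.0f} per second")
print(f"total commit time: {commit_hours:.0f} hours")
print(f"total logging time: {logging_minutes:.0f} minutes")
```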

Example #3 Analysis

While log write time is slow (22ms), log writing time is only 3% of commit time!

– Log write time is only 1.2% of entire AWR report, but STILL top 5!

– More important: 5 hours of single-process I/O in 8 hours of real time!

Solution: Scheduling (LGWR priority!)

Processor binding improved commit times 4X!

Follow on: work to improve storage subsystem write response times

Only One Log Writer Process per Instance

Example #4: Telecommunications

Communications Objective: Reduce commit time.

More Logging

Example #4 Analysis

Archiving each redo log write after every commit…

… IN A SQL LOADER ENVIRONMENT!

– No direct load

This was a stress test…

‘log file sync/log file parallel write’ Summary

~20% of OLTP databases have this as a majority wait event

Also a common wait bottleneck in batch environments

While improving log write (I/O wait) time will help, it is usually easier to improve scheduling

Processor binding or ‘Critical Threads’ feature

ALWAYS check total log write time of LGWR against real time (snap) duration!

As serialized single-process (LGWR) time approaches all available real time, there is no more room to schedule: THEN you need to speed up log write response times!

Separate log device, HBAs, channel paths, LUNs, LUN cache, etc.

Log optimized storage: NVRAM, SPARC SuperCluster, Exadata

Managing Commit Time to Applications for Batch and OLTP
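The "always check" rule amounts to comparing LGWR's accumulated write time with the snapshot duration. Below is a minimal sketch using the Example #3 figures; the function names and the 0.8 saturation cut-off are illustrative assumptions, not values from the slides.

```python
def lgwr_busy_fraction(log_write_seconds: float, snapshot_seconds: float) -> float:
    """Share of wall-clock time the single LGWR process spent in log writes."""
    if snapshot_seconds <= 0:
        raise ValueError("snapshot_seconds must be positive")
    return log_write_seconds / snapshot_seconds


def first_remedy(busy_fraction: float, saturation_threshold: float = 0.8) -> str:
    """Pick where to start. The 0.8 cut-off is an arbitrary illustration."""
    if busy_fraction >= saturation_threshold:
        return "speed up log write response times"
    return "improve LGWR scheduling (priority, processor binding)"


# Example #3: 17,371 s of 'log file parallel write' in a 473-minute snapshot.
busy = lgwr_busy_fraction(17_371, 473 * 60)
remedy = first_remedy(busy)

print(f"LGWR busy {busy:.0%} of real time -> {remedy}")
```

With LGWR busy for about 61% of the snapshot there is still room to schedule, which matches the Example #3 outcome where scheduling changes came first and storage work followed.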

Additional Resources: Performance/AWR

Oracle® Database Concepts 11g R2 (esp Chapter 14)

http://docs.oracle.com/cd/E11882_01/server.112/e25789/toc.htm

Oracle® Database Performance Tuning Guide 11g R2 (esp Chapter 10)

http://docs.oracle.com/cd/E11882_01/server.112/e16638/toc.htm

Oracle® Database Licensing Information 11g R2 (esp Chapter , Diagnostic Pack)

http://docs.oracle.com/cd/E11882_01/license.112/e10594/toc.htm

Additional Resources: Statspack

Statspack Overview

http://www.orafaq.com/wiki/Statspack

Statspack Installation (Last used in 9i)

http://docs.oracle.com/cd/B10501_01/server.920/a96533/statspac.htm#27255

Using Statspack with Oracle 11g

http://myoracleworld.hobby-electronics.net/DB-statspack.html

Learn More About Oracle Optimized Solutions

Access To Webcasts, Videos, Whitepapers, Blogs, and More

http://oracle.com/optimizedsolutions

Check out Oracle Optimized Solution for Oracle Database
