OOW13 JB KP ASH Deep Dive

  • date post

    26-Jan-2015
  • Category

    Technology

description

Joint session with JB from Oracle at OOW13/Oracle Open World 2013

Transcript of OOW13 JB KP ASH Deep Dive

1. Copyright 2013, Oracle and/or its affiliates. All rights reserved.

2. ASH Deep Dive: Advanced Performance Analysis Tips
  • John Beresniewicz, Oracle America
  • Kellyn Potvin, Enkitec

3. Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

4. Program Agenda
  • What is ASH?
  • How does ASH work?
  • How do we use ASH data?
  • Enterprise Manager: ASH Analytics
  • ASH in Action: Kellyn Potvin

5. What is ASH?

6. What is ASH?
  • Time-based sampling of foreground session state
  • Highly multi-dimensional view of database activity and therefore DB Time
  • Observations of specific values of the (DB Time/time) function
      - This function is called: Average Active Sessions
  • An instrumentation mechanism that actualizes an important concept

7. Important Properties of ASH
  • Samples represent snapshots of session activity at the same time
      - Not really true since using latchless mechanism
  • Sampling is time independent of session activity
      - Important since otherwise sessions may be over or under-sampled

8. Active Session Sampling
Time-based captures of state information for active sessions
[Figure: timeline of Session 1, Session 2, and Session 3 activity, sampled at Sample_t1, Sample_t2, and Sample_t3]

  Time  Session  State    Wait Class   SQL_ID         Object
  t1    1        ON CPU   null         53qkkf6yzc2x0  null
  t1    2        WAITING  User I/O     0naxkcasaz162  EMP
  t1    3        WAITING  User I/O     cs4qrt8kr3uhx  EMP
  t2    3        WAITING  Application  4uh6zm2wg03mx  DEPT

9. ASH is Highly Multi-dimensional
Most of these represent useful investigative paths in some context

  desc v$active_session_history
  Name                           Null     Type
  ------------------------------ -------- ----------------
  SAMPLE_ID                               NUMBER
  SAMPLE_TIME                             TIMESTAMP(3)
  IS_AWR_SAMPLE                           VARCHAR2(1)
  SESSION_ID                              NUMBER
  SESSION_SERIAL#                         NUMBER
  SESSION_TYPE                            VARCHAR2(10)
  FLAGS                                   NUMBER
  USER_ID                                 NUMBER
  . . .
  93 rows selected

10. SQL Dimensions

  SQL_ID                         VARCHAR2(13)
  IS_SQLID_CURRENT               VARCHAR2(1)
  SQL_CHILD_NUMBER               NUMBER
  SQL_OPCODE                     NUMBER
  SQL_OPNAME                     VARCHAR2(64)
  FORCE_MATCHING_SIGNATURE       NUMBER
  TOP_LEVEL_SQL_ID               VARCHAR2(13)
  TOP_LEVEL_SQL_OPCODE           NUMBER
  SQL_PLAN_HASH_VALUE            NUMBER
  SQL_PLAN_LINE_ID               NUMBER
  SQL_PLAN_OPERATION             VARCHAR2(30)
  SQL_PLAN_OPTIONS               VARCHAR2(30)
  SQL_EXEC_ID                    NUMBER
  SQL_EXEC_START                 DATE
  PLSQL_ENTRY_OBJECT_ID          NUMBER
  PLSQL_ENTRY_SUBPROGRAM_ID      NUMBER
  PLSQL_OBJECT_ID                NUMBER
  PLSQL_SUBPROGRAM_ID            NUMBER
  QC_INSTANCE_ID                 NUMBER
  QC_SESSION_ID                  NUMBER
  QC_SESSION_SERIAL#             NUMBER

11. Wait Event Dimensions

  EVENT                          VARCHAR2(64)
  EVENT_ID                       NUMBER
  EVENT#                         NUMBER
  SEQ#                           NUMBER
  P1TEXT                         VARCHAR2(64)
  P1                             NUMBER
  P2TEXT                         VARCHAR2(64)
  P2                             NUMBER
  P3TEXT                         VARCHAR2(64)
  P3                             NUMBER
  WAIT_CLASS                     VARCHAR2(64)
  WAIT_CLASS_ID                  NUMBER
  WAIT_TIME                      NUMBER
  SESSION_STATE                  VARCHAR2(7)
  TIME_WAITED                    NUMBER

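The four sample rows on the Active Session Sampling slide can be sliced along any one dimension by simple counting. The following is a minimal Python sketch of that idea, not Oracle code: the `AshRow` type and the `count_by` helper are hypothetical names introduced for illustration, and only the columns shown on the slide are modeled.

```python
from collections import Counter
from typing import NamedTuple, Optional


class AshRow(NamedTuple):
    """One sampled row, limited to the slide's columns (hypothetical type)."""
    sample: str
    session_id: int
    state: str
    wait_class: Optional[str]
    sql_id: str
    obj: Optional[str]


# The four rows shown on the Active Session Sampling slide.
ROWS = [
    AshRow("t1", 1, "ON CPU", None, "53qkkf6yzc2x0", None),
    AshRow("t1", 2, "WAITING", "User I/O", "0naxkcasaz162", "EMP"),
    AshRow("t1", 3, "WAITING", "User I/O", "cs4qrt8kr3uhx", "EMP"),
    AshRow("t2", 3, "WAITING", "Application", "4uh6zm2wg03mx", "DEPT"),
]


def count_by(rows, dimension):
    """Count sampled rows per value of one dimension (an AshRow field name)."""
    return Counter(getattr(row, dimension) for row in rows)


if __name__ == "__main__":
    for dim in ("wait_class", "obj", "session_id"):
        print(dim, dict(count_by(ROWS, dim)))
```

Each dimension is just another grouping key over the same rows, which is why a wide row with many columns gives many investigative paths.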
12. Application Dimensions
Instrumented applications can benefit greatly

  SERVICE_HASH                   NUMBER
  PROGRAM                        VARCHAR2(48)
  MODULE                         VARCHAR2(48)
  ACTION                         VARCHAR2(32)
  CLIENT_ID                      VARCHAR2(64)
  MACHINE                        VARCHAR2(64)
  PORT                           NUMBER
  ECID                           VARCHAR2(64)
  CONSUMER_GROUP_ID              NUMBER
  TOP_LEVEL_CALL#                NUMBER
  TOP_LEVEL_CALL_NAME            VARCHAR2(64)
  XID                            RAW(8)
  REMOTE_INSTANCE#               NUMBER
  TIME_MODEL                     NUMBER

13. How does ASH work?

14. ASH Key Architecture Concepts
  • In-memory ASH sampling:
      - Dedicated background process: MMNL
      - Circular SGA memory buffer: one writer; many readers
      - Lean and robust mechanism: no locking or latching
      - Default 1000ms (1 sec) sampling interval
  • ASH sub-sampling to disk:
      - Flush to AWR with snapshot or on emergency flush
      - Default: 1-in-10 of the 1-sec samples are persisted
      - Future: continuous sub-sampling
  • Session activity sampled efficiently into memory and onto disk

15. Reading / Writing in Opposite Directions
  • MMNL writes to ASH circular buffer one way
  • Readers of V$ASH start at current write pointer
  • Readers proceed in opposite direction of MMNL through buffer
  • Stop when current sample_id > last read sample_id
  • SELECT from V$ASH returned recent-last order
[Figure: circular buffer with the MMNL writer moving one way and a reader, SALLY, moving the other way from "SALLY start" to "SALLY finish"]

16. Sampling Pseudo-code (lean and mean, but there is a hole)

  1) FOR ALL SESSION STATE OBJECTS
  2)   IS SESSION CONNECTED?
         NO => NEXT SESSION
         YES:
  3)   IS SESSION ACTIVE?
         NO => NEXT SESSION
         YES:
  4)   MEMCPY SESSION STATE OBJ
  5)   CHECK CONSISTENCY OF COPY WITH LIVE SESSION
  6)   IS COPY CONSISTENT?
         YES: WRITE ASH ROW FROM COPY
         NO: IF FIRST COPY, REPEAT STEPS 4-6
             ELSE => NEXT SESSION (NO ASH ROW WRITTEN)

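The six steps of the sampling pseudo-code can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual sampler: `SessionState`, `take_sample`, and the `version` change counter are hypothetical, and the slides do not say how the real consistency check is implemented. The checker is passed in as a parameter so that the retry path and the "hole" (no row written after two inconsistent copies) can be exercised.

```python
import copy
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical stand-in for a session state object."""
    sid: int
    connected: bool
    active: bool
    event: str
    version: int = 0  # assumption: a change counter, used only by the demo check


def copy_is_consistent(snapshot, live):
    # Assumption: the copy is consistent if the live object has not changed since.
    return snapshot.version == live.version


def take_sample(sessions, sample_id, is_consistent=copy_is_consistent):
    """One sampling pass; returns the (sample_id, sid, event) rows written."""
    rows = []
    for live in sessions:                      # step 1: all session state objects
        if not live.connected:                 # step 2: skip unconnected sessions
            continue
        if not live.active:                    # step 3: skip idle sessions
            continue
        for _attempt in range(2):              # first copy plus a single retry
            snapshot = copy.copy(live)         # step 4: copy without locking
            if is_consistent(snapshot, live):  # steps 5-6: verify, then write
                rows.append((sample_id, snapshot.sid, snapshot.event))
                break
        # Two inconsistent copies in a row: move on, no ASH row is written.
    return rows


if __name__ == "__main__":
    demo = [
        SessionState(1, True, True, "ON CPU"),
        SessionState(2, True, False, "idle"),
        SessionState(3, False, False, ""),
    ]
    print(take_sample(demo, sample_id=100))
```

The design trade-off shown here is that the writer never blocks on a session: it accepts occasionally missing a row in exchange for never taking a lock.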
17. Default Settings
  • Sampling interval = 1000ms = 1 sec
  • Disk filter ratio = 10 = 1 in 10 samples written to AWR
  • ASH buffer size:
      - Min( Max (5% shared pool, 2% SGA), 2MB per CPU)
      - Absolute Max of 256MB
  • These are carefully chosen for maximum general utility
  • NOTE: the MMNL sampler session is not sampled

18. Control Parameters
  • _ash_size : size of ASH buffer in bytes
      - K/M notation works (e.g. 200M)
  • _ash_sampling_interval : in milliseconds
      - Min = 100, Max = 10,000
  • _ash_disk_filter_ratio : every Nth sample to AWR
      - MOD(sample_id, N) = 0 where N=disk filter ratio
  • _sample_all : samples idle and active sessions
  (geeks want underscores)

19. V$ASH_INFO
New in 11.2 (but unfortunately un-documented)

  desc v$ash_info
  Name                           Null     Type
  ------------------------------ -------- --------------
  TOTAL_SIZE                              NUMBER
  FIXED_SIZE                              NUMBER
  SAMPLING_INTERVAL                       NUMBER
  OLDEST_SAMPLE_ID                        NUMBER
  OLDEST_SAMPLE_TIME                      TIMESTAMP(9)
  LATEST_SAMPLE_ID                        NUMBER
  LATEST_SAMPLE_TIME                      TIMESTAMP(9)
  SAMPLE_COUNT                            NUMBER
  SAMPLED_BYTES                           NUMBER
  SAMPLER_ELAPSED_TIME                    NUMBER
  DISK_FILTER_RATIO                       NUMBER
  AWR_FLUSH_BYTES                         NUMBER
  AWR_FLUSH_ELAPSED_TIME                  NUMBER
  AWR_FLUSH_COUNT                         NUMBER
  AWR_FLUSH_EMERGENCY_COUNT               NUMBER
  DROPPED_SAMPLE_COUNT                    NUMBER

  • Compute buffer time window size
  • Compute average time per sample

20. ASH is Robust when CPU-constrained
  1. ASH sampler is very efficient and does not lock
      - Should complete a sample within a single CPU slice
  2. After sampling, the sampler computes next scheduled sample time and sleeps until then
  3. Upon scheduled wake-up, it waits for CPU (runq) and samples again
      - CPU bound sample times are shifted by one runq but intervals stay close to 1 second
  (These are precisely times when reliable data is necessary)

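Two rules from the Default Settings and Control Parameters slides lend themselves to a quick arithmetic check: the default buffer size formula and the MOD-based disk filter. The Python sketch below reads both rules literally from the slides. The function names are hypothetical, the "2MB per CPU" term is assumed to mean 2 MB times the CPU count, and the 256MB figure is assumed to cap the final result.

```python
MB = 1024 * 1024


def default_ash_buffer_bytes(shared_pool_bytes, sga_bytes, cpu_count):
    """Min(Max(5% shared pool, 2% SGA), 2MB per CPU), capped at 256MB."""
    # Integer arithmetic keeps the result exact for byte counts.
    by_memory = max(shared_pool_bytes * 5 // 100, sga_bytes * 2 // 100)
    by_cpu = 2 * MB * cpu_count  # assumption: 2MB multiplied by the CPU count
    return min(by_memory, by_cpu, 256 * MB)


def persisted_to_awr(sample_id, disk_filter_ratio=10):
    """True when MOD(sample_id, N) = 0, i.e. every Nth sample is kept for AWR."""
    return sample_id % disk_filter_ratio == 0


if __name__ == "__main__":
    # Hypothetical instance: 1 GB shared pool, 4 GB SGA, 8 CPUs.
    size = default_ash_buffer_bytes(1024 * MB, 4096 * MB, 8)
    print(size // MB, "MB")
    print([s for s in range(1, 31) if persisted_to_awr(s)])
```

In the hypothetical 8-CPU case the CPU term (16 MB) is smaller than the memory term, so the CPU count is what bounds the buffer.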
21. ASH Sampler and Run-queue
Sampling interval is consistent under CPU-starvation
[Figure: timeline of scheduled sample times S_t0, S_t1, S_t2, each followed by a run-queue delay, the actual sample times A_t0, A_t1, A_t2, and a sleep until the next scheduled time]

22. The ASH Fix-up
  • ASH column values may be unknown at sampling time
      - TIME_WAITED: session is still waiting
      - PLAN_HASH: session is still optimizing SQL
      - GC events: event details unknown at event initiation
  • ASH fixes up data during subsequent sampling
      - TIME_WAITED fixed up in first sample after event completes
      - Long events: last sample gets correct TIME_WAITED (all others 0)
  • Querying V$ASH may return un-fixed rows
      - Should not be a problem generally
  A unique and very important feature

23. How do we use ASH data?

24. How do we use ASH data?
  • Estimate DB Time and Average Active Sessions
      - For specific time intervals
      - Decomposed and filtered by many ASH dimensions
  • Investigate tuning opportunities
      - Excesses of DB Time in tune-able areas
  • ASH Forensics
      - Figure out what happened to SID?

25. ASH Math: Estimating DB Time from ASH
  • Each ASH row counts for :INTERVAL of active session time
  • Default for :INTERVAL is 1 second (1000 ms)
  • Therefore COUNT(*) = DB Time in seconds
  • This is what I call ASH Math
  An estimate because it is computed over a sample of true reality

26. ASH Math and DB Time
  • The count of sampled rows is an estimate (unbiased) of DB time
  Estimate: DB Time ≈ COUNT(ASH sampled rows)

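The ASH Math rule, that each sampled row counts for one sampling interval of DB time, can be checked against a toy workload. The Python sketch below is a simulation under stated assumptions, not a query against a real database: the `WORKLOAD` intervals are invented for illustration, and sampling is modeled as an exact fixed-interval clock.

```python
def true_db_time(intervals):
    """Exact active time in seconds: sum of (end - start) over all intervals."""
    return sum(end - start for _sid, start, end in intervals)


def sample_active_sessions(intervals, interval_s=1.0, horizon_s=60.0):
    """Return one (sample_time, sid) row per session active at each sample time."""
    rows = []
    for i in range(int(horizon_s // interval_s)):
        t = i * interval_s
        for sid, start, end in intervals:
            if start <= t < end:   # the session is active at this sample time
                rows.append((t, sid))
    return rows


def estimated_db_time(rows, interval_s=1.0):
    """ASH Math: COUNT(*) times the sampling interval estimates DB time."""
    return len(rows) * interval_s


# Hypothetical workload: (sid, active_from, active_until), in seconds.
WORKLOAD = [
    (1, 0.4, 20.7),
    (2, 5.2, 9.9),
    (3, 12.5, 48.1),
    (1, 30.3, 55.6),
]

if __name__ == "__main__":
    rows = sample_active_sessions(WORKLOAD)
    print("estimated DB time:", estimated_db_time(rows))
    print("true DB time:     ", round(true_db_time(WORKLOAD), 1))
```

With this workload the one-second sampler writes 85 rows against 85.9 seconds of true active time, which shows why the count is described as an estimate rather than a measurement.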
27. Computing Average Active Sessions
  • AAS = DELTA(DB TIME) / DELTA(elapsed_time)
      - Over some time interval(s) of sampled workload
  • SUM(:sampling_interval) / [ MAX(sample_time) - MIN(sample_time) ]
      - Normalized to common time units, e.g. seconds
  • COUNT(*) / [ MAX(sample_id) - MIN(sample_id) ]
      - This works fo