UKOUG Tech15 - Overheads of RAC?

Post on 27-Jan-2017

Zahid Anwar, Senior Oracle DBA Consultant, Version 1

6th December 2015

UKOUG

Overheads of RAC?

Introducing Version 1

A values-driven organisation that aims to prove that IT can make a real difference to our clients’ businesses

IT Transformation to deliver business benefit in a cost-efficient manner through our service excellence, innovation and service improvement

A circa €75m/£53m, 700-strong business with bases across the UK and Ireland

Broader and better than niche players, better service than global players, nearshore and onshore rather than offshore; values-driven approach to delivering trusted advice to customers

Enterprise Architecture & Change

Microsoft Solutions

Oracle Solutions

Development Technologies and Services

Business Intelligence & Analytics

Infrastructure & Cloud Services

Licence Management

[Figure: About Version 1. 4 Main Sectors (Commercial, Financial, Public, Utilities; chart values 26%, 21%, 35%, 18%), 3 Key Technology Specialisations, 2 Delivery Practices (Business Solutions, Managed Services), 7 Areas of Deep Expertise]

Our History

A bit about myself

• Senior Oracle DBA Consultant (10+ years’ experience)

• An Oracle Certified Master, 2nd in Version 1

• Aspiring to become an Oracle Ace

• Oracle 10g & 11g Certified RAC Expert and 11gR2 RAC and GI Certified Expert

• Exadata and ODA Specialist

• Follow me on:

– facebook.ZedDBA.co.uk (blog)

– twitter.ZedDBA.co.uk or @ZedDBA

– LinkedIn.ZedDBA.co.uk

– www.ZedDBA.co.uk (coming soon!)

What is RAC?

• Instance

– Comprises Oracle-related memory structures and OS processes on a server

• Database

– Consists of a collection of data files, control files and redo logs located on disk

[Diagram: a Server running an Instance, which accesses the Database]

What is RAC?

• Real Application Clusters (RAC)

– Allows multiple instances to run on separate servers (nodes), concurrently accessing a single database

– The single database is placed on shared storage accessible to all nodes

– Instances communicate over an Interconnect network

[Diagram: Node 1 (Instance 1) and Node 2 (Instance 2) linked by the Interconnect, both accessing the Database on Shared Storage]

Why use RAC?

• The main aim of RAC is to implement a clustered database to provide:

– Increased availability/resilience

– Increased scalability

– Improved maintainability

– Reduction in total cost of ownership

• Commodity Hardware

• Consolidation Platform

• Need to take Oracle RAC licenses into consideration

Cache Fusion

• Cache Fusion

– Allows Oracle RAC to “fuse” the physically separate in-memory data caches on each node into a single Global Cache (GC)

– Through a set of dedicated RAC background processes

– Using the Interconnect for communicating GC messages and for transferring data blocks

Access Times

• Locally in the Instance Local Cache

– Access time: nanoseconds (ns), billionths (1/1,000,000,000) of a second

• Remote in another Instance Cache (Global Cache)

– Access time: microseconds (μs), millionths (1/1,000,000) of a second

• On Disk

– Access time: milliseconds (ms), thousandths (1/1,000) of a second (spinning disks)

– Access time: microseconds (μs), millionths of a second (SSD/Flash)

– Access time: microseconds (μs), millionths of a second (NVRAM)
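To put the units above side by side, here is a minimal Python sketch. The figures are only the orders of magnitude quoted on the slide (one unit of each), not measured latencies, and the tier names are labels chosen for this illustration.

```python
# One unit of each order of magnitude quoted above, expressed in nanoseconds.
# These are illustrative units, not measured access times.
ACCESS_TIME_NS = {
    "local cache": 1,                  # nanoseconds
    "global cache (remote)": 1_000,    # microseconds
    "disk (SSD/Flash)": 1_000,         # microseconds
    "disk (spinning)": 1_000_000,      # milliseconds
}


def slowdown_vs_local(tier: str) -> int:
    """How many times larger a tier's unit is than the local-cache unit."""
    return ACCESS_TIME_NS[tier] // ACCESS_TIME_NS["local cache"]


for tier in ACCESS_TIME_NS:
    print(f"{tier:24s} ~{slowdown_vs_local(tier):>9,}x local cache")
```

The point of the comparison is that a remote cache hit sits roughly three orders of magnitude above a local hit, yet still about three below a spinning-disk read.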

RAC Related Terminology

• Resource

– An object, where access is controlled at instance level

• Global Resource

– An object, where access is controlled at cluster level

• Enqueue

– Serialises local access to a resource

• Global Enqueue

– Serialises global access to a resource

RAC Services

• Global Cache Service (GCS)

– Implements Cache Fusion

– Coordinates access privileges to database blocks for instances

– Responsible for block transfers between instances

– Guarantees the data integrity by employing global access levels

• Global Enqueue Service (GES)

– Performs concurrency control (locks) on the dictionary cache, library cache and transactions

– Performs deadlock detection

• Global Resource Directory (GRD)

– Records the owner of each resource and its current state

– Distributed across all instances

– Maintained by GCS and GES

RAC Background Processes

• Each RAC instance has the standard set of background processes:

– PMON

– SMON

– LGWR

– DBWn

– ARCn

• Additional background processes to support Global Cache Service and Global Enqueue Service:

– LMSn

– LMD0

– LCK0

– LMON

– DIAG

RAC Background Processes

• LMSn

– Global Cache Service Process (Cache Fusion)

– Manages resources and provides resource control among Oracle RAC instances

– Up to 36 LMSn processes, where n is 0-9 or a-z

– Maintains a lock database for Global Cache Service (GCS) and buffer cache resources

– This process receives, processes, and sends GCS requests, block transfers, and other GCS-related messages

RAC Background Processes

• LMD0

– Global Enqueue Service Daemon 0 Process

– One LMD0 process per instance

– Manages incoming remote resource requests from other instances

– LMD0 processes enqueue resources managed under Global Enqueue Service

– In particular, LMD0 processes incoming enqueue request messages and controls access to global enqueues

– It also performs distributed deadlock detections

RAC Background Processes

• LCK0

– Instance Enqueue Background Process

– One LCK0 process per instance

– Assists LMSn processes

– Manages global enqueue requests and cross-instance broadcasts

– The process handles all requests for resources other than data blocks

• For example, LCK0 manages library and row cache requests

RAC Background Processes

• LMON

– Global Enqueue Service Monitor Process

– One LMON process per instance

– Monitors an Oracle RAC cluster to manage global resources

– LMON maintains instance membership within Oracle RAC

– The process detects instance transitions and performs reconfiguration of GES and GCS resources

RAC Background Processes

• DIAG

– Diagnostic Capture Process

– Performs diagnostic dumps

– DIAG performs diagnostic dumps requested by other processes and dumps triggered by process or instance termination

– In Oracle RAC, DIAG performs global diagnostic dumps requested by remote instances

Block Mastering

• In RAC, every data block is mastered by an instance

– The master keeps track of the state of the block, which is maintained in the Global Resource Directory (GRD)

– Mastered in block ranges (128 blocks since 10g)

– Block ranges are uniformly mastered between instances so that Global Cache grants are evenly distributed across all instances

[Diagram: File 1, blocks 000,001 - 384,000, mastered as follows]

Instance 1: blocks 000,001 - 128,000

Instance 2: blocks 128,001 - 256,000

Instance 3: blocks 256,001 - 384,000
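The idea of mastering blocks in fixed-size ranges spread evenly over the instances can be sketched in Python. This is a toy model only: the 128-block range size comes from the slide, while the round-robin assignment and the function name `master_of` are assumptions made for illustration; Oracle's real mapping is internal and is not reproduced here.

```python
from collections import Counter

BLOCKS_PER_RANGE = 128  # range size quoted above (since 10g)


def master_of(block_number: int, instance_count: int) -> int:
    """Return the 1-based instance that masters a block in this toy model."""
    if block_number < 1 or instance_count < 1:
        raise ValueError("block_number and instance_count must be positive")
    range_index = (block_number - 1) // BLOCKS_PER_RANGE
    # Assumption: ranges are dealt out round-robin across the instances.
    return range_index % instance_count + 1


# Count how many blocks each of 3 instances masters over 300 ranges.
grants = Counter(master_of(b, 3) for b in range(1, 300 * BLOCKS_PER_RANGE + 1))
print(grants)
```

Whatever the real mapping is, the property being illustrated is the same: every instance ends up mastering an equal share of the blocks, so grant traffic is spread evenly.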

Global Cache Examples

Global Cache Example: Read From Disk

[Diagram: Resource Master, Instance 1, Instance 2, Instance 3]

Instance 2 requests shared read on block

1. Request to obtain a Shared Resource

2. Request is Granted

3. Read Request

4. Block Delivered

SCN shown: 1000

Global Cache Example: Read to Write

[Diagram: Resource Master, Instance 1, Instance 2, Instance 3]

Instance 3 requests exclusive read on block

1. Request to obtain an Exclusive Resource

2. Instruct to transfer block for exclusive access

3. Transfer block

4. Update Resource Master with Resource Status

SCNs shown: 1000, 1000, 1001

Global Cache Example: Write to Write

[Diagram: Resource Master, Instance 1, Instance 2, Instance 3]

Instance 2 requests exclusive read on block

1. Request to obtain an Exclusive Resource

2. Instruct to transfer block for exclusive access

3. Transfer block *

SCNs shown: 1000, 1001, 1001, 1002

* The instance will create a Past Image of the dirty block before transferring. This is to reduce recovery times upon instance failure.
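The three examples above (read from disk, read to write, write to write) can be replayed with a small Python toy model. Everything here is a simplification for illustration: the class and field names are hypothetical, only the three cases from the slides are modelled (a requester that already holds the block is not handled), and bumping the SCN on every exclusive request stands in for "the requester then changes the block".

```python
from dataclasses import dataclass, field


@dataclass
class BlockState:
    """GRD-style entry for one block in this toy model."""
    mode: str = "none"                       # "none", "shared" or "exclusive"
    holders: set = field(default_factory=set)
    past_images: set = field(default_factory=set)
    scn: int = 1000                          # starting SCN used in the examples


class ResourceMaster:
    def __init__(self):
        self.grd = {}

    def request(self, instance: int, block: str, mode: str) -> str:
        """Grant `mode` on `block` and return where the block comes from."""
        state = self.grd.setdefault(block, BlockState())
        if not state.holders:
            source = "disk"                  # grant only: no instance caches it
        else:
            holder = min(state.holders)      # master instructs a holder to ship it
            source = f"instance {holder}"
            if mode == "exclusive" and state.mode == "exclusive":
                state.past_images.add(holder)  # dirty block: keep a Past Image
        if mode == "exclusive":
            state.holders = {instance}
            state.scn += 1                   # assumption: requester changes the block
        else:
            state.holders.add(instance)
        state.mode = mode
        return source


master = ResourceMaster()
block = "file 1, block 42"                   # hypothetical block identifier
print(master.request(2, block, "shared"))    # read from disk
print(master.request(3, block, "exclusive")) # read to write
print(master.request(2, block, "exclusive")) # write to write
print(master.grd[block])
```

Note that in each case at most three parties take part: the requester, the resource master and one current holder.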

Global Cache Waits

Global Cache Waits: Active Session History (ASH)

• Block-Related Wait Events

– gc current block 2-way

– gc current block 3-way

– gc cr block 2-way

– gc cr block 3-way

• Wait event indicates that a block arrived from the resource master (2-way) or from another instance instructed by the resource master (3-way)

current block = the most recent version of a block, needed when the block is to be changed

cr block (consistent read) = a read-consistent version of a block as of a point in time, used by queries

Global Cache Waits: Active Session History (ASH)

• Message-Related Wait Events

– gc current grant 2-way

– gc cr grant 2-way

• Wait event indicates that no block was received because it wasn’t cached in any instance; instead, a grant was given to read it from disk or modify it

Global Cache Waits: Active Session History (ASH)

• Contention-Related Wait Events

– gc current block busy

– gc cr block busy

– gc buffer busy acquire/release

• Wait event indicates a hot block that could not be shipped immediately, normally due to a remote log flush, high concurrency or the block already having been requested

Global Cache Waits: Active Session History (ASH)

• Load-Related Wait Events

– gc current block congested

– gc cr block congested

• Wait event due to high load, CPU saturation or high Interconnect traffic
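The four families of wait events listed on these slides can be told apart from the event name alone. The following Python sketch groups them that way; the function name and the category labels are chosen for this illustration, and it only covers the events named above.

```python
def classify_gc_wait(event: str) -> str:
    """Map a global cache wait event name to the category used on the slides."""
    name = event.strip().lower()
    if not name.startswith("gc "):
        return "not a global cache wait"
    if name.endswith("congested"):
        return "load"            # high load, CPU saturation, interconnect traffic
    if "busy" in name:
        return "contention"      # hot block could not be shipped immediately
    if " grant " in name:
        return "message"         # no block shipped, only a grant
    if name.endswith(("2-way", "3-way")):
        return "block"           # block shipped by the master or a third instance
    return "other"


for event in ("gc current block 2-way", "gc cr grant 2-way",
              "gc buffer busy acquire", "gc cr block congested"):
    print(f"{event:28s} -> {classify_gc_wait(event)}")
```

A grouping like this is handy when summarising ASH samples, since the remedy differs per category (contention points at hot blocks, load points at CPU or the Interconnect).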

Global Cache Waits: AWR Report

• Global Cache Waits can be observed in AWR:

Global Cache Waits: AWR Report

• Global Cache Load Profile & Efficiency Percentage:

Global Cache Waits: AWR Report

• GCS and GES Statistics:

Global Cache Waits: Oracle Enterprise Manager

• Cluster Cache Coherency:

Scalability

• Scalability is the relationship between workload and resources as resources are added in increments

• For RAC, resources are increased by adding nodes

• Scalability can be:

– linear – direct relationship between workload and resources

– non-linear – more resources are required with increased workload

[Graph: Workload against Resource, with a linear and a non-linear curve]

Scalability

• With RAC overheads:

– Global Cache Service (Blocks)

– Global Enqueue Service (Locks)

• It is impossible to achieve linear scalability

• General observation: 10% RAC overhead per instance (a scale factor of 1.8 for two instances, i.e. 2 × 0.9)

– Overheads do decrease with more instances as the GCS workload is more evenly distributed across the cluster
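The 1.8 figure follows from simple arithmetic, sketched below in Python. The flat 10% overhead per instance is the general observation quoted above; applying it unchanged to larger clusters is an assumption of this sketch (and a pessimistic one, since overheads are said to decrease as instances are added).

```python
def scale_factor(instances: int, overhead: float = 0.10) -> float:
    """Effective capacity in single-instance units, with a flat per-instance overhead."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    if instances == 1:
        return 1.0               # a single instance pays no RAC overhead
    return instances * (1.0 - overhead)


for n in (1, 2, 3, 4):
    print(f"{n} instance(s): linear = {n:.1f}, with overhead = {scale_factor(n):.1f}")
```

The gap between the two columns is the price of Global Cache and Global Enqueue coordination.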

Demo

Dynamic Remastering

• New Feature since 10gR1, improved in 10gR2 and further enhanced in 11g

• When an object is accessed frequently by one instance, that instance becomes the master of the object

– Reduces GC grants and block transfers

• The view V$GCSPFMASTER_INFO shows objects that have been remastered; this information is also available in the AWR report
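The principle that the most frequent accessor becomes the master can be shown with a toy Python model. This is not Oracle's algorithm: the class name, the access-count threshold and the rule that the first accessor is the initial master are all assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


class RemasterTracker:
    """Toy model: move mastership to the instance that accesses an object most."""

    def __init__(self, threshold: int = 50):   # hypothetical threshold
        self.threshold = threshold
        self.master = {}                        # object -> mastering instance
        self.accesses = defaultdict(Counter)    # object -> per-instance counts

    def record_access(self, obj: str, instance: int) -> bool:
        """Record one access; return True if the object was remastered."""
        self.master.setdefault(obj, instance)   # assumption: first accessor masters
        counts = self.accesses[obj]
        counts[instance] += 1
        top, top_count = counts.most_common(1)[0]
        lead = top_count - counts[self.master[obj]]
        if top != self.master[obj] and lead >= self.threshold:
            self.master[obj] = top
            return True
        return False


tracker = RemasterTracker(threshold=3)
tracker.record_access("ORDERS", 1)
remastered = [tracker.record_access("ORDERS", 2) for _ in range(4)]
print(remastered, tracker.master["ORDERS"])
```

Once the object is mastered locally, the busy instance no longer needs to ask a remote master for grants on it, which is where the saving comes from.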

Dynamic Remaster

Demo

Reducing Global Cache Waits Further

• Partitioning Workloads

– Partition workloads by application using Database Services

– This means common data will be accessed within a given instance or isolated to a particular instance

– Reduces Remote Global Cache Requests

• Partitioning Data

– Distribute data using partitions and access it using Database Services

Reducing Global Cache Waits Further

• Minimise Lock Usage

– Avoid unnecessary parsing

– Increase Shared Pool size

– Bind variables

– Cursor sharing

Reducing Global Cache Waits Further

• Use Automatic Segment Space Management (ASSM) – MUST

– Eliminates old linked freelists and replaces them with bitmap freelists

– Performs much faster and scales better

– Hence Oracle recommends ASSM for RAC

• Increase Sequence Caches

– With NOORDER if possible
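Why a larger sequence cache helps can be seen in a toy Python model in which each instance draws values from its own cached range and only touches shared state when that range runs out. The class and function names are hypothetical and the model ignores everything else a real sequence does; it simply counts how often cluster-wide coordination is needed.

```python
class CachedSequence:
    """Toy unordered sequence: each instance uses its own cached range of values."""

    def __init__(self, cache_size: int):
        self.cache_size = cache_size
        self.next_range_start = 1    # shared, cluster-wide state
        self.global_fetches = 0      # how often the shared state was touched
        self.ranges = {}             # instance -> [next value, last value]

    def nextval(self, instance: int) -> int:
        rng = self.ranges.get(instance)
        if rng is None or rng[0] > rng[1]:
            # Cache exhausted: allocate a fresh range (the costly global step).
            start = self.next_range_start
            self.next_range_start += self.cache_size
            self.global_fetches += 1
            rng = self.ranges[instance] = [start, start + self.cache_size - 1]
        value = rng[0]
        rng[0] += 1
        return value


def count_fetches(cache_size: int, calls_per_instance: int = 1000,
                  instances=(1, 2)) -> int:
    seq = CachedSequence(cache_size)
    values = [seq.nextval(i) for _ in range(calls_per_instance) for i in instances]
    assert len(set(values)) == len(values)   # values stay unique across instances
    return seq.global_fetches


print("cache 20:  ", count_fetches(20), "global fetches")
print("cache 1000:", count_fetches(1000), "global fetches")
```

The values handed out are unique but not globally ordered, which is exactly the trade-off accepted by not requesting ordering.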

Reducing Global Cache Waits Further

• Write Contention

– Write “hot spots” due to frequent changes to the same data blocks across all instances

– Other instances request blocks that are being changed

– Blocks can’t be transferred until pending redo is flushed to redo logs

– Latency for deferred block transfer becomes dependent on the log write

– Avoid write “hot spots” using Data Partitioning and Database Services

Reducing Global Cache Waits Further

• Write Contention Continued…

– Place redo logs on fast storage, e.g. SSD, on disks separate from other I/O-busy disks

– 99% of write “hot spots” are due to Indexes, therefore:

• Use Global Hash Partitioned Indexes

• Use Locally Partitioned Indexes

• Drop Unused Indexes
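The reason hash partitioning relieves an index hot spot can be sketched in Python. In an unpartitioned index on an ever-increasing key, new entries all land at the same end of the index; hashing the key spreads them over several partitions. The modulo function below is only a stand-in for Oracle's internal hash (an assumption of this sketch), as are the key values and the partition count.

```python
from collections import Counter

PARTITIONS = 8  # hypothetical number of hash partitions


def hash_partition(key: int, partitions: int = PARTITIONS) -> int:
    """Stand-in hash: pick a partition for a key (Oracle's real hash is internal)."""
    return key % partitions


# 64 consecutive keys, as produced by a sequence-populated column.
recent_keys = range(1_000_000, 1_000_064)
spread = Counter(hash_partition(k) for k in recent_keys)

for partition in sorted(spread):
    print(f"partition {partition}: {spread[partition]} inserts")
```

Instead of every instance competing for one block, concurrent inserts are divided across the partitions, so each block is requested far less often.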

Cache Fusion Accelerator 12c

• New in 12.1.0.2

– OS kernel (Linux & Solaris only) module that can respond directly to certain lock requests via RDSv3

– Lock state saved in memory shared by the database and the kernel

– Saves user/kernel context switches, frees up CPU cycles in LMS and speeds up messages

– Will be incorporated into Engineered Systems

– Improve scalability, bridging the gap between linear and non-linear

– I’ve not tested this YET!

Other Performance Tips

• Recovery

– RECOVERY_PARALLELISM – parallel instance recovery

– FAST_START_PARALLEL_ROLLBACK – parallel recovery of a terminated transaction

• Redo Size

– Size redo logs appropriately to avoid aggressive checkpointing

– Set FAST_START_MTTR_TARGET to a reasonable value, to balance aggressive writing of dirty blocks to disk against longer recovery times

Other Performance Tips

• Ensure cluster is reasonably balanced

– Load Balancing

– Database Services

– Still balanced after a node failure

• Parallel Query

– May increase Global Cache waits but spreads the load across the cluster, thereby increasing performance of queries

– Useful for Large Full Table Scans or DML

Vertical and Horizontal Scaling

• Vertical Scaling (add more CPU and/or Memory)

– Pros:

• Avoids Global Cache Waits by using Local Cache instead of Global Cache

• Simpler to manage a Large Single Instance

Vertical and Horizontal Scaling

• Vertical Scaling (add more CPU and/or Memory)

– Cons:

• Limitation on adding more CPU and Memory

• No High Availability

• Can become very expensive

Vertical and Horizontal Scaling

• Horizontal Scaling (add more nodes, scale out)

– Pros:

• Gets around max CPU and/or Memory limitations

• High Availability

• Highly Scalable

– 100 nodes in 11g

– 64 hub nodes in 12c, with unlimited leaf nodes (FlexCluster)

• Increases network bandwidth to storage across the cluster

• Can use cheaper commodity servers

• RAC Rolling Patching

Vertical and Horizontal Scaling

• Horizontal Scaling (add more nodes, scale out)

– Cons:

• Complexity

• Overcapacity to survive failover

• Additional skillset required

• Increases maintenance (can decrease if used as consolidation platform)

• Takes longer to patch the cluster the larger it gets

• Overheads
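The overcapacity needed to survive failover can be estimated with simple arithmetic, shown here as a Python sketch. It assumes the load of a failed node is redistributed evenly over the survivors and that capacity adds up linearly; both are simplifications, and the function name is chosen for this illustration.

```python
def max_safe_utilisation(nodes: int, failures: int = 1) -> float:
    """Highest average utilisation per node that still fits after `failures` nodes are lost."""
    if nodes < 1 or not 0 <= failures < nodes:
        raise ValueError("need at least one surviving node")
    return (nodes - failures) / nodes


for n in (2, 3, 4, 8):
    print(f"{n} nodes: run each at no more than {max_safe_utilisation(n):.0%}")
```

The headroom per node shrinks as the cluster grows, which is one reason larger clusters use their hardware more efficiently.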

Summary

• RAC does what it says on the tin (High Availability and Highly Scalable)

– Doesn’t come for free (some overheads)

– Near Linear Scalability

– Regardless of the number of instances, the maximum number of instances involved in a block request is 3 (2-way or 3-way grant/block transfer)

– Gives Maximum Scalability

– RAC IS STILL GREAT

References

• A Rough Guide to RAC - Julian Dyke:

http://www.juliandyke.com/Presentations/ARoughGuideToRAC.ppt

• Inside RAC - Julian Dyke:

http://www.juliandyke.com/Presentations/InsideRAC.ppt

• Oracle Database Online Documentation 11g Release 2 (11.2) - Background Processes:

http://docs.oracle.com/cd/E11882_01/server.112/e40402/bgprocesses.htm#REFRN104

• Monitoring Performance:

http://docs.oracle.com/cd/E11882_01/rac.112/e41960/monitor.htm#RACAD986

• Descriptions of Wait Events:

http://docs.oracle.com/cd/E11882_01/server.112/e40402/waitevents003.htm#BGGIBDJI

References

• Oracle RAC Wait Event Tuning:

http://www.dba-oracle.com/t_rac_wait_event_tuning.htm

• RAC object remastering (Dynamic remastering):

https://orainternals.wordpress.com/2010/03/25/rac-object-remastering-dynamic-remastering/

• Oracle RAC Internals - The Cache Fusion Edition:

http://www.slideshare.net/MarkusMichalewicz/oracle-rac-internals-the-cache-fusion-edition

• Oracle 10G RAC Scalability – Lessons Learned:

https://www.toadworld.com/platforms/oracle/w/wiki/403.oracle-10g-rac-scalability-lessons-learned

• Ten Vital Tips for Oracle RAC Performance:

http://www.slideshare.net/ZekeriyaBesiroglu/oracle-rac-performance-tunning-tipstricks

Thank you for listening!

• Zahid.Anwar@Version1.com

– facebook.ZedDBA.co.uk (blog)

– twitter.ZedDBA.co.uk or @ZedDBA

– LinkedIn.ZedDBA.co.uk

– www.ZedDBA.co.uk (coming soon!)