UKOUG Tech15 - Overheads of RAC?
Zahid Anwar, Senior Oracle DBA Consultant, Version 1
6th December 2015
UKOUG
Overheads of RAC?
Introducing Version 1
A values-driven organisation that aims to prove that IT
can make a real difference to our clients’
businesses
IT Transformation to deliver business benefit in a
cost efficient manner through our service excellence,
innovation and service improvement
A circa €75m/£53m, 700-strong business with bases
across the UK and Ireland.
Broader and better than niche players, better
service than global players, nearshore and onshore
rather than offshore; values-driven approach to
delivering trusted advice to customers
Enterprise Architecture & Change
Microsoft Solutions
Oracle Solutions
Development Technologies and Services
Business Intelligence & Analytics
Infrastructure & Cloud Services
Licence Management
4 Main Sectors
3 Key Technology Specialisations
2 Delivery Practices
7 Areas of Deep Expertise
Business Solutions
Managed Services
About Version 1
[Chart: 4 Main Sectors (Commercial, Financial, Public, Utilities) with shares of 26%, 21%, 35% and 18%; sector-to-share mapping not shown]
Our History
A bit about myself
• Senior Oracle DBA Consultant (10+ years experience)
• An Oracle Certified Master, 2nd in Version 1
• Aspiring to become an Oracle ACE
• Oracle 10g & 11g Certified RAC Expert and 11gR2 RAC and GI Certified Expert
• Exadata and ODA Specialist
• Follow me on:
– facebook.ZedDBA.co.uk (blog)
– twitter.ZedDBA.co.uk or @ZedDBA
– LinkedIn.ZedDBA.co.uk
– www.ZedDBA.co.uk (coming soon!)
What is RAC?
• Instance
– Comprises Oracle-related memory and OS
processes on a server
• Database
– Consists of a collection of data files, control files
and redo logs located on disk
[Diagram: a Server running an Instance, with its Database on disk]
What is RAC?
• Real Application Clusters (RAC)
– Allows multiple instances to run on separate servers (nodes) concurrently
accessing a single database
– The single database is placed on shared storage accessible to all nodes
– Instances communicate over an Interconnect network
[Diagram: Node 1 (Instance 1) and Node 2 (Instance 2), connected by the Interconnect, both accessing the Database on Shared Storage]
Why use RAC?
• The main aim of RAC is to implement a clustered database to provide:
– Increased availability/resilience
– Increased scalability
– Improved maintainability
– Reduction in total cost of ownership
• Commodity Hardware
• Consolidation Platform
• Need to take Oracle RAC licenses into consideration
Cache Fusion
• Cache Fusion
– Allows Oracle RAC to “fuse” the in-memory data cached (physically separate) on
each node into a single Global Cache (GC)
– Through a set of dedicated RAC background processes
– Using the Interconnect for communicating GC messages and for transferring data
blocks
Access Times
• Locally in the Instance Local Cache
– Access time: nanoseconds (ns), billionths (1/1,000,000,000) of a second
• Remote in another Instance Cache (Global Cache)
– Access time: microseconds (μs), millionths (1/1,000,000) of a second
• On Disk
– Access time: milliseconds (ms), thousandths (1/1,000) of a second (spinning disks)
– Access time: microseconds (μs), millionths (1/1,000,000) of a second (SSD/Flash)
– Access time: microseconds (μs), millionths (1/1,000,000) of a second (NV RAM)
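To put the tiers above in proportion, the arithmetic can be sketched as follows. The figures are only the orders of magnitude named on the slide (1 ns, 1 μs, 1 ms), not measurements, and the names used here are illustrative.

```python
# Order-of-magnitude access times taken from the units on the slide.
# These are NOT measurements; real latencies vary with hardware and load.
ACCESS_TIME_SECONDS = {
    "local cache": 1e-9,      # nanoseconds
    "global cache": 1e-6,     # microseconds (remote instance cache)
    "spinning disk": 1e-3,    # milliseconds
}


def slowdown(slower: str, faster: str) -> float:
    """Return how many times slower one tier is than another."""
    return ACCESS_TIME_SECONDS[slower] / ACCESS_TIME_SECONDS[faster]


if __name__ == "__main__":
    for slower, faster in [("global cache", "local cache"),
                           ("spinning disk", "global cache")]:
        print(f"{slower} vs {faster}: about {round(slowdown(slower, faster))}x")
```

On these round numbers each tier is roughly three orders of magnitude slower than the one before it, which is why a Global Cache transfer is usually still preferable to a read from spinning disk.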
RAC Related Terminology
• Resource
– An object, where access is controlled at instance level
• Global Resource
– An object, where access is controlled at cluster level
• Enqueue
– Serialises local access to a resource
• Global Enqueue
– Serialises global access to a resource
RAC Services
• Global Cache Service (GCS)
– Implements Cache Fusion
– Coordinates access privileges to database blocks for instances
– Responsible for block transfers between instances
– Guarantees the data integrity by employing global access levels
• Global Enqueue Service (GES)
– Performs concurrency control (locks) on dictionary cache, library cache and transactions
– Performs deadlock detection
• Global Resource Directory (GRD)
– Records the owner of each resource and its current state
– Distributed across all instances
– Maintained by GCS and GES
RAC Background Processes
• Each RAC instance has the standard set of background processes:
– PMON
– SMON
– LGWR
– DBWn
– ARCn
• Additional background processes to support Global Cache Service and Global Enqueue Service:
– LMSn
– LMD0
– LCK0
– LMON
– DIAG
RAC Background Processes
• LMSn
– Global Cache Service Process (Cache Fusion)
– Manages resources and provides resource control among Oracle RAC
instances
– Up to 36 LMSn processes, where n is 0-9 or a-z
– Maintains a lock database for Global Cache Service (GCS) and buffer
cache resources
– This process receives, processes, and sends GCS requests, block
transfers, and other GCS-related messages
RAC Background Processes
• LMD0
– Global Enqueue Service Daemon 0 Process
– One LMD0 process per instance
– Manages incoming remote resource requests from other instances
– LMD0 processes enqueue resources managed under Global Enqueue
Service
– In particular, LMD0 processes incoming enqueue request messages and
controls access to global enqueues
– It also performs distributed deadlock detection
RAC Background Processes
• LCK0
– Instance Enqueue Background Process
– One LCK0 process per instance
– Assists LMSn processes
– Manages global enqueue requests and cross-instance broadcasts
– The process handles all requests for resources other than data blocks
• For example, LCK0 manages library and row cache requests
RAC Background Processes
• LMON
– Global Enqueue Service Monitor Process
– One LMON process per instance
– Monitors an Oracle RAC cluster to manage global resources
– LMON maintains instance membership within Oracle RAC
– The process detects instance transitions and performs reconfiguration of
GES and GCS resources
RAC Background Processes
• DIAG
– Diagnostic Capture Process
– Performs diagnostic dumps
– DIAG performs diagnostic dumps requested by other processes and
dumps triggered by process or instance termination
– In Oracle RAC, DIAG performs global diagnostic dumps requested by
remote instances
Block Mastering
• In RAC, every data block is mastered by an instance
– The master keeps track of the state of the block, which is maintained in the Global Resource Directory (GRD)
– Mastered in block ranges (128 blocks since 10g)
– Block ranges are uniformly mastered between instances so that Global Cache
grants are evenly distributed across all instances
[Diagram: File 1, blocks 000,001 - 384,000, mastered across three instances]
Instance 1: blocks 000,001 - 128,000
Instance 2: blocks 128,001 - 256,000
Instance 3: blocks 256,001 - 384,000
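The idea of spreading block ranges evenly over the instances can be sketched as below. This is a simplified illustration only: it deals ranges out round-robin, which is an assumption made for the example and not Oracle's actual mapping. The function name and parameters are hypothetical.

```python
def master_for_block(block_no: int, blocks_per_range: int, instances: int) -> int:
    """Return the 1-based instance number that masters a block.

    Ranges are dealt out round-robin across the instances. This is an
    illustrative simplification, not Oracle's real mastering algorithm.
    """
    if block_no < 1:
        raise ValueError("block numbers start at 1")
    range_index = (block_no - 1) // blocks_per_range
    return range_index % instances + 1


if __name__ == "__main__":
    # The figure above: 3 instances, ranges of 128,000 blocks.
    for block in (1, 128_000, 128_001, 256_001, 384_000):
        print(block, "-> Instance", master_for_block(block, 128_000, 3))
```

With the figure's numbers, blocks 1 to 128,000 map to Instance 1, the next 128,000 to Instance 2 and the last 128,000 to Instance 3, so each instance masters the same number of blocks.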
Global Cache Examples
Global Cache Example: Read From Disk
[Diagram: Instance 1, Instance 2 and Instance 3, one of which is the Resource Master for the block]
Instance 2 requests shared read on block
1. Request to obtain a Shared Resource
2. Request is Granted
3. Read Request
4. Block Delivered (SCN 1000)
Global Cache Example: Read to Write
[Diagram: Instance 1, Instance 2 and Instance 3, one of which is the Resource Master for the block]
Instance 3 requests exclusive access to block
1. Request to obtain an Exclusive Resource
2. Instruct to transfer block for exclusive access
3. Transfer block (SCN 1000)
4. Update Resource Master with Resource Status
SCNs shown: 1000 (transferred block), 1001 (after the change)
Global Cache Example: Write to Write
[Diagram: Instance 1, Instance 2 and Instance 3, one of which is the Resource Master for the block]
Instance 2 requests exclusive access to block
1. Request to obtain an Exclusive Resource
2. Instruct to transfer block for exclusive access
3. Transfer block *
SCNs shown: 1000, 1001, 1001, 1002
* The instance will create a Past Image of the dirty block before transferring. This is to
reduce recovery times upon instance failure.
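The three examples above can be walked through with a toy model. This is only a sketch under stated assumptions: Instance 1 is taken to be the Resource Master (the transcript does not say which instance it is), the class and method names are hypothetical, and the real GCS protocol has many more states and messages than shown here.

```python
from dataclasses import dataclass, field


@dataclass
class BlockResource:
    """Toy record a resource master might keep for one block (not Oracle's GRD)."""
    master: int                                      # instance mastering the block
    scn: int = 1000                                  # SCN of the latest block version
    dirty: bool = False                              # changed since read from disk?
    holders: dict = field(default_factory=dict)      # instance -> "S" or "X"
    past_images: dict = field(default_factory=dict)  # instance -> SCN of past image

    def request(self, requester: int, mode: str) -> dict:
        """Grant shared ("S") or exclusive ("X") access to the requester."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        involved = {requester, self.master}
        others = [i for i in self.holders if i != requester]
        if others:
            source = "cache"                  # a holder ships the block
            holder = others[0]
            involved.add(holder)
            if mode == "X" and self.holders[holder] == "X" and self.dirty:
                # Write to Write: the sender keeps a past image of the dirty block.
                self.past_images[holder] = self.scn
        elif requester in self.holders:
            source = "local cache"
        else:
            source = "disk"                   # nobody caches it: grant, then read
        if mode == "X":
            self.holders = {requester: "X"}
        else:
            # Simplification: any exclusive holder is downgraded to shared.
            self.holders = {i: "S" for i in self.holders}
            self.holders[requester] = "S"
        return {"source": source, "instances_involved": len(involved)}

    def modify(self, instance: int) -> None:
        """Change the block; only the exclusive holder may do so."""
        if self.holders.get(instance) != "X":
            raise PermissionError("exclusive access required")
        self.scn += 1
        self.dirty = True


if __name__ == "__main__":
    block = BlockResource(master=1)   # assumption: Instance 1 is the master
    print(block.request(2, "S"))      # Read From Disk
    print(block.request(3, "X"))      # Read to Write
    block.modify(3)
    print(block.request(2, "X"))      # Write to Write
    block.modify(2)
    print(block.scn, block.past_images)
```

In this model a request never involves more than three instances (requester, master and one holder), however many instances the cluster has.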
Global Cache Waits
Global Cache Waits: Active Session History (ASH)
• Block-Related Wait Events
– gc current block 2-way
– gc current block 3-way
– gc cr block 2-way
– gc cr block 3-way
• Wait event indicates that a block arrived from resource master (2-way) or
from another instance instructed by resource master (3-way)
current block = the first time a block is read into buffer
cr block (consistent read) = subsequently, when a block is transferred to another instance
Global Cache Waits: Active Session History (ASH)
• Message-Related Wait Events
– gc current grant 2-way
– gc cr grant 2-way
• Wait event indicates that no block was received as it wasn’t cached in any
instance, instead a grant was given to read from disk or modify it
Global Cache Waits: Active Session History (ASH)
• Contention-Related Wait Events
– gc current block busy
– gc cr block busy
– gc buffer busy acquire/release
• Wait event due to a hot block that could not be shipped immediately;
normally due to a remote log flush, high concurrency or an already requested block
Global Cache Waits: Active Session History (ASH)
• Load-Related Wait Events
– gc current block congested
– gc cr block congested
• Wait event due to High Load, CPU saturation or High Interconnect Traffic
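The four groups of wait events above can be kept in a small lookup to summarise sampled events. This is a minimal sketch: the grouping is taken from the slides, while the function names and the sample list are hypothetical and no database is queried.

```python
from collections import Counter

# Wait events grouped as on the slides above.
WAIT_EVENT_CATEGORIES = {
    "block": [
        "gc current block 2-way", "gc current block 3-way",
        "gc cr block 2-way", "gc cr block 3-way",
    ],
    "message": ["gc current grant 2-way", "gc cr grant 2-way"],
    "contention": [
        "gc current block busy", "gc cr block busy",
        "gc buffer busy acquire", "gc buffer busy release",
    ],
    "load": ["gc current block congested", "gc cr block congested"],
}


def categorise(event: str) -> str:
    """Return the slide category for a wait event name, or 'other'."""
    for category, events in WAIT_EVENT_CATEGORIES.items():
        if event in events:
            return category
    return "other"


def summarise(sampled_events) -> Counter:
    """Count sampled wait events per category."""
    return Counter(categorise(event) for event in sampled_events)


if __name__ == "__main__":
    # Hypothetical sample of event names, e.g. copied from an ASH report.
    sample = ["gc cr block 2-way", "gc cr block busy",
              "gc cr block busy", "db file sequential read"]
    print(summarise(sample))
```

A summary dominated by "contention" or "load" points at hot blocks or a saturated interconnect or CPU, whereas "block" and "message" waits are the normal cost of Cache Fusion.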
Global Cache Waits: AWR Report
• Global Cache Waits can be observed in AWR:
Global Cache Waits: AWR Report
• Global Cache Load Profile & Efficiency Percentage:
Global Cache Waits: AWR Report
• GCS and GES Statistics:
Global Cache Waits: Oracle Enterprise Manager
• Cluster Cache Coherency:
Scalability
• Scalability is the relationship between workload and resources as resources are
added in increments
• For RAC, resources are increased by adding nodes
• Scalability can be:
– linear – direct relationship between workload and resources
– non-linear – more resources are required with increased workload
[Chart: Workload plotted against Resource, showing a linear and a non-linear curve]
Scalability
• With RAC overheads:
– Global Cache Service (Blocks)
– Global Enqueue Service (Locks)
• It is impossible to achieve linear scalability
• General observation, 10% RAC overhead per instance (scale factor of 1.8)
– Overheads do decrease with more instances as the GCS workload is
more evenly distributed across the cluster
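The scale factor of 1.8 follows from two instances each losing 10% to RAC overhead. A minimal sketch of that arithmetic is below. Applying the same flat 10% to larger clusters is an assumption made here for illustration only, since the slide notes that overheads decrease as instances are added; the function name is hypothetical.

```python
def effective_capacity(instances: int, overhead_per_instance: float = 0.10) -> float:
    """Capacity in single-instance units after a flat per-instance overhead.

    A single instance is taken as the baseline of 1.0 with no RAC overhead.
    The flat overhead for more than two instances is an assumption.
    """
    if instances < 1:
        raise ValueError("need at least one instance")
    if instances == 1:
        return 1.0
    return instances * (1.0 - overhead_per_instance)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"{n} instance(s): {effective_capacity(n):.1f}x")
```

Under this model capacity grows with every node added but always stays below the linear ideal of n times a single instance.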
Demo
Dynamic Remastering
• New Feature since 10gR1, improved in 10gR2 and further enhanced in 11g
• When an object is accessed by an instance frequently, then that instance
becomes the master of the object
– Reduces GC grants and block transfers
• The view V$GCSPFMASTER_INFO shows objects that have been
remastered, info also available in AWR report
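The remastering decision can be illustrated with a toy policy: if one instance accounts for most accesses to an object, it becomes the master. The thresholds and names below are hypothetical and chosen for the example; Oracle's real policy and its parameters are not described in these slides.

```python
from collections import Counter


def choose_master(current_master: int, accesses: Counter,
                  dominance: float = 0.5, min_accesses: int = 100) -> int:
    """Return the instance that should master an object.

    accesses maps instance number -> access count in the sampling window.
    dominance and min_accesses are illustrative thresholds, not Oracle's.
    """
    total = sum(accesses.values())
    if total < min_accesses:
        return current_master          # too little activity to justify a move
    busiest, count = accesses.most_common(1)[0]
    if count / total > dominance:
        return busiest                 # one instance dominates: remaster to it
    return current_master


if __name__ == "__main__":
    print(choose_master(1, Counter({1: 10, 2: 500})))
    print(choose_master(1, Counter({1: 200, 2: 200, 3: 200})))
```

In the first call Instance 2 dominates and would take over mastership; in the second the accesses are evenly spread, so the master stays where it is.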
Dynamic Remaster
Demo
Reducing Global Cache Waits Further
• Partitioning Workloads
– Partition workloads by application using Database Services
– This means common data will be accessed within a given instance or
isolated to a particular instance
– Reduces Remote Global Cache Requests
• Partitioning Data
– Distribute data using partitions and access it using Database Services
Reducing Global Cache Waits Further
• Minimise Lock Usage
– Avoid unnecessary parsing
– Increase Shared Pool size
– Bind variables
– Cursor sharing
Reducing Global Cache Waits Further
• Use Automatic Segment Space Management (ASSM) – MUST
– Eliminates old linked freelists and replaces them with bitmap freelists
– Performs much faster and scales better
– Hence Oracle recommends ASSM for RAC
• Increase Sequence Caches
– With No Order if possible
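The effect of a larger sequence cache can be sketched with simple arithmetic: each time the cache is exhausted, the sequence has to be advanced in the data dictionary. The function below is an illustrative model with hypothetical names, counting those refreshes for a given number of values.

```python
import math


def dictionary_updates(values_needed: int, cache_size: int) -> int:
    """Number of times the cached range must be refreshed.

    A cache_size of 0 models NOCACHE, where every value needs an update.
    """
    if values_needed < 0 or cache_size < 0:
        raise ValueError("arguments must be non-negative")
    return math.ceil(values_needed / max(cache_size, 1))


if __name__ == "__main__":
    for cache in (0, 20, 1000):
        print(f"cache {cache}: {dictionary_updates(10_000, cache)} updates")
```

For 10,000 values a cache of 20 needs 500 refreshes while a cache of 1,000 needs only 10. With No Order, each instance can also work from its own cached range without coordinating the order of values across the cluster.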
Reducing Global Cache Waits Further
• Write Contention
– Write “hot spots” due to frequent changes to same data blocks across all
instances
– Other instances request blocks that are being changed
– Blocks can’t be transferred until pending redo is flushed to redo logs
– Latency for deferred block transfer becomes dependent on the log write
– Avoid write “hot spots” using Data Partitioning and Database Services
Reducing Global Cache Waits Further
• Write Contention Continued…
– Place redo logs on fast storage, e.g. SSD, and on disks separate from other
I/O-busy disks
– 99% of write “hot spots” are due to Indexes, therefore:
• Use Global Hash Partitioned Indexes
• Use Locally Partitioned Indexes
• Drop Unused Indexes
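Why hash partitioning relieves an index hot spot can be shown with a toy model. Consecutive key values (for example from a sequence) all land in the same leaf block of an ordinary index, whereas a hash-partitioned index spreads them over one leaf block per partition. The modulo below is only a stand-in for Oracle's hash function, and the names and sizes are hypothetical.

```python
def leaf_blocks_touched(keys, keys_per_leaf: int = 100, partitions: int = 1) -> int:
    """Count distinct (partition, leaf block) pairs hit by a set of keys.

    partitions=1 models a non-partitioned index. key % partitions is an
    illustrative stand-in for the real hash function.
    """
    touched = set()
    for key in keys:
        partition = key % partitions
        position = key // partitions      # position of the key inside its partition
        touched.add((partition, position // keys_per_leaf))
    return len(touched)


if __name__ == "__main__":
    burst = range(1000, 1032)             # 32 consecutive keys inserted concurrently
    print("non-partitioned:", leaf_blocks_touched(burst))
    print("8 hash partitions:", leaf_blocks_touched(burst, partitions=8))
```

In the non-partitioned case every insert in the burst competes for a single leaf block; with eight partitions the same inserts are spread over eight blocks, so fewer instances contend for each one.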
Cache Fusion Accelerator 12c
• New in 12.1.0.2
– OS kernel (Linux & Solaris only) module that can respond directly to
certain lock requests via RDSv3
– Lock state saved in memory shared by the database and the kernel
– Saves user/kernel context switches, frees up CPU cycles in LMS and
speeds up messages
– Will be incorporated into Engineered Systems
– Improves scalability, bridging the gap between linear and non-linear
– I’ve not tested this YET!
Other Performance Tips
• Recovery
– RECOVERY_PARALLELISM – parallel instance recovery
– FAST_START_PARALLEL_ROLLBACK – parallel recovery of a
terminated transaction
• Redo Size
– Size appropriately so as to avoid aggressive checkpointing
– Set FAST_START_MTTR_TARGET to a reasonable value, so as to
balance aggressive writing of dirty blocks to disk versus longer recovery
times
Other Performance Tips
• Ensure cluster is reasonably balanced
– Load Balancing
– Database Services
– Still balanced after a node failure
• Parallel Query
– May increase Global Cache waits but spreads the load across the
cluster thereby increasing performance of queries
– Useful for Large Full Table Scans or DML
Vertical and Horizontal Scaling
• Vertical Scaling (add more CPU and/or Memory)
– Pros:
• Avoids Global Cache Waits by using Local Cache instead of Global Cache
• Simpler to manage a Large Single Instance
Vertical and Horizontal Scaling
• Vertical Scaling (add more CPU and/or Memory)
– Cons:
• Limitation on adding more CPU and Memory
• No High Availability
• Can become very expensive
Vertical and Horizontal Scaling
• Horizontal Scaling (add more nodes, scale out)
– Pros:
• Gets around max CPU and/or Memory limitations
• High Availability
• Highly Scalable
– 100 nodes in 11g
– 64 hub nodes in 12c, with unlimited leaf nodes (FlexCluster)
• Increases network bandwidth to storage across the cluster
• Can use cheaper commodity servers
• RAC Rollable Patching
Vertical and Horizontal Scaling
• Horizontal Scaling (add more nodes, scale out)
– Cons:
• Complexity
• Overcapacity to survive failover
• Additional skillset required
• Increases maintenance (can decrease if used as consolidation platform)
• Takes longer to patch the cluster the larger it gets
• Overheads
Summary
• RAC does what it says on the tin (High Availability and Highly Scalable)
– Doesn’t come for free (some overheads)
– Near Linear Scalability
– Regardless of the number of instances, the maximum number of
instances involved in a block request is 3 (2-way or 3-way grant/block
transfer)
– Gives Maximum Scalability
– RAC IS STILL GREAT
Thank you for listening!