Tuning Oracle RAC Wednesday, June 04, 2014 Guy Peleg President Maklee Engineering...
Agenda
RAC overview & Performance Expectations
Performance Tips
SQL Tuning
Typical RAC configuration
[Diagram: Client, AppServer1 and AppServer2 on the public network; Node 1 (Database Instance 1) and Node 2 (Database Instance 2) joined by the private network (interconnect); each node has local storage, and both reach the shared database over the storage network.]
Oracle RAC
Oracle RAC provides two main features:
Availability
Scalability
May operate in two modes:
All nodes are active (load distributed between nodes)
Active/Passive
RAC scaling/performance considerations are similar to OpenVMS clustering scaling/performance considerations
Interconnect
Locks
Sharing
RAC Scaling – Maklee’s Golden Rules
An application that does not scale on a standalone node will not scale on RAC
Start with single instance tuning
Shut down all nodes, measure scaling
Test scaling by adding CPUs
Add one node at a time to measure scalability
Scalability Benchmark
2-node cluster
1.3 GHz rx2600, running OpenVMS V8.3-1H1
Oracle 10gR2 RAC
Latest set of patches
Test database contains information about:
50,000 customers
200,000 customer orders
200,000 ordered items
Scalability Benchmark
PL/SQL procedure to fetch data about 2000 random customers
Read only test
All data in SGA
No I/O
CPU Bound
Scalability Benchmark
[Chart: elapsed time (seconds per job) to complete the test, less is better; y-axis 0 to 1400. Bars: 1 job; 6 parallel jobs on one instance; 3 parallel jobs on instance 1; 3 parallel jobs on instance 2.]
RAC Proof Of Concept
MAKLEE Engineering recently performed a RAC proof of concept installation at a large chain of department stores in Switzerland.
Benchmarked a single Alpha GS1280 (production node) vs. a RAC cluster running 2 Integrity servers rx6600.
The goals were:
Install RAC
Get hands-on experience with RAC
Perform RAC scaling tests
Make a go/no go decision on implementing RAC in production
Hardware & Software Configuration
Oracle RAC Configuration:
2-node OpenVMS Cluster
Each node is an rx6600 with 8 cores
OpenVMS V8.3-1H1
EVA8000 storage
Products installed:
Oracle CRS (Cluster Ready Services)
Oracle 10g R2
DBCA executed for configuring RAC enabled database
Database patches
27 Parallel Database Import Jobs
[Chart: minutes to complete database import, less is better; y-axis 0 to 160. Bars: Standalone GS1280; rx6600 RAC/standalone Itanium.]
Database Import
Itanium outperformed Alpha
Operating in RAC environment does not increase the throughput of the import operation
Spreading the jobs across two nodes or running all jobs on one node yields identical performance/throughput
No performance degradation witnessed
Batch Processing Benchmark
[Chart: minutes to complete batch processing cycle, less is better; y-axis 0 to 45. Bars: 54 jobs on a single Alpha GS1280; 54 jobs on a single rx6600; 54 jobs spread across the RAC.]
Batch Processing Benchmark
Itanium outperformed Alpha
RAC allows scaling outside of the box
Second RAC node adds 40% more throughput
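The 40% figure can be restated as a scaling efficiency. The Python sketch below is only a worked calculation, not part of the original benchmark: the function name `scaling_efficiency` is made up for illustration, and it assumes that "40% more throughput" is measured relative to a single rx6600 node running the same 54 jobs.

```python
def scaling_efficiency(throughput_gain: float, nodes: int) -> float:
    """Relative throughput of the cluster divided by its node count.

    throughput_gain is the extra throughput over one node
    (0.40 means "40% more"). A result of 1.0 would be linear scaling.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return (1.0 + throughput_gain) / nodes


# Figure quoted on the slide: the second node adds 40% more throughput.
efficiency = scaling_efficiency(0.40, 2)
print(f"Two-node scaling efficiency: {efficiency:.0%}")  # prints 70%
```

Under that assumption, each of the two nodes delivers about 70% of what a standalone node delivers, which is the overhead the interconnect, locking and sharing considerations account for.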
Another Example – European Bank
European Bank migrating from Alpha to Itanium
2 nodes AlphaServer ES47 -> 2 nodes rx7640
Migrating to Oracle 10gR2 RAC
Availability is the main concern
Interactive users will be distributed between nodes
No plans to distribute batch load between nodes
Needed to verify that RAC does not degrade performance
Another Example – European Bank
Benchmarked various batch jobs – focusing on one specific batch job.
Initial results did not favor Itanium.
Batch Processing Benchmark
[Chart: minutes to complete selected batch job, less is better; y-axis 0 to 35. Bars: Alpha ES47; rx7640 out of the box; rx7640 after tuning.]
European Bank - Summary
Tuning is critical for achieving optimal performance
Don’t run “out of the box”.
66% improvement after (minimal) tuning
The specific benchmark runs 52% faster on Itanium compared to Alpha.
European Bank - Summary
All other batch jobs/applications witnessed similar improvement.
RAC increases availability and does not degrade performance.
RAC will go into production in a few weeks
Performance Tips
CRS Base Priority
CRS is running in batch
Usually, runs in a dedicated batch queue
By default, base priority of a batch queue is 4
On a system with thousands of processes, CRS may need to compete (and sometimes lose) for CPU resources
CRS should be given high priority
Set base priority of CRS queue to 12
RAC Cluster Interconnect
The performance of the cluster interconnect is critical to the performance of the RAC.
Interconnect used for:
Cluster management
Locks
Cache Fusion
Oracle requires (at least one) dedicated cluster interconnect
Gigabit Ethernet is highly recommended
Enable Jumbo Frames
Transfer rate of ~25 MB per second (faster than some disks ;-)
Cluster interconnect Performance
Latency is CRITICAL for RAC performance
Measure the latency of the interconnect:
set numwidth 20
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
select
b1.inst_id,
b2.value "GCS CR BLOCKS RECEIVED",
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value/b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1,
gv$sysstat b2
where b1.name='gc cr block receive time'
and b2.name='gc cr blocks received'
and b1.inst_id=b2.inst_id;
Cluster interconnect Performance
Latency should be lower than 15ms
OpenVMS achieved 0.5 ms on a blades RAC (BL860)
V8.3-1H1
Gigabit Ethernet
Jumbo Frames enabled
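The arithmetic behind the latency query can be checked by hand. The Python sketch below mirrors the query's `(b1.value / b2.value) * 10` expression and compares the result with the 15 ms guideline. The counter values are hypothetical, chosen only so that one instance lands on the 0.5 ms figure quoted above; the function name is also an illustration, not an Oracle API.

```python
from typing import Optional

LATENCY_LIMIT_MS = 15.0  # upper bound quoted on the slide


def avg_cr_block_receive_ms(receive_time: int, blocks_received: int) -> Optional[float]:
    """Mirror the query's (b1.value / b2.value) * 10 expression.

    Returns None when no CR blocks were received, which avoids the
    division by zero the SQL would raise on an idle instance.
    """
    if blocks_received == 0:
        return None
    return receive_time / blocks_received * 10


# Hypothetical per-instance counters: (gc cr block receive time, gc cr blocks received).
samples = {1: (1500, 30000), 2: (48000, 24000)}

for inst_id, (receive_time, blocks) in sorted(samples.items()):
    latency = avg_cr_block_receive_ms(receive_time, blocks)
    if latency is None:
        print(f"instance {inst_id}: no CR blocks received yet")
    elif latency < LATENCY_LIMIT_MS:
        print(f"instance {inst_id}: {latency:.1f} ms, within the 15 ms guideline")
    else:
        print(f"instance {inst_id}: {latency:.1f} ms, above the 15 ms guideline")
```

With these made-up counters, instance 1 averages 0.5 ms and instance 2 averages 20.0 ms, so only instance 2 would warrant a closer look at the interconnect.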
Load distribution between instances
set pagesize 60 space 2 numwidth 8 linesize 132 verify off feedback off
column service_name format a20 truncated heading 'Service'
column instance_name heading 'Instance' format a10
column service_time heading 'Service Time|mSec/Call' format 999999999
select service_name,
instance_name,
elapsedpercall service_time,
cpupercall cpu_time,
dbtimepercall db_time,
callspersec throughput
from gv$instance gvi,
gv$active_services gvas,
gv$servicemetric gvsm
where gvas.inst_id=gvsm.inst_id
and gvas.name_hash=gvsm.service_name_hash
and gvi.inst_id=gvsm.inst_id
and gvsm.group_id=10
order by service_name, gvi.inst_id;
Standalone Database Import
[Chart: minutes to complete database import, less is better; y-axis 0 to 8. Bars: rx6600 before tuning; rx6600 after tuning.]
37% Improvement
Database import
Install imp.exe as resident image with shared address space
$ install add imp.exe/resident/share=addr
Increase default quotas for BEQ’s mailboxes
$ define/sys ORA_BEQ_MBXSIZ 64000
$ define/sys ORA_BEQ_MBXSBFQ 64000
Set DEFMBXBUFQUO to 64000
Set DEFMBXMXMSG to 64000
DBMS_STATS.GATHER_SCHEMA_STATS
[Chart: minutes to gather database statistics (350GB database), less is better; y-axis 0 to 100. Bars: rx6600 before tuning; rx6600 after tuning.]
DBMS_STATS.GATHER_SCHEMA_STATS
Calling gather_schema_stats results in a database server process being created
The server process is not multithreaded
Typically consumes 100% of one CPU
Performance improvement achieved by affinitizing the server process to one CPU and increasing QUANTUM to 20.
SORT
Analyze the efficiency of sort operations
Determine the number of optimal, one pass and multipass operations
SELECT optimal_count, round(optimal_count*100/total, 2) optimal_perc,
onepass_count, round(onepass_count*100/total, 2) onepass_perc,
multipass_count, round(multipass_count*100/total, 2) multipass_perc
FROM
(SELECT decode(sum(total_executions), 0, 1, sum(total_executions)) total,
sum(OPTIMAL_EXECUTIONS) optimal_count,
sum(ONEPASS_EXECUTIONS) onepass_count,
sum(MULTIPASSES_EXECUTIONS) multipass_count
FROM v$sql_workarea_histogram
WHERE low_optimal_size > 64*1024);
Sizing the SGA
Reserve memory for the SGA (SYSMAN)
Avoid automatic memory management in the SGA whenever possible.
The following query will help properly size the SGA
select sga_size, sga_size_factor as size_factor,
estd_physical_reads as estimated_physical_reads
from v$sga_target_advice order by sga_size_factor;
Sizing the SGA
SQL> select sga_size, sga_size_factor as size_factor,
  2  estd_physical_reads as estimated_physical_reads
  3  from v$sga_target_advice order by sga_size_factor;
  SGA_SIZE SIZE_FACTOR ESTIMATED_PHYSICAL_READS
---------- ----------- ------------------------
      4356         ,75                 44485808
      5808           1                 24659539
      7260        1,25                 24659539
      8712         1,5                 24659539
     10164        1,75                 24659539
SQL>
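One way to read this advice output is to look for the smallest SGA size at which the estimated physical reads stop improving. The Python sketch below applies that rule to the five rows shown above (with the decimal commas written as decimal points). The helper names `smallest_sufficient_sga` and `read_penalty` are hypothetical, and the rule itself is an assumed reading of the advisor, not a procedure stated in the slides.

```python
# (sga_size, size_factor, estimated_physical_reads), copied from the output above.
ADVICE = [
    (4356, 0.75, 44485808),
    (5808, 1.00, 24659539),
    (7260, 1.25, 24659539),
    (8712, 1.50, 24659539),
    (10164, 1.75, 24659539),
]


def smallest_sufficient_sga(rows):
    """Return the smallest sga_size whose estimated reads equal the minimum."""
    best_reads = min(reads for _, _, reads in rows)
    return min(size for size, _, reads in rows if reads == best_reads)


def read_penalty(rows, sga_size):
    """Ratio of estimated reads at sga_size to the best estimate in the table."""
    best_reads = min(reads for _, _, reads in rows)
    reads_at_size = next(reads for size, _, reads in rows if size == sga_size)
    return reads_at_size / best_reads


print(smallest_sufficient_sga(ADVICE))       # prints 5808
print(f"{read_penalty(ADVICE, 4356):.2f}")   # prints 1.80
```

For this sample, the current size (factor 1) is already on the plateau: growing the SGA is estimated to save no physical reads, while shrinking it to factor 0.75 is estimated to cost about 80% more.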
What’s wrong in this picture?
$ show memory
System Memory Resources on 1-APR-2008 15:32:35.62
Physical Memory Usage (bytes):       Total        Free      In Use    Modified
  Main Memory (GB)                   64.00       58.27        5.69        0.02
Extended File Cache (Time of last reset: 31-MAR-2008 15:14:46.99)
 Allocated (MBytes)          397.03    Maximum size (MBytes)           32768.00
 Free (MBytes)                17.82    Minimum size (MBytes)               3.12
 In use (MBytes)             379.20    Percentage Read I/Os                 77%
 Read hit rate                  99%    Write hit rate                        0%
 Read I/O count             5368075    Write I/O count                  1578011
 Read hit count             5315683    Write hit count                        0
 Reads bypassing cache           79    Writes bypassing cache            241954
 Files cached open              739    Files cached closed                 2255
 Vols in Full XFC mode            0    Vols in VIOC Compatible mode          52
 Vols in No Caching mode          0    Vols in Perm. No Caching mode          0
....
Of the physical memory in use, 8.52 GB are permanently allocated to OpenVMS.
$
SQL Tuning
The next step in improving performance
SQL Tuning!
In the previous “Alpha vs. Itanium” benchmarks we had to play fair
Not a single SQL statement was changed.
SQL tuning may improve performance by orders of magnitude
SQL Tuning
All the tools that are required for SQL tuning ship with the database:
Automatic Workload Repository (AWR)
Endless amount of performance related information
Enhanced version of Statspack
Active Session History (ASH)
Automatic Database Diagnostic Monitor (ADDM)
SQL Access Advisor
SQL Tuning Advisor
Statspack analyzer (not part of the DB but available for free)
The power of SQL tuning
AWR was used to analyze the “scalability benchmark”
97% of the time was spent executing a single SQL statement
After SQL tuning, the elapsed time of the benchmark was reduced from 411 seconds to 3.18 seconds!
[Chart: y-axis 0 to 450. Bars: Untuned version; Tuned version.]
130 times faster!
The power of SQL tuning
“Real life” example
rx6600, Oracle 10g, DWH DB
A single SQL statement required 140 minutes to complete
By biasing the optimizer, the elapsed time was reduced to 10 minutes
[Chart: y-axis 0 to 140. Bars: Untuned version; Tuned version.]
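The speedups in both SQL tuning examples follow directly from the quoted elapsed times. The Python sketch below is just that arithmetic; the function names are illustrative and the only inputs are the figures from the slides (411 s to 3.18 s, and 140 min to 10 min).

```python
def speedup(before: float, after: float) -> float:
    """How many times faster the tuned run is than the untuned run."""
    return before / after


def time_saved_percent(before: float, after: float) -> float:
    """Share of the original elapsed time that tuning removed, in percent."""
    return (1 - after / before) * 100


# Scalability benchmark: 411 seconds before tuning, 3.18 seconds after.
print(f"{speedup(411, 3.18):.1f}x")              # prints 129.2x
print(f"{time_saved_percent(411, 3.18):.1f}%")   # prints 99.2%

# Data warehouse statement: 140 minutes before tuning, 10 minutes after.
print(f"{speedup(140, 10):.1f}x")                # prints 14.0x
```

The first ratio, about 129, is what the slide rounds to "130 times faster"; the data warehouse statement comes out 14 times faster.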
See us at www.maklee.com for:
• Performance improvements
• Oracle Tuning
• Platform Migration
• Custom Engineering solutions
• Custom Training
Questions?