How Many Slaves (Ukoug)

How Many Slaves? Parallel Execution and the Magic of 2. Doug Burns, http://oracledoug.com

description

UKOUG version of a presentation trying to establish the sensible limits of parallelism on a couple of hardware configurations. Detailed white paper is at http://oracledoug.com/px_slaves.pdf

Transcript of How Many Slaves (Ukoug)

Page 1: How Many Slaves (Ukoug)

How Many Slaves?
Parallel Execution and the Magic of 2

Doug Burns
http://oracledoug.com

Page 2: How Many Slaves (Ukoug)

Introduction

- Introduction
- What is the Magic of ‘2’?
- What Tests?
- Test Scripts and Tools
- Test Results
- When is a Conclusion …

Page 3: How Many Slaves (Ukoug)

Introduction

- Who (or what) am I?
  - Scottish
  - Predominantly a DBA
  - Training and Consultancy
- Current Assignment
  - BSkyB
  - Very Cool Projects and Hardware
  - Less Cool Release Management
- http://oracledoug.com
  - Blog

Page 4: How Many Slaves (Ukoug)

Introduction

- Why Parallel Execution?
  - Increasing Volumes of Data
  - Increasing User Expectations
  - More Powerful Hardware
- Parallel Execution (PX) splits a single large task into multiple smaller tasks which are handled by separate processes running concurrently.
  - Full Table Scans
  - Sorts
  - Index Creation, Direct Path inserts etc …

Page 5: How Many Slaves (Ukoug)

Introduction

- Previous Paper
  - Suck It Dry – Tuning Parallel Execution
  - http://oracledoug.com/px.html (.doc & .pdf)
  - Reviewer comments on parallel_max_servers
  - Debate about parallel_adaptive_multi_user
  - Something about the Magic of ‘2’
  - Talked about Hardware, but nothing specific

‘Sometimes when faced with a slow i/o subsystem you might find that higher degrees of parallelism are useful because the CPUs are spending more time waiting for i/o to complete’

Page 6: How Many Slaves (Ukoug)

Introduction

- Always set customer expectation levels
  - I hope you didn’t come here looking for answers!
  - Or lots of detail
    - http://oracledoug.com/px_slaves.pdf (or .doc)
- An interesting story, nonetheless
  - A framework for your own tests
  - A glance at some results

Page 7: How Many Slaves (Ukoug)

Introduction

- Introduction
- What is the Magic of ‘2’?
- What Tests?
- Test Scripts and Tools
- Test Results
- When is a Conclusion …

Page 8: How Many Slaves (Ukoug)

What is The Magic of ‘2’?

- Batch Queue Management and the Magic of ‘2’
  - Cary Millsap (2000) - available at hotsos.com
- How many batch processes to execute per CPU?
  - 2.
  - Well, a range of values really, between 1 and 1.8?
- Most recent work expands on this
  - CPU-intensive batch jobs per CPU <2 (nearer to 1)
  - I/O-intensive batch jobs per CPU >2
  - CPU and I/O request durations are exactly equal (rare) - CPU * 2
- Misconfiguration could change everything
- What is a batch job anyway?
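
One way to see where figures like these could come from is a back-of-envelope utilisation model. This is an assumption made for illustration, not Millsap's derivation: each job alternates one CPU burst with one I/O wait, and a CPU stays busy if enough jobs are queued to cover a single job's I/O wait. The function name and the sample durations are hypothetical.

```python
def jobs_per_cpu(cpu_ms: float, io_ms: float) -> float:
    """Jobs needed per CPU to keep it busy, in a toy model where every
    job alternates one CPU burst with one I/O wait (an assumption)."""
    if cpu_ms <= 0:
        raise ValueError("cpu_ms must be positive")
    # While one job waits io_ms for I/O, io_ms / cpu_ms other jobs can
    # each use the CPU for one burst.
    return (cpu_ms + io_ms) / cpu_ms


# Hypothetical request durations in milliseconds.
equal = jobs_per_cpu(cpu_ms=5, io_ms=5)      # CPU and I/O durations equal
cpu_heavy = jobs_per_cpu(cpu_ms=9, io_ms=1)  # CPU-intensive job
io_heavy = jobs_per_cpu(cpu_ms=2, io_ms=8)   # I/O-intensive job

print(equal, round(cpu_heavy, 2), io_heavy)
```

In this toy model equal durations give exactly two jobs per CPU, CPU-heavy work falls towards one, and I/O-heavy work rises above two, which matches the shape of the bullets above without claiming to reproduce the paper's numbers.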

Page 9: How Many Slaves (Ukoug)

What is The Magic of ‘2’?

- Oracle 10.2 Docs mention the Magic of ‘2’

PARALLEL_THREADS_PER_CPU enables you to adjust for hardware configurations with I/O subsystems that are slow relative to the CPU speed and for application workloads that perform few computations relative to the amount of data involved. If the system is neither CPU-bound nor I/O-bound, then the PARALLEL_THREADS_PER_CPU value should be increased. This increases the default DOP and allow better utilization of hardware resources. The default for PARALLEL_THREADS_PER_CPU on most platforms is two. However, the default for machines with relatively slow I/O subsystems can be as high as eight.
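
The effect of this parameter on the default DOP can be sketched as below. The sketch assumes the single-instance rule, default DOP = CPU_COUNT x PARALLEL_THREADS_PER_CPU, and ignores RAC, adaptive multi-user throttling and Resource Manager limits. The function name is hypothetical; the CPU counts are those of the three test machines in this presentation.

```python
def default_dop(cpu_count: int, threads_per_cpu: int = 2) -> int:
    """Default degree of parallelism on a single instance, assuming
    DOP = CPU_COUNT * PARALLEL_THREADS_PER_CPU and no other limits."""
    if cpu_count < 1 or threads_per_cpu < 1:
        raise ValueError("both values must be at least 1")
    return cpu_count * threads_per_cpu


# PC (1 CPU), ISP4400 (4 CPUs), E10K (12 CPUs)
for cpus in (1, 4, 12):
    # Compare the usual default of two threads per CPU with the
    # "as high as eight" case mentioned in the documentation quote.
    print(cpus, default_dop(cpus), default_dop(cpus, threads_per_cpu=8))
```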

Page 10: How Many Slaves (Ukoug)

What Tests?

- Introduction
- What is the Magic of ‘2’?
- What Tests?
- Test Scripts and Tools
- Test Results
- When is a Conclusion …

Page 11: How Many Slaves (Ukoug)

What Tests?

- What should I test?
  - Parallel operations (obviously)
  - Multiple CPUs
  - I/O infrastructure
- Operating System – Unix / Linux
  - Free (as in beer)
  - Cross-platform
  - Tools and Utilities
- Oracle Version – 10.2
  - The latest and greatest, or common and well-known?
  - Boy, that was a good choice.
- Workloads – Keep it simple
  - Data!
  - CPU vs I/O balance

Page 12: How Many Slaves (Ukoug)

What Tests?

- First attempt
  - Full Table scan of a 2 million row table
  - PCTFREE 90 expanded it to 2.8Gb
  - Small enough for all platforms
  - Big enough to exercise the I/O subsystem properly
  - NOT! EMC took 7 seconds.
- Second attempt
  - Full Table scan of 8 million row table
  - PCTFREE 90 expanded it to 10Gb
  - Too big for the little PC now! (Used 1/8 of the data)
  - Solved most problems
  - But too I/O intensive (More on this later)

Page 13: How Many Slaves (Ukoug)

What Tests?

- Third attempt
  - FTS plus a Hash Join and Sort of two 8 million row tables
  - PCTFREE 90 expanded them to over 10Gb
  - Unsuitable for the PC, used 1/8 data again
  - Started to produce more interesting results
- Multi-user tests
  - More on these later
  - 8 new 1 million row tables
  - PCTFREE 90 expanded them to 147Mb each

Page 14: How Many Slaves (Ukoug)

What Tests?

- The Test Process will be much easier if you have
  - Enough Time
  - Appropriate Hardware
  - A Dedicated Assistant
  - A Pleasant Working Environment
- Two out of Four ain’t bad …

Page 15: How Many Slaves (Ukoug)

What Tests?

- Intel Single-CPU PC – Tulip PC
  - White Box Linux – Kernel 2.6.9
  - 1 x 550MHz Pentium 3
  - 768Mb RAM
  - Single 20Gb IDE
- Intel SMP Server – Intel ISP4400 (SRKA4)
  - White Box Linux – Kernel 2.6.9
  - 4 x 700MHz Pentium 3 Xeon
  - 3.5Gb RAM
  - 4 x Seagate Cheetah U-160 SCSI
    - Software RAID-0 (256Kb stripe)
    - Separate system/software disk
  - Enable/Disable CPUs by editing grub.conf
  - £300 on eBay including all HDD and shipping

Page 16: How Many Slaves (Ukoug)

What Tests?

- Enterprise SMP server – Sun E10K
  - Solaris 8
  - 12 x 400MHz SPARC
  - 12Gb RAM
  - EMC Symmetrix 8730 via Brocade SAN
  - 5 x Hard Disk Slices (Hypers) in RAID 1+0 (960Kb stripe)
  - Enable/Disable CPUs using psradm
- Yes, really!
  - We had some spare kit kicking around. (Thanks, Mike)
- DBA Lessons
  - #1 – Always be nice to System and Storage Administrators
  - #2 – Work for companies with a lot of money

Page 17: How Many Slaves (Ukoug)

Test Scripts and Tools

- Introduction
- What is the Magic of ‘2’?
- What Tests?
- Test Scripts and Tools
- Test Results
- When is a Conclusion …

Page 18: How Many Slaves (Ukoug)

Test Scripts and Tools

- init.ora
  - Disabled parallel_adaptive_multi_user
  - Set parallel_max_servers to 512
    - I forgot to increase this a couple of times
    - A stupid mistake in the paper (and an important lesson)
    - Parallel_max_servers=512 keeps defaulting to 385?
    - processes=400 !
- Setup scripts
  - To be able to recreate environment easily
  - setup1.sql – Tablespaces, user account and privs
  - setup2.sql – Create two 8 million row / 11Gb tables
  - setup3.sql – Create eight 1 million row / 147Mb tables.
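
The 512 versus 385 puzzle is consistent with the instance capping parallel_max_servers at processes minus 15. That offset is only inferred from the two numbers on the slide (400 - 15 = 385), so treat it as an assumption and check it against your own version's documentation and alert log. The function and constant names are hypothetical.

```python
# Inferred from the slide (processes=400 gave 385); not confirmed by the slides.
ASSUMED_RESERVED_PROCESSES = 15


def effective_parallel_max_servers(requested: int, processes: int) -> int:
    """Value the instance would end up with if parallel_max_servers is
    capped at processes minus the assumed reserved-process count."""
    cap = max(processes - ASSUMED_RESERVED_PROCESSES, 0)
    return min(requested, cap)


# The situation on the slide: 512 requested, processes=400.
print(effective_parallel_max_servers(512, 400))

# Raising processes far enough lets the requested value stand.
print(effective_parallel_max_servers(512, 600))
```

Whatever the exact rule, the practical lesson from the slide holds: raise processes along with parallel_max_servers, and confirm the value the instance actually uses before starting a long test run.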

Page 19: How Many Slaves (Ukoug)

Test Scripts and Tools

- Test scripts
  - To run selected SQL statements consistently across a range of DOPs, unattended.
  - rolling.sh – FTS and HJ/Sort against the big tables
  - session.sh – HJ/Sort of one big table and one of the smaller tables, accepting a session parameter so that multiple copies can run concurrently
  - multi.sh – Harness script that runs session.sh for a given number of users
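
The idea behind rolling.sh, running the same statement at each DOP in turn, can be sketched as follows. This is not the author's script (those are in the white paper). It only generates the statements rather than running them, the table name test_tab1 and the COUNT(*) query are hypothetical, and the DOP list is the one on the x-axis of the result charts.

```python
# DOPs from the result charts; None stands for the "No Parallel" run.
DOPS = [None, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 16, 24, 32, 48, 64, 96, 128]


def fts_statement(table: str, dop) -> str:
    """Build a hinted full table scan for one DOP (None = serial)."""
    if dop is None:
        hint = f"/*+ full({table}) no_parallel({table}) */"
    else:
        hint = f"/*+ full({table}) parallel({table}, {dop}) */"
    return f"SELECT {hint} COUNT(*) FROM {table};"


# One statement per DOP, in the order the tests would be run.
statements = [fts_statement("test_tab1", dop) for dop in DOPS]
for statement in statements:
    print(statement)
```

Generating the statements from a single list keeps every run identical apart from the hint, which is the point of an unattended rolling test.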

Page 20: How Many Slaves (Ukoug)

Test Scripts and Tools

- Information Collection
  - Simple log file
    - SQL statements
    - Output
    - Timings
    - Autotrace
    - v$pq_tqstat query after each statement
  - 10046 Trace File
    - Consolidated version, using client_id and trcsess
    - tkprof output too
    - Watch the overhead in disk space and trcsess run time!
  - System Statistics

Page 21: How Many Slaves (Ukoug)

Test Scripts and Tools

- Operating System Statistics
  - Resource Usage
  - Bottlenecks
  - Long-running tests – likely to be a lot of data!
- ORCA/orcallator
  - http://www.orcaware.com/orca
  - Go for the latest development tarball, which includes procallator for Linux statistics collection
  - Easy configuration to generate HTML output
  - Pretty graphs!
  - Lots of them in the paper, but not here.

Page 22: How Many Slaves (Ukoug)

Test Results

- Introduction
- What is the Magic of ‘2’?
- What Tests?
- Test Scripts and Tools
- Test Results
- When is a Conclusion …

Page 23: How Many Slaves (Ukoug)

PC – 1 CPU – 1.3Gb

[Chart: PC - Single CPU - 1 million rows - 1.3Gb. X axis: DOP (No Parallel, 2 to 12, 16, 24, 32). Y axis: Time (mm:ss, 00:00 to 12:57). Series: FTS, HJ.]

Page 24: How Many Slaves (Ukoug)

ISP4400 – 1-4 CPUs – FTS 11Gb

[Chart: ISP4400 - Full Table Scan - 8 million rows - 11Gb. X axis: DOP (No Parallel, 2 to 12, 16, 24, 32, 48, 64, 96, 128). Y axis: Time (mm:ss, 00:00 to 10:04). Series: 1 CPU, 2 CPU, 3 CPU, 4 CPU.]

Page 25: How Many Slaves (Ukoug)

ISP4400 – 1-4 CPUs – HJ 22Gb

[Chart: ISP - HJ with GB - 2 x 8 million rows - 22Gb. X axis: DOP (No Parallel, 2 to 12, 16, 24, 32, 48, 64, 96, 128). Y axis: Time (mm:ss, 00:00 to 20:09). Series: 1 CPU, 2 CPU, 3 CPU, 4 CPU.]

Page 26: How Many Slaves (Ukoug)

E10K – 1-12 CPUs – FTS 11Gb

[Chart: E10K - FTS - 8 million rows - 11Gb. X axis: DOP (No Parallel, 2 to 12, 16, 24, 32, 48, 64, 96, 128). Y axis: Time (mm:ss, 00:00 to 14:24). Series: one line per CPU count, 1 CPU to 12 CPU.]

Page 27: How Many Slaves (Ukoug)

E10K – 1-12 CPUs – HJ 22Gb

[Chart: E10K - HJ with GB - 2 x 8 million rows - 22Gb. X axis: DOP (No Parallel, 2 to 12, 16, 24, 32, 48, 64, 96, 128). Y axis: Time (mm:ss, 00:00 to 36:00). Series: one line per CPU count, 1 CPU to 12 CPU.]

Page 28: How Many Slaves (Ukoug)

Multi-user Tests

- First attempt
  - Hash Join/Sort statement only
  - 170Mb Tables – 128,000 rows (PCTFREE 90)
  - Between 1 and 12 concurrent users, noparallel to DOP 4
  - Showed how quickly PX response drops off with multiple users
  - Then I noticed something strange in the V$PQ_TQSTAT output
  - Slaves weren’t doing much work.
- What’s that sound I can hear?
  - PCTFREE 90 - lots of disk I/O (largely empty blocks)
  - Very small data volumes feeding into later stages of the plan!
  - Mmmmm …. Perhaps that doesn’t test the CPUs too well
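
The speed of the drop-off is easier to picture by counting processes. A rough sketch, assuming each hash join/sort statement uses two slave sets (producers and consumers) and therefore up to 2 x DOP slaves per session. The function name is hypothetical; 12 sessions and DOP 4 are the upper limits of these multi-user tests.

```python
def slaves_requested(sessions: int, dop: int, slave_sets: int = 2) -> int:
    """Upper bound on PX slaves wanted at once, assuming every session
    runs one statement that uses `slave_sets` slave sets of `dop` each."""
    if dop <= 1:
        return 0  # serial execution uses no PX slaves
    return sessions * dop * slave_sets


# Rows: concurrent sessions. Columns: noparallel, DOP 2, DOP 3, DOP 4.
for sessions in (1, 4, 8, 12):
    print(sessions, [slaves_requested(sessions, dop) for dop in (1, 2, 3, 4)])

worst_case = slaves_requested(sessions=12, dop=4)
print(worst_case)
```

Under that assumption the busiest test asks for 96 slaves, which is well inside parallel_max_servers=512 but a very long way past two processes per CPU on a 4-CPU server.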

Page 29: How Many Slaves (Ukoug)

Multi-user Tests

[Chart: ISP4400 - 4 CPUs - HJ of two 147Mb (1,024,000 rows) Tables. X axis: Sessions (1 to 12). Y axis: Time (mm:ss, 00:00 to 01:26). Series: No Parallel, DOP 2, DOP 3, DOP 4.]

Page 30: How Many Slaves (Ukoug)

Doh!

- If the CPUs weren’t working hard enough on the multi-user tests, then …
- I should re-run the Single User/Volume Tests

Page 31: How Many Slaves (Ukoug)

Single User Volume Tests II

[Chart: E10K - HJ with GB - 2 x 65 million rows - 17Gb. X axis: DOP (No Parallel, 2 to 12, 16, 24, 32, 48, 64, 96, 128). Y axis: Time (from 00:00 to just over an hour; the top tick is printed as 04:48.0). Series: 1 CPU, 2 CPU, 3 CPU, 5 CPU, 7 CPU, 9 CPU, 10 CPU, 11 CPU, 12 CPU.]

Page 32: How Many Slaves (Ukoug)

When is a Conclusion …

- Introduction
- What is the Magic of ‘2’?
- What Tests?
- Test Scripts and Tools
- Test Results
- When is a Conclusion …

Page 33: How Many Slaves (Ukoug)

When is a Conclusion …

- … not a Conclusion?
  - When it contains lots of mights, maybes and coulds?
  - When you’ve been testing the wrong thing?
- IF you’re the only user of the server and it has more than one CPU and enough disks then
  - You should definitely give PX at a DOP of 2 a try
  - Benefit from the direct path I/O, not the parallelism?
    - _serial_direct_read=true
- Benefits diminish rapidly
  - If using an unsuitable disk configuration, like these tests
  - Then again, I think a lot of people are

Page 34: How Many Slaves (Ukoug)

When is a Conclusion …

- The only way to know for sure is to test your SQL, with your data with a range of DOPs
  - Then choose something below the apparent optimum?
- Parallel Execution loves hardware
  - But it’s not just about having loads of kit
  - You need to have the right balance of CPU, Memory and I/O bandwidth
  - Bottlenecks will become apparent more quickly
- Don’t use it for online
  - Unless it’s a handful of users
  - With a predictable maximum number of concurrent activities
  - Set parallel_adaptive_multi_user to TRUE? (10g default)
  - You must explain it to your users!
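
One possible way to turn "choose something below the apparent optimum" into a rule is to take the lowest DOP whose elapsed time is within some tolerance of the best time measured. The 10% tolerance, the function name and the timings below are all made up for illustration; they are not results from these tests.

```python
def pick_dop(timings: dict, tolerance: float = 0.10) -> int:
    """Lowest DOP whose elapsed time is within `tolerance` of the best.

    `timings` maps DOP (1 = no parallel) to elapsed seconds.
    """
    best = min(timings.values())
    limit = best * (1 + tolerance)
    good_enough = [dop for dop, seconds in timings.items() if seconds <= limit]
    return min(good_enough)


# Made-up elapsed times in seconds, keyed by DOP.
sample = {1: 600.0, 2: 330.0, 4: 200.0, 8: 185.0, 16: 180.0, 32: 190.0}

print(pick_dop(sample))                  # fastest run is DOP 16, but 8 is close
print(pick_dop(sample, tolerance=0.15))  # a looser tolerance accepts DOP 4
```

Picking the lower DOP trades a few seconds of single-user response time for far fewer slave processes, which leaves headroom when someone else turns up on the server.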

Page 35: How Many Slaves (Ukoug)

When is a Conclusion …

- More things to try
  - Bigger stripe widths and filesystem options (DONE)
  - Different extent and block sizes (DONE)
  - Disk-separated data files and Hash Partitioned Tables
  - Hardware RAID
  - Different Automatic PGA Management Settings
  - Oracle’s Default PX Parameter Values
  - Different SQL
- What have I started ?!?
  - What price an old EMC Symmetrix on eBay?
  - Do you think Scottish Power do 3-phase power for domestic customers?
  - How will I explain the noise to Housemates and Partner!

Page 36: How Many Slaves (Ukoug)

When is a Conclusion …

- The scripts are there
  - http://oracledoug.com/px_slaves.doc
  - Tailor them to your needs. Improve them!
- Let me know your results – I’m interested.
  - Including details of your environment
  - Data creation scripts
  - Your SQL

Page 37: How Many Slaves (Ukoug)

How Many Slaves?
Parallel Execution and the Magic of 2

Doug Burns
http://oracledoug.com