Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 ›...

70
Fewer/Faster vs More/Slower: Practical Considerations Scott Chapman Enterprise Performance Strategies [email protected]

Transcript of Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 ›...

Page 1: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Fewer/Faster vs More/Slower:

Practical ConsiderationsScott Chapman

Enterprise Performance Strategies

[email protected]

Page 2: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Contact, Copyright, and Trademark Notices

Questions?Send email to Scott at [email protected], or visit our website at http://www.epstrategies.com or http://www.pivotor.com.

Copyright Notice:© Enterprise Performance Strategies, Inc. All rights reserved. No part of this material may be reproduced, distributed, stored in a retrieval system,

transmitted, displayed, published or broadcast in any form or by any means, electronic, mechanical, photocopy, recording, or otherwise, without the prior written permission of Enterprise Performance Strategies. To obtain written permission please contact Enterprise Performance Strategies, Inc. Contact information can be obtained by visiting http://www.epstrategies.com.

Trademarks:Enterprise Performance Strategies, Inc. presentation materials contain trademarks and registered trademarks of several companies.

The following are trademarks of Enterprise Performance Strategies, Inc.: Health Check®, Reductions®, Pivotor®

The following are trademarks of the International Business Machines Corporation in the United States and/or other countries: IBM®, z/OS®, zSeries® WebSphere®, CICS®, DB2®, S390®, WebSphere Application Server®, and many others.

Other trademarks and registered trademarks may exist in this presentation.

Page 3: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

z/OS Performance workshops available

During these workshops you will be analyzing your own data!

•WLM Performance and Re-evaluating of Goals

–September 25-29 in Minneapolis, MN

•Parallel Sysplex and z/OS Performance Tuning

–August 29-31, 2017 via the internet

•Essential z/OS Performance Tuning Workshop

–Fall 2017, Columbus OH

Page 4: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Like what you see?

•The z/OS Performance Graphs you see here come from

Pivotor™ but should be in most of the major reporting products

•If not, or you just want a free cursory review of your

environment, let us know!

–We’re always happy to process a day’s worth of data and show you the

results

–See also: http://pivotor.com/cursoryReview.html

Page 5: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Overview

•Processors, caches, c, and performance

•Pros/Cons for fewer/faster vs more/slower

•What choices might you have: example scenarios

•What to expect when changing processor speeds

• Interesting measurements

•Considerations for various workloads

Red = things I want you to pay particular attention to

Page 6: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Clock cycles and effective capacity

• Ideally, you’d like to get real work done each

clock cycle

•z Processor speeds are really fast

–z10 – 4.4 Ghz

–z196 – 5.2Ghz

–zEC12 – 5.5Ghz

–z13 – 5.0 Ghz

•So 1ms to wait for an I/O = millions of clock

cycles

Billions of cycles per second

1 Clock cycle = fraction of a

nanosecond

Page 7: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

How long is a cycle again?

•Just over 2 inches

–Light, in a vacuum

–Electrical signal in a circuit is much slower (40-70% of c)

–1 meter in fiber ~ 5 ns

•Need to make a round trip

•Signal paths aren’t as the mosquito flies

–7.7 Miles of wire in a zEC12 chip, >13 in z13

•Physical distance matters!

Page 8: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Data access hierarchy

• Register

• Memory

– L1 Cache

– L2 Cache

– L3 Cache

– L4 Cache

• Local

• Remote

– Real

• Storage Class Memory

• Disk

– Cache

– SSD

– Spinning

• Network

Clo

ser

to C

PU

Core

Optimal performance &

capacity utilization =

keeping data as close to

processor as possible!

The farther the data is away

from the processor, the more

clock cycles will be spent

accessing it.

Page 9: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Cache utilization & performance

•Memory is far away from the processor core and relatively slow

•Effective use of processor cache is important to keeping the processor

“fed”

•Cache effectiveness measurements are in the Hardware Instrumentation

Services SMF 113 records

–Requires z/OS 1.8 +PTFs & z10 GA2

•Enable HIS and record the 113 records

–Required for effective capacity planning on upgrade

Page 10: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

SMF 113 Data: Average CPI

Page 11: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Dynamic Address Translation

•DAT performed using multiple tables that point to different ranges of

storage

•DAT is not free!

•Result of DAT cached in Translation Look-aside Buffers (TLB)

•TLBs are in L1 cache and managed by the hardware

–Relatively small

•1MB & 2GB pages make TLBs more effective

–Larger pages = Fewer pages = Fewer TLB entries required

• 100GB = 50 2GB pages = 102,400 1MB pages = 26,214,400 4K pages

Page 12: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Estimated impact of TLB Misses

Page 13: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

HiperDispatch Terms

•Logical processors classified as:

–High – The processor is essentially dedicated to the LPAR (100% share)

–Medium – Share between 0% and 100%

–Low – Unneeded to satisfy LPAR’s weight

•This processor classification is sometimes referred to as “vertical” or

“polarity” or “pool”

–E.G. Vertical High = VH = High Polarity = High Pool = HP

•Parked / Unparked

–Initially, VL processors are “parked”: work is not dispatched to them

–VL processors may become unparked (eligible for work) if there is demand and

available capacity

Page 14: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

HiperDispatch off (5 CPs)

L

3L

3

L

3L

3

L

3L

3

L4

Controller

L4 L4

L4 L4

LPAR A LPAR B LPAR C

Page 15: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

HiperDispatch on (5 CPs)

L

3L

3

L

3L

3

L

3L

3

L4 controller

L4 L4

L4 L4

LPAR A LPAR B LPAR C

Turn on

HiperDispatch!

Page 16: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

More/Slower (7 CPs)

L

3L

3

L

3L

3

L

3L

3

L4 controller

L4 L4

L4 L4

LPAR A LPAR B LPAR C

More L1/L2 cache

for the work

Page 17: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Fewer/faster

L

3L

3

L

3L

3

L

3L

3

L4 controller

L4 L4

L4 L4

LPAR A LPAR B LPAR C

Less L1/L2 cache

for the work

Page 18: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

How can you improve cache effectiveness?

• Enable HiperDispatch

• Make good use of large pages

• Upgrade to newer machine

• Consider more/slower CPUs instead of fewer/faster

–More CPUs = More L1/L2/TLB

Cache sizes (L4/book or drawer)

Per CP Per Chip NIC dir Shared

Year Machine L1 Data L1 Inst. L2 Data L2 Inst. L3 L4

2005 z9 EC 256KB 256KB 40MB n/a n/a

2008 z10 EC 128KB 64KB 3MB n/a 48MB

2010 z196 128KB 64KB 1.5MB 24MB 192MB

2012 zEC12 96K 64K 1MB 1MB 48MB 384MB

2015 z13 128K 96K 2MB 2MB 64MB 448MB 960MB

Page 19: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Do you have a choice, and if you do,

what should you choose?

Page 20: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Processor count trade-offs

•Mainframe

–Sub-capacity (“slow”)

processors limited to first

drawer/book

–4 or 26 speed options

• Intel – Xeon chips

–6 to 22 cores / chip

–1.6Ghz – 3.5Ghz

Page 21: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

What’s in a name (or model number)?

•General form: mmmm-snn

–mmmm = model number

–s = relative engine speed

• “EC” machines: 4 (slowest) to 7 (fastest)

• “BC” machines: A (slowest) to Z (fastest)

–nn = number of general purpose engines

•Examples:

–2827-704 = zEC12 with 4 full speed engines

–2964-410 = z13 with 10 slowest-speed engines

–2818-M03 = z114 with 3 engines, middle of the speed range

Common name Model number

z13

z13s

2964

2965

zEC12

zBC12

2827

2828

z196

z114

2817

2818

z10EC

z10BC

2097

2098

z9EC

z9BC

2094

2096

Page 22: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Scott’s ROTs for CPU counts

•Don’t configure z/OS with less than 2 logical CPs

–Possible exception: LPAR is a largely unused sandbox & not in a Sysplex

•Less than 3 physical CPs on a machine troublesome

–2 possibly ok if single primary LPAR

•Minor changes may or may not matter as much, depending on your

LPAR configuration

–Going from 3 faster to 5 slower not as dramatic as 5 to 10 or 3 to 8

–“not as dramatic” doesn’t mean ignore it

–If a couple of extra CPs means more high-polarity processors, that might be a very

good thing

Page 23: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Upgrade scenario examples

•Upgrade z196/z114 to z13

•Keep MSUs about the same

–Easily done for MLC sub-capacity, but consider the ISVs

•Explore the options with zPCR

–Completely fictional configs, but hopefully somewhat representative

z196 2817-704

531 MSUs

z196 2817-504

265 MSUs

z114 2818-V02

92 MSUs

6 LPARs 4 LPARs 3 LPARs

Page 24: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

z13 Options <1500 MSUs

0

100

200

300

400

500

600

700

800

900

1000

1100

1200

1300

1400

1500

See: https://www-304.ibm.com/servers/resourcelink/lib03060.nsf/pages/lsprITRzOSv2r1?OpenDocument

Page 25: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

z13s Options < 500 MSUs

0

100

200

300

400

500

2965-A01

2965-A03

2965-A05

2965-B01

2965-B03

2965-B05

2965-C01

2965-C03

2965-C05

2965-D01

2965-D03

2965-D05

2965-E01

2965-E03

2965-E05

2965-F01

2965-F03

2965-F05

2965-G01

2965-G03

2965-G05

2965-H01

2965-H03

2965-H05

2965-I01

2965-I03

2965-I05

2965-J01

2965-J03

2965-J05

2965-K01

2965-K03

2965-K05

2965-L01

2965-L03

2965-L05

2965-M

01

2965-M

03

2965-M

05

2965-N01

2965-N03

2965-N05

2965-O01

2965-O03

2965-O05

2965-P01

2965-P03

2965-P05

2965-Q01

2965-Q03

2965-Q05

2965-R01

2965-R03

2965-R05

2965-S01

2965-S03

2965-S05

2965-T01

2965-T03

2965-T05

2965-U01

2965-U03

2965-U05

2965-V02

2965-V04

2965-W

01

2965-W

03

2965-X01

2965-X03

2965-Y02

2965-Z01

2965-Z03

Page 26: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Scenario A, Total Capacity

Page 27: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Scenario A, Single CP

Page 28: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Scenario B, Total Capacity

Page 29: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Scenario C, Total Capacity

Page 30: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Scenario Comparisons

Page 31: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Scenario C comparisons

Page 32: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Impact on software cost

Fewer MSUs/Cap

is better!

Page 33: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

What does this mean?

•When considering an upgrade you should consider all engine speed

options

•The choice might impact your software bill, at least slightly

•The big question is: what happens if you change the engine speed?

Page 34: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

What should you expect from

more/slower vs. fewer/faster?

Page 35: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Components of elapsed times

•CPU time – time spent using the CPU

–Directly impacted by CPU speed

•CPU wait/delay time – time spent waiting to get on a CPU

–Related to number & speed of CPUs

• I/O time – time spent doing I/O

–Likely won’t change substantially with CPU changes

•Database or other subsystem request time

–This will likely include some CPU time that’s not charged back to the job

–Also likely includes some I/O time

•Serialization (locking, enqueues, latches, etc.)

–Can be related to how fast other work is running

Page 36: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

What to expect (fewer/faster)

•Lower CPU times

–If you double the processor speed, then CPU time should drop by about half

–Of course, things will not be so simple

•Possibly higher CPU delays

–Fewer processors = more queueing

•Possibly more variable throughput

–High priority, CPU-intensive tasks can monopolize more of the total capacity

–Sync CF requests consume more of your capacity, unless they get faster too

• Always best to keep the CF technology on the same level as your CPU

Page 37: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

What to expect (more/slower)

•Higher CPU times

–If you halve the processor speed, then CPU time should double

–Of course, things will likely not be so easy

•Possibly lower CPU delays

–More processors = less queueing

•Possibly more consistent throughput

–High priority, CPU-intensive tasks can monopolize less of the total capacity

–Sync CF requests consume less of your capacity

• Note CF engines always run at full speed

Page 38: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Fewer/Faster

+ Lower CPU time

+ Better for single-threaded CPU-intensive

workloads

+ Specialty engine capacity constraints cause

less impact

+ Scalable to multiple books/drawers

- CPU spinning/waiting is more of total capacity

(Higher Parallel Sysplex overhead)

- Less L1/L2 cache

- Potentially fewer high polarity processors

(More inter-LPAR impacts)

- Fewer concurrently executing tasks

More/Slower

- Higher CPU times

- Worse for single-threaded CPU-intensive

workloads

- Specialty engine capacity constraints cause

more impact

- Not scalable past one book/drawer

+ CPU spinning/waiting is less of total capacity

(Lower Parallel Sysplex overhead)

+ More L1/L2 cache

+ Potentially more high polarity processors

(Less inter-LPAR impact)

+ More concurrently executing tasks

Professionals and Jailbirds

Best for

single-task

performance

Best for

system efficiency

Page 39: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Interesting measurements

Page 40: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Work units

•Work units is the replacement for the “in-ready” address space counts

–Counts running or waiting work units

•More accurate representation of work because we have an increasing

number of multi-threaded address spaces

•Values by processor type (GP, zIIP, zAAP)

•Plot min, average, max over time

–Max is often far larger than average

•Distribution of observations

–Based on the number of online and not parked processors (N)

–Counts in buckets: N, N+1, N+2, N+3, N+5, N+10, N+15, N+20, N+30…

Page 41: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Work units over time

Page 42: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Work units over time

Page 43: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Work units over time - zIIP

Page 44: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Work units over time - zIIP

Page 45: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More
Page 46: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More
Page 47: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More
Page 48: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Highest task CPU percent

• “New” field in SMF30: SMF30_Highest_Task_CPU_Percent

–For interval records: largest percentage of CPU time used by any task in the

address space = round(TCB / interval * 100)

–For step/job end records: largest reported interval value

•Related field: SMF30_Highest_Task_CPU_Program

–Name of the program loaded by the task that had the largest percentage

•As value approaches 100, per-CPU speed becomes more important

–Threshold for worry: “it depends”

•We don’t know anything about the second-most busy task

• “Spikey” TCBs might be under-represented

–Larger intervals hide larger spikes

Page 49: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Highest task CPU – high level overview

Page 50: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

CPU Intensity

•Consider calculating CPU Intensity: CPU time / Elapsed Time

–Really the same calculation as the highest task CPU percent

•Can do this for entire batch jobs and/or on an interval basis for jobs

and/or started tasks

•Gives you a gross overview of which workloads might be sensitive to

CPU speed changes

–Note same issues as for highest task CPU percent, maybe even worse

–But… considers all CPU time, not just the most intensive TCB

Page 51: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

CPU Intensity

Page 52: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

CICS: QR TCB time

•Each CICS region has multiple TCBs, but only one QR TCB

–Single TCB = scalability bottleneck

•Historically all application code ran on the QR TCB

•Today: many more TCBs

–Prime example: SQL and MQ commands run on L8 TCB

–L9: User Key OpenAPI programs

• But use CICS Key for programs doing DB2/MQ

–L8/L9 TCB pool limit = (2 * max task) + 32

•CICS SMF data will break CPU consumption down by TCB

–Will also show things like the amount of dispatch delay and number of dispatches

Page 53: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Moving between TCBs: not threadsafe

Program start

App code

EXEC CICS (ts)

App code

App code

App code

EXEC CICS (not ts)

App code

App code

EXEC CICS (ts)

App code

App code

App code

EXEC SQL

EXEC SQL

EXEC SQL

EXEC SQL

EXEC SQL

QR L8

Even though the program is not threadsafe,

some amount of execution (SQL/MQ) occurs

on the L8 TCB, not the QR TCB

Every TCB switch takes a certain amount of

CPU

Waits for the QR TCB are also very possible

Page 54: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Moving between TCBs: threadsafe

Program start

App code

EXEC CICS (ts)

App code

EXEC CICS (not ts)

App code

EXEC SQL

App code

EXEC SQL

App code

EXEC SQL

App code

EXEC CICS (ts)

App code

EXEC SQL

App code

EXEC SQL

App code

QR L8

Marking your programs threadsafe means:

• Less TCB switch overhead

• Less QR TCB wait

• More work runs on multiple L8 TCBs

Note that without API(OPENAPI): after

executing a non-threadsafe CICS command,

execution stays on the QR TCB until the next

EXEC SQL (or MQ)

Where possible,

Use threadsafe!Threadsafe Considerations for CICS: http://www.redbooks.ibm.com/abstracts/sg246351.html

Page 55: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Moving between TCBs: threadsafe & OpenAPI

Program start

App code

EXEC CICS (ts)

App code

EXEC CICS (not ts)

EXEC SQL

App code

EXEC SQL

App code

App code

EXEC SQL

App code

EXEC CICS (ts)

App code

EXEC SQL

App code

EXEC SQL

App code

QR L8

Note that with the API(OPENAPI), after

executing a non-threadsafe CICS command,

execution returns to the L8 TCB

But this is EXECKey(CICSKey),

EXECKey(UserKey) is shown on next page

Page 56: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Program start

App code

EXEC CICS (ts)

App code

App code

App code

App code

App code

EXEC CICS (ts)

App code

App code

App code

Threadsafe, OPENAPI, User Key

EXEC CICS (not ts)

EXEC SQL

EXEC SQL

EXEC SQL

EXEC SQL

EXEC SQL

QR L8L9

While this solves our

single-TCB problem,

it re-introduces the

excessive task

switching we started

with

Not recommended for

applications doing

DB2/MQ work!

Page 57: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

CICS Response Time Breakdown

Page 58: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

QR Dispatch Delay

Page 59: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Specialty engine crossover

•SMF 30 and SMF 72 record the amount of CPU time consumed on GPs

that could have been done on zIIPs (or zAAPs)

•Mostly an indication of queueing for specialty engines

•Can be mostly (~99%) stopped by setting IIPHONORPRIORITY(NO)

–Beware DB2 v11 impact if you do this!

•Slower GPs mean the execution time will be longer

–Maybe should just wait for a zIIP to become available?

• ZIIPAWMT can tune that, but probably rarely needed

• If you have significant failover happening, ideally just buy another zIIP

–Or turn on SMT on z13

–Note SMT or not is also a more/slower vs. fewer/faster question

Page 60: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More
Page 61: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Sysplex overhead

•While sync requests are executing, the requesting CPU spins waiting for

a response

•Faster engine with same CF response time = more lost capacity

•More / slower CPs means a smaller portion of your capacity

•Keep CF technology matched as best as possible

Page 62: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More
Page 63: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Practical Considerations - Summary

Page 64: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Batch

•Find candidates that will be impacted

–Consider filtering by batch jobs that consume more than trivial CPU and/or run more

than a trivial amount of time

–Consider filtering by CPU intensity

–Consider batch jobs that are CPU-bound when the system is not CPU-constrained

•Understand how much headroom you have in your batch SLAs

–E.G. if you’re meeting your window by 3 hours today, some delay may be fine

–Make sure the SLAs are representative of real business need!

•Do you care about low-importance batch?

–Maybe: dev/test work needs to get done too!

Page 65: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Started tasks

•Find candidates that will be impacted

–SMF30_Highest_Task_CPU_Percent is a good place to start

–But don’t forget to look for other started tasks with a high CPU intensity

• If moving to fewer/faster beware high-priority workloads’ ability to

monopolize CPUs

Page 66: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

CICS

•Often the CPU time & CPU delay per transaction is so small as to be

unnoticeable to the end user

–Most transactions are using milliseconds of CPU—milliseconds + milliseconds of

wait may not be enough to matter

–But beware applications which trigger multiple transactions per user interaction

•How busy are the QR TCBs in each region?

–Note that this is a single server queueing model: ~70% busy implies 2x service time

–As QR TCB busy exceeds 70% and approaches 100%, increased caution warranted

• If at all possible, make applications threadsafe

Page 67: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

zIIP-eligible workloads

•Slower engines differ more from full-speed zIIPs

•Consider whether you want to allow failover based on

–How often failover occurs

–Whether it occurs during your R4HA peaks

–How important the work that fails over is

–Discrepancy between GPs and zIIPs

Page 68: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Can you use OOCoD to help?

•On/Off Capacity on Demand is great for trying different combinations

–“Delivered” capacity can be less than “purchased” capacity

–Changes within purchased capacity can be done for no additional hardware charge

•But there are rules that you have to be aware of:

–There are various agreements to sign

–You can not decrease the number of physical CPs to less than what was delivered

–You can not decrease the capacity to less than what was delivered

–You can not go beyond what’s physically installed or 2x purchased capacity

–Pre-paid maintenance may be problematic

• Pre-pay based on purchased vs. delivered capacity?

• Try during warranty, lock in before maintenance kicks in?

Page 69: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

OOCoD Matrices (Upgrade Scenario B)Number of CPUs

Speed 1 2 3 4 5 6 7 8 9 10 11

4xx 250 478 697 910 1,118 1,321 1,520 1,716 1,907 2,093 2,277

5xx 746 1,417 2,067 2,694 3,306 3,903 4,485 5,052 5,606 6,145 6,671

6xx 1,068 2,019 2,938 3,827 4,690 5,528 6,342 7,132 7,899 8,644 9,368

7xx 1,695 3,196 4,644 6,041 7,392 8,700 9,964 11,188 12,371 13,515 14,622

Number of CPUs

Speed 1 2 3 4 5 6 7 8 9 10 11

4xx 250 478 697 910 1,118 1,321 1,520 1,716 1,907 2093 2277

5xx 746 1417 2067 2694 3,306 3,903 4,485 5,052 5,606 6,145 6,671

6xx 1,068 2019 2938 3827 4,690 5,528 6,342 7,132 7,899 8,644 9,368

7xx 1,695 3,196 4,644 6,041 7,392 8,700 9,964 11,188 12,371 13,515 14,622

Number of CPUs

Speed 1 2 3 4 5 6 7 8 9 10 11

4xx 250 478 697 910 1,118 1,321 1,520 1,716 1,907 2093 2277

5xx 746 1417 2067 2694 3,306 3,903 4,485 5,052 5,606 6,145 6,671

6xx 1,068 2019 2938 3827 4,690 5,528 6,342 7,132 7,899 8,644 9,368

7xx 1,695 3,196 4,644 6,041 7,392 8,700 9,964 11,188 12,371 13,515 14,622

Number of CPUs

Speed 1 2 3 4 5 6 7 8 9 10 11

4xx 250 478 697 910 1,118 1,321 1,520 1,716 1,907 2093 2277

5xx 746 1417 2067 2694 3,306 3,903 4,485 5,052 5,606 6,145 6,671

6xx 1,068 2019 2938 3827 4,690 5,528 6,342 7,132 7,899 8,644 9,368

7xx 1,695 3,196 4,644 6,041 7,392 8,700 9,964 11,188 12,371 13,515 14,622

Which of these

options will work

best?

Bringing in the machine

as a 410 precludes

using OOCoD to try the

other options

Starting at a 503 is

better

But we really want to

start as a 602 (or 502)

even though we don’t

expect to use that

Blue =

delivered

capacity

Green =

OOCoD

possibilities

Page 70: Fewer/Faster vs More/Slower Practical Considerations › wp-content › uploads › 2017 › 06 › ... · •Consider more/slower CPUs instead of fewer/faster –More CPUs = More

Summary

•Choice of engine speed can affect system efficiency

•Engine speed choice can possibly affect real capacity / MSU

•Slower engines may be a better choice

•Many measurements to review to help you decide

•Examine your workloads

Fill out your evaluations!