Edward Jones IS Capacity Planning and Performance Management Jim Poletti October 23, 2007.



About Edward Jones...

• Full service investment firm

• 10,000+ branches – US, Canada, UK

• 1 "broker" and 1 branch office administrator per branch

• Land-line WAN – DSL or T1

• St. Louis datacenter is hub for most traffic

• Tempe datacenter primarily DR for mainframe

• 21,000 users signed on to CICS at high-water


IS Capacity Planning & Performance Management

• Jim Poletti (MF Performance Analyst)

• Gerry Oliver (MF Performance Analyst)

• Greg Volk (Network Performance Analyst)

• Rick Pranger (Open Systems Performance Analyst)

• Dwayne Allen (Open Systems Performance Analyst)

• Tom Siech (Load Tester)

• Brandy Brown (Load Tester)

Rich Unnerstall (Director – Data Center Operations)

Art Morlock (Department Leader)


St. Louis Mainframe Hardware

• All LPARs run on 1 physical mainframe

• IBM z9 2094-707 – 3,516 MIPS – z/OS 1.7

• 80 GB memory

• 40 TB DASD – EMC RAID-1 and -7, 5 ms

• Older Symmetrix – replacing with DMX-4

• Data replication to Tempe using SRDF


CPU by LPAR


Production Environment/LPAR

• 1 LPAR (no data-sharing SYSPLEX yet)

• 25 CICS regions – 19 AORs, 5 TORs, 1 FOR

• 32 million CICS transactions/day = 7 million user "enters"

• DB2 – 1 subsystem

• IDMS – 5 regions, 15 million run units/day

• RRDF replication in DB2 and IDMS to Tempe


Responsibilities

• Assure system performance and scalability.

• Provide capacity planning support for purchasing decisions.

• Tune the mainframe hardware "till the wheels come off", then buy capacity.

• Hotline, war room participation.

• Performance Testing.


Early Morning "System Checks"

• Check system "barometers" from yesterday

• Check performance graphs and reports

• CICS transactions – Volume, CPU, Response

• LPAR CPU

• Memory

• DASD

• DB2

• IDMS

• Development response time – TSO, compiles


Houston, we have a problem!

• Go into detective mode

• Start at high level, look at service classes within LPAR for abnormalities


Daily Workload Statistics for 9:30-10:30 on Wed, Oct 17, 2007, Compared to Prior 4 Wednesdays

Service     CPU Util   CPU Util     Change in   % Change   Real Memory   Real Memory
Class       17-Oct     Prior 4      CPU Util    CPU        Gb            Prior 4
                       Wednesdays                                        Wednesdays

BAT_HOT     0.3        0.3          0           -8         7.6           8.6
BAT_1       1.6        1.5          0.1         5          20.1          15.7
BAT_2       3.6        3.6          0           1          52            126.2
CICS_1      11.8       11.2         0.6         6          1490          1490
CICS_2      33.4       34.5         -1.2        -3         2037          2246
CICS_3      0.6        0.8          -0.2        -27        315.5         352.5
DB2_HI      1.6        1.8          -0.2        -11        6648          6636
DB2_LO      0.6        0.6          -0.1        -11        21.9          25.5
IDMS        11.3       11.9         -0.6        -5         1390          1398
MQSERIES    0.3        0.2          0.1         35         775           418.7
NEWWORK     0          0            0           -44        0
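The comparison behind this report can be sketched in a few lines of Python. This is a minimal illustration, not the code that produced the report: the function name `compare_to_baseline`, the 10% flagging threshold, and the four prior-Wednesday sample values are all assumptions made for the example.

```python
from statistics import mean

def compare_to_baseline(today, prior, threshold_pct=10.0):
    """Compare today's value with the mean of prior same-weekday values.

    Returns (baseline, change, pct_change, flagged). pct_change is None
    when the baseline is zero, because the ratio is undefined.
    """
    baseline = mean(prior)
    change = today - baseline
    pct_change = None if baseline == 0 else 100.0 * change / baseline
    flagged = pct_change is not None and abs(pct_change) >= threshold_pct
    return baseline, change, pct_change, flagged

# Hypothetical CPU utilization (%) per service class: today's 9:30-10:30
# interval and the same hour on the four prior Wednesdays.
history = {
    "CICS_2": (33.4, [34.0, 35.1, 34.2, 34.7]),
    "MQSERIES": (0.3, [0.2, 0.2, 0.3, 0.2]),
}

for service_class, (today, prior) in history.items():
    baseline, change, pct, flagged = compare_to_baseline(today, prior)
    pct_text = "n/a" if pct is None else f"{pct:+.1f}%"
    marker = "  <-- investigate" if flagged else ""
    print(f"{service_class:10s} {today:6.1f} {baseline:6.2f} {change:+6.2f} {pct_text}{marker}")
```

A small service class such as MQSERIES can show a large percentage change from a tiny absolute change, which is why the report carries both columns.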


Dig deeper into details of the workload

Program   SUM CPU  CICS+DB2  CPU Time  % Change  DB2 CPU   DB2 Time  Pct     Resp   Resp
Name      Time     CPU Time  Prior 4   CPU       Time      Prior 4   Change  Time   Time
          9:30 to  Per Tran  Weds                Per Tran  Weds      DB2            Prior 4
          10:30                                                                     Weds

CMSOC300  884      0.0025    0.0025    1         0.0021    0.0021    1       0.076  0.078
DFHMIRS   424      0.0006    0.0006    -2        0         0         .       0.031  0.034
MYDOC016  391      0.0072    0.0075    -3        0.006     0.0062    -3      0.301  0.314
PRTOC515  284      0.0141    0.0145    -3        0.0102    0.0104    -3      0.189  0.21
BRHOC053  190      0.0008    0.0008    1         0.0006    0.0006    1       0.011  0.012
PRTOC630  188      0.0111    0.0116    -4        0.0053    0.0056    -5      0.07   0.077
CMSOC320  187      0.0052    0.0052    1         0.0048    0.0048    1       0.149  0.153
CHSOC120  133      0.0025    0.0025    -2        0.0006    0.0006    -2      0.052  0.057
CMSOC330  95       0.006     0.0059    2         0.0058    0.0057    2       0.182  0.184
BRIOC022  93       0.001     0.001     0         0         0         1       0.018  0.019
IAAOC222  91       0.0156    0.0156    0         0.0116    0.0116    0       0.482  0.485
PRTOC001  84       0.005     0.005     0         0.0019    0.0019    0       0.074  0.08


Once problem is found, find cause

• Run Strobe on CICS or batch job.

• Ask if program was changed.

• Was a system parm changed?

• Did a lurking problem surface when user patterns changed?

• Did a new system go in?


Recommend change to fix problem

• Code fix

• Parameter change

• SQL or IDMS call change

• Run workload at a different time; smooth peaks

• Redesign database or add index

• Completely shut down workload

• If you don't know how to fix it, ask others


It helps to make performance recommendations if…

• You were a programmer in a previous life

• You were a DBA in a previous life

• You are knowledgeable in MVS, CICS, DASD, etc.


Integrity matters

• Be right, study before you speak

• Go for tuning that gives a payback

• If the workload isn't measurable, put in mechanisms to measure it before doing the tuning change

• Do some PR work: send tuning results to the programmer and their management


Mainframe tools

• SAS

• MXG

• Strobe

• Jones-built performance repositories

• Our performance website

• RMF 3

• Omegamon


Capacity Management’s Prime Objective: When Do We Run Out?

• When do we need more of a resource?

• How much lead time do you need?

– Approval cycle

– Floor space

– Vendor Delivery Time

– Installation Time

– Acceptable Risk
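One simple way to answer "when do we run out?" is to fit a straight line to historical peak utilization and see when it crosses a planning limit, then subtract the lead time. The sketch below assumes a linear trend, and the sample peaks, the 90% limit (a stand-in for acceptable risk), and the 6-month lead time are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def months_until_limit(monthly_peaks, limit):
    """Months after the last observation until the trend line reaches limit.

    Returns None when the trend is flat or falling (no projected run-out).
    """
    xs = list(range(len(monthly_peaks)))
    slope, intercept = fit_line(xs, monthly_peaks)
    if slope <= 0:
        return None
    return max(0.0, (limit - intercept) / slope - xs[-1])

# Hypothetical monthly peak CPU utilization (%), planning limit, and lead time.
peaks = [62.0, 64.0, 66.0, 68.0, 70.0, 72.0]
lead_time_months = 6  # approval + floor space + vendor delivery + installation

months_left = months_until_limit(peaks, limit=90.0)
if months_left is None:
    print("No run-out projected on the current trend")
else:
    print(f"Trend reaches the limit in {months_left:.1f} months")
    print(f"Slack before the order must be placed: {months_left - lead_time_months:.1f} months")
```

A negative slack means the purchase decision is already late for the chosen limit. Real workloads rarely grow in a straight line, so the projection should be revisited as new data and business forecasts arrive.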


Forecasting Processes

Performance Prediction

Resource Utilization Trends

Business Forecasts

Resource Utilization Models

Workload Models

Performance and Workload Data Repositories

Validate, Assess and Revise


Performance Tuning:

• We continually tune hardware and software, as well as their interrelationships, to improve the performance of systems.

• Ownership is shared across multiple departments.

• Very highly iterative – never done!

• Why:

– Direct positive impact upon end user experience.

– Tuning cost avoidance.


Performance Tuning: How do we improve programs?

• Divide and Conquer:

– Which program in a batch job takes the longest?

– Which program uses the most CPU?

– Profile code

– Tune infrastructure (including network).

– Prioritize process


Performance Tuning

• Which programs are slowest (Dawgs)?

• Which programs use the most resources (Hawgs)?

• Which programs are used the most?

• Business criticality: How important are they to the business?

Identify Opportunities for Improvement – aka "Hawgs" and "Dawgs".
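As a rough illustration, "Hawgs" and "Dawgs" lists can be produced by ranking the same program statistics two ways. The rows below are taken from the 9:30-10:30 program table earlier in this deck (total CPU time and response time, in the units that table reports); the helper name `top_n` and the cutoff of three are assumptions for the example.

```python
# (program, total CPU time in the interval, average response time)
programs = [
    ("CMSOC300", 884, 0.076),
    ("DFHMIRS", 424, 0.031),
    ("MYDOC016", 391, 0.301),
    ("PRTOC515", 284, 0.189),
    ("IAAOC222", 91, 0.482),
]

def top_n(rows, key_index, n=3):
    """Return the n program names with the largest value in column key_index."""
    ranked = sorted(rows, key=lambda row: row[key_index], reverse=True)
    return [row[0] for row in ranked[:n]]

hawgs = top_n(programs, key_index=1)  # biggest resource consumers
dawgs = top_n(programs, key_index=2)  # slowest responders
both = [name for name in hawgs if name in dawgs]

print("Hawgs:", hawgs)
print("Dawgs:", dawgs)
print("On both lists:", both)
```

A program that appears on both lists is a natural first tuning candidate, subject to the usage and business-criticality questions above.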


Performance Data Repositories

• We maintain many performance data repositories – these tend to be collections of statistics, not detail data.

• For example, we will not retain CICS transaction detail, but we will calculate counts of transactions by region by transaction name as well as average, maximum and percentile statistics for a variety of variables and intervals.

• SAS is our primary tool.
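The team does this summarization in SAS; the following Python sketch only illustrates the idea of collapsing detail records into per-region, per-transaction statistics. The record layout, the region and transaction names, and the nearest-rank percentile method are assumptions for the example.

```python
from collections import defaultdict
import math

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (0 < pct <= 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

def summarize(detail_records):
    """Collapse (region, tran, response_time) detail into per-group statistics."""
    groups = defaultdict(list)
    for region, tran, response_time in detail_records:
        groups[(region, tran)].append(response_time)
    return {
        key: {
            "count": len(times),
            "avg": sum(times) / len(times),
            "max": max(times),
            "p95": percentile(times, 95),
        }
        for key, times in groups.items()
    }

# Hypothetical detail records: (CICS region, transaction name, response seconds).
detail = [
    ("AOR01", "TRN1", 0.05), ("AOR01", "TRN1", 0.07), ("AOR01", "TRN1", 0.30),
    ("AOR02", "TRN2", 0.10), ("AOR02", "TRN2", 0.12),
]

for key, stats in sorted(summarize(detail).items()):
    print(key, stats)
```

Keeping only the summary rows trades the ability to re-examine individual transactions for a repository small enough to retain long-term history.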


Performance Data Repositories: Data Sources

• CICS – by day, by tran

• DASD Type 74 – by day, by LPAR, by VOLSER

• Jones application instrumentation

• MVS level – by day, by LPAR

• IDMS – by day, by program

• DB2 – by day, by tran

• Service and report classes – by day, by service class

• PROC SUMMARY, PROC APPEND


Business Metrics and Workloads

• Business Metrics typically use different time frames than workload metrics.

• The business doesn’t forecast in terms of megabytes of DASD, CPU seconds used, interactive sessions, concurrent users or paging rates.

• They refer to branches, IRs, customers, trades, purchases, $$$, payments, visits, exorbitant cost of IT,…


Loved Ones: Sorry, all apps are not equal

• What is the business importance of the application / workload?

• When a system runs diverse workloads, the work must be prioritized so that it is processed in an order that reflects its business priority.

• To understand priorities you have to understand the business.

• Capacity planning activities should also ensure that when work is constrained, the highest priority work is favored.


Performance testing

• Jones has a clone of the production environment

• Use the LoadRunner tool to generate transactions

• Think time is adjustable

• A few hundred users is usually enough

• All major system enhancements are load tested


Load Testing: Objectives

• Is end user performance acceptable?

• Will the introduction of these new features threaten the health of other applications?

• How do response time and resource utilization compare to current production levels?

• Reproduce and troubleshoot production problems.

• Will we need to add capacity?

In stress testing we measure response times at production peak load and 5x production peak.

We often identify "break points" to watch for in production.
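One simple reading of a break point is the lowest tested load at which response time exceeds an acceptable limit. The sketch below works on that assumption; the function name `find_break_point`, the 0.40-second limit, and the sample results are hypothetical and do not come from an actual test run.

```python
def find_break_point(results, max_response):
    """Return the lowest load level whose response time exceeds max_response.

    results is a list of (load_multiple, response_seconds) pairs, where
    load_multiple is the load as a multiple of production peak.
    Returns None if every tested level stays within the limit.
    """
    for load, response in sorted(results):
        if response > max_response:
            return load
    return None

# Hypothetical stress-test results from 1x to 5x production peak.
results = [(1, 0.20), (2, 0.22), (3, 0.27), (4, 0.45), (5, 1.60)]

break_point = find_break_point(results, max_response=0.40)
print("Break point (multiple of production peak):", break_point)
```

Once a break point is known from testing, the matching production load level becomes a threshold worth monitoring.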


Interaction with Availability

• A badly performing application is effectively the same as the application being unavailable.

• Capacity and Availability Management share common goals / tools and complement each other.

• Capacity Management needs to be aware of Availability techniques deployed, such as mirroring, load balancers or clustering, in order to plan accurately for Capacity.


Questions: