IDz/ADFz Workbench Using Fault ... - community.ibm.com

IBM Software Group

IDz/ADFz Workbench – Using Fault Analyzer to Analyze and Solve z/OS ABENDs

Jon Sayles, IBM zDevOps Enablement - [email protected]

© Copyright IBM, April 2021



IBM Trademarks and Copyrights © Copyright IBM Corporation 2008 through 2021.

All rights reserved – including the right to use these materials for IDz instruction.

The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. References in these materials to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates.

This information is based on current IBM product plans and strategy, which are subject to change by IBM without notice. Product release dates and/or capabilities referenced in these materials may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way.

IBM, the IBM logo, the on-demand business logo, Rational, the Rational logo, and other IBM Rational products and services are trademarks or registered trademarks of the International Business Machines Corporation, in the United States, other countries or both. Other company, product, or service names may be trademarks or service marks of others.


The IDz Workbench Curriculum

▪ Module 1 – IDz Terms, Concepts and Navigation

▪ Module 2 – Editing Your COBOL Programs

▪ Module 3 – Analyzing COBOL Programs

▪ Module 4 – Remote Systems – Connect, Navigate and Search

▪ Module 5 – Remote Systems – Dataset Access and Organization

▪ Module 6 – Remote Systems – ISPF 3.x, Batch Jobs and Batch Job Management

▪ Module 7 – MVS Subprojects – Organizing PDS Members and SCM Checkout

▪ Module 8 - The Data Tools – SQL Code/Test and DB2 Table Access

▪ Module 9 - Debugging z/OS COBOL Applications

Optional Modules

▪ IDz/Endevor Integration Through CARMA

▪ zUnit – Unit Test

▪ Code Coverage – Test quality feature

▪ Code Review – Application quality feature

▪ Menu Manager – Integrate ISPF REXX Execs and CLISTs

▪ Web Services – SOA development

▪ Fault Analyzer

▪ File Manager


Course Assumptions

1. You know ISPF and have used it for at least two years, doing production work on z/OS with COBOL, PL/I or Assembler

Note that all of the workshops in this course are in COBOL – although files exist that are Assembler and other languages for you to experiment with – as time permits

2. You have:

No experience with Eclipse or IDz

Some experience with PC tools

▪ You have used MS-Windows applications for at least one year

IDz installed and running on your workstation at version 8.0 or later

▪ Note that all ISPF discussion/examples and screen captures assume IBM-installed ISPF product defaults – not any 3rd party or custom Dialog Manager applications you may have installed on your mainframe


Course Contributing Authors

▪ Thanks to the following individuals for assisting with this course:

Russ Courtney/IBM

James Rice/IBM

Walter (Zack) Zakorchemny

David Bean/IBM-Rational

Ed Steele/IBM-Rational

Olivier Gauneau/IBM


Course Overview

▪ Audience

This course is designed for application developers who have learned or programmed in COBOL, and who need to do traditional z/OS development and maintenance as well as build leading-edge applications using COBOL and Rational Developer for System z.

▪ Prerequisites

This course assumes that the student has a basic understanding and knowledge of software computing technologies, and general data processing terms, concepts and vocabulary, as well as a working knowledge of COBOL and z/OS.

Knowledge of SQL (Structured Query Language) for database access is assumed as well.

Basic PC and mouse-driven development skills, terms and concepts are also assumed.


UNIT

Topics:

The IDz Workbench

▪ Analyzing Mainframe Abends

▪ ABEND Codes and Reasons

▪ Fault Analyzer

▪ Appendices

This course is written for z/OS developers, not Systems Programmers. While Fault Analyzer has deep z/OS analytical tools that can be used by the Systems Programming staff, this material is aimed at COBOL and PL/I programmers responsible for discovering, analyzing and solving application program ABENDs.

Note: an ABEND is roughly analogous to an unhandled (thrown) exception in Java, C++ and other modern languages.


Objectives

After completing the first two sections on Production Support/Application Testing/Software Defects and IBM Mainframe COBOL ABEND Research, you should be able to:

Define the steps in a generalized methodology of ABEND resolution

List the various sources of ABEND inputs, including:

▪ PD Tools documents

▪ Other SYSOUT

▪ Dynamic trace facilities

▪ Static code analytics

List the common types of COBOL program ABENDS


Program ABENDs – Overview

▪ When an application ABEND (ABnormal END-of-job) occurs, z/OS stops executing your program, closes files and buffers, and generates a high-level message in the form of a System Completion Code (Sxxx) or a USER code (typically 4038)

▪ The System Completion Code is typically written to an output listing file through your //SYSOUT DD SYSOUT=* JCL entry.

▪ The completion code indicates the reason z/OS stopped executing your program.

▪ The completion code is related, but often only loosely, to what is actually invalid in your code.

▪ Because of this, the System Completion Code represents the starting point for your analysis of the problem.

She won't be laughing when she gets back to her desk and finds out that last night's production batch stream never finished…


Analyzing and Solving a z/OS ABEND

There are as many ways to analyze and research ABENDs as there are individual approaches to solving a business problem with procedural logic.

However, if you've never done software production support work, consider starting with the following structured problem-solving approach:

1. Preparation

2. Research

3. Hypothesis

4. Solution

5. Resolution

As a final note before we begin, understand that there are usually two distinct phases of z/OS application Production Support:

1. Data Center “on-call” ABEND resolution, where a technician receives notification that a job or transaction has ABENDed and the problem often must be "fixed" within an extremely short timeframe (measured in hours). In this case, the technician's concern is to "patch" the problem: get the system back online, or get the batch job-stream back into production.

2. Root Cause Analysis. This begins when the programmers responsible for the application track down and solve the problem that caused the ABEND, typically by determining “why” the ABEND happened.

The steps that follow represent a common approach to "Fix-It": they include ABEND resolution and proceed to the Root Cause Analysis work-flow.


1. Preparation and Information/Input Gathering – 1 of 4

Collect background information on: What happened, When and WHERE the ABEND occurred.

1. Start with Fault Analyzer reports, which contain a deep set of formatted analytic information

2. Collect additional supporting ABEND output

▪ SYSOUT from the job

▪ DISPLAY statements…

3. Obtain copies of the run-time:

▪ JCL/PROCs

▪ Program source for all modules

▪ Compile listings & Link Maps

4. Grab “data dumps”, especially if the problem is data-specific

▪ Data File records

▪ DB2 Table values

▪ Log files


1. Preparation and Information/Input Gathering – 2 of 4

From the Run-stream JCL and/or from the JES spool file(s) (JESJCL) retrieve the DSNs of input and output files accessed by the program.

If the ABEND occurred in an online app you will need to gather the same kinds of background information from the:

- IMS SYSGEN Tables, and individual IMS GEN control blocks

- CICS – RDO Table entries


1. Preparation and Information/Input Gathering – 3 of 4: Application Discovery (A.D.)

IBM’s A.D. (static analysis) tool can simplify the research and discovery phases of ABEND resolution, both for batch and online applications

[Figure: A.D. diagram showing datasets and DB2 table access points, and a Screen/Transaction/Program Call Graph]


1. Preparation and Information/Input Gathering – 4 of 4

You can often benefit by spending some time with a Subject Matter Expert, as you often need to know “What” the application or programs do for the business.

The “What” of an application is the province of either: Business (End) Users, Architects or Systems Analysts

However it may not be possible to find someone – as most of the first (and even second) generation of Business Application developers have retired


Research – Analyze the ABEND data

▪ Using information & inputs from the preparation, construct a mental map of the program's execution: HOW & WHY the ABEND occurred.

▪ Determining WHY an ABEND occurred usually requires a combination of "Static" and "Dynamic" analysis tools and techniques.

▪ These steps need not be followed in this order. In time you will develop an "intuition" as to which kind(s) of analysis will be most likely to provide the information needed to solve the problem.

▪ To assist, use application research and analysis tools such as:

▪ IBM’s Application Discovery (AD) and IDz’s Static Analysis tooling:

– Program Control Flow & PERFORM Hierarchy

– Data Flow

▪ IBM Debug Tool, for Dynamic (run-time) Analysis

– You will most likely need an image copy of the data that caused the ABEND

▪ So, if the reason "why" the ABEND occurred is not apparent at this point, perform Static and/or Dynamic Analysis on the specific areas of the application relating to the ABEND.


Overview of Techniques: Static Analysis

1. Structural Visualization: is the generation of an accurate mental map, understanding or mental image of the program's control structure, or logic-architecture. Using the starting point represented by the ABEND condition (the statement which caused z/OS to halt execution) and using electronic-assisted tools (such as IBM’s Rational Asset Analyzer or Rational Developer for System z), build an accurate understanding of the code invocation: The module/file level (System View) - Paragraph/Section level (Hierarchy chart) -Statement level (Flow chart)

Structural Visualization can be done "top-down", by asking open-ended questions, such as learning how a particular routine "hangs together logically", or "bottom-up", by asking specific closed-ended questions about a program, such as "How does this particular paragraph get executed?" or "How did this module get invoked?"

2. Data Flow Analysis: A combination of control structure analysis and data item analysis, which seeks to determine the usage of fields throughout a program. Data flow analysis is used to determine (from a given instance of a data item) where the next occurrences of that item exist in your program, and how the data item is used: as a receiving field in a MOVE or mathematical operation, as the sending field in a MOVE statement, or as part of a logic branch (IF, PERFORM UNTIL/VARYING, etc.).

3. Data Impact Analysis: An expansion of Data Flow Analysis which traces the movement of data from field to field throughout a program, or throughout an entire application, including I/O (screens and files). Using Data Impact Analysis, you can identify all fields that might have had an impact on the contents of a field (before the ABEND occurred). And just as importantly, you can learn the effect that changing this field will have on the behavior of the application.

4. Textual or Data Item Usage: Utilized more for application maintenance and enhancement requests, this type of Static Analysis involves searching for "categories" of program items, such as "List all fields that contain *JUL*, *GREG*, *YR*, *YEAR*" (suspect date candidates for Year 2000 conversion), or "List all such fields with two-digit (numeric) or two-byte (alphanumeric) definitions".

5. Code Partitioning: Again, utilized more for application maintenance, enhancements and application reengineering, Code Partitioning involves mentally organizing and analyzing code by function or process, such that you understand and can distinguish the usage of code by business process. For example: Find all code that relates to the calculation of premium renewal payments … or… Isolate the code that edits a particular file, with an eye towards creating a shared subroutine from the code.

IBM’s A.D. is the fastest & most accurate approach to Static Code Analysis


Overview of Techniques: Dynamic Analysis

1. Follow Program Logic: Source-level interactive debugging. Watch the program execute statement-by-statement, and line-by-line. This is very useful for detailed-debugging, particularly of dense or complex instructions. Some software (for example, the Rational Developer for System z) allows you to trace the program logic, attempting to re-create the sequence of events (COBOL statements) that transpired up to and including the ABEND condition. Given the size and scope of production applications, it is generally more practical to trace specific problem areas of a program.

2. Interactive Execution: Execute (run) a program, stopping at selected Breakpoints (for example, pausing execution each time a certain field value changes, or when a value exceeds some threshold), and examining the contents (value) of specific fields. Interactive Execution must be done by (or with) an application analyst who understands how the system is supposed to operate. Interactive Execution is useful for observing control flow, and is often combined with line-by-line tracing by setting selective breakpoints, monitoring values, "running" the application to the breakpoints, and then tracing the code line-by-line.

3. Selective Data State Collection: Execute code and establish a functional summary of specific data value states. Use these states in subsequent test runs to compare results of current values to expected values. Debug Tool’s auto-log feature is useful for this.

4. Code Coverage: Analyze the number of times each COBOL statement is executed for a given run. Note that PD Tools/Debug Tool can run a report that shows code coverage. This technique is extremely useful for analyzing test data coverage of a given application. It can also be used effectively for debugging, because it makes apparent problems such as infinite loops (S222, S322 and SB37 ABENDs) and over-loaded tables, i.e. loading tables beyond the maximum OCCURS clause and overlaying storage, which causes ABENDs such as S0C1, S0C4, and even S0C7.

IBM’s Test/Debug Tools are the best-in-class approach to Dynamic Analysis


Hypothesis - Determine WHY the ABEND occurred – 1 of 4

▪ Your preparation and research is probably all you need to be able to describe WHAT, WHERE and HOW the ABEND occurred

In other words; at what point in the program the logic failed, and during what sequence of COBOL statements…

▪ However, before modifying any business logic you must determine WHY these statements (sequence of steps) caused the failure:

"Why did this production input file contain spaces in a numeric field?"

▪ The data was supposed to have been edited at this point in the batch stream

"Why did the program's logic perform the Initialization routine twice?"

"Why did the Read routine execute past end-of-file?"

"Why did this alpha data end up in a packed LINKAGE field?"

▪ Only through a determination of WHY will you be able to make a change to production business logic safely, and with confidence that:

Your change will resolve the ABEND

Your change will not introduce new (additional) ABENDs


Hypothesis - Determine WHY the ABEND occurred – 2 of 4

▪ Sometimes it is relatively easy to come to an understanding of WHY certain ABEND conditions occurred. For example, perhaps a period was left off the appropriate termination point for an IF statement - which caused execution to perform an operation out of sequence. Or perhaps an IF .. NUMERIC test (which should have been coded for all numeric fields in a file) was forgotten. Or a paragraph was performed through the wrong paragraph-exit, or a production job was released before certain files were available (causing I/O errors). These types of ABEND situations can be understood (and usually resolved) fairly quickly. But, not always.

What if - in the case of the IF statement with the incorrect termination point - the logic that has been coded, correctly processed the first 100,000 records in the file?

▪ Making a change to a critical IF condition could very well affect other down-stream processing within the program, wreaking havoc with subsequent routines.

Or what if - in the case of the file containing blanks in the numeric fields - the input file was supposed to be "clean" (validated) by this point in the job-stream, having gone through allegedly "exhaustive" edits in prior modules?

▪ By simply adding an IF test you may solve your program's specific ABEND, but you will not have resolved the actual problem - which exists somewhere else in the system.

▪ In other words, localized/piecemeal approaches to resolving production ABENDs are not recommended - as they usually change the problem, instead of solving it. And sometimes they just spawn new problems.

▪ It should be noted that a clear understanding of the business functionality automated by this process is almost always required to resolve WHY something has gone wrong.

Calling on business experts or "application/business" experts who understand "the big picture", and the context in which the job executes, is the rule rather than the exception in this process.


Hypothesis - Determine WHY the ABEND occurred – 3 of 4

Developing an accurate determination of WHY a problem that led to an ABEND condition exists may take a considerable amount of time, depending on the:

Size, complexity and structure of the code

▪ Number of copybooks, Calls, Files/IO, etc.

Your familiarity with the program's business purpose - coupled with your ability to grasp the point of each statement

Type of ABEND and reason for the problem (some are more diabolical than others)

Size of the input/output files – and complexity of the data

▪ Multi-level OCCURS tables, Multiple 01-records on a file, etc.

In addition to an understanding of the reason for the ABEND, the results of your investigation should produce an understanding of the solution to the problem (the fix itself).


Hypothesis - Determine WHY the ABEND occurred – 4 of 4

There are typically two categories of ABEND “WHY” issues:

1. Data problems

Incorrect schema mapping

Invalid numeric data

Uninitialized data

Invalid values

2. Procedural logic problems

Missing modules

Modules executed out-of-sequence

Paragraphs/Sections executed out-of-sequence

▪ Fault Analyzer provides a research starting point with its “Event” listings - for both of the above categories


Determine WHY the ABEND Occurred – Data Problems

Typical reasons for data problems include:

▪ Mapping issues:

Incorrect Copybook version(s)

▪ Verify with SCM & Release Management tool

Mismatched LINKAGE/Entry Using

▪ Use IDz’s Scan for compatibility tool

▪ Invalid numeric data

Typically causes S0C7 or Data Integrity exposures – because of:

Data editing procedures bypassed

▪ Trace runtime module & paragraph flow

▪ Including prior jobs in the batch stream

Misunderstanding of COBOL syntax

Values entering application from un-edited sources

Basic logic/coding errors

▪ View COBOL logic by using IDz’s tools:

– Data Flow diagram

– Occurrences in Compilation Unit


Data Problems – analyzed by IDz tools: Program Control Flow, Data Flow Diagram, Fault Analyzer, Application Discovery

[Figure: screen captures of the A.D., Fault Analyzer and Data Flow views]


Determine WHY the ABEND Occurred – Procedural Problems

Typical reasons for procedural problems include:

Missing modules

▪ Check:

– Compile/Link JES output

– Link Maps (note that this info is provided by Fault Analyzer)

Modules executed out-of-sequence

▪ Fault Analyzer contains a CALL sequence table

▪ AD (Application Discovery)

▪ Application Architect or Systems Analyst input

Paragraphs/Sections executed out-of-sequence

▪ Perform Hierarchy

▪ Program Control Flow

▪ AD (Application Discovery)

▪ Fall thru, caused by:

– Poor program design/Obsession with GO TO statements

– Coding errors: PERFORM 1000-UPDATE-RTN THRU 10000-EXIT.
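The fall-through caused by a mistyped THRU target can be reproduced in a few lines. The sketch below is illustrative only: the program name, paragraph names and counters are invented, and it assumes the mistyped exit paragraph happens to exist later in the program (otherwise the compiler would reject the statement).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. THRUDEMO.
      * Hypothetical sketch: all names are invented for this demo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-UPDATE-COUNT     PIC 9(4) VALUE ZERO.
       01  WS-PURGE-COUNT      PIC 9(4) VALUE ZERO.
       PROCEDURE DIVISION.
       0000-MAIN.
      * Intended range: 1000-UPDATE-RTN THRU 1000-EXIT.
      * The typo below extends the range through 2000-EXIT, so
      * 2000-PURGE-RTN is executed by fall-through.
           PERFORM 1000-UPDATE-RTN THRU 2000-EXIT
           DISPLAY "UPDATES: " WS-UPDATE-COUNT
           DISPLAY "PURGES:  " WS-PURGE-COUNT
           STOP RUN.
       1000-UPDATE-RTN.
           ADD 1 TO WS-UPDATE-COUNT.
       1000-EXIT.
           EXIT.
       2000-PURGE-RTN.
           ADD 1 TO WS-PURGE-COUNT.
       2000-EXIT.
           EXIT.
```

Although 2000-PURGE-RTN is never named in a PERFORM, its counter ends at 1, because every paragraph physically located between the start and end of the THRU range is executed. This is the kind of out-of-sequence execution that a Perform Hierarchy or Program Control Flow view makes visible.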


Procedural Problems – Program Control Flow, Perform Hierarchy & A.D.


Procedural Problems – Fault Analyzer Runtime Event Capture


Solution - Fix the Problem and Test Your Solution

Take the appropriate action to resolve any business- or system-wide issues.

▪ Depending on how extensive the damage caused by the problem, or for how long any problems have persisted undetected:

Files may have to be restored from backups from a previous point-in-time

Jobs may have to be re-run from a previous point-in-time (synchronized with file generations)

Files may have to be modified with "one-shot" programs, written to resolve issues that require "surgery" on the data

▪ Take the appropriate action to fix the technical (coding) problem:

Edit program source - modifying the existing production logic …and/or…

Modify the JCL (if the error included JCL issues)

You may have to edit files using File Manager

▪ Test your solution:

Compile and Link the new version of the application

Create an "image copy" of the production file system, in order to test your fix

Re-Run the batch job and analyze results

Run "Regression Tests" against the new code, and analyze for unexpected results


Resolution

▪ Build (Compile/Link) the program(s) within your test environment

▪ Test/Validate your hypothesis/solution

▪ Migrate source modifications using your Version Control or SCM

▪ Promote your changes to production

▪ Schedule/Re-run the cycle

▪ Document the problem and its resolution, and optionally build safe-guards into your development practices:

▪ Maintenance procedures/Best practices

▪ Testing tools and platforms

▪ Coding standards


Section Summary

Having completed this section on Production Support/Application Testing/Software Defects and IBM Mainframe COBOL ABEND Research, you should now be able to:

Define the steps in a generalized methodology of ABEND resolution

List the various sources of ABEND inputs, including:

▪ PD Tools documents

▪ Other SYSOUT

▪ Dynamic trace facilities

▪ Static code analytics

List the common types of COBOL program ABENDS


UNIT

Topics:

The IDz Workbench

▪ Analyzing Mainframe Abends

▪ ABEND Codes and Reasons

▪ Fault Analyzer

▪ Appendices

Notes:

• In order to understand why the z/OS runtime ABENDs in your code, you will need to understand the z/OS software operations (Op Codes).

• The traditional point of study is the Principles of Operation (POP) manual (not light reading)

• Alternatively, your COBOL or PL/I instruction might have provided instructional guidelines

• Specific z/OS releases and COBOL Compiler options can modify the ABEND (MVS) Code information in this section. You may need to discuss specific discrepancies found in this material with your Systems staff.


ABEND Completion Codes – And some typical causes

▪ There are as many reasons for ABEND conditions ("WHYs") as there are production systems. But it is useful to categorize HOW certain ABEND completion codes are caused by specific programming patterns. This can expedite your approach to ABEND analysis.

▪ The following information on a few common z/OS ABEND completion codes, and the conditions which generated them is included for you to make effective use of PD Tools/Fault Analyzer listings and the above debugging, research and analysis process.

Notes:

This information is available to some degree within the ADFz product in the Lookup View. There are other sources of MVS Completion Codes that you can find on the web:

▪ http://ibmmainframes.com/references/a29.html

▪ http://ibmmainframes.com/topic-42-0-250.html

▪ http://www.jaymoseley.com/hercules/sabends.htm


S001: Record Length/Block Size Discrepancy

Reason(s)

S001-0: Conflict between record length specifications (program vs. JCL vs. dataset label)

S001-2: Damaged storage media or hardware error

S001-3: Fatal QSAM error

S001-4: Conflict between Block specifications (program vs. JCL)

S001-5: Attempt to read past end-of-file

Instructions: OPEN, CLOSE, READ, WRITE

Frequent Coding Causes:

S001-0: Typos in FD or JCL

S001-2: Corrupt disk or tape dataset

S001-3: Internal z/OS problem

S001-4: Forgot to code BLOCK CONTAINS 0 RECORDS in FD (default Block is 1)

S001-5: Logic error (either forgot to close file, or end-of-file-switch not set, overwritten or ignored)

Tools to debug/IDz equivalent return codes:

S001-0: Cannot occur on IDz with Local ASCII/Windows (Line Sequential) files

S001-2: Norton Utilities – if on Workstation/COBOL application

S001-4: Cannot occur on Workstation/COBOL (no blocking for Line Sequential files)

S001-5: Logic error: Use IDz's Perform Hierarchy or AD's Program Flow Diagram to detect

Dynamic:

S001-0: During Debug – set a Watch Monitor on the 01 record

S001-2: Need to have PC/IT technician investigate (may need to reformat disk)

S001-4: Always code BLOCK CONTAINS 0


S013: Conflicting DCB Parameters

Reason(s)

S013-10: Dummy data set needs buffer space; specify BLKSIZE in JCL

S013-14: DD statement must specify a PDS

S013-18: PDS member not found

S013-1C: I/O error searching PDS directory

S013-20: Block size is not a multiple of the LRECL

S013-34: LRECL is incorrect

S013-50: Tried to open a printer for Input or I/O

S013-60: Block size not equal to LRECL for unblocked file

S013-64: Attempted to Dummy out indexed or relative file

S013-68: Block size > 32K

S013-A4: SYSIN or SYSOUT not QSAM file

S013-A8: Invalid RECFM for SYSIN/SYSOUT

S013-D0: Attempted to define PDS with RECFM FBS or FS

S013-E4: Attempted to concatenate > 16 PDSs

Instructions: OPEN, CLOSE, READ, WRITE

Frequent Coding Causes:

Most of these ABENDs occur running under z/OS (some may not even occur under z/OS, although older modules running OS/VS COBOL or VS COBOL II code that have not been recompiled can produce them).

Most are due to inconsistencies between the JCL and the COBOL FD.

Tools to debug – Static Analysis:

S013-18: Open multiple windows on AD Batch Job Diagram and program Environment Division -SELECT ASSIGN clause


S0C1: Invalid Instruction

Reason(s)

- SYSOUT DD statement missing

- The value in an AFTER ADVANCING clause is < 0 or > 99

- An Index or Subscript is out of range

- An I/O verb was issued against an unopened dataset

Instructions:

OPEN, CLOSE, READ, WRITE, Table handling routines

Note also that during Debug SYSOUT-DISPLAYs are written to the "console"

Frequent Coding Causes:

- Incorrect logic in setting AFTER ADVANCING variable (or failure to understand 0-99 limits)

- Incorrect logic in table handling code, or number of table entries has overflowed the PIC of variable

e.g. PIC 99 (two digits, max) - but there are 100 entries in the table

Tools to debug:

Static

SYSOUT problem: Open multiple windows on AD Batch Job Diagram and program Environment Division - SELECT ASSIGN.

In AD: Double-click on GO TO verb, or PERFORM chain, or paragraph name.

In IDz: Select Paragraph name/Perform chain and select: Open Declaration

Dynamic:

Set Watch Breakpoint and Monitor on table index or AFTER ADVANCING variable.

Set conditional advanced break point on subscript (i.e. SUB<100).
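The table-overflow cause above (a PIC 99 subscript against a 100-entry table) is usually avoided by sizing the subscript to hold the maximum and testing the bound before each store. The following is a minimal sketch with invented names and sizes, not code taken from the course workshops.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TBLDEMO.
      * Hypothetical sketch: names and sizes are invented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-MAX-ENTRIES      PIC 9(4) COMP VALUE 100.
      * PIC 9(4) rather than PIC 99, so the subscript can hold 100.
       01  WS-SUB              PIC 9(4) COMP VALUE ZERO.
       01  WS-TABLE.
           05  WS-ENTRY        PIC X(10) OCCURS 100 TIMES.
       01  WS-REJECTED         PIC 9(4) VALUE ZERO.
       PROCEDURE DIVISION.
       0000-MAIN.
      * Attempt 105 loads; the last 5 must be rejected, not stored.
           PERFORM 1000-LOAD-ENTRY 105 TIMES
           DISPLAY "LOADED:   " WS-SUB
           DISPLAY "REJECTED: " WS-REJECTED
           STOP RUN.
       1000-LOAD-ENTRY.
      * Bounds test first, so storage past the table is never touched.
           IF WS-SUB < WS-MAX-ENTRIES
               ADD 1 TO WS-SUB
               MOVE "ENTRY" TO WS-ENTRY (WS-SUB)
           ELSE
               ADD 1 TO WS-REJECTED
           END-IF.
```

With the guard in place the run loads 100 entries and rejects 5 instead of overlaying the storage that follows the table. Where your shop's standards allow it, compiling test versions with the SSRANGE option is a complementary way to catch out-of-range subscripts at run time.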


S0C4: Protection Exception

Reason(s)

The program is attempting to access a memory address that is not within the application's z/OS Address Space

Frequent Coding Causes:

- JCL DD statement is missing or incorrectly coded

- Incorrect logic in table handling code (referencing a table subscript < 1 or > max-table-size),

- Number of table entries has outgrown PIC of variable (i.e. PIC 99, but 100 entries).

- In IMS/TM systems, an MFS LL (length) field value is smaller than the actual input MSG length.

Tools to debug:

Static

- DD statement problem: Open multiple windows on AD Batch Job Diagram and program Environment Division - SELECT ASSIGN

- IMS LL problem: Analyze through multiple Edit Windows (same solution as DD).

- Incorrect linkage problem:

- Open multiple windows on CALLing and CALLed programs - verify linkage declarations.

Dynamic

Incorrect linkage problem:

- Set Breakpoint and Monitor on linkage declarations.

- Set conditional advanced break point on subscript (i.e. IDX < 100).

Incorrect logic.

- In IDz/Debug - set a conditional break point on subscript (i.e. IDX < 100).


S0C7: Data Exception

Reason:

Machine instruction expecting numeric data found invalid data

Instructions:

Arithmetic, IF-THEN-ELSE, MOVE (if receiving field is numeric)

Note: IDz will also S0C7 if sending field is numeric and contains non-numeric (MOVE pic9field TO picXfield)

Frequent Coding Causes:

- Incorrectly initialized, or uninitialized variable

- Missing or incorrect data edit

- 01 to 01 level MOVE if sending field is shorter than receiving field

- Move of Zeros to Group-level numeric fields

- Incorrect MOVE CORRESPONDING, or incorrect MOVE field1 TO field2 assignment

Tools to debug:

Static

AD report with options data selector on MOD or ALL

Dynamic

Set Watch points and Monitor on field.

Run through to S0C7.

Locate the field definition, or use CSI report.

Solutions:

Add edit checks for all numeric fields and MOVE statements.
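To make "invalid numeric data" concrete, here is a minimal Python sketch of the kind of check a numeric edit performs on a packed-decimal (COMP-3) field. It assumes the usual packed layout: every half-byte is a digit 0-9 except the last, which is a sign code A-F. The function name `is_valid_packed_decimal` is hypothetical; this is an illustration of the data rule, not a replacement for a COBOL IF NUMERIC test.

```python
VALID_SIGN_NIBBLES = {0xA, 0xB, 0xC, 0xD, 0xE, 0xF}


def is_valid_packed_decimal(raw: bytes) -> bool:
    """Return True if raw looks like a well-formed packed-decimal value.

    Every nibble except the last must be a digit 0-9; the last nibble
    must be a sign code (A-F).
    """
    if not raw:
        return False
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)      # high-order half of the byte
        nibbles.append(byte & 0x0F)    # low-order half of the byte
    *digits, sign = nibbles
    return all(d <= 9 for d in digits) and sign in VALID_SIGN_NIBBLES


if __name__ == "__main__":
    print(is_valid_packed_decimal(bytes.fromhex("12345C")))  # well-formed +12345
    print(is_valid_packed_decimal(bytes.fromhex("404040")))  # EBCDIC spaces
```

A field that was never initialized, or that was overlaid by a group-level MOVE of spaces, typically holds bytes such as X'404040', which fail this check; arithmetic on such a field is a classic S0C7 trigger.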

Page 37: IDz/ADFz Workbench Using Fault ... - community.ibm.com

37

S0CB: Divide by Zero

Reason:

CPU attempted to divide a number by 0.

Instructions:

DIVIDE, COMPUTE

Frequent Coding Causes:

- Incorrectly initialized, or un-initialized variable

- Missing or incorrect data edits (i.e. failed to check divisor for zero value)

Tools to debug:

Static

AD report on all DIVIDE and COMPUTE instructions – or using IDz double-click on these verbs and select Filter from the Context Menu

Dynamic

Run through to the S0CB

Locate the field definitions of the offending fields

Solution:

Add an edit to check for a zero divisor:

    IF divisor NOT = ZERO
    THEN
        COMPUTE ...
    ELSE
        PERFORM error-processing-routine
    END-IF

Add ON SIZE ERROR to all arithmetic verbs.

Page 38: IDz/ADFz Workbench Using Fault ... - community.ibm.com

38

S222/S322: Timeout … Endless Loop

Reason:

Timeout due to program logic caught in "loop" through instruction set with no exit.

Frequent Coding Causes:

- Invalid logic or fall-through logic

- Invalid end-of-file logic

- End-of-file switch overlaid

- Subscript not large enough

- Perform Thru wrong Exit

- PERFORM UNTIL "End-Of-File", but not performing "READ" routine to reach EOF condition

Tools to debug:

Static

Perform Hierarchy on logic in PERFORM chain

Program Control Flow

Dynamic

PD Tools (mainframe) Debug to S222

Analyze counts (color)

Query and Monitor on subscript

Set an Advanced Break Point - Conditional on count

Solution:

From within Debug, use Program Control Flow to identify logic which could cause looping.

Select and click on PERFORM THRU, PERFORM UNTIL, GO TO.

Place break points on potential error lines.
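The "PERFORM UNTIL End-Of-File, but not performing the READ routine" cause listed above can be sketched in a few lines. This is a Python model under stated assumptions, not COBOL: `process_file` is a hypothetical name, `None` stands in for the AT END condition, and the `max_iterations` guard plays the role that the S322 time limit plays on z/OS.

```python
from typing import Iterator, Optional


def process_file(records: Iterator[str], read_inside_loop: bool,
                 max_iterations: int = 10_000) -> int:
    """Model PERFORM process-paragraph UNTIL end-of-file.

    Returns the number of loop iterations. Raises RuntimeError when the
    iteration guard trips, which stands in for a timeout ABEND.
    """
    def read_next() -> Optional[str]:
        return next(records, None)    # None plays the role of AT END

    current = read_next()             # priming READ before the loop
    iterations = 0
    while current is not None:        # UNTIL end-of-file
        iterations += 1
        if iterations > max_iterations:
            raise RuntimeError("loop guard tripped: end-of-file never reached")
        # ... process `current` here ...
        if read_inside_loop:
            current = read_next()     # the READ that the buggy version omits
    return iterations


if __name__ == "__main__":
    print(process_file(iter(["rec1", "rec2", "rec3"]), read_inside_loop=True))
    try:
        process_file(iter(["rec1", "rec2", "rec3"]), read_inside_loop=False,
                     max_iterations=50)
    except RuntimeError as err:
        print(err)
```

The same idea carries over to the debugger: a conditional break point on an iteration count is the interactive equivalent of the guard used here.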

Page 39: IDz/ADFz Workbench Using Fault ... - community.ibm.com

39

S806: Module Not Found

Reason:

CALL made to a program which could not be located along the normal search path

(STEPLIB top-to-bottom, JOBLIB top-to-bottom, LINKPACK)

Instructions:

Program CALL keyword or JCL EXEC PGM=XXXX

Frequent Coding Causes:

- Module deleted from library, or never compiled to library

- Module name spelled incorrectly

- STEPLIB does not contain load library with module

- I/O error occurred while z/OS searched the directory of the library

Tools to debug:

Static

Build (Link) Map

Do Remote Systems search on module name – in the Load Libraries

Dynamic

Set Program Advanced Break Point (Entry) to set program break before entry to system.

Solution:

Spell name correctly

Check for 0 or 4 return code from Link Edit (Build step)
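The search-path idea behind an S806 can be sketched as a first-match lookup through an ordered list of libraries. This Python model is an illustration only: `find_module`, the library names, and the member lists are all hypothetical. It follows the simplified order given on the slide; on a real system a step that codes a STEPLIB DD does not search JOBLIB, so treat the order here as a teaching simplification.

```python
from typing import Optional


def find_module(name: str,
                search_path: list[tuple[str, set[str]]]) -> Optional[str]:
    """Return the label of the first library that contains `name`.

    Libraries are searched top to bottom; None models "module not found".
    """
    for label, members in search_path:
        if name in members:
            return label
    return None


if __name__ == "__main__":
    # Hypothetical libraries and members, for illustration only.
    path = [
        ("STEPLIB  MY.TEST.LOADLIB", {"SAM1"}),
        ("JOBLIB   MY.PROD.LOADLIB", {"SAM1", "SAM3"}),
        ("LINKPACK", {"IGZCPAC"}),
    ]
    for module in ("SAM1", "SAM3", "SAM2"):
        print(module, "->", find_module(module, path))
```

A lookup that returns nothing for a name such as SAM2 corresponds to the two fixes on the slide: either the name is misspelled, or the Link Edit step never placed the module in a library on the path.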

Page 40: IDz/ADFz Workbench Using Fault ... - community.ibm.com

40

B37/D37/E37 – Dataset or PDS Index Space Exceeded

ABENDs - B37/D37/E37 (RTS-028)

B37: Disk volume out of space.

D37: Primary space exceeded, no secondary extents defined.

E37: Primary and secondary extents full. In TSO, PDS directory needs compress.

E37-04: Disk volume table of contents (VTOC) is full.

Reason:

MVS could not find space for output WRITE to disk

Instructions:

WRITE

Frequent Coding Causes:

- Not enough space initially allocated to output file(s).

- (more likely) Logic error - program in (infinite) loop writing output file(s) - see S222/S322 reasons.

Tools to debug:

Static – Fault Analyzer shows the DSN of the out-of-space dataset, as do the JES output messages

On the host, the JCL shows the DDNAME and z/OS filespec of the dataset in question

Dynamic

Set an advanced conditional break point to break on a certain number of iterations

See S222/S322 reasons and solutions

Also, set break point on file WRITE statements

Page 41: IDz/ADFz Workbench Using Fault ... - community.ibm.com

41

Database “ABENDS” – Unrecoverable Events from I/O Operations

Typically database-access routines are coded to test for specific return code values from the DBMS after each I/O operation. And the program will shut itself down if the specific return code values do or do not occur.

DB2:

SQLCODE

A unique integer which describes DB2's reaction to your request.

SQLCA

Variable group which contains fields pertinent to debugging, particularly the SQLWARNs.

▪ IMS (DL/I database), VSAM and QSAM file management systems also pass values back to the application program that describe the outcome of each I/O (insert/update/delete/read) call.

Consult your shop standards for coding best practices to determine how to utilize these

Better – create reusable code structures using IDz Snippets & Templates to simplify file access coding and make it consistent

Debugging approach:

Set Line Breakpoint and/or Variable Monitor on SQLCODE and other key feedback areas

- or -

Set Line Breakpoint and Watch Monitor for "On-Change Break"

Double-click on field, Ctrl/F3
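The return-code test that database-access routines perform after each call can be sketched as follows. This Python illustration relies only on the general SQLCODE convention (0 is success, +100 means no row found or end of data, other positive values are warnings, negative values are errors). The names `classify_sqlcode` and `should_stop`, and the default set of tolerated outcomes, are hypothetical; your shop standards decide which outcomes a program may continue past.

```python
def classify_sqlcode(sqlcode: int) -> str:
    """Map a SQLCODE value to a coarse outcome category."""
    if sqlcode == 0:
        return "ok"
    if sqlcode == 100:
        return "not-found"      # no row found, or end of cursor data
    if sqlcode > 0:
        return "warning"
    return "error"


def should_stop(sqlcode: int, tolerated=("ok", "not-found")) -> bool:
    """Return True when the program should shut itself down.

    `tolerated` is an assumption for this sketch; a real program would
    take the list from its shop's coding standards.
    """
    return classify_sqlcode(sqlcode) not in tolerated


if __name__ == "__main__":
    for code in (0, 100, 354, -805):
        print(code, classify_sqlcode(code), should_stop(code))
```

Setting a Variable Monitor on SQLCODE in the debugger shows the same value this kind of routine branches on.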

Page 42: IDz/ADFz Workbench Using Fault ... - community.ibm.com

42

UNIT

Topics:

The IDz Workbench

▪ Analyzing Mainframe Abends

▪ ABEND Codes and Reasons

▪ Fault Analyzer

▪ Appendices

Page 43: IDz/ADFz Workbench Using Fault ... - community.ibm.com

43

Unit objectives

After completing this unit, you should be able to:

Work with ABEND analysis reports created by Fault Analyzer

Browse Report and Mini-Dump pages

Retrieve various Fault Analyzer view information

Browse and search ABEND codes

Use the various productivity features in the Fault Analyzer perspective

Reminder…

This course is written for z/OS developers, not Systems Programmers. While Fault Analyzer has deep z/OS analytical tools that can be used by your Systems Programming staff, this material is created for COBOL and PL/I programmers responsible for discovering and analyzing application program ABENDs and their root causes.

Page 44: IDz/ADFz Workbench Using Fault ... - community.ibm.com

44

Shooting Dumps – Trad. ABEND Analysis

Face facts:

“Shooting a dump“ (traditional ABEND research) is not a quick or easy task

▪ You need Assembler experience as well as z/OS systems knowledge to understand Address Space, Control Blocks, Registers, Base/Offsets, Hex addressing, etc.

I’d rather use Fault Analyzer – which:

Identifies the line where execution halted

Shows the salient points-of-interest surrounding the ABEND:

▪ Variables and variable values

▪ Statements

▪ Data and buffers

Gives you a head start on the What/Where and How of ABEND analysis Work-Flow

Page 45: IDz/ADFz Workbench Using Fault ... - community.ibm.com

45

What is Fault Analyzer?

▪ Fault Analyzer is a tool that helps you determine the cause of an application ABEND. It determines:

What happened, How it happened

In which program(s) – On which lines – Using which variables

Accessing which Files …or… which Databases

▪ Fault Analyzer provides the necessary information to perform root cause analysis on an application ABEND.

You do not have to interpret low-level system dumps and wade through HEX data & addresses. Information is presented in report format

▪ Fault Analyzer gathers information about an application and the surrounding environment at the time of an abnormal end (ABEND), providing you with the valuable information you need to work through

▪ After analyzing information about your application and its environment, Fault Analyzer generates an analysis report (IDIREPORT) that describes the problem in terms of application/program statements and variables

Page 46: IDz/ADFz Workbench Using Fault ... - community.ibm.com

46

What does Fault Analyzer Provide?

Fault Analyzer answers

the questions:

What happened

Where it happened

How it happened

▪ In which program(s)

▪ On which lines

▪ Using which variables

▪ Accessing which Files

Etc.

Page 47: IDz/ADFz Workbench Using Fault ... - community.ibm.com

47

Fault Analyzer for z/OS – Language and Environment Support

▪ Fault Analyzer supports:

▪ IMS and CICS® online application and system failures - with debugging facilities for all of the online file systems and databases

– IMS-DL/I, DB2, VSAM…

▪ WebSphere® Application Server for z/OS system failures

▪ WebSphere MQ application failures

▪ Batch application failures that access:

– IMS-DL/I, QSAM/VSAM, DB2…

▪ Language support:

▪ COBOL

▪ PL/I

▪ Assembler

▪ C/C++

▪ Language Environment

▪ UNIX System Services

▪ Java

Page 48: IDz/ADFz Workbench Using Fault ... - community.ibm.com

48

Fault Analyzer – Operational Flow – 1 of 3

WHEN and WHERE the ABEND occurred

1. An application ABENDs. The system intercepts the ABEND and calls a Fault Analyzer exit. The exit invokes Fault Analyzer (FA)

2. FA reads Options that control whether it will analyze the ABEND, how to process the ABEND, and which Fault History file to use

▪ Installation options are specified for the system

▪ Options can be overridden for a job step or online region

[Diagram labels: z/OS; Application (batch or online); Abend; FA Invocation Exit; Fault Analyzer (Real-Time Analysis); Options]

Page 49: IDz/ADFz Workbench Using Fault ... - community.ibm.com

49

Fault Analyzer – Operational Flow – 2 of 3

Fault Analyzer does the ABEND Preparation for you

Fault Analyzer does much of the ABEND Research for you

3. Fault Analyzer examines programs and the environment in the application Address Space

4. Files for source mapping are read

▪ It searches for matching SYSDEBUG files, side files, and compiler listings

▪ Multiple libraries can be searched

[Diagram labels: z/OS; Application; Abend; FA Invocation Exit; Fault Analyzer (Real-Time Analysis); Options; SYSDEBUG files, compiler listings, and side files]

Page 50: IDz/ADFz Workbench Using Fault ... - community.ibm.com

50

Fault Analyzer – Operational Flow – 3 of 3

HOW the ABEND occurred – Analysis of the Fault Event

5. A new Fault Entry is written to a Fault History File. The entry contains:

▪ Information about the application

▪ The Analysis Report

▪ A “mini-dump” of the application (this enables reanalysis)

6. The Analysis Report is written to SYSOUT (batch jobs only)

[Diagram labels: z/OS; Application; Abend; FA Invocation Exit; Fault Analyzer (Real-Time Analysis); Options; SYSDEBUG files, Compiler Listings, or Side Files; Fault History File (Fault Entry); SYSOUT (Analysis Report)]

Page 51: IDz/ADFz Workbench Using Fault ... - community.ibm.com

51

ABEND Resolution: Preparation, Research & Hypothesis

WHAT, WHEN, WHERE and

HOW the ABEND occurred

Page 52: IDz/ADFz Workbench Using Fault ... - community.ibm.com

52

Reviewing ABENDs in the Fault Analyzer Perspective

Besides FA’s main Analysis Report (IDIREPORT) you may also wish to use the Fault Analyzer perspective

To do that:

1. Switch to (Open) the Fault Analyzer perspective in IDz

2. Specify the history file to connect with, which populates a Default ABEND view with failed online and batch job IDIREPORTs and other outputs

– You may be able to utilize the default file

3. Learn how to navigate the Fault Analyzer perspective, to make use of the information contained therein

The next slides contain step details...

Page 53: IDz/ADFz Workbench Using Fault ... - community.ibm.com

53

Fault Analyzer Perspective – 1 of 2

Steps: Open the Fault Analyzer Perspective ➔

Page 54: IDz/ADFz Workbench Using Fault ... - community.ibm.com

54

Fault Analyzer Perspective – 2 of 2

Enter: FAULTANL.<version>.HIST

Ex: FAULTANL.V14R1.HIST

Page 55: IDz/ADFz Workbench Using Fault ... - community.ibm.com

55

Fault Analyzer Perspective – Overview

Fault History files

Report Outline

List of ABENDS in the current Fault History file

Additional Reports

IDIREPORT

Page 56: IDz/ADFz Workbench Using Fault ... - community.ibm.com

56

Fault Analyzer Perspective – The Outline View

Sections in the IDIREPORT synch with entries in the Outline view. Double-click an entry to open the associated section

Page 57: IDz/ADFz Workbench Using Fault ... - community.ibm.com

57

Fault Analyzer – Report Tabs

Click the tabs to navigate to the report sections

Page 58: IDz/ADFz Workbench Using Fault ... - community.ibm.com

58

Default List of History Files

From the Default tab

Scroll up and down – to find a particular ABEND

Double-click an ABEND history file, to bring up its IDIREPORT and other stats

Sort the list by any of the column headings

▪ Can also work with options of the Context Menu – with each ABEND entry

Page 59: IDz/ADFz Workbench Using Fault ... - community.ibm.com

59

Filtering the list of ABENDs – 1 of 2

Right-click anywhere in the list > Filters > then select a column to filter

Clear Filters using this entry ➔

Page 60: IDz/ADFz Workbench Using Fault ... - community.ibm.com

60

Specify a column-filter value for the list – 2 of 2

Wildcard characters can be used

Page 61: IDz/ADFz Workbench Using Fault ... - community.ibm.com

61

Main Report Example – S0CB

The IDIREPORT presents a formatted, high-level summary of the points of interest necessary to debug ABEND conditions in your application. Specifically, to answer the questions:

• What happened? What z/OS ABEND condition

• Where did it happen? What line or statement was executing when it happened

• How did it happen? What additional information is available for debugging purposes

Program line where the S0CB occurred

Click S0CB for an explanation of this ABEND

Page 62: IDz/ADFz Workbench Using Fault ... - community.ibm.com

62

IDIREPORT Example – S0C4

Here's an example of an IDIREPORT which shows that RPT-REC is "Not addressable" …a euphemism for: "There's something wrong with the: FD, JCL DD, Data Set connection"

Page 63: IDz/ADFz Workbench Using Fault ... - community.ibm.com

63

Fault Analyzer – Main Report Example – S0C7

The IDIREPORT and supporting text vary from ABEND to ABEND depending on:

• Type of ABEND

• Information available at the time of the ABEND

• Run-time platform

Note: The CUST-ACCT-BALANCE value is shown in hex because, even though the field is declared as numeric, invalid numeric data exists at runtime

Page 64: IDz/ADFz Workbench Using Fault ... - community.ibm.com

64

Fault Analyzer – Main Report Example – S0C9

A S0C9 (fixed-point divide exception) is like a S0CB (decimal divide exception, typically divide by zero), except that a S0C9 is raised by a binary fixed-point division: either the divisor is zero or the quotient is too large for the result

Page 65: IDz/ADFz Workbench Using Fault ... - community.ibm.com

65

Fault Analyzer – Main Report Example – S0C1

The IDIREPORT on an IMS (TM) S0C1 ABEND

Page 66: IDz/ADFz Workbench Using Fault ... - community.ibm.com

66

Fault Analyzer – Main Report Example – S806

IDIREPORT information on a module-not-found (S806) ABEND

Most likely SAM2 is either a typo on the CALL statement, or the program did not successfully compile/link into the Load Module

Page 67: IDz/ADFz Workbench Using Fault ... - community.ibm.com

67

Open Source File to ABEND Instruction

▪ Fault Analyzer can open the program source and position your cursor on the exact COBOL statement that failed.

▪ Steps:

From the IDIREPORT – click the source line #

From the FA Invocation Options Page specify the PDS that contains the source module

Page 68: IDz/ADFz Workbench Using Fault ... - community.ibm.com

68

Lookup View - For MVS, DB2, IMS, MQ and File Return Codes

The Lookup view shows a great deal of background information on:

• ABEND codes

• DB2 SQLCODE

• IMS PCB Feedback

• VSAM File Status

etc.

You can use the view, or double-click on the ABEND code shown in the IDIREPORT

An alternative to the Lookup View: MVS Return Codes for Application Programmers

• http://ibmmainframes.com/references/a29.html

Page 69: IDz/ADFz Workbench Using Fault ... - community.ibm.com

69

What does ASRA stand for?

ASRA means ABEND SYSTEM RECOVERY MESSAGE/REASON A. Various parts of CICS raise the error

▪ The first letter 'A' stands for ABEND.

▪ The second and third letters come from the name of the routine which raised the ABEND: they are the 4th and 5th letters of the program name. For ASRA the routine is DFHSRP, so the letters are SR, giving ASR.

▪ The 4th letter signifies which error has been raised, as each program may be able to raise more than one error. Hence the last letter can be A, B, C, 1, 2, 0, etc. In the case of ASRA, message/error type A has been raised, giving ASRA.

▪ ATSB - Abend Temporary Storage. This is raised by DFHTSP.

▪ Terminal Control Abends are of the format ATC_ and are raised by DFHTCP.

▪ Task Control Abends are of the format AKC_ and are raised by DFHKCP. KC in this instance as TC has already been used for Terminal Control.

▪ AICA - Abend Interval Control. This is message A from program DFHICP.

▪ The AEI_ Abend codes are from the Exec Interface and are produced by DFHEIP.
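The naming rule described above is mechanical enough to express as code. The following Python sketch encodes only the rule as this slide states it ('A', then the 4th and 5th letters of the raising program's name, then a one-character error suffix). The function names `abend_prefix` and `abend_code` are hypothetical, and the sketch makes no claim that every CICS abend code follows this pattern.

```python
def abend_prefix(program_name: str) -> str:
    """Build the three-character prefix: 'A' plus letters 4-5 of the name."""
    if len(program_name) < 5:
        raise ValueError("program name needs at least five characters")
    return "A" + program_name[3:5].upper()


def abend_code(program_name: str, suffix: str) -> str:
    """Append the one-character error suffix to the prefix."""
    if len(suffix) != 1:
        raise ValueError("suffix must be a single character")
    return abend_prefix(program_name) + suffix.upper()


if __name__ == "__main__":
    print(abend_code("DFHSRP", "A"))   # system recovery program
    print(abend_code("DFHICP", "A"))   # interval control program
    print(abend_prefix("DFHTCP"))      # terminal control prefix
    print(abend_prefix("DFHKCP"))      # task control prefix
```

Reading the rule in reverse is the practical use: the middle two letters of an unfamiliar abend code point to the CICS component to look up.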

Page 70: IDz/ADFz Workbench Using Fault ... - community.ibm.com

70

Event Details – Event Summary

The event summary shows the call chain

Each line is an event or a program in the call chain

SAM2 was the active program when the ABEND occurred

Hyperlinks to the source file

Page 71: IDz/ADFz Workbench Using Fault ... - community.ibm.com

71

Event Details – Program Detail section

Paragraph trace

The detail report for the 1st program begins here

Page 72: IDz/ADFz Workbench Using Fault ... - community.ibm.com

72

Event Details – Current Statement + Variables

Current statement

Variables referenced by the current statement and their values

Page 73: IDz/ADFz Workbench Using Fault ... - community.ibm.com

73

Event Details – Load Module Details

Link-Edit Date/Timestamp

Page 74: IDz/ADFz Workbench Using Fault ... - community.ibm.com

74

Event Details – Instructions and General Purpose Registers

Current machine instruction

Register values

Page 75: IDz/ADFz Workbench Using Fault ... - community.ibm.com

75

Event Details – Associated Files

Information is displayed about each file that was open at the time of the ABEND

Page 76: IDz/ADFz Workbench Using Fault ... - community.ibm.com

76

Event Details – File Buffers (Blocks)

Associated storage areas display program variables, when source information is available

Page 77: IDz/ADFz Workbench Using Fault ... - community.ibm.com

77

Event Details – Working Storage

Working-Storage Section

Data Values + Declarations

Page 78: IDz/ADFz Workbench Using Fault ... - community.ibm.com

78

Abend Information – General information about the job and modules

Job information

Page 79: IDz/ADFz Workbench Using Fault ... - community.ibm.com

79

Abend Information – Module Summary

Module summary

Page 80: IDz/ADFz Workbench Using Fault ... - community.ibm.com

80

Fault Analyzer – System Wide Information

This section contains console messages that are not identified as belonging to any specific event, or CICS system-related information, such as trace data and 3270 screen buffer contents. It is preceded by the heading: S Y S T E M - W I D E I N F O R M A T I O N

Information about open files that could not be associated with any specific event might also be included here. If there is no information in this section, then it does not appear in the report.

Information on Data Set not associated with the ABEND

Page 81: IDz/ADFz Workbench Using Fault ... - community.ibm.com

81

Fault Analyzer – Miscellaneous

This section contains information about the Fault Analyzer options and files.

Page 82: IDz/ADFz Workbench Using Fault ... - community.ibm.com

82

Fault Analyzer – Mini-Dump Reading 1 of 2

Fault Analyzer also provides for the reading/browsing of System Dump data – in Hex/Character format.

Select an ABEND

Scroll through the dump – Issue navigation commands: Show nnn, +nn, etc.

Page 83: IDz/ADFz Workbench Using Fault ... - community.ibm.com

83

Fault Analyzer Integration – Mini-Dump Reading 2 of 2

You can assign analysis notes to the dump.

1. Right-click over the storage address

2. Add your note (click OK) ➔

3. Your note becomes highlighted text inside the dump

Page 84: IDz/ADFz Workbench Using Fault ... - community.ibm.com

84

Checkpoint

1. What IDz Perspective is used to view Fault Analyzer reports?

2. How does IDz obtain Fault Analyzer Information? Where does the information originate?

3. IDz Fault Analyzer interface has a Lookup View. What is it used for?

4. How can you jump to the program statement where the ABEND occurred with the IDz Fault Analyzer interface?

Page 85: IDz/ADFz Workbench Using Fault ... - community.ibm.com

85

Opening Fault Analyzer from Remote Systems/JES – 1 of 2

You can open a Fault Analyzer report directly from Remote Systems/JES.

• First ensure that the name of your LPAR/Connection is the same as the Systems Information Host name

Page 86: IDz/ADFz Workbench Using Fault ... - community.ibm.com

86

Opening Fault Analyzer from Remote Systems/JES – 2 of 2

• Then – you’ll be able to Right-Click on an Abended Job and hyperlink directly to the Fault Analyzer report

Page 87: IDz/ADFz Workbench Using Fault ... - community.ibm.com

87

Summary

Having completed this unit, you should now be able to:

Work with ABEND analysis reports created by Fault Analyzer

Browse Report and Mini-Dump pages

Retrieve various Fault Analyzer view information

Browse and search ABEND codes

Use the various productivity features in the Fault Analyzer perspective

Page 88: IDz/ADFz Workbench Using Fault ... - community.ibm.com

88

UNIT

Topics:

The IDz Workbench

▪ Analyzing Mainframe Abends

▪ ABEND Codes and Reasons

▪ Fault Analyzer

▪ Appendices

▪ Miscellaneous Fault Analyzer slides

▪ COBOL and ABENDs

Page 89: IDz/ADFz Workbench Using Fault ... - community.ibm.com

89

Downloading and Installing the Fault Analyzer Client

Click the Fault Analyzer link

https://developer.ibm.com/mainframe/products/downloads/

Page 90: IDz/ADFz Workbench Using Fault ... - community.ibm.com

90

Fault Analyzer – Operational Process (Terms & Concepts)

▪ Fault Analyzer has the ability to isolate the exact instruction that caused an ABEND:

The analysis engine provides automatic analysis when the application fails.

When an ABEND occurs, Fault Analyzer activates automatically, and then records details in a fault history file (see screen capture below)

Fault History files contain information about the faults analyzed by Fault Analyzer for z/OS.

Using Fault History files, re-analysis is available when real-time ABEND analysis isn’t enough (you can extract additional information in batch or interactive mode)

ABEND happens

Fault Analyzer exits are invoked ➔

Salient details (points of interest) written and stored ➔

Page 91: IDz/ADFz Workbench Using Fault ... - community.ibm.com

91

Workshop – Big Picture

Steps/Stages:

1. Copy several datasets from your instructor's zServerOS TSO ID to your ID

▪ Details on the next slide

2. Modify JCL dataset names (and high-level qualifiers) to match your Sandbox ID

3. Compile a program named HOSPCALC, which contains different types of COBOL ABENDs generated from invalid COBOL logic in different parts of the program

4. Run your program. If it ABENDs, then from the Fault Analyzer IDIREPORT:

▪ Find the error in the COBOL source, and use the IDIREPORT ABEND analysis data to fix the error

▪ After you've solved the problem, save your edits and re-compile HOSPCALC. Then run the program until you either get the next ABEND … or get a zero return code ☺

[Diagram labels: Program; Load Module; Analyze IDIREPORT; Fix HOSPCALC COBOL error; Compile; Link Edit]

Page 92: IDz/ADFz Workbench Using Fault ... - community.ibm.com

92

Fault Analyzer – IDIREPORT

The IDIREPORT provides ABEND analysis information

WHAT, WHEN and WHERE the ABEND occurred

HOW the ABEND occurred

Page 93: IDz/ADFz Workbench Using Fault ... - community.ibm.com

93

Inputs to the Debugging and ABEND-Resolution Process

▪ Along with the System Completion Code, Fault Analyzer provides reports about your program run which describe What/Where/When/How the ABEND occurred.

▪ Valuable information contained in the Fault Analyzer report-files includes:

The System Completion Code (and often a short text description of what it designates)

A short explanation of the cause of the ABEND

The COBOL instruction (statement) or line number, which contained the invalid operation causing z/OS to halt execution

Variables of interest – and code surrounding the instruction that halted execution

A "core-dump" (a hexadecimal printout) of the internal machine storage and registers relevant to the areas of your program surrounding the COBOL instruction which caused z/OS to halt execution.

▪ This information is critical to begin understanding and researching the problem, but it is sometimes insufficient to solve the underlying application problem, which could be any combination of:

Incomplete, incorrect or invalid COBOL procedural logic

A typo such as a misplaced period, or incorrectly specified field

Incorrect or invalid input data

Batch jobs run out of sequence

Input files missing or corrupted (hardware errors)

Errors which relate to JCL problems

etc.

Page 94: IDz/ADFz Workbench Using Fault ... - community.ibm.com

94

Fault Analyzer for z/OS – Overview and Use

▪ Fault Analyzer runs in both test and production with very little overhead.

▪ Fault Analyzer:

Helps you analyze failures when they occur or reanalyze them after the fact

Expands error messages and codes that apply to your failure with interactive reanalysis and includes a feature for using application-specific messages and codes to supplement those supplied by IBM

Creates a fault history file with an interactive display that helps you track and manage application failures

Starts automatically when an application fails, eliminating the need to recompile programs or change the job control language (JCL)

Integrates with IBM Developer for z Systems – and enables developers to diagnose application problems without changing user interface

Note that you do not have to make any changes to existing programs in order to allow Fault Analyzer to produce an analysis of an ABEND. Nor do you have to recompile programs in order to use Fault Analyzer

Page 95: IDz/ADFz Workbench Using Fault ... - community.ibm.com

95

UNIT

ABEND Resolution

• Terms and Concepts

• Types of ABENDs

• Defensive Programming

• Specific ABENDs

• ABEND on purpose

ABENDs and COBOL

"It's almost a given that there is some amount of invalid data floating around in the files and data bases." – z/OS Architect, 2020

Appendix 1

Page 96: IDz/ADFz Workbench Using Fault ... - community.ibm.com

96

z/OS ABEND (ABnormal END of Task)

▪ Production business application software errors are costly:

While they are nowhere near as expensive as mistakes on an operating table

They’re more expensive than mixing up the 1% vs. 2% milk in the dairy cabinets…or hitting Reply All when you actually meant to hit Reply

▪ There are about a dozen categories of common COBOL errors which produce ABENDs. These include but are not limited to:

Incorrect data typing of field definitions

Incorrect subprogram parameter passing order

Invalid data within files

▪ Values out of range

▪ Specific bad values … missing values

Incorrect record-layout offset definitions

Programmer/Analyst/Developer errors

▪ Misunderstanding of the specs – Typically the biggest & most expensive single issue

▪ Incomplete testing – Second biggest issue

▪ Misunderstanding of the COBOL language - Third biggest issue

An ABEND is a mainframe business application "Blue Screen"

Page 97: IDz/ADFz Workbench Using Fault ... - community.ibm.com

97

ABENDING in Production vs. Test and Development

▪ ABENDs during Development & Test are expected

Not welcome - but expected

▪ ABENDs in Production are expensive - and unacceptable

They negatively impact corporation financials, market reputation, etc.

ABENDs during Development/Test: Inconvenient and Expected

Production ABENDs: Unacceptable - Potential loss of business revenue

Page 98: IDz/ADFz Workbench Using Fault ... - community.ibm.com

98

ABEND or Invalid Data - which is worse?

▪ It is widely held that invalid production data is far worse than MVS ABEND situations:

When something ABENDs, it ABENDs:

▪ Execution stops

▪ z/OS tells you precisely what failed - when & where it failed (the why & how are up to you to discover)

▪ Backout routines can be called automatically

▪ CHECKPOINT routines can be used to provide point-in-time recovery

When applications "go EOJ":

▪ Results may (or may not) be correct

– Often only business users can verify this

▪ If results are not correct:

– What's wrong - was it the data or the code?

– If it's the code, where in heck do you start?

– Backtrack - or start from the beginning

▪ If this was production, invalid values will negatively impact the corporation - not just you or your team

Sometimes programs contain their own "self-balancing" defensive-programming:

▪ Record in/Record out counters

▪ Amounts in/Amounts out as well as "trial balances"
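The "self-balancing" counters mentioned above amount to a small set of control totals checked at end of job. Here is a minimal Python sketch of that idea. Everything in it is an assumption made for illustration: the `ControlTotals` class, the `run_step` function, and the rule that negative amounts are rejected. A real program would apply its own business edits and would typically ABEND on purpose, or set a non-zero return code, when the totals do not balance.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class ControlTotals:
    records_in: int = 0
    records_out: int = 0
    records_rejected: int = 0
    amount_in: Decimal = Decimal("0")
    amount_out: Decimal = Decimal("0")
    amount_rejected: Decimal = Decimal("0")

    def balanced(self) -> bool:
        """Records and amounts read must equal those written plus rejected."""
        return (self.records_in == self.records_out + self.records_rejected
                and self.amount_in == self.amount_out + self.amount_rejected)


def run_step(amounts: list[Decimal]) -> ControlTotals:
    """Process input amounts, keeping in/out/rejected totals as it goes.

    Hypothetical edit rule: negative amounts are rejected, all others written.
    """
    totals = ControlTotals()
    for amount in amounts:
        totals.records_in += 1
        totals.amount_in += amount
        if amount >= 0:
            totals.records_out += 1
            totals.amount_out += amount
        else:
            totals.records_rejected += 1
            totals.amount_rejected += amount
    return totals


if __name__ == "__main__":
    result = run_step([Decimal("10.00"), Decimal("-2.50"), Decimal("5.25")])
    print(result)
    print("balanced:", result.balanced())
```

The value of the check is that a job which "goes EOJ" with wrong results is caught by the program itself rather than by a business user days later.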

Page 99: IDz/ADFz Workbench Using Fault ... - community.ibm.com

99

ABENDS and COBOL Coding Errors

Typical COBOL ABEND causes for sequential batch applications:

▪ Alphanumeric data:

• Truncation

• Incorrect PIC clause alignment in the record layout

▪ Numeric data:

• Reference to numeric field that contains non-numeric data

• Decimal place precision and rounding - esp. with internal variables

▪ File problems:

• Read past end of file

• Reference to file before OPEN or after CLOSE

• Write loop fills up an output file

▪ IF conditions:

• Incorrect specification of True/False logic

• References to numeric fields that contain non-numeric data

▪ Programmatic "fall-thru":

• COBOL statements execute downwards sequentially - irrespective of paragraph boundaries

▪ Unchecked PERFORM UNTIL (iteration):

• Infinite loops

▪ Index issues:

• Typically "index out of range"

▪ File handling:

• Invalid ASSIGN clause

▪ JCL:

• Incorrect module name

• Invalid DD name

• Invalid DSN

• DISP= not correct with READ/WRITE

▪ Application version control issues

Page 100: IDz/ADFz Workbench Using Fault ... - community.ibm.com

100

Avoiding ABENDS

▪ Data:

• Truncation: Understand the COBOL MOVE instruction

• Incorrect PIC clause alignment in the record layout: Align the actual data file to the record layout

▪ Numeric data:

• Reference to numeric field that contains non-numeric data: Liberal use of IF … NOT NUMERIC tests

• Decimal place precision and rounding - esp. with internal variables: Understand the underlying accounting - and use ROUNDED

▪ File problems:

• Read past end of file: Debugging, Desk-Checking and Peer Reviews

• Reference to file before OPEN or after CLOSE: Ditto

• Write loop fills up the output file: Understand the record capacity and file Space Allocation. Debug for Infinite Loop

▪ IF conditions:

• Incorrect specification of True/False logic: Debug with the "Jump To" function, Flow Charting, and a clear understanding of the COBOL semantics and business spec.

▪ Program "fall-thru":

• Paragraph fall-thru: Debug with "Conditional Watch Monitors" and/or code a DISPLAY statement at the top of the paragraph - which names the paragraph.

• IF/Conditional fall-thru: Ditto

▪ Iteration:

• Infinite loops: Check for numeric truncation in loop counters

▪ File handling:

• Invalid ASSIGN clause: Vertical split screen, view JCL & program ENVIRONMENT DIVISION

▪ JCL:

• Incorrect module name: Typically easy (JCL Error)

• Invalid DD name: View ENVIRONMENT DIVISION and batch JCL side-by-side

• Invalid DSN: JCL Error

• File not the correct DCB: Debug with "Conditional Watch Monitors".

• DISP= not correct with READ/WRITE: ABEND upon OPEN <file>. In general: OPEN INPUT assumes that the file contains data (DISP=SHR) and OPEN OUTPUT assumes that the file is empty (DISP=NEW). OPEN OUTPUT will overwrite the content of a file.
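The point about decimal place precision and rounding in the list above is easy to see with a worked example. This Python sketch uses the standard `decimal` module to contrast truncation with rounding half away from zero, which is how this sketch models storing a result with and without COBOL's ROUNDED phrase in its default mode. The function name `store` and its parameters are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def store(value: Decimal, decimals: int, rounded: bool) -> Decimal:
    """Fit `value` into a field with `decimals` decimal places.

    rounded=False truncates the extra digits; rounded=True rounds half
    away from zero (the behaviour assumed here for ROUNDED).
    """
    quantum = Decimal(1).scaleb(-decimals)    # 0.01 for two decimal places
    mode = ROUND_HALF_UP if rounded else ROUND_DOWN
    return value.quantize(quantum, rounding=mode)


if __name__ == "__main__":
    share = Decimal("10") / 3                 # an intermediate result
    print(store(share, 2, rounded=False) * 3)             # total after truncation
    print(store(Decimal("2.345"), 2, rounded=False))      # truncated
    print(store(Decimal("2.345"), 2, rounded=True))       # rounded
```

Truncated shares no longer add back up to the original amount, which is the kind of penny discrepancy that a trial balance exposes.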


Common COBOL Business Application ABEND Types

There are more ABEND types and situations than these that you'll see as a COBOL coder, but understanding the nine common ABENDs in this list will get you started.

Also, z/OS may mask or return different system ABENDs than those listed below, depending on whether the ABEND occurs in a layer of system software (CICS, Language Environment).

▪ S001 - Record Length/Block Size Discrepancy

▪ S013 - Empty File/Record Length/Block Size Discrepancy

▪ S0C1 - Invalid Instruction

▪ S0C4 - Storage Protection Exception

▪ S0C7 - Data Exception

▪ S0CB - Divide by Zero

▪ S222/S322 - Time out/Job Cancelled - Infinite Loop

▪ S806 - Module Not Found

▪ B37/E37 - Out of space (output file)


S001: Record Length/Block Size - Discrepancy

Reason(s)

S001-0: Conflict between record length (program vs. JCL vs. dataset label)

S001-2: Damaged storage media or hardware error

S001-3: Fatal QSAM error

S001-4: Conflict between Block specifications (program vs. JCL)

S001-5: Attempt to read past end-of-file

Instructions: OPEN, CLOSE, READ, WRITE

Frequent Coding Causes:

S001-0: Typos in FD or JCL

S001-2: Corrupt disk or tape dataset

S001-3: Internal z/OS problem

S001-4: Forgot to code BLOCK CONTAINS 0 RECORDS in FD (default Block is 1)

S001-5: Logic error (forgot to close file, or end-of-file-switch not set, overwritten, etc.)

Defensive Programming:

1. Split-Screen COBOL ➔ JCL

2. From JCL: Right-Click on DSN … Open Declaration

3. Select File and verify LRECL from the Properties View
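
Beyond verifying LRECL in the workbench, the S001-4 and S001-5 causes can be guarded against in the program itself. The following is a minimal sketch, not taken from the course material: the program name READSAFE, the DD name INFILE, the 80-byte record length and all data names are assumptions for illustration. It codes BLOCK CONTAINS 0 RECORDS so the block size is taken from the JCL or dataset label, tests FILE STATUS after OPEN, and uses an end-of-file switch so the program never reads past end-of-file.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READSAFE.
      * Hypothetical example; INFILE must match a DD name in the JCL.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE ASSIGN TO INFILE
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-IN-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE
           RECORDING MODE IS F
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01  IN-RECORD               PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-IN-STATUS            PIC XX   VALUE SPACES.
       01  WS-EOF-SWITCH           PIC X    VALUE "N".
           88  END-OF-FILE                  VALUE "Y".
       01  WS-RECORD-COUNT         PIC 9(9) VALUE ZERO.
       PROCEDURE DIVISION.
       000-MAIN.
           OPEN INPUT IN-FILE
           IF WS-IN-STATUS NOT = "00"
               DISPLAY "OPEN FAILED, FILE STATUS = " WS-IN-STATUS
               MOVE 12 TO RETURN-CODE
               GOBACK
           END-IF
      * Every pass through the loop issues a READ, so the
      * end-of-file condition is always reachable.
           PERFORM 100-READ-RECORD
               UNTIL END-OF-FILE
           CLOSE IN-FILE
           DISPLAY "RECORDS READ: " WS-RECORD-COUNT
           GOBACK.
       100-READ-RECORD.
           READ IN-FILE
               AT END
                   SET END-OF-FILE TO TRUE
               NOT AT END
                   ADD 1 TO WS-RECORD-COUNT
           END-READ
      * Treat any status other than 00 (OK) or 10 (EOF) as fatal.
           IF WS-IN-STATUS NOT = "00"
              AND WS-IN-STATUS NOT = "10"
               DISPLAY "READ FAILED, FILE STATUS = " WS-IN-STATUS
               MOVE 12 TO RETURN-CODE
               SET END-OF-FILE TO TRUE
           END-IF.
```

A FILE STATUS clause may not intercept every system ABEND, so the LRECL and block size checks in the steps above still apply.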


S001: Record Length/Block Size - Discrepancy

Defensive Programming:

1. Split-Screen COBOL ➔ JCL ➔ File Properties

2. From JCL: Right-Click on DSN … Open Declaration

3. Select File and verify LRECL from the Properties View



S013: Conflicting DCB Parameters

Reason(s)

S013-10: Dummy data set needs buffer space; specify BLKSIZE in JCL

S013-14: DD statement must specify a PDS

S013-18: PDS member not found

S013-1C: I/O error search PDS directory

S013-20: Block size is not a multiple of the LRECL

S013-34: LRECL is incorrect

S013-50: Tried to open a printer for Input or I/O

S013-60: Block size not equal to LRECL for unblocked file

S013-64: Attempted to Dummy out indexed or relative file

S013-68: Block size > 32K

S013-A4: SYSIN or SYSOUT not QSAM file

S013-A8: Invalid RECFM for SYSIN/SYSOUT

S013-D0: Attempted to define PDS with RECFM FBS or FS

S013-E4: Attempted to concatenate > 16 PDSs

COBOL Instructions: OPEN, CLOSE, READ, WRITE

Frequent Coding Causes:

Most of these ABENDs occur when running under z/OS. Some may no longer occur under z/OS, although older modules (OS/VS COBOL or VS COBOL II code) that have not been recompiled can still produce them. Most are due to inconsistencies between the JCL and the COBOL FD.

Tools to debug – Static Analysis: S013-18: Same technique as S001
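
The S013-20 condition (block size not a multiple of LRECL) is simple arithmetic that can be checked before a job is submitted. The sketch below is illustrative only: the program name BLKCHECK, the data names and the sample values 80 and 27920 are assumptions, and the rule shown applies to fixed-length blocked records.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BLKCHECK.
      * Hypothetical check; sample values are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LRECL            PIC 9(5) VALUE 80.
       01  WS-BLKSIZE          PIC 9(5) VALUE 27920.
       01  WS-REMAINDER        PIC 9(5).
       PROCEDURE DIVISION.
      * For fixed blocked files BLKSIZE must divide evenly
      * by LRECL; 27920 = 349 x 80, so the remainder is zero.
           COMPUTE WS-REMAINDER =
               FUNCTION MOD(WS-BLKSIZE WS-LRECL)
           IF WS-REMAINDER = ZERO
               DISPLAY "BLKSIZE IS A MULTIPLE OF LRECL"
           ELSE
               DISPLAY "BLKSIZE IS NOT A MULTIPLE OF LRECL"
           END-IF
           GOBACK.
```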


S013: Block Size - Discrepancy

Defensive Programming:

1. Delete the BLOCK CONTAINS 0 RECORDS line as shown below

2. Save and COBUCLG

3. Open the IDIREPORT

Note that in COBOL v6, with certain directives, BLOCK issues are resolved by the operating system.


S0C1: Invalid Instruction

Reason(s)

- SYSOUT DD statement missing

- The value in an AFTER ADVANCING clause is < 0 or > 99

- An index or subscript is out of range

- An I/O verb was issued against an unopened dataset

- Can also happen if the CALL/ENTRY subroutine LINKAGE does not match the calling program's record definition

- File READ attempted before File OPEN

Instructions: OPEN, CLOSE, READ, WRITE, Table handling routines

Note also that during Debug SYSOUT-DISPLAYs are written to the "console"

Frequent Coding Causes:

- Incorrect logic in setting AFTER ADVANCING variable (or failure to understand 0-99 limits)

- Incorrect logic in table handling code, or number of table entries has overflowed the PIC of variable e.g. PIC 99 (two digits, max) - but there are 100 entries in the table

Tools to debug:

Static

SYSOUT problem: Open multiple windows on AD Batch Job Diagram and program Environment Division - SELECT ASSIGN.

Logic problem: Select File. Use Occurrences in Compilation to isolate statements

Dynamic:

Set Watch Breakpoint and Monitor on table index or AFTER ADVANCING variable.

Set conditional advanced break point on subscript (i.e. SUB<100).


S0C4: Protection Exception

Reason(s):

The program is attempting to access a memory address that is not within the application's z/OS "Address Space".

Frequent Coding Causes:

- JCL DD statement is missing or incorrectly coded: File Status 47 upon READ instruction

- Incorrect logic in table handling code (referencing a table subscript < 1 or > max-table-size)

- INITIALIZE used against a buffer (file FD) for a file that hasn't been opened.

- Number of table entries has outgrown the PIC of the variable (e.g. PIC 99, but 100 entries).

Tools to debug:

Static

- DD statement problem: Open multiple windows on AD Batch Job Diagram and program Environment Division - SELECT ASSIGN

- Incorrect linkage problem:

- Open multiple windows on CALLing and CALLed programs - verify linkage declarations.

Dynamic

The problem with S0C4 ABENDs is that once they happen, there's nothing left to capture and assist with debugging.

An "Address Space" is a block of virtual memory that your Load Module is assigned and runs in when executing on z/OS. If your program attempts to reference memory beyond the Address Space assigned, z/OS ABENDs your program with an S0C4.

Important Note: Compile parameters influence what statements will and won't S0C4
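
Since an S0C4 leaves little to debug, one option is to validate a subscript before it is used. The following is a minimal sketch under stated assumptions: the program name SUBCHECK, the 100-entry table and all data names are hypothetical. Note that the subscript is defined as PIC 9(3), so it can actually hold the value 100.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUBCHECK.
      * Hypothetical example; table size and names are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-MAX-ENTRIES      PIC 9(3) VALUE 100.
       01  WS-SUB              PIC 9(3) VALUE ZERO.
       01  WS-RATE-TABLE.
           05  WS-RATE         PIC 9(3)V99 OCCURS 100 TIMES.
       PROCEDURE DIVISION.
           INITIALIZE WS-RATE-TABLE
      * Simulate a subscript that has run one past the table.
           MOVE 101 TO WS-SUB
      * Range-check before the table reference, not after.
           IF WS-SUB < 1 OR WS-SUB > WS-MAX-ENTRIES
               DISPLAY "SUBSCRIPT OUT OF RANGE: " WS-SUB
               MOVE 16 TO RETURN-CODE
           ELSE
               DISPLAY "RATE: " WS-RATE (WS-SUB)
           END-IF
           GOBACK.
```

As an alternative to hand-coded checks, the Enterprise COBOL SSRANGE compile option adds run-time range checking of subscripts and indexes, which is one example of a compile parameter that changes whether a statement ends in an S0C4.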


S0C7: Data Exception

Reason:

Machine instruction expecting numeric data found invalid data

Instructions:

Arithmetic, IF, MOVE (if receiving field is numeric) and PERFORM VARYING statements

Your application can S0C7 if the sending field is numeric and contains non-numeric data (MOVE pic9field TO picXfield).

Frequent Coding Causes:

- Incorrectly initialized, or uninitialized variable

- Missing or incorrect data edit

- 01 to 01 level MOVE if sending field is shorter than receiving field

- Move of Zeros to Group-level numeric fields

- MOVE CORRESPONDING incorrect

- Incorrect MOVE field1 TO field2 assignment statements

Tools to debug:

Static

Occurrences in Compilation Unit on numeric fields

Isolate all PIC 9 Fields

Dynamic

Set Watch points and Monitor on field.

Record the Debug session - Run through to S0C7 and Playback from the ABEND

Locate the field definition - and use client data analysis tools

Solutions:

Add edit checks for valid data in all numeric fields

Define all numeric data that does not participate in arithmetic as PIC X

Important Note: Compile parameters influence what statements will and won't S0C7
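
The first solution, an edit check on a numeric field, might look like the sketch below. It is illustrative only: the program name NUMCHECK, the data names and the sample value "12A45" are assumptions. The field is tested with the NUMERIC class condition before it takes part in arithmetic, so bad data is reported instead of causing a data exception.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NUMCHECK.
      * Hypothetical example; names and sample data are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-INPUT-AREA.
           05  WS-QTY-TEXT     PIC X(5) VALUE "12A45".
           05  WS-QTY REDEFINES WS-QTY-TEXT
                               PIC 9(5).
       01  WS-TOTAL            PIC 9(7) VALUE ZERO.
       PROCEDURE DIVISION.
      * Only use the field in arithmetic once it is known
      * to contain valid digits.
           IF WS-QTY IS NUMERIC
               ADD WS-QTY TO WS-TOTAL
           ELSE
               DISPLAY "INVALID QUANTITY: " WS-QTY-TEXT
           END-IF
           DISPLAY "TOTAL: " WS-TOTAL
           GOBACK.
```

With the sample value the ELSE branch runs and the total stays at zero. Changing the value to "12345" takes the ADD path.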


S0CB: Divide by Zero

Reason: CPU attempted to divide a number by 0.

Instructions: DIVIDE, COMPUTE with / operation

Frequent Coding Causes:

- Incorrectly initialized, or uninitialized variable

- Missing or incorrect data edits (i.e. failed to check divisor for zero value)

Tools to debug:

Static

Search for all DIVIDE and COMPUTE instructions, or using IDz double-click on these verbs and select Filter from the Context Menu

Dynamic

Run through to the S0CB

Locate the field definitions of the offending fields

Solution:

Add an edit to check for zero divide:

    IF divisor NOT = ZERO
        COMPUTE ...
    ELSE
        PERFORM error-processing-routine
    END-IF

Add ON SIZE ERROR to all arithmetic verbs.
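
The ON SIZE ERROR advice can be sketched as follows. This is an illustrative example, not the course's own code: the program name DIVCHECK and all data names are assumptions. Division by zero raises the size error condition, so the imperative statements under ON SIZE ERROR run instead of the program ABENDing.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DIVCHECK.
      * Hypothetical example; names and values are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TOTAL-AMOUNT     PIC 9(7)V99 VALUE 1500.00.
       01  WS-ITEM-COUNT       PIC 9(5)    VALUE ZERO.
       01  WS-AVERAGE          PIC 9(7)V99 VALUE ZERO.
       PROCEDURE DIVISION.
      * WS-ITEM-COUNT is zero, so the SIZE ERROR branch runs.
           DIVIDE WS-TOTAL-AMOUNT BY WS-ITEM-COUNT
               GIVING WS-AVERAGE ROUNDED
               ON SIZE ERROR
                   DISPLAY "DIVIDE FAILED: ZERO DIVISOR OR OVERFLOW"
                   MOVE ZERO TO WS-AVERAGE
               NOT ON SIZE ERROR
                   DISPLAY "AVERAGE: " WS-AVERAGE
           END-DIVIDE
           GOBACK.
```

ON SIZE ERROR also fires when the result does not fit the receiving field, so the same clause covers overflow as well as a zero divisor.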


S222/S322: Timeout … Endless Loop

Reason:

Timeout due to program logic caught in a "loop" through an instruction set with no exit.

S322 = Timeout

S222 = Job Cancelled

Frequent Coding Causes:

- Invalid logic or fall-through logic

- Invalid end-of-file logic

- End-of-file switch overlaid

- Subscript not large enough

- Perform Thru wrong Exit

- PERFORM UNTIL "End-Of-File", but not performing "READ" routine to reach EOF condition

Tools to debug:

Static

Perform Hierarchy/Program Control Flow on logic in PERFORM chain

Desk-Checking for other loop possibilities

Dynamic

Debug to Loop

Query and Monitor on subscript

Set an Advanced Break Point - Conditional on count

Solution:

For S322: you may need to increase the TIME=(,n) value in the JCL Job Card.

For S222: you will need to read the code carefully to find one of the Frequent Coding Causes.

Note: You will need to cancel the job to stop the endless loop.
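
The "Subscript not large enough" cause is worth a small illustration. In the sketch below (the program name LOOPSAFE and all data names are assumptions), the loop counter is defined as PIC 9(3) so it can exceed the limit of 100 and end the loop. If the counter were PIC 99, adding 1 to 99 could truncate to 00, the UNTIL condition would never become true, and the job would typically end with an S322 or have to be cancelled.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOOPSAFE.
      * Hypothetical example; names and limits are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * The counter must be able to hold limit + 1 (101).
       01  WS-SUB              PIC 9(3) VALUE ZERO.
       01  WS-ENTRY-COUNT      PIC 9(3) VALUE 100.
       01  WS-SUM              PIC 9(7) VALUE ZERO.
       PROCEDURE DIVISION.
           PERFORM VARYING WS-SUB FROM 1 BY 1
                   UNTIL WS-SUB > WS-ENTRY-COUNT
               ADD WS-SUB TO WS-SUM
           END-PERFORM
           DISPLAY "SUM OF 1 THRU 100: " WS-SUM
           GOBACK.
```

The loop ends when WS-SUB reaches 101, which is why the counter needs one more digit than the limit appears to require.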


S806: Module Not Found

Reason: CALL made to a program that could not be located along the normal search path, which is:

//STEPLIB

//JOBLIB

LINKPACK

Instructions: Fix the program CALL keyword or the JCL EXEC PGM=XXXX

Frequent Coding Causes:

- Module deleted from library, or never compiled to library

- Module name spelled incorrectly

- STEPLIB does not contain load library with module

- I/O error occurred while z/OS searched the directory of the library

Tools to debug:

Static

Build (Link) Map

Do a Remote Systems search on the module name in the Load Libraries

Dynamic

Set a Program Advanced Break Point (Entry) to break before entry to system.

Solution:

Spell name correctly

Check return code from Link Edit (Build step)
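
For dynamic CALLs, a missing module can also be trapped in the calling program. The following is a minimal sketch under stated assumptions: the program name CALLSAFE and the deliberately nonexistent module name NOSUCHPG are hypothetical. The ON EXCEPTION phrase receives control when the called program cannot be made available, so the caller can report the problem and set a return code.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALLSAFE.
      * Hypothetical example; NOSUCHPG is assumed not to exist.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-SUBPGM-NAME      PIC X(8) VALUE "NOSUCHPG".
       PROCEDURE DIVISION.
      * CALL by data name is a dynamic call, resolved at run
      * time through the load library search path.
           CALL WS-SUBPGM-NAME
               ON EXCEPTION
                   DISPLAY "MODULE NOT FOUND: " WS-SUBPGM-NAME
                   MOVE 16 TO RETURN-CODE
               NOT ON EXCEPTION
                   DISPLAY "CALL COMPLETED: " WS-SUBPGM-NAME
           END-CALL
           GOBACK.
```

This only helps for CALLs issued by the program. A wrong name on the JCL EXEC PGM= statement still has to be fixed in the JCL.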


B37/D37/E37: Dataset or PDS Index Space Exceeded

ABENDS - B37/D37/E37 (RTS-028)

B37: Disk volume out of space.

D37: Primary space exceeded, no secondary extents defined.

E37: Primary and secondary extents full. In TSO, PDS directory needs compress.

E37-04: Disk volume table of contents (VTOC) is full.

Reason: MVS could not find space for output file WRITEs to disk

COBOL Instructions: WRITE

Frequent Coding Causes:

- Not enough space initially allocated to output file(s).

- (more likely) Logic error - program in (infinite) loop writing output file(s) - see S222/S322 reasons.

Tools to debug:

Static: Fault Analyzer will show the DSN of the out-of-space dataset, as will the JES output messages.

On the host, the JCL will show the DDNAME and z/OS filespec of the dataset in question.

Dynamic

Set an advanced conditional break point to break on a certain number of iterations

See S222/S322 reasons and solutions

Also, set break point on file WRITE statements