IT security testing, a practical guide — Part 6: Failure mode testing

April 1993 Computer Audit Update

software house should be prevented from selling software which in any way discloses the purchaser's confidential information.

If ownership is to remain with the software house, the purchaser may be able to recover some of the development costs charged to it by agreeing that repayments are made to it should the contractor make subsequent sales of the software to a third party.

To successfully acquire bespoke software there are many points which need to be included in a purchase agreement. The lessons of the London Ambulance Service and the Taurus experience are clear. As the London Ambulance Inquiry Report concluded when the system was fully implemented last Autumn: "The software was not complete, not properly tuned and not fully tested". The purchasers of commissioned software are advised to ensure that the same does not apply to any software bought by them.

IT SECURITY TESTING, A PRACTICAL GUIDE - PART 6

FAILURE MODE TESTING

Bernard Robertson and David Pullen PA Consulting Group

Failure mode testing is a specialized form of testing which is used extensively with defence and safety critical systems. Within the area of security testing, it is appropriate to the testing of systems which perform critical functions, e.g. financial systems, systems processing confidential data or systems which must provide high levels of availability. The objective of failure mode testing is to determine the effect of failures within system components on the entire system (e.g. the failure of the disk drive onto which the audit log is being written). In all cases the component should 'fail safe', i.e. failure should not result in any security exposures. Tests should create failures which could be initiated either accidentally or deliberately in the live system, since an attacker who is caught could claim that a deliberate failure was an accident.

SCOPE

Failure mode testing overlaps with other test areas such as hardware and software testing, and the testing of contingency and disaster recovery plans and procedures. In this article, failure mode testing is defined to exclude failures in design or input data, which should be covered by hardware and software testing.

TEST PROCESS

It is essential that failure mode testing is carefully structured to be as effective and efficient as possible. Failure mode testing should be directed to those areas most likely to fail in an insecure way and should adhere to the general structure suggested for security testing in an earlier article in this series, in particular:

• System familiarization.

• Test scripting and logging.

• Problem reporting.

The test process consists of four phases, described below.

1. Identify points of failure

All the possible points of failure of the system should be identified. This process is best undertaken by starting at the highest level within the system and successively breaking it down into smaller components. For example an office LAN could be broken down into subgroups within the hardware and software areas as follows.

Hardware
    Terminals
        Screen
        Keyboard
        Mouse
        Processing units
        Storage devices
            Floppy diskettes
            Hard disk, etc.
    File servers
        Screen
        Processing unit
        Storage devices, etc.
    Print servers
        Screen
        Processing unit, etc.
    Printers
    PC communications card
    Interface between PC and communications card
    LAN cables and connectors
    LAN gateways

Software
    Local
    Central
    System
    Communications software

©1993 Elsevier Science Publishers Ltd

Once the lowest relevant level of component has been reached then the ways in which the component can fail should be considered. For example the PC communications card could fail in any of the following ways:

• Unrecoverable component failure which is likely to cause a complete communications failure for the PC.

• Intermittent component failure which is likely to cause spurious communications failures.

• Loose connection which is likely to cause intermittent communications failures.

• Loss of the LAN address which will result in a communications failure.

• Corruption of the firmware which could result in unpredictable behaviour.
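The breakdown described above can be sketched as a small data structure. This is an illustration only: the component and failure names come from the office LAN example, but the nested layout and the two helper functions (`points_of_failure`, `unanalysed_components`) are assumptions made for the sketch, and only part of the LAN is shown.

```python
# Illustrative sketch only. Component and failure names come from the
# office LAN example; the nested-dict layout is an assumption.

# A dict is a component that breaks down further; a list marks a
# lowest-level component and holds the failure modes identified for it.
OFFICE_LAN = {
    "Hardware": {
        "Terminals": {"Screen": [], "Keyboard": [], "Mouse": []},
        "PC communications card": [
            "Unrecoverable component failure",
            "Intermittent component failure",
            "Loose connection",
            "Loss of the LAN address",
            "Corruption of the firmware",
        ],
        "LAN cables and connectors": [],
    },
    "Software": {"Local": [], "Central": []},
}


def points_of_failure(tree, path=()):
    """Yield (component path, failure mode) pairs, walking top down."""
    for name, child in tree.items():
        here = path + (name,)
        if isinstance(child, dict):
            yield from points_of_failure(child, here)
        else:
            for mode in child:
                yield " / ".join(here), mode


def unanalysed_components(tree, path=()):
    """Yield lowest-level components with no failure modes recorded yet."""
    for name, child in tree.items():
        here = path + (name,)
        if isinstance(child, dict):
            yield from unanalysed_components(child, here)
        elif not child:
            yield " / ".join(here)


if __name__ == "__main__":
    for component, mode in points_of_failure(OFFICE_LAN):
        print(f"{component}: {mode}")
    print("Still to analyse:", list(unanalysed_components(OFFICE_LAN)))
```

Listing the components which have no failure modes recorded gives the tester a simple check that the breakdown has been completed before moving on to the effects.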

2. Identify the effects of failure

The possible effects of all of the identified failures should be defined. It is useful to identify the effects under the three security headings of confidentiality, integrity and availability. For example the following effects may be defined for an office LAN system which provides:

• Office facilities (e.g. word processing, spreadsheets, E-mail, etc.).

• A personnel performance recording system.

• Software and design documents for a proprietary application.

Confidentiality
    Expose sensitive information
        Expose personnel details
        Disclose password

Integrity
    Unauthorized data field change
        Modify personnel record
    Corruption of data
        Proprietary software design corrupted

Availability
    User disabled
        Password forgotten
        Password changed
    Terminal not available
        Communications failure
    System not available
        File server not available
        Communications failure
        System files corrupted (resulting in loss of the system)
    Data files lost
        Lose personnel database, etc.

3. Generate failure/effect matrix

Once the failures and effects have been defined a matrix may be generated with the failures on one axis and the effects on the other. Each failure/effect combination may then be scored in terms of:

• The likelihood of the failure causing the effect.

• The impact of the effect.

The scoring system should be kept as simple as possible; a scale of 0 to 5 for each parameter is recommended. The overall score for each failure/effect combination may then be derived as the product of the two scores. A section of the matrix for the office LAN described above may look as shown in Figure 1.

The figure shows three types of failure for the PC communications card on the LAN. For each of these failures three effects are defined (one under each of the headings confidentiality, integrity and availability). The likelihood of the failure causing the effect is scored in the top left hand box for each failure/effect combination. For example an unrecoverable component failure on the PC communications card is very likely to cause the terminal not to be available and so scores 5 out of 5. The impact of the effect is scored in the top right hand box. In the example, the impact of a terminal not being available is likely to be small since there will be other terminals on the LAN which could be used. The effect is therefore given a score of 2. The product of the two scores is placed in the bottom box.

PC communications card

                                   Conf.          Integ.         Avail.

                                   Expose         Unauth.        Terminal
                                   personnel      modif.         not
                                   info           personnel      available
                                                  details

Unrecoverable component failure    0 | 3          0 | 2          5 | 2
                                     0              0             10

Intermittent component failure     0 | 3          2 | 2          4 | 2
                                     0              4              8

Loose connection                   0 | 3          2 | 2          4 | 2
                                     0              4              8

Figure 1: Failure/effects matrix for Office LAN
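The scoring arithmetic can be sketched in a few lines. The likelihood and impact values are those of Figure 1; the names `IMPACT`, `LIKELIHOOD` and `score_matrix` are illustrative assumptions and not part of any tool described in this series.

```python
# Sketch of the scoring step. The scores are those of Figure 1; the
# variable and function names are illustrative assumptions.

# Impact of each effect, on the recommended scale of 0 to 5.
IMPACT = {
    "Expose personnel info": 3,
    "Unauth. modif. personnel details": 2,
    "Terminal not available": 2,
}

# Likelihood of each failure causing each effect, on a scale of 0 to 5.
LIKELIHOOD = {
    "Unrecoverable component failure": {
        "Expose personnel info": 0,
        "Unauth. modif. personnel details": 0,
        "Terminal not available": 5,
    },
    "Intermittent component failure": {
        "Expose personnel info": 0,
        "Unauth. modif. personnel details": 2,
        "Terminal not available": 4,
    },
    "Loose connection": {
        "Expose personnel info": 0,
        "Unauth. modif. personnel details": 2,
        "Terminal not available": 4,
    },
}


def score_matrix(likelihood, impact):
    """Return {(failure, effect): likelihood x impact} for every cell."""
    scores = {}
    for failure, effects in likelihood.items():
        for effect, chance in effects.items():
            for value in (chance, impact[effect]):
                if not 0 <= value <= 5:
                    raise ValueError(f"score out of range 0 to 5: {value}")
            scores[(failure, effect)] = chance * impact[effect]
    return scores


SCORES = score_matrix(LIKELIHOOD, IMPACT)

if __name__ == "__main__":
    for (failure, effect), score in SCORES.items():
        print(f"{failure} -> {effect}: {score}")
```

With two scales of 0 to 5, the overall score for a cell lies between 0 and 25, so the largest score in this section of the matrix (10) is well below the maximum.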



4. Selection of test areas

Once the failure/effect matrix is completed the test areas can be selected by referring to the scores. In practice there are unlikely to be many large final scores in the matrix. The tests may be listed in priority order based on the scores from the matrix. If there is limited resource available to undertake the testing then the tests may be executed in order of importance until the available resources are exhausted.
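The selection step can be sketched as follows. The effort figures and the budget are hypothetical values invented for the illustration, as is the function name `select_tests`; the sketch also assumes that combinations which score zero are not worth testing, which is a judgement the tester may wish to vary.

```python
# Sketch of test selection. Effort figures and the budget are
# hypothetical; only the idea of working down the scores in priority
# order until resources run out is taken from the text.

def select_tests(scores, effort_days, budget_days):
    """Return tests in descending score order that fit within the budget."""
    # sorted() is stable, so tests with equal scores keep their listed order.
    ranked = sorted(scores, key=lambda test: scores[test], reverse=True)
    selected = []
    remaining = budget_days
    for test in ranked:
        if scores[test] == 0:
            break  # assumption: zero-scoring combinations are not tested
        if effort_days[test] > remaining:
            break  # resources exhausted: stop working down the list
        selected.append(test)
        remaining -= effort_days[test]
    return selected


# Scores taken from the PC communications card example.
EXAMPLE_SCORES = {
    "Unrecoverable failure / terminal not available": 10,
    "Intermittent failure / terminal not available": 8,
    "Loose connection / terminal not available": 8,
    "Intermittent failure / unauthorized modification": 4,
    "Unrecoverable failure / expose personnel info": 0,
}

# Hypothetical effort, in tester days, for each test.
EXAMPLE_EFFORT = {
    "Unrecoverable failure / terminal not available": 2,
    "Intermittent failure / terminal not available": 3,
    "Loose connection / terminal not available": 1,
    "Intermittent failure / unauthorized modification": 2,
    "Unrecoverable failure / expose personnel info": 1,
}

if __name__ == "__main__":
    for test in select_tests(EXAMPLE_SCORES, EXAMPLE_EFFORT, budget_days=6):
        print(test)
```

The sketch stops at the first test which does not fit the remaining budget, so that a lower priority test is never run in place of a higher priority one.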

TEST AREAS

Failure mode testing is difficult to perform efficiently and effectively because it is often complex and requires the disruption of the normal operation of the component under test. Following the structured approach outlined above will assist the tester in defining the tests which should be performed. A few of the areas to focus on are outlined below.

1. PC card failures

PC cards may fail in a number of different ways, in particular:

• The contacts may fail causing spurious errors.

• Integrated circuits may fail.

• Batteries may fail or lose contact.

• Tamper resistant areas may fail to erase secret data under environmental extremes.

Cards which are usually of particular interest are:

• Communications cards (e.g. Ethernet, X.25).

• PC access control cards (e.g. PC Guard).

2. Communication line failures

Communications lines and connectors are susceptible to intermittent and complete failures, for example:

• Cables may be accidentally or deliberately cut resulting in loss of availability.

• Cables may be accidentally or deliberately damaged resulting in a short circuit or an intermittent connection.

• Connectors may become worn resulting in poor electrical contact and intermittent failures.

3. Access token failures

Tokens and their associated readers used for access control purposes may fail or become unreliable in operation through general wear and tear or deliberate abuse. Failures of access control tokens or readers could result in:

• Incorrect user identification.

• Failure to acknowledge termination of a user session or the commencement of a new session.

4. Failure of storage media

Storage media may fail completely, intermittently or partially. For example the read/write head on a disk pack may 'crash' rendering the pack unusable, the connectors to the storage media may become worn resulting in intermittent read/write errors or sectors of a disk may become corrupted and unusable. If the storage media is used for a critical security function (e.g. access control lists or audit trail logs) then the failure may result in security exposure (e.g. audit logging may cease or a user may be allowed access to applications to which access is usually denied).
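A failure of this kind can often be simulated in software without damaging any equipment. The sketch below is a minimal illustration, assuming a hypothetical service which writes an audit entry before performing each action; all of the class and function names are invented for the example. The simulated disk failure is injected by substituting a log object whose write always fails, and the test checks that the service refuses the action instead of performing it unaudited.

```python
# Minimal fault-injection sketch. All names are hypothetical; the
# service stands in for whatever system is under test.

class AuditLogUnavailable(Exception):
    """Raised when an action is refused because it cannot be audited."""


class MemoryLog:
    """A working audit log, held in memory for the purposes of the test."""

    def __init__(self):
        self.entries = []

    def write(self, entry):
        self.entries.append(entry)


class FailingLog:
    """Simulates an audit log whose disk drive has failed."""

    def write(self, entry):
        raise OSError("simulated disk failure")


class Service:
    """Performs actions, writing an audit entry before each one."""

    def __init__(self, audit_log):
        self.audit_log = audit_log
        self.performed = []

    def perform(self, user, action):
        try:
            self.audit_log.write(f"{user}:{action}")
        except OSError as exc:
            # Fail safe: refuse the action rather than run it unaudited.
            raise AuditLogUnavailable("audit log write failed") from exc
        self.performed.append(action)
        return "done"


def refuses_when_log_fails(service):
    """Return True if the service fails safe when its audit log fails."""
    try:
        service.perform("tester", "update record")
    except AuditLogUnavailable:
        return not service.performed  # nothing may have been carried out
    return False  # the action went ahead: not a fail-safe result


if __name__ == "__main__":
    print(refuses_when_log_fails(Service(FailingLog())))
```

The same pattern applies to access control lists: the store is replaced by a failing stand-in and the test confirms that access is denied, not granted, while the store is unavailable.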

5. Cryptographic process failures

The cryptographic processes in a system may fail completely or intermittently and messages (which may include encryption keys) may be passed in cleartext if the failure is not properly detected by the system. Encryption keys may be compromised if tamper resistant areas fail to operate correctly and erase the secret information when attacked.
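The cleartext exposure described above can be tested by injecting a failed encryption step and observing what reaches the line. The sketch below is illustrative only and uses no real cryptography: `send_encrypted` is a hypothetical sending routine, `placeholder_encrypt` is a toy transformation standing in for a cipher, and the two faulty stand-ins simulate a complete failure and an undetected pass-through.

```python
# Illustrative sketch only. No real cipher is used: placeholder_encrypt
# is a toy stand-in, and every name here is hypothetical.

def send_encrypted(message, encrypt, transmit):
    """Transmit the encrypted message; send nothing if encryption fails."""
    try:
        ciphertext = encrypt(message)
    except Exception:
        return False  # complete failure detected: nothing is sent
    if not ciphertext or ciphertext == message:
        return False  # output unchanged or empty: treat as a failure
    transmit(ciphertext)
    return True


def placeholder_encrypt(message):
    """Toy transformation standing in for a cipher. NOT secure."""
    return bytes(byte ^ 0x5A for byte in message)


def broken_encrypt(message):
    """Simulates a complete failure of the cryptographic process."""
    raise RuntimeError("simulated encryption unit failure")


def passthrough_encrypt(message):
    """Simulates a failure which returns the message unchanged."""
    return message


def cleartext_leaked(encrypt, message=b"session key 1234"):
    """Return True if the cleartext message reached the line."""
    line = []
    send_encrypted(message, encrypt, line.append)
    return message in line


if __name__ == "__main__":
    for fault in (broken_encrypt, passthrough_encrypt):
        print(fault.__name__, "leaked cleartext:", cleartext_leaked(fault))
```

Comparing the output with the input catches only the crudest pass-through failure; it is included to show the shape of the test and is not a sufficient check for a real system.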

ISSUES

There are a number of issues which a tester should bear in mind when embarking on failure mode testing; they are listed below.

1. Minimize destructiveness

Many failure mode tests are destructive in nature. Wherever possible the tester should seek to simulate the failure without causing any irreparable damage to the component under test. However in some circumstances this approach may not be possible and the tester should consider whether it is appropriate to undertake the test on a real system or attempt a theoretical analysis.

2. Dedicated test system

Many of the tests will cause serious disruption to the system under test. It will be necessary to use a dedicated test system to avoid disrupting any other testing programmes.

3. Gain management support

It is essential to gain management support for failure mode testing because of the two issues described above. The tester is unlikely to be able to get the level of resourcing to successfully undertake the tests until management are thoroughly convinced of the value of the testing. Testers should undertake failure mode testing only on those systems where it is appropriate and for which a convincing case for testing can be made. Testers should also reassure management that they will minimize the level of disruption to other test programmes. Management should be kept informed throughout the test process to ensure that they understand what is happening. Useful methods for keeping management informed include information feedback sessions, and distribution of the test specification and the test report.

4. Realistic tests

The tester should be realistic about the level of failure mode testing required. It is very easy to get carried away with esoteric tests which simulate failures which are unlikely to occur in practice (whether initiated accidentally or deliberately). The tester should continually ask the questions: "How could this failure occur?", and "How likely is the failure in reality?".

This article has described the testing of systems using failure mode techniques. The next article will examine the specification, development and use of test tools.

Bernard Robertson is a principal consultant in the Security Consulting Practice of PA Consulting Group. He has extensive experience in performing a range of security testing programmes for public and financial sector clients. Bernard is a regular speaker on IT security issues and holds degrees in economics and business administration. David Pullen is a senior consultant within the same Security Consulting Practice. Over the last five years he has conducted several security testing projects, including one lasting two years with a team of 15 security testers. David is a physics graduate and a qualified teacher who has produced a wide range of educational material on security testing.

RISK ASSESSMENT FOR EDI IMPLEMENTATION

Dr Brian S Collins

Background

Risk analysis methodologies are well established in the IT security world. Their application to communication systems however is less mature and even less so in the context of value added services such as EDI. The principles are nevertheless much the same and the topics

