An evaluation of evaluation: problems with performance measurement in small business loan and grant schemes

Annabel Jackson

Annabel Jackson Associates, 52 Lyncombe Hill, Bath BA2 4PJ, UK. E-mail address: [email protected]

Abstract

This project researched the problems of evaluating public sector small business loan and grant schemes. The methodology was to use case studies of four loan and grant schemes run by enterprise agencies on contract from task forces or city challenges. These case studies were selected following a mapping exercise of 27 finance schemes in north east London.

Research showed that evaluation relied on performance indicators. The main measures used were deployment of funds, job creation, leverage of additional funds, lender of last resort, ethnicity of recipients, default rate, and enquiry levels. Four technical problems with these performance indicators were found. First, the interpretation of indicators varied between organisations. Second, performance indicators judged agencies on areas which were outside their control. Third, indicators failed to take account of the full range of work involved in managing a loan and grant scheme. Fourth, lender of last resort and leverage were incompatible in their preferred risk position and no attempt was made to reconcile the two.

That these technical weaknesses were not taken into account in the weight given to performance indicators led to four organisational effects. First, agencies seemed to have made their own attempt to render the indicators meaningful, reflecting the conditions under which the schemes had been first established. The existence of strong political, auditing, and time pressures led to three management styles designated as client oriented, bureaucratic, and entrepreneurial. Second, the system encouraged managing agencies to distort figures to produce more favourable results. For instance, rescheduling of debts allowed the agency to avoid recording high levels of default. Third, the focus of information gathering on external needs (those of the funders) seemed to have discouraged agencies from contemplating and developing their own data systems. This might reflect either a lack of resources or a fear that any information could be used against them.

Several possible explanations for these problems are explored in this report. Objectives were ambiguous because of the under-development of programme theory. Data was not always available because its collection was not of value to the enterprise agency. Communication between funders and agencies was poor because funders had come to see their main power as residing in the re-tendering of contracts.

These points present part of the explanation, but a further level of analysis is possible. That theory is under-developed, data unavailable, and communication weak are symptoms of a deeper problem. This is the problem that performance management is operating under a positivist theory of knowledge. Several possible improvements to the evaluation of small business loan and grant schemes are explored in this paper. Key among these is an acknowledgement of the limitations of evaluation through adopting a developmental approach within a culture of organisational learning. This paper illustrates the value of adopting a developmental approach, but also the fundamental transformation which it implies.

"With each new evaluation, the evaluator sets out, like an ancient explorer, on a quest for useful knowledge, not sure whether seas will be gentle, tempestuous, or becalmed. Along the way the evaluator will often encounter any number of challenges: political intrigues wrapped in mantles of virtue; devious and flattering antagonists trying to co-opt the evaluation in service of their own narrow interests and agendas; unrealistic deadlines and absurdly limited resources; gross misconceptions about what can actually be measured with precision and definitiveness; deep-seated fears about the evils-incarnate of evaluation, and therefore, evaluators; incredible exaggerations of evaluators' power; and insinuations about defects in the evaluator's genetic heritage" (Patton, 1997: 38; Utilisation Focused Evaluation).


CHAPTER 1

Introduction

1.1. The research question

This project investigates the problems of evaluating public sector small business loan and grant schemes. It leads to recommendations about ways to improve the meaningfulness, validity, fairness, usefulness, and practicality of evaluation.

1.2. Research methodology

There are three elements to the research methodology. First, a literature review of evaluation theory was used to derive hypotheses on the potential problems of evaluating loan and grant schemes. Second, a mapping exercise of all finance schemes was carried out for the Hackney area. This put together information on the character, conditions, operation, performance, and procedures for public sector loan and grant schemes active in and around Hackney. The mapping exercise had an additional role in providing background on evaluation techniques used and the interpretation of performance indicators. Third, from the mapping exercise, four loan and grant schemes were selected for case study to test the hypotheses on evaluation derived from the literature review.

The fieldwork for the case studies comprised interviews with the scheme managers and funders, compilation of data on each of the loan holders, interviews with 24 loan holders across two of the schemes, and analysis of written material on the loan and grant schemes.

1.3. Definition of evaluation

"Evaluation is the process of determining the merit, worth and value of things, and evaluations are the products of that process" (Scriven, 1991: 1).

Scriven (1991) sees evaluation as a transdiscipline, wider than one area of applied social science. It provides basic tools which span disciplines, rather in the manner of logic, design, and statistics. Evaluation combines two processes. Compiling, analysing, and simplifying or standardising data is only the first step in evaluation. The second step inevitably involves the imposition of values or standards. Scriven sees applications such as programme, personnel, product, and material evaluation as branches of the core discipline.

However, as Chelimsky (1997) argues, evaluation is also in essence action-oriented. She describes three broad purposes for evaluation. The accountability function judges the impact of a programme, its efficiency, and effectiveness. The development function reflects on the operation of a programme and provides recommendations for improvement. The knowledge function contributes to the generation of a pool of knowledge about social (or economic) phenomena.

This project concentrates on the sub-area of programme evaluation while also recognising the intellectual and practical gains from making links to the sister applications identified by Scriven.

1.4. Public sector small business loan and grant schemes

Within the field of programme evaluation, loan and grant schemes are particularly interesting. Finance is one of the most common weaknesses of small firms, whether through under-capitalisation or cash flow problems. Storey (1993: 1) comments that: "The idea that problems in the financing of smaller firms have significantly hindered the role they play in the overall performance of the UK economy is deeply rooted."

Loan and grant schemes present a microcosm of financial and non-financial pressures on economic development. Inayatullah and Birley (1996: 7) draw attention to the ambiguity in the World Bank's objectives, which call for application of the fundamental principles of finance but also commitment to reaching the poor. They observe that: "There exists no standard prescription in the literature for setting up and operating micro-credit schemes, most of which have to tread the narrow path between charity and business." Loan and grant schemes have a discrete decision-point (approval or rejection) in which financial and non-financial pressures have to be reconciled. Other economic development schemes do not have such a clear outcome.

This distinct character means that the approach to evaluation currently carried out by public sector agencies starts from a different basis from that applied to other policy areas which do not have a discrete decision-point (for example, training or business advice). The conclusions on loan schemes might not, then, be representative of other policy areas.

1.5. Structure of the project report

This report adopts the framework of Shadish et al. (1995). They identify five components of programme evaluation:

• Appropriate phrasing of the social problem which the public sector programme seeks to alleviate (Social Programming).
• Generation of valid information about the programme (Knowledge Construction).
• Application of appropriate value judgements (Valuing).
• Influence over appropriate policy decisions (Use).
• Organisation of work within resource constraints (Practice).

These five headings broadly correspond with the requirements for evaluation to be meaningful, valid, fair, useful, and practical. Four of the five headings map onto the criteria advocated by the American Program Evaluation Standards (Joint Committee on Standards for Educational Evaluation, 1994): accuracy, propriety, utility, and feasibility. The fifth element, Social Programming, is not identified in the Evaluation Standards. This term will be renamed 'Socio-economic Programming' for the purposes of this project, to reflect the focus on economic development rather than social policy.

Fig. 1 shows the four stages to the project. The literature review is described in Chapter 2, and the empirical work in Chapter 3. Implications from the empirical work are discussed in Chapter 4, before making conclusions and recommendations in Chapter 5.


Fig. 1. Diagram of project structure.


CHAPTER 2

Literature review on evaluation problems

2.1. Introduction

This literature review draws on work across the field of programme evaluation. The review had three elements: readings from the main American theorists, Scriven (1976, 1991, 1994, 1996), Campbell and Stanley (1963), Campbell and Boruch (1975), Weiss (1973, 1980, 1983, 1988), Wholey (1981), Stake (1978, 1981, 1995), Rossi et al. (1993), Chelimsky (1987), Chelimsky et al. (1997), Guba and Lincoln (1989), House (1978, 1980, 1993), House et al. (1996), Chen and Rossi (1981), Chen (1990) and Patton (1990, 1996); a review of the main evaluation journals (Evaluation, Evaluation and Program Planning, Evaluation Review, Evaluation Studies Review Annual and Evaluation Practice); and a review of policy related journals such as Fiscal Studies, Local Economy, Local Government Studies, Municipal Journal, Local Government Chronicle, International Journal of Public Sector Management, Public Administration, Public Money and Management, and Policy Studies.

American evaluation theory has a long history, dating back to the large-scale social experiments of the 1960s (House, 1993). The American literature is well developed but divided. Evaluation has been bedevilled by intense 'paradigm wars' (Pawson and Tilley, 1997) between social science-based evaluators who give priority to knowledge construction and stakeholder-based evaluators who give priority to use. Newman and Brown (1996) argue that this conflict mirrors fundamental ethical dilemmas inherent in evaluation: the evaluator's choice of autonomy (which is taken to run together with justice and fidelity to users or other stakeholders) versus fidelity to the client. Shadish et al. (1995) emphasise that evaluation theory and practice must deal with both knowledge construction and use, as well as developing the relatively neglected areas of programme theories, value frameworks, and guidelines about practical trade-offs in evaluation design.

In Britain, evaluation has only reached widespread importance since the Financial Management Initiative (FMI) introduced by the Thatcher Government in 1982. Increased use of evaluation in the public sector reflects four factors. First, performance management helped justify cuts in public spending. "Calling for efficiency improvements through the better management of performance allowed the government to cut public expenditure without necessarily advocating or, more significantly, being seen to advocate service level depletion, a process facilitated by the politically irresistible 'value for money' tag. It was difficult to oppose the concept of value for money without seeming to advocate or at least defend waste and inefficiency" (Ball and Monaghan, 1996: 40). Second, performance management was consistent with the government's desire for greater accountability. Performance indicators were introduced to increase control over decentralised activities and, more recently, to help with the management of compulsory competitive tendering (Ball and Monaghan, 1996: 42). Agency theory shows how monitoring is needed where there are 'hidden actions' or 'information asymmetries' which expose principals to 'moral hazards' or 'adverse selection' (Wallace, 1980). Third, performance indicators were part of an attempt to increase customer focus and improve quality (Ghobadian and Ashworth, 1994: 35). Fourth, performance indicators have been used positively by councils to help focus effort on matters of strategic importance. "A well-designed performance review system is one method of operationalising manifesto commitments and thus has attraction for members" (Ball and Monaghan, 1996: 42). Use of performance indicators went hand in hand with the increased politicisation of local government.

Performance management refers to "an integrated set of planning and review procedures which cascade down through the organisation to provide a link between each individual and the overall strategy of the organisation" (Rogers, 1990: 16). This is a sub-set of the area included within the field of evaluation. In Britain, the remit has been further narrowed with the Audit Commission's focus on the three Es: economy (the extent to which inputs are minimised), efficiency (the relationship of the output of an activity or organisation to the associated inputs), and effectiveness (the extent to which outputs contribute to final objectives). The concept of value for money, which refers to the social and economic benefit of an activity in relation to cost, has in practice been used as a shortened term for the three Es (Cave et al., 1990: 42). Economy and efficiency are more easily measured than effectiveness, and these two categories of performance indicators have come to dominate evaluation. As Rogers (1990: 48) explains: "There is plenty of available data about resources and about the quantity of service provided. There is much less data available about the quality of service and its effects on consumers and the public generally."

In an effort to strengthen the treatment of quality and effectiveness, the Audit Commission (1989) has clarified that outputs (the use made of resources or the service actually delivered to the public) do not necessarily lead to outcomes (the ultimate value or benefit of the service to its users). Nor can changes in a social or economic system be attributed unequivocally to programme intervention: some activities or expenditure would have occurred without the programme (deadweight); some activities or expenditures are offset through the programme (displacement). Nor are all changes equally desirable. Evaluators have added a fourth 'E', equity, to cover distributional effects relative to need.
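To make the three Es concrete, the sketch below computes them for a hypothetical scheme. The function name, the figures, and the assumption that effectiveness can be read off as a share of outputs are all illustrative; nothing here is taken from the study or from Audit Commission guidance.

```python
def three_es(budgeted_cost, actual_cost, outputs, effective_outputs):
    """Compute the three Es for one hypothetical scheme.

    economy:       how far actual inputs undercut budgeted inputs (>1 is better)
    efficiency:    outputs per unit of input
    effectiveness: share of outputs contributing to final objectives
    """
    economy = budgeted_cost / actual_cost
    efficiency = outputs / actual_cost
    effectiveness = effective_outputs / outputs
    return economy, efficiency, effectiveness

# Hypothetical figures: a scheme budgeted at £100,000 spends £90,000,
# supports 45 businesses, of which 30 meet the scheme's final objectives.
eco, eff, effect = three_es(100_000, 90_000, 45, 30)
print(f"economy {eco:.2f}, efficiency {eff:.5f} per £, effectiveness {effect:.2f}")
```

Note how economy and efficiency fall straight out of the accounts, while effectiveness requires a judgement about which outputs count towards final objectives: the asymmetry the text describes.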

2.2. Socio-economic programming

2.2.1. Introduction

"A social problem is a social construction" (Berk and Rossi, 1990: 39).

"A social or intervention program is the purposive and organised effort to intervene in an ongoing social process for the purpose of solving a problem or providing a service. The questions of how to structure the organised efforts appropriately and why the organised efforts lead to the desired outcomes imply that the program operates under some theory. Although this theory is frequently implicit or unsystematic, it provides general guidance for the formation of the program and explains how the program is supposed to work" (Chen, 1990: 39).

"All policies involve assumptions about what governments can do and what the consequences of their actions will be. These assumptions are rarely spelt out, but policies nevertheless do imply a theory (or model) of cause and effect" (Hogwood and Gunn, 1984: 18).

Socio-economic Programming deals with the conceptual basis for defining socio-economic problems and designing public policy programmes. This is also known as 'substantive knowledge' (Rossi in Chen, 1990), or 'theory-driven evaluation' (Chen, 1990). The starting point is the argument that public sector policy can be seen as based on a set of assumptions which together have the structure of a theory. This view has been expressed not only by evaluators (Berk and Rossi, Chen), but also by policy analysts (Hogwood and Gunn).

Chen (1990) distinguishes two elements of programme theory: normative and causative theory. Normative theory examines the ideal character ('treatment evaluation'), context ('implementation environment evaluation'), and effects of the programme ('outcome evaluation'). This comprises an analysis of the goals and objectives for the programme, including intermediate objectives. Causative theory examines the observed effects ('impact evaluation'), processes linking the treatment and the outcome ('intervening mechanism evaluation'), and wider relevance of the programme ('generalisation evaluation'). This is about the mechanisms linking objectives with programme effects, including indirect impacts.

The two main problems of Socio-economic Programming follow this division into normative and causative theory. First, objectives have been found to be ambiguous. Second, the theoretical assumptions underpinning policy are usually unstated.

2.2.2. Normative theory: clarification of objectives

Vague objectives are common in public programmes: "The goals of social programs are often global, diffuse, diverse and inconsistent, vary over stakeholders, and may have little to do with program functioning" (Shadish et al., 1995: 184, commenting on the work of Weiss). Chen (1990: 90) attributes this weakness to the conflicting roles which goals or objectives must serve. He identifies six roles: legitimising the existence of the programme; binding political coalitions; levering resources; enabling a budget to be allocated; providing criteria for performance appraisal; and directing implementation. The first three of these are political rather than operational roles, and employ ambiguity to smooth over conflicts and bind coalitions. By the time policy is implemented, the gap between objectives and practice may have widened because of changes in external conditions, translation of abstract principles into practical guidelines, and the exercise of personal autonomy by project managers (Chen, 1990: 177).

Evaluation theorists differ in their views on the importance of having clear objectives. At one extreme, Wholey (Shadish et al., 1995: 237) thinks programmes should be tested for evaluability before full-scale evaluation is started. Programmes without clearly defined objectives are dismissed as unevaluable. At the other extreme, Scriven (1976) argues for a process of 'goal-free evaluation'. He sees goals as a source of bias and wants evaluators to investigate the impact of programmes without any knowledge of them, in what he calls 'a step beyond double-blind methodology'. Between these extremes, Patton (1996) defines goals at an operational level by asking project managers about the intended impact on clients, and Chen (1990) identifies 'plausible goals' by looking at which activities attract the most resources. These two would be described by Etzioni (1960) as focusing on the 'real' rather than the 'formal' goals.

Clarifying objectives can highlight questionable or inconsistent assumptions, differential expectations of stakeholders, unnecessary activities, and weak links in implementation.

2.2.3. Causative theory: identification of linking mechanisms

Chen (1990: 18) argues that refinement and analysis of the conceptual foundations of programmes have been neglected because methodologists have taken a black box approach focusing on impact, looking at 'the overall relationships between the inputs and outputs of a program without concern for the transformation processes in the middle'. Bickman (1987, in Chen, 1990: 29) gives four advantages of specifying programme theory. First, evaluators can distinguish between programme failure due to an invalid theory and failure due to incorrect or incomplete implementation. Second, understanding the connections between a programme's operations and its effects helps the evaluator to anticipate direct and indirect impacts. Third, identification of intermediate effects of a programme can be used by the evaluator to give early feedback of implementation problems. Fourth, information on the processes underpinning programmes is of greater practical relevance to programme managers than general statements about impact.

2.2.4. Developing theory

Ways of improving Socio-economic Programming are given by Chen (1990) in his Theory-Driven Evaluation, and Pawson and Tilley (1997) in their Realistic Evaluation. Chen (1990: 226) lays down principles to help identify those key components of the programme where theory most needs to be developed. He argues that an understanding of programmes develops by testing whether the theories underlying the 'research system' still apply in the 'generalising system'. This is analogous to looking at whether processes in statistical samples are representative of those in their wider population.

Pawson and Tilley (1997: 119) see Chen's approach as being about 'the continual betterment of practice' as opposed to the experimentalists' goal of 'the secure transferability of knowledge'. They take Chen's work as a basis to develop their own scientific realist approach: that social phenomena have an existence beyond the constructions placed upon them.

Pawson and Tilley (1997) provide an iterative structure to build up a theoretical understanding of programmes in terms of their mechanisms, elements, contexts, and outcomes. Their version of scientific realism argues that programmes work only in certain forms, for certain people, and in certain contexts. The aim of evaluation is to provide continuous feedback in ever finer detail about the conditions under which policy succeeds or fails (Fig. 2). Pawson and Tilley (1997: 22) argue that traditional positivist approaches, which infer cause from co-variance between aggregate variables, ignore conditional and contingent factors. Quoting Guba and Lincoln (1989: 60), they suggest that: "Experimentation tries to minimalise all the differences (except one) between experimental and control groups and thus 'effectively strips away the context and yields results that are valid only in other contextless situations.'"


Realistic Evaluation strengthens Socio-economic Programming in three ways. First, programmes are seen to be composed of different sub-processes. These sub-processes differ in their effectiveness and in their compatibility with different environments. Second, social and economic phenomena are seen to have multiple causes which act synergistically rather than individually. This can produce a complex pattern whereby individual programme effects cancel each other out. Third, Pawson and Tilley restore human agency through asking about the possible motivations of programme participants. Programmes are seen as 'offering chances which may (or may not) be triggered into action by the subject's capacity to make choices.' This is a conceptual, and possibly an ethical, advance from seeing programmes acting upon passive subjects.

2.3. Knowledge construction

2.3.1. Introduction

Knowledge Construction is the technical side of evaluation concerned with the production of information (Shadish et al., 1995). It provides guidance on the appropriate techniques for obtaining information (methodology), the nature and origins of that information (epistemology), and the status of the information created, whether real or constructed (ontology). Knowledge Construction is the most well developed facet of evaluation. Lessons from applied social research, with which evaluation has much in common, have contributed to the body of understanding.


Fig. 2. The realist evaluation cycle.


Conflict between science-based and stakeholder-based evaluators is mirrored in a split between positivists and constructionists, and between quantitative and qualitative researchers. The framework of Shadish et al. is intended to bridge this gap by advocating an eclectic methodology tailored to the specific circumstances of each evaluation.

Knowledge Construction in the UK focuses on performance management and, within this, on the use of performance indicators. Performance indicators are defined as 'quantitative or qualitative measures of inputs, throughputs or outputs' (Smith and Walker, 1994).

2.3.2. A critique of performance indicators

Discussion about the value of performance indicators is an important element of the British literature on evaluation. Criticisms about the use of performance indicators include the following.

• Subjectivity. Public services do not always lend themselves to quantification. Performance indicators frequently use nominal data for variables which are in practice continuous. Measurement of performance in public services is complicated by the inherent characteristics of services: intangibility, inseparability, variability, and perishability (Kotler, 1994). Stewart and Walsh (1994: 47) conclude that technical solutions will never be sufficient to overcome these inherent tensions: "Many of the apparent technical difficulties reflect the impossibility of regarding provision as the equivalent of sale."

• Insufficient attention is given to the use and interpretation of indicators. "It has frequently been assumed that indicators will simply 'speak for themselves'" (Strand, 1997: 146). Inaccurate, out-of-date, or irrelevant data have been used and dubious comparisons made. Public sector agencies typically vary in their number of objectives, priorities, history, local need, environment, pay-back period and, indeed, definition of performance indicators. Generalising across different organisations therefore needs to be done with care.

• Failure to isolate additionality. Additionality is defined by the Treasury (1988) as: "the amount of output from a policy as compared with what would have occurred without intervention." The aggregate picture presented by indicators has been accepted uncritically without separating out effects which can rightfully be attributed to policy. A specific element of this is the failure to take account of different starting levels and therefore different added value in individual programmes. (A minimal sketch of this calculation follows the list.)

• Simplification. Some activities are more amenable to measurement through performance indicators and have therefore been given greater attention. The Audit Commission (1992) recognises that: "the danger with the very simple indicators is that they over-simplify reality, or put an excessive weight on the features of the services that happen to be easy to measure." As already mentioned, measures of economy (cost) have generally won over measures of effectiveness (achievement of objectives).

• Lack of local ownership. Gray (1997) complains that the emphasis on quantitative indicators serves to marginalise community organisations from the planning and implementation of policy. Strand (1997: 146) observes that in education "The suspicion remains that PIs are something which is done to schools rather than for them or with them."

• Neglect of other forms of accountability. The introduction of government-determined performance indicators has tended to increase central accountability at the expense of local accountability, and accountability over outcomes at the expense of accountability over process. "Much of the accountability that can be identified is a post hoc accountability for what has happened. There is little evidence of the instalment of the necessary control systems to ensure initial quality assurance" (Glynn and Murphy, 1996: 132).
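As a minimal sketch of the additionality adjustment flagged in the list above, the snippet below nets gross output down for deadweight and displacement (terms defined earlier in this chapter). The figures and the share-based helper are illustrative assumptions, not the Treasury's formulation.

```python
def additional_output(gross_output, deadweight_share, displacement_share):
    """Output attributable to the policy, net of deadweight (what would
    have occurred anyway) and displacement (activity merely shifted from
    elsewhere). Shares are expressed as fractions of gross output."""
    return gross_output * (1 - deadweight_share - displacement_share)

# A scheme reporting 50 gross jobs, where surveys suggest 40% would have
# been created anyway and 10% displace jobs in neighbouring firms:
print(additional_output(50, deadweight_share=0.4, displacement_share=0.1))  # 25.0
```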

Stewart and Walsh (1994: 45) argue that public sector performance measurement is inherently problematic: "The principles that we use to measure performance in the public sector are not so much applied in the judgements that we make of service quality, but are derived from them. These judgements are more like a judicial activity than engineering measurement, being based on criteria that arise from the accumulation of a series of cases. The dilemma of performance management in the public domain therefore is to secure effective performance when the meaning to be given to it can never be completely defined, and the criteria by which it is judged can never be finally established."

2.3.3. Organisational implications of performance indicators

The adverse impact of evaluation on an organisation has been recognised for some time. "The more any social indicator is used for social decision making, the greater the corruption pressures upon it" (Campbell, 1979: 85). Gray (1997) uses the example of the 'production fetishism' of the Soviet planning system to illustrate the 'gaming' effects of output monitoring. Output measures distort organisational activity by valuing some behaviours over others. Five problems result:

• 'Creaming' is encouraged. In Russia, production targets encouraged over-production of poor quality goods. For example, when targets for glass production were defined in tons, the factory had an incentive to produce undesirably thick glass, but when the target was changed to square metres, the glass was made too thin. Similarly, the use of output targets has resulted in a form of 'target fetishism': 'a concern with targets which threatens to become detached from the social purpose of the policies at stake'. In economic development this is reflected in support being concentrated on individuals most likely to generate outputs, rather than those most in need, and focusing on sectors most likely to produce outputs or evidence of outputs. OECD (1988) has found 'creaming' to be widespread in training schemes across the world.

• Additionality is neglected. 'Wherever the output measure is the number of beneficiaries, there is a risk that the jam will be spread too thinly, in order to increase this number' (Gray, 1997).

• Innovation is discouraged. 'Innovation involves risk and risk is not rewarded' (Soviet quote in Gray, 1997).

• Short-termism is encouraged. Projects with a quick payoff will tend to be preferred in a system which needs to prove the value of the current year's expenditure by achievements before year-end.

• Over-counting or double counting is rewarded. Programmes often overlap in their area of operation. Failure to isolate the specific contribution of individual agencies can lead to haphazard attribution of outputs and double-counting.


These problems can be contained through precise definition of performance indicators, but this demands a specification of public sector performance which might not be possible at the start of the programme.

2.3.4. Improving the use of performance indicators

Likierman (1993) used results from a survey of 500 managers in the public sector to devise twenty lessons for the use of performance indicators (PIs). His main recommendations were:

• Include all elements of performance. Those devising measures should ensure they do not omit, and therefore undervalue, an essential activity of work.
• The number of PIs should be appropriate to the organisation and its diversity. "Too many will make it difficult to focus on what is important, too few will distort action."
• Provide adequate safeguards for 'soft' indicators, particularly quality. "Quality has proved notably difficult for organisations to measure, and great care needs to be taken to give it proper weight."
• Acknowledge political and organisational purposes. Some measures might be included to enhance the credibility of a service rather than to deepen understanding.
• Build in counters for short-term focus. Indicators set on an annual cycle are likely to focus management effort on the short term.
• Ensure PIs fairly reflect the efforts of managers. Performance indicators should not lie outside the control of the organisation under evaluation.
• Link PIs to existing management systems. New indicators which run in parallel with existing systems can appear onerous or irrelevant.
• Ensure PIs are understandable to those whose performance is being measured. Performance indicators need to be trusted and understood if they are to have an impact on action.
• The data on which results are based must be trusted. Likierman found considerable concern that staff did not appreciate the importance of filling in the forms carefully and were therefore producing data which was unreliable.
• Use the results as guidance, not answers. Recognise that interpretation is the key to action. "The results should always be accompanied by a commentary so that the figures can be analysed and put in context."
• Recognise trade-offs and complex interactions between different elements of performance. Not all PIs should carry equal weight.

There are three main conclusions from this analysis of performance management. First, the quality of data collection for performance indicators should be improved. This is likely to mean focusing on a small number of important indicators, each representing different dimensions of strategic interest. Second, comparisons should be made carefully, and possibly within the framework of a classification which groups together similar organisations or operating circumstances. Third, performance indicators should be analysed over time as part of a broad learning process which pulls in other (especially qualitative and non-financial) information. Performance indicators should be used to identify issues for further investigation rather than to give an immediate judgement about success or failure.

This third point is the most far reaching. It is clear that many authors want performance management to move from a focus on accountability towards development and knowledge functions:

• "The focus is on systematic thinking, fundamental structural change and organisational learning, rather than mindless target-setting, continual fire-fighting, the rigorous allocation of blame" (Meekings, 1995: 5).
• "When located within a strategic-management perspective the performance measurement of public services is seen to be necessary but not sufficient for improved management practice; and a means of learning rather than a means of control" (Jackson, 1993: 14).
• "Rather than 'dials on the dashboard of a car', performance indicators are most helpful when viewed as 'tin-openers', leading to further examination and enquiry" (Carter et al., 1992).
• "Comparisons are needed, not to provide definitive answers, but to highlight issues which need to be debated" (Audit Commission introduction to the first national publication of council performance indicators, 1995).
• "Over-reliance on the 'control' model of evaluation may create unnecessary defensiveness on the part of the agency or department concerned. Arguably, evaluation should represent a learning process for all concerned rather than just a one-off judgement on the efficacy of any scheme" (McEldowney, 1997).

Van der Knaap (1995) shows how evaluation can contribute to organisational learning. He breaks learning into three separate components, each of which benefits from evaluation:

• Corrective learning. Evaluation can provide feedback showing whether implementation is in line with plans (what Argyris and Schön, 1978 call single loop learning).
• Cognitive development through a process of refining schemata. Evaluation can provide stimulus and inspiration (double loop learning, or what Weiss calls the 'enlightenment function' or 'conceptual use').
• Social learning through dialogue and debate. Evaluation can provide a forum for communication and the development of alliances.

This contribution to organisational learning will not be easily achieved, however. Literature on learning organisations illustrates the kinds of changes to organisational culture and style which might be needed when introducing a developmental approach. Senge (1990) emphasises that learning organisations must be able to tolerate failure ('experimental mind set') and be prepared to expose and change their fundamental assumptions ('metanoia'). These comments echo earlier work from Zuboff (1988) about the need to use information for understanding rather than reward or punishment, to empower not disempower people ('informating'). Pedlar et al. (1991) proclaim that, in the learning organisation: "departmental and other boundaries are seen as temporary structures that can flex in response to changes in the environment" ('enabling structures'); 'company policies reflect the values of all stakeholders not just those of top management' ('participative policy-making'); and strategies should be seen as "conscious experiments rather than set solutions" ('learning approach to strategy'). Nevis et al. add instructions for the learning organisation to continuously monitor the external environment ('scanning imperative').

2.4. Valuing

2.4.1. Introduction

"Evaluation should not only be true; it should also be just" (House, 1978: 76).

This third, relatively neglected, facet of evaluation concerns the choice of criteria and the values on which this choice is based.

There are two broad ways of dealing with values in evaluation. The descriptive approach presents the values of the participants in a programme without elevating one set of values over others. The prescriptive approach endorses particular values derived either from one of the programme stakeholders, or from an abstract ethical system like utilitarianism or social justice (Shadish et al., 1995).

Variants of these two are possible. For instance, Karlsson (1996: 410) advocates negotiating a consensus on values across the stakeholders. Shadish et al. (1995) suggest constructing alternative value summaries: "If X is important to you, then evaluand Y is good for the following reasons." Other authors advise evaluators to use their own personal values but to be open about the basis for these.

"The difficulty of integrating together different measures is that values are at stake in the weights to be given and values, and the weights to be given to them, can always be subject to discourse and dispute in the public domain" (Stewart and Walsh, 1994: 48). As this quotation illustrates, valuing presents two fundamental problems for the evaluator. First, the evaluator needs to find a framework for selecting and justifying the value judgements that he/she imposes on the information collected. Second, the evaluator needs to aggregate or weight individual judgements about the different elements of performance.

2.4.2. Comparison

Judgements of value implicitly rely on some kind of comparison: actual against alternative actual (such as a comparable programme or version of the programme), 'with' against 'without', 'before' against 'after' (both aimed at separating programme and non-programme experiences), actual against expected, or actual against ideal (Berk and Rossi, 1990). These comparisons can be extremely powerful but are not easy to make. There are four potential problems. First, presentation of an ideal or counterfactual case relies on judgement. The evaluator either has to construct a hypothetical case or to distil out extraneous factors. Second, the comparison can be difficult to trace precisely. The desire for precision tends to lead to a focus on quantitative (and often financial) rather than qualitative information. Third, variation in performance indicators can reflect changes or differences in definition rather than changes in performance. Fourth, drawing attention to values in this manner can weaken the evaluation's credibility with stakeholders who hold different values.
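To show the 'with'/'without' and 'before'/'after' comparisons at their simplest, the sketch below combines them into a difference-in-differences estimate. The figures are invented and the device is a textbook illustration, not a method reported for the schemes studied here.

```python
# Hypothetical turnover figures (£000) for assisted and unassisted firms.
before_after_assisted = (120, 150)    # 'before' vs 'after' for programme firms
before_after_comparison = (118, 130)  # same period for similar non-programme firms

def diff_in_diff(treated, comparison):
    """Programme effect estimated as the treated group's before/after change
    minus the comparison group's change (which proxies 'without')."""
    return (treated[1] - treated[0]) - (comparison[1] - comparison[0])

print(diff_in_diff(before_after_assisted, before_after_comparison))  # 18
```

The comparison group stands in for the counterfactual, which is exactly where the first problem above bites: its construction relies on judgement.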


The traditional way around this problem is to evaluate programmes in terms of their objectives. However, as argued above, programme objectives are usually too vague to provide clear prescriptions. "It is often hard to formulate questions based on ambiguous program goals. The program can then reject evaluation findings by saying the evaluation measured something the program was not trying to do" (Shadish et al., 1995: 184, commenting on Weiss).

Pawson and Tilley (1997) argue that the most powerful and practical comparisons are between different versions of the same programme. A variation on this is benchmarking. Benchmarking derives performance standards from comparison with processes in relevant alternative programmes or projects. The advantages of using benchmarking to structure valuing are that data might be seen as more objective and fair because they are derived from real-world examples; contact between the two partners who are benchmarking against each other can give practical insight into possible improvements; and competition between the two organisations can increase motivation (Hill, 1995). The point is that all of this is achieved without drawing attention to values.

2.4.3. Weighting findings

Valuing relies on being able to weight different variables of the knowledge component. As Scriven (1996: 397) observes: "Everyone doing practical evaluation knows that before one starts writing the Conclusions section, one has data and judgements on a number of highly independent dimensions of program quality. How does one pull these together? Presumably through using some kind of weighting. Where do the weights come from?" In the absence of a carefully constructed synthesis, evaluation reports suffer from what Scriven terms 'the rorschach effect': the audience is presented with a random scatter of inkblots from which to construct patterns.

Cost benefit analysis provides the most formal approach to weighting, and illustrates the problems of imposing this kind of structure on different knowledge categories. Cost benefit analysis has been criticised for being unable to deal adequately with uncertainty, interdependencies, distributional effects, intangibles, and costs to the user (Mishan, 1971). Even without the over-simplification of reducing all costs and benefits to financial values, weight-and-sum methods suffer from assumptions of linearity and additivity which can weaken their relevance to the practical world of programme implementation.
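A minimal sketch of the weight-and-sum synthesis being criticised here, with invented indicator scores and weights; the single number it yields is only defensible under exactly the linearity and additivity assumptions noted above.

```python
# Invented scores (0-10) on independent dimensions of programme quality,
# and invented weights reflecting one stakeholder's priorities.
scores  = {"job_creation": 7, "leverage": 4, "default_rate": 6, "equity": 8}
weights = {"job_creation": 0.4, "leverage": 0.2, "default_rate": 0.2, "equity": 0.2}

# Weight-and-sum: one number, but it assumes a one-point gain on any
# dimension is worth the same at every level (linearity) and that the
# dimensions do not interact (additivity).
overall = sum(weights[k] * scores[k] for k in scores)
print(f"overall score: {overall:.1f}")  # 6.4 with these figures
```

Changing the weights changes the verdict, which is Scriven's point: the weights embody value judgements that the synthesis itself cannot supply.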

Pawson and Tilley (1997) question the starting point for a black box approach to valuing. They argue that attempting to aggregate positive and negative effects inevitably produces inconclusive results across a whole programme. That inconclusive results are common is underlined in Rossi's (1985) pessimistic Iron Law of Evaluation, which states that: "The expected value of any net impact of any social program is zero." Realistic Evaluation moves away from global statements about the success of programmes to focus on practical guidelines about the relevance of different approaches, styles, and mechanisms.

2.5. Use

2.5.1. Introduction

The literature sets out many different ways in which evaluation achieves its influence over appropriate policy decisions. Each of these should be seen as operating on a scale along which greater or lesser influence can occur. Instrumental use occurs where findings have a direct effect on decision-making; for instance, where recommendations are implemented. Conceptual use, which is also called 'enlightenment' (Weiss, 1977) or 'demystification' (Berk and Rossi, 1976), is where evaluation leads to changes in how the programme is understood. Persuasion refers to "enlisting evaluation results in efforts either to defend or attack political positions" (Rossi and Freeman, 1985: 388). Political use is designed to legitimise decision-making. Haug (1996: 424) identifies two political uses of evaluation. In 'the strategic approach' results are presented to support decisions which have already been made. In 'the symbolic approach' evaluation is used to give the impression that the decisions have been thoroughly and rationally considered. Finally, misuse involves the findings being published selectively, or to a select number of audiences. Misuse can be deliberate, for instance, where a programme manager tries to scapegoat one of the implementing agencies, or accidental, for instance where evaluation results are communicated second-hand. Misuse can be more helpfully conceived as referring to the quality of use (Cook et al., 1981).

2.5.2. Increasing use

The literature has identified many potential obstacles to the full and proper use of evaluation findings (Cook, 1978; Shadish et al., 1995: 54). Organisations are resistant to change because of the threat to vested interests. Weiss (1973: 40) points out that: "A considerable amount of ineffectiveness may be tolerated if a program fits well with prevailing values, if it satisfies voters, or if it pays off political debts." Social programmes typically change slowly because decision making is incremental. Weiss (1980) calls this 'decision-accretion'. Decision makers design programmes for non-instrumental reasons, and in designing programmes they use information other than evaluation reports. Weiss concludes that: "I doubt that we can ever persuade stakeholders to make evaluation results the overriding consideration in program decisions. For one thing, program people know a lot more about their programs than simply the things the evaluator tells them" (Weiss, 1988: 17; Shadish et al., 1995: 221). Third, evaluation findings do not usually filter down to those in the best position to apply them, such as programme operators. Caplan (in Cook, 1978) argues that lack of use reflects a fundamental clash of cultures between the evaluator's 'knowledge-generating culture' (interested in truth and validity) and the programme manager's 'knowledge-utilisation culture' (interested in pragmatic action). Last, the inconsistent technical quality across evaluation reports affects their credibility and therefore their use.

Theorists have identified several ways of increasing the use of evaluation results (Patton, 1996; Shadish et al., 1995: 55; Georghiou, 1995: 185). These can be related to four principles. First, evaluators can increase the relevance of their work through concentrating on the factors which users can control, and giving priority to questions relevant to pending decisions. Second, evaluators can increase receptivity through identifying users early on, involving users in the evaluation, and providing interim findings. Third, the absorbability of evaluation can be increased through providing recommendations, including non-technical executive summaries, matching summaries to the interests of different stakeholders, and disseminating findings through informal meetings and briefings as well as reports. Fourth, credibility can be increased through using evaluators with appropriate status who are demonstrably fair and competent.

Notwithstanding this, American academics are not optimistic about the use of evaluation:

"The early hope was that use would happen with little effort because evaluation results were compelling, because stakeholders eagerly awaited scientific data about programs, or because policy-making was a rational problem-solving endeavour (…) But those hopes were dashed, because evaluation results were seldom compelling relative to the interests and ideologies of stakeholders, stakeholders usually regarded scientific input as minor in decision-making, and problem solving is far from a rational endeavour" (Shadish et al., 1995: 54).

2.5.3. Communicating findings

Use of evaluation depends in part on the effective communication of findings. "The proper function of evaluation is to speed up the learning process by communicating what might otherwise be overlooked or wrongly perceived. (…) Success is to be judged by success in communication" (Cronbach, 1982: 8; Torres et al., 1996). Evaluation reports must therefore reflect the different information needs of different layers of the organisation (Tables 1 and 2). Communication needs can also vary between individuals. "Learning begins with individuals. How individuals receive, remember and react to evaluation communications mediates the effectiveness of those communications. The need for presenting information in a variety of modalities is clear" (Torres et al., 1996). The 'modalities' mentioned by Torres et al. include visual, aural, interactive, tactile, kinaesthetic or olfactory media. Meeting these demands presents two dilemmas for the evaluator, however. First, the evaluator has to summarise work without taking findings out of context or overlooking methodological limitations and cautions. Second, evaluation reports have to tread a fine line which broadens the readers' understanding while also addressing their existing interests: a balance between leading and following.


Table 1
Attributes of information needed by different decision-makers. Source: Love (1991: 29)

                              Management function
Information attributes        Planning        Operations
Type of question              What if?        What is?
Time horizon                  Future          Current
Information sources           External        Internal
Measurement                   Qualitative     Quantitative
Level of detail               Aggregate       Individual
Level of analysis             Synthesis       Descriptive
Frequency of reporting        Periodic        Continuous
Scope of reporting            Summary         Detailed
Accuracy of reporting         Approximate     Exact
Mode of reporting             Graphical       Numerical


2.6. Practice

2.6.1. Introduction

This fifth component looks at the problems of carrying out evaluation with limited time and other resources. This includes the organisation of evaluation work, and the role of the evaluator.

2.6.2. Independence

The literature discusses several concepts related to independence.

• Objectivity (Scriven, 1976). Evaluation findings should not be contaminated by the personal preferences or ideology of the evaluator.
• Reproducibility (Cronbach, 1982). Another evaluator using the same methodology should obtain the same results.
• Autonomy (Newman and Brown, 1996). Evaluators should be free from outside influence.
• Empathic neutrality (Patton, 1990). Evaluators should be perceived as sympathetic to programme participants but open about findings.

Keeping some distance from stakeholders can, however, be at odds with the prescriptions for increasing use. Stake asks: "Does the degree of involvement in the field required for expertise almost always create bias?" (Stake, 1995: 395). In the Evaluation Thesaurus, Scriven (1991) directly addresses this dilemma when he argues that using evaluators with some level of bias is better than employing 'ignoramuses or cowards'.

Another source of bias is of major importance in evaluation: political pressure. Weiss (1973: 179) argues that: "Many of the problems that still bedevil the evaluation enterprise are not so much failures of research expertise as they are failures to cope adequately with the political environment in which evaluation takes place." Berk and Rossi (1990: 14) point out that evaluation is inherently contentious. "In almost all program issues, stakeholders may be aligned on opposing sides, some favouring the program and some opposing. And whatever the outcome of the evaluation may be, there are usually some who are pleased and some who are disappointed: it is usually impossible to please everyone."

Political pressures can affect each stage of the evaluation (Brickell, 1978; Berk and Rossi, 1976; Palumbo, 1989; Hedrick, 1988; Karlsson, 1996). The purchaser of the evaluation or other stakeholders might phrase the objectives to highlight particular aspects of the programme, for instance, more favourable or less contentious aspects. Radical interventions, which would not be politically acceptable, might be excluded from consideration. The evaluator might be set an unrealistic time frame so as to prevent wide consultation during the research work. Political actors might place pressure on evaluators to distort study findings. Political actors might use the evaluation results selectively, or try to suppress their publication. Political manoeuvring can include using evaluation research as a delaying tactic. More positively, political conflict can increase the interest in an evaluation report and produce greater pressure to implement recommendations.


2.7. Ensuring co-operation from programme staff

The evaluator often relies on data from the programme itself. "The basic data used in many evaluations comes from the official records of the program being evaluated. While this usage is cost-efficient, it introduces politics into the meaning given to measures" (Kelly in Palumbo, 1989: 273). Co-operation of staff is vital where evaluation data is drawn from the programme but it is not always forthcoming. A study by the author (Jackson, 1996) on 'foyers', a type of hostel providing accommodation and training for homeless young people, found that project managers had three objections to evaluation. First, they disagreed with national objectives for their projects. They thought that intangible benefits to participants were more important than job creation outputs. Second, some programme managers opposed any classification or labelling of participants. Third, there was a general feeling that projects should be trusted rather than asked to account for their actions through monitoring. Patton (1996: 29) lists other fears about evaluation which can affect co-operation. "Barriers typically include fear of being judged, cynicism about whether anything can really change, scepticism about the worth of evaluation, concern about the time and money costs of evaluation, and frustration from previous bad evaluation experiences, especially lack of use."

Performance management work within the field of human resource management provides insight into the response of programme staff. Murphy and Cleveland (1995: 101) argue that performance appraisal systems reinforce power by clarifying lines of authority. A more radical view, from Newton and Findlay (1996: 50), is that appraisal undermines group solidarity by increasing individual competition. Foucault uses Bentham's image of a 'panopticon' (a prison where inmates can be observed unseen from a central tower) to argue that performance management 'sequestrates' subjects, isolating them in time and 'space' (Townley, 1994). Overall, the increased control from evaluation can weaken or demoralise those subject to evaluation.

Many theorists have argued that power needs re-balancing through giving a greater role to stakeholders. Variants on this view now include: 'responsive evaluation' (Stake, 1975), 'democratic evaluation' (MacDonald, 1976), 'naturalistic evaluation' (House, 1980), 'participatory evaluation' (Choudhary and Tandon, 1988), 'Fourth Generation Evaluation' (Guba and Lincoln, 1989), 'empowerment evaluation' (Fetterman et al., 1996), and 'utilisation-focused evaluation' (Patton, 1996). These theories advocate, in various degrees: relativism and pluralism; qualitative methodologies including participation; and a view of evaluation as negotiation rather than analysis.

Guba and Lincoln (1989: 11, 119) argue that 'a means of carrying out an evaluation must be found that recognises the constructed nature of findings, that takes different values and different contexts (physical, psychological, social and cultural) into account, that empowers and enfranchises, that fuses the act of evaluation and its follow-up activities into one indistinguishable whole, and that is fully participative in that it extends both political and conceptual parity to all stakeholders'. They make two main criticisms of positivism. First, that its desire for objectivity often leads to unethical treatment of experimental subjects. Second, that by disregarding political factors it most often serves to reinforce existing power relations.

Fourth Generation Evaluation is itself not without criticism. Pawson (1996: 216) argues that by seeing all constructions of reality as equal, what he calls the 'militant agnosticism on truth', Fourth Generation Evaluation reduces the role for evaluation in advancing programme knowledge. Evaluators risk losing much of their credibility and status if they give up the claim to being objective (House, 1993: 30). Accountability, one of the three main functions of evaluation, relies on evaluators maintaining their distance so as to be seen as independent (Van de Knaap, 1995: 206). More seriously, given the radical agenda proposed by Guba and Lincoln, Fourth Generation Evaluation has been criticised because it 'fails to deal, in any meaningful way, with the concept of relative power, or more specifically the unequal distribution of discursive power' (VanderPlaat, 1995: 95). Power is not the evaluator's to give away. Stakeholder approaches can give participants unrealistic expectations of the evaluation (Rebien, 1996). Conceptual relativism leads into a moral relativism which reinforces rather than challenges existing power relations. "Absolute relativism creates an unpredictable paradox: on the one hand, it promises to free the individual from 'official versions' of their lives; on the other hand, it grants licence to arbitrary powers to dismiss popular complaint and dissent." (Kushner, 1996: 196).

The point is that evaluators cannot change the political role and meaning of their work

without the compliance of those who commission and use evaluation.

2.8. Conclusion

Evaluation has five components. Within financial and resource constraints (Practice) it uses information (Knowledge Construction) to make judgements (Valuing) on programmes and policy problems (Socio-economic Programming) with the aim of informing decision-making (Use). Each of these components, and the conflicting pressures of pursuing all five simultaneously, raise problems for the evaluator.

The first section, on Socio-economic Programming, looked at the problems of placing evaluation within an explanatory framework dealing with the character of the programme and the policy it is intended to address. This area of evaluation has been neglected because evaluators have focused on methodology and taken a black-box approach concerned with impact. Authors such as Chen (1990), and Pawson and Tilley (1997) argue that evaluation should move away from broad questions about whether the programme produced the impact intended, towards more sensitive contextual analysis looking at where and why programmes are most effective. The 'where' would look at environmental factors, types, and motivations of participants. The 'why' demands probing of programme mechanisms.

Lack of theoretical thinking in evaluation has led to two problems. First, policy managers have given insufficient attention to what they wish to achieve. Second, they have given insufficient attention to how this effect will be achieved, how their proposed policy would deliver this effect. These two problems can be rephrased in the form of hypotheses.

Hypothesis 1. Objectives will be ambiguous.

Subhypothesis 1a. The form this ambiguity will take is that objectives will be global,

diffuse, and distant from programme functioning.

Hypothesis 2. The assumptions underpinning scheme operation will be unstated.


Subhypothesis 2a. The form this understatement will take will be that stakeholders will

differ in their assumptions about the way the programme works (programme theory).

The second section, on Knowledge Construction, looked at the information which is fed into evaluation. In Britain, this is mainly in the form of performance indicators, so the analysis concentrated on the problems these present for measurement, interpretation, and use. Performance indicators employ quantitative measures to represent complex interactions across different projects and programmes. This representation process inevitably leads to some level of simplification, de-contextualisation, and spurious precision. That these indicators can then affect the power, resources, and status of individuals and organisations leads to pressure to distort results towards more favourable depictions.

The hypotheses from this are:

Hypothesis 3. Performance indicators will present measurement problems.

Sub-hypothesis 3a. The measurement problems presented will be those of subjectivity, simplification, and partiality.

Hypothesis 4. Performance indicators will present organisational problems.

Sub-hypothesis 4a. The organisational problems presented will be those of 'creaming', short-termism, and demoralisation.

The third section examined the Valuing component of evaluation. Judgements of worth rely on value statements, but it is not obvious where these value statements should come from. The most obvious source, the stated objectives of the programme, is not easily used because they are often vague or inconsistent. External frameworks like ethical theories have intellectual rigour but might lack credibility and immediacy to programme stakeholders. Evaluators face two problems: building conclusions across a portfolio of individual findings and judgements; rating the programme against standards or comparables. Methods such as benchmarking provide a vivid and practical way of judging programmes without drawing attention to values. The problems become these hypotheses:

Hypothesis 5. Results from evaluation will support different judgements according to the values applied.

Sub-hypothesis 5a. Differences in judgements will follow the lines between stakeholder groups.

Hypothesis 6. Evaluation will produce positive and negative readings on different aspects of the programme.

Sub-hypothesis 6a. Evaluation that seeks to summarise its findings into a single aggregate answer will be inconclusive.

The fourth section, on Use, examines the way evaluation findings feed into decision-making. Research in the United States has consistently revealed that evaluation does not have a major effect on programmes. Several possible explanations for this weak link into use are explored: that policy is the product of political as well as technical factors; that programmes change slowly because of resistance from vested interests; that evaluation reports fail to speak directly to the practical concerns of policy makers and implementers; and that some evaluation reports have technical flaws which justify managers' disregard for their findings. The hypotheses from this are:

Hypothesis 7. Evaluation findings will not be used.

Sub-hypothesis 7a. Enlightenment use will be more common than instrumental use.

Hypothesis 8. Communication of evaluation findings will be problematic.

Sub-hypothesis 8a. Over-simplification increases the risk of mis-use. Over-complexity obscures the meaning.

The final section examined the practical process of carrying out the evaluation. Two tensions were revealed. Resource constraints mean that evaluators usually expect to find data on the programme already available. Obtaining this data, and the information which provides context for its interpretation, makes evaluators dependent on programme staff. However, programme staff have good reasons not to co-operate. These reasons include the effort involved, the damage to their position from negative findings, and the power relations implied by this level of external control. Methods which give stakeholders more control over evaluation, so as to encourage their participation, can threaten evaluators' objectivity and independence. A particular example of this is where stakeholders exert political pressure on evaluators to change the direction of their investigation or its conclusions.

The hypotheses from this are as follows.

Hypothesis 9. Data for evaluation will be difficult to compile.

Sub-hypothesis 9a. Compilation difficulties will include non-disclosure as well as lack of availability.

Hypothesis 10. Evaluators will be pressured to change findings to fit the decision-maker's perspective.

Sub-hypothesis 10a. The form this pressure will take will be to accentuate positive findings.

The next chapter tests these 10 hypotheses in an evaluation of public sector small business

loan and grant schemes.


Table 2
Components of programme evaluation

1. Socio-economic Programming
Character: Assumptions about the nature of social (and presumably economic) problems and the mechanisms of public sector programmes.
Hypotheses: Hypothesis 1: objectives will be ambiguous. Subhypothesis 1a: the form this ambiguity will take is that objectives will be global, diffuse, and distant from programme functioning. Hypothesis 2: the assumptions underpinning scheme operation will be unstated. Subhypothesis 2a: the form this understatement will take will be that stakeholders will differ in their assumptions about the way the programme works (programme theory).

2. Knowledge Construction
Character: Assumptions evaluators adopt about the nature of reality, the origins and limits of knowledge, and choices of methodology.
Hypotheses: Hypothesis 3: performance indicators will present measurement problems. Sub-hypothesis 3a: the measurement problems presented will be those of subjectivity, simplification, and partiality. Hypothesis 4: performance indicators will present organisational problems. Sub-hypothesis 4a: the organisational problems presented will be those of 'creaming', short-termism, and demoralisation.

3. Valuing
Character: The way values are attached to programme descriptions.
Hypotheses: Hypothesis 5: results from evaluation will support different judgements according to the values applied. Sub-hypothesis 5a: differences in judgements will follow the lines between stakeholder groups. Hypothesis 6: evaluation will produce positive and negative readings on different aspects of the programme. Sub-hypothesis 6a: evaluation that seeks to summarise its findings into a single aggregate answer will be inconclusive.

4. Use
Character: The way information feeds back into decision-making and action.
Hypotheses: Hypothesis 7: evaluation findings will not be used. Sub-hypothesis 7a: enlightenment use will be more common than instrumental use. Hypothesis 8: communication of evaluation findings will be problematic. Sub-hypothesis 8a: over-simplification increases the risk of mis-use. Over-complexity obscures the meaning.

5. Evaluation Practice
Character: The way evaluators select methods and approaches to match the circumstances and resources available.
Hypotheses: Hypothesis 9: data for evaluation will be difficult to compile. Sub-hypothesis 9a: compilation difficulties will include non-disclosure as well as lack of availability. Hypothesis 10: evaluators will be pressured to change findings to fit the decision-maker's perspective. Sub-hypothesis 10a: the form this pressure will take will be to accentuate positive findings.


CHAPTER 3

Empirical work

3.1. Methodology

The empirical work for this project has two elements. A mapping exercise investigated and described the main public sector finance schemes for small businesses in the Hackney area. Four of these schemes were selected for case study. Research comprised interviews with the scheme managers and funders, compilation of data on each of the loan holders, interviews with 24 loan holders across two of the schemes, and analysis of written material. In one of the four schemes, where data on loan holders were not available, details were compiled on 150 applicants by inspecting business plans, decision-panel minutes, and monitoring reports. The written material reviewed included publicity brochures, application forms, progress reports, and paper records of management systems such as accounting, credit control, and business health checks.

The case study method was used because it is suited to detailed analysis of specific instances ('bounded systems', Smith in Stake, 1995), using a mix of quantitative and qualitative research tools, which allow real outcomes to be examined in context. Stake (1978, 1981, 1995) strongly advocates case study methods for evaluation because their use of different kinds of information to 'triangulate' on the research questions strengthens their validity. The knowledge from case studies 'is different in that it is more concrete, more contextual, more subject to reader interpretations, and based more on reference populations determined by the reader' (Stake, 1981: 36). Stake (1995) identifies two types of case study research: 'intrinsic' and 'instrumental'. The former refers to circumstances where the case is inherently interesting. The latter, which applies here, is where the case is intended to provide insight into a wider problem or issue.

The disadvantages of a case study methodology are the difficulties in generalising to the wider population, and the danger of interviewer bias. Follow-on research using more quantitative methods is needed to test the conclusions from this study across a wider range of cases and instances.

3.2. Background from the mapping exercise

This project examines loan and grant schemes in the Hackney area of London. The mapping exercise found 27 finance schemes active in and around the Borough. Five are grant schemes, twelve are loan schemes, six are equity, and four are other types (loans and grants, credit unions, business angel introduction schemes, and loan guarantee schemes). This is in addition to commercial schemes such as national venture capital funds.

The pattern of supply varies across the different financial instruments. Grant schemes tend to have lower allocated amounts (£1000–£25,000 with an average payout of £6250 across the schemes), and to operate in smaller areas. The five grant schemes identified are


all relatively recent (established since 1993). All have reached or are approaching exhaustion. Two of the five are managed by Business Link, and three by the Council or City Challenge.

Loan schemes tend to have a higher upper margin (up to £50,000), although there are a number of small, specialist loan schemes in the same niche as grants. These smaller loan schemes bring the average payout to the same level as grants (£6090). Loan schemes have a longer history. Seven of the twelve date back to the 1980s. All of the funds are active, providing in total some £1.2 million (December 1996 figures). Eleven of the twelve loan schemes are run by enterprise agencies, and one by an enterprise board (Greater London Enterprise).

Equity schemes have a higher range (in principle £150,000 but in practice over £200,000). All the schemes were established relatively recently (over the last six years). The size of the area covered (London or wider), and amount of money available (over £1 million, £15.4 million across the six funds), are consistent with the more risky nature of this financial activity.

The four loan funds chosen for case study share the following features:

• Location within one area of London.
• Small size.
• Operation by enterprise agencies.
• Organisation on contract from Government agencies (Task Forces or City Challenges).
• Relatively recent establishment.

The analysis above showed that these features are characteristic of loan funds across the study area. These similarities act to control for some of the variation in small business loan and grant schemes which, in turn, limits the scope for generalisation of findings. Descriptive data and information on performance indicators for the four case study loan and grant schemes is given in Table 3.

• All schemes were established in the last six years.
• Two operate in the smaller size range, up to £5000 or £10,000; two operate up to £25,000.
• The two larger schemes are managed by agencies with several other funds.
• The level of deployment is similar for three of the four schemes, but significantly higher for one.
• Both Schemes 1 and 4 have particularly high enquiry rates.
• Both Schemes 1 and 4 have relatively low approval rates.
• There is a stark difference between the default rates for Schemes 1 and 4, versus Schemes 2 and 3.
• Scheme 1 has the lowest estimated unit cost of jobs (excluding management costs).
• Targeting is good across the four schemes.

The rest of this project will test the 10 hypotheses from Chapter 2, using empirical work on these four loan and grant schemes.


3.3. Socio-economic programming

The proposed hypotheses are:

Hypothesis 1. Objectives will be ambiguous.

Subhypothesis 1a. The form this ambiguity will take is that objectives will be global,

diffuse, and distant from programme functioning.

Hypothesis 2. The assumptions underpinning scheme operation will be unstated.

Subhypothesis 2a. The form this understatement will take will be that stakeholders

will differ in their assumptions about the way the programme works (programme

theory).


Table 3
Summary table of loan and grant schemes. Notes: (1) Leverage figures are not comparable for reasons explained below. (2) Approval rate is not the same as the percentage of enterprise agency recommendations accepted; all panels include applications not recommended by the enterprise agency. (3) Job creation for Scheme 1 is estimated by comparing predicted amounts with current survival rates. (4) Data compiled in summer 1997.

Scheme 1 | Scheme 2 | Scheme 3 | Scheme 4
Source of funds: Task force, city challenge & council | City challenge | City challenge | Task force
Character of fund manager: Business advisor | Ex-bank manager | Ex-bank manager | Business advisor
Date of establishment: 1992 | September 1995 | March 1996 | October 1993
Size of loan: Up to £5000 | £5000–£25,000 | Up to £25,000 | £1000–£10,000
Number of funds managed: 2 | 5 | 5 | 1
Total deployment: £257,500 | £277,500 | £589,075 | £263,860
Average deployment per year: £51,000 | £200,000 | £580,000 | £75,000
Average loan: £3960 | £9900 | £21,800 | £6430
Number of loans: 65 | 28 | 27 | 41
Term: 3 | 8 | 8 | 3
Approval rate: 46% | 20% | 78% | 30%
Job creation: 119 | 111 | 109 | 61
Average cost per job (deployment): £2160 | £2500 | £5400 | £4300
Average cost per job (deployment plus management costs): £3325 | £3055 | £5725 | £5800
Leverage: 1:2.9 | 1:3 | 1:2.5 | Not recorded
% of loans behind with payment: 66% | 7% | 3% | 63%
Ratio of enquiries to loans: 15:1 | 12:1 | 9:1 | 20:1
Ratio of applications to loans: 2:1 | 6:1 | 1.5:1 | 3.5:1
% of loan recipients who are women: 22% | NA | 20% | 19%
% of loan recipients who are not white British: 76% | 71% | 70% | 87%
% of loan recipients who are start-up businesses: 54% | NA | 30% | 88%


Each of the 27 finance schemes in the mapping exercise was asked to state their objectives. Typical objectives included:

To encourage enterprise in the area or showing clear benefits to the area. To support viable start-ups and existing firms with less than 25 employees, who provide detailed business plans.

To encourage start-ups and business growth.

To help new and developing businesses in the London area.

To help established firms which have established a basis for success and can grow and employ people. Firms with a turnover of £1 million to £20 million.

Written documents from the case study schemes show objectives phrased in similarly general terms:

The fund was established 'to provide low cost loan finance to local and incoming businesses to aid business development. It was intended as a tool for strengthening the local business economy and hence aid the urban regeneration of the area' (wording from internal document reviewing future options for a fund).

The aim of the fund is to generate economic activity in the area, assisting with the creation of employment through the encouragement of new businesses and the development of existing ones who are experiencing difficulty in raising finance from traditional commercial sources (brochure for a fund).

The statements of objectives listed above suggest failure at a deeper level: lack of attention to the theory underlying intervention. There was no evidence in the documentation of discussion about the mechanisms through which loan and grant schemes achieve their effect.

Combining the review of written material with the interviews from fund managers leads to the following conclusions for these loan schemes:

• Objectives are very general, usually assuming an effect in terms of stimulating business activity or physical regeneration.
• Funding criteria are frequently given in place of strategic objectives.
• Mechanisms to achieve economic aims are not stated.
• The positioning of loan and grant schemes in the overall market for small business finance is not given.

3.4. Knowledge construction

The two hypotheses here are:

Hypothesis 3. Performance indicators will present measurement problems.

Sub-hypothesis 3a. The measurement problems presented will be those of subjectivity, simplification, and partiality.

Hypothesis 4. Performance indicators will present organisational problems.


Sub-hypothesis 4a. The organisational problems presented will be those of 'creaming', short-termism, and demoralisation.

Performance indicators are the main way in which funders evaluate the ongoing performance of these four case studies. The reliance on performance indicators might be higher than for other economic development schemes because of the particular circumstances of the case studies. The schemes are each managed by an implementing agency at arm's length from the funders. The funders lack expert knowledge about running finance schemes. The culture of the funders is to value quantitative information; this is in part because of their staffing by civil service secondees, and in part because of their own requirement to meet quantitative targets.

The choice of performance indicators varied across the case studies (Table 4) although the core was composed of:

• Deployment of funds (throughput).
• Job creation.
• Satisfaction of lender of last resort criteria.
• Amount of leverage.
• Ethnicity of recipients.
• Default rate/write-offs.
• Enquiry levels.

In addition, informal use was made of two other indicators: number of business plans prepared, and additionality. These were also looked at in the research.

3.4.1. Deployment

Throughput, or the ability to spend the funds allocated, is taken as a major indicator of operator performance.

The research found variation in the exact definition of this indicator. Scheme 2, which used the narrowest definition, only included funds which had been drawn down. The other three schemes used figures that sometimes included money allocated and not subsequently taken up, and money that had been accepted but not drawn down.
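To make the effect of these definitional choices concrete, the following minimal sketch (in Python, using invented loan records rather than case study data) computes deployment under a narrow and a broad definition:

# Hypothetical loan records: amount in pounds and the furthest stage reached.
# Stages follow the distinctions in the text: money allocated but not taken
# up, money accepted but not drawn down, and money actually drawn down.
loans = [
    {"amount": 5000, "stage": "drawn_down"},
    {"amount": 10000, "stage": "accepted"},
    {"amount": 8000, "stage": "allocated"},
    {"amount": 4000, "stage": "drawn_down"},
]

def deployment(loans, counted_stages):
    """Sum the loan amounts whose stage falls within the counted set."""
    return sum(loan["amount"] for loan in loans if loan["stage"] in counted_stages)

# Narrow definition (as in Scheme 2): only money drawn down.
print(deployment(loans, {"drawn_down"}))                           # 9000
# Broad definition: also count accepted and allocated money.
print(deployment(loans, {"drawn_down", "accepted", "allocated"}))  # 27000

The same loan book thus reports a deployment figure three times higher under the broad definition, which is why comparisons between the schemes are unsafe without first standardising the definition.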


Table 4
Use of performance indicators

Indicator | Number of schemes using it (of the four)
Throughput | 4
Job creation | 4
Leverage | 2
Lender of last resort | 3
Ethnicity | 4
Default rate | 4
Enquiry levels | 2


Deployment varied across the four schemes for reasons which did not always seem to be under the control of the fund manager. Three types of factor constrain spending by the fund manager:

• Lending perimeters. Scheme 1, with the lowest deployment, also had the lowest lending range (up to £5000). The number of loans given by this scheme (65) was the highest of the four.
• Decision-making. In each case, lending decisions were made by panels composed of representatives of the funders, and professional advisors such as bank managers or accountants. The scheme with the highest deployment rate (Scheme 3) also had the highest approval rate (78% compared with 30–46%).
• Demand. The amount of lending achieved is affected by the level of demand feeding into the system, which is itself affected by the size of the eligible area, the economic climate, the availability of alternative sources of funds, and the attitudes of potential recipients towards external finance. This process can be conceptualised as a filtering mechanism.

Low deployment can reflect failure at any point of this filtering mechanism (Fig. 3):

• Not all financial needs will translate into demand for loans. Potential applicants might prefer to rely on internal funds, bank finance or the hope of grants.
• Not all demand will translate into enquiries. The level of enquiries will depend on the marketing of the loan scheme, and therefore the level of awareness, as well as the perceived chance of success.
• Not all enquiries will translate into applications. The level of applications will depend on the ability to find matching funds (and meet other conditions), ease of applying, and motivation to continue.
• Not all applications will be approved. The level of approvals will depend on the character and role of the decision-making panel.
• Not all approved loans will be accepted. Loans might not be taken up if approval comes too late or subject to conditions which the business finds off-putting.

That deployment is used as a performance indicator without clarification of these contextual factors illustrates the level of over-simplification. Tightening the definition of deployment will solve some but not all of these problems.
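The filtering mechanism can also be expressed as a chain of conversion rates, with deployment as the product of the stages. The sketch below uses invented rates purely for illustration; none of the figures is drawn from the case studies:

# Illustrative filter model of deployment. Each stage passes on only a
# fraction of the previous one; all rates here are invented.
stages = {
    "need_to_demand": 0.5,         # needs that become demand for loans
    "demand_to_enquiry": 0.6,      # demand that reaches the scheme
    "enquiry_to_application": 0.2,
    "application_to_approval": 0.4,
    "approval_to_take_up": 0.8,
}

potential_borrowers = 1000
average_loan = 6000  # pounds

take_up = potential_borrowers
for rate in stages.values():
    take_up *= rate

print(f"Loans taken up: {take_up:.0f}")               # 19
print(f"Deployment: £{take_up * average_loan:,.0f}")  # £115,200

A weak rate at any single stage depresses deployment regardless of who is responsible for it, which is why raw deployment is an ambiguous measure of the fund manager's own performance.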


Table 5
Average and total deployment of funds

Scheme 1 | Scheme 2 | Scheme 3 | Scheme 4
Total deployment: £257,500 | £277,500 | £589,075 | £263,860
Average deployment per year: £51,000 | £200,000 | £580,000 | £70,000


3.4.2. Job creation

Each of the four cases measures job creation differently. Scheme 1 collects figures for the number of current jobs and proposed jobs anticipated at the time of the funding application. These are treated as 'jobs preserved' and 'new jobs', respectively. Scheme 2 collects figures on the number of employees of loan holders, making no distinction between different grades of staff. Scheme 3 keeps figures for gross job creation (full-time jobs, probably official and unofficial), and net job creation (90% of gross job creation). Family members are included. This scheme has a significant number of established clothing firms, which employ seasonal workers, outworkers, contractors, and jobs outside the area. None of these is included. Scheme 4 records, for those loan recipients which are still in business, the number of people employed and the number of jobs proposed. These data are compiled from regular business reviews with loan recipients.
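To illustrate how far these conventions can diverge on the same firm, here is a minimal sketch with invented employee records (not case study data), contrasting Scheme 2's simple head count with Scheme 3's gross/net convention:

# Invented employee records for a single loan holder.
employees = [
    {"role": "owner",     "full_time": True,  "family": True},
    {"role": "machinist", "full_time": True,  "family": False},
    {"role": "outworker", "full_time": False, "family": False},
    {"role": "seasonal",  "full_time": False, "family": False},
]

# Scheme 2 convention: a head count with no distinction between grades.
head_count = len(employees)                               # 4

# Scheme 3 convention: full-time jobs only (family members count, but
# outworkers and seasonal staff do not); net is taken as 90% of gross.
gross_jobs = sum(1 for e in employees if e["full_time"])  # 2
net_jobs = 0.9 * gross_jobs                               # 1.8

print(head_count, gross_jobs, net_jobs)

The same firm therefore contributes anything from 1.8 to 4 jobs to the headline figure, depending on whose convention is applied.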

These cases show several problems in using job creation as a key performance indicator.

• Jobs vary in their quality. Storey (1990) identifies three elements of job quality: wage rates, job duration, and job allocation (take-up by the unemployed). No case study collected data on any of these variables.


Fig. 3. Throughput model.


• The number of people employed by loan holders varies widely over seasons, weeks or even hours. The Hackney labour market is characterised by seasonal and temporary jobs, contract labour, and home workers. There is no consistency in the way firms include such workers in the figures they give for job creation.
• The jobs are not equal in their contribution to the economy. The figures collected by the loan funds include unofficial labour, where people are still claiming benefit, and family members who are not paid, and therefore do not directly contribute to local spending or multiplier effects. A clearer statement of the objectives for each scheme would be able to show whether these groups should be included or not.
• Job creation exhibits time lags. A finance scheme might have its full effect after the loan period, for instance, if a firm expands or survives a recession better than would otherwise have been the case. Only one of the schemes keeps in touch with loan recipients sufficiently to take account of subsequent job creation.
• Attribution of new jobs is problematic. Tying down responsibility for creating jobs is especially difficult where firms have benefited from several programmes of support, and this can lead to double counting. Since all the fund managers studied are enterprise agencies, jobs were probably already counted as an output from business advice.
• Fund criteria affect potential job creation. Differences in the average loan going out and, in two of the four cases, limits on the size of firm which can apply, give an uneven playing field for comparisons. The two funds without a limit on the size of firms which can apply, and those with a higher percentage of established rather than start-up firms, are in a better position to count job creation than the others.
• Job creation might operate at the expense of jobs outside the area. None of the four loan funds takes account of displacement of jobs from other areas. Schemes 1 and 2, for which detailed data are available, were attracting applicants from a wide area of London (and beyond in some cases). Start-ups were particularly footloose in their choice of location.

The problems with calculating job creation impinge on derived indicators such as cost per job. Cost can be calculated using deployment alone or deployment plus management costs. Management costs can be interpreted from the funders' perspective (financial contribution, as used in Table 3) or from the enterprise agencies' perspective (actual cost to the agency, including allocation of overheads). The jobs included can be those created immediately, or those created over the life of the investment. The latter has to use judgement to set a cut-off point, and decisions on this can reduce consistency between schemes. JURUE (1986) comments that: "comparisons between the cost per job of revenue and capital projects should not be made since in the latter case the total net capital cost of an asset with a long life is being counted against an assessment of jobs created at one point in time."
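As a check on how these definitions combine, the following sketch reproduces the two cost-per-job calculations in Table 3 for Scheme 1. The deployment and job figures come from the table; the management-cost figure is not reported separately and is backed out here from the table's £3325 result, so it should be read as an implied value rather than a published one.

def cost_per_job(deployment, jobs, management_costs=0):
    """Unit cost of job creation: spend divided by the jobs attributed to it."""
    return (deployment + management_costs) / jobs

# Scheme 1: £257,500 deployed, 119 jobs created (Table 3).
print(round(cost_per_job(257_500, 119)))           # 2164, reported as £2160
# Adding the implied management contribution of about £138,175
# reproduces the table's second figure.
print(round(cost_per_job(257_500, 119, 138_175)))  # 3325

Every disputed element of the job count feeds straight through this division, so the measurement problems listed above are inherited, not diluted, by the derived indicator.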

3.4.3. Leverage

There are several ways in which fund money can lever in private sector funds:

• Provision of matching funds from the applicant can be a condition of eligibility.
• Scheme money can be used to reduce the exposure of banks.
• Schemes can present the case of the applicant, so giving the bank more confidence to make a loan.
• A similar but not identical concept is that of 'matching funds', where the applicant must contribute their own money to show commitment. A looser version of commitment includes money paid previously, which is therefore not levered by the fund.

Three of the four schemes have targets for leverage which they are expected to reach. However, leverage is calculated differently across the three schemes (Table 6). At its broadest, leverage includes:

• Public money.
• Bank loans.
• Bank overdrafts.
• Applicants' money (including money left in the business by directors taking a letter of postponement).
• Applicants' help in kind.
• Professional advisors' help in kind.
• Previous investments in cash or in kind contributed by the applicant (this is more in keeping with the concept of matching funds).
• National programmes, such as enterprise allowance, which are not discretionary.

Table 6
Definition of leverage (leverage was not measured systematically for Scheme 4)

Component of leverage | Schemes counting it (of Schemes 1–3)
Public money | 2
Bank loans | 3
Bank overdrafts | 3
Applicants' money | 2
Applicants' help in kind | 2
Professional advisors' help in kind | 1
Previous investments in cash or in kind contributed by the applicant | 1
Enterprise allowance | 1 (Scheme 1)

The research findings show six problems with the use of leverage as a performance indicator.

• Different definitions are used. This problem has already been identified in the literature. Gray (1997: 352) complains that: "The concept of leverage mechanistically lumps together several different relationships between private and public sectors."
• Some factors affecting leverage are outside the control of the fund manager. For instance, the cancellation of enterprise allowance made leverage more difficult to obtain (for Scheme 1, which is including it). The clearing banks' move towards credit scoring might reduce the scope for schemes to obtain leverage in the future.
• The definition of leverage discourages referral of commercial cases to banks. Leverage is only being calculated for money drawn in through use of the scheme's funds. This fails to reward the time schemes spend helping applicants who do not then apply for loans. The performance indicator is in effect excluding the leverage from business advice, and thereby encouraging schemes to use their own money rather than passing on commercial cases to banks. A fuller definition of leverage would keep a separate record of money levered through negotiating with banks to take on cases.
• High leverage targets are inconsistent with lender of last resort. Scheme 2 had particularly high targets for leverage (3:1) which were very difficult to reconcile with the strict requirement of lender of last resort. The problem is not as extreme as it at first appears because banks typically use a narrower definition of matching funds which equates to net tangible assets (overdrafts and loan capital from directors are treated as a liability not an asset).
• High leverage might suggest low need for public sector funds. Geddes and Erskine (1994) point out that for some categories, such as subsidies and joint ventures, it might be more appropriate to say that the private sector is levering public finance rather than vice versa. "If the public sector contributes only a very small proportion of project funding, one would suspect it more likely that the 'outputs' would have occurred anyway, even without the public contribution."
• The demand for leverage requires the applicant to construct a complicated package of funding. Delay or indecision on the final elements can, then, threaten the package as a whole.

A further potential problem, that preliminary estimates of leverage might be exaggerated in order to boost the agency's targets, was tested by evidence on Scheme 1. Data from individual client records allowed proposed and achieved leverage to be compared. That achieved leverage was higher than proposed leverage suggests that, far from exaggerating figures, the enterprise agency was not taking into account the full extent to which a positive decision opened up other sources of funds.
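Since the leverage figures in Table 3 hinge on which components are counted, a minimal sketch (invented funding package, not a case from the research) shows how the same applicant yields different ratios under a narrow and a broad definition:

# Hypothetical funding package for one applicant, in pounds. Only the
# scheme loan is scheme money; the rest is potentially countable leverage.
package = {
    "scheme_loan": 5000,
    "bank_loan": 8000,
    "bank_overdraft": 2000,
    "applicant_cash": 3000,
    "help_in_kind": 1000,
    "prior_investment": 4000,
}

def leverage_ratio(package, counted):
    """Ratio of funds counted as levered to the scheme's own loan."""
    levered = sum(package[item] for item in counted)
    return levered / package["scheme_loan"]

narrow = leverage_ratio(package, ["bank_loan"])
broad = leverage_ratio(package, ["bank_loan", "bank_overdraft",
                                 "applicant_cash", "help_in_kind",
                                 "prior_investment"])
print(f"Narrow definition 1:{narrow:.1f}")  # 1:1.6
print(f"Broad definition 1:{broad:.1f}")    # 1:3.6

A scheme judged against a 3:1 target would fail under the narrow definition and pass comfortably under the broad one, on identical lending.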

3.4.4. Lender of last resort

All four schemes are intended to operate as lender of last resort. Their interpretation of this differs, however. Scheme 1 takes the clients' perspective and defines lender of last resort as where the client has tried but failed to raise money. This could be because of poor presentation rather than the intrinsic merit of the scheme. Scheme 2 takes the bank's perspective and defines lender of last resort as not taking away potential custom from the bank. The proposition is not bankable in principle, regardless of how it is presented. In its finest form, exemplified by one of the funds covered during the mapping exercise, customers of other banks were acceptable for loans but not customers of the funding bank. Scheme 3 described lender of last resort from an auditor's perspective. Applicants must provide letters of rejection from banks. Scheme 4 described lender of last resort from the public sector perspective, meaning that public sector money was not to be used where private sector money could be used. These definitions will not produce the same decisions. The second is a tighter definition than the other three.

Fund managers complained that use of lender of last resort has three incidental disadvantages. Compiling letters of rejection and other paperwork needed can lengthen the time before a business makes contact with an enterprise agency. This additional delay can have a financial cost to the applicant. The title of lender of last resort can sound off-putting to potential applicants, and the finance scheme can be stigmatised as a result. Lender of last resort, like the other performance indicators described, is often unfamiliar to small businesses. The manager from Scheme 2, which adopted a strict definition, complained that imposing these criteria can make schemes seem alien or unsympathetic to small businesses.

3.4.5. Ethnicity

Three of the schemes recorded the ethnicity of the owner-manager, and one kept figures on the ethnicity of each employee. The first is easier to do, and can be argued to be fairer because this is the relationship over which the fund manager has some control. The second gives a fuller picture of beneficiaries, although it suffers from all the problems mentioned earlier for job creation, and is at best a snap-shot of a changing mix.

The four loan and grant schemes had a high percentage of their recipients from ethnic minorities, over 70% in each case (Table 3). This is higher than the proportion of ethnic minorities in the business population. Business Link research earlier this year found that non-whites own 43% of businesses in Hackney.

Even if ethnic minorities were strongly represented among start-ups (where schemes are active), the figures of 70% and above suggest that schemes are not 'creaming' applicants on ethnic grounds. Schemes judge applicants on financial need but this is to be expected given that funding requires a commitment to repay the loan. The concept of 'creaming' does not fit well with the structure of loan and grant schemes. A more useful interpretation, mentioned in the introduction, is to see schemes as balancing financial and socio-economic objectives.

3.4.6. Write-offs/default rate

The original intention was to compile data on the number of write-offs. However, initial research suggested that, to avoid looking bad, at least one fund was continually rescheduling debts. Without looking at each case, the evaluator would not know whether this rescheduling was justified. The author responded by using the harder indicator of the percentage of loan-holders behind with their payments. One of the schemes, which had previously recorded no write-offs, now showed a 66% failure rate.

Measurement problems aside, repayment figures are difficult to interpret:

• Low failure rate could mean strong risk management or low additionality. One fund with a low failure rate was described by local Business Link advisors as 'tougher than the banks'. This is an example of the perennial trade-off between socio-economic and financial goals.
• Funders rather than fund managers make the decisions on cases. Scheme 1 had a high default rate but also a particularly high percentage of cases pushed through despite business advisors' views that they were not viable.
• Default rates varied widely between types of applicants. Funding criteria therefore pre-disposed schemes to lower or higher rates. For Scheme 1, just over 90% of the money lent to start-ups was still outstanding compared to around 38% of that lent to established firms. Furthermore, it seemed that the default rate was twice as high for a series of 11 loans funded separately (by City Challenge) to meet a tight deadline.
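The difference between the two indicators can be made explicit. In the minimal sketch below (invented repayment records, not data from the schemes), a write-off count stays at zero because rescheduled loans are never written off, while the 'behind with payments' measure compares each loan's repayments against its original schedule:

from dataclasses import dataclass

@dataclass
class Loan:
    amount: float        # pounds lent
    months_elapsed: int  # months since the loan was made
    term_months: int     # original repayment term
    repaid: float        # total actually repaid so far
    written_off: bool = False

def behind_with_payments(loan):
    """True if repayments lag a straight-line version of the original schedule."""
    due = loan.amount * min(loan.months_elapsed / loan.term_months, 1.0)
    return loan.repaid < due

# Invented portfolio: two loans have been quietly rescheduled rather than
# written off, so the write-off rate is 0% despite substantial arrears.
book = [
    Loan(5000, 18, 36, 1000),   # far behind (2500 due)
    Loan(4000, 12, 36, 1400),   # on track (about 1333 due)
    Loan(6000, 24, 36, 2000),   # behind (4000 due)
]

write_off_rate = sum(l.written_off for l in book) / len(book)
arrears_rate = sum(behind_with_payments(l) for l in book) / len(book)
print(f"Write-offs: {write_off_rate:.0%}, behind with payments: {arrears_rate:.0%}")

The second measure is harder to flatter by rescheduling, which is why it was adopted here, although it still says nothing about whether any given rescheduling was commercially justified.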


3.4.7. Enquiry levels

Although enquiry levels are not subject to ongoing monitoring and targets, they are taken as a measure of marketing. However, the figures produced can depend on two factors:

• How enquiries are allocated if the organisation manages more than one fund. This is similar to the absorption problem with overheads in accounting.
• Procedures for recording enquiries. Whether all phone calls are included, just those resulting in a request for information, or just those which have been fulfilled.

A high level of enquiries is not in itself an indicator of good performance. High figures could reflect lack of targeting or poor management of enquirers' expectations.

3.4.8. Business planning process

Fund managers are paid to provide business planning. One of the interesting findings from the research was the different perceptions of what constitutes a good business plan.

• A complete business plan (bureaucrat's perspective).
• An exciting market opportunity (entrepreneur's perspective).
• A thoughtful/dynamic business plan, considering possible risks and sensitivities (analyst's perspective).
• A well-presented document (marketer's perspective).

This ambiguity caused problems for the operation of the funds. Scheme 1 has a 43% approval rate for internally generated applicants against a 27% approval rate for referrals from local authority business advisors. An interview with one of the local authority business advisors suggested that he was using a bureaucrat's definition of a good business plan ('it is one which is complete'), whereas the fund manager was using an entrepreneur's perspective ('a good business plan is one where the business appears viable').

3.4.9. Additionality

Additionality was not used as a formal measure in the ongoing monitoring of the four loan and grant schemes. However, it was implied in the complaints that funders made about their schemes. It is therefore useful to examine the problems of measuring additionality, as described in the literature.

• Construction of the counterfactual case typically depends on a hypothetical question about the likely alternative outcome for the firm. Answering this question requires the interviewee to simplify the complicated circumstances of corporate decision-making, often some time after the event, using personal knowledge which might be incomplete, and without time for reflection (McEldowney, 1997: 184). Attributing success to outside factors is an emotional issue affected by the character of the interviewee (for example, their internal locus of control), and their general sympathy towards the programme under evaluation. Storey (1990: 675) observes that: "it is very unlikely that firms will project that in 2 or 3 years' time they will have ceased trading." He refers to earlier work on firms supported by British Steel which showed that only 65% of employment predicted was actually achieved.
• Projects are often financed by a package of funders. Separating out the impact of each strand of support is subjective. In practice this often leads to double counting between operators (Pearce and Martin, 1996).
• Funding is likely to have a complex effect on firms: increasing the scale or speed of expansion, or removing some of the pressure on owner-managers so that they can devote more time to other activities. This case of 'partial' additionality is difficult to quantify (Pearce and Martin, 1996).
• Analysis at the local level can overestimate additionality because displacement effects are not taken into account (P.A. Cambridge Economic Consultants, 1987). For instance, without these Hackney loan schemes, job creation in the adjacent borough of Islington (which has a dearth of loan schemes) might have been higher.
• Calculations of additionality tend to disregard the opportunity cost of the programme, for instance counterbalancing changes in mainstream funding, which might otherwise have created jobs or stimulated small business development (Pearce and Martin, 1996).
• Final outputs are often difficult to quantify. For this reason, intermediate outputs (start-up, expansion, investment) are often used in calculating additionality, which can give a misleading picture of programme success (Pearce and Martin, 1996).
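For reference, appraisal guidance in this field commonly nets gross outputs down for deadweight (outputs that would have occurred anyway) and displacement (outputs shifted from elsewhere). The sketch below states that standard calculation with invented figures; it is not a formula taken from the case study schemes:

def net_additional_jobs(gross_jobs, deadweight, displacement, multiplier=1.0):
    """Net gross outputs down for deadweight and displacement, then apply
    any local multiplier. Rates are fractions between 0 and 1."""
    return gross_jobs * (1 - deadweight) * (1 - displacement) * multiplier

# Invented example: 100 gross jobs, 40% deadweight, 20% displacement.
print(net_additional_jobs(100, 0.40, 0.20))  # 48.0 net additional jobs

Each of the problems listed above attacks one of these inputs: the counterfactual question sets the deadweight rate, local analysis understates displacement, and the gross figure inherits all the job-counting problems from Section 3.4.2.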

3.4.10. Distributional effects

Schemes differed in how they interpreted targets for performance indicators. In Scheme 2 targets were assumed to apply to each individual applicant. The other three schemes chose to adopt what can be described as a portfolio approach where average figures (for example, for leverage) met targets. This gives a more flexible approach which allows the fund manager to adapt the scheme to different client needs.
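The two interpretations can produce opposite verdicts on the same lending record, as this minimal sketch with invented leverage ratios shows:

# Invented leverage ratios achieved on five individual loans, against a
# target ratio of 2.0.
loan_leverage = [0.5, 1.0, 2.0, 4.0, 5.0]
target = 2.0

# Per-applicant interpretation (Scheme 2): every loan must meet the target.
per_applicant_pass = all(ratio >= target for ratio in loan_leverage)

# Portfolio interpretation (the other schemes): the average must meet it.
portfolio_pass = sum(loan_leverage) / len(loan_leverage) >= target

print(per_applicant_pass)  # False: two loans fall short
print(portfolio_pass)      # True: the average is 2.5

The portfolio reading lets the manager cross-subsidise clients who cannot raise matching funds, which is precisely the flexibility the text describes.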

3.4.11. Conclusion

Analysis through performance indicators is partial. The indicators in operation for the case studies fall into the following seven classes.

• Input: budget, staffing.
• Process: enquiry levels, terms of lending, business planning, lender of last resort, leverage.
• Equity: ethnicity of loan holder.
• Efficiency: unit cost per loan, write-offs.
• Service quality: client feedback on satisfaction with service.
• Outputs: money spent.
• Outcome: job creation.

An eighth class, effectiveness, is discussed below.

The research found six fundamental weaknesses in the use of performance indicators.

• They are ambiguous, inconsistently applied, and therefore difficult to compare between cases (all indicators to varying degrees).
• They measure factors which are not always within the control of those being evaluated (deployment of funds, leverage, job creation, enquiry levels).
• They are unstable, changing in ways which are independent of the input from the programme (enquiry levels, deployment of funds).
• They are difficult to interpret, open to different interpretations, some of which would be favourable and others unfavourable (default rates, deployment of funds, leverage).
• They miss out an area of vital significance to the programmes: effectiveness.
• They are unfamiliar to small business clients, and can make the schemes appear unsympathetic and bureaucratic.

3.5. Valuing

The two hypotheses here were:

Hypothesis 5. Evaluation results support different judgements according to the values

applied.

Sub-hypothesis 5a. Differences in judgements will follow the lines between stakeholder

groups.

Hypothesis 6. Evaluation will produce positive and negative readings on different

aspects of the programme.

Sub-hypothesis 6a. Evaluation that seeks to summarise its findings into a single aggregate answer will be inconclusive.

The research illustrates well the problem of weighting different findings (Table 3). Scheme 1 had good deployment (in terms of total number of loans), and high job creation, but also a high default rate. Scheme 2 had relatively high leverage and a low default rate, but a low approval rate. Scheme 3 had a high average deployment rate, high approval rate, and a low default rate, but a high average cost of jobs. Scheme 4 had the highest percentage targeting, and a large number of enquiries, but also a high default rate, and the lowest job creation. The research also compiled 'soft' information on the schemes which was not picked up by the performance indicators. Scheme 1 was seen to have strong links into the local community. Scheme 2 had excellent management systems. Scheme 3 had adopted a proactive approach to marketing, including an outreach advisor to find and evaluate prospective loans.

The research found clear differences between the perceptions of each of the main

stakeholders. Funders saw the schemes in terms of their contribution to organisational

goals. They were mainly concerned with whether targets had been met, whether the

scheme was collaborating appropriately with other schemes (some set up by the funder),

whether outcomes represented additionality and value for money. For instance, the brief

for the evaluation of Scheme 1 was: `to assess the performance of the fund, taking account

of the objectives and targets set out in task force and city challenge contracts; to examine

the context in which the fund currently operates, including its relationship with other loan

funds operating in the borough; and to make recommendations on the future of the fund

following the closure of the task force.' Fund managers were more focused on the risks and


They were concerned with whether instructions from funders were clear and reasonable, whether targets gave a full picture of the management work, whether the scheme presented synergy with other enterprise agency activities, whether decision-making gave sufficient safeguards for staff, whether the contract was too open-ended (exposing the agency to financial risk), and whether decision-making panels had the right skills. Clients were focused on their individual case, and their response was strongly affected by the outcome of their application. Although 66% of interviewees thought that applying was easy, several commented that the process was not sufficiently flexible to deal with their individual circumstances.

Notwithstanding this, significant differences between schemes were observed. Scheme 1 referred frequently to their clients. For instance, their interview was full of comments such as `In the end of the day, I want what is best for the client.' `Whether I like an organisation or not is irrelevant; if my client will benefit from being referred to them, I will refer them.' The manager for Scheme 2 spoke instead of propositions and of the risks of managing the scheme. The manager for Scheme 3 talked in terms of putting deals together.

3.6. Use

The two hypotheses investigated were as follows.

Hypothesis 7. Evaluation findings will not be used.

Sub-hypothesis 7a. Enlightenment use will be more common than instrumental use.

Hypothesis 8. Communication of evaluation findings will be problematic.

Sub-hypothesis 8a. Over-simplification increases the risk of misuse. Over-complexity obscures the meaning.

The data compiled for this project were later employed to complete three consultancy assignments, as a way of investigating use.

The researcher spent considerable time thinking about how to communicate findings to the client. The problem was not that of complexity. The findings, although multi-faceted, were easily summarised (see Table 3). The clients were clearly used to reading, and indeed preparing, technical reports. Rather, the researcher was concerned that the results might be taken out of context and used to criticise the enterprise agencies under examination. The enterprise agencies were already threatened by the introduction of Business Link and suffering a negative reinforcement cycle, whereby underfunding leads to criticism of the services and then to further underfunding. The author decided to place accountability conclusions within a developmental context, which is in keeping with the recommendations of Torres et al. (1996: 93) on the presentation of negative findings.

3.7. Practice

The hypotheses investigated were as follows.

Hypothesis 9. Data for evaluation will be difficult to compile.

Sub-hypothesis 9a. Compilation difficulties will include non-disclosure as well as lack of availability.


Hypothesis 10. Evaluators will be pressured to change findings to fit the decision-maker's perspective.

Sub-hypothesis 10a. This will take the form of pressure to accentuate positive findings.

The level of information available varied across the four cases. Scheme 1 supplied word-processed tables of applicant and recipient information. These seemed to have been typed in for each information request and were not kept on a spreadsheet or database. This information did not inspire confidence: for instance, reference numbers had gaps and duplicates. Attempts to interview recipients found that most of the contact details were incomplete or inaccurate. A computerised credit control list was handed over at the final presentation meeting. This is, then, an example of non-disclosure. Schemes 2 and 3 gave the impression of having good client information, and provided summary tables prepared for their decision-making panels. Additional variables required for this research were compiled verbally from the memory of the fund managers. Scheme 4 was undergoing a change in management. Information was not available, although individual client files seemed to exist. In response to the problems with data for Scheme 1, the author scrutinised application forms, business plans, and panel minutes to compile a database on the 150 applicants to the fund. This took five days' work. The enterprise agency listened to the criticism, and is in the process of establishing a client database system.

The overall conclusion is that data were incomplete (and often had to be filled in verbally), were not in the form needed for evaluation, and were frequently scattered across different functions of the organisation (typically split between business advisors and administrators). The inability to verify data obtained in this way led the researcher to prefer to compile data from individual client files, despite the time involved.

As a test of the second set of hypotheses, none of the clients attempted to exert pressure on the consultant to change the direction of investigation or conclusions. The one client who disregarded findings did this through commissioning more favourable research rather than attempting to change evaluation recommendations.

3.8. Conclusion

Hypotheses 1, 1a and 2 were supported by the four case studies. Objectives were ambiguous. The form this ambiguity took was that objectives were global, diffuse, and distant from programme functioning. The assumptions underpinning scheme operation were unstated. Insufficient information was obtained to test Hypothesis 2a, but support for this hypothesis is provided by the analysis of Valuing.

Hypotheses 3, 3a and 4 were supported. Performance indicators presented measurement problems: those of subjectivity, simplification, and partiality. Performance indicators also presented organisational problems. However, the organisational problems presented did not seem to be those of `creaming', short-termism, and demoralisation. Rather, the main problem seemed to be that the heavy imposition of monitoring demands from funders alienated agencies from their own information needs, and discouraged self-evaluation.

Hypotheses 5, 6 and 6a were given tentative support, and 5a partial support. Evaluation results supported different judgements according to the values applied.


Differences in judgements did, in part, follow the lines between stakeholder groups; however, there were also notable differences between schemes. Evaluation did produce positive and negative readings on different aspects of the programme. Summarising findings into a single aggregate answer would have proved inconclusive.

Hypotheses 7 and 7a were rejected. Evaluation was used, and the form this use took was instrumental rather than enlightenment related. Hypotheses 8 and 8a received limited support. Communication of evaluation findings was problematic, but because of the desire to avoid misuse rather than because of the complexity of the material.

Hypotheses 9 and 9a received some support. Data for evaluation were difficult to compile, and the compilation difficulties included non-disclosure and unavailability. Hypotheses 10 and 10a were not supported. The evaluator was not pressured to change findings to fit the decision-makers' perspective, positively or otherwise.


CHAPTER 4

Discussion

4.1. Introduction

The purpose of this chapter is to comment on the significance of the findings from the previous chapter, to examine possible explanations for these findings, to consider implications for the future evaluation of loan and grant schemes, and to look at the feasibility of putting these proposals into practice.

4.2. Socio-economic programming

Chapter 3 found that objectives and assumptions were not stated explicitly. This section will consider possible explanations for this oversight, and its importance in evaluation.

Loan and grant schemes fall into the category of projects where "a treatment directly acts on the characteristics of the problem" (Chen, 1990: 159). Finance schemes almost by definition improve the financial position of the recipient. This might in itself explain the lack of pressure to specify the benefits from finance schemes. Understanding this oversight is not to excuse it, however. Four fundamental problems with loan and grant schemes can be traced to their lack of understanding of theoretical mechanisms.

First, the link from financial support to economic impact is poorly developed. This is a weakness of problem definition. Social and economic systems often present what Harmon and Mayer (1986) describe as `wicked' rather than `tame' problems. The validity of evaluation analysis is limited by the boundaries within which the research is set. If the boundaries are drawn too widely or too narrowly, the explanation for events will be externalised, or predetermined by definition. This is the same as saying that a criminologist's study of burglary is more likely to talk about police methods than about poverty and deprivation (Pawson and Tilley, 1997).

The logic underlying the schemes is that the provision of loans or grants improves the financial position of small firms, and so produces an economic impact. The second element of this, the translation of financial benefit into economic benefit, makes several assumptions which have been questioned within economic development:

• That small businesses create jobs in the local economy. Birch's seminal work in the United States (Birch, 1979) concluded that "small firms (those with 20 or fewer employees) generated 66% of all new jobs in the US" for the period 1969–1976. Continuing in this vein, Newcastle University carried out research for the Department of Employment (reported in Bank of England, 1994) which showed that firms with fewer than 20 employees created 2.4 million net jobs in Britain between 1982 and 1991. However, detractors have argued that the importance of small firms is decreased when distributional and qualitative factors are taken into account. Storey (1993) has shown that 4% of small businesses are responsible for 50% of job generation. A series of surveys have found that the percentage of small businesses aspiring to rapid growth is small, as low as 22% or even 10% (Cosh and Hughes in Hughes and Storey, 1994).


• That these jobs are new rather than displacements. The TUC (1997) argues that most firms are not creating jobs so much as "recycling" or "absorbing" labour outsourced or subcontracted from large firms. Storey (1990: 679) comments that "displacement rates are very high for smaller firms and they vary from one trade to another, being particularly high in the construction and retail sectors". Net job creation will, then, be considerably lower than gross figures suggest.

• That lack of finance acts as one (or the main) limiting factor to the start-up, expansion or survival of small businesses. Accounts of the financial problems of small firms are numerous (Midland Bank, 1992; CBI, 1993; Bank of England, 1994, 1997). Birley and Niktari (1995) found that owner-managed businesses fail for reasons which are, according to their bankers and accountants, predominantly financial. That firms have financial problems is not the same as saying that these problems can be overcome through intervention, however. Stanworth and Gray (1991) conclude that the problems of small firms arise in part from the nature and characteristics of the small firms, the attitudes of owner-managers, and the economics of small-scale lending or investment.

• That small businesses want to take on external finance. Small firms typically have a `pecking order' of preferences, running from overdrafts and internal funds, through hire purchase and leasing, and then loans, down to external equity (Cranfield European Enterprise Centre, 1993). This reflects a desire to maintain control over the business (Cosh and Hughes in Hughes and Storey, 1994).

• That delivery mechanisms for the specific loan or grant schemes allow them to reach those small businesses most in need of financial support. The size, structure, and age of small firms all reduce the quality of information about them, and can therefore increase the difficulties of making contact.

Each of the stages in the simple logic diagram for loan and grant schemes has possible sources of leakage (Fig. 4). At the first stage, capital in the fund could be depleted through the costs of managing the scheme. At the second stage, benefits to recipients could be depleted through inappropriate choice of recipients, insufficient matching of the amount of money provided to the recipient's need (including deadweight), use of the grant or loan money for inappropriate purposes, or substitution of grant or loan money for other sources of funding. At the third stage, benefits to the economy could be depleted through collapse of recipients, their relocation, or through displacement of existing firms.

Fig. 4. Diagram of loan or grant scheme theory.

The empirical work found three practical problems which derived from the lack of clear objectives:

• The funder for Scheme 1 threatened to put fund management out to tender because of the high default rate, only to find that the terms of the contract were so general that termination on this basis was likely to be open to legal challenge. The main contractual obligation upon the enterprise agency was to spend the money.

• In Scheme 2, differences in the way individual applicants were rated by the two main board members (a task force and a bank) led to a high rejection rate (70%).

• Misunderstanding of the role of Scheme 1 led to inappropriate referrals from other agencies, especially the local authority business advisors, and hence to a lower approval rate than for internally generated applicants.

A fourth problem of unclear objectives and assumptions is that fund managers were not given clear guidance on risk positioning. This reflects the funders' failure to look explicitly at the trade-off between financial and non-financial objectives. Over-emphasis on financial objectives will mean that additionality is lacking, because the fund will be making loans that could be made by banks. Over-emphasis on socio-economic objectives will mean that money is lost, which undermines the long-term sustainability of the fund because money will not be recycled for further lending. Where objectives are not clear, funds are open to criticism for poor additionality or high default, without appreciation that these are two sides of the same coin. In Scheme 1, objectives seemed to have suffered from `policy drift' away from an early concern with targeting (high risk) to a later tightening up around job creation (low risk). In the first phase, decisions were pushed through, often against the fund manager's recommendation, which left a level of default that looked highly unsatisfactory by the second phase.

The overwhelming conclusion from this section is that loan and grant schemes need to clarify their objectives and strengthen their understanding of the theoretical mechanisms of their programme. Funders should make an explicit statement of their risk position, and direct schemes to monitor leakages from the programme. Strengthening Socio-economic Programming in this way promises benefits for theory, implementation, and evaluation itself. First, funders can ensure that their expectations are reasonable. Where analysis suggests that the default rate is going to be very high, a grant scheme might be chosen over a loan scheme. Second, evaluation can contribute to the overall pool of knowledge on the financial problems of small firms. `Plausible' outcomes and indirect effects can be predicted. Understanding of causal mechanisms provides a better basis for generalisation than specification of project descriptions, and will increase appreciation of project diversity. Third, identifying key processes provides practical levers for improving implementation. Elements which do not contribute to the programme can be removed, which reduces project waste. Last, clarifying theoretical assumptions ensures that research can be focused in an explicit manner that gives interviewees a more equal role in the debate. This helps to lay the foundation for a `no shocks' approach to evaluation reporting.


4.3. Knowledge construction

Chapter 3 found measurement and organisational problems arising from the employment of performance indicators. This section will consider the relevance of performance indicators to loan and grant schemes, the scope for strengthening measurement of effectiveness, ways of introducing `soft' performance indicators, and the practicality of adopting a more developmental approach to performance management.

Radaelli and Dente (1996) provide a matrix matching the type of evaluation to the type of programme (Table 7). Programmes are plotted by the degree of innovation and the amount of social conflict. The quadrant most relevant to British loan and grant schemes would seem to be that of low innovation and low social conflict, which is described as `Tableau de bord'. This box is recommended for the development of a time-series of performance indicators to improve the quality of decision-making and contribute to accountability. The point of this analysis is to suggest that performance indicators should be manageable within loan and grant schemes once the problems of ambiguity have been overcome.

There is now a general recognition of the limitations of financial, especially cost-accounting, approaches to performance management (Johnson and Kaplan, 1991). One `soft' performance indicator which seems relevant to loan and grant schemes is the version of `social capital' put forward by Bulder et al. (1996). They define social capital as "the social networks of employees". This definition ties in well with the work by Birley on the importance of networking to entrepreneurs (Birley, 1985; Ghaie and Birley, 1993; Ostgaard and Birley, 1994). Bulder et al. review research which shows that social capital increases the productivity of staff through strengthening their motivation, improving morale (and therefore reducing staff turnover), and widening their access to information, advice and support. They conclude that "it is possible that organisational reforms claiming increased efficiency and effectiveness may have negative, albeit unintended, consequences for social networks within the organisation; this may turn its social capital into `sour' capital." Bulder et al. provide analytical techniques for quantifying changes in social networks, and therefore for tracing the damage done by such organisational reforms. A performance indicator based on this definition of social capital would provide some measure of the intense networking activity carried out by Schemes 1 and 4. This work could be used by funders to examine the possible disadvantages of re-tendering finance scheme management contracts.
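
As an illustration only (Bulder et al.'s own techniques are not reproduced here), the sketch below shows one simple way such a quantification could work: treating advisor-client relationships as an undirected network and comparing its density before and after a change of management. All names and ties are hypothetical.

    from itertools import combinations

    def density(people, ties):
        """Share of all possible pairs that are actually connected."""
        possible = len(list(combinations(people, 2)))
        return len(ties) / possible if possible else 0.0

    # Hypothetical advisor-client ties before and after re-tendering
    people = ["advisor_a", "advisor_b", "client_1", "client_2", "client_3"]
    before = {("advisor_a", "client_1"), ("advisor_a", "client_2"),
              ("advisor_b", "client_3"), ("advisor_a", "advisor_b")}
    after = {("advisor_b", "client_3")}  # most ties lost with the old team

    print(f"Density before: {density(people, before):.2f}")  # 0.40
    print(f"Density after:  {density(people, after):.2f}")   # 0.10

A fall in density of this kind would give funders a rough, comparable measure of the relationship capital destroyed by re-tendering.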

Chapter 2 emphasised the difficulty of introducing a learning culture. An analysis by Power (1994) of `the audit explosion' illustrates how deeply change needs to reach. He identifies the fundamental problem as being a lack of trust in the organisation under evaluation.


Table 7
Dimensions of the policy process and choice of evaluation strategies

                                Social conflict
Innovation              Low                        High
Low                     `Tableau de bord'          Unclogging
High                    Discovery                  Conflict management


He argues that the expansion of evaluation (or auditing) serves a symbolic rather than an instrumental purpose, as a demonstration of control over public spending. Power believes: "the audit explosion signifies a displacement of trust from one part of the economic system to another; from operatives to auditors." Audits have therefore made organisations less rather than more transparent. The result is a vicious cycle whereby "people adapt their behaviour to reflect the fact that they are not trusted and therefore confirm that they should not be trusted." This is not an inherent problem within performance management, but rather a result of the tensions arising from having local controls imposed from the centre: "a much resented degree of `backseat driving' by central government" (Carter, 1989). The solution, according to Power, is to open organisations up to a process of continual learning, that is, to adopt a developmental approach.

It is arguable that narrowly defined contractual relationships, such as those employed for the day-to-day management of each of the loan funds under study in this project, are not ideal breeding grounds for a learning culture. The implication seems to be that public sector organisations should copy the private sector in moving towards long-term relationships with their suppliers (see, e.g. Ford, 1990).

There is a second, more narrowly defined problem with the developmental approach. The impact of a programme is easier to measure once it has reached a position of stability. The evaluator can then distinguish between set-up costs/benefits, transitional costs/benefits, and ongoing (recurrent) costs/benefits from a programme. However, organisational learning is likely to lead to continual adjustments in the operation of a programme (Richardson et al., 1996). This will make impact more difficult to measure.

A third possible problem of embedding evaluation in organisations lies in the need to widen understanding of and control over information. Democratising evaluation in this way is going to be more deeply felt if practitioners can form their own interpretations of data. The literature contains two examples of innovative ways of presenting information. Henry (1992) uses STAR icons, which plot variables on each side of a polygon to allow the reader to see the pattern across individual variables in the data, and thereby draw their own conclusions. The overall shape produced allows many different cases or sites to be compared simultaneously. Research at Imperial College is producing information systems, such as the Bifocal Display technique, which allow the reader to see the distribution of data, and therefore to set their own class boundaries (Spence, 1996).
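
A rough approximation of the STAR icon idea can be sketched with a standard plotting library: one closed polygon per case, with each indicator on its own axis, so that the reader compares overall shapes rather than single numbers. The indicator names and scores below are invented for illustration.

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical indicator scores, normalised to 0-1, one polygon per scheme
    labels = ["Deployment", "Leverage", "Jobs", "Targeting", "Approval", "Low default"]
    schemes = {
        "Scheme 1": [0.8, 0.4, 0.9, 0.6, 0.7, 0.2],
        "Scheme 2": [0.5, 0.8, 0.5, 0.4, 0.3, 0.9],
        "Scheme 3": [0.9, 0.6, 0.4, 0.5, 0.8, 0.8],
    }

    angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False)
    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    for name, values in schemes.items():
        # Repeat the first point so each polygon closes on itself
        ax.plot(np.append(angles, angles[0]), values + values[:1], label=name)
    ax.set_xticks(angles)
    ax.set_xticklabels(labels)
    ax.legend(loc="lower right")
    plt.show()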

The conclusion from this section is that performance indicators can play a potentially valuable role in loan and grant scheme evaluation. However, their measurement and organisational context need to be changed. The performance indicators themselves should be widened to include `soft' as well as `hard' indicators. Quantitative indicators' claims to objectivity can be questioned: "Quantitative measurement rests on qualitative assumptions about which constructs are worth measuring and how constructs should be conceived" (Shadish et al., 1995: 133, commenting on Campbell). Combining quantitative and qualitative methods will increase the relevance and validity of evaluation. This is because different methods suit different evaluation contexts and different aspects of programme operation. Qualitative methods excel on `bandwidth' (the range of issues addressed), while quantitative methods excel on `fidelity' (the accuracy of information on the issues addressed). Triangulation allows greater perspective on research issues. Using both gives a broader base for discovery, and allows for the generation of new hypotheses as well as the testing of existing hypotheses. Furthermore, using different methods can make the best of the `division of expertise' among stakeholders (Pawson and Tilley, 1997).

A second major conclusion is that evaluation should be set within a developmental rather than an accountability approach. Performance indicators work better when used for learning rather than control. McEldowney (1997: 177) argues that "The advantages of this model may lie in the fact that it may make policy-makers less defensive and less threatened by the evaluation exercise." Notwithstanding this, the literature review, empirical work, and comments earlier in this chapter all emphasise the problems of introducing a developmental approach. A developmental approach cannot be laid on top of existing organisations, but rather requires fundamental changes to structure, culture, and communication channels.

4.4. Valuing

Chapter 3 showed how loan schemes differed from each other in their definition of performance indicators. This variation was not random but followed a systematic pattern across the schemes. Scheme 1 consistently defined performance indicators more widely. Scheme 2 defined them more narrowly. Scheme 3 lay in the middle of these two. Further, the terms in which fund managers described their work differed, focusing on clients (Scheme 1), propositions (Scheme 2), and deals (Scheme 3). Putting all this together suggests three different styles of managing loan and grant schemes (Table 8). These different management styles are reflected in choices about marketing, decision-making, and monitoring, as well as in the interpretation of performance indicators. Scheme 1 is client focused, Scheme 2 is systems focused, and Scheme 3 is focused on deal-making. Scheme 4 had experienced several changes in its management, and showed a less clear pattern.

The explanation for these patterns seems to lie in the pressures imposed on loan and grant schemes. Each scheme has to balance socio-economic and financial objectives, but the choices they make seem to depend on other pressures. For Scheme 1 these pressures were political, especially around a commitment to support ethnic minorities; for Scheme 2 the prevailing pressures were audit related, designed to protect the agency from external attack. Scheme 3 operated under extreme time pressure, and needed to act entrepreneurially to meet its tight deadlines.

This model of management styles raises five questions about the Valuing of loan and grant schemes.

• Should loan schemes be guided to perform more widely on objectives rather than giving priority to one perspective? Alternatively, should expectations on loan funds be adjusted to acknowledge the conflicting pressures under which they must operate?

• To what extent is there a gain in having different loan schemes in an area operating under different management styles? Enterprise agencies seem to be reaching different client groups, possibly because clients gravitate towards a style which makes them feel comfortable. If this finding is validated, the implication is that the current tendency towards single access points in Business Links may be reducing the scope to perform across the wide range of objectives, and some client groups might be excluded in the process.

• To what extent do the different styles weaken relationships between loan schemes? Would greater recognition of the difference in styles improve mutual respect, and therefore increase referral?

• What are the relative merits of the different models? To what extent does the funding regime encourage one style over another? The researcher's impression was that the client-oriented route was the most difficult to pursue because it is at odds with the bureaucratic culture of the funding agencies.

• Whose values should be employed during the evaluation? Can the evaluator fairly compare schemes which appear to be founded on different priorities?

The first issue on this list, trading off different priorities, is discussed in the evaluation literature. Chen (1990) indicates three strategies for dealing with multiple and conflicting values. A maximising approach prioritises one of the values. A sequencing approach gives values priority at different stages. A balancing approach gives all values equal attention. Chen argues in favour of the third because: additional effort devoted to individual values can be expected to be subject to the law of diminishing marginal returns; decision-making more usually adopts an approach of `satisficing' (March and Simon, 1958); and systems theory suggests that emphasis on any one value is dysfunctional.


Table 8
Management styles of the case studies

                             Client oriented          Bureaucratic             Entrepreneurial
                             (Scheme 1)               (Scheme 2)               (Scheme 3)

Orientation                  Clients                  Systems                  The deal
Pressures                    Political                Auditing                 Time
Marketing                    Word of mouth            Bank referral            Outreach
Decision-making              Packaging/               Processing/business     Negotiating/
                             needs oriented           plan oriented            opportunity oriented
Core decision-making         Counsellors              Banks                    Professionals
structure
Loan monitoring              Business club            Monthly accounts         Visits
Strengths                    Client relations;        Systems;                 Throughput;
                             targeting                low failure rate         presence
Weaknesses                   Seen as a soft touch;    Throughput               Bias; lack of clarity
                             higher failure rate                               to applicants
Defence                      Writing off as a         Argument that there is   Good record keeping
                             political decision       no demand; collection
                                                      of information on
                                                      other funds' throughput
Performance indicators:
Interpretation of lender     From the client's        From the bank            From the public sector
of last resort               perspective: tried       perspective:             perspective:
                             and failed               unacceptable in theory   unacceptable in practice
Interpretation of leverage   Wide, includes           Narrow, includes         Medium
                             help in kind             proven bank funds


However, the context for Chen's analysis is designing an evaluation rather than implementing an economic development project. Chen assumes that only one evaluation is carried out at any one time. By contrast, local economies can contain several loan and grant schemes. Different value positions can be expected to match a wider profile of client needs. A unified approach provides less choice for the client, and may even exclude some groups from support. An alternative view can therefore be put forward: that `maximising' will enable each implementing agency to build on its strengths and best meet the needs of specific client groups. This approach is analogous to a business strategy of differentiation rather than cost leadership. Notwithstanding this, loan schemes may be able to learn from each other. For example, business clubs and IT systems can be combined in the service of credit control procedures.

This section questions the basic principle that standards can be applied across schemes. If agencies cannot excel on all aspects of performance, local areas might benefit from having a range of schemes which each excel in different areas, reflecting their core skills. Schemes should, then, be judged together in terms of their complementarity with each other.

A second conclusion is that project-level studies should be co-ordinated, and results compiled across different evaluations. Introducing this form of meta-analysis would allow effectiveness measures to be included; some of these are difficult to analyse at the project level. The information provided could be used to support benchmarking of loan and grant schemes. Meta-analysis in its more common sense (integration after, rather than before, research is carried out) would also provide a check on the quality of evaluation research through dialogue between different evaluators.

4.5. Use

The research found strong instrumental use in two out of the three consultancy exercises carried out. This section examines some of the problems of instrumental use, and the possible reasons why instrumental use was easier than would be expected from the evaluation literature.

This project highlighted three negative aspects of instrumental use. First, a long-term focus on decisions about the renewal of contracts seemed to have undermined communication between funders and scheme managers. The funders had adopted a laissez-faire management style. Concerns dating back as far as the establishment of the schemes had not been articulated or explained. The funders were behaving as if the only course of action open to them was to re-allocate the contract. Second, re-tendering management came to be seen as straightforward. Reliance on performance indicators seemed to encourage an arm's-length view of projects which overlooked the build-up of interrelationships, intangible factors which are well captured within the concept of `social capital'. One of the four loan funds had changed management, and all relationships with loan-holders were lost in the process. The high default rate which resulted might be in part an effect of breaking these ties, and therefore illustrative of the importance of social capital. Third, the consultant was placed in a difficult position. The evaluation was dependent on the fund managers to provide data which it was not entirely in their interests to hand over. The emphasis on the vulnerability of the contract made this dilemma more obvious.


The author attempted to defuse potential conflict through improving communication and understanding between the funders and the scheme managers, using stakeholder analysis and the management styles model explained above. Earlier involvement in the monitoring of the loan schemes would have strengthened this role. The literature provides several examples of evaluators who are appointed at the beginning of programmes and provide continuous feedback. This is variously described as `trailing research' (Finne et al., 1995), `continuous monitoring' (Georghiou, 1995), or `integral programmatic intervention' (Patton, 1996).

That instrumental use did occur in two out of the three exercises is contrary to generalisations within the evaluation literature. However, an explanation for this difference is easily found. In a report of a discussion between Weiss and Patton, Weiss admits that part of the reason for the negative findings of much of her work is that it is carried out at a policy level, where the evaluator has less contact with decision-makers and political factors are more important (Alkin, 1990: 26). In the three consultancy assignments mentioned, the client was one of the decision-makers.

A second possible explanation for the high use lies in the researcher's decision to place accountability evaluation within a developmental context. Chelimsky (1997) shows how the nature of use can be expected to differ between the three broad types of evaluation identified earlier. Accountability evaluation tends to have a complex impact; for instance, knowledge of later evaluation might lead to higher standards during implementation. Developmental evaluation might aim for relatively close use of research findings, but can also be of value in helping programme operators question their work. Knowledge evaluation tends to result in a diffuse application of findings; for example, later programmes, including programmes outside the immediate field of interest, might be influenced by the accumulation of evidence. The proposal to increase the developmental function of evaluation, which came out of the analysis of performance indicators above, can be expected to strengthen instrumental use. There is no reason to expect this to generate the kinds of problems listed above, because these refer to instrumental use of accountability evaluation, not instrumental use of developmental evaluation.

A third possible explanation for the high level of use lies in the use of case study material. Stake (1978) argues for `naturalistic generalisation' around the use of case studies because "The best substitute for direct experience probably is vicarious experience." Case studies, it is argued, provide vicarious experience because of their concrete and vivid nature. "Case studies will often be the preferred method of research because they may be epistemologically in harmony with the reader's experience and thus to that person a natural basis for generalisation" (Stake, 1978: 5). This is assumed to lead to greater use. Notwithstanding this, Shadish et al. (1995: 300) point out that there is a difference between case study formats and case study methods. Case study formats may be a useful way of presenting findings to encourage use, but they do not necessitate case study methods. For instance, surveys can often generate case study material. Fictional cases can also help to dramatise findings.

Two conclusions can be derived from the analysis of Use. First, findings should be presented in different forms and structures. Communication of evaluation material has to overcome several complexities: the different layers of findings, the different interests of readers, and the different communication styles of individuals.


Using a range of media provides the best way of reaching everyone. Presenting sections of evaluation findings in different styles allows different ways into the information. Qualitative material such as case studies and quotations can provide `vicarious experience'. Presentation should not be limited to written approaches. Open days can provide a direct way for funders to understand the projects being evaluated.

Second, stakeholders should be involved in the evaluation as far as is possible without sacrificing the independence of the evaluator. Stakeholders should be included because of the information that they can provide, and the gain for utilisation in starting dialogue from the earliest stages of work. The justifications for involving stakeholders include:

• Vicarious experience. Stakeholders can experience the evaluation process. This increases their ownership of findings, and can be expected to increase the likelihood of use. This use is not limited by the evaluator's perspective: readers can form their own conclusions about the data (Patton, 1996).

• Education. Users can be educated so that they value evaluation. The evaluator can also take responsibility for giving the client a greater appreciation of ethical issues in evaluation and consultancy. This is the approach recommended by Newman and Brown (1996).

• Perspectives. The definition of the social or economic problem under attack can be broadened beyond the institutional boundaries of the implementing organisation. Qualitative information can be used to balance and contextualise quantitative data.

• Managing politics. Having a cross-section of interest groups can to some extent help to neutralise the influence of any one (Palumbo, 1989).

• Justice. Some evaluators see a moral case for taking account of the values of different programme participants. House (1978: 94) argues that: "Generally, the evaluator has an obligation to ask how things look from the viewpoint of the least advantaged and whether that viewpoint is worth collecting and emphasizing in the evaluation."

• Lead into implementation. Detailed information is imparted to the stakeholders so that insights are not lost when evaluators withdraw. This prevents the client from being dependent on the evaluator (Fetterman et al., 1996).

4.6. Practice

Chapter 3 found that evaluation data were difficult to obtain. This section investigates why this might have been the case, and uses insights from business process re-engineering to consider possible ways of improving data availability.

Good information might be expected to be available, given that schemes maintain ongoing relationships with clients during credit control. The number of clients (recipients) is also considerably smaller than might be the case for training schemes or other economic development projects.

That the available information was not always strong seemed to reflect the relationship between funders and contractors. Funders place heavy information burdens on the enterprise agencies. First, externalising information demands in this way seemed to have stopped the agencies from thinking about information from their own perspective. Second, enterprise agencies were not always comfortable with an analytical style: maintenance of databases was either against their culture or given a low priority. Third, it is possible that agencies were deliberately not keeping data that could be used against them. Fourth, some of the agencies did not have the skills to establish computerised databases. Fifth, financial pressures on agencies meant that long-term investment in information systems was given a low priority.

Even those agencies which had sophisticated information systems were using the data for credit control and other management functions, rather than to inform decision-making. This is unfortunate, as a scheme's knowledge of its loan holders can help it understand the likely behaviour of applicants. Compilation of detailed individual records for all the clients of Scheme 1 yielded findings that would have been useful to the agency: for instance, 90% of the money lent to start-ups was outstanding, compared with around 38% of the money lent to established firms, and debt was concentrated among applicants who had failed to make a single repayment.
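
The tabulation involved is simple once records are held per client. The sketch below reproduces the kind of calculation described, with invented records; only the two headline percentages (90% and 38%) are taken from the text.

    from collections import defaultdict

    # Hypothetical client records: (segment, amount_lent, amount_outstanding)
    records = [
        ("start-up", 5000, 5000),
        ("start-up", 8000, 6700),
        ("established", 10000, 3800),
        ("established", 6000, 2300),
    ]

    lent = defaultdict(float)
    outstanding = defaultdict(float)
    for segment, amount, owed in records:
        lent[segment] += amount
        outstanding[segment] += owed

    for segment in lent:
        share = 100 * outstanding[segment] / lent[segment]
        print(f"{segment}: {share:.0f}% of money lent still outstanding")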

Improving the availability of data depends on solving technical problems. Data collection should be easy and not too time consuming. Data must be accurate and timely; this is partly about ensuring the information is valued by the people collecting it. Data must be transparent and easy to interpret. Definitions and measurement must be consistent across a time series.

Means to meet these requirements can be derived from business process re-engineering (Hammer, 1990):

• Data should be entered by those with client knowledge. This is likely to mean delegation of responsibility for data systems down the organisation.
• Data should only be entered into the system once.
• Data should be entered regularly, preferably soon after the events to which they relate.
• Data should be captured at source.
• Related data systems within the organisation should be linked, to avoid ambiguity.
• Data should be entered in a disaggregate form, with the computer calculating performance indicators for funders according to their different conventions. Disaggregation allows easy checking of accuracy by auditors.
• The computer should have automatic cross-checking (or field delimiters).

An illustration of how these principles can be developed is seen in the auditing tool developed by NOP Research Group Limited (unpublished information). Currently adapted for private sector performance management, this compiles disaggregated data on a large number of questions across eight categories of organisational performance. Questions are answered in terms of the evidence available to confirm them (instead of through simple `yes' or `no' answers). Questions, and the weights between questions, can be easily changed to meet the circumstances of each organisation. Re-standardisation of data allows benchmarking with comparable organisations. The equivalent of this system designed for loan and grant schemes would hold individual information on each applicant, distinguishing each potential element in each performance indicator so that schemes could adopt their own definitions but also recalculate figures on a similar basis in order to allow comparison. Soft indicators would be easily included in this approach.
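
The NOP tool itself is not public, but the underlying principle can be sketched. If per-applicant records are held in disaggregate form, each scheme's convention for an indicator such as leverage (wide, medium, or narrow, as in Table 8) becomes a different calculation over the same data, so figures can always be restated on a common basis. The field names and amounts below are hypothetical.

    # Hypothetical per-applicant records, held in disaggregate form
    applicants = [
        {"loan": 4000, "bank_funds": 2000, "other_finance": 1000, "in_kind": 500},
        {"loan": 6000, "bank_funds": 0, "other_finance": 3000, "in_kind": 1500},
    ]

    def leverage(records, convention):
        """Additional resources attracted per pound lent, under a named convention."""
        lent = sum(r["loan"] for r in records)
        if convention == "narrow":    # proven bank funds only (cf. Scheme 2)
            extra = sum(r["bank_funds"] for r in records)
        elif convention == "medium":  # all additional finance (cf. Scheme 3)
            extra = sum(r["bank_funds"] + r["other_finance"] for r in records)
        else:                         # wide: includes help in kind (cf. Scheme 1)
            extra = sum(r["bank_funds"] + r["other_finance"] + r["in_kind"]
                        for r in records)
        return extra / lent

    for convention in ("narrow", "medium", "wide"):
        print(f"{convention}: {leverage(applicants, convention):.2f}")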

The overall conclusion is that evaluation should be integrated with management systems. This was recommended by Likierman (1993), and is also one of the principles of business process re-engineering. Evaluation tends to rely on the existence of detailed project information. This is especially the case for programmes of low innovation and low social conflict, as set out by Radaelli and Dente (1996). Stand-alone evaluation projects have to compile this information over a relatively limited time. Establishing a bank of monitoring information allows evaluations to be carried out with limited effort and delay. The cost of data collection is reduced, and also spread over a longer period of time. Furthermore, embedding evaluation in the organisation in this way avoids the `friction' of needing to activate several layers of decision-making to appoint each individual piece of evaluation. The quality of information should be improved, because the data are compiled soon after the event by a data entry operator who is familiar with the details and motivated to ensure data quality through their own interest in the results. This process is more likely to produce a full time-series of results. Compiling data on a disaggregate basis should increase accuracy and versatility. Disaggregated data allow different versions of performance indicators to be calculated automatically by the computer, which ensures that each version of a performance indicator is calculated consistently. Taking information management down to the level of those staff who have client contact should increase their status, and encourage them to be more questioning and responsive. This in itself should help to raise standards of customer care. Embedding monitoring systems in the organisation should also increase receptiveness to evaluation conclusions. Weiss (in Shadish et al., 1995: 193) comments that policy makers most value data that come to them naturally. Linking evaluation to management systems fulfils this condition. The short lead-in time to the compilation of evaluation reports should also enhance their relevance to decision-making. Patton (1997: 93) concludes:

Integrating data collection into program implementation would be considered a problem – a form of treatment contamination – under traditional rules of research. (…) Making data collection integral rather than separate can reinforce and strengthen the program intervention. Such an approach can also be cost-effective and efficient since, when evaluation becomes integral to the program, its costs aren't an add-on. This enhances the sustainability of evaluation because, when it's built in rather than added on, it's not viewed as a temporary effort or luxury that can be easily dispensed with when cuts are necessary.

The second conclusion is that evaluation should run alongside the programme. Evaluators will be able to advise on the establishment of monitoring systems at the beginning, and so ensure that data are available for later impact evaluations. Long-term involvement ensures that the evaluator has a detailed understanding of each stage of the programme. The cycle of feelings that staff, participants, and other stakeholders tend to experience in a programme can be followed, as can the different costs and benefits from setting up and running the programme. Cook et al. (1981) point out that "the single study is by definition an imperfect vehicle for obtaining accurate results". Longitudinal research provides greater scope for testing factor relationships. Including several iterations of evaluation also provides deeper and deeper insight into the contextual factors underlying programmes (Pawson and Tilley, 1997). Lastly, long-term contracts give evaluators greater flexibility to respond to changing briefs.


4.7. Conclusion

This chapter has examined the findings from Chapter 3 and suggested ways of improving the evaluation of loan and grant schemes.

• Analysis of the theory underlying loan and grant schemes should be improved. This would lead to greater understanding of potential sources of leakage from the programme, and to greater additionality.

• Evaluators should use a range of methods, qualitative as well as quantitative. Including `softer' performance indicators such as social capital would give a fuller picture of the skills that agencies employ in managing loan and grant schemes.

• Judgements of value should take into account the trade-offs in managing loan and grant schemes, and the possible benefit from having agencies which complement each other's activity.

• Monitoring should be integrated into management systems. This would increase the agencies' appreciation of evaluation as well as increasing the availability of good quality longitudinal data, and reducing the cost of evaluation.

• Evaluation should be run alongside the programme. This would ensure that data systems are properly established from the beginning, and would allow continual feedback on implementation, thereby improving the practical value of evaluation.

• Evaluation findings should be presented in different media. This should include participation (funders attending open days) as well as the presentation of written reports.

• Evaluation studies should be co-ordinated across local areas. Monitoring should have a common core of data. This would allow key issues such as effectiveness to be examined, and would also increase the validity of comparisons between schemes.

Each of these changes potentially expands evaluation activity. As a counter-balance, evaluation needs to improve its focus. Shadish et al. (1995: 61) comment that "Good theory of practice – more than anything else – is about setting priorities and the trade-offs that go with doing so." The work already reviewed in Chapter 2 shows ways of doing this. Pawson and Tilley (1997) argue that focusing can be achieved by being theory-driven rather than data-driven. Patton (1997) would focus on the questions of value to key users. The ability to combine both approaches reflects the familiar tension between knowledge-oriented and use-oriented approaches to evaluation.


CHAPTER 5

Conclusions

"The shift from professional to managerial authority and values was accompanied by changes in the kind of knowledge exploited in evaluation of services. Government assumed that evaluation would be summative, delivering authoritative judgements, based as far as possible on performance indicators or quantitative measures of input–output relationships and outcomes and set against predefined targets and standards. The underlying theory of knowledge was positivist; it assumed that social phenomena could be divorced from their context and that objective knowledge about them could be achieved through empirical observation and quantitatively expressed; that facts were distinct from values and means from ends; that concepts and methods of `good management' were applicable to the pursuit of any values. Those assumptions were at variance with the dominant theory and practice of evaluation" (Henkel, 1991: 19–20).

The Financial Management Initiative (FMI) introduced by the Thatcher Government in 1982 increased the use and the importance of evaluation and performance management. Measures of inputs, throughputs or outputs, known as performance indicators, have been employed as the main way of controlling decentralised or contract organisations, and with time have replaced other forms of communication between the parties. Performance indicators have been employed to increase customer focus, and with time have come to symbolise the search for value for money and the reduction of public spending. These are high expectations which evaluation is wholly unable to meet. The subject still faces conceptual, organisational, and practical problems in its application.

Research from four case studies illustrates the evaluation problems of public sector small business loan and grant schemes. Performance indicators are subjective, partial, and inconsistent between cases. Their effects are to reduce co-operation and mutual esteem between schemes, to distance agencies from their own information needs, and to alienate small business clients. Readings are inconclusive and contentious between stakeholders. These are not merely technical problems. At their heart lie the difficulties of attempting to quantify complex and variable economic phenomena, of extracting enterprise agency operation from the many external factors which impinge on performance, and of simplifying and rigidifying the very dynamics of performance itself. Above all, these problems reflect the centrality of values and ideology in the organisation and operation of public sector programmes and in their evaluation. The positivist assumptions which are partly responsible for these problems can protect performance management from criticism through the claim of objectivity and technical skill. They cannot provide solutions to the practical problems of performance management.

The problems of using performance indicators should not come as a surprise. The United States has an extensive and long-standing literature on evaluation. Although still suffering from fundamental differences of opinion, American evaluation has nonetheless reached a position of understanding on the many problems facing evaluators.


Ten hypotheses on evaluation problems were derived from the literature, and then tested on the four small business loan and grant schemes. These hypotheses reflect the pessimistic view coming out of the literature that evaluation is rarely well specified, and seldom produces findings which are conclusive or welcome. Eight hypotheses were supported. Two were rejected, mainly because of the different experience of implementing evaluation at a project rather than a policy level. This work points to the need for more comprehensive research to examine the interpretation and use of performance indicators across small business loan and grant schemes, and in other fields of economic development.

The literature suggests a number of ways to improve evaluation, many of which are to do with the quality of the data and the rigour of analysis on which any programme assessment must be based. Some of these lie in the hands of the evaluators themselves. Evaluation can be integrated into management systems, co-ordinated across local areas, run alongside the programme, and made easier to absorb. However, evaluators should not make promises they cannot keep. They can give users and stakeholders a larger role in research, but they cannot themselves ensure that these people's values will be reflected in judgements. They can ask those being evaluated to open up to criticism in order to develop and learn, but they cannot promise that negative conclusions will not be used against those organisations which have co-operated. They can make recommendations relevant, clear, and precise, but they cannot promise that resources will be found to implement them.

Many evaluators are now calling for evaluation to move from an accountability purpose towards a developmental or knowledge role. This project supports such a change. However, it also emphasises the deep-seated transformation which will be needed for a developmental style to be created.


CHAPTER 6

Bibliography

Abma, T. A. (1997) Playing with/in plurality: revitalizing realities and relationships in Rotterdam. Evaluation 3(1), 25–48.
ACOST (1990) The Enterprise Challenge: Overcoming Barriers to Growth in Small Firms. HMSO, London.
Adelman, C. (1996) Anything goes: evaluation and relativism. Evaluation 2(3), 291–305.
Agarwala-Rogers, R. (1977) Why is evaluation research not utilized? Evaluation Studies Review Annual 2, 327–333.
Awasthi, D. N. and Jose, S. (1996) Evaluation of Entrepreneurship Development Programmes. Sage, London.
Barber, J., Metcalfe, J. S. and Porteous, M. (eds.) (1989) Barriers to Growth in Small Firms. Routledge, London.
Beeton, D. (1988) Performance Management: Getting the Concepts Right.
Boruch, R. F. (1976) On common contentions about randomized field experiments. Evaluation Studies Review Annual 1, 158–194.
Broadbent, J., Dietrich, M. and Laughlin, R. (1996) The development of principal-agent, contracting and accountability relationships in the public sector: conceptual and cultural problems. Critical Perspectives on Accounting 17(3), 259–284.
Bryk, A. S. (ed.) (1983) Stakeholder-based evaluation. New Directions for Program Evaluation, No. 17. Jossey-Bass, San Francisco, CA.
Buisseret, T., Cameron, H. M. and Georghiou, L. (1995) What difference does it make? Additionality in the support of large firms. International Journal of Technology Management 10(4–6), 587–600.
Bush, M. and Gordon, A. C. (1978) The advantages of client involvement in evaluation research. Evaluation Studies Review Annual 3, 767–783.
Bussman, W. (1996) Democracy and evaluation's contribution to negotiation, empowerment and information: some findings from Swiss democratic experience. Evaluation 2(3), 307–319.
Calsyn, R. J. and Klinkenberg, W. D. (1995) Response bias in needs assessment studies. Evaluation Review 19(2), 217–225.
Cameron, G. C. (1990) First steps in urban policy evaluation in the United Kingdom. Urban Studies 27(4), 475–495.
Centre for Business Research (1996) The Changing Status of British Enterprise.
Challis, D. (1996) Performance indicators for community-based social care: from theory to practice. Care Plan 12(4), 19–24.
Ciarlo, J. A. (ed.) Utilizing Evaluation: Concepts and Measuring Techniques. Sage, Beverly Hills, CA.
Cook, T. D. (1978) Utilization, knowledge-building, and institutionalization: three criteria by which evaluation research can be evaluated. Evaluation Studies Review Annual 3, 13–22.


Cracknell, B. E. (1996) Evaluating development aid: strengths and weaknesses. Evaluation 2(1), 23–33.
Cranfield European Enterprise Centre (1993) Special Report No. 8: Financial Characteristics of Small Companies in Britain.
Cranfield European Enterprise Centre (1993) The European Enterprise Index. Survey 6.
Cronbach, L. J. and Associates (1981) Our ninety-five theses. Evaluation Studies Review Annual 6, 27–37.
Das, T. H. (1983) Qualitative research in organisational behaviour. Journal of Management Studies 20(3), 301–314.
Department of Trade and Industry (1991) Constraints on the Growth of Small Firms. HMSO, London.
Drewitt, A. (1997) Evaluation and consultation: learning the lessons of user involvement. Evaluation 3(2), 189–204.
Dummond, E. J. (1994) Making best use of performance measures and information. International Journal of Operations and Production Management 14(9), 16–31.
Dunn, W. N. (1982) Reforms as arguments. Evaluation Studies Review Annual 7, 83–116.
Duran, P., Monnier, E. and Smith, A. (1995) Evaluation à la française: towards a new relationship between social science and public action. Evaluation 1(1), 45–63.
Everitt, A. and Hardiker, P. (1996) Evaluating for Good Practice. Macmillan, Basingstoke.
Everitt, A. (1996) Developing critical evaluation. Evaluation 2(2), 173–188.
Farrington, D. P. (1997) Evaluating a community crime prevention program. Evaluation 3(2), 157–173.
Fetterman, D. M., Kaftarian, S. J. and Wandersman, A. (eds.) (1996) Empowerment Evaluation: Knowledge and Tools for Self-Assessment and Accountability. Sage, London.
Foley, P. (1992) Local economic policy and job creation: a review of evaluation studies. Urban Studies 29(3/4), 557–598.
Garaway, G. (1996) The case-study model: an organisational strategy for cross-cultural evaluation. Evaluation 2(2), 201–211.
Greene, J. C. (1996) Qualitative evaluation and scientific citizenship: reflections and refractions. Evaluation 2(3), 277–289.
Gregory, D. G. and Martin, S. J. (1988) Issues in the evaluation of inner city programmes. Local Economy 4(2), 237–249.
Hall, D. (1995) Performance Measurement Under Scrutiny. University of Birmingham, Social Services Research.
Hambleton, R. and Thomas, H. (1995) Urban Policy Evaluation. Paul Chapman, London.
HM Government (1979) The Financing of Small Firms. The Report of the Committee to Review the Functioning of Financial Institutions (Wilson Report), Cmnd. 7503. HMSO, London.
Kaplan, R. S. and Norton, D. P. (1992) The balanced scorecard – measures that drive performance. Harvard Business Review January–February, 71–79.
Kaufman, C. C. (1995) Evaluation innovations for environments of systemic social change. Evaluation 1(2), 155–169.


Kazi, M. A. F. (1996) Single-case evaluation in the public sector using a combination of approaches. Evaluation 2(1), 85–97.
Kimmel, A. J. (1988) Ethics and Values in Applied Social Research. Sage, Newbury Park, CA.
Kirkhart, K. E. (1995) Seeking multicultural validity: a postcard from the road. Evaluation Practice 16(1), 1–12.
Knox, C. (1995) Concept mapping in policy evaluation: a research review of community relations in Northern Ireland. Evaluation 1(1), 65–79.
Laughlin, R. and Broadbent, J. (1996) Redesigning fourth generation evaluation: an evaluation model for the public-sector reforms in the UK? Evaluation 2(4), 431–451.
Leviton, L. and Hughes, E. F. X. (1981) Research on the utilization of evaluations: a review and synthesis. Evaluation Review 5(4), 525–548.
Light, R. J. and Smith, P. V. (1977) Accumulating evidence: procedures for resolving contradictions among different research studies. Evaluation Studies Review Annual 2, 195–238.
Lincoln, Y. S. (1994) Tracks towards a postmodern politics of evaluation. Evaluation Practice 15(3), 299–309.
Mawhood, C. (1997) Performance measurement in the United Kingdom (1985–1995). In Evaluation for the 21st Century, E. Chelimsky and W. R. Shadish (eds.). Sage, London.
Midgley, G. (1996) Evaluating services for people with disabilities: a critical systems perspective. Evaluation 2(1), 67–84.
Midwinter, A. (1994) Developing performance indicators for local government: the Scottish experience. Public Money and Management 14(2), 37–43.
Mishcon de Reya/Tilly, B. (1992) The Funding Requirements of Private Companies. Mishcon de Reya, London.
Morgan, G. and Smircich, L. (1980) The case for qualitative research. Academy of Management Review 5(4), 491–500.
Morrissey, O. (1995) Shifting paradigms: discourse analysis as an evaluation approach for technology assessment. Evaluation 1(2), 189–216.
ODA (1984) The Evaluation of Aid Projects and Programmes. HMSO, London.
Owen, J. M. (1995) Roles for evaluation in learning organisations. Evaluation 1(2), 189–216.
Pollitt, C. (1988) Bringing consumers into performance measurement: concepts, consequences and constraints. Policy and Politics 16, 77–87.
Pollitt, C. (1995) Justification by works or by faith? Evaluating the new public management. Evaluation 1(2), 133–154.
Rein, M. and White, S. H. (1978) Can policy research help policy? Evaluation Studies Review Annual 3, 24–41.
Robinson, F. and Wren, C. (1987) Evaluating the impact and effectiveness of financial assistance policies in the Newcastle metropolitan region. Local Government Studies 13, 49–61.
Robinson, S. (1996) Evaluating the progress of clinical audit: a research and development project. Evaluation 2(4), 373–392.
Robson, D., Bradford, M., Deas, I., Hall, E., Harrison, E., Parkinson, M., Evans, R., Garside, P., Harding, A. and Robinson, F. (1994) Assessing the Impact of Urban Policy. HMSO, London.

Rogerson, P. (1995) Performance measurement and policing: police service or law enforcement agency? Public Money and Management 15(4), 25–30.
Sanders, J. R. (1994) The Program Evaluation Standards. Sage, London.
Schmenner, R. W. and Vollmann, T. E. (1994) Performance measures: gaps, false alarms and the usual suspects. International Journal of Operations and Production Management 14(12), 58–69.
Schwandt, T. A. (1997) Evaluation as practical hermeneutics. Evaluation 3(1), 69–83.
Smircich, L. and Stubbart, C. (1985) Strategic management in an enacted world. Academy of Management Review 10(4), 734–736.
Storey, D. (1994) Understanding the Small Business Sector. Routledge, London.
Stufflebeam, D. L. and Webster, W. J. (1981) An analysis of alternative approaches to evaluation. Evaluation Studies Review Annual 1, 70–85.
Tilley, N. (1996) Demonstration, exemplification, duplication and replication in evaluation research. Evaluation 2(1), 35–50.
Van der Eyken, W., Goulden, D. and Crossley, M. (1995) Evaluating educational reform in a small state: a case study of Belize, Central America. Evaluation 1(1), 33–44.
Vaux, A., Stockdale, M. S. and Schwerin, M. J. (eds.) (1992) Independent Consulting for Evaluators. Sage, London.
Walker, R. (1994) Putting performance measurement into context: classifying social housing organisations. Policy and Politics 22(3), 191–202.
Wildavsky, A. (1978) The self-evaluating organization. Evaluation Studies Review Annual 3, 82–93.
Yin, R. K. (1982) The case study crisis. Evaluation Studies Review Annual 7, 167–174.
Yin, R. K. (1989) Case Study Research. Sage, Newbury Park, CA.


References

Alkin, M.C., 1990. Debates on Evaluation. Sage, London.
Argyris, C., Schön, D.A., 1978. Organizational Learning: A Theory of Action Perspective. Addison-Wesley, Reading, MA.
Audit Commission, 1989. Managing Services Effectively – Performance Review. HMSO, London.
Audit Commission, 1992. Citizen's Charter Performance Indicators. HMSO, London.
Ball, R., Monaghan, C., 1996. Performance review: the British experience. Local Government Studies 22 (1), 40–58.
Bank of England, 1994. Finance for Small Firms: A Note by the Bank of England. Bank of England, London.
Berk, R.A., Rossi, P.H., 1976. Doing good or worse: evaluation research politically re-examined. Social Problems 23 (3), 337–349.
Berk, R.A., Rossi, P.H., 1990. Thinking About Program Evaluation. Sage, London.
Birch, D.L., 1979. The Job Generation Process. MIT Program on Neighborhood and Regional Change, Cambridge, MA.
Birley, S., 1985. The role of networks in the entrepreneurial process. Journal of Business Venturing 1, 107–117.
Birley, S., Niktari, N., 1995. The Failure of Owner-Managed Businesses: The Diagnosis of Accountants and Bankers. Stoy Hayward, London.
Brickell, H.M., 1978. The influence of external political factors on the role and methodology of evaluation. Evaluation Studies Review Annual 3, 94–101.
Bulder, B., Leeuw, F., Flap, H., 1996. Networks and evaluating public-sector reforms. Evaluation 2 (3), 261–276.
Campbell, D.T., 1979. Assessing the impact of planned social change. Evaluation and Program Planning 2, 67–90.
Campbell, D.T., Stanley, J.C., 1963. Experimental and Quasi-Experimental Designs for Research. Rand McNally, Chicago.
Campbell, D.T., Boruch, R.F., 1975. Making the case for randomised assignment to treatments by considering the alternatives: six ways in which quasi-experimental evaluations in compensatory education tend to underestimate effects. In: Bennett, C.A., Lumsdaine, A.A. (Eds.), Evaluation and Experiments: Some Critical Issues in Assessing Social Programmes. Academic Press, New York.
Carter, N., 1989. Performance indicators: backseat driving or hands off control? Policy and Politics 17 (2), 131–138.
Carter, N., Klein, R., Day, P., 1992. How Organisations Measure Success: The Use of Performance Indicators in Government. Routledge, London.
Cave, M., Kogan, M., Smith, R., 1990. Output and Performance Measures in Government: The State of the Art. Jessica Kingsley Publishers, London.
CBI, 1993. Finance for Growth: Meeting the Financing Needs of Small and Medium Enterprises. CBI, London.
Chelimsky, E., 1987. What have we learned about the politics of program evaluation? Evaluation Practice 8 (2), 5–21.
Chelimsky, E., 1997. Thoughts for a new evaluation society. Evaluation 3 (1), 97–118.
Chen, H-T., 1990. Theory-Driven Evaluations. Sage, London.
Chen, H-T., Rossi, P.H., 1981. The multi-goal, theory-driven approach to evaluation: a model linking basic and applied social science. Evaluation Studies Review Annual 6, 38–53.
Choudhary, A., Tandon, R., 1988. Participatory Evaluation. Society for Participatory Research in Asia, New Delhi.
Cook, T.D., Levinson-Rose, J., Pollard, W.P., 1981. The misutilization of evaluation research: some pitfalls of definition. Evaluation Studies Review Annual 6, 727–748.
Cranfield European Enterprise Centre, 1993. Special Report No. 5: Attitudes of Smaller Firms Towards Financing and Financial Institutions in Europe. Cranfield University, Cranfield.
Cronbach, L.J., 1982. Designing Evaluations of Educational and Social Programs. Jossey-Bass, San Francisco.
Etzioni, A., 1960. Two approaches to organisational analysis: a critique and a suggestion. Administrative Science Quarterly 5, 257–258.
Fetterman, D.M., Kaftarian, S.J., Wandersman, A. (Eds.), 1996. Empowerment Evaluation: Knowledge and Tools for Self-Assessment and Accountability. Sage, London.


Finne, H., Levin, M., Nilssen, T., 1995. Trailing research: a model for useful program evaluation. Evaluation 1 (1), 11–31.
Ford, D. (Ed.), 1990. Understanding Business Markets: Interaction, Relationships, Networks. Academic Press, London.
Georghiou, L., 1995. Assessing the framework programmes – a meta-evaluation. Evaluation 1 (2), 171–188.
Ghaie, S., Birley, S., 1993. Networking by the Indian business community in Northern Ireland. The Journal of Entrepreneurship 2 (2), 209–234.
Ghobadian, A., Ashworth, J., 1994. Performance measurement in local government – concept and practice. International Journal of Operations and Production Management 14 (5), 35–50.
Glynn, J.J., Murphy, M.P., 1996. Public management: failing accountabilities and failing performance review. International Journal of Public Sector Management 9 (5/6), 125–137.
Gray, A., 1997. Contract culture and target fetishism: the distortive effects of output measures on local regeneration programmes. Local Economy 11 (4), 343–357.
Guba, E.G., Lincoln, Y.S., 1989. Fourth Generation Evaluation. Sage, London.
Hammer, M., 1990. Reengineering work: don't automate, obliterate. Harvard Business Review July–August, 104–112.
Harmon, M.M., Mayer, R.T., 1986. Organisational Theory for Public Administration. Little Brown, Boston, MA.
Haug, P., 1996. Evaluation of government reforms. Evaluation 2 (4), 417–430.
Hedrick, T.E., 1988. The interaction of politics and evaluation. Evaluation Practice 9 (3), 5–28.
Henkel, M., 1991. Government, Evaluation and Change. Jessica Kingsley, London.
Henry, G.T., 1992. Using graphical displays to empower evaluation audiences. In: Vaux, A., Stockdale, M.S., Schwerin, M.J. (Eds.), Independent Consulting for Evaluators. Sage, London.
Hill, T., 1995. Manufacturing Strategy. Macmillan, London.
HM Treasury, 1988. Policy Evaluation: A Guide for Managers. HMSO, London.
Hogwood, B.W., Gunn, L.A., 1984. Policy Analysis for the Real World. Oxford University Press, Oxford.
House, E.R., 1978. Justice in evaluation. Evaluation Studies Review Annual 3, 75–99.
House, E.R., 1980. Evaluating with Validity. Sage, London.
House, E.R., 1993. Professional Evaluation: Social Impact and Political Consequences. Sage, London.
Hughes, A., Storey, D.J. (Eds.), 1994. Finance and the Small Firm. Routledge, London.
Inayatullah, J., Birley, S., 1996. The Orangi Pilot Project: The Evaluation of a Micro-Enterprise Credit Institution. Discussion Paper, Imperial College, London.
Jackson, A., 1996. Foyers: The Step in the Right Direction. Foyer Federation, London.
Jackson, P., 1988. The management of performance in the public sector. Public Money and Management 8 (4).
Jackson, P.M., 1993. Public service performance evaluation: a strategic perspective. Public Money and Management, 9–14.
Johnson, H.T., Kaplan, R.S., 1991. Relevance Lost: The Rise and Fall of Management Accounting. Harvard Business School Press, Boston.
Joint Committee on Standards for Educational Evaluation, 1994. The Program Evaluation Standards: How to Assess Evaluations of Educational Programs. 2nd ed. Sage, London.
JURUE, 1986. Assessment of Industrial and Commercial Improvement Areas. HMSO, London.
Karlsson, O., 1996. A critical dialogue in evaluation: how can the interaction between evaluation and politics be tackled? Evaluation 2 (4), 405–416.
Kotler, P., 1994. Principles of Marketing. Prentice-Hall, Englewood Cliffs, NJ.
Kushner, S., 1996. The limits of constructivism in evaluation. Evaluation 2 (2), 189–200.
Likierman, A., 1993. Performance indicators: 20 early lessons from managerial use. Public Money and Management 13 (4), 15–22.
Love, J.A., 1991. Internal Evaluation: Building Organisations from Within. Sage, London.
MacDonald, B., 1976. A political classification of evaluation studies. In: Hamilton, D. (Ed.), Beyond the Numbers Game. Macmillan, London.
March, J., Simon, H., 1958. Organisations. Wiley, London.
McEldowney, J.J., 1997. Policy evaluation and the concepts of deadweight and additionality: a commentary. Evaluation 3 (2), 175–188.
Meekings, A., 1995. Unlocking the potential of performance measurement: a practical implementation guide. Public Money and Management 15 (4), 5–12.


Midland Bank, 1992. The Changing Financial Requirements of Smaller Companies. Midland Bank Business Economics Unit, London.
Mishan, E.J., 1971. Cost-Benefit Analysis: An Informal Introduction. Allen and Unwin, London.
Murphy, K.R., Cleveland, J.N., 1995. Understanding Performance Appraisal: Social, Organizational and Goal-Based Perspectives. Sage, London.
Newman, D.L., Brown, R.D., 1996. Applied Ethics for Program Evaluation. Sage, London.
Newton, T., Findlay, P., 1996. Playing god? The performance of appraisal. Human Resource Management Journal 6 (3), 42–57.
OECD, 1988. Measures to Assist the Long-Term Unemployed: Recent Experience in Some OECD Countries. OECD, Paris.
Ostgaard, T.E., Birley, S., 1994. Personal networks and firm competitive strategy – a strategic or coincidental match? Journal of Business Venturing 9, 281–305.
P.A. Cambridge Economic Consultants, 1987. An Evaluation of the Enterprise Zone Experiment. HMSO, DOE, London.
Palumbo, D.J. (Ed.), 1989. The Politics of Program Evaluation. Sage, London.
Patton, M.Q., 1990. Qualitative Evaluation and Research Methods. Sage, London.
Patton, M.Q., 1996. Utilization-Focused Evaluation. 3rd ed. Sage, London.
Pawson, R., 1996. Three steps to constructivist heaven. Evaluation 2 (2), 213–219.
Pawson, R., Tilley, N., 1997. Realistic Evaluation. Sage, London.
Pearce, G., Martin, S., 1996. The measurement of additionality: grasping the slippery eel. Local Government Studies 22 (1), 78–92.
Pedler, M., Burgoyne, J., Boydell, T., 1991. The Learning Company: A Strategy for Sustainable Development. McGraw-Hill, Maidenhead.
Power, M., 1994. The Audit Explosion. Demos, London.
Radaelli, C.M., Dente, B., 1996. Evaluation strategies and analysis of the policy process. Evaluation 2 (1), 51–66.
Rebien, C.C., 1996. Participatory evaluation of development assistance: dealing with power and facilitative learning. Evaluation 2 (2), 151–171.
Richardson, R., Kuipers, H., Soeters, J.L., 1996. Evaluation of organisational change in the Dutch armed forces. Evaluation 2 (1), 7–22.
Rogers, S., 1990. Performance Management in Local Government.
Rossi, P.H., 1985. The iron law of evaluation and other metallic rules. Paper presented at State University of New York, Albany, Rockefeller College.
Rossi, P.H., Freeman, H.E., 1985. Evaluation: A Systematic Approach. 3rd ed. Sage, London.
Rossi, P.H., Freeman, H.E., 1993. Evaluation: A Systematic Approach. 5th ed. Sage, London.
Scriven, M., 1976. Evaluation bias and its control. Evaluation Studies Review Annual 1, 119–139.
Scriven, M., 1991. Evaluation Thesaurus. 4th ed. Sage, London.
Scriven, M., 1994. Product evaluation – the state of the art. Evaluation Practice 15 (1), 45–62.
Scriven, M., 1996. The theory behind practical evaluation. Evaluation 2 (4), 393–404.
Senge, P., 1990. The Fifth Discipline: The Art and Practice of the Learning Organisation. Doubleday/Currency, New York.
Shadish, W.R., Cook, T.D., Leviton, L.C., 1995. Foundations of Program Evaluation: Theories of Practice. Sage, London.
Smith, R.S.G., Walker, R.M., 1994. The role of performance indicators in housing management: a critique. Environment and Planning A 26, 609–621.
Spence, B., 1996. Visualisation Really Has Nothing To Do With Computers. Information Engineering Section Report 96/2. Imperial College, London.
Stake, R.E. (Ed.), 1975. Evaluating the Arts in Education: A Responsive Approach. Merrill, Columbus, OH.
Stake, R.E., 1978. The case study method in social inquiry. Educational Researcher 7, 5–8.
Stake, R.E., 1981. Case study methodology: an epistemological advocacy. In: Welch, W. (Ed.), Case Study Methodology in Educational Evaluation. Minnesota Research and Evaluation Center, Minneapolis.
Stake, R.E., 1995. The Art of Case Study Research. Sage, London.
Stanworth, J., Gray, C. (Eds.), 1991. Bolton 20 Years On: The Small Firm in the 1990s. Small Business Research Trust, London.


Stewart, J., Walsh, K., 1994. Performance measurement: when performance can never be finally defined. Public Money and Management 14 (2), 43–49.
Storey, D.J., 1990. Evaluation of policies and measures to create local employment. Urban Studies 26, 587–606.
Storey, D.J., 1993. Should We Abandon the Support to Start-Up Business? Working Paper No. 11. Warwick Business School Small and Medium Enterprise Centre.
Strand, S., 1997. Key performance indicators for primary school improvement. Education Management and Administration 25 (2), 145–153.
Torres, R.T., Preskill, H.S., Piontek, M.S., 1996. Evaluation Strategies for Communication and Reporting: Enhanced Learning in Organisations. Sage, London.
Townley, B., 1994. Reframing Human Resource Management: Power, Ethics and the Subject at Work. Sage, London.
TUC, 1997. The Small Firms Myths. TUC, London.
Van de Knaap, P., 1995. Policy evaluation and learning: feedback, enlightenment or argumentation? Evaluation 1 (2), 189–216.
VanderPlaat, M., 1995. Beyond technique: issues in evaluating for empowerment. Evaluation 1 (1), 81–96.
Wallace, W.A., 1980. The Economic Role of the Audit in Free and Regulated Markets. University of Rochester, New York.
Weiss, C.H., 1973. The politics of impact measurement. Policy Studies Journal 1, 179–183.
Weiss, C.H., 1977. Research for policy's sake: the enlightenment function of social research. Policy Analysis 3, 531–545.
Weiss, C.H., 1980. Knowledge creep and decision accretion. Knowledge: Creation, Diffusion, Utilisation 1, 381–404.
Weiss, C.H., 1983. The stakeholders' approach to evaluation: origin and promise. In: Bryk, A.S. (Ed.), Stakeholder-based Evaluation. New Directions for Program Evaluation, No. 17. Jossey-Bass, San Francisco, CA.
Weiss, C.H., 1988. Evaluation for decisions: is anyone there? Does anyone care? Evaluation Practice 9 (1), 5–19.
Wholey, J.S., 1981. Using evaluation to improve program performance. Evaluation Studies Review Annual 6, 55–69.
Zuboff, S., 1988. In the Age of the Smart Machine: The Future of Work and Power. Basic Books, New York.
