An evaluation of evaluation: problems with performance measurement in small business loan and grant schemes
Annabel Jackson
Annabel Jackson Associates, 52 Lyncombe Hill, Bath BA2 4PJ, UK
Abstract
This project researched the problems of evaluating public sector small business loan and grant
schemes. The methodology was to use case studies of four loan and grant schemes run by enterprise
agencies on contract from task forces or city challenges. These case studies were selected following a
mapping exercise of 27 finance schemes in north east London.
Research showed that evaluation relied on performance indicators. The main measures used were
deployment of funds, job creation, leverage of additional funds, lender of last resort, ethnicity of
recipients, default rate, and enquiry levels. Four technical problems with these performance indicators were found. First, the interpretation of indicators varied between organisations. Second, performance indicators judged agencies on areas which were outside their control. Third, indicators failed to take account of the full range of work involved in managing a loan and grant scheme. Fourth,
lender of last resort and leverage were incompatible in their preferred risk position and no attempt
was made to reconcile the two.
That these technical weaknesses were not taken into account in the weight given to performance indicators led to three organisational effects. First, agencies seemed to have made their own attempt to render the indicators meaningful, reflecting the conditions under which the schemes had been first established. The existence of strong political, auditing, and time pressures led to three management styles designated as client oriented, bureaucratic, and entrepreneurial. Second, the system encouraged managing agencies to distort figures to produce more favourable results. For instance, rescheduling of debts allowed the agency to avoid recording high levels of default. Third, the focus of information gathering on external needs (those of the funders) seemed to have discouraged agencies from contemplating and developing their own data systems. This might reflect either a lack of resources or a fear that any information could be used against them.
Several possible explanations for these problems are explored in this report. Objectives were
ambiguous because of the under-development of programme theory. Data was not always available
because its collection was not of value to the enterprise agency. Communication between funders
Progress in Planning 55 (2001) 1–64
www.elsevier.nl/locate/pplann
0305-9006/01/$ - see front matter © 2001 Published by Elsevier Science Ltd.
PII: S0305-9006(00)00019-2
E-mail address: ajataja@aol.com
and agencies was poor because funders had come to see their main power as residing in the
re-tendering of contracts.
These points present part of the explanation, but a further level of analysis is possible. That theory
is under-developed, data unavailable, and communication weak are symptoms of a deeper problem.
This is the problem that performance management is operating under a positivist theory of knowledge. Several possible improvements to the evaluation of small business loan and grant schemes are
explored in this paper. Key among these is an acknowledgement of the limitations of evaluation
through adopting a developmental approach within a culture of organisational learning. This paper
illustrates the value of adopting a developmental approach, but also the fundamental transformation
which it implies.
"With each new evaluation, the evaluator sets out, like an ancient explorer, on a quest for useful knowledge, not sure whether seas will be gentle, tempestuous, or becalmed. Along the way the evaluator will often encounter any number of challenges: political intrigues wrapped in mantles of virtue; devious and flattering antagonists trying to co-opt the evaluation in service of their own narrow interests and agendas; unrealistic deadlines and absurdly limited resources; gross misconceptions about what can actually be measured with precision and definitiveness; deep-seated fears about the evils-incarnate of evaluation, and therefore, evaluators; incredible exaggerations of evaluators' power; and insinuations about defects in the evaluator's genetic heritage" (Patton, 1997: 38; Utilisation Focused Evaluation). © 2001 Published by Elsevier Science Ltd.
CHAPTER 1
Introduction
1.1. The research question
This project investigates the problems of evaluating public sector small business loan
and grant schemes. It leads to recommendations about ways to improve the meaningfulness, validity, fairness, usefulness, and practicality of evaluation.
1.2. Research methodology
There are three elements to the research methodology. First, a literature review of evaluation theory was used to derive hypotheses on the potential problems of evaluating loan and grant schemes. Second, a mapping exercise of all finance schemes was carried out for the Hackney area. This put together information on the character, conditions, operation, performance, and procedures for public sector loan and grant schemes active in and
around Hackney. The mapping exercise had an additional role in providing background on
evaluation techniques used and the interpretation of performance indicators. Third, from
the mapping exercise, four loan and grant schemes were selected for case study to test the
hypotheses on evaluation derived from the literature review.
The fieldwork for the case studies comprised interviews with the scheme managers and
funders, compilation of data on each of the loan holders, interviews with 24 loan holders
across two of the schemes, and analysis of written material on the loan and grant schemes.
1.3. Definition of evaluation
"Evaluation is the process of determining the merit, worth and value of things, and evaluations are the products of that process" (Scriven, 1991: 1).
Scriven (1991) sees evaluation as a transdiscipline, wider than one area of applied social
science. It provides basic tools which span disciplines, rather in the manner of logic,
design, and statistics. Evaluation combines two processes. Compiling, analysing, and simplifying or standardising data is only the first step in evaluation. The second step inevitably involves the imposition of values or standards. Scriven sees applications such as programme, personnel, product, and material evaluation as branches of the core discipline.
However, as Chelimsky (1997) argues, evaluation is also in essence action-oriented. She describes three broad purposes for evaluation. The accountability function judges the impact of a programme, its efficiency, and effectiveness. The development function reflects on the operation of a programme and provides recommendations for improvement. The knowledge function contributes to the generation of a pool of knowledge about social (or economic) phenomena.
This project concentrates on the sub-area of programme evaluation while also recognising the intellectual and practical gains from making links to the sister applications identified by Scriven.
1.4. Public sector small business loan and grant schemes
Within the field of programme evaluation, loan and grant schemes are particularly interesting. Finance is one of the most common weaknesses of small firms, whether through under-capitalisation or cash flow problems. Storey (1993: 1) comments that:
"The idea that problems in the financing of smaller firms have significantly hindered the role they play in the overall performance of the UK economy is deeply rooted."
Loan and grant schemes present a microcosm of financial and non-financial pressures on economic development. Inayatullah and Birley (1996: 7) draw attention to the ambiguity in the World Bank's objectives, which call for application of the fundamental principles of finance but also commitment to reaching the poor. They observe that: "There exists no standard prescription in the literature for setting up and operating micro-credit schemes, most of which have to tread the narrow path between charity and business." Loan and grant schemes have a discrete decision-point (approval or rejection) in which financial and non-financial pressures have to be reconciled. Other economic development schemes do not have such a clear outcome.
This distinct character means that the approach to evaluation currently carried out by
public sector agencies starts from a different basis from that applied to other policy areas
which do not have a discrete decision-point (for example, training or business advice). The
conclusions on loan schemes might not, then, be representative of other policy areas.
1.5. Structure of the project report
This report adopts the framework of Shadish et al. (1995). They identify five components of programme evaluation:

• Appropriate phrasing of the social problem which the public sector programme seeks to alleviate (Social Programming).
• Generation of valid information about the programme (Knowledge Construction).
• Application of appropriate value judgements (Valuing).
• Influence over appropriate policy decisions (Use).
• Organisation of work within resource constraints (Practice).

These five headings broadly correspond with the requirements for evaluation to be meaningful, valid, fair, useful, and practical. Four of the five headings map onto the criteria advocated by the American Program Evaluation Standards (Joint Committee on Standards for Educational Evaluation, 1994): accuracy, propriety, utility, and feasibility. The fifth element, Social Programming, is not identified in the Evaluation Standards. This term will be renamed `Socio-economic Programming' for the purposes of this project, to reflect the focus on economic development rather than social policy.
Fig. 1 shows the four stages to the project. The literature review is described in
Chapter 2, and the empirical work in Chapter 3. Implications from the empirical work
are discussed in Chapter 4, before making conclusions and recommendations in Chapter 5.
Fig. 1. Diagram of project structure.
CHAPTER 2
Literature review on evaluation problems
2.1. Introduction
This literature review draws on work across the field of programme evaluation. The review had three elements: readings from the main American theorists, Scriven (1994, 1976, 1991, 1996), Campbell and Stanley (1963), Campbell and Boruch (1975), Weiss (1973, 1980, 1983, 1988), Wholey (1981), Stake (1978, 1981, 1995), Rossi et al. (1993), Chelimsky (1987), Chelimsky et al. (1997), Guba and Lincoln (1989), House (1978, 1980, 1993), House et al. (1996), Chen and Rossi (1981), Chen (1990) and Patton (1990, 1996); a review of the main evaluation journals (Evaluation, Evaluation and Program Planning, Evaluation Review, Evaluation Studies Review Annual and Evaluation Practice); and a review of policy-related journals such as Fiscal Studies, Local Economy, Local Government Studies, Municipal Journal, Local Government Chronicle, International Journal of Public Sector Management, Public Administration, Public Money and Management, and Policy Studies.
American evaluation theory has a long history, dating back to the large-scale social
experiments of the 1960s (House, 1993). The American literature is well developed but
divided. Evaluation has been bedevilled by intense `paradigm wars' (Pawson and Tilley, 1997) between social science-based evaluators who give priority to knowledge construction and stakeholder-based evaluators who give priority to use. Newman and Brown (1996) argue that this conflict mirrors fundamental ethical dilemmas inherent in evaluation: the evaluator's choice of autonomy (which is taken to run together with justice and fidelity to users or other stakeholders) versus fidelity to the client. Shadish et al. (1995) emphasise that evaluation theory and practice must deal with both knowledge construction and use as well as developing the relatively neglected areas of programme theories, value frameworks, and guidelines about practical trade-offs in evaluation design.
In Britain, evaluation has only reached widespread importance since the Financial
Management Initiative (FMI) introduced by the Thatcher Government in 1982. Increased
use of evaluation in the public sector reflects four factors. First, performance management helped justify cuts in public spending. "Calling for efficiency improvements through the better management of performance allowed the government to cut public expenditure without necessarily advocating or, more significantly, being seen to advocate service level depletion, a process facilitated by the politically irresistible `value for money' tag. It was difficult to oppose the concept of value for money without seeming to advocate or at least defend waste and inefficiency" (Ball and Monaghan, 1996: 40). Second, performance
management was consistent with the government's desire for greater accountability.
Performance indicators were introduced to increase control over decentralised activities
and, more recently, to help with the management of compulsory competitive tendering
(Ball and Monaghan, 1996: 42). Agency theory shows how monitoring is needed where
there are `hidden actions' or `information asymmetries' which expose principals to `moral
hazards' or `adverse selection' (Wallace, 1980). Third, performance indicators were part
of an attempt to increase customer focus and improve quality (Ghobadian and Ashworth,
1994: 35). Fourth, performance indicators have been used positively by councils to help
focus effort on matters of strategic importance. "A well-designed performance review system is one method of operationalising manifesto commitments and thus has attraction for members." (Ball and Monaghan, 1996: 42). Use of performance indicators went hand
in hand with the increased politicisation of local government.
Performance management refers to "an integrated set of planning and review procedures which cascade down through the organisation to provide a link between each individual and the overall strategy of the organisation." (Rogers, 1990: 16). This is a sub-set of the area included within the field of evaluation. In Britain, the remit has been further narrowed with the Audit Commission's focus on the three Es: economy (the extent to which the inputs are minimised), efficiency (the relationship of the output of an activity or organisation to the associated inputs), and effectiveness (the extent to which outputs contribute to final objectives). The concept of value for money, which refers to the social and economic benefit of an activity in relation to cost, has in practice been used as a shortened term for the three Es (Cave et al., 1990: 42). Economy and efficiency are more easily measured than effectiveness, and these two categories of performance indicators have come to dominate evaluation. As Rogers (1990: 48) explains: "There is plenty of available data about resources and about the quantity of service provided. There is much less data available about the quality of service and its effects on consumers and the public generally."
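The three Es can be read as a chain of ratios running from inputs through outputs to outcomes. A minimal sketch, using invented figures for a hypothetical business-advice service (none of the numbers come from the study):

```python
# Illustrative three Es arithmetic with hypothetical figures.

budget = 200_000.0        # inputs: money spent (economy asks whether this is minimised)
advice_sessions = 4_000   # outputs: units of service actually delivered
jobs_safeguarded = 80     # outcomes: contribution to the final objective

# Efficiency relates outputs to inputs; effectiveness relates outcomes
# to outputs. Only the first two fall out of routine administrative data,
# which is one reason they tend to dominate evaluation in practice.
efficiency = advice_sessions / budget               # sessions per pound
effectiveness = jobs_safeguarded / advice_sessions  # jobs per session

cost_per_job = budget / jobs_safeguarded  # a crude value-for-money figure
print(efficiency, effectiveness, cost_per_job)
```

The point of the sketch is structural: the effectiveness ratio needs outcome data (jobs safeguarded) that routine monitoring rarely captures, whereas efficiency can be computed from the ledger alone.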
In an effort to strengthen the treatment of quality and effectiveness, the Audit Commission (1989) has clarified that outputs (the use made of resources or the service actually delivered to the public) do not necessarily lead to outcomes (the ultimate value or benefit of the service to its users). Nor can changes in a social or economic system be attributed unequivocally to programme intervention: some activities or expenditure would have occurred without the programme (deadweight); some activities or expenditures are offset through the programme (displacement). Nor are all changes equally desirable. Evaluators have added a fourth E, equity, to cover distributional effects relative to need.
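The deadweight and displacement adjustments amount to a simple piece of accounting. The sketch below uses hypothetical job-creation figures; the rates and totals are invented purely for illustration:

```python
# Hypothetical additionality accounting for a job-creation programme.
# Gross outputs are discounted for deadweight (jobs that would have
# arisen anyway) and displacement (jobs offset by losses elsewhere).

gross_jobs = 100
deadweight_rate = 0.40    # assumed share that needed no intervention
displacement_rate = 0.20  # assumed share offset in unassisted firms

deadweight = gross_jobs * deadweight_rate
displacement = (gross_jobs - deadweight) * displacement_rate
net_additional_jobs = gross_jobs - deadweight - displacement

print(net_additional_jobs)  # the figure attributable to the programme
```

Under these assumed rates, fewer than half of the gross jobs would count as additional, which is why indicators that report only gross outputs flatter a programme's impact.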
2.2. Socio-economic programming
2.2.1. Introduction
"A social problem is a social construction" (Berk and Rossi, 1990: 39).
"A social or intervention program is the purposive and organised effort to intervene in an ongoing social process for the purpose of solving a problem or providing a service. The questions of how to structure the organised efforts appropriately and why the organised efforts lead to the desired outcomes imply that the program operates under some theory. Although this theory is frequently implicit or unsystematic, it provides general guidance for the formation of the program and explains how the program is supposed to work" (Chen, 1990: 39).
"All policies involve assumptions about what governments can do and what the consequences of their actions will be. These assumptions are rarely spelt out, but policies nevertheless do imply a theory (or model) of cause and effect" (Hogwood and Gunn, 1984: 18).
Socio-economic Programming deals with the conceptual basis for defining socio-economic problems and designing public policy programmes. This is also known as `substantive knowledge' (Rossi in Chen, 1990), or `theory-driven evaluation' (Chen, 1990). The starting point is the argument that public sector policy can be seen as based on a set of assumptions which together have the structure of a theory. This view has been expressed not only by evaluators (Berk and Rossi, Chen), but also by policy analysts (Hogwood and Gunn).
Chen (1990) distinguishes two elements of programme theory: normative and causative
theory. Normative theory examines the ideal character (`treatment evaluation'), context
(`implementation environment evaluation'), and effects of the programme (`outcome
evaluation'). This comprises an analysis of the goals and objectives for the programme,
including intermediate objectives. Causative theory examines the observed effects
(`impact evaluation'), processes linking the treatment and the outcome (`intervening
mechanism evaluation'), and wider relevance of the programme (`generalisation evaluation'). This is about the mechanisms linking objectives with programme effects, including
indirect impacts.
The two main problems of Socio-economic Programming follow this division into
normative and causative theory. First, objectives have been found to be ambiguous.
Second, the theoretical assumptions underpinning policy are usually unstated.
2.2.2. Normative theory: clari®cation of objectives
Vague objectives are common in public programmes: "The goals of social programs are often global, diffuse, diverse and inconsistent, vary over stakeholders, and may have little to do with program functioning." (Shadish et al., 1995: 184, commenting on the work of Weiss). Chen (1990: 90) attributes this weakness to the conflicting roles which goals or objectives must serve. He identifies six roles: legitimising the existence of the programme; binding political coalitions; levering resources; enabling a budget to be allocated; providing criteria for performance appraisal; and directing implementation. The first three of these are political rather than operational roles, and employ ambiguity to smooth over conflicts and bind coalitions. By the time policy is implemented, the gap between objectives and practice may have widened because of changes in external conditions, translation of abstract principles into practical guidelines, and the exercise of personal autonomy by project managers (Chen, 1990: 177).
Evaluation theorists differ in their views on the importance of having clear objectives. At one extreme, Wholey (Shadish et al., 1995: 237) thinks programmes should be tested for evaluability before full-scale evaluation is started. Programmes without clearly defined objectives are dismissed as unevaluable. At the other extreme, Scriven (1976) argues for a process of `goal-free evaluation'. He sees goals as a source of bias and wants evaluators to investigate the impact of programmes without any knowledge of them, in what he calls `a step beyond double-blind methodology'. Between these extremes, Patton (1996) defines goals at an operational level by asking project managers about the intended impact on clients, and Chen (1990) identifies `plausible goals' by looking at which activities attract the most resources. These two would be described by Etzioni (1960) as focusing on the `real' rather than the `formal' goals.
Clarifying objectives can highlight questionable or inconsistent assumptions,
differential expectations of stakeholders, unnecessary activities, and weak links in
implementation.
2.2.3. Causative theory: identi®cation of linking mechanisms
Chen (1990: 18) argues that refinement and analysis of the conceptual foundations of programmes have been neglected because methodologists have taken a black box approach focusing on impact, looking at `the overall relationships between the inputs and outputs of a program without concern for the transformation processes in the middle'.
Bickman (1987), cited in Chen (1990: 29), gives four advantages of specifying programme theory. First, evaluators can distinguish between programme failure due to an invalid theory and failure due to incorrect or incomplete implementation. Second, understanding the connections between a programme's operations and its effects helps the evaluator to anticipate direct and indirect impacts. Third, identification of intermediate effects of a programme can be used by the evaluator to give early feedback of implementation problems. Fourth, information on the processes underpinning programmes is of greater practical relevance to programme managers than general statements about impact.
2.2.4. Developing theory
Ways of improving Socio-economic Programming are given by Chen (1990) in his
Theory-Driven Evaluation, and Pawson and Tilley (1997) in their Realistic Evaluation.
Chen (1990: 226) lays down principles to help identify those key components of the
programme where theory most needs to be developed. He argues that an understanding
of programmes develops by testing whether the theories underlying the `research system'
still apply in the `generalising system'. This is analogous to looking at whether processes
in statistical samples are representative of those in their wider population.
Pawson and Tilley (1997: 119) see Chen's approach as being about `the continual betterment of practice' as opposed to the experimentalists' goal of `the secure transferability of knowledge'. They take Chen's work as a basis to develop their own scientific realist approach: the position that social phenomena have an existence beyond the constructions placed upon them.
Pawson and Tilley (1997) provide an iterative structure to build up a theoretical understanding of programmes in terms of their mechanisms, elements, contexts, and outcomes. Their version of scientific realism argues that programmes work only in certain forms, for certain people, and in certain contexts. The aim of evaluation is to provide continuous feedback in ever finer detail about the conditions under which policy succeeds or fails (Fig. 2). Pawson and Tilley (1997: 22) argue that traditional positivist approaches, which infer cause from co-variance between aggregate variables, ignore conditional and contingent factors. Quoting Guba and Lincoln (1989: 60), they suggest that: "Experimentation tries to minimalise all the differences (except one) between experimental and control groups and thus `effectively strips away the context and yields results that are valid only in other contextless situations.'"
Realistic Evaluation strengthens Socio-economic Programming in three ways. First,
programmes are seen to be composed of different sub-processes. These sub-processes
differ in their effectiveness and in their compatibility with different environments.
Second, social and economic phenomena are seen to have multiple causes which act
synergistically rather than individually. This can produce a complex pattern whereby
individual programme effects cancel each other out. Third, Pawson and Tilley restore
human agency through asking about the possible motivations of programme participants.
Programmes are seen as `offering chances which may (or may not) be triggered into action
by the subject's capacity to make choices.' This is a conceptual, and possibly an ethical,
advance from seeing programmes acting upon passive subjects.
2.3. Knowledge construction
2.3.1. Introduction
Knowledge Construction is the technical side of evaluation concerned with the production of information (Shadish et al., 1995). It provides guidance on the appropriate techniques for obtaining information (methodology), the nature and origins of that information (epistemology), and the status of the information created, whether real or constructed (ontology). Knowledge Construction is the most well developed facet of evaluation. Lessons from applied social research, with which evaluation has much in common, have contributed to the body of understanding.
Fig. 2. The realist evaluation cycle.
Conflict between science-based and stakeholder-based evaluators is mirrored in a split between positivists and constructionists, and between quantitative and qualitative researchers. The framework of Shadish et al. is intended to bridge this gap by advocating an eclectic methodology tailored to the specific circumstances of each evaluation.
Knowledge Construction in the UK focuses on performance management and, within this, on the use of performance indicators. Performance indicators are defined as `quantitative or qualitative measures of inputs, throughputs or outputs' (Smith and Walker, 1994).
2.3.2. A critique of performance indicators
Discussion about the value of performance indicators is an important element of the
British literature on evaluation. Criticisms about the use of performance indicators include
the following.
• Subjectivity. Public services do not always lend themselves to quantification. Performance indicators frequently use nominal data for variables which are in practice continuous. Measurement of performance in public services is complicated by the inherent characteristics of services: intangibility, inseparability, variability, and perishability (Kotler, 1994). Stewart and Walsh (1994: 47) conclude that technical solutions will never be sufficient to overcome these inherent tensions: "Many of the apparent technical difficulties reflect the impossibility of regarding provision as the equivalent of sale."
• Insufficient attention is given to the use and interpretation of indicators. "It has frequently been assumed that indicators will simply `speak for themselves.'" (Strand, 1997: 146). Inaccurate, out-of-date, or irrelevant data have been used and dubious comparisons made. Public sector agencies typically vary in their number of objectives, priorities, history, local need, environment, pay-back period and, indeed, definition of performance indicators. Generalising across different organisations therefore needs to be done with care.
• Failure to isolate additionality. Additionality is defined by the Treasury (1988) as: "the amount of output from a policy as compared with what would have occurred without intervention." The aggregate picture presented by indicators has been accepted uncritically without separating out effects which can rightfully be attributed to policy. A specific element of this is the failure to take account of different starting levels and therefore different added value in individual programmes.
• Simplification. Some activities are more amenable to measurement through performance indicators and have therefore been given greater attention. The Audit Commission (1992) recognises that: "the danger with the very simple indicators is that they over-simplify reality, or put an excessive weight on the features of the services that happen to be easy to measure." As already mentioned, measures of economy (cost) have generally won over measures of effectiveness (achievement of objectives).
• Lack of local ownership. Gray (1997) complains that the emphasis on quantitative indicators serves to marginalise community organisations from the planning and implementation of policy. Strand (1997: 146) observes that in education "The suspicion remains that PIs are something which is done to schools rather than for them or with them."
• Neglect of other forms of accountability. The introduction of government-determined performance indicators has tended to increase central accountability at the expense of local accountability, and accountability over outcomes at the expense of accountability over process. "Much of the accountability that can be identified is a post hoc accountability for what has happened. There is little evidence of the instalment of the necessary control systems to ensure initial quality assurance." (Glynn and Murphy, 1996: 132).
Stewart and Walsh (1994: 45) argue that public sector performance measurement is inherently problematic: "The principles that we use to measure performance in the public sector are not so much applied in the judgements that we make of service quality, but are derived from them. These judgements are more like a judicial activity than engineering measurement, being based on criteria that arise from the accumulation of a series of cases. The dilemma of performance management in the public domain therefore is to secure effective performance when the meaning to be given to it can never be completely defined, and the criteria by which it is judged can never be finally established."
2.3.3. Organisational implications of performance indicators
The adverse impact of evaluation on an organisation has been recognised for some time. "The more any social indicator is used for social decision making, the greater the corruption pressures upon it." (Campbell, 1979: 85). Gray (1997) uses the example of the `production fetishism' of the Soviet planning system to illustrate the `gaming' effects of output monitoring. Output measures distort organisational activity by valuing some behaviour over others. Five problems result:
• `Creaming' is encouraged. In Russia, production targets encouraged over-production of poor quality goods. For example, when targets for glass production were defined in tons, the factory had an incentive to produce undesirably thick glass, but when the target was changed to square metres, the glass was made too thin. Similarly, the use of output targets has resulted in a form of `target fetishism': `a concern with targets which threatens to become detached from the social purpose of the policies at stake'. In economic development this is reflected in support being concentrated on individuals most likely to generate outputs, rather than those most in need, and focusing on sectors most likely to produce outputs or evidence of outputs. OECD (1988) has found `creaming' to be widespread in training schemes across the world.
• Additionality is neglected. `Wherever the output measure is the number of beneficiaries, there is a risk that the jam will be spread too thinly, in order to increase this number.' (Gray, 1997)
• Innovation is discouraged. `Innovation involves risk and risk is not rewarded.' (Soviet quote in Gray, 1997).
• Short-termism is encouraged. Projects with a quick payoff will tend to be preferred in a system which needs to prove the value of the current year's expenditure by achievements before year-end.
• Over-counting or double counting is rewarded. Programmes often overlap in their area of operation. Failure to isolate the specific contribution of individual agencies can lead to haphazard attribution of outputs and double-counting.
A. Jackson / Progress in Planning 55 (2001) 1–64
These problems can be contained through precise definition of performance indicators but this demands a specification of public sector performance which might not be possible at the start of the programme.
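The over-counting problem described above has a simple arithmetic core, which can be sketched as follows (a minimal illustration; the agency names and firm identifiers are invented, and a shared beneficiary identifier is assumed to be available):

```python
# Hypothetical beneficiary lists reported by three overlapping agencies.
reported = {
    "agency_a": {"firm_01", "firm_02", "firm_03"},
    "agency_b": {"firm_02", "firm_03", "firm_04"},
    "agency_c": {"firm_03", "firm_05"},
}

# Naive aggregation: summing each agency's own count rewards double-counting,
# since firms assisted by more than one agency are counted more than once.
naive_total = sum(len(firms) for firms in reported.values())

# De-duplicated aggregation: count each beneficiary once across the programme.
unique_total = len(set.union(*reported.values()))

print(naive_total, unique_total)  # 8 reported outputs, only 5 distinct firms
```

The gap between the two totals is exactly the haphazard attribution the text describes: without a common identifier across agencies, only the naive total can be computed.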
2.3.4. Improving the use of performance indicators
Likierman (1993) used results from a survey of 500 managers in the public sector to devise twenty lessons for the use of performance indicators (PIs). His main recommendations were:
• Include all elements of performance. Those devising measures should ensure they do not omit and therefore undervalue an essential activity of work.
• The number of PIs should be appropriate to the organisation and its diversity. “Too many will make it difficult to focus on what is important, too few will distort action.”
• Provide adequate safeguards for ‘soft’ indicators, particularly quality. “Quality has proved notably difficult for organisations to measure, and great care needs to be taken to give it proper weight.”
• Acknowledge political and organisational purposes. Some measures might be included to enhance credibility of a service rather than to deepen understanding.
• Build in counters for short-term focus. Indicators set on an annual cycle are likely to focus management effort in the short term.
• Ensure PIs fairly reflect the efforts of managers. Performance indicators should not lie outside of the control of the organisation under evaluation.
• Link PIs to existing management systems. New indicators which run in parallel with existing systems can appear onerous or irrelevant.
• Ensure PIs are understandable to those whose performance is being measured. Performance indicators need to be trusted and understood if they are to have an impact on action.
• The data on which results are based must be trusted. Likierman found considerable concern that staff did not appreciate the importance of filling in the forms carefully and were therefore producing data which was unreliable.
• Use the results as guidance, not answers. Recognise that interpretation is the key to action. “The results should always be accompanied by a commentary so that the figures can be analysed and put in context.”
• Recognise trade-offs and complex interactions between different elements of performance. Not all PIs should carry equal weight.
There are three main conclusions from this analysis of performance management. First, that the quality of data collection for performance indicators should be improved. This is likely to mean focusing on a small number of important indicators each representing different dimensions of strategic interest. Second, that comparisons should be made carefully, and possibly within the framework of a classification which groups together similar organisations or operating circumstances. Third, that performance indicators should be analysed over time as part of a broad learning process which pulls in other (especially qualitative and non-financial) information. Performance indicators should be used to
identify issues for further investigation rather than to give an immediate judgement about
success or failure.
This third point is the most far-reaching. It is clear that many authors want performance
management to move from a focus on accountability towards development and knowledge
functions:
• “The focus is on systematic thinking, fundamental structural change and organisational learning, rather than mindless target-setting, continual fire-fighting, and the rigorous allocation of blame” (Meekings, 1995: 5).
• “When located within a strategic-management perspective the performance measurement of public services is seen to be necessary but not sufficient for improved management practice; and a means of learning rather than a means of control” (Jackson, 1993: 14).
• “Rather than ‘dials on the dashboard of a car’, performance indicators are most helpful when viewed as ‘tin-openers’, leading to further examination and enquiry” (Carter et al., 1992).
• “Comparisons are needed, not to provide definitive answers, but to highlight issues which need to be debated” (Audit Commission introduction to the first national publication of council performance indicators, 1995).
• “Over-reliance on the ‘control’ model of evaluation may create unnecessary defensiveness on the part of the agency or department concerned. Arguably, evaluation should represent a learning process for all concerned rather than just a one-off judgement on the efficacy of any scheme.” (McEldowney, 1997).
Van der Knaap (1995) shows how evaluation can contribute to organisational learning. He breaks learning into three separate components, each of which benefits from evaluation:
• Corrective learning. Evaluation can provide feedback showing whether implementation is in line with plans (what Argyris and Schön, 1978 call single loop learning).
• Cognitive development through a process of refining schemata. Evaluation can provide stimulus and inspiration (double loop learning, or what Weiss calls the ‘enlightenment function’ or ‘conceptual use’).
• Social learning through dialogue and debate. Evaluation can provide a forum for communication and the development of alliances.
This contribution to organisational learning will not be easily achieved, however. Literature on learning organisations illustrates the kinds of changes to organisational culture and style which might be needed when introducing a developmental approach. Senge (1990) emphasises that learning organisations must be able to tolerate failure (‘experimental mind set’); and be prepared to expose and change their fundamental assumptions (‘metanoia’). These comments echo earlier work from Zuboff (1988) about the need to use information for understanding rather than reward or punishment, to empower not disempower people (‘informating’). Pedlar et al. (1991) proclaim that, in the learning organisation: “departmental and other boundaries are seen as temporary structures that can flex in response to changes in the environment” (‘enabling structures’); ‘company policies reflect the values of all stakeholders not just those of top management’ (‘participative policy-making’); and strategies should be seen as “conscious experiments rather than set solutions” (‘learning approach to strategy’). Nevis et al. add instructions for the learning organisation to be continuously monitoring the external environment (‘scanning imperative’).
2.4. Valuing
2.4.1. Introduction
Evaluation should not only be true; it should also be just (House, 1978: 76)
This third, relatively neglected, facet of evaluation concerns the choice of criteria and
the values on which this choice is based.
There are two broad ways of dealing with values in evaluation. The descriptive
approach presents the values of the participants in a programme without elevating one
set of values over others. The prescriptive approach endorses particular values derived
either from one of the programme stakeholders, or from an abstract ethical system like
utilitarianism or social justice (Shadish et al., 1995).
Variants of these two are possible. For instance, Karlsson (1996: 410) advocates negotiating a consensus on values across the stakeholders. Shadish et al. (1995) suggest constructing alternative value summaries: “If X is important to you, then Y is good for the following reasons.” Other authors advise evaluators to use their own personal values but to be open about the basis for these.
“The difficulty of integrating together different measures is that values are at stake in the weights to be given and values, and the weights to be given to them, can always be subject to discourse and dispute in the public domain” (Stewart and Walsh, 1994: 48). As this quotation illustrates, valuing presents two fundamental problems for the evaluator. First, the evaluator needs to find a framework for selecting and justifying the value judgements that he/she imposes on the information collected. Second, the evaluator needs to aggregate or weight individual judgements about the different elements of performance.
2.4.2. Comparison
Judgements of value implicitly rely on some kind of comparison: actual against alternative actual (such as a comparable programme or version of the programme), ‘with’ against ‘without’, ‘before’ against ‘after’ (both aimed at separating programme and non-programme experiences), actual against expected, or actual against ideal (Berk and Rossi, 1990). These comparisons can be extremely powerful but are not easy to make. There are four potential problems. First, presentation of an ideal or counterfactual case relies on judgement. The evaluator either has to construct a hypothetical case or to distil out extraneous factors. Second, the comparison can be difficult to trace precisely. The desire for precision tends to lead to a focus on quantitative (and often financial) rather than qualitative information. Third, variation in performance indicators can reflect changes or differences in definition rather than changes in performance. Fourth, drawing attention to values in this manner can weaken the evaluation's credibility with stakeholders who hold different values.
The traditional way around this problem is to evaluate programmes in terms of their objectives. However, as argued above, programme objectives are usually too vague to provide clear prescriptions. “It is often hard to formulate questions based on ambiguous program goals. The program can then reject evaluation findings by saying the evaluation measured something the program was not trying to do” (Shadish et al., 1995: 184, commenting on Weiss).
Pawson and Tilley (1997) argue that the most powerful and practical comparisons are between different versions of the same programme. A variation on this is benchmarking. Benchmarking derives performance standards from comparison with processes in relevant alternative programmes or projects. The advantages of using benchmarking to structure valuing are that data might be seen as more objective and fair because they are derived from real-world examples; contact between the two partners who are benchmarking against each other can give practical insight into possible improvements; and competition between the two organisations can increase motivation (Hill, 1995). The point is that all of this is achieved without drawing attention to values.
2.4.3. Weighting findings
Valuing relies on being able to weight different variables of the knowledge component. As Scriven (1996: 397) observes: “Everyone doing practical evaluation knows that before one starts writing the Conclusions section, one has data and judgements on a number of highly independent dimensions of program quality. How does one pull these together? Presumably through using some kind of weighting. Where do the weights come from?” In the absence of a carefully constructed synthesis, evaluation reports suffer from what Scriven terms ‘the Rorschach effect’, that is, the audience is presented with a random scatter of inkblots from which to construct patterns.
Cost benefit analysis provides the most formal approach to weighting, and illustrates the problems of imposing this kind of structure on different knowledge categories. Cost benefit analysis has been criticised for being unable to deal adequately with uncertainty, interdependencies, distributional effects, intangibles, and costs to the user (Mishan, 1971). Even without the over-simplification of reducing all costs and benefits to financial values, weight and sum methods suffer from assumptions of linearity and additivity which can weaken their relevance to the practical world of programme implementation.
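The linearity and additivity assumptions can be seen in a minimal weight-and-sum sketch (the indicator names, scores, and weights here are invented for illustration): because the aggregation is fully compensatory, a scheme that fails badly on one dimension can still outscore a balanced scheme, hiding exactly the trade-off an evaluator may need to report.

```python
# Hypothetical indicator scores (0-1) for two schemes, with invented weights.
weights = {"job_creation": 0.4, "leverage": 0.3, "default_rate": 0.3}

scheme_a = {"job_creation": 0.9, "leverage": 0.9, "default_rate": 0.1}  # fails on defaults
scheme_b = {"job_creation": 0.6, "leverage": 0.6, "default_rate": 0.6}  # balanced

def weighted_sum(scores):
    """Linear, additive aggregation: no interactions, full compensation."""
    return sum(weights[k] * scores[k] for k in weights)

# Both schemes collapse to similar single numbers, masking the trade-off:
# the serious weakness of scheme A on defaults is fully offset by its strengths.
print(round(weighted_sum(scheme_a), 2))  # 0.66
print(round(weighted_sum(scheme_b), 2))  # 0.6
```

Any non-linear concern, such as a default rate that is unacceptable below some threshold regardless of other performance, is invisible to this kind of synthesis, which is the practical force of the criticism above.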
Pawson and Tilley (1997) question the starting point for a black box approach to valuing. They argue that attempting to aggregate positive and negative effects inevitably produces inconclusive results across a whole programme. That inconclusive results are common is underlined in Rossi's (1985) pessimistic Iron Law of Evaluation, which states that: “The expected value of any net impact of any social program is zero.” Realistic Evaluation moves away from global statements about the success of programmes to focus on practical guidelines about the relevance of different approaches, styles, and mechanisms.
2.5. Use
2.5.1. Introduction
The literature sets out many different ways in which evaluation achieves its influence over appropriate policy decisions. Each of these should be seen as operating on a scale along which greater or lesser influence can occur. Instrumental use occurs where findings have a direct effect on decision-making; for instance, where recommendations are implemented. Conceptual use, which is also called ‘enlightenment’ (Weiss, 1977) or ‘demystification’ (Berk and Rossi, 1976), is where evaluation leads to changes in how the programme is understood. Persuasion refers to “enlisting evaluation results in efforts either to defend or attack political positions” (Rossi and Freeman, 1985: 388). Political use is designed to legitimise decision-making. Haug (1996: 424) identifies two political uses of evaluation. In ‘the strategic approach’ results are presented to support decisions which have already been made. In ‘the symbolic approach’ evaluation is used to give the impression that the decisions have been thoroughly and rationally considered. Finally, misuse involves the findings being published selectively, or to a select number of audiences. Misuse can be deliberate, for instance, where a programme manager tries to scapegoat one of the implementing agencies, or accidental, for instance where evaluation results are communicated second-hand. Misuse can be more helpfully conceived as referring to the quality of use (Cook et al., 1981).
2.5.2. Increasing use
The literature has identified many potential obstacles to the full and proper use of evaluation findings (Cook, 1978; Shadish et al., 1995: 54). Organisations are resistant to change because of the threat to vested interests. Weiss (1973: 40) points out that: “A considerable amount of ineffectiveness may be tolerated if a program fits well with prevailing values, if it satisfies voters, or if it pays off political debts.” Social programmes typically change slowly because decision making is incremental. Weiss (1980) calls this ‘decision accretion’. Decision makers design programmes for non-instrumental reasons, and in designing programmes they use information other than evaluation reports. Weiss concludes that: “I doubt that we can ever persuade stakeholders to make evaluation results the overriding consideration in program decisions. For one thing, program people know a lot more about their programs than simply the things the evaluator tells them” (Weiss, 1988: 17; Shadish et al., 1995: 221). Third, evaluation findings do not usually filter down to those in the best position to apply them, such as programme operators. Caplan (in Cook, 1978) argues that lack of use reflects a fundamental clash of cultures between the evaluator's ‘knowledge-generating culture’ (interested in truth and validity), and the programme manager's ‘knowledge-utilisation culture’ (interested in pragmatic action). Last, the inconsistent technical quality across evaluation reports affects their credibility and therefore their use.
Theorists have identified several ways of increasing the use of evaluation results (Patton, 1996; Shadish et al., 1995: 55; Georghiou, 1995: 185). These can be related to four principles. First, evaluators can increase the relevance of their work through concentrating on the factors which users can control, and giving priority to questions relevant to pending decisions. Second, evaluators can increase receptivity through identifying users early on, involving users in the evaluation, and providing interim findings. Third, the absorbability of evaluation can be increased through providing recommendations, including non-technical executive summaries, matching summaries to the interests of different stakeholders, and disseminating findings through informal meetings and briefings as well as reports. Fourth, credibility can be increased through using evaluators with appropriate status who are demonstrably fair and competent.
Notwithstanding this, American academics are not optimistic about the use of evaluation:

The early hope was that use would happen with little effort because evaluation results were compelling, because stakeholders eagerly awaited scientific data about programs, or because policy-making was a rational problem-solving endeavour (…) But those hopes were dashed, because evaluation results were seldom compelling relative to the interests and ideologies of stakeholders, stakeholders usually regarded scientific input as minor in decision-making, and problem solving is far from a rational endeavour (Shadish et al., 1995: 54).
2.5.3. Communicating findings
Use of evaluation depends in part on the effective communication of findings. “The proper function of evaluation is to speed up the learning process by communicating what might otherwise be overlooked or wrongly perceived. (…) Success is to be judged by success in communication” (Cronbach, 1982: 8; Torres et al., 1996). Evaluation reports must therefore reflect the different information needs of different layers of the organisation (Tables 1 and 2). Communication needs can also vary between individuals. “Learning begins with individuals. How individuals receive, remember and react to evaluation communications mediates the effectiveness of those communications. The need for presenting information in a variety of modalities is clear” (Torres et al., 1996). The ‘modalities’ mentioned by Torres et al. include visual, aural, interactive, tactile, kinaesthetic or olfactory media. Meeting these demands presents two dilemmas for the evaluator, however. First, the evaluator has to summarise work without taking findings out of context or overlooking methodological limitations and cautions. Second, evaluation reports have to tread a fine line which broadens the readers' understanding while also addressing their existing interests, a balance between leading and following.
Table 1
Attributes of information needed by different decision-makers. Source: Love (1991: 29)

Information attributes    Management function
                          Planning       Operations
Type of question          What if?       What is?
Time horizon              Future         Current
Information sources       External       Internal
Measurement               Qualitative    Quantitative
Level of detail           Aggregate      Individual
Level of analysis         Synthesis      Descriptive
Frequency of reporting    Periodic       Continuous
Scope of reporting        Summary        Detailed
Accuracy of reporting     Approximate    Exact
Mode of reporting         Graphical      Numerical
2.6. Practice
2.6.1. Introduction
This fifth component looks at the problems of carrying out evaluation with limited time and other resources. This includes the organisation of evaluation work, and the role of the evaluator.
2.6.2. Independence
The literature discusses several concepts related to independence.
• Objectivity (Scriven, 1976). Evaluation findings should not be contaminated by the personal preferences or ideology of the evaluator.
• Reproducibility (Cronbach, 1982). Another evaluator using the same methodology should obtain the same results.
• Autonomy (Newman and Brown, 1996). Evaluators should be free from outside influence.
• Empathic neutrality (Patton, 1990). Evaluators should be perceived as sympathetic to programme participants but open about findings.
Keeping some distance from stakeholders can, however, be at odds with the prescriptions for increasing use. Stake asks: “Does the degree of involvement in the field required for expertise almost always create bias?” (Stake, 1995: 395). In the Evaluation Thesaurus, Scriven (1991) directly addresses this dilemma when he argues that using evaluators with some level of bias is better than employing ‘ignoramuses or cowards’.
Another source of bias is of major importance in evaluation: political pressure. Weiss (1973: 179) argues that: “Many of the problems that still bedevil the evaluation enterprise are not so much failures of research expertise as they are failures to cope adequately with the political environment in which evaluation takes place.” Berk and Rossi (1990: 14) point out that evaluation is inherently contentious. “In almost all program issues, stakeholders may be aligned on opposing sides, some favouring the program and some opposing. And whatever the outcome of the evaluation may be, there are usually some who are pleased and some who are disappointed: it is usually impossible to please everyone.”
Political pressures can affect each stage of the evaluation (Brickell, 1978; Berk and Rossi, 1976; Palumbo, 1989; Hedrick, 1988; Karlsson, 1996). The purchaser of the evaluation or other stakeholders might phrase the objectives to highlight particular aspects of the programme, for instance, more favourable or less contentious aspects. Radical interventions, which would not be politically acceptable, might be excluded from consideration. The evaluator might be set an unrealistic time frame so as to prevent wide consultation during the research work. Political actors might place pressure on evaluators to distort study findings. Political actors might use the evaluation results selectively, or try to suppress their publication. Political manoeuvring can include using evaluation research as a delaying tactic. More positively, political conflict can increase the interest in an evaluation report and produce greater pressure to implement recommendations.
2.7. Ensuring co-operation from programme staff
The evaluator often relies on data from the programme itself. “The basic data used in many evaluations comes from the official records of the program being evaluated. While this usage is cost-efficient, it introduces politics into the meaning given to measures” (Kelly in Palumbo, 1989: 273). Co-operation of staff is vital where evaluation data is drawn from the programme, but it is not always forthcoming. A study by the author (Jackson, 1996) on ‘foyers’, a type of hostel providing accommodation and training for homeless young people, found that project managers had three objections to evaluation. First, they disagreed with national objectives for their projects. They thought that intangible benefits to participants were more important than job creation outputs. Second, some programme managers opposed any classification or labelling of participants. Third, there was a general feeling that projects should be trusted rather than asked to account for their actions through monitoring. Patton (1996: 29) lists other fears about evaluation which can affect co-operation. “Barriers typically include fear of being judged, cynicism about whether anything can really change, scepticism about the worth of evaluation, concern about the time and money costs of evaluation, and frustration from previous bad evaluation experiences, especially lack of use.”
Performance management work within the field of human resource management provides insight into the response of programme staff. Murphy and Cleveland (1995: 101) argue that performance appraisal systems reinforce power by clarifying lines of authority. A more radical view, from Newton and Findlay (1996: 50), is that appraisal undermines group solidarity by increasing individual competition. Foucault uses Bentham's image of a ‘panopticon’ (a prison where inmates can be observed unseen from a central tower) to argue that performance management ‘sequestrates’ subjects, isolating them in time and ‘space’ (Townley, 1994). Overall, the increased control from evaluation can weaken or demoralise those subject to evaluation.
Many theorists have argued that power needs re-balancing through giving a greater role to stakeholders. Variants on this view now include: ‘responsive evaluation’ (Stake, 1975), ‘democratic evaluation’ (MacDonald, 1976), ‘naturalistic evaluation’ (House, 1980), ‘participatory evaluation’ (Choudhary and Tandon, 1988), ‘Fourth Generation Evaluation’ (Guba and Lincoln, 1989), ‘empowerment evaluation’ (Fetterman et al., 1996), and ‘utilisation-focused evaluation’ (Patton, 1996). These theories advocate, in varying degrees: relativism and pluralism; qualitative methodologies including participation; and a view of evaluation as negotiation rather than analysis.
Guba and Lincoln (1989: 11, 119) argue that ‘a means of carrying out an evaluation must be found that recognises the constructed nature of findings, that takes different values and different contexts (physical, psychological, social and cultural) into account, that empowers and enfranchises, that fuses the act of evaluation and its follow-up activities into one indistinguishable whole, and that is fully participative in that it extends both political and conceptual parity to all stakeholders’. They make two main criticisms of positivism. First, that its desire for objectivity often leads to unethical treatment of experimental subjects. Second, that by disregarding political factors it most often serves to reinforce existing power relations.
Fourth Generation Evaluation is itself not without criticism. Pawson (1996: 216) argues that by seeing all constructions of reality as equal, what he calls the ‘militant agnosticism on truth’, Fourth Generation Evaluation reduces the role for evaluation in advancing programme knowledge. Evaluators risk losing much of their credibility and status if they give up the claim to being objective (House, 1993: 30). Accountability, one of the three main functions of evaluation, relies on evaluators maintaining their distance so as to be seen as independent (Van der Knaap, 1995: 206). More seriously, given the radical agenda proposed by Guba and Lincoln, Fourth Generation Evaluation has been criticised because it ‘fails to deal, in any meaningful way, with the concept of relative power, or more specifically the unequal distribution of discursive power’ (VanderPlaat, 1995: 95). Power is not the evaluator's to give away. Stakeholder approaches can give participants unrealistic expectations of the evaluation (Rebien, 1996). Conceptual relativism leads into a moral relativism which reinforces rather than challenges existing power relations. “Absolute relativism creates an unpredictable paradox: on the one hand, it promises to free the individual from ‘official versions’ of their lives; on the other hand, it grants licence to arbitrary powers to dismiss popular complaint and dissent.” (Kushner, 1996: 196).
The point is that evaluators cannot change the political role and meaning of their work
without the compliance of those who commission and use evaluation.
2.8. Conclusion
Evaluation has five components. Within financial and resource constraints (Practice) it uses information (Knowledge Construction) to make judgements (Valuing) on programmes and policy problems (Socio-economic Programming) with the aim of informing decision-making (Use). Each of these components, and the conflicting pressures of pursuing all five simultaneously, raise problems for the evaluator.
The first section, on Socio-economic Programming, looked at the problems of placing evaluation within an explanatory framework dealing with the character of the programme and the policy it is intended to address. This area of evaluation has been neglected because evaluators have focused on methodology and taken a black-box approach concerned with impact. Authors such as Chen (1990), and Pawson and Tilley (1997) argue that evaluation should move away from broad questions about whether the programme produced the impact intended, towards more sensitive contextual analysis looking at where and why programmes are most effective. The ‘where’ would look at environmental factors, and the types and motivations of participants. The ‘why’ demands probing of programme mechanisms.
Lack of theoretical thinking in evaluation has led to two problems. First, policy managers have given insufficient attention to what they wish to achieve. Second, they have given insufficient attention to how this effect will be achieved, how their proposed policy would deliver this effect. These two problems can be rephrased in the form of hypotheses.
Hypothesis 1. Objectives will be ambiguous.
Sub-hypothesis 1a. The form this ambiguity will take is that objectives will be global, diffuse, and distant from programme functioning.
Hypothesis 2. The assumptions underpinning scheme operation will be unstated.
Sub-hypothesis 2a. The form this understatement will take will be that stakeholders will differ in their assumptions about the way the programme works (programme theory).
The second section, on Knowledge Construction, looked at the information which is fed into evaluation. In Britain, this is mainly in the form of performance indicators, so the analysis concentrated on the problems these present for measurement, interpretation, and use. Performance indicators employ quantitative measures to represent complex interactions across different projects and programmes. This representation process inevitably leads to some level of simplification, de-contextualisation, and spurious precision. That these indicators can then affect the power, resources, and status of individuals and organisations leads to pressure to distort results towards more favourable depictions. The hypotheses from this are:
Hypothesis 3. Performance indicators will present measurement problems.
Sub-hypothesis 3a. The measurement problems presented will be those of subjectivity, simplification, and partiality.
Hypothesis 4. Performance indicators will present organisational problems.
Sub-hypothesis 4a. The organisational problems presented will be those of ‘creaming’, short-termism, and demoralisation.
The third section examined the Valuing component of evaluation. Judgements of worth rely on value statements, but it is not obvious where these value statements should come from. The most obvious source, the stated objectives of the programme, is not easily used because they are often vague or inconsistent. External frameworks like ethical theories have intellectual rigour but might lack credibility and immediacy to programme stakeholders. Evaluators face two problems: building conclusions across a portfolio of individual findings and judgements; and rating the programme against standards or comparables. Methods such as benchmarking provide a vivid and practical way of judging programmes without drawing attention to values. The problems become these hypotheses:
Hypothesis 5. Results from evaluation will support different judgements according to the values applied.
Sub-hypothesis 5a. Differences in judgements will follow the lines between stakeholder groups.
Hypothesis 6. Evaluation will produce positive and negative readings on different aspects of the programme.
Sub-hypothesis 6a. Evaluation that seeks to summarise its findings into a single aggregate answer will be inconclusive.
The fourth section, on Use, examines the way evaluation findings feed into decision-making. Research in the United States has consistently revealed that evaluation does not have a major effect on programmes. Several possible explanations for this weak link into use are explored: that policy is the product of political as well as technical factors; that programmes change slowly because of resistance from vested interests; that evaluation reports fail to speak directly to the practical concerns of policy makers and implementers; and that some evaluation reports have technical flaws which justify managers' disregard for their findings.
A. Jackson / Progress in Planning 55 (2001) 1-64
The hypotheses from this are:
Hypothesis 7. Evaluation findings will not be used.
Sub-hypothesis 7a. Enlightenment use will be more common than instrumental use.
Hypothesis 8. Communication of evaluation findings will be problematic.
Sub-hypothesis 8a. Over-simplification increases the risk of mis-use. Over-complexity obscures the meaning.
The final section examined the practical process of carrying out the evaluation. Two tensions were revealed. Resource constraints mean that evaluators usually expect to find data on the programme already available. Obtaining this data, and the information which provides context for its interpretation, makes evaluators dependent on programme staff. However, programme staff have good reasons not to co-operate. These reasons include the effort involved, the damage to their position from negative findings, and the power relations implied by this level of external control. Methods which give stakeholders more control over evaluation, so as to encourage their participation, can threaten evaluators' objectivity and independence. A particular example of this is where stakeholders exert political pressure on evaluators to change the direction of their investigation or its conclusions.
The hypotheses from this are as follows.
Hypothesis 9. Data for evaluation will be difficult to compile.
Sub-hypothesis 9a. Compilation difficulties will include non-disclosure as well as lack of availability.
Hypothesis 10. Evaluators will be pressured to change findings to fit the decision-maker's perspective.
Sub-hypothesis 10a. The form this pressure will take will be to accentuate positive findings.
The next chapter tests these 10 hypotheses in an evaluation of public sector small business
loan and grant schemes.
Table 2
Components of programme evaluation

1. Socio-economic programming
Character: assumptions about the nature of social (and presumably economic) problems and the mechanisms of public sector programmes.
Hypothesis 1: objectives will be ambiguous. Sub-hypothesis 1a: the form this ambiguity will take is that objectives will be global, diffuse, and distant from programme functioning.
Hypothesis 2: the assumptions underpinning scheme operation will be unstated. Sub-hypothesis 2a: the form this understatement will take will be that stakeholders will differ in their assumptions about the way the programme works (programme theory).

2. Knowledge construction
Character: assumptions evaluators adopt about the nature of reality, the origins and limits of knowledge, and choices of methodology.
Hypothesis 3: performance indicators will present measurement problems. Sub-hypothesis 3a: the measurement problems presented will be those of subjectivity, simplification, and partiality.
Hypothesis 4: performance indicators will present organisational problems. Sub-hypothesis 4a: the organisational problems presented will be those of 'creaming', short-termism, and demoralisation.

3. Valuing
Character: the way values are attached to programme descriptions.
Hypothesis 5: results from evaluation will support different judgements according to the values applied. Sub-hypothesis 5a: differences in judgements will follow the lines between stakeholder groups.
Hypothesis 6: evaluation will produce positive and negative readings on different aspects of the programme. Sub-hypothesis 6a: evaluation that seeks to summarise its findings into a single aggregate answer will be inconclusive.

4. Use
Character: the way information feeds back into decision-making and action.
Hypothesis 7: evaluation findings will not be used. Sub-hypothesis 7a: enlightenment use will be more common than instrumental use.
Hypothesis 8: communication of evaluation findings will be problematic. Sub-hypothesis 8a: over-simplification increases the risk of mis-use. Over-complexity obscures the meaning.

5. Evaluation practice
Character: the way evaluators select methods and approaches to match the circumstances and resources available.
Hypothesis 9: data for evaluation will be difficult to compile. Sub-hypothesis 9a: compilation difficulties will include non-disclosure as well as lack of availability.
Hypothesis 10: evaluators will be pressured to change findings to fit the decision-maker's perspective. Sub-hypothesis 10a: the form this pressure will take will be to accentuate positive findings.
CHAPTER 3
Empirical work
3.1. Methodology
The empirical work for this project has two elements. A mapping exercise investigated and described the main public sector finance schemes for small businesses in the Hackney area. Four of these schemes were selected for case study. Research comprised interviews with the scheme managers and funders, compilation of data on each of the loan holders, interviews with 24 loan holders across two of the schemes, and analysis of written material. In one of the four schemes, where data on loan holders were not available, details were compiled on 150 applicants by inspecting business plans, decision-panel minutes, and monitoring reports. The written material reviewed included publicity brochures, application forms, progress reports, and paper records of management systems such as accounting, credit control, and business health checks.
The case study method was used because it is suited to detailed analysis of specific instances ('bounded systems', Smith in Stake, 1995), using a mix of quantitative and qualitative research tools, which allow real outcomes to be examined in context. Stake (1978, 1981, 1995) strongly advocates case study methods for evaluation because their use of different kinds of information to 'triangulate' on the research questions strengthens their validity. The knowledge from case studies 'is different in that it is more concrete, more contextual, more subject to reader interpretations, and based more on reference populations determined by the reader' (Stake, 1981: 36). Stake (1995) identifies two types of case study research: 'intrinsic' and 'instrumental'. The former refers to circumstances where the case is inherently interesting. The latter, which applies here, is where the case is intended to provide insight into a wider problem or issue.
The disadvantages of a case study methodology are the difficulties in generalising to the wider population, and the danger of interviewer bias. Follow-on research using more quantitative methods is needed to test the conclusions from this study across a wider range of cases and instances.
3.2. Background from the mapping exercise
This project examines loan and grant schemes in the Hackney area of London. The mapping exercise found 27 finance schemes active in and around the Borough. Five are grants schemes, twelve are loan schemes, six are equity, and four are other types (loans and grants, credit unions, business angel introduction schemes, and loan guarantee schemes). This is in addition to commercial schemes such as national venture capital funds.
The pattern of supply varies across the different financial instruments. Grant schemes tend to have lower allocated amounts (£1000-£25,000, with an average payout of £6250 across the schemes), and to operate in smaller areas. The five grant schemes identified are all relatively recent (established since 1993). All have reached or are approaching exhaustion. Two of the five are managed by Business Link, and three by the Council or City Challenge.
Loan schemes tend to have a higher upper margin (up to £50,000), although there are a number of small, specialist loan schemes in the same niche as grants. These smaller loan schemes bring the average payout to the same level as grants (£6090). Loan schemes have a longer history: seven of the twelve date back to the 1980s. All of the funds are active, providing in total some £1.2 million (December 1996 figures). Eleven of the twelve loan schemes are run by enterprise agencies, and one by an enterprise board (Greater London Enterprise).
Equity schemes have a higher range (in principle £150,000 but in practice over £200,000). All the schemes were established relatively recently (over the last six years). The size of the area covered (London or wider), and the amount of money available (over £1 million, £15.4 million across the six funds), are consistent with the more risky nature of this financial activity.
The four loan funds chosen for case study share the following features:
• Location within one area of London.
• Small size.
• Operation by enterprise agencies.
• Organisation on contract from Government agencies (Task Forces or City Challenges).
• Relatively recent establishment.
The analysis above showed that these features are characteristic of loan funds across the study area. These similarities act to control for some of the variation in small business loan and grant schemes which, in turn, limits the scope for generalisation of findings.
Descriptive data and information on performance indicators for the four case study loan and grant schemes is given in Table 3.
• All schemes were established in the last six years.
• Two operate in the smaller size range, up to £5000 or £10,000; two operate up to £25,000.
• The two larger schemes are managed by agencies with several other funds.
• The level of deployment is similar for three of the four schemes, but significantly higher for one.
• Both Schemes 1 and 4 have particularly high enquiry rates.
• Both Schemes 1 and 4 have relatively low approval rates.
• There is a stark difference between the default rates for Schemes 1 and 4, versus Schemes 2 and 3.
• Scheme 1 has the lowest estimated unit cost of jobs (excluding management costs).
• Targeting is good across the four schemes.
The rest of this project will test the 10 hypotheses from Chapter 2, using empirical work on
these four loan and grant schemes.
3.3. Socio-economic programming
The proposed hypotheses are:
Hypothesis 1. Objectives will be ambiguous.
Sub-hypothesis 1a. The form this ambiguity will take is that objectives will be global, diffuse, and distant from programme functioning.
Hypothesis 2. The assumptions underpinning scheme operation will be unstated.
Sub-hypothesis 2a. The form this understatement will take will be that stakeholders will differ in their assumptions about the way the programme works (programme theory).
Table 3
Summary table of loan and grant schemes
(Notes: (1) Leverage figures are not comparable for reasons explained below. (2) Approval rate is not the same as the percentage of enterprise agency recommendations accepted; all panels include applications not recommended by the enterprise agency. (3) Job creation for Scheme 1 is estimated by comparing predicted amounts with current survival rates. (4) Data compiled in summer 1997.)

                                       Scheme 1             Scheme 2         Scheme 3         Scheme 4
Source of funds                        Task force, city     City challenge   City challenge   Task force
                                       challenge & council
Character of fund manager              Business advisor     Ex-bank manager  Ex-bank manager  Business advisor
Date of establishment                  1992                 September 1995   March 1996       October 1993
Size of loan                           Up to £5000          £5000-£25,000    Up to £25,000    £1000-£10,000
Number of funds managed                2                    5                5                1
Total deployment                       £257,500             £277,500         £589,075         £263,860
Average deployment per year            £51,000              £200,000         £580,000         £75,000
Average loan                           £3960                £9900            £21,800          £6430
Number of loans                        65                   28               27               41
Term                                   3                    8                8                3
Approval rate                          46%                  20%              78%              30%
Job creation                           119                  111              109              61
Average cost per job (deployment)      £2160                £2500            £5400            £4300
Average cost per job (deployment       £3325                £3055            £5725            £5800
plus management costs)
Leverage                               1:2.9                1:3              1:2.5            Not recorded
% of loans behind with payment         66%                  7%               3%               63%
Ratio of enquiries to loans            15:1                 12:1             9:1              20:1
Ratio of applications to loans         2:1                  6:1              1.5:1            3.5:1
% of loan recipients who are women     22%                  NA               20%              19%
% of loan recipients who are not       76%                  71%              70%              87%
white British
% of loan recipients who are           54%                  30%              88%
start-up businesses
Each of the 27 finance schemes in the mapping exercise was asked to state their objectives. Typical objectives included:
To encourage enterprise in the area or showing clear benefits to the area. To support viable start ups and existing firms with less than 25 employees, who provide detailed business plans.
To encourage start ups and business growth.
To help new and developing businesses in the London area.
To help established firms which have established a basis for success and can grow and employ people. Firms with a turnover of £1 million to £20 million.
Written documents from the case study schemes show objectives phrased in similarly general terms:
The fund was established 'to provide low cost loan finance to local and incoming businesses to aid business development. It was intended as a tool for strengthening the local business economy and hence aid the urban regeneration of the area' (wording from an internal document reviewing future options for a fund).
The aim of the fund is to generate economic activity in the area, assisting with the creation of employment through the encouragement of new businesses and the development of existing ones who are experiencing difficulty in raising finance from traditional commercial sources (brochure for a fund).
The statements of objectives listed above suggest failure at a deeper level: lack of attention to the theory underlying intervention. There was no evidence in the documentation of discussion about the mechanisms through which loan and grant schemes achieve their effect.
Combining the review of written material with the interviews with fund managers leads to the following conclusions for these loan schemes:
• Objectives are very general, usually assuming an effect in terms of stimulating business activity or physical regeneration.
• Funding criteria are frequently given in place of strategic objectives.
• Mechanisms to achieve economic aims are not stated.
• The positioning of loan and grant schemes in the overall market for small business finance is not given.
3.4. Knowledge construction
The two hypotheses here are:
Hypothesis 3. Performance indicators will present measurement problems.
Sub-hypothesis 3a. The measurement problems presented will be those of subjectivity, simplification, and partiality.
Hypothesis 4. Performance indicators will present organisational problems.
Sub-hypothesis 4a. The organisational problems presented will be those of 'creaming', short-termism, and demoralisation.
Performance indicators are the main way in which funders evaluate the ongoing performance of these four case studies. The reliance on performance indicators might be higher than for other economic development schemes because of the particular circumstances of the case studies. The schemes are each managed by an implementing agency at arm's length from the funders. The funders lack expert knowledge about running finance schemes. The culture of the funders is to value quantitative information; this is in part because of their staffing by civil service secondees, and in part because of their own requirement to meet quantitative targets.
The choice of performance indicators varied across the case studies (Table 4), although the core was composed of:
• Deployment of funds (throughput).
• Job creation.
• Satisfaction of lender of last resort criteria.
• Amount of leverage.
• Ethnicity of recipients.
• Default rate/write-offs.
• Enquiry levels.
In addition, informal use was made of two other indicators: number of business plans prepared, and additionality. These were also looked at in the research.
3.4.1. Deployment
Throughput, or the ability to spend the funds allocated, is taken as a major indicator of
operator performance.
The research found variation in the exact definition of this indicator. Scheme 2, which used the narrowest definition, only included funds that had been drawn down. The other three schemes used figures that sometimes included money allocated and not subsequently taken up, and money that had been accepted but not drawn down.
Table 4
Use of performance indicators

                        Scheme 1  Scheme 2  Scheme 3  Scheme 4
Throughput                 ✓         ✓         ✓         ✓
Job creation               ✓         ✓         ✓         ✓
Leverage                   ✓         ✓         ✓
Lender of last resort      ✓         ✓         ✓
Ethnicity                  ✓         ✓         ✓         ✓
Default rate               ✓         ✓         ✓         ✓
Enquiry levels             ✓                             ✓
Deployment varied across the four schemes for reasons which did not always seem to be under the control of the fund manager. Three types of factor constrain spending by the fund manager:
• Lending perimeters. Scheme 1, with the lowest deployment, also had the lowest lending range (up to £5000). The number of loans given by this scheme (65) was the highest of the four.
• Decision-making. In each case, lending decisions were made by panels composed of representatives of the funders and professional advisors such as bank managers or accountants. The scheme with the highest deployment rate (Scheme 3) also had the highest approval rate (78%, compared with 30-46%).
• Demand. The amount of lending achieved is affected by the level of demand feeding into the system, which is itself affected by the size of the eligible area, the economic climate, the availability of alternative sources of funds, and the attitudes of potential recipients towards external finance. This process can be conceptualised as a filtering mechanism.
Low deployment can reflect failure at any point of this filtering mechanism (Fig. 3):
• Not all financial needs will translate into demand for loans. Potential applicants might prefer to rely on internal funds, bank finance, or the hope of grants.
• Not all demand will translate into enquiries. The level of enquiries will depend on the marketing of the loan scheme, and therefore the level of awareness, as well as the perceived chance of success.
• Not all enquiries will translate into applications. The level of applications will depend on the ability to find matching funds (and meet other conditions), ease of applying, and motivation to continue.
• Not all applications will be approved. The level of approvals will depend on the character and role of the decision-making panel.
• Not all approved loans will be accepted. Loans might not be taken up if approval comes too late or is subject to conditions which the business finds off-putting.
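The filtering mechanism can be expressed as a simple funnel calculation. The sketch below is purely illustrative: the starting pool, conversion rates, and average loan size are all hypothetical, not drawn from the case studies.

```python
# Filtering mechanism as a funnel: each stage converts only a fraction
# of the previous one, so low deployment can arise at any single stage.
# All starting figures and conversion rates here are hypothetical.
stages = [
    ("enquiries", 0.40),        # marketing, awareness, perceived chance of success
    ("applications", 0.25),     # matching funds, ease of applying, motivation
    ("approvals", 0.46),        # character and role of the decision-making panel
    ("loans drawn down", 0.85), # timing and conditions attached to approval
]

count = 1000  # hypothetical pool of financial needs that become loan demand
funnel = {"demand": count}
for stage, rate in stages:
    count = round(count * rate)
    funnel[stage] = count

average_loan = 5000  # hypothetical
deployment = funnel["loans drawn down"] * average_loan
```

On these assumed rates, 1000 instances of demand yield only 39 loans drawn down, so a deployment figure on its own cannot show which stage of the funnel is failing.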
That deployment is used as a performance indicator without clarification of these contextual factors illustrates the level of over-simplification. Tightening the definition of deployment will solve some but not all of these problems.
Table 5
Average and total deployment of funds

                              Scheme 1   Scheme 2   Scheme 3   Scheme 4
Total deployment              £257,500   £277,500   £589,075   £263,860
Average deployment per year   £51,000    £200,000   £580,000   £70,000
3.4.2. Job-creation
Each of the four cases measures job creation differently. Scheme 1 collects figures for the number of current jobs and the number of proposed jobs anticipated at the time of the funding application. These are treated as 'jobs preserved' and 'new jobs', respectively. Scheme 2 collects figures on the number of employees of loan holders, making no distinction between different grades of staff. Scheme 3 keeps figures for gross job creation (full time jobs, probably official and unofficial) and net job creation (90% of gross job creation). Family members are included. This scheme has a significant number of established clothing firms, which employ seasonal workers, outworkers, and contractors, and create jobs outside the area. None of these is included. Scheme 4 records, for those loan recipients which are still in business, the number of people employed and the number of jobs proposed. These data are compiled from regular business reviews with loan recipients.
These cases show several problems in using job creation as a key performance indicator.
• Jobs vary in their quality. Storey (1990) identifies three elements of job quality: wage rates, job duration, and job allocation (take-up by the unemployed). No case study collected data on any of these variables.
• The number of people employed by loan holders varies widely over seasons, weeks, or even hours. The Hackney labour market is characterised by seasonal and temporary jobs, contract labour, and home workers. There is no consistency in the way firms include such workers in the figures they give for job creation.
• The jobs are not equal in their contribution to the economy. The figures collected by the loan funds include unofficial labour, where people are still claiming benefit, and family members who are not paid, and who therefore do not directly contribute to local spending or multiplier effects. A clearer statement of the objectives for each scheme would be able to show whether these groups should be included or not.
• Job creation exhibits time lags. A finance scheme might have its full effect after the loan period, for instance if a firm expands or survives a recession better than would otherwise have been the case. Only one of the schemes keeps in touch with loan recipients sufficiently to take account of subsequent job creation.
• Attribution of new jobs is problematic. Tying down responsibility for creating jobs is especially difficult where firms have benefited from several programmes of support, and this can lead to double counting. Since all the fund managers studied are enterprise agencies, jobs were probably already counted as an output from business advice.
• Fund criteria affect potential job creation. Differences in the average loan going out and, in two of the four cases, limits on the size of firm which can apply, give an uneven playing field for comparisons. The two funds without a limit on the size of firms which can apply, and those with a higher percentage of established rather than start up firms, are in a better position to count job creation than the others.
• Job creation might operate at the expense of jobs outside the area. None of the four loan funds takes account of displacement of jobs from other areas. Schemes 1 and 2, for which detailed data are available, were attracting applicants from a wide area of London (and beyond in some cases). Start-ups were particularly footloose in their choice of location.
Fig. 3. Throughput model.
The problems with calculating job creation impinge on derived indicators such as cost per job. Cost can be calculated using deployment alone or deployment plus management costs. Management costs can be interpreted from the funders' perspective (financial contribution, as used in Table 3) or from the enterprise agencies' perspective (actual cost to the agency, including allocation of overheads). The jobs included can be those created immediately, or those created over the life of the investment. The latter has to use judgement to set a cut-off point, and decisions on this can reduce consistency between schemes. JURUE (1986) comment that: 'comparisons between the cost per job of revenue and capital projects should not be made since in the latter case the total net capital cost of an asset with a long life is being counted against an assessment of jobs created at one point in time.'
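The sensitivity of cost per job to these choices can be illustrated with the Scheme 1 figures from Table 3. The management-cost figure below is back-calculated from Table 3's £3325 value for illustration; it is not taken directly from scheme records.

```python
# Cost per job on the two bases used in Table 3, for Scheme 1.
deployment = 257_500        # total deployment (Table 3)
jobs = 119                  # job creation (Table 3)
management_costs = 138_175  # illustrative: implied by Table 3's £3325 figure

cost_per_job = deployment / jobs                          # deployment only
cost_per_job_full = (deployment + management_costs) / jobs  # plus management
```

The first basis gives roughly £2164 (reported as £2160), the second £3325; counting jobs over the life of the investment instead of immediately would change the denominator as well, so the same deployment can yield very different unit costs.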
3.4.3. Leverage
There are several ways in which fund money can lever in private sector funds:
• Provision of matching funds from the applicant can be a condition of eligibility.
• Scheme money can be used to reduce the exposure of banks.
• Schemes can present the case of the applicant, so giving the bank more confidence to make a loan.
• A similar but not identical concept is that of 'matching funds', where the applicant must contribute their own money to show commitment. A looser version of commitment includes money paid previously (and therefore not levered by the fund).
Three of the four schemes have targets for leverage which they are expected to reach. However, leverage is calculated differently across the three schemes (Table 6). At its broadest, leverage includes:
• Public money.
• Bank loans.
• Bank overdrafts.
• Applicants' money (including money left in the business by directors taking a letter of postponement).
• Applicants' help in kind.
• Professional advisors' help in kind.
• Previous investments in cash or in kind contributed by the applicant (this is more in keeping with the concept of matching funds).
• National programmes, such as enterprise allowance, which are not discretionary.
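Because schemes count different items, the same funding package produces different leverage ratios. A minimal sketch, with hypothetical amounts (none taken from the case studies):

```python
# Leverage ratio = funds levered in : scheme money. The ratio reported
# depends entirely on which items are counted. All amounts hypothetical.
scheme_loan = 10_000
package = {
    "bank_loan": 15_000,
    "bank_overdraft": 5_000,
    "applicants_money": 6_000,
    "help_in_kind": 3_000,
}

def leverage(items):
    """Ratio of counted external funds to the scheme's own loan."""
    return sum(package[item] for item in items) / scheme_loan

narrow = leverage(["bank_loan"])  # bank loan only: 1.5, i.e. 1:1.5
broad = leverage(package)         # everything counted: 2.9, i.e. 1:2.9
```

The same package reads as 1:1.5 on a narrow definition and 1:2.9 on a broad one, which is why the leverage figures in Table 3 are flagged as not comparable.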
The research findings show six problems with the use of leverage as a performance indicator.
• Different definitions are used. This problem has already been identified in the literature. Gray (1997: 352) complains that: 'The concept of leverage mechanistically lumps together several different relationships between private and public sectors.'
• Some factors affecting leverage are outside the control of the fund manager. For instance, the cancellation of enterprise allowance made leverage more difficult to obtain (for Scheme 1, which is including it). The clearing banks' move towards credit scoring might reduce the scope for schemes to obtain leverage in the future.
• The definition of leverage discourages referral of commercial cases to banks. Leverage is only calculated for money drawn in through use of the scheme's funds. This fails to reward the time schemes spend helping applicants who do not then apply for loans. The performance indicator is in effect excluding the leverage from business advice, and thereby encouraging schemes to use their own money rather than passing on commercial cases to banks. A fuller definition of leverage would keep a separate record of money levered through negotiating with banks to take on cases.
• High leverage targets are inconsistent with lender of last resort. Scheme 2 had particularly high targets for leverage (3:1) which were very difficult to reconcile with the strict requirement of lender of last resort. The problem is not as extreme as it at first appears because banks typically use a narrower definition of matching funds which equates to net tangible assets (overdrafts and loan capital from directors are treated as a liability, not an asset).
• High leverage might suggest low need for public sector funds. Geddes and Erskine (1994) point out that for some categories, such as subsidies and joint ventures, it might be more appropriate to say that the private sector is levering public finance rather than vice versa: 'If the public sector contributes only a very small proportion of project funding, one would suspect it more likely that the "outputs" would have occurred anyway, even without the public contribution.'
• The demand for leverage requires the applicant to construct a complicated package of funding. Delay or indecision on the final elements can then threaten the package as a whole.

Table 6
Definition of leverage

                                         Scheme 1  Scheme 2  Scheme 3  Scheme 4
Public money                                ✓         ✓
Bank loans                                  ✓         ✓         ✓
Bank overdrafts                             ✓         ✓         ✓
Applicants' money                           ✓         ✓
Applicants' help in kind                    ✓         ✓
Professional advisors' help in kind         ✓
Previous investments in cash or in kind     ✓
contributed by the applicant
Enterprise allowance                        ✓
(Scheme 4: not measured systematically.)
A further potential problem, that preliminary estimates of leverage might be exaggerated in order to boost the agency's performance against targets, was tested with evidence from Scheme 1. Data from individual client records allowed proposed and achieved leverage to be compared. That achieved leverage was higher than proposed leverage suggests that, far from exaggerating figures, the enterprise agency was not taking into account the full extent to which a positive decision opened up other sources of funds.
3.4.4. Lender of last resort
All four schemes are intended to operate as lender of last resort. Their interpretations of this differ, however. Scheme 1 takes the client's perspective and defines lender of last resort as where the client has tried but failed to raise money. This could be because of poor presentation rather than the intrinsic merit of the proposal. Scheme 2 takes the bank's perspective and defines lender of last resort as not taking away potential custom from the bank: the proposition is not bankable in principle, regardless of how it is presented. In its most refined form, exemplified by one of the funds covered during the mapping exercise, customers of other banks were acceptable for loans but not customers of the funding bank. Scheme 3 described lender of last resort from an auditor's perspective: applicants must provide letters of rejection from banks. Scheme 4 described lender of last resort from the public sector perspective, meaning that public sector money was not to be used where private sector money could be used. These definitions will not produce the same decisions. The second is a tighter definition than the other three.
Fund managers complained that use of lender of last resort has three incidental disadvantages. Compiling letters of rejection and the other paperwork needed can lengthen the time before a business makes contact with an enterprise agency, and this additional delay can have a financial cost to the applicant. The title of lender of last resort can sound off-putting to potential applicants, and the finance scheme can be stigmatised as a result. Lender of last resort, like the other performance indicators described, is often unfamiliar to small businesses. The manager from Scheme 2, which adopted a strict definition, complained that imposing these criteria can make schemes seem alien or unsympathetic to small businesses.
3.4.5. Ethnicity
Three of the schemes recorded the ethnicity of the owner-manager, and one kept figures on the ethnicity of each employee. The first is easier to do, and can be argued to be fairer because this is the relationship over which the fund manager has some control. The second gives a fuller picture of beneficiaries, although it suffers from all the problems mentioned earlier for job creation, and is at best a snapshot of a changing mix.
The four loan and grant schemes had a high percentage of their recipients from ethnic minorities, over 70% in each case (Table 3). This is higher than the proportion of ethnic minorities in the business population: Business Link research earlier this year found that non-whites own 43% of businesses in Hackney.
Even if ethnic minorities were strongly represented among start ups (where schemes are active), the figures of 70% and above suggest that schemes are not 'creaming' applicants on ethnic grounds. Schemes judge applicants on financial need, but this is to be expected given that funding requires a commitment to repay the loan. The concept of 'creaming' does not fit well with the structure of loan and grant schemes. A more useful interpretation, mentioned in the introduction, is to see schemes as balancing financial and socio-economic objectives.
3.4.6. Write-offs/default rate
The original intention was to compile data on the number of write-offs. However, initial research suggested that, to avoid looking bad, at least one fund was continually rescheduling debts. Without looking at each case, the evaluator would not know whether this rescheduling was justified. The author responded by using the harder indicator of the percentage of loan-holders behind with their payments. One of the schemes, which had previously recorded no write-offs, now showed a 66% failure rate.
Measurement problems aside, repayment figures are difficult to interpret:
• A low failure rate could mean strong risk management or low additionality. One fund with a low failure rate was described by local Business Link advisors as `tougher than the banks'. This is an example of the perennial trade-off between socio-economic and financial goals.
• Funders rather than fund managers make the decisions on cases. Scheme 1 had a high default rate but also a particularly high percentage of cases pushed through despite business advisors' views that they were not viable.
• Default rates varied widely between types of applicants. Funding criteria therefore predisposed schemes to lower or higher rates. For Scheme 1, just over 90% of the money lent to start-ups was still outstanding compared to around 38% of that lent to established firms. Furthermore, it seemed that the default rate was twice as high for a series of 11 loans funded separately (by City Challenge) to meet a tight deadline.
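A rough worked example, using the two outstanding-balance rates quoted above for Scheme 1, shows how far funding criteria alone can move the portfolio-level figure. The start-up/established mix is an illustrative assumption:

```python
# Outstanding-balance rates by applicant type, taken from the Scheme 1
# figures quoted in the text; the portfolio mix is hypothetical.
RATE_STARTUP = 0.90
RATE_ESTABLISHED = 0.38

def blended_rate(share_startup):
    """Portfolio-level outstanding rate for a given start-up share."""
    return share_startup * RATE_STARTUP + (1 - share_startup) * RATE_ESTABLISHED

# Criteria that tilt the fund towards start-ups nearly double the rate,
# before management quality enters the picture at all.
print(round(blended_rate(0.2), 3))  # 0.484
print(round(blended_rate(0.8), 3))  # 0.796
```

Two funds with identical management could therefore report very different default figures purely because of the applicant mix their criteria produce.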
A. Jackson / Progress in Planning 55 (2001) 1–64 35
3.4.7. Enquiry levels
Although enquiry levels are not subject to ongoing monitoring and targets, they are taken as a measure of marketing. However, the figures produced can depend on two factors:
• How enquiries are allocated if the organisation manages more than one fund. This is similar to the absorption problem with overheads in accounting.
• Procedures for recording enquiries: whether all phone calls are included, just those resulting in a request for information, or only those which have been fulfilled.
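The absorption problem can be illustrated with a small sketch. The fund names, enquiry total, and allocation bases are all hypothetical:

```python
# When one agency manages several funds, total enquiries must be
# apportioned between them -- analogous to absorbing overheads in
# accounting. Both allocation bases below are defensible, yet they
# credit each fund with a different enquiry figure.

def allocate(total, weights):
    """Split `total` enquiries in proportion to `weights` per fund."""
    s = sum(weights.values())
    return {fund: total * w / s for fund, w in weights.items()}

total_enquiries = 300
by_budget = allocate(total_enquiries, {"fund_a": 500_000, "fund_b": 100_000})
by_loans = allocate(total_enquiries, {"fund_a": 20, "fund_b": 30})

print(by_budget["fund_a"])  # 250.0 enquiries credited to fund A
print(by_loans["fund_a"])   # 120.0 enquiries credited to fund A
```

The same marketing effort can thus be reported as 250 or 120 enquiries for one fund depending only on the allocation convention chosen.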
A high level of enquiries is not in itself an indicator of good performance. High figures could reflect lack of targeting or poor management of enquirers' expectations.
3.4.8. Business planning process
Fund managers are paid to provide business planning. One of the interesting findings from the research was the different perceptions of what constitutes a good business plan.
• A complete business plan (bureaucrat's perspective).
• An exciting market opportunity (entrepreneur's perspective).
• A thoughtful/dynamic business plan, considering possible risks and sensitivities (analyst's perspective).
• A well-presented document (marketer's perspective).
This ambiguity caused problems for the operation of the funds. Scheme 1 had a 43% approval rate for internally generated applicants against a 27% approval rate for referrals from local authority business advisors. An interview with one of the local authority business advisors suggested that he was using a bureaucrat's definition of a good business plan (`it is one which is complete'), whereas the fund manager was using an entrepreneur's perspective (`a good business plan is one where the business appears viable').
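The mismatch between the two definitions can be sketched as follows. The applicant records are hypothetical; only the logic of the two screens is taken from the text:

```python
# Two definitions of a 'good' business plan rating the same applicants:
# the referrer screens on completeness, the fund manager on viability.

applicants = [
    {"complete": True,  "viable": True},
    {"complete": True,  "viable": False},   # complete but weak business
    {"complete": False, "viable": True},    # promising but gaps in the plan
    {"complete": True,  "viable": False},
]

bureaucrat = [a for a in applicants if a["complete"]]   # referrer's screen
approved = [a for a in bureaucrat if a["viable"]]       # fund manager's test

# Plans passed on as 'good' by the referrer mostly fail the fund
# manager's viability test, depressing the referral approval rate.
print(len(approved), "of", len(bureaucrat), "referred plans approved")
```

A referral route that filters on completeness will therefore show a systematically lower approval rate than one aligned with the fund manager's viability test, exactly the 27% versus 43% gap reported above.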
3.4.9. Additionality
Additionality was not used as a formal measure in the ongoing monitoring of the four loan and grant schemes. However, it was implied in the complaints that funders made about their schemes. It is therefore useful to examine the problems of measuring additionality, as described in the literature.
• Construction of the counterfactual case typically depends on a hypothetical question about the likely alternative outcome for the firm. Answering this question requires the interviewee to simplify the complicated circumstances of corporate decision-making, often some time after the event, using personal knowledge which might be incomplete, and without time for reflection (McEldowney, 1997: 184). Attributing success to outside factors is an emotional issue affected by the character of the interviewee (for example, their internal locus of control), and their general sympathy towards the programme under evaluation. Storey (1990: 675) observes that: "it is very unlikely that firms will project that in 2 or 3 years' time they will have ceased trading." He refers
to earlier work on firms supported by British Steel which showed that only 65% of the employment predicted was actually achieved.
• Projects are often financed by a package of funders. Separating out the impact of each strand of support is subjective. In practice this often leads to double counting between operators (Pearce and Martin, 1996).
• Funding is likely to have a complex effect on firms: increasing the scale or speed of expansion, or removing some of the pressure on owner-managers so that they can devote more time to other activities. This case of `partial' additionality is difficult to quantify (Pearce and Martin, 1996).
• Analysis at the local level can overestimate additionality because displacement effects are not taken into account (P.A. Cambridge Economic Consultants, 1987). For instance, without these Hackney loan schemes, job creation in the adjacent borough of Islington (which has a dearth of loan schemes) might have been higher.
• Calculations of additionality tend to disregard the opportunity cost of the programme, for instance counterbalancing changes in mainstream funding, which might otherwise have created jobs or stimulated small business development (Pearce and Martin, 1996).
• Final outputs are often difficult to quantify. For this reason, intermediate outputs (start-up, expansion, investment) are often used in calculating additionality, which can give a misleading picture of programme success (Pearce and Martin, 1996).
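A stylised net-additionality calculation, with illustrative rates rather than scheme data, shows how quickly gross figures shrink once the usual adjustments discussed above are applied:

```python
# A standard net-additionality adjustment: strip out deadweight (jobs
# that would have arisen anyway) and displacement (jobs taken from
# other local firms). All rates here are illustrative assumptions.

def net_jobs(gross, deadweight, displacement):
    """Jobs attributable to the scheme after the usual adjustments."""
    return gross * (1 - deadweight) * (1 - displacement)

# 100 gross jobs reported; 40% deadweight; 50% of the remainder displaced.
print(net_jobs(100, deadweight=0.4, displacement=0.5))  # 30.0
```

Under these (hypothetical) rates, a scheme claiming 100 jobs is actually responsible for 30, which is why gross job-creation indicators flatter programme performance.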
3.4.10. Distributional effects
Schemes differed in how they interpreted targets for performance indicators. In Scheme 2 targets were assumed to apply to each individual applicant. The other three schemes chose to adopt what can be described as a portfolio approach, where average figures (for example, for leverage) met targets. This gives a more flexible approach which allows the fund manager to adapt the scheme to different client needs.
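The two readings of the same target can be sketched as follows, assuming a hypothetical 1:1 leverage target and illustrative client figures:

```python
# Leverage achieved per client: matched funds per pound lent.
# The target and the figures are illustrative, not scheme data.
leverages = [0.0, 0.5, 2.0, 3.5]
TARGET = 1.0

# Scheme 2's reading: every individual applicant must meet the target.
per_applicant_pass = all(x >= TARGET for x in leverages)

# The portfolio reading: only the average across clients must meet it.
portfolio_pass = sum(leverages) / len(leverages) >= TARGET

print(per_applicant_pass)  # False: two clients bring no match funding
print(portfolio_pass)      # True: average leverage is 1.5
```

The same client base therefore passes or fails the target depending purely on the interpretation adopted, which is the flexibility the portfolio approach buys.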
3.4.11. Conclusion
Analysis through performance indicators is partial. The indicators in operation for the case studies fall into the following seven classes.
• Input: budget, staffing.
• Process: enquiry levels, terms of lending, business planning, lender of last resort, leverage.
• Equity: ethnicity of loan holder.
• Efficiency: unit cost per loan, write-offs.
• Service quality: client feedback on satisfaction with service.
• Outputs: money spent.
• Outcome: job creation.
An eighth class, effectiveness, is represented below.
The research found six fundamental weaknesses in the use of performance indicators.
• They are ambiguous, inconsistently applied, and therefore difficult to compare between cases (all indicators to varying degrees).
• They measure factors which are not always within the control of those being evaluated (deployment of funds, leverage, job creation, enquiry levels).
• They are unstable, changing in ways which are independent of the input from the programme (enquiry levels, deployment of funds).
• They are difficult to interpret, being open to different interpretations, some of which would be favourable and others unfavourable (default rates, deployment of funds, leverage).
• They miss out an area of vital significance to the programmes: effectiveness.
• They are unfamiliar to small business clients, and can make the schemes appear unsympathetic and bureaucratic.
3.5. Valuing
The two hypotheses here were:
Hypothesis 5. Evaluation results support different judgements according to the values
applied.
Sub-hypothesis 5a. Differences in judgements will follow the lines between stakeholder
groups.
Hypothesis 6. Evaluation will produce positive and negative readings on different
aspects of the programme.
Sub-hypothesis 6a. Evaluation that seeks to summarise its findings into a single aggregate answer will be inconclusive.
The research illustrates well the problem of weighting different findings (Table 3). Scheme 1 had good deployment (in terms of total number of loans) and high job creation, but also a high default rate. Scheme 2 had relatively high leverage and a low default rate, but a low approval rate. Scheme 3 had a high average deployment rate, a high approval rate, and a low default rate, but a high average cost per job. Scheme 4 had the highest percentage targeting and a large number of enquiries, but also a high default rate and the lowest job creation. The research also compiled `soft' information on the schemes which was not picked up by the performance indicators. Scheme 1 was seen to have strong links into the local community. Scheme 2 had excellent management systems. Scheme 3 had adopted a proactive approach to marketing, including an outreach advisor to find and evaluate prospective loans.
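A small sketch illustrates why a single aggregate answer would be inconclusive: with illustrative scores standing in for the Table 3 findings, the `best' scheme flips as the weights change:

```python
# Illustrative scores per scheme on three aspects (higher = better):
# (deployment, job creation, default control). Not actual Table 3 data.
scores = {
    "scheme_1": (0.9, 0.9, 0.2),  # strong deployment/jobs, high default
    "scheme_2": (0.4, 0.5, 0.9),  # cautious lender, low default
}

def aggregate(s, weights):
    """Weighted sum of a scheme's scores."""
    return sum(a * w for a, w in zip(s, weights))

# A funder prioritising default control ranks Scheme 2 first...
funder = {k: aggregate(v, (0.2, 0.2, 0.6)) for k, v in scores.items()}
# ...while a job-creation emphasis ranks Scheme 1 first.
jobs_led = {k: aggregate(v, (0.4, 0.5, 0.1)) for k, v in scores.items()}

print(max(funder, key=funder.get))      # scheme_2
print(max(jobs_led, key=jobs_led.get))  # scheme_1
```

No weighting is value-neutral: the ranking is an artefact of the values applied, which is the point of Hypothesis 5.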
The research found clear differences between the perceptions of each of the main
stakeholders. Funders saw the schemes in terms of their contribution to organisational
goals. They were mainly concerned with whether targets had been met, whether the
scheme was collaborating appropriately with other schemes (some set up by the funder),
whether outcomes represented additionality and value for money. For instance, the brief
for the evaluation of Scheme 1 was: `to assess the performance of the fund, taking account
of the objectives and targets set out in task force and city challenge contracts; to examine
the context in which the fund currently operates, including its relationship with other loan
funds operating in the borough; and to make recommendations on the future of the fund
following the closure of the task force.' Fund managers were more focused on the risks and
benefits to the enterprise agency. They were concerned with whether instructions from funders were clear and reasonable, whether targets gave a full picture of the management work, whether the scheme presented synergy with other enterprise agency activities, whether decision-making gave sufficient safeguards for staff, whether the contract was too open-ended (exposing the agency to financial risk), and whether decision-making panels had the right skills. Clients were focused on their individual case, and their response was strongly affected by the outcome of their application. Although 66% of interviewees thought that applying was easy, several commented that the process was not sufficiently flexible to deal with their individual circumstances.
Notwithstanding this, significant differences between schemes were observed. Scheme 1 referred frequently to their clients. For instance, their interview was full of comments such as `In the end of the day, I want what is best for the client' and `Whether I like an organisation or not is irrelevant, if my client will benefit from being referred to them, I will refer them.' The manager for Scheme 2 spoke instead of propositions and of the risks of managing the scheme. The manager for Scheme 3 talked in terms of putting deals together.
3.6. Use
The two hypotheses investigated were as follows.
Hypothesis 7. Evaluation findings will not be used.
Sub-hypothesis 7a. Enlightenment use will be more common than instrumental use.
Hypothesis 8. Communication of evaluation findings will be problematic.
Sub-hypothesis 8a. Over-simplification increases the risk of mis-use. Over-complexity obscures the meaning.
The data compiled for this project were later employed to complete three consultancy assignments, as a way of investigating use.
The researcher spent considerable time thinking about how to communicate findings to the client. The problem was not that of complexity. The findings, although multi-faceted, were easily summarised (see Table 3). The clients were clearly used to reading, and indeed preparing, technical reports. Rather, the researcher was concerned that the results might be taken out of context and used to criticise the enterprise agencies under examination. The enterprise agencies were already threatened by the introduction of Business Link and suffering a negative reinforcement cycle, whereby underfunding leads to criticism of the services and further underfunding. The author decided to place accountability conclusions within a developmental context, which is in keeping with the recommendations of Torres et al. (1996: 93) on the presentation of negative findings.
3.7. Practice
The hypotheses investigated were as follows.
Hypothesis 9. Data for evaluation will be difficult to compile.
Sub-hypothesis 9a. Compilation difficulties will include non-disclosure as well as lack of availability.
Hypothesis 10. Evaluators will be pressured to change findings to fit the decision-maker's perspective.
Sub-hypothesis 10a. This will take the form of pressure to accentuate positive findings.
The level of information available varied across the four cases. Scheme 1 supplied word-processed tables of applicant and recipient information. These seemed to have been typed in for each information request and were not kept on a spreadsheet or database. This information did not inspire confidence; for instance, reference numbers had gaps and duplicates. Attempts to interview recipients found that most of the contact details were incomplete or inaccurate. A computerised credit control list was handed over at the final presentation meeting. This is, then, an example of non-disclosure. Schemes 2 and 3 gave the impression of having good client information, and provided summary tables prepared for their decision-making panels. Additional variables required for this research were compiled verbally from the memory of the fund managers. Scheme 4 was undergoing a change in management. Information was not available although individual client files seemed to exist. In response to the problems with data for Scheme 1, the author scrutinised application forms, business plans, and panel minutes to compile a database on the 150 applicants to the fund. This took five days' work. The enterprise agency listened to the criticism, and is in the process of establishing a client database system.
The overall conclusion is that data were incomplete (often having to be filled in verbally), were not in the form needed for evaluation, and were frequently scattered across different functions of the organisation (typically split between business advisors and administrators). The inability to verify data obtained in this way led the researcher to prefer to compile data from individual client files, despite the time involved.
As a test of the second set of hypotheses, none of the clients attempted to exert pressure on the consultant to change the direction of investigation or conclusions. The one client who disregarded findings did this through commissioning more favourable research rather than attempting to change evaluation recommendations.
3.8. Conclusion
Hypotheses 1, 1a and 2 were supported by the four case studies. Objectives were ambiguous. The form this ambiguity took was that objectives were global, diffuse, and distant from programme functioning. The assumptions underpinning scheme operation were unstated. Insufficient information was obtained to test Hypothesis 2a but support for this hypothesis is provided by the analysis of Valuing.
Hypotheses 3, 3a and 4 were supported. Performance indicators presented measurement problems. The measurement problems presented were those of subjectivity, simplification, and partiality. Performance indicators also presented organisational problems. However, the organisational problems presented did not seem to be those of `creaming', short-termism, and demoralisation. Rather, the main problem seemed to be that the heavy imposition of monitoring demands from funders alienated agencies from their own information needs, and discouraged self-evaluation.
Hypotheses 5, 6 and 6a were given tentative support, and 5a partial support. Evaluation
results supported different judgements according to the values applied. Differences in
judgements did, in part, follow the lines between stakeholder groups; however, there were also notable differences between schemes. Evaluation did produce positive and negative readings on different aspects of the programme. Summarising findings into a single aggregate answer would have proved inconclusive.
Hypotheses 7 and 7a were rejected. Evaluation was used, and the form this use took was instrumental rather than enlightenment related. Hypotheses 8 and 8a received limited support. Communication of evaluation findings was problematic, but because of the desire to avoid misuse rather than because of the complexity of the material.
Hypotheses 9 and 9a received some support. Data for evaluation were difficult to compile, and the compilation difficulties included non-disclosure and unavailability. Hypotheses 10 and 10a were not supported. The evaluator was not pressured to change findings to fit the decision-makers' perspective, positively or otherwise.
CHAPTER 4
Discussion
4.1. Introduction
The purpose of this section is to comment on the significance of the findings from the previous chapter, to examine possible explanations for these findings, to consider implications for future evaluation of loan and grant schemes, and to look at the feasibility of putting these proposals into practice.
4.2. Socio-economic programming
Chapter 3 found that objectives and assumptions were not stated explicitly. This section will consider the possible explanation for this oversight, and its importance in evaluation.
Loan and grant schemes fall into the category of projects where "a treatment directly acts on the characteristics of the problem" (Chen, 1990: 159). Finance schemes almost by definition improve the financial position of the recipient. This might in itself explain the lack of pressure to specify the benefits from finance schemes. Understanding this oversight is not to excuse it, however. Four fundamental problems with loan and grant schemes can be traced to their lack of understanding of theoretical mechanisms.
First, the link from financial support to economic impact is poorly developed. This is a weakness of problem definition. Social and economic systems often present what Harmon and Mayer (1986) describe as `wicked' rather than `tame' problems. The validity of evaluation analysis is limited by the boundaries within which the research is set. If the boundaries are drawn too widely or too narrowly, the explanation for events will be externalised, or predetermined by definition. This is the same as saying that a criminologist's study of burglary is more likely to talk about police methods than about poverty and deprivation (Pawson and Tilley, 1997).
The logic underlying schemes is that the provision of loans or grants improves the financial position of small firms, and so produces an economic impact. The second element of this, the translation of financial benefit into economic benefit, makes several assumptions which have been questioned within economic development:
• That small businesses create jobs in the local economy. Birch's seminal work in the United States (Birch, 1979) concluded that "small firms (those with 20 or fewer employees) generated 66% of all new jobs in the US" for the period 1969–1976. Continuing in this vein, Newcastle University carried out research for the Department of Employment (reported in Bank of England, 1994) which showed that firms with fewer than 20 employees created 2.4 million net jobs in Britain between 1982 and 1991. However, detractors have argued that the importance of small firms is decreased when distributional and qualitative factors are taken into account. Storey (1993) has shown that 4% of small businesses are responsible for 50% of job generation. A series of surveys have found that the percentage of small businesses aspiring to rapid growth is small, as low as 22% or even 10% (Cosh and Hughes in Hughes and Storey, 1994).
• That these jobs are new rather than displacements. The TUC (1997) argues that most firms are not creating jobs so much as "recycling" or "absorbing" labour outsourced or subcontracted from large firms. Storey (1990: 679) comments that "displacement rates are very high for smaller firms and they vary from one trade to another, being particularly high in the construction and retail sectors". Net job creation will, then, be considerably lower than gross figures suggest.
• That lack of finance acts as one (or the main) limiting factor to the start-up, expansion or survival of small businesses. Accounts of the financial problems of small firms are numerous (Midland Bank, 1992; CBI, 1993; Bank of England, 1994, 1997). Birley and Niktari (1995) found that owner-managed businesses fail for reasons which are, according to their bankers and accountants, predominantly financial. That firms have financial problems is not the same as saying that these problems can be overcome through intervention, however. Stanworth and Gray (1991) conclude that the problems of small firms arise in part from the nature and characteristics of the small firms, the attitudes of owner-managers, and the economics of small-scale lending or investment.
• That small businesses want to take on external finance. Small firms typically have a `pecking order' of preferences running from overdrafts and internal funds, through hire purchase and leasing, and then loans, down to external equity (Cranfield European Enterprise Centre, 1993). This reflects a desire to maintain control over the business (Cosh and Hughes in Hughes and Storey, 1994).
• That delivery mechanisms for the specific loan or grant schemes allow them to reach those small businesses most in need of financial support. The size, structure, and age of small firms all reduce the quality of information about them, and can therefore increase the difficulties of making contact.
Each of the stages in the simple logic diagram for loan and grant schemes has possible sources of leakage (Fig. 4). At the first stage, capital in the fund could be depleted through the costs of managing the scheme. At the second stage, benefits to recipients could be depleted through inappropriate choice of recipients, insufficient matching of the amount of money provided to the recipient's need (including deadweight), use of the grant or loan money for inappropriate purposes, or substitution of grant or loan money for other sources of
Fig. 4. Diagram of loan or grant scheme theory.
funding. At the third stage, benefits to the economy could be depleted through collapse of recipients, their relocation, or through displacement of existing firms.
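The three leakage stages can be sketched as a simple cascade of proportional losses. All leakage rates below are illustrative assumptions, not estimates for the schemes:

```python
# A stylised pass through the three leakage stages of Fig. 4:
# fund capital -> recipient benefit -> economic impact.

def after_leakage(amount, *leak_rates):
    """Apply successive proportional leakages to an amount."""
    for rate in leak_rates:
        amount *= 1 - rate
    return amount

capital = 100_000
net_benefit = after_leakage(
    capital,
    0.15,  # stage 1: management costs deplete the fund
    0.25,  # stage 2: deadweight, misdirected or substituted funding
    0.30,  # stage 3: recipient collapse, relocation, displacement
)
print(round(net_benefit))  # roughly 45% of the capital survives as benefit
```

Because the losses compound, even moderate leakage at each stage leaves well under half of the original capital as net economic benefit, which is why the text argues that funders should direct schemes to monitor leakages explicitly.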
The empirical work found three practical problems which derived from the lack of clear objectives:
• The funder for Scheme 1 threatened to tender fund management because of the high default rate, only to find that the terms of the contract were so general that termination on this basis was likely to be open to legal challenge. The main contractual obligation upon the enterprise agency was to spend the money.
• In Scheme 2, differences in the way individual applicants were rated by the two main board members (a task force and a bank) led to a high rejection rate (70%).
• Misunderstanding of the role of Scheme 1 led to inappropriate referrals from other agencies, especially the local authority business advisors, and hence a lower approval rate than for internally generated applicants.
A fourth problem of unclear objectives and assumptions is that fund managers were not given clear guidance on risk positioning. This reflects the funders' failure to look explicitly at the trade-off between financial and non-financial objectives. Over-emphasis on financial objectives will mean that additionality will be lacking because the fund will be making loans that could be made by banks. Over-emphasis on socio-economic objectives will mean that money will be lost, and this undermines the long-term sustainability of the fund because money will not be recycled for further lending. Where objectives are not clear, funds are open to criticism for poor additionality or high default, without appreciation that these are two sides of the same coin. In Scheme 1 objectives seemed to have suffered from `policy drift' away from an early concern with targeting (high risk) to a later tightening up around job creation (low risk). In the first phase, decisions were pushed through, often against the fund manager's recommendation, which left a level of default that looked highly unsatisfactory by the second phase.
The overwhelming conclusion from this section is that loan and grant schemes need to clarify their objectives and strengthen their understanding of the theoretical mechanisms of their programme. Funders should make an explicit statement of their risk position, and direct schemes to monitor leakages from the programme. Strengthening socio-economic programming in this way promises benefits for theory, implementation, and evaluation itself. First, funders can ensure that their expectations are reasonable. Where analysis suggests that the default rate is going to be very high, a grant scheme might be chosen over a loan scheme. Second, evaluation can contribute to the overall pool of knowledge on the financial problems of small firms. `Plausible' outcomes and indirect effects can be predicted. Understanding of causal mechanisms provides a better basis for generalisation than specification of project descriptions, and will increase appreciation of project diversity. Third, identifying key processes provides practical levers for improving implementation. Elements which do not contribute to the programme can be removed, which reduces project waste. Last, clarifying theoretical assumptions ensures that research can be focused in an explicit manner that gives interviewees a more equal role in the debate. This helps to lay the foundation for a `no shocks' approach to evaluation reporting.
4.3. Knowledge construction
Chapter 3 found measurement and organisational problems from employing performance indicators. This section will consider the relevance of performance indicators to loan and grant schemes, scope for strengthening measurement of effectiveness, ways of introducing `soft' performance indicators, and the practicality of adopting a more developmental approach to performance management.
Radaelli and Dente (1996) provide a matrix matching the type of evaluation to the type of programme (Table 7). Programmes are plotted by the degree of innovation and the amount of social conflict. The quadrant most relevant to British loan and grant schemes would seem to be that of low innovation and low social conflict, which is described as `Tableau de bord'. This box is recommended for development of a time-series of performance indicators to improve the quality of decision-making and contribute to accountability. The point of this analysis is to suggest that performance indicators should be manageable within loan and grant schemes once the problems of ambiguity have been overcome.
There is now a general recognition of the limitation of financial, especially cost-accounting, approaches to performance management (Johnson and Kaplan, 1991). One `soft' performance indicator which seems relevant to loan and grant schemes is the version of `social capital' put forward by Bulder et al. (1996). They define social capital as "the social networks of employees". This definition ties in well with the work by Birley on the importance of networking to entrepreneurs (Birley, 1985; Ghaie and Birley, 1993; Ostgaard and Birley, 1994). Bulder et al. review research which shows that social capital increases the productivity of staff through strengthening their motivation, improving morale (and therefore reducing staff turnover), and widening their access to information, advice and support. They conclude that: "it is possible that organisational reforms claiming increased efficiency and effectiveness may have negative, albeit unintended, consequences for social networks within the organisation; this may turn its social capital into `sour' capital." Bulder et al. provide analytical techniques for quantifying changes in social networks, and therefore tracing the damage done by such organisational reforms. A performance indicator based on this definition of social capital would provide some measure of the intense networking activity carried out by Schemes 1 and 4. This work could be used by funders to examine the possible disadvantages of re-tendering finance scheme management contracts.
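One simple quantification in the spirit of this approach is network density, sketched below on a hypothetical staff network. Density is a standard network measure used here for illustration, not necessarily the specific technique of Bulder et al.:

```python
# Network density: the share of possible ties that actually exist.
# Losing a well-connected member of staff (e.g. through re-tendering)
# removes several ties at once. The network itself is hypothetical.

def density(nodes, edges):
    """Density of an undirected network: edges / possible edges."""
    possible = len(nodes) * (len(nodes) - 1) / 2
    return len(edges) / possible

staff = {"a", "b", "c", "d"}
ties = {("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")}
print(round(density(staff, ties), 2))  # 0.67: 4 of 6 possible ties

# Losing the central member 'a' removes three ties in one stroke.
remaining = staff - {"a"}
remaining_ties = {t for t in ties if "a" not in t}
print(round(density(remaining, remaining_ties), 2))  # 0.33
```

Tracking such a measure before and after a contract change would give funders a crude but quantified view of the social capital destroyed by re-tendering.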
Chapter 2 emphasised the difficulty of introducing a learning culture. An analysis by Power (1994) of `the audit explosion' illustrates how deeply change needs to reach. He identifies the fundamental problem as being lack of trust of the organisation under
Table 7
Dimensions of the policy process and choice of evaluation strategies

                          Social conflict
Innovation          Low                      High
Low                 `Tableau de bord'        Unclogging
High                Discovery                Conflict management
evaluation. He argues that the expansion of evaluation (or auditing) serves a symbolic rather than an instrumental purpose, as a demonstration of control over public spending. Power believes: "the audit explosion signifies a displacement of trust from one part of the economic system to another; from operatives to auditors." Audits have therefore made organisations less rather than more transparent. The result is a vicious cycle whereby "people adapt their behaviour to reflect the fact that they are not trusted and therefore confirm that they should not be trusted." This is not an inherent problem within performance management but rather a result of the tensions from having local controls imposed from the centre: "a much resented degree of `backseat driving' by central government" (Carter, 1989). The solution, according to Power, is to open organisations up to a process of continual learning, that is, to adopt a developmental approach.
It is arguable that narrowly defined contractual relationships, such as those employed for the day-to-day management of each of the loan funds under study in this project, are not ideal breeding grounds for a learning culture. The implication seems to be that public sector organisations should copy the private sector in moving towards long-term relationships with their suppliers (see, e.g. Ford, 1990).
There is a second, more narrowly defined, problem with the developmental approach. The impact of a programme is easier to measure once it has reached a position of stability. The evaluator can then distinguish between set-up costs/benefits, transitional costs/benefits, and ongoing (recurrent) costs/benefits from a programme. However, organisational learning is likely to lead to continual adjustments in the operation of a programme (Richardson et al., 1996). This will make impact more difficult to measure.
A third possible problem of embedding evaluation in organisations lies in the need to
widen understanding of and control over information. Democratising evaluation in this
way is going to be more deeply felt if practitioners can form their own interpretations of
data. The literature contains two examples of innovative ways of presenting information.
Henry (1992) uses STAR icons which plot variables on each side of a polygon to allow the
reader to see the pattern across individual variables in the data, and thereby draw their own
conclusions. The overall shape produced allows many different cases or sites to be
compared simultaneously. Research at Imperial College is producing information systems,
such as the Bifocal Display technique, which allow the reader to see the distribution of
data, and therefore to set their own class boundaries (Spence, 1996).
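The coordinate calculation behind such star icons can be sketched as follows. The scheme scores are illustrative, and the variables are assumed to be scaled to the range 0–1:

```python
# STAR icons place each variable on a spoke of a polygon so that whole
# cases can be compared at a glance: each value becomes the distance
# from the centre along an evenly spaced spoke.
import math

def star_points(values):
    """(x, y) vertex for each variable, spokes evenly spaced round a circle."""
    n = len(values)
    return [
        (v * math.cos(2 * math.pi * i / n), v * math.sin(2 * math.pi * i / n))
        for i, v in enumerate(values)
    ]

scheme = [0.9, 0.2, 0.7]  # e.g. deployment, default control, leverage
pts = star_points(scheme)
print(len(pts))   # one vertex per indicator
print(pts[0])     # (0.9, 0.0): the first spoke lies along the x-axis
```

Joining the vertices of each scheme's polygon produces the icon; because the eye compares shapes rather than numbers, practitioners can form their own reading of the pattern across indicators, which is the democratising point made above.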
The conclusion from this section is that performance indicators can play a potentially valuable role in loan and grant scheme evaluation. However, their measurement and organisational context need to be changed. The performance indicators themselves should be widened to include `soft' as well as `hard' indicators. Quantitative indicators' claims to objectivity can be questioned: "Quantitative measurement rests on qualitative assumptions about which constructs are worth measuring and how constructs should be conceived" (Shadish et al., 1995: 133, commenting on Campbell). Combining quantitative and qualitative methods will increase the relevance and validity of evaluation. This is because different methods suit different evaluation contexts and different aspects of programme operation. Qualitative methods excel on `bandwidth' (the range of issues addressed), while quantitative methods excel on `fidelity' (the accuracy of information on the issues addressed). Triangulation allows greater perspective on research issues. Using both gives a broader base for discovery, and allows for the generation of new hypotheses
A. Jackson / Progress in Planning 55 (2001) 1–64
as well as the testing of existing hypotheses. Furthermore, using different methods
can make the best of the `division of expertise' among stakeholders (Pawson and Tilley,
1997).
A second major conclusion is that evaluation should be set within a developmental
rather than an accountability approach. Performance indicators work better when used for
learning rather than control. McEldowney (1997: 177) argues that: "The advantages of this
model may lie in the fact that it may make policy-makers less defensive and less
threatened by the evaluation exercise." Notwithstanding this, the literature review, empirical
work, and comments earlier in this chapter all emphasise the problems of introducing a
developmental approach. A developmental approach cannot be laid on top of existing
organisations but rather requires fundamental changes to structure, culture, and
communications channels.
4.4. Valuing
Chapter 3 showed how loan schemes differed from each other in their definition of
performance indicators. This variation was not random but followed a systematic pattern
across the schemes. Scheme 1 consistently defined performance indicators more widely.
Scheme 2 defined them more narrowly. Scheme 3 lay in the middle of these two. Further,
the terms in which fund managers described their work differed, focusing on clients
(Scheme 1), propositions (Scheme 2), and deals (Scheme 3). Putting all this together
suggests three different styles of managing loan and grant schemes (Table 8). These
different management styles are reflected in choices about marketing, decision-making,
and monitoring, as well as in the interpretation of performance indicators. Scheme 1 is client focused,
Scheme 2 is systems focused, and Scheme 3 is focused on deal-making. Scheme 4 had
experienced several changes in its management, and showed a less clear pattern.
The explanation for these patterns seems to lie in the pressures imposed on loan and
grant schemes. Each scheme has to balance socio-economic and financial objectives but
the choices they make seem to depend on other pressures. For Scheme 1 these pressures
were political, especially around a commitment to support ethnic minorities; for Scheme 2
prevailing pressures were audit related, designed to protect the agency from external
attack. Scheme 3 operated under extreme time pressure, and needed to act entrepreneu-
rially to meet its tight deadlines.
This model of management styles raises five questions about the Valuing of loan and
grant schemes.
• Should loan schemes be guided to perform more widely on objectives rather than giving
priority to one perspective? Alternatively, should expectations on loan funds be
adjusted to acknowledge the conflicting pressures under which they must operate?
• To what extent is there a gain in having different loan schemes in an area operating under
different management styles? Enterprise agencies seem to be reaching different client
groups, possibly because clients gravitate towards a style which makes them feel
comfortable. If this finding is validated, the implication is that the current tendency towards
single access points in Business Links may be reducing the scope to perform across the
wide range of objectives, and some client groups might be excluded in the process.
• To what extent do the different styles weaken relationships between loan schemes? Would
greater recognition of the difference in styles improve mutual respect, and therefore
increase referral?
• What are the relative merits of the different models? To what extent does the funding
regime encourage one style over another? The researcher's impression was that the client
oriented route was the most difficult to pursue because it is at odds with the bureaucratic
culture of the funding agencies.
• Whose values should be employed during the evaluation? Can the evaluator fairly
compare schemes which appear to be founded on different priorities?
The first issue on this list, trading-off different priorities, is discussed in the evaluation
literature. Chen (1990) indicates three strategies for dealing with multiple and conflicting
values. A maximising approach prioritises one of the values. A sequencing approach gives
values priority at different stages. A balancing approach gives all values equal attention. Chen
argues in favour of the third because: additional effort devoted to individual values can be
expected to be subject to the law of diminishing marginal returns; decision-making more
usually adopts an approach of `satisficing' (March and Simon, 1958); and systems theory
suggests that emphasis on any one value is dysfunctional. However, the context for Chen's
analysis is designing an evaluation rather than implementing an economic development
project. Chen is assuming that only one evaluation is carried out at any one time. By contrast,
local economies can contain several loan and grant schemes. Different value positions can be
expected to match the needs of a wider profile of clients. A unified approach provides
less choice for the client, and may even exclude some groups from support. An alternative
view can be put forward, that `maximising' will enable each implementing agency to build on
its strengths and best meet the needs of specific client groups. This approach is analogous to a
business strategy of differentiation rather than cost leadership. Notwithstanding this, loan
schemes may be able to learn from each other. For example, business clubs and IT systems
can be combined in the service of credit control procedures.

Table 8
Management styles of the case studies

                        Client oriented          Bureaucratic             Entrepreneurial
                        (Scheme 1)               (Scheme 2)               (Scheme 3)

Orientation             Clients                  Systems                  The deal
Pressures               Political                Auditing                 Time
Marketing               Word of mouth            Bank referral            Outreach
Decision-making         Packaging/needs          Processing/business      Negotiating/opportunity
                        oriented                 plan oriented            oriented
Core decision-making    Counsellors              Banks                    Professionals
structure
Loan monitoring         Business club            Monthly accounts         Visits
Strengths               Client relations;        Systems; low             Throughput; presence
                        targeting                failure rate
Weaknesses              Seen as a soft touch;    Throughput; bias         Lack of clarity to
                        higher failure rate                               applicants
Defence                 Writing off as a         Argument that there is   Good record keeping
                        political decision       no demand; collection
                                                 of information on other
                                                 funds' throughput
Performance indicators:
Interpretation of       From the client's        From the bank            From the public sector
lender of last resort   perspective: tried       perspective:             perspective:
                        and failed               unacceptable in theory   unacceptable in practice
Interpretation of       Wide, includes help      Narrow, includes         Medium
leverage                in kind                  proven bank funds
This section questions the basic principle that standards can be applied across schemes.
If agencies cannot excel on all aspects of performance, local areas might benefit from
having a range of schemes which each excel on different areas, reflecting their core skills.
Schemes should, then, be judged together in terms of their complementarity with each
other.
A second conclusion is that project level studies should be co-ordinated, and results
compiled across different evaluations. Introducing this form of meta-analysis would allow
effectiveness measures to be included. Some of these are difficult to analyse at the project
level. The information provided could be used to support benchmarking of loan and grant
schemes. Meta-analysis in its more common sense (integration after, rather than before
research is carried out) would also provide a check on the quality of evaluation research
through dialogue between different evaluators.
4.5. Use
The research found strong instrumental use in two out of three of the consultancy
exercises carried out. This section examines some of the problems of instrumental use,
and the possible reasons why instrumental use was easier than would be expected from the
evaluation literature.
This project highlighted three negative aspects of instrumental use. First, a long-term
focus on decisions about the renewal of contracts seemed to have undermined commu-
nication between funders and scheme managers. The funders had adopted a laissez faire
management style. Concerns dating back as far as the establishment of the schemes had
not been articulated or explained. The funders were behaving as if the only course of
action open to them was to re-allocate the contract. Second, re-tendering the management
of schemes came to be seen as straightforward. Reliance on performance indicators seemed to
encourage an arms-length view of projects which overlooked the build-up of interrelationships,
intangible factors which are well captured within the concept of `social capital'. One of the
four loan funds had changed management, and all relationships with loan-holders were
lost in the process. The high default rate which resulted might be in part an effect of
breaking these ties, and therefore illustrative of the importance of social capital. Third, the
consultant was placed in a difficult position. The evaluation was dependent on the fund
managers to provide data which it was not entirely in their interests to hand over. The
emphasis on the vulnerability of the contract made this dilemma more obvious.
The author attempted to defuse potential con¯ict through improving communication
and understanding between the funders and the scheme managers, using stakeholder
analysis, and the management styles model explained above. Earlier involvement in the
monitoring of the loan schemes would have strengthened this role. The literature provides
several examples of evaluators who are appointed at the beginning of programmes and
provide continuous feedback. This is variously described as `trailing research' (Finne et
al., 1995), `continuous monitoring' (Georghiou, 1995), or `integral programmatic inter-
vention' (Patton, 1996).
That instrumental use did occur in two out of the three exercises is contrary to general-
isations within evaluation literature. However, an explanation for this difference is easily
found. In a report of a discussion between Weiss and Patton, Weiss admits that part of the
reason for the negative findings of much of her work is that it is carried out at a policy
level, where the evaluator has less contact with decision-makers and political factors are
more important (Alkin, 1990: 26). In the three consultancy assignments mentioned, the
client was one of the decision-makers.
A second possible explanation for the high use lies in the researcher's decision to place
accountability evaluation within a developmental context. Chelimsky (1997) shows how
the nature of use can be expected to differ between the three broad types of evaluation
identi®ed earlier. Accountability evaluation tends to have a complex impact, for instance,
knowledge of later evaluation might lead to higher standards during implementation.
Developmental evaluation might aim for relatively close use of research findings, but
can also be of value in helping programme operators question their work. Knowledge
evaluation tends to result in a diffuse application of findings, for example, later
programmes, including programmes outside the immediate field of interest, might be
influenced by the accumulation of evidence. The proposal to increase the developmental
function of evaluation, which came out of the analysis of performance indicators above,
can be expected to strengthen instrumental use. There is no reason to expect this to
generate the kinds of problems listed above because these refer to instrumental use of
accountability evaluation, not instrumental use of developmental evaluation.
A third possible explanation for the high level of use lies in the use of case study
material. Stake (1978) argues for `naturalistic generalisation' around use of case studies
because: "The best substitute for direct experience probably is vicarious experience." Case
studies, it is argued, provide vicarious experience because of their concrete and vivid
nature. "Case studies will often be the preferred method of research because they may
be epistemologically in harmony with the reader's experience and thus to that person a
natural basis for generalisation" (Stake, 1978: 5). This is assumed to lead to greater use.
Notwithstanding this, Shadish et al. (1995: 300) point out there is a difference between
case study formats and case study methods. Case study formats may be a useful way of
presenting findings to encourage use but they do not necessitate case study methods. For
instance, surveys can often generate case study material. Fictional cases can also help to
dramatise findings.
Two conclusions can be derived from the analysis of Use. First, findings should be
presented in different forms and structures. Communication of evaluation material has to
overcome several complexities: the different layers of findings, the different interests of
readers, and the different communication styles of individuals. Using a range of media
provides the best way of reaching everyone. Presenting sections of evaluation findings in
different styles allows different ways into the information. Qualitative material such as
case studies and quotations can provide `vicarious experience'. Presentation should not be
limited to written approaches. Open days can provide a direct way for funders to under-
stand the projects being evaluated.
Second, stakeholders should be involved in the evaluation as far as is possible without
sacrificing the independence of the evaluator. Stakeholders should be included because of
the information that they can provide, and the gain for utilisation in starting dialogue from
the earliest stages of work. The justifications for involving stakeholders include:
• Vicarious experience. Stakeholders can experience the evaluation process. This
increases their ownership over findings, and can be expected to increase the likelihood
of use. This use is not limited by the evaluator's perspective. Readers can form their
own conclusions about the data (Patton, 1996).
• Education. Users can be educated so that they value evaluation. The evaluator can also
take responsibility for giving the client greater appreciation of ethical issues in evalua-
tion and consultancy. This is the approach recommended by Newman and Brown
(1996).
• Perspectives. The definition of the social or economic problem under attack can be
broadened beyond the institutional boundaries of the implementing organisation.
Qualitative information can be used to balance and contextualise quantitative data.
• Managing politics. Having a cross-section of interest groups can to some extent help to
neutralise the influence of any one (Palumbo, 1989).
• Justice. Some evaluators see a moral case for taking account of the values of different
programme participants. House (1978: 94) argues that: "Generally, the evaluator has an
obligation to ask how things look from the viewpoint of the least advantaged and
whether that viewpoint is worth collecting and emphasizing in the evaluation."
• Lead into implementation. Detailed information is imparted to the stakeholders so that
insights are not lost when evaluators withdraw. This prevents the client from being
dependent on the evaluator (Fetterman et al., 1996).
4.6. Practice
Chapter 3 found that evaluation data were difficult to obtain. This section investigates
why this might have been the case, and uses information from business process
re-engineering to consider possible ways of improving data availability.
Good information might be expected to be available given that schemes maintain
ongoing relationships with clients during credit control. The number of clients (recipients)
is also considerably smaller than might be the case for training schemes or other economic
development projects.
That the available information was not always strong seemed to reflect the relationship
between funders and contractors. First, funders place heavy information burdens on the
enterprise agencies. Externalising information demands in this way seemed to have stopped
the agencies from thinking about information from their own perspective. Second,
enterprise agencies were not always comfortable with an analytical style. Maintenance of
databases was either against their culture or given a low priority. Third, it is possible that
agencies were deliberately not keeping data that could be used against them. Fourth, some
of the agencies did not have the skills to establish computerised databases. Fifth, financial
pressures on agencies meant that long-term investment in information systems was given a
low priority.
Even those agencies which had sophisticated information systems were using the data
for credit control and other management functions, rather than to inform decision-making.
This is unfortunate, as a scheme's knowledge of its loan holders can help it understand
the likely behaviour of applicants. Compilation of detailed individual records for all
the clients of Scheme 1 yielded findings that would have been useful to the agency: for
instance, that 90% of money lent to start-ups was outstanding, compared to around 38% of
money lent to established firms, and that debt was concentrated in applicants who had failed
to make a single repayment.
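The kind of compilation described can be sketched as follows. The records and field layout are hypothetical, not the actual Scheme 1 data, although the illustrative figures echo the proportions reported above.

```python
# Sketch of client-level analysis from individual loan records.
# Records are invented for illustration: (client_type, amount_lent,
# amount_outstanding, repayments_made).
loans = [
    ("start-up",    5000, 4500, 0),
    ("start-up",    8000, 7200, 1),
    ("established", 6000, 2300, 8),
    ("established", 4000, 1500, 6),
]

def outstanding_share(records, client_type):
    """Proportion of money lent to a client type that is still outstanding."""
    lent = sum(r[1] for r in records if r[0] == client_type)
    owed = sum(r[2] for r in records if r[0] == client_type)
    return owed / lent if lent else 0.0

def never_repaid_debt(records):
    """Debt concentrated in applicants who made no repayment at all."""
    return sum(r[2] for r in records if r[3] == 0)

print(f"start-ups:   {outstanding_share(loans, 'start-up'):.0%} outstanding")
print(f"established: {outstanding_share(loans, 'established'):.0%} outstanding")
print(f"debt held by non-repayers: {never_repaid_debt(loans)}")
```

A scheme holding its records in this disaggregate form could produce such breakdowns routinely, rather than only when a consultant compiles them.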
Improving the availability of data depends on solving technical problems. Data collection
should be easy and not too time-consuming. Data must be accurate and timely. This is
partly about ensuring the information is valued by the people collecting it. Data must be
transparent and easy to interpret. Definitions and measurement must be consistent across a
time series.
Means to meet these requirements can be derived from business process re-engineering
(Hammer, 1990):
• Data should be entered by those with client knowledge. This is likely to mean
delegation of responsibility for data systems down the organisation.
• Data should only be entered into the system once.
• Data should be entered regularly, preferably soon after the events to which they relate.
• Data should be captured at source.
• Related data systems within the organisation should be linked to avoid ambiguity.
• Data should be entered in a disaggregate form, with the computer calculating
performance indicators for funders according to their different conventions. Disaggregation
allows easy checking of accuracy by auditors.
• The computer should have automatic cross-checking (or field delimiters).
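As a rough sketch of the last principle, automatic cross-checking can be as simple as attaching a validity rule (a `delimiter') to each field and applying the rules at the point of entry. The field names and ranges here are illustrative assumptions, not a specification.

```python
# Hedged sketch of field delimiters: each field carries a validity rule,
# checked automatically when a record is entered. Names and ranges are
# invented for illustration.
RULES = {
    "amount":      lambda v: 0 < v <= 50000,
    "term_months": lambda v: 1 <= v <= 60,
    "sector_code": lambda v: v in {"retail", "manufacturing", "services"},
}

def validate(record):
    """Return the list of fields that fail their delimiter check."""
    return [f for f, ok in RULES.items() if f in record and not ok(record[f])]

good = {"amount": 7500, "term_months": 36, "sector_code": "retail"}
bad  = {"amount": -10,  "term_months": 36, "sector_code": "mining"}
```

Checks of this kind catch entry errors while the person with client knowledge is still at the keyboard, rather than at audit time.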
An illustration of how these principles can be developed is seen in the auditing tool
developed by NOP Research Group Limited (unpublished information). Currently adapted
for private sector performance management, this compiles disaggregated data on a large
number of questions across eight categories of organisational performance. Questions are
answered in terms of the evidence available to confirm them (instead of through simple
`yes' or `no' answers). Questions and weights between questions can be easily changed to
meet the circumstances of each organisation. Re-standardisation of data allows
benchmarking with comparable organisations. The equivalent of this system designed for loan
and grant schemes would have individual information on each applicant, distinguishing
each potential element in each performance indicator, so that schemes could adopt their
own definitions but also recalculate figures on a similar basis in order to allow comparison.
Soft indicators would be easily included in this approach.
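A minimal sketch of this recalculation idea, assuming invented field names rather than the NOP tool's actual design: leverage components are stored separately, and each scheme's convention (the wide, narrow, and medium interpretations found in the case studies) is computed from the same disaggregated records.

```python
# Sketch: disaggregated award records with leverage recalculated under
# different conventions. Field names and figures are illustrative only.
from dataclasses import dataclass

@dataclass
class Award:
    grant: float
    bank_funds: float       # proven additional bank lending
    other_funds: float      # other cash leverage
    help_in_kind: float     # estimated value of non-cash support

CONVENTIONS = {
    # each scheme's definition of leverage, applied to the same raw data
    "narrow": lambda a: a.bank_funds,
    "medium": lambda a: a.bank_funds + a.other_funds,
    "wide":   lambda a: a.bank_funds + a.other_funds + a.help_in_kind,
}

def leverage_ratio(awards, convention):
    """Additional funds attracted per unit of grant/loan money."""
    grant = sum(a.grant for a in awards)
    levered = sum(CONVENTIONS[convention](a) for a in awards)
    return levered / grant if grant else 0.0

awards = [Award(10000, 15000, 5000, 2000), Award(5000, 0, 2500, 1000)]
for name in CONVENTIONS:
    print(name, round(leverage_ratio(awards, name), 2))
```

Because the components are held separately, a scheme can report under its own definition yet still be compared with others on a common basis.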
The overall conclusion is that evaluation should be integrated with management
systems. This was recommended by Likierman (1993), and is also one of the principles of
business process re-engineering. Evaluation tends to rely on the existence of detailed
project information. This is especially the case for programmes of low innovation and
low social conflict, as set out by Radaelli and Dente (1996). Stand-alone evaluation
projects have to compile this information over a relatively limited time. Establishing a
bank of monitoring information allows evaluations to be carried out with limited effort and
delay. The cost of data collection is reduced, and also spread over a longer period of time.
Furthermore, embedding evaluation in the organisation in this way avoids the `friction' of
needing to activate several layers of decision-making to appoint each individual piece of
evaluation. The quality of information should be improved because the data are compiled
soon after the event by a data entry operator who is familiar with the details and motivated
to ensure data quality through their own interest in the results. This process is more likely
to produce a full time-series of results. Compiling data on a disaggregate basis should
increase accuracy and versatility. Disaggregated data allow different versions of perfor-
mance indicators to be calculated automatically by the computer which ensures that each
version of a performance indicator is calculated consistently. Taking information manage-
ment down to the level of those staff who have client contact should increase their status,
and encourage them to be more questioning and responsive. This in itself should help to
raise standards of customer care. Embedding monitoring systems in the organisation
should also increase receptiveness to evaluation conclusions. Weiss (in Shadish et al.,
1995: 193) comments that policy makers most value data that come to them naturally.
Linking evaluation to management systems fulfils this condition. The short lead-in time to
compilation of evaluation reports should also enhance their relevance to decision-making.
Patton (1997: 93) concludes:
Integrating data collection into program implementation would be considered a
problem — a form of treatment contamination — under traditional rules of
research. (…) Making data collection integral rather than separate can reinforce
and strengthen the program intervention. Such an approach can also be cost-effective
and efficient since, when evaluation becomes integral to the program, its costs
aren't an add-on. This enhances the sustainability of evaluation because, when it's
built in rather than added on, it's not viewed as a temporary effort or luxury that can
be easily dispensed with when cuts are necessary.
The second conclusion is that evaluation should run alongside the programme. Evalua-
tors will be able to advise on establishment of monitoring systems at the beginning, and so
ensure that data are available for later impact evaluations. Long-term involvement ensures
that the evaluator has a detailed understanding of each stage of the programme. The cycle
of feelings that staff, participants, and other stakeholders tend to experience in a
programme can be followed, as can the different costs and benefits from setting up and
running the programme. Cook et al. (1981) point out that: "the single study is by definition
an imperfect vehicle for obtaining accurate results". Longitudinal research provides
greater scope for testing factor relationships. Including several iterations of evaluation
also provides deeper and deeper insight into the contextual factors underlying programmes
(Pawson and Tilley, 1997). Lastly, long-term contracts give evaluators greater flexibility to
respond to changing briefs.
4.7. Conclusion
This chapter has examined the findings from Chapter 3 and suggested ways of
improving the evaluation of loan and grant schemes.
• Analysis of the theory underlying loan and grant schemes should be improved. This
would lead to greater understanding of potential sources of leakage from the
programme, and to greater additionality.
• Evaluators should use a range of methods, qualitative as well as quantitative. Including
`softer' performance indicators such as social capital would give a fuller picture of the
skills that agencies employ in managing loan and grant schemes.
• Judgements of value should take into account the trade-offs in managing loan and grant
schemes, and the possible benefit from having agencies which complement each other's
activity.
• Monitoring should be integrated into management systems. This would increase the
agencies' appreciation of evaluation as well as increasing the availability of good
quality longitudinal data, and reducing the cost of evaluation.
• Evaluation should be run alongside the programme. This would ensure that data
systems are properly established from the beginning, and would allow continual
feedback on implementation, thereby improving the practical value of evaluation.
• Evaluation findings should be presented in different media. This should include
participation (funders attending open days) as well as presentation of written reports.
• Evaluation studies should be co-ordinated across local areas. Monitoring should have a
common core of data. This would allow key issues such as effectiveness to be
examined, and would also increase the validity of comparisons between schemes.
• Each of these changes potentially expands evaluation activity. As a counter-balance,
evaluation needs to improve its focus. Shadish et al. (1995: 61) comment that: "Good
theory of practice — more than anything else — is about setting priorities and the
trade-offs that go with doing so." The work already reviewed in Chapter 2 shows ways
of doing this. Pawson and Tilley (1997) argue that focusing can be achieved by being
theory-driven rather than data-driven. Patton (1997) would focus on the questions of
value to key users. Ability to combine both approaches simultaneously will reflect the
familiar dimension of knowledge versus use approaches to evaluation.
CHAPTER 5
Conclusions
"The shift from professional to managerial authority and values was accompanied by
changes in the kind of knowledge exploited in evaluation of services. Government
assumed that evaluation would be summative, delivering authoritative judgements,
based as far as possible on performance indicators or quantitative measures of
input–output relationships and outcomes and set against predefined targets and
standards. The underlying theory of knowledge was positivist; it assumed that social
phenomena could be divorced from their context and that objective knowledge about
them could be achieved through empirical observation and quantitatively expressed;
that facts were distinct from values and means from ends; that concepts and methods
of `good management' were applicable to the pursuit of any values. Those assumptions
were at variance with the dominant theory and practice of evaluation" (Henkel,
1991: 19–20).
The Financial Management Initiative (FMI) introduced by the Thatcher Government in
1982 increased the use and the importance of evaluation and performance management.
Measures of inputs, throughputs or outputs, known as performance indicators, have been
employed as the main way of controlling decentralised or contract organisations, and with
time have replaced other forms of communication between the parties. Performance
indicators have been employed to increase customer focus, and with time have come to
symbolise the search for value for money and reduction in public spending. These are high
expectations which evaluation is wholly unable to meet. The subject still faces conceptual,
organisational, and practical problems in its application.
Research from four case studies illustrates the evaluation problems of public sector
small business loan and grant schemes. Performance indicators are subjective, partial, and
inconsistent between cases. Their effects are to reduce co-operation and mutual esteem
between schemes, to distance agencies from their own information needs, and to alienate
small business clients. Readings are inconclusive and contentious between stakeholders.
These are not merely technical problems. At their heart lie the difficulties of attempting to
quantify complex and variable economic phenomena, of extracting enterprise agency
operation from the many external factors which impinge on their performance, and of
simplifying and rigidifying the very dynamics of performance itself. Above all, these
problems reflect the centrality of
values and ideology in the organisation and operation of public sector programmes and in
their evaluation. The positivist assumptions, which are partly responsible for these
problems, can protect performance management from criticism through the claim of
objectivity and technical skill. They cannot provide solutions to practical problems of
performance management.
The problems of using performance indicators should not come as a surprise. The
United States has an extensive and long-standing literature on evaluation. Although still
suffering from fundamental differences of opinion, American evaluation has nonetheless
reached a position of understanding on the many problems facing evaluators. Ten
hypotheses on evaluation problems were derived from the literature, and then tested on the
four small business loan and grant schemes. These hypotheses re¯ect the pessimistic view
coming out of the literature that evaluation is rarely well specified, and seldom produces
findings which are conclusive or welcome. Eight hypotheses were supported. Two were
rejected, mainly because of the different experience of implementing evaluation at a
project rather than a policy level. This work points to the need for more comprehensive
research to examine the interpretation and use of performance indicators across small
business loan and grant schemes, and in other fields of economic development.
The literature suggests a number of ways to improve evaluation, many of which are to
do with the quality of the data and the rigour of analysis on which any programme
assessment must be based. Some of these lie in the hands of the evaluators themselves.
Evaluation can be integrated into management systems, co-ordinated across local areas,
run alongside the programme, and made easier to absorb. However, evaluators should not
make promises they cannot keep. They can give users and stakeholders a larger role in
research, but they cannot themselves ensure that these people's values will be reflected in
judgements. They can ask those being evaluated to open up to criticism in order to develop
and learn. But they cannot promise that negative conclusions will not be used against those
organisations which have co-operated. They can make recommendations relevant, clear,
and precise, but they cannot promise that resources will be found to implement them.
Many evaluators are now calling for evaluation to move from an accountability purpose
towards a developmental or knowledge role. This project supports such a change.
However, it also emphasises the deep-seated transformation which will be needed for a
developmental style to be created.
CHAPTER 6
Bibliography
Abma, T. A. (1997) Playing with/in plurality: revitalizing realities and relationships in Rotterdam. Evaluation 3(1), 25–48.
ACOST (1990) The Enterprise Challenge: Overcoming Barriers to Growth in Small Firms. HMSO, London.
Adelman, C. (1996) Anything goes: evaluation and relativism. Evaluation 2(3), 291–305.
Agarwala-Rogers, R. (1977) Why is evaluation research not utilized? Evaluation Studies Review Annual 2, 327–33.
Awasthi, D. N. and Jose, S. (1996) Evaluation of Entrepreneurship Development Programmes. Sage, London.
Barber, J., Metcalfe, J. S., Porteous (eds.) (1989) Barriers to Growth in Small Firms. Routledge.
Beeton, D. (1988) Performance Management: Getting the Concepts Right.
Boruch, R. F. (1976) On common contentions about randomized field experiments. Evaluation Studies Review Annual 1, 158–194.
Broadbent, J., Dietrich, M. and Laughlin, R. (1996) The development of principal-agent, contracting and accountability relationships in the public sector: conceptual and cultural problems. Critical Perspectives on Accounting 17(3), 259–84.
Bryk, A. S. (ed.) (1983) Stakeholder-based evaluation. New Directions for Program Evaluation, No. 17. Jossey-Bass, San Francisco, CA.
Buisseret, T., Cameron, H. M. and Georghiou, L. (1995) What difference does it make? Additionality in the support of large firms. International Journal of Technology Management 10(4–6), 587–600.
Bush, M. and Gordon, A. C. (1978) The advantages of client involvement in evaluation research. Evaluation Studies Review Annual 3, 767–783.
Bussman, W. (1996) Democracy and evaluation's contribution to negotiation, empowerment and information: some findings from Swiss democratic experience. Evaluation 2(3), 307–319.
Calsyn, R. J. and Klinkenberg, W. D. (1995) Response bias in needs assessment studies. Evaluation Review 19(2), 217–225.
Cameron, G. C. (1990) First steps in urban policy evaluation in the United Kingdom. Urban Studies 27(4), 475–495.
Centre for Business Research (1996) The Changing Status of British Enterprise.
Challis, D. (1996) Performance indicators for community-based social care: from theory to practice. Care Plan 12(4), 19–24.
Ciarlo, J. A. (ed.) Utilizing Evaluation: Concepts and Measuring Techniques. Sage, Beverly Hills, CA.
Cook, T. D. (1978) Utilization, knowledge-building, and institutionalization: three criteria by which evaluation research can be evaluated. Evaluation Studies Review Annual 3, 13–22.
Cracknell, B. E. (1996) Evaluating development aid: strengths and weaknesses. Evaluation 2(1), 23–33.
Cranfield European Enterprise Centre (1993) Special Report No 8. Financial Characteristics of Small Companies in Britain.
Cranfield European Enterprise Centre (1993) The European Enterprise Index. Survey 6.
Cronbach, L. J. and Associates (1981) Our ninety-five theses. Evaluation Studies Review Annual 6, 27–37.
Das, T. H. (1983) Qualitative research in organisational behaviour. Journal of Management Studies 20(3), 301–314.
Department of Trade and Industry (1991) Constraints on the Growth of Small Firms. HMSO, London.
Drewitt, A. (1997) Evaluation and consultation: learning the lessons of user involvement. Evaluation 3(2), 189–204.
Dummond, E. J. (1994) Making best use of performance measures and information. International Journal of Operations and Production Management 14(9), 16–31.
Dunn, W. N. (1982) Reforms as arguments. Evaluation Studies Review Annual 7, 83–116.
Duran, P., Monnier, E. and Smith, A. (1995) Evaluation à la française: towards a new relationship between social science and public action. Evaluation 1(1), 45–63.
Everitt, A. and Hardiker, P. (1996) Evaluating for Good Practice. Macmillan, Basingstoke.
Everitt, A. (1996) Developing critical evaluation. Evaluation 2(2), 173–188.
Farrington, D. P. (1997) Evaluating a community crime prevention program. Evaluation 3(2), 157–173.
Fetterman, D. M., Kaftarian, S. J. and Wandersman, A. (eds.) (1996) Empowerment Evaluation: Knowledge and Tools for Self-Assessment and Accountability. Sage, London.
Foley, P. (1992) Local economic policy and job creation: a review of evaluation studies. Urban Studies 29(3/4), 557–98.
Garaway, G. (1996) The case-study model: an organisational strategy for cross-cultural evaluation. Evaluation 2(2), 201–211.
Green, J. C. (1996) Qualitative evaluation and scientific citizenship: reflections and refractions. Evaluation 2(3), 277–289.
Gregory, D. G. and Martin, S. J. (1988) Issues in the evaluation of inner city programmes. Local Economy 4(2), 237–249.
Hall, D. (1995) Performance Measurement Under Scrutiny. University of Birmingham, Social Services Research.
Hambleton, R. and Thomas, H. (1995) Urban Policy Evaluation. Paul Chapman, London.
HM Government (1979) The Financing of Small Firms. The Report of the Committee to Review the Functioning of Financial Institutions (Wilson Report), Cmnd. 7503. HMSO, London.
Kaplan, R. S. and Norton, D. P. (1992) The balanced scorecard – measures that drive performance. Harvard Business Review January–February, 71–79.
Kaufman, C. C. (1995) Evaluation innovations for environments of systemic social change. Evaluation 1(2), 155–169.
Kazi, M. A. F. (1996) Single-case evaluation in the public sector using a combination of approaches. Evaluation 2(1), 85–97.
Kimmel, A. J. (1988) Ethics and Values in Applied Social Research. Sage, Newbury Park, CA.
Kirkhart, K. E. (1995) Seeking multicultural validity: a postcard from the road. Evaluation Practice 16(1), 1–12.
Knox, C. (1995) Concept mapping in policy evaluation: a research review of community relations in Northern Ireland. Evaluation 1(1), 65–79.
Laughlin, R. and Broadbent, J. (1996) Redesigning fourth generation evaluation: an evaluation model for the public-sector reforms in the UK? Evaluation 2(4), 431–451.
Leviton, L. and Hughes, E. F. X. (1981) Research on the utilization of evaluations: a review and synthesis. Evaluation Review 5(4), 525–48.
Light, R. J. and Smith, P. V. (1977) Accumulating evidence: procedures for resolving contradictions among different research studies. Evaluation Studies Review Annual 2, 195–238.
Lincoln, Y. S. (1994) Tracks towards a postmodern politics of evaluation. Evaluation Practice 15(3), 299–309.
Mawhood, C. (1997) Performance measurement in the United Kingdom (1985–1995). In Evaluation for the 21st Century, E. Chelimsky and W. R. Shadish (eds.). Sage, London.
Midgley, G. (1996) Evaluating services for people with disabilities: a critical systems perspective. Evaluation 2(1), 67–84.
Midwinter, A. (1994) Developing performance indicators for local government: the Scottish experience. Public Money and Management 14(2), 37–43.
Mischon de Reya/Tilly, B. (1992) The Funding Requirements of Private Companies. Mischon de Reya, London.
Morgan, G. and Smircich, L. (1980) The case for qualitative research. Academy of Management Review 5(4), 491–500.
Morrisey, O. (1995) Shifting paradigms: discourse analysis as an evaluation approach for technology assessment. Evaluation 1(2), 189–216.
ODA (1984) The Evaluation of Aid Projects and Programmes. HMSO, London.
Owen, J. M. (1995) Roles for evaluation in learning organisations. Evaluation 1(2), 189–216.
Pollitt, C. (1988) Bringing consumers into performance measurement: concepts, consequences and constraints. Policy and Politics 16, 77–87.
Pollitt, C. (1995) Justification by works or by faith? Evaluating the new public management. Evaluation 1(2), 133–154.
Rein, M. and White, S. H. (1978) Can policy research help policy? Evaluation Studies Review Annual 3, 24–41.
Robinson, F. and Wren, C. (1987) Evaluating the impact and effectiveness of financial assistance policies in the Newcastle metropolitan region. Local Government Studies 13, 49–61.
Robinson, S. (1996) Evaluating the progress of clinical audit: a research and development project. Evaluation 2(4), 373–392.
Robson, D., Bradford, M., Deas, I., Hall, E., Harrison, E., Parkinson, M., Evans, R., Garside, P., Harding, A. and Robinson, F. (1994) Assessing the Impact of Urban Policy. HMSO, London.
Rogerson, P. (1995) Performance measurement and policing: police service or law enforcement agency? Public Money and Management 15(4), 25–30.
Sanders, J. R. (1994) The Program Evaluation Standards. Sage, London.
Schmenner, R. W. and Vollman, T. E. (1994) Performance measures: gaps, false alarms and the usual suspects. International Journal of Operations and Production Management 14(12), 58–69.
Schwandt, T. A. (1997) Evaluation as practical hermeneutics. Evaluation 3(1), 69–83.
Smircich, L. and Stubbart, C. (1985) Strategic management in an enacted world. Academy of Management Review 10(4), 734–36.
Storey, D. (1994) Understanding the Small Business Sector. Routledge, London.
Stufflebeam, D. L. and Webster, W. J. (1981) An analysis of alternative approaches to evaluation. Evaluation Studies Review Annual 1, 70–85.
Tilley, N. (1996) Demonstration, exemplification, duplication and replication in evaluation research. Evaluation 2(1), 35–50.
Van der Eyken, W., Goulden, D. and Crossley, M. (1995) Evaluating educational reform in a small state: a case study of Belize, Central America. Evaluation 1(1), 33–44.
Vaux, A., Stockdale, M. S. and Schwerin, M. J. (eds.) (1992) Independent Consulting for Evaluators. Sage, London.
Walker, R. (1994) Putting performance measurement into context: classifying social housing organisations. Policy and Politics 22(3), 191–202.
Wildavsky, A. (1978) The self-evaluating organization. Evaluation Studies Review Annual 3, 82–93.
Yin, R. K. (1989) Case Study Research. Sage, Newbury Park, CA.
Yin, R. K. (1982) The case study crisis. Evaluation Studies Review Annual 7, 167–174.
References
Alkin, M.C., 1990. Debates on Evaluation. Sage, London.
Argyris, C., Schön, D.A., 1978. Organizational Learning: A Theory of Action Perspective. Addison-Wesley, Reading, MA.
Audit Commission, 1989. Managing Services Effectively – Performance Review. HMSO, London.
Audit Commission, 1992. Citizen's Charter Performance Indicators. HMSO, London.
Ball, R., Monaghan, C., 1996. Performance review: the British experience. Local Government Studies 22 (1), 40–58.
Bank of England, 1994. Finance for Small Firms. A Note by the Bank of England. Bank of England, London.
Berk, R.A., Rossi, P.H., 1976. Doing good or worse: evaluation research politically re-examined. Social Problems 23 (3), 337–349.
Berk, R.A., Rossi, P.H., 1990. Thinking About Program Evaluation. Sage, London.
Birch, D.L., 1979. The Job Generation Process. MIT Program on Neighborhood and Regional Change, March.
Birley, S., 1985. The role of networks in the entrepreneurial process. Journal of Business Venturing 1, 107–117.
Birley, S., Niktari, N., 1995. The Failure of Owner-Managed Businesses: The Diagnosis of Accountants and Bankers. Stoy Hayward, London.
Brickell, H.M., 1978. The influence of external political factors on the role and methodology of evaluation. Evaluation Studies Review Annual 3, 94–101.
Bulder, B., Leeuw, F., Flap, H., 1996. Networks and evaluating public-sector reforms. Evaluation 2 (3), 261–276.
Campbell, D.T., 1979. Assessing the impact of planned social change. Evaluation and Program Planning 2, 67–90.
Campbell, D.T., Stanley, J.C., 1963. Experimental and Quasi-Experimental Designs for Research. Rand McNally, Chicago.
Campbell, D.T., Boruch, R.F., 1975. Making the case for randomised assignment to treatments by considering the alternatives: six ways in which quasi-experimental evaluations in compensatory education tend to underestimate effects. In: Bennett, C.A., Lumsdaine, A.A. (Eds.), Evaluation and Experiments: Some Critical Issues in Assessing Social Programmes. Academic Press, New York.
Carter, N., 1989. Performance indicators: backseat driving or hands off control? Policy and Politics 17 (2), 131–138.
Carter, N., Klein, R., Day, P., 1992. How Organisations Measure Success: The Use of Performance Indicators in Government. Routledge, London.
Cave, M., Kogan, M., Smith, R., 1990. Output and Performance Measures in Government: The State of the Art. Jessica Kingsley Publishers, London.
CBI, 1993. Finance for Growth: Meeting the Financing Needs of Small and Medium Enterprises. CBI, London.
Chelimsky, E., 1987. What have we learned about the politics of program evaluation? Evaluation Practice 8 (2), 5–21.
Chelimsky, E., 1997. Thoughts for a new evaluation society. Evaluation 3 (1), 97–118.
Chen, H-T., 1990. Theory-Driven Evaluations. Sage, London.
Chen, H-T., Rossi, P.H., 1981. The multi-goal, theory-driven approach to evaluation: a model linking basic and applied social science. Evaluation Studies Review Annual 6, 38–53.
Choudhary, A., Tandon, R., 1988. Participatory Evaluation. Society for Participatory Research in Asia, New Delhi, India.
Cook, T.D., Levinson-Rose, J., Pollard, W.P., 1981. The misutilization of evaluation research: some pitfalls of definition. Evaluation Studies Review Annual 6, 727–748.
Cranfield European Enterprise Centre, 1993. Special Report No. 5. Attitudes of Smaller Firms Towards Financing and Financial Institutions in Europe. Cranfield University, Cranfield.
Cronbach, L.J., 1982. Designing Evaluations of Educational and Social Programs. Jossey-Bass, San Francisco.
Etzioni, A., 1960. Two approaches to organisational analysis: a critique and a suggestion. Administrative Science Quarterly 5, 257–258.
Fetterman, D.M., Kaftarian, S.J., Wandersman, A. (Eds.), 1996. Empowerment Evaluation: Knowledge and Tools for Self Assessment and Accountability. Sage, London.
Finne, H., Levin, M., Nilssen, T., 1995. Trailing research: a model for useful program evaluation. Evaluation 1 (1), 11–31.
Ford, D. (Ed.), 1990. Understanding Business Markets: Interaction, Relationships, Networks. Academic Press, London.
Georghiou, L., 1995. Assessing the framework programmes – a meta-evaluation. Evaluation 1 (2), 171–188.
Ghaie, S., Birley, S., 1993. Networking by the Indian business community in Northern Ireland. The Journal of Entrepreneurship 2 (2), 209–234.
Ghobadian, A., Ashworth, J., 1994. Performance measurement in local government – concept and practice. International Journal of Operations and Production Management 14 (5), 35–50.
Glynn, J.J., Murphy, M.P., 1996. Public management: failing accountabilities and failing performance review. International Journal of Public Sector Management 9 (5/6), 125–137.
Gray, A., 1997. Contract culture and target fetishism: the distortive effects of output measures on local regeneration programmes. Local Economy 11 (4), 343–357.
Guba, E.G., Lincoln, Y.S., 1989. Fourth Generation Evaluation. Sage, London.
Hammer, M., 1990. Reengineering work: don't automate, obliterate. Harvard Business Review July–August, 104–112.
Harmon, M.M., Mayer, R.T., 1986. Organisational Theory for Public Administration. Little Brown, Boston, MA.
Haug, P., 1996. Evaluation of government reforms. Evaluation 2 (4), 417–430.
Hedrick, T.E., 1988. The interaction of politics and evaluation. Evaluation Practice 9 (3), 5–28.
Henkel, M., 1991. Government, Evaluation and Change. Jessica Kingsley, London.
Henry, G.T., 1992. Using graphical displays to empower evaluation audiences. In: Vaux, A., Stockdale, M.S., Schwerin, M.J. (Eds.), Independent Consulting for Evaluators. Sage, London.
Hill, T., 1995. Manufacturing Strategy. Macmillan, London.
HM Treasury, 1988. Policy Evaluation: A Guide for Managers. HMSO, London.
Hogwood, B.W., Gunn, L.A., 1984. Policy Analysis for the Real World. Oxford University Press, Oxford.
House, E.R., 1978. Justice in evaluation. Evaluation Studies Review Annual 3, 75–99.
House, E.R., 1980. Evaluating with Validity. Sage, London.
House, E.R., 1993. Professional Evaluation: Social Impact and Political Consequences. Sage, London.
Hughes, A., Storey, D.J. (Eds.), 1994. Finance and the Small Firm. Routledge, London.
Inayatullah, J., Birley, S., 1996. The Orangi Pilot Project: The Evaluation of a Micro-Enterprise Credit Institution. Discussion Paper, Imperial College, London.
Jackson, A., 1996. Foyers: The Step in the Right Direction. Foyer Federation, London.
Jackson, P., 1988. The management of performance in the public sector. Public Money and Management 8 (4).
Jackson, P.M., 1993. Public service performance evaluation: a strategic perspective. Public Money and Management, 9–14.
Johnson, H.T., Kaplan, R.S., 1991. Relevance Lost: The Rise and Fall of Management Accounting. Harvard Business School Press, Boston.
Joint Committee on Standards for Educational Evaluation, 1994. The Program Evaluation Standards: How to Assess Evaluations of Educational Programs. 2nd ed. Sage, London.
JURUE, 1986. Assessment of Industrial and Commercial Improvement Areas. HMSO, London.
Karlsson, O., 1996. A critical dialogue in evaluation: how can the interaction between evaluation and politics be tackled? Evaluation 2 (4), 405–416.
Kotler, P., 1994. Principles of Marketing. Prentice-Hall, Englewood Cliffs, NJ.
Kushner, S., 1996. The limits of constructivism in evaluation. Evaluation 2 (2), 189–200.
Likierman, A., 1993. Performance indicators: 20 early lessons from managerial use. Public Money and Management 13 (4), 15–22.
Love, J.A., 1991. Internal Evaluation: Building Organisations from Within. Sage, London.
MacDonald, B., 1976. A political classification of evaluation studies. In: Hamilton, D. (Ed.), Beyond the Numbers Game. Macmillan, London.
March, J., Simon, H., 1958. Organisations. Wiley, London.
McEldowney, J.J., 1997. Policy evaluation and the concepts of deadweight and additionality: a commentary. Evaluation 3 (2), 175–188.
Meekings, A., 1995. Unlocking the potential of performance measurement: a practical implementation guide. Public Money and Management 15 (4), 5–12.
Midland Bank, 1992. The Changing Financial Requirements of Smaller Companies. Midland Bank Business Economics Unit, London.
Mishan, E.J., 1971. Cost-Benefit Analysis: An Informal Introduction. Allen and Unwin, London.
Murphy, K.R., Cleveland, J.N., 1995. Understanding Performance Appraisal: Social, Organizational and Goal-Based Perspectives. Sage, London.
Newman, D.L., Brown, R.D., 1996. Applied Ethics for Program Evaluation. Sage, London.
Newton, T., Findlay, P., 1996. Playing god? The performance of appraisal. Human Resource Management Journal 6 (3), 42–57.
OECD, 1988. Measures to Assist the Long-Term Unemployed: Recent Experience in Some OECD Countries. OECD, Paris.
Ostgaard, T.E., Birley, S., 1994. Personal networks and firm competitive strategy – a strategic or coincidental match? Journal of Business Venturing 9, 281–305.
P.A. Cambridge Economic Consultants, 1987. An Evaluation of the Enterprise Zone Experiment. HMSO, DOE, London.
Palumbo, D.J. (Ed.), 1989. The Politics of Program Evaluation. Sage, London.
Patton, M.Q., 1990. Qualitative Evaluation and Research Methods. Sage, London.
Patton, M.Q., 1996. Utilization-Focused Evaluation. 3rd ed. Sage, London.
Pawson, R., 1996. Three steps to constructivist heaven. Evaluation 2 (2), 213–219.
Pawson, R., Tilley, N., 1997. Realistic Evaluation. Sage, London.
Pearce, G., Martin, S., 1996. The measurement of additionality: grasping the slippery eel. Local Government Studies 22 (1), 78–92.
Pedler, M., Burgoyne, B., Boydell, T., 1991. The Learning Company: A Strategy for Sustainable Development. McGraw Hill, Maidenhead.
Power, M., 1994. The Audit Explosion. Demos, London.
Radaelli, C.M., Dente, B., 1996. Evaluation strategies and analysis of the policy process. Evaluation 2 (1), 51–66.
Rebien, C.C., 1996. Participatory evaluation of development assistance: dealing with power and facilitative learning. Evaluation 2 (2), 151–171.
Richardson, R., Kuipers, H., Soeters, J.L., 1996. Evaluation of organisational change in the Dutch armed forces. Evaluation 2 (1), 7–22.
Rogers, S., 1990. Performance Management in Local Government.
Rossi, P.H., 1985. The iron law of evaluation and other metallic rules. Paper presented at State University of New York, Albany, Rockefeller College.
Rossi, P.H., Freeman, H.E., 1985. Evaluation: A Systematic Approach. 3rd ed. Sage, London.
Rossi, P.H., Freeman, H.E., 1993. Evaluation: A Systematic Approach. 5th ed. Sage, London.
Scriven, M., 1976. Evaluation bias and its control. Evaluation Studies Review Annual 1, 119–139.
Scriven, M., 1991. Evaluation Thesaurus. 4th ed. Sage, London.
Scriven, M., 1994. Product evaluation – the state of the art. Evaluation Practice 15 (1), 45–62.
Scriven, M., 1996. The theory behind practical evaluation. Evaluation 2 (4), 393–404.
Senge, P., 1990. The Fifth Discipline: The Art and Practice of the Learning Organisation. Doubleday/Currency, New York.
Shadish, W.J., Cook, T.D., Leviton, L.C., 1995. Foundations of Program Evaluation: Theories of Practice. Sage, London.
Smith, R.S.G., Walker, R.M., 1994. The role of performance indicators in housing management: a critique. Environment and Planning A 26, 609–621.
Spence, B., 1996. Visualisation Really Has Nothing To Do With Computers. Information Engineering Section Report 96/2. Imperial College, London.
Stake, R.E. (Ed.), 1975. Evaluating the Arts in Education: A Responsive Approach. Merrill, Columbus, OH.
Stake, R.E., 1978. The case study method in social inquiry. Educational Researcher 7, 5–8.
Stake, R.E., 1981. Case study methodology: an epistemological advocacy. In: Welch, W. (Ed.), Case Study Methodology in Educational Evaluation. Minnesota Research and Evaluation Center, Minneapolis.
Stake, R.E., 1995. The Art of Case Study Research. Sage, London.
Stanworth, J., Gray, C. (Eds.), 1991. Bolton 20 Years On: The Small Firm in the 1990s. Small Business Research Trust, London.
Stewart, J., Walsh, K., 1994. Performance measurement: when performance can never be finally defined. Public Money and Management 14 (2), 43–49.
Storey, D.J., 1990. Evaluation of policies and measures to create local employment. Urban Studies 26, 587–606.
Storey, D.J., 1993. Should We Abandon the Support to Start-Up Business? Working Paper No. 11. Warwick Business School Small and Medium Enterprise Centre.
Strand, S., 1997. Key performance indicators for primary school improvement. Education Management and Administration 25 (2), 145–153.
Torres, R.T., Preskill, H.S., Piontek, M.S., 1996. Evaluation Strategies for Communication and Reporting: Enhanced Learning in Organisations. Sage, London.
Townley, B., 1994. Reframing Human Resource Management: Power, Ethics and the Subject at Work. Sage, London.
TUC, 1997. The Small Firms Myths. TUC, London.
Van de Knaap, P., 1995. Policy evaluation and learning: feedback, enlightenment or argumentation? Evaluation 1 (2), 189–216.
VanderPlaat, M., 1995. Beyond technique: issues in evaluating for empowerment. Evaluation 1 (1), 81–96.
Wallace, W.A., 1980. The Economic Role of the Audit in Free and Regulated Markets. University of Rochester, New York.
Weiss, C.H., 1973. The politics of impact measurement. Policy Studies Journal 1, 179–183.
Weiss, C.H., 1977. Research for policy's sake: the enlightenment function of social research. Policy Analysis 3, 531–545.
Weiss, C.H., 1980. Knowledge creep and decision accretion. Knowledge: Creation, Diffusion, Utilisation 1, 381–404.
Weiss, C.H., 1983. The stakeholders' approach to evaluation: origin and promise. In: Bryk, A.S. (Ed.), Stakeholder-based Evaluation. New Directions for Program Evaluation, No. 17. Jossey-Bass, San Francisco, CA.
Weiss, C.H., 1988. Evaluation for decisions: is anyone there? Does anyone care? Evaluation Practice 9 (1), 5–19.
Wholey, J.S., 1981. Using evaluation to improve program performance. Evaluation Studies Review Annual 6, 55–69.
Zuboff, S., 1988. In the Age of the Smart Machine: The Future of Work and Power. Basic Books, New York.