
Malaysian Evaluation Society (MES) 4th International Evaluation Conference, 2010

Kuala Lumpur: 29 March - 2 April, 2010

Turning data into useful information for program / policy improvements


CONTENTS

1.0 INTRODUCTION
2.0 THE HIERARCHY OF KNOWLEDGE
2.1 Conventional View from Data to Wisdom
2.2 Which Comes First the Chicken or the Egg?
3.0 KEY ISSUES IN THE MANAGEMENT OF DATA
3.1 Designing for data measurement, collection and analysis
3.2 What are the Evaluation Questions
3.3 Understanding Data & Boundaries of Statistical Analysis: Scales of Measurement
3.4 Sources of the data
3.5 Issues in data sampling
3.6 Methods of data collection
3.7 The fundamental issue of cause and effect
3.8 Data Analysis Methods to be Applied - Choosing the Appropriate Statistical Tests
3.9 The Evaluation Design
4.0 COMMUNICATING THE RESULTS
REFERENCES


    Turning data into useful information for program / policy improvements

    Keith T Linard

    Abstract

Effective evaluation requires not only that we address the key issues and collect relevant and reliable data. It also requires that we analyse that data correctly and report the analysis and interpretation such that the decision makers can draw valid conclusions. In other words, we must turn the data into useful information.

All too often little thought is given to data analysis and information presentation until data collection is well under way. In turn, the data collection is often decided in the absence of a clear understanding of the key evaluation questions. Consequently, when we come to analyse the data we frequently find that:

    much (costly) irrelevant data has been collected;

    essential data items have not been collected;

    inadequate data quality control has been applied; and as a result

    the information needs of the decision makers cannot be met.

This paper proposes that data analysis planning should be an integral part of program design so that the right data is collected in the right way to answer the right evaluation questions.

Anticipating how the evaluation findings will be used forces evaluators to think carefully about the presentations that will address the evaluation questions in a clear and logical manner. This helps identify the graphical presentations and tables through which the monitoring or evaluation findings will be presented. In turn, this determines the way the data will be analysed, the nature of the data and the way it is collected.


    1.0 INTRODUCTION

In 1986 the Australian Government mandated that all new policy proposals to Cabinet identify a performance framework to permit the future evaluation of program effectiveness. In 1993, the US Congress passed the Government Performance and Results Act (GPRA) to improve Government stewardship by linking resources and management decisions with program performance. Similar internal or legal provisions have since been made by Governments around the world. Indeed, Malaysia has been a leader in applying a rigorous and structured approach to program evaluation since the late-1990s, due in no small measure to the dedication and expertise of the founding members of the Malaysian Evaluation Society.

I note in particular ProLL (Program Logic and Linkages Model), the unique program and evaluation planning model developed by Dr. Arunaselam Rasappan in the early 1990s for use in the Malaysian public sector, and subsequently refined over the intervening years by Dr Rasappan and Dr Jerry Winston.

Since the early steps in formalizing program evaluation, there have been major advances in methodology. Thus MfDR (Managing for Development Results) has been very successful in shifting the focus from inputs (which was the norm in both the developed and developing world up to the late 1980s and beyond) to measurable results at all phases of the development process.

Over the same period, through the medium of the internet, ready access to expert resources has exploded.

Unfortunately, as methodology has improved and the literature has expanded, so has the confusion of terms and definitions. This is understandable because, as evaluation has taken centre stage, there has been an entry of diverse professions (accountants, engineers, psychologists, agricultural scientists, sociologists etc) into the evaluation sphere. Evaluation is a multi-facetted concept which has different nuances in different professions, and the different professions typically have different emphases at the different stages of a program cycle.

Overlaying these problems is the fact that both practice and the internet resources tend to neglect the field of data analysis and reporting. It seems to be presumed that, the qualitative decisions on evaluation process having been decided, data analysis, synthesis and interpretation will automatically occur.

    Accordingly, this paper focuses on the data aspects of measurement:

    the nature of the evaluation questions & the associated analytical tools

    understanding data & the boundaries of statistical analysis

the sources of the data

issues in sampling

    methods of collection

    addressing cause and effect

    data analysis methods to be applied.

Before I address this, however, I wish first to introduce the evaluation confraternity to a


    patterns, knowledge is transformed into wisdom.

Data are typically regarded as a set of discrete objective facts about an event or a process which have little meaning by themselves, although they are the raw material from which meaning may be derived. Data, for example, are numerical quantities or other attributes derived from observation, experiment, or calculation.

Information is typically defined as data which has undergone some kind of organisation to endow it with meaning in relation to a defined purpose. Information concerning a particular object, event, or process is the end result of correcting (validating and verifying), collating, contextualising, analysing, categorising and re-presenting the base data.

Knowledge is meaning, derived from information, which has been understood and internalised by persons such that they might put it to use. Similarly, organizational or social knowledge (loosely, organization culture) exists when it is accepted by a consensus of a group of people. Knowledge represents a state or potential for action and decisions in a person, organization or a group.

Wisdom is typically defined as the ability to identify the underlying principles and patterns and make correct judgments on the basis of previous knowledge, experience and insight. Within an organization, intellectual capital or organizational wisdom is the application of collective knowledge.

This conventional view sees the construction of knowledge as somewhat similar to using letters as atoms for building words that are subsequently combined into meaningful sentences. The sentences then combine to form a book of profound wisdom. The symbolic curve in Figure 1 is intended to make the point that the value of the various forms of data-information-knowledge increases through contextual understanding. In this perspective, the raw material of data is mined and then increasingly refined to become knowledge and then wisdom.


pre-defined data structure that completely defines its meaning. Instead of data being raw material for information which can be mined, data instead is the result of using wisdom and knowledge to add value to information by putting it into a form that can be processed.

This reversal of the DIKW hierarchy, which is depicted in Figure 2, has important implications for evaluation professionals. It changes the focus from data "out there" to be collected and analysed, to the socio-cognitive aspects of collective meaning processing. In other words, it is the organizational process of articulating the evaluation problem and agreeing the evaluation questions which ultimately defines and gives meaning to the process of measurement, collation, examination and synthesis of data.

    Figure 2: A Richer Picture of the Data Analysis Problem

This reversal of the DIKW hierarchy is implicit in the ProLL methodology, but appears to be given less than its proper attention in practice. The fundamental step in good evaluation is the definition of the evaluation questions. If we are not asking the right questions, we can never get the right answer.

    The balance of this paper presumes that we have indeed identified the right questions.

    3.0 KEY ISSUES IN THE MANAGEMENT OF DATA

    3.1 Designing for data measurement, collection and analysis

The issue of measurement is appropriately considered within the overall context of evaluation design. Developing a valid evaluation methodology is a crucial step in evaluation design. The evaluation methodology concerns how we can answer the evaluation questions with


Whilst this paper focuses specifically on measurement of outputs and outcomes, it must be remembered that there may be many common data elements relevant at varying stages of the program cycle. Also, the data may well be maintained in common databases. Table 1 summarises some important characteristics of different types of evaluation, including the purpose of each evaluation type (which in turn is indicative of the type of evaluation question to be addressed), the typical timing of data collections, the type of data used and the key evaluators.

    Table 1: Different types of Evaluation & their Characteristics

    3.2 What are the Evaluation Questions

In every evaluation there will be a trade-off between the theoretical purity of the methodology and resource and timing constraints. Corporate management agreement should be sought if such constraints are likely to endanger the evaluation's credibility or value.

The evaluation design is, of course, dependent on the nature of the evaluation questions being asked. Table 2 groups some common types of analytical tools according to the type of archetypal evaluation questions for which they are most relevant.

Based on the evaluation objectives, the evaluation design should have specified the hypotheses to test or the specific questions to answer. The nature of these questions or hypotheses will suggest which statistical techniques are relevant.


prove causality. Different approaches to addressing causality are discussed later in this paper.

EVALUATION QUESTION | METHOD OF ANALYSIS | ANALYTICAL TOOLS

What is the difference between what is and what should be? | Gap analysis | Statistical inference: analysis of variance, hypothesis testing, multi-dimensional scaling

What priority should options have? | Scaling methods | Rating scales, rankings, nominal group techniques

What is the best / most efficient / most effective . . . ? | Optimisation analysis | Operations research tools, system dynamics modelling

To what extent was the program responsible for observed changes? | Cause and effect analysis | Systems analysis, logic analysis, simulation modelling

What are the common patterns? | Classification methods | Statistical inference: cluster analysis, discriminant analysis, factor analysis

What will happen if . . . ? | Trend analysis | Simulation modelling, statistical inference

    Table 2: Common Evaluation Questions & Related Analytical Approaches

3.3 Understanding Data & Boundaries of Statistical Analysis: Scales of Measurement

In the broadest sense, measurement is the assignment of numerals to objects or events according to rules. In evaluation four different measurement rules are common, which gives rise to four distinct classes or scales of measurement: Nominal, Ordinal, Interval and Ratio Scales (Stevens 1946).

It is important to distinguish between these different scales of measurement because the statistical manipulation that can legitimately be applied to data depends on the type of scale against which the data are ordered.

Nominal Scale is the most unrestricted assignment of numerals. The numerals serve only as labels to distinguish (a) individual units from each other, or (b) groups of units from other groups. An example of (a) is the numbering of football players. An example of (b) is the allocation of population numbers into categories of male and female.


arbitrarily set, which means that ratio manipulation is not valid. Thus, the Celsius temperature scale has a zero point set by convention at the temperature at which water freezes, which is 273.15°C above absolute zero. This means that one cannot say that 20°C is double 10°C.

Valid statistics include those valid for nominal and ordinal scales plus mean, standard deviation, rank-order correlation and product-moment correlation.

Ratio Scale is a scale consisting of equal-sized gradations and a true zero point. It satisfies the conditions of rank-ordering (e.g., [35 > 25], [56 = 56], [89 < 123]); equality of intervals (e.g., [8-6] = [4-2]); and equality of ratios (e.g., [8/4] = [6/3]; 20 K is double 10 K).

    All types of statistical measurements are legitimate.
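To make the constraint concrete, the minimal sketch below (hypothetical data, Python standard library only) pairs each scale with a summary statistic that is legitimate for it. It is an illustration of the point above, not a prescription.

```python
# Minimal sketch (hypothetical data): which summary statistics are legitimate
# depends on the scale of measurement, as discussed above.
import statistics

# Nominal: numerals are only labels (e.g. 1 = male, 2 = female).
# Counting and mode are meaningful; a mean is not.
gender_codes = [1, 2, 2, 1, 2, 2, 1]
print("Nominal  - mode:", statistics.mode(gender_codes))

# Ordinal: ranks order cases but intervals are not equal (e.g. satisfaction 1-5).
# Median and percentiles are meaningful; treating the mean as exact is dubious.
satisfaction = [1, 3, 4, 4, 5, 2, 3]
print("Ordinal  - median:", statistics.median(satisfaction))

# Interval: equal intervals but an arbitrary zero (e.g. temperature in Celsius).
# Mean and standard deviation are valid, but ratios ("double") are not.
temps_c = [10.0, 12.5, 20.0, 18.5]
print("Interval - mean:", statistics.mean(temps_c),
      " sd:", statistics.stdev(temps_c))

# Ratio: true zero (e.g. cost in dollars), so ratios are also meaningful.
costs = [400.0, 800.0, 1200.0]
print("Ratio    - total:", sum(costs),
      " ratio of largest to smallest:", max(costs) / min(costs))
```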

    3.4 Sources of the data

Rarely is much thought given to the data necessary for comprehensive assessment of impacts until the program is up and running. In such cases effectiveness evaluations, in particular, must rely on special post-program collection efforts to establish a base-line datum. This is usually costly, and it is often difficult to get the desired accuracy.

Future evaluation data needs, the data sources and the mechanisms for collecting and storing the data should be addressed during the initial planning for the program. Data sources may be considered under five broad categories:

    management information systems,

    special collection efforts,

    existing records and statistics,

    simulation modelling and

    expert judgement.

    Management information systems (MIS):

Every agency should have in place an MIS which captures input, process and output data. As far as is practical such data should be collected as an automatic by-product of normal work processes.

In practice government agencies are far from this ideal. All agencies hold vast amounts of data, of unknown quality, on diverse, and often incompatible, database systems, on manual indexes, in files, in reports. Data integrity becomes more problematic as one moves from the central Government bureaucracies to regional and local bureaucracies.

The advent of Intranets and the World Wide Web potentially gives easier access to these data. However, issues of collation and quality control, especially with respect to data definition, become, if anything, more critical.

    Special collection efforts:


sources it is important to check data definitions, population characteristics and any adjustments made to the original data (e.g., re-basing or smoothing).

    Simulation and modelling:

Sometimes physical or mathematical models can be developed to simulate the program operations. All State Road Authorities in Australia, for example, have computer models which help estimate the likely impact of changes in the road system. Powerful, yet inexpensive, computer modelling tools are now being used to model a wide range of social systems. In particular, the graphically oriented system dynamics simulation packages such as Powersim and Ithink will in the future become standard tools for the evaluator. The calibration of such models may, of course, require extensive historical data.

    Expert judgement:

Not all change can be measured directly. In social fields the assessment of qualitative changes often depends on expert judgement. It is important that the rating procedures used allow the expression of judgements in a comparable and reproducible way.

    3.5 Issues in data sampling

Data collection is expensive. Costs can be cut by using sample survey techniques. The crux of survey design is its method of probability sampling, which permits generalisations to be made, from the findings about the sample, to the population as a whole.

The validity of generalising from the sample to the entire population requires that the sample be selected according to rigorous rules and that there is a uniform approach to data collection for every unit in the sample.

The sample size affects how statistically reliable the findings will be when projected to the entire population. Other important considerations include response rate, uniformity in the sampling technique, and ambiguity in the data collection instruments or questions.

Sample surveys are specialist tasks, and advice should be sought before embarking on their use.
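As a rough illustration of the sample-size consideration mentioned above, the sketch below applies the standard formula for estimating a population proportion, n = z²p(1-p)/e², with an optional finite population correction. The confidence level, margin of error and population size are illustrative assumptions only; a sampling specialist should still be consulted.

```python
# Minimal sketch: approximate sample size needed to estimate a population
# proportion to a given margin of error (illustrative figures only).
import math

def sample_size_proportion(margin_of_error, confidence_z=1.96, p=0.5,
                           population=None):
    """n = z^2 * p * (1 - p) / e^2, with an optional finite population correction."""
    n = (confidence_z ** 2) * p * (1.0 - p) / (margin_of_error ** 2)
    if population is not None:
        # Finite population correction for small populations.
        n = n / (1.0 + (n - 1.0) / population)
    return math.ceil(n)

# 95% confidence, +/- 5 percentage points, worst-case p = 0.5
print(sample_size_proportion(0.05))                   # about 385
# Same precision when the whole population is only 2,000 units
print(sample_size_proportion(0.05, population=2000))  # about 323
```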

    3.6 Methods of data collection

Having decided whether and what type of sampling technique is appropriate, the next step is to determine how to collect the needed data from the various sources available. Many approaches are available. Some require the involvement of individuals or groups; others, such as observation and review of existing data, can largely be done by the researcher alone. Table 3 is but a small sample of data collection techniques.

There is a tendency to assume that any fool can design a questionnaire, run a brainstorming session or undertake systematic observation. These also are areas which demand expertise and experience. Undertaking major collections without staff with the requisite skills can be costly and result in unreliable data and loss of credibility.


Individually Oriented Methods | Interviews, Questionnaires, Polls, Tests

Group-Oriented Methods | Sensing interviews, Committees, DELPHI techniques, Nominal-group technique, Brainstorming

Observation | Systematic observation, Complete observation, Participant observation

Review of Existing Data | Records analysis, Usage rates, Use traces

Other | Simulation

    Table 3: Data Collection Techniques

    3.7 The fundamental issue of cause and effect

With effectiveness evaluation in particular, but also with other evaluation types (see Figure 1), a major task of data collection relates to measuring change. However, it is equally important to be able to determine how much of that change is due to the program itself, and how much results from other factors.

    CASE 1: After measurement only - No before measurement or comparison or control group

We measure population characteristics at a single point in time, and compare these with the target performance. For example, if a program aims to eliminate child poverty by 2009, measurement of the extent of child poverty in 2009 will suggest whether the objective has been met.

    While this approach is cheap, and quite common, it has two critical defects:

without baseline data the basis of the target is questionable and the amount of change is uncertain (change from what?);

even if change has occurred there is no valid basis for ascribing its cause to the program.

We can often mitigate the first deficiency by estimating baseline conditions from existing data, through expert judgement etc. The issue of causality remains.

CASE 2: Before and after measurement - No comparison or control group

This is one of the more common evaluation designs. As an example, to test the efficacy of a public service management improvement program we might survey departmental


For example, we might compare the crop yield of treated paddocks with untreated ones (which have the same basic characteristics). We assume that any difference between the pilot and the control is due solely to the treatment program. The doubt remains:

were the baseline conditions between the pilot and the control identical?

are the program effects the sole differences in impacts on the respective populations? (With social programs even the awareness of a pilot program can impact on control groups.)

    CASE 4: Time series and econometric analyses

In a time series design the trends in the relevant indicators are analysed for the period prior to the program. We project these forward in time and assume the projections represent what would have been without the program. The difference between the projections and the actuals is presumed to be solely due to the program.

Econometric techniques are statistical methods that estimate, from historical data, the mathematical relationship between social and economic variables which are considered to have a causal link to the evaluation question. The mathematical formula is used to predict what would have been in the absence of the program. We then compare this with the actuals. Because they consider more variables than simply time, econometric approaches have a better predictive value than time series methods.

These approaches provide more reliable information than the previous two designs, and are relatively inexpensive provided the requisite data is available. Their limitations are that:

they presume rather than prove causality; hence

we cannot be certain that projections validly represent what will be.
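The sketch below illustrates, with hypothetical figures, the mechanics of the time series design described in Case 4: a linear trend is fitted to the pre-program years, projected forward as the "what would have been" counterfactual, and compared with the post-program actuals. As the limitations above note, the gap between projection and actuals presumes rather than proves causality.

```python
# Minimal sketch of the Case 4 time-series logic (hypothetical figures):
# fit a linear trend to pre-program years, project it forward as the
# "what would have been" counterfactual, and compare with actuals.

pre_years   = [2001, 2002, 2003, 2004, 2005]   # before the program
pre_values  = [52.0, 50.5, 49.8, 48.9, 47.5]   # e.g. % of target group below a benchmark

post_years  = [2006, 2007, 2008]               # program operating
post_actual = [44.0, 41.5, 39.0]

# Ordinary least-squares slope and intercept for the pre-program trend.
n = len(pre_years)
mean_x = sum(pre_years) / n
mean_y = sum(pre_values) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(pre_years, pre_values))
         / sum((x - mean_x) ** 2 for x in pre_years))
intercept = mean_y - slope * mean_x

for year, actual in zip(post_years, post_actual):
    projected = intercept + slope * year        # counterfactual projection
    print(year, "projected:", round(projected, 1),
          "actual:", round(actual, 1),
          "apparent program effect:", round(projected - actual, 1))
```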

CASE 5: Quasi-experimental design - before & after measurements of both pilot and comparison group

This design involves two or more measurements over time on both a pilot and a comparison group. Both rates of change and amount of change between the two groups are then compared. This protects to a large degree against changes which might have resulted from other factors.

The main problem with this design is ensuring the pilot and control groups have sufficiently similar characteristics. For example, if a pilot lifestyle education program is run in Homebush, NSW, and the control group is located in Broadmeadows, Victoria, subsequent health differences could be due to non-program factors, such as climate, ethnicity etc.
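A minimal sketch of the arithmetic behind this design follows, using hypothetical survey scores: the change in the comparison group is used to net out change that would have occurred anyway, leaving the apparent program effect.

```python
# Minimal sketch of the Case 5 comparison (hypothetical survey scores):
# compare the change in the pilot group with the change in the comparison
# group, so that change common to both groups is netted out.

pilot_before,      pilot_after      = 62.0, 71.0
comparison_before, comparison_after = 60.0, 64.0

pilot_change      = pilot_after - pilot_before            # 9.0
comparison_change = comparison_after - comparison_before  # 4.0

# The "difference in differences" is the change in the pilot group over and
# above the change that occurred anyway in the comparison group.
apparent_program_effect = pilot_change - comparison_change
print("Apparent program effect:", apparent_program_effect)  # 5.0
```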

CASE 6: Experimental design - before & after measurements of randomly assigned pilot and control groups


    3.8 Data Analysis Methods to be Applied - Choosing the Appropriate Statistical Tests

    Data analysis is often the weakest link in evaluation. In particular:

there is often a lack of understanding about the differences between number scales, as discussed above;

there seems to be little understanding of appropriate statistical techniques to analyse data where there are multiple variables;

there appears to be little awareness of, or skills in the use of, pattern recognition or classification techniques such as cluster analysis, factor analysis, multidimensional scaling, discriminant analysis etc.

Most good statistical software packages will include programs to guide the user towards the appropriate statistical tools to apply, depending on the nature of the data (including the scale) and the type of analytical question. Table 4 is illustrative of the guidance available. Obviously, simply to answer such questions presumes a degree of statistical understanding. This is an area where the evaluation team should avail itself of skilled professionals.

How the data will be analysed should be considered in the evaluation design stage. It should not be left, as is so often the case, until the data has been collected.

Table 4: Find the Right Statistical Test
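The original Table 4 is not reproduced here, but the sketch below gives a flavour of the kind of guidance such a table offers when comparing two groups, keyed to the scale of measurement. It uses the scipy library; the data, group labels and the particular tests chosen are illustrative assumptions rather than a complete decision rule.

```python
# Illustrative sketch only: the sort of guidance a "find the right test" table
# gives when comparing two groups, keyed to the scale of measurement.
# Data and group labels are hypothetical.
from scipy import stats

def compare_two_groups(group_a, group_b, scale):
    """Pick a two-group test appropriate to the measurement scale."""
    if scale == "nominal":
        # group_a / group_b are category counts: chi-square test of independence.
        chi2, p, dof, _ = stats.chi2_contingency([group_a, group_b])
        return "chi-square", p
    if scale == "ordinal":
        # Ranked data: a rank-based (non-parametric) test.
        stat, p = stats.mannwhitneyu(group_a, group_b)
        return "Mann-Whitney U", p
    # Interval or ratio data: an independent-samples t-test.
    stat, p = stats.ttest_ind(group_a, group_b)
    return "t-test", p

# Interval/ratio example: test scores for pilot vs comparison sites.
print(compare_two_groups([68, 72, 75, 71, 69], [63, 66, 70, 64, 65], "ratio"))
# Nominal example: counts of (satisfied, dissatisfied) respondents per group.
print(compare_two_groups([40, 10], [28, 22], "nominal"))
```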

    3.9 The Evaluation Design


design whose requirements must later be compromised because of budget, time or personnel constraints.

Resist the urge to collect those additional data items which might be useful, one day. Data collection and maintenance is costly. In many agency evaluations much of the data collected is never used.

Get expert advice on the validity and appropriateness of the design. The most costly design is that which is inappropriate to the problem.

    4.0 COMMUNICATING THE RESULTS

Communication of the findings of the evaluation is an extremely important aspect of the study. If the client cannot understand, misinterprets or is unconvinced by the conclusions then the evaluation effort is largely wasted. It is crucial to present conclusions and recommendations in a form which can be readily examined and considered by decision-makers.

    A report runs the risk of failure if it:

is verbose or obtuse;

concentrates on issues which are of low priority to its audience(s);

lacks logic or consistency in presenting information;

includes criticism which appears gratuitous or unfair; or

lacks clear justification for contentious conclusions.

    REFERENCES

Ackoff, Russell. "From Data to Wisdom." Journal of Applied Systems Analysis 16 (1989): 3-9.

    Burke, Martin. Thought Systems and Network Centric Warfare. DSTO, 2000.

Tuomi, Ilkka. "Data is More Than Knowledge." Journal of Management Information Systems 16 (1999): 107-121.

Stevens, S. S. "On the Theory of Scales of Measurement." Science 103 (1946): 677-680.


FIGURE 1: TYPES OF EVALUATION AND THEIR CHARACTERISTICS