Dubrowski assessment and evaluation

MODELS FOR ASSESSING LEARNING AND EVALUATING LEARNING OUTCOMES Adam Dubrowski PhD


Panel presentation for the Pain Symposium 2010

Transcript of Dubrowski assessment and evaluation

Page 1: Dubrowski assessment and evaluation

MODELS FOR ASSESSING LEARNING AND EVALUATING LEARNING OUTCOMES

Adam Dubrowski PhD

Page 2: Dubrowski assessment and evaluation

www.wordle.net

Page 3: Dubrowski assessment and evaluation

Acknowledgement: Dr. Kathryn Parker, Learning Institute

Page 4: Dubrowski assessment and evaluation

Introduction

Page 5: Dubrowski assessment and evaluation

Introduction

Is Eli OK?

Page 6: Dubrowski assessment and evaluation

Introduction

Environment

Intake

Interaction

Reaction

Function

Eli

Page 7: Dubrowski assessment and evaluation

Content

• Assessment

• Evaluation: Models and Frameworks

• Working Example: Evaluation of a Hand Hygiene Program

• Assessment Instruments

• Standards and Rigor: Assessment Instruments

• Working Example: OSATS

• Standards and Rigor: Evaluations

• Summary and Take-home Message

Page 8: Dubrowski assessment and evaluation

Introduction

Assessments

Summative

Formative

Page 9: Dubrowski assessment and evaluation

Evaluations Assessments

Introduction

Page 10: Dubrowski assessment and evaluation

Introduction

Evaluations

Outcome

Process

Page 11: Dubrowski assessment and evaluation

Introduction

Evaluations

Outcome

Process

Page 12: Dubrowski assessment and evaluation

Alkin, M.C. & Christie, C.A. (2004). The evaluation theory tree. In Alkin, M.C. (ed.), Evaluation Roots: Tracing Theorists’ Views and Influences. London: Sage Publications.

Characteristic: Purpose of evaluation
• USE: To generate information that will be useful and inform decision making.
• METHODS (Research): To generalize findings from one program to another similar program.
• VALUE: To render judgment on the merit or worth of the program.

Characteristic: Role of evaluator
• USE: Meet the needs of the stakeholders; negotiator
• METHODS (Research): Researcher
• VALUE: Judge

Characteristic: Role of program staff/stakeholders
• USE: Close collaborator; drives the evaluation
• METHODS (Research): Inform or approve the research question(s)
• VALUE: Provider (source) of information

Characteristic: Questions
• USE: How will this evaluation data be used? By whom? For what purpose?
• METHODS (Research): Can these findings be generalized to other programs under similar conditions? How do these findings contribute to what we know about how people learn and change?
• VALUE: Does the program meet the predetermined criteria? Was the program developed and implemented in the best interest of the public?

Characteristic: Voices
• USE: Daniel Stufflebeam (1983); Michael Patton (2007)
• METHODS (Research): Ralph Tyler (1942); Peter Rossi (2006)
• VALUE: Robert Stake (1975); Guba & Lincoln (1989)

Introduction

Page 13: Dubrowski assessment and evaluation

Introduction

Page 14: Dubrowski assessment and evaluation

Reflections

Where would you sit on the Evaluation Tree?What about others in your “system”?

Page 15: Dubrowski assessment and evaluation

(The Evaluation Theory Tree table from Page 12 — Alkin & Christie, 2004 — is shown again.)

Introduction

Page 16: Dubrowski assessment and evaluation

Introduction

Characteristic: Philosophy/paradigm
• Adam: Research
• Kath: USE (leaning towards Research)

Characteristic: Purpose of evaluation
• Adam: To generalize findings from one program to another similar program.
• Kath: To generate information that will be useful for decision making; to gain a better understanding of how programs work.

Characteristic: Role of evaluator
• Adam: Researcher
• Kath: Collaborator

Characteristic: Role of program staff/stakeholders
• Adam: Inform or approve the research question(s)
• Kath: Close collaborator; drives the evaluation

Characteristic: Questions
• Adam: Can these findings be generalized to other programs under similar conditions? How do these findings contribute to what we know about how people learn and change?
• Kath: How will this evaluation data be used? By whom? For what purpose? How do these findings contribute to what we know about how people learn and change?

Characteristic: Voices
• Adam: Ralph Tyler (1942); Peter Rossi (2006)
• Kath: Patton (2007); Chen (2002); Donaldson (2001)

Page 17: Dubrowski assessment and evaluation

Models and Frameworks

• CIPP Evaluation Model (Stufflebeam, 1983)

• Kirkpatrick’s Model of Evaluating Training Programs (Kirkpatrick, 1998)

• Moore’s Evaluation Framework (Moore et al., 2009)

• Miller’s Framework for Clinical Assessments (Miller, 1990)

Page 18: Dubrowski assessment and evaluation

CIPP

• CIPP: Context, Input, Process, Product

Stufflebeam, 1983; Steinert, 2002

Page 19: Dubrowski assessment and evaluation

CIPP

Context

• What is the relation of the course to other courses?
• Is the time adequate?
• Should courses be integrated or separate?

Page 20: Dubrowski assessment and evaluation

CIPP

Inputs

• What are the entering ability, learning skills and motivation of students?
• What is the students’ existing knowledge?
• Are the objectives suitable?
• Does the content match student abilities?
• What is the theory/practice balance?
• What resources/equipment are available?
• How strong are the teaching skills of teachers?
• How many students/teachers are there?

Page 21: Dubrowski assessment and evaluation

CIPP

Process

• What is the workload of students?
• Are there any problems related to teaching/learning?
• Is there effective 2-way communication?
• Is knowledge only transferred to students, or do they use and apply it?
• Is the teaching and learning process continuously evaluated?
• Is teaching and learning affected by practical/institutional problems?

Page 22: Dubrowski assessment and evaluation

CIPP

Product

• Is there one final exam at the end or several during the course?
• Is there any informal assessment?
• What is the quality of assessment?
• What are the students’ knowledge levels after the course?
• How was the overall experience for the teachers and for the students?

Page 23: Dubrowski assessment and evaluation

CIPP

Methods used to evaluate the curriculum

• Discussions
• Informal conversation or observation
• Individual student interviews
• Evaluation forms
• Observation in class by colleagues
• Performance test
• Questionnaire
• Self-assessment
• Written test

Page 24: Dubrowski assessment and evaluation

CIPP

CIPP focuses on the process and informs the program/curriculum for future improvements.

Limited emphasis on the outcome.

Page 25: Dubrowski assessment and evaluation

Kirkpatrick’s Model of Evaluating Training Programs

Results

Behavior

Learning

Reaction

Page 26: Dubrowski assessment and evaluation


Kirkpatrick’s Model of Evaluating Training Programs

• Reaction: The degree to which the expectations of the participants about the setting and delivery of the learning activity were met

• Questionnaires completed by attendees after the activity

Page 27: Dubrowski assessment and evaluation

Kirkpatrick’s Model of Evaluating Training Programs


• Learning: The degree to which participants recall and demonstrate in an educational setting what the learning activity intended them to be able to do

• Tests and observation in educational setting

Page 28: Dubrowski assessment and evaluation

Kirkpatrick’s Model of Evaluating Training Programs


• Behavior: The degree to which participants do what the educational activity intended them to be able to do in their practices

• Observation of performance in patient care setting; patient charts; administrative databases

Page 29: Dubrowski assessment and evaluation

Kirkpatrick’s Model of Evaluating Training Programs


• Results: The degree to which the health status of patients as well as the community of patients changes due to changes in the practice

• Health status measures recorded in patient charts or administrative databases, epidemiological data and reports

Page 30: Dubrowski assessment and evaluation

Kirkpatrick’s Model of Evaluating Training Programs

• The importance of Kirkpatrick’s Model lies in its ability to identify a range of dimensions that need to be evaluated in order to inform us about the educational quality of a specific program.

• It focuses on the outcomes.

• However, it provides limited information about the individual learner.
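A minimal sketch of how Kirkpatrick’s four levels line up with the example data sources named on the preceding slides; the structure and variable names below are my own illustration, not part of the original presentation.

```python
# Illustrative mapping of Kirkpatrick's levels to example data sources
# (drawn from the slides above; the dictionary itself is an assumption).
kirkpatrick_levels = {
    "Reaction": ["post-activity questionnaires completed by attendees"],
    "Learning": ["tests", "observation in the educational setting"],
    "Behavior": ["observation in the patient care setting",
                 "patient charts", "administrative databases"],
    "Results":  ["health status measures from charts or administrative databases",
                 "epidemiological data and reports"],
}

# An evaluation plan could walk the levels to check that each one is covered.
for level, sources in kirkpatrick_levels.items():
    print(f"{level}: {', '.join(sources)}")
```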

Page 31: Dubrowski assessment and evaluation

Moore’s Framework

Figure: Miller’s levels (Knows, Knows How, Shows How, Does) shown alongside Kirkpatrick’s levels (Reaction, Learning, Behavior, Results).

Page 32: Dubrowski assessment and evaluation

Miller’s Framework for Clinical Assessments

Does (Performance)

Shows How (Competence)

Knows How (Procedural Knowledge)

Knows (Declarative Knowledge)

Page 33: Dubrowski assessment and evaluation


Miller’s Framework for Clinical Assessments

• Knows: Declarative knowledge. The degree to which participants state what the learning activity intended them to know

• Pre- and posttests of knowledge.

Page 34: Dubrowski assessment and evaluation

Miller’s Framework for Clinical Assessments


• Knows how: Procedural knowledge. The degree to which participants state how to do what the learning activity intended them to know how to do

• Pre- and posttests of knowledge

Page 35: Dubrowski assessment and evaluation

Miller’s Framework for Clinical Assessments


• Shows how: The degree to which participants show in an educational setting how to do what the learning activity intended them to be able to do

• Observation in educational setting

Page 36: Dubrowski assessment and evaluation

Miller’s Framework for Clinical Assessments


• Does: Performance. The degree to which participants do what the learning activity intended them to be able to do in their practices.

• Observation of performance in patient care setting; patient charts; administrative databases

Page 37: Dubrowski assessment and evaluation

Miller’s Framework for Clinical Assessments

• The importance of Miller’s Framework is in its ability to identify learning objectives and link them with appropriate testing contexts (where) and instruments (how).

Page 38: Dubrowski assessment and evaluation

Assumption

• Moore’s and Kirkpatrick’s models assume a relationship between the different levels.

• If learners are satisfied, they will learn more, will be able to demonstrate the new skills, transfer them to the clinical setting, and consequently the health of patients and communities will improve!

Page 39: Dubrowski assessment and evaluation

The models’ assumptions are not met!

Page 40: Dubrowski assessment and evaluation

Building Skills, Changing Practice: Simulator Training for Hand Hygiene Protocols

Canadian Institutes of Health Research, Partnerships for Health System Improvement

A. McGeer, MA. Beduz, A. Dubrowski

Page 41: Dubrowski assessment and evaluation

Purpose

• Current models of knowledge delivery about proper hand hygiene rely on didactic sessions

• However, transfer to practice is low

Page 42: Dubrowski assessment and evaluation

Purpose

Figure: didactic courses, simulation, and clinical practice shown against Miller’s levels (Knows, Knows How, Shows How, Does).

Page 43: Dubrowski assessment and evaluation

Purpose

• The primary purpose was to investigate the effectiveness of simulation-based training of hand hygiene on transfer of knowledge to clinical practice

Page 44: Dubrowski assessment and evaluation

Project

Study design (figure): recruitment; clinical monitoring (audits) before and after simulation-based training; measured outcomes: Reaction, Δ in Learning, Δ in Behavior.

Page 45: Dubrowski assessment and evaluation

Results: Reactions

• Simulation Design Scale
• Educational Practices Questionnaire

– 159 respondents; average response ‘Agree’ or ‘Strongly Agree’ (4 or 5 on a 5-point scale) to each of 40 items
– Rated 39 of these items as ‘Important’ (4 on a 5-point scale)

Page 46: Dubrowski assessment and evaluation

Results: Learning

Figure: bar chart of pre-test (n=558) and post-test (n=550) knowledge scores for the four hand hygiene moments (Bef-Pat/Env, Bef-Asp, Aft-Bfl, Aft-Pat/Env); pre-test values cluster in the 24–51% range and post-test values in the 92–98% range.

Page 47: Dubrowski assessment and evaluation

Results: Behaviour

Pre-test vs. post-test:

• Hand hygiene compliance (average of all opportunities, all 4 moments): 62% (n=558 opportunities) vs. 64% (n=550 opportunities)
• Average adequacy (Excellent/Satisfactory/Unsatisfactory): Satisfactory (n=252 observations) vs. Satisfactory (n=282 observations)
• Average duration: 13 seconds (n=254 observations) vs. 13 seconds (n=252 observations)
• Number of extra washes: 140 vs. 90
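A hedged illustration (not part of the original study analysis) of how one might check whether the small compliance change above is statistically meaningful; the compliant counts are approximated from the reported percentages and opportunity counts.

```python
from math import sqrt

# Two-proportion z-test on hand hygiene compliance, pre vs. post.
pre_opportunities, pre_rate = 558, 0.62    # 62% compliance pre-test
post_opportunities, post_rate = 550, 0.64  # 64% compliance post-test

pre_compliant = round(pre_rate * pre_opportunities)     # ~346 compliant opportunities
post_compliant = round(post_rate * post_opportunities)  # ~352 compliant opportunities

pooled = (pre_compliant + post_compliant) / (pre_opportunities + post_opportunities)
se = sqrt(pooled * (1 - pooled) * (1 / pre_opportunities + 1 / post_opportunities))
z = (post_rate - pre_rate) / se
print(f"z = {z:.2f}")  # |z| well below 1.96 suggests no detectable change in behaviour
```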

Page 48: Dubrowski assessment and evaluation

Conclusions

• No transfer of knowledge

• No clear relationship between the assessments of reactions, learning and behaviour.

Page 49: Dubrowski assessment and evaluation

Summary

Page 50: Dubrowski assessment and evaluation

Summary

• Outcome-based models of program evaluation may be of limited use (i.e., they are mostly on the research branch).

• We need new, more complex models that incorporate both processes and outcomes (i.e., models that span more branches).

Page 51: Dubrowski assessment and evaluation

Summary

• Assessment instruments, which feed these models, are critical for successful evaluations.

• We need to invest effort in the standardization and rigorous development of these instruments.

Evaluations Assessments

Page 52: Dubrowski assessment and evaluation

Assessment Instruments

Page 53: Dubrowski assessment and evaluation

Assessment Instruments

Page 54: Dubrowski assessment and evaluation

Standards and Rigor

Page 55: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust

The SAC defined a set of eight key attributes of instruments, which apply to instruments used to:

1. distinguish between two or more groups,

2. assess change over time, and

3. predict future status.

Page 56: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust

The SAC defined a set of eight key attributes of instruments:

– The Conceptual and Measurement Model
– Reliability
– Validity
– Responsiveness or sensitivity to change
– Interpretability
– Burden
– Alternative Forms of Administration
– Cultural and Language Adaptations

The Conceptual and Measurement Model: the concept to be measured needs to be defined properly and should match its intended use.

Page 57: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust


Reliability is the degree to which the instrument is free of random error, which means free from errors in measurement caused by chance factors that influence measurement.
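A hedged sketch of one common reliability estimate, internal consistency (Cronbach’s alpha); the scores below are invented for illustration and are not from the presentation.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Internal-consistency reliability. scores: rows = examinees, columns = items (or raters)."""
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Invented example: 5 examinees rated on 3 items of a 5-point scale.
scores = np.array([[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```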

Page 58: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust


Validity is the degree to which the instrument measures what it purports to measure.

Page 59: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust


Responsiveness is an instrument’s ability to detect change over time.
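A hedged sketch of one common responsiveness index, the standardized response mean (mean pre-to-post change divided by the standard deviation of the change); the paired scores below are invented for illustration.

```python
import numpy as np

# Invented paired scores for 5 learners, before and after an educational activity.
pre = np.array([45, 50, 38, 60, 52], dtype=float)
post = np.array([52, 48, 55, 63, 58], dtype=float)

change = post - pre
srm = change.mean() / change.std(ddof=1)  # standardized response mean
print(f"SRM = {srm:.2f}")  # larger values indicate greater sensitivity to change
```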

Page 60: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust


Interpretability is the degree to which one can assign easily understood meaning to an instrument’s score.

Page 61: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust


Burden refers to the time, effort and other demands placed on those to whom the instrument is administered (respondent burden) or on those who administer the instrument (administrative burden).

Page 62: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust


Alternative means of administration include self-report, interviewer-administered, computer-assisted, etc. It is often important to know whether these modes of administration are comparable.

Page 63: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust


Cultural and Language adaptations or translations.

Page 64: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust

The SAC defined a set of eight key attributes of instruments:

– The Conceptual and Measurement Model
– Reliability
– Validity
– Responsiveness or sensitivity to change
– Interpretability
– Burden
– Alternative Forms of Administration
– Cultural and Language Adaptations

Page 65: Dubrowski assessment and evaluation

Scientific Advisory Committee of the Medical Outcomes Trust

Do our assessment instruments fit these criteria?

Page 66: Dubrowski assessment and evaluation

Working Example: OSATS

• OSATS: Objective Structured Assessment of Technical Skills (a scoring sketch follows this list)

• Bell-ringer exam (12 minutes per station)

• Minimum of 6-8 stations

• Direct or video assessments of technical performance
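A hedged sketch of how station-based OSATS ratings might be rolled up into an overall score. OSATS stations are commonly scored with a global rating scale and/or task-specific checklists, but the 5-point items, station names, and aggregation below are my own illustration, not the published instrument.

```python
# Invented ratings: each station is scored on seven 5-point global-rating items.
stations = {
    "knot tying":         [4, 3, 4, 4, 5, 4, 4],
    "excision of lesion": [3, 3, 4, 3, 4, 3, 3],
    "bowel anastomosis":  [4, 4, 4, 5, 4, 4, 4],
}

# Average the items within each station, then average across stations.
station_means = {name: sum(items) / len(items) for name, items in stations.items()}
overall = sum(station_means.values()) / len(station_means)

for name, mean in station_means.items():
    print(f"{name}: {mean:.2f}")
print(f"overall OSATS score: {overall:.2f}")
```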

Page 67: Dubrowski assessment and evaluation

Working Example: OSATS

The Conceptual and Measurement Model

Reznick R, Regehr G, MacRae H, Martin J, McCulloch W. Testing technical skill via an innovative "bench station" examination. Am J Surg. 1997 Mar;173(3):226-30.

Martin JA, Regehr G, Reznick R, MacRae H, Murnaghan J, Hutchison C, Brown M. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg. 1997 Feb;84(2):273-8.

Page 68: Dubrowski assessment and evaluation

Working Example: OSATS

Reliability and Validity

Faulkner H, Regehr G, Martin J, Reznick R. Validation of an objective structured assessment of technical skill for surgical residents. Acad Med. 1996 Dec;71(12):1363-5.

Kishore TA, Pedro RN, Monga M, Sweet RM. Assessment of validity of an OSATS for cystoscopic and ureteroscopic cognitive and psychomotor skills. J Endourol. 2008 Dec;22(12):2707-11.

Goff B, Mandel L, Lentz G, Vanblaricom A, Oelschlager AM, Lee D, Galakatos A, Davies M, Nielsen P. Assessment of resident surgical skills: is testing feasible? Am J Obstet Gynecol. 2005 Apr;192(4):1331-8; discussion 1338-40.

Martin JA, Reznick RK, Rothman A, Tamblyn RM, Regehr G. Who should rate candidates in an objective structured clinical examination? Acad Med. 1996 Feb;71(2):170-5.

Page 69: Dubrowski assessment and evaluation

Working Example: OSATS

Alternative Forms of Administration

Maker VK, Bonne S. Novel hybrid objective structured assessment of technical skills/objective structured clinical examinations in comprehensive perioperative breast care: a three-year analysis of outcomes. J Surg Educ. 2009 Nov-Dec;66(6):344-51.

Lin SY, Laeeq K, Ishii M, Kim J, Lane AP, Reh D, Bhatti NI. Development and pilot-testing of a feasible, reliable, and valid operative competency assessment tool for endoscopic sinus surgery. Am J Rhinol Allergy. 2009 May-Jun;23(3):354-9.

Goff BA, VanBlaricom A, Mandel L, Chinn M, Nielsen P. Comparison of objective, structured assessment of technical skills with a virtual reality hysteroscopy trainer and standard latex hysteroscopy model. J Reprod Med. 2007 May;52(5):407-12.

Datta V, Bann S, Mandalia M, Darzi A. The surgical efficiency score: a feasible, reliable, and valid method of skills assessment. Am J Surg. 2006 Sep;192(3):372-8.

Page 70: Dubrowski assessment and evaluation

Working Example: OSATS

Adaptations

Setna Z, Jha V, Boursicot KA, Roberts TE. Evaluating the utility of workplace-based assessment tools for specialty training. Best Pract Res Clin Obstet Gynaecol. 2010 Jun 30.

Dorman K, Satterthwaite L, Howard A, Woodrow S, Derbew M, Reznick R, Dubrowski A. Addressing the severe shortage of health care providers in Ethiopia: bench model teaching of technical skills. Med Educ. 2009 Jul;43(7):621-7.

Page 71: Dubrowski assessment and evaluation

The SAC defined a set of eight key attributes of instruments:

– The Conceptual and Measurement Model
– Reliability
– Validity
– Responsiveness or sensitivity to change
– Interpretability
– Burden
– Alternative Forms of Administration
– Cultural and Language Adaptations

Working Example: OSATS

Page 72: Dubrowski assessment and evaluation

Working Example: OSATS

Applications to program evaluation

Chipman JG, Schmitz CC. Using objective structured assessment of technical skills to evaluate a basic skills simulation curriculum for first-year surgical residents. J Am Coll Surg. 2009 Sep;209(3):364-370.

Siddighi S, Kleeman SD, Baggish MS, Rooney CM, Pauls RN, Karram MM. Effects of an educational workshop on performance of fourth-degree perineal laceration repair. Obstet Gynecol. 2007 Feb;109(2 Pt 1):289-94.

VanBlaricom AL, Goff BA, Chinn M, Icasiano MM, Nielsen P, Mandel L. A new curriculum for hysteroscopy training as demonstrated by an objective structured assessment of technical skills (OSATS). Am J Obstet Gynecol. 2005 Nov;193(5):1856-65.

Anastakis DJ, Wanzel KR, Brown MH, McIlroy JH, Hamstra SJ, Ali J, Hutchison CR, Murnaghan J, Reznick RK, Regehr G. Evaluating the effectiveness of a 2-year curriculum in a surgical skills center. Am J Surg. 2003 Apr;185(4):378-85.

Page 73: Dubrowski assessment and evaluation

Evaluations Assessments

Standards and Rigor

Page 74: Dubrowski assessment and evaluation

The Joint Committee on Standards for Educational Evaluation (1990)

Utility Standards

• The utility standards are intended to ensure that an evaluation will serve the information needs of intended users.

Page 75: Dubrowski assessment and evaluation

The Joint Committee on Standards for Educational Evaluation (1990)

Feasibility Standards

• The feasibility standards are intended to ensure that an evaluation will be realistic, prudent, diplomatic, and frugal.

Page 76: Dubrowski assessment and evaluation

The Joint Committee on Standards for Educational Evaluation (1990)

Propriety Standards

• The propriety standards are intended to ensure that an evaluation will be conducted legally, ethically, and with due regard for the welfare of those involved in the evaluation, as well as those affected by its results.

Page 77: Dubrowski assessment and evaluation

The Joint Committee on Standards for Educational Evaluation (1990)

Accuracy Standards

• The accuracy standards are intended to ensure that an evaluation will reveal and convey technically adequate information about the features that determine the worth or merit of the program being evaluated.

Page 78: Dubrowski assessment and evaluation

Conclusions

Page 79: Dubrowski assessment and evaluation

Conclusions

Evaluations

Outcome

Process

Page 80: Dubrowski assessment and evaluation

Evaluations Assessments

Conclusions

Page 81: Dubrowski assessment and evaluation

Conclusions

Three philosophies/paradigms:
• Use
• Methods
• Value

Page 82: Dubrowski assessment and evaluation

Conclusions

Where would you sit on the Evaluation Tree?What about others in your “system”?

Page 83: Dubrowski assessment and evaluation

Take-home

• Need more complex evaluation models that look at both processes and outcomes

• Choices will depend on the intended use (i.e. the branch on the Evaluation Tree)

• Both the evaluation models and the assessment instruments need standardization and rigor.

Page 84: Dubrowski assessment and evaluation

Beast

Environment

Intake

Interaction

Reaction

Function

Fluffy