Creating Tests that Measure Well and that Model Good Instructional and Learning Practice
National Conference on Student Assessment, Minneapolis, MN
June 2012
Session Outline
• Presenters
  – Randy Bennett (ETS)
  – Edys Quellmalz (WestEd)
• Discussant
  – Brian Gong (NCIEA)
• Q&A
Copyright © 2011 by Educational Testing Service.
CBAL: Modeling Good Instructional and Learning Practice through Assessment
Randy Bennett, ETS
Presentation at the National Conference on Student Assessment, Minneapolis, MN, June 2012
Overview
• Brief description of CBAL's goal and design characteristics
• Brief outline of pilot results
• Examples of how we try to model good teaching and learning practice
• List of outstanding issues
• Summary
Cognitively Based Assessment of, for, and as Learning
• Began in 2007
• Goal: Create knowledge and capability, grounded in the learning sciences, that can be configured in different ways to address the assessment innovation needs of the field
• CBAL assessment prototypes attempt to:
  – Document what students have achieved ("of learning")
  – Help identify how to plan instruction ("for learning")
  – Offer worthwhile educational experiences ("as learning")
• R&D covers reading, writing, mathematics, and science from elementary school through adult education
Key Design Characteristics
• Summative and formative assessment built as part of a coherent system
• System model was created from a detailed theory of action
• Assessment designs are grounded in principles and domain conceptions from learning-sciences research
• Assessment prototypes are computer-delivered and make heavy use of structured, scenario-based task sets
• Summative assessments use a distributed design
• Assessment prototypes are built to measure well and to model good instructional and learning practice
Summary of Results from 16 Online Summative Assessment Pilots
Content Area        # of    Median (M)   M of the M   M       Most Frequent       M r with Other   Diff. Btwn Auto-
(& # of Form        Tests   of the M     % Omitted/   Coeff.  Factor Analytic     Tests of the     Human (H) & H-H
Administrations)            p+ Values    Missing      Alpha   Result              Same Skill       Agreement

Reading (6)         3,062   .51          0%           .88     1 F within &        .74              M = 6 k pts
                                                              across forms
Writing (9)         5,410   .57          1%           .82     1 F within &        ---              3 r pts
                                                              across forms
Math (12)           1,347   .45          6%           .92     1 F within forms    .76              M = 15 k pts

Note. M = median; F = factor; k = kappa; r = correlation coefficient.
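Two of the statistics in the table, coefficient alpha (internal consistency) and kappa (chance-corrected rater agreement), follow standard formulas. The sketch below is purely illustrative, with made-up scores and hypothetical function names; it is not the analysis code used in the CBAL pilots.

```python
# Illustrative implementations of two statistics from the pilot-results table.
# Data and function names here are hypothetical, not from the CBAL analyses.

def cronbach_alpha(item_scores):
    """Coefficient alpha for a test, given item_scores[item][examinee]."""
    k = len(item_scores)                  # number of items
    n = len(item_scores[0])               # number of examinees

    def var(xs):                          # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_var_sum = sum(var(item) for item in item_scores)
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    return (k / (k - 1)) * (1 - item_var_sum / var(totals))

def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' category assignments."""
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    cats = set(rater_a) | set(rater_b)
    p_exp = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)

# Two identical items covary perfectly, so alpha is 1.0:
# cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]]) -> 1.0
```

The table's last column reports how far automated-human kappa falls below human-human kappa; with both computed as above, that difference is a simple subtraction of the two kappa values.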
Modeling Good Teaching and Learning Practice
• CBAL summative (and formative) assessments try to:
  – Give students something substantive and reasonably realistic with which to reason, read, write, or do mathematics or science
  – Include tools and representations similar to ones proficient performers tend to use
  – Connect qualitative (conceptual) understanding with formalism
  – Use "lead-in" and "culminating" tasks to suggest how the skills required for more complex performance might be decomposed for instruction
  – Use (CCSS-aligned) learning progressions to denote and measure levels of qualitative change in student understanding
[Example task screenshot: "Will the lake become so shallow that water can no longer flow through the dam?"]
CBAL Definition of “Learning Progression”
• A description of qualitative change in a student's level of sophistication for a key concept, process, strategy, practice, or habit of mind. Change in student standing on such a progression may be due to a variety of factors, including maturation and instruction. Each progression is presumed to be modal, i.e., to hold for most, but not all, students. Finally, it is provisional, subject to empirical verification and theoretical challenge.
Provisional Learning Progression for Argument-Building (Deliberation)
• PRELIMINARY: Can distinguish reasons from non-reasons and infer whether reasons would be used to support or oppose a position.
• FOUNDATIONAL: Can self-generate multiple reasons to support an opinion.
• BASIC: Can rank and select reasons by how convincing they seem; can distinguish facts and details that strengthen a point from those that weaken it; can distinguish between reasoning that seems convincing because one agrees with it and reasoning that seems convincing because of the content of the argument.
• INTERMEDIATE: Can recognize counterexamples; can distinguish valid from invalid arguments and recognize unsupported claims and obvious fallacies.
• ADVANCED: Can identify and question the warrants of arguments, distinguish necessary from sufficient evidence, and synthesize a position from many sources of evidence, using that synthesis to identify key evidence and propose new lines of argument.
Outstanding Issues
• Do our modeling strategies affect classroom teaching and learning practice?
  – Do teachers change their instructional practice in the intended ways?
  – Do students change their learning practice in the intended ways?
  – Does achievement improve?
• Are our learning progressions useful for measurement and for instruction?
• Do the modeling strategies and learning progressions appear to be of benefit for students from special populations, as well as for those from the general population?
Summary
• In CBAL we are:
  – Designing assessment prototypes to measure well and have positive impact
  – Attempting to have positive impact by modeling good teaching and learning practice:
    • Give students something substantive and reasonably realistic with which to work
    • Include tools and representations similar to ones used by proficient performers
    • Connect qualitative understanding with formalism
    • Use "lead-in" and "culminating" tasks to suggest how complex performance might be decomposed
    • Use provisional learning progressions to denote and measure levels of qualitative change
• Data on "measuring well" appear promising
• Much more work needs to be done to verify the effectiveness of our "practice-modeling" attempts
For More About CBAL
• Overview Papers
  – Bennett, R. E., & Gitomer, D. H. (2009). Transforming K-12 assessment: Integrating accountability testing, formative assessment, and professional support. In C. Wyatt-Smith & J. Cumming (Eds.), Educational assessment in the 21st century (pp. 43-61). New York: Springer.
  – Bennett, R. E. (2010). Cognitively Based Assessment of, for, and as Learning: A preliminary theory of action for summative and formative assessment. Measurement: Interdisciplinary Research and Perspectives, 8, 70-91.
  – Bennett, R. E. (2011). CBAL: Results from piloting innovative K-12 assessments (RR-11-23). Princeton, NJ: Educational Testing Service.
• Commentaries
  – Embretson, S. (2010). Cognitively based assessment and the integration of summative and formative assessments. Measurement: Interdisciplinary Research & Perspectives, 8, 180-184.
  – Linn, R. L. (2010). Commentary: A new era of test-based educational accountability. Measurement: Interdisciplinary Research & Perspectives, 8, 145-149.
• www.ets.org/research/topics/cbal/initiative