Assessment for learning: putting it into practice

Dylan Wiliam, Educational Testing Service

Workshop at the NSDC 37th annual conference, Philadelphia, PA, December 2005

Overview of presentation

• Why raising achievement is important

• Why investing in teachers is the answer

• Why assessment for learning should be the focus

• Why teacher learning communities should be the mechanism

Raising achievement matters

• For individuals
  – Increased lifetime salary
  – Improved health

• For society
  – Lower criminal justice costs
  – Lower health-care costs
  – Increased economic growth

Where’s the solution?

• Structure
  – Small high schools
  – K-8 schools

• Alignment
  – Curriculum reform
  – Textbook replacement

• Governance
  – Charter schools
  – Vouchers

• Technology

It’s the classroom

• Variability at the classroom level is up to 4 times greater than at the school level
• It’s not class size
• It’s not the between-class grouping strategy
• It’s not the within-class grouping strategy
• It’s the teacher

Teacher quality

• A labor force issue with 2 solutions
  – Replace existing teachers with better ones?
    • No evidence that more pay brings in better teachers
    • No evidence that there are better teachers out there deterred by certification requirements
  – Improve the effectiveness of existing teachers
    • The “love the one you’re with” strategy
    • It can be done
    • We know how to do it, but at scale? Quickly? Sustainably?

Functions of assessment

• For evaluating institutions

• For describing individuals

• For supporting learning
  – Monitoring learning
    • Whether learning is taking place
  – Diagnosing (informing) learning
    • What is not being learnt
  – Forming learning
    • What to do about it

Effects of formative assessment

• Several major reviews of the research
  – Natriello (1987)
  – Crooks (1988)
  – Black & Wiliam (1998)
  – Nyquist (2003)

• All find consistent, substantial effects

Kinds of feedback (Nyquist, 2003)

• Weaker feedback only
  – Knowledge of results (KoR)

• Feedback only
  – KoR + clear goals or knowledge of correct results (KCR)

• Weak formative assessment
  – KCR + explanation (KCR+e)

• Moderate formative assessment
  – (KCR+e) + specific actions for gap reduction

• Strong formative assessment
  – (KCR+e) + activity

Effect of formative assessment (higher education; Nyquist, 2003)

Intervention                      N   Effect size
Weaker feedback only             31   0.16
Feedback only                    48   0.23
Weak formative assessment        49   0.30
Moderate formative assessment    41   0.33
Strong formative assessment      16   0.51
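(Effect sizes of this kind are standardized mean differences: the difference between the mean outcomes of the feedback group and the comparison group, divided by the standard deviation of the outcome. An effect of 0.51 therefore corresponds to roughly half a standard deviation of additional learning.)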

Effects of feedback

• Kluger & DeNisi (1996)• Review of 3000 research reports• Excluding those:

– without adequate controls– with poor design– with fewer than 10 participants– where performance was not measured– without details of effect sizes

• left 131 reports, 607 effect sizes, involving 12652 individuals

• Average effect size 0.4, but– Effect sizes very variable– 40% of effect sizes were negative

Kinds of feedback: Israel

• 264 low- and high-ability grade 6 students in 12 classes in 4 schools; analysis of the 132 students at the top and bottom of each class
• Same teaching, same aims, same teachers, same classwork
• Three kinds of feedback: scores, comments, scores + comments

[Butler (1988) Br. J. Educ. Psychol., 58, 1-14]

Feedback    Gain   Attitude
scores      none   top +ve, bottom −ve
comments    30%    all +ve

Responses


What do you think happened for the students given both scores and comments?

A: Gain: 30%; Attitude: all +ve

B: Gain: 30%; Attitude: top +ve, bottom -ve

C: Gain: 0%; Attitude: all +ve

D: Gain: 0%; Attitude: top +ve, bottom -ve

E: Something else

Formative assessment

• Classroom assessment is not (necessarily) formative assessment

• Formative assessment is not (necessarily) classroom assessment

Types of formative assessment

• Long-cycle
  – Focus: between units
  – Length: four weeks to one year

• Medium-cycle
  – Focus: within units
  – Length: one day to two weeks

• Short-cycle
  – Focus: within lessons
  – Length: five seconds to one hour

Formative assessment

Assessment for learning is any assessment for which the first priority in its design and practice is to serve the purpose of promoting pupils’ learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence. An assessment activity can help learning if it provides information to be used as feedback, by teachers, and by their pupils, in assessing themselves and each other, to modify the teaching and learning activities in which they are engaged.

Such assessment becomes ‘formative assessment’ when the evidence is actually used to adapt the teaching work to meet learning needs.

Black et al., 2002

Feedback and formative assessment

• “Feedback is information about the gap between the actual level and the reference level of a system parameter which is used to alter the gap in some way” (Ramaprasad, 1983, p. 4)

• Three key instructional processes
  – Establishing where learners are in their learning
  – Establishing where they are going
  – Establishing how to get there

Five key strategies…

• Clarifying and understanding learning intentions and criteria for success

• Engineering effective classroom discussions that elicit evidence of learning

• Providing feedback that moves learners forward

• Activating students as instructional resources for each other

• Activating students as the owners of their own learning

…and one big idea

• Use evidence about learning to adapt instruction to meet student needs

Keeping Learning on Track (KLT)

• A pilot guides a plane or boat toward its destination by taking constant readings and making careful adjustments in response to wind, currents, weather, etc.

• A KLT teacher does the same:
  – Plans a carefully chosen route ahead of time (in essence, building the track)
  – Takes readings along the way
  – Changes course as conditions dictate

Keeping learning on track

• Teaching as engineering learning environments
• Key features:
  – Creates student engagement
  – Is well-regulated
    • Long feedback cycles vs. variable feedback cycles
    • Quality control vs. quality assurance in learning
    • Teaching vs. learning
    • Regulation of activity vs. regulation of learning

KLT processes

• Before the lesson
  – Planning regulation into the learning environment
  – Planning for evoking information

• During the lesson
  – ‘Negotiating the swiftly-flowing river’
  – ‘Moments of contingency’
  – Tightness of regulation (goals vs. horizons)

• After the lesson
  – Structured reflection (e.g., lesson study)

Practical techniques: Questioning

• Improving teacher questioning
  – Generating questions with colleagues
  – Closed vs. open
  – Low-order vs. high-order
  – Appropriate wait-time

• Getting away from I-R-E (initiation-response-evaluation)
  – Basketball rather than serial table-tennis
  – ‘No hands up’ (except to ask a question)
  – Class polls to review current attitudes towards an issue
  – ‘Hot Seat’ questioning

• All-student response systems
  – ABCD cards
  – Mini whiteboards
  – Exit passes

Kinds of questions: Israel

Which fraction is the smallest?

a) 1/6   b) 2/3   c) 1/3   d) 1/2

Success rate: 88%

Which fraction is the largest?

a) 4/5   b) 3/4   c) 5/8   d) 7/10

Success rate: 46%; 39% chose (b)

[Vinner, PME conference, Lahti, Finland, 1997]
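(As decimals, 4/5 = 0.80, 3/4 = 0.75, 7/10 = 0.70 and 5/8 = 0.625, so (a) is the largest. Distractor (b) presumably attracts students who carry over the “smaller denominator means larger fraction” comparison that happens to work on the first item.)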

Questioning in math: discussion

Look at the following sequence: 3, 7, 11, 15, 19, …

Which is the best rule to describe the sequence?

A. n + 4
B. 3 + n
C. 4n − 1
D. 4n + 3
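(Note that C and D both generate the sequence, depending on where counting starts: with n = 1, 2, 3, … the rule 4n − 1 gives 3, 7, 11, …, and with n = 0, 1, 2, … the rule 4n + 3 gives the same terms. This ambiguity is part of what makes the item productive for discussion.)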

Questioning in math: diagnosis

In which of these triangles is a² + b² = c²?

[Six triangles, labeled A to F, drawn in different orientations, with the sides labeled a, b and c in varying positions]
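(By the Pythagorean theorem, a² + b² = c² holds exactly when the angle between the sides labeled a and b is a right angle, so that c is the hypotenuse, regardless of how the triangle is oriented.)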

Questioning in science: discussion

Ice-cubes are added to a glass of water. What happens to the level of the water as the ice-cubes melt?

A. The level of the water drops

B. The level of the water stays the same

C. The level of the water increases

D. You need more information to be sure
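(By Archimedes’ principle, a floating ice-cube displaces its own weight of water; when it melts it becomes exactly that weight of water, so for freely floating cubes the level stays the same.)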

Questioning in science: diagnosis

The ball sitting on the table is not moving. It is not moving because:

A. no forces are pushing or pulling on the ball
B. gravity is pulling down, but the table is in the way
C. the table pushes up with the same force that gravity pulls down
D. gravity is holding it onto the table
E. there is a force inside the ball keeping it from rolling off the table

Wilson & Draney, 2004

Questioning in English: discussion

• Macbeth: mad or bad?

Questioning in English: diagnosis

Where is the verb in this sentence?

The dog ran across the road

[Options A to D label different words in the sentence]

Questioning in English: diagnosis

Where does the subject end and the predicate begin in this sentence?

The dog ran across the road

[Options A to D mark different break-points in the sentence]

Questioning in English: diagnosis

Which of these is a good thesis statement?

A. The typical TV show has 9 violent incidents

B. There is a lot of violence on TV

C. The amount of violence on TV should be reduced

D. Some programs are more violent than others

E. Violence is included in programs to boost ratings

F. Violence on TV is interesting

G. I don’t like the violence on TV

H. The essay I am going to write is about violence on TV

Questioning in history: discussion

In which year did World War II begin?

A. 1919

B. 1937

C. 1938

D. 1939

E. 1941

Questioning in history

Why are historians concerned with bias when analyzing sources?

A. People can never be trusted to tell the truth

B. People deliberately leave out important details

C. People are only able to provide meaningful information if they experienced an event firsthand

D. People interpret the same event in different ways, according to their experience

E. People are unaware of the motivations for their actions

F. People get confused about sequences of events

Hinge Questions

• A hinge question is based on the important concept in a lesson that is critical for students to understand before you move on in the lesson.
• The question should fall about midway through the lesson.
• Every student must respond to the question within two minutes.
• You must be able to collect and interpret the responses from all students in 30 seconds.

Practical techniques: feedback

• Comment-only grading

• Focused grading

• Explicit reference to rubrics

• Suggestions on how to improve
  – ‘Strategy cards’ with ideas for improvement
  – Not giving complete solutions

• Re-timing assessment
  – e.g., a two-thirds-of-the-way-through-a-unit test

Practical techniques: sharing learning expectations

• Explaining learning objectives at start of lesson/unit

• Criteria in students’ language
• Posters of key words to talk about learning
  – e.g., describe, explain, evaluate
• Planning/writing frames
• Annotated examples of different standards to ‘flesh out’ assessment rubrics (e.g., lab reports)
• Opportunities for students to design their own tests

Practical techniques:peer and self-assessment

• Students assessing their own/peers’ work
  – with scoring guides, rubrics or exemplars
  – two stars and a wish
• Training students to pose questions
• Identifying group weaknesses
• Self-assessment of understanding
  – Red/green discs
  – Traffic lights
  – Smiley faces
  – Post-it notes
• End-of-lesson student review

Concept cards

• On the colored index cards, write a sentence or two, or give an example, to explain each of the following five ideas (if you’re not sure, ask a question instead):
  – Questioning: yellow
  – Feedback: orange
  – Sharing criteria: green
  – Self-assessment: red
  – Peer-assessment: blue

Professional development must be

• Consistent with what we know about adult learning, incorporating
  – choice
  – respect for prior experience
  – recognition of varied learning styles and motivation
• Sustained
• Contextualized
• Consistent with research on expertise

Expertise (Berliner, 1994)

1. Experts excel mainly in their own domain.
2. Experts often develop automaticity for the repetitive operations that are needed to accomplish their goals.
3. Experts are more sensitive to the task demands and social situation when solving problems.
4. Experts are more opportunistic and flexible in their teaching than novices.
5. Experts represent problems in qualitatively different ways than novices.
6. Experts have fast and accurate pattern-recognition capabilities; novices cannot always make sense of what they experience.
7. Experts perceive meaningful patterns in the domain in which they are experienced.
8. Experts begin to solve problems more slowly, but bring richer and more personal sources of information to bear on the problem that they are trying to solve.

Countdown

Numbers available: 25, 9, 4, 3, 1

Target number: 127
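(One possible solution: 25 × 4 + 9 × 3 = 100 + 27 = 127, with the 1 left unused.)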

Klein & Klein (1981)

• Six video extracts of a person delivering cardio-pulmonary resuscitation (CPR)
  – 5 of the extracts show students; 1 shows an expert
• Videos shown to three groups: students, experts, instructors
• Success rate in identifying the expert:
  – Experts: 90%
  – Students: 50%
  – Instructors: 30%

Chess (Newell & Simon, 1973)

Pieces correctly recalled, by expertise:

Positioning           High   Middle   Low
Actual position        16      8       4
Random positioning      3     3.5      4

A model for teacher learning

• Ideas

• Evidence

• Small steps

• Flexibility

• Choice

• Accountability

• Support

Why Teacher Learning Communities?

• Teacher as local expert

• Sustained over time

• Supportive forum for learning

• Embedded in day-to-day reality

• Domain-specific

A four-part model

• Initial workshops

• TLC meetings

• Peer observations

• Training for leaders

Learning Log

Please use at least three of the following sentence starters to share your thoughts on today’s sessions. Please write your responses on the lined NCR paper.

• Today I learned…
• I was surprised by…
• The most useful thing I will take from these sessions is…
• I was interested in…
• What I liked most about today was…
• One thing I’m not sure about is…
• The main thing I want to find out more about is…
• After these sessions, I feel…
• I might have got more from today if…

Please feel free to add any additional comments that you think we should know.

NSDC Evaluation