Lesson Six Reliability. Yun-Pi Yuan 2 Contents Definition of reliability Definition of reliability ...

Lesson Six Reliability



Lesson Six

Reliability


Yun-Pi Yuan 2

Contents
- Definition of reliability
- Factors contributing to unreliability
- Types of reliability
- Indication of reliability: reliability coefficient
- Ways of obtaining a reliability coefficient:
  - Alternate/parallel forms
  - Test-retest
  - Split-half and KR-21/KR-20
- Two ways of testing reliability
- How to make tests more reliable


Definition of Reliability (1)

- “The consistency of measures across different times, test forms, raters, and other characteristics of the measurement context” (Bachman, 1990, p. 24).
- If you give the same test to the same testees on two different occasions, the test should yield similar results.


Definition of Reliability (2)

- A reliable test is consistent and dependable.
- Scores are consistent and reproducible.
- Reliability is the accuracy or precision with which a test measures something; that is, the consistency, dependability, or stability of test results.


Factors Contributing to Unreliability

X = T + E (observed score = true score + error score)

Reliability is concerned with freedom from nonsystematic fluctuation.

Fluctuations in:
- the student
- scoring
- test administration
- the test itself


Types of Reliability

- Student- (or person-) related reliability
- Rater- (or scorer-) related reliability
  - Intra-rater reliability
  - Inter-rater reliability
- Test administration reliability
- Test (or instrument-related) reliability


Student-Related Reliability (1)

The source of the error score is the test takers:
- Temporary illness
- Fatigue
- Anxiety
- Other physical or psychological factors
- Test-wiseness (i.e., strategies for efficient test taking)


Student-Related Reliability (2)

Principles:
- Assess on several occasions.
- Assess when the person is prepared and best able to perform well.
- Ensure that the person understands what is expected (e.g., instructions are clear).


Rater (or Scorer) Reliability (1)

Fluctuations: human error, subjectivity, and bias

Principles:
- Use experienced, trained raters.
- Use more than one rater.
- Raters should carry out their assessments independently.


Rater Reliability (2)

Two kinds of rater reliability:
- Intra-rater reliability
- Inter-rater reliability


Intra-Rater Reliability

Fluctuations including:
- Unclear scoring criteria
- Fatigue
- Bias toward particular good and bad students
- Simple carelessness


Inter-Rater Reliability (1)

Fluctuations including:
- Lack of attention to scoring criteria
- Inexperience
- Inattention
- Preconceived biases


Inter-Rater Reliability (2)

Used with subjective tests when two or more independent raters are involved in scoring

Train the raters before scoring (e.g., TWE, dept. oral and composition tests for recommended students).


Inter-Rater Reliability (3)

Compare the scores given to the same testee by different raters. If r is high, there is inter-rater reliability.
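As a rough illustration of this comparison, the sketch below correlates two raters' scores for the same testees. The scores, the variable names, and the choice of Pearson's r as the coefficient are assumptions made for the example; the lesson does not specify which correlation to use.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical composition scores given to the same five testees by two raters.
rater_a = [4, 3, 5, 2, 4]
rater_b = [4, 2, 5, 3, 4]

r = pearson_r(rater_a, rater_b)
print(f"inter-rater r = {r:.2f}")
```

The closer r is to 1, the more consistently the two raters rank the same testees.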


Test Administration Reliability

- Street noise (e.g., during a listening comprehension test)
- Photocopying variations
- Lighting
- Variations in temperature
- Condition of desks and chairs
- Monitors


Test Reliability

Measurement errors come from the test itself:
- Test is too long
- Test with a time limit
- Test format allows for guessing
- Ambiguous test items
- Test with more than one correct answer


Reliability Coefficient (r)

- Quantifying the reliability of a test allows us to compare the reliability of different tests.
- 0 ≤ r ≤ 1 (ideal r = 1, which means the test gives precisely the same results for particular testees regardless of when it happens to be administered).
- If r = 1: 100% reliable
- A good achievement test: r ≥ .90
- If r < .70, the test shouldn’t be used.


How to Get Reliability Coefficient

- Two forms, two administrations: alternate/parallel forms
- One form, two administrations: test-retest
- One form, one administration (internal consistency):
  - split-half (Spearman-Brown procedure)
  - KR-21
  - KR-20


Alternate/Parallel Forms

Two forms, two administrations (one test plan: Form A and Form B)

- Equivalent forms (i.e., different items testing the same topic) are taken by the same test taker on different days.
- If r is high, the test is said to have good reliability.
- This is the most stringent form of reliability estimate.


Test-Retest

One form, two administrations (Test A: Trial 1 and Trial 2)

- The same test is administered twice to the same testees with a short time lag, and r is calculated from the two sets of scores.
- Appropriate for highly speeded tests


Split-half (Spearman-Brown Procedure)

- One test, one administration
- Split the test into halves (e.g., odd-numbered vs. even-numbered questions) to form two sets of scores.
- Also called internal consistency

[Diagram: questions Q1 to Q6 divided into a first half and a second half]
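The odd/even split can be sketched in a few lines. The response matrix below is invented for illustration, and Pearson's r is assumed as the correlation; neither comes from the lesson.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical item scores (1 = right, 0 = wrong):
# one row per testee, six items (Q1..Q6) per row.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Each testee gets two half-test scores from a single administration.
odd_scores = [sum(row[0::2]) for row in responses]   # Q1, Q3, Q5
even_scores = [sum(row[1::2]) for row in responses]  # Q2, Q4, Q6

half_r = pearson_r(odd_scores, even_scores)
print(f"correlation between halves = {half_r:.2f}")
```

Note that this correlation describes two half-length tests, not the full six-item test.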


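Split-half (2)

- Note that the r between the two halves isn’t the reliability of the whole test.
- There is a mathematical relationship between test length and reliability: the longer the test, the more reliable it is.
- Spearman-Brown prophecy formula: $\text{rel}_{\text{total}} = \dfrac{nr}{1 + (n - 1)r}$, where r is the correlation between the parts and n is the number of times the test is lengthened.
- E.g., correlation between the 2 parts of a test: r = .6; reliability of the full test (n = 2) = .75
- If the test is lengthened to 3 times (n = 3): reliability = .82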
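The two worked figures on this slide can be checked with a short function. The function name `spearman_brown` is an illustrative choice, not taken from the lesson.

```python
def spearman_brown(r, n):
    """Predicted reliability when a test with reliability r is lengthened n times."""
    return (n * r) / (1 + (n - 1) * r)

# r = .6 is the correlation between the two halves of the test.
print(round(spearman_brown(0.6, 2), 2))  # 0.75
print(round(spearman_brown(0.6, 3), 2))  # 0.82
```

The gain from added length flattens out as n grows: each further lengthening adds less reliability than the one before.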


Kuder-Richardson Formula 21

$KR\text{-}21 = \dfrac{k}{k-1}\left[1 - \dfrac{\bar{x}\,(1 - \bar{x}/k)}{s^2}\right]$

- k = number of items; $\bar{x}$ = mean; s = standard deviation (for the formula, see Bailey 100)
- The standard deviation describes how spread out a set of scores is (i.e., how far the scores deviate from the mean).
- 0 ≤ s; the larger s is, the more spread out the scores are.
- E.g., 2 sets of scores: (5, 4, 3) and (7, 4, 1). Which group in general behaves more similarly?
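A small sketch of both ideas on this slide: the spread of the two example score sets, and KR-21 computed from total scores. The population forms of the standard deviation and variance (`pstdev`, `pvariance`) are assumed here, since the lesson does not say which form to use, and the total scores on a 10-item test are invented for illustration.

```python
from statistics import mean, pstdev, pvariance

# Spread of the two score sets from the slide (same mean, different spread).
sd_a = pstdev([5, 4, 3])
sd_b = pstdev([7, 4, 1])
print(f"s for (5, 4, 3) = {sd_a:.2f}; s for (7, 4, 1) = {sd_b:.2f}")

def kr21(total_scores, k):
    """KR-21 from testees' total scores on a test of k right/wrong items."""
    x_bar = mean(total_scores)
    s2 = pvariance(total_scores)  # assumption: population variance
    return (k / (k - 1)) * (1 - (x_bar * (1 - x_bar / k)) / s2)

# Hypothetical total scores of five testees on a 10-item test.
totals = [9, 7, 6, 4, 2]
print(f"KR-21 = {kr21(totals, 10):.2f}")
```

The set with the smaller s, (5, 4, 3), is the group whose members behave more similarly.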


Kuder-Richardson Formula 20

$KR\text{-}20 = \dfrac{k}{k-1}\left[1 - \dfrac{\sum pq}{s^2}\right]$

- p = item difficulty (proportion of people who got an item right)
- q = 1 - p (i.e., proportion of people who got an item wrong)
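KR-20 needs item-level data, so a sketch has to start from a matrix of right/wrong responses. The matrix below is invented for illustration, and the population variance of total scores is an assumption, since the lesson does not say which form of s² to use.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for a matrix of 0/1 item scores (rows = testees, columns = items)."""
    n_people = len(responses)
    k = len(responses[0])
    # p = proportion of testees answering each item correctly; q = 1 - p
    p = [sum(row[i] for row in responses) / n_people for i in range(k)]
    sum_pq = sum(p_i * (1 - p_i) for p_i in p)
    # s^2 = variance of the testees' total scores (assumption: population variance)
    s2 = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_pq / s2)

# Hypothetical responses of five testees to six items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

print(f"KR-20 = {kr20(responses):.3f}")
```

Unlike KR-21, which uses only the mean and variance of total scores, KR-20 takes each item's own difficulty into account.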


Ways of Testing Reliability

- Examine the amount of variation: Standard Error of Measurement (SEM). The smaller, the better.
- Calculate the “reliability coefficient” (r). The bigger, the better.


Standard Error of Measurement (1)

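- The average SD of an individual’s scores over a large number of testings
- In essence, the variability of an individual’s scores
- Indicates how large the error component is likely to be
- Particularly useful in the interpretation of test scores
- $SEM = s\sqrt{1 - \text{rel}}$, where s is the standard deviation of the test scores and rel is the reliability coefficient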


Standard Error of Measurement (2)

The average of a set of scores = the “true” score of the individual

$X_1 = T_1 + E_1$

$X_2 = T_2 + E_2$

$\vdots$

$X_n = T_n + E_n$

$\bar{X} = T + 0$
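The idea that errors average out can be illustrated with a small simulation. It assumes the error scores are random, centered on zero, and normally distributed; the true score, the error spread, and the number of administrations are all hypothetical values chosen for the example.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

TRUE_SCORE = 70  # hypothetical true score T of one testee
ERROR_SD = 5     # hypothetical spread of the random error E

# Simulate many administrations: each observed score is X = T + E.
observed = [TRUE_SCORE + random.gauss(0, ERROR_SD) for _ in range(10_000)]

mean_observed = sum(observed) / len(observed)
print(f"mean of observed scores: {mean_observed:.2f} (true score: {TRUE_SCORE})")
```

Any single observed score may be several points off, but the mean over many testings lands close to T because the positive and negative errors cancel.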


Standard Error of Measurement (3)

- E.g., GRE: SD = 100, rel. = .91, so $SEM = 100\sqrt{1 - .91} = 30$
- How do we apply the SEM in the interpretation of the score?
- For a given spread of scores, the greater the reliability coefficient, the smaller the SEM will be.
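One common way to apply the SEM is to report a band around an observed score instead of a single point. The sketch below reproduces the GRE figure from this slide and builds a band of one SEM on either side; the observed score of 500 is hypothetical, and the interpretation assumes roughly normally distributed errors.

```python
from math import sqrt

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

gre_sem = sem(100, 0.91)
print(round(gre_sem))  # 30

# Hypothetical observed score, with a band of one SEM on either side.
observed = 500
band = (observed - gre_sem, observed + gre_sem)
print(f"score band: {band[0]:.0f} to {band[1]:.0f}")
```

Under the normality assumption, about 68% of a testee's observed scores would fall within one SEM of the true score, so two scores closer together than the SEM should not be read as a real difference.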


Ways of Enhancing Reliability

General strategies:
- Consider possible sources of unreliability
- Reduce or average out nonsystematic fluctuations in:
  - raters
  - persons
  - test administration
  - instruments


How to Make Tests More Reliable? (1)

- Take enough samples of behavior
- Try to avoid ambiguous items
- Provide clear and explicit instructions
- Ensure tests are well laid out and perfectly legible
- Provide uniform and undistracted conditions of administration
- Try to use objective tests


How to Make Tests More Reliable? (2)

- Try to use direct tests
- Have independent, trained raters
- Provide a detailed scoring key
- Try to identify the test takers by number, not by name
- Try to have multiple independent scorings in subjective tests

(Hughes, 1989, pp. 36-42).