Nguyễn Ái Hoàng Châu, Lữ Thị Ngọc Lan, Võ Duy Minh, Bùi Thị Minh Nguyệt, Nguyễn...

Post on 14-Dec-2015


Nguyễn Ái Hoàng Châu
Lữ Thị Ngọc Lan
Võ Duy Minh
Bùi Thị Minh Nguyệt
Nguyễn Hồng Lệ Ngọc
Lê Đức Thịnh

What is reliability?

How to identify the reliability of a test

• stable over time

• consistent in terms of the content sampling

• free from bias

If a test is reliable, a test-taker who is re-examined with the same test on different occasions, with different sets of equivalent items, or under variable examining conditions will obtain the same score.

[Figure: four combinations of validity and reliability]

• good validity, poor reliability

• poor validity, good reliability

• poor validity, poor reliability

• good validity, good reliability

• a type of validity evidence

• a posteriori validity evidence

• “scoring validity”

• a measure of the stability of test scores

• a prerequisite for measurement validity

Types of scoring validity (methods of estimating reliability)

• Test-retest reliability

• Internal consistency

• Marker reliability

• Parallel forms reliability

Test-retest reliability

• the same test is given twice to the same test-takers

• the period between the two administrations is long enough for test-takers to forget the test, but not too long

• no lesson is given between the two administrations
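The slides do not name a statistic for comparing the two administrations. A common choice is the Pearson correlation between the two sets of scores, and the sketch below assumes that choice. The scores and the names `first_sitting` and `second_sitting` are hypothetical, invented only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of the same six test-takers on two occasions.
first_sitting = [55, 62, 70, 48, 81, 66]
second_sitting = [57, 60, 72, 50, 79, 68]

# A value close to 1 suggests the scores are stable over time.
test_retest_r = pearson(first_sitting, second_sitting)
print(f"test-retest reliability estimate: {test_retest_r:.2f}")
```

The same function could be applied to scores from two equivalent forms of a test, which is the parallel forms case.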

Parallel forms reliability

• two different but equivalent forms of the test are given to the same test-takers

• the two forms can be administered in close succession

Lê Đức Thịnh

Internal consistency

• A variation of parallel forms reliability: the parallel forms statistic is applied to a single test.

• The test is divided into two halves that are correlated with each other, or the correlation of each item with the other items in the test is estimated.

• Focuses on how consistent a test's internal elements are with each other.

Internal consistency correlation

Split half reliability

Average inter-item correlation

Average item-total correlation

Internal consistency correlation

Split half reliability

[Figure: a six-item measure (Item 01 to Item 06) is split into two halves, Items 01, 03, 04 and Items 02, 05, 06; the correlation between the two halves is .87.]
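The split in the figure can be sketched as follows, using the same grouping of items (01, 03, 04 against 02, 05, 06). The item scores are hypothetical. The Spearman-Brown step at the end is not mentioned in the slides; it is a commonly used correction that projects the half-test correlation to the full test length, and it is included here only as an assumption about how the .87 might be taken further.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(half_r):
    """Project a half-test correlation to the full test length."""
    return 2 * half_r / (1 + half_r)


# Hypothetical data: one row per test-taker, one column per item (Item 01..06).
scores = [
    [4, 5, 4, 4, 5, 5],
    [2, 1, 2, 3, 2, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 4, 4],
    [1, 2, 1, 2, 1, 2],
]

half_a_items = [0, 2, 3]  # Items 01, 03, 04, as in the figure
half_b_items = [1, 4, 5]  # Items 02, 05, 06, as in the figure

half_a = [sum(row[i] for i in half_a_items) for row in scores]
half_b = [sum(row[i] for i in half_b_items) for row in scores]

half_r = pearson(half_a, half_b)
print(f"split-half correlation: {half_r:.2f}")
print(f"Spearman-Brown estimate: {spearman_brown(half_r):.2f}")
```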

Internal consistency correlation

Average inter-item correlation

[Figure: a six-item measure (Item 01 to Item 06) and the correlation matrix of its items]

        i1     i2     i3     i4     i5     i6
i1    1.00
i2     .89   1.00
i3     .91    .92   1.00
i4     .88    .93    .95   1.00
i5     .84    .86    .92    .85   1.00
i6     .88    .91    .95    .87    .85   1.00

Average inter-item correlation: .90
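As a check on the figure, the average inter-item correlation is the mean of the 15 correlations below the diagonal of the matrix. The sketch below uses the values shown on the slide; the variable names are illustrative. With these displayed values the mean comes out at about .89, close to the .90 reported on the slide; the small difference presumably comes from rounding in the displayed correlations.

```python
# Lower triangle of the item correlation matrix from the slide,
# one row per item i2..i6 (the 1.00 diagonal is left out).
lower_triangle = [
    [0.89],
    [0.91, 0.92],
    [0.88, 0.93, 0.95],
    [0.84, 0.86, 0.92, 0.85],
    [0.88, 0.91, 0.95, 0.87, 0.85],
]

pair_correlations = [r for row in lower_triangle for r in row]
average_inter_item = sum(pair_correlations) / len(pair_correlations)

print(f"{len(pair_correlations)} item pairs")
print(f"average inter-item correlation: {average_inter_item:.3f}")
```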

Internal consistency correlation

Average item-total correlation

[Figure: a six-item measure (Item 01 to Item 06) and the correlation matrix of its items, with an added row for the total score]

         i1     i2     i3     i4     i5     i6   Total
i1     1.00
i2      .89   1.00
i3      .91    .92   1.00
i4      .88    .93    .95   1.00
i5      .84    .86    .92    .85   1.00
i6      .88    .91    .95    .87    .85   1.00
Total   .84    .88    .86    .87    .83    .82   1.00

Average item-total correlation: .85
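The item-total figure can be reproduced in two steps: correlate each item with the total score, then average those correlations. The sketch below first averages the Total row shown on the slide, then shows how the same quantities could be computed from raw item scores. The raw scores are hypothetical, and the total here includes the item being correlated (an uncorrected item-total correlation), which is an assumption since the slides do not say which variant was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Total row from the slide: correlation of i1..i6 with the total score.
slide_item_total = [0.84, 0.88, 0.86, 0.87, 0.83, 0.82]
slide_average = sum(slide_item_total) / len(slide_item_total)
print(f"average from the slide values: {slide_average:.2f}")

# Hypothetical raw data: one row per test-taker, one column per item.
scores = [
    [4, 5, 4, 4, 5, 5],
    [2, 1, 2, 3, 2, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 4, 4],
    [1, 2, 1, 2, 1, 2],
]

totals = [sum(row) for row in scores]
item_total = [
    pearson([row[i] for row in scores], totals)
    for i in range(len(scores[0]))
]
average_item_total = sum(item_total) / len(item_total)
print(f"average from the hypothetical raw scores: {average_item_total:.2f}")
```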

Internal consistency estimation

• Excel correlation

• Kuder-Richardson 20 or 21

• Cronbach’s alpha
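For readers who want to see how two of these estimates are computed, the sketch below implements the standard formulas for Cronbach's alpha and Kuder-Richardson 20. The slides give no formulas or data, so the function names and the right/wrong (1/0) responses are hypothetical. Population variances are used throughout, which is an assumption; with that choice KR-20 and alpha agree on dichotomous items.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha; rows are test-takers, columns are items."""
    k = len(scores[0])
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def kr20(scores):
    """Kuder-Richardson 20 for items scored 1 (right) or 0 (wrong)."""
    n, k = len(scores), len(scores[0])
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n  # proportion answering correctly
        sum_pq += p * (1 - p)
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / total_variance)


# Hypothetical responses of six test-takers to four right/wrong items.
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

print(f"Cronbach's alpha: {cronbach_alpha(responses):.3f}")
print(f"KR-20:            {kr20(responses):.3f}")
```

Cronbach's alpha also works for items scored on a scale (for example 1 to 5), whereas KR-20 applies only to dichotomous items.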

Internal consistency

Advantages:

• Saves time and expense.

• Tends to give higher reliability values than the test-retest and parallel forms methods.

Disadvantages:

• Says nothing about the temporal stability of the scores, as they result from a single administration of the test.

• It is not easy to determine the level of difficulty of the items.

• The items in one half may not be equivalent to the items in the other half.

Threats to test reliability

• environmental factors

• construct, content, and theory-based validity

• defining the level of difficulty or ease of the items

• defining the level of difficulty of reading texts and their questions

Bùi Thị Minh Nguyệt

Marker reliability

• relates chiefly to tests in which samples of writing or speaking are produced

• concerns the consistency of the marker(s)

Marker reliability

• intra-rater reliability: each marker needs to be consistent within himself/herself

• inter-rater reliability: markers need to be consistent with each other
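The slides do not say how consistency between markers is measured. One widely used option, assumed here, is to report the proportion of exact agreements together with Cohen's kappa, which corrects that proportion for agreement expected by chance. The band scores and marker names below are hypothetical.

```python
from collections import Counter


def percent_agreement(a, b):
    """Proportion of scripts given the same band by both markers."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two markers rating the same scripts."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Agreement expected by chance from each marker's band frequencies.
    expected = sum(counts_a[band] * counts_b[band] for band in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical band scores given by two markers to the same eight essays.
marker_a = [3, 4, 2, 5, 3, 4, 2, 5]
marker_b = [3, 4, 2, 4, 3, 4, 3, 5]

print(f"exact agreement: {percent_agreement(marker_a, marker_b):.2f}")
print(f"Cohen's kappa:   {cohen_kappa(marker_a, marker_b):.2f}")
```

The same functions could be applied to one marker's scores on two occasions to look at intra-rater consistency.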

How to improve marker reliability

• Have explicit, agreed criteria for carrying out the marking task: analytic scales or holistic scales

• Standardization

• Moderation of scores (Multi-faceted Rasch - MFR)

Thank you for listening