Nguyễn Ái Hoàng Châu, Lữ Thị Ngọc Lan, Võ Duy Minh, Bùi Thị Minh Nguyệt, Nguyễn Hồng Lệ Ngọc, Lê Đức Thịnh

Page 1:

Nguyễn Ái Hoàng Châu

Lữ Thị Ngọc Lan

Võ Duy Minh

Bùi Thị Minh Nguyệt

Nguyễn Hồng Lệ Ngọc

Lê Đức Thịnh

Page 2:

What is reliability?

How to identify the reliability of a test

Page 3:

• stable over time

• consistent in terms of the content sampling

• free from bias

Page 4:

A test-taker who is re-examined with the same test on different occasions, with different sets of equivalent items, or under variable examining conditions should obtain the same score.

Page 5:

good validity, poor reliability

Page 6:

poor validity, good reliability

Page 7:

poor validity, poor reliability

Page 8:

good validity, good reliability

Page 9:

• a type of validity evidence

• a posteriori validity evidence

• “scoring validity”

• a measure of the stability of test scores

• a prerequisite for measurement validity

Page 10:

Types of scoring validity (methods of estimating reliability)

• Test-retest reliability

• Internal consistency

• Marker reliability

• Parallel forms reliability

Page 11:

Test-retest reliability

• the same test is given twice to the same test-takers

• the period between the two administrations is long enough for test-takers to forget the test, but not too long

• no lesson is given between the two administrations
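
The slide does not say how the two sets of scores are compared. A common choice is the Pearson correlation between the first and second administration, sketched below. The scores and the helper name `pearson_r` are hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of five test-takers on two sittings of the same test.
first_sitting = [52, 61, 70, 45, 88]
second_sitting = [55, 60, 72, 47, 85]

test_retest_r = pearson_r(first_sitting, second_sitting)
print(f"test-retest reliability estimate: {test_retest_r:.2f}")
```

A coefficient close to 1 suggests that test-takers keep roughly the same rank order across the two sittings.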

Page 12:

Parallel forms reliability

• two different but equivalent forms of the test are given to the same test-takers

• the two forms can be administered in close succession

Page 13:

Lê Đức Thịnh

Page 14:

Internal consistency

• A variation of parallel forms reliability

• Applies parallel-forms statistics to a single test, either by dividing the test into two halves and correlating them, or by estimating the correlation of each item in the test with the others

• Focuses on how consistent a test’s internal elements are with each other

Page 15:

Internal consistency correlation

• Split-half reliability

• Average inter-item correlation

• Average item-total correlation

Page 16:

Internal consistency correlation

Split-half reliability

[Figure: the six items of a measure (Item 01 to Item 06) are divided into two halves, Items 01, 03 and 04 versus Items 02, 05 and 06; the correlation between the two half scores is .87.]
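
As a minimal sketch of the split-half procedure, the code below sums each half, correlates the two half scores, and then applies the Spearman-Brown correction. That correction is not mentioned on the slide and is added here as a commonly used step. The response data are hypothetical; only the split (Items 01, 03, 04 versus Items 02, 05, 06) follows the figure.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half(scores, half_a, half_b):
    """Return (correlation between halves, Spearman-Brown corrected estimate)."""
    a_totals = [sum(row[i] for i in half_a) for row in scores]
    b_totals = [sum(row[i] for i in half_b) for row in scores]
    r_halves = pearson_r(a_totals, b_totals)
    # Spearman-Brown: estimate for the full-length test from the half-test correlation.
    return r_halves, 2 * r_halves / (1 + r_halves)


# Hypothetical data: one row per test-taker, one column per item (Item 01..06).
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 2, 3, 2, 1, 2],
    [3, 3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 1, 2, 2],
    [3, 4, 3, 3, 4, 3],
]

# Zero-based column indices for Items 01, 03, 04 and Items 02, 05, 06.
r_halves, r_full = split_half(responses, half_a=(0, 2, 3), half_b=(1, 4, 5))
print(f"half-test correlation: {r_halves:.2f}, corrected: {r_full:.2f}")
```

Because each half is only half as long as the full test, the uncorrected correlation tends to understate the reliability of the whole test, which is why the correction is often reported alongside it.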

Page 17:

Internal consistency correlation

Average inter-item correlation

[Figure: a measure made up of Item 01 to Item 06, with the correlation matrix of the six items.]

      i1    i2    i3    i4    i5    i6
i1  1.00
i2   .89  1.00
i3   .91   .92  1.00
i4   .88   .93   .95  1.00
i5   .84   .86   .92   .85  1.00
i6   .88   .91   .95   .87   .85  1.00

Average inter-item correlation: .90
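
The average inter-item correlation can be sketched as the mean of the correlations over all distinct item pairs (15 pairs for six items). The response data and function names below are hypothetical and are not the data behind the slide's matrix.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_inter_item_correlation(scores):
    """Mean Pearson correlation over every distinct pair of items."""
    items = list(zip(*scores))  # transpose rows (test-takers) into columns (items)
    pair_rs = [pearson_r(a, b) for a, b in combinations(items, 2)]
    return sum(pair_rs) / len(pair_rs)


# Hypothetical data: one row per test-taker, one column per item.
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 2, 3, 2, 1, 2],
    [3, 3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 1, 2, 2],
    [3, 4, 3, 3, 4, 3],
]

avg_r = average_inter_item_correlation(responses)
print(f"average inter-item correlation: {avg_r:.2f}")
```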

Page 18:

Internal consistency correlation

Average item-total correlation

[Figure: a measure made up of Item 01 to Item 06, with the correlation matrix of the six items and the total score.]

         i1    i2    i3    i4    i5    i6   Total
i1     1.00
i2      .89  1.00
i3      .91   .92  1.00
i4      .88   .93   .95  1.00
i5      .84   .86   .92   .85  1.00
i6      .88   .91   .95   .87   .85  1.00
Total   .84   .88   .86   .87   .83   .82  1.00

Average item-total correlation: .85
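
A minimal sketch of the item-total calculation follows: each item is correlated with the total score, and the correlations are averaged. It assumes the total includes the item itself, which is how the Total row above appears to be built. The response data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def item_total_correlations(scores):
    """Return (per-item correlations with the total score, their average)."""
    totals = [sum(row) for row in scores]  # total includes every item (assumption)
    item_rs = [pearson_r(item, totals) for item in zip(*scores)]
    return item_rs, sum(item_rs) / len(item_rs)


# Hypothetical data: one row per test-taker, one column per item.
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 2, 3, 2, 1, 2],
    [3, 3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 1, 2, 2],
    [3, 4, 3, 3, 4, 3],
]

item_rs, avg_item_total = item_total_correlations(responses)
print("item-total correlations:", [round(r, 2) for r in item_rs])
print(f"average item-total correlation: {avg_item_total:.2f}")
```

An item whose correlation with the total is clearly lower than the others is a candidate for review, since it may not be measuring the same thing as the rest of the test.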

Page 19:

Internal consistency estimation

• Excel correlation

• Kuder-Richardson 20 or 21

• Cronbach’s alpha
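
The two named coefficients can be sketched with their standard textbook formulas: Cronbach's alpha from item and total-score variances, and KR-20 from the proportion of correct answers per item. The right/wrong response data are hypothetical, and population variances are used throughout as an assumption, so that the two coefficients coincide for dichotomous items.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(scores[0])
    item_vars = [pvariance(item) for item in zip(*scores)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """Kuder-Richardson 20 for items scored 0 (wrong) or 1 (right)."""
    n, k = len(scores), len(scores[0])
    p = [sum(item) / n for item in zip(*scores)]  # proportion correct per item
    pq_sum = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - pq_sum / total_var)


# Hypothetical data: six test-takers, five right/wrong items.
answers = [
    [1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
]

alpha = cronbach_alpha(answers)
kr = kr20(answers)
print(f"Cronbach's alpha: {alpha:.2f}, KR-20: {kr:.2f}")
```

For 0/1 items the variance of an item equals p(1 - p), which is why KR-20 can be read as a special case of alpha; alpha also applies to items scored on a wider scale.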

Page 20:

Internal consistency

Advantages:

• Saves time and expense

• Higher reliability values compared with the test-retest and parallel forms methods

Disadvantages:

• Gives no evidence of the temporal stability of the scores, as they result from a single administration of the test

• Not easy to determine the level of difficulty of the items

• The items in one half may not be equivalent to the items in the other half

Page 21:

Threats to test reliability

• environmental factors

• construct, content and theory-based validity

• defining the level of difficulty or ease of the items

• defining the level of difficulty of reading texts and their questions

Page 22:

Bùi Thị Minh Nguyệt

Page 23:

Marker reliability

• relates chiefly to tests in which samples of writing or speaking are produced

• concerns the consistency of the marker(s)

Page 24:

Marker reliability

• intra-rater reliability: each marker needs to be consistent with himself or herself

• inter-rater reliability: markers need to be consistent with each other
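
The slide does not name a statistic for marker consistency. One widely used option for two markers who assign band scores is Cohen's kappa, which adjusts raw agreement for agreement expected by chance. The sketch below uses hypothetical band scores for eight scripts; applied to one marker's first and second marking of the same scripts, the same function gives an intra-rater estimate.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    n = len(ratings_a)
    # Observed agreement: share of scripts given the same band by both markers.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: from each marker's own distribution of bands.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[band] * counts_b[band] for band in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical band scores given by two markers to the same eight scripts.
marker_a = [3, 4, 4, 2, 5, 3, 4, 2]
marker_b = [3, 4, 3, 2, 5, 3, 4, 3]

kappa = cohens_kappa(marker_a, marker_b)
print(f"inter-rater agreement (Cohen's kappa): {kappa:.2f}")
```

Kappa treats bands as unordered categories, so a one-band disagreement counts the same as a three-band one; a weighted variant or a correlation coefficient may suit ordered scales better.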

Page 25:

How to improve marker reliability

• Have explicit, agreed criteria for carrying out the marking task

• Analytic scales

• Holistic scales

• Standardization

• Moderation of scores (Multi-faceted Rasch, MFR)

Page 26:

Thank you for listening