Principles of language assessment

PRINCIPLES OF LANGUAGE ASSESSMENT Week 2 Sutrisno Sadji Evenddy, M.Pd.

Principles of Language Assessment

- Practicality
- Reliability
- Validity
- Authenticity
- Washback

Practicality

A practical test:

- is not expensive,
- stays within appropriate time constraints,
- is relatively easy to administer,
- has a scoring/evaluation procedure that is specific and time-efficient.

Example of Practicality Checklist

1. Are administrative details clearly established before the test?
2. Can students complete the test reasonably within the set time frame?
3. Is the cost of the test within budget limits?

Reliability

Consistency of assessment results (Linn & Gronlund).

A test is reliable when it meets this condition: “[If] you give the same test to the same student or matched students on two different occasions, the test should yield similar results” (Brown, 2004).
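
The "same test, two occasions, similar results" idea can be checked numerically. The sketch below is one common way to do so, not something the slides prescribe: it computes the Pearson correlation between two sets of scores, where a value near 1 suggests consistent results. The function name `pearson_r` and all scores are hypothetical, made up for illustration.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of the same length (at least 2)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of co-deviations and sums of squared deviations from each mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for the same five students on two occasions.
occasion_1 = [70, 85, 60, 90, 75]
occasion_2 = [72, 83, 62, 91, 74]

print(round(pearson_r(occasion_1, occasion_2), 2))
```

A high correlation here only shows that students kept roughly the same rank order across the two occasions; it says nothing about whether the test measures the right thing.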

Factors that may influence reliability of a test

- Student-related reliability
- Rater reliability
- Test administration reliability
- Test reliability

Student-Related Reliability

The most common learner-related issue in reliability is caused by temporary illness, fatigue, a “bad day”, anxiety, and other physical or psychological factors.

Rater Reliability

Inter-rater reliability: this breaks down when two or more scorers yield inconsistent scores for the same test.

Factors: lack of attention to scoring criteria, inexperience, inattention, etc.
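
Agreement between two scorers can also be quantified. The sketch below uses Cohen's kappa, a standard statistic that the slides do not mention, which corrects raw agreement for the agreement expected by chance. The function name `cohens_kappa`, the band labels, and the ratings are all hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per script."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of the same length")
    n = len(rater_a)
    # Proportion of scripts on which the two raters agree exactly.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one category only")
    return (observed - expected) / (1 - expected)


# Hypothetical bands given by two raters to the same six essays.
rater_1 = ["A", "B", "B", "C", "A", "B"]
rater_2 = ["A", "B", "C", "C", "A", "A"]

print(round(cohens_kappa(rater_1, rater_2), 2))
```

Kappa is 1 for perfect agreement and about 0 when the raters agree no more often than chance would predict, so a low value is a signal to revisit the scoring criteria or rater training.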

Rater Reliability (continued)

Intra-rater reliability: a single scorer can be inconsistent because of unclear scoring criteria, fatigue, bias toward particular “good” and “bad” students, or simple carelessness.

Test Administration Reliability

Unreliability can be caused by the conditions in which the test is administered, e.g. noise from outside, photocopying variations, room conditions, and even the condition of desks and chairs.

Test Reliability

Factors that cause unreliability:

- If a test is too long, test takers may become fatigued by the time they reach the later items and hastily respond incorrectly.
- Ambiguous items.

Validity

“Measuring what should be measured”

- Content-related evidence
- Criterion-related evidence
- Construct-related evidence
- Consequential validity
- Face validity

Content-Related Evidence

A test shows content-related evidence of validity:

- if it samples the subject matter about which conclusions are to be drawn;
- if it requires the test-taker to perform the behavior that is being measured.

Criterion-Related Evidence

Criterion-related evidence is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure that has already been demonstrated to be valid.

Criterion-Related Evidence: Example

Imagine a hands-on driving test that has been shown to be an accurate test of driving skills. By comparing scores on a written driving test with scores from the hands-on driving test, the written test can be validated using a criterion-related strategy.

Criterion-Related Evidence

1. Concurrent validity (empirical validity): a test result is supported by other concurrent performance beyond the assessment itself.

e.g. the validity of a high score on the final exam of a foreign language course will be substantiated by actual proficiency in the language.

2. Predictive validity: the test is used to assess (and predict) a test taker’s likelihood of future success.

e.g. SNMPTN (Indonesia's national selection for state university entrance)

Construct-Related Evidence

How well performance on the assessment can be interpreted as a meaningful measure of some characteristic or quality.

Consequential Validity

How well the use of assessment results accomplishes intended purposes and avoids unintended effects.

Face Validity

It refers to the degree to which a test looks right, and appears to measure the knowledge or ability it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers (Mousavi in Brown, 2004).

Authenticity

- The language is as natural as possible.
- Items are contextualized rather than isolated.
- Topics are meaningful (relevant, interesting) for the learner.
- Some thematic organization to items is provided, such as through a story line or episode.
- Tasks represent, or closely approximate, real-world tasks.

Example of Authenticity

Contextualized: “Going to”

1. What _______ this weekend?
   a. you are going to do
   b. are you going to do
   c. your gonna do

2. I’m not sure. _______ anything special?
   a. Are you going to do
   b. You are going to do
   c. Is going to do

Decontextualized

1. There are three countries I would like to visit. One is Italy.
   a. The other is New Zealand and other is Nepal
   b. The others are New Zealand and Nepal
   c. Others are New Zealand and Nepal

2. When I was twelve years old, I used ______ every day.
   a. swimming
   b. to swimming
   c. to swim

Washback

- The effect of testing on teaching and learning (Hughes in Brown, 2004).
- Generally refers to the effects tests have on instruction in terms of how students prepare for the test (Brown, 2004).

THANK YOU