Post on 01-Nov-2014
Second Language Assessment
Andrew Cohen
Misuses of Tests
• Tests were used as a punishment.
• Tests were administered instead of teachers giving instruction.
• The tests were the only measure used for grading.
• The tests did not reflect what was taught.
• The tests were returned without corrections or explanations.
• The tests reflected only one testing method.
• Teachers lacked confidence in their own tests.
• Students were not adequately trained to take the tests.
• There was a substantial delay in returning the tests.
A more constructive way of language testing exists when:
• Testing is seen as an opportunity for interaction between teacher and student.
• Students are judged based on the knowledge they have.
• The tests are intended to help students improve their skills.
• The criteria for success on the test are clear to students.
• Students receive a grade for their performance on a set of tests representing different testing methods.
• The test takers are trained in how to take tests, especially those involving unfamiliar formats.
• The tests are returned promptly.
• The results are discussed.
Prepared by:
Marie Joy M. Anhaw
THEORETICAL FOUNDATIONS
Primary Functions of Language Assessment
1. Administrative
a. assessment
b. placement
c. exemption
d. certification
e. promotion
2. Instructional
a. diagnosis
b. evidence of progress
c. feedback to the respondent
d. evaluation of teaching or curriculum
3. Research Purposes
a. evaluation
b. experimentation
c. knowledge about language learning and language use
Proficiency tests are intended for administrative purposes.
Achievement tests are intended for assessing instructional outcomes.
Distinctions in Testing
1. Norm-referenced Assessment
- A test can be used to compare a respondent with other respondents, whether locally, regionally, or nationally.
2. Criterion-referenced Assessment
- A test can be used to see whether a respondent has met a certain instructional objective or criterion.
Components of Communicative Competence
1. Grammatical Competence - encompasses knowledge of lexical items and of rules of morphology, syntax, sentence-grammar semantics, and phonology (Canale and Swain 1980)
2. Discourse Competence - the ability to connect sentences in stretches of discourse and to form a meaningful whole out of a series of utterances
3. Sociolinguistic Competence - involves knowledge of the sociocultural rules of language
4. Strategic Competence - the verbal and nonverbal communication strategies that may be called into action to compensate for breakdowns in communication due to performance variables or to insufficient competence
Prepared by:
Hilda D. Carreon
ASSESSING LANGUAGE SKILLS
Methods of Testing Reading
•Learners use a certain TYPE(S) of READING
•Comprehend at a certain level or combination of LEVELS OF MEANING
•Enlist a certain COMPREHENSION SKILL(S)
•And do all of this within the framework of a certain TESTING METHOD(S)
I. TYPES OF READING
A. Skimming or Scanning
A distinction has been made between scanning and search reading
Search reading – the respondent is scanning without being sure about the form that information will take (i.e., whether it will be a WORD, PHRASE, SENTENCE, PASSAGE, and so on)
B. Read Receptively and Read Responsively
Read receptively
- discovering accurately what the author seeks to convey
Read responsively
- written material prompts readers to reflect on some point or other, and then possibly respond in writing
II. FOUR LEVELS OF MEANING
o Grammatical meaning
– meaning that words and morphemes have on their own
o Propositional meaning
– meaning that a clause or sentence can have on its own (i.e., the information that the clause or sentence transmits)
- this meaning is also referred to as its “INFORMATIONAL VALUE”
o Discoursal meaning
– meaning a sentence can have only when in a context
– This meaning is also referred to as its “FUNCTIONAL VALUE”
o Writer’s Intent
- the meaning that a sentence has only as part of the interaction between writer and reader
- “author’s tone”
III. COMPREHENSION SKILLS
(Alderson 1987)
(i) The ability to recognize words and phrases of similar and opposing meaning
(ii) Identifying or locating information
(iii) Discriminating elements or features within context: the analysis of elements within a structure and of the relationships among them, e.g., causal, sequential, chronological, hierarchical
(iv) Interpreting complex ideas, actions, events, relationships
(v) Inferencing – deriving conclusions and predicting the continuation
(vi) Synthesis
(vii) Evaluation
IV. TESTING METHODS
A. The Cloze and the C-Test
The Cloze Test
- one- or two-word deletions (e.g., every nth word)
- rational deletion
- partial deletion from the beginning or end of words
The C-Test (Klein-Braley and Raatz)
- The second half of every other word is deleted, leaving the first and last sentence of the passage intact
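The two deletion schemes can be sketched in a few lines of Python. This is an illustrative simplification, not a full test-construction tool: the `c_test` sketch applies the deletion rule to a single sentence, whereas an actual C-test leaves the first and last sentences of the passage intact.

```python
def cloze(text, n=7, blank="____"):
    """Classic cloze: replace every nth word with a blank."""
    words = text.split()
    return " ".join(blank if (i + 1) % n == 0 else w
                    for i, w in enumerate(words))

def c_test(sentence):
    """C-test-style deletion: remove the second half of every other word,
    keeping the first half (rounded up) and marking deleted letters."""
    out = []
    for i, w in enumerate(sentence.split()):
        if i % 2 == 1:  # every other word
            keep = (len(w) + 1) // 2
            out.append(w[:keep] + "_" * (len(w) - keep))
        else:
            out.append(w)
    return " ".join(out)
```

For example, `c_test("the quick brown fox jumps")` yields `"the qui__ brown fo_ jumps"`.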
B. Computerized Adaptive Testing (CAT)
- The selection and sequencing of items depend on the pattern of success and failure experienced by the student.
Advantages:
• Individual testing time may be reduced
• Frustration and fatigue are minimized
• Boredom is reduced
• Test scores and diagnostic feedback may be provided immediately
• Test security may be enhanced (since it is unlikely that two respondents would receive the same items in the same sequence)
• Record-keeping functions are improved
• Information is readily available for research purposes (Larson and Madsen 1985, Madsen 1986)
Disadvantage: CAT presumes that one major language factor or underlying trait is being measured at a time.
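The item-selection logic underlying CAT can be illustrated with a minimal staircase sketch: move up one difficulty level after a correct answer and down one after an incorrect answer. This is a toy model for illustration only, not how operational CAT systems (which typically use item response theory) are built.

```python
def adaptive_sequence(answer, levels=5, start=3, n_items=6):
    """Toy staircase item selection: `answer(level)` returns True/False
    for an item at that difficulty; the next item's level moves up after
    a success and down after a failure, clamped to [1, levels]."""
    level, administered = start, []
    for _ in range(n_items):
        correct = answer(level)
        administered.append((level, correct))
        level = min(levels, level + 1) if correct else max(1, level - 1)
    return administered
```

A simulated respondent who can handle items only up to level 3 quickly oscillates between levels 3 and 4, which is how the procedure homes in on the respondent's ability.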
C. Communicative Test of Reading Comprehension
-Canale (1984) points out that a good test is not just one that is acceptable -- that is, accepted as fair, important, and interesting by test takers and test users
-Also, a good test has feedback potential, rewarding both test takers and test users with clear, rich, relevant, and generalizable information
Storyline Test – test with a thematic line of development
“Proficiency-oriented achievement tests” (Canale 1985):
• Such tests put to use what is learned; there is a transfer from controlled training to real performance.
• There is a focus on the message and the function, not just on the form.
• There is group collaboration as well as individual work, not just the latter.
• The respondents are called upon to use their resourcefulness in resolving authentic problems in language use, as opposed to demonstrating accuracy in resolving contrived problems at the linguistic level.
• The testing itself is more like learning, and the learners are more involved in the assessment.
Prepared by:
Deiniol Audbert L. Garces
TEST CONSTRUCTION AND ADMINISTRATION
Inventory of Objectives
• Test constructors first make an inventory of the objectives they want to test
• Distinguish broad objectives from more specific ones, and important objectives from trivial ones.
• Varying the type of items or procedures testing a particular objective helps distinguish one student’s comprehension from that of another student.
• Testers may need to resist the temptation to include difficult items of marginal importance simply because they distinguish the better and poorer achievers.
Constructing an Item Bank
1. The skill or combination of skills tested
2. The language element(s) involved
3. The item-elicitation and item-response formats
4. Instructions on how to present the item
5. The section of the book or part of the course that the item relates to
6. The time it took to write the item
Test Format
• An effective way to hold the interest of a respondent towards a test is to start the test with relatively easy items and then continue it by interspersing easy and difficult items.
• Multiple-choice items lend themselves to guessing by respondents.
Instructions
• The instructions should be brief and yet explicit and unambiguous
• Examples may help, but on the other hand, may hinder if they do not give the whole picture and become a substitute for reading instructions.
• The time allowed for each subtest and/or for the total test should be announced
Scoring
• If an objective is tested by more than one item, then the focus is on mastery of the objective.
• Items covering one objective may be weighted more than the items covering another objective.
• Ex. The scoring of a multiple-choice test would be considered more objective than that of an essay test, where the scorer’s subjectivity plays more of a role.
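Differential weighting of objectives can be sketched as follows. The weights here are hypothetical, chosen only for illustration; the text itself does not prescribe a particular weighting scheme.

```python
def weighted_score(responses, weights):
    """Score a test in which items carry different weights: sum of
    correctness (0/1) times weight, as a fraction of the maximum."""
    earned = sum(r * w for r, w in zip(responses, weights))
    return earned / sum(weights)
```

For instance, with two items on a heavily weighted objective (weight 2) and two on a lesser one (weight 1), `weighted_score([1, 1, 0, 1], [2, 2, 1, 1])` gives 5/6 rather than the unweighted 3/4.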
Reliability
Three Factors:
1. Test Factors
2. Situational Factors
3. Individual Factors
Validity
1. Face Validity
2. Content Validity
3. Criterion-Related Validity
4. Construct Validity
5. Convergent Validity
Item Analysis
1. Piloting the Test – trying out the test on a population similar to that for which the test is designed
2. Item Difficulty – proportion of correct responses to a test item
3. Item Discrimination – how well an item performs in separating the better students from the poorer ones
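Both indices can be computed directly from a matrix of scores. A minimal sketch, assuming 0/1 item scores and the common upper-lower method (comparing the top and bottom fraction of scorers on the total test):

```python
def item_difficulty(responses):
    """Item difficulty: proportion of respondents answering the item correctly."""
    return sum(responses) / len(responses)

def item_discrimination(item, totals, fraction=0.27):
    """Upper-lower discrimination index: the item's difficulty among the
    top scorers minus its difficulty among the bottom scorers."""
    n = max(1, round(len(totals) * fraction))
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    low, high = order[:n], order[-n:]
    return (sum(item[i] for i in high) - sum(item[i] for i in low)) / n
```

An item answered correctly only by the strongest respondents gets an index near +1; one that better students miss more often than weaker ones goes negative, flagging it for revision.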
Test Revision
• An item should be revised or eliminated if it has a low difficulty or discrimination coefficient.
• If distractors (multiple-choice item options) draw no responses or too many, they should be omitted or altered.
• Results of item analysis should be added to the information in the item bank.