Assessment and testing language

María Lucely Kantún Tzuc Universidad Vizcaya de las Américas Campus Chetumal


ASSESSMENT

DEFINITION: The process of making a judgment or forming an opinion, after considering something or someone carefully.

PURPOSE:

Inform people about their progress in learning.

TYPES OF ASSESSMENT

Tests or exams.

The most traditional way to assess students.

Activities.

Self-assessment.

Students assess their own learning themselves.

When should teachers assess?

At every stage of the learning process, and very frequently.


LANGUAGE TESTING

WASHBACK: It refers to the impact that tests have on teaching and learning.

Alderson and Wall (1993) list some «Washback Hypotheses»:

Tests will have washback on what teachers teach (Content).

Tests also have an impact on how teachers teach (methodology).

High-stakes tests (tests with important consequences) would have more impact than low-stakes tests.

Alderson and Hamp-Lyons (1996) show that teachers may indeed change the way they teach when teaching towards a test (in this case, the TOEFL, the Test of English as a Foreign Language).

They show that the nature of the change and the methodology adopted vary from teacher to teacher (supported by Watanabe’s findings, 1996).

Shohamy et al. (1996) show that the nature of washback varies according to factors such as the status of the language being tested, and the uses of the test.

Watanabe concludes that washback is caused by the interplay between the test and the test taker in a complex manner.

He emphasises that what may be most important is not the objective difficulty of the test, but the students' perception of difficulty.

Wall summarises research findings which show that test design is only one of the factors affecting washback, and lists further factors that influence the nature of test washback.

She makes a number of recommendations about the steps that test developers might take in the future in order to assess the amount of risk involved in attempting to bring about change through testing: assessing the feasibility of examination reform by studying the 'antecedent' conditions, in what is increasingly referred to as a 'baseline study'.

Policy makers should be aware that tests on their own will not have positive impact if the materials and practices they are based on have not been effective.

ETHICS IN LANGUAGE TESTING

Messick (1994) argues that all testing involves making value judgements, and therefore language testing is open to a critical discussion of whose values are being represented and served.

Spolsky (1997) points out that tests and examinations have always been used as instruments of social policy and control, with the gate-keeping function of tests often justifying their existence.

Shohamy (1997a) argues that uses of tests which exercise control and manipulate stakeholders rather than providing information on proficiency levels are also unethical, and she advocates the development of “critical language testing”.

A number of case studies have been presented recently which illustrate the use and misuse of language tests.

Hawthorne (1997) describes two examples of the misuse of language tests: the use of the access test to regulate the flow of migrants into Australia, and the step test, allegedly designed to play a central role in the determining of asylum seekers' residential status.

Norton and Starfield (1997) argue that criteria for assessment should be made explicit and public if testers are to behave ethically.

The International Language Testing Association (ILTA) has recently developed a Code of Ethics (rather than finalising its draft Code of Practice), which is 'a set of principles which draws upon moral philosophy and strives to guide good professional conduct'.

This Code is clear: testers should follow ethical practices, and have a moral responsibility to do so.

Tests are frequently used as instruments of educational policy, and they can be very powerful — as attested by Shohamy (2001a).

Brindley (1998, 2001) describes the political use of test-based assessment for reasons of public accountability, often in the context of national frameworks, standards or benchmarking.

Politics can be defined as action, or activities, to achieve power or to use power, and as beliefs about government, attitudes to power, and to the use of power.

POLITICS

National educational policy often involves innovations in testing in order to influence the curriculum, or in order to open up or restrict access to education and employment.

Politics can be seen as methods, tactics, intrigue, manoeuvring, within institutions which are themselves not political, but commercial, financial and educational.

STANDARDS IN TESTING

NORMS

COMMON EUROPEAN FRAMEWORK

Levels of proficiency

Level certified by public examinations

Guarantee educational and employment mobility

EUROPEAN COMMON FRAMEWORK

The development of national language tests continues to be the focus of many publications, although many are either simply descriptions of test development or discussions of controversies, rather than reports on research done in connection with test development.

Page (1993) argues that the use of the target language in questions makes it more difficult to sample the syllabus adequately, and claims that the more communicative and authentic the tasks in examinations become, the more English (the mother tongue) has to be used on the examination paper in order to safeguard both the validity and the authenticity of the task.

NATIONAL TESTS

In the Netherlands, Jansen & Peer (1999) report a study of the recently introduced use of dictionaries in Dutch foreign language examinations and show that dictionary use does not have any significant effect on test scores.

Pupils are very positive about being allowed to use dictionaries, claiming that it reduces anxiety and enhances their text comprehension.

Guillon (1997) recommends that more open-ended tasks be used, and that teachers be trained in the reliable use of valid criteria for subjective marking, instead of their current practice of merely counting errors in production.

Language testing can inform debates in language education more generally.

Washback studies have also been used in teacher training, both to influence test preparation practices and to encourage teachers to reflect on the reasons for their own and others' practices.

Douglas (1997, 2000) identifies two aspects that typically distinguish LSP testing from general purpose testing:

a) The authenticity of the tasks.

b) The interaction between language knowledge and specific content knowledge.

The development of an LSP test typically begins with an in-depth analysis of the target language use situation, perhaps using genre analysis (see Tarone, 2001).

Attention is paid to general situational features such as topics, typical lexis and grammatical structures.

LSP (Language for Specific Purposes)

Douglas (2000) stands firmly by claims made much earlier in the decade that in highly field-specific language contexts, a field-specific language test is a better predictor of performance than a general purpose test (Douglas & Selinker, 1992).

Computer-based testing has witnessed rapid growth in the past decade and computers are now used to deliver language tests in many settings.

A computer-based version of the TOEFL was introduced on a regional basis in the summer of 1998, tests are now available on CD-ROM, and the Internet is increasingly used to deliver tests to users.

Computers can be used at all stages in the test development and administration process.

The commonest use of computers in language testing is to deliver tests adaptively (e.g., Young et al., 1996).

COMPUTER-BASED TESTING

Computer-based testing advantages

Candidates are presented with items at their level of ability.

Computer-adaptive tests (CATs) are typically quicker to deliver.
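
The idea of presenting items at the candidate's level can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `run_adaptive_test`, the numeric difficulty scale and the simple "step up after a correct answer, step down after an incorrect one" rule are hypothetical, and real CATs such as the computer-based TOEFL rely on more elaborate statistical models.

```python
def run_adaptive_test(item_difficulties, answer_fn, start_ability=0.0,
                      step=1.0, max_items=5):
    """Toy adaptive test (hypothetical, for illustration only).

    item_difficulties: list of numbers, one difficulty per item.
    answer_fn: callable taking an item's difficulty and returning
               True if the candidate answers it correctly.
    Returns (final ability estimate, indices of the items administered).
    """
    ability = start_ability
    remaining = set(range(len(item_difficulties)))
    administered = []

    while remaining and len(administered) < max_items:
        # Choose the unused item whose difficulty is closest to the
        # current ability estimate (ties go to the lower index).
        item = min(remaining,
                   key=lambda i: (abs(item_difficulties[i] - ability), i))
        remaining.remove(item)
        administered.append(item)

        # Raise the estimate after a correct answer, lower it otherwise.
        correct = answer_fn(item_difficulties[item])
        ability += step if correct else -step

        # Halve the step so the estimate settles as the test goes on.
        step /= 2

    return ability, administered


# Simulated candidate who answers correctly any item of difficulty <= 1.5.
estimate, items = run_adaptive_test([-2, -1, 0, 1, 2],
                                    lambda d: d <= 1.5, max_items=3)
print(estimate, items)
```

Because each item is chosen close to the current estimate, very easy and very hard items are skipped, which is one reason adaptive tests tend to need fewer items than fixed-form tests.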

The introduction of self-assessment was viewed as promising by many, especially in formative assessment contexts (Oscarson, 1989).

It was considered to encourage increasing sophistication in learner awareness, helping learners to:

A) gain confidence in their own judgement.

B) acquire a view of evaluation that covers the whole learning process.

C) see errors as something helpful.

It was also seen to be potentially useful to teachers, providing information on learning styles, on areas needing remediation and feedback on teaching (Barbot, 1991).

SELF-ASSESSMENT

Carton (1993) discusses how self-assessment can become part of the learning process.

In general, these studies have found that self-assessment is a robust method for gathering information about learner proficiency and that the risk of cheating is low (see Barbot, 1991).

Alternative assessment is usually taken to mean assessment procedures which are less formal than traditional testing, which gather information over a period of time rather than at a single point in time, which are usually formative rather than summative in function, which are often low-stakes in terms of consequences, and which are claimed to have beneficial washback effects.

Although such procedures may be time-consuming and not very easy to administer and score, their claimed advantages are that they provide easily understood information, they are more integrative than traditional tests and they are more easily integrated into the classroom.

McNamara (1998) makes the point that alternative assessment procedures are often developed in an attempt to make testing and assessment

ALTERNATIVE ASSESSMENT

Alternative assessment

Less formal

Gathered over a period of time

Formative

More integrative