Theoretical Framework

UNIVERSIDAD CATOLICA DE LA SSMA. CONCEPCION

THEORETICAL FRAMEWORK

ASSESSMENT

EDUCATION FACULTY

ENGLISH LANGUAGE DEPARTMENT

LUIS FUENTES CID

GABRIEL JARA MUÑOZ

PROFESSOR

ROXANNA CORREA PEREZ

DATE

CONCEPCION, 2013

Introduction

This essay explains and compares the concept of assessment in the context of Second
Language Acquisition (SLA) from the perspectives of different authors (Coombe, Brown,
Bachman), focusing on the characteristics that a formal test should have in order to
evaluate language. First of all, a distinction must be made between the concepts of
assessment and test. Assessment is described as a process of collecting information about
learners’ language ability or achievement (Coombe 2007, p. 14), whereas tests are
described as a part of assessment: a more practical, systematic process that measures
students’ achievement or progress through the stages (units, lessons) of the language
learning process (Coombe 2007, p. 14). Therefore, when this essay discusses ideas or
concepts about assessing students’ language, it ultimately refers to tests and to how
tests should be created in order to evaluate language learning. The authors largely agree
on the important aspects to take into account when assessing students. Each has set out a
list of “principles” (Coombe and Brown) or “qualities” (Bachman 1996) that establish
standards for a useful, well-designed test. Coombe (2007), within a lengthy description of
many aspects of testing, proposes eight guiding principles of a good test: usefulness,
validity, reliability, practicality, washback, authenticity, transparency and security.
Brown (2004), more concise and practical in his explanation, proposes five principles of
test effectiveness: practicality, reliability, validity, authenticity and washback. In a
similar way, but with an important conceptual difference, Bachman (1996) proposes three
principles related to the usefulness of a test, which he treats as the most important
aspect to consider, and six qualities of a test (features or characteristics) that make it
useful: reliability, construct validity, authenticity, interactiveness, impact and
practicality. These aspects are explained in detail below and then compared across the
three authors, to highlight important differences as well as the concepts they share.

In ‘Conceptual bases of test development’, Bachman (1996) states that when producing a
test it is very important that testers consider the purpose of the instrument, and
concludes that the most important quality of a test is its usefulness (Bachman 1996,
p. 17). Usefulness is defined as the appropriate balance among the qualities of a test
(Bachman 1996, p. 18), which Brown and Coombe would call principles and which for this
author are reliability, construct validity, authenticity, interactiveness, impact and
practicality.

The author describes three principles as the basis for the proper functioning of this
model of usefulness in language tests. Principle 1 states that it is the overall
usefulness of the test that is to be maximized, rather than the individual qualities that
affect usefulness. Principle 2 states that the individual test qualities cannot be
evaluated independently, but must be evaluated in terms of their combined effect on the
overall usefulness of the test. Finally, principle 3 states that test usefulness and the
appropriate balance among the different qualities cannot be prescribed in general, but
must be determined for each specific testing situation (Bachman 1996, p. 18).

Bachman (1996) describes reliability and validity as critical and essential for a test
because these two qualities provide the major justification for using scores (numbers) as
a basis for making decisions. He states that reliability concerns the consistency of test
scores. For instance, if the same test is taken by the same group on two different
occasions and in two different settings, it should make no difference to an individual’s
score whether the test is taken in one setting or the other. If the results differ, the
test that was administered was not reliable. Validity, on the other hand, or construct
validity as the author defines it, is the degree to which a given test score can be
interpreted as an indicator of the ability that the test is meant to measure (Bachman
1996). Here, construct stands for a precise definition of the ability that serves as the
basis for producing the test. Bachman (1996) defines authenticity as the degree of
correspondence between the tasks presented in the test and tasks in the TLU (target
language use) domain. Interactiveness is also defined as a ‘relation’ or interaction, in
this case between the test taker and the task, which can be a test task or a TLU task
(Bachman 1996). Another quality the author mentions is impact, which he defines as the
effect of the test not only on the individuals who prepare for and take it, or on their
teachers, but also at a macro level, on the educational system or society in which it is
used. Finally, Bachman states that practicality refers to the way in which a test is going
to be implemented: if the resources required for implementing the test exceed the
resources available, the test will be impractical (Bachman 1996, p. 35).

Brown (2004) sets out five principles of language assessment that should be applied to
tests in order to achieve effectiveness, appropriate administrative constraints, fair
measurement of content and dependability. The author starts with practicality, which he
describes very briefly as encompassing four main aspects that make an evaluation tool
“friendly” to apply and to mark: a test should be cheap, not time-consuming (to create),
easy to mark and easy to administer. Clearly, practicality deals with common-sense aspects
of assessment, which is not an easy task when teachers have more than thirty students per
classroom; Brown (2004) therefore presents these aspects to make the work easier, but no
less effective. Secondly, Brown (2004) presents reliability as a principle that
encompasses the consistency and dependability of a test, meaning that test results should
show similar scores when the test is applied on different occasions or to comparable
classes; if so, the test gives teachers a trustworthy tool for assessing language.
Interestingly, Brown (2004) also describes different factors that, besides the
construction of the test, affect its reliability. These include human factors such as
student-related reliability, in which factors originating in the students, such as
emotions or attitudes, alter the reliability and therefore the accuracy of the results,
and rater reliability, which concerns the possible biases of a rater towards certain
aspects of the test or even towards a student. They also include non-human factors such as
test administration reliability, meaning all the logistical factors that could affect the
optimal implementation of a test, and finally test reliability, which covers any error or
problem in the evaluation tool itself.

Thirdly, Brown (2004) presents a rather complex term, validity, which alludes to the
coherence between what has been taught and how, and how it has been evaluated; it is
basically the relationship between methodology and test results. Brown (2004) also
explains how validity can be demonstrated in a test, proposing different kinds of evidence
that can support the claim that a test is valid: content-related evidence,
criterion-related evidence, construct-related evidence, consequential validity and face
validity. Authenticity is also mentioned by the author, in terms of whether “a task is
likely to be enacted in the real world”; he lists several features that a test should
present in order to be authentic, such as natural language, contextualized items,
meaningful topics, content organized thematically, as in a storyline, and tasks that
reflect real-world activities.

Washback is another concept presented by Brown (2004). It is described as the effect that
a test can have on students’ preparation for it, or on the process of preparing students
for it, and on identifying weaknesses and strengths, as well as on the feedback given for
the task.

Coombe (2007), in an extended but very detailed text, presents eight principles for making
an assessment a well-designed and well-developed tool. The authors start with the
usefulness of a test which, in agreement with Bachman and Palmer (1996), they regard as a
very important aspect, because it defines the purpose for which a test was created or
designed and establishes the target content, skills, etc.

Coombe (2007) defines validity in very practical words, as “test what you teach and how
you teach it!” (p. 22), meaning that the way a teacher evaluates students should be
consistent with the way they were taught in terms of format, approach, skills, target
language, etc. Interestingly, Coombe (2007) also presents a subdivision of the principle
of validity, just as Brown (2004) does, but more concisely, focusing mainly on construct
validity, understood as the link between methodology and theoretical background (why the
test is designed the way it is), and face validity, understood as the degree to which the
test looks right, that is, appears to measure what it is supposed to measure in a way that
is familiar from how the content was taught to the students.

Reliability is defined by Coombe (2007) as the consistency of test results, which means
that students should get similar scores whether the test is applied at different times or
in different versions. Within reliability, the authors identify potential sources of error
that could affect test scores, which they call fluctuations: fluctuations in the learner,
in the scoring and in the test administration. All of them describe possible problems that
arise when assessing.

Practicality is briefly defined as a way of making assessment a manageable task for
teachers, considering the time needed to develop, administer and mark a test, as well as
money and resources. A test is not necessarily of poor quality because it is short or easy
to make.

Washback refers to the effect of testing on students as well as teachers (Coombe 2007). If
the test is well designed, it gives learners a sense of accomplishment, because the test
works as an indicator that they are getting closer to the general objectives of the
course; eventually they will develop all their abilities and skills if the content is
taught and assessed properly.

Authenticity is closely related to motivation: including real-world tasks in a test makes
it mirror situations close to the students’ own, in which they can actually use the
language that was taught.

Transparency deals with giving learners all the information needed to make the test fair,
meaning that they know how they are being evaluated and what they are being asked to do,
and can therefore be scored fairly.

Security is closely related to reliability and validity: if any information about the test
escapes the teacher’s control, it will eventually affect the test’s reliability and
validity when it is used to evaluate other students.

Having provided all of the definitions, it is necessary to compare them and see how they
differ and how they resemble one another. To keep the comparison tidy, the principles are
compared one by one, even though the differences lie less in the content of each principle
than in how the authors have conceived and described it, including or omitting certain
aspects.

Practicality is seen by the three authors in a very similar way, since they all include
what could be called common-sense features of assessing a class, such as resources (human
and material) and time.

Reliability is likewise described by all three as the consistency (Brown 2004; Coombe
2007; Bachman 1996) and dependability (Brown 2004) of test scores, whatever the
circumstances in which the test is applied. There is a small difference between Brown
(2004) and Coombe (2007), though: both present potential sources of error that could
affect the reliability of scoring, but they describe and name them in different ways,
while pursuing the same aim.

The principle of validity, by contrast, varies among the authors, for it is presented with
different degrees of depth and with different subdivisions. Bachman (1996) presents only
the concept of construct validity, alluding to the relationship between test scores and
what they signify about the measured ability. Coombe (2007) does much the same, adding the
concept of face validity. Brown (2004), however, develops several types of validity as
sources of evidence to support the validity of a test.

Authenticity is also mentioned by the three authors. Brown (2004) and Coombe (2007)
describe it similarly, as the presence of real-world tasks in a test, whereas Bachman
(1996) relates authenticity to the concept of TLU, that is, to how closely the test
activities correspond to the target language in use in a real communicative context.

Washback is a principle presented by Brown (2004) and Coombe (2007) as the effect of a
test on students preparing for it, whereas Bachman (1996) presents the concept within the
quality of impact.

Other concepts do not appear in all of the texts. Transparency and security are presented
only by Coombe (2007); because they are very specific in their description, and both
concern things that can alter scoring, they could be related to some of the major
principles. Bachman (1996) also presents a new concept, interactiveness, which mainly
describes the relation or interaction between the test taker and the characteristics of
the task given in a test or in real-life communication.

It can be clearly seen that all of these principles are very important when assessing
language. Meeting their requirements gives teachers the tools to make tests effective and
efficient for students and for themselves: tests that measure what teachers are supposed
to measure, that are coherent with the content presented in class and with the language
abilities required for an appropriate performance, and that have the other characteristics
of a well-designed test.

References

Bachman, L. F. & Palmer, A. S. (1996). Language Testing in Practice: Designing and
Developing Useful Language Tests. Oxford: Oxford University Press.

Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. New York:
Longman.

Coombe, C., Folse, K. & Hubley, N. (2007). A Practical Guide to Assessing English Language
Learners. University of Michigan Press.
