Assessing Vocabulary
A paper assignment for
Language Testing and Evaluation
ENGL 6201
Ihsan Ibadurrahman (G1025429)
I. Introduction
Vocabulary is an essential part of learning a language, without which communication would suffer. A
message can still be conveyed, at least in part, without correct grammatical structure, but without
vocabulary nothing is conveyed at all. Thus, one may expect a communication breakdown when the
exact word is missing even though the syntax is correct, as in "Would it perhaps be possible that you
could lend me your ___?". Conversely, with little grammar but sufficient vocabulary, it would suffice to
say "Correction pen?" (Thornbury, 2002). In the context of ESL teaching, it would therefore make sense
for the teaching of vocabulary to take priority over the teaching of grammar, especially given today's
growing use of the communicative approach, in which limited vocabulary is the primary cause of
students' inability to express what they intend to say in communicative activities (Chastain, 1988, as
cited in Coombe, 2011). Since vocabulary provides the basic building blocks of language, many second
language learners undertake the ambitious venture of memorizing lists of words; for ESL learners,
learning a language is essentially a matter of learning new words (Read, 2000). Vocabulary is also
closely tied to comprehension: it is generally believed that vocabulary assessment can be used to
predict reading comprehension performance (DeVries, 2012; Pearson et al., 2007; Read, 2000;
Thornbury, 2002). This implies that to comprehend a text fully, vocabulary is much needed (Nemati,
2010). Vocabulary is thus important both for communication and for comprehension, especially in
reading.
Given the importance of vocabulary in language teaching, it makes sense to want to assess it.
Such measurement may inform teachers of how much vocabulary learning has taken place in class, and
whether the teaching has indeed been effective. Thornbury (2002) contends that vocabulary tests also
give teachers two added advantages: they provide beneficial backwash and an opportunity for recycling
vocabulary. Provided that students are informed in advance that vocabulary is part of the assessment,
they may review and learn vocabulary in earnest, creating a beneficial backwash effect. The test also
gives students a chance to recycle their previously learned vocabulary and use it in new ways (Coombe,
2011). However, despite its many benefits for language teaching, vocabulary assessment does not
receive the attention it deserves. Pearson et al. (2007: 282) argue that vocabulary assessment is
"grossly undernourished": instead of living up to a good standard of measurement, much more effort
has been exerted in favor of a practical standpoint in which tests are designed for reasons of economy
and convenience. This phenomenon exists in the Indonesian EFL context, where vocabulary assessment
is merely used as part of the reading test in the standardized national examinations, rather than
standing on its own, and is fashioned in the form of convenient multiple-choice items (Aziez, 2011).
This paper aims to give an overview of current practices in vocabulary assessment. It also
attempts to outline some recommendations on testing vocabulary and to relate them to EFL vocabulary
teaching in the Indonesian context. To achieve these aims, recent journal articles dated 2005 onwards
were gathered and studied; books and other journal articles of relevance were also used to support
them. The paper begins by describing the many facets of vocabulary assessment that teachers first
need to take into account. It then elaborates some common types of vocabulary assessment and,
where relevant, discusses the advantages and disadvantages of each test technique. It goes on to
briefly report what researchers have done on vocabulary assessment, including the new directions they
have taken. Recommendations on how to assess vocabulary are then presented and related to the EFL
context in Indonesia. Finally, the paper closes with a conclusion summarizing its content.
II. The dichotomies in vocabulary assessment
Before going into the details of the common types of vocabulary assessment, it is useful to
present its many facets. The first thing test designers need to do is decide which aspects of vocabulary
they want to test. This is especially true in vocabulary assessment because vocabulary is multi-faceted:
words can be examined in many different ways, not just in terms of their meanings (Samad, 2010).
These aspects are often viewed as binary oppositions, or dichotomies. Thus, vocabulary can be assessed
informally or formally, as part of a larger test or as a vocabulary test in its own right, and with a focus
on learners' passive vocabulary or its active counterpart. Some of the facets of vocabulary assessment
found in the literature are discussed below.
a. Informal versus formal
Formal vocabulary assessment refers to tests that are standardized and have been designed in
such a way that reliability and validity are ensured (DeVries, 2012). Vocabulary tests can form part of
placement and proficiency tests that measure the extent of a learner's vocabulary knowledge. In
proficiency tests such as the TOEFL (Test of English as a Foreign Language), vocabulary is usually
tested as part of a larger construct such as reading, where candidates are tested on their vocabulary
knowledge based on the context of a reading passage. Formal assessment also includes achievement
tests, typically administered at the end of a course and designed to measure whether the words taught
during that course have been successfully learned.
Informal assessments, on the other hand, are not usually standardized and are typically done as
formative tests, or progress checks, to see whether students have made progress in learning the
specific words we want them to learn. Learning words is not something that can be done overnight.
Especially in second language learning, where there is less exposure to the words, learners need to
recycle vocabulary from time to time through revision activities. Such activities are a form of informal
vocabulary assessment, intended primarily to check whether learners have progressed in their
vocabulary learning (Thornbury, 2002). DeVries (2012) proposes teacher observation as one of the
most useful informal vocabulary assessments during ongoing classroom activities. Observations may
give teachers the first indication of whether or not words have been grasped by learners, from which
follow-up activities may ensue.
In sum, the informal/formal distinction relates to the nature of the test itself, particularly its
purpose and its degree of standardization. The next three distinctions are proposed by Read (2000),
who calls them the three dimensions of vocabulary assessment.
b. Discrete versus embedded
The distinction in Read’s (2000) first dimension of vocabulary assessment is the construct of the test
itself, whether it is independent or dependent of other constructs. Where vocabulary is measured as
a test on its own right, it is called discrete. However, when a test of vocabulary forms a larger part of
a construct, it is called embedded. Using this first dimension, we can say that progress check tests
that are available at the end of a unit in most course books fall into the former category, whereas
the TOEFL test mentioned previously clearly falls into the latter category.
c. Selective versus comprehensive
The second dimension deals with the specification of the vocabulary included in the test. A
vocabulary test is said to be selective when certain words are chosen as the basis of measurement. Its
comprehensive counterpart, on the other hand, examines all the words spoken or written by the
candidate. A selective vocabulary measure is typical of most conventional vocabulary tests, where the
test designer selects the words to be tested, such as those found in TOEFL reading comprehension
questions. A comprehensive vocabulary measure is typically found in a speaking or writing test, where
raters judge the overall quality of the words used rather than looking at specific words.
d. Context-independent versus context-dependent
The last dimension concerns the use of context in a vocabulary test. If the words in the test are
presented in isolation, without context, the test is called context-independent; when test-takers must
make use of context in order to give the appropriate answer, it is called context-dependent. In the
former case, learners are typically asked to indicate whether or not they know specific words. For
example, the yes/no vocabulary checklist asks learners to tick the words in a list that they know. In the
latter case, learners must engage with the context in order to come up with the right response. For
example, in a TOEFL reading passage, in order to know which option is the closest synonym of a word,
learners must refer to the text and use the context available there.
e. Receptive versus productive
Another distinction to make in vocabulary assessment is whether we want to test learners'
receptive (passive) vocabulary or their productive (active) vocabulary. Receptive vocabulary is the
vocabulary needed to comprehend a listening or reading text, while productive vocabulary is the
vocabulary learners use in writing or speaking. It is understood that learners have more receptive than
productive vocabulary at their disposal. Knowing this distinction is crucial because we certainly do not
need to ask learners to demonstrate how they can use all words; there are words we simply want
learners to be able to comprehend.
f. Size versus depth
The last distinction is one that has gained currency in research on vocabulary assessment
(Cervatiuc, 2007; Read, 2007). Size (or breadth) of vocabulary refers to the amount of vocabulary a
learner has, while depth refers to the quality of knowledge of those words. It is generally understood
that knowing a word does not simply entail knowing its meaning, but other aspects as well, such as its
pronunciation, part of speech, collocations, register, and morphological changes. This word knowledge
deepens through a gradual process of learning (Cervatiuc, 2007). A vocabulary depth test is thus used
to assess learners' knowledge of some of these aspects of words. As Read (1999) puts it, vocabulary
size measures how many words learners know, whereas vocabulary depth deals with how well they
know those words.
The binary distinctions listed thus far suggest that we must have a clear reason before
constructing a vocabulary test. Nation (2008, as cited in Samad, 2010) proposes a matrix that lists
different reasons for vocabulary assessment along with corresponding test formats. An adapted version
of the matrix is shown in the table below:
Reason for testing               | Useful formats and existing tests
To encourage learning            | Teacher labeling, matching, completion, translation
For placement                    | Vocabulary Levels Test, Dictation level test, Yes/No, Matching, Multiple choice
For diagnosis                    | Vocabulary Levels Test, Dictation level test, EVST-Yes/No
To award a grade                 | Translation, Matching, Multiple choice
To evaluate learning             | Form recognition, Multiple choice, Translation, Interview
To measure learners' proficiency | Lexical Frequency Profile, Vocabulary Size Test, Translation

Table 1: Reasons for assessing vocabulary and corresponding test formats (Samad, 2010: 78)
Some of the examples of the formats presented in the table will be given in the next section.
III. How vocabulary is assessed
This section briefly outlines some commonly used formats in vocabulary assessment. The list below
roughly follows the chronological order in which they appeared in the testing of vocabulary. The first
four formats were among the earliest measures of vocabulary, primarily asking learners to demonstrate
their vocabulary knowledge by labeling, giving definitions, and translating. Traditionally, such
assessment was done orally via individual interviews (Pearson et al., 2007). However, the mass testing
triggered by World War I demanded more reliable and practical scoring, which gave birth to the next
two techniques in the list: the Yes/No list and multiple-choice questions (MCQs). Research on Second
Language Acquisition (SLA) and reading soon changed views on how words are learned; it became a
widespread belief that words are learned best when they are presented in context (Thornbury, 2002).
This view motivated more contextualized vocabulary assessments such as the cloze test. Last in the list
is the assessment of the four skills (listening, speaking, reading, and writing), where vocabulary is
sometimes part of the construct and context is used to demonstrate learners' ability to use words
(active vocabulary).
a. Labeling
One of the most commonly used techniques in vocabulary assessment is labeling, where learners
are typically asked to respond by writing down the word for a given picture, as illustrated below.

Alternatively, a single picture can be used and learners are asked to label parts of it. Although it
may be relatively easy to come up with pictures, especially with the growing mass of picture content
available on the net, this format is limited to what pictures can show, and thereby to testing concrete
nouns (Hughes, 2003).
b. Definitions
In the definition format, learners are asked to write the word that corresponds to a given
definition, as illustrated below.

From Redman, S., Vocabulary in Use: Pre-intermediate & Intermediate, p. 13, CUP (2003)

A ____________ is a person who teaches in your class.
______________ is a loud noise that you hear after lightning, in a storm.
______________ is the first day of the week.

Definitions allow a wider range of vocabulary to be tested, unlike the labeling format, which is
restricted to concrete nouns. However, Hughes (2003) points out one issue with this kind of test: not
all words can be uniquely defined. To address this limitation, dictionary definitions may provide a
shortcut and save the trouble of crafting clear-cut, unambiguous definitions.
c. Translation
There are many different ways in which vocabulary can be measured using translation. Learners
can choose the correct translation in an MCQ, or simply be asked to provide a translation for each
word in a list, as follows:
Teacher __________ Taxi driver __________
Student __________ Librarian __________
Actor __________ Shop keeper __________
President __________ Professor __________
Note that the above example may also be reversed, asking learners to provide the English words from
the L2. One pitfall of translation is that a word may have more than one meaning, and therefore more
than one correct answer, which raises an issue of reliability. The use of context may help address this
limitation: sentences can be added in which the word to be translated is underlined. Another issue with
translation is the assumption that the teacher knows the students' mother tongue (Coombe, 2011). It
may also be noted that translation is regarded as somewhat controversial in the current trend in
language education, where use of the mother tongue is discouraged (Read, 2000); learners should
instead be given a healthy dose of L2 exposure in the classroom (Harmer, 2007). However, a recent
study by Hayati and Mohammadi (2009) suggests that translation provides longer retention of words
than a 'task-based' vocabulary learning technique in which learners are asked to remember the
definition, part of speech, collocations, and other aspects of a word (what was referred to earlier as
vocabulary depth). Their findings imply that translation may still have its place in vocabulary
assessment.
d. Matching
Another common vocabulary test format presents learners with two columns of information and
asks them to match a word in one column with an item in the other. Items in the left-hand column are
referred to as premises, and items on the other side are called options. Words can be matched on the
basis of related meaning, a synonym, an antonym, or a collocation, as exemplified in the excerpt
below:

Ur (1991) cautions against the use of matching, since learners can exploit the process of
elimination, which is useful when they do not know the words in question. She thus recommends the
use of additional options in matching.
From Redman, S, Vocabulary in Use: Pre-intermediate & intermediate, p. 13, CUP. (2003)
e. Yes/No list
The Yes/No format is particularly useful when we wish to test a large sample of items in a relatively
short time. This is achievable because in this format learners are only asked to mark the words whose
meaning they know. For this practical reason, the Yes/No format is typically used to measure learners'
vocabulary size, since measuring size requires a particularly large sample of items.
Give a tick (√) if you know what the word means.
__ Mayonnaise
__ Catastrophe
__ Belligerent
__ Distinctive
f. Multiple choice question
MCQs are among the most common test techniques in vocabulary assessment, especially in formal
tests (Coombe, 2011). An MCQ consists of a stem and response options, and learners simply have to
find the one correct answer among the options. In a vocabulary test, MCQs can be used to assess
learners' knowledge of synonyms, antonyms, meanings in context, or a range of English expressions,
as shown in the excerpt below:

McCarthy and O'Dell, Academic Vocabulary in Use, p. 41, CUP (2008)

Although MCQs are often criticized for the difficulty of designing good items, the limited number
of plausible distractors, and the element of guessing, they nevertheless remain one of the most popular
vocabulary test formats by virtue of their practicality, versatility, familiarity, and high reliability.
g. Cloze-test
The cloze test, also known as sentence completion or gap-fill, is yet another common vocabulary
test in which learners fill in missing words in a text. It is relatively more demanding than the previous
formats, since learners must demonstrate their ability to use a word based on the context provided by
the text. Cloze tests come in several forms. The first is the fixed-ratio cloze test, in which every n-th
word is deleted from a reading passage. The second is the selective-deletion, or rational, cloze test,
where instead of deleting words at a fixed ratio, the test designer purposefully deletes particular words
from the passage. Another form of cloze test uses multiple-choice questions for the items: instead of
writing the answer, learners choose the one correct response for each deleted word. To eliminate the
possibility of more than one correct answer, it is desirable to provide the first letter of each deleted
word, as illustrated in the following excerpt:
McCarthy and O'Dell, Academic Vocabulary in Use, p. 43, CUP (2008)
In the excerpt above, the first letter serves as a clue to what the answer should be. A more
extreme version of this is the C-test, where instead of the first letter, the first half of each deleted
word is revealed. One main advantage of the cloze test is its ease of construction; however, Read
(2000) casts doubt on its use as a true vocabulary measure. Since many different abilities come into
play in answering a cloze item, the score may reflect not only learners' lexical knowledge but learners'
overall proficiency in the language, including reading ability.
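The fixed-ratio and first-letter-clue variants described above are mechanical enough to sketch in code. The following Python function is an illustrative sketch only (it is not taken from any of the works cited here): it deletes every n-th word from a passage and can optionally keep the first letter of each deleted word as a clue.

```python
def make_cloze(text, n=7, first_letter_clue=False):
    """Build a fixed-ratio cloze test: delete every n-th word.

    Returns the gapped passage and the list of deleted words
    (the answer key). If first_letter_clue is True, the first
    letter of each deleted word is kept; a C-test would instead
    keep the first half of each deleted word.
    """
    words = text.split()
    gapped, answers = [], []
    for i, word in enumerate(words, start=1):
        if i % n == 0:
            # This word is deleted; record it in the answer key.
            answers.append(word)
            clue = word[0] if first_letter_clue else ""
            gapped.append(clue + "____")
        else:
            gapped.append(word)
    return " ".join(gapped), answers
```

A selective-deletion (rational) cloze would replace the mechanical `i % n` rule with a set of word positions hand-picked by the test designer.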
h. Embedded test
As previously mentioned, an embedded vocabulary test is not a vocabulary test in its own right but
part of a larger construct, as found in the testing of the four language skills (reading, listening, writing,
and speaking). In such assessment, the rater judges the overall quality of learners' vocabulary within a
given skill. In reading, learners are mainly asked to work out meaning from the context of a reading
passage. In listening, vocabulary can be one part of a larger writing component in which students'
knowledge of word spelling is assessed. Since writing and speaking are both productive skills,
vocabulary is given somewhat more weight there. IELTS writing and speaking, for instance, include
'lexical resource' as one of their four marking criteria. Lexical resource refers to the quality of learners'
vocabulary, for example whether word usage is appropriate, varied, and natural, or incorrect, limited,
and stilted.
IV. Research on vocabulary assessment
Read (2000) provides one of the most comprehensive historical accounts of research into vocabulary
assessment. He states that vocabulary assessment is a field that has received little attention,
particularly from researchers of language testing themselves. In the 1990s, most studies were
conducted by Second Language Acquisition (SLA) researchers, who might not have had an adequate
understanding of testing and measurement but needed vocabulary tests as research instruments to
validate their own findings. These SLA researchers were essentially interested in examining, by means
of a test, whether specific lexical strategies are fruitful for vocabulary acquisition. The four recurring
topics they contributed to the field were systematic vocabulary learning, incidental vocabulary learning,
inferring the meaning of words from context, and communication strategies. Systematic vocabulary
learning looks at systematic, orderly ways of acquiring vocabulary. Incidental vocabulary learning
concerns the extent to which learners absorb new words incidentally over time. The third topic
investigates learners' use of context in deriving the meaning of unknown words. The last topic deals
with the range of communication strategies used in situations where learners lack the vocabulary to
express what they wish to say. Other key contributors to early research on vocabulary assessment are
first-language reading researchers, primarily because of consistent findings of a positive correlation
between vocabulary knowledge and reading.
Language testing researchers, on the other hand, are interested in the construction of vocabulary
tests and how they may cater for different testing purposes such as diagnosis, achievement, and
proficiency, rather than in the processes of vocabulary learning and how effective each of these
processes is as measured by some vocabulary test. The twentieth century was the period when
researchers in the field of language testing began to take an interest in vocabulary assessment. The
first research area to gain currency at that time was objective testing, a kind of testing that does not
require judgment in its scoring. Read (2000) contends that the most frequently used objective
technique is the multiple-choice question, favored particularly in vocabulary assessment because (a)
synonyms, translations, and short defining phrases lend themselves readily to the construction of
distractors; (b) sources of vocabulary to test are available thanks to the development of lists of the
most frequent words in English; and (c) objective vocabulary measurement can also be used to indicate
overall language proficiency. The prevalence of MCQs means that vocabulary testing throughout the
twentieth century can be typified as selective, discrete, and context-independent. However, with
growing concern for the use of context in vocabulary assessment, more and more tests incorporate
context into their constructs, such as the cloze test. As Read (2000) acknowledges, though, the
dilemma of a context-dependent vocabulary measure is that it becomes quite difficult to separate the
scoring of pure vocabulary knowledge from other skills such as reading ability.
Read (2007) continues the documentation of research in vocabulary assessment from his 2000
publication. He states that a growing trend in current research on vocabulary assessment is the
measurement of vocabulary size (or breadth) and depth. These two distinct vocabulary measures are
briefly discussed in turn.
Vocabulary size is an area that has gained currency in second language vocabulary assessment.
What makes it worth studying? As pointed out by Nation and Beglar (2007), measuring learners'
vocabulary size is important for several reasons. First, it may inform teachers of how well their learners
would cope with real-life, authentic tasks such as reading newspapers and novels, watching movies, or
speaking in English. Data on the vocabulary size needed to perform each task are available; by testing
learners' current vocabulary size, teachers can estimate how close their learners are to performing
these tasks. Secondly, such measurement is needed to track the progress of learners' vocabulary.
Lastly, vocabulary size can be used to compare non-native speakers with native speakers, predicting
how close learners are to achieving the size of a native speaker's vocabulary. In measuring vocabulary
size, researchers have used word lists as the source from which to assess the number of words a
learner knows. This is more feasible now than ever before due to the growing use of computer
corpora, which can provide high-quality word lists. The first step in measuring vocabulary size is thus to
choose an available word list; words are then selected from the list, and finally a suitable test
technique is chosen. One commonly used test of vocabulary size is the Vocabulary Levels Test (VLT),
designed by Paul Nation, which has been used as a second language diagnostic test (Cervatiuc, 2007).
The test is available online (http://www.lextutor.ca/tests/) and has both receptive and productive
versions, which can be used to measure learners' passive and active vocabulary respectively.
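The sampling logic behind such size measures can be illustrated with a short sketch. This is a simplification, not the actual VLT scoring procedure, and the numbers are purely illustrative: if test items are sampled evenly from a frequency-ordered word list, the proportion answered correctly can be projected onto the whole list.

```python
def estimate_vocab_size(correct, tested, list_size):
    """Project the proportion of correctly answered sampled items
    onto the full word list to estimate total vocabulary size.

    Assumes the tested items were sampled evenly across a
    frequency list of `list_size` words (or word families).
    """
    if tested <= 0:
        raise ValueError("tested must be positive")
    return round(list_size * correct / tested)

# Illustrative figures: 70 of 100 sampled items correct, sampled
# from a 14,000-family frequency list.
estimate = estimate_vocab_size(70, 100, 14_000)
```

The same projection underlies comparisons with native speakers: knowing the estimated size, a teacher can ask how far a learner is from the coverage a target task demands.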
In contrast to vocabulary size, relatively little progress has been made in research on the depth of
vocabulary. This is due to the lack of an agreed definition of what constitutes 'depth' and of the
construct on which such a test should be built. It is generally understood that knowing a word does not
simply mean knowing its definition. The fact that there are many aspects of a word a learner must
know, such as its pronunciation, spelling, morphological forms, part of speech, and collocations, means
that there is a great deal to measure, and there is little agreement on which aspects should constitute
a learner's depth of vocabulary (Read, 2007). As such, this kind of test is relatively more difficult to
construct. However, one widely acknowledged vocabulary depth test is the Word Associates Test
(WAT), designed by John Read in 1993 (Cervatiuc, 2007). As its name suggests, learners' vocabulary is
measured using word associations such as synonyms, collocations, and related meanings. Typically, the
WAT measures how well learners know words by asking them to tick four out of eight possible options
that carry these associations, as exemplified below:

Read (1998), Word Associates Test, taken from <http://www.lextutor.ca/tests/associates/>
V. Assessing vocabulary in Indonesian EFL context
Drawing on my experience of learning English in Indonesia as a compulsory subject for six years, and
of teaching English at a senior high school in Bandung for seven years, vocabulary assessment in
Indonesia seems to leave a lot to be desired. Two main issues contribute to the painful experience of
being assessed and of assessing vocabulary: the nature of teaching and learning in schools, and the
tough national examination. These two issues are elaborated in turn.
From the time I was a student of English in junior and senior high school until I became an English
teacher myself, vocabulary learning has remained unchanged. Typically, learners are asked to read a
passage from a book, followed by a comprehension exercise that may include some vocabulary
exercises. Usually, words from these exercises are recycled only once, in the review unit. After that,
the words learners have learned are never repeated; they seem to vanish into thin air. As Thornbury
(2002) notes, learners need to encounter a word at least eight times for it to 'stick' in their mental
lexicon. This suggests that teachers should incorporate informal vocabulary assessment, as discussed
earlier in this paper, into their teaching so that these words are recycled and used meaningfully in
different ways, and eventually stored in long-term memory.
Another sad fact about vocabulary assessment in the Indonesian context is that the words
learned in class do not even appear in the national examination, the achievement test at the end of the
school year that is a requirement for graduation. Classroom words seem to be used as a kind of
reading exercise to accustom students to one component of reading skill, namely guessing meaning
from context. Vocabulary testing is therefore largely a means to an end rather than an end in itself. In
terms of Read's (2000) dimensions of vocabulary testing, the national tests in Indonesia have mainly
been embedded and selective. Below is an example of an item taken from the 2010 English national
examination.
Taken from Ujian Nasional 2009/2010, Kementrian Pendidikan Nasional

The above sample typifies the test as a context-dependent vocabulary measure, in that the word
inhabitant is not presented in isolation but used in a sentence taken from a text. As Read (2000) points
out, this may stem from the assumption that words never occur by themselves but constitute an
integral part of a whole text. However, a closer look at the item reveals that it might not be context-
dependent in the true sense: test-takers could answer C. Citizens without necessarily having to read
the whole text. It must be highlighted that the key element of a context-dependent question is that
learners must engage with the context in order to give the appropriate response. As a comparison,
below is a context-dependent item as illustrated by Read (2000):
Humans have an innate ability to recognize the taste of salt because it provides us with sodium, an
element which is essential to life. Although too much salt in our diet may be unhealthy, we must consume
a certain amount of it to maintain our wellbeing.
What is the meaning of consume in this text?
A. Use up completely
B. Eat or drink
C. Spend wastefully
D. Destroy
In contrast to the previous sample, all four options in the above item (taken from Read, 2000:
12) are possible meanings of consume, which puts learners in a position where they must read the text
and use the available context to select the correct response, B (Eat or drink).
The English National Examination (ENE) comprises 50 multiple-choice items altogether: 15
listening comprehension questions and 35 reading comprehension questions. However, not all of the 35
reading questions pertain to vocabulary assessment; in the 2010 ENE, only 5 questions, or 10% of the
items, assessed vocabulary knowledge. The large proportion of reading in the test seems to indicate an
emphasis on reading skill. This might be one of the ways the government invests in developing a
culture of reading, as stipulated in article three of the National Education Law of July 2003 (UNESCO,
2011). Such emphasis is also desirable given that when students enroll in university they are expected
to read English textbooks, which make up 80% of the required reading (Nurweni & Read, 1999).
However, these reading texts are deemed to be too difficult for the
students to comprehend. In relation to the vocabulary size mentioned previously, learners must master at least the 4,000-word level in order to reach 95% text coverage, on the assumption that the remaining 5% is the maximum tolerable proportion of unfamiliar words to grapple with (Laufer, 1989 as cited in Aziez, 2011; Nurweni & Read, 1999). (The consume item above is taken from Read, 2000: 12, Assessing Vocabulary.) With 95% coverage, learners may still be able to comprehend a 200-word reading passage even with 10 unknown words present. A recent study suggests that at the 4,000-word level, learners cover at least 95.96% of the words occurring in the 2010 senior high school ENE, and, surprisingly, a slightly lower 95.80% in its junior high school counterpart (Aziez, 2011).
This means that the reading texts in the junior and senior high school national exams belong to the same word level, which further suggests that test designers have not fully considered the vocabulary load of these two different levels of high school education. Even more surprisingly, perhaps, Nurweni and Read (1999) reveal that most senior high school students entering university do not even come close to the required 4,000-5,000-word level, which means that dealing with these reading passages is arduous work for them. A reading passage that is too difficult is also said to affect test reliability, which refers to the degree of consistency and accuracy of a test. As pointed out by Samad (2010), a test that contains difficult reading passages might introduce errors which in turn affect the accuracy of one's true score.
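The coverage figures above follow from a simple calculation: coverage is the percentage of running words (tokens) in a text that fall within the learner's known-word list. The sketch below illustrates this calculation; it is not the procedure used by Aziez (2011), and the function name and toy passage are illustrative assumptions.

```python
# Illustrative sketch: estimating lexical coverage, i.e. the percentage of
# running words in a text that appear in a learner's known-word list.
import re

def lexical_coverage(text: str, known_words: set) -> float:
    """Percentage of tokens in `text` that appear in `known_words`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in known_words)
    return 100.0 * known / len(tokens)

# Toy example: a 20-token passage with exactly one unknown word
# yields 95% coverage -- the threshold discussed above.
passage = ("we must eat a certain amount of salt every day "
           "because it gives the body an element essential to life")
vocabulary = set(passage.split()) - {"element"}  # pretend 'element' is unknown
print(lexical_coverage(passage, vocabulary))  # -> 95.0
```

On this logic, a 200-word passage with 10 unknown words gives 190/200 = 95% coverage, the figure cited from Laufer above.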
To sum up, vocabulary measurement in senior high schools in Indonesia largely employs embedded, selective, and context-dependent testing in the form of MCQs. It is embedded in the sense that vocabulary measurement constitutes part of a larger reading test, and thus measures only learners' receptive (or passive) vocabulary. It is selective since the words to be tested are chosen by the test designers, and context-dependent since the words are not presented in isolation. However, the test is not context-dependent in the full sense, since learners do not need to engage with the context in order to come up with the correct response (Read, 2000). To overcome this and the other problems mentioned previously, the following suggestions may be helpful:
1. The vocabulary construct should be revamped to reflect a true context-dependent vocabulary measure, in which all the options are possible variant meanings of the word, forcing the learner to make use of the context in order to come up with the correct response.
2. Teachers need to make use of informal vocabulary assessment so that words get recycled at least eight times. This can be done in various ways, such as a 10-minute review game at the beginning of every class that recycles and uses these words in different ways. Although these same words may not come up in the national exam, such activities can still train students in the reading skill of inferring the meaning of unknown words in a passage, or familiarize them with the types of text that will appear in the exam.
3. In order to improve test reliability, test designers must carefully consider the difficulty of the reading texts in national exams. Reading texts that are too difficult will considerably affect the accuracy of the score (Samad, 2010).
4. If the educational goal is to enable students to deal with English textbooks when they enroll in university, then teaching vocabulary geared towards mastering at least the 4,000-word level is desirable (Nurweni & Read, 1999). Teachers may consider using the Academic Word List (AWL) devised by Coxhead, since most of its words belong to academic registers. This list, and many other word lists, are freely available to download on the internet. This implies that the testing of vocabulary should also be directed at measuring vocabulary size, so that teachers may track the number of words students know over time. Thankfully, there are websites that can do this automatically for them, such as vocabularysize.com (http://my.vocabularysize.com/).
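The size estimates such sites report rest on a simple extrapolation, of the kind used in Nation and Beglar's (2007) Vocabulary Size Test: each sampled item stands for a fixed number of word families (100 in the VST), so the total score is multiplied by that sampling ratio. The function below is a sketch of that arithmetic, not the scoring code of any particular site; its name and defaults are assumptions.

```python
# Sketch of the extrapolation behind size estimates on tests like
# Nation & Beglar's (2007) Vocabulary Size Test, where each correct
# item is taken to represent a fixed number of word families.
def estimate_vocab_size(correct_items: int, families_per_item: int = 100) -> int:
    """Extrapolate total known word families from sampled test items."""
    return correct_items * families_per_item

# A learner answering 42 of the VST's 140 items correctly is estimated
# to know roughly 4,200 word families -- just above the 4,000-word
# level discussed above.
print(estimate_vocab_size(42))  # -> 4200
```

Tracking this estimate across the school year would give teachers the longitudinal picture of vocabulary growth that suggestion 4 calls for.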
VI. Conclusion
Vocabulary is an essential building block of language learning. As such, it makes sense to be able to measure it accurately. As important as it may be, it is sobering to know that there is a paucity of research into vocabulary assessment. Even more saddening is the fact that most contributors to the field are SLA and first-language reading researchers who might not have an adequate understanding of testing but who need vocabulary measurement to validate their own findings. It was not until the late twentieth century that researchers in the field of language testing began to pay more attention to vocabulary assessment. The current trend in vocabulary assessment is towards measuring learners' vocabulary size and vocabulary depth, also referred to as how many words they know versus how well they know these words. Other vocabulary distinctions that researchers might use in assessing vocabulary are receptive versus productive, informal versus formal, discrete versus embedded, selective versus comprehensive, and context-dependent versus context-independent. These distinctions may come in various test formats such as labeling, definition, translation, MCQs, yes/no checklists, matching, cloze tests, and embedded tests.
This paper has also described the current practice of vocabulary assessment in Indonesian EFL contexts, particularly in senior high schools in Bandung. The common practice is receptive, embedded, context-dependent, and selective vocabulary measurement in the form of MCQs. The paper has also mentioned problems in this assessment, including the difficult reading passages that underlie the reading questions, which make up 70% of the total questions in the English National Exam. Difficult reading passages severely hamper learners' comprehension, which in turn threatens test reliability. This paper thus calls strongly for test designers to reconsider the difficulty of the reading texts, and for teachers to adopt the direct teaching of high-frequency words, which further confirms the need to measure learners' vocabulary size and is in line with the new direction in vocabulary assessment.
References
Aziez, F. (2011). Examining the vocabulary levels of Indonesia's English national examination texts. Asian EFL Journal, 51, 16-29.
Cervatiuc, A. (2007). Assessing second language vocabulary knowledge. International Forum of Teaching and Studies, 3(3), 40-78.
Coombe, C. (2011). Assessing vocabulary in the classroom. Retrieved April 2nd, from http://marifa.hct.ac.ae/files/2011/07/Assessing-Vocabulary-in-the-Language-Classroon.pdf
DeVriez, B. (2012). Vocabulary assessment as predictor of literacy skills. New England Reading Association Journal, 47(2), 4-9.
Harmer, J. (2007). The Practice of English Language Teaching (4th ed.). Essex: Pearson Longman.
Hayati, A., & Mohammadi, M. (2009). Task-based instruction vs. translation method in teaching vocabulary: The case of Iranian secondary school students. Iranian Journal of Language Studies, 3(2), 153-176. Retrieved April 14th, from http://www.ijls.net/volumes/volume3issue2/hayati2.pdf
Hughes, A. (2003). Testing for Language Teachers (2nd ed.). Cambridge: Cambridge University Press.
McCarthy, M., & O'Dell, F. (2008). Academic Vocabulary in Use. Cambridge: Cambridge University Press.
Nation, P., & Beglar, D. (2007). A vocabulary size test. The Language Teacher, 31(7), 9-12.
Nemati, A. (2010). Proficiency and size of receptive vocabulary: Comparing EFL and ESL environments. International Journal of Education Research and Technology, 1(1), 46-53.
Nurweni, A., & Read, J. (1999). The English vocabulary knowledge of Indonesian university students. English for Specific Purposes, 18(2), 161-175.
Pearson, P., Hiebert, E., & Kamil, M. (2007). Vocabulary assessment: What we know and what we need to learn. Reading Research Quarterly, 42(2), 282-296.
Read, J. (1998). Word Associates Test. Retrieved April 15th, from http://www.lextutor.ca/tests/associates/
Read, J. (2000). Assessing Vocabulary. Cambridge: Cambridge University Press.
Read, J. (2007). Second language vocabulary assessment: Current practices and new directions. International Journal of English Studies, 7(2), 105-125.
Redman, S. (2003). Vocabulary in Use: Pre-intermediate & Intermediate. Cambridge: Cambridge University Press.
Samad, A. (2010). Essentials of Language Testing for Malaysian Teachers. Selangor: Universiti Putra Malaysia Press.
Thornbury, S. (2002). How to Teach Vocabulary. Essex: Pearson ESL.
UNESCO [United Nations Educational, Scientific, and Cultural Organization] (2011). Indonesia. World Data on Education (7th ed.). Retrieved April 15th, from http://unesdoc.unesco.org/images/0019/001931/193181e.pdf
Ur, P. (1991). A Course in Language Teaching. Cambridge: Cambridge University Press.