Innovative Assessment in Large Classes
Richard W. Buchanan and Martha Rogers
College Teaching, Vol. 38, No. 2 (Spring, 1990), pp. 69-73. Published by Taylor & Francis, Ltd.
Stable URL: http://www.jstor.org/stable/27558399
Innovative Assessment
in Large Classes
Richard W. Buchanan and Martha Rogers
We would like to offer some useful suggestions to solve some of the assessment problems frequently encountered with large classes. "Large classes" will be defined here as those with eighty students or more. Although this definition is somewhat arbitrary, it has been our experience that eighty students is the breaking point where traditional teaching techniques are no longer workable and new ones must be tried. This breaking point is particularly noticeable in the area of assessment. We've watched many of our colleagues struggle along with traditional approaches, such as essay examinations, up to the point where class enrollments exceed eighty. Then they normally collapse from overwork, delegate assessment to lower-level assistants, or start looking for new approaches.
This paper will show some solutions to three problems:

1. How to offer students in large classes an opportunity to be assessed in an essay format without straining the resources available for grading;
2. How to deal with students who miss a required examination; and
3. How to generate large numbers of new, relevant examination questions on a regular basis.

Richard W. Buchanan is senior lecturer in marketing at Massey University in New Zealand. Martha Rogers is an assistant professor of marketing at Bowling Green State University in Bowling Green, Ohio.
It is useful to begin by stressing that this paper is not, and was never intended to be, an elegant scientific examination of all the factors within its focus. It is our intention to share techniques that have worked for us in sections numbering between 50 and 350 students. One author typically teaches between two and three thousand students per year.

We have had only one graduate assistant assigned to each of us for a period of five to ten hours per week, and thus finding a means of dealing with mass numbers became a matter of survival. Virtually all of the solutions suggested by this article were the result of trial-and-error. As such, this paper cannot lay claim to having tested all possible solutions. In addition, although we have kept reasonably accurate records to test the effectiveness of various solutions, we have made no attempt to present them as anything other than approximations.
Our three assessment solutions will be presented and should be used simultaneously, as a total system. This is in keeping with our experience that it is best to treat instructional design as a system rather than to treat individual parts in isolation. To do otherwise often causes the solution to one problem to exacerbate another. Therefore, this paper will not only relate those parts of the system designed to deal with selected problems but will also mention some solutions for problems created by the new system itself.
Objective Tests: Imperfect but Unavoidable
Although people teaching large classes often try to avoid multiple-choice/true-false tests, we have found that such efforts seem to be appreciated by almost no one. Although colleagues may criticize the limitations of anything other than essay tests, they usually are willing to accept an alternative if more than fifty students are involved. Administrators may make noises about the desirability of essay examinations, but, in our experience, they are rarely willing to trade the time it takes to grade them for a lack of participation in either matters of administration or research/publication. Finally, students are not nearly so fond of them as their comments to the contrary might suggest.

For all these reasons we are assuming that the basis for assessment will primarily be objective questions. This assumption normally unleashes a storm of student complaints to the effect that "I just don't do well on objective tests." Although this may be the case for some, we have found that, generally, the belief just doesn't hold true.

Through the years we have often made it a point to offer both essay and objective final examinations to students who have been tested up to that
time in an objective format. Those who have taken the essay options have been graded on the basis of their examination without our first checking to see what their performance had been on objective test items. Only rarely has their letter grade on the essay final examination been different from the letter grade on previous objective tests. This observation concurs with the findings of Cowles and Hubbard (1952), Thompson (1965), and Bracht and Hopkins (1970). A study by Warren (1979) indicates that it may actually be easier for students to get high marks with multiple-choice than with essay tests (also see Hogan [1981]).
This rule-of-thumb, however, is not true for all students. And, even if it were true, it will not be useful for quieting students' objections if they think it is not true for them. For this reason, we've found it necessary to provide some way for students to be assessed in an essay format, while still protecting ourselves against the enormous time investment required to evaluate all students in this manner.
Some idea of how great a time investment may be involved can be determined by considering a hypothetical example. Suppose that a more or less standard ten-question, short-answer test intended to be taken in fifty minutes were to be given. Assuming that it takes a minimum of two to three minutes to grade each question means that assessing each paper in the most minimal fashion requires a total of from twenty to thirty minutes. Multiplying this figure by a not uncommon student load of six hundred students produces a figure of from two hundred to three hundred hours. Even if instructors were to spend all of their time grading papers on a forty-hour-week basis, each exam would take from five to seven weeks to process.
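The workload arithmetic above can be checked directly; a minimal sketch using only the figures given in the text:

```python
# Back-of-the-envelope check of the grading workload described above,
# using the figures from the text: a ten-question short-answer test,
# two to three minutes of grading per question, six hundred students.

questions_per_test = 10
minutes_per_question = (2, 3)   # (minimum, maximum) per question
students = 600

for mins in minutes_per_question:
    minutes_per_paper = questions_per_test * mins      # 20 or 30 minutes
    total_hours = students * minutes_per_paper / 60    # 200 or 300 hours
    weeks = total_hours / 40                           # at 40 hours per week
    print(f"{mins} min/question -> {total_hours:.0f} hours, {weeks:.1f} weeks")
```

The upper figure works out to seven and a half forty-hour weeks, which matches the "five to seven weeks" range in the text once rounded.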
Some might argue that this situation could be alleviated by the use of graders, but this technique has problems of its own. Among them are coordination/management of the graders, variability among graders, and the fact that students don't normally like to have their work assessed by someone other than the instructor.

All of these factors argue for a solution that offers students a chance to be assessed in an essay format but that will limit the number of students so assessed to reasonable numbers.
Self-Selective Essay Exams
We found that the only system that would fit into the preceding constraints had to be based on what many would term a "cafeteria" approach. The philosophical basis of this approach (which is frequently used in structuring employee benefit plans) is to offer "consumers" a number of options from which they can select the combination of items they prefer.

Students are, therefore, offered the following three options: (1) four objective concept tests only, (2) four objective concept tests and an optional final, or (3) three objective concept tests and an optional final. In options one and three, each test is worth 25 percent of their course grade; in option two, each test is worth 20 percent.

Those students electing to take the optional final are told

1. their current grade prior to the final (i.e., should they quit while they're ahead?);
2. that the final examination can hurt them as well as help them (i.e., a concept test can, under some circumstances, be dropped, but a final cannot be dropped if attempted);
3. the approximate percentage of students taking the final examination and the fraction of these improving their grades over the years;
4. that the final examination will consist of either a fifty-question objective test or a ten-question short-answer essay, both covering the entire course;
5. that students will have to decide prior to taking the final which version they will attempt (i.e., they could not look at both and decide which version was easier); and
6. that most students in the past have preferred the objective version because it loads their risk into small (two points each) components rather than large (ten-point) "hunks."
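The weighting in the three options can be expressed as a small calculation. This is a sketch under the weights stated in the text (25 percent per component for options one and three, 20 percent for option two); the helper name and sample scores are illustrative, not from the article:

```python
# Sketch of the three "cafeteria" grading options described above.
# Each score is a 0-100 percentage. Options 1 and 3 have four graded
# components at 25% each; option 2 has five components at 20% each.

def course_grade(concept_tests, final=None):
    """Return the weighted course grade for one student."""
    components = list(concept_tests)
    if final is not None:
        components.append(final)
    n = len(components)
    if n not in (4, 5):
        raise ValueError("expected 4 components (options 1/3) or 5 (option 2)")
    weight = 0.25 if n == 4 else 0.20
    return sum(score * weight for score in components)

# Option 1: four concept tests only
print(round(course_grade([80, 90, 70, 85]), 2))            # 81.25
# Option 2: four concept tests plus the optional final
print(round(course_grade([80, 90, 70, 85], final=75), 2))  # 80.0
# Option 3: three concept tests plus the final
print(round(course_grade([80, 90, 70], final=85), 2))      # 81.25
```

Note that in every option the weights sum to 100 percent, so no single component dominates the course grade, which is the point the authors make about not loading evaluation into one day of the term.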
When the options are presented to them in this manner, only 10 to 15 percent of the students enrolled in large courses have elected to attempt the final. Of those taking the final, no more than 20 percent chose the essay version, and, typically, only six or seven in a class of three hundred students.
These numbers, though manageable enough, have been distilled even further by a refinement of the system that was produced to meet what proved to be a product of the authors' teaching styles. When teaching large classes, we've found it useful to make sure that the lectures contain enough material not covered in the supporting text to make it worthwhile for students to attend lectures. We tell the students that this material will be both presented and the subject of examination questions (i.e., at least 30 percent of a test's items will not be found in the book).

Because it is generally impossible to videotape the lectures, those students who miss many classes have a very real problem, although they could miss at least one concept test without penalty. However, if the final examination covers both the text and the lecture, they are still at risk for those topics covered during their absences. For this reason we decided to make the objective version of the final examination cover the text only, while the essay version is drawn from both the text and lecture. Generally, the lectures are more oriented to applications of knowledge than to definitions or facts, and we believe that these applications are better tested in an essay format.
Once this refinement was made, the percentage of students taking the final exam remained about the same, but the number electing the essay version has dropped to a fraction of 1 percent. Still, it has always been there if anyone wanted to complain about not doing well on objective tests. To the best of our knowledge, no complaints about the unavailability of essay tests have ever been made about our large classes.

It may also be useful to know that the percentage of students attempting the final usually falls over time, possibly because the grapevine eventually spreads the word that the final is not a particularly soft option. At any rate, the ceiling on the people attempting it
seems to be about 10 to 15 percent of those enrolled.

How many concept tests should be administered? Students complain if there are fewer than four concept tests, because administering three or fewer exams causes the amount of material to be covered on each one to be unmanageable. Having more than four seems impractical because it multiplies the resources needed beyond a point of diminishing returns.

The basis of this system is in direct contrast to what seems to be an academic tradition of placing relatively greater emphasis on the final examination than on others such as the concept tests. However, it is not our intention to load most of a student's evaluation into his or her performance on only one day of the term.
Abolishing Makeup Exams
Once the "cafeteria" style is adopted, it then becomes possible to use it to solve other problems such as makeup exams.

Having students absent from a required exam is never a comfortable situation. Professors dread the inconvenience of constructing a makeup exam and find distasteful the thought of serving as judge, jury, and executioner in determining whether excuses are acceptable. At the same time, students don't like having their integrity questioned by an unpredictable, often insensitive system that they frequently suspect of being punitive. These more or less standard complaints explode in their intensity when multiplied by the enrollments of a large class.
Before tackling the problem of makeup tests, we realized that 15 to 25 percent of students might be absent from any given examination. When applied to a class enrollment of 80 to 350, and multiplied by several sections, the total number of students likely to be involved is beyond the scope of traditional methods for handling them.

The first problem is processing the flood of individuals who show up at an instructor's door either prior to or shortly after an examination with their excuses for being absent. If only five minutes is spent with each person, the total time invested would leave little time for doing anything else. Beyond this, we have felt totally helpless to determine which excuses are truthful, justified, or both.

Even if all the absentees could be accommodated, their sheer numbers make it impossible to arrange a time and place for a makeup exam that they can all attend. Finally, if a makeup test is allowed, there is no way to make it fair for all concerned. If anyone is allowed to take the test prior to the regular class, then someone is bound to feel that those taking the makeup will pass questions on to their friends. And, if a totally different test is given as a makeup, someone will argue that it is harder (or easier) than the regular test.

At this point it would have been tempting to surrender the entire matter and decide to accept absolutely no excuses except those that conform to university policy and are supported by approved documentation (i.e., a student health center doctor's excuse, etc.). But common sense suggests that this limitation would overlook some perfectly valid situations, and this would lead to further conflict. Although such conflict may be permissible in smaller class settings, it definitely is not for large ones.

One thing that large classes teach their instructors is never to tolerate any situation that strikes a large number of students as unfair. Reasonable university administrators are used to discarding the opinions of what they may perceive as a handful of disgruntled students. They are much more likely to take action if fifty or a hundred gather outside their door.
After considering all the problems associated with makeup exams, we decided to offer the students the option (previously discussed) of being assessed on the basis of three concept tests and an optional final examination that becomes mandatory if a student misses one of the concept tests. At the time the students are informed of this option they are also told that

1. they do not need to inform the instructor or get permission to miss a test;
2. by taking this option, they are also giving up the ability to drop a low test (i.e., what really is happening is that students are given the ability to drop a low test score, either a bad performance on a test taken or no performance on one they missed);
3. if they miss another test or the final they will fail the course; and, most important,
4. no makeups will be given for any reason to anyone.

Besides all this they are also told the specifics of the final examination, which have already been introduced in a preceding section.
This system has had remarkable results. Only a handful of students come to the office door each year to ask about the possibility of a makeup. Beyond this, the percentage of students electing to miss any given concept test has averaged around 5 percent of those enrolled. And we are relatively certain that any who do miss a test under these circumstances have reasons that they think are justifiable.

Limitations of this part of the system should be mentioned. Most important, when a student misses an exam, that student has not been assessed on a significant percentage of course material. Although we have not yet tried it, one solution to this drawback would be to give more weight on the final exam to those items assessing the material on the missed exam. This weighting would be procedurally simple. Each student taking the final exam will do so either voluntarily as a fifth exam or to make up for a missed exam. The student's record will reveal which is the case, and, if the latter, which exam was missed. It is then a relatively simple matter to weight the items from the missed exams more heavily.
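The reweighting idea can be sketched in a few lines. This is a hypothetical illustration, not the authors' tested procedure (they note they have not tried it): the item tagging, the function name, and the 1.5x boost factor are all assumptions made here for concreteness.

```python
# Hypothetical sketch of the reweighting idea above: final-exam items
# that cover material from the missed concept test count extra.
# The 1.5x boost and the item-tagging scheme are illustrative
# assumptions, not taken from the article.

def weighted_final_score(items, missed_exam=None, boost=1.5):
    """items: list of (covers_exam, correct) pairs, where covers_exam
    is the concept test (1-4) whose material the item tests and
    correct is True/False. Returns a 0-100 percentage."""
    earned = total = 0.0
    for covers_exam, correct in items:
        w = boost if covers_exam == missed_exam else 1.0
        total += w
        earned += w * correct
    return 100.0 * earned / total

# A student who missed concept test 2: items covering test 2 weigh 1.5x.
items = [(1, True), (2, False), (2, True), (3, True), (4, False)]
print(weighted_final_score(items, missed_exam=2))
```

Because the weights are normalized by their own total, the score stays on the same 0-100 scale whether or not a concept test was missed.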
Additionally, in a few rare cases, a student has tried to test the system either by challenging it or by missing two examinations. In the first category, an entire hockey team had their coach call, first, a department chair, and then the dean, trying to get an excused absence. These matters were easily dealt with as soon as both the specifics, the rationale behind the system,
and the clarity of the presentation to
students at the outset of the course
were explained to the administration.
When students miss two examinations, we've found it easy to deal with them on a case-by-case basis. Frequently those who miss two exams never even bother to come in and simply accept their failing grades.
Generating Test Items
It is certainly true that generating test questions is not particularly easy for any course. However, large class sizes produce unique pressures. One problem is introduced by the sheer volume of the class. The students are likely to fill the largest auditorium available, or at least a large amphitheater classroom. Thus, when tests are given there is no way to spread students out with a seat between each of them. There must be at least two (and possibly more) versions of each test given for each examination. We find that reordering the items accomplishes this.

If two or more sections are taking the examination at different times, each "sitting" will probably need entirely different examinations as well. Otherwise, information about the exam will flow from the earlier class to the later one. The most obvious way this can happen is if copies of the test are pilfered and removed from the examination room. However, even with stringent security, this is not the only way for exams to "get out." We once learned of a student in an early section who had taken the exam with a tape recorder in his pocket. He apparently sat at the back of the room and mouthed the questions into the recorder; then, he left the room, looked up the answers, and gave copies of the tests to his friends. We've also heard that social organizations have directed individual members to carry questions from the test in memory (i.e., "you do one to five and she will do six through ten," etc.).
Finally, large class sizes usually demand that all-new tests be constructed for each test each year. Large classes, which tend to be entry-level courses, are tempting targets for the development of files that can be passed on from year to year once students learn that exams may be repeated.

The thing that makes all of these seemingly paranoid fears more real is that a large class size escalates the value of misappropriated information or copies of exams. A graduate assistant, caught in a campus security forces raid, had apparently been selling copies of exams for $100 each.
All of these concerns mandate that a large number of test items be developed continually. The only problem is that the instructor of a large course may tend to specialize in it after a while. Since the same textbooks and relatively similar lectures are used year after year, the instructor may find a diminishing ability to generate new objective test items.

A popular solution to this problem, test banks supplied by textbook companies, may fail on two counts. One difficulty is that the test bank has questions that apply only to the text. As already stated, we find it desirable that lecture content and textual material be different. If this is the case, there may be a large body of information that will not be tested if questions come from test banks only. Once students figure this out (and they will), lecture attendance will fall.

Furthermore, our experience with textbook test banks has been mixed. Some of the questions seem poorly worded, ambiguous, or irrelevant.

Ironically enough, the resource that can solve this problem is the same as the one that causes most of the other problems: the large size of the class.
Student-Generated Test Items
The sheer volume of students in a large class represents a source of aid seldom recognized by teachers. Channeled in the right direction, the sum total of talents within a large class is usually more than equal to its challenges. Where a small class may have only four or five outstanding students, big ones may have fifty or more.

We have designed a system that enables this resource to be put to work. In a handout issued before the first exam, students are told that they can submit potential examination questions in a specified format. The motivations for students' writing test items are that (1) they can have the satisfaction of seeing their own questions used with their names attached (i.e., the instructor will identify the author of the question on the exam if the student wishes it); (2) if they submit the question they presumably will get it correct on the exam; and (3) the teacher agrees to "pay" them two points of additional credit for each question chosen (the same as each question is worth on an examination/grading system based upon total points).
Telling students the correct format for submitting questions has proved to be crucial, as otherwise the instructor can be deluged with pieces of paper that are very difficult to process. For this reason we insist that students can submit up to ten questions per exam, that all questions must be on a standard 5"-x-7" card, that each question must be either typed or legibly printed on a separate card, and that information giving the correct answer, the source (i.e., page of the text, date of lecture, etc.), and the identity of the author must be provided for each question.

Over the years students operating under these constraints have provided many of our test items. We have been happily surprised by the quality of the questions. Although as many as ten questions may be sifted to get one good one (and even this one usually requires rewriting), we believe that those selected have been of a caliber at least as good as many of those in test banks and are often less trivial and more conceptual.
The students' reaction is always difficult to assess. We've tried to be particularly attentive to any dissatisfaction. Although we've occasionally had the complaint that "the exam (grade) doesn't reflect how much I know," the comments from mandatory student evaluations of course and instructor have been fairly repetitive of those we've received about questions generated traditionally. The only complaint unique to this system is that "the instructor shouldn't be so lazy as to let others write his exams," and these seem to be rare and to lack passion.
Reactions from colleagues and university administrators have been difficult to assess because most of them seem oblivious to the system. Although we've been careful to get administrative approval of this approach, permission to use it has proved easy to get. Once permission has been granted, we have yet to hear much about the whole concept from anyone not connected to the course, presumably because there have been no complaints.

At any rate, the system does seem to generate questions that are good to excellent once they have been filtered and rewritten. We believe it's important to make every effort not to include simplistic questions that measure merely rote memorization.
Our experience has shown that on the average there are normally from one to one-and-one-half questions submitted per test per student enrolled. Thus if three hundred students are enrolled in the class, the instructor may expect from 300 to 450 questions to be submitted per test, and more for multiple sections. Usually the total number of questions climbs with each successive test, as some students discover that they can increase their grades and others realize that they are now in academic trouble and need the bonus points. Furthermore, students repeatedly tell us that writing questions is an effective way to review for exams.
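The expected volume is simple to estimate from the submission rates just given; a quick sketch (the function name and parameters are illustrative):

```python
# Rough estimate of submitted-question volume, using the rates from
# the text: one to one-and-one-half questions per student per test.

def expected_submissions(enrollment, rate_low=1.0, rate_high=1.5, sections=1):
    """Return the (low, high) number of questions expected per test."""
    return (enrollment * rate_low * sections,
            enrollment * rate_high * sections)

low, high = expected_submissions(300)
print(f"One section of 300 students: {low:.0f}-{high:.0f} questions per test")
```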
The total number of questions submitted may frighten some teachers, but it shouldn't, because a number of techniques can make the job of processing them easier. First, their sheer numbers mandate that having some remote location for their deposit, like a faculty room mailbox, is probably a good idea. Once the questions are all in one large stack, we suggest making an outline of topics to be covered on the exam. Then the items that "fit" can be used until the exam covers all the necessary topics. If an item looks promising it is kept; if not, it is discarded. In order to reward as many individual students as possible, we accept no more than two questions per student. This is fairly easy to keep track of as students submit the questions in batches that are
paper-clipped or banded together.

It should be noted that it is not necessary to read all the questions submitted. In fact, we find that it's best to be honest about telling the students that we will simply reach into the stack and draw out questions until we have the right mix of good ones to create the exam desired. They seem to accept this lottery approach without much complaint.
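The lottery just described, with its topic outline and two-questions-per-student cap, amounts to a simple selection loop. This is a sketch under assumptions: the card fields and topic scheme are illustrative, and drawing a card stands in for the instructor's "if it looks promising, keep it" judgment, which a program cannot make.

```python
import random

# Sketch of the selection "lottery" described above: draw cards at
# random, keep those that fit a still-needed topic, and accept no
# more than two questions per student. Card fields ('student',
# 'topic', 'question') are illustrative assumptions.

def build_exam(cards, topics_needed, per_topic, max_per_student=2):
    """Return accepted cards once every topic has per_topic items."""
    random.shuffle(cards)  # "reach into the stack" blindly
    chosen = []
    per_student = {}
    per_topic_count = {t: 0 for t in topics_needed}
    for card in cards:
        t = card["topic"]
        if t not in per_topic_count or per_topic_count[t] >= per_topic:
            continue  # topic not on the outline, or already covered
        if per_student.get(card["student"], 0) >= max_per_student:
            continue  # reward as many individual students as possible
        chosen.append(card)
        per_topic_count[t] += 1
        per_student[card["student"]] = per_student.get(card["student"], 0) + 1
        if all(c >= per_topic for c in per_topic_count.values()):
            break  # the exam now covers all the necessary topics
    return chosen
```

In practice each drawn card would also pass through the quality screen the authors describe (rewriting or discarding weak items); here every drawn card that fits the outline is kept.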
Using this procedure, we've found that a standard fifty-item test can be constructed in from two to three hours per test. This certainly compares favorably with the time necessary to create questions oneself. And since the questions selected are all typed on standard-sized cards with correct answers attached, this system is also usually popular with the word-processing department; it is an easy matter for them to turn a standard title page and a rubber-banded stack of fifty questions into a finished examination. Once their work is complete, they can then pass the stack of questions on to a person creating an answer key for grading purposes. Finally, copies of this key with the page number where test items are located can be passed back to students with their answer sheets so that they can check to see that an answer they missed really does exist.
For those faced with the responsibilities of teaching large classes, this article was intended to resolve some practical aspects of the problems of assessment. It is sometimes hard to depart from traditional methods without much soul-searching about whether the innovations are somehow a dilution of the quality of the original. The fact is, we have little empirical evidence to guide us, and we must accept the changes that the resources at hand dictate in a way that seems best for everyone.
REFERENCES

Bracht, G. H., and K. D. Hopkins. 1970. Communality of essay and objective tests of academic achievement. Educational and Psychological Measurement 30 (Summer): 359-64.

Cowles, J. T., and J. P. Hubbard. 1952. A comparative study of essay and objective examinations for medical students. Journal of Medical Education 27, Part 2: 14-17.

Hogan, T. P. 1981. Relationship between free-response and choice-type tests of achievement: A review of the literature. Washington, D.C.: National Institute of Education. (ERIC Document Reproduction No. ED 224 811)

Thompson, R. E. 1965. A study of the comparative predictive validities of the essay and objective sections of the College Entrance Examination Board Advanced Placement Examination in Physics. Princeton: Educational Testing Service. Test Development Report 65-4.

Warren, G. 1979. Essay versus multiple-choice tests. Journal of Research in Science Teaching 16 (November): 563-67.