
DOCUMENT RESUME

ED 386 472  TM 023 975

AUTHOR  Powers, Donald E.; Wilson, Susan T.
TITLE  Passage Dependence of the New SAT Reading Comprehension Questions. College Board Report No. 93-3.
INSTITUTION  College Board, New York, NY; College Entrance Examination Board, New York, NY.
REPORT NO  ETS-RR-93-60
PUB DATE  93
NOTE  23p.
AVAILABLE FROM  College Board Publications, Box 886, New York, NY 10101-0886 ($12).
PUB TYPE  Reports Research/Technical (143)
EDRS PRICE  MF01/PC01 Plus Postage.
DESCRIPTORS  *College Entrance Examinations; Construct Validity; High Schools; *High School Students; Questionnaires; Reading Comprehension; *Reading Tests; Test Construction; *Test Items; *Verbal Ability
IDENTIFIERS  *Passage Dependency; *Scholastic Aptitude Test; Strategy Choice

ABSTRACT

It has been reasonably well established that test takers can sometimes answer correctly some reading comprehension questions without reading the passages on which the questions are based. This issue was studied with the new Scholastic Aptitude Test (SAT) in a study designed to determine the strategies by which examinees are able to achieve better-than-chance performances without reading the passages. Sets of sample reading comprehension questions were administered, without passages, to 350 verbally able students in 8 secondary schools across the country. After completing the task, students were asked to complete a questionnaire describing the strategies they had used. The most often cited strategies involved choosing answers on the basis of consistency with other questions and reconstructing the main theme of a missing passage from all the questions and answers in a set. These strategies were also more likely to result in successful performance on individual test items than any of the many other possible approaches. Implications for the construct validity of the new SAT are discussed. One figure and 14 tables present analysis results. (Contains 25 references.) (Author/SLD)

***********************************************************************
Reproductions supplied by EDRS are the best that can be made from the original document.
***********************************************************************


Passage Dependence of the New SAT
Reading Comprehension Questions

U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)

This document has been reproduced as received from the person or organization originating it. Minor changes have been made to improve reproduction quality. Points of view or opinions stated in this document do not necessarily represent official OERI position or policy.

"PERMISSION TO REPRODUCE THIS MATERIAL HAS BEEN GRANTED BY [signature] TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)"

DONALD E. POWERS and SUSAN T. WILSON

The College Board: Educational Excellence for All Students


College Board Report No. 93-3
ETS RR No. 93-60

Passage Dependence

of the New SAT®

Reading Comprehension

Questions

DONALD E. POWERS and SUSAN T. WILSON

College Entrance Examination Board, New York, 1993


Donald E. Powers is a senior research scientist at ETS. Susan T. Wilson is a senior research associate at ETS.

Researchers are encouraged to freely express their professional judgment. Therefore, points of view or opinions stated in College Board Reports do not necessarily represent official College Board position or policy.

The College Board is a national nonprofit association that champions educational excellence for all students through the ongoing collaboration of more than 2,900 member schools, colleges, universities, education systems, and organizations. The Board promotes, by means of responsive forums, research, programs, and policy development, universal access to high standards of learning, equity of opportunity, and sufficient financial support so that every student is prepared for success in college and work.

Additional copies of this report may be obtained from College Board Publications, Box 886, New York, New York 10101-0886. The price is $12.

Copyright 1993 by College Entrance Examination Board. All rights reserved. College Board, SAT, and the acorn logo are registered trademarks of the College Entrance Examination Board.

Printed in the United States of America.

Acknowledgments

Many people made significant contributions at various stages of this project. At the early stages of proposal development, Warren Willingham provided useful advice on the design of the study. Barry Druesne, Diane Rdesinski, and Cathy Wendler helped identify appropriate files from which we were able to contact students. Diane, along with Pete Smith, assisted in matching study records with SAT files. Sydell Carlton, Ed Curley, Patricia Klag, Gerry May, and Eric Wimmers helped identify possible strategies that students might use and provided advice about other aspects of the study. Irv Katz discussed think-aloud protocol analysis with us. Gloria Weiss was instrumental in identifying schools to participate in the study and in advising us on enlisting their help. Nancy Burton, Fred Schuppan, and Nancy Wright provided item analysis data for the reading questions we used, as well as other information about SAT takers. Inge Novatkoski set up our files and conducted early data analyses. Drew Bowker, Dave Freund, Laura Jenkins, and Pat O'Reilly conducted the data analyses, and Ken Wilson provided the final analyses of the data. Corrine Cohen helped code and edit the data. Ruth Yoder provided administrative assistance throughout the project and prepared the manuscript. Nancy Burton and Lawrence Stricker provided helpful comments on an earlier draft. The final word of thanks goes to the anonymous teachers and students at the eight schools at which the study was conducted.


Contents

Abstract
Introduction 1
Method 1
  Checklist Development 1
  Selection of Reading Passages 2
  Sample Selection 2
  Data Collection 2
Results 3
  Description of the Sample 3
  Descriptive Statistics for Test Scores 3
  Item-Level Comparisons 4
  Descriptive Statistics on Test-Taking Strategies 7
  Relationships between Strategy Use and Without-Passage Performance 10
  Relationships between Without-Passage Scores and SAT Scores 13
  Student Recommendations 13
Discussion 15
References 17

Figure
1. Directions to study participants. 3

Tables
1. Comparison of Study Sample with All College-Bound Seniors Who Took the SAT in 1991-92 3
2. Mean Test Scores for Reading Passage Sets 4
3. Performance on Reading Comprehension Questions Given with and without Reading Passages 5
4. Descriptive Statistics for 54 Reading Comprehension Questions Administered without Reading Passages 6
5. Comparison of Responses for Selected Questions Given with and without Passages 7
6. Frequency of Use of Various Test-Taking Strategies 9
7. Frequency of Use of Test-Taking Strategies Involving Features of Answer Choices 10
8. Strategies Most Frequently Listed as Most Helpful for Individual Test Questions 11
9. Significant Correlations between Use of Strategies and Performance on Question Sets 12
10. Significant Correlations between Test Performance and Use of Answer Choice Features to Choose Answers 12
11. Significant Correlations between Test Performance and Use of Answer Choice Features to Rule Out Answers 13
12. Logistic Regression of Performance on Most Helpful Strategies for Selected Questions 14
13. Percentage of Test Takers Who Answered Selected Questions Correctly, by Use of Selected Strategies 14
14. Correlations between SAT Scores and Performance on Questions without Passages 15


Abstract

It has been reasonably well established that test takers can, to varying degrees, answer some reading comprehension questions correctly without reading the passages on which the questions are based. This is true even for carefully constructed measures such as the College Board's SAT. As a result, the use of reading scores, including those from the SAT, as valid indicators of reading comprehension has been challenged.

The major aim of this study was to determine the strategies by which examinees are able to achieve better-than-chance performances without reading the passages. The focus of the research was a sample of reading comprehension questions similar to those that will be used in the revised SAT, to be introduced in 1994. Sets of reading comprehension questions were administered, without the passages, to a sample of verbally able students in eight secondary schools across the country. After attempting the task, these students were asked to complete a questionnaire describing the strategies they had used. The most often cited strategies involved choosing answers on the basis of consistency with other questions and reconstructing the main theme of a missing passage from all the questions and answers in a set. These strategies were also more likely to result in successful performance on individual test items than were any of the many other possible (and less construct-relevant) strategies. Implications are discussed with regard to the construct validity of the new SAT reading comprehension questions and the advice that should be given to prospective SAT takers.

Introduction

A primary objective of test makers is to ensure that test performance is not unduly influenced by factors outside the proper focus of measurement. This intent is formalized in the Standards for Educational and Psychological Testing (AERA, APA, and NCME 1985). Irrelevancies such as test takers' level of familiarity with testing procedures or their use of particular test-taking strategies ought to play only a small role in determining success on any examination. Strategies that raise test scores but bear little if any relationship to what the test was designed to measure may diminish the predictive power of a test or dilute the meaning of scores derived from it.

For tests of reading comprehension, one strategy that examinees may use (and one over which test administrators have little if any control) is to answer test questions without reading the passages on which they are based. Unfortunately, little information exists on the prevalence of this approach, but the evidence that is available suggests that adult readers do not employ this strategy often (Farr, Pritchard, and Smitten 1990). Nonetheless, there has been a recent revival of interest in the extent to which success in answering reading comprehension questions depends on having read the passages with which they are associated. Katz, Lautenschlager, Blackburn, and Harris (1990) found that college students can perform at better-than-chance levels on SAT reading questions even when they do not have access to the reading passages. This finding is, of course, not a new discovery. It has long been acknowledged that reading comprehension questions vary in the degree to which they depend on the passages (Preston 1962), and this phenomenon has frequently been studied (Preston 1964; Pyrczak 1972, 1974, 1975; Tuinman 1973-74; Weaver and Bickley 1967). Nor is this finding surprising. As Conlan (1990) argued, it would be highly unusual, given current conceptions of reading comprehension, if examinees were not able to extract some information from the test questions themselves.

The renewed interest in passage dependence is timely in light of the planned revisions to the current SAT. Beginning in 1994 the verbal reasoning portion of the SAT will place greater emphasis on critical reading, and vocabulary will be measured in context rather than with discrete antonym items. Fewer reading passages will be used (four instead of six), but the passages will be longer. The net result is that the proportion of reading questions on the SAT will increase by nearly 60 percent.

The principal objective of the present study was to determine the particular ways in which test takers may glean useful information solely from the kinds of reading comprehension questions that will be used in the revised SAT. A second aim was to assess the construct relevance of these strategies, i.e., the extent to which they entail legitimate verbal reasoning abilities versus testwiseness or other such skills that are less germane to measuring reading comprehension and readiness for postsecondary study. The final purpose was to determine the degree to which the kinds of reading questions to be used in the revised SAT can be answered correctly without the passages and to gain insight into the question characteristics that may contribute to their susceptibility to this strategy.

Method

Checklist Development

A checklist of possible strategies was developed to determine how test takers might approach tests when reading passages are not available. Several activities were undertaken to gain insight into this process. First, two experienced test developers were asked to answer sets of reading questions without the passages. Because these particular developers were familiar with most of the currently available SAT passages, reading passages from disclosed forms of the Law School Admission Test (with which they were not familiar) were used. A number of useful insights resulted from this process. The most interesting involved the use of one strategy in particular. By carefully reading an entire set of questions, these specialists were able to reconstruct, to a considerable degree, the passage on which the questions were based. In effect, for these highly literate people, the task became an extended cloze test requiring sophisticated verbal reasoning.

One of the authors also conducted individual think-aloud sessions with six local high school students as they attempted to answer sets of SAT questions without access to reading passages. These encounters generated additional hypotheses about how students might approach this unconventional assignment.

Next, a sample of approximately 300 students was surveyed and asked to answer a set of 9 to 12 reading questions without the passages. Every third student received a different set of questions. Students were selected from SAT registrant files so that they would represent a range of academic ability (about two-thirds reported being A students and one-third C students). A total of 33 students, nearly all A students, responded to our request to attempt the questions and then to indicate for each one the strategies they had employed. It should be noted that the participants in these pilot activities were not prompted in any way about how to approach the task. The objective at this stage was to uncover any strategies that might be employed naturally in attempting the exercise. Finally, a number of test preparation books were examined for any suggestions that might relate to the study. Selected literature on testwiseness (e.g., Millman, Bishop, and Ebel 1965) was consulted for additional clues.

The data from these activities were used as the basis for the categories in the final checklist. Possible strategies were classified as reasoning strategies, use of personal knowledge or experience, strategies for vocabulary questions, guessing, and use of features of answer choices. The checklist was pilot tested on a small number of local students, reviewed by several Educational Testing Service (ETS) staff members, and revised before it was used in the study proper.

Selection of Reading Passages

Six reading passages and associated questions were selected from among those pretested in earlier trials of the revised SAT. The nature of the passages and the number of questions associated with each one were as follows:

1. A passage of approximately 900 words on language, in which the author, a Japanese American, recounts an experience he had just after the United States entered the Second World War (12 questions).

2. A 500-word passage adapted from an excerpt of a memoir written by Elizabeth Bishop about the poet Marianne Moore (6 questions).

3. An 800-word passage about Clarence Darrow and the Communist trial of 1920 (9 questions).

4. A 600-word passage that presents a theory about the nature of the object that exploded above Tunguska in 1908 (9 questions).

5. Two passages totaling about 800 words that present two views of the architectural design of cities. One discusses planned, medium-sized cities; the other offers a critique of modern cities (13 questions).

6. A 500-word passage excerpted from a book of literary criticism analyzing the work of Richard Wright (1908-1960) (5 questions).

Sample Selection

The objective was to recruit students who would be most likely to perform well on the task at hand (i.e., answering reading questions without reading the passages) and be able to indicate the strategies involved in their performance. A list was compiled of secondary schools that had volunteered (but were not selected) to participate in the earlier field trials of the revised SAT. From this list, a number of schools having above-average mean SAT-Verbal scores were identified and invited to participate in the study. To further ensure that the sample included a preponderance of verbally able students, school personnel targeted Advanced Placement or honors English classes. Eight schools were eventually selected. These schools were located in California, Colorado, Florida, Maryland, New Jersey, New York, and West Virginia.

Data Collection

Three different test forms were assembled, each consisting of 18 questions from the first and second, third and fourth, or fifth and sixth passages described above. The number of questions was deliberately limited, in order to maintain students' motivation to do this unconventional task, for which they had no particular reason to exert any


This booklet contains two sections. The first section consists of two groups of reading comprehension questions without the passages on which the questions are based. We would like you to answer these questions to the best of your ability without first reading through their accompanying passages. Feel free to use any type of strategy that you need to solve these problems.

The only instructions that we ask you to follow in this section are:

1) Indicate the time you started answering the questions and the time you finished in the spaces provided for each group of questions.

2) Circle your answer to each question on the test form.

3) In addition, for each question, please cross out (X) any choice that you are able to rule out. (Do this when you are able to "narrow down" the answers, rather than guess among all the answers.)

Here's an example of what this might look like on your test form:

The "unfinished work" referred to in line 50 is the
(A) battle of Gettysburg
(B) defense of freedom
(C) establishment of a government
(D) dedication of the battlefield
(E) honoring of the fallen soldiers

4) Please answer each question; you should definitely guess when you are uncertain about the answer to a question. Unlike the real SAT, it is to your benefit to guess at an answer in this study. No points are taken off your score for wrong answers.

FIGURE 1. Directions to study participants.

extraordinary effort. The only concrete incentive was the opportunity for one student from each school to win a $100 prize on the basis of test performance. Specifically, students at each school were told that those earning test scores of at least 30 percent would be eligible for the prize. This score, just above a chance performance of 20 percent, was thought to be within the reach of every student and therefore potentially motivating.
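The 30 percent eligibility threshold can be put in perspective with a quick binomial calculation. This sketch is ours, not the report's: it estimates how often a student guessing blindly on all 18 five-choice questions (p = .2 per item) would reach 30 percent correct, i.e., at least 6 of 18.

```python
import math

# Probability of clearing the study's 30% prize threshold (>= 6 of 18
# correct) by blind guessing on five-choice items (p = 0.2 per item).
# Illustrative only; the report states the threshold but not this figure.
n, p = 18, 0.2

def binom_pmf(k, n, p):
    # P(X = k) for X ~ Binomial(n, p)
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

p_at_least_6 = sum(binom_pmf(k, n, p) for k in range(6, n + 1))
print(round(p_at_least_6, 3))  # 0.133
```

By this estimate, roughly one pure guesser in eight would clear the threshold, so the prize criterion was indeed "within the reach of every student" while still rewarding better-than-chance work.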

Classroom teachers administered the tests during regular classroom periods in the spring of 1992. The three test forms were alternated within each classroom so that every third student received a different form. That is, students 1, 4, 7, and so on got Form A; students 2, 5, 8, and so on got Form B; and students 3, 6, 9, and so on got Form C. The checklist of possible strategies was included with each test form. Instructions to students are shown in Figure 1.

Students were asked to indicate the times at which they started and finished answering each group of questions in order to evaluate the efficiency of answering questions without passages as well as the effectiveness of the strategy used. Students' actual SAT scores and a variety of other background data were retrieved from SAT files.

Results

Description of the Sample

Records were available from SAT files for 271 of the 350 students (all high school juniors) who participated in the study. It is likely that those students whose records could not be located had not taken the SAT. Table 1 compares the study sample with all 1992 college-bound seniors who took the SAT on selected variables. The study sample was, as expected, very able. The test scores of the study sample were on average higher by about one standard deviation than were the scores of SAT takers in general. In addition, the study sample reported higher grade-point averages and class ranks than did all college-bound seniors. For example, nearly half (47 percent) of the sample were in the top 10 percent of their classes, compared with 21 percent of all college-bound seniors. A majority (66 percent) of the sample indicated that they had taken honors English courses, compared with about 32 percent in the college-bound senior population (College Board 1992).

Descriptive Statistics for Test Scores

TABLE 1

Comparison of Study Sample with All College-Bound Seniors Who Took the SAT in 1991-92

                            Study Sample   All 1991-92 College-Bound
                            (N=350)        Seniors (N=1,030,000)
Sex (% female)                    54              52
Honors English taken (%)          66              32
High school rank (%)
  Top tenth                       47              21
  Second tenth                    27              22
  Second fifth                    17
  Third fifth                      9              25
  Fourth fifth                     0               4
  Fifth fifth                                      1
High school GPA (%)
  A+                              22
  A                               23              12
  A-                              20              14
  B                               35              52
  C                                               17
  D, E, or F                       0               0
SAT scores
  SAT-V                          542             423
    (SD)                          88             112
  SAT-M                          606             476
    (SD)                         102             123
  TSWE                            52              42
    (SD)                           7              11

Source: College Board, College-Bound Seniors: 1992 Profile of SAT and Achievement Test Takers (Princeton, N.J.: Educational Testing Service, 1992).
Note: Statistics for study sample are based on SAT records for 271 of 350 participants.

Table 2 shows the mean score and mean percentage of items correct for each of the six sets of reading questions when administered without the reading passages. For each set, performance was better than would be expected from random guessing on the five-choice questions (i.e., 20 percent). However, for three of the sets the percentage of correct responses (26-29 percent) was not much better than would be expected by chance. Only for the smallest set of questions was performance (59 percent correct) commensurate with that typically observed when reading passages are available. As might be expected, given both the nature of the task and the relatively small number of items, the estimated reliability of each test form was low. Internal consistency estimates (coefficient alpha) were .44, .47, and .19 for the three forms, respectively. The internal consistency also varied for question sets within forms: .35 and .36 for those in Form A, .43 and .32 for Form B, and .09 and .37 for Form C. In comparison, the reliability of a 25-item pretest of the new reading comprehension questions was estimated to be .78 when administered with passages (Lawrence 1992).
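For readers who want to reproduce this kind of internal consistency estimate, a minimal sketch of coefficient (Cronbach's) alpha for a respondents-by-items matrix of 0/1 item scores follows. The toy data are invented for illustration; the report's own estimates came from the actual response files.

```python
def coefficient_alpha(scores):
    """Cronbach's alpha for a list of respondent rows of item scores."""
    k = len(scores[0])                      # number of items
    def var(xs):                            # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_var_sum = sum(var([row[i] for row in scores]) for i in range(k))
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Invented toy data: 4 examinees x 2 items, perfectly consistent responses
print(coefficient_alpha([[1, 1], [0, 0], [1, 1], [0, 0]]))  # 1.0
```

With short question sets like the 5- to 13-item sets here, alpha is expected to be low even for coherent item behavior, which is consistent with the values reported above.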

With respect to time required to complete the task, examinees reported taking about 13 minutes on average to answer the 18 questions in each form. The mean times for each test form, as determined from examinee reports, were 12.3 minutes (SD = 3.1), 13.6 minutes (SD = 3.9), and 13.8 minutes (SD = 4.7), respectively. Thus, on average, examinees devoted about 40 to 50 seconds per question. It is interesting to note that according to calculations made by Katz et al. (1990), the average time allotted for each reading comprehension question on the SAT is about 65 seconds, which includes the time required to read the passages.

TABLE 2

Mean Test Scores for Reading Passage Sets

Form/Passage          Number     Mean Number    SD     Mean Percentage
                      of Items   Correct               Correct
A. Language             12         3.1          1.8        26
A. Marianne Moore        6         1.7          1.3        29
B. Clarence Darrow       9         3.4          1.7        38
B. Tunguska              9         3.3          1.7        37
C. Architecture         13         3.7          1.6        28
C. Richard Wright        5         2.9          1.2        59

Note: A total of 135 students took Form A, 107 took Form B, and 108 took Form C.

Although overall performance on each set of questions is of interest, it does not tell the complete story. Table 3 compares the performance of the students in our study on each test question (without the passages) with that of students who participated in an earlier large-scale pretesting of the same questions (with the passages). For most test items, the with-passage statistics are based on the performance of nearly 10,000 examinees who participated in tryouts of the new SAT items from fall 1990 through spring 1991. These pretest examinees represented all SAT takers reasonably well with respect to demographics but were slightly less able than the general population of SAT takers. They were, therefore, considerably less able, as defined by SAT performance, than the students who participated in the current study.

Because of the differences between samples, Table 3 only approximates the differential difficulty of each question when administered with and without the passages. Questions are ranked according to their difficulty when the passage was available. The standard error of the difference between percentages correct based on the sample sizes is .048 or less. Thus, a difference of about 12 percent can be used to gauge the significance of differences at the .01 level and about 9 percent can be used at the .05 level. As is apparent from Table 3, the questions were, with few exceptions, substantially easier for the more typical sample of test takers who had access to the reading passages than for the more able students in our study who did not have access. If the two samples had been more nearly equal in ability, the differences in performance would probably have been even greater.
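The significance thresholds quoted above follow from the usual standard error of a difference between two independent proportions. The sketch below assumes worst-case proportions (p = .5) and sample sizes of roughly 107 students per form and 9,500 pretest examinees; those sizes are our reading of the Table 3 notes, not figures the report attaches to this calculation.

```python
import math

# Worst-case (p = .5) standard error of the difference between two
# independent proportions. Sample sizes are an assumption taken from
# the Table 3 notes: ~107 study students per form, ~9,500 pretest examinees.
def se_diff(n1, n2, p1=0.5, p2=0.5):
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

se = se_diff(107, 9500)
print(round(se, 3))         # 0.049, consistent with the reported .048
print(round(1.96 * se, 3))  # 0.095: about 9 percentage points at the .05 level
print(round(2.576 * se, 3)) # 0.125: about 12 percentage points at the .01 level
```

The larger pretest sample contributes almost nothing to the standard error; the study sample of about 107 per form is what drives the 9- and 12-point criteria.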

Item-Level Comparisons

Clearly, according to Table 3, restricting access to reading passages did not depress performance equally for each test item. The ordering of items by difficulty with and without the passages shows a less-than-perfect correspondence. Items that were difficult when passages were available tended to be difficult when the passages were not available. The difficulty of some items, e.g., question 2 from the passage on Language, question 7 from the Clarence Darrow passage, and question 5 from the Richard Wright passage, was apparently affected very little by removing access to the passages on which they were based. Others were influenced dramatically. For instance, although question 11 from the passage on Language was answered correctly by 58 percent of test takers who had the passage, it was answered accurately at only a chance level (20 percent) when the passage was not available. The most striking change in difficulty occurred for question 1 from the Architecture selection. Although this question was very easy (85 percent correct) when the passage was accessible, it was answered correctly at only a chance level without the passage. Question 2 from the same passage was quite easy (63 percent correct) with the passage but was answered at a less-than-chance level (6 percent) when the passage was unavailable.

Aside from the content of the items, which is considered below, there are some readily available explanations for their differential difficulty under the two testing conditions. Table 4 is informative in this regard. It displays


TABLE 3

,IIMIMINE11

Performance on Reading Comprehension Questions Given with and without Reading Passages

[Table body not reliably legible in this copy. Columns, in two side-by-side panels: Form/Passage, Question, Percentage Correct with the Passage, Percentage Correct without the Passage, and Difference, for Form A (Language, Marianne Moore), Form B (Tunguska, Clarence Darrow), and Form C (Architecture, Richard Wright), with a median row for each passage.]

Note: The Language, Marianne Moore, Tunguska, and Richard Wright item statistics are each based on more than 9,500 students tested in 1990-91. The Clarence Darrow statistics are based on approximately 2,500 to 3,500 students tested in 1990-91. The Architecture passage item statistics are also based on more than 9,500 examinees, except for question 5, which is based on 115 students. Differences of approximately 9 percent are significant beyond the .05 level, two-tailed; differences of approximately 12 percent are significant at the .01 level.
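The significance thresholds quoted in the note can be reproduced with a standard two-proportion z-test. The minimal sketch below assumes, following the note, roughly 9,500 with-passage examinees and about 115 without-passage examinees per item, and evaluates the standard error at p = .5, where it is largest; the function name is illustrative, not from the report.

```python
from math import sqrt

def min_significant_diff(n1: int, n2: int, z: float, p: float = 0.5) -> float:
    """Smallest difference between two independent proportions that reaches
    significance at the given z value, assuming both proportions are near p."""
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return z * se

# Approximate group sizes from the note to Table 3.
n_with, n_without = 9500, 115

print(round(min_significant_diff(n_with, n_without, 1.96), 3))  # about .09 (p < .05)
print(round(min_significant_diff(n_with, n_without, 2.58), 3))  # about .12 (p < .01)
```

With these sample sizes the thresholds come out near 9 and 12 percentage points, matching the note.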

in descending order of difficulty for each question the without-passage statistics on the percentage of examinees who answered each question correctly, eliminated the correct answer, chose selected incorrect options (the most and least popular), and eliminated selected incorrect options (the most and least popular).

From Table 4 it appears that some items, e.g., questions 1 and 5 from the Language passage, question 2 from the Clarence Darrow passage, and questions 2 and 8 from the Architecture passage, were difficult in large part because examinees were much more likely to eliminate the correct choice than to select it. The most extreme case is question 2 from the Architecture passage: the correct answer was chosen by only 6 percent of the sample but eliminated by 55 percent.

Other items appeared to be difficult because particular incorrect options were attractive. An example of this is question 2 from the passage on Marianne Moore, which was answered correctly by 16 percent of the sample. A majority (57 percent) selected the same incorrect answer.


Some items appeared to be answerable without the passage partly because particular incorrect options were eliminated as plausible answers. Examples are question 9 from the Architecture passage (one choice was eliminated by 66 percent of the sample) and question 3 from the Clarence Darrow text.

Table 5 compares examinee responses, with and without the passages, to several of the questions mentioned above. The particular questions selected for discussion here were chosen because they represent the most dramatic examples of different response patterns. Question 1 from the Language selection appears to have been more difficult without the passage because the correct answer was eliminated about three times as often as it was chosen (36 percent versus 12 percent) and because two incorrect options (D and E) were more attractive to examinees who did not have the passage than to those who did. Question 2 from the Clarence Darrow passage had a similar outcome, with about five times as many without-passage examinees eliminating the correct answer as selecting it (32 percent versus 6 percent). Each of


TABLE 4

Descriptive Statistics for 54 Reading Comprehension Questions Administered without Reading Passages

[Table body not reliably legible in this copy. Columns: Form/Passage, Question, Correct Answer (Percent Choosing, Percent Eliminating), and Incorrect Options (Percent Choosing and Percent Eliminating, each for the Least Popular and Most Popular options), for the same six passages as in Table 3.]

the three incorrect choices (B, C, and D) was chosen more frequently without the passage than with it.

For question 2 from the Architecture passage, a majority (55 percent) of examinees eliminated the correct choice when they did not have the passage, and only 6 percent selected it. Besides tending to eliminate the correct answer for this item, examinees without the passage found two incorrect choices (A and E) to be much more plausible than did examinees who had access to the passage.

Question 2 about Marianne Moore appears to have been answered correctly somewhat less often than expected by chance (16 percent) by no-passage test takers, mainly because of the attractiveness of option D. Question 3 from the Clarence Darrow selection remained relatively easy even without the passage because each of the incorrect options was eliminated relatively frequently. These examples illustrate that the effect of restricting access to reading passages is far from uniform across test items and that differential effects on item difficulty appear to result from several different processes.

TABLE 5

Comparison of Responses for Selected Questions Given with and without Passages

[For each option, the original table gives the percent choosing it with the passage and, without the passage, the percent choosing and the percent eliminating it; these values are not reliably legible in this copy.]

Language/Question 1. The phrase "older and deeper functions" (line 6) refers to the
(A) grammatical structure of a language
(B) expression of emotions through sound
(C) transmission of information
(D) statement of cultural values
(E) original meanings of words

Clarence Darrow/Question 2. Which of the following best captures the meaning of the word "consideration" in line 17?
*(A) leniency
(B) contemplation
(C) due respect
(D) reasoned judgment
(E) legal rights

Architecture/Question 2. In passage A, the reference to "next month's issue of an architectural periodical" (lines 22-23) serves to
(A) show that the plans for the garden cities are well thought of in professional journals
(B) indicate that what seems like a random process is actually an ordered process
*(C) suggest that some people lack their own firm ideals of beauty
(D) imply that only those who are knowledgeable about a subject should offer their opinions
(E) emphasize the importance of what the experts say

Marianne Moore/Question 2. The major purpose of the passage is to
(A) describe the events that led to a milestone in the author's life
(B) reveal the character of a college librarian
(C) relate the significant events of the author's college years
(D) analyze the impact of Marianne Moore's poetry on the author
(E) show the unexpected surprises that can happen in an ordinary life

Clarence Darrow/Question 3. By "They can afford it if you members of the jury can" (line 22), Darrow means that
(A) no harm will come to the defendants if they are convicted in this case
(B) the jurors will be severely criticized by the press if they convict the defendants
(C) the defendants are indifferent about the outcome of the trial
(D) the verdict of the jury has financial implications for all of the people involved in the trial
(E) a verdict of guilty would be a potential threat to everyone's rights

* Correct choice.

Descriptive Statistics on Test-Taking Strategies

The extent to which study participants reported using various strategies to answer questions without the passages is presented in Table 6. For most of the strategies, participants were asked to indicate the proportion of questions for which they had used a strategy (all or nearly


all, about 75 percent, about 50 percent, about 25 percent, or few or no questions). Table 6 presents frequency of use, combining the first two and the last two categories. Generally, each possible strategy was used frequently by at least some students and infrequently by others.

Each strategy considered to involve reasoning of some kind was used by a majority or near majority of students for a preponderance of questions. For instance, the strategy of choosing an answer because it seemed to be consistent with something stated in the other questions was reportedly used by 65 percent of examinees for 75 percent or more of the questions. In addition, a slight majority (55 percent) of the sample said that they, like the experienced test developers consulted early in the study, had tried to reconstruct the theme or main idea of the missing passage by reading all the questions and answers.

Personal knowledge or experience was invoked relatively infrequently. About 20 percent of examinees stated that they had used their personal knowledge for 75 percent or more of the questions for each set of questions. There were some differences among test forms in the use of personal knowledge, with students least likely to call on personal knowledge for the questions on the Architecture passage and most likely to summon it for questions on the Richard Wright selection.

Generally, few students maintained that they recognized the passages or knew their sources. Equally few were familiar with the author of the passage. The exception was examinees' greater familiarity with Richard Wright and his works. Specifically, 30 percent said that they recognized the Richard Wright passage or knew where it came from, and 37 percent said that they recognized the author.

Of the possible strategies for answering vocabulary questions, the most frequently used was to choose an answer because its meaning seemed to fit best with the general theme or context suggested by the questions. Knowing that the vocabulary word or answer could have more than one meaning was also helpful. Knowledge of prefixes, suffixes, or word roots or of the most general or common meaning of a word was used less often.

With respect to strategies involving guessing, random and patterned guessing strategies were utilized relatively infrequently. Two other strategies, guessing among choices that could not be ruled out and using vague hunches or intuition, were applied by a near majority of the sample.

We speculated that certain features of answer choices might be used either to select the correct answer or to eliminate incorrect alternatives. Table 7 shows, in descending order of prevalence, students' tendencies to use


these characteristics. None of the features listed was regarded in the same way by all examinees. Some students used a particular strategy to choose an answer, and others used the same strategy to eliminate options. Some features, however, were used much more frequently in one way than in the other. The clearest tendency was to choose the more carefully worded or qualified alternative or to rule out a less qualified one. About three-quarters of test takers said that they used this approach. Students also reported choosing the more concrete options and eliminating the more abstract ones, and they preferred the specific alternatives over the general ones. Some features of answer choices, such as their length, the degree to which they were regarded as ambiguous, and whether or not they seemed obvious, were used somewhat less often than other features. More important, these features were reportedly used about as often to choose as to eliminate alternatives.

The extent to which students actually eliminated answer choices from consideration was also calculated. On average, students ruled out approximately one choice (from .9 to 1.3) per question on each set of questions. The mean number of choices that were eliminated differed among questions, ranging from .7 to 1.8. Eliminating choices appears to have been a consistently employed strategy: the number of choices ruled out for the first set of questions within a test form correlated highly with the number eliminated for the second set, at .75, .80, and .86, respectively, for the three forms.

In order to determine which strategies were used for particular questions, examinees were asked to indicate for some sets of questions (those based on the Marianne Moore passage, the Tunguska passage, and the Richard Wright passage) the single most helpful strategy of all those listed on the checklist. Table 8 shows the two most frequently mentioned strategies for each question and the percentage of test takers who listed each one. There was no consensus regarding the helpfulness of any particular strategy for any given item. Nonetheless, students referred to some strategies more often than others. In fact, of the 45 different possibilities, only 10 were mentioned most or second most frequently for any item. Choosing an answer because it seemed to be consistent with something stated in other questions was regarded by a plurality of test takers as the most helpful strategy for 6 of the 20 items and as the second most helpful for 7 others. Attempting to reconstruct the theme or main idea of the missing passage from the other questions and answers was the next most often reported strategy. The use of personal knowledge about a topic was mentioned as being most helpful for four items from the Richard Wright passage.


TABLE 6

Frequency of Use of Various Test-Taking Strategies

(For each strategy, the first value is the percentage using it for few or no questions or about 25 percent of the questions; the second is the percentage using it for all or nearly all questions or about 75 percent of the questions. For P2 and P3, the value is the percentage answering yes. A double hyphen marks a value not legible in this copy.)

REASONING STRATEGIES
R1 Tried to determine the meaning of a word or phrase, or the way in which it was used, from the other questions in the set: 15, 54
R2 Assumed, guessed, or knew the answers to some questions and then, on the basis of these answers, reasoned what the answer to a later question would have to be (or what it could not be): 22, 48
R3 Chose an answer because it seemed to be consistent with something stated in the other questions: 13, 65
R4 Ruled out an answer because it seemed to contradict something in the other questions: 8, 53
R5 Chose an answer because it resembled something in the question: I associated a word, phrase, or idea in the question with something in the answer I chose: 29, 37
R6 Tried to reconstruct the theme or main idea of the missing passage by reading all the questions and answers: --, 55

PERSONAL KNOWLEDGE OR EXPERIENCE
P1 Used my personal knowledge about the topic that I learned either inside or outside of school
   First passage: 55, 18
   Second passage: 59, 22
P2 I recognized a passage or knew where it came from
   First passage: 4 (yes)
   Second passage: 15 (yes)
P3 I recognized the author of the passage and was somewhat familiar with his/her opinions, etc.
   First passage: 3 (yes)
   Second passage: 14 (yes)

STRATEGIES FOR VOCABULARY
Ruled out an answer because:
S1 I didn't think the vocabulary word could have that meaning: 27, 44
S2 Everyone would know the meaning of that answer, so I thought it was too obvious: 63, 13
Chose an answer because:
S3 I knew that the vocabulary word or answer could have more than one meaning: 19, 55
S4 Its meaning seemed to fit best with the general theme or context suggested by the questions: 11, 69
S5 It was the most general or common meaning of the vocabulary word: 44, 23
S6 I knew about the vocabulary word's root, prefix, or suffix: 53, 24

GUESSING
G1 I guessed randomly among all of the choices: 77, 9
G2 I guessed among two or more choices I couldn't rule out: 24, 48
G3 I used vague hunches or intuition, but I can't say exactly how: 29, 45
G4 I tended to guess a particular choice (e.g., A or C): 85, 8

Note: The use of strategies R3, P1, P2, P3, and S5 varied significantly (p<.05) by test form. Percentages are based on all 350 students who completed the questionnaire.


TABLE 7

Frequency of Use of Test-Taking Strategies Involving Features of Answer Choices

(For each feature, in relation to the other choices, the first value is the percentage using it to choose answers; the second is the percentage using it to rule out answers. A double hyphen marks a value not legible in this copy.)

More carefully worded/qualified: 76, 18
Less qualified: 13, 72
More concrete: 69, 22
More abstract: 30, 61
More general: 37, 61
More specific: 68, 33
Too extreme: 19, 68
Too neutral: 24, 60
Too similar to other choices: 23, 65
At the center of all the other choices, something in common with all of them: 61, 25
More definite/absolute: 64, 29
Less definite/more relative: 32, 59
In tune with current thinking: 63, 17
Outdated, old-fashioned: 21, 55
More positive/less critical in tone or mood: 60, 26
More negative/more critical in tone or mood: 28, 56
Simpler: 39, 51
More complex: 53, 40
Most obvious: 47, 47
Least obvious: 40, 48
More common/normal: 47, 34
More uncommon/unusual: 47, --
Most ambiguous: 29, 48
Least ambiguous: 40, 33
Longer: 35, 27
Shorter: 31, 31

Note: Percentages are based on all 350 questionnaire respondents.

It is also interesting to note which strategies were not mentioned as being helpful. Of the 26 possible features of answer choices, only one, selecting an answer because of its positive or less critical tone, was mentioned by enough students to be among the two most helpful strategies. This strategy was mentioned in conjunction with the first question from the Marianne Moore passage, for which the correct response was that Marianne Moore's poetry was (choice D) "inspiring and well crafted."

Study participants were also asked to indicate any other strategies they used that we had not anticipated. A total of 19 percent of respondents said that they had used some strategy that was not specified. Most of these, however, were merely restatements of strategies that were on the checklist. By far the most frequent restatement pertained to reconstructing the theme or main idea of the missing passage by reading all the questions and answers. Students also offered a number of alternatives to the terms we had used: "radical and opinionated" for "definite/absolute," "politically correct" for "in tune with current thinking," and "makes a personal assault on the author or specific people" for "negative/critical in tone." Thus, although students may have used different terminology, they did not appear to use any strategies that we had not anticipated.

Relationships between Strategy Use and Without-Passage Performance

Frequency of use of each strategy (5 = strategy used for all or nearly all questions, 4 = used for about 75 percent of questions, 3 = used for about 50 percent of questions, 2 = used for about 25 percent of questions, 1 = used for few or no questions) was correlated with test performance. Only eight of the many strategies listed were related significantly to performance on any of the six sets of questions administered without the passages. These strategies are enumerated in Table 9. However, none of the strategies was consistently related to performance across the different sets of questions. Given the relatively large number of correlations generated, we are not inclined to make any extensive interpretations of those that were found to be significant. It does appear, however, that random guessing was not an effective strategy, as might be anticipated. For two of the sets, students who guessed randomly with some regularity performed worse than those who did not (-.30 and -.32). Nor was choosing a general or common meaning of a word a good approach for these two sets of questions. Correlations between performance and the use of this strategy were -.29 and -.36.

Correlations between students' use of features of answer choices, either to choose or to rule out alternatives, and their performance on each of the six sets of questions were also computed. Each feature whose use correlated significantly (p<.05) with performance on any of the six question sets is listed in Table 10 (for those used to choose answers) and Table 11 (for those used to rule out answers). When used to select answers, only 11 of the 26 possible features correlated significantly with performance on any question set. When used to rule out alternatives, only 7 of the 26 features correlated significantly with test performance on any set. Furthermore, any significant relationship between feature use and test performance was not consistent across question sets. Thus there does not appear to be any systematic relationship between the use of answer-choice features and test performance without passages.

It is possible that general tendencies to use particular strategies do not relate consistently to overall performance because specific strategies may be required for certain items. Strategies that work well for one item may not


TABLE 8

Strategies Most Frequently Listed as Most Helpful for Individual Test Questions

(For each question: most frequent strategy with percentage listing it, then second most frequent strategy with percentage listing it. A double hyphen marks a strategy code not legible in this copy.)

A. Marianne Moore
1: G1 (17); R3 and F5 tied (15 each)
2: R3 (19); R6 (17)
3: R4 (13); G3 (10)
4: R3 (17); R6 (15)
5: G1 (12); R6 and R3 tied (11 each)
6: R6 (16); R3 (14)

B. Tunguska
1: R3 (17); R1 (13)
2: S4 (30); R3 (11)
3: R3 (15); R6 (13)
4: S4 (31); R3 (11)
5: G1 (20); G3 (15)
6: R3 (20); G3 (9)
7: G3 (14); -- (14)
8: R6 (15); R3 (14)
9: R3 (15); G3 (12)

C. Richard Wright
1: R6 (25); P1 (13)
2: -- (17); R3 (17)
3: P1 (17); S4 (11)
4: P1 (21); G2 (15)
5: P1 (20); R6 (9)

Key:
F5 = More positive/less critical in tone or mood
G1 = Guessing randomly among all the choices
G2 = Guessing among two or more choices unable to rule out
G3 = Using vague hunches or intuition
P1 = Using personal knowledge about the topic
R1 = Trying to determine the meaning of a word or phrase, or the way in which it was used, from the other questions in the set
R3 = Choosing an answer because it seemed to be consistent with something stated in other questions
R4 = Ruling out an answer because it seemed to contradict something in the other questions
R6 = Trying to reconstruct the theme or main idea of the missing passage by reading all the questions and answers
S4 = Meaning seemed to fit best with the general theme or context suggested by the questions

be effective overall. Therefore, we attempted to relate test performance to strategy use more precisely by correlating performance on specific items with the use of strategies that were regarded as being the most helpful ones for individual items. One section of the questionnaire asked examinees to list for a subset of items the particular strategy that was most helpful for each item. The analysis entailed the following. For each item we specified the strategies that were mentioned as most helpful by about 10 percent or more of the examinees. Then, for each item, performance (wrong = 0, right = 1) was regressed on the use (0 or 1) of each of the strategies in the set. For the 20 test items involved, most of the sets included from two to four strategies, although one had five and three had only

one. Because the dependent variable was dichotomous, logistic regression was used. Table 12 contains a summary of this analysis. For 6 of the 20 items at least one strategy in the set contributed significantly (p<.05) to predicting performance on the item. These significant predictors are listed in Table 12 along with the regression weights, standard errors of the weights, and the appropriate significance test (Wald statistic). The far right column shows the statistic exp(B), which is e, the base of the natural logarithms, raised to the power of B, the beta weight. This statistic can be interpreted as the factor by which the odds of correctly answering the question increased when examinees used the strategy listed. For the first and the last questions listed, the estimated chance of a correct answer


TABLE 9

Significant Correlations between Use of Strategies and Performance on Question Sets

(Columns, in order: Strategy, then correlations with performance on the Language, Marianne Moore, Clarence Darrow, Tunguska, Architecture, and Richard Wright question sets. A double hyphen marks a value not legible in this copy.)

R1   .02   -.08    .14    .02     .05   .22*
R2   .18*   .09    .14    .06    -.10  -.05
R5  -.11   -.17*  -.13   -.14     .03   --
S1  -.03   -.01    .04   -.05    --     .06
S2  -.22** -.02   -.07    .06    -.14  -.06
S5  -.00    .05   -.29** -.36***  .04  -.08
S6  -.04   -.08   -.18   -.10     .05   --
G1  -.10    .00   -.30** -.32*** -.16  -.10

Key:
R1 = Tried to determine the meaning of a word or phrase, or the way in which it was used, from the other questions in the set
R2 = Assumed, guessed, or knew the answers to some questions and then, on the basis of these answers, reasoned what the answer to a later question would have to be (or what it could not be)
R5 = Chose an answer because it resembled something in the question: associated a word, phrase, or idea in the question with something in the answer chosen
S1 = Didn't think the vocabulary word could have that meaning
S2 = Everyone would know the meaning of that answer, so thought it was too obvious
S5 = It was the most general or common meaning of the vocabulary word
S6 = Knew about the vocabulary word's root, prefix, or suffix
G1 = Guessed randomly among all the choices

*p<.05, **p<.01, ***p<.001.

TABLE 10

Significant Correlations between Test Performance and Use of Answer Choice Features to Choose Answers

(Columns, in order: Feature, then correlations with performance on the Language, Marianne Moore, Clarence Darrow, Tunguska, Architecture, and Richard Wright question sets.)

More general                                 -.25**  -.05  -.01   -.14   .07  -.02
More specific                                 .15     .12   .09    .21  -.03   .06
More positive/less critical in tone or mood   .02     .01  -.20*  -.03  -.04   .00
Most obvious                                  .00    -.12  -.20*  -.21* -.13  -.08
Least obvious                                 .03    -.12   .09    .13   .03   .19*
More carefully worded/qualified               .08    -.01   .11    .11   .15   .21*
Shorter                                      -.10     .02  -.20*   .05  -.06  -.19
More concrete                                 .11     .17*  .09    .01   .07   .05
Too neutral                                  -.16    -.03  -.08   -.23* -.04   .03
More uncommon/unusual                        -.03    -.12  -.04    .19*  .00  -.03
Outdated, old-fashioned                      -.13    -.18* -.08   -.01  -.03  -.02

*p<.05, **p<.01.


TABLE 11

Significant Correlations between Test Performance and Use of Answer Choice Features to Rule Out Answers

(Columns, in order: Feature, then correlations with performance on the Language, Marianne Moore, Clarence Darrow, Tunguska, Architecture, and Richard Wright question sets.)

Simpler                                      -.06  -.00  -.11    .15   .08   .30**
More negative/more critical in tone or mood   .08   .13  -.27**  -.06   .04   .05
Least obvious                                 .06  -.01  -.21*   -.16  -.08  -.14
More abstract                                 .14  -.19*  .02     .00  -.01  -.03
Too similar to other choices                 -.02   .04   .14     .20*  .01   .13
Most ambiguous                                .18*  .11   .03    -.01   .11  -.05
In tune with current thinking                -.06  -.14   .04    -.07  -.00  -.21*

*p<.05, **p<.01.

increased substantially when examinees chose the alternative that was more positive/less critical in tone than the other options for the first item, and used personal knowledge about the topic for the last item. For each of the other items, the odds shifted less dramatically with the use of particular strategies but still changed by a factor of more than two to nearly six for various items. For three of the six items listed, attempting to reconstruct the theme or main idea of the missing passage was linked to better performance. Table 13 gives the percentages of examinees who answered these items correctly according to whether or not they used these strategies. The different rates of success are consistent with the results of the logistic regressions. The product-moment correlations of strategy use with item performance are also given for those readers who may find this statistic easier to interpret than the logistic regression statistics.
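The relation among the quantities reported in Table 12 can be checked directly: the Wald statistic is the squared ratio of the coefficient to its standard error, and exp(B) is the odds multiplier. A minimal sketch using two coefficients taken from Table 12 (the helper name is illustrative):

```python
from math import exp

def wald_and_odds(beta: float, se: float):
    """Wald statistic (B/SE)^2 and odds multiplier exp(B) for a
    logistic-regression coefficient."""
    return (beta / se) ** 2, exp(beta)

# Coefficients from Table 12.
wald, odds = wald_and_odds(2.71, 1.06)   # F5, Marianne Moore/Question 1
print(round(wald, 1), round(odds, 1))    # 6.5 15.0

wald, odds = wald_and_odds(2.44, 1.06)   # P1, Richard Wright/Question 4
print(round(wald, 1), round(odds, 1))    # 5.3 11.5
```

Both rows reproduce the table's Wald and exp(B) values, confirming the internal consistency of the reported statistics.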

Relationships between Without-Passage Scores and SAT Scores

Table 14 displays, for each of the six sets of questions, the correlations between performance without passages and SAT scores, Test of Standard Written English (TSWE) scores, and high school grades (average and rank in class). The patterns of correlations are quite different for each question set. Performance on three of the sets (Language, Clarence Darrow, and Tunguska) correlated significantly with SAT-Verbal total scores, with verbal subscores, and with TSWE scores. SAT-Mathematical scores did not correlate with performance on any of the six sets. On two of the three sets for which correlations with the test scores were noted (Clarence Darrow and Tunguska), performance also correlated significantly with high school

grades. The lower correlation of the Architecture passage with all measures is probably largely a function of the lower reliability of this set of questions (coefficient alpha = .09) compared with the estimated reliability of the other sets (alpha = .33 to .43). These differential patterns have implications for interpreting performance. Particularly noteworthy is the finding that, at least for some sets, those who can glean information from the test questions (without the passages) tend to be those who get good grades.
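The reliability figures cited above are coefficient (Cronbach's) alpha, computed from the variance of item scores and of total scores. A minimal sketch with entirely hypothetical right/wrong data (the function name and the sample matrix are not from the report):

```python
def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of per-examinee item-score vectors
    (here 0/1 right-wrong responses), using population variances."""
    n_items = len(item_scores[0])
    totals = [sum(person) for person in item_scores]

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [variance([person[i] for person in item_scores])
                 for i in range(n_items)]
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / variance(totals))

# Hypothetical 0/1 responses of five examinees to a four-item set.
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(responses), 2))  # 0.79 for this made-up sample
```

Low alpha, as for the Architecture set, attenuates correlations with external measures, which is the explanation the text offers for that set's weaker relationships.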

Student Recommendations

As a final question, study participants were asked if they would ever recommend answering SAT questions without reading the passages. A minority (about 16 percent) said that they would. When asked to indicate the circumstances under which they would follow this strategy, about half said that the approach might help primarily if time were running short. Some students suggested variants of the tactic, such as skimming a passage or reading the first and last sentences in each paragraph. Still others hinted that they would first read the questions and then selectively read the passage, searching for the answers. Several others contended that questions referring to specific lines in a passage could be answered by consulting only those lines instead of reading the entire passage. The second most often mentioned circumstance involved the use of personal knowledge of the subject of a passage.

Finally, students were given an opportunity to provide any other comments of their choosing, and many took the opportunity to do so. Many study participants opined that answering questions without the passages is probably not a major factor in test takers' performance: "I don't really know of many people who do answer the


TABLE 12

Logistic Regression of Performance on Most Helpful Strategies for Selected Questions

                                                                  Logistic Regression Statistics
Explanatory Variable                                         Beta (B)   Standard Error   Wald Statistic   Exp(B)

Marianne Moore/Question 1
F5 More positive/less critical in tone or mood                 2.71         1.06             6.5*          15.0
R6 Trying to reconstruct the theme or main idea of the
   missing passage by reading all the questions and answers    1.70         0.81             4.4*           5.5

Tunguska/Question 1
R1 Trying to determine the meaning of a word, or phrase,
   or the way in which it was used, from the other
   questions in the set                                        1.58         0.66             5.7*           4.8

Tunguska/Question 2
S4 Meaning seemed to fit best with the general theme or
   context suggested by the questions                          0.95         0.49             3.8*           2.6

Tunguska/Question 7
R6 Trying to reconstruct the theme or main idea of the
   missing passage by reading all the questions and answers    1.72         0.73             5.6*           5.6

Richard Wright/Question 1
R6 Trying to reconstruct the theme or main idea of the
   missing passage by reading all the questions and answers    1.51         0.61             6.2*           4.5

Richard Wright/Question 4
P1 Using personal knowledge about the topic                    2.44         1.06             5.3*          11.5

*p < .05.
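The Exp(B) column in Table 12 is simply the exponential of the coefficient B, interpretable as an odds ratio (the factor by which the odds of a correct answer are multiplied when the strategy is used). A short Python check against the tabled values, which are rounded to one decimal place, makes the relationship explicit:

```python
import math

# Coefficients (B) and reported odds ratios (Exp(B)) from Table 12.
rows = [
    ("Marianne Moore/Q1, F5", 2.71, 15.0),
    ("Marianne Moore/Q1, R6", 1.70, 5.5),
    ("Tunguska/Q1, R1",       1.58, 4.8),
    ("Tunguska/Q2, S4",       0.95, 2.6),
    ("Tunguska/Q7, R6",       1.72, 5.6),
    ("Richard Wright/Q1, R6", 1.51, 4.5),
    ("Richard Wright/Q4, P1", 2.44, 11.5),
]

for label, b, reported in rows:
    odds_ratio = math.exp(b)          # Exp(B) = e**B
    assert abs(odds_ratio - reported) < 0.06, label  # matches to rounding
    print(f"{label}: exp({b}) = {odds_ratio:.1f}")
```

For example, using personal knowledge on Richard Wright/Question 4 multiplied the odds of answering correctly by roughly 11.5.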

TABLE 13

Percentage of Test Takers Who Answered Selected Questions Correctly, by Use of Selected Strategies

Explanatory Variable                                         Did Not Use Strategy   Used Strategy

Marianne Moore/Question 1
F5 More positive/less critical in tone or mood                        53                 94         .28
R6 Trying to reconstruct the theme or main idea of the
   missing passage by reading all the questions and answers           55                 85         .23

Tunguska/Question 1
R1 Trying to determine the meaning of a word, or phrase,
   or the way in which it was used, from the other
   questions in the set                                               20                 50         .21

Tunguska/Question 2
S4 Meaning seemed to fit best with the general theme or
   context suggested by the questions                                 51                 69         .24

Tunguska/Question 7
R6 Trying to reconstruct the theme or main idea of the
   missing passage by reading all the questions and answers           37                 73         .18

Richard Wright/Question 1
R6 Trying to reconstruct the theme or main idea of the
   missing passage by reading all the questions and answers           48                 81         .29

Richard Wright/Question 4
P1 Using personal knowledge about the topic                           56                 94         .30



TABLE 14

Correlations between SAT Scores and Performance on Questions without Passages

                              SAT-Verbal                                           High School Grades
Form/Passage          Total   Reading   Vocabulary   TSWE   SAT-Mathematical   Average    Rank

A. Language           .48**    .46**      .40**      .37**        .14            .09       .11
A. Marianne Moore     .09      .17        .02        .06         -.14           -.08      -.10
B. Clarence Darrow    .37      .37**      .32**      .34**        .14            .28*      .19
B. Tunguska           .32**    .34**      .29**      .26*         .19            .23*      .34**
C. Architecture       .17      .19        .17        .13          .06            .01      -.05
C. Richard Wright     .06      .07        .04        .02         -.16            .14       .13

Note: Correlations between the SAT and TSWE and questions in Forms A, B, and C are based on 95, 82, and 91 students, respectively, for whom SAT scores were available. Correlations involving high school grade-point averages are based on 86, 75, and 81 cases, respectively, for the three test forms. Correlations involving high school ranks are based on 76, 65, and 74 cases, respectively.
*p < .05, two-tailed; **p < .01, two-tailed.
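The significance flags in Table 14 depend on sample size as well as effect size: for a Pearson correlation based on n cases, the standard two-tailed test uses t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. A rough Python sketch, applying this textbook formula (not the authors' own computation) to the Form B sample of 82 students:

```python
import math

def t_from_r(r, n):
    """t statistic for testing H0: rho = 0 given a Pearson correlation r from n cases."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Form B (Tunguska) correlations are based on the 82 students with SAT scores.
n = 82
for r in (0.32, 0.26, 0.19):
    t = t_from_r(r, n)
    print(f"r = {r:+.2f}, n = {n}: t({n - 2}) = {t:.2f}")
```

With 80 degrees of freedom the two-tailed critical values are roughly 1.99 (p < .05) and 2.64 (p < .01), which is consistent with .32 being flagged at the .01 level, .26 at the .05 level, and .19 not at all in the Tunguska row.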

questions without reading the passage first. So I don't think the testing service has much to be worried about." Others commented: "I've never heard of anyone using the strategy." "I thought it could be done, but it takes time and is painstaking." "I think it's important and interesting to realize that you can make educated guesses without even having to read the passage." Finally, one student advised fellow test takers as follows: "Instead of learning stupid strategies, just learn to read."

Discussion

The results indicated that, as expected, students were able to attain scores that exceeded a chance level. Thus the findings are generally consistent with those obtained for other measures of reading comprehension, including the current verbal portion of the SAT (Katz et al. 1990). The findings also extend the results of earlier studies to the kind of reading questions that will be used in the modification of the SAT.

It is clear from the results that some helpful information can be obtained from reading only the questions themselves. It is equally apparent, however, that resorting to answering questions exclusively on this basis is neither efficient nor effective. Students used nearly as much time per question without reading the passages as is currently allotted for reading SAT passages and answering questions. And, although most students did better than chance without the passages, they did not perform substantially better, given what might be expected of the very capable students in the study sample. Nor was the use of a no-passage strategy very dependable. According to indices of internal consistency, the resulting performance varied considerably among items. Thus, any concern about testwise examinees using this strategy to gain an unfair advantage over less savvy test takers seems unfounded. Students' comments reinforced this conclusion.
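The "chance level" benchmark can be made concrete. SAT reading comprehension items offer five answer choices, so blind guessing succeeds 20 percent of the time, and an exact binomial calculation shows how far an observed score must exceed that to be convincing. The sketch below assumes five options and independent guesses; it is an illustration of the benchmark, not the study's actual analysis:

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more successes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Probability of answering at least 8 of 20 five-choice items
# correctly by pure guessing (expected number correct is only 4).
p_value = binom_sf(8, 20, 0.20)
print(f"P(>= 8/20 correct by chance) = {p_value:.3f}")  # about 0.032
```

By this benchmark, getting 8 or more of 20 items right by guessing alone happens only about 3 percent of the time, so group performance consistently above that range is unlikely to reflect guessing.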

In addition, better-than-chance performance was not uniform over individual items. A few questions seemed barely more difficult without the passages than with them. However, without the passages, performance on some items was actually lower than would be expected by chance. Items that involved the meaning of a word or phrase tended to be quite easy when the meaning was a relatively common one that might be known without reference to the context in which it occurred. Performance on some other vocabulary questions, however, was worse than would be expected by random guessing, usually because a less common meaning of a word was signaled by the context. When the more usual meaning of the word appeared as one of the alternatives, examinees were likely to choose this typical meaning instead of the correct answer.

Other factors were implicated in difficulty differentials for other items. In one particular set, questions were relatively easy, apparently because examinees were familiar with the author or with the particular passage on which the questions were based. Each of the questions in this set was even easier for examinees who took the test with the passage. Nonetheless, this example illustrates that awareness of an author or of an author's works may make questions easier, both with and without the passages, than when questions are based on less familiar sources. Given current research on reading comprehension, this finding should come as no surprise.

With respect to establishing links between the use of particular strategies and performance on test items, examinees' comments strongly suggested that some strategies were more heavily implicated in success than others. The strategies most frequently mentioned as being helpful involved verbal reasoning rather than skills that rely on characteristics of questions or answer choices. The two strategies mentioned most often as being helpful involved attending to consistencies among questions and attempting to reconstruct the theme of a missing passage from all the available questions and answer choices. The use of this second strategy was linked to successful performance on particular items more often than any other strategy. These findings seem to run counter to the earlier research by Katz, Blackburn, and Lautenschlager (1991), who effectively eliminated these strategies from examinees' repertoires by scrambling questions from several passages. Even under these conditions, however, examinees were still able to answer questions at a greater-than-chance level without the passages, leading the authors to conclude that information obtained from other questions associated with a passage has relatively little effect on performance without passages. This apparent discrepancy between our results and those of Katz et al. (1991) is consistent with the hypothesis (discussed below) that students invoke different strategies with and without passages.

The use of strategies involving reasoning was not unexpected. That reading and reasoning are intertwined, perhaps inextricably, has been asserted, quite literally, for generations (E. L. Thorndike 1917; R. L. Thorndike 1973-74). More recently, Stanovich and Cunningham (1991) discussed the evolution of reading theory, noting that although reading may indeed be reasoning, it is reasoning of a particular sort. They suggested that a better conception might be reading as "constrained reasoning," since comprehension is at least partially determined by both a reader's expectations and his or her use of world knowledge to supplement text. (They also argued, however, that there are situations in which we would hope that readers do not impose their own meanings on what they read, for example, when a physician consults a medical reference to prescribe an appropriate treatment.)

The study results appear to have definite implications for test development. It is clear that a test development strategy could be implemented that would foil test takers who rely heavily on answering reading comprehension questions without reading the passages. There were several examples of test items for which failure to read the passages resulted in performance that was actually worse than would be attained by random guessing. It appears that items of this sort could be constructed relatively easily. However, resorting to such a strategy seems unnecessary, and the effects on the validity of test scores would need to be established before any such plan was implemented.

With regard to construct validity, Katz et al. (1990, 122) concluded that currently used SAT reading comprehension questions "substantially measure factors unrelated to reading comprehension." This conclusion rests on sound empirical evidence, but it also relies on several major premises and on a particular conception of reading. For instance, the investigators asserted that if SAT reading passages are "essential" to the task, then "no relation" should exist between performance exhibited with and without the passages. Finding correlations between performance with and without the passages, the authors concluded that "the passage is not a necessary component of the task" and that factors other than the ability to understand the passages "contribute strongly to differences among SAT-V scores" (Katz et al. 1990, 126). There is no disputing that, for many questions, examinees can indeed perform at a better-than-chance level without the passages. Thus, in a strictly logical sense, the passages are not essential to attain performance that is better than that expected from random guessing. This is not, however, a level of performance with which most motivated SAT takers would be satisfied. Consulting the information available in the passages would be considered extremely desirable, if not absolutely necessary, by most examinees wishing to achieve test scores that reflect their true verbal reasoning abilities and are competitive in college admission.

A second questionable premise on which Katz et al. (1990) base their conclusions is that a correlation between performance with and without the passages implies that the same strategies are used with and without the passages. The only truly warranted conclusion is that test takers who perform better than their peers when they have the passages also tend to do better without them. When confronted with an unconventional task like the one posed here, verbally able students are apparently able to shift gears, invoking other strategies that seem not only beneficial with regard to test taking but also relevant to verbal reasoning. This flexibility, the capacity to adapt strategies to the particular situation, is exactly the kind of trait that characterizes skilled readers (see, e.g., Foertsch 1992). This is probably not surprising, for it is well established that many different kinds of logically distinct abilities are relatively highly correlated. For example, the verbal and mathematical abilities measured by the SAT correlate in the .60s (Donlon 1984). And even such quasi-abilities as testwiseness are not completely unrelated to other academic abilities (see, e.g., Diamond and Evans 1972).

Further support for a shifting-gears hypothesis comes from Freedle (1990), who showed that different factors determine the difficulty of reading comprehension questions according to whether or not the passages are available. When passages are accessible, difficulty seems to depend to a much greater extent on the characteristics of the passages than on the features of the questions. When they are not available, item characteristics exert a greater influence.

Finally, Katz et al. (1990) seem, at least implicitly, to endorse a particular conception of reading that puts a premium on the extraction of literal information from text. Current views of reading, however, stress that readers are "active, constructive, motivated learners" (Garner 1987, 1). Cognitively based conceptions of reading comprehension in particular emphasize the interactive and constructive nature of reading, in which readers' background knowledge is all important. As Dole, Duffy, Roehler, and Pearson (1991, 241) assert: "All readers, both novices and experts, use their existing knowledge and a range of cues from the text and the situational context in which the reading occurs to build, or construct, a model of meaning from the text."

Given the results of the current study and the prevailing conceptions of reading comprehension, what are the implications for advising test takers? The study results do not suggest any particular need to modify the information that is now provided to SAT takers (College Board 1991). Currently, the following information is conveyed to prospective test takers:

The verbal questions test your verbal reasoning and understanding of what you read. [p. 3]

. . . no specialized knowledge in science, social studies, literature, or other fields is needed. [p. 7]

The reading comprehension questions on the SAT measure your ability to read and understand a passage. Each passage or pair of related passages contains all of the information you'll need to answer the questions that follow. [p. 11]

A passage with a subject that is familiar or interesting to you may be easier for you than a passage that is about an unfamiliar subject. [p. 15]

Answer questions on the basis of what is stated or implied in the passage. Don't answer questions on the basis of your personal opinion or knowledge. [p. 15]

None of this information seems inconsistent with the results of the study discussed here. First, the new SAT reading questions do indeed appear to depend on one's ability to read and understand passages. This does not imply, however, that other kinds of skills do not also come into play, at least to some degree and under some circumstances. Second, the new SAT reading questions can be answered from the information contained in the passages. Again, this does not mean that personal experiences and knowledge cannot also provide a useful backdrop. As is currently suggested, interest in and familiarity with a topic may facilitate understanding. Perhaps the only necessary fine-tuning of information about the new SAT concerns the suggestion that test takers answer questions solely on the basis of the passages. Certainly this should be students' first resort. But, if this fails, another potentially useful tactic may be to invoke personal opinion or knowledge. Students should be advised, however, that this approach may in some instances be ineffective and even counterproductive.

In conclusion, performance on the kinds of reading comprehension questions that will make up the revised SAT does not appear to depend exclusively on information contained in the reading passages on which the questions are based. However, the importance of nonpassage factors appears to be relatively limited, especially in relation to the influence exerted by the reading passages. Furthermore, the other factors most heavily implicated in test performance without access to the passages are not unrelated to the verbal reasoning skills involved in academic success. The desired interpretation of reading scores based on the new SAT reading comprehension questions does not seem unduly threatened by examinees' ability to benefit from information contained in the test questions themselves.

References

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. 1985. Standards for Educational and Psychological Testing. Washington, D.C.: American Psychological Association.

College Board. 1991. Taking the SAT 1991-92. New York: College Entrance Examination Board.

College Board. 1992. College Bound Seniors: 1992 Profile of SAT and Achievement Test Takers. Princeton, N.J.: Educational Testing Service.

Conlan, G. 1990. "Text and Context: Reading Comprehension and the Mechanics of Meaning." College Board Review 157:19-25.

Diamond, J.J., and W.J. Evans. 1972. "An Investigation of the Cognitive Correlates of Test-Wiseness." Journal of Educational Measurement 9:145-50.

Dole, J.A., G.G. Duffy, L.R. Roehler, and P.D. Pearson. 1991. "Moving from the Old to the New: Research in Reading Comprehension Instruction." Review of Educational Research 61:239-64.

Donlon, T.F., ed. 1984. The College Board Technical Handbook for the Scholastic Aptitude Test and Achievement Tests. New York: College Entrance Examination Board.

Farr, R., R. Pritchard, and B. Smitten. 1990. "A Description of What Happens When an Examinee Takes a Multiple-Choice Reading Comprehension Test." Journal of Educational Measurement 27:209-26.



Foertsch, M.A. 1992. Reading in and out of School. Washington, D.C.: Office of Educational Research and Improvement.

Freedle, R. 1990. "Evidence for the Construct Validity of the SAT Reading Comprehension Items." Unpublished report. Princeton, N.J.: Educational Testing Service.

Garner, R., ed. 1987. Metacognition and Reading Comprehension. Norwood, N.J.: Ablex.

Katz, S., A.B. Blackburn, and G.J. Lautenschlager. 1991. "Answering Reading Comprehension Items without Passages on the SAT When Items Are Quasi-Randomized." Educational and Psychological Measurement 51:747-54.

Katz, S., G.J. Lautenschlager, A.B. Blackburn, and F.H. Harris. 1990. "Answering Reading Comprehension Items without Passages on the SAT." Psychological Science 1:122-27.

Lawrence, I.M. 1992. "Comparison of Statistical Specifications for Current and New SAT-V." Unpublished report. Princeton, N.J.: Educational Testing Service.

Millman, J., C.H. Bishop, and R. Ebel. 1965. "An Analysis of Testwiseness." Educational and Psychological Measurement 25:707-26.

Preston, R.C. 1962. "A New Approach to Judging the Validity of Reading Comprehension Tests: Summary of an Investigation." In Challenge and Experiment in Reading, ed. J.A. Figurel. International Reading Association Conference Proceedings 7:166-67.

Preston, R.C. 1964. "Ability of Students to Identify Correct Responses before Reading." Journal of Educational Research 58:181-83.

Pyrczak, F. 1972. "Objective Evaluation of the Quality of Multiple-Choice Test Items Designed to Measure Comprehension of Reading Passages." Reading Research Quarterly 8:62-71.

Pyrczak, F. 1974. "Passage-Dependence of Items Designed to Measure the Ability to Identify the Main Ideas of Paragraphs: Implications for Validity." Educational and Psychological Measurement 34:343-48.

Pyrczak, F. 1975. "Passage Dependence of Reading Comprehension Questions: Examples." Journal of Reading 18:308-11.

Stanovich, K.E., and A.E. Cunningham. 1991. "Reading as Constrained Reasoning." In Complex Problem Solving: Principles and Mechanisms, ed. R.J. Sternberg and P.A. Frensch, pp. 3-60. Hillsdale, N.J.: Erlbaum.

Thorndike, E.L. 1917. "Reading as Reasoning: A Study of Mistakes in Paragraph Reading." Journal of Educational Psychology 8:323-32.

Thorndike, R.L. 1973-74. "Reading as Reasoning." Reading Research Quarterly 9:135-47.

Tuinman, J.J. 1973-74. "Determining the Passage Dependency of Comprehension Questions in 5 [sic] Major Tests." Reading Research Quarterly 9:206-23.

Weaver, W.W., and A.C. Bickley. 1967. "Sources of Information for Responses to Reading Test Items." In Proceedings of the 75th Annual Convention of the American Psychological Association, 293-94.
