The Reading & Writing Placement English Test project

Component 6: Overview Report of the Revised RWPT

Zhengzheng Wu, MATFL

ESL Reading & Writing Placement Test

Intensive English Program, MIIS

Winter, 2011

Based on Language Assessment with Professor Jean Turner (EDUC 8540, Spring 2011)

Table of Contents

Overview Report

Appendix A: Articulation Document (AD)¹

Appendix B: Self-assessment Component

Appendix C: the RWPT Test (Reading and Writing)

Appendix D: Answer Key and Scoring Instructions

1. This document was produced by the MIIS ESL program. Attached as Appendix A in the hard copy, it has a separate page range (1-9) from the rest of the report due to a conflict in Microsoft Word page format.

The Reading and Writing Placement Test (RWPT) was developed for the purpose of

revamping the existing placement exam for reading and writing (R&W) used by the English as

a Second Language (ESL) Center at the Monterey Institute of International Studies (MIIS). The

development of the RWPT integrates rationales from multiple sources: assessment theories,

research findings, empirical data from the ESL program, test results from the previous piloted

RWPT, and consultation with experts in the field.

Design and Development of the RWPT

In this section I will summarize the shortcomings of the existing ESL placement exam,

and then elaborate on how these issues can be ameliorated through the implementation of the

new RWPT.

The Existing Placement Exam for Reading and Writing

The Intensive English as a Second Language Program, also known as the IEP² within the MIIS

community, attracts foreign students in their early 20s from all over the world. More than half of

these students have the goal of matriculating into a U.S. college program. The IEP program has

been using an R&W placement exam³ and a 1-8 point scale to decide the integrated proficiency

level of the student’s R&W skills. Following the test, the students new to the IEP are normally

placed into one of three classes: A (1-2), B (3-4), and C (5-6)⁴. The R&W level descriptors on the

1-8 scale are included in the ESL Articulation Document (AD) which is attached in Appendix A.

The levels are as follows: 1-3 (novice), 4-6 (intermediate), and 7-8 (advanced).

2. In this report, the terms “IEP” and “ESL” interchangeably refer to the MIIS ESL program.

3. The current ESL placement test comes in two forms, namely lower (less advanced) and higher (more advanced). The lower is what this project is focused on, targeting learners at levels 1-6. The higher form is used to diagnose levels 6-8.

4. A large majority of test takers are placed into levels 3-6, with most of the remaining students being placed into level 2.

The existing test has several weaknesses. First, as the IEP program administrators have

acknowledged, the number of reading subtests needs to be increased for a better judgment of the

learner’s reading proficiency (personal communication, July 2011). Secondly, the test lacks a

clear definition of constructs beyond the AD level descriptors. Without clearly defined test

constructs, the interpretations of the test scores are open to question (Bachman & Palmer, 1996).

Thirdly, the authenticity of the test needs to be improved. Bachman and Palmer (1996) contend

that all assessed features of the language should match the test takers’ domain of target language

use (TLU). The existing test prompts are not entirely composed of authentic materials derived

from the students’ TLU. Nor do the tasks exhibit a high degree of authenticity. Lastly, the IEP

graders rely mainly on their intuitive judgment rather than a standard rubric to determine each

student’s writing placement level.

The New RWPT (Lower Form)

The design of the new RWPT attempts to address the issues illustrated above, and

highlights a number of new features to further enhance the effectiveness of placement. The

discussion regarding the test design delves into the following topics: test usefulness, the guiding

principles from the AD, the interconnection of R&W skills, the test constructs and subskills, and

the component of self-assessment (Appendix B).

Test usefulness. Test developers should always keep in mind test usefulness to ensure the

quality of a given test (Bachman & Palmer, 1996). Bachman and Palmer’s (1996) model of “test

usefulness” serves as a theoretical framework to provide systematic directives for developing this

new test. This model consists of six test qualities to measure a test’s overall utility and

soundness. In Table 1 below, the “requirement” refers to the theoretical standard for each quality.

The “implementation plan” introduces actions taken in order to meet the standard in question.

Although the six qualities are listed individually in Table 1, they should not merely be evaluated

independently, but “in terms of their combined effect” (Bachman & Palmer, 1996, p. 18).

Table 1

Test Usefulness

| Quality | Requirement (Bachman & Palmer, 1996) | Implementation Plan (cf. Bachman & Palmer, 1996) |
| --- | --- | --- |
| Reliability | Variations in the characteristics of assessment tasks corresponding to variations in TLU tasks | Avoid a wide range of construct components to maximize reliability (high-stakes test); use task types reported with high reliability in the literature; use analytic scoring rubric |
| Construct validity | Test score interpretations correspond to the language abilities measured | Match test tasks with language tasks in the TLU (MIIS ESL language community); define test constructs integrating the current research findings on R&W; build a strong link between R&W skills through the writing subtest |
| Authenticity | The degree of correspondence of the characteristics of a test task to the features of a TLU task (content validity) | Use authentic texts from the TLU as test input; consult the AD regarding the types of text and task associated with level descriptors |
| Interactiveness | The degree of engaging a test taker’s language ability, topical knowledge, and schemata via test tasks (construct validity) | Select texts and design tasks in line with the language abilities measured by the test constructs as well as test taker’s background features |
| Impact | The test’s impact on the test takers and the IEP R&W curriculum | Use the test result as an inference of class placement; introduce skill-integrated tasks into the classroom (washback) |
| Practicality | Resources required for developing and implementing the RWPT | Collect and screen test prompts; design assessment tasks; hand-grade the tests by 2-3 in-house ESL instructors; run tests with no extra need for technology or cost |

The guiding role of the AD. Apart from the above, other practical considerations have been

integrated into the test design. Maintaining the integration of R&W skills is compatible with the

ESL curriculum design. In addition, the AD scale descriptors provide referential guidelines in

terms of the selection of test stimulus and task functions. One reason for doing so is that the

program staff is already familiar with the AD scale. Another reason is that the AD scale

mirrors the multi-level learning objectives of the IEP program. Basing the test content on course

objectives is one common design scheme for a placement test (Bachman, 1990). Furthermore,

most new students are likely to fall within the range 1-6 (personal communication,

July 2011), which provides a rationale to base the test design around AD levels 1-6.

The inter-connections between R&W. The new RWPT aims to promote and reinforce

the links between R&W, and there are theoretical reasons for this beyond the IEP’s curriculum

design. L2 R&W research literature reveals several rationales for merging R&W. Conventionally,

language literacy is associated with learning how to read and write (Hedgcock & Ferris, 2009).

The New Literacy Studies (NLS) argues that literacy cannot be reduced to cognitive skills.

Literacy means more than just acquiring knowledge. It includes what R&W do in the context of

social interaction (Barton, 2007). The authentic texts used in this test simulate the milieu of the

social interaction of the test takers. The writing task of this test (Appendix C) was based on

authentic materials so that the test takers must decode the passages (knowledge) in such a way as

to produce a writing sample (what to do with the knowledge) so as to reflect how the language is

used socially (literacy).

Furthermore, a student’s R&W abilities themselves are interdependently related. The

development of one modality is inseparable from the cultivation of the other. Writing

proficiency is heavily determined by what one gains at the end of the reading process (Just &

Carpenter, 1987). Unsurprisingly, a proficient writer normally possesses a sound reading ability

(Ferris & Hedgcock, 2005). Likewise, an experienced reader's ability to identify critical

information in the written text can partially be traced to the schematic knowledge of his/her

writing experience (Hirvela, 2004).

Hayes (1996) argues for three types of reading that are critical to writers. A writer needs

to read his/her own writing in order to detect errors, measure coherence, build organization, and

make possible improvements. This process is known as “reading to evaluate.” Secondly, the

writer must comprehend source texts and, thirdly, the written instructions that accompany them; the

need to write may follow from both. Hayes (1996) justifies using reading texts as a stimulus to generate

writing tasks, and integrates reading ability into the list of writing skills (below).

Reading construct. The current research on reading reveals an integrative view of

bottom-up and top-down approaches to information processing (Hedgcock & Ferris, 2009).

How multiple processes interact is enhanced by the reader’s schematic knowledge, such as

cultural expectations, genre knowledge, and linguistic knowledge (Hedgcock & Ferris, 2009).

The reading construct of the RWPT assesses the reader’s ability to extract useful information

from both top-down and bottom-up processes via the aid of schematic knowledge.

Grabe and Stoller (2011) use a car-related metaphor to illustrate the interplay of bottom-up

and top-down processes along with schemata. Arriving at reading comprehension is

comparable to driving a car to a destination. The fuel is comprehension at word-level (starting

point of bottom-up), and the engine is constructing meaning out of linguistic units at a higher

level, such as sentences and paragraphs (bottom-up moving up). Finally, it is the higher order of

skills and meta-strategies, such as use of schematic knowledge, inferencing and predicting

abilities as well as other top-down controls, which push the vehicle towards its destination, to

meet reading comprehension goals from the text (top-down). Successful reading therefore

requires a smooth coordination of multiple skills and sources of knowledge. Proficient readers

are more likely to be able to do so than novice readers (Hedgcock & Ferris, 2009). This sheds

light on how an integrative approach to test design can differentiate between proficient and

novice readers.

To define the reading subskills measured by the RWPT, I extracted the subskills

repeatedly featured in the AD descriptors for levels 1-6, since their recurrence signals the reading

abilities required of the learner. Because the AD lacks evaluation of any interactive

skills, I integrated several interactive skills from Rosenshine’s (1980) categories of reading

attributes. These interactive skills taken from Rosenshine (1980) and the subskills from the AD

form the following list of reading subskills measured by the RWPT.

Information sequence recognition (interactive)

Recognition of word(s) in context (bottom-up)

Identification of main idea(s) (top-down)

Decoding of detail(s) (bottom-up)

Inferencing (interactive)

Comparing and contrasting (interactive)

Writing construct. As discussed above, writing represents a form of production one

performs in literacy events. The nature of writing has both cognitive and social dimensions.

Undoubtedly, writing requires knowledge of linguistic features that are stored in the writer’s

working memory (Hayes, 1996). Writing occurs in a social context where the writer needs to

have the target audience in mind (Hayes, 1996). To address a particular audience requires an

understanding of discourse and sociolinguistic knowledge, such as social register, functional uses

of language, and other aspects of audience consideration, as extracted from Grabe and Kaplan’s

(1996) taxonomy of language knowledge.

Writing based on text sources and their instructions requires the ability to interpret the

input provided (Hayes, 1996). Hayes (1996) maintains that the writer’s “reflection” is a process

deeply influenced by text interpretation. “Reflection” is how new ideas are formed out of

previously existing internal representations. The writer uses these to form his/her schemata in

linguistic production (Ferris & Hedgcock, 2005). Individual schemata have to do with one’s

expectations of the text’s content and organization (Carrell, 1987). Thus the writer relies on

his/her prior experience (in addition to the information in the text source) to generate ideas, and

to strategize how to present and organize them.

To summarize what is written above, the writing construct is defined as the ability to read

a text source, identify the audience, utilize schematic notions and discourse knowledge, and

organize them into a coherent body of written text. The writing subskills measured by this

construct take several factors into consideration. First of all, one must measure all the major

subskills through a rubric. Weigle (2002) suggests the use of analytic scoring rubrics to ensure

placement and diagnostic accuracy, which the new RWPT does accordingly. Due to time

constraints for grading, in this case (two hours assigned to 2-3 faculty members), the design of

the rubric has to be efficient and easy to use. A lengthy list of subskills on the rubric would have

been counter-productive. Based on my interview with an IEP R&W instructor (personal

communication, October 2011), I discovered that the IEP’s writing instruction places its major

emphasis on grammatical construction, choice of vocabulary, and topical organization (including

coherence). The AD level descriptors 1-6 do not evaluate language at the level of

academic texts or tasks. Only from AD level 7 and above do academic R&W skills begin to be

evaluated. Therefore, rhetorical knowledge and other higher order knowledge, typically

associated with academic writing, are not evaluated in the rubric of the new RWPT. On the other

hand, sociolinguistic (e.g., audience consideration) and discourse knowledge (e.g., genre

structure) are important to writing assessment of all sorts (Weigle, 2002). Therefore, audience

consideration and discourse knowledge are also included in the list of subskills below. Text

interpretation was regarded as a separate skill area by the RWPT rubric (Appendix D) as a result

of integrating R&W skills. To avoid a lengthy list I grouped distinct but interrelated subskills into

the following three skill categories. For “language use” and “organization” I have consulted the

ESL Composition Profile compiled by Jacobs, Zinkgraf, Wormuth, Hartfiel, and Hughey (1981).

Language use (tense, agreement of subject and verb, articles, prepositions, syntactic

structures, effective word choice, etc.)

Organization (cohesion, coherence, and genre structure)

Text interpretation (topical relevance, understanding of source text and instructions, and

consideration of audience)

Self-assessment. A self-assessment survey (Appendix B) was introduced for the new

RWPT for two main purposes. First, the IEP is considering installing a self-evaluation tool to

preliminarily identify applicants who may not be suitable for the ESL program at MIIS (e.g.,

low-novice learners at English alphabet level or high-advanced learners ready for university

courses). Secondly, the survey can serve as additional evidence of test validity when compared

with the test score as well as the learner’s current ESL placement level. The rationale of the

survey’s design was to distinguish at which point test takers feel less comfortable with certain

R&W tasks. The survey consists of ten statements, half of which are targeted to the novice level,

and the other half to the intermediate. The statements were crafted around the task attributes

reflected in the AD. Each statement appears in the format of “I can perform certain tasks in

reading, or writing, or both of them combined (e.g., I can read and write simple class notes in

English).” The response is marked by four options (not at all, not really, sometimes, or most of

the time) that describe at which level the respondent feels his/her ability is matched with the task.

I have devised a method that converts the student’s responses into a rough proficiency level in

congruence with the AD scale. The conversion formula is based on the largest observed

concentration of answers, namely “sometimes” and “most of the time,” among the statements

(see Appendix B for more details). The formula helps identify at which level the test taker feels

most confident in his/her performance, which will be used as the self-reported proficiency level.

The Test Components

Prompt Attributes

As mentioned above, AD levels 1-6 provide major guidelines in the selection of test

stimuli. All test prompts were chosen based on topics familiar from the ESL students’ daily lives, which

were derived from the AD level descriptors 1-6. To ensure test authenticity the test prompts were

all extracted from authentic materials in the MIIS language community (Table 2).

Response Attributes

The choice of task formats rests on several factors. First, despite the small

size of the program, time-efficient and easy-to-use scoring is preferred by the IEP, mainly due to

the teaching staff’s tight working schedule upon the arrival of new students (personal

communication, July 2011). Normally 2-3 instructors need to grade an average of twelve tests

within two hours. The multiple-choice (MC) format allows only one correct answer, which ensures the efficiency

of scoring (Bachman & Palmer, 1996). In addition, as Table 1 shows, test reliability is a key consideration for

the RWPT, and the MC format supports it (Bachman & Palmer, 1996). For the

reasons above, MC items take up two-thirds of the RWPT (Appendix C).

Secondly, constructed response items are used as an alternative format to MC to elicit

more solid evidence of reading comprehension (Hedgcock & Ferris, 2009). Also, compared to

MC, short-response items minimize the chances for guessing (Bachman & Palmer, 1996). The

answers for Items 14 and 15 are relatively controlled by the facts embedded in the prompts. Item

13, particularly targeting the reader’s use of schemata, allows freer production and more varied

answers.

Thirdly, as defined in the test constructs, the RWPT assesses the reader’s ability to use

interactive reading subskills along with R&W integrated literacy skills. Text reconstruction tasks

have good potential to meet this assessment goal, where readers have to activate their schemata

and integrate them with their strategic skills (Hedgcock & Ferris, 2009). For both Items 16 and

17, test takers need to reassemble four sentence strips into a coherent short passage.

The authenticity of these two tasks rests on Hayes’ (1996) contention that writers must be able to

read their own writing to check it for organizational coherence (integration of R&W).

Lastly, the writing task (Item 18), an email response based on the reading of two rental

advertisements, incorporates the writer’s text interpretation skills of the test prompts and task

directions (Hayes, 1996). The minimum word limit is 150 words to ensure a ratable sample

(Hamp-Lyons, 1991). Besides the use of authentic sources for the prompts, the task also models

itself on authentic social interaction: the writer, in need of a new residence, receives two

advertisements from a friend, and writes back to the friend to explain his/her choice by comparing

the two places and giving supporting facts, such as the needs of his/her personality and lifestyle.

To link the test tasks with the test constructs defined above, Table 2 below offers a quick

glimpse into the subskill(s) measured by each test task.

Table 2

Distribution of Text Types, Task Types, and Subskill(s) Measured across Test Items

| Item No. | Type of Text | Type of Task | Focal Subskill(s) |
| --- | --- | --- | --- |
| 1 | Public sign on campus | MC | Identification of main idea(s) |
| 2 | Public sign on campus | MC | Identification of main idea(s) |
| 3 | Looking-for-roommate | MC | Recognition of word(s) in context |
| 4 | Same as Item 3 | MC | Decoding of detail(s) |
| 6 | Social event invitation | MC | Identification of main idea(s) |
| 7 | Apartment tenant notice | MC | Identification of main idea(s) |
| 8 | Same as Item 7 | MC | Recognition of word(s) in context |
| 9 | Movie synopsis | MC | Identification of main idea(s) & inferencing |
| 10 | Campus-related news story | MC | Identification of main idea(s) |
| 11 | Same as Item 10 | MC | Recognition of word(s) in context |
| 13 | Email invitation to a workshop | SR* | Identification of main idea(s) & inferencing |
| 14 | Poster of a campus photo contest | SR | Decoding of detail(s) |
| 15 | Same as Item 14 | SR | Decoding of detail(s) |
| 16 | Description of a potential roommate’s life routine | TR* | Information sequence recognition |
| 17 | Campus-related news story | TR | Information sequence recognition |
| 18 | Sublet rental advertisements (2 pieces) | W* | Language use, organization, and text interpretation (incl. the reading subskill “comparing and contrasting”) |

Note. SR*: short-response; TR*: text-reconstruction; W*: writing

Answer Key and Scoring Rubrics

One way of weighing test components is to take into account pedagogical implications

(Alderson, Clapham & Wall, 1995). To match the ESL curriculum, the test will give an equal

weight to reading and writing. To meet this goal as well as to comply with the AD 1-6 level

design, I have assigned the reading and writing sections an equal value of 6 points. The average

of the individual R&W results combined determines the final placement level (1-6).
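The equal-weight averaging described above can be sketched in a few lines of Python. This is only an illustration and not part of the RWPT materials: the function name `placement_level` is hypothetical, and the 0-6 bounds on each section are an assumption drawn from the 6-point value assigned to each section. The example inputs are the pilot volunteer's results reported later in this report.

```python
def placement_level(reading_level: float, writing_level: float) -> float:
    """Average the reading and writing results, each worth up to 6 points."""
    for name, value in (("reading", reading_level), ("writing", writing_level)):
        # Assumed bounds: each section is worth at most 6 points.
        if not 0 <= value <= 6:
            raise ValueError(f"{name} result must lie between 0 and 6, got {value}")
    # Equal weighting: a simple mean of the two section results.
    return (reading_level + writing_level) / 2


# Reading level 6 and writing level 4.5, as in the pilot volunteer's case.
print(placement_level(6, 4.5))
```

The sketch returns the unrounded average, since the report does not specify how a fractional average is mapped to a whole class level.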

The MC items in the reading subtest are objectively measured by a standard key

(Appendix D). The answers to the short-response items may involve graders’ subjectivity. One of

the text-reconstruction tasks is open to partially correct answers. Suggested correct answers and

guidelines for partial credit are also provided in the answer key (Appendix D). The tallied raw

score of the reading subtest is then converted to a 1-6 point scale for level assignment (Appendix

D). The cut-off points suggested by the scale merely represent a tentative proposal. The results

from future pilot tests should help fine-tune the scale’s conversion accuracy in

relation to the score band.

Weigle (2002) contends that analytic scoring, which measures different writing attributes

and provides diagnostic information separately, has the potential to assess the writer’s ability more

accurately than holistic scoring. Due to the high-stakes nature of a placement test, an analytic

scoring rubric is used to calculate the test taker’s writing level (Appendix D). As indicated in the

writing construct above, three writing skill areas form the focus of the rubric. Each writing

skill area (worth 2 points) is evaluated with four possible results: poor, fair, good, and excellent.

For the final writing score the results of the three independent areas are added up. The entire

writing task is worth 6 points.

Test Administration and Pilot Testing

Through the pilot testing of the older RWPT (see my previous C6 attached in C11 with

readers’ comments) I have maintained contact with some of the test takers. During the winter

break of the ESL program I sent out a group email seeking a volunteer to take the new RWPT.

One respondent asked if he could finish the test electronically. On January 2, 2012, I emailed the

volunteer the self-assessment form, the RWPT test, and test instructions. These instructions

included directions such as finishing the test in one sitting, and timing oneself. The following are

my suggested testing procedures and resources for a future pilot test, some of which were

integrated into my email instructions to the volunteer test taker.

The pilot test should take place in a quiet and comfortable classroom. Prior to the test, the

proctor should tell the test takers to bring their own pencils and dictionaries. During the self-assessment

stage test takers can use dictionaries to aid their reading of the evaluative statements. During

the test itself, no dictionaries or electronic devices should be allowed. All cell phones must be

turned off. Although the test seems fairly straightforward, the proctor should prompt test takers

to record their concerns regarding specific items as well as their overall impression of the test in

the feedback section. He/she should also be prepared to answer questions from the test takers

regarding the task instructions. Meanwhile, the proctor can consider holding a post-pilot-test talk

to elicit more informal feedback from the test takers. This action provides an additional measure

to enhance the test’s face validity (below).

I received the completed test from the volunteer on January 7, 2012. He reported that it

took him three minutes to finish the self-assessment, and eighty-five minutes to complete the

entire test. This can be used as a preliminary estimate of the time required for

the RWPT.

Test Validity and Reliability

Validity

The RWPT, a criterion-referenced test, was developed for a program-specific diagnostic

purpose. No established placement test used for the same purpose is available against which to compare

the results. Therefore, criterion validity is not a relevant concern. The major evidence of

validity for the RWPT lies in the degree of correspondence of the test tasks to the test construct

(construct validity), the degree to which the test’s content is representative of the content measured

(content validity), the comparison with the test takers’ current placement level (a weak form of

concurrent validity), and the participants’ opinions of the test experience (face validity).

I obtained valuable feedback from two experienced ESL and testing specialists (native

speakers of English) who helped me improve the test items and evaluated the validity of the

construct along with its content at face value. By measuring individual test items against the

subskills illustrated in Table 2, they both affirmed an observable corresponding relationship

between the subtests and the test constructs. Also, they endorsed the overall test authenticity and

the integration of R&W skills assessed by the writing task. Lastly, they both believed that the

test’s content (choice of test stimulus and item types) seemed to match the R&W needs of the

ESL students at MIIS.

The first person is an ESL teacher from the University of Oregon. He streamlined the

instructions for Item 13 to make the meaning clearer. He also questioned whether I should

use the idiomatic expression “pave the way” as a part of the correct answer for MC Item 10,

which may test linguistic knowledge beyond what the item intends to measure. In light

of this concern I reworded the correct answer. Lastly, he suggested I reshuffle the order of sentence

strips in the sequencing tasks, Items 16 and 17, to increase their overall difficulty given the

short length of each text. The second specialist is my original class project advisor Prof. Jean

Turner. She also suggested I simplify the wording of some of the task instructions to take into

account low-proficiency readers. She suggested ways to rephrase the question of Item 14 to

further align the task to the elicited responses indicated in the answer key.

One important way to measure face validity is to gather feedback from test takers

regarding the test’s relevance to their learning (Gronlund, 1998). I looked into the volunteer’s

responses to the feedback questions. Also, in my email I encouraged the test taker to give

additional comments on the test, such as whether he found the task instructions straightforward

and readable enough. The volunteer did not raise any questions regarding any test items. He

found the test useful to measure his R&W skills, and believed the language use featured in this

test was relevant to his R&W needs in real life (test authenticity). In the future the pilot test

administrators should continue to elicit written and oral feedback from all test takers for a more

thorough estimate of face validity.

To gather a preliminary gauge of the concurrent validity, I calculated the volunteer’s test

score and compared the result to his current ESL placement level as well as self-rated proficiency

level. The volunteer was placed into level 5 by the IEP program (personal communication,

December 2011). He scored level 6 for reading and 4.5 for writing on the RWPT. The average of the

two, 5.25, marks his RWPT placement level. This result matches his current IEP R&W

placement level. His self-assessment falls in the range of level 4-5, which also seems to correspond

well with his RWPT test result.

For the future pilot test, Spearman’s rank-order correlation coefficient should be

calculated between the RWPT placement level and the test taker’s current R&W class level. Because

the self-assessed proficiency is an estimated range rather than a discrete number,

Spearman’s correlation coefficient is not applicable to it. The test administrator can

nevertheless compare it empirically against the other two placement levels for additional evidence of

concurrent validity.
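As a rough illustration of the proposed correlation, the sketch below computes Spearman's coefficient in plain Python by ranking both variables (tied values share the mean of their ranks) and then taking the Pearson correlation of the ranks. The function names and the five score pairs are hypothetical and serve only to show the mechanics; no such pilot data exist yet.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the two rank lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least two cases")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: RWPT placement levels and current class levels of five test takers.
rwpt_levels = [2.5, 3.0, 4.5, 5.25, 6.0]
class_levels = [2, 3, 4, 5, 6]
print(spearman_rho(rwpt_levels, class_levels))
```

Computing the Pearson correlation on ranks, rather than using the shortcut formula based on squared rank differences, keeps the result correct when several test takers share the same class level.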

Reliability

Inter-rater reliability should be considered as the main evidence of test reliability for the

writing subtest. Two raters, preferably experienced in teaching and grading ESL writing, should

receive a brief training on using the scoring rubric before the pilot test. After having graded the

test, they should provide feedback on the scoring effectiveness of the rubric as well as problems

detected. The input will lend a basis for further revision of the rubric.

For the reading subtests I propose using the Kuder-Richardson formula 20 (KR-20) to

calculate the RWPT’s internal consistency. The proposal is based on two main reasons. First,

KR-20 is regarded as a conservative estimate of reliability (Brown, 2005), which seems

compatible with the high-stakes nature of the RWPT. Secondly, Bachman (1990) argues that KR-20

tackles the shortcomings of various split-half methods by taking “the average of all the possible

split-half coefficients on the basis of the statistical characteristics of the test items” (p. 176).
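For readers who wish to see the calculation, a minimal sketch of KR-20 follows. It assumes that every item is scored dichotomously (1 for correct, 0 for incorrect), which KR-20 requires, and that the variance of total scores is computed as a population variance; textbooks differ on that last point. The function name and the small response matrix are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a matrix of 0/1 item scores, one row per test taker."""
    if len(responses) < 2:
        raise ValueError("need at least two test takers")
    n_items = len(responses[0])
    if n_items < 2 or any(len(row) != n_items for row in responses):
        raise ValueError("every test taker needs scores on the same 2+ items")
    n_takers = len(responses)
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)  # assumption: population variance of total scores
    if total_var == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_takers  # item facility
        pq_sum += p * (1 - p)  # item variance for a dichotomous item
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical scores of four test takers on three dichotomous items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(scores))
```

Because the formula presupposes right-or-wrong scoring, items that allow partial credit would need to be dichotomized or analyzed with a different coefficient.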

Conclusion

The careful planning around test usefulness of the RWPT is meant to achieve a fair,

justifiable, and accurate placement decision. The test design, particularly in terms of test authenticity,

may generate positive washback to the instruction and curriculum design of the IEP. The initial

pilot test with one volunteer test taker suggested a positive potential of the RWPT’s placement

ability. The RWPT project can be greatly enhanced through full-scale pilot testing with

test takers representing a wide range of proficiency levels on the AD scale 1-6. The inter-rater

grading process and statistical item analysis (e.g., item facility and discrimination) will generate

more evidence of test reliability. The item analysis results will also inform

individual item revision, the possible removal of certain items, and the addition of new

tasks.

I have gained a tremendous amount of knowledge and experience by conducting this

assessment project. Although the test is not in Chinese, the target language I chose for my MA

track, it has offered me ample opportunities to (a) glean in-depth insights and theoretical grounds

from the contemporary literature on R&W skills, (b) gain hands-on experience in designing an

integrated R&W test that promotes test usefulness, and (c) enhance my awareness of ensuring

construct validity and test reliability through the design of assessment tasks and scoring tools. I

believe the skill and knowledge components listed above will serve me well in handling

language assessment projects in Chinese when the future need arises.

(Word count: 4971, not including pages 1 and 2)

References

Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation.

Cambridge, UK: Cambridge University Press.

Bachman, L. (1990). Fundamental considerations in language testing. Oxford, UK: Oxford

University Press.

Bachman, L., & Palmer, A. (1996). Language testing in practice. Oxford, UK: Oxford University

Press.

Barton, D. (2007). Literacy: An introduction to the ecology of written language (2nd ed.). Malden,

MA: Blackwell.

Brown, J. D. (2005). Testing in language programs: A comprehensive guide to English language

assessment. New York, NY: McGraw-Hill.

Carrell, P. L. (1987). Text as interaction: Some implications of text analysis and reading research

for ESL composition. In U. Connor & R. Kaplan (Eds.), Writing across languages:

Analysis of L2 text (pp. 47-56). Reading, MA: Addison-Wesley.

Ferris, D. R., & Hedgcock, J. S. (2005). Teaching ESL composition: Purpose, process, and

practice (2nd ed.). Mahwah, NJ: Lawrence Erlbaum.

Grabe, W., & Kaplan, R. B. (1996). Theory and practice of writing. New York, NY: Longman.

Grabe, W., & Stoller, F. (2011). Teaching and researching reading (2nd ed.). Harlow, England:

Longman/Pearson Education.

Gronlund, N. (1998). Assessment of student achievement (6th ed.). Boston, MA: Allyn & Bacon.

Hamp-Lyons, L. (Ed.). (1991). Assessing second language writing in academic contexts.

Norwood, NJ: Ablex.

Hayes, J. R. (1996). A new framework for understanding cognition and affect in writing. In C. M.

Levy & S. Ransdell (Eds.), The science of writing. Mahwah, NJ: Lawrence Erlbaum

Associates.

Hedgcock, J. S., & Ferris, D. R. (2009). Teaching readers of English: Students, texts, and

contexts. New York, NY: Routledge.

Hirvela, A. (2004). Connecting reading and writing in second language writing instruction. Ann

Arbor, MI: University of Michigan Press.

Jacobs, H., Zinkgraf, S., Wormuth, D., Hartfiel, V., & Hughey, J. (1981). Testing ESL

composition: A practical approach. Rowley, MA: Newbury House.

Just, M., & Carpenter, P. (1987). The psychology of reading and language comprehension.

Boston, MA: Allyn & Bacon.

Rosenshine, B. V. (1980). Skill hierarchies in reading comprehension. In R. J. Spiro, B. C. Bruce,

& W. F. Brewer (Eds.), Theoretical issues in reading comprehension: Perspectives from

cognitive psychology, linguistics, artificial intelligence, and education (pp. 535-554).

Hillsdale, NJ: Lawrence Erlbaum.

Weigle, S. C. (2002). Assessing writing. Cambridge, England: Cambridge University Press.

Appendix B

Self-Assessment Component

Since this RWPT project involves only the lower form, the self-assessment survey is

mainly used as an additional piece of evidence of test validity rather than as a tool to help the ESL

students choose which form of the test to take. The self-reported proficiency levels will be compared

with the student’s test score and his/her current ESL class placement.

Statements numbered 1-5 represent R&W skills at novice level (placement levels 1-3).

Statements numbered 6-10 represent R&W skills in an intermediate proficiency class (placement

levels 4-6). Statements 4 and 5 are intended for upper-novice level (level 3), or as a transition

between novice and intermediate levels. Statements 8 and 9 measure upper-intermediate (level 6)

reading skills; statement 10 measures upper-intermediate writing skills.

Students are asked to circle one answer out of four options on a descriptive Likert scale

indicating how confident and comfortable they are with a task. The four options are:

1. Not at all.

2. Not really.

3. Yes, sometimes.

4. Yes, most of the time.

The self-assessment result will only be used as an empirical estimate of the student’s

proficiency level. The key approach is to find out how the answers numbered 3 and 4 are

distributed (confidence in task performance). If they are concentrated among the statements 1-5

but not 6-10, the examinee is a self-reported novice learner (levels 1-3). If the 3s and 4s are also

concentrated among statements 6-10, the examinee is in the intermediate range (levels 4-6). If

he/she circles answers 3 or 4 for statements 8-10, the examinee is likely a level 5-6 reader and

writer. If the concentration of answers 3 and 4 is among statements that border between the

novice and intermediate levels, as in statements 4-6, the student is likely a level 3 or 4 learner.

The following statements appear on the self-assessment survey.

1. I can read and understand most public signs (e.g., street & store signs) in English.

2. I can write simple questions based on an English text I understand.

3. I can read and write simple class notes in English.

4. I can read and write a simple email in English to a friend.

5. I can write simple paragraphs in English.

6. I can communicate with my teacher over complicated issues in an email.

7. I can read business correspondence from banks and schools.

8. I can read news events and write a brief summary in English.

9. I can read a news article that voices an opinion.

10. I can write a 2-3 page essay in English.

Appendix C

MIIS ESL Reading and Writing Placement Test

Winter, 2011

Name: _________________ Country:______________ Completed Time:_____________

Multiple Choice: Please read each item carefully and choose the best answer.

Item 1: Look at the sign in a public bathroom.

1) What does it tell you about toilet paper?

A) It is a daily necessity. B) Use all you want. C) It needs to be saved. D) It is bad for our environment.

Item 2: You see a sign posted on the window near a library.

2) This sign means that smoking is not allowed…

A) within 20 feet of the library. B) anywhere around the library. C) within 20 feet inside the library. D) anyplace in the library.

Items 3-5: You are looking for a roommate and find this advertisement on the internet.

I'm a 22 year old student living in Salinas with my boyfriend. We have a smaller dog (not teeny

tiny, about 16 lbs) and are open to more dogs. We both like to drink socially or have a drink with

dinner. We would sometimes have a few people over, but not an insane amount of partying.

Cigarette smoking is fine, but it must be done outside.

3) Based on the context of the advertisement, insane means:

A) unexpected B) unspecified C) uncontrolled D) unannounced

4) The writer does not want ______ in the house.

A) more dogs B) visitors C) smoking D) partying

5) Choose the sentence that best describes how much the writer drinks.

A) She does not drink at all. B) She only drinks sometimes. C) She only drinks with her boyfriend. D) She often drinks a lot.

Item 6: You received a flier in your MIIS email.

CROSS-CULTURAL LUNCHES!!!

A GREAT MIIS TRADITION!!!

12:15 Thursday, October 20th

Holland Center courtyard

Free dessert and refreshments!

Six students will sit at each table arranged by nationality and academic program. Faculty members will be there to facilitate a great cross-cultural discussion! To participate, please sign up in advance! Email Professor Peter Grothe at pgrothe@miis.edu. Tell me your name, nationality and academic program.

ALL MIIS STUDENTS WELCOME!!!

6) What kind of event is being held on Oct. 20th?

A) a light-hearted lecture B) a study group C) a local food sampling D) a cultural exchange experience

Items 7-8: Read the note that was left on your apartment door.

Dear Resident:

Unfortunately, we have to shut the water off on Tuesday, 4/5/11, from the hour of 10:00 AM to 2:00 PM. This is necessary to complete the plumbing work in progress. We regret any inconvenience this may cause you.

7) What do you expect to happen on 4/5/11?

A) There will be some repair work done. B) There will be no electricity. C) The office will be closed. D) Some people will move out.

8) What does the word regret mean in the last sentence?

A) inform you of B) apologize for C) avoid D) warn you of

Item 9: Here is a summary of a movie that is playing downtown.

The Music Never Stopped

Runtime: 105min.

Synopsis

An engineer is unable to communicate with his long lost 35-year-old son, a former hippie, who has a brain tumor preventing him from creating new memories until they discover that they can communicate through music, including Bob Dylan and the Grateful Dead.

9) If you chose to watch this film, which type of movie could you expect to see?

A) romantic comedy B) serious drama C) action thriller D) science fiction

Items 10-12: This is a news story going around at MIIS.

Thirty-seven Fulbright scholars from countries ranging from Madagascar to Poland and Panama to Bangladesh have arrived for a three-week intensive course at the Monterey Institute designed to prepare them for graduate studies in the United States. The focus of their studies at MIIS is on academic English, public speaking and other graduate school skills.

After their three weeks in Monterey, the students will matriculate into master’s or doctorate programs at universities across the United States, where they will be studying everything from engineering to tourism, public health to animal breeding and genetics. Although the program is intensive, there are also a number of exciting extracurricular opportunities. The students have already toured both San Francisco and Carmel and have other excursions planned for next weekend. They also visited the Monterey Farmer’s Market and were hosted for dinner by members of the Monterey Community.

10) What is the Fulbright scholars’ main purpose for being in Monterey?

A) to enrich their cultural experience B) to exchange academic expertise C) to participate in special tours D) to get ready for school programs

11) The word extracurricular means

A) extraordinary B) exceptional C) after-school D) recreational

12) Which of the following activities was arranged?

A) visiting a farm nearby B) going to restaurants C) eating at a local’s home D) hiking in a scenic area

Item 13: This is an email regarding a MIIS workshop. Please read it and answer the following question.

Dear new international students,

Please note the email that you received today from my wonderful colleague, Kathy Sparaco. There is an extremely important workshop for all new international students about dealing with the stresses and strains of difficult graduate work and about adapting to American culture and the MIIS academic culture. You will probably find it very, very helpful! There will be many helpful hints and pieces of advice from professionals that you will probably find very useful!

Kindest regards,

Peter

Professor Peter Grothe, Ph.D.

Director of International Student Programs Emeritus

Monterey Institute of International Studies

13) Faculty members will hold a question-and-answer session at the end of the workshop. You will attend the workshop. Write two questions you will ask, or that might come from the audience. Be as specific as you can.

Items 14-15: You saw a poster on the campus. Please provide a short response to each question below.

14) Based on the poster above, name two kinds of MIIS students who would most likely participate in the contest.

15) What will happen to the top-selected work? (2 details)

Items 16-17: Below you find the titles of two passages. Use the passage’s title as a clue to put the sentences of each passage into a coherent order. Write the numbers in order above the broken lines. For passage 1, there is more than one full-credit answer. However, for passage 2, there is only one full-credit answer.

16) Passage 1: a self-description

My life-routine

1. I am usually home by 6:00 PM, spending my evenings making dinner, watching the news, going for a walk, reading or watching a movie.

2. I usually spend Mondays at home cooking, doing laundry, cleaning and getting ready for my workweek ahead.

3. I have Sundays and Mondays off.

4. I am an early riser, up and out by five to swim before heading off to work.

Answer: ___ ___ ___ ___

17) Passage 2: a news story

Tatiana Ivanova: First Translated Book Published Weeks after Graduation

1. Tatiana has already started translating her next book for the publishing house, Because of Mr. Terupt by Rob Buyea. Not a bad start to what is sure to be a wonderful career!

2. Tatiana Ivanova (MATI ’11) graduated in May from the very demanding Translation and Interpretation degree program at the Monterey Institute.

3. Her translation was published by the Russian Children’s publishing house Pink Giraffe in the beginning of June, less than a month after she received her master’s degree.

4. As if that was not enough of a challenge, she somehow found time in between classes and assignments to work on a translation of the 1989 award-winning novel Holes by Louis Sachar into Russian.

Answer: ___ ___ ___ ___

Item 18: Write an email response

You are thinking about moving to a new location in Monterey. Your friend at MIIS sent you two rental advertisements. Compare the two advertisements carefully, and write your friend back about which place suits you better. Consider the various features of each place, and give specific reasons why your choice appeals to you. Discuss aspects relating to your lifestyle and personality.

You must write at least 150 words. For convenience you can use the terms “place 1” and “place 2” in your writing.

Place 1

$750 beautiful house in the heart of Monterey

Rooms for rent in beautiful house, 750 a room, available now. looking for tenants who are clean and self-sufficient. Need to share bath with the other tenant. Close to MIIS, DLI, and downtown. Beautiful private location with nice front and backyards. new paint and carpet, plenty of parking, and lots of room for your own garden if interested ,big outdoor grill with sink on back deck, safe neighborhood with minimal street traffic. Available now call 831 XXX XXX.

Place 2

$670 room for rent in Monterey

Room for rent in fun, Monterey home. The house is located very close to MIIS and offers washer/dryer, garage, garbage service (included) open fireplace, huge kitchen and living room and sunny patio area for BBQ or sunbathing during summer. Deposit (300) and rent required (670). Current residents are 20's 30's male/female mix who like to party and have fun but also expect residents to be responsible and clean up after themselves in the public areas of the house. We are seeking a candidate who shares similar interests and has an understanding of our expectations of responsibility. Please txt 831 XXX XXX.

----------------------------------------------End of Test-------------------------------------------------------

Feedback

1. Overall, do you find this test useful for measuring your reading and writing skills?

2. Do you find all the situations relevant to your English communication needs in Monterey? If not, which situation(s) were not useful?

Appendix D

Scoring Key and Rubrics

General Instructions

The rating of this test will retain the existing 8-point scale adopted by the IEP. The scale

divides test takers into four R&W classes, namely A (1-2), B (3-4), C (5-6), and D (7-8). Because

the test in question is the lower form, only levels 1-6 will be tested. In other words,

the score range for each of the reading and writing sections is 1-6 on the 8-point scale. The

reading and writing subtests will be scored separately, and each accounts for 6 points.

The average of the two section scores will be used as the basis for placing the individual

student.

The Reading Section

Items 1-12: Multiple Choice (each item=1 point, total=12 points)

1. C   2. A   3. C   4. C   5. B   6. D   7. A   8. B   9. B   10. D   11. C   12. C

Items 13-15: Short Response (each item=2 points)

13. The two written questions should pertain to either graduate-study or culturally related issues

(each question=1 point, total=2 points). Below are a few examples of possible answers. Please

note that the first two examples are based on the answers given by the volunteer test taker

mentioned in the report.

How do I deal with multiple class assignments that I do not seem to have enough time to

finish?

When I talk to my American classmates, they seem to understand me very well despite

my errors. But how would I know when I made an embarrassing mistake in my speech?

Does MIIS have free tutors who can help me with my academic writing?

Can MIIS students give their professors thank-you gifts?

How can I rapidly increase my vocabulary in academic reading?

14. 1) A MIIS student who likes photography/taking photos (1 point).

2) A MIIS student who has traveled to exotic and scenic places (1 point).

15. Any two of the three details below (each detail=1 point, total=2 points).

1) The photos will be put on display (at the International Bazaar)

2) The photos will become MIIS’ property.

3) The photos will be used in the MIIS calendar.

Items 16-17: Text reconstruction (each item/passage=2 points)

Table 1

Scoring Rubric for Items 16 and 17

Item No. | 2 points | 1 point | 0 points

16 | 4 1 3 2 or 3 2 4 1 | The answer contains only one pair of cohesive sentences (either 4 1 or 3 2). | The answer contains no cohesive sentence pair (neither 4 1 nor 3 2).

17 | 2 4 3 1 | N/A | Any other order
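The rubric in Table 1 can be sketched as two small functions. This is only an illustrative sketch: the function names are hypothetical, and treating an answer that is not a full ordering of sentences 1-4 as 0 points is an assumption, since the rubric does not address incomplete answers.

```python
def score_item_16(answer):
    """Score Passage 1 per Table 1: one point per cohesive pair found.

    answer: sequence of four sentence numbers, e.g. (4, 1, 3, 2).
    """
    # Assumption: an answer that does not use each of 1-4 exactly once
    # earns 0 points (the rubric is silent on this case).
    if sorted(answer) != [1, 2, 3, 4]:
        return 0
    adjacent_pairs = set(zip(answer, answer[1:]))
    # The cohesive pairs named in the rubric are "4 1" and "3 2".
    # Both pairs present means the order is 4 1 3 2 or 3 2 4 1 (2 points).
    return sum(pair in adjacent_pairs for pair in [(4, 1), (3, 2)])


def score_item_17(answer):
    """Score Passage 2 per Table 1: full credit only for 2 4 3 1."""
    return 2 if tuple(answer) == (2, 4, 3, 1) else 0
```

For Item 16, counting the cohesive pairs gives the same result as the rubric because the only orderings that contain both pairs are the two full-credit answers.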

Once the rater has tallied the total reading points from Items 1-17, he/she needs to refer to

the conversion scale in Table 2 in order to translate the score into a proficiency level.

Table 2

Converting Test Taker’s Total Reading Score into Levels 1-6

Number of points | 1-4 | 5-8 | 9-12 | 13-16 | 17-19 | 20-22

Converted level (AD 1-6) | 1 | 2 | 3 | 4 | 5 | 6
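The conversion in Table 2 amounts to a simple band lookup, sketched below. The names `READING_BANDS` and `reading_level` are hypothetical. Table 2 does not list a level for a total of 0 points, so the sketch raises an error for any total outside 1-22 rather than guessing a level.

```python
# (lowest points, highest points, converted AD level), copied from Table 2
READING_BANDS = [
    (1, 4, 1),
    (5, 8, 2),
    (9, 12, 3),
    (13, 16, 4),
    (17, 19, 5),
    (20, 22, 6),
]


def reading_level(total_points):
    """Convert a total reading score (Items 1-17) into an AD level 1-6."""
    for low, high, level in READING_BANDS:
        if low <= total_points <= high:
            return level
    # Totals outside 1-22 (including 0) are not covered by Table 2.
    raise ValueError(f"no level defined for {total_points} points")
```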

The Writing Section (Item 18)

An analytic scale facilitates scoring when an individual has uneven strengths or

weaknesses among the skill areas (Bachman & Palmer, 1996). It is thus considered more

accurate and reliable than a holistic scale for a placement test. Therefore, an analytic scoring

scheme was proposed for assessing the writing samples for the RWPT.

To comply with the AD levels 1-6, each writing skill area is assigned a value of two

points. There are four levels in each area, marked by “poor (=0.5)”, “fair (=1.0)”, “good (=1.5)”

and “excellent (=2.0)”. The rater needs to decide which level represents each skill area from the

student’s writing sample, and add all three up to reach a final score. A writing sample that

successfully demonstrates proficiency in all three subskill areas equals 6 points. The list below

shows the categories of subskills measured by the rubric. Some of the rubric descriptors are

modeled on the analytic scale for ESL composition compiled by Jacobs, Zinkgraf, Wormuth,

Hartfiel, and Hughey (1981).

Language use (tense, agreement of subject and verb, articles, prepositions, syntactic

structures, and effective word choice, etc.)

Organization (cohesion, coherence, and genre structure)

Text interpretation (topical relevance, understanding of source text and instructions, and

consideration of audience)
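The arithmetic of the analytic scale can be sketched as follows. This is a minimal sketch with hypothetical names (`RATING_VALUES`, `AREAS`, `writing_score`, `placement_average`). The averaging step follows one reading of the General Instructions, namely that the converted reading level and the writing score are averaged; how a fractional average maps onto a class is not specified in this appendix and is left out.

```python
# Point values for the four rubric levels, as stated in the text
RATING_VALUES = {"poor": 0.5, "fair": 1.0, "good": 1.5, "excellent": 2.0}

# The three subskill areas measured by the rubric
AREAS = ("language_use", "organization", "text_interpretation")


def writing_score(ratings):
    """Sum the three area ratings into a writing score (maximum 6.0).

    ratings: dict mapping each area name to a rubric level,
    e.g. {"language_use": "good", ...}.
    """
    missing = [area for area in AREAS if area not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(RATING_VALUES[ratings[area]] for area in AREAS)


def placement_average(reading_level, writing_total):
    """Average the two section scores (assumed reading of the instructions)."""
    return (reading_level + writing_total) / 2
```

For example, a sample rated good, fair, and excellent across the three areas would total 1.5 + 1.0 + 2.0 = 4.5 points.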

Table 3

Scoring Rubric for the Writing Task

Excellent (=2.0)

Language Use

- Very few to no grammar errors

- Effective use of varied sentence constructions

- Meaning clearly expressed with effective choice of words

Organization

- Coherently and logically organized text

- Ideas clearly presented and fully developed

- Effective use of cohesive devices

Text Interpretation

- Thorough understanding of the text prompts and task instructions

- Relevant ideas highly organized

- Addresses the target audience properly

Good (=1.5)

Language Use

- Limited grammar errors

- Minor mistakes with sentence structure and word choice

- Meaning seldom obscured

Organization

- Some loosely organized sentences

- Main ideas clearly stated but need further development

- Minor errors in using cohesive devices

Text Interpretation

- Very minor misdirection from following the task instructions

- Slight deviation from the topic

- Slight inconsistencies in addressing the target audience

Fair (=1.0)

Language Use

- Frequent grammar errors

- Major problems with sentence structure and word choice

- Meaning frequently obscured

Organization

- Disconnected and logically confusing sentences

- Ideas not well-developed and choppy

Text Interpretation

- Major lack of understanding of the prompt and/or the directions

- Significant deviation from the topic

- Major deviation from addressing the target audience

Poor (=0.5)

Language Use

- Dominated by grammar errors

- Meaning confusing or incomprehensible

- Or not enough text (incomplete) to evaluate

Organization

- Little/no organization

- Little/no logical order or topical cohesion

- Or not enough text (incomplete) to evaluate

Text Interpretation

- Little/no understanding of the prompt and/or task directions

- Ignores the target audience

- Or not enough text (incomplete) to evaluate

References

Bachman, L., & Palmer, A. (1996). Language testing in practice. Oxford, UK: Oxford University

Press.

Jacobs, H., Zinkgraf, S., Wormuth, D., Hartfiel, V., & Hughey, J. (1981). Testing ESL

composition: A practical approach. Rowley, MA: Newbury House.
