Measuring Success in English for Young People Annabelle G. Simpson Director, Channel Management, ETS...

Measuring Success in English for Young People

Annabelle G. Simpson
Director, Channel Management, ETS Global Division

Outline

• Who is ETS?
• Two Families of Products: TOEFL® & TOEIC®
• How does ETS develop quality tests?
• What is TOEIC® Bridge?
• What is TOEFL® Junior?

ETS: Our Mission

To Advance Quality and Equity in Education for All People Worldwide

We do this by providing:

• Fair, valid and reliable assessments

• Education research

• Products and services that measure knowledge and skills, promote learning and educational performance, and support education and professional development

Two Families of English Assessments: TOEFL® & TOEIC®

TOEFL iBT       TOEIC L&R
TOEFL ITP       TOEIC S&W
TOEFL Junior    TOEIC Bridge
• Coming soon….

The Origins of ETS Work with Young People

• English proficiency is an increasingly important skill for students and young adults worldwide

- Expanding access to educational, personal and professional opportunities

• EFL instruction is beginning at earlier ages
• English-medium instructional environments take many forms internationally:

- Public and private schools in English-dominant countries
- International schools in non-English-dominant countries
- Schools in any country using bilingual or CLIL approaches
- Vocational schools

• Responds to aspirations of students as they attain English-language proficiency

How ETS Develops Quality Tests

Overview

• Before discussing how ETS develops quality tests, I will discuss what we mean by “quality” in testing.
• Then I will discuss the major steps in test development that are required to create a high quality test.

What Is a Quality Test?

A quality test must be

• Reliable
• Valid
• Fair
• Practical

Reliable

• A test is only a sample.
• The items are a sample of all the items that could be asked.
• The time of testing is a sample of all the times that the test could be given.
• The person scoring the essay is a sample of all possible scorers.

Reliability Is Consistency

If a test taker’s knowledge is constant, how consistent would scores be if the samples changed?
• Parallel items were used?
• The test was taken on a different day?
• Different judges were used for scoring essays?

The higher the reliability, the more consistent the scores will be.

Factors That Determine Reliability

All other things being equal,
• the more independently scored items, the higher the reliability
• the more the items correlate with each other, the higher the reliability
• the greater the variability of scores, the higher the reliability
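
As a rough illustration of the first two factors, the sketch below computes Cronbach's alpha (one common internal-consistency estimate of reliability) and the Spearman-Brown projection for a lengthened test. The slides do not name a specific reliability coefficient, so both formulas, the function names and the tiny 0/1 data sets are assumptions chosen only for illustration.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Estimate reliability from a score matrix.

    scores: one row per test taker, one column per independently scored item.
    """
    k = len(scores[0])  # number of items
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def spearman_brown(reliability, length_factor):
    """Project reliability when the test is made length_factor times as long
    by adding comparable items."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Made-up data: two items that agree perfectly vs. two unrelated items.
correlated = [[0, 0], [1, 1]]
unrelated = [[0, 0], [0, 1], [1, 0], [1, 1]]

alpha_correlated = cronbach_alpha(correlated)  # items correlate: high alpha
alpha_unrelated = cronbach_alpha(unrelated)    # items do not correlate: low alpha
doubled = spearman_brown(0.5, 2)               # more items: higher reliability
```

With these toy inputs, the perfectly correlated items give an alpha of 1, the unrelated items give 0, and doubling a test whose reliability is 0.5 projects to about 0.67.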

Validity

• Most important indicator of test quality
• Extent to which inferences based on test scores are appropriate & supported by evidence
• Requires evidence to support the use of the test for the intended purpose

Evidence of Validity

• Qualifications of test designers
• Process used to develop test
• Qualifications of item writers and reviewers
• Statistical indicators of item quality and fairness
• Expert judgments of test content

Evidence of Validity

• Match of items to content standards
• Relations among parts of the test
• Relations of scores with other variables
• Results fit with theories
• Claims for use of test are met
• Good consequences

Fairness = Validity for All

• Fairness is an aspect of validity.
• Tests that show valid differences across groups are fair.
• Tests that cause invalid differences across groups are not fair.

Practicality

• Tests must be affordable in dollar costs and in time used.
• Scores must be understandable & helpful to score-users.
• Items must be acceptable to diverse constituencies.
• Every test is a compromise among competing demands.

Major Steps in Test Development

1) Make Initial Plan for Test

2) Involve External Experts

3) Write/Review Items

4) Pretest Items (Whenever Possible)

5) Review Data & Revise Items

6) Assemble Final Test

Major Steps (continued)

7) Administer Tests

8) Checks Before Scoring

9) Scaling & Equating

10) Test Analyses

11) Report Scores

12) Begin Planning for Next Form

1) Plan Test

• Purpose: What is the test used for? What decisions are made on the basis of the scores?

• Population: What are the characteristics of the test takers?

• Construct: Content & skills

Plan Test

What constraints on test design? Time, cost, format, scoring, etc.

Initial plan for test development work: major tasks, schedule, staff

Evidence-Centered Design
- What claims about test takers?
- What evidence supports claims?
- What tasks provide evidence?

2) Involve External Experts

• Diverse (demographic, geographic, institutional, point of view) external contributors required in test design, item writing and reviewing.
• Diverse experts help establish acceptability, validity and fairness.

Tasks of External Experts

• Set/approve test specifications
- What content to measure?
- What skills to measure?
- What statistical properties?

• Write and review test items
• Select items for final form

3) Write/Review items

• Make item-writing assignments
- Write items to meet specifications
- Write overage for attrition

• Internal & external reviews & revisions
- At least 2 independent content reviews per item
- Separate editorial review
- Separate fairness review

3) Write/Review items

[Diagram: item review workflow. Stages shown: Question (Item) Author; Artwork/graphics; Content Reviewer 1; Content Reviewer 2; Content Reviewer 3; Edit; Fairness; Resolver; Studio recording; Lock]

4) Pretest

• When possible, try out items before operational use. Gives information to:
- Identify problem items (ambiguous, wrong difficulty, poor discrimination. For MC: no key, multiple keys, bad distracter)
- Pick most appropriate items to meet specifications
- Estimate final form characteristics from item data
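
A minimal sketch of the kind of item statistics a pretest can yield: item difficulty as the proportion of test takers answering correctly, and item discrimination as the correlation between the item score and the score on the rest of the test. The slides do not specify which statistics or cut-offs ETS uses; the function names and the thresholds in `flag_item` are hypothetical and only illustrate the idea.

```python
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Proportion of test takers who answered the item correctly (0/1 scores)."""
    return mean(item_scores)


def item_discrimination(item_scores, rest_scores):
    """Correlation between the 0/1 item score and the score on the other items."""
    mx, my = mean(item_scores), mean(rest_scores)
    sx, sy = pstdev(item_scores), pstdev(rest_scores)
    if sx == 0 or sy == 0:
        return 0.0  # no variation, so the item cannot separate test takers
    cov = mean([(x - mx) * (y - my) for x, y in zip(item_scores, rest_scores)])
    return cov / (sx * sy)


def flag_item(item_scores, rest_scores, min_p=0.2, max_p=0.9, min_r=0.2):
    """Return a list of problems found. Thresholds are illustrative assumptions."""
    problems = []
    p = item_difficulty(item_scores)
    if p < min_p:
        problems.append("too hard")
    if p > max_p:
        problems.append("too easy")
    if item_discrimination(item_scores, rest_scores) < min_r:
        problems.append("poor discrimination")
    return problems


# Made-up pretest data for four test takers.
rest = [10, 8, 4, 2]
good_item = [1, 1, 0, 0]     # stronger test takers get it right
trivial_item = [1, 1, 1, 1]  # everyone gets it right

good_flags = flag_item(good_item, rest)
trivial_flags = flag_item(trivial_item, rest)
```

Here `good_flags` comes back empty, while `trivial_flags` reports the item as too easy and poorly discriminating.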

Use Differential Item Functioning (DIF)

• DIF = statistical measure of how matched people in different groups perform on an item.

• DIF helps spot items that may be unfair.

• DIF is NOT proof of bias.

Uses of DIF

• If data available, tests assembled with low DIF items.
• If no data at assembly, DIF calculated after administration.
• High DIF items reviewed and removed before test is scored, if judged unfair.
• External people involved in reviews.
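
The slides do not say which DIF statistic is computed. One widely used approach is the Mantel-Haenszel procedure, which compares a reference group and a focal group within each matched score level. The sketch below assumes that approach; the function names and the counts are made up for illustration. The common odds ratio is converted to the delta-scale index often written as MH D-DIF = -2.35 ln(alpha), where values near zero suggest little DIF and negative values suggest the item is relatively harder for the focal group.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score levels.

    strata: one tuple per matched score level,
            (ref_right, ref_wrong, focal_right, focal_wrong).
    Raises ZeroDivisionError if no stratum has both reference-wrong
    and focal-right responses.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score levels
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    return numerator / denominator


def mh_d_dif(strata):
    """Mantel-Haenszel DIF on the delta scale: -2.35 * ln(common odds ratio)."""
    return -2.35 * math.log(mantel_haenszel_odds_ratio(strata))


# Made-up counts: matched groups perform identically at each score level.
no_dif = [(30, 10, 15, 5), (20, 20, 10, 10)]
# Made-up counts: matched reference-group test takers do better on the item.
favors_reference = [(30, 10, 10, 10)]

d_none = mh_d_dif(no_dif)
d_flagged = mh_d_dif(favors_reference)
```

As the slides stress, a large value only flags an item for review; it is not proof of bias.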

5) Review Data & Revise Items

• Review test items based on data
- Ensure accuracy, clarity
- Appropriate difficulty
- Acceptable discrimination

• Revise or drop problem items
• Write new items if necessary to meet specifications

6) Assemble Final Test

• Choose set of items from pool according to specifications
• Perform test reviews
- Meet content, skill, & statistical specifications
- Check for overlap, cueing of keys
- Correctness of keys

7) Test Administration

• Print or format for computer
• Quality control checks
• Ship securely
• Administer test
- Acceptable conditions (space, comfort, light, temperature)
- Security (copying, impersonation, prior knowledge)

8) Checks Before Scoring

• Investigate complaints & reports
• Preliminary Item Analysis (PIA)
- Identify “problem” items based on statistics (too hard, too easy, poor discrimination, change from pretest)
- Review items to decide whether to keep them in the test or drop them before scoring

• DIF, if not done previously

Checks Before Scoring

• Check for anomalies (sudden drops or increases in scores) that may indicate problems

9) Scaling & Equating

• Raw scores are number right or percent right on a particular test form.
• 50% right on a hard test form may take more knowledge & skill than 60% right on an easy test form.
• Raw scores mean different things on different test forms.
• ETS very rarely reports raw scores

Scaling & Equating

• Scaling sets the arbitrary range of numbers used to report scores, e.g., 200-800 for SAT, 150-190 for PPST.
• Equating is a statistical adjustment for differences in the difficulty of different forms of the same test.

• Equating allows us to treat the scores on different forms of a test as though they meant the same thing.

Scaling & Equating

• If a form happens to be a little harder than the others, it will take fewer raw score points to reach a particular scale score point.
• If a form happens to be a little easier than the others, it will take more raw score points to reach a particular scale score point.
• Scaled scores, after equating, mean the same on each form
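
To make the adjustment concrete, here is a minimal sketch of linear (mean-sigma) equating, which places a raw score from a new form onto the raw-score metric of a reference form by matching means and standard deviations. The slides do not describe ETS's actual equating design or method; the equivalent-groups assumption, the function name and all score values below are assumptions made for illustration.

```python
from statistics import mean, pstdev


def linear_equate(raw, new_form_scores, ref_form_scores):
    """Map a raw score on the new form to the reference form's raw-score metric.

    Assumes the two forms were taken by equivalent groups, so that differences
    in the score distributions reflect form difficulty rather than ability.
    """
    new_mean, new_sd = mean(new_form_scores), pstdev(new_form_scores)
    ref_mean, ref_sd = mean(ref_form_scores), pstdev(ref_form_scores)
    return ref_mean + (ref_sd / new_sd) * (raw - new_mean)


# Made-up raw scores: the new form is harder, so the same kind of group
# averages 5 raw points lower on it than on the reference form.
harder_form = [20, 25, 30, 35, 40]
reference_form = [25, 30, 35, 40, 45]

equated = linear_equate(30, harder_form, reference_form)
```

In this toy case a raw score of 30 on the harder form is treated as equivalent to 35 on the reference form, so fewer raw points on the harder form reach the same reported score.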

10) Test Analyses

• Analysis of final form characteristics.
• Distribution of item difficulty & discrimination
• Reliability
• Speededness

• Did test meet content & statistical specifications? If not, where were problems?

11) Report Scores

• Explain what scores mean so scores are understandable to test users
• Indicate Standard Error of Measurement on score report
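
For readers unfamiliar with the Standard Error of Measurement, the sketch below uses the classical test theory formula SEM = SD × sqrt(1 − reliability) and builds a band around a reported score. The standard deviation, reliability and score used here are made-up values, not figures from any ETS test, and the function names are hypothetical.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def score_band(score, sem, n_sems=1):
    """Range of scores within n_sems standard errors of the reported score."""
    return (score - n_sems * sem, score + n_sems * sem)


# Made-up values: scale-score SD of 10 and reliability of 0.91.
sem = standard_error_of_measurement(10, 0.91)
band = score_band(250, round(sem))
```

With these inputs the SEM is about 3 scale points, so a reported 250 would be read as roughly 247 to 253. Higher reliability shrinks the band.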

12) Plan Next Form

• What was learned from this administration to make the next administration of the test better?
• What has to change for next form?

About TOEFL® Junior™

A TOEFL® product for a Younger Generation

• A distinct product within the growing TOEFL® family of products
• A natural extension of the TOEFL brand, but specifically geared to the language learning needs of middle grade students
- Informed by reviews of research and relevant standards
- Based on years of experience developing international assessments of English language proficiency for both adults and K12 students
• Meets ETS Standards for Quality and Fairness
• Builds upon ETS’s expertise in English language assessment for young learners
• TOEFL® products set the standard for English proficiency worldwide

The Paper-Based Test is designed to provide useful Information

• Purpose is to assess the degree to which students aged 11-15 have attained language proficiency representative of middle school English-medium instruction

TOEFL Junior Structure

Format: Paper

Three Sections:
- Listening
- Reading
- Language Form and Meaning

TOEFL Junior Structure

• Listening Comprehension:
- This section tests how well students understand spoken English.
- Number of Questions: 42
- Section administered by CD. Students are asked to answer questions based on a variety of statements, questions, conversations and talks recorded in English.
- Total time: approximately 35–40 minutes.

• Question Types
- Classroom Instruction
- Short Conversations
- Academic Listening

Sample Listening Item

(Narrator): Listen to a high school principal talking to the school’s students.
(Man): I have a very special announcement to make. This year, not just one, but three of our students will be receiving national awards for their academic achievements. Krista Conner, Martin Chan, and Shriya Patel have all been chosen for their hard work and consistently high marks. It is very unusual for one school to have so many students receive this award in a single year.
(Narrator): What is the subject of the announcement?

What is the subject of the announcement?
(A) The school will be adding new classes.
(B) Three new teachers will be working at the school.
(C) Some students have received an award.
(D) The school is getting its own newspaper.

TOEFL Junior PBT Structure

• Reading Comprehension:
- This section tests how well students read and comprehend written English. Students read a variety of materials.
- Number of Questions: 42 questions.
- Total time: 50 minutes.

• Question Types
- Non-academic
- Academic

Sample Reading Item

Questions are about the following announcement.

Student Volunteers Needed!

On Saturday, December 12th, from 10 A.M. until 4 P.M., Carverton Middle School will be holding a music festival in the school gymnasium. The special event will feature a variety of professional musicians and singers. We are looking for Carverton students to help with the jobs listed below. Interested students should speak with Ms. Braxton, the music teacher. Students who would like to help at the festival must have written permission from a parent or guardian.

Task               Time               Date
Make posters       1 P.M.–4 P.M.      December 5th
Set up gym         11 A.M.–4 P.M.     December 11th
Help performers    9 A.M.–4 P.M.      December 12th
Welcome guests     10 A.M.–2 P.M.     December 12th
Clean up gym       4 P.M.–7 P.M.      December 12th

What time will the festival begin?
(A) 10 A.M.
(B) 11 A.M.
(C) 1 P.M.
(D) 2 P.M.

TOEFL Junior PBT Structure

• Language Form and Meaning:
– This section assesses key language skills such as grammar and vocabulary in context.
– The section includes 42 questions.
– Total time: approximately 25 minutes.

• Question Types:
– Language Meaning
– Language Form

Sample Language Form and Meaning Item

Questions - refer to the following e-mail.

Hi, Linda!

Thanks for your last e-mail! I know you like art, just like I do, so I wanted ______ you about the special trip my class went on last week. We took

(A) tell
(B) told
(C) to tell
(D) telling

Score Report

• Section scores for Listening, Language Form and Meaning, and Reading

Section                      Scale Scores
Listening Comprehension      200-300
Language Form & Meaning      200-300
Reading Comprehension        200-300
Total Score                  600-900

• The TOEFL Junior score report provides a description of the English-language abilities typical of test takers scoring around a particular scaled score level. There are four possible descriptions for each section of the test.
• Link to the Common European Framework of Reference
• Lexile measure

Listening Descriptions

• Test takers who score between 210 and 245 may have the following strengths:

They can understand the main idea of a brief classroom announcement if it is explicitly stated.

They can understand important details that are explicitly stated and reinforced in short talks and conversations.

They can understand direct paraphrases of spoken information when the language is simple and the context is clear.

They can understand a speaker’s purpose in a short talk when the language is simple and the context is clear.

Common European Framework of Reference for Languages (CEFR)

Sections                     CEFR Level A2    CEFR Level B1    CEFR Level B2
Listening Comprehension      210–245          250–275          280–300
Language Form & Meaning      210–245          250–275          280–300
Reading Comprehension        210–245          250–275          280–300

Important Note: CEFR levels are context-dependent. A B2 for middle school is not the same as a B2 for adults.
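
The table above can be read as a simple lookup from a section scale score to a CEFR level. The sketch below encodes only the bands shown on this slide; the function name is hypothetical, and scores that fall outside the listed bands (for example 200–205) return `None` because the slide assigns them no level.

```python
# Score bands copied from the CEFR table above; the same bands apply to
# Listening Comprehension, Language Form & Meaning, and Reading Comprehension.
CEFR_BANDS = [
    (280, 300, "B2"),
    (250, 275, "B1"),
    (210, 245, "A2"),
]


def cefr_level(section_score):
    """Return the CEFR level for a TOEFL Junior section score, or None
    if the score is not inside any band listed on the slide."""
    if not 200 <= section_score <= 300:
        raise ValueError("section scores are reported on a 200-300 scale")
    for low, high, level in CEFR_BANDS:
        if low <= section_score <= high:
            return level
    return None


levels = [cefr_level(score) for score in (230, 260, 300)]
```

As the note on the slide says, these levels are context-dependent, so the lookup describes middle school test takers only.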

Appropriate Use of the TOEFL® Junior Test

• Appropriate for low- to medium-stakes decisions
• Provides a general standard to measure proficiency levels of students aged 11-15 representative of English-medium instructional environments
• Serves as one piece of information supporting placement into programs designed to increase proficiency levels of these EFL students
• Provides information about student progress in developing English language proficiency over time

The TOEFL® Junior Test is NOT…

• …based on any specific curriculum
• …directly linked to TOEFL iBT scores
• …intended to predict performance on the TOEFL iBT test
• …for use to support high-stakes decisions such as for admissions purposes or criterion-based exit testing
• …a substitute for TOEFL iBT, TOEFL PBT or TOEFL ITP

Participating Countries

Latin America: Brazil, Chile

Asia: China, Indonesia, Japan, Korea, Vietnam

Europe: Bulgaria, France, Greece, Italy, Poland, Turkey

Middle East: Egypt, Gaza/West Bank, Lebanon, Morocco

The TOEIC Bridge™ Test

What is the TOEIC Bridge™ Test?

• A test to measure the emerging competencies of beginning learners of English

• A tool to help language learners focus on areas for improvement

Why use the TOEIC Bridge™ Test?

• To measure beginner English proficiency
• To motivate English Language Learners
• To set language learning goals

How is the TOEIC Bridge™ Test different from the TOEIC® Listening and Reading Test?

• The TOEIC Bridge™ test takes only one hour. The TOEIC® test takes two hours.
• There are 100 questions in the TOEIC Bridge™ test, 200 in the TOEIC® test.
• The TOEIC Bridge™ has only five parts, the TOEIC® test has seven parts.
• There is more time between questions in the TOEIC Bridge™ test.
• In the TOEIC Bridge™ test, the speakers speak more slowly.

Differences (Continued)

• TOEIC Bridge™ test questions are easier.
• TOEIC Bridge™ test questions cover more general topics.
• The scaled score range on the TOEIC Bridge™ is from 20 to 180; on the TOEIC® test, scores are on a scale of 10 to 990.
• The TOEIC Bridge™ test is a low-stakes test; the TOEIC® test is a high-stakes test.

Test Format

• Two sections:

• Section I: Listening Comprehension – Candidates listen to a variety of statements, questions, short conversations, and short talks, and answer 50 questions (tape mediated).

Three Parts:
- Photo-based (15 questions)
- Question-Answer (20 questions)
- Conversations and Short Talks

• Section II: Reading Comprehension – Candidates read single sentences as well as texts and answer 50 comprehension questions.

Two Parts:
- Incomplete sentences (30 questions)
- Reading Comprehension (20 questions)

TOEIC Bridge™ Content Areas

• Animals
• Basic objects
• Clothing
• Dates/days/time
• Entertainment
• Family members
• Food/dining out
• Games
• Health
• Housing/residence
• Measurement
• Money
• Months
• Music
• Numbers
• Recreation/hobbies
• School subjects
• Shopping
• Sports
• Travel/transportation
• Weather
• Work

Scoring

• Total scores range from 20–180
• Listening and Reading subscores range from 10–90
• Test administration time is approximately 1.5 hours
• Test scoring – (under operational conditions) 24-48 hours in most locations

• The scores are based on the number of correct responses.
• The correct responses in each section (Listening and Reading) are converted to a score scale. The range of the scale is from 10–90 for each section.
• Summing the scores of the sections produces a total scaled score. The range of the total score is then 20–180.
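
The two-step scoring described above (convert each section's number correct to the 10–90 scale, then sum the sections) can be sketched as follows. The actual raw-to-scale conversion tables are not given in the slides, so the conversion is passed in as a lookup table; `TOY_TABLE` contains made-up entries purely for illustration and is not an ETS conversion table. The function names are hypothetical.

```python
def section_scaled_score(raw_correct, conversion_table):
    """Look up the scaled score for a section's number of correct responses.

    conversion_table: dict mapping number correct to a scaled score (10-90).
    """
    scaled = conversion_table[raw_correct]
    if not 10 <= scaled <= 90:
        raise ValueError("section scaled scores must be between 10 and 90")
    return scaled


def total_scaled_score(listening_scaled, reading_scaled):
    """Sum the two section scaled scores to get the 20-180 total."""
    for scaled in (listening_scaled, reading_scaled):
        if not 10 <= scaled <= 90:
            raise ValueError("section scaled scores must be between 10 and 90")
    return listening_scaled + reading_scaled


# Hypothetical, incomplete conversion table (made-up values).
TOY_TABLE = {0: 10, 25: 50, 50: 90}

listening = section_scaled_score(25, TOY_TABLE)
reading = section_scaled_score(50, TOY_TABLE)
total = total_scaled_score(listening, reading)
```

Because each section is bounded by 10 and 90, the total is necessarily bounded by 20 and 180, matching the range stated on the slide.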

CEFR Ratings

• The TOEIC Bridge test ranges from the A1 level to the B1 level.

For Sample Test Questions for TOEFL Junior and TOEIC Bridge:

http://www.ets.org/toefl_junior
http://www.ets.org/toeicbridge

Thank you.