
On-Line Student Assessment

Richard Hill

Center for Assessment

Nov. 5, 2001

Speaking Points

Current paper-and-pencil-based assessments

Image Scoring
Computer Administration
Computer Scoring

Typical Current Paper-and-Pencil Based Statewide Assessment

3 grades
Reading, writing, math, science, social studies
30 multiple-choice (MC) and 6 open-ended (OE) questions in each of four areas, one essay for writing
50,000 students per grade

Materials Processed

150,000 28-page test booklets
2 million sheets of paper
10 tons of paper; a stack 700 feet high
150,000 20-page answer documents
1.5 million sheets of special paper
7.5 tons
600 boxes to store (per year)
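
The paper totals above are easy to verify. A minimal back-of-the-envelope check follows; it assumes double-sided printing (two pages per sheet) and roughly 0.004 inches per sheet, neither of which is stated in the talk.

```python
# Back-of-the-envelope check of the "Materials Processed" figures.
# Assumptions (not from the talk): 2 printed pages per sheet, ~0.004 in per sheet.

students_per_grade = 50_000
grades = 3
booklets = students_per_grade * grades            # 150,000 booklets/documents

test_sheets = booklets * 28 / 2                   # ~2.1 million sheets ("2 million")
answer_sheets = booklets * 20 / 2                 # 1.5 million sheets
stack_feet = test_sheets * 0.004 / 12             # ~700 feet of stacked paper

print(f"test booklet sheets:    {test_sheets:,.0f}")
print(f"answer document sheets: {answer_sheets:,.0f}")
print(f"test booklet stack:     {stack_feet:,.0f} ft")
```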

Process

Materials shipped to schools
Materials shipped back to contractor
Materials logged in
Count everything, resolve discrepancies
Note that one misplaced school can stop entire process

Process for Receiving Materials

Separate answer booklets from test booklets
Test booklets placed in temporary storage in original boxes, then destroyed after reporting is complete
Answer sheets guillotined
MC answer sheets scanned
OE sheets packaged for scoring

Processing of OE Sheets

Separate by content area
Sorted by form, randomized across schools
Scanned to capture ID numbers
Scoring headers prepared, then merged with answer sheets

Scoring

Hire, train, qualify
Score
On-going evaluation of quality of scoring
Determine papers that need adjudication, then rescore as necessary
Scan scoring headers
Merge MC, OE and writing scores

Scoring Time

20 seconds per OE question
5 minutes per essay (2 scorings plus adjudication, if necessary)
13 minutes per student
32,500 hours
1,000 person-weeks, plus training, qualifying, quality control and equating
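
Those totals follow from the per-response times. The sketch below redoes the arithmetic, assuming 6 OE questions in each of the four non-writing areas, which is what makes the 13 minutes per student come out.

```python
# Re-deriving the hand-scoring workload from the per-response times above.
# Assumption (consistent with the 13-minute figure): 6 OE questions x 4 areas.

oe_questions = 6 * 4                               # 24 OE responses per student
oe_minutes = oe_questions * 20 / 60                # 8 minutes of OE scoring
essay_minutes = 5                                  # essay: 2 readings + adjudication
minutes_per_student = oe_minutes + essay_minutes   # 13 minutes

students = 50_000 * 3                              # 150,000 students
total_hours = students * minutes_per_student / 60  # 32,500 hours

print(minutes_per_student, total_hours)
# At roughly 32-33 effective scoring hours per scorer-week, 32,500 hours is on
# the order of 1,000 person-weeks, before training, qualifying, QC and equating.
```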

Equating to Previous Year

MC
OE
Difficulty of items
Changes in scoring

Count, Count, Count

Initial log-in counts
After packaging
Every time a box is opened or closed
Count boxes, too

Final Steps

Ship reports back to schools
Resolve problems
Missing or misplaced students
Challenges to scoring (requires finding answer sheets, perhaps all for one student)
Destroy test materials
Long-term storage for answer documents

Solution # 1—Image Scoring

High-speed scanners capture images of documents

All processing is done on CRTs by looking at electronic image of original paper

Advantages

Control Scoring
Blind read-behinds
Real-time tracking of accuracy of every scorer
Multiple sites
Equating
Blind rescores from previous year

Advantages (cont’d)

Scoring speed
Next response is ready to be scored when first is done
Scoring stops when rates decline
No fumbling for papers
Up to 1/3 faster

Advantages (cont’d)

Tracking
No need for counting
Nothing is lost
Nothing is damaged
Records automatically linked
Special-request papers easy to obtain
Prep for next year's scoring
Challenged papers
Adjudication

Advantages (cont’d)

Reporting—Send sample of work home to parents

Storage
Permanent
Compact

Disadvantages

Hardware and software costs
Costs have dropped dramatically (a $150,000 server from two years ago now sells for $16,000)
Need to prove that scoring is the same
Writing vs. OE
Connectivity
Power outages

Computer Administered Tests

Web-based vs. CD
Comparability
Standards, especially writing
Students that write on paper and then just type it in
Full use of computer capabilities
Underestimation of (some) students' abilities

Georgia’s Proposed System

Huge item bank, three levels
Teachers can create tests
Capacity concerns for Level III tests

Advantages

Elimination of paper
Accommodations
Adaptive testing (see the sketch after this list)
Shorter tests
Diagnostic tests
Lower frustration levels

Real-time scoring
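
The shorter-test claim is easiest to see from how adaptive selection works: each item is chosen to match the current ability estimate. The sketch below is a generic illustration (a simple staircase update with an invented item bank), not Georgia's proposed system.

```python
import math
import random

# Generic sketch of adaptive item selection: pick the unused item closest in
# difficulty to the current ability estimate, then nudge the estimate up or
# down depending on the response. Item bank and examinee are invented.

def next_item(bank, used, ability):
    return min((d for d in bank if d not in used), key=lambda d: abs(d - ability))

def adaptive_test(answer, bank, n_items=8):
    ability, used, step = 0.0, set(), 1.0
    for _ in range(n_items):
        item = next_item(bank, used, ability)
        used.add(item)
        # Staircase update: move toward harder items after a correct answer,
        # toward easier items after a miss, with a shrinking step size.
        ability += step if answer(item) else -step
        step *= 0.7
    return ability

# Demo: simulate an examinee whose true ability is 1.2 on the difficulty scale.
TRUE_ABILITY = 1.2
def simulated_answer(difficulty):
    p = 1.0 / (1.0 + math.exp(-(TRUE_ABILITY - difficulty)))
    return random.random() < p

bank = [d / 4 for d in range(-12, 13)]   # difficulties from -3.0 to +3.0
print(round(adaptive_test(simulated_answer, bank), 2))
```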

Issues

Administration time
All schools have some computers, but how many?
Transition
Recommendation is to test all schools the same way
Comparability
Logistics of operating two programs at same time

Computer Scoring

Major vendors
NCME Session N1, April 12, 2001
ETS Technologies: E-rater (Princeton, NJ)
Vantage Learning: Intellimetric (Yardley, PA)
TruJudge: Project Essay Grade (PEG) (Purdue)
Knowledge Analysis Technologies: Intelligent Essay Assessor (Boulder, CO)

Advantages

Time
Cost
Objective (or at least impersonal)

Issues

Accuracy rates
PA study: computers vs. humans
Computer more accurate than one human
Computer less accurate than two humans
Bias vs. random error (illustrated after this list)

Beating the system (“Stakes changes everything”)

Capacity of contractors to deliver logistics
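
The bias-versus-random-error distinction can be shown with a toy simulation (all numbers invented, not results from the PA study): a noisy but unbiased scorer agrees with the true score less often on any one paper, yet its errors wash out in group averages, while a consistent but biased scorer agrees more often and still shifts every aggregate.

```python
import random
import statistics

# Toy illustration of bias vs. random error in essay scoring (invented data).

random.seed(1)
true_scores = [random.randint(0, 4) for _ in range(10_000)]

# Noisy but unbiased: off by one point in either direction 40% of the time.
noisy = [t + random.choice([-1, 0, 0, 0, 1]) for t in true_scores]

# Consistent but biased: never off at random, but drifts one point high 30% of the time.
biased = [t + (1 if random.random() < 0.3 else 0) for t in true_scores]

for name, scores in (("noisy", noisy), ("biased", biased)):
    exact = sum(s == t for s, t in zip(scores, true_scores)) / len(true_scores)
    mean_err = statistics.mean(s - t for s, t in zip(scores, true_scores))
    print(f"{name:6s}  exact agreement {exact:.2f}   mean error {mean_err:+.2f}")
```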

Alternate Testing Modes

Listening
Special education adaptations (see Tindel)
Virtual reality