Review of AERA/APA/NCME Test Standards Revision

Barbara S. Plake
University of Nebraska-Lincoln
Co-Chair, Committee for Revision of Test Standards

Joint Committee Members

Lauress Wise, Co-Chair
Barbara Plake, Co-Chair
Linda Cook, ETS
Fritz Drasgow, University of Illinois
Brian Gong, NCIEA
Laura Hamilton, RAND Corporation
Jo-Ida Hansen, University of MN
Joan Herman, UCLA
Michael Kane, Bar Examiners

Joint Committee Members

Michael Kolen, University of Iowa
Antonio Puente, UNC-Wilmington
Paul Sackett, University of MN
Nancy Tippins, Valtera Corporation
Walter (Denny) Way, Pearson
Frank Worrell, Univ of CA-Berkeley

Scope of Revision

Based on comments each organization received from invitation to comment

Summarized by the Management Committee in consultation with the Co-Chairs
  Wayne Camara, Chair, APA
  Suzanne Lane, AERA
  David Frisbie, NCME

Four Substantive Areas for Revisions

Technology
Accountability
Workplace
Access

Plus attention to format issues

Theme Teams

Working teams
Cross-team collaborations
Chapter Leaders
Focusing on bringing into chapters content related to themes in coherent and meaningful ways

Presentation: Four Substantive Areas

Access – Linda Cook
Accountability – Brian Gong
Technology – Denny Way
Workplace – Laurie Wise

Format Issues

Organization of Chapters
Consideration of ways to identify “Priority Standards”
More parallelism between chapters
Tone
Complexity
Technical language

Timeline

First meeting January, 2009
Three year process for completing text of revision
Open comment/Organization reviews
Projected publication Summer, 2012

Revising our Test Standards: Access for All Examinee Populations

Presentation to the 2009 Annual Meeting of the American Educational Research Association
San Diego, CA

Linda Cook, ETS

Overview

Standards related to Access appear throughout many of the chapters but are concentrated in
  Chapter 9: Testing Individuals of Diverse Linguistic Backgrounds
  Chapter 10: Testing Individuals with Disabilities

Comments on Access were received by the management committee and summarized for the committee charge

Elements of the Charge

Five of the elements of the charge focused on accommodations/modifications

  Impact/differentiation of accommodation and modification
  Appropriateness for ELL and EWD
  Appropriateness for variety of groups, e.g., pre-K, older populations
  Flagging
  Comparability/validity

One element focused on adequacy and comparability of translations

One element focused on Universal Design

Key Access Issues Included in our Charge - 1

Impact/differentiation of accommodations/modifications

Appropriate ways to determine or establish the impact of accommodations/modifications on inferences, interpretations, uses of scores

How do you differentiate clearly between what is an accommodation and what is a modification?

Key Access Issues Included in our Charge - 2

Appropriateness of accommodations for English-language learners and examinees with disabilities

Selecting the appropriate accommodation for the individual
  Who should select the accommodation?
  What evidence should the selection be based on?

Administering the appropriate accommodation
  What evidence is available to determine impact on test scores, given purpose of the test?
  How effective is the accommodation?

Alternative assessments/modified achievement standards

Key Access Issues Included in our Charge - 3

Appropriateness of accommodations for a wider variety of groups
  Pre-K
  Older populations
    Number of older adults with cognitive impairments is rising
    Tested to determine mental status changes
    There are many complexities associated with testing this population
      Combined effects of medical problems, medication side effects, multiple sensory deficits, testing environment

Key Access Issues Included in our Charge - 4

Flagging
  Current treatment needs to be updated to reflect changes in practice since 1999 standards
  Most testing organizations no longer flag
  Decisions about flagging should be based on empirical evidence

Key Access Issues Included in our Charge - 5

Comparability and validity of inferences made based on scores from accommodated or modified tests

Foundational issues such as comparability and validity need to be addressed in foundational chapters

If sample sizes do not support analyses such as DIF, other evidence of validity should be pursued

Key Access Issues Included in our Charge - 6

Adequacy and comparability of translations (language to language and language to symbol, e.g., Braille)

Evidence needed to demonstrate adequacy of translation and comparability of scores from translated tests

Fluency, rather than primary language, should be used to describe the target population for a test

Quality of translation/adaptation needs to be emphasized

Interaction of language proficiency and construct needs to be considered

Key Access Issues Included in our Charge - 7

Universal Design
  1999 Standards focus too much on accommodations and modifications and not enough on building accessibility features into design and development process

Revising our Test Standards: Issues for Accountability

Presentation to the 2009 Annual Meeting of the American Educational Research Association
San Diego, CA

Brian Gong, Center for Assessment

Overview

There has been a dramatic expansion of the use of tests for various forms of accountability and other uses related to educational policy-setting.

The Joint Committee has been charged with considering how these uses in accountability should impact revisions to the Standards

As with the other themes, comments on the standards that related to accountability were compiled by the Management Committee and summarized in their charge to the Joint Committee

Overview

Standards related to accountability currently appear throughout; accountability also is especially relevant to Chapter 13 (Educational Testing and Assessment) and Chapter 15 (Testing in Program Evaluation and Public Policy)

Under No Child Left Behind, there has been a dramatic increase in the use of tests for accountability. In such cases, test results have important consequences for third parties such as school administrators and teachers, although not always for the examinees themselves.

Federal peer review procedures have required assurances of reliability and validity that often go beyond requirements of the current Test Standards. Attention to the overall technical quality of tests and score interpretation is required. High school tests are used as a graduation requirement and there have been questions about how the current Standards should be interpreted in these cases. In general, the validity and reliability of individual and aggregated scores used for accountability purposes need to be addressed.

Key Accountability Topics Included in our Charge

Validity and reliability requirements
Issues with scores, scaling, and equating
Policy and practice
Formative and interim assessments

1. Validity and Reliability Requirements

• Use of a single test as the sole source of high-stakes decisions (e.g., graduation, promotion), including whether scores from retesting or repeat testing are sufficient to count as using more than one score for such decisions.

• How test alignment studies should be documented and used to demonstrate the validity of score interpretations regarding mastery of required content standards.

1. Validity, Reliability, and Reporting Requirements - continued

Provide additional guidance on score accuracy, especially when used to classify individuals or groups into performance regions or other bands on a score scale.

Validity and reliability requirements for reporting individual or aggregate performance on subscales (skills or diagnostics) and for instructing users in appropriate interpretations of such scores or data (e.g., as they impact between or within student and school comparisons, validity considerations in subscore interpretation).

Incorporating error estimates and interpretive guidance in score reports, including subscores and diagnostic reporting for individuals and groups.
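
As one illustration of the error estimates mentioned above, a score report could show a band around each observed score based on the classical standard error of measurement, SEM = SD × sqrt(1 − reliability). The sketch below is a minimal example under that assumption. The function names, scale values, and cut score are hypothetical and are not drawn from the Standards or from the committee's charge.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if sd < 0:
        raise ValueError("sd must be non-negative")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, z=1.0):
    """Return (low, high): the observed score plus or minus z SEMs."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


def band_straddles_cut(observed, cut, sd, reliability, z=1.0):
    """True when the cut score falls inside the error band, which signals
    that the performance-level classification is uncertain."""
    low, high = score_band(observed, sd, reliability, z)
    return low <= cut <= high


if __name__ == "__main__":
    # Hypothetical scale: SD of 10, reliability of 0.91, cut score of 250.
    low, high = score_band(248, sd=10, reliability=0.91)
    print(f"Observed 248, band {low:.1f} to {high:.1f}")
    print("Near cut score:", band_straddles_cut(248, 250, 10, 0.91))
```

With these hypothetical values the band runs from roughly 245 to 251, so a report might flag a score of 248 as too close to a cut of 250 to classify with confidence. Other error models (for example, conditional standard errors that vary along the scale) would give different bands.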

2. Issues with Scores, Scaling, and Equating

• Growth modeling, gain scores, and other methods of estimating aggregated performance or growth based on individual or school/district performance and characteristics.

• Issues or requirements when linking assessments (e.g., concordances, linkages and equating)

3. Policy and Practice

How to balance privacy concerns for individual examinees, teachers, and administrators while meeting information needs for policy-makers.

Issues related to the appropriate role of practice and test preparation, especially in contrast to admissions testing or credentialing.

4. Addressing formative and interim assessments

Distinguishing among commercial formative and benchmark assessments (as well as item banks), their appropriate uses, and validation evidence required in interpreting scores from them.

Revising our Test Standards: Technological Advances

Presentation to the 2009 Annual Meeting of the American Educational Research Association
San Diego, CA

Denny Way, Pearson

Overview

Technological advances are changing the way tests are delivered, scored, interpreted and in some cases, the nature of the tests themselves

The Joint Committee has been charged with considering how technological advances should impact revisions to the Standards

As with the other themes, comments on the standards that related to technology were compiled by the Management Committee and summarized in their charge to the Joint Committee

Key Technology Issues Included in our Charge

Reliability & validity of innovative item formats

Validity issues associated with the use of:
  Automated scoring algorithms
  Automated score reports and interpretations

Security issues for tests delivered over the internet

Issues with web-accessible data, including data warehousing

Resources for Consideration

Guidelines for Computer-Based Testing, Copyright 2002 Association of Test Publishers (ATP)

International Guidelines on Computer-Based and Internet Delivered Testing, Copyright 2005 International Test Commission (ITC)

Reliability & Validity of Innovative Item Formats

What special issues exist for innovative items with respect to access and elimination of bias against particular groups? How might the standards reflect these issues?

What steps should the standards suggest with regard to “usability” of innovative items?

What issues will emerge over the next five years related to innovative items/test formats that need to be addressed by the standards?

Automated Scoring Algorithms

What level of documentation/disclosure is appropriate and tolerable for automated scoring developers/vendors?

What sorts of evidence seem most important for demonstrating the validity and “reliability” of automated scoring systems?

What issues will emerge over the next five years related to automated scoring systems that need to be addressed by the standards?

Automated Score Reports and Interpretation

Use of computers for score interpretation

“Actionable” reports (e.g., routing students and teachers to instructional materials and lesson plans based on test results)

Security issues for tests delivered over the internet

Two aspects of this topic are of concern: protecting privacy and threats to validity due to breach of security.

Protecting examinee privacy

Considerations likely to affect standards related to test administration and responsibilities of test users

Web-Accessible Data, including Data Warehousing

Applicability of general technology standards?
  Security
  Interoperability

Revision to commentary vs. drafting additional standards

Revising our Test Standards: Issues for Work-Place Testing

Presentation to the 2009 Annual Meeting of the American Educational Research Association
San Diego, CA

Laurie Wise, HumRRO

Overview

Standards for testing in the work place are currently covered in Chapter 14 (one of the testing application chapters)

Work-place testing includes employment testing as well as licensure, certification, and promotion testing.

Comments on standards related to work place testing were received by the Management Committee and summarized in their charge to the Joint Committee.

Key Work-Place Testing Issues Included in our Charge

1. Validity and reliability requirements for certification, licensure, and promotion tests.

2. Issues when tests are administered only to small populations of job incumbents.

3. Requirements for tests for new, innovative job positions that do not have incumbents or job history to provide validity evidence.

4. Assuring access to licensure, certification, and promotion tests for examinees with disabilities that may limit participation in regular testing sessions.

5. Differential requirements for certification and licensure and employment tests.

1. Validity and Reliability Requirements

Some specific issues:
  Documenting and communicating the validity and reliability of pass-fail decisions in addition to the underlying scores
  How cut-offs are determined
  How validity and reliability information is communicated to relevant stakeholders

2. Issues with Small Examinee Populations

Including:
  Alternatives to statistical tools for item screening
    Assuring fairness
    Assuring technical accuracy
  Alternatives to empirical validity evidence
  Maintaining comparability of scores from different test forms

3. Requirements for New Jobs

Issues include:
  Identifying test content
  Establishing passing scores
  Assessing reliability
  Demonstrating validity

4. Assuring Access to Employment Testing

See also separate presentation on fairness

Issues include:
  Determining appropriate versus inappropriate accommodations
  Relating testing accommodations to accommodations available in the work place

5. Certification and Licensure versus Employment Testing

Currently, two sections in the same chapter

Examples of relevant issues:
  Differences in how test content is identified and validated
  Differences in test score use
  Who oversees testing: private company versus professional board/organization