MSP Evaluation Rubric and Working Definitions
Xiaodong Zhang, PhD, Westat
Annual State Coordinators Meeting
Washington, DC, June 10-12, 2008
Relevant GPRA Measure
• GPRA measure—the percentage of MSP projects that use an experimental or quasi-experimental design for their evaluations that are conducted successfully and that yield scientifically valid results
• Westat’s Data Quality Initiative (DQI) developed a rubric to determine whether a grantee’s evaluation meets the GPRA measure
• The rubric is applied to each grantee’s final evaluation report
Evaluation Rubric
• The criteria on the rubric are the minimum criteria that must be met for an evaluation to be considered successfully conducted and to yield valid data
• An evaluation must meet every criterion in order to meet the GPRA measure
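The every-criterion requirement is a simple conjunction: one failed criterion fails the whole evaluation. A minimal sketch of that logic, using illustrative criterion names drawn from the rubric components (not Westat's actual field names):

```python
# Sketch of the rubric's pass/fail logic: an evaluation meets the
# GPRA measure only if it satisfies EVERY criterion (a conjunction).
# Criterion names are illustrative, not Westat's actual field names.

def meets_gpra_measure(criteria: dict[str, bool]) -> bool:
    """Return True only if every rubric criterion is met."""
    return all(criteria.values())

report = {
    "adequate_sample_size": True,
    "quality_instruments": True,
    "quality_data_collection": True,
    "acceptable_data_reduction": True,
    "relevant_statistics_reported": False,  # one failure ...
}
meets_gpra_measure(report)  # ... fails the whole evaluation
```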
Evaluation Components Covered in Rubric
For Experimental Designs:
• Sample size
• Quality of measurement instruments
• Quality of data collection methods
• Data reduction rates
• Relevant statistics reported
For Quasi-Experimental Designs:
All of the above, plus…
• Baseline equivalence of groups
Working Definitions
• DQI developed working definitions to help implement the rubric criteria
• Report eligibility: final evaluation report that contains post-test results on key outcomes
• Multicomponent evaluations: each component will be coded separately
– Teacher content knowledge
– Teacher instructional practice
– Student achievement
Working Definitions (Continued)
• Baseline equivalence
– Pretest on key outcomes is most relevant
– Other related variables are acceptable (e.g., student SES for student outcomes; education and experience for teacher outcomes)
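Baseline equivalence of the treatment and comparison groups is commonly checked by comparing pretest means via a standardized mean difference. A minimal sketch, assuming pretest scores are available per group; the 0.25 cutoff shown is a common convention, not a threshold stated in the slides:

```python
import math

def standardized_mean_difference(treatment, comparison):
    """Cohen's d-style effect size between two groups' pretest scores."""
    n1, n2 = len(treatment), len(comparison)
    m1 = sum(treatment) / n1
    m2 = sum(comparison) / n2
    v1 = sum((x - m1) ** 2 for x in treatment) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in comparison) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

treat = [72, 75, 71, 78, 74, 69]   # hypothetical pretest scores
comp = [73, 74, 70, 77, 75, 70]
d = standardized_mean_difference(treat, comp)
# Groups are often judged baseline-equivalent when |d| is small
# (e.g., below 0.25 -- a common convention, not stated in the slides).
abs(d) < 0.25
```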
Working Definitions (Continued)
• Minimum sample sizes: based on the final sample
– Balanced designs:
• Teacher outcomes – School/district level: N=12; Teacher level: N=60
• Student outcomes – School/district level: N=12; Classroom level: N=18; Student level: N=130
– Unbalanced designs: the smaller group must be at least the minimum sample size divided by 2
• Sample size recommendations are based on power analysis
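The minimums above can be encoded as a lookup keyed by outcome type and unit of analysis. This is a hypothetical sketch, and it interprets the unbalanced-design rule as the smaller group needing at least half the listed minimum:

```python
# Minimum final sample sizes from the slide, keyed by outcome type
# and unit of analysis. The unbalanced-design rule is interpreted here
# as: the smaller group must be at least the listed minimum divided by 2.
MINIMUMS = {
    ("teacher", "school_district"): 12,
    ("teacher", "teacher"): 60,
    ("student", "school_district"): 12,
    ("student", "classroom"): 18,
    ("student", "student"): 130,
}

def meets_minimum(outcome, unit, group_a, group_b):
    """Check the final analytic sample against the rubric minimums."""
    minimum = MINIMUMS[(outcome, unit)]
    if group_a == group_b:                        # balanced design
        return group_a + group_b >= minimum
    return min(group_a, group_b) >= minimum / 2   # unbalanced design

meets_minimum("teacher", "teacher", 30, 30)  # True: 60 total meets N=60
meets_minimum("student", "student", 50, 90)  # False: smaller group < 65
```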
Working Definitions (Continued)
• Quality of instruments
– Existing state accountability assessment or widely used assessment (e.g., Iowa Test)
– Selected items from a validated assessment: must include a minimum of 10 items, 70% of which must come from a validated and reliable instrument(s)
– Grantee-developed assessment: must demonstrate reliability and validity
– All instruments must have face validity
• Data reduction
– Flexibility is allowed if the study population is highly mobile or if potential differences were addressed in the analysis
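The selected-items criterion (at least 10 items, 70% drawn from a validated instrument) lends itself to a mechanical check. A minimal sketch; the representation of items as a list of source-instrument labels is illustrative:

```python
# Sketch of the "select items from a validated assessment" criterion:
# the assembled instrument needs at least 10 items, and at least 70%
# of them must come from a validated and reliable source instrument.
# Representing items by source-instrument label is illustrative.

def selected_items_acceptable(item_sources, validated_sources):
    """item_sources: source instrument of each item.
    validated_sources: sources known to be validated and reliable."""
    if len(item_sources) < 10:
        return False
    n_validated = sum(src in validated_sources for src in item_sources)
    return n_validated / len(item_sources) >= 0.70

items = ["ITBS"] * 8 + ["local"] * 4           # 12 items, 8 validated
selected_items_acceptable(items, {"ITBS"})     # False: 8/12 is under 70%
```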
Conclusions
• Written guidance is forthcoming
• Q&A at next breakout session: Analysis of Final Reports