Using Student Assessment Data in Teacher Evaluations
Copyright © 2016 American Institutes for Research. All rights reserved.
Mariann Lemke October 2016
The mission of the Center on Great Teachers and Leaders (GTL Center) is to foster the capacity of vibrant networks of practitioners, researchers, innovators, and experts to build and sustain a seamless system of support for great teachers and leaders for every school in every state in the nation.
GTL Center Mission
• State Assessment Data and Teacher Evaluations: Who, What, Why?
• What We Know: Value-Added Measures (VAMs)
• What We Know: Student Learning Objectives (SLOs)
• Other Growth Measures
• Q&A and Discussion
Agenda
Policy scans show that states typically use state assessments in teacher evaluation systems to measure student growth through: • Student learning objectives (SLOs) or similar goal-setting methods • Statistical measures such as value-added models (VAMs) or growth models
State Assessment Data in Teacher Evaluations
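To make the "statistical measures" idea concrete: a hedged, toy sketch of the residual-gain intuition behind value-added models. This is not any state's actual model (real VAMs adjust for many more student and classroom factors, usually in multilevel regressions); the function name, data layout, and single-predictor fit are illustrative assumptions only.

```python
from statistics import mean

def value_added(records):
    """Toy residual-gain value-added sketch (illustrative only).

    records: list of (teacher, prior_score, current_score) tuples.
    Fits one least-squares line, current ~ prior, across all students,
    then averages each teacher's residuals (actual minus predicted).
    """
    priors = [p for _, p, _ in records]
    currents = [c for _, _, c in records]
    mp, mc = mean(priors), mean(currents)
    # Ordinary least-squares slope and intercept for current ~ prior.
    slope = (sum((p - mp) * (c - mc) for _, p, c in records)
             / sum((p - mp) ** 2 for p in priors))
    intercept = mc - slope * mp
    # Group each teacher's residuals, then average them.
    by_teacher = {}
    for teacher, p, c in records:
        by_teacher.setdefault(teacher, []).append(c - (intercept + slope * p))
    return {t: mean(res) for t, res in by_teacher.items()}
```

A positive score means a teacher's students scored above what their prior scores predicted; the sketch conveys only the "growth beyond prediction" idea, not an operational model.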
Student growth means the change in student achievement for an individual student between two or more points in time.
What you need:
• Data from two or more points in time
• Data that measure what students are supposed to have learned and teachers are supposed to have taught by that point in time
• An approach to connecting the data
Measuring Student Growth
[Figure: Two bar charts on a 0–500 scale comparing prior and current performance for Students A–E in Ms. Smith's class and in Ms. Jones's class, with a proficiency cut line.]
Why Student Growth?
SLO Template
Baseline Data
Student Population
Interval of Instruction
Standards and Content
Assessments
Growth Targets
Rationale for Growth Targets
Instructional Strategies
A student learning objective (SLO) is a measurable, long-term, academic goal informed by available data that a teacher or teacher team sets at the beginning of the year for all students or for subgroups of students.
Student Learning Objectives
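Once growth targets are set, checking them at the end of the interval of instruction is simple arithmetic. A minimal sketch of that check; the function name, dictionary keys, and per-student targets are hypothetical, not part of any state's SLO template:

```python
def slo_results(students):
    """Report which students met their individual SLO growth targets.

    students: list of dicts with hypothetical keys
    'name', 'baseline', 'final', and 'target_gain'.
    """
    met = [s["name"] for s in students
           if s["final"] - s["baseline"] >= s["target_gain"]]
    return {"met": met, "pct_met": len(met) / len(students)}
```

For example, a student moving from 40 to 62 against a target gain of 20 counts as having met the objective; real SLO scoring rubrics typically band the percentage met into rating categories.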
The Every Student Succeeds Act (ESSA) specifies that the U.S. Department of Education (ED) cannot require anything of states with regard to educator evaluation (i.e., as a condition of approval of, or waiver to, required state plans).
Regardless of federal policy changes, state laws and regulations related to educator effectiveness remain in effect.
ESSA and Educator Evaluation
Education Commission of the States Policy Scan (September 2016)
43 states required objective measures of student achievement to be included in teacher evaluations.
16 states included student achievement and growth as the “preponderant criterion” in teacher evaluations. These states include AK, CO, CT, DC, DE, GA, HI, KY, LA, MS, NC, NM, NY, OK, PA, and TN.
19 states included growth measures as a “significant criterion” in teacher evaluations. Eleven of those states (AZ, FL, ID, IL, MI, MN, NJ, NV, OH, RI, VA) explicitly define what “significant” means for the purposes of including student achievement in teacher evaluations. Eight states (AR, IN, KS, MD, ME, MO, OR, SD) do not provide these explicit guidelines.
Eight states required objective evidence of student learning in teacher evaluations (MA, ND, SC, UT, WA, WI, WV, WY).
Seven states required that schoolwide achievement data be used in individual teacher performance ratings, whereas 11 other states explicitly allowed the practice.
Current State Policy
Florida: • 50% of educator evaluation score
Louisiana: • 50% of educator evaluation score
Minnesota: • 35% of educator evaluation score
Ohio: • Under 2016 statute, the value-added score is optional
New Mexico: • 50% of educator evaluation score
North Carolina: • One of six standards
Tennessee: • One of several approved measures used in portfolio
States Currently Using Value-Added Data in Teacher Evaluations
N.J. Triples Weight of Tests in Teacher Evaluations (Education Week, September 13, 2016)
Mass. reexamining role of student test scores in teacher evaluations (Boston Globe, September 27, 2016)
But…a Changing Landscape….
Bias/validity is a concern: The potential exists, but studies have found no concrete evidence of it.
Precision/reliability: Variability in measures is well documented; precision can be increased by pooling data over time.
Relationship to other measures: Moderate relationships with other measures of teacher effectiveness.
Data needs: Need high-quality assessment and linkage data.
Technical Characteristics
Typically not used as an independent measure.
Effects are likely to vary depending on who uses the data, how they perceive the data, and the specifics of policies: Are measures used for accountability or for improvement, and what supports or consequences accompany them?
• Some evidence of teacher turnover, increased teacher performance, and increased student performance with the use of evaluation systems
• May be useful for assignment or for identifying coaches
Effects of the Use of Value-Added Measures
Mixed perceptions of the usefulness of SLOs; some evidence that these perceptions may improve over time, along with positive perceptions of the use of data.
Challenges relate to assessments (selection, design) and to accessing data, as well as to supports and communication.
Current studies suggest some relationship between SLO quality and student achievement, but results are not consistent across content areas or studies.
Implementation, Perceptions, and Relationship to Achievement
[Figure: A bar chart showing a student scoring 100 in Year 1 and 200 in Year 2.]
Measuring growth = subtraction
Scores must be on the same scale (could be a rubric); content must be aligned between time periods; need a reference point to interpret results.
Examples:
• Math grade 4 and math grade 5 on a vertically scaled assessment
• Spanish 3 pre- and post-test • Fitness pre- and post-test
Pre-Test/Post-Test or Simple Growth
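In code, the "growth = subtraction" idea is trivial once pre- and post-test scores share a scale. A minimal sketch, assuming scores are keyed by student (the function name and data layout are illustrative):

```python
def simple_growth(pre, post):
    """Simple growth: post-test minus pre-test, per student.

    Only meaningful when both tests report on the same scale and
    cover aligned content; students missing either score are skipped.
    """
    return {student: post[student] - pre[student]
            for student in pre if student in post}
```

For example, `simple_growth({"A": 200, "B": 300}, {"A": 260, "B": 340})` returns `{"A": 60, "B": 40}`; interpreting whether 60 points is a lot still requires an external reference point, as the slide notes.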
Measuring growth = change in performance level
Scores must be on the same scale; content must be aligned between time periods; need a reference point to interpret results.
Examples:
• Writing, other performance-based content or skills
                    Post-Classification
Pre-Classification  Basic   Proficient   Advanced
Basic
Proficient
Advanced
Pre-Test/Post-Test With a Rubric
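The pre/post classification table above can be tallied programmatically. A minimal sketch that treats rubric levels as an ordered scale; the level names follow the table, but the function names and tuple layout are illustrative assumptions:

```python
LEVELS = ["Basic", "Proficient", "Advanced"]  # ordered rubric levels

def level_change(pre_level, post_level):
    """Growth as movement across ordered performance levels."""
    return LEVELS.index(post_level) - LEVELS.index(pre_level)

def transition_counts(pairs):
    """Fill the pre/post classification table from (pre, post) pairs."""
    counts = {(pre, post): 0 for pre in LEVELS for post in LEVELS}
    for pre, post in pairs:
        counts[(pre, post)] += 1
    return counts
```

A positive `level_change` means the student moved up at least one band; the diagonal of `transition_counts` holds students whose classification did not change.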
Collection of student work showing growth related to relevant standards
• May work especially well for courses with performance-based tasks or work that is scored via rubric
• Requires a holistic rubric or repeated measures (e.g., writing assignments scored against the same rubric each time)
• Must consider means to ensure consistency, quality
• Could do within a single class or course
Another Approach: Portfolios
Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., Rothstein, R., Shavelson, R. J., & Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers. Economic Policy Institute Briefing Paper No. 278.
Bill & Melinda Gates Foundation. (2010). Learning about teaching: Initial findings from the Measures of Effective Teaching Project. Retrieved from https://docs.gatesfoundation.org/Documents/preliminary-findings-research-paper.pdf
Chamberlain, G. (2013). Predictive effects of teachers and schools on test scores, college attendance, and earnings. Proceedings of the National Academy of Sciences 110(43), 17176–17182.
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014a). Measuring the impacts of teachers I: Evaluating bias in teacher value-added estimates. American Economic Review 104(9), 2593–2632.
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014b). Measuring the impacts of teachers II: Teacher value-added and student outcomes in adulthood. American Economic Review, 104(9), 2633–2679.
Cornetto, K. M., Schmitt L. N. T., Malerba, C., & Herrera, A. (2010). AISD REACH year 2 evaluation report II, 2008–2009 (DRE Publication No. 08.97). Austin, TX: Austin Independent School District. Retrieved from http://www.austinisd.org/sites/default/files/dre-reports/08.97_AISD_Reach_Year2_Evaluation_ReportII_2008_2009.pdf
Courtemanche, M., Orr, A., & Schmitt, L. (2014). AISD REACH program update: 2013–2014 participant feedback (DRE Report No. 13.39). Austin, TX: Austin Independent School District. Retrieved from http://www.austinisd.org/sites/default/files/dre-reports/DRE_13.39_AISD_Reach_Program_Update_2013_2014_Participant_Feedback.pdf
References and Resources
Dee, T., & Wyckoff, J. (2015). Incentives, selection, and teacher performance: Evidence from IMPACT, Journal of Policy Analysis and Management, 34(2), 267–297. Retrieved from https://ideas.repec.org/a/wly/jpamgt/v34y2015i2p267-297.html
Delaware Department of Education, Teacher and Leader Effectiveness Unit. (2013). Continuous improvement: A report on “year one” of the revised DPAS-II educator evaluation system. Dover, DE: Author. Retrieved from http://www.doe.k12.de.us/cms/lib09/DE01922744/Centricity/domain/271/present%20and%20reports/DPAS_II_Year_One_Report_2013.pdf
Donaldson, M. L., Cobb, C., LeChasseur, K., Gabriel, R., Gonzales, R., Woulfin, S., & Makuch, A. (2014). An evaluation of the pilot implementation of Connecticut’s system for educator evaluation and development. Retrieved from http://aftct.org/sites/aftct.org/files/neag_seed_report_1_1_14.pdf
Donaldson, M. L. (2012). Teachers’ perspectives on evaluation reform. Washington, DC: Center for American Progress. Retrieved from http://www.americanprogress.org/wp-content/uploads/2012/12/TeacherPerspectives.pdf
Ehlert, M., Koedel, C., Parsons, E., & Podgursky, M. (2013). Selecting growth models for school and teacher evaluations: Should proportionality matter? Washington, DC: National Center for Analysis of Longitudinal Data in Education Research (CALDER) Working Paper 80.
Felton, E. (2016, September 1). New Jersey triples weight of test scores in teacher evaluations. Education Week. Retrieved from http://blogs.edweek.org/edweek/teacherbeat/2016/09/new_jersey_tests_evaluations.html
Glazerman, S., Loeb, S., Goldhaber, D., Staiger, D., Raudenbush, S., & Whitehurst, R. (2010). Evaluating teachers: The important role of value-added. Washington, DC: The Brookings Brown Center Task Group on Teacher Quality.
Goldhaber, D., Gabele, B., & Walch, J. (2014). Does the model matter? Exploring the relationship between different achievement-based teacher assessments. Statistics, Politics, and Policy 1(1), 28–39.
Goldhaber, D. (2015, March). Exploring the potential of value-added performance measures to affect the quality of the teacher workforce. Educational Researcher, 44(2), 87–95.
Goldhaber, D., Cowan, J., & Walch, J. (2013). Is a good elementary teacher always good? Assessing teacher performance estimates across subjects. Economics of Education Review, 36, 216–228.
Goldhaber, D., & Hansen, M. (2012). Is it just a bad class? Assessing the long-term stability of estimated teacher performance. Economica, 80(319), 589–612.
Goldring, E., Grissom, J., Rubin, M., Neumerski, C., Cannata, M., Drake, T., & Scheuermann, P. (2015, March). Make room value added: Principals’ human capital decisions and the emergence of teacher observation data. Educational Researcher, 44, 96–104.
Jiang, J., Sporte, S., & Luppescu, S. (2015, March). Teacher perspectives on evaluation reform: Chicago’s REACH Students. Educational Researcher, 44, 105–116.
Kane, T. J., & Staiger, D. O. (2008). Estimating teacher impacts on student achievement: An experimental evaluation. National Bureau of Economic Research Working Paper No. 14607.
Kane, T. J., Taylor, E., Tyler, J., & Wooten, A. (2011). Identifying effective classroom practices using student achievement data. Journal of Human Resources, 46(3), 587–613. Retrieved from http://cepr.harvard.edu/publications/identifying-effective-classroom-practices-using-student-achievement-data
Kane, T. J., Staiger, D. O., & Bacher-Hicks, A. (2014). Validating teacher effect estimates using between school movers: A replication and extension of Chetty et al. Harvard University Working Paper.
Koedel, C., & Betts, J. R. (2007, April). Re-examining the role of teacher quality in the educational production function. University of Missouri-Columbia Department of Economics Working Paper Series WP 07-08. Retrieved from https://economics.missouri.edu/working-papers/2007/wp0708_koedel.pdf
Koedel, C., Mihaly, K., & Rockoff, J. E. (2015, January). Value-added modeling: A review. University of Missouri–Columbia, Department of Economics.
Lachlan-Haché, L. (2015). The art and science of student learning objectives: A research synthesis. Washington, DC: Performance Management Advantage: Evaluation & Professional Growth at American Institutes for Research. Retrieved from http://www.air.org/sites/default/files/downloads/report/Art-and-Science-of-Student-Learning-Objectives-April-2015.pdf
Loeb, S., Soland, J., & Fox, L. (2014). Is a good teacher a good teacher for all? Comparing value-added of teachers with their English learners and non-English learners. Education Evaluation and Policy Analysis, 36(4), 457–475.
McCaffrey, D. F. (2013). Will teacher value‐added scores change when accountability tests change? Carnegie Knowledge Network. Retrieved from http://www.carnegieknowledgenetwork.org/wp-content/uploads/2013/06/CKN_2013-06_McCaffrey.pdf
McCaffrey, D. F., Sass, T. R., Lockwood, J. R., & Mihaly, K. (2009). The intertemporal variability of teacher effect estimates. Education Finance and Policy, 4(4), 572–606.
Mihaly, K., McCaffrey D., Staiger, D., & Lockwood, J. (2013, January 8). A composite estimator of effective teaching. Seattle, WA: Bill & Melinda Gates Foundation, Measures of Effective Teaching (MET) Project.
Paufler, N. A., & Amrein-Beardsley, A. (2014). The random assignment of students into elementary classrooms: Implications for value-added analyses and interpretations. American Educational Research Journal 51(2), 328–362.
Rockoff, J. E., Staiger, D. O., Kane, T. J., & Taylor, E. S. (2012, December). Information and employee evaluation: Evidence from a randomized intervention in public schools. American Economic Review, 3184–3213.
Ronfeldt, M., Loeb, S., & Wyckoff, J. (2013). How teacher turnover harms student achievement. American Educational Research Journal, 50(1), 4–36.
Rothstein, J. (2010). Teacher quality in educational production: Tracking, decay, and student achievement. Quarterly Journal of Economics, 125(1), 175–214.
Rothstein, J. (2009). Student sorting and bias in value-added estimation: Selection on observables and unobservables. Education Finance and Policy, 4(4), 537–571.
Schmitt, L. N. T. (2014). AISD REACH program: Summary of findings from 2007–2008 through 2012–2013 (DRE Publication No. 12.96). Austin, TX: Austin Independent School District. Retrieved from http://www.austinisd.org/sites/default/files/dre-reports/DRE_12.96_AISD_REACH_Program_Summary_of_Findings_2007_2008_Through_2012_2013_0.pdf
Schmitt, L. N. T., Lamb, L. M., Cornetto, K. M., & Courtemanche, M. (2013). AISD REACH program update, 2012−2013: Student learning objectives (DRE Publication No. 12.83). Austin, TX: Austin Independent School District. Retrieved from https://www.austinisd.org/sites/default/files/dre-reports/DRE_12.83_AISD_REACH_Program_Update_2012_2013_Student_Learning_Objectives.pdf
Schmitt, L., Malerba, C., Cornetto, K., & Bush-Richards, A. (2008). Strategic compensation interim report 2: Teacher focus group summary, spring 2008 (DPE Publication No. 07.32). Austin, TX: Austin Independent School District. Retrieved from http://www.austinisd.org/sites/default/files/drereports/07.32_Strategic_Compensation_Interim_Report_2_Teacher_Focus_Group_Summary_Spring_2008.pdf
Slotnik, W. J., Bugler, D., & Liang, G. (2014). Real progress in Maryland: Student learning objectives and teacher and principal evaluation. Washington, DC: Mid-Atlantic Comprehensive Center. Retrieved from http://www.wested.org/wp-content/files_mf/1413394919RealProgressinMD_Report.pdf
Slotnik, W. J., Smith, M., Glass, R., & Helms, B. J. (2004). Catalyst for change: Pay for performance in Denver (Final Report). Boston, MA: Community Training and Assistance Center. Retrieved from http://www.broadeducation.org/asset/1128-catalyst%20for%20change.pdf
Steinberg, M. P., & Sartain, L. (2015). Does teacher evaluation improve school performance? Experimental evidence from Chicago's Excellence in Teaching project. Education Finance and Policy, 10(4), 535–572.
Taylor, K. (2015, December 14). New York Regents vote to exclude state tests in teacher evaluations. New York Times. Retrieved from http://www.nytimes.com/2015/12/15/nyregion/new-york-regents-vote-to-exclude-state-tests-in-teacher-evaluations.html
Taylor, E. S., & Tyler, J. H. (2012). The effect of evaluation on teacher performance. American Economic Review, 102(7), 3628–3651.
TNTP. (2012). Summer report: Creating a culture of excellence in Indiana schools. Indianapolis, IN: Indiana Department of Education. Retrieved from http://www.riseindiana.org/sites/default/files/files/Summer%20Report.pdf
Whitehurst, G., Chingos, M., & Lindquist, K. (2014). Evaluating teachers with classroom observations: Lessons learned in four districts. Washington, DC: Brookings Institution.
Xu, Z., Ozek, U., & Corritore, M. (2012). Portability of teacher effectiveness across school settings. Washington, DC: National Center for Analysis of Longitudinal Data in Education Research (CALDER) Working Paper No. 77.
Advancing state efforts to grow, respect, and retain great teachers and leaders for all students
Mariann Lemke 773-283-3668 [email protected] 1000 Thomas Jefferson Street NW Washington, DC 20007-3835 877-322-8700 www.gtlcenter.org [email protected]