Using Student Assessment Data in Teacher Evaluations
Copyright © 2016 American Institutes for Research. All rights reserved.
Mariann Lemke October 2016
The mission of the Center on Great Teachers and Leaders (GTL Center) is to foster the capacity of vibrant networks of practitioners, researchers, innovators, and experts to build and sustain a seamless system of support for great teachers and leaders for every school in every state in the nation.
GTL Center Mission
• State Assessment Data and Teacher Evaluations: Who, What, Why?
• What We Know: Value-Added Measures (VAMs)
• What We Know: Student Learning Objectives (SLOs)
• Other Growth Measures
• Q&A and Discussion
Agenda
Policy scans show that states typically use state assessments in teacher evaluation systems to measure student growth through: • Student learning objectives (SLOs) or similar goal-setting methods • Statistical measures such as value-added models (VAMs) or growth models
State Assessment Data in Teacher Evaluations
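To make the "statistical measures" idea concrete: a hedged, toy sketch of the residual-gain intuition behind value-added models. This is not any state's actual model (real VAMs adjust for many more student and classroom factors, usually in multilevel regressions); the function name, data layout, and single-predictor fit are illustrative assumptions only.

```python
from statistics import mean

def value_added(records):
    """Toy residual-gain value-added sketch (illustrative only).

    records: list of (teacher, prior_score, current_score) tuples.
    Fits one least-squares line, current ~ prior, across all students,
    then averages each teacher's residuals (actual minus predicted).
    """
    priors = [p for _, p, _ in records]
    currents = [c for _, _, c in records]
    mp, mc = mean(priors), mean(currents)
    # Ordinary least-squares slope and intercept for current ~ prior.
    slope = (sum((p - mp) * (c - mc) for _, p, c in records)
             / sum((p - mp) ** 2 for p in priors))
    intercept = mc - slope * mp
    # Group each teacher's residuals, then average them.
    by_teacher = {}
    for teacher, p, c in records:
        by_teacher.setdefault(teacher, []).append(c - (intercept + slope * p))
    return {t: mean(res) for t, res in by_teacher.items()}
```

A positive score means a teacher's students scored above what their prior scores predicted; the sketch conveys only the "growth beyond prediction" idea, not an operational model.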
Student growth means the change in student achievement for an individual student between two or more points in time.
What you need:
• Data from two or more points in time
• Data that measure what students are supposed to have learned and teachers are supposed to have taught by that point in time
• An approach to connecting the data
Measuring Student Growth
[Figure: Two bar charts on a 0–500 scale comparing prior and current performance for Students A–E in Ms. Smith's class and in Ms. Jones's class, with a proficiency cut line.]
Why Student Growth?
SLO Template
Baseline Data
Student Population
Interval of Instruction
Standards and Content
Assessments
Growth Targets
Rationale for Growth Targets
Instructional Strategies
A student learning objective (SLO) is a measurable, long-term, academic goal informed by available data that a teacher or teacher team sets at the beginning of the year for all students or for subgroups of students.
Student Learning Objectives
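Once growth targets are set, checking them at the end of the interval of instruction is simple arithmetic. A minimal sketch of that check; the function name, dictionary keys, and per-student targets are hypothetical, not part of any state's SLO template:

```python
def slo_results(students):
    """Report which students met their individual SLO growth targets.

    students: list of dicts with hypothetical keys
    'name', 'baseline', 'final', and 'target_gain'.
    """
    met = [s["name"] for s in students
           if s["final"] - s["baseline"] >= s["target_gain"]]
    return {"met": met, "pct_met": len(met) / len(students)}
```

For example, a student moving from 40 to 62 against a target gain of 20 counts as having met the objective; real SLO scoring rubrics typically band the percentage met into rating categories.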
The Every Student Succeeds Act (ESSA) specifies that the U.S. Department of Education (ED) cannot require anything of states with regard to educator evaluation (i.e., as a condition of approval of, or waiver to, required state plans).
Regardless of federal policy changes, state laws and regulations related to educator effectiveness remain in effect.
ESSA and Educator Evaluation
Education Commission of the States Policy Scan (September 2016)
43 states required objective measures of student achievement to be included in teacher evaluations.
16 states included student achievement and growth as the “preponderant criterion” in teacher evaluations. These states include AK, CO, CT, DC, DE, GA, HI, KY, LA, MS, NC, NM, NY, OK, PA, and TN.
19 states included growth measures as a “significant criterion” in teacher evaluations. Eleven of those states (AZ, FL, ID, IL, MI, MN, NJ, NV, OH, RI, VA) explicitly define what “significant” means for the purposes of including student achievement in teacher evaluations. Eight states (AR, IN, KS, MD, ME, MO, OR, SD) do not provide these explicit guidelines.
Eight states required objective evidence of student learning in teacher evaluations (MA, ND, SC, UT, WA, WI, WV, WY).
Seven states required that schoolwide achievement data be used in individual teacher performance ratings, whereas 11 other states explicitly allowed the practice.
Current State Policy
Florida: • 50% of educator evaluation score
Louisiana: • 50% of educator evaluation score
Minnesota: • 35% of educator evaluation score
Ohio: • Under 2016 statute, the value-added score is optional
New Mexico: • 50% of educator evaluation score
North Carolina: • One of six standards
Tennessee: • One of several approved measures used in portfolio
States Currently Using Value-Added Data in Teacher Evaluations
N.J. Triples Weight of Tests in Teacher Evaluations (Education Week, September 13, 2016)
Mass. reexamining role of student test scores in teacher evaluations (Boston Globe, September 27, 2016)
But…a Changing Landscape….
Bias/validity is a concern: The potential exists, but studies have found no concrete evidence of it.
Precision/reliability: Variability in measures is well documented; precision can be increased by pooling data over time.
Relationship to other measures: Moderate relationships with other measures of teacher effectiveness.
Data needs: Need high-quality assessment and linkage data.
Technical Characteristics
Typically not used as an independent measure.
Effects are likely to vary depending on who uses the data, how they perceive the data, and the specifics of policies: Are measures used for accountability or for improvement, and what supports or consequences accompany them?
• Some evidence of teacher turnover, increased teacher performance, and increased student performance with the use of evaluation systems
• May be useful for assignment or for identifying coaches
Effects of the Use of Value-Added Measures
Mixed perceptions of the usefulness of SLOs; some evidence that these perceptions may improve over time, along with positive perceptions of the use of data.
Challenges relate to assessments (selection, design) and to accessing data, as well as to supports and communication.
Current studies suggest some relationship between SLO quality and student achievement, but results are not consistent across content areas or studies.
Implementation, Perceptions, and Relationship to Achievement
[Figure: A bar chart showing a student scoring 100 in Year 1 and 200 in Year 2.]
Measuring growth = subtraction
Scores must be on the same scale (could be a rubric); content must be aligned between time periods; need a reference point to interpret results.
Examples:
• Math grade 4 and math grade 5 on a vertically scaled assessment
• Spanish 3 pre- and post-test • Fitness pre- and post-test
Pre-Test/Post-Test or Simple Growth
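In code, the "growth = subtraction" idea is trivial once pre- and post-test scores share a scale. A minimal sketch, assuming scores are keyed by student (the function name and data layout are illustrative):

```python
def simple_growth(pre, post):
    """Simple growth: post-test minus pre-test, per student.

    Only meaningful when both tests report on the same scale and
    cover aligned content; students missing either score are skipped.
    """
    return {student: post[student] - pre[student]
            for student in pre if student in post}
```

For example, `simple_growth({"A": 200, "B": 300}, {"A": 260, "B": 340})` returns `{"A": 60, "B": 40}`; interpreting whether 60 points is a lot still requires an external reference point, as the slide notes.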
Measuring growth = change in performance level
Scores must be on the same scale; content must be aligned between time periods; need a reference point to interpret results.
Examples:
• Writing, other performance-based content or skills
                    Post-Classification
Pre-Classification  Basic   Proficient   Advanced
Basic
Proficient
Advanced
Pre-Test/Post-Test With a Rubric
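The pre/post classification table above can be tallied programmatically. A minimal sketch that treats rubric levels as an ordered scale; the level names follow the table, but the function names and tuple layout are illustrative assumptions:

```python
LEVELS = ["Basic", "Proficient", "Advanced"]  # ordered rubric levels

def level_change(pre_level, post_level):
    """Growth as movement across ordered performance levels."""
    return LEVELS.index(post_level) - LEVELS.index(pre_level)

def transition_counts(pairs):
    """Fill the pre/post classification table from (pre, post) pairs."""
    counts = {(pre, post): 0 for pre in LEVELS for post in LEVELS}
    for pre, post in pairs:
        counts[(pre, post)] += 1
    return counts
```

A positive `level_change` means the student moved up at least one band; the diagonal of `transition_counts` holds students whose classification did not change.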
Collection of student work showing growth related to relevant standards
• May work especially well for courses with performance-based tasks or work that is scored via rubric
• Requires a holistic rubric or repeated measures (e.g., writing assignments scored against the same rubric each time)
• Must consider means to ensure consistency, quality
• Could do within a single class or course
Another Approach: Portfolios
Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., Rothstein, R., Shavelson, R. J., & Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers. Economic Policy Institute Briefing Paper No. 278.
Bill & Melinda Gates Foundation. (2010). Learning about teaching: Initial findings from the Measures of Effective Teaching Project. Retrieved from https://docs.gatesfoundation.org/Documents/preliminary-findings-research-paper.pdf
Chamberlain, G. (2013). Predictive effects of teachers and schools on test scores, college attendance, and earnings. Proceedings of the National Academy of Sciences 110(43), 17176–17182.
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014a). Measuring the impacts of teachers I: Evaluating bias in teacher value-added estimates. American Economic Review 104(9), 2593–2632.
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014b). Measuring the impacts of teachers II: Teacher value-added and student outcomes in adulthood. American Economic Review, 104(9), 2633–2679.
Cornetto, K. M., Schmitt L. N. T., Malerba, C., & Herrera, A. (2010). AISD REACH year 2 evaluation report II, 2008–2009 (DRE Publication No. 08.97). Austin, TX: Austin Independent School District. Retrieved from http://www.austinisd.org/sites/default/files/dre-reports/08.97_AISD_Reach_Year2_Evaluation_ReportII_2008_2009.pdf
Courtemanche, M., Orr, A., & Schmitt, L. (2014). AISD REACH program update: 2013–2014 participant feedback (DRE Report No. 13.39). Austin, TX: Austin Independent School District. Retrieved from http://www.austinisd.org/sites/default/files/dre-reports/DRE_13.39_AISD_Reach_Program_Update_2013_2014_Participant_Feedback.pdf
References and Resources
Dee, T., & Wyckoff, J. (2015). Incentives, selection, and teacher performance: Evidence from IMPACT, Journal of Policy Analysis and Management, 34(2), 267–297. Retrieved from https://ideas.repec.org/a/wly/jpamgt/v34y2015i2p267-297.html
Delaware Department of Education, Teacher and Leader Effectiveness Unit. (2013). Continuous improvement: A report on “year one” of the revised DPAS-II educator evaluation system. Dover, DE: Author. Retrieved from http://www.doe.k12.de.us/cms/lib09/DE01922744/Centricity/domain/271/present%20and%20reports/DPAS_II_Year_One_Report_2013.pdf
Donaldson, M. L., Cobb, C., LeChasseur, K., Gabriel, R., Gonzales, R., Woulfin, S., & Makuch, A. (2014). An evaluation of the pilot implementation of Connecticut’s system for educator evaluation and development. Retrieved from http://aftct.org/sites/aftct.org/files/neag_seed_report_1_1_14.pdf
Donaldson, M. L. (2012). Teachers’ perspectives on evaluation reform. Washington, DC: Center for American Progress. Retrieved from http://www.americanprogress.org/wp-content/uploads/2012/12/TeacherPerspectives.pdf
Ehlert, M., Koedel, C., Parsons, E., & Podgursky, M. (2013). Selecting growth models for school and teacher evaluations: Should proportionality matter? Washington, DC: National Center for Analysis of Longitudinal Data in Education Research (CALDER) Working Paper 80.
Felton, E. (2016, September 1). New Jersey triples weight of test scores in teacher evaluations. Education Week. Retrieved from http://blogs.edweek.org/edweek/teacherbeat/2016/09/new_jersey_tests_evaluations.html
Glazerman, S., Loeb, S., Goldhaber, D., Staiger, D., Raudenbush, S., & Whitehurst, R. (2010). Evaluating teachers: The important role of value-added. Washington, DC: The Brookings Brown Center Task Group on Teacher Quality.
Goldhaber, D., Gabele, B., & Walch, J. (2014). Does the model matter? Exploring the relationship between different achievement-based teacher assessments. Statistics, Politics, and Policy 1(1), 28–39.
Goldhaber, D. (2015, March). Exploring the potential of value-added performance measures to affect the quality of the teacher workforce. Educational Researcher, 44(2), 87–95.
Goldhaber, D., Cowan, J., & Walch, J. (2013). Is a good elementary teacher always good? Assessing teacher performance estimates across subjects. Economics of Education Review, 36, 216–228.
Goldhaber, D., & Hansen, M. (2012). Is it just a bad class? Assessing the long-term stability of estimated teacher performance. Economica, 80(319), 589–612.
Goldring, E., Grissom, J., Rubin, M., Neumerski, C., Cannata, M., Drake, T., & Scheuermann, P. (2015, March). Make room value added: Principals’ human capital decisions and the emergence of teacher observation data. Educational Researcher, 44, 96–104.
Jiang, J., Sporte, S., & Luppescu, S. (2015, March). Teacher perspectives on evaluation reform: Chicago’s REACH Students. Educational Researcher, 44, 105–116.
Kane, T. J., & Staiger, D. O. (2008). Estimating teacher impacts on student achievement: An experimental evaluation. National Bureau of Economic Research Working Paper No. 14607.
Kane, T. J., Taylor, E., Tyler, J., & Wooten, A. (2011). Identifying effective classroom practices using student achievement data. Journal of Human Resources, 46(3), 587–613. Retrieved from http://cepr.harvard.edu/publications/identifying-effective-classroom-practices-using-student-achievement-data
Kane, T. J., Staiger, D. O., & Bacher-Hicks, A. (2014). Validating teacher effect estimates using between school movers: A replication and extension of Chetty et al. Harvard University Working Paper.
Koedel, C., & Betts, J. R. (2007, April). Re-examining the role of teacher quality in the educational production function. University of Missouri-Columbia Department of Economics Working Paper Series WP 07-08. Retrieved from https://economics.missouri.edu/working-papers/2007/wp0708_koedel.pdf
Koedel, C., Mihaly, K., & Rockoff, J. E. (2015, January). Value-added modeling: A review. University of Missouri–Columbia, Department of Economics.
Lachlan-Haché, L. (2015). The art and science of student learning objectives: A research synthesis. Washington, DC: Performance Management Advantage: Evaluation & Professional Growth at American Institutes for Research. Retrieved from http://www.air.org/sites/default/files/downloads/report/Art-and-Science-of-Student-Learning-Objectives-April-2015.pdf
Loeb, S., Soland, J., & Fox, L. (2014). Is a good teacher a good teacher for all? Comparing value-added of teachers with their English learners and non-English learners. Education Evaluation and Policy Analysis, 36(4), 457–475.
McCaffrey, D. F. (2013). Will teacher value‐added scores change when accountability tests change? Carnegie Knowledge Network. Retrieved from http://www.carnegieknowledgenetwork.org/wp-content/uploads/2013/06/CKN_2013-06_McCaffrey.pdf
McCaffrey, D. F., Sass, T. R., Lockwood, J. R., & Mihaly, K. (2009). The intertemporal variability of teacher effect estimates. Education Finance and Policy, 4(4), 572–606.
Mihaly, K., McCaffrey D., Staiger, D., & Lockwood, J. (2013, January 8). A composite estimator of effective teaching. Seattle, WA: Bill & Melinda Gates Foundation, Measures of Effective Teaching (MET) Project.
Paufler, N. A., & Amrein-Beardsley, A. (2014). The random assignment of students into elementary classrooms: Implications for value-added analyses and interpretations. American Educational Research Journal 51(2), 328–362.
Rockoff, J. E., Staiger, D. O., Kane, T. J., & Taylor, E. S. (2012, December). Information and employee evaluation: Evidence from a randomized intervention in public schools. American Economic Review, 3184–3213.
Ronfeldt, M., Loeb, S., & Wyckoff, J. (2013). How teacher turnover harms student achievement. American Educational Research Journal, 50(1), 4–36.
Rothstein, J. (2010). Teacher quality in educational production: Tracking, decay, and student achievement. Quarterly Journal of Economics, 125(1), 175–214.
Rothstein, J. (2009). Student sorting and bias in value-added estimation: Selection on observables and unobservables. Education Finance and Policy, 4(4), 537–571.
Schmitt, L. N. T. (2014). AISD REACH program: Summary of findings from 2007–2008 through 2012–2013 (DRE Publication No. 12.96). Austin, TX: Austin Independent School District. Retrieved from http://www.austinisd.org/sites/default/files/dre-reports/DRE_12.96_AISD_REACH_Program_Summary_of_Findings_2007_2008_Through_2012_2013_0.pdf
Schmitt, L. N. T., Lamb, L. M., Cornetto, K. M., & Courtemanche, M. (2013). AISD REACH program update, 2012−2013: Student learning objectives (DRE Publication No. 12.83). Austin, TX: Austin Independent School District. Retrieved from https://www.austinisd.org/sites/default/files/dre-reports/DRE_12.83_AISD_REACH_Program_Update_2012_2013_Student_Learning_Objectives.pdf
Schmitt, L., Malerba, C., Cornetto, K., & Bush-Richards, A. (2008). Strategic compensation interim report 2: Teacher focus group summary, spring 2008 (DPE Publication No. 07.32). Austin, TX: Austin Independent School District. Retrieved from http://www.austinisd.org/sites/default/files/drereports/07.32_Strategic_Compensation_Interim_Report_2_Teacher_Focus_Group_Summary_Spring_2008.pdf
Slotnik, W. J., Bugler, D., & Liang, G. (2014). Real progress in Maryland: Student learning objectives and teacher and principal evaluation. Washington, DC: Mid-Atlantic Comprehensive Center. Retrieved from http://www.wested.org/wp-content/files_mf/1413394919RealProgressinMD_Report.pdf
Slotnik, W. J., Smith, M., Glass, R., & Helms, B. J. (2004). Catalyst for change: Pay for performance in Denver (Final Report). Boston, MA: Community Training and Assistance Center. Retrieved from http://www.broadeducation.org/asset/1128-catalyst%20for%20change.pdf
Steinberg, M. P., & Sartain, L. (2015). Does teacher evaluation improve school performance? Experimental evidence from Chicago's Excellence in Teaching project. Education Finance and Policy, 10(4), 535–572.
Taylor, K. (2015, December 14). New York Regents vote to exclude state tests in teacher evaluations. New York Times. Retrieved from http://www.nytimes.com/2015/12/15/nyregion/new-york-regents-vote-to-exclude-state-tests-in-teacher-evaluations.html
Taylor, E. S., & Tyler, J. H. (2012). The effect of evaluation on teacher performance. American Economic Review, 102(7), 3628–3651.
TNTP. (2012). Summer report: Creating a culture of excellence in Indiana schools. Indianapolis, IN: Indiana Department of Education. Retrieved from http://www.riseindiana.org/sites/default/files/files/Summer%20Report.pdf
Whitehurst, G., Chingos, M., & Lindquist, K. (2014). Evaluating teachers with classroom observations: Lessons learned in four districts. Washington, DC: Brookings Institution.
Xu, Z., Ozek, U., & Corritore, M. (2012). Portability of teacher effectiveness across school settings. Washington, DC: National Center for Analysis of Longitudinal Data in Education Research (CALDER) Working Paper No. 77.
Advancing state efforts to grow, respect, and retain great teachers and leaders for all students
Mariann Lemke 773-283-3668 [email protected] 1000 Thomas Jefferson Street NW Washington, DC 20007-3835 877-322-8700 www.gtlcenter.org [email protected]