
Strengthening the Research Base that Informs STEM Workforce Development and Curriculum Improvement Efforts:

A Meta-Analysis

Kathleen Lynch, Heather Hill, Kathryn Gonzalez, and Cynthia Pollard

Background / Context: Prior research and its intellectual context. Professional development and curriculum materials constitute two major vehicles for instructional innovation and for improving student outcomes. Prior to 2002, however, scholars rarely evaluated such programs rigorously. Following calls in the early 2000s by influential scholars for stronger research into the impact of educational interventions (e.g., Confrey & Stohl, 2004; Shavelson & Towne, 2001), research portfolios at the Institute of Education Sciences (IES) and the National Science Foundation (NSF) began to reflect a growing interest in research methods that allow causal inference, and in using student outcomes as an indicator of program success. Dollars and scholars alike have turned in this direction, producing a wealth of new studies in the past fifteen years; these studies permit rigorous empirical analyses linking program characteristics to student outcomes.

Purpose / Objective / Research Question: Description of the focus of the research. We present a meta-analysis of preK-12 STEM instructional improvement programs, seeking to understand what content, formats, and activities lead to stronger student outcomes. Our analysis differs from similar recent efforts in that it is a formal meta-analysis rather than a structured review (e.g., Kennedy, 2015; Gersten, 2014), and because, unlike past efforts (e.g., Scher & O'Reilly, 2009; Slavin, Lake, & Groff, 2009), the large number of newly available randomized studies allows us to exclude studies with weaker designs.

Research Design. We conducted a meta-analysis of STEM professional development and curriculum improvement interventions.

Data Collection and Analysis.

Search procedures. Our goal was to uncover studies from both the published and unpublished ('grey') literature. We began by searching library reference databases (e.g., ERIC, PsycINFO, Academic Search Premier, ProQuest Dissertations and Theses) for the years 1989 forward. We used search terms adapted from prior studies (e.g., Scher & O'Reilly, 2009; Yoon et al., 2007) and also hand-searched the reference lists of prior research reviews (e.g., Kennedy, 1999; Scher & O'Reilly, 2009; Slavin & Lake, 2008; Yoon et al., 2007). We also searched the NSF and IES websites for STEM awards made between 2002 and 2012, and conducted follow-up Google Scholar searches to locate studies resulting from these grants. In cases where we could find no publicly available reports or information from the IES/NSF pool, we contacted project PIs via email to obtain study results. Searching ended in March 2016, although attempts to contact PIs continued through August 2017.

Inclusion/exclusion criteria.


In the initial round of screening, researchers downloaded 1,698 studies and examined their abstracts. Of these, 477 studies met basic criteria for relevance and were advanced to the next round of screening. To be retained in this second round, studies had to: (1) examine an intervention aimed at preK-12 teachers; (2) have at least two teachers and 15 students in each treatment group (Slavin, Lake, & Groff, 2009); (3) provide student outcome data (e.g., achievement or affective outcomes); and (4) possess a randomized or strong quasi-experimental research design. We excluded studies that had no equivalent comparison group or that used post-hoc matching. After applying these criteria, 90 studies remained in the final dataset.

Study coding. We developed a coding scheme that captured features of intervention content (see Table 1). For interventions involving professional development, three primary coding categories emerged: the main focus (or foci) of the professional development (e.g., assessment; curriculum materials); PD format (e.g., summer workshop; coaching); and the activities that teachers engaged in during the professional development. For interventions involving new curriculum materials, our codes included whether the curriculum provided implementation guidance or supported student inquiry. After meeting an initial 80% interrater agreement threshold, coding proceeded in pairs: individuals coded separately, then met to resolve disagreements.

Analysis. We calculated standardized mean effect sizes using Hedges' g, adjusting standard errors as needed for the clustering of students within classrooms or schools (Higgins, Deeks, & Altman, 2008; Littell, Corcoran, & Pillai, 2008). For our main analyses, we followed the robust variance estimation (RVE) approach outlined by Tanner-Smith and Tipton (2014).
This approach adjusts standard errors to account for the nesting of multiple effect sizes within studies, which occurs when a single study provides multiple effect size estimates for the same underlying construct, or for correlated underlying measures (e.g., an intervenor-developed and a state standardized algebra assessment; multiple subscales of a single assessment). We first estimated an unconditional meta-regression model with RVE to estimate the mean effect size. Next, we estimated a conditional model with a set of covariates for study design, study sample, and outcome measure type. Finally, we estimated a series of conditional models, each containing a set of covariates representing one of the primary coding categories outlined above: focus, activities, format, and characteristics of new curriculum materials.

Preliminary Findings/Results. The overall impact of instructional improvement interventions is roughly 0.2 standard deviations (Table 2). Consistent with past reports (Hill, Bloom, & Lipsey, 2008), researcher-designed assessments produce stronger gains than standardized tests, and preK interventions produce stronger impacts than K-12 programs (see Table 3). Controlling for these factors, models predicting student test score outcomes from the focus variables found that programs centered on how to use curriculum materials, how to integrate technology into classrooms, and how to use content-specific formative assessments all produced significantly stronger impacts (see Table 4). Programs focused on improving pedagogical content knowledge and/or knowledge of how students learn showed similarly positive impacts (see Table 4). Models predicting test scores from professional development activities found that when teachers solved problems, in either math or science, impacts were significantly higher (see Table 5). Models predicting test scores from professional development formats found that online professional development was associated with significantly lower average gains, and that programs featuring summer workshops and follow-up meetings had significantly stronger gains (see Table 6). Finally, no features of curriculum materials significantly predicted student outcomes (see Table 7).

Brief Conclusion. In synthesizing the contemporary research evidence on instructional innovations in STEM, the current study identifies elements of instructional innovations that succeed in bolstering student outcomes.
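The effect-size and RVE computations described in the Analysis section above can be sketched as follows. This is an illustrative sketch, not the authors' analysis code: the toy data, the omission of the between-study variance component, and the simplified correlated-effects weights (one over the number of effects in a study times that study's average variance, following Hedges, Tipton, and Johnson, 2010) are all assumptions made for brevity.

```python
# Sketch only (not the authors' code): Hedges' g with its small-sample
# correction, a design-effect adjustment for clustering, and an unconditional
# mean effect size with a cluster-robust (RVE) standard error.
import numpy as np

def hedges_g(m1, m2, sd1, sd2, n1, n2):
    """Standardized mean difference with Hedges' small-sample correction."""
    # pooled standard deviation across the two groups
    sp = np.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3.0 / (4 * (n1 + n2 - 2) - 1)  # correction factor J
    g = j * d
    # approximate sampling variance of g
    v = (n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2))
    return g, v

def cluster_adjust(v, m, icc):
    """Inflate an effect-size variance by the design effect 1 + (m - 1)*ICC
    for clusters (classrooms/schools) of average size m."""
    return v * (1 + (m - 1) * icc)

def rve_mean(effects, variances, study_ids):
    """Unconditional mean effect size with an RVE standard error,
    clustering the effect sizes within studies."""
    effects = np.asarray(effects, float)
    variances = np.asarray(variances, float)
    study_ids = np.asarray(study_ids)
    # simplified correlated-effects weights: each of the k_j effects in
    # study j gets weight 1 / (k_j * average variance within study j)
    w = np.empty_like(effects)
    for s in np.unique(study_ids):
        mask = study_ids == s
        w[mask] = 1.0 / (mask.sum() * variances[mask].mean())
    beta = np.sum(w * effects) / np.sum(w)
    # robust variance: squared weighted residual totals, summed over studies
    num = sum(np.sum(w[study_ids == s] * (effects[study_ids == s] - beta))**2
              for s in np.unique(study_ids))
    se = np.sqrt(num / np.sum(w)**2)
    return beta, se

# toy usage: one study contributing two effect sizes, one contributing one
g, v = hedges_g(m1=10.0, m2=8.0, sd1=4.0, sd2=4.0, n1=50, n2=50)
v_adj = cluster_adjust(v, m=21, icc=0.10)
beta, se = rve_mean([0.20, 0.30, 0.10], [0.04, 0.04, 0.05], [1, 1, 2])
```

In practice the authors report following the Tanner-Smith and Tipton (2014) tutorial, whose routines additionally estimate the between-study variance and apply small-sample corrections that this sketch omits.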


References

Agodini, R., Harris, B., Remillard, J., & Thomas, M. (2013). After two years, three elementary math curricula outperform a fourth. NCEE Technical Appendix, September 2013. Washington, DC: National Center for Education Evaluation and Regional Assistance. Argentin, G., Pennisi, A., Vidoni, D., Abbiati, G., & Caputo, A. (2014). Trying to raise (low) math achievement and to promote (rigorous) policy evaluation in Italy: Evidence from a large-scale randomized trial. Evaluation Review, 38(2), 99-132. Arnold, D. H., Fisher, P. H., Doctoroff, G. L., & Dobbs, J. (2002). Accelerating math development in Head Start classrooms. Journal of Educational Psychology, 94(4), 762. Batiza, A., Luo, W., Zhang, B., Gruhl, M., Nelson, D., Hoelzer, M., ... & LaFlamme, D. (2016). Regular Biology Students Learn Like AP Students with SUN. Paper presented at the Society for Research on Educational Effectiveness Spring Conference, Washington, DC. Battistich, V., Alldredge, S., & Tsuchida, I. (2003). Number Power: An elementary school program to enhance students' mathematical and social development. In Standards-based school mathematics curricula: What are they? What do students learn? (pp. 133-160). Berlinski, S., & Busso, M. (2015). Challenges in educational reform: An experiment on active learning in mathematics. IDB Working Paper Series No. IDB-WP-561. http://hdl.handle.net/11319/6825 Beuermann, D. W., Naslund-Hadley, E., Ruprah, I. J., & Thompson, J. (2013). The pedagogy of science and environment: Experimental evidence from Peru. The Journal of Development Studies, 49(5), 719-736. Borman, K. M., Cotner, B. A., Lee, R. S., Boydston, T. L., & Lanehart, R. (2009, March). Improving Elementary Science Instruction and Student Achievement: The Impact of a Professional Development Program. Paper presented at the Society for Research on Educational Effectiveness, Washington, DC. Borman, G. D., Gamoran, A., & Bowdon, J. (2008). A randomized trial of teacher development in elementary science: First-year achievement effects. Journal of Research on Educational Effectiveness, 1(4), 237-264. Bottge, B. A., Ma, X., Gassaway, L., Toland, M. D., Butler, M., & Cho, S. J. (2014). Effects of blended instructional models on math performance. Exceptional Children, 80(4), 423-437. Bottge, B. A., Toland, M. D., Gassaway, L., Butler, M., Choo, S., Griffen, A. K., & Ma, X. (2015). Impact of enhanced anchored instruction in inclusive math classrooms. Exceptional Children, 81, 158-175.


Bradshaw, T. J. (2012). Impact of inquiry-based distance learning and availability of classroom materials on physical science content knowledge of teachers and students in Central Appalachia (Order No. 3579280). Available from ProQuest Dissertations & Theses Global. (1506611403). Retrieved from http://search.proquest.com.ezp-prod1.hul.harvard.edu/docview/1506611403?accountid=11311 Brendefur, J., Strother, S., Thiede, K., Lane, C., & Surges-Prokop, M. J. (2013). A professional development program to improve math skills among preschool children in Head Start. Early Childhood Education Journal, 41(3), 187-195. Brown, J. A., Greenfield, D. B., Bell, E., Juárez, C. L., Myers, T., & Nayfeld, I. (2013). ECHOS: Early Childhood Hands-On Science Efficacy Study. Paper presented at the Society for Research on Educational Effectiveness annual meeting, Washington, DC. Carpenter, T. P., Fennema, E., Peterson, P. L., Chiang, C. P., & Loef, M. (1989). Using knowledge of children's mathematics thinking in classroom teaching: An experimental study. American Educational Research Journal, 26(4), 499-531. Cervetti, G. N., Barber, J., Dorph, R., Pearson, P. D., & Goldschmidt, P. G. (2012). The impact of an integrated approach to science and literacy in elementary school classrooms. Journal of Research in Science Teaching, 49(5), 631-658. Clark, T. F., Arens, S. A., & Stewart, J. (2015). Efficacy Study of a Pre-Algebra Supplemental Program in Rural Mississippi: Preliminary Findings. Paper presented at the Society for Research on Educational Effectiveness annual meeting, Washington, DC. Clarke, B., Smolkowski, K., Baker, S. K., Fien, H., Doabler, C. T., & Chard, D. J. (2011). The impact of a comprehensive Tier I core kindergarten program on the achievement of students at risk in mathematics. The Elementary School Journal, 111(4), 561-584. Clements, D. H., & Sarama, J. (2007). Effects of a preschool mathematics curriculum: Summative research on the Building Blocks project. Journal for Research in Mathematics Education, 136-163. Clements, D. H., & Sarama, J. (2008). Experimental evaluation of the effects of a research-based preschool mathematics curriculum. American Educational Research Journal, 45(2), 443-494. Clements, D. H., Sarama, J., Spitler, M. E., Lange, A. A., & Wolfe, C. B. (2011). Mathematics learned by young children in an intervention based on learning trajectories: A large-scale cluster randomized trial. Journal for Research in Mathematics Education, 42(2), 127-166. Dash, S., De Kramer, R. M., O'Dwyer, L. M., Masters, J., & Russell, M. (2012). Impact of online professional development on teacher quality and student achievement in fifth grade mathematics. Journal of Research on Technology in Education, 45(1), 1-26.


Debarger, A. H., Penuel, W. R., Moorthy, S., Beauvineau, Y., Kennedy, C. A., & Boscardin, C. K. (2017). Investigating purposeful science curriculum adaptation as a strategy to improve teaching and learning. Science Education, 101(1), 66-98. Desimone, L. M. (2009). Improving impact studies of teachers' professional development: Toward better conceptualizations and measures. Educational Researcher, 38(3), 181-199. Devlin-Scherer, W., Spinelli, A. M., Giammatteo, D., Johnson, C., Mayo-Molina, S., McGinley, P., ... & Zisk, L. (1998). Action Research in Professional Development Schools: Effects on Student Learning. Dominguez, P. S., Nicholls, C., & Storandt, B. (2006). Experimental Methods and Results in a Study of PBS TeacherLine Math Courses. Syracuse, NY: Hezel Associates. Eddy, R. M., & Berry, T. (2006). A Randomized Control Trial to Test the Effects of Prentice Hall's Miller and Levine (2006) Biology Curriculum on Student Performance: Final Report. Claremont, CA: Claremont Graduate University. Eddy, R. M., Ruitman, H. T., Sloper, M., & Hankel, N. (2010). The effects of Miller & Levine Biology (2010) on student performance: Final report. LaVerne, CA: Cobblestone Applied Research & Evaluation, Inc. Garet, M. S., Wayne, A. J., Stancavage, F., Taylor, J., Eaton, M., Walters, K., ... & Sepanik, S. (2011). Middle School Mathematics Professional Development Impact Study: Findings after the Second Year of Implementation. Washington, DC: National Center for Education Evaluation and Regional Assistance. Granger, E. M., Bevis, T. H., Saka, Y., Southerland, S. A., Sampson, V., & Tate, R. L. (2012). The efficacy of student-centered instruction in supporting science learning. Science, 338(6103), 105-108. Gropen, J., Clark-Chiarelli, N., Chalufour, I., Hoisington, C., & Eggers-Pierola, C. (2009, March). Creating a successful professional development program in science for Head Start teachers and children: Understanding the relationship between development, intervention, and evaluation. Paper presented at the Society for Research on Educational Effectiveness annual meeting, Washington, DC. Gropen, J., Clark-Chiarelli, N., Ehrlich, S., & Thieu, Y. (2011). Examining the Efficacy of "Foundations of Science Literacy": Exploring Contextual Factors. Society for Research on Educational Effectiveness. Hand, B., Therrien, W., & Shelley, M. (2013). Examining the Impact of Using the Science Writing Heuristic Approach in Learning Science: A Cluster Randomized Study. Paper presented at the Society for Research on Educational Effectiveness Spring Conference, Washington, DC. Harris, C. J., Penuel, W. R., D'Angelo, C. M., DeBarger, A. H., Gallagher, L. P., Kennedy, C. A., ... & Krajcik, J. S. (2015). Impact of project-based curriculum materials on student learning in science: Results of a randomized controlled trial. Journal of Research in Science Teaching, 52(10), 1362-1385. Harris, C. J., Penuel, W. R., DeBarger, A. H., D'Angelo, C., & Gallagher, L. P. (2014). Curriculum Materials Make a Difference for Next Generation Science Learning. Heller, J. I. (2010). The impact of Math Pathways & Pitfalls on students' mathematics achievement and mathematical language development: A study conducted in schools with high concentrations of Latino/a students and English learners. Heller, J. I., Curtis, D. A., Rabe-Hesketh, S., & Verboncoeur, C. J. (2007). The Effects of "Math Pathways and Pitfalls" on Students' Mathematics Achievement: National Science Foundation Final Report. Heller, J. I., Daehler, K. R., Wong, N., Shinohara, M., & Miratrix, L. W. (2012). Differential effects of three professional development models on teacher knowledge and student achievement in elementary science. Journal of Research in Science Teaching, 49(3), 333-362. Heller, J., Hanson, T., & Barnett-Clarke, C. (2010). The impact of Math Pathways & Pitfalls on students' mathematics achievement and mathematical language development: A study conducted in schools with high concentrations of Latino/a students and English learners. Hinerman, K. M., Hull, D. M., Chen, Q., Booker, D. D., & Naslund-Hadley, E. I. (2014). Teacher-Led Math Inquiry in Belize: A Cluster Randomized Trial. Paper presented at the Society for Research on Educational Effectiveness annual meeting, Washington, DC. Jaciw, A. P., Hegseth, W., Ma, B., & Lai, G. (2012). Assessing Impacts of Math in Focus, a 'Singapore Math' Program for American Schools: A Report of Findings from a Randomized Control Trial. Palo Alto, CA: Empirical Education. Jacob, R., Hill, H., & Corey, D. (2017). The Impact of a Professional Development Program on Teachers' Mathematical Knowledge for Teaching, Instruction, and Student Achievement. Journal of Research on Educational Effectiveness, 10(2), 379-407. Jacobs, V. R., Franke, M. L., Carpenter, T. P., Levi, L., & Battey, D. (2007). Professional development focused on children's algebraic reasoning in elementary school. Journal for Research in Mathematics Education, 258-288. Jerrim, J., & Vignoles, A. (2015). The Causal Effect of East Asian "Mastery" Teaching Methods on English Children's Mathematics Skills. UCL Institute of Education, University College London. Kaldon, C. R., & Zoblotsky, T. A. (2014). A Randomized Controlled Trial Validating the Impact of the LASER Model of Science Education on Student Achievement and Teacher Instruction. Paper presented at the Society for Research on Educational Effectiveness Spring Conference, Washington, DC.


Kennedy, M. M. (2016). How does professional development improve teaching? Review of Educational Research, 86(4), 945-980. Kim, K. H., VanTassel-Baska, J., Bracken, B. A., Feng, A., Stambaugh, T., & Bland, L. (2012). Project Clarion: Three years of science instruction in Title I schools among K-third grade students. Research in Science Education, 42(83), 1-17. Kinzie, M. B., Whittaker, J. V., Williford, A. P., DeCoster, J., McGuire, P., Lee, Y., & Kilday, C. R. (2014). MyTeachingPartner-Math/Science pre-kindergarten curricula and teacher supports: Associations with children's mathematics and science learning. Early Childhood Research Quarterly, 29(4), 586-599. Kisker, E. E., Lipka, J., Adams, B. L., Rickard, A., Andrew-Ihrke, D., Yanez, E. E., & Millard, A. (2012). The potential of a culturally based supplemental mathematics curriculum to improve the mathematics performance of Alaska Native and other students. Journal for Research in Mathematics Education, 43(1), 75-113. Klein, A., Starkey, P., Clements, D., Sarama, J., & Iyer, R. (2008). Effects of a pre-kindergarten mathematics intervention: A randomized experiment. Journal of Research on Educational Effectiveness, 1(3), 155-178. Lafferty, J. F. (1994). The links among mathematics text, students' achievement, and students' mathematics anxiety: A comparison of the incremental development and traditional texts. Unpublished doctoral dissertation, Widener University, Wilmington, DE. Lanehart, R. E., Borman, K. M., Boydston, T. L., Cotner, B. A., & Lee, R. S. (2010, March). Improving gender, racial, and social equity in elementary science instruction and student achievement: The impact of a professional development program. Paper presented at the annual Society for Research on Educational Effectiveness meeting, Washington, DC. Lang, L. B., Schoen, R. R., LaVenia, M., & Oberlin, M. (2014). Mathematics Formative Assessment System--Common Core State Standards: A Randomized Field Trial in Kindergarten and First Grade. Paper presented at the Society for Research on Educational Effectiveness Spring Conference, Washington, DC. Lara-Alecio, R., Tong, F., Irby, B. J., Guerrero, C., Huerta, M., & Fan, Y. (2012). The effect of an instructional intervention on middle school English learners' science and English reading achievement. Journal of Research in Science Teaching, 49(8), 987-1011. Lehrer, R. (2010). Assessing Data Modeling and Statistical Reasoning Project: IES Final Report. Lewis, C. C., & Perry, R. R. (2015). A randomized trial of lesson study with mathematical resource kits: Analysis of impact on teachers' beliefs and learning community. In Large-scale studies in mathematics education (pp. 133-158). New York, NY: Springer International Publishing.


Llorente, C., Pasnik, S., Moorthy, S., Hupert, N., Rosenfeld, D., & Gerard, S. (2015). Preschool Teachers Can Use a PBS KIDS Transmedia Curriculum Supplement to Support Young Children's Mathematics Learning: Results of a Randomized Controlled Trial. Paper presented at the Society for Research on Educational Effectiveness annual meeting, Washington, DC. Llosa, L., Lee, O., Jiang, F., Haas, A., O'Connor, C., Van Booven, C. D., & Kieffer, M. J. (2016). Impact of a large-scale science intervention focused on English language learners. American Educational Research Journal, 53(2), 395-424. Maerten-Rivera, J., Ahn, S., Lanier, K., Diaz, J., & Lee, O. (2016). Effect of a multiyear intervention on science achievement of all students including English language learners. The Elementary School Journal, 116(4), 600-624. Martin, T., Brasiel, S. J., Turner, H., & Wise, J. C. (2012). Effects of the Connected Mathematics Project 2 (CMP2) on the Mathematics Achievement of Grade 6 Students in the Mid-Atlantic Region: Final Report. Washington, DC: National Center for Education Evaluation and Regional Assistance. McCoach, D. B., Gubbins, E. J., Foreman, J., Rubenstein, L. D., & Rambo-Hernandez, K. E. (2014). Evaluating the efficacy of using predifferentiated and enriched mathematics curricula for grade 3 students: A multisite cluster-randomized trial. Gifted Child Quarterly, 58(4), 272-286. Miller, G., Jaciw, A., Ma, B., & Wei, X. (2007). Comparative effectiveness of Scott Foresman Science: A report of randomized experiments in five school districts. Palo Alto, CA: Empirical Education. Montague, M., Krawec, J., Enders, C., & Dietz, S. (2014). The effects of cognitive strategy instruction on math problem solving of middle-school students of varying ability. Journal of Educational Psychology, 106(2), 469. Mutch-Jones, Puttick, & Demers (2014, April). Differentiating Science Instruction: Teacher and Student Findings from the Accessing Science Ideas Project. Poster presented at the American Educational Research Association Annual Meeting, Philadelphia, PA. Newman, D., Finney, P., Bell, S. H., Turner, H., Jaciw, A. P., Zacamy, J. L., & Gould, L. F. (2012, February 7). Evaluation of the Effectiveness of the Alabama Math, Science, and Technology Initiative (AMSTI). Available at SSRN: https://ssrn.com/abstract=2511347 or http://dx.doi.org/10.2139/ssrn.2511347. Oh, Y., Lachapelle, C. P., Shams, M. F., Hertel, J. D., & Cunningham, C. M. (2016, April). Evaluating the efficacy of Engineering is Elementary for student learning of engineering and science concepts. Paper presented at the American Educational Research Association Annual Meeting, Washington, DC. Pane, J. F., Griffin, B. A., McCaffrey, D. F., & Karam, R. (2014). Effectiveness of Cognitive Tutor Algebra I at scale. Educational Evaluation and Policy Analysis, 36(2), 127-144.


Penuel, W. R., Bates, L., Pasnik, S., Townsend, E., Gallagher, L. P., Llorente, C., & Hupert, N. (2010, June). The impact of a media-rich science curriculum on low-income preschoolers' science talk at home. In Proceedings of the 9th International Conference of the Learning Sciences, Volume 1 (pp. 238-245). International Society of the Learning Sciences. Penuel, W. R., Gallagher, L. P., & Moorthy, S. (2011). Preparing teachers to design sequences of instruction in earth systems science: A comparison of three professional development programs. American Educational Research Journal, 48(4), 996-1025. Piasta, S. B., Logan, J. A., Pelatti, C. Y., Capps, J. L., & Petrill, S. A. (2015). Professional development for early childhood educators: Efforts to improve math and science learning opportunities in early childhood classrooms. Journal of Educational Psychology, 107(2), 407. Presser, A. L., Clements, M., Ginsburg, H., & Ertle, B. (2012). Effects of a preschool and kindergarten mathematics curriculum: Big Math for Little Kids. New York, NY: Center for Children and Technology. Retrieved from http://cct.edc.org/publications/effects-preschool-andkindergarten-mathematics-curriculum-big-math-little-kids-final Presser, A. L., Vahey, P., & Dominguez, X. (2015). Improving Mathematics Learning by Integrating Curricular Activities with Innovative and Developmentally Appropriate Digital Apps: Findings from the Next Generation Preschool Math Evaluation. Paper presented at the Society for Research on Educational Effectiveness, Washington, DC. Pyke, C., Lynch, S., Kuipers, J., Szesze, M., & Watson, W. (2005). Implementation study of The Real Reasons for Seasons (2004-2005): SCALE-uP Report No. 7. Washington, DC: George Washington University. Reid, E. E., Chen, J. Q., & McCray, J. (2014, March). Achieving High Standards for Pre-K—Grade 3 Mathematics: A Whole Teacher Approach to Professional Development. Paper presented at the Society for Research on Educational Effectiveness Spring Conference, Washington, DC. Resendez, M., & Azin, M. (2006). 2005 Prentice Hall Science Explorer randomized control trial. Pres Associates. Resendez, M., & Azin, M. (2008). A study of the effects of Pearson's 2009 enVision Math Program. Pres Associates. Rethinam, V., Pyke, C., & Lynch, S. (2008). Using multilevel analyses to study the effectiveness of science curriculum materials. Evaluation & Research in Education, 21(1), 18-42. Rimbey, K. A. (2013). From the common core to the classroom: A professional development efficacy study for the common core state standards for mathematics (Doctoral dissertation). Retrieved from ProQuest. (Order No. 3560379).


Roschelle, J., Shechtman, N., Tatar, D., Hegedus, S., Hopkins, B., Empson, S., ... & Gallagher, L. P. (2010). Integration of technology, curriculum, and professional development for advancing middle school mathematics: Three large-scale studies. American Educational Research Journal, 47(4), 833-878. Roth, K., Wilson, C., Taylor, J., Hvidsten, C., Stennett, B., Wickler, N., ... & Bintz, J. (2015, March). Testing the consensus model of effective PD: Analysis of practice and the PD research terrain. Paper presented at the International Conference of the National Association of Science Teacher Researchers, Chicago, IL. San Antonio, D. M., Morales, N. S., & Moral, L. S. (2011). Module-based professional development for teachers: A cost-effective Philippine experiment. Teacher Development, 15(2), 157-169. Santagata, R., Kersting, N., Givvin, K. B., & Stigler, J. W. (2010). Problem implementation as a lever for change: An experimental study of the effects of a professional development program on students' mathematics learning. Journal of Research on Educational Effectiveness, 4(1), 1-24. Sarama, J., Clements, D. H., Starkey, P., Klein, A., & Wakeley, A. (2008). Scaling up the implementation of a pre-kindergarten mathematics curriculum: Teaching for understanding with trajectories and technologies. Journal of Research on Educational Effectiveness, 1(2), 89-119. Sarama, J., Lange, A., Clements, D. H., & Wolfe, C. B. (2012). The impacts of an early mathematics curriculum on emerging literacy and language. Early Childhood Research Quarterly, 27, 489-502. doi: 10.1016/j.ecresq.2011.12.002 Saxe, G. B., & Gearhart, M. (2001). Enhancing students' understanding of mathematics: A study of three contrasting approaches to professional support. Journal of Mathematics Teacher Education, 4(1), 55-79. Scher, L., & O'Reilly, F. (2009). Professional development for K–12 math and science teachers: What do we really know? Journal of Research on Educational Effectiveness, 2(3), 209-249. Schneider, S. (2013). Final Report for IES R305A70105: Algebra Interventions for Measured Achievement – Full Year. San Francisco, CA: WestEd. Schneider, M. C., & Meyer, J. P. (2012). Investigating the efficacy of a professional development program in formative classroom assessment in middle school English language arts and mathematics. Journal of Multidisciplinary Evaluation, 8(17), 1-24. Schwartz-Bloom, R. D., & Halpin, M. J. (2003). Integrating pharmacology topics in high school biology and chemistry classes improves performance. Journal of Research in Science Teaching, 40(9), 922-938. Shannon, L., & Grant, B. (2012). A final evaluation report of Houghton Mifflin Harcourt's Holt McDougal Biology. Charlottesville, VA: Magnolia Consulting, LLC.

Shavelson, R. J., & Towne, L. (2004). What drives scientific research in education? APS Observer, 17(4).
Slavin, R. E., Lake, C., & Groff, C. (2009). Effective programs in middle and high school mathematics: A best-evidence synthesis. Review of Educational Research, 79(2), 839-911.
Sophian, C. (2004). Mathematics for the future: Developing a Head Start curriculum to support mathematics learning. Early Childhood Research Quarterly, 19(1), 59-81.
Sparks, D. (2002). Designing powerful professional development for teachers and principals. Oxford, OH: National Staff Development Council.
Star, J. R., Pollack, C., Durkin, K., Rittle-Johnson, B., Lynch, K., Newton, K., & Gogolen, C. (2015). Learning from comparison in algebra. Contemporary Educational Psychology, 40, 41-54.
Starkey, P., Klein, A., & DeFlorio, L. (2013). Changing the developmental trajectory in early math through a two-year preschool math intervention. Paper presented at the Society for Research on Educational Effectiveness annual meeting, Washington, DC.
Tanner-Smith, E. E., & Tipton, E. (2014). Robust variance estimation with dependent effect sizes: Practical considerations and a software tutorial in Stata and SPSS. Research Synthesis Methods, 5, 13-30.
Tatar, D., Roschelle, J., Knudsen, J., Shechtman, N., Kaput, J., & Hopkins, B. (2008). Scaling up innovative technology-based mathematics. The Journal of the Learning Sciences, 17(2), 248-286.
Tauer, S. (2002). How does the use of two different mathematics curricula affect student achievement? A comparison study in Derby, Kansas. Retrieved from http://www.cpmponline.org/pdfs/CPMP_Achievement_Derby.pdf
Taylor, J. A., Getty, S. R., Kowalski, S. M., Wilson, C. D., Carlson, J., & Van Scotter, P. (2015). An efficacy trial of research-based curriculum materials with curriculum-based professional development. American Educational Research Journal, 52(5), 984-1017.
Thompson, D. R., Senk, S. L., & Yu, Y. (2012). An evaluation of the Third Edition of the University of Chicago School Mathematics Project: Transition Mathematics. Chicago, IL: University of Chicago School Mathematics Project.
Vaden-Kiernan, M., Borman, G., Caverly, S., Bell, N., Ruiz de Castilla, V., Sullivan, K., & Rodriguez, D. (2016, March). Findings from a multi-year scale-up effectiveness trial of Everyday Mathematics. Paper presented at the Society for Research on Educational Effectiveness annual meeting, Washington, DC.

Van Egeren, L. A., Schwarz, C., Gerde, H., Morris, B., Pierce, S., Brophy-Herb, H., Lownds, N., Stein, M., & Stoddard, D. (2014, August). Cluster-randomized trial of the efficacy of early childhood science education with low-income children: Years 1-3. Poster presented at the 2014 Discovery Research K-12 PI Meeting, Arlington, VA.
Walsh-Cavazos, S. (1994). A study of the effects of a mathematics staff development module on teachers' and students' achievement (Doctoral dissertation). Retrieved from ProQuest. (Order No. 9517241)
Yoon, K. S., Duncan, T., Lee, S. W. Y., Scarloss, B., & Shapley, K. L. (2007). Reviewing the evidence on how teacher professional development affects student achievement (Issues & Answers Report, REL 2007-No. 033). Washington, DC: Regional Educational Laboratory Southwest.

Table 1. Categories and descriptions of codes.

PD Format
- Same-School Collaboration: Teachers participated in professional development with other teachers from their own school.
- Implementation Meetings: Teachers met formally or informally with other activity participants to discuss classroom implementation (e.g., a troubleshooting meeting).
- Online PD: Part or all of the professional development was conducted online.
- Summer Workshop: The professional development included a summer workshop.
- Expert Coaching: The professional development involved coaching or mentoring from experts (i.e., non-peers) who observed instruction and provided feedback (e.g., in a debriefing meeting; via video or live).
- PD Led by Researchers/Intervention Developers: The PD was led by the intervention developers and/or the study authors.

PD Focus
- Content-specific Instructional Strategies: The PD focused on instructional strategies specific to math or science teaching (e.g., mathematical discussions, scientific lab demonstrations).
- Generic Instructional Strategies: The PD focused on content-generic instructional strategies (e.g., improving classroom climate and student motivation).
- How to Use Curriculum Materials: The PD focused on how to use curriculum materials.
- Integrate Technology: The PD focused on how to integrate technology into the classroom.
- Content-specific Formative Assessment: The PD focused on formative assessment strategies specific to mathematics and science teaching (e.g., strategies to elicit student understanding of fractions or the scientific method).
- Improve Content Knowledge/Pedagogical Content Knowledge/How Students Learn: The PD focused on improving teachers' content knowledge and/or pedagogical content knowledge (e.g., how students learn mathematics or science).

PD Activities
- Observed Demonstration: Teachers observed a video or live demonstration or modeling of instruction.
- Video Focus: Teachers watched stock video related to teaching practice.
- Solved Problems: Teachers solved problems or exercises during the PD.
- Student Materials: Teachers worked through student materials during the PD.
- Developed Curriculum/Lesson Plans: Teachers developed curricula or lesson plans during the PD.
- Review Own Student Work: Teachers studied examples of their own students' work during the PD.
- Review Examples of Student Work: Teachers studied examples of other students' work (including watching videos of students).

Curriculum Materials
- Implementation Guidance: The curriculum materials provided teachers with implementation guidance (e.g., support for student-teacher dialogues around the content).
- Laboratory/Hands-on Experience, Curriculum Kits: The curriculum materials included materials or guidance supporting inquiry-oriented explorations (e.g., science laboratory or hands-on mathematics kits).
- Curriculum Dosage (minutes): Total number of minutes that the curriculum was intended to be used.
- Curriculum Proportion Replaced (pct): The proportion of each lesson's existing curriculum that the new materials were intended to replace.

Table 2. Results of estimating an unconditional meta-regression model with robust variance estimation (RVE).

                          Effect Size (Hedges’s g)
Constant                  0.215*** (0.027)
N effect sizes            261
N studies                 90

*p<0.10 **p<0.05 ***p<0.01. We assume the average correlation between all pairs of observed effect sizes within studies is 0.80.
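
The note beneath Table 2 summarizes the estimation approach: effect sizes within a study are assumed correlated at 0.80, and standard errors are clustered on study. As a rough illustration of that logic, the following is a simplified approximation of correlated-effects RVE weighting in the spirit of Tanner-Smith & Tipton (2014); the function and variable names are ours, and this is a sketch rather than the authors' code:

```python
from collections import defaultdict
from math import sqrt

def rve_mean(effects, variances, studies, rho=0.80):
    """Approximate correlated-effects RVE estimate of the mean effect size.

    effects/variances/studies are parallel lists; effect sizes sharing a
    study ID are assumed correlated at `rho`.
    """
    by_study = defaultdict(list)
    for i, s in enumerate(studies):
        by_study[s].append(i)
    # Weight each effect by 1 / (k * vbar * (1 + (k - 1) * rho)), so a study
    # contributing many correlated effects is not over-counted.
    weights = [0.0] * len(effects)
    for idx in by_study.values():
        k = len(idx)
        vbar = sum(variances[i] for i in idx) / k
        w = 1.0 / (k * vbar * (1 + (k - 1) * rho))
        for i in idx:
            weights[i] = w
    total_w = sum(weights)
    beta = sum(w * t for w, t in zip(weights, effects)) / total_w
    # Cluster-robust (sandwich) standard error: square the summed weighted
    # residuals within each study, then sum across studies.
    num = sum(sum(weights[i] * (effects[i] - beta) for i in idx) ** 2
              for idx in by_study.values())
    se = sqrt(num) / total_w
    return beta, se
```

With real data the weights would also incorporate an estimate of the between-study variance; dedicated tools such as the robumeta packages for R and Stata implement the full method.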

Table 3. Results of estimating a conditional meta-regression model with robust variance estimation (RVE), including controls for study design, study sample, subject area, and outcome measure type.

                                        Effect Size (Hedges’s g)
Between-study effects
  RCT                                   0.014 (0.109)
  State standardized test               -0.275*** (0.063)
  Other standardized test               -0.291*** (0.060)
  Grade: preschool                      0.146* (0.081)
  Effect size adjusted for covariates   -0.053 (0.070)
  Subject matter: math                  0.030 (0.049)
Within-study effects
  State standardized test               -0.300*** (0.072)
  Other standardized test               -0.239*** (0.053)
Constant                                0.345** (0.112)
N effect sizes                          261
N studies                               90

*p<0.10 **p<0.05 ***p<0.01. Following the recommendation of Tanner-Smith & Tipton (2014), we include the study-level mean value of each covariate. For the two covariates where there is within-study variability in at least 10 percent of studies (state standardized test, other standardized test), we also include a within-study version of the covariate that is calculated by subtracting the study-level mean values from the original covariate values. We assume the average correlation between all pairs of observed effect sizes within studies is 0.80.
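
The within/between covariate split described in the table note is study-mean centering: each covariate is replaced by its study-level mean (the "between" version) plus, where it varies within studies, the deviation from that mean (the "within" version). A minimal sketch, with a helper name of our own choosing:

```python
from collections import defaultdict

def split_within_between(values, studies):
    """Split a covariate into study-level means ('between') and deviations
    from the study mean ('within'), per Tanner-Smith & Tipton (2014).

    values/studies are parallel lists; returns two parallel lists.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, s in zip(values, studies):
        sums[s] += v
        counts[s] += 1
    means = {s: sums[s] / counts[s] for s in sums}
    between = [means[s] for s in studies]                       # study mean
    within = [v - means[s] for v, s in zip(values, studies)]    # deviation
    return between, within
```

In Table 3, only the two test-type indicators vary within at least 10 percent of studies, so only they receive a within-study version in the model.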

Table 4. Results of estimating conditional meta-regression models with robust variance estimation (RVE), including features of professional development focus as moderators.

                                                            Effect Size (Hedges’s g)
Between-study effects
  Content-specific instructional strategies                 -0.076 (0.131)    -0.147 (0.132)
  Generic instructional strategies                          -0.021 (0.083)    -0.077 (0.082)
  How to use curriculum materials                            0.154** (0.065)   0.152** (0.066)
  Integrate technology                                       0.230* (0.114)    0.162 (0.107)
  Content-specific formative assessment                      0.128* (0.066)    0.103 (0.063)
  Improve pedagogical content knowledge/how students learn   0.121** (0.047)   0.127** (0.048)
N effect sizes: 241 in each model
N studies: 84 in each model

*p<0.10 **p<0.05 ***p<0.01. Includes only studies and/or treatment arms with a professional development component. All models include controls for the following: RCT, state standardized test, other standardized test, grade: preschool, effect size adjusted for covariates, subject matter: math. Following the recommendation of Tanner-Smith & Tipton (2014), we include the study-level mean value of each covariate. For the two covariates where there is within-study variability in at least 10 percent of studies (state standardized test, other standardized test), we also include a within-study version of the covariate that is calculated by subtracting the study-level mean values from the original covariate values. We assume the average correlation between all pairs of observed effect sizes within studies is 0.80.
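
Each moderator in Tables 4 through 7 enters a meta-regression of this kind: the effect sizes are regressed on a moderator indicator (plus the listed controls), with standard errors clustered on study. Stripped down to a single moderator and no controls, the calculation looks roughly like the following; the function name is ours, and this is an illustrative sketch rather than the authors' code:

```python
from math import sqrt

def rve_moderator_slope(effects, weights, x, studies):
    """Weighted regression of effect size on one moderator, with a
    study-clustered sandwich standard error for the slope.

    effects/weights/x/studies are parallel lists; weights would come from
    an RVE weighting scheme such as the one sketched for Table 2.
    """
    total_w = sum(weights)
    xbar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    ybar = sum(w * yi for w, yi in zip(weights, effects)) / total_w
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - xbar) * (yi - ybar)
              for w, xi, yi in zip(weights, x, effects))
    b1 = sxy / sxx                 # moderator coefficient
    b0 = ybar - b1 * xbar          # intercept
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, effects)]
    # Sandwich variance: square the summed weighted score within each
    # study cluster, then sum over clusters.
    num = 0.0
    for s in set(studies):
        g = sum(w * (xi - xbar) * e
                for w, xi, e, sj in zip(weights, x, resid, studies) if sj == s)
        num += g ** 2
    se_b1 = sqrt(num) / sxx
    return b0, b1, se_b1
```

A positive `b1` for, say, the "how to use curriculum materials" indicator corresponds to the pattern in Table 4: programs with that focus show larger average effects, net of controls.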

Table 5. Results of estimating conditional meta-regression models with robust variance estimation (RVE), including features of professional development activities as moderators.

                                                Effect Size (Hedges’s g)
Between-study effects
  Observed demonstration                        0.052 (0.057)     0.071 (0.059)
  Video focus                                  -0.049 (0.082)    -0.152* (0.087)
  Solved problems                               0.119* (0.069)    0.111 (0.085)
  Worked through student materials              0.057 (0.063)     0.030 (0.071)
  Developed curriculum materials/lesson plans   0.081 (0.069)     0.041 (0.071)
  Review own student work                       0.014 (0.064)     0.024 (0.075)
  Review generic student work                   0.008 (0.095)    -0.002 (0.091)
N effect sizes: 241 241 241 241 241 240 241 240
N studies: 84 84 84 84 84 83 84 83

*p<0.10 **p<0.05 ***p<0.01. Includes only studies and/or treatment arms with a professional development component. All models include controls for the following: RCT, state standardized test, other standardized test, grade: preschool, effect size adjusted for covariates, subject matter: math. Following the recommendation of Tanner-Smith & Tipton (2014), we include the study-level mean value of each covariate. For the two covariates where there is within-study variability in at least 10 percent of studies (state standardized test, other standardized test), we also include a within-study version of the covariate that is calculated by subtracting the study-level mean values from the original covariate values. We assume the average correlation between all pairs of observed effect sizes within studies is 0.80.

Table 6. Results of estimating conditional meta-regression models with robust variance estimation (RVE), including features of professional development format as moderators.

                                Effect Size (Hedges’s g)
Between-study effects
  Same-school collaboration     0.047 (0.079)     0.050 (0.073)
  Implementation meetings       0.083 (0.057)     0.113** (0.052)
  Any online PD                -0.173*** (0.060) -0.165** (0.063)
  Summer workshop               0.108** (0.049)   0.091* (0.047)
  Expert coaching               0.025 (0.055)     0.061 (0.062)
  PD leaders: researchers      -0.012 (0.045)    -0.040 (0.043)
N effect sizes: 241 241 241 240 241 235 235
N studies: 84 84 84 83 84 81 81

*p<0.10 **p<0.05 ***p<0.01. Includes only studies and/or treatment arms with a professional development component. All models include controls for the following: RCT, state standardized test, other standardized test, grade: preschool, effect size adjusted for covariates, subject matter: math. Following the recommendation of Tanner-Smith & Tipton (2014), we include the study-level mean value of each covariate. For the two covariates where there is within-study variability in at least 10 percent of studies (state standardized test, other standardized test), we also include a within-study version of the covariate that is calculated by subtracting the study-level mean values from the original covariate values. We assume the average correlation between all pairs of observed effect sizes within studies is 0.80.

Table 7. Results of estimating conditional meta-regression models with robust variance estimation (RVE), including characteristics of interventions involving new curriculum materials as moderators.

                                                    Effect Size (Hedges’s g)
Between-study effects
  Implementation guidance                           0.094 (0.057)    0.088 (0.059)
  Laboratory/hands-on experience, curriculum kits  -0.060 (0.064)   -0.051 (0.062)
  Curriculum dosage (10 hours)                     -0.003 (0.004)   -0.000 (0.004)
  Curriculum proportion replaced (0.00-1.00)       -0.044 (0.157)
N effect sizes: 204 204 204 203 204 204
N studies: 76 76 76 75 76 76

*p<0.10 **p<0.05 ***p<0.01. Includes only studies and/or treatment arms that included new curriculum materials. All models include controls for the following: RCT, state standardized test, other standardized test, grade: preschool, effect size adjusted for covariates, subject matter: math. Following the recommendation of Tanner-Smith & Tipton (2014), we include the study-level mean value of each covariate. For the two covariates where there is within-study variability in at least 10 percent of studies (state standardized test, other standardized test), we also include a within-study version of the covariate that is calculated by subtracting the study-level mean values from the original covariate values. We assume the average correlation between all pairs of observed effect sizes within studies is 0.80.