
PEABODY JOURNAL OF EDUCATION, 81(4), 23-42. Copyright © 2006, Lawrence Erlbaum Associates, Inc.

Narrowing in on Educational Resources That Do Affect Student Achievement

Sarah Archibald
Consortium for Policy Research in Education
University of Wisconsin-Madison

In an era dominated by issues of school finance adequacy, it seems particularly important to provide evidence that, despite a number of claims to the contrary, educational resources are indeed positively related to improved student achievement. One of the hypotheses of this article is that expenditures per pupil must be disaggregated into more meaningful categories to discern the relationship between resources and student achievement. To explore this question, this article uses data from the Washoe County School District in Reno, Nevada, which reports its school expenditures using a program called InSite. This program disaggregates expenditures into 4 categories: instruction, instructional support, leadership, and operations and maintenance. This school-level variable is the primary explanatory variable in this covariate adjustment model using a 3-level hierarchical linear modeling analysis of students (approximately 7,000) nested in classrooms (approximately 420) nested in schools (approximately 55). This model also includes a number of contextual and school compositional factors that research tells us affect student achievement, including student demographic characteristics and pretest score; teacher experience, education, and a measure of his or her instructional practice; and school size, school-level poverty, and expenditures broken out into 4 categories: instruction, instructional support, leadership, and operations and maintenance. The results show that expenditures for instruction and instructional support were positively related and statistically significant for the reading achievement of 3rd, 4th, 5th, and 6th graders in the 2002-03 school year.

The research reported in this article was supported by a grant from the U.S. Department of Education, Office of Educational Research and Improvement, National Institute on Educational Governance, Finance, Policymaking and Management, to the Consortium for Policy Research in Education (CPRE) and the Wisconsin Center for Education Research, School of Education, University of Wisconsin-Madison (Grant No. OERI-R308A60003). The opinions expressed are those of the author and do not necessarily reflect the view of the National Institute on Educational Governance, Finance, Policymaking and Management, Office of Educational Research and Improvement, U.S. Department of Education, the institutional partners of CPRE, or the Wisconsin Center for Education Research.

Correspondence should be sent to Sarah Archibald, Consortium for Policy Research in Education, University of Wisconsin-Madison, 1025 West Johnson Street, Madison, WI 53706. E-mail: [email protected]

Across the nation, state standards-based reform efforts are underway to raise the level of student achievement, particularly the achievement of students below state-designated proficiency levels, many of whom are students in poverty. With the passage of the No Child Left Behind Act of 2001, the federal government has also become more deeply involved in seeking to boost student achievement, again particularly students from poverty and other conditions often associated with low performance (e.g., English language learners and students with disabilities). At the same time, an increasing number of states (e.g., Arizona, Arkansas, Delaware, Georgia, North Carolina, and Wyoming) are facing court mandates stating that the level of resources provided to their schools is not adequate. Within this policy context, it seems more important than ever to show that, controlling for as many of the factors as possible that research tells us influence student learning, the level of resources available to schools makes a difference in how much students learn.

Many studies have researched the question of whether the level of resources influences the level of student learning. The results of these studies have been mixed, which is not surprising given that the methodologies (level of analysis, included variables) differed as well. In their review of this literature, Hedges, Laine, and Greenwald (1994) argued that, despite Hanushek's (1989) claim to the contrary, more of these studies than not showed a positive correlation between level of resources and the level of student learning. These scholars (and others) have debated this issue numerous times and have not come to an agreement about the effects of resources on student learning. Although this particular study cannot resolve this debate, it can provide evidence of a positive relationship between resources for instruction and student learning from a more comprehensive model than many previous analyses used.


The impetus for this study was the development of a conceptual framework for a three-level, fully specified model of student achievement where students are nested in classrooms nested in schools, in which all of these levels had controls for what theory says may affect student learning (see Odden, Borman, & Fermanich, 2004). In particular, Odden et al. argued that any model for which the dependent variable is student learning should include a measure of what is being taught in the classroom. Because the typical teacher characteristics included in such models (years of experience and level of education) have not been found to influence how much students learn, they theorized that another measure of instruction or the quality of instruction was necessary to provide a fully specified model. The first product, Mark Fermanich's (2003) dissertation, used this framework to analyze classroom and school factors that affected student learning gains in mathematics in Minneapolis.

Ongoing Consortium for Policy Research in Education (CPRE) research in Washoe County (Reno), Nevada, provided another opportunity to apply this conceptual framework. Specifically, CPRE teacher compensation researchers Tony Milanowski and Steve Kimball (among others) have been tracking the association between a teacher's score on a standards-based evaluation and his or her students' gains on standardized tests. The standards-based teacher evaluation score is much more comprehensive than the typical principal evaluation, involving multiple observations of classroom practice. (See Appendix A for more details.) Over the course of their studies in multiple sites (including Washoe), the results of their two-level hierarchical linear models (HLMs) have consistently found that teacher performance (as measured by the teacher's score on a standards-based evaluation system) is positively and often statistically significantly related to student achievement (Milanowski, 2004; Milanowski & Kimball, 2005; Milanowski, Kimball, & Odden, 2005).

This finding is in line with numerous studies in the past 10 years that have shown that teachers are an important factor influencing student achievement (Ferguson, 1998; Goldhaber, 2002; Sanders, 2000). In light of the conceptual framework described earlier, one of the logical next steps seemed to be adding a third level of analysis, specifying school-level factors that might also influence student achievement, both to confirm the previous finding that the evaluation score is related to student achievement and to test which school-level factors might also play a role in this relationship. It is within this context, then, that my interest in the relationship between school-level resources and student achievement, in the context of a fully specified model, arose; that is the primary concern of this article.


This article is organized in the following manner. The first section gives the conceptual framework for the study, which ends with the research questions that this study addresses. The next section presents the methods, describing the data and statistical models used. The third section presents the results of the analyses. The article concludes with a synthesis of the findings and their related policy implications and some suggestions for further research.

Conceptual Framework

Although there are many ways to judge the success of a teacher or a school, today's standards-based accountability systems focus most on student performance on standardized tests. In this article I do not argue the relative merits of such policies but rather use them as a pragmatic starting point for an analysis designed to give insight into possible policy levers that may influence student achievement on standardized tests. In addition, given the current policy context cited earlier and the fact that many districts and states are focusing a lot of their resources on the testing requirements of the No Child Left Behind Act of 2001 as well as state standards-based reforms, it seems relevant to study, if indirectly, how these expenditures influence outcomes on standardized tests.

As Fermanich (2003) summarized, prior research on the effects of schools and teachers on student achievement includes three main types of studies: production function studies, effective schools studies, and school-effects/teacher-effects studies. Production function studies such as those reviewed by Hanushek (1989) and rereviewed by Hedges et al. (1994) found, at best, a questionable link between resources and student learning. However, these studies tend not to include fully specified models of how learning transpires in a classroom. The second group, effective schools studies, shows that effective schools tend to have certain characteristics, but these are not necessarily linked to the level of resources in a school. The third group, school- and teacher-effects studies, uses regression analysis to show how various characteristics of schools and teachers are related to student-level outcomes, including achievement. Most of these studies, however, have not looked directly at the effect of school-level resources. This study looks both directly at this issue and at some of the other variables found to influence student achievement, cited in the studies discussed next.

Some studies have shown that after accounting for student background characteristics, the largest portion of the remaining unexplained variance is due to the characteristics of the classroom teacher (Sanders, 2000; Sanders & Rivers, 1996). Much of the research on teacher effects has focused on experience, education, certification, ability, and teacher evaluation score, with mixed findings on the impact of all of these factors except standards-based teacher evaluation score, which is consistently positive (Darling-Hammond, 2000; Gallagher, 2004; Hanushek, 1992; Hanushek, Kain, O'Brien, & Rivkin, 2005; Kimball, White, Milanowski, & Borman, 2004; Milanowski & Kimball, 2005; Wayne & Youngs, 2003).

At the school level, some studies have shown a negative relationship between the size of a school and student achievement, suggesting that smaller schools may be more conducive to learning (Andrews, Duncombe, & Yinger, 2002). A number of studies have analyzed the relationship between per-pupil spending and student achievement, with the majority showing no relationship (Hanushek, 1989), although many of those models used a district-level measure of per-pupil spending rather than an actual measure of expenditures per student at the school level. For a more extensive summary of the variables at the student, classroom, and school levels that affect student learning gains, see Odden et al. (2004).

This study incorporates many of these contextual variables and focuses in particular on the relationship between per-pupil expenditures and student achievement. Therefore, these three research questions are considered in this article:

1. Is there significant variation among per-pupil expenditures at the school level, sufficient to allow the detection of a relationship between school-level per-pupil expenditures and student achievement, should one exist?

2. In the context of a three-level model with controls for student-, teacher-, and school-level characteristics, is there a positive relationship between per-pupil spending (at the school level) and student achievement?

3. Does it make a difference in the magnitude or direction of the relationship to separate per-pupil spending into different categories that more directly reflect what the money is spent on?

To address these questions, the study uses a covariate adjustment model with three nested levels of data: students nested in classrooms, nested in schools. The dependent variable is the student-level posttest score, and because of this article's particular interest in the effect of fiscal variables, the primary explanatory variable of interest is the school-level per-pupil spending variable and its various components. This type of model is called a covariate adjustment model because the measure of student growth is not actually a gain score or a true measure of growth; it simply uses the student pretest score, an indicator of student achievement status, as a first-level predictor. The outcome variable thus specifies the extent to which the student's achievement status at the time of the posttest, controlling for student characteristics, classroom/teacher characteristics, and school-level characteristics, differs from the expected score. In the next section I outline the method in more detail.

Method

This study uses data from elementary¹ schools in the Washoe County, Nevada, school district. This district serves over 60,000 students residing in Reno, Sparks, and outlying communities. In 2000, Washoe implemented a standards-based teacher evaluation system closely modeled on Danielson's (1996) framework for teaching (see also Kimball, 2002, for a description of the program) and soon after began a research relationship with CPRE, sharing evaluation and test score data in exchange for an ongoing analysis of their teacher evaluation system (e.g., Kimball et al., 2004). Because the teacher evaluation score is part of what makes this study different from others that have looked at how resources affect achievement, this evaluation system is also detailed in Appendix A.

Measures

In the following sections I discuss the student, teacher/classroom, and school measures used in this study. Appendix C includes descriptive statistics for the measures used at each level of the HLM.

Student measures. Student demographic data collected and maintained by the district were used to construct a series of dummy variables describing student background. These variables include minority status, gender, participation in special education, and participation in free or reduced-price lunch. Test score data for reading and mathematics from the 2002-03 school year were obtained from the district; students' pretest score was used as a control variable in this analysis, and posttest score was used as the outcome variable. Table 1 shows the tests used as pretest and posttest scores for the grades included in this analysis. With one exception (fourth grade, where the Terra Nova was issued in both fall and spring and could serve as both pretest and posttest), a different test was used for the pretest and posttest scores. The district's use of different tests in different grades, combined with the need for the maximum possible sample size, led to the decision to combine Grades 3 through 6 for this analysis. To allow combination of student samples across grades, student test scores were transformed into z scores. This means that interpretations of the results are limited to relative, and not absolute, changes in student achievement.

Table 1
Student Achievement Measures by Grade for 2002-03

Grade   Pretest                             Posttest
3       Second-grade district CRT           Third-grade state CRT
4       Terra Nova fall                     Terra Nova spring
5       Terra Nova spring of fourth grade   Fifth-grade state CRT
6       Fifth-grade state CRT               Sixth-grade district CRT

Note. CRT = Criterion-Referenced Test.

¹Only elementary schools are included in this analysis because of a lack of available test data to use the pretest-posttest design at the secondary level.
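The within-grade z-score transformation described above can be sketched as follows. The scores here are invented for illustration; the district's actual computation is not shown in the article:

```python
from statistics import mean, stdev

# Hypothetical raw posttest scores, keyed by grade.
raw_scores = {
    3: [410.0, 455.0, 430.0, 470.0],
    4: [620.0, 600.0, 640.0, 660.0],
}

# Standardize within grade: z = (score - grade mean) / grade SD.
# Each grade then has mean 0 and SD 1, so grades can be pooled, at the cost
# that results describe relative rather than absolute achievement.
z_scores = {}
for grade, scores in raw_scores.items():
    m, s = mean(scores), stdev(scores)
    z_scores[grade] = [(x - m) / s for x in scores]
```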

Teacher measures. For a measure of teacher instructional practice, I used the teacher's standards-based evaluation score derived from the district's performance-based evaluation system. This score is used as a measure of teacher quality; previous CPRE research, using a two-level HLM analysis, found positive, statistically significant relationships between this measure of teacher quality/performance and student achievement (Kimball et al., 2004; Milanowski & Kimball, 2005; see also Appendix A). Also included at the teacher level is a dummy variable indicating whether the student's teacher has a master's degree. Although many studies show no effect of teacher education, in a review of production function studies, Greenwald, Hedges, and Laine (1996) found stronger relationships between teacher education and student achievement when education was coded as whether teachers have a master's degree. A number of studies have also been conducted to test whether and how teacher experience is related to student achievement. Most studies have found that having some experience has an impact on student achievement (Hanushek, 1992, 1997; Rowan, Correnti, & Miller, 2002), but the benefits are usually realized after the first few years in the classroom (Hanushek, Kain, & Rivkin, 1998; Murnane, 1983; Rockoff, 2004). Because research is not unanimous about how much experience is necessary for the positive relationship between teacher experience and student achievement and the point at which it tapers off, I simply included experience as a continuous variable indicating each teacher's step on the pay scale. I chose to include these variables, although they were not included in the two-level models of the same district that Milanowski and Kimball have calculated, because I wanted to model the effect of the teacher evaluation score net of any effect of experience or graduate work.

School measures. At the school level, school size as indicated by total enrollment is included to test its relationship to student achievement. The model also includes a per-pupil spending figure from the Nevada state report card Web site (Nevada Department of Education, 2004-2005). These fiscal data are available broken into four categories: instruction, instructional support, leadership, and operations. These categories are specified in more detail in Appendix B. Because my theory is that some expenditures are more likely to influence student achievement than others, I test the per-pupil figure first as a whole and then as an indication of spending on instruction and instructional support only. Research has also shown that the overall socioeconomic level of a school can affect student learning (Borman & Dowling, 2003; Jencks & Mayer, 1990; Jencks & Phillips, 1998). Accordingly, I included the percentage of students qualifying for free or reduced-price lunch.

Sample

The data used to answer the preceding research questions come from Grades 3 through 6 in the 2002-03 school year. For the most part, the data were either provided directly by the district or obtained from the Washoe County School District Web site (http://www.washoe.k12.nv.us/district/accountability/). Most of the work involved in preparing the data for the students and teachers used in this analysis was conducted by Anthony Milanowski and Steve Kimball, with additional assistance from Michael Goetz.

HLM Models

I conducted the analysis using a three-level model estimated with the HLM software developed by Bryk and Raudenbush (1988). This software enables researchers to more easily parse the variance that occurs within classrooms, mostly tied to student characteristics; the variance that lies between classrooms, mostly tied to teacher characteristics; and the variance that lies between schools, mostly tied to school characteristics or characteristics of the students and/or teachers as a whole. Some of the variation at each of the levels, as can be observed in the models that follow, is due to randomness, or error, but this program also allows researchers to partition the error by level. In addition, this model examines only fixed effects, meaning that the intercept for each variable is allowed to vary, but the slope is not.

The analysis uses a database from Washoe containing over 14,000 student records, 666 teachers, and 60 schools. However, only students who could be matched to teachers who could in turn be matched to schools could be included in the HLM analysis, which reduced the sample to 7,601 student records, 421 teachers, and 53 schools. Some reasons for missing data include students with posttests but no pretests and teachers who were not evaluated in the year in question (and thus none of those teachers' students could be included either). Although a full analysis of missing data has not been conducted, I do not believe that it biases the results. It simply limits the number of cases and, therefore, the power of the analysis.
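The matching requirement works like a chain of inner joins: a student record survives only if its teacher was evaluated and that teacher's school has the needed data. A schematic sketch with hypothetical IDs, not the district's actual file structure:

```python
# Hypothetical linkage tables.
student_teacher = {"s1": "t1", "s2": "t1", "s3": "t9", "s4": "t2"}  # student -> teacher
teacher_school = {"t1": "schA", "t2": "schB"}  # teachers with evaluation scores
schools_with_data = {"schA", "schB"}           # schools with expenditure data

# Keep a student only if the full student -> teacher -> school chain resolves.
# Here s3 is dropped because teacher t9 was not evaluated that year.
analyzable = {
    s for s, t in student_teacher.items()
    if t in teacher_school and teacher_school[t] in schools_with_data
}
```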

Level 1 Model

Yz = π0 + π1*(FRL) + π2*(FEMALE) + π3*(MINORITY) + π4*(SPECED) + π5*(ZPRE) + e

where

Yz is the model's standardized estimate of the student's posttest score, holding the other characteristics constant.²

FRL indicates whether the student participates in the free or reduced-price lunch program (1) or not (0).

FEMALE indicates whether the student is male (0) or female (1).

MINORITY indicates whether the student is a member of a minority group (1) or not (0).

SPECED indicates whether the student qualifies for special education services (1) or not (0).

ZPRE is the standardized pretest score for the student for whom the model is predicting the posttest score.

Level 2 Model

π0 = β00 + β01*(MA) + β02*(ZEXP) + β03*(ZPERFAVG) + r0

where

MA indicates whether the teacher has a master's degree (1) or not (0).

ZEXP is a standardized variable indicating the teacher's step on the pay scale (as a proxy for years of experience).

²Because the outcome variable is standardized, there is no need to put in a dummy for grade level at the student level. Each student's grade has a mean of 0.


ZPERFAVG is a standardized variable indicating the teacher's performance evaluation score on the district's standards-based teacher evaluation system.

Level 3 Model

β00 = γ000 + γ001*(ZPERFRL) + γ002*(ZPERPUPI) + γ003*(ZSCHSIZE) + u00

where

ZPERFRL is a standardized variable indicating the percentage of students participating in the free or reduced-price lunch program in a given school.

ZPERPUPI is a standardized variable that indicates how much the district spends per pupil at the school where the teacher teaches and the student attends.

ZSCHSIZE is a standardized variable indicating the number of students enrolled at the school in question.

In the second iteration, using a different specification of per-pupil spending based on a combination of expenditures for instruction and instructional support, I estimate the following model:

β00 = γ000 + γ001*(ZPERFRL) + γ002*(ZINSTPP) + γ003*(ZSCHSIZE) + u00

where

ZINSTPP is a standardized variable that indicates how much the district spends per pupil on instruction and instructional support at the school (see Appendix B for more information on what these categories include).
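Substituting the Level 3 equation into Level 2, and Level 2 into Level 1, yields a single prediction for a student's standardized posttest score. A worked illustration, using the reading coefficients later reported in Table 3, with hypothetical student, teacher, and school z-scores and all error terms set to zero:

```python
# School level (Level 3): intercept built from school characteristics.
# Coefficients are the reading estimates from Table 3; predictor values are invented.
b00 = -0.01 + 0.06 * 1.0 + (-0.13) * 0.5 + (-0.03) * (-1.0)  # intercept, spending, poverty, size

# Teacher level (Level 2): adjust for teacher characteristics.
p0 = b00 + (-0.01) * 0 + (-0.01) * 0.2 + 0.04 * 1.5  # no MA, experience, evaluation score

# Student level (Level 1): add student characteristics and the pretest.
predicted_post = (p0
                  + (-0.08) * 1       # FRL participant
                  + (-0.02) * 0       # male
                  + (-0.11) * 1       # minority
                  + (-0.37) * 0       # not in special education
                  + 0.61 * (-0.5))    # pretest half an SD below the mean
```

For this hypothetical student the predicted posttest lands a bit more than 0.4 SD below the mean.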

The following section gives the results of the HLM models from 2002-03.

Results

To simplify interpretations of the output, because test scores were standardized across grades, all the continuous variables in the data set at the student, teacher, and school levels have been standardized, meaning that the regression coefficients provide effect size estimates for each variable.

Addressing the first research question, Table 2 shows the variance decomposition between the different levels of the model. There was significant variation at all three levels of the empty model for both reading and math. For reading, about 82% of the variation in posttest scores occurs at the student level, 4% is between classrooms, and 16% is between schools. For mathematics, approximately 74% occurs at the student level, 8% between classrooms, and 19% between schools. To give some means of comparison, in their analysis of Prospects data for elementary schools, Rowan et al. (2002) found that after controlling for student background and prior achievement, the classrooms to which students were assigned account for between 4% and 18% of the variance in students' cumulative achievement status in a given year, which translates into a d-type effect size of .21 to .42. At 4% for reading and 8% for math, the data for this study were at the low end of the typical variation at the classroom level.

Table 2
Variance Decomposition for Reading and Mathematics, 2002-2003

                   Reading                          Mathematics
        Within     Between    Between    Within     Between    Between
        Classroom  Classroom  School     Classroom  Classroom  School
Empty   .82        .04        .16        .74        .08        .19
Full    .43        .03        .06        .43        .07        .01

Note. The empty models include only intercepts at each level. All table values are significant at p < .05.

Table 2 also illustrates the extent to which the fully specified model can explain the variation that exists among students, classrooms, and schools. From this comparison, one can see that these models explain a large percentage of the variation at the student level and at the school level, but they do not do a particularly good job of explaining variation at the classroom level. However, there was relatively little variance to be explained at the classroom level using these data, making any results harder to detect.
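The decomposition in Table 2 amounts to expressing each level's variance component as a share of the total, as in this sketch using the empty-model reading components:

```python
# Empty-model variance components for reading (Table 2).
components = {"student": 0.82, "classroom": 0.04, "school": 0.16}

total = sum(components.values())

# Share of total variance attributable to each level.
shares = {level: v / total for level, v in components.items()}
```

Because the published components are rounded and do not sum exactly to 1, the computed shares (roughly 80%, 4%, and 16% for reading) differ slightly from the percentages quoted in the text.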

Table 3 gives the results of the HLM analyses. It is important to keep in mind when viewing the table that the standardization of variables means that the coefficients can be treated as effect sizes. In both reading and math, the signs are as expected for the student-level characteristics: being a minority, poor, or eligible for special education all have a negative, statistically significant effect on a student's test score. The coefficient for female was negative and not statistically significant for math or reading; we would not necessarily expect gender to play a role in this analysis.

Table 3
Three-Level Fixed Effects Estimates for Reading and Math Posttests, 2002-2003

                                 Reading           Mathematics
Variables                    Coefficient   SE   Coefficient   SE
Intercept                       -.01      .02      -.01      .01
Student characteristics
  Low income                    -.08*     .02      -.06*     .02
  Female                        -.02      .02      -.003     .02
  Minority                      -.11*     .02      -.13*     .02
  Special education             -.37*     .03      -.42*     .03
  Pretest score                  .61*     .01       .53*     .02
Teacher characteristics
  Master's degree               -.01      .03       .01      .04
  Evaluation score               .04*     .01       .04*     .02
  Step/years of experience      -.01      .01      -.01      .02
School characteristics
  Per-pupil spending             .06*     .02       .01      .03
  Poverty                       -.13*     .02      -.17*     .03
  School size                   -.03*     .02      -.07*     .03

Note. Coefficients can be interpreted as effect sizes because all variables are standardized.
*p < .05.

At the teacher or classroom level, the effect of the teacher performance score (my measure of teacher quality) is positive and statistically significant in both reading and math. Particularly when one considers how little variation occurs at the classroom level, this result is worth emphasizing: within this small amount of variation is an effect that cannot be controlled away, namely the teacher responsible for the students' learning. The other measures of teacher quality, which many previous studies have found not to be significant factors, produced similar results in this study. Neither the dummy variable indicating that the teacher had a master's degree nor the variable indicating the teacher's step or years of experience was positively related to achievement or statistically significant.

At the school level, school size and school-level poverty had negative, statistically significant impacts on both math and reading. Also, addressing the second research question, per-pupil spending was positively related to achievement in math and reading, and the result was statistically significant for reading. This finding may be used as evidence that resources do in fact matter when the goal is to improve student achievement on standardized tests. Because this model used a school-level per-pupil spending figure within a nested structure, it provides a better estimate of the effect of resources on achievement than many previous studies that used data aggregated to the district level.


In terms of the third research question, the results did not confirm the theory that some resources would affect achievement more than others. In the second iteration of fully specified models, where the per-pupil spending variable included only spending on instruction and instructional support, the result was exactly the same for reading, but in math the coefficient changed from positive (.01, with a standard error of .03) to negative (-.01, with a standard error of .04). Neither result was statistically significant in math. Part of the reason for the lack of a significant finding may be that there was less variation in the variable when only expenditures for instruction and instructional support were included. For more on this topic, please see Appendix B.

The following section provides some additional discussion of the analyses presented here, including a discussion of areas where further research is needed.

Discussion and Further Research

These analyses confirm what prior CPRE research has shown: Teacher performance as measured in a standards-based teacher evaluation system is positively related to student achievement. This study goes further by showing that the finding holds when school-level explanatory variables are added, suggesting that the evaluation score is not just a proxy for something missing from the model. As discussed earlier, the fact that there is relatively little variation at the classroom level in this data set only strengthens this finding.

Perhaps more important, this analysis has gone further by exploring how school-level factors can affect student achievement as measured by standardized tests. In particular, the finding that per-pupil spending at the school level is positively and statistically significantly related to student achievement in reading provides evidence that resources for education do matter. A possible explanation for a statistically significant effect in reading but not in math, corroborated by a follow-up call to the district, is that the district was directing more resources toward literacy instruction in the 2002-03 school year.

This study also offers strong evidence that student background characteristics matter, not only at the student level but at the school level as well. After accounting for all of the socioeconomic and prior achievement indicators at the student level, and controlling for teacher background characteristics, the results show that other factors at the school level play a significant role in determining how a student performs on a standardized test. These contextual effects, strongest for school-level poverty, have a statistically significant, negative effect of similar magnitude for both reading and math. This provides further evidence that school poverty concentration affects students' opportunities to learn. Qualitative research by Gee (1999) and others has identified the positive effects that exposure to peers with higher vocabulary skills has on students with lower level skills, which may be one of the reasons why school-level poverty matters. In addition, the analysis found a negative, statistically significant relationship between school size and student outcomes on standardized tests in both reading and math.

The analyses in this article represent a beginning exploration of a three-level model investigating the impacts of various factors at the student, teacher, and school levels on student achievement, with a particular emphasis on the level of resources available at the school level. As such, extensions to this study could be made at all three levels that would provide better information about this complex relationship.
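The three-level structure described above can be written in generic hierarchical linear model notation. The following is an illustrative sketch of this class of model, not the study's exact specification; predictor names are stand-ins for the variables discussed in the text:

```latex
% Level 1 (student i, in classroom j, in school k):
Y_{ijk} = \pi_{0jk} + \pi_{1jk}\,(\text{pretest})_{ijk} + e_{ijk}

% Level 2 (teachers/classrooms): the classroom intercept varies with
% the teacher's evaluation score:
\pi_{0jk} = \beta_{00k} + \beta_{01k}\,(\text{evaluation score})_{jk} + r_{0jk}

% Level 3 (schools): the school intercept varies with school-level
% predictors such as per-pupil spending, poverty, and size:
\beta_{00k} = \gamma_{000} + \gamma_{001}\,(\text{per-pupil spending})_{k} + u_{00k}
```

In this nesting, the school-level spending coefficient is estimated against between-school variation only, which is what distinguishes this design from studies using district-aggregated data.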

At the student level, one of the most important corrections to be made may be adjusting for the measurement error present in the student's pretest score. This is one of the flaws inherent in a post-on-pre model such as this one: the model estimates the posttest score as if the pretest score were measured without error, but no pretest is error free. By correcting for the test's inherent error, the model will yield more reliable overall results.
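One standard way to make such a correction, sketched here for illustration only (it is not the method used in this article, and applies cleanly only to a single error-prone predictor), is the classical disattenuation formula, which divides the observed slope by the pretest's reliability:

```python
# Illustrative sketch: correcting a pretest slope for measurement error
# ("disattenuation"). All numbers below are hypothetical.

def disattenuate_slope(observed_slope: float, reliability: float) -> float:
    """Classical errors-in-variables correction for one predictor.

    Measurement error in the pretest biases its slope toward zero by a
    factor equal to the test's reliability (true-score variance divided
    by observed-score variance), so dividing by the reliability recovers
    an estimate of the error-free slope.
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return observed_slope / reliability

# e.g. an observed pretest slope of .60 with a test reliability of .80
corrected = disattenuate_slope(0.60, 0.80)
print(round(corrected, 2))  # 0.75
```

With multiple correlated predictors or a multilevel structure, the correction is more involved, but the direction of the bias is the same: ignoring pretest error understates the pretest effect and can distort the other coefficients.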

At the teacher level, because the results show that some of the variation in scores lies between classrooms, it would be interesting to model teacher performance over time and contrast a value-added measure with the evaluation score to see the story each has to tell about teacher performance over time, or about teacher performance with particular students or types of students. Because significant Level 2 variance remains to be explained, I will want to further explore the data and possibly add other variables or categorize the information differently. Other teacher-level factors that research has identified but that this model does not include are class size (Finn & Achilles, 1990; Grissmer, 1999) as well as additional classroom composition variables to test for the peer effects that play a part in student achievement (Hoxby, 2000). More data on classroom composition would help to explore these issues.

Because one of the goals of CPRE research has been to determine the allocation of resources that yields the highest gains in student test scores, it would also be helpful to have more detailed data reflecting the use of resources in the school, such as professional development expenditures per teacher.

More work remains to be done in investigating some of the results uncovered in this analysis, but this study represents a contribution to the literature on whether resources matter for student achievement, the importance of defining teacher quality in terms other than experience and level of education if one is to properly estimate the magnitude of the teacher effect, the possible negative outcomes for students in large schools, and more. In this era of school finance adequacy court cases and testing frenzies, these findings are pertinent to numerous policy discussions.

References

Andrews, M., Duncombe, W., & Yinger, J. (2002). Revisiting economies of size in American education: Are we any closer to a consensus? Economics of Education Review, 21, 245-262.

Borman, G. D., & Dowling, N. M. (2003, April). Schools and inequality: A multilevel analysis of Coleman's equality of educational opportunity data. Paper presented at the annual meeting of the American Educational Research Association, Chicago.

Borman, G. D., & Kimball, S. M. (2005). Teacher quality and educational quality: Do teachers with higher standards-based evaluation ratings close student achievement gaps? Elementary School Journal, 106(1), 3-20.

Bryk, A., & Raudenbush, S. (1988). Toward a more appropriate conceptualization of research on school effects: A three-level hierarchical linear model. American Journal of Education, 97, 65-108.

Danielson, C. (1996). Enhancing professional practice: A framework for teaching. Alexandria, VA: Association for Supervision and Curriculum Development.

Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence [Electronic version]. Education Policy Analysis Archives, 8(1).

Ferguson, R. (1998). Teachers' perceptions and the Black-White test score gap. In C. Jencks & M. Phillips (Eds.), The Black-White test score gap (pp. 273-317). Washington, DC: Brookings Institution.

Fermanich, M. (2003). School resources and student achievement: The effect of school-level resources on instructional practices and student outcomes in Minneapolis public schools. Unpublished doctoral dissertation, University of Wisconsin-Madison.

Finn, J., & Achilles, C. (1990). Answers and questions about class size: A statewide experiment. American Educational Research Journal, 27, 557-577.

Gallagher, H. A. (2004). Vaughn Elementary's innovative teacher evaluation system: Are teacher evaluation scores related to growth in student achievement? Peabody Journal of Education, 79(4), 79-107.

Gee, J. P. (1999). Critical issues: Reading and the new literacy studies-Reframing the National Academy of Sciences report on reading. Journal of Literacy Research, 31, 355-374.

Goldhaber, D. (2002). The mystery of good teaching: Surveying the evidence on student achievement and teachers' characteristics. Education Next, 2, 50-55.

Greenwald, R., Hedges, L., & Laine, R. (1996). The effect of school resources on student achievement. Review of Educational Research, 66, 361-396.

Grissmer, D. (Ed.). (1999). Class size: Issues and new findings [Special issue]. Educational Evaluation and Policy Analysis, 21(2).

Hanushek, E. (1989). The impact of differential expenditures on school performance. Educational Researcher, 18(4), 45-51.

Hanushek, E. (1992). The trade-off between child quantity and quality. Journal of Political Economy, 100, 84-117.

Hanushek, E. A. (1997). Assessing the effects of school resources on student performance: An update. Educational Evaluation and Policy Analysis, 19, 141-164.

Hanushek, E. A., Kain, J. F., O'Brien, D. M., & Rivkin, S. G. (2005). The market for teacher quality (Working Paper No. 11154). Cambridge, MA: National Bureau of Economic Research.

Hanushek, E. A., Kain, J. F., & Rivkin, S. G. (1998). Teachers, schools, and academic achievement (Working Paper No. 6691). Cambridge, MA: National Bureau of Economic Research.

Hedges, L. V., Laine, R. D., & Greenwald, R. (1994). Does money matter? A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher, 23(3), 5-14.

Hoxby, C. (2000). Peer effects in the classroom: Learning from gender and race variables (Working Paper No. 7867). Cambridge, MA: National Bureau of Economic Research.

Jencks, C. S., & Mayer, S. E. (1990). The social consequences of growing up in a poor neighborhood. In L. E. Lynn & M. McGeary (Eds.), Inner-city poverty in the United States (pp. 111-186). Washington, DC: National Academy of Sciences.

Jencks, C., & Phillips, M. (1998). The Black-White test score gap. Washington, DC: Brookings Institution.

Kimball, S. M. (2002). Analysis of the feedback, enabling conditions and fairness perceptions of teachers in three school districts with new standards-based evaluation systems. Journal of Personnel Evaluation in Education, 16, 241-268.

Kimball, S. M., White, B., Milanowski, A. T., & Borman, G. (2004). Examining the relationship between teacher evaluation and student assessment results in Washoe County. Peabody Journal of Education, 79(4), 54-78.

Milanowski, A. T. (2004). The relationship between teacher performance evaluation scores and student achievement: Evidence from Cincinnati. Peabody Journal of Education, 79(4), 33-53.

Milanowski, A., & Kimball, S. (2005, April). The relationship between teacher expertise and student achievement: A synthesis of three years of data. Paper presented at the annual meeting of the American Educational Research Association, Montreal, Quebec, Canada.

Milanowski, A. T., Kimball, S. M., & Odden, A. (2005). Teacher accountability measures and links to learning. In L. Stiefel, A. E. Schwartz, R. Rubenstein, & J. Zabel (Eds.), Measuring school performance and efficiency: Implications for practice and research (pp. 137-161). Larchmont, NY: Eye on Education.

Murnane, R. (1983). Quantitative studies of effective schools: What have we learned? In A. Odden & L. D. Webb (Eds.), School finance and school improvement: Linkages for the 1980s (pp. 193-209). Cambridge, MA: Ballinger.

Nevada Department of Education. (2004-2005). Glossary of terms: Nevada report card [Web site]. Available from http://www.nevadareportcard.com

No Child Left Behind Act of 2001, Pub. L. No. 107-110, 115 Stat. 1425 (2002).

Odden, A. R., Borman, G., & Fermanich, M. (2004). Assessing teacher, classroom, and school effects, including fiscal effects. Peabody Journal of Education, 79(4), 4-32.

Rockoff, J. (2004). The impact of individual teachers on student achievement: Evidence from panel data. American Economic Review, 94, 247-252.

Rowan, B., Correnti, R., & Miller, R. J. (2002). What large-scale, survey research tells us about teacher effects on student achievement: Insights from the Prospects study of elementary schools. Teachers College Record, 104, 1525-1567.

Sanders, W. L. (2000). Value-added assessment from student achievement data. Cary, NC: Create National Evaluation Institute.

Sanders, W. L., & Rivers, J. C. (1996). Cumulative and residual effects of teachers on future student academic achievement. Knoxville: University of Tennessee, Value-Added Research and Assessment Center.

Wayne, A., & Youngs, P. (2003). Teacher characteristics and student achievement gains: A review. Review of Educational Research, 73, 89-122.


Appendix A
The Evaluation System

The following gives more details on Washoe's teacher evaluation system. Under the Washoe standards-based teacher evaluation system, as specified in the Danielson Framework (Danielson, 1996), teachers are evaluated using rubric rating scores over four domains: (a) Planning and Preparation, (b) Classroom Environment, (c) Instruction, and (d) Professional Responsibilities. The evaluations in Washoe are conducted by the principal or assistant principal, though research has shown that the use of multiple trained, objective evaluators increases the validity of such systems (Heneman, Milanowski, Kimball, & Odden, 2006). Under this system, multiple pieces of evidence are used to help evaluate teacher practice, including teacher self-assessments, lesson and unit plans, classroom and nonclassroom observations with pre- and postobservation conferences, assignments and student work, reflection sheets, and logs of professional development and parental contact activities.

Each teacher is evaluated annually, but only probationary, or nontenured, teachers are evaluated on all four domains. Probationary teachers are also required to be observed nine times during the school year. Tenured teachers then begin a 3-year major-minor cycle, the period of time over which they will be evaluated on all four domains. However, because of a desire to have annual information about the instruction domain, teachers not formally being evaluated on this domain are subject to a supplementary evaluation on a subset of instruction-related standards. According to Borman and Kimball (2005), composite scores can be calculated from this subset of standards that represent psychometrically sound summary measures of teachers' instructional performance. These scores were selected for use in this study to maximize the number of teachers included in the analyses.

The composite teacher performance measure is based on the following standards:³

• The teaching displays solid content knowledge and uses a repertoire of current pedagogical practices for the discipline being taught.

³In their 2005 study using some of the same data, Borman and Kimball found that the item intercorrelations for these composite scores ranged from .69 to .75 and that the coefficient alpha reliability was .91. Teacher evaluation results, as measured by the overall composite, averaged about 2.63 on the 0- to 3-point scale.


• The teaching is designed coherently, using a logical sequence, matching materials and resources appropriately, and using a well-defined structure for connecting the individual activities to the entire unit. Instruction links student assessment data to instructional planning and implementation.

• The teaching provides for adjustments in planned lessons to match the students' needs more specifically. The teacher is persistent in using alternative approaches and strategies for students who are not initially successful.

• The teaching engages students cognitively in activities and assignments, groups are productive, and strategies are congruent with instructional objectives.

Appendix B
Per-Pupil Spending

It is important to consider the variation in per-pupil expenditures. As this article mentions, expenditures are coded by a program called InSite into four categories that are defined in the following way:

Definitions of Funding Categories (Nevada Department of Education, 2004-2005)

Instruction-This includes funding for instructional teachers, substitute teachers, instructional paraprofessionals, pupil-use technology, software, instructional materials, trips, and supplies.

Instruction Support-This includes funding for guidance and counseling, libraries and media, extracurricular activities, student health services, curriculum development, staff development, sabbaticals, program management, therapists, psychologists, evaluators, personal attendants, and social workers.

Operations-This includes funding for transportation, food service, safety, building upkeep, utilities, building maintenance, data processing, and business operations.

Leadership-This includes funding for principals, assistant principals, administrative support, deputies, senior administrators, researchers, program evaluators, superintendents, school board representatives, and legal staff.

The distribution of per-pupil expenditures can be analyzed in more detail by looking at the coefficient of variation (SD divided by the mean) for overall expenditures and for each category:


Coefficient of variation for per-pupil spending: 1,063.62/5,889.23 = 0.18
For leadership: 85.26/317.63 = 0.27
For operations: 253.85/1,004.57 = 0.25
For instructional support: 312.41/680.13 = 0.46
For instruction: 573.38/3,886.90 = 0.15

In addition, I created a variable based on the theory that operations and maintenance and administration, which is called leadership within the InSite program, would have less effect on achievement than a combination of instruction and instructional support. This variable, called instpp, has a coefficient of variation of 837.82/4,567.03 = 0.18, which is roughly the same amount of variation found for the combined per-pupil spending figure.

Table B1 gives the descriptive statistics for these different categories of resources.

To investigate the variation in the measure of per-pupil spending, I ran the descriptive statistics listed earlier. In addition, to see whether the variation among schools in this measure was being driven by variation in teacher salaries, I calculated the average step (as a proxy for salary, because the higher the step, the higher the salary) for each school and ran a correlation (see Table B2). Although this is a somewhat crude measure, I deduce from this correlation that the variation in per-pupil spending is based on a number of factors, not just teacher salaries.

Table B1
Descriptive Statistics

N Minimum Maximum M SD

Leadership 60 220.00 803.00 317.6333 85.26131

Operations 60 681.00 2,096.00 1004.5667 253.84784

Instructional Support 60 397.00 2,554.00 680.1333 312.41077

Instruction 60 2,958.00 5,680.00 3,886.9000 573.37584

Instpp 60 3,460.00 8,234.00 4,567.0333 837.82165

Per-Pupil Spending 60 4,533.00 11,133.00 5,889.2333 1,063.61860

Note. Instpp = Instruction and Instructional Support.

Table B2
Correlations

                            Per-Pupil Spending    Avg Teacher Experience
Per-Pupil Spending
  Pearson correlation              1                      -.074
  Sig. (two-tailed)                                        .577
  N                               59                         59
Avg Teacher Experience
  Pearson correlation           -.074                         1
  Sig. (two-tailed)              .577
  N                               59                         60


Appendix C
Descriptives for Variables Prior to Z Scoring

At the Student Level

I am unable to provide pre-Z-scored statistics for the student test score variable in any simple manner, because the files from the four grades in which students had taken different tests had to be merged to create the combined database. (The Z scoring was done prior to the merging.) Table C1 provides descriptives for the dummy variables, which are most useful for looking at the demographic variables present in the "population" of this particular sample.
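For readers unfamiliar with the transformation, Z scoring simply rescales each variable to mean 0 and standard deviation 1 within its group, which is what makes scores from the four different grade-level tests comparable after merging. A minimal sketch (the numbers are hypothetical, not from the study):

```python
# Illustrative only: Z scoring (standardizing) a variable before modeling.

import statistics

def z_score(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD; sample SD is also common
    return [(v - mean) / sd for v in values]

scores = [480, 510, 530, 560, 620]  # hypothetical raw test scores
standardized = z_score(scores)
print([round(z, 2) for z in standardized])  # [-1.26, -0.63, -0.21, 0.42, 1.68]
```

Because each grade's test is standardized separately before the files are merged, the combined outcome variable measures relative standing within grade rather than raw score.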

At the Teacher Level

Table C2 gives the descriptive statistics for the teacher-level variables used in the model, prior to Z scoring:

At the School Level

Most of the descriptive statistics for the pre-Z-scored variables at the school level were given in Table B1, but the other two variables are given in Table C3.

Table C1
Descriptive Statistics for Level 1 Variables

N Minimum Maximum M SD

Frl 14,070 .00 1.00 .2212 .41506
Speced 14,070 .00 1.00 .1035 .30460
Gender dummy 14,070 .00 1.00 .4927 .49996
Ethnicity dummy 14,070 .00 1.00 .3812 .48571
Valid N (listwise) 14,070

Table C2
Descriptive Statistics for Level 2 Variables

N Minimum Maximum M SD

Step 455 1.00 20.00 10.70 6.415
Pcomavg 455 .968 3.000 2.63764 0.424867
Master's 442 .00 1.00 0.5226 0.50005
Valid N (listwise) 442

Table C3
Descriptive Statistics for Level 3 Variables

N Minimum Maximum M SD

Perfrlsc 59 .01 .95 .4017 0.28558
Schsize 59 173.00 874.00 547.76 137.417
Valid N (listwise) 59
