The “Out of Africa” Hypothesis, Human Genetic Diversity ... · PDF fileThe...
Transcript of The “Out of Africa” Hypothesis, Human Genetic Diversity ... · PDF fileThe...
The “Out of Africa” Hypothesis, Human Genetic Diversity,
and Comparative Economic Development
By QUAMRUL ASHRAF AND ODED GALOR∗
ONLINE APPENDIX
This appendix (i) discusses empirical results from additional robustness checks con-
ducted for the historical analysis (Section A), (ii) presents the methodology underlying
the construction of the ancestry-adjusted measure of genetic diversity for contemporary
national populations (Section B), (iii) collects supplementary figures (Section C) and
tables (Section D) of empirical results referenced in the paper, (iv) presents details
on the 53 ethnic groups from the HGDP-CEPH Human Genome Diversity Cell Line
Panel (Section E), (v) provides detailed definitions and data sources of all the variables
employed by the empirical analyses in the present study (Section F), (vi) collects de-
scriptive statistics of the cross-country samples employed by the baseline regressions
in both the limited- and extended-sample variants of the historical analysis as well as
the contemporary analysis (Section G), and, (vii) discusses experimental evidence from
scientific studies in the field of evolutionary biology on the costs and benefits of genetic
diversity (Section H).
∗ Ashraf: Department of Economics, Williams College, 24 Hopkins Hall Dr., Williamstown, MA 01267 (email:
[email protected]); Galor: Department of Economics, Brown University, 64 Waterman St., Providence,
RI 02912 (email: [email protected]).
1
2 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
A ADDITIONAL ROBUSTNESS CHECKS FOR THE HISTORICAL ANALYSIS
A1. Results for Earlier Historical Periods
This section examines the effects of genetic diversity on economic development in
earlier historical periods of the Common Era and, in particular, establishes a hump-
shaped effect of genetic diversity, predicted by migratory distance from East Africa,
on log population density in the years 1000 CE and 1 CE. In so doing, the analysis
demonstrates the persistence of the diversity channel over a long expanse of time and
indicates that the hump-shaped manner in which genetic diversity has influenced com-
parative development, along with the optimal level of diversity, did not fundamentally
change during the agricultural stage of development.
The results from replicating the analysis in Section IV.B of the paper to explain log
population density in 1000 CE and 1 CE are presented in Tables A1 and A2 respectively.
As before, the individual and combined explanatory powers of the genetic diversity,
transition timing, and land productivity channels are examined empirically. The relevant
samples, determined by the availability of data on the dependent variable of interest as
well as all aforementioned explanatory channels, are composed of 140 countries for the
1000 CE regressions and 126 countries for the analysis in 1 CE. Despite more constrained
sample sizes, however, the empirical findings once again reveal a highly statistically
significant hump-shaped effect of genetic diversity, predicted by migratory distance from
East Africa, on log population density in these earlier historical periods. Additionally,
the magnitude and significance of the coefficients associated with the diversity channel in
these earlier periods remain rather stable, albeit less so in comparison to the analysis for
1500 CE, when the regression specification is augmented with controls for the transition
timing and land productivity channels as well as dummy variables capturing continent
fixed effects.
In a pattern similar to that observed in Table 3 of the paper, the unconditional effects of
genetic diversity in Tables A1 and A2 decrease slightly in magnitude when subjected to
controls for either the Neolithic transition timing or the land productivity channels, both
of which appear to confer their expected effects on population density in earlier historical
periods. However, as argued previously, these unconditional estimates certainly reflect
some amount of omitted variable bias resulting from the exclusion of the transition timing
and land productivity channels in Malthusian economic development. On the other hand,
unlike the pattern in Table 3 of the paper, the coefficients associated with the diversity
channel also weaken moderately in statistical significance, dropping to the 5 percent level
when controlling for transition timing in the 1000 CE analysis and to the 10 percent level
under controls for the land productivity channel in the 1 CE analysis. Nonetheless, these
reductions in statistical significance are not entirely surprising when one accounts for the
greater imprecision with which population density is recorded for these earlier periods,
given that mismeasurement in the dependent variable of an OLS regression typically
causes the resulting coefficient estimates to possess larger standard errors.
Column 5 in Tables A1 and A2 reveals the results from exploiting the combined ex-
planatory power of the genetic diversity, transition timing, and land productivity channels
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 3
TABLE A1—PREDICTED DIVERSITY AND ECONOMIC DEVELOPMENT IN 1000 CE
(1) (2) (3) (4) (5) (6)
Dependent variable is log population density in 1000 CE
Predicted diversity 219.722*** 158.631** 179.523*** 154.913** 201.239**
(68.108) (63.604) (65.981) (61.467) (97.612)
Predicted diversity square -155.442*** -113.110** -126.147*** -109.806** -145.894**
(50.379) (46.858) (48.643) (44.967) (68.252)
Log Neolithic transition 1.393*** 1.228*** 1.374*** 1.603***
timing (0.170) (0.180) (0.151) (0.259)
Log percentage of arable 0.546*** 0.371*** 0.370***
land (0.140) (0.106) (0.114)
Log absolute latitude -0.151 -0.380*** -0.373***
(0.103) (0.110) (0.137)
Log land suitability for 0.043 0.211** 0.190*
agriculture (0.135) (0.104) (0.106)
Optimal diversity 0.707*** 0.701*** 0.712*** 0.705** 0.690**
(0.039) (0.127) (0.146) (0.108) (0.293)
Continent fixed effects No No No No No Yes
Observations 140 140 140 140 140 140
R2 0.15 0.32 0.38 0.36 0.61 0.62
Note: This table establishes the significant hump-shaped effect of genetic diversity, as predicted by migratory distance
from East Africa, on log population density in 1000 CE in an extended 140-country sample while controlling for the
timing of the Neolithic Revolution, land productivity, and continent fixed effects. Bootstrap standard errors, accounting
for the use of generated regressors, are reported in parentheses.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
for log population density in 1000 CE and 1 CE. Interestingly, in each case, the linear
and quadratic coefficients associated with diversity remain rather stable when compared
to the corresponding estimates obtained under a partial set of controls in earlier columns.
In comparison to the corresponding results for population density in 1500 CE from Table
3 of the paper, the coefficients of the diversity channel uncovered here are statistically
significant at the 5 percent as opposed to the 1 percent level, a by-product of relatively
larger standard errors that again may be partly attributed to the higher measurement error
afflicting population density estimates reported for earlier historical periods.
Finally, the last column in each table augments the analysis with controls for continent
fixed effects, demonstrating that the coefficients associated with the diversity channel
in each historical period maintain significance in spite of the lower average degree of
cross-country variation in diversity within each continent as compared to that observed
worldwide. Moreover, the magnitudes of the diversity coefficients remain rather stable,
particularly in the 1000 CE analysis, and increase somewhat in the 1 CE analysis despite
the smaller sample size and, hence, even lower within-continent variation in diversity
exploited by the latter regression. Further, the estimated optimal levels of diversity in
the two periods are relatively stable in comparison to that obtained under the baseline
4 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TABLE A2—PREDICTED DIVERSITY AND ECONOMIC DEVELOPMENT IN 1 CE
(1) (2) (3) (4) (5) (6)
Dependent variable is log population density in 1 CE
Predicted diversity 227.826*** 183.142*** 129.180* 134.767** 231.689**
(72.281) (57.772) (66.952) (59.772) (113.162)
Predicted diversity square -160.351*** -132.373*** -88.040* -96.253** -166.859**
(53.169) (42.177) (49.519) (43.718) (79.175)
Log Neolithic transition 1.793*** 1.636*** 1.662*** 2.127***
timing (0.217) (0.207) (0.209) (0.430)
Log percentage of arable 0.377** 0.314** 0.348***
land (0.158) (0.125) (0.134)
Log absolute latitude 0.190 -0.121 -0.115
(0.125) (0.119) (0.135)
Log land suitability for 0.160 0.238* 0.210*
agriculture (0.173) (0.124) (0.125)
Optimal diversity 0.710*** 0.692*** 0.734** 0.700*** 0.694***
(0.052) (0.027) (0.347) (0.188) (0.194)
Continent fixed effects No No No No No Yes
Observations 126 126 126 126 126 126
R2 0.16 0.42 0.46 0.32 0.59 0.61
Note: This table establishes the significant hump-shaped effect of genetic diversity, as predicted by migratory distance
from East Africa, on log population density in 1 CE in an extended 126-country sample while controlling for the timing
of the Neolithic Revolution, land productivity, and continent fixed effects. Bootstrap standard errors, accounting for the
use of generated regressors, are reported in parentheses.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
regression for the year 1500 CE. The coefficients associated with diversity from the 1000
CE analysis suggest that, accounting for land productivity, the timing of the Neolithic
transition, and continent fixed effects, a 1 percentage point increase in genetic diversity
for the least diverse society in the sample would raise its population density by 38
percent, whereas a 1 percentage point decrease in diversity for the most diverse society
would raise its population density by 26 percent. On the other hand, for the 1 CE
analysis, a similar increase in genetic diversity for the least diverse society would raise
its population density by 47 percent, whereas a similar decrease in diversity for the most
diverse society would raise its population density by 28 percent.1 The hump-shaped
effects, implied by these coefficients, of genetic diversity on log population density in
the years 1000 CE and 1 CE are depicted in Figures A1 and A2.2
In sum, the results presented in Tables A1 and A2 suggest that, consistent with the
1These effects are calculated directly via the methodology outlined in Footnote 31 of the paper, along with the
sample minimum and maximum genetic diversity values of 0.573 and 0.774, respectively, in both the 1000 CE and 1 CE
regression samples.2For consistency with Figure 1 of the paper, which depicts the negative effect of increasing migratory distance from
East Africa on genetic diversity, the horizontal axes in Figures A1–A2 represent genetic homogeneity (i.e., 1 minus
genetic diversity) so as to reflect increasing as opposed to decreasing migratory distance from East Africa.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 5
FIGURE A1. PREDICTED GENETIC DIVERSITY AND POPULATION DENSITY IN 1000 CE
Note: This figure depicts the hump-shaped effect, estimated using a least-squares quadratic fit, of predicted genetic
homogeneity (i.e., 1 minus genetic diversity as predicted by migratory distance from East Africa) on log population
density in 1000 CE in an extended 140-country sample, conditional on the timing of the Neolithic Revolution, land
productivity, and continent fixed effects. This figure is an augmented component-plus-residual plot rather than the typical
added-variable plot of residuals against residuals. Specifically, the vertical axis represents fitted values (as predicted
by genetic homogeneity and its square) of log population density plus the residuals from the full regression model.
The horizontal axis, on the other hand, represents genetic homogeneity rather than the residuals obtained from regressing
homogeneity on the control variables in the model. This methodology permits the illustration of the overall nonmonotonic
effect of genetic homogeneity in one scatter plot.
predictions of the proposed diversity channel, genetic diversity has indeed been a signif-
icant determinant of Malthusian economic development in earlier historical periods as
well. The overall nonmonotonic effect of diversity on population density in the years
1000 CE and 1 CE is robust, in terms of both magnitude and statistical significance, to
controls for the timing of the agricultural transition, the natural productivity of land for
agriculture, and other unobserved continent-specific geographical and socioeconomic
characteristics. More fundamentally, the analysis demonstrates the persistence of the
diversity channel, along with the optimal level of diversity, over a long expanse of time
during the agricultural stage of development.
A2. Robustness to the Technology Diffusion Hypothesis
The technology diffusion hypothesis suggests that spatial proximity to global and
regional technological frontiers confers a beneficial effect on the development of less
advanced societies by facilitating the diffusion of new technologies from more advanced
6 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
FIGURE A2. PREDICTED GENETIC DIVERSITY AND POPULATION DENSITY IN 1 CE
Note: This figure depicts the hump-shaped effect, estimated using a least-squares quadratic fit, of predicted genetic
homogeneity (i.e., 1 minus genetic diversity as predicted by migratory distance from East Africa) on log population
density in 1 CE in an extended 126-country sample, conditional on the timing of the Neolithic Revolution, land
productivity, and continent fixed effects. This figure is an augmented component-plus-residual plot rather than the typical
added-variable plot of residuals against residuals. Specifically, the vertical axis represents fitted values (as predicted
by genetic homogeneity and its square) of log population density plus the residuals from the full regression model.
The horizontal axis, on the other hand, represents genetic homogeneity rather than the residuals obtained from regressing
homogeneity on the control variables in the model. This methodology permits the illustration of the overall nonmonotonic
effect of genetic homogeneity in one scatter plot.
societies through trade as well as sociocultural and geopolitical influences. In particular,
the technology diffusion channel implies that, ceteris paribus, the greater the geograph-
ical distance from the global and regional technological “leaders” in a given period,
the lower the level of economic development amongst the technological “followers”
in that period. Indeed, several studies in international trade and economic geography
have uncovered strong empirical support for this hypothesis in explaining comparative
economic development in the contemporary era. This section examines the robustness
of the hump-shaped effect of genetic diversity on economic development during the
precolonial era to controls for this additional hypothesis.
The purpose of the current investigation is to ensure that the analyses conducted in
Section IV.B of the paper and in the preceding appendix section were not ascribing
to genetic diversity the predictive power that should otherwise have been attributed
to the technology diffusion channel. To be specific, one may identify some of the
waypoints employed to construct the prehistoric migratory routes from East Africa (such
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 7
TABLE A3—THE REGIONAL TECHNOLOGICAL FRONTIERS OF THE WORLD, 1–1500 CE
City and Modern Location Continent Sociopolitical Entity Relevant Period
Cairo, Egypt Africa Mamluk Sultanate 1500 CE
Fez, Morocco Africa Marinid Kingdom of Fez 1500 CE
London, U.K. Europe Tudor Dynasty 1500 CE
Paris, France Europe Valois-Orléans Dynasty 1500 CE
Constantinople, Turkey Asia Ottoman Empire 1500 CE
Peking, China Asia Ming Dynasty 1500 CE
Tenochtitlan, Mexico Americas Aztec Civilization 1500 CE
Cuzco, Peru Americas Inca Civilization 1500 CE
Cairo, Egypt Africa Fatimid Caliphate 1000 CE
Kairwan, Tunisia Africa Berber Zirite Dynasty 1000 CE
Constantinople, Turkey Europe Byzantine Empire 1000 CE
Cordoba, Spain Europe Caliphate of Cordoba 1000 CE
Baghdad, Iraq Asia Abbasid Caliphate 1000 CE
Kaifeng, China Asia Song Dynasty 1000 CE
Tollan, Mexico Americas Classic Maya Civilization 1000 CE
Huari, Peru Americas Huari Culture 1000 CE
Alexandria, Egypt Africa Roman Empire 1 CE
Carthage, Tunisia Africa Roman Empire 1 CE
Athens, Greece Europe Roman Empire 1 CE
Rome, Italy Europe Roman Empire 1 CE
Luoyang, China Asia Han Dynasty 1 CE
Seleucia, Iraq Asia Seleucid Dynasty 1 CE
Teotihuacán, Mexico Americas Pre-Classic Maya Civilization 1 CE
Cahuachi, Peru Americas Nazca Culture 1 CE
as Cairo and Istanbul) as origins of spatial technology diffusion during the precolonial
era. This, coupled with the fact that genetic diversity decreases with increasing migratory
distance from East Africa, raises the concern that what has so far been interpreted as
evidence consistent with the beneficial effect of higher diversity may, in reality, simply
be capturing the latent effect of the omitted technology diffusion channel in earlier
regression specifications. As will become evident, however, while technology diffusion
is indeed found to have been a significant determinant of comparative development in
the precolonial era, the baseline findings for genetic diversity remain robust to controls
for this additional influential hypothesis.
To account for the technology diffusion channel, the current analysis constructs, for
each historical period examined, a control variable measuring the great circle distance
from the closest regional technological frontier in that period. Following the well-
accepted notion that the process of preindustrial urban development was typically more
pronounced in societies that enjoyed higher agricultural surpluses, the analysis adopts
historical city population size as an appropriate metric to identify the period-specific
sets of regional technological frontiers. Specifically, based on historical urban pop-
ulation data from Chandler (1987) and Modelski (2003), the procedure commences
with assembling, for each period, a set of regional frontiers comprising the two largest
cities, belonging to different civilizations or disparate sociopolitical entities, from each of
8 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
Africa, Europe, Asia, and the Americas.3 The effectiveness of this procedure in yielding
an outcome that is consistent with what one might expect from a general familiarity
with world history is evident in the set of regional frontiers obtained for each period
as shown in Table A3.4 In constructing the variable measuring distance to the closest
regional frontier for a given historical period, the analysis then selects, for each country
in the corresponding regression sample, the smallest of the great circle distances from
the regional frontiers to the country’s capital city.
To anticipate the robustness of the baseline results for genetic diversity, predicted by
migratory distance from East Africa, to controls for the technology diffusion hypoth-
esis, it may be noted that migratory distance from East Africa possesses a correlation
coefficient of only 0.02 with the great circle distance from the closest regional frontier
in the 1500 CE sample. Furthermore, for the 1000 CE and 1 CE regression samples,
migratory distance is again only weakly correlated with distance from the closest regional
technological frontier in each period, with the respective correlation coefficients being
-0.04 and 0.03.5 These encouragingly low sample correlations are indicative of the
fact that the estimated baseline regression specifications for the historical analysis were,
indeed, not simply attributing to genetic diversity the effects possibly arising from the
technology diffusion channel.
Column 1 of Table A4 reports the results from estimating the baseline specification
for log population density in 1500 CE while controlling for technology diffusion as
originating from the regional frontiers identified for this period. In comparison to the
baseline estimates revealed in Column 6 of Table 3 in the paper, the regression coeffi-
cients associated with the genetic diversity channel remain relatively stable, decreasing
only moderately in magnitude and statistical significance. Some similar robustness char-
acteristics may be noted for the transition timing and land productivity channels as well.
Importantly, however, the estimate for the optimal level of diversity remains virtually
unchanged and highly statistically significant. Interestingly, the results also establish the
technology diffusion channel as a significant determinant of comparative development
in the precolonial Malthusian era. In particular, a 1 percent increase in distance from
the closest regional frontier is associated with a decrease in population density by 0.2
3The exclusion of Oceania from the list of continents employed is not a methodological restriction but a natural result
arising from the fact that evidence of urbanization does not appear in the historical record of this continent until after
European colonization. Moreover, the consideration of the Americas as a single unit is consistent with the historical
evidence that this landmass only harbored two distinct major civilizational sequences – one in Mesoamerica and the other
in the Andean region of South America. Indeed, the imposition of the criteria that the selected cities in each continent
(or landmass) should belong to different sociopolitical units is meant to capture the notion that technology diffusion
historically occurred due to civilizational influence, broadly defined, as opposed to the influence of only major urban
centers that were developed by these relatively advanced societies.4Note that, for the year 1 CE, there are four cities appearing within the territories of the Roman Empire, which a priori
seems to violate the criterion that the regional frontiers selected should belong to different sociopolitical entities. This,
however, is simply a by-product of the dominance of the Roman Empire in the Mediterranean basin during that period.
In fact, historical evidence suggests that the cities of Athens, Carthage, and Alexandria had long been serving as centers
of regional diffusion prior to their annexation to the Roman Empire. Moreover, the appearance of Constantinople under
Europe in 1000 CE and Asia in 1500 CE is an innocuous classification issue arising from the fact that the city historically
fluctuated between the dominions of European and Asian civilizations.5These correlations differ slightly from those presented in Table G4 in Section G of this appendix, where the
correlations are presented for the entire 145-country sample used in the regressions for 1500 CE.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 9
TABLE A4—ROBUSTNESS TO THE TECHNOLOGY DIFFUSION HYPOTHESIS
(1) (2) (3)
Dependent variable is log population density in:
1500 CE 1000 CE 1 CE
Predicted diversity 156.736** 183.771** 215.858**
(75.572) (88.577) (105.286)
Predicted diversity square -114.626** -134.609** -157.724**
(52.904) (61.718) (73.681)
Log Neolithic transition 0.909*** 1.253*** 1.676***
timing (0.254) (0.339) (0.434)
Log percentage of arable 0.363*** 0.323*** 0.342***
land (0.104) (0.121) (0.131)
Log absolute latitude -0.492*** -0.454*** -0.212
(0.134) (0.149) (0.142)
Log land suitability for 0.275*** 0.239** 0.191
agriculture (0.090) (0.105) (0.120)
Log distance to regional -0.187***
frontier in 1500 CE (0.070)
Log distance to regional -0.230*
frontier in 1000 CE (0.121)
Log distance to regional -0.297***
frontier in 1 CE (0.102)
Optimal diversity 0.684*** 0.683*** 0.684**
(0.169) (0.218) (0.266)
Continent fixed effects Yes Yes Yes
Observations 145 140 126
R2 0.72 0.64 0.66
Note: This table establishes, using the extended cross-country sample for each historical time period, that the significant
hump-shaped effect of genetic diversity, as predicted by migratory distance from East Africa, on log population density
in the years 1500 CE, 1000 CE, and 1 CE, conditional on the timing of the Neolithic Revolution, land productivity, and
continent fixed effects, is robust to controlling for distance to the closest regional technological frontier in each historical
time period. Bootstrap standard errors, accounting for the use of generated regressors, are reported in parentheses.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
percent, an effect that is statistically significant at the 1 percent level.
Columns 2 and 3 establish the robustness of the hump-shaped effect of genetic diver-
sity on economic development in 1000 CE and 1 CE to controls for technology diffusion
arising from the technological frontiers identified for these earlier historical periods.
Specifically, comparing the regression for 1000 CE in Column 2 with its relevant baseline
(i.e., Column 6 of Table A1), the linear and quadratic coefficients associated with ge-
netic diversity remain largely stable under controls for technology diffusion, decreasing
slightly in magnitude but maintaining statistical significance. A similar stability pattern
also emerges for the coefficients capturing the influence of the diversity channel across
the 1 CE regressions. Indeed, the estimates for optimal diversity in these earlier periods
remain rather stable relative to their respective baselines in Tables A1 and A2. Finally, in
line with the predictions of the technology diffusion hypothesis, a statistically significant
10 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
negative effect of distance from the closest regional frontier on economic development
is observed for these earlier historical periods as well.
The results uncovered herein demonstrate the persistence of the significant hump-
shaped effect of genetic diversity on comparative development over the period 1–1500
CE, despite controls for the apparently influential role of technology diffusion arising
from the technological frontiers that were relevant during this period of world history.
Indeed, these findings lend further credence to the proposed diversity channel by demon-
strating that the historical analysis, based on genetic diversity predicted by migratory
distance from East Africa, has not been ascribing to genetic diversity the explanatory
power that should otherwise be attributed to the impact of spatial technology diffusion.
A3. Robustness to Microgeographic Factors
This section addresses concerns regarding the possibility that the hump-shaped effect
of genetic diversity on precolonial comparative development could in fact be reflecting
the latent impact of microgeographic factors, such as the degree of variation in terrain
and proximity to waterways, if these variables happen to be correlated with migratory
distance from East Africa. There are several conceivable channels through which such
factors could affect a society’s aggregate productivity and thus its population density in
the Malthusian stage of development. For instance, the degree of terrain variation within
a region can directly affect its agricultural productivity by influencing the arability of
land. Moreover, terrain ruggedness may also have led to the spatial concentration of
economic activity, which has been linked with increasing returns to scale and higher
aggregate productivity through agglomeration by the new economic geography literature.
On the other hand, by geographically isolating population subgroups, a rugged landscape
could also have nurtured their ethnic differentiation over time (Michalopoulos 2011), and
may thus confer an adverse effect on society’s aggregate productivity via the increased
likelihood of ethnic conflict. Similarly, while proximity to waterways can directly af-
fect crop yields by making beneficial practices such as irrigation possible, it may also
have augmented productivity indirectly by lowering transportation costs and, thereby,
fostering urban development, trade, and technology diffusion.6
To ensure that the significant hump-shaped effect of genetic diversity on log population
density in 1500 CE, as revealed in Section IV.B of the paper, is not simply reflecting the
latent influence of microgeographic factors, the current analysis examines variants of the
relevant baseline regression specification augmented with controls for terrain quality and
proximity to waterways. In particular, the controls for terrain quality are derived from
the G-ECON data set compiled by Nordhaus (2006), and they include mean elevation
and a measure of terrain roughness, both aggregated up to the country level from grid-
level data at a granularity of 1-degree latitude by 1-degree longitude. Moreover, in
light of the possibility that the impact of terrain undulation could be nonmonotonic,
the specifications examined also control for the squared term of the terrain roughness
6Indeed, a significant positive relationship between proximity to waterways and contemporary population density has
been demonstrated by Gallup, Sachs and Mellinger (1999).
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 11
TABLE A5—ROBUSTNESS TO MICROGEOGRAPHIC FACTORS
(1) (2) (3)
Dependent variable is
log population density in 1500 CE
Predicted diversity 160.346** 157.073** 157.059**
(78.958) (79.071) (69.876)
Predicted diversity square -118.716** -112.780** -114.994**
(55.345) (55.694) (48.981)
Log Neolithic transition 1.131*** 1.211*** 1.215***
timing (0.225) (0.201) (0.197)
Log percentage of arable 0.397*** 0.348*** 0.374***
land (0.099) (0.099) (0.087)
Log absolute latitude -0.358*** -0.354*** -0.352***
(0.124) (0.132) (0.122)
Log land suitability for 0.188* 0.248*** 0.160**
agriculture (0.101) (0.082) (0.081)
Mean elevation -0.404 0.502*
(0.251) (0.273)
Terrain roughness 5.938*** 4.076**
(1.870) (1.840)
Terrain roughness square -7.332** -7.627***
(2.922) (2.906)
Mean distance to nearest -0.437** -0.390**
waterway (0.178) (0.181)
Percentage of land near a 0.731** 1.175***
waterway (0.310) (0.294)
Optimal diversity 0.675*** 0.696*** 0.683***
(0.224) (0.188) (0.083)
Continent fixed effects Yes Yes Yes
Observations 145 145 145
R2 0.72 0.75 0.78
Note: This table establishes, using the extended 145-country sample, that the significant hump-shaped effect of genetic
diversity, as predicted by migratory distance from East Africa, on log population density in 1500 CE, while controlling for
the timing of the Neolithic Revolution, land productivity, and continent fixed effects, is robust to additional controls for
microgeographic factors, including terrain characteristics and access to waterways. Bootstrap standard errors, accounting
for the use of generated regressors, are reported in parentheses.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
index. The control variables gauging access to waterways, obtained from the data set of
Gallup, Sachs and Mellinger (1999), include the expected distance from any point within
a country to the nearest coast or sea-navigable river and the percentage of a country’s land
area located near (i.e., within 100 km of) a coast or sea-navigable river.7 Foreshadowing
the robustness of the baseline results, mean elevation, terrain roughness, and terrain
roughness square possess only moderate correlation coefficients of -0.11, 0.16, and 0.09,
respectively, with migratory distance from East Africa. Moreover, migratory distance is
7For completeness, specifications controlling for the squared terms of the other microgeographic factors were also
examined. The results from these additional regressions, however, did not reveal any significant nonlinear effects and are
therefore not reported.
12 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
also only moderately correlated with the measures of proximity to waterways, possessing
sample correlations of -0.20 and 0.19 with the distance and land area variables described
above.
The results from estimating augmented regression specifications for explaining log
population density in 1500 CE, incorporating controls for either terrain quality or access
to waterways, are shown in Columns 1 and 2 of Table A5. In each case, the coefficients
associated with the diversity channel remain statistically significant and relatively stable,
experiencing only a moderate decrease in magnitude, when compared to the baseline
results reported in Column 6 of Table 3 in the paper. Interestingly, the control variables
for terrain quality in Column 1 and those gauging access to waterways in Column 2
appear to confer statistically significant effects on population density in 1500 CE, mostly
in directions consistent with priors. The results suggest that terrain roughness does
indeed have a nonmonotonic impact on aggregate productivity, with the beneficial effects
dominating at relatively lower levels of terrain roughness and the detrimental effects
dominating at higher levels. Further, regions with greater access to waterways are found
to support higher population densities.
The final column of Table A5 examines the influence of the genetic diversity channel
under controls for both terrain quality and access to waterways. As anticipated by the
robustness of the results from preceding columns, genetic diversity continues to exert a
significant hump-shaped effect on log population density in 1500 CE, without exhibiting
any drastic reductions in the magnitude of its impact. Moreover, the estimate for the
optimal level of diversity remains fully intact in comparison to the baseline estimate
from Column 6 of Table 3 in the paper. The results uncovered here therefore suggest that
the significant nonmonotonic impact of genetic diversity, predicted by migratory distance
from East Africa, on log population density in 1500 CE is indeed not a spurious relation-
ship arising from the omission of microgeographic factors as explanatory variables in the
baseline regression specification.
A4. Robustness to Exogenous Factors in the Diamond Hypothesis
This section demonstrates the robustness of the hump-shaped effect of genetic diver-
sity, predicted by migratory distance from East Africa, on precolonial comparative devel-
opment to additional controls for the Neolithic transition timing channel. In particular,
the analysis is intended to alleviate concerns that the significant nonmonotonic impact
of genetic diversity presented in Section IV.B of the paper, although estimated while
controlling for the timing of the Neolithic Revolution, may still capture some latent in-
fluence of this other explanatory channel if correlations exist between migratory distance
from East Africa and exogenous factors governing the timing of the Neolithic transition.
The results from estimating some extended regression specifications for log population
density in 1500 CE, reflecting variants of the baseline specification in equation (8) of the
paper that additionally account for the ultimate determinants in the Diamond hypothesis,
are presented in Table A6.
Following the discussion from Section III.C of the paper on the geographic and bio-
geographic determinants of the Neolithic Revolution, the additional control variables
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 13
employed by the current analysis include (i) climate, measured as a discrete index with
higher integer values assigned to countries in Köppen-Geiger climatic zones that are
more favorable to agriculture, (ii) the orientation of the continental axis, measured as the
ratio of the largest longitudinal distance to the largest latitudinal distance of the continent
or landmass to which a country belongs, (iii) the size of the continent, measured as
the total land area of a country’s continent, (iv) the number of domesticable wild plant
species known to have existed in prehistory in the region to which a country belongs, and
(v) the number of domesticable wild animal species known to have been prehistorically
native to the region in which a country belongs.8 These variables are obtained from the
data set of Olsson and Hibbs (2005).
Column 1 of Table A6 presents the results from estimating the baseline specification
for log population density in 1500 CE using the restricted 96-country sample of Olsson
and Hibbs (2005). Reassuringly, the highly significant coefficients associated with di-
versity and the other explanatory channels remain rather stable in magnitude relative to
their estimates obtained with the unrestricted sample from Column 5 of Table 3 in the
paper, implying that any sampling bias that may have been introduced inadvertently by
the use of the restricted sample in the current analysis is indeed negligible.9
Columns 2–4 reveal the results from estimating variants of the baseline specification
where the Diamond channel is controlled for not by its proximate determinant but by one
or more of its ultimate determinants – i.e., either the set of geographic determinants or
the set of biogeographic determinants or both. The results indicate that the coefficients
associated with diversity continue to remain highly statistically significant and relatively
stable in magnitude in comparison to their baseline estimates from Column 1. Interest-
ingly, when controlling for only the geographic antecedents of the Neolithic Revolution
in Column 2, climate alone is significant amongst these additional factors. Likewise,
when only the biogeographic antecedents are controlled for in Column 3, the number of
domesticable animals rather than plants is significant. In addition, none of the ultimate
factors in the Diamond channel possess statistical significance when both geographic and
biogeographic determinants are controlled for in Column 4, a result that possibly reflects
the high correlations amongst these control variables. Regardless of these tangential
8While the influence of the number of domesticable species of plants and animals on the likelihood of the emergence
of agriculture is evident, the role of the geographic antecedents of the Neolithic Revolution requires some elaboration.
A larger size of the continent or landmass implied greater biodiversity and, hence, a greater likelihood that at least
some species suitable for domestication would exist. In addition, a more pronounced East-West (relative to North-
South) orientation of the major continental axis meant an easier diffusion of agricultural practices within the landmass,
particularly among regions sharing similar latitudes and, hence, similar environments suitable for agriculture. This
orientation factor is argued by Diamond (1997) to have played a pivotal role in comparative economic development
by favoring the early rise of complex agricultural civilizations on the Eurasian landmass. Finally, certain climates are
known to be more beneficial for agriculture than others. For instance, moderate zones encompassing the Mediterranean
and Marine West Coast subcategories in the Köppen-Geiger climate classification system are particularly amenable for
growing annual heavy grasses, whereas humid subtropical, continental, and wet tropical climates are less favorable in
this regard, with agriculture being almost entirely infeasible in dry and Polar climates. Indeed, the influence of these
various geographic and biogeographic factors on the timing of the Neolithic Revolution has been established empirically
by Olsson and Hibbs (2005) and Putterman (2008).9Note that the specifications estimated in the current analysis do not incorporate continent dummies since a sizeable
portion of unobserved continent-specific effects are captured by most of the (bio)geographic variables in the Diamond
channel that are measured at either the continental or the macro-regional levels. Augmenting the specifications with
continent fixed effects, however, does not significantly alter the results for genetic diversity.
14 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TABLE A6—ROBUSTNESS TO ULTIMATE DETERMINANTS IN THE DIAMOND HYPOTHESIS
(1) (2) (3) (4) (5)
Dependent variable is log population density in 1500 CE
Predicted diversity 216.847*** 252.076*** 174.414*** 212.123*** 274.916***
(62.764) (71.098) (62.505) (70.247) (73.197)
Predicted diversity square -154.750*** -180.650*** -125.137*** -151.579*** -197.120***
(45.680) (52.120) (45.568) (51.463) (53.186)
Log Neolithic transition 1.300*** 1.160***
timing (0.153) (0.298)
Log percentage of arable 0.437*** 0.431*** 0.441*** 0.411*** 0.365***
land (0.116) (0.119) (0.111) (0.116) (0.112)
Log absolute latitude -0.212** -0.426*** -0.496*** -0.487*** -0.332**
(0.102) (0.131) (0.154) (0.163) (0.145)
Log land suitability for 0.288** 0.184 0.297** 0.242* 0.280**
agriculture (0.135) (0.143) (0.146) (0.146) (0.122)
Climate 0.622*** 0.419 0.374*
(0.137) (0.268) (0.225)
Orientation of continental 0.281 0.040 -0.169
axis (0.332) (0.294) (0.255)
Size of continent -0.007 -0.005 -0.006
(0.015) (0.013) (0.012)
Domesticable plants 0.015 -0.005 0.003
(0.019) (0.023) (0.021)
Domesticable animals 0.154** 0.121 -0.013
(0.063) (0.074) (0.073)
Optimal diversity 0.701*** 0.698*** 0.697*** 0.700*** 0.697***
(0.021) (0.019) (0.051) (0.078) (0.020)
Observations 96 96 96 96 96
R2 0.74 0.70 0.70 0.72 0.78
Note: This table establishes, using a feasible 96-country sample, that the significant hump-shaped effect of genetic
diversity, as predicted by migratory distance from East Africa, on log population density in 1500 CE, while controlling
for the timing of the Neolithic Revolution and land productivity, is robust to additional controls for the geographic and
biogeographic antecedents of the Neolithic Revolution, including climate, the orientation of the continental axis, the size
of the continent, and the numbers of prehistoric domesticable species of plants and animals. Bootstrap standard errors,
accounting for the use of generated regressors, are reported in parentheses.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
issues, however, genetic diversity, as already mentioned, continues to exert a significant
hump-shaped effect on precolonial comparative development.
The final column in Table A6 establishes the robustness of the hump-shaped effect of
genetic diversity on log population density in 1500 CE to controls for both the proximate
and ultimate determinants in the Diamond channel. Perhaps unsurprisingly, the Neolithic
transition timing variable, being the proximate factor in this channel, captures most of
the explanatory power of the ultimate determinants of comparative development in the
Diamond hypothesis. More importantly, the linear and quadratic coefficients associated
with the diversity channel maintain relative stability, increasing moderately in magnitude
when compared to their baseline estimates, but remaining highly statistically significant.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 15
Overall, the results in Table A6 suggest that the baseline estimate of the hump-shaped
impact of genetic diversity, presented in Section IV.B of the paper, is indeed not reflect-
ing additional latent effects of the influential agricultural transition timing channel in
precolonial comparative development.
16 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
B THE INDEX OF CONTEMPORARY POPULATION DIVERSITY
This section discusses the methodology applied to construct the index of genetic di-
versity for contemporary national populations such that it additionally accounts for the
between-group component of diversity. To this effect, the index makes use of the concept
of Fst genetic distance from the field of population genetics.
Specifically, for any subpopulation pair, the Fst genetic distance between the two
subpopulations captures the proportion of their combined genetic diversity that is un-
explained by the weighted average of their respective genetic diversities. Consider, for
instance, a population comprised of two ethnic groups or subpopulations, A and B. The
Fst genetic distance between A and B would then be defined as
(B1) F ABst = 1−
θ A H Aexp + θ B H B
exp
H ABexp
,
where θ A and θ B are the shares of groups A and B, respectively, in the combined
population, H Aexp and H B
exp are their respective expected heterozygosities, and H ABexp is the
expected heterozygosity of the combined population. Thus, given (i) genetic distance,
F ABst , (ii) the expected heterozygosities of the component subpopulations, H A
exp and H Bexp,
and (iii) their respective shares in the overall population, θ A and θ B , the overall diversity
of the combined population is
(B2) H ABexp =
θ A H Aexp + θ B H B
exp(1− F AB
st
) .
In principle, the methodology described above could be applied recursively to arrive
at a measure of overall diversity for any contemporary national population, comprised of
an arbitrary number of ethnic groups, provided sufficient data on the expected heterozy-
gosities of all ethnicities worldwide as well as the genetic distances amongst them are
available. In reality, however, the fact that the HGDP-CEPH sample provides such data
for only 53 ethnic groups (or pairs thereof) implies that a straightforward application of
this methodology would necessarily restrict the calculation of the index of contemporary
diversity to a small set of countries. Moreover, unlike the historical analysis, exploiting
the predictive power of migratory distance from East Africa for genetic diversity would,
by itself, be insufficient since, while this would overcome the problem of data limitations
with respect to expected heterozygosities at the ethnic group level, it does not address
the problem associated with limited data on genetic distances.
To surmount this issue, the current analysis appeals to a second prediction of the serial
founder effect regarding the genetic differentiation of populations through isolation by
geographical distance. Accordingly, in the process of the initial stepwise diffusion of the
human species from Africa into the rest of the world, offshoot colonies residing at greater
geographical distances from parental ones would also be more genetically differentiated
from them. This would arise due to the larger number intervening migration steps and the
concomitantly larger number of genetic diversity subsampling events that are associated
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 17
FIGURE B1. PAIRWISE Fst GENETIC DISTANCE AND PAIRWISE MIGRATORY DISTANCE
Note: This figure depicts the positive impact of pairwise migratory distance on pairwise Fst genetic distance across all
1,378 ethnic group pairs from the set of 53 ethnic groups that constitute the HGDP-CEPH Human Genome Diversity Cell
Line Panel.
with offshoots residing at locations farther away from parental colonies. Indeed, this
second prediction of the serial founder effect is bourne out in the data as well. Based
on data from Ramachandran et al. (2005), Figure B1 shows the strong positive effect
of pairwise migratory distance on pairwise genetic distance across all pairs of ethnic
groups in the HGDP-CEPH sample. Specifically, according to the regression, variation in
migratory distance explains 78 percent of the variation in Fst genetic distance across the
1,378 ethnic group pairs. Moreover, the estimated OLS coefficient is highly statistically
significant, possessing a t-statistic of 53.62, and suggests that pairwise Fst genetic dis-
tance falls by 0.062 percentage points for every 10,000 km increase in pairwise migratory
distance. The construction of the index of genetic diversity for contemporary national
populations thus employs Fst genetic distance values predicted by pairwise migratory
distances.
In particular, using the hypothetical example of a contemporary population comprised
of two groups whose ancestors originate from countries A and B, the overall diversity of
the combined population would be calculated as:
(B3) H ABexp =
θ A H Aexp (dA)+ θ B H B
exp (dB)[1− F AB
st (dAB)] ,
18 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
where, for i ∈ {A, B}, H iexp (di ) denotes the expected heterozygosity predicted by the
migratory distance, di , of country i from East Africa (i.e., the predicted genetic diversity
of country i in the historical analysis), and θ i is the contribution of country i , as a
result of post-1500 migrations, to the combined population being considered. Moreover,
F ABst (dAB) is the genetic distance predicted by the migratory distance between countries
A and B, obtained by applying the coefficients associated with the regression line de-
picted in Figure B1. In practice, since contemporary national populations are typically
composed of more than two ethnic groups, the procedure outlined in equation (B3) is
applied recursively in order to incorporate a larger number of component ethnic groups
in modern populations.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 19
C SUPPLEMENTARY FIGURES
FIGURE C1. OBSERVED GENETIC DIVERSITY AND POPULATION DENSITY IN 1500 CE – THE UNCONDITIONAL
QUADRATIC AND CUBIC SPLINE RELATIONSHIPS
Note: This figure depicts the unconditional hump-shaped relationship, estimated using either a least-squares quadratic fit
or a restricted cubic spline regression, between observed genetic homogeneity (i.e., 1 minus observed genetic diversity)
and log population density in 1500 CE in the limited 21-country sample. The restricted cubic spline regression line is
estimated using three equally-spaced knots on the domain of observed genetic homogeneity values. The shaded area
represents the 95 percent confidence interval band associated with the cubic spline regression line.
20 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
(a)
Qu
adra
tic
vs.
No
np
aram
etri
c(b
)Q
uad
rati
cv
s.C
ub
icS
pli
ne
FIG
UR
EC
2.
PR
ED
ICT
ED
GE
NE
TIC
DIV
ER
SIT
YA
ND
PO
PU
LA
TIO
ND
EN
SIT
YIN
15
00
CE
–T
HE
UN
CO
ND
ITIO
NA
LQ
UA
DR
AT
IC,
NO
NP
AR
AM
ET
RIC
,A
ND
CU
BIC
SP
LIN
E
RE
LA
TIO
NS
HIP
S
No
te:
Th
isfi
gu
red
epic
tsth
eu
nco
nd
itio
nal
hu
mp
-sh
aped
rela
tio
nsh
ip,b
ased
on
thre
ed
iffe
ren
tes
tim
atio
nte
chn
iqu
es,b
etw
een
pre
dic
ted
gen
etic
ho
mo
gen
eity
(i.e
.,1
min
us
gen
etic
div
ersi
tyas
pre
dic
ted
by
mig
rato
ryd
ista
nce
fro
mE
ast
Afr
ica)
and
log
po
pu
lati
on
den
sity
in1
50
0C
Ein
the
exte
nd
ed1
45
-co
un
try
sam
ple
.T
he
rela
tio
nsh
ipis
esti
mat
edu
sin
g(i
)
ale
ast-
squ
ares
qu
adra
tic
fit
(bo
thp
anel
s),
(ii)
an
on
par
amet
ric
reg
ress
ion
(Pan
el(a
)),
and
(iii
)a
rest
rict
edcu
bic
spli
ne
reg
ress
ion
(Pan
el(b
)).
Th
en
on
par
amet
ric
reg
ress
ion
lin
e
inP
anel
(a)
ises
tim
ated
usi
ng
loca
l2
nd
-deg
ree
po
lyn
om
ial
smo
oth
ing
bas
edo
na
Gau
ssia
nker
nel
fun
ctio
nan
da
ker
nel
ban
dw
idth
of
0.0
6.
Th
ere
stri
cted
cub
icsp
lin
ere
gre
ssio
n
lin
ein
Pan
el(b
)is
esti
mat
edu
sin
gth
ree
equ
ally
-sp
aced
kn
ots
on
the
do
mai
no
fp
red
icte
dg
enet
ich
om
og
enei
tyval
ues
.T
he
shad
edar
eas
inP
anel
s(a
)an
d(b
)re
pre
sen
tth
e9
5
per
cen
tco
nfi
den
cein
terv
alb
and
sas
soci
ated
wit
hth
en
on
par
amet
ric
and
cub
icsp
lin
ere
gre
ssio
nli
nes
resp
ecti
vel
y.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 21
(a)
Fir
st-O
rder
Par
tial
Eff
ect
(b)
Sec
on
d-O
rder
Par
tial
Eff
ect
FIG
UR
EC
3.
PR
ED
ICT
ED
GE
NE
TIC
DIV
ER
SIT
YA
ND
PO
PU
LA
TIO
ND
EN
SIT
YIN
15
00
CE
–T
HE
FIR
ST-
AN
DS
EC
ON
D-O
RD
ER
PA
RT
IAL
EF
FE
CT
S
No
te:
Th
isfi
gu
red
epic
tsth
ep
osi
tive
firs
t-o
rder
par
tial
effe
ct(P
anel
(a))
and
the
neg
ativ
ese
con
d-o
rder
par
tial
effe
ct(P
anel
(b))
of
pre
dic
ted
gen
etic
ho
mo
gen
eity
(i.e
.,1
min
us
gen
etic
div
ersi
tyas
pre
dic
ted
by
mig
rato
ryd
ista
nce
fro
mE
ast
Afr
ica)
on
log
po
pu
lati
on
den
sity
in1
50
0C
Ein
the
exte
nd
ed1
45
-co
un
try
sam
ple
,co
nd
itio
nal
on
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
,la
nd
pro
du
ctiv
ity,
and
con
tin
ent
fixed
effe
cts.
Eac
hp
anel
show
san
add
ed-v
aria
ble
plo
to
fre
sid
ual
sag
ain
stre
sid
ual
s.T
hu
s,th
ex
-an
dy
-axes
inP
anel
(a)
plo
tth
ere
sid
ual
so
bta
ined
fro
mre
gre
ssin
gth
eli
nea
rte
rmin
gen
etic
ho
mo
gen
eity
and
log
po
pu
lati
on
den
sity
,re
spec
tivel
y,o
nth
eq
uad
rati
cte
rmin
gen
etic
ho
mo
gen
eity
asw
ell
asth
eaf
ore
men
tio
ned
set
of
covar
iate
s.C
onver
sely
,th
ex
-an
dy
-axes
inP
anel
(b)
plo
tth
ere
sid
ual
so
bta
ined
fro
mre
gre
ssin
gth
eq
uad
rati
cte
rmin
gen
etic
ho
mo
gen
eity
and
log
po
pu
lati
on
den
sity
,re
spec
tivel
y,o
nth
eli
nea
rte
rmin
gen
etic
ho
mo
gen
eity
asw
ell
asth
eaf
ore
men
tio
ned
set
of
covar
iate
s.
22 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
(a)
Qu
adra
tic
vs.
No
np
aram
etri
c(b
)Q
uad
rati
cv
s.C
ub
icS
pli
ne
FIG
UR
EC
4.
AN
CE
ST
RY
-AD
JU
ST
ED
GE
NE
TIC
DIV
ER
SIT
YA
ND
INC
OM
EP
ER
CA
PIT
AIN
20
00
CE
–T
HE
UN
CO
ND
ITIO
NA
LQ
UA
DR
AT
IC,
NO
NP
AR
AM
ET
RIC
,A
ND
CU
BIC
SP
LIN
ER
EL
AT
ION
SH
IPS
No
te:
Th
isfi
gu
red
epic
tsth
eu
nco
nd
itio
nal
hu
mp
-sh
aped
rela
tio
nsh
ip,
bas
edo
nth
ree
dif
fere
nt
esti
mat
ion
tech
niq
ues
,b
etw
een
ance
stry
-ad
just
edg
enet
ich
om
og
enei
ty(i
.e.,
1
min
us
ance
stry
-ad
just
edg
enet
icd
iver
sity
)an
dlo
gin
com
ep
erca
pit
ain
20
00
CE
ina
14
3-c
ou
ntr
ysa
mp
le.
Th
ere
lati
on
ship
ises
tim
ated
usi
ng
(i)
ale
ast-
squ
ares
qu
adra
tic
fit
(bo
th
pan
els)
,(i
i)a
no
np
aram
etri
cre
gre
ssio
n(P
anel
(a))
,an
d(i
ii)
are
stri
cted
cub
icsp
lin
ere
gre
ssio
n(P
anel
(b))
.T
he
no
np
aram
etri
cre
gre
ssio
nli
ne
inP
anel
(a)
ises
tim
ated
usi
ng
loca
l
2n
d-d
egre
ep
oly
no
mia
lsm
oo
thin
gb
ased
on
aG
auss
ian
ker
nel
fun
ctio
nan
da
ker
nel
ban
dw
idth
of
0.0
6.
Th
ere
stri
cted
cub
icsp
lin
ere
gre
ssio
nli
ne
inP
anel
(b)
ises
tim
ated
usi
ng
thre
eeq
ual
ly-s
pac
edk
no
tso
nth
ed
om
ain
of
ance
stry
-ad
just
edg
enet
ich
om
og
enei
tyval
ues
.T
he
shad
edar
eas
inP
anel
s(a
)an
d(b
)re
pre
sen
tth
e9
5p
erce
nt
con
fid
ence
inte
rval
ban
ds
asso
ciat
edw
ith
the
no
np
aram
etri
can
dcu
bic
spli
ne
reg
ress
ion
lin
esre
spec
tivel
y.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 23
(a)
Fir
st-O
rder
Par
tial
Eff
ect
(b)
Sec
on
d-O
rder
Par
tial
Eff
ect
FIG
UR
EC
5.
AN
CE
ST
RY
-AD
JU
ST
ED
GE
NE
TIC
DIV
ER
SIT
YA
ND
INC
OM
EP
ER
CA
PIT
AIN
20
00
CE
–T
HE
FIR
ST-
AN
DS
EC
ON
D-O
RD
ER
PA
RT
IAL
EF
FE
CT
S
No
te:
Th
isfi
gu
red
epic
tsth
ep
osi
tive
firs
t-o
rder
par
tial
effe
ct(P
anel
(a))
and
the
neg
ativ
ese
con
d-o
rder
par
tial
effe
ct(P
anel
(b))
of
ance
stry
-ad
just
edg
enet
ich
om
og
enei
ty(i
.e.,
1
min
us
ance
stry
-ad
just
edg
enet
icd
iver
sity
)o
nlo
gin
com
ep
erca
pit
ain
20
00
CE
ina
10
9-c
ou
ntr
ysa
mp
le,
con
dit
ion
alo
nth
ean
cest
ry-a
dju
sted
tim
ing
of
the
Neo
lith
icR
evo
luti
on
,
lan
dp
rod
uct
ivit
y,a
vec
tor
of
inst
itu
tio
nal
,cu
ltu
ral,
and
geo
gra
ph
ical
det
erm
inan
tso
fd
evel
op
men
t,an
dco
nti
nen
tfi
xed
effe
cts.
Eac
hp
anel
show
san
add
ed-v
aria
ble
plo
to
f
resi
du
als
agai
nst
resi
du
als.
Th
us,
the
x-
and
y-a
xes
inP
anel
(a)
plo
tth
ere
sid
ual
so
bta
ined
fro
mre
gre
ssin
gth
eli
nea
rte
rmin
gen
etic
ho
mo
gen
eity
and
log
inco
me
per
cap
ita,
resp
ecti
vel
y,o
nth
eq
uad
rati
cte
rmin
gen
etic
ho
mo
gen
eity
asw
ell
asth
eaf
ore
men
tio
ned
set
of
covar
iate
s.C
onver
sely
,th
ex
-an
dy
-axes
inP
anel
(b)
plo
tth
ere
sid
ual
so
bta
ined
fro
mre
gre
ssin
gth
eq
uad
rati
cte
rmin
gen
etic
ho
mo
gen
eity
and
log
inco
me
per
cap
ita,
resp
ecti
vel
y,o
nth
eli
nea
rte
rmin
gen
etic
ho
mo
gen
eity
asw
ell
asth
eaf
ore
men
tio
ned
set
of
covar
iate
s.
24 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
(a)
Mig
rato
ryD
ista
nce
and
Sk
inR
eflec
tan
ce(b
)M
igra
tory
Dis
tan
cean
dH
eig
ht
(c)
Mig
rato
ryD
ista
nce
and
Wei
gh
t
FIG
UR
EC
6.
AN
CE
ST
RY
-AD
JU
ST
ED
MIG
RA
TO
RY
DIS
TA
NC
EF
RO
ME
AS
TA
FR
ICA
AN
DS
OM
EM
EA
NP
HY
SIO
LO
GIC
AL
CH
AR
AC
TE
RIS
TIC
SO
FC
ON
TE
MP
OR
AR
YN
AT
ION
AL
PO
PU
LA
TIO
NS
No
te:
Th
isfi
gu
red
epic
tsth
atan
cest
ry-a
dju
sted
mig
rato
ryd
ista
nce
fro
mE
ast
Afr
ica
has
no
syst
emat
icre
lati
on
ship
wit
hso
me
mea
np
hy
sio
log
ical
char
acte
rist
ics
of
con
tem
po
rary
nat
ion
alp
op
ula
tio
ns,
incl
ud
ing
the
aver
age
deg
ree
of
skin
refl
ecta
nce
(Pan
el(a
)),
aver
age
hei
gh
t(P
anel
(b))
,an
dav
erag
ew
eig
ht
(Pan
el(c
)),
con
dit
ion
alo
nth
ein
ten
sity
of
ult
rav
iole
tex
po
sure
,ab
solu
tela
titu
de,
the
per
cen
tag
eo
far
able
lan
d,
the
shar
eso
fla
nd
intr
op
ical
(or
sub
tro
pic
al)
and
tem
per
ate
zon
es,
mea
nel
evat
ion
,ac
cess
tow
ater
way
s,
and
con
tin
ent
fixed
effe
cts.
Eac
hp
anel
show
san
add
ed-v
aria
ble
plo
to
fre
sid
ual
sag
ain
stre
sid
ual
s.T
hu
s,th
ex
-an
dy
-axes
inea
chp
anel
plo
tth
ere
sid
ual
so
bta
ined
fro
m
reg
ress
ing
,re
spec
tivel
y,an
cest
ry-a
dju
sted
mig
rato
ryd
ista
nce
and
the
rele
van
tp
hy
sio
log
ical
char
acte
rist
ic(i
.e.,
aver
age
skin
refl
ecta
nce
,av
erag
eh
eig
ht,
or
aver
age
wei
gh
t)o
nth
e
afo
rem
enti
on
edse
to
fco
var
iate
s.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 25
D SUPPLEMENTARY RESULTS
TABLE D1—ROBUSTNESS OF THE ROLE OF MIGRATORY DISTANCE IN THE SERIAL FOUNDER EFFECT
(1) (2) (3) (4) (5) (6)
Dependent variable is observed genetic diversity
Migratory distance from -0.799*** -0.826*** -0.798*** -0.796*** -0.798*** -0.690***
East Africa (0.054) (0.062) (0.066) (0.072) (0.089) (0.148)
Absolute latitude -0.016 -0.003 -0.004 -0.003 0.074
(0.015) (0.017) (0.015) (0.022) (0.045)
Percentage of arable land -0.015 -0.013 -0.009 -0.010 0.002
(0.026) (0.031) (0.028) (0.040) (0.045)
Mean land suitability for 1.937 -1.244 -0.795 -0.904 1.370
agriculture (1.507) (5.400) (4.803) (6.260) (5.330)
Range of land suitability -1.175 -1.594 -1.477 -2.039
(4.564) (4.364) (5.789) (5.715)
Land suitability Gini -3.712 -3.767 -3.805 -4.103
(4.774) (4.402) (4.805) (4.165)
Mean elevation 0.937 0.918 -2.457
(2.352) (2.393) (1.567)
Standard deviation of -0.129 -0.112 3.418
elevation (2.284) (2.288) (2.137)
Mean distance to nearest -0.044 0.503
waterway (1.153) (0.982)
Continent fixed effects No No No No No Yes
Observations 21 21 21 21 21 21
R2 0.94 0.95 0.95 0.96 0.96 0.98
Partial R2 of migratory 0.94 0.93 0.93 0.90 0.81
distance
Note: Using the limited 21-country sample, this table (i) establishes that the significant negative effect of migratory
distance from East Africa on observed genetic diversity is robust to controls for geographical factors linked to ethnic
diversity (Michalopoulos 2011), including absolute latitude, the percentage of arable land, the mean, range, and Gini
coefficient of the distribution of land suitability for agriculture, the mean and standard deviation of the distribution of
elevation, access to waterways, and continent fixed effects, and (ii) demonstrates that these geographical factors have
little or no explanatory power for the cross-country variation in observed genetic diversity beyond that accounted for by
migratory distance. Heteroskedasticity robust standard errors are reported in parentheses.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
26 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TABLE D2—THE RESULTS OF TABLE 1 WITH CORRECTIONS FOR SPATIAL AUTOCORRELATION
(1) (2) (3) (4) (5)
Dependent variable is log population density in 1500 CE
Observed diversity 413.504*** 225.440*** 203.814***
[85.389] [55.428] [65.681]
Observed diversity square -302.647*** -161.158*** -145.717***
[64.267] [42.211] [53.562]
Log Neolithic transition 2.396*** 1.214*** 1.135***
timing [0.249] [0.271] [0.367]
Log percentage of arable 0.730*** 0.516*** 0.545***
land [0.263] [0.132] [0.178]
Log absolute latitude 0.145 -0.162* -0.129
[0.180] [0.084] [0.101]
Log land suitability for 0.734* 0.571** 0.587**
agriculture [0.376] [0.240] [0.233]
Continent fixed effects No No No No Yes
Observations 21 21 21 21 21
R2 0.42 0.54 0.57 0.89 0.90
Note: This table establishes that the significant hump-shaped relationship between observed genetic diversity and log
population density in 1500 CE in the limited 21-country sample, while controlling for the timing of the Neolithic
Revolution, land productivity, and continent fixed effects, is robust to accounting for spatial autocorrelation across
observations. Standard errors corrected for spatial autocorrelation, following Conley (1999), are reported in brackets. To
perform this correction, the spatial distribution of observations is specified on the Euclidean plane using aerial distances
between all pairs of observations in the sample, and the autocorrelation is modeled as declining linearly away from each
observation up to a threshold of 5,000 km. This threshold effectively excludes spatial interactions between the Old World
and the New World, which is appropriate given the historical period being considered.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 27
TA
BL
ED
3—
TH
ER
ES
UL
TS
OF
TA
BL
E2
WIT
HC
OR
RE
CT
ION
SF
OR
SP
AT
IAL
AU
TO
CO
RR
EL
AT
ION
Sp
atia
lS
pat
ial
OL
SO
LS
OL
SO
LS
GM
MG
MM
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
log
po
pu
lati
on
den
sity
in1
50
0C
E
Ob
serv
edd
iver
sity
25
5.2
19
**
*3
61
.42
0*
**
28
5.1
91
**
*2
43
.10
8*
**
[77
.93
3]
[10
8.6
92
][8
1.5
38
][5
5.3
25
]
Ob
serv
edd
iver
sity
squ
are
-20
9.8
08
**
*-2
68
.51
4*
**
-20
6.5
77
**
*-1
79
.57
9*
**
[58
.31
5]
[77
.74
0]
[61
.90
6]
[44
.27
1]
Mig
rato
ryd
ista
nce
0.5
05
**
*0
.07
0
[0.1
10
][0
.13
8]
Mig
rato
ryd
ista
nce
squ
are
-0.0
23
**
*-0
.01
4*
[0.0
04
][0
.00
8]
Mo
bil
ity
ind
ex0
.35
3*
**
0.0
51
[0.1
08
][0
.12
5]
Mo
bil
ity
ind
exsq
uar
e-0
.01
2*
**
-0.0
03
[0.0
03
][0
.00
5]
Lo
gN
eoli
thic
tran
siti
on
1.0
14
**
*1
.11
9*
**
tim
ing
[0.3
56
][0
.37
0]
Lo
gp
erce
nta
ge
of
arab
le0
.60
8*
**
0.6
34
**
*
lan
d[0
.18
7]
[0.2
02
]
Lo
gab
solu
tela
titu
de
-0.2
09
**
-0.1
33
[0.1
07
][0
.09
9]
Lo
gla
nd
suit
abil
ity
for
0.4
94
**
0.5
49
**
agri
cult
ure
[0.2
27
][0
.24
2]
Co
nti
nen
tfi
xed
effe
cts
No
No
No
No
No
Yes
Ob
serv
atio
ns
21
21
18
18
21
21
R2
0.3
40
.46
0.3
00
.43
––
No
te:
Th
ista
ble
esta
bli
shes
that
the
resu
lts
fro
m(i
)p
erfo
rmin
gan
info
rmal
iden
tifi
cati
on
test
,w
hic
hd
emo
nst
rate
sth
atth
esi
gn
ifica
nt
un
con
dit
ion
alh
um
p-s
hap
edim
pac
to
f
mig
rato
ryd
ista
nce
fro
mE
ast
Afr
ica
on
log
po
pu
lati
on
den
sity
in1
50
0C
Ein
the
lim
ited
21
-co
un
try
sam
ple
alm
ost
enti
rely
refl
ects
the
late
nt
infl
uen
ceo
fo
bse
rved
gen
etic
div
ersi
ty,
and
(ii)
emp
loy
ing
mig
rato
ryd
ista
nce
asan
excl
ud
edin
stru
men
tfo
ro
bse
rved
gen
etic
div
ersi
tyto
esta
bli
shth
eca
usa
lh
um
p-s
hap
edef
fect
of
gen
etic
div
ersi
tyo
nlo
g
po
pu
lati
on
den
sity
in1
50
0C
Ein
the
lim
ited
21
-co
un
try
sam
ple
,w
hil
eco
ntr
oll
ing
for
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
,la
nd
pro
du
ctiv
ity,
and
con
tin
ent
fixed
effe
cts,
are
robu
stto
acco
un
tin
gfo
rsp
atia
lau
toco
rrel
atio
nac
ross
ob
serv
atio
ns.
Co
lum
ns
5–
6p
rese
nt
the
resu
lts
fro
mes
tim
atin
gth
eco
rres
po
nd
ing
2S
LS
spec
ifica
tio
ns
inT
able
2o
fth
ep
aper
usi
ng
Co
nle
y’s
(19
99
)sp
atia
lG
MM
esti
mat
ion
pro
ced
ure
.S
tan
dar
der
rors
corr
ecte
dfo
rsp
atia
lau
toco
rrel
atio
n,fo
llow
ing
Co
nle
y(1
99
9),
are
rep
ort
edin
bra
cket
s.T
op
erfo
rmth
is
corr
ecti
on
,th
esp
atia
ld
istr
ibu
tio
no
fo
bse
rvat
ion
sis
spec
ified
on
the
Eu
clid
ean
pla
ne
usi
ng
aeri
ald
ista
nce
sb
etw
een
all
pai
rso
fo
bse
rvat
ion
sin
the
sam
ple
,an
dth
eau
toco
rrel
atio
n
ism
od
eled
asd
ecli
nin
gli
nea
rly
away
fro
mea
cho
bse
rvat
ion
up
toa
thre
sho
ldo
f5
,00
0k
m.
Th
isth
resh
old
effe
ctiv
ely
excl
ud
essp
atia
lin
tera
ctio
ns
bet
wee
nth
eO
ldW
orl
dan
dth
e
New
Wo
rld
,w
hic
his
app
rop
riat
eg
iven
the
his
tori
cal
per
iod
bei
ng
con
sid
ered
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
28 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TA
BL
ED
4—
RO
BU
ST
NE
SS
TO
LO
G-L
OG
SP
EC
IFIC
AT
ION
S
Lim
ited
-sam
ple
his
tori
cal
Ex
ten
ded
-sam
ple
his
tori
cal
Co
nte
mp
ora
ry
anal
ysi
san
aly
sis
anal
ysi
s
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
:
Lo
gp
op
ula
tio
nL
og
po
pu
lati
on
Lo
gp
op
ula
tio
nL
og
po
pu
lati
on
Lo
gin
com
ep
er
den
sity
ind
ensi
tyin
den
sity
ind
ensi
tyin
cap
ita
in
15
00
CE
15
00
CE
10
00
CE
1C
E2
00
0C
E
Lo
gd
iver
sity
45
1.0
66
**
41
5.8
64
*4
23
.29
8*
*4
20
.59
5*
*4
81
.77
1*
*5
11
.65
0*
**
(15
1.7
97
)(1
90
.65
1)
(17
0.9
63
)(1
99
.56
5)
(23
9.6
30
)(1
83
.41
2)
Sq
uar
edlo
gd
iver
sity
-42
5.5
79
**
-39
6.9
06
*-4
07
.99
0*
*-4
02
.34
4*
*-4
58
.41
3*
*-4
76
.12
5*
**
(15
0.8
83
)(2
04
.65
5)
(15
9.5
15
)(1
85
.25
8)
(22
2.3
06
)(1
72
.28
5)
Lo
gN
eoli
thic
tran
siti
on
1.2
49
**
*1
.17
7*
1.2
41
**
*1
.60
6*
**
2.1
32
**
*0
.06
2
tim
ing
(0.3
70
)(0
.61
8)
(0.2
32
)(0
.26
5)
(0.4
25
)(0
.26
0)
Lo
gp
erce
nta
ge
of
arab
le0
.51
0*
**
0.4
17
*0
.39
2*
**
0.3
70
**
*0
.34
8*
**
-0.1
22
lan
d(0
.16
4)
(0.2
21
)(0
.10
5)
(0.1
16
)(0
.13
3)
(0.1
04
)
Lo
gab
solu
tela
titu
de
-0.1
55
-0.1
23
-0.4
13
**
*-0
.36
8*
**
-0.1
10
0.1
73
(0.1
31
)(0
.17
0)
(0.1
24
)(0
.12
7)
(0.1
37
)(0
.12
6)
Lo
gla
nd
suit
abil
ity
for
0.5
71
*0
.67
40
.25
6*
**
0.1
89
*0
.20
9-0
.17
6*
agri
cult
ure
(0.2
96
)(0
.40
7)
(0.0
99
)(0
.10
8)
(0.1
29
)(0
.09
8)
Co
nti
nen
tfi
xed
effe
cts
No
Yes
Yes
Yes
Yes
Yes
Ob
serv
atio
ns
21
21
14
51
40
12
61
43
R2
0.8
90
.91
0.6
90
.62
0.6
20
.57
No
te:
Th
ista
ble
esta
bli
shes
that
,in
bo
thth
eli
mit
ed-
and
exte
nd
ed-s
amp
levar
ian
tso
fth
eh
isto
rica
lan
aly
sis
asw
ell
asin
the
con
tem
po
rary
anal
ysi
s,th
eh
um
p-s
hap
edef
fect
of
gen
etic
div
ersi
tyo
nec
on
om
icd
evel
op
men
t,w
hil
eco
ntr
oll
ing
for
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
,la
nd
pro
du
ctiv
ity,
and
con
tin
ent
fixed
effe
cts,
rem
ain
sq
ual
itat
ivel
yro
bu
st
un
der
qu
adra
tic
spec
ifica
tio
ns
inlo
gg
edg
enet
icd
iver
sity
,th
ereb
yd
emo
nst
rati
ng
that
the
tru
ere
lati
on
ship
bet
wee
ng
enet
icd
iver
sity
and
eco
no
mic
dev
elo
pm
ent,
asex
amin
edin
the
pap
er,is
ind
eed
hu
mp
shap
edra
ther
than
lin
ear
inlo
g-t
ran
sfo
rmed
var
iab
les.
Th
ere
levan
tm
easu
res
of
gen
etic
div
ersi
tyem
plo
yed
by
the
anal
ysi
sar
eo
bse
rved
gen
etic
div
ersi
tyin
Co
lum
ns
1–
2,
pre
dic
ted
gen
etic
div
ersi
ty(i
.e.,
gen
etic
div
ersi
tyas
pre
dic
ted
by
mig
rato
ryd
ista
nce
fro
mE
ast
Afr
ica)
inC
olu
mn
s3
–5
,an
dan
cest
ry-a
dju
sted
gen
etic
div
ersi
tyin
Co
lum
n6
.T
he
ance
stry
-ad
just
edm
easu
reo
fth
eti
min
go
fth
eN
eoli
thic
Rev
olu
tio
nis
use
din
Co
lum
n6
.H
eter
osk
edas
tici
tyro
bu
stst
and
ard
erro
rsar
ere
po
rted
inp
aren
thes
esin
Co
lum
ns
1–
2.
Bo
ots
trap
stan
dar
der
rors
,ac
cou
nti
ng
for
the
use
of
gen
erat
edre
gre
sso
rs,
are
rep
ort
edin
par
enth
eses
inC
olu
mn
s3
–6
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 29
TA
BL
ED
5—
RO
BU
ST
NE
SS
TO
US
ING
GE
NE
TIC
DIV
ER
SIT
YP
RE
DIC
TE
DB
YT
HE
HU
MA
NM
OB
ILIT
YIN
DE
X
Ex
ten
ded
-sam
ple
his
tori
cal
Co
nte
mp
ora
ry
anal
ysi
san
aly
sis
(1)
(2)
(3)
(4)
Dep
end
ent
var
iab
leis
:
Lo
gp
op
ula
tio
nL
og
po
pu
lati
on
Lo
gp
op
ula
tio
nL
og
inco
me
per
den
sity
ind
ensi
tyin
den
sity
inca
pit
ain
15
00
CE
10
00
CE
1C
E2
00
0C
E
Div
ersi
ty1
52
.16
3*
**
14
6.5
51
**
19
6.1
45
**
17
5.1
30
**
(54
.10
0)
(64
.08
2)
(80
.09
3)
(68
.27
9)
Div
ersi
tysq
uar
e-1
13
.29
8*
**
-11
0.7
52
**
-14
8.5
69
**
*-1
20
.27
0*
*
(37
.70
7)
(44
.39
2)
(55
.51
3)
(47
.99
8)
Lo
gN
eoli
thic
tran
siti
on
1.5
53
**
*2
.05
2*
**
3.2
42
**
*0
.04
5
tim
ing
(0.2
55
)(0
.26
5)
(0.3
49
)(0
.27
0)
Lo
gp
erce
nta
ge
of
arab
le0
.36
2*
**
0.3
14
**
0.1
82
-0.1
46
lan
d(0
.11
1)
(0.1
28
)(0
.12
6)
(0.1
10
)
Lo
gab
solu
tela
titu
de
-0.4
89
**
*-0
.40
1*
**
-0.0
87
0.1
92
(0.1
35
)(0
.14
6)
(0.1
29
)(0
.14
2)
Lo
gla
nd
suit
abil
ity
for
0.2
56
**
0.2
21
*0
.30
1*
**
-0.1
47
agri
cult
ure
(0.1
01
)(0
.11
8)
(0.1
16
)(0
.10
6)
Co
nti
nen
tfi
xed
effe
cts
Yes
Yes
Yes
Yes
Ob
serv
atio
ns
12
71
23
11
31
25
R2
0.7
10
.66
0.6
90
.55
No
te:
Th
ista
ble
esta
bli
shes
that
,in
bo
thth
eex
ten
ded
-sam
ple
his
tori
cal
anal
ysi
san
dth
eco
nte
mp
ora
ryan
aly
sis,
the
hu
mp
-sh
aped
effe
cto
fg
enet
icd
iver
sity
on
eco
no
mic
dev
elo
pm
ent,
wh
ile
con
tro
llin
gfo
rth
eti
min
go
fth
eN
eoli
thic
Rev
olu
tio
n,
lan
dp
rod
uct
ivit
y,an
dco
nti
nen
tfi
xed
effe
cts,
rem
ain
sq
ual
itat
ivel
yro
bu
stw
hen
the
hu
man
mo
bil
ity
ind
ex,
aso
pp
ose
dto
the
bas
elin
ew
ayp
oin
ts-r
estr
icte
dm
igra
tory
dis
tan
ceco
nce
pt
of
Ram
ach
and
ran
etal
.(2
00
5),
isu
sed
top
red
ict
gen
etic
div
ersi
tyfo
rth
eex
ten
ded
-sam
ple
his
tori
cal
anal
ysi
san
dto
con
stru
ctth
ean
cest
ry-a
dju
sted
mea
sure
of
gen
etic
div
ersi
tyfo
rth
eco
nte
mp
ora
ryan
aly
sis.
Th
ere
levan
tm
easu
res
of
gen
etic
div
ersi
tyem
plo
yed
by
the
anal
ysi
sar
eg
enet
icd
iver
sity
pre
dic
ted
by
the
hu
man
mo
bil
ity
ind
exin
Co
lum
ns
1–
3an
dit
san
cest
ry-a
dju
sted
cou
nte
rpar
tin
Co
lum
n4
.T
he
ance
stry
-ad
just
edm
easu
reo
fth
e
tim
ing
of
the
Neo
lith
icR
evo
luti
on
isu
sed
inC
olu
mn
4.
Bo
ots
trap
stan
dar
der
rors
,ac
cou
nti
ng
for
the
use
of
gen
erat
edre
gre
sso
rs,
are
rep
ort
edin
par
enth
eses
.F
or
add
itio
nal
det
ails
on
the
hu
man
mo
bil
ity
ind
exan
dh
ow
itis
use
dto
eith
erp
red
ict
gen
etic
div
ersi
tyfo
rth
eex
ten
ded
-sam
ple
his
tori
cal
anal
ysi
so
rto
con
stru
ctth
ean
cest
ry-a
dju
sted
mea
sure
of
gen
etic
div
ersi
tyfo
rth
eco
nte
mp
ora
ryan
aly
sis,
the
read
eris
refe
rred
toth
ed
efin
itio
ns
of
thes
evar
iab
les
inS
ecti
on
Fo
fth
isap
pen
dix
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
30 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TA
BL
ED
6—
TH
EZ
ER
OT
H-
AN
DF
IRS
T-S
TA
GE
RE
SU
LT
SO
FT
HE
2S
LS
RE
GR
ES
SIO
NS
INT
AB
LE
2
2S
LS
reg
ress
ion
inC
olu
mn
5,
Tab
le2
2S
LS
reg
ress
ion
inC
olu
mn
6,
Tab
le2
Zer
oth
stag
eF
irst
stag
eZ
ero
thst
age
Fir
stst
age
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
:D
epen
den
tvar
iab
leis
:
Ob
serv
edO
bse
rved
Ob
serv
edO
bse
rved
div
ersi
tyO
bse
rved
Ob
serv
edd
iver
sity
div
ersi
tyd
iver
sity
squ
are
div
ersi
tyd
iver
sity
squ
are
Exclu
ded
inst
rum
en
ts:
Mig
rato
ryd
ista
nce
fro
m-0
.81
4*
**
-5.1
43
**
*-5
.84
1*
**
-0.6
57
**
*-3
.99
2*
**
-4.2
34
**
*
Eas
tA
fric
a(0
.06
4)
(0.6
86
)(0
.95
9)
(0.1
96
)(0
.85
0)
(1.0
53
)
Pre
dic
ted
div
ersi
tysq
uar
e–
-3.9
37
**
*-4
.32
7*
**
–-3
.69
9*
**
-3.7
52
**
*
(fro
mze
roth
stag
e)(0
.62
8)
(0.8
80
)(0
.91
6)
(1.1
45
)
Seco
nd
stage
co
ntr
ols
:
Lo
gN
eoli
thic
tran
siti
on
-0.0
17
**
-0.1
19
**
*-0
.13
9*
**
-0.0
13
**
-0.0
82
**
*-0
.08
8*
**
tim
ing
(0.0
06
)(0
.01
6)
(0.0
23
)(0
.00
4)
(0.0
17
)(0
.02
1)
Lo
gp
erce
nta
ge
of
arab
le0
.00
00
.00
4*
*0
.00
7*
*0
.00
20
.01
5*
**
0.0
17
**
*
lan
d(0
.00
3)
(0.0
02
)(0
.00
3)
(0.0
04
)(0
.00
3)
(0.0
05
)
Lo
gab
solu
tela
titu
de
0.0
03
*0
.01
8*
**
0.0
20
**
*0
.00
5*
0.0
27
**
*0
.02
8*
**
(0.0
02
)(0
.00
2)
(0.0
03
)(0
.00
2)
(0.0
06
)(0
.00
7)
Lo
gla
nd
suit
abil
ity
for
0.0
05
0.0
32
**
*0
.03
5*
**
0.0
05
0.0
31
**
*0
.03
2*
**
agri
cult
ure
(0.0
05
)(0
.00
4)
(0.0
06
)(0
.00
5)
(0.0
07
)(0
.00
9)
Co
nti
nen
tfi
xed
effe
cts
No
No
No
Yes
Yes
Yes
Ob
serv
atio
ns
21
21
21
21
21
21
R2
0.9
60
.99
0.9
90
.98
0.9
90
.99
No
te:
Th
ista
ble
pre
sen
tsth
eze
roth
-an
dfi
rst-
stag
ere
sult
so
fth
e2
SL
Sre
gre
ssio
ns
fro
mC
olu
mn
s5
–6
of
Tab
le2
inth
ep
aper
that
emp
loy
mig
rato
ryd
ista
nce
asan
excl
ud
ed
inst
rum
ent
for
ob
serv
edg
enet
icd
iver
sity
toes
tab
lish
the
cau
sal
hu
mp
-sh
aped
effe
cto
fg
enet
icd
iver
sity
on
log
po
pu
lati
on
den
sity
in1
50
0C
Ein
the
lim
ited
21
-co
un
try
sam
ple
wh
ile
con
tro
llin
gfo
rth
eti
min
go
fth
eN
eoli
thic
Rev
olu
tio
n,la
nd
pro
du
ctiv
ity,
and
con
tin
ent
fixed
effe
cts.
Sin
ceth
ese
con
dst
age
of
each
of
thes
e2
SL
Sre
gre
ssio
ns
isq
uad
rati
cin
the
end
og
eno
us
reg
ress
or,
itis
nec
essa
ryto
inst
rum
ent
for
bo
thg
enet
icd
iver
sity
an
dit
ssq
ua
red
term
ino
rder
for
the
syst
emto
be
exac
tly
iden
tifi
ed.
Th
us,
foll
ow
ing
Wo
old
rid
ge
(20
10
,p
p.2
67
–2
68
),th
ein
stru
men
tati
on
tech
niq
ue
intr
od
uce
sa
zero
thst
age
toth
ean
aly
sis
wh
ere
gen
etic
div
ersi
tyis
firs
tre
gre
ssed
on
mig
rato
ryd
ista
nce
and
all
the
seco
nd
-sta
ge
con
tro
lsto
ob
tain
pre
dic
ted
(i.e
.,fi
tted
)val
ues
of
div
ersi
ty.
Th
ep
red
icte
dg
enet
icd
iver
sity
fro
mth
eze
roth
stag
eis
squ
ared
,an
dth
issq
uar
edte
rmis
then
use
das
anex
clu
ded
inst
rum
ent
inth
ese
con
dst
age
alo
ng
wit
hm
igra
tory
dis
tan
ce.
Het
ero
sked
asti
city
robu
stst
and
ard
erro
rsar
ere
po
rted
inp
aren
thes
es.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 31
TA
BL
ED
7—
RO
BU
ST
NE
SS
TO
TH
EE
CO
LO
GIC
AL
CO
MP
ON
EN
TS
OF
LA
ND
SU
ITA
BIL
ITY
FO
RA
GR
ICU
LT
UR
E
Lim
ited
-sam
ple
his
tori
cal
Ex
ten
ded
-sam
ple
his
tori
cal
Co
nte
mp
ora
ry
anal
ysi
san
aly
sis
anal
ysi
s
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
:
Lo
gp
op
ula
tio
nL
og
po
pu
lati
on
Lo
gp
op
ula
tio
nL
og
po
pu
lati
on
Lo
gin
com
ep
er
den
sity
ind
ensi
tyin
den
sity
ind
ensi
tyin
cap
ita
in
15
00
CE
15
00
CE
10
00
CE
1C
E2
00
0C
E
Div
ersi
ty2
36
.19
2*
*2
42
.58
3*
*1
71
.88
1*
*1
81
.68
1*
*1
95
.61
0*
*2
72
.91
5*
**
(97
.55
4)
(92
.47
0)
(77
.88
5)
(88
.46
5)
(98
.91
7)
(95
.29
3)
Div
ersi
tysq
uar
e-1
66
.67
9*
*-1
71
.66
8*
*-1
22
.68
7*
*-1
29
.28
8*
*-1
37
.81
4*
*-1
92
.82
6*
**
(73
.11
4)
(74
.71
2)
(55
.16
8)
(62
.04
4)
(69
.33
1)
(67
.76
9)
Lo
gN
eoli
thic
tran
siti
on
1.5
65
**
*1
.68
9*
1.2
14
**
*1
.52
6*
**
1.8
39
**
*-0
.02
4
tim
ing
(0.3
78
)(0
.86
6)
(0.2
25
)(0
.24
4)
(0.3
57
)(0
.28
6)
Lo
gp
erce
nta
ge
of
arab
le0
.54
7*
*0
.53
90
.38
2*
**
0.3
38
**
*0
.19
9*
-0.1
35
lan
d(0
.19
2)
(0.3
13
)(0
.09
1)
(0.0
93
)(0
.10
4)
(0.0
89
)
Lo
gab
solu
tela
titu
de
0.1
83
0.2
21
-0.1
37
-0.1
24
0.2
92
*0
.06
7
(0.1
60
)(0
.28
9)
(0.1
66
)(0
.17
0)
(0.1
50
)(0
.16
4)
Lo
gte
mp
erat
ure
1.5
20
**
1.4
23
1.5
08
**
*1
.97
1*
**
3.0
92
**
*-0
.28
9
(0.5
91
)(0
.99
3)
(0.5
35
)(0
.55
8)
(0.5
80
)(0
.46
1)
Lo
gp
reci
pit
atio
n0
.71
4*
**
0.7
67
*0
.35
8*
*0
.19
70
.17
6-0
.23
5
(0.2
24
)(0
.37
2)
(0.1
53
)(0
.16
3)
(0.1
82
)(0
.15
5)
Lo
gso
ilfe
rtil
ity
0.1
62
0.1
71
0.4
73
0.5
41
0.7
72
**
-0.2
62
(0.5
54
)(0
.57
9)
(0.3
03
)(0
.32
9)
(0.3
59
)(0
.29
7)
Co
nti
nen
tfi
xed
effe
cts
No
Yes
Yes
Yes
Yes
Yes
Ob
serv
atio
ns
21
21
14
51
40
12
61
43
R2
0.9
30
.93
0.7
20
.67
0.7
20
.57
No
te:
Th
ista
ble
esta
bli
shes
that
,in
bo
thth
eli
mit
ed-
and
exte
nd
ed-s
amp
levar
ian
tso
fth
eh
isto
rica
lan
aly
sis
asw
ell
asin
the
con
tem
po
rary
anal
ysi
s,th
eh
um
p-s
hap
edef
fect
of
gen
etic
div
ersi
tyo
nec
on
om
icd
evel
op
men
t,w
hil
eco
ntr
oll
ing
for
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
,la
nd
pro
du
ctiv
ity,
and
con
tin
ent
fixed
effe
cts,
rem
ain
sq
ual
itat
ivel
yro
bu
st
wh
enco
ntr
ols
for
som
eo
fth
ein
div
idu
alec
olo
gic
alco
mp
on
ents
of
the
suit
abil
ity
of
lan
dfo
rag
ricu
ltu
re,
incl
ud
ing
tem
per
atu
re,
pre
cip
itat
ion
,an
dso
ilfe
rtil
ity,
are
use
din
lieu
of
the
bas
elin
eco
ntr
ol
for
the
over
all
ind
exo
fla
nd
suit
abil
ity.
Th
ere
levan
tm
easu
res
of
gen
etic
div
ersi
tyem
plo
yed
by
the
anal
ysi
sar
eo
bse
rved
gen
etic
div
ersi
tyin
Co
lum
ns
1–
2,
pre
dic
ted
gen
etic
div
ersi
ty(i
.e.,
gen
etic
div
ersi
tyas
pre
dic
ted
by
mig
rato
ryd
ista
nce
fro
mE
ast
Afr
ica)
inC
olu
mn
s3
–5
,an
dan
cest
ry-a
dju
sted
gen
etic
div
ersi
tyin
Co
lum
n6
.
Th
ean
cest
ry-a
dju
sted
mea
sure
of
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
isu
sed
inC
olu
mn
6.
Het
ero
sked
asti
city
robu
stst
and
ard
erro
rsar
ere
po
rted
inp
aren
thes
esin
Co
lum
ns
1–
2.
Bo
ots
trap
stan
dar
der
rors
,ac
cou
nti
ng
for
the
use
of
gen
erat
edre
gre
sso
rs,
are
rep
ort
edin
par
enth
eses
inC
olu
mn
s3
–6
.F
or
add
itio
nal
det
ails
on
the
ind
ivid
ual
eco
log
ical
com
po
nen
tso
fth
esu
itab
ilit
yo
fla
nd
for
agri
cult
ure
,th
ere
ader
isre
ferr
edto
the
defi
nit
ion
so
fth
ese
var
iab
les
inS
ecti
on
Fo
fth
isap
pen
dix
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
32 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TA
BL
ED
8—
RO
BU
ST
NE
SS
TO
TH
EM
OD
EO
FS
UB
SIS
TE
NC
EP
RE
VA
LE
NT
IN1
00
0C
E
Lim
ited
-sam
ple
his
tori
cal
Ex
ten
ded
-sam
ple
15
00
CE
Ex
ten
ded
-sam
ple
10
00
CE
anal
ysi
san
aly
sis
anal
ysi
s
Bas
elin
eA
ug
men
ted
Bas
elin
eA
ug
men
ted
Bas
elin
eA
ug
men
ted
spec
ifica
tio
nsp
ecifi
cati
on
spec
ifica
tio
nsp
ecifi
cati
on
spec
ifica
tio
nsp
ecifi
cati
on
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
:
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
Lo
gp
op
ula
tio
nd
ensi
tyin
10
00
CE
Div
ersi
ty1
86
.51
5*
*1
80
.34
0*
**
19
4.1
41
**
19
6.9
84
**
*2
00
.00
6*
*2
41
.01
9*
**
(67
.31
7)
(51
.27
5)
(76
.26
5)
(56
.21
5)
(96
.04
2)
(60
.90
2)
Div
ersi
tysq
uar
e-1
31
.92
2*
*-1
29
.88
2*
**
-14
2.3
49
**
*-1
44
.05
5*
**
-14
5.7
22
**
-17
3.8
61
**
*
(51
.71
7)
(39
.68
1)
(53
.58
5)
(40
.16
1)
(66
.58
5)
(42
.84
5)
Lo
gN
eoli
thic
tran
siti
on
1.4
01
0.8
67
1.1
07
**
*1
.01
3*
**
1.5
49
**
*1
.38
8*
**
tim
ing
(0.8
26
)(0
.84
9)
(0.2
54
)(0
.21
0)
(0.3
20
)(0
.25
2)
Lo
gsu
bsi
sten
cem
od
ein
1.6
78
**
*2
.03
0*
**
2.5
43
**
10
00
CE
(0.4
65
)(0
.73
0)
(0.9
96
)
Lo
gp
erce
nta
ge
of
arab
le0
.50
4*
*0
.45
3*
*0
.42
1*
**
0.4
58
**
*0
.38
4*
**
0.4
47
**
*
lan
d(0
.19
1)
(0.1
50
)(0
.10
1)
(0.0
97
)(0
.12
3)
(0.1
14
)
Lo
gab
solu
tela
titu
de
-0.1
72
-0.1
58
-0.4
20
**
*-0
.33
2*
**
-0.3
69
**
*-0
.28
8*
*
(0.1
50
)(0
.15
5)
(0.1
28
)(0
.11
5)
(0.1
41
)(0
.12
1)
Lo
gla
nd
suit
abil
ity
for
0.7
06
**
0.4
13
0.2
26
**
0.1
62
*0
.17
30
.06
7
agri
cult
ure
(0.2
98
)(0
.26
9)
(0.0
98
)(0
.09
4)
(0.1
11
)(0
.10
0)
Co
nti
nen
tfi
xed
effe
cts
Yes
Yes
Yes
Yes
Yes
Yes
Ob
serv
atio
ns
20
20
14
01
40
13
51
35
R2
0.9
00
.92
0.6
70
.72
0.5
80
.67
No
te:
Th
ista
ble
esta
bli
shes
that
,in
bo
thth
eli
mit
ed-s
amp
leh
isto
rica
lan
aly
sis
and
the
exte
nd
ed-s
amp
lean
aly
ses
for
the
yea
rs1
50
0C
Ean
d1
00
0C
E,
the
hu
mp
-sh
aped
effe
cto
f
gen
etic
div
ersi
tyo
nlo
gp
op
ula
tio
nd
ensi
ty,
wh
ile
con
tro
llin
gfo
rth
eti
min
go
fth
eN
eoli
thic
Rev
olu
tio
n,
lan
dp
rod
uct
ivit
y,an
dco
nti
nen
tfi
xed
effe
cts,
isro
bu
stto
anad
dit
ion
al
con
tro
lfo
rth
em
od
eo
fsu
bsi
sten
ce(i
.e.,
hu
nti
ng
and
gat
her
ing
ver
sus
sed
enta
ryag
ricu
ltu
re)
pra
ctic
edd
uri
ng
the
pre
colo
nia
ler
a.G
iven
un
der
lyin
gd
ata
avai
lab
ilit
yco
nst
rain
ts
on
con
stru
ctin
ga
pro
xy
for
the
mo
de
of
sub
sist
ence
pre
val
ent
inth
ey
ear
15
00
CE
,co
up
led
wit
hth
efa
ctth
atcr
oss
-co
un
try
sub
sist
ence
pat
tern
sin
10
00
CE
sho
uld
be
exp
ecte
d
tob
eh
igh
lyco
rrel
ated
wit
hth
ose
that
exis
ted
in1
50
0C
E,
the
anal
ysi
sco
ntr
ols
for
the
mo
de
of
sub
sist
ence
pre
val
ent
inth
ey
ear
10
00
CE
inau
gm
ente
dsp
ecifi
cati
on
sex
pla
inin
g
log
po
pu
lati
on
den
sity
inb
oth
tim
ep
erio
ds.
Mo
reover
,si
nce
dat
ao
nth
em
od
eo
fsu
bsi
sten
cep
reval
ent
inth
ey
ear
10
00
CE
are
un
avai
lab
lefo
rso
me
ob
serv
atio
ns
inth
eb
asel
ine
reg
ress
ion
sam
ple
sfo
rth
eli
mit
edan
dex
ten
ded
-sam
ple
his
tori
cal
anal
yse
s,th
eta
ble
also
pre
sen
tsth
ere
sult
sfr
om
esti
mat
ing
the
bas
elin
esp
ecifi
cati
on
sfo
rth
ese
anal
yse
su
sin
g
the
corr
esp
on
din
gsu
bsi
sten
cem
od
ed
ata-
rest
rict
edsa
mp
les.
Th
isp
erm
its
am
ore
fair
asse
ssm
ento
fre
du
ctio
ns
ino
mit
ted
var
iab
leb
ias
aris
ing
fro
mth
eo
mis
sio
no
fth
esu
bsi
sten
ce
mo
de
con
tro
lfr
om
the
bas
elin
esp
ecifi
cati
on
s.T
he
rele
van
tm
easu
res
of
gen
etic
div
ersi
tyem
plo
yed
by
the
anal
ysi
sar
eo
bse
rved
gen
etic
div
ersi
tyin
Co
lum
ns
1–
2an
dp
red
icte
d
gen
etic
div
ersi
ty(i
.e.,
gen
etic
div
ersi
tyas
pre
dic
ted
by
mig
rato
ryd
ista
nce
fro
mE
ast
Afr
ica)
inC
olu
mn
s3
–6
.H
eter
osk
edas
tici
tyro
bu
stst
and
ard
erro
rsar
ere
po
rted
inp
aren
thes
es
inC
olu
mn
s1
–2
.B
oo
tstr
apst
and
ard
erro
rs,
acco
un
tin
gfo
rth
eu
seo
fg
ener
ated
reg
ress
ors
,ar
ere
po
rted
inp
aren
thes
esin
Co
lum
ns
3–
6.
Fo
rad
dit
ion
ald
etai
lso
nh
ow
the
pro
xy
for
the
mo
de
of
sub
sist
ence
pre
val
ent
inth
ey
ear
10
00
CE
isco
nst
ruct
ed,
the
read
eris
refe
rred
toth
ed
efin
itio
no
fth
isvar
iab
lein
Sec
tio
nF
of
this
app
end
ix.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 33
TABLE D9—ANCESTRY-ADJUSTED MIGRATORY DISTANCE VERSUS ALTERNATIVE DISTANCES
(1) (2) (3) (4)
Dependent variable is log income per capita in 2000 CE
Migratory distance 0.588*** 0.488*** 0.502*** 0.528**
(ancestry adjusted) (0.074) (0.129) (0.157) (0.230)
Migratory distance square -0.029*** -0.025*** -0.026*** -0.026***
(ancestry adjusted) (0.004) (0.006) (0.007) (0.009)
Migratory distance 0.077
(unadjusted) (0.088)
Migratory distance square -0.002
(unadjusted) (0.003)
Aerial distance 0.096
(unadjusted) (0.198)
Aerial distance square -0.004
(unadjusted) (0.011)
Aerial distance 0.097
(ancestry adjusted) (0.328)
Aerial distance square -0.006
(ancestry adjusted) (0.018)
Observations 109 109 109 109
R2 0.28 0.29 0.29 0.28
Note: This table (i) establishes a significant unconditional hump-shaped impact of ancestry-adjusted migratory distance
from East Africa (i.e., the weighted average of the migratory distances of a country’s precolonial ancestral populations)
on log income per capita in 2000 CE, (ii) confirms that this nonmonotonic effect is robust to controls for alternative
concepts of distance, including (a) the unadjusted measure of migratory distance from East Africa, (b) aerial or “as the
crow flies” distance from East Africa, and (c) ancestry-adjusted aerial distance from East Africa, and (iii) demonstrates
that these alternative concepts of distance do not possess any systematic relationship, hump-shaped or otherwise, with
log income per capita in 2000 CE, conditional on accounting for the nonmonotonic effect of ancestry-adjusted migratory
distance. Heteroskedasticity robust standard errors are reported in parentheses.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
34 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013T
AB
LE
D1
0—
RO
BU
ST
NE
SS
TO
Fst
GE
NE
TIC
DIS
TA
NC
ES
TO
EA
ST
AF
RIC
AA
ND
TH
EW
OR
LD
TE
CH
NO
LO
GIC
AL
FR
ON
TIE
R
Lim
ited
-sam
ple
his
tori
cal
Ex
ten
ded
-sam
ple
his
tori
cal
Co
nte
mp
ora
ry
anal
ysi
san
aly
sis
anal
ysi
s
Bas
elin
eA
ug
men
ted
Bas
elin
eA
ug
men
ted
Bas
elin
eA
ug
men
ted
spec
ifica
tio
nsp
ecifi
cati
on
spec
ifica
tio
nsp
ecifi
cati
on
spec
ifica
tio
nsp
ecifi
cati
on
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
:
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
Lo
gin
com
ep
erca
pit
ain
20
00
CE
Div
ersi
ty1
84
.67
2*
*2
01
.23
0*
*1
94
.68
1*
*1
94
.74
3*
*1
86
.86
3*
*2
08
.89
6*
*
(63
.33
8)
(65
.97
0)
(81
.63
8)
(85
.60
3)
(83
.67
8)
(96
.23
0)
Div
ersi
tysq
uar
e-1
30
.46
6*
*-1
44
.43
5*
*-1
42
.20
4*
*-1
45
.34
7*
*-1
29
.00
8*
*-1
48
.17
0*
*
(48
.77
4)
(50
.45
0)
(57
.19
2)
(60
.80
2)
(59
.42
4)
(68
.75
2)
Lo
gN
eoli
thic
tran
siti
on
1.1
50
*1
.04
41
.23
6*
**
1.0
25
**
*-0
.22
0-0
.12
8
tim
ing
(0.6
34
)(0
.91
5)
(0.2
42
)(0
.31
2)
(0.2
87
)(0
.34
4)
Lo
gp
erce
nta
ge
of
arab
le0
.43
3*
*0
.31
40
.39
2*
**
0.3
62
**
*-0
.15
8-0
.13
7
lan
d(0
.17
2)
(0.3
96
)(0
.09
9)
(0.1
07
)(0
.10
0)
(0.1
02
)
Lo
gab
solu
tela
titu
de
-0.1
38
-0.1
60
-0.4
19
**
*-0
.47
7*
**
0.0
66
0.0
77
(0.1
50
)(0
.18
4)
(0.1
22
)(0
.12
8)
(0.1
18
)(0
.12
4)
Lo
gla
nd
suit
abil
ity
for
0.6
36
*0
.77
80
.25
9*
**
0.3
24
**
*-0
.14
6-0
.14
8
agri
cult
ure
(0.2
94
)(0
.55
9)
(0.0
95
)(0
.09
7)
(0.0
93
)(0
.09
2)
Gen
etic
dis
tan
ceto
the
-0.0
14
-0.0
27
U.K
.(1
50
0m
atch
)(0
.04
9)
(0.0
27
)
Gen
etic
dis
tan
ceto
-0.0
47
-0.0
69
Eth
iop
ia(1
50
0m
atch
)(0
.08
4)
(0.0
43
)
Gen
etic
dis
tan
ceto
the
0.0
30
U.S
.(w
eig
hte
d)
(0.0
37
)
Gen
etic
dis
tan
ceto
-0.0
76
*
Eth
iop
ia(w
eig
hte
d)
(0.0
44
)
Co
nti
nen
tfi
xed
effe
cts
Yes
Yes
Yes
Yes
Yes
Yes
Ob
serv
atio
ns
20
20
14
21
42
13
91
39
R2
0.9
10
.91
0.6
90
.71
0.6
10
.62
No
te:
Th
ista
ble
esta
bli
shes
that
,in
bo
thth
eli
mit
ed-
and
exte
nd
ed-s
amp
levar
ian
tso
fth
eh
isto
rica
lan
aly
sis
asw
ell
asin
the
con
tem
po
rary
anal
ysi
s,th
eh
um
p-s
hap
edef
fect
of
gen
etic
div
ersi
tyo
nec
on
om
icd
evel
op
men
t,w
hil
eco
ntr
oll
ing
for
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
,la
nd
pro
du
ctiv
ity,
and
con
tin
ent
fixed
effe
cts,
isro
bu
stto
add
itio
nal
con
tro
lsfo
rF
st
gen
etic
dis
tan
ceto
Eas
tA
fric
a(i
.e.,
Eth
iop
ia)
and
toth
ew
orl
dte
chn
olo
gic
alfr
on
tier
rele
van
tfo
rth
eti
me
per
iod
bei
ng
anal
yze
d(i
.e.,
the
U.K
.in
15
00
CE
and
the
U.S
.in
20
00
CE
),th
ereb
yd
emo
nst
rati
ng
that
the
hu
mp
-sh
aped
effe
cto
fg
enet
icd
iver
sity
isn
ot
refl
ecti
ng
the
late
nt
imp
act
of
thes
evar
iab
les
via
chan
nel
sre
late
dto
the
dif
fusi
on
of
dev
elo
pm
ent
(Sp
ola
ore
and
Wac
ziar
g2
00
9).
Sin
ced
ata
on
the
rele
van
tg
enet
icd
ista
nce
mea
sure
sar
eu
nav
aila
ble
for
som
eo
bse
rvat
ion
sin
the
bas
elin
ere
gre
ssio
nsa
mp
les
for
the
lim
ited
-an
dex
ten
ded
-sam
ple
var
ian
tso
fth
eh
isto
rica
lan
aly
sis
asw
ell
asth
eco
nte
mp
ora
ryan
aly
sis,
the
tab
leal
sop
rese
nts
the
resu
lts
fro
mes
tim
atin
gth
eb
asel
ine
spec
ifica
tio
ns
for
thes
ean
aly
ses
usi
ng
the
corr
esp
on
din
gg
enet
icd
ista
nce
dat
a-re
stri
cted
sam
ple
s.T
his
per
mit
sa
mo
refa
iras
sess
men
to
fre
du
ctio
ns
ino
mit
ted
var
iab
leb
ias
aris
ing
fro
mth
e
om
issi
on
of
the
gen
etic
dis
tan
ceco
ntr
ols
fro
mth
eb
asel
ine
spec
ifica
tio
ns.
Th
ere
levan
tm
easu
res
of
gen
etic
div
ersi
tyem
plo
yed
by
the
anal
ysi
sar
eo
bse
rved
gen
etic
div
ersi
ty
inC
olu
mn
s1
–2
,p
red
icte
dg
enet
icd
iver
sity
(i.e
.,g
enet
icd
iver
sity
asp
red
icte
db
ym
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a)in
Co
lum
ns
3–
4,
and
ance
stry
-ad
just
edg
enet
icd
iver
sity
inC
olu
mn
s5
–6
.T
he
ance
stry
-ad
just
edm
easu
reo
fth
eti
min
go
fth
eN
eoli
thic
Rev
olu
tio
nis
use
din
Co
lum
ns
5–
6.
Het
ero
sked
asti
city
robu
stst
and
ard
erro
rsar
ere
po
rted
in
par
enth
eses
inC
olu
mn
s1
–2
.B
oo
tstr
apst
and
ard
erro
rs,
acco
un
tin
gfo
rth
eu
seo
fg
ener
ated
reg
ress
ors
,ar
ere
po
rted
inp
aren
thes
esin
Co
lum
ns
3–
6.
Fo
rad
dit
ion
ald
etai
lso
nth
e
gen
etic
dis
tan
cem
easu
res
emp
loy
edb
yth
ean
aly
sis,
the
read
eris
refe
rred
toth
ed
efin
itio
ns
of
thes
evar
iab
les
inS
ecti
on
Fo
fth
isap
pen
dix
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 35T
AB
LE
D1
1—
RO
BU
ST
NE
SS
TO
RE
GIO
NF
IXE
DE
FF
EC
TS
AN
DS
UB
SA
MP
LIN
GO
NR
EG
ION
AL
CL
US
TE
RS
Om
itIn
clu
de
on
ly
Om
itS
ub
-Sah
aran
Su
b-S
ahar
an
Su
b-S
ahar
anO
mit
Afr
ican
and
Afr
ican
and
Fu
llA
fric
anL
atin
Am
eric
anL
atin
Am
eric
anL
atin
Am
eric
an
sam
ple
cou
ntr
ies
cou
ntr
ies
cou
ntr
ies
cou
ntr
ies
(1)
(2)
(3)
(4)
(5)
Dep
end
ent
var
iab
leis
log
inco
me
per
cap
ita
in2
00
0C
E
Pre
dic
ted
div
ersi
ty2
99
.28
0*
**
28
5.8
38
**
*4
92
.19
4*
**
59
6.4
32
**
*2
50
.52
8*
**
(an
cest
ryad
just
ed)
(65
.79
0)
(88
.33
5)
(14
1.3
65
)(1
94
.58
2)
(83
.94
4)
Pre
dic
ted
div
ersi
tysq
uar
e-2
08
.88
6*
**
-19
9.7
55
**
*-3
42
.64
4*
**
-41
5.1
73
**
*-1
74
.72
4*
**
(an
cest
ryad
just
ed)
(47
.25
0)
(64
.40
6)
(98
.35
5)
(13
7.6
98
)(6
0.5
68
)
Lo
gN
eoli
thic
tran
siti
on
0.3
05
*0
.02
40
.35
2*
-0.0
35
0.3
82
tim
ing
(an
cest
ryad
just
ed)
(0.1
74
)(0
.24
1)
(0.2
08
)(0
.26
9)
(0.2
71
)
Lo
gp
erce
nta
ge
of
arab
le-0
.19
3*
**
-0.1
97
**
*-0
.19
2*
**
-0.1
52
**
-0.1
81
**
lan
d(0
.04
7)
(0.0
70
)(0
.04
3)
(0.0
67
)(0
.06
9)
Lo
gab
solu
tela
titu
de
-0.0
49
0.1
18
-0.1
89
-0.1
74
-0.0
37
(0.0
90
)(0
.11
7)
(0.1
14
)(0
.15
3)
(0.1
12
)
So
cial
infr
astr
uct
ure
1.2
24
**
*0
.84
5*
1.5
25
**
0.8
69
0.6
54
(0.4
55
)(0
.43
4)
(0.6
01
)(0
.60
9)
(0.9
40
)
Eth
nic
frac
tio
nal
izat
ion
-0.3
55
0.1
12
-0.7
39
**
-0.5
11
*-0
.29
4
(0.2
54
)(0
.27
7)
(0.2
85
)(0
.29
6)
(0.4
49
)
Per
cen
tag
eo
fp
op
ula
tio
nat
-0.6
65
**
-0.5
57
**
-0.4
26
0.1
22
-1.0
79
**
risk
of
con
trac
tin
gm
alar
ia(0
.29
7)
(0.2
65
)(0
.41
6)
(0.3
20
)(0
.42
9)
Per
cen
tag
eo
fp
op
ula
tio
n-0
.30
6-0
.55
4*
**
-0.4
69
**
-0.9
76
**
*-0
.13
1
liv
ing
intr
op
ical
zon
es(0
.18
6)
(0.1
90
)(0
.21
9)
(0.2
65
)(0
.21
8)
Mea
nd
ista
nce
ton
eare
st-0
.38
4*
*-0
.76
7*
**
-0.4
16
**
*-0
.66
5*
*-0
.28
1
wat
erw
ay(0
.17
9)
(0.2
46
)(0
.14
9)
(0.2
47
)(0
.23
8)
Op
tim
ald
iver
sity
0.7
16
**
*0
.71
5*
**
0.7
18
**
*0
.71
8*
**
0.7
17
**
*
(0.0
08
)(0
.01
1)
(0.0
06
)(0
.00
7)
(0.0
14
)
Ob
serv
atio
ns
10
97
18
74
96
0
R2
0.9
00
.89
0.9
30
.94
0.8
0
No
te:
Th
ista
ble
esta
bli
shes
,u
sin
gth
e1
09
-co
un
try
sam
ple
fro
mth
eco
nte
mp
ora
ryan
aly
sis,
that
the
sig
nifi
can
th
um
p-s
hap
edef
fect
of
ance
stry
-ad
just
edg
enet
icd
iver
sity
on
log
inco
me
per
cap
ita
in2
00
0C
E,
wh
ile
con
tro
llin
gfo
rth
ean
cest
ry-a
dju
sted
tim
ing
of
the
Neo
lith
icR
evo
luti
on
,la
nd
pro
du
ctiv
ity,
and
avec
tor
of
inst
itu
tio
nal
,cu
ltu
ral,
and
geo
gra
ph
ical
det
erm
inan
tso
fd
evel
op
men
t,is
qu
alit
ativ
ely
robu
stto
(i)
con
tro
lsfo
rre
gio
n(r
ath
erth
anco
nti
nen
t)fi
xed
effe
cts,
(ii)
om
itti
ng
ob
serv
atio
ns
asso
ciat
edw
ith
eith
er
Su
b-S
ahar
anA
fric
a(C
olu
mn
2)
or
Lat
inA
mer
ica
(Co
lum
n3
)o
rb
oth
(Co
lum
n4
),w
hic
h,
giv
enth
ela
ggar
dco
mp
arat
ive
dev
elo
pm
ent
of
thes
ere
gio
ns
alo
ng
wit
hth
eir
rela
tivel
y
hig
her
and
low
erle
vel
so
fg
enet
icd
iver
sity
resp
ecti
vel
y,m
aya
pri
ori
be
con
sid
ered
tob
ein
flu
enti
alfo
rg
ener
atin
gth
ew
orl
dw
ide
hu
mp
-sh
aped
rela
tio
nsh
ipb
etw
een
div
ersi
ty
and
dev
elo
pm
ent,
and
(iii
)re
stri
ctin
gth
esa
mp
leto
on
lyo
bse
rvat
ion
sin
the
po
ten
tial
lyin
flu
enti
alS
ub
-Sah
aran
Afr
ican
and
Lat
inA
mer
ican
reg
ion
s(C
olu
mn
5).
All
reg
ress
ion
s
incl
ud
eco
ntr
ols
for
maj
or
reli
gio
nsh
ares
asw
ell
asO
PE
C,
legal
ori
gin
,an
dre
gio
nfi
xed
effe
cts.
Th
ere
gio
nfi
xed
effe
cts
emp
loy
edb
yth
ean
aly
sis
incl
ud
efi
xed
effe
cts
for
(i)
Su
b-S
ahar
anA
fric
a(e
xce
pt
inC
olu
mn
s2
and
4),
(ii)
Mid
dle
Eas
tan
dN
ort
hA
fric
a(e
xce
pt
inC
olu
mn
5),
(iii
)E
uro
pe
and
Cen
tral
Asi
a(e
xce
pt
inC
olu
mn
5),
(iv
)S
ou
thA
sia
(ex
cep
tin
Co
lum
n5
),(v
)E
ast
Asi
aan
dP
acifi
c(e
xce
pt
inC
olu
mn
5),
and
(vi)
Lat
inA
mer
ica
and
Car
ibb
ean
(ex
cep
tin
Co
lum
ns
3–
5).
Het
ero
sked
asti
city
robu
stst
and
ard
erro
rs
are
rep
ort
edin
par
enth
eses
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
36 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013T
AB
LE
D1
2—
RO
BU
ST
NE
SS
TO
PE
RC
EN
TA
GE
OF
TH
EP
OP
UL
AT
ION
OF
EU
RO
PE
AN
DE
SC
EN
T
Om
itO
EC
DO
mit
Neo
-Eu
rop
ean
Om
itE
uro
pea
n
Fu
llsa
mp
leco
un
trie
sco
un
trie
sco
un
trie
s
Bas
elin
eA
ug
men
ted
Bas
elin
eA
ug
men
ted
Bas
elin
eA
ug
men
ted
Bas
elin
eA
ug
men
ted
Sp
ecifi
cati
on
Sp
ecifi
cati
on
Sp
ecifi
cati
on
Sp
ecifi
cati
on
Sp
ecifi
cati
on
Sp
ecifi
cati
on
Sp
ecifi
cati
on
Sp
ecifi
cati
on
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
Dep
end
ent
var
iab
leis
log
inco
me
per
cap
ita
in2
00
0C
E
Pre
dic
ted
div
ersi
ty2
81
.17
3*
**
26
4.4
64
**
*2
66
.49
4*
**
24
4.2
73
**
26
6.1
98
**
*2
66
.61
1*
**
28
5.4
66
**
*2
68
.29
5*
*
(an
cest
ryad
just
ed)
(70
.45
9)
(83
.15
7)
(84
.38
1)
(11
4.3
59
)(7
2.3
71
)(9
0.8
47
)(7
9.0
49
)(1
09
.02
4)
Pre
dic
ted
div
ersi
tysq
uar
e-1
95
.01
0*
**
-18
3.4
05
**
*-1
85
.22
3*
**
-16
9.6
53
**
-18
5.1
13
**
*-1
85
.40
3*
**
-19
8.2
34
**
*-1
86
.27
0*
*
(an
cest
ryad
just
ed)
(49
.76
4)
(58
.19
5)
(59
.30
2)
(79
.76
2)
(50
.91
5)
(63
.53
4)
(55
.64
1)
(75
.95
4)
Lo
gN
eoli
thic
tran
siti
on
0.4
17
*0
.37
40
.41
50
.36
80
.37
60
.37
70
.50
5*
0.4
63
tim
ing
(an
cest
ryad
just
ed)
(0.2
31
)(0
.24
0)
(0.2
83
)(0
.30
5)
(0.2
31
)(0
.24
8)
(0.2
76
)(0
.30
7)
Lo
gp
erce
nta
ge
of
arab
le-0
.18
9*
**
-0.1
90
**
*-0
.23
8*
**
-0.2
37
**
*-0
.20
5*
**
-0.2
05
**
*-0
.20
4*
**
-0.2
05
**
*
lan
d(0
.04
9)
(0.0
49
)(0
.06
0)
(0.0
60
)(0
.05
3)
(0.0
54
)(0
.05
4)
(0.0
55
)
Lo
gab
solu
tela
titu
de
0.0
03
-0.0
03
-0.0
21
-0.0
26
-0.0
29
-0.0
28
0.0
11
0.0
03
(0.1
05
)(0
.10
5)
(0.1
19
)(0
.12
2)
(0.1
09
)(0
.11
1)
(0.1
10
)(0
.11
1)
So
cial
infr
astr
uct
ure
1.8
67
**
*1
.81
2*
**
1.3
73
**
1.3
57
**
1.4
85
**
*1
.48
5*
**
1.8
36
**
*1
.77
6*
**
(0.4
04
)(0
.41
9)
(0.5
75
)(0
.60
4)
(0.5
01
)(0
.50
9)
(0.4
50
)(0
.52
9)
Eth
nic
frac
tio
nal
izat
ion
-0.3
51
-0.3
65
-0.4
50
-0.4
73
-0.4
04
-0.4
04
-0.4
06
-0.4
15
(0.2
76
)(0
.28
9)
(0.3
75
)(0
.40
0)
(0.2
97
)(0
.30
2)
(0.3
56
)(0
.36
3)
Per
cen
tag
eo
fp
op
ula
tio
nat
-0.5
04
-0.4
83
-0.5
93
-0.5
56
-0.5
85
-0.5
86
-0.5
13
-0.4
95
risk
of
con
trac
tin
gm
alar
ia(0
.34
4)
(0.3
69
)(0
.37
6)
(0.4
37
)(0
.36
3)
(0.4
03
)(0
.37
4)
(0.4
01
)
Per
cen
tag
eo
fp
op
ula
tio
n-0
.31
4-0
.29
1-0
.19
9-0
.16
3-0
.30
0-0
.30
1-0
.27
1-0
.24
8
liv
ing
intr
op
ical
zon
es(0
.19
9)
(0.2
04
)(0
.23
9)
(0.2
54
)(0
.21
5)
(0.2
23
)(0
.22
7)
(0.2
39
)
Mea
nd
ista
nce
ton
eare
st-0
.37
3*
*-0
.37
2*
*-0
.38
9*
-0.3
76
-0.4
50
**
-0.4
50
**
-0.3
20
-0.3
21
wat
erw
ay(0
.18
4)
(0.1
86
)(0
.22
2)
(0.2
29
)(0
.20
8)
(0.2
12
)(0
.20
4)
(0.2
07
)
Per
cen
tag
eo
fp
op
ula
tio
no
f0
.21
10
.27
0-0
.00
60
.18
7
Eu
rop
ean
des
cen
t(0
.61
8)
(0.9
14
)(0
.71
0)
(0.8
10
)
Op
tim
ald
iver
sity
0.7
21
**
*0
.72
1*
**
0.7
19
**
*0
.72
0*
**
0.7
19
**
*0
.71
9*
**
0.7
20
**
*0
.72
0*
**
(0.0
16
)(0
.05
5)
(0.0
69
)(0
.10
8)
(0.0
14
)(0
.01
9)
(0.0
16
)(0
.03
1)
Ob
serv
atio
ns
10
91
09
83
83
10
51
05
88
88
R2
0.9
00
.90
0.8
20
.82
0.8
90
.89
0.8
50
.85
No
te:
Th
ista
ble
(i)
esta
bli
shes
,u
sin
gth
e1
09
-co
un
try
sam
ple
fro
mth
eco
nte
mp
ora
ryan
aly
sis,
that
the
sig
nifi
can
th
um
p-s
hap
edef
fect
of
ance
stry
-ad
just
edg
enet
icd
iver
sity
on
log
inco
me
per
cap
ita
in2
00
0C
E(a
ssh
ow
nin
Co
lum
n1
),w
hil
eco
ntr
oll
ing
for
the
ance
stry
-ad
just
edti
min
go
fth
eN
eoli
thic
Rev
olu
tio
n,
lan
dp
rod
uct
ivit
y,a
vec
tor
of
inst
itu
tio
nal
,cu
ltu
ral,
and
geo
gra
ph
ical
det
erm
inan
tso
fd
evel
op
men
t,an
dco
nti
nen
tfi
xed
effe
cts,
isro
bu
stto
the
om
issi
on
of
(a)
OE
CD
cou
ntr
ies
(Co
lum
n3
),(b
)N
eo-E
uro
pea
n
cou
ntr
ies,
i.e.
,th
eU
.S.,
Can
ada,
Au
stra
lia,
and
New
Zea
lan
d(C
olu
mn
5),
and
(c)
Eu
rop
ean
cou
ntr
ies
(Co
lum
n7
)fr
om
the
reg
ress
ion
sam
ple
,an
d(i
i)d
emo
nst
rate
sth
at,
inea
ch
of
the
corr
esp
on
din
gsa
mp
les,
incl
ud
ing
the
un
rest
rict
edo
ne,
the
con
dit
ion
alh
um
p-s
hap
edef
fect
of
ance
stry
-ad
just
edg
enet
icd
iver
sity
isro
bu
stto
add
itio
nal
lyco
ntr
oll
ing
for
the
per
cen
tag
eo
fth
ep
op
ula
tio
no
fE
uro
pea
nd
esce
nt
(Co
lum
ns
2,
4,
6,
and
8),
ther
eby
con
firm
ing
that
the
hu
mp
-sh
aped
effe
cto
fg
enet
icd
iver
sity
on
con
tem
po
rary
com
par
ativ
e
dev
elo
pm
ent
isn
ot
refl
ecti
ng
the
late
nt
imp
act
of
chan
nel
sre
late
dto
the
dif
fusi
on
of
Eu
rop
ean
sd
uri
ng
the
era
of
Eu
rop
ean
colo
niz
atio
n.
All
reg
ress
ion
sin
clu
de
con
tro
lsfo
r
maj
or
reli
gio
nsh
ares
asw
ell
asO
PE
C,le
gal
ori
gin
,S
ub
-Sah
aran
Afr
ica,
and
con
tin
ent
fixed
effe
cts.
Bo
ots
trap
stan
dar
der
rors
,ac
cou
nti
ng
for
the
use
of
gen
erat
edre
gre
sso
rs,ar
e
rep
ort
edin
par
enth
eses
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 37
TA
BL
ED
13
—S
TA
ND
AR
DIZ
ED
BE
TA
AN
DP
AR
TIA
LR
2C
OE
FF
ICIE
NT
SF
OR
TH
EB
AS
EL
INE
SP
EC
IFIC
AT
ION
S
Lim
ited
-sam
ple
his
tori
cal
Ex
ten
ded
-sam
ple
his
tori
cal
Co
nte
mp
ora
ry
anal
ysi
san
aly
sis
anal
ysi
s
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
:
Lo
gp
op
ula
tio
nL
og
po
pu
lati
on
Lo
gp
op
ula
tio
nL
og
po
pu
lati
on
Lo
gin
com
ep
er
den
sity
ind
ensi
tyin
den
sity
ind
ensi
tyin
cap
ita
in
15
00
CE
15
00
CE
10
00
CE
1C
E2
00
0C
E
Div
ersi
ty7
.20
36
.51
27
.00
86
.95
57
.54
05
.50
9
{0
.32
1}
{0
.21
2}
{0
.07
2}
{0
.05
6}
{0
.06
2}
{0
.03
3}
Div
ersi
tysq
uar
ed-6
.91
3-6
.25
1-6
.93
6-6
.83
8-7
.36
4-5
.48
5
{0
.29
9}
{0
.17
8}
{0
.07
4}
{0
.05
8}
{0
.06
3}
{0
.03
3}
Lo
gN
eoli
thic
tran
siti
on
0.3
72
0.3
48
0.4
90
0.6
53
0.7
65
0.0
24
tim
ing
{0
.34
6}
{0
.23
7}
{0
.20
5}
{0
.27
6}
{0
.33
8}
{0
.00
0}
Lo
gp
erce
nta
ge
of
arab
le0
.34
30
.36
30
.31
50
.30
40
.25
6-0
.12
3
lan
d{0
.28
8}
{0
.24
4}
{0
.11
0}
{0
.08
2}
{0
.06
2}
{0
.01
2}
Lo
gab
solu
tela
titu
de
-0.1
09
-0.0
86
-0.2
57
-0.2
36
-0.0
69
0.1
35
{0
.06
4}
{0
.03
2}
{0
.10
1}
{0
.07
2}
{0
.00
6}
{0
.02
1}
Lo
gla
nd
suit
abil
ity
for
0.2
91
0.2
99
0.2
25
0.1
73
0.1
63
-0.2
00
agri
cult
ure
{0
.32
3}
{0
.31
8}
{0
.06
0}
{0
.02
8}
{0
.02
8}
{0
.03
1}
Co
nti
nen
tfi
xed
effe
cts
No
Yes
Yes
Yes
Yes
Yes
Ob
serv
atio
ns
21
21
14
51
40
12
61
43
R2
0.8
90
.90
0.6
90
.62
0.6
10
.57
No
te:
Th
ista
ble
rep
ort
s,fo
rb
oth
the
lim
ited
-an
dex
ten
ded
-sam
ple
var
ian
tso
fth
eh
isto
rica
lan
aly
sis
asw
ell
asfo
rth
eco
nte
mp
ora
ryan
aly
sis,
the
stan
dar
diz
edb
eta
coef
fici
ent
and
the
par
tial
R2
coef
fici
ent(i
ncu
rly
bra
cket
s)as
soci
ated
wit
hea
chex
pla
nat
ory
var
iab
lein
the
bas
elin
esp
ecifi
cati
on
corr
esp
on
din
gto
each
anal
ysi
s.T
he
stan
dar
diz
edb
eta
coef
fici
ent
asso
ciat
edw
ith
anex
pla
nat
ory
var
iab
lein
ag
iven
reg
ress
ion
isto
be
inte
rpre
ted
asth
en
um
ber
of
stan
dar
dd
evia
tio
ns
by
wh
ich
the
dep
end
ent
var
iab
lech
ang
esin
resp
on
seto
ao
ne
stan
dar
dd
evia
tio
nin
crea
sein
the
exp
lan
ato
ryvar
iab
le.
Th
ep
arti
alR
2co
effi
cien
tas
soci
ated
wit
han
exp
lan
ato
ryvar
iab
lein
ag
iven
reg
ress
ion
isto
be
inte
rpre
ted
asth
efr
acti
on
of
the
resi
du
alvar
iati
on
inth
ed
epen
den
tvar
iab
leth
atis
exp
lain
edb
yth
ere
sid
ual
var
iati
on
inth
eex
pla
nat
ory
var
iab
le,
wh
ere
resi
du
alvar
iati
on
of
a(d
epen
den
to
rex
pla
nat
ory
)
var
iab
lere
fers
toth
evar
iati
on
inth
atvar
iab
leth
atis
un
exp
lain
edb
yth
evar
iati
on
inal
lo
ther
exp
lan
ato
ryvar
iab
les
inth
ere
gre
ssio
n.
Th
ere
levan
tm
easu
res
of
gen
etic
div
ersi
ty
emp
loy
edb
yth
ean
aly
sis
are
ob
serv
edg
enet
icd
iver
sity
inC
olu
mn
s1
–2
,p
red
icte
dg
enet
icd
iver
sity
(i.e
.,g
enet
icd
iver
sity
asp
red
icte
db
ym
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a)in
Co
lum
ns
3–
5,
and
ance
stry
-ad
just
edg
enet
icd
iver
sity
inC
olu
mn
6.
Th
ean
cest
ry-a
dju
sted
mea
sure
of
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
isu
sed
inC
olu
mn
6.
38 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013T
AB
LE
D1
4—
INT
ER
PE
RS
ON
AL
TR
US
T,
SC
IEN
TIF
ICP
RO
DU
CT
IVIT
Y,
AN
DE
CO
NO
MIC
DE
VE
LO
PM
EN
TIN
20
00
CE
Sp
ecifi
cati
on
sw
ith
inte
rper
son
altr
ust
Sp
ecifi
cati
on
sw
ith
scie
nti
fic
arti
cles
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
log
inco
me
per
cap
ita
in2
00
0C
E
Inte
rper
son
altr
ust
3.7
15
**
*2
.36
5*
**
1.7
29
**
*
(0.7
02
)(0
.66
0)
(0.5
93
)
Sci
enti
fic
arti
cles
3.3
91
**
*1
.96
8*
**
1.4
30
**
*
(0.3
12
)(0
.33
4)
(0.2
93
)
Lo
gN
eoli
thic
tran
siti
on
-0.2
30
-0.3
11
-0.1
25
-0.0
32
tim
ing
(an
cest
ryad
just
ed)
(0.2
89
)(0
.29
0)
(0.2
68
)(0
.18
9)
Lo
gp
erce
nta
ge
of
arab
le-0
.22
2*
-0.2
25
*-0
.08
9-0
.02
7
lan
d(0
.11
9)
(0.1
17
)(0
.09
9)
(0.0
74
)
Lo
gab
solu
tela
titu
de
0.3
53
**
-0.1
94
0.1
66
0.0
15
(0.1
35
)(0
.24
7)
(0.1
05
)(0
.10
8)
Lo
gla
nd
suit
abil
ity
for
0.1
77
0.1
22
-0.1
02
-0.1
40
*
agri
cult
ure
(0.1
10
)(0
.09
0)
(0.1
12
)(0
.08
1)
Exec
uti
ve
con
stra
ints
0.1
63
**
*0
.10
9*
*
(0.0
51
)(0
.04
4)
Eth
nic
frac
tio
nal
izat
ion
-0.7
24
**
-0.6
70
**
(0.3
49
)(0
.33
3)
Per
cen
tag
eo
fp
op
ula
tio
nat
-1.1
10
**
*-1
.28
4*
**
risk
of
con
trac
tin
gm
alar
ia(0
.40
6)
(0.2
67
)
Per
cen
tag
eo
fp
op
ula
tio
n-0
.54
80
.08
8
liv
ing
intr
op
ical
zon
es(0
.54
8)
(0.2
53
)
Mea
nd
ista
nce
ton
eare
st0
.06
5-0
.17
6
wat
erw
ay(0
.09
6)
(0.1
59
)
OP
EC
fixed
effe
ctN
oN
oY
esN
oN
oY
es
Leg
alo
rig
infi
xed
effe
cts
No
No
Yes
No
No
Yes
Co
nti
nen
tfi
xed
effe
cts
No
Yes
Yes
No
Yes
Yes
Ob
serv
atio
ns
73
73
73
12
51
25
12
5
R2
0.2
70
.68
0.8
40
.41
0.6
80
.80
No
te:
Th
ista
ble
(i)
esta
bli
shes
that
the
pre
val
ence
of
inte
rper
son
altr
ust
and
the
ann
ual
aver
age
nu
mb
ero
fsc
ien
tifi
car
ticl
esp
erca
pit
ab
oth
po
sses
sst
atis
tica
lly
sig
nifi
can
t
po
siti
ve
rela
tio
nsh
ips
wit
hlo
gin
com
ep
erca
pit
ain
20
00
CE
,u
sin
gfe
asib
le7
3-c
ou
ntr
yan
d1
25
-co
un
try
sam
ple
sre
spec
tivel
y,an
d(i
i)d
emo
nst
rate
sth
atth
ese
rela
tio
nsh
ips
ho
ld
un
con
dit
ion
ally
(Co
lum
ns
1an
d4
)an
dco
nd
itio
nal
on
eith
er(a
)a
bas
elin
ese
to
fco
ntr
ol
var
iab
les
(Co
lum
ns
2an
d5
),in
clu
din
gco
ntr
ols
for
the
ance
stry
-ad
just
edti
min
go
f
the
Neo
lith
icR
evo
luti
on
,la
nd
pro
du
ctiv
ity,
and
con
tin
ent
fixed
effe
cts,
or
(b)
am
ore
com
pre
hen
sive
set
of
con
tro
lsfo
rin
stit
uti
on
al,
cult
ura
l,an
dg
eog
rap
hic
ald
eter
min
ants
of
dev
elo
pm
ent(C
olu
mn
s3
and
6),
incl
ud
ing
con
tro
lsfo
rex
ecu
tive
con
stra
ints
,et
hn
icfr
acti
on
aliz
atio
n,le
gal
ori
gin
s,tr
op
ical
dis
ease
env
iro
nm
ents
,ac
cess
tow
ater
way
s,an
dn
atu
ral
reso
urc
een
dow
men
ts.
All
reg
ress
ion
sw
ith
con
tin
ent
fixed
effe
cts
also
con
tro
lfo
ra
Su
b-S
ahar
anA
fric
afi
xed
effe
ct.
Het
ero
sked
asti
city
robu
stst
and
ard
erro
rsar
ere
po
rted
in
par
enth
eses
.F
or
add
itio
nal
det
ails
on
the
pre
val
ence
of
inte
rper
son
altr
ust
and
the
ann
ual
aver
age
nu
mb
ero
fsc
ien
tifi
car
ticl
esp
erca
pit
a,th
ere
ader
isre
ferr
edto
the
defi
nit
ion
so
f
thes
evar
iab
les
inS
ecti
on
Fo
fth
isap
pen
dix
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 39
TABLE D15—ROBUSTNESS TO AN ALTERNATIVE DEFINITION OF NEOLITHIC TRANSITION TIMING
Limited-sample analysis Extended-sample analysis
OLS OLS 2SLS 2SLS OLS OLS
(1) (2) (3) (4) (5) (6)
Dependent variable is log population density in 1500 CE
Panel A. Controlling for logged Neolithic transition timing with respect to 1500 CE
Diversity 228.020*** 198.225** 282.385*** 249.033*** 207.666*** 177.155**
(73.696) (85.848) (88.267) (72.352) (54.226) (83.806)
Diversity square -162.990** -142.553* -204.342*** -184.470*** -147.104*** -128.666**
(56.121) (71.792) (67.057) (57.795) (39.551) (58.497)
Log Neolithic transition 1.005*** 0.939* 0.850*** 0.950** 0.831*** 0.793***
timing (0.320) (0.495) (0.309) (0.371) (0.129) (0.204)
Log percentage of arable 0.517*** 0.420* 0.602*** 0.443** 0.419*** 0.431***
land (0.170) (0.217) (0.193) (0.196) (0.095) (0.101)
Log absolute latitude -0.143 -0.114 -0.189 -0.113 -0.299*** -0.416***
(0.131) (0.171) (0.125) (0.130) (0.094) (0.126)
Log land suitability for 0.558* 0.658 0.489** 0.661** 0.280*** 0.219**
agriculture (0.304) (0.403) (0.241) (0.312) (0.095) (0.099)
Continent fixed effects No Yes No Yes No Yes
Observations 21 21 21 21 144 144
R2 0.89 0.91 0.65 0.67
Panel B. Controlling for nonlogged Neolithic transition timing with respect to 1500 CE
Diversity 256.812*** 258.914** 311.376*** 341.608*** 199.715*** 225.672***
(79.383) (94.992) (85.804) (93.208) (57.660) (80.558)
Diversity square -185.560*** -191.630** -226.889*** -259.555*** -141.308*** -170.633***
(60.373) (80.404) (64.985) (76.477) (42.129) (56.691)
Neolithic transition 0.234** 0.227 0.202*** 0.258** 0.274*** 0.352***
timing (0.089) (0.160) (0.075) (0.127) (0.038) (0.054)
Log percentage of arable 0.580*** 0.497* 0.655*** 0.527** 0.416*** 0.372***
land (0.161) (0.233) (0.170) (0.208) (0.091) (0.105)
Log absolute latitude -0.229 -0.202 -0.269** -0.197* -0.407*** -0.527***
(0.134) (0.167) (0.117) (0.119) (0.101) (0.124)
Log land suitability for 0.588** 0.685 0.509** 0.699** 0.306*** 0.259***
agriculture (0.272) (0.430) (0.218) (0.336) (0.095) (0.098)
Continent fixed effects No Yes No Yes No Yes
Observations 21 21 21 21 145 145
R2 0.89 0.90 0.64 0.69
Note: This table establishes that, in both the limited- and extended-sample variants of the historical analysis for the
year 1500 CE, the hump-shaped effect of genetic diversity on log population density remains qualitatively robust under
an alternative definition of the Neolithic transition timing variable. In this case, the timing of the Neolithic Revolution
reflects the number of years elapsed, until the year 1500 CE (as opposed to 2000 CE), since the transition to sedentary
agriculture. The analysis employs the logged version of this variable in Panel A and its nonlogged version in Panel B.
In Columns 5–6, the higher number of observations in Panel B (relative to Panel A) reflects the inclusion of Australia,
which was yet to experience the Neolithic Revolution as of 1500 CE, in the sample. This permits the relevant regressions
in Panel B to exploit information on both the realized and unrealized “potential” of countries to experience the Neolithic
Revolution as of 1500 CE. The relevant measures of genetic diversity employed by the analysis are observed genetic
diversity in Columns 1–4 and predicted genetic diversity (i.e., genetic diversity as predicted by migratory distance from
East Africa) in Columns 5–6. In Columns 3–4, genetic diversity and its squared term are instrumented following the
methodology implemented in Columns 5–6 of Table 2 of the paper. Heteroskedasticity robust standard errors are reported
in parentheses in Columns 1–4. Bootstrap standard errors, accounting for the use of generated regressors, are reported in
parentheses in Columns 5–6.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
40 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TABLE D16—RESULTS FOR DISTANCES UNDER AN ALTERNATIVE DEFINITION OF NEOLITHIC TRANSITION
TIMING
Distance from: Addis Ababa Addis Ababa London Tokyo Mexico City
(1) (2) (3) (4) (5)
Dependent variable is log population density in 1500 CE
Panel A. Controlling for logged Neolithic transition timing with respect to 1500 CE
Migratory distance 0.152** -0.046 0.072 -0.007
(0.061) (0.064) (0.139) (0.101)
Migratory distance square -0.008*** -0.002 -0.008 0.003
(0.002) (0.002) (0.007) (0.004)
Aerial distance -0.030
(0.106)
Aerial distance square -0.003
(0.006)
Log Neolithic transition 0.831*** 0.847*** 0.702*** 0.669*** 1.106***
timing (0.125) (0.127) (0.140) (0.165) (0.248)
Log percentage of arable 0.419*** 0.502*** 0.367*** 0.530*** 0.506***
land (0.094) (0.105) (0.095) (0.088) (0.098)
Log absolute latitude -0.299*** -0.211** -0.320*** -0.290*** -0.195**
(0.091) (0.097) (0.114) (0.098) (0.085)
Log land suitability for 0.280*** 0.239** 0.327*** 0.169** 0.241**
agriculture (0.096) (0.106) (0.094) (0.081) (0.096)
Observations 144 144 144 144 144
R2 0.65 0.56 0.64 0.57 0.60
Panel B. Controlling for nonlogged Neolithic transition timing with respect to 1500 CE
Migratory distance 0.144** -0.078 0.017 0.072
(0.064) (0.062) (0.146) (0.099)
Migratory distance square -0.008*** -0.000 -0.005 -0.001
(0.002) (0.002) (0.007) (0.004)
Aerial distance 0.099
(0.104)
Aerial distance square -0.012*
(0.006)
Neolithic transition 0.274*** 0.279*** 0.223*** 0.240*** 0.316***
timing (0.038) (0.037) (0.038) (0.062) (0.065)
Log percentage of arable 0.416*** 0.504*** 0.357*** 0.552*** 0.507***
land (0.086) (0.100) (0.089) (0.088) (0.087)
Log absolute latitude -0.407*** -0.367*** -0.442*** -0.391*** -0.353***
(0.095) (0.103) (0.118) (0.107) (0.096)
Log land suitability for 0.306*** 0.244** 0.349*** 0.173** 0.260***
agriculture (0.092) (0.102) (0.092) (0.080) (0.090)
Observations 145 145 145 145 145
R2 0.64 0.56 0.64 0.56 0.58
Note: This table establishes that (i) the hump-shaped effect of migratory distance from East Africa on log population
density in 1500 CE and (ii) the absence of a similar effect associated with alternative concepts of distance remain
qualitatively robust under an alternative definition of the Neolithic transition timing variable. In this case, the timing of
the Neolithic Revolution reflects the number of years elapsed, until the year 1500 CE (as opposed to 2000 CE), since the
transition to sedentary agriculture. The analysis employs the logged version of this variable in Panel A and its nonlogged
version in Panel B. The higher number of observations in Panel B (relative to Panel A) reflects the inclusion of Australia,
which was yet to experience the Neolithic Revolution as of 1500 CE, in the sample. This permits the regressions in
Panel B to exploit information on both the realized and unrealized “potential” of countries to experience the Neolithic
Revolution as of 1500 CE. Heteroskedasticity robust standard errors are reported in parentheses.
*** Significant at the 1 percent level.
** Significant at the 5 percent level.
* Significant at the 10 percent level.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 41
TA
BL
ED
17
—R
OB
US
TN
ES
ST
OA
LT
ER
NA
TIV
EM
EA
SU
RE
SO
FE
CO
NO
MIC
DE
VE
LO
PM
EN
TIN
15
00
CE
(1)
(2)
(3)
(4)
(5)
(6)
Dep
end
ent
var
iab
leis
:
Lo
gp
op
ula
tio
nsi
zein
15
00
CE
Lo
gu
rban
izat
ion
rate
in1
50
0C
E
Pre
dic
ted
div
ersi
ty1
95
.32
7*
*1
87
.90
5*
**
17
5.8
70
**
12
0.5
83
**
14
8.7
57
**
*2
34
.41
0*
**
(78
.40
0)
(56
.12
2)
(75
.17
5)
(50
.30
0)
(48
.37
3)
(67
.32
1)
Pre
dic
ted
div
ersi
tysq
uar
e-1
39
.32
7*
*-1
35
.04
7*
**
-13
5.0
62
**
-84
.76
0*
*-1
06
.16
5*
**
-16
6.7
86
**
*
(57
.37
5)
(40
.99
6)
(52
.99
5)
(37
.32
3)
(36
.50
6)
(48
.78
0)
Lo
gN
eoli
thic
tran
siti
on
1.0
17
**
*1
.29
6*
**
0.4
02
**
0.7
52
**
*
tim
ing
(0.1
40
)(0
.25
1)
(0.2
02
)(0
.25
7)
Lo
gar
able
lan
dar
ea0
.73
5*
**
0.7
22
**
*-0
.11
6*
**
-0.1
19
**
(0.0
50
)(0
.05
1)
(0.0
44
)(0
.05
2)
Lo
gab
solu
tela
titu
de
-0.4
51
**
*-0
.44
7*
**
-0.2
36
-0.1
51
(0.0
86
)(0
.10
7)
(0.1
55
)(0
.17
0)
Lo
gla
nd
suit
abil
ity
for
-0.0
27
-0.0
55
-0.0
36
0.0
31
agri
cult
ure
(0.0
62
)(0
.06
9)
(0.0
58
)(0
.05
9)
Co
nti
nen
tfi
xed
effe
cts
No
No
Yes
No
No
Yes
Ob
serv
atio
ns
14
51
45
14
58
08
08
0
R2
0.0
80
.72
0.7
40
.30
0.4
40
.51
No
te:
Th
ista
ble
esta
bli
shes
,u
sin
gfe
asib
lesu
bse
tso
fth
eex
ten
ded
14
5-c
ou
ntr
ysa
mp
le,
that
the
sig
nifi
can
th
um
p-s
hap
edef
fect
of
gen
etic
div
ersi
ty,
asp
red
icte
db
ym
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a,o
nec
on
om
icd
evel
op
men
tin
15
00
CE
,co
nd
itio
nal
on
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
,la
nd
pro
du
ctiv
ity,
and
con
tin
entfi
xed
effe
cts,
isq
ual
itat
ivel
y
robu
stto
emp
loy
ing
alte
rnat
ive
mea
sure
so
fd
evel
op
men
t,in
clu
din
glo
gp
op
ula
tio
nsi
ze(C
olu
mn
s1
–3
)an
dlo
gu
rban
izat
ion
rate
(Co
lum
ns
4–
6),
aso
pp
ose
dto
emp
loy
ing
the
bas
elin
em
easu
reg
iven
by
log
po
pu
lati
on
den
sity
.B
oo
tstr
apst
and
ard
erro
rs,
acco
un
tin
gfo
rth
eu
seo
fg
ener
ated
reg
ress
ors
,ar
ere
po
rted
inp
aren
thes
es.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
42 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TA
BL
ED
18
—R
OB
US
TN
ES
ST
OA
LL
OW
ING
FO
RS
PA
TIA
LA
UT
OR
EG
RE
SS
ION
INT
HE
EX
TE
ND
ED
-SA
MP
LE
HIS
TO
RIC
AL
AN
DC
ON
TE
MP
OR
AR
YA
NA
LY
SE
S
OL
SS
AR
SA
RE
SA
RA
RO
LS
SA
RS
AR
ES
AR
AR
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
Dep
end
ent
var
iab
leis
:
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
Lo
gin
com
ep
erca
pit
ain
20
00
CE
Div
ersi
ty1
99
.72
7*
*1
81
.76
5*
**
22
3.7
73
**
*1
86
.01
8*
*2
35
.40
9*
**
20
9.1
39
**
27
0.1
86
**
35
0.7
61
**
*
(78
.33
5)
(54
.70
2)
(76
.25
5)
(74
.62
8)
(78
.73
3)
(10
1.9
27
)(1
09
.82
6)
(12
4.2
37
)
Div
ersi
tysq
uar
e-1
46
.16
7*
**
-13
5.0
62
**
*-1
70
.71
7*
**
-14
3.0
24
**
*-1
65
.29
3*
**
-14
5.2
84
**
-19
0.6
67
**
-24
7.8
85
**
*
(55
.04
3)
(39
.16
8)
(56
.47
8)
(54
.83
7)
(56
.68
2)
(71
.79
4)
(78
.00
3)
(88
.62
0)
Lo
gN
eoli
thic
tran
siti
on
1.2
35
**
*1
.01
0*
**
1.2
06
**
*1
.11
1*
**
0.0
62
0.1
25
0.1
54
0.3
12
tim
ing
(0.2
24
)(0
.21
4)
(0.2
29
)(0
.23
4)
(0.2
54
)(0
.28
7)
(0.2
93
)(0
.32
3)
Lo
gp
erce
nta
ge
of
arab
le0
.39
3*
**
0.3
79
**
*0
.32
9*
**
0.3
26
**
*-0
.12
2-0
.12
4-0
.10
9-0
.08
3
lan
d(0
.09
7)
(0.0
91
)(0
.08
0)
(0.0
80
)(0
.10
4)
(0.0
92
)(0
.09
1)
(0.0
90
)
Lo
gab
solu
tela
titu
de
-0.4
17
**
*-0
.44
7*
**
-0.2
53
**
-0.3
05
**
0.1
71
0.1
79
*0
.15
40
.10
5
(0.1
19
)(0
.10
2)
(0.1
17
)(0
.12
0)
(0.1
19
)(0
.09
8)
(0.1
01
)(0
.10
8)
Lo
gla
nd
suit
abil
ity
for
0.2
57
**
*0
.22
7*
**
0.3
28
**
*0
.30
8*
**
-0.1
76
*-0
.16
9*
*-0
.16
3*
-0.1
34
agri
cult
ure
(0.0
94
)(0
.08
3)
(0.0
81
)(0
.08
1)
(0.0
99
)(0
.08
3)
(0.0
84
)(0
.08
5)
Co
nti
nen
tfi
xed
effe
cts
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Ob
serv
atio
ns
14
51
45
14
51
45
14
31
43
14
31
43
R2
0.6
90
.57
No
te:
Th
ista
ble
esta
bli
shes
that
,in
bo
thth
eex
ten
ded
-sam
ple
his
tori
cal
anal
ysi
san
dth
eco
nte
mp
ora
ryan
aly
sis,
the
hu
mp
-sh
aped
effe
cto
fg
enet
icd
iver
sity
on
eco
no
mic
dev
elo
pm
ent,
wh
ile
con
tro
llin
gfo
rth
eti
min
go
fth
eN
eoli
thic
Rev
olu
tio
n,la
nd
pro
du
ctiv
ity,
and
con
tin
ent
fixed
effe
cts,
isq
ual
itat
ivel
yro
bu
stto
emp
loy
ing
alte
rnat
ive
esti
mat
ors
that
allo
wfo
rsp
atia
lau
tore
gre
ssio
nin
eith
erth
ed
epen
den
tvar
iab
le(C
olu
mn
s2
and
6)
or
the
dis
turb
ance
term
(Co
lum
ns
3an
d7
)o
rb
oth
(Co
lum
ns
4an
d8
).S
pec
ifica
lly,
the
esti
mat
or
use
din
Co
lum
ns
4an
d8
assu
mes
afi
rst-
ord
ersp
atia
lau
tore
gre
ssiv
em
od
elw
ith
firs
t-o
rder
spat
ial
auto
reg
ress
ive
dis
turb
ance
s(S
AR
AR
).T
his
mo
del
iso
fth
efo
rm
y=λ
Wy+
Xβ+
uw
ith
u=ρ
Mu+ε
,w
her
ey
isan
n×
1vec
tor
of
ob
serv
atio
ns
on
the
dep
end
ent
var
iab
le,
Wan
dM
are
n×
nsp
atia
lw
eig
hti
ng
mat
rice
s(w
ith
dia
go
nal
elem
ents
equ
alto
zero
and
off
-dia
go
nal
elem
ents
corr
esp
on
din
gto
the
inver
seg
reat
circ
led
ista
nce
sb
etw
een
geo
des
icce
ntr
oid
s),W
yan
dM
uar
en×
1vec
tors
rep
rese
nti
ng
spat
ial
lag
s,λ
andρ
are
no
n-z
ero
scal
arp
aram
eter
sre
flec
tin
gth
esp
atia
lau
tore
gre
ssiv
ep
roce
sses
,X
isan
n×
km
atri
xo
fo
bse
rvat
ion
so
nk
ind
epen
den
tvar
iab
les
andβ
isit
sas
soci
ated
k×
1p
aram
eter
vec
tor,
and
fin
ally
,ε
isa
n×
1vec
tor
of
resi
du
als.
On
the
oth
erh
and
,th
ees
tim
ato
ru
sed
inC
olu
mn
s2
and
6as
sum
eso
nly
afi
rst-
ord
ersp
atia
lau
tore
gre
ssiv
e
mo
del
(SA
R)
of
the
form
y=λ
Wy+
Xβ+ε
(i.e
.,S
AR
AR
wit
hρ=
0),
wh
erea
sth
ees
tim
ato
ru
sed
inC
olu
mn
s3
and
7as
sum
eso
nly
afi
rst-
ord
ersp
atia
lau
tore
gre
ssiv
eer
ror
mo
del
(SA
RE
)o
fth
efo
rmy=
Xβ+
uw
ith
u=ρ
Mu+ε
(i.e
.,S
AR
AR
wit
hλ=
0).
Fo
rco
mp
aris
on
,C
olu
mn
s1
and
5re
pro
du
ceth
ep
aram
eter
esti
mat
eso
bta
ined
un
der
the
OL
Ses
tim
ato
rth
at,
foll
ow
ing
the
above
no
tati
on
,as
sum
esa
mo
del
of
the
form
y=
Xβ+ε
(i.e
.,S
AR
AR
wit
hλ=
0an
dρ=
0).
Th
ere
levan
tm
easu
res
of
gen
etic
div
ersi
ty
emp
loy
edb
yth
ean
aly
sis
are
pre
dic
ted
gen
etic
div
ersi
ty(i
.e.,
gen
etic
div
ersi
tyas
pre
dic
ted
by
mig
rato
ryd
ista
nce
fro
mE
ast
Afr
ica)
inC
olu
mn
s1
–4
and
ance
stry
-ad
just
edg
enet
ic
div
ersi
tyin
Co
lum
ns
5–
8.
Th
ean
cest
ry-a
dju
sted
mea
sure
of
the
tim
ing
of
the
Neo
lith
icR
evo
luti
on
isu
sed
inC
olu
mn
s5
–8
.S
tan
dar
der
rors
are
rep
ort
edin
par
enth
eses
.N
ote
that
thes
est
and
ard
erro
rsar
en
ot
bo
ots
trap
ped
toac
cou
nt
for
the
use
of
gen
erat
edre
gre
sso
rs.
How
ever
,th
efa
ctth
atth
eh
eter
osk
edas
tici
tyro
bu
stst
and
ard
erro
rsp
rese
nte
din
Co
lum
ns
1an
d5
are
on
lym
arg
inal
lysm
alle
rth
anth
eir
bo
ots
trap
ped
cou
nte
rpar
tsp
rese
nte
din
Tab
le3
(Co
lum
n6
)an
dT
able
6(C
olu
mn
2)
of
the
pap
ersu
gg
ests
that
stat
isti
cal
infe
ren
ce
wo
uld
no
tb
eq
ual
itat
ivel
yaf
fect
edh
adth
est
and
ard
erro
rsin
Co
lum
ns
2–
4an
d6
–8
bee
nad
just
edto
acco
un
tfo
rth
eu
seo
fg
ener
ated
reg
ress
ors
.
**
*S
ign
ifica
nt
atth
e1
per
cen
tle
vel
.
**
Sig
nifi
can
tat
the
5p
erce
nt
level
.
*S
ign
ifica
nt
atth
e1
0p
erce
nt
level
.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 43
E THE 53 HGDP-CEPH ETHNIC GROUPS
Ethnic Group Migratory Distance Country Region
(in km)
Bantu (Kenya) 1,338.94 Kenya Africa
Bantu (Southeast) 4,306.19 South Africa Africa
Bantu (Southwest) 3,946.44 Namibia Africa
Biaka Pygmy 2,384.86 Central African Republic Africa
Mandenka 5,469.91 Senegal Africa
Mbuti Pygmy 1,335.50 Zaire Africa
San 3,872.42 Namibia Africa
Yoruba 3,629.65 Nigeria Africa
Bedouin 2,844.95 Israel Middle East
Druze 2,887.25 Israel Middle East
Mozabite 4,418.17 Algeria Middle East
Palestinian 2,887.25 Israel Middle East
Adygei 4,155.03 Russia Europe
Basque 6,012.26 France Europe
French 5,857.48 France Europe
Italian 5,249.04 Italy Europe
Orcadian 6,636.69 United Kingdom Europe
Russian 5,956.40 Russia Europe
Sardinian 5,305.81 Italy Europe
Tuscan 5,118.37 Italy Europe
Balochi 5,842.06 Pakistan Asia
Brahui 5,842.06 Pakistan Asia
Burusho 6,475.60 Pakistan Asia
Cambodian 10,260.55 Cambodia Asia
Dai 9,343.96 China Asia
Daur 10,213.13 China Asia
Han 10,123.19 China Asia
Han (North China) 9,854.75 China Asia
Hazara 6,132.57 Pakistan Asia
Hezhen 10,896.21 China Asia
Japanese 11,762.11 Japan Asia
Kalash 6,253.62 Pakistan Asia
Lahu 9,299.63 China Asia
Makrani 5,705.00 Pakistan Asia
Miao 9,875.32 China Asia
Mongola 9,869.85 China Asia
Naxi 9,131.37 China Asia
Oroqen 10,290.53 China Asia
Pathan 6,178.76 Pakistan Asia
She 10,817.81 China Asia
Sindhi 6,201.70 Pakistan Asia
Tu 8,868.14 China Asia
Tujia 9,832.50 China Asia
Uygur 7,071.97 China Asia
Xibo 7,110.29 China Asia
Yakut 9,919.11 Russia (Siberia) Asia
Yi 9,328.79 China Asia
Melanesian 16,168.51 Papua New Guinea Oceania
Papuan 14,843.12 Papua New Guinea Oceania
Colombian 22,662.78 Colombia Americas
Karitiana 24,177.34 Brazil Americas
Maya 19,825.71 Mexico Americas
Pima 18,015.79 Mexico Americas
44 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
F VARIABLE DEFINITIONS AND SOURCES
F1. Outcome Variables
Population density in 1 CE, 1000 CE, and 1500 CE. Population density (in persons per square km) for a given year is
calculated as population in that year, as reported by McEvedy and Jones (1978), divided by total land area, as reported
by the World Bank’s World Development Indicators. The cross-sectional unit of observation in McEvedy and Jones’s
(1978) data set is a region delineated by its international borders in 1975. Historical population estimates are provided
for regions corresponding to either individual countries or, in some cases, to sets comprised of 2-3 neighboring countries
(e.g., India, Pakistan, and Bangladesh). In the latter case, a set-specific population density figure is calculated based on
total land area, and the figure is then assigned to each of the component countries in the set. The same methodology
is employed to obtain population density for countries that exist today but were part of a larger political unit (e.g., the
former Yugoslavia) in 1975. The data reported by the authors are based on a wide variety of country- and region-specific
historical sources, the enumeration of which would be impractical for this appendix. The interested reader is therefore
referred to McEvedy and Jones (1978) for more details on the original data sources cited therein.
Urbanization rate in 1500 CE. The percentage of a country’s total population residing in urban areas (each with a city
population size of at least 5,000), as reported by Acemoglu, Johnson and Robinson (2005).
Income per capita in 2000 CE. Real GDP per capita, in constant 2000 international dollars, as reported by the Penn
World Table, version 6.2.
Interpersonal trust. The fraction of total respondents within a given country, from five different waves of the World
Values Survey conducted during the time period 1981–2008, that responded with “Most people can be trusted” (as opposed
to “Can’t be too careful”) when answering the survey question “Generally speaking, would you say that most people can
be trusted or that you can’t be too careful in dealing with people?”
Scientific articles. The mean, over the period 1981–2000, of the annual number of scientific articles per capita, calculated
as the total number of scientific and technical articles published in a given year divided by the total population in that
year. The relevant data on the total number of articles and population in a given year are obtained from the World Bank’s
World Development Indicators.
F2. Genetic Diversity Variables
Observed genetic diversity (for the limited historical sample). The average expected heterozygosity across ethnic
groups from the HGDP-CEPH Human Genome Diversity Cell Line Panel that are located within a given country. The
expected heterozygosities of the ethnic groups are from Ramachandran et al. (2005).
Predicted genetic diversity (for the extended historical sample). The expected heterozygosity (genetic diversity) of
a given country as predicted by (the extended sample definition of) migratory distance from East Africa (i.e., Addis
Ababa, Ethiopia). This measure is calculated by applying the regression coefficients obtained from regressing expected
heterozygosity on migratory distance at the ethnic group level, using the worldwide sample of 53 ethnic groups from the
HGDP-CEPH Human Genome Diversity Cell Line Panel. The expected heterozygosities and geographical coordinates
of the ethnic groups are from Ramachandran et al. (2005).
Note that for Table D5 in Section D of this appendix, the migratory distance concept used to predict the genetic
diversity of a country’s population is the human mobility index, calculated for the journey from Addis Ababa (Ethiopia)
to the country’s modern capital city, as opposed to the baseline waypoints-restricted migratory distance concept used
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 45
elsewhere. For additional details on how the human mobility index is calculated, the interested reader is referred to the
definition of this variable further below.
Predicted genetic diversity (ancestry adjusted). The expected heterozygosity (genetic diversity) of a country’s popula-
tion, predicted by migratory distances from East Africa (i.e., Addis Ababa, Ethiopia) to the year 1500 CE locations of the
ancestral populations of the country’s component ethnic groups in 2000 CE, as well as by pairwise migratory distances
between these ancestral populations. The source countries of the year 1500 CE ancestral populations are identified from
the World Migration Matrix, 1500–2000, discussed in Putterman and Weil (2010), and the modern capital cities of these
countries are used to compute the aforementioned migratory distances. The measure of genetic diversity is then calculated
by applying (i) the regression coefficients obtained from regressing expected heterozygosity on migratory distance from
East Africa at the ethnic group level, using the worldwide sample of 53 ethnic groups from the HGDP-CEPH Human
Genome Diversity Cell Line Panel, (ii) the regression coefficients obtained from regressing pairwise Fst genetic distances
on pairwise migratory distances between these ethnic groups, and (iii) the ancestry weights representing the fractions of
the year 2000 CE population (of the country for which the measure is being computed) that can trace their ancestral
origins to different source countries in the year 1500 CE. The construction of this measure is discussed in detail in
Section B of this appendix. The expected heterozygosities, geographical coordinates, and pairwise Fst genetic distances
of the 53 ethnic groups are from Ramachandran et al. (2005). The ancestry weights are from the World Migration Matrix,
1500–2000.
Note that, in contrast to the baseline waypoints-restricted migratory distance concept used elsewhere, the migratory
distance concept used to predict the ancestry-adjusted genetic diversity of a country’s population for Table D5 in Section
D of this appendix is the human mobility index, calculated for the journey from Addis Ababa (Ethiopia) to each of the
year 1500 CE locations of the ancestral populations of the country’s component ethnic groups in 2000 CE, as well as for
the journey between each pair of these ancestral populations. For additional details on how the human mobility index is
calculated, the interested reader is referred to the definition of this variable further below.
F3. Distance Variables
Migratory distance from East Africa (for the limited historical sample). The average migratory distance across ethnic
groups from the HGDP-CEPH Human Genome Diversity Cell Line Panel that are located within a given country. The
migratory distance of an ethnic group is the great circle distance from Addis Ababa (Ethiopia) to the location of the group
along a land-restricted path forced through one or more of five intercontinental waypoints, including Cairo (Egypt),
Istanbul (Turkey), Phnom Penh (Cambodia), Anadyr (Russia), and Prince Rupert (Canada). Distances are calculated
using the Haversine formula and are measured in units of 1,000 km. The geographical coordinates of the ethnic groups
and the intercontinental waypoints are from Ramachandran et al. (2005).
Migratory distance from East Africa (for the extended historical sample). The great circle distance from Addis
Ababa (Ethiopia) to the country’s modern capital city along a land-restricted path forced through one or more of five
aforementioned intercontinental waypoints. Distances are calculated using the Haversine formula and are measured in
units of 1,000 km. The geographical coordinates of the intercontinental waypoints are from Ramachandran et al. (2005),
while those of the modern capital cities are from the CIA’s World Factbook.
Migratory distance from East Africa (ancestry adjusted). The cross-country weighted average of (the extended sample
definition of) migratory distance from East Africa (i.e., Addis Ababa, Ethiopia), where the weight associated with a given
country in the calculation represents the fraction of the year 2000 CE population (of the country for which the measure
is being computed) that can trace its ancestral origins to the given country in the year 1500 CE. The ancestry weights are
obtained from the World Migration Matrix, 1500–2000 of Putterman and Weil (2010).
46 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
Migratory distance from a “placebo” point of origin. The great circle distance from a “placebo” location (i.e., other
than Addis Ababa, Ethiopia) to the country’s modern capital city along a land-restricted path forced through one or
more of five aforementioned intercontinental waypoints. Distances are calculated using the Haversine formula and are
measured in units of 1,000 km. The geographical coordinates of the intercontinental waypoints are from Ramachandran
et al. (2005), while those of the modern capital cities are from the CIA’s World Factbook. The placebo locations for which
results are presented in the paper include London (U.K.), Tokyo (Japan), and Mexico City (Mexico).
Aerial distance from East Africa. The great circle distance “as the crow flies” from Addis Ababa (Ethiopia) to the
country’s modern capital city. Distances are calculated using the Haversine formula and are measured in units of 1,000
km. The geographical coordinates of capital cities are from the CIA’s World Factbook.
Aerial distance from East Africa (ancestry adjusted). The cross-country weighted average of aerial distance from
East Africa (i.e., Addis Ababa, Ethiopia), where the weight associated with a given country in the calculation represents
the fraction of the year 2000 CE population (of the country for which the measure is being computed) that can trace its
ancestral origins to the given country in the year 1500 CE. The ancestry weights are from the World Migration Matrix,
1500–2000 of Putterman and Weil (2010).
Distance to regional frontier in 1 CE, 1000 CE, and 1500 CE. The great circle distance from a country’s capital city to
the closest regional technological frontier for a given year. The year-specific set of regional frontiers comprises the two
most populous cities, reported for that year and belonging to different civilizations or sociopolitical entities, from each
of Africa, Europe, Asia, and the Americas. Distances are calculated using the Haversine formula and are measured in
km. The historical urban population data used to identify the frontiers are obtained from Chandler (1987) and Modelski
(2003), and the geographical coordinates of ancient urban centers are obtained using Wikipedia.
Human mobility index. The average migratory distance across ethnic groups from the HGDP-CEPH Human Genome
Diversity Cell Line Panel that are located within a given country. The migratory distance of an ethnic group is the
distance from Addis Ababa (Ethiopia) to the location of the group along an “optimal” land-restricted path that minimizes
the time cost of travelling on the surface of the Earth in the absence of steam-powered transportation technologies. The
optimality of a path is determined by incorporating information on natural impediments to human spatial mobility, such
as the meteorological and topographical conditions prevalent along the path, as well as information on the time cost of
travelling under such conditions as reported by Hayes (1996). Distances are measured in weeks of travel time. The
geographical coordinates of the ethnic groups are from Ramachandran et al. (2005). The methodology underlying the
construction of this index is discussed in greater detail by Ashraf, Galor and Özak (2010) and Özak (2010).
Genetic distance to the U.K. or Ethiopia (1500 match). The Fst genetic distance, as reported by Spolaore and Wacziarg
(2009), between the year 1500 CE populations of a given country and the U.K. (or Ethiopia), calculated as the genetic
distance between the two ethnic groups comprising the largest shares of each country’s population in the year 1500 CE.
Genetic distance to the U.S. or Ethiopia (weighted). The Fst genetic distance, as reported by Spolaore and Wacziarg
(2009), between the contemporary national populations of a given country and the U.S. (or Ethiopia), calculated as the
average pairwise genetic distance across all ethnic group pairs, where each pair comprises two distinct ethnic groups, one
from each country, and is weighted by the product of the proportional representations of the two groups in their respective
national populations.
F4. Timing of the Neolithic Revolution and Subsistence Mode Variables
Neolithic transition timing. The number of years elapsed, until the year 2000 CE, since the majority of the population
residing within a country’s modern national borders began practicing sedentary agriculture as the primary mode of
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 47
subsistence. This measure, reported by Putterman (2008), is compiled using a wide variety of both region- and country-
specific archaeological studies as well as more general encyclopedic works on the transition from hunting and gathering
to agriculture during the Neolithic Revolution. The reader is referred to Putterman’s web site for a detailed description of
the primary and secondary data sources employed in the construction of this variable.
Note that the historical analysis, as conducted in Section IV of the paper, employs the Neolithic transition timing
variable defined above (i.e., measured as the number of thousand years since the onset of sedentary agriculture as of the
year 2000 CE). This results in the inclusion of countries that were yet to experience the onset of sedentary agriculture as
of the year 1500 CE in the sample, thereby permitting the relevant regressions to exploit information on both the realized
and unrealized “potential” of countries to undergo the Neolithic Revolution. Nevertheless, Tables D15 and D16 in Section
D of this appendix demonstrate that all the results of the historical analysis are robust under an alternative definition of
the Neolithic transition timing variable where this variable reflects the number of years elapsed, until the year 1500 CE
(as opposed to 2000 CE), since the transition to agriculture.
Neolithic transition timing (ancestry adjusted). The cross-country weighted average of Neolithic transition timing,
where the weight associated with a given country in the calculation represents the fraction of the year 2000 CE population
(of the country for which the measure is being computed) that can trace its ancestral origins to the given country in the
year 1500 CE. The ancestry weights are obtained from the World Migration Matrix, 1500–2000 of Putterman and Weil
(2010).
Subsistence mode in 1000 CE. An index in the [0,1]-interval that gauges the extent to which sedentary agriculture was
practiced, in the year 1000 CE, within a region delineated by a country’s modern international borders. This index is
constructed using data from Peregrine’s (2003) Atlas of Cultural Evolution, which reports, amongst other variables, a
measure of the mode of subsistence on a 3-point categorical scale at the level of a cultural group (or “archaeological
tradition”) that existed in the year 1000 CE. Specifically, the measure is taken to assume a value of 0 in the absence
of sedentary agriculture (i.e., if the cultural group exclusively practiced hunting and gathering), a value of 0.5 when
agriculture was practiced but only as a secondary mode of subsistence, and a value of 1 when agriculture was practiced
as the primary mode of subsistence. Given that the cross-sectional unit of observation in Peregrine’s (2003) data set is
a cultural group, specific to a given region on the global map, and since spatial delineations of groups, as reported by
Peregrine (2003), do not necessarily correspond to contemporary international borders, the measure is aggregated to the
country level by averaging across those cultural groups that are reported to appear within the modern borders of a given
country. For more details on the underlying data employed to construct this index, the interested reader is referred to
Peregrine (2003).
F5. Geographical Variables
Percentage of arable land. The fraction of a country’s total land area that is arable, as reported by the World Bank’s
World Development Indicators.
Absolute latitude. The absolute value of the latitude of a country’s approximate geodesic centroid, as reported by the
CIA’s World Factbook.
Land suitability for agriculture. A geospatial index of the suitability of land for agriculture based on ecological
indicators of climate suitability for cultivation, such as growing degree days and the ratio of actual to potential evap-
otranspiration, as well as ecological indicators of soil suitability for cultivation, such as soil carbon density and soil pH.
This index was initially reported at a half-degree resolution by Ramankutty et al. (2002). Formally, Ramankutty et al.
(2002) calculate the land suitability index, S, as the product of climate suitability, Sclim , and soil suitability, Ssoil ,
48 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
i.e., S = Sclim × Ssoil . The climate suitability component is estimated to be a function of growing degree days,
G DD, and a moisture index, α, gauging water availability to plants, calculated as the ratio of actual to potential
evapotranspiration, i.e., Sclim = f1(G DD) f2(α). The soil suitability component, on the other hand, is estimated
to be a function of soil carbon density, Csoil , and soil pH, pHsoil , i.e. Ssoil = g1(Csoil)g2(pHsoil).The functions, f1(G DD), f2(α), g1(Csoil), and g2(pHsoil) are chosen by Ramankutty et al. (2002) by
empirically fitting functions to the observed relationships between cropland areas, G DD,α, Csoil , and pHsoil . For
more details on the specific functional forms chosen, the interested reader is referred to Ramankutty et al. (2002). Since
Ramankutty et al. (2002) report the land suitability index at a half-degree resolution, Michalopoulos (2011) aggregates the
index to the country level by averaging land suitability across grid cells within a country. This study employs the country-
level aggregate measure reported by Michalopoulos (2011) as the control for land suitability in the baseline regression
specifications for both historical population density and contemporary income per capita.
Range of land suitability. The difference between the maximum and minimum values of a land suitability index, reported
at a half-degree resolution by Ramankutty et al. (2002), across grid cells within a country. This variable is obtained from
the data set of Michalopoulos (2011). For additional details on the land suitability index, the interested reader is referred
to the definition of the land suitability variable above.
Land suitability Gini. The Gini coefficient based on the distribution of a land suitability index, reported at a half-degree
resolution by Ramankutty et al. (2002), across grid cells within a country. This variable is obtained from the data set of
Michalopoulos (2011). For additional details on the land suitability index, the interested reader is referred to the definition
of the land suitability variable above.
Soil fertility. The soil suitability component, based on soil carbon density and soil pH, of an index of land suitability
for agriculture. The soil suitability data are reported at a half-degree resolution by Ramankutty et al. (2002) and are
aggregated to the country level by Michalopoulos (2011) by averaging across grid cells within a country. For additional
details on the soil suitability component of the land suitability index, the interested reader is referred to the definition of
the land suitability variable above.
Mean elevation. The mean elevation of a country in km above sea level, calculated using geospatial elevation data
reported by the G-ECON project (Nordhaus 2006) at a 1-degree resolution, which, in turn, is based on similar but more
spatially disaggregated data at a 10-minute resolution from New et al. (2002). The measure is thus the average elevation
across the grid cells within a country. The interested reader is referred to the G-ECON project web site for additional
details.
Standard deviation of elevation. The standard deviation of elevation across the grid cells within a country in km above
sea level, calculated using geospatial elevation data reported by the G-ECON project (Nordhaus 2006) at a 1-degree
resolution, which, in turn, is based on similar but more spatially disaggregated data at a 10-minute resolution from New
et al. (2002). The interested reader is referred to the G-ECON project web site for additional details.
Terrain roughness. The degree of terrain roughness of a country, calculated using geospatial surface undulation data
reported by the G-ECON project (Nordhaus 2006) at a 1-degree resolution, which is based on more spatially disaggregated
elevation data at a 10-minute resolution from New et al. (2002). The measure is thus the average degree of terrain
roughness across the grid cells within a country. The interested reader is referred to the G-ECON project web site for
additional details.
Temperature. The intertemporal average monthly temperature of a country in degrees Celsius per month over the 1961–
1990 time period, calculated using geospatial average monthly temperature data for this period reported by the G-ECON
project (Nordhaus 2006) at a 1-degree resolution, which, in turn, is based on similar but more spatially disaggregated
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 49
data at a 10-minute resolution from New et al. (2002). The measure is thus the spatial mean of the intertemporal average
monthly temperature across the grid cells within a country. The interested reader is referred to the G-ECON project web
site for additional details.
Precipitation. The intertemporal average monthly precipitation of a country in mm per month over the 1961–1990 time
period, calculated using geospatial average monthly precipitation data for this period reported by the G-ECON project
(Nordhaus 2006) at a 1-degree resolution, which, in turn, is based on similar but more spatially disaggregated data at a
10-minute resolution from New et al. (2002). The measure is thus the spatial mean of the intertemporal average monthly
precipitation across the grid cells within a country. The interested reader is referred to the G-ECON project web site for
additional details.
Mean distance to nearest waterway. The distance, in thousands of km, from a GIS grid cell to the nearest ice-free
coastline or sea-navigable river, averaged across the grid cells of a country. This variable was originally constructed by
Gallup, Sachs and Mellinger (1999) and is part of Harvard University’s CID Research Datasets on General Measures of
Geography.
Percentage of land near a waterway. The percentage of a country’s total land area that is located within 100 km of an
ice-free coastline or sea-navigable river. This variable was originally constructed by Gallup, Sachs and Mellinger (1999)
and is part of Harvard University’s CID Research Datasets on General Measures of Geography.
Percentage of population living in tropical zones. The percentage of a country’s population in 1995 that resided in
areas classified as tropical by the Köppen-Geiger climate classification system. This variable was originally constructed
by Gallup, Sachs and Mellinger (1999) and is part of Harvard University’s CID Research Datasets on General Measures
of Geography.
Percentage of population at risk of contracting malaria. The percentage of a country’s population in 1994 residing
in regions of high malaria risk, multiplied by the proportion of national cases involving the fatal species of the malaria
pathogen, P. falciparum (as opposed to other largely nonfatal species). This variable was originally constructed by Gallup
and Sachs (2001) and is part of Columbia University’s Earth Institute data set on malaria.
Climate. An index of climatic suitability for agriculture based on the Köppen-Geiger climate classification system. This
variable is obtained from the data set of Olsson and Hibbs (2005).
Orientation of continental axis. The orientation of a continent (or landmass) along a North-South or East-West axis.
This measure, reported in the data set of Olsson and Hibbs (2005), is calculated as the ratio of the largest longitudinal
(East-West) distance to the largest latitudinal (North-South) distance of the continent (or landmass).
Size of continent. The total land area of a continent (or landmass) as reported in the data set of Olsson and Hibbs (2005).
Domesticable plants. The number of annual and perennial wild grass species, with a mean kernel weight exceeding 10
mg, that were prehistorically native to the region to which a country belongs. This variable is obtained from the data set
of Olsson and Hibbs (2005).
Domesticable animals. The number of domesticable large mammalian species, weighing in excess of 45 kg, that were
prehistorically native to the region to which a country belongs. This variable is obtained from the data set of Olsson and
Hibbs (2005).
50 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
F6. Institutional, Cultural, and Human Capital Variables
Social infrastructure. An index, calculated by Hall and Jones (1999), that quantifies the wedge between private and
social returns to productive activities. To elaborate, this measure is computed as the average of two separate indices. The
first is a government anti-diversion policy (GADP) index, based on data from the International Country Risk Guide, that
represents the average across five categories, each measured as the mean over the 1986–1995 time period: (i) law and
order, (ii) bureaucratic quality, (iii) corruption, (iv) risk of expropriation, and (v) government repudiation of contracts.
The second is an index of openness, based on Sachs and Warner (1995), that represents the fraction of years in the time
period 1950–1994 that the economy was open to trade with other countries, where the criteria for being open in a given
year includes: (i) nontariff barriers cover less than 40% of trade, (ii) average tariff rates are less than 40%, (iii) any black
market premium was less than 20% during the 1970s and 80s, (iv) the country is not socialist, and (v) the government
does not monopolize over major exports.
Democracy. The 1960–2000 mean of an index that quantifies the extent of institutionalized democracy, as reported in the
Polity IV data set. The Polity IV democracy index for a given year is an 11-point categorical variable (from 0 to 10) that is
additively derived from Polity IV codings on the (i) competitiveness of political participation, (ii) openness of executive
recruitment, (iii) competitiveness of executive recruitment, and (iv) constraints on the chief executive.
Executive constraints. The 1960–2000 mean of an index, reported annually as a 7-point categorical variable (from 1 to
7) by the Polity IV data set, quantifying the extent of institutionalized constraints on the decision-making power of chief
executives.
Legal origins. A set of dummy variables, reported by La Porta et al. (1999), that identifies the legal origin of the
Company Law or Commercial Code of a country. The five legal origin possibilities are: (i) English Common Law, (ii)
French Commercial Code, (iii) German Commercial Code, (iv) Scandinavian Commercial Code, and (v) Socialist or
Communist Laws.
Major religion shares. A set of variables, from La Porta et al. (1999), that identifies the percentage of a country’s
population belonging to the three most widely spread religions of the world. The religions identified are: (i) Roman
Catholic, (ii) Protestant, and (iii) Muslim.
Ethnic fractionalization. A fractionalization index, constructed by Alesina et al. (2003), that captures the probability
that two individuals, selected at random from a country’s population, will belong to different ethnic groups.
Percentage of population of European descent. The fraction of the year 2000 CE population (of the country for which
the measure is being computed) that can trace its ancestral origins to the European continent due to migrations occurring
as early as the year 1500 CE. This variable is constructed using data from the World Migration Matrix, 1500–2000 of
Putterman and Weil (2010).
Years of schooling. The mean, over the 1960–2000 time period, of the 5-yearly figure, reported by Barro and Lee (2001),
on average years of schooling amongst the population aged 25 and over.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 51
G DESCRIPTIVE STATISTICS
TABLE G1—SUMMARY STATISTICS FOR THE 21-COUNTRY HISTORICAL SAMPLE
Obs. Mean S.D. Min. Max.
(1) Log population density in 1500 CE 21 1.169 1.756 -2.135 3.842
(2) Observed genetic diversity 21 0.713 0.056 0.552 0.770
(3) Migratory distance from East Africa 21 8.238 6.735 1.335 24.177
(4) Human mobility index 18 10.965 8.124 2.405 31.360
(5) Log Neolithic transition timing 21 8.342 0.539 7.131 9.259
(6) Log percentage of arable land 21 2.141 1.168 -0.799 3.512
(7) Log absolute latitude 21 2.739 1.178 0.000 4.094
(8) Log land suitability for agriculture 21 -1.391 0.895 -3.219 -0.288
TABLE G2—PAIRWISE CORRELATIONS FOR THE 21-COUNTRY HISTORICAL SAMPLE
(1) (2) (3) (4) (5) (6) (7)
(1) Log population density in 1500 CE 1.000
(2) Observed genetic diversity 0.244 1.000
(3) Migratory distance from East Africa -0.226 -0.968 1.000
(4) Human mobility index -0.273 -0.955 0.987 1.000
(5) Log Neolithic transition timing 0.735 -0.117 0.024 0.011 1.000
(6) Log percentage of arable land 0.670 0.172 -0.183 -0.032 0.521 1.000
(7) Log absolute latitude 0.336 0.055 -0.012 0.044 0.392 0.453 1.000
(8) Log land suitability for agriculture 0.561 -0.218 0.282 0.245 0.299 0.376 0.049
52 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TA
BL
EG
3—
SU
MM
AR
YS
TA
TIS
TIC
SF
OR
TH
E1
45
-CO
UN
TR
YH
IST
OR
ICA
LS
AM
PL
E
Ob
s.M
ean
S.D
.M
in.
Max
.
(1)
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
14
50
.88
11
.50
0-3
.81
73
.84
2
(2)
Lo
gp
op
ula
tio
nd
ensi
tyin
10
00
CE
14
00
.46
31
.44
5-4
.51
02
.98
9
(3)
Lo
gp
op
ula
tio
nd
ensi
tyin
1C
E1
26
-0.0
70
1.5
35
-4.5
10
3.1
70
(4)
Pre
dic
ted
gen
etic
div
ersi
ty1
45
0.7
11
0.0
53
0.5
72
0.7
74
(5)
Lo
gN
eoli
thic
tran
siti
on
tim
ing
14
58
.34
30
.59
55
.99
19
.25
9
(6)
Lo
gp
erce
nta
ge
of
arab
lela
nd
14
52
.23
21
.20
3-2
.12
04
.12
9
(7)
Lo
gab
solu
tela
titu
de
14
53
.00
30
.92
40
.00
04
.15
9
(8)
Lo
gla
nd
suit
abil
ity
for
agri
cult
ure
14
5-1
.40
91
.31
3-5
.85
7-0
.04
1
(9)
Lo
gd
ista
nce
tore
gio
nal
fro
nti
erin
15
00
CE
14
57
.30
91
.58
70
.00
09
.28
8
(10
)L
og
dis
tan
ceto
reg
ion
alfr
on
tier
in1
00
0C
E1
45
7.4
06
1.2
15
0.0
00
9.2
58
(11
)L
og
dis
tan
ceto
reg
ion
alfr
on
tier
in1
CE
14
57
.38
91
.30
70
.00
09
.26
1
(12
)M
ean
elev
atio
n1
45
0.5
55
0.4
81
0.0
24
2.6
74
(13
)T
erra
inro
ug
hn
ess
14
50
.17
80
.13
50
.01
30
.60
2
(14
)M
ean
dis
tan
ceto
nea
rest
wat
erw
ay1
45
0.3
50
0.4
56
0.0
14
2.3
86
(15
)P
erce
nta
ge
of
lan
dn
ear
aw
ater
way
14
50
.43
70
.36
80
.00
01
.00
0
(16
)C
lim
ate
96
1.5
31
1.0
46
0.0
00
3.0
00
(17
)O
rien
tati
on
of
con
tin
enta
lax
is9
61
.52
10
.68
50
.50
03
.00
0
(18
)S
ize
of
con
tin
ent
96
30
.60
81
3.6
05
0.0
65
44
.61
4
(19
)D
om
esti
cab
lep
lan
ts9
61
3.2
60
13
.41
62
.00
03
3.0
00
(20
)D
om
esti
cab
lean
imal
s9
63
.77
14
.13
60
.00
09
.00
0
(21
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a1
45
8.3
99
6.9
70
0.0
00
26
.77
1
(22
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
14
56
.00
33
.55
80
.00
01
4.4
20
(23
)M
igra
tory
dis
tan
cefr
om
Lo
nd
on
14
58
.88
47
.10
40
.00
02
6.8
60
(24
)M
igra
tory
dis
tan
cefr
om
To
ky
o1
45
11
.07
63
.78
50
.00
01
9.3
10
(25
)M
igra
tory
dis
tan
cefr
om
Mex
ico
Cit
y1
45
15
.68
16
.18
50
.00
02
5.0
20
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 53
TA
BL
EG
4—
PA
IRW
ISE
CO
RR
EL
AT
ION
SF
OR
TH
E1
45
-CO
UN
TR
YH
IST
OR
ICA
LS
AM
PL
E
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10
)(1
1)
(12
)
(1)
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
1.0
00
(2)
Lo
gp
op
ula
tio
nd
ensi
tyin
10
00
CE
0.9
63
1.0
00
(3)
Lo
gp
op
ula
tio
nd
ensi
tyin
1C
E0
.87
60
.93
61
.00
0
(4)
Pre
dic
ted
gen
etic
div
ersi
ty0
.39
10
.31
20
.34
11
.00
0
(5)
Lo
gN
eoli
thic
tran
siti
on
tim
ing
0.5
11
0.5
67
0.6
45
0.2
75
1.0
00
(6)
Lo
gp
erce
nta
ge
of
arab
lela
nd
0.5
82
0.5
01
0.4
55
0.1
32
0.1
57
1.0
00
(7)
Lo
gab
solu
tela
titu
de
0.1
01
0.0
90
0.2
84
0.1
06
0.3
22
0.2
72
1.0
00
(8)
Lo
gla
nd
suit
abil
ity
for
agri
cult
ure
0.3
64
0.3
02
0.2
48
-0.2
51
-0.1
33
0.6
49
-0.0
44
1.0
00
(9)
Lo
gd
ista
nce
tore
gio
nal
fro
nti
erin
15
00
CE
-0.3
60
-0.3
67
-0.4
01
-0.0
21
-0.3
96
-0.1
88
-0.3
18
-0.0
12
1.0
00
(10
)L
og
dis
tan
ceto
reg
ion
alfr
on
tier
in1
00
0C
E-0
.24
3-0
.32
8-0
.38
9-0
.08
4-0
.52
2-0
.10
1-0
.30
70
.17
50
.60
61
.00
0
(11
)L
og
dis
tan
ceto
reg
ion
alfr
on
tier
in1
CE
-0.3
26
-0.3
90
-0.5
03
-0.0
82
-0.5
21
-0.1
77
-0.3
41
-0.0
02
0.4
57
0.7
03
1.0
00
(12
)M
ean
elev
atio
n-0
.02
8-0
.04
6-0
.04
70
.10
60
.06
9-0
.05
1-0
.02
60
.01
80
.01
80
.03
30
.02
81
.00
0
(13
)T
erra
inro
ug
hn
ess
0.1
97
0.1
99
0.2
18
-0.1
61
0.2
15
0.1
26
0.0
68
0.2
87
-0.0
60
-0.1
10
-0.2
29
0.6
26
(14
)M
ean
dis
tan
ceto
nea
rest
wat
erw
ay-0
.30
5-0
.33
1-0
.36
60
.19
5-0
.01
7-0
.17
8-0
.01
4-0
.23
00
.15
50
.11
80
.17
00
.42
9
(15
)P
erce
nta
ge
of
lan
dn
ear
aw
ater
way
0.3
83
0.3
62
0.3
91
-0.1
92
0.1
10
0.2
94
0.2
44
0.2
90
-0.2
10
-0.0
84
-0.2
19
-0.5
26
(16
)C
lim
ate
0.5
27
0.5
67
0.6
33
0.0
80
0.6
21
0.4
38
0.5
63
0.1
01
-0.4
14
-0.4
56
-0.4
60
-0.1
78
(17
)O
rien
tati
on
of
con
tin
enta
lax
is0
.47
90
.50
00
.57
30
.15
90
.64
40
.30
20
.45
30
.06
6-0
.20
2-0
.24
1-0
.33
6-0
.01
9
(18
)S
ize
of
con
tin
ent
0.3
08
0.3
39
0.4
08
0.4
65
0.4
54
0.2
44
0.3
27
-0.1
59
-0.1
25
-0.2
18
-0.2
67
0.1
48
(19
)D
om
esti
cab
lep
lan
ts0
.51
00
.53
80
.66
30
.34
60
.63
60
.37
10
.64
2-0
.11
2-0
.43
9-0
.50
7-0
.50
2-0
.19
7
(20
)D
om
esti
cab
lean
imal
s0
.58
00
.61
50
.69
90
.24
90
.76
80
.36
60
.64
0-0
.05
5-0
.42
7-0
.46
7-0
.46
1-0
.12
4
(21
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a-0
.39
1-0
.31
2-0
.34
1-1
.00
0-0
.27
5-0
.13
2-0
.10
60
.25
10
.02
10
.08
40
.08
2-0
.10
6
(22
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
-0.2
87
-0.2
38
-0.2
83
-0.9
34
-0.3
31
-0.0
44
-0.0
17
0.3
34
-0.0
53
0.0
79
0.0
73
-0.1
53
(23
)M
igra
tory
dis
tan
cefr
om
Lo
nd
on
-0.5
37
-0.4
73
-0.5
18
-0.8
99
-0.4
97
-0.2
71
-0.3
85
0.1
76
0.2
15
0.2
24
0.2
56
0.0
01
(24
)M
igra
tory
dis
tan
cefr
om
To
ky
o-0
.42
0-0
.40
3-0
.35
3-0
.26
6-0
.55
9-0
.18
7-0
.31
60
.05
60
.16
40
.22
70
.22
4-0
.12
2
(25
)M
igra
tory
dis
tan
cefr
om
Mex
ico
Cit
y0
.19
80
.12
80
.16
70
.82
2-0
.03
40
.00
9-0
.00
6-0
.21
00
.16
90
.18
90
.18
00
.00
2
(13
)(1
4)
(15
)(1
6)
(17
)(1
8)
(19
)(2
0)
(21
)(2
2)
(23
)(2
4)
(13
)T
erra
inro
ug
hn
ess
1.0
00
(14
)M
ean
dis
tan
ceto
nea
rest
wat
erw
ay-0
.00
21
.00
0
(15
)P
erce
nta
ge
of
lan
dn
ear
aw
ater
way
0.0
39
-0.6
65
1.0
00
(16
)C
lim
ate
0.1
47
-0.4
97
0.4
31
1.0
00
(17
)O
rien
tati
on
of
con
tin
enta
lax
is0
.26
0-0
.23
60
.28
50
.48
21
.00
0
(18
)S
ize
of
con
tin
ent
-0.0
79
0.1
04
-0.1
70
0.3
27
0.6
68
1.0
00
(19
)D
om
esti
cab
lep
lan
ts0
.07
4-0
.33
30
.33
40
.81
70
.61
30
.48
71
.00
0
(20
)D
om
esti
cab
lean
imal
s0
.13
2-0
.32
20
.34
90
.77
80
.74
30
.51
60
.87
81
.00
0
(21
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a0
.16
1-0
.19
50
.19
2-0
.08
0-0
.15
9-0
.46
5-0
.34
6-0
.24
91
.00
0
(22
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
0.1
70
-0.2
20
0.2
92
-0.0
74
-0.0
58
-0.4
29
-0.2
81
-0.1
91
0.9
34
1.0
00
(23
)M
igra
tory
dis
tan
cefr
om
Lo
nd
on
0.0
88
-0.0
81
-0.0
42
-0.4
07
-0.4
55
-0.5
69
-0.6
83
-0.6
12
0.8
99
0.8
00
1.0
00
(24
)M
igra
tory
dis
tan
cefr
om
To
ky
o-0
.26
4-0
.12
8-0
.11
8-0
.22
0-0
.67
2-0
.31
0-0
.30
9-0
.62
60
.26
60
.17
20
.41
81
.00
0
(25
)M
igra
tory
dis
tan
cefr
om
Mex
ico
Cit
y-0
.28
40
.09
4-0
.19
4-0
.01
4-0
.03
50
.26
80
.14
80
.07
5-0
.82
2-0
.75
9-0
.67
5-0
.02
5
54 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TA
BL
EG
5—
SU
MM
AR
YS
TA
TIS
TIC
SF
OR
TH
E1
43
-CO
UN
TR
YC
ON
TE
MP
OR
AR
YS
AM
PL
E
Ob
s.M
ean
S.D
.M
in.
Max
.
(1)
Lo
gin
com
ep
erca
pit
ain
20
00
CE
14
38
.41
61
.16
56
.15
81
0.4
45
(2)
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
14
30
.89
11
.49
6-3
.81
73
.84
2
(3)
Pre
dic
ted
gen
etic
div
ersi
ty(u
nad
just
ed)
14
30
.71
20
.05
20
.57
20
.77
4
(4)
Pre
dic
ted
gen
etic
div
ersi
ty(a
nce
stry
adju
sted
)1
43
0.7
27
0.0
27
0.6
28
0.7
74
(5)
Lo
gN
eoli
thic
tran
siti
on
tim
ing
(un
adju
sted
)1
43
8.3
43
0.5
99
5.9
91
9.2
59
(6)
Lo
gN
eoli
thic
tran
siti
on
tim
ing
(an
cest
ryad
just
ed)
14
38
.49
50
.45
47
.21
39
.25
0
(7)
Lo
gp
erce
nta
ge
of
arab
lela
nd
14
32
.25
11
.18
0-2
.12
04
.12
9
(8)
Lo
gab
solu
tela
titu
de
14
33
.01
40
.92
10
.00
04
.15
9
(9)
Lo
gla
nd
suit
abil
ity
for
agri
cult
ure
14
3-1
.41
61
.32
1-5
.85
7-0
.04
1
TA
BL
EG
6—
PA
IRW
ISE
CO
RR
EL
AT
ION
SF
OR
TH
E1
43
-CO
UN
TR
YC
ON
TE
MP
OR
AR
YS
AM
PL
E
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(1)
Lo
gin
com
ep
erca
pit
ain
20
00
CE
1.0
00
(2)
Lo
gp
op
ula
tio
nd
ensi
tyin
15
00
CE
-0.0
11
1.0
00
(3)
Pre
dic
ted
gen
etic
div
ersi
ty(u
nad
just
ed)
-0.2
12
0.3
78
1.0
00
(4)
Pre
dic
ted
gen
etic
div
ersi
ty(a
nce
stry
adju
sted
)-0
.16
00
.06
50
.77
51
.00
0
(5)
Lo
gN
eoli
thic
tran
siti
on
tim
ing
(un
adju
sted
)0
.19
10
.51
20
.27
60
.01
61
.00
0
(6)
Lo
gN
eoli
thic
tran
siti
on
tim
ing
(an
cest
ryad
just
ed)
0.4
38
0.2
45
-0.1
50
-0.1
51
0.7
62
1.0
00
(7)
Lo
gp
erce
nta
ge
of
arab
lela
nd
-0.0
08
0.5
71
0.0
97
0.0
83
0.1
56
0.1
58
1.0
00
(8)
Lo
gab
solu
tela
titu
de
0.5
02
0.0
83
0.0
82
0.0
43
0.3
22
0.4
19
0.2
48
1.0
00
(9)
Lo
gla
nd
suit
abil
ity
for
agri
cult
ure
-0.1
05
0.3
69
-0.2
51
-0.2
43
-0.1
33
-0.0
82
0.6
70
-0.0
41
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 55
TA
BL
EG
7—
SU
MM
AR
YS
TA
TIS
TIC
SF
OR
TH
E1
09
-CO
UN
TR
YC
ON
TE
MP
OR
AR
YS
AM
PL
E
Ob
s.M
ean
S.D
.M
in.
Max
.
(1)
Lo
gin
com
ep
erca
pit
ain
20
00
CE
10
98
.45
51
.18
96
.52
41
0.4
45
(2)
Pre
dic
ted
gen
etic
div
ersi
ty(a
nce
stry
adju
sted
)1
09
0.7
26
0.0
30
0.6
28
0.7
74
(3)
Lo
gN
eoli
thic
tran
siti
on
tim
ing
(an
cest
ryad
just
ed)
10
98
.41
70
.47
87
.21
39
.25
0
(4)
Lo
gp
erce
nta
ge
of
arab
lela
nd
10
92
.24
81
.14
5-2
.12
04
.12
9
(5)
Lo
gab
solu
tela
titu
de
10
92
.84
10
.96
00
.00
04
.15
9
(6)
So
cial
infr
astr
uct
ure
10
90
.45
30
.24
30
.11
31
.00
0
(7)
Eth
nic
frac
tio
nal
izat
ion
10
90
.46
00
.27
10
.00
20
.93
0
(8)
Per
cen
tag
eo
fp
op
ula
tio
nat
risk
of
con
trac
tin
gm
alar
ia1
09
0.3
69
0.4
38
0.0
00
1.0
00
(9)
Per
cen
tag
eo
fp
op
ula
tio
nli
vin
gin
tro
pic
alzo
nes
10
90
.36
10
.41
90
.00
01
.00
0
(10
)M
ean
dis
tan
ceto
nea
rest
wat
erw
ay1
09
0.2
98
0.3
23
0.0
08
1.4
67
(11
)P
erce
nta
ge
of
po
pu
lati
on
of
Eu
rop
ean
des
cen
t1
09
0.3
13
0.4
04
0.0
00
1.0
00
(12
)Y
ears
of
sch
oo
lin
g9
44
.52
72
.77
60
.40
91
0.8
62
(13
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a(u
nad
just
ed)
10
99
.08
17
.71
50
.00
02
6.7
71
(14
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a(a
nce
stry
adju
sted
)1
09
6.3
65
3.9
29
0.0
00
19
.38
8
(15
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
(un
adju
sted
)1
09
6.3
32
3.8
61
0.0
00
14
.42
0
(16
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
(an
cest
ryad
just
ed)
10
95
.17
82
.42
60
.00
01
2.1
80
56 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
TA
BL
EG
8—
PA
IRW
ISE
CO
RR
EL
AT
ION
SF
OR
TH
E1
09
-CO
UN
TR
YC
ON
TE
MP
OR
AR
YS
AM
PL
E
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10
)
(1)
Lo
gin
com
ep
erca
pit
ain
20
00
CE
1.0
00
(2)
Pre
dic
ted
gen
etic
div
ersi
ty(a
nce
stry
adju
sted
)-0
.26
31
.00
0
(3)
Lo
gN
eoli
thic
tran
siti
on
tim
ing
(an
cest
ryad
just
ed)
0.5
35
-0.2
32
1.0
00
(4)
Lo
gp
erce
nta
ge
of
arab
lela
nd
0.0
50
0.0
94
0.2
21
1.0
00
(5)
Lo
gab
solu
tela
titu
de
0.5
63
-0.0
07
0.3
75
0.2
42
1.0
00
(6)
So
cial
infr
astr
uct
ure
0.8
08
-0.2
20
0.3
62
0.1
19
0.4
76
1.0
00
(7)
Eth
nic
frac
tio
nal
izat
ion
-0.6
44
0.1
99
-0.4
06
-0.2
55
-0.5
90
-0.4
68
1.0
00
(8)
Per
cen
tag
eo
fp
op
ula
tio
nat
risk
of
con
trac
tin
gm
alar
ia-0
.78
40
.41
1-0
.62
4-0
.20
4-0
.59
9-0
.59
30
.67
21
.00
0
(9)
Per
cen
tag
eo
fp
op
ula
tio
nli
vin
gin
tro
pic
alzo
nes
-0.4
18
-0.1
80
-0.2
66
-0.0
01
-0.7
23
-0.3
80
0.3
96
0.4
14
1.0
00
(10
)M
ean
dis
tan
ceto
nea
rest
wat
erw
ay-0
.44
40
.33
5-0
.27
6-0
.18
9-0
.23
8-0
.30
90
.41
70
.44
3-0
.07
91
.00
0
(11
)P
erce
nta
ge
of
po
pu
lati
on
of
Eu
rop
ean
des
cen
t0
.73
2-0
.15
30
.46
40
.26
00
.55
10
.67
4-0
.53
3-0
.62
9-0
.38
5-0
.34
6
(12
)Y
ears
of
sch
oo
lin
g0
.86
6-0
.13
70
.42
80
.13
80
.55
60
.75
8-0
.47
8-0
.66
5-0
.43
6-0
.36
0
(13
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a(u
nad
just
ed)
0.2
84
-0.7
49
0.2
38
-0.0
99
-0.0
15
0.0
99
-0.1
68
-0.4
24
0.2
67
-0.2
79
(14
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a(a
nce
stry
adju
sted
)0
.26
3-1
.00
00
.23
2-0
.09
40
.00
70
.22
0-0
.19
9-0
.41
10
.18
0-0
.33
5
(15
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
(un
adju
sted
)0
.33
5-0
.81
40
.18
8-0
.04
70
.06
40
.19
5-0
.22
9-0
.44
30
.23
9-0
.33
8
(16
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
(an
cest
ryad
just
ed)
0.3
31
-0.9
42
0.1
66
-0.0
27
0.1
25
0.2
99
-0.2
66
-0.4
10
0.1
19
-0.3
70
(11
)(1
2)
(13
)(1
4)
(15
)
(11
)P
erce
nta
ge
of
po
pu
lati
on
of
Eu
rop
ean
des
cen
t1
.00
0
(12
)Y
ears
of
sch
oo
lin
g0
.77
31
.00
0
(13
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a(u
nad
just
ed)
0.2
53
0.1
91
1.0
00
(14
)M
igra
tory
dis
tan
cefr
om
Eas
tA
fric
a(a
nce
stry
adju
sted
)0
.15
30
.13
70
.74
91
.00
0
(15
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
(un
adju
sted
)0
.28
30
.26
10
.93
90
.81
41
.00
0
(16
)A
eria
ld
ista
nce
fro
mE
ast
Afr
ica
(an
cest
ryad
just
ed)
0.2
24
0.2
33
0.6
77
0.9
42
0.8
29
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 57
H EVIDENCE FROM EVOLUTIONARY BIOLOGY
The proposed diversity hypothesis suggests that there exists a trade-off with respect
to genetic diversity in human populations. Specifically, higher diversity generates so-
cial benefits by enhancing society’s productivity through efficiency gains via comple-
mentarities across different productive traits, by increasing society’s resilience against
negative productivity shocks, and by fostering society’s adaptability to a change in the
technological environment. Higher diversity also generates social costs, however, by
increasing the likelihood of miscoordination and distrust between interacting agents and
by inhibiting the emergence and sustainability of cooperative behavior in society. Indeed,
the ideas underlying these channels ensue rather naturally from well-established concepts
in evolutionary biology.
The following narrative discusses some of the analogous arguments from the field
of evolutionary biology and presents supporting evidence from recent scientific studies.
These studies typically focus on organisms like ants, bees, wasps, and certain species of
spiders and birds that are not only amenable to laboratory experimentation but also dis-
play a relatively high degree of social behavior in nature, such as living in task-directed
hierarchical societies, characterized by division of labor, or engaging in cooperative
rearing of their young. The motivation behind studying such organisms is often related
to the work of sociobiologists (e.g., Wilson 1978, Hölldobler and Wilson 1990) who
have argued that the application of evolutionary principles in explaining the behavior of
social insects lends key insights to the understanding of social behavior in more complex
organisms like humans.
H1. Benefits of Genetic Diversity
The notion that genetic diversity within a given population is beneficial for individual
reproductive fitness, and thus for the adaptability and survivability of the population as
a whole, is one of the central tenets of Darwin’s (1859) theory of evolution. In the short
term, by reducing the extent of inbreeding, genetic diversity prevents the propagation
of potentially deleterious traits in the population across generations (Houle 1994). In
the long term, by permitting the force of natural selection to operate over a wider spec-
trum of traits, genetic diversity increases the population’s capacity to adapt to changing
environmental conditions (Frankham et al. 1999).
To elaborate further, the study by Frankham et al. (1999) provides clear experimental
evidence for the beneficial effect of genetic diversity in enhancing the survivability of
populations under deleterious changes in the environment. In their experiment, popula-
tions of the common fruit fly, Drosophila melanogaster, were subjected to different rates
of inbreeding, and their ability to tolerate increasing concentrations of sodium chloride,
or common salt, which is harmful for this species of flies, was compared with that of
outbred base populations. Indeed, the less diverse inbred populations became extinct at
significantly lower concentrations of sodium chloride than the more genetically diverse
base populations.
58 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
In related studies, Tarpy (2003) and Seeley and Tarpy (2007) employ the honeybee,
Apis mellifera, to demonstrate that polyandry, i.e., the practice of mating with multiple
male drones by queen bees, may be an adaptive strategy since the resultant increase in
genetic diversity increases the colony’s resistance to disease. For instance, having created
colonies headed by queens that had been artificially inseminated by either one or ten
drones, Seeley and Tarpy inoculated these colonies with spores of Paenibacillus larvae, a
bacterium that causes a highly virulent disease in honeybee larvae. The researchers found
that, on average, colonies headed by multiple-drone inseminated queens had markedly
lower disease intensity and higher colony strength relative to colonies headed by single-
drone inseminated queens.
In addition to increasing disease resistance, it has been argued that genetic diversity
within honeybee colonies provides them with a system of genetically-based task special-
ization, thereby enabling them to respond more resiliently to environmental perturbations
(Oldroyd and Fewell 2007). Evidence supporting this viewpoint is provided by the study
of Jones et al. (2004). Honeybee colonies need to maintain their brood nest temperature
between 32◦C and 36◦C, and optimally at 35◦C, so that the brood develops normally.
Workers regulate temperature by fanning hot air out of the nest when the temperature
is perceived as being too high and by clustering together and generating metabolic heat
when the temperature is perceived to be too low. Ideally, a graded rather than precipitous
response is required to ensure that the colony does not constantly oscillate between
heating and cooling responses. In their experiment, Jones et al. artificially constructed
genetically uniform and diverse honeybee colonies and compared their thermoregulation
performances under exposure to ambient temperatures. The researchers found that, over
a period of 2 weeks, the within-colony variance in temperatures maintained by the diverse
colonies (0.047◦C) was less than one-third of the within-colony temperature variance
maintained by the uniform ones (0.165◦C) and that this difference in thermoregulation
performance was statistically significant (F-statistic = 3.5, P-value < 0.001). Figure H1
illustrates the superior thermoregulation performance of a genetically diverse colony, in
comparison to that of a uniform one, in the Jones et al. experiment.
A popular hypothesis regarding the benefits of diversity, one that appears most anal-
ogous to the arguments raised in this paper, suggests that genetically diverse honey-
bee colonies may operate more efficiently by performing tasks better as a collective,
thereby gaining a fitness advantage over colonies with uniform gene pools (Robinson and
Page 1989). Results from the experimental study by Mattila and Seeley (2007) provide
evidence supporting this hypothesis. Since the channel highlighted by this hypothesis is
closely related to the idea proposed in the current study, the remainder of this section is
devoted to the Mattila and Seeley experiment.
A honeybee colony propagates its genes in two ways: by producing reproductive
males (drones) and by producing swarms. Swarming occurs when a reproductive female
(queen) and several thousand infertile females (workers) leave their colony to establish
a new nest. Swarming is costly and perilous. With limited resources and labor, a swarm
must construct new comb, build a food reserve, and begin rearing workers to replace
an aging workforce. In temperate climates, newly founded colonies must operate effi-
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 59
diverse colonies (�2 � 0.22°C) (F603,603 �3.83, P � 0.001).
Our third experiment shows a necessarycondition for the task threshold model to berelevant to colony thermoregulation: Natural-ly occurring patrilines should vary markedlyin their threshold for the task of fanning. Weexposed two five-patriline colonies to in-creasing temperatures and collected fanningbees from the entrances. We then determinedthe paternity of the fanning workers by meansof genetic markers (Fig. 2). As required bythe task threshold model, the proportion offanning workers from each patriline variedsignificantly as temperature was increased(likelihood ratio test; Colony A: G � 70.5, df� 28, P � 0.001; Colony B: G � 44.07, df �24, P � 0.007). In both colonies tested, somepatrilines (Fig. 2, A2, A3, and B3) fanned inmuch higher proportions than other patrilinesfor many or all of the experimental tempera-tures. This supports the response thresholdmodel, as it suggests that these patrilines hadlower than average thresholds for fanning.
In both experimental colonies, there werealso significant differences in the proportionof workers of each patriline in the fanningsamples relative to the random samples atmost experimental temperatures (6 tempera-tures out of 8 in colony A and 4 out of 7 incolony B, G tests, P � 0.05, df � 4). To testthe possibility that these changes were causedby the time of day rather than by temperature,we conducted a control experiment using col-ony B in which ambient temperature was heldat a constant 37°C. Here, time of day did nothave a significant effect on the proportion ofworkers of each patriline fanning (G � 16.28,df � 12, P � 0.2).
The responses of different patrilines tochanges in ambient temperature show twoimportant phenomena. First, patrilines un-doubtedly vary in their responses to changingtemperature, a necessary condition for thetask threshold model. Second, the proportionof fanning workers from different patrilineschanges erratically with temperature. Thereare three likely reasons for the observed non-linearity of patrilineal responses to environ-mental changes. First, a patriline’s thresholdfor performing another thermoregulationtask, such as water collection, may be lowerthan that for fanning and therefore drawmembers of that patriline away from the taskof fanning. Second, the work of nest mates ofother patrilines must change the stimulus tofan. Finally, at least some of the apparentlyrandom changes in patriline proportions aredue to the way we have presented our data.Workers from any single patriline could infact be fanning in steady numbers, rather thanincreasing or decreasing, but as the number ofworkers fanning from another patriline in-creases, the number from the first patrilineappears to decrease proportionally. This arti-
fact could only be overcome if it were possi-ble to test the entire fanning population, rath-er than sampling a subset.
Why should advanced insect societiessuch as that of the honey bee rely on mul-tiple mating and a lottery of paternal geno-types to ensure that their nests are homeo-static? Polyandry probably evolved inhoney bees for reasons other than the taskallocation system. Because of the sex de-termination system of hymenoptera (24), aqueen that mates with a single male carry-ing the same sex allele as herself suffers a50% loss of her diploid brood. Queens canreduce the probability of this occurring bymating with many males, and this seems tohave been the primary cause of the evolu-tion of polyandry in some eusocial insects(25, 26). We argue that, as a secondarilyacquired phenomenon, genetic diversity inthe stimulus level required for an individualto begin a task contributes to overall colonyfitness by enhancing the task allocationsystem. We suggest that a genetically di-
verse colony can respond appropriately to agreater variety of environmental perturba-tions without overreacting. In contrast, col-onies with low genetic diversity (only oneor two patrilines) have a narrow range ofthresholds among their workers, and thiscan lead to perturbations in colony ho-meostasis because too many workers areallocated to those tasks for which the col-ony’s particular genotypes have a low taskthreshold (27, 28). Such colonies can expe-rience large oscillations above and belowthe optimal colony-level phenotype.
Evolutionary theory (29, 30) suggests thattraits related to fitness should exhibit lowgenetic variation, because selection shouldact to remove genetic variance from the pop-ulation. However, in insect societies, selec-tion acts at the level of the colony (31) tofavor those that can most precisely regulatethe internal conditions of the nest, includingthose with the ability to precisely regulatebrood nest temperature over a broad range ofambient temperatures. Without direct selec-
Fig. 1. Temperature variation ingenetically diverse and uniformhoney bee colonies. This graphshows the average hourly tem-perature for one representativepair of colonies in the first exper-imental week. Other colony pairscan be seen in Fig. S1.
Fig. 2. Patrilines vary in their fanning response to changing ambient temperatures. The twofive-patriline colonies studied each consisted of �5000 bees. We used five-patriline colonies toreduce the sample size required to produce adequate minimum expected values in a G test (32).Each colony was maintained in a two-frame observation hive in an insulated room in which thetemperature could be controlled to �1°C. Colonies were heated from 25°C to 40°C in 1°C steps.Fanning bees (50) were collected over each 2-degree interval from the entrance tube with forceps.A random sample of 50 bees was also taken from the colony after each experiment. To determinethe patriline of all workers sampled, we extracted DNA using the Chelex method (33, 34). DNA wasthen amplified by polymerase chain reaction with the microsatellite primers A76 (35) and A113(36) for colony 1 and A88 (36) and A113 for colony 2. Patrilines were then determined as outlinedby Estoup et al. (35).
R E P O R T S
www.sciencemag.org SCIENCE VOL 305 16 JULY 2004 403
on
Oct
ober
31,
200
8 w
ww
.sci
ence
mag
.org
Dow
nloa
ded
from
FIGURE H1. THERMOREGULATION IN GENETICALLY UNIFORM VS. DIVERSE HONEYBEE COLONIES
Note: This figure depicts the results from the experimental study by Jones et al. (2004), illustrating the superior
thermoregulation performance, as reflected by lower intertemporal temperature volatility, of a genetically diversity
honeybee colony in comparison to a genetically uniform honeybee colony.
Source: Jones et al. (2004).
ciently because there is limited time to acquire the resources to support these activities.
Colony founding through swarming is so difficult that only 20% of swarms survive their
first year. Most do not gather adequate food to fuel the colony throughout the winter
and therefore die of starvation. With the challenges of successful colony founding in
mind, Mattila and Seeley conducted a long-term study to compare the development
characteristics of genetically diverse and genetically uniform colonies after a swarming
event. The researchers began by creating genetically diverse colonies, using queens
instrumentally inseminated with semen from multiple drones, and genetically uniform
ones, using queens inseminated by one drone. They then generated swarms artificially,
selecting from each colony the queen and a random subset of her worker offspring, and
allowed these swarms to found new colonies. The observations in the Mattila and Seeley
experiment begin on June 11, 2006, when the swarms established their new nest sites. In
particular, they document colony development by measuring comb construction, brood
rearing, foraging activity, food storage, population size, and mean weight gain at regular
intervals.
As depicted in Figure H2, Mattila and Seeley found that, during the first two weeks
of colony development, colonies with genetically diverse worker populations built about
30% more comb than colonies with genetically uniform populations, a difference that
60 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
Genetically Diverse Genetically Uniform
FIGURE H2. COMB AREA GROWTH IN GENETICALLY DIVERSE VERSUS UNIFORM HONEYBEE COLONIES
Note: This figure depicts the results from the experimental study by Mattila and Seeley (2007), illustrating the superior
productivity, as reflected by faster mean comb area growth, of genetically diversity honeybee colonies in comparison to
genetically uniform honeybee colonies.
Source: Mattila and Seeley (2007).
was highly statistically significant (F-statistic = 25.7, P-value < 0.001). Furthermore,
as illustrated in Figure H3, during the second week of colony founding, genetically
diverse colonies maintained foraging rates (measured as the number returning to hive per
minute of either all workers and or only those carrying pollen) that were between 27%
and 78% higher than those of genetically uniform colonies. Consequently, after two
weeks of inhabiting their nest sites, genetically diverse colonies stockpiled 39% more
food than the uniform ones. The researchers also found that production of new workers
and brood rearing by existing workers were both significantly higher in the genetically
diverse colonies within the first month of colony development. As a result of these
various accumulated productivity gains, the genetically diverse colonies all survived
an unusually cold exposure, occurring two months after the establishment of their nest
sites, that starved and killed about 50% of the genetically uniform colonies. Based on
their findings, the authors conclude that collective productivity and fitness in honeybee
colonies is indeed enhanced by intracolonial genetic diversity.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 61
DiverseUniform
DiverseUniform
All Workers Workers Carrying Pollen
FIGURE H3. FORAGING RATES IN GENETICALLY DIVERSE VERSUS UNIFORM HONEYBEE COLONIES
Note: This figure depicts the results from the experimental study by Mattila and Seeley (2007), illustrating the superior
productivity, as reflected by a higher mean foraging rate, of genetically diversity honeybee colonies in comparison to
genetically uniform honeybee colonies.
Source: Mattila and Seeley (2007).
H2. Benefits of Genetic Relatedness and Homogeneity
The notion that genetic relatedness between individuals, and genetic homogeneity
of a group in general, can be collectively beneficial is highlighted in an extension of
Darwinian evolutionary theory known as kin selection theory. In particular, the concept
of “survival of the fittest” in standard Darwinian theory implies that, over time, the world
should be dominated by selfish behavior since natural selection favors genes that increase
an organism’s ability to survive and reproduce. This implication of evolutionary theory
remained at odds with the observed prevalence of altruistic and cooperative behavior in
nature until the formalization of kin selection theory by Hamilton (1964) and Maynard
Smith (1964). According to this influential theory, the indirect fitness gains of genetic
relatives can in some cases more than compensate for the private fitness loss incurred by
individuals displaying altruistic or cooperative behavior. Hence, given that relatives are
more likely to share common traits, including those responsible for altruism or cooper-
ation, kin selection provides a rationale for the propagation of cooperative behavior in
nature.
An immediate implication of kin selection theory is that, when individuals can distin-
62 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
guish relatives from non-relatives (kin recognition), altruists should preferentially direct
aid towards their relatives (kin discrimination). The study by Russell and Hatchwell
(2001) provides experimental evidence of this phenomenon in Aegithalos caudatus, a
species of cooperatively breeding birds commonly known as the long-tailed tit. In this
species, individuals distinguish between relatives and non-relatives on the basis of vocal
contact cues (Sharp et al. 2005), and failed breeders can become potential helpers in
rearing the young of successful breeders within the same social unit. In their research,
Russell and Hatchwell designed an experiment to investigate whether the presence of
kin within the social unit was a necessary condition for altruistic behavior and whether
kin were preferred to non-kin when given the choice. As depicted in Figure H4, the
researchers found that failed breeders did not actually become helpers when kin were
absent from the social unit (Panel (a)), but when both kin and non-kin were present in
the same social unit, the majority of failed breeders provided brood-rearing assistance at
the nests of kin (Panel (b)).
�� � �� ��� � ����� ���� �� � ����� ���� ������ � ���������� � ��������� ?������ ��� ������� ���� ��������� ����� �������� ��� ���� ��� �� ����� ������� ������� ������ �� �������� �� ��� ����� �� � ������� ��� ��� � � �������� �� �� ��� ��� ���"��� �� �� �#������ ���������� ���� ��� �� ������ ��� �� � ����� ��� �� ����� ��� ������� �� ��� ���� �� ���� ������� ���� ������� � � ���� �� �� �� ������� �� ����������� ��� �� ��� ��� ��� ������� �#������ �������
���� ������ �� ��� �� ��� ����������� �� ���������������� �� ������� �� ����� ���� � ��� �� ������ ��������� ���� ��� ��� ����� �� ��� ���� � �� �������� � � �� �� ��� ���� ���� �������� !��������� �� ������ ��� ��� �� �� ��� ������� �� ���� ������������ ���� ����� "����� ����"�� �� ����D����� ������������� ��������� ������� �� ���� ������ ��� ������ ��� ������� ��� ��� � ��� � ��� �������������� =������� �� ���� ���� � ��� ���������� ����������� ������ ������ ����������� �,������ �� � � :55("����� ���� ��� ���� ��� ���������� �� ��� ����� ������M���� ())5�� ��� ���� ��� ������������� �� ���� ����� ������ �� ��������� �� ��� �#������� ��#��� ������ ���� ����� ����� ����� ��� ����������� ������ �� ������� �� ���� �3���� ()))% ,������ �� � ! :55(��� ���������� ��� ����� ��� �� ���������� ��� ������� 4��� ����� �� ���� ���� �� ��� ��� ���������� ����������� ������ �� ��� �� ��� �� ��� �������� �� ��� ����������� �� ��� ����� �� �#����� ������ ���� �� �� ������ ���� ������ �� ���� ���� � �� ��
������ �� �� �� ��� ����� �� � ������� ���� ���� ��� ������ ��� ���� �� � �� ���� ����� � ��� ���� ������ ����� � ������ �� �����
��� ,����� ������� ��� ������ ����� ����� �� � ���� ��� ���� ��� �� ��� � ������ ����"� ��������� ���� �� ������ � ������ ���� ��� � ������ ���� ��� ��������� ��
���� >� ��������� ��������� ����� ��� ������ ������ ���� �� ��� ��� �� �� ���� �� ��� ���� ��� ������ �� � ��� �� ��������� �� ����� �������������� �E������ < M����� ())5% ,������ �� � �:555��
����� ��� ��� ��� �� ����������� ��� ��� ������� ����������� ����� � ���� �� �������� �� ���� �������� ���� ������� ��� ���������� � ������ ����� ����� �� ��� �� ���� ������"������ �� ����!����� �� � � ())+% 6����� < ,������ ()))��
$� ���� �������� �� ��� �� ������� �������������� �� ����� �������� �� � ��� ������� �� �� ��� 4�� �������� ����������� �� ���� ������ �� � �� �������� �� ���� �� �� �� ��� ���� ���� �� ���� �� ��� ��� � �� �� � ������� ,������ �� � � �:55("� ������ ���������� ���� ��� ��������� � �� ��� ������ �� � ���������� ��� �� ���� ������� ���� ��������� �� ���������� ������� ������ ��� ������� �� ��� ,������ ������ �� �������� �� ������� ���� �� �� ���� ��������� �1���� �� ���� �������� ����� �� ���� �� ���� ����������� �,������ �� � � :555�� ������ � � ������� ��������3���� ()))% '� E� ,������ � ���� ����� ������ >�������� �� � �� ������ �� ����� ���� ��� �� ����� ����������� �,������ < 3���� ()).�� �� �� ������ �M ��< ?���� ()**� ��� �� ������� ���� ������ ����� ����� ��� �� ����� ��� �3���� ()))�� >� ��������� �� ����� ���� � ������� �1��� �� ���� ����� ������������������� ����� ������ ��� ���� ���������� ����������3���� ()))% '� E� ,������ � ���� ����� ������ ��������� �� ���� ��� ������"�� ��� ������� ���� �� �������� ������ �� �� ��� �� ���� �� �� ������� �� ��������������� �� ����� ���� ���� �� ������ �� �� ���0������ ��� ���� ���� �� �� ����� ��� ������������ ��������� ������� �� �������� �� ����� ���� ������ � � ���� ������ ����� �������� �� ��� ��������� �� ��� �� ������ �� ��� ����������� �� ���������������� �� ���� �������2� � ��� ���� �������� �� �� ���� ����� �� � ������
���� �������� ���� ������ �������� �� ��� ���������� �� ��� ��� � ������ ���� ������� ��1������ �� �������� ����������� �� ��� ��� �������� -� �� < $����()**� ������ ����� �� ��� �� ���� � ������� ������������ ��� ���� �&���� " �'������ ��� �� ������ ����� � ���� �� �� ��� ���� �� ��� � ����� ��������� ��� ��
:(+: 4� � 3���� ��� '� E� ,������ ��� ���������� �� ��%��� �� ����
���! #! ��! $��! ' �:55(�
50
40
20
10
30
09765432 8
distance travelled to help (rank)
freq
uenc
y (%
)
1
���� @� ������� ����������� �� ��� ��� ��� �� ������������� ��� ��� ���� �� � �� �A� ��� ������� ������ ����� ���� �� ����� �� ������ � �� � ����� ( ��������� ��������� ����� : ��������� ��� ������ ������ ���� ��� �� ����
100
80
40
20
60
0
perc
enta
ge h
elpi
ng
kin presentsame clan
kin absentsame clan
kinsame clan
non-kinsame clan
(a) (b)
���� /� ��� ��� �������� �� ��� �� ��� ��������� ������� ��� �� ������� ��� ��� ������ �� ������� �� ������ �� �� ���� ��� ��� ����� ������ ��� ���� � ����� (+� ��� ������ ��� ��� ���� � �� ��� )� �������� ��������� ����� �� 5�559�� �"� ��� ������ �� ������� �� �� ���� �� �� �� ��� ����� �� ��� ��� ������� ���� ������� ���� �� � �� ��� ���� � �� ��� (+� �������� �� ������� ����� �� 5�55(��
FIGURE H4. PREFERENTIAL BIAS OF COOPERATION WITH KIN IN THE LONG-TAILED TIT
Note: This figure depicts the results from the experimental study by Russell and Hatchwell (2001), illustrating that, in
the long-tailed tit (a species of cooperatively breeding birds), (i) the presence of genetic relatives (kin) within the social
unit is a necessary condition for the prevalence of altruistic behavior (Panel (a)) and (ii) altruism is preferentially directed
towards genetic relatives when both relatives and non-relatives are present within the same social unit (Panel (b)).
Source: Russell and Hatchwell (2001).
Another prediction of kin selection theory is that the extent of altruism should be
positively correlated with the degree of genetic relatedness (between potential helpers
and beneficiaries) and that this correlation should be stronger the greater the indirect
fitness benefit from altruism. Empirical support for this prediction comes from a study
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 63
by Griffin and West (2003) where relevant data from 18 collectively breeding vertebrate
species was used to (i) test the relationship between the amount of help in brood rearing
and relatedness and (ii) examine how this correlation varied with the benefit of helping
(measured in terms of relatives’ offspring production and survival). Specifically, the
study exploited variation across social units within each species in genetic relatedness,
the amount of help, and the indirect fitness benefit of helping. Consistently with kin
selection theory, the researchers found that the cross-species average of the species-
specific cross-social unit correlation between the amount of help and genetic relatedness
was 0.33, a correlation that was statistically significantly larger than zero (P-value <0.01). Moreover, the study also found that kin discrimination, i.e., the species-specific
cross-social unit correlation between the amount of help and relatedness, was higher in
species where the indirect fitness benefits from altruism were larger. Figure H5 depicts
the cross-species relationship found by Griffin and West between kin discrimination and
the benefit from altruistic behavior.
Indirect Fitness Benefit of Helping
Kin
Dis
crim
inat
ion
FIGURE H5. KIN DISCRIMINATION AND THE INDIRECT FITNESS BENEFIT FROM ALTRUISM
Note: This figure depicts the results from the study by Griffin and West (2003), illustrating that the extent of kin
discrimination, i.e., the strength of the species-specific correlation between the amount of help in brood rearing and
genetic relatedness, is higher in species where there is a larger indirect fitness benefit of altruism, measured in terms of
relatives’ offspring production and survival.
Source: Griffin and West (2003).
While the studies discussed thus far provide evidence of a positive correlation between
genetic relatedness and altruism, they do not substantiate the effect of relatedness on
the other type of social behavior stressed by kin selection theory, that of mutually or
collectively beneficial cooperation. This concept is directly associated with solving
64 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
the problem of public goods provision due to the “tragedy of commons.” In particu-
lar, cooperation within groups that exploit a finite resource can be prone to cheating
whereby the selfish interests of individuals result in disadvantages for all members of
the group. While cooperative behavior can be enforced through mechanisms such as
reciprocity or punishment, kin selection provides a natural alternative for the resolution
of such social dilemmas. Specifically, by helping relatives pass on shared genes to
the next generation, cooperation between related individuals can be mutually beneficial.
Experimental evidence on the importance of genetic relatedness for cooperative behavior
comes from the study by Schneider and Bilde (2008) that investigates the role of kinship
in cooperative feeding amongst the young in Stegodyphus lineatus, a species of spider
displaying sociality in juvenile stages.
Sibs Familiar NonsibsUnfamiliar Nonsibs
FIGURE H6. WEIGHT GROWTH IN KIN VERSUS NONKIN GROUPS OF COOPERATIVELY FEEDING SPIDERS
Note: This figure depicts the results from the experimental study by Schneider and Bilde (2008), illustrating the superior
weight gain performance of groups of cooperatively feeding spiders where individuals were genetically related (sibs)
in comparison to groups where individuals were either (i) genetically and socially unrelated (unfamiliar nonsibs) or (ii)
genetically unrelated but socially related (familiar nonsibs).
Source: Schneider and Bilde (2008).
Schneider and Bilde argue that communally feeding spiders are ideal to investigate the
costs and benefits of cooperation because of their mode of feeding. These spiders hunt
cooperatively by building and sharing a common capture web, but they also share large
prey items. Since spiders digest externally by first injecting their digestive enzymes
and then extracting the liquidized prey content, communal feeding involves everyone
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 65
injecting saliva into the same carcass and thus exploiting a common resource that was
jointly created. Such a system is especially prone to cheating because each feeder can
either invest in the digestion process by contributing enzymes or cheat by extracting
the liquidized prey with little prior investment. The outcomes of such conflicts in a
collective can thus be quantified by measuring feeding efficiency and weight gain. In
this case, kin selection theory predicts that groups with higher mean genetic relatedness
should outperform others on these biometrics due to a relatively lower prevalence of such
conflicts.
FIGURE H7. FEEDING EFFICIENCY IN KIN VS. NONKIN GROUPS OF COOPERATIVELY FEEDING SPIDERS
Note: This figure depicts the results from the experimental study by Schneider and Bilde (2008), illustrating the
superior feeding efficiency of groups of cooperatively feeding spiders where individuals were genetically related (sibs)
in comparison to groups where individuals were either (i) genetically and socially unrelated (unfamiliar nonsibs) or (ii)
genetically unrelated but socially related (familiar nonsibs).
Source: Schneider and Bilde (2008).
To test this prediction, Schneider and Bilde conducted an experiment with three treat-
ment groups of juvenile spiders: genetically related (sibs), genetically and socially un-
related (unfamiliar nonsibs), and genetically unrelated but socially related (familiar non-
sibs). Social, as opposed to genetic, relatedness refers to familiarity gained through
learned association as a result of being raised by the same mother (either foster or biolog-
ical) in pre-juvenile stages. The third treatment group therefore allowed the researchers
to control for nongenetic learned associations that could erroneously be interpreted as
kin-selected effects. In their experiment, Schneider and Bilde followed two group-level
outcomes over time. They measured growth as weight gained over a period of eight
66 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
weeks, and they measured feeding efficiency of the groups by quantifying the mass
extracted from prey in repeated two-hour assays of cooperative feeding.
As depicted in Figure H6, consistently with kin selection, sib groups gained signif-
icantly more weight than genetically unrelated groups (both familiar and unfamiliar)
over the experimental period of 8 weeks (F-statistic = 9.31, P-value < 0.01), and while
nonsib unfamiliar spider groups had a higher start weight than the two other groups,
sib groups overtook them by following a significantly steeper growth trajectory. Indeed,
as Figure H7 illustrates, this growth pattern was due to the higher feeding efficiency
of sib groups compared with nonsib groups, the former extracting significantly more
mass from their prey during a fixed feeding duration (F-statistic = 8.91, P-value < 0.01).
Based on these findings, Schneider and Bilde conclude that genetic similarity facilitates
cooperation by reducing cheating behavior and, thereby, alleviates the negative social
impact of excessive competition.
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 67
*
REFERENCES
Acemoglu, Daron, Simon Johnson, and James A. Robinson. 2005. “Institutions as a
Fundamental Cause of Long-Run Growth.” In Handbook of Economic Growth, Vol IA.
, ed. Philippe Aghion and Steven N. Durlauf. Amsterdam, The Netherlands:Elsevier
North-Holland.
Alesina, Alberto, Arnaud Devleeschauwer, William Easterly, Sergio Kurlat, and
Romain Wacziarg. 2003. “Fractionalization.” Journal of Economic Growth, 8(2): 155–
194.
Ashraf, Quamrul, Oded Galor, and Ömer Özak. 2010. “Isolation and Development.”
Journal of the European Economic Association, 8(2-3): 401–412.
Barro, Robert J., and Jong-Wha Lee. 2001. “International Data on Educational
Attainment: Updates and Implications.” Oxford Economic Papers, 53(3): 541–563.
Chandler, Tertius. 1987. Four Thousand Years of Urban Growth: An Historical Census.
Lewiston, NY:The Edwin Mellen Press.
Conley, Timothy G. 1999. “GMM Estimation with Cross Sectional Dependence.”
Journal of Econometrics, 92(1): 1–45.
Darwin, Charles. 1859. On the Origin of Species by Means of Natural Selection.
London, UK:John Murray.
Diamond, Jared. 1997. Guns, Germs and Steel: The Fates of Human Societies. New
York, NY:W. W. Norton & Co.
Frankham, Richard, Kelly Lees, Margaret E. Montgomery, Phillip R. England,
Edwin H. Lowe, and David A. Briscoe. 1999. “Do Population Size Bottlenecks
Reduce Evolutionary Potential?” Animal Conservation, 2(4): 255–260.
Gallup, John L., and Jeffrey D. Sachs. 2001. “The Economic Burden of Malaria.” The
American Journal of Tropical Medicine and Hygiene, 64(1-2): 85–96.
Gallup, John L., Jeffrey D. Sachs, and Andrew D. Mellinger. 1999. “Geography and
Economic Development.” International Regional Science Review, 22(2): 179–232.
Griffin, Ashleigh S., and Stuart A. West. 2003. “Kin Discrimination and the Benefit of
Helping in Cooperatively Breeding Vertebrates.” Science, 302(5645): 634–636.
Hall, Robert E., and Charles I. Jones. 1999. “Why Do Some Countries Produce
So Much More Output Per Worker Than Others?” Quarterly Journal of Economics,
114(1): 83–116.
Hamilton, William D. 1964. “The Genetical Evolution of Social Behaviour I and II.”
Journal of Theoretical Biology, 7(1): 1–52.
Hayes, Theodore R. 1996. “Dismounted Infantry Movement Rate Study.” U.S. Army
Research Institute of Environmental Medicine.
Hölldobler, Bert, and Edward O. Wilson. 1990. The Ants. Cambridge, MA:The
Belknap Press of Harvard University Press.
68 THE AMERICAN ECONOMIC REVIEW FEBRUARY 2013
Houle, David. 1994. “Adaptive Distance and the Genetic Basis of Heterosis.” Evolution,
48(4): 1410–1417.
Jones, Julia C., Mary R. Myerscough, Sonia Graham, and Benjamin P. Oldroyd.
2004. “Honey Bee Nest Thermoregulation: Diversity Promotes Stability.” Science,
305(5682): 402–404.
La Porta, Rafael, Florencio Lopez-de-Silanes, Andrei Shleifer, and Robert W.
Vishny. 1999. “The Quality of Government.” Journal of Law, Economics, and
Organization, 15(1): 222–279.
Mattila, Heather R., and Thomas D. Seeley. 2007. “Genetic Diversity in Honey Bee
Colonies Enhances Productivity and Fitness.” Science, 317(5836): 362–364.
Maynard Smith, John. 1964. “Group Selection and Kin Selection.” Nature,
201(4924): 1145–1147.
McEvedy, Colin, and Richard Jones. 1978. Atlas of World Population History. New
York, NY:Penguin Books Ltd.
Michalopoulos, Stelios. 2011. “The Origins of Ethnolinguistic Diversity.” American
Economic Review, forthcoming.
Modelski, George. 2003. World Cities: -3000 to 2000. Washington, DC:FAROS 2000.
New, Mark, David Lister, Mike Hulme, and Ian Makin. 2002. “A High-Resolution
Data Set of Surface Climate Over Global Land Areas.” Climate Research, 21(1): 1–25.
Nordhaus, William D. 2006. “Geography and Macroeconomics: New Data and New
Findings.” Proceedings of the National Academy of Sciences, 103(10): 3510–3517.
Oldroyd, Benjamin P., and Jennifer H. Fewell. 2007. “Genetic Diversity Promotes
Homeostasis in Insect Colonies.” Trends in Ecology and Evolution, 22(8): 408–413.
Olsson, Ola, and Douglas A. Hibbs, Jr. 2005. “Biogeography and Long-Run Economic
Development.” European Economic Review, 49(4): 909–938.
Özak, Ömer. 2010. “The Voyage of Homo-Economicus: Some Economic Measures of
Distance.” Mimeo, Brown University.
Peregrine, Peter N. 2003. “Atlas of Cultural Evolution.” World Cultures: Journal of
Comparative and Cross-Cultural Research, 14(1): 1–75.
Putterman, Louis. 2008. “Agriculture, Diffusion, and Development: Ripple Effects of
the Neolithic Revolution.” Economica, 75(300): 729–748.
Putterman, Louis, and David N. Weil. 2010. “Post-1500 Population Flows and the
Long Run Determinants of Economic Growth and Inequality.” Quarterly Journal of
Economics, 125(4): 1627–1682.
Ramachandran, Sohini, Omkar Deshpande, Charles C. Roseman, Noah A.
Rosenberg, Marcus W. Feldman, and L. Luca Cavalli-Sforza. 2005. “Support from
the Relationship of Genetic and Geographic Distance in Human Populations for a
Serial Founder Effect Originating in Africa.” Proceedings of the National Academy
of Sciences, 102(44): 15942–15947.
Ramankutty, Navin, Jonathan A. Foley, John Norman, and Kevin McSweeney.
2002. “The Global Distribution of Cultivable Lands: Current Patterns and Sensitivity
VOL. 103 NO. 1 ASHRAF AND GALOR: DIVERSITY AND DEVELOPMENT (APPENDIX) 69
to Possible Climate Change.” Global Ecology and Biogeography, 11(5): 377–392.
Robinson, Gene E., and Robert E. Page, Jr. 1989. “The Genetic Basis of Division of
Labor in an Insect Society.” In The Genetics of Social Evolution. , ed. Michael D. Breed
and Robert E. Page, Jr. Boulder, CO:Westview Press.
Russell, Andrew F., and Ben J. Hatchwell. 2001. “Experimental Evidence for Kin-
Biased Helping in a Cooperatively Breeding Vertebrate.” Proceedings of the Royal
Society: Biological Sciences, 268(1481): 2169–2174.
Sachs, Jeffrey D., and Andrew Warner. 1995. “Economic Reform and the Process of
Global Integration.” Brookings Papers on Economic Activity, 26(1): 1–118.
Schneider, Jutta M., and Trine Bilde. 2008. “Benefits of Cooperation with Genetic
Kin in a Subsocial Spider.” Proceedings of the National Academy of Sciences,
105(31): 10843–10846.
Seeley, Thomas D., and David R. Tarpy. 2007. “Queen Promiscuity Lowers Disease
within Honeybee Colonies.” Proceedings of the Royal Society: Biological Sciences,
274(1606): 67–72.
Sharp, Stuart P., Andrew McGowan, Matthew J. Wood, and Ben J. Hatchwell. 2005.
“Learned Kin Recognition Cues in a Social Bird.” Nature, 434(7037): 1127–1130.
Spolaore, Enrico, and Romain Wacziarg. 2009. “The Diffusion of Development.”
Quarterly Journal of Economics, 124(2): 469–529.
Tarpy, David R. 2003. “Genetic Diversity within Honeybee Colonies Prevents Severe
Infections and Promotes Colony Growth.” Proceedings of the Royal Society: Biological
Sciences, 270(1510): 99–103.
Wilson, Edward O. 1978. On Human Nature. Cambridge, MA:Harvard University
Press.
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data.
. 2 ed., Cambridge, MA:The MIT Press.