CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions...

82
7411 2018 December 2018 Top Lights Bright Cities and their Contribution to Economic Decelopment Richard Bluhm, Melanie Krause

Transcript of CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions...

Page 1: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

7411 2018 December 2018

Top Lights Bright Cities and their Contribution to Economic Decelopment Richard Bluhm, Melanie Krause

Page 2: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Impressum:  

CESifo Working Papers ISSN 2364‐1428 (electronic version) Publisher and distributor: Munich Society for the Promotion of Economic Research ‐ CESifo GmbH The international platform of Ludwigs‐Maximilians University’s Center for Economic Studies and the ifo Institute Poschingerstr. 5, 81679 Munich, Germany Telephone +49 (0)89 2180‐2740, Telefax +49 (0)89 2180‐17845, email [email protected] Editors: Clemens Fuest, Oliver Falck, Jasmin Gröschl www.cesifo‐group.org/wp   An electronic version of the paper may be downloaded  ∙ from the SSRN website:           www.SSRN.com ∙ from the RePEc website:          www.RePEc.org ∙ from the CESifo website:         www.CESifo‐group.org/wp    

 

Page 3: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

CESifo Working Paper No. 7411 Category 6: Fiscal Policy, Macroeconomics and Growth

Top Lights Bright Cities and their Contribution to Economic Development

Abstract The commonly-used satellite images of nighttime lights fail to capture the true brightness of most cities. We show that night lights are a reliable proxy for economic activity at the city level, provided they are first corrected for top-coding. We present a stylized model of urban luminosity and empirical evidence which both suggest that these ‘top lights’ follow a Pareto distribution. We then propose a simple correction procedure which recovers the full distribution of city lights. Applying this approach to cities in Sub-Saharan Africa, we find that primate cities are outgrowing secondary cities but are changing from within.

JEL-Codes: O100, O180, R110, R120.

Keywords: development, urban growth, night lights, top-coding, inequality.

Richard Bluhm Leibniz University Hannover Institute of Macroeconomics

Hannover / Germany [email protected]

Melanie Krause Hamburg University

Department of Economics Hamburg / Germany

[email protected]

December 2018 This paper has been presented at BrownU, UEA, ERSA, ECINEQ, EEA, EPCS, GGDC anniversary conference, IARIW, MonashU, RCEF, RES, SEM, UAE, UBrisbane, UGold Coast, UGroningen, UHamburg, UKiel, ULausanne, U St. Gallen, UUme_a, UVienna, VfS AEL, and VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda Asmus, Francesco Audrino, Benjamin Bechtel, Chris Elvidge, Xavier Gabaix, Oded Galor, Martin Gassebner, Roland Hodler, Robert Inklaar, Sebastian Kripfganz, Rafael Lalive, Christian Lessmann, Stelios Michalopoulos, Maxim Pinkovskiy, Stefan Pichler, Paul Raschky, Prasada Rao, Nicholas Rohde, Dominic Rohner, Paul Schaudt, André Seidel, Adam Storeygaard, Eric Strobl, and David Weil for helpful comments and suggestions. We gratefully acknowledge financial support from the German Science Foundation (DFG). All remaining errors are ours.

Page 4: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

1 Introduction

Cities are hubs of economic activity and productivity. About 4.2 billion people, or 55%

of the world’s population, currently live in urban areas, and developing countries are

urbanizing at a rapid pace (United Nations, 2018). The African continent alone might

add up to a billion people to its urban population by 2050. Many important questions in

development economics and macroeconomics are intimately linked to understanding the

processes driving the concentration of people and economic activity in cities. The lion’s

share of the extant literature is oriented towards cities in advanced economies, for which

ample data are available (Glaeser and Henderson, 2017). Much less is known about the

rising cities of the 21st century, who not only lack comparable data, but are undergoing

such a fast-paced urbanization that existing sources quickly become dated.

Satellite images of Earth are transforming how economists and other social scientists

are tracking human activity and its consequences.1 Nighttime images of light emissions

are now an established proxy for local economic activity (Chen and Nordhaus, 2011,

Henderson et al., 2012, Donaldson and Storeygard, 2016) and have been used in a

variety of innovative applications (e.g. Michalopoulos and Papaioannou, 2013, Hodler and

Raschky, 2014, Alesina et al., 2016, Pinkovskiy and Sala-i Martin, 2016, Lessmann and

Seidel, 2017, Pinkovskiy, 2017, Henderson et al., 2018).2 They have several advantages

over conventional survey-based data. Night lights are measured uniformly around the

globe, allowing us to bypass discussions over adjustments for exchange rates and regional

price levels. Another appealing feature of the most widely-used data is that they are

available as an annual panel from 1992 to 2013 at a resolution of less than one square

kilometer.

Our primary objective in this paper is to establish how these data can be used to

reliably track economic activity within and across cities. A serious drawback of the

standard night lights data is that they are top-coded in larger cities. The Operation

Linescan System (OLS)—a part of the US Defense Meteorological Satellite Program

(DMSP)—was designed to pick up dim light sources, but the satellites have a limited

on-board storage capacity and are based on outdated 1970s technology. They record

light intensities as integerized digital numbers from 0 DN (dark) to 63 DN (bright) and

truncate all observations above this limit to save space. The upper end of this scale is,

however, easily reached by the light intensity emitted by a mid-sized city. As a result,

the recorded signal “flatlines” when the satellites encounter bright city lights: the central

business district and the outskirts of larger cities appear to be equally bright in the

1For recent reviews of the related literature see Donaldson and Storeygard (2016), who illustrate theadvantages of remotely sensed data in general, and Michalopoulos and Papaioannou (2018), who focuson the night lights data in particular.

2Nighttime lights have also revitalized the aid effectiveness literature, enabling it to focus on thelocal impact of aid projects (see Dreher and Lohmann, 2015, Bluhm et al., 2018, Civelli et al., 2018,Isaksson and Kotsadam, 2018).

2

Page 5: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

truncated data.3

The scale of the truncation is sizable. In these so-called ‘stable lights’ data, the urban

centers of large, busy cities, such as New York or London appear to emit as little light

as smaller American or British towns. Our estimates instead suggest that they are more

than an order of magnitude brighter than recorded in the original data. While top-coding

tends to affect developed countries more than their less-developed counterparts, we show

that it is pervasive and distorts the ranking of cities within and across countries. Nearly

all primary cities in Africa and mid-sized cities in Asia hit the top-coding threshold.

Large agglomerations in developing countries, such as Johannesburg or New Delhi, are

affected particularly strongly.

In this paper, we analyze the global distribution of city lights at the pixel level, develop

a new procedure to recover the details of inner city activity, and then study the evolution

of cities in Sub-Saharan Africa. We make three distinct contributions to the literature:

First, we argue that it is natural to characterize the distribution of the world’s

brightest lights, which we dub ‘top lights’, by a Pareto distribution. We provide

theoretical and empirical evidence supporting this claim. In terms of theory, we present

a stylized model of light emissions from cities, combining standard assumptions on the

evolution of city sizes (e.g. Zipf’s law) with regularities in urban scaling. The model

gives rise to a power law in light emissions above a certain threshold. Our empirical tests

based on auxiliary satellite data also strongly favor a heavy-tailed Pareto distribution in

top lights with an inequality parameter comparable to top incomes or wealth in the US

(e.g. see Atkinson et al., 2011).

Our second contribution is methodological. Building on the Pareto property of top

lights, we develop a top-coding correction for the truncated data. The correction uses

a geo-referenced ranking method at the pixel level combined with auxiliary observations

from less frequently available satellites. Based on this method, we present a new annual

panel of nighttime lights over the entire period from 1992 to 2013.4 The correction makes

a substantial difference in virtually all major cities. The top 4% of pixels—the average

share of pixels which we correct globally—account for 32% of the total brightness observed

on Earth. This is nearly double their original share, underlining the contribution of big

cities to global economic activity. As a result, the spatial distribution of economic activity

turns out to be much less equal.

Figure 1 illustrates the problem and the effectiveness of our solution. It shows that

the stable lights data are unable to differentiate among the light intensities originating

3The data are known to suffer from a variety of other problems, such as bottom-coding (Jean et al.,2016), overglow or blooming (Abrahams et al., 2018), and geolocation errors (Tuttle et al., 2013). Thenew Visible Infrared Imaging Radiometer Suite (VIIRS) has considerably improved sensors, including aday-and-night band which records light intensity after midnight since late 2011. While these data willbecome more important in the future, the DMSP-OLS series is the only available series covering thehistorical period from 1992 to 2013. We use the VIIRS data for robustness checks in Online Appendix F.

4Our corrected images can be downloaded from www.lightinequality.com.

3

Page 6: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure 1 – Selected cities in 1999, stable lights and corrected lights

(a) New York City

−74.4 −74.0 −73.6 −73.2

050

010

0015

0020

00Transect

Longitude

Ligh

t int

ensi

ty (

DN

)

CorrectedSaturated

−74.4 −74.0 −73.6

40.2

40.4

40.6

40.8

41.0

41.2

Saturated

Longitude

0

1−9

10−19

20−29

30−39

40−49

50−63

−74.4 −74.0 −73.6

40.2

40.4

40.6

40.8

41.0

41.2

Corrected

Longitude

0

1−9

10−19

20−29

30−39

40−49

50−63

64−249

250+

(b) London

−1.0 −0.5 0.0 0.5

050

010

0015

0020

00

Transect

Longitude

Ligh

t int

ensi

ty (

DN

)

CorrectedSaturated

−1.0 −0.5 0.0 0.5

50.8

51.0

51.2

51.4

51.6

51.8

52.0

Saturated

Longitude

0

1−9

10−19

20−29

30−39

40−49

50−63

−1.0 −0.5 0.0 0.5

50.8

51.0

51.2

51.4

51.6

51.8

52.0

Corrected

Longitude

0

1−9

10−19

20−29

30−39

40−49

50−63

64−249

250+

(c) New Delhi

77.0 77.2 77.4 77.6

050

010

0015

0020

00

Transect

Longitude

Ligh

t int

ensi

ty (

DN

)

CorrectedSaturated

77.0 77.2 77.4 77.6

28.2

28.4

28.6

28.8

Saturated

Longitude

0

1−9

10−19

20−29

30−39

40−49

50−63

77.0 77.2 77.4 77.6

28.2

28.4

28.6

28.8

Corrected

Longitude

0

1−9

10−19

20−29

30−39

40−49

50−63

64−249

250+

(d) Johannesburg

27.6 27.8 28.0 28.2 28.4

050

010

0015

0020

00

Transect

Longitude

Ligh

t int

ensi

ty (

DN

)

CorrectedSaturated

27.6 27.8 28.0 28.2 28.4

−26

.6−

26.4

−26

.2−

26.0

−25

.8

Saturated

Longitude

0

1−9

10−19

20−29

30−39

40−49

50−63

27.6 27.8 28.0 28.2 28.4

−26

.6−

26.4

−26

.2−

26.0

−25

.8

Corrected

Longitude

0

1−9

10−19

20−29

30−39

40−49

50−63

64−249

250+

Notes: Comparison of the light intensities (DN) recorded by the stable lights and our corrected lightsin four major cities. The left panel show the light intensity along a longitudinal transect throughthe brightest pixel in each city. The middle panel shows a map based on the stable lights datafrom satellite F121999. The right panel shows the same map using the corrected data presented inthis paper. Both data have been binned and the color scales were adjusted so as to be comparable.Dashed lines indicate the transect path.

4

Page 7: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

from different inner-city locations in New York, London, New Delhi or Johannesburg.

Across these four major cities, the average light intensity differs little and the sum of

light in each city is dominated by the urban extent. After our correction, we can clearly

identify urban cores which are much brighter than the outskirts. This carries over to

other subnational aggregates. The corrected data also turn out to be a better predictor

of regional economic activity in rich and densely populated countries, such as Germany.

Third, we use this new data to analyze the relative growth rates and internal structure

of cities in Sub-Saharan Africa. The subcontinent is still the world’s poorest region and

urbanizing at a considerably lower level of development than other regions have in the

past (Glaeser, 2014). Economic activity in Sub-Saharan Africa is heavily concentrated in

primate and coastal cities due to a combination of factors, including high transport costs

(Storeygard, 2016), outward-oriented colonial infrastructure (Jedwab and Moradi, 2016,

Bonfatti and Poelhekke, 2017), urban bias (Lipton, 1977, Ades and Glaeser, 1995), the

limited reach of national institutions (Michalopoulos and Papaioannou, 2014), climate

change (Barrios et al., 2006), natural resource booms (Gollin et al., 2016), and the urban

mortality transition (Jedwab and Vollrath, 2018). This structure is neither optimal nor

static. On the one hand, excessive primacy in the city size distribution has been linked to

the prevalence of slums and slow economic growth (Henderson, 2003, Castells-Quintana,

2017). On the other hand, recent studies suggest that Africa’s secondary cities might be

gaining ground vis-a-vis primate cities and playing an important role in poverty reduction

(Henderson et al., 2012, Christiaensen and Todo, 2014, Christiaensen et al., 2016, Fetzer

et al., 2016). Official numbers remain elusive though, so it remains an open question if

secondary cities in Africa are in fact rising.

Our application shows that Africa’s largest cities are maintaining their dominant

position but are changing from within. Primary cities in Sub-Saharan Africa were growing

faster than secondary cities over the period from 1992 to 2013, suggesting they are

continuously absorbing growing populations in informal settlements (in line with Jedwab

and Vollrath, 2018). However, as these cities grow, we observe two distinct developments:

light inequality within the initial city limits narrows and the inner-city distribution of

light becomes more fragmented. We interpret this finding as an indication that public

services are being expanded throughout cities, while increasing fragmentation implies the

formation of sub-centers with a continued lack of connectivity to other neighborhoods.

Disconnectedness and long travel times, in turn, limit productivity and economies of scale

(Lall et al., 2017, Venables, 2017). Both of these findings would have been difficult to

establish without the correction developed in this paper, In fact, the stable lights data

indicate that secondary cities outperform primate cities during the period of study, and

they offer little insight into the distribution of economic activity within cities.

More broadly, this paper takes on the challenge of linking the pixel-level distribution

of economic activity as observed from high resolution satellites to economic theory and

5

Page 8: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

empirical laws in urban economics. We argue that this approach leads to a better

understanding of the features these data should have and at what scale they will be

particularly useful. This perspective shows that the influence of top-coding is small

when lights are aggregated to the country level but then rises steeply as the size of the

unit of observation decreases. Moreover, as we demonstrate below, the impact of top-

coding increases over time as countries and cities develop. By providing a solution to this

problem, we hope to further encourage researchers to use these data in innovative ways.

The remainder of this paper is organized as follows. Section 2 provides some

background on the nighttime lights data and the extent of top-coding around the world. In

Section 3 we present theoretical and empirical evidence in favor of a Pareto distribution in

top lights. Section 4 outlines our correction procedure. Section 5 contains the application

of our top-coding corrected data to African cities. Section 6 concludes. An Online

Appendix contains the accompanying material, such as additional theoretical results,

supplementary details on the data and summary statistics, a benchmarking exercise with

German regional data, and a battery of robustness checks.

2 Top-coding around the world

The DMSP-OLS satellites have been orbiting the earth for several decades with the

primary purpose of detecting clouds. As a byproduct, they measure night lights in the

evening hours between 8:30 and 10:00 pm local time around the globe every day. The

recorded data are pre-processed by the National Geophysical Data Center at the National

Oceanic Administration Agency (NOAA) and averaged over cloud-free days.5 The result

are images of annual ‘stable light’ intensities from 1992 to 2013 for every 30 by 30 arc

seconds pixel of the globe (about 0.86 square kilometers at the equator).6

We cannot use the truncated stable lights data to gauge the extent of top-coding.

Fortunately, for seven years, additional satellites were orbiting Earth with sensor settings

that were less sensitive to light. NOAA generates a series of ‘radiance-calibrated’ lights

by combining the stable lights data from normal flight operations with auxiliary data

obtained from these low amplification sensors (Elvidge et al., 1999, Ziskin et al., 2010,

Hsu et al., 2015). The resulting series is free of top-coding and has no theoretical upper

bound.7 The radiance-calibrated data can be used directly for cross-sectional analyses.

Henderson et al. (2018), for example, use the 2010 version of this data in their study of

5This is done to remove observations of cloudy days and sources of lights which are not man-made,such as auroral lights or forest fires. This process removes a lot of dim light sources. NOAA also makesa series of unfiltered lights available, which we later use for the delineation of urban extents.

6In our analysis, we always exclude areas close to the polar zones (65 degrees south and 75 degreesnorth latitude) known to be influenced by ephemeral lights and remove areas affected by gas flaring.

7Note that in spite of calibration issues, Hsu et al. (2015, p. 1865) point out that, within the sameyear, the “DNs below saturation of the Stable Lights product and DN EQs of merged fixed-gain imagerycan be directly compared to each other.”

6

Page 9: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

the global spatial distribution of economic activity. They are, however, only of limited

use for comparisons over time. Apart from their sparse temporal coverage, the radiance

calibration process introduces a non-trivial amount of noise.8

While the literature typically aggregates the pixel level data to some study area of

interest (e.g. grid cell, city or region), we conduct our analysis at the native resolution

of 30 arc seconds to avoid averaging over top-coded areas. On the global scale, this is

a formidable task.9 To ease the computational burden, we conduct large parts of the

analysis with a spatial random sample of 10% of pixels within all countries that have a

landmass larger than 500 km2 (but later apply the correction to the full data). The sample

contains more than two million pixels per year located in 194 countries or territories.

Table 1 – Summary statistics of global lights in 2010

World USA Brazil Israel South Africa China Netherlands

Panel a) Stable lights (from 0 to 63 DN)

Mean 17.55 18.35 17.97 31.61 15.16 17.72 35.11Standard deviation 15.35 16.90 15.45 21.49 15.29 15.50 16.09Maximum 63.00 63.00 63.00 63.00 63.00 63.00 63.00Spatial Gini 0.4258 0.4486 0.4165 0.3858 0.4604 0.4260 0.2637Pixels 2,154,889 427,922 60,310 2,060 18,369 165,521 6,549

Panel b) Radiance-calibrated lights (from 0 to ∞ DN)

Mean 19.04 23.14 19.76 46.29 15.18 19.97 37.83Standard deviation 44.35 53.42 41.76 82.70 29.99 46.71 44.08Maximum 2109.67 1710.59 646.84 914.14 575.22 1862.04 435.63Spatial Gini 0.6045 0.6613 0.5995 0.6505 0.5941 0.6093 0.4880Pixels 2,154,889 427,922 60,310 2,060 18,369 165,521 6,549

Panel c) Comparison of top-coded pixels at same locations

Share top-coded 0.0576 0.0847 0.0588 0.2447 0.0387 0.0597 0.1709Radiance-cal. mean 142.05 153.77 143.36 135.70 113.27 143.96 107.92

Notes: The table reports summary statistics using the stable lights in Panel (a) and the radiance-calibrated lights in Panel (b). Panel (c) compares both sources at the pixel level. All three panelsare based on a 10% sample of all lit pixels, where each pixel is 30×30 arc seconds. The stable lightsdata are averaged across the whole year, while the radiance-calibrated data come from satellite F16,which recorded these data from January 11 to December 9, 2010.

We still have to define where top-coding begins before we can assess its impact.

Since the scale goes up to 63 DN, it seems natural to assume that this would be the

appropriate threshold. There are, however, compelling technical reasons suggesting that

the threshold should be much lower. Each value we observe in the annual data has

already been averaged several times.10 In fact, our analysis shows that many pixels with

8Online Appendix A provides summary statistics of the stable lights and radiance-calibrated dataover time and discuss technical reasons for the large fluctuations we observe in the latter. The ranksof pixels in the radiance-calibrated data are considerably more stable over time than their absolutevalues—a feature we later exploit in our pixel-level correction.

9Every image contains more than 700 million pixels, about a third of which are on land and a halfof which are lit.

10Abrahams et al. (2018) provide a detailed explanation of how the DMSP-OLS satellites process the

7

Page 10: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

DNs of 62, 61, down to the mid-50s, are subject to implicit top-coding and should be

considerably brighter than they are recorded in the data (see Online Appendix B). NOAA

even suggests that the first—albeit faint—influence of top-coding starts at much lower

values.11 Throughout this paper, we conservatively set the top-coding threshold to 55

DN in order to not overstate its impact. Note that our correction approach will work

with any sufficiently high top-coding threshold.

Table 1 compares the stable and radiance-calibrated lights across the globe in 2010—

the latest year where both data sources are available. The first column already shows that

the difference in scale is striking. The brightest radiance-calibrated pixel in our sample is

more than 30 times brighter than the end of the stable lights scale. This is reflected in a

standard deviation which is three times higher and a spatial Gini coefficient of inequality

in lights which differs by 18 percentage points. In 2010, about six percent of all stable

lights pixels are top-coded, while their unsaturated counterparts are—on average—more

than twice as bright.

Figure 2 – Share of top-coded pixels and country characteristics

(a) GDP per capita

● ●

●●

● ●

●●

● ●●

●●

●●

● ●● ●

●●●●

●●

●●

●●

●●● ●●

● ●●●

● ●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●● ●

● ●

●●

● ●

● ●●●

●●

●●●

0 20000 40000 60000 80000 100000 120000

0.0

0.2

0.4

0.6

0.8

GDP per capita (in 2011 PPPs)

Sha

re o

f pix

els

55 D

N a

nd a

bove

ARE

BEL

BHR

BRN

COG

DJIEGY

HKG

ISR

KWT

LUX

NLD

NOR

PRI

QAT

SGP

TTO

(b) Log population density

● ●●

●●

●●

●●

●●●

●●

●●

●●●●

●●● ●

●●

● ●

● ●

●●

●● ●● ●● ●●

● ●●

● ●

● ●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●●

●●

●●

●●

● ●●●

●●

●●

● ●

● ●●●

●●

●●●

0 2 4 6 8 10

0.0

0.2

0.4

0.6

0.8

Log population density

Sha

re o

f pix

els

55 D

N a

nd a

bove

AGOARE

BEL

BGD

BHR

BRN

COM

DJIEGY

GUM

HKG

ISR

KOR

LBNMUS

NLD

PRI

PSEQAT

RWA

SGP

TTO

Notes: Illustration of the systematic bias introduced by top-coding. The data are a 10%representative sample of all non-zero lights in satellite F182010. GDP and population data arefrom the World Development Indicators.

Top-coding affects all countries but not uniformly. Countries which are i) highly

developed, ii) small and iii) urbanized are more strongly affected by top-coding than

others, but there is substantial heterogeneity. The remaining columns of Table 1

data before transmitting them to Earth, including how this leads to geolocation errors, blurring, andtop-coding. The satellite first truncates individual pixels at a much finer resolution and then aggregatesthose to a coarser resolution. Each night the origin shifts a bit, recreating a finer resolution. Finally,these data are aggregated and averaged again at NOAA when the annual composites are created.

11Since “the OLS does onboard averaging to produce its global coverage data, saturation does nothappen immediately when radiance reaches the maximum level. On the contrary, as the actual radiancegrows, the observed DN value fails to follow the radiance growth linearly, causing a gradual transitioninto a plateau of full saturation” (Hsu et al., 2015, p. 1872).

8

Page 11: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

document this diversity in a selection of countries. Larger middle income countries,

like China, Brazil or South Africa, have a top-coding share comparable to the world

average. Mature economies of different sizes and population densities, such as the US,

Israel or the Netherlands, have greater top-coding shares from 8% up to 25%. In Israel

and the Netherlands, a high average light intensity coupled with a high incidence of top-

coding in the stable lights data generate such an artificially low spatial Gini coefficient

that it rises by more than 20 percentage points in the radiance-calibrated data. Figure 2

illustrates these patterns across all countries in the sample and highlights the exceptions.

For example, the overwhelming majority of pixels in high income, high density city states,

such as Singapore, Hong Kong, or Bahrain, are top-coded. Top-coding is also particularly

pronounced in low income countries with a low average population density but large

primate cities, such as Egypt. The incidence of top-coding is thus a complex function of

the spatial equilibrium, that is, the size, number and location of cities in each country.

3 A Pareto distribution in top lights

Our correction approach rests on the claim that a Pareto distribution is a reasonable

description of top lights. The Pareto distribution is an often-found empirical feature of

data used in physics, biology and many other sciences (see Newman, 2005), including

the distribution of top income or wealth (e.g. Piketty, 2003, Atkinson et al., 2011) and

the size distribution of cities as delineated via night lights (Small et al., 2011). Yet, it

has not been established for the upper tail of the distribution of night light intensities

so far. We approach this issue in two ways. We first present a tractable model of light

intensities within and across cities, showing that the density of top lights at the pixel level

is Pareto, or can at least be closely approximated by a Pareto density. We then return to

the radiance-calibrated data and subject it to a battery of empirical tests to investigate

whether top lights can be described by a power law and settle on a parameter estimate.

Note that throughout this paper, we focus on light intensity, which is analogous to the

product of population and GDP per capita.

3.1 A stylized model of inner city lights

Our starting point is the size distribution of cities. It is well known that combining

Gibrat’s (1931) law of homogeneous growth of cities with a lower bound of city sizes

leads to a Pareto or Zipf distribution of large cities (Gabaix, 1999, Eeckhout, 2004).

Zipf’s law predicts that city ranks are inversely proportional to their size; for instance,

the biggest city in the U.S. (New York) has twice the population of the second-ranked

city (Los Angeles) and three times the population of the third-ranked city (Chicago).

Numerous studies have provided empirical evidence for this regularity based on U.S.

9

Page 12: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

cities or metropolitan areas (Gabaix and Ioannides, 2004, Rozenfeld et al., 2011, Ioannides

and Skouras, 2013). Despite heterogeneity in the rank-size parameter in other countries

(Rosen and Resnick, 1980, Soo, 2005), it is generally considered a good approximation of

the size distribution of large cities (Luckstead and Devadoss, 2014) and evidence in its

favor becomes stronger when cities are defined as “natural” agglomerations as opposed

to administrative boundaries (e.g. see Small et al., 2011, Jiang and Jia, 2011).

Assumption 1. The size distribution of big cities in terms of their population x is Zipf,

i.e. a Pareto distribution with shape parameter η = 1 above some threshold xc.

The CDF of the number of cities with population x is

F (x) = 1−(xcx

)η=

∫ x

xc

ηxηcxη+1

dx =

∫ x

xc

xcx2

dx. (1)

We are interested in the size of cities in terms of their area, not their population. The

urban allometry literature (Stewart, 1947, Jones, 1975), which deals with the scaling of

human-made structures within cities, links the two quantities in the following way:

Assumption 2. The population x and the area s of a city are proportional, i.e. x ∼ sφ.

It is typically assumed that 1 < φ < 2, so that larger cities not only spread out on the

plane by converting the surrounding agricultural land but also grow in terms of height

(Batty and Longley, 1994). Bettencourt (2013), for example, motivates scaling laws based

on a network theory of human interactions using the parameter value φ = 1.5.

To derive the distribution of individual pixels y, we require an assumption on the

shape of cities. The standard assumption in the workhorse model of urban economics is

monocentricity (Mills, 1967, Amson, 1972, Desmet and Rossi-Hansberg, 2013).

Assumption 3. Cities are monocentric and of circular shape. They consist of rings of

unit width from the center to the outskirts.

Depending on its area s and therefore its population x, each city has r = π−1/2x1/(2φ)

rings.12 Integration by substitution yields the CDF of the number of rings per city13

F (r) = 2φxcπ−φ∫ r

r

r−2φ−1dr =

0 for r < r

1− xcπ−φr−2φ for r >= r(2)

with r = π−1/2x1/(2φ)c . Its density

f(r) = xcπ−φr1−2φ for r >= r (3)

12For simplicity of exposition, we suppress the proportionality constant in x ∼ sφ and assume x = sφ.13Online Appendix C derives this result.

10

Page 13: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

follows a power law with a shape parameter 2φ− 1 > η = 1 for φ > 1.

The distribution of rings has fewer extreme values than the distribution of city sizes

in terms of their population.

Each city with r rings consists of πr2 rectangular pixels of unit size. The pixels are

located at distance d from their respective city center. There are two opposing effects

governing how the number of pixels depends on this distance: i) within a given city, the

number of pixels increases linearly with d because rings farther from the center contain

more pixels, and ii) the larger the distance d from the city center, the fewer cities of that

size are left, namely only the cities with r ≥ d rings.

Dividing the absolute amount of pixels at each distance d by the total number of

pixels yields their density

f(d) =

2π φ−1φx−1/φc d for d < d

2π1−φ φ−1φx1−1/φc d1−2φ for d ≥ d

(4)

with d = π−1/2x1/(2φ)c .14

To derive the density of luminosity f(l), we require a last assumption on how light

intensity within a city varies with the distance to the center d. We resort to standard

models of population density, since differences in light intensity within countries are

mostly driven by population (Henderson et al., 2018).

A popular choice in the literature is the negative exponential function, that is, p(d) =

P0 exp(−γd) where p(d) is the population density at distance d from the city center,

P0 ≥ p is the density at the center, and γ > 0 is a decay parameter so that the city

periphery is more sparsely populated or, in our case, lit. The negative exponential can

be motivated on the basis of the standard Alonso-Muth-Mills model (Brueckner, 1982).

Empirical studies typically find γ ≈ 0.15 (Bertaud and Malpezzi, 2014). An alternative to

the exponential distribution is the inverse power function, which was originally proposed

for gravity models of traffic flow (Smeed, 1961, Coleman, 1964, Batty and Longley, 1994).

It is defined as p(d) = P0d−a with d > 0 and shape parameter a > 0.

Both functions are qualitatively similar for intermediate distances. The inverse power

function differs from the exponential function in the city center, where it has higher and

more sharply declining values, as well as in the outskirts, where it predicts a moderately

higher density. A greater concentration at the center makes the inverse power function

particularly suitable for business floor-space models (as recommended by Zielinski, 1980).

At the same time, its tail is known to fit the urban fringe well (Parr, 1985). Since we

are interested in lights rather than population density, we would like to capture both the

very bright central business districts in most cities and the more dimly lit but significant

suburban sprawl in the outskirts, which typically features prominently in the footprint

14Online Appendix C derives this result.

11

Page 14: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

of city lights (e.g. see Small et al., 2011).

Modeling inner city light densities with an inverse power function better reflects the

light gradient and turns out to be analytically appealing. Hence, it serves as our baseline

case. Note that using the negative exponential function also leads to an expression for

the distribution of top lights which (under plausible conditions) can be approximated by

a Pareto density.15

Assumption 4. Within cities, the light density l(d) follows an inverse power function

l(d) = L0d−a for d > 0 with L0 > l as the maximum luminosity at the center and a > 0

as the decay parameter.

Applying the variable transformation from the inverse power function to the pixel

density in eq. (4) yields

f(l) =

2π1−φ φ−1

φx1−1/φc

(L0/l

)(1−2φ)/afor 0 < l ≤ l

2πφ− 1

φx−1/φc︸ ︷︷ ︸

c

(L0/l

)1/afor l < l ≤ L0,

(5)

where l = L0πa/2x

−a/(2φ)c .

Restricting our attention to the upper part of the light distribution from l onwards,

we can establish our key result:

Result 1. Based on assumptions 1–4, top lights above threshold l follow the Pareto

distribution f(l) = c(L0/l

)1/awith shape parameter 1/a.

Using a grid-search, we find that for a = 0.7 the inverse power function comes closest

to the negative exponential function of the consensus parameter γ = 0.15.16 This, in

turn, implies a Pareto shape parameter around one and a half.

In sum, a limited set of four assumptions allows us to analytically derive a Pareto

distribution in top lights, with shape parameter 1/a, maximum luminosity L0, and a

multiplicative constant. We believe that this is an important insight, even though the

model is very stylized to remain tractable. Note that our correction procedure later on

only relies on the Pareto property and does not require any of the simplifying assumptions,

such as monocentricity or a particular population density gradient. In fact, part of our

application below investigates how sub-centers have been forming in African cities.

There are alternative ways to characterizing the distribution of top lights. We could,

for instance, consider the truncation purely as a statistical issue of tail probabilities and

use extreme value theory to derive their distribution. Reassuringly, this approach also

points towards a Paretian distribution of top lights (see Online Appendix D).

15Online Appendix C derives this result.16The curve-fitting was conducted on the domain for d from 1 to 50, minimizing the squared error

between the negative exponential function with γ = 0.15 and the inverse power function with a ∈ [0.05, 2].

12

Page 15: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

3.2 Empirical tests

Empirical tests for a Pareto tail were popularized in economics by literature on top

incomes (e.g. Piketty, 2003, Atkinson et al., 2011) as well as city size (e.g. Rosen and

Resnick, 1980, Gabaix, 1999, Gabaix and Ioannides, 2004, Rozenfeld et al., 2011). These

tests usually exploit particular properties of the Pareto distribution.

Recall that data y which is Pareto distributed above a certain threshold yc has the

probability density function, f(y) = αyαc y−α−1, where α is the relevant shape parameter

and only takes on positive values. The survival function, 1 − F (y) = (yc/y)α, gives the

probability that the random variable Y is larger than the given value y. This maps directly

to our model from the previous section, only that we now denote the light intensity at

the pixel level by y and the shape parameter by α = 1/a to simplify the exposition.

Visual inspection: Following Cirillo (2013), we first visually check whether our data

are Pareto distributed, before estimating shape parameters that are only meaningful

if this condition is fulfilled. We use the seven radiance-calibrated satellites to analyze

the shape of the tail missing in the stable lights data. Figure 3 shows a discriminant

moment ratio plot with the coordinate pair of the coefficient of variation (i.e., standard

deviation divided by the mean) on the x-axis and skewness on the y-axis (Cirillo, 2013).

Each parametric distribution has its particular curve of feasible coordinates, so that the

relevant part of the plane can be divided into a Pareto area, a lognormal area, and a gray

area possibly belonging to both. This type of plot provides a more reliable indication

of the Paretian nature of the data than more traditional graphical devices (such as Zipf

plots shown in Online Appendix E).

The visual evidence in favor of a Pareto distribution is strongest for higher thresholds.

For the top 4% of pixels in panel (a), all but one satellite are located in the area of

indeterminacy.17 This ambiguity mirrors similar findings in the city-size literature, where

both the Pareto and the log-normal with a large standard deviation generate tails which

are virtually indistinguishable (Eeckhout, 2004, 2009).18 For smaller percentages, such

as the top 1% in panel (b), the evidence in favor of a Pareto distribution becomes much

stronger. All but one satellite are in the Pareto area, far from the lognormal area and

17The only exception is satellite F16 in 1996 which is based on considerably fewer cloud-free overpassesthan later years. It is the earliest and dimmest of the radiance-calibrated products. Its highest valuesare far below those in subsequent years and the data contain many ties. Online Appendix A containsthe relevant summary statistics for the radiance-calibrated data.

18It is well-established in the literature on city sizes and top incomes that the Pareto distribution isoften only a good representation for the very top of data, with a moderately decreasing fit as the thresholddecreases. Our data suggest that a Pareto distribution of top lights remains a good approximation downto the top 10% of the data. This is more than sufficient for our purposes, since we only consider 3% to 5%of all pixels to be top-coded in any given year. Indeterminacy vis-a-vis the lognormal occurs is commonin this literature as well, since “the tail of a lognormal is indistinguishable from the Pareto under certaincircumstances, [so that] the researcher who is interested in the tail properties of a size distribution canchoose which one to use” (Eeckhout, 2009).

13

Page 16: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure 3 – Discriminant moment ratio plots

(a) Top 4%

0.0 0.2 0.4 0.6 0.8 1.0 1.2

01

23

45

CV

Ske

wne

ss

Pareto Area

Gray Area

Lognormal Area

Exponential/ Thin tailed

Normal or Symmetric

●●

● ● ● ● ● ● ●1996 1999 2000 2002 2004 2005 2010

(b) Top 1%

0.0 0.2 0.4 0.6 0.8 1.0 1.20

12

34

5

CV

Ske

wne

ss

Pareto Area

Gray Area

Lognormal Area

Exponential/ Thin tailed

Normal or Symmetric

●●

●●

●●

● ● ● ● ● ● ●1996 1999 2000 2002 2004 2005 2010

Notes: The panels show discriminant moment ratio plots (Cirillo, 2013). The input data are a 10%representative sample of all non-zero lights in the radiance-calibrated data at the pixel level, whereeach pixel is 30× 30 arc seconds.

those of other thin-tailed distributions.

We take a closer look at the issue of lognormal versus Pareto in Online Appendix E.

We conduct several goodness-of-fit tests, where we compare the empirical distribution of

the radiance-calibrated data to the best-fitting theoretical distribution. The findings are

unambiguous. The Pareto distribution mimics the distribution of top lights much better

than the lognormal (with R2s close to unity). Hence, we conclude that our data are best

characterized by a heavy-tailed Pareto distribution.

Log-rank regressions: Log-rank regressions are a popular approach to estimating

the Pareto shape parameter and firmly rooted in urban economics. They are based

on the following approximation. For Pareto-distributed observations yi, i = 1, ...N ,

with the survival function given above, we have rank(yi) ≈ Nyαc y−αi , or, in logarithms

log rank(yi) − logN ≈ α log yc − α log yi. Gabaix and Ibragimov (2011) show that

directly estimating this relationship systematically underestimates the true coefficient

and standard error. However, once the ranks are shifted by minus one-half and the

standard errors are adjusted, rank regressions consistently estimate the parameters of

interest and turn out to be relatively robust to deviations from power laws.

Table 2 reports the corresponding results. We separately estimate the Pareto shape

parameter for each of the radiance-calibrated satellites and for the top 5% to 1% of

the data. The point estimates are relatively stable for the top 3% to 5% of lights—

14

Page 17: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table 2 – OLS rank regressions

Year 1996 1999 2000 2003 2004 2006 2010 Average

Panel a) Top 5%

Pareto α 1.3122 1.2993 1.2471 1.4482 1.4484 1.4790 1.4453 1.3828(0.0060) (0.0054) (0.0054) (0.0065) (0.0063) (0.0066) (0.0062) [0.0932]

Observations 96,685 116,858 106,914 100,095 106,899 99,487 107,745 –

Panel b) Top 4%

Pareto α 1.3514 1.4000 1.3594 1.5890 1.5837 1.6250 1.5963 1.5007(0.0069) (0.0065) (0.0066) (0.0079) (0.0077) (0.0081) (0.0077) [0.1236]

Observations 77,348 93,484 85,482 80,075 85,489 79,590 86,196 –

Panel c) Top 3%

Pareto α 1.4270 1.5665 1.5444 1.7667 1.7876 1.8423 1.8330 1.6811(0.0084) (0.0084) (0.0086) (0.0102) (0.0100) (0.0107) (0.0102) [0.1654]

Observations 58,011 70,115 64,111 60,058 64,134 59,692 64,647 –

Panel d) Top 2%

Pareto α 1.6714 1.8614 1.9314 2.0737 2.0976 2.1751 2.2160 2.0038(0.0120) (0.0122) (0.0132) (0.0147) (0.0143) (0.0154) (0.0151) [0.1932]

Observations 38,673 46,742 42,740 40,039 42,756 39,794 43,097 –

Panel e) Top 1%

Pareto α 2.2300 2.4350 2.4470 2.5914 2.5046 2.7075 2.8641 2.5399(0.0227) (0.0225) (0.0237) (0.0259) (0.0242) (0.0271) (0.0276) [0.2052]

Observations 19,337 23,373 21,372 20,020 21,377 19,898 21,551 –

Notes: The table reports the results of OLS rank regressions with log (rank(yi)− 1/2)− logN as thedependent variable. Asymptotic standard errors computed as (2/N)1/2α are reported in parentheses(see Gabaix and Ibragimov, 2011). The data are a 10% representative sample of all non-zero lightsin the radiance-calibrated data at the pixel level, where each pixel is 30× 30 arc seconds. The lastcolumn reports the point average of the seven satellites and its standard deviation in brackets.

approximately the range of shares we will later replace each year—and then rise as smaller

percentages are considered.19 The last column reports the simple average of the estimated

coefficients. Somewhat remarkably, the average Pareto shape parameter is about one and

a half for the top 4% of lights—not far from our model-based guesstimate. Moreover,

averaging all 21 coefficients in panels (a) to (c) also yields a central estimate of about

one and a half. Note that there is also little variation over time at the lower thresholds,

apart perhaps from a small discontinuity in the early 2000s. Our range of parameter

estimates implies inequality comparable to the top tail of the U.S. income or wealth

distribution (Piketty, 2003, Atkinson et al., 2011). Top lights are considerably more

equally distributed than city sizes, just as predicted by the model in the previous section.

We provide an array of additional robustness checks in Online Appendix E and F. Our

main findings are robust to i) using the Hill estimator instead of OLS rank regressions, ii)

19In theory, with Pareto-distributed data, conducting the estimation with the portion of thedistribution above a higher threshold yhc > yc should lead to the same estimated α. In practice, this willoften not hold exactly and the results for tail regressions are known to depend on the precise thresholdused (see for instance Rosen and Resnick, 1980).

15

Page 18: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

estimating unrestricted rank regressions, which are another way of testing for a Pareto

distribution, and iii) using top-coding free data from the Visible Infrared Imagining

Radiometer Suite (VIIRS) satellites, which are currently only available for 2015. The

VIIRS satellites are technologically superior and recorded at a much finer resolution.

Reassuringly, they are either as indicative of a Pareto tail in top lights as the radiance-

calibrated data, or provide even stronger evidence in its favor, such as linear Zipf plots.20

4 Correcting for top-coding

We propose a simple correction procedure inspired by the methods used in the top incomes

literature (Piketty, 2003, Atkinson et al., 2011). All lights below the top-coding threshold

are unaltered, while those above are replaced by a Pareto-distributed counterpart. An

appealing feature of this approach is that it keeps the overwhelming majority of the data

intact and replaces only a small but highly influential fraction of pixels.

Our theoretical arguments and empirical tests strongly suggest a Pareto parameter

around one and a half. Recall that a plausible parameterization of our model implies

α = 1/0.7 ≈ 1.43 and our empirical estimates are centered on one and a half. We use this

fixed value as a rule-of-thumb parameter. Of course, our procedure is not predicated on

a particular value, nor does it require the parameter to be constant over time. However,

we do not detect an unambiguous trend indicating that the distribution of top lights

has become more equal over time. As will become clear shortly, assuming a constant

parameter value still allows for significant variation over time as particular pixels i) cross

the top-coding threshold and ii) achieve a higher rank vis-a-vis other pixels. Cities

can thus become brighter in absolute terms and grow relative to other cities after the

correction.21

Our preferred correction approach is a pixel-level replacement in which we directly

substitute the top-coded lights by their corresponding Pareto quantile. Correcting the raw

data has the advantage that they can then be flexibly aggregated to the unit of interest.

Another option is to analytically correct the relevant summary statistics, such as mean

lights and spatial Gini coefficients. We provide closed-form solutions that are particularly

useful for back-of-the envelope calculations in Online Appendix G. Exclusively correcting

the summary statistics, however, limits the potential applications and is not sufficient

when the actual locations of the top-coded pixels matter.

20The estimated shape parameters are a bit higher for top shares around 3% to 5% but then alsoappear to be more stable in the upper tail. Since the VIIRS data are five years after the most recentradiance-calibrated image and have a different overpass time, it is difficult to identify the source of theseslight discrepancies.

21In additional analyses, we assigned the seven satellite-specific estimates to the adjacent stable lightsatellite-years. This approach gives very similar results overall, albeit with more jumps. Country-specificestimates also lead to similar results, although comparatively darker and poorer countries experiencelarger corrections. Additional results are available on request.

16

Page 19: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

A remaining challenge is to geo-reference the Pareto quantiles so that the brightest

pixels actually end up in the centers of dense urban agglomerations. It turns out that

there is a straightforward solution. Since we know the exact location of all pixels, we

can rank them according to their radiance-calibrated values from the nearest year and

distribute the highest values from the Pareto distribution in the same manner. Working

with ranks avoids importing the artificial variability of the radiance-calibrated satellites

but preserves a crucial part of the data structure.22 Our algorithm for replacing top-coded

pixels with their quantile counterpart from Pareto distribution works as follows:

(1) For each of the 34 satellite-years t of stable lights data, calculate the number Nt of

pixels ≥ 55 DN to be be replaced.

(2) Produce a ranking of these Nt pixels based on the radiance-calibrated data

associated with the same satellite-year or the data from the closest year.23

(3) Generate Nt theoretical values from a truncated Pareto distribution with the rule-

of-thumb α = 1.5, the top-coding threshold yc = 55, and upper bound H = 2000.

(4) Replace the Nt stable lights pixels ≥ 55 so that the stable lights pixel with the i-th

highest rank from (2) is replaced by the i-th highest theoretical value.

We apply this procedure to every stable lights satellite image over the entire period

from 1992 to 2013. This is the data which we use in the subsequent application and

which underlies Figure 1. Note that the support of the Pareto distribution is unbounded,

so that its highest quantiles yield a handful of values far exceeding the natural limit of

man-made light intensity. To generate a realistic depiction of city lights we impose an

upper bound of 2000 DN, which approximately corresponds to the average maximum

observed in the world’s brightest cities.24

The correction has a significant impact on how we understand the global distribution

of economic activity. On average, 3.7% pixels above the top-coding threshold account

for 17.7% of the total sum of lights observed in the stable lights data. This share almost

doubles to 31.94% after the correction—a statement which is approximately true in every

22The ranks of the pixels are much more stable over time than the values of the radiance-calibrateddata. The rank correlation of maximum city lights typically ranges from 0.90–0.95 for adjacent radiance-calibrated satellites (see Online Appendix A).

23Whenever the radiance-calibrated data creates ties, we first try to break those ties using the ranks ofthe stable lights data and then use the ranks of neighboring radiance-calibrated satellites. This producesa near unique ranking each time. We also experimented with other rankings. The results are very similarand available on request.

24Online Appendix A reports and discusses the observed inner-city maxima obtained by the radiance-calibrated satellites which motivate this upper bound in man-made luminosity. The truncated Pareto

has the CDF F (y) =(

1 −(ycy

)α)/(

1 −(ycH

)α)for yc ≤ y ≤ H, where an upper bound of H = 2000

does not affect the overwhelming majority of stable lights pixels ≥ 55 to be replaced, but does ensurerealistic values at top, say, 0.01% of the data.

17

Page 20: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

individual satellite-year, although later shares are bigger in absolute terms before and

after the correction. In other words, about four percent of all lit pixels account for about

a third of all visible economic activity. The worldwide spatial Gini coefficient rises by

nine percentage points, on average, after the correction. Note that the annual variation

in the global Gini coefficient is only a few percentage points, so that it is swamped by

the size of the top-coding correction. These findings are not particularly sensitive to

the choice of the Pareto shape parameter, although the size of the top-coding correction

varies somewhat.25

We conduct two benchmarking exercises to assess the properties of the corrected data

and better understand at which scale of aggregation top-coding influences the conclusions

we are likely to draw when using night lights. We only briefly summarize the results

here and relegate a full discussion to Online Appendix I. The first exercise estimates

national light-output elasticities as in Henderson et al. (2012). Even at this high level

of aggregation, the top-coding corrected series performs marginally better. For the most

part, however, the estimates using the corrected data are not just economically but also

numerically very close to the original results. Our second benchmark goes from the

national to the regional level and uses high-quality German regional accounts. Here we

obtain two notable results. First, the light-output elasticity rises considerably after the

top-coding correction, particularly in urban areas. Only the corrected data allow us to

recover estimates of the light-output elasticity comparable to the national-level in a cross-

section and panel of German regions. Second, light intensities in the corrected series are

approximately linear in population density and diverge from the stable lights data after

about one thousand people per square kilometer, that is, in denser urban areas. Only the

corrected lights provide a realistic ranking of larger German cities. Both findings suggest

that the corrected data better capture the light-output gradient in developed economies

with mature urban structures.

5 Application: Cities in Sub-Saharan Africa

Armed with this new data, we now return to the question of whether primate cities are

outgrowing secondary cities in Sub-Saharan Africa and how city structures are adjusting

to continuously growing populations. In most African countries, economic activity is

concentrated in the biggest city: Dar Es Salaam, Kinshasa and Lagos already are mega

cities with more than 10 million inhabitants, or will attain that status in a few years

(United Nations, 2018). Excessive spatial concentration is generally a feature of countries

25Online Appendix H provides a variety of summary statistics and sensitivity checks using differentPareto shape parameters. The size of the correction in both the country averages of light intensities andthe country-wide spatial Gini coefficients varies systematically with GDP per capita, country size andpopulation density, in line with the occurrence of top-coding illustrated in Figure 2.

18

Page 21: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

with poor infrastructure and a low level of development (Krugman, 1991, Puga, 1998,

Jedwab and Vollrath, 2018), although Africa’s urbanization differs from the experience

of industrialized economies for a variety of reasons.

The new millennium marked a turning point for most African economies. Sustained

consumption growth and pro-poor distributional change brought about falling poverty

rates (Bluhm et al., 2018). However, as countries are developing, it is theoretically unclear

whether secondary cities will catch up (Duranton, 2008). There is some empirical evidence

suggesting that this is the case. Henderson et al. (2012), for example, estimate that the

African hinterland was growing about 2.3% faster than primate cities over the period from

1992 to 2008.26 Moreover, many in the World Bank view secondary city development as

key to sustained poverty reduction (Christiaensen and Todo, 2014, Christiaensen et al.,

2016). Yet, manufacturing is heavily concentrated in primate and coastal cities with

greater access to world markets. The oil price boom of the 2000s, for example, hurt

remote secondary cities more than primate cities (Storeygard, 2016). Secondary cities

in Africa have been characterized as “consumption cities”, catering to the agricultural

hinterland rather than the modern sector (Gollin et al., 2016).

Whether or not primate cities are driving productivity depends—at least in part—

on their internal structure. Informal settlements in large cities can more easily absorb

growing populations (Jedwab and Vollrath, 2018), but if these neighborhoods remain

badly connected to the center, then cities will be crowded, fragmented and less productive

(Lall et al., 2017). An adverse urban form implies that companies and inhabitant are

faced with high transport costs and long commuting times, which limits interactions

and positive spillovers across the city (Rosenthal and Strange, 2004). Ultimately, this

may prevent the industries located in primate cities from reaping increasing returns

and diversifying into tradables, creating more urbanization without industrialization

(Venables, 2017, Gollin et al., 2016).

Our application tackles both of these aspects. The first part uses the top-coding

corrected data to show that primate cities have outgrown secondary cities and towns over

the period from 1992 to 2013. This holds both at the intensive and extensive margin. The

second part provides a novel perspective on inner city activity. We show that inequality

in light decreases over time as poorer neighborhoods develop. At the same time, we

find evidence for increasing fragmentation over time, suggesting that newly forming sub-

centers are not well-connected with the city as a whole.

26Note that Henderson et al. (2012) also suggest that a “detailed study would be required to explainthe result” (p. 1024). In a related contribution, Fetzer et al. (2016) find that democratization in Africaled to higher growth of secondary cities at the expense of the primary city. In India, Gibson et al. (2017)find that secondary cities matter more for rural poverty reduction when studying the spillover of urbangrowth at the intensive and extensive margin.

19

Page 22: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

City boundaries: Urban areas are often delineated using night light by defining

them as contiguously lit clusters above some luminosity threshold (Small et al., 2011,

Storeygard, 2016, Gibson et al., 2017). Satellite-derived footprints are particularly useful

in Africa, where administrative boundaries quickly become outdated. The night lights

approach is appealing, since it offers a time-varying measure of urban expansion, but also

suffers from several well-known problems. No single threshold works well for all cities:

thresholding overestimates the urban extent of larger cities and penalizes other cities at

the same time (Small et al., 2011, Abrahams et al., 2018). We address these issues using

a method developed by Abrahams et al. (2018) to resolve exactly this issue. Their de-

blurring algorithm reduces the non-linear “overglow” in the lights data and considerably

improves the accuracy of the identified urban extents.27

Figure 4 – Primate and secondary cities in Sub-Saharan Africa

Notes: Illustration of the location of our panel of 40 primate and 493 secondary cities in SSA usingthe urban footprint detection algorithm outlined in the text. Note that we dropped EquatorialGuinea due to gas flaring on the capital island of Malabo and consider South Sudan as part ofSudan for the entire sample. Small towns and villages with a land area smaller than 10 km2 andcities whose footprints were not detected in both periods are not considered.

To capture changes at the extensive margin and further minimize measurement errors,

27The de-blurring approach is based on two insights into the data generating process: i) the originallight sources are blurred by a symmetric Gaussian point-spread function, and ii), pixels in which lightsources are located must be local maxima in the so called the percent frequency of light detection image(for more detail see Abrahams et al., 2018).

20

Page 23: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

we define cities as contiguously illuminated pixels in the de-blurred lights, provided that

light is detected in at least two satellites over a period of three years.28 The period from

1992 to 1994 marks the initial boundaries and the period from 2011 to 2013 represents

the latest available boundaries. We then identify city locations by overlaying these urban

areas with all settlement points from the Global Rural-Urban Mapping Project (GRUMP)

within three kilometers of the urban perimeter.29 An urban area receives the name and

attributes of the most populous settlement located within the expanded urban perimeter

(or the settlement point closest to the polygon centroid when population estimates are

unavailable). Areas which do not coincide with known settlements or are not observed

in both periods are dropped. Finally, we identify the primate city in each country as the

city with largest population and define all other cities and towns as secondary.30

Figure 4 shows a map of our universe of 533 cities and their population in 2000. 40

cities, one in each country, are defined as primate, while 493 are secondary cities. The

largest agglomeration in the sample is Johannesburg, followed by Lagos and Cape Town.

City growth: Cities grow at the intensive margin, as existing city quarters develop and

become brighter, and at the extensive margin, as they expand and absorb surrounding

areas. Total city growth is the combination of intensive growth in the center, extensive

growth in terms of area, and changes in light intensity or population density in the newly

added fringe of the city. We track our universe of cities in Sub-Saharan Africa over the

period from 1992 to 2013, calculating total growth and its components in each year.31

Primate cities differ from secondary cities on a number of ways: i) they are larger than

secondary cities at the start of the sample (454.14 km2 versus 56.60 km2), ii) their urban

perimeter expands considerably faster (at a rate of 2.15% versus 0.46% per annum),

iii) they are initially much brighter than secondary cities, both in sum and in terms

of average light intensity, and iv) their overall light intensity grows considerably faster

over the entire period from 1992 to 2013. Table 3 shows that these stylized facts hold

regardless of whether we use the stable lights or the corrected data.

Substantial differences between the two data sources appear once we focus on city

growth at the intensive margin. Top-coding affects primary cities much more than

28Two images of the same place in the same year may indicate different urban footprints due to a lackof on-board calibration. Considering a window of three years and requiring multiple detection pointseffectively cancels out most of this artificial variation.

29We manually extend this data to include all coordinates of cities which at some point over the periodfrom 1992 to 2013 were designated the administrative capital of a province or state. We first identifythe administrative capitals of subnational regions using www.statoids.com and then geocode each cityusing multiple online gazetteers.

30We assume that secondary cities have an urban extent of at least 10 km2. Including villages andsmall towns does not alter our qualitative results but induces more noise. Online Appendix J shows boththe initial and latest boundaries together with daytime images of Lagos, Luanda, and Johannesburgtaken at the end of both periods.

31Following the literature, we average the summary statistics whenever more than one satellite areavailable in a particular year.

21

Page 24: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table 3 – Annualized growth rates of African cities, 1992–2013

Primate Cities Secondary Cities

1992 2013 1992 2013

Panel a) Area in km2

Sum within initial bounds 454.14 454.14 56.60 56.60Sum within latest bounds 603.78 603.78 65.53 65.53Extensive growth 0.0215 0.0046

Panel b) Stable lights in DN

Sum within initial bounds 20010.17 26670.58 1876.09 2601.98Sum within latest bounds 20482.85 34020.97 1833.12 3115.56Intensive growth (center) 0.0215 0.0242Intensive growth (fringe) 0.0325 0.0327Total growth 0.0369 0.0288

Panel c) Corrected lights in DN

Sum within initial bounds 22524.88 36433.83 1987.19 2947.30Sum within latest bounds 23000.64 44098.00 1944.74 3468.52Intensive growth (center) 0.0274 0.0247Intensive growth (fringe) 0.0377 0.0333Total growth 0.0404 0.0292

Notes: The table reports a selection of summary statistics for African cities. The data are computedfor each city and then averaged across 40 primate and 493 secondary cities.

secondary cities and the size of the correction becomes larger over time. Panels (b)

and (c) of Table 3 suggest that the corrected sum of light using the initial boundaries in

1992 is only about 12.6% larger in the corrected than in the stable lights data. By 2013,

the correction rises to 36.6%, as more pixels inside primary cities cross the top-coding

threshold and move up in the global ranking. Secondary cities experience much smaller

corrections. Note that this pattern has hardly changed over time. The stable lights data

also underestimate total growth in primate cities but by a much smaller margin, since

newly added outskirts tend to be less densely populated.

The correction has important implications for our understanding of city growth and

agglomeration economies in Sub-Saharan Africa. The stable lights data suggest that

secondary cities have outgrown primary cities at the intensive margin (2.42% vs. 2.15%

per year) and are, therefore, catching up. This picture is reversed after the top-coding

correction (2.47% vs. 2.74% per year). Top-coding plays virtually no role in small

primate cities, such as Bissau. However, the annualized growth rates of larger cities, like

Kinshasa, Johannesburg, Luanda, Dakar or Khartum, increase by a percentage point or

more.32 Total growth also rises to about 4% per annum in primate cities—a result which

is remarkably close to population-based estimates (United Nations, 2018).

32Online Appendix J presents data for each country, listing the primate city, as well as the growthrates of both the primary and all secondary cities in terms of intensive growth.

22

Page 25: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

We underpin these descriptive results with panel regressions focusing on growth at

the intensive margin. In each case, we regress the log of average lights emitted by city j

in country i at time t, ln lightsijt, on a linear time trend, an interaction of the linear time

trend with an indicator for primate cities, Pij, and a set of fixed effects that vary across

specifications. We typically include city fixed effects and then add satellite dummies to

purge systematic measurement errors across satellites.33

Table 4 – Growth regressions for African cities, intensive margin

Dependent variable: Log mean lights

Stable lights data Corrected data(1) (2) (3) (4) (5) (6)

Linear trend 0.013∗∗∗ 0.013∗∗∗ 0.008∗∗∗ 0.014∗∗∗ 0.013∗∗∗ 0.008∗∗∗

(0.001) (0.001) (0.001) (0.001) (0.001) (0.001)

Primate × linear trend 0.003 0.003 0.007∗∗∗ 0.007∗∗∗

(0.002) (0.002) (0.003) (0.003)

City FE Yes Yes Yes Yes Yes YesSatellite FE No No Yes No No YesObservations 11724 11724 11724 11724 11724 11724Cities 533 533 533 533 533 533

Notes: The table reports the results of city-level panel regressions using the stable lights and top-coding corrected data. The specifications are variants of ln lightsijt = β1t+β2(t×Pij)+cij+st+εijtwhere t is a linear time trend, Pij is an indicator for primate cities, cij is a city fixed effect and stcontains satellite dummies. Standard errors clustered at the city level are in parentheses. Significantat: ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Table 4 confirms that primate cities outgrew secondary cities by a significant margin

over this period but also adds a few qualifications. Columns (1) and (4) show that the

average growth rate across all cities, which is driven by the great majority of secondary

cities, was slightly less than 1.5%. Pronounced differences appear once we include the

interaction terms. The stable lights data in columns (2) and (3) suggest no differences

across city types, which is an artifact of pixels in primate cities increasingly hitting the

top-coding threshold. On the contrary, the corrected data in columns (4) and (5) indicate

that primate cities grew 0.7 percentage points faster. The estimated coefficient on the

interaction term directly corresponds to a difference in means test. Hence, we can reject

the null hypothesis that both trends are the same at all conventional significance levels.

The growth rates of secondary cities are virtually unaffected by the correction.

However, they fall considerably according to both data sources once we include satellite

dummies in columns (3) and (6). Later satellites tend to be more sensitive to light

than earlier satellites, so that the trends documented in the descriptive statistics and

33Note that our vector of satellite dummies accounts for all the possible combinations of satellites,e.g. the fact that the value for 1993 are averages of satellites F10 and F12 which is different than a valuethat is solely obtained from either F10 or F12.

23

Page 26: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

earlier regressions are overestimated across the board. Primate cities actually grew

approximately twice as fast as secondary cities at the intensive margin. Stated differently,

over the entire period of 21 years, light intensity within primate cities increased by about

37% but only by about 18% in secondary cities.

Table 5 – Split sample growth regressions for African cities, intensive margin

Dependent variable: Log mean lights

Stable lights data Corrected data(1) (2) (3) (4) (5) (6)

Panel a) Period from 1992 to 2002

Linear trend 0.005∗∗∗ 0.004∗∗∗ -0.001 0.005∗∗∗ 0.005∗∗∗ -0.001(0.001) (0.001) (0.002) (0.001) (0.001) (0.002)

Primate × linear trend 0.006 0.006 0.007∗ 0.007(0.004) (0.004) (0.004) (0.004)

Panel b) Period from 2003 to 2013

Linear trend 0.061∗∗∗ 0.063∗∗∗ 0.017∗∗∗ 0.064∗∗∗ 0.064∗∗∗ 0.016∗∗∗

(0.001) (0.001) (0.002) (0.001) (0.001) (0.002)

Primate × linear trend -0.021∗∗∗ -0.021∗∗∗ -0.006 -0.006(0.004) (0.004) (0.005) (0.005)

City FE Yes Yes Yes Yes Yes YesSatellite FE No No Yes No No YesObservations (a) 6394 6394 6394 6394 6394 6394Observations (b) 5330 5330 5330 5330 5330 5330Cities (both) 533 533 533 533 533 533

Notes: The table reports the results of city-level panel regressions using the stable lights and top-coding corrected data. The specifications are variants of ln lightsijt = β1t+β2(t×Pij)+cij+st+εijtwhere t is a linear time trend, Pij is an indicator for primate cities, cij is a city fixed effect and stcontains satellite dummies. Standard errors clustered at the city level are in parentheses. Significantat: ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Table 5 investigates whether the relative growth rates of primate and secondary cities

differ from the 1990s to the 2000s. Panel (a) shows that both city types grew slowly or not

at all in the first period (1992-2002)—consistent with accounts of sluggish recoveries from

the deep recessions of the 1970s and 1980s across Sub-Saharan Africa (e.g. Bluhm et al.,

2018). Robust growth returned in the 2000s. The estimates in panel (b) demonstrate

that growth in secondary cities increased substantially after 2002. Top-coding plays

a large role in this period. The stable lights data suggest that secondary cities grew

significantly faster than primary cities, while the corrected lights show that this is not

the case. Column (3) even suggests a marginally negative growth rate of primate cities.

Our preferred specification in column (6) suggests a growth rate of about 1.6% per annum

for both city types. Note that all other specifications without satellite fixed effects clearly

overestimate the growth rates of secondary cities.

24

Page 27: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

So far we have emphasized the intensive margin over the extensive margin or total

growth. When the city footprint is kept fixed, we can interpret growth in lights directly as

increases in population density and economic activity per square kilometer. Population

density, in turn, is strongly correlated with living standards and public good provision

in developing countries (Gollin et al., 2017). Estimates from developed countries suggest

that a doubling of population density raises productivity by about 5% (Rosenthal

and Strange, 2004). Dense city centers are also the places where top-coding is most

pronounced. New developments at the fringe of a city are usually dimmer but here too

primate cities have been growing substantially faster than secondary cities. Lights in

the fringe grow about 2.4% faster in primate cities than in secondary cities.34 Taken

together, our first set of results are good news for African economies. Improvements

in infrastructure and greater housing density should be increasing welfare in primate

cities, provided that they are not just absorbing growing populations and becoming more

congested.

City structure: Next, we study how inner city structures are transforming over time,

in order to better understand whether neighborhoods within African cities are becoming

better connected or are developing into loose clusters of informal settlements. From now

on, we exclusively use the corrected data since they provide a consistent view onto inner

city activity and focus on the core of the city as defined by the initial boundaries.35

We compute two proxies for the variation of urban population density or inner city

fragmentation, both of which have been previously used in the literature on urban forms

(e.g. see Tsai, 2005). Our first measure is the coefficient of variation of light intensity per

pixel. The coefficient of variation is a simple inequality measure capturing the variation of

light intensities across an entire city. It is defined as the ratio of the standard deviation

to the mean. A high (low) value indicates large (small) inner-city differences in the

dispersion of light. The index is not bounded from above.

Our second measure of fragmentation is Moran’s I (Moran, 1950). Moran’s I takes

the precise location of each pixel within a city into account and indicates whether similar

light intensities cluster together in space. It is defined as

I =N

S0

∑i

∑j wij(xi − x)(xj − x)∑

i(xi − x)2

where N is the number of pixels in the initial footprint of the city, wij are elements of an

N ×N inverse distance weight matrix, S0 is the sum of all wij, xi or xj is the pixel-level

light intensity, and x is mean luminosity.36

34Online Appendix J contains the corresponding regression results.35Including the expanding city perimeter does not fundamentally change our result on city structures.36We work with a scaled version of Moran’s I to make cities consisting of different numbers of pixels

comparable, that is, we subtract its expected value under the null hypothesis of no spatial correlation:

25

Page 28: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Positive values of Moran’s I indicate that pixels are surrounded by others of similar

luminosity or population density (positive autocorrelation), while negative values reflect

a checkerboard pattern (negative autocorrelation). The index ranges from minus one to

one. Light intensities within cities are positively spatially correlated but there is a clear

ranking. The index continuously falls as we move from monocentric cities over polycentric

cities to decentralized urban sprawl. A monocentric city in which luminosity slowly and

gradually decreases from the densely populated center to the sparsely populated outskirts

will have a higher Moran’s I than a checkered city in which dense and sparsely populated

areas take turns.

Panel (a) of Figure 5 illustrates the heterogeneity of urban structures on the

subcontinent and shows that our light-based measures capture meaningful variation.

Consider, for example, cities with a high Moran’s I and relatively low coefficients of

variation, such as Conakry, Freetown and Dakar. This combination indicates a regular

structure with a bright center surrounded by similarly bright areas with a slow decay

towards darker outskirts. Other cities with the same coefficient of variation have a

much lower Moran’s I. Their spatial distribution is considerably more fragmented,

matching other accounts. A large part of Abidjan’s population, for example, lives in

slums characterized by illegal land tenure, buildings of non-permanent materials, and

little or next to no infrastructure (UN-Habitat, 2003).

Figure 5 – Varying structures of cities in Sub-Saharan Africa

(a) Fragmentation in primate cities, 2000

●●

●●

20 40 60 80 100

1520

2530

3540

CVs

Mor

an's

I

COTONOU

GABORONE

OUAGADOUGOU

BUJUMBURAABIDJAN

DOUALA

BANGUI

NDJAMENA

BRAZAVILLE

KINSHASA

DJIBOUTI

ASMARA

ADDISABABA

LIBREVILLE

ACCRA

CONAKRY

BISSAU

NAIROBI

MASERU

ANTANANARIVO

BLANTYRE

BAMAKO

NOUAKCHOTT

MAPUTO

WINDHOEK

NIAMEY

LAGOS

KIGALI

DAKARFREETOWN

MOGADISHU

JOHANNESBURG

ALKHARTUM

MBABANE

LOME

KAMPALA

DARESSALAAM

LUSAKA

HARARE

(b) Moran’s I in Johannesburg

1990 1995 2000 2005 2010 2015

1415

1617

18

Mor

an's

I

Notes: Illustration of urban structures in Sub-Saharan Africa. Panel (a) shows a cross-sectionalscatter plot of the estimated Coefficient of Variation (CV) and Moran’s I in 2000. Panel (b) displaysthe evolution of Moran’s I in Johannesburg over the period from 1992 to 2013. The coefficient ofvariation and Moran’s I have been scaled by 100 for the ease of exposition.

Johannesburg is a particularly interesting case in terms of fragmentation. In 2000, it

has the second highest coefficient of variation and the lowest Moran’s I in our sample of

I∗ = I − E[I] = I − (−1/(N − 1)).

26

Page 29: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

primate cities. Owing to a legacy of racial segregation during Apartheid, Johannesburg

consists of alternating poor and rich neighborhoods which do not form a single integrated

city. There is limited evidence that this pattern is changing. Panel (b) of Figure 5 shows

that we observe a moderate increase in Moran’s I since the mid-2000s while the coefficient

of variation is decreasing at the same time. This suggests that efforts are being made at

integrating these very different neighborhoods, although the overall levels of inequality

and fragmentation remain very high.

To analyze these data in a more structured manner, we regress one of the measures

of inner city fragmentation, Fijt, on a linear time trend, an interaction of a linear time

trend with an indicator for primate cities, Pij, the log of average luminosity in the city,

ln lightsijt, and a set of fixed effects that vary across specifications. We also include

the city-wide average light intensity to interpret the changing structure net of growth

effects.37

Table 6 – Trends in fragmentation of African cities, 1992–2013

Dependent variable:

Coefficient of Variation Moran’s I(1) (2) (3) (4) (5) (6)

Linear trend -0.275∗∗∗ -0.245∗∗∗ -0.141∗∗∗ 0.042∗∗∗ 0.048∗∗∗ -0.302∗∗∗

(0.029) (0.030) (0.053) (0.012) (0.012) (0.034)

Primate × linear trend -0.425∗∗∗ -0.417∗∗∗ -0.096∗∗∗ -0.098∗∗∗

(0.074) (0.074) (0.036) (0.036)

Log light intensity -7.817∗∗∗ -7.681∗∗∗ -9.147∗∗∗ -1.361∗∗∗ -1.330∗∗∗ -1.075∗∗

(1.512) (1.520) (2.034) (0.368) (0.369) (0.467)

City FE Yes Yes Yes Yes Yes YesSatellite FE No No Yes No No Yes

Observations 11680 11680 11680 11680 11680 11680Cities 531 531 531 531 531 531

Notes: The table reports the results of city-level panel regressions using the top-coding correcteddata. The specifications are variants of Fijt = β1t+β2(t×Pij)+β3 ln lightsijt+cij +st+εijt, whereFijt is either the coefficient of variation or Moran’s I, t is a linear time trend, Pij is an indicator forprimate cities, cij is a city fixed effect and st contains satellite dummies. Standard errors clusteredat the city level are in parentheses. Both measures of fragmentation have been scaled by 100 forreadability. Significant at: ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Table 6 reveals two new insights. First, we observe a decrease in the dispersion of

lights over time which differs across city types. Column (1) shows that the coefficient of

variation has been decreasing steadily over the period from 1992 to 2013. Columns (2)

and (3) add that the trend is almost three times as fast in primate cities than in secondary

cities (once satellite dummies are included). Second, this decrease in light inequality is

37Note that this does not alter our conclusions in any important manner.

27

Page 30: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

accompanied by a decrease in clustering as measured by Moran’s I. African cities are

becoming less compact and this effect is about 25% greater in primate cities. Column (4)

and (5) show that omitting satellite fixed effects would lead us to mistakenly conclude

that secondary cities have been becoming more compact, whereas column (6) shows that

the trend is negative across both types of cities. Moreover, all column shows that the

increasing average light intensity has been a driving factor behind the decrease in both

the coefficient of variation and Moran’s I.

We take a closer look at when these characteristics emerge in Online Appendix J. The

trend of decreasing inner city inequality is equally pronounced throughout the sample

period, while increasing fragmentation (or decreasing compactness) is a relatively new

phenomenon starting in the 2000s. We also present somewhat inconclusive evidence on

whether an increasing fragmentation of cities inhibits their subsequent growth.

Our preferred interpretation of these findings is that Africa’s biggest cities are at a

crossroads. They are growing rapidly at the intensive and extensive margin, while the

distribution of economic activity and people is starting to equalize across the city. The

simultaneous decrease in clustering tells us that subcenters are forming but these may

or may not ultimately coalesce into a cohesive whole.38 Although poorer neighborhoods

are becoming denser and brighter relative to the center, a lack of connectivity to other

neighborhoods remains a major obstacle and constrains economies of scale. By reducing

the frequency and quality of human interactions, congestion limits the ability of cities to

facilitate matching, sharing and learning (Duranton and Puga, 2004). Moreover, the lack

of a compact shape itself induces large welfare losses for urban consumers in developing

countries (Harari, 2017).

It is important to emphasize that there is nothing tautological about these results

in our correction; if anything, the procedure provides a lower bound of primate city

growth and fragmentation. Top-coded pixels which become brighter over time move up

in the global ranking and thus receive higher theoretical values. Our correction does not

explicitly consider whether these pixels are located in primary or secondary cities, or if

they are surrounded by other bright pixels. This is precisely how Johannesburg is able

to buck the trend of other primate cities in Sub-Saharan Africa.

6 Concluding remarks

While satellite data of nighttime lights are an increasingly popular proxy for economic

activity, they suffer from top-coding and severely underestimate the brightness of most

cities. The key contribution of this paper is to provide a solution to this problem and

establish new findings about the economic performance of cities in Sub-Saharan Africa.

38Simulations show that the emergence of sub-centers goes in line with an increase in light density, adecrease in light inequality and an increase in fragmentation, just as we observe in the African data.

28

Page 31: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Our solution rests on the claim that top lights can be characterized by a Pareto

distribution. We support this conjecture in two ways. First, a model of luminosity

emitted by large cities suggests that plausible assumptions directly lead to a power law

in light emissions. Second, a battery of empirical tests indicates that a Pareto distribution

is a sound representation of the data. Other parametric or non-parametric approaches

are possible, but we find it appealing to directly link the distribution of bright lights

to both Zipf’s law of cities and the standard tail extrapolation problem. On this basis,

we develop a geo-referenced ranking procedure to replace the top-coded pixels with their

theoretical counterparts and present a new global panel of light intensities over the period

from 1992 to 2013.

The new data lends itself to numerous applications and performs well in several

benchmarking exercises. In this paper, we focus on city growth and city structure in Sub-

Saharan Africa. Our main finding is that primary cities have maintained their dominant

position but are becoming more fragmented internally. This limits economies of scale and

their ability to break into world markets. Institutional features certainly play a role in

this development and warrant more research. If public services are improving, and public

infrastructure connects striving neighborhoods, then this will have wide-ranging benefits

extending well beyond the city. Note that by focusing on cities in Sub-Saharan Africa,

we also submit our new data to the ultimate test. Since the top-coding correction makes

a substantial difference in a setting where both electrification rates and urban building

densities are comparatively low, it will almost surely matter more in other parts of the

world.

29

Page 32: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

References

Abrahams, A., C. Oram, and N. Lozano-Gracia (2018). Deblurring DMSP nighttime lights: A newmethod using gaussian filters and frequencies of illumination. Remote Sensing of Environment 210,242–258.

Ades, A. F. and E. L. Glaeser (1995). Trade and circuses: Explaining urban giants. Quarterly Journalof Economics 110 (1), 195–227.

Alesina, A., S. Michalopoulos, and E. Papaioannou (2016). Ethnic inequality. Journal of PoliticalEconomy 124 (2), 428–488.

Amson, J. (1972). Equilibrium models of cities: 1. An axiomatic theory. Environment and PlanningA 4 (4), 429–444.

Atkinson, A. B., T. Piketty, and E. Saez (2011). Top incomes in the long run of history. Journal ofEconomic Literature 49 (1), 3–71.

Barrios, S., L. Bertinelli, and E. Strobl (2006). Climatic change and rural–urban migration: The case ofSub-Saharan Africa. Journal of Urban Economics 60 (3), 357–371.

Batty, M. and P. Longley (1994). Fractal Cities: A Geometry of Form and Function. San Diego, CAand London: Academic Press.

Bertaud, A. and S. Malpezzi (2014). The spatial distribution of population in 57 world cities: The roleof markets, planning, and topography. Unpuplished, University of Wisconsin-Madison.

Bettencourt, L. M. A. (2013). The origins of scaling in cities. Science 340 (6139), 1438–1441.

Bluhm, R., D. de Crombrugghe, and A. Szirmai (2018). Poverty accounting. European EconomicReview 104 (C), 237–255.

Bluhm, R., A. Dreher, A. Fuchs, B. Parks, A. Strange, and M. Tierney (2018). Connective financing:Chinese infrastructure projects and the diffusion of economic activity in developing countries. AidDataWorking Paper 42, AidData at William & Mary, Williamsburg, VA.

Bonfatti, R. and S. Poelhekke (2017). From mine to coast: Transport infrastructure and the directionof trade in developing countries. Journal of Development Economics 127, 91–108.

Brueckner, J. K. (1982). A note on sufficient conditions for negative exponential population densities.Journal of Regional Science 22 (3), 353–359.

Castells-Quintana, D. (2017). Malthus living in a slum: Urban concentration, infrastructure andeconomic growth. Journal of Urban Economics 98, 158–173.

Chen, X. and W. D. Nordhaus (2011). Using luminosity data as a proxy for economic statistics.Proceedings of the National Academy of Sciences 108 (21), 8589–8594.

Christiaensen, L., R. Kanbur, L. Christiaensen, and R. Kanbur (2016). Secondary towns and povertyreduction: Refocusing the urbanization agenda. Policy Research Working Paper Series 7895, TheWorld Bank.

Christiaensen, L. and Y. Todo (2014). Poverty reduction during the rural-urban transformation: Therole of the missing middle. World Development 63 (C), 43–58.

Cirillo, P. (2013). Are your data really Pareto distributed? Physica A: Statistical Mechanics and itsApplications 392 (23), 5947–5962.

Civelli, A., A. Horowitz, and A. Teixeira (2018). Foreign aid and growth: A Sp P-VAR analysis usingsatellite sub-national data for Uganda. Journal of Development Economics 134, 50–67.

Coleman, J. S. (1964). Introduction to Mathematical Sociology. New York, NY: The Free Press.

Desmet, K. and E. Rossi-Hansberg (2013). Urban accounting and welfare. American EconomicReview 103 (6), 2296–2327.

Donaldson, D. and A. Storeygard (2016). The view from above: Applications of satellite data ineconomics. Journal of Economic Perspectives 30 (4), 171–198.

Dreher, A. and S. Lohmann (2015). Aid and growth at the regional level. Oxford Review of EconomicPolicy 31 (3-4), 420–446.

Duranton, G. (2008). Viewpoint: From cities to productivity and growth in developing countries.Canadian Journal of Economics 41 (3), 689–736.

Duranton, G. and D. Puga (2004). Micro-Foundations of Urban Agglomeration Economies. In J. V.Henderson and J. F. Thisse (Eds.), Handbook of Regional and Urban Economics, Volume 4 of Handbookof Regional and Urban Economics, Chapter 48, pp. 2063–2117. Elsevier.

Eeckhout, J. (2004). Gibrat’s law for (all) cities. American Economic Review 94 (5), 1429–1451.

Eeckhout, J. (2009). Gibrat’s law for (all) cities: Reply. American Economic Review 99 (4), 1676–1683.

30

Page 33: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Elvidge, C. D., K. E. Baugh, J. B. Dietz, T. Bland, P. C. Sutton, and H. W. Kroehl (1999).Radiance calibration of DMSP-OLS low-light imaging data of human settlements. Remote Sensing ofEnvironment 68 (1), 77–88.

Fetzer, T., V. Henderson, D. Nigmatulina, and A. Shanghavi (2016). What happens to cities whencountries become democratic? Unpublished, University of Warwick and London School of Economics.

Gabaix, X. (1999). Zipf’s law for cities: An explanation. Quarterly Journal of Economics 114 (3),739–767.

Gabaix, X. and R. Ibragimov (2011). Rank - 1/2: A simple way to improve the ols estimation of tailexponents. Journal of Business & Economic Statistics 29 (1), 24–39.

Gabaix, X. and Y. Ioannides (2004). The evolution of city size distribution. In J. V. Henderson andJ. F. Thisse (Eds.), Handbook of Regional and Urban Economics (1 ed.), Volume 4, Chapter 53, pp.2341–2378. Elsevier.

Gibrat, R. (1931). Les Inegalites Economiques: Applications: Aux Inegalites des Richesses, a laConcentration des Entreprises, aux Populations des Villes, aux Statistiques des Familles, etc. D’uneLoi Nouvelle, la Loi de l’Effet Proportionnel. Paris: Librairie du Recueil Sirey.

Gibson, J., G. Datt, R. Murgai, and M. Ravallion (2017). For India’s rural poor, growing towns mattermore than growing cities. World Development 98, 413–429.

Glaeser, E. and J. V. Henderson (2017). Urban economics for the developing world: An introduction.Journal of Urban Economics 98, 1–5.

Glaeser, E. L. (2014). A world of cities: The causes and consequences of urbanization in poorer countries.Journal of the European Economic Association 12 (5), 1154–1199.

Gollin, D., R. Jedwab, and D. Vollrath (2016). Urbanization with and without industrialization. Journalof Economic Growth 21 (1), 35–70.

Gollin, D., M. Kirchberger, and D. Lagakos (2017). In search of a spatial equilibrium in the developingworld. Working Paper 23916, National Bureau of Economic Research.

Harari, M. (2017). Cities in bad shape: Urban geometry in India. Unpublished, University ofPennsylvania.

Henderson, J. V. (2003). The urbanization process and economic growth: The so-what question. Journalof Economic Growth 8 (1), 47–71.

Henderson, J. V., T. Squires, A. Storeygard, and D. Weil (2018). The global distribution of economicactivity: Nature, history, and the role of trade. Quarterly Journal of Economics 133 (1), 357–406.

Henderson, J. V., A. Storeygard, and D. N. Weil (2012). Measuring economic growth from outer space.American Economic Review 102 (2), 994–1028.

Hodler, R. and P. A. Raschky (2014). Regional favoritism. Quarterly Journal of Economics 129 (2),995–1033.

Hsu, F.-C., K. E. Baugh, T. Ghosh, M. Zhizhin, and C. D. Elvidge (2015). DMSP-OLS radiancecalibrated nighttime lights time series with intercalibration. Remote Sensing 7 (2), 1855–1876.

Ioannides, Y. and S. Skouras (2013). US city size distribution: Robustly Pareto, but only in the tail.Journal of Urban Economics 73 (1), 18–29.

Isaksson, A.-S. and A. Kotsadam (2018). Chinese aid and local corruption. Journal of PublicEconomics 159, 146–159.

Jean, N., M. Burke, M. Xie, W. M. Davis, D. B. Lobell, and S. Ermon (2016). Combining satelliteimagery and machine learning to predict poverty. Science 353 (6301), 790–794.

Jedwab, R. and A. Moradi (2016). The permanent effects of transportation revolutions in poor countries:Evidence from Africa. Review of Economics and Statistics 98 (2), 268–284.

Jedwab, R. and D. Vollrath (2018). The urban mortality transition and poor country urbanization.American Economic Journal: Macroeconomics (forthcoming).

Jiang, B. and T. Jia (2011). Zipf’s law for all the natural cities in the United States: A geospatialperspective. International Journal of Geographical Information Science 25 (8), 1269–1281.

Jones, A. (1975). Density-size rule, a further note. Urban Studies 12 (2), 225–228.

Krugman, P. (1991). Increasing returns and economic geography. Journal of Political Economy 99 (3),483–499.

Lall, S. V., J. V. Henderson, and A. J. Venables (2017). Africa’s Cities: Opening Doors to the World.Washington, DC: World Bank.

Lessmann, C. and A. Seidel (2017). Regional inequality, convergence, and its determinants: A view fromouter space. European Economic Review 92 (B), 110–132.

31

Page 34: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Lipton, M. (1977). Why Poor People Stay Poor: Urban Bias in World Development. Cambridge, MA:Harvard University Press.

Luckstead, J. and S. Devadoss (2014). Do the world’s largest cities follow Zipf’s and Gibrat’s laws?Economics Letters 125 (2), 182–186.

Michalopoulos, S. and E. Papaioannou (2013). Pre-colonial ethnic institutions and contemporary Africandevelopment. Econometrica 81 (1), 113–152.

Michalopoulos, S. and E. Papaioannou (2014). National institutions and subnational development inAfrica. Quarterly Journal of Economics 129 (1), 151–213.

Michalopoulos, S. and E. Papaioannou (2018). Spatial patterns of development: A meso approach.Annual Review of Economics 10 (1), 383–410.

Mills, E. (1967). An aggregative model of resource allocation in a metropolitan area. American EconomicReview 57 (2), 197–210.

Moran, A. P. (1950). Notes on continuous stochastic phenomena. Biometrika 37 (1/2), 17–23.

Newman, M. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics 46 (5),323–351.

Parr, J. (1985). A population-density approach to regional spatial structure. Urban Studies 22 (4),289–303.

Piketty, T. (2003). Income inequality in France, 1901–1998. Journal of Political Economy 111 (5),1004–1042.

Pinkovskiy, M. (2017). Growth discontinuities at borders. Journal of Economic Growth 22 (2), 145–192.

Pinkovskiy, M. and X. Sala-i Martin (2016). Lights, camera ... income! illuminating the national accounts-household surveys debate. Quarterly Journal of Economics 131 (2), 579–631.

Puga, D. (1998). Urbanization patterns: European versus less developed countries. Journal of RegionalScience 38 (2), 231–252.

Rosen, K. T. and M. Resnick (1980). The size distribution of cities: An examination of the Pareto lawand primacy. Journal of Urban Economics 8 (2), 165–186.

Rosenthal, S. and W. Strange (2004). Evidence on the nature and sources of agglomeration economics.In J. Henderson and J. Thisse (Eds.), Handbook of Regional and Urban Economics, Volume 4, pp.2119–2171. Elsevier North Holland.

Rozenfeld, H., D. Rybski, X. Gabaix, and H. Makse (2011). The area and population of cities: Newinsights from a different perspective on cities. American Economic Review 101 (5), 2205–2225.

Small, C., C. Elvidge, D. Balk, and M. Montgomery (2011). Spatial scaling of stable night lights. RemoteSensing of Environment 115 (2), 269–280.

Smeed, R. (1961). The Traffic Problem in Towns. Manchester Statistical Society Papers.

Soo, K. T. (2005). Zipf’s law for cities: A cross-country investigation. Regional Science and UrbanEconomics 35 (3), 239–263.

Stewart, J. Q. (1947). Suggested principles of “social physics”. Science 106 (2748), 179–180.

Storeygard, A. (2016). Farther on down the road: Transport costs, trade and urban growth in sub-saharan africa. Review of Economic Studies 83 (3), 1263–1295.

Tsai, Y.-H. (2005). Quantifying urban form: Compactness versus ‘sprawl’. Urban Studies 42 (1), 141–161.

Tuttle, B., S. Anderson, P. Sutton, C. Elvidge, and K. Baugh (2013). It used to be dark here.Photogrammetric Engineering and Remote Sensing 79 (3), 287–297.

UN-Habitat (2003). The Challenge of Slums: Global Report on Human Settlements, 2003. EarthscanPublications.

United Nations (2018). World Urban Prospects, the 2018 Revision. United Nations New York.

Venables, A. J. (2017). Breaking into tradables: Urban form and urban function in a developing city.Journal of Urban Economics 98, 88–97.

Zielinski, K. (1980). The modelling of urban population density: A survey. Environment and PlanningA 12 (2), 135–154.

Ziskin, D., K. Baugh, F. C. Hsu, T. Ghosh, and C. Elvidge (2010). Methods used for the 2006 radiancelights. Proceedings of the Asia-Pacific Advanced Network 30, 131–142.

32

Page 35: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Online appendix

A Additional summary statistics ii

B The top-coding threshold viii

C Proofs and extensions of the model x

D Extreme value theory xiv

E Additional results using the radiance-calibrated data xvii

F Additional results using the VIIRS data xxiii

G An analytical top-coding correction xxvi

H Characteristics of the corrected data xxviii

I Benchmarking exercises xxxiv

J Additional results for African cities xli

i

Page 36: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

A Additional summary statistics

In this section we provide further details on the construction of the stable lights and

radiance-calibrated lights, and compare their characteristics.

The main advantage of the stable lights series is that they are available as an annual

panel. Moreover, for several years more than one satellite has been orbiting Earth,

resulting in a total of 34 satellite-years over the period of 1992 to 2013. The radiance-

calibrated data, by contrast, are based on rarely occurring additional flights of satellites

which are about to be decommissioned and could be operated with a different gain setting

(lower or higher amplification settings). These auxiliary data are only available for seven

years over the entire period from 1996 to 2010. NOAA blends the stable lights data

from normal flight operations with these auxiliary satellite data to obtain the radiance-

calibrated series (Elvidge et al., 1999, Ziskin et al., 2010, Hsu et al., 2015). The resulting

night light intensities are free of top-coding and have no upper bound (at least in theory).

Several technical issues and measurement errors, occurring when the different fixed

gain images are merged at NOAA, produce a lot of variability in the radiance-calibrated

data: i) the low amplification data are based on considerably fewer orbits than the

stable lights series (often covering only small parts of a year), ii) they are generated by

blending different parts of the frequency spectrum which are deemed reliable, iii) higher

light intensities are supported by fewer and fewer fixed-gain images1, and iv) fires or

stray lights have not been fully removed from the auxiliary data. All this contributes

to the high variance across different radiance-calibrated satellite-years.2 Because of this

instability, together with the fact that they are only available for seven out of 22 years,

we only rely on the radiance-calibrated data to infer the shape of the distribution at the

top. The relative ranks of pixels are consistently measured across the different satellites

and less prone to be affected by measurement errors.

Table A-1 reports summary statistics for the 34 stable light satellite-years and the

seven radiance-calibrated years. Between 2.7% and 5.9% of all pixels in the stable lights

images reach the top of the scale (i.e., 55 DN to 63 DN), more so in later years. As the

radiance-calibrated lights do not suffer from top-coding, their mean, standard deviation

and Gini in lights are much higher. Rather than being capped at 63 DN, they reach

maximum values from 2000 to 5000 DN. The fluctuations across satellites are reflected

in the overall mean light intensity, but are most apparent at the top. The maximum

1Consider the 2010 radiance-calibrated product for example, the maximum number of cloud-freeimages is 134, the suburbs of Paris are informed by about 50–60 cloud-free images, but the inner citycore only by 10–20 images. This pattern repeats itself throughout all major cities.

2Measurement errors are also present in the stable lights data and affect their reliability in the timeseries dimension but to a much lesser extent. The sensors of the satellites deteriorate over their lifetimeand have to be replaced every couple of years, which implies that later recordings of any particularsatellite tend to be the brightest (although this is not a hard rule). In panel regressions, economistsusually resort to a combination of satellite and time fixed effects to partially address this issue.

ii

Page 37: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

light intensity doubles within three years and then decreases again by a similar amount

(whereas the mean increases and decreases by about 27% over the same period).

Table A-2 confirms that these fluctuations are not driven by a few outliers. Instead

of examining overall maxima, we now report various percentiles for the seven radiance-

calibrated satellites and the means above these percentiles. For example, the top 2% begin

at 147.01 DN in the 1996 data, at 214.59 DN in 2003, and again at 150.90 DN in 2010.

The means above the various percentiles vary similarly over time. The differences are

largest in absolute values at the very top but remain sizable throughout the distribution.

This variation cannot be explained economically.

Table A-3 shows the maximum values attained by the seven radiance-calibrated

satellites in 30 selected cities. Despite considerable variability over time, the relative

ranking is in line with our expectations. The light intensity of the brightest pixel in New

York City, for example, is about ten times greater than that of the brightest pixel in

Nairobi. Note that the average maximum light intensity hardly exceeds 2000 DN, no

matter if we compute it for London, New York, or Shanghai. This is why we restrict the

maximum light intensities generated by our pixel-level correction to 2000 DN.

Table A-4 illustrates that not all differences between the stable lights and radiance-

calibrated data can be attributed to top-coding. It regresses all pixels below 55 DN of

the stable lights on the radiance-calibrated lights, where top-coding is supposed to not

play a role. We find a regression coefficient around one-half rather than equivalence.

This absence of a one-to-one correspondence is owed to the lack of on-board calibration,

blooming (Abrahams et al., 2018), the presence of stray light (Hsu et al., 2015), and

geo-location errors (Tuttle et al., 2013).

Table A-5 reports the maximum light intensities recorded within 25 kilometers of the

city center in 988 world cities with more than 500,000 inhabitants. Table A-6 adds the

rank-correlations. The latter are much higher and typically around 0.90–0.95 for adjacent

radiance-calibrated years, which supports our preference for pixel ranks over their actual

values.

iii

Page 38: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table A-1 – Summary statistics of the stable lights and radiance-calibrated data

Stable lights Radiance-calibrated

Year Flight No. Mean Std. Dev. Gini % ≥ 55 Mean Std. Dev. Gini Max

1992 F10 13.83 13.51 0.44 3.811993 F10 11.96 12.81 0.46 3.121994 F10 12.02 13.31 0.48 3.49

F12 14.65 13.93 0.44 4.201995 F12 13.09 13.57 0.46 3.761996 F12 12.69 13.36 0.46 3.51 19.42 55.63 0.65 20641997 F12 13.45 13.74 0.45 3.94

F14 10.98 12.87 0.49 3.161998 F12 13.89 13.89 0.45 4.18

F14 10.94 12.78 0.49 3.051999 F12 14.74 14.34 0.44 4.67 19.53 56.93 0.64 4698

F14 10.15 12.31 0.49 2.782000 F14 11.34 12.99 0.49 3.18 22.88 65.84 0.63 5552

F15 13.25 13.34 0.44 3.702001 F14 11.64 13.32 0.49 3.50

F15 12.93 13.26 0.45 3.542002 F14 12.14 13.70 0.49 3.77

F15 13.18 13.44 0.45 3.722003 F14 11.96 13.72 0.49 3.82 24.83 67.57 0.65 4186

F15 10.28 12.45 0.50 2.702004 F15 10.08 12.52 0.51 2.76 24.07 65.94 0.66 4357

F16 11.82 13.04 0.46 3.402005 F15 10.44 12.73 0.51 2.79

F16 10.44 12.54 0.49 2.852006 F15 10.56 12.91 0.51 2.93 20.63 50.93 0.63 3333

F16 12.26 13.37 0.47 3.482007 F15 10.74 12.82 0.50 2.79

F16 13.05 13.79 0.46 4.032008 F16 12.97 13.84 0.47 3.952009 F16 13.50 14.12 0.47 4.172010 F18 17.55 15.35 0.43 5.91 19.04 44.35 0.60 21102011 F18 14.78 14.68 0.46 4.942012 F18 16.44 15.20 0.44 5.762013 F18 16.23 15.20 0.44 5.78

Notes: The table reports summary statistics using a 10% sample of the stable lights and radiance-calibrated data at the pixel level, where each pixel is 30 × 30 arc seconds. There are several yearswhen two DMSP satellites were concurrently recording data for the stable lights series, so that thereare 34 satellite-years between 1992 and 2013. The radiance-calibrated data are only available forthe following periods: 16 Mar 96 – 12 Feb 97 (1996), 19 Jan 99 – 11 Dec 99 (1999), 03 Jan 00 – 29Dec 00 (2000), 30 Dec 02 – 11 Nov 2003 (2003), 18 Jan 04 – 16 Dec 04 (2004), 28 Nov 05 – 24 Dec06 (2006), and 11 Jan 10 – 9 Dec 10 (2010), although the actual coverage in terms of days oftenrefers to a much smaller period.

iv

Page 39: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table A-2 – Summary statistics of the radiance-calibrated data (top shares)

Year 1996 1999 2000 2003 2004 2006 2010

Panel a) Top 5%

Percentile (x) 62.87 66.74 73.75 94.60 90.97 74.97 64.84Mean above x 186.04 197.42 228.61 245.59 236.80 189.34 166.90

Panel b) Top 4%

Percentile (x) 76.30 84.79 95.62 119.27 114.26 94.03 81.98Mean above x 215.29 228.01 264.84 280.40 270.51 215.70 190.42

Panel c) Top 3%

Percentile (x) 98.42 114.12 131.13 154.40 149.82 122.72 108.27Mean above x 258.23 271.33 315.97 328.70 317.06 251.82 222.49

Panel d) Top 2%

Percentile (x) 147.01 166.33 198.77 214.59 207.97 168.84 150.90Mean above x 327.60 338.23 393.22 402.32 387.54 305.83 269.84

Panel e) Top 1%

Percentile (x) 259.04 275.41 318.53 331.88 314.53 255.44 229.79Mean above x 460.17 463.60 534.81 538.85 519.98 404.80 354.36

Panel f) Top 0.1%

Percentile (x) 729.41 716.94 815.16 822.00 805.43 605.13 511.62Mean above x 979.91 960.86 1117.96 1110.62 1111.93 806.63 687.53

Panel g) Top 0.01%

Percentile (x) 1355.38 1279.48 1528.71 1491.25 1516.16 1085.71 936.22Mean above x 1551.16 1652.31 1893.03 1828.03 1914.32 1316.93 1137.76

Notes: The table shows summary statistics of the radiance-calibrated data at the various percentiles.The input data are a 10% representative sample of all non-zero lights in the radiance-calibrated dataabove the defined threshold at the pixel level, where each pixel is 30× 30 arc seconds.

v

Page 40: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table A-3 – Maximum light intensities in 30 selected cities over time

City 1996 1999 2000 2003 2004 2006 2010 Average

Beijing 2265.07 3160.86 977.94 2979.75 2911.25 1575.00 1262.30 2161.74Berlin 757.45 518.67 430.80 490.48 556.83 418.00 375.34 506.79Bogota 661.73 774.02 416.75 602.80 828.82 622.18 489.89 628.02Brussels 1334.32 1765.47 1632.47 1561.80 1767.69 1058.33 674.38 1399.21Cairo 543.47 622.09 527.60 730.67 753.50 550.00 414.87 591.74Calgary 1084.76 2077.92 1669.85 822.00 1520.70 731.15 721.22 1232.51Casablanca 919.44 769.98 729.69 1214.73 1075.77 708.33 620.97 862.70Damascus 800.90 724.38 470.05 472.74 893.66 820.00 675.15 693.84Dhaka 1026.01 1410.89 882.44 935.03 840.66 920.00 465.11 925.73Dubai 329.21 323.17 269.86 457.58 607.57 1280.34 420.12 526.84Edinburgh 811.04 537.69 453.18 973.48 767.20 425.24 518.75 640.94Foshan 715.98 1410.36 537.46 1499.66 1625.73 1142.86 1164.98 1156.72Istanbul 579.83 551.57 413.14 653.84 543.84 421.08 327.46 498.68Jakarta 683.82 664.56 1100.27 1381.62 788.43 805.95 632.81 865.35Johannesburg 349.23 406.95 271.40 343.70 510.57 314.39 304.96 357.31London 3342.95 2145.71 1664.97 1575.50 2123.50 1815.38 1366.45 2004.92Los Angeles 1741.22 1807.50 1519.95 1757.50 1794.70 1288.89 1087.33 1571.01Manila 1117.10 768.36 390.44 969.96 1260.40 692.31 551.60 821.45Moscow 1011.28 1308.20 1270.86 1282.32 1142.01 945.45 655.45 1087.94Mosul 225.95 232.32 211.38 370.24 284.30 321.64 319.33 280.74Mumbai 1456.54 1790.01 1515.98 1775.52 1963.82 1322.22 1842.57 1666.67Nairobi 180.27 188.66 211.83 173.54 191.45 174.02 164.13 183.41New York 2299.18 2090.67 1971.10 2283.33 2877.00 1592.86 1399.78 2073.42Paris 1827.80 2444.32 1177.72 1430.28 1794.70 1425.00 874.55 1567.77Rio de Janeiro 926.51 917.27 748.92 699.31 708.83 484.08 461.57 706.64Seoul 629.42 695.65 629.67 808.30 810.81 580.00 513.82 666.81Shanghai 1965.24 1906.01 1123.89 2931.80 3982.13 2307.14 1926.59 2306.12Sydney 1482.57 1470.49 1006.67 1923.48 1600.94 1275.00 751.58 1358.68Tel Aviv 1284.19 1679.72 997.83 1397.40 1446.72 1188.24 1099.83 1299.13Tokyo 1709.40 1768.79 1785.16 1876.90 2013.90 1273.33 940.31 1623.97

Notes: The table report the maximum light intensity in DN recorded within 25 km radius of thecity center in selection of cities. The input data are the radiance-calibrated lights. City locationsare obtained from the Natural Earth point data of major populated places.

vi

Page 41: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table A-4 – Regression of stable lights on radiance-calibrated data

Year 1996 1999 2000 2003 2004 2006 2010

Stable lights 0.5557 0.5502 0.4241 0.4357 0.3473 0.4874 0.7468(0.0002) (0.0003) (0.0002) (0.0002) (0.0001) (0.0002) (0.0004)

Constant 4.6422 5.5218 3.9172 3.6007 3.2069 2.9392 6.4094(0.0045) (0.0054) (0.0044) (0.0043) (0.0036) (0.0037) (0.0066)

R2 0.7440 0.7013 0.7115 0.7709 0.7873 0.8011 0.6319

Notes: The table reports OLS estimates of a regression of all pixels smaller than 55 DN of the stablelights on their radiance-calibrated counterpart in all those years for which both data sources areavailable. Standard errors are in parentheses. The data are a 10% random sample of lights at thepixel level, where each pixel is 30× 30 arc seconds.

Table A-5 – Correlation matrix of maximum city lights

Years 1996 1999 2000 2003 2004 2006 2010

1996 1.00001999 0.9142 1.00002000 0.8615 0.8561 1.00002003 0.8588 0.8619 0.7715 1.00002004 0.8733 0.9002 0.8073 0.9181 1.00002006 0.8737 0.8974 0.8109 0.9307 0.9379 1.00002010 0.7831 0.7872 0.7428 0.8529 0.8525 0.8955 1.0000

Notes: The table reports correlations between the maximum light intensities recorded within 25 kmradius of the city center of 988 world cities with more than 500,000 inhabitants.

Table A-6 – Rank correlation matrix of maximum city lights

Years 1996 1999 2000 2003 2004 2006 2010

1996 1.00001999 0.9557 1.00002000 0.9167 0.9122 1.00002003 0.9048 0.9162 0.8451 1.00002004 0.9129 0.9331 0.8447 0.9536 1.00002006 0.9063 0.9256 0.8611 0.9482 0.9549 1.00002010 0.8495 0.8651 0.8108 0.8964 0.8973 0.9270 1.0000

Notes: The table reports rank correlations between the maximum light intensities recorded within25 km radius of the city center of 988 world cities with more than 500,000 inhabitants.

vii

Page 42: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

B The top-coding threshold

The influence of top-coding in the DMSP-OLS satellite data has been underestimated

in part because much of the literature assumes it only affects pixels with the highest

recorded value. However, even though the scale of stable lights goes up to 63, we have

good reason to assume that many pixels with DNs of 62, 61, down to the mid-50s, are

subject to top-coding and should be brighter than they are recorded in the data.

The rationale behind this conjecture is straightforward. The stable lights data have

already been averaged at least twice during the data construction. First, the DMSP

satellites average several higher resolution pixels on-board to reduce the amount of

information that needs to be transmitted down to Earth. The OLS system records images

at a nominal resolution of 0.56 km (the so-called fine resolution), which is averaged on-

board into 5 × 5 blocks to create a 2.77 km (smooth) resolution and then reprojected

onto a 30 arc second grid.3 Second, the data providers at NOAA process the daily

images into a single annual composite. As a result, many pixels suffering from top-coding

in at least one of the underlying fine resolution data points or smooth resolution daily

images will end up with an average value of less than 63. Hsu et al. (2015) suggest that

this subtle type of top-coding may even start at a DN as low as 35. Since “the OLS

does onboard averaging to produce its global coverage data, saturation does not happen

immediately when radiance reaches the maximum level. On the contrary, as the actual

radiance grows, the observed DN value fails to follow the radiance growth linearly, causing

a gradual transition into a plateau of full saturation” (Hsu et al., 2015, p. 1872).

We explore the location of the top-coding threshold with a statistical approach. If

only the stable lights at 63 DN were subject to top-coding, we would expect the histogram

in panel (a) of Figure B-1 to show a decreasing shape ending in a spike only at 63 DN.

Instead, we observe an increase in the number of pixels from 55 onwards (e.g. a bathtub

shape), signaling that these values are top-coded as well. Further evidence along these

lines is provided by panel (b) of Figure B-1. It shows a histogram of the light intensity

of the stable lights DNs associated with high radiance-calibrated values (above 160 DN).

There are a large number of pixels with DNs down to the mid-50s which correspond

to very high radiance-calibrated values, but the density falls rapidly below the mid-50s.

Other years show very similar patterns.

Table B-1 list the percentile values of the radiance-calibrated lights corresponding to

stable lights at 55 DN, 56 DN and so on. The stable lights at 63 DN have the highest

radiance-calibrated values (50% of them are higher than 390 DN). But there is also a

significant share of 55 DN lights corresponding to high radiance-calibrated values, for

instance, 25% are recorded with 140 DN or brighter.

3See https://directory.eoportal.org/web/eoportal/satellite-missions/d/dmsp-block-5d

or Abrahams et al. (2018) for a detailed description of the sensors and on-board processing.

viii

Page 43: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure B-1 – Histograms of stable lights in 1999

(a) If stable DN > 9

Saturated lights (DN)

Den

sity

10 20 30 40 50 60

0.00

0.02

0.04

0.06

0.08

0.10

(b) If radiance-calibrated DN > 160

Saturated lights (DN)

Den

sity

40 45 50 55 600.

00.

10.

20.

30.

40.

5

Notes: Illustration of the location of the top-coding threshold in the stable lights. Panel a) showsa histogram of the F12 satellite in 1999 for all pixels with a DN greater 9. Panel b) shows ahistogram of the same satellite only for pixels where the radiance-calibrated light intensity is greater160 DN. The input data are a 10% representative sample of all non-zero lights in the stable lightsand radiance-calibrated data at the pixel level (see Elvidge et al., 2009, Hsu et al., 2015).

Table B-1 – Percentiles of radiance-calibrated values at given stable lights values in 2000

Stable lights Radiance-calibrated percentilesDN 5% 25% 50% 75% 95% 99%

55 53.20 74.94 99.41 140.85 232.90 328.8656 56.15 79.99 108.20 153.92 250.93 344.0557 60.14 84.99 115.11 164.63 262.18 357.6058 64.13 92.81 125.35 179.57 277.59 392.3359 70.32 101.97 141.92 203.17 306.77 423.2860 79.16 116.64 163.92 231.91 344.57 497.2561 89.33 137.89 196.68 268.21 410.91 625.3062 109.03 176.36 246.66 331.46 524.18 762.6363 160.91 276.92 390.08 560.28 952.14 1494.85

Notes: The table reports values from the cumulative distribution function of the radiance-calibratedlights which are associated with a given stable lights value (from 55 to 63). For instance, 25% ofthe radiance-calibrated values associated with a stable lights value of 61 DN, are below 122.06. Thedata are a representative 10% sample for the year 2000.

ix

Page 44: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

C Proofs and extensions of the model

In this section we provide additional proofs and extensions of the model presented in the

main text.

The CDF of the number of rings: Note that r = π−1/2x1/(2φ) implies x = πφr2φ and

dx = 2φπφr2φ−1dr. Substituting these definitions into eq. (1) and integrating yields the

CDF of the number of rings per city as presented in eq. (2) of the main text

F (r) = 2φxcπ−φ∫ r

r

r−2φ−1dr = 2φxcπ−φ[− 1

2φr−2φ

]rr

=

0 for r < r = π−1/2x1/(2φ)c

1− ycπ−φr−2φ for r >= r = π−1/2x1/(2φ)c .

(C-1)

The density of pixels: Start with the distribution of the number of pixels. At

distances d < d, the amount of pixels increases linearly in d as rings farther away from

the center contain more pixels: dddπd2 = 2πd. Beyond d, the effect within each city has

to be multiplied by the survival function 1 − F (r) from eq. (2), as there are fewer and

fewer cities of such size. Denoting the number of cities as M , the absolute amount of

pixels N as a function of d is

P (d) =

2πdM for d < d = π−1/2x1/(2φ)c

2π1−φMxcd1−2φ for d ≥ d = π−1/2x

1/(2φ)c .

(C-2)

The total number of pixels, N , can be obtained by integration

N =

∫ d

0

2πdMdd+

∫ ∞d

2π1−φMxcd1−2φdd = 2πM

[1

2d2]d0

+ 2π1−φMxc

[1

2− 2φd2−2φ

]∞d

= πMy1/φc

π+π1−φMycφ− 1

(y1/φc

π

)1−φ

= Mx1/φc +1

φ− 1Mx1/φc =

φ

φ− 1Mx1/φc . (C-3)

Dividing eq. (C-2) by N yields the density, f(d), shown in eq. (4):

f(d) =

2π φ−1φx−1/φc d for d < d

2π1−φ φ−1φx1−1/φc d1−2φ for d ≥ d

(C-4)

with d = π−1/2x1/(2φ)c .

The density is illustrated in panel (a) of Figure C-1.

x

Page 45: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

An exponential distribution of light in cities: Next, we derive the distribution of

top lights when lights within a city follow a negative exponential function instead of an

inverse power function. The following replaces assumption 4 from the main text.

Assumption 5. Within cities, the light density l(d) follows an exponential function

l(d) = L0 exp(−γd), where L0 ≥ l is the density at the center and γ > 0 is a decay

parameter.

Panel (b) of Figure C-1 illustrates the negative exponential distribution (with γ =

0.15) and inverse power distribution (with a = 0.7). Both functions exhibit a comparable

decay from the city center to the outskirts of the city. The main difference is that the

negative exponential function attains values which are not as high in the center but

decreases more quickly towards zero at the outskirts, whereas the inverse power function

has a longer tail.

Contrary to our baseline case this altered setting does not directly generate a Pareto

distribution in lights. The distribution now depends on L0, the maximum luminosity

of the center of each city. We consider three cases. In each case, we focus on the light

density f(l) conditional on L0. Using the variable transformation of eq. (4) together with

d = 1γ

ln(L0/l) yields

f(l | L0) =

2π1−φ φ−1

φx1−1/φc

[1γ

ln(L0/l)]1−2φ

for l ≤ l

2πφ− 1

φx−1/φc

1

γ︸ ︷︷ ︸c

ln(L0/l) for l < l < L0, (C-5)

where l = L0 exp(−γd).

This conditional density increases for dim luminosities (at the fringes of the largest

cities) and decreases from l onwards. The turning point and maximum, l, corresponds to

the minimum size, d, of each city in terms of distance from center. At higher luminosities,

there are fewer and fewer pixels as these are the ones located in ever smaller rings closer

to the center.

To derive the marginal density f(l), we have to make assumptions about the

distribution of maximum luminosities L0 across cities.

Case 1: L0 is a constant, so that all cities large and small are equally bright in the

center. This is unrealistic, but mathematically simple. The marginal density of lights

f(l) equals the conditional density f(l | L0). Analyzing the top end of the distribution

c ln(L0/l) for l < l < L0 we observe a near linear decrease with a slope of c ddl

ln(L0/l) ∝−1

lfor l close to L0. There is no power law. The assumption of a constant L0 across all

cities generates too many pixels with the highest luminosities.

Case 2: L0 follows a Pareto distribution across cities so that some city centers are

much brighter than others. Empirical evidence points in this direction (see Table A-3).

xi

Page 46: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

We are only interested in the upper part of the density in eq. (C-5).4 If L0 follows the

Pareto density with η = 1 so that f(L0) = Lmin/L20, with Lmin as the minimum center

luminosity, we have the joint density of l and L0 as

f(l, L0) = f(L0)f(l | L0) = cLminL20

ln(L0/l) for lmax < l < Lmax and L0 > l. (C-6)

The marginal density, f(l), is found by integrating L0 out of the above

f(l) = cLmin

∫ Lmax

l

1

L20

ln(L0/l)dL0 = cLmin

[ln l − lnL0 − 1

L0

]Lmaxl

= cLmin

[1

l− 1

Lmax(ln(Lmax/l) + 1)

], (C-7)

which holds for lmax < l < Lmax.

Case 3: Alternatively, an intermediate case for L0 is a uniform distribution between

values Lmin and Lmax with density f(L0) = (Lmax−Lmin)−1. Cities (big and small) differ

in their maximum luminosity, but all maximum luminosities are equally likely across

cities. Following the same steps as in the second case, the marginal density is

f(l) =c

Lmax − Lmin

∫ Lmax

l

ln(L0/l)dL0 =c

Lmax − Lmin

[L0(lnL0 − ln l − 1)

]Lmaxl

=c

Lmax − Lmin

[l + Lmax(ln(Lmax/l)− 1)

], (C-8)

which holds for lmax < l < Lmax.

Cases 2 and 3 are heavy-tailed distributions which differ mathematically from the

simple Pareto. But given their heavy tails, they may be approximated by a Pareto. As

shown in Figure C-2, this works particularly well for case 2 with a Pareto distribution of

α = 1.5, while for case 3 the Pareto distributions with α = 1.2 works reasonably well.

Result 2. Based on Assumptions 1–3 and 5, as well as a sufficient variation of maximum

luminosities across cities, it follows that top lights above a threshold l can be approximated

by a Pareto distribution.

In sum, when inner city lights follow a negative exponential function, a Pareto

distribution of top lights does not arise analytically in the three cases considered here.

However, depending on the exact assumptions made about differences in the maximum

brightness across cities, the resulting distribution is heavy-tailed and can be approximated

by a Pareto distribution.

4Note that the threshold l depends on the random variable L0. Hence, we restrict our analysis to thearea of the density starting at the highest possible threshold value lmax, corresponding to L0 = Lmax.

xii

Page 47: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure C-1 – Illustration of distributions

(a) Pixel density

0 5 10 15 20 25 30 35

0.00

0.01

0.02

0.03

0.04

0.05

Distance d

Den

sity

(b) Light gradient

0 5 10 15 20 25 30 35

050

010

0015

0020

00Distance d

Ligh

t int

ensi

ty (

DN

)

Negative exponentialInverse power

Notes: The left panel shows the pixel density from eq. (4) with xc = 10000 and φ = 1.5. Theright panel shows the negative exponential distribution with γ = 0.15 as well as the inverse powerdistribution with a = 0.7, both start at P0 = 2000.

Figure C-2 – Approximating the theoretical densities with Pareto distributions

(a) Case 2 and Pareto with α = 1.5

0 500 1000 1500 2000

0.00

00.

002

0.00

40.

006

Light intensity (DN)

Den

sity

Case 2Pareto

(b) Case 3 and Pareto with α = 1.2

0 500 1000 1500 2000

0.00

00.

005

0.01

00.

015

0.02

00.

025

0.03

0

Light intensity (DN)

Den

sity

Case 3Pareto

Notes: The left panel shows the pixel density from eq. (C-7) with xc = 10000, φ = 1.5 and γ = 1.5.The Pareto distribution with α = 1.5 is scaled to fit. The right panel shows the pixel density fromeq. (C-8) with xc = 10000, φ = 1.5 and γ = 1.5. The Pareto distribution with α = 1.2 is scaled tofit.

xiii

Page 48: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

D Extreme value theory

As an alternative to our stylized urban economics model, we can also motivate a Pareto

distribution in top lights purely on statistical grounds using extreme value theory (EVT).

EVT deals with the probability distributions of sparse observations such as threshold

exceedances. A key result of this theory is that these quantities observe a Generalized

Pareto distribution (Coles, 2001).

More precisely, let X1, X2, . . . be a sequence of independent random variables –

such as light – with common but unknown distribution function F , and let Mn =

max{X1, . . . , Xn}. If F satisfies the extremal types theorem (Coles, 2001), so that for

large n, P[Mn > z] ≈ G(z) with G(z) as the Generalized Extreme Value distribution,

then, for a high enough threshold u, the distribution of the threshold exceedance

P[(X − u) > y|X > u] is approximately

H(y) = 1−(

1 +ξy

σ

)− 1ξ

, (D-1)

no matter which regular distribution X was drawn from.

This means that we will observe a Generalized Pareto distribution with parameters

ξ and σ for all lights values above a specified threshold. With ξ = 0, this reduces to

the exponential distribution and with ξ > 0 the distribution is Pareto. There is strong

evidence that the latter case holds for the lights data.

Table D-1 shows the results of fitting the Generalized Pareto distribution to various

top shares of the light distribution of the seven radiance-calibrated satellites. The fit is

very good and the estimated ξ parameters are always significantly positive. This clearly

points towards a Pareto distribution.

Figure D-1 plots the Generalized Pareto distribution against the empirical distribution

function of the radiance-calibrated data from 2010. It visualizes the close fit and confirms

the results from the previous regression.

xiv

Page 49: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table D-1 – Fitted Generalized Pareto distributions, varying thresholds

Year 1996 1999 2000 2003 2004 2006 2010 Average

Panel a) Top 5%

lnσ 4.2422 4.5197 4.7304 4.7560 4.7136 4.5239 4.4477 4.5619(0.0062) (0.0051) (0.0053) (0.0052) (0.0050) (0.0051) (0.0049) [0.1860]

ξ 0.4917 0.3136 0.2792 0.2354 0.2398 0.1972 0.1652 0.2746(0.0056) (0.0043) (0.0044) (0.0042) (0.0040) (0.0040) (0.0038) [0.1075]

Threshold 63 67 74 95 91 75 65 –Observations 96,685 116,858 106,914 100,094 106,899 99,486 107,745 –

Panel b) Top 4%

lnσ 4.5136 4.6989 4.9099 4.8569 4.8241 4.6278 4.5574 4.7127(0.0066) (0.0055) (0.0055) (0.0057) (0.0054) (0.0055) (0.0052) [0.1545]

ξ 0.3720 0.2401 0.2020 0.2051 0.2051 0.1605 0.1212 0.2152(0.0057) (0.0044) (0.0044) (0.0045) (0.0042) (0.0043) (0.0039) [0.0790]

Threshold 76 85 96 119 114 94 82 –Observations 77,348 93,484 85,481 80,075 85,489 79,589 86,195 –

Panel c) Top 3%

lnσ 4.8260 4.8674 5.0702 4.9841 4.9127 4.7153 4.6387 4.8592(0.0069) (0.0060) (0.0060) (0.0063) (0.0061) (0.0062) (0.0058) [0.1491]

ξ 0.2266 0.1753 0.1387 0.1629 0.1873 0.1356 0.0944 0.1601(0.0056) (0.0047) (0.0045) (0.0049) (0.0047) (0.0047) (0.0042) [0.0424]

Threshold 98 114 131 154 150 123 108 –Observations 58,010 70,112 64,110 60,057 64,133 59,691 64,646 –

Panel d) Top 2%

lnσ 5.0520 5.0042 5.1151 5.0807 4.9771 4.7847 4.6836 4.9568(0.0079) (0.0070) (0.0073) (0.0076) (0.0075) (0.0075) (0.0070) [0.1614]

ξ 0.1355 0.1332 0.1438 0.1432 0.1933 0.1266 0.0903 0.1380(0.0060) (0.0053) (0.0055) (0.0057) (0.0058) (0.0056) (0.0050) [0.0304]

Threshold 147 166 199 215 208 169 151 –Observations 38,673 46,742 42,740 40,039 42,755 39,795 43,097 –

Panel e) Top 1%

lnσ 5.2035 5.1025 5.2287 5.1729 5.1013 4.8650 4.7009 5.0535(0.0109) (0.0099) (0.0103) (0.0108) (0.0109) (0.0108) (0.0100) [0.1936]

ξ 0.0961 0.1262 0.1374 0.1483 0.2037 0.1324 0.1163 0.1372(0.0082) (0.0074) (0.0078) (0.0082) (0.0086) (0.0082) (0.0074) [0.0337]

Threshold 259 275 319 332 315 255 230 –Observations 19,337 23,371 21,370 20,019 21,378 19,897 21,548 –

Notes: The table reports parameter estimates from fitted the Generalized Pareto distribution shownin eq. (D-1). The input data are a 10% representative sample of all non-zero lights in the radiance-calibrated data above the defined threshold at the pixel level, where each pixel is 30 × 30 arcseconds. The last column reports the point average of the seven satellites and its standard deviationin brackets.

xv

Page 50: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure D-1 – Generalized Pareto CDF versus EDF, radiance-calibrated data in 2010

0 1000 2000 3000 4000 5000

0.00

00.

002

0.00

40.

006

0.00

80.

010

Light intensity in DN

Den

sity

EmpiricalEVT Model

Notes: Illustration of Generalized Pareto CDF fitted to the data and the empirical distributionfunction (EDF). The EDF and Generalized Pareto CDF are fitted to the top 4% of stable lights in2010. The input data are a 10% representative sample of all non-zero lights of the radiance-calibrateddata at the pixel level, where each pixel is 30× 30 arc seconds.

xvi

Page 51: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

E Additional results using the radiance-calibrated

data

This section complements the analysis in the paper by proving additional robustness

checks of our Pareto hypothesis using the seven radiance-calibrated satellites.

Visual Inspection: Panel (a) of Figure E-1 shows Zipf plots for the top 2% of lights

for each of the seven radiance-calibrated satellites. A Zipf plot is a visualization of the

Pareto survival function in logs. A linear Zipf plot is usually considered evidence in

favor of the Pareto distribution, but its practical relevance is being contested (Cirillo,

2013). Our plots for the lights data are qualitatively similar to those of the top incomes

literature, in that they display linear sections together with some initial curvature and

outliers at the end.5 It is well-known that Zipf plots often deviate from linearity at the

very top since fewer and fewer values are observed at the extremes. Sometimes this is

addressed by removing the very top. We use logarithmic bins so that the size of the

bins increases by a multiplicative factor (Newman, 2005). The sensitivity of Zipf plots

to outliers is compounded by instability and measurement errors afflicting the radiance-

calibrated satellites. While we conclude that the Zipf plot using the radiance-calibrated

data is ambiguous, we obtain a near-linear Zipf plot using the superior VIIRS data (see

the next section).

Panel (b) of Figure E-1 provides another graphical test for the Pareto distribution

based on ‘Van der Wijk’s Law’. The Pareto distribution is unique in that the average

above some level y is proportional to y at all points in the tail, with a factor of

proportionality equal to αα−1 > 1. The graph plots, for each DN on the x-axis, the

average luminosity of all pixels brighter than this value on the y-axis. As expected, we

observe a linear relationship with a slope above unity.

Tests against the lognormal distribution: As a robustness check, we pit the Pareto

distribution against other plausible candidates. We pay particular attention to the log-

normal distribution, since it is commonly used to describe the complete distribution of

incomes or city sizes.

Table E-1 shows the results from separate regressions of the empirical distribution

function on the Pareto CDF and the lognormal CDF based on the top 4% of the data.

The estimated coefficient for the Pareto CDF is closer to unity and the R2 is substantially

larger than in the lognormal counterpart (0.98 vs. 0.83).

5Working with any top share, from the top 5% to the top 1% gives qualitatively similar results, evenif the case for a Pareto distribution tends to be stronger the higher we set the threshold. This is in linewith the empirical literature on Pareto applications in other fields.

xvii

Page 52: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure E-2 visualizes this difference in fit for the year 2010. The lognormal CDF fits

the data poorly, while the Pareto CDF is always closer to the empirical distribution.

Unrestricted rank regressions: Recall that for Pareto-distributed observations yi,

i = 1, ...N , we have rank(yi) ≈ Nyαc y−αi , or, in logarithms log rank(yi)−logN ≈ α log yc−

α log yi. Hence, in the regression

log(

rank(yi)−1

2

)− logN = α1 log yc + α2 log yi + ε (E-1)

only the Pareto distribution satisfies the null hypothesis that −α1 = α2 with α2 < 0. As

before, we follow Gabaix and Ibragimov (2011) and subtract one half from the rank to

improve the OLS estimation of the tail exponent in the rank regression.

Table E-2 reports the OLS rank regression results of eq. (E-1) for all seven satellites

at various different thresholds, i.e. the top 5% to top 1%. The two coefficients are usually

very close and the R2s are high (0.96 – 0.99).6

The Hill estimator: If the null hypothesis −α1 = α2 = α is enforced in eq. (E-1), one

can directly obtain the parameter estimate for the Pareto α. In the main text we estimate

this parameter using OLS rank regressions. As a robustness check, we now use the Hill

estimator (Hill, 1975), αHill = (N − 1)(∑N−1

i=1 log yi − log yc

)−1, for the restricted rank

regression

log rank(yi)− logN ≈ α log yc − α log yi. (E-2)

Under the assumption of a Pareto distribution, the Hill estimator equals the efficient

maximum likelihood estimator and is known for its superior properties for fitting the tail

of the Pareto distribution (Soo, 2005, Eeckhout, 2009). The standard errors are given by

αHill/√N − 3 (see Gabaix, 2009).

Table E-3 report the results for all seven satellites at various different thresholds, i.e.

the top 5% to top 1%. The Pareto parameters obtained using the Hill estimator are very

similar to the OLS estimates in the main text. For the top 3-4%, the values are between

1.3 and 1.6 for the seven satellites, very close to the OLS average parameter estimate of

1.5. For higher thresholds, we observe also the same increase in the parameter estimate

that we observe in the OLS results.

6Note that formal statistical tests, e.g. tests of coefficient equality or Kolmogorov-Smirnoff tests, donot make much sense in huge samples such as ours. Gabaix and Ioannides (2004, p. 2350) capture thisnicely: “with an infinitely large dataset one can reject any non-tautological theory.” The extremely smallstandard errors lead to overrejections of the null hypothesis unless the empirical value equals exactly thetheoretical value.

xviii

Page 53: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure E-1 – Zipf plot and Van der Wijk’s plot

(a) Zipf

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

y in DN

1−F

(y)

10−

610

−5

10−

410

−3

10−

2

200 500 1000 2000 3000 4000 ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

1996199920002002200420052010

(b) Van der Wijk

0 500 1000 1500 2000 2500 3000

050

010

0015

0020

0025

0030

00

u

E[y

| y

> u

]

1996199920002002200420052010

Notes: Popular graphical tests for an approximate Pareto distribution in top lights. Panel a) showsthe Zipf plot for the top 2% of all pixels. The figure uses logarithmic binning to reduce noise andsampling errors in the right tail of the distribution (see Newman, 2005). There are about 100 binsin the tail, where the exact number depends on the range of the input data. Panel b) demonstratesVan der Wijk’s law, which states that the average light above some value u is proportional to u,this is E[y|y > u] ∝ u. Here, too, the data is the top 2% of all pixels. The input data are a 10%representative sample of all non-zero lights in the radiance-calibrated data at the pixel level.

xix

Page 54: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table E-1 – Regression of the EDF on theoretical CDFs, top 4%

Year 1996 1999 2000 2003 2004 2006 2010 Average

Panel a) Pareto CDF on RHS

Slope 1.0108 1.0551 1.0616 1.0575 1.0746 1.0722 1.0796 1.0588(0.0003) (0.0004) (0.0005) (0.0004) (0.0004) (0.0005) (0.0005) [0.0231]

Constant -0.0320 -0.0668 -0.0746 -0.0666 -0.0787 -0.0784 -0.0858 -0.0690(0.0002) (0.0002) (0.0003) (0.0002) (0.0003) (0.0003) (0.0003) [0.0177]

R2 0.9914 0.9866 0.9802 0.9884 0.9869 0.9860 0.9831 –Panel b) Lognormal CDF on RHS

Slope 0.9004 0.9265 0.9181 0.9387 0.9520 0.9472 0.9488 0.9331(0.0014) (0.0016) (0.0018) (0.0014) (0.0014) (0.0014) (0.0015) [0.0190]

Constant -0.1653 -0.2186 -0.2238 -0.1954 -0.2088 -0.2031 -0.2179 -0.2047(0.0011) (0.0013) (0.0015) (0.0011) (0.0011) (0.0011) (0.0012) [0.0200]

R2 0.8496 0.7913 0.7626 0.8508 0.8467 0.8480 0.8268 –

Notes: The table reports results of a regression of the empirical distribution function (EDF) on thePareto or lognormal CDF, using the top 4% of the data. The data are a 10% representative sampleof all non-zero lights in the radiance-calibrated data at the pixel level, where each pixel is 30×30 arcseconds. The last column reports the point average of the seven satellites and its standard deviationin brackets.

Figure E-2 – Pareto and lognormal CDF versus EDF, radiance-calibrated lights in 2010

500 1000 1500 2000

0.0

0.2

0.4

0.6

0.8

1.0

Light intensity in DN

CD

F

EmpiricalLognormalPareto

Notes: Illustration of the difference between the Pareto and lognormal CDFs fitted to the data andthe empirical distribution function (EDF). Note that the lognormal distribution was fitted to thewhole distribution rather than the tail because of its unimodal shape, while the Pareto distributionis estimated only on the tail. For comparison, we adjust the CDFs so that they all start at thetop 4% of radiance-calibrated lights in 2010. The input data are a 10% representative sample ofall non-zero lights in the radiance-calibrated data at the pixel level, where each pixel is 30× 30 arcseconds.

xx

Page 55: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table E-2 – Unrestricted rank regressions

Year 1996 1999 2000 2003 2004 2006 2010 Average

Panel a) Top 5%

yi -1.4334 -1.4996 -1.4630 -1.6903 -1.6933 -1.7388 -1.7170 -1.6050(0.0012) (0.0012) (0.0013) (0.0013) (0.0013) (0.0014) (0.0015) [0.1330]

yc 1.4736 1.5632 1.5318 1.7539 1.7582 1.8087 1.7936 1.6690(0.0014) (0.0015) (0.0016) (0.0016) (0.0015) (0.0017) (0.0018) [0.1405]

R2 0.9694 0.9651 0.9600 0.9701 0.9721 0.9677 0.9620 0.9666Observations 96,685 116,858 106,914 100,095 106,899 99,487 107,745 104,955

Panel b) Top 4%

yi -1.5130 -1.6328 -1.6165 -1.8403 -1.8513 -1.9101 -1.9056 -1.7528(0.0015) (0.0014) (0.0016) (0.0016) (0.0014) (0.0017) (0.0018) [0.1612]

yc 1.5618 1.6974 1.6870 1.8978 1.9132 1.9767 1.9804 1.8163(0.0017) (0.0017) (0.0019) (0.0018) (0.0016) (0.0019) (0.0020) [0.1655]

R2 0.9662 0.9661 0.9623 0.9725 0.9759 0.9711 0.9658 0.9685Observations 77,348 93,484 85,482 80,075 85,489 79,590 86,196 83,952

Panel c) Top 3%

yi -1.6609 -1.8385 -1.8624 -2.0491 -2.0633 -2.1470 -2.1746 -1.9708(0.0019) (0.0018) (0.0019) (0.0019) (0.0016) (0.0020) (0.0021) [0.1882]

yc 1.7225 1.9017 1.9340 2.1044 2.1174 2.2068 2.2429 2.0328(0.0022) (0.0020) (0.0022) (0.0021) (0.0018) (0.0022) (0.0024) [0.1872]

R2 0.9646 0.9695 0.9691 0.9761 0.9811 0.9762 0.9721 0.9727Observations 58,011 70,115 64,111 60,058 64,134 59,692 64,647 62,967

Panel d) Top 2%

yi -1.9711 -2.1628 -2.2315 -2.3687 -2.3478 -2.4809 -2.5663 -2.3042(0.0025) (0.0023) (0.0022) (0.0023) (0.0018) (0.0024) (0.0025) [0.2009]

yc 2.0329 2.2180 2.2831 2.4156 2.3880 2.5295 2.6215 2.3555(0.0029) (0.0025) (0.0025) (0.0025) (0.0020) (0.0026) (0.0027) [0.1974]

R2 0.9698 0.9757 0.9798 0.9825 0.9871 0.9826 0.9807 0.9797Observations 38,673 46,742 42,740 40,039 42,756 39,794 43,097 41,977

Panel e) Top 1%

yi -2.5471 -2.7216 -2.7241 -2.8508 -2.7006 -2.9769 -3.1652 -2.8123(0.0039) (0.0031) (0.0031) (0.0030) (0.0027) (0.0032) (0.0029) [0.2049]

yc 2.5922 2.7593 2.7596 2.8823 2.7258 3.0097 3.2005 2.8471(0.0043) (0.0034) (0.0034) (0.0033) (0.0029) (0.0035) (0.0031) [0.2031]

R2 0.9781 0.9849 0.9864 0.9889 0.9895 0.9886 0.9911 0.9868Observations 19,337 23,373 21,372 20,020 21,377 19,898 21,551 20,990

Notes: The table reports OLS results obtained from the unrestricted rank regressions eq. (E-1) atvarious relative thresholds. The input data are a 10% representative sample of all non-zero lightsin the radiance-calibrated data above the defined threshold at the pixel level, where each pixel is30× 30 arc seconds. Standard errors are in parentheses. The last column reports the point averageof the seven satellites and its standard deviation in brackets.

xxi

Page 56: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table E-3 – Parameter estimates from rank regressions (Hill estimator)

Year 1996 1999 2000 2003 2004 2006 2010 Average

Panel a) Top 5%

Pareto α 1.2286 1.1833 1.1289 1.3112 1.3100 1.3356 1.3012 1.2570(0.0040) (0.0035) (0.0035) (0.0041) (0.0040) (0.0042) (0.0040) [0.0780]

Observations 96,685 116,858 106,914 100,095 106,899 99,487 107,745 –

Panel b) Top 4%

Pareto α 1.2487 1.2689 1.2233 1.4431 1.4315 1.4666 1.4333 1.3593(0.0045) (0.0042) (0.0042) (0.0051) (0.0049) (0.0052) (0.0049) [0.1065]

Observations 77,348 93,484 85,482 80,075 85,489 79,590 86,196 –

Panel c) Top 3%

Pareto α 1.2948 1.4152 1.3805 1.6023 1.6234 1.6672 1.6478 1.5188(0.0054) (0.0053) (0.0055) (0.0065) (0.0064) (0.0068) (0.0065) [0.1509]

Observations 58,011 70,115 64,111 60,058 64,134 59,692 64,647 –

Panel d) Top 2%

Pareto α 1.5068 1.6869 1.7536 1.8920 1.9325 1.9860 2.0095 1.8239(0.0077) (0.0078) (0.0085) (0.0095) (0.0093) (0.0100) (0.0097) [0.1832]

Observations 38,673 46,742 42,740 40,039 42,756 39,794 43,097 –

Panel e) Top 1%

Pareto α 2.0363 2.2458 2.2613 2.4101 2.3582 2.5190 2.6558 2.3552(0.0146) (0.0147) (0.0155) (0.0170) (0.0161) (0.0179) (0.0181) [0.2011]

Observations 19,337 23,373 21,372 20,020 21,377 19,898 21,551 –

Notes: The table reports the results of the restricted rank regression eq. (E-2) using the Hillestimator. The data are a 10% representative sample of all non-zero lights in the radiance-calibrateddata at the pixel level, where each pixel is 30 × 30 arc seconds. The last column reports the pointaverage of the seven satellites and its standard deviation in brackets.

xxii

Page 57: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

F Additional results using the VIIRS data

Since October 2011, the first satellite of the Suomi National Polar Partnership Visible

Infrared Imagining Radiometer Suite (NPP-VIIRS) has been in orbit. The VIIRS day-

night-band (DNB) on-board sensors have a much higher native resolution of 15 arc

seconds, are radiometrically calibrated, do not suffer from top-coding, and record a

physical quantity (radiance). This section complements the analysis in the paper by

proving additional robustness checks of our Pareto hypothesis using this new data.

Although the new system is undoubtedly superior in many respects, comparability

with the previous series is limited for at least two reasons: i) the first annual VIIRS

composite made available by NOAA refers to the year 2015, so that there is no temporal

overlap with the 1992-2013 DMSP-OLS series, ii) the VIIRS satellites have an overpass

time around midnight, in contrast to the evening hours of the DMSP-OLS satellites,

so that it is not entirely clear what kind of production and consumption activity they

capture (Elvidge et al., 2014, Nordhaus and Chen, 2015). While we do not rely on the

VIIRS data for our replacement procedure, we use them as another robustness check for

whether the Pareto distribution holds. The VIIRS data are particularly insightful in this

respect because of their superior quality.

To compare the higher resolution VIIRS image to the DMSP data, we resample the

raster to the DMSP resolution and then extract radiances of each pixel at the locations

of the 10% sample that we have been using thus far. Naturally, there are considerable

differences in the scale since the VIIRS-DNB records radiance. Note that radiance is

measured in nano watt per steradian per square centimeter (10−9Wcm−2sr−1). The

difference in scale is reflected in the summary statistics of the VIIRS data. The mean is

3.98, the standard deviation is 18.65, and the maximum is 6567.42. The spatial Gini is

much higher using the VIIRS data than in the radiance-calibrated data (0.79 vs. 0.60-

0.65) which is owed to their improved sensors and finer resolution. Nevertheless, the top

tail of the light distribution essentially exhibits the same properties.

Figure F-1 shows the Zipf plot for the VIIRS data. The shape is nearly linear, even

high up in the tail and displays less curvature than the corresponding plot for the radiance-

calibrated data. This also suggests that the radiance-calibration process introduces noise

and understates the Paretian nature of night lights.

Table F-1 replicates the results of the rank regressions from the previous section using

the VIIRS data. The results are qualitatively similar to those obtained with the radiance-

calibrated data, but some small differences are notable. In particular, the estimated shape

parameters are a bit higher for top shares around 3% to 5% but then also appear to be

more stable in the upper tail. Since the VIIRS data are from five years after the most

recent radiance-calibrated image and have a different overpass time, it is difficult to

identify the source of these discrepancies.

xxiii

Page 58: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure F-1 – Zipf plot using the top 2% of pixels in the VIIRS data

●●

●●

●●

●●

●●

●●●●●

●●

●●●●●●●●

●●●●●

●●●●●●●

●●●●

●●●

●●

●●●●

●●●●

●●●●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

y in radiance

1−F

(y)

10−

610

−5

10−

410

−3

10−

2

100 200 500 1000 2000 3000

Viirs 2015

Notes: The figure shows a Zipf plot for the top 2% of all pixels of the VIIRs data, after resamplingthe data to the DMSP-OLS grid and resolution. The figure uses logarithmic binning to reduce noiseand sampling errors in the right tail of the distribution (see Newman, 2005). There are about 140bins in the tail, where the exact number depends on the range of the input data. The VIIRs pixelscorrespond to the same 10% representative sample of all non-zero lights in the radiance-calibrateddata at the pixel level obtained from Hsu et al. (2015) and used in the rest of the paper.

xxiv

Page 59: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table F-1 – Rank regressions based on the VIIRS data in 2015

Unrestricted regressions Hill estimates Restricted regressions

Panel a) Top 5%

yi -1.9603 α 1.5876 α 1.7331(0.0013) (0.0065) (0.0100)

yc 2.0405(0.0015)

R2 0.9879Observations 59,633 59,633 59,633

Panel b) Top 4%

yi -2.0831 α 1.7331 α 1.8747(0.0013) (0.0079) (0.0121)

yc 2.1479(0.0015)

R2 0.9909Observations 47,705 47,705 47,705

Panel c) Top 3%

yi -2.2150 α 1.9144 α 2.0419(0.0014) (0.0101) (0.0153)

yc 2.26220.0016

R2 0.9933Observations 35,780 35,780 35,780

Panel b) Top 2%

yi -2.3438 α 2.1671 α 2.2481(0.0017) (0.0140) (0.0206)

yc 2.3665(0.0019)

R2 0.9939Observations 23,854 23,854 23,854

Panel e) Top 1%

yi -2.3778 α 2.4716 α 2.4235(0.0033) (0.0226) (0.0314)

yc 2.3682(0.0036)

R2 0.9888Observations 11,927 11,927 11927

Notes: The table uses the VIIRS data to repeat three regressions which were conducted with theradiance-calibrated data before: the unrestricted OLS rank regression eq. (E-1) and the restrictedregression eq. (E-2) using both the OLS and the Hill estimator. Standard errors are reportedin paretheses. For the OLS restricted rank regression, these are the asymptotic standard errorscomputed as (2/N)1/2. The data are a 10% representative sample of all non-zero lights in theradiance-calibrated data at the pixel level, where each pixel is 30× 30 arc seconds.

xxv

Page 60: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

G An analytical top-coding correction

Researchers are often interested in aggregate measures, such as average luminosity or

light inequality in a region or a country. Here we present simple formulas to correct these

summary statistics for top-coding. These corrections work with arbitrary thresholds and

Pareto shape parameters.

Mean luminosity: The top-coding corrected mean luminosity µ of a country or region

is simply the weighted average of the bottom and top means µB and µT . If the latter is

the mean of a Pareto distribution starting at yc, we have

µ = ωBµB + (1− ωB)µT = ωBµB + (1− ωB)α

α− 1yc (G-1)

where ωB and ωT = 1 − ωB are the shares of pixels below and above the threshold. A

simple numerical illustration shows how correcting for top-coding drives up the mean

luminosity. If top-coding starts at yc = 55, affects 5% of the study area of interest, α

is 1.5 and mean luminosity in the non-top-coded pixels is µB = 10, then the corrected

mean luminosity is 17.75 rather than 12.25.

Spatial Gini coefficients: The overall Gini coefficient can be written as the weighted

sum of the bottom-share and top-share Ginis (i.e., the within-group Gini) as well as the

difference between the top share of total lights minus the top share of pixels (i.e., the

between-group Gini), such that

G = ωBφBGB + ωTφTGT + [φT − ωT ], (G-2)

where the shares of all light accruing to the top and bottom groups are φB = ωBµB/µ

and φT = ωTµT/µ, and GT = 1/(2α−1). A greater share of top-coded pixels ωT , brighter

top-coded pixels φT , and a greater spread in the distribution of the top-coded data GT

all increase the size of the correction.

The above decomposition of the Gini coefficient can be derived by defining the Gini

coefficient over multiple groups as in Mookherjee and Shorrocks (1982)

G =1

2N2µ

∑i

∑j

|yi − yj| (G-3)

=1

2N2µ

∑k

∑i∈Nk

∑j∈Nk

|yi − yj|+∑i∈Nk

∑j /∈Nk

|yi − yj|

(G-4)

=∑k

(Nk

N

)2µkµGk +

1

2N2µ

∑k

∑i∈Nk

∑j /∈Nk

|yi − yj| . (G-5)

xxvi

Page 61: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

where GK is the within group Gini coefficient of group k. The second term is a measure

of group overlap including their between group differences.

Perfect separation (no overlap between groups) implies∑

i∈Nk

∑j∈Nh |yi − yj| =

NkNh |µk − µh|. Hence, we can simplify equation eq. (G-5) to

G =∑k

(Nk

N

)2µkµGk +

∑k

∑h

NkNh

2N2µ|µk − µh| . (G-6)

With two bottom and top groups k, h ∈ {B, T} (where µT > µB) and some algebra,

this becomes

G =

(NB

N

)2µBµGB +

(NT

N

)2µTµGT +

[(NT

N

)2µTµ− NT

N

]. (G-7)

Now define the pixel shares below and above the threshold as ωB and ωT , where

ωT = 1 − ωB and the group’s share of all income (light) as φB = ωBµBµ

and φT = ωTµTµ

to obtain eq. (G-2) above.

xxvii

Page 62: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

H Characteristics of the corrected data

In this section we compare the back-on-the-envelop analytical corrections from the

previous section with our corrected data at the pixel level, examine the correlations

between our corrected data and the radiance calibrated data, and discuss the size of the

top-coding correction around the world.

Table H-1 reports the mean luminosities and global Gini coefficients before and after

the correction for each satellite, using both the analytic formulas and the data corrected

at the pixel level. Our geo-referenced pixel-level replacement comes close to the analytic

solutions but is generally more conservative (due to the fixed upper bound). Mean

luminosity increases on average from 12.7 DN to 15.3–16.6 DN and inequality in lights

from 0.47 to 0.56–0.59.

Table H-2 reports mean luminosity and the Gini coefficient of inequality for 2010, using

a wider range of parameters as robustness checks in the analytical correction eq. (G-2).

Working with a smaller (larger) parameter than our benchmark α = 1.5 implied more

(less) inequality in the tail of the light distribution. The corrections are consequently

larger (smaller). We can also see that parameter values of 1.4–1.6 only lead to very small

differences in the magnitude of the correction. Also, using a higher α does not change

the magnitude of the correction as much as using a smaller α, as the comparison of the

extreme values of 1.2 and 1.8 shows.

Figure H-1 plots the time series graph of global inequality in lights from 1992 to

2013, both before and after the top-coding correction based on eq. (G-2). Parameter

values of 1.4 and 1.6 serve as comparison bands for the benchmark case of 1.5. The

global distribution of lights became slightly more unequal over the 1990s, remained flat

in the first decade of the new millennium and then became temporarily more equal in the

aftermath of the global financial crises and great recession. However, this year-to-year

variation is completely swamped by the size of the top-coding correction.

Table H-3 provides another comparison check for our pixel-level corrected data. It

shows the correlations between the corrected lights and the radiance-calibrated data for

the seven years where both are available. These figures refer to the whole distribution,

not just the top. Remember that in the correction procedure, we only rely on the ranks

for the top, not the precise radiance-calibrated values, and infer values from the Pareto

distribution. It is all the more remarkable that our corrected values turn out to be

strongly correlated to the radiance-calibrated values, with correlations of 0.94–0.96 for

all seven available years.

The global summary statistics presented so far hide a lot of between-country

heterogeneity. Figure H-2 illustrates the size of the correction in different countries with

various scatter plots. As expected, the same characteristics that drive the number of

top-coded pixels (see Section 2 in the paper) turn out to be predictive of the size of

xxviii

Page 63: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

the correction in terms of country-wide mean luminosity and inequality in light. The

correction is strongly increasing in GDP per capita, weakly in country size and moderately

in population density. Numerous developing countries experience sizable corrections (such

as Egypt, Paraguay or Mexico). City states, such as Singapore, have large top-coding

corrections, as do smaller countries, like Israel and Estonia. Nevertheless, even large

countries like the US experience a sizable increase in both mean luminosity (plus 7

DN) and the Gini coefficient (plus 14 percentage points). No single factor captures

all the relevant heterogeneity. Instead, the correction is a complex function of the spatial

equilibrium, that is, the size, number, and brightness of larger cities in each country.

xxix

Page 64: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table H-1 – Satellite level statistics of the top-coding correction

Satellite Top Top share (light) Mean luminosity Gini coefficientshare Unadj Adj Unadj Form Pixel Unadj Form Pixel

(pixels) Adj Adj Adj Adj

F101992 0.0381 0.1663 0.3096 13.83 17.81 16.70 0.4390 0.5626 0.5334F101993 0.0312 0.1568 0.2938 11.96 15.23 14.28 0.4593 0.5737 0.5456F101994 0.0349 0.1754 0.3207 12.02 15.67 14.60 0.4783 0.5980 0.5684F121994 0.0420 0.1733 0.3176 14.65 19.04 17.74 0.4353 0.5634 0.5316F121995 0.0376 0.1733 0.3174 13.09 17.02 15.85 0.4580 0.5813 0.5505F121996 0.0351 0.1670 0.3068 12.69 16.37 15.25 0.4607 0.5801 0.5494F121997 0.0394 0.1766 0.3210 13.45 17.57 16.31 0.4540 0.5799 0.5477F121998 0.0418 0.1816 0.3276 13.89 18.25 16.90 0.4474 0.5774 0.5436F121999 0.0467 0.1915 0.3412 14.74 19.62 18.08 0.4447 0.5802 0.5449F141997 0.0316 0.1739 0.3169 10.98 14.29 13.28 0.4876 0.6047 0.5747F141998 0.0305 0.1680 0.3067 10.94 14.13 13.13 0.4883 0.6023 0.5720F141999 0.0278 0.1648 0.3011 10.15 13.06 12.13 0.4895 0.6017 0.5714F142000 0.0318 0.1689 0.3062 11.34 14.67 13.59 0.4852 0.6003 0.5687F142001 0.0350 0.1817 0.3276 11.64 15.30 14.16 0.4856 0.6069 0.5754F142002 0.0377 0.1872 0.3375 12.14 16.08 14.90 0.4896 0.6126 0.5818F142003 0.0382 0.1930 0.3409 11.96 15.96 14.65 0.4928 0.6177 0.5836F152000 0.0370 0.1685 0.3063 13.25 17.13 15.89 0.4399 0.5647 0.5308F152001 0.0354 0.1645 0.3011 12.93 16.64 15.46 0.4463 0.5679 0.5351F152002 0.0372 0.1700 0.3085 13.18 17.08 15.82 0.4465 0.5710 0.5370F152003 0.0270 0.1582 0.2894 10.28 13.11 12.17 0.4982 0.6055 0.5751F152004 0.0276 0.1642 0.2979 10.08 12.97 12.00 0.5080 0.6163 0.5853F152005 0.0279 0.1604 0.2953 10.44 13.36 12.43 0.5115 0.6171 0.5886F152006 0.0293 0.1666 0.2988 10.56 13.63 12.55 0.5135 0.6217 0.5892F152007 0.0279 0.1547 0.2844 10.74 13.68 12.69 0.5049 0.6099 0.5795F162004 0.0340 0.1734 0.3129 11.82 15.38 14.23 0.4641 0.5863 0.5528F162005 0.0285 0.1642 0.2993 10.44 13.43 12.46 0.4926 0.6040 0.5732F162006 0.0348 0.1707 0.3041 12.26 15.91 14.61 0.4714 0.5908 0.5546F162007 0.0403 0.1861 0.3302 13.05 17.28 15.86 0.4624 0.5916 0.5554F162008 0.0395 0.1832 0.3285 12.97 17.11 15.78 0.4702 0.5961 0.5622F162009 0.0417 0.1862 0.3356 13.50 17.87 16.54 0.4694 0.5966 0.5644F182010 0.0591 0.2033 0.3614 17.55 23.73 21.89 0.4258 0.5720 0.5361F182011 0.0494 0.2020 0.3595 14.78 19.95 18.41 0.4552 0.5936 0.5598F182012 0.0576 0.2118 0.3734 16.44 22.45 20.68 0.4361 0.5838 0.5481F182013 0.0578 0.2151 0.3800 16.23 22.28 20.55 0.4389 0.5876 0.5530

Average 0.0374 0.1765 0.3194 12.65 16.56 15.34 0.4691 0.5917 0.5595

Notes: The table reports summary statistics of the global lights data before the top-coding correctionand after the analytical, formula-based correction at the aggregate level (eq. (G-1) and eq. (G-2)) aswell as the pixel-level correction from the paper. Column 1 reports the share of pixels above 55 DN,Column 2 and 3 the share of lights emitted by these top pixels respectively in the unadjusted andadjusted data set. Columns 4-6 and 7-9 report the mean luminosity and Gini coefficient, respectivelyfor the unadjusted data, the analytical, formula-based correction and the pixel-level corrected data.All corrections use α = 1.5 and yc = 55 for the Pareto tail.

xxx

Page 65: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table H-2 – Correction of global mean and Gini coefficient in 2010, different parameters

Unadj. Pareto parameter α =

1.2 1.3 1.4 1.5 1.6 1.7 1.8

Mean luminosity 17.55 33.49 28.07 25.36 23.73 22.65 21.88 21.30Spatial Gini 0.4258 0.6954 0.6372 0.5990 0.5720 0.5519 0.5363 0.5240

Notes: The table computes the top-coding corrected mean and Gini coefficient of global inequalityin lights for the year 2010 with different α parameters based on eq. (G-1) and eq. (G-2) with yc = 55.The input data are a representative 10% sample of non-zero lights from satellite F182010.

Figure H-1 – Global Gini coefficient in lights before and after the correction

1995 2000 2005 2010

0.40

0.45

0.50

0.55

0.60

0.65

0.70

Spa

tial G

ini C

oeffi

cien

t

alpha = 1.7alpha = 1.5alpha = 1.3original

Notes: Illustration of the global top-coding correction. The figure shows global inequality inlights calculated by eq. (G-2) using the specified Pareto shape parameters. The input data area representative 10% sample of non-zero lights. For years when more than two satellites flewconcurrently, the values were averaged.

xxxi

Page 66: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table H-3 – Correlations between corrected data and radiance-calibrated data

Radiance-calibrated

Corrected 1996 1999 2000 2003 2004 2006 2010

1996 0.95461999 0.95432000 0.95872003 0.94542004 0.95922006 0.96142010 0.9530

Notes: The table reports correlations between the corrected lights and the radiance calibrated lightsfor the seven years where the radiance-calibrated data are available. The corrections are based onsatellites F121996, F141999, F152000, F152003, F162004, F162006 and F182010.

xxxii

Page 67: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure H-2 – Size of the correction and country characteristics

(a) Mean correction vs. GDP per capita

●●

● ●

●●

● ●

● ●

●●

●●●

●●

●●● ●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

● ●

●●

●●

●● ●

●●●

● ●

●●

● ●●

0 20000 40000 60000 80000

05

1015

GDP per capita (in 2011 PPPs)

Siz

e of

cor

rect

ion

(in D

N)

BEL BRN

CHE

CHL GBR

ISR

JPN

KOR

LUX

NLDNOR

OMN

PRI

SAU

TTO

USA

(b) Gini correction vs. GDP per capita

● ●

● ●

●●

● ●

● ●

● ●

●●

●●

●●● ●

●●

●●

●●

●●

●●●

●●●

●● ●

●●

●●

●●

●● ●

●●

● ●

●● ●

0 20000 40000 60000 80000 100000 120000

0.0

0.1

0.2

0.3

0.4

GDP per capita (in 2011 PPPs)

Siz

e of

cor

rect

ion

(in G

ini)

ARE

BHR

BRN

CHE

CHL

HKG

ISR

JPN

KOR

KWT

LUX

MEX

NLDNOR

OMN

QAT

SAU

SGP

USA

(c) Mean correction vs. log area

●●

●●

●●●

● ●

● ●

●●

●●●

●●●

●●

●●

●●

●●

●●

● ●

●●

●●

● ●●

●●

●●●

●●

●●

● ●

● ●

●●●

● ●●

●●

●●

● ●●

6 8 10 12 14 16

05

1015

Log land area

Siz

e of

cor

rect

ion

(in D

N)

AFG

AGO

ARG

AUS

BEL

BLR

BOL

BRA

BRN

CAN

CHL

CHN

EGY

ESP

EST

FIN

GBR

ISR

JOR

JPN

KOR

LBYLVA MEXMYS

NLDNOR

OMN

PRI

PRT

PRYPSE

RUS

SAU

SWE

TTO

USAVUT

(d) Gini correction vs. log area

●●

●●

●●

●●

● ●

●●

● ●

●●

●●

●●

●●

●●

●●

●●●

●●●

●●

●●

●●

● ●

●●●

● ●

●●

●● ●

5 10 15

0.0

0.1

0.2

0.3

0.4

Log land area

Siz

e of

cor

rect

ion

(in G

ini)

AGO

ARE

ARG

AUS

BEL

BHR

BLRBOL

BRABRNCAN

CHL

CHNEGY

ESPEST

FINGBR

HKG

ISR

JOR

JPN

KAZ

KOR

KWT

LBYLVA

MEX

MYSNOR

PRI PRY

QAT

RUS

SAU

SGP

TKM

TTO USAVUT

(e) Mean correction vs. log density

●●

●●

●●

● ●

●●

●●

● ●●

● ●●

●●

●●

● ●

●●

●●

● ●

● ●

●●

●●●

●●

●● ●

●●

●●

●●

● ●

●●●

●●●

● ●

●●

●●●

0 2 4 6

05

1015

Log population density

Siz

e of

cor

rect

ion

(in D

N)

AFG

AGO

ARGBEL

BGD

BLR

BOL

BRA

BRN

CAN

CHL

CHN

EGY

ESP

EST

FIN

GBR

ISR

JOR

JPN

KOR

LBYLVA MEX

MUS

MYS

NLDNOR

OMN

PRI

PRT

PRYPSE

SAU

SWE

TTO

USAVUT

(f) Gini correction vs. log density

●●

● ●

●●

●●

●●

● ●

●●

●●

● ●

●●

●●

●●

● ●

●●

●●●

●●

●●

●●

● ●

●●●

●●

● ●

●●●

0 2 4 6 8 10

0.0

0.1

0.2

0.3

0.4

Log population density

Siz

e of

cor

rect

ion

(in G

ini)

AGO

ARE

ARG

BEL

BGD

BHR

BLRBOL

BRA BRNCAN

CHL

CHNEGY

ESPEST

FINGBR

HKG

ISR

JOR

JPN

KAZ

KOR

KWT

LBY LVA

MEX

MUS

MYSNOR

PRIPRY

PSE

QAT

RUS

SAU

SGP

TKM

TTOUSAVUT

Notes: Illustration of how the correction of means and Gini coefficients, based on eq. (G-1) andeq. (G-2), correlate with log GDP per capita (PPP), log land area, and log population density.The data are a 10% representative sample of all non-zero lights in satellite F182010. GDP andpopulation data are from the World Development Indicators. For display purposes, the left panelsexclude countries with a correction of mean luminosity larger than 20 DN, these are Singapore,Hong Kong, Qatar, Bahrain, Kuwait, and the UAE. These countries are included in all panels onthe right.

xxxiii

Page 68: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

I Benchmarking exercises

Light-output elasticities at the national level: To validate our corrected data, we

estimate light-output elasticities at the national level in the spirit of Henderson et al.

(2012). Henderson et al. (2012) run fixed effects regressions of log GDP on their measure

of log lights per square kilometer. They report an income elasticity of lights around 0.28.

To revisit their results and show that top-coding even plays a small role at the national

level, we construct a matched sample of the stable lights data, our top-coding corrected

lights and the radiance-calibrated data for the seven years they have in common over the

period from 1996 to 2010.

Table H-1 shows that we are able to replicate their result and—even at the highly

aggregated country level—improve marginally with our top-coding corrected data. The

corrected data always yield the highest within-R2, no matter if we use lights per square

kilometer or lights per capita. As expected, the results are not materially different though,

so that for an analysis of the national light-output relation either data can be used without

explicitly considering the role of top-coding.

Table H-2 digs deeper and investigates the role of top-coding in rich and densely

populated countries. We define a ‘rich-and-dense’ dummy which is unity for countries

that were richer and more densely populated than the world average in 2000. 32% of our

global sample are in this group, including, Singapore and Kuwait, but also Germany and

South Korea. Although the light-output relation is weaker in these countries as shown in

columns (1) and (2), the correction still makes a small difference. We cannot reject the

null hypothesis that in these countries the relationship between lights per km2 and GDP

is zero in the stable lights data, but we are able to reject this hypothesis in the corrected

data. This has little economic implication though; the estimated coefficients are quite

close in both data sources.

Light-output elasticities at the subnational level For this benchmarking exercise,

we chose Germany as an example of a rich and decentralized country. Germany has a

widespread geographic distribution of economic activity and excellent regional accounts.

Lessmann et al. (2015) compile GDP per capita estimates and other statistics at the

district level in Germany over the period from 2000 to 2011. To these data, we add

our estimates of lights per capita from the stable lights data and our corrected data.

Lessmann et al. (2015) show that the light-output elasticity using the standard data is not

only substantially lower for German districts than at the country level, it even becomes

indistinguishable from zero when population and area are used as controls. While we

can mimic these results for the stable lights data, they are overturned once we use the

corrected series. This suggests that top-coding is one of primary reasons why night lights

are worse predictors of output in developed rather than developing countries.

xxxiv

Page 69: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table H-3 compares the light output elasticities obtained from cross-sectional

regressions using the stable lights data (columns 1 to 3) and our corrected data (columns

4 to 6). The informational value of the stable lights variation drops as population, area,

an urban dummy, and an former federal republic dummy are added to the regression.7 In

fact, columns (2) and (3) suggest that there is little value added in using night lights to

predict local output once population and the spatial extent of the region are accounted for.

This picture changes completely with the corrected data in columns (4) to (6). The sub-

national light output elasticity rises substantially and remains highly significant even in

the presence of additional controls. Our corrected data without top-coding correlate much

better with local output. Instead of falling, the cross-sectional light-output elasticity

reaches levels comparable to its national counterpart.

Table H-4 shows that this result carries over to changes in light and output over

time albeit with some qualifications. It reports regression results using panel data from

2000 to 2011. The inclusion of state and time fixed effects decreases the light-output

elasticity to one third of its national counterpart with the stable lights data. With

the corrected data, most specifications recover results comparable to the cross-sectional

estimates provided we use within-state variation, see columns (4) and (5). Only once

we include municipality FEs, both estimates in columns (3) and (6) come very close.

This occurs for two reasons. First, the within-district variation is very small to begin

with, making it difficult to empirically trace out the light-output relationship. Second,

since the urban structure of Germany is very mature, our correction primarily affects the

cross-sectional ordering of districts, rather than their ranking over time. In any case, the

correction either substantially improves estimates of the light-output relationship at the

local level or, at the minimum, provides comparable answers.

Figure H-1 plots the average light intensity per square kilometer indicated by both

data sources over population density in German districts in order to better understand the

nature of the correction. Clearly, the two series begin to diverge after a density of about

one thousand people per km2. The stable lights series displays an asymptotic movement

towards the top-coding threshold of slightly above 100 DN whereas the corrected series

is approximately linear in population density.8 Another notable feature is that the

three brightest cities according to the stable lights data are all medium-sized cities with

populations (well) below half a million (Herne, Oberhausen and Gelsenkirchen) whereas

the corrected data perfectly identifies the three largest and most populated economic

centers as the brightest (Frankfurt, Munich and Berlin).

7The former FRG dummy is included, since the East has benefited substantially from publicinvestments in unified Germany and also adopted energy-saving lights differently from the West.

8Note that when working with square kilometers rather than 30 × 30 arc second pixels, the stablelights series in lights per square kilometer is not bounded by 63 DN because of the division by area.Most pixels in Germany are smaller than 1 km2 but bigger than 500 m2, hence the theoretical upperbound is slightly above 100 DN, eg. 63 DN/0.5 km2 = 126 DN/km2. This occurs because the platecarree projection keeps the area of each pixel constant in degrees not kilometers.

xxxv

Page 70: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

In sum, top-coding is a major issue for estimating urban-rural differences, precisely

because there the cross-sectional comparison matters most. Using the German data, we

can establish two findings which are likely to hold in other settings as well. First, top-

coding rises with urbanization. Second, the economic ranking of cities is counter-intuitive

in the stable lights data but a sensible ranking emerges after the correction.

xxxvi

Page 71: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table H-1 – Light-output elasticity, country-level, 1996–2010

GDP in 2005 PPPs GDP per capita in 2005 PPPs

(1) (2) (3) (4) (5) (6)Stable Corrected Radiance Stable Corrected Radiance

Lights per km2 0.275∗∗∗ 0.278∗∗∗ 0.233∗∗∗

(0.067) (0.064) (0.054)

Lights per capita 0.279∗∗∗ 0.280∗∗∗ 0.239∗∗∗

(0.056) (0.055) (0.047)

Within-R2 0.721 0.725 0.712 0.540 0.543 0.526Observations 1288 1288 1288 1288 1288 1288Countries 186 186 186 186 186 186

Note(s): The table reports panel FE estimates. Lights per capita and lights per km2 are measuredin logs. All columns include country and time fixed-effects. The specifications are variants ofln yit = β ln lightsit + x′itγ + ci + st + εit where xit is a vector of controls, ci is a country fixedeffect, and st are time dummies. Country-clustered standard errors in parentheses. Significant at:∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Table H-2 – Light-output elasticity with interactions, country-level, 1996–2010

GDP in 2005 PPPs GDP per capita in 2005 PPPs

(1) (2) (3) (4) (5) (6)Stable lights Corrected Radiance Stable lights Corrected Radiance

Lights per km2 0.280∗∗∗ 0.287∗∗∗ 0.256∗∗∗

(0.066) (0.063) (0.058)

Richdense × Lights per km2 -0.185∗∗∗ -0.165∗∗∗ -0.170∗∗∗

(0.036) (0.034) (0.063)

Lights per capita 0.289∗∗∗ 0.291∗∗∗ 0.256∗∗∗

(0.056) (0.055) (0.052)

Richdense × Lights per capita -0.103∗∗ -0.104∗∗ -0.102(0.044) (0.041) (0.068)

F -test of sum (p-value) 0.162 0.042 0.061 0.002 0.001 0.004Within-R2 0.728 0.732 0.717 0.543 0.547 0.529Observations 1288 1288 1288 1288 1288 1288Countries 186 186 186 186 186 186

Note(s): The table reports panel FE estimates. Lights per capita and lights per km2 are measuredin logs. All columns include country and time fixed-effects. The specifications are variants ofln yit = β1 ln lightsit + β2(ln lightsit ×Di) + x′itγ + ci + st + εit where Di is a dummy for rich anddense countries, xit is a vector of controls, ci is a country fixed effect, and st are time dummies.Country-clustered standard errors in parentheses. Significant at: ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗

p < 0.01.

xxxvii

Page 72: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table H-3 – Light-output elasticity, Germany at NUTS-3 level, cross-section

Dependent variable: GDP per capita

Saturated data Corrected data(1) (2) (3) (4) (5) (6)

Lights per capita 0.194∗∗∗ 0.126 0.157∗ 0.206∗∗∗ 0.258∗∗ 0.342∗∗∗

(0.019) (0.094) (0.093) (0.018) (0.101) (0.101)

Population 0.215∗∗∗ 0.266∗∗∗ 0.247∗∗∗ 0.313∗∗∗

(0.041) (0.042) (0.041) (0.042)

Area in km2 -0.132∗∗ -0.122∗∗ -0.037 0.010(0.062) (0.061) (0.068) (0.068)

Urban -0.158∗∗∗ -0.162∗∗∗

(0.043) (0.042)

Former FRG 0.242∗∗∗ 0.251∗∗∗

(0.022) (0.023)

Adjusted-R2 0.216 0.434 0.526 0.255 0.442 0.540Regions 412 412 412 412 412 412

Notes: The table reports cross-sectional OLS estimates. All columns include a constant (not shown).Lights per capita, population and area are all measured in logs. Urban and former FRG are binaryvariables. The specifications are variants of ln yi = β ln lightsi + x′iγ + εi, where xit is a vector ofcontrols. The data have been averaged over the period from 2000 to 2011. Robust standard errorsin parentheses. Significant at: ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

xxxviii

Page 73: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table H-4 – Light-output elasticity, Germany at NUTS-3 level, 2000–2011

Dependent variable: GDP per capita

Saturated data Corrected data(1) (2) (3) (4) (5) (6)

Lights per capita 0.101∗ 0.115∗ 0.046∗∗∗ 0.213∗∗∗ 0.234∗∗∗ 0.035∗∗∗

(0.059) (0.060) (0.008) (0.067) (0.068) (0.008)

Population 0.220∗∗∗ 0.270∗∗∗ -0.550∗∗∗ 0.242∗∗∗ 0.294∗∗∗ -0.572∗∗∗

(0.038) (0.039) (0.077) (0.036) (0.039) (0.077)

Area in km2 -0.143∗∗∗ -0.155∗∗∗ -0.062 -0.071(0.039) (0.040) (0.045) (0.045)

Urban -0.131∗∗∗ -0.137∗∗∗

(0.043) (0.043)

Time FE Yes Yes Yes Yes Yes YesState FE Yes Yes No Yes Yes NoMunicipality FE No No Yes No No Yes

Within-R2 0.599 0.610 0.794 0.605 0.617 0.793Observations 4944 4944 4944 4944 4944 4944Regions 412 412 412 412 412 412

Notes: The table reports panel fixed effects estimates. Lights per capita, population and areaare all measured in logs. Urban is a binary variable. The specifications are variants of ln yit =β ln lightsit+x′itγ + ci+st+ εit where xit is a vector of controls, ci is the state or municipality fixedeffect, and st are time dummies. NUTS-3-clustered standard errors in parentheses. Significant at:∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

xxxix

Page 74: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure H-1 – Light intensity versus population density in German regions

●●

●●

● ●

●●

●●

●●

●● ●

●● ● ●

● ●

●●

●●

●●

●●

●● ●●

●●● ●

●●

● ●●● ●

●●

●●●

●●

●●● ●● ●● ●●●

●● ●

●●

●●

●●

●●●

● ●●●●

●●●●●

● ●● ●● ●●●

● ●●● ● ●●●●● ●●●

● ●● ●●● ●● ●●●●●● ●●●●●●●●●●●●●●●●● ●●●●●● ●●●●●●●●●●●●●● ●●● ● ●●● ●●●●●●●●●●●●●●●● ●●●●●●● ●●●●●●●●●●●●●●●●●●● ●● ●●● ●●●●●●●●●●●●●●● ●● ●●● ●●●●●●●●●●●●●●●●●●●●●●●● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

0 1 2 3 4

050

100

150

Population per square km (in 1000s)

Ave

rage

DN

per

squ

are

km

●● ●● ●●● ●●● ●●

●●●● ● ●●

● ● ●●●● ● ●●● ● ● ●●● ●● ●● ●●● ●● ● ●● ●● ●● ●●●●●●●● ●● ●● ●

●●●●●●● ●●●● ●●● ● ●● ●●● ●● ●●● ●● ●● ●●● ●● ●● ●● ● ●●●● ●●

●● ●●●●● ●●●●● ●● ●● ●● ●● ●●● ● ●●●●● ●●● ●● ●● ●●● ●● ●●●●●● ●●●●●●●●●●● ●●●●●● ●●●●●● ●●●●●●●●●●●●●● ●●● ● ●●● ●●●●●●●●●●●●●●●● ●●●●●●● ●●●●●●●●●●●●●●●●●●● ●● ●●● ●●●●●●●●●●●●●●● ●● ●●● ●●●●●●●●●●●●●●●●●●●●●●●● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

FrankfurtMunich

Berlin

Hamburg

Frankfurt MunichBerlin

Hamburg

CorrectedStable lights

Notes: Illustration of the value added of the top-coding corrected data for urban economics. Thedata are cross-sectional averages of lights per km2 and population density in German NUTS-3 regionsover the period from 2000 to 2011.

xl

Page 75: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

J Additional results for African cities

Here we present additional results complementing our application on cities in Sub-Saharan

Africa. We discuss the urban extents and their growth over time, as well as summary

statistics by country.

Figure J-1 shows the urban extents of selected cities and compares them with Google

Earth images at the end of the periods we use to delineate the urban extents, i.e., 12/1994

and 12/2013. We can see that the urban areas detected by our algorithm coincide well

with built-up structures. The footprints using the Abrahams et al. (2018) approach are

considerably larger than comparable extents obtained using DN thresholding (not shown)

and correspond more closely to population dispersion on the ground. Abrahams et al.

(2018) report similar results based on a systematic benchmarking exercise of urban areas

derived by this approach to those obtained from a remotely sensed built-up grid.

Table J-1 gives an overview of the countries included in our study. The table reports

the names of the primary city, the number of secondary cities, the annual growth rate

of stable lights, and the annual growth rate of corrected lights. Within all countries the

corrections are larger in primary than in secondary cities. Whether secondary cities grew

faster than primary cities over the period from 1992 to 2013 differs across countries, but

the top-coding correction often tilts the scale in favor of primary cities.

Table J-2 looks at city growth at the urban fringes, examining whether there are

differences between primary and secondary cities in terms of growth outside the urban

fringe. In the main text, we show that primary cities exhibit stronger intensive growth

than secondary cities within the initial 1992 to 1994 boundaries after the top-coding

correction. Table J-2 demonstrates that they also grew faster at the fringes. As

expected, the top-coding correction does not have a noticeable impact at the urban

fringe. Expanding cities are typically first adding dim lights as the suburban area expands,

which then intensify as the areas become part of the city proper. Combining this with the

finding from the main text underlines that primate cities are consolidating their dominant

position.

Table J-3 examines the how inner-city structures have been changing across different

time periods. As in the main text, we regress either the coefficient of variation or Moran’s

I on a linear time trend, the interaction with primacy and city-level luminosity, but we

now split the sample into the first decade (1992–2002) and second decade (2003–2013).

Confirming our previous result, we observe a decrease in inequality in lights which is

highly significant in both periods. The increase in fragmentation (a decreasing Moran’s

I) is a more recent phenomenon. In the first period, increasing fragmentation was driven

by fast-growing cities, as indicated by the negative and highly significant coefficient on

log mean lights. This effect still persists in the second period, that is, once satellite fixed-

effects are included, but is complemented by a negative time trend in Moran’s I (net of

xli

Page 76: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

growth effects).

Table J-4 explores the impact of the changing structure of African cities documented

above on city growth at the intensive margin. To study this question, we regress

luminosity within initial boundaries of each city on the coefficient of variation or Moran’s

I in the previous year, a linear time trend, an interaction with primacy, and a combination

of city and satellite-year fixed effects. The results are ambiguous. However, the estimated

effect sizes are relatively small and lose significance in the presence of satellite fixed

effects. Continued fragmentation could have negative consequences for the growth of

African cities, but more research is necessary to establish the sign and size of this effect.

xlii

Page 77: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Figure J-1 – Urban extents of selected cities in SSA

(a) Lagos, 12/1994 (b) Lagos, 12/2013

(c) Luanda, 12/1994 (d) Luanda, 12/2013

(e) Johannesburg, 12/1994 (f) Johannesburg, 12/2013

Notes: Illustration of the urban extents detection algorithm presented in the text. Note thedifferences in map scale. Comparison of 1992-1994 urban footprint with December 1994 andDecember 2013 Landsat/ Copernicus images obtained via Google Earth Pro. Google Earth imagesare used as part of their “fair use” policy. All rights to the underlying maps belong to Google.

xliii

Page 78: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table J-1 – Intensive Growth: Annualized growth rates of cities in Africa, 1992–2013

Country Primate city Annual growth (primate) Secondary Annual growth (secondary)

Stable lights Corrected cities Stable lights Corrected

Angola Luanda 0.0282 0.0834 4 0.0497 0.0545Benin Cotonou 0.0273 0.0301 2 0.0441 0.0437Botswana Gaborone 0.0328 0.0406 10 0.0453 0.0453Burkina Faso Ouagadougou 0.0265 0.0325 3 0.0439 0.0443Burundi Bujumbura 0.0114 0.0123 0Cameroon Douala 0.0193 0.0250 8 0.0253 0.0264CAR Bangui -0.0076 -0.0076 0Chad Ndjamena 0.0300 0.0324 1 0.0294 0.0294Congo Brazzaville 0.0124 0.0217 2 0.0375 0.0390Congo (D.R.) Kinshasa 0.0074 0.0181 11 0.0222 0.0229Cote d’Ivoire Abidjan 0.0128 0.0229 13 0.0376 0.0378Djibouti Djibouti 0.0205 0.0247 0Eritrea Asmara 0.0217 0.0221 1 -0.0813 -0.0813Ethiopia Addis

Abbeba0.0224 0.0245 3 0.0334 0.0333

Gabon Libreville 0.0138 0.0182 4 0.0225 0.0238Ghana Accra 0.0214 0.0281 18 0.0303 0.0306Guinea Conakry 0.0263 0.0268 1 -0.0118 -0.0118Guinea-Bissau

Bissau 0.0154 0.0154 0

Kenya Nairobi 0.0188 0.0209 13 0.0115 0.0117Lesotho Maseru 0.0390 0.0409 0Madagascar Antananarivo 0.0288 0.0309 5 0.0242 0.0242Malawi Blantyre 0.0167 0.0206 5 0.0128 0.0133Mali Bamako 0.0225 0.0298 1 0.0520 0.0520Mauritania Nouakchott 0.0294 0.0365 2 0.0332 0.0329Mozambique Maputo 0.0341 0.0475 11 0.0468 0.0472Namibia Windhoek 0.0130 0.0198 13 0.0211 0.0216Niger Niamey 0.0188 0.0181 4 0.0304 0.0304Nigeria Lagos 0.0175 0.0244 59 0.0151 0.0157Rwanda Kigali 0.0212 0.0218 0Senegal Dakar 0.0213 0.0329 9 0.0326 0.0334SierraLeone Freetown 0.0270 0.0270 0Somalia Mogadishu 0.0635 0.0635 0South Africa Johannesburg 0.0087 0.0178 204 0.0248 0.0254Sudan Alkhartum 0.0209 0.0318 21 0.0293 0.0297Swaziland Mbabane 0.0246 0.0254 7 0.0239 0.0242Tanzania Daressalaam 0.0230 0.0245 15 0.0178 0.0178Togo Sokode 0.0200 0.0228 2 0.0129 0.0129Uganda Kampala 0.0317 0.0366 4 0.0227 0.0227Zambia Lusaka 0.0185 0.0293 18 0.0288 0.0293Zimbabwe Harare 0.0011 0.0014 19 0.0023 0.0023

Notes: The table reports lists the name of the primary city as well as the number of secondarycities for each Sub-Saharan African country, as defined in the text. The annualized intensive growthrate in mean lights is computed from 1992 to 2013, for both the stable lights and the top-codingcorrected lights. For primary cities, these growth rates refer to the primary city of the country. Forsecondary cities, the reported growth rate is an average of the secondary cities in the country.

xliv

Page 79: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table J-2 – Extensive growth regressions for African cities, 1992-2013

Dependent variable: Log mean lights in the fringe

Stable lights data Corrected data(1) (2) (3) (4) (5) (6)

Linear trend 0.054∗∗∗ 0.051∗∗∗ 0.036∗∗∗ 0.054∗∗∗ 0.051∗∗∗ 0.037∗∗∗

(0.002) (0.002) (0.005) (0.002) (0.002) (0.005)

Primate × linear trend 0.024∗∗∗ 0.024∗∗∗ 0.024∗∗∗ 0.024∗∗∗

(0.005) (0.005) (0.005) (0.005)

City FE Yes Yes Yes Yes Yes YesSatellite FE No No Yes No No YesObservations 7101 7101 7101 7103 7103 7103Cities 367 367 367 367 367 367

Notes: The table reports the results of city-level panel regressions using the stable lights and top-coding corrected data at the city fringes, defined as the the area outside the 1992 bounds. Theregressions capture growth at the extensive margin. The specifications are variants of ln lightsijt =β1t+ β2(t×Pij) + cij + st + εijt where t is a linear time trend, Pij is an indicator for primate cities,cij is a city fixed effect and st contains satellite dummies. Standard errors clustered at the city levelare in parentheses. Significant at: ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

xlv

Page 80: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table J-3 – Trends in fragmentation of African cities, 1992–2013, split sample period

Dependent variable:

Coefficient of Variation Moran’s I(1) (2) (3) (4) (5) (6)

Panel a) Period from 1992 to 2002

Linear trend -0.380∗∗∗ -0.354∗∗∗ 0.340∗∗∗ 0.007 0.005 0.044(0.044) (0.046) (0.084) (0.020) (0.021) (0.041)

Log light intensity -16.819∗∗∗ -16.759∗∗∗ -15.957∗∗∗ -2.116∗∗∗ -2.121∗∗∗ -1.487∗∗

(3.162) (3.169) (3.504) (0.580) (0.582) (0.579)

Primate × lin. trend -0.355∗∗ -0.360∗∗ 0.025 0.021(0.144) (0.145) (0.053) (0.055)

Panel b) Period from 2003 to 2013

Linear trend -0.550∗∗∗ -0.498∗∗∗ -0.623∗∗∗ -0.170∗∗∗ -0.171∗∗∗ -0.572∗∗∗

(0.056) (0.057) (0.053) (0.035) (0.036) (0.051)

Log light intensity 0.224 0.084 -1.350 0.470 0.473 -1.556∗∗∗

(1.033) (1.032) (1.244) (0.511) (0.511) (0.595)

Primate × lin. trend -0.589∗∗∗ -0.601∗∗∗ 0.010 -0.008(0.128) (0.126) (0.070) (0.067)

City FE Yes Yes Yes Yes Yes YesSatellite FE No No Yes No No YesObservations (a) 6370 6370 6370 6370 6370 6370Observations (b) 5310 5310 5310 5310 5310 5310Cities (both) 531 531 531 531 531 531

Notes: The table reports the results of city-level panel regressions using the top-coding correcteddata. The specifications are variants of Fijt = β1t+β2(t×Pij)+β3 ln lightsijt+cij +st+εijt, whereFijt is either the coefficient of variation or Moran’s I, t is a linear time trend, Pij is an indicator forprimate cities, cij is a city fixed effect and st contains satellite dummies. Standard errors clusteredat the city level are in parentheses. Both measures of fragmentation have been scaled by 100 forreadability. Significant at: ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

xlvi

Page 81: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Table J-4 – Impact of fragmentation on city growth, 1992–2013

Dependent variable: Log mean lights

(1) (2) (3) (4) (5) (6)

Linear trend 0.012∗∗∗ 0.012∗∗∗ 0.005∗∗∗ 0.013∗∗∗ 0.013∗∗∗ 0.006∗∗∗

(0.001) (0.001) (0.001) (0.001) (0.001) (0.001)

Primate × linear trend 0.005∗∗ 0.006∗∗∗ 0.006∗∗∗ 0.006∗∗∗

(0.002) (0.002) (0.002) (0.002)

Lagged CV -0.003∗∗∗ -0.002∗∗ -0.001(0.001) (0.001) (0.001)

Lagged Moran -0.004∗∗∗ -0.004∗∗∗ -0.001(0.001) (0.001) (0.001)

City FE Yes Yes Yes Yes Yes YesSatellite FE No No Yes No No YesObservations 11149 11149 11149 11149 11149 11149Cities 531 531 531 531 531 531

Notes: The table reports the results of city-level panel regressions using the top-coding correcteddata, where either the lagged coefficient of variation or Moran’s I are used as regressors (Fij,t−1).The specifications are variants of ln lightsijt = β1t + β2(t × Pij) + Fij,t−1 + cij + st + εijt where tis a linear time trend, Pij is an indicator for primate cities, cij is a city fixed effect and st containssatellite dummies. Standard errors clustered at the city level are in parentheses. Both measures offragmentation have been scaled by 100 for readability. Significant at: ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗

p < 0.01.

xlvii

Page 82: CESifo Working Paper no. 7411 · VfS Annual Meeting. We have greatly benefited from discussions with the participants. We would like to thank Alexei Abrahams, Gordon Anderson, Gerda

Additional references

Abrahams, A., C. Oram, and N. Lozano-Gracia (2018). Deblurring DMSP nighttime lights: A newmethod using gaussian filters and frequencies of illumination. Remote Sensing of Environment 210,242–258.

Cirillo, P. (2013). Are your data really Pareto distributed? Physica A: Statistical Mechanics and itsApplications 392 (23), 5947–5962.

Coles, S. (2001). An Introduction to the Statistical Modeling of Extreme Values. Springer Series inStatistics.

Eeckhout, J. (2009). Gibrat’s law for (all) cities: Reply. American Economic Review 99 (4), 1676–1683.

Elvidge, C. D., K. E. Baugh, J. B. Dietz, T. Bland, P. C. Sutton, and H. W. Kroehl (1999).Radiance calibration of DMSP-OLS low-light imaging data of human settlements. Remote Sensing ofEnvironment 68 (1), 77–88.

Elvidge, C. D., F.-C. Hsu, K. E. Baugh, and T. Ghosh (2014). National trends in satellite-observedlighting. In Q. Weng (Ed.), Global Urban Monitoring and Assessment through Earth Observation,Taylor & Francis Series in Remote Sensing Applications, Chapter 6, pp. 97–119. CRC Press.

Elvidge, C. D., D. Ziskin, K. E. Baugh, B. T. Tuttle, T. Ghosh, D. W. Pack, E. H. Erwin, and M. Zhizhin(2009). A fifteen year record of global natural gas flaring derived from satellite data. Energies 2 (3),595–622.

Gabaix, X. (2009). Power laws in economics and finance. Annual Review of Economics 1 (1), 255–294.

Gabaix, X. and R. Ibragimov (2011). Rank - 1/2: A simple way to improve the ols estimation of tailexponents. Journal of Business & Economic Statistics 29 (1), 24–39.

Gabaix, X. and Y. Ioannides (2004). The evolution of city size distribution. In J. V. Henderson andJ. F. Thisse (Eds.), Handbook of Regional and Urban Economics (1 ed.), Volume 4, Chapter 53, pp.2341–2378. Elsevier.

Henderson, J. V., A. Storeygard, and D. N. Weil (2012). Measuring economic growth from outer space.American Economic Review 102 (2), 994–1028.

Hill, B. (1975). A simple general approach to inference about the tail of a distribution. The Annuals ofStatistics 3 (5), 1163–1174.

Hsu, F.-C., K. E. Baugh, T. Ghosh, M. Zhizhin, and C. D. Elvidge (2015). DMSP-OLS radiancecalibrated nighttime lights time series with intercalibration. Remote Sensing 7 (2), 1855–1876.

Lessmann, C., A. Seidel, and A. Steinkraus (2015). Satellitendaten zur Schatzung vonRegionaleinkommen: Das Beispiel Deutschland. ifo Dresden berichtet 22 (6), 35–42.

Mookherjee, D. and A. Shorrocks (1982). A decomposition analysis of the trend in UK income inequality.Economic Journal 92 (368), 886–902.

Newman, M. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics 46 (5),323–351.

Nordhaus, W. and X. Chen (2015). A sharper image? Estimates of the precision of nighttime lights asa proxy for economic statistics. Journal of Economic Geography 15 (1), 217–246.

Soo, K. T. (2005). Zipf’s law for cities: A cross-country investigation. Regional Science and UrbanEconomics 35 (3), 239–263.

Tuttle, B., S. Anderson, P. Sutton, C. Elvidge, and K. Baugh (2013). It used to be dark here.Photogrammetric Engineering and Remote Sensing 79 (3), 287–297.

Ziskin, D., K. Baugh, F. C. Hsu, T. Ghosh, and C. Elvidge (2010). Methods used for the 2006 radiancelights. Proceedings of the Asia-Pacific Advanced Network 30, 131–142.

xlviii