Evaluating district-level income distribution for India using nighttime satellite imagery and other datasets

Tilottama Ghosh, CIRES, University of Colorado, Boulder, USA, and Indicus Analytics Private Limited, New Delhi, India

Mayuri Chaturvedi, Indicus Analytics Private Limited, New Delhi, India

Laveesh Bhandari, Indicus Analytics Private Limited, New Delhi, India

Chris Elvidge, NOAA National Geophysical Data Center (NGDC), Boulder, Colorado, USA

Kim Baugh, CIRES, University of Colorado, Boulder, USA

India Geospatial Forum, Gurgaon, Haryana, 8th February, 2012

Page 2: Overview

• Introduction
• Research objective
• Methods – data used
• Analysis
  – Step 1: State-level graphical analysis
  – Step 2: Model 1
  – Step 3: Model 2
• Results
• Discussion
• Conclusion and future considerations

Page 3: Introduction

Why use nightlights to study income distribution?

• Inclusive growth is one of the major policy thrust areas in the current as well as the next Five-Year Plan
• Income distribution data are not easy to come by
• Limitations include:
  – Under-reporting, over-reporting, misreporting
  – Inappropriate sampling and/or weighting
  – Lack of standardization across sampling organizations
  – Enormous expense involved in data collection
  – Political and economic situations in areas inhibiting data collection
  – Huge time lags between collection and publication, and low frequency of data collection
  – Coarse spatial resolution, Modifiable Areal Unit Problem
• Nightlights (NL) can help circumvent these problems

Page 4: Research objective

In this paper, we examine the relationship between nightlights and income distribution, as captured by the number of households in different income brackets. We then include other datasets to improve the estimation.

• Use multinomial regression techniques to study the statistical relationship
• Map the prediction errors to identify the regions with the largest estimation errors
• Use socio-economic insights to understand probable reasons behind the errors

Page 5: Methods

Data used

Radiance-calibrated nighttime image of India, 2004. Source: NOAA, NGDC

LandScan population data, 2004. Source: Oak Ridge National Laboratory with United States Department of Energy

State and districts shapefile of India. Source: Indicus Analytics Pvt. Ltd.

Page 6: Methods

Data used

• Three categories of households defined on the basis of annual household income
  – Upper income households (earning more than Rs 10 lakh per annum)
  – Middle income households (earning Rs 3-10 lakh per annum)
  – Lower income households (earning less than Rs 3 lakh per annum)
• Sum of lights extracted for the states and the districts
• Area calculated for the districts
• Total population extracted for the districts
• Percentage of rural population in each district calculated from Indicus' data repository, which comprises urban, rural, and total population
• Sum of lights and number of households in each income category graphed at the state level
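The data preparation listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' workflow: it assumes the nighttime image, the population grid, and a district-ID grid are already aligned NumPy arrays of the same shape. The names `radiance`, `population`, `district_id`, `zonal_sum`, and `income_category` are hypothetical, and the grid values are made up.

```python
import numpy as np

LAKH = 100_000  # 1 lakh = 100,000 rupees

def income_category(annual_income_rs: float) -> str:
    """Assign a household to one of the three brackets on the slide."""
    if annual_income_rs < 3 * LAKH:
        return "lower"    # less than Rs 3 lakh per annum
    if annual_income_rs <= 10 * LAKH:
        return "middle"   # Rs 3-10 lakh per annum
    return "upper"        # more than Rs 10 lakh per annum

def zonal_sum(values: np.ndarray, zone_id: np.ndarray, n_zones: int) -> np.ndarray:
    """Sum pixel values per zone; zone_id holds integers 0..n_zones-1."""
    return np.bincount(zone_id.ravel(), weights=values.ravel(), minlength=n_zones)

# Made-up 3x4 grids standing in for the real rasters (illustrative only).
radiance = np.array([[0.0, 1.0, 2.0, 0.0],
                     [3.0, 4.0, 0.0, 1.0],
                     [0.0, 0.0, 5.0, 2.0]])
population = np.array([[10.0, 20.0, 30.0, 5.0],
                       [40.0, 50.0, 5.0, 10.0],
                       [5.0, 5.0, 60.0, 20.0]])
district_id = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1],
                        [2, 2, 2, 2]])

sum_of_lights = zonal_sum(radiance, district_id, n_zones=3)
total_population = zonal_sum(population, district_id, n_zones=3)
print(sum_of_lights, total_population)
```

Using `np.bincount` with weights keeps the per-district aggregation to a single pass over the grid; a real workflow would first rasterize the district shapefile onto the image grid, which is outside this sketch.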

Page 7: Analysis – Step 1

State-level graphical analysis: Lower income households

[Figure: state-level scatter plot. X-axis: Sum of lights ('000), 0 to 3500. Y-axis: Total households earning Rs 0-3 lakh ('000), 0 to 35000. States labelled: MH, AP, RJ, TN, KR, GJ, UP, PB, MP, HR, WB, DEL, KL, OR, CH, BI, JH, J&K, UK. R² = 0.61]
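The slide reports an R² for the state-level plot but does not say how it was computed. As a rough sketch, assuming it comes from a simple least-squares line of households on sum of lights, it could be obtained as below. The function name `r_squared_linear` and the data values are hypothetical and purely illustrative.

```python
import numpy as np

def r_squared_linear(x, y) -> float:
    """R² of an ordinary least-squares straight-line fit of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 fit: [slope, intercept]
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals ** 2))   # unexplained variation
    ss_tot = float(np.sum((y - y.mean()) ** 2))  # total variation
    return 1.0 - ss_res / ss_tot

# Made-up state-level values (both in '000), not taken from the slides.
lights = [500, 1000, 1500, 2000, 3000]
households = [4000, 9000, 8000, 15000, 20000]
r2 = r_squared_linear(lights, households)
print(round(r2, 2))
```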

Page 8: Analysis – Step 1

State-level graphical analysis: Middle income households

[Figure: state-level scatter plot. X-axis: Sum of lights ('000), 0 to 3500. Y-axis: Total households earning Rs 3-10 lakh ('000), 0 to 2000. States labelled: MH, AP, RJ, TN, KR, GJ, UP, PB, MP, HR, WB, DEL, KL, OR, CH, BI, JH, J&K, UK. R² = 0.81]

Page 9: Analysis – Step 1

State-level graphical analysis: Upper income households

[Figure: state-level scatter plot. X-axis: Sum of lights ('000), 0 to 3500. Y-axis: Total households earning more than Rs 10 lakh ('000), 0 to 500. States labelled: MH, AP, RJ, TN, KR, UP, PB, MP, HR, WB, DEL, KL, CH, BI, J&K. R² = 0.77]

Page 10: Analysis – Step 1

State-level graphical analysis: inferences

• Lights clearly have a relationship with the number of households in different income categories, but they do not capture the entire picture at the state level
• Examples highlight the need for analysis at a finer spatial resolution
  – Maharashtra and Andhra Pradesh (similar lights, dissimilar incomes)
  – Madhya Pradesh and Rajasthan (similar incomes, dissimilar lights)
  – Uttar Pradesh in the graph and in the NL image (variegated lighting pattern)
• The complex role of population is highlighted

Page 11: Analysis – Step 2, Developing Model 1

Model 1: Using nighttime lights and dummy variables

The pattern between nighttime lights and household income suggested a logarithmic relationship

Page 12: Analysis – Step 2, Developing Model 1

Model 1

Dummy variables were created for commercially and administratively important districts, which are also high population zones

Page 13: Analysis – Model 1

Model 1: Hypotheses of the model

• While we can have data on households in different income brackets, we can obtain information only on the total sum of lights in a region
• Hypothesis One: NL should be more closely associated with the richer households in any given region than with the poorer ones
• Hypothesis Two: NL will most likely tend to under-estimate the number of poor households and over-estimate the number of rich households
• Logarithmic multivariate regression model used for all three income categories, with the same predictor variables

[Figure labels: Number of households; Contribution to nightlights]

Page 14: Analysis – Model 1

Model 1: Model coefficients

ln Y = α + β1 ln(X1) + β2 X2 + β3 X3 + β4 X4 + β5 X5

[Table: estimated coefficients for the lower, middle and upper income models]

* Significant at the 99% confidence level, $ Significant at the 95% confidence level, # Significant at the 90% confidence level
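A minimal sketch of how a model of this form could be fitted, assuming ordinary least squares on the log-transformed variables. The slides do not name the estimation software, so this is not the authors' implementation. The function `fit_log_linear`, the four generic dummy columns, the "true" coefficients, and all data below are hypothetical and synthetic.

```python
import numpy as np

def fit_log_linear(households, lights, dummies):
    """Least-squares fit of ln(Y) = a + b1*ln(X1) + b2*X2 + ... for 0/1 dummies X2.."""
    ln_y = np.log(np.asarray(households, dtype=float))
    ln_x1 = np.log(np.asarray(lights, dtype=float))
    d = np.asarray(dummies, dtype=float)
    # Design matrix columns: intercept, ln(sum of lights), dummy variables.
    design = np.column_stack([np.ones_like(ln_x1), ln_x1, d])
    coef, *_ = np.linalg.lstsq(design, ln_y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((ln_y - fitted) ** 2)
    ss_tot = np.sum((ln_y - ln_y.mean()) ** 2)
    n, p = design.shape                      # p counts the intercept too
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return coef, float(adj_r2)

# Synthetic districts generated from assumed coefficients (illustrative only).
rng = np.random.default_rng(0)
n = 200
lights = rng.uniform(10.0, 5000.0, size=n)
dummies = rng.integers(0, 2, size=(n, 4))    # four generic 0/1 dummies
TRUE_COEF = np.array([2.0, 0.8, 1.5, 1.0, 0.5, 0.3])
ln_households = (TRUE_COEF[0] + TRUE_COEF[1] * np.log(lights)
                 + dummies @ TRUE_COEF[2:] + rng.normal(0.0, 0.05, size=n))
households = np.exp(ln_households)

coef, adj_r2 = fit_log_linear(households, lights, dummies)
print(np.round(coef, 2), round(adj_r2, 3))
```

Fitting the same design matrix once per income category, as the deck describes, would only require calling the function three times with a different household count vector.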

Page 15: Analysis – Model 1

Model 1: Inferences

• The relationship between NL and household counts tightens as income goes up, as seen in the higher adjusted R² values for the middle and upper income category models
• Magnitude of the coefficient for NL (β1) increases as we move from the lower to the higher income segments
• Most of the predictor variables are significant at the 99% confidence level
• Coefficients of all dummy variables go up monotonically for higher income groups
• Lights are better able to estimate households in more affluent categories (Hypothesis One)
• β's are consistently highest for the Metropolitan dummy, followed by the dummy for Suburbs of Metros, in all three models

Page 16: Analysis – Model 1

Model 1: Discussion

• Error maps were created to study the pattern of the relationship between nighttime lights and the number of households in each income category
• Under-estimation of the number of households was observed in the lower income category for highly populated states with over 80% rural population
• Under-estimation of upper income households by NL was observed in the high population density states of UP, Bihar and Kerala
• Under-estimation was smaller for upper- and middle-income households
• Over-estimation of lower income households in border districts of Rajasthan
• Over-estimation of lower income households in the agriculturally rich states of Punjab and Haryana
• Thus, both Hypothesis One and Hypothesis Two proved to be true

Page 17: Analysis – Step 3, Developing Model 2

Model 2: Using nighttime lights, population density data, and an additional dummy variable

Population density calculated at the district level

A dummy variable created for districts with a rural population share greater than 80%
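The two additional Model 2 predictors are simple derived quantities. A short sketch follows, assuming district areas in square kilometres (the slides do not state the unit) and using made-up district figures; the function names `population_density` and `rural_dummy` are hypothetical.

```python
import numpy as np

def population_density(total_population, area_km2):
    """Persons per unit area for each district (area unit is an assumption)."""
    return np.asarray(total_population, dtype=float) / np.asarray(area_km2, dtype=float)

def rural_dummy(rural_population, total_population, threshold_pct=80.0):
    """1 where the rural share of population is strictly greater than the threshold."""
    rural = np.asarray(rural_population, dtype=float)
    total = np.asarray(total_population, dtype=float)
    pct_rural = 100.0 * rural / total
    return (pct_rural > threshold_pct).astype(int)

# Made-up figures for three districts (illustrative only).
total_pop = [1_000_000, 2_500_000, 800_000]
rural_pop = [850_000, 250_000, 640_000]
area = [2000.0, 500.0, 4000.0]

density = population_density(total_pop, area)   # 500, 5000, 200 persons per km²
is_rural = rural_dummy(rural_pop, total_pop)    # rural shares: 85%, 10%, 80%
print(density, is_rural)
```

Note that a district at exactly 80% rural is not flagged, following the slide's wording "greater than 80%".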

Page 18: Analysis – Model 2

Model 2: Model coefficients

ln Y = α + β1 ln(X1) + β2 ln(X2) + β3 X3 + β4 X4 + β5 X5 + β6 X6 + β7 X7

[Table: estimated coefficients for the lower, middle and upper income models]

* Significant at the 99% confidence level, $ Significant at the 95% confidence level, # Significant at the 90% confidence level

Page 19: Analysis – Model 2

Model 2: Inferences

• Inclusion of population density and the dummy variable for districts with rural population greater than 80% increases the R² for all three income categories
• The highest percentage increase in R² (about 13%) is seen for households in the lowest income category
• Magnitude of the coefficient for NL (β1) is highest for the higher income group
• Magnitude of the coefficient for population density (β2) is lowest for the higher income group
• The rural population indicator is most significant for the lowest income group
• In fact, the rural indicator is negatively correlated with the middle and upper income households
• Coefficients of all other dummy variables go up monotonically for higher income groups

Page 20: Results

Comparing error maps of Model 1 and Model 2

[Figure: error maps for lower income households, Model 1 and Model 2]

Page 21: Results

Comparing error maps of Model 1 and Model 2

[Figure: error maps for middle income households, Model 1 and Model 2]

Page 22: Results

Comparing error maps of Model 1 and Model 2

[Figure: error maps for upper income households, Model 1 and Model 2]

Page 23: Discussion

• A good relationship exists between nighttime lights and income distribution at the district level, and the relationship is stronger for households in the highest income category
• Inclusion of population density and the dummy variable for districts with rural population greater than 80% yields the greatest improvement in the estimates of lower income households
• A study of the error maps shows that, in general, Model 2 expands the yellow areas in the maps (-5 to +5% error), which we consider 'acceptable' percentage errors, across all the income groups
• Districts with anomalous nightlight-based estimates of economic activity tend to show some of the following characteristics: high population density in urban areas; a large share of rural population; large expanses of unlit cultivated land; lack of government provision of public amenities; presence of affluent farmers; presence of military bases along border areas
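The error-map classification described above can be illustrated with a short sketch. The slides do not give the error formula, so the definition used here, (predicted - actual) / actual × 100 with positive values meaning over-estimation, is an assumption. The ±5% band is the 'acceptable' range from the slide; the function names and the household counts are hypothetical and made up.

```python
import numpy as np

def percent_error(predicted, actual):
    """Signed percentage error; positive means over-estimation (assumed convention)."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return 100.0 * (predicted - actual) / actual

def error_class(pct_err, band=5.0):
    """-1 = under-estimate, 0 = within the acceptable band, +1 = over-estimate."""
    pct_err = np.asarray(pct_err, dtype=float)
    return np.where(pct_err < -band, -1, np.where(pct_err > band, 1, 0))

# Made-up household counts for four districts (illustrative only).
actual = [1000, 2000, 500, 4000]
predicted = [900, 2050, 560, 4000]

errors = percent_error(predicted, actual)      # -10.0, 2.5, 12.0, 0.0
classes = error_class(errors)                  # -1, 0, 1, 0
share_acceptable = float(np.mean(classes == 0))
print(errors, classes, share_acceptable)
```

Comparing `share_acceptable` between two models gives a single-number counterpart to the visual impression that the acceptable areas expand from Model 1 to Model 2.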

Page 24: Conclusion and future considerations

• Analysis of nightlights at a finer spatial resolution is more effective for understanding and using these remotely sensed spatial data as a proxy for economic activity
• The same holds true for spatial population data
• The developed models (with further improvements) can be used to estimate households in different income categories for years when such data are not available
• These models can be useful in studying income inequality
• Land use, land cover, and vegetation cover are some of the additional variables that can be considered for improving the model

Page 25:

Thank You!!

Questions?