216 Journal of Economic Perspectives
Figure 3A: The Basic Column Chart
[Column chart titled "Figure 2: Discounted Expected Lifetime Earnings, VN(t)". Vertical axis: income in thousands, 400 to 1,100. Columns: finish no school; finish 1 yr; finish 3 yrs; grad 2.0 GPA; grad 3.0 GPA; grad 3.75 GPA.]
Source: Stinebrickner and Stinebrickner (2013).
Figure 3B: The Revised Column Chart
[Chart titled "Discounted Expected Lifetime Earnings, VN(t) (Income in thousands)". Value axis: 0 to 1,200. Categories: Finish no school; Finish 1 year; Finish 3 years; Graduate 2.0 GPA; Graduate 3.0 GPA; Graduate 3.75 GPA.]
Source: Author's calculations using numbers inferred from text in Stinebrickner and Stinebrickner (2013).
An Economist's Guide to Visualizing Data 217
The 3D Chart
Figure 4A uses the now-familiar 3D effect. In such graphs, the third dimension does not plot data values, but it does add clutter to the chart and, worse, it can distort the information. Look at the far-right-hand bar, labeled 6 percent. No point of the column touches the gridline for that value. This software tool, like many others, uses perspective to give depth to the imaginary plane that runs across the top of the column, intersecting the gridline. But most readers will perceive the actual value of the column as less than 6 percent. Figure 4B shows a redesign: cancel the 3D treatment and integrate the disconnected legend with the graph. Notice that inserting the common baseline (portrayed in the original by a hovering, barely perceptible, thin gray line) permits a more effective comparison among groups.
The Unbalanced Chart
The source material for Figure 5A originally appeared in an interactive visualization on the Organisation for Economic Co-operation and Development (OECD) website (http://www.oecd.org/gender/data/proportionofemployedwhoareseniormanagersbysex.htm); a static version was later reproduced in a New York Times Economix blog post (http://economix.blogs.nytimes.com/2013/04/02/comparing-the-worlds-glass-ceilings/?_r=2).
Figure 4A: A 3D Chart
[3D column chart titled "Change in real weekly wages of US-born workers by group, 1990-2006". Vertical axis: -6.0 to 12.0. Education groups: Some High School; High School Graduate; Some College; College Graduate. Legend: Young (experience below 20 years); Old (experience above 20 years). Data labels: 0.4, -1.2, -1.2, 11.3, -5.4, -1.3, -3.0, 6.0.]
Source: Ottaviano and Peri (2008).
Figure 5A: An Unbalanced Chart
[Paired column chart titled "Percentage of Employed Who Are Senior Managers, by Sex, 2008". Vertical axis: 0 to 20. Series: Women; Men. Horizontal axis, in order: United States, New Zealand, United Kingdom, Ireland, Australia, Estonia, Belgium, Greece, Canada, Iceland, France, Italy, Netherlands, Finland, OECD average, Hungary, Spain, Israel, Slovenia, Poland, Czech Republic, Switzerland, Austria, Portugal, Norway, Slovak Republic, Germany, Sweden, Luxembourg, Turkey, Denmark, Mexico, Korea.]
Source: Author, based on OECD (no date) and Rampell (2013).
Figure 4B: Flattening a 3D Chart
[Flat column chart titled "Change in real weekly wages of US-born workers by group, 1990-2006 (Percent)". Vertical axis: -6 to 12. Series labeled directly on the chart: Young (experience below 20 years); Old (experience above 20 years). Education groups: Some High School; High School Graduate; Some College; College Graduate. Data labels: 0.4, -1.2, -1.2, 11.3, -5.4, -1.3, -3.0, 6.0.]
Figure 6A: A Spaghetti Chart
[Line chart titled "Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010". Vertical axis: 0 to 35. Horizontal axis: 1975 to 2010. Lines: Mental; Cancer; Circulatory; Musculoskeletal.]
Source: Social Security Advisory Board (2012).
Figure 6B: Revising the Spaghetti Chart
[Multi-panel line chart titled "Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010 (Percent)". Panels: Circulatory; Mental; Musculoskeletal; Cancer. Horizontal axis in each panel: 1975 to 2010. Endpoint data labels: 32, 11, 17, 26, 11, 23, 10, 14.]
in this case, and is a useful approach when labels are difficult to fit in the vertical column chart layout (recall Figure 3; also see Schwabish 2013c). Also notice the (subjective) decision to omit the y-axis: the usefulness of the y-axis is doubtful with data labels placed on top of each column.
Figure 9A: Two Pie Charts for Comparison
[Two pie charts titled "Aggregate income, by source", captioned "Shares of Aggregate Income, 1962 and 2007". 1962: Social Security 30; Earnings 28; Other 16; Asset income 15; Government employee pensions 6; Private pensions 3. 2007: Social Security 36; Earnings 29; Asset income 16; Private pensions 9; Government employee pensions 8; Other 3.]
Source: Social Security Administration (2009).
Figure 9B: Alternative to a Pie Chart: A Paired Column Chart
[Paired column chart titled "Shares of Aggregate Income, 1962 and 2009 (Percent)". Series: 1962; 2009. Categories: Social Security; Earnings; Asset income; Private pensions; Government employee pensions; Other.]
Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations, for example, the relationship between a state's obesity rate and the share of people with at least a bachelor's degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).
Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Figure 9D: Alternative to a Pie Chart: The Slope Chart
[Both charts are titled "Shares of Aggregate Income, 1962 and 2009 (Percent)" and carry the same data labels, listed here.]

Source of income                1962   2009
Social Security                 30     38
Earnings                        28     29
Other                           18     4
Asset income                    15     11
Government employee pensions    6      9
Private pensions                3      9
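The slope chart logic described above (pair each category's first-year and last-year values, then mark whether it rose or fell) can be sketched in a few lines. This is only an illustrative sketch: the shares are the data labels read off Figures 9C and 9D, while the function name `slope_chart_rows` and the output layout are assumptions made for this example, not part of the original article.

```python
# Shares of aggregate income (percent), read off the labels in Figures 9C/9D.
shares = {
    "Social Security": (30, 38),
    "Earnings": (28, 29),
    "Other": (18, 4),
    "Asset income": (15, 11),
    "Government employee pensions": (6, 9),
    "Private pensions": (3, 9),
}


def slope_chart_rows(data):
    """Return (label, start, end, change, direction), sorted by the end value.

    Hypothetical helper: the direction is what a slope chart would encode
    with color (darker for increases, lighter for declines).
    """
    rows = []
    for label, (start, end) in data.items():
        change = end - start
        if change > 0:
            direction = "increased"
        elif change < 0:
            direction = "declined"
        else:
            direction = "unchanged"
        rows.append((label, start, end, change, direction))
    return sorted(rows, key=lambda r: r[2], reverse=True)


rows = slope_chart_rows(shares)
for label, start, end, change, direction in rows:
    print(f"{label:<30}{start:>3} -> {end:<3} {change:+d}  {direction}")
```

Sorting by the last-year value mirrors the order in which labels would appear on the right-hand axis of the slope chart.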
A data set in pictures: Chetty et al., JAMA 2016
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).
The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).
Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals, but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.
The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).
Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part VC of the eAppendix and in eFigure 11.
Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (−0.22 years [P = .47] for women and −0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.
Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2. Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Plot of expected age at death for 40-year-olds (vertical axis, 70 to 90 y) against household income percentile (horizontal axis, 0 to 100, with a second scale showing mean household income in thousands, $), shown separately for women and men.]
Expected age at death, y:
Women, by household income percentile: bottom 1%: 78.8 (95% CI, 78.7-78.9); top 1%: 88.9 (95% CI, 88.7-89.1)
Men, by household income percentile: bottom 1%: 72.7 (95% CI, 72.6-72.9); top 1%: 87.3 (95% CI, 87.2-87.5)
Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
a Averaged across years and ages. The data are in thousands unless otherwise indicated.
Source: Association Between Income and Life Expectancy in the United States, Special Communication, Clinical Review & Education. JAMA, published online April 10, 2016, E5.
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = −0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.
The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IVC of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).
Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.
Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.
Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.
Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).
Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = −0.69, P < .001) and obesity (r = −0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4. Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Two panels, Men and Women. Vertical axis: expected age at death for 40-year-olds, 70 to 90 y. Horizontal axis: household income ventile, 0 to 20, with a second scale showing mean household income in thousands, $. Lines: Detroit, MI; New York, NY; San Francisco, CA; Dallas, TX.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high-resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within an acceptable range of a station are left in white.
Visualizing a research strategy
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
[Scatter plot. Vertical axis: PM10 (µg/m³), 40 to 160. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Legend: PM10 in South; PM10 in North; Local Linear Regression.]
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 µg/m³ and is statistically significant (95% CI: 11.6, 71.6).
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
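The figure notes describe how each plotted circle is built: locations are grouped into 1 degree latitude bins, the outcome is averaged within each bin using population weights, and the bin's total population sets the circle size. A minimal sketch of that binning step follows. The function name `weighted_bin_means` and every number in `points` are hypothetical, chosen only to illustrate the arithmetic; they are not the study's data.

```python
import math
from collections import defaultdict


def weighted_bin_means(points, bin_width=1.0):
    """Population-weighted mean of a value within latitude bins.

    points: iterable of (degrees_north_of_boundary, value, population).
    Returns {bin_start: (weighted_mean, total_population)}; the total
    population is what would drive the circle size in the plot.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # bin -> [sum(value * pop), sum(pop)]
    for degrees, value, pop in points:
        bin_start = math.floor(degrees / bin_width) * bin_width
        sums[bin_start][0] += value * pop
        sums[bin_start][1] += pop
    return {b: (wsum / pop, pop) for b, (wsum, pop) in sorted(sums.items())}


# Hypothetical locations: (degrees north, PM10 level, population in millions).
points = [
    (-1.6, 100.0, 2.0),
    (-1.2, 110.0, 1.0),
    (0.3, 150.0, 3.0),
    (0.8, 130.0, 1.0),
]
bins = weighted_bin_means(points)
for bin_start, (avg, pop) in bins.items():
    print(f"bin starting at {bin_start:+.0f} degrees: mean {avg:.1f}, population {pop:.1f}")
```

Bins with a negative start lie south of the boundary, so fitting a separate line to each side (as the notes describe) amounts to splitting these bins at zero.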
Visualizing a research strategy
Figure 3: Life Expectancy South and North of the Huai River Boundary
[Scatter plot. Vertical axis: Life Expectancy (Years), 70 to 85. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Legend: LE in South; LE in North; Local Linear Regression.]
The estimated change in Life Expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
Tables
• Use the same principles:
  • Show the data
  • Don't lie about it
  • Focus
• Which translates into:
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • When in doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself:
  • Do you need the R²?
  • Are there important tests to report (say, a test of equality of two coefficients)?
  • You may need the mean in the control group
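Several of the table rules above (few significant digits, no vertical lines, three horizontal lines, left alignment) are mechanical enough to sketch in code. The following is one possible plain-text formatter, not a standard tool: the function `format_table` and the numbers passed to it are hypothetical and serve only to show the rules applied together.

```python
def format_table(header, rows, digits=3):
    """Render a left-aligned text table with exactly three horizontal rules."""

    def cell(x):
        # Floats are trimmed to a small number of significant digits.
        return f"{x:.{digits}g}" if isinstance(x, float) else str(x)

    body = [[cell(x) for x in row] for row in rows]
    widths = [max(len(r[i]) for r in [header] + body) for i in range(len(header))]
    rule = "-" * (sum(widths) + 2 * (len(widths) - 1))

    def line(cells):
        # Columns are separated by spaces only: no vertical lines.
        return "  ".join(c.ljust(w) for c, w in zip(cells, widths)).rstrip()

    return "\n".join([rule, line(header), rule] + [line(r) for r in body] + [rule])


# Hypothetical estimates, deliberately carrying too many digits on input.
header = ["", "(1)", "(2)"]
rows = [
    ["Treatment", 0.0493827, -0.2101234],
    ["Observations", 390, 390],
    ["Mean in control", 13.8021, 11.6914],
]
table = format_table(header, rows)
print(table)
```

The rules sit above the header, below the header, and at the bottom, which is the three-line layout the slide recommends.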
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                   Number Registered         Number Found
                   All Projects  Ongoing     All Projects  Ongoing
                   (1)           (2)         (3)           (4)
Treatment          0.0494        -0.210      0.309         0.0271
                   (0.263)       (0.413)     (0.239)       (0.267)
Observations       390           390         385           385
Mean in Control    13.80         11.69       11.79         9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are: the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1); the number of projects declared as ongoing in nrega.nic.in (2); and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5,390 projects registered in nrega.nic.in for the 390 GPs of the survey sample, a random sample of 3,900 projects was surveyed (10 per GP). The number of projects found in the survey is scaled up by the ratio of the number of registered projects to the number of sampled projects. 5 GPs (28 projects) could not be surveyed. All specifications include district fixed effects.
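The scaling step in the note to Table 6 is a simple ratio adjustment: the count found among the sampled projects is multiplied by registered projects over sampled projects. A small sketch follows, assuming the adjustment is applied GP by GP (the note does not say at which level it is applied). The function name `scale_up_found` and the single-GP numbers are hypothetical.

```python
def scale_up_found(found_in_sample, registered, sampled):
    """Scale the number of projects found in the sample up to all registered projects."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return found_in_sample * registered / sampled


# Hypothetical GP: 14 registered projects, 10 sampled, 8 of those found on the ground.
estimate = scale_up_found(found_in_sample=8, registered=14, sampled=10)
print(f"Estimated projects found: {estimate:.1f}")

# Average scaling factor implied by the totals reported in the note (5,390 and 3,900).
average_factor = 5390 / 3900
print(f"Average scaling factor: {average_factor:.2f}")
```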
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                   All job cards         Job cards with at least one working member
                   (as of April 2014)    Intervention period      Post intervention
                                         July 2012-March 2013     Apr 2013-March 2014
                   (1)                   (2)                      (3)

Panel A: Match rate for job cards with one member only
Treatment          0.0187                0.0181                   0.0107
                   (0.00741)             (0.00766)                (0.00696)
Observations       3095                  2868                     2922
Mean in Control    0.644                 0.673                    0.698

Panel B: Match rate for job cards with two members or more
Treatment          0.0135                0.0126                   0.0104
                   (0.00613)             (0.00764)                (0.00732)
Observations       3093                  2836                     2906
Mean in Control    0.243                 0.282                    0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy which is equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose what statistics to include, what coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output
Table 1: Results

                                 Dependent variable: ftvoteshare
                                 (1)            (2)
fncandidates                     0.134***
                                 (0.007)
RESprior                                        0.059**
                                                (0.024)
Constant                         0.012          0.094***
                                 (0.010)        (0.019)
Observations                     372            372
R2                               0.532          0.016
Adjusted R2                      0.530          0.013
Residual Std. Error (df = 370)   0.153          0.222
F Statistic (df = 1; 370)        420.093***     6.056**

Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it
• Use transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• You may want the mean of the dependent variable in the control group rather than the constant
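The last point above, reporting the control-group mean of the dependent variable in place of the constant, only requires averaging the outcome over untreated units. The sketch below shows that calculation on made-up data; `control_group_mean` is a hypothetical helper, and the outcome values are invented for illustration.

```python
from statistics import mean


def control_group_mean(outcomes, treated):
    """Mean of the dependent variable among control units (treated == 0)."""
    control = [y for y, t in zip(outcomes, treated) if t == 0]
    if not control:
        raise ValueError("no control observations")
    return mean(control)


# Hypothetical match rates for four units; 1 marks a treated unit.
outcomes = [0.60, 0.70, 0.65, 0.75]
treated = [0, 1, 0, 1]

control_mean = control_group_mean(outcomes, treated)
print(f"Mean in control: {control_mean:.3f}")
```

With fixed effects or other controls in the regression, the constant no longer equals this mean, which is one reason to compute and report it separately.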
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1, At a Glance
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials/) and a 4-week mini-course you can take at your own pace
• R handles and produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte/
References
• Tufte, Edward. Visual Display of Quantitative Information. Graphics Press, 2013 (second edition).
• Schwabish, Jonathan. “An Economist’s Guide to Visualizing Data.” Journal of Economic Perspectives, 2014, vol. 28, no. 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro. “The Long Term Impact of a Graduation Program: Evidence from West Bengal.” Mimeo, MIT.
• Duflo, Esther. “Schooling and Labor Market Consequences of School Construction in Indonesia.” American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. “The Association Between Income and Life Expectancy in the United States, 2001-2014.” JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin. “The Rich Live Longer Everywhere. For the Poor, Geography Matters.” New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li. “Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China’s Huai River Policy.”
• Lee, David S., Enrico Moretti, and Matthew J. Butler. “Do Voters Affect or Elect Policies? Evidence from the U.S. House.” The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson. “The Transformation of the Supplemental Nutrition Assistance Program.” Journal of Policy Analysis and Management 30(4): 863-888, 2011.
• Stinebrickner, Ralph, and Todd Stinebrickner. “Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model.” Western University CIBC Working Paper 2013-5 (July). http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf, 2013.
• Ottaviano, Gianmarco I. P., and Giovanni Peri. “Immigration and National Wages: Clarifying the Theory and the Empirics.” NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February). http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf, 2012.
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July). http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf, 2009.
• Hanson, Gordon H. 2012. “The Rise of Middle Kingdoms: Emerging Economies in Global Trade.” Journal of Economic Perspectives 26(2): 41-64.
An Economist’s Guide to Visualizing Data, p. 217

The 3D Chart
Figure 4A uses the now-familiar 3D effect. In such graphs, the third dimension does not plot data values, but it does add clutter to the chart and, worse, it can distort the information. Look at the far-right-hand bar labeled 6 percent. No point of the column touches the gridline for that value. This software tool, like many others, uses perspective to give depth to the imaginary plane that runs across the top of the column, intersecting the gridline. But most readers will perceive the actual value of the column as less than 6 percent. Figure 4B shows a redesign: cancel the 3D treatment and integrate the disconnected legend with the graph. Notice that inserting the common baseline (portrayed in the original by a hovering, barely perceptible thin gray line) permits a more effective comparison among groups.

The Unbalanced Chart
The source material for Figure 5A originally appeared in an interactive visualization on the Organisation for Economic Co-operation and Development (OECD) website (http://www.oecd.org/gender/data/proportionofemployedwhoareseniormanagersbysex.htm); a static version was later reproduced in a New York Times Economix blog post (http://economix.blogs.nytimes.com/2013/04/02/comparing-the-worlds-glass-ceilings/?_r=2).
Figure 4A: A 3D Chart
Change in real weekly wages of US-born workers by group, 1990-2006
Vertical axis: -6.0 to 12.0. Groups: Some High School, High School Graduate, Some College, College Graduate. Series: Young (experience below 20 years), Old (experience above 20 years).
Data labels shown: 0.4, -1.2, -1.2, 11.3, -5.4, -1.3, -3.0, 6.0
Source: Ottaviano and Peri (2008)
Journal of Economic Perspectives, p. 218

Figure 5A: An Unbalanced Chart
Percentage of Employed Who Are Senior Managers, by Sex, 2008
Vertical axis: 0 to 20. Series: Women, Men.
Countries on the horizontal axis: United States, New Zealand, United Kingdom, Ireland, Australia, Estonia, Belgium, Greece, Canada, Iceland, France, Italy, Netherlands, Finland, OECD average, Hungary, Spain, Israel, Slovenia, Poland, Czech Republic, Switzerland, Austria, Portugal, Norway, Slovak Republic, Germany, Sweden, Luxembourg, Turkey, Denmark, Mexico, Korea
Source: Author, based on OECD (no date) and Rampell (2013)

Figure 4B: Flattening a 3D Chart
Change in real weekly wages of US-born workers by group, 1990-2006 (Percent)
Vertical axis: -6 to 12. Groups: Some High School, High School Graduate, Some College, College Graduate. Series: Young (experience below 20 years), Old (experience above 20 years).
Data labels shown: 0.4, -1.2, -1.2, 11.3, -5.4, -1.3, -3.0, 6.0
An Economist’s Guide to Visualizing Data, p. 221

Figure 6A: A Spaghetti Chart
27. Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010
Vertical axis: 0 to 35. Horizontal axis: 1975 to 2010. Series: Mental, Cancer, Circulatory, Musculoskeletal.
Source: Social Security Advisory Board (2012)

Figure 6B: Revising the Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010 (Percent)
Four panels (Circulatory, Mental, Musculoskeletal, Cancer), each with a horizontal axis from 1975 to 2010.
Data labels shown: 32, 11, 17, 26, 11, 23, 10, 14
An Economist’s Guide to Visualizing Data, p. 225

in this case and is a useful approach when labels are difficult to fit in the vertical column chart layout (recall Figure 3; also see Schwabish 2013c). Also notice the (subjective) decision to omit the y-axis: the usefulness of the y-axis is doubtful with data labels placed on top of each column.

Figure 9A: Two Pie Charts for Comparison
Shares of Aggregate Income, 1962 and 2007 (aggregate income, by source)

Income source                   1962   2007
Social Security                   30     36
Earnings                          28     29
Asset income                      15     16
Private pensions                   3      9
Government employee pensions       6      8
Other                             16      3

Source: Social Security Administration (2009)

Figure 9B: Alternative to a Pie Chart: A Paired Column Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)

Income source                   1962   2009
Social Security                   30     38
Earnings                          28     29
Asset income                      15     11
Private pensions                   3      9
Government employee pensions       6      9
Other                             18      4
Journal of Economic Perspectives, p. 226

Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations, for example, the relationship between a state’s obesity rate and the share of people with at least a bachelor’s degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).

Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
Two stacked bars (1962 and 2009) with segments for Social Security, Earnings, Other, Asset income, Government employee pensions, and Private pensions.

Figure 9D: Alternative to a Pie Chart: The Slope Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
1962 to 2009: Social Security 30 to 38; Earnings 28 to 29; Other 18 to 4; Asset income 15 to 11; Government employee pensions 6 to 9; Private pensions 3 to 9.
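The color rule of the slope chart (one color for categories that rose, another for those that fell) is easy to derive from the data before plotting. The sketch below is an illustration only: the helper `slope_segments` and its color names are hypothetical, and the shares are the values as read from the data labels of Figures 9C and 9D.

```python
# Shares of aggregate income in percent: category -> (1962, 2009),
# as read from the data labels of Figures 9C and 9D.
shares = {
    "Social Security": (30, 38),
    "Earnings": (28, 29),
    "Asset income": (15, 11),
    "Private pensions": (3, 9),
    "Government employee pensions": (6, 9),
    "Other": (18, 4),
}


def slope_segments(shares, up="blue", down="orange", flat="gray"):
    """Return one line segment per category, colored by direction of change."""
    segments = []
    for label, (start, end) in shares.items():
        if end > start:
            color = up
        elif end < start:
            color = down
        else:
            color = flat
        segments.append({"label": label, "start": start, "end": end,
                         "change": end - start, "color": color})
    # Order by the starting value so labels on the left axis read top to bottom.
    return sorted(segments, key=lambda s: s["start"], reverse=True)


segments = slope_segments(shares)
for s in segments:
    print(f'{s["label"]}: {s["start"]} -> {s["end"]} ({s["color"]})')
```

Each segment can then be drawn as a line from (1962, start) to (2009, end) in whatever plotting tool is at hand.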
A data set in pictures: Chetty et al., JAMA 2016
Copyright 2016 American Medical Association. All rights reserved.
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).

The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).

Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals, but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.

The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).

Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.

Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (-0.22 years [P = .47] for women and -0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.

Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2. Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
Vertical axis: expected age at death for 40-year-olds, y (70 to 90). Horizontal axis: household income percentile (0 to 100). Series: Women, Men.
Mean household income in thousands, $ (a), at percentiles 20, 40, 60, 80, and 100: women 24, 45, 71, 112, 1.9 million; men 26, 50, 77, 119, 2.0 million.
Expected age at death, y:
Women, by household income percentile: bottom 1%: 78.8 (95% CI, 78.7-78.9); top 1%: 88.9 (95% CI, 88.7-89.1)
Men, by household income percentile: bottom 1%: 72.7 (95% CI, 72.6-72.9); top 1%: 87.3 (95% CI, 87.2-87.5)

Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
(a) Averaged across years and ages. The data are in thousands unless otherwise indicated.

Association Between Income and Life Expectancy in the United States. Special Communication, Clinical Review & Education. jama.com (Reprinted) JAMA, published online April 10, 2016, E5.
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT

A data set in pictures: Chetty et al., JAMA 2016
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = -0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.

The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).

Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.

Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.

Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.

Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).

Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = -0.69, P < .001) and obesity (r = -0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4. Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
Panels: Men, Women. Vertical axis: expected age at death for 40-year-olds, y (70 to 90). Horizontal axis: household income ventile (0 to 20). Series in each panel: Detroit, MI; New York, NY; San Francisco, CA; Dallas, TX.
Mean household income in thousands, $ (a), at ventiles 5, 10, 15, and 20: men 30, 60, 101, 683; women 27, 54, 95, 653.
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
(a) Averaged across years and ages.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high-resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within acceptable range of a station are left in white.
Visualizing a research strategy
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
Vertical axis: PM10 (μg/m3), 40 to 160. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Legend: PM10 in South, PM10 in North, Local Linear Regression.
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 μg/m3 and is statistically significant (95% CI: 11.6, 71.6).
Visualizing a research strategy
Figure 3: Life Expectancy South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
Vertical axis: Life Expectancy (Years), 70 to 85. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Legend: LE in South, LE in North, Local Linear Regression.
The estimated change in Life Expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
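The construction described in the figure notes (population-weighted averages within 1 degree latitude bins, then a separate local linear fit on each side of the boundary) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the triangular kernel, and the 5-degree bandwidth are assumptions, since the slides do not state the kernel or bandwidth used.

```python
import math
from collections import defaultdict


def binned_weighted_means(points, width=1.0):
    """Population-weighted mean of y within latitude bins of the given width.

    points: iterable of (degrees_north, y, population).
    Returns {bin_index: (weighted_mean, total_population)}.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # bin -> [sum(pop * y), sum(pop)]
    for x, y, pop in points:
        b = math.floor(x / width)
        totals[b][0] += pop * y
        totals[b][1] += pop
    return {b: (wy / w, w) for b, (wy, w) in sorted(totals.items())}


def fitted_value_at_zero(rows):
    """Weighted least-squares line through (x, y, weight) rows, evaluated at x = 0."""
    if len(rows) < 2:
        raise ValueError("need at least two observations on each side")
    sw = sum(w for _, _, w in rows)
    mx = sum(w * x for x, _, w in rows) / sw
    my = sum(w * y for _, y, w in rows) / sw
    sxx = sum(w * (x - mx) ** 2 for x, _, w in rows)
    sxy = sum(w * (x - mx) * (y - my) for x, y, w in rows)
    if sxx == 0:
        raise ValueError("need at least two distinct x values on each side")
    return my - (sxy / sxx) * mx


def jump_at_boundary(points, bandwidth=5.0):
    """North-minus-south gap at the boundary (x = 0).

    Fits one line per side, weighting each point by a triangular kernel
    (assumed here) times its population.
    """
    south, north = [], []
    for x, y, pop in points:
        kernel = 1.0 - abs(x) / bandwidth
        if kernel > 0:
            (north if x >= 0 else south).append((x, y, kernel * pop))
    return fitted_value_at_zero(north) - fitted_value_at_zero(south)
```

The bin means correspond to the circles in the figures (with circle size given by the bin's total population), and the returned gap corresponds to the height of the brace at the boundary.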
Tables
• Use the same principles
  • Show the data
  • Don’t lie about it
  • Focus
• Which translates into
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don’t box results (3 lines are usually enough)
  • When in doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself
  • Do you need the R2?
  • Are there important tests to report (say, a test of equality of two coefficients)?
  • You may need the mean in the control group
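As a small illustration of keeping the number of significant digits down, the sketch below rounds a value to a chosen number of significant digits. The helper `round_sig` is hypothetical and not part of any package mentioned in the slides; how many digits to keep remains a judgment call.

```python
from math import floor, log10


def round_sig(x, digits=3):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    # Position of the leading digit decides how many decimals to keep.
    return round(x, digits - 1 - floor(log10(abs(x))))


# A standard error reported as 0.00741 reads just as well with two digits.
print(round_sig(0.00741, 2))
print(round_sig(420.093, 3))
```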
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                       Number registered           Number found
                     All projects   Ongoing     All projects   Ongoing
                         (1)          (2)           (3)          (4)
Treatment              0.0494       -0.210         0.309        0.0271
                      (0.263)      (0.413)        (0.239)      (0.267)
Observations             390          390           385          385
Mean in Control         13.80        11.69         11.79        9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1), the number of projects declared as ongoing in nrega.nic.in (2), and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5,390 projects registered in nrega.nic.in for the 390 GPs of the survey sample, a random sample of 3,900 projects was surveyed (10 per GP). The number of projects found in the survey is scaled up using the ratio of the number of registered projects to the number of sampled projects. 5 GPs (28 projects) could not be surveyed. All specifications include district fixed effects.
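The scaling step in the table note amounts to multiplying the count found in the sample by the ratio of registered to sampled projects. A minimal sketch, assuming the scaling is applied GP by GP; the function name and the example figures are hypothetical and not taken from the survey data.

```python
def scaled_found(found_in_sample, registered, sampled):
    """Scale the number of projects found in the sample up to all registered projects."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return found_in_sample * registered / sampled


# Hypothetical GP: 14 projects registered, 10 sampled, 8 of those found on the ground.
print(scaled_found(found_in_sample=8, registered=14, sampled=10))
```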
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                     All job cards         Job cards with at least one working member
                   (as of April 2014)    Intervention period        Post intervention
                                        July 2012-March 2013     Apr 2013-March 2014
                         (1)                    (2)                      (3)
Panel A: Match rate for job cards with one member only
Treatment              0.0187                 0.0181                   0.0107
                     (0.00741)              (0.00766)                (0.00696)
Observations            3095                   2868                     2922
Mean in Control         0.644                  0.673                    0.698
Panel B: Match rate for job cards with two members or more
Treatment              0.0135                 0.0126                   0.0104
                     (0.00613)              (0.00764)                (0.00732)
Observations            3093                   2836                     2906
Mean in Control         0.243                  0.282                    0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Stargazer
• In practice, in R, use the "stargazer" package
• It creates standardized tables and can output them in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose which statistics to include, which coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output

Table 1: Results
                                     Dependent variable: ftvoteshare
                                         (1)             (2)
fncandidates                          0.134***
                                      (0.007)
RESprior                                               0.059**
                                                       (0.024)
Constant                               0.012           0.094***
                                      (0.010)          (0.019)
Observations                            372              372
R²                                     0.532            0.016
Adjusted R²                            0.530            0.013
Residual Std. Error (df = 370)         0.153            0.222
F Statistic (df = 1; 370)            420.093***         6.056**
Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it
• Use transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
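The improvements listed above can be sketched in a few lines. The example below is not stargazer and does not use its API; it is a plain Python illustration of the same choices (readable labels, selected coefficients, a short list of statistics). The estimates are copied from the default output on the slides, while the readable labels and the function name `render_table` are hypothetical.

```python
# Hypothetical inputs: one dict per regression column, with (coefficient, SE) pairs.
MODELS = [
    {"coef": {"fncandidates": (0.134, 0.007), "Constant": (0.012, 0.010)},
     "stats": {"Observations": 372, "R2": 0.532, "Adjusted R2": 0.530}},
    {"coef": {"RESprior": (0.059, 0.024), "Constant": (0.094, 0.019)},
     "stats": {"Observations": 372, "R2": 0.016, "Adjusted R2": 0.013}},
]
# Readable labels are invented for illustration; the slides do not define them.
LABELS = {
    "fncandidates": "Candidates (hypothetical label)",
    "RESprior": "Prior RES (hypothetical label)",
    "R2": "R squared",
}


def render_table(models, keep, labels, stats, digits=3):
    """Return a plain-text table showing only the requested rows."""
    width = 12
    label_width = max(len(labels.get(name, name)) for name in keep + stats) + 2

    def cell(value):
        # Blank cell when a model lacks the entry; integers printed as-is.
        if value is None:
            return " " * width
        if isinstance(value, int):
            return str(value).rjust(width)
        return f"{value:.{digits}f}".rjust(width)

    header = " " * label_width + "".join(
        f"({i})".rjust(width) for i in range(1, len(models) + 1))
    lines = [header]
    for var in keep:
        pairs = [m["coef"].get(var) for m in models]
        lines.append(labels.get(var, var).ljust(label_width)
                     + "".join(cell(p[0] if p else None) for p in pairs))
        lines.append(" " * label_width + "".join(
            (f"({p[1]:.{digits}f})" if p else "").rjust(width) for p in pairs))
    for stat in stats:
        lines.append(labels.get(stat, stat).ljust(label_width)
                     + "".join(cell(m["stats"].get(stat)) for m in models))
    return "\n".join(lines)


table = render_table(MODELS, keep=["fncandidates", "RESprior"],
                     labels=LABELS, stats=["Observations", "R2"])
print(table)
```

Dropping the constant and the adjusted R squared is done here simply by leaving them out of `keep` and `stats`; a control-group mean could be added to each model's `stats` dict if it is available.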
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1 at a Glance
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials) and a 4-week mini-course you can take at your own pace
• R handles/produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte
References
• Tufte, Edward. The Visual Display of Quantitative Information. Graphics Press, 2013 (second edition).
• Schwabish, Jonathan. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 2014, vol. 28, no. 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro. "The Long Term Impact of a Graduation Program: Evidence from West Bengal." Mimeo, MIT.
• Duflo, Esther. "Schooling and Labor Market Consequences of School Construction in Indonesia." American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. "The Association Between Income and Life Expectancy in the United States, 2001-2014." JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin. "The Rich Live Longer Everywhere. For the Poor, Geography Matters." New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li. "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler. "Do Voters Affect or Elect Policies? Evidence from the U.S. House." The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson. "The Transformation of the Supplemental Nutrition Assistance Program." Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner. "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model." Western University, CIBC Working Paper 2013-5 (July), 2013. http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf
• Ottaviano, Gianmarco I. P., and Giovanni Peri. "Immigration and National Wages: Clarifying the Theory and the Empirics." NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February), 2012. http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July), 2009. http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf
• Hanson, Gordon H. "The Rise of Middle Kingdoms: Emerging Economies in Global Trade." Journal of Economic Perspectives 26(2), 2012: 41-64.
The 3D Chart
Figure 4A uses the now-familiar 3D effect. In such graphs, the third dimension does not plot data values, but it does add clutter to the chart and, worse, it can distort the information. Look at the far-right-hand bar, labeled 6 percent. No point of the column touches the gridline for that value. This software tool, like many others, uses perspective to give depth to the imaginary plane that runs across the top of the column, intersecting the gridline. But most readers will perceive the actual value of the column as less than 6 percent. Figure 4B shows a redesign: cancel the 3D treatment and integrate the disconnected legend with the graph. Notice that inserting the common baseline (portrayed in the original by a hovering, barely perceptible, thin gray line) permits a more effective comparison among groups.

The Unbalanced Chart
The source material for Figure 5A originally appeared in an interactive visualization on the Organisation for Economic Co-operation and Development (OECD) website (http://www.oecd.org/gender/data/proportionofemployedwhoareseniormanagersbysex.htm); a static version was later reproduced in a New York Times Economix blog post (http://economix.blogs.nytimes.com/2013/04/02/comparing-the-worlds-glass-ceilings/?_r=2).
Figure 4A: A 3D Chart
Change in real weekly wages of US-born workers by group, 1990-2006
[3D column chart. Horizontal axis: Some High School, High School Graduate, Some College, College Graduate. Vertical axis: -6.0 to 12.0. Series: Young (experience below 20 years), Old (experience above 20 years).]
Source: Ottaviano and Peri (2008)
Figure 5A: An Unbalanced Chart
Percentage of Employed Who Are Senior Managers, by Sex, 2008
[Column chart. Vertical axis: 0 to 20. Series: Women, Men. Countries on the horizontal axis: United States, New Zealand, United Kingdom, Ireland, Australia, Estonia, Belgium, Greece, Canada, Iceland, France, Italy, Netherlands, Finland, OECD average, Hungary, Spain, Israel, Slovenia, Poland, Czech Republic, Switzerland, Austria, Portugal, Norway, Slovak Republic, Germany, Sweden, Luxembourg, Turkey, Denmark, Mexico, Korea.]
Source: Author, based on OECD (no date) and Rampell (2013)
Figure 4B: Flattening a 3D Chart
Change in real weekly wages of US-born workers by group, 1990-2006 (Percent)
[Flat column chart with a common baseline at 0. Groups: Some High School, High School Graduate, Some College, College Graduate. Vertical axis: -6 to 12. Series labeled directly on the chart: Young (experience below 20 years), Old (experience above 20 years).]
Figure 6A: A Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010
[Line chart. Horizontal axis: 1975 to 2010. Vertical axis: 0 to 35. Series: Mental, Cancer, Circulatory, Musculoskeletal.]
Source: Social Security Advisory Board (2012)

Figure 6B: Revising the Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010 (Percent)
[Panels: Circulatory, Mental, Musculoskeletal, Cancer. Horizontal axis in each row of panels: 1975 to 2010.]
in this case and is a useful approach when labels are difficult to fit in the vertical column chart layout (recall Figure 3; also see Schwabish 2013c). Also notice the (subjective) decision to omit the y-axis; the usefulness of the y-axis is doubtful with data labels placed on top of each column.
Figure 9A: Two Pie Charts for Comparison
Shares of Aggregate Income, 1962 and 2007 (aggregate income by source)

                                  1962    2007
Social Security                    30      36
Earnings                           28      29
Asset income                       15      16
Government employee pensions        6       8
Private pensions                    3       9
Other                              16       3

Source: Social Security Administration (2009)

Figure 9B: Alternative to a Pie Chart: A Paired Column Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)

                                  1962    2009
Social Security                    30      38
Earnings                           28      29
Asset income                       15      11
Private pensions                    3       9
Government employee pensions        6       9
Other                              18       4
Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups, and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations, for example, the relationship between a state's obesity rate and the share of people with at least a bachelor's degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).
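Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Two stacked bars, one per year. Segments, with 1962 and 2009 values: Social Security 30, 38; Earnings 28, 29; Other 18, 4; Asset income 15, 11; Government employee pensions 6, 9; Private pensions 3, 9.]

Figure 9D: Alternative to a Pie Chart: The Slope Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Two vertical axes, 1962 and 2009, with one line per category. 1962 values: Social Security 30, Earnings 28, Other 18, Asset income 15, Government employee pensions 6, Private pensions 3. 2009 values: Social Security 38, Earnings 29, Asset income 11, Government employee pensions 9, Private pensions 9, Other 4.]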
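The grouping that the slope chart encodes with color (categories that increased versus those that declined) can be computed directly from the two columns of shares. The sketch below is a hypothetical helper, not part of the article; the shares are the 1962 and 2009 values read off the data labels in Figures 9B to 9D.

```python
# Shares of aggregate income (percent) as (1962, 2009), read from the figure labels.
SHARES = {
    "Social Security": (30, 38),
    "Earnings": (28, 29),
    "Asset income": (15, 11),
    "Private pensions": (3, 9),
    "Government employee pensions": (6, 9),
    "Other": (18, 4),
}


def classify_changes(shares):
    """Map each category to (change in percentage points, direction)."""
    result = {}
    for category, (start, end) in shares.items():
        change = end - start
        if change > 0:
            direction = "increased"
        elif change < 0:
            direction = "declined"
        else:
            direction = "unchanged"
        result[category] = (change, direction)
    return result


changes = classify_changes(SHARES)
for category, (change, direction) in changes.items():
    print(f"{category}: {change:+d} points ({direction})")
```

A plotting routine could then pick one color for the "increased" group and another for the "declined" group, which is the contrast the text describes.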
A data set in pictures: Chetty et al., JAMA 2016
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).

The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).

Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals, but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.

The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).

Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.

Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (-0.22 years [P = .47] for women and -0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.

Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2: Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Plot. Vertical axis: expected age at death for 40-year-olds, in years (70 to 90). Horizontal axis: household income percentile (0 to 100), with mean household income in thousands of dollars (a) shown beneath. Series: Women, Men.]

Expected age at death, years:
Women, by household income percentile: bottom 1%, 78.8 (95% CI, 78.7-78.9); top 1%, 88.9 (95% CI, 88.7-89.1)
Men, by household income percentile: bottom 1%, 72.7 (95% CI, 72.6-72.9); top 1%, 87.3 (95% CI, 87.2-87.5)

Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
(a) Averaged across years and ages. The data are in thousands unless otherwise indicated.

Source: Chetty et al., "Association Between Income and Life Expectancy in the United States," JAMA, published online April 10, 2016, p. E5.
A data set in pictures: Chetty et al., JAMA 2016, as rendered in the NYT

A data set in pictures: Chetty et al., JAMA 2016
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = -0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.

The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).

Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.

Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.

Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.

Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).

Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = -0.69, P < .001) and obesity (r = -0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4: Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Two panels, Men and Women. Vertical axis: expected age at death for 40-year-olds, in years (70 to 90). Horizontal axis: household income ventile (0 to 20), with mean household income in thousands of dollars (a) shown beneath. Series in each panel: Detroit, MI; New York, NY; San Francisco, CA; Dallas, TX.]

Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
(a) Averaged across years and ages.

Source: Chetty et al., "Association Between Income and Life Expectancy in the United States," JAMA, published online April 10, 2016, p. E7.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high-resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within an acceptable range of a station are left in white.
Visualizing a research strategy
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
[Plot. Vertical axis: PM10 (μg/m³), 40 to 160. Horizontal axis: degrees north of the Huai River boundary, -20 to 20. Legend: PM10 in South, PM10 in North, Local Linear Regression.]
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 μg/m³ and is statistically significant (95% CI: 11.6, 71.6).
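The binning described in the figure notes (population-weighted averages within 1 degree latitude ranges, with circle size proportional to total population) can be sketched as below. This is a minimal illustration, not the authors' code; the function name and the three example locations are hypothetical.

```python
import math
from collections import defaultdict


def weighted_bin_means(points):
    """Population-weighted mean of a value within 1 degree latitude bins.

    points: iterable of (degrees_north, value, population) tuples.
    Returns {bin_start: (weighted_mean, total_population)}, where a bin
    covers [bin_start, bin_start + 1) degrees north of the boundary.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # bin -> [sum(value * pop), sum(pop)]
    for degrees_north, value, population in points:
        bin_start = math.floor(degrees_north)
        sums[bin_start][0] += value * population
        sums[bin_start][1] += population
    return {b: (weighted / total, total)
            for b, (weighted, total) in sorted(sums.items())}


# Hypothetical locations: (degrees north of the boundary, PM10, population).
example = [(0.2, 100.0, 1.0), (0.8, 140.0, 3.0), (-0.5, 90.0, 2.0)]
bins = weighted_bin_means(example)
print(bins)
```

Each entry gives the height of one circle (the weighted mean) and the quantity that would set its size (the total population in the bin); the local linear regression on each side of the boundary is a separate step not shown here.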
Visualizing a research strategy
Figure 3: Life Expectancy South and North of the Huai River Boundary
[Plot. Vertical axis: life expectancy (years), 70 to 85. Horizontal axis: degrees north of the Huai River boundary, -20 to 20. Legend: LE in South, LE in North, Local Linear Regression.]
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
The estimated change in life expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
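When a figure reports only a point estimate and a 95% confidence interval, a reader can back out an approximate standard error. The sketch below assumes a symmetric, normal-approximation interval (estimate plus or minus 1.96 standard errors); that is an assumption made here for illustration, and the original study may have computed its intervals differently. The function name is hypothetical.

```python
def implied_standard_error(lower, upper, z=1.96):
    """Standard error implied by a symmetric interval [lower, upper]."""
    if upper < lower:
        raise ValueError("upper must not be smaller than lower")
    return (upper - lower) / (2 * z)


# Intervals reported under the two Huai River figures.
se_pm10 = implied_standard_error(11.6, 71.6)        # PM10 discontinuity
se_life_exp = implied_standard_error(-5.0, -1.3)    # life expectancy discontinuity
print(round(se_pm10, 1), round(se_life_exp, 2))
```

Under that assumption the implied standard errors are roughly 15.3 for the PM10 estimate and roughly 0.94 years for the life expectancy estimate.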
Tables
• Use the same principles
  • Show the data
  • Don't lie about it
  • Focus
• Which translates into
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • When in doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself
  • Do you need the R²?
  • Are there important tests to report (say, of equality between two coefficients)?
  • You may need the mean in the control group
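The advice to keep the number of significant digits down can be applied mechanically before a table is typeset. The helper below is a small sketch in Python; the function name `round_sig` and the choice of example values are illustrative only.

```python
from math import floor, log10


def round_sig(x, sig=3):
    """Round x to `sig` significant digits (0 is returned unchanged)."""
    if x == 0:
        return 0.0
    # Position of the leading digit decides how many decimals to keep.
    decimals = sig - 1 - floor(log10(abs(x)))
    return round(x, decimals)


# Example values of different magnitudes, each cut to two significant digits.
for value in (0.00741, 0.0187, 420.093):
    print(value, "->", round_sig(value, 2))
```

Rounding by significant digits rather than by a fixed number of decimals keeps small standard errors readable without printing long strings of digits for large statistics.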
47 56
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                   Number registered          Number found
                   All projects   Ongoing     All projects   Ongoing
                   (1)            (2)         (3)            (4)
Treatment          0.0494         -0.210      0.309          0.0271
                   (0.263)        (0.413)     (0.239)        (0.267)
Observations       390            390         385            385
Mean in control    13.80          11.69       11.79          9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1); the number of projects declared as ongoing in nrega.nic.in (2); and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5,390 projects registered in nrega.nic.in for the 390 GPs of the survey sample, a random sample of 3,900 projects was surveyed (10 per GP). The number of projects found in the survey is scaled up using the ratio of the number of registered projects to the number of sampled projects. 5 GPs (28 projects) could not be surveyed. All specifications include district fixed effects.
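The scaling described in the note to Table 6 can be sketched in a few lines of Python. This is one reading of the note (found count times registered over sampled), and the GP in the example is hypothetical.

```python
def scale_up(found_in_sample, registered, sampled):
    """Estimate the number of projects found in a GP by inflating the sampled count.

    The count found among sampled projects is multiplied by registered / sampled.
    """
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return found_in_sample * registered / sampled

# Hypothetical GP: 14 projects registered, 10 sampled, 8 of the sampled projects found.
estimate = scale_up(found_in_sample=8, registered=14, sampled=10)
```

Under these made-up numbers the GP would be credited with 11.2 found projects out of 14 registered.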
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                   All job cards         Job cards with at least one working member
                   (as of April 2014)    Intervention period       Post intervention
                                         (July 2012-March 2013)    (April 2013-March 2014)
                   (1)                   (2)                       (3)
Panel A: Match rate for job cards with one member only
Treatment          0.0187                0.0181                    0.0107
                   (0.00741)             (0.00766)                 (0.00696)
Observations       3095                  2868                      2922
Mean in control    0.644                 0.673                     0.698
Panel B: Match rate for job cards with two members or more
Treatment          0.0135                0.0126                    0.0104
                   (0.00613)             (0.00764)                 (0.00732)
Observations       3093                  2836                      2906
Mean in control    0.243                 0.282                     0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
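The matching rule in the note to Table 7 can be written out as a short Python sketch. The two-member rule comes from the note; treating a one-member card as matched when its single member is matched is an assumption, and the job cards below are hypothetical.

```python
def is_matched(members_matched, members_total):
    """Apply the matching rule to one job card.

    Cards with two or more members need at least two matched members (from the note).
    Cards with one member need that member matched (assumption).
    """
    required = 1 if members_total == 1 else 2
    return members_matched >= required

def match_rate(cards):
    """Fraction of matched job cards; cards is a list of (members_matched, members_total)."""
    if not cards:
        raise ValueError("no job cards")
    return sum(is_matched(m, n) for m, n in cards) / len(cards)

# Hypothetical job cards in one GP.
cards = [(1, 1), (0, 1), (2, 3), (1, 2)]
rate = match_rate(cards)
```

The GP-level match rate is the dependent variable in Table 7; here two of the four hypothetical cards count as matched.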
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose which statistics to include, which coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output

Table 1: Results

                                 Dependent variable: ftvoteshare
                                 (1)           (2)
fncandidates                     0.134***
                                 (0.007)
RESprior                                       0.059**
                                               (0.024)
Constant                         0.012         0.094***
                                 (0.010)       (0.019)
Observations                     372           372
R2                               0.532         0.016
Adjusted R2                      0.530         0.013
Residual Std. Error (df = 370)   0.153         0.222
F Statistic (df = 1; 370)        420.093***    6.056**

Note: *p<0.1; **p<0.05; ***p<0.01
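The note under the default output maps p-values to stars (*p<0.1, **p<0.05, ***p<0.01). A minimal Python sketch of that convention follows; the function names and the p-value in the example are hypothetical and are not part of the stargazer package.

```python
def stars(p_value):
    """Return significance stars using the thresholds in the table note."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.1:
        return "*"
    return ""

def starred(estimate, p_value, digits=3):
    """Format an estimate with a fixed number of decimals and its stars."""
    return f"{estimate:.{digits}f}{stars(p_value)}"

# Hypothetical p-value attached to a coefficient of 0.059.
label = starred(0.059, 0.014)
```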
Improve on it
• Use transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
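To make the first three points concrete, here is a small Python sketch that keeps only chosen coefficients and statistics and swaps raw variable names for readable labels. It uses the numbers from column (1) of the default output (decimal points assumed); the helper `tidy_rows` and the label "Number of candidates" are hypothetical, since the slides do not define the variable.

```python
def tidy_rows(coefs, stats, labels, keep_coefs, keep_stats):
    """Return (label, value) rows for only the requested coefficients and statistics."""
    rows = [(labels.get(name, name), coefs[name]) for name in keep_coefs]
    rows += [(name, stats[name]) for name in keep_stats]
    return rows

# Column (1) of the default output.
coefs = {"fncandidates": 0.134, "Constant": 0.012}
stats = {"Observations": 372, "R2": 0.532, "Adjusted R2": 0.530}
# Hypothetical readable label for the raw variable name.
labels = {"fncandidates": "Number of candidates"}

rows = tidy_rows(coefs, stats, labels,
                 keep_coefs=["fncandidates"],
                 keep_stats=["Observations", "R2"])
```

The constant and the adjusted R squared are dropped simply by leaving them out of the two keep lists.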
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1 - At a Glance
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials/) and a 4-week mini-course you can take at your own pace
• R handles/produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte/
References
• Tufte, Edward, The Visual Display of Quantitative Information, Graphics Press, 2013 (second edition).
• Schwabish, Jonathan, "An Economist's Guide to Visualizing Data," Journal of Economic Perspectives, 2014, vol. 28, no. 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro, "The long term impact of a graduation program: Evidence from West Bengal," mimeo, MIT.
• Duflo, Esther, "Schooling and Labor Market Consequences of School Construction in Indonesia," American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler, "The association between income and life expectancy in the United States, 2001-2014," JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin, "The Rich Live Longer Everywhere. For the Poor, Geography Matters," New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler, "Do voters affect or elect policies? Evidence from the US House," The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson, "The Transformation of the Supplemental Nutrition Assistance Program," Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner, "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model," Western University, CIBC Working Paper 2013-5 (July), 2013. http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf
• Ottaviano, Gianmarco I. P., and Giovanni Peri, "Immigration and National Wages: Clarifying the Theory and the Empirics," NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board, Aspects of Disability Decision Making: Data and Materials, Social Security Administration (February), 2012. http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf
• Social Security Administration, Fast Facts & Figures About Social Security, 2009, Social Security Administration (July), 2009. http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf
• Hanson, Gordon H., "The Rise of Middle Kingdoms: Emerging Economies in Global Trade," Journal of Economic Perspectives 26(2), 2012: 41-64.
Figure 5A: An Unbalanced Chart
Percentage of Employed Who Are Senior Managers, by Sex, 2008
[Chart omitted: paired columns for Women and Men, vertical axis from 0 to 20, for the United States, New Zealand, United Kingdom, Ireland, Australia, Estonia, Belgium, Greece, Canada, Iceland, France, Italy, Netherlands, Finland, OECD average, Hungary, Spain, Israel, Slovenia, Poland, Czech Republic, Switzerland, Austria, Portugal, Norway, Slovak Republic, Germany, Sweden, Luxembourg, Turkey, Denmark, Mexico, and Korea.]
Source: Author, based on OECD (no date) and Rampell (2013).
Figure 4B: Flattening a 3D Chart
Change in real weekly wages of US-born workers by group, 1990-2006 (Percent)
[Chart omitted: paired columns for Young (experience below 20 years) and Old (experience above 20 years) workers in four education groups (Some High School, High School Graduate, Some College, College Graduate); vertical axis from -6 to 12.]
Figure 6A: A Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010
[Chart omitted: four overlapping lines (Mental, Cancer, Circulatory, Musculoskeletal); vertical axis from 0 to 35; horizontal axis from 1975 to 2010.]
Source: Social Security Advisory Board (2012).

Figure 6B: Revising the Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010 (Percent)
[Chart omitted: four small panels (Circulatory, Mental, Musculoskeletal, Cancer), each highlighting one series over 1975 to 2010.]
in this case and is a useful approach when labels are difficult to fit in the vertical column chart layout (recall Figure 3; also see Schwabish 2013c). Also notice the (subjective) decision to omit the y-axis: the usefulness of the y-axis is doubtful with data labels placed on top of each column.

Figure 9A: Two Pie Charts for Comparison
Shares of Aggregate Income, 1962 and 2007
[Charts omitted: two pie charts of aggregate income by source, with the following labels.]

Source of income                1962   2007
Social Security                 30     36
Earnings                        28     29
Asset income                    15     16
Private pensions                3      9
Government employee pensions    6      8
Other                           16     3

Source: Social Security Administration (2009).

Figure 9B: Alternative to a Pie Chart: A Paired Column Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Chart omitted: paired columns with the following data labels.]

Source of income                1962   2009
Social Security                 30     38
Earnings                        28     29
Asset income                    15     11
Private pensions                3      9
Government employee pensions    6      9
Other                           18     4
Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations: for example, the relationship between a state's obesity rate and the share of people with at least a bachelor's degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).

Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Chart omitted: one stacked bar per year with the following data labels.]

Source of income                1962   2009
Social Security                 30     38
Earnings                        28     29
Other                           18     4
Asset income                    15     11
Government employee pensions    6      9
Private pensions                3      9

Figure 9D: Alternative to a Pie Chart: The Slope Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Chart omitted: one line per income source joining its 1962 value (left axis) to its 2009 value (right axis), using the same values as Figure 9C.]
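The two properties these charts are meant to show (shares summing to 100 percent, and which categories rose or fell) can be checked directly. The Python sketch below uses the data labels read off Figures 9B-9D; the function name is hypothetical.

```python
# Shares of aggregate income (percent) as read from the data labels of Figures 9B-9D:
# category -> (1962, 2009).
SHARES = {
    "Social Security": (30, 38),
    "Earnings": (28, 29),
    "Asset income": (15, 11),
    "Private pensions": (3, 9),
    "Government employee pensions": (6, 9),
    "Other": (18, 4),
}

def slope_groups(shares):
    """Split categories into those that rose and those that fell between the two years."""
    increased = [name for name, (first, last) in shares.items() if last > first]
    declined = [name for name, (first, last) in shares.items() if last < first]
    return increased, declined

# Column totals for each year; a stacked bar makes this sum visible.
totals = tuple(sum(pair[i] for pair in SHARES.values()) for i in (0, 1))
increased, declined = slope_groups(SHARES)
```

In a slope chart, the `increased` group would take the darker color and the `declined` group the lighter one.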
A data set in pictures: Chetty et al., JAMA 2016

women the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).

The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).

Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals, but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.

The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).

Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.

Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (-0.22 years [P = .47] for women and -0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.

Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies

Figure 2: Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Chart omitted: separate series for Women and Men. Vertical axis: Expected Age at Death for 40-Year-Olds, y (70 to 90). Horizontal axis: Household Income Percentile (0 to 100), with mean household income in thousands of dollars shown for women and men.]
Expected age at death, y
  Women, by household income percentile: Bottom 1%: 78.8 (95% CI, 78.7-78.9); Top 1%: 88.9 (95% CI, 88.7-89.1)
  Men, by household income percentile: Bottom 1%: 72.7 (95% CI, 72.6-72.9); Top 1%: 87.3 (95% CI, 87.2-87.5)
Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
a Averaged across years and ages. The data are in thousands unless otherwise indicated.
Source: Chetty et al., "Association Between Income and Life Expectancy in the United States," JAMA, published online April 10, 2016, p. E5.
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT

A data set in pictures: Chetty et al., JAMA 2016

were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = -0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.

The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).

Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.

Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.

Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.

Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).

Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = -0.69, P < .001) and obesity (r = -0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).

Figure 4: Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Chart omitted: two panels, Men and Women, each with one series per commuting zone (Detroit, MI; New York, NY; San Francisco, CA; Dallas, TX). Vertical axis: Expected Age at Death for 40-Year-Olds, y (70 to 90). Horizontal axis: Household Income Ventile (0 to 20), with mean household income in thousands of dollars.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
Source: Chetty et al., "Association Between Income and Life Expectancy in the United States," JAMA, published online April 10, 2016, p. E7.
Visualizing a research strategy

Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
[Map omitted.]
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high-resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within acceptable range of a station are left in white.
Visualizing a research strategy

Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point (DSP) locations within a 1-degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1-degree latitude range. The plotted line reports a local linear regression estimated separately on each side of the Huai River.
[Chart omitted: scatter plot. Vertical axis: PM10 (µg/m3), 40 to 160. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Series: PM10 in South, PM10 in North, Local Linear Regression.]
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 µg/m3 and is statistically significant (95% CI: 11.6, 71.6).
Visualizing a research strategy

Figure 3: Life Expectancy South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point (DSP) locations within a 1-degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1-degree latitude range. The plotted line reports a local linear regression estimated separately on each side of the Huai River.
[Chart omitted: scatter plot. Vertical axis: Life Expectancy (Years), 70 to 85. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Series: LE in South, LE in North, Local Linear Regression.]
The estimated change in life expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
Tablesbull Use the same principle
bull Show the databull Donrsquot lie about itbull Focus
bull Which translates inbull Report the important numbers not all the coefficientsbull Keep the number of significant digits downbull No vertical linesbull very few horizontal lines donrsquot box results (3 lines are usually
enough)bull in doubt align leftbull Report the statistics that the reader will need not necessary
those that come by defaultbull Sample questions to ask yourself
bull Do you need the R2bull Are there important tests to report (say between equality of
two coefficients)bull You may need the mean in the control group
47 56
Table 6 Impact of the reform on MGNREGS projects Evidence from asset survey
All Projects Ongoing All Projects Ongoing
(1) (2) (3) (4)
Treatment 00494 -0210 0309 00271
(0263) (0413) (0239) (0267)
Observations 390 390 385 385Mean in Control 1380 1169 1179 9819
Number foundNumber Registered
Note the unit of observation is a Gram Panchayat (GP) The dependent variables are the
number of projects registered in the public data portal (nreganicin) on May 15 2013 (1) the
number of projects declared as ongoing in nreganicin (2) the number of registered (3) and
ongoing (4) projects found by surveyors in June-July 2013 Out of 5390 projects registered in
nreganicin for the 390 GP of the survey sample a random sample of 3900 projects were
surveyed (10 per GP) The number of projects found in the survey is scaled up using the
number of registered projects divided by the number of sampled projects rate 5 GP (28
projects) could not be surveyed All specifications include district fixed effects
Table 7: Impact of the reform on fake beneficiaries. Evidence from matching of nrega.nic.in job cards with SECC census

                   All job cards         Job cards with at least one working member
                   (as of April 2014)    Intervention period      Post intervention
                                         July 2012-March 2013     Apr 2013-March 2014
                   (1)                   (2)                      (3)
Panel A: Match rate for job cards with one member only
Treatment          0.0187                0.0181                   0.0107
                   (0.00741)             (0.00766)                (0.00696)
Observations       3095                  2868                     2922
Mean in Control    0.644                 0.673                    0.698
Panel B: Match rate for job cards with two members or more
Treatment          0.0135                0.0126                   0.0104
                   (0.00613)             (0.00764)                (0.00732)
Observations       3093                  2836                     2906
Mean in Control    0.243                 0.282                    0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in
matched by name with households from the SECC census. A job card with two members or more is matched
when at least two members have been matched by name with a census household. The nrega.nic.in data was
extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a
dummy equal to one for the blocks selected for the intervention. All specifications include district
fixed effects.
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose what statistics to include, what coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output
Table 1: Results
                                  Dependent variable: ftvoteshare
                                  (1)            (2)
fncandidates                      0.134***
                                  (0.007)
RESprior                                         0.059**
                                                 (0.024)
Constant                          0.012          0.094***
                                  (0.010)        (0.019)
Observations                      372            372
R2                                0.532          0.016
Adjusted R2                       0.530          0.013
Residual Std. Error (df = 370)    0.153          0.222
F Statistic (df = 1; 370)         420.093***     6.056**
Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it
• Use transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
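The choices listed above can be illustrated without any particular table package. The sketch below is plain Python, not stargazer itself: it takes the numbers from the default output, keeps only the chosen coefficients and statistics, and prints a table with three horizontal rules and no vertical lines. The `render` helper is hypothetical, and the two reader-friendly labels are placeholders, since the slides do not say what `fncandidates` and `RESprior` measure.

```python
# Numbers copied from the default regression output shown earlier.
MODELS = [
    {"coefs": {"fncandidates": (0.134, 0.007, "***"), "Constant": (0.012, 0.010, "")},
     "stats": {"Observations": "372", "R2": "0.532", "Adjusted R2": "0.530"}},
    {"coefs": {"RESprior": (0.059, 0.024, "**"), "Constant": (0.094, 0.019, "***")},
     "stats": {"Observations": "372", "R2": "0.016", "Adjusted R2": "0.013"}},
]

# Placeholder labels (assumptions): the source does not define these variables.
LABELS = {"fncandidates": "Number of candidates [placeholder]",
          "RESprior": "Previously reserved seat [placeholder]"}


def render(models, keep_coefs, keep_stats, labels, width=14):
    """Return a plain-text table with only the chosen coefficients and statistics."""
    name_w = max(len(labels.get(k, k)) for k in keep_coefs + keep_stats) + 2
    rule = "-" * (name_w + width * len(models))
    header = " " * name_w + "".join(f"({i + 1})".ljust(width) for i in range(len(models)))
    lines = [rule, header.rstrip(), rule]
    for name in keep_coefs:
        est, se = labels.get(name, name).ljust(name_w), " " * name_w
        for m in models:
            if name in m["coefs"]:
                b, s, stars = m["coefs"][name]
                est += f"{b:.3f}{stars}".ljust(width)
                se += f"({s:.3f})".ljust(width)
            else:  # coefficient not in this model: leave the cell blank
                est += " " * width
                se += " " * width
        lines += [est.rstrip(), se.rstrip()]
    for stat in keep_stats:
        row = stat.ljust(name_w) + "".join(m["stats"][stat].ljust(width) for m in models)
        lines.append(row.rstrip())
    lines.append(rule)
    return "\n".join(lines)


table = render(MODELS, keep_coefs=["fncandidates", "RESprior"],
               keep_stats=["Observations", "R2"], labels=LABELS)
print(table)
```

Dropping the constant and the adjusted R squared is a judgment call made here for illustration; a control-group mean could be added as one more entry in each model's `stats` if it were available.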
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1, At a Glance
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials) and a 4-week mini-course you can take at your own pace
• R handles/produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte
References
• Tufte, Edward. The Visual Display of Quantitative Information. Graphics Press, 2013 (second edition).
• Schwabish, Jonathan. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 2014, vol. 28, number 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro. "The long term impact of a graduation program: Evidence from West Bengal." Mimeo, MIT.
• Duflo, Esther. "Schooling and Labor Market Consequences of School Construction in Indonesia." American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. "The association between income and life expectancy in the United States, 2001-2014." JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin. "The Rich Live Longer Everywhere. For the Poor, Geography Matters." New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li. "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler. "Do voters affect or elect policies? Evidence from the US House." The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson. "The Transformation of the Supplemental Nutrition Assistance Program." Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner. "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model." Western University, CIBC Working Paper 2013-5 (July), 2013. http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf
• Ottaviano, Gianmarco I. P., and Giovanni Peri. "Immigration and National Wages: Clarifying the Theory and the Empirics." NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February), 2012. http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July), 2009. http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf
• Hanson, Gordon H. 2012. "The Rise of Middle Kingdoms: Emerging Economies in Global Trade." Journal of Economic Perspectives 26(2): 41-64.
Figure 6A: A Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010
[Line chart, 1975 to 2010, vertical axis 0 to 35. Series: Mental, Cancer, Circulatory, Musculoskeletal.]
Source: Social Security Advisory Board (2012)
Figure 6B: Revising the Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010 (Percent)
[Small multiples, 1975 to 2010, one panel per cause: Circulatory, Mental, Musculoskeletal, Cancer.]
in this case and is a useful approach when labels are difficult to fit in the vertical column chart layout (recall Figure 3; also see Schwabish 2013c). Also notice the (subjective) decision to omit the y-axis: the usefulness of the y-axis is doubtful with data labels placed on top of each column.
Figure 9A: Two Pie Charts for Comparison
Shares of Aggregate Income, 1962 and 2007. Aggregate income by source (values as labeled on the pies):
                                1962    2007
Social Security                 30      36
Earnings                        28      29
Asset income                    15      16
Private pensions                3       9
Government employee pensions    6       8
Other                           16      3
Source: Social Security Administration (2009)
Figure 9B: Alternative to a Pie Chart: A Paired Column Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Paired columns for 1962 and 2009, with data labels on top of each column. Categories: Social Security, Earnings, Asset income, Private pensions, Government employee pensions, Other.]
Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations: for example, the relationship between a state's obesity rate and the share of people with at least a bachelor's degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).
Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Two stacked bars, 1962 and 2009. Segment labels in order: 30 and 38, 28 and 29, 18 and 4, 15 and 11, 6 and 9, 3 and 9. Legend: Social Security, Earnings, Other, Private pensions, Government employee pensions, Asset income.]
Figure 9D: Alternative to a Pie Chart: The Slope Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Two vertical axes, 1962 and 2009. 1962 labels: Social Security 30, Earnings 28, Other 18, Asset income 15, Government employee pensions 6, Private pensions 3. 2009 labels: 38, 29, 11, 9, 4.]
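The logic behind the slope chart (pair each category's first and last value, then color by the direction of change) can be sketched as follows. This is a minimal illustration in plain Python with no plotting. The shares are read from Figures 9C and 9D; the pairing of the 2009 values with categories is an assumption based on the order of the stacked-bar labels, and `slope_rows` is a hypothetical helper name.

```python
# Shares of aggregate income (percent) as read from Figures 9C and 9D.
# Assumption: 2009 values are matched to categories in the stacked-bar label order.
SHARES = {
    "Social Security": (30, 38),
    "Earnings": (28, 29),
    "Other": (18, 4),
    "Asset income": (15, 11),
    "Government employee pensions": (6, 9),
    "Private pensions": (3, 9),
}


def slope_rows(shares):
    """Return (category, start, end, change, direction) rows, largest start value first."""
    rows = []
    for category, (start, end) in shares.items():
        change = end - start
        direction = "increased" if change > 0 else "declined" if change < 0 else "unchanged"
        rows.append((category, start, end, change, direction))
    return sorted(rows, key=lambda r: r[1], reverse=True)


for category, start, end, change, direction in slope_rows(SHARES):
    print(f"{category:<30}{start:>4}{end:>4}{change:>+5}  {direction}")
```

The `direction` field is what a plotting layer would map to the two contrasting colors described in the text.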
A data set in pictures: Chetty et al., JAMA 2016
Copyright 2016 American Medical Association. All rights reserved.
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).

The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).

Local Area Variation in Life Expectancy by Income

Levels of Life Expectancy by Commuting Zone

Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals, but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.

The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).

Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.

Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (−0.22 years [P = .47] for women and −0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.

Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2. Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Scatter of expected age at death for 40-year-olds, y (vertical axis, 70 to 90), against household income percentile (horizontal axis, 0 to 100, annotated with mean household income in thousands, $ a). Series: Women, Men.]
Expected age at death, y:
Women, by household income percentile: Bottom 1%: 78.8 (95% CI, 78.7-78.9); Top 1%: 88.9 (95% CI, 88.7-89.1)
Men, by household income percentile: Bottom 1%: 72.7 (95% CI, 72.6-72.9); Top 1%: 87.3 (95% CI, 87.2-87.5)
Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
a Averaged across years and ages. The data are in thousands unless otherwise indicated.
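The gaps quoted in the note to Figure 2 follow directly from the four point estimates. As a quick arithmetic check, taking the figure's values as 78.8 and 88.9 years for women and 72.7 and 87.3 years for men (the dictionary layout and the function name `top_bottom_gap` are illustrative choices, not part of the paper):

```python
# Point estimates of expected age at death (years) reported in Figure 2.
ESTIMATES = {
    "Women": {"bottom_1pct": 78.8, "top_1pct": 88.9},
    "Men": {"bottom_1pct": 72.7, "top_1pct": 87.3},
}


def top_bottom_gap(estimates):
    """Return the top-minus-bottom gap in years for each group, rounded to one decimal."""
    return {group: round(v["top_1pct"] - v["bottom_1pct"], 1)
            for group, v in estimates.items()}


print(top_bottom_gap(ESTIMATES))
```

The confidence intervals on the gaps cannot be reproduced this way, since they depend on the sampling variability of both estimates.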
Association Between Income and Life Expectancy in the United States. Special Communication, Clinical Review & Education.
jama.com (Reprinted) JAMA. Published online April 10, 2016. E5
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT
A data set in pictures: Chetty et al., JAMA 2016
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = −0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.

The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).

Trends in Life Expectancy

Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.

Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.

Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.

Correlates of Local Area Variation in Life Expectancy

Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).

Health Behaviors

Life expectancy was negatively correlated with rates of smoking (r = −0.69, P < .001) and obesity (r = −0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4. Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Two panels, Men and Women. Vertical axis: expected age at death for 40-year-olds, y (70 to 90). Horizontal axis: household income ventile (0 to 20), annotated with mean household income in thousands, $ a. Series: Detroit, MI; New York, NY; San Francisco, CA; Dallas, TX.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within an acceptable range of a station are left in white.
Visualizing a research strategy
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
[Scatter with fitted lines. Vertical axis: PM10 (µg/m³), 0 to 160. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Legend: PM10 in South, PM10 in North, Local Linear Regression.]
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 µg/m³ and is statistically significant (95% CI: 11.6, 71.6).
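The binning described in the figure notes (a population-weighted average within each 1 degree latitude range, with circle size proportional to total population) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `weighted_bin_means`, the tuple layout, and the four sample locations are all hypothetical.

```python
from collections import defaultdict
import math


def weighted_bin_means(points, bin_width=1.0):
    """Population-weighted mean of `value` within latitude bins.

    points: iterable of (degrees_north, value, population) tuples.
    Returns {bin_start: (weighted_mean, total_population)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # bin_start -> [sum(value * pop), sum(pop)]
    for degrees_north, value, population in points:
        bin_start = math.floor(degrees_north / bin_width) * bin_width
        sums[bin_start][0] += value * population
        sums[bin_start][1] += population
    return {b: (vp / p, p) for b, (vp, p) in sorted(sums.items())}


# Hypothetical locations: (degrees north of the boundary, PM10, population).
sample = [(-0.5, 100.0, 1000), (-0.2, 120.0, 3000), (0.3, 150.0, 2000), (0.8, 170.0, 2000)]
for bin_start, (mean_pm10, population) in weighted_bin_means(sample).items():
    print(bin_start, round(mean_pm10, 1), population)
```

Each returned pair gives what one circle in the figure would need: its vertical position (the weighted mean) and its size (the total population in the bin).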
Visualizing a research strategy
Figure 3: Life Expectancy South and North of the Huai River Boundary
[Scatter with fitted lines. Vertical axis: Life Expectancy (Years), 70 to 85. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Legend: LE in South, LE in North, Local Linear Regression.]
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
The estimated change in Life Expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
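The "height of the brace" is the gap between the two fitted lines at the boundary. A simplified way to see this is to fit a straight line on each side and compare the fitted values at zero degrees. The sketch below assumes a uniform kernel (an unweighted linear fit within a fixed bandwidth), which may differ from the kernel and bandwidth behind the figure; the function names and the data are hypothetical.

```python
def ols_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def boundary_jump(points, bandwidth=5.0):
    """Fitted value just north of the boundary minus fitted value just south of it.

    points: iterable of (degrees_north, outcome). Uses an unweighted linear fit
    on each side, restricted to `bandwidth` degrees from the boundary.
    """
    south = [(x, y) for x, y in points if -bandwidth <= x < 0]
    north = [(x, y) for x, y in points if 0 <= x <= bandwidth]
    a_south, _ = ols_line(*zip(*south))
    a_north, _ = ols_line(*zip(*north))
    return a_north - a_south  # both intercepts are fitted values at x = 0


# Hypothetical data with a built-in drop of 3 years at the boundary.
data = ([(x, 80.0 - 0.1 * x) for x in (-4, -3, -2, -1)]
        + [(x, 77.0 - 0.1 * x) for x in (1, 2, 3, 4)])
print(round(boundary_jump(data), 2))
```

Fitting each side separately is what lets the two lines disagree at the boundary; a single pooled fit would force them to meet and hide the jump.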
Tables
• Use the same principle
  • Show the data
  • Don't lie about it
  • Focus
• Which translates in
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • In doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself
  • Do you need the R2?
  • Are there important tests to report (say, of equality between two coefficients)?
  • You may need the mean in the control group
47 56
Table 6 Impact of the reform on MGNREGS projects Evidence from asset survey
All Projects Ongoing All Projects Ongoing
(1) (2) (3) (4)
Treatment 00494 -0210 0309 00271
(0263) (0413) (0239) (0267)
Observations 390 390 385 385Mean in Control 1380 1169 1179 9819
Number foundNumber Registered
Note the unit of observation is a Gram Panchayat (GP) The dependent variables are the
number of projects registered in the public data portal (nreganicin) on May 15 2013 (1) the
number of projects declared as ongoing in nreganicin (2) the number of registered (3) and
ongoing (4) projects found by surveyors in June-July 2013 Out of 5390 projects registered in
nreganicin for the 390 GP of the survey sample a random sample of 3900 projects were
surveyed (10 per GP) The number of projects found in the survey is scaled up using the
number of registered projects divided by the number of sampled projects rate 5 GP (28
projects) could not be surveyed All specifications include district fixed effects
Table 7 Impact of the reform on fake beneficiaries Evidence from matching ofnreganicin job cards with SECC census
All job cards
Intervention period Post intervention
(as of April 2014) July 2012-March
2013
Apr 2013 - March
2014(1) (2) (3)
Treatment 00187 00181 00107
(000741) (000766) (000696)
Observations 3095 2868 2922
Mean in Control 0644 0673 0698
Treatment 00135 00126 00104
(000613) (000764) (000732)
Observations 3093 2836 2906
Mean in Control 0243 0282 0286
Panel A Match Rate for job cards with one member only
Panel B Match Rate for job cards with two members or more
Job cards with at least one working
member
Note The unit of observation is a GP The dependent variable is the fraction of job cards from nreganicin
matched by name with households from the SECC census A job card with two members or more is matched
when at least to members have been matched by name with a census household The nreganicin data was
extracted from the nreganicin server it covers the period from July 2011 to March 2014 Treatment is a
dummy which is equal to one for the blocks selected for the intervention All specifications include district
fixed effects
39
48 56
Stargazer
bull In practice in R use the rdquostargazerrdquo package
bull It will create standardized tables can output them for you inhtml latex word ASCII
bull The default is not bad
bull But you can customize it to chose what statistics to includechose what coefficients to include etc
bull httpscranr-projectorgwebpackages
stargazervignettesstargazerpdf
bull httpjakerusscomcheatsheetsstargazerhtml
49 56
Default output
Table 1 Results
Dependent variableftvoteshare
(1) (2)fncandidates 0134lowastlowastlowast
(0007)
RESprior 0059lowastlowast
(0024)
Constant 0012 0094lowastlowastlowast
(0010) (0019)
An Economist's Guide to Visualizing Data, p. 221
Figure 6A: A Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010
[Figure: one line per cause (Mental, Cancer, Circulatory, Musculoskeletal) on a shared plot; x-axis 1975 to 2010, y-axis 0 to 35.]
Source: Social Security Advisory Board (2012)
Figure 6B: Revising the Spaghetti Chart
Initial DI Worker Awards by Major Cause of Disability, Calendar Years 1975-2010 (Percent)
[Figure: separate panels for Circulatory, Mental, Musculoskeletal, and Cancer, each with an x-axis from 1975 to 2010 and labeled start and end values.]
An Economist's Guide to Visualizing Data, p. 225
in this case and is a useful approach when labels are difficult to fit in the vertical column chart layout (recall Figure 3; also see Schwabish 2013c). Also notice the (subjective) decision to omit the y-axis: the usefulness of the y-axis is doubtful with data labels placed on top of each column.
Figure 9A: Two Pie Charts for Comparison
Shares of Aggregate Income, 1962 and 2007 (aggregate income by source)
1962: Social Security 30, Earnings 28, Other 16, Asset income 15, Government employee pensions 6, Private pensions 3
2007: Social Security 36, Earnings 29, Asset income 16, Private pensions 9, Government employee pensions 8, Other 3
Source: Social Security Administration (2009)
Figure 9B: Alternative to a Pie Chart: A Paired Column Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Figure: paired columns for 1962 and 2009 by source (Social Security, Earnings, Asset income, Private pensions, Government employee pensions, Other), with data labels on top of each column and no y-axis.]
Journal of Economic Perspectives, p. 226
Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations: for example, the relationship between a state's obesity rate and the share of people with at least a bachelor's degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).
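The coloring rule behind a slope chart can be sketched in a few lines of Python. This is only an illustration, not the article's procedure: it uses the shares labeled on the two pie charts of Figure 9A (1962 and 2007), because those are the values that are clearly paired with categories on the slide, and the function name `slope_chart_rows` is hypothetical.

```python
# Shares of aggregate income as labeled on the Figure 9A pie charts.
shares_1962 = {
    "Social Security": 30, "Earnings": 28, "Other": 16,
    "Asset income": 15, "Government employee pensions": 6,
    "Private pensions": 3,
}
shares_2007 = {
    "Social Security": 36, "Earnings": 29, "Other": 3,
    "Asset income": 16, "Government employee pensions": 8,
    "Private pensions": 9,
}


def slope_chart_rows(start, end):
    """Pair each category's start and end value and classify the change.

    Hypothetical helper: the direction is what a slope chart would encode
    with color (for example blue for increases, orange for declines).
    """
    rows = []
    for category, first in start.items():
        last = end[category]
        if last > first:
            direction = "increased"
        elif last < first:
            direction = "declined"
        else:
            direction = "unchanged"
        rows.append((category, first, last, direction))
    return rows


if __name__ == "__main__":
    for category, first, last, direction in slope_chart_rows(shares_1962, shares_2007):
        print(f"{category}: {first} -> {last} ({direction})")
```

Each row is one line segment in the slope chart: the two numbers are the end points on the two vertical axes, and the direction decides the color.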
Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Figure: one stacked bar per year (1962, 2009), segmented by source: Social Security, Earnings, Other, Asset income, Government employee pensions, Private pensions.]
Figure 9D: Alternative to a Pie Chart: The Slope Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Figure: two vertical axes (1962, 2009) with one line per source. 1962 labels: Social Security 30, Earnings 28, Other 18, Asset income 15, Government employee pensions 6, Private pensions 3.]
A data set in pictures: Chetty et al., JAMA 2016
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).
The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).
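The step from an annual increase to a total over 2001-2014 can be checked with a short Python sketch. It reads the figures above as 0.18, 0.22, 0.02 and 0.003 years per year, and it assumes the window counts as 13 annual steps (2014 minus 2001); that count is an inference from 2.34 / 0.18, not something the text states.

```python
# Assumption: 2001-2014 is treated as 13 annual steps.
YEARS = 2014 - 2001

# (annual increase, reported total increase), in years of life expectancy.
reported = {
    "men, top 5%": (0.18, 2.34),
    "women, top 5%": (0.22, 2.91),
    "men, bottom 5%": (0.02, 0.32),
    "women, bottom 5%": (0.003, 0.04),
}


def total_change(annual_change, years=YEARS):
    """Total change implied by a constant annual change over `years` steps."""
    return annual_change * years


if __name__ == "__main__":
    for group, (annual, total) in reported.items():
        implied = total_change(annual)
        print(f"{group}: {annual} x {YEARS} = {implied:.2f} (reported {total})")
```

The implied and reported totals agree for men in the top 5% and differ slightly elsewhere, presumably because the published annual rates are rounded before being printed.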
Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.
The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).
Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.
Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (-0.22 years [P = .47] for women and -0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.
Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2: Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Figure: expected age at death for 40-year-olds (years, 70 to 90) against household income percentile (0 to 100), with mean household income in thousands of dollars shown along the x-axis (a); separate series for women and men.]
Expected age at death, years:
Women, by household income percentile: bottom 1%: 78.8 (95% CI, 78.7-78.9); top 1%: 88.9 (95% CI, 88.7-89.1)
Men, by household income percentile: bottom 1%: 72.7 (95% CI, 72.6-72.9); top 1%: 87.3 (95% CI, 87.2-87.5)
Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
a Averaged across years and ages. The data are in thousands unless otherwise indicated.
Source: Association Between Income and Life Expectancy in the United States. JAMA, published online April 10, 2016, p. E5.
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT
A data set in pictures: Chetty et al., JAMA 2016
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = -0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.
The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).
Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.
Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.
Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.
Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).
Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = -0.69, P < .001) and obesity (r = -0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4: Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Figure: two panels (Men, Women) plotting expected age at death for 40-year-olds (years, 70 to 90) against household income ventile (0 to 20), with mean household income in thousands of dollars shown along the x-axis (a); one series each for Detroit, MI; New York, NY; San Francisco, CA; and Dallas, TX.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within acceptable range of a station are left in white.
Visualizing a research strategy
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
[Figure: PM10 (µg/m³, 40 to 160) against degrees north of the Huai River boundary (-20 to 20); circles for PM10 in South and PM10 in North, with local linear regression lines.]
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 µg/m³ and is statistically significant (95% CI: 11.6, 71.6).
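The binning described in the figure notes (one circle per 1 degree latitude range, with a population-weighted average and a circle size proportional to total population) can be sketched as follows. This is a minimal illustration with made-up data points, not the authors' code; the function name `weighted_bin_means` and the tuple layout are assumptions, and the local linear regression is not reproduced.

```python
import math
from collections import defaultdict


def weighted_bin_means(points, bin_width=1.0):
    """Population-weighted mean of a value within latitude bins.

    points: iterable of (degrees_north, value, population) tuples, where
    degrees_north is the distance north of the boundary (negative = south).
    Returns {bin_start: (weighted_mean, total_population)}; the total
    population is what would drive the circle size in the plot.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # bin -> [sum(value * pop), sum(pop)]
    for degrees_north, value, population in points:
        bin_start = math.floor(degrees_north / bin_width) * bin_width
        sums[bin_start][0] += value * population
        sums[bin_start][1] += population
    return {
        bin_start: (weighted_sum / total_pop, total_pop)
        for bin_start, (weighted_sum, total_pop) in sorted(sums.items())
    }


if __name__ == "__main__":
    # Hypothetical locations: (degrees north, PM10, population in millions).
    locations = [
        (-1.5, 100.0, 2.0),
        (-1.2, 130.0, 1.0),
        (0.3, 150.0, 3.0),
        (0.8, 170.0, 1.0),
    ]
    for bin_start, (mean, pop) in weighted_bin_means(locations).items():
        print(f"bin starting at {bin_start:+.0f} degrees: mean {mean:.1f}, population {pop:.1f}")
```

Weighting by population means that a large city pulls its bin's circle toward its own value, which is also why the circle sizes in the figure differ.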
Visualizing a research strategy
Figure 3: Life Expectancy South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
[Figure: life expectancy (years, 70 to 85) against degrees north of the Huai River boundary (-20 to 20); circles for LE in South and LE in North, with local linear regression lines.]
The estimated change in life expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
Tables
• Use the same principles:
  • Show the data
  • Don't lie about it
  • Focus
• Which translates into:
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • When in doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself:
  • Do you need the R2?
  • Are there important tests to report (say, a test of equality of two coefficients)?
  • You may need the mean in the control group
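As an illustration of these rules, here is a minimal Python sketch of a plain-text table formatter. It is not a substitute for a table package: the function `format_table` is hypothetical, the numbers are made up, and the only rules it encodes are few digits, no vertical lines, exactly three horizontal rules, left alignment, and a "Mean in control" row instead of default statistics.

```python
def format_table(columns, rows, digits=3):
    """Render a results table as plain text with three horizontal rules.

    columns: list of column headers, e.g. ["(1)", "(2)"].
    rows: list of (label, values); floats are rounded to `digits` decimals,
    other values (counts, pre-formatted standard errors) are shown as given.
    """
    def fmt(value):
        return f"{value:.{digits}f}" if isinstance(value, float) else str(value)

    header = [""] + list(columns)
    cells = [[label] + [fmt(v) for v in values] for label, values in rows]
    widths = [max(len(r[i]) for r in [header] + cells) for i in range(len(header))]

    def line(row):
        # Left-align every cell; no vertical separators.
        return "  ".join(c.ljust(w) for c, w in zip(row, widths)).rstrip()

    rule = "-" * (sum(widths) + 2 * (len(widths) - 1))
    return "\n".join([rule, line(header), rule] + [line(r) for r in cells] + [rule])


if __name__ == "__main__":
    # Made-up numbers, for layout only.
    print(format_table(
        ["(1)", "(2)"],
        [
            ("Treatment", [0.12345, -0.21]),
            ("", ["(0.050)", "(0.080)"]),
            ("Observations", [390, 390]),
            ("Mean in control", [13.8, 11.69]),
        ],
    ))
```

Standard errors are passed in as pre-formatted strings so that the parentheses survive; everything else is rounded in one place, which keeps the number of digits consistent across the table.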
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                   Number registered           Number found
                   All projects   Ongoing      All projects   Ongoing
                   (1)            (2)          (3)            (4)
Treatment          0.0494         -0.210       0.309          0.0271
                   (0.263)        (0.413)      (0.239)        (0.267)
Observations       390            390          385            385
Mean in control    13.80          11.69        11.79          9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1), the number of projects declared as ongoing in nrega.nic.in (2), and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5390 projects registered in nrega.nic.in for the 390 GPs of the survey sample, a random sample of 3900 projects was surveyed (10 per GP). The number of projects found in the survey is scaled up using the ratio of the number of registered projects to the number of sampled projects. 5 GPs (28 projects) could not be surveyed. All specifications include district fixed effects.
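The scaling step in the table note can be written out as a one-line calculation. The sketch below assumes the scaling is applied GP by GP (the note does not say whether it is per GP or in aggregate), and the function name and example numbers are hypothetical.

```python
def scaled_projects_found(found_in_sample, registered, sampled):
    """Scale the number of projects found among sampled projects up to
    the full set of registered projects.

    Assumption: scaling factor = registered / sampled, applied per GP.
    """
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return found_in_sample * registered / sampled


if __name__ == "__main__":
    # Hypothetical GP: 14 projects registered, 10 sampled, 8 of those found.
    estimate = scaled_projects_found(found_in_sample=8, registered=14, sampled=10)
    print(f"Estimated number of registered projects that exist: {estimate:.1f}")
```

With this scaling, the "Number found" columns are on the same footing as the "Number registered" columns, so the two control-group means can be compared directly.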
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                   All job cards         Job cards with at least one working member
                   (as of April 2014)    Intervention period       Post intervention
                                         July 2012-March 2013      Apr 2013-March 2014
                   (1)                   (2)                       (3)
Panel A: Match rate for job cards with one member only
Treatment          0.0187                0.0181                    0.0107
                   (0.00741)             (0.00766)                 (0.00696)
Observations       3095                  2868                      2922
Mean in control    0.644                 0.673                     0.698
Panel B: Match rate for job cards with two members or more
Treatment          0.0135                0.0126                    0.0104
                   (0.00613)             (0.00764)                 (0.00732)
Observations       3093                  2836                      2906
Mean in control    0.243                 0.282                     0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose what statistics to include, what coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output
Table 1: Results

                                  Dependent variable: ftvoteshare
                                  (1)              (2)
fncandidates                      0.134***
                                  (0.007)
RESprior                                           0.059**
                                                   (0.024)
Constant                          0.012            0.094***
                                  (0.010)          (0.019)
Observations                      372              372
R2                                0.532            0.016
Adjusted R2                       0.530            0.013
Residual Std. Error (df = 370)    0.153            0.222
F Statistic (df = 1; 370)         420.093***       6.056**

Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it
• Use transparent label names for the dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1, At a Glance
Tools and resources
• R is a great visualization tool (especially ggplot). You need to pick up a good book to choose colors, etc.
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials) and a 4-week mini-course you can take at your own pace
• R handles and produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte
References
• Tufte, Edward. The Visual Display of Quantitative Information. Graphics Press, 2013 (second edition).
• Schwabish, Jonathan. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 2014, vol. 28, number 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro. "The long term impact of a graduation program: Evidence from West Bengal." Mimeo, MIT.
• Duflo, Esther. "Schooling and Labor Market Consequences of School Construction in Indonesia." American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. "The association between income and life expectancy in the United States, 2001-2014." JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin. "The Rich Live Longer Everywhere. For the Poor, Geography Matters." New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li. "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler. "Do voters affect or elect policies? Evidence from the US House." The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson. "The Transformation of the Supplemental Nutrition Assistance Program." Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner. "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model." Western University, CIBC Working Paper 2013-5 (July). http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf. 2013.
• Ottaviano, Gianmarco I. P., and Giovanni Peri. "Immigration and National Wages: Clarifying the Theory and the Empirics." NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February). http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf. 2012.
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July). http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf. 2009.
• Hanson, Gordon H. 2012. "The Rise of Middle Kingdoms: Emerging Economies in Global Trade." Journal of Economic Perspectives 26(2): 41-64.
An Economistrsquos Guide to Visualizing Data 225
in this case and is a useful approach when labels are diffi cult to fi t in the vertical in this case and is a useful approach when labels are diffi cult to fi t in the vertical column chart layout (recall Figure 3 also see Schwabish 2013c) Also notice the column chart layout (recall Figure 3 also see Schwabish 2013c) Also notice the (subjective) decision to omit the y-axis the usefulness of the y-axis is doubtful with (subjective) decision to omit the y-axis the usefulness of the y-axis is doubtful with data labels placed on top of each columndata labels placed on top of each column
Figure 9ATwo Pie Charts for Comparison
Aggregate income by source
1962
Other16
Governmentemployeepensions
6
Assetincome15 Earnings
28
SocialSecurity30
Privatepensions
3
2007
Other3
Governmentemployeepensions
8Private
pensions9
Assetincome16
Earnings29
SocialSecurity36
Shares of Aggregate Income 1962 and 2007
Source Social Security Administration (2009)
Figure 9BAlternative to a Pie Chart A Paired Column Chart
Shares of Aggregate Income 1962 and 2009(Percent)
30 28
15
36
18
38
29
11 9 9
4
Social Security Earnings Asset income Privatepensions
Governmentemployeepensions
Other
1962 2009
226 Journal of Economic Perspectives
Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations, for example, the relationship between a state's obesity rate and the share of people with at least a bachelor's degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).
Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Two stacked bars, 1962 / 2009: Social Security 30 / 38; Earnings 28 / 29; Other 18 / 4; Asset income 15 / 11; Government employee pensions 6 / 9; Private pensions 3 / 9]
Figure 9D: Alternative to a Pie Chart: The Slope Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Slope chart with one line per category from the 1962 axis to the 2009 axis. 1962 labels: Social Security 30, Earnings 28, Other 18, Asset income 15, Government employee pensions 6, Private pensions 3. 2009 values: 38, 29, 11, 9, 9, 4]
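As a rough illustration of the color rule described for the slope chart, the sketch below tags each category as rising or falling between the two years. The shares are the values as read from the figure labels (which 2009 value belongs to which category is an assumption), and the function name `slope_segments` is hypothetical. Drawing the lines themselves would be left to a plotting library.

```python
# Shares of aggregate income (percent), as read from the figure labels.
# The pairing of 2009 values with categories is an assumption.
shares = {
    "Social Security": (30, 38),
    "Earnings": (28, 29),
    "Asset income": (15, 11),
    "Private pensions": (3, 9),
    "Government employee pensions": (6, 9),
    "Other": (18, 4),
}


def slope_segments(data, up_color="blue", down_color="orange"):
    """Return one (label, start, end, color) tuple per category.

    Categories that rise get up_color; all others (including ties,
    of which there are none here) get down_color.
    """
    segments = []
    for label, (start, end) in data.items():
        color = up_color if end > start else down_color
        segments.append((label, start, end, color))
    # Largest starting share first, matching a top-to-bottom label order.
    return sorted(segments, key=lambda s: s[1], reverse=True)


segments = slope_segments(shares)
for label, start, end, color in segments:
    print(f"{label:<30}{start:>3} -> {end:>3}  {color}")
```

Keeping the classification separate from the drawing makes it easy to check that the two colors really encode direction of change before any chart is produced.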
A data set in pictures: Chetty et al., JAMA 2016
Copyright 2016 American Medical Association All rights reserved
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).
The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).
Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals, but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.
The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).
Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.
Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (-0.22 years [P = .47] for women and -0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.
Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2. Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Figure: expected age at death for 40-year-olds (years, axis from 70 to 90) against household income percentile (0 to 100), plotted separately for women and men; mean household income in thousands of dollars (a) is shown beneath the percentile axis.]
Expected age at death (years):
Women, by household income percentile: bottom 1%, 78.8 (95% CI, 78.7-78.9); top 1%, 88.9 (95% CI, 88.7-89.1)
Men, by household income percentile: bottom 1%, 72.7 (95% CI, 72.6-72.9); top 1%, 87.3 (95% CI, 87.2-87.5)
Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
a Averaged across years and ages. The data are in thousands unless otherwise indicated.
Association Between Income and Life Expectancy in the United States. Special Communication, Clinical Review & Education
jama.com (Reprinted) JAMA, Published online April 10, 2016, E5
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT
A data set in pictures: Chetty et al., JAMA 2016
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = -0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.
The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).
Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.
Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.
Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.
Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).
Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = -0.69, P < .001) and obesity (r = -0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4. Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Figure, two panels (Men, Women): expected age at death for 40-year-olds (years, axis from 70 to 90) against household income ventile (0 to 20), with one line each for Detroit, MI; New York, NY; San Francisco, CA; and Dallas, TX. Mean household income in thousands of dollars (a) is shown beneath the ventile axis.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds, computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high-resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within acceptable range of a station are left in white.
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
[Figure: PM10 (μg/m3, axis from 40 to 160) against degrees north of the Huai River boundary (-20 to 20). Legend: PM10 in South, PM10 in North, Local Linear Regression.]
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 μg/m3 and is statistically significant (95% CI: 11.6, 71.6).
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
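The binning step described in the figure notes can be sketched in a few lines of Python. This is only an illustration: the function name and the four sample points are made up (they are not data from the paper), bins are taken as half-open 1-degree intervals (an assumption), and the local linear regression itself is not reproduced.

```python
import math
from collections import defaultdict


def binned_weighted_means(points, bin_width=1.0):
    """Population-weighted mean per latitude bin.

    points: iterable of (degrees_north, value, population).
    Returns {bin_start: (weighted_mean, total_population)}; the total
    population is what would set the circle size in the figure.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # bin -> [sum(pop * value), sum(pop)]
    for degrees_north, value, population in points:
        b = math.floor(degrees_north / bin_width)  # half-open bin [b, b + 1)
        sums[b][0] += population * value
        sums[b][1] += population
    return {b * bin_width: (wv / w, w) for b, (wv, w) in sorted(sums.items())}


# Hypothetical locations: (degrees north of the boundary, PM10, population).
sample = [
    (-0.5, 100.0, 2.0),
    (-0.2, 130.0, 1.0),
    (0.3, 150.0, 1.0),
    (0.8, 170.0, 3.0),
]
circles = binned_weighted_means(sample)
for start, (mean, pop) in circles.items():
    print(start, mean, pop)
```

Because the bins never straddle zero, observations south and north of the boundary stay in separate circles, which is what lets the plot show a jump at the river.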
Figure 3: Life Expectancy South and North of the Huai River Boundary
[Figure: life expectancy (years, axis from 70 to 85) against degrees north of the Huai River boundary (-20 to 20). Legend: LE in South, LE in North, Local Linear Regression.]
The estimated change in life expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
Tables
• Use the same principles:
    • Show the data
    • Don't lie about it
    • Focus
• Which translates into:
    • Report the important numbers, not all the coefficients
    • Keep the number of significant digits down
    • No vertical lines
    • Very few horizontal lines; don't box results (3 lines are usually enough)
    • When in doubt, align left
    • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself:
    • Do you need the R2?
    • Are there important tests to report (say, a test of equality of two coefficients)?
    • You may need the mean in the control group
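A minimal sketch of these layout rules, assuming a plain-text target: few significant digits, left alignment, no vertical lines, and exactly three horizontal rules. The helper `format_table` and the numbers passed to it are hypothetical and only illustrate the rules; they are not taken from any table in these slides.

```python
def format_table(title, columns, rows, digits=2):
    """Render rows as left-aligned text with three horizontal rules."""

    def cell(x):
        # Floats are cut to `digits` significant digits; counts print as-is.
        if isinstance(x, float):
            return f"{x:.{digits}g}"
        return str(x)

    table = [[""] + list(columns)]
    table += [[label] + [cell(v) for v in values] for label, values in rows]
    widths = [max(len(r[i]) for r in table) for i in range(len(table[0]))]
    rule = "-" * (sum(widths) + 2 * (len(widths) - 1))

    lines = [title, rule]  # rule 1: above the header
    for i, r in enumerate(table):
        lines.append("  ".join(c.ljust(w) for c, w in zip(r, widths)).rstrip())
        if i == 0:
            lines.append(rule)  # rule 2: below the header
    lines.append(rule)  # rule 3: closes the table
    return "\n".join(lines)


# Made-up estimates, for illustration only.
text = format_table(
    "Effect of treatment (illustrative numbers)",
    ["Registered", "Found"],
    [
        ("Treatment", [0.04937, -0.2104]),
        ("Mean in control", [13.8, 11.69]),
        ("Observations", [390, 390]),
    ],
)
print(text)
```

Passing the control-group mean as an ordinary row keeps it next to the coefficient it helps interpret, which is the point of the last bullet above.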
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                   Number registered           Number found
                   All projects   Ongoing      All projects   Ongoing
                   (1)            (2)          (3)            (4)
Treatment          0.0494         -0.210       0.309          0.0271
                   (0.263)        (0.413)      (0.239)        (0.267)
Observations       390            390          385            385
Mean in Control    13.80          11.69        11.79          9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1), the number of projects declared as ongoing in nrega.nic.in (2), and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5,390 projects registered in nrega.nic.in for the 390 GP of the survey sample, a random sample of 3,900 projects were surveyed (10 per GP). The number of projects found in the survey is scaled up using the ratio of the number of registered projects to the number of sampled projects. 5 GP (28 projects) could not be surveyed. All specifications include district fixed effects.
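The scaling described in the note to Table 6 amounts to one multiplication. The sketch below assumes the ratio is applied GP by GP (the note does not say at which level it is applied), and both the function name and the example GP are hypothetical.

```python
def scale_up_found(found_in_sample, registered, sampled):
    """Scale a count found in the surveyed sample up to all registered projects."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return found_in_sample * registered / sampled


# Hypothetical GP: 14 projects registered, 10 sampled, 8 of those found.
estimate = scale_up_found(found_in_sample=8, registered=14, sampled=10)
print(round(estimate, 1))
```

Scaling this way puts the survey counts on the same footing as the registered counts, so columns (3) and (4) can be read against columns (1) and (2).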
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                   All job cards         Job cards with at least one working member
                   (as of April 2014)    Intervention period      Post intervention
                                         July 2012-March 2013     Apr 2013-March 2014
                   (1)                   (2)                      (3)
Panel A: Match rate for job cards with one member only
Treatment          0.0187                0.0181                   0.0107
                   (0.00741)             (0.00766)                (0.00696)
Observations       3095                  2868                     2922
Mean in Control    0.644                 0.673                    0.698
Panel B: Match rate for job cards with two members or more
Treatment          0.0135                0.0126                   0.0104
                   (0.00613)             (0.00764)                (0.00732)
Observations       3093                  2836                     2906
Mean in Control    0.243                 0.282                    0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy which is equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose what statistics to include, choose what coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output
Table 1: Results

                                Dependent variable: ftvoteshare
                                (1)            (2)
fncandidates                    0.134***
                                (0.007)
RESprior                                       0.059**
                                               (0.024)
Constant                        0.012          0.094***
                                (0.010)        (0.019)
Observations                    372            372
R2                              0.532          0.016
Adjusted R2                     0.530          0.013
Residual Std. Error (df = 370)  0.153          0.222
F Statistic (df = 1; 370)       420.093***     6.056**

Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it
• Transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1 - At a Glance
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials) and a 4-week mini-course you can take at your own pace
• R handles/produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte
References
• Edward Tufte, Visual Display of Quantitative Information, Graphics Press, 2013 (second edition)
• Jonathan Schwabish, “An Economist's Guide to Visualizing Data”, Journal of Economic Perspectives, 2014, vol. 28, number 1
• Banerjee, Chattopadhyay, Duflo, Shapiro, “The long term impact of a graduation program: Evidence from West Bengal”, MIMEO, MIT
• Duflo, Esther, “Schooling and Labor Market Consequences of School Construction in Indonesia”, American Economic Review, 2001
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler, “The association between income and life expectancy in the United States, 2001-2014”, JAMA 315, no. 16 (2016): 1750-1766
• Bui, Quoctrung, and Neil Irwin, “The Rich Live Longer Everywhere. For the Poor, Geography Matters”, New York Times, April 11, 2016
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, “Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy”
• Lee, David S., Enrico Moretti, and Matthew J. Butler, “Do voters affect or elect policies? Evidence from the US House”, The Quarterly Journal of Economics (2004): 807-859
• Klerman, Jacob Alex, and Caroline Danielson, “The Transformation of the Supplemental Nutrition Assistance Program”, Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888
• Stinebrickner, Ralph, and Todd Stinebrickner, “Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model”, Western University, CIBC Working Paper 2013-5 (July), http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf, 2013
• Ottaviano, Gianmarco I. P., and Giovanni Peri, “Immigration and National Wages: Clarifying the Theory and the Empirics”, NBER Working Paper 14188 (July), 2008
• Social Security Advisory Board, Aspects of Disability Decision Making: Data and Materials, Social Security Administration (February), http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf, 2012
• Social Security Administration, Fast Facts & Figures About Social Security, 2009, Social Security Administration (July), http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf, 2009
• Hanson, Gordon H., 2012, “The Rise of Middle Kingdoms: Emerging Economies in Global Trade”, Journal of Economic Perspectives 26(2): 41-64
58 56
An Economistrsquos Guide to Visualizing Data 225
in this case and is a useful approach when labels are diffi cult to fi t in the vertical in this case and is a useful approach when labels are diffi cult to fi t in the vertical column chart layout (recall Figure 3 also see Schwabish 2013c) Also notice the column chart layout (recall Figure 3 also see Schwabish 2013c) Also notice the (subjective) decision to omit the y-axis the usefulness of the y-axis is doubtful with (subjective) decision to omit the y-axis the usefulness of the y-axis is doubtful with data labels placed on top of each columndata labels placed on top of each column
Figure 9ATwo Pie Charts for Comparison
Aggregate income by source
1962
Other16
Governmentemployeepensions
6
Assetincome15 Earnings
28
SocialSecurity30
Privatepensions
3
2007
Other3
Governmentemployeepensions
8Private
pensions9
Assetincome16
Earnings29
SocialSecurity36
Shares of Aggregate Income 1962 and 2007
Source Social Security Administration (2009)
Figure 9BAlternative to a Pie Chart A Paired Column Chart
Shares of Aggregate Income 1962 and 2009(Percent)
30 28
15
36
18
38
29
11 9 9
4
Social Security Earnings Asset income Privatepensions
Governmentemployeepensions
Other
1962 2009
36 56
An Economistrsquos Guide to Visualizing Data 225
in this case and is a useful approach when labels are diffi cult to fi t in the vertical in this case and is a useful approach when labels are diffi cult to fi t in the vertical column chart layout (recall Figure 3 also see Schwabish 2013c) Also notice the column chart layout (recall Figure 3 also see Schwabish 2013c) Also notice the (subjective) decision to omit the y-axis the usefulness of the y-axis is doubtful with (subjective) decision to omit the y-axis the usefulness of the y-axis is doubtful with data labels placed on top of each columndata labels placed on top of each column
Figure 9ATwo Pie Charts for Comparison
Aggregate income by source
1962
Other16
Governmentemployeepensions
6
Assetincome15 Earnings
28
SocialSecurity30
Privatepensions
3
2007
Other3
Governmentemployeepensions
8Private
pensions9
Assetincome16
Earnings29
SocialSecurity36
Shares of Aggregate Income 1962 and 2007
Source Social Security Administration (2009)
Figure 9BAlternative to a Pie Chart A Paired Column Chart
Shares of Aggregate Income 1962 and 2009(Percent)
30 28
15
36
18
38
29
11 9 9
4
Social Security Earnings Asset income Privatepensions
Governmentemployeepensions
Other
1962 2009
37 56
226 Journal of Economic Perspectives
Alternatively the Alternatively the stacked bar chart in Figure 9C in Figure 9C shows the distribution of the shows the distribution of the various groups and that the groups sum to 100 percent while also highlighting various groups and that the groups sum to 100 percent while also highlighting differences from one year to the other Finally the differences from one year to the other Finally the slope chart in Figure 9D in Figure 9D also also shows the difference in each category from the fi rst year to the last by pairing points shows the difference in each category from the fi rst year to the last by pairing points on two vertical axes Slope charts can be used for a variety of purposes including on two vertical axes Slope charts can be used for a variety of purposes including showing correlations for example the relationship between a statersquos obesity rate showing correlations for example the relationship between a statersquos obesity rate and the share of people with at least a bachelorrsquos degree (Cairo 2013) In this and the share of people with at least a bachelorrsquos degree (Cairo 2013) In this example the color contrast (or what appears as different shades of grey in the example the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifi es which categories increased over time black-and-white printed version) identifi es which categories increased over time (blue darker) and those that declined (orange lighter)(blue darker) and those that declined (orange lighter)
Figure 9CAlternative to a Pie Chart A Stacked Bar Chart
Shares of Aggregate Income 1962 and 2009(Percent)
30
38
28
29
18
4
15
11
6
9
3
9
1962
2009
Social Security Earnings Other Private
pensions
Government
employee pensionsAsset income
Figure 9DAlternative to a Pie Chart The Slope Chart
Shares of Aggregate Income 1962 and 2009(Percent)
38
29
11 9
4
1962 2009
Social Security 30 Earnings 28
Private pensions 3
Other 18Asset income 15
Government employeepensions 6
38 56
226 Journal of Economic Perspectives
Alternatively the Alternatively the stacked bar chart in Figure 9C in Figure 9C shows the distribution of the shows the distribution of the various groups and that the groups sum to 100 percent while also highlighting various groups and that the groups sum to 100 percent while also highlighting differences from one year to the other Finally the differences from one year to the other Finally the slope chart in Figure 9D in Figure 9D also also shows the difference in each category from the fi rst year to the last by pairing points shows the difference in each category from the fi rst year to the last by pairing points on two vertical axes Slope charts can be used for a variety of purposes including on two vertical axes Slope charts can be used for a variety of purposes including showing correlations for example the relationship between a statersquos obesity rate showing correlations for example the relationship between a statersquos obesity rate and the share of people with at least a bachelorrsquos degree (Cairo 2013) In this and the share of people with at least a bachelorrsquos degree (Cairo 2013) In this example the color contrast (or what appears as different shades of grey in the example the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifi es which categories increased over time black-and-white printed version) identifi es which categories increased over time (blue darker) and those that declined (orange lighter)(blue darker) and those that declined (orange lighter)
Figure 9C
Alternative to a Pie Chart: A Stacked Bar Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Stacked bar chart, one bar per year. Data labels, 1962 vs. 2009: Social Security 30 vs. 38; Earnings 28 vs. 29; Other 18 vs. 4; Asset income 15 vs. 11; Government employee pensions 6 vs. 9; Private pensions 3 vs. 9.]
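Figure 9D
Alternative to a Pie Chart: The Slope Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Slope chart with one vertical axis per year and one line per category. Left axis (1962): Social Security 30; Earnings 28; Other 18; Asset income 15; Government employee pensions 6; Private pensions 3. Right axis (2009): 38; 29; 11; 9 (two categories); 4.]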
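The data preparation behind a slope chart like Figure 9D can be sketched in a few lines. This is a minimal sketch, not the author's code: the function name `slope_segments` and the direction labels are hypothetical, and the shares are the data labels as read off Figures 9C and 9D. It pairs each category's two values and tags whether the line rises or falls, which is the information the two-color contrast encodes.

```python
# Shares of aggregate income (percent), as read off the data labels of
# Figures 9C and 9D: (1962 value, 2009 value).
SHARES = {
    "Social Security": (30, 38),
    "Earnings": (28, 29),
    "Other": (18, 4),
    "Asset income": (15, 11),
    "Government employee pensions": (6, 9),
    "Private pensions": (3, 9),
}


def slope_segments(shares):
    """Return one (label, start, end, direction) tuple per category.

    direction is "up", "down", or "flat"; a plotting layer can map it to
    the two contrasting colors described in the text.
    """
    segments = []
    for label, (start, end) in shares.items():
        if end > start:
            direction = "up"
        elif end < start:
            direction = "down"
        else:
            direction = "flat"
        segments.append((label, start, end, direction))
    # Order by the left-axis value so labels stack from largest to smallest.
    return sorted(segments, key=lambda seg: seg[1], reverse=True)


if __name__ == "__main__":
    for label, start, end, direction in slope_segments(SHARES):
        print(f"{label:<30}{start:>4}{end:>4}  {direction}")
```

Each tuple corresponds to one line segment from the left axis to the right axis; any plotting library can draw them.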
A data set in pictures: Chetty et al., JAMA 2016
Copyright 2016 American Medical Association. All rights reserved.
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).
The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).
Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals, but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.
The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).
Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.
Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (−0.22 years [P = .47] for women and −0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.
Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2. Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Plot of expected age at death for 40-year-olds, y (vertical axis, 70 to 90), against household income percentile (horizontal axis, 0 to 100, with mean household income in thousands, $, marked beneath), with one series each for women and men.]
Expected age at death, y:
Women, by household income percentile: bottom 1%: 78.8 (95% CI, 78.7-78.9); top 1%: 88.9 (95% CI, 88.7-89.1)
Men, by household income percentile: bottom 1%: 72.7 (95% CI, 72.6-72.9); top 1%: 87.3 (95% CI, 87.2-87.5)
Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
a Averaged across years and ages. The data are in thousands unless otherwise indicated.
Association Between Income and Life Expectancy in the United States. Special Communication, Clinical Review & Education.
jama.com (Reprinted) JAMA, published online April 10, 2016, E5
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = −0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.
The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).
Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.
Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.
Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.
Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).
Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = −0.69, P < .001) and obesity (r = −0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4. Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Two panels, Men and Women. Each plots expected age at death for 40-year-olds, y (vertical axis, 70 to 90), against household income ventile (horizontal axis, 0 to 20, with mean household income in thousands, $, marked beneath), with one line each for Detroit, MI; New York, NY; San Francisco, CA; and Dallas, TX.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds, computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high resolution grid of pollution throughout China (1 degree latitude cell width). Areas are left in white which are not within acceptable range of a station.
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
[Scatter plot of PM10 (µg/m³; vertical axis, 40 to 160) against degrees north of the Huai River boundary (horizontal axis, −20 to 20). Legend: PM10 in South; PM10 in North; Local Linear Regression.]
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 µg/m³ and is statistically significant (95% CI: 11.6, 71.6).
Figure 3: Life Expectancy South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
[Scatter plot of life expectancy (years; vertical axis, 70 to 85) against degrees north of the Huai River boundary (horizontal axis, −20 to 20). Legend: LE in South; LE in North; Local Linear Regression.]
The estimated change in Life Expectancy (and height of the brace) just north of the Huai River is −3.1 years and is statistically significant (95% CI: −5.0, −1.3).
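The idea behind the brace in these two figures, a line fitted separately on each side of the boundary and the gap between the two fits at zero, can be sketched as follows. This is an illustration under stated assumptions, not the authors' estimator: the triangular kernel, the bandwidth of 10 degrees, and the function names `weighted_line_fit` and `jump_at_boundary` are all hypothetical choices, and the data in the usage example are synthetic.

```python
def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = a + b*x; returns (a, b).

    Needs at least two distinct x values with positive weight.
    """
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def jump_at_boundary(points, bandwidth=10.0):
    """Estimate the discontinuity at x = 0 from (x, y, population) bins.

    Fits one line south (x < 0) and one north (x >= 0) of the boundary,
    weighting each bin by population times a triangular kernel, and
    returns the north intercept minus the south intercept.
    """
    intercepts = {}
    sides = (("south", lambda x: x < 0), ("north", lambda x: x >= 0))
    for side, keep in sides:
        xs, ys, ws = [], [], []
        for x, y, pop in points:
            kernel = max(0.0, 1.0 - abs(x) / bandwidth)
            if keep(x) and kernel > 0:
                xs.append(x)
                ys.append(y)
                ws.append(pop * kernel)
        intercepts[side], _ = weighted_line_fit(xs, ys, ws)
    return intercepts["north"] - intercepts["south"]


if __name__ == "__main__":
    # Synthetic bins: a linear trend plus a jump of 3 at the boundary.
    bins = [
        (x + 0.5, 2.0 + 0.1 * (x + 0.5) + (3.0 if x + 0.5 >= 0 else 0.0), 1.0 + abs(x))
        for x in range(-8, 8)
    ]
    print(round(jump_at_boundary(bins), 6))
```

Weighting by population mirrors the circle sizes in the figures; the kernel simply gives more weight to bins close to the boundary.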
Tables
• Use the same principle:
  • Show the data
  • Don't lie about it
  • Focus
• Which translates into:
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • When in doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself:
  • Do you need the R²?
  • Are there important tests to report (say, of equality between two coefficients)?
  • You may need the mean in the control group
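As a small illustration of "keep the number of significant digits down", the sketch below rounds estimates to a fixed number of significant digits before they go into a table. The helper names `round_sig` and `format_estimate` are hypothetical, the choice of 3 digits is an assumption, and the numbers in the usage example are made up.

```python
from math import floor, log10


def round_sig(value, digits=3):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    magnitude = floor(log10(abs(value)))
    return round(value, digits - 1 - magnitude)


def format_estimate(coef, se, digits=3):
    """Format a coefficient with its standard error in parentheses.

    The "g" format drops trailing zeros, so 0.130 prints as 0.13.
    """
    return f"{round_sig(coef, digits):g} ({round_sig(se, digits):g})"


if __name__ == "__main__":
    # Made-up values standing in for raw regression output.
    print(format_estimate(0.0187432, 0.00741234))
    print(round_sig(1380.4))
```

Rounding once, at the point where the table is built, keeps the stored estimates at full precision.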
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                      Number registered            Number found
                   All projects   Ongoing     All projects   Ongoing
                       (1)          (2)           (3)          (4)
Treatment             0.0494      -0.210         0.309        0.0271
                     (0.263)      (0.413)       (0.239)      (0.267)
Observations           390          390           385          385
Mean in control       13.80        11.69         11.79        9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1), the number of projects declared as ongoing in nrega.nic.in (2), and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5,390 projects registered in nrega.nic.in for the 390 GP of the survey sample, a random sample of 3,900 projects were surveyed (10 per GP). The number of projects found in the survey is scaled up using the number of registered projects divided by the number of sampled projects rate. 5 GP (28 projects) could not be surveyed. All specifications include district fixed effects.
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                                        Job cards with at least one working member
                    All job cards       Intervention period      Post intervention
                  (as of April 2014)   July 2012-March 2013    Apr 2013-March 2014
                         (1)                   (2)                     (3)
Panel A: Match rate for job cards with one member only
Treatment              0.0187                0.0181                  0.0107
                      (0.00741)             (0.00766)               (0.00696)
Observations            3095                  2868                    2922
Mean in control         0.644                 0.673                   0.698
Panel B: Match rate for job cards with two members or more
Treatment              0.0135                0.0126                  0.0104
                      (0.00613)             (0.00764)               (0.00732)
Observations            3093                  2836                    2906
Mean in control         0.243                 0.282                   0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy which is equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose what statistics to include, choose what coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output

Table 1: Results

                                   Dependent variable: ftvoteshare
                                       (1)            (2)
fncandidates                         0.134***
                                    (0.007)
RESprior                                            0.059**
                                                   (0.024)
Constant                             0.012          0.094***
                                    (0.010)        (0.019)
Observations                          372            372
R²                                   0.532          0.016
Adjusted R²                          0.530          0.013
Residual Std. Error (df = 370)       0.153          0.222
F Statistic (df = 1; 370)          420.093***       6.056**

Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it
• Transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
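The same choices can be sketched outside of stargazer. The Python function below is only an illustration of the layout advice (readable labels, selected rows, three horizontal rules, no vertical lines, left-aligned labels); it is not stargazer's API. The name `render_table` and all numbers and labels in the usage example are hypothetical.

```python
def render_table(title, columns, rows, footer_rows):
    """Render a plain-text table with three horizontal rules and no vertical lines.

    rows and footer_rows are lists of (label, cells). Labels are left-aligned;
    cells are right-aligned under their column headers.
    """
    all_rows = rows + footer_rows
    label_width = max(len(label) for label, _ in all_rows)
    widest_cell = max(len(cell) for _, cells in all_rows for cell in cells)
    cell_width = max(max(len(c) for c in columns), widest_cell) + 2

    def line(label, cells):
        return label.ljust(label_width) + "".join(c.rjust(cell_width) for c in cells)

    rule = "-" * (label_width + cell_width * len(columns))
    out = [title, rule, line("", columns), rule]
    out += [line(label, cells) for label, cells in all_rows]
    out.append(rule)
    return "\n".join(out)


if __name__ == "__main__":
    # Illustrative values only; they do not come from any table in these slides.
    print(render_table(
        "Effect of treatment on match rate",
        ["(1)", "(2)"],
        [("Treatment", ["0.019", "0.013"]), ("", ["(0.007)", "(0.006)"])],
        [("Observations", ["3095", "3093"]),
         ("Mean in control", ["0.644", "0.243"])],
    ))
```

The footer carries only what a reader needs (observations and the control-group mean), in line with the list above.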
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1, At a Glance
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials for how to do things in R (http://flowingdata.com/category/tutorials) and a 4-week mini-course you can take at your own pace
• R handles and produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte
References
• Edward Tufte, Visual Display of Quantitative Information, Graphics Press, 2013 (second edition)
• Jonathan Schwabish, "An Economist's Guide to Visualizing Data," Journal of Economic Perspectives, 2014, vol. 28, number 1
• Banerjee, Chattopadhyay, Duflo, Shapiro, "The long term impact of a graduation program: Evidence from West Bengal," mimeo, MIT
• Duflo, Esther, "Schooling and Labor Market Consequences of School Construction in Indonesia," American Economic Review, 2001
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler, "The association between income and life expectancy in the United States, 2001-2014," JAMA 315, no. 16 (2016): 1750-1766
• Bui, Quoctrung, and Neil Irwin, "The Rich Live Longer Everywhere. For the Poor, Geography Matters," New York Times, April 11, 2016
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy"
• Lee, David S., Enrico Moretti, and Matthew J. Butler, "Do voters affect or elect policies? Evidence from the U.S. House," The Quarterly Journal of Economics (2004): 807-859
• Klerman, Jacob Alex, and Caroline Danielson, "The Transformation of the Supplemental Nutrition Assistance Program," Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888
• Stinebrickner, Ralph, and Todd Stinebrickner, "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model," Western University CIBC Working Paper 2013-5 (July), 2013, http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf
• Ottaviano, Gianmarco I. P., and Giovanni Peri, "Immigration and National Wages: Clarifying the Theory and the Empirics," NBER Working Paper 14188 (July), 2008
• Social Security Advisory Board, Aspects of Disability Decision Making: Data and Materials, Social Security Administration (February), 2012, http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf
• Social Security Administration, Fast Facts & Figures About Social Security, 2009, Social Security Administration (July), 2009, http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf
• Hanson, Gordon H., 2012, "The Rise of Middle Kingdoms: Emerging Economies in Global Trade," Journal of Economic Perspectives 26(2): 41-64
An Economist's Guide to Visualizing Data 225
in this case and is a useful approach when labels are difficult to fit in the vertical column chart layout (recall Figure 3; also see Schwabish 2013c). Also notice the (subjective) decision to omit the y-axis; the usefulness of the y-axis is doubtful with data labels placed on top of each column.
Figure 9A
Two Pie Charts for Comparison
Aggregate income by source
[Two pie charts with data labels (percent). 1962: Social Security 30; Earnings 28; Other 16; Asset income 15; Government employee pensions 6; Private pensions 3. 2007: Social Security 36; Earnings 29; Asset income 16; Private pensions 9; Government employee pensions 8; Other 3.]
Shares of Aggregate Income, 1962 and 2007
Source: Social Security Administration (2009)
Figure 9B
Alternative to a Pie Chart: A Paired Column Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Paired column chart, one pair of columns (1962, 2009) per category, with data labels: Social Security 30, 38; Earnings 28, 29; Asset income 15, 11; Private pensions 3, 9; Government employee pensions 6, 9; Other 18, 4.]
A data set in pictures Chetty et alJAMA 2016 as rendered in NYT
42 56
A data set in pictures Chetty et alJAMA 2016
Copyright 2016 American Medical Association All rights reserved
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = -0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.
The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).
Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.
Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.
Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.
Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).
Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = -0.69, P < .001) and obesity (r = -0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4: Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Two panels, Men and Women: expected age at death for 40-year-olds (years, 70 to 90) against household income ventile (0 to 20), with mean household income in thousands of dollars (a) marked along the axis; one line each for Detroit, MI; New York, NY; San Francisco, CA; and Dallas, TX.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
43 56
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high resolution grid of pollution throughout China (1 degree latitude cell width). Areas are left in white which are not within acceptable range of a station.
44 56
Visualizing a research strategy
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
[Plot: PM10 (μg/m³, 40 to 160) against degrees north of the Huai River boundary (-20 to 20); series: PM10 in South, PM10 in North, local linear regression.]
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 μg/m³ and is statistically significant (95% CI: 11.6, 71.6).
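The binning step described in the figure notes can be sketched as follows. This is a minimal Python sketch, not the authors' code: the function name `binned_weighted_means` and the sample points are hypothetical, and the local linear regression drawn on each side of the boundary is not reproduced here.

```python
import math
from collections import defaultdict


def binned_weighted_means(points):
    """Group points into 1-degree latitude bins and average within each bin.

    points: iterable of (degrees_north, value, population) tuples.
    Returns {bin_start: (population_weighted_mean, total_population)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # bin -> [sum(value * pop), sum(pop)]
    for degrees_north, value, population in points:
        bin_start = math.floor(degrees_north)  # e.g. -1.5 falls in the bin [-2, -1)
        sums[bin_start][0] += value * population
        sums[bin_start][1] += population
    return {b: (vp / p, p) for b, (vp, p) in sorted(sums.items())}


# Hypothetical illustration data (NOT values from the paper):
# (degrees north of the boundary, pollution reading, population).
example_points = [
    (-1.5, 100.0, 2.0),
    (-1.2, 130.0, 1.0),
    (0.3, 150.0, 3.0),
    (0.8, 170.0, 1.0),
]
bins = binned_weighted_means(example_points)
```

Each entry of `bins` corresponds to one circle in the figure: the weighted mean gives its height and the total population gives its size.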
45 56
Visualizing a research strategy
Figure 3: Life Expectancy South and North of the Huai River Boundary
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
[Plot: life expectancy (years, 70 to 85) against degrees north of the Huai River boundary (-20 to 20); series: LE in South, LE in North, local linear regression.]
The estimated change in life expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
46 56
Tables
• Use the same principles:
  • Show the data
  • Don't lie about it
  • Focus
• Which translates into:
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • When in doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself:
  • Do you need the R²?
  • Are there important tests to report (say, a test of equality between two coefficients)?
  • You may need the mean in the control group
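As one way to act on "keep the number of significant digits down", a small rounding helper can be applied before numbers go into a table. This is a minimal Python sketch; the helper name `round_sig` is an assumption for illustration and is not part of any package named in these slides.

```python
from math import floor, log10


def round_sig(x, digits=3):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    magnitude = floor(log10(abs(x)))  # position of the leading digit
    return round(x, digits - 1 - magnitude)


# Hypothetical raw estimates, as software might print them.
raw_estimates = [0.0187412, -0.2104, 13.8042]
rounded = [round_sig(value, 2) for value in raw_estimates]
```

Rounding by significant digits rather than by decimal places keeps small coefficients readable without padding large ones with spurious precision.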
47 56
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                    Number Registered            Number Found
                    All Projects   Ongoing       All Projects   Ongoing
                    (1)            (2)           (3)            (4)
Treatment           0.0494         -0.210        0.309          0.0271
                    (0.263)        (0.413)       (0.239)        (0.267)
Observations        390            390           385            385
Mean in Control     13.80          11.69         11.79          9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1), the number of projects declared as ongoing in nrega.nic.in (2), and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5,390 projects registered in nrega.nic.in for the 390 GP of the survey sample, a random sample of 3,900 projects were surveyed (10 per GP). The number of projects found in the survey is scaled up using the number of registered projects divided by the number of sampled projects. 5 GP (28 projects) could not be surveyed. All specifications include district fixed effects.
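The scaling described in the note can be illustrated with a short calculation. This is a sketch under assumptions: the function name `scale_up_found` and the example counts are hypothetical, and the formula is only my reading of the note (found count multiplied by registered projects over sampled projects).

```python
def scale_up_found(found_in_sample, registered, sampled):
    """Scale the number of projects found in the sample up to the GP level.

    Multiplies the count found by (registered projects / sampled projects),
    following the description in the table note.
    """
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return found_in_sample * registered / sampled


# Hypothetical GP: 14 projects registered, 10 sampled, 8 of those found.
estimate = scale_up_found(found_in_sample=8, registered=14, sampled=10)
```

With these made-up counts, 8 of 10 sampled projects found corresponds to an estimate of 11.2 of the 14 registered projects.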
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                    All job cards          Job cards with at least one working member
                    (as of April 2014)     Intervention period      Post intervention
                                           July 2012-March 2013     Apr 2013-March 2014
                    (1)                    (2)                      (3)
Panel A: Match rate for job cards with one member only
Treatment           0.0187                 0.0181                   0.0107
                    (0.00741)              (0.00766)                (0.00696)
Observations        3095                   2868                     2922
Mean in Control     0.644                  0.673                    0.698
Panel B: Match rate for job cards with two members or more
Treatment           0.0135                 0.0126                   0.0104
                    (0.00613)              (0.00764)                (0.00732)
Observations        3093                   2836                     2906
Mean in Control     0.243                  0.282                    0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy which is equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
48 56
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, ASCII
• The default is not bad
• But you can customize it to choose what statistics to include, choose what coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
49 56
Default output
Table 1: Results

                                     Dependent variable:
                                         ftvoteshare
                                     (1)            (2)
fncandidates                         0.134***
                                     (0.007)
RESprior                                            0.059**
                                                    (0.024)
Constant                             0.012          0.094***
                                     (0.010)        (0.019)
Observations                         372            372
R2                                   0.532          0.016
Adjusted R2                          0.530          0.013
Residual Std. Error (df = 370)       0.153          0.222
F Statistic (df = 1; 370)            420.093***     6.056**
Note: *p<0.1; **p<0.05; ***p<0.01
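The stars in this table follow the thresholds in its note. As a rough check, the sketch below recomputes them from the rounded coefficients and standard errors. It assumes a standard normal approximation to the t distribution (reasonable with 370 degrees of freedom, but still an assumption); the stars in the table come from the fitted models, not from this approximation, and the function names are illustrative.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value for coef / se under a standard normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))


def stars(p):
    """Map a p-value to stars using the thresholds in the table note."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# (coefficient, standard error) pairs as printed in the table above.
printed = {
    "fncandidates": (0.134, 0.007),
    "RESprior": (0.059, 0.024),
    "Constant (1)": (0.012, 0.010),
    "Constant (2)": (0.094, 0.019),
}
recomputed = {name: stars(two_sided_p(c, se)) for name, (c, se) in printed.items()}
```

Because the inputs are rounded to three decimals, a coefficient close to a threshold could in principle land on the other side of it; here the recomputed stars agree with the printed ones.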
50 56
Improve on it
• Transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
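To make these points concrete, here is a minimal Python sketch of a plain-text table that uses readable labels, only three horizontal rules, no vertical lines, left alignment, and a control-group mean instead of a constant. The function `render_table`, the labels, and every number are hypothetical; this is not stargazer output.

```python
def render_table(title, columns, rows, width=12):
    """Render a left-aligned text table with exactly three horizontal rules."""
    label_width = max(len(label) for label, _ in rows)
    rule = "-" * (label_width + 2 + width * len(columns))

    def line(label, cells):
        text = label.ljust(label_width + 2) + "".join(c.ljust(width) for c in cells)
        return text.rstrip()

    out = [title, rule, line("", columns), rule]
    out += [line(label, cells) for label, cells in rows]
    out.append(rule)
    return "\n".join(out)


# Hypothetical results: only the statistics a reader is likely to need.
text = render_table(
    "Hypothetical results",
    ["(1)", "(2)"],
    [
        ("Treatment", ["0.12", "0.05"]),
        ("", ["(0.04)", "(0.03)"]),
        ("Mean in control group", ["1.50", "0.80"]),
        ("Observations", ["400", "400"]),
    ],
)
print(text)
```

The row list is where the editorial choices happen: which coefficients to show, which summary statistics to keep, and what to call them.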
51 56
Visualizing regression results
Figure 1: Overview of Bandhan Results
52 56
Figure 2: Endline 1 - At a Glance
53 56
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials for how to do things in R (http://flowingdata.com/category/tutorials/) and a 4-week mini-course you can take at your own pace
• R handles/produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte/
54 56
References
• Edward Tufte, Visual Display of Quantitative Information, Graphics Press, 2013 (second edition)
• Jonathan Schwabish, "An Economist's Guide to Visualizing Data," Journal of Economic Perspectives, 2014, vol. 28, number 1
• Banerjee, Chattopadhyay, Duflo, Shapiro, "The long term impact of a graduation program: Evidence from West Bengal," MIMEO, MIT
• Duflo, Esther, "Schooling and Labor Market Consequences of School Construction in Indonesia," American Economic Review, 2001
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler, "The association between income and life expectancy in the United States, 2001-2014," JAMA 315, no. 16 (2016): 1750-1766
55 56
References
• Bui, Quoctrung, and Neil Irwin, "The Rich Live Longer Everywhere. For the Poor, Geography Matters," New York Times, April 11, 2016
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy"
• Lee, David S., Enrico Moretti, and Matthew J. Butler, "Do voters affect or elect policies? Evidence from the U.S. House," The Quarterly Journal of Economics (2004): 807-859
• Klerman, Jacob Alex, and Caroline Danielson, "The Transformation of the Supplemental Nutrition Assistance Program," Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888
56 56
References
• Stinebrickner, Ralph, and Todd Stinebrickner, "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model," Western University, CIBC Working Paper 2013-5 (July), http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf, 2013
• Ottaviano, Gianmarco I. P., and Giovanni Peri, "Immigration and National Wages: Clarifying the Theory and the Empirics," NBER Working Paper 14188 (July), 2008
• Social Security Advisory Board, Aspects of Disability Decision Making: Data and Materials, Social Security Administration (February), http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf, 2012
57 56
References
• Social Security Administration, Fast Facts & Figures About Social Security, 2009, Social Security Administration (July), http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf, 2009
• Hanson, Gordon H., 2012, "The Rise of Middle Kingdoms: Emerging Economies in Global Trade," Journal of Economic Perspectives 26(2): 41-64
58 56
226 Journal of Economic Perspectives
Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations: for example, the relationship between a state's obesity rate and the share of people with at least a bachelor's degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).
Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Figure 9D: Alternative to a Pie Chart: The Slope Chart
Both figures plot Shares of Aggregate Income, 1962 and 2009 (Percent):

                                1962   2009
Social Security                 30     38
Earnings                        28     29
Other                           18     4
Asset income                    15     11
Government employee pensions    6      9
Private pensions                3      9
38 56
A data set in pictures: Chetty et al., JAMA 2016
40 56
A data set in pictures: Chetty et al., JAMA 2016
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).
The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).
Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.
The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).
Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.
Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (-0.22 years [P = .47] for women and -0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.
Table 1 lists the top 10 and bottom 10 commuting zones in meanlife expectancy (averaging men and women) among the 100 mostpopulated commuting zones for individuals in the bottom and topincome quartiles The expected age at death for the bottom quar-tile ranged from 742 years for men and 807 years for women in GaryIndiana to 795 years for men and 840 years for women in New YorkNew York The commuting zones with the highest life expectancies
Figure 2 Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile 2001-2014
90
85
80
75
70
0 100
19 million
20 million
80
112
119
60
71
77
40
45
50
Exp
ecte
d A
ge
at D
eath
fo
r 4
0-Y
ear-
Old
s y
Household Income Percentile
Mean household income in thousands $ a
WomenMen
20
24
26
Women
Men
Expected age at death y
Bottom 1 788 (95 CI 787-789)Women by household income percentile
Top 1 889 (95 CI 887-891)
Bottom 1 727 (95 CI 726-729)Men by household income percentile
Top 1 873 (95 CI 872-875)
Life expectancies were calculated using survival curves analogous to those inFigure 1 The vertical height of each bar depicts the 95 confidence intervalThe difference between expected age at death in the top and bottom incomepercentiles is 101 years (95 CI 99-103 years) for women and 146 years(95 CI 144-148 years) for men To control for differences in life expectanciesacross racial and ethnic groups race and ethnicity adjustments were calculated
using data from the National Longitudinal Mortality Survey and estimates werereweighted so that each income percentile bin has the same fraction of blackHispanic and Asian adultsa Averaged across years and ages The data are in thousands unless otherwise
indicated
Association Between Income and Life Expectancy in the United States Special Communication Clinical Review amp Education
jamacom (Reprinted) JAMA Published online April 10 2016 E5
Copyright 2016 American Medical Association All rights reserved
Downloaded From httpjamajamanetworkcom by Suman Ganguli on 04142016
41 56
A data set in pictures Chetty et alJAMA 2016 as rendered in NYT
42 56
A data set in pictures Chetty et alJAMA 2016
Copyright 2016 American Medical Association All rights reserved
were clustered in California (6 of the top 10) whereas the commut-ing zones with the lowest life expectancies were clustered in the in-dustrial Midwest (5 of the bottom 10) The commuting zones withthe highest life expectancies for those in the bottom income quar-tile also had the smallest gaps in life expectancy between the topand bottom quartiles (r = minus082 P lt 001) The expected age at deathfor the top income quartile ranged from 828 years for men and 853years for women in Las Vegas Nevada to 866 years for men and890 years for women in Salt Lake City Utah The areas with the high-est and lowest life expectancies for those in the top income quar-tile were less clustered geographically for example California hadcommuting zones in both the top 10 and bottom 10 of the list
The differences in life expectancy across commuting zones weresimilar in analyses with income measures adjusted for cost of liv-ing with controls for differences across areas in the income distri-bution within each quartile and using measures of loss in life yearsup to the age of 77 years that did not make use of extrapolations be-yond observed ages (part IVC of the eAppendix and eTable 6 in theSupplement) There was also considerable variation in life expec-tancy across counties within commuting zones (part V of theeAppendix eFigure 12 and eTable 7)
Trends in Life ExpectancySimilar to levels of life expectancy temporal trends variedsignificantly across geographic areas Figure 6 maps the annualchange in life expectancy between 2001 and 2014 by state for menand women in the bottom income quartile Hawaii Maine andMassachusetts had the largest gains in life expectancy (gaining gt019years annually) when men and women in the bottom income quar-tile were averaged The states in which low-income individuals ex-perienced the largest losses in life expectancy (losing gt009 yearsannually) were Alaska Iowa and Wyoming
Table 2 lists the top 10 and bottom 10 commuting zones interms of trends in life expectancy (when averaging men and wom-en) among the 100 most populated commuting zones for individu-als in the bottom and top income quartiles The estimated trends
for individuals in the bottom income quartile ranged from anannual gain of 038 years in Toms River New Jersey to an annualloss of 017 years in Tampa Florida Gaps in life expectancybetween the bottom and top income quartiles generally declinedor remained stable in areas in which the bottom income quartileexperienced the largest gains in life expectancy such as Toms RiverNew Jersey In contrast gaps in life expectancy between the topand bottom income quartiles increased by approximately 03 yearsannually in places such as Tampa Florida
Figure 7 shows race- and ethnicity-adjusted life expectanciesby year for men and women in the bottom income quartile in 2 com-muting zones in the top 10 (Birmingham Alabama and CincinnatiOhio) and 2 commuting zones in the bottom 10 (Knoxville Tennes-see and Tampa Florida) This Figure shows that trends in life ex-pectancy across these areas diverged continuously throughout the2000s For example life expectancy increased by approximately 32years from 2001 through 2014 for men and women in CincinnatiOhio but declined by approximately 22 years in Tampa Florida
Correlates of Local Area Variation in Life ExpectancyFigure 8 shows correlations of commuting zone-level estimates ofrace- and ethnicity-adjusted life expectancy for the bottom in-come quartile with local area characteristics The correlations are di-vided into 6 groups health behaviors access to health care envi-ronmental factors income inequality and social cohesion local labormarket conditions and other factors Data for men and women arecombined correlations were similar by sex (eTable 8 in the Supple-ment) County-level correlations were also similar (eTable 9)
Health BehaviorsLife expectancy was negatively correlated with rates of smoking(r = minus069 P lt 001) and obesity (r = minus047 P lt 001) and positivelycorrelated with exercise rates (r = 032 P = 004) among individu-als in the bottom income quartile The maps for rates of smokingobesity and exercise among low-income individuals were similarto those for life expectancy (eFigure 13 in the Supplement)
Figure 4 Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones 2001-2014
90
85
80
75
700 5
30
10
60
15
101
20
683
Exp
ecte
d A
ge
at D
eath
fo
r 4
0-Y
ear-
Old
s y
Household Income Ventile
Mean household income in thousands $ a
Men90
85
80
75
700 5
27
10
54
15
95
20
653
Exp
ecte
d A
ge
at D
eath
fo
r 4
0-Y
ear-
Old
s y
Household Income Ventile
Women
Detroit MI
New York NY
San Francisco CA
Dallas TX
Detroit MI
New York NY
San Francisco CA
Dallas TX
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins)a Averaged across years and ages
Association Between Income and Life Expectancy in the United States Special Communication Clinical Review amp Education
jamacom (Reprinted) JAMA Published online April 10 2016 E7
Copyright 2016 American Medical Association All rights reserved
Downloaded From httpjamajamanetworkcom by Suman Ganguli on 04142016
43 56
Visualizing a research strategy
Figure 1
Pollution in China and the Huai RiverQinling Mountain Range
Notes The cities shown are the locations of the Disease Surveillance Points Cities north of the solid line were covered by the home heating policy The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high resolution grid of pollution throughout China (1 degree latitude cell width) Areas are left in white which are not within acceptable range of a station
44 56
Visualizing a research strategy
Figure 2Particulate Matter Levels (PM10) South and North of the Huai River Boundary
Notes Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point locations within a 1 degree latitude range weighted by the population at each location The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range The plotted line reports a local linear regression plot estimated separately on on each side of the Huai River
4060
8010
012
014
016
0
PM10
(4gm
3 )
-20 -15 -10 -5 0 5 10 15 20Degrees North of the Huai River Boundary
PM10 in South PM10 in North Local Linear Regression
The estimated change in PM10 (and height of the brace) just north of the Huai River is 416 microgm3 and is statistically significant (95 CI 116 716)
45 56
Visualinzing a research strategy
Figure 3Life Expectancy South and North of the Huai River Boundary
Notes Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point locations within a 1 degree latitude range weighted by the population at each location The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range The plotted line reports a local linear regression plot estimated separately on on each side of the Huai River
7075
8085
Life
Expe
ctanc
y(Ye
ars)
-20 -15 -10 -5 0 5 10 15 20Degrees North of the Huai River Boundary
LE in South LE in North Local Linear Regression
The estimated change in Life Expectancy (and height of the brace) just north of the Huai River is -31 years and is statistically significant (95 CI -50 -13)
Tables
• Use the same principles:
  • Show the data
  • Don't lie about it
  • Focus
• Which translates into:
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • When in doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself:
  • Do you need the R²?
  • Are there important tests to report (say, a test of equality between two coefficients)?
  • You may need the mean in the control group
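As a rough illustration of these rules, the following Python sketch renders a small results table with three horizontal rules, no vertical lines, few significant digits, and a mean-in-control row. The helper names (`sig`, `render_table`) and all numbers are invented for the example; in R, a package such as stargazer would normally do this work.

```python
def sig(x, digits=3):
    """Format x with a small, fixed number of significant digits."""
    return f"{x:.{digits}g}"


def render_table(title, columns, rows, width=12):
    """Plain-text table: three horizontal rules, no vertical lines."""
    label_w = max(len(label) for label, _ in rows) + 2
    rule = "-" * (label_w + width * len(columns))
    lines = [title, rule]
    lines.append(" " * label_w + "".join(c.rjust(width) for c in columns))
    lines.append(rule)
    for label, cells in rows:
        # Row labels are left-aligned; numbers are right-aligned in columns.
        lines.append(label.ljust(label_w) + "".join(c.rjust(width) for c in cells))
    lines.append(rule)
    return "\n".join(lines)


# Made-up (coefficient, standard error) pairs, one per specification.
estimates = [(0.04937, 0.26311), (-0.21034, 0.41288)]
rows = [
    ("Treatment", [sig(b) for b, _ in estimates]),
    ("", [f"({sig(se)})" for _, se in estimates]),
    ("Observations", ["390", "390"]),
    ("Mean in control", [sig(13.8042), sig(11.6915)]),
]
table = render_table("Illustrative results", ["(1)", "(2)"], rows)
print(table)
```

Only the coefficient of interest is reported, and the control-group mean replaces the constant so the reader can judge the size of the effect.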
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                      Number registered            Number found
                   All projects    Ongoing    All projects    Ongoing
                       (1)           (2)          (3)           (4)
Treatment             0.0494       -0.210        0.309        0.0271
                     (0.263)       (0.413)      (0.239)       (0.267)
Observations           390           390          385           385
Mean in control       13.80         11.69        11.79         9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1); the number of projects declared as ongoing in nrega.nic.in (2); and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5,390 projects registered in nrega.nic.in for the 390 GPs of the survey sample, a random sample of 3,900 projects was surveyed (10 per GP). The number of projects found in the survey is scaled up using the ratio of the number of registered projects to the number of sampled projects. 5 GPs (28 projects) could not be surveyed. All specifications include district fixed effects.
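The scale-up described in the note is a one-line calculation. The sketch below uses a hypothetical function name and made-up counts for a single GP; it is an assumed reading of the note, not the authors' code.

```python
def scaled_projects_found(found_in_sample, registered, sampled):
    """Scale the count found among sampled projects up to all registered projects."""
    if sampled <= 0:
        raise ValueError("at least one project must be sampled")
    return found_in_sample * registered / sampled


# Made-up GP: 14 registered projects, 10 sampled, 8 of those found on site.
estimate = scaled_projects_found(found_in_sample=8, registered=14, sampled=10)
```

With these made-up counts, 8 of 10 sampled projects found implies an estimate of 8 × 14 / 10 found projects among the 14 registered.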
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                                        Job cards with at least one working member
                   All job cards        Intervention period     Post intervention
                 (as of April 2014)    July 2012-March 2013    Apr 2013-March 2014
                        (1)                    (2)                     (3)
Panel A: Match rate for job cards with one member only
Treatment             0.0187                 0.0181                  0.0107
                     (0.00741)              (0.00766)               (0.00696)
Observations           3095                   2868                    2922
Mean in control        0.644                  0.673                   0.698
Panel B: Match rate for job cards with two members or more
Treatment             0.0135                 0.0126                  0.0104
                     (0.00613)              (0.00764)               (0.00732)
Observations           3093                   2836                    2906
Mean in control        0.243                  0.282                   0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose which statistics to include, which coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output
Table 1: Results

                                     Dependent variable:
                                         ftvoteshare
                                      (1)           (2)
fncandidates                        0.134***
                                    (0.007)
RESprior                                          0.059**
                                                  (0.024)
Constant                            0.012         0.094***
                                    (0.010)       (0.019)
Observations                         372           372
R2                                  0.532         0.016
Adjusted R2                         0.530         0.013
Residual Std. Error (df = 370)      0.153         0.222
F Statistic (df = 1; 370)         420.093***      6.056**

Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it
• Use transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1 - At a Glance
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials/) and a 4-week mini-course you can take at your own pace
• R handles and produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte/
References
• Tufte, Edward. The Visual Display of Quantitative Information. Graphics Press, 2013 (second edition).
• Schwabish, Jonathan. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 2014, vol. 28, no. 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro. "The long term impact of a graduation program: Evidence from West Bengal." Mimeo, MIT.
• Duflo, Esther. "Schooling and Labor Market Consequences of School Construction in Indonesia." American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. "The association between income and life expectancy in the United States, 2001-2014." JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin. "The Rich Live Longer Everywhere. For the Poor, Geography Matters." New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li. "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler. "Do voters affect or elect policies? Evidence from the US House." The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson. "The Transformation of the Supplemental Nutrition Assistance Program." Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner. "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model." Western University, CIBC Working Paper 2013-5 (July). http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf. 2013.
• Ottaviano, Gianmarco I. P., and Giovanni Peri. "Immigration and National Wages: Clarifying the Theory and the Empirics." NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February). http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf. 2012.
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July). http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf. 2009.
• Hanson, Gordon H. 2012. "The Rise of Middle Kingdoms: Emerging Economies in Global Trade." Journal of Economic Perspectives 26(2): 41-64.
Alternatively, the stacked bar chart in Figure 9C shows the distribution of the various groups, and that the groups sum to 100 percent, while also highlighting differences from one year to the other. Finally, the slope chart in Figure 9D also shows the difference in each category from the first year to the last by pairing points on two vertical axes. Slope charts can be used for a variety of purposes, including showing correlations: for example, the relationship between a state's obesity rate and the share of people with at least a bachelor's degree (Cairo 2013). In this example, the color contrast (or what appears as different shades of grey in the black-and-white printed version) identifies which categories increased over time (blue, darker) and those that declined (orange, lighter).
Figure 9C: Alternative to a Pie Chart: A Stacked Bar Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Figure: one stacked bar per year (1962, 2009). Categories: Social Security, Earnings, Other, Asset income, Government employee pensions, Private pensions.]
Figure 9D: Alternative to a Pie Chart: The Slope Chart
Shares of Aggregate Income, 1962 and 2009 (Percent)
[Figure: slope chart pairing each category's 1962 share (left axis) with its 2009 share (right axis). 1962 labels: Social Security 30, Earnings 28, Other 18, Asset income 15, Government employee pensions 6, Private pensions 3.]
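The data behind a slope chart is just one (start, end) pair per category, plus the direction of change that drives the color contrast. The Python sketch below shows that structure. The 1962 values are transcribed from the figure labels; the pairing of the 2009 values with categories is inferred from the order in which they appear in the figure, so treat it as an assumption. The function name is hypothetical.

```python
# category: (1962 share, 2009 share), in percent.
# 2009 pairings are inferred from the figure and may be wrong.
shares = {
    "Social Security": (30, 38),
    "Earnings": (28, 29),
    "Other": (18, 4),
    "Asset income": (15, 11),
    "Government employee pensions": (6, 9),
    "Private pensions": (3, 9),
}


def slope_segments(shares):
    """One line segment per category, tagged with its direction of change."""
    segments = []
    for category, (start, end) in shares.items():
        change = end - start
        if change > 0:
            direction = "increased"   # drawn in the darker color
        elif change < 0:
            direction = "declined"    # drawn in the lighter color
        else:
            direction = "unchanged"
        segments.append({"category": category, "start": start,
                         "end": end, "change": change,
                         "direction": direction})
    return segments


segments = slope_segments(shares)
increased = [s["category"] for s in segments if s["direction"] == "increased"]
declined = [s["category"] for s in segments if s["direction"] == "declined"]
```

Each segment would be drawn from (left axis, start) to (right axis, end); checking that each year's shares sum to 100 is a quick guard against transcription errors.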
A data set in pictures: Chetty et al., JAMA 2016
Copyright 2016 American Medical Association. All rights reserved.
women, the comparable changes were 0.23 years in the highest quartile and 0.10 years in the lowest quartile (P < .001). These differences persisted after controlling for the higher growth rate of income for individuals in the top quartile relative to the bottom quartile (eTable 4 in the Supplement).
The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles. The annual increase in longevity was 0.18 years for men (which translates to an increase of 2.34 years from 2001-2014) and 0.22 years for women (an increase of 2.91 years from 2001-2014) in the top 5% of the income distribution. In the bottom 5% of the income distribution, the average annual increase in longevity was 0.02 years (an increase of 0.32 years from 2001-2014) for men and 0.003 years (an increase of 0.04 years from 2001-2014) for women (P < .001 for the differences between top and bottom 5% of income distributions for both sexes).
Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals, but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.
The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).
Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.
Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (-0.22 years [P = .47] for women and -0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.
Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2: Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Figure: expected age at death for 40-year-olds in years (y-axis, 70 to 90) against household income percentile (x-axis, 0 to 100, with mean household income in thousands of dollars shown beneath the axis, footnote a); separate series for women and men.]
Expected age at death, y:
  Women, by household income percentile: Bottom 1%: 78.8 (95% CI, 78.7-78.9); Top 1%: 88.9 (95% CI, 88.7-89.1)
  Men, by household income percentile: Bottom 1%: 72.7 (95% CI, 72.6-72.9); Top 1%: 87.3 (95% CI, 87.2-87.5)
Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
a Averaged across years and ages. The data are in thousands unless otherwise indicated.
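The top-bottom gaps quoted in the caption follow directly from the four point estimates. The short Python check below assumes the reported expected ages at death are 78.8 and 88.9 years for women and 72.7 and 87.3 years for men (decimal points are assumed here, since the extracted text dropped them); the variable and function names are illustrative only.

```python
# Expected age at death for 40-year-olds, by sex and income percentile.
# Decimal placement is an assumption about the extracted figure text.
expected_age = {
    "Women": {"bottom_1pct": 78.8, "top_1pct": 88.9},
    "Men": {"bottom_1pct": 72.7, "top_1pct": 87.3},
}


def top_bottom_gap(group):
    """Gap in expected age at death between the top and bottom percentile."""
    return round(group["top_1pct"] - group["bottom_1pct"], 1)


gaps = {sex: top_bottom_gap(values) for sex, values in expected_age.items()}
```

The gap is larger for men than for women, which matches the differences the caption reports.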
Association Between Income and Life Expectancy in the United States (Special Communication, Clinical Review & Education)
jama.com (Reprinted) JAMA. Published online April 10, 2016. E5
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT
A data set in pictures: Chetty et al., JAMA 2016
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = -0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.
The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).
Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.
Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.
Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.
Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).
Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = -0.69, P < .001) and obesity (r = -0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4: Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Figure: two panels (Men, Women) plotting expected age at death for 40-year-olds in years (y-axis, 70 to 90) against household income ventile (x-axis, 0 to 20, with mean household income in thousands of dollars shown beneath the axis, footnote a); one line each for Detroit, MI; New York, NY; San Francisco, CA; and Dallas, TX.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high-resolution grid of pollution throughout China (1-degree latitude cell width). Areas that are not within acceptable range of a station are left in white.
A data set in pictures Chetty et alJAMA 2016
40 56
A data set in pictures Chetty et alJAMA 2016
Copyright 2016 American Medical Association All rights reserved
women the comparable changes were 023 years in the highestquartile and 010 years in the lowest quartile (P lt 001) These dif-ferences persisted after controlling for the higher growth rate ofincome for individuals in the top quartile relative to the bottomquartile (eTable 4 in the Supplement)
The lower panels of Figure 3 show the annual increase in race-adjusted life expectancy by income ventiles The annual increase inlongevity was 018 years for men (which translates to an increase of234 years from 2001-2014) and 022 years for women (an in-crease of 291 years from 2001-2014) in the top 5 of the incomedistribution In the bottom 5 of the income distribution the av-erage annual increase in longevity was 002 years (an increase of 032years from 2001-2014) for men and 0003 years (an increase of 004years from 2001-2014) for women (P lt 001 for the differences be-tween top and bottom 5 of income distributions for both sexes)
Local Area Variation in Life Expectancy by Income
Levels of Life Expectancy by Commuting Zone
Life expectancy varied significantly across areas within the United States, especially for low-income individuals. Figure 4 shows life expectancy by income ventile for New York, New York; San Francisco, California; Dallas, Texas; and Detroit, Michigan. There was substantial variation across these areas for low-income individuals but little variation for high-income individuals. Life expectancy ranged from 72.3 years to 78.6 years for men in the lowest income ventile across these 4 cities; the corresponding range for men in the top ventile was 86.5 years to 87.5 years.
The results in Figure 4 are representative of the variation across commuting zones more generally. The SD of life expectancy across all commuting zones (weighted by population) was 1.39 years for men in the bottom income quartile vs 0.70 years in the top income quartile (P < .001). Life expectancy varied less across areas for women than men in the bottom income quartile, and the amount of variation across commuting zones also declined with income for women (eTable 5 in the Supplement).
Figure 5 shows maps of expected age at death by commuting zone for men and women in the bottom and top quartiles of the national income distribution (maps for the middle-income quartiles appear in eFigure 10 in the Supplement). For individuals in the bottom income quartile, life expectancy differed by about 5 years for men and 4 years for women between the lowest and highest longevity commuting zones (P < .001 for both sexes). A summary of standard errors by commuting zone appears in part V.C of the eAppendix and in eFigure 11.
Nevada, Indiana, and Oklahoma had the lowest life expectancies (<77.9 years) when men and women in the bottom income quartile were averaged. Of the 10 states with the lowest levels of life expectancy for individuals in the bottom income quartile, 8 formed a geographic belt from Michigan to Kansas (Michigan, Ohio, Indiana, Kentucky, Tennessee, Arkansas, Oklahoma, Kansas). The states with the highest life expectancies for individuals in the bottom income quartile (>80.6 years) were California, New York, and Vermont. Life expectancy in the South was similar to the national mean for both sexes (-0.22 years [P = .47] for women and -0.96 years [P = .03] for men) in the bottom income quartile. Individuals in the top income quartile had the lowest life expectancies (<85.3 years) in Nevada, Hawaii, and Oklahoma. Individuals in the top income quartile had the highest life expectancies (>87.6 years) in Utah, Washington, DC, and Vermont.
Table 1 lists the top 10 and bottom 10 commuting zones in mean life expectancy (averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The expected age at death for the bottom quartile ranged from 74.2 years for men and 80.7 years for women in Gary, Indiana, to 79.5 years for men and 84.0 years for women in New York, New York. The commuting zones with the highest life expectancies
Figure 2. Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile, 2001-2014
[Figure: expected age at death for 40-year-olds (y-axis, 70 to 90 years) against household income percentile (x-axis, 0 to 100, labeled with mean household income in thousands of dollars), plotted separately for women and men.]
Expected age at death, y:
Women, by household income percentile: bottom 1%, 78.8 (95% CI, 78.7-78.9); top 1%, 88.9 (95% CI, 88.7-89.1)
Men, by household income percentile: bottom 1%, 72.7 (95% CI, 72.6-72.9); top 1%, 87.3 (95% CI, 87.2-87.5)
Life expectancies were calculated using survival curves analogous to those in Figure 1. The vertical height of each bar depicts the 95% confidence interval. The difference between expected age at death in the top and bottom income percentiles is 10.1 years (95% CI, 9.9-10.3 years) for women and 14.6 years (95% CI, 14.4-14.8 years) for men. To control for differences in life expectancies across racial and ethnic groups, race and ethnicity adjustments were calculated using data from the National Longitudinal Mortality Survey, and estimates were reweighted so that each income percentile bin has the same fraction of black, Hispanic, and Asian adults.
a Averaged across years and ages. The data are in thousands unless otherwise indicated.
Association Between Income and Life Expectancy in the United States. Special Communication, Clinical Review & Education.
jama.com (Reprinted) JAMA, published online April 10, 2016, E5.
Copyright 2016 American Medical Association. All rights reserved.
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT
A data set in pictures: Chetty et al., JAMA 2016
were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = -0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.
The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IV.C of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).
Trends in Life Expectancy
Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.
Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.
Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.
Correlates of Local Area Variation in Life Expectancy
Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).
Health Behaviors
Life expectancy was negatively correlated with rates of smoking (r = -0.69, P < .001) and obesity (r = -0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4. Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014
[Figure: two panels (Men, Women) plotting expected age at death for 40-year-olds (y-axis, 70 to 90 years) against household income ventile (x-axis, 0 to 20, labeled with mean household income in thousands of dollars) for Detroit, MI; New York, NY; San Francisco, CA; and Dallas, TX.]
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds, computed by income ventile (5 percentile point bins).
a Averaged across years and ages.
Visualizing a research strategy
Figure 1. Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within acceptable range of a station are left in white.
Visualizing a research strategy
Figure 2. Particulate Matter Levels (PM10) South and North of the Huai River Boundary
[Figure: PM10 (µg/m3) on the y-axis against degrees north of the Huai River boundary (x-axis, -20 to 20). Markers distinguish PM10 in the South from PM10 in the North, with a local linear regression line on each side.]
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 µg/m3 and is statistically significant (95% CI: 11.6, 71.6).
Visualizing a research strategy
Figure 3. Life Expectancy South and North of the Huai River Boundary
[Figure: life expectancy (years) on the y-axis (70 to 85) against degrees north of the Huai River boundary (x-axis, -20 to 20). Markers distinguish life expectancy (LE) in the South from LE in the North, with a local linear regression line on each side.]
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
The estimated change in life expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
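The lines in the two Huai River figures come from a local linear regression estimated separately on each side of the boundary, and the reported effect is the gap between the two fits at the boundary. The following is a minimal Python sketch of that idea, not the paper's code. The triangular kernel, the 5-degree bandwidth, and the synthetic data are illustrative assumptions, and the population weighting described in the figure notes is left out.

```python
def local_linear_at_cutoff(xs, ys, bandwidth):
    """Fit a kernel-weighted line to (xs, ys) and return its value at x = 0.

    Assumption: a triangular kernel, so points nearer the cutoff weigh more.
    """
    pts = [(x, y, 1 - abs(x) / bandwidth)
           for x, y in zip(xs, ys) if abs(x) < bandwidth]
    sw = sum(w for _, _, w in pts)
    sx = sum(w * x for x, _, w in pts)
    sy = sum(w * y for _, y, w in pts)
    sxx = sum(w * x * x for x, _, w in pts)
    sxy = sum(w * x * y for x, y, w in pts)
    # Closed-form weighted least squares for a single regressor.
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    return (sy - slope * sx) / sw  # intercept = fitted value at the cutoff


def jump_at_cutoff(xs, ys, bandwidth):
    """Estimate the discontinuity at x = 0 from separate fits on each side."""
    south = [(x, y) for x, y in zip(xs, ys) if x < 0]
    north = [(x, y) for x, y in zip(xs, ys) if x >= 0]
    fit_south = local_linear_at_cutoff(*zip(*south), bandwidth)
    fit_north = local_linear_at_cutoff(*zip(*north), bandwidth)
    return fit_north - fit_south


# Synthetic, noise-free data (made up): a linear trend in latitude with a
# drop of 3 units for locations at or north of the boundary.
xs = [d + 0.5 for d in range(-10, 10)]
ys = [80 + 0.1 * x - (3.0 if x >= 0 else 0.0) for x in xs]
jump = jump_at_cutoff(xs, ys, bandwidth=5.0)
print(round(jump, 3))
```

Fitting the two sides separately is what lets the lines disagree at the boundary. A single regression through all points would smooth over the gap that the figure is designed to show.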
Tables
• Use the same principle
  • Show the data
  • Don't lie about it
  • Focus
• Which translates into
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • In doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself
  • Do you need the R2?
  • Are there important tests to report (say, a test of equality of two coefficients)?
  • You may need the mean in the control group
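The table rules above can be made concrete with a short Python sketch. It is only an illustration: `round_sig` and `render_table` are hypothetical helper names, all numbers are made up, and right-aligning the numeric cells is this sketch's own choice (labels stay left-aligned).

```python
from math import floor, log10


def round_sig(value, digits=2):
    """Format a number keeping only `digits` significant digits."""
    if value == 0:
        return "0"
    # Decimal places needed so that `digits` significant digits survive.
    decimals = digits - 1 - floor(log10(abs(value)))
    return f"{round(value, decimals):.{max(decimals, 0)}f}"


def render_table(columns, body, footer):
    """Return a plain-text table with three horizontal rules and no vertical lines.

    body and footer are lists of (label, cells) pairs; cells are strings.
    """
    all_rows = body + footer
    label_width = max(len(label) for label, _ in all_rows)
    cell_width = max(len(text) for _, cells in all_rows for text in cells)
    cell_width = max(cell_width, *(len(c) for c in columns))

    def line(label, cells):
        # Left-align labels; right-align numbers so digits line up.
        return label.ljust(label_width) + "".join(
            "  " + c.rjust(cell_width) for c in cells)

    rule = "-" * len(line("", columns))
    out = [rule, line("", columns), rule]
    out += [line(label, cells) for label, cells in all_rows]
    out.append(rule)
    return "\n".join(out)


# Made-up estimates: report the treatment effect, its standard error,
# the sample size, and the control mean, and nothing else.
table = render_table(
    columns=["(1)", "(2)"],
    body=[
        ("Treatment", [round_sig(0.01873), round_sig(0.3094)]),
        ("", [f"({round_sig(0.007412)})", f"({round_sig(0.2391)})"]),
    ],
    footer=[
        ("Observations", ["3000", "390"]),
        ("Mean in control", [round_sig(0.6441), round_sig(11.79)]),
    ],
)
print(table)
```

The footer carries the mean in the control group rather than a constant or an R2, which is one way to act on the "sample questions" above.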
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey
                    Number registered             Number found
                    All projects   Ongoing        All projects   Ongoing
                    (1)            (2)            (3)            (4)
Treatment           0.0494         -0.210         0.309          0.0271
                    (0.263)        (0.413)        (0.239)        (0.267)
Observations        390            390            385            385
Mean in control     13.80          11.69          11.79          9.819
Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1); the number of projects declared as ongoing in nrega.nic.in (2); and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5390 projects registered in nrega.nic.in for the 390 GP of the survey sample, a random sample of 3900 projects were surveyed (10 per GP). The number of projects found in the survey is scaled up using the ratio of the number of registered projects to the number of sampled projects. 5 GP (28 projects) could not be surveyed. All specifications include district fixed effects.
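The scaling described in the note to Table 6 is simple arithmetic, sketched below in Python. The function name, the choice to apply the ratio GP by GP, and the example counts are assumptions for illustration; the note itself only says that the found count is scaled by registered projects over sampled projects.

```python
def scaled_projects_found(found_in_sample, registered, sampled):
    """Scale the number of sampled projects found up to all registered projects."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return found_in_sample * registered / sampled


# Hypothetical GP: 14 projects registered, 10 sampled, 8 of those found on site.
estimate = scaled_projects_found(found_in_sample=8, registered=14, sampled=10)
print(estimate)
```

Scaling this way puts the "number found" columns in the same units as the "number registered" columns, so the two can be compared directly.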
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census
                    All job cards        Job cards with at least one working member
                    (as of April 2014)   Intervention period      Post intervention
                                         (July 2012-March 2013)   (Apr 2013-March 2014)
                    (1)                  (2)                      (3)
Panel A: Match rate for job cards with one member only
Treatment           0.0187               0.0181                   0.0107
                    (0.00741)            (0.00766)                (0.00696)
Observations        3095                 2868                     2922
Mean in control     0.644                0.673                    0.698
Panel B: Match rate for job cards with two members or more
Treatment           0.0135               0.0126                   0.0104
                    (0.00613)            (0.00764)                (0.00732)
Observations        3093                 2836                     2906
Mean in control     0.243                0.282                    0.286
Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Stargazer
• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose what statistics to include, what coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
Default output
Table 1: Results
                                  Dependent variable: ftvoteshare
                                  (1)            (2)
fncandidates                      0.134***
                                  (0.007)
RESprior                                         0.059**
                                                 (0.024)
Constant                          0.012          0.094***
                                  (0.010)        (0.019)
Observations                      372            372
R2                                0.532          0.016
Adjusted R2                       0.530          0.013
Residual Std. Error (df = 370)    0.153          0.222
F Statistic (df = 1; 370)         420.093***     6.056**
Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it
• Use transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1, At a Glance
Tools and resources
• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials) and a 4-week mini-course you can take at your own pace
• R handles and produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte
References
• Edward Tufte, The Visual Display of Quantitative Information, Graphics Press, 2013 (second edition).
• Jonathan Schwabish, "An Economist's Guide to Visualizing Data," Journal of Economic Perspectives, 2014, vol. 28, number 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro, "The long term impact of a graduation program: Evidence from West Bengal," MIMEO, MIT.
• Duflo, Esther, "Schooling and Labor Market Consequences of School Construction in Indonesia," American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler, "The association between income and life expectancy in the United States, 2001-2014," JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin, "The Rich Live Longer Everywhere. For the Poor, Geography Matters," New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler, "Do voters affect or elect policies? Evidence from the US House," The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson, "The Transformation of the Supplemental Nutrition Assistance Program," Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner, "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model," Western University, CIBC Working Paper 2013-5 (July), 2013. http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf
• Ottaviano, Gianmarco I. P., and Giovanni Peri, "Immigration and National Wages: Clarifying the Theory and the Empirics," NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board, Aspects of Disability Decision Making: Data and Materials, Social Security Administration (February), 2012. http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf
• Social Security Administration, Fast Facts & Figures About Social Security, 2009, Social Security Administration (July), 2009. http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf
• Hanson, Gordon H., 2012, "The Rise of Middle Kingdoms: Emerging Economies in Global Trade," Journal of Economic Perspectives 26(2): 41-64.
Local Area Variation in Life Expectancy by IncomeLevels of Life Expectancy by Commuting ZoneLife expectancy varied significantly across areas within the UnitedStates especially for low-income individuals Figure 4 shows life ex-pectancy by income ventile for New York New York San FranciscoCalifornia Dallas Texas and Detroit Michigan There was substan-tial variation across these areas for low-income individuals but littlevariation for high-income individuals Life expectancy ranged from723 years to 786 years for men in the lowest income ventile acrossthese 4 cities the corresponding range for men in the top ventilewas 865 years to 875 years
The results in Figure 4 are representative of the variation acrosscommuting zones more generally The SD of life expectancy acrossall commuting zones (weighted by population) was 139 years formen in the bottom income quartile vs 070 years in the top incomequartile (P lt 001) Life expectancy varied less across areas forwomen than men in the bottom income quartile and the amount
of variation across commuting zones also declined with income forwomen (eTable 5 in the Supplement)
Figure 5 shows maps of expected age at death by commutingzone for men and women in the bottom and top quartiles of the na-tional income distribution (maps for the middle-income quartiles ap-pear in eFigure 10 in the Supplement) For individuals in the bot-tom income quartile life expectancy differed by about 5 years formen and 4 years for women between the lowest and highest lon-gevity commuting zones (P lt 001 for both sexes) A summary ofstandard errors by commuting zone appears in part VC of theeAppendix and in eFigure 11
Nevada Indiana and Oklahoma had the lowest life expectan-cies (lt779 years) when men and women in the bottom income quar-tile were averaged Of the 10 states with the lowest levels of life ex-pectancy for individuals in the bottom income quartile 8 formed ageographic belt from Michigan to Kansas (Michigan Ohio IndianaKentucky Tennessee Arkansas Oklahoma Kansas) The states withthe highest life expectancies for individuals in the bottom incomequartile (gt806 years) were California New York and Vermont Lifeexpectancy in the South was similar to the national mean for bothsexes (minus022 years [P = 47] for women and minus096 years [P = 03]for men) in the bottom income quartile Individuals in the top in-come quartile had the lowest life expectancies (lt853 years) inNevada Hawaii and Oklahoma Individuals in the top income quar-tile had the highest life expectancies (gt876 years) in Utah Wash-ington DC and Vermont
Table 1 lists the top 10 and bottom 10 commuting zones in meanlife expectancy (averaging men and women) among the 100 mostpopulated commuting zones for individuals in the bottom and topincome quartiles The expected age at death for the bottom quar-tile ranged from 742 years for men and 807 years for women in GaryIndiana to 795 years for men and 840 years for women in New YorkNew York The commuting zones with the highest life expectancies
Figure 2 Race- and Ethnicity-Adjusted Life Expectancy for 40-Year-Olds by Household Income Percentile 2001-2014
90
85
80
75
70
0 100
19 million
20 million
80
112
119
60
71
77
40
45
50
Exp
ecte
d A
ge
at D
eath
fo
r 4
0-Y
ear-
Old
s y
Household Income Percentile
Mean household income in thousands $ a
WomenMen
20
24
26
Women
Men
Expected age at death y
Bottom 1 788 (95 CI 787-789)Women by household income percentile
Top 1 889 (95 CI 887-891)
Bottom 1 727 (95 CI 726-729)Men by household income percentile
Top 1 873 (95 CI 872-875)
Life expectancies were calculated using survival curves analogous to those inFigure 1 The vertical height of each bar depicts the 95 confidence intervalThe difference between expected age at death in the top and bottom incomepercentiles is 101 years (95 CI 99-103 years) for women and 146 years(95 CI 144-148 years) for men To control for differences in life expectanciesacross racial and ethnic groups race and ethnicity adjustments were calculated
using data from the National Longitudinal Mortality Survey and estimates werereweighted so that each income percentile bin has the same fraction of blackHispanic and Asian adultsa Averaged across years and ages The data are in thousands unless otherwise
indicated
Association Between Income and Life Expectancy in the United States Special Communication Clinical Review amp Education
jamacom (Reprinted) JAMA Published online April 10 2016 E5
Copyright 2016 American Medical Association All rights reserved
Downloaded From httpjamajamanetworkcom by Suman Ganguli on 04142016
41 56
A data set in pictures Chetty et alJAMA 2016 as rendered in NYT
42 56
A data set in pictures Chetty et alJAMA 2016
Copyright 2016 American Medical Association All rights reserved
were clustered in California (6 of the top 10) whereas the commut-ing zones with the lowest life expectancies were clustered in the in-dustrial Midwest (5 of the bottom 10) The commuting zones withthe highest life expectancies for those in the bottom income quar-tile also had the smallest gaps in life expectancy between the topand bottom quartiles (r = minus082 P lt 001) The expected age at deathfor the top income quartile ranged from 828 years for men and 853years for women in Las Vegas Nevada to 866 years for men and890 years for women in Salt Lake City Utah The areas with the high-est and lowest life expectancies for those in the top income quar-tile were less clustered geographically for example California hadcommuting zones in both the top 10 and bottom 10 of the list
The differences in life expectancy across commuting zones weresimilar in analyses with income measures adjusted for cost of liv-ing with controls for differences across areas in the income distri-bution within each quartile and using measures of loss in life yearsup to the age of 77 years that did not make use of extrapolations be-yond observed ages (part IVC of the eAppendix and eTable 6 in theSupplement) There was also considerable variation in life expec-tancy across counties within commuting zones (part V of theeAppendix eFigure 12 and eTable 7)
Trends in Life ExpectancySimilar to levels of life expectancy temporal trends variedsignificantly across geographic areas Figure 6 maps the annualchange in life expectancy between 2001 and 2014 by state for menand women in the bottom income quartile Hawaii Maine andMassachusetts had the largest gains in life expectancy (gaining gt019years annually) when men and women in the bottom income quar-tile were averaged The states in which low-income individuals ex-perienced the largest losses in life expectancy (losing gt009 yearsannually) were Alaska Iowa and Wyoming
Table 2 lists the top 10 and bottom 10 commuting zones interms of trends in life expectancy (when averaging men and wom-en) among the 100 most populated commuting zones for individu-als in the bottom and top income quartiles The estimated trends
for individuals in the bottom income quartile ranged from anannual gain of 038 years in Toms River New Jersey to an annualloss of 017 years in Tampa Florida Gaps in life expectancybetween the bottom and top income quartiles generally declinedor remained stable in areas in which the bottom income quartileexperienced the largest gains in life expectancy such as Toms RiverNew Jersey In contrast gaps in life expectancy between the topand bottom income quartiles increased by approximately 03 yearsannually in places such as Tampa Florida
Figure 7 shows race- and ethnicity-adjusted life expectanciesby year for men and women in the bottom income quartile in 2 com-muting zones in the top 10 (Birmingham Alabama and CincinnatiOhio) and 2 commuting zones in the bottom 10 (Knoxville Tennes-see and Tampa Florida) This Figure shows that trends in life ex-pectancy across these areas diverged continuously throughout the2000s For example life expectancy increased by approximately 32years from 2001 through 2014 for men and women in CincinnatiOhio but declined by approximately 22 years in Tampa Florida
Correlates of Local Area Variation in Life ExpectancyFigure 8 shows correlations of commuting zone-level estimates ofrace- and ethnicity-adjusted life expectancy for the bottom in-come quartile with local area characteristics The correlations are di-vided into 6 groups health behaviors access to health care envi-ronmental factors income inequality and social cohesion local labormarket conditions and other factors Data for men and women arecombined correlations were similar by sex (eTable 8 in the Supple-ment) County-level correlations were also similar (eTable 9)
Health BehaviorsLife expectancy was negatively correlated with rates of smoking(r = minus069 P lt 001) and obesity (r = minus047 P lt 001) and positivelycorrelated with exercise rates (r = 032 P = 004) among individu-als in the bottom income quartile The maps for rates of smokingobesity and exercise among low-income individuals were similarto those for life expectancy (eFigure 13 in the Supplement)
Figure 4 Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones 2001-2014
90
85
80
75
700 5
30
10
60
15
101
20
683
Exp
ecte
d A
ge
at D
eath
fo
r 4
0-Y
ear-
Old
s y
Household Income Ventile
Mean household income in thousands $ a
Men90
85
80
75
700 5
27
10
54
15
95
20
653
Exp
ecte
d A
ge
at D
eath
fo
r 4
0-Y
ear-
Old
s y
Household Income Ventile
Women
Detroit MI
New York NY
San Francisco CA
Dallas TX
Detroit MI
New York NY
San Francisco CA
Dallas TX
Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins)a Averaged across years and ages
Association Between Income and Life Expectancy in the United States Special Communication Clinical Review amp Education
jamacom (Reprinted) JAMA Published online April 10 2016 E7
Copyright 2016 American Medical Association All rights reserved
Downloaded From httpjamajamanetworkcom by Suman Ganguli on 04142016
43 56
Visualizing a research strategy
Figure 1
Pollution in China and the Huai RiverQinling Mountain Range
Notes The cities shown are the locations of the Disease Surveillance Points Cities north of the solid line were covered by the home heating policy The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high resolution grid of pollution throughout China (1 degree latitude cell width) Areas are left in white which are not within acceptable range of a station
44 56
Visualizing a research strategy
Figure 2Particulate Matter Levels (PM10) South and North of the Huai River Boundary
Notes Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point locations within a 1 degree latitude range weighted by the population at each location The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range The plotted line reports a local linear regression plot estimated separately on on each side of the Huai River
4060
8010
012
014
016
0
PM10
(4gm
3 )
-20 -15 -10 -5 0 5 10 15 20Degrees North of the Huai River Boundary
PM10 in South PM10 in North Local Linear Regression
The estimated change in PM10 (and height of the brace) just north of the Huai River is 416 microgm3 and is statistically significant (95 CI 116 716)
45 56
Visualinzing a research strategy
Figure 3Life Expectancy South and North of the Huai River Boundary
Notes Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point locations within a 1 degree latitude range weighted by the population at each location The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range The plotted line reports a local linear regression plot estimated separately on on each side of the Huai River
7075
8085
Life
Expe
ctanc
y(Ye
ars)
-20 -15 -10 -5 0 5 10 15 20Degrees North of the Huai River Boundary
LE in South LE in North Local Linear Regression
The estimated change in Life Expectancy (and height of the brace) just north of the Huai River is -31 years and is statistically significant (95 CI -50 -13)
46 56
Tablesbull Use the same principle
bull Show the databull Donrsquot lie about itbull Focus
bull Which translates inbull Report the important numbers not all the coefficientsbull Keep the number of significant digits downbull No vertical linesbull very few horizontal lines donrsquot box results (3 lines are usually
enough)bull in doubt align leftbull Report the statistics that the reader will need not necessary
those that come by defaultbull Sample questions to ask yourself
bull Do you need the R2bull Are there important tests to report (say between equality of
two coefficients)bull You may need the mean in the control group
47 56
Table 6 Impact of the reform on MGNREGS projects Evidence from asset survey
All Projects Ongoing All Projects Ongoing
(1) (2) (3) (4)
Treatment 00494 -0210 0309 00271
(0263) (0413) (0239) (0267)
Observations 390 390 385 385Mean in Control 1380 1169 1179 9819
Number foundNumber Registered
Note the unit of observation is a Gram Panchayat (GP) The dependent variables are the
number of projects registered in the public data portal (nreganicin) on May 15 2013 (1) the
number of projects declared as ongoing in nreganicin (2) the number of registered (3) and
ongoing (4) projects found by surveyors in June-July 2013 Out of 5390 projects registered in
nreganicin for the 390 GP of the survey sample a random sample of 3900 projects were
surveyed (10 per GP) The number of projects found in the survey is scaled up using the
number of registered projects divided by the number of sampled projects rate 5 GP (28
projects) could not be surveyed All specifications include district fixed effects
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                                        Job cards with at least one working member
                    All job cards       Intervention period     Post intervention
                  (as of April 2014)   July 2012-March 2013   Apr 2013-March 2014
                         (1)                   (2)                    (3)
Panel A: Match rate for job cards with one member only
Treatment               0.0187                0.0181                 0.0107
                      (0.00741)             (0.00766)              (0.00696)
Observations             3095                  2868                   2922
Mean in Control          0.644                 0.673                  0.698
Panel B: Match rate for job cards with two members or more
Treatment               0.0135                0.0126                 0.0104
                      (0.00613)             (0.00764)              (0.00732)
Observations             3093                  2836                   2906
Mean in Control          0.243                 0.282                  0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy which is equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Stargazer

• In practice, in R, use the "stargazer" package
• It will create standardized tables and can output them for you in HTML, LaTeX, Word, or ASCII
• The default is not bad
• But you can customize it to choose what statistics to include, what coefficients to include, etc.
• https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf
• http://jakeruss.com/cheatsheets/stargazer.html
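A minimal sketch of the default behaviour, assuming the stargazer package is installed. R's built-in `mtcars` data stands in for the lecture's data set, so the variables are placeholders rather than the ones used in the slides.

```r
library(stargazer)

# Built-in mtcars data stands in for the lecture's data set (an assumption).
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ hp, data = mtcars)

# type = "text" prints an ASCII table; "latex" and "html" are also accepted.
# The table is printed, and its lines are returned invisibly as characters.
tab <- stargazer(m1, m2, type = "text")
```

With no further arguments the table reports every coefficient, the constant, R², adjusted R², the residual standard error, and the F statistic.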
Default output

Table 1: Results

                                      Dependent variable: ftvoteshare
                                        (1)              (2)
fncandidates                          0.134***
                                      (0.007)
RESprior                                               0.059**
                                                       (0.024)
Constant                               0.012           0.094***
                                      (0.010)          (0.019)
Observations                            372              372
R²                                     0.532            0.016
Adjusted R²                            0.530            0.013
Residual Std. Error (df = 370)         0.153            0.222
F Statistic (df = 1; 370)            420.093***        6.056**

Note: *p<0.1; **p<0.05; ***p<0.01
Improve on it

• Use transparent label names for dependent and independent variables
• Choose the coefficients you would like to include
• No need to include both R squared and adjusted R squared
• Maybe report the mean of the dependent variable in the control group rather than the constant
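The four suggestions above can be sketched with stargazer's own arguments. This is an illustration under stated assumptions, not the table from the lecture: `mtcars` is a stand-in data set, and its `am` dummy is treated as if it were a treatment indicator purely for the example.

```r
library(stargazer)

# Illustrative only: treat the am dummy in mtcars as a "treatment" indicator.
m1 <- lm(mpg ~ am, data = mtcars)
m2 <- lm(mpg ~ am + wt, data = mtcars)

# Mean of the dependent variable among "control" observations (am == 0).
control_mean <- mean(mtcars$mpg[mtcars$am == 0])

tab <- stargazer(
  m1, m2,
  type             = "text",
  dep.var.labels   = "Miles per gallon",           # readable outcome label
  covariate.labels = c("Manual transmission",      # readable regressor labels
                       "Weight (1000 lbs)"),
  omit             = "Constant",                   # drop the constant row
  keep.stat        = c("n", "rsq"),                # observations and R2 only
  add.lines        = list(c("Mean in control",
                            rep(sprintf("%.2f", control_mean), 2))),
  digits           = 2                             # fewer digits
)
```

`keep.stat` drops adjusted R², the residual standard error, and the F statistic, while `add.lines` supplies the control-group mean that the default output lacks.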
Visualizing regression results

[Figure 1: Overview of Bandhan Results]

[Figure 2: Endline 1 - At a Glance]
Tools and resources

• R is a great visualization tool (especially ggplot; you need to pick up a good book to choose colors, etc.)
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials/) and a 4-week mini-course you can take at your own pace
• R handles and produces maps as well
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte/
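As a small taste of ggplot, here is a sketch of a plain horizontal bar chart with direct category labels and no 3D effects. It assumes the ggplot2 package is installed; the numbers are hypothetical and are not taken from any paper cited in these slides.

```r
library(ggplot2)

# Hypothetical numbers for illustration only.
earnings <- data.frame(
  group  = c("Finish no school", "Finish 1 year", "Graduate"),
  income = c(450, 520, 800)
)

# Reverse the factor levels so the first group ends up at the top
# once the axes are flipped.
earnings$group <- factor(earnings$group, levels = rev(earnings$group))

p <- ggplot(earnings, aes(x = group, y = income)) +
  geom_col(fill = "grey40") +
  coord_flip() +                       # horizontal bars, readable labels
  labs(x = NULL, y = "Income (thousands)",
       title = "Discounted expected lifetime earnings (hypothetical)") +
  theme_minimal()
```

Calling `print(p)` draws the chart, and `ggsave()` writes it to a file.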
References

• Tufte, Edward. Visual Display of Quantitative Information. Graphics Press, 2013 (second edition).
• Schwabish, Jonathan. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 2014, vol. 28, number 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro. "The long term impact of a graduation program: Evidence from West Bengal." MIMEO, MIT.
• Duflo, Esther. "Schooling and Labor Market Consequences of School Construction in Indonesia." American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. "The association between income and life expectancy in the United States, 2001-2014." JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin. "The Rich Live Longer Everywhere. For the Poor, Geography Matters." New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li. "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler. "Do voters affect or elect policies? Evidence from the US House." The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson. "The Transformation of the Supplemental Nutrition Assistance Program." Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner. "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model." Western University, CIBC Working Paper 2013-5 (July). http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf. 2013.
• Ottaviano, Gianmarco I. P., and Giovanni Peri. "Immigration and National Wages: Clarifying the Theory and the Empirics." NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February). http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf. 2012.
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July). http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf. 2009.
• Hanson, Gordon H. 2012. "The Rise of Middle Kingdoms: Emerging Economies in Global Trade." Journal of Economic Perspectives 26(2): 41-64.
A data set in pictures: Chetty et al., JAMA 2016, as rendered in NYT
A data set in pictures: Chetty et al., JAMA 2016

Copyright 2016 American Medical Association. All rights reserved.

were clustered in California (6 of the top 10), whereas the commuting zones with the lowest life expectancies were clustered in the industrial Midwest (5 of the bottom 10). The commuting zones with the highest life expectancies for those in the bottom income quartile also had the smallest gaps in life expectancy between the top and bottom quartiles (r = −0.82, P < .001). The expected age at death for the top income quartile ranged from 82.8 years for men and 85.3 years for women in Las Vegas, Nevada, to 86.6 years for men and 89.0 years for women in Salt Lake City, Utah. The areas with the highest and lowest life expectancies for those in the top income quartile were less clustered geographically; for example, California had commuting zones in both the top 10 and bottom 10 of the list.

The differences in life expectancy across commuting zones were similar in analyses with income measures adjusted for cost of living, with controls for differences across areas in the income distribution within each quartile, and using measures of loss in life years up to the age of 77 years that did not make use of extrapolations beyond observed ages (part IVC of the eAppendix and eTable 6 in the Supplement). There was also considerable variation in life expectancy across counties within commuting zones (part V of the eAppendix, eFigure 12, and eTable 7).

Trends in Life Expectancy

Similar to levels of life expectancy, temporal trends varied significantly across geographic areas. Figure 6 maps the annual change in life expectancy between 2001 and 2014 by state for men and women in the bottom income quartile. Hawaii, Maine, and Massachusetts had the largest gains in life expectancy (gaining >0.19 years annually) when men and women in the bottom income quartile were averaged. The states in which low-income individuals experienced the largest losses in life expectancy (losing >0.09 years annually) were Alaska, Iowa, and Wyoming.

Table 2 lists the top 10 and bottom 10 commuting zones in terms of trends in life expectancy (when averaging men and women) among the 100 most populated commuting zones for individuals in the bottom and top income quartiles. The estimated trends for individuals in the bottom income quartile ranged from an annual gain of 0.38 years in Toms River, New Jersey, to an annual loss of 0.17 years in Tampa, Florida. Gaps in life expectancy between the bottom and top income quartiles generally declined or remained stable in areas in which the bottom income quartile experienced the largest gains in life expectancy, such as Toms River, New Jersey. In contrast, gaps in life expectancy between the top and bottom income quartiles increased by approximately 0.3 years annually in places such as Tampa, Florida.

Figure 7 shows race- and ethnicity-adjusted life expectancies by year for men and women in the bottom income quartile in 2 commuting zones in the top 10 (Birmingham, Alabama, and Cincinnati, Ohio) and 2 commuting zones in the bottom 10 (Knoxville, Tennessee, and Tampa, Florida). This Figure shows that trends in life expectancy across these areas diverged continuously throughout the 2000s. For example, life expectancy increased by approximately 3.2 years from 2001 through 2014 for men and women in Cincinnati, Ohio, but declined by approximately 2.2 years in Tampa, Florida.

Correlates of Local Area Variation in Life Expectancy

Figure 8 shows correlations of commuting zone-level estimates of race- and ethnicity-adjusted life expectancy for the bottom income quartile with local area characteristics. The correlations are divided into 6 groups: health behaviors, access to health care, environmental factors, income inequality and social cohesion, local labor market conditions, and other factors. Data for men and women are combined; correlations were similar by sex (eTable 8 in the Supplement). County-level correlations were also similar (eTable 9).

Health Behaviors

Life expectancy was negatively correlated with rates of smoking (r = −0.69, P < .001) and obesity (r = −0.47, P < .001) and positively correlated with exercise rates (r = 0.32, P = .004) among individuals in the bottom income quartile. The maps for rates of smoking, obesity, and exercise among low-income individuals were similar to those for life expectancy (eFigure 13 in the Supplement).
Figure 4. Race- and Ethnicity-Adjusted Life Expectancy by Income Ventile in Selected Commuting Zones, 2001-2014

[Two panels: Men and Women. Vertical axis: Expected Age at Death for 40-Year-Olds, y (70 to 90). Horizontal axis: Household Income Ventile (0 to 20), with mean household income in thousands, $ (a). Series: Detroit, MI; New York, NY; San Francisco, CA; Dallas, TX.]

Estimates of race- and ethnicity-adjusted expected age at death for 40-year-olds computed by income ventile (5 percentile point bins).
a Averaged across years and ages.

Source: Association Between Income and Life Expectancy in the United States. JAMA, published online April 10, 2016 (jama.com), p. E7.
Visualizing a research strategy

Figure 1: Pollution in China and the Huai River/Qinling Mountain Range

Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within acceptable range of a station are left in white.
Visualizing a research strategy

Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary

[Scatter plot. Vertical axis: PM10 (µg/m³), 40 to 160. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Legend: PM10 in South, PM10 in North, Local Linear Regression.]

Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.

The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 µg/m³ and is statistically significant (95% CI: 11.6, 71.6).
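The logic of this figure (binned means plus a local linear fit on each side of a boundary) can be sketched in base R. Everything below is an assumption for illustration: the data are simulated, the true jump is set to 40, and the bandwidth and triangular kernel are arbitrary choices, not the specification used by Chen et al. The sketch also ignores the population weighting described in the notes.

```r
set.seed(42)

# Simulated data: x is degrees north of a boundary at 0; y jumps by 40 there.
x <- runif(2000, -20, 20)
y <- 100 + 0.5 * x + 40 * (x >= 0) + rnorm(2000, sd = 15)

h <- 5                          # bandwidth in degrees (an assumption)
w <- pmax(0, 1 - abs(x) / h)    # triangular kernel weights centred at 0

# Kernel-weighted linear fit on each side of the boundary; each intercept
# estimates the limit of E[y | x] as x approaches 0 from that side.
south <- lm(y ~ x, weights = w, subset = x < 0)
north <- lm(y ~ x, weights = w, subset = x >= 0)
jump  <- unname(coef(north)[1] - coef(south)[1])

# Binned means over 1-degree bins, like the circles in the figure.
bin_means <- tapply(y, floor(x), mean)
plot(as.numeric(names(bin_means)) + 0.5, bin_means,
     xlab = "Degrees north of boundary", ylab = "Simulated outcome")
abline(v = 0, lty = 2)
```

The quantity `jump` plays the role of the "height of the brace": the estimated discontinuity in the outcome at the boundary.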
Visualizing a research strategy

Figure 3: Life Expectancy South and North of the Huai River Boundary

[Scatter plot. Vertical axis: Life Expectancy (Years), 70 to 85. Horizontal axis: Degrees North of the Huai River Boundary, -20 to 20. Legend: LE in South, LE in North, Local Linear Regression.]

Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.

The estimated change in Life Expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
Tablesbull Use the same principle
bull Show the databull Donrsquot lie about itbull Focus
bull Which translates inbull Report the important numbers not all the coefficientsbull Keep the number of significant digits downbull No vertical linesbull very few horizontal lines donrsquot box results (3 lines are usually
enough)bull in doubt align leftbull Report the statistics that the reader will need not necessary
those that come by defaultbull Sample questions to ask yourself
bull Do you need the R2bull Are there important tests to report (say between equality of
two coefficients)bull You may need the mean in the control group
47 56
Table 6 Impact of the reform on MGNREGS projects Evidence from asset survey
All Projects Ongoing All Projects Ongoing
(1) (2) (3) (4)
Treatment 00494 -0210 0309 00271
(0263) (0413) (0239) (0267)
Observations 390 390 385 385Mean in Control 1380 1169 1179 9819
Number foundNumber Registered
Note the unit of observation is a Gram Panchayat (GP) The dependent variables are the
number of projects registered in the public data portal (nreganicin) on May 15 2013 (1) the
number of projects declared as ongoing in nreganicin (2) the number of registered (3) and
ongoing (4) projects found by surveyors in June-July 2013 Out of 5390 projects registered in
nreganicin for the 390 GP of the survey sample a random sample of 3900 projects were
surveyed (10 per GP) The number of projects found in the survey is scaled up using the
number of registered projects divided by the number of sampled projects rate 5 GP (28
projects) could not be surveyed All specifications include district fixed effects
Table 7 Impact of the reform on fake beneficiaries Evidence from matching ofnreganicin job cards with SECC census
All job cards
Intervention period Post intervention
(as of April 2014) July 2012-March
2013
Apr 2013 - March
2014(1) (2) (3)
Treatment 00187 00181 00107
(000741) (000766) (000696)
Observations 3095 2868 2922
Mean in Control 0644 0673 0698
Treatment 00135 00126 00104
(000613) (000764) (000732)
Observations 3093 2836 2906
Mean in Control 0243 0282 0286
Panel A Match Rate for job cards with one member only
Panel B Match Rate for job cards with two members or more
Job cards with at least one working
member
Note The unit of observation is a GP The dependent variable is the fraction of job cards from nreganicin
matched by name with households from the SECC census A job card with two members or more is matched
when at least to members have been matched by name with a census household The nreganicin data was
extracted from the nreganicin server it covers the period from July 2011 to March 2014 Treatment is a
dummy which is equal to one for the blocks selected for the intervention All specifications include district
fixed effects
39
48 56
Stargazer
bull In practice in R use the rdquostargazerrdquo package
bull It will create standardized tables can output them for you inhtml latex word ASCII
bull The default is not bad
bull But you can customize it to chose what statistics to includechose what coefficients to include etc
bull httpscranr-projectorgwebpackages
stargazervignettesstargazerpdf
bull httpjakerusscomcheatsheetsstargazerhtml
49 56
Default output
Table 1 Results
Dependent variableftvoteshare
(1) (2)fncandidates 0134lowastlowastlowast
(0007)
RESprior 0059lowastlowast
(0024)
Constant 0012 0094lowastlowastlowast
(0010) (0019)
Observations 372 372R2 0532 0016Adjusted R2 0530 0013Residual Std Error (df = 370) 0153 0222F Statistic (df = 1 370) 420093lowastlowastlowast 6056lowastlowast
Note lowastplt01 lowastlowastplt005 lowastlowastlowastplt001
1
50 56
Improve on it
bull Transparent label names for dependent and independentvariables
bull Choose the coefficients you would like to include
bull No need to include both R squared and adjusted R squared
bull Maybe the need of the dependent variable in the controlgroup rather than the constant
51 56
Visualizing regression results
Figure 1 Overview of Bandhan Results
1
52 56
Figure 2 Endline 1-At a Glance
2
53 56
Tools and resources
bull R is a great visualization tool (especially ggplot Need to pickup a good book to chose colors etc)
bull Yau Flowing data has many tutorials for how to do things inR (httpflowingdatacomcategorytutorials) and a 4weeks mini-course you can take at your own pace
bull R handlesproduces maps as well
bull At this site you can find a Tuftersquos charts in Rhttpmotioninsocialcomtufte
54 56
References
• Tufte, Edward. The Visual Display of Quantitative Information. Graphics Press, 2013 (second edition).
• Schwabish, Jonathan. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 2014, vol. 28, no. 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro. "The Long Term Impact of a Graduation Program: Evidence from West Bengal." Mimeo, MIT.
• Duflo, Esther. "Schooling and Labor Market Consequences of School Construction in Indonesia." American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. "The Association between Income and Life Expectancy in the United States, 2001-2014." JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin. "The Rich Live Longer Everywhere. For the Poor, Geography Matters." New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li. "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler. "Do Voters Affect or Elect Policies? Evidence from the U.S. House." The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson. "The Transformation of the Supplemental Nutrition Assistance Program." Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner. "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model." Western University, CIBC Working Paper 2013-5 (July 2013). http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf
• Ottaviano, Gianmarco I. P., and Giovanni Peri. "Immigration and National Wages: Clarifying the Theory and the Empirics." NBER Working Paper 14188 (July 2008).
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February 2012). http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July 2009). http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf
• Hanson, Gordon H. 2012. "The Rise of Middle Kingdoms: Emerging Economies in Global Trade." Journal of Economic Perspectives 26(2): 41-64.
Visualizing a research strategy
Figure 1: Pollution in China and the Huai River/Qinling Mountain Range
Notes: The cities shown are the locations of the Disease Surveillance Points. Cities north of the solid line were covered by the home heating policy. The figure coloring is generated by interpolating PM10 levels at the 12 nearest pollution monitoring stations to create a high-resolution grid of pollution throughout China (1 degree latitude cell width). Areas that are not within an acceptable range of a station are left in white.
Visualizing a research strategy
Figure 2: Particulate Matter Levels (PM10) South and North of the Huai River Boundary
[Figure: scatter plot. x-axis: Degrees North of the Huai River Boundary (-20 to 20); y-axis: PM10 (μg/m3), 40 to 160. Legend: PM10 in South, PM10 in North, Local Linear Regression.]
Notes: Each observation (circle) is generated by averaging PM10 across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
The estimated change in PM10 (and height of the brace) just north of the Huai River is 41.6 μg/m3 and is statistically significant (95% CI: 11.6, 71.6).
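The figure notes describe how each circle is built: locations are grouped into 1 degree latitude bins, the outcome is averaged with population weights, and the circle size follows the total population in the bin. A minimal Python sketch of that binning step follows, using made-up locations; the function name `binned_weighted_means` and the data are hypothetical, and the local linear regression drawn on each side of the boundary is not reproduced here.

```python
import math
from collections import defaultdict


def binned_weighted_means(points, bin_width=1.0):
    """Group (degrees_north, value, population) tuples into latitude bins.

    Returns {bin_lower_edge: (population_weighted_mean, total_population)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # bin -> [sum(pop * value), sum(pop)]
    for degrees_north, value, population in points:
        lower_edge = math.floor(degrees_north / bin_width) * bin_width
        sums[lower_edge][0] += population * value
        sums[lower_edge][1] += population
    return {edge: (wv / w, w) for edge, (wv, w) in sorted(sums.items())}


# Hypothetical DSP locations: (degrees north of the boundary, PM10, population).
dsp = [
    (-1.4, 90.0, 100),
    (-1.8, 110.0, 300),
    (0.3, 140.0, 200),
    (0.9, 120.0, 200),
]

bins = binned_weighted_means(dsp)
for edge, (weighted_mean, total_pop) in bins.items():
    # total_pop would drive the circle size in the plot.
    print(f"bin starting at {edge:+.0f} degrees: mean {weighted_mean:.1f}, population {total_pop:.0f}")
```

Plotting one point per bin rather than one per location is what keeps the discontinuity at the boundary readable.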
Visualizing a research strategy
Figure 3: Life Expectancy South and North of the Huai River Boundary
[Figure: scatter plot. x-axis: Degrees North of the Huai River Boundary (-20 to 20); y-axis: Life Expectancy (Years), 70 to 85. Legend: LE in South, LE in North, Local Linear Regression.]
Notes: Each observation (circle) is generated by averaging life expectancy across the Disease Surveillance Point (DSP) locations within a 1 degree latitude range, weighted by the population at each location. The size of the circle is in proportion to the total population at DSP locations within the 1 degree latitude range. The plotted line reports a local linear regression plot estimated separately on each side of the Huai River.
The estimated change in Life Expectancy (and height of the brace) just north of the Huai River is -3.1 years and is statistically significant (95% CI: -5.0, -1.3).
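The link between the reported interval and the claim of statistical significance can be checked with a few lines of arithmetic: a 95% interval that excludes zero corresponds to rejecting a zero effect at the 5% level. The Python sketch below assumes a normal approximation (z = 1.96) and assumes the interval for life expectancy reads (-5.0, -1.3); the decimal placement is inferred from the slide, and the helper names are hypothetical.

```python
def implied_se(ci_low, ci_high, z=1.96):
    """Standard error implied by a symmetric normal-approximation interval."""
    return (ci_high - ci_low) / (2 * z)


def excludes_zero(ci_low, ci_high):
    """True when the interval lies entirely on one side of zero."""
    return ci_low > 0 or ci_high < 0


# Assumed reading of the life expectancy interval on the slide.
ci_low, ci_high = -5.0, -1.3
se = implied_se(ci_low, ci_high)

print(f"implied standard error: {se:.2f} years")
print(f"significant at the 5% level: {excludes_zero(ci_low, ci_high)}")
```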
Tables
• Use the same principles:
  • Show the data
  • Don't lie about it
  • Focus
• Which translates into:
  • Report the important numbers, not all the coefficients
  • Keep the number of significant digits down
  • No vertical lines
  • Very few horizontal lines; don't box results (3 lines are usually enough)
  • When in doubt, align left
  • Report the statistics that the reader will need, not necessarily those that come by default
• Sample questions to ask yourself:
  • Do you need the R2?
  • Are there important tests to report (say, a test of equality of two coefficients)?
  • You may need the mean in the control group
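These rules are easy to apply mechanically. The following Python sketch renders a small results table as left-aligned text with exactly three horizontal rules, no vertical lines, few significant digits, and the control-group mean in place of a constant. It is an illustration only, not the stargazer workflow these slides recommend for R; the helper names `sig` and `format_table` are hypothetical, and the numbers are read from Table 6 of these slides with decimal placement assumed.

```python
def sig(x, digits=2):
    """Format a number with a small number of significant digits."""
    return f"{x:.{digits}g}"


def format_table(header, rows, col_width=10):
    """Left-aligned text table with three horizontal rules and no vertical lines."""
    label_width = max(len(label) for label, _ in rows) + 2
    rule = "-" * (label_width + col_width * len(header))

    def line(label, cells):
        text = label.ljust(label_width) + "".join(c.ljust(col_width) for c in cells)
        return text.rstrip()

    body = [line(label, cells) for label, cells in rows]
    return "\n".join([rule, line("", header), rule, *body, rule])


# Values read from Table 6; decimal placement is an assumption.
coefs = [0.0494, -0.210, 0.309, 0.0271]
std_errors = [0.263, 0.413, 0.239, 0.267]
control_means = [13.80, 11.69, 11.79, 9.819]

rows = [
    ("Treatment", [sig(c) for c in coefs]),
    ("", [f"({sig(s)})" for s in std_errors]),
    ("Observations", ["390", "390", "385", "385"]),
    ("Mean in control", [sig(m, 3) for m in control_means]),
]
table = format_table(["(1)", "(2)", "(3)", "(4)"], rows)
print(table)
```

Two significant digits are enough for the coefficients here because the standard errors are an order of magnitude larger than most of the point estimates.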
Table 6: Impact of the reform on MGNREGS projects: Evidence from asset survey

                    Number Registered           Number found
                    All Projects   Ongoing      All Projects   Ongoing
                    (1)            (2)          (3)            (4)
Treatment           0.0494         -0.210       0.309          0.0271
                    (0.263)        (0.413)      (0.239)        (0.267)
Observations        390            390          385            385
Mean in Control     13.80          11.69        11.79          9.819

Note: The unit of observation is a Gram Panchayat (GP). The dependent variables are the number of projects registered in the public data portal (nrega.nic.in) on May 15, 2013 (1); the number of projects declared as ongoing in nrega.nic.in (2); and the number of registered (3) and ongoing (4) projects found by surveyors in June-July 2013. Out of 5,390 projects registered in nrega.nic.in for the 390 GPs of the survey sample, a random sample of 3,900 projects was surveyed (10 per GP). The number of projects found in the survey is scaled up using the ratio of the number of registered projects to the number of sampled projects. 5 GPs (28 projects) could not be surveyed. All specifications include district fixed effects.
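The scaling described in the note is a simple inverse-sampling-rate adjustment: because only a sample of each GP's registered projects was visited, the count of projects found is multiplied by registered divided by sampled. A small Python sketch with a hypothetical GP follows; the function name `scale_up` and the example counts are illustrative, not taken from the study data.

```python
def scale_up(found_in_sample, registered, sampled):
    """Scale the number of projects found by the inverse sampling rate."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return found_in_sample * registered / sampled


# Hypothetical GP: 14 projects registered, 10 sampled, 8 of those found on the ground.
estimate = scale_up(found_in_sample=8, registered=14, sampled=10)
print(f"estimated projects found: {estimate:.1f}")

# Overall sampling figures from the table note: 5,390 registered, 3,900 surveyed.
overall_factor = 5390 / 3900
print(f"average scale-up factor: {overall_factor:.3f}")
```

The adjustment puts columns (3) and (4) on the same scale as the registered counts in columns (1) and (2), so their control means can be compared directly.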
Table 7: Impact of the reform on fake beneficiaries: Evidence from matching of nrega.nic.in job cards with SECC census

                    All job cards         Job cards with at least one working member
                    (as of April 2014)    Intervention period       Post intervention
                                          (July 2012-March 2013)    (Apr 2013-March 2014)
                    (1)                   (2)                       (3)
Panel A: Match rate for job cards with one member only
Treatment           0.0187                0.0181                    0.0107
                    (0.00741)             (0.00766)                 (0.00696)
Observations        3095                  2868                      2922
Mean in Control     0.644                 0.673                     0.698
Panel B: Match rate for job cards with two members or more
Treatment           0.0135                0.0126                    0.0104
                    (0.00613)             (0.00764)                 (0.00732)
Observations        3093                  2836                      2906
Mean in Control     0.243                 0.282                     0.286

Note: The unit of observation is a GP. The dependent variable is the fraction of job cards from nrega.nic.in matched by name with households from the SECC census. A job card with two members or more is matched when at least two members have been matched by name with a census household. The nrega.nic.in data was extracted from the nrega.nic.in server; it covers the period from July 2011 to March 2014. Treatment is a dummy equal to one for the blocks selected for the intervention. All specifications include district fixed effects.
Tablesbull Use the same principle
bull Show the databull Donrsquot lie about itbull Focus
bull Which translates inbull Report the important numbers not all the coefficientsbull Keep the number of significant digits downbull No vertical linesbull very few horizontal lines donrsquot box results (3 lines are usually
enough)bull in doubt align leftbull Report the statistics that the reader will need not necessary
those that come by defaultbull Sample questions to ask yourself
bull Do you need the R2bull Are there important tests to report (say between equality of
two coefficients)bull You may need the mean in the control group
47 56
Table 6 Impact of the reform on MGNREGS projects Evidence from asset survey
All Projects Ongoing All Projects Ongoing
(1) (2) (3) (4)
Treatment 00494 -0210 0309 00271
(0263) (0413) (0239) (0267)
Observations 390 390 385 385Mean in Control 1380 1169 1179 9819
Number foundNumber Registered
Note the unit of observation is a Gram Panchayat (GP) The dependent variables are the
number of projects registered in the public data portal (nreganicin) on May 15 2013 (1) the
number of projects declared as ongoing in nreganicin (2) the number of registered (3) and
ongoing (4) projects found by surveyors in June-July 2013 Out of 5390 projects registered in
nreganicin for the 390 GP of the survey sample a random sample of 3900 projects were
surveyed (10 per GP) The number of projects found in the survey is scaled up using the
number of registered projects divided by the number of sampled projects rate 5 GP (28
projects) could not be surveyed All specifications include district fixed effects
Table 7 Impact of the reform on fake beneficiaries Evidence from matching ofnreganicin job cards with SECC census
All job cards
Intervention period Post intervention
(as of April 2014) July 2012-March
2013
Apr 2013 - March
2014(1) (2) (3)
Treatment 00187 00181 00107
(000741) (000766) (000696)
Observations 3095 2868 2922
Mean in Control 0644 0673 0698
Treatment 00135 00126 00104
(000613) (000764) (000732)
Observations 3093 2836 2906
Mean in Control 0243 0282 0286
Panel A Match Rate for job cards with one member only
Panel B Match Rate for job cards with two members or more
Job cards with at least one working
member
Note The unit of observation is a GP The dependent variable is the fraction of job cards from nreganicin
matched by name with households from the SECC census A job card with two members or more is matched
when at least to members have been matched by name with a census household The nreganicin data was
extracted from the nreganicin server it covers the period from July 2011 to March 2014 Treatment is a
dummy which is equal to one for the blocks selected for the intervention All specifications include district
fixed effects
39
48 56
Stargazer
bull In practice in R use the rdquostargazerrdquo package
bull It will create standardized tables can output them for you inhtml latex word ASCII
bull The default is not bad
bull But you can customize it to chose what statistics to includechose what coefficients to include etc
bull httpscranr-projectorgwebpackages
stargazervignettesstargazerpdf
bull httpjakerusscomcheatsheetsstargazerhtml
49 56
Default output
Table 1 Results
Dependent variableftvoteshare
(1) (2)fncandidates 0134lowastlowastlowast
(0007)
RESprior 0059lowastlowast
(0024)
Constant 0012 0094lowastlowastlowast
(0010) (0019)
Observations 372 372R2 0532 0016Adjusted R2 0530 0013Residual Std Error (df = 370) 0153 0222F Statistic (df = 1 370) 420093lowastlowastlowast 6056lowastlowast
Note lowastplt01 lowastlowastplt005 lowastlowastlowastplt001
1
50 56
Improve on it
bull Transparent label names for dependent and independentvariables
bull Choose the coefficients you would like to include
bull No need to include both R squared and adjusted R squared
bull Maybe the need of the dependent variable in the controlgroup rather than the constant
Visualizing regression results
Figure 1: Overview of Bandhan Results
Figure 2: Endline 1, At a Glance
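The figures themselves are not reproduced here. One common way to show regression results graphically is a coefficient plot: point estimates with 95% confidence intervals, one row per outcome. The sketch below assumes ggplot2 is installed; the outcome names, estimates and standard errors are hypothetical values made up for illustration and are not the Bandhan results.

```r
# Sketch only: the numbers below are hypothetical, not results from any study.
library(ggplot2)

est <- data.frame(
  outcome  = c("Outcome A", "Outcome B", "Outcome C"),
  estimate = c(0.30, 0.12, -0.05),
  se       = c(0.08, 0.06, 0.07)
)

# 95% confidence intervals under a normal approximation
est$lower <- est$estimate - 1.96 * est$se
est$upper <- est$estimate + 1.96 * est$se

p <- ggplot(est, aes(x = estimate, y = outcome)) +
  geom_vline(xintercept = 0, linetype = "dashed") +   # reference line at zero effect
  geom_pointrange(aes(xmin = lower, xmax = upper)) +  # estimate with its interval
  labs(x = "Treatment effect (95% confidence interval)", y = NULL) +
  theme_minimal()

# print(p) draws the plot on the active graphics device
```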
Tools and resources
• R is a great visualization tool (especially ggplot; you will need to pick up a good book to choose colors, etc.).
• Yau's FlowingData has many tutorials on how to do things in R (http://flowingdata.com/category/tutorials) and a 4-week mini-course you can take at your own pace.
• R handles and produces maps as well.
• At this site you can find Tufte's charts in R: http://motioninsocial.com/tufte
References
• Tufte, Edward. The Visual Display of Quantitative Information. Graphics Press, 2013 (second edition).
• Schwabish, Jonathan. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 2014, vol. 28, no. 1.
• Banerjee, Chattopadhyay, Duflo, Shapiro. "The Long Term Impact of a Graduation Program: Evidence from West Bengal." Mimeo, MIT.
• Duflo, Esther. "Schooling and Labor Market Consequences of School Construction in Indonesia." American Economic Review, 2001.
• Chetty, Raj, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. "The Association Between Income and Life Expectancy in the United States, 2001-2014." JAMA 315, no. 16 (2016): 1750-1766.
• Bui, Quoctrung, and Neil Irwin. "The Rich Live Longer Everywhere. For the Poor, Geography Matters." New York Times, April 11, 2016.
• Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li. "Evidence on the Impact of Sustained Exposure to Air Pollution on Life Expectancy from China's Huai River Policy."
• Lee, David S., Enrico Moretti, and Matthew J. Butler. "Do Voters Affect or Elect Policies? Evidence from the U.S. House." The Quarterly Journal of Economics (2004): 807-859.
• Klerman, Jacob Alex, and Caroline Danielson. "The Transformation of the Supplemental Nutrition Assistance Program." Journal of Policy Analysis and Management, vol. 30, no. 4, 2011, pp. 863-888.
• Stinebrickner, Ralph, and Todd Stinebrickner. "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model." Western University CIBC Working Paper 2013-5 (July), 2013. http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf
• Ottaviano, Gianmarco I. P., and Giovanni Peri. "Immigration and National Wages: Clarifying the Theory and the Empirics." NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February), 2012. http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July), 2009. http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf
• Hanson, Gordon H. "The Rise of Middle Kingdoms: Emerging Economies in Global Trade." Journal of Economic Perspectives 26(2), 2012: 41-64.
Stargazer
bull In practice in R use the rdquostargazerrdquo package
bull It will create standardized tables can output them for you inhtml latex word ASCII
bull The default is not bad
bull But you can customize it to chose what statistics to includechose what coefficients to include etc
bull httpscranr-projectorgwebpackages
stargazervignettesstargazerpdf
bull httpjakerusscomcheatsheetsstargazerhtml
49 56
Default output
Table 1 Results
Dependent variableftvoteshare
(1) (2)fncandidates 0134lowastlowastlowast
(0007)
RESprior 0059lowastlowast
(0024)
Constant 0012 0094lowastlowastlowast
(0010) (0019)
Observations 372 372R2 0532 0016Adjusted R2 0530 0013Residual Std Error (df = 370) 0153 0222F Statistic (df = 1 370) 420093lowastlowastlowast 6056lowastlowast
Note lowastplt01 lowastlowastplt005 lowastlowastlowastplt001
1
50 56
Improve on it
bull Transparent label names for dependent and independentvariables
bull Choose the coefficients you would like to include
bull No need to include both R squared and adjusted R squared
bull Maybe the need of the dependent variable in the controlgroup rather than the constant
51 56
Visualizing regression results
Figure 1 Overview of Bandhan Results
1
52 56
Figure 2 Endline 1-At a Glance
2
53 56
Tools and resources
bull R is a great visualization tool (especially ggplot Need to pickup a good book to chose colors etc)
bull Yau Flowing data has many tutorials for how to do things inR (httpflowingdatacomcategorytutorials) and a 4weeks mini-course you can take at your own pace
bull R handlesproduces maps as well
bull At this site you can find a Tuftersquos charts in Rhttpmotioninsocialcomtufte
54 56
Referencesbull Edward Tufte Visual Display of Quantitative Information
Graphics Press 2013 (second edition)bull Jonathan Schwabish ldquoAn Economistrsquos guide to visualizing
datardquo Journal of Economic Perspective 2014 vol 28 number1
bull Banerjee Chattopadhyay Duflo Shapiro ldquoThe long termimpact of a graduation program Evidence from West BengalrdquoMIMEO MIT
bull Duflo Esther ldquoSchooling and Labor Market Consequences ofSchool construction in Indonesiardquo American EconomicReview 2001
bull Chetty Raj Michael Stepner Sarah Abraham Shelby LinBenjamin Scuderi Nicholas Turner Augustin Bergeron andDavid CutlerldquoThe association between income and lifeexpectancy in the United States 2001-2014rdquo JAMA 315 no16 (2016) 1750-1766
55 56
References
bull Bui Quoctrung and Neil IrwinldquoThe Rich Live LongerEverywhere For the Poor Geography Matters New YorkTImes April 11 2016
bull Chen Yuyu Avraham Ebenstein Michael Greenstone andHongbin Li ldquoEvidence on the Impact of Sustained Exposureto Air Pollution on Life Expectancy from Chinalsquos Huai RiverPolicyrdquo
bull Lee David S Enrico Moretti and Matthew J Butler ldquoDovoters affect or elect policies Evidence from the US HouserdquoThe Quarterly Journal of Economics (2004) 807-859
bull Klerman Jacob Alex and Caroline Danielson ldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management vol30 no 4 2011 pp 863888
56 56
References
bull Stinebrickner Ralph and Todd StinebricknerldquoAcademicPerformance and College Dropout Using LongitudinalExpectations Data to Estimate a Learning Model WesternUniversity CIBC Working Paper 2013-5 ( July)httpeconomicsuwocacibcworkingpapers_docs
wp2013Stinebrickner_Stinebrickner05pdf 2013
bull Ottaviano Gianmarco I P and Giovanni Peri Immigrationand National Wages Clarifying the Theory and the EmpiricsNBER Working Paper 14188 ( July) 2008
bull Social Security Advisory Board Aspects of Disability DecisionMaking Data and Materials Social Security Administration(February) httpwwwssabgovPublicationsDisabilityGPO_Chartbook_FINAL_06122012pdf 2012
57 56
References
bull Social Security Administration Fast Facts Figures AboutSocial Security 2009 Social Security Administration ( July)httpwwwssagovpolicydocschartbooksfast_
facts2009fast_facts09pdf 2009
bull Klerman Jacob Alex and Caroline DanielsonldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management 30(4)863 88 2011
bull Hanson Gordon H 2012 The Rise of Middle KingdomsEmerging Economies in Global Trade Journal of EconomicPerspectives 26(2) 41 64
58 56
Stargazer
bull In practice in R use the rdquostargazerrdquo package
bull It will create standardized tables can output them for you inhtml latex word ASCII
bull The default is not bad
bull But you can customize it to chose what statistics to includechose what coefficients to include etc
bull httpscranr-projectorgwebpackages
stargazervignettesstargazerpdf
bull httpjakerusscomcheatsheetsstargazerhtml
49 56
Default output
Table 1 Results
Dependent variableftvoteshare
(1) (2)fncandidates 0134lowastlowastlowast
(0007)
RESprior 0059lowastlowast
(0024)
Constant 0012 0094lowastlowastlowast
(0010) (0019)
Observations 372 372R2 0532 0016Adjusted R2 0530 0013Residual Std Error (df = 370) 0153 0222F Statistic (df = 1 370) 420093lowastlowastlowast 6056lowastlowast
Note lowastplt01 lowastlowastplt005 lowastlowastlowastplt001
1
50 56
Improve on it
bull Transparent label names for dependent and independentvariables
bull Choose the coefficients you would like to include
bull No need to include both R squared and adjusted R squared
bull Maybe the need of the dependent variable in the controlgroup rather than the constant
51 56
Visualizing regression results
Figure 1 Overview of Bandhan Results
1
52 56
Figure 2 Endline 1-At a Glance
2
53 56
Tools and resources
bull R is a great visualization tool (especially ggplot Need to pickup a good book to chose colors etc)
bull Yau Flowing data has many tutorials for how to do things inR (httpflowingdatacomcategorytutorials) and a 4weeks mini-course you can take at your own pace
bull R handlesproduces maps as well
bull At this site you can find a Tuftersquos charts in Rhttpmotioninsocialcomtufte
54 56
Referencesbull Edward Tufte Visual Display of Quantitative Information
Graphics Press 2013 (second edition)bull Jonathan Schwabish ldquoAn Economistrsquos guide to visualizing
datardquo Journal of Economic Perspective 2014 vol 28 number1
bull Banerjee Chattopadhyay Duflo Shapiro ldquoThe long termimpact of a graduation program Evidence from West BengalrdquoMIMEO MIT
bull Duflo Esther ldquoSchooling and Labor Market Consequences ofSchool construction in Indonesiardquo American EconomicReview 2001
bull Chetty Raj Michael Stepner Sarah Abraham Shelby LinBenjamin Scuderi Nicholas Turner Augustin Bergeron andDavid CutlerldquoThe association between income and lifeexpectancy in the United States 2001-2014rdquo JAMA 315 no16 (2016) 1750-1766
55 56
References
bull Bui Quoctrung and Neil IrwinldquoThe Rich Live LongerEverywhere For the Poor Geography Matters New YorkTImes April 11 2016
bull Chen Yuyu Avraham Ebenstein Michael Greenstone andHongbin Li ldquoEvidence on the Impact of Sustained Exposureto Air Pollution on Life Expectancy from Chinalsquos Huai RiverPolicyrdquo
bull Lee David S Enrico Moretti and Matthew J Butler ldquoDovoters affect or elect policies Evidence from the US HouserdquoThe Quarterly Journal of Economics (2004) 807-859
bull Klerman Jacob Alex and Caroline Danielson ldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management vol30 no 4 2011 pp 863888
56 56
References
bull Stinebrickner Ralph and Todd StinebricknerldquoAcademicPerformance and College Dropout Using LongitudinalExpectations Data to Estimate a Learning Model WesternUniversity CIBC Working Paper 2013-5 ( July)httpeconomicsuwocacibcworkingpapers_docs
wp2013Stinebrickner_Stinebrickner05pdf 2013
bull Ottaviano Gianmarco I P and Giovanni Peri Immigrationand National Wages Clarifying the Theory and the EmpiricsNBER Working Paper 14188 ( July) 2008
bull Social Security Advisory Board Aspects of Disability DecisionMaking Data and Materials Social Security Administration(February) httpwwwssabgovPublicationsDisabilityGPO_Chartbook_FINAL_06122012pdf 2012
57 56
References
bull Social Security Administration Fast Facts Figures AboutSocial Security 2009 Social Security Administration ( July)httpwwwssagovpolicydocschartbooksfast_
facts2009fast_facts09pdf 2009
bull Klerman Jacob Alex and Caroline DanielsonldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management 30(4)863 88 2011
bull Hanson Gordon H 2012 The Rise of Middle KingdomsEmerging Economies in Global Trade Journal of EconomicPerspectives 26(2) 41 64
58 56
Default output
Table 1 Results
Dependent variableftvoteshare
(1) (2)fncandidates 0134lowastlowastlowast
(0007)
RESprior 0059lowastlowast
(0024)
Constant 0012 0094lowastlowastlowast
(0010) (0019)
Observations 372 372R2 0532 0016Adjusted R2 0530 0013Residual Std Error (df = 370) 0153 0222F Statistic (df = 1 370) 420093lowastlowastlowast 6056lowastlowast
Note lowastplt01 lowastlowastplt005 lowastlowastlowastplt001
1
50 56
Improve on it
bull Transparent label names for dependent and independentvariables
bull Choose the coefficients you would like to include
bull No need to include both R squared and adjusted R squared
bull Maybe the need of the dependent variable in the controlgroup rather than the constant
51 56
Visualizing regression results
Figure 1 Overview of Bandhan Results
1
52 56
Figure 2 Endline 1-At a Glance
2
53 56
Tools and resources
bull R is a great visualization tool (especially ggplot Need to pickup a good book to chose colors etc)
bull Yau Flowing data has many tutorials for how to do things inR (httpflowingdatacomcategorytutorials) and a 4weeks mini-course you can take at your own pace
bull R handlesproduces maps as well
bull At this site you can find a Tuftersquos charts in Rhttpmotioninsocialcomtufte
54 56
Referencesbull Edward Tufte Visual Display of Quantitative Information
Graphics Press 2013 (second edition)bull Jonathan Schwabish ldquoAn Economistrsquos guide to visualizing
datardquo Journal of Economic Perspective 2014 vol 28 number1
bull Banerjee Chattopadhyay Duflo Shapiro ldquoThe long termimpact of a graduation program Evidence from West BengalrdquoMIMEO MIT
bull Duflo Esther ldquoSchooling and Labor Market Consequences ofSchool construction in Indonesiardquo American EconomicReview 2001
bull Chetty Raj Michael Stepner Sarah Abraham Shelby LinBenjamin Scuderi Nicholas Turner Augustin Bergeron andDavid CutlerldquoThe association between income and lifeexpectancy in the United States 2001-2014rdquo JAMA 315 no16 (2016) 1750-1766
55 56
References
bull Bui Quoctrung and Neil IrwinldquoThe Rich Live LongerEverywhere For the Poor Geography Matters New YorkTImes April 11 2016
bull Chen Yuyu Avraham Ebenstein Michael Greenstone andHongbin Li ldquoEvidence on the Impact of Sustained Exposureto Air Pollution on Life Expectancy from Chinalsquos Huai RiverPolicyrdquo
bull Lee David S Enrico Moretti and Matthew J Butler ldquoDovoters affect or elect policies Evidence from the US HouserdquoThe Quarterly Journal of Economics (2004) 807-859
bull Klerman Jacob Alex and Caroline Danielson ldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management vol30 no 4 2011 pp 863888
56 56
References
bull Stinebrickner Ralph and Todd StinebricknerldquoAcademicPerformance and College Dropout Using LongitudinalExpectations Data to Estimate a Learning Model WesternUniversity CIBC Working Paper 2013-5 ( July)httpeconomicsuwocacibcworkingpapers_docs
wp2013Stinebrickner_Stinebrickner05pdf 2013
bull Ottaviano Gianmarco I P and Giovanni Peri Immigrationand National Wages Clarifying the Theory and the EmpiricsNBER Working Paper 14188 ( July) 2008
bull Social Security Advisory Board Aspects of Disability DecisionMaking Data and Materials Social Security Administration(February) httpwwwssabgovPublicationsDisabilityGPO_Chartbook_FINAL_06122012pdf 2012
57 56
References
bull Social Security Administration Fast Facts Figures AboutSocial Security 2009 Social Security Administration ( July)httpwwwssagovpolicydocschartbooksfast_
facts2009fast_facts09pdf 2009
bull Klerman Jacob Alex and Caroline DanielsonldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management 30(4)863 88 2011
bull Hanson Gordon H 2012 The Rise of Middle KingdomsEmerging Economies in Global Trade Journal of EconomicPerspectives 26(2) 41 64
58 56
Improve on it
bull Transparent label names for dependent and independentvariables
bull Choose the coefficients you would like to include
bull No need to include both R squared and adjusted R squared
bull Maybe the need of the dependent variable in the controlgroup rather than the constant
51 56
Visualizing regression results
Figure 1 Overview of Bandhan Results
1
52 56
Figure 2 Endline 1-At a Glance
2
53 56
Tools and resources
bull R is a great visualization tool (especially ggplot Need to pickup a good book to chose colors etc)
bull Yau Flowing data has many tutorials for how to do things inR (httpflowingdatacomcategorytutorials) and a 4weeks mini-course you can take at your own pace
bull R handlesproduces maps as well
bull At this site you can find a Tuftersquos charts in Rhttpmotioninsocialcomtufte
54 56
Referencesbull Edward Tufte Visual Display of Quantitative Information
Graphics Press 2013 (second edition)bull Jonathan Schwabish ldquoAn Economistrsquos guide to visualizing
datardquo Journal of Economic Perspective 2014 vol 28 number1
bull Banerjee Chattopadhyay Duflo Shapiro ldquoThe long termimpact of a graduation program Evidence from West BengalrdquoMIMEO MIT
bull Duflo Esther ldquoSchooling and Labor Market Consequences ofSchool construction in Indonesiardquo American EconomicReview 2001
bull Chetty Raj Michael Stepner Sarah Abraham Shelby LinBenjamin Scuderi Nicholas Turner Augustin Bergeron andDavid CutlerldquoThe association between income and lifeexpectancy in the United States 2001-2014rdquo JAMA 315 no16 (2016) 1750-1766
55 56
References
bull Bui Quoctrung and Neil IrwinldquoThe Rich Live LongerEverywhere For the Poor Geography Matters New YorkTImes April 11 2016
bull Chen Yuyu Avraham Ebenstein Michael Greenstone andHongbin Li ldquoEvidence on the Impact of Sustained Exposureto Air Pollution on Life Expectancy from Chinalsquos Huai RiverPolicyrdquo
bull Lee David S Enrico Moretti and Matthew J Butler ldquoDovoters affect or elect policies Evidence from the US HouserdquoThe Quarterly Journal of Economics (2004) 807-859
bull Klerman Jacob Alex and Caroline Danielson ldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management vol30 no 4 2011 pp 863888
56 56
References
bull Stinebrickner Ralph and Todd StinebricknerldquoAcademicPerformance and College Dropout Using LongitudinalExpectations Data to Estimate a Learning Model WesternUniversity CIBC Working Paper 2013-5 ( July)httpeconomicsuwocacibcworkingpapers_docs
wp2013Stinebrickner_Stinebrickner05pdf 2013
bull Ottaviano Gianmarco I P and Giovanni Peri Immigrationand National Wages Clarifying the Theory and the EmpiricsNBER Working Paper 14188 ( July) 2008
bull Social Security Advisory Board Aspects of Disability DecisionMaking Data and Materials Social Security Administration(February) httpwwwssabgovPublicationsDisabilityGPO_Chartbook_FINAL_06122012pdf 2012
57 56
References
bull Social Security Administration Fast Facts Figures AboutSocial Security 2009 Social Security Administration ( July)httpwwwssagovpolicydocschartbooksfast_
facts2009fast_facts09pdf 2009
bull Klerman Jacob Alex and Caroline DanielsonldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management 30(4)863 88 2011
bull Hanson Gordon H 2012 The Rise of Middle KingdomsEmerging Economies in Global Trade Journal of EconomicPerspectives 26(2) 41 64
58 56
Visualizing regression results
Figure 1 Overview of Bandhan Results
1
52 56
Figure 2 Endline 1-At a Glance
2
53 56
Tools and resources
bull R is a great visualization tool (especially ggplot Need to pickup a good book to chose colors etc)
bull Yau Flowing data has many tutorials for how to do things inR (httpflowingdatacomcategorytutorials) and a 4weeks mini-course you can take at your own pace
bull R handlesproduces maps as well
bull At this site you can find a Tuftersquos charts in Rhttpmotioninsocialcomtufte
54 56
Referencesbull Edward Tufte Visual Display of Quantitative Information
Graphics Press 2013 (second edition)bull Jonathan Schwabish ldquoAn Economistrsquos guide to visualizing
datardquo Journal of Economic Perspective 2014 vol 28 number1
bull Banerjee Chattopadhyay Duflo Shapiro ldquoThe long termimpact of a graduation program Evidence from West BengalrdquoMIMEO MIT
bull Duflo Esther ldquoSchooling and Labor Market Consequences ofSchool construction in Indonesiardquo American EconomicReview 2001
bull Chetty Raj Michael Stepner Sarah Abraham Shelby LinBenjamin Scuderi Nicholas Turner Augustin Bergeron andDavid CutlerldquoThe association between income and lifeexpectancy in the United States 2001-2014rdquo JAMA 315 no16 (2016) 1750-1766
55 56
References
bull Bui Quoctrung and Neil IrwinldquoThe Rich Live LongerEverywhere For the Poor Geography Matters New YorkTImes April 11 2016
bull Chen Yuyu Avraham Ebenstein Michael Greenstone andHongbin Li ldquoEvidence on the Impact of Sustained Exposureto Air Pollution on Life Expectancy from Chinalsquos Huai RiverPolicyrdquo
bull Lee David S Enrico Moretti and Matthew J Butler ldquoDovoters affect or elect policies Evidence from the US HouserdquoThe Quarterly Journal of Economics (2004) 807-859
bull Klerman Jacob Alex and Caroline Danielson ldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management vol30 no 4 2011 pp 863888
56 56
References
bull Stinebrickner Ralph and Todd StinebricknerldquoAcademicPerformance and College Dropout Using LongitudinalExpectations Data to Estimate a Learning Model WesternUniversity CIBC Working Paper 2013-5 ( July)httpeconomicsuwocacibcworkingpapers_docs
wp2013Stinebrickner_Stinebrickner05pdf 2013
bull Ottaviano Gianmarco I P and Giovanni Peri Immigrationand National Wages Clarifying the Theory and the EmpiricsNBER Working Paper 14188 ( July) 2008
bull Social Security Advisory Board Aspects of Disability DecisionMaking Data and Materials Social Security Administration(February) httpwwwssabgovPublicationsDisabilityGPO_Chartbook_FINAL_06122012pdf 2012
57 56
References
bull Social Security Administration Fast Facts Figures AboutSocial Security 2009 Social Security Administration ( July)httpwwwssagovpolicydocschartbooksfast_
facts2009fast_facts09pdf 2009
bull Klerman Jacob Alex and Caroline DanielsonldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management 30(4)863 88 2011
bull Hanson Gordon H 2012 The Rise of Middle KingdomsEmerging Economies in Global Trade Journal of EconomicPerspectives 26(2) 41 64
58 56
Figure 2 Endline 1-At a Glance
2
53 56
Tools and resources
bull R is a great visualization tool (especially ggplot Need to pickup a good book to chose colors etc)
bull Yau Flowing data has many tutorials for how to do things inR (httpflowingdatacomcategorytutorials) and a 4weeks mini-course you can take at your own pace
bull R handlesproduces maps as well
bull At this site you can find a Tuftersquos charts in Rhttpmotioninsocialcomtufte
54 56
Referencesbull Edward Tufte Visual Display of Quantitative Information
Graphics Press 2013 (second edition)bull Jonathan Schwabish ldquoAn Economistrsquos guide to visualizing
datardquo Journal of Economic Perspective 2014 vol 28 number1
bull Banerjee Chattopadhyay Duflo Shapiro ldquoThe long termimpact of a graduation program Evidence from West BengalrdquoMIMEO MIT
bull Duflo Esther ldquoSchooling and Labor Market Consequences ofSchool construction in Indonesiardquo American EconomicReview 2001
bull Chetty Raj Michael Stepner Sarah Abraham Shelby LinBenjamin Scuderi Nicholas Turner Augustin Bergeron andDavid CutlerldquoThe association between income and lifeexpectancy in the United States 2001-2014rdquo JAMA 315 no16 (2016) 1750-1766
55 56
References
bull Bui Quoctrung and Neil IrwinldquoThe Rich Live LongerEverywhere For the Poor Geography Matters New YorkTImes April 11 2016
bull Chen Yuyu Avraham Ebenstein Michael Greenstone andHongbin Li ldquoEvidence on the Impact of Sustained Exposureto Air Pollution on Life Expectancy from Chinalsquos Huai RiverPolicyrdquo
bull Lee David S Enrico Moretti and Matthew J Butler ldquoDovoters affect or elect policies Evidence from the US HouserdquoThe Quarterly Journal of Economics (2004) 807-859
bull Klerman Jacob Alex and Caroline Danielson ldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management vol30 no 4 2011 pp 863888
56 56
References
bull Stinebrickner Ralph and Todd StinebricknerldquoAcademicPerformance and College Dropout Using LongitudinalExpectations Data to Estimate a Learning Model WesternUniversity CIBC Working Paper 2013-5 ( July)httpeconomicsuwocacibcworkingpapers_docs
wp2013Stinebrickner_Stinebrickner05pdf 2013
bull Ottaviano Gianmarco I P and Giovanni Peri Immigrationand National Wages Clarifying the Theory and the EmpiricsNBER Working Paper 14188 ( July) 2008
bull Social Security Advisory Board Aspects of Disability DecisionMaking Data and Materials Social Security Administration(February) httpwwwssabgovPublicationsDisabilityGPO_Chartbook_FINAL_06122012pdf 2012
57 56
References
bull Social Security Administration Fast Facts Figures AboutSocial Security 2009 Social Security Administration ( July)httpwwwssagovpolicydocschartbooksfast_
facts2009fast_facts09pdf 2009
bull Klerman Jacob Alex and Caroline DanielsonldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management 30(4)863 88 2011
bull Hanson Gordon H 2012 The Rise of Middle KingdomsEmerging Economies in Global Trade Journal of EconomicPerspectives 26(2) 41 64
58 56
Tools and resources
bull R is a great visualization tool (especially ggplot Need to pickup a good book to chose colors etc)
bull Yau Flowing data has many tutorials for how to do things inR (httpflowingdatacomcategorytutorials) and a 4weeks mini-course you can take at your own pace
bull R handlesproduces maps as well
bull At this site you can find a Tuftersquos charts in Rhttpmotioninsocialcomtufte
54 56
Referencesbull Edward Tufte Visual Display of Quantitative Information
Graphics Press 2013 (second edition)bull Jonathan Schwabish ldquoAn Economistrsquos guide to visualizing
datardquo Journal of Economic Perspective 2014 vol 28 number1
bull Banerjee Chattopadhyay Duflo Shapiro ldquoThe long termimpact of a graduation program Evidence from West BengalrdquoMIMEO MIT
bull Duflo Esther ldquoSchooling and Labor Market Consequences ofSchool construction in Indonesiardquo American EconomicReview 2001
bull Chetty Raj Michael Stepner Sarah Abraham Shelby LinBenjamin Scuderi Nicholas Turner Augustin Bergeron andDavid CutlerldquoThe association between income and lifeexpectancy in the United States 2001-2014rdquo JAMA 315 no16 (2016) 1750-1766
55 56
References
bull Bui Quoctrung and Neil IrwinldquoThe Rich Live LongerEverywhere For the Poor Geography Matters New YorkTImes April 11 2016
bull Chen Yuyu Avraham Ebenstein Michael Greenstone andHongbin Li ldquoEvidence on the Impact of Sustained Exposureto Air Pollution on Life Expectancy from Chinalsquos Huai RiverPolicyrdquo
bull Lee David S Enrico Moretti and Matthew J Butler ldquoDovoters affect or elect policies Evidence from the US HouserdquoThe Quarterly Journal of Economics (2004) 807-859
bull Klerman Jacob Alex and Caroline Danielson ldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management vol30 no 4 2011 pp 863888
56 56
References
bull Stinebrickner Ralph and Todd StinebricknerldquoAcademicPerformance and College Dropout Using LongitudinalExpectations Data to Estimate a Learning Model WesternUniversity CIBC Working Paper 2013-5 ( July)httpeconomicsuwocacibcworkingpapers_docs
wp2013Stinebrickner_Stinebrickner05pdf 2013
bull Ottaviano Gianmarco I P and Giovanni Peri Immigrationand National Wages Clarifying the Theory and the EmpiricsNBER Working Paper 14188 ( July) 2008
bull Social Security Advisory Board Aspects of Disability DecisionMaking Data and Materials Social Security Administration(February) httpwwwssabgovPublicationsDisabilityGPO_Chartbook_FINAL_06122012pdf 2012
57 56
References
bull Social Security Administration Fast Facts Figures AboutSocial Security 2009 Social Security Administration ( July)httpwwwssagovpolicydocschartbooksfast_
facts2009fast_facts09pdf 2009
bull Klerman Jacob Alex and Caroline DanielsonldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management 30(4)863 88 2011
bull Hanson Gordon H 2012 The Rise of Middle KingdomsEmerging Economies in Global Trade Journal of EconomicPerspectives 26(2) 41 64
58 56
Referencesbull Edward Tufte Visual Display of Quantitative Information
Graphics Press 2013 (second edition)bull Jonathan Schwabish ldquoAn Economistrsquos guide to visualizing
datardquo Journal of Economic Perspective 2014 vol 28 number1
bull Banerjee Chattopadhyay Duflo Shapiro ldquoThe long termimpact of a graduation program Evidence from West BengalrdquoMIMEO MIT
bull Duflo Esther ldquoSchooling and Labor Market Consequences ofSchool construction in Indonesiardquo American EconomicReview 2001
bull Chetty Raj Michael Stepner Sarah Abraham Shelby LinBenjamin Scuderi Nicholas Turner Augustin Bergeron andDavid CutlerldquoThe association between income and lifeexpectancy in the United States 2001-2014rdquo JAMA 315 no16 (2016) 1750-1766
55 56
References
bull Bui Quoctrung and Neil IrwinldquoThe Rich Live LongerEverywhere For the Poor Geography Matters New YorkTImes April 11 2016
bull Chen Yuyu Avraham Ebenstein Michael Greenstone andHongbin Li ldquoEvidence on the Impact of Sustained Exposureto Air Pollution on Life Expectancy from Chinalsquos Huai RiverPolicyrdquo
bull Lee David S Enrico Moretti and Matthew J Butler ldquoDovoters affect or elect policies Evidence from the US HouserdquoThe Quarterly Journal of Economics (2004) 807-859
bull Klerman Jacob Alex and Caroline Danielson ldquoTheTransformation of the Supplemental Nutrition AssistanceProgram Journal of Policy Analysis and Management vol30 no 4 2011 pp 863888
56 56
References
• Stinebrickner, Ralph, and Todd Stinebrickner. “Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model.” Western University, CIBC Working Paper 2013-5 (July), 2013. http://economics.uwo.ca/cibc/workingpapers_docs/wp2013/Stinebrickner_Stinebrickner05.pdf
• Ottaviano, Gianmarco I. P., and Giovanni Peri. “Immigration and National Wages: Clarifying the Theory and the Empirics.” NBER Working Paper 14188 (July), 2008.
• Social Security Advisory Board. Aspects of Disability Decision Making: Data and Materials. Social Security Administration (February), 2012. http://www.ssab.gov/Publications/Disability/GPO_Chartbook_FINAL_06122012.pdf
• Social Security Administration. Fast Facts & Figures About Social Security, 2009. Social Security Administration (July), 2009. http://www.ssa.gov/policy/docs/chartbooks/fast_facts/2009/fast_facts09.pdf
• Hanson, Gordon H. “The Rise of Middle Kingdoms: Emerging Economies in Global Trade.” Journal of Economic Perspectives, vol. 26, no. 2, 2012, pp. 41-64.