Post on 14-Apr-2017
Understanding disparities
A deep dive into the American Community Survey
Sean Green, Ph.D.
Data Solutions Strategist
City of Seattle
Ways of capturing socioeconomic data
• Product registration surveys
• Online behavior and purchase history
• Phone surveys
• Door-to-door surveys
• Workplace/school surveys
• The Census and American Community Survey
The United States Census
• The decennial census is mandated by Article I, Section 2 of the US Constitution
• It is the most comprehensive source of population demographic information
• It is used to apportion representation and, as such, requires a count of individuals and households
• From 1940 through 2000, a subset of Americans received a long form, which contained detailed socioeconomic questions
• After 2000, the American Community Survey (ACS) was developed and eventually replaced the long form as a survey that could be used to provide estimates between decennial censuses
Three levels of detail
Most comprehensive, fewer questions
• Decennial Census
• American Community Survey
• ACS PUMS
Less comprehensive, more questions, more detail
What you get with each
Decennial Census
• Comprehensive counts
• Total population
• Estimates for proportions of demographic groups
• Only available once a decade
American Community Survey
• Sampling for smaller groups
• Questions regarding household income, educational attainment, and employment status
• Available yearly for areas of sufficient size
Public-Use Microdata Sample
• Same questions as ACS
• Anonymized individual responses for custom tables
• Available yearly for areas of sufficient size
• Higher uncertainty bounds for small areas and groups
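To illustrate what "anonymized individual responses for custom tables" can look like in practice, here is a minimal Python sketch of a weighted cross-tabulation. It is not taken from the talk, which relied on R. The records, group labels, and the `weighted_shares` helper are hypothetical; the only real name is `PWGTP`, the person-weight column in PUMS person files.

```python
from collections import defaultdict

# Hypothetical PUMS-style person records. Real PUMS person files carry a
# person weight (PWGTP); the group and status values here are invented.
records = [
    {"group": "A", "status": "Employed", "PWGTP": 20},
    {"group": "A", "status": "Unemployed", "PWGTP": 5},
    {"group": "A", "status": "Employed", "PWGTP": 15},
    {"group": "B", "status": "Employed", "PWGTP": 30},
    {"group": "B", "status": "Not in labor force", "PWGTP": 10},
]


def weighted_shares(rows, by, category, weight="PWGTP"):
    """Return {group: {category: weighted share}} for a custom table."""
    totals = defaultdict(float)
    cells = defaultdict(lambda: defaultdict(float))
    for row in rows:
        # Each record stands in for `weight` people, so sum weights, not rows.
        totals[row[by]] += row[weight]
        cells[row[by]][row[category]] += row[weight]
    return {
        group: {cat: w / totals[group] for cat, w in cats.items()}
        for group, cats in cells.items()
    }


shares = weighted_shares(records, by="group", category="status")
for group, cats in sorted(shares.items()):
    for status, share in sorted(cats.items()):
        print(f"{group:>2}  {status:<20} {share:6.1%}")
```

Summing weights rather than counting rows is the key step: each microdata record represents many people, so unweighted counts would misstate the shares.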
2010 Census Tracts vs. 2015 PUMAs
2010 Census Tracts vs. 2015 Zip Code Areas
What the data can tell us about disparities
Decennial Census
• Proportionality of resources spent and services offered by geography
• How Seattle’s population compares to the population in other cities
American Community Survey
• Socioeconomic data aggregated by geographic unit
• Which groups are faring better or worse over time in terms of employment, wages, and ease of transportation
Public-Use Microdata Sample
• Socioeconomic data by household
• Individual and population-level factors that are correlated with disparities
• How have groups fared in our city by geography
Poverty rates from ACS
[Figure: Seattle Poverty Rates by Race. Bar chart of poverty rates (0% to 40%) for 2007 through 2010, with series for African American, Asian, Caucasian, Hispanic, and Other.]
Source: American Community Survey 2007-2010 data
Educational Attainment from PUMS
[Figure: Educational Attainment by Race for Seattle Youths Aged 18-24. Stacked bar chart (0% to 100%) with one bar each for African American, Asian/Pacific, Caucasian, Hispanic, and Other, divided into Less than HS, HS Diploma or Equivalent, Some College, Associate's Degree, Bachelor's Degree, and MA or Higher.]
Source: American Community Survey PUMS 2009-2013
Youth Labor Force Status from PUMS
Seattle Labor Force Status by Race for Youth 16-24

Group              Employed   Unemployed   Not in labor force
African American   49.6%      24.2%        26.2%
Asian              67.3%      10.2%        22.5%
Caucasian          74.7%      11.0%        14.3%
Hispanic           66.0%      13.2%        20.8%
Native American    35.2%      16.6%        48.2%
Other              54.8%      20.5%        24.7%

Source: American Community Survey PUMS 2009-2013
Demographic comparisons
African Americans comprise a larger proportion of students in Public Schools than they do in the County or City population

Group                    Seattle Public Schools   Seattle   King County
African American         16.5%                    6.8%      5.8%
Asian/Pacific Islander   16.5%                    13.2%     14.8%
Caucasian                45.8%                    67.4%     64.8%
Hispanic                 12.4%                    6.2%      8.6%
Native American          0.8%                     0.5%      0.7%
Other                    7.9%                     5.9%      5.2%

Sources: American Community Survey PUMS 2009-2013, ACS 2009-2013, and OSPI. Seattle Public Schools data for all grades; Seattle and King County Data for all ages
Data for Juvenile court referrals 2005-2014
Combining Census data with other sets
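One common way to combine Census data with another data set is to join on a shared geographic unit and turn raw counts into per-capita rates. The Python sketch below is an illustration only, not the method used in the talk. The area names, the counts, and the `rates_per_1000` helper are all hypothetical.

```python
# Hypothetical inputs: population by area (as from a Census table) and event
# counts by area (as from an administrative data set). All values are invented.
population = {"Area 1": 4000, "Area 2": 2500, "Area 3": 6000}
events = {"Area 1": 12, "Area 2": 10, "Area 4": 3}


def rates_per_1000(pop_by_area, events_by_area):
    """Join the two sets on area and return events per 1,000 residents.

    Areas with population but no events get a rate of 0.0. Areas with events
    but no population figure are returned separately so they can be reviewed
    rather than silently dropped.
    """
    rates = {
        area: 1000 * events_by_area.get(area, 0) / pop
        for area, pop in pop_by_area.items()
        if pop > 0
    }
    unmatched = sorted(set(events_by_area) - set(pop_by_area))
    return rates, unmatched


rates, unmatched = rates_per_1000(population, events)
for area, rate in sorted(rates.items()):
    print(f"{area}: {rate:.1f} per 1,000 residents")
print("No population match:", unmatched)
```

Reporting the unmatched areas matters when the two sources use different geographies (for example tracts versus zip code areas), since mismatched boundaries are a frequent cause of lost records.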
Calculating re-offense rate
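The slides do not state how the re-offense rate was defined. As one possible definition, the Python sketch below counts the share of people whose second referral falls within a fixed window of their first. The records, the 365-day window, and the `reoffense_rate` helper are assumptions made for illustration.

```python
from datetime import date

# Hypothetical referral records: (person_id, referral_date). Invented values.
referrals = [
    ("p1", date(2010, 1, 15)),
    ("p1", date(2010, 9, 1)),
    ("p2", date(2011, 3, 10)),
    ("p3", date(2012, 5, 5)),
    ("p3", date(2014, 6, 1)),
]


def reoffense_rate(rows, window_days=365):
    """Share of people whose second referral is within window_days of their first."""
    by_person = {}
    for person, when in rows:
        by_person.setdefault(person, []).append(when)
    if not by_person:
        return 0.0
    repeat = 0
    for dates in by_person.values():
        dates.sort()
        # Compare the first referral with the next one in time order.
        if len(dates) > 1 and (dates[1] - dates[0]).days <= window_days:
            repeat += 1
    return repeat / len(by_person)


print(f"Re-offense rate within 365 days: {reoffense_rate(referrals):.1%}")
```

Under this definition, people first referred near the end of the data have less follow-up time than others, so a real analysis would likely need to restrict the cohort or account for that censoring.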
Clustering to determine common histories
tabplot::tableplot – Cleaning the data
Shiny - Making the data more accessible
Other useful tools and packages
• RCurl – Scrape data from HTML tables via HTTP
• sqldf – Create data frames using SQL statements
• My Maps – Create Google maps with polygons and spreadsheet backends
• ggplot2 – Visualizations and static maps
• rpart – Recursive partitioning algorithm
• Shiny – Visualizations and interactive maps
• data.seattle.gov – Seattle’s Open Data Initiative resource
Shortcomings in the data
• The Census does a poor job of capturing data on the homeless, incarcerated populations, and undocumented aliens
  • These groups often bear the brunt of disparities
• The ACS does not permit estimates of small population groups, such as many of our immigrant and refugee communities, which makes it difficult to know where to provide services
• The data capture population-level counts, but do not provide an easy solution for measuring changes due to migration
• Aggregate statistics may improve without having treated the root problem
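A rough way to see why small population groups are hard to estimate is to look at how the margin of error for a proportion grows as the sample shrinks. The Python sketch below uses the textbook simple-random-sample formula with z = 1.645, matching the 90% confidence level at which ACS margins of error are published. It is a simplification: the ACS has a complex sample design and derives its published margins differently, and the `proportion_moe` helper and the sample sizes are hypothetical.

```python
import math


def proportion_moe(p, n, z=1.645):
    """Approximate margin of error for a proportion p from a simple random sample of size n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


# The same 20% estimate is far less certain when few respondents underlie it.
for n in (50, 500, 5000):
    print(f"n={n:>5}: 20% +/- {proportion_moe(0.20, n):.1%}")
```

Because the margin shrinks only with the square root of the sample size, a group with one hundredth of the respondents has roughly ten times the margin of error.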
Last points
• Democratization of data
• It is important to balance informational needs with privacy concerns
• It takes work, but it is important to know the limitations of the data
• Once you know the data sets, you can shed light on disparities
Clustering: Beyond predetermined groupings