Census 2010: Accessing Census Data
THURSDAY, July 21, 2011 10-11:30am
Quick Review
2010 decennial data is “short-form” only – demographic characteristics; ACS now source of “long-form” type of data
Census data released in two “flavors”:
• Aggregate Data
• Microdata
A third type of data product identifies geographic boundaries
Aggregate Data released in a variety of products, differing in content, geographic specificity and temporal coverage
Microdata has flexibility of individual level information, but balances this by only gross geographic detail
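The difference between the two “flavors” can be sketched in a few lines of Python. This is only an illustration: the records, field names (`area`, `age`, `sex`) and the helper `aggregate_by` are invented here and do not come from any Census product.

```python
from collections import Counter

# Hypothetical person-level microdata: one record per person.
# Field names and values are invented for illustration only.
persons = [
    {"area": "A", "age": 34, "sex": "F"},
    {"area": "A", "age": 8, "sex": "M"},
    {"area": "A", "age": 71, "sex": "F"},
    {"area": "B", "age": 45, "sex": "M"},
    {"area": "B", "age": 16, "sex": "F"},
]


def aggregate_by(records, key):
    """Collapse microdata into an aggregate table: a count per value of `key`."""
    return dict(Counter(r[key] for r in records))


# An aggregate product is a table someone else already defined,
# for example population by area.
pop_by_area = aggregate_by(persons, "area")

# With microdata the analyst can define a new table,
# for example children (under 18) by area.
children_by_area = aggregate_by([r for r in persons if r["age"] < 18], "area")

print(pop_by_area)
print(children_by_area)
```

The trade-off from the slide still applies: real public-use microdata carry only coarse geographic identifiers, so a table like this cannot be cut to small areas.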
Access and Resources
Aggregate data resources
Microdata resources
Geography resources
Documentation resources
Visualization resources
Local resources
Access and Resources
Aggregate resources:
Current: American Factfinder, DataFerrett, Uexplore/Dexter
Historical: NHGIS, ICPSR, Historical Census Browser
Microdata resources:
Online Analysis: SDA & IPUMS
Extract/Download: IPUMS, ICPSR, NBER, DataFerrett, Census
Geographic: Census, MABLE/Geocorr, IPUMS
Documentation: IPUMS, AFF2, ICPSR
Visualization: Social Explorer
Local Resources: DOF/DRU, SDCs, UC DATA, DataLab
Aggregate Data Resources
The “old” American Factfinder
The “new” American Factfinder
Alternative to AFF: FTP Full Files
Same FTP Options for ACS
Historical Census Data Browser
Micro-data Resources
Survey Documentation and Analysis (SDA) and the Integrated Public Use Microdata Series (IPUMS)
The Integrated Public Use Microdata Series
www.ipums.org at the Minnesota Population Center
IPUMS-USA
Harmonized data on people in the U.S. census and American Community Survey, from 1850 to the present.
IPUMS-CPS
Harmonized data on people in the Current Population Survey, every March from 1962 to the present.
Important!
Harmonized: Questions asked change over time: How to make data comparable?
Integrated: Multiple data collections & surveys simultaneously available
Microdata: The underlying individual-level data is available, not just pre-defined tables.
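What “harmonized” means in practice can be sketched as a crosswalk from year-specific codes to one common scheme. Everything in this sketch is hypothetical: the codes, the categories and the `harmonize` helper are invented for illustration and are not actual IPUMS codes or tools.

```python
# Hypothetical year-specific codes for a marital status question.
# Both the raw codes and the common categories are invented.
CROSSWALK = {
    1970: {1: "married", 2: "widowed", 3: "divorced", 4: "never married"},
    2000: {
        1: "married",        # e.g. married, spouse present
        2: "married",        # e.g. married, spouse absent
        3: "widowed",
        4: "divorced",
        5: "never married",
    },
}


def harmonize(year, raw_code):
    """Translate a year-specific code into the common (harmonized) category."""
    return CROSSWALK[year][raw_code]


# (year, raw code) pairs as they might appear in two different source files.
records = [(1970, 4), (2000, 5), (2000, 2)]
harmonized = [harmonize(year, code) for year, code in records]
print(harmonized)
```

Once every year maps to the same categories, the same analysis can run across samples without special-casing each questionnaire.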
The American Community Survey and the Current Population Survey
CPS – Long-running monthly survey (dating back to the 1940s) focused on labor force characteristics (unemployment, earnings, hours worked).
~ 55,000 sample HHs, multiple interviews, personal
In addition to the basic monthly questions, additional modules are “piggy-backed” onto the survey to provide more depth on particular topics. Most widely used supplement is the Annual Social and Economic Supplement (ASEC) - aka Annual Demographic Survey or the March Files. (~100,000 HHs)
In-depth survey – lots of detail about sources of income, work, occupation, hours, etc. (as well as core demographic information on race/ethnicity, nativity, age, sex, education)
The American Community Survey and the Current Population Survey
ACS – “New” continuous survey, replaces the long form of the decennial census, first fully implemented in 2005 (non-institutionalized) and 2006 (institutionalized).
~ 2,000,000 HHs annually, mixed mail-in/personal interviews
Substantial overlapping content with CPS
Broader range of content, somewhat less detail
Larger sample sizes allow for greater geographic detail
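A rough way to see why the larger sample allows more geographic detail is to compare sampling error for a small area. The sketch below uses the textbook standard error of a proportion under simple random sampling, which is a simplifying assumption: both surveys have complex designs, so real standard errors differ. The area share (1 in 100 sample households) and the 10% proportion are made-up inputs; only the approximate sample sizes come from the slides.

```python
import math


def se_proportion(p, n):
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


# Approximate household sample sizes quoted on the slides.
CPS_ASEC_HH = 100_000
ACS_HH = 2_000_000

# Hypothetical small area holding 1 in 100 of each survey's sample households.
AREA_SHARE_DENOMINATOR = 100
n_cps = CPS_ASEC_HH // AREA_SHARE_DENOMINATOR
n_acs = ACS_HH // AREA_SHARE_DENOMINATOR

# Hypothetical true proportion being estimated (say, 10%).
p = 0.10

se_cps = se_proportion(p, n_cps)
se_acs = se_proportion(p, n_acs)

# The ratio depends only on the sample sizes: sqrt(n_acs / n_cps).
ratio = se_cps / se_acs
print(f"CPS ASEC n={n_cps}, SE={se_cps:.4f}")
print(f"ACS      n={n_acs}, SE={se_acs:.4f}")
print(f"CPS SE is about {ratio:.1f} times the ACS SE")
```

Under these assumptions a twentyfold larger sample shrinks the standard error by a factor of the square root of 20, which is what makes estimates for smaller geographies usable.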
The American Community Survey and the Current Population Survey
Aggregate vs. Microdata
The Integrated Public Use Microdata Series
www.ipums.org at the Minnesota Population Center
Strengths:
Tremendous centralized documentation
Many “value-added” data items
Wonderful extraction engine (if downloading data)
Multiple statistical packages supported
Online Analysis also possible
The Integrated Public Use Microdata Series
Online Analysis Links
What is SDA?
What can you do with SDA?
The parts of the SDA interface
• Menu
• Variable List
• Active variables
• Analysis Specification
The Basics of SDA
1. Parts of the SDA interface
2. Finding data/variables/subjects
- search
- documentation
3. Analysis
Components - rows, columns, selection, controls
Procedures - crosstabs, means, correlations
4. Aids in Analysis
Recoding
Saving new variables
Downloading
But… before we go live…
Part II. Working with SDA
What is SDA?
SDA (Survey Documentation and Analysis) is a set of programs for the documentation and Web-based analysis of survey data.
It was developed and is maintained by the Computer-assisted Survey Methods Program (CSM) at UC Berkeley.
It was developed as a companion program with CASES (Computer-Assisted Survey Execution System), a package for collecting survey data based on structured questionnaires, using a variety of modes of data collection.
It operates on a transposed file structure, which makes analysis of datasets, especially large datasets, extremely fast.
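The idea behind a transposed file structure can be sketched as storing one list of values per variable instead of one record per case. This is a conceptual illustration only, not a description of SDA's actual file format; the data and the `transpose` helper are invented.

```python
from collections import Counter

# Row-oriented storage: one record per case (hypothetical data).
rows = [
    {"age": 34, "sex": "F", "state": "CA"},
    {"age": 8, "sex": "M", "state": "CA"},
    {"age": 71, "sex": "F", "state": "NY"},
]


def transpose(records):
    """Turn a list of case records into one list of values per variable."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = transpose(rows)

# A frequency table for one variable now reads a single list and never
# touches the other variables, however many the dataset has.
sex_freq = Counter(columns["sex"])
print(dict(sex_freq))
```

With hundreds of variables and millions of cases, an analysis that needs two or three variables reads only those columns, which is the intuition behind the speed claim.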
Part I. The Basics of SDA
What data is available in SDA?
LOTS!
Many popular social science datasets (e.g. the GSS, the ANES, the PUMS from the Decennial Census, the ACS, the CPS Annual Demographic Files, …) can be found in SDA format.
Many archives (ICPSR, IPUMS, CPANDA, Roper, SDA, UCDATA, …) provide at least some of their holdings in SDA format.
Part I. The Basics of SDA
Multiple Census Samples at IPUMS (http://usa.ipums.org/usa/sda/)
And CPS (March files) data, as well (http://cps.ipums.org/cps/sda/)
What can you do with SDA?
SDA can be used to:
• learn about a dataset (metadata, paradata)
• search for variables of interest
• investigate sample sizes and variable distributions
• perform statistical analyses
• transform, manipulate and create variables for each unit
• extract and download subsets or full datasets
The Basics of SDA
The four parts of the SDA interface
• Action Menu
• Variable List
• Active Variables
• Analysis Specification
Part I. The Basics of SDA
Action Menu
Collapsed Variable Tree
Active Variables
Analysis Specification
2. Finding data/variables/subjects
Online SDA codebook
IPUMS detailed documentation
Analysis:
Components - rows, columns, selection, controls
Procedures - crosstabs, means, correlations
Screens will vary depending upon what procedure you are using.
Start with exploratory analysis: frequencies, cross-tabulations
Working with SDA
The variables you are interested in
Who to include in the table
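The components of a crosstab request (a row variable, a column variable, a selection filter for who to include, and a weight) can be sketched in plain Python. This is not SDA's code: the `crosstab` function, the field names and the records are all hypothetical, and control variables are left out to keep the sketch short.

```python
from collections import defaultdict


def crosstab(records, row, column, selection=None, weight=None):
    """Weighted two-way table: {row value: {column value: weighted total}}."""
    table = defaultdict(lambda: defaultdict(float))
    for r in records:
        # Selection filter: skip cases that should not be in the table.
        if selection is not None and not selection(r):
            continue
        # Each case counts as its weight, or as 1 when no weight is given.
        w = r[weight] if weight else 1.0
        table[r[row]][r[column]] += w
    return {k: dict(v) for k, v in table.items()}


# Hypothetical person records with a survey weight `wt`.
people = [
    {"sex": "F", "employed": "yes", "age": 34, "wt": 120.0},
    {"sex": "F", "employed": "no", "age": 71, "wt": 90.0},
    {"sex": "M", "employed": "yes", "age": 45, "wt": 110.0},
    {"sex": "M", "employed": "yes", "age": 16, "wt": 100.0},
    {"sex": "M", "employed": "no", "age": 52, "wt": 80.0},
]

# Rows = sex, columns = employed, limited to ages 18-64, weighted by wt.
adults = crosstab(
    people,
    row="sex",
    column="employed",
    selection=lambda r: 18 <= r["age"] <= 64,
    weight="wt",
)
print(adults)
```

Running the same call without `weight` gives unweighted case counts, which is a quick way to check sample sizes behind each cell.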
Aids in Analysis
Recoding
Saving new variables
Downloading
Part II. Working with SDA
age (5-18): Selects, but does not collapse
age (r: 5-18): Selects AND collapses
age (d: 5-18): Collapses, but does not select
age (c:st,w): Collapses into categories of width w, starting with value st (for example, age (c:13,5))
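The distinction between selecting and collapsing can be sketched in Python. The functions below follow one reading of the slide's short descriptions and are not SDA's implementation; the function names, the category labels, and the choice to leave values below the starting point unchanged are all assumptions.

```python
def select_only(values, lo, hi):
    """Like age (5-18): keep cases in range, keep their original values."""
    return [v for v in values if lo <= v <= hi]


def select_and_collapse(values, lo, hi):
    """Like age (r: 5-18): keep cases in range and merge them into one category."""
    label = f"{lo}-{hi}"
    return [label for v in values if lo <= v <= hi]


def collapse_only(values, lo, hi):
    """Like age (d: 5-18): keep every case, merging in-range values into one category."""
    label = f"{lo}-{hi}"
    return [label if lo <= v <= hi else v for v in values]


def collapse_intervals(values, start, width):
    """Like age (c:st,w): group values from `start` upward into bins of `width`."""
    out = []
    for v in values:
        if v < start:
            out.append(v)  # assumption: values below `start` are left unchanged
        else:
            low = start + ((v - start) // width) * width
            out.append(f"{low}-{low + width - 1}")
    return out


ages = [3, 5, 12, 18, 40]
print(select_only(ages, 5, 18))
print(select_and_collapse(ages, 5, 18))
print(collapse_only(ages, 5, 18))
print(collapse_intervals(ages, 5, 5))
```

The practical difference is in the table that results: selecting changes which cases are counted, while collapsing changes how many categories the counted cases fall into.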
Recoding variables – on the fly
Recoding variables – Web interface
Can be used in row, column, control (Crosstabs)
Question 1: Use the CPS or ACS?
Question 2: What is the desired level of analysis (person, family, household)?
Question 3: Who should be excluded? (How to limit to family households, or only particular age groups, or …?)
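Questions 2 and 3 can be sketched on a toy person file. The field names (`hh_id`, `person_num`, `hh_type`) and the rule "first listed person represents the household" are assumptions made for this illustration; real CPS and ACS files use their own identifiers and relationship variables, so check the documentation before applying the same logic.

```python
# Hypothetical person-level file: several persons share a household id.
persons = [
    {"hh_id": 1, "person_num": 1, "age": 40, "hh_type": "family"},
    {"hh_id": 1, "person_num": 2, "age": 38, "hh_type": "family"},
    {"hh_id": 1, "person_num": 3, "age": 9, "hh_type": "family"},
    {"hh_id": 2, "person_num": 1, "age": 67, "hh_type": "nonfamily"},
    {"hh_id": 3, "person_num": 1, "age": 29, "hh_type": "family"},
    {"hh_id": 3, "person_num": 2, "age": 2, "hh_type": "family"},
]


def household_level(records):
    """Keep one record per household (here, the first listed person)."""
    return [r for r in records if r["person_num"] == 1]


# Level of analysis: households rather than persons.
households = household_level(persons)

# Exclusion at the household level: family households only.
family_households = [h for h in households if h["hh_type"] == "family"]

# Exclusion at the person level: a particular age group.
children = [p for p in persons if p["age"] < 18]

print(len(persons), "persons,", len(households), "households")
print([h["hh_id"] for h in family_households])
print(len(children), "children")
```

Counting households from a person file without first reducing to one record per household would count each household once per member, which is the usual mistake the level-of-analysis question is meant to catch.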
DataFerrett Content
Geographic Resources
Social Explorer
ACS, 2010 Census
Selected Data Resources at Berkeley
Library Data Lab (http://sunsite3.berkeley.edu/wikis/datalab/)
SDA (Survey Documentation & Analysis) (http://sda.berkeley.edu/)
Statewide Database (http://swdb.berkeley.edu/)
California Census Research Data Center (http://www.ccrdc.ucla.edu/)
The Econometrics Lab (http://emlab.berkeley.edu/data2.shtml)
Thomas J. Long Business & Economics Library (http://www.lib.berkeley.edu/BUSI/electres.html)