Public Use Microdata Samples

Using PDQ Explore Software

Grace York, University of Michigan Library

May 2004

2000 Census Data Tabulations

• Summary Files 1-4, Equal Employment Opportunity, School District Data, and Work Flow data are TABULATED data

• American FactFinder EXTRACTS the tabulated data

Public Use Microdata Samples

• Copies of the original questionnaires with identifying information edited out

• Create your own cross tabulations of census data

Typical PUMS Questions

• Single years of age by sex for teachers in Michigan (e.g. when will they retire?)

• Race of those with Arab ancestry (no, they are not all white)

• Demographic characteristics of immigrants from Senegal (age, sex, education, occupation, income, citizenship for a social survey)

• Age, race and sex of automotive industry employees (campaign for organ donations)

PUMS Software Programs

• FTP data from Census Bureau (and manipulate with SAS or SPSS)
  http://www.census.gov/Press-Release/www/2003/PUMS5.html

• Census Bureau CD-ROMs (Beyond 20/20 software)
  http://www.census.gov/mp/www/Tempcat/PUMS.html

• SDA Software for Michigan (UMich only)
  http://nds.umdl.umich.edu/n/nds/

• PDQ Explore
  http://www.pdq.com

PDQ Explore Software

• Easy interface to
  – Public Use Microdata Samples, 1% and 5%, 1980-2000
  – IPUMS, edited PUMS, 1850-1880, 1900-1920, 1940-1990
  – Current Population Survey, 1991+
  – Mortality Schedules

• Permits users to tabulate their own variables

Access to PDQ

• Librarians may request free IDs, passwords, and software from PDQ

• Send e-mail to info@pdq.com stating that
  – You are a librarian who talked to Grace York
  – You are requesting an ID and password for using PDQ Explore
  – You want to download software for the PDQ Toolbox, Expert Edition

http://www.pdq.com

Software

• Download the software per instructions to your hard drive

• To begin searching, open the icon on your desktop

Before Beginning…

Choose File

Two PUMS files – 1% and 5% sample

• 1% has data for the nation, states, MSAs and super-PUMAs (areas with at least 400,000 people)

• 5% has data for the nation, states, MSAs and PUMAs (areas with at least 100,000 people)

Before Beginning…

Define the data you want in terms of a spreadsheet. The longer part should be defined as rows rather than columns.

I want single years of age by sex for all Vietnam-era veterans in the United States

Universe = Vietnam-era veterans in the U.S.
Column = sex (not very wide)
Row = single years of age (could be long)
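The universe/row/column layout can be pictured as a plain cross tabulation. The following is a minimal Python sketch of that idea, not PDQ Explore itself. The toy records and the sex codes (1 and 2) are assumptions made up for the example, and the counts are simple record counts.

```python
from collections import Counter

# Hypothetical person records; real PUMS records carry many more fields.
records = [
    {"age": 52, "sex": 1, "vps5": 1},
    {"age": 52, "sex": 2, "vps5": 1},
    {"age": 55, "sex": 1, "vps5": 1},
    {"age": 52, "sex": 1, "vps5": 1},
    {"age": 30, "sex": 1, "vps5": 0},  # outside the universe
]

def crosstab(rows, universe, row_key, col_key):
    """Count records that pass the universe test, keyed by (row value, column value)."""
    counts = Counter()
    for rec in rows:
        if universe(rec):
            counts[(rec[row_key], rec[col_key])] += 1
    return counts

# Universe: vps5 == 1; Row: age; Column: sex
table = crosstab(records, lambda r: r["vps5"] == 1, "age", "sex")
for (age, sex), n in sorted(table.items()):
    print(age, sex, n)
```

Putting the longer variable (age) on the rows keeps the printed table narrow, which is the same reason the slide recommends it.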

Before Beginning…

Consult Chapter 7 of the PUMS codebook if you want to check the possible variables and the appendices for place/language/ancestry and occupation codes

http://www.census.gov/prod/cen2000/doc/pums.pdf

Chapter 7 is also available on the University of Michigan web site at:

http://www.lib.umich.edu/govdocs/census2/pums2000/pums7.pdf

Before Beginning…

Housing Record
• All geographic codes (state, MSA, PUMA)
• All housing records
• Some population records

Population Record
• All population variables
• OK to combine with geographic codes in housing
• Ask for help with other population/housing combinations at info@pdq.com

Before Beginning…

Variable Codes for the Question in the Technical Documentation Data Dictionary

AGE    Single Years of Age
SEX    Male or Female
VPS5   Veteran’s Period of Service 5: On active duty during the Vietnam Era (Aug. 1964 to Apr. 1975)

http://www.lib.umich.edu/govdocs/census2/pums2000/pums7.pdf

Logging On

Enter the subscriber name and password that you were given by the PDQ staff

Logging On

Press OK to close the message of the day

Defining Workspace

• To conduct a new search, create a new workspace

• Press Finish or Return twice

Defining Workspace

Name your file on your hard drive and save.

Defining Workspace

At the next screen, use the top menu to choose Workspace; then Add a Data Set

Defining Workspace

Browse data sets; highlight ipums, pums, cps, or mortality file; Open

Defining Variables

• Once you choose a data set, its codebook will open up

• Click on the plus button to get a list of variables, their alphabetic symbols, and any numeric values

Defining Variables

• Determine the alphanumeric variables you want (e.g. Vietnam-era veteran: yes is VPS5=1)

• Use the top menu to choose Query/Setup New Expert Query

(Access the codebook later through a tab on the desktop toolbar)

Expert Query Form

1. Make sure you have the correct data set
2. Determine if you want a tabulation (counts or numbers)
3. Name your file

Expert Query Form

Enter the code for UNIVERSE (what you’re counting) in the Universe box (e.g. vps5=1 are Vietnam-era veterans for the entire U.S.)

Expert Query Form

• Enter the code for the variables in the ROW box (age = single years of age; age/5 would be five year age groups)

• Enter the code for the variables in the COLUMN box (e.g. sex)

• Press RESULTS to run the query
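The age/5 grouping can be pictured as integer division. The following is a minimal Python sketch, assuming the division truncates to a whole number; the slides do not spell out how PDQ rounds. The ages and the helper names are made up for illustration.

```python
from collections import Counter

ages = [0, 4, 5, 9, 17, 64, 65, 69]  # hypothetical single years of age

def five_year_group(age):
    """Return the index of the five-year band: 0 for ages 0-4, 1 for 5-9, and so on."""
    return age // 5

def band_label(group):
    """Turn a band index back into a readable age range."""
    low = group * 5
    return f"{low}-{low + 4}"

# Count how many people fall into each five-year band.
counts = Counter(band_label(five_year_group(a)) for a in ages)
for label, n in counts.items():
    print(label, n)
```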

Search Results

Search results appear in spreadsheet format

Saving Results

• Click on File/Export Query Results

• You can save as CSV, tab delimited, and several other formats. CSV (WYSIWYG) recommended for use with Excel

• Use the SETUP button to return to the query, or the icon at the bottom to review the codebook
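Results saved as CSV can be read by other tools as well as Excel. The following is a minimal Python sketch using the standard csv module. The layout shown (a header row, then an age label followed by one count per sex) is an assumption for illustration, not PDQ's documented export format, and results.csv is a hypothetical file name.

```python
import csv
import io

# Stand-in for an exported file; with a real export, use
# open("results.csv", newline="") instead of io.StringIO.
exported = io.StringIO("age,male,female\n52,2,1\n55,1,0\n")

reader = csv.DictReader(exported)
rows = [
    {"age": int(r["age"]), "male": int(r["male"]), "female": int(r["female"])}
    for r in reader
]

# Sum one column as a quick check against the totals shown in the spreadsheet.
total_male = sum(r["male"] for r in rows)
print(total_male)
```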

Geographic Codes

• Geographic codes are found in the Housing documentation

• Limit files to Michigan with the code state=26

• Click on Query/New Expert Query to continue

Narrowing the Universe

Narrow the universe by using & newcode (e.g. vps5=1 & state=26)

Logical Operators in PDQ

http://www.lib.umich.edu/govdocs/census2/pdqop.pdf

& is one of numerous operators used in PDQ

Operator    Name                    Example/Comment

X:a..b      range                   age:15..44
unary +     plus                    sex=+1 (never needed)
unary -     minus                   income4<=-1000
*           multiply                73*income1/100
/           divide                  rhhinc/persons
%           modulo                  subsample%10
+           add                     income1+income2
-           subtract                rhhinc-rearning
<           less than               age<65
>           greater than            age>64
<=          less than or equal      age<=65
>=          greater than or equal   age>=65
= or ==     equal                   age=23
!= or <>    not equal               income!=0
& or &&     and                     race=2 & looking=1
^           exclusive or            bit-wise; use with caution
| or ||     or                      age<18 | age>=65
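For readers more familiar with a general-purpose language, a few of these operators can be paraphrased in Python. This is a rough analogy, not a description of how PDQ evaluates expressions. It assumes the range operator is inclusive at both ends, which matches the 3701..3708 PUMA example later in these slides. The sample record and the in_range helper are made up.

```python
person = {"age": 67, "race": 2, "looking": 1, "subsample": 40}  # hypothetical record

def in_range(value, low, high):
    """Analogue of value:low..high, assumed inclusive at both ends."""
    return low <= value <= high

# age:15..44
is_15_to_44 = in_range(person["age"], 15, 44)

# race=2 & looking=1
race2_and_looking = person["race"] == 2 and person["looking"] == 1

# age<18 | age>=65
child_or_senior = person["age"] < 18 or person["age"] >= 65

# subsample%10 (remainder after dividing by 10)
subsample_mod = person["subsample"] % 10

print(is_15_to_44, race2_and_looking, child_or_senior, subsample_mod)
```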

Altering the Spreadsheet Tabulations

Once you have a spreadsheet, click on Options to create totals or percentages for tables or columns
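Totals and percentages of this kind can also be worked out by hand as a check on the spreadsheet. The following is a minimal Python sketch with made-up counts; it shows the arithmetic only and does not reproduce the Options menu.

```python
# Hypothetical counts: rows are ages, columns are sex (made-up numbers).
table = {
    52: {"male": 30, "female": 10},
    55: {"male": 50, "female": 10},
}

# Row totals: sum across the columns of each row.
row_totals = {age: sum(cols.values()) for age, cols in table.items()}

# Column totals: sum each column down the rows.
col_totals = {}
for cols in table.values():
    for name, n in cols.items():
        col_totals[name] = col_totals.get(name, 0) + n

# Column percentages: each cell as a share of its column total.
col_pct = {
    age: {name: 100 * n / col_totals[name] for name, n in cols.items()}
    for age, cols in table.items()
}
print(row_totals, col_totals, col_pct)
```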

Adding More Parameters

Expand the table detail by repeating the row and column data for another parameter (e.g. race) as shown in Dimension 3

Altering Spreadsheet Appearance

• The default shows separate tables for each of the values in the third dimension (e.g. separate spreadsheets for white and black)

• Change the Axis3 tab to FOREACH to show everything on the same spreadsheet
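A third dimension amounts to building one row-by-column table per value of the extra variable. The following is a minimal Python sketch of that grouping, with made-up records and codes; it illustrates the idea only and says nothing about how PDQ lays out the tables.

```python
from collections import Counter, defaultdict

# Hypothetical records: (age, sex, race)
records = [(52, 1, 1), (52, 2, 1), (52, 1, 2), (55, 1, 2), (55, 1, 2)]

# One age-by-sex table per value of the third dimension (race).
tables = defaultdict(Counter)
for age, sex, race in records:
    tables[race][(age, sex)] += 1

for race in sorted(tables):
    print("race", race, dict(tables[race]))
```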

Calculating Means or Averages

• Calculate averages by changing the query type to summary statistics (e.g. mean or average) at the top

• Fill in the new Describe Expression box at the bottom with a variable code (e.g. age, income)

Complex Table

Mean income of white male Vietnam-era veterans in Michigan by age, whether or not they have earnings

You can respecify the query to include only veterans with earnings

Altering Mean Income

Add & incws > 0 to the universe to count only Vietnam-era veterans who are earning more than $0

Complex Table

Mean income is higher when data limited to wage-earning veterans
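Why the mean rises can be seen with a small worked example. The following is a minimal Python sketch with made-up wage figures. incws is the variable code from the slides; for simplicity the sketch assumes the mean is taken over incws itself.

```python
from statistics import mean

# Hypothetical wage and salary income (incws) for five veterans in the universe.
incws = [0, 0, 30000, 40000, 50000]

# Universe without the incws filter: zero earners pull the mean down.
mean_all = mean(incws)

# Universe with "& incws > 0": only people with earnings are averaged.
mean_earners = mean(x for x in incws if x > 0)

print(mean_all, mean_earners)
```

Dropping the zero earners removes records from the denominator without removing any income from the numerator, so the mean can only go up.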

Small Area Geography

• Data from the PUMS 5% file is available for states, metropolitan areas, and Public Use Microdata Areas (PUMAs) of 100,000

• You can identify a PUMA or group of PUMAs using
  – Maps in American FactFinder (http://factfinder.census.gov/)
  – PDF maps on the Census Bureau web site (http://www.census.gov/geo/www/maps/puma5pct.htm)
  – Mable/Geocorr Search Engine (http://mcdc2.missouri.edu/websas/geocorr2k.html)

Small Area Geography

This map shows Detroit as PUMAs 3701-3708

PUMA Codes for Michigan

Ann Arbor      3200
Detroit        3701-3708
Flint          2200
Grand Rapids   1300
Lansing        1800

PUMA to Place
http://www.lib.umich.edu/govdocs/census2/pumapl00.txt

Place to PUMA
http://www.lib.umich.edu/govdocs/census2/plpuma00.txt

Codebook and PUMAs

The Explore Codebook shows PUMA5 as the term for 5% PUMA boundaries

Small Area Geography and Ranges

When creating data sets for PUMAs, be sure to include the correct state in the universe (e.g. state=26)

Small Area Geography and Ranges

puma5:3701..3708 will list the data for each individual area

Small Area Geography and Ranges

Search result for each individual PUMA

Small Area Geography for Ranges

To get the total for the area, list it in the universe as puma5>3700 & puma5<3709 & state=26
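The difference between listing each PUMA and totaling the group can be sketched as follows. This is a minimal Python illustration with made-up records; the field names mirror the codes on the slides (state, puma5), and the out-of-state record is included only to show what the state=26 condition screens out.

```python
from collections import Counter

# Hypothetical records: (state, puma5)
records = [(26, 3701), (26, 3701), (26, 3705), (26, 3708), (26, 3200), (39, 3701)]

# Row puma5:3701..3708 with universe state=26: one count per PUMA.
per_puma = Counter(
    puma for state, puma in records if state == 26 and 3701 <= puma <= 3708
)

# Universe puma5>3700 & puma5<3709 & state=26: a single combined total.
detroit_total = sum(
    1 for state, puma in records if puma > 3700 and puma < 3709 and state == 26
)

print(dict(per_puma), detroit_total)
```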

Small Area Geography for Ranges

To get a listing of single years of age between 65 and 85, list the column as age:65..85

Calculating Totals

• To calculate the most spoken languages by 65-85 year olds as a group, click on Options/Total Options/Row

Complex Result

Spanish and Polish are the two most popular languages spoken by seniors 65-85 in Detroit


Contacts for Research Assistance

Initial Queries
Grace York, Documents Center, 203 Hatcher, graceyor@umich.edu or 936-2378
JoAnn Dionne, Numeric and Spatial Data Services, 825 Hatcher, jdionne@umich.edu, 763-9408

Complex Data Sets
Lisa Neidert, Population Studies Center, 426 Thompson, lisan@umich.edu, 763-2163
PDQ Staff, 310 Depot Street, Suite C, Ann Arbor 48104, info@pdq.com