Doing data & statistics at the reference desk
(some of) what you’ll need to know
OLA Super Conference 2003
2003.02.01
Walter W. Giesbrecht
Data Librarian, York University
not this kind of Data ...
… but these kinds!
what’s on the menu
• how to deal with numeric panic
• definitions
– types of data & statistics, analysis
• things to learn about data and the reference interview
• sources of data & statistics
– tools required
what’s (mostly) not on the menu
• geographic data files
– not qualified to deal with it in great detail
– those interested will have attended Friday’s session (“GIS and Digital Map Reference for Non-Map Librarians”)
• details on 2001 Census of Canada
– general overview only
– those interested will already have attended Thursday’s session (“Get Familiar With Canada!”)
numeric panic!
• related conditions are numerophobia, arithmophobia, statistophobia
• in librarians, a condition brought on by a request for a statistical fact, figure, table or data
• symptoms include
– a blank mind
– feeling of a clenched fist in your stomach
– urge to run from the reference desk
how to deal with numeric panic?
• ask the right questions
• search the right sources
• spread it around!
– know who to turn to for help
– train colleagues so the load doesn’t fall only on you
what are data?
• facts or figures from which conclusions can be drawn
• numeric files created and organized
– for analysis, or to create a new table
• includes geographic data
– (to make maps)
what data are not
"The plural of anecdote is not data."
-- Roger Brinner
what are statistics?
• type of information obtained through mathematical operations on numerical data
• statistics are processed data, or data that have been analyzed in some way
• generally used to support an argument or position in a study or report
statistics
• in print form, typically found in statistical abstracts, census and other government publications (monograph or serial)
• in digital form, found on CD-ROM or in online databases
data vs. statistics
• difference between looking at a photograph and taking the photograph yourself
• statistics are like a photograph or postcard
– a captured image of the data chosen by someone else
• data are like the view through a camera
– you choose the view you want
the data continuum
[diagram: a ‘number’ -- tables, charts, graphs -- raw survey data; aggregate data at one end, microdata at the other]
• examples of aggregate data
– # French Mother Tongue (1996) in Ontario
– employment levels by occupation class
– annual inflation rate from 1914 to present
• example of microdata
– coded responses of surveyed individuals
aggregate data
• data that have been grouped or summarized in some way
– e.g., by geography or age group
• boundary between aggregate data and statistics sometimes blurry
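A minimal sketch of what "grouped or summarized" can mean in practice: individual records are counted by age group to produce a small aggregate table. The records, field names, and the `aggregate_by` helper are made up for illustration and do not come from any Statistics Canada file.

```python
from collections import Counter

# Made-up individual-level records (hypothetical field names and values).
records = [
    {"age_group": "15-24", "employed": True},
    {"age_group": "15-24", "employed": False},
    {"age_group": "25-54", "employed": True},
    {"age_group": "25-54", "employed": True},
    {"age_group": "55+", "employed": False},
]


def aggregate_by(rows, key):
    """Count how many records fall into each category of `key`."""
    return Counter(row[key] for row in rows)


# One row per age group instead of one row per person.
counts = aggregate_by(records, "age_group")
for group, n in sorted(counts.items()):
    print(group, n)
```

Once the counts are published as a table, the individual responses can no longer be recovered from them, which is one way to see where aggregate data start to shade into statistics.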
aggregate data structure
• time
– e.g., time series data from CANSIM, Labour Force Historical Review, multiple Census years
• geography
– e.g., Census data – neighbourhood --> national
• social content
– e.g., injury data from Health Indicators Database
Beyond 20/20 table
microdata
• unsummarized data
– often samples of actual responses to surveys
• two types of microdata files
– master file -- raw data, usually directly available only to STC employees and authorized researchers
– PUMF (public-use microdata file) -- anonymized version of master file
excerpt from NPHS microdata file
column 8 -- sex of respondent
column 13 -- pets?
columns 42-44 -- # visits to eye specialist
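A microdata file like this is typically read by column position. The sketch below slices one record using the three positions named above; the field names, the `parse_record` helper, and the sample record are hypothetical, and the record is not real NPHS data.

```python
# Column positions as named on the slide: 1-indexed and inclusive.
# The field names are assumptions chosen for readability.
LAYOUT = {
    "sex": (8, 8),
    "pets": (13, 13),
    "eye_specialist_visits": (42, 44),
}


def parse_record(line):
    """Slice one fixed-width record into named fields (kept as raw strings)."""
    return {name: line[start - 1:end] for name, (start, end) in LAYOUT.items()}


# A made-up 44-character record: filler zeros except at the three fields.
record = "0" * 7 + "2" + "0" * 4 + "1" + "0" * 28 + "003"

print(parse_record(record))
```

The values come back as coded strings; turning a code such as "2" into a label still requires the survey's codebook.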
the analysis continuum
[diagram: the data continuum (a ‘number’ -- tables, charts, graphs -- raw survey data) set against the statistical analysis continuum]
• descriptive statistics (aggregate data?)
– counts
– percentages
– averages
– standard deviations
• inferential statistics (microdata)
– significance testing
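The descriptive end of the continuum can be sketched with Python's standard `statistics` module. The list of values is made up for illustration, and significance testing is left out because it needs more than a few lines.

```python
import statistics

# Made-up values, e.g. number of visits reported by five respondents.
visits = [0, 1, 1, 2, 6]

count = len(visits)                   # a count
average = statistics.mean(visits)     # an average
std_dev = statistics.stdev(visits)    # sample standard deviation

# Percentage of respondents reporting at least one visit.
pct_with_visit = 100 * sum(1 for v in visits if v > 0) / count

print(count, average, round(std_dev, 2), pct_with_visit)
```

Each of these is a single summary of the whole list, which is why descriptive statistics sit close to aggregate data on the continuum.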
aggregate data vs. microdata in the reference interview
• aggregate data are what you’ll be working with at the reference desk (most of the time)
• microdata usually require referral to data librarian or Statistics Canada, except when ...
examples of Web interfaces to microdata
• QWIFS (Queen's Web Interface For SPSS) < link >
• TriUniversity Data Resources < link >
data at the desk: the reference interview
• proper reference interview will help you tremendously
• makes referrals more efficient
reference interview -- one view
another view
[flowchart: numbers (few / many); intended use (report / analysis); exists in print? (YES / NO); exists as data? (YES / NO); outcomes: print source, data source, OTHER]
essential factors in data reference interview
• geography
– determines jurisdiction, reporting agency
• time
– current / historical / both (time series)
• level of observation
• intended use
• format
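One way to keep these five factors in view during an interview is a simple checklist structure. This is only a sketch: the `DataRequest` class, its field names, and the sample values are hypothetical, not part of any library system.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataRequest:
    """The five interview factors; None means not yet established."""
    geography: Optional[str] = None
    time: Optional[str] = None
    level_of_observation: Optional[str] = None
    intended_use: Optional[str] = None
    data_format: Optional[str] = None

    def missing(self) -> List[str]:
        """Return the factors that still need to be asked about."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Example: the patron has only said where and when so far.
request = DataRequest(geography="Ontario", time="1996")
print(request.missing())
```

Recording what is still unknown also helps when the question has to be referred, since the next person can see what has already been asked.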
how to know where to look
• know your users
• know your sources
– don’t ignore print sources
• know your limitations
• know who to ask for help!
jurisdiction & reporting agency
• Canada
– Federal: National Accounts, Census, Trade
– Provincial: Health, Education
• International
– United Nations, OECD, IMF, World Bank, Eurostat
• United States
– Federal Departments: Commerce, Labor, Justice, Agriculture
Canadian data
• Statistics Canada is generally the first stop for Canadian data
• search tools:
– the Daily
– Online Catalogue
– Thesaurus
– CANSIM
– E-STAT
The Daily
Beyond 20/20
• application used by STC to display many of their data tables
• easily handles large tables with multiple dimensions
• user can easily manipulate the data to get the desired presentation
• data can also be exported to other formats
link to table
STC online catalogue
STC thesaurus
STC publications on the Web
• two ways to get them
– free, direct from Statistics Canada
– free (to eligible institutions) via DSP
CANSIM
• premier source of Canadian time-series data
• available through
– subscription via UofT (DLI only)
– E-STAT (educational institutions, DLI & DSP)
– STC – same interface as E-STAT, but updated continuously; $3/time series
E-STAT
• intended for use by education community, and DSP libraries
• provides “free” access to CANSIM
– CANSIM on E-STAT only updated once a year
• census data from 1986-2001, and selected censuses from 1665-1871
• data can be mapped/exported
map generated in E-STAT
2001 Census
• lots of material available on STC website, and much more to come
– much more than for 1996 census
• two levels of access
– level 1: general population
– level 2: DLI & DSP institutions
link
information available from STC
training & instruction
• ask your data person for a training session
• take advantage of training offered by CAPDU/DLI
• get to know the most heavily-used sources
• if you find a really good source, tell somebody!
training, etc.
• create your own web page(s) of favourite and/or heavily-used sources
– York
– UofT
– “cheat sheets”
• DON’T BE AFRAID TO ASK FOR HELP!
sources of help
• CAPDU
– Canadian Association of Public Data Users
• DLILIST
– Data Liberation Initiative
• INFODEP
– Depository Services Program
• Don’t be afraid to ask questions; all the stupid ones have already been asked -- by “experts”!
http://www.yorku.ca/walterg/ola2003/