Terminology for Statistics
How Can End Users Connect?
Stephanie W. Haas
School of Information and Library Science
University of North Carolina at Chapel Hill
Open Forum 2000
Overview
1. Terminology and End User Searching
   - Characteristics of users and searches
   - Types of queries
   - Other sources of confusion
2. Ideas for Solutions
   - Goals
   - What needs to be solved
   - Possible tools and structures
3. Final Points
Terminology and End User Searching
- Characteristics of users and searches
- Types of queries
- Other sources of confusion
Searching isn’t easy
“Query matching is effective only when the search is specific, the searcher knows precisely what he or she wants, and the request can be expressed adequately in the language of the system” (Borgman, 1996, p. 494)
If you don’t know what to call it, you can’t find it.
If you don’t know what it means, you can’t use it.
The Mapping Problem
[Diagram: Data Element(s), Agency Term(s), User’s Term(s), User’s Information Need, Search]
Inside the System – Metadata Registry
Statistical experts’ understanding and usage
Crisp operational definitions (ideal)
Unambiguous terms (ideal)
Minimal or predictable contextual effects
[Diagram: Data Element(s), Agency Term(s)]
Outside the System
Choice of terms may depend on:
- user’s domain knowledge
- user’s search knowledge
- user’s notion of what is available
- terms seen elsewhere
- luck?
[Diagram: User’s Term(s), User’s Information Need]
Users’ Knowledge
Varying sophistication of questions
What is the universe for this survey question, given the questions leading up to it?
What is the current unemployment rate? Please send me the answer before my 9:00 class tomorrow.
Types of Queries
Correct (matching) term
- consumer price index -> consumer price index
Obvious synonym
- health care -> medical care (CPI)
Conceptual cluster of synonyms/near synonyms
- woman, female, girls -> women
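The three query types above can be told apart mechanically once a synonym list and concept clusters exist. A minimal sketch, assuming tiny hand-built tables filled with the slide's own examples; the names `AGENCY_TERMS`, `SYNONYMS`, `CLUSTERS`, and `match_query` are illustrative and do not describe any agency system:

```python
# Hypothetical lookup tables; the entries are the examples from the slide.
AGENCY_TERMS = {"consumer price index", "medical care", "women"}
SYNONYMS = {"health care": "medical care"}          # user term -> agency term
CLUSTERS = {"women": {"woman", "female", "girls"}}  # agency term -> near synonyms


def match_query(query):
    """Return (match_type, agency_term) for a user query, or ("none", None)."""
    term = query.strip().lower()
    if term in AGENCY_TERMS:
        return ("exact", term)
    if term in SYNONYMS:
        return ("synonym", SYNONYMS[term])
    for agency_term, members in CLUSTERS.items():
        if term in members:
            return ("cluster", agency_term)
    return ("none", None)
```

A query such as `inflation` falls through to `("none", None)`; terms of that kind have no direct agency equivalent and need a different kind of support.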
Types of Queries (2)
“External” terms: common outside the agency, no direct data element equivalent inside the agency.
- inflation (generally use CPI or PPI)
- turnover (retention rate? job or profession tenure?)
- new jobs (first appearance on payroll?)
Types of Queries (3)
“Trendy” terms. Subset of external terms.
- cyberjobs (from magazine article)
- Webmaster (recent coinage)
- reinvention
Types of Queries (4)
Concept access
- “Give me everything you have about worker benefits”
- Good answer requires pulling together information from many sources (which may be more or less compatible).
- (See MapStats for example. http://www.fedstats.gov/mapstats/)
Contributing Factors
Confusion about basic statistical concepts
- seasonal adjustment
“Indicates the adjustment of time series data to eliminate the effect of intrayear variations which tend to occur during the same period on an annual basis.” (BLS Selective Access)
“To seasonally adjust a given economic time series is to eliminate that part of the change in the series which can be ascribed to the normal seasonal variation”
“Seasonal adjustment is a mathematical process whereby the effects of recurring non-economic factors are removed from an economic time series.”
(Dictionary of U.S. Government Statistical Terms, 1991)
“A term applied to time series from which periodic oscillations with a period of one year have been removed.” (Cambridge Dictionary of Statistics, 1998)
What is this number, and what does it mean?
- rate, index, ratio, value
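The seasonal adjustment definitions quoted above all describe removing a recurring within-year pattern from a series. A deliberately simplified sketch of that idea, assuming an additive seasonal effect and a series with no missing values; the function name `seasonally_adjust` is illustrative, and the procedures statistical agencies actually use are considerably more elaborate:

```python
from statistics import mean


def seasonally_adjust(series, period=12):
    """Subtract an estimated additive seasonal effect from each observation.

    The seasonal effect for each position in the cycle (e.g. each calendar
    month when period=12) is estimated as the mean of all observations at
    that position minus the overall mean of the series.
    """
    if len(series) < period:
        raise ValueError("need at least one full period of observations")
    overall = mean(series)
    seasonal = [mean(series[pos::period]) - overall for pos in range(period)]
    return [value - seasonal[i % period] for i, value in enumerate(series)]
```

For the toy series `[10, 20, 12, 22]` with `period=2`, the estimated effects are -5 and +5, so the adjusted series is `[15, 15, 17, 17]`: the alternating pattern is gone and only the underlying rise remains.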
Contributing Factors (2)
Major conceptual distinctions and when they apply.
- Different levels of geographical regions, and the data available at each level (nation, region, state, metropolitan area, county)
- Establishment data vs. household data
Note the importance of context in the use of these terms and data.
Contributing Factors (3)
Inherent ambiguity: the pay concept
- Carol Hert & John Fieber, search terms from FedStats Web Page (http://www.fedstats.gov/), 11/98, 28,248 unique queries
- Agency terms used for pay concept include: income, compensation, earnings, wage, salary
BLS/CPS Terms
Total combined income
- “includes money from jobs, net income from business, farm or rent, pensions, dividends, interest, social security payments and any other money income received” (CPS)
Compensation
- “sometimes used to encompass the entire range of wages and benefits” (BLS Glossary of Compensation Terms)
BLS/CPS Terms (2)
Usual weekly earnings
- “include any overtime pay, commissions, or tips usually received” (CPS concepts)
Hourly earnings
- “hourly rate as stated by the employer…does not include tips, commissions, or any other non-hourly wages.” (CPS interviewer manual)
What does this user want?
correction officer, income
- Monetary income received - including that unrelated to job
- Compensation, including benefits - total job package
- Usual weekly earnings - including regular overtime
- Hourly earnings - excluding overtime
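One way to handle a query like this is to show the user every agency concept the word might mean, each with a short note on what it covers. A minimal sketch, assuming a hand-built table; the notes paraphrase the definitions on the preceding slides, and `PAY_CONCEPTS`, `USER_TERM_CANDIDATES`, and `disambiguate` are illustrative names:

```python
# Notes paraphrase the BLS/CPS definitions quoted on the preceding slides.
PAY_CONCEPTS = {
    "total combined income": "monetary income received, including income unrelated to the job",
    "compensation": "wages plus benefits (the total job package)",
    "usual weekly earnings": "includes overtime pay, commissions, or tips usually received",
    "hourly earnings": "hourly rate; excludes tips, commissions, and other non-hourly wages",
}

# Hypothetical and deliberately tiny: only the user term from the slide is mapped.
USER_TERM_CANDIDATES = {
    "income": [
        "total combined income",
        "compensation",
        "usual weekly earnings",
        "hourly earnings",
    ],
}


def disambiguate(user_term):
    """Return (agency_concept, note) pairs the user could be asked to choose from."""
    candidates = USER_TERM_CANDIDATES.get(user_term.strip().lower(), [])
    return [(concept, PAY_CONCEPTS[concept]) for concept in candidates]
```

Calling `disambiguate("income")` yields four options rather than a single guess, which leaves the choice with the user.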
Ideas for Solutions
- Goals
- What needs to be solved
- Possible tools and structures
Goals for Possible Solutions
Maintain the distinction between agency (authority) terms and user terms.
- Note the distinction between a terminology and a user vocabulary.
- User vocabularies often lack structure, stability, or context (although patterns do exist).
Not equally weighted terminologies
[Diagram: terminologies T1 and T2, Data Element Concepts, Data Elements]
Asymmetrical Structure
[Diagram: Agency Terms, User Terms, Data Element Concepts, Data Elements, registry contents]
Maintenance Issues
Indexing is not the primary function of the agency.
Less than total coverage will still help.
Can we assume:
- Agency terms are adopted/defined slowly?
- User terms are more volatile (especially the “trendy” ones)?
How often must mapping structures, procedures be updated?
Easing Users’ Pain
No problem
- same word(s), same meaning
- different word(s), different meaning
Support needed (thesaurus, definitions, explanation)
- different word(s), same meaning (synonyms)
- same word(s) or different word(s), some relationship between meanings (e.g., BT, NT, part-of, domain specific)
- Same word(s) or different word(s), some undefined overlap in meaning
??? Can these users be helped ???
- Same word(s), different meaning (if unnoticed by user)
- Same word(s) or different word(s), no relationship (wrong source of information?)
Providing Agency Information
Substituting agency term(s) for user term(s) and/or expanding user term(s)
- Hidden or overt?
- Automatic or interactive?
Displaying conceptual term clusters (e.g., gender, race, occupation)
Facilitating browsing
Giving definitions and examples
- source?
- “official” or basic?
Highlighting usage notes (the footnotes)
- Who needs to see them? When?
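The "hidden or overt" choice for term expansion can be made concrete in a few lines. A minimal sketch, assuming a small hand-built mapping from user terms to agency terms; `USER_TO_AGENCY` and `expand_query` are illustrative names, and the entries reuse examples from earlier slides:

```python
# Hypothetical mapping; entries reuse examples from earlier slides.
USER_TO_AGENCY = {
    "inflation": ["CPI", "PPI"],
    "health care": ["medical care"],
}


def expand_query(terms, overt=True):
    """Add agency terms to a user's query.

    Returns (expanded_terms, notes). With overt=True each expansion is
    reported in a note the interface could display; with overt=False the
    expansion happens silently (the "hidden" option).
    """
    expanded, notes = [], []
    for term in terms:
        key = term.strip().lower()
        if key not in expanded:
            expanded.append(key)
        additions = USER_TO_AGENCY.get(key, [])
        for agency_term in additions:
            if agency_term not in expanded:
                expanded.append(agency_term)
        if overt and additions:
            notes.append(f'"{key}" also searched as: ' + ", ".join(additions))
    return expanded, notes
```

An interactive variant would show the notes first and let the user accept or reject each expansion before the search runs.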
Crosswalk
Mapping between agency and user terms
- Asymmetrical, build from users’ side
- 80/20 principle for coverage
- Multiple sources of terms:
  - Search sessions
  - Interviews with consultants, intermediaries
  - Media reports, textbooks, other “public” sources
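Building the crosswalk from the users' side with an 80/20 target suggests starting from a query log and mapping only the most frequent user terms. A minimal sketch, assuming the log is available as a flat list of user terms; `terms_for_coverage` is an illustrative name and the 0.8 default simply mirrors the 80/20 idea:

```python
from collections import Counter


def terms_for_coverage(query_terms, target=0.8):
    """Return the most frequent user terms that together account for at
    least `target` of all term occurrences in the log."""
    counts = Counter(query_terms)
    total = sum(counts.values())
    selected, covered = [], 0
    for term, count in counts.most_common():
        if covered / total >= target:
            break
        selected.append(term)
        covered += count
    return selected
```

With a toy log of six `income`, two `salary`, one `wage`, and one `cyberjobs`, the function returns `["income", "salary"]`: two of four distinct terms cover 80% of occurrences, so those two would be mapped to agency terms first.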
Asymmetrical Structure
[Diagram: Agency Terms, User Terms, Data Element Concepts, Data Elements, Crosswalk]
“Enhanced Indexing”
Expanding agency pay terms, FedStats Web page (Hert & Haas, preliminary findings)
Assume that more overlap between terms increases users’ chances of success
Query sessions where 50% of terms were agency terms
- Without expansion = 89%
- With expansion = 73%
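The slide does not say how the overlap figures were computed. As one possible reading only, and not the authors' method, the sketch below counts the share of query sessions in which at least half of the terms are agency terms. The function name, the threshold handling, and the toy data in the usage note are all assumptions, and the sketch does not attempt to reproduce the percentages above:

```python
def share_meeting_threshold(sessions, agency_terms, threshold=0.5):
    """Fraction of sessions in which at least `threshold` of the terms
    appear in `agency_terms`. Empty sessions never count as meeting it."""
    if not sessions:
        return 0.0
    hits = 0
    for terms in sessions:
        if not terms:
            continue
        matching = sum(1 for term in terms if term.lower() in agency_terms)
        if matching / len(terms) >= threshold:
            hits += 1
    return hits / len(sessions)
```

On made-up sessions `[["wage", "nurse"], ["pay", "nurse"], ["earnings"]]` with agency terms `{"earnings", "wage"}`, two of the three sessions meet the threshold. Running the same function against a larger term set shows how the measure responds to a change in the term list.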
Other Possibilities
Thesaurus, with relationships such as see and use for
Multilingual thesaurus or dictionary, treating terminologies as equal
Fully incorporate end-user terms into classification or data element concept entries (Desirable?)
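The thesaurus option, with its see and use for relationships, amounts to a two-way index between preferred (agency) terms and non-preferred (user) terms. A minimal sketch, assuming each user term points to exactly one preferred term; the class and method names are illustrative:

```python
class Thesaurus:
    """Tiny thesaurus holding 'see' and 'use for' relationships."""

    def __init__(self):
        self._see = {}      # non-preferred term -> preferred term ("see")
        self._use_for = {}  # preferred term -> non-preferred terms ("use for")

    def add(self, preferred, non_preferred):
        """Record that `non_preferred` should lead to `preferred`."""
        self._see[non_preferred] = preferred
        self._use_for.setdefault(preferred, []).append(non_preferred)

    def resolve(self, term):
        """Follow a 'see' reference; terms without one are returned as-is."""
        return self._see.get(term, term)

    def use_for(self, preferred):
        """List the user terms that the preferred term stands in for."""
        return list(self._use_for.get(preferred, []))
```

After `add("medical care", "health care")`, `resolve("health care")` returns `"medical care"` and `use_for("medical care")` returns `["health care"]`. Keeping the two directions separate preserves the asymmetry between agency and user terms instead of treating them as equals.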
Final Points
- Users are inventive in term use.
- Users discourage easily.
- Maintenance is a crucial concern.
- Is the 80/20 principle useful?