The GovStat Project ils.unc.edu/govstat Integration of Data and Interfaces to Enhance Human...
Date posted: 21-Dec-2015
The GovStat Project
ils.unc.edu/govstat
Integration of Data and Interfaces to Enhance Human Understanding of Government Statistics: Toward the National Statistical Knowledge Network
Carol A. Hert
Syracuse University
NSF Grants EIA 0131824 and EIA 0129978
Principal Investigators: Gary Marchionini, Stephanie Haas, Ben Shneiderman, Catherine Plaisant, and Carol Hert
Gov Stat
Project Partners
• Bureau of Labor Statistics
• Census Bureau
• Center for Health Statistics
• Social Security Administration
• National Agriculture Statistical Service
• Energy Information Administration
Project Goals
• To create an integrated model of user access to and use of US government statistical information (The Statistical Knowledge Network)
• To design and test prototype interface tools to support finding and using statistics
• To support integration (technical and intellectual) of statistical data
Statistical Knowledge Network Architecture
[Diagram labels: Agencies; SKN Registry; SKN Consortium; Ontology; Rules & Constraints; Private Work Space (Objects, Actions), repeated several times. Objects: reports metadata, tables metadata, people metadata, glossary, annotations. Actions: contribute, find, display, annotate, understand, manipulate, collaborate.]
Statistical Knowledge Network Architecture
• Enable statistical agencies to:
– Reach wider audiences
– Standardize strategies for transmission, retrieval & use
– Reduce costs
– Facilitate cooperation among agencies & organizations
Goal: Increase find-ability, understand-ability & use of government statistics
Metadata as a Linchpin of Integration of Diverse Statistical Information
• Metadata during statistical information seeking
• User studies of statistical information use
• Building a schema to support these activities
• A hierarchy of integration (and the metadata to support it)
• With a few closing words on technology transfer!
Metadata for Statistical Information Seeking
• The user challenges:
– Who has the relevant data?
• decentralized statistical system
– Finding data that map to the set of topical, time period, geographic and other requirements
• Interface tool relying on metadata (currently harvested automatically from webpages)
– Supports exploration prior to access
[Screenshot: Relation Browser with all EIA pages]
User Studies of Metadata and Statistical Information Use
1. Metadata requirements for understanding tables (Hert & Hernández, 1999).
2. Metadata requirements in a variety of integration tasks (Denn, Haas, & Hert, 2003).
3. Statistical comparisons, in particular the types of comparisons made and the rules experts employ when making them (Hert, 2004).
Some insights from the studies
• Some types of metadata needed:
– Definitions
– Survey methodology
– Rationales and information on differences (what is the difference between concept 1 and concept 2)
– Currency of information (what’s the latest data I can get, when will more data be available, etc.)
– Table structure
– Interface design
• Supporting use requires significant amounts of metadata, including some not easily generated (automatically or otherwise)
Some insights from the studies
• Comparing is a key activity in integrating statistics
• Business rules for operating on the metadata are necessary to support user tasks
• Metadata supports help tools; help tools will in turn be necessary to support metadata usage
Metadata Schema Philosophy
• To provide sub-document level access and integration across documents and agencies.
• To provide a minimal set of metadata elements necessary while allowing for extensibility.
• To achieve these goals in a manner that enables efficient transfer to agencies.
Our Schema in Action: An Example
• Scenario: The fact that the percentage of older people in the population of the US is increasing raises a question about the overall economic status of this group. In particular, we are interested in people who are retired or no longer in the work force and over a certain age (65 or older). We want to know the following things to understand the economic status of this particular group of people:
– Income level (in terms of median income) compared to the general (whole) population
– Sources of income
– Employment status
Examples from the Markup
• Table markup:
– For each table, the schema encodes the table title, each row or column heading, and the data values in the table.
• Each data value element references the row and column heading elements associated with it.
• Footnotes are encoded at the highest level to which they apply – the table level, the row/column level, or the individual data value level.
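As a rough illustration of how these references might be followed, the sketch below uses Python's standard `xml.etree.ElementTree` to attach row and column headings, plus table-, row-, and column-level footnotes, to each data value. The element names are the ones used in the deck's own markup examples; wrapping the fragment in a single `<tableInfo>` root and the function name `resolve_cells` are assumptions made for illustration. The name of a footnote element at the individual data value level is not shown in the source, so that level is left out.

```python
import xml.etree.ElementTree as ET

# Fragment built from the Table 3 markup shown in this deck. Wrapping it in a
# single <tableInfo> root is an assumption made so the fragment parses.
SAMPLE = """
<tableInfo>
<tableTitle>Table 3. Comparison of Summary Measures of Money Income and Earnings by Selected Characteristics: 2001 and 2002</tableTitle>
<tableFootnote>Households and people as of March of the following year</tableFootnote>
<rowInfo>
<rowTitle>Age of Householder - 65 years and over</rowTitle>
<rowID>r015</rowID>
</rowInfo>
<colInfo>
<colTitle>2002 - Median money income - value</colTitle>
<colFootnote>dollars</colFootnote>
<colID>c005</colID>
</colInfo>
<cellInfo>
<cellValue rowID="r015" colID="c005">23,152</cellValue>
</cellInfo>
</tableInfo>
"""


def resolve_cells(xml_text):
    """Attach headings and table/row/column footnotes to every data value."""
    root = ET.fromstring(xml_text)
    # Index heading elements by ID so a cell's references resolve in one lookup.
    rows = {r.findtext("rowID"): r for r in root.iter("rowInfo")}
    cols = {c.findtext("colID"): c for c in root.iter("colInfo")}
    resolved = []
    for cell in root.iter("cellValue"):
        row = rows[cell.get("rowID")]
        col = cols[cell.get("colID")]
        resolved.append({
            "table": root.findtext("tableTitle"),
            "row": row.findtext("rowTitle"),
            "column": col.findtext("colTitle"),
            "value": cell.text,
            "footnotes": {
                "table": [f.text for f in root.findall("tableFootnote")],
                "row": [f.text for f in row.findall("rowFootnote")],
                "column": [f.text for f in col.findall("colFootnote")],
            },
        })
    return resolved


cells = resolve_cells(SAMPLE)
for cell in cells:
    print(cell["row"], "|", cell["column"], "|", cell["value"], cell["footnotes"]["column"])
```

Each resolved record carries its own context, so a value such as 23,152 can be displayed or compared without going back to the original table layout.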
Examples from the Markup
<tableInfo>
<tableTitle>Table 1.1 Percentage with income from specified source, by age, marital status, and sex of nonmarried persons</tableTitle>
<rowInfo>
<rowTitle>Source of Income - Earnings</rowTitle>
<rowID>r001</rowID>
</rowInfo>
<rowInfo>
<rowTitle>Source of Income - Earnings - Wages and salaries</rowTitle>
<rowID>r002</rowID>
</rowInfo>
<rowInfo>
<rowTitle>Source of Income - Earnings - Self-employment</rowTitle>
<rowID>r003</rowID>
</rowInfo>
<rowInfo>
<rowTitle>Source of Income - Retirement benefits</rowTitle>
<rowID>r004</rowID>
</rowInfo>
<rowInfo>
<rowTitle>Source of Income - Retirement benefits - Social Security</rowTitle>
<rowFootnote>Social Security includes retired-worker benefits, dependents' or survivors' benefits, disability benefits, transitionally insured benefits, or special age-72 benefits</rowFootnote>
<rowID>r005</rowID>
</rowInfo>
...
In order to preserve category information, individual row and column headings include the category labelling.
Including the category labelling within the row/column headings improves access to data embedded within tables by making the category information searchable.
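A minimal sketch of why the embedded category labels help search, using the row headings from the Table 1.1 example. It assumes that " - " is the separator between category levels (as it appears in the headings) and uses hypothetical helper names `category_path` and `search_headings`; the project's actual search tooling is not described in the source.

```python
# Row headings copied from the Table 1.1 example.
# Assumption: " - " separates the category levels within a heading.
ROW_TITLES = {
    "r001": "Source of Income - Earnings",
    "r002": "Source of Income - Earnings - Wages and salaries",
    "r003": "Source of Income - Earnings - Self-employment",
    "r004": "Source of Income - Retirement benefits",
    "r005": "Source of Income - Retirement benefits - Social Security",
}


def category_path(title, sep=" - "):
    """Split a heading into its category levels, outermost first."""
    return [part.strip() for part in title.split(sep)]


def search_headings(titles, keyword):
    """Return the IDs of headings in which any category level contains the keyword."""
    needle = keyword.lower()
    return sorted(
        row_id
        for row_id, title in titles.items()
        if any(needle in level.lower() for level in category_path(title))
    )


print(search_headings(ROW_TITLES, "earnings"))         # ['r001', 'r002', 'r003']
print(search_headings(ROW_TITLES, "social security"))  # ['r005']

# Without the category labelling, the same row is invisible to an "earnings" query.
print(search_headings({"r002": "Wages and salaries"}, "earnings"))  # []
```

The last call shows the point of the design choice: a bare "Wages and salaries" heading gives a search for "earnings" nothing to match.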
Examples from the Markup (cont.)
<tableTitle>Table 3. Comparison of Summary Measures of Money Income and Earnings by Selected Characteristics: 2001 and 2002</tableTitle>
<tableFootnote>Source: US Census Bureau, Current Population Survey, 2002 and 2003 Annual Social and Economic Supplements</tableFootnote>
<tableFootnote>Households and people as of March of the following year</tableFootnote>
<rowInfo>
<rowTitle>Age of Householder - 65 years and over</rowTitle>
<rowID>r015</rowID>
</rowInfo>
<colInfo>
<colTitle>2002 - Median money income - value</colTitle>
<colFootnote>dollars</colFootnote>
<colID>c005</colID>
</colInfo>
<cellInfo>
<cellValue rowID="r015" colID="c005">23,152</cellValue>
</cellInfo>
Examples from the Markup (cont.)
<rowInfo>
<rowTitle>Age of Householder - 65 years and over</rowTitle>
<rowID>r015</rowID>
</rowInfo>
<colInfo>
<colTitle>2002 - Median money income - value</colTitle>
<colFootnote>dollars</colFootnote>
<colID>c005</colID>
</colInfo>
<cellInfo>
<cellValue rowID="r015" colID="c005">23,152</cellValue>
</cellInfo>
<colInfo>
<colTitle>Aged 65 or older Total All units</colTitle>
<colID>c003</colID>
</colInfo>
<rowInfo>
<rowTitle>Source of Income - Earnings - Wages and salaries</rowTitle>
<rowID>r002</rowID>
</rowInfo>
<cellInfo>
<cellValue rowID="r002" colID="c003">19</cellValue>
</cellInfo>
Note that since both of these headings contain keywords for age 65 or older, we can begin to think about ways to integrate these data.
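One simple way to surface such candidates is to compare the keywords in headings from different tables. The sketch below is only an illustration of that idea under stated assumptions: the tokenizer, the small stopword list, and the names `keywords` and `candidate_links` are hypothetical, and plain keyword overlap is a stand-in for whatever matching rules the project actually uses.

```python
import re

# Assumption: a tiny stopword list is enough for this illustration.
STOPWORDS = {"of", "and", "or", "the"}


def keywords(heading):
    """Lower-case a heading and return its word and number tokens, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", heading.lower())) - STOPWORDS


def candidate_links(headings_a, headings_b):
    """Pair up headings from two tables that share at least one keyword."""
    links = []
    for a in headings_a:
        for b in headings_b:
            shared = keywords(a) & keywords(b)
            if shared:
                links.append((a, b, sorted(shared)))
    return links


# Headings taken from the two markup fragments in this deck.
TABLE_3_HEADINGS = [
    "Age of Householder - 65 years and over",
    "2002 - Median money income - value",
]
OTHER_TABLE_HEADINGS = [
    "Aged 65 or older Total All units",
    "Source of Income - Earnings - Wages and salaries",
]

links = candidate_links(TABLE_3_HEADINGS, OTHER_TABLE_HEADINGS)
for a, b, shared in links:
    print(f"{a!r} <-> {b!r} via {shared}")
```

On these headings the overlap finds the shared "65" and also a shared "income" between the other two headings. Such matches are only candidates: whether two headings describe comparable populations or measures still depends on definitions, universe statements, and expert judgment.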
What the Example Demonstrates
• Access: preserving data from table titles, row/column headings, and footnotes allows metadata essential for understanding to travel with the data values, and aids in search and retrieval
• Integration: once this essential metadata is tagged, similarities between tags make it easier to investigate options for displaying data from different tables in an integrated manner.
A Hierarchy of Integration
Low level of integration
High level of integration
• Searchable table titles
• Searchable row and column headings
• Linking of data values to row and column headings
• Linking of row and column headings to underlying survey variables
• Linking of analysis units, universe statements, concept definitions, across documents and agencies
• Linking of contextual information (such as footnotes) to tables, row/column headings, or data values
Our schema can provide the items beneath this dotted line.
Limited amount of metadata
Increasing amounts of metadata
Using the Hierarchy of Integration
An organization can determine where to “sit” on this hierarchy in terms of effort and level of integration desired.
What have we learned about technology transfer?
• Must demonstrate utility of research with working prototypes
– Relationship Browser (and other interface tools)
– Metadata workstation in development
• Agencies need simplicity or to understand value of complexity to readjust resources
– Hierarchy of integration used as a conceptual tool
– Provide training
Further information
• [email protected]
• Project website (including demos of Relationship Browser, an interactive glossary tool, etc.) at http://ils.unc.edu/govstat