Post on 01-Jan-2016
U.S. Census Data & TIGER/Line Files
Census Bureau: charged with the Constitutional responsibility of carrying out the decennial Census of Population and Housing
  Very large mapping component involved in undertaking a national census!
Census demographic/socioeconomic data:
  Demographic, economic, & social data about persons & households
  Aggregated by census enumeration units: e.g., block, block group, tract, county, metropolitan area, etc.
TIGER/Line files:
  The "geography" of the census
  Topologically Integrated Geographic Encoding & Referencing
  e.g., polygons for enumeration units, streets & landmarks
TIGER/Line files - background
1967 - New Haven Census Use Study
  test digital data structures for storing census data by geographic areas
  test processes for creating computerized Census maps
  had topology!!
1970s - Census DIME files
  expansion of New Haven study into production version
  data coverage: U.S. urban areas
  important component of 1980 decennial Census
1980s - development of TIGER/Line files
  incorporated DIME files for urban areas (DIME updated in 1981 & 1985)
  incorporated nationwide 1:100,000 USGS DLG data
  additional information from local officials & Census fieldwork
1990s - TIGER in use
  used for 1990 Census
  TIGER updated nearly yearly after 1990 from variety of sources
  1998-1999: major update & prep for 2000 Census
2000 - latest Census
  2nd use of TIGER for Census
  data being released now
TIGER/Line Files
TIGER designed to:
  support pre-census functions in preparation for Census of Population and Housing
  support census-taking efforts
  evaluate success of the Census
  provide geographic framework for analysis
Nominal scale: 1:100,000
Data "layers":
  Enumeration units
    blocks, block groups, tracts/block numbering areas, counties, cities/MA, etc.
    multiple hierarchies
  Voting districts
    used for Congressional redistricting
  Supporting geography
    roads/streets/highways
    basic hydrography
    point & area landmarks
    etc.
TIGER Area (polygon) & Landmark Data
  Point and polygon landmarks
  Census geography (tracts, blocks, etc.)
    used for reporting Census data
    ID linkage from polygons in TIGER/Line data to Census attribute data
TIGER Line and Address Data
  Linear features...
    Form polygon boundaries
    Roads
      attributes include basic road type, address ranges
    also hydro features, etc.
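Because linear features form polygon boundaries, a topological line record can carry the IDs of the polygons on either side of it. The following is a minimal sketch of that idea only. The record layout (`line_id`, name, `left_poly`, `right_poly`) and the sample values are hypothetical and do not reproduce the actual TIGER/Line record format.

```python
from collections import defaultdict

# Hypothetical line records: (line_id, feature_name, left_poly, right_poly).
# None means there is no polygon on that side (edge of the covered area).
LINES = [
    (1, "Main St", "A", "B"),
    (2, "Oak Ave", "A", None),
    (3, "Mill Creek", "B", None),
    (4, "Elm St", "A", None),
]

def boundary_lines(lines):
    """Group line IDs by the polygon(s) they bound."""
    bounds = defaultdict(list)
    for line_id, _name, left, right in lines:
        for poly in (left, right):
            if poly is not None:
                bounds[poly].append(line_id)
    return dict(bounds)

def neighbors(lines):
    """Polygons on opposite sides of the same line are adjacent."""
    return {tuple(sorted((left, right)))
            for _id, _name, left, right in lines
            if left is not None and right is not None}
```

With left/right IDs stored on each line, polygon boundaries and adjacency fall out of a single pass over the line table, with no coordinate geometry required.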
Link to Census Data
  Census attribute data - Summary Tape File (STF) data files
  Link to Census geographic entities in TIGER/Line files using unique Census geography IDs
  Lets us merge a tremendously rich source of detailed socioeconomic data (Census) with a comprehensive geography for the entire country...
Orange County, NC block groups w/ median income data (darker green = higher income)
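The ID linkage behind a map like this can be sketched as a key lookup: each polygon record carries a geography ID, and the attribute table is keyed by the same ID. The IDs, income values, and field names below (`geoid`, `median_income`) are invented for illustration and are not real Census values or STF field names.

```python
# Hypothetical block-group polygons; "geometry" stands in for real coordinates.
polygons = [
    {"geoid": "BG-001", "geometry": "polygon 1"},
    {"geoid": "BG-002", "geometry": "polygon 2"},
    {"geoid": "BG-003", "geometry": "polygon 3"},
]

# Hypothetical attribute table keyed by the same geography ID.
median_income = {"BG-001": 41000, "BG-002": 58500}

def join_attributes(polys, attributes, key="geoid", field="median_income"):
    """Attach an attribute value to each polygon by matching on the shared ID."""
    joined = []
    for poly in polys:
        row = dict(poly)                        # copy so the input is not modified
        row[field] = attributes.get(poly[key])  # None when no attribute row matches
        joined.append(row)
    return joined

mapped = join_attributes(polygons, median_income)
```

Polygons without a matching attribute row keep a `None` value, which makes unmatched geography easy to spot before shading the map.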
hierarchical tabulation systems, e.g.:
USA
Region
Division
State
County
Tract
Block Group
Block
2000 Census tallies for entire US:
65,443 tracts
208,790 block groups
8,205,582 blocks
for NC:
1,563 tracts
5,271 block groups
232,403 blocks
Census Geographic Hierarchy
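The nesting from state down to block is reflected in how block identifiers are commonly written: a 15-digit string made of a 2-digit state code, 3-digit county code, 6-digit tract code, and 4-digit block code whose first digit is the block group. The sketch below assumes that layout. The state and county codes in the example are those of Orange County, NC (37, 135), while the tract and block digits are made up.

```python
def split_block_id(block_id):
    """Return the ID of each enclosing unit for a 15-digit block ID.

    Assumes the layout state(2) + county(3) + tract(6) + block(4),
    where the first block digit identifies the block group.
    """
    if len(block_id) != 15 or not block_id.isdigit():
        raise ValueError("expected a 15-digit block ID")
    return {
        "state": block_id[:2],
        "county": block_id[:5],
        "tract": block_id[:11],
        "block_group": block_id[:12],
        "block": block_id,
    }

# Tract "010700" and block "1003" are hypothetical values for illustration.
parts = split_block_id("371350107001003")
```

Because each level's ID is a prefix of the level below it, block data can be rolled up to block groups, tracts, or counties by truncating the ID and summing.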
TIGER Address Data
address ranges: street address numbers at beginning and ending of arc/line in database
allows address geocoding
  match data with address to a spatial location using an interpolated estimate
data use implication: explosion of analysis and data integration capabilities!
  extremely large (and growing) amount of data tied to addresses
problem: incomplete address range data, esp. in rural areas
  why?
    some areas simply have incomplete data (very large data collection task)
    PO rural routes (though this is changing due to E-911 systems)
  Census Bureau steadily improving rural address data
  private street/address data providers enhance address range data
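The interpolated estimate used in address geocoding can be sketched as simple linear interpolation between the two ends of a line segment. This is a simplified illustration: it treats the street as one straight segment with a single address range and ignores side of street and odd/even parity. The function name and parameters are hypothetical.

```python
def geocode(house_number, from_addr, to_addr, start_xy, end_xy):
    """Estimate a location by interpolating a house number along a segment.

    from_addr/to_addr are the address numbers at the start and end of the
    segment; start_xy/end_xy are the matching (x, y) coordinates.
    Returns None when the number falls outside the segment's range.
    """
    low, high = sorted((from_addr, to_addr))
    if not low <= house_number <= high:
        return None
    if to_addr == from_addr:
        t = 0.5  # single-address range: fall back to the midpoint
    else:
        t = (house_number - from_addr) / (to_addr - from_addr)
    x = start_xy[0] + t * (end_xy[0] - start_xy[0])
    y = start_xy[1] + t * (end_xy[1] - start_xy[1])
    return (x, y)

# House number 150 on a segment running 100-200 lands at the midpoint.
location = geocode(150, 100, 200, (0.0, 0.0), (10.0, 0.0))
```

The result is only an estimate: it assumes addresses are evenly spaced along the segment, which is one reason incomplete or coarse address ranges degrade geocoding quality.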
Relational Database Structure
Relational DBMS
Data stored as tuples (tup-el), conceptualized as tables
Table – data about a class of objects
  Two-dimensional list (array)
  Rows = objects
  Columns = object states (properties, attributes)
Row = object
Column = property
Table = Object Class
Object Classes with Geometry called Feature Classes
Relation Rules
Only one value in each cell (intersection of row and column)
All values in a column are about the same subject
Each row is unique
No significance in column sequence
No significance in row sequence
Joined Table
Relational Join
Fundamental query operation
Occurs because
  Normalization
  Data created/maintained by different users, but integration needed for queries
Table joins use common keys (column values)
Table (attribute) join concept has been extended to geographic case
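A join on a common key can be sketched with Python's built-in `sqlite3` module. The table names, column names, and values below are invented for illustration and do not come from any Census product.

```python
import sqlite3

def joined_rows():
    """Join a geography table to an attribute table on a shared key column."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tracts (tract_id TEXT PRIMARY KEY, county TEXT)")
    con.execute("CREATE TABLE income (tract_id TEXT, median_income INTEGER)")
    con.executemany("INSERT INTO tracts VALUES (?, ?)",
                    [("T1", "Orange"), ("T2", "Orange")])
    con.executemany("INSERT INTO income VALUES (?, ?)",
                    [("T1", 41000), ("T2", 58500)])
    # The common key tract_id links rows from the two tables.
    rows = con.execute(
        "SELECT t.tract_id, t.county, i.median_income "
        "FROM tracts AS t JOIN income AS i ON t.tract_id = i.tract_id "
        "ORDER BY t.tract_id"
    ).fetchall()
    con.close()
    return rows
```

The two tables could be maintained by different users. Only the shared key has to agree for the join to bring them together at query time.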
Normalization
Process of converting tables to conform to relational rules
Split tables into new tables that can be joined at query time
  The relational join
Several levels of normalization
  Forms: 1NF, 2NF, 3NF, etc.
Normalization creates many expensive joins
  De-normalization is OK for performance optimization
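The split-then-join idea can be sketched in plain Python. A flat table that repeats the county name on every tract row is split into a tract table and a county table, and the original rows are recovered by joining on the county key. All table contents and names here are hypothetical.

```python
# Flat table: (tract_id, county_id, county_name, median_income).
# The county name is repeated on every row for the same county.
FLAT = [
    ("T1", "C1", "Orange", 41000),
    ("T2", "C1", "Orange", 58500),
    ("T3", "C2", "Durham", 39000),
]

def normalize(flat):
    """Split the flat table so each county name is stored only once."""
    counties = {}
    tracts = []
    for tract_id, county_id, county_name, income in flat:
        counties[county_id] = county_name
        tracts.append((tract_id, county_id, income))
    return tracts, counties

def denormalize(tracts, counties):
    """Rebuild the flat rows by joining tracts to counties on county_id."""
    return [(tract_id, county_id, counties[county_id], income)
            for tract_id, county_id, income in tracts]

tracts, counties = normalize(FLAT)
```

The normalized form stores each county name once, at the cost of a join whenever the name is needed. Keeping the flat form trades that storage and update safety for faster reads.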
Spatial Relations
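Equals – geometries are the same
Disjoint – geometries share no common point
Intersects – geometries share at least one point
Touches – geometries intersect only at their boundaries
Crosses – geometries share some, but not all, interior points (e.g., a line crossing a polygon)
Within – geometry lies completely inside another
Contains – geometry completely contains another
Overlaps – geometries of same dimension share some, but not all, points
Relate – tests intersections between interior, boundary and exterior of two geometries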
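Several of these relations can be illustrated on the simplest possible geometry, axis-aligned rectangles. The sketch below is an assumption-laden teaching aid rather than a general implementation: it only handles non-degenerate boxes (positive width and height), it leaves out Crosses and Relate, and the `Box` type and function names are hypothetical.

```python
from collections import namedtuple

# Axis-aligned rectangle; assumed to have xmin < xmax and ymin < ymax.
Box = namedtuple("Box", "xmin ymin xmax ymax")

def disjoint(a, b):
    """True when the boxes share no point at all."""
    return (a.xmax < b.xmin or b.xmax < a.xmin or
            a.ymax < b.ymin or b.ymax < a.ymin)

def intersects(a, b):
    """True when the boxes share at least one point."""
    return not disjoint(a, b)

def _interiors_meet(a, b):
    """True when the open interiors of the boxes share a point."""
    return (a.xmin < b.xmax and b.xmin < a.xmax and
            a.ymin < b.ymax and b.ymin < a.ymax)

def touches(a, b):
    """True when the boxes meet only along their boundaries."""
    return intersects(a, b) and not _interiors_meet(a, b)

def within(a, b):
    """True when a lies completely inside b (boundaries may coincide)."""
    return (b.xmin <= a.xmin and a.xmax <= b.xmax and
            b.ymin <= a.ymin and a.ymax <= b.ymax)

def contains(a, b):
    """True when a completely contains b."""
    return within(b, a)

def overlaps(a, b):
    """True when interiors meet but neither box lies inside the other."""
    return _interiors_meet(a, b) and not within(a, b) and not within(b, a)

a = Box(0, 0, 2, 2)
```

For example, `Box(1, 1, 3, 3)` overlaps `a`, `Box(2, 0, 4, 2)` touches it along the shared edge at x = 2, and `Box(5, 5, 6, 6)` is disjoint from it.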
Two Possible Relations
Point Quadtree
Region Quadtree