Post on 05-Jan-2016
9 October 2006 EPA Meeting
NCSU Libraries
Preservation Partnership with Library of Congress:
NDIIPP and the North Carolina Geospatial Data Archiving Project
Steve Morris
Jim Tuttle
Rob Farrell
Jeff Essic
NCSU Libraries
What is NDIIPP? Why NCSU Libraries?
• NDIIPP = National Digital Information Infrastructure and Preservation Program
• Responding to concern that we might be in the middle of a “digital dark age,” Congress earmarked $100 million for digital preservation efforts through 2010
• Timeline
– Aug. 2003: Library of Congress (LC) puts out call for proposals for “preservation partners”
– Sept. 2004: LC finalizes agreements with eight principal partners, including NCSU
– Oct. 2004: the three-year projects begin
• A cooperative agreement … not a grant
– Emphasis on ongoing interaction with LC and other partners, with transfer of learning experience to LC as primary outcome
NC Geospatial Data Archiving Project (NCGDAP)
• Partner: NC Center for Geographic Information & Analysis (state agency)
• Focus: State and local agency digital geospatial data in NC as state demonstration
• Objective: Engage existing spatial data infrastructure (SDI) in the problem of preservation
• Tied to the NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
Geospatial Data Types: Vector & Attribute Data
Time series: Parcel Boundary Changes 2001-2004
North Raleigh, NC
Geospatial Data Types: Vector Data
Time series: Parcel Boundary Changes 2001-2004
North Raleigh, NC
Geospatial Data Types: Aerial Imagery
• 85+ NC counties with orthophotos
• 1-5 flights per county
• 30-200 GB per flight
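As a rough back-of-the-envelope illustration of the volumes these figures imply, the sketch below multiplies out the low and high ends of the ranges on this slide. It assumes exactly 85 counties (the slide says "85+") and treats the flight and size ranges as uniform bounds across counties; the function name is hypothetical.

```python
def orthophoto_volume_gb(counties, flights_per_county, gb_per_flight):
    """Total imagery volume in GB, assuming every county is identical."""
    return counties * flights_per_county * gb_per_flight


# Lower bound: 85 counties, 1 flight each, 30 GB per flight.
low = orthophoto_volume_gb(85, 1, 30)
# Upper bound: 85 counties, 5 flights each, 200 GB per flight.
high = orthophoto_volume_gb(85, 5, 200)

print(f"Estimated statewide orthophoto volume: {low:,} GB to {high:,} GB")
```

Under these assumptions the estimate spans roughly 2.5 TB to 85 TB, which is why large data volumes figure in the ingest testing.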
Today’s Geospatial Data as Tomorrow’s Cultural Heritage
Future uses of data are difficult to anticipate (as with Sanborn Maps).
Digital Preservation Points of Failure
• Data is not saved, or …
• can’t be found, or …
• media is obsolete, or …
• media is corrupt, or …
• format is obsolete, or …
• file is corrupt, or …
• meaning is lost
Solutions:
• Migration
• Emulation
• Encapsulation
• XML
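The "file is corrupt" point of failure can only be acted on if corruption is detected. One common way to do that, which the slides do not describe and which is shown here only as an assumed illustration, is to record a checksum at deposit time and compare it later. The function names below are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large imagery files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, recorded_checksum):
    """True if the file still matches the checksum recorded at deposit."""
    return sha256_of(path) == recorded_checksum
```

A repository would store the value returned by `sha256_of` alongside the file and call `is_intact` on a schedule; a mismatch signals that a copy needs to be restored from another source.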
Risks to Digital Geospatial Data
• Producer focus on current data
– Data overwrite as common practice
• Future support of data formats in question
– No open, supported format for vector data
• Shift to web services-based access
– Data becoming more ephemeral
• Inadequate or nonexistent metadata
– Impedes discovery and use
• Increasing use of spatial databases for data management
– Complex entities: the whole is greater than the sum of the parts
NCGDAP Approach to Preservation
• Technical solutions: How do we archive acquired content over the long term?
– Build a data repository: not as an end in itself but as a catalyst for discussion within the data community
– Develop a repository ingest workflow: create technical points of engagement with the NDIIPP partners
• Cultural/Organizational solutions: How do we make the data more preservable, and more prone to be archived, from point of production?
– Engage data producer community and spatial data infrastructure through outreach and engagement; influence practice
– Sell the problem to software vendors and standards development
– Find overlap with more compelling business problems: disaster preparedness, business continuity, road building, etc.
– Start a discussion about roles at the local, state, and federal level
Repository Ingest Workflow
• Flexible, extensible processes
• Clear, documented procedures
• Adherence to standard practices, where they exist
• Automation
Technical Solution: Building a Digital Repository
• Three “Rights”:
– Right format
– Right tags (metadata)
– Right relationship
Oh, and of course, valid for the rest of the Digital Age!
NCGDAP is about researching methodologies…
What is the “Right” Format???
• Well, it’s complicated…
• Databases
• Multi-part datasets
• Open Source
• Developments
• Web Services
Our Format Methodology
• Decide on archival format(s)
• Migrate non-archival formats
• Archive both versions of the data set
We need a methodology that can do this a few hundred thousand times… initially.
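The three steps above can be sketched as a small planning function. The format lists here are purely hypothetical placeholders (the slides do not say which formats the project chose as archival), and `plan_deposit` is an illustrative name, not a project tool.

```python
from pathlib import Path

# Hypothetical placeholders: NOT the project's actual format decisions.
ARCHIVAL_EXTENSIONS = {".tif", ".gml"}
MIGRATION_TARGETS = {".shp": ".gml", ".sid": ".tif"}


def plan_deposit(filename):
    """Decide what to archive for one file: the original, plus a migrated
    copy when the original is not already in an archival format."""
    extension = Path(filename).suffix.lower()
    if extension in ARCHIVAL_EXTENSIONS:
        # Already archival: nothing to migrate.
        return {"original": filename, "migrated": None}
    target = MIGRATION_TARGETS.get(extension)
    if target is None:
        raise ValueError(f"No migration rule for {extension!r}")
    # Archive both versions: the original and the migrated copy.
    migrated = str(Path(filename).with_suffix(target))
    return {"original": filename, "migrated": migrated}
```

For example, `plan_deposit("zipcodes.shp")` would plan to keep `zipcodes.shp` and add a migrated `zipcodes.gml`. Keeping the rules in plain dictionaries is one way to make the decision repeatable across a few hundred thousand data sets.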
The Zip Codes Example
Where Is the Data Set?
Here Is One!
Needles in the Haystack
• Computer Programs Written
– Utilize functionality of GIS
– Iterate through the data sets
– Create “bundles” for deposit
• Process Steps
1. Locate a data set
2. Determine the format
3. Make appropriate conversion
4. Create and isolate “bundle” with new and original format
5. Repeat
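The process steps above might look roughly like the following loop. This is a minimal sketch under stated assumptions, not the project's custom tooling: the extension mapping is hypothetical, `convert` is a stand-in that only copies bytes where a real tool would call GIS software, and each data set is treated as a single file even though many geospatial data sets are multi-part.

```python
import shutil
from pathlib import Path

# Hypothetical mapping from source format to archival format.
MIGRATION_TARGETS = {".shp": ".gml"}


def convert(source, target):
    """Placeholder conversion: a real tool would invoke GIS software here."""
    shutil.copyfile(source, target)


def build_bundles(data_root, bundle_root, converter=convert):
    """Walk data_root and create one bundle per recognized data set.

    Assumes bundle_root lies outside data_root and that file stems are unique.
    """
    data_root, bundle_root = Path(data_root), Path(bundle_root)
    bundles = []
    for source in sorted(data_root.rglob("*")):          # 1. locate a data set
        if not source.is_file():
            continue
        target_ext = MIGRATION_TARGETS.get(source.suffix.lower())  # 2. format
        if target_ext is None:
            continue  # not a format this sketch knows how to migrate
        bundle = bundle_root / source.stem
        (bundle / "original").mkdir(parents=True, exist_ok=True)
        (bundle / "migrated").mkdir(parents=True, exist_ok=True)
        # 4. the bundle holds the original format alongside the new one
        shutil.copy2(source, bundle / "original" / source.name)
        # 3. make the appropriate conversion
        converter(source, bundle / "migrated" / (source.stem + target_ext))
        bundles.append(bundle)                            # 5. repeat
    return bundles
```

Passing the converter in as a parameter keeps the loop independent of any particular GIS package, so the same iteration could drive different conversions.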
Custom Tools
Hub-and-spoke Metadata Transformation
Preserving Local Collections
Geologic and Historic Topographic Maps: Georeferencing and Preservation
Historic Topographic Map Preservation
• 165 Historic 15-minute series topographic maps for NC
• Date range: 1892-1959
• Documentation at http://www.lib.ncsu.edu/gis/historictopos.html
• Available on NCSU Libraries Geodata server
Geologic Map Preservation
• 290 Geologic Maps for NC
• Map sources are the US Geological Survey, the NC Geological Survey, and theses and dissertations
• Documentation at http://www.lib.ncsu.edu/gis/geolmaps.html
• Public download at http://wfs.enr.state.nc.us/NCGeologicMaps/
Geologic Map Preservation
1,200 – 24,000
1:500,000 – 1:2.5 M
1:31,680 – 1:430,000
NCGS Project Summary
• Project came to us - workplan and intern identified
• Preservation risk - data was stored on external drive
• Content is in high demand by patrons, hardcopy only, scarce to obtain
• Collection acquired at no cost to Libraries
• Data files publicly available for download
• Partnership with NC Dept. of Environment and Natural Resources; increasing interest in preservation
• Early raster dataset for NCGDAP – test for large data volumes, ingest process, metadata creation
• NCGS Open File Report forthcoming
NCGDAP: Engagement with the Data Community
• Engaging spatial data infrastructure
– Evaluating metadata and content standard adherence
– Cultivation of content exchange networks
– Sept. 2006 survey of current practice in local agencies
• External partnerships
– Partners on JISC-funded effort in the UK (Edinburgh)
• Engaging software vendors
– Meetings with ESRI development teams
• Engaging standards development processes
– Nov. 2005, partnered with University of Edinburgh on presenting the preservation problem to the Open Geospatial Consortium (OGC) Technical Committee
– Oct. 2006, partnered with NARA on initiating a formal working group on digital preservation within the OGC
NCGDAP on the Road
Presentations, posters, and workshops
Jan. 2005 - Sept. 2006
Highlights:
• O’Reilly Where 2.0
• OGC Meeting (Germany)
• Digital Curation Center (UK)
• IS&T Archiving (Canada)
• IASSIST (UK)
• ESRI International
• Joint NDIIPP & JISC Meeting
National/International: 37
State/Local: 21
NCGDAP: Future Directions
• Project shifting to data acquisition mode
• Current contract ends Oct. 2007
• Likely continuation of project funding through Oct. 2010
• Four responses to additional LC “Requests for Expression of Interest (RFEI)”
– Development of content exchange networks
– Development of tool for automated capture of web mapping services
– Participation in repository exchange tests
– Multi-state project involving State Archives
… RFEI status pending
Questions?
North Carolina Geospatial Data Archiving Project website
http://www.lib.ncsu.edu/ncgdap
Library of Congress NDIIPP website
http://www.digitalpreservation.gov/