Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in...
Transcript of Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in...
![Page 1: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/1.jpg)
Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License
Infrastructure, Curation, and Metadata
Mark A. Parsons
NSF Workshop on Cyberinfrastructure for Polar Sciences10 September 2013
![Page 2: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/2.jpg)
From Interregional Highways: Message from the President of the United States Transmitting a Report of the National Interregional Highway Committee, Outlining and Recommending a National System of Interregional Highways, 12 Jan. 1944.CC-BY Eric Fischer http://www.flickr.com/photos/walkingsf/8270270785/
![Page 3: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/3.jpg)
http://www.shockblast.net/aerial-photographs/urban-sprawl-by-christoph-gielen-arizona/
![Page 4: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/4.jpg)
Infrastructure is
Relationships, interactions, and connectionsbetween humans, technologies, and institutions
![Page 5: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/5.jpg)
Interchangecc-by-sa Steven Vance http://www.flickr.com/photos/jamesbondsv/8475376363/
![Page 6: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/6.jpg)
Ranch ExitCC-BY-SA Ken Lund http://www.flickr.com/photos/kenlund/2381991900/
![Page 7: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/7.jpg)
Understanding Infrastructure: Dynamics, Tensions, and Design
2007Paul Edwards, Steven Jackson, Geoffrey Bowker,
and Cory Knobelhttp://hdl.handle.net/2027.42/49353
![Page 8: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/8.jpg)
![Page 9: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/9.jpg)
![Page 10: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/10.jpg)
![Page 11: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/11.jpg)
![Page 12: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/12.jpg)
![Page 13: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/13.jpg)
![Page 14: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/14.jpg)
![Page 15: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/15.jpg)
![Page 16: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/16.jpg)
![Page 17: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/17.jpg)
![Page 18: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/18.jpg)
![Page 19: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/19.jpg)
datawrangler
![Page 20: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/20.jpg)
Data Management for CLPX—AGU 11 December 2003
Example of Snow Pit DataWe start with this:
![Page 21: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/21.jpg)
Data Management for CLPX—AGU 11 December 2003
Example of Snow Pit Data
… …
Summary Table
Density Table
Stratigraphy Table
Temperature Table
And create this:
![Page 22: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/22.jpg)
Curation is
continual and conscious improvement of data.
![Page 23: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/23.jpg)
Curation matters
• Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data sheets had issues corrected in the field that would not have been correctable later).
• Reveals and documents tacit knowledge. Can even uncover science questions.
• Integrated much more than the field data with common grids and formats
• “Standard” formats often do not exist and need to be created by the community. Sometimes you need multiple formats.
• A mediated relationship between user and collector.
• Need to include curators in data collection even if they are not in the field.
• Get to know your local curator.
![Page 24: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/24.jpg)
Metadata are
Everything necessary to make data independently understandable by a designated community.
![Page 25: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/25.jpg)
Infrastructure is comprised of relationships.
Curators create and enhance relationships.
Metadata is often the information shared (often tacitly) in those relationships.
![Page 26: Infrastructure, Curation, and Metadata · Curation matters • Embedding “data wranglers” in the field significantly improved data quality and completeness. (~20% of the data](https://reader036.fdocuments.us/reader036/viewer/2022071101/5fda205994f98069a3202656/html5/thumbnails/26.jpg)
A standard sensor format?
• Temporally varying sensors (e.g. a borehole)
• Spatially varying sensors (e.g. a transect)
• Temporally and spatially varying sensors (e.g. a drifting buoy)