The US Long Term Ecological Research (LTER) Network: Site and Network Level Information Management...


Page 1: The US Long Term Ecological Research (LTER) Network: Site and Network Level Information Management Kristin Vanderbilt Department of Biology University.

The US Long Term Ecological Research (LTER) Network: Site and Network Level Information Management

Kristin Vanderbilt
Department of Biology
University of New Mexico
Albuquerque, New Mexico USA

Page 2

LTER Network

Six LTER sites were established in 1980 with funding from NSF.

Now there are 26 sites with 1800 associated scientists and students.

Page 4

Research themes for all sites:
- Biodiversity
- Net primary productivity
- Disturbance patterns
- Biogeochemistry

Page 5

Goals of the LTER Network
- Understand general ecological phenomena that occur over long temporal and broad spatial scales
- Conduct major data syntheses
- Provide information for the identification and solution of societal problems
- Create a legacy of well-designed and documented long-term experiments and observations for use by future generations

Page 6

Data are the legacy of the LTER Network
- Integration of information management with research has been part of the LTER mandate since the beginning
- The goal of information management is to make high quality, well-documented data easy to find, access, and use now and in the future

Page 7

LTER Data Release Policy
http://www.lternet.edu/data/netpolicy.html

Data and information derived from publicly funded research in the U.S. LTER Network … are to be made available online with as few restrictions as possible, on a nondiscriminatory basis.

Page 8

LTER Data Release Policy

There are two data types:
– Type I: data are to be released to the general public within 2 years from collection and no later than the publication of the main findings from the dataset, and
– Type II: data are to be released to restricted audiences according to terms specified by the owners of the data. Type II data are considered to be exceptional and should be rare in occurrence. Some examples of Type II data restrictions may include: locations of rare or endangered species, data that are covered under prior licensing or copyright (e.g., SPOT satellite data), or covered by the Human Subjects Act.
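The Type I timing rule can be read as "whichever comes first: two years after collection, or publication of the main findings". The sketch below is only an illustration of that reading. The function name `type1_release_deadline` and the handling of a 29 February collection date are assumptions, not part of the policy text.

```python
from datetime import date
from typing import Optional


def type1_release_deadline(collected: date, published: Optional[date] = None) -> date:
    """Latest public release date for Type I data under the reading above.

    Hypothetical helper: the policy states the rule in words only.
    """
    try:
        two_years_on = collected.replace(year=collected.year + 2)
    except ValueError:
        # Collected on 29 February; two years later is not a leap year,
        # so fall back to 28 February (an assumption).
        two_years_on = collected.replace(year=collected.year + 2, day=28)
    if published is None:
        return two_years_on
    # Release no later than publication of the main findings.
    return min(two_years_on, published)


print(type1_release_deadline(date(2004, 6, 1)))
print(type1_release_deadline(date(2004, 6, 1), published=date(2005, 3, 15)))
```

Type II data are not covered here, since their release terms are set case by case by the data owners.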

Page 9

Information Management responsibilities differ at site and network levels:
– Quality data and metadata must be archived and made accessible at each LTER Site
– And tools must be created to integrate site databases across the LTER Network

Page 10

26 LTER sites = 26 different approaches to information management
- Structure and content of metadata vary
- Mechanisms for storing and manipulating data differ
  – flat ASCII files, RDBMS; Matlab, SAS, SQL
- Different search mechanisms for data

Page 11

Typical LTER Site Information Management System
- Designated information manager (IM)
- Standard procedures for working with data to ensure adequate documentation and quality assurance
- Hardware for data storage
- A web-page
- Back-up system

Page 12

Information Management Procedures at the Sevilleta LTER

Project Initiation: Scientist submits protocols and project description to IM

Page 16

Information Management Procedures at the Sevilleta LTER

1. Project Initiation: scientist submits protocols and project description to IM
2. Data Collection
3. Data Entry and QA/QC by scientist
4. Metadata submitted by scientist in Excel Template
5. Metadata and data reviewed by IM
6. ASCII file created containing data and metadata
7. Files archived on Sevilleta server

Page 17

Sevilleta data are stored in an ASCII text file with a defined structure:

\log     History of the data file (when created, updated, errors fixed, etc.)
\doc     Metadata
\header  Variable names
\data    Data records
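To make the four-section layout concrete, here is a minimal sketch of reading such a file. The slide gives only the section markers, so several details are assumptions: that each marker sits on its own line, that the header and data lines are comma-delimited, and the sample contents themselves, which are invented for illustration.

```python
SECTIONS = ("log", "doc", "header", "data")


def parse_sectioned_file(text):
    """Split file text into lists of lines keyed by section name."""
    sections = {name: [] for name in SECTIONS}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("\\") and stripped[1:] in SECTIONS:
            current = stripped[1:]  # a marker line such as \log starts a section
        elif current is not None and stripped:
            sections[current].append(stripped)
    return sections


def records(sections, delimiter=","):
    """Pair each data record with the variable names from the header section."""
    names = [n.strip() for n in sections["header"][0].split(delimiter)]
    return [
        dict(zip(names, (v.strip() for v in row.split(delimiter))))
        for row in sections["data"]
    ]


# Hypothetical file contents; not real Sevilleta data.
example = "\n".join([
    "\\log",
    "2004-11-02 file created",
    "\\doc",
    "Hypothetical plant biomass data set",
    "\\header",
    "date,plot,biomass_g",
    "\\data",
    "2004-10-01,A1,12.5",
    "2004-10-01,A2,9.8",
])

parsed = parse_sectioned_file(example)
rows = records(parsed)
print(rows[0])
```

Keeping the history, metadata, variable names, and records in one plain-text file means the data never get separated from their documentation, which is the point the slide makes.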

Page 18

Sevilleta Data Archive

Data sets are stored in directories clustered by theme:

[Diagram: directory tree rooted at "archive"; theme directories nutrient, animal, plant; data set directories phenology, npptransect, decomposition, soilN]

Page 19

Information Management Procedures at the Sevilleta LTER

1. Project Initiation: PI submits protocols and project description to IM
2. Data Collection
3. Data Entry and QA/QC by scientist
4. Metadata submitted by PI in Excel Template
5. Metadata and data reviewed by IM
6. ASCII file created containing data and metadata
7. Files archived on Sevilleta server
8. Data Made Web Accessible

Page 24

Ecological Information Management System at Sevilleta illustrates:
- An IM system can be built with readily available tools
  – Excel
  – ASCII text files
  – Web page design tool
- IT genius not required; these tools are easy to learn

Page 25

Long Term Ecological Research Network Information Management System

Page 26

Objective of the NIS

Enhance discovery and access to data across LTER sites in order to facilitate synthesis and collaborative ecological research

Page 27

Strategies for enhancing data discovery and access:
- Adopt Ecological Metadata Language (EML) as standard
- Establish an all-site data catalog based on EML
- Establish a single-portal web interface for querying catalog

Page 28

Adoption of a metadata standard is key to building NIS
- Ecological Metadata Language (EML) is an XML-based standard
  – XML is similar to HTML, except the tags are customized to reflect the information content

Page 29

Why are metadata standardized?
- Use a common set of understandable terms; information is described in a similar way
- Location of information (metadata descriptors) can be found in the same place, facilitating entry & retrieval
- "Tools" can be built; e.g., metadata entry and searches can be automated

Page 30

Example of EML

<eml>
  <dataset>
    <title>Sevilleta LTER Net Primary Productivity Data 2004</title>
    <creator>
      <individualName>
        <salutation>Dr.</salutation>
        <givenName>Esteban</givenName>
        <surName>Muldavin</surName>
      </individualName>
      <eMailAddress>[email protected]</eMailAddress>
    </creator>
    <abstract>Net primary production (NPP) is a fundamental ecological variable that measures rates of carbon consumption and fixation.</abstract>
    <keywordSet>
      <keyword>biomass</keyword>
      <keyword>ANPP</keyword>
      <keyword>LTER</keyword>
    </keywordSet>
    <contact>
      <positionName>Information Manager</positionName>
      <eMailAddress>[email protected]</eMailAddress>
    </contact>
  </dataset>
</eml>
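Because the descriptors sit in predictable places, a few lines of code can pull them out. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the slide's simplified example. It assumes the un-namespaced layout shown on the slide; production EML files typically declare an XML namespace on the root element, which this sketch does not handle. The function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the simplified example above (e-mail elements omitted).
EML_TEXT = """
<eml>
  <dataset>
    <title>Sevilleta LTER Net Primary Productivity Data 2004</title>
    <creator>
      <individualName>
        <salutation>Dr.</salutation>
        <givenName>Esteban</givenName>
        <surName>Muldavin</surName>
      </individualName>
    </creator>
    <keywordSet>
      <keyword>biomass</keyword>
      <keyword>ANPP</keyword>
      <keyword>LTER</keyword>
    </keywordSet>
  </dataset>
</eml>
"""


def summarize(eml_text):
    """Return the title, creator name, and keywords of one EML-like document."""
    dataset = ET.fromstring(eml_text).find("dataset")
    name = dataset.find("creator/individualName")
    return {
        "title": dataset.findtext("title", default="").strip(),
        # Join salutation, given name, and surname in document order.
        "creator": " ".join(part.text.strip() for part in name if part.text),
        "keywords": [k.text.strip() for k in dataset.iter("keyword")],
    }


summary = summarize(EML_TEXT)
print(summary["title"])
print(summary["creator"])
print(summary["keywords"])
```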

Page 31

Morpho: A tool for creating EML
http://knb.ecoinformatics.org/index.jsp

Page 32

[Diagram: LTER Metacat, with EML from the SEV, CAP, JRN, and CWT sites]
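The idea of an all-site catalog built from site EML can be sketched with a small in-memory keyword index. This is a stand-in for illustration only and does not show how Metacat works or what its API looks like. The `Catalog` class, the `eml_doc` helper, and all data set titles are hypothetical, and the documents use the simplified, un-namespaced layout from the EML example slide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def eml_doc(title, keywords):
    """Build a tiny EML-like document (hypothetical helper for this demo)."""
    kws = "".join(f"<keyword>{k}</keyword>" for k in keywords)
    return (
        f"<eml><dataset><title>{title}</title>"
        f"<keywordSet>{kws}</keywordSet></dataset></eml>"
    )


class Catalog:
    """In-memory keyword index over EML-like documents from several sites."""

    def __init__(self):
        self._by_keyword = defaultdict(list)

    def add(self, site, eml_text):
        dataset = ET.fromstring(eml_text).find("dataset")
        title = dataset.findtext("title").strip()
        for kw in dataset.iter("keyword"):
            # Index case-insensitively so "ANPP" and "anpp" match.
            self._by_keyword[kw.text.strip().lower()].append((site, title))

    def search(self, keyword):
        """Return (site, title) pairs for data sets tagged with the keyword."""
        return sorted(self._by_keyword.get(keyword.strip().lower(), []))


catalog = Catalog()
catalog.add("SEV", eml_doc("Hypothetical SEV productivity data", ["ANPP", "biomass"]))
catalog.add("JRN", eml_doc("Hypothetical JRN productivity data", ["ANPP"]))
catalog.add("CAP", eml_doc("Hypothetical CAP bird survey", ["birds"]))

print(catalog.search("anpp"))
```

A single query returns matching data sets from every site, which is only possible because all sites describe their data with the same metadata elements.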

Page 35

Excerpt from: Report on the CODATA Workshop on Archiving Scientific and Technical (S&T) Data, 20-21 May 2002, Pretoria, South Africa

Managing S&T Data: Preservation in the Broader Context: Needs for future access of data
– Shared standards and best practices for repository management, federated archives, and metadata
– An integrated framework … that provides a shared way of communicating and standard technology


Page 37

Excerpt from: Report on the CODATA Workshop on Archiving Scientific and Technical (S&T) Data, 20-21 May 2002, Pretoria, South Africa

Managing S&T Data: Preservation in the Broader Context: Needs for future access of data
– Shared standards and best practices for repository management, federated archives, and metadata
  LTER approach: LTER Network data release policy, NIS, EML
– An integrated framework … that provides a shared way of communicating and standard technology
  LTER approach: LTER Metacat metadata database and centralized data catalog