CSIRO National Research Collections
Transcript of CSIRO National Research Collections
Andrew Young – Director, NRCA
Margaret Cawsey – Curator, National Wildlife Collection
John Morrissey – CSIRO IMT
National Research Collections Australia (NRCA) Specimen Identifiers – possible futures!
Australia: a mega-diverse continent
Australia has:
• A lot of biodiversity – 8% of the Earth’s species
• Unique biodiversity – more than 70% endemic
• Valuable biodiversity – soybean, cotton, sorghum, macadamia,
acacias, eucalypts
The challenge and opportunity is to:
• Manage biodiversity for conservation and ecosystem services
– species decline, Convention on Biological Diversity
• Exploit biological assets for industry
– food, fibre, medicines, novel compounds
NRCA Mission
• National Research Collections Australia (NRCA) is a world-class “science-ready” collections research facility
• It discovers, documents, describes and explores Australia’s biodiversity
• NRCA delivers digital data and science to inform the conservation and use of Australia’s unique biological assets
What is NRCA?
Wildlife Collection
Insect Collection
Herbarium (CANBR + ATH JVs)
Algae Culture Collection
Tree Seed Centre
Staff: 8
Fish Collection
Atlas of Living Australia (ALA)
• Six national biological collections
• 15+ million specimens
• 200 year time-series (1805)
• Web-based digital delivery portal - Atlas of Living Australia (ALA)
Data from Drawers 2015 | Andrew Young
Crop germplasm Collection
Soil Collection
What is in NRCA?
• Physical specimens
– whole organisms, skins, tissue samples, DNA samples
• Living collections
– cultures, seed banks, seed orchards
• Digital specimens
– sounds, photographs, X-ray images, DNA sequences
• Contextual data
– location, site descriptions, species associations
• Unique $1.4+ billion research asset
Currently each collection has its own database:
• 5/6 are bespoke
• Only one is run by IMT
• Inefficient, ineffective and vulnerable…
Data management challenge – a single system:
• 15+ million specimens x 30-40 fields ≈ 500 000 000 records
• Links to field books, living collections, nomenclature, associated samples (e.g. seeds, tissues, DNA samples, sounds)
• Loans (30 000 - 40 000 pa) and curation
• Room for future expansion (30 000+ pa)
• New data layers e.g. genomes and phenomes
• Biologically intuitive interface
• Seamless data delivery to the ALA
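The scale estimate on this slide can be checked with a quick calculation. The sketch below uses only the figures quoted on the slide (15 million specimens, 30 to 40 fields each); the function and variable names are illustrative and not part of any NRCA system.

```python
# Back-of-envelope check of the data volume quoted on the slide.
SPECIMENS = 15_000_000             # "15+ million specimens" (lower bound)
FIELDS_LOW, FIELDS_HIGH = 30, 40   # "30-40 fields" per specimen


def total_values(specimens: int, fields_per_specimen: int) -> int:
    """Number of individual field values held for all specimens."""
    return specimens * fields_per_specimen


low = total_values(SPECIMENS, FIELDS_LOW)
high = total_values(SPECIMENS, FIELDS_HIGH)
print(f"{low:,} to {high:,} field values")
```

The range brackets the round figure of 500 000 000 given on the slide, which is best read as an order-of-magnitude estimate.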
We need Integrated data management!
Collective Access
- Open source
- Thin client
- Fits IMT architecture
- Good functionality
Kinds of digital data
• High resolution scans/photographs
• X-rays
• Fluorescent images
• 3D images
• Sounds
• Micro CT scans
You want me to re-label 15+ million specimens?
• Collection Management Systems
‒ Loans and tissue grants
‒ Provenance data about each specimen
‒ Taxonomy data (which can change)
‒ Geospatial data
• Data standards
– Compliance with Biodiversity Information Standards (TDWG) is the main driver
• Data sharing and discoverability
– Facilitated via metadata feeds to various domain-specific aggregators like ALA and GBIF
Let's look at GUIDs from a CSIRO Natural History Collections point of view…
Discoverability – a brief history
Introducing TDWG
• Established 1985
• Initial data standards
– Faunal communities: Darwin Core
– Herbaria: HISPID 6
• GUIDs – 2006
– Relatively simple requirements
– The LSID: Life Science Identifier
– URN technology
• 2010 - URI
– Semantic web and linked data
• 2001 - GBIF
• 2007 - ALA (Darwin Core)
– LSIDs adopted by the faunal collections
– each mint their own
• But ...
– Often don’t resolve
– Not used by many major collections
– ALA & GBIF each mint their own record identifiers
urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401
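To make the structure of this identifier concrete, here is a minimal sketch that splits an LSID-style URN into its authority and the remaining colon-separated segments. The `ParsedLsid` type and `parse_lsid` function are hypothetical names for illustration. The parser deliberately does not assign meanings such as "collection" or "catalogue number" to the trailing segments, because the slide does not define them.

```python
from typing import NamedTuple, Tuple


class ParsedLsid(NamedTuple):
    authority: str             # e.g. the domain of the issuing authority
    segments: Tuple[str, ...]  # everything after the authority, in order


def parse_lsid(lsid: str) -> ParsedLsid:
    """Split an LSID-style URN on ':' after checking the 'urn:lsid:' prefix."""
    parts = lsid.split(":")
    # An LSID needs at least urn, lsid, authority, namespace and object.
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID-style URN: {lsid!r}")
    return ParsedLsid(authority=parts[2], segments=tuple(parts[3:]))


example = parse_lsid("urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401")
print(example.authority)
print(example.segments)
```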
TDWG GUID applicability statement 2010
• Lists 14 recommendations
R1. GUID technologies should be chosen from the list of recommended GUID types.
• *HTTP URI (used as a basis for some of the following options)
• LSID - *Life Science Identifier
• DOI - Digital Object Identifier
• PURL - Permanent URL
• UUID - Universally Unique Identifier
• Handle System
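Of the GUID types listed, the UUID is the simplest to mint locally because it needs no central registry. The sketch below uses Python's standard `uuid` module; the function name `mint_uuid_urn` is illustrative, and nothing here implies that NRCA mints identifiers this way.

```python
import uuid


def mint_uuid_urn() -> str:
    """Return a new random (version 4) UUID in URN form, e.g. 'urn:uuid:...'."""
    return uuid.uuid4().urn


first = mint_uuid_urn()
second = mint_uuid_urn()
print(first)
print(second)
```

A UUID on its own is unique but not resolvable, which is why the list also includes HTTP URI based options.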
http://bioguid.info/urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401
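The URL above shows a common pattern: an LSID is made clickable by prefixing it with the address of an HTTP resolver. A minimal sketch of that construction follows. The resolver base is taken from the slide, the function name `lsid_to_http_uri` is hypothetical, and no assumption is made that the resolver is still operating.

```python
RESOLVER_BASE = "http://bioguid.info/"  # resolver address as shown on the slide


def lsid_to_http_uri(lsid: str, resolver_base: str = RESOLVER_BASE) -> str:
    """Prefix an LSID with an HTTP resolver address to form a URI."""
    if not lsid.lower().startswith("urn:lsid:"):
        raise ValueError(f"not an LSID: {lsid!r}")
    # Normalise the base so exactly one '/' separates it from the LSID.
    return resolver_base.rstrip("/") + "/" + lsid


uri = lsid_to_http_uri("urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401")
print(uri)
```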
TDWG GUID applicability statement 2010
GUIDs can be applied to a variety of objects
– Scientific names
– Taxonomic concepts
– Datasets & collections
– Specimens
– Genetic samples
– Images, videos, sound recordings
– Observations
Current situation for the natural history collections...
• GBIF
– Moved to using DOIs
– Recommend the adoption of an identifier scheme that would work well with DOIs
• TDWG
– TDWG are removing recommendations to use LSID, as decided at the TDWG Executive meeting Sept 2016
• Conclusions:
– No consensus reached ...
– It is unlikely that any particular GUID technology will be successfully implemented until TDWG achieves consensus
So what about the rest of CSIRO’s collections?
• Most are currently using bespoke in-house specimen IDs within their collection management systems
• Poor understanding of the value proposition of adopting IGSN or any other resolvable GUID technology
• Major need to have GUIDs visible beyond the collection management system so that data can be easily linked to specimens from other systems like:
– CSIRO Data Access Portal
– CSIRO Publications repository
– Digitisation and characterisation services like Australian Synchrotron
– ALA, TERN and GBIF…
John Morrissey
CSIRO IMT
NATIONAL FACILITIES AND COLLECTIONS, NATIONAL RESEARCH COLLECTIONS AUSTRALIA
Thank you