EPrints Workshop, January 2005
eBank UK: Dissemination of research data using EPrints
Simon Coles, School of Chemistry, University of Southampton
Overview
• Scholarly communications in Chemistry
– Data, information, workflows and provenance
• The data publication bottleneck
– e-Science and chemistry
• eBank UK
– Information architecture, data flow and interoperability
• Challenges for the future
– Expansion into other disciplines and data formats
Research & e-Science workflows
Aggregator services: national, commercial
Repositories : institutional, e-prints, subject, data, learning objects
Data curation: databases & databanks
Validation
Harvesting metadata
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Deposit / self-archiving
Peer-reviewed publications: journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Searching, harvesting, embedding
Presentation services: subject, media-specific, data, commercial portals
Resource discovery, linking, embedding
Linking
The scholarly knowledge cycle.
Liz Lyon, eBankUK article. Ariadne, July 2003.
Learning & Teaching workflows
Research & e-Science workflows
Aggregator services: eBankUK
Repositories : institutional, e-prints, subject, data, learning objects
Data curation: databases & databanks
Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
Validation
Harvesting metadata
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Resource discovery, linking, embedding
Deposit / self-archiving
Peer-reviewed publications: journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Resource discovery, linking, embedding
Deposit / self-archiving
Learning object creation, re-use
Searching, harvesting, embedding
Quality assurance bodies
Validation
Presentation services: subject, media-specific, data, commercial portals
Resource discovery, linking, embedding
Linking
Current chemistry publishing protocols
Ideas and interpretations
Hooks into the literature
Results & derived data
Raw data!
Data Overload!
How do we disseminate?
EPSRC National Crystallography Service
The data deluge
CombeChem: eScience testbed
Properties
X-Ray e-Lab
Analysis
Properties e-Lab
Simulation
Video
Diffractometer
Grid Middleware
Structures Database
Establishing common ground…
• Understand the data creation process
• Terminology and definitions
– Data
– Metadata
– Datafile
– Dataset
– Data holding
• Different views
– Digital library researchers, computer scientists, chemists
– Generic vs specific
– Modeller vs practitioner
• Aim for a common ontology
• Modelling the domain
• Creating a metadata schema
Crystallography workflow
• Initialisation: mount new sample on diffractometer & set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File format)
• Report: generate Crystal Structure Report
RAW DATA DERIVED DATA RESULTS DATA
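The workflow above can be sketched as an ordered list of stages, each tagged with the kind of data it yields. This is only an illustrative sketch: the stage names and descriptions come from the slide, but the assignment of each stage to raw, derived or results data is an assumption (the slide shows the three labels without an explicit mapping), and the names `DataKind`, `WORKFLOW` and `stages_producing` are hypothetical.

```python
from enum import Enum


class DataKind(Enum):
    RAW = "raw data"
    DERIVED = "derived data"
    RESULTS = "results data"


# Stage names and descriptions follow the slide.
# ASSUMPTION: the DataKind attached to each stage is a guess for illustration.
WORKFLOW = [
    ("Initialisation", "mount new sample on diffractometer and set up data collection", DataKind.RAW),
    ("Collection", "collect data", DataKind.RAW),
    ("Processing", "process and correct images", DataKind.DERIVED),
    ("Solution", "solve structures", DataKind.DERIVED),
    ("Refinement", "refine structure", DataKind.DERIVED),
    ("CIF", "produce CIF (Crystallographic Information File format)", DataKind.RESULTS),
    ("Report", "generate Crystal Structure Report", DataKind.RESULTS),
]


def stages_producing(kind):
    """Return the names of the stages tagged with the given kind of data, in workflow order."""
    return [name for name, _description, stage_kind in WORKFLOW if stage_kind is kind]


for name, description, kind in WORKFLOW:
    print(f"{name}: {description} [{kind.value}]")
```

Keeping the stages in one ordered structure means a deposit tool could walk the list and attach each stage's files to the archive entry under the matching category.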
Deposition into the archive
An Archive entry
ecrystals.chem.soton.ac.uk
Access to the underlying data
Some metadata issues
• Using simple and qualified Dublin Core
• Additional chemical information in schema for harvesting, e.g. empirical formula
• Schema contains International Chemical Identifier (InChI)
• Links to all datasets associated with an experiment
• Links to individual datasets within an experiment
• Links to EPrints (and other published literature) derived from the data
• Using vocabularies specific to crystallography
• Engaging the broader scientific community to ensure different schemas are compliant and standards can emerge
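As a rough illustration of the points above, the following sketch builds a metadata record that mixes Dublin Core elements with chemistry-specific fields. It is not the actual eBank schema: the `EBANK` namespace URI, the element names `record`, `empiricalFormula` and `inchi`, the example URLs, and the choice of `dcterms:hasPart` for dataset links are all assumptions made for the example. Only the Dublin Core namespaces and terms themselves are standard. The benzene values are sample data, not taken from the archive.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
EBANK = "http://example.org/ebank_dc/"  # ASSUMPTION: placeholder, not the real schema namespace

for prefix, uri in (("dc", DC), ("dcterms", DCTERMS), ("ebank", EBANK)):
    ET.register_namespace(prefix, uri)


def build_record(holding_url, formula, inchi, dataset_urls, eprint_urls):
    """Build one illustrative record describing a crystal structure data holding."""
    record = ET.Element(f"{{{EBANK}}}record")  # hypothetical root element
    ET.SubElement(record, f"{{{DC}}}identifier").text = holding_url
    ET.SubElement(record, f"{{{DC}}}type").text = "CrystalStructure"
    # Chemistry-specific additions (hypothetical element names).
    ET.SubElement(record, f"{{{EBANK}}}empiricalFormula").text = formula
    ET.SubElement(record, f"{{{EBANK}}}inchi").text = inchi
    # Links to individual datasets within the experiment (term choice is an assumption).
    for url in dataset_urls:
        ET.SubElement(record, f"{{{DCTERMS}}}hasPart").text = url
    # Links to eprints derived from the data.
    for url in eprint_urls:
        ET.SubElement(record, f"{{{DCTERMS}}}isReferencedBy").text = url
    return record


record = build_record(
    holding_url="http://repository.example.org/42/",
    formula="C6H6",
    inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    dataset_urls=[
        "http://repository.example.org/42/raw.zip",
        "http://repository.example.org/42/structure.cif",
    ],
    eprint_urls=["http://eprints.example.org/7/"],
)
print(ET.tostring(record, encoding="unicode"))
```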
ebank_dc record (XML)
Crystal structure (data holding)
Crystal structure report (HTML)
Dataset
Dataset
Institutional repository
eBank UK aggregator service
ePrint UK aggregator service
Subject service
Deposit
Harvesting OAI-PMH ebank_dc
Harvesting OAI-PMH oai_dc
Harvesting OAI-PMH oai_dc
Dataset
dc:identifier
dcterms:references
Linking
dc:type=“CrystalStructure” and/or “Collection”
Model input Andy Powell, UKOLN.
Eprint oai_dc record (XML)
dcterms:isReferencedBy
dc:type=“Eprint” and/or “Text”
Data flow in eBank
Eprint “jump-off” page (HTML)
dc:identifier
Eprint manifestation (e.g. PDF)
Linking
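The harvesting step in this data flow can be sketched in a few lines. The `ListRecords` verb and `metadataPrefix` argument are part of OAI-PMH, and `ebank_dc` and `oai_dc` are the prefixes named on the slide. The base URL, the sample response and the helper names are assumptions for illustration only, and the sketch parses a canned response rather than contacting a repository.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI = "http://www.openarchives.org/OAI/2.0/"
DC = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix):
    """Build an OAI-PMH ListRecords request URL for the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def harvest_summary(xml_text):
    """Return (OAI identifier, [dc:type values]) for each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    summary = []
    for rec in root.iter(f"{{{OAI}}}record"):
        oai_id = rec.findtext(f"{{{OAI}}}header/{{{OAI}}}identifier")
        types = [t.text for t in rec.iter(f"{{{DC}}}type")]
        summary.append((oai_id, types))
    return summary


# ASSUMPTION: a made-up oai_dc response standing in for a real repository reply.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:42</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:type>CrystalStructure</dc:type>
          <dc:identifier>http://repository.example.org/42/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical base URL; a real harvester would fetch this and page via resumption tokens.
print(list_records_url("http://repository.example.org/oai2", "ebank_dc"))
print(harvest_summary(SAMPLE))
```

An aggregator could use the `dc:type` values to tell data records ("CrystalStructure", "Collection") from eprint records ("Eprint", "Text") before linking them.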
Harvesting: OAIster
Linking and aggregating
Embedded in a science portal
Current situation
• Version 2.0 eBank metadata schema
• Pilot institutional e-data repository for harvesting (raw, derived, results data) using EPrints.org software
• Exports records as ebank_dc and oai_dc
• Validation of schema & discussion with International Union of Crystallography for final developments and wider deployment
• Pilot eBank UK aggregator service
• Developing search interface Version 1.0
• Testing with PSIgate physical sciences portal – embedding eBank UK
What’s next?
• Progress towards generic metadata schemas
• Validation against other schema (CCLRC Model)
• Eprints.org software: allow for more generic scientific data and schemas?
• Metadata enhancement: keywords based on knowledge of keywords in related publications?
• Investigate identifiers: International Chemical Identifier
• Explore context sensitive linking
• Full embedding into chemical and crystallographic research and publishing
• e-Learning embedding and pedagogic evaluation
• Feasibility study in related domains
Breakout Session?
• Describing non ‘Dublin Core’ terms
– Qualified Dublin Core
– Complex object formats: METS vs MPEG-21 DIDL
– Set & Friends containers
• Compliance between schemas
– One generic schema
– Develop multiple schemas
• Rights
– Use / reuse
– Publisher
• Linking & aggregating
– DOI
– Keyword ontologies
– Identifiers
– Context sensitive linking