CERIF for Datasets: Background and Key Findings
Workshop, London, 26th July 2013
CERIF slides reproduced from presentations by euroCRIS members: Keith Jeffery, Brigitte Joerg, Anna Clements
C4D workshop, Glasgow & London. July 2013
JISC MRD Programme
Consortium: Sunderland, Glasgow, St Andrews, NERC, EPSRC, DCC and euroCRIS
“CERIFication” of the metadata about research datasets
Focus on MEDIN* standard: NERC requirement for http://www.bodc.ac.uk/
* http://www.oceannet.org/
C4D Summary
Datasets & metadata
Datasets have sparked interest in metadata standards that support their:
• Discoverability
• Description
• Usability
• Re-use
For example …
CKAN: www.ckan.org
– Software platform; default schema is DC
eGMS
– UK e-Government metadata standard; based on DC
– ‘flat’ model; single entity (a resource or dataset); keep adding attributes
DCAT
– RDF schema vocabulary for PSI (public sector info)
– Some normalisation; can’t capture different roles/semantics in relationships
Houssos, N., Joerg, B., Matthews, B. A multi-level metadata approach for a Public Sector Information data infrastructure. CRIS2012, Prague, 06-09 June 2012. http://www.engage-project.eu/engage/wp/
Common European Research Information Format
A conceptual model for describing the complete research domain
A standard for the development, implementation and interoperability of current research information systems (CRIS) and their various applications
Est. 1991; maintained by www.euroCRIS.org
… so what about CERIF?
Not for profit organisation of experts
– Research organisations; funders; publishers; systems providers; standards organisations
109 institutional, 38 personal & 20 affiliate members (euroCRIS annual report 2012)
41 countries; not just Europe
Main activity is the development, maintenance and implementation of CERIF
… and euroCRIS?
euroCRIS : Strategic Partners
In the UK : The CERIF landscape
1/3 of UK HEIs have a CERIF-compliant CRIS*
Driven by desire to better support research management at the institutional level
… and streamline reporting to funders
UK CERIF adoption
* Source: UKOLN (R. Russell), Adoption of CERIF in Higher Education Institutions in the UK: A Landscape Study, March 2012
http://www.ukoln.ac.uk/isc/reports/cerif-landscape-study-2012/CERIF-UK-landscape-report-v1.1.pdf
[Figure: CERIF timeline, with marks at 1991, 2000, 2002, 2006, 2012 and 2013]
1991: CERIF 91 (PROJECT)
- Networking of DBs
- Exchange of Records
- EC Recommendation to Member States
2000: CERIF 2000 Model (PROJECT, PERSON, OrgUnit, RESULTS, EQUIPMENT, EXPERTISE, CLASSIFICATION, Roles)
- Data Model
- Multilinguality
- Controlled Vocabulary
- Roles / Types
- User-driven
- EC Recommendation to Member States
2006: CERIF 2006 / 2008 Model
Entities shown: Project, Organisation, Person, Publication, Patent, Product, Equipment, Service, Funding Programme, Skills, CV, Event, Classification (Semantics)
Labels shown: Base, 2nd Level, Language, Semantics, Link
- Data Model
- Model Normalization
- Robust/Consistent Structure
- Extensible Structure
- Semantic Layer
- XML Exchange Specification
- Elaboration on Publication
- CERIF Core Semantics (2008 1.2)
2012: CERIF 1.3, CERIF 1.4 (XML), CERIF 1.5, + Linked Data
- Data Model
- Infrastructure: Facility, Equipment, Service
- Measurement & Indicator
- Entities and Link Tables
- Geographic Bounding Box
- CERIF 1.3 Vocabulary: UUIDs, Terms, Schemes
- CERIF 1.4 new XML format
- CERIF 1.5 Federated Identifiers
2013: CERIF 1.6
- Data Model
- C4D datasets
Example project record shown: Acronym: ERGO; Participants: Keith Jeffery, Anne Asserson, Rutherford Appleton Lab, Univ Bergen, many more
CERIF Entity Types
• Base Entities
• Result Entities
• Infrastructure Entities
• 2nd Level Entities
• Link Entities
CERIF Features
• Multiple Language
• Semantics
• Measures & Indicators
• Geographic Bounding Box
[Figure: CERIF base entities: Person, OrganisationUnit, Project]
[Figure: CERIF base entities with their attributes]
Person: ID, URI, Gender, FirstNames, OtherNames, FamilyNames, NameVariants, ResearchInterest, Keywords
Project: ID, URI, Acronym, StartDate, EndDate, Title, Abstract, Keywords
OrganisationUnit: ID, URI, Acronym, Name, HeadCount, CurrencyCode, Turnover, ResearchActivity, Keywords
[Figure: CERIF base entities with cf-prefixed attributes]
cfPerson: cfID, cfURI, cfGender, cfBirthdate
cfProject: cfID, cfURI, cfAcronym, cfStartDate, cfEndDate
cfOrganisationUnit: cfID, cfURI, cfAcronym, cfHeadCount, cfCurrencyCode, cfTurnover
Attributes drawn outside the entity boxes: cfTitle, cfAbstract, cfKeywords; cfName, cfKeywords; cfDescription, cfKeywords; cfFamilyNames, cfFirstNames, cfOtherNames, cfNameVariants
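As a rough illustration of the three base entities on these slides, the Python sketch below models them as plain records. The class and field names are assumptions chosen for readability (they follow the slide labels, not the CERIF physical schema), only a subset of the attributes is shown, and the sample values are made up.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Person:
    id: str
    family_names: str = ""
    first_names: str = ""
    keywords: List[str] = field(default_factory=list)


@dataclass
class OrganisationUnit:
    id: str
    acronym: str = ""
    name: str = ""
    head_count: Optional[int] = None


@dataclass
class Project:
    id: str
    acronym: str = ""
    title: str = ""
    start_date: Optional[date] = None
    end_date: Optional[date] = None


# Illustrative values only; all identifiers here are hypothetical.
person = Person(id="pers-1", family_names="Example", first_names="Ann")
unit = OrganisationUnit(id="org-1", name="Example University")
project = Project(id="proj-1", acronym="C4D", title="CERIF for Datasets")
```

Note that the records carry no references to each other: in the model described on these slides, relationships between entities are held separately rather than as attributes of either side.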
[Figure: CERIF result entities: ResultProduct, ResultPublication, ResultPatent]
[Figure: CERIF result entities with their attributes]
ResultProduct: ID, URI
ResultPublication: ID, URI, Title, Subtitle, Abstract, Bibl. Note, PublicationDate, TotalPages, StartPage, EndPage, Keywords
ResultPatent: ID, URI, PatentNumber, Title, CountryCode, RegistrationDate, ApprovalDate, Description, Keywords
[Figure: CERIF result entities with cf-prefixed attributes]
cfResultPublication: cfID, cfURI, cfNumber, PublicationDate, cfStartPage, cfEndPage, cfTotalPages, cfEdition, cfSeries, cfIssue, cfVolume, cfISBN, cfISSN
cfResultPatent: cfID, cfURI, cfPatentNumber, cfCountryCode, cfRegistrationDate, cfApprovalDate
cfResultProduct: cfID, cfURI
Attributes drawn outside the entity boxes: cfTitle, cfSubtitle, cfAbstract, cfKeywords, cfBibliographic Note, cfAbbreviation, cfVersionInfo, cfDescription, cfName
CERIF has many advantages as the canonical model (the research information entities, attributes, associations and semantics) for contextual metadata for datasets:
– Covers all aspects of research information: researchers, projects, organisations, funding, outputs, equipment, services, and so on;
– An optimal (relational) architecture allowing the expression of any kind of relation between entities/attributes with every relation “time-stamped” and semantically defined;
– Very fine-grained structure, allowing output of the metadata to virtually any format;
– A separated “semantic layer” allowing the use of multiple (any) controlled vocabularies (classifications, typologies) as well as their cross-linking and mapping;
– Ability to cope with multiple languages
Advantages of CERIF
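To make the idea of a “time-stamped” and semantically defined relation more concrete, here is a minimal Python sketch. It is not the CERIF schema: the `Link` class, its field names, the role terms and the vocabulary name `example-roles` are all assumptions made for illustration. It shows one person holding different roles on the same dataset over time, each role drawn from a named vocabulary.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Link:
    """A relation between two entities, typed by a vocabulary term and dated."""
    source_id: str
    target_id: str
    role: str            # term from a controlled vocabulary (hypothetical)
    scheme: str          # name of the vocabulary the term belongs to
    start: date
    end: Optional[date] = None  # None means the relation is still open

    def active_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


def roles_on(links: List[Link], source_id: str, target_id: str, day: date) -> List[str]:
    """Return the roles source_id holds on target_id on the given day."""
    return sorted(
        link.role
        for link in links
        if link.source_id == source_id
        and link.target_id == target_id
        and link.active_on(day)
    )


# Made-up example data: the same person is first creator, then curator.
links = [
    Link("pers-1", "dataset-1", "creator", "example-roles",
         date(2012, 1, 1), date(2012, 12, 31)),
    Link("pers-1", "dataset-1", "curator", "example-roles", date(2013, 1, 1)),
]

print(roles_on(links, "pers-1", "dataset-1", date(2012, 6, 1)))   # ['creator']
print(roles_on(links, "pers-1", "dataset-1", date(2013, 7, 26)))  # ['curator']
```

Keeping the role and its vocabulary on the link, rather than on the person or the dataset, is what lets the same pair of entities be related in several ways at different times.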
Mapping to CERIF
24 of 30 MEDIN elements mapped to CERIF
DataCite version 3.0
Mandatory
• Identifier • Creator • Title • Publisher • Publication Year
Recommended
• Subject • Contributor • Dates relevant to work • Resource Type
Optional
• Scheme URI • Title Type • Subject Scheme
• Related Identifier • Relation Type • Description • GeoLocation
• Language of Resource • Alternate Identifier • Related Metadata • Size
• Data Format • Version • Rights • Geolocation Place
More work required?
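A minimal sketch of how a dataset record might be checked against the five mandatory properties listed on this slide. The dictionary keys are hypothetical field names chosen for this example, not DataCite XML element names, and the sample record is made up.

```python
# Hypothetical keys standing in for the five mandatory properties on the slide.
MANDATORY = ("identifier", "creator", "title", "publisher", "publication_year")


def missing_mandatory(record):
    """Return the mandatory properties that are absent or empty in record."""
    return [name for name in MANDATORY if not record.get(name)]


# Made-up record: publisher is empty and publication_year is not supplied.
record = {
    "identifier": "10.1234/example",
    "creator": "Example, Ann",
    "title": "Example dataset",
    "publisher": "",
}

print(missing_mandatory(record))  # ['publisher', 'publication_year']
```

A check of this kind only confirms that a value is present; whether a given CERIF attribute is the right source for each property is the mapping question the slide raises.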
http://www.cerifsupport.org/2013/07/24/cerif-1-6-formal-models-released-for-testing/
CERIF 1.6 released for testing 25th July 2013
Mapping to other schemata
C4D vs RE3Data vs DCI vs DataCite
CERIF metadata model
– can be used to record rich metadata about datasets
– can relate to other pieces of the research landscape
– can evolve / extend within formal euroCRIS governance structure
BUT …
– Needs testing in production environments
– Is cfResProd appropriate? Not just a research result?
– Ongoing need for agreed vocabularies
  – CASRAI
  – RCUK harmonisation
Key Findings
Have used C4D as basis for checking whether DataFinder is rich and detailed enough
Once the C4D profile has been finalised, DaMaRo will embark on implementation of C4D-compliant outputs
Most fields map to C4D
Case Study: DaMaRo
Further consultation with euroCRIS/CERIF TG in terms of best approach
Aiming to achieve most comprehensive set of metadata (incorporating RE3Data, DataCite, etc.)
Move new Pure model to production (after REF)
Exporting and importing CERIF-XML from systems; exploring this with http://ckan.org
Aggregation of data into national data register model
Next Steps