GBIF BIFA mentoring, Day 5b Data paper, July 2016
Uploaded by: dag-endresen
Category: Data & Analytics
Transcript of GBIF BIFA mentoring, Day 5b Data paper, July 2016
A data paper is a searchable metadata document, describing a particular dataset or a group of datasets, published in the form of a peer-reviewed article in a scholarly journal.
Slides modified from Dimitri Brosens, data paper workshop in Trondheim October 2015
The primary purpose of a data paper is to describe data and the circumstances of their collection, rather than to report hypotheses and conclusions.
A data paper is a means of bringing credit and recognition to all those involved in data publication, and of alerting the scientific community to the existence of biodiversity datasets and the value they can bring to particular research projects. It also serves as a mechanism for quality assessment and control of data accessible through GBIF and other networks.
A dataset is understood here as a digital collection of logically connected facts (observations, descriptions or measurements). It is typically structured in tabular form as a set of records, with each record comprising a set of fields, and recorded in one or more computer data files that together comprise a data package.
Penev L, Mietchen D, Chavan V, Hagedorn G, Remsen D, Smith V, Shotton D (2011). Pensoft Data Publishing Policies and Guidelines for Biodiversity Data. Pensoft Publishers, http://www.pensoft.net/J_FILES/Pensoft_Data_Publishing_Policies_and_Guidelines.pdf
There is a distinction between 'static data', sometimes called 'dead datasets', and 'curated' data or 'living' datasets.
A DATASET CAN BE:
• a discrete collection of data underlying a paper
• a collection of data related to monitoring activities
• a discrete collection of data related to the distribution of species
• a discrete collection of data underlying a specific piece of research
• a discrete collection of data related to a collection of specimens
A data package is the 'file' containing the actual data and metadata and a descriptor file.
GBIF prefers Darwin Core Archives (DwC-A) as a format for publishing data. Darwin Core Archive (DwC-A) is a biodiversity informatics data standard that makes use of the Darwin Core terms to produce a single, self-contained dataset for species occurrence or checklist data. Essentially it is a set of text (CSV) files with a simple descriptor (meta.xml) to inform others how your files are organized. The format is defined in the Darwin Core Text Guidelines. It is the preferred format for publishing data to the GBIF network.
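To make the "CSV files plus meta.xml descriptor" idea concrete, here is a minimal Python sketch that reads the core file of a Darwin Core Archive by following its descriptor. It is an illustration under stated assumptions, not GBIF tooling: the function names, the file name occurrence.csv and the single example record are all made up, and the sketch ignores extensions and default values that a real archive may declare.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by the Darwin Core text descriptor (meta.xml).
NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}


def read_core_records(archive_bytes):
    """Return the core file of a Darwin Core Archive as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        core = ET.fromstring(archive.read("meta.xml")).find("dwc:core", NS)
        location = core.find("dwc:files/dwc:location", NS).text
        # meta.xml writes a tab delimiter as the two characters "\t".
        delimiter = core.get("fieldsTerminatedBy", ",").replace("\\t", "\t")
        header_lines = int(core.get("ignoreHeaderLines", "0"))

        # Map column index -> short term name (last segment of the term URI).
        columns = {int(core.find("dwc:id", NS).get("index")): "id"}
        for field in core.findall("dwc:field", NS):
            if field.get("index") is not None:  # skip default-only fields
                columns[int(field.get("index"))] = field.get("term").rsplit("/", 1)[-1]

        text = archive.read(location).decode(core.get("encoding", "UTF-8"))
        rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[header_lines:]
        return [{name: row[i] for i, name in columns.items()} for row in rows]


def build_example_archive():
    """Build a tiny, made-up archive in memory so the sketch is runnable."""
    meta = (
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">'
        '<core encoding="UTF-8" fieldsTerminatedBy="," ignoreHeaderLines="1" '
        'rowType="http://rs.tdwg.org/dwc/terms/Occurrence">'
        "<files><location>occurrence.csv</location></files>"
        '<id index="0"/>'
        '<field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>'
        '<field index="2" term="http://rs.tdwg.org/dwc/terms/country"/>'
        "</core></archive>"
    )
    data = "occurrenceID,scientificName,country\nocc-1,Parus major,Norway\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", meta)
        archive.writestr("occurrence.csv", data)
    return buffer.getvalue()


records = read_core_records(build_example_archive())
print(records)
```

The descriptor, not the CSV header row, decides what each column means, which is why the reader looks up the term for every column index in meta.xml.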
Data produced using public funds should be regarded as a common good, and should be openly published and made available for inspection, interpretation and re-use by third parties.
Open data increases transparency and the overall quality of science;
Published datasets can be re-analyzed and verified by others;
Published data can be cited and re-used in the future, either alone or in association with other data;
Data can be integrated with other datasets across both space and time;
Data integration increases recognition and opportunities for collaboration;
Duplication of data-collecting efforts and associated costs will be reduced;
Published data can be indexed and made discoverable, browsable and searchable through internet services (e.g. Web search engines) or more specific infrastructures (e.g. GBIF for biodiversity data);
Collection managers can trace usage and citations of digitized data from their collections;
Data creators, and their institutions and funding agencies, can be credited for their work of data creation and publication through the conventional channels of scholarly citation; priority and authorship are achieved in the same way as with the publication of a research paper;
Open data increases the potential for interdisciplinary research, and for re-use in new contexts not envisaged by the data creator;
Datasets and their metadata, and any related Data Papers, may be inter-linked into Research Objects, to expedite and mutually extend their dissemination, to the benefit of the authors, other scientists in their fields, and society at large;
Published data may be structured as 'Linked Data', and so create new knowledge.
WHAT DATA-LICENSE SHOULD I USE?
• All rights reserved → Data unusable
• Open Data Commons Public Domain Dedication and License (PDDL) or CC0
• Creative Commons Attribution-NoDerivs (CC BY-ND)
• Creative Commons Attribution-NonCommercial (CC BY-NC)
• Creative Commons Attribution-ShareAlike (CC BY-SA) or Open Data Commons Open Database License (ODbL)
• Creative Commons Attribution (CC BY) or Open Data Commons Attribution License (ODC-By)
http://www.canadensys.net/2012/why-we-should-publish-our-data-under-cc0
Improve the usability of your published data! Receive credit through indexing and citation of the published paper.
Increase the visibility and credibility of the data resources you publish. Track more effectively the usage and citations of the data you publish. Receive feedback on your dataset. Increase your network. Get more out of your data. Improve the quality of your dataset.
METADATA
Metadata, literally "data about data", are an essential component of a data management system, describing such aspects as the "what, where, when, who and how" pertaining to a resource. In the GBIF context, resources are datasets, loosely defined as collections of related data, the granularity of which is determined by the data custodian. Metadata can occur in several levels of completeness. In general, metadata should allow a prospective end user of data to:
1. identify/discover its existence,
2. learn how to access or acquire the data,
3. understand its fitness-for-use,
4. learn how to transfer (obtain a copy of) the data, and
5. learn how the data should be used.
GBIF (2011). GBIF Metadata Profile, Reference Guide, Feb 2011 (contributed by Ó Tuama, E., Braak, K.). Copenhagen: Global Biodiversity Information Facility, 19 pp. Accessible at http://links.gbif.org/gbif_metadata_profile_how-to_en_v1
The GBIF Metadata Profile is primarily based on the Ecological Metadata Language (EML). The GBIF profile utilizes a subset of EML and extends it to include additional requirements that are not accommodated in the EML specification. The following tables provide short descriptions of the profile elements and, where relevant, links to more complete EML descriptions.
- http://knb.ecoinformatics.org/software/eml/
- https://knb.ecoinformatics.org/#external//emlparser/docs/index.html
The GBIF Metadata Profile (GMP) was developed in order to standardize how resources get described at the dataset level in the GBIF Data Portal. This profile can be transformed to other common metadata formats such as the ISO 19139 metadata profile.
The elements used in IPT are a subset of the complete GBIF metadata profile.
DESCRIBING A DATASET
IPT metadata based on GBIF metadata profile
• Dataset (Resource)
• Project
• People and Organisations
• Keyword Set (General Keywords)
• Coverage
  o Taxonomic Coverage
  o Geographic Coverage
  o Temporal Coverage
• Methods
• Intellectual Property Rights
• Additional Metadata + NCD (Natural Collections Descriptions Data)
BASIC METADATA
Element             Description
title               A description of the resource that is being documented that is long enough to differentiate it from other similar resources.
description         A brief overview of the resource that is being documented.
metadata language   The language in which the metadata document is written.
type                The type of resource.
subtype             Specimen or observations
Element             Description
resource contact    The resource contact is the person or organisation that should be contacted to get more information about the resource, that curates the resource or to whom putative problems with the resource or its data should be addressed.
resource creator    The resource creator is the person or organisation responsible for the original creation of the resource content.
metadata provider   The metadata provider is the person or organisation responsible for producing the resource metadata.
GEOGRAPHIC COVERAGE
Element                    Description
geographicDescription      Short text description of a dataset's geographic areal domain.
westBoundingCoordinate     Field covering the W margin of a bounding box.
eastBoundingCoordinate     Field covering the E margin of a bounding box.
northBoundingCoordinate    Field covering the N margin of a bounding box.
southBoundingCoordinate    Field covering the S margin of a bounding box.
description                Short description of the geographical coverage
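As a rough illustration of how the geographic coverage fields fit together, the Python sketch below builds an EML-style geographicCoverage element from a bounding box and rejects impossible coordinates. The function name and the example values are assumptions made for this sketch, and namespaces are left out for brevity; IPT produces this markup for you from its metadata form.

```python
import xml.etree.ElementTree as ET


def geographic_coverage(description, west, east, south, north):
    """Build an EML-style geographicCoverage element from a bounding box."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie between -180 and 180")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")

    coverage = ET.Element("geographicCoverage")
    ET.SubElement(coverage, "geographicDescription").text = description
    box = ET.SubElement(coverage, "boundingCoordinates")
    # EML lists the four margins in the order west, east, north, south.
    for name, value in (("west", west), ("east", east), ("north", north), ("south", south)):
        ET.SubElement(box, f"{name}BoundingCoordinate").text = str(value)
    return coverage


# Illustrative values only, roughly framing mainland Norway.
element = geographic_coverage("Mainland Norway (illustrative values)", 4.0, 32.0, 57.0, 72.0)
print(ET.tostring(element, encoding="unicode"))
```

West is deliberately allowed to exceed east, since a bounding box that crosses the 180° meridian is still valid.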
TAXONOMIC COVERAGE
Element                          Description
Taxonomic coverage description   A description of the range of taxa addressed in the dataset or collection.
Add several taxa                 Add Scientific Name; Common Name; Rank
Add taxa in metadata up to the lowest shared rank.
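One way to read "lowest shared rank" is the most specific rank at which every taxon in the dataset still has the same name. The Python sketch below works that out for a list of classifications. The rank list, the function name and the two example taxa are assumptions chosen for illustration, not something prescribed by the GBIF metadata profile.

```python
# Ranks from highest to lowest; a simplified list assumed for this sketch.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def lowest_shared_rank(classifications):
    """Return (rank, name) of the lowest rank on which all taxa agree, or None."""
    shared = None
    for rank in RANKS:
        names = {taxon.get(rank) for taxon in classifications}
        if len(names) != 1 or None in names:
            break  # the taxa diverge (or the rank is missing) from here on
        shared = (rank, names.pop())
    return shared


taxa = [
    {"kingdom": "Animalia", "phylum": "Chordata", "class": "Aves",
     "order": "Passeriformes", "family": "Paridae", "genus": "Parus"},
    {"kingdom": "Animalia", "phylum": "Chordata", "class": "Aves",
     "order": "Passeriformes", "family": "Corvidae", "genus": "Corvus"},
]
result = lowest_shared_rank(taxa)
print(result)  # ('order', 'Passeriformes')
```

For this example the taxonomic coverage would list Passeriformes at rank order, with the higher ranks above it.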
TEMPORAL COVERAGE
Element      Description
Start date   = beginDate
End date     = endDate
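The start and end dates map onto the beginDate and endDate elements of an EML date range. A minimal Python sketch, assuming a hypothetical helper name and made-up dates, and omitting namespaces:

```python
import xml.etree.ElementTree as ET
from datetime import date


def temporal_coverage(begin, end):
    """Build an EML-style temporalCoverage element for a date range."""
    if begin > end:
        raise ValueError("beginDate must not be later than endDate")
    coverage = ET.Element("temporalCoverage")
    date_range = ET.SubElement(coverage, "rangeOfDates")
    for tag, value in (("beginDate", begin), ("endDate", end)):
        # Each date is wrapped in a calendarDate element in ISO 8601 form.
        ET.SubElement(ET.SubElement(date_range, tag), "calendarDate").text = value.isoformat()
    return coverage


element = temporal_coverage(date(2015, 10, 1), date(2016, 7, 31))
print(ET.tostring(element, encoding="unicode"))
```

Checking that the start date does not fall after the end date catches the most common data-entry slip before the metadata is published.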
https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-resource.html
ASSOCIATED PARTIES
Element            Description
Resource contact

KEYWORDS
Element                Description
Thesaurus/vocabulary   n/a
Keyword list           Keywords separated by ","
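The comma-separated keyword list ends up as individual keyword elements inside a keywordSet, alongside the thesaurus name. The Python sketch below shows one way that split could work. The function name and sample keywords are hypothetical, and dropping blank or repeated entries is a choice made for this sketch rather than documented IPT behaviour.

```python
import xml.etree.ElementTree as ET


def keyword_set(keyword_list, thesaurus="n/a"):
    """Split a comma-separated keyword string into an EML-style keywordSet."""
    element = ET.Element("keywordSet")
    seen = set()
    for raw in keyword_list.split(","):
        keyword = raw.strip()
        if keyword and keyword.lower() not in seen:  # skip blanks and repeats
            seen.add(keyword.lower())
            ET.SubElement(element, "keyword").text = keyword
    ET.SubElement(element, "keywordThesaurus").text = thesaurus
    return element


element = keyword_set("Occurrence, Specimen,  occurrence, Aves,")
print([k.text for k in element.findall("keyword")])  # ['Occurrence', 'Specimen', 'Aves']
```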
PROJECT DATA
Element                  Description
title                    Title of the project
Personnel                first name, role
funding                  Reference to funding partners
Study area description   → geographic coverage elaborated
Design description       Project abstract & design
SAMPLING METHODS
Element                Description
Study extent           Sampling area (specific) and sampling frequency
Sampling description
Quality control        Validation and quality control actions performed on the dataset
Step description       Procedures followed to produce a data object
CITATIONS
Element               Description
Citation identifier   URI or DOI
Resource citation     citation

COLLECTION DATA
Element                        Description
Collection Name
collection ID
Parent collection identifier
Preservation method
EXTERNAL LINKS
Element             Description
Resource homepage   URL
Add new external link

ADDITIONAL METADATA
Element                  Description
Hierarchy level          dataset
Date published
Purpose (Rationale)
License
IP rights
Additional information