Research Data Explored: Citations versus Altmetrics
www.tugraz.at
WISSEN · TECHNIK · LEIDENSCHAFT
Isabella Peters (ZBW), Peter Kraker (Know-Center), Elisabeth Lex (TU Graz), Christian Gumpenberger (Uni Wien), Juan Gorraiz (Uni Wien)
32nd Austrian Librarian Day, Sept 17th 2015, Vienna
Motivation
• Data citations have gained momentum
• Citations: Publish or Perish
• Altmetrics: social-media based metrics
• Societal impact of research data
Our goal: investigate research data with respect to their bibliometric characteristics, citations as well as altmetrics
Our study – Dataset
• Thomson Reuters Data Citation Index (DCI)
  • High-quality research data from various repositories
  • Enables search, exploration and bibliometric analysis of research data through Web of Science
• We did a basic analysis for all items published in DCI between 1960 and 2014
• Plus: altmetrics collected from three big altmetrics data providers: ImpactStory, Altmetric.com, PlumX
Research Questions
1. How often are research data cited? Which and how many of these have a DOI? From which repositories do research data originate?
2. What are the characteristics of the most cited research data? Which data types and disciplines are the most cited? How does citedness evolve over time?
3. To what extent are cited research data visible on various altmetrics channels? Are there any differences between the tools used for altmetrics scores aggregation?
ImpactStory
• Targeted at individual researchers
• Works with individually assigned permanent identifiers (e.g. DOIs, URLs, PubMed IDs) or links to ORCID, Figshare, Publons, Slideshare, or GitHub to auto-import new research outputs such as papers, data sets, slides
• Features altmetric scores (Twitter, Facebook, Mendeley, Figshare, Google+, and Wikipedia mentions)
Altmetric.com
• Targeted towards institutions and organizations
• Provides an altmetrics score + underlying data
• Searches a variety of social media platforms (e.g., Twitter, Facebook, Google+, blogs) for keywords and for permanent identifiers
  • E.g. DOIs, arXiv IDs, PubMed IDs
PlumX
• Article-level metrics for “artifacts”
  • Articles, audios, videos, book chapters, trials
• Works with ORCID and other user IDs (e.g., from YouTube, Slideshare) as well as with DOIs, ISBNs, PubMed IDs, patent numbers, and URLs
• Statistics on usage of articles and artifacts
  • E.g., views or downloads of HTML pages or PDFs, Mendeley readers, GitHub forks, Facebook comments, YouTube subscribers
Methodology
• DCI to retrieve records of cited research data
  • Items published in the last decades (1960-9, 1970-9, 1980-9, 1990-9, 2000-9, 2010-4)
  • Metadata fields: DOI/URL, doc type, source, research area, publication year, data type, #citations, ORCID
• Citedness investigated for each decade
• Distribution of document types, data types, sources, research area
  • with >= 2 citations (Sample 1, n = 10,934 records)
  • with >= 2 citations and at least 1 altmetric score (Sample 2, n = 301)
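The sampling steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`year`, `citations`, `altmetric_scores`) and the toy records are hypothetical, not the DCI export format or the study's data. It also assumes that "citedness" means the share of items with at least one citation, and that "at least 1 altmetric score" means a non-zero count from any provider.

```python
from collections import Counter

# Hypothetical toy records; field names are illustrative, not the DCI schema.
records = [
    {"id": "a", "year": 1965, "citations": 0, "altmetric_scores": 0},
    {"id": "b", "year": 1998, "citations": 2, "altmetric_scores": 0},
    {"id": "c", "year": 2005, "citations": 5, "altmetric_scores": 3},
    {"id": "d", "year": 2012, "citations": 0, "altmetric_scores": 7},
    {"id": "e", "year": 2013, "citations": 12, "altmetric_scores": 1},
]


def period(year):
    """Map a publication year to a period label such as '1960-9' or '2010-4'."""
    if not 1960 <= year <= 2014:
        raise ValueError(f"year {year} is outside the studied range 1960-2014")
    start = (year // 10) * 10
    end = min(start + 9, 2014)  # the last period stops at 2014
    return f"{start}-{end % 10}"


def citedness_by_period(recs):
    """Share of records with at least one citation, per period (assumed definition)."""
    total, cited = Counter(), Counter()
    for r in recs:
        p = period(r["year"])
        total[p] += 1
        if r["citations"] >= 1:
            cited[p] += 1
    return {p: cited[p] / total[p] for p in total}


# Sample 1: at least 2 citations.
sample_1 = [r for r in records if r["citations"] >= 2]
# Sample 2: Sample 1 items that also have at least one altmetric score.
sample_2 = [r for r in sample_1 if r["altmetric_scores"] >= 1]
```

Because Sample 2 is filtered from Sample 1, it is a subset by construction, which matches the sample sizes on the slide (301 of 10,934).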
Results
• High uncitedness of research data
• Low percentage of altmetrics scores available for research data with >= 2 citations
Results for Sample 1
Citedness comparatively higher for research data published more recently → interest in younger research data and increase in social media activity
Citation Distribution for Sample 1
• Almost half of the data studies have a DOI (48.9%), but only a few data sets do
• Data studies are on average more cited than data sets
• Data studies with a DOI receive more citations than those with a URL
• Only a few repositories (51), but they receive the most citations
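The comparisons on this slide (mean citations per document type, and DOI versus URL) amount to a grouped average. A minimal sketch, assuming hypothetical field names (`doc_type`, `identifier`, `citations`) and made-up numbers that do not come from the study:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; the values are invented for illustration only.
records = [
    {"doc_type": "data study", "identifier": "doi", "citations": 40},
    {"doc_type": "data study", "identifier": "doi", "citations": 20},
    {"doc_type": "data study", "identifier": "url", "citations": 6},
    {"doc_type": "data set", "identifier": "url", "citations": 2},
    {"doc_type": "data set", "identifier": "url", "citations": 4},
    {"doc_type": "repository", "identifier": "url", "citations": 300},
]


def mean_citations(recs, *keys):
    """Mean citation count per group, where a group is the tuple of the given fields."""
    groups = defaultdict(list)
    for r in recs:
        groups[tuple(r[k] for k in keys)].append(r["citations"])
    return {group: mean(values) for group, values in groups.items()}


by_type = mean_citations(records, "doc_type")
by_type_and_id = mean_citations(records, "doc_type", "identifier")
```

Grouping by a tuple of fields keeps one helper usable for both the per-type and the per-type-and-identifier breakdown.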
Citation Distribution for Sample 1
• Almost half of the research data (4,974 items; 45.5%) → only 2 citations
• 6 items (2 repositories and 4 data studies): > 1,000 citations
Citation Distribution for Sample 1
• Differences between most cited data types when considering research data with a DOI or with a URL
Citation Distribution for Sample 1
• More common to refer to data studies via DOIs in Social Sciences than in Natural and Life Sciences
Disciplinary differences: DOIs vs URLs, document types
Results for Sample 2
• Total of altmetrics scores lower than the number of citations for all document types, with or without a DOI
• Mean altmetrics score higher for data studies than for data sets
Results for Sample 2
• Distributions of data types and subject areas
Correlation Analysis
No correlation between citations and altmetrics scores in Sample 2
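The slides do not say which correlation coefficient was used. As a hedged illustration, a rank-based measure such as Spearman's rho is a common choice for skewed count data like citations; the sketch below implements it with the standard library only. The function names and the two short input lists are hypothetical and are not the study's data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical citation counts and altmetrics scores for six items.
citations = [2, 3, 5, 8, 13, 40]
altmetric_scores = [4, 1, 9, 1, 2, 3]
rho = spearman(citations, altmetric_scores)
```

A value of `rho` near 0 would correspond to the "no correlation" finding; values near +1 or -1 would indicate a monotonic relationship between the two metrics.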
Details on Altmetrics Analysis in PlumX
• DOIs for data sets seem to be important in order to get captures (Mendeley)
• URL sufficient for inclusion in social media (e.g. Facebook, Twitter)
More Altmetrics Results...
• Top 10 research data DOIs with >= 2 citations and at least 1 entry in PlumX
• Cited research data attracts more citations than altmetrics scores
• No correlation between highly cited and highly scored research data.
Conclusions
• Low percentage of altmetrics scores for research data with two or more citations
  • Research data not so often published/shared?
  • Reliability of altmetrics aggregation tools?
• We didn't observe a correlation between citations and altmetrics scores
• Neither the most cited research data nor the most cited sources (repositories) received the highest scores in PlumX
• Interestingly, although “figshare” accounts for almost 25% of the DCI, no item from “figshare” was cited at least twice in DCI → see our follow-up work presented at STI 2015!
Conclusions
• Growing trend in citing research data since 2008 – bias towards more recent research data → in general, research data mostly uncited
• Availability of cited research data with a DOI rather low in DCI, but increasing
• Data studies with a DOI attract more citations than those with a URL
• DOI in cited research data has so far been more embraced in the Social Sciences than in the Natural Sciences
• DOIs/identifiers are important for increasing altmetrics scores, since aggregators rely on them
Future Work
• Investigate data citations in more detail
  • Different from “paper citations”
  • E.g. we found that entire repositories are proportionally more often cited than single data sets
• Meaning of data citations
• Influence of structure of underlying data
  • Data curation, identifiers, ...