Confidential: For Review Only

The use of positive and negative words in scientific abstracts: too good to be true?

Journal: BMJ

Manuscript ID BMJ.2015.029354.R1

Article Type: Research

BMJ Journal: BMJ

Date Submitted by the Author: 19-Nov-2015

Complete List of Authors:
Vinkers, Christiaan; University Medical Center Utrecht, Department of Psychiatry
Tijdink, Joeri; VU Medical Center, Department of Internal Medicine
Otte, Willem; University Medical Center Utrecht, Department of Pediatric Neurology; University Medical Center Utrecht, Image Sciences Institute

Keywords: positive outcome bias, publication patterns, novel, innovative, robust, unprecedented, PubMed

https://mc.manuscriptcentral.com/bmj

BMJ

The use of positive and negative words in scientific abstracts: too good to be true?

REVISED VERSION

Christiaan H Vinkers, Joeri Tijdink, Willem M Otte

Christiaan H Vinkers MD PhD, Department of Psychiatry, Brain Center Rudolf Magnus, University Medical

Center Utrecht, 3584 CX Utrecht, the Netherlands

Christiaan H Vinkers, Assistant professor

Joeri Tijdink MD, Department of Internal Medicine, VU University Medical Center, 1081 HZ Amsterdam, the

Netherlands

Joeri Tijdink, PhD student

Willem M Otte PhD, Department of Child Neurology, Brain Center Rudolf Magnus, University Medical Center

Utrecht, 3584 CX, Utrecht, the Netherlands, Biomedical MR Imaging and Spectroscopy, Center for Image

Sciences, University Medical Center Utrecht, Utrecht, the Netherlands

Willem M Otte, Assistant professor

Correspondence to: Christiaan H Vinkers, MD PhD, Department of Psychiatry, Brain Center

Rudolf Magnus, University Medical Center Utrecht, Utrecht, the Netherlands, Heidelberglaan

100, 3584 CX Utrecht, The Netherlands. Tel: +31 (0) 88 7 555 555. E-mail:

[email protected]

Abstract

Objective: Our perception of the world is reflected in how we use language. We aimed to

investigate whether language use in science would skew towards the use of strikingly positive

and negative words over time.

Design: Retrospective analysis of all scientific abstracts included in PubMed between 1974 and 2015.

Main outcome measures: Positive and negative word frequencies in comparison to

frequencies of words with a neutral and random connotation, expressed as relative change

since 1980.

Methods: The yearly frequency of 25 positive, 25 negative, and 25 neutral words, as well as

100 randomly selected words was normalized for the total number of abstracts. Sub-analyses

included pattern quantification of words in isolation, specificity for high impact journals, and

comparison between affiliations within or outside countries with English as the official

majority language. Frequency patterns were compared with 4% of all books ever printed and

digitized using the Google Books Ngram Viewer.

Results: The relative increase in word frequency over four decades was 880% for positive

words and 257% for negative words. All individual positive words contributed to the increase,

particularly the words ‘robust’, ‘novel’, ‘innovative’ and ‘unprecedented’ which increased in

relative frequency up to 15000%. Comparable but less pronounced results were obtained

when restricting the analysis to high-impact factor journals. Authors affiliated to an institute

located in a non-English speaking country used significantly more positive words. No

apparent increase was found in the use of neutral and random words, and neither did the

frequency of positive words increase in published books over the same time period.

Conclusions: Our unique lexicographic analysis convincingly demonstrates that scientific

abstracts are currently written with more positive and negative words. The remarkable

increase in frequency of positively valenced words provides a novel and unprecedented

insight into the evolution of scientific writing. Apparently scientists look on the bright side of

research results. However, whether this perception fits reality should be questioned.

Introduction

Science has shown an impressive growth over the last decades and more scientific papers are

published now than ever before.1 Between 1996 and 2011, over 15 million individuals

authored around 25 million papers.2 Due to expanding research fields, it is increasingly

difficult to get studies published in high-impact journals.3 This is important since publication

quantity and associated impact factors have a considerable effect on a scientist’s career prospects.4 Consequently, in order to get published, scientific discoveries may sometimes be

exaggerated or the potential implications overstated.5 6 Indeed, overinterpretation,

overstatement and misreporting of scientific results have been frequently reported.7-12

However, the prevalence of this problem in the scientific literature is unclear.

There is a well-known universal tendency in humans to use positive words,13 and

exaggeration of research-related news has previously been linked to overstatements in

academic press releases.14 In the current study, we used a data-driven approach to investigate

trends in the use of positively and negatively valenced words in PubMed abstracts and titles

over the last four decades. Subsequently, positive and negative word trends were contrasted to

either neutral or random words, as well as to patterns obtained from the corpus of digitized

texts containing about 4% of all books ever printed using Google Ngram Viewer. We

hypothesized that the emergence of a culture aimed at productivity and novelty could have

affected the use of positive and negative words in scientific reporting and discussion.

Methods

The yearly frequency of 25 predefined positive, negative, and neutral words was quantified in

titles and abstracts obtained from the PubMed database (www.pubmed.gov) (Table 1).

Analyses were restricted to 1974 – 2015 to ensure that all abstract texts were available. Words

were selected before any analysis was carried out, by consensus among the authors after discussion that included manual analysis of random abstracts and a search of thesaurus listings. To validate the results from these pre-specified lists, additional positive

words were selected from a recent article on superlatives in news coverage of cancer drugs.15

To further exclude a bias in the choice of these words, we additionally searched for 50 nouns

and 50 adjectives randomly selected from Ogden’s 850 core words of Basic English

(https://en.wiktionary.org/wiki/Appendix:Basic_English_word_list). The yearly number of

abstracts containing one or more of the positive, negative, neutral, or random words in title or

abstract text (based on the OR operator) was divided by the total number of yearly

publications. Search queries are provided as supplementary material (supplementary data 1).

Differences between trends across the last 10 years were also summarized (mean and 95% confidence interval (CI)) and statistically tested with unpaired t-tests. Patterns of individual words were plotted to determine whether developments were comparable across words. Future predictions for the word ‘novel’ were calculated with low order polynomial regressions. Co-occurrence of positive and negative words in abstracts was examined with random sets of positive words using the ‘AND’ operator. All analyses were carried out using R and plots were created with the R package ‘ggplot2’.
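The normalisation described above can be illustrated with a short sketch. It is written in Python rather than R purely for illustration and is not the authors' code. The function names and all counts are hypothetical, and the formula for relative change, (value - baseline) / baseline with 1980 as the baseline year, is an assumption based on the phrase "relative change since 1980".

```python
def yearly_fraction(hits_by_year, totals_by_year):
    """Fraction of records per year that contain at least one listed word."""
    return {
        year: hits_by_year[year] / totals_by_year[year]
        for year in hits_by_year
        if totals_by_year.get(year, 0) > 0
    }


def relative_change(fractions, baseline_year=1980):
    """Percentage change of each yearly fraction relative to the baseline year."""
    base = fractions[baseline_year]
    return {
        year: 100.0 * (value - base) / base
        for year, value in fractions.items()
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
hits = {1980: 2_000, 2000: 30_000, 2015: 175_000}
totals = {1980: 100_000, 2000: 500_000, 2015: 1_000_000}

fractions = yearly_fraction(hits, totals)
change = relative_change(fractions)
for year in sorted(change):
    print(year, f"{100 * fractions[year]:.1f}%", f"{change[year]:+.0f}%")
```

Dividing by the yearly total matters because the number of PubMed records grows every year; without it, any word would appear to increase.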

To ensure that any trend in the use of positive and negative words in PubMed abstracts

was specific for science rather than reflecting general trends in word use in society, the use

of positive and negative words in published books between 1975-2009 was also quantified

using the Google Books Ngram Viewer that charts frequencies of any word or short sentence

found in millions of books printed between 1800 and 2009.16 We plotted average Google

Books patterns and corresponding CIs (calculated from bootstrap sampling of all individual

word frequency patterns; 1000 samples/year) to evaluate differences with the patterns

obtained from the PubMed queries.
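A percentile bootstrap of the kind mentioned above could look roughly like the following Python sketch. The exact resampling scheme used in the study is not specified beyond "1000 samples/year", so resampling words with replacement within each year and the percentile interval are assumptions, as are the function names and the example values.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_samples=1000, alpha=0.05, seed=0):
    """Mean of `values` with a percentile bootstrap confidence interval."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_samples)
    )
    lower = means[round((alpha / 2) * n_samples)]
    upper = means[round((1 - alpha / 2) * n_samples) - 1]
    return statistics.fmean(values), lower, upper


def yearly_bootstrap(patterns, n_samples=1000):
    """`patterns` maps word -> {year: relative frequency}.

    Returns year -> (mean, lower, upper) across words, for years present
    in every word's pattern.
    """
    years = sorted(set.intersection(*(set(p) for p in patterns.values())))
    return {
        year: bootstrap_mean_ci([p[year] for p in patterns.values()], n_samples)
        for year in years
    }


# Hypothetical relative frequencies for three words in two years.
example = {
    "novel": {1975: 1.0, 2009: 1.6},
    "robust": {1975: 1.0, 2009: 1.3},
    "unique": {1975: 1.0, 2009: 1.5},
}
print(yearly_bootstrap(example))
```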

In light of the increasing number of journals and the rise of the open access movement, we also restricted our search to 20 high impact journals, which were pre-specified by consensus between the authors (see supplementary table S1). Finally, we investigated a possible cultural influence by comparing the use of positive and negative words in titles and abstracts between authors with an affiliation in a country where English is the de facto official language (Australia, New Zealand, United Kingdom and the United States) and authors affiliated elsewhere.
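The Methods state that subgroup differences over the last 10 years were summarised as a mean difference with a 95% CI and tested with unpaired t-tests. A minimal Python sketch of such a summary follows. It assumes the equal-variance (Student) form; R's `t.test` defaults to the Welch form, and the manuscript does not say which was used. The critical value is passed in by hand (2.101 is the two-sided 95% value for 18 degrees of freedom), the p-value is left to a statistics library, and all numbers are hypothetical.

```python
import math
import statistics


def unpaired_t_summary(group_a, group_b, t_crit):
    """Mean difference (a - b), t statistic and CI for two independent groups.

    Uses the pooled-variance (Student) form; `t_crit` is the two-sided
    critical value for len(group_a) + len(group_b) - 2 degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return {
        "difference": diff,
        "t": diff / se,
        "df": df,
        "ci": (diff - t_crit * se, diff + t_crit * se),
    }


# Hypothetical relative frequencies (%) for the last 10 years in two groups.
all_journals = [700.0, 720.0, 745.0, 760.0, 790.0, 805.0, 830.0, 850.0, 865.0, 880.0]
high_impact = [560.0, 575.0, 590.0, 600.0, 615.0, 630.0, 640.0, 655.0, 665.0, 674.0]

summary = unpaired_t_summary(high_impact, all_journals, t_crit=2.101)
print(summary)
```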

Results

Between 1974 and 1980, the percentage of PubMed records containing one or more positive words in title or abstract varied between 1.7 and 2.3%. This increased to 17.5% in 2015, a relative increase of 880% (Figure 1, top left). Increases above 700% were also present in random selections of 20 positive words each. The usage of the same positive words in

published books increased to 146% from 1975 to 2009 (Figure 1, top). Frequency patterns of

all individual words in abstracts showed increased usage although with large variation (Figure

2). In isolation, the words ‘robust’, ‘novel’, ‘innovative’ and ‘unprecedented’ increased in

relative frequency from 2500 to 15000% (Figure 2). Removal of these words still yielded a

relative frequency increase of 540%. Moreover, word trends were similar after exclusion of

low-frequency words such as ‘inventive’ and ‘astonishing’. Analyses of additional positive

words (“breakthrough”, “cure”, “marvel”, “miracle”, “revolutionary” and “transformative”)

based on a recent article15 revealed comparable and consistent increases in frequency

(supplemental figure S1). Positive word use also increased in high-impact factor journals,

with a change from 1.1 to 8.9% (relative increase of 674%; Figure 1, top left). However, the

increase in positive word use over the last ten years was significantly lower in high-impact

factor journals compared to the frequency pattern of positive words across all journals (-159.8%, CI -92.9 to -226.7%, p-value 0.0001). Similar results were found using the top 20 list

of journals based on the journal impact factor (both for all journals and for general medical

journals, Journal Citation Reports 2014) (data not shown). Combinations of more than two

positive words in single abstracts only occurred in a minority of abstracts. Patterns in positive

and negative words significantly differed between authors with an affiliation inside or outside

an English speaking country, with lower frequency rates in the last ten years for those

affiliated with an institution in Australia, New Zealand, UK or US (-31.4%, CI -50.6 to -12.2%, p-value 0.003; supplemental figure S3). Extrapolating the upward trend of positive

words over the last 40 years to the future, we predict that the word ‘novel’ will appear in

every record by the year 2123.
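The extrapolation for ‘novel’ relies on the low order polynomial regression mentioned in the Methods. The Python sketch below shows the general idea with a quadratic fit on synthetic data. The polynomial degree, the function names and the data are all assumptions made for illustration, so the year it prints is not expected to match the 2123 reported here.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef


def predict(coef, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coef))


def first_year_reaching(coef, start_year, origin, target=1.0, max_year=2500):
    """First year at which the fitted fraction reaches `target`, or None."""
    for year in range(start_year, max_year + 1):
        if predict(coef, year - origin) >= target:
            return year
    return None


# Synthetic yearly fractions of records containing a word (not study data).
origin = 1975
years = list(range(1975, 2016))
fractions = [0.001 + 0.00002 * (y - origin) ** 2 for y in years]

# Years are shifted to start at 0 to keep the normal equations well conditioned.
coef = polyfit([y - origin for y in years], fractions, degree=2)
saturation_year = first_year_reaching(coef, start_year=2016, origin=origin)
print(saturation_year)
```

Extrapolating a polynomial this far beyond the observed range is highly sensitive to the chosen degree, which fits the tongue-in-cheek nature of the prediction.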

For negative words, a similar but less clear-cut increase in relative frequency was found: up to 257% in 2015, and up to 199% when restricted to high impact journals (Figure 1, top

right). Individual negative word patterns are included as supplemental figure S2. No increase

was found in the use of neutral words and only a modest increase in relative frequency to

150% for random words (Figure 1, bottom).

Discussion

Our analysis of scientific abstracts demonstrates that positive and, to a lesser extent, negative words have been increasingly used over the past four decades. In contrast, this increase was

absent for neutral and random words. The increase in positive words could not be attributed to

general language tendencies as represented by the corpus of millions of printed books. Neither

is the increase driven by one or two words as all words showed increased frequency patterns.

Even though the upward trend in positive word use was conserved in high-impact journals,

this trend was significantly less pronounced (Figure 1). This could be the result of a more

thorough and critical editorial and peer review process.

Our approach has strengths and weaknesses. The main strength of our lexicographic analysis

is the inclusion of all PubMed abstracts for over four decades which prevents selection bias.

Side-by-side comparisons with patterns of other word lists and general English texts provide

robust reference data. However, our study also has limitations. First, we limited the list of

positive and negative words, and the choice of words is likely to affect the specificity of the

observed patterns. However, the general tendency was comparable across individual words

and sensitivity analyses with additional positive words yielded similar results. Second, we did

not account for changes in the maximum length of PubMed abstracts over the years. However, the upward trends are more or less linear over time, and a change in abstract length would likely have resulted in an increase of neutral or random words as well. Third, we did not

study the location of the words in the abstracts, or the context of their usage. Contextual

analysis of words may differentiate between the connotation of isolated words and the

connotation conditional on the sentence. Moreover, we did not directly examine the

relationship between word usage and the current scientific culture, i.e. the role of increased

publication pressure and the perceived relevance of publications for a scientific career.

Finally, we cannot exclude the possibility that the scientific process has considerably

improved over the last decades and that the more frequent use of positive words is

appropriate.

Although researchers may have adopted an increasingly optimistic writing approach and are ever more enthusiastic about their results, another explanation is more likely: scientists may assume that results and their implications have to be exaggerated and overstated in order to get published. Our finding that scientific abstracts use more overt

positive language is also probably related to the emergence of a positive outcome bias that

currently dominates scientific literature.17

There is a high pressure on scientists in academia to

publish as many papers as possible in order to further their careers. As a result, we may be

afraid to break the bad news that many studies do not result in statistically significant or

clinically meaningful effects. Currently, the majority of research findings may be false or exaggerated,6 18 and research resources are often wasted.19 Overestimation of research

findings directly impairs the ability of science to find true effects and leads to an unnecessary

focus on research marketability. The consequences of this increase are worrisome since it

makes research a survival of the fittest: the person who is best able to ‘sell’ their results may

be the most successful. It is time for a new academic culture that rewards quality over

quantity and stimulates researchers to revere nuance and objectivity. Notwithstanding the

steady increase of superlatives in science, this finding should not distract us from the fact that we

need bright, unique, innovative, phenomenal, creative, and excellent scientists.

Competing interests

All authors have completed the ICMJE uniform disclosure form at

www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the

submitted work; no financial relationships with any organisations that might have an interest

in the submitted work in the previous three years; no other relationships or activities that

could appear to have influenced the submitted work.

Funding source

No funding source supported this study.

Licence Statement

The Corresponding Author has the right to grant on behalf of all authors and does grant on

behalf of all authors, a worldwide licence to the Publishers and its licensees in perpetuity, in

all forms, formats and media (whether known now or created in the future), to i) publish,

reproduce, distribute, display and store the Contribution, ii) translate the Contribution into

other languages, create adaptations, reprints, include within collections and create summaries,

extracts and/or, abstracts of the Contribution, iii) create any other derivative work(s) based on

the Contribution, iv) to exploit all subsidiary rights in the Contribution, v) the inclusion of

electronic links from the Contribution to third party material where-ever it may be located;

and, vi) licence any third party to do any or all of the above.

Declaration of contribution

CV, JT and WO all made substantial contributions to the conception or design of the work; or

the acquisition, analysis, or interpretation of data for the work; AND

CV, JT and WO drafted the work or revised it critically for important intellectual content;

AND

CV, JT and WO gave final approval of the version to be published; AND

CV, JT and WO all agreed to be accountable for all aspects of the work in ensuring that

questions related to the accuracy or integrity of any part of the work are appropriately

investigated and resolved.

Data sharing

No additional data available.

Transparency statement

I, Christiaan Vinkers, affirm that the manuscript is an honest, accurate, and transparent

account of the study being reported; that no important aspects of the study have been omitted;

and that any discrepancies from the study as planned have been explained.

What this paper adds

Section 1: What is already known on this subject

• Science has shown an impressive growth over the last decades and in order to get

published, scientific discoveries are sometimes exaggerated or the potential

implications overstated.

Section 2: What this study adds

• Our analysis of PubMed abstracts demonstrates that positive words have been increasingly used over the past four decades.

• The use of more overt positive language is probably related to the emergence of a

positive outcome bias that currently dominates scientific literature.

References

1. Ridker PM, Rifai N. Expanding options for scientific publication: is more always better?

Circulation 2013;127(2):155-6.

2. Boyack KW, Klavans R, Sorensen AA, Ioannidis JP. A list of highly influential biomedical

researchers, 1996-2011. Eur J Clin Invest 2013;43(12):1339-65.

3. Fraser AG, Dunstan FD. On the impossibility of being expert. BMJ 2010;341:c6815.

4. Publish or perish. Nature 2015;521(7552):259.

5. Macleod MR, Michie S, Roberts I, Dirnagl U, Chalmers I, Ioannidis JP, et al. Biomedical

research: increasing value, reducing waste. Lancet 2014;383(9912):101-4.

6. Ioannidis JP. Why most published research findings are false. PLoS Med 2005;2(8):e124.

7. Boutron I, Dutton S, Ravaud P, Altman DG. Reporting and interpretation of randomized

controlled trials with statistically nonsignificant results for primary outcomes. JAMA

2010;303(20):2058-64.

8. Ochodo EA, de Haan MC, Reitsma JB, Hooft L, Bossuyt PM, Leeflang MM.

Overinterpretation and misreporting of diagnostic accuracy studies: evidence of

"spin". Radiology 2013;267(2):581-8.

9. Lockyer S, Hodgson R, Dumville JC, Cullum N. "Spin" in wound care research: the

reporting and interpretation of randomized controlled trials with statistically non-

significant primary outcome results or unspecified primary outcomes. Trials

2013;14:371.

10. Patel SV, Chadi SA, Choi J, Colquhoun PH. The use of "spin" in laparoscopic lower GI

surgical trials with nonsignificant results: an assessment of reporting and interpretation

of the primary outcomes. Dis Colon Rectum 2013;56(12):1388-94.

11. Boutron I, Altman DG, Hopewell S, Vera-Badillo F, Tannock I, Ravaud P. Impact of spin

in the abstracts of articles reporting results of randomized controlled trials in the field

of cancer: the SPIIN randomized controlled trial. J Clin Oncol 2014;32(36):4120-6.

12. Lazarus C, Haneef R, Ravaud P, Boutron I. Classification and prevalence of spin in

abstracts of non-randomized studies evaluating an intervention. BMC Med Res

Methodol 2015;15:85.

13. Dodds PS, Clark EM, Desu S, Frank MR, Reagan AJ, Williams JR, et al. Human language

reveals a universal positivity bias. Proc Natl Acad Sci U S A 2015;112(8):2389-94.

14. Sumner P, Vivian-Griffiths S, Boivin J, Williams A, Venetis CA, Davies A, et al. The

association between exaggeration in health related science news and academic press

releases: retrospective observational study. BMJ 2014;349:g7015.

15. McCarthy M. Superlatives are commonly used in news coverage of cancer drugs, study

finds. BMJ 2015;351:h5803.

16. Michel JB, Shen YK, Aiden AP, Veres A, Gray MK, Google Books Team, et al. Quantitative

analysis of culture using millions of digitized books. Science 2011;331(6014):176-82.

17. Dwan K, Gamble C, Williamson PR, Kirkham JJ, Reporting Bias Group. Systematic review of

the empirical evidence of study publication bias and outcome reporting bias - an

updated review. PLoS One 2013;8(7):e66844.

18. Ioannidis JP. How to make more published research true. PLoS Med

2014;11(10):e1001747.

19. Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research

evidence. Lancet 2009;374(9683):86-9.

Figures and Tables

Figure 1: Relative frequency patterns of positive (top left), negative (top right), neutral

(bottom left), and random (bottom right) words in PubMed abstracts and titles over time. The

mean relative frequency patterns of the same positive and negative words in general books are plotted in A and B, including 95% confidence intervals (gray shaded).

Figure 2: Relative frequencies of 24 individual positive words as used in PubMed between

1975 and 2015. The word ‘inventive’ was not plotted due to low search volumes.

Table 1: List of the positive, negative, and neutral words used in Pubmed search queries and

Google books search engine.

Positive words: Amazing, assuring, astonishing, bright, creative, encouraging, enormous, excellent, favorable, groundbreaking, hopeful, innovative, inspiring, inventive, novel, phenomenal, prominent, promising, reassuring, remarkable, robust, spectacular, supportive, unique, unprecedented

Negative words: Detrimental, disappointing, disconcerting, discouraging, disheartening, disturbing, frustrating, futile, hopeless, impossible, inadequate, ineffective, insignificant, insufficient, irrelevant, mediocre, pessimistic, substandard, unacceptable, unpromising, unsatisfactory, unsatisfying, useless, weak, worrisome

Neutral words: Animal, blood, bone, brain, condition, design, disease, experiment, human, intervention, kidney, liver, man, men, muscle, patient, prospective, rodent, significant, skin, skull, treatment, vessel, woman, women

Supplementary data

Supplementary data 1: Search queries

Supplementary Table S1: List of twenty journals used for positive word analysis in high impact journals

Supplementary figure S1: Relative frequencies of positive words both combined (top left) and in isolation selected from a recent paper on superlatives commonly used in news coverage of cancer drugs.

Supplemental figure S2: Relative frequencies of 21 individual negative words as used in PubMed between 1975 and 2015. Four words with low search volumes (‘disconcerting’, ‘disheartening’, ‘unpromising’ and ‘unsatisfying’) were not plotted.

Supplemental figure S3. Relative frequency patterns of positive (top left), negative (top right), neutral (bottom left), and random (bottom right) words in PubMed abstracts and titles over time between authors affiliated with an institution inside or outside countries with English as the official majority language.

Supplementary data 1: Search queries

PubMed screenshot (November 12th, 2015). Shows an example query for the word ‘robust’. If the search volume is high enough, a box appears on the right (marked in red) from which yearly frequency counts can be downloaded as a comma-separated file.

Combined query with one or more positive words in abstracts

(Amazing OR Assuring OR Astonishing OR Bright OR Creative OR Encouraging OR Enormous OR

Excellent OR Favorable OR Groundbreaking OR Hopeful OR Innovative OR Inspiring OR Inventive

OR Novel OR Phenomenal OR Prominent OR Promising OR Reassuring OR Remarkable OR Robust

OR Spectacular OR Supportive OR Unique OR Unprecedented)

Combined query with one or more negative words in abstracts

(Detrimental OR Disappointing OR Disconcerting OR Discouraging OR Disheartening OR Disturbing

OR Frustrating OR Futile OR Hopeless OR Impossible OR Inadequate OR Ineffective OR

Insignificant OR Insufficient OR Irrelevant OR Low-quality OR Mediocre OR Pessimistic OR

Substandard OR Unacceptable OR Unpromising OR Unsatisfactory OR Unsatisfying OR Useless OR

Weak OR Worrisome)

Combined query with one or more neutral words in abstracts

(Animal OR Blood OR Bone OR Brain OR Condition OR Design OR Disease OR Experiment OR

Human OR Intervention OR Kidney OR Liver OR Man OR Men OR Muscle OR Patient OR

Prospective OR Rodent OR Significant OR Skin OR Skull OR Treatment OR Vessel OR Woman OR

Women)

Combined query with one or more random words in abstracts

(manager OR substance OR law OR dust OR bite OR butter OR fold OR mind OR protect OR

insurance OR test OR father OR letter OR friend OR power OR edge OR linen OR scale OR bread

OR statement OR weather OR smell OR glass OR food OR level OR steam OR soap OR help OR rule

OR wind OR interest OR purpose OR hole OR fight OR representative OR danger OR prose OR

change OR discussion OR company OR direction OR balance OR organisation OR size OR trade OR

rice OR invention OR heat OR road OR mountain OR electric OR good OR natural OR sweet OR

dead OR strange OR thin OR political OR open OR bitter OR dark OR complex OR warm OR full OR

red OR kind OR possible OR strong OR free OR quick OR slow OR cut OR narrow OR certain OR

dependent OR flat OR acid OR fixed OR responsible OR false OR great OR like OR green OR cold

OR poor OR low OR opposite OR bright OR military OR fertile OR second OR left OR wrong OR

hanging OR gray OR mixed OR angry OR foolish OR loose OR late)

Query to combine with other queries to select abstracts within specific journals only

(ANN INTERN MED[journal]) OR (ANNU REV IMMUNOL[journal]) OR (ARCH INTERN

MED[journal]) OR (BMJ[journal]) OR (CANCER CELL[journal]) OR (CELL[journal]) OR

(IMMUNITY[journal]) OR (JAMA[journal]) OR (LANCET[journal]) OR (BLOOD[journal]) OR (NAT

NEUROSCI[journal]) OR (NAT REV CANCER[journal]) OR (NAT REV GENET[journal]) OR

(NAT REV IMMUNOL[journal]) OR (NAT REV NEUROSCI[journal]) OR (NATURE[journal]) OR

(N ENGL J MED[journal]) OR (PLOS MED[journal]) OR (P NATL ACAD SCI U S A[journal]) OR

(SCIENCE[journal])
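The combined word queries and the journal filter above follow a regular pattern, so they can be assembled programmatically. The Python sketch below is only an illustration of that pattern. The helper names are hypothetical, and joining a word query to a filter with AND (or NOT to invert it) is an assumption about how the queries were combined.

```python
def or_query(words):
    """Join words into a single parenthesised OR query."""
    return "(" + " OR ".join(words) + ")"


def journal_filter(journals):
    """Build a filter matching any of the given journal abbreviations."""
    return " OR ".join(f"({name}[journal])" for name in journals)


def restrict(word_query, filter_query, invert=False):
    """Combine a word query with a filter; `invert` excludes the filter instead."""
    operator = "NOT" if invert else "AND"
    return f"{word_query} {operator} ({filter_query})"


positive = or_query(["Novel", "Robust", "Innovative", "Unprecedented"])
high_impact = journal_filter(["BMJ", "LANCET", "JAMA"])

print(restrict(positive, high_impact))
print(restrict(positive, high_impact, invert=True))
```

Generating the filter from a list also avoids slips such as a missing OR between two journal terms.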

Query to combine with other queries to select abstracts with authors from within English-speaking countries (use NOT operator to invert result)

("UK"[ad] OR "United Kingdom"[ad] OR "Great Britain"[ad] OR "US"[ad] OR "USA"[ad] OR

"United States"[ad] OR "United States of America"[ad] OR "Australia"[ad] OR "New Zealand"[ad])

Query to get total number of papers for each year

"[journal]”

Supplementary Table S1: List of twenty journals used for positive word analysis in high impact

journals.

1. Annals of Internal Medicine

2. Annual Review of Immunology

3. Archives of Internal Medicine

4. Blood

5. British Medical Journal

6. Cancer Cell

7. Cell

8. Immunity

9. JAMA

10. Lancet

11. Nature

12. Nature Neuroscience

13. Nature Reviews Cancer

14. Nature Reviews Genetics

15. Nature Reviews Immunology

16. Nature Reviews Neuroscience

17. New England Journal of Medicine

18. PLoS Medicine

19. PNAS

20. Science

Supplementary figure S1: Relative frequencies of positive words both combined (top left) and in

isolation selected from a recent paper on superlatives commonly used in news coverage of cancer

drugs.

Supplemental figure S2: Relative frequencies of 21 individual negative words as used in PubMed

between 1975 and 2015. Four words with low search volumes (‘disconcerting’, ‘disheartening’,

‘unpromising’ and ‘unsatisfying’) were not plotted.

Supplemental figure S3. Relative frequency patterns of positive (top left), negative (top right),

neutral (bottom left), and random (bottom right) words in PubMed abstracts and titles over time

between authors affiliated with an institution inside or outside countries with English as the official

majority language.
