Replication in psychology: necessity or overblown crisis?
Psychology Department, Faculty of Philosophy, University of Belgrade
Iris Žeželj
Replication: an engine of scientific advancement?
Reproducibility: the degree of consistency in results when scientific studies are repeated
"Demarcation criterion between science and non-science" (Braude, 1979)
Replication workshop, October 2015
How it should be…
Important scientific findings are independently replicated, and evidence of their robustness and universality accumulates.
If a finding is theoretically grounded and comes from a soundly designed study with enough statistical power, it will see the light of day, regardless of whether it is positive or negative.
Science is self-correcting: only replicable findings pass the test, and their epistemological status becomes sounder.
Replication?
Replication not only confirms scientific findings, but also:
• Specifies the conditions under which the effect is registered.
• Helps estimate the strength of the effect more accurately.
(Brandt et al., 2013)
However…
Analysis of ALL articles in the top 10 psychology journals since 1900:
• 1.6% use the term "replication"
Analysis of 500 randomly chosen articles from that 1.6%:
• 68% of the articles using the term are actually designed to replicate
How big of a problem?
In an attempt to develop treatments for different types of tumors, 53 landmark studies published in biomedical journals were replicated over the course of 10 years:
Only 6 (11%) were successfully replicated.
"Some non-reproducible preclinical papers had spawned an entire field, with hundreds of secondary publications that expanded on elements of the original observation but did not actually seek to confirm or falsify its fundamental basis." (Begley & Ellis, 2012, p. 532)
Reactions from the pharmaceutical and biotech industry:
"The situation is intolerable. Why aren't we progressing faster in discovering effective treatments? One option is that the academic community is not supplying accurate findings." (CNBC, 2012)
Collaborative replication effort: what is a successful replication?
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
Collaborative replication effort: predictors of success
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
Why is it so?
The structure of incentives in science
"Questionable research practices (QRP)"
Incentives
Funding and publishing principles reward innovative findings, but not tests of the robustness of existing findings.
Daryl Bem: article on ESP (a series of underpowered experiments), published in JPSP
An independent replication was unsuccessful: "We NEVER publish direct replications" (JPSP editorial board)
Blog documenting the troubles with publishing the non-replication:
http://chronicle.com/blogs/percolator/wait-maybe-you-cant-feel-the-future/27984
The replication was finally published in a journal with a different editorial policy
Is this a shared belief?
"Negative results are as fundamental for scientific progress as attractive, counterintuitive positive results – regardless of a person's Popperian passion for falsification." (Kahneman, 2012)
Kahneman D. (2012). Open Letter linked to Nature
Fascination with statistical significance
Analysis of 165 articles in 4 major APA journals:
• 94% report statistical significance
• Of those, 96% reject the null hypothesis
The pattern is present in other sciences, but appears less pronounced. In medical journals:
• 70% report statistical significance
• Of those, 84% reject the null hypothesis
Where do these patterns originate from?
Three types of bias:
Editorial: of 79 editors of high-impact journals, 94% claim they do not encourage replications (Madden, 1995)
Reviewer: 60% of reviewers favour novel findings over replications, calling replications a "waste of journal space" (Neuliep & Crandall, 1993)
Author: the probability of submitting a positive finding is 8 times higher than that of submitting a negative one (Greenwald, 1975)
How to interpret null findings ("unsuccessful replications")
a. Type 2 error: a genuine effect fails to replicate by chance
b. There is no original effect
c. The real strength of the effect is lower than claimed in the original study
d. The design or analysis of either the original study or the replication is methodologically flawed
Questionable research practices (QRP)
Anonymous survey of 6,000 APA members:
• 74% do not report all DVs, only the ones that produce significant effects
• 71% stop collecting data once statistical significance is reached
• 54% report unexpected results as if they were expected (so-called HARKing: Hypothesizing After the Results are Known)
• 50% dismiss negative findings as pilot studies or declare them methodologically flawed, while positive findings are accepted without scrutiny
• 1.7% admit to fabricating data
Questionable research practices (QRP) revisited
Anonymous survey of 1,138 members of the German Psychological Association (Schwartz & Fiedler, in press):
QRPs not recognized as such
Standard practices in psychological research not recognized as QRPs (Ioannidis, 2005):
• A series of "small" experiments low in statistical power → an illusion of the robustness of the effect
• Continuing data collection beyond the planned sample size until statistical significance is reached → rationalized as increasing power
All QRPs are more common in experimental than in correlational studies.
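The "continue collecting until significant" practice listed above can be illustrated with a short simulation (an editor's sketch, not part of the original slides; the function name and parameters are ours): a z-test with a true null hypothesis is re-run after every added observation, and the "study" stops as soon as it reaches significance.

```python
import random
from math import sqrt

def optional_stopping_fpr(n_sims=1000, n_min=10, n_max=100, z_crit=1.96, seed=1):
    """False-positive rate under a TRUE null when a z-test (known sigma = 1)
    is re-run after every added observation and the study stops as soon as
    |z| exceeds the nominal critical value. Illustrative sketch only."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        running = 0.0
        for n in range(1, n_max + 1):
            running += rng.gauss(0.0, 1.0)      # null is true: mean 0, sigma 1
            # z = sample_mean / (sigma / sqrt(n)) = running_sum / sqrt(n)
            if n >= n_min and abs(running / sqrt(n)) > z_crit:
                hits += 1                        # "significant" by luck
                break
    return hits / n_sims
```

A fixed-n test at the same critical value holds its Type 1 error near the nominal 5%; with repeated looks between n = 10 and n = 100, the simulated false-positive rate climbs to roughly three to four times that, which is exactly why optional stopping is a QRP rather than a harmless way of "increasing power".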
Consequences
"The prevalence of QRPs raises questions about the credibility of research findings and research integrity by producing unrealistically elegant results that may be difficult to match without engaging in such practices oneself. This could lead to a race to the bottom, with questionable research begetting even more questionable research – like performance enhancers in science!" (John, Loewenstein & Prelec, 2012, p. 8)
What is good for the scientist is not good for science?
Replications: counterarguments
The fear of Type 1 error (false positives) is unfounded if we stick to a conservative level for rejecting the null hypothesis (.05 or .01).
But: "A simple count of the percentage of significant results in journals would suggest that psychological studies have over 90% statistical power to reject the null hypothesis. However, power estimates based on sample sizes and effect sizes suggest that power is at best 60%." (Gigerenzer & Sedlmeier, 1995)
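The "power estimated from sample sizes and effect sizes" in the quote above can be sketched with the standard normal-approximation power formula for a two-sided two-sample test (a generic textbook calculation, not code from the talk; the helper name is ours):

```python
from math import sqrt
from statistics import NormalDist

def approx_power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample t-test for a
    standardized effect size d, using the normal approximation.
    Illustrative helper, not part of the original slides."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    ncp = d * sqrt(n_per_group / 2)   # noncentrality of the test statistic
    # Probability the statistic exceeds the upper critical value;
    # the opposite tail is negligible for positive d and is ignored.
    return 1 - nd.cdf(z_crit - ncp)
```

With d = 0.5 and 64 participants per group, this gives roughly .80, the textbook benchmark; with the smaller samples and effects typical of the surveyed literature, the same formula lands well below the 90%+ implied by journals' success rates, which is the discrepancy the quote points at.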
Replications: counterarguments
(Pashler & Harris, 2012, p. 532)
Collaborative replication effort: journal ranking by R-index
R-index: the discrepancy between the power implied by reported significant results and the power calculated from sample sizes and effect sizes
https://replicationindex.wordpress.com/2015/08/13/replicability-ranking-of-26-psychology-journals/
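A minimal sketch of that discrepancy calculation, following the formula described on the replicationindex blog (median estimated power minus the inflation of the reported success rate over it); the function name and inputs are illustrative, not the site's actual code:

```python
from statistics import median

def r_index(observed_powers, significant_flags):
    """R-index sketch: median per-study power estimate (from sample
    sizes and effect sizes) minus the inflation of the reported
    success rate over that power. Illustrative helper only."""
    med_power = median(observed_powers)
    success_rate = sum(significant_flags) / len(significant_flags)
    inflation = success_rate - med_power
    return med_power - inflation   # = 2 * med_power - success_rate
```

When every study in a journal is reported as significant but the studies only have 60% estimated power, the index drops to 0.2, flagging an implausibly perfect track record; when the success rate matches the estimated power, the index simply equals that power.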
Although direct replications are rare, conceptual replications are often conducted and published, testing not only the validity but also the universality of the effect.
But:
What does an unsuccessful conceptual replication tell us? → Little or nothing about the original effect.
What does a successful conceptual replication tell us? → The original effect is reproducible and generalizes to different, though similar, conditions.
Replications: counterarguments
Null results can be a consequence of an error in designing or conducting the research and therefore have no meaningful value.
But: errors are possible in both directions (failing to discover an existing phenomenon and "discovering" a non-existing one), so there is no reason to presume asymmetry.
Replications: counterarguments
You can never claim you conducted the research identically to the original researcher (the so-called "quality of the chef" argument).
Is this a valid argument? How detailed should our method sections be?
Replications: counterarguments
Science rests on an asymmetry between positive and negative claims: negative findings do not deserve the same treatment as positive ones. ONE positive finding (a black swan) carries more weight than any number of negative ones.
But: are scientific claims of this type? They are usually probabilistic claims, which means the asymmetry is reversed: one should show not that the rule applies in one context for one respondent, but that it applies across a number of contexts and respondents.
Replications: counterarguments
Publishing negative findings discourages scientists from researching non-robust, subtle effects, which often constitute fundamental scientific knowledge.
One could agree or disagree.
Replications: counterarguments
Publishing negative results damages the reputation of the original authors.
How much weight should this argument be given?
Replications: counterarguments
Replication etiquette (Kahneman, 2014)
• Contact the original authors
• Ask for materials
• Ask for unpublished details about the procedure
• Share the findings with the original authors
• With their permission, publish
How to proceed: a change of incentives needed?
Change in design
• Always directly replicate the original effect in an independent experiment
• Larger samples, more statistical power
• Design preregistration
How to proceed: a change of incentives needed?
Change in reporting
• Abandon the binary approach: report effect sizes and confidence intervals, not only p levels
• Bayesian statistics
• More meta-analysis (even small-scale meta-analysis)
• Banning inferential statistics altogether (BASP; Trafimow & Marks, 2015)
• Journals more supportive of replication: Psych Science, Perspectives on Psych Science, Social Psychology, JJDM, JRP
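The first recommendation above, reporting effect sizes with confidence intervals instead of bare p values, can look like this in practice: Cohen's d for two independent groups with an approximate normal-theory CI. This is an illustrative sketch (the helper name is ours, and the standard error is the common large-sample approximation), not code from the talk:

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def cohens_d_with_ci(group1, group2, conf=0.95):
    """Cohen's d for two independent groups with an approximate
    normal-theory confidence interval. Illustrative sketch only."""
    n1, n2 = len(group1), len(group2)
    # pooled standard deviation across the two groups
    s_pooled = sqrt(((n1 - 1) * stdev(group1) ** 2 +
                     (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2))
    d = (mean(group1) - mean(group2)) / s_pooled
    # common large-sample approximation to the standard error of d
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return d, (d - z * se, d + z * se)
```

A wide interval around a "significant" d makes the uncertainty visible in a way a lone p < .05 never does, which is precisely the point of abandoning the binary approach.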
Sharing databases, accumulating data
Open Science Framework: https://osf.io/
www.figshare.com
http://psychfiledrawer.org/
How to proceed: a change of incentives needed?
No novelty requirement?
• PLOS
Post-publication peer review?
• Trial period in medical science: PubMed
How to proceed: a change of incentives needed?
Our experience: preregistered replications
Call for a special issue of Social Psychology: Replications
Introduction and Method sections sent out for review
Method
Participants: sampling plan (sample size calculated from the strength of the effect in the initial study; recommended statistical power for the replication: .95); sample characteristics, including known differences from the original study
Materials: in the ideal scenario, materials are shared by the author of the original study. If not, explain why not and how equivalence was ensured.
Our experience: preregistered replications
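The sampling plan above (sample size derived from the original effect size at a recommended power of .95) can be turned into a concrete number with the usual normal-approximation sample-size formula for a two-sided two-sample test. A sketch with illustrative names, not the journal's actual procedure:

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_for_replication(d_original, power=0.95, alpha=0.05):
    """Per-group sample size for a two-sided two-sample test to reach
    the target power at the original study's effect size, using the
    normal approximation. Illustrative helper only."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = nd.inv_cdf(power)           # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / d_original) ** 2)
```

For an original d of 0.5 and the recommended power of .95, this asks for roughly 104 participants per group, which is why preregistered replications generally need considerably larger samples than the studies they replicate.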
Method
Procedure: a very detailed description of the setting, the experimenters (presence; blind to the hypothesis or not), the original instructions, etc.
Data analysis plan: specification of the analysis that will test the original effect. If integration of databases is planned, specification of the meta-analysis procedure.
Our experience: preregistered replications
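Where the plan calls for specifying a meta-analysis procedure over integrated databases, the simplest option is inverse-variance fixed-effect pooling. A minimal illustrative sketch (the function name and inputs are ours; real plans would typically also consider random-effects models):

```python
from math import sqrt

def fixed_effect_meta(effects, variances):
    """Inverse-variance fixed-effect meta-analysis: pools per-study
    effect sizes into one weighted estimate with its standard error.
    Each study is weighted by the inverse of its sampling variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    return pooled, se
```

Because the pooled standard error shrinks as studies accumulate, even a small-scale meta-analysis over a replication and its original gives a more stable effect estimate than either study alone, which is the rationale behind the "integrate the databases" step.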
Further steps
• 2 reviews
• Revising the design
• Conducting the replication
• Final manuscript reviewed by two editors
• Revising the manuscript
Our experience: preregistered replications
Collaborating with the original authors
• Final manuscript sent to the original authors
• The original authors write a commentary
• The commentary is reviewed
• The replication team responds to the comments
• The rejoinder is reviewed
• The journal publishes the replication, commentary, and rejoinder
Our experience: preregistered replications
... And different experiences
Take home messages
• Direct replication = the first step in cross-cultural collaboration
• Sample size planned based on the effect size
• Design preregistered, if possible
• Results integrated into joint databases
• Meta-analysis (even small-scale) on the data
Motivated reasoning?
Scientists = human beings
Link to a folder with the material:
https://www.dropbox.com/sh/mzk0tvsaxnkp8c7/AADuLq6vsHtHspjuNdmsMSg7a?dl=0
Thank you for your attention!
[email protected]