Oac guidelines

Post on 29-May-2015

Transparency and consistency

Jonas Ranstam PhD

Scientific research

A systematic investigation ... designed to develop or contribute to generalizable knowledge [1].

Generalizable: having predictive and reliable results.

When sampling errors don't exist or are irrelevant, qualitative research methods (e.g. case reporting) can be used.

If sampling errors do exist, the unavoidable sampling uncertainty must be quantified (quantitative research) and presented, usually in terms of p-values and confidence intervals.

[1] The US National Science Foundation

Statistics

Medical researchers rely as never before on statistics for generating and testing hypotheses and for estimating risks and benefits of old and new therapies.

Journals can facilitate the writing and reading of research reports by implementing clear guidelines for manuscript preparation.

Milestones in scientific publication

1658 – the first scientific journals
1858 – the IMRAD structure
1957 – the abstract
1978 – the Vancouver convention (ICMJE)
1987 – the structured abstract
1997 – the CONSORT guidelines
2007 – the STROBE guidelines

Ten recommendations

1. Purpose
2. Data source
3. Observations
4. Descriptions
5. Methods
6. Assumptions
7. Significance
8. Confidence
9. Multiplicity
10. Claims

1. Purpose

State the research question and the purpose of the study. Is the ambition to describe an observation, to generate hypotheses or to test a pre-specified hypothesis?

Bad

We have shown that the success rate differs between two common techniques for autologous chondrocyte implantation.

Good

We designed an experiment to test the hypothesis of identical success rates of two common techniques for autologous chondrocyte implantation.

2. Data source

Describe the source of subjects, cadavers, animals, tissues, cell line, etc. and how many of these units have been included in the study.

Bad

We collected 36 pieces of human cartilage.

Good

Three pieces of cartilage from each of twelve physically active men between 25 and 75 years of age, previously included as healthy controls in a clinical trial (ref.), were collected for this study.

3. Observations

When observations can be presented individually, either numerically or graphically, this should be preferred. With fewer than 4 observations it should be the rule.

4. Descriptions

When presenting data in aggregated form, always present the number of included observations as well as their average and dispersion. If repeated measurements or replicates are included, present both the number of independent samples and the total number of observations.

Bad

The mean change in total knee cartilage volume was 0.62 ml.

Good

The mean change in total knee cartilage volume was 0.62 ± 1.3 ml (n = 24).
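The distinction between independent samples and total observations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are hypothetical, the function name `describe` is invented here, and dispersion is reported as the standard deviation of the per-subject means, which is one reasonable choice among several.

```python
from statistics import mean, stdev

# Hypothetical data: three replicate measurements (ml) per subject.
measurements = {
    "subject_1": [0.5, 0.7, 0.6],
    "subject_2": [1.9, 2.1, 2.0],
    "subject_3": [-0.4, -0.2, -0.3],
    "subject_4": [0.1, 0.3, 0.2],
}


def describe(data):
    """Summarise per-subject means so replicates are not counted as independent."""
    subject_means = [mean(values) for values in data.values()]
    return {
        "n_subjects": len(subject_means),                           # independent samples
        "n_observations": sum(len(v) for v in data.values()),       # all replicates
        "mean": mean(subject_means),
        "sd": stdev(subject_means),                                 # sample SD (n - 1)
    }


summary = describe(measurements)
print(
    f"{summary['mean']:.2f} ± {summary['sd']:.2f} ml "
    f"(n = {summary['n_subjects']} subjects, "
    f"{summary['n_observations']} observations)"
)
```

Averaging within each subject first keeps the replicates from inflating the apparent sample size.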

5. Methods

Describe all statistical methods used in a statistics section. Use the original names of the methods. These are not always the same as the names used in software packages.

Bad

We used the independent groups t-test in the group comparison.

Good

We used Satterthwaite's t-test in the group comparison.
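Naming the method precisely matters because the calculation differs from the pooled-variance t-test. The sketch below computes the test statistic and Satterthwaite's approximate degrees of freedom for the unequal-variance t-test (often also called Welch's test). The function name `welch_t` and the two small samples are hypothetical, and the p-value lookup is left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for the unequal-variance t-test with Satterthwaite's df."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two group means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Satterthwaite's approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical samples with clearly different variances.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The degrees of freedom are generally not an integer, which is one visible sign that this is not the pooled-variance test.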

6. Assumptions

The validity of statistical results relies on certain assumptions being fulfilled. Were they?

The man of science has learned to believe in justification, not by faith, but by verification.

Thomas Huxley, 1866

Good

The ANOVA residuals were examined using a normal probability plot, which indicated a Gaussian distribution.

The homogeneity of variance was tested using Levene's test.

The assumption of proportional hazards was investigated using hypothesis tests of Schoenfeld residuals.
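As one example of checking an assumption, the following sketch computes the W statistic of Levene's test for homogeneity of variance, using absolute deviations from each group mean. The function name `levene_w` and the data are hypothetical. Under the null hypothesis, W is compared against an F distribution with k - 1 and N - k degrees of freedom; that lookup is not shown.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W statistic based on absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each observation from its own group mean.
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = mean([v for zi in z for v in zi])
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical samples; the second has twice the spread of the first.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
w = levene_w(group_a, group_b)
print(f"Levene's W = {w:.3f} on {2 - 1} and {10 - 2} degrees of freedom")
```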

7. Significance

A p-value describes the uncertainty in the generalization (the outcome of a hypothesis test), and has no relevance for the observed sample itself.

Distinguish between practical and statistical significance. Clarify what hypotheses are tested.

Bad

There was no difference in mean systolic blood pressure between treated patients (190 mmHg) and controls (135 mmHg) (p = 0.06).

Good

In this study, treated patients had a higher mean systolic blood pressure than controls, 190 vs. 135 mmHg. This observation, even if not statistically significant (p = 0.06), raises concern for future treatment.

8. Confidence

The uncertainty in the generalization of a finding is often better presented by the two limits of a confidence interval, which indicate plausible values, than by a single probability of a false-positive conclusion.

Bad

The reproducibility was high (ICC = 0.91; p < 0.0001).

Good

The reproducibility was high (ICC = 0.91; 95% CI: 0.64–0.98).
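A confidence interval for an ICC needs a dedicated method, so the sketch below shows the idea on a simpler quantity: a 95% confidence interval for a mean. It reuses the summary figures from the example under recommendation 4 (0.62, 1.3, n = 24) and assumes, which the source does not state, that the ± value is a standard deviation. The function name is hypothetical, and the critical value is the tabulated two-sided 95% t quantile for 23 degrees of freedom.

```python
from math import sqrt


def mean_confidence_interval(mean_value, sd, n, t_critical):
    """Return (lower, upper) limits of a t-based confidence interval for a mean."""
    standard_error = sd / sqrt(n)
    margin = t_critical * standard_error
    return mean_value - margin, mean_value + margin


# Tabulated two-sided 95% critical value of the t distribution, 23 df.
T_CRIT_95_DF23 = 2.069

# Assumption: 1.3 is the standard deviation of 24 independent observations.
lower, upper = mean_confidence_interval(0.62, 1.3, 24, T_CRIT_95_DF23)
print(f"95% CI: {lower:.2f} to {upper:.2f} ml")  # prints "95% CI: 0.07 to 1.17 ml"
```

Under that assumption the interval runs from roughly 0.07 to 1.17 ml, which conveys both the direction of the change and how imprecisely it is estimated.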

9. Multiplicity

All departures from the conventional levels of 5% significance and 95% confidence, such as those that result from one-sided tests, Bonferroni corrections, and simultaneous confidence intervals, should be explained and justified.

Bad

We have shown in this randomized trial that patients born under the astrological sign of Gemini benefit more from aspirin treatment than others.

Good

When multiplicity issues were taken into account, we were unable to find any interaction between astrological sign and benefit from aspirin treatment.
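The size of the multiplicity problem in the astrological-sign example can be illustrated with a short calculation. The sketch assumes one test per sign (12 tests) and, for the familywise error rate, that the tests are independent; neither assumption comes from the source, and the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the familywise rate at most alpha."""
    return alpha / n_tests


N_SIGNS = 12  # assumed: one subgroup test per astrological sign
fwer = familywise_error_rate(0.05, N_SIGNS)
threshold = bonferroni_threshold(0.05, N_SIGNS)
print(f"Chance of at least one false positive: {fwer:.2f}")
print(f"Bonferroni-corrected significance level: {threshold:.4f}")
```

With 12 uncorrected tests at the 5% level, the chance of at least one spurious "significant" subgroup is close to 46%, which is why a finding like the Gemini one is unsurprising.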

10. Claims

The level of statistical rigor (precision and addressed uncertainty issues) should be consistent with the author's purpose and conclusions.

What is all this fuss about confidence intervals and clinical significance?

Questions that can be answered using p-values

- Can I be sure that there is an effect?

Questions that can be answered using confidence intervals

- Can I be sure that there is an effect?

- Can I be sure that there isn't an effect?

- What effect is there?

[Figure: an effect axis marked at 0 and at the clinically significant effect. Confidence intervals show both statistical and clinical significance; p-values show statistical significance only (p < 0.05 or n.s.).]
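The three questions a confidence interval can answer can be sketched as a small classification rule. This is an illustration, not a standard procedure: it assumes that larger positive effects are beneficial and that a threshold for a clinically significant effect has been chosen in advance. The function name and the wording of the labels are hypothetical.

```python
def interpret_interval(lower, upper, clinical_threshold):
    """Classify a confidence interval for an effect against 0 and a clinical threshold.

    Assumes positive effects are beneficial and clinical_threshold > 0.
    """
    if lower >= clinical_threshold:
        # Whole interval lies at or above the clinically relevant effect.
        return "statistically and clinically significant"
    if lower > 0 and upper < clinical_threshold:
        # Excludes 0, but every plausible value is too small to matter.
        return "statistically but not clinically significant"
    if lower > 0:
        # Excludes 0; plausible values fall on both sides of the threshold.
        return "statistically significant, clinical significance uncertain"
    if upper < clinical_threshold:
        # Includes 0 (or is negative) and rules out a relevant benefit.
        return "not statistically significant, clinically significant effect ruled out"
    # Includes both 0 and clinically relevant effects.
    return "inconclusive"


for limits in [(2.0, 5.0), (0.2, 0.8), (0.2, 3.0), (-0.5, 0.5), (-0.5, 3.0)]:
    print(limits, "->", interpret_interval(*limits, clinical_threshold=1.0))
```

A p-value alone can separate only the first three cases from the last two; it cannot tell a ruled-out effect from an inconclusive result.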

Statements that should be avoided

- “Statistical difference”
- “Significant difference”
- “There was no difference”
- “ns” and “p > 0.05”
- “p < 0.03”

Thank you for your attention