The use of Prediction Intervals in Meta-Analysis

Nikesh Patel

March 28, 2013


Abstract

Background

Systematic reviews containing meta-analyses of randomised controlled trials provide the best and most reliable information on health care interventions. Meta-analysis combines treatment effects from included studies to produce overall summary results. In a fixed-effect analysis, a common effect is assumed, whereas in a random-effects analysis the model allows for between-study heterogeneity. The goal of analysing heterogeneous studies is not only to report a summary estimate but to explain the observed differences. Whilst a random-effects model remains the gold standard for analysing heterogeneous studies, solely reporting the summary estimate and its 95% confidence interval masks the potential effects of heterogeneity. A 95% prediction interval, which takes into account the full uncertainty surrounding the summary estimate, describes the whole distribution of effects in a random-effects model and the degree of between-study heterogeneity, and conveniently gives a range within which we are 95% sure that the treatment effect in a brand new study lies.

Aims

I aim to apply a 95% prediction interval to a collection of meta-analyses of randomised controlled trials and observe the impact it has on their outcomes. I also aim to apply a 95% prediction interval to meta-epidemiological studies, which assess the influence of trial characteristics on the treatment effect estimates in meta-analyses.

Results

I carried out an empirical review to look at the impact of 95% prediction intervals on existing meta-analyses of randomised controlled trials published in the Lancet. From 26 studies, I extracted 36 meta-analyses containing between three and thirty-four randomised controlled trials (median eight, interquartile range seven) and reproduced each using a random-effects model with a 95% prediction interval. I found 19 (52.8%) had significant 95% confidence intervals, of which 10 (27.8%) had insignificant 95% prediction intervals and 9 (25%) had significant 95% prediction intervals. In addition, 95% prediction intervals were applied to 4 meta-epidemiological studies, revealing extra information concerning their summary estimates.


Conclusion

Every random-effects meta-analysis should include a 95% prediction interval, but for best performance the analysis should include a sufficient number of good quality, unbiased randomised controlled trials. To enhance the quality and robustness of meta-epidemiological studies, a 95% prediction interval should also be included.


Contents

1 Introduction
   1.1 Systematic Review
   1.2 Meta-Analysis
   1.3 Fixed-Effect Meta-Analysis
   1.4 Carrying out a Fixed-Effect Meta-Analysis
   1.5 Heterogeneity
   1.6 Random-Effects Meta-Analysis
   1.7 Carrying out a Random-Effects Meta-Analysis
   1.8 Fixed-Effect v Random-Effects

2 Prediction Interval
   2.1 95% Prediction Interval
   2.2 Calculating a Prediction Interval
   2.3 Discussion

3 Empirical review of the impact of using prediction intervals on existing meta-analyses
   3.1 Introduction
   3.2 Methods
      3.2.1 Search Strategy and Selection Criteria
      3.2.2 Data Calculations
      3.2.3 Software
   3.3 Results
   3.4 Discussion
      3.4.1 Principal Findings
      3.4.2 Limitations
      3.4.3 Comparison with other studies
      3.4.4 Final Remarks and Implications

4 Prediction intervals in Meta-Epidemiological studies
   4.1 Meta-Epidemiological Study
   4.2 Prediction Intervals in Meta-Epidemiological Studies
      4.2.1 Example 1
      4.2.2 Example 2
      4.2.3 Example 3
      4.2.4 Example 4
   4.3 Discussion

5 Final Discussion and Conclusion

A STATA Codes


Chapter 1

Introduction

In health care and medicine, clinicians, researchers and other important figures require quality, accurate information to assist them in making the best possible decisions on health care interventions. Such information is normally found in systematic reviews containing meta-analyses of randomised controlled trials.1 The aim of this paper is to investigate the use of prediction intervals in meta-analysis, a typical statistical component of a systematic review, and how their application can help aid the interpretation of meta-analysis results to a higher degree of quality and accuracy.

1.1 Systematic Review

Since the 1990s, systematic reviews have become very important in medicine and health care. The reasons for this are the sheer volume of medical literature produced annually and the requirement for clinicians and other health care officials to have up to date, quality and accurate information on health care interventions.1

The objective of a systematic review is to present a balanced and impartial summary of all the available research on a well-defined research question.1 It uses systematic and explicit methods to identify, assess, select and synthesise all the evidence that is relevant to answering a well-defined research question in an objective and unbiased manner. Systematic reviews have replaced traditional narrative reviews since the latter do not follow a pre-defined protocol, do not use any kind of rigorous methods and tend to lack transparency, causing bias; a systematic review corrects these issues.2

A systematic review begins by clearly defining a research question of interest; this may include what treatments are being compared, what outcomes are being measured, what the population of interest is, and so on. The next step is to search for studies that are relevant to the research question; this is done by searching all of the published and unpublished information against a well-defined quality search criterion, which can involve searching databases such as MEDLINE, PubMed etc. The studies which pass through the search criterion go through further quality assessment to remove any irrelevant studies. The next step is to extract all the relevant data from the included studies and then carry out a statistical synthesis of the data, which is done using meta-analysis (see Meta-Analysis). The final step is to present all the findings from the analysis as well as analysing any possible heterogeneity between the studies, commenting on the quality of the studies (e.g. bias) and identifying areas of further research.1

Examples of systematic reviews can easily be found on the internet, on the websites of the British Medical Journal (BMJ), the Cochrane Collaboration and many more. These websites dedicate themselves to providing information on health care interventions to the health care and medicine industry. A robust methodology for preparing and producing systematic reviews can be found on these websites, for example The Cochrane handbook for systematic reviews of interventions.3

1.2 Meta-Analysis

“The statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating the findings”

Gene V. Glass's definition of meta-analysis

A meta-analysis is a statistical technique whereby results from the studies included in the analysis are combined to produce an overall summary of the studies. In epidemiology, a typical systematic review of randomised controlled trials will use meta-analysis as its statistical component, whereby treatment effects from individual trials are synthesised with the aim of assessing the clinical effectiveness of healthcare interventions.4 Meta-analysis is based on one of two models, the fixed-effect and the random-effects model. In this chapter, I discuss both models and when each type should be used.

It first seems appropriate to address the reasons why we would want to use a meta-analysis and not the traditional narrative approach. In a narrative approach, the focus tends to be on the p-values of individual studies and observing whether there is a significant effect in each study. Since there is no rigorous way of synthesising p-values, the findings from a narrative approach tend to lack transparency and, in many cases, the researchers may only include studies that support their own opinions, which leads to the results being biased towards those opinions.1;2 A meta-analysis, on the other hand, works directly with the treatment effects of each study and their respective standard errors and performs one single synthesis of all the data to produce an aggregate summary estimate, which I denote as θ̂.4 Since we are combining all the information across the studies, we reduce uncertainty compared to any individual study: we increase the sample size and, in turn, increase the power to detect clinically meaningful results.2


A meta-analysis also addresses the consistency of treatment effects across the studies, something a narrative approach fails to do. If the treatment effects are consistent, then the focus is on the summary estimate and making sure we estimate it as accurately as possible. If the treatment effects are not consistent, then not only should we estimate the summary effect but also explain the differences that exist between the studies.2;5 Treatment effects are generally much more important to clinicians and other health care officials than p-values. The effect size tells us not only whether the treatment effect is better or worse (i.e. greater or less than the null value), but also the magnitude of the effect. Also, p-values can be easily misinterpreted, as some researchers may deem a non-significant p-value to mean the treatment has no effect.2 I later return to the argument of a narrative approach against a meta-analysis when I consider an example (see Example 1).

A vital requirement for a strong meta-analysis is a well-conducted systematic review. If the underlying systematic review isn't carried out under good conduct, the meta-analysis will produce results that may lead to misleading conclusions.1;2 A meta-analysis should also be carried out under good conduct; once again I recommend the Cochrane handbook for systematic reviews on how to conduct a good meta-analysis.3

1.3 Fixed-Effect Meta-Analysis

The first type of meta-analysis I discuss is the fixed-effect meta-analysis. The fixed-effect model assumes that all the studies included in the analysis are estimating the same underlying treatment effect; in other words, we believe the true treatment effect is common across all the studies and each study is estimating that same true treatment effect. The repercussion of this model is that any differences observed between the individual treatment effects are down solely to random sampling error (within-study error). If we had an infinite number of studies with an infinitely large sample size, we would expect the within-study error in each study to tend to zero and the individual treatment effects to be the same as the true common treatment effect.2

In the fixed-effect model, we can express the observed treatment effects in the following way,

Y_k = \theta + \epsilon_k    (1.1)

where Y_k is the observed treatment effect in study k, θ is the common treatment effect and ε_k is the random sampling error in study k. We can assume that the errors follow a normal distribution with mean 0 and variance equal to the variance of the treatment effect in study k, i.e. that ε_k ∼ N(0, Var(Y_k)). Here the errors account for the within-study error in each study since, in the fixed-effect model, we assume this is the only source of variation.2

For the fixed-effect meta-analysis, the aim is to compute the summary estimate θ̂, which is interpreted as the best estimate of the common treatment effect that underlies each of the studies in the analysis, along with a 95% confidence interval.

1.4 Carrying out a Fixed-Effect Meta-Analysis

A general approach to meta-analysis is given by the inverse-variance method; this method works for any type of data as long as we can obtain a treatment effect and its standard error.2 For continuous data, we need a mean difference (or any kind of difference), for survival data we need a log hazard ratio, and for binary outcomes we need a log odds ratio or log relative risk, along with their respective standard errors (standard errors on the log scale for ratios).

In the fixed-effect model, the weight assigned to each study is one over the variance of the study, hence the term inverse-variance method. Studies with smaller variances are assigned larger weights than studies with larger variances.

The fixed-effect inverse-variance weighting is therefore given by

W_k = \frac{1}{Var(Y_k)}.    (1.2)

where Var(Y_k) is the variance of the observed treatment effect in study k.

The formula for θ̂ using a fixed-effect model is given by

\hat{\theta} = \frac{\sum_{k=1}^{n} Y_k W_k}{\sum_{k=1}^{n} W_k}    (1.3)

which has variance given by

Var(\hat{\theta}) = \frac{1}{\sum_{k=1}^{n} W_k}.    (1.4)

Here W_k is the weighting given using the inverse variance in (1.2). I note that θ̂ is the maximum likelihood estimate for θ and it is asymptotically unbiased, efficient and normal.6 I reiterate that θ̂ should be interpreted as the best estimate of the common treatment effect, since the fixed-effect model assumes that each of the studies in the analysis is estimating the same treatment effect.


We also calculate a 95% confidence interval to express our uncertainty around our summary estimate θ̂, assuming that θ̂ is approximately normally distributed, using the following formula

\left[\hat{\theta} \pm 1.96\, s.e.(\hat{\theta})\right].    (1.5)

If we are working on the log scale, i.e. we are using some type of ratio, we must remember to exponentiate θ̂ in (1.3) and the end points of the confidence interval in (1.5). I could also present a 100(1 − α)% confidence interval but, by convention, I am only going to calculate 95% confidence intervals in this paper.

Example 1

Table 1.1, presented below, shows the results from ten randomised controlled trials, each comparing the benefit of an anti-hypertensive treatment, treatment A, against placebo. Each trial is presented with its unbiased estimated mean difference in change in systolic blood pressure (mmHg), its variance and a 95% confidence interval.7

Trial (k)   Y_k     Var_k   95% C.I.
1          -0.49    0.12    [-1.17, 0.19]
2          -0.17    0.05    [-0.61, 0.27]
3          -0.52    0.06    [-1.00, -0.04]
4          -0.48    0.14    [-1.21, 0.25]
5          -0.26    0.06    [-0.74, 0.22]
6          -0.36    0.08    [-0.91, 0.19]
7          -0.47    0.05    [-0.91, -0.03]
8          -0.30    0.02    [-0.58, -0.02]
9          -0.15    0.07    [-0.67, 0.37]
10         -0.28    0.25    [-1.26, 0.70]

Table 1.1: Results of trials comparing treatment A against placebo (a value < 0 represents a reduction in blood pressure and is therefore beneficial)

Using a fixed-effect model, we weight each study using (1.2) and then obtain a summary estimate for the treatment effect along with a 95% confidence interval. Using (1.3), I calculated our summary estimate θ̂ to be -0.33, so we expect treatment A to consistently reduce systolic blood pressure by 0.33 mmHg. Our 95% confidence interval calculated using (1.5) is [-0.48, -0.18]. Since the null value of 0 is not in the 95% confidence interval for θ̂, there is strong evidence at the 5% level that treatment A is effective in reducing systolic blood pressure. The results are presented in a forest plot in figure 1.1.
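To make the inverse-variance arithmetic concrete, here is a minimal Python sketch (my own illustration; the thesis analyses were carried out in STATA) that reproduces the Example 1 calculation from the Table 1.1 data using formulas (1.2) to (1.5).

```python
import numpy as np

# Observed mean differences and their variances from Table 1.1
y = np.array([-0.49, -0.17, -0.52, -0.48, -0.26, -0.36, -0.47, -0.30, -0.15, -0.28])
v = np.array([0.12, 0.05, 0.06, 0.14, 0.06, 0.08, 0.05, 0.02, 0.07, 0.25])

w = 1 / v                              # fixed-effect inverse-variance weights, (1.2)
theta_hat = np.sum(w * y) / np.sum(w)  # summary estimate, (1.3)
var_theta = 1 / np.sum(w)              # variance of the summary estimate, (1.4)
se_theta = np.sqrt(var_theta)

ci_low, ci_high = theta_hat - 1.96 * se_theta, theta_hat + 1.96 * se_theta  # (1.5)
print(f"theta_hat = {theta_hat:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
# Matches the values quoted in the text: theta_hat = -0.33, 95% CI = [-0.48, -0.18]
```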


Figure 1.1: Forest plot of a meta-analysis of randomised controlled trials showing the effects of treatment A on reducing systolic blood pressure (SMD = standardised mean difference)7

On the forest plot, the squares represent the weight that is assigned to the corresponding study, with the centre of the square depicting the observed treatment effect for that study. The 95% confidence interval for each study is represented by the line going through the square, beginning and ending at the end points of the interval. The diamond at the bottom of the forest plot represents the 95% confidence interval of the summary estimate, with its centre representing the summary estimate.

I now return to the argument of using meta-analysis over a narrative approach. If we observe the forest plot in figure 1.1, eight trials have a confidence interval that contains the null value 0 and therefore have insignificant p-values. If we took a narrative approach and considered each study separately, we would most likely conclude that, since 80% of the studies produced insignificant p-values, the treatment isn't beneficial. When we perform a meta-analysis, the 95% confidence interval for the summary estimate doesn't contain the null value and we therefore obtain a significant p-value, since we have increased the power to detect significant results.2


1.5 Heterogeneity

In the fixed-effect model, we assumed that all the studies in the analysis are estimating the same treatment effect and that the only error we allow for is random sampling error (within-study heterogeneity), but is this always a plausible assumption? In general, studies looking at the same treatment may differ in many ways, such as patient characteristics (age, patient health etc), location of study, intervention applied (dosage etc) and many more known and unknown factors, causing the treatment effects across the studies to no longer remain consistent.2 If the treatment effects are no longer consistent, then there exist real differences between the studies (between-study heterogeneity) and the aim of a meta-analysis should be to assess the heterogeneity between the treatment effects as well as to calculate a summary estimate.2;5;8 If we used a fixed-effect method in the presence of between-study heterogeneity, we would be wrongly implying that a common effect exists and hence be led to misleading conclusions about the treatment.

I now discuss ways in which we can assess heterogeneity. Since the total variation is made up of real differences (between-study heterogeneity) and random sampling error (within-study error), we need some tools to help us see whether between-study heterogeneity is present. I first introduce the Q-statistic, on which the Q-test is based. This test is useful if we believe the presence of between-study heterogeneity is causing more variation in the treatment effects than is expected from random sampling error alone.2;9

The Q-test is then defined as follows:

H₀: Y₁ = Y₂ = · · · = Y_n (for all n studies)
H₁: at least one Y_k differs,

where Y_k is the observed treatment effect in study k and W_k is the fixed-effect weighting of study k.

The Q-statistic, which is given by the following formula

Q = \sum_{k=1}^{n} W_k Y_k^2 - \frac{\left(\sum_{k=1}^{n} W_k Y_k\right)^2}{\sum_{k=1}^{n} W_k},    (1.6)

is compared to χ²_{n−1}(α). If we find Q > χ²_{n−1}(α), then we reject the null hypothesis at the 100α% significance level, which suggests that there is evidence of between-study heterogeneity. If Q < χ²_{n−1}(α), then we do not reject the null hypothesis at the 100α% significance level, which suggests there is no evidence of between-study heterogeneity.2;9

Another useful statistic is the I²-statistic; this measures approximately the percentage of total variation that is down to between-study heterogeneity.9 It is given by the following formula:


I^2 = 100\% \times \frac{Q - (n-1)}{Q}    (1.7)

where Q is the Q-statistic worked out using (1.6) and n is the number of studies.

If our I² is 0%, this suggests that all the variability in our summary estimate is down to random sampling error (within-study heterogeneity) and not to between-study variation, and therefore it could make sense to use a fixed-effect model. I² values of 25%, 50% and 75% are considered by Higgins et al. to be low, moderate and high respectively.2;9 If we obtain a negative value for I², the value is set to 0 and interpreted in the same way as 0.

I must stress that both the Q-test and the I²-statistic should be used as tools to help us decide which model to use; the decision shouldn't be based solely on the outcome of the Q-test and the I²-statistic, since they aren't precise. If we consider the Q-test, while a significant p-value suggests that there exists variation in the individual treatment effects, a non-significant p-value doesn't necessarily mean a common effect exists. The lack of significance can be the result of a lack of power. If there are few trials, or we have lots of within-study error as a result of trials having small sample sizes, then even the presence of a large amount of between-study heterogeneity may result in a non-significant p-value.2 If there are few studies, a significance level of 10% is often used because of the lack of power, so a p-value strictly less than 0.1 would be enough to reject the null hypothesis that there is no between-study heterogeneity.

The I²-statistic itself depends on the Q-statistic, therefore if the Q-test lacks power then I² will be imprecise. Also, I² may tell us what proportion of the variation is down to real differences, but what it doesn't tell us is how spread out that variation is. A high value of I² implies that a high proportion of the variation is down to real differences, but this variation may be spread over only a narrow range if the studies have high precision. Conversely, a low I² only implies that a low proportion of the variation is down to real differences; it doesn't imply the effects are grouped together in a narrow range, as they could easily vary over a wide range if the studies lack precision.2 Higgins, in his paper10, talks about the misunderstanding of the I²-statistic and believes it should only be used as a descriptive statistic.

Example 1 (Continued)

I now apply both the Q-test and the I²-statistic to example 1 to see whether conducting a fixed-effect analysis on that example was appropriate. Conducting a Q-test leads to a Q-statistic of 2.490 using (1.6); this is compared to χ²₉(0.10) = 14.684 (we use the 10% level of significance since we only have a few studies). Since our test statistic of 2.490 < 14.684, there is no statistical evidence against H₀ at the 10% level of significance. This suggests that there is no sign of between-study heterogeneity. I also work out the I²-statistic: here our I² value is −261.385% using (1.7), which is set to 0, suggesting that the total variation across the studies is only down to within-study error. If we observe the forest plot in figure 1.1, it's fairly clear that the observed treatment effects do not deviate far from the summary estimate, so using a fixed-effect model seems appropriate and I can regard our summary estimate as the common effect.
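The heterogeneity statistics can be computed the same way. The short Python sketch below (again my own illustration, reusing the Table 1.1 data) evaluates formulas (1.6) and (1.7) and the 10% chi-squared critical value quoted above.

```python
import numpy as np
from scipy import stats

y = np.array([-0.49, -0.17, -0.52, -0.48, -0.26, -0.36, -0.47, -0.30, -0.15, -0.28])
v = np.array([0.12, 0.05, 0.06, 0.14, 0.06, 0.08, 0.05, 0.02, 0.07, 0.25])
w = 1 / v
n = len(y)

# Q-statistic, formula (1.6)
q = np.sum(w * y**2) - np.sum(w * y)**2 / np.sum(w)

# 10% critical value of the chi-squared distribution on n-1 degrees of freedom
crit = stats.chi2.ppf(0.90, df=n - 1)

# I^2-statistic, formula (1.7); negative values are set to 0
i2 = max(0.0, 100 * (q - (n - 1)) / q)

print(f"Q = {q:.3f}, chi2_9(0.10) = {crit:.3f}, I^2 = {i2:.1f}%")
# Matches the text: Q = 2.490 < 14.684, and the negative I^2 is set to 0
```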

If we conclude that between-study heterogeneity is present, we cannot use the fixed-effect model; we instead use the random-effects model, which is discussed in the next section. I briefly discuss two alternatives that try to eradicate all presence of between-study heterogeneity, which can be ideal from a researcher's perspective. The first is sub-group analysis; in this case, a series of fixed-effect meta-analyses is performed, one on each sub-group, where the studies in each group are deemed similar enough to assume a common effect. The problems with sub-group analysis are that each sub-group will contain fewer studies, so we have a loss of power; that instead of carrying out one synthesis we are doing several; and that we still aren't guaranteed that a sufficient amount of between-study heterogeneity will be removed.2 The second option is meta-regression, where the covariates in the model explain the variation in the data and we can obtain the treatment effect for each covariate while adjusting for the others. A problem with this method is that unidentified sources of heterogeneity aren't accounted for.11 A problem inherent in both alternatives is that with few studies neither is useful, since there is a loss of power; i.e. in the case of meta-regression, we have low power to detect which covariates explain heterogeneity.2;11

1.6 Random-Effects Meta-Analysis

The second type of meta-analysis I discuss is the random-effects meta-analysis. This model assumes that the individual treatment effects vary across the studies because of the presence of real differences (between-study heterogeneity) as well as random sampling error. A random-effects model assumes that the true effects of the individual studies come from a distribution of true effects with mean θ and variance equal to the magnitude of the between-study heterogeneity, which I denote by τ² and term the between-study variance (we can usually assume a normal distribution). The repercussion of this model is that if we had an infinite number of studies with an infinitely large sample size, we would expect the random sampling error to tend to zero but expect the individual treatment effects to still differ because of the real differences that exist between them.2;5

In the random-effects model, we can express the observed treatment effects in the following way,


Y_k = \theta + \zeta_k + \epsilon_k    (1.8)

where θ is the average true treatment effect, ε_k is the sampling error in study k and ζ_k is the between-study error in study k. We again assume that ε_k ∼ N(0, Var(Y_k)) and assume that ζ_k ∼ N(0, τ²). Here the errors account for the within-study error and the between-study error since, in the random-effects model, we allow for two sources of variation.2

For the fixed-effect meta-analysis, the aim was to compute the summary estimate θ̂, interpreted as the best estimate of the common treatment effect that underlies each of the studies in the analysis, along with a 95% confidence interval. For the random-effects meta-analysis, computing the summary estimate and its 95% confidence interval alone is insufficient. Since we assume there exist real differences between the treatment effects, the aim of a random-effects meta-analysis is not only to compute the summary estimate but also to explain the differences that exist between the trials and to learn how the individual treatment effects are distributed about the summary estimate.2;5 I note that the summary estimate θ̂ is now interpreted as the average effect.

1.7 Carrying out a Random-Effects Meta-Analysis

To carry out a random-effects meta-analysis, we first need to estimate the between-study variance, since it describes the magnitude of the between-study heterogeneity and has to be incorporated into the calculation of the summary estimate θ̂.

To estimate τ², we use the DerSimonian and Laird method, which provides an unbiased point estimate of τ².12 This is given by the following formula,

\hat{\tau}^2 = \frac{Q - (n-1)}{\sum_{k=1}^{n} W_k - \frac{\sum_{k=1}^{n} W_k^2}{\sum_{k=1}^{n} W_k}}    (1.9)

where Q is the Q-statistic calculated using (1.6) and the W_k are the weights for each study from the fixed-effect meta-analysis calculated using (1.2). I note that should Q < (n − 1), we set τ̂² = 0. If our point estimate of the between-study variance is zero (implying no between-study heterogeneity), then the random-effects model reduces to the fixed-effect model.

Similar to the fixed-effect model, we use the inverse-variance method to weight the individual studies. In the fixed-effect model, since we assume each study is estimating the same common effect, the study with the highest precision is given the largest weighting, since it contains the most information about the true summary effect θ. In a random-effects model, the weighting has to be given more care since each study is no longer estimating the same treatment effect.2 The weighting must now take into account the estimate of the between-study variance τ̂², so the study with the largest precision doesn't have as much influence as it would if a fixed-effect model were assumed. So, in a random-effects model, the weight given to each study is given by

W_k^{*} = \frac{1}{Var(Y_k) + \hat{\tau}^2}.    (1.10)

The formula for θ̂ using a random-effects model is given by

\hat{\theta} = \frac{\sum_{k=1}^{n} Y_k W_k^{*}}{\sum_{k=1}^{n} W_k^{*}}    (1.11)

and has variance

Var(\hat{\theta}) = \frac{1}{\sum_{k=1}^{n} W_k^{*}}.    (1.12)

I reiterate that θ̂ should be interpreted as the average or mean treatment effect and not the common effect, since by using a random-effects model I am assuming that the true effects of the individual studies are distributed about the mean of a distribution of true effects and θ̂ is the estimate of this mean. I also note that the true treatment effect in an individual study could be lower or higher than this average effect.

A 95% confidence interval for θ̂ is given by

\left[\hat{\theta} \pm 1.96\, s.e.(\hat{\theta})\right].    (1.13)
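As an illustrative sketch of how (1.6) and (1.9) to (1.12) fit together, the following Python function (my own, not the STATA routine used later in the paper) performs a DerSimonian and Laird random-effects meta-analysis from a vector of observed effects and their within-study variances; the trailing lines show a toy call on a subset of the Table 1.1 data.

```python
import numpy as np

def random_effects_meta(y, v):
    """DerSimonian-Laird random-effects meta-analysis.

    y : observed treatment effects, v : their within-study variances.
    Returns the summary estimate, its variance and the tau^2 estimate.
    """
    y, v = np.asarray(y, float), np.asarray(v, float)
    n = len(y)
    w = 1 / v                                              # fixed-effect weights, (1.2)
    q = np.sum(w * y**2) - np.sum(w * y)**2 / np.sum(w)    # Q-statistic, (1.6)
    tau2 = max(0.0, (q - (n - 1)) /
               (np.sum(w) - np.sum(w**2) / np.sum(w)))     # DL estimate, (1.9)
    w_star = 1 / (v + tau2)                                # random-effects weights, (1.10)
    theta_hat = np.sum(w_star * y) / np.sum(w_star)        # summary estimate, (1.11)
    var_theta = 1 / np.sum(w_star)                         # its variance, (1.12)
    return theta_hat, var_theta, tau2

# Toy call on a subset of the Table 1.1 data, purely for illustration
theta_hat, var_theta, tau2 = random_effects_meta(
    y=[-0.49, -0.17, -0.52], v=[0.12, 0.05, 0.06])
ci = (theta_hat - 1.96 * np.sqrt(var_theta),
      theta_hat + 1.96 * np.sqrt(var_theta))               # 95% CI, (1.13)
```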

Example 2

Table 1.2, presented below, shows the results from ten randomised trials, each comparing the benefit of another anti-hypertensive treatment, treatment B, against placebo. Each trial is presented with its unbiased estimated mean difference in change in systolic blood pressure (mmHg), its standard error and a 95% confidence interval.7


Trial (k)   Y_k     s.e._k   95% C.I.
1           0.00    0.423    [-0.829, 0.829]
2           0.10    0.219    [-0.329, 0.529]
3          -0.40    0.026    [-0.451, -0.349]
4          -0.80    0.199    [-1.190, -0.410]
5          -0.63    0.301    [-1.220, -0.040]
6           0.22    0.301    [-0.370, 0.810]
7          -0.34    0.071    [-0.480, -0.201]
8          -0.51    0.102    [-0.710, -0.310]
9           0.03    0.122    [-0.209, 0.269]
10         -0.81    0.301    [-1.340, -0.220]

Table 1.2: Results of trials comparing treatment B against placebo (a value < 0 represents a reduction in blood pressure and is therefore beneficial)

I first test for heterogeneity to help us decide what type of meta-analysis we should use. We obtain a Q-statistic of 30.876 > χ²₉(0.10) = 14.684 using (1.6), which suggests evidence of heterogeneity at the 10% level of significance. I also obtained an I² value of 70.85% using (1.7), which suggests that 70.85% of the variation in treatment effects is due to between-study heterogeneity and the rest is due to chance. This is considered a high level of between-study heterogeneity and therefore a random-effects meta-analysis would seem appropriate.

Using the formulas (1.9) through to (1.13), I obtained τ̂² = 0.029 and a summary estimate of -0.33, along with a 95% confidence interval of [-0.48, -0.18]. So on average, treatment B reduces systolic blood pressure by 0.33 mmHg, but in an individual study the treatment effect can vary from this average; since the null value of 0 is not in the 95% confidence interval, there is strong evidence at the 5% level that treatment B is, on average, beneficial. A forest plot of the results from the meta-analysis is shown in figure 1.2.

We can see that, unlike in figure 1.1, there are clear deviations of the individual treatment effects from the summary estimate, so it seems appropriate to assume that each trial is estimating a different treatment effect and to use a random-effects model to account for it.

Figure 1.2: Forest plot of a meta-analysis of randomised controlled trials showing the effects of treatment B on reducing systolic blood pressure (SMD = standardised mean difference)7

1.8 Fixed-Effect v Random-Effects

It is imperative that, when conducting a meta-analysis, the right model is chosen, since it influences how we interpret the results. If we look at examples 1 (figure 1.1) and 2 (figure 1.2), both produce the same summary estimate of -0.33 and have the same 95% confidence interval of [-0.48, -0.18]. Despite these similarities, the ways in which they are interpreted are very different. In example 1, I used a fixed-effect model, which I justified because I believed there was no presence of real differences between the studies, so the summary estimate is the common effect across the studies. In example 2, I decided to use a random-effects model since I believed the variation between the individual treatment effects was down to real differences as well as random sampling error; I therefore regard the summary estimate as the average across the studies, but in an individual study the treatment effect can vary from this average effect.

Despite these differences, there still seems to be some misunderstanding when it comes to choosing which model to use and interpreting the results. Riley et al.7 reviewed 44 Cochrane reviews that wrongly interpreted θ̂ as the common effect rather than the average effect when using a random-effects approach. They also reviewed 31 Cochrane reviews that used a fixed-effect meta-analysis and found that 26 of these had I² values of 25% or more without justifying why a fixed-effect model was used. Using a fixed-effect model in these situations must be justified; otherwise we end up drawing inaccurate conclusions from the results, since we are suggesting there is a single common effect when actually no common treatment effect exists because of real differences amongst the studies. One reason for misinterpretation can be put down to the fact that, if we observe the forest plots for examples 1 and 2, the results are presented in the same way, which causes confusion. Skipka et al.13 point this out and also note that the point estimate of τ² is never displayed on the forest plot.

I have already commented that the choice of model shouldn't be based solely on the Q-test and the I²-statistic, but how should we go about choosing which model to use? Let's say we wish to carry out a meta-analysis on a sufficient number of studies looking at some treatment against placebo. If we know there are a sufficient number of properties that these studies have in common, for example similar age range, similar dosage, similar follow-up time etc, then it would seem appropriate to use a fixed-effect model, since we believe there are negligible real differences between the studies and any factors that do affect the treatment effects are the same across the studies. A common procedure is to carry out a fixed-effect meta-analysis and observe the forest plot to see if the observed treatment effects are similar.2 There are two problems with this: firstly, it isn't clear whether the observed differences are only down to random sampling error; and secondly, if this was the incorrect model, then carrying it out was a waste of time.

If we believe there are real differences, then a random-effects model should be implemented. In this model, each study is expected to be estimating a different treatment effect and the job of this type of meta-analysis is to make sense of the differences between the studies and of how the true individual treatment effects are distributed about the summary estimate.2;5 A clear advantage of a random-effects meta-analysis is that we can generalise our results to a range of populations not included in the analysis, given that the analysis includes a sufficient number of studies; this may be one of the goals of the underlying systematic review.2;5 If we wanted to estimate what the treatment effect would be in a new study, we can draw it from our results as long as we can describe, with adequate precision, how the individual treatment effects are distributed about the summary estimate.5 In a fixed-effect model, we cannot generalise since our results are exclusive to certain properties, for example a particular population.2


Chapter 2

Prediction Interval

In the presence of between-study heterogeneity, the aim of a meta-analysis isn't just to calculate the summary estimate but also to make sense of the heterogeneity. I have already pointed out that methods of eradicating all presence of heterogeneity can be difficult because of unknown sources of heterogeneity, so it seems better to assess heterogeneity rather than try to remove it. Higgins10 believes any amount of heterogeneity is acceptable provided there is a “sound predefined eligibility criteria” and that the “data is correct”, but stresses that a meta-analysis must provide a stern assessment of heterogeneity.

Since a random-effects meta-analysis accounts for unidentified sources of heterogeneity7, I believe it should be the gold standard for analysing heterogeneous data. Unfortunately, once researchers have carried out a random-effects meta-analysis, they tend to focus on the summary estimate and its 95% confidence interval; this however isn't sufficient since, by the assumption of a random-effects model, we allow for real differences between the individual studies.2;7 If we were using a fixed-effect model, then focusing on the summary estimate, which gives the best estimate of the common effect, and its 95% confidence interval, which describes the impact of within-study heterogeneity on the summary estimate, would be adequate. The random-effects summary estimate tells us the average effect across the studies and its 95% confidence interval indicates the region in which we are 95% sure that our estimate lies; neither tells us how the individual treatment effects are distributed about the random-effects summary estimate.5 This leads us to the introduction of the prediction interval, which is discussed in this chapter.


2.1 95% Prediction Interval

A 95% prediction interval gives the range within which we are 95% sure that the potential treatment effect of a brand new individual study lies. The beauty of a prediction interval is that not only does it quantitatively give a range for the treatment effect in a new study, thus allowing researchers, clinicians etc to apply the results to future applications, but it also offers a suitable way to express the full uncertainty around the summary estimate in a way which acknowledges heterogeneity. A prediction interval can also describe how the true individual treatment effects are distributed about the summary estimate.2;5;7;13 For these reasons, the inclusion of a prediction interval in a random-effects meta-analysis can make its conclusions more robust and provide a more complete summary of the results, thereby making the results more relevant to clinical practice.14

The notion of a prediction interval was first proposed by Ades et al.8, who propose a predictive distribution for the treatment effect in a brand new study using a Bayesian approach to meta-analysis. A further push for the prediction interval in meta-analysis is seen in a paper by Higgins et al.5 The authors acknowledge the small amount of attention that has been given to prediction in meta-analysis and present the prediction interval in a classical (frequentist) framework. Higgins et al.10;5 believe that a prediction interval is the most convenient way to present the findings of a random-effects meta-analysis in a way that acknowledges heterogeneity, since it takes into account the full distribution of effects in the analysis.

2.2 Calculating a Prediction Interval

When calculating a prediction interval, we account not only for the between-study and within-study heterogeneity, but also for the uncertainty in the summary estimate θ̂ and the uncertainty in the between-study variance estimate τ̂².2 Suppose we knew the true values of the summary effect θ and the between-study variance τ². If we made the assumption that the treatment effects across the studies are normally distributed, the 95% prediction interval would be given by

\left[\theta \pm 1.96\sqrt{\tau^2}\right].    (2.1)

The problem with (2.1) is that we do not know the exact values of θ and τ²; rather, we are estimating them, and because of this there is uncertainty surrounding these estimates.2 To account for this, we use the following formula provided by Higgins et al.5 for a 95% prediction interval:

\left[\hat{\theta} \pm t_{n-2}^{0.05}\sqrt{\hat{\tau}^2 + Var(\hat{\theta})}\right].    (2.2)

Here, θ̂ is the summary estimate from the random-effects meta-analysis, Var(θ̂) is the variance of the summary estimate, which accounts for the uncertainty in the estimate of θ, τ̂² is the estimate of the between-study variance, and t_{n−2}^{0.05} is the two-sided 5% critical value of the t-distribution with n − 2 degrees of freedom (where n is the number of studies), which accounts for the uncertainty in the estimate of τ².2;5 We require at least three studies to calculate a prediction interval7 and we must also remember to exponentiate the end points of (2.2) if we are working on the log scale.

Example 2 with a Prediction Interval

In example 2, I used a random-effects model and found the summary estimate to be -0.33 mmHg, the between-study variance τ̂² to be 0.029 and the 95% confidence interval for the summary estimate to be [-0.48, -0.18] (see figure 1.2). I can now calculate a prediction interval for example 2 using (2.2): I obtained the interval [-0.76, 0.09]. We notice that the null value of 0 is now in the prediction interval and therefore the result isn't statistically significant at the 5% level. So, in a brand new individual study setting, we are 95% sure that the potential treatment effect for this study will be between -0.76 mmHg and 0.09 mmHg. Although on average the treatment will be beneficial (as indicated by the 95% confidence interval), in a single study setting we cannot rule out that the treatment may actually be harmful (since the 95% prediction interval contains values > 0). The prediction interval therefore acknowledges the impact of heterogeneity that was masked by focusing only on the random-effects summary estimate and its 95% confidence interval. A forest plot for example 2, now including a 95% prediction interval, is given in figure 2.1.
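A minimal Python sketch of formula (2.2), my own illustration using the rounded Example 2 statistics just quoted (θ̂ = -0.33, a standard error recovered from the reported confidence interval, τ̂² = 0.029 and n = 10 trials); because the inputs are rounded it reproduces the interval only approximately.

```python
import numpy as np
from scipy import stats

theta_hat = -0.33          # random-effects summary estimate (Example 2)
se_theta = 0.15 / 1.96     # standard error recovered from the CI half-width of [-0.48, -0.18]
tau2 = 0.029               # DerSimonian-Laird estimate of the between-study variance
n = 10                     # number of trials

# 95% prediction interval, formula (2.2): theta_hat +/- t_{n-2} * sqrt(tau2 + Var(theta_hat))
t_crit = stats.t.ppf(0.975, df=n - 2)          # two-sided 5% critical value, about 2.31
half_width = t_crit * np.sqrt(tau2 + se_theta**2)
print(f"95% PI = [{theta_hat - half_width:.2f}, {theta_hat + half_width:.2f}]")
# Gives approximately [-0.76, 0.10], close to the [-0.76, 0.09] quoted in the text;
# the small difference comes from rounding of the inputs.
```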

The prediction interval is given by the diamond at the bottom of the forest plot in figure 2.1. The centre of the diamond represents the random-effects summary estimate, the width of the diamond represents the 95% confidence interval for the summary estimate and the width of the line going through the diamond represents the 95% prediction interval. Skipka et al.13 discuss different methods that have been proposed for how a prediction interval should be presented in a forest plot. They also suggest that the inclusion of a prediction interval in a forest plot is a good way of distinguishing a random-effects from a fixed-effect forest plot. Throughout this paper, I present 95% prediction intervals in forest plots as in figure 2.1.


Figure 2.1: Forest plot of a meta-analysis of randomised controlled trials showing the effects of treatment B on reducing systolic blood pressure with a 95% prediction interval (SMD = standardised mean difference)7

2.3 Discussion

It is important that I address a few issues that arise when working with a prediction interval. A problem that exists with both the prediction interval and the random-effects meta-analysis itself is when the analysis has few studies. If we have few studies, regardless of how large they are, the prediction interval will be wide because of the lack of precision in the DerSimonian and Laird estimate of τ² (using (1.9)).2;5 If our meta-analysis contains few studies and has substantial between-study heterogeneity, a random-effects meta-analysis remains the correct option, but an alternative approach could be to use a Bayesian approach to estimate τ² instead of the DerSimonian and Laird method, which is sensitive to the number of studies in the analysis. A Bayesian approach uses prior information from outside the studies to calculate an estimate of τ². This approach has the advantage of naturally allowing for the full uncertainty around all the parameters in the model and of incorporating information that may not be considered in a frequentist model. The approach, however, can be difficult to implement and could be prone to bias. I refer to the papers by Higgins et al.5 and Ades et al.8, which provide a more thorough description of the Bayesian approach to prediction intervals.

Another problem that occurs with a small number of studies concerns the validity of the assumption, made when calculating a prediction interval, that the population in a new study is “sufficiently similar” to those in the studies already included in the analysis. In a random-effects meta-analysis, since we allow for real differences, each study will differ in many ways; the more studies we have, the broader the range of populations we cover, thus validating this assumption.5 We also assume that each study has a low risk of bias, i.e. that each study included in the analysis has been carried out under good conduct. If this isn't the case, the prediction interval will inherit heterogeneity caused by these biases.7

Finally, it seems meaningful to make absolutely clear the differences between a random-effects 95% confidence interval and a 95% prediction interval. A 95% confidence interval in a random-effects meta-analysis gives the region in which we are 95% sure that our summary estimate (regarded as the average effect) lies. The width of the confidence interval accounts for the error in the summary estimate, and with an infinite number of infinitely large studies, the end points of the confidence interval will tend to the summary estimate.2 The mistake that is made is to think that the 95% confidence interval from a random-effects meta-analysis measures the extent of heterogeneity, but this is wrong, since it only considers the error in the summary estimate.5 A 95% prediction interval gives the region in which we are 95% sure that the potential treatment effect in a brand new individual study lies. Another way to describe a 95% prediction interval is that we can draw the potential treatment effect, denoted y_new, with 95% certainty from the prediction interval, since the prediction interval describes how the true individual treatment effects are distributed about the summary estimate.5 If we had an infinite number of infinitely large studies, we would expect the width of the prediction interval to reflect the actual variation between the true treatment effects.2 Since the 95% prediction interval accounts for all the uncertainty, it will never be narrower than its corresponding 95% random-effects confidence interval, so we can regard the 95% random-effects confidence interval as a subset of the 95% prediction interval.


Chapter 3

Empirical review of the impact of using prediction intervals on existing meta-analyses

3.1 Introduction

A random-effects meta-analysis should remain the gold standard for analysing heterogeneous studies, but solely presenting the summary estimate from the random-effects meta-analysis and its 95% confidence interval masks the potential effects of heterogeneity.7 The addition of a prediction interval gives a more complete summary of the results from a random-effects meta-analysis in a way that acknowledges heterogeneity, thereby making it easier to apply to clinical practice.5 A 95% prediction interval, with enough studies, can describe the distribution of true treatment effects and therefore gives a range within which we can be 95% sure that the potential treatment effect in a brand new study, y_new, lies.2;5

The aim of this review is to assess the impact of a 95% prediction interval on the outcomes of existing meta-analyses of randomised controlled trials. I want to see whether the inclusion of a 95% prediction interval can help interpret the results of a random-effects meta-analysis to a higher degree of accuracy, and therefore recommend whether or not a random-effects meta-analysis should always include a 95% prediction interval in its analysis.


3.2 Methods

3.2.1 Search Strategy and Selection Criteria

To find the studies for the review, I electronically searched for studies on the Lancet website (www.lancet.com). I used the Lancet since it is one of the oldest and most respected medical journals and has vast amounts of medical literature. I used the advanced search toolbar on the Lancet website with the key words “RANDOMISED TRIAL” and “META ANALYSIS” in the abstract of all research, reviews and seminars in all years in all Lancet journals. The search was carried out on 20/12/2011 and produced 61 studies. For each study, I initially obtained a PDF file of the study plus any supplementary material using ScienceDirect via access through the University of Birmingham student portal.

The eligibility criterion for studies to enter the review was that each study must include at least one meta-analysis of three or more randomised controlled trials on its primary outcomes, as defined by the authors of the study. Of the 61 studies, I reviewed the abstracts to remove any irrelevant studies. I excluded studies that only contained a meta-analysis of non-randomised controlled trials (e.g. observational studies), since I am only interested in meta-analyses of randomised controlled trials, whereby patients are randomly assigned to the treatment or control group. Randomised controlled trials cancel out the effects of known and unknown confounding factors as well as selection bias.2 I also excluded studies that had a meta-analysis of fewer than three randomised controlled trials, which is seen as the minimum number of trials required to calculate a prediction interval.7 In cases where the meta-analysis contained a mixture of randomised and non-randomised controlled trials, I took the meta-analysis of the randomised controlled trials only if the authors had explicitly presented it alongside the overall meta-analysis; if they only presented a meta-analysis covering all randomised and non-randomised trials, the study was excluded. I also excluded any studies that didn't display data by trial. Other reasons for study exclusion were that some of the studies were only randomised controlled trials and not meta-analyses, some studies were informative studies or research papers on meta-analysis, and a couple of studies were network meta-analyses, which were removed since they are potentially more subject to error than typical meta-analyses. I also came across studies that were duplicates, for which I only considered the most recent study.

The flow chart given below in figure 3.1 describes the process. The boxes contain the reasons for excluding studies and the numbers represent the studies that were removed for each reason.


Figure 3.1: Flow chart describing the process of excluding studies for the review

3.2.2 Data Calculations

I had a total of 26 studies that passed my eligibility criteria for the review. From these studies, I extracted 36 meta-analyses containing between three and thirty-four randomised controlled trials. For each meta-analysis, I reproduced the analysis using a random-effects model (using formulas (1.9) to (1.13)) with a 95% prediction interval (using formula (2.2)), as well as calculating the I²-statistic (using formula (1.7)). For 20 of the studies, from which 26 meta-analyses were extracted, I could directly calculate the individual trial treatment effects and their variances (variances on the log scale if the effect size of interest is a ratio). For these, the individual treatment effects are calculated using the following formulas, depending on the relevant outcome of interest.

We define the following:

a = number of events in the treatment group
b = number of events in the control group
N_T = total number of patients in the treatment group
N_C = total number of patients in the control group
c = N_T − a
d = N_C − b

Odds Ratio

The odds ratio for trial k is given by2

Y_k^{OR} = \frac{a \cdot d}{b \cdot c}    (3.1)

and the variance of its logarithm is

Var\left(\ln Y_k^{OR}\right) = \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}.    (3.2)

A 95% confidence interval for the odds ratio in the k-th trial is given by

\exp\left(\left[\ln(Y_k^{OR}) \pm 1.96\sqrt{Var(\ln Y_k^{OR})}\right]\right).    (3.3)

Relative Risk

The relative risk for trial k is given by2

Y_k^{RR} = \frac{a \cdot N_C}{b \cdot N_T}    (3.4)

and the variance of its logarithm is

Var\left(\ln Y_k^{RR}\right) = \frac{1}{a} + \frac{1}{b} - \frac{1}{N_T} - \frac{1}{N_C}.    (3.5)

A 95% confidence interval for the relative risk in the k-th trial is given by

\exp\left(\left[\ln(Y_k^{RR}) \pm 1.96\sqrt{Var(\ln Y_k^{RR})}\right]\right).    (3.6)

Risk Difference

The risk difference for trial k is given by2

Y_k^{RD} = \frac{a}{N_T} - \frac{b}{N_C}    (3.7)

and has variance

Var\left(Y_k^{RD}\right) = \frac{\frac{a}{N_T}\left(1 - \frac{a}{N_T}\right)}{N_T} + \frac{\frac{b}{N_C}\left(1 - \frac{b}{N_C}\right)}{N_C}.    (3.8)

A 95% confidence interval for the risk difference in the k-th trial is given by

\left[Y_k^{RD} \pm 1.96\sqrt{Var(Y_k^{RD})}\right].    (3.9)

Hazard Ratio

To calculate the hazard ratio for the k-th trial, we require the difference between the observed and expected deaths (O − E) and its variance Var(O − E).15 Then

Y_k^{HR} = \exp\left(\frac{O - E}{Var(O - E)}\right)    (3.10)

and the variance of its logarithm is

Var\left(\ln Y_k^{HR}\right) = \frac{1}{Var(O - E)}.    (3.11)

A 95% confidence interval for the hazard ratio in the k-th trial is given by

\exp\left(\left[\ln(Y_k^{HR}) \pm 1.96\sqrt{Var(\ln Y_k^{HR})}\right]\right).    (3.12)
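As a small Python sketch of formulas (3.1) to (3.9) (my own illustration, using invented 2×2 counts rather than data from any reviewed trial), the binary-outcome effect sizes and their variances can be computed as follows.

```python
import numpy as np

def binary_effect_sizes(a, b, n_t, n_c):
    """Effect sizes for one trial with a 2x2 table of binary outcomes.

    a, b     : events in the treatment and control groups
    n_t, n_c : total patients in the treatment and control groups
    Returns the log odds ratio, log relative risk and risk difference with their variances.
    """
    c, d = n_t - a, n_c - b
    log_or = np.log(a * d / (b * c))                     # (3.1) on the log scale
    var_log_or = 1/a + 1/b + 1/c + 1/d                   # (3.2)
    log_rr = np.log((a * n_c) / (b * n_t))               # (3.4) on the log scale
    var_log_rr = 1/a + 1/b - 1/n_t - 1/n_c               # (3.5)
    rd = a / n_t - b / n_c                                # (3.7)
    var_rd = (a/n_t) * (1 - a/n_t) / n_t + (b/n_c) * (1 - b/n_c) / n_c  # (3.8)
    return (log_or, var_log_or), (log_rr, var_log_rr), (rd, var_rd)

# Hypothetical trial: 12/100 events on treatment, 20/100 on control
(log_or, v_or), (log_rr, v_rr), (rd, v_rd) = binary_effect_sizes(12, 20, 100, 100)
or_ci = np.exp([log_or - 1.96 * np.sqrt(v_or), log_or + 1.96 * np.sqrt(v_or)])  # (3.3)
```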


Extra Formulas

For 6 of the studies, from which 10 meta-analyses were extracted, only the individual trial treatment effects and their 95% confidence intervals were reported. For these studies, I couldn't directly calculate the individual trial standard errors, and therefore the standard errors were estimated using the following formulas.

We let x⁻ and x⁺ be the lower and upper bounds respectively of the 95% confidence interval for the treatment effect in trial k. For effect sizes that require us to work on the log scale, i.e. odds ratios, relative risks and hazard ratios, the standard error of the log treatment effect in the k-th trial is calculated using the formula

s.e.\left(\ln Y_k^{OR,RR,HR}\right) = \frac{1}{2}\left(\frac{\ln(x^{+}) - \ln(x^{-})}{1.96}\right).    (3.13)

For differences (for example mean differences or risk differences), the standard error in the k-th trial is calculated using the formula

s.e.\left(Y_k\right) = \frac{1}{2}\left(\frac{x^{+} - x^{-}}{1.96}\right).    (3.14)
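A quick Python sketch of (3.13) and (3.14), again my own illustration with an invented confidence interval, showing how a standard error is backed out of a reported 95% interval.

```python
import numpy as np

def se_from_ci(lower, upper, log_scale=False):
    """Estimate a standard error from a reported 95% confidence interval.

    For ratio measures (odds, relative risk, hazard ratios) the interval is
    first moved to the log scale, as in formula (3.13); otherwise formula (3.14).
    """
    if log_scale:
        lower, upper = np.log(lower), np.log(upper)
    return (upper - lower) / (2 * 1.96)

# Hypothetical reported odds ratio of 0.75 with 95% CI [0.60, 0.94]
se_log_or = se_from_ci(0.60, 0.94, log_scale=True)
```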

3.2.3 Software

I used the statistical software STATA v10.1 to perform a random-effects meta-analysis with a 95% prediction interval for each meta-analysis included in the study. The software incorporates formulas (1.7), (1.9) to (1.13), (2.2) and any of the relevant formulas from (3.1) to (3.12). All forest plots produced in this paper were created using STATA (see Appendix for STATA codes).

3.3 Results

From the 26 studies, I took 36 meta-analyses containing between three and thirty-four randomised controlled trials (median eight trials, interquartile range seven trials) and reproduced each meta-analysis using a random-effects model with a 95% prediction interval. The results of all 36 random-effects meta-analyses with a 95% prediction interval are presented in the table in figure 3.2.


Figure 3.2: Main characteristics of studies included in the review (Note: outcome of interest as defined by the authors; HR = hazard ratio, OR = odds ratio, RD = risk difference, RR = relative risk; θ̂ is the random-effects summary estimate; 95% C.I. = 95% confidence interval; I² is the percentage of heterogeneity down to real differences; τ̂² is the estimate of the between-study variance; 95% P.I. = 95% prediction interval)


I classified each study to the following groups;

1. Their 95% confidence and prediction interval contained their null value

2. Their 95% confidence and prediction interval excluded their null value

3. Their 95% confidence interval excluded the null value but their 95% prediction interval included the null value

For the first type, I found 17 (47.2%) of the meta-analyses had their 95% confidence interval contain their respective null values. For these meta-analyses, the 95% prediction interval will also contain the null value since the 95% confidence interval is a subset of the 95% prediction interval. Focusing on these studies, 6 of them had only three trials, which is the minimum required to calculate a prediction interval. In fact, 11 of these 17 meta-analyses had fewer than ten trials in their analysis, which may explain why their 95% confidence intervals contain their null value, since a random-effects meta-analysis has low power to detect significant results when there are few studies in the analysis.2

In study ID 1530, the meta-analysis contains only three trials (there were originally four trials but no events occurred in one of them, so that trial was discarded from the analysis), yet there is a significant amount of between-study heterogeneity, as indicated by the large I2 value of 49.4% (suggesting that almost half of the variation in treatment effects is down to real differences) and τ̂2 value of 0.3369. The study itself is primarily a randomised controlled trial assessing whether granulocyte-macrophage colony stimulating factor (GM-CSF), administered as prophylaxis to preterm neonates at high risk of neutropenia, reduces sepsis, mortality and morbidity. The authors also carried out a meta-analysis of their trial along with two other published randomised controlled trials to see if there is a treatment benefit. Each trial estimated an odds ratio, with an odds ratio < 1 indicating the treatment is beneficial. The authors used a fixed-effect model, stating “there was no evidence of between-trial heterogeneity”, yet the large τ̂2 and I2 values suggest otherwise, so a random-effects model would be better suited to analyse the data. I obtained a summary estimate of 0.84 (the authors obtained 0.94) and a 95% confidence interval of [0.32,2.17] (the authors obtained [0.55,1.60]). In both cases, the 95% confidence intervals included the null value, so on average there isn't any evidence at the 5% level that the treatment is beneficial. The authors use subgroup analysis to explore the data further, but a prediction interval can explain the results in a way that acknowledges heterogeneity. A 95% prediction interval was calculated to be (0,12655.86]. All the results are presented in a forest plot in figure 3.3.

Figure 3.3: Forest plot showing a meta-analysis of randomised controlled trials of GM-CSF for preventing neonatal infections30

The 95% prediction interval is extremely large in this case. This result occurs because we are using the t-distribution, which accounts for the uncertainty in τ̂2, with few studies, giving a large value of t_{k−2}, as well as accounting for large between-study heterogeneity. When using a random-effects meta-analysis, we make the assumption that each study is estimating a different treatment effect; if we have few studies in the presence of substantial between-study heterogeneity, irrespective of how large they are, we have low power to detect significant results.2;5

Study ID 1732, a meta-analysis of three randomised controlled trials, also has a large 95% prediction interval, given by (0,91064.69], but unlike study ID 1530 has no evidence of between-study heterogeneity, suggested by I2 and τ̂2 values of 0. In this case, the large prediction interval is attributed to the uncertainty in the estimate of τ2 since there are too few trials. In these cases, a Bayesian approach to calculating τ̂2 may work better.5;8 The studies with more than ten trials that had both their 95% confidence and prediction intervals contain the null value tended to have narrower 95% confidence intervals and, apart from study ID 3c18, only slightly include their respective null value.
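To illustrate how much the t-distribution alone widens a prediction interval when there are few trials (an illustrative calculation using standard t-quantiles, not data from the studies above): with k = 3 trials the prediction interval uses t_{1}^{0.975} ≈ 12.71, with k = 5 it uses t_{3}^{0.975} ≈ 3.18, and with k = 12 it uses t_{10}^{0.975} ≈ 2.23, compared with the normal value of 1.96. So even if τ̂2 were estimated to be 0, a prediction interval from only three trials would be more than six times wider than the corresponding normal-based interval around the summary estimate.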

For the second type, 9 (25%) meta-analyses had both their 95% confidence and prediction intervals exclude their respective null value. In these cases, the prediction interval remains significant at the 5% level even after we have considered the whole distribution of effects. Out of these 9 meta-analyses, 7 had I2 and τ̂2 values of 0 (or very close to 0) and 1 other meta-analysis had an I2 value of 6.1% and τ̂2 value of 0.0027. In the case of these 8 meta-analyses, the 95% prediction intervals are only slightly wider than the 95% confidence intervals. In the general case where a prediction interval slightly increases the width of a random-effects confidence interval and I2 and τ̂2 are 0 (suggesting no evidence of between-study heterogeneity), a common effect may be assumed, since the impact of heterogeneity is negligible and the extra width in the prediction interval is only attributable to the uncertainty surrounding the estimate of τ2 (which is 0 or very close to 0 in these cases).

In study ID 11a26, the authors carried out two meta-analyses of individual patient data to investigate the effect of adjuvant chemotherapy in operable non-small-cell lung cancer. The first meta-analysis looked at the effect on survival of surgery and chemotherapy against surgery alone, by type of chemotherapy, and the second at the effect on survival of surgery, radiotherapy and chemotherapy versus surgery and radiotherapy, by type of chemotherapy. Both meta-analyses were extracted for the review but the first is the one of interest. The analysis included thirty-four randomised controlled trials, each estimating a hazard ratio where a hazard ratio < 1 indicates survival is better with surgery and chemotherapy. I calculated I2 and τ̂2 values of 6.1% and 0.0027 respectively (the authors calculated an I2 of 4%), indicating little between-study heterogeneity across the trials despite the trials differing by number of patients, drug used, number of cycles, etc. The authors used a fixed-effect model to analyse the data and used a χ2 test to investigate any differences in treatment effects across the trials. Using a random-effects meta-analysis, I obtained a summary estimate of 0.86 (the authors also obtained 0.86), a 95% confidence interval of [0.80,0.92] (the authors obtained [0.81,0.92]) and a 95% prediction interval of [0.75,0.97]; the results are displayed in figure 3.4.

The summary estimate suggests that on average, survival is better with surgery and chemotherapy compared to surgery alone. The 95% confidence interval doesn't contain the null value and is entirely < 1, so there is strong evidence that on average, survival is better with surgery and chemotherapy. The authors acknowledge this and state, along with their second meta-analysis, “The results showed a clear benefit of chemotherapy with little heterogeneity”, but is this always the case? The 95% prediction interval is also entirely < 1, so now, having considered the whole distribution of effects, we can say that surgery with chemotherapy will improve survival in at least 95% of brand new individual study settings. I point out that the authors' results, using a fixed-effect meta-analysis, were very similar to my results using a random-effects meta-analysis. Furthermore, the 95% prediction interval is only slightly wider than the 95% confidence interval, which indicates that the impact of between-study heterogeneity is small across all the trials and there may be justification for using a fixed-effect model. Despite this, a random-effects model is still useful since it accounts for all uncertainty.5 We've seen already how a prediction interval can be wide (e.g. study ID 1530, study ID 1732) if there is uncertainty in the actual estimates, regardless of whether there is evidence of between-study heterogeneity or not.


Figure 3.4: Forest plot showing a meta-analysis of randomised controlled trials assessing the effect of surgery (S) and chemotherapy (CT) versus surgery alone26

The 1 other meta-analysis that is yet unaccounted for is study ID 3d18. The authors assess the use of recombinant tissue plasminogen activator (rt-PA) for acute ischaemic stroke. They had updated a previous systematic review by adding a new large randomised controlled trial to the analysis. The review contained four meta-analyses, all of which were extracted for the review, but the meta-analysis of interest (study ID 3d) looks at the effect of rt-PA on symptomatic intracranial haemorrhage (SICH) within 7 days in patients who have suffered an acute ischaemic stroke. The analysis included twelve randomised controlled trials, each estimating an odds ratio where an odds ratio < 1 indicates rt-PA reduced the development of SICH. The trials used in this study differed by dosage, final follow-up time, stroke type etc., which has resulted in large I2 and τ̂2 values of 43.4% and 0.2320 respectively. The authors used a standard fixed-effect model and assessed heterogeneity using the χ2-statistic. Given the large values of I2 and τ̂2, the variation in the treatment effects and the differences between the trials, a random-effects meta-analysis seems more appropriate. So, using a random-effects meta-analysis, I obtained a summary estimate of 3.93 (the authors obtained 3.72), a 95% confidence interval of [3.44,6.35] (the authors obtained [2.98,4.64]) and a 95% prediction interval of [1.18,13.10]; the results are displayed in figure 3.5.

Figure 3.5: Forest plot showing a meta-analysis of randomised controlled trials assessing the effect of rt-PA on SICH within 7 days (treatment up to 6 hours)18

The summary estimate suggests that on average, the odds of developing SICH in the treatment group are 3.93 times the odds of developing SICH in the control group. The 95% confidence interval doesn't contain the null value and is entirely > 1, so it provides strong evidence that on average, the treatment is more likely to increase the odds of SICH, but it doesn't indicate whether this will always be the case. The 95% prediction interval is also entirely > 1, suggesting that the treatment will increase the odds of SICH when carried out in at least 95% of brand new individual settings. Like study ID 11a26, the 95% prediction interval remains significant, but unlike study ID 11a, the 95% prediction interval in study ID 3d is much wider than its 95% random-effects confidence interval. Here the impact of between-study heterogeneity is large (in study ID 11a, the impact is low), which can also be seen from the large I2 and τ̂2 values. The impact is such that in some cases the odds of SICH, when rt-PA is given, could be as low as 1.18 times the odds in the control group, but could be as high as 13.1 times the odds in the control group. The authors, by using a fixed-effect method, fail to acknowledge the potential effects of heterogeneity. They report that “42 more patients were alive and independent, 55 more were alive with a favourable outcome at the end of follow up despite an increase in the number of early symptomatic intracranial haemorrhages and early deaths”. Since the odds of SICH in the treatment group could be as high as 13.1 times those in the control group, further research could be carried out to identify scenarios when this may occur, since this could reduce the number of patients that have favourable results come the end of follow up.

For the third type, 10 (27.8%) of the meta-analyses had their 95% confidence interval exclude the null value but their 95% prediction interval include the null. In these cases, the 95% prediction intervals are not significant at the 5% level after we have considered the whole distribution of effects. Most of the studies, apart from two, tended to have a significant amount of between-study heterogeneity, with I2 values ranging from 22.3% to 62.7% and τ̂2 values ranging from 0.022 to 0.098. Two studies had I2 and τ̂2 values of 0. These were study ID 924, which had 3 trials and justifiably used a fixed-effect method, and study ID 16b31, which had 9 trials and used a random-effects meta-analysis; caution is needed in the latter case since there are few trials, which can result in summary estimates carrying large uncertainty.

In study ID 2035, the authors look at the efficacy of probiotics in the prevention of acute diarrhoea. They carried out a meta-analysis of thirty-four randomised controlled trials, each estimating a relative risk with a relative risk < 1 indicating the probiotic has a beneficial effect. The authors used a random-effects meta-analysis, acknowledging the potential effects of heterogeneity since the studies differed in many ways, such as study setting, age group, follow-up duration, probiotic administered, dosage etc., which resulted in a large I2 value of 62.7% and τ̂2 value of 0.0980. I obtained identical results to the authors: a summary estimate of 0.65 and a 95% confidence interval of [0.55,0.78]. Additionally, I obtained a 95% prediction interval of [0.34,1.27]; the results are displayed in figure 3.6.

Figure 3.6: Forest plot of a meta-analysis of randomised controlled trials assessing the effects of probiotics on diarrhoeal morbidity35

The summary estimate of 0.65 indicates that on average, the risk of diarrhoeal morbidity in the probiotic group is 0.65 times the risk in the placebo group. The 95% confidence interval is entirely < 1, providing strong evidence that on average the probiotics are beneficial, but is this always the case? The authors acknowledge heterogeneity first by using a random-effects model and then by carrying out subgroup and stratified analyses, assessing the effect of age, setting of trial, type of diarrhoea, probiotic strains used, formulation of probiotics administered, influence of setting and quality score of trials. A more formal way of acknowledging heterogeneity is to consider a 95% prediction interval, which I calculated to be [0.34,1.27]. This interval now contains the null value and contains values > 1, so although on average the use of probiotics is beneficial, this may not always be the case in a brand new individual setting; in fact, in some cases it may be harmful, and further research is required to identify these scenarios.

In study ID 2338, the authors look at the efficacy and safety of electroconvulsive therapy in depressive disorders. They carried out a meta-analysis of twenty-two randomised controlled trials, each estimating a standardised risk difference where a risk difference > 0 favoured unilateral ECT and a risk difference < 0 favoured bilateral ECT. The authors reported both fixed-effect and random-effects results and acknowledge heterogeneity, since the trials differ by dosage, methods of administration etc.; this can be seen from the I2 value of 24.00% and τ̂2 value of 0.0286. I obtained slightly different results to the authors when using a random-effects meta-analysis: a summary estimate of -0.34 (the authors obtained -0.32) and a 95% confidence interval of [-0.49,-0.20] (the authors obtained [-0.46,-0.19]). I also obtained a 95% prediction interval of [-0.73,0.04]; the results are displayed in figure 3.7.

The summary estimate suggests that on average, out of 100 patients, 34 more had favourable results in the bilateral group compared to the unilateral group. The 95% confidence interval is entirely < 0, providing strong evidence that on average the bilateral group is better, but is this always the case? The authors acknowledge heterogeneity firstly by reporting random-effects results and then by carrying out a meta-regression analysis, but considering a prediction interval would be a more formal way of acknowledging heterogeneity. The 95% prediction interval is [-0.73,0.04], which now contains the null value 0 and slightly exceeds it. This suggests that although on average the bilateral group is better, in a brand new individual study setting the bilateral group may not be better, and further research is required to identify such scenarios.

3.4 Discussion

From the 26 studies that entered my review, 36 meta-analyses were extracted and each was reproduced using a random-effects model with a 95% prediction interval. My aim was to see whether or not these intervals had a significant impact on the conclusions of these studies. Most of the studies that I found reported a summary estimate (fixed or random-effects) along with a 95% confidence interval and carried out some type of analysis to assess heterogeneity. An observation worth noting is that none of the studies published after 2005 mentioned the idea of predictions in the context of meta-analysis.


Figure 3.7: Forest plot of a meta-analysis of randomised controlled trials assessing the effect of bilateral versus unilateral electrode placement on depressive symptoms38

Papers by Ades et al.8 and Higgins et al.5 set the foundations for the use of prediction intervals in traditional and Bayesian meta-analysis and show how presenting one can describe the extent of heterogeneity and how the true individual treatment effects are distributed about the random-effects summary estimate, as well as giving a range within which the true treatment effect in an individual brand new study setting lies.2;5

3.4.1 Principal Findings

I found that 17 (47.2%) of the 36 meta-analyses had their 95% confidence interval contain the null value. In these cases, the average effect across the trials is not significant at the 5% level and the 95% prediction interval will also include the null value. Presenting a 95% prediction interval in these cases is still useful, since it helps describe the distribution of effects across the studies given there is between-study heterogeneity. The other 19 (52.8%) meta-analyses had their 95% confidence interval exclude the null value. In these cases, the average effect is significant at the 5% level, and the aim is to see how many of their 95% prediction intervals now include the null value. I found that 9 of these meta-analyses had their 95% prediction interval exclude the null value whilst the other 10 included the null value. In terms of clinical practice, a prediction interval excluding the null indicates that in at least 95% of the times the treatment is applied in brand new study settings, the treatment will be beneficial/worse, which is much more useful to clinicians than just reporting the average effect and the uncertainty around it. If the prediction interval includes the null, then although the average effect is beneficial/worse, in some brand new individual study settings the effect may be worse/beneficial. Again, this is much more useful to clinicians and researchers since it reveals the impact of heterogeneity and can motivate further research to identify such cases.

Another way of discussing the results is to consider the size of heterogeneity across the meta-analyses. I reiterate that describing heterogeneity is a key motivation for a prediction interval. If heterogeneity wasn't a problem, then we could use a fixed-effect model in all cases, but even the slightest differences between studies must be considered.2 I found 12 meta-analyses had no evidence of between-study heterogeneity (I2 and τ̂2 values of 0); only in two of these cases20;26 did they have more than ten trials. In many of these cases, the authors would tend to use a fixed-effect model, but since there are few studies, we have low power to detect heterogeneity and therefore there may be uncertainty around the I2 and τ̂2 values.2 A common effect should be assumed if there is no evidence of between-study heterogeneity and the 95% confidence and prediction intervals are close, suggesting that the impact of heterogeneity is negligible and the uncertainty around the parameters is low (e.g. study ID 11b26). In some cases, there may seem to be no evidence of heterogeneity, but if there are few studies, the uncertainty around τ̂2 can be large, resulting in wide prediction intervals (e.g. study ID 1732).

The other 24 meta-analyses had evidence of between-study heterogeneity (I2 ranging from 0.30% to 62.90% and τ̂2 ranging from 0.0001 to 0.3369). Whilst the random-effects model wasn't always used in these cases, in most of them the authors did carry out some analysis of heterogeneity (e.g. subgroup analysis, meta-regression etc.). The problem that occurs is that if there are few trials in the analysis, the power to detect sources of heterogeneity is low and therefore the analysis lacks precision.2;11 A prediction interval calculated with few studies will be large (e.g. study ID 1530) and may not be useful from a clinician's point of view, since the range of effects is so wide. On the other hand, in study ID 3d18, the 95% prediction interval is large yet entirely above the null value, so even though there is uncertainty about what the effect could be in an individual study setting, we know that in at least 95% of the times the treatment will have a negative effect (in that case); we just don't know how bad an effect it could be. From a researcher's point of view, large prediction intervals can still have meaning, since they reveal the uncertainty surrounding the parameters and may indicate that more trials, further research or other information (such as incorporating a Bayesian approach5;8) is required, whereas a 95% confidence interval only tells us that the average effect is significant/insignificant, and this result may be imprecise due to the lack of trials.

3.4.2 Limitations

It is important that potential limitations of this review are acknowledged. I decided to only use the Lancet database to search for studies since it is regarded as one of the world's most respected medical journals. I expected each study to be of a high standard in terms of methodology and conduct. Unfortunately, I cannot be sure that this is the case; flaws in procedure at trial level and meta-analysis level can result in error-prone results which may not reflect the true performance of the intervention.42 In these cases, the prediction interval will be wider, since it mixes heterogeneity caused by real differences with heterogeneity that is a result of methodological errors.7 I also only included meta-analyses of randomised controlled trials, since such trials cancel the effects of known and unknown confounding factors. I did come across meta-analyses of non-randomised trials (mainly observational studies) but excluded them since they are more influenced by confounders. Whilst randomised controlled trials are held in higher regard relative to observational studies, the jury remains out on whether we would take randomised trials of low or even average quality over high quality observational studies. Stroup et al.44 report that “inclusion of sufficient detail to allow a reader to replicate meta-analytic methods was the only characteristic related to acceptance for publication”, suggesting that high quality observational studies could be considered. I could have extended the search beyond the Lancet to other databases, but I felt the Lancet already covered a wide variety of studies. There are also technical limitations to the review that must be addressed. Whilst there was a criterion that every meta-analysis must have at least three randomised controlled trials, with few studies the assumptions made when calculating a prediction interval may be violated. We assume a normal distribution, but with few studies this may be an inappropriate choice.5 When considering the true treatment effect of a brand new study, I assume the population in this new study is “sufficiently similar” to those already covered in the analysis. If we have few studies, we fail to cover a sufficient range of populations, resulting in a wider prediction interval accounting for large uncertainty.2;5 I also wasn't specific about what types of outcomes were allowed into the review. There is evidence that suggests certain biases are more likely to arise with subjective outcomes (e.g. favourable outcome (study ID 3d18), poor outcome (study ID 217) or any outcome that requires human input).45 It may have been more prudent to only consider outcomes such as survival, mortality or continuous outcomes that have no chance of being influenced by an external source.

3.4.3 Comparison with other studies

A related study compiled by Graham et al.14 explored prediction intervals in meta-analysis. They performed a meta-epidemiological study of binary outcomes from meta-analyses published between 2002 and 2010. Their study included 72 meta-analyses from 70 studies, each containing between 3 and 80 studies, and for each they carried out a random-effects meta-analysis incorporating the DerSimonian and Laird12 method and calculated traditional and Bayesian 95% prediction intervals for odds ratios and risk ratios. They found that 50 out of 72 meta-analyses had their 95% random-effects confidence interval for odds ratios exclude the null value; of these, 18 had their 95% prediction intervals exclude the null. They also found that 46 out of the 72 meta-analyses had their 95% random-effects confidence interval for risk ratios exclude the null value; of these, 19 had their 95% prediction intervals exclude the null. They concluded “meta-analytic conclusions may be appropriately signaled by consideration of initial interval estimates with prediction intervals” but also stress that increasing heterogeneity can result in wide prediction intervals and caution must be taken when writing conclusions on a meta-analysis.14

Comparing my results to theirs, I found fewer meta-analyses had their 95% prediction interval include the null when their 95% confidence interval had excluded it. Their study was larger than mine and they were also able to directly calculate odds ratios and relative risks for each meta-analysis. I worked with the effect size chosen by the authors of the studies and, in some cases, couldn't directly work out the summary estimate since the relevant data wasn't available; only the individual treatment effects along with their 95% confidence intervals were reported.

3.4.4 Final Remarks and Implications

Perhaps focusing only on cases where prediction intervals include the null when their corresponding 95% confidence intervals didn't may somewhat deviate from why a prediction interval is useful. Since I was able to apply a 95% prediction interval to all cases, whether the analysis had high between-study heterogeneity or none, and whether it had few or many trials, I was able to describe the results of the random-effects meta-analyses more accurately, since the whole distribution of effects is considered, even if what I am deducing is that the authors require more trials or further research/information in cases where there are few studies. In the case where there is no evidence of between-study heterogeneity (indicated by I2 and τ̂2 equal to 0), if we use a random-effects model with a prediction interval and the prediction interval is significantly wider than the random-effects confidence interval, then this suggests there is uncertainty amongst the parameters (e.g. lack of power if there are few studies). If the prediction interval is fairly close to the confidence interval, then this suggests a common effect may exist, since we have considered the whole distribution of effects and the impact of heterogeneity is negligible. If there is evidence of between-study heterogeneity, then a prediction interval can reveal the impact of between-study heterogeneity, which is useful to clinicians and researchers regardless of whether the average effect is significant. I therefore believe a 95% prediction interval should be presented in every random-effects meta-analysis to enhance the interpretation of its results, but I stress the need for the analysis to have a sufficient number of good quality unbiased randomised controlled trials.


Chapter 4

Prediction Intervals in Meta-Epidemiological Studies

It seems widely agreed that systematic reviews which contain a meta-analysis of randomised controlled trials provide the strongest and most reliable evidence of the effects of health care interventions, since they use systematic and explicit methods to summarise all the evidence to answer a research question of interest.1;42;46 Unfortunately, they are not impervious to bias: if the meta-analysis is biased or includes biased trials, the results from the meta-analysis will incorporate these biases, resulting in either an over- or underestimation of the summary treatment effect, which can lead to misleading conclusions about how well the intervention works.42;46

In the process of a systematic review, when the relevant trials are searched for, we must make sure that all of the evidence (published and unpublished) is searched for so that we can get the most accurate results. There is evidence that published studies are more likely to report statistically significant results and larger treatment effects; moreover, published studies are more likely to be used in a systematic review and therefore in a meta-analysis, which can lead to a biased summary treatment effect in a meta-analysis (publication bias).2;47

Furthermore, randomised controlled trials themselves are in danger of bias if there are imperfections in their methodological properties, i.e. there wasn't proper allocation concealment, there was a lack of blinding etc.46 If we were to calculate a prediction interval in the presence of bias, heterogeneity accounting for real differences mixes with heterogeneity caused by these biases, resulting in a much wider prediction interval.7 Other biases that can arise are citation bias, language bias, cost bias etc.2 The fundamental idea here is that bias must be assessed to make the conclusions of a meta-analysis more robust; failure to acknowledge it can result in misleading results.


4.1 Meta-Epidemiological Study

A way in which we can inspect bias is to carry out a meta-epidemiological study, which assesses the influence of trial characteristics on the treatment effect estimates in a meta-analysis.43;42;46 A meta-epidemiological study will assess a specific trial characteristic by carrying out a meta-analysis of summary effects from a collection of meta-analyses (essentially a ‘meta-analysis of meta-analyses’).43;42;46 Like a normal meta-analysis, a meta-epidemiological study should describe the distribution of all evidence, describe any heterogeneity between the meta-analyses, inspect associated risk factors and identify and control bias.

Meta-epidemiology first surfaced in an editorial in the BMJ by David Naylor48 in 1997, where cautions were raised concerning the summary effect of a meta-analysis. The author mentions how meta-analyses can generate “inflated and unduly precise” estimates if biases exist. He also refers to evidence that statistically significant outcomes were more likely to be published than non-significant ones and adds that “readers need to examine any meta-analyses critically to see whether researchers have overlooked important sources of clinical heterogeneity among the included trials”. In 2002, meta-epidemiology was defined by Sterne et al.46 as a statistical method to “identify and quantify the influence of study level characteristics”. In 2007, the method was generalised in a systematic review conducted by OARSI (Osteoarthritis Research Society International).49 This has resulted in many published meta-epidemiological studies, which can be found on the internet, for example on the BMJ website. These types of studies have provided strong evidence that flaws in trial characteristics lead on average to exaggeration of intervention effect estimates and in turn increase heterogeneity.42

4.2 Prediction Intervals in Meta-Epidemiological Studies

The aim of this chapter is to apply a 95% prediction interval to meta-epidemiological studies. Meta-epidemiological studies will use either a fixed-effect or a random-effects model and report a summary estimate with a 95% confidence interval. They still, however, need to describe the extent of heterogeneity that exists across all the evidence, so the inclusion of a prediction interval can help describe it formally.

We searched for meta-epidemiological studies on the website of the British Medical Journal (www.bmj.com). We used the advanced search toolbar with the keyword “META EPIDEMIOLOGICAL” in text, abstract and title in all articles in all years. Any meta-epidemiological study looking at a trial characteristic was eligible as long as we were able to carry out its meta-analysis ourselves. We took 4 studies at random and carried out their meta-epidemiological meta-analysis using a random-effects meta-analysis with a 95% prediction interval, using the formulas (1.9) to (1.13) and (2.2). In all 4 of the examples we use, we estimated the standard errors using formula (3.13) or (3.14), depending on the outcome of interest, since we couldn't work them out directly.
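In computational terms this is the same procedure sketched in section 3.2.3, with the standard errors back-calculated from the published 95% confidence intervals. A minimal, hypothetical usage example (re-using the illustrative `se_from_ci` and `random_effects_with_pi` functions defined earlier; the three estimates and intervals below are placeholder numbers, not values from the studies reviewed in this chapter):

```python
import math

# Reported summary ratios and 95% CIs from the component meta-analyses
# (placeholder values for illustration only).
reported = [(0.72, 0.51, 1.02), (0.88, 0.70, 1.10), (0.65, 0.40, 1.05)]

log_effects = [math.log(est) for est, _, _ in reported]
std_errors = [se_from_ci(lo, hi, log_scale=True) for _, lo, hi in reported]

summary, ci, pi = random_effects_with_pi(log_effects, std_errors)
print([round(math.exp(v), 2) for v in (summary, *ci, *pi)])  # back-transformed
```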

4.2.1 Example 1

A trial characteristic that can influence the estimates of individual trial treatment effects is the status of the study centre, i.e. whether the trial is carried out in a single centre or in multiple centres. Bafeta et al.50 carried out a meta-epidemiological study with the aim of comparing estimates of intervention effects between single centre and multicentre randomised controlled trials with continuous outcomes. They address a previous study that concluded the effects of interventions with binary outcomes are larger in single centre randomised controlled trials compared to multicentre ones,51 and a paper by Bellomo et al.52 who state single centre trials often contradict multicentre trials. The authors included 26 meta-analyses with a total of 292 randomised controlled trials (177 in single centres and 115 in multicentres) with continuous outcomes that were published between January 2007 and January 2010 in the Cochrane database of systematic reviews (which they state as having “high methodological quality”). They ignored meta-analyses of non-randomised trials, IPD meta-analyses, meta-analyses where all trials were only single centre or only multicentre, and any meta-analysis that had fewer than 5 randomised controlled trials. They used the risk of bias tool recommended by the Cochrane Collaboration3 to assess risk of bias from the individual reports for each trial. For each meta-analysis, they used a random-effects meta-analysis incorporating the DerSimonian and Laird estimate of τ2 to combine treatment effects across the trials and assessed heterogeneity using χ2 and I2. The authors then estimated a standardised mean difference between single centre and multicentre trials using a random-effects meta-regression to incorporate potential heterogeneity between trials. They then synthesised these using a random-effects model and used I2 and the Q-test to assess between-meta-analysis heterogeneity. A standardised mean difference < 0 indicates that single centre trials, on average, showed larger treatment effects than multicentre trials. They calculated a summary estimate of -0.09 with a 95% confidence interval of [-0.17,-0.01] and low between-meta-analysis heterogeneity (I2 and τ̂2 values of 0). We obtained the same random-effects summary estimate of -0.09 and the same 95% confidence interval of [-0.17,-0.01]; additionally, we calculated a 95% prediction interval of [-0.18,0.00]. The results are shown in the forest plot below in figure 4.1.

Figure 4.1: Forest plot of a meta-epidemiological analysis assessing the difference in intervention effect estimates between single centre and multicentre randomised controlled trials50

The summary estimate (-0.09) indicates that on average, single centre trials produced a larger estimate of the intervention effect than multicentre trials. Since the 95% confidence interval ([-0.17,-0.01]) is entirely < 0, there is strong evidence that on average, single centre trials show a larger effect than multicentre trials looking at the same intervention, but is this always the case? The authors report “on average single centre trials with continuous outcomes showed slightly larger intervention effects than multicentre” and acknowledge between-meta-analysis heterogeneity and risk of bias by using subgroup and sensitivity analyses, but a 95% prediction interval can describe all the uncertainty more formally. The calculated 95% prediction interval ([-0.18,0.00]) now includes the null value 0 but doesn't exceed it, and is only slightly wider than the 95% random-effects confidence interval, revealing that the impact of heterogeneity is low. We can say, after considering the whole distribution of effects, that in at least 95% of the times the effect in a multicentre trial will not be strictly larger than the corresponding effect in a single centre trial, but we cannot rule out that the effect might be the same. We mirror the authors' view that further research is needed to investigate potential causes of these differences.

4.2.2 Example 2

Nuesch et al.53 carried out a meta-epidemiological study to examine whether or not excluding patients from the analysis of randomised controlled trials is associated with biased estimates of treatment effects and whether or not it causes heterogeneity between trials. They address evidence that departures from protocol and losses to follow-up in randomised controlled trials can lead to exclusion of patients from the final analysis, and such handling of these patients leads to treatment effects that differ systematically from the true treatment effects.54;55 Such bias is termed attrition bias56 or selection bias, and this study aims to see how it affects the summary effects in a meta-analysis and whether it increases between-study heterogeneity. The authors include 14 meta-analyses, with a total of 167 trials (39 with all patients in the analysis, 128 where some patients were excluded). Eligible meta-analyses were those of randomised/quasi-randomised trials in patients with osteoarthritis of the knee or hip that reported a non-binary patient-reported outcome (e.g. pain intensity) and assessed an intervention against placebo or a non-intervention control. If a meta-analysis only included trials that had patient exclusions, or only trials where there were no exclusions, it was ignored. Within each meta-analysis, the authors used a random-effects meta-analysis to calculate a summary effect for trials with and trials without exclusions before deriving the differences between them. A difference < 0 suggests trials with exclusions have a more beneficial treatment effect. These differences were then synthesised using a random-effects meta-analysis which the authors state “fully accounted for variability in bias between meta-analysis”, and they estimate τ2 as a measure of between-study heterogeneity. They obtained a summary estimate of -0.13 with a 95% confidence interval of [-0.29,0.04] and what they consider to be high between-meta-analysis heterogeneity, indicated by a τ̂2 value of 0.07. We obtained the same random-effects summary estimate of -0.13 but a different confidence interval of [-0.31,0.05], noticing an error in the 3rd meta-analysis in the forest plot presented in the paper. We also obtained an I2 value of 78.2% and a slightly larger τ̂2 value of 0.0811, as well as a 95% prediction interval of [-0.78,0.52]. The results are shown in the forest plot below in figure 4.2.

Figure 4.2: Forest plot of a meta-epidemiological analysis assessing the difference in effect sizes between trials with and without exclusions of patients from analysis50

The summary estimate (-0.13) indicates that on average, trials with exclusions produce a larger estimate of the treatment effect compared to those without exclusions. The 95% confidence interval ([-0.31,0.05]) contains the null value, so the average isn't significant (nor is the authors' 95% confidence interval). However, both our and the authors' 95% confidence intervals suggest there is evidence (albeit non-significant at the 5% level) that on average, patient exclusion leads to more beneficial treatment effects. This may have led the authors to report that “excluding patients from the analysis of randomised trials often resulted in biased estimates of treatment effects, but the extent and direction of bias remained unpredictable in a specific situation” and to recommend that “results from intention to treat analysis should always be described in reports of randomised trials”. They acknowledge the large between-meta-analysis heterogeneity by carrying out a stratified analysis, but a 95% prediction interval can reveal the full uncertainty around the summary estimate. The calculated 95% prediction interval ([-0.78,0.52]) is fairly wide, since it accounts for the large between-meta-analysis heterogeneity (indicated by I2 and τ̂2 values of 78.2% and 0.0811 respectively). I can say, after considering the whole distribution of effects, that although on average it seems as though studies with exclusions lead to a more beneficial treatment effect, analyses where the trials have no patient exclusions could quite easily have a more beneficial treatment effect compared to those where there are exclusions. Here the impact of heterogeneity is much more evident than from the 95% confidence interval alone, and it further reveals that in a brand new situation, the chance of a trial with exclusions being better than a trial without exclusions is unpredictable. A possible reason for such unpredictability could be that the analysis had a combined 39 trials without any exclusions compared to a combined 128 trials that did have exclusions. Further research is required but, like the authors, I believe an intention to treat analysis should be reported to account for all patients.

4.2.3 Example 3

Pildal et al.57 carried out a meta-epidemiological study to assess the impact that removing randomised controlled trials without reported adequate allocation concealment has on the conclusions drawn from a meta-analysis. The study also looks at how trials without double blinding affect conclusions, but the analysis of interest concerns reported adequate allocation concealment. There is evidence that without adequate allocation concealment, which conceals what treatment the next patient will receive, selection bias and an exaggerated treatment effect may result.55 They state that “without concealment, the person in charge might channel patients with better prognosis into his or her preferred treatment”. The authors searched for reviews in the Cochrane Library and PubMed and included reviews that contained a meta-analysis of randomised controlled trials that reported a binary outcome, and this had to be their first statistically significant result that supported a conclusion in favour of an intervention. They excluded any non-binary outcome meta-analyses, any that had more than 40 trials and any based on non-randomised trials. For each meta-analysis, they reproduced it using the authors' original method and then redid it using only the trials that had reported adequate allocation concealment. They included 34 meta-analyses, of which 29 (covering 284 trials) went through to the analysis. For each, they estimated a ratio of odds ratios using a univariate random-effects meta-regression, combined these using a random-effects meta-analysis and calculated I2 as a measure of heterogeneity. A ratio of odds ratios < 1 indicates trials with inadequate allocation concealment show a more favourable treatment effect. They calculated a summary estimate of 0.90 with a 95% confidence interval of [0.81,1.01] and an I2 of 0. We obtained the exact same summary estimate and 95% confidence interval; we also retrieved the same I2 value of 0 and calculated τ̂2 to be 0. We also calculated a 95% prediction interval of [0.80,1.02]. The results are shown in the forest plot below in figure 4.3.

Figure 4.3: Forest plot of a meta-epidemiological analysis assessing the ratio of odds ratios of trials with and without adequate allocation concealment57

The summary estimate (0.90) indicates that on average, trials without reported adequate allocation concealment showed a more favourable treatment effect. The 95% confidence interval ([0.81,1.01]) slightly goes over the null value, so the average effect isn't significant at the 5% level, but there does seem to be evidence that inadequate allocation concealment exaggerates the treatment benefit. The authors state “There was a non-significant trend towards seemingly more beneficial effect of the experimental treatment in the trials without reported adequate allocation concealment” and “The impact of reported allocation concealment and double-blinding on the treatment effect estimate is smaller and less consistent than previously thought”. This can be further fortified by a 95% prediction interval, which we calculated to be [0.80,1.02]. The 95% prediction interval is only slightly wider than the 95% confidence interval, revealing the impact of heterogeneity to be low. After considering the whole distribution of effects, whilst on average there is evidence that inadequate allocation concealment does lead to a more beneficial treatment effect (albeit non-significant at the 5% level), there are situations when trials with adequate allocation concealment may produce a more beneficial treatment effect compared to those that have inadequate allocation concealment.


4.2.4 Example 4

Tzoulaki et al.58 carried out a meta-epidemiological study to compare reported effect sizes of cardiovascular biomarkers in datasets from observational studies with those in datasets from randomised controlled trials. The authors state that biomarkers are becoming prominently used as predictors of cardiovascular outcomes but address concerns that reported effect sizes may have been exaggerated since “several biases may inflate the observed association”.59 They point out that evidence testing prognostic associations comes mainly from observational epidemiological studies (cohort studies, case-control studies etc.) but that the same type of information may be available from randomised controlled trials. They report how epidemiological studies differ from randomised controlled trials and how these differences may result in larger treatment effect estimates,60 but remain unsure whether such differences exist when biomarkers, rather than treatment effects, are assessed, hence the motivation for this study. The authors included 31 meta-analyses with a total of 555 studies (472 observational studies and 83 randomised controlled trials), each examining the association between pre-specified eligible biomarkers and an eligible outcome. For each meta-analysis, a random-effects model is used first to combine reported relative risks from all the data and then for randomised controlled trials and observational studies separately. The authors were also recommended to calculate prediction intervals for their summary estimates.7 The authors then calculated a design difference which “represents the difference between datasets from observational studies and from randomised controlled trials as a proportion of the summary effect of each meta-analysis”. A design difference < 0 indicates the effect is stronger in randomised controlled trials. The design differences were then combined using both a fixed-effect and a random-effects meta-analysis along with a 95% prediction interval. They calculated a random-effects summary estimate of 0.24 with a 95% confidence interval of [0.07,0.40], an I2 value of 39% and a 95% prediction interval of [-0.29,0.76]. We obtained the exact same random-effects summary estimate and 95% confidence interval as the authors, an I2 value of 39.1%, a τ̂2 value of 0.0578 and a 95% prediction interval of [-0.28,0.76]. The results are shown in the forest plot below in figure 4.4.

Figure 4.4: Forest plot of a meta-epidemiological analysis assessing the design difference in effect sizes in datasets from observational studies versus those from randomised controlled trial populations58

The summary estimate (0.24) indicates that on average, there was a stronger effect observed in observational studies compared to randomised controlled trials. Since the 95% confidence interval ([0.07,0.40]) is entirely > 0, there is strong evidence that on average, observational studies have more favourable results than randomised controlled trials, but is this always the case? The authors acknowledge what they believe to be “modest” heterogeneity, indicated by an I2 value of 39% (we obtained 39.1%, as well as a τ̂2 value of 0.0578, suggesting moderate to high between-meta-analysis heterogeneity), by carrying out subgroup analysis. They were also recommended to calculate a 95% prediction interval, which they obtained to be [-0.29,0.76] (we obtained [-0.28,0.76]); this is a much more formal way of describing heterogeneity. The 95% prediction interval in both my case and theirs now includes the null value and is quite wide, revealing the impact of heterogeneity. The authors state “typically, observational studies are expected to show larger or even much larger effects than randomised controlled trials, but exceptions can exist where larger effects are seen in the randomised controlled trials”. Since we have both considered the whole distribution of effects, although on average cardiovascular biomarkers seem to have more favourable results in observational studies, we cannot rule out that they could have a better effect in randomised controlled trials, and we can also see that in some settings the effect in observational studies could be far larger than the average suggests.


4.3 Discussion

Meta-epidemiological studies, which assess how trial characteristics may influence treatment effects in a meta-analysis, are becoming more and more prominent in evidence-based clinical practice. Published meta-epidemiological studies that are readily available (for example on the BMJ) provide strong evidence that imperfections in trial protocol lead to biases that may cause over- or underestimation of the true summary effect of the intervention.42 We took 4 meta-epidemiological studies at random and applied a 95% prediction interval. In all cases, the 95% prediction interval added extra important information about the meta-epidemiological summary estimate which may be overlooked if the focus is solely on the summary estimate. Interestingly, only two of the meta-epidemiological studies that we found on the BMJ included a 95% prediction interval,58;61 one of which I reproduced; this came about after a recommendation from Riley et al.7 when reviewing the original study. Zhang43 anticipates that meta-epidemiology will “further evolve to improve evidence based clinical practice”. I believe the inclusion of a 95% prediction interval in a meta-epidemiological study, which takes into account the whole distribution of effects in a way that acknowledges heterogeneity, will further increase the quality and stature of meta-epidemiological studies and lead to more accurate and robust conclusions being drawn from them.


Chapter 5

Final Discussion and Conclusion

The aim of this paper was to investigate the use of prediction intervals in meta-analysis and whether or not their inclusion can help interpret the results of the analysis to a higher degree. The importance of meta-analysis has grown over the last 20 years in medicine and health care, but ongoing research is continuing to try to improve the method. Ades et al.8 and Higgins et al.5 set the scene for the use of predictions in a traditional meta-analysis, and further publications7;13;10;14 provide strong evidence that including a 95% prediction interval will enhance the quality and robustness of conclusions drawn from a meta-analysis. Despite this, recent meta-analyses of randomised controlled trials still do not include a prediction interval in their analysis. My aim was to take a collection of meta-analyses, apply a prediction interval and examine how these intervals affect the outcomes of the analysis.

In chapter 1, I introduced meta-analysis as the statistical component of a systematic review; these remain the best sources of information on health care interventions.1 The first point I made was why we use meta-analysis over traditional narrative reviews. What I am not saying is that narrative reviews should be abolished. If we are trying to compare studies that differ by outcome, effect size or study design, it may be more sensible to have two or more experts write a report rather than use meta-analysis. I discussed both fixed-effect and random-effects meta-analysis in this chapter and provided the formulas to carry out both types, as well as an example for each. The goal of this chapter was to stress the differences between the two types of model. A fixed-effect model assumes a common effect exists amongst the trials included in the analysis. However, when information is extracted from published and unpublished data, trials will differ in many ways, so a common effect is unlikely to exist because of between-study heterogeneity (real differences), a critical theme in this paper. To account for this, a random-effects model is used, which assumes the treatment effects are normally distributed about the summary estimate. The goal of a random-effects model is not only to estimate the summary effect but to explain the differences that exist.2;5 To see if heterogeneity is present, I presented the I2-statistic and the Q-statistic based on the Q-test as tools to help us detect between-study heterogeneity, and the DerSimonian and Laird12 estimate of the between-study variance τ̂2, which is incorporated into the random-effects model. I also noted evidence provided by Riley et al.7 that there is misunderstanding of when to use a fixed-effect or random-effects model as well as misinterpretation of the summary estimate. In my view, a random-effects model can be used without any real justification, but a fixed-effect model requires firm justification since we are being specific in the assumption that a common effect exists.

In chapter 2, we presented the 95% prediction interval as given by Higgins et al.5 A 95% prediction interval describes the whole distribution of effects in a random-effects meta-analysis, describes the degree of heterogeneity since it takes account of all the uncertainty, and gives a range within which we can be 95% sure that the treatment effect in a brand new study lies.2;5;7;13 In this chapter, the goal was not only to introduce the prediction interval but to stress why it is useful clinically. In a random-effects meta-analysis, the summary estimate and its 95% confidence interval only provide inferences about the average effect; since a random-effects model assumes that the individual treatment effects are normally distributed around the summary estimate, we need to describe that distribution, and this motivates the 95% prediction interval. If the 95% prediction interval contains the associated null value, then in some cases the intervention will be beneficial, in others unbeneficial, and in others it may have no effect. If the 95% prediction interval excludes the null value, then when the intervention is applied, in at least 95% of the times the effect will be beneficial/unbeneficial, which is much more useful to clinicians. I also described the differences between a 95% confidence interval and a 95% prediction interval since they may be misinterpreted.

In chapter 3, I carried out an empirical review of the impact of a 95% prediction interval on existing meta-analyses of three or more randomised controlled trials from the Lancet, and discussed the results, limitations and implications. Over a quarter (27.8%) of the meta-analyses which had significant 95% confidence intervals had insignificant 95% prediction intervals. In all these cases, their respective authors generally stated there was evidence that the intervention was beneficial/unbeneficial and, where there was evidence of between-study heterogeneity, they carried out some type of analysis (e.g. subgroup analysis, meta-regression etc.). I pointed out limitations of such types of analysis in chapter 1 and instead used a 95% prediction interval to describe the extent of heterogeneity that is present. In these cases, although on average the treatment effect is beneficial/unbeneficial, we cannot rule out that in some cases the effect may not be. I believe 27.8% is a fairly substantial proportion of meta-analyses which may have to acknowledge the fact that in some cases their conclusions may not be valid. However, this review reveals more than just the above. If a 95% prediction interval is only slightly wider than a 95% confidence interval, there is no evidence of between-study heterogeneity and we have a sufficient number of good quality randomised trials, a common effect may exist (e.g. study ID 11b26). This is a much more powerful way of deducing a common effect since we have considered the whole distribution of effects. Since the 95% prediction interval is only slightly wider, the extra width may be attributed to within-study variation and slight uncertainty in the parameter estimates (given we have sufficient good quality trials). For meta-analyses that had their 95% prediction interval exclude the null (25%), the prediction interval reveals that, after considering the whole distribution of effects, when the treatment is applied in at least 95% of brand new study settings the effect will always be beneficial/unbeneficial. These cases generally had no evidence of between-study heterogeneity (I2 and τ̂2 values of 0 or close to 0), apart from study ID 3d18. If we have few studies, a 95% prediction interval can reveal the impact of uncertainty around the estimates; e.g. study ID 1732 had no evidence of between-study heterogeneity (I2 and τ̂2 values of 0) yet had an extremely large 95% prediction interval attributed to the vast uncertainty in the parameter estimates.

In chapter 3, it is clear to see that with few studies the 95% prediction interval is generally wider, since there is uncertainty in the parameter estimates as well as between-study heterogeneity to account for. Whilst a 95% prediction interval should still be considered, its clinical meaningfulness may become nullified. The problem lies in the DerSimonian and Laird12 estimate of τ̂2, which is sensitive to the number of studies. The better we are able to estimate τ̂2, the more accurate the 95% prediction interval and, moreover, the more accurate the random-effects results are. Brockwell et al.63 and Hardy et al.64 discuss further problems of the DerSimonian and Laird estimate. So far I have restricted myself to only mentioning the Bayesian approach that exists for estimating τ̂2. Graham et al.14 not only calculate traditional prediction intervals in their study but also estimate 95% Bayesian prediction intervals. They state that “Bayesian methods incorporate uncertainty into estimates more readily than frequentist; Bayesian posterior probabilities indicate clinically relevant effects in an immediate and tangible manner.” It would therefore seem wise to recommend this type of approach in the case of few studies. We refer the reader to8;5 for more information. We also refer the reader to Knapp et al.62, who describe other methods for estimating τ̂2.
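
To make the Bayesian alternative concrete, the idea can be sketched as follows (assuming the same normal hierarchical model with prior distributions placed on μ and τ2; the notation is mine rather than taken from Graham et al.14). The Bayesian 95% prediction interval is taken from the posterior predictive distribution of the effect in a new study,

p(\theta_{new} \mid y) = \iint N(\theta_{new} \mid \mu, \tau^2) \, p(\mu, \tau^2 \mid y) \, d\mu \, d\tau^2

so uncertainty in both μ and τ2 is propagated through the posterior rather than plugged in as the single point estimate τ̂2, which is precisely why the approach is attractive when there are few studies.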

In chapter 4, we applied a 95% prediction interval to four meta-epidemiological studies. Meta-epidemiological studies assess the influence of trial characteristics on the treatment effect estimates in a meta-analysis.43;42;46 These types of studies are relatively new in the medical and health care literature, but they have already gained importance since they reveal strong evidence that imperfections in trial protocol lead to biases that result in over- or underestimation of the true summary effect of the intervention.46 They will use a fixed-effect or a random-effects model, report a summary estimate and 95% confidence interval, and use some type of analysis to describe between-meta-analysis heterogeneity. Riley et al.7 have recommended that authors include a prediction interval, and one of the studies we reproduced58 had included a 95% prediction interval; the added benefit of it was clear to see. As in chapter 3, the 95% prediction interval enhanced the interpretation of the results of each of the meta-epidemiological studies we took (see chapter 4). Since the method itself is in its early stages, adding a 95% prediction interval to its analysis would further enhance the method's reputation and make the results easier to apply to clinical practice, since we consider the whole distribution of effects.

In chapter 3, we have seen how the inclusion of a prediction interval can enhance the interpretation of results in a traditional random-effects meta-analysis. In chapter 4, we applied a prediction interval to meta-epidemiological studies, which suggests that we can further expand the uses of prediction intervals to other study designs in epidemiology. It could be possible to apply a prediction interval to meta-regression, since meta-regression will still incorporate τ̂2. A prediction interval could also be applied to meta-analysis of cluster randomised controlled trials, whereby groups of patients are randomised rather than patients being randomised individually. In all these cases, further research is required to assess its applications. I also recommend further research into the Bayesian approach to meta-analysis when there are few studies, or generally when the DerSimonian and Laird estimate of τ̂2 lacks precision. Borenstein et al.2 state that “This is probably the best option, but the problem is that relatively few researchers have expertise in Bayesian meta-analysis.” This makes it fairly clear that when τ̂2 lacks precision, more emphasis must be given to a Bayesian approach and more research is required to educate researchers on the procedures of Bayesian meta-analysis.

In conclusion, I believe that every single random-effects meta-analysis should present a 95% prediction interval in its analysis, but for best performance, the meta-analysis should be of good quality, unbiased randomised controlled trials. This gives the 95% prediction interval the best chance of describing the full uncertainty around the random-effects summary estimate as well as the impact of between-study heterogeneity, and of giving an accurate, clinically meaningful range within which we can be 95% sure that the treatment effect in a brand new study will lie. If we have few studies, I recommend a Bayesian approach; alternatively, whilst a random-effects meta-analysis may still be used, the uncertainty must be addressed in the conclusions. Furthermore, I recommend that every meta-epidemiological study include a 95% prediction interval to further enhance the quality and robustness of its conclusions and to enhance its reputation as a method.


Appendix A

STATA Codes

I used STATA v10.1 to perform a random-effects meta-analysis with a 95% prediction interval on each of the meta-analyses included in the empirical review as well as the meta-epidemiological studies. All forest plots presented in the paper were produced using STATA v10.1. The codes for each figure are presented below.
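
All of the commands below follow the same pattern; a generic template is given here for orientation (the variable names effectsize and stderr are placeholders rather than variables used in this paper, and the xlabel values would be filled in for each figure). In metan, the random option requests the DerSimonian and Laird random-effects analysis and, to the best of my understanding, the rfdist option adds the approximate predictive distribution, and hence the 95% prediction interval, to the forest plot; the remaining options only control the effect label, axis and left/right annotations of the plot.

metan effectsize stderr, random rfdist effect(label) xlabel(...) xtitle(axis title) favours(left label # right label)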

Figure 1.1: metan smd se, effect(SMD) xlabel(-1.5,-1,-0.5,0,0.5,1) xtitle(Mean Difference (Treatment A minus Placebo)) favours(Favours Treatment # Favours Placebo)

Figure 1.2: metan smd se, random effect(SMD) xlabel(-1.5,-1,-0.5,0,0.5,1) xtitle(Mean Difference (Treatment B minus Placebo)) favours(Favours Treatment # Favours Placebo)

Figure 2.1: metan smd se, random rfdist effect(SMD) xlabel(-1.5,-1,-0.5,0,0.5,1) xtitle(Mean Difference (Treatment B minus Placebo)) favours(Favours Treatment # Favours Placebo)

Figure 3.3: metan a b c d, or random rfdist label(namevar=trial) xlabel(0.001,0.01,0.1,1.0,10,100,1000) xtitle(Odds Ratio) favours(Favours Treatment # Favours Control)

Figure 3.4: metan lnhr lnse, random rfdist eform effect(HR) label(namevar=trial) xlabel(0.1,0.2,0.5,1,2,5,10) xtitle(Hazard Ratio) favours(S+CT better # S alone better)

Figure 3.5: metan a b c d, or random rfdist label(namevar=trial) xlabel(0.25,1,2,5,10) xtitle(Odds Ratio) favours(Thrombolysis Decreases # Thrombolysis Increases) boxsca(45)

Figure 3.6: metan a b c d, rr random rfdist label(namevar=trial) xlabel(0.1,0.25,1,2,5,10) xtitle(Relative Risk) favours("Probiotic has" "protective effect" # "Probiotic has" "non-protective effect") boxsca(30)

Figure 3.7: metan rd se, random rfdist effect(RD) label(namevar=trial) xlabel(-2,-1,0,1,2) xtitle(Risk Difference) favours("Favours Bilateral" # "Favours Unilateral") boxsca(30)

Figure 4.1: metan dsmd se, random rfdist effect(Difference in SMD) label(namevar=ma) xlabel(-2,-1.5,-1,-0.5,0,0.5,1,1.5) xtitle(Difference in standardised mean difference) favours("Single centre trials" "show larger effect" # "Multicentre trials" "show larger effect") boxsca(30)

Figure 4.2: metan des se, random rfdist effect(Difference in Effect Size) label(namevar=ma) xlabel(-1.5,-1,-0.5,0,0.5,1,1.5) xtitle(Difference in effect size) favours("Trials with" "exclusions" "more beneficial" # "Trials without" "exclusions" "more beneficial")

Figure 4.3: metan ror se, random rfdist eform effect(Ratio of Odds Ratio) label(namevar=ma) xlabel(0.1,0.2,0.5,1,2,5,10) xtitle("Ratio of" "Odds Ratios") favours("Trials with unclear or inadequate" "concealment show a more favourable" "effect of the experimental treatment") boxsca(30)

Figure 4.4: metan dd se, random rfdist effect(Design Difference) label(namevar=ma) xlabel(-15,-5,-1,0,1,5,15) xtitle(Design Difference) favours("Stronger effect" "in RCTs" # "Stronger effect in" "observational studies") boxsca(30)


Bibliography

[1] Hemingway P, Brereton N. What is a systematic review? Hayward Medical Communications, NPR09/1111, 2009.

[2] Borenstein M, Hedges L.V., Higgins J.P.T., et al. Introduction to Meta-Analysis. Wiley, 2009.

[3] Higgins J.P.T, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration, 2011. Available from www.cochrane-handbook.org.

[4] Crombie I, Davies H. What is meta-analysis? Hayward Medical Communications, NPR09/1112, 2009.

[5] Higgins J.P.T, Thompson S.G, Spiegelhalter D.J. A re-evaluation of random-effects meta-analysis. J R Stat Soc A 2009; 172:137-159.

[6] Lin D.Y, Zeng D. On the relative efficiency of using summary statistics versus individual-level data in meta-analysis. Biometrika 2010; 97(2):321-332.

[7] Riley R.D, Higgins J.P.T, Deeks J.J. Interpretation of random effects meta-analysis. BMJ 2011;342:d549.

[8] Ades A.E, Lu G, Higgins J.P.T. The Interpretation of Random-Effects Meta-Analysis in Decision Models. Med Decis Making 2005;25:646-54.

[9] Higgins J.P.T, Thompson S.G. Quantifying heterogeneity in a meta-analysis. Statist Med 2002; 21:1539-1558.

[10] Higgins J.P.T. Commentary: Heterogeneity in meta-analysis should be expected and appropriately quantified. International Journal of Epidemiology 2008; 37:1158-1160.

[11] Thompson S.G, Higgins J.P.T. How should meta-regression analyses be undertaken and interpreted? Stat Med 2002; 21:1559-74.

[12] DerSimonian R, Laird N. Meta-Analysis in Clinical Trials. Control Clin Trials 1986; 7:177-188.


[13] Guddat C, Grouven U, Skipka G, et al. A note on the graphical presentation of prediction intervals in random-effects meta-analyses. BioMed Central, 2012. doi:10.1186/2046-4053-1-34.

[14] Graham P.L, Moran J.L. Robust meta-analytic conclusions mandate the provision of prediction intervals in meta-analysis summaries. J Clin Epidemiol 2012; 65:503-510.

[15] Pignon J, Hill C. Meta-analyses of randomised clinical trials in oncology. Lancet Oncol 2001; 2: 475-82.

[16] Independent UK Panel on Breast Cancer Screening. The benefits and harms of breast cancer screening: an independent review. Lancet 2012; 380: 1778-86.

[17] Mees S.M.D, Algra A, Vandertop W.P, et al. Magnesium for aneurysmal subarachnoid haemorrhage (MASH-2): a randomised placebo-controlled trial. Lancet 2012; 380: 44-49.

[18] Wardlaw J.M, Murray V, Berge E, et al. Recombinant tissue plasminogen activator for acute ischaemic stroke: an updated systematic review and meta-analysis. Lancet 2012; 379: 2364-72.

[19] Jozwiak M, Rengerink K.O, Benthem M, et al. Foley catheter versus vaginal prostaglandin E2 gel for induction of labour at term (PROBAAT trial): an open-label, randomised controlled trial. Lancet 2011; 378: 2095-103.

[20] Tasina E, Haidich A, Kokkali S, et al. Efficacy and safety of tigecycline for the treatment of infectious diseases: a meta-analysis. Lancet Infect Dis 2011; 11: 834-44.

[21] Holmes M.V, Newcombe P, Hubacel J, et al. Effect modification by population dietary folate on the association between MTHFR genotype, homocysteine, and stroke risk: a meta-analysis of genetic studies and randomised trials. Lancet 2011; 378: 584-94.

[22] Sjoquist K.M, Burmeister B.H, Smithers B.M, et al. Survival after neoadjuvant chemotherapy or chemoradiotherapy for resectable oesophageal carcinoma: an updated meta-analysis. Lancet Oncol 2011; 12: 681-92.

[23] Dondorp A.M, Fanello C, Hendriksen I.C.E, et al. Artesunate versus quinine in the treatment of severe falciparum malaria in African children (AQUAMAT): an open-label, randomised trial. Lancet 2010; 376: 1647-57.

[24] Hopfl M, Selig H.F, Nagele P. Chest-compression-only versus standard cardiopulmonary resuscitation: a meta-analysis. Lancet 2010; 376: 1552-57.

[25] Carotid Stenting Trialists' Collaboration. Short-term outcome after stenting versus endarterectomy for symptomatic carotid stenosis: a preplanned meta-analysis of individual patient data. Lancet 2010; 376: 1062-73.

[26] NSCLC Meta-analyses Collaborative Group. Adjuvant chemotherapy, with or without postoperative radiotherapy, in operable non-small-cell lung cancer: two meta-analyses of individual patient data. Lancet 2010; 375: 1267-77.

[27] Norman J.E, Mackenzie F, Owen P, et al. Progesterone for the prevention of preterm birth in twin pregnancy (STOPPIT): a randomised, double-blind, placebo-controlled study and meta-analysis. Lancet 2009; 373: 2034-40.

[28] Ray K.K, Seshasai S.R.K.S, Wijesuriya S, et al. Effect of intensive control of glucose on cardiovascular outcomes and death in patients with diabetes mellitus: a meta-analysis of randomised controlled trials. Lancet 2009; 373: 1765-72.

[29] Hofmeijer J, Kappelle L.J, Algra A, et al. Surgical decompression for space-occupying cerebral infarction (the Hemicraniectomy After Middle Cerebral Artery infarction with Life-threatening Edema Trial [HAMLET]): a multicentre, open, randomised trial. Lancet Neurol 2009; 8: 326-33.

[30] Carr R, Brocklehurst P, Dare C.J, et al. Granulocyte-macrophage colony stimulating factor administered as prophylaxis for reduction of sepsis in extremely preterm, small for gestational age neonates (the PROGRAMS trial): a single-blind, multicentre, randomised controlled trial. Lancet 2009; 373: 226-33.

[31] Golfinopoulos V, Salanti G, Pavlidis N, et al. Survival and disease-progression benefits with treatment regimens for advanced colorectal cancer: a meta-analysis. Lancet Oncol 2007; 8: 898-911.

[32] Barker A, Maratos E.C, Edmonds L, et al. Recurrence rates of video-assisted thoracoscopic versus open surgery in the prevention of recurrent pneumothoraces: a systematic review of randomised and non-randomised trials. Lancet 2007; 370: 329-35.

[33] Catovsky D, Richards S, Matutes E, et al. Assessment of fludarabine plus cyclophosphamide for patients with chronic lymphocytic leukaemia (the LRF CLL4 Trial): a randomised controlled trial. Lancet 2007; 370: 230-39.

[34] Phromminitkul A, Haas S.J, Elsik M, et al. Mortality and target haemoglobin concentrations in anaemic patients with chronic kidney disease treated with erythropoietin: a meta-analysis. Lancet 2007; 369: 381-88.

[35] Sazawal S, Hiremath G, Dhingra U, et al. Efficacy of probiotics in prevention of acute diarrhoea: a meta-analysis of masked, randomised, placebo-controlled trials. Lancet Infect Dis 2006; 6: 374-82.

[36] The ESPRIT Study Group. Aspirin plus dipyridamole versus aspirin alone after cerebral ischaemia of arterial origin (ESPRIT): randomised controlled trial. Lancet 2006; 367: 1665-73.

[37] Bjelakovic G, Nikolova D, Simonetti R.G, et al. Antioxidant supplements for prevention of gastrointestinal cancers: a systematic review and meta-analysis. Lancet 2004; 364: 1219-28.

[38] The UK ECT Review Group. Efficacy and safety of electroconvulsive therapy in depressive disorders: a systematic review and meta-analysis. Lancet 2003; 361: 799-808.

[39] Shepherd J, Blauw G.J, Murphy M, et al. Pravastatin in elderly individuals at risk of vascular disease (PROSPER): a randomised controlled trial. Lancet 2002; 360: 1623-30.

[40] Mehta S.R, Eikelboom J.W, Yusuf S. Risk of intracranial haemorrhage with bolus versus infusion thrombolytic therapy: a meta-analysis. Lancet 2000; 356: 449-54.

[41] Gueyffier F, Bulpitt C, Boissel J, et al. Antihypertensive drugs in very old people: a subgroup meta-analysis of randomised controlled trials. Lancet 1999; 353: 793-96.

[42] Savovic J, Harris R, Wood L, et al. Development of a combined database for meta-epidemiological research. Res Syn Meth 2010; 1: 212-225.

[43] Zhang W. Abstracts from Invited Speakers 1-01. Osteoarthritis and Cartilage 2010; 18(Suppl 2): S1-S8.

[44] Stroup D.F, Thacker S.B, Olson C.M, et al. Characteristics of meta-analyses related to acceptance for publication in a medical journal. J Clin Epidemiol 2001; 54: 655-660.

[45] Wood L, Egger M, Gluud L.L, et al. Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. BMJ 2008; 336: 601-5.

[46] Sterne J, Juni P, Schulz K, et al. Statistical methods for assessing the influence of study characteristics on treatment effects in 'meta-epidemiological' research. Statist Med 2002; 21: 1513-1524.

[47] Easterbrook P.J, Berlin J.A, Gopalan R, et al. Publication bias in clinical research. Lancet 1991; 337: 867-872.

[48] Naylor D. Meta-analysis and the meta-epidemiology of clinical research. BMJ 1997; 315: 617.

[49] Zhang W, Moskowitz R, Nuki G, et al. OARSI recommendations for the management of hip and knee osteoarthritis, Part I: Critical appraisal of existing treatment guidelines and systematic review of current research evidence. Osteoarthritis and Cartilage 2007; 15: 981-1000.

[50] Bafeta A, Dechartres A, Trinquart L, et al. Impact of single centre status on estimates of intervention effects in trials with continuous outcomes: meta-epidemiological study. BMJ 2012; 344: e813.

[51] Dechartres A, Boutron I, Trinquart L, et al. Single-centre trials show larger treatment effects than multicentre trials: evidence from a meta-epidemiologic study. Ann Intern Med 2011; 155: 39-51.

[52] Bellomo R, Warrillow S.J, Reade M.C. Why we should be wary of single-centre trials. Crit Care Med 2009; 37: 3114-9.

[53] Nuesch E, Trelle S, Reichenbach S, et al. The effects of excluding patients from the analysis in randomised controlled trials: meta-epidemiological study. BMJ 2009; 339: b3244.

[54] Tierney J.F, Stewart L.A, et al. Investigating patient exclusion bias in meta-analysis. Int J Epidemiol 2005; 34: 79-87.

[55] Juni P, Altman D.G, Egger M. Systematic reviews in health care: assessing the quality of controlled clinical trials. BMJ 2001; 323: 42-6.

[56] Juni P, Egger M. Commentary: empirical evidence of attrition bias in clinical trials. Int J Epidemiol 2005; 34: 87-8.

[57] Pildal J, Hrobjartsson A, Jorgensen K, et al. Impact of allocation concealment on conclusions drawn from meta-analysis of randomized trials. International Journal of Epidemiology 2007; 36: 847-857.

[58] Tzoulaki I, Siontis K, Ioannidis J. Prognostic effect size of cardiovascular biomarkers in datasets from observational studies versus randomised trials: meta-epidemiology study. BMJ 2011; 343: d6829.

[59] Tzoulaki I, Liberopoulos G, Ioannidis JP. Use of reclassification for assessment of improved prediction beyond the Framingham risk score. Int J Epidemiol 2011; 40: 1094-105.

[60] Vandenbroucke J.P. When are observational studies as credible as randomised trials? Lancet 2004; 363: 1728-31.

[61] Nuesch E, Trelle S, Reichenbach S, et al. Small study effects in meta-analyses of osteoarthritis trials: meta-epidemiological study. BMJ 2010; 341: c3515.

[62] Knapp G, Biggerstaff B, Hartung J. Assessing the amount of heterogeneity in random-effects meta-analysis. Biom J 2006; 48: 271-285.


[63] Brockwell S.E, Gordon I.R. A comparison of statistical methods for meta-analysis. Stat Med 2001; 20: 825-840.

[64] Hardy R.J, Thompson S.G. A likelihood approach to meta-analysis with random effects. Stat Med 1996; 15: 619-629.
