Systematic Reviews: Methods and Procedures


Systematic Reviews: Methods and Procedures

George A. Wells
Editor, Cochrane Musculoskeletal Review Group
Department of Epidemiology and Community Medicine
University of Ottawa, Ottawa, Ontario, Canada

• Meta-analysis is a statistical analysis of a collection of studies

• Meta-analysis methods focus on contrasting and comparing results from different studies in anticipation of identifying consistent patterns and sources of disagreements among these results

• Primary objective:
  • Synthetic goal (estimation of a summary effect)
  vs
  • Analytic goal (estimation of differences)

Meta-analysis:

• Systematic Review:

– the application of scientific strategies that limit bias to the systematic assembly, critical appraisal and synthesis of all relevant studies on a specific topic

• Meta-Analysis:

– a systematic review that employs statistical methods to combine and summarize the results of several studies

Features of narrative reviews and systematic reviews

                 NARRATIVE                        SYSTEMATIC

QUESTION         Broad                            Focused

SOURCES/SEARCH   Usually unspecified;             Comprehensive; explicit
                 possibly biased

SELECTION        Unspecified; possibly biased     Criterion-based; uniformly applied

APPRAISAL        Variable                         Rigorous

SYNTHESIS        Usually qualitative              Quantitative

INFERENCE        Sometimes evidence-based         Usually evidence-based

Steps of a Cochrane Systematic Review

• Clearly formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

• What is the study objective?
  • to validate results in a large population
  • to guide new studies

• Pose the question in both biologic and health care terms, specifying with operational definitions:
  • population
  • intervention
  • outcomes (both beneficial and harmful)

Inclusion Criteria

• Study design

• Population

• Interventions

• Outcomes

Steps of a Cochrane Systematic Review

• Clearly formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

• Need a well formulated and co-ordinated effort
• Seek guidance from a librarian
• Specify language constraints
• Requirements for comprehensiveness of the search depend on the field and question to be addressed

• Possible sources include:
  • computerized bibliographic databases
  • review articles
  • abstracts
  • conference proceedings
  • dissertations
  • books
  • experts
  • granting agencies
  • trial registries
  • industry
  • journal handsearching

• Procedure:
  • usually begin with searches of bibliographic reports (citation indexes, abstract databases)
  • publications retrieved and the references therein searched for more references
  • as a step toward eliminating publication bias, seek information from unpublished research:
    • databases of unpublished reports
    • clinical research registries
    • clinical trial registries
    • unpublished theses
    • conference indexes

• Published reports (publication bias, ie. the tendency to publish statistically significant results)

Steps of a Cochrane Systematic Review

• Clearly formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

• 2 independent reviewers select studies
• Selection of studies addressing the question posed, based on a priori specification of the population, intervention, outcomes and study design
• Level of agreement: kappa (see the sketch below)
• Differences resolved by consensus
• Specify reasons for rejecting studies

Study Selection
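Kappa is the agreement measure named above for the two independent reviewers. The snippet below is a minimal, illustrative sketch of Cohen's kappa for include/exclude decisions; the counts and the function name are hypothetical, not part of any review software.

```python
# Cohen's kappa for agreement between two reviewers on study inclusion.
# The counts below are hypothetical, for illustration only.

def cohens_kappa(both_include, r1_only, r2_only, both_exclude):
    n = both_include + r1_only + r2_only + both_exclude
    p_observed = (both_include + both_exclude) / n
    # Expected agreement by chance, from each reviewer's marginal inclusion rate
    p_r1_inc = (both_include + r1_only) / n
    p_r2_inc = (both_include + r2_only) / n
    p_expected = p_r1_inc * p_r2_inc + (1 - p_r1_inc) * (1 - p_r2_inc)
    return (p_observed - p_expected) / (1 - p_expected)

print(round(cohens_kappa(40, 5, 7, 148), 2))   # ~0.83 -> good agreement
```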

• 2 independent reviewers extract data using predetermined forms
  – Patient characteristics
  – Study design and methods
  – Study results
  – Methodologic quality
• Level of agreement: kappa
• Differences resolved by consensus

Data Extraction

• Be explicit, unbiased and reproducible
• Include all relevant measures of benefit and harm of the intervention
• Contact investigators of the studies for clarification of published methods etc.
• Extract individual patient data when published data do not answer questions about: intention to treat analyses, time-to-event analyses, subgroups, dose-response relationships

Data Extraction ….

Steps of a Cochrane Systematic Review

• Well formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

• Size of study

• Characteristics of study patients

• Details of specific interventions used

• Details of outcomes assessed

Description of Studies

• Can use as:
  • threshold for inclusion
  • possible explanation for heterogeneity

• Base quality assessments on extent to which bias is minimized

• Make quality assessment scoring systems transparent and parsimonious

• Evaluate reproducibility of quality assessment
• Report quality scoring system used

Methodologic Quality Assessment

Study            Random    Blinding    Dropouts

Adami 1995       +         +           +
Black 1996       ++        +           +
Bone 1997        +         +           --
Chestnut 1995    +         +           +
Hosking 1998     +         --          +
Liberman 1995    +         +           +
McClung 1998     +         +           +

++ indicates that randomization was appropriate (eg. random numbers were computer generated)

Quality Assessment: Example

Steps of a Cochrane Systematic Review

• Well formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

Choice of effect measure by type of outcome:

• Discrete (event) outcome (basic data: event counts)
  Odds Ratio (OR), Relative Risk (RR), Risk Difference (RD)
  → overall estimate via fixed effects or random effects

• Continuous (measured) outcome (basic data: means and SDs)
  Mean Difference (MD), Standardized Mean Difference (SMD)
  → overall estimate via fixed effects or random effects

Effect measures: discrete data

P1 = event rate in experimental group

P2 = event rate in control group

• RD = Risk difference = P2 - P1
• RR = Relative risk = P1 / P2
• RRR = Relative risk reduction = (P2 - P1) / P2
• OR = Odds ratio = [P1/(1-P1)] / [P2/(1-P2)]
• NNT = No. needed to treat = 1 / (P2 - P1)

Example

Experimental event rate = 0.3

Control event rate = 0.4

RD = 0.4 - 0.3 = 0.1

RR = 0.3 / 0.4 = 0.75

RRR = (0.4 - 0.3) / 0.4 = 0.25

OR = (0.3/0.7)/(0.4/0.6) = 0.64

NNT = 1 / (0.4 - 0.3) = 10
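These calculations are simple enough to script. The sketch below reproduces the worked example from the two event rates; the function name is illustrative, not part of any review software.

```python
# Effect measures for discrete outcomes, from the event rates on this slide
# (experimental rate P1 = 0.3, control rate P2 = 0.4).

def effect_measures(p1, p2):
    """p1 = event rate in experimental group, p2 = event rate in control group."""
    return {
        "RD":  p2 - p1,                              # risk difference
        "RR":  p1 / p2,                              # relative risk
        "RRR": (p2 - p1) / p2,                       # relative risk reduction
        "OR":  (p1 / (1 - p1)) / (p2 / (1 - p2)),    # odds ratio
        "NNT": 1 / (p2 - p1),                        # number needed to treat
    }

for name, value in effect_measures(0.3, 0.4).items():
    print(f"{name}: {value:.2f}")   # RD 0.10, RR 0.75, RRR 0.25, OR 0.64, NNT 10.00
```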

Discrete - Odds Ratio (OR)

              Event    No event    Total
Experimental    a         b         ne
Control         c         d         nc

Pe = a/ne    Pc = c/nc    (basic data: a/ne, c/nc)

Odds = (number of patients experiencing event) / (number of patients not experiencing event)

Odds ratio = (odds in experimental group) / (odds in control group)

OR = [Pe/(1-Pe)] / [Pc/(1-Pc)] = ad/bc

Discrete - Odds Ratio Example

              Event    No event    Total
Experimental   13        33         46
Control         7        31         38

Pe = 13/46    Pc = 7/38    (basic data: 13/46, 7/38)

OR = (13 × 31) / (33 × 7) = 1.74

Discrete - Relative Risk (RR)

              Event    No event    Total
Experimental    a         b         ne
Control         c         d         nc

Pe = a/ne    Pc = c/nc    (basic data: a/ne, c/nc)

Risk = (number of patients experiencing event) / (number of patients)

Risk ratio = (risk in experimental group) / (risk in control group)

RR = Pe / Pc = [a/(a+b)] / [c/(c+d)]

Discrete - Relative Risk - Example

              Event    No event    Total
Experimental   13        33         46
Control         7        31         38

Pe = 13/46    Pc = 7/38    (basic data: 13/46, 7/38)

RR = Pe / Pc = (13/46) / (7/38) = 1.53

Discrete - Risk Difference (RD)

              Event    No event    Total
Experimental    a         b         ne
Control         c         d         nc

Pe = a/ne    Pc = c/nc    (basic data: a/ne, c/nc)

Risk = (number of patients experiencing event) / (number of patients)

Risk difference = (risk in experimental group) - (risk in control group)

RD = Pe - Pc = a/(a+b) - c/(c+d)

Discrete - Risk Difference - Example

              Event    No event    Total
Experimental   13        33         46
Control         7        31         38

Pe = 13/46    Pc = 7/38    (basic data: 13/46, 7/38)

RD = Pe - Pc = 13/46 - 7/38 = 0.098

Discrete - Odds Ratio

              Event    No event    Total
Experimental    a         b         ne
Control         c         d         nc

p̂e = a/ne    p̂c = c/nc

Estimator:       o = [p̂e/(1-p̂e)] / [p̂c/(1-p̂c)],    Lo = ln(o)

Standard error:  sLo = [ 1/(ne·p̂e(1-p̂e)) + 1/(nc·p̂c(1-p̂c)) ]^(1/2)

100(1-α)% CI for ln(OR):  Lo ± Zα/2 · sLo

100(1-α)% CI for OR:      exp(Lo ± Zα/2 · sLo)

Discrete - Relative Risk

              Event    No event    Total
Experimental    a         b         ne
Control         c         d         nc

p̂e = a/ne    p̂c = c/nc

Estimator:       r = p̂e / p̂c,    Lr = ln(r)

Standard error:  sLr = [ (1-p̂e)/(ne·p̂e) + (1-p̂c)/(nc·p̂c) ]^(1/2)

100(1-α)% CI for ln(RR):  Lr ± Zα/2 · sLr

100(1-α)% CI for RR:      exp(Lr ± Zα/2 · sLr)

Discrete - Risk Difference

              Event    No event    Total
Experimental    a         b         ne
Control         c         d         nc

p̂e = a/ne    p̂c = c/nc

Estimator:       d = p̂e - p̂c

Standard error:  sd = [ p̂e(1-p̂e)/ne + p̂c(1-p̂c)/nc ]^(1/2)

100(1-α)% CI for RD:  d ± Zα/2 · sd
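Putting the last three slides together, the sketch below computes OR, RR and RD with 95% confidence intervals from a 2x2 table, using the standard-error formulas above (for the odds ratio, 1/(ne·p̂e(1-p̂e)) + 1/(nc·p̂c(1-p̂c)) is the same quantity as 1/a + 1/b + 1/c + 1/d). The counts are those of the worked example; the function name is illustrative.

```python
import math

# OR, RR and RD with 95% CIs from a 2x2 table, using the standard-error
# formulas on the three preceding slides.  Counts are from the worked
# example (13/46 events in the experimental arm, 7/38 in the control arm).

def discrete_measures(a, b, c, d, z=1.96):
    ne, nc = a + b, c + d
    pe, pc = a / ne, c / nc

    # Odds ratio: CI constructed on the log scale, then exponentiated
    or_ = (pe / (1 - pe)) / (pc / (1 - pc))           # = a*d / (b*c)
    se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)
    or_ci = (math.exp(math.log(or_) - z * se_log_or),
             math.exp(math.log(or_) + z * se_log_or))

    # Relative risk: CI also constructed on the log scale
    rr = pe / pc
    se_log_rr = math.sqrt((1 - pe) / (ne * pe) + (1 - pc) / (nc * pc))
    rr_ci = (math.exp(math.log(rr) - z * se_log_rr),
             math.exp(math.log(rr) + z * se_log_rr))

    # Risk difference: CI on the natural scale
    rd = pe - pc
    se_rd = math.sqrt(pe * (1 - pe) / ne + pc * (1 - pc) / nc)
    rd_ci = (rd - z * se_rd, rd + z * se_rd)

    return (or_, or_ci), (rr, rr_ci), (rd, rd_ci)

print(discrete_measures(13, 33, 7, 31))
# OR ~1.74, RR ~1.53, RD ~0.098, each with its 95% CI
```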

When to use OR / RR / RD

Association     OR (0, ∞)    RR (0, ∞)    RD (-1, 1)
'Decreased'     < 1          < 1          < 0
None            1            1            0
'Increased'     > 1          > 1          > 0

OR vs RR

Odds Ratio ≈ Relative Risk if the event occurs infrequently (i.e. a and c small relative to b and d):

    RR = a(c+d) / [(a+b)c] ≈ ad/(bc) = OR

Odds Ratio > Relative Risk if the event occurs frequently

RD vs RR

When interpretation in terms of an absolute difference is better than in relative terms (eg. interest in the absolute reduction in adverse events)

PROPERTIES OF RISK DIFFERENCE (RD),RELATIVE RISK (RR) AND ODDS RATIO (OR)

                                                          RD     RR     OR

Simple measure?                                           Yes    Yes    No
Symmetric (measure unaffected by labelling
  of study groups)?                                       Yes    No     Yes
Predicted event rates restricted to [0,1]
  if measure is assumed constant?                         No     No     Yes
Unbiased estimate available?                              Yes    No     No
Efficient estimation in small samples?                    No     No     Yes
Motivating biological model available?                    Yes    Yes    Yes

Continuous Data - Mean Difference (MD)

              number    mean    standard deviation
Experimental    ne       x̄e           se
Control         nc       x̄c           sc

Mean difference (MD):  x̄e - x̄c

se(x̄e - x̄c) = [ se²/ne + sc²/nc ]^(1/2)

100(1-α)% CI:  (x̄e - x̄c) ± Zα/2 · se(x̄e - x̄c)

Continuous Data - Standardized Mean Difference (SMD)

              number    mean    standard deviation
Experimental    ne       x̄e           se
Control         nc       x̄c           sc

SMD:  d = f · (x̄e - x̄c) / s

where:
  s = [ ((ne-1)se² + (nc-1)sc²) / (ne + nc - 2) ]^(1/2)    (pooled standard deviation)
  f = 1 - 3 / (4(ne + nc) - 9)                             (small-sample correction factor)

se(d) = [ (ne + nc)/(ne·nc) + d²/(2(ne + nc)) ]^(1/2)

100(1-α)% CI:  d ± Zα/2 · se(d)

Mean Difference
• When studies have comparable outcome measures (ie. same scale, probably same length of follow-up)
• A meta-analysis using MDs is known as a weighted mean difference (WMD)

Standardized Mean Difference
• When studies use different outcome measurements which address the same clinical outcome (eg. different scales)
• Converts the scale to a common scale: number of standard deviations

When to use MD / SMD

Example: Combining different scales for Swollen Joint Count

Study        Expt Mean    SD     N     Control Mean    SD     N       MD       SMD

Andersen        6.9        5.2   12        19.4        12.2   12     -12.5    -1.287
Furst          18.0       11.0   17        27.0        15.0   16      -9.0    -0.671
Pinheiro         --         --   --          --          --   --        --        --
Weinblatt      20.0       7.75   15        23.0         8.0   16      -3.0    -0.371
Williams       17.0       12.6   56        25.0        13.4   48      -8.0    -0.612
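As a check on the table above, the sketch below computes the MD and SMD for a single study using the pooled SD and the small-sample correction factor from the SMD slide; it reproduces the Andersen row. The function name is illustrative.

```python
import math

# Mean difference and standardized mean difference for one study, using the
# pooled SD and small-sample correction factor from the SMD slide.
# Numbers reproduce the Andersen row of the swollen-joint-count table.

def md_smd(xe, se_, ne, xc, sc, nc):
    md = xe - xc
    # Pooled standard deviation
    s = math.sqrt(((ne - 1) * se_**2 + (nc - 1) * sc**2) / (ne + nc - 2))
    # Small-sample correction factor
    f = 1 - 3 / (4 * (ne + nc) - 9)
    smd = f * md / s
    return md, smd

md, smd = md_smd(6.9, 5.2, 12, 19.4, 12.2, 12)
print(round(md, 1), round(smd, 3))   # -12.5, -1.287
```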

• “True” inter-study variation may exist (fixed/random-effects model)

• Sampling error may vary among studies (sample size)

• Characteristics may differ among studies (population, intervention)

Sources of Variation over Studies

• Parameter of interest: θ (quantifies the average treatment effect)

• Number of independent studies: k

• Summary Statistic: Yi (i=1,2,…,k)

• Large sample size: asymptotic normal distribution

Fixed-effects model vs Random-effects model

Modelling Variation

Fixed-Effects Model

• Outcome Yi from study i is a sample from a distribution with mean θ (ie. a common mean across studies)

• Yi are independently distributed as N(θ, si²) (i = 1,2,…,k), where si² = Var(Yi) and E(Yi) = θ

Fixed-Effects Model (figure)

Random-Effects Model

• Outcome Yi from study i is a sample from a distribution with mean θi (ie. study-specific means)

• Yi are independently distributed as N(θi, si²) (i = 1,2,…,k), where si² = Var(Yi) and E(Yi) = θi

• θi is a realization from a distribution of 'effects' with mean θ

• θi are independently distributed as N(θ, τ²) (i = 1,2,…,k), where
  • τ² = Var(θi) is the inter-study variation
  • θ is the average treatment effect

Random-Effects Model (figure)

• The distribution of θi conditional on the observed data, θ and τ² is N( Fi·θ + (1-Fi)·Yi , si²(1-Fi) )
• where Fi = si²/(si² + τ²) is the shrinkage factor for the ith study

Random-Effects Model …..

• After averaging over the study-specific effects, the distribution of Yi is N(θ, si² + τ²)
• Although θ is the parameter of interest, τ² must also be considered and estimated

Estimating Average Study Effect

• Study-specific estimate:  θ̂i = Fi·θ + (1-Fi)·Yi ,  with variance si²(1-Fi),  where Fi = si²/(si² + τ²)

Estimating Study-Specific Effects
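A minimal sketch of the shrinkage calculation just described, assuming θ and τ² are already known; in practice both are estimated from the data (for example by the DerSimonian-Laird method later in this deck). All numbers below are invented for illustration.

```python
# Shrinkage of study-specific effects in the random-effects model:
#   theta_i_hat = F_i * theta + (1 - F_i) * Y_i,  F_i = s_i^2 / (s_i^2 + tau^2).
# theta and tau^2 are treated as known here; in practice they are estimated.

def shrunk_estimates(y, s2, theta, tau2):
    estimates = []
    for yi, s2i in zip(y, s2):
        f = s2i / (s2i + tau2)                    # shrinkage factor F_i
        estimates.append(f * theta + (1 - f) * yi)
    return estimates

y  = [0.10, -0.40, 0.65]    # observed study effects Y_i (invented)
s2 = [0.04, 0.09, 0.25]     # within-study variances s_i^2 (invented)
print(shrunk_estimates(y, s2, theta=0.15, tau2=0.05))
# the least precise study (largest s_i^2) is pulled most strongly toward theta
```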

Modelling Variation

• Studies are stratified and then combined to account for differences in sample size and study characteristics

• A weighted average of estimates from each study is calculated

• The question of whether a common or study-specific parameter is to be estimated remains …. Procedure:
  • perform a test of homogeneity
  • if no significant difference, use the fixed-effects model
  • otherwise, identify study characteristics that stratify studies into subsets with homogeneous effects, or use the random-effects model

Fixed Effects Model

• Require from each study: the effect estimate and the standard error of the effect estimate

• Combine these using a weighted average:

      pooled estimate = sum of (estimate × weight) / sum of weights

  where weight = 1 / variance of the estimate

• Assumes a common underlying effect behind every trial

Fixed-Effects Model: General Scheme

Study    Measure    Std Error    Weight
1        Y1         s1           W1
2        Y2         s2           W2
.        .          .            .
.        .          .            .
k        Yk         sk           Wk

(no association: Yi = 0)

Weights:  Wi = 1/si²

Overall measure (MLE):  θ̂ = Σ Wi·Yi / Σ Wi

se(θ̂) = 1 / (Σ Wi)^(1/2)

100(1-α)% CI:  θ̂ ± Zα/2 · se(θ̂)

Chi-Square Tests:

χ²total = Σ(i=1..k) Wi·Yi²                            (df = k)

χ²assoc = (Σ Wi·Yi)² / Σ Wi = θ̂² · Σ Wi              (df = 1)

χ²homog = χ²total - χ²assoc = Σ Wi·(Yi - θ̂)²          (df = k-1)   (Cochran's Q test)

• If χ²assoc is 'large' → association  ( (χ²assoc)^(1/2) ~ N(0,1) under no association )
• If χ²homog is 'large' → heterogeneity
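The general scheme and the chi-square decomposition above translate directly into a few lines of code. The sketch below computes the inverse-variance pooled estimate, its confidence interval and Cochran's Q; the study estimates and standard errors are invented (think of them as log odds ratios).

```python
import math

# Inverse-variance fixed-effects pooling plus Cochran's Q, following the
# general scheme and chi-square decomposition above.  y and se are invented.

def fixed_effects(y, se, z=1.96):
    w = [1 / s**2 for s in se]                               # W_i = 1/s_i^2
    theta = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)    # pooled estimate
    se_theta = 1 / math.sqrt(sum(w))
    ci = (theta - z * se_theta, theta + z * se_theta)
    q = sum(wi * (yi - theta)**2 for wi, yi in zip(w, y))    # Cochran's Q, df = k-1
    return theta, ci, q

y  = [0.20, 0.85, 0.41, 0.95]
se = [0.20, 0.25, 0.15, 0.30]
theta, ci, q = fixed_effects(y, se)
print(round(theta, 2), [round(c, 2) for c in ci], round(q, 2))
# ~0.49, CI ~(0.29, 0.69), Q ~6.8 on 3 df -> some evidence of heterogeneity
```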

Features in Graphic Display

• For each trial
  – estimate (square)
  – 95% confidence interval (CI) (line)
  – size of the square indicates the weight allocated

• Solid vertical line of 'no effect'
  – if the CI crosses this line, the effect is not significant (p > 0.05)

• Horizontal axis
  – arithmetic: RD, MD, SMD
  – logarithmic: OR, RR

• Diamond represents the combined estimate and its 95% CI
• Dashed line plotted vertically through the combined estimate

Odds Ratio

Three methods for combining

(1) Mantel-Haenszel method

(2) Peto’s method

(3) Maximum likelihood method

(Example forest plots for each pooled measure: Peto Odds Ratio, Mantel-Haenszel Odds Ratio, Relative Risk, Risk Difference, Weighted Mean Difference, Standardized Mean Difference)

Heterogeneity

• Define the meaning of heterogeneity for each review
• Define a priori the important degree of heterogeneity (in large data sets trivial heterogeneity may be statistically significant)
• If heterogeneity exists, examine potential sources (differences in study quality, participants, intervention specifics or outcome measurement/definition)

• If heterogeneity exists across studies, consider using random effects model

• If heterogeneity can be explained using a priori hypotheses, consider presenting results by these subgroups

• If heterogeneity cannot be explained, proceed with caution with further statistical aggregation and subgroup analysis

Heterogeneity: How to Identify it

• Common sense

Are the patients, interventions and outcomes in each of the included studies sufficiently similar?

• Exploratory analysis of study-specific estimates

• Statistical tests

Heterogeneity: How to deal with it

Lau et al. 1997

• Subgroup analyses

subsets of trials

subsets of patients

SUBGROUPS SHOULD BE PRE-SPECIFIED TO AVOID BIAS

• Meta-regression

– relate size of effect to characteristics of the trials

Heterogeneity: Exploring it

Exploring Heterogeneity: subgroup analysis (example forest plots)

Random Effects Model

• Assume true effect estimates really vary across studies

• Two sources of variation:
  – within studies (between patients)
  – between studies (heterogeneity)

• What the software does:
  – revise the weights to take into account both components of variation:
      weight = 1 / (variance + heterogeneity)

• When heterogeneity exists we get
  – a different pooled estimate (but not necessarily) with a different interpretation
  – a wider confidence interval
  – a larger p-value

Random Effects Model

If τ² is known, then the MLE of θ is

    θ̂ = Σ Wi(τ²)·Yi / Σ Wi(τ²)    where  Wi(τ²) = (si² + τ²)^(-1)

If τ² is unknown, three common methods of inference can be used:

    Restricted Maximum Likelihood (REML)
    Bayesian
    Method of Moments (MOM)

Method of Moments (Random effects model)

τ̂²w = max( 0 , [ χ²homog - (k-1) ] / [ Σ Wi - Σ Wi² / Σ Wi ] )

Study    Measure    Weight (FE)    Weight (RE)
1        Y1         W1             w1* = (W1^(-1) + τ̂²w)^(-1)
2        Y2         W2             w2* = (W2^(-1) + τ̂²w)^(-1)
.        .          .              .
.        .          .              .
k        Yk         Wk             wk* = (Wk^(-1) + τ̂²w)^(-1)

Overall measure:  θ̂* = Σ wi*·Yi / Σ wi*

se(θ̂*) = 1 / (Σ wi*)^(1/2)

100(1-α)% CI:  θ̂* ± Zα/2 · se(θ̂*)
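A minimal sketch of the method-of-moments (DerSimonian-Laird) procedure above: estimate τ² from Cochran's Q, truncate at zero, then re-weight and re-pool. The inputs are the same invented values as in the fixed-effects sketch, chosen so that τ² > 0 and the random-effects result visibly differs.

```python
import math

# DerSimonian-Laird (method-of-moments) random-effects pooling as above.

def dersimonian_laird(y, se, z=1.96):
    k = len(y)
    w = [1 / s**2 for s in se]                               # fixed-effects weights
    theta_fe = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - theta_fe)**2 for wi, yi in zip(w, y)) # Cochran's Q
    denom = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / denom)                   # truncated at zero

    w_star = [1 / (s**2 + tau2) for s in se]                 # random-effects weights
    theta_re = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_re = 1 / math.sqrt(sum(w_star))
    return tau2, theta_re, (theta_re - z * se_re, theta_re + z * se_re)

y  = [0.20, 0.85, 0.41, 0.95]
se = [0.20, 0.25, 0.15, 0.30]
tau2, theta_re, ci = dersimonian_laird(y, se)
print(round(tau2, 3), round(theta_re, 2), [round(c, 2) for c in ci])
# tau^2 ~0.06; pooled estimate ~0.55 with a wider CI than the fixed-effects one
```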

Effect of model choice on study weights

Larger studies receive proportionally less weight in the RE model than in the FE model

Fixed vs Random Effects: Discrete Data (example forest plots, fixed effects vs random effects)

Fixed vs Random Effects: Continuous Data (example forest plots, fixed effects vs random effects)

Omission of Outlier - Chestnut Study (example forest plot)

Analysis

• Include all relevant and clinically useful measures of treatment effect
• Perform a narrative, qualitative summary when data are too sparse, of too low quality or too heterogeneous to proceed with a meta-analysis
• Specify if a fixed or random effects model is used
• Describe the proportion of patients used in the final analysis
• Use confidence intervals
• Include a power analysis
• Consider cumulative meta-analysis (by order of publication date, baseline risk, study quality) to assess the contribution of successive studies

Steps of a Cochrane Systematic Review

• Well formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

Subgroup Analyses

• Pre-specify hypothesis-testing subgroup analyses and keep few in number

• Label all a posteriori subgroup analyses

• When subgroup differences are detected, interpret in light of whether they are:
  • established a priori

• few in number

• supported by plausible causal mechanisms

• important (qualitative vs quantitative)

• consistent across studies

• statistically significant (adjusted for multiple testing)

Sensitivity Analyses

• Test the robustness of results relative to key features of the studies and key assumptions and decisions

• Include tests of bias due to the retrospective nature of systematic reviews (eg. with/without studies of lower methodologic quality)

• Consider the fragility of results by determining the effect of small shifts in the number of events between groups

• Consider cumulative meta-analysis to explore the relationship between effect size and study quality, control event rates and other relevant features

• Test a reasonable range of values for missing data from studies with uncertain results

Funnel Plot

• Scatterplot of effect estimates against sample size
• Used to detect publication bias
• If no bias, expect a symmetric, inverted funnel
• If bias, expect an asymmetric or skewed shape

(Schematic funnel plot: a gap in one lower corner suggests missing small studies)

Funnel Plot Example 1: Prophylaxis of NSAID induced Gastric Ulcers
(Funnel plot of sample size against effect size (RR); intervention: H2-Blockers)

Funnel Plot Example 2: Alendronate for Postmenopausal Osteoporosis
(Funnel plot of sample size against weighted mean difference; WMD of % change in lumbar bone mineral density)
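For completeness, a minimal matplotlib sketch of a funnel plot in the style of the two examples above; the effect sizes, sample sizes and pooled value are invented purely to show the expected shape.

```python
import matplotlib.pyplot as plt

# Minimal funnel-plot sketch: study effect estimates against sample size,
# with the pooled estimate as a dashed reference line.  All values invented.

effect_sizes = [0.62, 0.75, 0.80, 0.55, 0.95, 0.70, 0.73, 0.78, 0.68, 0.72]
sample_sizes = [40, 60, 80, 55, 45, 150, 300, 450, 600, 700]
pooled = 0.73   # illustrative pooled estimate

plt.scatter(effect_sizes, sample_sizes)
plt.axvline(pooled, linestyle="--")
plt.xlabel("Effect size (e.g. RR)")
plt.ylabel("Sample size")
plt.title("Funnel plot: asymmetry suggests possible publication bias")
plt.show()
```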

Steps of a Cochrane Systematic Review

• Well formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

Presentation of Results

• Include a structured abstract

• Include a table of the key elements of each study

• Include summary data from which the measures are computed

• Employ informative graphic displays representing confidence intervals, group event rates, sample sizes etc.

Interpretation of Results

• Interpret results in the context of current health care
• State methodologic limitations of the studies and the review
• Consider the size of effect in the studies and the review, their consistency, and the presence of a dose-response relationship
• Consider interpreting results in the context of a temporal cumulative meta-analysis
• Interpret results in light of other available evidence
• Make recommendations clear and practical
• Propose a future research agenda (clinical and methodological requirements)

Generic Inferential Framework

Generic inferential framework

(1) Conceptually, think of a 'generic' effect size statistic T
(2) corresponding effect size parameter θ
(3) associated standard error SE(T), the square root of the variance
(4) for some effect sizes, a suitable transformation may be needed to make inference based on normal distribution theory

Generic inferential framework ...

(A) Fixed-Effects Model (FEM):
  – Assume a common effect size
  – Obtain the average effect size as a weighted mean (unbiased)
  – Optimal weight is the reciprocal of the variance (inverse variance weighted method)

Generic inferential framework ...

• Variances inversely proportional to within-study sample sizes
  – what is the effect of larger studies in calculating weights?
  – may also weight by a 'quality' index, q, scaled from 0 to 1

Generic inferential framework ...

• The average effect size has a conditional variance (a function of the conditional variances of each effect size, quality index, …)
  – e.g. V = 1 / total weight

• Multiply the resulting standard error by appropriate critical value (1.96, 2.58, 1.645)

• Construct confidence interval and/or test statistic

Generic inferential framework ...

• Test the homogeneity assumption using a weighted effect size sums of squares of deviations, Q

• If Q exceeds the critical value of chi-square at k-1 d.f. (k = number of studies), then observed between-study variance significantly greater than what would be expected under the null hypothesis

Generic inferential framework ...

• When within-study sample sizes are very large, Q may be rejected even when individual effect size estimates do not differ much

• One can take different courses of action when Q is rejected (see next page)

Generic inferential framework ...

• Methodologic choices in dealing with ‘heterogeneous’ data

Generic inferential framework ...

(B) Random-Effects Model (REM):
  – Total variability of an observed study effect size reflects within- and between-study variance (an extra variance component)
  – If the between-studies variance is zero, the equations of the REM reduce to those of the FEM
  – Presence of a variance component which is significantly different from zero may be indicative of the REM

Generic inferential framework ...

• Once the significance of the variance component is established (e.g. the Q test for homogeneity of effect size),
  – its magnitude should be estimated
  – variance components can be estimated in many ways!
    • the most commonly used is the so-called DerSimonian-Laird method, which is based on a method-of-moments approach
  – Compute the random-effects weighted mean as an estimate of the average of the random effects in the population
  – Construct confidence intervals and conduct hypothesis tests as before (new variance and thus new weights!!!)

Correlation Coefficient

Example: Correlation coefficient

• A measure of association more popular in cross-sectional observational studies than in RCTs is Pearson’s correlation coefficient, r given by

• X and Y must be continuous (e.g. blood pressure and weight)

• r lies between -1 and 1
• not available in RevMan / MetaView at this time

    r = Σ(X - X̄)(Y - Ȳ) / [ Σ(X - X̄)² · Σ(Y - Ȳ)² ]^(1/2)

Correlation coefficient (cont’d)

• Following the generic framework discussed earlier:
  – the effect size statistic is r
  – the corresponding effect size parameter is the underlying population correlation coefficient, ρ
  – in this case, a suitable transformation is needed to achieve approximate normality of the effect size
  – inference is conducted on the scale of the transformed variable and the final results are back-transformed to the original scale

Correlation coefficient (cont'd)

Assuming X and Y have a bivariate normal distribution, Fisher's Z transformed variable

    Z = (1/2) · log[ (1 + r) / (1 - r) ]

has, for large samples, an approximate normal distribution with mean

    ζ = (1/2) · log[ (1 + ρ) / (1 - ρ) ]

and variance

    Var(Z) = 1 / (n - 3)

Hence the weighting factor associated with Z is W = 1/Var = n - 3.

Correlation coefficient (cont’d)

• meta-analysis is carried out on Z-transformed measures and final results are transformed back to the scale of correlation using

    r = ( e^(2Z) - 1 ) / ( e^(2Z) + 1 )

Numerical Example

• Source: Fleiss J., Statistical Methods in Medical Research 1993; 2: 121 -- 145.

• correlation coefficients reported by 7 independent studies in education are included in the meta-analysis

• Comparison: association between a characteristic of the teacher and the mean measure of his or her student’s achievement

Study      n       r         Z*        W**       WZ         WZ²
1          15     -0.073    -0.073     12        -0.876     0.064
2          16      0.308     0.318     13         4.134     1.315
3          15      0.481     0.524     12         6.288     3.295
4          16      0.428     0.457     13         5.941     2.715
5          15      0.180     0.182     12         2.184     0.397
6          17      0.290     0.299     14         4.186     1.252
7          15      0.400     0.424     12         5.088     2.157
Sum                                    88        26.945    11.195

* Z = Fisher's Z-transformation of r
** W = n - 3

Q = Σ Wi(Zi - Z̄)² = Σ WiZi² - (Σ WiZi)²/Σ Wi = 11.195 - (26.945)²/88 = 2.94

Example: Fleiss (1993)

Q = 2.94 on 6 df is not statistically significant.

Results and discussions

• No evidence for heterogeneous association across studies

• Fixed effect analysis may be undertaken
• Questions:

– Would a random effect analysis as shown earlier produce a different numerical value for the combined correlation coefficient?

– How would the weights be modified to carry out a REM?

Results and discussions (cont’d)

• the weighted mean of Z is

    Z̄ = Σ WiZi / Σ Wi = 26.945 / 88 = 0.306

• the approximate standard error of the combined mean is

    SE(Z̄) = 1 / (Σ Wi)^(1/2) = 1 / √88 = 0.107

Results and discussions (cont’d)

• Test of significance is carried out using

    z = Z̄ / SE(Z̄) = 0.306 / 0.107 = 2.86

  – this value exceeds the critical value 1.96 (corresponding to the 5% level of significance), so we conclude that the average value of Z (hence the average correlation) is statistically significant

Results and discussions (cont’d)

• 95% confidence interval for ζ is

    Z̄ ± 1.96 · SE(Z̄):    0.096 ≤ ζ ≤ 0.516

• Transforming back to the original scale, a 95% CI for the parameter of interest, ρ, is

    0.096 ≤ ρ ≤ 0.474

  – again confirming a significant association
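The whole Fleiss worked example can be reproduced in a few lines. The sketch below recomputes the Fisher Z values, weights, Q, the pooled Z and the back-transformed confidence interval; small differences from the slide values are rounding.

```python
import math

# Reproducing the Fleiss (1993) correlation example: Fisher's Z transform,
# weights W = n - 3, Cochran's Q, pooled Z, and back-transformed 95% CI.

n = [15, 16, 15, 16, 15, 17, 15]
r = [-0.073, 0.308, 0.481, 0.428, 0.180, 0.290, 0.400]

z = [0.5 * math.log((1 + ri) / (1 - ri)) for ri in r]    # Fisher's Z
w = [ni - 3 for ni in n]                                 # W = n - 3

z_bar = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)    # ~0.306
se = 1 / math.sqrt(sum(w))                               # ~0.107
q = sum(wi * (zi - z_bar)**2 for wi, zi in zip(w, z))    # ~2.9 on 6 df

lo, hi = z_bar - 1.96 * se, z_bar + 1.96 * se            # CI on the Z scale
back = lambda v: (math.exp(2 * v) - 1) / (math.exp(2 * v) + 1)
print(round(z_bar, 3), round(se, 3), round(q, 2))
print(round(back(lo), 3), round(back(hi), 3))            # ~(0.10, 0.47)
```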

Critical Appraisal of a Systematic Review

(A) The Message

• Does the review set out to answer a precise question about patient care?
  – Should be different from an uncritical encyclopedic presentation

(B) The Validity

• Have studies been sought thoroughly: Medline and other relevant bibliographic database

Cochrane controlled clinical trials register

Foreign language literature

"Grey literature" (unpublished or un-indexed reports: theses, conference proceedings, internal reports, non-indexed journals, pharmaceutical industry files)

Reference chaining from any articles found

Personal approaches to experts in the field to find unpublished reports

Hand searches of the relevant specialized journals.

Validity (cont’d)

• Have inclusion and exclusion criteria for studies been stated explicitly, taking account of the patients in the studies, the interventions used, the outcomes recorded and the methodology?

Validity (cont’d)

• Have the authors considered the homogeneity of the studies: the idea that the studies are sufficiently similar in their design, interventions and subjects to merit combination?
  – this is done either by eyeballing graphs like the forest plot or by applications of chi-square tests (the Q test)

(C) The Utility

• The various studies may have used patients of different ages or social classes, but if the treatment effects are consistent across the studies, then generalisation to other groups or populations is more justified.

Utility (cont’d)

• Be wary of sub-group analyses where the authors attempt to draw new conclusions by comparing the outcomes for patients in one study with the patients in another study
  – Be wary of "data-dredging" exercises, testing multiple hypotheses against the data, especially if the hypotheses were constructed after the study had begun data collection.

Utility (cont’d)

• One may also want to ask: Were all clinically important outcomes considered?

Are the benefits worth the harms and costs?