Post on 15-Jan-2016
The GRADE approach: an introductory workshop
NTP, Raleigh, June 22, 2011
Holger Schünemann, MD, PhD
Professor and Chair, Dept. of Clinical Epidemiology & Biostatistics
Professor of Medicine
Michael Gent Chair in Healthcare Research
McMaster University, Hamilton, Canada
History
- 1967 – Founded by David Sackett
- 6 chairs since
- Instrumental in specialty of Clinical Epidemiology, origin of “Evidence-Based Medicine”
People
- 45 full time and joint faculty
- ~ 120 associate & part time faculty; 19 emeritus
- ~ 180 staff
- ~ 200 PhD and Master students
The Department of Clinical Epidemiology & Biostatistics at McMaster
Content
• Guidelines and GRADE
– Background about GRADE
• Quality of evidence
• Going from evidence to recommendations
What is a guideline?
• "Guidelines are recommendations intended to assist providers and recipients of health care and other stakeholders to make informed decisions. Recommendations may relate to clinical interventions, public health activities, or government policies."
WHO 2003, 2007
Evidence based healthcare decisions
Research evidence
Population values and preferences
(Clinical) state and circumstances
Expertise
Haynes et al. 2002
Confidence in evidence
• There always is evidence – “When there is a question there is evidence”
• Better research → greater confidence in the evidence and decisions
Hierarchy of evidence based on quality
STUDY DESIGN
Randomized Controlled Trials
Cohort Studies and Case Control Studies
Case Reports and Case Series, Non-systematic observations
Expert Opinion
BIAS
Explain the following?
• Confounding, effect modification & ext. validity
• Concealment of randomization
• Blinding (who is blinded in a double blinded study?)
• Intention to treat analysis and its correct application
• P-values and confidence intervals
“Everything should be made as simple as possible but not simpler.”
BMJ 2003
BMJ 2003
Relative risk reduction: … > 99.9% (1/100,000)
U.S. Parachute Association reported 821 injuries and 18 deaths out of 2.2 million jumps in 2007
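The arithmetic behind the parachute example can be sketched as follows. This is only an illustration: it uses the two figures quoted on the slide (18 deaths in 2.2 million jumps) and assumes, purely for the sake of the calculation, that jumping without a parachute is always fatal. That assumption is not stated in the slides.

```python
# Figures quoted on the slide (U.S. Parachute Association, 2007).
deaths = 18
jumps = 2_200_000

# Risk of death per jump with a parachute.
risk_with = deaths / jumps

# Assumption for illustration only (not from the slide):
# a jump without a parachute is treated as certainly fatal.
risk_without = 1.0

relative_risk = risk_with / risk_without
relative_risk_reduction = 1 - relative_risk

print(f"Deaths per 100,000 jumps: {risk_with * 100_000:.2f}")
print(f"Relative risk reduction: {relative_risk_reduction:.4%}")
```

Under that assumption the relative risk reduction is above 99.9%, which matches the order of magnitude given on the slide.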
Simple hierarchies are (too) simplistic
STUDY DESIGN
Randomized Controlled Trials
Cohort Studies and Case Control Studies
Case Reports and Case Series, Non-systematic observations
BIAS
Expert Opinion
Schünemann & Bone, 2003
Which hierarchy?
Organization | Evidence | Recommendation
AHA | B | Class I
ACCP | A | 1
SIGN | IV | C
Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease
Oxford Centre for Evidence Based Medicine
Levels of Evidence and Grades of Recommendations - 23 November 1999.
Columns: Grade of Recommendation | Level of Evidence | Therapy/Prevention, Aetiology/Harm | Prognosis | Diagnosis | Economic analysis

Grade of Recommendation A
Level 1a
– Therapy/Prevention, Aetiology/Harm: SR (with homogeneity) of RCTs
– Prognosis: SR (with homogeneity*) of inception cohort studies; or a CPG validated on a test set.
– Diagnosis: SR (with homogeneity*) of Level 1 diagnostic studies; or a CPG validated on a test set.
– Economic analysis: SR (with homogeneity*) of Level 1 economic studies
Level 1b
– Therapy/Prevention, Aetiology/Harm: Individual RCT (with narrow Confidence Interval)
– Prognosis: Individual inception cohort study with > 80% follow-up
– Diagnosis: Independent blind comparison of an appropriate spectrum of consecutive patients, all of whom have undergone both the diagnostic test and the reference standard.
– Economic analysis: Analysis comparing all (critically-validated) alternative outcomes against appropriate cost measurement, and including a sensitivity analysis incorporating clinically sensible variations in important variables.
Level 1c
– Therapy/Prevention, Aetiology/Harm: All or none
– Prognosis: All or none case-series
– Diagnosis: Absolute SpPins and SnNouts
– Economic analysis: Clearly as good or better, but cheaper. Clearly as bad or worse but more expensive. Clearly better or worse at the same cost.

Grade of Recommendation B
Level 2a
– Therapy/Prevention, Aetiology/Harm: SR (with homogeneity*) of cohort studies
– Prognosis: SR (with homogeneity*) of either retrospective cohort studies or untreated control groups in RCTs.
– Diagnosis: SR (with homogeneity*) of Level >2 diagnostic studies
– Economic analysis: SR (with homogeneity*) of Level >2 economic studies
Level 2b
– Therapy/Prevention, Aetiology/Harm: Individual cohort study (including low quality RCT; e.g., <80% follow-up)
– Prognosis: Retrospective cohort study or follow-up of untreated control patients in an RCT; or CPG not validated in a test set.
– Diagnosis: Any of: Independent blind or objective comparison; Study performed in a set of non-consecutive patients, or confined to a narrow spectrum of study individuals (or both) all of whom have undergone both the diagnostic test and the reference standard; A diagnostic CPG not validated in a test set.
– Economic analysis: Analysis comparing a limited number of alternative outcomes against appropriate cost measurement, and including a sensitivity analysis incorporating clinically sensible variations in important variables.
Level 2c
– Therapy/Prevention, Aetiology/Harm: “Outcomes” Research
– Prognosis: “Outcomes” Research
Level 3a
– Therapy/Prevention, Aetiology/Harm: SR (with homogeneity*) of case-control studies
Level 3b
– Therapy/Prevention, Aetiology/Harm: Individual Case-Control Study
– Diagnosis: Independent blind comparison of an appropriate spectrum, but the reference standard was not applied to all study patients
– Economic analysis: Analysis without accurate cost measurement, but including a sensitivity analysis incorporating clinically sensible variations in important variables.

Grade of Recommendation C
Level 4
– Therapy/Prevention, Aetiology/Harm: Case-series (and poor quality cohort and case-control studies)
– Prognosis: Case-series (and poor quality prognostic cohort studies)
– Diagnosis: Any of: Reference standard was unobjective, unblinded or not independent; Positive and negative tests were verified using separate reference standards; Study was performed in an inappropriate spectrum** of patients.
– Economic analysis: Analysis with no sensitivity analysis

Grade of Recommendation D
Level 5
– Therapy/Prevention, Aetiology/Harm: Expert opinion without explicit critical appraisal, or based on physiology, bench research or “first principles”
– Prognosis: Expert opinion without explicit critical appraisal, or based on physiology, bench research or “first principles”
– Diagnosis: Expert opinion without explicit critical appraisal, or based on physiology, bench research or “first principles”
– Economic analysis: Expert opinion without explicit critical appraisal, or based on economic theory
Oxford Centre for Evidence-Based Medicine (Chris Ball, Dave Sackett, Bob Phillips, Brian Haynes, and Sharon Straus).
USPSTF - Grade Definitions After May 2007: Certainty
Level of Certainty | Description
High: The available evidence usually includes consistent results from well-designed, well-conducted studies in representative primary care populations. These studies assess the effects of the preventive service on health outcomes. This conclusion is therefore unlikely to be strongly affected by the results of future studies.
Moderate: The available evidence is sufficient to determine the effects of the preventive service on health outcomes, but confidence in the estimate is constrained by such factors as:
• The number, size, or quality of individual studies.
• Inconsistency of findings across individual studies.
• Limited generalizability of findings to routine primary care practice.
• Lack of coherence in the chain of evidence.
As more information becomes available, the magnitude or direction of the observed effect could change, and this change may be large enough to alter the conclusion.
Low: The available evidence is insufficient to assess effects on health outcomes. Evidence is insufficient because of:
• The limited number or size of studies.
• Important flaws in study design or methods.
• Inconsistency of findings across individual studies.
• Gaps in the chain of evidence.
• Findings not generalizable to routine primary care practice.
• Lack of information on important health outcomes.
More information may allow estimation of effects on health outcomes.
The USPSTF defines certainty as "likelihood that the USPSTF assessment of the net benefit of a preventive service is correct."
• Recommendations for prognosis
– Use prognostic information to determine baseline risk for healthcare decisions
Centers for Disease Control and Prevention (CDC)
Evidence of Effectiveness | Execution (Good or Fair) | Design Suitability (Greatest, Moderate, or Least) | Number of Studies | Consistent | Effect Size | Expert Opinion
Strong | Good | Greatest | At Least 2 | Yes | Sufficient | Not Used
Strong | Good | Greatest or Moderate | At Least 5 | Yes | Sufficient | Not Used
Strong | Good or Fair | Greatest | At Least 5 | Yes | Sufficient | Not Used
Strong | Meet Design, Execution, Number, and Consistency Criteria for Sufficient But Not Strong Evidence | Large | Not Used
Sufficient | Good | Greatest | 1 | Not Applicable | Sufficient | Not Used
Sufficient | Good or Fair | Greatest or Moderate | At Least 3 | Yes | Sufficient | Not Used
Sufficient | Good or Fair | Greatest, Moderate, or Least | At Least 5 | Yes | Sufficient | Not Used
Expert Opinion | Varies | Varies | Varies | Varies | Sufficient | Supports a Recommendation
Insufficient | A. Insufficient Designs or Execution | B. Too Few Studies | C. Inconsistent | D. Small | E. Not Used
Healthcare problem
recommendation
“Healthy people”
“Herd immunity”
“Long term perspective”
“Disease perception”
“Lots of other things”
GRADE Working Group
Grades of Recommendation Assessment, Development and Evaluation
CMAJ 2003, BMJ 2004, BMC 2004, BMC 2005, AJRCCM 2006, Chest 2006, BMJ 2008
• International group: ACCP, AHRQ, Australian NHMRC, BMJ Clinical Evidence, Cochrane Collaboration, CDC, McMaster, NICE, Oxford CEBM, SIGN, UpToDate, USPSTF, WHO
• Aim: to develop a common, transparent and sensible system for grading the quality of evidence and the strength of recommendations
• International group of guideline developers, methodologists & clinicians from around the world (>250 contributors) – since 2000
GRADE Uptake
• World Health Organization
• CDC-ACIP
• Allergic Rhinitis in Asthma Guidelines (ARIA)
• American Thoracic Society
• American College of Physicians
• European Respiratory Society
• European Society of Thoracic Surgeons
• British Medical Journal
• Infectious Diseases Society of America
• American College of Chest Physicians
• UpToDate®
• National Institute for Health and Clinical Excellence (NICE)
• Scottish Intercollegiate Guidelines Network (SIGN)
• Cochrane Collaboration
• Clinical Evidence
• Agency for Healthcare Research and Quality (AHRQ)
• Partner of GIN
• Over 40 major organizations
Guideline development
Process
Prioritise problems & scoping
Establish guideline panel and develop questions, including outcomes
Find and critically appraise systematic review(s) and/or
Prepare protocol(s) for systematic review(s) and
Prepare systematic review(s) (searches, selection of studies, data collection and analysis)
Prepare an evidence profile
Assess the quality of evidence for each outcome
Prepare a Summary of Findings table
If developing guidelines: Assess the overall quality of evidence and
Decide on the direction (which alternative) and strength of the recommendation
Draft guideline
Consult with stakeholders and/or external peer reviewers
Disseminate guidelines
Update review or guidelines when needed
Adapt guidelines, if needed
Prioritise guidelines/recommendations for implementation
Implement or support implementation of the guidelines
Evaluate the impact of the guidelines and implementation strategies
Update systematic review/guidelines
Case scenario
A 13 year old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too and several chickens had died unexpectedly a few days before the girl fell sick.
Potential interventions: antivirals, such as neuraminidase inhibitors oseltamivir and zanamivir
Types of questions
Background Questions
Definition: What is Avian Influenza?
Mechanism: What is the mechanism of action of oseltamivir?
Foreground Questions
Benefit > harm: In patients with avian influenza, does oseltamivir therapy improve survival, …?
Framing a foreground question
Population: Avian Flu/influenza A (H5N1) patients
Intervention: Oseltamivir
Comparison: No pharmacological intervention
Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance
Schunemann, et al., The Lancet ID, 2007
• Desirable outcomes
– lower mortality
– reduced hospital stay
– reduced duration of disease
– reduced resource expenditure
• Undesirable outcomes
– adverse reactions
– the development of resistance
– costs of treatment
• Every decision comes with desirable and undesirable consequences
Developing recommendations must include a consideration of desirable and undesirable outcomes
Choosing outcomes
• Decision makers (and guideline authors) need to consider the relative importance of outcomes when balancing these outcomes to make a recommendation
• Relative importance varies across populations
• Relative importance may vary across patient groups within the same population
• When considered critical - evaluate
Relative importance of outcomes
GRADE: recommendation – quality of evidence
Clear separation:
1) Recommendation: 2 grades – weak/conditional/optional or strong (for or against an intervention)?
– Balance of benefits and downsides, values and preferences, resource use and quality of evidence
2) 4 categories of quality of evidence: ⊕⊕⊕⊕ (High), ⊕⊕⊕○ (Moderate), ⊕⊕○○ (Low), ⊕○○○ (Very low)?
– methodological quality of evidence
– likelihood of bias
– by outcome and across outcomes
*www.GradeWorking-Group.org
GRADE Quality of Evidence
In the context of a systematic review
• The quality of evidence reflects the extent to which we are confident that an estimate of effect is correct.
In the context of making recommendations
• The quality of evidence reflects the extent to which our confidence in estimates of the effects is adequate to support a particular recommendation.
Likelihood of and confidence in an outcome
Definition of grades of evidence
Research
• /A/High: Further research is very unlikely to change confidence in the estimate of effect.
• /B/Moderate: Further research is likely to have an important impact on confidence in the estimate of effect and may change the estimate.
• /C/Low: Further research is very likely to have an important impact on confidence in the estimate of effect and is likely to change the estimate.
• /D/Very low: Any estimate of effect is very uncertain.
Confidence in evidence
/A/High: We are very confident that the true effect lies close to that of the estimate of the effect.
/B/Moderate: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
/C/Low: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
/D/Very low: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.
Determinants of quality
• RCTs start as High quality
• observational studies start as Low quality
• 5 factors that can lower quality
1. limitations in detailed design and execution (risk of bias criteria)
2. Inconsistency (or heterogeneity)
3. Indirectness (PICO and applicability)
4. Imprecision (number of events and confidence intervals)
5. Publication bias
• 3 factors can increase quality
1. large magnitude of effect
2. all plausible residual confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed
3. dose-response gradient
1. Design and Execution/Risk of Bias
Examples:
• Inappropriate selection of exposed and unexposed groups
• Failure to adequately measure/control for confounding
• Selective outcome reporting
• Failure to blind (e.g. outcome assessors)
• High loss to follow-up
• Lack of concealment in RCTs
• Intention to treat principle violated
Design and Execution/RoB
From Cates , CDSR 2008
Design and Execution/RoB
Overall judgment required
2. Inconsistency of results (Heterogeneity)
• if inconsistency, look for explanation
– patients, intervention, comparator, outcome
• if unexplained inconsistency → lower quality
Reminders for immunization uptake
Indoor air pollution: ALRI
Non-steroidal drug use and risk of pancreatic cancer
Capurso G, Schünemann HJ, Terrenato I, Moretti A, Koch M, Muti P, Capurso L, Delle Fave G. Meta-analysis: the use of non-steroidal anti-inflammatory drugs and pancreatic cancer risk for different exposure categories.
Aliment Pharmacol Ther. 2007 Oct 15;26(8):1089-99.
3. Directness of Evidence
• differences in
– populations/patients (children – neonates, women in general – pregnant women)
– interventions (all vaccines, new - old)
– comparator appropriate (new policy – old or no policy)
– outcomes (important – surrogate: cases prevented – seroconversion)
• indirect comparisons
– interested in A versus B
– have A versus C and B versus C
– Vaccine A versus Placebo versus Vaccine B
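One common way to obtain an A versus B estimate from A versus C and B versus C results is an adjusted indirect comparison on the log odds ratio scale (often attributed to Bucher). The slides do not name a method, so the sketch below is only an illustration of why such comparisons are less direct: the variances of the two contributing estimates add up, which widens the confidence interval. The function name and the input numbers are hypothetical.

```python
import math

def indirect_comparison(or_ac, ci_ac, or_bc, ci_bc, z=1.96):
    """Adjusted indirect comparison of A versus B via a common comparator C.

    or_ac, or_bc: odds ratios for A vs C and B vs C.
    ci_ac, ci_bc: (lower, upper) 95% confidence limits of those odds ratios.
    Returns the indirect odds ratio for A vs B and its 95% confidence interval.
    """
    # Work on the log scale, where the estimates are approximately normal.
    log_ab = math.log(or_ac) - math.log(or_bc)

    # Recover standard errors from the reported confidence limits.
    se_ac = (math.log(ci_ac[1]) - math.log(ci_ac[0])) / (2 * z)
    se_bc = (math.log(ci_bc[1]) - math.log(ci_bc[0])) / (2 * z)

    # Variances add, so the indirect estimate is less precise than either input.
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)

    lower = math.exp(log_ab - z * se_ab)
    upper = math.exp(log_ab + z * se_ab)
    return math.exp(log_ab), (lower, upper)

# Hypothetical inputs, for illustration only.
or_ab, (ci_low, ci_high) = indirect_comparison(0.5, (0.25, 1.0), 1.0, (0.5, 2.0))
print(f"Indirect OR A vs B: {or_ab:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

With these made-up inputs the indirect interval is wider than the interval of either direct comparison, which is one reason indirect comparisons can lower confidence in the estimate.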
• Possibly. The “high” dose effects of bisphenol A in laboratory animals that provide clear evidence for adverse effects on development, i.e., reduced survival, birth weight, and growth of offspring early in life, and delayed puberty in female rats and male rats and mice, are observed at levels of exposure that far exceed those encountered by humans. However, estimated exposures in pregnant women and fetuses, infants, and children are similar to levels of bisphenol A associated with several “low” dose laboratory animal findings of effects on the brain and behavior, prostate and mammary gland development, and early onset of puberty in females. When considered together, these laboratory animal findings provide limited evidence that bisphenol A has adverse effects on development.
Importance of outcomes
[Figure: Hierarchy of outcomes according to their importance to assess the effect of phosphate lowering drugs in patients with renal failure and hyperphosphatemia. Outcomes are rated on a scale from 1 to 9 in three bands: critical for decision making; important, but not critical for decision making; low importance for decision making. Outcomes shown: Mortality, Myocardial infarction, Fractures, Pain due to soft tissue calcification / function, Flatulence. Surrogates (relation to important outcomes increasingly uncertain): Coronary calcification, Bone density and Soft tissue calcification, each linked to the Ca2+/P- product.]
4. Publication Bias
• Should always be suspected
– Only small “positive” studies (hypothesis confirming)
– For profit interest
– Various methods to evaluate – none perfect, but clearly a problem
Egger M, Smith DS. BMJ 1995;310:752-54
I.V. Mg in acute myocardial infarction
Publication bias
Meta-analysis: Yusuf S. Circulation 1993
ISIS-4: Lancet 1995
Funnel plot
[Figure: funnel plot with odds ratio on the horizontal axis (0.1 to 10) and standard error on the vertical axis (0 to 3).]
Symmetrical: No publication bias
Funnel plot
[Figure: funnel plot with odds ratio on the horizontal axis (0.1 to 10) and standard error on the vertical axis (0 to 3).]
Asymmetrical: Publication bias?
File drawer problem
No interest in publishing or being published
Indoor air pollution: ALRI
5. Imprecision
• Small sample size
– small number of events
• Wide confidence intervals
– uncertainty about magnitude of effect
• Extent to which confidence in estimate of effect adequate to support decision
Example: Immunization in children
What can raise quality?
1. large magnitude can upgrade (RRR 50%/RR 2)
– very large two levels (RRR 80%/RR 5)
– criteria
• everyone used to do badly
• almost everyone does well
– parachutes to prevent death when jumping from airplanes
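The thresholds on this slide can be read as a simple rule of thumb. The sketch below is a hedged illustration, not a substitute for judgment: the function name is hypothetical, and treating the thresholds as inclusive (RR of exactly 2 or 5 counts) is an assumption. A relative risk below 1 is inverted first, so that RRR 50% (RR 0.5) corresponds to RR 2 and RRR 80% (RR 0.2) corresponds to RR 5.

```python
def upgrade_levels_for_effect(relative_risk: float) -> int:
    """Suggest how many levels a large effect might raise the quality rating.

    Returns 2 for a very large effect (RR >= 5 or RR <= 0.2),
    1 for a large effect (RR >= 2 or RR <= 0.5), otherwise 0.
    """
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")

    # Express protective and harmful effects on the same scale.
    magnitude = max(relative_risk, 1 / relative_risk)

    if magnitude >= 5:
        return 2
    if magnitude >= 2:
        return 1
    return 0

for rr in (0.1, 0.5, 1.5, 5):
    print(rr, upgrade_levels_for_effect(rr))
```

In practice upgrading also depends on the other criteria on the slide and on the absence of serious concerns about bias, so the number returned here is only a starting point for discussion.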
Reminders for immunization uptake
What can raise quality?
2. dose response relation
– (higher INR – increased bleeding)
– childhood lymphoblastic leukemia
• risk for CNS malignancies 15 years after cranial irradiation
• no radiation: 1% (95% CI 0% to 2.1%)
• 12 Gy: 1.6% (95% CI 0% to 3.4%)
• 18 Gy: 3.3% (95% CI 0.9% to 5.6%)
3. all plausible confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed
All plausible residual confounding would result in an overestimate of effect
Hypoglycaemic drug phenformin causes lactic acidosis
The related agent metformin is under suspicion for the same toxicity.
Large observational studies have failed to demonstrate an association
– Clinicians would be more alert to lactic acidosis in the presence of the agent
• Vaccine – adverse effects
Quality assessment criteria
Study design | Initial quality of a body of evidence
Randomised trials | High
Observational studies | Low
Lower if:
• Risk of Bias
• Inconsistency
• Indirectness
• Imprecision
• Publication bias
Higher if:
• Large effect
• Dose response
• All plausible residual confounding & bias
– Would reduce a demonstrated effect
– Would suggest a spurious effect if no effect was observed
Quality of a body of evidence:
• A/High (four plus: ⊕⊕⊕⊕)
• B/Moderate (three plus: ⊕⊕⊕○)
• C/Low (two plus: ⊕⊕○○)
• D/Very low (one plus: ⊕○○○)
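The table above can be sketched as a small scoring routine: start from the study design, subtract levels for each concern and add levels for each strengthening factor, and keep the result within the four grades. This is a simplified illustration under stated assumptions. The function and dictionary names are hypothetical, and real GRADE ratings are judgments rather than a mechanical sum.

```python
LEVELS = ["Very low", "Low", "Moderate", "High"]

# Starting points taken from the table: trials start High, observational studies Low.
INITIAL = {"randomised trials": 3, "observational studies": 1}

DOWNGRADE_FACTORS = {
    "risk of bias", "inconsistency", "indirectness", "imprecision", "publication bias",
}
UPGRADE_FACTORS = {"large effect", "dose response", "plausible confounding"}

def rate_quality(study_design, lower=None, higher=None):
    """Return the quality grade for one outcome.

    lower / higher map a factor name to the number of levels to move
    (for example {"imprecision": 1}); both are optional.
    """
    lower = lower or {}
    higher = higher or {}

    unknown = (set(lower) - DOWNGRADE_FACTORS) | (set(higher) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")

    score = INITIAL[study_design] - sum(lower.values()) + sum(higher.values())

    # The grade cannot go above High or below Very low.
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]

print(rate_quality("randomised trials", lower={"imprecision": 1, "indirectness": 1}))
print(rate_quality("observational studies", higher={"large effect": 1}))
```

Keeping the factors as named dictionary entries makes the reasons for a rating visible, which mirrors the emphasis on transparency elsewhere in the slides.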
Evidence Profiles/Summaries
Content
• Background
• Quality of evidence
• Moving from evidence to recommendations
Strength of recommendation
“The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.”
• Strong or weak/conditional
Determinants of the strength of recommendation
Factors that can strengthen a recommendation | Comment
Quality of the evidence | The higher the quality of evidence, the more likely is a strong recommendation.
Balance between desirable and undesirable effects | The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
Values and preferences | The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
Costs (resource allocation) | The higher the costs of an intervention – that is, the more resources consumed – the less likely a strong recommendation is warranted.
Developing recommendations
Case scenario
A 13 year old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too and several chickens had died unexpectedly a few days before the girl fell sick.
Methods – WHO Rapid Advice Guidelines for management of Avian Flu
• Applied findings of a recent systematic evaluation of guideline development for WHO/ACHR
• Group composition (including panel of 13 voting members):
– clinicians who treated influenza A(H5N1) patients
– infectious disease experts
– basic scientists
– public health officers
– methodologists
• Independent scientific reviewers:
– Identified systematic reviews, recent RCTs, case series, animal studies related to H5N1 infection
Oseltamivir for Avian Flu
Summary of findings:
• No clinical trial of oseltamivir for treatment of H5N1 patients.
• 4 systematic reviews and health technology assessments (HTA) reporting on 5 studies of oseltamivir in seasonal influenza.
– Hospitalization: OR 0.22 (0.02 – 2.16)
– Pneumonia: OR 0.15 (0.03 - 0.69)
• 3 published case series.
• Many in vitro and animal studies.
• No alternative that is more promising at present.
• Cost: 40$ per treatment course
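A quick way to read the two odds ratios in the summary is to check whether each confidence interval excludes the value of no effect (OR = 1). The sketch below is only an illustration of that check; the helper name is hypothetical, and imprecision in GRADE involves more than this single test.

```python
def excludes_no_effect(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """True if the confidence interval lies entirely on one side of the null value."""
    return upper < null_value or lower > null_value

# Odds ratios and confidence limits as quoted in the summary of findings.
results = {
    "Hospitalization": (0.22, 0.02, 2.16),
    "Pneumonia": (0.15, 0.03, 0.69),
}

for outcome, (odds_ratio, lower, upper) in results.items():
    verdict = "excludes" if excludes_no_effect(lower, upper) else "includes"
    print(f"{outcome}: OR {odds_ratio} ({lower} to {upper}) {verdict} OR = 1")
```

The interval for hospitalization spans both a large benefit and a possible harm, while the interval for pneumonia lies entirely below 1. Both come from seasonal influenza studies, so indirectness remains a separate concern for H5N1 patients.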
From evidence to recommendation
Factors that can strengthen a recommendation | Comment
Quality of the evidence | Very low quality evidence
Balance between desirable and undesirable effects | Uncertain, but small reduction in relative risk still leads to large absolute effect
Values and preferences | Little variability and clear
Costs (resource allocation) | Low cost under non-pandemic conditions
Example: Oseltamivir for Avian Flu
Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (????? recommendation, very low quality evidence).
Schunemann et al. The Lancet ID, 2007
Example: Oseltamivir for Avian Flu
Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (strong recommendation, very low quality evidence).
Values and Preferences
Remarks: This recommendation places a high value on the prevention of death in an illness with a high case fatality. It places relatively low values on adverse reactions, the development of resistance and costs of treatment.
Schunemann et al. The Lancet ID, 2007
Implications of a strong recommendation
• Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
• Clinicians: Most patients should receive the recommended course of action
• Policy makers: The recommendation can be adopted as a policy in most situations
Implications of a conditional/weak recommendation
• Patients: The majority of people in this situation would want the recommended course of action, but many would not
• Clinicians: Be more prepared to help patients to make a decision that is consistent with their own values/decision aids and shared decision making
• Policy makers: There is a need for substantial debate and involvement of stakeholders
[Figure: overview of the GRADE process, from systematic review to guideline development]
Systematic review
• Formulate question (PICO)
• Select outcomes
• Rate importance of each outcome (Critical, Important, Critical, Not important)
• Outcomes across studies
• Create evidence profile with GRADEpro
• Rate quality of evidence for each outcome (High, Moderate, Low, Very low)
– Randomization increases initial quality
– Grade down: 1. Risk of bias 2. Inconsistency 3. Indirectness 4. Imprecision 5. Publication bias
– Grade up: 1. Large effect 2. Dose response 3. Confounders
• Summary of findings & estimate of effect for each outcome
Guideline development
• Grade overall quality of evidence across outcomes based on lowest quality of critical outcomes
• Panel: Formulate recommendations:
– For or against (direction)
– Strong or weak (strength)
• By considering: Quality of evidence, Balance benefits/harms, Values and preferences
• Revise if necessary by considering: Resource use (cost)
• “We recommend using…”
• “We suggest using…”
• “We recommend against using…”
• “We suggest against using…”
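Two steps of this overview lend themselves to a short sketch: taking the overall quality as the lowest quality among the critical outcomes, and choosing the wording of a recommendation from its direction and strength. The pairing of "recommend" with strong and "suggest" with weak follows the usual GRADE convention and is inferred from the slide rather than spelled out on it. The function names and the example outcomes are hypothetical.

```python
QUALITY_ORDER = ["Very low", "Low", "Moderate", "High"]

def overall_quality(outcomes):
    """Overall quality across outcomes: the lowest quality among critical outcomes.

    outcomes: iterable of (name, importance, quality) tuples.
    """
    critical = [quality for _, importance, quality in outcomes if importance == "Critical"]
    if not critical:
        raise ValueError("at least one critical outcome is needed")
    return min(critical, key=QUALITY_ORDER.index)

# Wording by (direction, strength), using the four phrases from the slide.
PHRASES = {
    ("for", "strong"): "We recommend using…",
    ("for", "weak"): "We suggest using…",
    ("against", "strong"): "We recommend against using…",
    ("against", "weak"): "We suggest against using…",
}

def recommendation_wording(direction, strength):
    return PHRASES[(direction, strength)]

# Hypothetical ratings, for illustration only.
example_outcomes = [
    ("Mortality", "Critical", "Very low"),
    ("Hospitalization", "Critical", "Low"),
    ("Adverse reactions", "Important", "Moderate"),
]

print(overall_quality(example_outcomes))
print(recommendation_wording("for", "strong"))
```

Outcomes that are not rated critical do not pull the overall rating down, which is why the importance rating has to be settled before the quality assessment is summarised.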
Issues in guideline development in Public Health
• Causation versus effects of intervention
– Causation not equivalent to efficacy of interventions
– Bradford Hill
• Nearly half a century old – tablet from the mountain?
• Harms caused by medications
– Assumption is that removal of exposure leads to NO adverse effects
• How confident can one be that removal of the exposure is effective in preventing disease?
– Whether drugs or environmental factors it will depend on the intervention to remove exposure
Schünemann et al. JECH 2010
Conclusions
• Clinical practice guidelines should be based on the best available evidence to be evidence based
• GRADE combines what is known in health research methodology and provides a structured approach to improve communication
• Criteria for evidence assessment across questions and outcomes
• Criteria for moving from evidence to recommendations
• Transparent, systematic
– four categories of quality of evidence
– two grades for strength of recommendations
• Transparency in decision making and judgments is key
Formulating Questions and Choosing Outcomes
Outline
• Type of questions
• Framing a foreground question
• Choosing outcomes
• Relative importance of outcomes
Guidelines and questions
Guidelines are a way of answering questions about clinical, communication, organisational or policy interventions, in the hope of improving health care or health policy.
It is therefore helpful to structure a guideline in terms of answerable questions.
WHO Guideline Handbook, 2008
Types of questions
Background Questions
Definition: What is COPD?
Mechanism: What is the mechanism of action of mucolytic therapy?
Foreground Questions
Efficacy: In patients with COPD, does mucolytic therapy improve survival?
Framing a foreground question
P
I
C
O
Framing a foreground question
Population:
Intervention:
Comparison:
Outcomes:
Case scenario
A 13 year old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too and several chickens had died unexpectedly a few days before the girl fell sick.
Potential interventions: antivirals, such as neuraminidase inhibitors oseltamivir and zanamivir
What are examples of:
• Background questions
• Foreground questions
• Population:
• Intervention:
• Comparison:
• Outcomes:
Framing a foreground question
Population: Avian Flu/influenza A (H5N1) patients
Intervention: Oseltamivir (or Zanamivir)
Comparison: No pharmacological intervention
Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance
Schunemann, Hill et al., The Lancet ID, 2007
Choosing outcomes
• Every decision comes with desirable and undesirable consequences
Developing recommendations must include a consideration of desirable and undesirable outcomes
Outcomes should be patient important outcomes.
• desirable outcomes
– lower mortality
– reduced hospital stay
– reduced duration of disease
– reduced resource expenditure
• undesirable outcomes
– adverse reactions
– the development of resistance
– costs of treatment
Choosing outcomes
What if what is important is not measured?
What if what is measured is not important?
How do we make sure we’ve covered all important outcomes?
Choosing outcomes
• Decision makers (and guideline authors) need to consider the relative importance of outcomes when balancing these outcomes to make a recommendation
• Relative importance varies across populations
• Relative importance may vary across patient groups within the same population
• When considered critical - evaluate
Relative importance of outcomes
[Figure: rating scale from 1 to 9 for the relative importance of outcomes, divided into three bands: critical for decision making; important, but not critical for decision making; of low importance.]
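The 1 to 9 scale can be mapped to the three bands with a few lines of code. The cut-offs used here (7 to 9 critical, 4 to 6 important but not critical, 1 to 3 of low importance) are the commonly used GRADE convention and are an assumption, since the numbers attached to each band are not legible in the slide text. The function name is hypothetical.

```python
def importance_category(rating: int) -> str:
    """Map a 1-9 importance rating to one of the three bands on the slide."""
    if not 1 <= rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    if rating >= 7:
        return "Critical for decision making"
    if rating >= 4:
        return "Important, but not critical for decision making"
    return "Of low importance"

for rating in (9, 5, 2):
    print(rating, importance_category(rating))
```

A panel would typically collect such ratings from each member before looking at the evidence and then discuss outcomes where the ratings disagree.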
Relative importance of outcomes
Using GRADEpro
Creating a new GRADEpro file
Profile groups
Profiles
Managing outcomes
Content
• Quality of evidence
• Going from evidence to recommendations
Healthcare problem
recommendation
Strength of recommendation
“The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.”
• Strong or conditional
Strength of recommendation
The degree of confidence that the desirable effects of adherence to a recommendation outweigh the undesirable effects.
Desirable effects
• health benefits
• less burden
• savings
Undesirable effects
• harms
• more burden
• costs
Determinants of the strength of recommendation
Factors that can strengthen a recommendation | Comment
Quality of the evidence | The higher the quality of evidence, the more likely is a strong recommendation.
Balance between desirable and undesirable effects | The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
Values and preferences | The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
Costs (resource allocation) | The higher the costs of an intervention – that is, the more resources consumed – the less likely a strong recommendation is warranted.
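Balancing benefits and downsides
[Figure: balance scale weighing desirable effects (↑ QoL, ↓ Death, ↓ Morbidity, ↑ herd immunity) against undesirable effects (↑ Allergic reactions, ↑ Local skin reactions, ↑ Nausea, ↑ Resources). Depending on the balance, the recommendation is Strong or Conditional, For or Against.]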
Implications of a strong recommendation
• Policy makers: The recommendation can be adopted as a policy in most situations
• Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
• Clinicians: Most patients should receive the recommended course of action
Implications of a conditional recommendation
• Policy makers: There is a need for substantial debate and involvement of stakeholders
• Patients: The majority of people in this situation would want the recommended course of action, but many would not
• Clinicians: Be more prepared to help patients make a decision that is consistent with their own values (decision aids and shared decision making)
Case scenario
A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chickens share this room too, and several chickens had died unexpectedly a few days before the girl fell sick.
Methods – WHO Rapid Advice Guidelines for Avian Flu
• Applied findings of a recent systematic evaluation of guideline development for WHO/ACHR
• Group composition (including panel of 13 voting members):
– clinicians who treated influenza A(H5N1) patients
– infectious disease experts
– basic scientists
– public health officers
– methodologists
• Independent scientific reviewers: identified systematic reviews, recent RCTs, case series, animal studies related to H5N1 infection
Oseltamivir for Avian Flu
Summary of findings:
• No clinical trial of oseltamivir for treatment of H5N1 patients.
• 4 systematic reviews and health technology assessments (HTA) reporting on 5 studies of oseltamivir in seasonal influenza.
– Hospitalization: OR 0.22 (0.02 – 2.16)
– Pneumonia: OR 0.15 (0.03 – 0.69)
• 3 published case series.
• Many in vitro and animal studies.
• No alternative that was more promising at present.
• Cost: $40 per treatment course
From evidence to recommendation
Factors that can strengthen a recommendation (with comment)
• Quality of the evidence: Very low quality evidence
• Balance between desirable and undesirable effects: Uncertain, but small reduction in relative risk still leads to large absolute effect
• Values and preferences: Little variability and clear
• Costs (resource allocation): Low cost under non-pandemic conditions
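The comment that a small reduction in relative risk can still lead to a large absolute effect can be illustrated with a short calculation. The sketch below converts an odds ratio into an absolute risk reduction for two baseline risks. The odds ratio of 0.9 and both baseline risks are hypothetical values chosen for illustration; they are not taken from the avian flu evidence, and the function names are made up for this example.

```python
def risk_with_treatment(baseline_risk: float, odds_ratio: float) -> float:
    """Convert a baseline (untreated) risk and an odds ratio into the treated risk."""
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    treated_odds = baseline_odds * odds_ratio
    return treated_odds / (1.0 + treated_odds)


def absolute_risk_reduction(baseline_risk: float, odds_ratio: float) -> float:
    """Absolute risk reduction = baseline risk minus treated risk."""
    return baseline_risk - risk_with_treatment(baseline_risk, odds_ratio)


# Hypothetical inputs for illustration only (not taken from the workshop data).
ASSUMED_ODDS_RATIO = 0.9    # a modest relative effect
HIGH_BASELINE_RISK = 0.50   # e.g. an illness with a high case fatality
LOW_BASELINE_RISK = 0.01    # e.g. a mild illness

arr_high = absolute_risk_reduction(HIGH_BASELINE_RISK, ASSUMED_ODDS_RATIO)
arr_low = absolute_risk_reduction(LOW_BASELINE_RISK, ASSUMED_ODDS_RATIO)

print(f"ARR at 50% baseline risk: {arr_high:.4f}")
print(f"ARR at 1% baseline risk:  {arr_low:.4f}")
```

Under these assumed inputs the same relative effect prevents roughly 26 events per 1,000 patients at the high baseline risk, but only about 1 per 1,000 at the low baseline risk.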
Complex data & decisions: yes/no?
Recommendation
- The Guidelines Group recommends that fluoroquinolones are / are not used in the treatment of all patients with MDR (strong / conditional recommendation; low / moderate / high grade of evidence)
Recommendation: In women with histologically confirmed CIN, the expert panel recommends/suggests cryotherapy/LEEP over cryotherapy/LEEP.
Population: Women with histologically confirmed CIN
Intervention: Cryotherapy versus LEEP
Factor: High or moderate evidence (is there high or moderate quality evidence?)
The higher the quality of evidence, the more likely is a strong recommendation.
Decision: Yes / No (⊕⊕○○)
Explanation: There is moderate quality evidence from both randomised and observational controlled studies for recurrence rates. However, there is low quality evidence for other outcomes which were considered critical and important for decision making (e.g., severe adverse events, cervical cancer). There is uncertainty for fertility and other obstetrical outcomes, and HIV acquisition/transmission was not measured.
Factor: Certainty about the balance of benefits versus harms and burdens (is there certainty?)
The larger the difference between the desirable and undesirable consequences and the certainty around that difference, the more likely a strong recommendation. The smaller the net benefit and the lower the certainty for that benefit, the more likely is a conditional/weak recommendation.
Decision: Yes / No
Explanation: Benefits of LEEP were greater, and harms were fewer or similar
• Recurrence rates of CIN I, CIN II-III and all CINs are probably greater with cryotherapy
– CIN II-III, OR 3.3 (1.04 to 10.46)
– CIN I, OR 2.74 (0.62 to 12.07)
– All CIN, OR 2.14 (1.05 to 4.33)
• Cryotherapy may be less acceptable to patients than LEEP
• There may be little difference in serious adverse events between cryotherapy and LEEP, but there may be fewer minor adverse events (such as pain) with cryotherapy
• It is unclear whether there is a difference in fertility/obstetric outcomes
Factor: Certainty in or similar values (is there certainty or similarity?)
The more certainty or similarity in values and preferences, the more likely a strong recommendation.
Decision: YES / No
Explanation: Similar values across women
• High value was placed on CIN recurrence, serious adverse events and acceptability to the patient
• Low value was placed on minor adverse events
Factor: Resource implications (are the resources worth the intervention?)
The lower the cost of an intervention compared to the alternative that is considered and other costs related to the decision – that is, the fewer resources consumed – the more likely is a strong recommendation.
Decision: YES / No
Explanation: More resources required for LEEP
• Need for more skilled providers to perform LEEP
• Need for more or expensive equipment/supplies for LEEP; electrical supply for LEEP
• Need for local anaesthesia with LEEP
Overall strength of recommendation: Conditional
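One way to read the odds ratios for CIN recurrence quoted above is to check whether each interval includes OR = 1 (no difference between cryotherapy and LEEP). The sketch below does only that; it assumes the bracketed ranges are confidence intervals, which the slide does not state explicitly, and the type and function names are made up for this example. It is not a full assessment of imprecision.

```python
from typing import NamedTuple


class OddsRatio(NamedTuple):
    outcome: str
    estimate: float
    ci_low: float
    ci_high: float


def ci_excludes_no_effect(result: OddsRatio) -> bool:
    """True when the whole interval lies on one side of OR = 1."""
    return result.ci_low > 1.0 or result.ci_high < 1.0


# Odds ratios for recurrence with cryotherapy versus LEEP, as quoted above.
recurrence = [
    OddsRatio("CIN II-III", 3.3, 1.04, 10.46),
    OddsRatio("CIN I", 2.74, 0.62, 12.07),
    OddsRatio("All CIN", 2.14, 1.05, 4.33),
]

for r in recurrence:
    verdict = "excludes" if ci_excludes_no_effect(r) else "includes"
    print(f"{r.outcome}: OR {r.estimate} ({r.ci_low} to {r.ci_high}) {verdict} OR = 1")
```

The intervals for CIN II-III and all CIN lie just above 1, while the interval for CIN I includes 1, which fits the cautious wording "probably greater with cryotherapy".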
Example: Oseltamivir for Avian Flu
Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (strong recommendation, very low quality evidence).
Remarks: This recommendation places a high value on the prevention of death in an illness with a high case fatality. It places relatively low values on adverse reactions, the development of resistance and costs of treatment.
Schunemann et al. The Lancet ID, 2007
Issues in guideline development for immunization
• Causation versus effects of intervention
– Causation not equivalent to efficacy of interventions
– Bradford Hill
• Nearly half a century old – tablet from the mountain?
• Harms caused by interventions
– Assumption is that removal of vaccine (or no exposure) leads to NO adverse effects
• How confident can one be that removal of the exposure is effective in preventing disease?
– Whether immunization or environmental factors: will depend on the intervention to remove exposure
Current state of recommendations
• Reviewed 7527 recommendations
– 1275 randomly selected
• Inconsistency across/within
• 31.6% did not present recommendations clearly
– Most of them not written as executable actions
• 52.7% did not indicate strength
Recommendation
• The Guideline Group recommends rapid DST testing for resistance to INH and RIF or RIF alone over conventional testing or no testing at the time of diagnosis of TB (conditional, low quality evidence).
• Values and preferences: A high value was placed on outcomes such as preventing death and transmission of MDR as a result of delayed diagnosis as well as avoiding spending resources.
Group composition
• Group composition might affect recommendation
• Common principle: include all affected by the recommendations (multi-disciplinary groups incl. patients/carers)
– Industry?
• Keep a manageable size
The Process: How to make it constructive?
• Group members are heterogeneous and might have different objectives
• Chair facilitates rather than leads the group
• Common understanding of goal, tasks and ground rules
• Similar level of required knowhow and skills
• Sufficient technical support
Balanced participation and formal agreement
• Key task of chair
• Formal consensus processes
– Delphi method
– Nominal group process
– Voting
Group processes
How to present controversies
• Lay out the controversies
• Describe the evidence
• Ask members to focus on the agreed upon evidence and the factors leading to a decision
• Ask whether there still is disagreement
• Vote
– Make voting explicit and transparent (ways of doing this to come tomorrow)
Conclusions - Process
• Success depends on strong chair(s), training of group, good facilitation and technical support
– Clinical and methods co-chairs
• Formal consensus developing methods might support agreement on recommendations
– Voting represents forced consensus
• Guideline development will require sufficient resources.
GRADE Grid
[Figure: the GRADE process, from systematic review to guideline development]
Systematic review
• Formulate question (PICO)
• Select outcomes
• Rate importance of each outcome (critical, important, not important)
• Outcomes across studies: create evidence profile with GRADEpro
• Summary of findings & estimate of effect for each outcome
• Rate quality of evidence for each outcome (High, Moderate, Low, Very low)
– Randomization increases initial quality
– Grade down: 1. Risk of bias 2. Inconsistency 3. Indirectness 4. Imprecision 5. Publication bias
– Grade up: 1. Large effect 2. Dose response 3. Confounders
Guideline development
• Grade overall quality of evidence across outcomes based on lowest quality of critical outcomes
• Panel formulates recommendations:
– For or against (direction)
– Strong or conditional (strength)
• By considering: quality of evidence, balance benefits/harms, values and preferences
• (Revise by considering:) resource use (cost)
• Wording:
– “We recommend using…” / “should”
– “We suggest using…” / “might”
– “We recommend against using…” / “should not”
– “We suggest against using…” / “might not”
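The rating rules in the diagram above can be sketched in a few lines of code. This is a simplified illustration, not a replacement for panel judgment. It assumes that evidence from randomized trials starts at High and evidence from observational studies starts at Low (the usual GRADE convention; the slide only says that randomization increases initial quality), and it treats `downgrades` and `upgrades` as a total number of levels. The function names and the evidence profile are hypothetical.

```python
LEVELS = ["Very low", "Low", "Moderate", "High"]


def rate_outcome(randomized: bool, downgrades: int = 0, upgrades: int = 0) -> str:
    """Rate one outcome: pick a starting level, grade down/up, clamp to the scale."""
    start = LEVELS.index("High") if randomized else LEVELS.index("Low")
    level = start - downgrades + upgrades
    level = max(0, min(level, len(LEVELS) - 1))
    return LEVELS[level]


def overall_quality(outcomes: dict[str, tuple[str, str]]) -> str:
    """Overall quality = lowest quality among outcomes rated 'Critical'."""
    critical = [quality for importance, quality in outcomes.values()
                if importance == "Critical"]
    if not critical:
        raise ValueError("at least one critical outcome is needed")
    return min(critical, key=LEVELS.index)


# Hypothetical evidence profile (outcome names and ratings are made up).
profile = {
    "Mortality": ("Critical", rate_outcome(randomized=True, downgrades=1)),
    "Serious adverse events": ("Critical", rate_outcome(randomized=False)),
    "Minor adverse events": ("Not important",
                             rate_outcome(randomized=False, downgrades=1)),
}

print(overall_quality(profile))
```

In this made-up profile the outcome with the lowest rating is not critical, so it does not determine the overall quality; the lowest-rated critical outcome does.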
Conclusions
• WHO guidelines should be based on the best available evidence to be evidence based
• GRADE is the approach used by WHO and is gaining acceptance internationally
– Combines what is known in health research methodology and provides a structured approach to improve communication
– Does not avoid judgments but provides a framework
– Criteria for evidence assessment across questions and outcomes
– Criteria for moving from evidence to recommendations
• Transparent, systematic
– Four categories of quality of evidence
– Two grades for strength of recommendations
• Transparency in decision making and judgments is key
Format
• Mix of seminars/interactive lectures, self directed learning and simulation
– Large group and smaller group discussion
– Computer work
• Simulate guideline panel work
• Select rapporteur (both for large group and any small group work)