RESEARCH Open Access Screening for esophageal adenocarcinoma · 2020. 1. 29. · Lise Bjerre2,...
RESEARCH Open Access
Screening for esophageal adenocarcinoma and precancerous conditions (dysplasia and Barrett’s esophagus) in patients with chronic gastroesophageal reflux disease with or without other risk factors: two systematic reviews and one overview of reviews to inform a guideline of the Canadian Task Force on Preventive Health Care (CTFPHC)
Candyce Hamel1*, Nadera Ahmadzai1, Andrew Beck1, Micere Thuku1, Becky Skidmore1, Kusala Pussegoda1, Lise Bjerre2, Avijit Chatterjee3, Kristopher Dennis4, Lorenzo Ferri5, Donna E. Maziak6, Beverley J. Shea1, Brian Hutton1,7, Julian Little7, David Moher1,7 and Adrienne Stevens1
Abstract
Background: Two reviews and an overview were produced for the Canadian Task Force on Preventive Health Care guideline on screening for esophageal adenocarcinoma (EAC) in patients with chronic gastroesophageal reflux disease (GERD) without alarm symptoms. The goal was to systematically review three key questions (KQs): (1) The effectiveness of screening for these conditions; (2) How adults with chronic GERD weigh the benefits and harms of screening, and what factors contribute to their preferences and decision to undergo screening; and (3) Treatment options for Barrett’s esophagus (BE), dysplasia or stage 1 EAC (overview of reviews).
Methods: Bibliographic databases (e.g. Ovid MEDLINE®) were searched for each review in October 2018. We also searched for unpublished literature (e.g. relevant websites). The liberal accelerated approach was used for title and abstract screening. Two reviewers independently screened full-text articles. Data extraction and risk of bias assessments were completed by one reviewer and verified by another reviewer (KQ1 and 2). Quality assessments were completed by two reviewers independently in duplicate (KQ3). Disagreements were resolved through discussion. We used various risk of bias tools suitable for study design. The GRADE framework was used for rating the certainty of the evidence.
© Her Majesty the Queen in Right of Canada. 2020 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
* Correspondence: [email protected]
Hospital Research Institute, Knowledge Synthesis Group, 501 Smyth Road, Ottawa, ON, Canada
Full list of author information is available at the end of the article
Hamel et al. Systematic Reviews (2020) 9:20 https://doi.org/10.1186/s13643-020-1275-2
Results: Ten studies evaluated the effectiveness of screening. One retrospective study reported no difference in long-term survival (approximately 6 to 12 years) between those who had a prior esophagogastroduodenoscopy (EGD) and those who had not (adjusted HR 0.93, 95% confidence interval (CI) 0.58–1.50), although there may be higher odds of a stage 1 diagnosis than of a more advanced diagnosis (stage 2–4) if an EGD had been performed in the previous 5 years (OR 2.27, 95% CI 1.00–7.67). Seven studies compared different screening modalities and showed little difference between modalities. Three studies reported on patients’ unwillingness to be screened (e.g. due to anxiety, fear of gagging). Eleven systematic reviews evaluated treatment modalities, providing some evidence of early treatment effect for some outcomes.
Conclusions: Little evidence exists on the effectiveness of screening or on patients’ values and preferences regarding screening. Many treatment modalities have been evaluated, but studies are small. Overall, there is uncertainty in understanding the effectiveness of screening and early treatments.
Systematic review registrations: PROSPERO (CRD42017049993 [KQ1], CRD42017050014 [KQ2], CRD42018084825 [KQ3]).
Keywords: Esophageal adenocarcinoma, Gastroesophageal reflux disease, Barrett’s esophagus, Dysplasia, Screening, Patient values and preferences, Treatment, Systematic review, Overview of reviews
Introduction
There are two main types of esophageal cancer: esophageal adenocarcinoma (EAC), where malignant cells form in the tissues of the lower third of the esophagus, primarily in glandular cells where Barrett’s Esophagus (BE) also develops [1], and esophageal squamous cell carcinoma (ESCC), where malignant cells form in the squamous cells of the esophagus. ESCC is the most common form of esophageal neoplasm worldwide, with 398,000 cases of ESCC compared to 52,000 cases of EAC in 2012 [2]. However, EAC is more common than ESCC in Canada, and nearly 50% of the worldwide cases of EAC occur in Northwestern Europe and North America [3]. From 1986 to 2006, EAC incidence in Canada rose by 3.9% per year in males (1.8 to 3.5 per 100,000) and by 3.6% per year in females (0.2 to 0.5 per 100,000) [3]. The Canadian Cancer Society reports overall rates of esophageal cancer in Canada (EAC and ESCC combined). In 2017, there were a projected 2330 new cases of esophageal cancer (1800 among men and 530 among women) and 2130 deaths from the disease (1650 among men and 480 among women). Although esophageal cancer has a lower incidence than other cancers (ranked 13th among men and 19th among women), it has a high mortality rate and a low 5-year survival rate (14%), the second lowest survival rate after pancreatic cancer [4]. About 20% of EAC cases are diagnosed at an early stage, where treatment with surgery leads to a 5-year survival rate of 90% [5].
Risk factors
Increases in the incidence of EAC may be related to the increasing prevalence of risk factors such as obesity and gastroesophageal reflux disease (GERD) [3]. Other risk factors for the development of EAC are BE, age 50 years and older, male sex, European descent, current or past smoking, a family history of BE or EAC and a diet low in fruits and vegetables [1, 6–8].
The prevalence of GERD in Western countries has increased over the past few decades, and GERD is one of the most commonly encountered conditions in primary care practice, with an estimated prevalence of 18–27% in the USA and 9–26% in Europe [9]. Extrapolating these prevalence estimates to the Canadian population, since no Canadian incidence studies exist, would mean that 3.4–6.8 million persons in Canada experience GERD [10]. GERD is a chronic disease with varying definitions [10–13]. The Montreal definition has been adopted by clinicians and researchers, and defines GERD as “a condition which develops when the reflux of stomach contents causes troublesome symptoms (e.g., retrosternal burning (heartburn), regurgitation) and/or complications (e.g., esophagitis, esophageal stricture)” [14]. According to the American Society for Gastrointestinal Endoscopy, chronic, long-standing GERD is defined as frequent severe GERD symptoms for over 5 years and requiring regular acid suppression therapy [15]. However, experts differ on the duration of symptoms and on whether acid suppression therapy should be considered in defining chronic GERD [16–18].
The most common complications of GERD are esophagitis, esophageal stricture, BE and EAC [10]. Approximately 60% of people with EAC have experienced symptoms of GERD, and there is an association between the frequency and severity of symptoms and increased risk of EAC [19, 20]. In BE, the tissue lining the esophagus transforms into tissue resembling the lining of the intestines. Generally, this transformation is called intestinal metaplasia, and in the esophagus, it is called BE. It is currently not known how the transformation occurs;
however, it has been suggested that the acid regurgitation associated with GERD may assist changes at the cellular level [19]. BE is known to develop in around 6–14% of people with GERD, and among those with BE (with or without GERD), 0.2–0.5% develop EAC [21]. However, not all individuals with BE will experience chronic GERD symptoms, and it is still unclear why such a small percentage of people with GERD develop BE [22, 23]. Once an individual is diagnosed with BE, regular surveillance using endoscopy should be considered, as BE can progress over time from low- to high-grade dysplasia and into EAC [24, 25]. Patients who have EAC discovered as a result of endoscopic screening or as part of a surveillance program for BE are diagnosed with earlier-stage tumours, are less likely to have lymph node involvement, and have better short-term life expectancies than those who present with alarm symptoms such as dysphagia and weight loss [26]. It has also been found that the longer the length of BE (e.g. short segment vs. long segment), the higher the risk for EAC [27].
Treatment
The goal of treatment for BE and/or low- or high-grade dysplasia is to slow or halt GERD symptoms, reduce mucosal inflammation, control dysplasia and prevent progression to adenocarcinoma [28]. The treatments for EAC depend on the stage of the disorder (0 to 4). For stage 0, the disease is considered precancerous and is synonymous with high-grade dysplasia. Endoscopic therapies (e.g. radiofrequency ablation (RFA) or endoscopic mucosal resection (EMR)) are typically performed, followed by endoscopic surveillance [29]. For stage 1, the disease is generally treated with mechanical methods to remove tissue (e.g. endoscopic mucosal resection) followed by an ablative technique to destroy any remaining abnormal areas in the esophagus lining [29].
There are four main categories for managing and/or treating the conditions of interest (i.e. stage 1 EAC, BE or dysplasia): (1) pharmacological therapies; (2) surveillance (endoscopic); (3) endoscopic or endoscopic-assisted therapies; and (4) surgery (see Additional file 1). These strategies may overlap with some of the conditions of interest. For example, proton pump inhibitor (PPI) therapy is not a treatment for EAC but may reduce the risk of developing dysplasia and EAC among people with BE. These therapies may also be used in combination (e.g. pharmacological therapy and surveillance procedures for BE) depending on the disease progression.
Objectives
With Canada’s increasing senior population and longer life expectancy, there is an expected increase in the incidence rates of GERD and EAC and, therefore, increased demand for gastrointestinal endoscopies [10, 30]. According to the Canadian Institute for Health Information National Physician Database, the number of upper endoscopies performed in Canada increased by approximately 16% between 2004 and 2008 [31]. However, the reason for the endoscopy was not detailed. In order to determine the effectiveness of screening for EAC among GERD patients, the following three key questions (KQs) (Table 1) were addressed through two systematic reviews (SRs) (KQ1 and KQ2) and one overview of reviews (KQ3).
Methods
These SRs were developed, conducted and prepared according to the Canadian Task Force on Preventive Health Care (CTFPHC) Procedure Manual [32] or as methods were updated by the CTFPHC. The protocols for these SRs have been published with PROSPERO (CRD42017049993,
Table 1 Key questions
Key question: Question
1a: In adults (≥ 18 years) with chronic gastroesophageal reflux disease (GERD)^a with or without other risk factors^b, what is the effectiveness (benefits and harms) of screening for esophageal adenocarcinoma (EAC) and precancerous conditions (Barrett’s Esophagus (BE) and low- and high-grade dysplasia)? What are the effects in relevant subgroup populations?
1b: If there is evidence of effectiveness^c, what is the optimal time to initiate and to end screening, and what is the optimal screening interval (includes single and multiple tests and ongoing ‘surveillance’)?
2: In adults with chronic GERD with or without other risk factors^b, who have been offered, received, or allocated to receive screening for EAC and precancerous conditions (BE and low- and high-grade dysplasia), how do they weigh the benefits and harms of screening, and what factors contribute to these preferences and to their decisions to undergo screening?
3: What is the effectiveness (benefits and harms) of treatment for stage 1 EAC and precancerous conditions (BE and low- and high-grade dysplasia) in adults?
^a As defined by study authors
^b Risk factors will be deemed so by included studies
^c If there is evidence of at least moderate certainty of evidence of benefit, according to GRADE
CRD42017050014, CRD42018084825) and are available on the CTFPHC website (https://canadiantaskforce.ca/).
These reviews are reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [33] (Additional file 2) and include a PRISMA flow diagram for each key question. We also used AMSTAR (A Measurement Tool to Assess the Methodological Quality of Systematic Reviews) for additional quality control [34]. Any amendments made to the protocols when conducting the reviews have been outlined in Additional file 3.
Analytic frameworks
The analytic framework for these reviews is presented in Fig. 1.
Inclusion and exclusion criteria
Table 2 presents the eligibility criteria for each KQ, using the PICOTS framework.
Literature search
All search strategies (Additional file 4) were developed and tested through an iterative process by an experienced medical information specialist in consultation with the review teams. In addition, the search strategy for the MEDLINE database was peer-reviewed by another experienced librarian using the Peer Review of Electronic Search Strategies (PRESS) checklist [35] (Additional file 5). Table 3 presents an overall description of the searching for each KQ.
Study selection
For each KQ, duplicates across searches were identified and removed using Reference Manager [36]. The remaining articles were uploaded into Distiller Systematic Review (DistillerSR) Software© [37] for title and abstract screening and full-text screening of the remaining potentially relevant articles.
Reviewers performed a pilot testing phase of randomly selected titles and abstracts (n = 50) and potentially relevant full-text articles (n = 25) prior to commencing
Fig. 1 Guideline analytic framework
Table 2 Population, interventions, comparisons, outcomes, timeframe, study design (PICOTS)

Population
  Inclusion
    Key question 1: Adults (≥ 18 years old)^a with chronic gastroesophageal reflux disease (GERD)^b with or without other risk factors^c for esophageal adenocarcinoma (EAC).
    Key question 2: Adults (≥ 18 years old)^a with chronic GERD with or without other risk factors^c for EAC who have been offered, received, or allocated to receive screening, depending on the design of the study.
    Key question 3: Adults (≥ 18 years old)^a with stage 1 EAC, Barrett’s Esophagus (BE) or low- or high-grade dysplasia, with or without chronic GERD as defined in the systematic reviews (SRs)^d
  Exclusion
    Key questions 1 and 2:
      - Experiencing alarm symptoms for EAC: dysphagia, recurrent vomiting, anorexia, weight loss, gastrointestinal bleeding or other symptoms identified by authors as ‘alarm’.
      - Diagnosed with other gastro-esophageal conditions (e.g. gastric cancer, esophageal atresia, other life threatening esophageal conditions) or pre-existing disease (BE, dysplasia, or EAC).
    Key question 3: Those diagnosed with other gastro-esophageal conditions (e.g. gastric cancer, esophageal atresia, and other life-threatening esophageal conditions).

Intervention/comparator
  Inclusion
    Key question 1:
      KQ1a:
      - Screening versus no screening
      - One screening modality versus another screening modality
      All screening modalities for BE, dysplasia or EAC will be included, such as esophagogastroduodenoscopy (EGD)^e,f, EGD^f plus adjunct techniques^g, transnasal endoscopy, cytologic examination
      KQ1b:
      - One screening modality vs. another screening modality
      - One interval of screening vs. another interval of screening
      - Timepoint at which to initiate screening vs. another timepoint
      - Timepoint at which to cease screening vs. another timepoint
    Key question 2: Screening for EAC and other precancerous lesions with any screening modality. Depending on study design, comparators may be:
      - No screening^h
      - Different screening modality
      - Different screening intervals
      - Different lengths/duration of screening
      - Offered screening but did not receive screening
      - No comparison
    Key question 3: Management/treatment for stage 1 EAC, low- or high-grade dysplasia or BE including:
      - Pharmacological therapies^i
      - Surveillance methods such as: EGD^e,f plus biopsy, EGD^f plus biopsy plus adjunct techniques^j
      - Endoscopic or Endoscopic Assisted therapies^k
      - Surgery, including fundoplication and esophagectomy
      Comparator: No management/treatment compared to another management/treatment regimen, or a combination of management/treatment strategies.
  Exclusion
    Any follow-up diagnostic tests, such as 24-h esophageal pH test or any test for staging purposes, such as computerized tomography and magnetic resonance imaging.

Outcomes
  Inclusion
    Key question 1:
      Critical for decision-making
      1. Mortality, all-cause and EAC-related (1, 5 and 10 year or as available)^l,m
      2. Survival (1, 5 and 10 year or as available)^l
      3. Life threatening, severe, or medically significant consequences (such as requiring hospitalization or prolongation of hospitalization; disabling (limiting self-care or activities of daily living))
      Important for decision-making
      4. Incidence of EAC (by stage), BE, low- and high-grade dysplasia^m
      5. Quality of life (validated scales only; e.g. SF-36, WHOQUAL)
      6. Psychological effects (e.g. anxiety and depression)
      7. Major or minor medical procedures^m
      8. Overdiagnosis^n
    Key question 2:
      1. How patients weigh the benefits and harms of screening (e.g. ranking/rating of benefits and harms outcomes)
      2. Willingness to be screened
      3. Uptake of screening
      4. Factors considered in decision to be screened: what components/outcomes of screening do patients place more value on when deciding whether to be screened or not (e.g. potential complications resulting from screening)
      5. Intrusiveness of the screening modality
    Key question 3:
      Critical for decision-making
      1. Mortality, all-cause and EAC-related (1, 5 and 10 years, or as available)^l
      2. Survival (1, 5 and 10 years, or as available)^l
      3. Progression from non-dysplastic BE to BE with dysplasia, progression from low-grade to high-grade dysplasia, progression to EAC
      4. Life threatening, severe, or medically significant consequences (such as requiring hospitalization or prolongation of hospitalization; disabling (limiting self-care or activities of daily living))
      Important for decision-making
      5. Quality of life (validated scales only; e.g. SF-36, WHOQUAL)
      6. Major or minor medical procedures
      7. Psychological effects (e.g., anxiety, stress)
      8. Overtreatment
      Post-hoc outcomes:
      9. Complete eradication of: intestinal metaplasia/BE, dysplasia, high-grade dysplasia, neoplasia
      10. Reduction/regression of BE: in length (cm), in area (%)
      11. Treatment Failure (no ablation)
      12. EAC recurrence

Timing
  Key question 1: No limits
  Key question 2: No limits
  Key question 3: No limits

Setting
  Key question 1: Settings were limited to primary care or settings in which a primary care physician could refer a patient for esophageal screening.
  Key question 2: Primary care or other settings generalizable to primary care.
  Key question 3: Any setting.

Study designs
  Inclusion
    Key question 1: Randomized controlled trials (RCTs), including cluster RCTs. If no or few RCTs (i.e. < 5 trials) are available: Non-RCT, controlled before-after, interrupted times series, cohort studies, case-control studies, limiting to higher levels of evidence depending on the nature and volume of specific study designs. If no or few RCTs are available for the overdiagnosis outcome, ecological and cohort studies will be considered for all outcomes used for the judgement of overdiagnosis.
    Key question 2: Randomized controlled trials. If insufficient data exists: Controlled clinical trials, controlled before-after, case-controls, cohort, interrupted time series (ITS), and cross-sectional (e.g. surveys). If insufficient data exists for the above: Qualitative studies and mixed-methods studies
    Key question 3: Systematic reviews of RCTs^o. To be defined as a SR, a review must have met all four of the following criteria: (1) searched at least one database; (2) reported its selection criteria; (3) conducted quality or risk of bias assessment on included studies; and (4) provided a list and synthesis of included studies. SRs that identified observational studies were included if results from RCTs were provided separately.
  Exclusion
    Key question 1: Cross-sectional studies, case series, case reports, and other publication types (editorials, commentaries, notes, letter, opinions).
    Key question 2: Commentaries, opinion, editorials and reviews
    Key question 3: SRs that combine results from RCTs with non-RCTs, controlled before-after, interrupted times series, cohort studies, case-control studies, cross-sectional studies, case series, case reports and other publication types (editorials, commentaries, notes, letter, opinions) or SRs that only include non-RCT and observational studies.

Language
  No language restrictions in the search; however, only English and French articles will be included at full-text.

Databases
  Key question 1: MEDLINE, Embase, Cochrane Library
  Key question 2: MEDLINE, Embase, CINAHL, Cochrane Library
  Key question 3: MEDLINE, Embase, Cochrane Library (CDSR, DARE, HTA)

^a Studies addressing both adults and children, if data provided for adults are reported separately
^b Chronic GERD, as defined by study authors
^c Risk factors will be as deemed so by included studies
^d We did not use a predefined method for diagnosis (e.g. histopathological exams, ICD code) and relied on how it was defined in the SRs
^e Also known as panendoscopy and upper GI endoscopy
^f With or without biopsy protocol
^g For example, chromoendoscopy and narrow-band imaging
^h Although we will consider comparative studies that include a no screening arm, we understand that the outcomes of interest do not apply to people who do not receive or have not been offered screening. For such studies, we will only consider data for those who receive or are offered screening
^i Such as PPI, H2 receptor antagonists, Cox-2 inhibitors, Prokinetics and antacids, NSAIDs
^j Such as high-definition/high-resolution white light endoscopy, chromoendoscopy, electronic chromoendoscopy, autofluorescence imaging, confocal laser endomicroscopy, light scattering spectroscopy, diffuse reflectance spectroscopy
^k Such as ablative techniques (thermal or chemical), and mechanical methods (EMR, ESD or combined options)
^l From the time of allocation to screening or control arm
^m These outcomes will be used to judge the extent of overdiagnosis, which is defined as the diagnosis of disease which would never have become clinically apparent in a person's lifetime (i.e., causing neither symptoms nor death)
^n As judged by the study author or will be judged by the CTFPHC working group using information provided by authors, where available
^o Systematic reviews that combine RCT and non-RCTs will be included if results for RCTs are provided separately from non-RCT studies
broad screening. Screening forms can be found in Additional file 7. Titles and abstracts were independently screened for relevance by two reviewers, using the liberal accelerated method, which requires one reviewer to include a reference for further assessment at full-text and two reviewers to exclude it [38]. References were reviewed in random order, with each reviewer unaware if the reference had already been assessed and excluded by the other reviewer. Subsequently, full-texts were retrieved and two reviewers independently assessed each article for relevance. Conflicts at full-text were resolved by consensus or a third team member. Articles not available for download were ordered from the library through interlibrary loans. Those that were not received within 30 days were excluded and labelled accordingly. For articles with abstracts only, a search was performed to locate any full-text publications.
Where chronic GERD was not defined in a study (KQ1 and KQ2), we attempted to contact the study authors twice over 2 weeks by email to obtain more information. If authors did not respond, and the lack of definition for chronic GERD was the only reason for possible exclusion, we included the study. Reports in abstract form and protocols were coded as such and excluded, but a search was completed to see if the full-text was available. Those that were not available as full-texts were excluded, and studies available only in abstract form are included in the list of excluded studies (Additional file 8).
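The liberal accelerated rule described above can be written as a small decision function. The following is a minimal sketch for illustration only, not the DistillerSR workflow used by the reviewers; the function name, the vote labels and the 'pending' state are assumptions introduced here.

```python
from typing import Optional

VALID_VOTES = ("include", "exclude", None)


def liberal_accelerated_decision(vote_a: Optional[str], vote_b: Optional[str]) -> str:
    """Classify one title/abstract record under the liberal accelerated rule.

    Each vote is 'include', 'exclude' or None (reviewer has not assessed it yet).
    Hypothetical labels: the return value is 'full_text', 'excluded' or 'pending'.
    """
    votes = (vote_a, vote_b)
    for vote in votes:
        if vote not in VALID_VOTES:
            raise ValueError(f"unexpected vote: {vote!r}")
    # A single 'include' is enough to send the record to full-text review.
    if "include" in votes:
        return "full_text"
    # Exclusion requires both reviewers to vote 'exclude'.
    if votes == ("exclude", "exclude"):
        return "excluded"
    # Otherwise at least one assessment is still outstanding.
    return "pending"
```

Under this sketch, a record with one 'exclude' vote stays pending until the second reviewer has assessed it, which mirrors the requirement that two reviewers agree before a reference is excluded.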
Data extraction and management
For all KQs, full data extraction was completed by one reviewer using a form developed a priori and 100% of these were verified by a second reviewer (Additional file 9). Any disagreements were resolved by consensus or, if needed, with a third reviewer. For KQ1 and KQ2, where information was unclear or missing, authors were contacted by email twice over 2 weeks. If no response was received and the information affected the ability for quantitative analysis, the study was analyzed narratively. For KQ3, data were extracted as they were synthesized and/or reported in the included reviews. No additional information from the primary studies was extracted or assessed and quality control was not performed to verify the accuracy of the reviews’ data on the included studies.
Risk of bias and quality assessment
For KQ1 and KQ2, all included studies were assessed for the risk of bias (RoB) by one reviewer, with verification completed by a second reviewer. The Cochrane RoB tool [39] was used to evaluate the RoB in RCTs and the Newcastle-Ottawa scale (NOS) [40] was used to evaluate the RoB in cohort studies. For KQ3, the quality of the included SRs was assessed using the AMSTAR measurement tool [41]. Two reviewers assessed the quality of each included SR independently. Any discrepancies were resolved through discussion and, if needed, a third reviewer. We used the AMSTAR 2 [42] approach to determine the final assessments of quality of conduct, including consideration of four critical domains, and categorized the quality as high, moderate, low or critically low, using the criteria described in Additional file 10. For all assessments, disagreements were resolved by consensus or third party adjudication.
Analysis
For all KQs, characteristics of the included studies/reviews are presented in tables and summarised
Table 3 Searching for studies

Searches^a,b
  Key question 1: Additional file 4. KQ1 searches
  Key question 2: Additional file 4. KQ2 searches
  Key question 3: Additional file 4. KQ3 searches

Databases
  Key question 1: OVID MEDLINE®; OVID MEDLINE® Epub Ahead of Print, In-Process and Other Non-Indexed Citations; Embase Classic + Embase; Cochrane Library on Wiley
  Key question 2: Same as KQ1 plus CINAHL using the EBSCO platform
  Key question 3: Same as KQ1

Date run
  From the inception date on October 29–30, 2018.

Controlled vocabulary examples^c
  Key question 1: Gastroesophageal reflux, esophageal neoplasms, endoscopy
  Key question 2: Gastroesophageal reflux, patient acceptance of health care, informed consent
  Key question 3: Barrett esophagus, esophageal neoplasms, meta analysis

Keywords examples^c
  Key question 1: GERD, esophageal cancer, esophagoscopy
  Key question 2: GERD, patient perspective, informed decision-making
  Key question 3: Barrett’s dysplasia, esophageal cancer, systematic review

Grey literature
  Key questions 1 and 2: CADTH Grey Matters, websites listed in Additional file 6, bibliographies of relevant systematic reviews and clinical practice guidelines identified from the search strategies and grey literature searching.
  Key question 3: CADTH Grey Matters plus additional references listed in Additional file 6.

^a When possible, animal-only and opinion-pieces were removed from the results
^b The search strategies were peer-reviewed using PRESS 2015 [35] and can be found in Additional file 5
^c Vocabulary and syntax adjusted across databases, as required
narratively. For KQ1, the results are presented in evidence sets 1 to 8 (Additional file 11), with associated forest plots, where applicable. For KQ2, due to the nature of the data, a meta-analysis of outcomes was not appropriate; however, narrative results are presented. For KQ3, the results presented in evidence sets 1–11 (Additional file 12) may omit some results due to overlap. In the case of overlap where outcome data were the same in multiple reviews, the review with the highest methodological quality or with the most complete outcome data was included; the additional reviews are listed in Additional file 12: Table 1 and mentioned in the Notes column within the evidence sets. For KQ3, odds ratios (OR) were commonly used in SRs and absolute risk differences (ARDs) were calculated accordingly. Where SR authors did not provide an OR, a relative risk (RR) was calculated based on the results and the ARD was calculated based on the RR. In instances where the RR did not approximate the OR reported in the SR, we inserted the RR in the notes column in the evidence set; however, the ARDs were calculated based on the OR. We determined the extent of overlap of evidence across reviews by outcome for each comparison using the corrected covered area (CCA) method [43].
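The article does not state the formulas used for these calculations, so the following is a minimal sketch under stated assumptions. It uses the standard conversions from an OR or RR to an ARD given an assumed control-group (baseline) risk, and the usual CCA definition, (N − r) / (r × c − r), where N is the number of included publications counted once per review, r is the number of unique primary publications and c is the number of reviews. All function names and the example inputs are hypothetical.

```python
from typing import Sequence


def ard_from_or(odds_ratio: float, baseline_risk: float) -> float:
    """Absolute risk difference implied by an odds ratio at an assumed baseline risk."""
    if not 0 < baseline_risk < 1:
        raise ValueError("baseline_risk must lie strictly between 0 and 1")
    intervention_odds = odds_ratio * baseline_risk / (1 - baseline_risk)
    intervention_risk = intervention_odds / (1 + intervention_odds)
    return intervention_risk - baseline_risk


def ard_from_rr(risk_ratio: float, baseline_risk: float) -> float:
    """Absolute risk difference implied by a risk ratio at an assumed baseline risk."""
    return baseline_risk * (risk_ratio - 1)


def corrected_covered_area(inclusion: Sequence[Sequence[bool]]) -> float:
    """CCA for a citation matrix: rows are unique primary publications,
    columns are reviews, and a truthy cell means the review includes the publication."""
    r = len(inclusion)          # unique primary publications
    c = len(inclusion[0])       # reviews
    if c < 2:
        raise ValueError("overlap needs at least two reviews")
    n = sum(1 for row in inclusion for cell in row if cell)  # inclusions, duplicates counted
    return (n - r) / (r * c - r)
```

As a usage example with made-up numbers, `ard_from_or(2.0, 0.10)` gives roughly 0.082 (about 82 more events per 1000), whereas `ard_from_rr(2.0, 0.10)` gives 0.10, which illustrates why the text distinguishes cases where the RR does not approximate the OR. A CCA of 0 means no primary publication appears in more than one review; a CCA of 1 means every review includes every publication.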
Meta-analysis
For KQ1, raw data were extracted from all articles, when available. Raw data were entered into Review Manager Software version 5.3 [44] and hazard ratios (HR) were produced for the survival outcome and risk ratios (RR) were calculated for all other outcomes.
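For readers unfamiliar with the measure, a risk ratio and its 95% confidence interval for a single study can be computed from raw counts as sketched below. This is the standard log-scale (Wald) calculation, offered as an illustration under assumptions; it is not Review Manager's code, and the function name and the counts in the usage note are hypothetical.

```python
import math
from typing import Tuple


def risk_ratio_with_ci(events_a: int, total_a: int,
                       events_b: int, total_b: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Return (RR, lower, upper) for group A versus group B.

    The interval is computed on the log scale; z = 1.96 gives an
    approximate 95% confidence interval. Zero event counts are not
    handled here and would raise ZeroDivisionError or ValueError.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

For example, with 10 events among 100 participants in one arm and 20 among 100 in the other (invented counts), the function returns an RR of 0.5 with an interval that just crosses 1, so the difference would not be statistically significant at the 5% level.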
Subgroup analysis
A priori-defined subgroup analysis (KQ1) variables included age, sex, body mass index (BMI), smoking history, duration of chronic GERD, definition of chronic GERD, groupings of risk factors and various ethnic groups. Reporting did not allow for these to be undertaken.
Sensitivity analysis
Sensitivity analyses were planned to restrict the analysis to studies judged to be at low risk of bias overall (KQ1), to address any decisions made regarding the handling of data or to explore statistical heterogeneity (KQ1), and to examine the timing of publication (KQ1 and KQ2). However, only two studies were considered low risk of bias and therefore sensitivity analysis was not undertaken.
Small study effects
For KQ1 and KQ2, to assess for small study effects, a combination of graphical aids (e.g. funnel plot) and/or statistical tests (e.g. Egger regression test, Hedges-Olkin) were planned if at least ten studies were available in any given analysis. This analysis was not undertaken.
Rating the certainty of the evidence
For each critical and important outcome, the GRADE framework [32, 45] was used to assess the strength and certainty of the evidence. We followed the GRADE guidance for determining the extent of the risk of bias for the body of evidence [46]. The online software GRADEpro GDT (https://gradepro.org/) was used for the GRADE assessments. Assessment of each GRADE domain (study limitations (i.e. risk of bias), indirectness, inconsistency, imprecision and other considerations (i.e. publication bias and comprehensiveness of the search)) was presented, where possible, with the information provided in the studies. If there was missing information, a narrative description was provided. The certainty of the evidence for each outcome, in each study/review, was rated by one reviewer and verified by a second reviewer. Any discrepancies were resolved through consensus.
As KQ3 is an overview, and there are no published methods for performing GRADE for overviews of reviews, we have used the five domains listed above as a guide. As none of the included reviews used GRADE to evaluate the body of evidence, we performed these assessments using the reported information in the reviews and did not access the primary studies for any additional information, as was pre-specified in the protocol. When undertaking domain assessments, we considered an approach with sufficient face validity to align with GRADE guidance. We have elaborated on considerations and decisions in Additional file 13. As with existing GRADE guidance, each GRADE domain was judged as possessing no serious limitations (no rating down), serious limitations (rating down by one) or very serious limitations (rating down by two).
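The rating-down step can be illustrated with a short sketch. It assumes the conventional four GRADE levels and a starting level chosen by the user (by convention, bodies of randomized evidence start at high); the function name, domain labels and example values are hypothetical, and the actual judgements in these reviews were made by reviewers rather than by a formula.

```python
from typing import Dict

# Certainty levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_certainty(start: str, downgrades: Dict[str, int]) -> str:
    """Apply per-domain rating down to a starting certainty level.

    downgrades maps a domain name to 0 (no serious limitations),
    1 (serious limitations) or 2 (very serious limitations).
    The result never falls below 'very low'.
    """
    if start not in LEVELS:
        raise ValueError(f"unknown starting level: {start!r}")
    for domain, steps in downgrades.items():
        if steps not in (0, 1, 2):
            raise ValueError(f"invalid rating for {domain}: {steps!r}")
    total_steps = sum(downgrades.values())
    return LEVELS[max(0, LEVELS.index(start) - total_steps)]
```

For instance, evidence that starts at high and is rated down by one for risk of bias and by one for imprecision would end at low under this sketch.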
Results
Table 4 provides a summary of the literature search results and Fig. 2a–c shows the PRISMA flow diagrams for each KQ. Study characteristics and population demographics for each key question are presented in Additional file 14 and overall RoB/quality assessment for included studies and reviews are presented in Additional file 15. Additional files 11, 16, 12 provide the evidence set results, narrative results, GRADE evidence profiles and GRADE summary of findings tables for KQ1, KQ2 and KQ3, respectively. The results presented herein provide a high level overview of the results. For additional details of the individual studies and reviews within each section, the full SRs can be found on the CTFPHC website (www.canadiantaskforce.ca). Additional file 8 provides a list of excluded studies at full-text, with reasons for each KQ. A list of ongoing studies for all KQs is provided in Additional file 17.

Table 4 Summary of studies/reviews

| | Key question 1 | Key question 2 | Key question 3 |
|---|---|---|---|
| Literature search (PRISMA flow diagrams in Fig. 2a–c) | | | |
| Initial search results | 7,292 | 1,614 | 4,374 |
| Deduplication, grey lit and suppl. searching^a | 4,384 (evaluated at title and abstract) | 1,600 | 3,761 |
| Evaluated at full-text | 1645 | 103 | 1007 |
| # of included studies | 10 (6 RCTs, 1 randomized cross-over trial, 1 prospective cohort, 2 retrospective cohort) | 3 (2 RCTs, 1 cohort) | 11 SRs (10 reporting results) which included 25 articles reporting results of RCTs (1 to 16 per review) (Figs. 3 and 4) |
| Study characteristics (Full tables in Additional file 14: Tables S1-S3) | | | |
| Comparisons | Screening in the last 5 years with EGD vs no screening [88, 90]; Screening modality (e.g. conventional EGD) vs. screening modality (transnasal esophagoscopy) [80, 81, 83, 85–87]; Biopsy method (e.g. Four-quadrant random) vs biopsy method (chromoendoscopy) [82, 84] | Transnasal esophagoscopy vs. Video capsule esophagoscopy; Transnasal-EGD vs. Peroral-EGD; Peroral-EGD and sedated EGD | Celecoxib vs. Placebo; Omeprazole vs. Histamine Type 2 Receptor Antagonists; PDT + Omeprazole vs. Omeprazole; Anti-reflux surgery (Nissen fundoplication) + APC vs Anti-reflux surgery (Nissen fundoplication) + Surveillance (endoscopic); Radiofrequency ablation + Proton pump inhibitor vs. Proton pump inhibitor; Anti-reflux surgery (Nissen fundoplication) vs. H2RA/Omeprazole; PDT using 5-ALA vs. PDT using Photofrin; PDT with different treatment parameters; RFA vs Surveillance (endoscopic); APC + PPI vs. MPEC + PPI; PDT vs. APC + PPI; Endoscopic mucosal resection vs. RFA |
| Country of conduct | 8 USA; 1 India; 1 Japan | 3 USA | 1 Brazil; 1 China; 5 UK; 4 USA |
| Years published | 1999–2018 | 1998, 1999, 2014 | 2008–2018 |
| Study size | 20 to 92 participants per screening modality; 60–378 participants (RCTs); 1580 participants (prospective cohort); 153 and 155 participants (retrospective cohorts) | 62, 105 and 1210 participants | 9 to 208 participants across SRs (most with <100 participants); One SR reported on an ongoing RCT with no results [108] |
| Population demographics (Full tables in Additional file 14) | | These were not reported in the included studies. | Many of these were not reported across reviews. |
| Sex | Men: 42–99% | | |
| Ethnicity | White: 41–99%^b | | |
| Mean age | Mean age: 48–67 years old | | |
| Smokers | 43%, 80% | | |
| PPI use | 17%, 48%, 78% | | |
| BMI | 29.0 to 31.4^c | | |
| Outcomes not reported | All-cause or cause-specific mortality; Quality of life; Major or minor medical procedures; Overdiagnosis | How patients weigh the benefits and harms of screening; Factors considered in decision to be screened; Intrusiveness of the screening modality | EAC-related mortality; Quality of life; Overtreatment |
| Overall RoB | 2 studies were low RoB for histologically confirmed BE; Overall, outcomes across comparisons were at moderate or high RoB for both RCTs and observational studies (Additional file 15: Tables S1 and 2) | 1 study rated as high RoB (Additional file 15: Table S3) | 2 SRs were rated as low quality and 8 were critically low (Additional file 15: Table S4); Multiple tools used to evaluate the primary RCTs included in the SRs, with the majority rated as unclear or high RoB (Additional file 15: Table S5) |
| Overall GRADE | All outcomes for all comparisons were rated as very low certainty. | GRADE was not performed for this KQ. | Evidence for most outcomes was rated as very low certainty or were given a range of ‘very low to low’ (description of ranges in Additional file 13). |

APC: Argon Plasma Coagulation; MPEC: Multipolar electrocoagulation; PDT: Photodynamic Therapy; PPI: Proton Pump Inhibitor; RoB: risk of bias
^a Bibliography search, search for full-text articles based on abstracts and protocols
^b Reported in five studies
^c Reported in four studies
^a Including double counting

Fig. 2 a PRISMA flow diagram for KQ 1. b PRISMA flow diagram for KQ 2. c PRISMA flow diagram for KQ 3
Key question 1. Effectiveness of screening
Detailed characteristics tables for the ten included studies can be found in Additional file 14: Table 1, and results are described herein. The certainty of the evidence to answer KQ1a was very low; therefore, KQ1b was not addressed.
EGD versus no prior EGD
Two retrospective cohort studies by Rubenstein 2008 [47] and Hammad 2019 [48] studied a group of individuals with EAC and evaluated their electronic medical records or the institutional cancer registry to examine if they had a standard sedated esophagogastroduodenoscopy (EGD) in the five years prior to cancer diagnosis or not (Additional file 11: Evidence Set 1). In Rubenstein 2008, survival data, reported using a Kaplan-Meier curve, showed no difference between survival rates at year 1 and 10 [47]. Authors report that there was no difference in long-term survival (approximately 6 to 12 years) between those who had received a prior EGD and those who had not (adjusted HR 0.93, 95% confidence interval (CI) 0.58 to 1.50) [very low certainty]. It was difficult to determine a range of effects across studies for survival analyses as the Hammad 2019 study only had one eligible patient with a prior EGD in the past 5 years.
Both Rubenstein et al. [47] and Hammad 2019 [48] reported information to evaluate whether an EGD in the previous five years influenced the incidence of EAC by stage of diagnosis at time of detection. It was difficult to determine a range of effects across studies for most stage-based analyses as one study only had one eligible patient with a prior EGD and the stage of diagnosis was unknown (author correspondence) [48]. Rubenstein et al. [47] reported that there may be higher odds of a stage 1 diagnosis than of a more advanced diagnosis (stages 2–4) (OR 2.77, 95% CI 1.00 to 7.67; p = 0.0497; Forest Plot 1.1) [very low certainty].
EGD versus TNE
Four studies evaluated EGD (sedated) compared to unsedated transnasal esophagoscopy (TNE) (RCTs by Chang 2011 [49] and Sami 2015 [50]; a randomised crossover study by Jobe 2006 [51]; one cohort study by Mori 2010 [52]) (Additional file 11: Evidence Set 2). Sami 2015 [50] evaluated safety, defined as serious adverse events (life-threatening, severe or medically significant consequences of screening), and reported no serious adverse events in either group [very low certainty].
Jobe et al. [51] reported on incidence of EAC only in those who were receiving initial screening (i.e. excluding those who were being followed with BE). There were no cases of EAC reported [very low certainty]. Three studies [49, 50, 52] defined incidence of endoscopically suspected BE differently. The RCTs showed no significant difference between screening modalities; RR 1.90, 95% CI 0.19 to 19.27 [49] and p = 0.37 [50] [very low certainty]. However, Mori 2010 [52] (cohort study) did show a significant difference, with those being screened with TNE having a higher incidence of suspected BE (RR 2.09, 95% CI 1.30 to 3.36; Forest Plot 2.1) [very low certainty]. Two studies reported no difference in incidence of histologically confirmed BE between screening modalities; p = 0.44 [50] and RR 0.89, 95% CI 0.59 to 1.33 [51] [very low certainty]. Incidence of dysplasia was low, with zero in Chang 2011 [49] and nine (EGD: 5; TNE: 4) in Jobe 2006 [51] showing no difference between screening modalities (RR 1.54, 95% CI 0.44 to 5.44; Forest Plot 2.2) [very low certainty].
Chang 2011 [49], Sami 2015 [50] and Jobe 2006 [51] used the same measurement tool to measure anxiety (psychological effects); however, the tool was applied at different times and the outcomes were reported differently (e.g. mean, median, level of severity). Therefore, no meta-analysis was performed. There was no difference in anxiety before the procedure (p = 0.084) [51] [very low certainty], but those who received EGD had less anxiety than those who received TNE during insertion (p = 0.0001) [51] [very low certainty] and during the procedure (p < 0.001 [50] and p = 0.0001 [51]) [very low certainty].
EGD versus video capsule esophagoscopy
One RCT by Chang 2011 [49] evaluated three outcomes, all with very low certainty (Additional file 11: Evidence Set 3). There was no difference in the incidence of endoscopically suspected BE between screening modalities (RR 0.57, 95% CI 0.11 to 3.01; Forest Plot 3.1). Participants with suspected BE based on video capsule esophagoscopy (VCE) (swallowed device) were offered EGD and BE was confirmed through biopsy. Of the three participants with suspected BE who received VCE, none were histologically confirmed cases of BE. There was also no incidence of dysplasia among either group.
EGD versus transoral-EGD
One cohort study by Mori 2010 [52] allowed participants to choose between three screening modalities (sedated EGD, unsedated TNE and unsedated transoral-EGD presented here) (Additional file 11: Evidence Set 4). Overall, there was no difference in the frequency, distribution or severity in the incidence of endoscopically suspected BE between modalities in those with grade 2 or 3 BE (RR 1.30, 95% CI 0.83 to 2.03; Forest Plot 4.1) [very low certainty].
TNE versus VCE
Two studies, Chak 2014 [53] and Chang 2011 [49], provided data on four outcomes (Additional file 11: Evidence Set 5). There was no difference between screening modalities for incidence of endoscopically suspected BE (RR 0.86, 95% CI 0.29 to 2.56; Forest Plot 5.1) [very low certainty] [49, 53] or for those with histologically confirmed BE (RR 0.62, 95% CI 0.15 to 2.52) [very low certainty] [53]. Chang 2011 [49] reported that there were no incidences of dysplasia with either screening modality [very low certainty].
Those in the unsedated TNE group experienced more anxiety, nervousness or worry (psychological effects) before the procedure than those in the swallowed VCE group (RR 2.28, 95% CI 1.33 to 3.88; Forest Plot 5.2) [53] [very low certainty], and more anxiety during the procedure (RR 2.14, 95% CI 1.22 to 3.77; Forest Plot 5.3) [53] [very low certainty].
Unsedated TNE versus unsedated transoral EGD
One RCT by Zaman 1999 [54] randomised participants with upper gastrointestinal (GI) symptoms. Mori 2010 [52] (cohort) included those who had previously been screened for upper intestinal tract disorders, and allowed participants to choose between three screening modalities (Additional file 11: Evidence Set 6). Only one complication (life-threatening, severe, or medically significant consequence) was reported (facial swelling followed by surgical exploration and full recovery), with no differences between screening modalities (RR 4.04, 95% CI 0.17 to 95.20; Forest Plot 6.1) [very low certainty] [54].
Zaman et al. [54] reported no difference between screening modalities in the incidence of endoscopically suspected BE (three cases total) (RR 0.68, 95% CI 0.07 to 7.09; Forest Plot 6.2) [very low certainty]. Mori et al. [52] reported a significant difference in the frequency of BE, with those screened with TNE less likely to have suspected BE (grade 2 or 3) compared to transoral EGD (RR 0.62, 95% CI 0.41 to 0.94; Forest Plot 6.3) [very low certainty].
Zaman et al. [54] evaluated the levels of anxiety before the procedure, during insertion, and during the procedure (psychological effects). Anxiety was assessed on a scale of 10 (higher score representing higher level of anxiety), with no significant difference between levels of anxiety at any time (Forest Plots 6.4-6.6) [very low certainty].
Random biopsy versus enhanced magnification-directed endoscopy biopsies (with acetic acid)
One RCT by Ferguson 2006 [55] included patients who received standard sedated EGD, with those with suspected BE randomised at that point to different biopsy methods (Additional file 11: Evidence Set 7). As all participants were evaluated on suspected BE through EGD, only incidence of histologically confirmed BE is reported. There was no difference in the incidence of histologically confirmed BE between different methods of biopsy. This was found in both those with pattern III and IV specialized intestinal metaplasia (RR 0.98, 95% CI 0.59 to 1.64; Forest Plot 7.1) [very low certainty] and among all specialized intestinal metaplasia pattern types (RR 1.14, 95% CI 0.71 to 1.82; Forest Plot 7.2) [very low certainty].
Random biopsy versus chromoendoscopy
One RCT by Wani 2014 [56] included participants who were given conventional EGD (n = 378); those with suspected BE were randomised to either random biopsy (n = 33) or chromoendoscopy (n = 23) (Additional file 11: Evidence Set 8). There was no difference in the number of participants with histologically confirmed BE between methods (RR 0.87; 95% CI 0.26 to 2.90; Forest Plot 8.1) [very low certainty].
Key question 2. Patient values and preferences
Three studies (Chak 2014 [53], Zaman 1999 [54] and Zaman 1998 [57]) provided information on reasons why participants were unwilling to be part of the study or reasons for deciding against the uptake of screening once allocated [53]. Objectives of the included studies were to determine the acceptance and tolerability of different screening modalities and provide data on screening results. Studies reported on those who refused participation prior to study commencement (i.e. either prior to being screened or prior to randomisation), but did not provide participant characteristics on this patient subset. A narrative summary of the results is provided herein, with detailed results in Additional file 16. No studies provided results on how patients weigh the benefits and harms of screening, factors considered in decision to be screened or intrusiveness of the screening modality.
Willingness to be screened
All three studies provided reasons why those asked had refused to be screened/participate in the study. A large proportion of these individuals were in one study [53], in which 1026 of the 1210 people asked did not participate and 184 agreed to participate. Among those who did not participate during the invitation period, 627 (52%) did not return the phone call or respond to the letter, 385 (32%) refused to participate (with no reason provided), 12 (1%) were ineligible and two (0.2%) did not participate because of difficulty getting to the hospital (percentages are of all 1210 people asked). The other two studies by Zaman et al. invited 105 outpatients in one study and 62 in the other. Zaman 1999 [54] reported 45 of 105 (43%) patients were unwilling to participate in the study comparing transnasal to peroral EGD. Zaman 1998 [57] reported 19 of 62 (31%) patients unwilling to participate in the study comparing peroral to sedated EGD.
The main reason for unwillingness to be screened in both studies was anxiety, with 17% (18/105) [54] and 19% (12/62) [57] of all those asked to participate reporting this. Both studies also reported fear of gagging as a reason, with 10% (10/105) [54] and 5% (3/62) [57] reporting this. Lastly, not being interested in the study (10/105, 10%) [54], not wishing to undergo a transnasal procedure (7/105, 7%) [54] and unwillingness to be a study subject (4/62, 6%) [57] were also reported.
Uptake of screening
Chak 2014 [53] reported seven individuals (of 184 randomised) who did not receive the allocated intervention after randomisation. Five people randomised to the TNE group did not receive the procedure because they wanted the capsule instead. Two people randomised to the VCE group did not receive the procedure because they were worried about the capsule getting stuck. There was no statistically significant difference in uptake between intervention groups (p = 0.25).
Key question 3. Treatment
The review characteristics of the 11 included SRs are shown in Additional file 14: Table 3. Additional file 12: Table 1 provides additional details of all primary studies included in each SR, and which treatment comparisons provided results in each SR. Additional file 12: Evidence Sets 1-11 provide detailed results and GRADE tables. Some of the individual trials were represented in more than one review since the reviews did not have mutually exclusive eligibility criteria (Figs. 3 and 4). Twenty-two sets of comparisons had overlapping data across reviews (Additional file 18). In most cases, included studies overlapped completely, according to corrected covered area (CCA) calculations. In a few cases, there was discordance among reviews. Throughout the Evidence Sets 1-11, the word “significance” refers to statistical significance unless stated otherwise.
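For readers unfamiliar with the corrected covered area, a minimal sketch of the commonly used definition follows: CCA = (N - r) / (r × c - r), where N is the total number of study inclusions across reviews (counting a study once per review that includes it), r is the number of unique primary studies and c is the number of reviews. The function name and the review and study labels below are hypothetical; the calculations actually performed for this overview are in Additional file 18.

```python
def corrected_covered_area(inclusion):
    """Corrected covered area for a set of reviews.

    inclusion : dict mapping a review label to the set of primary studies
                it includes for one comparison
    Returns a value from 0.0 (no overlap) to 1.0 (complete overlap).
    """
    c = len(inclusion)                       # number of reviews
    if c < 2:
        raise ValueError("overlap needs at least two reviews")
    unique = set().union(*inclusion.values())
    r = len(unique)                          # unique primary studies
    if r == 0:
        raise ValueError("no primary studies supplied")
    n_total = sum(len(studies) for studies in inclusion.values())  # with double counting
    return (n_total - r) / (r * c - r)


# Hypothetical labels: two reviews that include the same two trials.
complete = corrected_covered_area({
    "review_A": {"trial_1", "trial_2"},
    "review_B": {"trial_1", "trial_2"},
})
# Two reviews with no trials in common.
disjoint = corrected_covered_area({
    "review_A": {"trial_1", "trial_2"},
    "review_B": {"trial_3", "trial_4"},
})
```

A value of 1.0 corresponds to the situation the text describes as included studies overlapping completely.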
Celecoxib versus placebo
Rees 2010 [58] included one primary RCT [59] and reported no difference between the groups for all-cause mortality [low certainty] and progression to adenocarcinoma at one year [very low certainty] (three cases per group) (Additional file 12: Evidence Set 1.1). For all-cause mortality, there is discordant reporting within the review, where the text reports two deaths in the trial, but the forest plot reports three deaths in each group. Not presented in the results table but presented narratively in the SR, review authors stated that the primary trial authors did not report any statistical difference for the following outcomes: the area of BE segment at 12 months, and the reduction in the number of patients progressing from intestinal metaplasia to dysplasia between baseline and 1 year. In addition, review authors reported “no statistical difference in the number of patients” with complete eradication of dysplasia at 12 months, and with bleeding in each group.
Omeprazole versus histamine type 2 receptor antagonists
Rees 2010 [58] reported data from three primary studies [60–62], one of which was available only as an abstract [60]. The three studies had differences with regards to drug dosage and regimens (Additional file 14: Table 4). Results and GRADE ratings are presented in Additional file 12: Evidence Set 2.1. There was no difference in the reduction in length (cm) of BE at 12 months between the compared groups, and the pooled effect estimates for both the overall and subgroups (I² statistic = 62.6% and 60%, respectively) remained non-significant when the analysis was restricted to a subgroup who received a higher dose of omeprazole [very low certainty] [61, 62]. There was a small change in the reduction in area (%) of BE with omeprazole that was statistically significant at 12 months [very low to low certainty] [61, 62].
Photodynamic therapy + omeprazole versus omeprazole alone
Two unique trials [63, 64] (from three studies [63–65]), reported across four SRs [58, 66–68], reported on patients with BE. Overholt 2007 [63] provided 5-year follow-up data for progression to EAC, with Overholt 2005 [65] providing 2-year follow-up data for other outcomes for the same trial participants (Additional file 12: Evidence Set 3.1). Overholt 2005 [65] and Ackroyd 2000 [64] reported on all-cause mortality, using photodynamic therapy (PDT) with either 5-ALA or porfimer sodium, respectively. Overholt et al. reported no statistically significant difference between groups, but this was based on few observed events (n = 3) and Ackroyd et al. observed no deaths [very low certainty].
At both two (OR 0.38, 95% CI 0.18 to 0.77) [65] and five (RR 0.53, 95% CI 0.31 to 0.91) [63] years, there was statistically significantly lower progression to EAC with combined therapy than with omeprazole alone [very low to low certainty]. Progression from non-dysplastic to dysplastic BE was statistically significantly lower with combined therapy (n = 0) compared to the omeprazole group (n = 12) [very low certainty] [64].
Both reviews show higher eradication of dysplasia with combined therapy [very low to low certainty]; however, there were some data discrepancies between reviews [58, 67] for both studies [64, 65]. Li 2008 [67] provided data among those with HGD from the same studies as the eradication of dysplasia outcome. It is unclear why more participants experienced eradication of HGD than dysplasia in general, as the denominators are the same. There was higher eradication with PDT combined with omeprazole [very low to low certainty]. Overholt 2007 [63] reported that eradication of BE by 5 years was statistically significantly greater with combined therapy (OR 14.18, 95% CI 5.38 to 37.37) [very low to low certainty].
One study with 36 participants (reported in three reviews) reported on reduction/regression of BE using various measures [58, 67, 68]. Statistically significant reductions in both length and area of BE were observed with combined therapy [64] in two reviews [very low certainty] [58, 67]. Fayter et al. [68] provided results of evidence of regression (not further described), with a much higher percentage of those in the combined group experiencing regression (89% vs. 11%) [very low certainty].
There were fewer absolute treatment failures of BE with combined therapy [very low certainty] [64, 65].
Statistically significantly more strictures formed with combined therapy (49/138) compared to the omeprazole treatment group (0/70) in one study [very low to low certainty] [65].
Anti-reflux surgery + Argon plasma coagulation versus anti-reflux surgery + surveillance (endoscopic)
Three systematic reviews [58, 66, 67] reported data from a single trial with two publications [69, 70] on six outcomes (Additional file 12: Evidence Set 4.1). Nissen fundoplication was used for anti-reflux surgery. Ackroyd 2004 [70] was a short-term follow up of the patients, with longer-term follow up presented in Bright 2007 [69]. No patients progressed to cancer [very low certainty] [69]. Based on sparse events (two instances in the surveillance group) in Bright 2007 [69] (in Li 2008 [67]), no difference between the treatment effects was observed for progression to HGD (from LGD) [very low certainty]. Bright 2007 [69] provided 5-year follow-up data for progression from intestinal metaplasia to dysplasia, and reported no difference between the two groups (two cases of progression in the surveillance group) [very low certainty] [58, 69].
The effect estimate favoured Argon plasma coagulation (APC) [69] at 12 months for complete eradication of BE [very low certainty]. Note: the data presented in the forest plot differed from the data in the text [58, 69]. No difference was observed between the treatment groups for complete ablation (among those with histological change) [69] in Li 2008 [very low certainty]. Ackroyd 2004 [70] in De Souza 2014 [66] reported that no difference in treatment failure was observed between the compared groups [very low certainty].

Fig. 4 Map of Systematic Reviews and Primary RCTs
Radiofrequency ablation + proton pump inhibitor versus PPI alone
Three systematic reviews [58, 71, 72] reported data from Shaheen 2009 [73] (Additional file 12: Evidence Set 5.1). Rees et al. [58] included patients with both low- and high-grade dysplasia; however, Qumseya 2017 [71] and Pandey 2018 [72] restricted their reporting to patients with low-grade dysplasia. Five participants progressed to EAC at 5 years or at the latest timepoint of follow-up (RFA + PPI: 1/84; PPI: 4/43) [58], resulting in no difference between the compared treatments [low certainty]. Among those with LGD, none progressed to EAC over the follow-up period [low (Rees 2010) and very low certainty (Qumseya 2017)] [58, 71].
Fewer patients progressed to higher grades of dysplasia with the radiofrequency ablation (RFA) treatment [low certainty] [58]. However, there is a discrepancy in how this outcome is labelled in the review. The text says there were no data for those progressing from IM to dysplasia and labels it as progression to higher grades of dysplasia, but the forest plot is titled progression from IM to dysplasia. When the outcome was restricted to progression to HGD among patients with LGD, no difference was observed [very low certainty] [71, 72].
There was a statistically significant difference favouring RFA for complete clearance of intestinal metaplasia (RR 17.81, 95% CI 2.61–121.54) [very low certainty] [72], for complete clearance of dysplasia (OR 22.67, 95% CI 8.72 to 58.94) [58] [low certainty], which remained when restricted to patients with LGD (OR 0.03, 95% CI 0.01–0.13) [very low certainty] [72], and for complete eradication of BE (OR 143.53, 95% CI 18.53–1113.87) [low certainty] [58]. De Souza 2014 [66] showed a higher rate of treatment failure in the proton pump inhibitor (PPI) treatment group compared to the RFA + PPI group (RFA + PPI: 19/84; PPI: 42/43) [very low certainty].

Fig. 3 Primary studies and conditions overlap among the systematic reviews

There was no difference between treatment effects for stricture formation [58] [very low certainty]. There were no instances of perforation reported [72] [very low certainty], and only one study participant developed bleeding, but data were not presented per arm [72] [very low certainty].
Anti-reflux surgery (Nissen fundoplication) versus H2 receptor antagonist/omeprazole
Two systematic reviews [58, 67] reported data from Parrilla 2003 [74] on five outcomes. Overall, the certainty of the evidence was very low for all outcomes (Additional file 12: Evidence Set 6.1). No deaths (all-cause mortality) were reported in either group [58].
Few participants progressed to EAC, with two in each group (not statistically significant) [58]. Rees 2010 [58] reported a significant difference in the incidence of progression to dysplasia from intestinal metaplasia, with less progression in the surgical treatment group compared with the pharmacological treatment group. Although Li et al. [67] included the same primary study, the incidence in the surgery group differed from Rees et al., and demonstrated no significant difference between the groups [58, 67]. Because different data were reported for the intervention groups, this led to discordant results between reviews.
Although some participants experienced eradication of dysplasia (surgery: 5/58, H2 receptor antagonist/omeprazole: 3/43) at 5-year follow-up, this was not statistically different between treatment groups [58]. None of the participants experienced complete eradication of BE at 5 years in either treatment group [58].
PDT with 5-aminolevulinic acid versus PDT with porfimer sodium
MacKenzie 2008 [75] in Rees 2010 [58] reported preliminary data only in abstract form and recruitment had not yet been completed. The certainty of evidence was very low for both outcomes (Additional file 12: Evidence Set 7.1). There was no statistically significant difference in eradication of HGD between the treatment groups (preliminary results included 14 patients in each treatment group) [75].
These preliminary results showed no difference between treatment groups in stricture formation.
Photodynamic therapy with different treatment parameters
A SR by Fayter 2010 [68] with three primary studies [76–78], one of which was an abstract [76], compared different parameters in the PDT treatment. The certainty of the evidence was very low for cancer risk, and ranged from very low to low for the remaining four outcomes (Additional file 12: Evidence Set 7.2). Generally, higher doses and red light had lower cancer risk and lower rates of adenocarcinoma [76]. These results were considered significant, but were taken from an abstract, so should be interpreted with caution.
Radiofrequency ablation versus surveillance (endoscopic)
Phoa 2014 [79], reported in two systematic reviews [71, 72], included patients with BE with low-grade dysplasia. These reviews also included another primary study by Shaheen et al. [73]; however, results from this study are presented in Evidence Set 5.1 as another review [58] states that both treatment groups also received pharmacological therapy (Additional file 12: Evidence Set 8.1). There were seven people with progression to EAC (RFA: 1/68, Surveillance: 6/68) [very low certainty]. Progression per patient-year is also presented [very low certainty]. Qumseya 2017 [71] reported data as cumulative progression from LGD to HGD [very low certainty] and progression per patient-year [very low to low certainty]. Few events were observed (RFA: 0, Surveillance: 12). Pandey 2018 [72] demonstrated a marginally statistically significant result favouring RFA (RR 0.03, 95% CI 0.00 to 0.44) [very low to low certainty]. Although Pandey and Qumseya reported discrepant data for the surveillance group in the number of patients with progression to HGD, 18 and 12, respectively, effect estimates are similar between reviews.
RFA resulted in more patients with complete eradication of dysplasia (RR 3.52, 95% CI 2.40 to 5.17) [very low to low certainty] [72]. A favourable treatment effect was observed with RFA for complete eradication of intestinal metaplasia (RR 123.30, 95% CI 7.78 to 1954.10) [very low to low certainty] [72].
Eight strictures were formed among the study population; however, data were not reported per arm [very low to low certainty] [72]. None of the study patients developed perforations [very low to low certainty] [72], and only one study participant developed bleeding, but data were not reported per group [very low to low certainty] [72].
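To make the arithmetic behind effect estimates of this kind concrete, the sketch below computes a risk ratio and a Wald-type 95% confidence interval on the log scale from event counts. It uses the progression-to-EAC counts quoted above (RFA: 1/68, Surveillance: 6/68) purely as input; the resulting numbers illustrate the method and are not estimates reported by the reviews, which may have used different methods. The function name `risk_ratio` is hypothetical, and the formula needs at least one event in each group (a continuity correction, not shown, is commonly applied when a group has zero events).

```python
import math


def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of group A versus group B with a Wald CI on the log scale.

    Returns (rr, lower, upper). Requires events_a > 0 and events_b > 0.
    """
    if events_a <= 0 or events_b <= 0:
        raise ValueError("zero events: a continuity correction would be needed")
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts quoted in the text for progression to EAC.
rr, lower, upper = risk_ratio(1, 68, 6, 68)
```

With so few events the interval is wide, which is one reason imprecision contributes to the low certainty ratings throughout this section.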
Argon plasma coagulation + PPI versus multipolar electrocoagulation + PPI
Rees 2010 [58] reported on two primary studies (Additional file 12: Evidence Set 9.1) [80, 81], with no instances of mortality (all-cause) reported [very low to low certainty] and one case of stricture formation in the Argon plasma coagulation (APC) + PPI group [very low certainty].
Multipolar electrocoagulation + PPI versus Argon plasma coagulation + PPI
Two SRs [66, 67] reported the same two primary studies as Evidence Set 9.1; however, the intervention and comparison groups are reversed (Additional file 12: Evidence Set 9.2) [80, 81]. Both outcomes are presented as one review provided the pooled OR (OR 2.01, 95% CI 0.77 to 5.23) [very low certainty] for histological complete ablation of BE [67] and the other provided the pooled risk difference (RD −0.14, 95% CI −0.33 to 0.05) [very low certainty] for treatment failure (the opposite of complete ablation). Both favour multipolar electrocoagulation (MPEC) + PPI [66].
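An odds ratio above 1 for complete ablation and a risk difference below 0 for treatment failure can point in the same direction, because treatment failure is the complement of complete ablation. The short sketch below shows this with hypothetical counts that are not taken from the reviews or the primary studies; the function names are also assumptions.

```python
def odds_ratio(events_a, n_a, events_b, n_b):
    """Odds of the event in group A divided by the odds in group B."""
    return (events_a / (n_a - events_a)) / (events_b / (n_b - events_b))


def risk_difference(events_a, n_a, events_b, n_b):
    """Risk of the event in group A minus the risk in group B."""
    return events_a / n_a - events_b / n_b


# Hypothetical counts (illustration only): complete ablation per group.
n_mpec, ablated_mpec = 20, 15
n_apc, ablated_apc = 20, 10

# Outcome 1: complete ablation, expressed as an odds ratio (MPEC vs APC).
or_ablation = odds_ratio(ablated_mpec, n_mpec, ablated_apc, n_apc)

# Outcome 2: treatment failure (no ablation), expressed as a risk difference.
rd_failure = risk_difference(
    n_mpec - ablated_mpec, n_mpec, n_apc - ablated_apc, n_apc
)
```

Here the odds ratio for ablation is above 1 and the risk difference for failure is negative, and both favour the first group, which mirrors how the two reviews' differently expressed results can agree.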
Photodynamic therapy versus Argon plasma coagulation +PPIFive systematic reviews [58, 66–68, 82] reported on sixprimary studies [83–88] of which some were abstracts(e.g. Zoepf 2003 [87]) (Additional file 12: Evidence Set10.1). There were many differences between the SRs andthe primary studies within the SRs in how comparisongroups were reported, heterogeneity between therapytypes (e.g. PDT with 5-ALA or Porfimer sodium), differ-ences in their drug dosing and light delivery regimens[58] and differences in the participants who were in-cluded in the analyses (e.g. all levels of dysplasia or LGDonly). Rees 2010 [58] reported on three studies [84–86],with a combined incidence of all-cause mortality ofone in the PDT group and none in the APC + PPI group[very low certainty] [84].Almond 2014 [82] reported on three studies [84, 86,
88] in participants with LGD. One incident case of EAC by 12 months in the PDT group was reported [very low certainty]. Almond et al. [82] reported no events of progression to high-grade dysplasia among 17 participants [very low certainty] [84, 86].
Rees 2010 [58] and Almond 2014 [82] show discrepant
data for the PDT group in Ragunath et al. [86]. The number of patients experiencing complete eradication of dysplasia was reported as 10/13 in Rees 2010, and 8/11 in Almond 2014 [very low certainty]. As Almond et al. included only those with low-grade dysplasia, it might be that the two additional participants in Rees et al. had high-grade dysplasia, although this is not clearly reported.
Five SRs [58, 66–68, 82] reported on PDT versus APC + PPI and how it affected BE in five primary studies [83–87]. These reviews reported the outcomes in several ways: complete ablation of BE, eradication of BE, reduction of BE (length, surface reduction) and treatment failure (no ablation). Overall, there was a high level of heterogeneity among studies and in the results with very low certainty in all of these outcomes except the reduction in length (cm) which was rated as low certainty. Determining concordance of results across reviews was difficult due to the differences in how information was reported. Almond 2014 [82] reports on Ragunath 2005 [86], reporting no difference between treatments in eradication of intestinal metaplasia (two participants in each group) [very low certainty].
Both Rees 2010 [58] and Almond 2014 [82] reported on stricture, with Rees 2010 including three primary studies [84–86] and Almond 2014 only including Ragunath 2005 [86]. Although there was discordance in the number of those experiencing stricture, neither review reported any difference between treatment groups [very low certainty].
Endoscopic mucosal resection versus radiofrequency ablation
Three SRs [89–91] included patients with BE and intramucosal neoplasia (i.e. early stage adenocarcinoma). Although both Fujii-Lau et al. [90] and Chadwick et al. [89] include Shaheen 2011 [92] as an included study, because only one of the treatment groups was considered relevant for those reviews, neither reported the results from the placebo group. Therefore, results from Shaheen 2011 [92] are not presented (Additional file 12: Evidence Set 11.1). All three reviews provided results for both treatment groups for the primary study of van Vilsteren 2011 [93], although all three reviews also label the treatment groups differently (e.g. stepwise EMR vs. focal EMR + RFA, EMR vs. RFA, complete EMR vs. RFA). Both endoscopic mucosal resection (EMR) and radiofrequency ablation (RFA) eradicated neoplasia (eradication of cancer) in most cases (EMR: 100%; RFA: 96%), with no difference between treatments [very low certainty] [91]. Eradication of dysplasia was completed in almost all participants at the end of the treatment and at follow-up. Only one participant in the RFA group did not have complete eradication at the end of treatment and follow-up [very low certainty] [89]. Almost all participants experienced complete eradication of intestinal metaplasia, although there was slight discordance among the percentages reported in the two reviews [very low certainty] [89, 91].
Only one participant in the EMR treatment group experienced recurrence of cancer [very low certainty] [90], no participant experienced recurrence of dysplasia [low certainty] [90] and two participants in each treatment group experienced recurrence of intestinal metaplasia [very low certainty] [90].
Two SRs [89, 91] reported on bleeding, with some data
discrepancies, but overall concordant results. One SR [89] reported that among the 25 participants in the EMR group, only one participant experienced perforations. No one in the RFA group experienced this outcome. Most participants receiving EMR treatment experienced strictures (22 of 25, 88%) compared to only three of 22 (14%) in the RFA group. Review authors did not provide effect estimates, but a risk ratio of 6.45 (95% CI 2.23 to 18.66) for EMR compared to RFA was calculated using these data [91]. Almost all participants receiving EMR
experienced stenosis requiring treatment (88%, 22/25), with only three of 21 (14%) experiencing stenosis in the RFA group [89]. This difference was statistically significant with a calculated risk ratio of 6.45 (95% CI 2.23–18.65) for EMR compared with RFA. All of these adverse events were rated as very low certainty.
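To illustrate how a risk ratio of this kind can be derived from the reported counts, the following minimal Python sketch computes a risk ratio with a confidence interval on the log scale. The reviews do not state which interval method was used, so the large-sample log (Wald) method is an assumption here, and the function name `risk_ratio_ci` is hypothetical. The inputs are the stricture counts given in the text (22 of 25 for EMR, 3 of 22 for RFA).

```python
import math
from statistics import NormalDist


def risk_ratio_ci(events_a, total_a, events_b, total_b, level=0.95):
    """Risk ratio of group A versus group B with a log-scale Wald interval.

    Assumption: the interval method is not stated in the source; the
    standard large-sample log method is used here for illustration only.
    It is not defined when either group has zero events.
    """
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    rr = risk_a / risk_b
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    )
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts taken from the text: strictures in 22 of 25 (EMR) and 3 of 22 (RFA).
rr, lower, upper = risk_ratio_ci(22, 25, 3, 22)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Under these assumptions the sketch gives a risk ratio of about 6.45 with an interval of roughly 2.23 to 18.66, in line with the values quoted above; the last decimal place of the upper bound depends on rounding.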
Discussion
Esophageal cancer, although lower in incidence relative to other cancers, has a higher mortality rate, partly due to a more advanced stage at diagnosis, when the cancer is widely spread to other vital organs and is incurable. This makes the consideration of whether to invest in screening services important. In 2012, a Cochrane systematic review by Yang et al. [94] set out to include only RCTs comparing screening versus no screening, and found no studies meeting their inclusion criteria. Five years later, this systematic review found no additional randomised controlled trials comparing screening to no screening. Among the few studies that have assessed the effectiveness of screening of individuals with chronic GERD, there exist several limitations (e.g. small sample sizes, one-time screening test with no follow-up). Although there may be higher odds of stage 1 diagnosis if an EGD had been performed in the previous 5 years, the study included a small number of cases, resulting in low precision [47]. Those diagnosed at earlier stages (T1 and T2) can be treated with potentially curative therapies. For example, esophagectomy in patients with high-grade dysplasia and stage T1a cancer has been associated with greater survival: 89% at 1 year, 77% at 5 years and 68% at 10 years [95]. Comparatively, those with late stage cancer that cannot be cured by surgery receive chemotherapy/chemoradiation and have a 15% 5-year survival rate [2].
There was little difference in the incidence rates of
EAC, BE and dysplasia using alternative screening methods. Although EGD with biopsy is considered the gold standard for the diagnosis and surveillance of BE [96, 97], the results from these studies may encourage increased usage of alternative methods of screening for BE and EAC. Conventional EGD uses sedation, which increases the cost of screening (e.g. monitoring patients post-procedure) and resources used (e.g. availability of a gastroenterologist, recovery room). Alternate methods do not require sedation, can be done in a primary care setting and require little monitoring post-procedure. In studies where participants who had experienced a previous screening were then allowed to select which screening modality they wanted, there was a preference towards unsedated methods. Of the 1574 participants, 721 (46%) chose transnasal, 599 (38%) chose transoral and 254 (16%) chose EGD [52]. Further supporting patient choice of screening modality, RCTs reported higher
levels of dropouts and anxiety among those randomised to TNE compared to other screening modalities, although not always significant. The perceived discomfort of the unsedated transnasal procedure could contribute to increased anxiety.
When considering patient values and preferences for
screening, the data are also sparse. Three studies reported on the willingness, or in this case the unwillingness, to participate and be screened in a study on screening for EAC and precancerous conditions. One study also provided outcome information on uptake of screening, more specifically the reasons why participants did not take up screening after allocation. No other outcomes of interest were addressed in these studies, overall providing little evidence to answer KQ2. We are not aware of any other reviews in the area of upper GI screening that address how patients weigh the benefits and harms of screening and what factors contribute to these preferences and to their decision to undergo screening, so there is no comparable evidence against which to set our findings.
In our overview evaluating treatment for BE, with or
without dysplasia, and early-stage adenocarcinoma (KQ3), 11 SRs were included. Treatment modalities covered pharmacological therapy, various ablative techniques, surgery and some combinations thereof, with a mix of statistically significant and non-significant results, meaning that treatment may show an effect on some outcomes and little to no effect on others. However, there were few studies, all with small sample sizes by outcome, and for many outcomes, only one study provided results, thereby providing little information with which to gauge the certainty of the evidence. Based on consultation with clinical experts, together with evidence from retrospective and prospective clinical series (e.g. AIM trial [92]) and registry data, certain treatments are currently considered the standard of care. For example, BE with HGD should be treated with ablation and T1a esophageal cancer (EAC and ESCC) should be treated with endoscopic resection (either endoscopic mucosal resection or endoscopic submucosal resection).
Limitations
Both reviews and the overview of reviews were developed using rigorous methodological standards, as detailed a priori in registered protocols. There may, however, still be some limitations. There is a risk of missing studies, although we minimized this risk by searching multiple databases and using several techniques to search for grey literature. We included only English and French language studies, and some studies were excluded because we could not get access to the full text (i.e. not available through open access journals or through interlibrary loans). There is a chance that some of these records may have met the inclusion
criteria and provided additional results. In KQ3, most records (68%) were excluded during our screening phase due to not meeting the pre-defined SR definition [98]. Reasons for exclusion were mainly lack of quality assessment of primary studies and not a study design of interest (either a narrative review or clinical practice guideline based on a non-systematic literature review). Consequently, there is a chance that our conclusions may not be reflective of the totality of relevant, existing evidence. Updating the evidence base is an important research agenda item. Among those that did meet our pre-defined definition, some were excluded because they only included observational studies, or did not separate results of RCTs from observational studies.
When evaluating the results for the effectiveness of
screening (KQ1), given the very low certainty of the evidence, true effects may be substantially different or uncertain in light of limitations in the body of evidence. There were several important methodological limitations leading to a moderate or high risk of bias among all study outcomes. The small number of included studies and their generally small sample sizes lead to imprecise results that could not be assessed for consistency or publication bias. This trend may continue in this area, as half of the potentially relevant ongoing trials are expecting sample populations of less than 200 participants (Additional file 17). Blinding of participants to screening modality was not possible in these studies. The inability to blind patients could affect psychological outcomes, as a patient might have a preference for one screening modality over another. When evaluating the results for patient values and preferences (KQ2), it was difficult to accurately assess RoB for these studies, as the primary purpose of the included studies was to evaluate acceptability after screening and effectiveness of the screening modality, a different lens to the context of our review. Most outcome data were collected before randomisation, and as there is no formal tool to assess RoB prior to randomisation, these outcomes were not assessed. Measurement bias may be present, as studies did not clearly state how these outcome data were collected. It is not clear how the data were collected among those who refused participation during the consent period, as there is no mention of questionnaires or if and how study personnel collected this information. Only the uptake of screening outcome in one study stated that a non-completion questionnaire was given to ascertain reasons for non-completion. It was difficult to assess the inconsistency among the included studies, mainly due to a lack of information among the studies contributing to outcome results.
For example, the largest study invited 1210 participants, with 38% (385/1026) of those declining to participate not providing any information on why they refused. Poor reporting of patient information for those who
contributed outcome data was seen in all studies. None reported on the age and sex of these participants, or on the indication for screening (as described above), making it difficult to understand how comparable these studies might have been. Similarly, the quality of the evidence for treating BE, dysplasia and early-stage cancer (KQ3) was low or very low across the comparisons and outcomes, indicating uncertainty that the observed effects would be representative of the true underlying effect. Poor reporting was a barrier in assessing all domains. Additionally, items within tools such as the Jadad score and Downs & Black do not directly translate to considerations that GRADE guidance suggests for assessing risk of bias. The current limited evidence originated in 11 poorly conducted reviews (two rated as low quality and nine rated as critically low quality), from small RCTs (published between 1996 and 2011 with one published in 2014) with unclear or high risk of bias with short follow-up times. Multicenter trials are needed to increase the power of the evidence base. The lack of a longer patient follow-up time to inform outcomes may be explained by patient retention issues or the cost of following patients long-term.
The lack of a definition of chronic GERD, or even how
studies defined GERD, leads to a serious concern for the direct generalizability of the population represented in these studies to the target population of this review. Among studies that did provide a description of how GERD was defined, not all studies used a validated questionnaire to define GERD, while some defined GERD inclusion based on “typical symptoms”. Some studies did not define GERD at all. A standardized definition of chronic GERD would allow trialists to better identify the population of interest. Additionally, as more data accrue, this may lead to more certainty as to whom the evidence would apply (i.e. directness), to greater precision of the estimate and to better quality of conduct (and reporting).
Several outcomes of interest, including mortality, quality of life and overdiagnosis, were not reported in any of the included studies (KQ1). This is mostly because the study results were cross-sectional in nature and these outcomes would require follow-up. In the absence of the outcomes of interest to calculate overdiagnosis, we were una