Systematic Review Module 6: Data Abstraction
Joseph Lau, MD
Thomas Trikalinos, MD, PhD
Tufts EPC
Melissa McPheeters, PhD, MPH
Jeff Seroogy, BS
Vanderbilt University EPC
CER Process Overview
Prepare topic:
· Refine key questions
· Develop analytic frameworks
Search for and select studies:
· Identify eligibility criteria
· Search for relevant studies
· Select evidence for inclusion
Abstract data:
· Extract evidence from studies
· Construct evidence tables
Analyze and synthesize data:
· Assess quality of studies
· Assess applicability of studies
· Apply qualitative methods
· Apply quantitative methods (meta-analyses)
· Rate the strength of a body of evidence
Present findings
Learning Objectives
What is data abstraction?
Why do it?
What kind of data to collect?
How much data to collect?
How to collect data accurately and efficiently?
How many extractors? With what background?
What do abstraction forms look like?
What are some challenges in data abstraction?
Is it feasible to query original authors?
Aims of Data Abstraction
Summarize studies to facilitate synthesis
Identify numerical data for meta-analyses
Obtain information to assess the quality of studies more objectively
Identify future research needs
On Data Abstraction (I)
Abstracted data should
– accurately reflect information reported in the publication
– remain in a form close to the original reporting (so that disputes can be easily resolved)
– provide sufficient information to understand the studies and to perform analyses
Abstract what is needed (avoid overdoing it); data abstraction is labor intensive and can be costly and error prone
Different questions may have different data needs
On Data Abstraction (II)
Involves more than copying words and numbers from the publication to a form
Clinical domain, methodological, and statistical knowledge is needed to ensure the right information is captured
Interpretation of published data is often needed
Quality assessment of articles belongs in this step
Appreciate the fact that what is reported is not necessarily what was carried out
What Data to Collect?
Guided by key questions and eligibility criteria
Anticipate what data the summary tables should include and what data will be needed to answer the questions and conduct meta-analyses
Data extraction follows the PICO format and includes study design
– Population
– Intervention or exposure
– Comparators (when applicable)
– Outcomes and numbers
– Study design
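The PICO-plus-design layout above can be sketched as a small record type. This is only an illustration: the class names, field names, and the example study are hypothetical and are not taken from any EPC form.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Outcome:
    name: str                      # outcome label as worded in the publication
    definition: str                # definition kept close to the original reporting
    events: Optional[int] = None   # dichotomous data: participants with the event
    total: Optional[int] = None    # dichotomous data: participants analyzed


@dataclass
class ExtractionRecord:
    study_id: str
    study_design: str                  # e.g., "RCT"
    population: str
    intervention: str
    comparator: Optional[str] = None   # only when applicable
    outcomes: List[Outcome] = field(default_factory=list)
    notes: str = ""                    # free text for unanticipated data


# Hypothetical study, invented for illustration only.
record = ExtractionRecord(
    study_id="example-001",
    study_design="RCT",
    population="Adults with stable angina",
    intervention="Drug A",
    comparator="Placebo",
)
record.outcomes.append(
    Outcome(name="Death", definition="All-cause mortality at 5 years", events=12, total=100)
)

row = asdict(record)  # plain dict, ready to write to a spreadsheet or database
print(row["outcomes"][0]["events"], "/", row["outcomes"][0]["total"])
```

Keeping the definition as free text alongside the numbers is one way to stay close to the original reporting, so that disputes can be traced back to the publication.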
Data Elements: P, I, C
Population: generic elements may include patient characteristics such as age, gender distribution, and disease stage. More specific items may be needed according to topic
Intervention or exposure and comparator items depend on the abstracted study
– RCT, observational study, diagnostic test study, prognostic factor study, family-based or population-based genetic studies, etc.
Data Elements: O
Outcomes should be determined a priori with the Technical Expert Panel
Criteria often are not clear as to which outcomes to include and which to discard
– Mean change in ejection fraction or proportion with increase in ejection fraction by > 5%
It may be useful to record different outcome definitions and consult content experts before making a decision
Data Elements: O
Apart from data on outcome definitions, you need quantitative data for meta-analysis
– Dichotomous (deaths, strokes, MI, etc.)
– Continuous variables (mmHg, pain score, etc.)
– Survival curves
– Sensitivity, specificity, receiver operating characteristic (ROC)
– Correlations
– Slopes
Data Elements: Study Design
Varies by type of study
Some information to consider collecting when recording study characteristics for RCTs
– Number of centers (multi-center studies)
– Method of randomization (adequacy of allocation concealment)
– Blinding
– Funding source
– Intention to treat (ITT), lack of standard definition
Clarifying EPC Lingo
In the EPC program, we often refer to the following types of tables:
– Evidence tables are prettified data extraction forms. Typically, each study is abstracted to a set of evidence tables.
– Summary tables synthesize evidence tables to summarize studies. They contain context-relevant pieces of the information included in the study-specific evidence tables.
Developing Data Abstraction Forms (Evidence Tables)
No single generic form will fit all needs
While there are common generic elements, in general the form needs to be modified for each topic or study design
Organization of information in PICO format is highly desirable
Well-structured form vs. flexible form
Anticipate the need to capture “unanticipated” data
Iterative process; needs testing on multiple studies by several individuals
Common Problems when Creating Extraction Forms (Evidence Tables)
Forms have to be constructed before any serious data extraction is underway
– Original fields may turn out to be inefficient or unusable
In practice, reviewers have to
– Be as thorough as possible in the initial set-up
– Reconfigure the tables as needed
– Use a dual review process to help fill in gaps
Example
[Figure: first draft and second draft]
Example
[Figure: final draft]
Common Problems when Creating Extraction Forms (Evidence Tables)
Lack of uniformity among outside reviewers
– No matter how clear and detailed the instructions, data will not be entered identically from one reviewer to the next
Solutions
– Evidence Table Guidance document: instructions on how to input data
– Limit the number of core members handling the evidence tables to avoid discrepancies in presentation
Example
From the Vanderbilt EPC Evidence Table Guidance document
– The “Country, Setting” field: provides a list of possible settings that could be encountered in the literature: academic medical center(s), community, database, tertiary care hospital(s), specialty care treatment center(s), substance abuse center(s), level I trauma center(s), etc.
– The “Study design” field: cross-sectional, longitudinal, case-control, RCT, etc.
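A guidance list of allowed terms can also be checked mechanically at data entry. The sketch below is a hypothetical helper, not part of the Vanderbilt document; the vocabularies contain only the terms named on the slide (the originals end in "etc.", so they are incomplete), and the singular spellings are an assumption.

```python
# Allowed terms copied from the slide; both lists are incomplete in the source.
STUDY_DESIGNS = {"cross-sectional", "longitudinal", "case-control", "RCT"}
SETTINGS = {
    "academic medical center",
    "community",
    "database",
    "tertiary care hospital",
    "specialty care treatment center",
    "substance abuse center",
    "level I trauma center",
}


def check_field(value, allowed):
    """Return the canonical spelling if value matches an allowed term
    (ignoring case and surrounding whitespace), otherwise None."""
    lookup = {term.lower(): term for term in allowed}
    return lookup.get(value.strip().lower())


print(check_field(" rct ", STUDY_DESIGNS))           # canonical term
print(check_field("cohort", STUDY_DESIGNS))          # None: flag for review
print(check_field("Level I Trauma Center", SETTINGS))
```

Entries that come back as None are not necessarily wrong; they are candidates for a reviewer to map to an existing term or to add to the guidance list.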
Example
[Figure: Reviewer A and Reviewer B]
Samples of Final Data Extraction Forms (Evidence Tables)
For evidence reports or technology assessments with many key questions, data extraction forms may become very long (several pages)
The next few slides are examples of data extraction forms: do not study them, just fly through them
When you design your own extraction forms, improvise: there are many possible functional versions
[Technology Assessment on home monitoring of obstructive sleep apnea syndrome, AHRQ, 2007, Tufts EPC]
Patient and Study Characteristics
Technology Assessment Home diagnosis of sleep apnea-hypopnea syndrome www.ahrq.gov
Characteristics of Index Test and Reference Standard
Results (Concordance/Accuracy)
Results (Nonquantitative)
Considerations in Managing Data Abstraction
How to maximize scientific accuracy under budgetary and logistical constraints?
How many people should extract data?
Should data extraction be performed “blinded” to the author, affiliations, journal, results?
How to resolve discrepancies?
Typical EPC Evidence Reports
Systematic review of a topic
5 key questions (e.g., prevalence, diagnosis, management, future research)
Analytic framework, evidence tables, summary tables, meta-analyses, decision models
12 months from start to completion of final report
Screen 5,000 to >10,000 abstracts
Examine several hundred full-text articles
Synthesize 100 to 300 articles
100 to >200 pages in length
Estimating Time to Conduct a Meta-analysis from Number of Citations Retrieved
Metaworks Inc. project summary (EPC I)
– 37 meta-analysis projects
– Mean total number of hours: 1,139 hours
– Median: 1,110 hours (range: 216 to 2,518 hours)
– Pre-analysis activities (literature search, retrieval, screening, data extraction): 588 hours (standard deviation 337 hours)
– Statistical analysis: 144 hours (standard deviation 106 hours)
– Report and manuscript: 206 hours (standard deviation 125 hours)
– Administrative: 201 hours (standard deviation 193 hours)
Allen IE, Olkin I. JAMA 1999;282:634-35.
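As a quick arithmetic check, the four component means listed above add up to the reported mean total, and pre-analysis work accounts for roughly half of it. The snippet only re-computes figures from the slide; the dictionary keys are shorthand labels.

```python
# Mean hours per activity, as listed on the slide.
component_hours = {
    "pre-analysis": 588,
    "statistical analysis": 144,
    "report and manuscript": 206,
    "administrative": 201,
}
reported_mean_total = 1139

total = sum(component_hours.values())
shares = {name: round(100 * hours / total, 1) for name, hours in component_hours.items()}

print(total == reported_mean_total)
for name, pct in shares.items():
    print(f"{name}: {pct}% of mean total hours")
```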
Fixed and Variable Costs Associated with Systematic Reviews
[Figure] JAMA 1999;282:634-35.
Tools Available for Data Abstraction and Collection (Pros and Cons)
Word processing software (MS Word)
Spreadsheet (MS Excel)
Database software (e.g., MS Access, Epi-Info)
Dedicated off-the-shelf commercial software (e.g., TrialStat)
Homegrown software
Who Should Abstract Data and How Many People?
Domain experts vs. methodologists
Single or double independent abstraction followed by reconciliation vs. single abstraction with independent verification
Blinded (to authors, journal, results) data abstraction?
Berlin J. Does blinding of readers affect the results of meta-analysis? Lancet 1997;350:185-186.
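For double independent abstraction, the reconciliation step starts from a list of fields on which the two extractors disagree. A minimal sketch follows; the function name, field names, and values are all hypothetical.

```python
def find_discrepancies(a, b):
    """Return {field: (value_a, value_b)} for every field on which the
    two extractors' records differ, including fields only one filled in."""
    fields = sorted(set(a) | set(b))
    return {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}


# Hypothetical records from two extractors for the same article.
extractor_a = {"n_randomized": 408, "deaths_arm_1": 24, "blinding": "double-blind"}
extractor_b = {"n_randomized": 408, "deaths_arm_1": 28, "blinding": "Double-blind"}

for field_name, (value_a, value_b) in find_discrepancies(extractor_a, extractor_b).items():
    print(f"{field_name}: A={value_a!r}  B={value_b!r}")
```

The comparison is deliberately strict: the capitalization difference in "blinding" is reported too, which separates presentation discrepancies from substantive ones only after a person has looked at them.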
Challenges in Data Extraction
Problems in data reporting
Inconsistencies in published papers
Data reported in graphs
Examples of Data Reporting Problems in the Literature
“Data for the 40 patients who were given all 4 doses of medications were considered evaluable for efficacy and safety. The overall study population consisted of 10 (44%) men and 24 (56%) women, with a racial composition of 38 (88%) whites and 5 (12%) blacks.”
[Verbatim]
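The problems in the quoted passage can be surfaced with a simple internal-consistency check: do the counts add up to the stated 40 patients, and do the percentages match the counts? The helper below is a hypothetical sketch; only the numbers come from the quotation.

```python
def check_group(label, counts, reported_pcts, expected_total):
    """List inconsistencies between counts, reported percentages,
    and the stated number of patients."""
    total = sum(counts)
    problems = []
    if total != expected_total:
        problems.append(f"{label}: counts sum to {total}, not {expected_total}")
    for n, pct in zip(counts, reported_pcts):
        actual = round(100 * n / total)
        if actual != pct:
            problems.append(f"{label}: {n}/{total} is {actual}%, reported as {pct}%")
    return problems


sex_problems = check_group("sex", [10, 24], [44, 56], expected_total=40)
race_problems = check_group("race", [38, 5], [88, 12], expected_total=40)

for line in sex_problems + race_problems:
    print(line)
```

Neither breakdown sums to 40 (the sex counts sum to 34, the race counts to 43), and the sex percentages do not match the sex counts. The reported percentages would be consistent with a denominator of 43, but the passage alone cannot settle which number is wrong.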
Examples of Data Reporting Problems
[Two slides of figures]
Inconsistencies in Published Papers
Let’s extract the number of deaths in two arms, at 5 years of follow-up.
Results Text
Overall Mortality
[…] 24 deaths occurred in the PCI group, […] and 25 in the MT group […]
[Verbatim]
        PCI (205)   MT (203)
Dead    24          25
Overall Mortality (Figure 2 in Manuscript)
                  PCI (205)   MT (203)
Dead (text)       24          25
Dead (Figure 2)   28          35
[The paper clearly states that there is no censoring]
Clinical Events (Table 2 in Manuscript)
                  PCI (205)   MT (203)
Dead (text)       24          25
Dead (Figure 2)   28          35
Dead (Table 2)    32          33
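To see why the choice of source matters, the risk ratio for death (PCI vs. MT) can be computed from each of the three pairs of counts. The mapping of pairs to sources (results text 24 vs. 25, Figure 2 28 vs. 35, Table 2 32 vs. 33) is a reading of the slide sequence, and the function name is illustrative.

```python
N_PCI, N_MT = 205, 203

# (deaths in PCI arm, deaths in MT arm) by where the numbers were found.
deaths_by_source = {
    "results text": (24, 25),
    "Figure 2": (28, 35),
    "Table 2": (32, 33),
}


def risk_ratio(events_a, n_a, events_b, n_b):
    """Risk in group A divided by risk in group B."""
    return (events_a / n_a) / (events_b / n_b)


ratios = {
    source: risk_ratio(d_pci, N_PCI, d_mt, N_MT)
    for source, (d_pci, d_mt) in deaths_by_source.items()
}

for source, rr in ratios.items():
    print(f"{source}: risk ratio = {rr:.2f}")
```

The estimate ranges from about 0.79 to about 0.96 depending on which part of the same paper is used, so the extraction form should record the source of each number as well as the number itself.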
Digitizing Data Reported in Graphs
Data Are Often Presented in Graphical Form
Ayappa I et al. Sleep. 2004 Sep 15;27(6):1171-9
We want to dichotomize measurements for a 2 x 2 table:
Cutoff should be 15 (events per hour) in each axis.
This information is not reported in the paper, but can be extracted from the graph: count the dots!
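Once the points have been read off a scatter plot, "counting the dots" at a cutoff of 15 events per hour on each axis amounts to the tally below. This is a sketch under stated assumptions: the coordinates are invented, the first value of each pair is taken to be the reference standard, and a value equal to the cutoff is counted as positive.

```python
def two_by_two(points, cutoff=15.0):
    """Cross-classify (reference, index) pairs at the same cutoff on both axes."""
    table = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for reference, index in points:
        ref_pos = reference >= cutoff   # assumption: ">=" counts as positive
        idx_pos = index >= cutoff
        if ref_pos and idx_pos:
            table["TP"] += 1
        elif idx_pos:
            table["FP"] += 1
        elif ref_pos:
            table["FN"] += 1
        else:
            table["TN"] += 1
    return table


# Hypothetical digitized coordinates (events per hour), not from the paper.
points = [(5, 4), (12, 18), (20, 22), (30, 10), (16, 15), (8, 9)]

table = two_by_two(points)
sensitivity = table["TP"] / (table["TP"] + table["FN"])
specificity = table["TN"] / (table["TN"] + table["FP"])
print(table, round(sensitivity, 2), round(specificity, 2))
```

Points that sit very close to a cutoff line are the ones most sensitive to digitizing error, so it is worth rechecking them against the original graph.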
Using Digitizing Software
Engauge Digitizer, an open-source program:
Each data point is marked with a red “X,” and the coordinates are given in a spreadsheet.
digitizer.sourceforge.net
Reconstructing the Plot to Count Classification at Specific Cutoffs
Reconstructing a Bland-Altman Plot
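When only a Bland-Altman plot is published, the paired measurements can be recovered from the digitized points, provided the plot follows the usual convention of mean on the horizontal axis and difference (first method minus second) on the vertical axis. That convention is an assumption here and should be checked against the paper's axis labels; the function names are illustrative.

```python
def to_bland_altman(a, b):
    """Measurements (a, b) -> (mean, difference) as plotted."""
    return (a + b) / 2, a - b


def from_bland_altman(mean, diff):
    """Digitized (mean, difference) -> original measurements (a, b)."""
    return mean + diff / 2, mean - diff / 2


# Hypothetical digitized point: mean 20, difference 4.
a, b = from_bland_altman(20.0, 4.0)
print(a, b)                    # the two underlying measurements
print(to_bland_altman(a, b))   # round trip back to the plotted point
```

With the original pairs recovered, the same cutoff-based 2 x 2 classification can be applied as for a scatter plot.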
Additional Common Issues
Missing information in published papers
Variable quality of studies
Publications with at least partially overlapping patient subgroups
Variable quality of conduct and quality
Potentially fraudulent data
Considerations When Contacting Authors for More Information
How important is the information likely to be?
How reliable are additional data?
How likely are you to be successful?
How much effort is required?
Where else should you look for more data?
– FDA website
– ClinicalTrials.gov - Results Database
Types of Missing Data
Detailed PICOTS information (e.g., population demographics, background diet, comorbidities, concurrent medications, precise definitions of outcomes)
Information to assess methodological quality (e.g., randomization methods, blinding)
Necessary statistics for meta-analysis (e.g., standard error, sample size, confidence interval, exact p-value)
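Some of the missing statistics can be derived from what is reported. The sketch below recovers a standard error from a confidence interval or from an exact two-sided p-value. It assumes a normal approximation and a symmetric interval on the scale being analyzed (for ratio measures, take logs first); the function names are illustrative.

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error from a symmetric confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return (upper - lower) / (2 * z)


def se_from_p(estimate, p_two_sided):
    """Standard error from an estimate and its exact two-sided p-value."""
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return abs(estimate) / z


# Mean difference with 95% CI from -1.96 to 1.96: SE is about 1.
print(round(se_from_ci(-1.96, 1.96), 3))

# Hypothetical risk ratio with 95% CI 0.5 to 2.0: work on the log scale.
print(round(se_from_ci(math.log(0.5), math.log(2.0)), 3))

# Estimate of 1.96 with p = 0.05: SE is about 1.
print(round(se_from_p(1.96, 0.05), 3))
```

A p-value reported only as a threshold (for example, "p < 0.05") gives a bound rather than a value, so this conversion applies only to exact p-values.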
A Nonexhaustive List of Common Data Abstraction Problems
Non-uniform outcomes (e.g., different pain measurements in different studies)
Incomplete data (frequent problem: no standard error or confidence interval)
Discrepant data (different parts of the same report give different numbers)
Confusing data (cannot figure out what the authors reported)
Nonnumeric format (reported as graphs)
Missing data (only the conclusion is reported)
Multiple (overlapping) publications of the same study with or without discrepant data
Why Do Such Problems Exist?
It is an eye-opening experience to attempt to extract information from a paper that you have read carefully and thoroughly understood only to be confronted with ambiguities, obscurities, and gaps in the data that only an attempt to quantify the results reveals.
Gurevitch J, Hedges LV. Chapter 17. Meta-analysis: Combining the results of independent experiments. (pg 383). In: Design and analysis of ecological experiments. Samuel M. Scheiner, Jessica Gurevitch, eds. Chapman & Hall, New York, 1993.
Why Do Such Problems Exist?
Because so few research reports give effect size, standard normal deviates, or exact p-values, the quantitative reviewer must calculate almost all indices of study outcomes . . . Little of this calculation is automatic because results are presented in a bewildering variety of forms and are often obscure.
Green BF, Hall JA. Quantitative methods for literature reviews. Annual Review of Psychology 1984;35:37-53.
Closing Remarks
Laborious and tedious (could take an hour or more per article); nothing is automatic
To err is human
Interpretation and subjectivity are unavoidable
Data are often not reported in a uniform manner (e.g., quality, location in paper, metrics, outcomes, numerical value vs. graphs)