Evaluating Data Quality in the Cancer Registry

Transcript of Evaluating Data Quality in the Cancer Registry

Page 1: Evaluating Data Quality in the Cancer Registry

Evaluating data quality in the Cancer Registry

Freddie Bray

Deputy Head, Section of Cancer Information

IARC

Dharmais Cancer Hospital · Jakarta · November 2010

Page 2: Evaluating Data Quality in the Cancer Registry

Cancer Incidence in Five Continents: Vol 1 (1966) Introduction

Reliable cancer registries:

• Those able to amass information (diagnostic and personal) on virtually all cases of cancer among patients genuinely resident within a defined catchment area during a prescribed period of time;

• able to supplement this with death certificate data for patients not seen in hospital

• having an adequate system for eliminating duplicate entries for the same person

• and good population data - by sex and by 5-year age groups and, if relevant, by race/language

Page 3: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

• Evaluation of data quality in the cancer registry: Principles and methods.

• Part I: Comparability, validity and timeliness (Bray & Parkin)

• Part II: Completeness (Parkin & Bray)

• Eur J Cancer (2009) 45: 747-77, 756-64

• Update of 1994 IARC Technical Report

• Application to Cancer Registry of Norway: Larsen et al (2009) Eur J Cancer 45: 1218-31

• Standards and guidelines for cancer registration in Europe: The ENCR recommendations, vol 1. Lyon: IARC (2003).

Page 4: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Conclusion:

“This review indicated that the routines in place at the Cancer Registry of Norway yield comparable data that can be considered reasonably accurate, close to complete and timely, and serves as a justification for our policy of reporting annual incidence one year after the close of registration.”

Page 5: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Four “classical” dimensions of quality:

• Comparability

• Validity

• Completeness

• Timeliness

Page 6: Evaluating Data Quality in the Cancer Registry

1. Comparability
2. Completeness
3. Validity
4. Timeliness

Special Issue: Data Quality at the Cancer Registry of Norway

http://www.kreftregisteret.no

Page 7: Evaluating Data Quality in the Cancer Registry
Page 8: Evaluating Data Quality in the Cancer Registry
Page 9: Evaluating Data Quality in the Cancer Registry
Page 10: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Four “classical” dimensions of quality:

• Comparability

• Validity

• Completeness

• Timeliness

Page 11: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Comparability

• Ensuring comparable standards of reporting and classification across registries and within registries over time;

• Reporting of the routines, standards and practices in place and, especially, the dates of changes in practice;

• Where standards within a registry differ from “accepted” practice, a requirement to provide a means of conversion from one to the other.

Page 12: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Comparability

• Classification and coding systems

• Definition of incidence date

• Handling of multiple primaries

• Incidental diagnosis (basis)

• Screening and testing

• Imaging

• Autopsy diagnosis (basis)

• Handling of death certificate information

Page 13: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Bray and Parkin (2009) EJC 45:747

Page 14: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Larsen et al (2009) EJC 45:1218

Page 15: Evaluating Data Quality in the Cancer Registry
Page 16: Evaluating Data Quality in the Cancer Registry

Note: Rates are age-adjusted to the 1970 U.S. standard. Rates from 1973-1987 are based on data from the 9 standard registries. Data from San Jose and Los Angeles are included in the rate calculation for 1988-1995.
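
The rates in this figure are age-adjusted by direct standardisation. As background (not part of the slide), here is a minimal sketch of the calculation; the case counts, populations and standard weights are entirely hypothetical placeholders.

```python
# Minimal sketch of direct age-standardisation; all numbers are hypothetical,
# and the weights stand in for a real standard population (e.g. the 1970 U.S.
# standard) expressed as proportions.
age_groups = ["0-39", "40-59", "60-79", "80+"]
cases      = [30, 240, 510, 160]                  # incident cases by age group (assumed)
person_yrs = [600_000, 250_000, 120_000, 30_000]  # population at risk (assumed)
weights    = [0.55, 0.25, 0.15, 0.05]             # standard population weights (assumed, sum to 1)

# Age-specific rates per 100,000; their weighted sum is the age-standardised rate.
age_specific = [c / p * 100_000 for c, p in zip(cases, person_yrs)]
asr = sum(w * r for w, r in zip(weights, age_specific))
print(f"Age-standardised rate: {asr:.1f} per 100,000")
```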

Page 17: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Four “classical” dimensions of quality:

• Comparability

• Validity

• Completeness

• Timeliness

Page 18: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Validity

• Accuracy of reporting

• Do cases reported to have a specific characteristic truly have that characteristic?

• Depends on

• Accuracy of source information

• Registry “skill” in abstracting, coding and reporting

Page 19: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Validity – assessment procedures:

• Diagnostic criteria methods

• Histologic/microscopic verification (% HV/MV)

• Death certificate only (% DCO)

• Missing information (e.g. % PSU)

• Internal consistency checks (QC)

• Re-abstracting and recoding (QA)
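
As a rough illustration of how a registry might tabulate some of these indicators from its case file (not taken from the slides), here is a minimal sketch. It assumes a hypothetical file cases.csv with an ENCR/IACR-style basis_of_diagnosis code (0 = death certificate only, 5-7 = microscopically verified) and an icd10_site column; the file name, column names and the code list used for "primary site uncertain" are all assumptions.

```python
# Minimal sketch: tabulating % MV, % DCO and % PSU from registry case records.
# File and column names are hypothetical; codes follow the ENCR/IACR convention
# where basis_of_diagnosis 0 = DCO and 5-7 = microscopic verification.
import pandas as pd

cases = pd.read_csv("cases.csv")

# Example "primary site uncertain" definition, broader than ICD-10 C80 alone (assumed list)
psu_sites = {"C26", "C39", "C76", "C77", "C78", "C79", "C80"}

mv_pct  = 100 * cases["basis_of_diagnosis"].between(5, 7).mean()
dco_pct = 100 * (cases["basis_of_diagnosis"] == 0).mean()
psu_pct = 100 * cases["icd10_site"].str.slice(0, 3).isin(psu_sites).mean()

print(f"N = {len(cases)}  MV% = {mv_pct:.1f}  DCO% = {dco_pct:.1f}  PSU% = {psu_pct:.1f}")
```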

Page 20: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Microscopic verification (% MV)

• Varies by cancer site (and age);

• Depends on pathology/cytology service

• 100% not always best;

• There are statistical tests to compare the % MV of a registry against a standard, against other registries, or against itself at different time points (one possibility is sketched below).
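
The sketch below, with hypothetical counts, uses a simple exact binomial test of an observed MV proportion against a reference value; it is one option for this kind of comparison, not necessarily the procedure described in Bray & Parkin (2009).

```python
# Minimal sketch: exact binomial test of an observed MV proportion against a
# reference value. Counts and the reference are hypothetical.
from scipy.stats import binomtest

n_cases = 1200        # cases of a given site registered in the period (assumed)
n_mv = 1068           # of which microscopically verified (assumed)
reference_mv = 0.93   # MV proportion in a comparison standard or registry (assumed)

result = binomtest(n_mv, n_cases, reference_mv)
print(f"Observed MV% = {100 * n_mv / n_cases:.1f}, "
      f"p = {result.pvalue:.3f} vs reference {100 * reference_mv:.0f}%")
```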

Page 21: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Larsen et al (2009) EJC 45:1218

Page 22: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Death certificate only (% DCO)

• Varies by cancer site (and age);

• Depends on clinical service;

• Associated with reduction in validity (especially site and diagnosis date) and increase in missing information;

• Other validity issues around “Death certificate notified (DCN)” or “Death certificate initiated (DCI)”.

Page 23: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Larsen et al (2009) EJC 45:1218

Page 24: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Bray and Parkin (2009) EJC 45:747

Page 25: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Missing information (e.g. % PSU)

• Varies by cancer site and age;

• Varies by data item (e.g. stage);

• Depends on both registry and clinical record practice;

• Care is required in the codes used to define “primary site uncertain” (not just “unknown primary site”, ICD-10 C80);

• Low % MV associated with high “PSU”.

Page 26: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Larsen et al (2009) EJC 45:1218

Page 27: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Internal consistency checks (QC)

• Invalid (or unlikely) codes or combinations of codes or sequences of dates;

• Can be operationalised within software (including during data entry);

• IARC has developed such tools (IARC-CHECK) within IARCcrgTools, which can be downloaded from the IACR website: www.iacr.com.fr

• Checks applied should be reported along with results.
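
As an illustration of the kind of checks that can be built into registry software, here is a minimal sketch with a few made-up rules; these are illustrative only and are not the actual IARC-CHECK rules distributed with IARCcrgTools.

```python
# Minimal sketch of illustrative internal consistency checks for one registry
# record. Field names and rules are assumptions, not the IARC-CHECK rule set.
from datetime import date

def check_record(rec: dict) -> list:
    """Return a list of warnings for one registry record (illustrative rules only)."""
    problems = []
    if rec["date_of_incidence"] < rec["date_of_birth"]:
        problems.append("incidence date before date of birth")
    if rec.get("date_of_death") and rec["date_of_death"] < rec["date_of_incidence"]:
        problems.append("date of death before incidence date")
    if rec["sex"] == "M" and rec["icd10_site"].startswith(("C53", "C54", "C56")):
        problems.append("female-specific site recorded for a male")
    if rec["basis_of_diagnosis"] == 0 and rec.get("histology"):
        problems.append("death-certificate-only case with a specific histology code")
    return problems

# Deliberately inconsistent example record (all field names are assumptions)
record = {
    "sex": "M",
    "icd10_site": "C53.9",            # cervix uteri: implausible for a male
    "basis_of_diagnosis": 2,
    "histology": None,
    "date_of_birth": date(1950, 3, 1),
    "date_of_incidence": date(2009, 6, 15),
    "date_of_death": None,
}
print(check_record(record))   # -> ['female-specific site recorded for a male']
```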

Page 28: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Re-abstracting and recoding (QA)

• Expensive and time consuming;

• Can be operationalised on sample basis;

• Can make use of other ad-hoc studies;

• Requires approaches to correct identified problems prospectively and retrospectively.

Page 29: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Four “classical” dimensions of quality:

• Comparability

• Validity

• Completeness

• Timeliness

Page 30: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Completeness:

• The extent to which all of the incident cancers occurring in a target population are included in the registry database;

• The key defining criterion of population-based registration;

• No perfect assessment tool

Page 31: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Completeness assessment:

• Methods based on comparisons and inspection;

• Methods based on independent assessment.

• Ad-hoc planned or incidental studies

• Use of multiple (independent) sources of notification, especially death certificates.

Page 32: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Completeness assessment:

• Methods based on comparisons and inspection;

• Compare rates over time and/or with similar populations;

• Inspect age-incidence curves;

• Stability of childhood cancer rates.
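
A minimal sketch of this kind of visual inspection, using made-up case counts and populations by 5-year age group (the data and plotting choices are illustrative only):

```python
# Minimal sketch: age-incidence curve for visual inspection. All data are made up;
# a log scale is used because incidence for many adult sites rises steeply with age.
import matplotlib.pyplot as plt

age_mid    = [2.5 + 5 * i for i in range(18)]   # midpoints of 5-year age groups, 0-4 ... 85+
cases      = [3, 2, 4, 5, 8, 14, 25, 40, 70, 110, 160, 220, 300, 350, 380, 360, 300, 200]
person_yrs = [150_000] * 18                     # population at risk per age group (assumed)

rates = [c / p * 100_000 for c, p in zip(cases, person_yrs)]
plt.semilogy(age_mid, rates, marker="o")
plt.xlabel("Age (years)")
plt.ylabel("Incidence per 100,000 (log scale)")
plt.title("Age-incidence curve (illustrative data)")
plt.show()
```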

Page 33: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Completeness assessment:

• Methods based on independent assessment

• Ad-hoc planned or incidental studies (comments as for validity)

• M/I ratios

• Capture-recapture methods (a minimal sketch follows below)

• The DC and M/I method: Ajiki et al (1998) Nippon KEZ 45:1011

• The Flow method (also measures timeliness): Bullard et al (2000) Br J Cancer 82:111

Read Parkin & Bray (2009) for details.
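
A minimal sketch of a two-source capture-recapture estimate (the Chapman-corrected Lincoln-Petersen estimator) with hypothetical counts; a real application needs careful record matching and rests on the assumption that the sources notify cases independently.

```python
# Minimal sketch: two-source capture-recapture estimate of completeness using the
# Chapman-corrected Lincoln-Petersen estimator. All counts are hypothetical.
n_registry = 950   # cases in the registry (source 1, assumed)
n_other    = 400   # cases in an independent source, e.g. pathology lists (assumed)
n_both     = 370   # cases matched in both sources (assumed)

n_true_est = (n_registry + 1) * (n_other + 1) / (n_both + 1) - 1
completeness_pct = 100 * n_registry / n_true_est

print(f"Estimated true cases: {n_true_est:.0f}")
print(f"Estimated registry completeness: {completeness_pct:.1f}%")
```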

Page 34: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Completeness assessment: M/I ratios

• Number of incident cases during a defined time period;

• Number of deaths during the same time period;

• Assumes that mortality data come from a source independent of cancer registration;

• Should be analysed by cancer site and by age group;

• Absolute values depend on survival rates and on the quality of both registration and death certification;

• Not robust to (usually rare) short-term changes in incidence or survival.
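
A minimal sketch of the calculation, assuming hypothetical incidence and mortality files with one row per case or death; the file and column names are assumptions.

```python
# Minimal sketch: M/I ratios by cancer site and age group. Mortality data should
# come from a source independent of the registry (e.g. vital statistics).
import pandas as pd

incidence = pd.read_csv("incidence.csv")   # assumed columns: icd10_site, age_group, ...
mortality = pd.read_csv("mortality.csv")   # assumed columns: icd10_site, age_group, ...

inc = incidence.groupby(["icd10_site", "age_group"]).size()
mor = mortality.groupby(["icd10_site", "age_group"]).size()

mi_pct = (100 * mor / inc).round(1)        # M/I expressed as a percentage
print(mi_pct.dropna().sort_values(ascending=False).head(10))
```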

Page 35: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Larsen et al (2009) EJC 45:1218

Page 36: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Four “classical” dimensions of quality:

• Comparability

• Validity

• Completeness

• Timeliness

Page 37: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

Timeliness:

• Speed with which registry can collect, process and make available data at a given standard of completeness and quality;

• Often pressure to increase timeliness at expense of other quality indicators;

• Some registries (e.g. SEER) publish at a given time point and make estimates of under-reporting;

• 12-24 months after year end represents current “standard”.
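
A minimal sketch of one way to summarise timeliness, as the proportion of cases registered within a given number of months of diagnosis; the file and column names are assumptions.

```python
# Minimal sketch: cumulative proportion of cases registered within a given delay
# after the incidence date. File and column names are hypothetical.
import pandas as pd

cases = pd.read_csv("cases.csv", parse_dates=["date_of_incidence", "date_of_registration"])
delay_months = (cases["date_of_registration"] - cases["date_of_incidence"]).dt.days / 30.44

for cutoff in (12, 18, 24):
    pct = 100 * (delay_months <= cutoff).mean()
    print(f"Registered within {cutoff} months of diagnosis: {pct:.1f}%")
```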

Page 38: Evaluating Data Quality in the Cancer Registry
Page 39: Evaluating Data Quality in the Cancer Registry

Data quality indicators, CI5 vol. 9: Breast cancer (f)

Registry             No.      MV%    DCO%   M/I%
Brazil, Sao Paulo    22598    82.2   4.6    22.8
SEER (14)            237378   98.5   0.6    21.3
Norway               12521    98.4   0.3    29.4
UK, Scotland         17137    96.4   0.3    32.9
Japan, Osaka         11103    90.1   2.5    30.5

Page 40: Evaluating Data Quality in the Cancer Registry

Data quality indicators, CI5 vol. 9: Lung cancer (m)

Registry             No.      MV%    DCO%   M/I%
Brazil, Sao Paulo    6525     66.9   13.8   72.8
SEER (14)            123409   89.8   1.8    80.7
Norway               6516     87.4   1.0    88.6
UK, Scotland         12969    74.9   0.9    88.3
Japan, Osaka         16759    73.1   19.3   83.7

Page 41: Evaluating Data Quality in the Cancer Registry

Data quality and its evaluation

• Cancer registration is a worldwide activity and leads the way in global surveillance for non-communicable diseases;

• The benefit of population based registration to cancer control programs and to epidemiological research can be realised only to the extent that data are of a comparable, high quality standard;

• Reporting on data quality in a registry is as important as reporting analyses of the data.