Bayesian Network for Predicting Invasive and In-situ Breast Cancer using Mammographic Findings
Jagpreet Chhatwal (1), O. Alagoz (1), E.S. Burnside (1), H. Nassif (1), E.A. Sickles (2)
(1) University of Wisconsin-Madison
(2) University of California, San Francisco
Outline
• Introduction
• Model Formulation
• Results
• Conclusions
Background: Facts
• Breast cancer is the most common non-skin cancer affecting women in the U.S.
• Every two minutes a woman is diagnosed with breast cancer
• Estimated number of deaths from breast cancer in 2007: 40,460
Mammography
A low-dose X-ray examination of the breasts
• Mammography has been shown to be the most cost-effective diagnostic procedure for early diagnosis of breast cancer
– The American Cancer Society recommends that women above age 40 have a screening mammogram every year
– More than 20 million mammograms are performed in the US annually
Breast Biopsy
Tissue-sampling procedure to confirm the presence of cancer
• Types of biopsies: needle aspiration and surgical
• Estimated number of biopsies performed annually: 700,000
• 55−85% of breast biopsies result in benign (non-cancerous) findings
• Estimated overspending on benign biopsies: $250 million
• Significant anxiety associated with biopsy
Invasive and In-situ Cancer
• Nearly all breast cancer arises in the milk ducts of the breast
• Invasive cancer
• Ductal carcinoma in-situ (DCIS)
• DCIS lesions contain cells that appear to be cancer, but not all such lesions behave as cancer
Invasive and In-situ Cancer (Cont.)
• DCIS is a non-invasive malignant condition with a very favorable prognosis.
– Depending on the grade of the DCIS and the expected life span of older women, DCIS often will not cause morbidity or mortality for many years, if ever.
• Invasive breast cancer carries an increased risk of axillary node metastasis or distant disease.
– Quickly results in morbidity and mortality (also in older women).
Diagnosis or Over-diagnosis
• Only some DCIS lesions will eventually become invasive cancer.
– What percentage will become invasive cancer is not known
– Which DCIS lesions will become invasive is not known
• Detecting DCIS on mammograms may benefit those women whose DCIS would become invasive cancer.
• Detecting DCIS may potentially harm those women who have breast surgery but whose DCIS would never become invasive cancer.
• Incidence of DCIS has increased significantly, with the same predominance in older women as invasive breast cancer.
Objective
To build a quantitative model to predict the risk of DCIS and invasive breast cancer using patient demographic factors and mammography findings.
Data Source
• Mammography data from the University of California San Francisco Medical Center, collected between 1997 and 2007
• Combination of structured data and variables extracted from dictated text reports
– Patient demographic factors
– Imaging features according to the standardized Breast Imaging Reporting and Data System (BI-RADS) lexicon
• 2,211 malignant biopsy records: 1,544 invasive cancer and 667 DCIS
Sample Text-report
• “Possible clustered microcalcifications, right breast.
FINDINGS: Spot compression magnification mammography of the right breast was performed. In the right upper outer breast, there is a cluster of amorphous appearing microcalcifications. These are slightly suspicious for malignancy and therefore biopsy is recommended. No other suspicious clusters of microcalcifications are present. There are few scattered microcalcifications elsewhere in the right breast. Recommend needle localization followed by surgical biopsy…”
MBNi
• Developed a Mammography Bayesian Network for Invasive and In-situ cancer risk prediction (MBNi)
• Structural Training
– NP-hard problem
– Using Tree Augmented Naïve Bayes (TAN) algorithm
– WEKA (Waikato Environment for Knowledge Analysis)
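The structure-learning step named above can be illustrated with a minimal sketch of the generic TAN idea: weight every pair of features by their conditional mutual information given the class, then keep a maximum-weight spanning tree over the features. Restricting the search to a tree is what keeps the step polynomial even though general Bayesian network structure learning is NP-hard. This is not WEKA's implementation or the authors' code; the function names and the toy columns are hypothetical, and the class node is assumed to be an additional parent of every feature.

```python
import math
from collections import Counter
from itertools import combinations

def conditional_mutual_information(xi, xj, c):
    """Empirical I(Xi; Xj | C) in nats for three equal-length discrete sequences."""
    n = len(c)
    n_c = Counter(c)
    n_ic = Counter(zip(xi, c))
    n_jc = Counter(zip(xj, c))
    n_ijc = Counter(zip(xi, xj, c))
    total = 0.0
    for (a, b, k), count in n_ijc.items():
        # p(a,b,k) * log( p(a,b|k) / (p(a|k) * p(b|k)) ), written with raw counts
        ratio = count * n_c[k] / (n_ic[(a, k)] * n_jc[(b, k)])
        total += (count / n) * math.log(ratio)
    return total

def tan_feature_tree(columns, c, root=0):
    """Return {child: parent} over feature indices: a maximum-weight spanning
    tree on conditional mutual information, grown from `root` (Prim's algorithm).
    In a TAN model the class is also a parent of every feature."""
    m = len(columns)
    weight = {}
    for i, j in combinations(range(m), 2):
        w = conditional_mutual_information(columns[i], columns[j], c)
        weight[(i, j)] = weight[(j, i)] = w
    parent = {}
    in_tree = {root}
    while len(in_tree) < m:
        # Pick the heaviest edge that connects the tree to a new feature.
        u, v = max(((u, v) for u in in_tree for v in range(m) if v not in in_tree),
                   key=lambda e: weight[e])
        parent[v] = u
        in_tree.add(v)
    return parent

# Made-up toy data (not study data): x1 copies x0, x3 copies x2,
# and x0 is independent of x2 within each class.
c = [0, 0, 0, 0, 1, 1, 1, 1]
x0 = [0, 0, 1, 1, 0, 0, 1, 1]
x1 = list(x0)
x2 = [0, 1, 0, 1, 0, 1, 0, 1]
x3 = list(x2)
parent = tan_feature_tree([x0, x1, x2, x3], c, root=0)
```

In this toy example the tree links the two copied pairs (x0 with x1, x2 with x3), because those are the only pairs that share information once the class is known.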
MBNi (Cont.)
[Figure: MBNi]
Performance Measures
Sensitivity and Specificity
          Patients with disease    Patients without disease
Test +    a                        b
Test -    c                        d
Sensitivity = a/(a+c)
Specificity = d/(b+d)
Receiver Operating Characteristic (ROC) Curve
• Graphical plot of sensitivity versus 1-specificity for varying cut-off points (thresholds)
• Area under the ROC curve (Az)
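As a rough illustration of these definitions, the sketch below counts the cells a, b, c, d of the 2x2 table at a given threshold, sweeps the threshold to trace ROC points, and estimates the area under the curve with the trapezoidal rule. The labels and scores are made up for the example and are not study data; the function names are hypothetical, and a predicted score at or above the threshold is assumed to count as "Test +".

```python
def confusion_counts(labels, scores, threshold):
    """Return (a, b, c, d): true positives, false positives,
    false negatives, true negatives at the given threshold."""
    a = b = c = d = 0
    for y, s in zip(labels, scores):
        positive = s >= threshold  # assumption: score >= threshold means "Test +"
        if positive and y == 1:
            a += 1
        elif positive and y == 0:
            b += 1
        elif y == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d

def sensitivity(a, b, c, d):
    return a / (a + c)

def specificity(a, b, c, d):
    return d / (b + d)

def roc_points(labels, scores):
    """(1 - specificity, sensitivity) pairs, one per distinct threshold,
    starting from the origin (nothing called positive)."""
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        counts = confusion_counts(labels, scores, t)
        points.append((1.0 - specificity(*counts), sensitivity(*counts)))
    return points

def area_under_curve(points):
    """Trapezoidal area under (x, y) points ordered by non-decreasing x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Made-up example: 1 = disease present, 0 = disease absent.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
counts = confusion_counts(labels, scores, threshold=0.5)
az = area_under_curve(roc_points(labels, scores))
```

With these toy values, the threshold 0.5 gives a = 3, b = 1, c = 0, d = 2, so sensitivity is 3/3 and specificity is 2/3; the trapezoidal area works out to 8/9.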
Performance Measures (Cont.)
Precision and Recall
          Patients with disease    Patients without disease
Test +    a                        b
Test -    c                        d
Recall = a/(a+c)
Precision = a/(a+b)
Precision-Recall (PR) Curve
• Graphical plot of precision versus recall for varying cut-off points (thresholds)
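The same 2x2 counts give precision and recall, and sweeping the threshold traces a PR curve. The following is a minimal sketch on made-up labels and scores (not study data). The function names are hypothetical, and the convention of reporting precision as 1.0 when nothing is called positive is an assumption of this example.

```python
def precision_recall(labels, scores, threshold):
    """Return (precision, recall) at the given threshold.
    Assumes at least one record has label 1."""
    a = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    b = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    c = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    recall = a / (a + c)
    # Assumed convention: precision is 1.0 when no record is called positive.
    precision = a / (a + b) if (a + b) > 0 else 1.0
    return precision, recall

def pr_points(labels, scores):
    """(recall, precision) pairs, one per distinct threshold, highest first."""
    points = []
    for t in sorted(set(scores), reverse=True):
        precision, recall = precision_recall(labels, scores, t)
        points.append((recall, precision))
    return points

# Made-up example: 1 = disease present, 0 = disease absent.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
at_half = precision_recall(labels, scores, threshold=0.5)
curve = pr_points(labels, scores)
```

Unlike specificity, precision depends on how common the disease is in the data set, which is one reason a PR curve can be informative alongside an ROC curve when the classes are imbalanced.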
Validation Technique
10-fold cross-validation
• The data set is split into folds 1, 2, 3, …, 10
• Each fold serves once as the test fold; the remaining folds are the training folds
• Merge tested folds for performance analysis
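The procedure can be sketched as follows: deal the records into 10 folds, train on nine, score the held-out fold, and merge the held-out scores so that each record ends up with exactly one out-of-sample score for the ROC and PR analysis. This is a generic illustration, not the authors' pipeline; the function names, the random fold assignment, and the stand-in learner are assumptions of the example.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]

def cross_validated_scores(features, labels, train, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, and merge the held-out
    scores so every record gets exactly one out-of-sample score."""
    merged = [None] * len(labels)
    for test_fold in k_fold_indices(len(labels), k, seed):
        held_out = set(test_fold)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = train([features[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        for i in test_fold:
            merged[i] = model(features[i])
    return merged

def train_prevalence_model(train_features, train_labels):
    """Hypothetical stand-in learner: ignores the features and scores every
    record with the fraction of positive labels seen in training."""
    rate = sum(train_labels) / len(train_labels)
    return lambda x: rate

# Made-up data, for illustration only.
features = list(range(50))
labels = [1 if i % 3 == 0 else 0 for i in range(50)]
folds = k_fold_indices(len(labels), k=10)
scores = cross_validated_scores(features, labels, train_prevalence_model, k=10)
```

Merging the held-out scores before computing the curves yields a single ROC and a single PR curve for the whole data set, rather than ten separate per-fold curves.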
Performance: ROC Curve
Performance: PR Curve
Older versus Younger Women
• Mammography is known to perform better in older women
• We stratified our data set into two parts as follows:
– Mammography data of women younger than age 50 (177 DCIS and 361 invasive cancers)
– Mammography data of women older than age 65 (219 DCIS and 600 invasive cancers)
Performance: ROC
P=0.039
Performance: PR Curve
P=0.038
Conclusions
• Our MBNi can predict the risk of DCIS versus invasive cancer and may perform better in older women.
• Our MBNi has the potential to aid clinical management decisions, such as the need for increased sampling at biopsy and the appropriate selection of surgical interventions.
• Our MBNi is a step towards shared decision-making and may empower older women to better manage their health in the context of their co-morbidities and life expectancy.
Ongoing and Future Research
• Validation of “text extraction” features
• Three-class prediction model: Benign, DCIS and Invasive cancer
• Predict the risk of breast disease types
• Ensemble learning:
– Logistic Regression
– Artificial Neural Networks
– Bayesian Networks
– Support Vector Machines
Thank You!