Digital Pathology Diagnostic Accuracy, Viewing Behavior and
Image Characterization Linda Shapiro University of Washington
Computer Science and Engineering
Slide 2
Motivation Approximately 1.4 million women undergo breast
biopsies. Pathologists determine a diagnosis of benign disease or
cancer. Diagnostic errors are alarmingly frequent and lead to
altered patient treatment. At the thresholds of atypical
hyperplasia and ductal carcinoma in situ, where up to 50% of cases
are misclassified. The causes underlying diagnostic errors remain
largely unknown. Digital slides are new for pathologists.
Slide 3
The Digital Pathology Study Aim 1. Interpretive accuracy of
pathologists: digitized slide images vs. original glass slides. 200
pathologists, 3 experts, 240 slides, 5 diagnostic categories, mouse
tracking and eye tracking. Aim 2. Identify visual scanning patterns
associated with diagnostic accuracy and efficiency. Tracking data
converted to graph structures that allow comparisons between
pathologists. Aim 3. Examine and classify image characteristics of
ROIs captured in Aims 1 and 2. Computer-based statistical learning
techniques will be used to identify image characteristics that lead
to correct and incorrect diagnoses.
Slide 4
A Multi-Disciplinary Research Team Dr. Joann Elmore and Dr.
Donald Weaver are studying the interpretive accuracy of
pathologists viewing digitized slide images in comparison to
original glass slides under a microscope. Tad Brunye, a
psychologist and cognitive scientist, is collecting detailed
eye-tracking data of pathologists interpreting digitized slides to
study their visual search process. Linda Shapiro and Selim Aksoy,
researchers in computer vision pattern recognition, are developing
computational techniques to analyze whole slide images and recorded
viewing data in order to identify image characteristics and viewing
patterns associated with diagnostic accuracy. Ezgi Mercan, graduate
RA, who does the image analysis work. Supported by NIH/NCI R01
CA172343
Slide 5
Whole slide images of breast biopsies with ROIs marked by
expert pathologists Proliferative Changes Atypical Ductal
Hyperplasia Ductal Carcinoma in-situ Invasive Cancer
Slide 6
Image characteristics and viewing patterns ROIs marked by
pathologists as diagnostically useful, as well as on areas of
fixation or zoom for eye and cursor movement, are characterized by
their color, texture, and structure. Computer-based statistical
learning techniques will then identify image characteristics of
both diagnostic and distracting image regions. Merging image
features with pathologists behavioral data, we will investigate the
effects of image features and viewing patterns on diagnostic
efficiency and accuracy.
Slide 7
Strategies 1.Capture tracking patterns of individual
pathologists as graphs whose nodes represent regions of interest
(ROIs) and whose edges represent transitions between ROIs. 2.For
each slide, create a summary graph that summarizes the behavior of
all pathologists who interpreted that slide on one graph with
coincident nodes merged. 3.Use the summary graphs to compare
1.behavior of community pathologists to experts 2.correct diagnoses
to incorrect diagnoses. 4.Use the regions that correspond to the
graph nodes to 1.help in understanding what image characteristics
caused errors 2.lead to automatic diagnostic tools
Slide 8
Localization of Regions of Interest (ROIs) in Whole Slide
Images Given: a set of digital whole slide images from multiple
diagnostic categories pathologists viewing patterns in terms of
zoom level fixation time panning time rectangles marked by
pathologists as the most important regions for diagnosis Goal: To
predict regions that attract pathologists attention in whole slide
images using image characteristics. E. Mercan, S. Aksoy, L. G.
Shapiro, D. L. Weaver, J. G. Elmore, Localization of Diagnostically
Relevant Regions of Interest in Whole Slide Images. International
Conference on Pattern Recognition, August 2014.
Slide 9
Localization of Regions of Interest in Whole Slide Images 1. We
developed a model for automatically analyze recorded tracking data
to discover regions of interest. These are our ground truth. 2. We
also trained a model based on color and texture features to predict
ROIs without seeing the tracking data. We used marked rectangles
from expert pathologists to train a bag-of-words model. tracking
data our viewport analysis ROIs to which pathologist actually paid
attention our computational model based on color and texture an
unseen WSI our predicted ROIs
Slide 10
1. Learning Visual Dictionary Each 120x120 pixel patch is a
word. Words are obtained from ROIs marked by experts. We calculated
color (L*a*b*) and texture (LBP of H&E channels) histograms
from RGB images for each word. Using k-means clustering we obtained
a visual dictionary. 8/26/201410 120x120 pixel patches ROIs marked
by experts feature vectors Words of Visual Dictionary L*a*b*EH
Color Texture (LBP) H: haematoxylin E: eosin K-means
clustering
Slide 11
We used a sliding window approach to cut 3600x3600 pixel
overlapping windows (bags). Each bag contains 30x30=900 words and
is represented as a histogram of k words from the visual
dictionary. Bags in ROIs marked by experts are positive examples
for diagnostically relevant regions. Bags outside the expert ROIs
are negative examples. 8/26/201411 whole slide images 3600x3600
pixel overlapping sliding windows 2. Training: Sliding Window a
histogram (bag of words) for each sliding window +++---+++--- ROIs
marked by experts
Slide 12
8/26/201412 bag of words for each sliding window whole slide
image 3600x3600 pixel sliding windows probability of being a ROI
Binary Classifier (LogReg and SVM) 3. Testing visual dictionary
trained on expert ROIs
Slide 13
4. Early Results on Localization of Regions of Interest in
Whole Slide Images 20 high resolution breast pathology images from
5 diagnostic categories ranging from benign to invasive. Each of 3
experts provided 1 hand marked ROI and 1 viewport log per image. We
ran 10-fold cross-validation experiments with L1- regularized
logistic regression and SVM. We obtained 79.60% accuracy in
predicting ROIs using an SVM. our predicted probability map based
on color and texture ROIs extracted from viewport logs
Slide 14
Current work We are developing image features to describe and
discriminate different diagnostic categories. We are using
superpixels, which are small regions of pixels that are uniform in
color/texture, instead of working at the pixel level. We are
looking for structure in the biopsy images, both breast biopsy
images and melanoma biopsy images that will help in diagnosis. We
are also using nucleus-finding algorithms to find and then classify
nuclei. normal cells cancerous cells mitotic figures
Slide 15
Future Work: Structural Features Superpixels with graph-cut
methods to segment ducts from stromal background. cells lumen
stroma Nucleus detection and segmentation to calculate several
nucleus features: original H&E superpixels graph-cut
segmentation Shape features: size, circularity, elongation Texture
features Color features
Slide 16
Future Work: Melanoma Skin biopsies look very different than
breast biopsies. They have an orientation and well defined layers.
Melanoma, one of the most deadly skin cancers, is characterized
with architectural abnormalities and existence of mitotic figures.
A whole slide image of a skin biopsy. Benign (left) and melanoma
(right) biopsy samples showing normal and abnormal architectures.
Mitotic figures in skin biopsy marked by a pathologist.