Download - Digital Pathology Diagnostic Accuracy, Viewing Behavior and Image Characterization Linda Shapiro University of Washington Computer Science and Engineering.

Digital Pathology Diagnostic Accuracy, Viewing Behavior and Image Characterization Linda Shapiro University of Washington Computer Science and Engineering

Motivation Approximately 1.4 million women undergo breast biopsies. Pathologists determine a diagnosis of benign disease or cancer. Diagnostic errors are alarmingly frequent and lead to altered patient treatment. At the thresholds of atypical hyperplasia and ductal carcinoma in situ, where up to 50% of cases are misclassified. The causes underlying diagnostic errors remain largely unknown. Digital slides are new for pathologists.

The Digital Pathology Study Aim 1. Interpretive accuracy of pathologists: digitized slide images vs. original glass slides. 200 pathologists, 3 experts, 240 slides, 5 diagnostic categories, mouse tracking and eye tracking. Aim 2. Identify visual scanning patterns associated with diagnostic accuracy and efficiency. Tracking data converted to graph structures that allow comparisons between pathologists. Aim 3. Examine and classify image characteristics of ROIs captured in Aims 1 and 2. Computer-based statistical learning techniques will be used to identify image characteristics that lead to correct and incorrect diagnoses.

A Multi-Disciplinary Research Team Dr. Joann Elmore and Dr. Donald Weaver are studying the interpretive accuracy of pathologists viewing digitized slide images in comparison to original glass slides under a microscope. Tad Brunye, a psychologist and cognitive scientist, is collecting detailed eye-tracking data of pathologists interpreting digitized slides to study their visual search process. Linda Shapiro and Selim Aksoy, researchers in computer vision pattern recognition, are developing computational techniques to analyze whole slide images and recorded viewing data in order to identify image characteristics and viewing patterns associated with diagnostic accuracy. Ezgi Mercan, graduate RA, who does the image analysis work. Supported by NIH/NCI R01 CA172343

Whole slide images of breast biopsies with ROIs marked by expert pathologists Proliferative Changes Atypical Ductal Hyperplasia Ductal Carcinoma in-situ Invasive Cancer

Image characteristics and viewing patterns ROIs marked by pathologists as diagnostically useful, as well as on areas of fixation or zoom for eye and cursor movement, are characterized by their color, texture, and structure. Computer-based statistical learning techniques will then identify image characteristics of both diagnostic and distracting image regions. Merging image features with pathologists behavioral data, we will investigate the effects of image features and viewing patterns on diagnostic efficiency and accuracy.

Strategies 1.Capture tracking patterns of individual pathologists as graphs whose nodes represent regions of interest (ROIs) and whose edges represent transitions between ROIs. 2.For each slide, create a summary graph that summarizes the behavior of all pathologists who interpreted that slide on one graph with coincident nodes merged. 3.Use the summary graphs to compare 1.behavior of community pathologists to experts 2.correct diagnoses to incorrect diagnoses. 4.Use the regions that correspond to the graph nodes to 1.help in understanding what image characteristics caused errors 2.lead to automatic diagnostic tools

Localization of Regions of Interest (ROIs) in Whole Slide Images Given: a set of digital whole slide images from multiple diagnostic categories pathologists viewing patterns in terms of zoom level fixation time panning time rectangles marked by pathologists as the most important regions for diagnosis Goal: To predict regions that attract pathologists attention in whole slide images using image characteristics. E. Mercan, S. Aksoy, L. G. Shapiro, D. L. Weaver, J. G. Elmore, Localization of Diagnostically Relevant Regions of Interest in Whole Slide Images. International Conference on Pattern Recognition, August 2014.

Localization of Regions of Interest in Whole Slide Images 1. We developed a model for automatically analyze recorded tracking data to discover regions of interest. These are our ground truth. 2. We also trained a model based on color and texture features to predict ROIs without seeing the tracking data. We used marked rectangles from expert pathologists to train a bag-of-words model. tracking data our viewport analysis ROIs to which pathologist actually paid attention our computational model based on color and texture an unseen WSI our predicted ROIs

1. Learning Visual Dictionary Each 120x120 pixel patch is a word. Words are obtained from ROIs marked by experts. We calculated color (L*a*b*) and texture (LBP of H&E channels) histograms from RGB images for each word. Using k-means clustering we obtained a visual dictionary. 8/26/201410 120x120 pixel patches ROIs marked by experts feature vectors Words of Visual Dictionary L*a*b*EH Color Texture (LBP) H: haematoxylin E: eosin K-means clustering

We used a sliding window approach to cut 3600x3600 pixel overlapping windows (bags). Each bag contains 30x30=900 words and is represented as a histogram of k words from the visual dictionary. Bags in ROIs marked by experts are positive examples for diagnostically relevant regions. Bags outside the expert ROIs are negative examples. 8/26/201411 whole slide images 3600x3600 pixel overlapping sliding windows 2. Training: Sliding Window a histogram (bag of words) for each sliding window +++---+++--- ROIs marked by experts

8/26/201412 bag of words for each sliding window whole slide image 3600x3600 pixel sliding windows probability of being a ROI Binary Classifier (LogReg and SVM) 3. Testing visual dictionary trained on expert ROIs

4. Early Results on Localization of Regions of Interest in Whole Slide Images 20 high resolution breast pathology images from 5 diagnostic categories ranging from benign to invasive. Each of 3 experts provided 1 hand marked ROI and 1 viewport log per image. We ran 10-fold cross-validation experiments with L1- regularized logistic regression and SVM. We obtained 79.60% accuracy in predicting ROIs using an SVM. our predicted probability map based on color and texture ROIs extracted from viewport logs

Current work We are developing image features to describe and discriminate different diagnostic categories. We are using superpixels, which are small regions of pixels that are uniform in color/texture, instead of working at the pixel level. We are looking for structure in the biopsy images, both breast biopsy images and melanoma biopsy images that will help in diagnosis. We are also using nucleus-finding algorithms to find and then classify nuclei. normal cells cancerous cells mitotic figures

Future Work: Structural Features Superpixels with graph-cut methods to segment ducts from stromal background. cells lumen stroma Nucleus detection and segmentation to calculate several nucleus features: original H&E superpixels graph-cut segmentation Shape features: size, circularity, elongation Texture features Color features

Future Work: Melanoma Skin biopsies look very different than breast biopsies. They have an orientation and well defined layers. Melanoma, one of the most deadly skin cancers, is characterized with architectural abnormalities and existence of mitotic figures. A whole slide image of a skin biopsy. Benign (left) and melanoma (right) biopsy samples showing normal and abnormal architectures. Mitotic figures in skin biopsy marked by a pathologist.