Rahul Deo, MD PHD Application of Hospital Machine Learning to Cardiac … · 2020. 12. 30. ·...

74
Application of Machine Learning to Cardiac Imaging Rahul Deo, MD PHD Lead Investigator, Brigham and Women’s Hospital One Brave Idea Adjunct Associate Professor, University of California San Francisco Member of Faculty Harvard Medical School "One Brave Idea" logo © American Heart Association; "cdhi" logo © The Regents of the University of California; "BWH" logo © Brigham and Women's Hospital. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/. 1

Transcript of Rahul Deo, MD PHD Application of Hospital Machine Learning to Cardiac … · 2020. 12. 30. ·...

  • Application of Machine Learning to Cardiac Imaging

    Rahul Deo, MD PHD Lead Investigator, Brigham and Women’s Hospital One Brave Idea

    Adjunct Associate Professor, University of California San Francisco

    Member of Faculty Harvard Medical School

    "One Brave Idea" logo © American Heart Association; "cdhi" logo © The Regents of the University of California; "BWH" logo © Brigham and Women's Hospital. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    1

    https://ocw.mit.edu/help/faq-fair-use/

  • Outline

    1. Introduction to cardiac structure and function

    2. Major types of cardiac diagnostics and how they are used

    3. Where’s the data?

    4. Basic computer vision topics relevant to cardiac imaging

    5. A fully automated pipeline for echocardiogram interpretation

    6. Rethinking the future of automated interpretation: lessons

    from the ECG

    7. What about the biology?

    2

  • A Brief Introduction to Cardiac Structure and Function

    3

  • Coronary heart disease (CHD) is the leading global cause of death

    CHD is the leading cause of death in both developed and developing countries.

    Lancet, 2018

    4 © Mayo Foundation for Medical Education and Research. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    https://ocw.mit.edu/help/faq-fair-use

  • The heart’s primary function is as a pump

    1. The heart must deliveroxygenated blood throughoutthe circulatory system

    2. Blood supplies tissues withoxygen for ATP production,delivers and receives signalingmolecules, and removes waste

    3. The heart pumps ~5L of bloodper minute, which can expand to20-35L per minute duringexercise

    4. The rhythmic function of theheart results in >2 billion heartbeats in a typical lifetime

    © source unknown. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/. 5

    https://ocw.mit.edu/help/faq-fair-use/

  • The structure of the heart

    © source unknown. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    4 chambers: RA, RV, LA, LV 4 valves:TV, PV, MR,AV 2 circulations in series:

    © Tineke Willems and Marieke Hazewinkel, courtesy of The Radiology Assistant. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.pulmonary and systemic 6

    https://ocw.mit.edu/help/faq-fair-use/https://radiologyassistant.nl/cardiovascular/cardiac-anatomyhttps://ocw.mit.edu/help/faq-fair-use/

  • The Cardiac Cycle: Synchronized Electrical and Mechanical Activation

    1. The Wiggersdiagram alignsmechanical andelectrical events(and heart sounds)

    2. The heart alternatesbetween periods ofrelaxation and filling(diastole) andperiods ofcontraction andejection of blood(systole)

    © Julian Andrés Betancur Acevedo. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/ 7

    https://ocw.mit.edu/help/faq-fair-use/

  • Visualizing the Heart in Motion

    8 © source unknown. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    https://ocw.mit.edu/help/faq-fair-use/

  • Diseases of the heart are organized intoabnormalities of contractile function, coronaryblood supply, circulatory flow, or heart rhythm

    Abnormality Disease Names Presentation Treatment Contractile function Heart failure Shortness of

    breath, fluid buildup in legs

    medications, ventricular assist device, transplant

    Coronary bloodsupply

    Coronary artery disease, myocardial

    infarction

    Chest pain,shortness of breath

    angioplasty/stenting;coronary artery bypass

    grafting

    Circulatory flow Aortic stenosis/regurgitation, mitral

    stenosis/regurgitation,

    Lightheadedness, shortness of

    breath, fainting

    valve replacement, valve repair

    Heart rhythm Atrial fibrillation/flutter, ventricular

    tachycardia, sick sinus syndrome

    palpitations,fainting, cardiac

    arrest

    ablation, implantabledefibrillator, pacemaker

    9

  • The heart is a complex multicellular organ

    © The Cardio Research Web Project. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    1. The cardiomyocyte is theprimary excitable andcontractile cell in the heart

    2. But … only 31% of cells in theheart are cardiomyocytes

    3. Cardiac function and diseasearises from the interplay of abroad group of cells

    4. Other cell types: endothelial,fibroblast, leukocytes

    A. R. Pinto et al., "Revisiting Cardiac Cellular Composition," Circ Res. 2016 February 5; 118(3): 400–409. © American Heart Association. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    10

    https://ocw.mit.edu/help/faq-fair-use/https://ocw.mit.edu/help/faq-fair-use/

  • Cardiac Imaging in Medical Decision Making

    11

  • Top image © source unknown; middle image © American College of Cardiology; bottom image © source unknown. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    Cardiac imaging plays a critical role in diagnosis (and definitions of disease)

    Modality Cost Approach Diagnostic UtilityElectrocardiogram

    (ECG) $ Voltage

    differences Myocardial infarction

    Echocardiography $$-$$$ Ultrasound (sound waves);Doppler shift

    Quantitation of cardiac structure and function, heart failure, valvular disease, pulmonary

    hypertension MRI $$$$ Magnetic

    resonance (volumetric

    reconstruction)

    Quantitation of cardiac structure and function, heart failure, valvular

    disease Angiography

    (Fluoroscopy andComputed

    Tomography)

    $$$$ X-ray(volumetric

    reconstruction for CT)

    Epicardial coronary artery disease

    SPECT/PET $$$$ Radionuclide tracer

    Coronary artery disease (inferred);

    microvascular disease Intracardiac pressure

    transducers

    $$$ Pressure transducer

    Heart failure, valvular disease, pulmonary

    hypertension

    Many cardiac diseases are defined (for better or worse) as departures from normal anatomic/physiologic values 12

    https://ocw.mit.edu/help/faq-fair-use/

  • Cardiac decisions are often (but not always) guided by inputs from imaging

    Disease Decision Inputs Heart failure Decision to implant a

    defibrillator to prevent sudden death

    Symptoms +ejection fraction of

    the heart 70%

    Aortic stenosis Valve replacement Symptoms +valve area +

    enlargement of theheart

    Atrial fibrillation Decision to start anticoagulation to

    prevent stroke

    Age, sex, otherdiagnoses

    Myocardialinfarction

    Decision to start aspirin and a statin toprevent a future heart

    attack

    A risk model based on age, sex,lab values, blood

    pressure, diabetes

    1. Information content of imaging can be very high … but decisions are based on historical patient populations followed through time with the relevant disease

    2. Risk model and decision analysis is dictated by what data are available for these historical populations

    3. Imaging is available for patient populations for which it is a part of the accepted management plan … but is unlikely to be found for other diseases given cost

    13

  • Imaging Modalities and Data

    14

  • How Medical Imaging Data Are Stored

    1. DICOM (Digital Imaging and Communications Standard) isthe international standard to transmit, store, retrieve, print,process, and display imaging information

    2. Image/video files are stored in DICOM format, and combinea compressed image with a DICOM “header” which includescharacteristics of the image

    3. Open access libraries like GDCM, pydicom facilitatecompressing/uncompressing; reading and editing header

    4. Osirix Lite provides a free DICOM viewer

    15

  • Where can I get access to data?

    1. Most imaging data is housed in data archives (increasingly “vendor neutral”)

    2. Access is often highly limited: 1. Some images have burned in pixels with patient names, dates

    of birth 2. Scalable solutions for download and de-identification are

    not always available (these would facilitate changing vendors) 3. Some systems have monetized their imaging data

    3. Labels (e.g. diagnoses, measurements) are often stored separately in the electronic health record

    4. Scale of data (at BWH) - clearly relates to cost of the study as well as perceived utility: 1. Electrocardiograms: 30 million ECGs 2. Echocardiography: 300,000-500,000 studies 3. Cardiac PET: 8000 studies 16

  • Example of a DICOM header

    Unfortunately, there can be some instrument to instrument variability in how some fields are represented.

    17

  • Characteristics of Cardiac Imaging Data

    1. Compression: lossy vs. lossless2. Spatial resolution: number of

    pixels; pixel dimensions3. Sampling frequency (temporal

    resolution): very high forvarious ultrasound modalitiesand coronary angiography,moderate for CT scanners

    4. Coronary artery velocity is10-65 mm/seconds

    5. “Gating”: electrocardiograminformation can be coupledwith imaging information toaverage images acrosscorresponding portions of thecycle

    Modality Spatial Resolution Temporalresolution

    Echocardiography 2-3 mm 1-5 ms forsome modes;

    typically 20-30msfor 2D

    MRI 0.1mm 30-100 ms

    Angiography(Fluoroscopy)

    0.1mm 1-10 ms

    ComputedTomography

    0.5mm in x,y;0.5-0.625mm in z

    65-175 ms

    SPECT/PET 8-10 mm forSPECT; 3-5 mm for

    PET

    minutes

    18

  • Relevant Topics in Computer Vision

    19

  • Machine learning in cardiac disease - what physician practices can we mimic?

    1. All current measurements (cardiacchamber areas, ventricularthickness) are performed manually1. Severity of a stenosis of the

    coronary artery© American College of Cardiologists.All rights reserved. This content isexcluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    2. Left ventricular cardiacvolumes (and by comparisonacross the cardiac cycle,ejection fraction)

    2. Some disease diagnoses involveclassification of images/videos

    We’ll come back to whether these contributions would be seen as © Stanford Medicine. All rights reserved. This content is excluded from our CreativeCommons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    valuable 20

    https://ocw.mit.edu/help/faq-fair-use/https://ocw.mit.edu/help/faq-fair-use/

  • Many priorities in computer vision are of great interest to cardiac imaging

    1. Image (and video) classification: assigning a label toan image/video

    2. Semantic segmentation: associating each pixel in animage with a class label

    3. Image registration: mapping different sets of imagesonto one coordinate system

    21

  • Image classification: an obvious task to mimic

    1. Many simple disease recognition tasks exist in medicine - and can becarried out by an experienced radiologist in 2 minutes or less1. e.g. lung cancer or not2. pneumonia or not3. breast cancer or not4. fluid around the heart or not

    2. Many of the first successes in medical image classification haveinvolved situations with very large data sets, already labeled in thecontext of routine clinical care1. Chest x-rays2. Mammograms

    3. Barriers to data export and sharing have limited the size of manyother data sets

    22

  • Image classification: convolutional neural networks have rekindled an interest in automated medical image classification

    Red Green Blue

    Samoyed (16); Papillon (5.7); Pomeranian (2.7); Arctic fox (1.0); Eskimo dog (0.6); white wolf (0.4); Siberian husky (0.4)

    Convolutions and ReLU

    Max pooling

    Max pooling

    Convolutions and ReLU

    Convolutions and ReLU

    1. Representation learning2. No need for hand-engineering of features3. Transfer learning: important in training data poor scenarios

    Image from LeCun, Bengio & Hinton, "Deep Learning," Nature 521 (28 May 2015). © Springer Nature Limited. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/. 23

    https://ocw.mit.edu/help/faq-fair-use/

  • Image classification: will anyone use it?

    1. If a radiologist takes 2 minutes to read a study, how muchbenefit is there to automate the process

    2. Liability is an enormous reason why we don’t task-shift imageinterpretation to less-skilled personnel - radiologists are amongthe most sued physicians

    3. So it is unlikely we will see the benefits of other disciplineswhere a machine will be permitted to independently read astudy BUT:1. If there are 1000 X-rays to (over)read or if the hospital is in

    off-hours (overnight, weekend), a machine can pre-read anddecide what’s most urgent to look at

    2. An independent read should catch some missed diagnoses4. The calculus may change in resource poor settings

    24

  • Image classification: explaining the diagnosis

    1. All imaging-based medical decisions have typically required acorresponding human confirmation of a visual finding

    2. In some cases, the need is unambiguous: you can’t take a biopsy of atumor you can’t localize; nor can you submit a report that doesn’tlocalize the abnormality

    3. Increasingly patients and providers share in decisions: requiring bothto be convinced of the validity of the conclusion

    4. CNNs raise some concerns as to whether a simple explanation canalways be given

    Image from Murdoch et al., "Interpretable machine learning: definitions, methods, and applications," PNAS October 29, 2019 116 (44) 22071-22080. © National Academy of Sciences. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https:// ocw.mit.edu/help/faq-fair-use/

    25

    https://ocw.mit.edu/help/faq-fair-use/

  • Image classification: explaining the diagnosis

    1. Different strategies 1. Find input images that maximally activate

    a given class score and compare them according to some interpretable property

    2. Visualize how the network responds to a specific input image

    2. Saliency map (Simonyan,Vedaldi, Zisserman, 2014): 1. Class model visualization: generate an

    image that maximizes the (regularized) class score

    2. Image-specific class-specific saliency map: plots the derivative of the score function for a given class with respect to each pixel

    © Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    26

    https://ocw.mit.edu/help/faq-fair-use/

  • Algorithm 1 Evaluating the prediction difference using conditional and multivariate sampling Input: classifier with outputs p(c|x), input image x of size n ⇥ n, inner patch size k, outer patchsize l > k, class of interest c, probabilistic model over patches of size l ⇥ l, number of samples SInitialization: WE = zeros(n*n), counts = zeros(n*n) for every patch xw of size k ⇥ k in x do

    0x = copy(x) sumw = 0define patch x̂w of size l ⇥ l that contains xwfor s = 1 to S do

    0xw xw sampled from p(xw|x̂w\xw)sumw += p(c|x0) . evaluate classifier

    end for p(c|x\xw) := sumw/SWE[coordinates of xw] += log2(odds(c|x)) log2(odds(c|x\xw))counts[coordinates of xw] += 1

    end for Output: WE / counts . point-wise division

    Zintgraf et al., "Visualizing Deep Neural Network Decisions: Prediction Difference Analysis", ICLR 2017. © Luisa M Zintgraf, Taco S Cohen, Tameem Adel, and Max Welling. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    27

    https://ocw.mit.edu/help/faq-fair-use/

  • Semantic segmentation: localizing structures

    1. Semantic segmentation: assigning labels to individual pixels

    2. Quantification of cardiac function involves estimating the heart’s volume at its upper and lower limits 1. This requires delineating the boundaries

    of the heart: a segmentation task 3. A diagnosis (classification) may also require

    limiting the field of view to a given structure

    4. Many radiology reports involve providing measurements (lengths, areas) for basic

    Lang et al., "Recommendations for cardiac chamber quantification bystructures echocardiography in adults," J Am Soc Echocardiogr. 2015 Jan;28(1):1-39.e14. © American Society of Echocardiography. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    28

    https://ocw.mit.edu/help/faq-fair-use/

  • The U-Net Architecture for Semantic Segmentation

    1. The contraction/downsampling layerprovides arepresentation of thecontext of the image

    2. The expanding/upsampling layer mapscontextual features tothe appropriatelocalization

    3. Skip connectionsconcatenate images fromthe downsampling layerto the upsampling one

    copy and crop

    inputimage

    tile

    output segmentation map

    641

    128

    256

    512

    1024

    max pool 2x2

    up-conv 2x2

    conv 3x3, ReLU

    572

    x 5

    72

    284²

    64

    128

    256

    512

    57

    0 x

    570

    568 x

    568

    28

    280

    ²1

    40

    ²

    13

    13

    68²

    66

    ²

    64

    ²3

    28²

    56²

    54²

    52

    ²

    512

    104

    ²

    10

    10

    200²

    30²

    19

    196²

    392

    x 3

    92

    39

    0 x

    390

    38

    8 x

    38

    8

    388 x

    388

    1024

    512 256

    256 128

    64128 64 2

    conv 1x1

    Ronneberger et al., "U-Net: Convolutional Networks for Biomedical Image Segmentation", MICCAI 2015. © Olaf Ronneberger,Philipp Fischer, and Thomas Brox. All rights reserved. This content is excluded from our Creative Commons license. For moreinformation, see https://ocw.mit.edu/help/faq-fair-use/.

    29

    https://ocw.mit.edu/help/faq-fair-use/

  • Various architectures help incorporate global features and contextual interactions

    30

    © Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    https://ocw.mit.edu/help/faq-fair-use/

  • Image registration: aligning different images

    1. There is sometimes a need to mergeimages from different modalities or fromdifferent time points in the same study

    2. In cardiac imaging this is relevant formerging a study with poor spatialresolution with one which is superior butmay lack functional information: PET + CT

    3. The low temporal resolution of PET alsorequires averaging across many cardiaccycles (ECG-based gating) and there maybe movement of the thorax (breathing)during this time

    © source unknown. All rights reserved. This content is excluded from our Creative Commonslicense. For more information, see https://ocw.mit.edu/help/faq-fair-use/. 31

    https://ocw.mit.edu/help/faq-fair-use/

  • Image registration methods: pre- and post-CNNs

    1. Classification

    1. Intensity domain vs. frequency domain

    2. Raw intensities vs. feature-based

    3. Global (whole image) vs. local (region of interest)

    4. Type of transformation used to relate one image to the other

    5. Monomodal vs. multimodal

    2. Additional choice of similarity metric as well as algorithms to search parameter space for geometric transformation

    3. Conditional variational autoencoder have been explored to learn geometric transformations between pairs of images

    © Taylor & Francis. All rights reserved. This content is excluded from our(Krebs … Mansi, arXiv:1812.07460, 2018) Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/. 32

    https://ocw.mit.edu/help/faq-fair-use/

  • A Fully Automated Echocardiogram Interpretation Pipeline

    33

  • The Failure of our Current Approach to Cardiovascular Disease

    Risk factors deviate from

    optimal values

    • blood pressure • LDL/VLDL cholesterol • weight • blood sugar

    Time

    Hemming andhawing regarding lifestyle changes

    Death and

    disability

    $$$

    Treatment may be (grudgingly)

    initiated • anti-hypertensives • cholesterol-lowering • anti-diabetics

    Symptomsdevelop

    • dyspnea • anginaTherapeutic

    Responsiveness 34

  • What sort of solution are we looking for?

    1. Low-cost quantitative metrics that are indicative of

    disease progression and reflect the onset of these tissue-

    level changes

    2. Should be specific to the disease process:

    1. expressive: capture complex underlying processes

    (molecular, cellular, imaging …)

    2. multidimensional: can’t readily be “gamed”

    3. Should be ameliorated with therapy (c.f. genetic risk)

    35

  • Simple cardiac ultrasound views provide a quantitative metric of early disease progression

    Left ventricular mass increases with disease

    Left ventricular function diminishes with disease

    Left atrial volume increases with disease

    © source unknown. All rights reserved. This content is excluded from ourCreative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    36

    https://ocw.mit.edu/help/faq-fair-use/

  • A role for automated interpretation at the “low risk - high reward” portion of the spectrum

    Non-skilled acquisition Skilled sonographer(primary care)

    Low cost handheld ultrasound Costly full ultrasound system

    Automated interpretation Expert cardiologist interpretation

    Early in disease course Late in disease course

    Decision support regarding initiation Difficult decisions regarding or intensification of therapy surgery

    High liabilityLow liability $ 37

    $$$

  • Machine learning in cardiac disease - what should we be focusing on?

    1. Enabling much greater volumes of data © American College of Cardiologists.All rights reserved. This content is(tracking, clinical trial) by: excluded from our Creative Commons license. For more information, see1. Reducing costs of acquisition https://ocw.mit.edu/help/faq-fair-use/.

    2. Augmenting interpretation of simply acquired data - i.e. diagnosing abnormalities of relaxation without Tissue Doppler

    3. Automating interpretation to reduce costs 2. Surveillance within a hospital system: patient

    identification for therapies 3. Triage: automated interpretation (lessons from

    ECGs in the ambulance/emergency room) 4. Can we go beyond what humans can see?

    1. Quantitative tracking of intermediate states of disease and assessing treatment response

    2. Recognizing meaningful subclasses of © Stanford Medicine. All rights reserved. This content is excluded from our Creative

    disease that differ in prognosis and Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/. treatment 38

    https://ocw.mit.edu/help/faq-fair-use/https://ocw.mit.edu/help/faq-fair-use/

  • A role for rapid triage: the electrocardiogram in © source unknown. All rights reserved. This content is excluded myocardial infarction from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    1. The ST-elevation myocardial infarction pattern in the ECG arises from complete obstruction of blood flow to portions of the heart

    2. In the early 2000’s it was recognized that any delay in angioplasty and stenting would result in irreversible damage to the heart

    3. The previous approach - with a cardiologist reviewing the ECG before any action was taken was replaced with a rapid triage system by ambulance personnel or ED physicians

    4. The cardiac catheterization team would be “activated” by non-cardiologists and needed to arrive to the hospital within 30 minutes

    5. Some subsets of these activations were false positives

    © Healio. All rights reserved. This content is excluded from our Creative Commonslicense. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    39

    https://ocw.mit.edu/help/faq-fair-use/https://ocw.mit.edu/help/faq-fair-use/

  • What is in an Echo Study?

    1. Typically a collection of up to 70 videos of the heart taken over

    multiple cardiac cycles and focusing on different viewpoints

    (requiring ~45 min by a skilled sonographer)

    2. Heart can be visualized from >10 different views - though not truly

    discrete classes as sonographer can zoom and angle the probe to

    focus on structures of interest.These are typically unlabeled.

    3. Still images are typically included to enable manual measurements

    4. UCSF performs 12,000-15,000 echo studies per year; BWH

    performs 30,000-35,000 studies

    5. There are >7,000,000 echos performed annually in the Medicare

    population alone

    6. There are likely 100,000,000’s of archived echos 40

  • DICOM formatimages

    An Automated (low-cost!) Approach to Echo Interpretation

    Echo Studies (14035)

    View View Probability Classification Quality Score (277)

    Segmentation Disease Detection: for 5 views HCM (495/2244) (791*) PAH (584/2487)

    Amyloid (179/804)

    Cardiac Structure: Cardiac Function: Zhang ... Deo, "Fully Automated Echocardiogram Mass and Volume Ejection Fraction Interpretation in Clinical Practice: Feasibility and Diagnostic Accuracy," Circulation Volume 138, Issue 16, 16 October 2018, Pages 1623-1635.(8666) (6407)

    Longitudinal Strain(526)

    41

  • DevelopersJeffrey Zhang

    Rahul Deo

    Project Design Jeffrey Zhang

    Rahul Deo Geoff Tison Sanjiv Shah

    Computer Vision Consultants Laura Hallock Pulkit Agrawal

    EchoCV Team

    Wise Computer Vision Gurus

    Ruzena BajcsyAlyosha Efros

    Image LabelingRahul Deo

    Sravani GajjalaFrancesca Delling

    Statistics Rahul Deo

    Image SegmentationRahul Deo

    Sravani GajjalaGeoff Tison

    Herceptin Data AcquisitionMandar Aras Eugene Fan

    Kirsten Fleischmann Michelle Melisko

    ChaRandle Jordan Atif Qasim

    Strain ComputationLauren Beusslink-Nelson

    Sanjiv ShahAtif Qasim

    Mats Lassen

    "UCSF" and "Berkeley" logos © The Regents of the University of California; "Northwestern Medicine" logo © Brigham and Women's Hospital. All rights reserved.This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    42

    https://ocw.mit.edu/help/faq-fair-use/

  • DICOM formatimages

    An Automated (low-cost!) Approach to Echo Interpretation

    Echo Studies (14035)

    View View Probability Classification Quality Score (277)

    Segmentation Disease Detection: for 5 views HCM (495/2244) (791*) PAH (584/2487)

    Amyloid (179/804)

    Cardiac Structure: Cardiac Function: Mass and Volume Ejection Fraction

    (8666) (6407)Longitudinal Strain

    (526) 43

  • View Classification -Someone Already Got To It Before Us

    X. Gao et al., "A fused deep learning architecture for viewpoint classification ofechocardiography," Information Fusion Volume 36, July 2017, Pages 103-113.

    44

    https://www.sciencedirect.com/science/article/pii/S1566253516301385

  • View Classification - Our Take Prediction of echo views for individual images

    Ground Truth

    PLAX.remote PLAX

    PLAX.zoom of LA PLAX.centered on LA

    RV inflow PSAX.apex

    PSAX.PapMusclePSAX.MV PSAX.AoV

    PSAX.AoV zoom A2c.no occlusions A2c.occluded LA A2c.occluded LV

    A3c.no occlusions A3c.occluded LA A3c.occluded LV

    A4c.no occlusions A4c.occluded LA A4c.occluded LV

    A5c Subcostal

    SuprasternalOther

    364 45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 13 1155 10 82 3 0 1 0 27 0 0 9 0 10 1 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 1 0 0 0 0 0 6 0 0 0 4 0 0 0 0 21 52 107 2 0 0 0 7 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 18 4 18 266 2 15 2 8 0 7 0 0 0 1 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 17 0 0 41 26 936 129 13 2 1 10 0 0 0 0 19 21 0 2 0 1 0 0 1 0 5 7 18 0 195 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 17 3 14 0 0 0 4 766 20 0 0 0 0 0 0 1 0 1 0 15 2 0 0 0 29 1 0 0 1 8 39 290 0 0 0 0 0 4 0 0 0 3 13 0 0 0 1 0 0 0 4 0 0 9 0 1070 90 36 80 8 0 38 17 0 23 2 0 0 0 0 0 0 0 0 0 0 0 0 67 362 2 0 9 0 9 44 0 1 0 0 0 0 0 0 0 0 0 0 2 0 0 64 0 10 0 0 0 1 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 20 5 0 435 80 30 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 3 12 21 161 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 90 1 0 0 0 0 1530 45 104 93 0 0 0 0 0 0 0 0 0 3 0 0 0 22 32 0 0 0 0 28 543 20 12 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 1 0 0 0 3 4 47 20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 9 185 0 0 0 3 1 1 0 1 0 0 0 2 0 3 0 0 0 0 0 9 1 9 15 930 1 24 0 0 0 8 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 16 0 10 0 20 0 0 14 0 0 0 0 0 0 3 0 0 0 0 0 0 20 0 3255

    9 11

    36 146

    81 75

    PLAX

    .remote

    PLAX

    PLAX

    .zoom

    of LA

    PLAX

    .centered on

    LA

    RVinfl

    ow

    PSAX

    .apex

    PSAX

    .PapMuscle

    PSAX

    .MV

    PSAX

    .AoV

    PSAX

    .AoV

    zoom

    A2c.no

    occlusions

    A2c.occluded

    LA

    A2c.occluded

    LV

    A3c.no

    occlusions

    A3c.occluded

    LA

    A3c.occluded

    LV

    A4c.no

    occlusions

    A4c.occluded

    LA

    A4c.occluded

    LV

    A5c

    Subcostal

    Suprasternal

    Other

    Predictions

    Zhang ... Deo, "A Computer Vision Pipeline for Automated Determination of Cardiac Structure and Function and Detection of Disease by Two-Dimensional Echocardiography," arXiv, 2017 45

    0.75

    0.50

    0.25

    0.00

    https://arxiv.org/abs/1706.07342https://arxiv.org/abs/1706.07342

  • DICOM formatimages

    An Automated (low-cost!) Approach to Echo Interpretation

    Echo Studies (14035)

    View View Probability Classification Quality Score (277)

    Segmentation Disease Detection: for 5 views HCM (495/2244) (791*) PAH (584/2487)

    Amyloid (179/804)

    Cardiac Structure: Cardiac Function: Mass and Volume Ejection Fraction

    (8666) (6407)Longitudinal Strain

    (526) 46

  • Segmentation using Convolutional NeuralNetworks

    Image CNNGround Truth

    Image CNNGround Truth

    For all views, only 100-200 manually traced images were used for training We segment every frame of every video

    Zhang ... Deo, "Fully Automated Echocardiogram Interpretation in Clinical Practice," Circulation. 2018;138:1623–1635. 47

    https://www.ahajournals.org/doi/pdf/10.1161/CIRCULATIONAHA.118.034338

  • Deriving “Real World” Measurements: Comparisons to Thousands of Studies from the UCSF Clinical Echo Laboratory

    Metric Number of Echo Studies Used for

    Comparison

    Median Value (IQR) Absolute Deviation - % of Manual (Automated vs. Manual Measurement)

    50 75 95

    Left atrial volume 4800 52.6 (40.0-71.0) 16.1 29.3 66.2

    Left ventricular diastolic volume 8457 92.1 (71.8-119.1) 17.2 30.5 68.0

    Left ventricular systolic volume 8427 33.2 (24.1-46.8) 26 47 108

    Left ventricular mass 5952 148.0 (117.3-159.9) 15.1 27.6 61

    Left ventricular ejection fraction 6407 64.8 (58.3-59.41) 9.7 17.2 39.9

    Global longitudinal strain 418 19.0 (17.0-21.0) 7.5 13.6 30.8

    Global longitudinal strain (Johns Hopkins PKD study)

    110 18.0 (16.0-20.0) 9.0 17.1 39.4

    We can make all of the common measurements for B-mode echo.

    48

  • 49

    Deriving “Real World” Measurements A Left ventricular diastolic volume B Left atrial volume

    N = 8457 N = 4800100100

    Difference

    (%)

    50

    0

    −50 Difference (%)

    50

    0

    −50

    −100−100 50 100 150 2000 100 200 300 Mean of measurements Mean of measurements

    C D 500

    Left atrial volum

    e(autom

    ated) 200

    100 p < 2e−16 Left ventricular

    mass p = 5e−12400

    300

    (autom

    ated)

    case control HCM status

    200

    100

    case control HCM status

    Zhang ..... Deo, "Fully Automated Echocardiogram Interpretation in Clinical Practice," Circulation. 2018; 138: 1623–1635.

    https://www.ahajournals.org/doi/pdf/10.1161/CIRCULATIONAHA.118.034338

  • 50

    Assessing Cardiac Function A Left ventricular ejection fraction B Global longitudinal strain

    100 50N = 6417 N = 418

    Difference

    (%) 50

    0

    −50

    0

    −25Difference

    (%) 25

    −100 −50 0 25 50 75 100 10 15 20 25

    Mean of measurements Mean of measurements

    Zhang ..... Deo, "Fully Automated Echocardiogram Interpretation in Clinical Practice," Circulation. 2018; 138: 1623–1635.

    https://www.ahajournals.org/doi/pdf/10.1161/CIRCULATIONAHA.118.034338

  • Are clinicians really a gold standard?

    1. A typical echocardiogram reader will manually trace the heart in 4-6

    “representative frames” and that will be the gold standard

    2. There is wide variability in measurement values from physician to

    physician - up to 8-9% for the ejection fraction, which has a normal

    value of 60%

    3. How do we show an improvement?

    1. Compare to an average of multiple readers

    2. Compare to a gold-standard imaging system (e.g. MRI)

    3. Demonstrate utility in outcomes

    51

  • Internal Measures of Consistency

    Comparison N Correlation – Manual vs.

    Manual (p-value)

    Correlation – Automated vs.

    Automated (p-value)

    Left atrial volume vs. left ventricular mass 4012 0.54 (

  • Longitudinal Strain in Longitudinal Studies Tracking Patients on Herception Chemotherapy

    Zhang ..... Deo, "Fully Automated Echocardiogram Interpretation in Clinical Practice," Circulation. 2018; 138: 1623–1635.

    A vision of low cost serial monitoring of patients at risk of cardiac dysfunction: hypertension, obesity, diabetes

    53

    https://www.ahajournals.org/doi/pdf/10.1161/CIRCULATIONAHA.118.034338

  • Automated Disease Detection - What’s the Point?

    1. Several rare diseases would benefit from referral to a

    cardiologist or specialty center

    2. These diagnoses tend to be missed at centers that see them

    infrequently

    3. We hypothesized that we could implement “disease detection

    modules” based on these same simple views

    4. This would again be themed as “decision support” - not

    definitive diagnoses

    54

  • A Model for Hypertrophic Cardiomyopathy HCM detection model (phased)

    A4c + PLAX

    © source unknown. All rights reserved. This content isexcluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    1.00

    0.75

    0.50

    ● ● ● ● ● ●

    AUROC = 0.93

    0.00 0.25 0.50 0.75 1.00 False positive fraction• Leading cause of sudden death in

    young athletes 100

    0.00

    0.25

    True

    pos

    itive

    fract

    ion

    ● ●

    ● ●

    ●● ●

    ●●

    ●●

    ● ●

    ●●

    ●●

    ● ●

    ● ●

    ●●

    ● ● ●

    ● ●

    ●●

    ● ●

    ● ●

    ●●

    ●●

    ● ●

    ρ = 0.41

    −10 0 10 20 30 Logit probability of HCM in cases

    Left

    vent

    ricul

    ar m

    ass (g

    kg

    m 2 )

    300

    • 1:500 individuals

    ● ●

    ● ●

    ● ● ●

    ● ●

    ● ●

    ● ●

    ● ●

    ● ●

    ● ●

    ● ●

    ● ●

    ● ●

    ● ●

    ● ●

    ρ = 0.38

    −10 0 10 20 30 Logit probability of HCM in cases

    Left

    atria

    l vol

    ume (m

    L kg

    m2 )

    • Inherited in families 75

    • Can result in unstable heart rhythms, 200

    heart failure, and stroke 50 100

    • Management involves behavioral 25 changes, medication, and preventive

    implantation of a defibrillator Left atrial volumes Left ventricular mass 55

    Zhang ..... Deo, "Fully Automated Echocardiogram Interpretation in Clinical Practice," Circulation. 2018; 138: 1623–1635.

    https://www.ahajournals.org/doi/pdf/10.1161/CIRCULATIONAHA.118.034338https://ocw.mit.edu/help/faq-fair-use/

  • A Model for Cardiac Amyloidosis

    • Can be inherited in families

    • Can result in unstable heart rhythms, heart

    failure, and stroke

    AUROC = 0.87

    0.00

    0.25

    0.50

    0.75

    1.00

    0.00 0.25 0.50 0.75 1.00 False positive fraction

    True

    positive

    fraction

    Amyloid detection model

    Relating CNN probability toCardiac Structure

    • “Senile amyloidosis” is a common cause of

    heart failure in the elderly … but often

    missed

    • Management involves medication, and

    preventive implantation of a pacemaker/

    defibrillator Left ventricular

    mass (g

    m2 )

    200

    150

    100

    50

    0

    ● ● ●

    ● ● ●

    ● ●

    ● ●●

    ●●

    ● ●

    ● ●

    ● ●

    ●● ●

    ●●

    ●● ●

    ● ●

    ● ●

    ● ●●

    ● ●

    ● ●

    ρ = 0.36

    −10 0 10 20 Logit probability of amyloid in cases

    Zhang ..... Deo, "Fully Automated Echocardiogram Interpretation in Clinical Practice," Circulation. 2018; 138: 1623–1635. 56

    https://www.ahajournals.org/doi/pdf/10.1161/CIRCULATIONAHA.118.034338

  • Images © source unknown. All rights reserved. Thiscontent is excluded from our Creative CommonsA Model for Mitral Valve Prolapse license. For more information, seehttps://ocw.mit.edu/help/faq-fair-use/.

    • A disease characterized by abnormal MVP detection model (phased)A4c + PLAXmyxomatous thickening of the valve

    leaflets

    • Seen in 1% of the population

    • Can progress to severe valve disease

    1.00

    True

    pos

    itive

    fract

    ion

    0.75

    0.50

    0.25

    0.00 ●

    ● ●

    ● ●

    AUROC = 0.89

    0.00 0.25 0.50 0.75 1.00 False positive fraction

    and is sometimes seen with arrhythmia

    and sudden death 57

    https://ocw.mit.edu/help/faq-fair-use/

  • What Next - Clinical Deployment!!!

    1. UCSF has filed provisional patent for our system

    2. Code and weights are freely available for academic/nonprofit

    use on Bitbucket: https://bitbucket.org/rahuldeo/echocv

    3. We don’t have FDA approval as a diagnostic - but proof-of-

    concept studies are needed to test the value of integrating

    automation into the clinical workflow

    4. Enabling National and Global Clinical Deployment

    1. Brandon Fornwalt: Geisinger Health System

    2. Patrick Gladding: The University of Auckland, NZ

    58

    https://bitbucket.org/rahuldeo/echocv

  • Musings about the future

    59

  • Some predictions for the future of cardiac imaging: following the path of ECG interpretation

    1. Routine measurements will be made in an automated way -with a visual check of segmentation quality

    2. Some automated diagnoses may happen at point-of-care: assessment of heart function, dangerous accumulation of fluid around the heart

    3. Until image acquisition is facilitated, the benefits of automated interpretation will be muted

    60

  • Where there is greater uncertainty …

    1. Ideally, we should be using automated interpretation to elevate medicine beyond the current practice - but that requires much larger data sets and imaging more often (i.e. time course) than what is currently performed (and reimbursed)

    2. Pharmaceutical companies have motivation to perform high frequency serial imaging to assess whether there are any benefits to medications in a shortened Phase II trial - accurate scalable quantification will be needed

    3. Surveillance of daily studies may be useful to enable identification of individuals who may be eligible for clinical trials or newly approved therapies (e.g. cardiac amyloidosis)

    61

  • Subclassification, risk models, … and the challenge of demonstrating utility

    1. There is no question most disease classifications are crude … and finer distinctions can be made between disease states

    2. There is also no question that survival models are crude, and better predictive models should be possible with imaging data and emerging algorithms

    3. Unfortunately, physicians are only interested in classifications or risk models that will change practice … and require evidence to justify this

    4. So until we have more data, we are left with the status quo and a bunch of research manuscripts

    62

  • What about the biology?

    63

  • What is missing in medicine?

    1. Low-cost quantitative metrics that are indicative of

    disease progression and reflect the onset of these tissue-

    level changes

    2. Should be specific to the disease process:

    1. expressive: capture complex underlying biological

    processes

    2. multidimensional: can’t readily be “gamed”

    3. Should be ameliorated with therapy (c.f. genetic risk)

    64Logo © American Heart Association. All rights reserved. This content is excluded from ourCreative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    https://ocw.mit.edu/help/faq-fair-use/

  • Immediate challenges

    1. Inaccessible biology: the tissue of interest in CHD is

    not accessible … how then to build biological assays

    2. Expensive: most detailed biological measurements are

    costly (c.f. imaging, proteomics, DNA sequencing)

    3. Problematic to train:

    • sample size: models that quantify complex biological processes will need to be high-dimensional … but these will require very large sample size to train and validate

    • time: CHD develops slowly over time but a new biological assay requires prospective enrollment

    65

  • Expanding phenotypic space

    1. Current clinical data sets lack the scale and expressivity needed to

    reflect underlying biological processes

    2. Discoveries from UK Biobank, Partners Biobank, Vanderbilt, Geisinger,

    etc., are all limited by the underlying low dimensionality of phenotypic

    information - and that will not be solved by sample size (more of the

    same)

    3. But these studies were exorbitant and have taken decades to accrue

    the current sample size… how do we improve on this?

    4. We need a data type that has the dimensionality to capture biological

    heterogeneity and complexity and yet can still be collected in a very

    scaleable manner (cf. representation learning needs)

    5. It became very clear we need to stay clear of sequencing technologies

    and costly medical imaging 66

  • A focus on individual circulating blood cells

    1. Causally implicated in CHD pathogenesis

    1. Involvement of neutrophils, monocytes, and lymphocytes in disease

    pathogenesis (plaque pathology; plaque pathology); genetic models in mice

    2. CANTOS trial

    3. Accelerated atherosclerosis in autoimmune disorders

    4. Association of clonal hematopoeisis and early myocardial infarction

    2. Accessible: accessible in a blood draw

    3. Precedence for utility: Existing predictive models exist for CAD using WBC/RBC

    characteristics

    4. Express many of the proteins implicated by genetic analysis in atherosclerosis: e.g.

    LDLR, LPL, FADS1/2

    5. Reflect many pathways found in diverse cell types: autophagy, phagocytosis, free

    radical dissipation 67

  • Cell morphology rather than genomics

    1. Takes advantage of computer vision advances in

    characterizing subtle distinctions between cell types

    and states at low cost

    2. Can analyze tens of thousands of individuals cells

    per participant

    3. Fluorescent dyes permit characterization of

    organelles (mitochondria, ER, Golgi), cytoskeleton

    (actin), nucleic acid (DNA, RNA)

    4. Can be connected to gene expression to clarify

    underlying functional abnormalities

    Images © sources unknown. All rights reserved. This content isexcluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    68

    https://ocw.mit.edu/help/faq-fair-use/

  • Readily amenable to perturbations

    1. Expressivity can be augmented by adding

    perturbational reagents to whole blood and

    repeating the cell staining protocol

    2. Examples: LPS, cholesterol crystals, saturated fatty

    acids

    3. Whole blood environment permits cross talk

    between cells: e.g. vital netosis triggered by LPS-

    platelet interaction

    4. Final readout — high content imaging — is

    inexpensive and directly comparable to baseline

    state

    Vial, molecule, cell, and medical equipment images © source unknown. All rights reserved. This content is excludedfrom our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/. Cholesterol crystal image courtesy of Ed Uthman on Flickr. License: CC BY. Microscope slide image in the public domain,courtesy of Steven Glenn. 69

    https://www.flickr.com/photos/euthman/5205043804https://ocw.mit.edu/help/faq-fair-use/

  • Recruitment workflow – 1000-1200 patients per month (12,000-15,000 per year)

    Vial image © sourceunknown. All rightsreserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

    Cardiology clinic General medicine Primary carePrimary assays: low cost, Secondary assays: costly, less reproducible, expressive, rapid, robust, limited expressivity, non-responsive to therapy, interpretable responsive, non-biological

    Somatic sequencing(CHIP)

    GWAS (GRS)

    Novel Devices PETECG

    Single cell RNA-Seq WGS

    Vial, molecule, cell, and medical equipment images © source unknown. All rights reserved.This content is excluded from our Creative Commons license. For more information, seehttps://ocw.mit.edu/help/faq-fair-use/. Cholesterol crystal image courtesy of Ed Uthman onFlickr. License: CC BY. Microscope slide image in the public domain, courtesy of StevenGlenn.

    70

    https://www.flickr.com/photos/euthman/5205043804https://ocw.mit.edu/help/faq-fair-use/https://ocw.mit.edu/help/faq-fair-use/https://ocw.mit.edu/help/faq-fair-use/

  • Scaling up to incorporate longitudinal data: tapping into the pathology lab

    Sysmex CellaVision

    1. Data has been collected since 2015

    2. >3.5 million data records with 1800 added a day

    3. We connect cellular data to full medical data via API

    4. Digitized slides from 100,000 patients - 13 million images

    5. Prospectively doing this for all acute MI patients

    3.5M FCS files from the XE-5000

    740K raw files from the XN-9000 + 1800/day

    Abbot Sapphire

    170K+ FCS files

    160K+ WBC TYP files

    13M images 18% Lymphocytes (in active DB)

    100K smears + 135/day

    Siemens Advia

    65K+ FCS files

    7.7M raw files

    CellaVision database File Share

    database Aggregation new Server snapshot snapshot Server data

    Currently a single server/database supporting three CellaVision

    1K+ nightly snapshots from 11/2015 to present

    Custom Python script copies new MimerSQL rows to MySQL

    ~13M images

    ~100K smears

    instruments ~8 TB 575 GB ~9hrs per backup

    Accumulated Database

    71

  • Summary of our approach 1. A permissive recruitment scheme to enable rapid

    accrual of tens of thousands of patients per year all

    with expressive phenotyping and full medical records

    2. Use of cell morphology/cell counter data to massively

    expand phenotypic space at low cost using

    perturbations and diverse readouts

    3. Overlapping of multiple phenotypic scales in different

    cohorts to convert costly, tissue-localized phenotypes

    (e.g. PET, CHIP sequencing) into lower cost (TTE, cell

    imaging) models

    4. API-based cohort identification to allow rapid

    identification of patients of interest

    5. Automated curation of the medical record into a

    vehicle for machine learning and causal inference Wellness Disease

    Time

    The macro- and microcirculation Image from Taqueti and DeCarli, "Coronary Microvascular Disease Pathogenic Mechanisms and Therapeutic Options," Journal of the American College of Cardiology Volume 72, Issue 21, 27 November 2018, Pages 2625-2641. Courtesy of Elsevier, Inc., https://www.sciencedirect.com. Used with permission.

    72

    https://www.sciencedirect.com

  • Acknowledgments:

    • American Heart Association • Center for Digital Health innovation at UCSF • Brigham and Women's Hospital • NIH Director's New Innovator Award • National Heart Lung and Blood Institute

    73

  • MIT OpenCourseWare https://ocw.mit.edu

    6.S897 / HST.956 Machine Learning for Healthcare Spring 2019

    For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms

    74

    https://ocw.mit.eduhttps://ocw.mit.edu/termshttps://ocw.mit.edu/terms

    cover-slides.pdfcover_h.pdfBlank Page