Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single...

48
“A single cell approach to interrogating network rewiring in EMT” Dana Pe’er Department of Biological Science Department of Systems Biology Columbia University

Transcript of Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single...

Page 1: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

“A single cell approach to interrogating network rewiring in EMT”

Dana Pe’er

Department of Biological Science

Department of Systems Biology

Columbia University

Page 2: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Learning Networks from Single Cells

Idea: Use natural stochastic variation within a cell population and treat measurements of each individual cell as a sample for learning

Page 3: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Each cell is a pointof information

Abundance of Protein A

Ab

un

dan

ce o

f P

rote

in B

Data-Driven Learning

How does protein A influence protein B?

Assumptions:

Molecular influences create statistical dependencies

We treat each cell as an independent sample of these dependencies.

Page 4: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Can we use single cells to learn signaling networks?

Sachs*, Perez*, Pe’er* et.al. Science 2005

Karen SachsOmar PerezDoug LauffenburgerGarry Nolan

Page 5: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Datasets of cells

• condition ‘a’• condition ‘b’•condition…‘n’

12 Color Flow Cytometry

perturbation a

perturbation n

perturbation b

Conditions (96 well format)

Primary Human T-Lymphocyte Data

Assumptions:

Treat perturbation as an “ideal intervention” (Cooper, G. and C. Yoo (1999).

Page 6: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Phospho-Proteins

Phospho-Lipids

Perturbed in dataPKC

Raf

Erk

Mek

Plc

PKA

Akt

Jnk P38

PIP2

PIP3

1Reversed

3Missed

17/17Reported

15/17T Cells

Inferred T cell signaling map

siRNA

[Sachs et al, Science 2005]

Page 7: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

What did we need to succeed?

PKC

Raf

Erk

Mek

Plc

PKA

Akt

Jnk P38

PIP2

PIP3

420 instead of 6000 samples

PKC

Raf

Erk

Mek

Plc

PKA

Akt

Jnk P38

PIP2

PIP3

420 averaged samples

Large number of samples and single cell resolution are needed for success

Page 8: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat
Page 9: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Spectral overlap in flow cytometry

http://www.dvssciences.com/technical.html

10molecules

1000molecules

1%overlap

20molecules

Page 10: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Mass cytometry work flow

FCS data export

Measure

by TOF

Nebulize

single-cell

droplets

Ionize

(7500K)

High-dimensional analysisFCS data

Ionize

(7500K)

Isotopically

enriched

lanthanide

ions (+3)

x 4 to 6 polymers

= 120 to 180

atoms per

antibody

30-site

chelating

polymer

We get 45 dimensions simultaneously in millions

of individual cells

Bendall*, Simonds* et. al. Science 2011

Mass cytometry: a game changer

Page 11: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Decreased spectral

overlap

Increased

dimensionality

Mass cytometry

45 dimensions

and counting

Page 12: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

How does signal processing differ between subtypes?

Krishnaswamy et.al. Science 2014

Smita KrishnaswamyMatthew H. SpitzerMichael MingueneauSean C BendallOren Litvin, Erica StoneGarry Nolan

Page 13: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Signaling Through T-cell Maturation

Naïve(CD44-)

Effector/Memory(CD44+)

Lymph

Naïve and effector memory CD4+ T-cells have similar signaling network, yet these respond differently

Our surface panel has enough markers to resolve key T-cell subsets together with their signaling

They have been stimulated and processed in the same tube allowing for direct comparison

Page 14: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

pSL

P7

6

pCD3z

Real Mass Cytometry Data

14

Each point is a cell

Units of measurement: log-scale transformed molecule counts

pCD3z

pSLP76

Page 15: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

15

Scatterplots Reveal Only Range

Pre-Stimulation Post-Stimulation

pSL

P7

6

pCD3z pCD3z

Cannot discern effect of stimulation

Page 16: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Kernel Density Estimationp

SLP

76

pCD3z

Kernel Density Estimation (KDE) learns underlying probability distribution

16

Page 17: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

17

Pre-Stimulation Post-Stimulation

KDE obscures X-Y relationship

Molecules shift together

Coarse functional relationship

Page 18: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Conditioning unveils X-Y Relationship

Conditional distribution for each X-slice is computed

Captures behavior across full dynamic range

Captures behavior of small populations of responding cells

Page 19: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Change in Signal Transfer Relationship

Pre-Stimulation Post-Stimulation

X-increase X-increase

Y-increase

Y-increase

This is beyond “increasing pCD3z levels”

Page 20: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

How do we quantify information transmitted by an edge?

The high local joint density biases mutual information assessment

DREMI resamples Y from conditional density in each X-slice to reveal relationship between X and Y

The key is we want to model P(Y|X)

Rather than P(X,Y)

Page 21: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

DREMI captures “edge strength”

v

v

Page 22: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Comparing Naïve to Effector memory T-cells

pSLP76 responds more strongly in effmem T-cells

The “edge” transmits pCD3z levels more faithfully in naïve T-cells

pCD3z

pSL

P7

6

0 0.5 1 2

Naive

Effmem

4

Page 23: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Comparing Naïve to Effector memory T-cells

Increased transmission of input in naïve T-cells propagates down

For a longer duration

Page 24: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Protein Activation: a Different View

• sdgfd Levels of molecules are higher in Effmem

Effmem cells need less antigen to trigger

Naïve cell responses are more tailored to input

Page 25: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

DREMI Reveals Alternative Pathway

Effmem cells have alternate input via AKT pathway

Page 26: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Predicting differences in “edge” strength

Pre-erk-KD levelPost-erk-KD level

.65

Pre-erk-KD levelPost-erk-KD level

.26

pERK

pS6

pERK

pS6

Naïve (4m) Effmem (4m)

Predictions for ERK KO mouse

Erk_KO should impact pS6 more in Naïve cells

Difference should accentuate at the 3 minutes after stimulus

Page 27: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Validation of edge strength prediction

Replicate 1 Replicate 2

Average pS6B6 – ERK_KO

We validated that the influence of pERK on pS6 is stronger in Naïve T-cells.

Similar validation for differences between CD4 and CD8

Page 28: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

The devil is in the details

KDE's interpolate over areas where there are no samples, so they correct for gaps to some extent.

Histogram approach, fast, but sensitive to bandwidth

Kernel approach, slow and tedious need to integrate all kernels at every point of evaluation, most heuristics sensitive to noise

Page 29: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Hybrid Method for Density Estimation

• We take a hybrid method for density estimation.

• Use the speed of histogram and the smoothness of Kernels:

• 1. Build a histogram of the initial data

• 2. Obtain a good estimate of the bandwidth

• 3. Smooth the histogram using the bandwidth.

• Goal:f̂h (x) =

1

nh 2pe

-h2 (x-xi )

2

2

i=1

n

å

Botev et.al., Annals of Statistic, 2010

Page 30: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Connection to heat equation

Heat Equation:

It governs the distribution of temperature in a region over time.

∂f

∂t=

1

2

∂2f

∂x2, with initial condition: f x,0( )=D

A Gaussian kernel, (which is what we want) is the unique

solution to the above equation!

f̂h (x) =1

nh 2pe

-h2 (x-xi )

2

2

i=1

n

å

Page 31: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

“Spreading of Heat” over time akin to Smoothing Data

At t = 0, the initial condition is a delta peak at 0. For any t>0, we get a Gaussian.

In finite domain, the solution to heat equation is a Fourier series in cosine

Motivates us to work in frequency domain.

=> Solution = Discrete Cosine Transforms

Facilitates rapid computation

f (x) = am cos(mp x)exp-m2p 2t

2

æ

èç

ö

ø÷

m=0

¥

å

Page 32: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Computing in frequency domain

0 200 400 600 800 10000

0.005

0.01

0.015Histogram of the input data

X

De

nsity

DCT

Smooth DCT

0 200 400 600 800 10000

0.005

0.01

0.015

X

De

nsity

Original Histogram

Final Density Estimate

Invert Smooth DCT

This is equivalent to solving heat diffusion in a bound space

Page 33: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Smoothing in action:increasing the diffusion

Page 34: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Diffusion KDE

34

Diffusion-based KDE estimate is faster and smootherBotev, et al., Annals of Stats, 2011

Page 35: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Reconfiguring Signaling Edges Driving EMT

Smita KrishnaswamyRoshan SharmaNevana ZivanovicBernd Bodenmiller

Page 36: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Epithelial-mesenchymal transition (EMT)

Epithelial Mesenchymal

The cells transition between two very different states.

Can we understand the changes in signaling and phenotype underlying this transition?

Induce EMT by treating a breast cancer cell line with TGFB

Page 37: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

EMT: State Change in Cells Cellular heterogeneity: both epithelial and

mesenchymal cells coexist during transition.

• Both epithelial and mesenchymal cellsMMTV-PyMT

E-Cadherin

Vimentin

Both epithelial and mesenchymalcells at day 3

Page 38: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Early, young

Late, mature

A trajectory approach to development

Single cell studies are finding that sometimes development is a continuous progression

Strong signal in the data, simple methods get rough approximation, but hard to get accurate progression.

Page 39: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

The Challenge: Non-Linearity Development is highly non-linear in n-D space

Euclidian distance is a poor measure for chronological distance

Page 40: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Wanderlust Approach

• Convert data to a k nearest neighbors graph• Each cell is a node• Each cell only “sees” its local

neighborhood

Bendall*, Davis*, Amir* et.al. Cell 2014

Page 41: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Derive Trajectory using “graph walk”

s

T

• What is the position of a cell along the trajectory?

- Start from an early cell- Define distance by walking

along graph

But, very noisy data, many additional tricks needed.

Page 42: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Wanderlust

1. Convert data into a set of klNN graphs

2. In each graph, iteratively refine a trajectory using a set of random waypoints

3. The solution trajectory is the average over all graph trajectories

A graph based trajectory detection algorithm. Wanderlust is scalable,

robust and resistant to noiseWe use randomness to overcome noise!

Page 43: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Refine distances using waypoints

s Choose M random waypoints, l1…lM

Page 44: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Refine distances using waypoints

Next, find the shortest path from each waypoint lito n

Short distances are more reliable and help refine

order locally

Page 45: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

.

.

.

.

.

.

.

.

.

l1 aligned SP

l2 aligned SP

l3 aligned SP

l4 aligned SP

lM aligned SP

New orientation trajectory

Contribution of li is weighed by its distance from p

Refine distances using waypoints

Page 46: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

klNN graph

klNN: k-out-of-l nearest neighbors

Generate l nearest neighbors graph

To generate one klNN graph,

For each node, pick k neighbors randomly

Initial lNN graph klNN #1 klNN #2 klNN #3

Each shortcut appears in only a small number of klNN-graphs

Page 47: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Wanderlust Trajectory

Wanderlust infers path from Hematopoietic Stem Cells to immature B cells from a single sample of human bone marrow.

Matches prior knowledge, robust and reproducible across 7 individuals.

Identified and validated 3 novel rare progenitor states (0.007% of cells)

Page 48: Characterizing a phenotypic landscape of tumor ...Columbia University. Learning Networks from Single Cells Idea: Use natural stochastic variation within a cell population and treat

Acknowledgements

Jacob Levine

Michelle Tadmor

El-ad David Amir

Oren Litvin

Smita Krishnaswamy

Nolan Lab (Stanford)

Garry Nolan

Sean Bendall

Matt Spitzer

Kara Davis

Erin Simons

Tiffany Chen

Manu Setty

Linas Mazutis

Ambrose Carr

Roshan Sharma

Bodenmiller Lab (U Zurich)

Bernd Bodenmiller

Nevana ZivanovicDavid van Dijk

Josh Nainys