Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of...

26
Heterogeneity in evolutionary models of tumor progression Rick Durrett, Jasmine Foo, Kevin Leder, John Mayberry and Franziska Michor July 14, 2010 1 Introduction sec-intro All cancers and tumor types display a striking variability among the cancer cells within a single tumor. Tumors are composed of a heterogeneous mixture of cell types with distinct genotypes, morphologies, metabolic activities and behaviors such as proliferation rate, anti- gen expression, drug response and metastatic potential [16, 20]. For example, molecular and phenotypic analysis of cells from breast carcinomas reveal defined subpopulations with distinct gene expression and (epi)genetic profiles [33]. Heterogeneity and subpopulations within single tumors have also been demonstrated via flow cytometry in cervical cancers and lymph node metastases [27]. Cytogenetic heterogeneity such as variation in ploidy and chromosomal structure has been discovered among cells within breast tumors and leukemias [35]. Genetic clonal diversity has also been observed in individual pre-malignant lesions in Barrett’s esophagus, a condition associated with increased risk of developing esophageal ade- nocarcinoma [21, 24]. Virtually every major type of human cancer and biological subtype has been shown to contain distinct cell subpopulations with differing heritable alterations [20, 16, 25]. Tumor heterogeneity has direct clinical implications on disease classification and prog- nosis, as well as on treatment efficacy and drug target identification [25, 20]. The degree of genetic clonal diversity in Barrett’s esophagus has been correlated to clinical progression to esophageal cancer [24]. In prostate carcinomas, tumor heterogeneity has been cited as a key factor in pretreatment underestimation of tumor aggressiveness and incorrect assessment of DNA ploidy status of tumors [34, 15]. Heterogeneity has long been implicated in the devel- opment of resistance to cancer therapies after an initial period of response [13, 25], as well as in the development of metastases [11]. In addition, tumor heterogeneity has been shown to hamper the precision of microarray-based analyses of gene expression patterns, which are currently widely used for the identification of genes associated with specific tumor types [28]. These issues underscore the importance of obtaining a more detailed understanding of the origin and temporal evolution of heterogeneity during tumorigenesis. The clonal evolution model of carcinogenesis states that tumors are monoclonal, i.e. originating from a single abnormal cell, and that over time the descendants of this ancestral 1

Transcript of Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of...

Page 1: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

Heterogeneity in evolutionary models of tumorprogression

Rick Durrett, Jasmine Foo, Kevin Leder, John Mayberry and Franziska Michor

July 14, 2010

1 Introductionsec-intro

All cancers and tumor types display a striking variability among the cancer cells within asingle tumor. Tumors are composed of a heterogeneous mixture of cell types with distinctgenotypes, morphologies, metabolic activities and behaviors such as proliferation rate, anti-gen expression, drug response and metastatic potential [16, 20]. For example, molecularand phenotypic analysis of cells from breast carcinomas reveal defined subpopulations withdistinct gene expression and (epi)genetic profiles [33]. Heterogeneity and subpopulationswithin single tumors have also been demonstrated via flow cytometry in cervical cancersand lymph node metastases [27]. Cytogenetic heterogeneity such as variation in ploidy andchromosomal structure has been discovered among cells within breast tumors and leukemias[35]. Genetic clonal diversity has also been observed in individual pre-malignant lesions inBarrett’s esophagus, a condition associated with increased risk of developing esophageal ade-nocarcinoma [21, 24]. Virtually every major type of human cancer and biological subtypehas been shown to contain distinct cell subpopulations with differing heritable alterations[20, 16, 25].

Tumor heterogeneity has direct clinical implications on disease classification and prog-nosis, as well as on treatment efficacy and drug target identification [25, 20]. The degree ofgenetic clonal diversity in Barrett’s esophagus has been correlated to clinical progression toesophageal cancer [24]. In prostate carcinomas, tumor heterogeneity has been cited as a keyfactor in pretreatment underestimation of tumor aggressiveness and incorrect assessment ofDNA ploidy status of tumors [34, 15]. Heterogeneity has long been implicated in the devel-opment of resistance to cancer therapies after an initial period of response [13, 25], as wellas in the development of metastases [11]. In addition, tumor heterogeneity has been shownto hamper the precision of microarray-based analyses of gene expression patterns, which arecurrently widely used for the identification of genes associated with specific tumor types [28].These issues underscore the importance of obtaining a more detailed understanding of theorigin and temporal evolution of heterogeneity during tumorigenesis.

The clonal evolution model of carcinogenesis states that tumors are monoclonal, i.e.originating from a single abnormal cell, and that over time the descendants of this ancestral

1

Page 2: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

cell acquire various combinations of mutations [20, 25]. Under this model, genetic drift andnatural selection drive the progression and diversity of the tumor. As the tumor progresses,genetic instability and uncontrolled proliferation allow the production of cells with additionalalterations which may confer characteristics such as drug resistance, resistance to cell deathsignaling, metastatic proclivity and increased mutation and proliferation rates. The resultingdistinct subpopulations evolve, in turn produce new mutations and subpopulations as timeprogresses.

Mathematical models of tumor heterogeneity can be found in the literature, many ofwhich consider the dynamics of distinct subpopulations evolving under selective pressure[26, 4, 5, 18, 14, 19, 31, 10, 3]. In many of these models, the subpopulations representpredefined phenotypes, such as the drug -sensitive and -resistant cells of a tumor, and thegrowth dynamics of these populations are explored. In this work we consider a stochasticevolutionary model of tumorigenesis in an exponentially growing population, wherein ge-netic alterations confer random fitness changes. Under this model, we address questionssurrounding the extent of genetic diversity in tumor subpopulations and its evolution intime. This work is an extension of a previous paper in which we investigated the effect of therandom mutational fitness distribution on the growth kinetics of the tumor [8]. The paper isorganized as follows: in section 2 we introduce the model and discuss the different types ofsubpopulations that emerge in simulations. In section 3, we state some useful results from[10, 8] and use these results to investigate heterogeneity between populations which haveaccumulated varying numbers of mutations. We also derive useful approximations for (i)the size of the subpopulation of all individuals with k mutations and (ii) the waiting timeuntil we see the first individual with k mutations appear. In section 4, we analyze the effectsof our model parameters on the degree of genotypic heterogeneity within the populationof cells with a single genetic alteration by considering two quantitative measures of hetero-geneity: Simpson’s index and the proportion of cells which come from the largest family ofgenotypically identical cells. We conclude in Section 5 with a discussion of our results.

2 Model Descriptionsec-model

We consider a multi-type branching process model of tumorigenesis in which mutations conferan additive change to the birth rate of the cell. This additive change is drawn according toa probability distribution ν which we refer to as the fitness distribution. In our terminology,type-i cells have accumulated i ≥ 0 mutations. The initial population consists entirely oftype-0 cells that give birth at rate a0 to new type-0 cells and produce type-1 cells at rateu1. We refer to u1 as the mutation rate for type-0 cells. We assume that all cells in thepopulation die at rate b0 < a0, and that the population of type-0 cells starts at a sufficientlylarge population V0 so that we can approximate its size by Z0(t) = V0e

λ0t, where λ0 = a0−b0.When a type-0 cell produces a type-1 cell, the new cell gives birth to type-1 cells at ratea0 + X, where X ≥ 0 is drawn according to the distribution ν and produces type-2 cellsat rate u2. In general, a type-(k − 1) cell with birth rate a produces a new type-k cell atrate uk and the new type-k cell assumes an increased birth rate a + x where x ≥ 0 is drawn

2

Page 3: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

according to ν. We suppose that each type-k cell produced by a type-(k − 1) cell starts agenetically distinct lineage of cells and we refer to the set of all of its type-k descendants asits family. We also let Zk(t) denote the total number of type-k cells in the population attime t and when we refer to the kth wave or generation of mutants, we mean the set of alltype-k cells. We denote the total population at time t by Z(t).

We will restrict our attention in this paper to fitness distributions concentrated on [0, b]for some b > 0. If the fitness distribution is unbounded with tail ν(x,∞) ∼ Kxβe−γxα

, thenit is shown in [8] that the growth rate of the number of first generation mutants is super-exponential and hence this choice of distribution is unrealistic for modeling tumorigenesis.We shall discuss two distinct classes of distributions:

(i) ν is discrete and assigns mass gi to a finite number of values b1 < b2 < · · · < bN = b.

(ii) ν is continuous with a bounded density g(x) that is continuous and positive at b.

The special case of discrete distributions with N = 1 (i.e. deterministic fitness advances)was first studied in [17, 14] and asymptotic results for this model were obtained in [10]. Asimilar discrete time branching process model is considered in [3]. There, cells can accumulate“driver” and “passenger” mutations. The former decrease the death rate of cells while thelatter provide no selective advantage. This essentially corresponds to a special case of (i)above with N = 2 and b1 = 0. Asymptotic results for general continuous distributions wereobtained in [8]. Theorem 1 below contains a summary of results which are relevant for ourcurrent discussion.

Figure 1 shows a snapshot of the population decomposition in a sample simulation of ourmodel in case (ii) with ν ∼ Uniform[0,0.05]. Here we have started with a single type-0 cellthat has birth rate a0 = 0.2 and death rate b0 = 0.1. The type-0 population is not shown, buthas reached O(106) cells at the time of the snapshot. This figure illustrates the complex geno-typic variability present in the population of cells produced by our model. We observe thatthere are two sources of heterogeneity present in the population: variability in the numberof mutations per cell (heterogeneity between generations) and genotypic variation betweenmembers of the same generation (heterogeneity within a generation). Generations appear inwaves with a large number of different mutant families making significant contributions tothe generation size. Our goal here is to discuss these two sources of heterogeneity and deriveanalytic results that quantify the relationship between the amount of genotypic variationpresent in the population and our model parameters. Heterogeneity between generations isdiscussed in Section 3 while heterogeneity within a generation is discussed in Section 4.

3 Heterogeneity between generationssec-btwn

We begin with a summary of some previous results. We shall assume that ui ≡ u, define

λk = λ0 + kb

3

Page 4: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36 0.38 0.4Birth rate of clone

Wave 1Wave 2Wave 3Wave 4Wave 5

(a)

0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36 0.38 0.4Birth rate of clone

60

80

100

120

140

(b)

Figure 1: A sample cross-section of tumor heterogeneity. (a) Each mutant family presentin the tumor at time t = 150 is represented with a circle, positioned on the horizontal axisaccording to fitness (birth rate) and on the vertical axis according to family size. Colorsdelineate families with differing numbers of mutational alterations (wave k cells have kalterations). (b) Colorscale depicts time of creation of each family. In this simulation,a0 = 0.2, b0 = 0.1, ν ∼ U([0, 0.05]), u = 0.001.wave_sample2

4

Page 5: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

to be the maximum growth rate that can be attained by generation k mutants, and let

pk = −k +k−1∑j=0

λk

λj

.

Throughout this paper, we shall also use ”⇒” to denote convergence in distribution.

th2B Theorem 1. If ν satisfies (i) above, then

(1/u)(k+pk)e−λktZk(t) ⇒ Vd, k

where Vd, k has Laplace transform

exp(−dk(λ0, b)V0θλ0/λk)

for all θ ≥ 0. If ν satisfies (ii) above, then

(t/u)k+pke−λktZk(t) ⇒ Vc, k

where Vc, k has Laplace transform

exp(−ck(λ0, b)V0θλ0/λk)

for all θ ≥ 0. Here, dk(λ0, b) and ck(λ0, b) are constants which depend on the indicatedparameters. Explicit formulas can be found in [8], Section 4.

The “d” and “c” in the constants and subscripts stand for discrete and continuous. In case(ii), Theorem 1 follows from [8], Theorem 4. In case (i) when N = 1, Theorem 1 follows fromthe discussion in [8], Section 4 (see also [10], Theorem 5 for a similar result when the numberof type-0 individuals is random), but these results can easily be extended to general finitedistributions because the exponential growth of generation k implies that one can ignore thecontribution of mutations which advance the fitness by bi < b. Note that in case (ii), there isa polynomial correction to the exponential growth of generation k, but the limiting behaviorof the two systems is otherwise similar. This slowdown is due to the fact that when thefitness distribution is continuous, it takes an additional amount of time until the birth of thefirst individuals in generation k with near maximal growth rates (i.e. growth rates near λk

- see [8], Section 1.1).

3.1 Theoretical results: small mutation limit

Theorem 1 implies that in case (i), for example, we have the approximation

log Zk(t) ≈ λkt− (k + pk) log(1/u) + log Vd,k (3.1) Vkapprox

when t is large. Dividing both sides of this equation by L = log(1/u) and looking at theprocess on a time-scale of order L, we can see that the log size of generation k approaches adeterministic, linear limit as the mutation rate u → 0.

5

Page 6: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

linlim Theorem 2. As u → 0,

(1/L) log+ Zk(Lt) → zk(t) = [λkt− (k + pk)]+ = λk(t− βk)

+

in probability where

βk =k + pk

λk

=k−1∑j=0

1

λj

.

The convergence is uniform on compact subsets of [0,∞).

The uniform convergence is a consequence of the continuity and monotonicity of the lim-its. Note that the limiting process depends on λ0, the growth rate of type-0’s, and b, themaximum attainable fitness increase, but is otherwise independent of the particular choiceof fitness distribution. An example of the limiting process is shown in Figure 2. In Section3.2, we shall compare the limiting approximation given by Theorem 2, the approximationgiven by the righthand side of (3.1), and the results of simulations.

10 15 20 25 30 35 40 45 50 55 600

2

4

6

8

10

12

t

z k(t)

25 30 35 40 45 500

1

2

3

4

5

6

7

8

t

z k(t)

Figure 2: t vs. zk(t): λ0 = .1, b = .05. Left: First 20 waves started at t = 10 = 1/λ0,the time that 1’s begin to be born. Right: Closer look at the firstseven waves showing thechanges in the dominant type. fig:bpwaves

As a consequence of Theorem 2, we obtain the following important corollary regardingthe birth time of type-k’s.

Corollary 1. Let Tk = inf{t ≥ 0 : Zk(t) > 0} be the first time a type-k individual is born.Then as µ → 0,

Tk/L → βk

in probability for all k ≥ 0.

From the definition of βk, it is clear that

βk − βk−1 =1

λk

6

Page 7: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

0 2 4 6 8 10 12 14 16 18 2010

15

20

25

30

35

40

45

50

55

k

β k

30 35 40 45 50 550

1

2

3

4

5

6

7

8

t

m(t

)

Figure 3: Left: Birth times for the first 20 generations plotted as a function of generationnumber. Right: Dominant type as a function of time. Same parameters as Figure 2. fig:domtype

is decreasing so that the increments between the birth times for successive generations de-crease as k increases, leading to an acceleration in the rate at which new mutations areaccumulated. This acceleration can be seen on the left side of Figure 3 and occurs regardlessof our particular choice of fitness distribution although since λ−1

k is inversely proportional tob, distributions which allow for larger fitness increases will tend to exhibit shorter increments.

In [3], the authors observe a similar acceleration of waves, but based on approximationsin [2], they suggest that this acceleration is an artifact of the presence of both passengerand driver mutations and does not occur when only driver mutations which confer a fixedselective advantage are allowed (i.e. when the fitness increments are deterministic). In ourmodel, the cause of the acceleration of waves is due to the difference in growth rates betweenthe successive generations: type-k’s are born when generation k− 1 reaches size O(1/u) andsince the asymptotic growth rate of generation k is larger than the asymptotic growth rateof generation k−1, generation k+1 will reach size O(1/u) faster than generation k. Anotherexample of this phenomenon can be found in [9] where the authors study a related stochasticmodel of tumor growth in which the population of cells grows at a fixed exponential rate andsubpopulations of different cell types compete for space. There, the cause of the accelerationis again related to growth rates: later generations take longer to achieve dominance in theexpanding population of cells and hence, new types are born with a higher fitness advantageover the population bulk, allowing them to reach size O(1/u) more rapidly.

We conclude this section with a second useful consequence of Theorem 2: an asymptoticresult for the time at which type-k individuals become dominant in the population.

Corollary 2. Let Sk = inf{t ≥ 0 : Zk(t) > Zj(t), ∀ j 6= k} be the first time that type-kindividuals become the dominant type. Then

Sk/L → tk = b−1 + βk

7

Page 8: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

in probability as µ → 0 for all k ≥ 1.

The limit tk is the solution to

λk(tk − βk) = λk−1(tk − βk−1)

i.e. the time when zk(t) first overtakes zk−1(t). The right side of Figure 3 shows an exampleof how the index m(t) of the largest generation at time t, defined by

zm(t)(t) = max{zk(t) : k ≥ 0},

changes over time. The transitions between periods of dominance are only sharp in the smallmutation limit and an interesting question for future investigations would be to determinethe time scale on which these transitions occur. Note also that because of the log scale,at any given time the population consists primarily of members in the current dominantgeneration, i.e.

(1/L) log Z(Lt) → zm(t)(t)

as µ → 0. Therefore, for small mutation rates, the amount of genetic heterogeneity presentin the population is determined by the amount of heterogeneity present in the dominantgeneration.

3.2 Positive mutation rates: numerical simulationssec-simvtheo

In this section we numerically investigate the heterogeneity between waves in the case ofpositive mutation rates and deterministic fitness distributions. Given that there are approx-imately 3 billion = 3×109 base pairs in the human genome and assuming a mutation rate perbase pair of O(10−8)−O(10−10), we expect that point mutations occur at a rate of between.3 and 30 per cell division. However we assume that advantageous mutations constitute onlya fraction of the possible mutations and therefore that the mutation rate per cell divisionfor advantageous mutations should be in the approximate range of O(10−5)− O(10−2). Weshall use u = 10−5 in our investigations below (which also corresponds to the value used bythe authors in [3]).

Generation Mean Standard Deviation1 4.7638 1.37382 10.6010 2.24343 17.0519 3.0282

Table 1: Means and standard deviations for log Vd, k, k = 1, 2, 3. table:Vdk

In Figure 5 (a) and (b), we compare the average log size of the kth generation in simula-tions with the limiting approximation given by Theorem 2. We observe qualitatively similarbehavior, however the limiting approximation consistently underestimates the times at which

8

Page 9: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

2 4 6 8 10 12 14 160

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

log Vd,1

Rel

ativ

e F

requ

ency

Figure 4: Relative frequency histogram of 1000 random samples from the distribution oflog Vd,1. fig:logV1

new waves appear. To explain the source of this bias, we use the alternative approximationgiven by righthand side of (3.1) which can be rewritten as

zLk (t) = Lzk(t/L) + log Vd,k (3.2) zhat

where L = log(1/u). Using the expression for the Laplace transform of Vd,k and the numericalalgorithm of [30], we simulate 1000 random draws from the distribution of Vd,k. Table 1 showsthe sample mean and standard deviation for log Vd,k, k = 1, 2, 3. The distribution of log Vd,k

has a positive mean and is skewed right (see Figure 4 for an example with k = 1) implyingthat the limit in Theorem 2 will in general underestimate the size of generation k for positivemutation rates. The approximation obtained by replacing log Vd,k with the sample mean oflog Vd,k is plotted in Figure 5 (c). We note that after an initial period in which the numberof type-k individuals is small, this picture closely resembles the plot in (a). We also notethat the variance of log Vd,k tends to increase with k and hence, we should expect to see anincreasing amount of variability in simulations in the time when type-k individuals are born.In Figure 5 (d), we plot the right hand side of (3.2) replacing log Vd,k with the value twostandard deviations above its mean to illustrate an extreme scenario.

9

Page 10: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

5 10 15 20 25 300

1

2

3

4

Time

Log

popu

latio

n si

ze

5 10 15 20 25 300

1

2

3

4

Time

Log

popu

latio

n si

ze

5 10 15 20 25 300

1

2

3

4

Time

Log

popu

latio

n si

ze

5 10 15 20 25 300

1

2

3

4

Time

Log

popu

latio

n si

ze

Wave 0Wave 1Wave 2Wave 3

Figure 5: The log size of generations 1 through 4 as a function of time. Note that bothtime and space are plotted in units of L = log(1/u) (a) (Top left) Average values over 106

simulations. (b) (Top right) Limiting approximation from Theorem 2. (c) (Lower left)Approximation from (3.2) using the mean of log Vd,k. (d) (Lower right) Approximation from(3.2) using the value two standard deviations above the mean of log Vd,k. The parametersused in the simulation are u = 10−5, a0 = 0.2, b0 = 0.1, and b = .05. fig:het_finite

10

Page 11: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

4 First generation heterogeneitysec-first

In this section, we study within generation heterogeneity by examining the amount of geno-typic heterogeneity present in the first generation of individuals. We use two statisticalmeasures to assess heterogeneity: (i) Simpson’s Index, which gives the probability that tworandomly chosen individuals from the first generation come from the same clone, and (ii) thefraction of individuals in the first generation which come from the largest family of individu-als. To obtain these results, we will rely on Theorem 3 below which gives us insight into howthe limit in Theorem 1 comes about: the limit is the sum of points in a nonhomogeneousPoisson process. Each point in the limiting process represents the contribution of a differentmutant lineage to Z1(t) so that it suffices to calculate (i) and (ii) for the limiting process.Before stating this result, we need to introduce some terminology. Here and in what follows,we use |A| to denote the number of points in the set A. We say that Λ is a Poisson processon (0,∞) with mean measure µ if Λ is a random set of points in (0,∞) with the followingproperties:

(i) For any A ⊂ (0,∞), N(A) = |Λ ∩ A| is a Poisson random variable with mean µ(A).

(ii) For any k ≥ 1, if A1, ..., Ak are disjoint subsets of (0,∞), then N(Ai), 1 ≤ i ≤ k areindependent.

We also let α = λ0/λ1 ∈ (0, 1) denote the ratio of the growth rate of 0’s to the maximalgrowth rate of 1’s and note that 1 + p1 = 1/α.

th1C Theorem 3. Let Λ be a Poisson process on (0,∞) with mean measure

µ(A) =

A

αz−(α+1)dz

and let S denote the sum of the points in Λ. Then there exist positive constants Ad, Ac =Ad(λ0, b), Ac(λ0, b) which depend on the indicated parameters so that in case (i) as t →∞

(AduV0)−(1+p1)e−λ1tZ1(t) ⇒ S,

and in case (ii) as t →∞(AcuV0))

−(1+p1)t1+p1e−λ1tZ1(t) ⇒ S.

For more details, see [8], Theorem 3 and [10], Corollary to Theorem 3. Note that the meanmeasure for Λ has tail µ(x,∞) = x−α.

Let Xn denote the nth largest point in Λ, and let Sn =∑n

i=1 Xi denote the sum of then largest points. To determine the dependence of Xn on n we first note that if we defineΛ′ = f(Λ) where f(x) = x−α, then Λ′ is a Poisson process and after making the change ofvariables y = x−α, we can see that the mean measure is

µ′(A) =

f−1(A)

αx−(α+1)dx =

A

dy = |A|.

11

Page 12: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

In other words, Λ′ is a homogeneous Poisson process with constant intensity and hence, thespacings between points are independent exponentials with mean 1. If we let Tn denotethe time of the nth arrival in Λ′, then the law of large numbers implies that Tn ∼ n asn → ∞. Since Xn = T

−1/αn , we obtain Xn ∼ n−1/α as n → ∞. This already suggests how

the behavior of Xn depends on α: smaller α corresponds to a quicker decay in Xn and hence,less heterogeneity. In addition we have the following Lemma. Although this result holdsfor general α > 0, we recall that we are assuming here and throughout this section thatα ∈ (0, 1).

meansum Lemma 1.EXn = Γ(n− 1/α)/Γ(n).

Furthermore, if we define S∞ =∑∞

i=1 Xi, then

ES∞ < ∞.

Proof. Since Tn has a Gamma(n,1) distribution, we have EXn = ET−1/αn = Γ(n−1/α)/Γ(n).

Stirling’s approximation implies that Γ(n − 1/α)/Γ(n) ∼ n−1/α and the second conclusionfollows.

4.1 Simpson’s Index

One common measure of heterogeneity in a population is Simpson’s index which is theprobability that two randomly selected individuals come from the same family. Recallingthat Xi is the contribution of the ith largest family of generation 1 individuals to the totalsize of generation, we define Simpson’s index for our point process by

R =

∑∞i=1 X2

i

(S∞)2=

∞∑i=1

(Xi

S∞

)2

.

The formula for the mean ER is remarkably simple.

simpmean Theorem 4. ER = 1− α

This result shows that the average amount of heterogeneity present in the first generationdepends only on α, the ratio of the growth rate of type-0’s to the maximum attainable growthrate of type-1 individuals. The key to our proof is a result in [12] which considers

Rn =n∑

i=1

(Yi

Sn

)2

where Yi are iid random variables in the domain of attraction of a stable law with index αand Sn = Y1 + · · ·+ Yn and shown that

limn→∞

ERn = 1− α.

12

Page 13: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

To explain the connection between the two results, we note that if we have P (Yi > x) = x−α,for x ≥ 1 and let Yn,i = Yi/n

1/α, then

nP (Yn,i ∈ A) → µ(A).

This implies that if we let ∆n = {Yn,i : i ≤ n} be the point process associated with the Yn,i

and define the measures ξn ≡∑

x∈Λnδx and ξ =

∑x∈Λ δx, then we have

ξn ⇒ ξ.

so that we should expect ER to agree with lim ERn. We make this argument rigorous inAppendix A.

In Figure 6, we plot the sample mean of Simpson’s index of Z as a function of time fordifferent values of α and compare with 1− α, the expected value of Simpson’s index for thelimiting point process. We observe that initially, the sample mean tends towards 1− α. Forlarger values of b, the sample mean overshoots the limiting value. Our theory guaranteesthat eventually the values of the sample mean will converge, however we were not able tosimulate far enough out in time to see this occur.

60 70 80 90 100 110 1200

0.1

0.2

0.3

0.4

0.5

α = 0.9091 (b = 0.01)

60 70 80 90 100 110 1200

0.1

0.2

0.3

0.4

0.5

α = 0.6667 (b = 0.05)

Figure 6: Expected value of Simpson’s Index for the first wave. We compare the samplemean (dots) of Simpson’s index at times t = 60 to 120 with the expected value of Simpson’sIndex for the limiting point process (lines), for two different values of α. Parameters: λ0 =0.2, a0 = 0.1, ν ∼ U([0, b]), where b = 0.01 (left) and b = 0.05(right). fig:SI_diffa

We can further exploit the previously discussed connection between ∆n and Λ to gainadditional insight into the distribution of Simpson’s index. For example, one can obtainexpressions for higher moments of R (see [1]). In [22], it is established that as n →∞,

Sn(2) = R−1/2 =

∑ni=1 Yi

(∑n

i=1 Y 2i )

1/2

has a limiting distribution with a density f that satisfies

f(y) ∼ ae−by2

, as y →∞,

13

Page 14: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

for some constants a, b > 0 (see [22], equation (5.7) and [32], Theorem 6.1 for more details).Therefore, after making the change of variables x = y−1/2, we can see that the density of Rnear the origin has the form

g(x) ∼ ax−3/2e−bx−1

, as x → 0, .

In Figure 7, we perform simulations of Simpson’s index for the first wave mutants in thebranching process Z. We use parameters u = 10−3, λ0 = 0.2, a0 = 0.1 and mutationsconfer an additive fitness change drawn according to ν ∼ U([0, 0.01]). In the simulations weobserve convergence of the empirical distribution of Simpson’s Index to the distribution forthe limiting point process. Curiously, in the plot at t = 70, we observe a few noticeable bumpsin the distribution (e.g. around 0.5). These bumps can be explained by a few families withsimilar size dominating the population; this occurs with enough probability to be apparentin the density.

4.2 Largest clonessec-largest

To further investigate heterogeneity properties of the point process, let

Sn =n∑

i=1

Xi

be the total contribution from the largest n points. Let

Vn = X1/Sn

be the fraction of individuals descended from the largest family of first generation mutants.Our next result shows that 1/Vn converges to a non-trivial limit and provides us with anexplicit formula for the characteristic function of the limit.

charfunc Theorem 5. As n → ∞, V −1n ⇒ W where W has characteristic function ψ satisfying

ψ(0) = 1 and

ψ(t) =eit

fα(t)

for all t 6= 0 with

fα(t) = 1 + α

∫ 1

0

(1− eitu)u−(α+1)du.

The form of the characteristic function is the same as the characteristic function forlimn→∞ Tn/Y(1) where the Yi are iid random variables with power law tails, Y(1) = maxi≤n Yi,and Tn =

∑ni=1 Yi (see, for example, [6]). Again, this agreement is a consequence of the

previously discussed connection between ∆n and the limiting Poisson point process.

14

Page 15: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

5

10t=70

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

5

t=90

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

5

10t=110

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

5

10

15t=130

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

5

10

15t=∞

Simpsons Index

Figure 7: Empirical distribution of Simpson’s Index for the first wave. We include plots ofSimpson’s Index for the branching process at times t = 70, 90, 110, 130 along with Simpson’sIndex for the limiting point process (t = ∞). The histograms show the average over 1000simulations. For the limiting point process, we approximate Simpson’s index by looking atthe largest 104 points in the process. Parameters: λ0 = 0.2, a0 = 0.1, ν ∼ U([0, 0.01]). fig:SI_dist

15

Page 16: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

It is interesting to note that the characteristic function in Theorem 5 is not integrable.The problem is that the density of V −1

n blows up near 1. As an explanation for this, we notethat with probability

exp(−(1− x−α)) exp(−x−α)x−α = e−1x−α

there is a point in the process bigger than x and no points in [1, x). When this happens,

V −1n = Sn/X1 ≤ 1 + n/x

and soFn(y) = P (V −1

n ≤ y) ≥ e−1n−α(y − 1)α.

If we had Fn(y) ∼ (y−1)α, then the density would blow up like (y−1)α−1 as y → 1. We willconfirm that this gives the right asymptotic by providing an explicit formula for the densityof W .

Vdensity Corollary 3. W has a density on (1,∞) given by

f(y) = limM→∞

∫ M

−M

eit(1−y)

fα(t)dt.

Note that integral expression above does not converge absolutely so part of the proof willconsist of showing that the limit exists. If we apply the change of variable s = t(y − 1) inthe definition of f , we see that

f(y) = (y − 1)α−1

∫ ∞

−∞

e−it

(y − 1)α +∫ (y−1)−1

01−eiut

uα+1 du

thus confirming the intuition that the density blows up like (y − 1)α−1 as y approaches 1.Differentiating ψ leads to simple expressions for the mean and variance of the limit.

Vmean Corollary 4. EW = 11−α

and var( W ) = 2(1−α)2(2−α)

.

Figure 8 suggests that the rate of convergence is slow for α close to 1.Returning to the study of Vn = X1/Sn, we note that Theorem 5 implies that Vn converges

to a non-trivial limit V = W−1 and Jensen’s inequality applied to the strictly convex function1/x implies that E(lim X1/Sn) > 1−α. Simulations suggest that despite this fact, deviationsof the mean from 1− α are small as illustrated in Figure 8.

5 Discussionsec-disc

In this work we have investigated the evolution of diversity in a stochastic model of tumorexpansion incorporating random mutational advances. In Section 3 we considered hetero-geneity between tumor subpopulations with varying numbers of mutations. In the limit as

16

Page 17: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

2

4

6

8

10

12

α

E V

n −1

n=103

n=104

n=105

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 20

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

α

E V

n

n=10

n=102

n=103

n=104

n=105

Figure 8: Left: Comparison of Monte Carlo estimates for EV −1n and the limit (1 − α)−1.

Right: Comparison of the Monte Carlo estimates for EVn and the curve (1− α)+. Numberof simulations = 100. fig:snmean

the mutation rate approaches zero, we obtained estimates of the contribution of each waveof mutants to the total population. These limiting approximations depend on the maximumattainable fitness advance for mutants, but not on the specific form of the fitness distribu-tion. We also obtained estimates of the time of arrival of the first cell of each wave whichshowed that the accumulation of new genetic alterations will accelerate over time due to theincreasing growth rates of successive generations. Simulations demonstrate that for small,but positive mutation rates, the behavior of the system is qualitatively similar to the limitingpredictions (see Figure 5). These simulations also suggest that as time increases, multiplewaves of mutants coexist without a single, largely dominant wave.

In Section 4 we investigated the genotypic diversity within the first wave of mutants. Inparticular, we considered two measures of diversity: the Simpson’s Index and the fractionof individuals in the first generation which come from the largest family of individuals. Weanalyzed these diversity measures in a limiting regime where the first generation of mutantscan be described by an inhomogeneous point process. For this point process, we obtained theexact mean of Simpson’s Index as well as the form of its density near the origin. Interestingly,the mean of Simpson’s index is given by the quantity 1−α, where α is the ratio between thefitness of the unmutated ancestral population and the maximum possible fitness of the firstwave of mutants. Next, we observed that as time increases the mass of the distribution ofSimpson’s Index moves closer to 0, indicating higher levels of diversity in the tumor at latertimes (see Figure 7). This is also observed via direct numerical simulation of the branchingprocess; we observed convergence of the distribution and mean of Simpson’s Index to thepredicted limiting values.

In Section 4.2 we investigated the ratio between the total population size of the first waveof mutants and the size of the largest family. We show that this ratio can be approximated

17

Page 18: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

by a random variable with mean (1− α)−1. The density of this random variable is given inCorollary 3. We note that as α approaches 1 the mean of this ratio grows to infinity. Inother words, as α approaches 1 the largest family constitutes a vanishing proportion of thetotal wave one population.

Our results indicate that tumor diversity is strongly dependent upon the age of the tumorand the quantity 1 − α, but is otherwise unaffected by changes to the fitness distribution.Investigations of Simpson’s Index and the ratio Sn/X1 indicate that if α is close to one(i.e. fitness advances are small) we expect the tumor population to have a higher diversity.In addition, our results on Simpson’s Index and inter-wave heterogeneity indicate that alonger lived tumor will have a higher level of diversity. An open problem which could lead toadditional insights into how tumor heterogeneity changes over time is to quantify the amountof heterogeneity present in later generations by obtaining an explicit formula for the meanof Simpson’s Index for generation k ≥ 2. We conjecture that the mean of Simpson’s Indexwill be a decreasing function of the generation number, indicating higher levels of diversityin later generations and a further increase in the total amount of heterogeneity present inthe tumor at later times.

A Proofs of results on first generation heterogeneitysec-proofs

We will use the following notation in this appendix. For a real number t, we define thefunction

sgn(t) =

−1, t < 0

0, t = 0

1, t > 0.

For a complex number z we denote the real part of z by Re[z] and its imaginary part byIm[z].

A.1 Simpson’s Index

To prove Theorem 4, let

Rn =

∑ni=1 Y 2

n,i

(∑n

i=1 Yn,i)2

denote the value of Simpson’s index for the point process Λn. The following lemma is arestatement of Theorem 5.3 in [12] applied to the Yi.

iidsimp Lemma 2. ERn → 1− α as n →∞.

We will prove Theorem 4 by showing that we also must have ERn → ER. To this end,let

Rn(ε) =

∑ni=1 Y 2

n,i1Yn,i>ε(∑ni=1 Y 2

n,i1Yn,i>ε

)2

18

Page 19: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

denote the truncated value of Simpson’s index for Λn and

R(ε) =

∑∞i X2

i 1Xi>ε

(∑∞

i=1 Xi1Xi>ε)2

denote the truncated value for Λ. Then for any ε > 0, we have

|ERn − ER| ≤ E|Rn −Rn(ε)|+ E|Rn(ε)−R(ε)|+ E|R(ε)−R| (A.1) tribnd

We will complete the proof by deriving appropriate bounds for each of the three terms onthe righthand side of (A.1). For the first term, we have

Rnerror Lemma 3.lim sup

n→∞E |Rn −Rn(ε)| ≤ hε

where hε → 0 as ε → 0.

Proof. Let ε > 0 and write An,k =∑n

i=1 Y kn,i, An,k(ε) =

∑ni=1 Y k

n,i1Yn,i>ε, and An,k(ε) =An,k − An,k(ε) for k = 1, 2. Since

EY k1 1Y1≤εn1/α =

∫ εn1/α

1

kyk−1y−αdy ≤ Cεk−αnk/α−1

for k = 1, 2, we have the boundEAn,k(ε) ≤ Cεk−α. (A.2) errorbnd

After noting that An,2 ≤ A2n,1, An,2(ε) ≤ A2

n,1(ε) ≤ A2n,1, and An,1(ε) + An,1(ε) = An,1, we

have for any δ > 0,

E|Rn −Rn(ε)| = E

∣∣∣∣An,2(ε)

A2n,1

+ Rn(ε)

(A2

n,1(ε)− A2n,1

A2n,1

)∣∣∣∣≤ δ−2EAn,2(ε) + P (An,1 ≤ δ)

+ δ−1E

(|2An,1(ε)An,1(ε) + A2

n,1(ε)|An,1

)+ P (An,1 ≤ δ)

= δ−2E|An,2(ε)|+ δ−1E

(∣∣∣∣An,1(ε)

(2 +

An,1(ε)

An,1

)∣∣∣∣)

+ 2P (An,1 ≤ δ)

≤ δ−2EAn,2(ε) + 3δ−1EAn,1(ε) + 2P (An,1 ≤ δ)

≤ ε2−α/δ2 + 3ε1−α/δ + 2P (An,1 ≤ δ) (A.3) Rndiff

where we have used (A.2) in the last line. To control the third term on the right, let φ denotethe Laplace transform of Y1. Then

1− φ(t) = α

∫ ∞

1

(1− e−ty)y−(α+1)dy

= αtα∫ ∞

t

(1− e−x)x−(α+1)dx ∼ Ctα

19

Page 20: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

as t → 0 since 1 − e−x ∼ x as x → 0 implies that∫∞0

(1 − e−x)x−(α+1)dx < ∞. We canconclude that

E exp(−tAn,1) = (1− (1− φ(t/n1/α))n → exp(−Ctα)

as n →∞. In particular, An,1 ⇒ A1, where A1 has the above Laplace transform. Since

1− exp(−Ctα) → 1

as t →∞, we have P (A1 = 0) = 0 so that taking δ = ε(1−α)/2 in (A.3) yields the result.

To bound the second term on the righthand side of (A.1), we need some notation. Let Mp

denote the class of all point measures on (0,∞). In a slight abuse of notation, we write v ∈ νwhen ν ∈ Mp and v ∈ supp(ν). We shall equip Mp with the topology of vague convergence(see, for example, Section 3.4 in Resnick [29]) and take as our σ-algebra the one generatedby open sets in this topology. Associated with any random set of points, we can associatea measure ξ which is a random variable with values in Mp. We will write Λn ⇒ Λ to meanthat the associated random measures ξn ⇒ ξ.

Lemma 4. Λn ⇒ Λ and if we define the maps Fk,ε : Mp → [0,∞) by

Fk,ε(µn) =∑x∈µn

xk1x>ε

for k = 1, 2, then(F1,ε(Λn), F2,ε(Λn)) ⇒ (F1,ε(Λ), F2,ε(Λ)).

Proof. Since

nP (Yn,i ∈ A) = n

n1/αA

αy−(α+1)dy =

A

αx−(α+1)dx = µ(A)

for all Borel sets A, the first claim follows from Proposition 3.21 in [29]. The second claimfollows from the Continuous Mapping Theorem (see, for example, page 152 in [29]), the factthat Fk,ε is continuous away from measures ν with ε /∈ ν, and the fact that the randommeasure associated with Λ has no point masses with probability 1.

As a consequence of this lemma, the fact that Rn(ε) ≤ 1, and the bounded convergencetheorem, we have

Rep Corollary 5.E |Rn(ε)−R(ε)| → 0

as n →∞ for any ε > 0

It thus remains to establish the following

20

Page 21: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

Rerror Lemma 5.lim sup

ε→0E |R(ε)−R| = 0.

Proof. We can establish this result using the same results as in the proof of Lemma 3. Inparticular if we define Ak =

∑∞n=1 Xk

n and Ak(ε) =∑∞

n=1 Xkn1Xn<ε for k = 1, 2. Then

following the display in A.3 we have for any δ > 0

E |R−R(ε)| ≤ δ−2EA2(ε) + 3δ−1EA1(ε) + 2P (A1 ≥ δ).

It is obvious that P (A1 = 0) = 0 and EA2(ε) ≤ EA1(ε) for ε < 1, so it only remains toestablish that

EA1(ε) → 0,

as ε → 0. However this result follows immediately from Lemma 1. Therefore taking δ =(EA1(ε)

)1/4completes the proof.

We can now complete the proof of Theorem 4 by letting n →∞ and then ε → 0 in (A.1)and applying Lemmas 3 and 5 and Corollary 5.

A.2 Largest Clones

We begin with the

Proof of Theorem 5. Theorem 5.1 in [6] implies that we have

E exp(itTn/Y(1)

) → ψ(t)

as n → ∞ where as in Section 4, Y(1) = maxi≤n Yi and Tn =∑n

i=1 Yi. To conclude thatTn/Y(1) ⇒ V , we need to show that ψ is continuous at 0. To establish this fact, we makethe change of variables v = tu to conclude that

fα(t) = 1 + α

∫ 1

0

(1− eitu)u−(α+1)du = 1 + α|t|α∫ |t|

0

(1− eivsgn(t))v−(α+1)dv. (A.4) altpsi

Since 1− exp(iv) ∼ −iv as v → 0, the integral on the right-hand side of (A.4) is finite andhence,

ψ(t) = eitf−1α (t) → 1

as t → 0. Since Tn/Y(1) ⇒ V , the fact that Sn/X1 ⇒ V follows from the arguments in theprevious section.

Proof of Corollary 3. We first establish that there are no point masses in the distributionof V . By the inversion formula we have for any a ∈ R,

P (V = a) = limT→∞

1

2T

∫ T

−T

e−iatψ(t)dt

= limT→∞

1

2T

∫ T

−T

eit(1−a)

fα(t)dt.

21

Page 22: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

If we focus on the positive axis and use the change of variable s = t/T ,

1

2T

∫ T

0

eit(1−a)

fα(t)dt =

1

2

∫ 1

0

eisT (1−a)

fα(sT )ds.

From display (A.4) it follows that for every s ∈ (0, 1) we have eisT (1−a)/fα(sT ) → 0 asT →∞. Note that

Re[fα(t)] = 1 + α

∫ 1

0

1− cos ut

uα+1du > 1,

which implies |fα(t)| ≥ 1 for all t. Therefore∣∣eisT (1−a)/fα(t)

∣∣ ≤ 1 for all t and it follows viathe Dominated Convergence Theorem that

limT→∞

1

2

∫ 1

0

eisT (1−a)

fα(sT )ds = 0.

A similar result holds for the integral on the negative axis and we conclude that

P (V = a) = 0.

We can therefore conclude for x > 1 and h > 0 via the inversion formula (see [7], (3.2)) andFubini’s Theorem that

P (V ∈ (x, x + h)) = limT→∞

1

∫ T

−T

∫ x+h

x

e−ityψ(t)dydt

= limT→∞

1

∫ x+h

x

∫ T

−T

e−ityψ(t)dtdy.

Therefore in order to establish the result we need to show that

limT→∞

1

∫ x+h

x

∫ T

−T

e−ityψ(t)dtdy =1

∫ x+h

x

∫ ∞

−∞e−ityψ(t)dtdy.

This follows if we show that limT→∞∫ T

−Te−ityψ(t)dt is a convergent integral, and that there

exists a bounded function h defined on (x, x + h) such that

|hT (y)| =∣∣∣∣∫ T

−T

e−ityψ(t)dt

∣∣∣∣ ≤ h(y).

We first use integration by parts to see

hT (y) =

∫ T

−T

eit(1−y)

fα(t)dt =

i

1− y

(eiT (1−y)

fα(T )− e−iT (1−y)

fα(−T )+

∫ T

−T

eit(1−y)f ′α(t)

fα(t)2dt

).

Recalling that |fα(T )| → ∞ as T → ±∞, it follows that if we establish that f ′α(t)fα(t)2

is

integrable on (−∞,∞), then the convergence of the integral and the existence of a bounded

22

Page 23: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

dominating function will be established. Since fα is bounded away from 0, it suffices to checkthat the function decays fast enough. Recalling the definition of fα

f ′α(t) = −iαtα−1

∫ t

0

eiv

vαdv,

which follows by passing the derivative inside the integral in the definition of fα. We canestablish that

supT<∞

∣∣∣∣∫ T

0

eiv

vαdv

∣∣∣∣ < ∞

by observing ∫ ∞

0

eiv

vαdv = e−iπ(1−α)/2Γ(1− α)

which can be found in many places, e.g. [23]. Thus,

|f ′α(t)| ≤ αtα−1 supT<∞

∣∣∣∣∫ T

0

eiv

vαdv

∣∣∣∣ ≤ C0tα−1.

We can similarly establish that for t sufficiently large

|fα(t)|2 ≥ C1t2α,

for a positive finite constant C1. Thus for t sufficiently large

∣∣∣∣eit(1−y)f ′α(t)

fα(t)2

∣∣∣∣ ≤C

tα+1,

establishing the result.

Proof of Corollary 4. Using the Taylor series expansion of exp(iu) about 0 in (A.4) aboveimplies that

1 + fα(t) = 1−∞∑

n=1

α (it)n

(n− α)n!.

and therefore,

f (k)α (t) =

∞∑

n=k

αintn−k

(n− α)(n− k)!

so that in particular,

f (k)α (0) =

ikα

k − α

for all k ≥ 1. Let S(t) = log ψ(t) = it− log fα(t). Then dropping the α subscript on fα, wehave

S ′(t) = (i− f ′(t)/f(t)) = (i− (log f(t))′)

23

Page 24: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

which yields the desired result for the mean:

EY = iS ′(0) = i(i− f ′(0)) =1

1− α.

Now

S′′(t) = −(log f(t))

′′= −f

′′(t)f(t)− (f ′(t))2

f 2(t)so

var(Y ) = S′′(0) = −f

′′(0) + (f ′(0))2 =

α

2− α+

α2

(1− α)2=

α

(1− α)2(2− α)

completing the proof

References

AT07 [1] H. Albrecher and J. L. Teugels. Asymptotic analysis of a measure of variation. Teor.Imovır. Mat. Stat., (74):1–9, 2006.

Beer07 [2] N. et al. Beerenwinkel. Genetic progression and the wait- ing time to cancer. PLoSComp. Biol. 3, 3(11):e225, 2007.

Boz09 [3] I. Bozic, T. Antal, H. Ohtsuki, H. Carter, D. Kim, S. Chen, R. Karchin, K. Kinzler,B. Vogelstein, and M. Nowak. Accumulation of driver and passenger mutations duringtumor progression. arXiv:0912.1627v1, 2009.

CoGo85 [4] A. Coldman and J. Goldie. Role of mathematical modeling in protocol formulation incancer chemotherapy. Cancer Treat. Rep., 49:1041–1045, 1984.

CoGo86 [5] A. Coldman and J. Goldie. A stochastic model for the origin and treatment of tumorscontaining drug-resistant cells. Bull. Math. Biol., 48(3/4):279–292, 1986.

DAR [6] D.A. Darling. The role of the maximum term in the sum of independent randomvariables. Trans. AMS, 72:85–107, 1952.

DURPTE [7] R. Durrett. Probability: Theory and Examples, Third Edition. Duxbury Press, Belmont,California., 2005.

DEA [8] R. Durrett, J. Foo, K. Leder, J. Mayberry, and F. Michor. Evolutionary dynamicsof tumor progression with random fitness values. Theoretical Population Biology, inreview, 2010.

DMAY [9] R. Durrett and J. Mayberry. Traveling waves of selective sweeps. Ann. Appl. Prob., toappear, 2010.

DM [10] R. Durrett and S. Moseley. Evolution of resistance and progression to disease duringclonal expansion of cancer. Theoretical Population Biology, 77(1):42–48, 2010.

24

Page 25: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

Fidler78 [11] I. Fidler. Tumor heterogeneity and the biology of cancer invasion and metastasis1.Cancer Research, 38:2651–2660, 1978.

FEA [12] A. Fuches, A. Joffe, and J. Teugels. Expectation of the ratio of the sum of squares to thesquare of the sum: exact and asymptotic results. Theory Probab. Appl., 46(2):243–255,2001.

GeRo02 [13] J. Geisler, S. Rose, H. Geisler, G. Miller, and M. Wiemann. Drug resistance and tumorheterogeneity. CME Journal of Gynecologic Oncology, 7:25–28, 2002.

HaIwMi07 [14] H. Haeno, Y. Iwasa Y, and F. Michor. The evolution of two mutations during clonalexpansion. Genetics, 177:2209–2221, 2007.

HaAu05 [15] L. Haumlggarth, G. Auer, C. Busch, M. Norberg, M. Haumlggman, and L. Egevad.The significance of tumor heterogeneity for prediction of dna ploidy of prostate cancer.Scandinavian Journal of Urology and Nephrology, 39(5):387–392, 2005.

He84 [16] G.H. Heppner. Tumor heterogeneity. Cancer Research, 44:2259–2265, 1984.

IwNoMi06 [17] Y. Iwasa, M.A. Nowak, and F. Michor. Evolution of resistance during clonal expansion.Genetics, 172:2557–2566, 2006.

KaTo00 [18] A.R. Kansal, S. Torquato, E.A. Chiocca, and T.S. Deisboeck. Emergence of a sub-population in a computational model of tumor growth. Journal of Theoretical Biology,207:431–441, 2000.

Ko06 [19] N.L. Komarova. Stochastic modeling of drug resistance in cancer. Journal of TheoreticalBiology, 239:351–366, 2006.

CaPo07 [20] K. Polyak L. Campbell. Breast tumor heterogeneity: Cancer stem cells or clonal evolu-tion? Cell Cycle, 6(19):2332–2338, 2007.

LaPa07 [21] L. Lai and et al. Increasing genomic instability during premalignant neoplastic progres-sion revealed through high-resolution array-cgh. Genes Chromosomes Cancer, 46:532–542, 2007.

LOG [22] B.F. Logan, C.L. Mallows, S.O. Rice, and L.A. Shepp. Limit distributions of self-normalized sums. Ann. Probab., 1:788–809, 1973.

LOY [23] P. Loya. Dirichlet and fresnel integrals via integrated integration. Mathematics Maga-zine, 78:63–67, 2005.

MaGa06 [24] C. Maley and et al. Genetic clonal diversity predicts progression to esophageal adeno-carcinoma. Nature Genetics, 38:468–473, 2006.

MePeRe06 [25] L. Merlo and et al. Cancer as an evolutionary and ecological process. Nature ReviewsCancer, 6:924–935, 2006.

25

Page 26: Heterogeneity in evolutionary models of tumor progressionlede0024/bpHet714.pdf · Birth rate of clone Wave 1 Wave 2 Wave 3 Wave 4 Wave 5 (a) 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36

Michelson89 [26] S. Michelson, K. Ito, H. Tran, and J. Leith. Stochastic models for subpopulation emer-gence in heterogeneous tumors. Bull. Math. Biology, 51(6):731–747, 1989.

NgSe06 [27] H. Nguyen and et al. Evidence of tumor heterogeneity in cervical cancers and lymphnode metastases as determined by flow cytometry. Cancer, 71:2543–2550, 2006.

OsBu05 [28] M. O’Sullivan, V. Budhraja, Y. Sadovsky, and J. Pfeifer. Tumor heterogeneity affectsthe precision of microarray analysis. Diagnostic Molecular Pathology, 14(2):65–71, 2005.

SR [29] S. Resnick. Extreme Values, Regular Variation, and Point Processes. Springer-Verlag,1987.

Ridout [30] M. S. Ridout. Generating random numbers from a distribution specified by its laplacetransform. Statistics and Computing, 19(4):439–450, 2009.

Sc08 [31] J. Schweinsberg. The waiting time for m mutations. Electron. J. Probability, 13, 2008.

SHA [32] Q.M. Shao. Self-normalized large deviations. Ann. Probab., 25(285-328), 1997.

Shipitsin07 [33] M. Shipitsin and et al. Molecular definition of breast tumor heterogeneity. Cancer Cell,11(259-273), 2007.

WaWi00 [34] N. Wang, C. Willkin, A. Bcking, and B. Tribukait. Evaluation of tumor heterogeneityof prostate carcinoma by flow- and image dna cytometry and histopathological grading.Analytical Cellular Pathology, 20(1):49–62, 2000.

Wolman86 [35] S. Wolman. Cytogenetic heterogeneity: Its role in tumor evolution. Cancer Genet.Cytogenet., 19:129–140, 1986.

26