Page 1:

Probability & Statistics - based models

August 1, 2007, MathFest 2007, San Jose, CA

Raina Robeva – Sweet Briar College

Page 2:

Probability & Statistics - based models

Introduction

Quantitative Traits (Limit Theorems)

Luria-Delbruck Experiments

Evaluating risks from time series data

Page 3:

Elementary Probability

Random Variables: $X = X(\omega),\ \omega \in \Omega$

$(\Omega, \mathcal{F}, P)$ - Probability Space

Histograms

Page 4:

Elementary Probability

Set of all outcomes - $\Omega$

Examples:

1) Flipping a coin: $\Omega = \{H, T\}$

2) Rolling a die: $\Omega = \{1, 2, 3, 4, 5, 6\}$

3) Rolling two dice: $\Omega = \{(i, j) : i, j = 1, \dots, 6\}$

Page 5:

Elementary Probability

Elementary Events – the elements of $\Omega$

Events – the subsets of $\Omega$: $A, B, C, \dots$

Definition of Probability:

$$P(A) = \frac{|A|}{|\Omega|}, \qquad |A| = \text{number of elements of } A$$

How do we find probabilities?

We Count!
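The counting definition $P(A) = |A| / |\Omega|$ can be tried directly on the coin and dice examples. The sketch below is only an illustration; the helper name `probability` and the chosen events are assumptions, not part of the slides.

```python
from fractions import Fraction
from itertools import product

def probability(event, outcomes):
    """P(A) = |A| / |Omega| for equally likely outcomes."""
    favorable = [w for w in outcomes if event(w)]
    return Fraction(len(favorable), len(outcomes))

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]
two_dice = list(product(die, repeat=2))  # all 36 ordered pairs (i, j)

p_heads = probability(lambda w: w == "H", coin)
p_even = probability(lambda w: w % 2 == 0, die)
p_sum7 = probability(lambda w: w[0] + w[1] == 7, two_dice)

print(p_heads, p_even, p_sum7)
```

Using exact fractions keeps the results in the same form as the hand count (favorable outcomes over all outcomes).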

Page 6:

Chromosomes and Genes

Genes are found on chromosomes and code for a specific trait

The possible alternative forms of the genes are called alleles.

Chromosomes are large DNA molecules found in the cell’s nucleus

Each gene has a specified place on the chromosome called a locus.

The human Chromosome 11 contains 28 genes. The first 5 genes from the tip of the short arm form a cluster of genes that encode components of hemoglobin

Page 7:

Problem

One gene, two types of alleles: a (recessive) and A (dominant)

$\Omega$ - All possible sequences of length 2 comprised of a and A: $\Omega = \{aa, aA, Aa, AA\}$, so $|\Omega| = 4$

k = number of dominant alleles (0, 1, or 2)

If E = “exactly k dominant alleles”, find P(E).

$$|E| = \begin{cases} 1, & \text{when } k = 0 \\ 2, & \text{when } k = 1 \\ 1, & \text{when } k = 2 \end{cases} \qquad P(E) = \frac{|E|}{|\Omega|}$$

Page 8:

Problem (cont.)

$\Omega = \{aa, aA, Aa, AA\}$

$$P(E) = \frac{|E|}{|\Omega|} = \begin{cases} 1/4, & \text{when } k = 0 \\ 1/2, & \text{when } k = 1 \\ 1/4, & \text{when } k = 2 \end{cases}$$

Gregor Mendel – experiments with peas

Round - dominant; Wrinkled - recessive

Parental Generation (P), crossed (x)

First Filial Generation ($F_1$), crossed (x): only round peas in $F_1$

Second Filial Generation ($F_2$): 3:1 ratio of round vs. wrinkled in $F_2$

Phenotypic Ratios: 1:3 (1:2:1)

$P(\text{round}) = 1 - \tfrac{1}{4} = 75\%$

$P(\text{wrinkled}) = \tfrac{1}{4} = 25\%$

Page 9:

Quantitative Traits (1909)

Herman Nilsson-Ehle

Parental Generation (P), crossed (x)

First Filial Generation ($F_1$), crossed (x): all of intermediate color

Second Filial Generation ($F_2$): two new shades appear

Phenotypic Ratios

1 : 4 : 6 : 4 : 1

1 : 6 : 15 : 20 : 15 : 6 : 1

…

Page 10:

Quantitative Traits – Examples

Page 11:

Polygenic Hypothesis

n genes, two types of alleles: a and A

N = 2n – total positions

k = number of dominant alleles (0, 1, 2, …, N)

If E = “exactly k dominant alleles”, find P(E) = ?

Page 12:

Polygenic Hypothesis – set of outcomes

[Figure: four genes, labeled 1 to 4]

$\Omega$ - All possible sequences of length 8 comprised of a and A: $|\Omega| = 2^8$

In general, $|\Omega| = 2^N = 2^{2n}$

Page 13:

Polygenic Hypothesis

Alleles a and A are equally likely

N = 2n – total positions

k = number of dominant alleles (0, 1, 2, …, N)

If E = “exactly k dominant alleles”, find P(E).

$$|E| = \binom{N}{k} = \frac{N!}{k!\,(N-k)!}, \qquad |\Omega| = 2^N$$

$$P(E) = \frac{|E|}{|\Omega|} = \frac{\binom{N}{k}}{2^N}$$
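As a quick check of the counting formula $P(E) = \binom{N}{k} / 2^N$ with $N = 2n$, the sketch below lists the counts $\binom{N}{k}$ for small numbers of genes. The function names are illustrative assumptions; only the formula comes from the slides.

```python
from fractions import Fraction
from math import comb

def phenotypic_ratio(n_genes):
    """Counts |E| = C(N, k) for k = 0..N dominant alleles, with N = 2n."""
    N = 2 * n_genes
    return [comb(N, k) for k in range(N + 1)]

def dominant_allele_distribution(n_genes):
    """P(E) = C(N, k) / 2^N for k = 0..N, assuming equally likely alleles."""
    N = 2 * n_genes
    return [Fraction(c, 2 ** N) for c in phenotypic_ratio(n_genes)]

for n in (1, 2, 3):
    print(n, phenotypic_ratio(n))
```

For one, two, and three genes the counts are 1:2:1, 1:4:6:4:1, and 1:6:15:20:15:6:1, the ratios quoted for Mendel's and Nilsson-Ehle's crosses.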

Page 14:

Example: Nilsson-Ehle (1909)

Nilsson-Ehle: Two genes (n = 2), so N = 2n = 4 is the number of alleles

X – number of a alleles in the N loci

$$P(X = k) = \frac{\binom{N}{k}}{2^N} = \frac{\binom{4}{k}}{16}$$

Page 15:

Random Variables

$X = X(\omega)$

Continuous – X can be any value from an interval

X is “known” when we know:

the distribution function F(x) = P(X < x);

the probability density function f(x) = d/dx [F(x)]

$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

Discrete – X takes integer values

X is “known” when we know P(X=k) for all possible k

Page 16:

Common Discrete Random Variables

Bernoulli: X takes values k = 0, 1

P(X=1) = p; P(X=0) = 1-p

Binomial: X takes values k = 0, 1, 2, …, N

$$P(X = k) = \binom{N}{k} p^k (1-p)^{N-k}$$

Poisson: X takes values k = 0, 1, 2, 3, …

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

Parameters: Bernoulli (p), Bin(N, p), Po($\lambda$)

[Figure: binomial histograms for N = 20, p = 0.5; N = 20, p = 0.2; N = 20, p = 0.7]

Page 17:

Common Continuous Random Variables

Exponential: X takes values $x \in (0, \infty)$

$$F(x) = 1 - e^{-\lambda x}, \qquad f(x) = \lambda e^{-\lambda x}$$

Gaussian (Normal): X takes values $x \in (-\infty, \infty)$

$N(\mu, \sigma)$, with mean $\mu$ and standard deviation $\sigma$:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

$N(0, 1)$:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$$

Page 18:

Bell - Shaped Distr. of Quantitative Traits

Traits are controlled not by one but by several different genes. The genes are independent and contribute cumulatively to the expression of the characteristic (Polygenic Hypothesis)

Distribution of the trait is Binomial (2n, p), where n is the number of genes and p is the frequency of the non-contributing allele in the population.

Distribution is approximately Gaussian.

Further “smoothing” by environmental factors

Page 19:

Why the “bell-shaped” distribution of quantitative traits?

[Figure: binomial histograms for N = 8, p = 0.2; N = 20, p = 0.5; N = 50, p = 0.7]

When Np is large and N(1-p) is large, then

Binomial (N, p) ~ Normal with mean $Np$ and variance $Np(1-p)$

De Moivre (1667 - 1754)

Laplace (1749 - 1827)

Central Limit Theorem
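The normal approximation to the binomial can be checked numerically. The sketch below compares the Binomial (N, p) probabilities with the normal density of mean Np and variance Np(1-p) at every integer k; the function names and the choice of "largest gap" as the comparison are assumptions made for illustration.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(N, p, k):
    """P(X = k) for X ~ Binomial(N, p)."""
    return comb(N, k) * p ** k * (1 - p) ** (N - k)

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return exp(-(x - mean) ** 2 / (2 * var)) / sqrt(2 * pi * var)

def max_gap(N, p):
    """Largest |binomial pmf - normal density| over k = 0..N."""
    mean, var = N * p, N * p * (1 - p)
    return max(abs(binomial_pmf(N, p, k) - normal_pdf(k, mean, var))
               for k in range(N + 1))

for N, p in [(8, 0.2), (20, 0.5), (50, 0.7)]:
    print(N, p, round(max_gap(N, p), 4))
```

The gap should be visibly larger for N = 8, p = 0.2, where Np is small, than for the other two parameter pairs.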

Page 20:

Aggregate Characteristics

Mean Value: $E(X) = \sum_k k\,P(X = k)$ (discrete), $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$ (continuous)

Variance: $Var(X) = E\big[(X - E(X))^2\big] = E(X^2) - [E(X)]^2$; the Standard Deviation is $\sqrt{Var(X)}$

Moments of order m: $E(X^m) = \sum_k k^m P(X = k)$ (discrete), $E(X^m) = \int_{-\infty}^{\infty} x^m f(x)\,dx$ (continuous)

Page 21:

Examples

Binomial (N, p): $E(X) = Np$, $Var(X) = Npq$, where $q = 1 - p$

Poisson ($\lambda$): $E(X) = \lambda$, $Var(X) = \lambda$

Gaussian ($\mu$, $\sigma$): $E(X) = \mu$, $Var(X) = \sigma^2$
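The binomial and Poisson means and variances can be confirmed by summing $k\,P(X=k)$ and $k^2 P(X=k)$ directly. This is a minimal sketch; the helper names and the truncation of the Poisson sum at `k_max` are assumptions for illustration.

```python
from math import comb, exp

def moments(pmf):
    """Return (mean, variance) from a list of (k, P(X = k)) pairs."""
    mean = sum(k * pk for k, pk in pmf)
    second = sum(k * k * pk for k, pk in pmf)
    return mean, second - mean ** 2

def binomial_pmf(N, p):
    return [(k, comb(N, k) * p ** k * (1 - p) ** (N - k)) for k in range(N + 1)]

def poisson_pmf(lam, k_max=200):
    """Poisson probabilities for k = 0..k_max (the infinite sum is truncated)."""
    terms, pk = [], exp(-lam)
    for k in range(k_max + 1):
        terms.append((k, pk))
        pk *= lam / (k + 1)  # P(X = k+1) = P(X = k) * lam / (k+1)
    return terms

print(moments(binomial_pmf(20, 0.2)))  # compare with Np and Npq
print(moments(poisson_pmf(3.0)))       # compare with lambda and lambda
```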

Page 22:

Poisson Distribution Arises When…

Events of low intensity occurring in time

X(t) – the number of events that have occurred in [0,t]

[Figure: time axis from 0 to t]

X(t) has a Poisson distribution with parameter $\lambda t$

$$P(X(t) = k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}$$

Average number of events per unit time = $\lambda$

Page 23:

Poisson Distribution Arises When…

Events of low intensity occurring independently of one another

X – the number of events that have occurred in a unit surface/volume over time t

X has a Poisson distribution with parameter $\lambda t$

$$P(X = k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}$$

Average number of events per unit surface/volume per unit time = $\lambda$

Page 24:

The Law of Large Numbers (1713)

If X is a random variable with $E(X) = \mu$, and $X_1, X_2, \dots, X_n$ are independent observations of X, then

$$\frac{X_1 + X_2 + \cdots + X_n}{n} \to \mu, \quad \text{as } n \to \infty,$$

or, equivalently,

$$\bar{X}_n \to \mu, \quad \text{as } n \to \infty.$$

Page 25:

Example – Ordinary Coin Toss Game

1. Toss a coin

2. If Heads, win $1

3. If Tails, win nothing

4. Let X_i be your win for game i

5. Average payback to you:

$$E(X_i) = \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 1 = 0.50 \text{ (dollars)}$$

6. By the Law of Large Numbers:

$$\frac{X_1 + X_2 + \cdots + X_n}{n} \to 0.5, \quad \text{as } n \to \infty$$

Simulation Example
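One way such a simulation might look is sketched below: it plays the coin toss game repeatedly and tracks the running average win. The function name, the seed, and the number of games are assumptions for illustration, not the demo shown in the talk.

```python
import random

def running_average(n_games, seed=2007):
    """Play the $1-on-Heads game n_games times; return the running average win."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0
    averages = []
    for i in range(1, n_games + 1):
        total += 1 if rng.random() < 0.5 else 0  # Heads wins $1, Tails wins nothing
        averages.append(total / i)
    return averages

averages = running_average(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, averages[n - 1])
```

The printed averages should settle near 0.5 as n grows, as the Law of Large Numbers predicts.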

Page 26:

Example – St. Petersburg Game

1. Toss a coin

2. If Heads, win $2

3. If Tails, keep tossing until it falls Heads

4. If first Heads on N-th toss, win 2^N dollars

H $2, TH $4, TTH $8, TTTH $16, etc.

5. With probability 1/2^N we win 2^N dollars

6. Average payback to you:

$$\frac{1}{2} \cdot 2 + \left(\frac{1}{2}\right)^2 2^2 + \left(\frac{1}{2}\right)^3 2^3 + \cdots = 1 + 1 + 1 + \cdots = \infty$$

Page 27:

St. Petersburg Game – a sample run
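A sample run of the St. Petersburg game can be generated with a few lines of code. The sketch below is an illustration under assumed names and seed; it is not the run shown on the slide. It also includes a truncated version of the expected payoff, which grows by 1 for every additional toss allowed.

```python
import random

def play_once(rng):
    """Toss until the first Heads; the payoff is 2^N for first Heads on toss N."""
    tosses = 1
    while rng.random() < 0.5:  # Tails with probability 1/2: keep tossing
        tosses += 1
    return 2 ** tosses

def sample_run(n_games, seed=2007):
    """Return the individual payoffs and the running average payoff."""
    rng = random.Random(seed)
    payoffs = [play_once(rng) for _ in range(n_games)]
    running, total = [], 0
    for i, win in enumerate(payoffs, start=1):
        total += win
        running.append(total / i)
    return payoffs, running

def truncated_expectation(max_tosses):
    """Expected payoff if the game were stopped after max_tosses tosses."""
    return sum((0.5 ** n) * (2 ** n) for n in range(1, max_tosses + 1))

payoffs, running = sample_run(10_000)
print(max(payoffs), running[-1], truncated_expectation(10))
```

Unlike the ordinary coin toss game, the running average here does not settle down: rare, very large payoffs keep pushing it upward.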

Page 28:

Random Processes (Temporal Stochastic Models)

Random Process: X(t) – Random variable that changes in time

When t = 0, 1, 2, … – Discrete Random Process

When t changes continuously – Continuous Random Process

In addition, since for any value of t, X(t) can be a discrete or continuous random variable, there are four possibilities for the process {X(t), t}.

{X(t), t} is defined through its probability distribution $p_i^x(t) = P(X(t) = x \mid X(0) = i)$.

For example, if X(t) can take values x = 0, 1, 2, …, then $p_i(t) = [p_i^0(t), p_i^1(t), p_i^2(t), \dots]$ is the probability distribution of X.

Page 29:

Single Population Immigration-Death Process

Deterministic Model

X(t) = population size at time t

I = rate of immigration

a = per capita death rate

$$\frac{dX}{dt} = I - aX$$

Stochastic Model (Kolmogorov – Chapman DE)

$X(t + \Delta t) = x$ can happen when:

X(t) = x and no change over $\Delta t$. (Event A)

X(t) = x + 1 and one death over $\Delta t$. (Event B)

X(t) = x - 1 and one immigration over $\Delta t$. (Event C)

More than a unit change over $\Delta t$, which has probability $o(\Delta t)$. (Event D)

Page 30:

Kolmogorov – Chapman Equations

$p_n(t) = \Pr(X(t) = n)$

$$p_n(t + \Delta t) = \underbrace{a(n+1)\,\Delta t\; p_{n+1}(t)}_{P(B)} + \underbrace{I\,\Delta t\; p_{n-1}(t)}_{P(C)} + \underbrace{\big(1 - (I + an)\,\Delta t\big)\, p_n(t)}_{P(A)} + \underbrace{o(\Delta t)}_{P(D)}$$

Subtract $p_n(t)$, divide by $\Delta t$, and let $\Delta t \to 0$:

$$\frac{d}{dt}p_n(t) = a(n+1)\,p_{n+1}(t) + I\,p_{n-1}(t) - (I + an)\,p_n(t), \quad n > 0$$

$$\frac{d}{dt}p_0(t) = -I\,p_0(t) + a\,p_1(t), \quad n = 0$$

Demo

Page 31:

How are the Stochastic and Deterministic Models Related?

Define $\bar{X} = E(X) = \sum_n n\,p_n(t)$

Multiply the K-C equation by n and sum over n

$$\frac{d}{dt}p_n(t) = a(n+1)\,p_{n+1}(t) + I\,p_{n-1}(t) - (I + an)\,p_n(t), \quad n > 0$$

$$\frac{d}{dt}\bar{X} = a\sum_n n(n+1)\,p_{n+1}(t) + I\sum_n n\,p_{n-1}(t) - \sum_n n(I + an)\,p_n(t)$$

$$= a\sum_n \big[n(n+1)\,p_{n+1} - n^2 p_n\big] + I\sum_n \big[n\,p_{n-1} - n\,p_n\big]$$

$$= -a\sum_n n\,p_n(t) + I \cdot 1 = I - a\bar{X}$$

The mean value of the stochastic process X satisfies the deterministic equation

$$\frac{d}{dt}\bar{X} = I - a\bar{X}$$
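The relation between the two models can also be checked numerically. The sketch below integrates the Kolmogorov – Chapman equations $\frac{d}{dt}p_n = a(n+1)p_{n+1} + I p_{n-1} - (I + an)p_n$ with a simple Euler scheme and compares the resulting mean with the solution of $dX/dt = I - aX$. The parameter values, the truncation of the state space at `n_max`, and the step size are all assumptions chosen for illustration.

```python
import math

def mean_from_master_equation(I=2.0, a=0.5, x0=0, t_end=5.0, dt=0.001, n_max=60):
    """Euler integration of the immigration-death equations; returns E[X(t_end)]."""
    p = [0.0] * (n_max + 1)  # p[n] approximates p_n(t); states above n_max are dropped
    p[x0] = 1.0              # the population starts at size x0 with certainty
    for _ in range(int(round(t_end / dt))):
        new_p = p[:]
        for n in range(n_max + 1):
            from_death = a * (n + 1) * p[n + 1] if n < n_max else 0.0
            from_immigration = I * p[n - 1] if n > 0 else 0.0
            # no immigration out of the top state, so total probability is conserved
            leave_rate = (I if n < n_max else 0.0) + a * n
            new_p[n] = p[n] + dt * (from_death + from_immigration - leave_rate * p[n])
        p = new_p
    return sum(n * pn for n, pn in enumerate(p))

def deterministic_mean(I=2.0, a=0.5, x0=0, t=5.0):
    """Solution of dX/dt = I - aX with X(0) = x0."""
    return I / a + (x0 - I / a) * math.exp(-a * t)

stochastic = mean_from_master_equation()
deterministic = deterministic_mean()
print(stochastic, deterministic)
```

The two printed values should agree up to the discretization error of the Euler scheme.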

Page 32:

Luria-Delbruck Experiments

Page 33:

When do mutations occur?

Darwinian Model - mutations are equally likely to occur at any moment in time.

Lamarckian Model - mutations evolve only in response to an environmental cue.

Page 34:

Luria-Delbruck Experiments (1943)

Grow a large number of bacterial cultures, starting each one from a small number of cells.

Plate the cultures on nutrient agar plates on which a large amount of a virus has been plated first. Incubate.

Luria SE & Delbruck M. Mutations of Bacteria from Virus Sensitivity to Virus Resistance. Genetics 28:491(1943).

Control

Page 35:

Hypotheses

Hypothesis 1 (Acquired Immunity): A small number of bacteria mutate to acquire resistance only after they are exposed to the virus. Survival confers immunity not only to the individual but also to its offspring, and the colonies grow.

Hypothesis 2 (Mutation): Mutations occur randomly, but the probability that a bacterium mutates from sensitive to resistant is small. This mutation is completely independent from the presence of the virus. When the bacteria are added to the plates, the mutants are already resistant to the virus. Only these mutants proliferate into colonies on the plate.

Page 36:

Count the Number of Colonies

Page 37:

Two opposing hypotheses

Hypothesis 1 (Acquired Immunity, Directed Mutation): A small number of bacteria mutate to acquire resistance only after they are exposed to the virus. Survival confers immunity not only to the individual but also to its offspring, and the colonies grow.

[Figure label: killer virus]

Page 38:

Two opposing hypotheses

Hypothesis 2 (Mutation + Selection): Mutations occur randomly, but the probability that a bacterium mutates from sensitive to resistant is small. This mutation is completely independent from the presence of the virus. When the bacteria are added to the plates, the mutants are already resistant to the virus. Only these mutants proliferate into colonies on the plate.

[Figure label: killer virus]

Page 39:

What is the Distribution of the Mutant Cells at the time of plating?

Under the Directed Mutation Hypothesis: Poisson

$E(X) = Var(X)$, $Var(X)/E(X) = 1$

Under the Mutation + Selection Hypothesis: Non-Poisson (Luria-Delbruck Distribution)

$E(X) \neq Var(X)$, $Var(X)$ is very large

[Figure labels: killer virus]

Page 40:

Large variation in the number of mutants
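The difference between the two hypotheses shows up in the variance-to-mean ratio of the mutant counts. The sketch below simulates both scenarios under simplifying assumptions that are not taken from the slides: each culture starts from one sensitive cell, every cell divides once per generation, a sensitive daughter mutates with probability p at division (Mutation + Selection), or every cell of the final population becomes resistant with probability p at plating (Directed Mutation). All names and parameter values are illustrative.

```python
import random

def binomial_draw(rng, n, p):
    """Number of successes in n independent trials with success probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)

def mutants_mutation_selection(rng, generations, p):
    """Mutations arise at any division; mutant clones keep doubling afterwards."""
    sensitive, mutant = 1, 0
    for _ in range(generations):
        new_mutants = binomial_draw(rng, 2 * sensitive, p)
        mutant = 2 * mutant + new_mutants
        sensitive = 2 * sensitive - new_mutants
    return mutant

def mutants_directed(rng, generations, p):
    """Resistance is acquired only at plating, independently in each cell."""
    return binomial_draw(rng, 2 ** generations, p)

def variance_to_mean(counts):
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean

def compare(cultures=400, generations=10, p=0.002, seed=1943):
    rng = random.Random(seed)
    ms = [mutants_mutation_selection(rng, generations, p) for _ in range(cultures)]
    dm = [mutants_directed(rng, generations, p) for _ in range(cultures)]
    return variance_to_mean(ms), variance_to_mean(dm)

ratio_selection, ratio_directed = compare()
print(ratio_selection, ratio_directed)
```

Under the directed scenario the ratio should stay near 1 (Poisson-like), while under mutation and selection occasional early mutations produce "jackpot" cultures and a much larger ratio.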

Page 41:

What is the average number of resistant cells under continuous mutation?

Assume that mutation can only occur at the time of division

Assume that each cell can mutate with a constant probability p

Generation (i) | Average number of mutant cells in generation i | Expected number of mutants at the end from this generation
0 | $p$ | $p\,2^N$
1 | $2p$ | $2p \cdot 2^{N-1} = p\,2^N$
2 | $2^2 p$ | $2^2 p \cdot 2^{N-2} = p\,2^N$
3 | $2^3 p$ | $2^3 p \cdot 2^{N-3} = p\,2^N$
4 | $2^4 p$ | $2^4 p \cdot 2^{N-4} = p\,2^N$
5 | $2^5 p$ | $2^5 p \cdot 2^{N-5} = p\,2^N$
6 | $2^6 p$ | $2^6 p \cdot 2^{N-6} = p\,2^N$

$E(X) = p\,2^N + p\,2^N + \cdots + p\,2^N$ (one term $p\,2^N$ per generation)

Page 42:

Mutation.xls

AcqIm.xls

Biological ESTEEM

Page 43:

Lea and Coulson (1949)

Theorem. Let $X_t$ denote the number of mutant cells in the culture at time t. If p is the probability for a single cell to mutate and $m = p\,2^n$, then the probability generating function of the distribution, defined by

$$\varphi(x, m) = \sum_k P(X_t = k)\,x^k,$$

has the form

$$\varphi(x, m) = (1 - x)^{m(1-x)/x}.$$

Lea, D.E. and Coulson, C.A. (1949) The distribution of the number of mutants in bacterial populations. J. Genetics 49, 264-285
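The probabilities $P(X_t = k)$ can be recovered numerically from the generating function $(1 - x)^{m(1-x)/x}$. Taking logarithms gives $\ln \varphi = m\big(-1 + \sum_{j \ge 1} x^j / (j(j+1))\big)$, and differentiating yields the recursion $p_0 = e^{-m}$, $p_k = \frac{m}{k}\sum_{j=0}^{k-1} \frac{p_j}{k - j + 1}$. This derivation and the function name below are additions for illustration, not part of the slides.

```python
from math import exp

def luria_delbruck_pmf(m, k_max):
    """P(X = k), k = 0..k_max, from the generating function (1-x)^(m(1-x)/x)."""
    p = [exp(-m)]  # p_0 = e^{-m}
    for k in range(1, k_max + 1):
        p.append((m / k) * sum(p[j] / (k - j + 1) for j in range(k)))
    return p

pmf = luria_delbruck_pmf(1.0, 500)
print(pmf[:5])
print(sum(pmf))  # below 1: the distribution has a long right tail
```

The slowly vanishing tail is the analytic counterpart of the large variation in mutant counts seen in the experiments.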

Page 44:

More recent work on the Luria-Delbruck distribution

Page 45:

Evaluating risk from time series data

Glucose Variability and Risk Assessment in Diabetes

Heart Rate Variability and the Risk for Neonatal Sepsis

Page 46:

Blood Glucose Fluctuation Characteristics Quantified from Self-Monitoring Data

Sixteen Million people in the United States have Diabetes Mellitus.

In both human and economic terms, diabetes is one of the nation's most costly diseases. Diabetes is the leading cause of kidney failure, blindness in adults, and amputations. It is a major risk factor for heart disease, stroke, and birth defects. Diabetes shortens average life expectancy by up to 15 years, and costs our nation in excess of $100 billion annually in health-related expenditures, more than any other single chronic disease. Diabetes spares no group, affecting young and old, all races and ethnic groups, the rich and the poor.

Page 47:

Definitions

• Type 1 Diabetes, also referred to as Insulin Dependent Diabetes Mellitus (IDDM), is the type of diabetes in which the pancreas produces no insulin or extremely small amounts;

• Type 2 Diabetes is the type of diabetes in which the body doesn’t use its insulin effectively or doesn’t produce enough insulin;

• Insulin is a hormone secreted by the pancreas that regulates metabolism of glucose;

• Blood Glucose (BG) is the concentration of glucose in the bloodstream;

• The BG levels are measured in mg/dl (USA) and in mmol/L (most elsewhere);

• The two scales are directly related by: 18 mg/dl = 1 mM (mmol/L).

Page 48:

Target Blood Glucose Range: 70-180 mg/dl (DCCT, 1993)

[Figure labels: Hyperglycemia, Hypoglycemia, Severe Hypoglycemia, Food, Insulin, Counter-regulation]

Page 49:

Severe Hypoglycemia (SH)

• Defined as a low BG resulting in stupor, seizure, or unconsciousness that precludes self-treatment (The Diabetes Control and Complications Trial Research Group, 1997). Four percent of the deaths among individuals with IDDM are attributed to SH (DCCT Study Group, 1991).

• Although most severe hypoglycemic episodes are not fatal, there remain numerous negative sequelae leading to compromised occupational and scholastic functioning, social embarrassment, poor judgment, serious accidents, and possible permanent cognitive dysfunction (Gold AE et al., 1993; Deary et al., 1993; Lincoln et al., 1996).

• Fear of severe hypoglycemia is identified as the major barrier to improved metabolic control (Cryer et al., 1994).

Page 50:

BG Fluctuations: T1DM

[Figure: blood glucose trace; vertical axis 0 to 600, horizontal axis 0 to 30]

Page 51:

BG Fluctuations: T2DM

[Figure: blood glucose trace; vertical axis 0 to 600, horizontal axis 0 to 30]

Page 52:

Average Glycemia and Glucose Variability

[Figure: two panels, Person A: HbA1c=8.0% and Person B: HbA1c=8.0%; vertical axis Blood Glucose (mg/dl), 0 to 400; horizontal axis Time (days), 0 to 30]

Page 53:

Blood Glucose (BG) Monitoring Systems

Self-Monitoring BG Devices (typically 3-10 measurements/24 hours)

Continuous BG Monitoring Systems (up to 288 measurements/24 hours)

Page 54:

The Distribution of the BG Levels: (Mean=6.7, SD=3.6, Normality hypothesis is rejected, P<0.05)

[Figure: histogram of BG (mM), horizontal axis 0 to 20, vertical axis Frequency 0 to 30; regions marked Hypo-, Target Range, Hyperglycemia; markers for Clinical Center and Numerical Center; spans labeled "Standard Data Range" and "Data Range, if Symmetrization is used"]

Page 55:

Symmetrization of the BG Scale:

Assumptions:
A1: The transformed whole BG range should be symmetric around 0.
A2: The transformed target BG range should be symmetric around 0.

Transformation: $f(BG, \alpha, \beta) = [(\ln(BG))^{\alpha} - \beta]$, $\alpha, \beta > 0$

That satisfies the conditions:
A1: $f(33.3, \alpha, \beta) = -f(1.1, \alpha, \beta)$ and
A2: $f(10, \alpha, \beta) = -f(3.9, \alpha, \beta)$.

Which leads to the equations:

$$(\ln 33.3)^{\alpha} - \beta = -[(\ln 1.1)^{\alpha} - \beta]$$

$$(\ln 10.0)^{\alpha} - \beta = -[(\ln 3.9)^{\alpha} - \beta]$$

$$\gamma\,[(\ln 33.3)^{\alpha} - \beta] = -\gamma\,[(\ln 1.1)^{\alpha} - \beta] = \sqrt{10} \quad \text{(scaling)}$$

When solved numerically:

$\alpha = 1.033$, $\beta = 1.871$ and $\gamma = 1.774$ (when BG is in mM)

$\alpha = 1.084$, $\beta = 5.381$ and $\gamma = 1.509$ (when BG is in mg/dl)
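The numerical solution for the mM scale can be reproduced with a short bisection. Conditions A1 and A2 each say that a sum of two powers of logarithms equals twice the offset, so the exponent is the value at which the two sums agree; the offset and the scaling factor (chosen so that the transformed whole range ends at the square root of 10) then follow directly. The symbol names `alpha`, `beta`, `gamma`, the bracketing interval [1.0, 1.1], and the tolerance are assumptions made for this sketch.

```python
from math import log, sqrt

BG_MIN, BG_MAX = 1.1, 33.3           # whole BG range in mM (condition A1)
TARGET_LOW, TARGET_HIGH = 3.9, 10.0  # target BG range in mM (condition A2)

def imbalance(alpha):
    """Zero when A1 and A2 hold with a common beta for this alpha."""
    whole = log(BG_MAX) ** alpha + log(BG_MIN) ** alpha
    target = log(TARGET_HIGH) ** alpha + log(TARGET_LOW) ** alpha
    return whole - target

def solve_parameters(lo=1.0, hi=1.1, tol=1e-10):
    """Bisection for alpha on an assumed bracket, then beta and gamma."""
    if not (imbalance(lo) < 0 < imbalance(hi)):
        raise ValueError("the interval [lo, hi] does not bracket a root")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if imbalance(mid) < 0:
            lo = mid
        else:
            hi = mid
    alpha = (lo + hi) / 2
    beta = (log(BG_MAX) ** alpha + log(BG_MIN) ** alpha) / 2
    gamma = sqrt(10) / (log(BG_MAX) ** alpha - beta)
    return alpha, beta, gamma

alpha, beta, gamma = solve_parameters()
print(round(alpha, 3), round(beta, 3), round(gamma, 3))
```

Note that an exponent of 0 solves the symmetry conditions trivially, which is why the search is restricted to a bracket away from zero.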

Page 56:

Symmetrization Function:

f(BG) = 1.774 * ((ln(BG))^1.033 - 1.871)

[Figure: f(BG) plotted against BG (mM); horizontal axis 1 to 34, vertical axis -3.5 to 3.5; markers for Clinical Center and Numerical Center]

Page 57:

Distribution of the Transformed BG Levels:

[Figure: histogram of f(BG), horizontal axis -2.5 to 2.5, vertical axis Frequency 0 to 50; regions marked Hypoglycemia, Target Range, Hyperglycemia; Symmetrized Data Range; Clinical and Numerical Center coincide]

Page 58:

Defining the Low and High Blood Glucose Indices:

The BG risk function: $r(BG) = 10 \cdot f(BG)^2$

Let $x_1, x_2, \dots, x_n$ be a series of n BG readings, and let

rl(BG) = r(BG) if f(BG) < 0 and 0 otherwise;
rh(BG) = r(BG) if f(BG) > 0 and 0 otherwise.

The Low Blood Glucose [Risk] Index (LBGI) and the High BG [Risk] Index (HBGI) are then defined as:

$$LBGI = \frac{1}{n}\sum_{i=1}^{n} rl(x_i), \qquad HBGI = \frac{1}{n}\sum_{i=1}^{n} rh(x_i)$$
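The two indices are straightforward to compute from a list of readings. The sketch below uses the mM symmetrization f(BG) = 1.774 * ((ln(BG))^1.033 - 1.871) quoted in the slides; the function names, the input check, and the sample readings are assumptions for illustration only.

```python
from math import log

ALPHA, BETA, GAMMA = 1.033, 1.871, 1.774  # mM-scale parameters quoted in the slides

def f(bg_mM):
    """Symmetrized BG value; defined here only for readings above 1 mM."""
    if bg_mM <= 1.0:
        raise ValueError("BG must exceed 1 mM so that ln(BG) is positive")
    return GAMMA * (log(bg_mM) ** ALPHA - BETA)

def risk(bg_mM):
    """BG risk function r(BG) = 10 * f(BG)^2."""
    return 10 * f(bg_mM) ** 2

def lbgi(readings):
    """Low BG Index: average of r(BG) over readings, counting only f(BG) < 0."""
    return sum(risk(x) for x in readings if f(x) < 0) / len(readings)

def hbgi(readings):
    """High BG Index: average of r(BG) over readings, counting only f(BG) > 0."""
    return sum(risk(x) for x in readings if f(x) > 0) / len(readings)

readings = [3.0, 5.5, 7.0, 12.0]  # hypothetical self-monitoring values in mM
print(lbgi(readings), hbgi(readings))
```

With these parameters the risk reaches about 100 at both ends of the 1.1 to 33.3 mM range, and the two target limits, 3.9 and 10 mM, carry equal risk.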

Page 59:

Symmetrization of the BG Measurement Scale

[Figure: risk function r(BG), y = 10 * x^2, plotted against the Transformed BG Scale (-3 to 3); vertical axis r(BG), 0 to 100; Low BG Risk on the Hypoglycemia side, High BG Risk on the Hyperglycemia side, Target Range around the Clinical and Numerical Center]

Risk Analysis of Blood Glucose Data: Theory and Algorithms

• Evaluation of HbA1c

• Assessment of Long-Term Risk for [Severe] Hypoglycemia

• Assessment of Short-Term Risk for [Severe] Hypoglycemia

• Predicts 40% of SH episodes for the subsequent 6 months;
• Predicts 50% of imminent SH episodes (24 hours);
• The technology has been licensed by Lifescan Inc, Milpitas, CA;

Page 60:

The Blood Glucose Risk Function: (As Defined on the Original Blood Glucose Scale)

[Figure: r(BG) plotted against BG Level (mM); horizontal axis 0 to 34, vertical axis r(BG) 0 to 100; Low BG Risk to the left of the Target Range, High BG Risk to the right]

Page 61:

Heart Rate Variability and the Risk for Neonatal Sepsis

• 4 million births
• 40,000 very low birth weight (<1500 grams) infants
• 15,000 NICU beds
• 400,000 NICU admissions

Page 62:

Neonatal Sepsis: A Major Public Health Problem

• Risk of sepsis is high
– 25 - 40% of VLBW infants develop sepsis while in the neonatal intensive care unit

• Significant mortality and morbidity
– In VLBW infants, sepsis doubles the risk of dying
– Length of stay is increased by 1 month
– Health care costs are increased

Page 63:

Current Practice for Infants at Risk for Sepsis

• Nurse relates that infant in NICU is “not acting right” or “looks a little off”

• Physicians must take the cautious approach, suspecting sepsis

• Assessment includes invasive tests:
– CBC, blood culture, urine culture, lumbar puncture

• Intervention: antibiotics

Page 64:

Baby

Page 65:

Problems with Current Medical Practice

• Nurses’ and physicians’ subjective assessments are neither sensitive nor specific

• Diagnostic tests have important limitations:
– invasive
– not performed until infant has clinical signs
– various CBC components range from 11% to 77%

Page 66:

Need for Better Risk Assessment for Neonatal Sepsis

• Tremendous need for continuous non-invasive monitoring for sepsis

• Any device that adds objective information about infant’s state of health from continuous risk assessment monitoring would be helpful

Page 67:
Page 68:
Page 69:
Page 70:
Page 71:

[Figure: RR interval time series, panels A, B, C; horizontal axis Time [RR interval number], 0 to 4096; vertical axis Magnitude of RR interval [msec], 300 to 600]

Page 72:

[Figure: RR interval time series, panels A, B, C (horizontal axis Time [RR interval number], 0 to 4096; vertical axis Magnitude of RR interval [msec], 300 to 600), each with a histogram of Difference from median [msec] (horizontal axis -20 to 120, logarithmic count axis, median marked)]

Panel A: Sample Asymmetry=1.37, R1=42, R2=57.5

Panel B: Sample Asymmetry=2.97, R1=27, R2=79.5

Panel C: Sample Asymmetry=11.8, R1=45.5, R2=538.5

Page 73: