Testing models against data

Testing models against data. Bas Kooijman, Dept theoretical biology, Vrije Universiteit Amsterdam, [email protected], http://www.bio.vu.nl/thb. Master course WTC methods, Amsterdam, 2005/11/02


Page 1: Testing models against data

Testing models against data

Bas Kooijman

Dept theoretical biology, Vrije Universiteit Amsterdam

[email protected]

http://www.bio.vu.nl/thb

master course WTC methods, Amsterdam, 2005/11/02

Page 2: Testing models against data

Kinds of statistics 1.2.4

Descriptive statistics: sometimes useful, frequently boring

Mathematical statistics: a beautiful mathematical construct, but rarely applicable because of the assumptions made to keep it simple

Scientific statistics: still in its childhood because research workers are specialised; upcoming thanks to the increase of computational power (Monte Carlo studies)

Page 3: Testing models against data

Tasks of statistics 1.2.4

Deals with
• estimation of parameter values, and confidence of these values
• tests of hypotheses about parameter values
– does a parameter value differ from a known value?
– do parameter values differ between two samples?

Deals NOT with
• does model 1 fit better than model 2, if model 1 is not a special case of model 2?

Statistical methods assume that the model is given.
(Non-parametric methods only use some properties of the given model, rather than its full specification.)

Page 4: Testing models against data

Nested models

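$$y(x) = w_0 + w_1 x + w_2 x^2$$

Special cases nested in this model:
• $w_2 = 0$: $y(x) = w_0 + w_1 x$
• $w_1 = 0$: $y(x) = w_0 + w_2 x^2$
• $w_1 = 0$ and $w_2 = 0$: $y(x) = w_0$

[Venn diagram: the sets $w_2 = 0$ and $w_1 = 0$ within the full model; their overlap is $y(x) = w_0$]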

Page 5: Testing models against data

Testing of hypothesis

Error of the first kind: reject null hypothesis while it is true

Error of the second kind: accept null hypothesis while the alternative hypothesis is true

Level of significance of a statistical test: α = probability of an error of the first kind

Power of a statistical test: 1 – β, with β = probability of an error of the second kind

| decision | null hypothesis true | null hypothesis false |
|---|---|---|
| accept | 1 – α | β |
| reject | α | 1 – β |

No certainty in statistics
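The two error probabilities can be illustrated with a small Monte Carlo study. The sketch below is not from the slides: it assumes a one-sided z-test of the null hypothesis "mean = 0" on normally distributed data with known standard deviation 1, and the sample size, the alternative mean of 0.5 and the function name `rejection_rate` are all illustrative choices.

```python
import random
from statistics import NormalDist, mean

def rejection_rate(true_mean, n=20, alpha=0.05, trials=4000, seed=1):
    """Fraction of simulated samples in which H0: mean = 0 is rejected."""
    rng = random.Random(seed)
    # One-sided z-test with known sd = 1: reject when z exceeds this threshold.
    threshold = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = mean(sample) * n ** 0.5
        if z > threshold:
            rejections += 1
    return rejections / trials

# H0 is true: the rejection rate estimates the level of significance.
level = rejection_rate(true_mean=0.0)
# H0 is false (assumed alternative: mean = 0.5): the rejection rate estimates the power.
power = rejection_rate(true_mean=0.5)
print(f"estimated level {level:.3f}, estimated power {power:.3f}")
```

The power can only be computed because an alternative value for the mean was specified; a different alternative would give a different power.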

Page 6: Testing models against data

Statistical testing

[Figure: response against log concentration, with the control (Contr.), the NOEC and the LOEC (marked *) indicated]

NOEC: No Observed Effect Concentration

LOEC: Lowest Observed Effect Concentration

Page 7: Testing models against data

What’s wrong with NOEC?

• Power of the test is not known

• No statistically significant effect is not the same as no effect

• Effect at NOEC regularly 10-34%, up to >50%

• Inefficient use of data
– only last time point, only lowest doses
– for non-parametric tests also values discarded

[Figure: response against log concentration, with the control (Contr.), the NOEC and the LOEC (marked *) indicated]

OECD Braunschweig meeting 1996: NOEC is inappropriate and should be phased out!

Page 8: Testing models against data

Statements to remember

• “proving” something statistically is absurd

• if you do not know the power of your test, you don’t know what you are doing while testing

• you need to specify the alternative hypothesis to know the power; this involves knowledge about the subject (biology, chemistry, ..)

• parameters only have a meaning if the model is “true”; this involves knowledge about the subject

Page 9: Testing models against data

Independent observations


If X and Y are independent

Page 10: Testing models against data

Central limit theorems

The sum of n independent identically (i.i.) distributed random variables becomes normally distributed for increasing n.

The sum of n independent point processes tends to behave as a Poisson process for increasing n.

For the sum $Z = X + Y$ of independent X and Y:

$$f_Z(z) = \int f_X(z - y)\, f_Y(y)\, dy; \qquad P(Z = z) = \sum_y P(X = z - y)\, P(Y = y)$$

The number of events in a time interval is i.i. Poisson distributed

The time intervals between subsequent events are i.i. exponentially distributed
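The first statement can be checked numerically. The following is a minimal sketch, assuming exponentially distributed terms with an arbitrary rate of 2; it uses the sample skewness of the standardised sum as a rough measure of distance from the normal distribution (a normal distribution has skewness 0, and for a sum of n exponential variables the skewness is 2/√n). The function name `skewness_of_sum` is hypothetical.

```python
import random

def skewness_of_sum(n, rate=2.0, trials=5000, seed=7):
    """Sample skewness of the sum of n i.i. exponential variables."""
    rng = random.Random(seed)
    mean_sum = n / rate
    # Variance of a sum of n independent terms is n times the variance of one term.
    sd_sum = (n ** 0.5) / rate
    cubes = []
    for _ in range(trials):
        total = sum(rng.expovariate(rate) for _ in range(n))
        cubes.append(((total - mean_sum) / sd_sum) ** 3)
    return sum(cubes) / trials

# Skewness should shrink towards 0 as the number of terms grows.
skews = [skewness_of_sum(n) for n in (1, 5, 50)]
print(skews)
```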

Page 11: Testing models against data

Sums of random variables

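Exponential probability density:

$$f_X(x) = \lambda \exp(-\lambda x); \qquad f_Y(y) = \frac{\lambda}{\Gamma(n)} (\lambda y)^{n-1} \exp(-\lambda y)$$

$$Y = \sum_{i=1}^{n} X_i; \qquad \mathrm{Var}(Y) = n\, \mathrm{Var}(X)$$

Poisson probability:

$$P(X = x) = \frac{\lambda^x}{x!} \exp(-\lambda); \qquad P(Y = y) = \frac{(n\lambda)^y}{y!} \exp(-n\lambda)$$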

Page 12: Testing models against data

Normal probability density

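$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right)$$

For an n-dimensional variable with mean vector $\boldsymbol\mu$ and covariance matrix $\Sigma$:

$$f_{\mathbf X}(\mathbf x) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\left(-\frac{1}{2} (\mathbf x - \boldsymbol\mu)' \Sigma^{-1} (\mathbf x - \boldsymbol\mu)\right)$$

[Figure: normal probability density against $(x - \mu)/\sigma$, with σ and the 95% interval indicated]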
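As a small numerical check, assuming the 95% marked in the figure is the central interval of the density, the sketch below computes how many standard deviations on either side of the mean enclose 95% of the probability. The values of `mu` and `sigma` are arbitrary.

```python
from statistics import NormalDist

mu, sigma = 0.0, 1.0  # arbitrary illustrative values
dist = NormalDist(mu, sigma)

# Number of standard deviations that leaves 2.5% in each tail.
z = NormalDist().inv_cdf(0.975)
coverage = dist.cdf(mu + z * sigma) - dist.cdf(mu - z * sigma)
print(f"mu +/- {z:.2f} sigma covers {coverage:.3f} of the probability")
```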

Page 13: Testing models against data

Parameter estimation

Most frequently used method: Maximization of (log) Likelihood

likelihood: probability of finding observed data (given the model), considered as function of parameter values
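A minimal sketch of the idea, assuming independent exponentially distributed observations with unknown rate: the log likelihood is evaluated on a grid of candidate rates and its maximiser is compared with the known analytic ML estimate n / Σx. The data are synthetic and the true rate of 1.5 is an arbitrary choice for illustration.

```python
import math
import random

def log_likelihood(rate, data):
    """Log likelihood of `rate` for independent exponential observations."""
    return len(data) * math.log(rate) - rate * sum(data)

rng = random.Random(3)
data = [rng.expovariate(1.5) for _ in range(200)]  # synthetic data, true rate 1.5

# Maximise the log likelihood over candidate rates 0.01, 0.02, ..., 5.00.
grid = [0.01 * k for k in range(1, 501)]
rate_grid = max(grid, key=lambda rate: log_likelihood(rate, data))

# For this model the maximiser is known analytically.
rate_closed = len(data) / sum(data)
print(f"grid search: {rate_grid:.2f}, analytic: {rate_closed:.4f}")
```

For most models no analytic maximiser exists and a numerical optimiser replaces the grid search.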

If we repeat the collection of data many times (same conditions, same number of data) the resulting ML estimate

Page 14: Testing models against data

Profile likelihood

[Figure: profile likelihood with its large sample approximation and the 95% conf interval]
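One way to obtain such an interval can be sketched as follows, under assumptions that are not from the slides: normally distributed synthetic data, the mean as parameter of interest and the variance as nuisance parameter. For each candidate mean the variance is replaced by the value that maximises the likelihood, and the interval collects all means whose profile log likelihood lies within 3.841 / 2 of the maximum (3.841 being the 95% point of the chi-square distribution with 1 degree of freedom, a large sample result).

```python
import math
import random

def profile_loglik(mu, data):
    """Log likelihood of mu, with the variance set to its ML value given mu."""
    n = len(data)
    s2 = sum((x - mu) ** 2 for x in data) / n  # variance maximising the likelihood for this mu
    return -0.5 * n * (math.log(2 * math.pi * s2) + 1)

rng = random.Random(11)
data = [rng.gauss(10.0, 2.0) for _ in range(50)]  # synthetic data

mu_hat = sum(data) / len(data)          # ML estimate of the mean
best = profile_loglik(mu_hat, data)

# Keep every candidate mean that the likelihood ratio criterion does not reject.
grid = [mu_hat + 0.001 * k for k in range(-3000, 3001)]
inside = [mu for mu in grid if 2 * (best - profile_loglik(mu, data)) < 3.841]
lower, upper = inside[0], inside[-1]
print(f"estimate {mu_hat:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```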

Page 15: Testing models against data

Comparison of models

Akaike Information Criterion for sample size n and K parameters

$$-2 \log L(\theta) + \frac{2 K n}{n - K - 1}$$

$$n \log \sigma^2 + \frac{2 K n}{n - K - 1}$$

in the case of a regression model

You can compare goodness of fit of different models to the same data, but statistics will not help you to choose between the models
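The regression form of the criterion can be computed as in the sketch below. It compares a constant model with a straight line on synthetic data; the data, the function name `aic_regression` and the convention that K counts the regression coefficients plus the error variance are assumptions made for illustration.

```python
import math
import random

def aic_regression(residuals, K):
    """n log(sigma^2) + 2 K n / (n - K - 1), with sigma^2 the mean squared residual."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return n * math.log(sigma2) + 2 * K * n / (n - K - 1)

rng = random.Random(5)
x = [0.5 * i for i in range(20)]
y = [1.0 + 0.8 * xi + rng.gauss(0, 0.5) for xi in x]  # synthetic data

# Model A: y = w0, fitted by least squares (the mean of y).
y_mean = sum(y) / len(y)
res_const = [yi - y_mean for yi in y]

# Model B: y = w0 + w1 x, fitted by ordinary least squares.
x_mean = sum(x) / len(x)
w1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sum((xi - x_mean) ** 2 for xi in x)
w0 = y_mean - w1 * x_mean
res_lin = [yi - (w0 + w1 * xi) for xi, yi in zip(x, y)]

# Assumption: K includes the error variance as an estimated parameter.
aic_const = aic_regression(res_const, K=2)
aic_lin = aic_regression(res_lin, K=3)
print(f"constant: {aic_const:.2f}, linear: {aic_lin:.2f}")
```

A lower value indicates a better trade-off between fit and number of parameters for these data; it says nothing about whether either model is realistic.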

Page 16: Testing models against data

Confidence intervals

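| parameter | estimate excluding point 4 | sd excluding point 4 | estimate including point 4 | sd including point 4 |
|---|---|---|---|---|
| $L_\infty$, mm | 6.46 | 1.08 | 3.37 | 0.096 |
| $r_B$, d$^{-1}$ | 0.099 | 0.022 | 0.277 | 0.023 |

$$L(t) = L_\infty - (L_\infty - L_0) \exp(-r_B t) \approx L_0 + r_B t\, (L_\infty - L_0) \quad \text{for small } t$$

[Figure: length, mm against time, d, with the fits that exclude and include point 4]

[Figure: 95% conf intervals for $L_\infty$ and $r_B$, for the fits that exclude and include point 4]

Correlations among parameter estimates can have big effects on simultaneous confidence intervals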
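The small-time approximation suggests one reason for such correlations: early observations mainly determine the product of the growth rate and the length range, not the two parameters separately. The sketch below evaluates the von Bertalanffy curve and its linear approximation for the two pairs of estimates in the table; the initial length `L_0 = 1.0` mm is a hypothetical value, since the slides do not give it.

```python
import math

def von_bertalanffy(t, L_inf, r_B, L_0):
    """Length at time t for ultimate length L_inf, growth rate r_B and initial length L_0."""
    return L_inf - (L_inf - L_0) * math.exp(-r_B * t)

def small_t_approx(t, L_inf, r_B, L_0):
    """Linear approximation of the growth curve, valid for small t."""
    return L_0 + r_B * t * (L_inf - L_0)

L_0 = 1.0  # hypothetical initial length in mm, not from the slides
fits = {"excluding point 4": (6.46, 0.099), "including point 4": (3.37, 0.277)}

gaps = {}
for label, (L_inf, r_B) in fits.items():
    exact = von_bertalanffy(0.1, L_inf, r_B, L_0)
    approx = small_t_approx(0.1, L_inf, r_B, L_0)
    gaps[label] = abs(exact - approx)
    # The initial slope is all that the earliest data can pin down.
    print(f"{label}: initial slope {r_B * (L_inf - L_0):.3f} mm/d, gap at t = 0.1 d: {gaps[label]:.5f} mm")
```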

L

Page 17: Testing models against data

No age, but size:

Trichopsis vittatus

These gouramis are from the same nest, they have the same age and lived in the same tank

Social interaction during feeding caused the huge size difference

Age-based models for growth are bound to fail; growth depends on food intake

Page 18: Testing models against data

Rules for feeding

Page 19: Testing models against data

Social interaction Feeding

[Figure: reserve density and length against time, for 1 ind and for 2 ind; curves labelled determin and expectation]

Page 20: Testing models against data

Dependent observations

Conclusion

Dependences can work out in complex ways

The two growth curves look like von Bertalanffy curves with very different parameters

But in reality both individuals had the same parameters!