SAMSI March 2007

GASP Models and Bayesian Regression

David M. Steinberg Dizza Bursztyn

Tel Aviv University Ashkelon College

Preview

1. GASP: The Random Field Regression Model

2. The RFR Model and Bayesian Regression

3. RFR to Bayes – What is the Model?

4. Example 1: A Simple One-Factor Model

5. Example 2: Nuclear Waste Repository

6. From Bayes to RFR

7. Conclusions

GASP: The Random Field Regression Model

What kind of model should be used for data from computer experiments?

We need to consider:

• Attention to bias, not to variance.

• Nonlinear effects and interactions.

• High-dimensional inputs.

The RFR model is one possible solution.

The Random Field Regression Model

Also known as:

• Kriging model, from its roots in geostatistics

• GASP, for Gaussian stochastic process

The Random Field Regression Model

Let y denote a response and x a vector of factor settings or covariates.

Treat y as the realization of a random field with a fixed regression component:

$$y(x) = \beta_0 + \sum_j \beta_j f_j(x) + \varepsilon(x)$$

The regression part is often limited to just the constant term.

The Random Field Regression Model

The random field $\varepsilon(x)$ is used to represent the departure of the true response function from the regression model.

Typical assumptions:

$$E\{\varepsilon(x)\} = 0$$

$$E\{\varepsilon(x_1)\,\varepsilon(x_2)\} = C(x_1,x_2) = \sigma^2 R(x_1,x_2)$$

The Random Field Regression Model

• We can estimate the response at a new input site using the Best Linear Unbiased Predictor.

• The estimator is also the posterior mean if we assume that all random terms have normal distributions.

• The estimator is much more flexible than the standard regression model. It smoothly interpolates the output data.
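As a minimal sketch of the predictor described in these bullets, the following Python code computes the BLUP for a model whose regression part is only the constant term. The Gaussian correlation function, the value `theta=5.0`, the function names and the sine test function are assumptions made for illustration; none of them comes from the slides.

```python
import numpy as np

def gauss_corr(a, b, theta=5.0):
    """Gaussian correlation exp(-theta * (a - b)^2) for 1-D inputs (assumed form)."""
    return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)

def blup(x_train, y_train, x_new, theta=5.0):
    """BLUP at x_new for a random field model with a constant mean."""
    ones = np.ones(len(x_train))
    R = gauss_corr(x_train, x_train, theta)      # correlations among design points
    r = gauss_corr(x_new, x_train, theta)        # correlations new site vs. design
    # Generalized least squares estimate of the constant term.
    beta0 = ones @ np.linalg.solve(R, y_train) / (ones @ np.linalg.solve(R, ones))
    # Constant plus a correction driven by the residuals at the design points.
    return beta0 + r @ np.linalg.solve(R, y_train - beta0 * ones)

x_train = np.linspace(-1.0, 1.0, 7)
y_train = np.sin(3.0 * x_train)                  # stand-in for simulator output
y_hat = blup(x_train, y_train, x_train)          # predict back at the design points
y_new = blup(x_train, y_train, np.array([0.1]))  # predict at a new input site
```

Because no noise term is added to the correlation matrix, the predictor returns the observed outputs at the design points, which is the interpolation property noted in the last bullet.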

The Random Field Regression Model

• Typically the correlation function R includes parameters that can be estimated by maximum likelihood or by cross-validation.

• One popular recommendation:

$$R(x_1,x_2) = \exp\Big\{-\sum_j \theta_j\,|x_{1,j} - x_{2,j}|^{p_j}\Big\}$$
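One way the power exponential family could be coded is sketched below. The function name `power_exp_corr`, the random two-factor design and the parameter values are hypothetical; `theta` holds one scale parameter per factor and `p` one exponent per factor.

```python
import numpy as np

def power_exp_corr(X1, X2, theta, p):
    """R[i, k] = exp(-sum_j theta[j] * |X1[i, j] - X2[k, j]| ** p[j])."""
    X1, X2 = np.atleast_2d(X1), np.atleast_2d(X2)
    diff = np.abs(X1[:, None, :] - X2[None, :, :])   # shape (n1, n2, d)
    return np.exp(-np.sum(theta * diff ** p, axis=-1))

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(5, 2))   # hypothetical design, 5 sites, 2 factors
theta = np.array([0.5, 2.0])              # assumed scale parameters
p = np.array([2.0, 1.0])                  # assumed exponents
R = power_exp_corr(X, X, theta, p)
```

A larger `theta[j]` makes the correlation fall off faster along factor j, and an exponent of 2 gives smoother realizations than an exponent near 1.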

The Random Field Regression Model

An example from Welch et al., Technometrics, 1992.

The Random Field Regression Model

The RFR model has been found to generate good predictors in applications.

But:

• It is difficult to interpret.

• It does not relate to “classical” models.

• It is not clear “what it does to the data”.

RFR Model and Bayesian Regression

We will show that the RFR model can be understood as a Bayesian regression model.

Suppose we want to represent the response y using a regression model:

$$y(x) = \beta_0 + \sum_j \beta_j f_j(x)$$

RFR Model and Bayesian Regression

Take a Bayesian view and assign priors to the coefficients.

Assign a vague prior to the constant.

Assume that the remaining coefficients are independent, with

$$\beta_j \sim N(0, \sigma_j^2).$$

RFR Model and Bayesian Regression

We now have

$$y(x) = \beta_0 + \sum_j \beta_j f_j(x) = \beta_0 + \varepsilon(x)$$

RFR Model and Bayesian Regression

The term $\varepsilon(x)$ is a random field whose distribution is induced by the prior assumptions on the regression coefficients.

$$E\{\varepsilon(x)\} = 0$$

$$E\{\varepsilon(x_1)\,\varepsilon(x_2)\} = C(x_1,x_2) = \sum_j \sigma_j^2 f_j(x_1) f_j(x_2)$$
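The induced covariance can be checked numerically. The sketch below assumes three regression functions (x, x squared, x cubed) and arbitrary prior variances, neither of which is from the slides; it compares the formula, a sum of prior variance times f_j(x1) f_j(x2), with the empirical covariance of fields simulated by drawing the coefficients from their priors.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 5)               # sites at which the field is evaluated
F = np.column_stack([x, x**2, x**3])        # assumed f_1, f_2, f_3 at those sites
prior_var = np.array([1.0, 0.25, 0.0625])   # assumed prior variances of the coefficients

# Covariance from the formula: sum_j var_j * f_j(x1) * f_j(x2).
C_formula = (F * prior_var) @ F.T

# Monte Carlo version: draw coefficients from N(0, var_j) and build the field.
rng = np.random.default_rng(1)
beta = rng.normal(0.0, np.sqrt(prior_var), size=(200_000, 3))
field = beta @ F.T                          # each row is one realization at the sites
C_sim = field.T @ field / field.shape[0]    # the field has mean zero by construction

max_abs_diff = np.max(np.abs(C_sim - C_formula))
```

The two matrices agree up to simulation error, which illustrates that the prior on the coefficients alone determines the random field's covariance.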

RFR Model and Bayesian Regression

The RFR model is equivalent to a Bayesian regression model.

The number of regression functions can be as large as we desire, even a full series expansion.

RFR Model and Bayesian Regression

The importance of each regression function in the Bayesian model is reflected by the prior variance, with important terms assigned large variances.

A regression component in the RFR model corresponds to assigning diffuse priors to the appropriate coefficients (i.e. giving them “infinite” prior variances). Those terms are then left out of the random field component.

RFR to Bayes – What is the Model?

Suppose we fit an RFR model to data from a computer experiment.

Can we find an associated Bayesian regression model?

Finding the Bayes model may be helpful in understanding the RFR model.

RFR to Bayes – What is the Model?

Some simple data analysis provides an answer.

Our algorithm:

• Compute the correlation matrix $R(x_i, x_j)$ at all pairs of design points.

• Compute the eigenvalues and eigenvectors of the correlation matrix.

• For the leading eigenvalues, find out how the associated eigenvectors are related to the design factors.
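The three steps can be sketched in Python as follows. The two-factor random design, the Gaussian correlation with the `theta` values shown, and the choice to summarize the third step by an R² from regressing each eigenvector on the linear factor effects are all assumptions for illustration, not details given on the slide.

```python
import numpy as np

def r_squared(v, terms):
    """R^2 from least-squares regression of v on an intercept plus the columns of `terms`."""
    A = np.column_stack([np.ones(len(v)), terms])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    resid = v - A @ coef
    return 1.0 - resid @ resid / np.sum((v - v.mean()) ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 2))     # hypothetical 2-factor design
theta = np.array([0.1, 0.05])                # assumed scale parameters

# Step 1: correlation matrix at all pairs of design points.
R = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2 * theta).sum(axis=-1))

# Step 2: eigenvalues and eigenvectors, sorted with the largest eigenvalue first.
eigval, eigvec = np.linalg.eigh(R)
eigval, eigvec = eigval[::-1], eigvec[:, ::-1]

# Step 3: relate the leading eigenvectors to the design factors.
# The first eigenvector is skipped because it is usually close to a constant.
r2_linear = [r_squared(eigvec[:, k], X) for k in range(1, 4)]
for k, r2 in zip(range(1, 4), r2_linear):
    print(f"eigenvector {k + 1}: eigenvalue {eigval[k]:.3g}, R^2 on linear terms {r2:.3f}")
```

Other candidate terms (quadratics, interactions) can be passed to `r_squared` in the same way to see which effects an eigenvector represents.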

Example 1: A One-factor Design

Consider a computer experiment with just one factor.

The design includes 50 points spread uniformly on the interval [-1,1].

The correlation function is estimated from the power exponential family:

$$R(x_1,x_2) = \exp\{-0.05\,|x_1 - x_2|^2\}$$
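A rough reconstruction of this example is sketched below. It assumes the 50 points are equally spaced, which the slide does not state explicitly, and it summarizes each eigenvector by the R² from a polynomial regression on x; the helper name `r2_poly` is hypothetical.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 50)                       # assumed: equally spaced design
R = np.exp(-0.05 * (x[:, None] - x[None, :]) ** 2)   # correlation matrix from the slide

eigval, eigvec = np.linalg.eigh(R)
eigval, eigvec = eigval[::-1], eigvec[:, ::-1]       # largest eigenvalue first

def r2_poly(v, degree):
    """R^2 from regressing v on polynomial terms in x up to `degree` (with intercept)."""
    A = np.vander(x, degree + 1)
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    resid = v - A @ coef
    return 1.0 - resid @ resid / np.sum((v - v.mean()) ** 2)

print("eight leading eigenvalues:", eigval[:8])
for k in (1, 2, 3):
    r2 = r2_poly(eigvec[:, k], k)
    print(f"eigenvector {k + 1}: R^2 on a degree-{k} polynomial = {r2:.4f}")
```

Under these assumptions the second eigenvector is very nearly a linear function of x, and the later ones can be compared with higher-order polynomials in the same way.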

Example 1: A One-factor Design

The eight leading eigenvalues of the correlation matrix:

[Figure: "Leading Eigenvalues", plotted against their index (ticks at 2, 4, 6, 8) on a logarithmic vertical axis running from 10^-13 to 10^1.]

Example 1: A One-factor Design

The first eigenvector, plotted against the input factor from our design:

[Figure: "First Vector" against X; horizontal axis from -1.0 to 1.0, vertical axis from -0.3 to 0.3.]

Example 1: A One-factor Design

The second eigenvector:

[Figure: "Second Vector" against X; horizontal axis from -1.0 to 1.0, vertical axis from -0.3 to 0.3.]

Example 1: A One-factor Design

The third eigenvector:

[Figure: "Third Vector" against X; horizontal axis from -1.0 to 1.0, vertical axis from -0.3 to 0.3.]

Example 1: A One-factor Design

The fourth eigenvector:

[Figure: "Fourth Vector" against X; horizontal axis from -1.0 to 1.0, vertical axis from -0.3 to 0.3.]

RFR to Bayes – What is the Model?

Why does the algorithm work?

Let Y denote the output vector. The Bayesian regression model says that

$$Y = \beta_0 \mathbf{1} + \beta_1 f_1 + \dots + \beta_T f_T,$$

where $f_1, f_2, \dots, f_T$ are the columns in the regression matrix.

Then $Y \sim N(\beta_0 \mathbf{1}, C)$, where $C = \sum_j \sigma_j^2 f_j f_j'$.

RFR to Bayes – What is the Model?

The algorithm merely reverses the logic.

Given the correlation matrix, it identifies regression vectors and prior variances.

The regression vectors depend on intrinsic properties of the correlation function and on the experimental design.

For example, if the design “confounds” two effects, we might get a regression vector that is explained by either of the two or by a linear combination of them.
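The reversal can be illustrated with the spectral decomposition of a correlation matrix: each eigenvector plays the role of a regression vector and its eigenvalue the role of a prior variance, so that summing eigenvalue times the outer product of the eigenvector rebuilds the matrix. The design and the correlation parameter below are assumptions chosen only to make the sketch concrete.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 20)                      # hypothetical one-factor design
R = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)   # assumed Gaussian correlation

eigval, eigvec = np.linalg.eigh(R)                  # eigenvalues in ascending order

# Full reconstruction: R = sum_k eigval[k] * v_k v_k'.
C_full = (eigvec * eigval) @ eigvec.T

# Keeping only the six leading terms leaves an error equal to the
# largest eigenvalue (in absolute value) among the discarded terms.
C_top = (eigvec[:, -6:] * eigval[-6:]) @ eigvec[:, -6:].T
truncation_error = np.linalg.norm(R - C_top, 2)
```

Reading the decomposition this way gives a Bayesian regression model with as many regression vectors as design points, most of which carry negligible prior variance.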

Example 2: Nuclear Waste Repository

We included 26 input factors.

The design is a 900-point Latin Hypercube, generated automatically by RESRAD.

Several pairs of factors should be equal to one another. RESRAD allowed us to enforce a 0.99 rank correlation between such pairs.

Other pairs should be similar and we used a 0.3 rank correlation for them.

Example 2: Nuclear Waste Repository

The response is the maximal equivalent annual dose of radiation in the drinking water (in millirem) during a 10,000 year time window.

IAEC standards stipulate that this dose should be at most 30 millirem.

The goal is to identify factors that affect the outcome and should be subject to further study at a proposed repository site.

Example 2: Nuclear Waste Repository

The output data show no leaching at all for more than 75% of the input vectors.

When leaching does occur, the maximal annual dose has a highly skewed distribution:

[Figure: normal quantile plot of Max Dose against Quantiles of Standard Normal; horizontal axis from -3 to 3, vertical axis from 0 to 15000.]

Example 2: Nuclear Waste Repository

We fitted an RFR model to the log of the maximal annual dose, using only those input vectors with an outcome of at least 0.1 (n=163).

We selected the 8 strongest input factors as predictors.

Most of these factors are also related to the presence/absence of leaching, so the design for the RFR model is no longer uniform in the input space.

Example 2: Nuclear Waste Repository

Two of the strongest factors related to presence or absence of leaching.

[Figure: scatter plot of Thickness against U238 Distribution Coefficient (UZ); horizontal axis ticks from -1.0 to 0.5, vertical axis from -1.0 to 1.0.]

Example 2: Nuclear Waste Repository

An RFR model was fitted using the 8 strong factors with the PErK software of Brian Williams.

The power exponential correlation function was used.

Example 2: Nuclear Waste Repository

The fitted model:

Factor                     Theta    Exponent
Kd U238 Unsaturated        0.242    1.66
Thickness                  0.237    1.97
Kd U238 Saturated          0.290    1.69
Effective Porosity         0.023    0.13
Eff. Porosity Saturated    0.102    1.95
Hydraulic Conductivity     0.143    1.94
Precipitation              0.404    0.20
Kd T230 Contaminated       0.008    0.16

Example 2: Nuclear Waste Repository

The eigenvalues:

[Figure: Eigenvalue against Sequence Number; horizontal axis from 0 to 150, logarithmic vertical axis from 0.1 to 50.0.]

Example 2: Nuclear Waste Repository

The leading eigenvector versus Thickness:

[Figure: Leading E-vector against Thickness; horizontal axis from -1.0 to 1.0, vertical axis from -0.15 to 0.15.]

Example 2: Nuclear Waste Repository

The next 5 eigenvectors are almost linear functions of the 5 input factors with the largest scale parameters. (Dominant factors were marked in red on the slide.)

E-vector    Factors    R² (%)
2           1,2,3      97.7
3           1,2,3,7    94.6
4           1,2,3,6    97.2
5           2,3,6,7    94.6
6           1,2,6,7    89.2

Example 2: Nuclear Waste Repository

Adding a few nonlinear effects increases the R² values to above 95%.

The first vector has small quadratic effects of the first 3 factors.

The 6th vector has clear nonlinear effects of factor 7 (Precipitation – low exponent in model).

Example 2: Nuclear Waste Repository

The 7th eigenvector is not a linear function of the input factors.

Adding second-order effects shows a strong quadratic effect of Precipitation.

Example 2: Nuclear Waste Repository

Plot of the 7th eigenvector against Precipitation:

[Figure: E-vector 7 against Precipitation; horizontal axis from -1.0 to 1.0, vertical axis from -0.15 to 0.15.]

Example 2: Nuclear Waste Repository

Regressing the 7th eigenvector against a “tent function with a plateau” in Precipitation gives an R² of 89.9%.

The remaining scatter is most closely related to a linear effect in factor 5 (Effective Porosity in the Saturated Zone) and a quadratic effect in factor 3 (Kd for U238 in the Saturated Zone).

Example 2: Nuclear Waste Repository

The 8th eigenvector is not a linear function of the input factors.

It can be largely explained by a linear term in Effective Porosity (Saturated Zone), a quadratic dependence on the Kd for U238 (Saturated Zone), the interaction of the last factor with Thickness, and nonlinear terms in Precipitation.

Example 2: Nuclear Waste Repository

Plot of the 8th eigenvector vs. Effective Porosity (SZ), and residuals vs. Precipitation.

[Figure, first panel: 8th E-vector against Effective Porosity (SZ); horizontal axis from -0.5 to 0.5, vertical axis from -0.2 to 0.2.]

[Figure, second panel: Residuals against Precipitation; horizontal axis from -1.0 to 1.0, vertical axis from -0.05 to 0.10.]

The outlier is in a “corner” of the Thickness by Kd projection.

Example 2: Nuclear Waste Repository

We can also apply the idea “in reverse.”

Suppose there is a linear effect in one of the input factors.

Is the effect a part of the RFR model?

Example 2: Nuclear Waste Repository

Results from regressing linear effects on the 12 leading eigenvectors.

Factor    E-vectors    R² (%)
1         2-6          97.9
2         2-6          98.6
3         2-5          97.4
4         2-12         10.2
5         2-12         97.0
6         2-5          98.3
7         2-8          93.9
8         2-12          3.5
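The "reverse" analysis can be sketched as follows: take an effect of interest, such as the linear or pure cubic effect of one factor, and regress it on a set of leading eigenvectors. The data here are synthetic (a hypothetical three-factor design with assumed scale parameters and Gaussian correlation), so the numbers it produces are not comparable to the tables on the slides.

```python
import numpy as np

def r2_on_eigvecs(target, eigvec, idx):
    """R^2 from regressing `target` on an intercept and the eigenvectors in `idx`."""
    A = np.column_stack([np.ones(len(target)), eigvec[:, idx]])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    return 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)

rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(60, 3))           # hypothetical 3-factor design
theta = np.array([0.3, 0.1, 0.01])                 # assumed scale parameters
R = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2 * theta).sum(axis=-1))

eigval, eigvec = np.linalg.eigh(R)
eigvec = eigvec[:, ::-1]                           # leading eigenvector first

# Linear effects on eigenvectors 2-6, cubic effects on eigenvectors 2-12
# (0-based column indices 1..5 and 1..11).
r2_linear = [r2_on_eigvecs(X[:, j], eigvec, np.arange(1, 6)) for j in range(3)]
r2_cubic = [r2_on_eigvecs(X[:, j] ** 3, eigvec, np.arange(1, 12)) for j in range(3)]
```

A high R² suggests the effect is well represented by the leading terms of the implied Bayesian model, while a low R² suggests the fitted RFR model gives that effect little prior weight.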

Example 2: Nuclear Waste Repository

Results from regressing pure cubic effects on the 12 leading eigenvectors.

Factor    E-vectors    R² (%)
1         2-12         45.9
2         1-12         37.0
3         2-12         46.6
4         2-12          8.8
5         2-12         63.8
6         2-12         35.6
7         2-12         75.4
8         1-12          9.4

From Bayes to RFR

The ideas here can also be used to derive covariance functions for RFR models.

Write down a Bayesian regression model.

Compute the resulting covariance function.

From Bayes to RFR

Example 1:

• Hermite polynomials.

• Decay to 0 away from the origin.

• Priors on the coefficients that shrink exponentially.

The result is the power exponential family with all exponents equal to 2.

From Bayes to RFR

Example 2:

• Fourier series.

• Priors on the coefficients that shrink polynomially.

The result is a family of spline covariances.

Some Special Models

1. Gaussian correlation and Hermite polynomials

$$R(x_1,x_2) = \prod_j \exp\{-\theta_j\,|x_{1,j} - x_{2,j}|^2\}$$

Consider a single univariate term in this product.

The scaled Hermite polynomials $H^*_s(x)$ are orthonormal with respect to the N(0,1) density:

$$E\{H^*_s(Z)\,H^*_t(Z)\} = \delta_{s,t}, \qquad Z \sim N(0,1).$$

Define

$$J_s(x) = H^*_s(x)\,\exp\left\{-\frac{w x^2}{2(1+w)}\right\}.$$

Assume

$$Z(x) = \sum_{s=0}^{\infty} \beta_s J_s(x), \qquad E\{\beta_s\} = 0, \qquad \mathrm{var}\{\beta_s\} = \sigma^2 w^s.$$

Then

$$C(x_1,x_2) = \sigma^2 \sum_{s=0}^{\infty} w^s J_s(x_1) J_s(x_2) = \sigma^2 (1-w^2)^{-1/2} \exp\left\{-\frac{w\,(x_1-x_2)^2}{2(1-w^2)}\right\}.$$
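The closed form for this covariance is an instance of Mehler's formula and can be checked numerically. The sketch below takes J_s(x) to be the orthonormal Hermite polynomial of degree s times exp(-w x² / (2(1+w))), sets the overall prior scale to 1, and truncates the series at 60 terms; the truncation point, the evaluation grid and the function names are assumptions made for illustration.

```python
import numpy as np

def scaled_hermite(x, s_max):
    """Orthonormal (probabilists') Hermite polynomials H*_0..H*_{s_max} evaluated at x."""
    x = np.asarray(x, dtype=float)
    H = np.empty((s_max + 1,) + x.shape)
    H[0] = 1.0
    if s_max >= 1:
        H[1] = x
    for s in range(1, s_max):
        # Three-term recurrence for He_s / sqrt(s!).
        H[s + 1] = (x * H[s] - np.sqrt(s) * H[s - 1]) / np.sqrt(s + 1)
    return H

def J(x, w, s_max):
    """J_s(x) = H*_s(x) * exp(-w x^2 / (2 (1 + w))) for s = 0..s_max."""
    x = np.asarray(x, dtype=float)
    return scaled_hermite(x, s_max) * np.exp(-w * x**2 / (2.0 * (1.0 + w)))

w, s_max = 0.35, 60
grid = np.linspace(-2.0, 2.0, 9)
x1, x2 = np.meshgrid(grid, grid)

# Truncated series: sum_s w^s J_s(x1) J_s(x2).
weights = w ** np.arange(s_max + 1)
series = np.tensordot(weights, J(x1, w, s_max) * J(x2, w, s_max), axes=1)

# Closed form: (1 - w^2)^(-1/2) * exp(-w (x1 - x2)^2 / (2 (1 - w^2))).
closed = np.exp(-w * (x1 - x2) ** 2 / (2.0 * (1.0 - w**2))) / np.sqrt(1.0 - w**2)
```

In this parametrization the result is a Gaussian correlation in x1 - x2 with scale parameter w / (2(1 - w²)), so the geometric decay rate w of the prior variances determines how quickly the correlation falls off.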

[Figure: Plot of J1 for w=0.35. Horizontal axis from -5 to 5; vertical axis from -1.0 to 1.0.]

[Figure: Plot of J2 for w=0.35. Horizontal axis from -5 to 5; vertical axis from -1 to 2.]

2. Trigonometric regression and splines.

Assume x is in [0,1] and

$$Y(x) = \sum_{j=0}^{m} \beta_j k_j(x) + \sum_{s=1}^{\infty} \left[\gamma_{1,s}\cos(2\pi s x) + \gamma_{2,s}\sin(2\pi s x)\right].$$

The first sum has polynomials and all coefficients except the last one have vague priors.

The remaining terms have priors with mean 0 and

$$\mathrm{var}(\beta_m) = \frac{\sigma^2}{n\lambda}, \qquad \mathrm{var}(\gamma_{l,s}) = \frac{2\sigma^2}{(2\pi s)^{2m}\, n\lambda}.$$

Apart from the common factor $\sigma^2/(n\lambda)$, the contribution of the trigonometric terms to the covariance function is

$$\frac{2}{(2\pi)^{2m}} \sum_{j=1}^{\infty} \frac{\cos(2\pi j x_1)\cos(2\pi j x_2) + \sin(2\pi j x_1)\sin(2\pi j x_2)}{j^{2m}} = \frac{(-1)^{m-1}}{(2m)!}\, B_{2m}(|x_1 - x_2|).$$

Here $B_{2m}$ is a Bernoulli polynomial and the right-hand side is a spline in one argument for fixed values of the second argument.
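The link between the trigonometric series and the Bernoulli polynomial can be checked numerically for a small case. The sketch below uses m = 2, the standard polynomial B_4(d) = d^4 - 2d^3 + d^2 - 1/30, and a truncated series; the function names, the evaluation points and the number of terms are assumptions for illustration, and only B_2 and B_4 are coded.

```python
import numpy as np
from math import factorial, pi

def bernoulli_poly(order, d):
    """Bernoulli polynomial B_2(d) or B_4(d); other orders are not coded in this sketch."""
    if order == 2:
        return d**2 - d + 1.0 / 6.0
    if order == 4:
        return d**4 - 2.0 * d**3 + d**2 - 1.0 / 30.0
    raise ValueError("only B_2 and B_4 are implemented")

def trig_covariance(x1, x2, m, n_terms=5000):
    """Truncated series 2 / (2 pi)^(2m) * sum_j cos(2 pi j (x1 - x2)) / j^(2m)."""
    j = np.arange(1, n_terms + 1, dtype=float)
    # cos(a)cos(b) + sin(a)sin(b) = cos(a - b), so only the difference matters.
    terms = np.cos(2.0 * pi * j * (x1 - x2)) / j ** (2 * m)
    return 2.0 / (2.0 * pi) ** (2 * m) * terms.sum()

m = 2
x1, x2 = 0.8, 0.15
series = trig_covariance(x1, x2, m)
closed = (-1) ** (m - 1) * bernoulli_poly(2 * m, abs(x1 - x2)) / factorial(2 * m)
```

For m = 2 the terms decay like 1/j^4, so a few thousand terms are ample; for m = 1 the series converges much more slowly and would need far more terms for the same accuracy.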

The estimator produced by this model is exactly the interpolating spline of degree 2m-1: among all functions that interpolate the data, it minimizes the integral of the squared m’th derivative.

Conclusions

• RFR models offer great flexibility for modeling data from computer experiments.

• RFR models have an equivalent interpretation as Bayesian regression models.

• The Bayesian regression framework can be helpful for understanding how the RFR model is modeling the data.

• Some straightforward data analysis can uncover a Bayesian model associated with the RFR.