Transcript of "Uncertainty Quantification Using Deep Gaussian Processes"

Page 1:

Uncertainty Quantification Using Deep Gaussian Processes

Alireza Daneshkhah
Warwick Centre for Predictive Modelling, The University of Warwick


Page 2: Motivation

[Workflow figure: a deterministic solver maps the observed input through reduction, density estimation, and reconstruction; Bayesian training of a surrogate model links the reduced input space, A = {α(s)}_{s=1}^{S_A}, to the output space. Key steps: tree construction, HDMR terms, experimental design, output correlations, data collection; the outputs are statistics, PDFs, and error bars.]

Bilionis and Zabaras (2012).

Page 3: The Multiscale Modelling Challenges

1. Challenges of complex multiscale physical models:
   - Curse of dimensionality
   - Computational complexity (and limited data)
   - Discontinuity of the model output
2. Current solutions:
   - Probabilistic neural networks
   - Traditional Gaussian processes
   - Multi-output separable Gaussian processes
3. Deep Gaussian processes:
   - Probabilistic representation
   - An analytical solution is available
   - Model dimensionality is no longer an issue


Page 4: Deep Neural Network

The idea is taken from the deep neural network with l hidden layers. Given x:

h_1 = φ(W_1 x)
h_2 = φ(W_2 h_1)
h_3 = φ(W_3 h_2)
y = w_4^T h_3

[Figure: a fully connected network with inputs x_1, ..., x_6 and three hidden layers h_1, h_2, h_3.]

Lawrence et al. (2014).
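
A minimal NumPy sketch of this forward pass; the layer widths and the tanh activation are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def forward(x, Ws, w_out, phi=np.tanh):
    """Forward pass: h_i = phi(W_i h_{i-1}), then y = w_out^T h_l."""
    h = x
    for W in Ws:
        h = phi(W @ h)
    return w_out @ h

rng = np.random.default_rng(0)
sizes = [6, 8, 6, 4]                                   # x in R^6, three hidden layers
Ws = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
w_out = rng.standard_normal(sizes[-1])
y = forward(rng.standard_normal(sizes[0]), Ws, w_out)  # scalar output
```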


Page 5: Problems with Deep Neural Networks

As the number of nodes in neighbouring layers increases, the corresponding weight matrix W (in h = φ(Wx)) becomes very large, leading to an overfitted model.

Solution: replace each W_i with a lower-rank form

W_i = U_i V_i^T

so that if W is k_1 × k_2, then U is k_1 × q and V is k_2 × q, with q small. Given x:

f_1 = V_1^T x,    h_1 = g(U_1 f_1)
f_2 = V_2^T h_1,  h_2 = g(U_2 f_2)
f_3 = V_3^T h_2,  h_3 = g(U_3 f_3)
y = w_4^T h_3

[Figure: the same network with low-rank bottleneck layers f_i inserted between the hidden layers h_i.]

Lawrence et al. (2014).
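
A minimal sketch of the parameter saving from the factorisation (all shapes here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
k1, k2, q = 1000, 1000, 10            # layer widths and bottleneck size (illustrative)
U = rng.standard_normal((k1, q))
V = rng.standard_normal((k2, q))

x = rng.standard_normal(k2)
f = V.T @ x                           # project into the q-dimensional bottleneck
h = np.tanh(U @ f)                    # equivalent to tanh((U @ V.T) @ x)

# Full W has k1*k2 = 1,000,000 parameters; U and V together have only
# (k1 + k2)*q = 20,000 -- the low-rank form controls overfitting.
```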


Page 6: Alternative Solution: Deep GP

Put a GP prior over the weights and take the width of each layer to infinity:

Y = f_3(f_2(· · · f_1(X))),    H_i = f_i(H_{i−1})


Page 7: Deep Gaussian Process

[Figure: a three-layer deep GP with f_1 ~ GP, f_2 ~ GP, f_3 ~ GP.]

1. Deep Gaussian process:
   - Bayesian belief network (DAG)
   - Non-parametric, non-linear mappings f_l
   - The likelihood is a non-linear function of the inputs
2. Challenges:
   - How to learn the intermediate hidden layers?
   - How to efficiently train the model?
3. Solution:
   - Variational compression
   - Provides a probabilistic representation of the model evidence

Page 8: Non-linear Mapping Using a GP

Non-linear regression problem: learn f, with error bars, from data D = {X, y}.

Place a GP prior on the N function values f = {f_i}_{i=1}^N given the inputs X = {x_i}_{i=1}^N:

p(f | X) = N(0, K_N)

with the ARD exponentiated-quadratic covariance

K(x_i, x_j) = τ² exp( −(1/2) Σ_{k=1}^q ( (x_i^(k) − x_j^(k)) / ω_k )² )
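
A short NumPy sketch of this ARD covariance; the hyper-parameter values are assumptions:

```python
import numpy as np

def ard_kernel(X1, X2, tau2=1.0, omega=None):
    """K(x_i, x_j) = tau2 * exp(-0.5 * sum_k ((x1_k - x2_k) / omega_k)^2)."""
    omega = np.ones(X1.shape[1]) if omega is None else np.asarray(omega)
    D = (X1[:, None, :] - X2[None, :, :]) / omega     # scaled pairwise differences
    return tau2 * np.exp(-0.5 * (D ** 2).sum(axis=-1))

X = np.random.default_rng(0).uniform(size=(5, 3))     # N = 5 points in q = 3 dims
KN = ard_kernel(X, X)                                 # the 5 x 5 prior covariance K_N
```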


Page 9: GP Regression

y_i = f_i + ε_i,    ε_i ~ N(0, σ²)

Marginal likelihood: p(y | X) = N(0, K_N + σ²I)

Predictive distribution: p(y* | x*, y, X) = N(μ*, σ*²), with

μ* = K_*N (K_N + σ²I)^(−1) y
σ*² = K_** − K_*N (K_N + σ²I)^(−1) K_N* + σ²

Problem: O(N³) computation.

Solution: use a sparse GP approximation based on a small set of M pseudo-inputs, or inducing variables.
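
A sketch of these predictive equations, using a Cholesky factorisation for the O(N³) solve; the kernel is passed in (e.g. the ard_kernel sketch above):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def gp_predict(X, y, Xstar, kern, sigma2):
    """Predictive mean and variance for GP regression (O(N^3) Cholesky)."""
    c = cho_factor(kern(X, X) + sigma2 * np.eye(len(X)))
    KsN = kern(Xstar, X)
    mu = KsN @ cho_solve(c, y)                                 # K_*N (K_N + s2 I)^-1 y
    var = np.diag(kern(Xstar, Xstar) - KsN @ cho_solve(c, KsN.T)) + sigma2
    return mu, var

# e.g. mu, var = gp_predict(X, y, Xstar, ard_kernel, sigma2=0.01)
```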


Page 10: Sparse GP Using Pseudo-Inputs

1. Choose any set of M ≪ N inducing inputs X̄.
2. Draw the corresponding function values f̄ from the prior:
   p(f̄ | X̄) = N(0, K_M)
3. Draw f conditioned on f̄:
   p(f | f̄) = N( K_NM K_M^(−1) f̄,  Σ = K_N − K_NM K_M^(−1) K_MN )


Page 11: Sparse GP Approximation

p(f_i | f̄) = N( μ_i = K_iM K_M^(−1) f̄,  λ_i = K_ii − K_iM K_M^(−1) K_Mi )

Approximate: p(f | f̄) ≈ Π_{i=1}^N p(f_i | f̄) = N(μ, Λ),    Λ = diag(λ)

This is the minimum-KL factorised approximation: min_{q_i} KL[ p(f | f̄) ‖ Π_i q_i(f_i) ]

Integrate out f̄ to obtain: p(f) = ∫ p(f̄) p(f | f̄) df̄


Page 12: Sparse Pseudo-Input GP (SPGP)

GP prior: N(0, K_N)  ≈  SPGP prior: p(f) = N(0, K_NM K_M^(−1) K_MN + Λ)

[Figure: the full N × N covariance approximated as a low-rank matrix plus a diagonal.]

SPGP covariance computation: O(M²N)
Predictive mean computational complexity: O(M)
Predictive variance computational complexity: O(M²)
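
A sketch of assembling the SPGP prior covariance, making the low-rank-plus-diagonal structure explicit; kern is any kernel function (e.g. the earlier ard_kernel sketch):

```python
import numpy as np

def spgp_prior_cov(X, Xbar, kern, jitter=1e-8):
    """SPGP prior covariance K_NM K_M^-1 K_MN + Lambda (low rank + diagonal)."""
    KM = kern(Xbar, Xbar) + jitter * np.eye(len(Xbar))
    KNM = kern(X, Xbar)
    Q = KNM @ np.linalg.solve(KM, KNM.T)          # the O(M^2 N) low-rank term
    Lam = np.diag(np.diag(kern(X, X) - Q))        # Lambda = diag(K_N - Q)
    return Q + Lam
```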


Page 13: How to Find the Sparse Pseudo-Inputs?

Treat the pseudo-inputs as extra hyper-parameters and maximise the marginal likelihood w.r.t. (X̄, τ, σ, ω):

p(y | X, X̄, τ, σ, ω)

This joint optimisation avoids the discontinuities that arise when the design points are selected.

We use this augmented-variable method, followed by a collapsed variational approximation, for learning the deep GP.


Page 14: Sparse Pseudo-Input Positions

[Figure: two 1-D regression examples plotting y against x, showing the fitted GP and the optimised pseudo-input locations X̄; the panels indicate the learned amplitude, lengthscale, and noise hyper-parameters.]

Page 15: Bayesian GP Latent Variable Model (GP-LVM)

Start with a standard GP-LVM:

p(Y | X) = Π_{j=1}^p N(y_:,j | 0, K)

Apply the standard latent-variable approach:
- Define a Gaussian prior over the latent space X:
  p(X) = Π_{j=1}^q N(x_:,j | 0, α_j² I)
- Integrate out the latent variables to get p(Y) = ?
- The integration is intractable.

Page 16: Standard Variational Inference

The standard variational bound has the form

L = ⟨log p(y | X)⟩_{q(X)} − KL( q(X) ‖ p(X) )

This requires the expectation of log p(y | X) under q(X), where

log p(y | X) = −(1/2) yᵀ(K_ff + σ²I)^(−1) y − (1/2) log|K_ff + σ²I| − (N/2) log 2π

Computing this expectation under q(X) is extremely difficult.

Instead, augment the GP model with inducing variables (Z, u = f(Z)):

p(f, u | Z, X) = N( (f, u) | 0, [ K_ff  K_fu ; K_uf  K_uu ] )

log p(y | X, Z) ≥ log N(y | 0, K_fu K_uu^(−1) K_uf + σ²I) − (1/(2σ²)) tr(Σ)

Σ = K_ff − K_fu K_uu^(−1) K_uf
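
A sketch of this collapsed bound; the jitter term and the kernel interface are assumptions:

```python
import numpy as np
from scipy.stats import multivariate_normal

def titsias_bound(y, X, Z, kern, sigma2, jitter=1e-8):
    """Collapsed bound: log N(y | 0, Qff + s2 I) - tr(Kff - Qff) / (2 s2)."""
    Kuu = kern(Z, Z) + jitter * np.eye(len(Z))
    Kfu = kern(X, Z)
    Qff = Kfu @ np.linalg.solve(Kuu, Kfu.T)       # Kfu Kuu^-1 Kuf
    fit = multivariate_normal.logpdf(y, np.zeros(len(y)),
                                     Qff + sigma2 * np.eye(len(y)))
    return fit - np.trace(kern(X, X) - Qff) / (2.0 * sigma2)
```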


Page 17: Bayesian Variational Inference

Treat u as extra parameters of the model, with prior

p(u) = N(u | 0, K_uu)

Applying parametric variational Bayes (Titsias and Lawrence, 2010) with variational distribution q(u) = N(u | m, S) gives

log p(y | X) ≥ log N(y | K_fu K_uu^(−1) m, σ²I) − (1/(2σ²)) tr( S K_uu^(−1) K_uf K_fu K_uu^(−1) )
              − KL( q(u) ‖ p(u) ) − (1/(2σ²)) tr(Σ)
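
A sketch of this uncollapsed bound with the Gaussian KL written out; the kernel interface is assumed as before:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gauss_kl(m, S, K):
    """KL( N(m, S) || N(0, K) ) between M-dimensional Gaussians."""
    M = len(m)
    return 0.5 * (np.trace(np.linalg.solve(K, S)) + m @ np.linalg.solve(K, m)
                  - M + np.linalg.slogdet(K)[1] - np.linalg.slogdet(S)[1])

def uncollapsed_bound(y, X, Z, m, S, kern, sigma2, jitter=1e-8):
    Kuu = kern(Z, Z) + jitter * np.eye(len(Z))
    Kfu = kern(X, Z)
    A = np.linalg.solve(Kuu, Kfu.T).T                 # Kfu Kuu^-1
    fit = multivariate_normal.logpdf(y, A @ m, sigma2 * np.eye(len(y)))
    trS = np.trace(S @ A.T @ A) / (2.0 * sigma2)      # tr(S Kuu^-1 Kuf Kfu Kuu^-1)
    Sigma = kern(X, X) - Kfu @ np.linalg.solve(Kuu, Kfu.T)
    return fit - trS - gauss_kl(m, S, Kuu) - np.trace(Sigma) / (2.0 * sigma2)
```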


Page 18: Deep GP Representation - Process Composition

Deep GP: y = h_l(h_{l−1}(. . . h_1(X))) + ε

Joint pdf:

p(y, {h_i}_{i=1}^l | x) = p(y | h_l) Π_{i=2}^l p(h_i | h_{i−1}) p(h_1 | x)

h_1 | X ~ N(0, K_{h1h1} + σ_1² I)
h_i | h_{i−1} ~ N(0, K_{hihi} + σ_i² I)
y | h_l ~ N(0, K_{hlhl} + σ_l² I)

The direct computation of p(y | X) is intractable (O(N³)).
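
Although the density is intractable, sampling from the composition is direct: draw a GP sample at each layer and feed it into the next. A sketch, with illustrative hyper-parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gp_layer(H, tau2=1.0, omega=0.3, jitter=1e-6):
    """Draw a GP sample at the inputs H (one output column per input column)."""
    d2 = ((H[:, None, :] - H[None, :, :]) ** 2).sum(-1)
    K = tau2 * np.exp(-0.5 * d2 / omega ** 2)
    L = np.linalg.cholesky(K + jitter * np.eye(len(H)))
    return L @ rng.standard_normal(H.shape)

X = np.linspace(0, 1, 100)[:, None]
H = X
for _ in range(3):                    # y = h3(h2(h1(X))) + noise, layer by layer
    H = sample_gp_layer(H)
y = H + 0.01 * rng.standard_normal(H.shape)
```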


Page 19: Computational Challenges in Learning Deep GPs

1. Marginalise out all the hidden layers in a Bayesian framework (Titsias et al. 2010):
   - The number of parameters is drastically reduced.
   - The deep network structure can be determined automatically.
2. The direct marginalisation of the h_i is intractable:

p(y | x) = ∫ p(y | h_1) ( ∫ p(h_1 | h_2) p(h_2 | x) dh_2 ) dh_1

p(h_1 | x) = ∫ p(h_1 | f_2) p(f_2 | h_2) p(h_2 | x) dh_2 df_2

where p(f_2 | h_2) contains the non-linear kernel term K_{f2f2}^(−1), making the integral over h_2 intractable.

Page 20: Deep GP Augmented by Inducing Variables: An Example

[Figure: the three-layer deep GP (f_1 ~ GP, f_2 ~ GP, f_3 ~ GP), with each layer augmented by its own set of inducing variables.]

Page 21: Variational (Compression) Inference for Deep GPs

Augment each layer h_i with a set of inducing variables u_i and apply Bayesian variational inference within each layer. The bound on the conditional probability is:

p(y, {h_i}_{i=1}^l | {u_i}_{i=1}^l, x) ≥ p(y | h_l, u_l) Π_{i=2}^l p(h_i | h_{i−1}, u_i) p(h_1 | x, u_1) × exp( Σ_{i=1}^l −(1/(2σ_i²)) tr(Σ_i) )

p(h_i | u_i, h_{i−1}) = N( h_i | K_{hiui} K_{uiui}^(−1) u_i, σ_i² I )

Σ_i = K_{hihi} − K_{hiui} K_{uiui}^(−1) K_{uihi}

Page 22: Variational Compression for Deep GP (3)

Given x and a fixed q(u_1) = N(u_1 | m_1, S_1), compute

q(h_1) = ∫ p(h_1 | u_1, x) q(u_1) du_1

Given q(h_1), we can variationally propagate using q(u_2) and marginalise out h_1:

log p(h_2 | x, u_2) ≥ −⟨(1/(2σ_2²)) tr(Σ_1)⟩_{q(h_1)} − (1/(2σ_1²)) tr(Σ_0)
    − KL( q(u_1) ‖ p(u_1) ) + log N( h_2 | Ψ_2 K_{u2u2}^(−1) u_2, σ_2² I )
    − (1/σ_2²) tr( (Φ_2 − Ψ_2ᵀ Ψ_2) K_{u2u2}^(−1) u_2 u_2ᵀ K_{u2u2}^(−1) )

Page 23: The Marginal Likelihood Bound

Continuing to feed forward to the bottom layer, using the variational propagation at each layer, the marginal likelihood bound is

log p(y | X) ≥ −Σ_{i=2}^l (1/(2σ_i²)) ( ψ_i − tr(Φ_i K_{uiui}^(−1)) ) − (1/(2σ_1²)) tr(Σ_1)
    − Σ_{i=1}^l KL( q(u_i) ‖ p(u_i) ) + log N( y | Ψ_l K_{ulul}^(−1) m_l, σ_l² I )
    − Σ_{i=1}^l (1/σ_i²) tr( (Φ_i − Ψ_iᵀ Ψ_i) K_{uiui}^(−1) ⟨u_i u_iᵀ⟩_{q(u_i)} K_{uiui}^(−1) )

where Φ_i = ⟨K_{uihi} K_{hiui}⟩_{q(h_{i−1})},  Ψ_i = ⟨K_{hiui}⟩_{q(h_{i−1})},  ψ_i = ⟨tr(K_{hihi})⟩_{q(h_{i−1})}
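
The Ψ_i, Φ_i, ψ_i statistics are kernel expectations under q(h_{i−1}); for the exponentiated-quadratic kernel they have closed forms, but a Monte Carlo sketch (names and interface are illustrative) makes explicit what is being computed:

```python
import numpy as np

def psi_stats_mc(mu, var, Z, kern, n_samples=1000):
    """MC estimates of psi = <tr Khh>, Psi = <Khu>, Phi = <Kuh Khu>
    under q(h) = N(mu, diag(var)), with inducing inputs Z."""
    rng = np.random.default_rng(0)
    N, M = len(mu), len(Z)
    psi, Psi, Phi = 0.0, np.zeros((N, M)), np.zeros((M, M))
    for _ in range(n_samples):
        H = mu + np.sqrt(var) * rng.standard_normal(mu.shape)   # sample q(h)
        Khu = kern(H, Z)
        psi += np.trace(kern(H, H)) / n_samples
        Psi += Khu / n_samples
        Phi += Khu.T @ Khu / n_samples
    return psi, Psi, Phi
```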


Page 24: VC for Deep GP - Points

1. All the terms of the given bound are tractable, including the KL term.
2. However, tractability depends on the selected covariance function (as in the GP-LVM) and on how easily it can be convolved with q(h_i).
3. A gradient-based optimisation method can be used to maximise the final form of the variational lower bound (see the sketch below) w.r.t.:
   - model parameters: {σ_i², θ_i}_{i=2}^{l+1}
   - variational parameters: {Z_i, m_i, S_i, μ_{i+1}, Σ_{i+1}}_{i=1}^l
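
A hedged sketch of the optimisation pattern: flatten all free parameters into one vector and hand the negative bound to an off-the-shelf optimiser. Here the titsias_bound sketch from Page 16 stands in for the deep-GP bound; the parameterisation is an illustrative assumption:

```python
import numpy as np
from scipy.optimize import minimize

def neg_bound(theta, X, y, M):
    """Negative lower bound as a function of a flat parameter vector."""
    q = X.shape[1]
    Z = theta[:M * q].reshape(M, q)                   # inducing inputs
    log_tau2, log_omega, log_sigma2 = theta[M * q:]   # log hyper-parameters
    kern = lambda A, B: np.exp(log_tau2) * np.exp(
        -0.5 * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1) / np.exp(log_omega) ** 2)
    return -titsias_bound(y, X, Z, kern, np.exp(log_sigma2))

# theta0 = np.concatenate([Z0.ravel(), np.zeros(3)])
# res = minimize(neg_bound, theta0, args=(X, y, M), method="L-BFGS-B")
```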


Page 25: Elliptic PDE Example

−∇·( a(ω, x) ∇u(ω, x) ) = f(·)  in D
u(ω, x) = 0  on ∂D

The physical domain is D = [0, 1]². The log-conductivity Z(ω, x) = log(a(ω, x)) is a Gaussian random field with covariance

C(x_1, x_2) = σ_rf² exp( −Σ_{i=1}^{k_I} (x_{1,i} − x_{2,i})² / λ )

We generate N = 250 realisations of Z by truncating the KLE at q_1 = 50 terms, with λ = 0.1.

The boundary problem is then solved with FEM over a 16 × 16 grid; the response is observed on a 20 × 20 grid.
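
A sketch of drawing one realisation of Z = log(a) via a truncated (discrete) Karhunen-Loève expansion on a grid; the eigendecomposition of the gridded covariance stands in for the continuous KLE, and σ_rf² = 1 is an assumed value:

```python
import numpy as np

n, n_terms, lam = 16, 50, 0.1
g = np.linspace(0.0, 1.0, n)
grid = np.stack(np.meshgrid(g, g), -1).reshape(-1, 2)       # n^2 points in D
d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(-1)
C = np.exp(-d2 / lam)                                       # sigma_rf^2 = 1 (assumed)
w, V = np.linalg.eigh(C)                                    # discrete KLE
w, V = w[::-1][:n_terms], V[:, ::-1][:, :n_terms]           # 50 leading terms
xi = np.random.default_rng(0).standard_normal(n_terms)      # KLE coefficients
Z = V @ (np.sqrt(np.clip(w, 0.0, None)) * xi)               # one realisation of log a
a = np.exp(Z).reshape(n, n)                                 # conductivity field
```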


Page 26: Input & Output Realisations of the Elliptic PDEs

Training data: D = {(Z_r, u_r), r = 1, . . . , 200}

[Figure: four sampled input fields Z_r over [0, 1]² and the four corresponding FEM solution fields u_r.]

Page 27: Hidden-Layer Demonstrations - Elliptic Problem

[Figure: relevance weights of the latent dimensions in Layer 1 and Layer 2, and scatter plots of the dominant latent dimensions at each layer.]

Page 28: Posterior Mean and Variance of Response - Elliptic Problem

[Figure: an input realisation, the posterior mean of the response, and the posterior variance fields over [0, 1]²; the variance is of order 10⁻⁵ to 10⁻³.]

Page 29: Mean of Variance & Variance of Mean - Elliptic Problem

[Figure: mean of the predictive variance and variance of the predictive mean of the response over [0, 1]²; the latter is of order 10⁻⁵.]

Page 30: Flow Through Porous Media

∇·u = 0,    u = −K(x, ω) ∇p,    ∀x ∈ X_s = [0, 1]²
p = 1 − x_1  on ∂X_s

Deterministic solver: mixed FEM on a 20 × 20 grid; the response is observed on a 20 × 20 grid.

G(x, ω) = log(K(x, ω)) is an exponential random field with

COV_G(x_s1, x_s2) = s_G² exp{ −Σ_{k=1}^{k_s} |x_{s1,k} − x_{s2,k}| / λ_k }

We employ the KLE on G and truncate it after 50 terms, with λ_k = 0.1.

Page 31: Input & Output Realisations of the Permeability Problem

400 data points are generated; the first 300 are used for training the deep GP.

[Figure: four sampled log-permeability fields G over [0, 1]² and the four corresponding pressure fields p.]

Page 32: Hidden-Layer Demonstrations - Permeability Problem

A deep GP with 2 hidden layers is fitted to the data, with K = 80 inducing variables.

[Figure: relevance weights of the latent dimensions in Layer 1 and Layer 2, and scatter plots of the dominant latent dimensions at each layer.]

Page 33: Posterior Mean and Variance of Pressure - Permeability Problem

[Figure: posterior mean of the pressure field over [0, 1]² and the corresponding posterior variance fields; the variance is of order 10⁻⁵ to 10⁻².]

Page 34: Mean of Variance & Variance of Mean of Pressure

[Figure: mean of the predictive variance (of order 10⁻³) and variance of the predictive mean (of order 10⁻⁴) of the pressure over the unit square.]

Page 35: Summary

- The final model is not a GP!
- The deep GP provides a probabilistic approximation of the model, which is useful for UQ and also guards against overfitting.
- Deep GPs allow both unsupervised and supervised deep learning.
- With deep GPs, the curse of dimensionality is no longer an issue.
- Variational compression algorithms show promise for scaling these models to massive data sets.
- Sampling is straightforward.