A Fully Nonparametric Modeling Approach to Binary Regression

Maria De Yoreo

Department of Applied Mathematics and Statistics, University of California, Santa Cruz

SBIES, April 27-28, 2012


Outline

1 Introduction

2 Methodology
   Model Formulation
   Posterior Inference

3 Data Illustrations
   Simulation Example
   Atmospheric Measurements
   Credit Card Data

4 Discussion


Motivation

- binary responses along with covariates are present in many settings, including biometrics, econometrics, and social sciences

- Goal: determine the relationship between response and covariates

- examples: credit scoring, medicine, population dynamics, environmental sciences

- the response-covariate relationship is described by the regression function

- standard approaches involve linearity and distributional assumptions, e.g., GLMs


Bayesian Nonparametrics

- Bayesian nonparametrics can be used to relax common distributional assumptions, resulting in flexible regression models with proper uncertainty quantification

- rather than modeling directly the regression function, model the joint distribution of response and covariates using a nonparametric mixture model (West et al., 1994; Müller et al., 1996)

- this implies a form for the conditional response distribution, which is implicitly modeled nonparametrically

- involves random covariates


Latent Variable Formulation

- introduce latent continuous random variables z that determine the binary responses y, so that y = 1 if and only if z > 0 (e.g., Albert and Chib, 1993)

- estimate the joint distribution of latent responses and covariates f(z, x) using a nonparametric mixture model, to obtain flexible inference for the regression function pr(y = 1|x)

- the latent variables may be of interest in some applications, containing more information than just a 0/1 observation

- in biology applications, these may be thought of as maturity, latent survivorship, or measure of health
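The latent variable construction can be illustrated with a small simulation. This is a minimal sketch for the parametric probit case only, assuming a single covariate and hypothetical values `b0`, `b1`, and `x` that do not come from the slides; it checks that thresholding z ~ N(b0 + b1 x, 1) at zero reproduces pr(y = 1|x) = Φ(b0 + b1 x).

```python
import math
import random

def std_normal_cdf(t):
    """Standard normal CDF, Phi(t), via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def simulate_binary(x, b0, b1, rng):
    """Draw z ~ N(b0 + b1*x, 1) and return y = 1 if z > 0, else 0."""
    z = rng.gauss(b0 + b1 * x, 1.0)
    return 1 if z > 0 else 0

rng = random.Random(0)
b0, b1, x = -0.5, 1.2, 1.0          # hypothetical values, for illustration only
n = 20000
freq = sum(simulate_binary(x, b0, b1, rng) for _ in range(n)) / n
exact = std_normal_cdf(b0 + b1 * x)  # pr(y = 1 | x) under the probit model
```

The empirical frequency `freq` should be close to `exact`, which is the sense in which the continuous z "determines" the binary y.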


DP Mixture Model

The Dirichlet process (DP) (Ferguson, 1973) generates random distributions, and can be used as a prior for spaces of distribution functions.

- DP constructive definition (Sethuraman, 1994): if $G \sim \mathrm{DP}(\alpha, G_0)$, then it is almost surely of the form $\sum_{l=1}^{\infty} p_l \delta_{\nu_l}$
  → $\nu_l \overset{iid}{\sim} G_0$, $l = 1, 2, \ldots$
  → $z_r \overset{iid}{\sim} \mathrm{Beta}(1, \alpha)$, $r = 1, 2, \ldots$
  → define $p_1 = z_1$, and $p_l = z_l \prod_{r=1}^{l-1} (1 - z_r)$, for $l = 2, 3, \ldots$

- DP mixture model for the latent responses and covariates

$$f(z, x; G) = \int N_{p+1}(z, x; \mu, \Sigma)\, dG(\mu, \Sigma)$$

$$G \mid \alpha, \psi \sim \mathrm{DP}(\alpha, G_0(\mu, \Sigma; \psi))$$
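The constructive definition translates almost directly into code. The sketch below draws the weights of a truncated stick-breaking representation; the truncation level, the value of α, and the standard normal stand-in for G0 are assumptions made for illustration, not choices taken from the slides.

```python
import random

def stick_breaking_weights(alpha, n_atoms, rng):
    """Weights p_1..p_N of a stick-breaking construction truncated at N atoms."""
    weights = []
    remaining = 1.0                      # running product of (1 - z_r) for r < l
    for l in range(n_atoms):
        # Setting the last stick to 1 makes the truncated weights sum to one.
        z = 1.0 if l == n_atoms - 1 else rng.betavariate(1.0, alpha)
        weights.append(z * remaining)
        remaining *= 1.0 - z
    return weights

rng = random.Random(1)
p = stick_breaking_weights(alpha=2.0, n_atoms=25, rng=rng)
# Atoms nu_l drawn iid from a hypothetical base distribution G0 = N(0, 1).
atoms = [rng.gauss(0.0, 1.0) for _ in p]
```

Larger α spreads the mass over more atoms, while small α concentrates it on the first few.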


Implied Conditional Regression

- From the constructive definition, the model has an a.s. representation as a countable mixture of MVNs

$$f(z, x; G) = \sum_{l=1}^{\infty} p_l N_{p+1}(z, x; \mu_l, \Sigma_l)$$

- Binary regression functional: $\mathrm{pr}(y = 1 \mid x; G)$
  → marginalize over z to obtain f(x; G) and f(y, x; G)

$$f(x; G) = \sum_{l=1}^{\infty} p_l N_p(x; \mu_l^{x}, \Sigma_l^{xx})$$

And the joint distribution

$$f(y, x; G) = \sum_{l=1}^{\infty} p_l N_p(x; \mu_l^{x}, \Sigma_l^{xx})\, \mathrm{Bern}\!\left(y;\ \Phi\!\left(\frac{\mu_l^{z} + \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} (x - \mu_l^{x})}{\left(\Sigma_l^{zz} - \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} \Sigma_l^{xz}\right)^{1/2}}\right)\right)$$


The Regression Function

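- implied regression function: $\mathrm{pr}(y = 1 \mid x; G) = \sum_{l=1}^{\infty} w_l(x) \pi_l(x)$, with covariate-dependent weights

$$w_l(x) \propto p_l N_p(x; \mu_l^{x}, \Sigma_l^{xx})$$

and probabilities

$$\pi_l(x) = \Phi\!\left(\frac{\mu_l^{z} + \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} (x - \mu_l^{x})}{\left(\Sigma_l^{zz} - \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} \Sigma_l^{xz}\right)^{1/2}}\right)$$

- Notice that the probabilities have the probit form with component-specific intercept and slope parameters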
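To make the weighted-probit structure concrete, here is a minimal sketch that evaluates the regression function for a finite mixture with a single covariate (p = 1), so that all matrix operations reduce to scalars. The two components and their parameter values are hypothetical.

```python
import math

def phi_cdf(t):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def normal_pdf(x, mean, var):
    """Univariate normal density with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def regression_function(x, components):
    """pr(y = 1 | x) = sum_l w_l(x) * pi_l(x) for a finite mixture, p = 1."""
    num, den = 0.0, 0.0
    for c in components:
        w = c["p"] * normal_pdf(x, c["mu_x"], c["s_xx"])        # unnormalized w_l(x)
        cond_mean = c["mu_z"] + c["s_zx"] / c["s_xx"] * (x - c["mu_x"])
        cond_sd = math.sqrt(c["s_zz"] - c["s_zx"] ** 2 / c["s_xx"])
        num += w * phi_cdf(cond_mean / cond_sd)                  # w_l(x) * pi_l(x)
        den += w
    return num / den

# Two hypothetical mixture components.
components = [
    {"p": 0.6, "mu_z": -1.0, "mu_x": -1.0, "s_zz": 1.0, "s_zx": 0.5, "s_xx": 1.0},
    {"p": 0.4, "mu_z": 1.0, "mu_x": 2.0, "s_zz": 1.0, "s_zx": -0.3, "s_xx": 0.5},
]
curve = [(k / 2.0, regression_function(k / 2.0, components)) for k in range(-6, 9)]
```

Because the weights depend on x, the curve can be non-monotone even though each component is a monotone probit curve.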


Identifiability

Can the entire covariance matrix Σ be estimated?

- Probit regression: $z \sim N(x^{T}\beta, 1)$

- the binary responses are not able to inform about the scale of the latent responses

- retaining $\Sigma^{zx}$ is important: if we set it to 0, then $\pi_l(x)$ becomes just $\pi_l$

- We have shown that if $\Sigma^{zz}$ is fixed, the remaining parameters are identifiable in the kernel of the mixture model for y and x
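The scale problem can be checked numerically. In this sketch (p = 1, hypothetical parameter values), rescaling the latent response by a constant c multiplies μ^z and Σ^zx by c and Σ^zz by c², yet leaves the component probability π(x) unchanged, so the binary data cannot distinguish the two parameter sets.

```python
import math

def probit_prob(x, mu_z, mu_x, s_zz, s_zx, s_xx):
    """Component probability pi(x) for a bivariate normal kernel (p = 1)."""
    cond_mean = mu_z + s_zx / s_xx * (x - mu_x)
    cond_sd = math.sqrt(s_zz - s_zx ** 2 / s_xx)
    return 0.5 * (1.0 + math.erf(cond_mean / cond_sd / math.sqrt(2.0)))

c = 3.0  # arbitrary rescaling of the latent response z
base = probit_prob(0.4, mu_z=0.5, mu_x=0.0, s_zz=1.0, s_zx=0.6, s_xx=2.0)
scaled = probit_prob(0.4, mu_z=c * 0.5, mu_x=0.0,
                     s_zz=c ** 2 * 1.0, s_zx=c * 0.6, s_xx=2.0)
```

Fixing Σ^zz removes this one-parameter family of equivalent kernels.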


Facilitating Identifiability

How to fix only one element of the covariance matrix?

- the usual inverse-Wishart distribution will not work

- square-root-free Cholesky decomposition of Σ uses the relationship $\Delta = \beta \Sigma \beta^{T}$, with ∆ diagonal with all elements $\delta_i > 0$, and β lower triangular with 1 on its diagonal (Daniels and Pourahmadi, 2002; Webb and Forster, 2007)

- For $y = (y_1, \ldots, y_m) \sim N(\mu, \Sigma)$, with $\Delta = \beta \Sigma \beta^{T}$, the joint distribution for y can be expressed in a recursive form:

$$y_1 \sim N(\mu_1, \delta_1), \qquad (y_k \mid y_1, \ldots, y_{k-1}) \sim N\!\Big(\mu_k - \sum_{j=1}^{k-1} \beta_{k,j} (y_j - \mu_j),\ \delta_k\Big), \quad k = 2, \ldots, m$$

  → useful for modeling longitudinal data and specifying conditional independence assumptions
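As a check on the relationship ∆ = βΣβ^T, the sketch below computes the factors for a small covariance matrix in plain Python. It assumes one standard route: factor Σ = L D L^T with L unit lower triangular, then take β = L^{-1} and ∆ = D. The 3×3 matrix is a hypothetical example.

```python
def ldl(sigma):
    """Return (L, d) with sigma = L diag(d) L^T and L unit lower triangular."""
    m = len(sigma)
    L = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    d = [0.0] * m
    for j in range(m):
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, m):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - s) / d[j]
    return L, d

def invert_unit_lower(L):
    """Inverse of a unit lower triangular matrix by forward substitution."""
    m = len(L)
    inv = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for i in range(m):
        for j in range(i):
            inv[i][j] = -sum(L[i][k] * inv[k][j] for k in range(j, i))
    return inv

def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Hypothetical covariance matrix, ordered as (z, x1, x2).
sigma = [[1.0, 0.6, 0.2],
         [0.6, 2.0, 0.5],
         [0.2, 0.5, 1.5]]
L, delta = ldl(sigma)
beta = invert_unit_lower(L)
check = matmul(matmul(beta, sigma), transpose(beta))  # should equal diag(delta)
```

Note that `delta[0]` equals the first diagonal element of `sigma`, which is what makes it possible to fix that single element.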


Facilitating Identifiability

- here, no natural ordering is present, but the parameterization has other useful properties which we exploit

- $\delta_1 = \Sigma^{zz}$
  → fix $\delta_1$, and mix on $\delta_2, \ldots, \delta_{p+1}$ and the $p(p + 1)/2$ free elements of β, denoted by the vector $\tilde{\beta}$

Then the DP mixture model becomes

$$f(z, x; G) = \int N_{p+1}(z, x; \mu, \beta^{-1} \Delta \beta^{-T})\, dG(\mu, \beta, \Delta)$$

- computationally convenient: there exist conjugate prior distributions for $\tilde{\beta}$ and $\delta_2, \ldots, \delta_{p+1}$, which are MVN and (independent) inverse-gamma
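For p = 1 the map from (β̃, δ) back to Σ can be written out by hand: with β = [[1, 0], [b, 1]], Σ = β^{-1}∆β^{-T} has Σ^zz = δ1, Σ^zx = −b δ1, and Σ^xx = b² δ1 + δ2. The sketch below draws the free parameters from a normal and an inverse-gamma distribution with hypothetical hyperparameters and confirms that Σ^zz stays at its fixed value while Σ remains positive definite.

```python
import random

def covariance_from_factors(b, delta1, delta2):
    """Sigma = beta^{-1} Delta beta^{-T} for beta = [[1, 0], [b, 1]] (p = 1)."""
    s_zz = delta1
    s_zx = -b * delta1
    s_xx = b * b * delta1 + delta2
    return s_zz, s_zx, s_xx

rng = random.Random(4)
DELTA1 = 1.0  # fixed value of delta_1 = Sigma_zz (hypothetical choice)
samples = []
for _ in range(5):
    b = rng.gauss(0.0, 1.0)                           # hypothetical normal draw for the free element of beta
    delta2 = 1.0 / rng.gammavariate(3.0, 1.0 / 2.0)   # inverse-gamma(3, 2) draw, hypothetical hyperparameters
    samples.append(covariance_from_factors(b, DELTA1, delta2))
```

Any real b and any δ2 > 0 give a valid covariance matrix, since its determinant is δ1 δ2.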


Hierarchical Model

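Blocked Gibbs sampler: truncate $G$ to $G_N(\cdot) = \sum_{l=1}^{N} p_l \delta_{W_l}(\cdot)$, with $W_l = (\mu_l, \tilde{\beta}_l, \Delta_l)$, and introduce configuration variables $(L_1, \ldots, L_n)$ taking values in $\{1, \ldots, N\}$.

$$y_i \mid z_i \overset{ind}{\sim} 1(y_i = 1)\,1(z_i > 0) + 1(y_i = 0)\,1(z_i \le 0), \quad i = 1, \ldots, n$$

$$(z_i, x_i) \mid W, L_i \overset{ind}{\sim} N_{p+1}\big((z_i, x_i); \mu_{L_i}, \beta_{L_i}^{-1} \Delta_{L_i} \beta_{L_i}^{-T}\big), \quad i = 1, \ldots, n$$

$$L_i \mid p \sim \sum_{l=1}^{N} p_l \delta_l(L_i), \quad i = 1, \ldots, n$$

$$W_l \mid \psi \overset{ind}{\sim} N_{p+1}(\mu_l; m, V)\, N_q(\tilde{\beta}_l; \theta, cI) \prod_{i=2}^{p+1} \mathrm{IG}(\delta_{i,l}; \nu_i, s_i), \quad l = 1, \ldots, N$$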
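The first two levels of the hierarchy imply that, given y_i, x_i, and the assigned component, z_i follows a normal distribution restricted to z_i > 0 or z_i ≤ 0. The slides do not spell out this update, so the following is only a sketch of one common approach in the spirit of Albert and Chib (1993): an inverse-CDF draw from the truncated normal, with hypothetical values for the conditional mean and standard deviation.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf and inv_cdf

def draw_latent(y, mean, sd, rng):
    """Draw z ~ N(mean, sd^2) truncated to z > 0 if y == 1, or z <= 0 if y == 0."""
    p0 = _STD.cdf((0.0 - mean) / sd)         # pr(z <= 0) before truncation
    u = rng.random()
    if y == 1:
        q = p0 + u * (1.0 - p0)              # uniform on the upper part of the CDF
    else:
        q = u * p0                           # uniform on the lower part of the CDF
    q = min(max(q, 1e-12), 1.0 - 1e-12)      # keep inv_cdf inside its open domain
    return mean + sd * _STD.inv_cdf(q)

rng = random.Random(7)
# Hypothetical conditional mean and sd of z given x and the component.
pos = [draw_latent(1, -0.5, 1.0, rng) for _ in range(1000)]
neg = [draw_latent(0, -0.5, 1.0, rng) for _ in range(1000)]
```

In a full sampler, `mean` and `sd` would be the conditional moments of z given x under component L_i; far in the tails a more careful truncated-normal sampler would be preferable.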


Posterior Inference

- Gibbs sampling may be used to simulate from the full posterior $p(W, L, p, \psi, \alpha, z \mid \text{data})$, with the conditionally conjugate base distribution, and conjugate priors on ψ and α.

- The posterior for $G_N = (p, W)$ is imputed in the MCMC, enabling full inference for any functional of $f(z, x; G_N)$, now a finite sum

- Binary regression functional: for any covariate value $x_0$, at iteration r of the MCMC, calculate $\mathrm{pr}(y = 1 \mid x_0; G_N^{(r)})$
  → provides point estimate and uncertainty quantification for regression function

- Same can be done for other functionals, such as the latent response distribution $f(z \mid x_0; G_N)$ at any covariate value $x_0$
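The step from posterior draws of G_N to a point estimate with an uncertainty band can be sketched as follows, for p = 1. The "draws" here are random perturbations of a hypothetical two-component mixture standing in for MCMC output; they are not real posterior samples, and the function and variable names are illustrative.

```python
import math
import random

def phi_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def regression_function(x, comps):
    """pr(y = 1 | x; G_N); comps holds tuples (p, mu_z, mu_x, s_zz, s_zx, s_xx)."""
    num, den = 0.0, 0.0
    for p, mu_z, mu_x, s_zz, s_zx, s_xx in comps:
        w = p * normal_pdf(x, mu_x, s_xx)
        cond_mean = mu_z + s_zx / s_xx * (x - mu_x)
        cond_sd = math.sqrt(s_zz - s_zx ** 2 / s_xx)
        num += w * phi_cdf(cond_mean / cond_sd)
        den += w
    return num / den

def summarize(draws, x0, level=0.95):
    """Pointwise posterior mean and equal-tail interval of pr(y = 1 | x0; G_N)."""
    values = sorted(regression_function(x0, g) for g in draws)
    mean = sum(values) / len(values)
    lo_idx = int(math.floor((1.0 - level) / 2.0 * (len(values) - 1)))
    hi_idx = int(math.ceil((1.0 + level) / 2.0 * (len(values) - 1)))
    return mean, values[lo_idx], values[hi_idx]

# Stand-in for MCMC output: perturbed copies of a hypothetical mixture.
rng = random.Random(2)
draws = []
for _ in range(200):
    shift = rng.gauss(0.0, 0.1)
    draws.append([(0.5, -1.0 + shift, -1.0, 1.0, 0.4, 1.0),
                  (0.5, 1.0 + shift, 1.5, 1.0, 0.4, 1.0)])
mean, lower, upper = summarize(draws, x0=0.0)
```

Repeating `summarize` over a grid of x0 values would trace out an estimated regression curve with pointwise interval bands.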


Simulated Data

- Data $\{(z_i, x_i) : i = 1, \ldots, n\}$ were simulated from a mixture of 3 bivariate normals, and y determined from z.

- compare inference from the binary regression model with data (y, x) to that from the model which views (z, x) as data

- a practical prior specification approach which is appropriate when little is known about the problem is applied here

- to specify priors on ψ, consider only one mixture component and use an approximate center and range of the data, as well as prior simulation to induce an approximate unif(−1, 1) prior on corr(z, x)
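A data set of this kind can be generated in a few lines. The mixture weights, means, and covariances below are hypothetical, since the slides do not list the values used, and the sample size is arbitrary; the sketch draws x from its marginal, then z from its conditional normal given x, and sets y = 1 when z > 0.

```python
import math
import random

# Hypothetical mixture: (weight, mu_z, mu_x, Sigma_zz, Sigma_zx, Sigma_xx).
COMPONENTS = [
    (0.4, -1.0, -1.0, 1.0, 0.6, 1.0),
    (0.3, 1.5, 1.0, 1.0, -0.4, 0.8),
    (0.3, 0.0, 3.0, 1.0, 0.2, 0.5),
]

def simulate(n, rng):
    """Return a list of (z, x, y) triples from the bivariate normal mixture."""
    weights = [c[0] for c in COMPONENTS]
    data = []
    for _ in range(n):
        _, mu_z, mu_x, s_zz, s_zx, s_xx = rng.choices(COMPONENTS, weights=weights)[0]
        x = rng.gauss(mu_x, math.sqrt(s_xx))
        cond_mean = mu_z + s_zx / s_xx * (x - mu_x)   # E(z | x) within the component
        cond_var = s_zz - s_zx ** 2 / s_xx            # Var(z | x) within the component
        z = rng.gauss(cond_mean, math.sqrt(cond_var))
        data.append((z, x, 1 if z > 0 else 0))
    return data

rng = random.Random(3)
full = simulate(200, rng)                    # what the model viewing (z, x) as data sees
observed = [(y, x) for _, x, y in full]      # what the binary regression model sees
```

Fitting one model to `full` and the other to `observed` gives the comparison described above.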


[Figure: two panels plotting Pr(z>0|x;G) (left) and Pr(y=1|x;G) (right) against x.]

The inference for pr(z > 0|x; G) (left) is compared to that for pr(y = 1|x; G) (right) and the truth (solid line).


[Figure: two rows of three panels, each row plotting f(z|x=x1), f(z|x=x2), and f(z|x=x3) against z.]

Top row: Inference for f(z|x0; G) under the model which views z as observed, with true densities as dashed lines, at 3 values of x0. Bottom: Inference from the binary regression model.


Ozone and Wind Speed

- 111 daily measurements of wind speed (mph) and ozone concentration (parts per billion) in NYC over a 4-month period

- objective: model the probability of exceeding a certain ozone concentration as a function of wind speed

- the model only sees whether or not there was an exceedance, but there is an actual ozone concentration underlying this 0/1 value
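The discretization described above can be sketched as follows. The readings are hypothetical values chosen for illustration, not the NYC measurements, and treating `ozone - THRESHOLD` as the latent variable is only one way to read the correspondence (the model's own latent z need not be on this scale). The threshold of 70 ppb is the one quoted with the ozone figure.

```python
import numpy as np

THRESHOLD = 70.0  # ppb


def exceedance(ozone, threshold=THRESHOLD):
    """Return 1 where the ozone reading exceeds the threshold, else 0."""
    ozone = np.asarray(ozone, dtype=float)
    return (ozone > threshold).astype(int)


# Hypothetical readings, for illustration only.
ozone = np.array([30.0, 95.0, 70.0, 110.0, 45.0])
y = exceedance(ozone)

# A continuous quantity underlying each 0/1 value: y = 1 exactly when z > 0.
z = ozone - THRESHOLD
```

Only `y` and the wind speeds would be passed to the binary regression model; `z` is kept aside for comparison, as in the right-hand panel of the ozone figure.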


[Figure: probability of ozone exceedance against wind speed (left); ozone concentration against wind speed (right).]

Left: The probability that ozone concentration (parts per billion) exceeds a threshold of 70 decreases with wind speed (mph). Right: For comparison, the actual non-discretized ozone measurements as a function of wind speed.


[Figure: four panels plotting f(z | x0) against z.]

Estimates for f(z | x0; G) at wind speed values of 5, 8, 10, and 15 mph.


Credit Cards and Income

- n = 100 subjects in a study were asked whether or not they owned a travel credit card, and their income was recorded (Agresti, 1996)

- In this situation, it is not clear that the latent continuous random variables have a meaningful interpretation, but we can still use the method for regression

- Does the probability of owning a credit card change with income?


[Figure: Pr(y = 1 | x; G) plotted against income in thousands, with data points.]

The probability of owning a credit card appears to increase with income, with a slight dip or leveling off around incomes of 40-50, since none of the subjects in that region owned a credit card.


Extensions to Ordinal Responses

- similar methodology, wider range of applications

- for an ordinal response with C categories, assume y = j if and only if γ_{j−1} < z ≤ γ_j, for j = 1, ..., C, and apply the same DP mixture of MVNs for (z, x)

- for fixed cut-off points γ, it can be shown that all of µ and Σ are identifiable in the induced kernel for the observables

- the C − 1 free cut-off points can be fixed to arbitrary increasing values (Kottas et al., 2005), which is an advantage computationally
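To make the cut-off rule concrete, the sketch below computes Pr(y = j | x) for a single bivariate normal component on (z, x), using the standard conditional normal distribution of z given x. It is an illustration under assumptions: the function name `ordinal_probs`, the parameter values in the usage note, and the convention γ_0 = −∞, γ_C = +∞ are choices made here, and the full model would mix such kernels over the DP rather than use one component.

```python
from math import erf, inf, sqrt


def norm_cdf(t):
    """Standard normal CDF; erf handles +/- infinity."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))


def ordinal_probs(x0, mu, Sigma, cuts):
    """Pr(y = j | x = x0) for j = 1..C under one bivariate normal on (z, x).

    mu    : (mu_z, mu_x)
    Sigma : ((s_zz, s_zx), (s_zx, s_xx))
    cuts  : increasing interior cut-offs gamma_1 < ... < gamma_{C-1}
    """
    mu_z, mu_x = mu
    (s_zz, s_zx), (_, s_xx) = Sigma
    cond_mean = mu_z + s_zx / s_xx * (x0 - mu_x)
    cond_sd = sqrt(s_zz - s_zx ** 2 / s_xx)
    gammas = [-inf, *cuts, inf]  # assumed convention for gamma_0 and gamma_C
    cdf = [norm_cdf((g - cond_mean) / cond_sd) for g in gammas]
    # y = j iff gamma_{j-1} < z <= gamma_j, so each probability is a CDF difference.
    return [upper - lower for lower, upper in zip(cdf[:-1], cdf[1:])]
```

For example, `ordinal_probs(0.5, (0.0, 0.0), ((1.0, 0.6), (0.6, 1.0)), [-1.0, 0.0, 1.0])` returns four probabilities that sum to one. With the single cut-off `[0.0]`, the second entry is Pr(z > 0 | x), which recovers the binary case.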


Other Extensions

- multivariate ordinal responses: J ordinal responses associated with a vector of covariates for each subject, with C_j categories associated with the jth response

- several applications, but limited existing methods for flexible inference

- y and z are vectors, and y_j = l if and only if γ_{j,l−1} < z_j ≤ γ_{j,l}, for j = 1, ..., J and l = 1, ..., C_j

- if C_j > 2 for all j, then no identifiability restrictions are needed

- if C_j = 2 for some j, then the (β, ∆) parameterization can be used, and fixing certain elements of δ provides the necessary restrictions

- mixed ordinal-continuous responses
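The mapping from latent vectors to multivariate ordinal responses can be sketched as below. The function name `latent_to_ordinal` and the example cut-offs are hypothetical; each response j gets its own increasing list of C_j − 1 interior cut-offs, with the outer cut-offs taken to be −∞ and +∞.

```python
import numpy as np


def latent_to_ordinal(z, cutoffs):
    """Map latent vectors to ordinal responses.

    z       : array of shape (n, J), one latent vector per subject
    cutoffs : list of J increasing sequences; cutoffs[j] holds the
              C_j - 1 interior cut-offs for response j
    Returns an integer array y of shape (n, J) with y[i, j] in 1..C_j.
    """
    z = np.asarray(z, dtype=float)
    y = np.empty(z.shape, dtype=int)
    for j, cuts in enumerate(cutoffs):
        # side="left" gives the index i with cuts[i-1] < z <= cuts[i],
        # matching the rule gamma_{j,l-1} < z_j <= gamma_{j,l}.
        y[:, j] = np.searchsorted(cuts, z[:, j], side="left") + 1
    return y
```

With J = 2, cut-offs `[[0.0, 1.0], [-1.0, 0.0, 1.0]]` (so C_1 = 3 and C_2 = 4), a latent value sitting exactly on a cut-off is assigned to the lower category, as the ≤ in the rule requires.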


Conclusions

- Binary responses measured along with covariates represent a simple setting, but the scope of problems that fall in this category is large.

- This framework allows flexible, nonparametric inference for the regression relationship in a general binary regression problem.

- The methodology extends easily to larger classes of problems in ordinal regression, including multivariate responses and mixed responses, making the framework much more powerful, with utility in a wide variety of applications.
