
MAXIMUM LIKELIHOOD ESTIMATION OF REGULARISATION PARAMETERS
Ana Fernandez Vidal, Dr. Marcelo Pereyra ([email protected], [email protected])
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, United Kingdom

Imaging Inverse Problems

OBJECTIVE: to estimate an unknown image π‘₯ from an observation 𝑦.

𝑦 = 𝐴π‘₯ + 𝑧, where π‘₯ ∈ ℝ𝑛 is the UNKNOWN IMAGE, 𝑦 ∈ β„π‘š is the OBSERVATION, 𝐴 is a LINEAR OPERATOR, and the NOISE 𝑧 ~ 𝑁(0, σ²𝐼).

Some canonical examples:
β€’ image deconvolution
β€’ compressive sensing
β€’ super-resolution
β€’ tomographic reconstruction
β€’ inpainting

CHALLENGE: not enough information in 𝑦 to accurately estimate π‘₯.

For example, in many imaging problems 𝑦 = 𝐴π‘₯ + 𝑧 where the operator 𝐴 is either:
β€’ RANK-DEFICIENT β†’ not enough equations to solve for π‘₯, or
β€’ has SMALL EIGENVALUES β†’ noise amplification and unreliable estimates of π‘₯.
Moreover, these are high-dimensional problems: 𝑛 ~ 10⁢.

REGULARISATION: We can render the problem well-posed by using prior knowledge about the unknown signal π‘₯.

How do we choose the regularisation parameter πœƒ? The regularisation parameter controls how much importance we give to prior knowledge and to the observations, depending on how ill-posed the problem is and on the intensity of the noise.


POSSIBLE APPROACHES FOR CHOOSING πœƒ:

NON-BAYESIAN
β€’ Cross-validation β†’ exhaustive search method
β€’ Discrepancy principle
β€’ Stein-based methods β†’ minimise Stein's Unbiased Risk Estimator (SURE), a surrogate of the MSE
β€’ SUGAR β†’ a more efficient algorithm that uses the gradient of SURE
Limitation: mostly designed for denoising problems; difficult to implement.

BAYESIAN
β€’ Hierarchical β†’ propose a prior for πœƒ and work with the hierarchical model
β€’ Marginalisation β†’ remove πœƒ from the model: 𝑝(π‘₯|𝑦) = ∫_{ℝ⁺} 𝑝(π‘₯, πœƒ|𝑦) π‘‘πœƒ
Limitation: only for homogeneous πœ‘(π‘₯) or cases with known 𝐢(πœƒ) (see the note after this list).
β€’ Empirical Bayes β†’ choose πœƒ by maximising the marginal likelihood 𝑝(𝑦|πœƒ)
Difficulty: 𝑝(𝑦|πœƒ) becomes intractable in high-dimensional problems.
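To see why homogeneity helps, note that when πœ‘ is Ξ±-positively homogeneous the normalising constant is available in closed form; a minimal derivation (a standard change-of-variables argument, not taken verbatim from the poster):

```latex
\text{If } \varphi(tx)=t^{\alpha}\varphi(x)\ \text{for all } t>0
\ \text{(e.g. } \alpha=1 \text{ for the TV norm), substituting } u=\theta^{1/\alpha}x \text{ gives}
\quad
C(\theta)=\int_{\mathbb{R}^n} e^{-\theta\varphi(x)}\,dx
=\theta^{-n/\alpha}\!\int_{\mathbb{R}^n} e^{-\varphi(u)}\,du
=\theta^{-n/\alpha}\,C(1),
\qquad
\frac{d}{d\theta}\log C(\theta)=-\frac{n}{\alpha\,\theta}.
```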

Our strategy: Empirical Bayes

The challenge is that 𝑝(𝑦|πœƒ) is intractable as it involves solving two integrals in ℝ𝑛 (to marginalise π‘₯ and to compute 𝐢(πœƒ)). This makes the computation of πœƒ_MLE extremely difficult.

OUR CONTRIBUTION
We propose a stochastic optimisation scheme to compute the maximum marginal likelihood estimator of the regularisation parameter. Novelty: the optimisation is driven by proximal Markov chain Monte Carlo (MCMC) samplers.

We want to find πœƒ that maximises the marginal likelihood 𝑝(𝑦|πœƒ):

𝑝(𝑦|πœƒ) = ∫_{ℝ𝑛} 𝑝(𝑦, π‘₯|πœƒ) 𝑑π‘₯,  πœƒ_MLE = argmax_{πœƒ ∈ Θ} 𝑝(𝑦|πœƒ)

We should be able to reliably estimate πœƒ from 𝑦, as 𝑦 is very high-dimensional and πœƒ is a scalar parameter.

Proposed stochastic optimisation algorithm
If 𝑝(𝑦|πœƒ) were tractable, we could use a standard projected gradient algorithm:

πœƒ_{t+1} = Proj_Θ[πœƒ_t + Ξ΄_t (d/dπœƒ) log 𝑝(𝑦|πœƒ_t)]

where the step sizes Ξ΄_t verify the usual stochastic-approximation conditions (a decreasing sequence, e.g. with βˆ‘_t Ξ΄_t = ∞ and βˆ‘_t Ξ΄_tΒ² < ∞).

To tackle the intractability, we propose a stochastic variant of this algorithm based on a noisy estimate of (d/dπœƒ) log 𝑝(𝑦|πœƒ_t). Using Fisher's identity we have:

(d/dπœƒ) log 𝑝(𝑦|πœƒ) = E_{π‘₯|𝑦,πœƒ}[(d/dπœƒ) log 𝑝(π‘₯, 𝑦|πœƒ)] = E_{π‘₯|𝑦,πœƒ}[βˆ’πœ‘(π‘₯)] βˆ’ (d/dπœƒ) log 𝐢(πœƒ)

If 𝐢(πœƒ) is unknown we can use the identity βˆ’(d/dπœƒ) log 𝐢(πœƒ) = E_{π‘₯|πœƒ}[πœ‘(π‘₯)].

The intractable gradient becomes:

(d/dπœƒ) log 𝑝(𝑦|πœƒ) = E_{π‘₯|𝑦,πœƒ}[βˆ’πœ‘(π‘₯)] + E_{π‘₯|πœƒ}[πœ‘(π‘₯)]

Now we can approximate E_{π‘₯|𝑦,πœƒ}[βˆ’πœ‘(π‘₯)] and E_{π‘₯|πœƒ}[πœ‘(π‘₯)] with Monte Carlo estimates.
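Both identities follow by differentiating under the integral sign; a short derivation using only the definitions above:

```latex
p(y\mid\theta)=\int p(x,y\mid\theta)\,dx
\;\Longrightarrow\;
\frac{d}{d\theta}\log p(y\mid\theta)
=\int \frac{\tfrac{d}{d\theta}\,p(x,y\mid\theta)}{p(y\mid\theta)}\,dx
=\mathbb{E}_{x\mid y,\theta}\!\left[\frac{d}{d\theta}\log p(x,y\mid\theta)\right],
```
```latex
C(\theta)=\int e^{-\theta\varphi(x)}\,dx
\;\Longrightarrow\;
-\frac{d}{d\theta}\log C(\theta)
=\frac{\int \varphi(x)\,e^{-\theta\varphi(x)}\,dx}{C(\theta)}
=\mathbb{E}_{x\mid\theta}\big[\varphi(x)\big].
```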

STOCHASTIC APPROXIMATION PROXIMAL GRADIENT (SAPG) ALGORITHM
We construct a stochastic approximation proximal gradient (SAPG) algorithm driven by two Markov kernels M_πœƒ and K_πœƒ targeting the posterior 𝑝(π‘₯|𝑦, πœƒ) and the prior 𝑝(π‘₯|πœƒ) respectively:

𝑋_{t+1} ~ M_{πœƒ_t}(Β· | 𝑦, 𝑋_t)  (posterior chain)
π‘ˆ_{t+1} ~ K_{πœƒ_t}(Β· | π‘ˆ_t)  (prior chain)
πœƒ_{t+1} = Proj_Θ[πœƒ_t + Ξ΄_t(πœ‘(π‘ˆ_{t+1}) βˆ’ πœ‘(𝑋_{t+1}))]  (STOCHASTIC GRADIENT step)

The iterates CONVERGE JOINTLY to the EMPIRICAL BAYES POSTERIOR 𝑝(π‘₯|𝑦, πœƒ_MLE), the prior 𝑝(π‘₯|πœƒ_MLE), and πœƒ_MLE. A sketch of this iteration follows.
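A minimal Python sketch of the SAPG iteration. It is illustrative only: the kernel callables posterior_step and prior_step, the step-size schedule, and the interval Θ = [theta_min, theta_max] are assumptions, not the paper's exact settings.

```python
import numpy as np

def sapg(y, theta0, phi, posterior_step, prior_step, x0, u0,
         n_iter=1000, delta0=10.0, theta_min=1e-3, theta_max=1.0):
    """Sketch of the SAPG iteration: two MCMC chains supply a noisy
    estimate of d/dtheta log p(y|theta) = E_prior[phi] - E_posterior[phi]."""
    x, u, theta = x0, u0, theta0
    trace = [theta]
    for t in range(n_iter):
        x = posterior_step(x, y, theta)   # one move of M_theta, targets p(x|y,theta)
        u = prior_step(u, theta)          # one move of K_theta, targets p(x|theta)
        delta = delta0 / (t + 1) ** 0.8   # decreasing steps: sum = inf, sum of squares < inf
        # Projected stochastic gradient step on theta.
        theta = float(np.clip(theta + delta * (phi(u) - phi(x)), theta_min, theta_max))
        trace.append(theta)
    return theta, trace
```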

How do we generate the samples?

MOREAU-YOSIDA UNADJUSTED LANGEVIN ALGORITHM (MYULA)

π‘₯_{t+1} = π‘₯_t βˆ’ Ξ³(βˆ‡π‘”_𝑦(π‘₯_t) + (π‘₯_t βˆ’ prox^Ξ»_{πœƒπœ‘}(π‘₯_t))/Ξ») + √(2Ξ³) 𝑍_{t+1}

where (π‘₯_t βˆ’ prox^Ξ»_{πœƒπœ‘}(π‘₯_t))/Ξ» = βˆ‡πœ‘^Ξ»(π‘₯_t), the gradient of the Moreau envelope.

We use the MYULA algorithm for the Markov kernels K_πœƒ and M_πœƒ because it can handle:
β–ͺ high dimensionality;
β–ͺ convex problems with a non-smooth πœ‘(π‘₯).

β–ͺ The MYULA kernels do not target 𝑝(π‘₯|𝑦, πœƒ) and 𝑝(π‘₯|πœƒ) exactly.
β–ͺ Sources of asymptotic bias:
β–ͺ discretisation of the Langevin diffusion, controlled by Ξ³ and Ξ·;
β–ͺ smoothing of the non-differentiable πœ‘(π‘₯), controlled by Ξ».
β–ͺ Ξ³ and Ξ· must each be smaller than the inverse of the Lipschitz constant of the gradient driving the corresponding diffusion.
β–ͺ More information about how to select each parameter can be found in [2]. A one-step sketch is given after this list.

WHERE DOES MYULA COME FROM?

LANGEVIN DIFFUSION:  d𝑋(t) = Β½ βˆ‡ log 𝑝(𝑋(t)|𝑦) dt + dπ‘Š(t),  where π‘Š is BROWNIAN MOTION and 𝑝(π‘₯|𝑦, πœƒ) ∝ exp{βˆ’π‘”_𝑦(π‘₯) βˆ’ πœƒ πœ‘(π‘₯)}.

EULER-MARUYAMA APPROXIMATION: replacing d𝑋(t)/dt by (π‘₯_{t+1} βˆ’ π‘₯_t)/Ξ³ yields the UNADJUSTED LANGEVIN ALGORITHM (ULA):

π‘₯_{t+1} = π‘₯_t + Ξ³ βˆ‡ log 𝑝(π‘₯_t|𝑦, πœƒ) + √(2Ξ³) 𝑍_{t+1},  𝑍_{t+1} ~ 𝑁(0, 𝕀)

If πœ‘(π‘₯) is not Lipschitz differentiable, ULA is unstable. The MYULA algorithm uses Moreau-Yosida regularisation, controlled by Ξ», to replace the non-smooth term πœ‘(π‘₯) with its Moreau envelope πœ‘^Ξ»(π‘₯).
[Figure: Moreau envelopes of the Laplace density 𝑝(π‘₯) ∝ e^{βˆ’|π‘₯|} for several values of Ξ».]
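For reference, the Moreau envelope and the gradient formula used in the MYULA update are the standard ones:

```latex
\varphi^{\lambda}(x)\;=\;\min_{u\in\mathbb{R}^n}\Big\{\varphi(u)+\tfrac{1}{2\lambda}\|x-u\|^2\Big\},
\qquad
\nabla\varphi^{\lambda}(x)\;=\;\frac{x-\operatorname{prox}^{\lambda}_{\varphi}(x)}{\lambda}.
```

The gradient βˆ‡πœ‘^Ξ» is always Lipschitz with constant 1/Ξ», which restores the stability that ULA lacks, and πœ‘^Ξ»(π‘₯) β†’ πœ‘(π‘₯) as Ξ» β†’ 0.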

Results
We illustrate the proposed methodology with an image deblurring problem using a total-variation prior. We compare our results with those obtained with:
β–ͺ SUGAR
β–ͺ marginal maximum-a-posteriori estimation
β–ͺ the optimal or oracle value πœƒ*

EVOLUTION OF πœƒ THROUGHOUT ITERATIONS
[Figure: evolution of πœƒ across iterations for the boat image.]

QUANTITATIVE COMPARISON
For each image, noise level, and method, we compute the MAP estimator; the table reports the average results over the 6 test images.
βœ“ Generally outperforms the other approaches.
β€’ EB performs remarkably well at low SNR values.
β€’ The longer computing time is only justified if marginalisation can't be used.
β€’ SUGAR performs poorly when the main problem is not noise.

METHOD COMPARISON
[Figure: degraded data vs. empirical Bayes MAP estimator at SNR = 40 dB, with close-ups for SNR = 20 dB. Over-smoothing indicates too much regularisation; ringing indicates not enough regularisation.]

Future work
β–ͺ Detailed analysis of the convergence properties.
β–ͺ Extending the methodology to problems involving multiple unknown regularisation parameters.
β–ͺ Reducing computing times by accelerating the Markov kernels driving the stochastic approximation algorithm:
β–ͺ via parallel computing;
β–ͺ by choosing a faster Markov kernel.
β–ͺ Adapting the algorithm to support some non-convex problems.
β–ͺ Using the samples from 𝑝(π‘₯|𝑦, πœƒ_MLE) to perform uncertainty quantification.

References
[1] Marcelo Pereyra, JosΓ© M. Bioucas-Dias, and MΓ‘rio A. T. Figueiredo, "Maximum-a-posteriori estimation with unknown regularisation parameters," in 23rd European Signal Processing Conference (EUSIPCO), IEEE, pp. 230–234, 2015.

[2] Alain Durmus, Eric Moulines, and Marcelo Pereyra, "Efficient Bayesian computation by proximal Markov chain Monte Carlo: when Langevin meets Moreau," SIAM Journal on Imaging Sciences, vol. 11, no. 1, pp. 473–506, 2018.

[3] Ana Fernandez Vidal and Marcelo Pereyra, "Maximum likelihood estimation of regularisation parameters," in 2018 25th IEEE International Conference on Image Processing (ICIP), IEEE, pp. 1742–1746, 2018.


The observation π’š is related to 𝒙 by a statistical model as a random variable.

We use a prior density to reduce uncertainty about and deliver accurate estimates.

The Bayesian Framework

πœ‘(π‘₯) penalises undesired properties and the regularisation parameter πœƒ controls the intensity of the penalisation.

Example: random vector 𝒙 representing astronomical images.

PRIOR INFORMATION
We model the unknown image π‘₯ as a random vector with prior distribution 𝑝(π‘₯|πœƒ) promoting desired properties of π‘₯:

𝑝(π‘₯|πœƒ) = e^{βˆ’πœƒ πœ‘(π‘₯)} / 𝐢(πœƒ)

where 𝐢(πœƒ) is the NORMALISING CONSTANT and πœƒ is the HYPERPARAMETER.

𝑝 𝑦 π‘₯ ∝ π‘’βˆ’π‘”π‘¦(π‘₯)

The observation 𝑦 is related to π‘₯ by a statistical model:

OBSERVED DATA

LIKELIHOOD

POSTERIOR DISTRIBUTION
Observed and prior information are combined by using Bayes' theorem:

𝑝(π‘₯|𝑦, πœƒ) = 𝑝(𝑦|π‘₯) 𝑝(π‘₯|πœƒ) / 𝑝(𝑦|πœƒ) ∝ e^{βˆ’(𝑔_𝑦(π‘₯) + πœƒ πœ‘(π‘₯))}

ොπ‘₯𝑀𝐴𝑃 = argmaxπ‘₯βˆˆβ„π‘›

𝑝 π‘₯ 𝑦, πœƒ = argminπ‘₯βˆˆβ„π‘›

𝑔𝑦 π‘₯ + πœƒ πœ‘(π‘₯)

The predominant Bayesian approach in imaging is MAP estimationwhich can be computed very efficiently by convex optimisation:

We will focus on convex problems where:- πœ‘(π‘₯) is lower semicontinuous,

proper and possibly non-smooth- 𝑔𝑦(π‘₯) is Lipschitz differentiable

with Lipschitz constant L

𝑔𝑦(π‘₯) = π‘¦βˆ’π΄π‘₯ 2

2

2𝜎2)𝑦~𝑁(𝐴π‘₯, 𝜎2𝐼

NORMALIZING CONSTANT
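To illustrate "computed very efficiently by convex optimisation", here is a minimal proximal-gradient sketch for this MAP problem. Assumptions not taken from the poster: a callable prox_phi for prox_{τθφ} (for TV this is itself computed by an inner routine), and ‖𝐴‖ ≀ 1 so that the gradient's Lipschitz constant is L = 1/σ².

```python
import numpy as np

def map_estimate(y, A, At, prox_phi, theta, sigma2, n_iter=200):
    """Proximal-gradient sketch for
       argmin_x  ||y - A x||^2 / (2*sigma2) + theta * phi(x).
    A, At: forward operator and its adjoint (callables);
    prox_phi(v, tau): proximal operator of tau * phi."""
    x = At(y)            # simple initialisation
    step = sigma2        # step = 1/L, assuming ||A|| <= 1
    for _ in range(n_iter):
        grad = At(A(x) - y) / sigma2            # gradient of g_y
        x = prox_phi(x - step * grad, step * theta)
    return x
```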

Conclusions
β–ͺ We presented an empirical Bayesian method to estimate regularisation parameters in convex inverse imaging problems.
β–ͺ We approach an intractable maximum marginal likelihood estimation problem by proposing a stochastic optimisation algorithm.
β–ͺ The stochastic approximation is driven by two proximal MCMC kernels which can handle non-smooth regularisers efficiently.
β–ͺ Our algorithm was illustrated with non-blind image deconvolution with a TV prior, where it:
βœ“ achieved close-to-optimal performance;
βœ“ outperformed other state-of-the-art approaches in terms of MSE;
Γ— had longer computing times.
β–ͺ More details can be found in [3].

EXPERIMENT DETAILS
β€’ Recover 𝒙 from a blurred noisy observation π’š, with π’š = 𝐴𝒙 + 𝒛 and 𝒛 ~ N(0, σ²𝐼).
β€’ 𝐴 is a circulant uniform blurring matrix of size 9x9 pixels (an FFT-based sketch follows this list).
β€’ πœ‘(𝒙) = TV(𝒙), the isotropic total-variation pseudo-norm.
β€’ We use 6 test images of size 512x512 pixels.
β€’ We compute the MAP estimator for each image using the values of πœƒ obtained with the different methods.
β€’ We compare methods by the average, over the 6 images, of the mean squared error (MSE) and the computing time (in minutes).
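For concreteness, a sketch of how a circulant 9x9 uniform blur and its adjoint can be implemented with the FFT (standard for circulant operators; the image, noise level, and seed below are placeholders, not the experiment's data):

```python
import numpy as np

n = 512
kernel = np.zeros((n, n))
kernel[:9, :9] = 1.0 / 81.0                       # 9x9 uniform blur
kernel = np.roll(kernel, (-4, -4), axis=(0, 1))   # centre the kernel at the origin
K = np.fft.fft2(kernel)                           # eigenvalues of the circulant A

A  = lambda x: np.real(np.fft.ifft2(K * np.fft.fft2(x)))            # y = A x
At = lambda x: np.real(np.fft.ifft2(np.conj(K) * np.fft.fft2(x)))   # adjoint A^T

rng = np.random.default_rng(0)
x_true = rng.random((n, n))                       # stand-in for a test image
sigma = 0.01
y = A(x_true) + sigma * rng.standard_normal((n, n))
mse = lambda x_hat: np.mean((x_hat - x_true) ** 2)
```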

[Figure: ORIGINAL, DEGRADED, EMP. BAYES, MARGINAL MAP, SUGAR, and ORACLE reconstructions at SNR = 40 dB, with close-ups for SNR = 20 dB; computing times reported in minutes.]
Γ— Increased computing times.
βœ“ Achieves close-to-optimal performance.