
An Optimization Perspective on Baseline Removal for Spectroscopy

Stephen Giguere†, CJ Carey, Thomas Boucher, Sridhar Mahadevan

College of Information and Computer Science, University of Massachusetts

Amherst, MA 01003
†[email protected]

M. Darby Dyar
Department of Astronomy, Mount Holyoke College

South Hadley, MA 01075

Abstract

Spectroscopic measurements represent one of the cornerstones of extragalactic and planetary science observations, but data processing techniques for dealing with various wavelengths are extremely variable. Nearly all of them share a common pre-processing step, which is the removal of extraneous signal from the data of interest. The baseline itself can be caused by a large number of factors depending on the type of spectroscopy, and interferes with interpretation and analysis. Accurate interpretations of spectroscopic data thus rely crucially on proper estimation and removal of a baseline (sometimes called continuum or background) signal. Because the baseline correction problem is common to many areas of spectroscopy, a large variety of techniques have been proposed to solve it. To date, there has been no significant effort to unify these methods in a way that highlights the similarities and differences between them. Based on an extensive review of existing literature, we propose here a framework that identifies the semantic subtasks that comprise most baseline removal (BLR) methods, as well as the computational structures that describe their implementation. Most importantly, this framework supports automated construction of novel BLR techniques, allowing the baseline correction algorithm itself to be optimized and automated for particular applications or types of spectroscopy.

Introduction

The task of proper baseline or continuum removal is common to nearly all spectroscopic applications that identify minerals based on the shape and distribution of their spectral peaks. Its goal is to remove any portion of a signal that is irrelevant to features of interest while preserving the predictive information encoded in its peaks. The unwanted signal is variously referred to as baseline, background, continuum, or simply noise by different communities, depending on the choice of spectroscopic method and the underlying causal physical processes. The scientific community has addressed this problem with varying degrees of success. Manual baseline removal, though common, is unprincipled and may produce variable results (Jirasek et al., 2004). Many automated techniques have been proposed to subtract baselines from different types of spectra, each with tunable parameters that may be set to refine the quality of the removed baseline.

In practice, it is common to find that these techniques have evolved out of empirical experimentation with parameters that produce reasonable results on a small set of spectra. Because no baseline correction algorithm is perfect and users cannot find perfect parameter settings for all data, continuum removal typically introduces systematic error into subsequent processing.

The importance of this problem is apparent in countless applications in space science where baseline removal (BLR) is critical to obtaining state-of-the-art results, as demonstrated by three examples. The first is laser-induced breakdown spectroscopy (LIBS), currently deployed as part of the ChemCam instrument on the Mars Science Laboratory (Wiens et al., 2012; Maurice et al., 2012). The LIBS signal used for measuring the chemical composition of surface materials on Mars is complicated by a Bremsstrahlung continuum and ion-electron recombination processes that are not relevant to the atomic emission signal of interest. The ChemCam team removes this continuum via decomposition into a set of cubic spline undecimated wavelet scales in which local minima of convex hulls are found. A spline function is then interpolated through the different minima (Wiens et al., 2013).

Another important application with a need for careful baseline removal is reflectance spectroscopy as commonly implemented in remote sensing of planetary surfaces. For example, the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) experienced challenges from spectral artifacts due to residual atmospheric contributions and detector-based effects as the flight hardware aged, for which careful noise suppression techniques were developed (Parente, 2008, 2010).

Other examples in astronomy include studies of high-resolution spectra of photoionized gas in the interstellar medium. For this application, it is common to normalize spectra to remove the continuum shape, introducing a systematic uncertainty for which mathematical models are needed, such as the formulation described in Sembach and Savage (1992). Other scientists who use ultraviolet spectrographs must deal with fixed-pattern noise in the data. Sometimes this can be removed with a flatfield that accounts for pixel-to-pixel sensitivity variations, but often the noise is removed by an analysis of the data, requiring that the observations be done in a specific way (e.g., Jenkins, 2002).

Although the focus of this work is to develop improved BLR methods for spectroscopy, its contributions apply to other domains as well and are therefore of broad interest to AI and ML researchers. The BLR problem, in its most general form, is to separate an observed signal into two constituent parts, one that varies slowly and another that is zero except in distinct, localized areas. Besides spectral data, signals of this type are often encountered during time series analysis. For example, techniques used for climate modeling rely on smoothing and subtraction of trends taken from benchmark studies to reduce variance in the input data (Grieser et al., 2002; McQuade and Monteleoni, 2012). Observed temperature data is composed of a slowly varying seasonal trend together with short-term fluctuations caused by local weather patterns, suggesting that BLR techniques could be applied to study each effect separately. BLR methods may also be applied to study records of search term popularity over time, which exhibit a similar structure. Here, the observed signal is comprised of sudden changes in the query's popularity, perhaps due to external events, superimposed onto a gradually evolving trend (Shimshoni et al., 2009). As a final example, higher dimensional extensions of BLR techniques may be used to estimate and remove slowly varying bias fields, such as those present in MRI images due to heterogeneity of the magnetic field used for measurement (Juntu et al., 2005; Learned-Miller and Ahammad, 2004). By removing this bias, the salient features of the images can be isolated, improving the results of automatic analyses such as screening for certain disorders.

Although the underlying processes that give rise to the unwanted baseline signal are variable, the challenge of fitting a baseline to a signal containing a background with superimposed peaks is nearly universal. Development of a widely applicable yet customizable approach to BLR has not yet been attempted, likely due to lack of collaboration across disciplinary boundaries. Existing manual and automated applications employ combinations of methods, including techniques for smoothing, filtering, fitting, and iterating successive models for baseline estimation, as summarized by Schulze et al. (2005). A coherent framework for implementing combinations of these methods in a systematic and objective fashion has been lacking.

This problem presents an ideal opportunity for the application of state-of-the-art AI techniques to a problem that spans many scientific disciplines. Advances in these methods and increases in computational speed have made it possible to approach BLR from a perspective in which true optimization and mixing-and-matching of existing BLR methods can be considered. The goal of this paper is to integrate existing and novel methods for BLR into a single coherent framework for optimization, bringing together elements of previously suggested techniques and providing a path toward true automation that leverages current technology. The broad implications of improvements in BLR will be far-reaching across all types of spectroscopy.

Related Work

The problem of baseline correction has been well studied in many different areas of spectroscopy.

Scientists using Nuclear Magnetic Resonance (NMR) spectroscopy, for example, have produced many BLR methods. Dietrich et al. (1991) introduced the Fast and Accurate Baseline Correction (FABC) algorithm for use with NMR spectra. Friedrichs (1995) presented the Noise Median method, which estimates the baseline of NMR spectra by selecting the median intensity within a small window centered around each channel. For correction of Fourier Transform NMR data, Golotvin and Williams (2000) proposed a BLR technique based on iteratively thresholding the input spectra until peaks were eliminated. Later, Bao et al. (2012) proposed an iterative BLR method for NMR spectroscopy that is notable because it guarantees that the corrected spectrum will be non-negative.

A significant number of BLR methods have been developed for processing Raman spectra as well. Lieber and Mahadevan-Jansen (2003) introduced an iterative technique called Modified Polyfit and demonstrated its effectiveness at correcting Raman spectra. The Local Second Derivatives (LSD) method, given in Rowlands and Elliott (2011), was also shown to work well on this type of spectrum. Asymmetric Least Squares (ALS), a popular method for BLR presented in Eilers and Boelens (2005), was originally developed for correcting Raman, Fourier Transform Infrared (FTIR), and chromatography data. More recently, He et al. (2014) proposed the Improved Asymmetric Least Squares (IALS) method, an extension of ALS that imposes extra smoothness constraints on the baseline, and successfully applied it to Raman spectra.
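The ALS approach lends itself to a compact sketch: a Whittaker smoother whose per-channel weights are updated asymmetrically, so points above the current fit (likely peaks) receive a small weight p while points below receive 1 - p. The code below follows the commonly published form of the algorithm; the parameter names (lam for smoothness, p for asymmetry) and default values are illustrative, not prescribed by the original paper.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def als_baseline(y, lam=1e6, p=0.01, n_iter=10):
    """Asymmetric least squares baseline (after Eilers and Boelens, 2005).

    lam controls smoothness; p is the asymmetry: channels above the current
    fit (candidate peaks) get weight p, channels below get 1 - p.
    """
    n = len(y)
    # Second-difference matrix; lam * D D^T penalizes curvature of the fit.
    D = sparse.diags([1.0, -2.0, 1.0], [0, -1, -2], shape=(n, n - 2))
    P = lam * (D @ D.T)
    w = np.ones(n)
    for _ in range(n_iter):
        W = sparse.diags(w)
        z = spsolve((W + P).tocsc(), w * y)   # weighted Whittaker smooth
        w = np.where(y > z, p, 1.0 - p)       # asymmetric weight update
    return z
```

Because the asymmetry drives the smooth fit toward the lower envelope of the spectrum, the returned z tracks the slowly varying background while largely ignoring peaks; the corrected spectrum is then y - z.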

Beyond the preceding examples, other types of spectroscopy have also produced valuable BLR methods. Ruckstuhl et al. (2001) introduced Robust Baseline Estimation (RBE) and demonstrated its effectiveness on laser-induced fluorescence (LIF) spectra. For correcting spectra from capillary electrophoresis analysis, Gan et al. (2006) proposed a BLR technique called Iterative Polynomial Fitting that works similarly to Modified Polyfit. Shao and Griffiths (2007) used the wavelet transform to filter out the baseline in open-path FTIR (OP/FTIR) spectra. Kajfosz and Kwiatek (1987) modeled the baseline using the envelope of parabolas placed above and below the spectrum; this was applied to various forms of X-ray spectroscopy, such as XRF, PIXE, and SRIXE analysis. The Rolling Ball method introduced in Kneen and Annegarn (1996) and the correction method presented in Parente (2010) are similarly based on fitting the envelope of ellipses and Hermite polynomials, respectively.

It is also common for new BLR methods to be evaluated on multiple types of spectra, especially in more recent work. Li et al. (2013) presented the Morphologically Weighted Penalized Least Squares (MPLS) algorithm and applied it to chromatographic data and Raman spectra. For Raman and surface-enhanced infrared absorption (SEIRA) spectroscopy, Weakley et al. (2012) undertake BLR using repeated thresholding of the baseline estimate. Zhang et al. (2010) proposed the Adaptive Iteratively Reweighted Penalized Least Squares (airPLS) method and evaluated its effectiveness on Raman, NMR, and chromatography data. Liu et al. (2014) applied Selective Iteratively Reweighted Quantile Regression (SirQR) to the same three types of spectra.


Importantly, many of the BLR methods described above have been shown to be effective at correcting multiple types of spectra besides those for which they were originally evaluated. In addition, many of them were developed to be effective in multiple areas of spectroscopy, even if they were only evaluated on a subset of them. The fact that a large number of methods have been shown to be effective in a variety of applications suggests that there are many similarities between them, an insight that led to the development of the framework proposed here.

With many available baseline correction methods to choose from, it has become increasingly difficult for a scientist to determine which method is best for their application. Fortunately, several valuable surveys of existing BLR methods have been compiled. The most significant survey is offered by Schulze et al. (2005), which evaluated the baselines estimated by 11 BLR methods applied to synthetic spectra. The spectra were designed to simulate spectra from vibrational spectroscopies, including Raman and infrared. A central objective of the survey was to facilitate the communication of BLR techniques that had been developed in disparate areas of spectroscopy. The framework presented here is, to our knowledge, the first attempt to unify these methods.

Liland et al. (2010) also provide a comparison of several BLR techniques, including ALS, RBE, the Noise Median method, Iterative Polynomial Fitting, and the Rolling Ball method of Kneen and Annegarn (1996). Each method was evaluated on Raman spectra and matrix-assisted laser desorption-ionization time-of-flight (MALDI-TOF) spectra. Beyond providing a useful comparison, this paper is notable because it proposes a simple scheme for optimizing the choice of BLR technique for a specific application. However, the proposed optimization was defined over existing BLR methods and did not involve the synthesis of new methods. Our current work can be viewed as a development of the ideas presented there.

Finally, we note that it is common for novel BLR methods to be presented in comparison to existing methods. These smaller-scale studies have also been useful resources for helping spectroscopists make more informed decisions about which BLR algorithm is appropriate for a particular application.

Background

Before a baseline can be removed from a spectrum, a model is needed to explain the relationship between the baseline and the observed signal. In most work on baseline correction, it is assumed that the observed intensity values are the sum of the true signal, a baseline signal, and noise, as shown below.

y = ytrue + b + ε (1)

Here, y is the observed signal, ytrue is the signal of interest, b is the baseline, and ε is a vector of noise terms. Based on this model, the strategy for baseline correction is simply to estimate the baseline and subtract it from the observed signal. Due to improvements in the quality of measurement devices, the noise vector has a small effect compared to the baseline and is often ignored.

Figure 1: Baseline removal example for a Raman spectrum of enstatite from the RRUFF.info web site, sample R070641, with noise added to highlight the effect of smoothing. The solid black line shows the original spectrum and the red line shows the baseline estimated using asymmetric least squares (Eilers and Boelens 2005) with smoothness = 1e+6 and asymmetry = 0.010. The dashed green line is the resultant corrected spectrum.

The above model describes spectra that are contaminated by an additive baseline, such as those arising from Bremsstrahlung effects. The majority of existing BLR methods are based on this model. However, some types of spectra are distorted by a baseline that attenuates the true signal. For a specific channel i, these effects are best modeled using the formula shown below.

y(i) = ytrue(i) b(i) + ε(i)   (2)

To apply BLR methods that assume an additive baseline in these cases, we note the following, assuming that the scale of the noise is much smaller than the observed intensity (i.e., ε(i) ≪ y(i)):

log y(i) = log(ytrue(i) b(i) + ε(i))
         = log(ytrue(i) b(i) (1 + ε(i) / (ytrue(i) b(i))))
         = log ytrue(i) + log b(i) + log(1 + ε(i) / (ytrue(i) b(i)))
         ≈ log ytrue(i) + log b(i) + ε(i) / (ytrue(i) b(i))

The last line follows from the Taylor series expansion of log(1 + x), which yields the approximation log(1 + x) ≈ x for |x| < 1. The logarithm is a monotonically increasing function, so log-transforming the spectra does not change the location of peaks in the signal. This allows BLR methods that assume an additive model to be applied effectively to multiple types of spectroscopy. For example, in excitation/fluorescence (Bremsstrahlung effects) spectroscopy, the continuum is truly an additive signal. In contrast, the continuum in reflectance/absorption spectroscopy represents a background that undergoes attenuation. For instruments that suffer from a combination of these effects, proper treatment would require a model that accounts for both baselines


explicitly. Such treatment is beyond the scope of this paper, but poses an interesting direction for future work.
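The log-transform trick for multiplicative baselines can be sketched directly: log-transform the spectrum, estimate the (now additive) log-baseline with any additive BLR routine, and divide the original spectrum by the exponentiated baseline. The additive_blr argument below is a placeholder for any such routine; the low-order polynomial fit used to illustrate it is our own stand-in, not a method from the paper.

```python
import numpy as np

def correct_multiplicative(y, additive_blr):
    """Correct a multiplicatively distorted spectrum (eq. 2) by applying
    an additive-model BLR method in log space."""
    log_b = additive_blr(np.log(y))   # baseline is additive after the log
    return y / np.exp(log_b)          # divide out the estimated baseline

def polyfit_blr(log_y, deg=2):
    """Illustrative additive BLR stand-in: a low-order polynomial fit."""
    x = np.arange(len(log_y))
    return np.polyval(np.polyfit(x, log_y, deg), x)
```

Because the log is monotonic, peak locations are preserved through the transform, so the additive routine sees the same peak structure it was designed for.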

As presented, baseline correction is an ill-posed problem because there is no a priori way to know precisely how much of the observed intensity was contributed by the baseline versus the true signal. It is possible that the only rigorous way to address this problem is to use optimization, varying the BLR parameters until no predictive information remains in the baseline (Carey et al., 2015; Giguere et al., 2015). As a result, there are several assumptions about the baseline that are commonly used to guide its estimation. The first is that the baseline varies slowly compared to the true signal. Figure 1 shows an example baseline estimate for a Raman spectrum generated using the FABC method introduced in Dietrich et al. (1991). This assumption implies that the baseline will be relatively low frequency, and that it will not contain any sudden peaks. Consequently, channels of the signal that contain peaks can be ignored when estimating the baseline.

The second assumption, which is less commonly considered than the first, is that the true signal is non-negative. This constrains the baseline to lie below the observed spectrum, because subtracting a baseline that is larger than the input signal would result in an estimate of the true signal that has negative intensity for some channels. These two assumptions form the conceptual basis for almost all baseline correction methods.

Framework Overview

We propose here a unifying framework for BLR methods that allows existing models to be represented in a coherent way. It recognizes that the wide variety of available methods are actually composed of a small set of semantic steps. By identifying these steps, which we refer to as subtasks, it is possible to construct baseline correction algorithms by designing an arrangement of subtasks, called a template.

Our framework analyzes different BLR methods along two orthogonal dimensions: one considers the semantic steps that are necessary to remove the baseline, while the other identifies the computational structures of each method. This is shown schematically in Figure 2. First, semantic steps are identified and referred to as subtasks. These subtasks are then arranged to form a template for the BLR method. Because each subtask represents a semantic operation, an implementation (an algorithm for completing the subtask) must be chosen for each. This division into subtasks and implementations allows for the construction of new baseline correction methods by designing a template and selecting implementations, such that the template itself or the choice of implementations is novel. In addition, this approach facilitates comparisons between existing BLR techniques by analyzing the shared subtasks between them, as shown in Figure 3. Finally, the framework identifies the computational structures that comprise each implementation, illustrated as nodes within each implementation in both Figures 2 and 3. This is useful because it highlights possible avenues for improving existing implementations.

With this framework, designing a new BLR technique can be done in two ways: one can choose a new template and set

Figure 2: Illustration of the framework for baseline removal. A baseline correction algorithm is specified by a template and a choice of implementations. Three implementations of peak identification are shown here as examples, although more are available. The template defines the semantic steps that are taken to extract the baseline, while the implementations determine how each subtask is completed. Each implementation is comprised of components that capture the computational structures present within the implementation.

of implementations, or develop a new implementation that can be incorporated into an existing template. This not only makes design of new methods easier, but also confines low-level details to implementations. Moreover, this allows BLR optimization by choosing a template and iterating over the existing implementations to find an optimal correction algorithm for a specific task or type of spectroscopy. Our preliminary work (Carey et al., 2015; Giguere et al., 2015) suggests that different types of spectroscopy will benefit from distinct combinations of implementations, relating to the different underlying physical processes that generate the baseline for each analytical technique.

Finally, there are alternative frameworks for organizing existing BLR methods that provide unique advantages for scientists. The most notable of these involves reinterpreting each existing technique as the optimization of a specific objective function. This would allow them to be compared and communicated in a unified way, and may elucidate interesting connections between seemingly different techniques. However, organizing BLR methods in this way does not yield a set of composable operations from which new methods can be constructed. Therefore, this scheme is not well suited to the task of automatically constructing novel, customized BLR techniques, and was not pursued in this work.

Optimization over BLR Methods

Our proposed framework for BLR allows optimization for specific applications. Here, we outline a procedure for performing this optimization.

Before the optimization can begin, existing BLR techniques must be reevaluated using this framework to identify the subtask implementations that comprise each method. The purpose of this step is to establish a set of implementations that will form the basis for constructing new methods. This task is simplified by the observation that a large number of existing procedures are based on a single novel subtask implementation that is then combined with existing


Figure 3: View of two existing BLR methods using the proposed framework. The techniques begin with different subtasks, but use the same sequence of subtasks afterwards. The differences between the two are described by the difference in their templates, as well as by the distinct implementations chosen for their subtasks. Also shown are the components that comprise the implementations for each subtask for the two methods.

ones to form the proposed correction method. Any parameters required by the implementations, along with their recommended ranges, should also be recorded.

It is assumed that a function score_T(Y|X) is given that expresses the quality of the results obtained using Y as input to some task T, where Y are the corrected spectra and X denotes a fixed set containing any other quantities necessary for the task (e.g., ground truth values for regression, or labels for classification). Without loss of generality, the score returned by this function is assumed to increase with the quality of the results. In addition, the user must provide a template to optimize. While it is appealing to optimize over every possible combination of subtasks, this cannot be done in practice because it involves searching over an infinite space. By constraining the optimization to consider BLR methods comprised of a particular sequence of subtasks, the search space becomes finite, making the optimization tractable for

reasonably sized templates.

Given a template, a scoring function, and a set of implementations, optimization can begin. The template and set of implementations are both finite, resulting in a finite number of combinations of implementations that match the template. These combinations form a set of distinct BLR methods. For each one of these, the scoring function is then maximized with respect to the parameters required by its subtask implementations. This optimization is equivalent to the hyperparameter tuning procedure familiar to many AI and ML researchers. The choice of parameters used for a BLR method can have a large effect on the quality of the corrected signal (Carey et al., 2015). Once the maximal score is computed for each of the synthesized techniques, the method with the highest score is returned along with its optimal parameters.
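The search procedure just described can be sketched as a nested grid search: enumerate every assignment of implementations to the template's subtask slots, tune each candidate's parameters against the scoring function, and keep the best. All names below (optimize_blr, score, the grid structures) are illustrative, not an API defined in the paper.

```python
import itertools

def optimize_blr(template, implementations, param_grids, score):
    """Exhaustive search over BLR methods matching a template.

    template:        ordered list of subtask names
    implementations: dict mapping each subtask to its candidate implementations
    param_grids:     dict mapping each implementation to a list of settings
    score:           callable (method, params) -> float; higher is better
    """
    best_method, best_params, best_score = None, None, float("-inf")
    # One candidate BLR method per combination of implementations.
    for method in itertools.product(*(implementations[s] for s in template)):
        # Hyperparameter tuning: maximize the score over each grid.
        for params in itertools.product(*(param_grids[impl] for impl in method)):
            s = score(method, params)
            if s > best_score:
                best_method, best_params, best_score = method, params, s
    return best_method, best_params, best_score
```

Because both the template and the implementation sets are finite, the outer loop terminates; in practice the inner parameter search could be replaced with any standard hyperparameter tuner.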

In the following sections, the details of the proposed framework are described. While the organization scheme has few components, it allows for the representation of complex BLR algorithms.

Subtasks

Peak Identification

Peak identification is the task of determining whether or not the intensity value in each channel along a spectrum lies on the baseline. Specifically, a peak identifier is a function f_peak? : R^n × R^n → R^n_+ that accepts a vector of channel frequencies x and corresponding intensities y, and produces a set of non-negative weights w. Depending on the implementation, additional input parameters may also be required. Each weight, associated with a channel i, is correlated with the degree of certainty that yi lies on the baseline rather than a peak. As a result, if wi ≈ 0, then yi lies on a peak, whereas if wi ≫ 0, yi is assumed to lie on the baseline.
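As a concrete illustration of this contract, the sketch below is a simple masker: it flags channels that rise well above a local median as peaks (weight 0) and treats the rest as baseline (weight 1). The windowed-median rule and its parameters are our own illustrative choices, not a method from the paper.

```python
import numpy as np

def median_masker(x, y, window=15, k=3.0):
    """Binary peak identifier: w[i] = 0 if y[i] sits far above a local
    median (a peak), else 1 (baseline). x is accepted to match the
    f_peak?(x, y) signature but is unused here."""
    n = len(y)
    med = np.array([np.median(y[max(0, i - window):i + window + 1])
                    for i in range(n)])
    resid = y - med
    # Robust noise scale via the median absolute deviation (MAD).
    sigma = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    return (resid < k * sigma).astype(float)  # 1 = baseline, 0 = peak
```

Returning continuous values in place of the final threshold would turn this masker into a weighter, the second category described below.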

Among the many techniques for peak identification, two categories emerge based on the form of the output weight vector. One is comprised of peak identifiers that produce binary weight vectors, which we refer to as maskers. The second category contains weighters, or identifiers that produce continuous weights.

With these two classes in mind, there are some considerations that play an important role in the choice of which peak identifier to use in a given circumstance.

Sparsity: Depending on how the resulting vector will be used, it may be useful for a peak identifier to produce weights that are sparse. For example, it is common to use the results of peak identification as the weights for a weighted least squares fit of the original spectrum. Because channels containing peaks are given small weights, this procedure results in a fit that is biased towards the spectrum's baseline. If a weighter is used for peak identification, then this fit will involve computing a separate term for each channel. In contrast, if a masker is used, then this procedure can be reduced to an ordinary least squares fit over the channels with nonzero weights, simplifying the computation.
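This computational point can be sketched as follows, assuming a polynomial baseline model (an illustrative choice): with a binary mask, the weighted fit collapses to an ordinary least squares fit over the unmasked channels, and the two routines return the same baseline.

```python
import numpy as np

# Weighted least-squares polynomial baseline using peak-identification
# weights w. numpy.polyfit's `w` multiplies the unsquared residuals, so
# sqrt(w) weights each squared residual by w_i.
def weighted_poly_baseline(x, y, w, degree=3):
    """Fit a polynomial baseline, down-weighting peak channels."""
    coeffs = np.polyfit(x, y, degree, w=np.sqrt(w))
    return np.polyval(coeffs, x)

def masked_poly_baseline(x, y, mask, degree=3):
    """Equivalent OLS fit restricted to channels with nonzero mask."""
    keep = mask > 0
    coeffs = np.polyfit(x[keep], y[keep], degree)
    return np.polyval(coeffs, x)
```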

Certainty Information: In some applications, it is useful to know the degree to which a specific channel represents a baseline intensity. This is especially useful when reasoning about channels containing tails of the peak distribution, where intensity values lie close to the baseline but show a slight increase that intensifies near the base of the peak. Maskers cannot provide such information because all baseline channels are assigned the same weight, making it necessary to use a weighter in these circumstances.

Based on these considerations, we note that the two classes of peak identifier represent extremes of a tradeoff between complexity and information content. When information about certainty is not needed, maskers are preferred because they may accelerate later computation. On the other hand, if certainty information is needed, then maskers are not sufficient and a weighter should be applied.

Baseline Extraction

The extraction subtask refers to the task of constructing an estimate of the baseline from an input signal. The defining characteristic of baseline extractors is that they produce biased approximations that do not fit peaks in the input signal. Stated differently, the output of baseline extraction depends heavily on the intensity of channels representing the baseline, while channels representing peaks have little to no effect on the output.

This asymmetry is crucial and can be generated either internally or by using the output of a peak identifier. Implementations that produce asymmetry themselves often do so by penalizing baseline estimates that lie above the input more than those that lie below it, as in the Asymmetric Least Squares method proposed by Eilers and Boelens (2005). Alternatively, internal asymmetry can be obtained by repeatedly thresholding the estimate until it conforms to the baseline, as in the Modified Polyfit algorithm introduced by Lieber and Mahadevan-Jansen (2003). Implementations that use the output of peak identification are often simpler, and proceed by weighting the influence of each channel on the estimated baseline according to the input weights. A particularly simple example is found in Golotvin and Williams (2000), where the extracted baseline is obtained by linearly interpolating between the spectral intensities of channels with nonzero weights.

Smoothing and Filtering

The smoothing and filtering subtask represents the reduction of noise in the input signal. Although both this and the baseline extraction subtask involve computing a new signal from an input, the smoothing and filtering subtask is distinguished by the fact that it produces a relatively unbiased approximation to the input signal. In contrast, the extraction subtask produces a heavily biased approximation that does not fit peaks in the input. This difference is illustrated in Figure 4.

Smoothing and filtering are most often applied as preprocessing for other subtasks. For example, a peak identifier that uses the first derivative of the input to locate peaks may wrongly consider some baseline channels to be peaks if the input is noisy. In this case, smoothing the input or filtering out the high-frequency noise will eliminate these false positives, resulting in a better eventual estimate of the baseline.

Figure 4: A portion of a laser-induced breakdown spectrum (acquired under Mars conditions) that is analogous to data produced by the ChemCam rover on Mars. The black line is the raw spectrum; the red line represents exponential smoothing. The thick blue line represents the baseline fitted to the original data using an FABC model (Cobas et al., 2006). The distinguishing feature of the baseline extraction subtask is that the extracted signal does not match peaks in the spectrum, whereas the smoothing/filtering subtask produces a signal that is comparably unbiased.

Components

Fitting

A fitting component is any operation that fits a parameterized model to an input spectrum. The optimal parameters for the model are found by optimizing the quality of the fit, a quantity described by the loss function, over the possible settings of the parameters. Thus, three elements comprise every fitting component: a reference signal, a model, and a loss function.

• Reference Signal: A reference signal is a tuple S = (x, y), where y is a vector of intensity values that will be approximated by the model and x is a vector of corresponding frequencies.

• Model: A model is a function m_θ : R^k → R^k that produces estimates of the reference signal intensities based on the given parameter vector θ. For convenience, the result of evaluating the model at x is often abbreviated as m_θ(x) = ŷ_θ, or simply ŷ.

• Loss Function: Finally, a loss function is a function J(ŷ | y) whose value decreases as the quality of the fit between ŷ and y improves.

Combining these terms, fitting components produce the signal m_{θ*}(x), where θ* = argmin_θ J(m_θ(x) | y) are the model parameters that minimize the loss function. In general, the procedure used to optimize the loss function can be left unspecified because different optimization methods will produce the same optimum. It is also worth noting that fitting components are typically sensitive to global features of the spectrum, meaning that the intensity of the fit in a particular channel is affected by the intensity of the spectrum in distant channels.
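Under these definitions, a fitting component can be written generically. The grid-search optimizer, line model, and squared-error loss below are illustrative choices only, since the framework deliberately leaves them unspecified.

```python
import numpy as np

# Sketch of the fitting-component structure: a reference signal (x, y),
# a parameterized model m_theta, and a loss J.
def fit_component(x, y, model, loss, theta_grid):
    """Return m_{theta*}(x) with theta* = argmin_theta J(m_theta(x) | y)."""
    theta_star = min(theta_grid, key=lambda th: loss(model(x, th), y))
    return model(x, theta_star)

def line_model(x, theta):
    """m_theta: a two-parameter line, theta = (intercept, slope)."""
    return theta[0] + theta[1] * x

def squared_loss(y_hat, y):
    """J: decreases as the fit between y_hat and y improves."""
    return float(np.sum((y_hat - y) ** 2))
```

Note that the fit at any channel depends on every entry of y through theta*, matching the global sensitivity described above.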


Mapping

Mapping components are defined as operations that apply a single, static function, denoted f_map, to each channel of the input signal. In contrast to fitting components, mapping components do not require optimization, and tend to be influenced by local structures in the input rather than global features of the spectrum.

One common type of mapping component computes, for each channel i, the value of a function that depends on nearby channels within a window around i. The Noise Median Method (Friedrichs, 1995), which computes the median within each window, is an example of a BLR method comprised of a single component of this type. Another form of mapping component applies a conditional function to each channel. Conditional functions perform different computations based on the input, as in the simple case of returning 1 if the input is negative and 0 otherwise. From these examples, it is clear that mapping components, characterized entirely by the chosen mapping function, are capable of representing a wide variety of transformations on spectra.
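Both forms admit short sketches: a windowed median in the spirit of the Noise Median Method, and the 0/1 conditional map from the text. The window size is an illustrative choice.

```python
import numpy as np

# Sketch of two mapping components; each applies one static function per
# channel and depends only on local structure.
def window_median(y, half_width=2):
    """Windowed-median map: each channel becomes the median of the
    channels within `half_width` of it (edges padded by repetition)."""
    pad = np.pad(y, half_width, mode="edge")
    return np.array([np.median(pad[i:i + 2 * half_width + 1])
                     for i in range(len(y))])

def negative_indicator(y):
    """Conditional map: 1 where the input is negative, 0 otherwise."""
    return np.where(y < 0, 1.0, 0.0)
```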

Conclusion

In the past two decades, baseline removal (BLR) methods have been developed for many different areas of spectroscopy, including those in the space sciences. While this has led to many creative and effective solutions to the baseline removal problem, it poses its own challenges. Namely, users are often presented with a long list of BLR methods to apply, with very little intuition about how they work, how they are similar, and which is most effective for their application. In this work, a framework is proposed that identifies the common semantic and computational structures that comprise almost all baseline correction methods. In particular, BLR is modeled as an arrangement of high-level subtasks that form a template. Distinct correction algorithms can then be formed by varying the template, or by selecting different implementations for each subtask. Our new framework allows easier and more insightful comparisons among existing correction algorithms and also supports the automatic construction of novel techniques. As a result, it is now possible to leverage concepts from AI and ML to optimize the BLR technique itself. This eliminates the need for users to spend valuable time learning the internals of existing approaches in order to make educated choices about which method is best for their application.

In future work, we intend to use this framework to analyze the relationship between the structure of a BLR method and the quality of results obtained using the corrected spectra. For example, one question of particular interest is whether there are systematic differences in the template or choice of implementations that are optimal for different types of spectra, such as Raman, FTIR, or LIBS spectra. Another analysis would identify the effect that particular subtasks have on the resulting corrected spectra, using the framework to construct BLR methods that vary in specific, controlled ways. Answers to these questions may identify previously unknown characteristics of different types of spectra, inform future experimental design in the laboratory, and provide valuable insights to guide development of implementations for particular applications. By supporting the automated construction of BLR techniques, the framework proposed here not only makes it easier for spectroscopists to apply BLR, but also allows the BLR problem itself to be studied at an unprecedented level of detail.

Acknowledgments

We appreciate the support of NASA grant NNX15AC82G and the Mars Fundamental Research Program.

References

Bao, Q., Feng, J., Chen, F., Mao, W., Liu, Z., Liu, K., and Liu, C. (2012). A new automatic baseline correction method based on iterative method. Journal of Magnetic Resonance, 218:35–43.

Carey, C., Dyar, M. D., Boucher, T. F., Giguere, S., Hoff, C. M., Breitenfeld, L. B., Parente, M., Tague Jr., T. J., Wang, P., and Mahadevan, S. (2015). Baseline removal in Raman spectroscopy: Optimization techniques. In Proc. Lunar Planet. Sci. Conf. (LPSC).

Dietrich, W., Rudel, C. H., and Neumann, M. (1991). Fast and precise automatic baseline correction of one- and two-dimensional NMR spectra. Journal of Magnetic Resonance (1969), 91(1):1–11.

Eilers, P. H. and Boelens, H. F. (2005). Baseline correction with asymmetric least squares smoothing. Leiden University Medical Centre Report.

Friedrichs, M. (1995). A model-free algorithm for the removal of baseline artifacts. Journal of Biomolecular NMR, 5(2):147–153.

Gan, F., Ruan, G., and Mo, J. (2006). Baseline correction by improved iterative polynomial fitting with automatic threshold. Chemometrics and Intelligent Laboratory Systems, 82(1):59–65.

Giguere, S., Carey, C., Dyar, M. D., Boucher, T. F., Parente, M., Tague Jr., T. J., and Mahadevan, S. (2015). Baseline removal in LIBS and FTIR spectroscopy: Optimization techniques. In Proc. Lunar Planet. Sci. Conf. (LPSC).

Golotvin, S. and Williams, A. (2000). Improved baseline recognition and modeling of FT NMR spectra. Journal of Magnetic Resonance, 146(1):122–125.

Grieser, J., Trömel, S., and Schönwiese, C.-D. (2002). Statistical time series decomposition into significant components and application to European temperature. Theoretical and Applied Climatology, 71(3-4):171–183.

He, S., Zhang, W., Liu, L., Huang, Y., He, J., Xie, W., Wu, P., and Du, C. (2014). Baseline correction for Raman spectra using an improved asymmetric least squares method. Analytical Methods, 6(12):4402–4407.

Jenkins, E. B. (2002). Thermal pressures in neutral clouds inside the Local Bubble, as determined from C I fine-structure excitations. The Astrophysical Journal, 580(2):938.


Jirasek, A., Schulze, G., Yu, M., Blades, M., and Turner, R. (2004). Accuracy and precision of manual baseline determination. Applied Spectroscopy, 58(12):1488–1499.

Juntu, J., Sijbers, J., Van Dyck, D., and Gielen, J. (2005). Bias field correction for MRI images. In Computer Recognition Systems, pages 543–551. Springer.

Kajfosz, J. and Kwiatek, W. (1987). Nonpolynomial approximation of background in X-ray spectra. Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms, 22(1):78–81.

Kneen, M. and Annegarn, H. (1996). Algorithm for fitting XRF, SEM and PIXE X-ray spectra backgrounds. Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms, 109:209–213.

Learned-Miller, E. G. and Ahammad, P. (2004). Joint MRI bias removal using entropy minimization across images. In Advances in Neural Information Processing Systems, pages 761–768.

Li, Z., Zhan, D.-J., Wang, J.-J., Huang, J., Xu, Q.-S., Zhang, Z.-M., Zheng, Y.-B., Liang, Y.-Z., and Wang, H. (2013). Morphological weighted penalized least squares for background correction. Analyst, 138(16):4483–4492.

Lieber, C. A. and Mahadevan-Jansen, A. (2003). Automated method for subtraction of fluorescence from biological Raman spectra. Applied Spectroscopy, 57(11):1363–1367.

Liland, K. H., Almøy, T., and Mevik, B.-H. (2010). Optimal choice of baseline correction for multivariate calibration of spectra. Applied Spectroscopy, 64(9):1007–1016.

Liu, X., Zhang, Z., Sousa, P. F., Chen, C., Ouyang, M., Wei, Y., Liang, Y., Chen, Y., and Zhang, C. (2014). Selective iteratively reweighted quantile regression for baseline correction. Analytical and Bioanalytical Chemistry, 406(7):1985–1998.

McQuade, S. and Monteleoni, C. (2012). Global climate model tracking using geospatial neighborhoods. In AAAI.

Parente, M. (2008). A new approach to denoising CRISM images. In Lunar and Planetary Science Conference, volume 39, page 2528.

Parente, M. (2010). Unsupervised Unmixing of Hyperspectral Images: Imaging the Martian Surface. PhD thesis, Stanford University.

Rowlands, C. and Elliott, S. (2011). Automated algorithm for baseline subtraction in spectra. Journal of Raman Spectroscopy, 42(3):363–369.

Ruckstuhl, A. F., Jacobson, M. P., Field, R. W., and Dodd, J. A. (2001). Baseline subtraction using robust local regression estimation. Journal of Quantitative Spectroscopy and Radiative Transfer, 68(2):179–193.

Schulze, G., Jirasek, A., Yu, M. M., Lim, A., Turner, R. F., and Blades, M. W. (2005). Investigation of selected baseline removal techniques as candidates for automated implementation. Applied Spectroscopy, 59(5):545–574.

Shao, L. and Griffiths, P. R. (2007). Automatic baseline correction by wavelet transform for quantitative open-path Fourier transform infrared spectroscopy. Environmental Science & Technology, 41(20):7054–7059.

Shimshoni, Y., Efron, N., and Matias, Y. (2009). On the predictability of search trends. Google Research Blog.

Weakley, A. T., Griffiths, P. R., and Aston, D. E. (2012). Automatic baseline subtraction of vibrational spectra using minima identification and discrimination via adaptive, least-squares thresholding. Applied Spectroscopy, 66(5):519–529.

Wiens, R., Maurice, S., Lasue, J., Forni, O., Anderson, R., Clegg, S., Bender, S., Blaney, D., Barraclough, B., Cousin, A., Deflores, L., Delapp, D., Dyar, M., Fabre, C., Gasnault, O., Lanza, N., Mazoyer, J., Melikechi, N., Meslin, P.-Y., Newsom, H., Ollila, A., Perez, R., Tokar, R., and Vaniman, D. (2013). Pre-flight calibration and initial data processing for the ChemCam laser-induced breakdown spectroscopy instrument on the Mars Science Laboratory rover. Spectrochim. Acta, Part B, 82:1–27.

Zhang, Z.-M., Chen, S., and Liang, Y.-Z. (2010). Baseline correction using adaptive iteratively reweighted penalized least squares. Analyst, 135(5):1138–1146.