
Page 1: Combining orthogonal signal correction and wavelet packet transform with radial basis function neural networks for multicomponent determination

Chemometrics and Intelligent Laboratory Systems 100 (2010) 57–65

Contents lists available at ScienceDirect

Chemometrics and Intelligent Laboratory Systems

journal homepage: www.elsevier.com/locate/chemolab

Combining orthogonal signal correction and wavelet packet transform with radial basis function neural networks for multicomponent determination

Ling Gao ⁎, Shouxin Ren
Department of Chemistry, Inner Mongolia University, Huhhot, 010021, Inner Mongolia, China

⁎ Corresponding author. Tel.: +86 471 6688471; fax: +86 471 4992984.
E-mail address: [email protected] (L. Gao).

0169-7439/$ – see front matter © 2009 Elsevier B.V. All rights reserved.
doi:10.1016/j.chemolab.2009.11.001

Article info

Article history:
Received 8 September 2008
Received in revised form 25 September 2009
Accepted 2 November 2009
Available online 10 November 2009

Keywords:
Orthogonal signal correction
Wavelet packet transform
Radial basis function neural network
Data mining
Multicomponent spectrophotometric determination

Abstract

This paper presented a novel method named OSCWPTRBFN based on the concept of data mining in chemometrics for resolving overlapping spectra. The method combines orthogonal signal correction, wavelet packet transform and radial basis function neural network for enhancing the ability of removing noise and eliminating unrelated information as well as improving the quality of the regression method. OSC was applied to remove structured noise that is unrelated to the concentration variables. Wavelet packet representations of signals provided a local time–frequency description, thus in the wavelet packet domain, the quality of noise removal can be improved. Radial basis function network was applied for overcoming the convergence problem met in back propagation training and for facilitating nonlinear calculation. In this spectrophotometric case, through optimization, the number of OSC components, wavelet function, decomposition level, the number of hidden nodes and the width (σ) of RBFN for OSCWPTRBFN method were selected as 1, Coif 2, 4, 15 and 0.7 respectively. The relative standard errors of prediction (RSEP) for all components with OSCWPTRBFN, WPTRBFN, RBFN, partial least squares (PLS), OSCWPTPLS, principal component regression (PCR), Fourier transform based PCR (FTPCR) and multivariate linear regression (MLR) methods were 6.85, 7.74, 22.0, 10.1, 8.93, 13.5, 13.1, and 2.38×10³% respectively. Experimental results showed that the OSCWPTRBFN method was successful and had advantages over the other approaches. The results obtained from an additional test case, simultaneous differential pulse stripping voltammetric determination of Pb(II), Cd(II) and Ni(II), also demonstrated that the OSCWPTRBFN method performed very well.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

The simultaneous determination of several components in a mixture can be a difficult problem, especially for components that have similar analytical characteristics. The problem of how to distinguish overlapped signals is often encountered in analytical experiments. As a consequence of signal overlapping, the quality of analytical information is lower than that derived from isolated signals; the extent of the loss depends on the extent of the overlap. The overlap of signals can also cause nonlinearities. Since the advent of microcomputers and commercial instrumentation with rapid-scan techniques, considerable amounts of data can be acquired easily and quickly. However, the acquired data are often characterized by complicated structures, contamination by noise and redundancy, and potential collinearity. These problems have stimulated the growth of data mining [1–3], which has become a vigorous, developing field. By using data mining, valuable knowledge is extracted from huge data sets, systematic relationships among variables are discovered, and effective prediction is carried out. Data mining is an integration of ideas and techniques from multiple disciplines, and seems to be one of the most intriguing cross-subjects in information technology. Data mining provides effective ways of analyzing data by first performing denoising, dimension reduction, collinearity and redundancy elimination, data transformation and extraction of feature values to obtain valuable information, and then constructing data models with intelligence methods to make predictions. By using data mining one can analyze full spectra rather than just picking out a few characteristic values. Some of the data mining methods in chemometrics, such as neural computation, domain transform and data analysis, have proven to be extremely useful [4–11].

Eliminating noise and irrelevant information is a critical step in data mining. Wavelet analysis represents a relatively recent mathematical development, and can provide information on local time and frequency scales together [12–14]. Many applications of wavelet analysis have been found in signal processing, image processing, earthquake prediction, acoustics, damage assessment of structures, fluid mechanics and analytical chemistry. The wavelet transform (WT) can be used to convert data from the original domain into the wavelet domain, in which the representation of a signal is sparse and signal denoising is easier to carry out.

Fig. 1. The simple plot of RBFN architecture.

Wavelet packet transform (WPT) is a generalization of WT. WPT was first introduced by Coifman et al. for dealing with the non-stationarities of the data [15,16]. Based on the same denoising idea, WPT can also be applied to signal denoising. The main difference between WT and WPT is that WPT decomposes not only the approximations but also the details. Compared with wavelet analysis, WPT has a better frequency resolution for the decomposed signals, so it is possible to combine different decomposition levels to achieve the optimum time–frequency representation of the original signals. Its flexibility enables WP to capture and characterize locally the most relevant parts of a signal; hence, it is an ideal choice for data compression, relevant information extraction and denoising [17–19].

In order to enhance the predictive ability of multivariate calibration models, raw data are often pre-processed to eliminate irrelevant information prior to calibration [20]. These pre-processing methods may occasionally remove useful information along with the irrelevant information, which usually reduces the predictive ability of the calibration. In order to avoid removing information relevant for prediction, Wold et al. developed a novel pre-processing technique for raw data, called orthogonal signal correction (OSC) [21]. This algorithm aims to discard information in the response matrix D that is mathematically orthogonal and unrelated to the concentration matrix C. The information to be removed is the so-called structured noise, which can come from different sources such as baseline, instrument variation and measurement conditions, interfering physical and chemical processes and other interferences. After Wold et al., several researchers have presented different approaches to OSC [22–30]. The concept behind all OSC algorithms is to run OSC as a filter that discards irrelevant data from the response matrix D, since extraneous information may degrade the predictive ability of calibration models. All the pertinent information in matrix D related to matrix C should remain.

Artificial neural network (ANN) is one of the most widely used mathematical algorithms for modeling both linear and non-linear systems [31,32]. ANN is a form of artificial intelligence that mathematically simulates biological nervous systems. In past years, ANN was introduced to analytical chemistry. The wide applicability of ANN stems from its flexibility and ability to model linear and non-linear systems without prior knowledge of an empirical model. Presently, the most widely used type of ANN is the multilayer feedforward network (MLFN) with the back propagation (BP) algorithm [33]. However, the BP–MLFN method is slow, temporally unstable in training and apt to fall into local minima. Much attention has been paid to solving this problem and to steering the training process toward the global minimum [34]. Radial basis function network (RBFN) is among the most efficient solutions [35]. Due to the localized nature of RBFN, the network can be trained extremely quickly and overcomes the convergence problem [36,37]. A radial basis function (RBF) is incorporated into the hidden layer of RBFN as a nonlinear transfer function to facilitate nonlinear calculation.

The novel approach tested here uses OSC and WPT in combination with RBFN to eliminate noise and extraneous information as well as to enhance the regression quality. This approach is named OSCWPTRBFN, and is based on a radial basis function neural network (RBFN) with orthogonal signal correction (OSC) and wavelet packet transform (WPT) as pre-processing tools to analyze overlapping spectra. In OSC, worthless information that is orthogonal and unrelated to concentration is removed. In WPT, time–frequency localization, the best-basis algorithm and a thresholding operation were used to improve the quality of feature extraction, data compression and denoising. In the RBFN operation, the convergence problem met in BP training was solved and predictive ability was improved. This seems to be the first application of a combined OSC–WPT–RBFN approach to multicomponent spectrophotometric determination. The simultaneous determination of Cu(II), Cd(II) and Zn(II) with xylenol orange and cetyltrimethyl ammonium bromide (CTMAB) using traditional spectrophotometry is challenging because the absorption spectra overlap, and the proposed method was applied to overcome this difficulty. The applicability of the proposed method can be extended to the simultaneous differential pulse stripping voltammetric determination of Pb(II), Cd(II) and Ni(II), where it also proves to be a valid approach. This method can also be applied to other kinds of instrumental analytical data for multivariate calibration.

2. Theory

2.1. Radial basis function networks

RBFN has a special architecture in which it can be presented as a two-layer feed forward structure (one hidden and one output layer). The input layer does not process information; it serves only to distribute the input data among the hidden nodes. Each node on the hidden layer represents a radial basis function (RBF). A general Gaussian kernel function was used in this case. The simple architecture of RBFN is shown in Fig. 1.

The output from the jth Gaussian node for an input object $x_i$ can be calculated by the following equation:

$$h_j(x_i) = h_j(\lVert x_i - c_j \rVert) = \exp\left(-\frac{\lVert x_i - c_j \rVert^2}{2\sigma_j^2}\right) \qquad (1)$$

where the vector $x_i$ holds the ith input data and the vector $c_j$ represents the position of the center of the jth Gaussian function. $\lVert x_i - c_j \rVert$ is the calculated Euclidean distance and the width $\sigma_j$ is a parameter controlling the smoothness properties of the function. The output layer is essentially the same as in MLFN, and normally has a linear activation function. The output is calculated by a linear combination of the radial basis functions plus the bias $w_0$ according to:

$$y(x_i) = \sum_{j=1}^{m} w_j h_j(x_i) + w_0 \qquad (2)$$

where $w_j$ is the jth weight between the hidden and output layers, and m is the number of basis functions.

Eq. (2) can be represented in matrix notation:

$$Y = HW \qquad (3)$$
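As a rough illustration of Eqs. (1) and (2), the hidden-layer outputs and the linear output layer can be sketched in a few lines of Python. The function names, the toy inputs, the weights and the single shared width of 0.7 are assumptions made for this sketch only; they are not taken from the authors' program.

```python
import math

def gaussian_design_matrix(X, centers, sigma):
    """Return H with H[i][j] = exp(-||x_i - c_j||^2 / (2 sigma^2)), as in Eq. (1)."""
    H = []
    for x in X:
        row = []
        for c in centers:
            sq_dist = sum((xk - ck) ** 2 for xk, ck in zip(x, c))
            row.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
        H.append(row)
    return H

def rbfn_output(H, weights, bias):
    """Linear output layer of Eq. (2): y(x_i) = sum_j w_j h_j(x_i) + w_0."""
    return [sum(w * h for w, h in zip(weights, row)) + bias for row in H]

# Hypothetical toy data: three 2-D inputs, two centers, one shared width.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
centers = [[0.0, 0.0], [1.0, 0.0]]
H = gaussian_design_matrix(X, centers, sigma=0.7)
y = rbfn_output(H, weights=[2.0, -1.0], bias=0.5)
```

Each row of `H` corresponds to one input object and each column to one hidden node, which is the layout assumed by the matrix form of Eq. (3) when the bias is handled separately.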


Eq. (3) can be solved for W by using the Moore–Penrose pseudo-inverse:

$$W = (H^{T}H)^{-1}H^{T}Y \qquad (4)$$

Letting $A = H^{T}H$, Eq. (4) can be written more conveniently as:

$$W = A^{-1}H^{T}Y \qquad (5)$$

where $A^{-1}$ is the variance matrix.

In general, the network training includes two steps. One is an unsupervised part that determines the centers c and width σ of the radial nodes and the number of hidden layer nodes. The other is a linear supervised part in which the weights of the output layer are calculated by a standard linear optimization technique. In this paper, the algorithm of forward selection with regularization was adopted to train the RBFN first. The basic idea of regularization is to stabilize the solution by means of an auxiliary nonnegative regularization function, which improves the RBF performance. This method is called weight decay in ANN theory and global ridge regression in applied regression analysis. Weight decay adds a penalty term to the error function. The penalty is usually the sum of squared weights times a regularization constant. Thus the cost function is:

$$C = \sum_{i=1}^{p}\left(\hat{y}_i - \sum_{j=1}^{m} w_j h_j(x_i)\right)^{2} + \sum_{j=1}^{m} \lambda_j w_j^{2} \qquad (6)$$

where $\lambda_j$ is the jth regularization parameter of the additional weight penalty term, p is the number of samples, $\hat{y}_i$ is the ith target value, and $\sum_{j=1}^{m} w_j h_j(x_i)$ is the ith calculated output. The first term is the sum of squared errors (SSE) function and the second term is the weight decay term, which prevents the absolute values of the weights from becoming too large. Eq. (4) can then be modified as follows:

$$W = (H^{T}H + \Lambda)^{-1}H^{T}Y \qquad (7)$$

where Λ is a diagonal matrix with the regularization parameters along its diagonal.

The variance matrix is:

$$A^{-1} = (H^{T}H + \Lambda)^{-1} \qquad (8)$$
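The regularized solution of Eq. (7) can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `ridge_weights`, the random toy matrices and the value 0.5 for the regularization parameter are assumptions, and the linear system is solved directly rather than by forming the inverse explicitly.

```python
import numpy as np

def ridge_weights(H, Y, lam):
    """Solve (H^T H + Lambda) W = H^T Y, with Lambda = diag(lam), as in Eq. (7).

    lam may be a scalar (one shared parameter) or a vector with one entry
    per basis function.
    """
    H = np.asarray(H, dtype=float)
    Y = np.asarray(Y, dtype=float)
    lam_vec = np.ones(H.shape[1]) * np.asarray(lam, dtype=float)
    A = H.T @ H + np.diag(lam_vec)   # the inverse of A is the variance matrix of Eq. (8)
    return np.linalg.solve(A, H.T @ Y)

# Hypothetical toy problem: 10 samples, 3 basis functions, 2 outputs.
rng = np.random.default_rng(0)
H = rng.normal(size=(10, 3))
Y = rng.normal(size=(10, 2))
W_plain = ridge_weights(H, Y, 0.0)   # reduces to the pseudo-inverse solution of Eq. (4)
W_decay = ridge_weights(H, Y, 0.5)   # weight decay shrinks the weights
```

Setting the regularization parameters to zero recovers Eq. (4), while positive values shrink the weights toward zero, which is the effect the weight decay term is meant to have.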

The weight matrix is still calculated by Eq. (5). Forward selection starts with an empty subset and adds one basis function at a time, and thereby finds the subset of all basis functions that decreases the cost function to the greatest extent. The search process continues until the cost function stops decreasing. Forward selection is also used as a procedure to select centers from the training set samples. The centers of the RBF can be determined by random selection methods and by several mathematical algorithms such as the orthogonal least squares (OLS) learning algorithm [38], the genetic algorithm (GA) [39], the Kohonen neural network and k-means clustering. The idea behind the choice of centers c is to group several neighboring points in the input data space and to represent these points by a single radial basis function. The centers c and width σ play much the same role as the mean and standard deviation of a normal distribution. The fraction of overlap between each hidden node and its neighbors is determined by the width σ. The number of nodes in the hidden layer is the number of radial basis functions. In the first step, the training algorithm based on forward selection with weight decay was used to automatically select the number of nodes and the centers in the hidden layer and to adjust the connection weights between the hidden layer and the output layer simultaneously. The parameter selection is very simple: only one parameter (the width σ) has to be optimized by trial and error. In the second step, the output weights are easily obtained by the pseudo-inverse calculation or by singular value decomposition (SVD), since the mapping from the hidden nodes to the output is a linear activation function. The response of each output node is calculated by a traditional linear least-squares regression on the output of the hidden layer. The process is a linear supervised learning step in which the weights of the output layer are estimated by using the knowledge of the pairs of vectors.
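The greedy search described above can be sketched as follows. Several simplifications are assumed here and are not taken from the paper: a single shared regularization constant `lam`, no bias term, a small tolerance `tol` as the test for "the cost function stops decreasing", and candidate basis functions supplied as the columns of a precomputed matrix `H_full`. The identity matrix used as toy data is chosen only so that the outcome is easy to follow.

```python
import numpy as np

def ridge_cost(H, y, lam):
    """Minimum of the Eq. (6) cost for design matrix H with one shared lam."""
    A = H.T @ H + lam * np.eye(H.shape[1])
    w = np.linalg.solve(A, H.T @ y)
    residual = y - H @ w
    return float(residual @ residual + lam * (w @ w))

def forward_select(H_full, y, lam=1e-3, tol=1e-8):
    """Add one candidate column at a time while the cost keeps decreasing."""
    selected = []
    best_cost = float(y @ y)              # cost of the empty model (all outputs zero)
    remaining = list(range(H_full.shape[1]))
    while remaining:
        costs = [ridge_cost(H_full[:, selected + [j]], y, lam) for j in remaining]
        k = int(np.argmin(costs))
        if best_cost - costs[k] <= tol:   # the cost has stopped decreasing
            break
        best_cost = costs[k]
        selected.append(remaining.pop(k))
    return selected, best_cost

# Hypothetical toy problem: five orthogonal candidates, only two of which matter.
H_full = np.eye(5)
y = np.array([0.0, 2.0, 0.0, -1.0, 0.0])
selected, best_cost = forward_select(H_full, y)
```

In this toy case the search picks candidate 1 first (largest cost reduction), then candidate 3, and stops because the remaining candidates do not reduce the cost any further.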

2.2. Orthogonal signal correction

Orthogonal signal correction (OSC) is a data treatment tool that removes information in the D matrix which is orthogonal to matrix C. OSC is performed as a preprocessing step to enhance the calibration model. The OSC algorithm is described below:

(1) The whole set of spectra obtained from the standard mixtures is used to build the experimental data matrix D. Before starting the OSC calculation, mean centering and data standardization are performed. After this operation, each column of the matrix has zero mean and unit variance.

(2) The algorithm starts by calculating the first principal component (PC1) of the training data D to get the score vector t.

(3) Orthogonalize t to matrix C according to Eq. (9):

$$t_{new} = \left(I - C(C^{T}C)^{-1}C^{T}\right)t \qquad (9)$$

where I is the identity matrix.

(4) Calculate a weight vector w that satisfies $Dw = t_{new}$.

(5) Calculate a new score vector from D and w: $t = Dw$. The vector t is then orthogonalized again with respect to C to give $t_{new}$. Steps 3–5 are iterated until the relative difference between $t_{new}$ and t is smaller than the convergence limit.

(6) A loading vector p is then calculated with the help of the score vector:

$$p^{T} = \frac{t^{T}D}{t^{T}t_{new}} \qquad (10)$$

(7) The training data are then corrected by subtracting $tp^{T}$:

$$D_{osc} = D - tp^{T} \qquad (11)$$

where $D_{osc}$ is the filtered matrix, which only contains information relevant to C.

(8) The algorithm is repeated for the next OSC component, using $D_{osc}$ as D in the new OSC calculation; further OSC components are treated in the same way until a satisfactory result is reached.

(9) Mean centering and data standardization are performed for the test samples in the same way as for the training set. A score vector for the test set, $t_{test}$, is calculated using w and p of the calibration model according to Eq. (12):

$$t_{test} = D^{u}w \qquad (12)$$

(10) The test set is corrected by subtracting $t_{test}p^{T}$:

$$D^{u}_{osc} = D^{u} - t_{test}p^{T} \qquad (13)$$

This process continues with the next OSC component by using $D^{u}_{osc}$ as $D^{u}$, then another one, and so on until a satisfactory result is reached.
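The steps listed above for removing one OSC component can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the inputs are taken to be already centered and standardized (step 1), the weight vector of step 4 is obtained as a minimum-norm least-squares solution (the paper does not say how w is computed), and the function names, tolerance and random toy matrices are hypothetical.

```python
import numpy as np

def osc_component(D, C, tol=1e-8, max_iter=100):
    """Remove one OSC component from D (steps 2-7); returns D_osc, w, p."""
    D = np.asarray(D, dtype=float)
    C = np.asarray(C, dtype=float)
    # Step 2: score vector of the first principal component of D.
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    t = U[:, 0] * s[0]
    # Projector of Eq. (9): I - C (C^T C)^{-1} C^T.
    P = np.eye(D.shape[0]) - C @ np.linalg.solve(C.T @ C, C.T)
    for _ in range(max_iter):
        t_new = P @ t                                   # step 3
        w = np.linalg.lstsq(D, t_new, rcond=None)[0]    # step 4 (least-squares assumption)
        t = D @ w                                       # step 5
        if np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new):
            break
    p = D.T @ t / (t @ t_new)                           # Eq. (10)
    D_osc = D - np.outer(t, p)                          # Eq. (11)
    return D_osc, w, p

def osc_apply(D_u, w, p):
    """Correct new samples with a stored component (Eqs. 12 and 13)."""
    t_test = np.asarray(D_u, dtype=float) @ w
    return D_u - np.outer(t_test, p)

# Hypothetical toy data: 8 calibration samples, 20 wavelengths, 2 analytes.
rng = np.random.default_rng(1)
D = rng.normal(size=(8, 20))
C = rng.normal(size=(8, 2))
D_osc, w, p = osc_component(D, C)
D_u_osc = osc_apply(rng.normal(size=(3, 20)), w, p)
```

Further components would be removed by calling `osc_component` again on `D_osc`, and by applying each stored pair `(w, p)` to the test matrix in the same order.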

2.3. WPT and denoising

The discrete wavelet transform (DWT) has been recognized as a natural wavelet transform for discrete time signals, and has the ability to provide information in the time and frequency domains. The fast discrete wavelet transform (FDWT) can be implemented by means of Mallat's pyramid algorithm, which can be viewed as filtering a signal by recursive application of the high-pass and low-pass filters that form a quadrature mirror filter (QMF) pair. The output of the low-pass filter is called the approximations, whereas the output of the high-pass filter is known as the details. The approximations can be used as the data input for the QMF for further wavelet decomposition. The theoretical background of FDWT has been described in detail by the authors [20]. The wavelet packet transform (WPT) is an important extension of the wavelet transform. A wavelet packet $W_{jnk}$ is generated from the base function:

$$W_{jnk}(x) = 2^{-j/2}\,W_n(2^{-j}x - k) \qquad (14)$$

where the indices j, n and k are the scale, oscillation and localization parameters, respectively, with $j, k \in \mathbb{Z}$ and $n = 0, 1, 2, \ldots, 2^{j}-1$. The wavelet packets inherit properties such as orthonormality and time–frequency localization from their corresponding wavelet functions. In general, the wavelet packets are defined by:

$$W_{2n}(x) = \sqrt{2}\sum_{k} h(k)\,W_n(2x - k) \qquad (15)$$

$$W_{2n+1}(x) = \sqrt{2}\sum_{k} g(k)\,W_n(2x - k) \qquad (16)$$

where h(k) and g(k) are the low-pass and high-pass filters, $W_0(x) = \phi(x)$ is the scaling function and $W_1(x) = \psi(x)$ is the wavelet function. ψ and ϕ are used to define the details and the approximations, respectively. The discrete filters h(k) and g(k) are quadrature mirror filters associated with the scaling function and the wavelet function. These filters are used to construct the filter matrices, denoted as H and G. The difference between WT and WPT is the decomposition path. The fast wavelet packet transform (FWPT) algorithm is similar to FDWT, with the application of the QMF to the data followed by downsampling. However, to calculate a WPT, computation proceeds by applying the QMF not only to the low-pass output but also to the high-pass output. The recursion is simply to filter and downsample all outputs of the previous scale level. For the formulas of FWPT and inverse FWPT, please refer to reference [9].
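The recursion described above, in which both the low-pass and the high-pass outputs are filtered and downsampled again, can be sketched in plain Python. The two-tap Haar filter pair is used here purely because it is short enough to write out; it is a stand-in and not the Coif 2 wavelet selected in the paper. The function names and the toy signal are likewise assumptions, the signal length is assumed to be divisible by 2 raised to the decomposition level, and the nodes are indexed in natural (filter-bank) order.

```python
import math

def haar_split(signal):
    """One QMF step: low-pass and high-pass filtering followed by downsampling."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_packet_tree(signal, level):
    """Return {(j, n): coefficients} for every node of the full binary tree."""
    tree = {(0, 0): list(signal)}
    for j in range(level):
        for n in range(2 ** j):
            # Unlike the DWT, every node is split, not only the approximations.
            approx, detail = haar_split(tree[(j, n)])
            tree[(j + 1, 2 * n)] = approx
            tree[(j + 1, 2 * n + 1)] = detail
    return tree

# Hypothetical toy signal of length 8, decomposed to level 2.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
tree = wavelet_packet_tree(signal, 2)
energy_in = sum(v * v for v in signal)
energy_level2 = sum(v * v for n in range(4) for v in tree[(2, n)])
```

Because the Haar filters are orthonormal, the coefficients of any complete level carry the same energy as the original signal, which gives a quick sanity check on the decomposition.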

The wavelet packet denoising procedure includes four steps: (1) WPT, (2) estimation of the best basis, (3) thresholding of the wavelet packet coefficients and (4) reconstruction. The best basis is selected according to the entropy-based criterion proposed by Coifman and Wickerhauser [15]. The Shannon entropy was applied in this case. Compared to DWT, the full WPT binary tree contains redundancy; thus, in order to represent the original signal, a best basis among the different possible bases can be selected from the full binary tree. The search for the best basis involves computing the entropy for each node of the decomposition tree. The tree is pruned from the bottom nodes to the root. In this way, the best basis for the full tree is obtained. The thresholding operation is implemented by the SURE method with hard thresholding. The SURE method was proposed by Donoho [40] and the algorithmic procedures were well described in the literature by B.K. Alsberg [41]. After finding the best basis, the obtained WP coefficients are selected according to their absolute values. Only the coefficients whose absolute values are higher than a predefined threshold value are retained, and those smaller than the selected threshold are eliminated.
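Steps (2) and (3) can be illustrated with a short Python sketch. The additive form of the Shannon entropy used below (minus the sum of c² log c²) is one common convention and is assumed here; the paper does not state which form was used. The threshold is simply passed in as a number, so the SURE estimate itself is not reproduced, and the function names and the tiny one-level tree are hypothetical.

```python
import math

def shannon_cost(coeffs):
    """Additive Shannon entropy cost: -sum(c^2 log c^2), with 0 log 0 taken as 0."""
    return -sum(e * math.log(e) for e in (c * c for c in coeffs) if e > 0.0)

def best_basis(tree, level):
    """Prune from the bottom nodes to the root, keeping the cheaper of parent or children."""
    best = {}  # (j, n) -> (cost, list of selected nodes below and including (j, n))
    for n in range(2 ** level):
        best[(level, n)] = (shannon_cost(tree[(level, n)]), [(level, n)])
    for j in range(level - 1, -1, -1):
        for n in range(2 ** j):
            own = shannon_cost(tree[(j, n)])
            c0, nodes0 = best[(j + 1, 2 * n)]
            c1, nodes1 = best[(j + 1, 2 * n + 1)]
            if own <= c0 + c1:
                best[(j, n)] = (own, [(j, n)])
            else:
                best[(j, n)] = (c0 + c1, nodes0 + nodes1)
    return best[(0, 0)][1]

def hard_threshold(coeffs, thr):
    """Keep coefficients whose absolute value exceeds thr; set the rest to zero."""
    return [c if abs(c) > thr else 0.0 for c in coeffs]

# Hypothetical one-level tree: Haar packets of the signal [1, 1].
tree = {(0, 0): [1.0, 1.0], (1, 0): [math.sqrt(2.0)], (1, 1): [0.0]}
basis = best_basis(tree, 1)
kept = hard_threshold([0.05, -0.8, 0.3, -0.1], 0.2)
```

In this toy tree the children concentrate the signal into a single coefficient, so their combined cost is lower than the parent's and the best basis consists of the two level-1 nodes.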

According to these algorithms, a program called POSCWPTRBFN was designed to perform the simultaneous determination.

3. Experimental

3.1. Apparatus and reagents

A Shimadzu UV-240 spectrophotometer furnished with OPI-2 function was used for all experiments; a Lenovo Pentium IV microcomputer was used for all the calculations; pH measurements were made with a pH-3B digital pH-meter with a glass-saturated calomel dual electrode. All reagents were of analytical reagent grade. The water used was doubly distilled and deionized. Stock standard solutions of 0.25 mol L⁻¹ Cd(II), Zn(II) and 0.10 mol L⁻¹ Cu(II) were prepared from their respective nitrates and standardized according to generally accepted procedures. Standard solutions were then prepared from this stock by serial dilution as required. 1.00 g L⁻¹ xylenol orange, 20.00 g L⁻¹ cetyltrimethyl ammonium bromide (CTMAB) and pH=9.20 borax buffer solution were used.

3.2. Procedures

A series of mixed standard solutions containing various ratios of the three compounds was prepared in 25 mL volumetric flasks. To each flask, 2.50 mL of 1.00 g L⁻¹ xylenol orange solution, 3.00 mL of 20.00 g L⁻¹ CTMAB solution and 5.00 mL of borax buffer solution (pH=9.20) were added, and the mixture was diluted to the mark with distilled water. After 15 min, spectra were measured in 1 cm cuvettes between 500 and 625 nm at 2 nm intervals with respect to a reagent blank. The spectral range of 500–625 nm was selected as the spectral zone that gave the maximum spectral information for determining the components of interest. An absorption matrix for calibration (D) was built up from these data. According to the same procedures, an absorption matrix for prediction (Du) was built up too.

4. Results and discussion

Absolute and relative standard errors of prediction (SEP and RSEP) were used as the criteria to compare the performances of the test methods. The SEP for a single component is given by Eq. (17); that for all components by Eq. (18). The RSEP is given by Eq. (19) [42].

$$\mathrm{SEP} = \sqrt{\frac{\sum_{j=1}^{m}\left(C_{ij} - \hat{C}_{ij}\right)^{2}}{m}} \qquad (17)$$

$$\mathrm{SEP} = \sqrt{\frac{\sum_{i=1}^{n}\sum_{j=1}^{m}\left(C_{ij} - \hat{C}_{ij}\right)^{2}}{nm}} \qquad (18)$$

$$\mathrm{RSEP} = \sqrt{\frac{\sum_{i=1}^{n}\sum_{j=1}^{m}\left(C_{ij} - \hat{C}_{ij}\right)^{2}}{\sum_{i=1}^{n}\sum_{j=1}^{m} C_{ij}^{2}}} \qquad (19)$$

where $C_{ij}$ and $\hat{C}_{ij}$ are the actual and estimated concentrations, respectively, for the ith component in the jth mixture, m is the number of mixtures, and n is the number of components.
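The error criteria of Eqs. (17)–(19) translate directly into code. The sketch below uses plain Python lists indexed as `C[i][j]` (component i, mixture j); the function names and the toy concentrations are made up for illustration. Eq. (19) is computed as a fraction, so it would need to be multiplied by 100 to be compared with values quoted in percent.

```python
import math

def sep_component(C, C_hat, i):
    """Eq. (17): standard error of prediction for component i over m mixtures."""
    m = len(C[i])
    return math.sqrt(sum((C[i][j] - C_hat[i][j]) ** 2 for j in range(m)) / m)

def sep_total(C, C_hat):
    """Eq. (18): standard error of prediction over all n components and m mixtures."""
    n, m = len(C), len(C[0])
    sq = sum((C[i][j] - C_hat[i][j]) ** 2 for i in range(n) for j in range(m))
    return math.sqrt(sq / (n * m))

def rsep_total(C, C_hat):
    """Eq. (19): relative standard error of prediction, as a fraction."""
    n, m = len(C), len(C[0])
    sq_err = sum((C[i][j] - C_hat[i][j]) ** 2 for i in range(n) for j in range(m))
    sq_ref = sum(C[i][j] ** 2 for i in range(n) for j in range(m))
    return math.sqrt(sq_err / sq_ref)

# Hypothetical values: 2 components (rows) in 2 mixtures (columns).
C = [[1.0, 2.0], [3.0, 4.0]]
C_hat = [[1.1, 1.9], [3.0, 4.2]]
sep_first = sep_component(C, C_hat, 0)
sep_all = sep_total(C, C_hat)
rsep_all = rsep_total(C, C_hat)
```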

4.1. The absorption spectra of Cu(II), Cd(II) and Zn(II)

Fig. 2 shows the absorption spectra of Cu(II) (0.80×10⁻⁵ mol L⁻¹), Cd(II) (2.00×10⁻⁵ mol L⁻¹) and Zn(II) (2.00×10⁻⁵ mol L⁻¹) with xylenol orange and CTMAB as reagents in pH=9.20 borax buffer solution. Experimental conditions were the same as those described in the experimental procedures. As can be seen from Fig. 2, the spectra exhibited strongly overlapped bands in their absorbing regions.

4.2. Optimization of experimental conditions

The optimal experimental conditions for this system were selected experimentally. The effect of pH on this experimental system was studied over the pH range 6.00–11.00. In this medium, the mixture of Cu(II), Cd(II) and Zn(II) showed maximum absorbance in the pH range 8.75–9.25. Therefore, a buffer solution (pH=9.20) was used in this study. The influence of xylenol orange was also investigated. A volume of 2.50 mL of 1.00 g L⁻¹ xylenol orange solution was chosen because it ensured maximal and constant absorbance. The effects of surfactant concentration and of the amount of buffer solution on the absorbance of the system were investigated, and the optimum amounts of CTMAB and buffer solution were selected. The optimal experimental conditions selected were the same as those described in the experimental procedure section. Under the experimental conditions selected, the absorbance of the ternary mixture remained strong and stable over 15–120 min.

Fig. 2. Absorption spectra of Cu(II), Cd(II) and Zn(II).

4.3. WPT and WP denoising

In order to evaluate the denoising ability of the method, we selected the mean spectrum of matrix D as the original signal f. WPT was applied to the signal f using the FWPT algorithm. Some of the WP coefficients obtained by FWPT are shown in Fig. 3. Each block of coefficients is identified by the pair of indices (j, n), where j is the level of decomposition and n is the position of the block at that level. From Fig. 3, it is obvious that the w(j, 0) coefficients are entirely positive and their waveforms are similar to the original spectra. The other coefficients have both positive and negative parts. Each block of coefficients describes the components of the signal f related to a certain frequency band. Fig. 3 shows that the WPT can be viewed as a complete binary tree structure. The information in the WP is redundant. In general, the information in a child level overlaps with that in its parent level, for example, w(1, 0) with w(2, 0) + w(2, 1), or w(1, 0) with w(3, 0) + w(3, 1) + w(3, 2) + w(3, 3). In order to represent the original signal, the best basis is preferably selected from the full binary tree to ensure that no redundant information is included.

Fig. 3. Some of the WP coefficients obta

The best basis covers the complete horizontal extent of the tree without including vertically overlapping blocks; it can therefore be composed from a single decomposition level or a combination of levels to achieve the optimum time–frequency representation of the original signal. After finding the best basis, a threshold was applied to the WP coefficients. Only the wavelet coefficients whose absolute values are higher than a predefined threshold value are retained. This flexible time–frequency resolution enables WP to characterize locally the most relevant parts of a signal. In spectrophotometric measurements, the analytical signals are usually concentrated in the low-frequency parts, whereas the noise lies in the high-frequency parts. In the WP domain, denoising was implemented by best-basis selection and a thresholding operation. Finally, by applying the inverse FWPT, the denoised WP coefficients were converted back into the original domain.
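The three steps described above (best-basis selection, thresholding, inverse transform) can be sketched as follows. This is only an illustration under stated assumptions: the Haar filter replaces the wavelet functions used in this work, the additive cost function is the l1 norm (the cost function is not specified here), hard thresholding is used, and the test signal is hypothetical and deliberately chosen so that Haar separates the alternating noise exactly.

```python
import math

SQRT2 = math.sqrt(2.0)

def split(block):
    """Orthonormal Haar split of one block into low-pass and high-pass children."""
    low = [(block[i] + block[i + 1]) / SQRT2 for i in range(0, len(block), 2)]
    high = [(block[i] - block[i + 1]) / SQRT2 for i in range(0, len(block), 2)]
    return low, high

def merge(low, high):
    """Inverse of split: rebuild the parent block from its children."""
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def l1_cost(block):
    """Additive cost (assumed here); a smaller value means a sparser block."""
    return sum(abs(c) for c in block)

def best_basis(block, levels_left):
    """Return (cost, tree); tree is ('leaf', coeffs) or ('node', low_tree, high_tree)."""
    own_cost = l1_cost(block)
    if levels_left == 0 or len(block) < 2:
        return own_cost, ("leaf", block)
    low, high = split(block)
    low_cost, low_tree = best_basis(low, levels_left - 1)
    high_cost, high_tree = best_basis(high, levels_left - 1)
    if low_cost + high_cost < own_cost:      # children are cheaper: keep the split
        return low_cost + high_cost, ("node", low_tree, high_tree)
    return own_cost, ("leaf", block)         # parent is cheaper: stop here

def threshold(tree, t):
    """Hard thresholding: keep only coefficients whose magnitude exceeds t."""
    if tree[0] == "leaf":
        return ("leaf", [c if abs(c) > t else 0.0 for c in tree[1]])
    return ("node", threshold(tree[1], t), threshold(tree[2], t))

def reconstruct(tree):
    """Inverse transform from the selected basis back to the original domain."""
    if tree[0] == "leaf":
        return list(tree[1])
    return merge(reconstruct(tree[1]), reconstruct(tree[2]))

def denoise(signal, levels, t):
    _, tree = best_basis(list(signal), levels)
    return reconstruct(threshold(tree, t))

# Hypothetical piecewise-constant signal plus alternating high-frequency noise.
clean = [1.0] * 4 + [3.0] * 4 + [2.0] * 4 + [0.0] * 4
noisy = [c + 0.05 * (-1) ** k for k, c in enumerate(clean)]
denoised = denoise(noisy, levels=3, t=0.2)
worst = max(abs(d - c) for d, c in zip(denoised, clean))
print(worst)
```

In this toy case the noise ends up in a few small coefficients of the high-frequency branch, which fall below the threshold, while the large low-frequency coefficients survive. Real spectra do not separate this cleanly, so the threshold value has to be tuned.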

4.4. Visual comparison of OSC, WPT and OSC–WPT pretreatment methods

In order to enhance the predictive power of multivariate calibration methods, spectral data are often corrected prior to data analysis. In this case, we selected three pre-processing methods (OSC, WPT and OSC–WPT) for comparison. OSC is a pre-processing tool that discards the information in matrix D that is not essential to concentration, and applies a signal correction that does not remove relevant information from the raw data. The second pretreatment tool is WPT, which is used for data compression, relevant information extraction and denoising based on its property of dual localization in both scale (frequency) and position (time). The third tool is a hybrid method called OSC–WPT, which applies OSC first and WPT afterwards. The hybrid inherits the properties of both original methods and eliminates not only noise and background information but also other information irrelevant to concentration. In order to visually inspect the filtering effects, the three pre-processing methods (OSC, WPT and OSC–WPT) were performed. The raw spectra, the spectra corrected by each of the three methods, and the difference between the raw and the corrected spectra of test set Du were calculated and are displayed in Fig. 4.

The OSC correction is illustrated in the first column of Fig. 4. The raw spectra were pre-processed by OSC and one OSC component was removed. By comparing the spectra before and after the OSC correction, we can observe that OSC did not change the shape of the spectra and that the difference between the OSC-corrected spectra and the raw signals changed slightly. In the OSC-corrected data, the 2nd–6th spectra overlap and are clearly concentrated into one group. After OSC correction, some extraneous information was eliminated, especially in the region of wavelength index 1–45.

The WPT correction is illustrated in the second column of Fig. 4. The raw spectra were pre-processed by WPT with the Coif 2 wavelet function and decomposition level = 4. Compared with the spectra before the WPT correction, the shape of the spectra after the WPT correction stayed almost the same except that some small peaks were eliminated, and the difference between the WPT-corrected spectra and the raw signals did not change noticeably. After WPT correction, some irrelevant information was removed, especially in the region of wavelength index of about 30–61.

The third column of Fig. 4 illustrates the OSC–WPT correction. The raw spectra were pre-processed by OSC–WPT and one OSC component was removed. By comparing the spectra before and after the OSC–WPT correction, we observed that OSC–WPT did not change the shape of the spectra except that some small peaks were removed, and that there was a small change in the difference between the OSC–WPT-corrected spectra and the raw signals. In the OSC–WPT-corrected data, the 2nd–6th spectra overlap and are clearly concentrated into one group. After OSC–WPT correction, extraneous information was removed over the whole range of the wavelength index and some small peaks were also eliminated. A significant amount of unrelated information and noise was taken out, especially some small peaks caused by random variation in the spectra. The hybrid OSC–WPT thus combines the advantages of the two methods and can efficiently remove both the systematic structured variation that is independent of concentration and the random variation. OSC–WPT can therefore be applied as a preprocessor in the OSCWPTRBFN method.
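The idea of removing one component that is orthogonal to concentration can be sketched as below. This is a simplified, non-iterative variant written only for illustration and is not claimed to be the OSC algorithm used in this work: the first principal-component score vector of the centred spectra is made orthogonal to the centred concentrations and then subtracted together with its loading. It corrects the calibration data only, and the data, the function names and the single-analyte setting are all hypothetical.

```python
import math

def center(rows):
    """Column-centre a matrix given as a list of rows."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]
    return [[r[j] - means[j] for j in range(len(r))] for r in rows]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def first_pc_scores(X, iterations=200):
    """Scores of the first principal component of centred X (power iteration)."""
    p = len(X[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        t = [dot(row, v) for row in X]                                      # t = X v
        v = [sum(X[i][j] * t[i] for i in range(len(X))) for j in range(p)]  # v = X^T t
        norm = math.sqrt(dot(v, v))
        v = [x / norm for x in v]
    return [dot(row, v) for row in X]

def orthogonalize(t, Y):
    """Remove from t everything lying in the column space of Y (Gram-Schmidt)."""
    basis = []
    for j in range(len(Y[0])):
        col = [row[j] for row in Y]
        for b in basis:
            c = dot(col, b)
            col = [x - c * bi for x, bi in zip(col, b)]
        norm = math.sqrt(dot(col, col))
        if norm > 1e-12:
            basis.append([x / norm for x in col])
    for b in basis:
        c = dot(t, b)
        t = [x - c * bi for x, bi in zip(t, b)]
    return t

def remove_one_osc_component(X, Y):
    """Return (corrected X, removed score vector t, centred Y)."""
    Xc, Yc = center(X), center(Y)
    t = orthogonalize(first_pc_scores(Xc), Yc)
    tt = dot(t, t)
    loading = [sum(Xc[i][j] * t[i] for i in range(len(Xc))) / tt
               for j in range(len(Xc[0]))]
    corrected = [[Xc[i][j] - t[i] * loading[j] for j in range(len(loading))]
                 for i in range(len(Xc))]
    return corrected, t, Yc

# Hypothetical data: one analyte with a spectral profile plus a baseline drift.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
drift = [0.5, -0.3, 0.8, -0.6, 0.2, -0.4]
profile = [1.0, 0.8, 0.3, 0.1]
X = [[yi * s + 5.0 * di for s in profile] for yi, di in zip(y, drift)]
Y = [[yi] for yi in y]

corrected, t, Yc = remove_one_osc_component(X, Y)
print(dot(t, [row[0] for row in Yc]))  # close to zero: t carries no information on y
```

The removed score vector is orthogonal to the concentrations by construction, so subtracting it cannot discard variation that is linearly related to concentration.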

Fig. 4. Visual comparison of three preprocessed techniques for prediction set.

4.5. OSCWPTRBFN

Obtaining a reliable result with the OSCWPTRBFN method depends strongly on a judicious choice of the relevant parameters. Five parameters were optimized in the OSCWPTRBFN method: the wavelet function, the decomposition level (L), the number of OSC factors, the number of hidden nodes, and the width σ of the RBFN. The first two are associated with WPT, the next one is required by OSC and the last two are demanded by RBFN. Each of the wavelet functions has different characteristics. The wavelet function that is optimal for a given signal is not necessarily the best for another type of signal. Thus, the choice of the wavelet function is strongly problem-dependent. The predictive parameters SEP and RSEP can be used to find the optimum choice of function. In this case, the wavelet functions tested were Coiflet (Coif) 1,2…5, Daubechies 4,6…20, and Symmlet 4,5…8. The SEP and RSEP of total components were computed. Coif 2 outperformed the others, so Coif 2 was selected in this study. The influence of the decomposition level L is also important, thus tests were performed using different L levels. The SEP and RSEP of total components depended strongly on the decomposition level. In this case, L = 4 was selected.

In order to obtain an OSCWPTRBFN regression model with good predictive ability, the optimal number of OSC components needs to be chosen. OSC components are removed from the raw spectra by subtracting insignificant information that is orthogonal to concentration. The number of OSC components is equal to the number of successive cycles of the OSC approach over the raw spectra. There is no theoretical rule concerning the choice of the number of OSC components, since this parameter depends on the complexity of the problem. The selection of the number of OSC components was empirical; one to three OSC components were tested in this case. The lowest RSEP value was obtained with one OSC component. If more than one OSC component was removed, the values of RSEP increased and overfitting may occur. The other two parameters that need to be optimized are the width σ of the RBF and the number of nodes in the hidden layer. By the optimization described above, Coif 2, L = 4, number of OSC components = 1, number of hidden nodes = 15, and width σ of RBF = 0.7 were selected as the optimal parameters.


Table 1
Composition of the training set.

Sample   Concentration (10⁻⁶ mol L⁻¹)      Sample   Concentration (10⁻⁶ mol L⁻¹)
no.      Cu(II)   Cd(II)   Zn(II)          no.      Cu(II)   Cd(II)   Zn(II)
1        4.000    4.000    6.000           9        12.00    4.000    18.00
2        4.000    5.600    12.00           10       12.00    5.600    24.00
3        4.000    7.200    18.00           11       12.00    7.200    6.000
4        4.000    9.600    24.00           12       12.00    9.600    12.00
5        8.000    4.000    12.00           13       16.00    4.000    24.00
6        8.000    5.600    6.000           14       16.00    5.600    18.00
7        8.000    7.200    24.00           15       16.00    7.200    12.00
8        8.000    9.600    18.00           16       16.00    9.600    6.000
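Table 1 follows a four-level orthogonal array design, which means that for any two metals every combination of their levels occurs equally often. The short check below verifies this balance for the concentrations listed in Table 1; the function name is illustrative and the check is a generic one, not part of the authors' procedure.

```python
from collections import Counter
from itertools import combinations

# Table 1 training-set concentrations (10^-6 mol/L): (Cu, Cd, Zn) per sample.
design = [
    (4.0, 4.0, 6.0), (4.0, 5.6, 12.0), (4.0, 7.2, 18.0), (4.0, 9.6, 24.0),
    (8.0, 4.0, 12.0), (8.0, 5.6, 6.0), (8.0, 7.2, 24.0), (8.0, 9.6, 18.0),
    (12.0, 4.0, 18.0), (12.0, 5.6, 24.0), (12.0, 7.2, 6.0), (12.0, 9.6, 12.0),
    (16.0, 4.0, 24.0), (16.0, 5.6, 18.0), (16.0, 7.2, 12.0), (16.0, 9.6, 6.0),
]

def is_orthogonal(rows):
    """True if, for every pair of factors, each level combination occurs equally often."""
    n_factors = len(rows[0])
    for a, b in combinations(range(n_factors), 2):
        counts = Counter((r[a], r[b]) for r in rows)
        n_combos = len({r[a] for r in rows}) * len({r[b] for r in rows})
        if len(counts) != n_combos:
            return False                      # some level combination is missing
        if any(c * n_combos != len(rows) for c in counts.values()):
            return False                      # combinations are not equally frequent
    return True

print(is_orthogonal(design))
```

With 16 samples and four levels per metal, each of the 16 level pairs for any two metals appears exactly once, so the concentration effects of the three metals are not confounded in the calibration set.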


A training set of 16 samples formed from mixtures of Cu(II), Cd(II), and Zn(II) was designed according to a four-level orthogonal array design with the L16(4⁵) matrix in order to obtain maximum information. Table 1 summarizes the composition of the training set. Experimental data obtained from the training set were arranged in matrix D, where each row corresponded to the absorbances of different mixtures at a given wavelength and each column represented the spectrum obtained from a given mixture. A set of nine synthetic unknown samples was measured in the same way as the training set and arranged in matrix Du. Before starting the OSCWPTRBFN calculation, mean centering and data standardization were performed as preprocessing. In the program POSCWPTRBFN, the OSC–WPT operation was performed first, treating each spectrum for a given mixture in turn. Using the same method, each row vector of matrices D and Du was corrected by OSC to remove structured noise not related to concentration, denoised by best-basis selection and a thresholding operation in the WP domain, and then reconstructed by applying the inverse FWPT operation. Secondly, the reconstructed spectra served as the input of the RBFN, which can efficiently perform multivariate calibration. Lastly, the concentrations of Cu(II), Cd(II) and Zn(II) for the test set were calculated. Actual concentrations and recoveries of Cu(II), Cd(II) and Zn(II) are listed in Table 2. The experimental results showed that the SEP and RSEP for all components were 0.765 (10⁻⁶ mol L⁻¹) and 6.85%. All the experimental values in Table 2 were means of three replicates.
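The calibration step can be pictured with a minimal radial basis function network: a layer of Gaussian hidden nodes of width σ followed by a linear output layer. The sketch below makes several assumptions that are not taken from this work: the Gaussian is written as exp(−d²/(2σ²)), the centres are simply placed on the training points, the output weights are obtained by regularized least squares with a hypothetical `ridge` parameter, there is no bias term and only one output, and the training data are a toy one-dimensional function instead of spectra.

```python
import math

def gaussian(x, c, sigma):
    """Response of one hidden node centred at c to input x."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, c))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def train_rbfn(X, y, centers, sigma, ridge=1e-8):
    """Fit the linear output weights on the hidden-layer responses (normal equations)."""
    H = [[gaussian(x, c, sigma) for c in centers] for x in X]
    k = len(centers)
    A = [[sum(H[i][a] * H[i][b] for i in range(len(X))) + (ridge if a == b else 0.0)
          for b in range(k)] for a in range(k)]
    rhs = [sum(H[i][a] * y[i] for i in range(len(X))) for a in range(k)]
    return solve(A, rhs)

def predict(x, centers, sigma, weights):
    """Network output: weighted sum of the hidden-node responses."""
    return sum(w * gaussian(x, c, sigma) for w, c in zip(weights, centers))

# Toy one-dimensional training data (hypothetical), five hidden nodes, sigma = 0.7.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [math.sin(x[0]) for x in X]
weights = train_rbfn(X, y, centers=X, sigma=0.7)
fitted = [predict(x, X, 0.7, weights) for x in X]
print(max(abs(f - t) for f, t in zip(fitted, y)))
```

Only the output weights are estimated, and they follow from a linear least-squares problem, so no iterative back-propagation training is required. In the application described here, each input would be a reconstructed spectrum and there would be one output per metal ion.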

4.6. Comparison of the OSCWPTRBFN with seven other chemometric methods

In order to evaluate the OSCWPTRBFN method, OSCWPTRBFN, WPTRBFN, RBFN, partial least squares (PLS) [10,43], OSCWPTPLS, principal component regression (PCR) [44], Fourier transform based PCR (FTPCR) [45] and multivariate linear regression (MLR) [46] were tested in the study with a set of synthetic unknown samples. The SEP and RSEP for the eight methods are given in Table 3. The RSEP values for all components calculated by the OSCWPTRBFN, WPTRBFN, RBFN, OSCWPTPLS, PLS, FTPCR, PCR and MLR methods were 6.85, 7.74, 22.0, 8.93, 10.1, 13.1, 13.5, and 2.89 × 10⁵ % respectively.

The MLR method failed in this case; the RSEP for all components was 2.89 × 10⁵ % and the condition number of the calibration matrix was 2.38 × 10³. The condition number can be used as a measure of the selectivity when analyzing multicomponent systems. As the condition number gets larger, the concentration error can also get larger. These poor prediction results demonstrated that the MLR method was unable to predict in this case. The problem here is that collinearity is present because the data acquired possess a high level of redundancy.

In order to eliminate collinearity, the PCR, FTPCR and PLS methods were each used to perform a simultaneous determination. FTPCR combines FFT with PCR. The Fourier transform acts as a filter, since high-frequency noise can be deleted. The PLS algorithm is built on the properties of the nonlinear iterative partial least squares (NIPALS) algorithm. The algorithm has been described in detail [10] by the authors of this paper. The same principle is behind all three methods: compress the original data into a smaller number of principal components or latent variables, which have the advantage of being uncorrelated. The theoretical background of these three methods has been described in detail previously [10,43–46]. Determination of the number of factors is one of the most important steps in factor-based methods. The essence of this step is the pseudo-rank determination of the raw experimental data [47]. Three principal factors were selected for the present case based on previously reported methods [11]. The RSEP values for all components obtained using PCR, FTPCR and PLS were 13.5, 13.1, and 10.1% respectively. These results indicate that PLS gives better results than either PCR or FTPCR.

The PLS method can decompose both the absorbance and concentration data, and is commonly known as one of the best multivariate calibration methods for linear systems. In this case the PLS method gives much better results than the RBFN method. However, the PLS method did not perform as well as OSCWPTRBFN and WPTRBFN in this test. The WPTRBFN method combines the idea of WPT denoising with RBFN. OSCWPTRBFN is a method that combines the advantages of the three techniques: it effectively reduces model complexity with OSC and WPT as pre-processing tools and improves the performance of the regression. OSCWPTPLS is an approach based on PLS regression with OSC and WPT as pre-processing tools. Experimental results demonstrated that the OSCWPTRBFN and WPTRBFN methods had clear superiority over the original RBFN method and are even better than the PLS method. Table 3 shows that OSCWPTRBFN generates more satisfactory results than the other methods and is better than the OSCWPTPLS method. The hybrid OSC–WPT pre-processing method is more effective for RBFN than for PLS. The main drawback of ANN is that it suffers severely from the “curse of dimensionality” [48]. ANN is not able to ignore irrelevant input and has shown difficulty in handling input data that are noisy and have a high level of redundancy. After pre-processing by the hybrid OSC–WPT method, the performance of RBFN is improved significantly.

Table 2
Actual concentration and percentage recovery of the unknowns.

Sample   Actual concentration          Recovery (%)
no.      (10⁻⁶ mol L⁻¹)                OSCWPTRBFN                WPTRBFN                   RBFN
         Cu(II)   Cd(II)   Zn(II)      Cu(II)  Cd(II)  Zn(II)    Cu(II)  Cd(II)  Zn(II)    Cu(II)  Cd(II)  Zn(II)
1        6.000    4.800    8.000       111.0   89.7    98.0      117.4   94.8    90.1      115.7   94.3    91.7
2        6.000    6.400    14.00       137.3   82.6    92.0      137.8   94.1    86.5      185.8   135.2   47.1
3        6.000    8.000    20.00       127.1   80.6    99.6      122.6   86.6    98.6      187.3   95.5    75.6
4        10.00    4.800    14.00       100.6   96.5    100.8     111.3   93.8    94.1      118.0   81.6    93.5
5        10.00    6.400    20.00       93.1    108.8   100.6     100.2   108.4   97.2      127.0   73.2    95.1
6        10.00    8.000    8.000       97.7    99.6    103.3     98.8    96.8    104.7     100.5   95.8    103.5
7        14.00    4.800    20.00       102.1   103.2   97.7      109.9   82.7    97.3      110.5   79.9    97.5
8        14.00    6.400    8.000       98.3    108.5   96.1      102.1   93.1    101.7     110.1   98.6    83.4
9        14.00    8.000    14.00       104.9   97.9    96.3      96.8    102.0   102.1     101.4   94.1    102.0
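As a worked check, the RSEP values and the total SEP reported for OSCWPTRBFN can be recomputed from the actual concentrations and recoveries in Table 2. The definitions used below are the commonly used ones and are an assumption here: SEP is the root mean squared prediction error over all predictions, and RSEP is 100 times the square root of the summed squared errors divided by the summed squared actual concentrations. The function names are illustrative.

```python
import math

# Table 2: actual concentrations (10^-6 mol/L) and OSCWPTRBFN recoveries (%).
actual = {
    "Cu": [6.0, 6.0, 6.0, 10.0, 10.0, 10.0, 14.0, 14.0, 14.0],
    "Cd": [4.8, 6.4, 8.0, 4.8, 6.4, 8.0, 4.8, 6.4, 8.0],
    "Zn": [8.0, 14.0, 20.0, 14.0, 20.0, 8.0, 20.0, 8.0, 14.0],
}
recovery = {
    "Cu": [111.0, 137.3, 127.1, 100.6, 93.1, 97.7, 102.1, 98.3, 104.9],
    "Cd": [89.7, 82.6, 80.6, 96.5, 108.8, 99.6, 103.2, 108.5, 97.9],
    "Zn": [98.0, 92.0, 99.6, 100.8, 100.6, 103.3, 97.7, 96.1, 96.3],
}

def squared_errors(metal):
    """Squared prediction errors, with predicted = actual * recovery / 100."""
    return [(c * r / 100.0 - c) ** 2 for c, r in zip(actual[metal], recovery[metal])]

def rsep(metals):
    """Relative standard error of prediction in percent (assumed definition)."""
    num = sum(sum(squared_errors(m)) for m in metals)
    den = sum(c ** 2 for m in metals for c in actual[m])
    return 100.0 * math.sqrt(num / den)

def sep(metals):
    """Standard error of prediction over all listed predictions (assumed definition)."""
    num = sum(sum(squared_errors(m)) for m in metals)
    count = sum(len(actual[m]) for m in metals)
    return math.sqrt(num / count)

everything = ["Cu", "Cd", "Zn"]
print(round(rsep(everything), 2), round(sep(everything), 3))
for metal in everything:
    print(metal, round(rsep([metal]), 2))
```

Within the rounding of the tabulated recoveries, these definitions reproduce the reported total SEP (0.765) and the total and per-component RSEP values for OSCWPTRBFN. The per-component SEP values in Table 3 are not reproduced by this SEP definition, so they presumably rely on a different normalization.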


Table 3
SEP and RSEP values for Cu(II), Cd(II) and Zn(II) system by the eight methods.

Method        SEP (10⁻⁶ mol L⁻¹)                                  RSEP (%)
              Cu(II)     Cd(II)     Zn(II)     Total components   Cu(II)     Cd(II)     Zn(II)     Total components
OSCWPTRBFN    1.75       1.24       0.808      0.765              9.62       10.9       3.15       6.85
WPTRBFN       1.96       0.948      1.40       0.863              10.8       8.38       5.44       7.74
RBFN          4.81       1.85       5.26       2.46               26.4       16.4       20.5       22.0
OSCWPTPLS     1.36       0.83       0.67       1.00               12.9       12.7       4.52       8.93
PLS           1.50       0.854      0.560      1.05               14.4       13.1       4.31       10.1
FTPCR         1.10       1.40       1.52       1.35               10.5       21.5       11.7       13.1
PCR           1.24       1.421      1.50       1.39               11.9       21.7       11.6       13.5
MLR           4.26×10⁴   2.73×10⁴   6.07×10⁴   2.99×10⁴           2.67×10⁵   2.75×10⁵   3.06×10⁵   2.89×10⁵


4.7. Quantitative comparison of OSC, WPT and OSC–WPT pretreatment methods

In order to enhance the predictive power of multivariate calibration methods, spectral data are often corrected prior to data analysis. In this case, we selected three pre-processing methods (OSC, WPT and OSC–WPT) for comparison. For the quantitative comparison, the RBFN method was selected as the calibration model. Four methods (OSCWPTRBFN, WPTRBFN, OSCRBFN and RBFN) were tested with a set of synthetic unknown samples. The SEP and RSEP for the four methods are given in Table 4. The RSEP values for all components obtained using OSCWPTRBFN, WPTRBFN, OSCRBFN and RBFN were 6.85, 7.74, 8.77 and 22.0% respectively. From Table 4, it can be seen that all three pretreatment methods enhance the predictive power significantly. The hybrid OSC–WPT pre-processing method combines the advantages of each method and is better than WPT or OSC alone, and is therefore the focus of our study. Future research will be devoted to creating an improved OSC–WPT pre-processing method that combines WPT with genetic algorithms or simulated annealing methods in order to improve the thresholding technique, and that combines OSC with other chemometric techniques to enhance the ability to remove structured noise.

4.8. Additional case: simultaneous differential pulse stripping voltammetric determination of Pb(II), Cd(II) and Ni(II)

A BAS-100A Electrochemical Analyzer (Bioanalytical Systems, Inc.) was used for all experiments. A PAR 303A static mercury drop electrode (Princeton Applied Research) was used as the working electrode, a Pt wire as the counter electrode and Ag/AgCl/saturated KCl as the reference electrode. A training set of 9 samples formed by mixing Pb(II), Cd(II) and Ni(II) was designed according to a three-level orthogonal array design with the L9(3⁴) matrix. The mixed solutions containing various ratios of Pb(II), Cd(II) and Ni(II) were prepared in 100 mL standard flasks; 5.00 mL of Hg(II) buffer was added, and the solutions were then diluted to the mark with NaCl–sodium acetate (NaAc) buffer solution (pH = 6.00). The solutions were placed in the voltammetric cell and purged with nitrogen for 300 s; nitrogen was passed over the solution surface throughout this operation. Differential pulse stripping voltammograms (DPSV) were recorded between −700 mV and −400 mV at 4.0 mV intervals. The whole set of voltammograms obtained for the nine standard mixtures was used to build up the matrix D. Using the same procedures, a Du matrix for nine synthetic validation samples was built up. Experimental conditions used for DPSV were as follows: scan rate, 20 mV/s; pulse amplitude, 50 mV; sample width, 20 ms; pulse width, 60 ms; pulse period, 200 ms; deposition time, 360 s; deposition potential, −1300 mV; still time, 15 s. For the DOSCWPTRBFN method, number of DOSC components = 1, wavelet function: Daubechies 8, decomposition level L = 3, number of hidden nodes = 8 and width (σ) of RBFN = 0.7 were selected as the optimal parameters after trials.

Using the four methods, comprising three kinds of RBFN methods and PLS, the concentrations of Pb(II), Cd(II) and Ni(II) in the validation set were calculated. The SEP and RSEP values for the four methods are displayed in Table 5. RSEP values for all components using DOSCWPTRBFN, WPTRBFN, RBFN, and PLS were 4.38, 5.10, 6.89, and 5.65%, respectively. The results demonstrate that the OSCWPTRBFN and WPTRBFN methods had clear superiority over the original RBFN method and delivered better results than did the PLS method. Although the PLS method gives somewhat better results than the RBFN method, the DOSCWPTRBFN method had the best performance among the four methods and was successful at simultaneous voltammetric determination of overlapping peaks, which shows that it is a promising technique. Comparing Table 5 with Table 3, we find that the performance of RBFN is much better for the voltammetric data than for the spectrophotometric data. The advantage of OSCWPTRBFN stimulates our interest in ANN techniques. It is clear that intelligent computational methods such as ANN and support vector machines (SVM) [49] can provide another way for simultaneous multicomponent determination.

Table 4
SEP and RSEP values for Cu(II), Cd(II) and Zn(II) system by the four methods.

Method        SEP (10⁻⁶ mol L⁻¹)                               RSEP (%)
              Cu(II)   Cd(II)   Zn(II)   Total components      Cu(II)   Cd(II)   Zn(II)   Total components
OSCWPTRBFN    1.75     1.24     0.808    0.765                 9.62     10.9     3.15     6.85
WPTRBFN       1.96     0.948    1.40     0.863                 10.8     8.38     5.44     7.74
OSCRBFN       1.74     1.31     1.96     0.978                 9.56     11.6     7.63     8.77
RBFN          4.81     1.85     5.26     2.46                  26.4     16.4     20.5     22.0

Table 5
SEP and RSEP values for Pb(II), Cd(II) and Ni(II) system by the four methods.

Method        SEP (10⁻⁷ mol L⁻¹)                               RSEP (%)
              Pb(II)   Cd(II)   Ni(II)   Total components      Pb(II)   Cd(II)   Ni(II)   Total components
OSCWPTRBFN    0.099    0.105    0.035    0.049                 4.06     4.63     5.51     4.38
WPTRBFN       0.107    0.129    0.039    0.057                 4.39     5.72     6.17     5.10
RBFN          0.176    0.139    0.063    0.076                 7.24     6.14     10.1     6.89
PLS           0.109    0.131    0.043    0.063                 5.28     6.00     6.32     5.65

5. Conclusion

A method named OSCWPTRBFN, based on data mining in chemometrics, was developed for multicomponent spectrophotometric determination. The approach combines orthogonal signal correction, wavelet packet transform and radial basis function neural network to enhance noise removal and the elimination of unrelated information as well as the quality of the regression method. This method is a kind of data mining technique that can be used to analyze matrix data by first performing denoising, dimension reduction, collinearity and redundancy elimination, data transformation and extraction of feature values to obtain valuable information, and then constructing data models with intelligent methods to make predictions. From the comparison of the results obtained by eight different methods (OSCWPTRBFN, WPTRBFN, RBFN, OSCWPTPLS, PLS, FTPCR, PCR and MLR), it is clearly seen that the two hybrid methods (WPTRBFN and OSCWPTRBFN) performed better than the RBFN method. OSCWPTRBFN delivered the best performance among all eight. The experimental results from an additional voltammetric test case also demonstrated that the OSCWPTRBFN method performed successfully.

Acknowledgement

The authors would like to thank the National Natural Science Foundation of China (20667002 and 60762003) for financial support of this project.

References

[1] L. Mutihac, R. Mutihac, Anal. Chim. Acta 612 (2008) 1–18.
[2] J. Han, M. Kamber, Data Mining: Concepts and Techniques, 2nd ed., Elsevier Pte. Ltd., Singapore, 2006.
[3] M. Daszykowski, B. Walczak, D.L. Massart, Chemometr. Intell. Lab. Syst. 65 (2003) 97–112.
[4] Q.J. Han, H.L. Wu, C.B. Cai, L. Xu, R.Q. Yu, Anal. Chim. Acta 612 (2008) 121–125.
[5] C.J. Xu, S. Gourvenec, Y.Z. Liang, D.L. Massart, Chemometr. Intell. Lab. Syst. 81 (2006) 3–12.
[6] Y.N. Ni, Y.Y. Peng, S. Kokot, Anal. Chim. Acta 616 (2008) 19–27.
[7] H. Li, J.R. Hou, K. Wang, F. Zhang, Talanta 70 (2006) 336–343.
[8] C.J. Xu, S. Gourvenec, Y.Z. Liang, D.L. Massart, Anal. Chim. Acta 575 (2006) 1–8.
[9] S.X. Ren, L. Gao, Anal. Bioanal. Chem. 378 (2004) 1392–1398.
[10] L. Gao, S.X. Ren, Chemometr. Intell. Lab. Syst. 45 (1999) 87–93.
[11] S.X. Ren, L. Gao, Talanta 50 (2000) 1163–1173.
[12] I. Daubechies, Commun. Pure Appl. Math. 41 (1988) 909–996.
[13] S. Mallat, W.L. Hwang, IEEE Trans. Inform. Theory 38 (1992) 617–643.
[14] C.E. Heil, D.F. Walnut, SIAM Rev. 31 (1989) 628–666.
[15] R.R. Coifman, M.V. Wickerhauser, IEEE Trans. Inform. Theory 38 (1992) 713–718.
[16] B. Jawerth, W. Sweldens, SIAM Rev. 36 (1994) 377–412.
[17] R.N.F. dos Santos, R.K.H. Galvao, M.C.U. Araujo, E.C. da Silva, Talanta 71 (2007) 1136–1143.
[18] X.G. Shao, W.S. Cai, Anal. Lett. 32 (1999) 743–760.
[19] D. Donald, Y. Everingham, D. Coomans, Chemometr. Intell. Lab. Syst. 77 (2005) 32–42.
[20] C.A. Andersson, Chemometr. Intell. Lab. Syst. 47 (1999) 51–63.
[21] S. Wold, H. Antti, F. Lindgren, J. Ohman, Chemometr. Intell. Lab. Syst. 44 (1998) 175–185.
[22] J.A. Westerhuis, S. de Jong, A.G. Smilde, Chemometr. Intell. Lab. Syst. 56 (2001) 13–25.
[23] A. Hoskuldsson, Chemometr. Intell. Lab. Syst. 55 (2001) 23–28.
[24] T. Fearn, Chemometr. Intell. Lab. Syst. 50 (2000) 47–52.
[25] Q. Shen, J.H. Jiang, G.L. Shen, R.Q. Yu, Chemometr. Intell. Lab. Syst. 82 (2006) 44–49.
[26] D. Chen, F. Wang, X.G. Shao, Q.D. Su, Analyst 128 (2003) 1200–1203.
[27] S. Preys, J.M. Roger, J.C. Boulet, Chemometr. Intell. Lab. Syst. 91 (2008) 28–33.
[28] Y.Q. Wu, I. Noda, F. Meersman, Y. Ozaki, J. Mol. Struct. 799 (2006) 121–127.
[29] I. Esteban-Diez, J.M. Gonzalez-Saiz, D. Gomez-Camara, C.P. Millan, Anal. Chim. Acta 555 (2006) 84–95.
[30] J.A. Westerhuis, S. de Jong, A.G. Smilde, Chemometr. Intell. Lab. Syst. 56 (2001) 13–25.
[31] F. Marini, A.L. Magri, R. Bucci, A.D. Magri, Anal. Chim. Acta 599 (2) (2007) 232–240.
[32] A. Verikas, M. Bacauskiene, Chemometr. Intell. Lab. Syst. 67 (2003) 187–191.
[33] D.A. Cirovic, Trends Anal. Chem. 16 (3) (1997) 148–155.
[34] W.J. Egan, S.M. Angel, S.L. Morgan, J. Chemom. 15 (2001) 29–48.
[35] F. Schwenker, H.A. Kestler, G. Palm, Neural Netw. 14 (2001) 439–458.
[36] J. Tetteh, S. Howells, E. Metcalfe, T. Suzuki, Chemometr. Intell. Lab. Syst. 41 (1998) 17–29.
[37] S.W. Choi, D. Lee, J.H. Park, I.B. Lee, Chemometr. Intell. Lab. Syst. 65 (2003) 191–208.
[38] S. Chen, C.F. Cowan, P.M. Grant, IEEE Trans. Neural Netw. 2 (1991) 302–309.
[39] A. Alexandridis, P. Patrinos, H. Sarimveis, G. Tsekouras, Chemometr. Intell. Lab. Syst. 75 (2005) 149–162.
[40] D.L. Donoho, IEEE Trans. Inform. Theory 41 (3) (1995) 613–627.
[41] B.K. Alsberg, A.M. Woodward, M.K. Winson, J. Rowland, D.B. Kell, Analyst 122 (1997) 645–652.
[42] M. Otto, W. Wegscheider, Anal. Chem. 57 (1985) 63–69.
[43] L. Gao, S.X. Ren, Spectrochim. Acta, Part A 61 (2005) 3013–3019.
[44] S.X. Ren, L. Gao, J. Automat. Chem. 17 (1995) 115–118.
[45] S.X. Ren, L. Gao, Microchem. J. 58 (1998) 151–161.
[46] S.X. Ren, L. Gao, Anal. Lett. 28 (1995) 1665.
[47] K. Faber, B.R. Kowalski, J. Chemom. 11 (1997) 53–71.
[48] C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK, 1995.
[49] H.D. Li, Y.Z. Liang, Q.S. Xu, Chemometr. Intell. Lab. Syst. 95 (2009) 188–198.