Spectral-Spatial Constrained Nonnegative Matrix ...

11
6766 IEEE JOURNAL OF SELECTED TOPICS INAPPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021 Spectral-Spatial Constrained Nonnegative Matrix Factorization for Spectral Mixture Analysis of Hyperspectral Images Ge Zhang, Student Member, IEEE, Shaohui Mei , Senior Member, IEEE, Yan Feng , and Qian Du , Fellow, IEEE Abstract—Hyperspectral spectral mixture analysis (SMA), which intends to decompose mixed pixels into a collection of end- members weighted by their corresponding fraction abundances, has been successfully used to tackle mixed-pixel problem in hyper- spectral remote sensing applications. As an approach of decom- posing a high-dimensional data matrix into the multiplication of two nonnegative matrices, nonnegative matrix factorization (NMF) has shown its advantages and been widely applied to SMA. Un- fortunately, most of the NMF-based unmixing methods can easily lead to an unsuitable solution, due to inadequate mining of spatial and spectral information and the influence of outliers and noise. To overcome such limitations, a spatial constraint over abundance and a spectral constraint over endmembers are imposed over NMF- based unmixing model for spectral-spatial constrained unmixing. Specifically, a spatial neighborhood preserving constraint is pro- posed to preserve the local geometric structure of the hyperspectral data by assuming that pixels in a spatial neighborhood generally fall into a low-dimensional manifold, while a minimum spectral distance constraint is formulated to optimize endmember spectra as compact as possible. Furthermore, to handle non-Gaussian noises or outliers, an L 2,1 -norm based loss function is ultimately adopted for the proposed spectral-spatial constrained nonnegative matrix factorization model and a projected gradient based optimization algorithm is designed for optimization. Experimental results over both synthetic and real-world datasets demonstrate that the pro- posed spatial and spectral constraints can certainly improve the performance of NMF-based unmixing and outperform state-of-the- art NMF-based unmixing algorithms. Index Terms—Hyperspectral remote sensing, linear mixture model (LMM), nonnegative matrix factorization (NMF), spectral mixture analysis (SMA). I. INTRODUCTION H YPERSPECTRAL remote sensing image, which involves collecting abundant spectral information over hundreds Manuscript received February 8, 2021; revised March 21, 2021; accepted June 17, 2021. Date of publication June 25, 2021; date of current version July 14, 2021. This work was supported in part by the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University under Grant CX2021080, in part by the Fundamental Research Funds for the Central Universities, and in part by the National Natural Science Foundation of China under Grant 61671383. (Corresponding author: Shaohui Mei.) Ge Zhang, Shaohui Mei, and Yan Feng are with the School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China (e- mail: [email protected]; [email protected]; [email protected]). Qian Du is with the Department of Electrical and Computer Engi- neering, Mississippi State University, Starkville, MS 39762 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/JSTARS.2021.3092566 of contiguous bands, has been widely applied to many civil and military fields [1]–[3]. However, in view of the low spatial reso- lution of imaging system, mixed pixels are commonly found in hyperspectral images [4]–[6]. The presence of mixed pixels se- riously affects the performance of many image processing tasks, such as classification and target detection, limiting the quanti- tative development of hyperspectral remote sensing. Therefore, spectral mixture analysis (SMA), which extracts fractional abun- dance for pure ground objects (known as endmembers) within a pixel, has been developed to solve the mixed-pixel problems for quantitative analysis of hyperspectral remote sensing images. In recent years, the development of deep learning technology has promoted the development of hyperspectral unmixing [7]–[9], but it is still limited since large amount of training samples are required. In the past decades, many unsupervised SMA algorithms have been proposed by jointly performing endmember extraction and abundance estimation, such as unconstrained fully constrained least square unmixing algorithm [10], gradient descent maxi- mum entropy algorithm [11], to name a few. In these algorithms, pure pixels for each endmember are assumed to present in the data. However, such a strong requirement is not met when pixels are in a high mixing level. Therefore, a group of algorithms have been developed for highly mixed situation by considering the unsupervised SMA as a blind source separation problem [12]. Independent component analysis (ICA), which extracts inde- pendent sources from observed data, has been applied to SMA by treating it as a blind source separation problem [13]. However, it has been demonstrated that ICA is not a suitable approach since both endmembers and abundances are not statistically independent, which violates the assumption of ICA. Meanwhile, nonnegativity of endmembers and abundances cannot be guaran- teed in ICA. To overcome these limitations of ICA, Nascimento et al. [14] proposed the dependent component analysis method based on the assumption that the columns of abundance matrix are random vectors following a mixture of Dirichlet probabil- ity density functions. However, nonnegativity of endmember spectra is still not guaranteed and may even result in “negative estimated reflectance.” Nonnegative matrix factorization (NMF), which was pro- posed to find a set of nonnegative basis functions for represent- ing nonnegative data [15], has been widely utilized for SMA of highly mixed data since the nonnegativity and sum-to-one This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

Transcript of Spectral-Spatial Constrained Nonnegative Matrix ...

Page 1: Spectral-Spatial Constrained Nonnegative Matrix ...

6766 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021

Spectral-Spatial Constrained Nonnegative MatrixFactorization for Spectral Mixture Analysis

of Hyperspectral ImagesGe Zhang, Student Member, IEEE, Shaohui Mei , Senior Member, IEEE, Yan Feng , and Qian Du , Fellow, IEEE

Abstract—Hyperspectral spectral mixture analysis (SMA),which intends to decompose mixed pixels into a collection of end-members weighted by their corresponding fraction abundances,has been successfully used to tackle mixed-pixel problem in hyper-spectral remote sensing applications. As an approach of decom-posing a high-dimensional data matrix into the multiplication oftwo nonnegative matrices, nonnegative matrix factorization (NMF)has shown its advantages and been widely applied to SMA. Un-fortunately, most of the NMF-based unmixing methods can easilylead to an unsuitable solution, due to inadequate mining of spatialand spectral information and the influence of outliers and noise.To overcome such limitations, a spatial constraint over abundanceand a spectral constraint over endmembers are imposed over NMF-based unmixing model for spectral-spatial constrained unmixing.Specifically, a spatial neighborhood preserving constraint is pro-posed to preserve the local geometric structure of the hyperspectraldata by assuming that pixels in a spatial neighborhood generallyfall into a low-dimensional manifold, while a minimum spectraldistance constraint is formulated to optimize endmember spectra ascompact as possible. Furthermore, to handle non-Gaussian noisesor outliers, an L2,1-norm based loss function is ultimately adoptedfor the proposed spectral-spatial constrained nonnegative matrixfactorization model and a projected gradient based optimizationalgorithm is designed for optimization. Experimental results overboth synthetic and real-world datasets demonstrate that the pro-posed spatial and spectral constraints can certainly improve theperformance of NMF-based unmixing and outperform state-of-the-art NMF-based unmixing algorithms.

Index Terms—Hyperspectral remote sensing, linear mixturemodel (LMM), nonnegative matrix factorization (NMF), spectralmixture analysis (SMA).

I. INTRODUCTION

HYPERSPECTRAL remote sensing image, which involvescollecting abundant spectral information over hundreds

Manuscript received February 8, 2021; revised March 21, 2021; accepted June17, 2021. Date of publication June 25, 2021; date of current version July 14,2021. This work was supported in part by the Innovation Foundation for DoctorDissertation of Northwestern Polytechnical University under Grant CX2021080,in part by the Fundamental Research Funds for the Central Universities, and inpart by the National Natural Science Foundation of China under Grant 61671383.(Corresponding author: Shaohui Mei.)

Ge Zhang, Shaohui Mei, and Yan Feng are with the School of Electronics andInformation, Northwestern Polytechnical University, Xi’an 710129, China (e-mail: [email protected]; [email protected]; [email protected]).

Qian Du is with the Department of Electrical and Computer Engi-neering, Mississippi State University, Starkville, MS 39762 USA (e-mail:[email protected]).

Digital Object Identifier 10.1109/JSTARS.2021.3092566

of contiguous bands, has been widely applied to many civil andmilitary fields [1]–[3]. However, in view of the low spatial reso-lution of imaging system, mixed pixels are commonly found inhyperspectral images [4]–[6]. The presence of mixed pixels se-riously affects the performance of many image processing tasks,such as classification and target detection, limiting the quanti-tative development of hyperspectral remote sensing. Therefore,spectral mixture analysis (SMA), which extracts fractional abun-dance for pure ground objects (known as endmembers) within apixel, has been developed to solve the mixed-pixel problems forquantitative analysis of hyperspectral remote sensing images. Inrecent years, the development of deep learning technology haspromoted the development of hyperspectral unmixing [7]–[9],but it is still limited since large amount of training samples arerequired.

In the past decades, many unsupervised SMA algorithms havebeen proposed by jointly performing endmember extraction andabundance estimation, such as unconstrained fully constrainedleast square unmixing algorithm [10], gradient descent maxi-mum entropy algorithm [11], to name a few. In these algorithms,pure pixels for each endmember are assumed to present in thedata. However, such a strong requirement is not met when pixelsare in a high mixing level. Therefore, a group of algorithms havebeen developed for highly mixed situation by considering theunsupervised SMA as a blind source separation problem [12].

Independent component analysis (ICA), which extracts inde-pendent sources from observed data, has been applied to SMA bytreating it as a blind source separation problem [13]. However,it has been demonstrated that ICA is not a suitable approachsince both endmembers and abundances are not statisticallyindependent, which violates the assumption of ICA. Meanwhile,nonnegativity of endmembers and abundances cannot be guaran-teed in ICA. To overcome these limitations of ICA, Nascimentoet al. [14] proposed the dependent component analysis methodbased on the assumption that the columns of abundance matrixare random vectors following a mixture of Dirichlet probabil-ity density functions. However, nonnegativity of endmemberspectra is still not guaranteed and may even result in “negativeestimated reflectance.”

Nonnegative matrix factorization (NMF), which was pro-posed to find a set of nonnegative basis functions for represent-ing nonnegative data [15], has been widely utilized for SMAof highly mixed data since the nonnegativity and sum-to-one

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

Page 2: Spectral-Spatial Constrained Nonnegative Matrix ...

ZHANG et al.: SPECTRAL-SPATIAL CONSTRAINED NONNEGATIVE MATRIX FACTORIZATION 6767

constraints can be easily relaxed [16], [17]. However, minimiz-ing the representation error in the linear mixture model (LMM)by NMF is not sufficient for SMA since the unmixing results ofNMF are not unique [18]. In order to make this problem well-posed, auxiliary constraints must be imposed to NMF for phys-ically meaningful results. Many constrained NMF algorithmshave been proposed for this purpose, which can be classified intothree categories, i.e., endmember constrained NMF, abundanceconstrained NMF, and double constrained NMF.

One of the widely used endmenber constraints is variationalminimum volume. Given that image pixels are generally as-sumed to lie in a simplex whose vertices are endmembers [19],the minimum volume constrained NMF (MVC-NMF) algorithmwas proposed to restrict the volume of the simplex when NMFis employed for SMA [20]. The volume-regularized NMF algo-rithm further promoted small volume of the convex hull spannedby the basis matrix [21]. Zhuang et al. [22] discussed the useof several minimum volume regularizations and introduced astrategy named NMF-QMV for automatically selecting regu-larization parameter, which included by different regularizationmodules based on boundary, center, and total variation (TV).However, matrix determinant and matrix inversion computationmay cause numerical instability problems when the simplexvolume becomes small. The minimum dispersion constrainedNMF (MiniDisCo-NMF) was also proposed for SMA such thatthe endmember spectra have minimum variances, while it mayoversmooth the endmember spectra [23]. A similar approachwas also proposed to constrain endmember dissimilarity, whichcan not only measure the difference between the endmembersignatures but also constrain the obtained signatures to besmooth [24]. In addition, prior knowledge over endmembers,such as spectral signature of known endmembers, can be propa-gated during optimization process by minimizing the differencebetween the image data and the prior knowledge [25].

Recently, abundance constrained NMF algorithms havegained great attention in SMA of hyperspectral images. Thesmoothness constraint was imposed over abundance, which canbe realized by the TV regularizer, following the assumption thatground objects usually vary slowly and abrupt changes rarelyappear in an image [26], [27]. Moreover, since the number ofendmembers in an entire image is usually much larger than thenumber of endmembers present in a specific pixel, especiallyin the spectral library based unmixing, it often leads to a sparsesolution and thus it is intuitive and meaningful to constrain NMFwith sparseness [28]. Qian et al. utilized an L1/2 norm to ensurethe sparsity of abundance [29]. Feng et al. proposed a sparsity-constrained deep NMF with the total variation (SDNMF-TV)technique for hyperspectral unmixing, which introduced anL1/2

constraint for the sparse distribution of each endmember in the 2-D space and the TV regularizer for promoting piecewise smooth-ness in abundance maps [30]. However, the sparsity constraintis unstable and prone to noise. Spatial context, which affects theabundance relationship between pixels and their neighbors, hasalso attracted much attention in SMA applications [31]. Specif-ically, the abundance structural constraint was adopted sincespectra in the homogeneous region are considered to expresssimilar substance constructions and the same sparsity [32], [33].

The region-based structure preserving NMF explored consistentdata distribution in the same region while discriminating dif-ferent data structures across regions in the unmixed data [34].Wang et al. proposed the spatial group sparsity-regularized NMF(SGSNMF) [35], but it only accounted for spatial homogeneityof the first-order pixel neighborhood system of the abundancemap, the spatial structure information of the image remainedincomplete. Yang et al. combined nonlocal spatial informa-tion with spatial group sparsity together into the NMF method[36].

Taking either endmember or abundance constraint into ac-count in NMF based unmixing has a restricted effect to obtaina satisfying solution. Therefore, double-constrained NMF algo-rithms are expected to impose constrains on both endmembersand abundance simultaneously. Jia et al. proposed the piecewisesmooth NMF with sparseness constraint algorithm by incor-porating a united constraints of the piecewise smoothness andthe sparseness [37]. However, it often needs prior knowledgeabout the exact sparseness degree of abundances. Salehani et al.explored the use of “arctan” function as a sparse regularizer term,and enforced a roughness penalty to make spectral signaturessmooth as well as to restrict the related estimation error [38].The double-constrained multilayer NMF method [39] adoptedthe sparsity of the abundance matrix and a corresponding graphregularization term based on the k-nearest neighbor graph topreserve the geometric structure of the data. The robust col-laborative nonnegative matrix factorization (R-CoNMF) [40]addressed sparsity among the lines of the abundance matrix andpromoted minimum volume by pushing the endmembers towardthe mean value of the data set or bring the endmembers to thereal solution quadratically regularized by a given simplex. Yuanet al. further introduced TV regularization into R-CoNMF tobetter handle the huge solution space problem and explore theadjacent relationship [41]. Specially, simplex minimum volume,abundance sparsity, and abundance smoothness have been simul-taneously studied in the multiple-priors ensemble constrainedNMF model [42]. Many other related methods, such as graphregularized based [43], self-paced based NMF algorithms [44],tensor based [45], [46], and deep learning based unmixingalgorithms [47]–[49], have also been developed. However, thesealgorithms do not deal well with outliers or noise and the effecton spatial structure of abundance distribution still needs to befurther verified.

To alleviate the above-mentioned problems and take fulladvantage of spatial and spectral information, in this article,a spectral-spatial constrained NMF algorithm is proposed byimposing constraints on both endmembers and abundance si-multaneously. Specifically, a spatial neighborhood preservingconstraint is imposed on abundances to preserve local geometricstructure of the data in a low-dimensional manifold. Meanwhile,a minimum spectral distance constraint is also introduced tooptimize endmember spectra as compact as possible. To fur-ther handle non-Gaussian noise or outliers, an L2,1-norm basedloss function is adopted for the proposed spectral-spatial con-strained nonnegative matrix factorization (SS-NMF) algorithm.Finally, extensive experiments over both synthetic and real-world datasets are conducted to validate the effectiveness and

Page 3: Spectral-Spatial Constrained Nonnegative Matrix ...

6768 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021

robustness of the proposed SS-NMF methods for hyperspectralunmixing.

The rest of this article is organized as follows. Section IIintroduces the basic idea of NMF algorithms. Section III presentsthe proposed SS-NMF model for SMA of hyperspectral images.Section IV reports and discusses experimental results on bothsynthetic and real datasets. Finally, Section V concludes thisarticle.

II. NMF ALGORITHMS FOR SMA

A. NMF Model for SMA

The LMM, which is well-known for its simplicity and ex-plicit physical meaning, has been widely utilized to model therelationship between mixtures and endmembers. In the LMM,the photons reflected from different ground objects within onepixel are assumed not to interfere with each other. As a result,the observed spectral matrix R ∈ Rb×o is assumed to be theproduct of endmember matrix M ∈ Rb×c and its correspondingabundance matrix A ∈ Rc×o plus a noise matrix N ∈ Rb×o.Their relationship can be formulated as

R = MA+N (1)

in which b is the number of bands, c is the number of endmem-bers, and o is the number of pixels in the image. Here, the ithcolumn vectors of M and A, denoted by mi and ai, correspondto the ith endmember and the abundance of the ith spectral pixelri, respectively.

In order to make the LMM physically meaningful, the abun-dance sum-to-one constraint (ASC) and the abundance nonnega-tive constraint (ANC) must be imposed, which can be expressedas

c∑j=1

Ajk = 1 (2)

Ajk ≥ 0 j = 1, 2, . . . , c, k = 1, 2, . . . , o. (3)

Generally, the ASC can be embedded into this model by addingan additional pseudoband to the data matrix and the endmembermatrix [50]. Therefore, in the following analysis, only the ANCis explicitly considered.

Given the target matrix R, the task of NMF is to decompose itinto two nonnegative matrices M and A. Hence, the followingmodel can be utilized to tackle the mixed-pixel problem:

R ≈ MA

s.t.M � 0,A � 0 (4)

where “� 0” represents elementwise nonnegativity. Therefore,the generally used Frobenius-norm based NMF model for SMAcan be expressed by the following optimization problem:

min1

2‖R−MA‖F

s.t.M � 0,A � 0. (5)

Since a real world hyperspectral data contain outliers andnoise, especially non-Gaussian noise may exist, the errors for

both pixels and samples are inevitably squared, which means afew pixels that contain noise or outliers with large error valueswill play a major role in the objective function [51]. To overcomethe above-mentioned problems, we employ joint an L2,1-normloss function to resist the sensitivity to noises and outliers, whichhas been shown to be very effective in [52]. Therefore, the NMFoptimization problem based on the L2,1-norm loss function canbe formulated as

min f(M,A) =1

2‖R−MA‖2,1 =

1

2

b∑i=1

‖Ri − (MA)i‖2

s.t.M � 0,A � 0 (6)

where Ri and (MA)i are the ith row of R and MA, respec-tively. In addition, according to [53], the gradient derivations ofeach variable can be determined as follows:

∇Mf(M,A) = Q(MA−R)AT (7)

∇Af(M,A) = MTQ(MA−R) (8)

where Q ∈ Rb×b is a diagonal matrix with the ith diagonalelement as

qii =1

2‖Ri − (MA)i‖2 . (9)

B. PG Based NMF Algorithm for SMA

Many algorithms have been proposed to solve the objec-tive function of NMF problem, such as multiplicative updat-ing method [54], projected gradient (PG) method [55], activeset method [56], and block principal pivoting method [57], inwhich PG based algorithm has been demonstrated to be veryeffective to solve the constrained NMF problem for SMA. Inorder to minimize the cost function of NMF [defined by (6)),the PG method is utilized to solve two subproblems of NMFalternatively, which is summarized in Algorithm 1.

Algorithm 1: The alternating PG algorithm for NMF.

1. Initializing M1 � 0,A1 � 0.2. For k = 1, 2, . . .

Mk+1 = P[Mk − αk∇Mf(Mk,Ak)

](10)

Ak+1 = P[Ak − αk∇Af(Mk+1,Ak)

](11)

where P [x] is a projected function, the value is chosen asfollows:

P [x] =

⎧⎨⎩

0 if x � 0x if 0 < x < u,u if x � u

(12)

in which u is the upper bound for M or A.

One of the key problems in the PG method is selecting thestep size αk, which is also the most time consuming for thisalgorithm. An simple and effective solution to select αk is“Armijo rule along the projection arc” [58].

Page 4: Spectral-Spatial Constrained Nonnegative Matrix ...

ZHANG et al.: SPECTRAL-SPATIAL CONSTRAINED NONNEGATIVE MATRIX FACTORIZATION 6769

III. OUR PROPOSED SPECTRAL AND SPATIAL CONSTRAINED

NMF ALGORITHMS FOR SMA

It has been demonstrated that minimizing the representationerror in LMM by NMF is not sufficient for SMA since theunmixing results of NMF are not unique. In order to make thisproblem well-posed, extra constraints other than the ANC mustbe imposed to the objective function in (6)

F(M,A) =1

2‖R−MA‖2,1 + 1

2λ1 · J1(M) +

1

2λ2 · J2(A),

(13)in which J1(M) and J2(A) are additional constraints imposedon endmember matrixM and abundance matrixA, respectively,and λ1 and λ2 are their tradeoffs among different objectivefunctions.

Observed from the LMM defined by (1), spectral informationof an image can be represented by its endmembers since all thepixels can be represented by these endmembers, while spatialinformation, which reflects the structural relationship of pixelsand their neighbors, can be expressed in the similarity of theircorresponding abundance. Therefore, in this article, the spectralconstraint is considered for J1(M) and the spatial constraintis constructed for J2(A) to make the unmixing result moreaccurate.

A. Spectral Constraint for Endmembers

Minimum volume of simplex determined by endmembers hasbeen demonstrated to be extremely effective to approximate anoptimal solution of endmembers due to the concepts of convex[20]. However, when it is utilized to constrain an NMF basedalgorithm, dimension reduction is required as a preprocess-ing step. In addition, matrix determinant and matrix inversioncomputation may cause numerical instability when the simplexvolume becomes small [23]. Actually, to find a simplex withminimum volume is in essence to find a simplex which is ascompact as possible. Therefore, a minimum spectral distanceconstraint is alternatively imposed to enforce compactness ofendmembers, which is defined as

J1(M) =

c∑i=1

(mi − m)T (mi − m) (14)

where mi(i = 1, 2, . . . , c) represents an endmember in M, andm defines the average spectrum of endmember matrix M. Inmatrix form, this constraint can be expressed as follows:

J1(M) = trace((M− M)T (M− M)

). (15)

The proposed minimum spectral distance constraint for end-members, which utilizes cumulative distance between each end-member spectrum and the mean spectral signature to replacevolume of the simplex formed by endmembers, can also optimizeendmember spectra as compact as possible. Actually, in a two-dimensional (2-D) space, the simplex formed by endmembersis reduced to a triangle. The proposed endmemberwise distancemeasures the distance from the three vertices of the triangle to itscenter of gravity while the volume measures the area of the tri-angle. Both measures can ensure compactness of endmembers.

However, the proposed spectral distance outperforms the volumeof simplex for several reasons. First, it is a convex constraint bywhich the optimization algorithm for SMA will benefit a lot.Second, it does not need a dimension reduction preprocessingstep. Finally, numerical instability, which is caused by matrix de-terminant and inversion operations when the volume of simplexbecomes small, can be avoided.

B. Spatial Constraint for Abundance

Recent studies have shown that many real world data areactually sampled from a nonlinear low-dimensional manifold,which is embedded in the high-dimensional ambient space. Inthis article, the local geometric structure of data is exploitedby considering the relationship between pixels and their cor-responding neighbors. Therefore, the local linear embedding(LLE) assumption, which means that each pixel can be linearlyreconstructed by its neighbors, is considered.

For each hyperspectral pixel ri(i = 1, 2, . . ., o) in the image,we use N(ri) represents its neighboring pixels. According to theLLE assumption, ri can be represented as the linear combinationof its neighboring pixels:

ri =∑

rj∈N(ri)

Wijrj (16)

where Wij represents the linear coefficient of rj in the recon-struction of ri. Generally, the linear coefficients Wij can beobtained by the following objective function:

Wij = argmin

∣∣∣∣∣∣∣∣∣∣∣∣ri −

∑rj∈N(ri)

Wijrj

∣∣∣∣∣∣∣∣∣∣∣∣2

. (17)

According to the LMM [defined by (1)] and the LLE [definedby (16)], each pixel can be represented as follows:

ri ≈ Mai

≈∑

rj∈N(ri)

Wijrj

≈∑

rj∈N(ri)

Wij(Maj)

= M∑

rj∈N(ri)

Wijaj . (18)

Observed from (18), we have

ai =∑

rj∈N(ri)

Wijaj (19)

which implies that the local geometric structure of data can bepreserved by abundances in the unmixing procedure. Therefore,the following spatial preserving constraint can be imposed onabundances to preserve the relationship of spatial neighborhood:

J2(A) =

o∑i=1

||ai −∑

rj∈N(ri)

Wijaj ||22

= ||A−AW||2F

Page 5: Spectral-Spatial Constrained Nonnegative Matrix ...

6770 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021

= ||A(I−W)||2F= trace(A(I−W)(I−W)TAT )

= trace(APPTAT ) (20)

where trace(·) denotes the operation of trace, I =diag(1, 1, . . . , 1), and P = I−W.

C. Spectral and Spatial Constrained NMF Algorithm for SMA

According to the spectral distance constraint for endmembers[defined by (15)] and the spatial preserving constraint for abun-dance (20), the proposed spectral and spatial constrained NMF(SS-NMF) model can be constructed as

F(M,A) =1

2‖R−MA‖2,1

+1

2λ1 · trace

((M− M)T (M− M)

)

+1

2λ2 · trace(APPTAT ). (21)

According to the PG method defined by Algorithm 1, the pro-posed SS-NMF model can be effectively solved by the followingtwo suboptimization problems with iterative updates:

Ak+1 = argminA�0

F(Mk,Ak)

Mk+1 = arg minM�0

F(Mk,Ak+1). (22)

In other words, alternatively fixes one matrix and improves theother one. The convergence of this alternating method for NMFhas been explained by Lin [59] and it has been pointed outthat any limit point of the sequence (Mk,Ak) generated bythis method is a stationary point for NMF. Since both spectraldistance constraint and spatial preserving constraint are second-order convex constraints, this conclusion is still held for theproposed SS-NMF algorithm.

Each suboptimization algorithm is a convex problem and canbe solved by the PG method. Therefore, the gradient deriva-tion of the proposed spectral distance constraint and spatialpreserving constraint must be determined previously. The gra-dient derivation of the proposed spectral distance constraint isdetermined as

∇MJ1(M) = 2 · (M− M) · (I− 1

c1) (23)

where I = diag(1, 1, . . . , 1), and 1 is a c-by-c matrix of ones.The gradient derivation of the proposed spatial preserving con-straint is determined as

∇AJ2(A) = 2 ·APPT . (24)

As a result, the gradient derivation of F(M,A) for these twosubproblems of the proposed SS-NMF can be derived as

∇Mf(M,A) = Q(MA−R)AT +λ1 · (M− M) ·(I− 1

c1

)

∇Af(M,A) = MTQ(MA−R) + λ2 ·APPT . (25)

Similar to the PG method of NMF, the nonnegative constraintcan be guaranteed by a projective operation. The update rule forthe proposed SS-NMF can be summarized as

Ak+1 = max(0, Ak + αk∇AF(Mk,Ak)

)Mk+1 = max

(0, Mk + βk∇MF(Mk,Ak+1)

)(26)

where αk and βk are the stepsize. It has been pointed out thatthe stepsize is a critical value in the PG method. In this article,the well-known Armijo’s technique is adopted to find the beststepsize for these two subproblems.

In addition, it should be noted that, in the optimization processof the proposed SS-NMF for SMA, the iteration stops whenthe maximum iteration number (set to 500) is reached or thefollowing criterion is satisfied:

F(Mk,Ak)− F(Mk−1,Ak−1)

F(Mk−1,Ak−1)< 10−4. (27)

IV. EXPERIMENTS

In this section, extensive experiments on both syntheticdataset and real hyperspectral remote sensing image are con-ducted to validate the performance of the proposed SS-NMFalgorithm.

A. Experiments Over Synthetic Dataset

A 100× 100-pixel synthetic hyperspectral scene, which issimulated under linear mixture assumption using five spectra ofminerals from a spectral library compiled by the U.S. GeologicalSurvey (USGS)1 containing as many as 420 bands covering from400 to 2500 nm, is adopted in this experiment. The simulationstrategy in [60] is adopted, which mainly involves two steps:pure-pixel image simulation and mixed-pixel image simulation.The whole image is divided into 25 square regions with the samesize of 20× 20 pixels. All the pixels in the same small regionare assigned as a homogeneous ground object. In each row,five ground objects are assigned randomly to the five regions.Furthermore, 10-dB Gaussian noise is added to the whole imageand 50-dB Gaussian noise is added to the pixels inside a blockto simulate spectral variability. In order to generate mixed pixelsin the image, a 15× 15 window is used for local averaging viaspatial convolution. Fig. 1 shows the five reflectance curves ofselected spectra from the USGS spectral library.

To assess the performance of the proposed algorithm, thespectral angle distance (SAD) is adopted to evaluate the perfor-mance of estimated endmember in SMA. Let mi be an estimatedendmember and mi be the most similar spectral signature in theUSGS library. The SAD value between two spectral signaturesis defined as

SAD(mi, mi) = arccos

(mi · mi

‖mi‖ · ‖mi‖)

(28)

where ‖ · ‖ represents the magnitude of vectors. Note thatlower SAD scores mean higher spectral similarity betweenthe compared vectors. In the synthetic experiment, abundance

1[Online]. Available: http://speclab.cr.usgs.gov/spectral.lib06

Page 6: Spectral-Spatial Constrained Nonnegative Matrix ...

ZHANG et al.: SPECTRAL-SPATIAL CONSTRAINED NONNEGATIVE MATRIX FACTORIZATION 6771

Fig. 1. Five typical hyperspectral signatures of typical minerals from theUSGS digital spectral library.

fractions of each endmember are also known as ground-truth.Therefore, the abundance estimation error (labeled as “EstEr”),which is evaluated by the root-mean-square error between thetrue abundance and the estimated fractional abundance, is alsoevaluated for quantitative evaluation. The lower the EstEr, thebetter the abundance reconstruction performance. Let ai denotethe SS-NMF estimated abundance of mi, with ai being thecorresponding ground-truth abundance. Therefore, the EstEr isdefined as

EstEr(ai, ai) =1

o

⎛⎝ o∑

j=1

(aij − aij)2

⎞⎠

12

. (29)

Moreover, the signal-to-reconstruction error (SRE) is alsoadopted to measure the quality of the reconstruction of wholeimage, which is defined as

SRE(R,M,A) = 10lgE[‖R‖22]

E[‖R−MA‖22]. (30)

It measures the information regarding the power of the error inrelation to the power of the signal. Generally, the higher the SRE(dB), the better the image reconstruction performance.

1) Parameter Analysis: The proposed SS-NMF algorithminvolves two parameters to balance the importance of spectralconstraint and spatial constraint, i.e., λ1 and λ2 in (13). Theunmixing results of the proposed SS-NMF algorithm with dif-ferent values of λ1 and λ2 are shown in Fig. 2, where λ1 variesexponentially from 1e − 7 to 1e2 and λ2 varies exponentiallyfrom 1e − 3 to 1e6. To balance the importance of differentconstraints, the optimal combination of λ1 and λ2 is set to(1e − 5, 1).

2) Ablation Experiment: With respect to the proposedmethod, it mainly consists of three parts: L2,1-norm based lossfunction, spectral endmember constraint, and spatial abundanceconstraint. To verify the efficiency of each part, we separate thoseparts and conduct an ablation experiment. As shown in Table I,four other variants of the proposed SS-NMF are considered:

TABLE ISPECIFIC EXPERIMENTAL CONFIGURATION OF THE PROPOSED

SS-NMF VARIANTS

TABLE IIAVERAGE SAD SCORES (IN DEGREES), ESTER VALUE, AND SRE VALUE OF

THE SS-NMF VARIANTS ON THE SYNTHETIC DATASET

The best results are in bold.

1) SS-NMFF : Frobenius-norm based loss function combin-ing with both spectral endmember constraint and spatialabundance constraint;

2) NMF2,1: only L2,1-norm based loss function;3) SM -NMF: L2,1-norm based loss function combining with

spectral end member constraint;4) SA-NMF: L2,1-norm based loss function combining with

spatial abundance constraint.The synthetic experimental results of all these variants are

listed in Table II. It is observed that1) the L2,1-norm based loss function is better to model

the reconstruction error than LF -norm when NMF-basedunmixing model is used;

2) both the proposed spatial neighborhood preserving con-straint and the proposed minimum spectral distance con-straint can certainly improve the performance of L2,1-norm based NMF model for SMA.

3) when the proposed spatial neighborhood preserving con-straint and the proposed minimum spectral distance con-straint are jointly imposed over L2,1-norm-based NMFmodel, the performance of SMA can be further improved,demonstrating the necessity and complementary of thesetwo spectral and spatial constraints for NMF model.

3) Comparison With Typical NMF-Based Algorithms: Sev-eral well-known NMF-based algorithms are adopted for com-parison, including MVC-NMF [20], NMF-QMV (TV) [22],and SGSNMF [35], together with the baseline comparison ofVCA [61]+FCLS [62]. In all cases, parameters of these com-parison methods have been carefully optimized so that the bestperformance for each method is reported. To ensure reliablecomparisons, the VCA and FCLS algorithms are adopted to ini-tialize all the endmember matrixes and abundance matrixes. Thecomputational time of comparison algorithms is also measuredwith Inter Core i7-9700 CPU at 3000 GHz with 16 GB of RAMusing MATLAB R2020b on Windows 10 platform. Moreover,20 independent runs have been carried out and the results areaveraged for the proposed and comparison algorithms in theexperiments.

Page 7: Spectral-Spatial Constrained Nonnegative Matrix ...

6772 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021

Fig. 2. Parameter analysis for the synthetic dataset, evaluated by SAD for endmembers, EstEr for abundance, and SRE for pixels.

Fig. 3. Noise sensitivity analysis of different algorithms over the synthetic image under different SNRs, evaluated by SAD for endmembers, EstEr for abundance,and SRE for image.

TABLE IIIAVERAGE SAD SCORES (IN DEGREES), ESTER VALUE, SRE VALUE, AND

COMPUTATIONAL TIME (IN SECONDS) OF COMPARISON ALGORITHMS ON THE

SYNTHETIC DATASET

The best results are in bold.

Table III shows the average SAD and EstEr scores betweenthe ground-truth endmembers or abundance and correspondingmatrixes acquired by different algorithms over synthetic dataset,as well as SRE value of the whole image and computational timeof each algorithm operation. It is observed that the proposedSS-NMF method obviously outperforms other considered algo-rithms by achieving the lowest SAD and EstEr values and thehighest SRE value. However, due to the double constraints ofproposed SS-NMF, it needs more time to find optimal solutionsthan that of NMF-QMV using spatial constraints.

In order to analyze the sensitivity of the proposed SS-NMF tonoises, different noise levels involved in the synthetic dataset arealso considered by varying SNR from 15 to 35 dB with the inter-val of 5 dB. The unmixing performance for all these algorithmsover the synthetic dataset with different SNRs is shown in Fig. 3.It is observed that the proposed SS-NMF outperforms underdifferent noise levels. Moreover, the unmixing performance of

Fig. 4. Band 100 of the subimage of AVIRIS Cuprite dataset.

the proposed SS-NMF steadily increased in terms of SAD andSRE when the SNR increases.

B. Experiments Over Real Dataset

The well-known Cuprite dataset acquired by AVIRIS2 con-tains 224 bands ranging from 370 to 2510 nm with a groundinstantaneous field of view of 20 m. A 350× 350-pixel subsetfrom the sector labeled as f970619t01p02r02sc03.a.rfl, whichis shown in Fig. 4, is used for performance validation and thenumber of endmembers are set to 16. After removing noisy bands

2[Online]. Available: http://aviris.jpl.nasa.gov/html/aviris.freedata.html

Page 8: Spectral-Spatial Constrained Nonnegative Matrix ...

ZHANG et al.: SPECTRAL-SPATIAL CONSTRAINED NONNEGATIVE MATRIX FACTORIZATION 6773

Fig. 5. Parameter analysis for the Cuprite dataset, evaluated by SAD forendmembers and SRE for image.

TABLE IVAVERAGE SAD SCORES (IN DEGREES) AND SRE VALUE OBTAINED BY THE

SS-NMF VARIANTS ON THE AVIRIS CUPRITE DATASET

Note that the best results of each signature and mean values are in bold.

and water-absorption bands (including bands 1− 4, 105− 115,150− 170, and 223− 224), a total of 186 reflectance bands arefinally adopted. In this experiment, for quantitative evaluation,only eight reference signatures are selected from the USGSdigital spectral library, corresponding to highly representativeminerals in the Cuprite mining district, including Alunite, An-dradite, Buddingtonite, Dumortierite, Jarosite, Kaolinite, Mont-morillonite, and Muscovite. For quantitatively evaluating theperformance of our proposed SS-NMF algorithm over Cupritedataset, two criteria, SAD for endmember and SRE for imageas above, are adopted.

1) Parameter Analysis: The unmixing results of the pro-posed SS-NMF algorithm under the two parameters λ1 and λ2

are shown in Fig. 5, where λ1 was set from 1e − 4 to 1e1, λ2

from 1e − 3 to 1e2 exponentially. To balance the performanceof estimated endmember and reconstructed pixels, the optimalcombination of λ1 and λ2 is (1e − 3, 1e2).

2) Ablation Experiment: The ablation experiment using dif-ferent variants of the proposed SS-NMF algorithm is also consid-ered over Cuprite dataset, and the results are shown in Table IV.It is also confirmed that the proposed spatial neighborhood pre-serving constraint and the proposed minimum spectral distanceconstraint are very effective to constrain NMF model for SMA,and they can jointly perform the best for unmixing. Moreover,the L2,1-norm based loss function is more efficient to modelnoise and outliers than LF -norm in the real scene.

3) Comparison With Typical NMF-Based Algorithms: Thethree typical NMF-based algorithms, i.e., MVC-NMF [20],NMF-QMV (TV) [22], and SGSNMF [35], together with the

TABLE VAVERAGE SAD SCORES (IN DEGREES), SRE VALUE, AND COMPUTATIONAL

TIME OBTAINED BY COMPARISON ALGORITHMS ON THE

AVIRIS CUPRITE DATASET

Note that the best results of each signature and mean values are in bold.

baseline of VCA [61]+FCLS [62], are also adopted for com-parison in this experiment. The experimental results of all thesealgorithms over the AVIRIS Cuprite dataset are listed in Table V,which lists the SAD scores between each acquired endmemberand corresponding selected USGS spectral signature, as wellas the mean SAD values across all eight selected signaturesand the SRE value of the entire image. It is observed that allthe algorithms obtain better spectral approximations for certaindifferent minerals, while those by the proposed SS-NMF canreach the lowest mean SAD value and highest SRE value,indicating that the estimated endmembers and reconstructedpixels by SS-NMF generally provide a good match with regardto the corresponding reference signatures as well as the originalpixels. Meanwhile, the computational time of SS-NMF is lessthan that of MVC-NMF and NMF-QMV, which verifies the highefficiency of proposed model in complex spatial context andunder larger number of pixels.

Fig. 6 shows a qualitative comparison between the abun-dance maps of three typical signatures: 1) Alunite+Muscoviteand/or Kaolinite; 2) Buddingtonite; and 3) Montmorillonite,with the corresponding reference classification maps producedby Tricorder software (labeled as “GT”). It can be seen that theproposed SS-NMF algorithm produces clearest abundance mapsin close agreement with the ground truth of the mineral maps,which is most consistent and stable. To sum up, these exper-imental results confirm that the proposed SS-NMF algorithmgenerates the best unmixing performance.

C. Convergence Analysis

Both synthetic and Cuprite datasets are used to analyze theconvergence of the proposed method. The convergence curvesof SS-NMF on two data are presented in Fig. 7, where we showhow the objective function value F(M,A) changes as it evolvesover 1000 iterations. As shown in Fig. 7, the proposed algorithmconverges well on both synthetic and real data. SS-NMF onsynthetic data converges in about 200 iterations while Cupritedata converges in about 400 iterations. Therefore, we set amaximum number of iterations of 500 is reasonable in thisarticle, and this experiment can verify that the proposed SS-NMFalgorithm exhibited a good convergence behavior.

Page 9: Spectral-Spatial Constrained Nonnegative Matrix ...

6774 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021

Fig. 6. Abundance maps obtained by different algorithms over Cuprite datasetfor three typical spectral signatures. (Black: 0% abundance and White: 100%abundance).

Fig. 7. Convergence analysis of SS-NMF on (a) Synthetic dataset and (b)Cuprite dataset.

V. CONCLUSION

In this article, a spectral-spatial constrained NMF (SS-NMF)is proposed for SMA of hyperspectral images by imposing aspectral constraint over endmembers and a spatial constraintover abundance simultaneously in the NMF-based unmixingmodel. Specifically, an L2,1-norm loss function is employed toeffectively handle non-Gaussian noises or outliers. Meanwhile, aminimum spectral distance constraint that optimizes endmemberspectra as compact as possible and a spatial preserving constraintto preserve the local structure of the data on a low-dimensionalmanifold are imposed to enhance SMA. Experiments over bothsynthetic dataset and well-known Cuprite dataset are conductedto verify the effectiveness and superiority of the proposed al-gorithm over several typical NMF-based algorithms. The ex-perimental results show that the proposed SS-NMF algorithmoutperforms the other NMF-based algorithms both qualitativelyand quantitatively, which further reveals that the proposed spatialand spectral constraints have a significant improvement on theunmixing results. Although current performance is very en-couraging, the long computational time of the solution methodis still a challenge for practical applications. A more efficientoptimization algorithm will be studied in our future research.

REFERENCES

[1] H. Li, R. Feng, L. Wang, Y. Zhong, and L. Zhang, “Superpixel-basedreweighted low-rank and total variation sparse unmixing for hyperspectralremote sensing imagery,” IEEE Trans. Geosci. Remote Sens., vol. 59, no. 1,pp. 629-647, Jan. 2021.

[2] J. Peng, L. Li, and Y. Y. Tang, “Maximum likelihood estimation-basedjoint sparse representation for the classification of hyperspectral remotesensing images,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 6,pp. 1790–1802, Jun. 2019.

[3] Y. Zhou, J. Peng, and C. L. P. Chen, “Dimension reduction using spatialand spectral regularized local discriminant embedding for hyperspectralimage classification,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 2,pp. 1082–1095, Feb. 2015.

[4] A. Lagrange, M. Fauvel, S. May, and N. Dobigeon, “Matrix cofactorizationfor joint spatial-spectral unmixing of hyperspectral images,” IEEE Trans.Geosci. Remote Sens., vol. 58, no. 7, pp. 4915–4927, Jul. 2020.

[5] R. Tao, X. Zhao, W. Li, H. Li, and Q. Du, “Hyperspectral anomaly detectionby fractional Fourier entropy,” IEEE J. Sel. Topics Appl. Earth Observ.Remote Sens., vol. 12, no. 12, pp. 4920–4929, Dec. 2019.

[6] M. Zhang, W. Li, Q. Du, L. Gao, and B. Zhang, “Feature extractionfor classification of hyperspectral and LiDAR data using patch-to-patchCNN,” IEEE Trans. Cybern., vol. 50, no. 1, pp. 100–111, Jan. 2020.

[7] S. Mei, R. Jiang, X. Li, and Q. Du, “Spatial and spectral joint super-resolution using convolutional neural network,” IEEE Trans. Geosci.Remote Sens., vol. 58, no. 7, pp. 4590–4603, Jul. 2020.

[8] S. Mei, J. Ji, Y. Geng, Z. Zhang, X. Li, and Q. Du, “Unsupervisedspatial-spectral feature learning by 3D convolutional autoencoder forhyperspectral classification,” IEEE Trans. Geosci. Remote Sens., vol. 57,no. 9, pp. 6808–6820, Sep. 2019.

[9] X. Zhao et al., “Joint classification of hyperspectral and lidar data using hi-erarchical random walk and deep CNN architecture,” IEEE Trans. Geosci.Remote Sens., vol. 58, no. 10, pp. 7355–7370, Oct. 2020.

[10] D. Heinz and C. Chang, “Fully constrained least squares linear spectralmixture analysis method for material quantification in hyperspectral im-agery,” IEEE Trans. Geosci. Remote Sens., vol. 39, no. 3, pp. 529–545,Mar. 2001.

[11] L. Miao, H. Qi, and H. Szu, “A maximum entropy approach to unsuper-vised mixed-pixel decomposition,” IEEE Trans. Image Process., vol. 16,no. 4, pp. 1008–1021, Mar. 2007.

Page 10: Spectral-Spatial Constrained Nonnegative Matrix ...

ZHANG et al.: SPECTRAL-SPATIAL CONSTRAINED NONNEGATIVE MATRIX FACTORIZATION 6775

[12] J. M. Bioucas-Dias et al., “Hyperspectral unmixing overview: Geomet-rical, statistical, and sparse regression-based approaches,” IEEE J. Sel.Topics Appl. Earth Observ. Remote Sens., vol. 5, no. 2, pp. 354–379,Apr. 2012.

[13] C.-I. Chang, S.-S. Chiang, J. A. Smith, and I. W. Ginsberg, “Linear spectralrandom mixture analysis for hyperspectral imagery,” IEEE Trans. Geosci.Remote Sens., vol. 40, no. 2, pp. 375–392, Feb. 2002.

[14] J. M. Nascimento and J. M. Bioucas-Dias, “Hyperspectral unmixingalgorithm via dependent component analysis,” in Proc. IEEE Int. Geosci.Remote Sens. Symp., 2007, pp. 4033–4036.

[15] D. D. Lee and H. S. Seung, “Learning the parts of objects by non-negativematrix factorization,” Nature, vol. 401, no. 6755, pp. 788–791, 1999.

[16] X. Tao, B. Wang, L. Zhang, and J. Zhang, “A new scheme fordecomposition of mixed pixels based on nonnegative matrix fac-torization,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2007,pp. 1759–1762.

[17] Z. Yang, G. Zhou, S. Xie, S. Ding, J.-M. Yang, and J. Zhang, “Blind spectralunmixing based on sparse nonnegative matrix factorization,” IEEE Trans.Image Process., vol. 20, no. 4, pp. 1112–1125, Apr. 2011.

[18] V. Pauca, J. Piper, and R. Plemmons, “Nonnegative matrix factorization forspectral data analysis,” Linear Algebra Appl., vol. 416, no. 1, pp. 29–47,2006.

[19] S. Mei, M. He, Y. Zhang, Z. Wang, and D. Feng, “Improving spatial-spectral endmember extraction in the presence of anomalous ground ob-jects,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 11, pp. 4210–4222,Nov. 2011.

[20] L. Miao and H. Qi, “Endmember extraction from highly mixed datausing minimum volume constrained nonnegative matrix factorization,”IEEE Trans. Geosci. Remote Sens., vol. 45, no. 3, pp. 765–777,May 2007.

[21] A. M. S. Ang and N. Gillis, “Algorithms and comparisons of nonnegativematrix factorizations with volume regularization for hyperspectral unmix-ing,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 12,pp. 4843–4853, Dec. 2019.

[22] L. Zhuang, C. Lin, M. A. T. Figueiredo, and J. M. Bioucas-Dias, “Regu-larization parameter selection in minimum volume hyperspectral unmix-ing,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 12, pp. 9858–9877,Dec. 2019.

[23] Y. Yu, S. Guo, and W. Sun, “Minimum distance constrained non-negativematrix factorization for the endmember extraction of hyperspectral im-ages,” Proc. SPIE, vol. 6790, 2007, pp. 1–9.

[24] N. Wang, B. Du, and L. Zhang, “An endmember dissimilarity constrainednon-negative matrix factorization method for hyperspectral unmixing,”IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 2,pp. 554–569, Apr. 2013.

[25] L. Tong, J. Zhou, Y. Qian, X. Bai, and Y. Gao, “Nonnegative matrixfactorization based hyperspectral unmixing with partially known end-members,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 11, pp. 1–14,Nov. 2016.

[26] W. He, H. Zhang, and L. Zhang, “Total variation regularized reweightedsparse nonnegative matrix factorization for hyperspectral unmixing,” IEEETrans. Geosci. Remote Sens., vol. 55, no. 7, pp. 3909–3921, Jul. 2017.

[27] F. Hao, A. Li, H. Xu, and W. Tao, “Sparsity-constrained deep nonnegativematrix factorization for hyperspectral unmixing,” IEEE Geosci. RemoteSens. Lett., vol. 15, no. 7, pp. 1105–1109, Jul. 2018.

[28] W. Wang, Y. Qian, and Y. Y. Tang, “Hypergraph-regularized sparse NMFfor hyperspectral unmixing,” IEEE J. Sel. Topics Appl. Earth Observ.Remote Sens., vol. 9, no. 2, pp. 681–694, Feb. 2016.

[29] Y. Qian, S. Jia, J. Zhou, and A. Robles-Kelly, “Hyperspectral unmixing vial1/2 sparsity-constrained nonnegative matrix factorization,” IEEE Trans.Geosci. Remote Sens., vol. 49, no. 11, pp. 4282–4297, Nov. 2011.

[30] X. Feng, H. Li, J. Li, Q. Du, A. Plaza, and W. J. Emery, “Hyperspectral un-mixing using sparsity-constrained deep nonnegative matrix factorizationwith total variation,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 10,pp. 6245–6257, Oct. 2018.

[31] R. Liu, D. Bo, and Z. Liangpei, “Hyperspectral unmixing via doubleabundance characteristics constraints based NMF,” Remote Sens., vol. 8,no. 6, pp. 464–486, 2016.

[32] X. Lu, H. Wu, and Y. Yuan, “Double constrained NMF for hyper-spectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5,pp. 2746–2758, May 2014.

[33] S. Yang, X. Zhang, Y. Yao, S. Cheng, and L. Jiao, “Geometric nonnegativematrix factorization (GNMF) for hyperspectral unmixing,” IEEE J. Sel.Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 2696–2703,Jun. 2015.

[34] L. Tong, J. Zhou, X. Li, Y. Qian, and Y. Gao, “Region-based structurepreserving nonnegative matrix factorization for hyperspectral unmixing,”IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 4,pp. 1575–1588, Apr. 2017.

[35] X. Wang, Y. Zhong, L. Zhang, and Y. Xu, “Spatial group sparsity regular-ized nonnegative matrix factorization for hyperspectral unmixing,” IEEETrans. Geosci. Remote Sens., vol. 55, no. 11, pp. 6287–6304, Nov. 2017.

[36] L. Yang, J. Peng, H. Su, L. Xu, Y. Wang, and B. Yu, “Combined nonlocalspatial information and spatial group sparsity in NMF for hyperspectral un-mixing,” IEEE Geosci. Remote Sens. Lett., vol. 17, no. 10, pp. 1767–1771,Oct. 2020.

[37] S. Jia and Y. Qian, “Constrained nonnegative matrix factorization forhyperspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 1,pp. 161–173, Jan. 2009.

[38] Y. E. Salehani and S. Gazor, “Smooth and sparse regularization for NMFhyperspectral unmixing,” IEEE J. Sel. Topics Appl. Earth Observ. RemoteSens., vol. 10, no. 8, pp. 3677–3692, Aug. 2017.

[39] H. Fang, A. Li, T. Wang, and H. Xu, “Hyperspectral unmixing usingdouble-constrained multilayer NMF,” Remote Sens. Lett., vol. 10, no. 3,pp. 224–233, 2019.

[40] J. Li, J. M. Bioucas-Dias, A. Plaza, and L. Liu, “Robust collaborativenonnegative matrix factorization for hyperspectral unmixing,” IEEE Trans.Geosci. Remote Sens., vol. 54, no. 10, pp. 6076–6090, Oct. 2016.

[41] Y. Yuan, Z. Zhang, and Q. Wang, “Improved collaborative non-negativematrix factorization and total variation for hyperspectral unmixing,” IEEEJ. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 13, pp. 998–1010,Mar. 2020.

[42] K. Qu and W. Bao, “Multiple-priors ensemble constrained nonnegativematrix factorization for spectral unmixing,” IEEE J. Sel. Topics Appl. EarthObserv. Remote Sens., vol. 13, pp. 963–975, Feb. 2020.

[43] L. Tong, J. Zhou, B. Qian, J. Yu, and C. Xiao, “Adaptive graph reg-ularized multilayer nonnegative matrix factorization for hyperspectralunmixing,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 13,pp. 434–447, Jan. 2020.

[44] J. Peng, Y. Zhou, W. Sun, Q. Du, and L. Xia, “Self-paced nonnegativematrix factorization for hyperspectral unmixing,” IEEE Trans. Geosci.Remote Sens., vol. 59, no. 2, pp. 1501–1515, Feb. 2021.

[45] F. Xiong, Y. Qian, J. Zhou, and Y. Y. Tang, “Hyperspectral unmixing viatotal variation regularized nonnegative tensor factorization,” IEEE Trans.Geosci. Remote Sens., vol. 57, no. 4, pp. 2341–2357, Apr. 2019.

[46] H. C. Li, S. Liu, X. R. Feng, and S. Q. Zhang, “Sparsity-constrainedcoupled nonnegative matrix-tensor factorization for hyperspectral unmix-ing,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 13,pp. 5061–5073, Aug. 2020.

[47] X. R. Feng, H. C. Li, S. Liu, and H. Zhang, “Correntropy-basedautoencoder-like NMF with total variation for hyperspectral un-mixing,” IEEE Geosci. Remote Sens. Lett., to be published, doi:10.1109/LGRS.2020.3020896.

[48] R. A. Borsoi, T. Imbiriba, J. C. M. Bermudez, and C. Richard, “Deep gen-erative models for library augmentation in multiple endmember spectralmixture analysis,” IEEE Geosci. Remote Sens. Lett., to be published, doi:10.1109/LGRS.2020.3007161.

[49] Z. Han, D. Hong, L. Gao, B. Zhang, and J. Chanussot, “Deep half-siamesenetworks for hyperspectral unmixing,” IEEE Geosci. Remote Sens. Lett.,to be published, doi: 10.1109/LGRS.2020.3011941.

[50] S. Mei, M. He, Z. Wang, and D. Feng, “Mixture analysis by multichanelhopfield neural network,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 3,pp. 455–459, Jul. 2010.

[51] M. Yong, L. Chang, X. Mei, C. Liu, and J. Ma, “Robust sparse hyper-spectral unmixing with L2,1 norm,” IEEE Trans. Geosci. Remote Sens.,vol. 55, no. 3, pp. 1227–1239, Mar. 2017.

[52] J. Li, Y. Li, R. Song, S. Mei, and Q. Du, “Local spectral similaritypreserving regularized robust sparse hyperspectral unmixing,” IEEE Trans.Geosci. Remote Sens., vol. 57, no. 10, pp. 7756–7769, Oct. 2019.

[53] F. Nie, H. Huang, X. Cai, and C. Ding, “Efficient and robust featureselection via joint l2,1-norms minimization,” Adv. Neural Inf. Process.Syst., vol. 23, 2010, pp. 1813–1821.

[54] D. Lee and H. Seung, “Algorithms for non-negative matrix factorization,”in Proc. 13th Int. Conf. Adv. Neural Inf. Process. Syst., 2001, pp. 535–541.

[55] M. Chu, F. Diele, R. Plemmons, and S. Ragni, “Optimality, computa-tion, and interpretation of nonnegative matrix factorizations,” Tech. Rep.,NCSU, 2005

[56] H. Kim and H. Park, “Nonnegative matrix factorization based on alternat-ing nonnegativity constrained least squares and active set method,” SIAMJ. Matrix Anal. Appl., vol. 30, no. 2, pp. 713–730, 2008.

Page 11: Spectral-Spatial Constrained Nonnegative Matrix ...

6776 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021

[57] J. Kim and H. Park, “Fast nonnegative matrix factorization: An active-set-like method and comparisons,” SIAM J. Sci. Comput., vol. 33, no. 6,pp. 3261–3281, 2011.

[58] D. Bertsekas, “On the Goldstein–Levitin–Polyak gradient projectionmethod,” IEEE Trans. Autom. Control, vol. 21, no. 2, pp. 174–184, 2003.

[59] C. Lin, “Projected gradient methods for nonnegative matrix factorization,”Neural Comput., vol. 19, no. 10, pp. 2756–2779, 2007.

[60] S. Mei, M. He, Z. Wang, and D. Feng, “Spatial purity based endmemberextraction for spectral mixture analysis,” IEEE Trans. Geosci. RemoteSens., vol. 48, no. 9, pp. 3434–3445, Sep. 2010.

[61] J. Nascimento and J. Dias, “Vertex component analysis: A fast algorithmto unmix hyperspectral data,” IEEE Trans. Geosci. Remote Sens., vol. 43,no. 4, pp. 898–910, Apr. 2005.

[62] D. Heinz, C.-I. Chang, and M. L. G. Althouse, “Fully constrained leastsquares-based linear unmixing,” in Proc. IEEE Int. Geosci. Remote Sens.Symp., 1999, pp. 1401–1403.

Ge Zhang (Student Member, IEEE) received the B.S.degree in electronics and information engineeringfrom Northwestern Polytechnical University, Xi’an,China, in 2018. She is currently working toward thePh.D. degree in information and communication engi-neering with the School of Electronics and Informa-tion, Northwestern Polytechnical University, Xi’an,China.

Her research interests include hyperspectral imageunmixing and signal processing.

Shaohui Mei (Senior Member, IEEE) received theB.S. degree in electronics and information engineer-ing and the Ph.D. degree in signal and informationprocessing from Northwestern Polytechnical Univer-sity, Xi’an, China, in 2005 and 2011, respectively.

He is currently an Associate Professor with theSchool of Electronics and Information. He was aVisiting Student with the University of Sydney, fromOctober 2007 to October 2008. His research interestsinclude hyperspectral remote sensing image process-ing and applications, intelligent signal and informa-

tion acquisition and processing, video processing, and pattern recognition.Dr. Mei was the recipient of Excellent Doctoral Dissertation Award of Shaanxi

Province in 2014, and Best Paper Award of IEEE ISPACS 2017. He is currentlyserving as Associated Editor of IEEE JOURNAL OF SELECTED TOPICS IN APPLIED

EARTH OBSERVATIONS AND REMOTE SENSING, the Reviewer of more than 20international famous academic journals, and was awarded the Best Reviewerof IEEE JSTARS in 2019. He served as the Registration Chair of IEEE ChinaSummit and International Conference on Signal and Information Processing2014.

Yan Feng received the B.S. degree in electronic en-gineering and the M.S. and Ph.D. degrees in signaland information processing from Northwestern Poly-technical University, Xi’an, China, in 1984, 1989, and2007, respectively.

She is currently a Professor with the School ofElectronics and Information, Northwestern Polytech-nical University. Her research interests include imageprocessing, compression and classification of hyper-spectral images, compressed sensing theory, and in-telligent information processing.

Qian Du (Fellow, IEEE) received the Ph.D. degree inelectrical engineering from the University of Mary-land at Baltimore County, Baltimore, MD, USA, in2000.

She is currently a Bobby Shackouls Professor withthe Department of Electrical and Computer Engineer-ing, Mississippi State University, MS, USA. Her re-search interests include hyperspectral remote sensingimage analysis and applications, pattern classifica-tion, data compression, and neural networks.

Dr. Du is a Fellow of the SPIE-International Soci-ety for Optics and Photonics. She was a recipient of the 2010 Best ReviewerAward from the IEEE Geoscience and Remote Sensing Society. She was aCo-Chair for the Data Fusion Technical Committee of the IEEE Geoscienceand Remote Sensing Society from 2009 to 2013, and the Chair for RemoteSensing and Mapping Technical Committee of the International Association forPattern Recognition from 2010 to 2014. She served as an Associate Editor of theIEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND

REMOTE SENSING, the Journal of Applied Remote Sensing, and the IEEE SIGNAL

PROCESSING LETTERS. Since 2016, she has been the Editor-in-Chief of theIEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND

REMOTE SENSING. She is the General Chair for the 4th IEEE GRSS Workshopon Hyperspectral Image and Signal Processing: Evolution in Remote Sensingin Shanghai, China, in 2012.