IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 30, NO. 1, JANUARY 2019

Multiple-Model Adaptive Estimation for 3-D and 4-D Signals: A Widely Linear Quaternion Approach

Min Xiang, Member, IEEE, Bruno Scalzo Dees, and Danilo P. Mandic, Fellow, IEEE

Abstract— Quaternion state estimation techniques have been used in various applications, yet they are only suitable for dynamical systems represented by a single known model. In order to deal with model uncertainty, this paper proposes a class of widely linear quaternion multiple-model adaptive estimation (WL-QMMAE) algorithms based on widely linear quaternion Kalman filters and Bayesian inference. The augmented second-order quaternion statistics is employed to capture complete second-order statistical information in improper quaternion signals. Within the WL-QMMAE framework, a widely linear quaternion interacting multiple-model algorithm is proposed to track time-variant model uncertainty, while a widely linear quaternion static multiple-model algorithm is proposed for time-invariant model uncertainty. A performance analysis of the proposed algorithms shows that, as expected, the WL-QMMAE reduces to semiwidely linear QMMAE for Cη-improper signals and further reduces to strictly linear QMMAE for proper signals. Simulation results indicate that for improper signals, the proposed WL-QMMAE algorithms exhibit an enhanced performance over their strictly linear counterparts. The effectiveness of the proposed recursive performance analysis algorithm is also validated.

Index Terms— Interacting multiple-model (IMM) algorithm, multiple-model adaptive estimation (MMAE), static multiple-model (SMM) algorithm, quaternion Kalman filters, quaternion noncircularity, widely linear processing.

I. INTRODUCTION

QUATERNIONS have been traditionally used in aerospace engineering and computer graphics in order to model 3-D rotations and orientations, as their division algebra rectifies numerical problems (accumulation of error and gimbal lock) associated with vector algebras [1]. The recently introduced HR and generalised HR calculi [2], [3] and augmented quaternion statistics [4], [5] have triggered a resurgence of research on quaternion-valued signal processing, such as quaternion filters and quaternion neural networks [6]–[12]. This owes to a compact model of mutual information between data channels provided by quaternions and the inherent physically meaningful interpretation for a number of 3-D and 4-D phenomena. These developments have enabled quaternions to find applications in communications, motion tracking, and biomedical signal processing [13]–[16].

Manuscript received July 31, 2017; revised January 26, 2018; accepted April 13, 2018. Date of publication May 23, 2018; date of current version December 19, 2018. (Corresponding author: Min Xiang.)

The authors are with the Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2BT, U.K. (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TNNLS.2018.2829526

The second-order statistical properties of quaternion-valued signals are conventionally characterized by their covariance. However, signal processing techniques that utilize the standard covariance are optimal only for estimating second-order circular (proper) quaternion signals. Advances in quaternion statistics have established that both the covariance and the three complementary covariances are necessary to characterize the complete augmented second-order statistics of second-order noncircular (improper) quaternion signals [4], [5]. Subsequently, widely linear processing algorithms exploiting the complementary covariances, in addition to the standard covariance, have been proposed, owing to their generic nature. For Cη-improper quaternion signals, widely linear processing reduces to semiwidely linear processing. For example, widely linear state estimation has been used for widely linear quaternion dynamical systems with improper noise [17]–[19], and it reduces to semiwidely linear estimation for semiwidely linear systems with Cη-improper noise [20].

The existing quaternion state estimation methods are suitable for dynamical systems represented by a single known model. In reality, there is often considerable uncertainty about the system model. An effective method for circumventing the system uncertainty in the real domain is through multiple-model adaptive estimation (MMAE), which can be seen as an application of the mixture of experts methodology [21], [22] to the state estimation problem. A real-valued MMAE algorithm typically consists of a bank of filters based on multiple models that represent possible system behavior patterns, referred to as modes, and a fusion algorithm that fuses the state estimates of different filters to form the overall estimate. The fusion algorithm may be based on Bayesian inference [23] or neural network training [24]. The existing MMAE algorithms are divided into two categories: static multiple-model (SMM) algorithms and interacting multiple-model (IMM) algorithms. The former are valid for time-invariant system uncertainty and the latter are effective for time-variant system uncertainty. Both have been extensively studied and used in practical applications, such as target tracking, fault diagnosis, and intelligent control [25]–[28]. The theoretical performance analysis of the MMAE algorithms, especially the IMM, however, is challenging. Recent research on this issue has been focused on recursive approaches, including the hybrid conditional averaging technique for predicting the mean square error (MSE) [29]–[31] and the posterior Cramér–Rao lower bound technique for computing the lower performance bound [32]. These approaches have been shown to be more computationally efficient than



the traditionally used Monte Carlo (MC) simulation method.

For a number of practical engineering problems for which MMAE of 3-D and 4-D signals can be used, such as maneuvering target tracking in the 3-D space, quaternion-valued MMAE (QMMAE) is a compact way to account for mutual information between the dimensions while preserving the inherent physical meaning of the signals. However, QMMAE algorithms for hybrid systems have not yet been investigated in the literature, although real-valued MMAE algorithms have been recently extended to the complex domain [33]. The QMMAE is a nontrivial extension of the real-valued MMAE, since the augmented second-order quaternion statistics must be incorporated to deal with improper quaternion signals. This paper proposes a class of widely linear QMMAE (WL-QMMAE) algorithms based on the widely linear quaternion Kalman filter (WL-QKF). This approach exploits complete second-order statistics of improper quaternion signals. For time-variant system uncertainty, we propose a widely linear quaternion IMM (WL-QIMM) algorithm and a recursive algorithm for the analysis of its performance. For time-invariant system uncertainty, we show that the WL-QIMM can simplify to a widely linear quaternion SMM (WL-QSMM) algorithm. In a similar spirit, we show that for semiwidely linear systems with Cη-improper noise, the WL-QMMAE reduces to the semiwidely linear QMMAE (SWL-QMMAE), while for strictly linear systems with H-proper noise, the WL-QMMAE reduces to the strictly linear QMMAE (SL-QMMAE). For rigor, we have also proved the convergence of WL-QSMM.

The rest of this paper is organized as follows. Section II provides an overview of quaternions and quaternion state estimation. Section III presents the WL-QMMAE framework, which yields the WL-QIMM and WL-QSMM algorithms; the performance analysis of these algorithms is also provided in Section III. Numerical simulations for the proposed algorithms are presented in Section IV. Section V concludes this paper.

Throughout this paper, we use boldface capital letters to denote matrices, A, boldface lowercase letters for vectors, a, and italic letters for scalar quantities, a. Superscripts $(\cdot)^T$, $(\cdot)^*$, and $(\cdot)^H$ denote the transpose, conjugate, and Hermitian (i.e., transpose and conjugate) operators, respectively. The symbol $E\{\cdot\}$ denotes the statistical expectation operator.

II. BACKGROUND

A. Quaternion Algebra

The quaternion domain $\mathbb{H}$ is a 4-D vector space over the real field $\mathbb{R}$, spanned by the basis {1, ı, j, κ}. A random quaternion variable x ∈ $\mathbb{H}$ consists of a real part R[·] and an imaginary part I[·], which comprises three imaginary components, so that
$$x = R[x] + I[x] = R[x] + I_ı[x]ı + I_j[x]j + I_κ[x]κ \quad (1)$$
where R[x], I_ı[x], I_j[x], and I_κ[x] are real variables and ı, j, and κ are imaginary units with the properties
$$ı^2 = j^2 = κ^2 = -1,\quad ıj = -jı = κ,\quad jκ = -κj = ı,\quad κı = -ıκ = j.$$
The conjugate of x is defined as
$$x^* = R[x] - I[x] = R[x] - I_ı[x]ı - I_j[x]j - I_κ[x]κ.$$
The modulus of x is then
$$|x| = \sqrt{R[x]^2 + I_ı[x]^2 + I_j[x]^2 + I_κ[x]^2}$$
and the product of two quaternions x, y ∈ $\mathbb{H}$ is given by
$$xy = R[x]R[y] - I[x]\cdot I[y] + R[x]I[y] + R[y]I[x] + I[x]\times I[y]$$
where "·" denotes the scalar product and "×" denotes the vector product. The presence of the vector product causes noncommutativity of the quaternion product, that is, xy ≠ yx. The quaternion product has the following properties:
$$|xy| = |x||y|,\quad x^{-1} = \frac{x^*}{|x|^2},\quad (xy)^{-1} = y^{-1}x^{-1},\quad (xy)^* = y^*x^*.$$
A quaternion variable x is called a pure quaternion if R[x] = 0. A quaternion variable x is called a unit quaternion if |x| = 1.

Another important notion is the quaternion involution [34], which defines a self-inverse mapping, analogous to the complex conjugate [35]. The general involution of the quaternion variable x is defined as $x^{α} \triangleq -αxα$, which represents the rotation of the vector part of x by π about a unit pure quaternion α. The quaternion involutions have the property $(x^{α})^{α} = x$. Accordingly, define $x^{α*} \triangleq (x^{α})^* = (x^*)^{α}$. The three special cases of involutions about the ı, j, and κ imaginary axes are given by
$$x^{ı} = -ıxı = R[x] + I_ı[x]ı - I_j[x]j - I_κ[x]κ$$
$$x^{j} = -jxj = R[x] - I_ı[x]ı + I_j[x]j - I_κ[x]κ$$
$$x^{κ} = -κxκ = R[x] - I_ı[x]ı - I_j[x]j + I_κ[x]κ. \quad (2)$$
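As a concrete illustration of the algebra above, the following Python sketch (not part of the paper; the 4-vector storage convention and the sample values are our own) implements the Hamilton product, the conjugate, and the involutions in (2), and checks the stated product and involution properties numerically.

```python
import numpy as np

# A quaternion x = a + b*i + c*j + d*k is stored as a length-4 array [a, b, c, d].

def qmul(x, y):
    """Hamilton product xy (noncommutative)."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return np.array([
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    ])

def qconj(x):
    """Conjugate x* = R[x] - I[x]."""
    return np.array([x[0], -x[1], -x[2], -x[3]])

def involution(x, axis):
    """x^eta = -eta * x * eta for a unit pure quaternion eta in {i, j, k}."""
    eta = {"i": np.array([0., 1, 0, 0]),
           "j": np.array([0., 0, 1, 0]),
           "k": np.array([0., 0, 0, 1])}[axis]
    return -qmul(qmul(eta, x), eta)

x = np.array([1.0, 2.0, -0.5, 3.0])
y = np.array([0.5, -1.0, 2.0, 1.0])

# |xy| = |x||y| and (xy)* = y* x*
assert np.isclose(np.linalg.norm(qmul(x, y)), np.linalg.norm(x) * np.linalg.norm(y))
assert np.allclose(qconj(qmul(x, y)), qmul(qconj(y), qconj(x)))

# The i-involution flips the signs of the j and k components, and (x^i)^i = x
print(involution(x, "i"))                                   # [ 1.   2.   0.5 -3. ]
assert np.allclose(involution(involution(x, "i"), "i"), x)
```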

B. Statistics and Estimation of Quaternion Signals

The set of involutions in (2), together with the original quaternion, forms the most frequently used basis for augmented quaternion statistics, which is at the core of the recently proposed widely linear processing methodology [4], [5]. The augmented second-order statistics of a zero-mean random quaternion column vector x is exploited by the ı-, j-, and κ-covariance matrices, $C_{xx^{η}} \triangleq E\{(x - E\{x\})(x - E\{x\})^{ηH}\}$, η ∈ {ı, j, κ}, which are referred to as complementary covariance matrices, together with the standard Hermitian covariance matrix, $C_{xx} \triangleq E\{(x - E\{x\})(x - E\{x\})^{H}\}$. These covariances, taken together, enable the characterization of the quaternion impropriety (second-order noncircularity), which arises from the degree of correlation and/or power imbalance between the imaginary components relative to the real component. For the nth variable in x, $x_n$, the impropriety coefficients, defined as $ρ_{η} \triangleq |C_{x_n x^{η}_n}/C_{x_n x_n}|$, η = ı, j, κ, measure the degree of correlation between $x_n$ and each of its involutions [36], [37]. Note that $ρ_{η} \in [0, 1]$.


Benefiting from the information contained in the complementary covariances, widely linear processing with the augmented signal vector
$$x^{a} \triangleq [x^{T}, x^{ıT}, x^{jT}, x^{κT}]^{T} \quad (3)$$
and semiwidely linear processing with the semiaugmented signal vector $x^{b} \triangleq [x^{T}, x^{ηT}]^{T}$, where η ∈ {ı, j, κ}, achieve a better performance for improper quaternion signals compared to traditional strictly linear processing with x [38]. The covariance matrices of $x^{a}$ and $x^{b}$ can be represented by
$$C_{x^a x^a} \triangleq E\{(x^a - E\{x^a\})(x^a - E\{x^a\})^H\} = \begin{bmatrix} C_{xx} & C_{xx^ı} & C_{xx^j} & C_{xx^κ} \\ C^{ı}_{xx^ı} & C^{ı}_{xx} & C^{ı}_{xx^κ} & C^{ı}_{xx^j} \\ C^{j}_{xx^j} & C^{j}_{xx^κ} & C^{j}_{xx} & C^{j}_{xx^ı} \\ C^{κ}_{xx^κ} & C^{κ}_{xx^j} & C^{κ}_{xx^ı} & C^{κ}_{xx} \end{bmatrix}$$
$$C_{x^b x^b} \triangleq E\{(x^b - E\{x^b\})(x^b - E\{x^b\})^H\} = \begin{bmatrix} C_{xx} & C_{xx^η} \\ C^{η}_{xx^η} & C^{η}_{xx} \end{bmatrix}.$$

The four degrees of freedom in the quaternion domain allow for different notions of properness: H-properness, Rη-properness, and Cη-properness.

Definition 1 (Properness of a Random Quaternion Vector): A random quaternion vector x is H-proper if it is uncorrelated with its involutions $x^{ı}$, $x^{j}$, and $x^{κ}$, so that $C_{xx^ı} = C_{xx^j} = C_{xx^κ} = 0$; x is Rη-proper if it is only uncorrelated with the involution $x^{η}$, so that only $C_{xx^η}$ among the three complementary covariances vanishes; x is Cη-improper if it is only correlated with the involution $x^{η}$, so that all complementary covariances except $C_{xx^η}$ vanish; otherwise, x is generally improper.
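The impropriety coefficients ρη and Definition 1 can be made tangible with a small numerical experiment. The sketch below (Python; the correlation structure of the samples and the sample size are illustrative assumptions of ours, not the paper's noise model) estimates the standard and complementary covariances of a scalar quaternion variable from samples and prints ρı, ρj, ρκ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples of a scalar quaternion variable, stored as rows [a, b, c, d].
# Correlating the j and k components makes the variable improper.
N = 200_000
a = rng.standard_normal(N)
b = rng.standard_normal(N)
c = rng.standard_normal(N)
d = 0.8 * c + 0.6 * rng.standard_normal(N)
x = np.stack([a, b, c, d], axis=1)

def qmul(p, q):
    """Row-wise Hamilton product of two arrays of quaternions."""
    a1, b1, c1, d1 = p.T
    a2, b2, c2, d2 = q.T
    return np.stack([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2], axis=1)

def qconj(p):
    return p * np.array([1.0, -1.0, -1.0, -1.0])

def involution(p, axis):
    """p^eta: flip the signs of the two imaginary components other than eta."""
    signs = {"i": [1, 1, -1, -1], "j": [1, -1, 1, -1], "k": [1, -1, -1, 1]}[axis]
    return p * np.array(signs, dtype=float)

# Standard covariance C_xx = E{x x*} and complementary covariances C_{xx^eta} = E{x (x^eta)*}
C_xx = qmul(x, qconj(x)).mean(axis=0)[0]              # real for a scalar variable
for eta in ("i", "j", "k"):
    C_xxeta = qmul(x, qconj(involution(x, eta))).mean(axis=0)
    rho = np.linalg.norm(C_xxeta) / C_xx              # impropriety coefficient in [0, 1]
    print(eta, round(rho, 3))
# Here rho_i is close to 0 while rho_j and rho_k are roughly 0.4, i.e., the variable
# is improper (in fact R_i-proper in the sense of Definition 1).
```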

Based on the above augmented second-order statistics, a generic quaternion multivariate Gaussian distribution (QMGD) for improper random vectors can be expressed as $x^a \sim N(μ^a, C_{x^a x^a})$, where $μ^a \triangleq E\{x^a\}$, with a probability density function given by [5]
$$f(x^a|μ^a, C^a) = \frac{4^N \exp\{-\tfrac{1}{2}(x^a - μ^a)^H C^{-1}_{x^a x^a}(x^a - μ^a)\}}{π^{2N}\sqrt{|C_{x^a x^a}|}} \quad (4)$$
where $|C_{x^a x^a}|$ is the determinant of $C_{x^a x^a}$.

The augmented form of a quaternion random vector x can be written as $x^a = [x^T, x^{ηT}, x^{βT}, x^{ζT}]^T$ for distinct η, β, ζ ∈ {ı, j, κ}. Note that the ordering does not change the probability density in (4). If x is Cη-improper, then $C_{x^a x^a} = \mathrm{bdiag}(C_{x^b x^b}, C^{β}_{x^b x^b})$, and (4) becomes
$$f(x^b|μ^b, C^b) = \frac{4^N \exp\{-(x^b - μ^b)^H C^{-1}_{x^b x^b}(x^b - μ^b)\}}{π^{2N}|C_{x^b x^b}|} \quad (5)$$
where $μ^b \triangleq E\{x^b\}$ and the distribution can be expressed as $x^b \sim N(μ^b, C_{x^b x^b})$.

If x is H-proper, then $C_{x^a x^a} = \mathrm{bdiag}(C_{xx}, C^{ı}_{xx}, C^{j}_{xx}, C^{κ}_{xx})$, and (4) becomes
$$f(x|μ, C) = \frac{4^N \exp\{-2(x - μ)^H C^{-1}_{xx}(x - μ)\}}{π^{2N}|C_{xx}|^2} \quad (6)$$
where $μ \triangleq E\{x\}$ and the distribution can be expressed as $x \sim N(μ, C_{xx})$.

To deal with improper quaternion signals, the WL-QKF based on the widely linear state-space model
$$x^a_n = F^a_n x^a_{n-1} + w^a_n$$
$$z^a_n = H^a_n x^a_n + v^a_n \quad (7)$$
has been proposed [17]. It is optimal for widely linear quaternion dynamical systems with general (proper or improper) noise [17]. In (7), n denotes the time instant, and $x^a_n$, $z^a_n$, $w^a_n$, and $v^a_n$ are the augmented forms of the state variable vector $x_n$, the observation vector $z_n$, the state noise vector $w_n$, and the observation noise vector $v_n$, as in (3), while the state transition and observation matrices, $F^a_n$ and $H^a_n$, have the following structure:

$$F^a_n = \begin{bmatrix} F_n & F_{ı,n} & F_{j,n} & F_{κ,n} \\ F^{ı}_{ı,n} & F^{ı}_{n} & F^{ı}_{κ,n} & F^{ı}_{j,n} \\ F^{j}_{j,n} & F^{j}_{κ,n} & F^{j}_{n} & F^{j}_{ı,n} \\ F^{κ}_{κ,n} & F^{κ}_{j,n} & F^{κ}_{ı,n} & F^{κ}_{n} \end{bmatrix}, \quad H^a_n = \begin{bmatrix} H_n & H_{ı,n} & H_{j,n} & H_{κ,n} \\ H^{ı}_{ı,n} & H^{ı}_{n} & H^{ı}_{κ,n} & H^{ı}_{j,n} \\ H^{j}_{j,n} & H^{j}_{κ,n} & H^{j}_{n} & H^{j}_{ı,n} \\ H^{κ}_{κ,n} & H^{κ}_{j,n} & H^{κ}_{ı,n} & H^{κ}_{n} \end{bmatrix}$$
where $F_n$, $F_{ı,n}$, $F_{j,n}$, $F_{κ,n}$, $H_n$, $H_{ı,n}$, $H_{j,n}$, and $H_{κ,n}$ are quaternion matrices.
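The block pattern of $F^a_n$ (and $H^a_n$) can be assembled mechanically from the four quaternion blocks and their involutions. The Python sketch below is our own illustration, not code from the paper: quaternion matrices are stored as real arrays of shape (m, n, 4), and the blocks are random placeholders.

```python
import numpy as np

def involution(Q, axis):
    """Elementwise eta-involution of a quaternion matrix (sign flips of two imaginary parts)."""
    signs = {"i": [1, 1, -1, -1], "j": [1, -1, 1, -1], "k": [1, -1, -1, 1]}[axis]
    return Q * np.array(signs, dtype=float)

def augmented(F, Fi, Fj, Fk):
    """Stack the 4x4 block structure of F^a in the basis [x, x^i, x^j, x^k]."""
    rows = [
        [F, Fi, Fj, Fk],
        [involution(Fi, "i"), involution(F, "i"), involution(Fk, "i"), involution(Fj, "i")],
        [involution(Fj, "j"), involution(Fk, "j"), involution(F, "j"), involution(Fi, "j")],
        [involution(Fk, "k"), involution(Fj, "k"), involution(Fi, "k"), involution(F, "k")],
    ]
    return np.concatenate([np.concatenate(r, axis=1) for r in rows], axis=0)

m = 2  # state dimension
rng = np.random.default_rng(1)
F, Fi, Fj, Fk = (rng.standard_normal((m, m, 4)) for _ in range(4))
Fa = augmented(F, Fi, Fj, Fk)
print(Fa.shape)   # (8, 8, 4): a 4m x 4m quaternion matrix

# Strictly linear special case: F_i = F_j = F_k = 0, so F^a is block-diagonal
Z = np.zeros_like(F)
Fa_sl = augmented(F, Z, Z, Z)
```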

Observe that under the condition of Cη-improperness, the widely linear model in (7) reduces to the semiwidely linear one given by
$$x^b_n = F^b_n x^b_{n-1} + w^b_n$$
$$z^b_n = H^b_n x^b_n + v^b_n \quad (8)$$
where the state transition and observation matrices, $F^b_n$ and $H^b_n$, have the following structure:
$$F^b_n = \begin{bmatrix} F_n & F_{η,n} \\ F^{η}_{η,n} & F^{η}_{n} \end{bmatrix}, \quad H^b_n = \begin{bmatrix} H_n & H_{η,n} \\ H^{η}_{η,n} & H^{η}_{n} \end{bmatrix}.$$
The semiwidely linear quaternion state estimation based on the semiwidely linear model in (8) is therefore optimal when the true system model is semiwidely linear with Cη-improper $w_n$ and $v_n$ [39].

Furthermore, under the condition of H-properness, the widely linear model in (7) reduces to the strictly linear one given by
$$x_n = F_n x_{n-1} + w_n$$
$$z_n = H_n x_n + v_n. \quad (9)$$
The strictly linear quaternion state estimation based on the strictly linear model in (9) is optimal only when the true system model is strictly linear and $w_n$ and $v_n$ are H-proper [40]. This conclusion is an extension of the analysis result for complex Kalman filters [41] to the quaternion domain.


III. QUATERNION MMAE

If a quaternion dynamical system is represented by the widely linear model in (7), but the state transition matrix, the observation matrix, and the properties of the noises are uncertain over time, a quaternion Kalman filter based on a single assumed model will suffer a great performance loss. In such cases, we may employ L potential models as candidates for the unknown system model. We can assume that the system model is always included in the L candidate models, each of which is represented by
$$x^a_n = F^{a[l]}_n x^a_{n-1} + w^{a[l]}_n$$
$$z^a_n = H^{a[l]}_n x^a_n + v^{a[l]}_n \quad (10)$$
where l ∈ {1, 2, . . . , L} is the model index, $w^{a[l]}_n$ and $v^{a[l]}_n$ are zero-mean improper state and observation noise, and $F^{a[l]}_n$ and $H^{a[l]}_n$ are the state transition and observation matrices corresponding to model l. We refer to the system behavior pattern as a mode. For example, if the system can be represented by model l at time n, we say that the system is in mode l at time n. Upon implementing L WL-QKFs, we can obtain their augmented state estimates and error covariances given by

$$\hat{x}^{a[l]}_{n|n} = E\{x^a_n \,|\, Z^a_n, τ^{[l]}_n\}$$
$$P^{a[l]}_{n|n} = E\{(x^a_n - \hat{x}^{a[l]}_{n|n})(x^a_n - \hat{x}^{a[l]}_{n|n})^H \,|\, Z^a_n, τ^{[l]}_n\}.$$

By fusing the L estimates, the overall augmented estimate can be obtained as
$$E\{x^a_n \,|\, Z^a_n\} = \sum_{l=1}^{L} p(τ^{[l]}_n \,|\, Z^a_n)\, \hat{x}^{a[l]}_{n|n} \quad (11)$$
while the overall augmented error covariance becomes
$$E\{(x^a_n - E\{x^a_n|Z^a_n\})(x^a_n - E\{x^a_n|Z^a_n\})^H \,|\, Z^a_n\}$$
$$= \sum_{l=1}^{L} p(τ^{[l]}_n|Z^a_n)\big[P^{a[l]}_{n|n} + (\hat{x}^{a[l]}_{n|n} - \hat{x}^a_{n|n})(\hat{x}^{a[l]}_{n|n} - \hat{x}^a_{n|n})^H\big]$$
$$= \sum_{l=1}^{L} p(τ^{[l]}_n|Z^a_n)\big[P^{a[l]}_{n|n} + \hat{x}^{a[l]}_{n|n}(\hat{x}^{a[l]}_{n|n})^H\big] - \hat{x}^a_{n|n}(\hat{x}^a_{n|n})^H \quad (12)$$

where $Z^a_n \triangleq [z^{aT}_1, \ldots, z^{aT}_n]^T = [Z^{aT}_{n-1}, z^{aT}_n]^T$ is the set of measurements and $τ^{[l]}_n$ is the event that the system is in mode l at time instant n. Using Bayes' law, we next obtain
$$p(τ^{[l]}_n|Z^a_n) = \frac{p(z^a_n|Z^a_{n-1}, τ^{[l]}_n)\, p(τ^{[l]}_n|Z^a_{n-1})}{\sum_{i=1}^{L} p(z^a_n|Z^a_{n-1}, τ^{[i]}_n)\, p(τ^{[i]}_n|Z^a_{n-1})}$$
$$= \frac{p(z^a_n|Z^a_{n-1}, τ^{[l]}_n)\, \sum_{i=1}^{L} p(τ^{[i]}_{n-1}|Z^a_{n-1})\, p(τ^{[l]}_n|τ^{[i]}_{n-1})}{\sum_{i=1}^{L} p(z^a_n|Z^a_{n-1}, τ^{[i]}_n)\, \sum_{j=1}^{L} p(τ^{[j]}_{n-1}|Z^a_{n-1})\, p(τ^{[i]}_n|τ^{[j]}_{n-1})} \quad (13)$$

where $p(z^a_n|Z^a_{n-1}, τ^{[l]}_n)$ is the likelihood function of $τ^{[l]}_n$. Note that $p(z^a_n|Z^a_{n-1}, τ^{[l]}_n) = p(r^{a[l]}_n|τ^{[l]}_n)$, where
$$r^{a[l]}_n \triangleq z^a_n - H^{a[l]}_n \hat{x}^{a[l]}_{n|n-1} \quad (14)$$
is referred to as the augmented residual vector. Following the principle commonly used for real-valued and complex-valued MMAE, we assume the augmented residual vector to be normally distributed, with $r^{a[l]}_n \sim N(0, S^{a[l]}_n)$, where $S^{a[l]}_n$ is the augmented covariance matrix of $r^{a[l]}_n$. Denote the likelihood function $p(z^a_n|Z^a_{n-1}, τ^{[l]}_n)$ by $L^{[l]}_n$, which can be computed as
$$L^{[l]}_n = f(r^{a[l]}_n \,|\, 0, S^{a[l]}_n). \quad (15)$$
Note that $S^{a[l]}_n$ is an intermediate result in the WL-QKF algorithm corresponding to model l. The estimated mode at time n is therefore given by $\arg\max_l p(τ^{[l]}_n|Z^a_n)$. The above constitutes the main framework of the proposed WL-QMMAE. This framework is flexible and therefore can incorporate other approaches for nonlinear estimation. For example, the parameters within the selected models may be a function of the order of the signal subspace or the number of noncircular signals in the subspace [42]. The fusion in (11) and (12) is a data-driven approach and can be viewed as a more direct and adaptive way of implementing the multistep generalized likelihood ratio test with a unit threshold [42]. The conditional probabilities of the various hypotheses modeled in the filters can be based on criteria other than the likelihood. For instance, in order to select the model that best fits the observation data while avoiding overfitting, the likelihood function can be replaced with $\exp(-\mathrm{ITC}_l)$, where $\mathrm{ITC}_l = -\ln[f(r^{a[l]}_n|0, S^{a[l]}_n)] + αO^{[l]}$ is the information-theoretic criterion of model l, with $αO^{[l]}$ a penalty term that penalizes complicated models to avoid overfitting, α the penalty factor dependent on the chosen criterion, and $O^{[l]}$ the order of model l [43].

A. WL-QIMM

To understand the WL-QMMAE, the probability $p(τ^{[l]}_n|Z^a_{n-1})$ in (13) needs to be explained further. For systems with a time-varying mode, we make the standard assumption that the mode transition is governed by a first-order Markov chain, that is,
$$p(τ^{[l]}_n \,|\, τ^{[i]}_{n-1}) = π^{[il]} \quad \forall\, i, l \in \{1, 2, \ldots, L\} \quad (16)$$
where $π^{[il]}$ is the Markov transition probability from mode i to mode l. Upon incorporating this assumption and imposing a reinitialization step at the beginning of the estimation to couple the L WL-QKFs, the WL-QMMAE framework reduces to a practical WL-QIMM algorithm, an extension of the real-valued IMM [44]. An iteration of this recursive algorithm at time n is described as follows and is shown in the schematic in Fig. 1.

Step 1 (Reinitialization): For l = 1, . . . , L, i = 1, . . . , L, the probability that the system was in mode i at time (n−1), given that the system is in mode l at time n, is computed as
$$γ^{[li]}_n \triangleq p(τ^{[i]}_{n-1} \,|\, τ^{[l]}_n, Z^a_{n-1}) = \frac{π^{[il]} μ^{[i]}_{n-1}}{\sum_{j=1}^{L} π^{[jl]} μ^{[j]}_{n-1}}$$
where $μ^{[i]}_{n-1}$, termed the mode probability, is the estimate of $p(τ^{[i]}_{n-1}|Z^a_{n-1})$ from the previous iteration. For l = 1, . . . , L,


Fig. 1. Block diagram of the WL-QIMM algorithm.

compute the augmented initial state of filter l based on Bayes' law as
$$\hat{x}^{a[l]}_{0n} = E\{x^a_{n-1} \,|\, τ^{[l]}_n, Z^a_{n-1}\} = \sum_{i=1}^{L} \hat{x}^{a[i]}_{n-1|n-1}\, γ^{[li]}_n \quad (17)$$
and the augmented initial error covariance of filter l as
$$P^{a[l]}_{0n} = \sum_{i=1}^{L} γ^{[li]}_n \big[P^{a[i]}_{n-1|n-1} + (\hat{x}^{a[l]}_{0n} - \hat{x}^{a[i]}_{n-1|n-1})(\hat{x}^{a[l]}_{0n} - \hat{x}^{a[i]}_{n-1|n-1})^H\big]$$
$$= \sum_{i=1}^{L} γ^{[li]}_n \big[P^{a[i]}_{n-1|n-1} + \hat{x}^{a[i]}_{n-1|n-1}(\hat{x}^{a[i]}_{n-1|n-1})^H\big] - \hat{x}^{a[l]}_{0n}(\hat{x}^{a[l]}_{0n})^H. \quad (18)$$

Step 2 (Mode-Conditioned Kalman Filtering): Implement L WL-QKFs with the initial conditions $\hat{x}^{a[l]}_{0n}$ and $P^{a[l]}_{0n}$. Specifically, for l = 1, . . . , L, compute the predicted augmented state
$$\hat{x}^{a[l]}_{n|n-1} = F^{a[l]}_n \hat{x}^{a[l]}_{0n}$$
the predicted augmented error covariance
$$P^{a[l]}_{n|n-1} = F^{a[l]}_n P^{a[l]}_{0n} (F^{a[l]}_n)^H + Q^{a[l]}_n$$
the augmented residual
$$r^{a[l]}_n = z^a_n - H^{a[l]}_n \hat{x}^{a[l]}_{n|n-1}$$
the augmented residual covariance
$$S^{a[l]}_n = H^{a[l]}_n P^{a[l]}_{n|n-1} (H^{a[l]}_n)^H + R^{a[l]}_n$$
the filter gain
$$K^{a[l]}_n = P^{a[l]}_{n|n-1} (H^{a[l]}_n)^H (S^{a[l]}_n)^{-1}$$
the updated augmented state
$$\hat{x}^{a[l]}_{n|n} = \hat{x}^{a[l]}_{n|n-1} + K^{a[l]}_n r^{a[l]}_n$$
the updated augmented error covariance
$$P^{a[l]}_{n|n} = P^{a[l]}_{n|n-1} - K^{a[l]}_n H^{a[l]}_n P^{a[l]}_{n|n-1}$$
and the likelihood for mode l as in (15).

Step 3 (Combination): For l = 1, . . . , L, compute the estimate of $p(τ^{[l]}_n|Z^a_n)$, denoted by $μ^{[l]}_n$, as
$$μ^{[l]}_n = \frac{L^{[l]}_n \sum_{i=1}^{L} π^{[il]} μ^{[i]}_{n-1}}{\sum_{i=1}^{L} L^{[i]}_n \sum_{j=1}^{L} π^{[ji]} μ^{[j]}_{n-1}} \quad (19)$$
according to (13) and (16). Based on (11), compute the overall augmented state estimate as a weighted sum of the L augmented state estimates from the L WL-QKFs, that is,
$$\hat{x}^a_{n|n} = \sum_{l=1}^{L} μ^{[l]}_n \hat{x}^{a[l]}_{n|n}. \quad (20)$$
Based on (12), compute the overall augmented error covariance as
$$P^a_{n|n} = \sum_{l=1}^{L} μ^{[l]}_n \big[P^{a[l]}_{n|n} + (\hat{x}^{a[l]}_{n|n} - \hat{x}^a_{n|n})(\hat{x}^{a[l]}_{n|n} - \hat{x}^a_{n|n})^H\big] = \sum_{l=1}^{L} μ^{[l]}_n \big[P^{a[l]}_{n|n} + \hat{x}^{a[l]}_{n|n}(\hat{x}^{a[l]}_{n|n})^H\big] - \hat{x}^a_{n|n}(\hat{x}^a_{n|n})^H. \quad (21)$$
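A compact real-valued analogue of one WL-QIMM iteration is sketched below (Python). The augmented quaternion vectors and matrices are replaced by ordinary real ones so that the structure of Steps 1–3 (mixing, mode-conditioned Kalman filtering, combination) is visible at a glance; the two toy models loosely mirror the scalar system of Section IV-A, while the transition matrix, noise covariances, and measurements are placeholders of our own.

```python
import numpy as np

def gauss_pdf(r, S):
    """Zero-mean real Gaussian density, standing in for the augmented density (4)."""
    k = r.size
    return np.exp(-0.5 * r @ np.linalg.solve(S, r)) / np.sqrt((2 * np.pi) ** k * np.linalg.det(S))

def imm_step(z, x_prev, P_prev, mu_prev, Pi, F, H, Q, R):
    """One iteration; x_prev[l], P_prev[l] are the per-mode estimates from the previous step."""
    L = len(F)
    # Step 1: reinitialization (mixing), cf. (17)-(18)
    prior = Pi.T @ mu_prev
    x0, P0 = [], []
    for l in range(L):
        gamma = Pi[:, l] * mu_prev / prior[l]
        x0l = sum(gamma[i] * x_prev[i] for i in range(L))
        P0l = sum(gamma[i] * (P_prev[i] + np.outer(x_prev[i] - x0l, x_prev[i] - x0l))
                  for i in range(L))
        x0.append(x0l); P0.append(P0l)
    # Step 2: mode-conditioned Kalman filtering
    x_upd, P_upd, lik = [], [], np.zeros(L)
    for l in range(L):
        x_pred = F[l] @ x0[l]
        P_pred = F[l] @ P0[l] @ F[l].T + Q[l]
        r = z - H[l] @ x_pred
        S = H[l] @ P_pred @ H[l].T + R[l]
        K = P_pred @ H[l].T @ np.linalg.inv(S)
        x_upd.append(x_pred + K @ r)
        P_upd.append(P_pred - K @ H[l] @ P_pred)
        lik[l] = gauss_pdf(r, S)
    # Step 3: combination, cf. (19)-(21)
    mu = lik * prior
    mu /= mu.sum()
    x_hat = sum(mu[l] * x_upd[l] for l in range(L))
    P_hat = sum(mu[l] * (P_upd[l] + np.outer(x_upd[l] - x_hat, x_upd[l] - x_hat)) for l in range(L))
    return x_hat, P_hat, x_upd, P_upd, mu

# Toy two-mode scalar system (placeholder transition probabilities and noise powers)
F = [np.array([[0.95]]), np.array([[0.85]])]
H = [np.array([[1.0]]), np.array([[0.4]])]
Q = [np.eye(1) * 1.6, np.eye(1)]
R = [np.eye(1), np.eye(1)]
Pi = np.array([[0.95, 0.05], [0.05, 0.95]])
x, P, mu = [np.zeros(1), np.zeros(1)], [np.eye(1), np.eye(1)], np.array([0.5, 0.5])
for z in np.random.default_rng(2).standard_normal(50):    # synthetic data, just to exercise the loop
    x_hat, P_hat, x, P, mu = imm_step(np.array([z]), x, P, mu, Pi, F, H, Q, R)
```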

B. Performance Analysis of WL-QIMM

The stability condition of the real-valued IMM algorithm proved in [45] is readily extended to the WL-QIMM, that is, if the system is uniformly controllable and uniformly observable, the covariances of the mode-conditioned WL-QKFs are uniformly bounded from above and below. Furthermore, to efficiently evaluate the estimation performance of WL-QIMM, this section extends the hybrid conditional averaging algorithms for the performance analysis of real-valued IMM [29], [30] to the quaternion domain. The resulting recursive algorithm, without recourse to multiple trials, is more computationally efficient than the commonly used MC simulation method. Note that owing to the widely linear nature of the WL-QIMM algorithm, the performance analysis algorithm must also be in a widely linear form, since the augmented second-order quaternion statistics must be incorporated.

The performance of WL-QIMM depends on the operating scenario, so we analyze the average performance of WL-QIMM in a definite operating scenario S in which the system can be represented by a widely linear quaternion state model in the form of (7). Compared with (10), the model in (7) is general in the sense that $F^a_n$, $H^a_n$, $w^a_n$, and $v^a_n$ may or may not belong to the candidate model sets used by the L WL-QKFs. We define $\bar{x} \triangleq E\{x|S\}$ to denote the mean of the quaternion random vector x in a scenario S, while $\mathrm{Cov}(x, y) \triangleq E\{(x - \bar{x})(y - \bar{y})^H|S\}$ denotes the cross covariance of two quaternion random vectors x and y in S. Based on these definitions, we can derive a recursive algorithm that estimates the MSE of WL-QIMM without generating observation data in multiple trials. This is achieved by the conditional expectation operation mentioned earlier, through which the randomness of the error covariance due to the noise is averaged out. This recursive algorithm is similar to the performance analysis algorithm for real-valued IMM [30], which originated from [29], but is different in two respects: 1) the augmented forms of the quaternion variables are employed and 2) the generic QMGD with the probability density function in (4) is incorporated into the evaluation of the average likelihood functions and the average mode probabilities. Next, we describe an iteration of the proposed performance analysis algorithm at time n, which is shown in Fig. 2, but for space concerns, we omit the derivation,


owing to its generic extension from the real-valued algorithm. For more details, we refer the reader to [29] and [30].

Fig. 2. Block diagram of the performance analysis algorithm for WL-QIMM.

Step 1 (Reinitialization): Compute the means of the initial conditions of the mode-conditioned WL-QKFs. For l = 1, . . . , L, i = 1, . . . , L, compute the mean of the mixing probability as
$$\bar{γ}^{[li]}_n \approx \frac{π^{[il]} \bar{μ}^{[i]}_{n-1}}{\sum_{j=1}^{L} π^{[jl]} \bar{μ}^{[j]}_{n-1}}. \quad (22)$$
For l = 1, . . . , L, compute the mean of the initial condition of filter l as
$$\bar{x}^{a[l]}_{0n} = \sum_{i=1}^{L} \bar{γ}^{[li]}_n\, \bar{x}^{a[i]}_{n-1|n-1}$$
$$P^{a[l]}_{0n} = \sum_{i=1}^{L} \bar{γ}^{[li]}_n \big[P^{a[i]}_{n-1|n-1} + (\bar{x}^{a[l]}_{0n} - \bar{x}^{a[i]}_{n-1|n-1})(\bar{x}^{a[l]}_{0n} - \bar{x}^{a[i]}_{n-1|n-1})^H\big] = \sum_{i=1}^{L} \bar{γ}^{[li]}_n \big[P^{a[i]}_{n-1|n-1} + \bar{x}^{a[i]}_{n-1|n-1}(\bar{x}^{a[i]}_{n-1|n-1})^H\big] - \bar{x}^{a[l]}_{0n}(\bar{x}^{a[l]}_{0n})^H.$$

Step 2 (Mode-Conditioned Kalman Filtering): Compute the means of the residuals and estimates of the mode-conditioned WL-QKFs. For l = 1, . . . , L, define the estimation error of filter l as $e^{[l]}_n \triangleq x_n - \hat{x}^{[l]}_{n|n}$, and the error of the initial state of filter l as $e^{[l]}_{0n} \triangleq x_{n-1} - \hat{x}^{[l]}_{0n}$. Their means are related by
$$\bar{e}^{[l]}_{0n} \approx \sum_{i=1}^{L} \bar{γ}^{[li]}_n\, \bar{e}^{[i]}_{n-1}.$$
For l = 1, . . . , L, define the model error as
$$Δ^{a[l]} \triangleq H^a_n F^a_n - H^{a[l]}_n F^{a[l]}_n$$
and compute
$$\bar{r}^{a[l]}_n = H^{a[l]}_n F^{a[l]}_n \bar{e}^{a[l]}_{0n} + Δ^{a[l]} \bar{x}^a_{n-1}$$
$$P^{a[l]}_{n|n-1} = F^{a[l]}_n P^{a[l]}_{0n} (F^{a[l]}_n)^H + Q^{a[l]}_n$$
$$S^{a[l]}_n = H^{a[l]}_n P^{a[l]}_{n|n-1} (H^{a[l]}_n)^H + R^{a[l]}_n$$
$$K^{a[l]}_n = P^{a[l]}_{n|n-1} (H^{a[l]}_n)^H (S^{a[l]}_n)^{-1}$$
$$\bar{x}^{a[l]}_{n|n} = F^{a[l]}_n \bar{x}^{a[l]}_{0n} + K^{a[l]}_n \bar{r}^{a[l]}_n$$
$$P^{a[l]}_{n|n} = P^{a[l]}_{n|n-1} - K^{a[l]}_n H^{a[l]}_n P^{a[l]}_{n|n-1}$$
$$\bar{e}^{a[l]}_n = (I - K^{a[l]}_n H^{a[l]}_n) F^{a[l]}_n \bar{e}^{a[l]}_{0n} + (F^a_n - F^{a[l]}_n - K^{a[l]}_n Δ^{a[l]}) \bar{x}^a_{n-1}$$
where I is the identity matrix. Update $\bar{x}^a_n = F^a_n \bar{x}^a_{n-1}$.

Step 3 (Cross-Covariance Analysis): Compute the cross covariances of the augmented residuals and augmented estimation errors of the mode-conditioned WL-QKFs. For l = 1, . . . , L, i = 1, . . . , L, compute the cross covariances of the augmented residuals as
$$\mathrm{Cov}(r^{a[l]}_n, r^{a[i]}_n) = H^a_n Q^a_n H^{aH}_n + Δ^{a[l]} \mathrm{Cov}(x^a_{n-1}, x^a_{n-1}) (Δ^{a[i]})^H + H^{a[l]}_n F^{a[l]}_n \mathrm{Cov}(e^{a[l]}_{0n}, e^{a[i]}_{0n}) (H^{a[i]}_n F^{a[i]}_n)^H$$
$$+ H^{a[l]}_n F^{a[l]}_n \mathrm{Cov}(e^{a[l]}_{0n}, x^a_{n-1}) (Δ^{a[i]})^H + Δ^{a[l]} \mathrm{Cov}(e^{a[i]}_{0n}, x^a_{n-1})^H (H^{a[i]}_n F^{a[i]}_n)^H + R^a_n$$
and the cross covariances of the augmented estimation errors as
$$\mathrm{Cov}(e^{a[l]}_n, e^{a[i]}_n) = (I - K^{a[l]}_n H^{a[l]}_n) F^{a[l]}_n \mathrm{Cov}(e^{a[l]}_{0n}, e^{a[i]}_{0n}) (F^{a[i]}_n - K^{a[i]}_n H^{a[i]}_n F^{a[i]}_n)^H$$
$$+ (F^a_n - F^{a[l]}_n - K^{a[l]}_n Δ^{a[l]}) \mathrm{Cov}(x^a_{n-1}, x^a_{n-1}) (F^a_n - F^{a[i]}_n - K^{a[i]}_n Δ^{a[i]})^H$$
$$+ (F^{a[l]}_n - K^{a[l]}_n H^{a[l]}_n F^{a[l]}_n) \mathrm{Cov}(e^{a[l]}_{0n}, x^a_{n-1}) (F^a_n - F^{a[i]}_n - K^{a[i]}_n Δ^{a[i]})^H$$
$$+ (F^a_n - F^{a[l]}_n - K^{a[l]}_n Δ^{a[l]}) \mathrm{Cov}(e^{a[i]}_{0n}, x^a_{n-1})^H (F^{a[i]}_n - K^{a[i]}_n H^{a[i]}_n F^{a[i]}_n)^H$$
$$+ (I - K^{a[l]}_n H^a_n) Q^a_n (I - K^{a[i]}_n H^a_n)^H + K^{a[l]}_n R^a_n (K^{a[i]}_n)^H$$
where
$$\mathrm{Cov}(e^{a[l]}_{0n}, x^a_{n-1}) \approx \sum_{i=1}^{L} \bar{γ}^{[li]}_n \mathrm{Cov}(e^{a[i]}_{n-1}, x^a_{n-1}) \quad (23)$$
$$\mathrm{Cov}(e^{a[l]}_{0n}, e^{a[i]}_{0n}) \approx \sum_{k=1}^{L} \sum_{j=1}^{L} \bar{γ}^{[lk]}_n \bar{γ}^{[ij]}_n \mathrm{Cov}(e^{a[k]}_{n-1}, e^{a[j]}_{n-1}). \quad (24)$$
For l = 1, . . . , L, update
$$\mathrm{Cov}(e^{a[l]}_n, x^a_n) = (I - K^{a[l]}_n H^{a[l]}_n) F^{a[l]}_n \mathrm{Cov}(e^{a[l]}_{0n}, x^a_{n-1}) F^{aH}_n + (F^a_n - F^{a[l]}_n - K^{a[l]}_n Δ^{a[l]}) \mathrm{Cov}(x^a_{n-1}, x^a_{n-1}) F^{aH}_n + (I - K^{a[l]}_n H^a_n) Q^a_n$$
$$\mathrm{Cov}(x^a_n, x^a_n) = F^a_n \mathrm{Cov}(x^a_{n-1}, x^a_{n-1}) F^{aH}_n + Q^a_n.$$


Step 4 (Combination): For l = 1, . . . , L, employing the fact that $r^{a[l]}_n \sim N(\bar{r}^{a[l]}_n, V^{a[l]}_n)$, where $V^{a[l]}_n = \mathrm{Cov}(r^{a[l]}_n, r^{a[l]}_n)$, we can compute the mean of the likelihood function as
$$\bar{L}^{[l]}_n = \int_{\mathbb{H}^N} L^{[l]}_n f(r^{a[l]}_n \,|\, \bar{r}^{a[l]}_n, V^{a[l]}_n)\, dr^{[l]}_n = \int_{\mathbb{H}^N} f(r^{a[l]}_n \,|\, 0, S^{a[l]}_n)\, f(r^{a[l]}_n \,|\, \bar{r}^{a[l]}_n, V^{a[l]}_n)\, dr^{[l]}_n$$
$$= \Big(\frac{2}{π}\Big)^{2N} \frac{\big|\big((S^{a[l]}_n)^{-1} + (V^{a[l]}_n)^{-1}\big)^{-1}\big|^{0.5}}{\big|S^{a[l]}_n\big|^{0.5}\,\big|V^{a[l]}_n\big|^{0.5}}\, \exp\big\{-0.5\, (\bar{r}^{a[l]}_n)^H (V^{a[l]}_n + S^{a[l]}_n)^{-1} \bar{r}^{a[l]}_n\big\} \quad (25)$$

and the mean of the mode probability as
$$\bar{μ}^{[l]}_n = \int_{\mathbb{H}^N} μ^{[l]}_n f(r^{a[l]}_n \,|\, \bar{r}^{a[l]}_n, V^{a[l]}_n)\, dr^{[l]}_n = \int_{\mathbb{H}^N} \frac{\sum_{i=1}^{L} π^{[il]} \bar{μ}^{[i]}_{n-1}\, f(r^a_n \,|\, 0, S^{a[l]}_n)\, f(r^a_n \,|\, \bar{r}^{a[l]}_n, V^{a[l]}_n)}{\sum_{i=1}^{L} \sum_{j=1}^{L} π^{[ji]} \bar{μ}^{[j]}_{n-1}\, f(r^a_n \,|\, 0, S^{a[i]}_n)}\, dr_n$$
$$= \bar{L}^{[l]}_n \sum_{i=1}^{L} π^{[il]} \bar{μ}^{[i]}_{n-1} \int_{\mathbb{H}^N} \frac{f\big(r^a_n \,\big|\, (V^{a[l]}_n (S^{a[l]}_n)^{-1} + I)^{-1} \bar{r}^{a[l]}_n,\; ((S^{a[l]}_n)^{-1} + (V^{a[l]}_n)^{-1})^{-1}\big)}{\sum_{i=1}^{L} \sum_{j=1}^{L} π^{[ji]} \bar{μ}^{[j]}_{n-1}\, f(r^a_n \,|\, 0, S^{a[i]}_n)}\, dr_n. \quad (26)$$

Then, compute the covariance of the overall augmented estimation error as
$$E\{(x^a_n - \hat{x}^a_{n|n})(x^a_n - \hat{x}^a_{n|n})^H \,|\, S\} \approx \sum_{l=1}^{L} \sum_{i=1}^{L} \bar{μ}^{[l]}_n \bar{μ}^{[i]}_n \big[\mathrm{Cov}(e^{a[l]}_n, e^{a[i]}_n) + \bar{e}^{a[l]}_n (\bar{e}^{a[i]}_n)^H\big] \quad (27)$$
and the MSE of the overall estimate is given by
$$E\{(x_n - \hat{x}_{n|n})^H (x_n - \hat{x}_{n|n}) \,|\, S\} = \frac{1}{4}\mathrm{Tr}\big[E\{(x^a_n - \hat{x}^a_{n|n})(x^a_n - \hat{x}^a_{n|n})^H \,|\, S\}\big]. \quad (28)$$

Remark 1: In order to derive a closed-form solution, we have applied the approximations in (22)–(24) and (27) based on the zero-correlation assumption between the involved variables.
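The closed forms in (25) and (26) rest on the fact that the product of two Gaussian densities in the residual integrates to a single Gaussian evaluated at the mean offset. A quick real-valued Monte Carlo check of this identity is given below (Python; the paper works with the quaternion density (4), and the covariances here are arbitrary illustrative values).

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

# Identity behind (25)-(26): int N(r; 0, S) N(r; rbar, V) dr = N(rbar; 0, S + V).
rng = np.random.default_rng(3)
S = np.array([[2.0, 0.3], [0.3, 1.0]])
V = np.array([[1.0, -0.2], [-0.2, 0.5]])
rbar = np.array([0.7, -1.1])

samples = rng.multivariate_normal(rbar, V, size=200_000)    # r ~ N(rbar, V)
mc = mvn.pdf(samples, mean=np.zeros(2), cov=S).mean()        # Monte Carlo estimate of E{N(r; 0, S)}
closed = mvn.pdf(rbar, mean=np.zeros(2), cov=S + V)          # closed-form value
print(mc, closed)   # the two agree to Monte Carlo accuracy
```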

C. WL-QSMM

If the system mode is time invariant, then
$$π^{[il]} = \begin{cases} 1, & l = i \\ 0, & l \neq i \end{cases}$$
so that (19) simplifies to
$$μ^{[l]}_n = \frac{L^{[l]}_n μ^{[l]}_{n-1}}{\sum_{i=1}^{L} L^{[i]}_n μ^{[i]}_{n-1}} \quad (29)$$
and the reinitialization step can be proven to be unnecessary. In this case, the WL-QIMM algorithm reduces to the WL-QSMM algorithm, for which each iteration includes the following two steps.

Step 1 (Mode-Conditioned Kalman Filtering): Implement L WL-QKFs with the initial states and covariances being the estimates obtained in the previous iteration. This step is the same as Step 2 of the WL-QIMM except that $\hat{x}^{a[l]}_{0n}$ and $P^{a[l]}_{0n}$ in (17) and (18) are replaced with $\hat{x}^{a[l]}_{n-1|n-1}$ and $P^{a[l]}_{n-1|n-1}$ for l = 1, . . . , L.

Step 2 (Combination): Compute the mode probabilities according to (15) and (29). The overall augmented estimate and overall augmented error covariance are given by (20) and (21), respectively.
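Under a time-invariant mode, the whole mode-probability recursion collapses to (29), i.e., a running normalized product of likelihoods. A minimal sketch (Python; the likelihood values are made up purely to illustrate the recursion):

```python
import numpy as np

def smm_update(mu_prev, lik):
    """One application of (29): multiply by the per-mode likelihoods and renormalize."""
    mu = lik * mu_prev
    return mu / mu.sum()

mu = np.array([0.5, 0.5])
for lik in ([0.30, 0.20], [0.28, 0.18], [0.33, 0.22], [0.31, 0.19]):
    mu = smm_update(mu, np.array(lik))
print(mu)   # the mode with consistently larger likelihoods accumulates most of the probability
```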

The performance analysis algorithm proposed in Section III-B can also be applied to the WL-QSMM algorithm by replacing $\bar{e}^{a[l]}_{0n}$, $\bar{x}^{a[l]}_{0n}$, and $P^{a[l]}_{0n}$ with $\bar{e}^{a[l]}_{n-1}$, $\bar{x}^{a[l]}_{n-1|n-1}$, and $P^{a[l]}_{n-1|n-1}$, for l = 1, . . . , L.

We next prove that the WL-QSMM algorithm converges to the candidate model closest to the true system model in the sense of the Kullback–Leibler divergence. The Kullback–Leibler divergence between $N(E\{r^a_n\}, S^a_n)$ and $N(E\{r^{a[l]}_n\}, S^{a[l]}_n)$, where $r^a_n$ is the augmented residual corresponding to the filter that exactly matches the true system model and $S^a_n$ is its covariance matrix, is defined by
$$K_l \triangleq \frac{1}{2}\ln\frac{|S^{a[l]}_n|}{|S^a_n|} + \frac{1}{2}\mathrm{Tr}\big\{R\big[(S^{a[l]}_n)^{-1} S^a_n\big] - I_{4N}\big\} + \frac{1}{2}\big(E\{r^{a[l]}_n\} - E\{r^a_n\}\big)^H (S^{a[l]}_n)^{-1}\big(E\{r^{a[l]}_n\} - E\{r^a_n\}\big)$$
$$= \frac{1}{2}\ln\frac{|S^{a[l]}_n|}{|S^a_n|} + \frac{1}{2}\mathrm{Tr}\big\{R\big[(S^{a[l]}_n)^{-1} E\{r^{a[l]}_n (r^{a[l]}_n)^H\}\big] - I\big\}. \quad (30)$$

The second equality in (30) is based on $E\{r^a_n\} = 0$ and
$$E\{r^{a[l]}_n (r^{a[l]}_n)^H\} - E\{r^{a[l]}_n\}\big(E\{r^{a[l]}_n\}\big)^H = S^a_n$$
which is a straightforward extension of the analysis result for real-valued mismatched Kalman filters [46] to the quaternion domain. Let $t = \arg\min_{l\in\{1,2,\ldots,L\}} K_l$, which is the index of the candidate model closest to the true system model. For $l \neq t$, define $Λ^{[l]}_n \triangleq μ^{[l]}_n / μ^{[t]}_n$ to obtain
$$\frac{Λ^{[l]}_n}{Λ^{[l]}_{n-1}} = \frac{L^{[l]}_n}{L^{[t]}_n} = \sqrt{\frac{|S^{a[t]}_n|}{|S^{a[l]}_n|}}\, \exp\Big\{\frac{1}{2}\Big[(r^{a[t]}_n)^H (S^{a[t]}_n)^{-1} r^{a[t]}_n - (r^{a[l]}_n)^H (S^{a[l]}_n)^{-1} r^{a[l]}_n\Big]\Big\}$$
$$= \sqrt{\frac{|S^{a[t]}_n|}{|S^{a[l]}_n|}}\, \exp\Big\{\frac{1}{2}\mathrm{Tr}\Big[R\big[(S^{a[t]}_n)^{-1} r^{a[t]}_n (r^{a[t]}_n)^H - (S^{a[l]}_n)^{-1} r^{a[l]}_n (r^{a[l]}_n)^H\big]\Big]\Big\}. \quad (31)$$
Applying the statistical expectation operator and the natural logarithm to (31) yields
$$\ln\big(E\big\{Λ^{[l]}_n / Λ^{[l]}_{n-1}\big\}\big) = K_t - K_l < 0$$
whereby $E\{Λ^{[l]}_n / Λ^{[l]}_{n-1}\} < 1$, which indicates that $μ^{[l]}_n$ decreases to 0 and hence $μ^{[t]}_n$ increases to 1 as $n \to \infty$. This proves the convergence of the WL-QSMM algorithm to model t.


Remark 2: Since the Kullback–Leibler divergence in (30) is nonnegative and vanishes only when the filter model exactly matches the true system, the algorithm will converge to the true system model if the true model belongs to the candidate model set.
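For intuition, the divergence (30) can be evaluated for ordinary real Gaussian residual models. The sketch below (Python; zero-mean residuals and hand-picked covariances, so only a real-valued analogue of (30) is computed) shows that the candidate whose residual covariance is closest to the true one attains the smallest K_l, which is the model the WL-QSMM recursion converges to.

```python
import numpy as np

# Real-valued analogue of (30) for zero-mean Gaussian residuals:
#   K_l = 0.5*ln(det(S_l)/det(S_true)) + 0.5*(trace(inv(S_l) @ S_true) - dim)
def kl_gauss(S_true, S_l):
    d = S_true.shape[0]
    M = np.linalg.solve(S_l, S_true)
    return 0.5 * (np.log(np.linalg.det(S_l) / np.linalg.det(S_true)) + np.trace(M) - d)

S_true = np.diag([1.0, 2.0])
candidates = [np.diag([1.1, 2.2]), np.diag([3.0, 0.5]), np.eye(2)]
K = [kl_gauss(S_true, S) for S in candidates]
print(np.argmin(K), np.round(K, 3))   # candidate 0 is closest to the true model
```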

D. SWL-QMMAE and SL-QMMAE

For semiwidely linear quaternion systems with Cη-improper quaternion Gaussian noise, the above WL-QMMAE algorithms reduce to the corresponding SWL-QMMAE algorithms by replacing the augmented forms of the variables with the semiaugmented forms. Based on the probability density function in (5), the likelihood functions in the SWL-QIMM and SWL-QSMM algorithms are given by
$$L^{[l]}_n = f(r^{b[l]}_n \,|\, 0, S^{b[l]}_n).$$
For strictly linear quaternion systems with H-proper quaternion Gaussian noise, the WL-QMMAE algorithms reduce to the corresponding SL-QMMAE algorithms by replacing the augmented forms of the variables with the original forms. Since the QMGD is associated with the probability density function in (6) under this condition, the likelihood functions in the SL-QIMM and SL-QSMM algorithms become
$$L^{[l]}_n = f(r^{[l]}_n \,|\, 0, S^{[l]}_n).$$

The SL-QMMAE and SWL-QMMAE algorithms can be seen as reduced-order WL-QMMAE algorithms [47]. Therefore, they are suboptimal for widely linear quaternion systems and general improper quaternion noise, and their performance disadvantage increases with the degree to which the system is widely linear and/or the noise is improper.

IV. SIMULATIONS

A. QIMM Algorithms for Filtering Quaternion Signals

This section presents simulation results for evaluating the SL-QIMM, SWL-QIMM, and WL-QIMM algorithms and the corresponding performance analysis algorithms. Since the linear nature (strictly, semiwidely, or widely linear) of the involved quaternion system essentially affects the estimation performance, we employed a strictly linear quaternion hybrid system, with noise at different levels of impropriety, to study the impact of impropriety on the estimation performance. The considered system, which was used in [33], can be represented by the following 1-D state-space model:
$$x_n = b^{[l]} x_{n-1} + w^{[l]}_n$$
$$z_n = c^{[l]} x_n + v^{[l]}_n$$
where l ∈ {1, 2}, $b^{[1]} = 0.95$, $b^{[2]} = 0.85$, $c^{[1]} = 1$, $c^{[2]} = 0.4$, and $w^{[1]}_n$, $w^{[2]}_n$, $v^{[1]}_n$, $v^{[2]}_n$ are proper or improper quaternion Gaussian noise sequences with powers of 2, 0, 0, and 0 dB, respectively. The system mode was 1 from n = 1 to 100 and from n = 200 to 300, and was 2 from n = 100 to 200. In each experiment, 500 MC simulation runs of the three considered QIMM algorithms were implemented and the estimation performance was evaluated through the averaged MSE. The initial mode probabilities were set to $μ^{[1]}_0 = 0.1$ and $μ^{[2]}_0 = 0.9$. The proposed performance analysis algorithms were also implemented for comparison with the MC results.
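The data-generation part of this experiment can be sketched as follows (Python). Because b^[l] and c^[l] are real, the scalar quaternion recursion acts componentwise on the four real parts; for brevity the noise below is proper, whereas the paper uses improper noise with the impropriety coefficients given next, and the mapping of the stated dB powers to per-component standard deviations is our own assumption.

```python
import numpy as np

# Scalar quaternion state stored as a 4-vector of real components [a, b, c, d].
# Mode 1 on n in [1,100] and (200,300], mode 2 on (100,200].
rng = np.random.default_rng(4)
b = {1: 0.95, 2: 0.85}
c = {1: 1.0, 2: 0.4}
power_w_dB = {1: 2.0, 2: 0.0}     # stated state-noise powers
power_v_dB = {1: 0.0, 2: 0.0}     # stated observation-noise powers

def quaternion_noise(power_dB):
    """Proper quaternion noise with total power 10**(power_dB/10), split over 4 components."""
    std = np.sqrt(10 ** (power_dB / 10) / 4)
    return std * rng.standard_normal(4)

x = np.zeros(4)
xs, zs, modes = [], [], []
for n in range(1, 301):
    mode = 1 if (n <= 100 or n > 200) else 2
    x = b[mode] * x + quaternion_noise(power_w_dB[mode])
    z = c[mode] * x + quaternion_noise(power_v_dB[mode])
    xs.append(x); zs.append(z); modes.append(mode)

xs, zs = np.array(xs), np.array(zs)   # one realization; the paper averages over 500 such runs
```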

Fig. 3. Learning curves of the three QIMM algorithms when the state and observation noise are improper. (a) Probability of the time-varying mode 1. (b) MSE of the algorithms considered.

Fig. 3 shows the simulation results of the three QIMM algorithms for improper state noise with impropriety coefficients ρı = ρj = ρκ = 0.2 and improper observation noise with ρı = ρj = ρκ = 0.9. Fig. 3(a) shows the average probabilities of the hypothesized mode 1 over 500 MC runs. Observe that the WL-QIMM has converged faster to the true mode than the other two algorithms, while the SL-QIMM has exhibited the slowest convergence. As shown in Fig. 3(b), the WL-QIMM has achieved the smallest MSE, while the SL-QIMM has attained the largest MSE; the MSE computed from the proposed performance analysis algorithm for WL-QIMM has matched the MSE averaged over the MC runs with a 12% difference on average. The error arose from the approximations in (22)–(24) and (27), and from the integral calculation in (26). The 500 MC runs of WL-QIMM took 79 min, while the proposed performance analysis algorithm took only 2 min on a standard PC. Obviously, this computational efficiency advantage of the performance analysis algorithm will increase with the data size.

Fig. 4 shows the simulation results of the three QIMM algorithms for Cı-improper state and observation noise.


Fig. 4. Learning curves of the three QIMM algorithms when the state and observation noise are Cı-improper. (a) Probability of the time-varying mode 1. (b) MSE of the algorithms considered.

Fig. 4(a) shows the average probabilities of the hypothesized mode 1 over 500 MC runs, and Fig. 4(b) shows the MSEs of the three QIMM algorithms. As shown in Fig. 4, the WL-QIMM and SWL-QIMM have exhibited the same performance, achieving a lower MSE than the SL-QIMM, and the MSE computed from the performance analysis algorithm for WL-QIMM has matched the MSE averaged over the MC runs with a 16% difference on average.

Fig. 5 shows the simulation results of the three QIMM algorithms for H-proper state and observation noise. Fig. 5(a) shows the average probabilities of the hypothesized mode 1 over the 500 MC simulation runs, and Fig. 5(b) shows the MSEs of the three QIMM algorithms. As shown in Fig. 5, the three QIMM algorithms had the same performance for H-proper noise, and the MSE computed from the performance analysis algorithm for WL-QIMM has matched the MSE averaged over the MC runs with a 14% difference on average.

Fig. 6 shows the average MSEs of the three QIMM algorithms over the whole simulation period. Fig. 6(a) shows the results for H-proper observation noise and improper state noise with varying impropriety coefficients. For simplicity, the three impropriety coefficients of the state noise were set to be equal, that is, ρı = ρj = ρκ = ρ.

Fig. 5. Learning curves of the three QIMM algorithms when the state and observation noise are H-proper. (a) Probability of the time-varying mode 1. (b) MSE of the algorithms considered.

Fig. 6(b) shows the results for H-proper state noise and improper observation noise with various impropriety coefficients. The three impropriety coefficients of the observation noise were also equal. Fig. 6 shows that for H-proper noise, the three QIMM algorithms have exhibited the same MSE, while for improper noise, the WL-QIMM has achieved the smallest MSE and the SL-QIMM has attained the largest MSE. As the impropriety degree of the noise increases, the MSE difference between the three algorithms also increases. Interestingly, the MSE of the SL-QIMM increased with the impropriety coefficient of the state noise but was unaffected by the impropriety coefficient of the observation noise, while the MSEs of the SWL-QIMM and WL-QIMM decreased with the impropriety coefficients of the state and observation noise. The MSE estimated from the performance analysis algorithm for WL-QIMM shows the same trend as the MC simulation result, with about a 10% difference in value.

B. QSMM Algorithms for Filtering Quaternion Signals

This section presents the simulation results for evaluating the SL-QSMM, SWL-QSMM, and WL-QSMM algorithms and the performance analysis algorithm for WL-QSMM. The simulation settings were the same as those for the QIMM algorithms in Section IV-A, except that the system mode was constantly 1.


Fig. 6. MC simulated MSEs of the three QIMM algorithms and the MSE computed from the proposed performance analysis algorithm (analysis) for WL-QIMM, as a function of the impropriety coefficient of the noise. (a) Improper state noise and H-proper observation noise. (b) H-proper state noise and improper observation noise.

Fig. 7 shows the simulation results of the three QSMM algorithms for improper state noise with ρı = ρj = ρκ = 0.2 and improper observation noise with ρı = ρj = ρκ = 0.9. Fig. 7(a) shows the average probabilities of the hypothesized mode 1 over 500 MC runs. Observe that the WL-QSMM has converged faster to the true mode than the other two algorithms, and the SL-QSMM has exhibited the slowest convergence. As shown in Fig. 7(b), the WL-QSMM has achieved the smallest MSE, while the SL-QSMM has attained the largest MSE; the MSE computed from the performance analysis algorithm for WL-QSMM has matched the MSE averaged over the MC runs with a 5% difference. The performance analysis algorithm for the WL-QSMM was more accurate than the analysis algorithm for the WL-QIMM simulated in Section IV-A because of the less complicated mechanism of the QSMM algorithm compared with the QIMM algorithm.

Fig. 8 shows the average MSEs of the three QSMM algorithms over the whole simulation period. Fig. 8(a) shows the results for H-proper observation noise and improper state noise over a range of impropriety coefficients. For simplicity, the three impropriety coefficients of the state noise were set to be equal.

Fig. 7. Learning curves of the three QSMM algorithms when the state and observation noise are improper. (a) Probability of the time-varying mode 1. (b) MSE of the algorithms considered.

Fig. 8(b) shows the results for H-proper state noise and improper observation noise with various impropriety coefficients. The three impropriety coefficients of the observation noise were also equal. Fig. 8 shows that for H-proper noise, the three QSMM algorithms have exhibited the same MSE, while for improper noise, the WL-QSMM has achieved the smallest MSE and the SL-QSMM has attained the largest MSE. As the impropriety degree of the noise increases, the MSE difference between the three algorithms increases. It is also noticeable that the MSEs of the WL-QSMM and SWL-QSMM decreased with the impropriety coefficients of the state and observation noise, whereas the MSE of the SL-QSMM was unaffected by the impropriety coefficients. The MSE estimated from the performance analysis algorithm for WL-QSMM has matched the MC simulation result.

C. QIMM Algorithms for 3-D Target Tracking

The effectiveness of the proposed WL-QIMM algorithm is next illustrated in the context of target tracking in a 3-D space. The task is to track the position and velocity of a moving target using measurements of its position. Denote the position by $(X_n, Y_n, Z_n)$, the velocity by $(\dot{X}_n, \dot{Y}_n, \dot{Z}_n)$, and the acceleration by $(\ddot{X}_n, \ddot{Y}_n, \ddot{Z}_n)$. Consider two quaternion-valued kinematic models.


Fig. 8. MC simulated MSEs of the three QSMM algorithms and the MSE computed from the proposed performance analysis algorithm (analysis) for WL-QSMM, as a function of the impropriety coefficient of the noise. (a) Improper state noise and H-proper observation noise. (b) H-proper state noise and improper observation noise.

The first one is the constant velocity model, given by [25]
$$x_n = \begin{bmatrix} 1 & Δt \\ 0 & 1 \end{bmatrix} x_{n-1} + w_n$$
with the measurement given by
$$z_n = [\,1 \quad 0\,]\, x_n + v_n$$
where $x_n = [ıX_n + jY_n + κZ_n,\; ı\dot{X}_n + j\dot{Y}_n + κ\dot{Z}_n]^T$, Δt is the sampling interval, $w_n$ is a zero-mean pure quaternion Gaussian noise vector, and $v_n$ is zero-mean pure quaternion Gaussian noise. The covariance matrices of the state noise are
$$C_{ww} = \begin{bmatrix} \tfrac{1}{3}Δt^3 & \tfrac{1}{2}Δt^2 \\ \tfrac{1}{2}Δt^2 & Δt \end{bmatrix} σ^2_w$$
$$C_{ww^ı} = ρ_{wı} C_{ww}, \quad C_{ww^j} = ρ_{wj} C_{ww}, \quad C_{ww^κ} = ρ_{wκ} C_{ww}$$
where $σ^2_w$ indicates the power of the state noise, and $ρ_{wı}, ρ_{wj}, ρ_{wκ} \in \mathbb{R}$ are the impropriety coefficients satisfying $ρ_{wı} + ρ_{wj} + ρ_{wκ} = -1$. The covariances of the observation noise are $C_{vv} = σ^2_v$, $C_{vv^ı} = ρ_{vı} C_{vv}$, $C_{vv^j} = ρ_{vj} C_{vv}$, and $C_{vv^κ} = ρ_{vκ} C_{vv}$, where $σ^2_v$ indicates the power of the observation noise, and $ρ_{vı}, ρ_{vj}, ρ_{vκ} \in \mathbb{R}$ are the impropriety coefficients satisfying $ρ_{vı} + ρ_{vj} + ρ_{vκ} = -1$.

Fig. 9. Learning curves of the three QIMM algorithms for target tracking in a 3-D space. (a) Probability of the time-varying mode 1. (b) MSE of the position and velocity estimation.

The second model is the constant acceleration model, given by [25]
$$x_n = \begin{bmatrix} 1 & Δt & \tfrac{1}{2}Δt^2 \\ 0 & 1 & Δt \\ 0 & 0 & 1 \end{bmatrix} x_{n-1} + w_n$$
with the measurement given by
$$z_n = [\,1 \quad 0 \quad 0\,]\, x_n + v_n$$
where
$$x_n = \begin{bmatrix} ıX_n + jY_n + κZ_n \\ ı\dot{X}_n + j\dot{Y}_n + κ\dot{Z}_n \\ ı\ddot{X}_n + j\ddot{Y}_n + κ\ddot{Z}_n \end{bmatrix}.$$
The pure quaternion Gaussian noises $w_n$ and $v_n$ in this model have the same properties as those in the constant velocity model, except for
$$C_{ww} = \begin{bmatrix} \tfrac{1}{20}Δt^5 & \tfrac{1}{8}Δt^4 & \tfrac{1}{6}Δt^3 \\ \tfrac{1}{8}Δt^4 & \tfrac{1}{3}Δt^3 & \tfrac{1}{2}Δt^2 \\ \tfrac{1}{6}Δt^3 & \tfrac{1}{2}Δt^2 & Δt \end{bmatrix} σ^2_w.$$
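For reference, the transition, observation, and state-noise covariance matrices of the two kinematic models are straightforward functions of the sampling interval Δt. The Python sketch below builds them; the matrices are real, so plain arrays suffice (in the paper they multiply quaternion-valued state vectors).

```python
import numpy as np

def constant_velocity(dt, sigma_w2):
    """F, H, and C_ww of the constant velocity model."""
    F = np.array([[1.0, dt],
                  [0.0, 1.0]])
    H = np.array([[1.0, 0.0]])
    Cww = sigma_w2 * np.array([[dt**3 / 3, dt**2 / 2],
                               [dt**2 / 2, dt]])
    return F, H, Cww

def constant_acceleration(dt, sigma_w2):
    """F, H, and C_ww of the constant acceleration model."""
    F = np.array([[1.0, dt, dt**2 / 2],
                  [0.0, 1.0, dt],
                  [0.0, 0.0, 1.0]])
    H = np.array([[1.0, 0.0, 0.0]])
    Cww = sigma_w2 * np.array([[dt**5 / 20, dt**4 / 8, dt**3 / 6],
                               [dt**4 / 8,  dt**3 / 3, dt**2 / 2],
                               [dt**3 / 6,  dt**2 / 2, dt]])
    return F, H, Cww

F1, H1, C1 = constant_velocity(0.1, 0.01)      # mode 1 in the experiment that follows
F2, H2, C2 = constant_acceleration(0.1, 1.0)   # mode 2
```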


In our simulation, Δt = 0.1 s, and the target initially had a position of (0, 0, 0), a velocity of (1, 2, 0.5), and vanishing acceleration. The target followed the constant velocity model with $σ^2_w = 0.01$ (mode 1) for 150 s before following the constant acceleration model with $σ^2_w = 1$ (mode 2) for another 150 s. During the whole 300 s, ρwı = ρwj = ρwκ = −1/3, ρvı = ρvj = −25/27, ρvκ = 23/27, and $σ^2_v = 1$. In the estimation algorithms, the initial estimate of the state vector was set to [2ı + 2.1j + 2.2κ, 0.1ı + 0.2j + 0.3κ, 0.05ı + 0.05j + 0.05κ], and the initial mode probabilities were $μ^{[1]}_0 = 0.1$ and $μ^{[2]}_0 = 0.9$. Fig. 9 shows the MC simulated performance of the three considered QIMM algorithms. As shown in Fig. 9(a), the probability of mode 1 increased to 0.9 in the first 150 s and decreased toward 0.2 in the second 150 s. Fig. 9(b) shows that the WL-QIMM has achieved a lower MSE of the position and velocity estimation compared with the SL-QIMM and SWL-QIMM.

V. CONCLUSION

We have proposed a class of WL-QMMAE algorithmswhich employ the augmented second-order quaternion statis-tics to cater for both proper and improper quaternion Gaussiansignals. The WL-QIMM algorithm for tracking time-variantsystem mode uncertainty and the WL-QSMM algorithm fortracking time-invariant system mode uncertainty have beenproposed based on the Bayesian inference. A recursive algo-rithm has been derived to efficiently assess the performanceof WL-QIMM, and a convergence proof of WL-QSMMhas been provided. We have shown that, as expected, theWL-QMMAE reduces to the semiwidely linear form forC

η-improper signals and further reduces to the strictly linearform for H-proper signals. The effectiveness of the proposedalgorithms has been verified by numerical simulations over3-D and 4-D signal processing case studies. Although thepresented algorithms in this paper use Kalman filters for linearstate estimation, the proposed WL-QMMAE framework isflexible and therefore can incorporate other types of filters andneural networks for nonlinear estimation [48]. The proposedalgorithms can also be modified to the widely linear quaternioncounterpart of the Gaussian sum filtering algorithms for non-Gaussian signals [49], [50].

REFERENCES

[1] J. B. Kuipers, Quaternions and Rotation Sequences: A Primer With Applications to Orbits, Aerospace and Virtual Reality. Princeton, NJ, USA: Princeton Univ. Press, 1999.

[2] D. P. Mandic, C. Jahanchahi, and C. C. Took, “A quaternion gradient operator and its applications,” IEEE Signal Process. Lett., vol. 18, no. 1, pp. 47–50, Jan. 2011.

[3] D. Xu, C. Jahanchahi, C. C. Took, and D. P. Mandic, “Enabling quaternion derivatives: The generalized HR calculus,” Roy. Soc. Open Sci., vol. 2, no. 8, p. 150255, 2015.

[4] J. Vía, D. Ramírez, and I. Santamaría, “Properness and widely linear processing of quaternion random vectors,” IEEE Trans. Inf. Theory, vol. 56, no. 7, pp. 3502–3515, Jul. 2010.

[5] C. C. Took and D. P. Mandic, “Augmented second-order statistics of quaternion random signals,” Signal Process., vol. 91, no. 2, pp. 214–224, Feb. 2011.

[6] T. Mizoguchi and I. Yamada, “An algebraic translation of Cayley-Dickson linear systems and its applications to online learning,” IEEE Trans. Signal Process., vol. 62, no. 6, pp. 1438–1453, Mar. 2014.

[7] C. C. Took and D. P. Mandic, “The quaternion LMS algorithm for adaptive filtering of hypercomplex processes,” IEEE Trans. Signal Process., vol. 57, no. 4, pp. 1316–1327, Apr. 2009.

[8] T. K. Paul and T. Ogunfunmi, “A kernel adaptive algorithm for quaternion-valued inputs,” IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 10, pp. 2422–2439, Oct. 2015.

[9] F. Shang and A. Hirose, “Quaternion neural-network-based PolSAR land classification in Poincare-sphere-parameter space,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 9, pp. 5693–5703, Sep. 2014.

[10] Y. Xia, C. Jahanchahi, and D. P. Mandic, “Quaternion-valued echo state networks,” IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 4, pp. 663–673, Apr. 2015.

[11] M. E. Valle and F. Z. de Castro, “On the dynamics of Hopfield neural networks on unit quaternions,” IEEE Trans. Neural Netw. Learn. Syst., doi: 10.1109/TNNLS.2017.2691462.

[12] T. Minemoto, T. Isokawa, H. Nishimura, and N. Matsui, “Feed forward neural network with random quaternionic neurons,” Signal Process., vol. 136, pp. 59–68, Jul. 2017.

[13] S. Miron, N. Le Bihan, and J. I. Mars, “Quaternion-MUSIC for vector-sensor array processing,” IEEE Trans. Signal Process., vol. 54, no. 4, pp. 1218–1229, Apr. 2006.

[14] J.-F. Gu and K. Wu, “Quaternion modulation for dual-polarized antennas,” IEEE Commun. Lett., vol. 21, no. 2, pp. 286–289, Feb. 2017.

[15] H. Fourati, N. Manamanni, L. Afilal, and Y. Handrich, “Complementary observer for body segments motion capturing by inertial and magnetic sensors,” IEEE/ASME Trans. Mechatronics, vol. 19, no. 1, pp. 149–157, Feb. 2014.

[16] K. Adhikari, S. Tatinati, W. T. Ang, K. C. Veluvolu, and K. Nazarpour, “A quaternion weighted Fourier linear combiner for modeling physiological tremor,” IEEE Trans. Biomed. Eng., vol. 63, no. 11, pp. 2336–2346, Nov. 2016.

[17] C. Jahanchahi and D. P. Mandic, “A class of quaternion Kalman filters,” IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 3, pp. 533–544, Mar. 2014.

[18] J. Navarro-Moreno, R. M. Fernández-Alcalá, and J. C. Ruiz-Molina, “A quaternion widely linear model for nonlinear Gaussian estimation,” IEEE Trans. Signal Process., vol. 62, no. 24, pp. 6414–6424, Dec. 2014.

[19] J. D. Jiménez-López, R. M. Fernández-Alcalá, J. Navarro-Moreno, and J. C. Ruiz-Molina, “Widely linear estimation of quaternion signals with intermittent observations,” Signal Process., vol. 136, pp. 92–101, Jul. 2017.

[20] J. Navarro-Moreno, R. M. Fernández-Alcalá, and J. C. Ruiz-Molina, “Semi-widely simulation and estimation of continuous-time Cη-proper quaternion random signals,” IEEE Trans. Signal Process., vol. 63, no. 18, pp. 4999–5012, Sep. 2015.

[21] R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, “Adaptive mixtures of local experts,” Neural Comput., vol. 3, no. 1, pp. 79–87, 1991.

[22] S. E. Yuksel, J. N. Wilson, and P. D. Gader, “Twenty years of mixture of experts,” IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 8, pp. 1177–1193, Aug. 2012.

[23] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Englewood Cliffs, NJ, USA: Prentice-Hall, 1979.

[24] M. Gwak, K. Jo, and M. Sunwoo, “Neural-network multiple models filter (NMM)-based position estimation system for autonomous vehicles,” Int. J. Autom. Technol., vol. 14, no. 2, pp. 265–274, Apr. 2013.

[25] Y. Bar-Shalom, X.-R. Li, and T. Kirubarajan, Estimation With Applications to Tracking and Navigation: Theory, Algorithms and Software. Hoboken, NJ, USA: Wiley, 2001.

[26] L. Chen and K. S. Narendra, “Nonlinear adaptive control using neural networks and multiple models,” Automatica, vol. 37, no. 8, pp. 1245–1255, Aug. 2001.

[27] L. Chen and K. S. Narendra, “Intelligent control using multiple neural networks,” Int. J. Adapt. Control Signal Process., vol. 17, no. 6, pp. 417–430, 2003.

[28] Y. Fu and T. Chai, “Nonlinear multivariable adaptive control using multiple models and neural networks,” Automatica, vol. 43, no. 6, pp. 1101–1110, Jun. 2007.

[29] X. R. Li and Y. Bar-Shalom, “Performance prediction of the interacting multiple model algorithm,” IEEE Trans. Aerosp. Electron. Syst., vol. 29, no. 3, pp. 755–771, Jul. 1993.

[30] C. E. Seah and I. Hwang, “Algorithm for performance analysis of the IMM algorithm,” IEEE Trans. Aerosp. Electron. Syst., vol. 47, no. 2, pp. 1114–1124, Apr. 2011.

[31] R. W. Osborne and W. D. Blair, “Update to the hybrid conditional averaging performance prediction of the IMM algorithm,” IEEE Trans. Aerosp. Electron. Syst., vol. 47, no. 4, pp. 2967–2974, Oct. 2011.

[32] C. Hue, J.-P. Le Cadre, and P. Perez, “Posterior Cramer–Rao bounds for multi-target tracking,” IEEE Trans. Aerosp. Electron. Syst., vol. 42, no. 1, pp. 37–49, Jan. 2006.

[33] A. Mohammadi and K. N. Plataniotis, “Improper complex-valued multiple-model adaptive estimation,” IEEE Trans. Signal Process., vol. 63, no. 6, pp. 1528–1542, Mar. 2015.

[34] T. A. Ell and S. J. Sangwine, “Quaternion involutions and anti-involutions,” Comput. Math. Appl., vol. 53, no. 1, pp. 137–143, Jan. 2007.

[35] D. P. Mandic and V. S. L. Goh, Complex Valued Nonlinear Adaptive Filters: Noncircularity, Widely Linear and Neural Models. Hoboken, NJ, USA: Wiley, 2009.

[36] S. P. Talebi, S. Kanna, and D. P. Mandic, “Real-time estimation of quaternion impropriety,” in Proc. IEEE Int. Conf. Digit. Signal Process. (DSP), Jul. 2015, pp. 557–561.

[37] J. Vía, D. Palomar, L. Vielva, and I. Santamaría, “Quaternion ICA from second-order statistics,” IEEE Trans. Signal Process., vol. 59, no. 4, pp. 1586–1600, Apr. 2011.

[38] Y. Xia, C. Jahanchahi, T. Nitta, and D. P. Mandic, “Performance bounds of quaternion estimators,” IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 12, pp. 3287–3292, Dec. 2015.

[39] J. D. J. López, R. M. Fernández-Alcalá, J. Navarro-Moreno, and J. C. Ruiz-Molina, “A semi-widely linear filtering algorithm for Cη-proper quaternion signals based on randomly modeled observations,” in Proc. IEEE 17th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jul. 2016, pp. 1–5.

[40] C. Jahanchahi, “Quaternion valued adaptive signal processing,” Ph.D. dissertation, Dept. Elect. Electron. Eng., Imperial College London, London, U.K., 2013.

[41] D. H. Dini and D. P. Mandic, “Class of widely linear complex Kalman filters,” IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 5, pp. 775–786, May 2012.

[42] T. Hasija, C. Lameiro, and P. J. Schreier, “Determining the dimension of the improper signal subspace in complex-valued data,” IEEE Signal Process. Lett., vol. 24, no. 11, pp. 1606–1610, Nov. 2017.

[43] M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,” IEEE Trans. Acoust., Speech, Signal Process., vol. 33, no. 2, pp. 387–392, Apr. 1985.

[44] H. A. P. Blom and Y. Bar-Shalom, “The interacting multiple model algorithm for systems with Markovian switching coefficients,” IEEE Trans. Autom. Control, vol. AC-33, no. 8, pp. 780–783, Aug. 1988.

[45] I. Hwang, C. E. Seah, and S. Lee, “A study on stability of the interacting multiple model algorithm,” IEEE Trans. Autom. Control, vol. 62, no. 2, pp. 901–906, Feb. 2017.

[46] P. D. Hanlon and P. S. Maybeck, “Characterization of Kalman filter residuals in the presence of mismodeling,” IEEE Trans. Aerosp. Electron. Syst., vol. 36, no. 1, pp. 114–131, Jan. 2000.

[47] B. Friedland, “On the properties of reduced order Kalman filters,” IEEE Trans. Autom. Control, vol. 34, no. 3, pp. 321–324, Mar. 1989.

[48] F. A. Tobar, S.-Y. Kung, and D. P. Mandic, “Multikernel least mean square algorithm,” IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 2, pp. 265–277, Feb. 2014.

[49] D. L. Alspach and H. W. Sorenson, “Nonlinear Bayesian estimation using Gaussian sum approximations,” IEEE Trans. Autom. Control, vol. AC-17, no. 4, pp. 439–448, Aug. 1972.

[50] A. Mohammadi and K. N. Plataniotis, “Complex-valued Gaussian sum filter for nonlinear filtering of non-Gaussian/non-circular noise,” IEEE Signal Process. Lett., vol. 22, no. 4, pp. 440–444, Apr. 2015.

Min Xiang (M’18) received the Ph.D. degree in electrical engineering from Imperial College London, London, U.K., in 2018.

He is currently a Research Associate with Imperial College London. His current research interests include quaternion-valued statistical signal processing and adaptive filtering.

Bruno Scalzo Dees received the M.Eng. degree in aeronautical engineering from Imperial College London, London, U.K., where he is currently pursuing the Ph.D. degree with the Department of Electrical and Electronic Engineering.

His current research interests include statistical signal processing, dynamical systems, and tensor decompositions.

Danilo P. Mandic (M’99–SM’03–F’12) received the Ph.D. degree in nonlinear adaptive signal processing from Imperial College London, London, U.K., in 1999.

He has been a Guest Professor with Katholieke Universiteit Leuven, Leuven, Belgium, the Tokyo University of Agriculture and Technology, Tokyo, Japan, Westminster University, London, and a Frontier Researcher with RIKEN, Wako, Japan. He is currently a Professor of signal processing with Imperial College London, where he is involved in nonlinear adaptive signal processing and nonlinear dynamics. He is also the Deputy Director of the Financial Signal Processing Laboratory, Imperial College London. He has two research monographs, Recurrent Neural Networks for Prediction: Learning Algorithms, Architectures and Stability (West Sussex, U.K.: Wiley, 2001) and Complex Valued Nonlinear Adaptive Filters: Noncircularity, Widely Linear and Neural Models (West Sussex, U.K.: Wiley, 2009), an edited book, Signal Processing Techniques for Knowledge Extraction and Information Fusion (New York, NY, USA: Springer, 2008), and more than 200 publications on signal and image processing.

Dr. Mandic has been a member of the IEEE Technical Committee on Signal Processing Theory and Methods. He has produced award-winning papers and products resulting from his collaboration with industry. He has been an Associate Editor of the IEEE Signal Processing Magazine, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II, the IEEE TRANSACTIONS ON SIGNAL PROCESSING, and the IEEE TRANSACTIONS ON NEURAL NETWORKS.