A novel compensation-based recurrent fuzzy neural network and its learning algorithm


WU Bo†, WU Ke & LÜ JianHong

School of Energy and Environment, Southeast University, Nanjing 210096, China

Received April 18, 2007; accepted April 18, 2008
doi: 10.1007/s11432-009-0002-3
†Corresponding author (email: [email protected])
Supported by the National High-Tech Research and Development Program of China (Grant No. 2006AA05A107) and Special Fund of Jiangsu Province for Technology Transfer (Grant No. BA2007008)
Sci China Ser F-Inf Sci, Jan. 2009, vol. 52, no. 1, 41-51

Based on a detailed study of several kinds of fuzzy neural networks, we propose a novel compensation-based recurrent fuzzy neural network (CRFNN) by adding a recurrent element and a compensatory element to the conventional fuzzy neural network. Then, we propose a sequential learning method for the structure identification of the CRFNN in order to determine the fuzzy rules and their correlative parameters effectively. Furthermore, we improve the BP algorithm based on the characteristics of the proposed CRFNN to train the network. By modeling typical nonlinear systems, we show that the proposed CRFNN has excellent dynamic response and strong learning ability.

compensation-based recurrent, fuzzy neural network, sequential learning method, improved BP algorithm, nonlinear system

1 Introduction

The fuzzy neural network (FNN) has become a new approach for the modeling and control of industrial plants because of its recent developments. The FNN carries out fuzzy reasoning through the structure of a neural network; therefore, it not only has fuzzy logic, which can express fuzzy knowledge and carry out fuzzy reasoning, but also has the good learning ability, nonlinear mapping, and data processing ability of a neural network. Research on the FNN has become an active topic, and some progress has been made. Lin and Hsu [1] proposed the recurrent fuzzy neural network (RFNN) by adding a recurrent element to the conventional FNN. The membership layer of the RFNN has self-adaptive ability, so the RFNN can learn the dynamic behavior of plants more effectively. However, they use the conventional input dimension partitioning (IDP) method, which makes the number of fuzzy rules increase exponentially with the input dimension. Thus, the structure of the network is complicated, and the network-training task becomes heavy at the same time. Furthermore, the fuzzy rules may be redundant. On the other hand, in ref. [1], the network was trained by the conventional BP algorithm, which converges slowly and may get trapped in local minima. Lee and Teng [2] used the direct input space partitioning (DISP) method instead of IDP in the RFNN. The DISP method does not have the disadvantage of the IDP method: it can make a more suitable partitioning with an effective structure identification method so as to avoid redundant fuzzy rules. However, in ref. [2], Lee and Teng paid more attention to improving the BP algorithm and neglected the importance of structure identification. Therefore, the RFNN training algorithm improved little: the convergence speed is not fast enough, and there are still redundant fuzzy rules. In order to improve the training algorithm of the FNN, Zhang and Kandel [3] introduced the compensatory element into the conventional FNN. The compensatory element implements the compensatory operation, changing conventional fuzzy reasoning into compensatory fuzzy reasoning. The new fuzzy reasoning method not only reduces the possibility of redundant fuzzy rules, but also optimizes the fuzzy rules dynamically during training and consequently improves their adaptive ability. However, without an effective structure identification method, the fuzzy rules were initialized by human experience. This could not avoid redundant fuzzy rules and is unfavorable for applying the network to industrial processes.

Based on the above, several types of novel FNNs and corresponding structure identification methods were proposed [4,5]. However, these structure identification methods are too simple to obtain good results, and they therefore increase the number of training epochs needed by the BP algorithm. Based on the fuzzy partitioning of DISP, Pang and Zhou [6] proposed a structure identification method built on a clustering algorithm, and it does have some effect on the structure construction of the network. However, since it considers neither the recurrent element nor the compensatory element of the proposed network, the method is not well suited here. Besides, Juang [7] proposed a structure identification method based on the genetic algorithm for his TSK-type recurrent fuzzy neural network. The number of fuzzy rules must be assigned first when using the genetic algorithm, and almost all clustering algorithms (including the clustering algorithm in ref. [6]) share this disadvantage, which hinders the dynamic construction of networks.

This paper proposes a novel compensation-based recurrent fuzzy neural network (CRFNN) whose structure is relatively simple. Then, an effective supervisory sequential learning method is proposed for the structure identification of the proposed CRFNN. After that, an improved BP algorithm is used to train the network further. The modeling results on some typical nonlinear systems show that the proposed CRFNN and its learning algorithm have better dynamic response ability and nonlinear mapping ability than those proposed in other related papers.

2 CRFNN

2.1 The structure and layered operation of CRFNN

The structure of the proposed CRFNN is shown in Figure 1. It has six layers, and each layer's function is introduced as follows:

Layer 1. Input layer. The nodes in this layer only transmit input values to the next layer.

Layer 2. Membership layer with recurrent element. In this layer, each node performs a membership function and acts as a unit of memory through the recurrent element. First, the input of this layer's membership function at discrete time k is denoted by the following:

$U_{fij}(k) = x_i(k) + \vartheta_{ij}\, O_{fij}(k-1)$,  (1)

where $O_{fij}(k-1)$ is the layer's output at discrete time $k-1$ and $\vartheta_{ij}$ denotes the link weight of the recurrent element. The subscript $ij$ indicates the $j$th fuzzy rule of the $i$th input $x_i$. The Gaussian function is adopted here as the membership function. Then, the layer's output is

$O_{fij}(k) = \exp\!\left(-\dfrac{(x_i(k) + \vartheta_{ij}\, O_{fij}(k-1) - c_{ij})^2}{2\sigma_{ij}^2}\right)$,  (2)

where $c_{ij}$ and $\sigma_{ij}$ denote the center and width of the Gaussian membership function.

Layer 3. Rules layer. This layer performs fuzzy reasoning. Each node matches the antecedent of a fuzzy rule and performs the fuzzy operation. The general fuzzy rule is as follows:

Rule-$j$: If $x_1$ is $F_{1j}$, $x_2$ is $F_{2j}$, $\ldots$, $x_n$ is $F_{nj}$, then $y_j$ is $w_j$, $j = 1, 2, \ldots, L$.

Figure 1 The structure of the proposed CRFNN.

The layer’s output is

$O_{pj}(k) = \prod_{i=1}^{n} O_{fij}(k), \quad j = 1, 2, \ldots, L$,  (3)

where $O_{pj}(k)$ is the reasoning result of the $j$th fuzzy rule and $n$ denotes the dimension of the input vector.

Layer 4. Compensatory layer. This layer performs the compensatory operation on the reasoning results of the fuzzy rules. The pessimistic output $U_p$ and the optimistic output $U_o$ are given, respectively, by

$U_p = O_{pj}(k), \qquad U_o = (O_{pj}(k))^{1/n}$.  (4)

Then, we obtain the layer’s output:

$O_{cj}(k) = (U_p)^{1-\gamma_j}(U_o)^{\gamma_j} = (O_{pj}(k))^{1-\gamma_j+\gamma_j/n}, \quad j = 1, 2, \ldots, L$,  (5)

where $O_{cj}(k)$ is the output of the compensatory layer and $\gamma_j$ is the compensatory degree. The compensatory operation changes the conventional fuzzy reasoning into compensatory fuzzy reasoning, and the compensatory fuzzy rules are as follows:

Rule-$j$: $[\text{If } x_1 \text{ is } F_{1j},\, x_2 \text{ is } F_{2j},\, \ldots,\, x_n \text{ is } F_{nj}]^{1-\gamma_j+\gamma_j/n}$, then $y_j$ is $w_j$, $j = 1, 2, \ldots, L$.

Layer 5. Normalized layer. Each node in this layer performs the normalization operation on the output of the compensatory layer; that is, it calculates the ratio of the compensatory reasoning result of each rule to the sum of those of all rules. The output is denoted by the following:

$O_{nj}(k) = \dfrac{O_{cj}(k)}{\sum_{j=1}^{L} O_{cj}(k)}, \quad j = 1, 2, \ldots, L$,  (6)

where $O_{nj}(k)$ is the output of the normalized layer.

Layer 6. Output layer. This layer computes $y(k)$ as the summation of all incoming signals from the normalized layer, and the output is as follows:

$y(k) = \sum_{j=1}^{L} w_j\, O_{nj}(k)$,  (7)

where $w_j$, $j = 1, 2, \ldots, L$, denotes the link weights of this layer.

Besides, the last two layers can be seen as one layer, named the defuzzification layer, which performs the defuzzification operation with some method such as the centroid method or the weighted mean method; here the latter is used. Thus, the output can be denoted by

$y(k) = \left.\sum_{j=1}^{L} w_j\, O_{cj}(k) \right/ \sum_{j=1}^{L} O_{cj}(k)$.  (8)
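To make the layered operations of eqs. (1)-(8) concrete, the following is a minimal NumPy sketch of one forward pass through the six layers. It is our illustration, not code from the paper; the function and array names (crfnn_forward, Of_prev, etc.) are ours, and the recurrent state Of must be carried between successive calls.

```python
import numpy as np

def crfnn_forward(x, Of_prev, c, sigma, theta, gamma, w):
    """One forward pass of the CRFNN, eqs. (1)-(8).

    x       : (n,)   input vector at time k
    Of_prev : (n, L) membership outputs at time k-1 (recurrent state)
    c, sigma: (n, L) centers and widths of the Gaussian memberships
    theta   : (n, L) link weights of the recurrent element
    gamma   : (L,)   compensatory degrees
    w       : (L,)   link weights of the output layer
    """
    n, L = c.shape
    # Layer 2: recurrent input and Gaussian membership, eqs. (1)-(2)
    U = x[:, None] + theta * Of_prev
    Of = np.exp(-(U - c) ** 2 / (2.0 * sigma ** 2))
    # Layer 3: product fuzzy reasoning over all inputs, eq. (3)
    Op = np.prod(Of, axis=0)
    # Layer 4: compensatory operation, eqs. (4)-(5)
    Oc = Op ** (1.0 - gamma + gamma / n)
    # Layer 5: normalization, eq. (6)
    On = Oc / np.sum(Oc)
    # Layer 6: weighted-mean defuzzification, eqs. (7)-(8)
    y = float(np.dot(w, On))
    return y, Of  # Of becomes Of_prev at time k+1
```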

2.2 The structure characteristics of CRFNN and comparison with related networks

Compared with the conventional FNN, the proposed CRFNN has three main characteristics:

1) Direct input space partitioning. There are two fuzzy partition methods, IDP (input dimension partitioning) and DISP (direct input space partitioning) [8]. In IDP, as shown in Figure 2(a), each dimension of the input space is partitioned first, and the resulting input space partition is the product of all input dimension partitions. The IDP method therefore makes the number of regions dividing the input space grow exponentially with the input dimension. DISP is shown in Figure 2(b). This method divides the input space into regions directly by some algorithm. Therefore, compared with the IDP method, DISP has more freedom to determine the input space partition and the number of fuzzy rules, so the constructed network can be comparatively simple if a reasonable algorithm is used to partition the input space directly.

2) Recurrent element. By introducing the recurrent element into the membership layer, the constructed recurrent fuzzy neural network gains the ability to deal with dynamic behaviors. The recurrent element can also make the constructed network simpler and hence improve the convergence speed and training precision.

3) Compensatory element. Zhang and Kandel [3] proposed a compensatory operation based on both a pessimistic operation and an optimistic operation. The pessimistic operation maps the inputs to the reasoning result by making a conservative decision for the pessimistic situation or even the worst case, for example, $p(x_1, x_2, \ldots, x_n) = \prod x_i$ or $p(x_1, x_2, \ldots, x_n) = \min(x_1, x_2, \ldots, x_n)$. The optimistic operation maps the inputs to the reasoning result by making an optimistic decision for the optimistic situation or even the best case, for example, $o(x_1, x_2, \ldots, x_n) = \max(x_1, x_2, \ldots, x_n)$. The compensatory operation maps the pessimistic output $x_1$ and the optimistic output $x_2$ to a relatively compromised decision for the situation between the worst case and the best case, that is,

$c(x_1, x_2) = x_1^{1-\gamma} x_2^{\gamma}, \qquad \gamma \in [0, 1]$.  (9)
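As a quick numerical check of eq. (9): with $\gamma = 0.5$ the compensatory operation reduces to the geometric mean, $c(x_1, x_2) = (x_1 x_2)^{1/2}$, so a pessimistic output of 0.4 and an optimistic output of 0.9 give $c(0.4, 0.9) = \sqrt{0.36} = 0.6$, a compromise between the two decisions; $\gamma = 0$ returns the pessimistic output and $\gamma = 1$ the optimistic one.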

Based on the above, we compare the proposed CRFNN with the related networks in the reference papers from several structural aspects; the results are shown in Table 1. Some explanations about the content of Table 1 are given as follows:

1) The recurrent element in the networks proposed in refs. [1, 2, 6, 10] is the same as that of the proposed CRFNN, as shown in Figure 1. However, the recurrent element in the network proposed in ref. [4] is more complicated (for a detailed explanation, refer to ref. [4]). As the other aspects are the same as in the proposed CRFNN, the network in ref. [4] is more complicated.

Figure 2 Two fuzzy partitioning methods. (a) Input dimension partitioning; (b) direct input space partitioning.


Table 1 Comparison of the proposed CRFNN with related networks a)

Ref.         Recurrent element   Compensatory element   Fuzzy partitioning   Defuzzification method   Network complexity
[1]          √                   ×                      IDP                  weighted mean            ★★
[2]          √                   ×                      DISP                 weighted mean            ★
[3]          ×                   √                      IDP                  centroid                 ★★
[4]          √                   √                      DISP                 weighted mean            ★★
[6]          √                   √                      DISP                 centroid                 ★★
This paper   √                   √                      DISP                 weighted mean            ★

a) The more stars present, the more complicated the network.

2) The difference between the two defuzzification methods: the weighted mean method has only one structure parameter, while the centroid method has two parameters.

3 Learning algorithm of the CRFNN

The hybrid learning algorithm consists of structure learning and parameter learning. The structure learning algorithm is used to determine the initial structure and correlative parameters of the CRFNN according to the training patterns, and the parameter learning algorithm is used to adjust the parameters further.

3.1 Structure identification

The main target of structure learning is to determine the number of fuzzy rules and the initial values of the correlative parameters, that is, to determine the number of nodes and the parameters in the second layer. As the membership function is a Gaussian function, which is a bridge between the CRFNN and the RBFNN (radial basis function neural network), some algorithms for the RBFNN can be used here, such as the c-means or k-means clustering algorithms. However, they are unsupervised learning algorithms, and their convergence is slow. Here, we propose a novel supervisory sequential learning method for structure identification, inspired by a sequential learning algorithm for the RBFNN [9].

First, we make some simplifications of the structure identification problem as follows:

1) Suppose that all the widths of the Gaussian functions belonging to the same fuzzy rule are the same during the structure identification of the network.

2) Simplify the normalized layer and output layer into one layer, that is,

$w_j' = w_j \Big/ \sum_{j=1}^{L} O_{cj}(k)$.  (10)

Here, we do not consider whether the left and right sides of eq. (10) are equivalent, because $w_j'$ is only a bridge in the supervisory sequential learning of the structure identification, used to obtain the centers and widths of the Gaussian functions. It is not the initial value of $w_j$ in the parameter identification.

3.1.1 Supervisory sequential learning method (SSLM). In the supervisory sequential learning method for structure identification, the training is based on a sequence of input-output pairs $\{(x_i, y_i)\,|\, i = 1, 2, \ldots, k\}$, where $x_i \in R^N$, $N$ is the dimension of the input vector, and $y_i$ is the output.

According to eqs. (8) and (10), the output of the network is given as follows:

$f(x) = \sum_{j=1}^{L} w_j'\, O_{cj}$.  (11)

Then, the constructing algorithm of the network is given as follows:

1) Network initialization: choose the initial values of the parameters for a new node, which include the minimal width $\sigma_{min}$, the minimum of the contact ratio $\tau_{min}$, the minimal allowable error $e_{min}$, the output precision of a single pattern $\eta$, the initial value of the compensatory degree $\gamma_{in}$, and the initial link weight of the recurrent element $\vartheta_{in}$.

2) When a new training datum is given sequentially, compute $e_i = y_i - f(x_i)$, and then refresh the network according to the following rules with $e_i$:

a) If $|e_i| > e_{min}$ and there are no nodes, then $c_1 = x_1$, $\sigma_1 = \sigma_{min}$, $w_1' = y_1$, $\gamma_1 = \gamma_{in}$, $\vartheta_1 = \vartheta_{in}$.

b) If $|e_i| > e_{min}$ and the number of nodes is $L$, let $U = \{c_1, c_2, \ldots, c_L, x_i\}$ and compute

$\beta_j = \left(\exp\{-\|U + \vartheta_j\, O_{fj}(k-1) - c_j\|^2 / 2\sigma_j^2\}\right)^{1-\gamma_j+\gamma_j/n}$,
$p = \left(\exp\{-\|U - x_i\|^2 / 2\sigma_p^2\}\right)^{1-\gamma_{in}+\gamma_{in}/n}$,
$\sigma_p = \max\{\min\{\|x_i - c_j\|\},\, \sigma_{min}\}, \quad j = 1, 2, \ldots, L$.  (12)

Then, compute the contact ratio $\tau$ between the spaces $\{\beta_1, \beta_2, \ldots, \beta_L, p\}$ and $\{\beta_1, \beta_2, \ldots, \beta_L\}$; the threshold value of the contact ratio $\tau_{min}$ at this moment is given as follows:

$\tau = \sqrt{\dfrac{\|\beta_1\|^2 + \|\beta_2\|^2 + \cdots + \|\beta_L\|^2}{\|\beta_1\|^2 + \|\beta_2\|^2 + \cdots + \|\beta_L\|^2 + \|p\|^2}}$,  (13)

$\tau_{min} = \max(0,\, 1 - (1 - \eta)\,|y_i/e_i|)$.  (14)

If $\tau < \tau_{min}$, then add the $(L+1)$th node; otherwise, do not add a node. The new node is initialized by the following formulas:

$c_{L+1} = x_i, \quad \sigma_{L+1} = \sigma_p, \quad w_{L+1}' = e_i, \quad \gamma_{L+1} = \gamma_{in}, \quad \vartheta_{L+1} = \vartheta_{in}$.  (15)

At the same time, adjust the dimension of the error covariance matrix $P_{L+1}$ in the EKF algorithm so as to fit the adjusted network.

c) If $|e_i| < e_{min}$, do not add a new node.

d) If no new node is added, adjust the parameters of the constructed network by the EKF algorithm.

3.1.2 EKF algorithm. If there are $L$ nodes in the constructed network, the network parameters that need adjusting are as follows:
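The growth criterion of rules a)-c) can be sketched as follows. This is our simplified illustration, not the paper's code: the rule responses β_j and the candidate response p are evaluated at the new pattern only, which drops the recurrent term of eq. (12), and all names are ours. If the function returns True, the new node is created with center x_i, width sigma_p, and weight e_i as in eq. (15).

```python
import numpy as np

def sslm_should_add(x_i, y_i, e_i, centers, sigmas, gammas,
                    e_min, sigma_min, eta, n):
    """Node-adding decision of the SSLM (rules a)-c)); returns (add, sigma_p)."""
    if abs(e_i) <= e_min:                 # rule c): error already acceptable
        return False, None
    L = len(centers)
    if L == 0:                            # rule a): network still empty
        return True, sigma_min
    # distances from the new pattern to all existing rule centers
    d = np.linalg.norm(np.asarray(centers) - x_i, axis=1)
    sigma_p = max(d.min(), sigma_min)     # candidate width, eq. (12)
    # compensated responses of the existing rules at the new pattern
    g = np.asarray(gammas)
    beta = np.exp(-d ** 2 / (2.0 * np.asarray(sigmas) ** 2)) ** (1.0 - g + g / n)
    p = 1.0                               # candidate response at its own center
    s = float(beta @ beta)
    tau = np.sqrt(s / (s + p ** 2))                            # eq. (13)
    tau_min = max(0.0, 1.0 - (1.0 - eta) * abs(y_i / e_i))     # eq. (14)
    return tau < tau_min, sigma_p         # rule b): add node if tau < tau_min
```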

$\theta = [w_1', w_2', \ldots, w_L',\; c_1^T, c_2^T, \ldots, c_L^T,\; \sigma_1, \sigma_2, \ldots, \sigma_L,\; \gamma_1, \gamma_2, \ldots, \gamma_L,\; \vartheta_1, \vartheta_2, \ldots, \vartheta_L]$.  (16)

The adjustment with the EKF algorithm is

$\theta(i) = \theta(i-1) + k_L e_i$,
$k_L = P_{L-1} A_L [R_L + A_L^T P_{L-1} A_L]^{-1}$,  (17)
$P_L = [I - k_L A_L^T] P_{L-1} + Q_0 I$,

where $k_L$ is the Kalman gain, $i$ indexes the $i$th training datum, and $A_L$ is the gradient vector of the network output $f(x)$ with respect to $\theta$ at $\theta(i-1)$, which is given as follows:

$A_L = \left[O_{c1}, \ldots, O_{cL},\; w_1' \dfrac{\partial O_{c1}}{\partial c_1}, \ldots, w_L' \dfrac{\partial O_{cL}}{\partial c_L},\; w_1' \dfrac{\partial O_{c1}}{\partial \sigma_1}, \ldots, w_L' \dfrac{\partial O_{cL}}{\partial \sigma_L},\; w_1' \dfrac{\partial O_{c1}}{\partial \gamma_1}, \ldots, w_L' \dfrac{\partial O_{cL}}{\partial \gamma_L},\; w_1' \dfrac{\partial O_{c1}}{\partial \vartheta_1}, \ldots, w_L' \dfrac{\partial O_{cL}}{\partial \vartheta_L}\right]$,  (18)

where the partial derivatives in eq. (18) are given by eqs. (19)-(22), with $j = 1, 2, \ldots, L$:

$\dfrac{\partial O_{cj}}{\partial c_j} = \dfrac{1}{\sigma_j^2}\, O_{cj}\, (x_i - c_j + \vartheta_j\, O_{fj}(k-1))^T\, (1 - \gamma_j + \gamma_j/n)$,  (19)

$\dfrac{\partial O_{cj}}{\partial \sigma_j} = \dfrac{1}{\sigma_j^3}\, O_{cj}\, (x_i - c_j + \vartheta_j\, O_{fj}(k-1))^2\, (1 - \gamma_j + \gamma_j/n)$,  (20)

$\dfrac{\partial O_{cj}}{\partial \gamma_j} = O_{cj}\, \log(O_{pj}) \left(\dfrac{1}{n} - 1\right)$,  (21)

$\dfrac{\partial O_{cj}}{\partial \vartheta_j} = -\dfrac{1}{\sigma_j^2}\, O_{cj}\, (x_i - c_j + \vartheta_j\, O_{fj}(k-1))^T\, (1 - \gamma_j + \gamma_j/n)\, O_{fj}(k-1)$.  (22)

In eq. (17), $Q_0$ is a scalar that determines the allowed random step along the direction of the gradient vector; the error covariance matrix $P_L$ is a positive definite symmetric matrix of size $N \times N$, where $N$ is the number of self-adaptive parameters of the constructed network. When a new node is added, the dimension of $P_L$ increases correspondingly, and the newly added rows and columns must be initialized as in eq. (23), because $P_L$ is the estimate of the error covariance:

$P_L = \begin{bmatrix} P_{L-1} & 0 \\ 0 & P_0 I \end{bmatrix}$,  (23)
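A minimal sketch of the EKF step of eq. (17) together with the covariance expansion of eq. (23) is given below. It is our illustration for the single-output case (scalar innovation), and the names ekf_update and grow_covariance are ours, not the paper's.

```python
import numpy as np

def ekf_update(theta, P, A, e, R, Q0):
    """One EKF parameter update, eq. (17).

    theta: (N,) parameter vector of eq. (16); P: (N, N) error covariance;
    A: (N,) gradient of the network output w.r.t. theta, eq. (18);
    e: scalar output error of the current pattern.
    """
    A = A.reshape(-1, 1)
    k = P @ A / (R + float(A.T @ P @ A))   # Kalman gain (scalar innovation)
    theta = theta + k.ravel() * e          # parameter step
    N = theta.size
    P = (np.eye(N) - k @ A.T) @ P + Q0 * np.eye(N)
    return theta, P

def grow_covariance(P, n_new, P0):
    """Enlarge P when a node is added, eq. (23)."""
    N = P.shape[0]
    P_big = np.zeros((N + n_new, N + n_new))
    P_big[:N, :N] = P
    P_big[N:, N:] = P0 * np.eye(n_new)
    return P_big
```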


where $P_0$ is an estimate of the uncertainty introduced by the initialization of the parameters, and the dimension of the identity matrix $I$ equals the number of parameters of the new node.

3.1.3 Some additional explanation of the structure identification algorithm. As mentioned above, the main target of structure learning is to determine the number of fuzzy rules and the initial values of the correlative parameters. Furthermore, the complexity of the EKF algorithm is related to the number of parameters in eq. (18). Therefore, the link weight of the recurrent element $\vartheta$ and the compensatory degree $\gamma$ are both treated as constants that are not adjusted during structure identification, so as to reduce the complexity of the EKF algorithm and improve the efficiency of structure identification. Generally, $\vartheta$ is set to a small value on the order of $10^{-1}$. Because the definition domain of the compensatory degree is $[0, 1]$, it is set to 0.5, which puts the pessimistic operation and the optimistic operation in an initial balance [3].

3.2 Parameter identification

After the initial values of the network parameters are obtained with the structure identification algorithm, the next step is to train the network by the improved BP algorithm, which is a combination of the LMS and BP algorithms.

3.2.1 Obtain the link weight of the output layer w by LMS. Since the output layer of the CRFNN has only one structure parameter, the link weight $w$, we can obtain the value of $w$ by the LMS algorithm when all the other parameters are known. That is,

$y = \varsigma w$,  (24)

where $\varsigma = [O_{n1}, O_{n2}, \ldots, O_{nk}]$ and $w = [w_1, w_2, \ldots, w_k]^T$.
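Concretely, over a batch of training patterns eq. (24) is linear in w, so the output weights can be obtained in one shot by least squares. The sketch below (our names, not the paper's code) uses NumPy's solver in place of an explicit LMS recursion; S stacks the normalized-layer outputs of every training pattern.

```python
import numpy as np

def solve_output_weights(S, y):
    """Least-squares solution of y = S w, eq. (24).

    S: (patterns, rules) matrix of normalized-layer outputs On;
    y: (patterns,) vector of target outputs.
    """
    w, *_ = np.linalg.lstsq(S, y, rcond=None)
    return w
```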

3.2.2 Train the other parameters by the BP algorithm. Given the training data $(x, y)$, where $x$ is the $n$-dimensional input vector $x = [x_1, x_2, \ldots, x_n]$, define the network output $y_d = y - e$, where $e$ is the output error from the LMS algorithm mentioned above. The objective function is then defined as $E = e^2/2 = (y - y_d)^2/2$, so according to the gradient descent algorithm we obtain the following:

1) Training the compensatory degree. In order to ensure that the compensatory degree $\gamma \in [0, 1]$, define $\gamma = f^2/(f^2 + d^2)$; then

$f_j(t+1) = f_j(t) - \eta \left\{\dfrac{2 f_j(t)\, d_j^2(t)}{[f_j^2(t) + d_j^2(t)]^2}\right\} \left.\dfrac{\partial E}{\partial \gamma_j}\right|_t$,

$d_j(t+1) = d_j(t) + \eta \left\{\dfrac{2 f_j^2(t)\, d_j(t)}{[f_j^2(t) + d_j^2(t)]^2}\right\} \left.\dfrac{\partial E}{\partial \gamma_j}\right|_t$,

$\gamma_j(t+1) = \dfrac{f_j^2(t+1)}{f_j^2(t+1) + d_j^2(t+1)}$,  (25)

where

$\left.\dfrac{\partial E}{\partial \gamma_j}\right|_t = -e(t)\,(w_j(t) - y_d(t))\, O_{nj}(t) \left(\dfrac{1}{n} - 1\right) \ln(O_{pj}(t))$.  (26)

2) Training the centers of the membership functions:

$c_{ij}(t+1) = c_{ij}(t) - \eta \left.\dfrac{\partial E}{\partial c_{ij}}\right|_t$,  (27)

where

$\left.\dfrac{\partial E}{\partial c_{ij}}\right|_t = -\dfrac{e(t)}{\sigma_{ij}^2(t)}\,(w_j(t) - y_d(t))\,(1 - \gamma_j + \gamma_j/n)\, O_{nj}(t)\,(x_i + \vartheta_{ij}(t)\, O_{fij}(t-1) - c_{ij}(t))$.  (28)

3) Training the widths of the membership functions:

$\sigma_{ij}(t+1) = \sigma_{ij}(t) - \eta \left.\dfrac{\partial E}{\partial \sigma_{ij}}\right|_t$,  (29)

where

$\left.\dfrac{\partial E}{\partial \sigma_{ij}}\right|_t = -\dfrac{e(t)}{\sigma_{ij}^3(t)}\,(w_j(t) - y_d(t))\,(1 - \gamma_j + \gamma_j/n)\, O_{nj}(t)\,(x_i + \vartheta_{ij}(t)\, O_{fij}(t-1) - c_{ij}(t))^2$.  (30)

4) Training the link weight of the recurrent element:

$\vartheta_{ij}(t+1) = \vartheta_{ij}(t) - \eta \left.\dfrac{\partial E}{\partial \vartheta_{ij}}\right|_t$,  (31)

where

$\left.\dfrac{\partial E}{\partial \vartheta_{ij}}\right|_t = \dfrac{e(t)}{\sigma_{ij}^2(t)}\,(w_j(t) - y_d(t))\,(1 - \gamma_j + \gamma_j/n)\, O_{nj}(t)\,(x_i + \vartheta_{ij}(t)\, O_{fij}(t-1) - c_{ij}(t))\, O_{fij}(t-1)$.  (32)
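The four update rules of eqs. (25)-(32) share the factor e(t)(w_j - y_d)O_nj, which the sketch below exploits. This is our vectorized illustration of one gradient step under the paper's notation; the array names and shapes are our assumptions, not code from the paper.

```python
import numpy as np

def bp_step(x, y_d, e, c, sigma, theta, w, f, d, Of_prev, Op, On, lr):
    """One step of the improved BP algorithm, eqs. (25)-(32).

    x: (n,) input; y_d: network output; e: output error after LMS;
    c, sigma, theta: (n, L); w, f, d, Op, On: (L,); Of_prev: (n, L)
    memberships at t-1. gamma = f**2/(f**2 + d**2) keeps gamma in [0, 1].
    """
    n, L = c.shape
    gamma = f ** 2 / (f ** 2 + d ** 2)
    power = 1.0 - gamma + gamma / n
    common = e * (w - y_d) * On            # shared by eqs. (26), (28), (30), (32)

    # eqs. (25)-(26): update f and d, which parameterize gamma
    dE_dgamma = -common * (1.0 / n - 1.0) * np.log(Op)
    denom = (f ** 2 + d ** 2) ** 2
    f_new = f - lr * (2.0 * f * d ** 2 / denom) * dE_dgamma
    d_new = d + lr * (2.0 * f ** 2 * d / denom) * dE_dgamma

    r = x[:, None] + theta * Of_prev - c   # residue term in eqs. (28), (30), (32)
    dE_dc = -(common * power) / sigma ** 2 * r                 # eq. (28)
    dE_dsigma = -(common * power) / sigma ** 3 * r ** 2        # eq. (30)
    dE_dtheta = (common * power) / sigma ** 2 * r * Of_prev    # eq. (32)

    return (c - lr * dE_dc, sigma - lr * dE_dsigma,
            theta - lr * dE_dtheta, f_new, d_new)
```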


3.3 Analysis and comparison of several related learning algorithms

Among the five other related networks mentioned in Table 1, only HARCNFN [6] has a relatively effective structure identification method, named modified RGC (relational grade clustering). The modified RGC is a clustering algorithm that takes the input and output patterns as vectors and then clusters them. The number of fuzzy rules is the clustering number, and the initial values of the correlative parameters are determined by the clustering result. It takes account of neither the compensatory element nor the recurrent element, whereas the structure identification method based on the SSLM proposed above takes account of all the elements, determines the fuzzy rules and correlative parameters dynamically according to the training patterns, and then adjusts these parameters. Therefore, the constructed network has already reached considerable precision at the start of parameter identification.

4 Illustrative examples

4.1 Example 1: XOR problem

First, we take a simple XOR problem as an example to show that the proposed CRFNN can generate fuzzy rules automatically from the training data and converge quickly. In order to analyze the performance of the proposed CRFNN conveniently, we use 0.2 and 0.8 to represent false and true, respectively, as ref. [3] did, for the XOR problem in Table 2. Besides, we define the global square error (GSE) as follows:

$E_g = \sum_{i=1}^{n} e_i^2/2$.  (33)

Table 2 XOR problem

x1   0.2   0.2   0.8   0.8
x2   0.2   0.8   0.2   0.8
y    0.2   0.8   0.8   0.2
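In array form, the training set of Table 2 is simply (a snippet of ours, not from the paper):

```python
import numpy as np

# XOR training patterns of Table 2 (false = 0.2, true = 0.8)
X = np.array([[0.2, 0.2], [0.2, 0.8], [0.8, 0.2], [0.8, 0.8]])
y = np.array([0.2, 0.8, 0.8, 0.2])
# global square error of eq. (33): Eg = np.sum(e**2) / 2 for errors e
```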

4.1.1 Structure identification. First, take {x1, x2} as the two inputs of the network and {y} as the output. Thus, the CRFNN to be constructed has two inputs and a single output. Then, set the parameter values of the SSLM: σmin = 0.4, τmin = 0.1, emin = 0.1, η = 0.9; and of the EKF algorithm: P0 = 0.5, Rn = 2, Q0 = 0, with the initial link weight of the recurrent element ϑ = 0.1 and the initial compensatory degree γ = 0.5. Finally, we obtain four fuzzy rules by structure identification, and the parameters are shown in Table 3.

Table 3 The result of structure identification

k    c1       c2       σ
1    0.2046   0.2008   0.4082
2    0.2165   0.9000   0.4969
3    0.8730   0.2002   0.5164
4    0.8018   0.8139   0.5916

4.1.2 Parameter identification. In this step, we use the improved BP algorithm to further train the CRFNN constructed by structure identification. A learning rate of 0.05 was used for all the parameters and was adjusted dynamically according to the GSE. Then, after only 1 training epoch, the GSE Eg reached 1.3867×10^-32, which is essentially zero, so the initialized parameters needed no further adjustment. By contrast, the CFNN in ref. [3] needed 11 training epochs to reach Eg = 10^-6 in the best case.

From this result, we conclude that the structure identification method of the proposed CRFNN can generate fuzzy rules effectively from the training data, which makes the parameter identification faster.

4.2 Example 2: Identification of a dynamic nonlinear system

The dynamic nonlinear system given below is used to illustrate the performance of the proposed CRFNN. This system has strongly nonlinear and dynamic behaviors and has been used in several other papers to test the performance of networks. It is described as follows:

$y(k+1) = f(y(k),\, y(k-1),\, u(k),\, u(k-1),\, u(k-2))$,  (34)

where

$f(x_1, x_2, x_3, x_4, x_5) = \dfrac{x_1 x_2 x_3 x_5 (x_3 - 1) + x_4}{1 + x_2^2 + x_3^2}$.  (35)

Here, the current output of the system depends on two previous outputs and three previous inputs. The testing input signal u(k) of eq. (36) is used to determine the identification results.

$u(k) = \begin{cases} \sin(k\pi/25), & 0 < k \leqslant 125, \\ 1.0, & 125 < k \leqslant 250, \\ -1.0, & 250 < k \leqslant 375, \\ 0.3\sin(k\pi/25) + 0.1\sin(k\pi/32) + 0.6\sin(k\pi/10), & 375 < k \leqslant 500. \end{cases}$  (36)
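For reproduction, the test signal of eq. (36) can be generated directly; the helper below is ours, not from the paper.

```python
import numpy as np

def test_input(k):
    """Test input u(k) of eq. (36), defined for 0 < k <= 500."""
    if k <= 125:
        return np.sin(k * np.pi / 25)
    if k <= 250:
        return 1.0
    if k <= 375:
        return -1.0
    return (0.3 * np.sin(k * np.pi / 25)
            + 0.1 * np.sin(k * np.pi / 32)
            + 0.6 * np.sin(k * np.pi / 10))

u = np.array([test_input(k) for k in range(1, 501)])
```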

4.2.1 Structure identification. Similarly, the first step is structure identification. According to the description of the system, we choose {u(k), y(k)} as the input of the constructed network and {y(k+1)} as its output. Set the parameter values of the SSLM: σmin = 0.4, τmin = 0.1, emin = 0.1, η = 0.968; and of the EKF algorithm: P0 = 1, Rn = 2, Q0 = 0, with the initial link weight of the recurrent element ϑ = 0.1 and the initial compensatory degree γ = 0.5. Finally, ten fuzzy rules were generated, and the parameters are shown in Table 4.

Table 4 The result of structure identification

k     c1        c2        σ
1     1.0072    −0.5231   0.8469
2     −0.5231   −0.0660   1.0090
3     0.4824    0.1757    0.9356
4     0.2601    0.5609    1.0301
5     1.0197    0.9124    0.1875
6     0.9968    −0.0066   0.1500
7     −0.7879   0.4582    0.5633
8     0.5246    0.0963    0.2010
9     0.2376    0.5156    0.1500
10    −1.0262   −1.2083   0.3845

4.2.2 Parameter identification. Use the result of structure identification as the initial values of the centers and widths of the membership functions in the constructed CRFNN. Then, with an initial learning rate of 0.03, the GSE Eg reached 0.0929 (an MSE of 3.72×10^-4) after 200 training epochs. The final parameters of the membership functions are shown in Table 5. Figure 3 shows the desired output of the nonlinear system and the output of the CRFNN, and Figure 4 shows the error between them.

Table 5 The result of parameter identification

k     c1        σ1       c2        σ2
1     1.0493    0.8510   −0.3990   0.8908
2     −0.5221   1.0314   −0.0556   1.0300
3     0.4696    0.8669   0.1871    0.9801
4     0.2454    1.0743   0.5807    0.9609
5     0.8928    0.2330   0.7703    0.1000
6     0.9931    0.1557   −0.0741   0.1000
7     −0.8103   0.4341   0.2941    0.5835
8     0.5846    0.1382   0.1967    0.2967
9     0.1678    0.2125   0.5397    0.1036
10    −0.9091   0.4192   −1.1704   0.3059

Figure 3 The desired output (y, solid line) and network output (yd, dashed).

Figure 4 The identification error.

4.2.3 Performance comparison with other related networks. On the basis of the analysis mentioned above, ref. [3] proposed the CFNN first. The convergence speed of the BP algorithm was improved by the introduction of the compensatory element. However, the CFNN has poor dynamic response ability since it has no recurrent element, and its convergence speed is not satisfying since it has no structure identification method. Therefore, ref. [6] proposed HARCNFN, introducing the recurrent element to strengthen the dynamic response ability of the network. The only difference between HARCNFN and the CRFNN proposed here is that HARCNFN uses the centroid defuzzification method and has one more structure parameter than the CRFNN. We compare the performance of the CRFNN with CFNN [3] and HARCNFN [6] in Table 6. We conclude that the proposed CRFNN can model the dynamic nonlinear system with fewer fuzzy rules and better precision.

Table 6 Performance comparison of related networks on nonlinear system modeling

Network        Structure parameters   Number of rules   Initial GSE   Final GSE   Training epochs
CFNN [3]       5                      25                12.0632       0.2952      200
HARCNFN [6]    6                      10                0.8042        0.0915      200
CRFNN          5                      10                0.1877        0.0929      200

4.3 Example 3: Identification of a nonlinear system with chaotic behaviors

Following ref. [3], the nonlinear system with chaotic behaviors is given below:

$\dot{x}_1(t) = -x_1(t)\, x_2^2(t) + 0.999 + 0.42\cos(1.75t)$,
$\dot{x}_2(t) = x_1(t)\, x_2^2(t) - x_2(t)$,
$y(t) = \sin[x_1(t) + x_2(t)]$.  (37)

We use "ode23" in MATLAB to solve the differential equations (37) from t = 0 to t = 20 with the initial values x1(0) = 1.0 and x2(0) = 1.0, obtaining 107 values of (x1(t), x2(t), y(t)); Figure 5 shows x1(t) and x2(t). As in the above two examples, the identification of the nonlinear system with chaotic behaviors also has two steps.

4.3.1 Structure identification. First, normalize all of the training patterns. According to the description of the nonlinear system, we choose {x1(t), x2(t)} as the input of the constructed network and {y(t)} as its output. Set the parameter values of the SSLM: σmin = 0.4, τmin = 0.1, emin = 0.05, η = 0.9; and of the EKF algorithm: P0 = 1.0, Rn = 2, Q0 = 0, with the initial link weight of the recurrent element ϑ = 0.1 and the initial compensatory degree γ = 0.5. Finally, four fuzzy rules were generated, and the parameters are shown in Table 7.

Figure 5 x1(t) (solid line) and x2(t) (dashed).
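For readers reproducing this example without MATLAB, the data generation can be sketched in Python with SciPy's RK23 integrator (the closest analogue of ode23). Note that the 107 equally spaced sample times below are our assumption; the paper's 107 values came from ode23's own adaptive steps.

```python
import numpy as np
from scipy.integrate import solve_ivp

def chaotic_system(t, x):
    """Right-hand side of the differential equations in eq. (37)."""
    x1, x2 = x
    dx1 = -x1 * x2 ** 2 + 0.999 + 0.42 * np.cos(1.75 * t)
    dx2 = x1 * x2 ** 2 - x2
    return [dx1, dx2]

t_eval = np.linspace(0.0, 20.0, 107)          # 107 samples over t in [0, 20]
sol = solve_ivp(chaotic_system, (0.0, 20.0), [1.0, 1.0],
                method="RK23", t_eval=t_eval)
x1, x2 = sol.y
y = np.sin(x1 + x2)                            # system output of eq. (37)
```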

Table 7 The result of structure identification

k    c1        c2        σ
1    −0.8304   −0.5409   0.7828
2    0.1201    −0.5608   0.4470
3    0.1648    −1.1429   0.4770
4    0.9420    −0.7949   0.7321

4.3.2 Parameter identification. With an initial learning rate of 0.3, the GSE Eg reached 6.0328×10^-4 after 50 training epochs. The final parameters of the membership functions are shown in Table 8. Figures 6-8 show the desired output of the system and the output of the CRFNN.

Table 8 The result of parameter identification

k    c1        σ1       c2        σ2
1    −0.8709   0.6579   −0.6685   0.6209
2    0.2344    0.5967   −0.8723   0.3865
3    0.1894    0.7242   −1.1581   0.3717
4    0.9627    0.7203   −0.4805   0.7463

Figure 6 The desired output (y(t), solid line) and network output (yd(t), dashed).


Figure 7 y(t)–x1(t) (solid line) and yd(t)–x1(t) (dashed).

Figure 8 y(t)–x2(t) (solid line) and yd(t)–x2(t) (dashed).

4.3.3 Performance comparison with CFNN [3].

In ref. [3], the CFNN was also used to identify this nonlinear system with chaotic behaviors. There, 25 fuzzy rules were defined according to human experience, whereas we need only 4 fuzzy rules here. It is obvious that there are too many rules in ref. [3], which gives strong evidence that the CRFNN and its algorithm can generate fuzzy rules effectively. Besides, the convergence speed of the CRFNN is also faster than that of the CFNN. Table 9 shows the detailed comparison; the entries are the numbers of training epochs until the GSE Eg reaches the corresponding value.

Table 9 Performance comparison with CFNN (γ = 0.5)

Eg       0.1   0.075   0.05   0.025
CFNN     3     3       4      7
CRFNN    1     1       3      5

5 Conclusion

Based on a detailed study of several kinds of FNNs, we propose a novel CRFNN by adding a recurrent element and a compensatory element to the conventional FNN. The most important part of training the FNN is structure identification. Therefore, a new supervisory sequential learning method is proposed for structure identification and is shown, through several examples, to generate fuzzy rules effectively. Furthermore, with the improved BP algorithm, the hybrid learning algorithm requires fewer fuzzy rules and converges more quickly than other related networks.

1 Lin C M, Hsu C F. Identification of dynamic systems using recurrent fuzzy neural network. In: Joint 9th IFSA World Congress and 20th NAFIPS International Conference. Vancouver, 2001. 2671-2675

2 Lee C H, Teng C C. Identification and control of dynamic systems using recurrent fuzzy neural networks. IEEE Trans Fuzzy Syst, 2000, 8(4): 349-365

3 Zhang Y Q, Kandel A. Compensatory neural fuzzy systems with fast learning algorithm. IEEE Trans Neural Netw, 1998, 9(1): 83-105

4 Lin C J, Chen C H. A compensation-based recurrent fuzzy neural network for dynamic system identification. Eur J Oper Res, 2006, 172(2): 696-715

5 Lin C J, Chen C H. Identification and prediction using recurrent compensatory neuro-fuzzy systems. Fuzzy Sets Syst, 2005, 150(2): 307-330

6 Pang Z H, Zhou Y G. A hybrid approach-based recurrent compensatory neural fuzzy network. In: Proceedings of the 6th World Congress on Intelligent Control and Automation. Dalian, 2006. 2737-2741

7 Juang C F. A TSK-type recurrent fuzzy network for dynamic systems processing by neural network and genetic algorithms. IEEE Trans Fuzzy Syst, 2002, 10(2): 155-170

8 Chak C K, Feng G, Ma J. An adaptive fuzzy neural network for MIMO system model approximation in high-dimensional spaces. IEEE Trans Syst Man Cybern Part B-Cybern, 1998, 28(3): 436-446

9 Yang G, Lü J H, Liu Z H. A new sequential learning algorithm for RBF neural networks. Sci China Ser E-Tech Sci, 2004, 47(4): 447-460

10 Lu C H, Tsai C C. Generalized predictive control using recurrent fuzzy neural networks for industrial processes. J Process Control, 2007, 17(1): 83-92
