Engineering Applications of Artificial Intelligence 16 (2003) 453–463

Recurrent radial basis function network for time-series prediction

Ryad Zemouri*, Daniel Racoceanu, Noureddine Zerhouni

Laboratoire d'Automatique de Besançon, Groupe Maintenance et Sûreté de Fonctionnement, 25, Rue Alain Savary, 25000 Besançon, France
Abstract
This paper proposes a Recurrent Radial Basis Function Network (RRBFN) that can be applied to dynamic monitoring and prognosis. Based on the architecture of conventional Radial Basis Function networks, the RRBFN has looped input neurons with sigmoid activation functions. These looped neurons represent the dynamic memory of the RRBF, and the Gaussian neurons represent the static one. The dynamic memory enables the network to learn temporal patterns without an input buffer to hold the recent elements of an input sequence. To test the dynamic memory of the network, we applied the RRBFN to two time-series prediction benchmarks (Mackey–Glass and Logistic Map). The third application concerns an industrial prognosis problem: nonlinear system identification using the Box and Jenkins gas furnace data. A two-step training algorithm is used: the RCE training algorithm for the prototype parameters, and multivariate linear regression for the output connection weights. The network is able to predict the two temporal series and gives good results for the nonlinear system identification. The advantage of the proposed RRBF network is to combine the learning flexibility of the RBF network with the dynamic performance of the local recurrence given by the looped neurons.
© 2003 Elsevier Ltd. All rights reserved.

Keywords: Neural network; Radial basis function; Dynamic neural networks; Recurrent neural networks; Neural predictive model; Time-series prediction
1. Introduction
Modern industrial monitoring requires processing a number of sensor signals. It essentially concerns the detection of any deviation from a working reference by generating an alarm, together with failure diagnosis. The diagnosis operation has two main functions: locating the weakening system or sub-system, and identifying the primary cause of the failure (Lefebvre, 2000). Monitoring methods can be classified in two categories (Dash and Venkatasubramanian, 2000): model-based monitoring methodologies and model-free monitoring. The first class essentially contains control-system techniques based on the difference between the system model's outputs and the equipment's outputs (Combacau, 1991). The major disadvantage of these techniques is the difficulty of obtaining the formal model, especially for complex or re-configurable equipment. The second class of monitoring techniques is not sensitive to this problem. These techniques are the probabilistic ones and the Artificial Intelligence ones. The AI techniques are essentially based on a training process that gives a certain adaptability to the monitoring application (Rengaswamy and Venkatasubramanian, 1995).
The use of Artificial Neural Networks (ANN) for a monitoring task can be viewed as a pattern recognition application. The pattern to recognize is the measurable or observable equipment data; the output classes are the different working and failure modes of the equipment (Koivo, 1994). Radial Basis Function networks are well adapted to this kind of application. Because the history database of the equipment operation is not exhaustive, RBF networks are able to detect new operation or failure modes thanks to their local generalization. This is obtained by the Gaussian basis functions, which are maximal at the core and decrease monotonically with distance. The second advantage of the RBF network is the flexibility of its training process.
The problem with static classification methods is that the dynamic behavior of the process is not considered (Koivo, 1994). For example, distinguishing between a true degradation and a false alarm requires dynamic processing of the sensor signals (Zemouri et al., 2002a).
*Corresponding author. URL: http://www.lab.cnrs.fr
doi:10.1016/S0952-1976(03)00063-0
In our previous work, we demonstrated that a dynamic RBF is able to distinguish between a peak of variation and a continuous variation of a sensor signal. This can be interpreted as a distinction between a false alarm and a true degradation. The prognosis function also depends strongly on the dynamic behavior of the process. The aim of the prognosis function is to predict the evolution of a sensor signal. This can be achieved either through a priori knowledge of the laws governing the ageing phenomena, or by training on the signal evolution. In this way, the prognosis can identify degradations or predict the time remaining before breakdown (Brunet et al., 1990).
For this purpose, we introduce a new Recurrent Radial Basis Function Network (RRBF) architecture that is able to learn temporal sequences. The RRBF network builds on the advantages of Radial Basis Function networks in terms of training time. The recurrent or dynamic aspect is obtained by cascading looped neurons on the first layer. This layer represents the dynamic memory of the RRBF network, which makes it possible to learn temporal data. The proposed network combines the ease of use of the RBF network with the dynamic performance of the Locally Recurrent Globally Feedforward network (Tsoi and Back, 1994).
The prognosis function can be seen as a time-series prediction problem. In order to validate the prediction capability of the RRBFN, we test the network on two standard time-series prediction benchmarks: the Mackey–Glass series and the Logistic Map. The prognosis validation is made on a nonlinear system identification using the Box and Jenkins gas furnace data.
The paper is organized as follows: a brief survey of RBF networks, their applications and their training algorithms is presented in the second section. The third section describes the architecture of the RRBF network for time-series prediction. Finally, we present the results obtained on the three benchmarks.
2. Radial basis function network overview
2.1. RBF network definition

Radial Basis Function networks provide a local representation of an N-dimensional space. This is achieved by the restricted influence zone of the basis functions. The parameters of each basis function are given by a reference vector (core or prototype) $\mu_j$ and the dimension of the influence field $\sigma_j$. The response of the basis function depends on the Euclidean distance between the input vector $x$ and the prototype vector $\mu_j$, and also on the size of the influence field:

$$\phi_j(x) = \exp\left(-\frac{\|x - \mu_j\|^2}{2\sigma_j^2}\right). \quad (1)$$
For a given input, a restricted number of basis functions contributes to the output. RBF networks can be classified in two categories according to the type of output neuron: standardized and non-standardized (Mak and Kung, 2000; Moody and Darken, 1989; Xu, 1998; Ghosh and Nag, 2000). Moreover, the RBF network can be used in two kinds of application: regression and classification.
2.2. RBF training techniques

The parameters of an RBF network are the centers and influence fields of the radial functions, and the output weights (between the intermediate-layer neurons and those of the output layer). These parameters are obtained by the training process. The training techniques can be classified in the three following groups:
2.2.1. Supervised techniques
The principle of these techniques is to minimize the quadratic error (Ghosh et al., 1992):

$$E = \sum_n E_n. \quad (2)$$

At each step of the training process, we consider the variations $\Delta w_{ij}$ of the weights, $\Delta\mu_{jk}$ of the centers and $\Delta\sigma_j$ of the influence fields. The update law is obtained by gradient descent on $E_n$ (Rumelhart et al., 1986; Le Cun, 1985).
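To make the update law concrete, the following is a minimal NumPy sketch of one gradient-descent step for a single-output Gaussian RBF under the quadratic error above. The function name and the learning rate are illustrative choices of ours, not elements of the original paper.

import numpy as np

def gradient_step(x, target, centers, sigmas, weights, lr=0.01):
    # One descent step on E = 0.5*(y - target)^2 for a single-output
    # Gaussian RBF (Eq. (1)); illustrative sketch only.
    d2 = np.sum((x - centers) ** 2, axis=1)        # ||x - mu_j||^2
    phi = np.exp(-d2 / (2.0 * sigmas ** 2))        # basis responses
    e = weights @ phi - target                     # output error
    grad_w = e * phi                               # dE/dw_j
    grad_mu = (e * weights * phi / sigmas ** 2)[:, None] * (x - centers)
    grad_sigma = e * weights * phi * d2 / sigmas ** 3
    return (weights - lr * grad_w,
            centers - lr * grad_mu,
            sigmas - lr * grad_sigma)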
2.2.2. Heuristic techniques
The principle of these techniques is to determine the network parameters in an iterative way. Generally, the training process starts by initializing the network on one center with an initial influence field $(\mu_0, \sigma_0)$. The prototype centers are then created as the training vectors are presented progressively. The next step modifies the influence rays and the connection weights (only the weights between the intermediate layer and the output one). Some of the heuristic techniques used for RBF training are presented below:
2.2.2.1. RCE algorithm (Restricted Coulomb Energy) (Hudak, 1992). The RCE algorithm was inspired by the theory of particle charges. The principle of the training algorithm is to modify the network architecture dynamically: intermediate neurons are added only when necessary. The influence fields are then adjusted to minimize conflicting zones using a threshold $\theta$ (Fig. 1).
2.2.2.2. Dynamic Decay Adjustment algorithm (Berthold and Diamond, 1995). This technique, partially derived from the RCE algorithm, is used for classification applications (discrimination). The principle of this technique is to introduce two thresholds $\theta^+$ and $\theta^-$ in order to reduce the conflicting zones between prototypes. To ensure the convergence of the training algorithm, the neural network must satisfy the two inequalities (3) for each vector $x$ of class $c$ from the training set (Fig. 2):

$$\exists i:\ \phi_i^c(x) \ge \theta^+ \quad \text{and} \quad \forall k \ne c,\ \forall j:\ \phi_j^k(x) < \theta^-. \quad (3)$$
2.2.3. Two-phase training techniques
These techniques estimate the RBF parameters in two phases. A first phase determines the centers and the rays of the basis functions; in this step, only input vectors are used (unsupervised training). The second step calculates the connection weights between the hidden layer and the output layer (supervised training). Some of these techniques are presented below.
2.2.3.1. First phase (unsupervised). The k-means algorithm: the prototype centers and the variance matrices can be calculated in two steps. In the first step, the k-means clustering algorithm determines the centers of the clusters of points with the same class. The centers are obtained by a segmentation of the training space $w^k$ of class $k$ into $J_k$ disjoint groups $\{w_j^k\}_{j=1}^{J_k}$, where group $j$ contains $N_j^k$ points. The center $\mu_j$ of the function is then estimated by the average:

$$\mu_j = \frac{1}{N_j^k} \sum_{x \in w_j^k} x. \quad (4)$$

The second step calculates the variance of the Gaussian function (the influence field), using the following expression:

$$\Sigma_j = \frac{1}{N_j^k} \sum_{x \in w_j^k} (x - \mu_j)(x - \mu_j)^T. \quad (5)$$
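As an illustration of this first phase, the sketch below estimates the centers with plain Lloyd-style k-means iterations (Eq. (4)) and uses a scalar variant of Eq. (5) for the influence fields (one radius per cluster instead of a full covariance matrix). The function and variable names are ours.

import numpy as np

def kmeans_rbf_parameters(X, J, n_iter=50, seed=0):
    # Unsupervised first phase: centers by k-means, rays by the
    # per-cluster mean squared distance; illustrative sketch only.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), J, replace=False)].copy()
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(-1), axis=1)
        for j in range(J):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)   # Eq. (4)
    labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(-1), axis=1)
    sigmas = np.array([
        np.sqrt(((X[labels == j] - centers[j]) ** 2).sum(-1).mean())
        if np.any(labels == j) else 1.0                    # scalar Eq. (5)
        for j in range(J)])
    return centers, sigmas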
The Expectation–Maximization (EM) method (Dempster et al., 1977): this technique is based on the analogy between the RBF network and Gaussian mixture models. The EM algorithm determines, in an iterative way, the parameters of a Gaussian mixture (by maximum likelihood). The RBF parameters are obtained in two steps: the E step, which computes the expectation of the unknown data given the known data, and the M step, which maximizes over the parameter vector of the E step.
2.2.3.2. Second phase (supervised). Maximum of membership (Hernandez, 1999): this technique, used in classification applications, considers the most significant basis function value $\phi_i(x)$:

$$\phi_{\max} = \max_{i=1}^{N} \phi_i, \quad (6)$$

where $N$ is the number of basis functions over all classes. The output of the neural network is then given by

$$y = \text{class}(\phi_{\max}). \quad (7)$$
Least-squares algorithm: suppose an empirical risk function to minimize ($R_{emp}$) is fixed. As for the Multi-Layer Perceptron, the parameters can then be determined in a supervised way by gradient descent. If the selected cost function is quadratic with fixed basis functions $\Phi$, the weight matrix $W$ is obtained by solving a simple linear system. The solution is the weight matrix $W$ that minimizes the empirical risk $R_{emp}$. By setting the derivative of this risk with respect to the weights to zero, we obtain the optimality conditions, which can be written in the following matrix form:

$$\Phi^T \Phi W^T = \Phi^T Y. \quad (8)$$

$Y$ represents the desired output vector. If the $\Phi^T\Phi$ matrix is square and non-singular (Micchelli condition (Micchelli, 1986)), the optimal solution for the weights, with fixed basis functions, can be written as

$$W^T = (\Phi^T \Phi)^{-1} \Phi^T Y = \Phi^{-1} Y. \quad (9)$$
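In practice the normal equations (8) are solved numerically rather than by explicit inversion. A hedged NumPy sketch, assuming a fixed design matrix Phi of basis-function responses (one row per training example, one column per basis function):

import numpy as np

def solve_output_weights(Phi, Y):
    # Least-squares solution of Phi W = Y, equivalent to the normal
    # equations (8) but numerically better behaved than inverting
    # Phi^T Phi; sketch under the fixed-basis assumption.
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return W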
Fig. 1. Influence field adjustment by the RCE algorithm. Only one threshold is used. The reduction of the conflicting zone must respect the following relations: $\phi_B(x_A) < \theta$, $\phi_A(x_n) < \theta$, $\phi_A(x_B) < \theta$. No new prototype is added for the input vector $x_n$.
Fig. 2. Influence field adjustment by the DDA algorithm. Two thresholds $\theta^+$ and $\theta^-$ are used for the conflict reduction, according to the expressions $\phi_B(x_A) < \theta^-$, $\phi_A(x_n) < \theta^-$, $\phi_A(x_B) < \theta^-$. No prototype is added for the input vector since $\phi_B(x_n) > \theta^+$.
3. The recurrent radial basis function network

The proposed recurrent RBF neural network treats time as an internal representation (Chappelier, 1996; Elman, 1990). The dynamic aspect is obtained by adding a self-connection to the input neurons, which have a sigmoid activation function. These looped neurons are a special case of the Locally Recurrent Globally Feedforward architecture, called local output feedback (Tsoi and Back, 1994). The RRBF network can thus take into account a certain past of the input signal (Fig. 3).
3.1. Looped neuron

At instant $t$, each neuron of the input layer sums its input $I_i$ and its previous output weighted by a self-connection $w_{ii}$. The output of its activation function is

$$a_i(t) = w_{ii}\, x_i(t-1) + I_i(t), \quad (10)$$

$$x_i(t) = f(a_i(t)), \quad (11)$$

where $a_i(t)$ and $x_i(t)$ represent respectively the neuron activation and its output at instant $t$, and $f$ is the sigmoid activation function:

$$f(x) = \frac{1 - \exp(-kx)}{1 + \exp(-kx)}. \quad (12)$$

To highlight the influence of this self-connection, we let the neuron evolve without external input (Frasconi et al., 1995; Bernauer, 1996). The initial conditions are the input $I_i(t_0) = 0$ and $x_i(t_0) = 1$. The output of the neuron then evolves according to the following expression:

$$x(t) = \frac{1 - \exp(-k\, w_{ii}\, x(t-1))}{1 + \exp(-k\, w_{ii}\, x(t-1))}. \quad (13)$$
Fig. 4 shows the temporal evolution of the neuron output. This evolution depends on the slope of the straight line $D$, which in turn depends on two parameters: the self-connection weight $w_{ii}$ and the activation-function parameter $k$. The equilibrium points of the looped neuron satisfy the following equation:

$$a(t) = w_{ii}\, f(a(t-1)). \quad (14)$$

The point $a_0 = 0$ is a first obvious solution of this equation. The other solutions are obtained by studying the variations of the function

$$g(a) = a - w_{ii}\, f(a). \quad (15)$$

According to $k w_{ii}$, the looped neuron has one or more equilibrium points:

* If $k w_{ii} \le 2$, the neuron has only one equilibrium point, $a_0 = 0$.
* If $k w_{ii} > 2$, the neuron has three equilibrium points: $a_0 = 0$, $a^+ > 0$ and $a^- < 0$.
To study the stability of these points, we study the variations of a Lyapunov function (Frasconi et al., 1995; Bernauer, 1996). In the case where $k w_{ii} \le 2$, this function is defined by $V(a) = a^2$. We obtain

$$\Delta V = (w_{ii}\, f(a))^2 - a^2 = -g(a)\,(w_{ii}\, f(a) + a). \quad (16)$$

If $a > 0$, then $f(a) > 0$ and $g(a) > 0$; if $w_{ii} > 0$, we thus have $\Delta V < 0$. If $a < 0$, then $f(a) < 0$ and $g(a) < 0$; if $w_{ii} > 0$, we again have $\Delta V < 0$. The point $a_0 = 0$ is thus a stable equilibrium point if $k w_{ii} \le 2$ with $w_{ii} > 0$.

In the case where $k w_{ii} > 2$, the looped neuron has three equilibrium points: $a_0 = 0$, $a^+ > 0$ and $a^- < 0$. To study the stability of the point $a^+$, we define the Lyapunov function $V(a) = (a - a^+)^2$ (see Frasconi et al., 1995; Bernauer, 1996). We obtain

$$\Delta V = (w_{ii}\, f(a) - a^+)^2 - (a - a^+)^2 = g(a)\,\big(g(a) - 2(a - a^+)\big).$$

If $a > a^+$, then $g(a) > 0$ and $g(a) - 2(a - a^+) < 0$, so $\Delta V < 0$. The calculation is similar in the case $a < a^+$. The point $a^+$ is therefore a stable equilibrium point. In the same way, we can prove that $a^-$ is another stable equilibrium point, while $a_0 = 0$ is an unstable equilibrium point.
Fig. 3. RRBF network (recurrent network with radial basis functions): inputs $I_1$, $I_2$, $I_3$ feed looped sigmoid neurons, followed by radial basis function neurons and the output neurons.
Fig. 4. Equilibrium points of the looped neuron: (a) the forgetting behavior ($k w_{ii} \le 2$) and (b) the temporal memorizing behavior ($k w_{ii} > 2$).
The looped neuron can thus exhibit two behaviors according to $k w_{ii}$: a forgetting behavior ($k w_{ii} \le 2$) and a temporal memory behavior ($k w_{ii} > 2$). Fig. 5 shows the influence of the self-connection weight on the behavior of the looped neuron with $k = 0.05$. The self-connection gives the neuron the capacity to memorize a certain past of the input data. The weight of this self-connection can be obtained by training, but the easier way is to fix it a priori. We will see in the next section how this looped neuron enables the RRBF network to treat dynamic data, whereas traditional RBF networks treat only static data.
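Both regimes are easy to reproduce numerically. Below is a small sketch of the free evolution of Eq. (13), in our own notation rather than the authors' code, matching the settings of Fig. 5 ($k = 0.05$):

import numpy as np

def looped_neuron_free_run(w_ii, k=0.05, x0=1.0, steps=200):
    # Free evolution of the looped neuron (Eq. (13)): no external
    # input, x(t0) = 1; returns the output trajectory.
    xs = [x0]
    for _ in range(steps):
        a = w_ii * xs[-1]
        xs.append((1 - np.exp(-k * a)) / (1 + np.exp(-k * a)))
    return np.array(xs)

# k*w_ii <= 2: forgetting behavior, the output decays towards 0;
# k*w_ii  > 2: memorizing behavior, the output settles on a+ > 0.
forgets = looped_neuron_free_run(w_ii=30.0)    # k*w_ii = 1.5
remembers = looped_neuron_free_run(w_ii=41.0)  # k*w_ii = 2.05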
3.2. RRBF for the prognosis

Having shown the effect of the self-connection on the dynamic behavior of the RRBF network, we present in this section the topology of the RRBF network and its training algorithm for time-series prediction applications (Fig. 6). The cascade of looped neurons represents the dynamic memory of the neural network; the network therefore treats the data dynamically. The output vector of the looped neurons is the input vector of the RBF nodes.
The neural network output is defined by

$$y(t) = \sum_{i=1}^{n} w_i\, \phi_i(\mu_i, \sigma_i), \quad (17)$$

where $w_i$ represents the connection weight between radial neuron $i$ and the output neuron. The output of the RBF nodes has the following expression:

$$\phi_i(\mu_i, \sigma_i) = \exp\left(-\frac{\sum_{j=1}^{m} \big(x^j(t) - \mu_i^j\big)^2}{\sigma_i^2}\right), \quad (18)$$

where $\mu_i = (\mu_i^j)_{j=1}^{m}$ and $\sigma_i$ represent respectively the center and the influence-ray dimension of the $i$th prototype. These radial neurons are the static memory of the network. The output $x^j(t)$ of the $j$th looped neuron is the dynamic memory of the network, with the following expression:

$$x^j(t) = \frac{1 - \exp\!\big(-k(\varpi\, x^j(t-1) + x^{j-1}(t))\big)}{1 + \exp\!\big(-k(\varpi\, x^j(t-1) + x^{j-1}(t))\big)}, \quad (19)$$

with $j = 1, \ldots, m$, where $m$ is the number of neurons of the input layer and $\varpi$ is the self-connection weight. The first neuron of this layer has a linear activation function: $x^1(t) = x(t)$.
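A compact sketch of this forward pass may help fix Eqs. (17)–(19). It treats the scalar-input case of Fig. 6, with our own naming; it is an illustration, not the authors' implementation.

import numpy as np

def rrbf_forward(signal, m, varpi, k, centers, sigmas, weights):
    # Forward pass of the RRBF: a linear first neuron plus a cascade
    # of looped neurons (Eq. (19)) feeding Gaussian nodes (Eqs. (17)-(18)).
    f = lambda a: (1 - np.exp(-k * a)) / (1 + np.exp(-k * a))
    mem = np.zeros(m)                  # x^j(t-1) for the input layer
    outputs = []
    for u in signal:
        new = np.empty(m)
        new[0] = u                     # linear neuron: x^1(t) = x(t)
        for j in range(1, m):          # Eq. (19): self-loop + cascade
            new[j] = f(varpi * mem[j] + new[j - 1])
        mem = new
        phi = np.exp(-((mem - centers) ** 2).sum(axis=1) / sigmas ** 2)
        outputs.append(weights @ phi)  # Eq. (17)
    return np.array(outputs)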
Fig. 7 shows the relation between the number of looped neurons and the length of the signal past. We introduced a variation $\Delta$ at instant $t = 50$ in a signal (Figs. 7(a) and (b)). The aim is to highlight the dynamic memory length of the RRBF shown in Fig. 6. A four looped-neuron RRBF is stimulated by the signal of Fig. 7(a). Figs. 7(c)–(f) show the output error of each looped neuron caused by this variation $\Delta$.
The network parameters are determined by a two-stage training process. During the first stage, an unsupervised learning algorithm determines the parameters of the RBF nodes (the centers and the influence rays). In the second stage, linear regression determines the weights between the hidden and the output layer.
3.3. Training process of the RRBF

3.3.1. The prototype parameters
The first step of the training process consists in determining the centers and the influence rays of the prototypes (static memory). These prototypes are extracted from the outputs of the looped neurons (dynamic memory): each temporal signal is characterized by a cluster of points whose coordinates are the outputs of the looped neurons at every instant $t$. We adopted the RCE training algorithm for this first stage of the training process.
Fig. 5. Influence of the self-connection on the behavior of the looped neuron with $k = 0.05$, for self-connection weights $w_{ii} = 30$, $39$, $40$ and $41$.

Fig. 6. Topology of the RRBF. The self-connection of the input neurons gives the network a dynamic processing of the input data.
The influence rays are adjusted according to a threshold $\theta$. A complete iteration of this algorithm is as follows:

// Training iteration
// Creation of a new prototype
for all training vectors x do:
    add a new prototype p_{n+1} with:
        mu_{n+1} = x
        n = n + 1
end
// Adjustment of the influence rays
for all prototypes mu_i do:
    sigma_i = max{ sigma : phi_i(mu_j) < theta, for all j != i, 1 <= j <= n }
end
// End
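A literal NumPy transcription of this iteration might look as follows. It assumes the Gaussian form of Eq. (18) (denominator sigma^2, no factor 2) and sets each ray at the boundary value for which the nearest other prototype reaches the threshold; the names are ours.

import numpy as np

def rce_iteration(X, theta):
    # One RCE iteration as sketched above: every training vector
    # becomes a prototype, then each influence ray is shrunk so that
    # phi_i(mu_j) < theta for all other prototypes j.
    centers = X.copy()
    n = len(centers)
    if n == 1:
        return centers, np.ones(1)
    sigmas = np.empty(n)
    for i in range(n):
        d2 = ((centers - centers[i]) ** 2).sum(axis=1)
        d2 = np.delete(d2, i)                  # other prototypes only
        # exp(-d2/sigma^2) < theta  <=>  sigma^2 < d2 / (-ln(theta))
        sigmas[i] = np.sqrt(d2.min() / -np.log(theta))
    return centers, sigmas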
3.3.2. Connection weights
The time-series prediction can be seen as an interpolation problem. The output of the RBF network is

$$h(x) = \sum_{i=1}^{N} w_i\, \phi_i(\|x - \mu_i\|), \quad (20)$$

where $N$ represents the number of basis functions, centered on the $N$ input points. The solution of this problem is to solve the $N$ linear equations for the weight coefficients:

$$\begin{bmatrix} \phi_{11} & \phi_{12} & \cdots & \phi_{1N} \\ \phi_{21} & \phi_{22} & \cdots & \phi_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{N1} & \phi_{N2} & \cdots & \phi_{NN} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}, \quad (21)$$

where $y_i$ is the desired output and

$$\phi_{ij} = \phi(\|\mu_i - \mu_j\|), \quad i, j = 1, 2, \ldots, N. \quad (22)$$

The equation can be written as

$$\Phi\, w = Y. \quad (23)$$

The weight vector is then

$$w = \Phi^{-1}\, Y. \quad (24)$$
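Since Eq. (21) places one basis function on every training point, the weights follow from a single square linear solve. A sketch, assuming $\Phi$ is non-singular (Micchelli condition) and a common ray sigma for all basis functions:

import numpy as np

def interpolation_weights(centers, y, sigma):
    # Exact interpolation (Eqs. (21)-(24)): build phi_ij (Eq. (22))
    # and solve Phi w = Y directly.
    d2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    Phi = np.exp(-d2 / sigma ** 2)
    return np.linalg.solve(Phi, y)     # w = Phi^{-1} Y, Eq. (24)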
4. Application in prediction

We tested the RRBF network on three time-series prediction applications. In these three applications, the goal is to predict the evolution of the input data from knowledge of their past. The training process uses part of the data set, and the network is tested on the totality of the data. For each application we give two average prediction errors and two error standard deviations, according to whether the network is evaluated on the test population only or on both the test and training populations.
4.1. Mackey–Glass chaotic time series

The Mackey–Glass chaotic time series is generated by the following differential equation:

$$\dot{x}(t) = -b\, x(t) + \frac{a\, x(t-\tau)}{1 + x^{10}(t-\tau)}. \quad (25)$$

$x(t)$ is quasi-periodic and chaotic for the parameters $a = 0.2$, $b = 0.1$ and $\tau = 17$ (Jang, 1993; Chiu, 1994). The simulated data were obtained by applying the fourth-order Runge–Kutta method to Eq. (25) with the initial conditions $x(0) = 1.2$ and $x(t - \tau) = 0$ for $0 \le t < \tau$. The simulation step is 1. The data of this series are available at http://neural.cs.nthu.edu.tw/jang/benchmark.
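The benchmark series can be regenerated from Eq. (25). The sketch below uses a fourth-order Runge–Kutta step of 1 and holds the delayed term constant within each step, a common simplification that the paper does not detail:

import numpy as np

def mackey_glass(n, a=0.2, b=0.1, tau=17, dt=1.0, x0=1.2):
    # Mackey-Glass series (Eq. (25)) by fourth-order Runge-Kutta,
    # with x(0) = 1.2 and zero history for t < 0; sketch only.
    x = np.zeros(n + 1)
    x[0] = x0
    deriv = lambda xt, xd: -b * xt + a * xd / (1.0 + xd ** 10)
    for t in range(n):
        xd = x[t - tau] if t >= tau else 0.0   # delayed term x(t - tau)
        k1 = deriv(x[t], xd)
        k2 = deriv(x[t] + 0.5 * dt * k1, xd)
        k3 = deriv(x[t] + 0.5 * dt * k2, xd)
        k4 = deriv(x[t] + dt * k3, xd)
        x[t + 1] = x[t] + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x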
We tested the RRBF network presented previously on the Mackey–Glass prediction. To obtain good results, we used six looped neurons. The parameters of these looped neurons are set so as to obtain the longest dynamic memory (Fig. 5). This characteristic is obtained with the self-connection value $\varpi = 40$ and the sigmoid-function parameter $k = 0.05$. The parameters of the Gaussian functions as well as the connection weights are given by the training algorithms presented previously, with $\theta = 0.8$.

Fig. 7. Influence of the number of looped neurons on the length of the dynamic memory of the network: (a) signal evolution, (b) signal with variation $\Delta$, (c) first looped neuron error, (d) second looped neuron error, (e) third looped neuron error, and (f) fourth looped neuron error.
Table 1 presents the results obtained by the RRBF network for different numbers of training points (Nb), taken from the 118th data point onwards. The prediction errors between the network output and the real value of the series are presented in the various columns of the table, with the percentage of each error. This percentage is calculated with respect to the amplitude of the series (0.9). The network is able to predict the series evolution with a minimum of 50 training points, with a mean error of 19% and an error standard deviation of 27%. This error decreases as training points are added, down to a 2% error. The training corresponds to one iteration. Fig. 8 shows the results of the test with 500 training points.
4.2. Logistic Map

The Logistic Map series is defined by the expression below:

$$x(t+1) = 4\, x(t)\, (1 - x(t)). \quad (26)$$

This series is chaotic in the interval [0, 1], with $x(0) = 0.2$. The goal of this application is to predict the target value $x(t+1)$; the input value of the RRBF network is $x(t)$.
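Generating the series and the one-step-ahead training pairs is immediate; a short sketch with our own variable names:

import numpy as np

def logistic_map(n, x0=0.2):
    # Logistic Map series of Eq. (26).
    x = np.empty(n)
    x[0] = x0
    for t in range(n - 1):
        x[t + 1] = 4.0 * x[t] * (1.0 - x[t])
    return x

series = logistic_map(200)
inputs, targets = series[:-1], series[1:]   # x(t) -> x(t+1) pairs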
The best prediction results are obtained with one looped neuron having the parameters $\varpi = 40$ for the self-connection and $k = 0.05$ for the sigmoid-function parameter. The parameter $\theta = 0.999$ was used for the first training stage. Table 2 shows the test results of the RRBF network for different numbers of training points (Nb). The network gives good results with only 10 training points. Fig. 9 shows the results of the test with 100 training data points.
4.3. Nonlinear system prediction

The third application relates to nonlinear system prediction, using the Box and Jenkins (1970) gas furnace database, available at http://neural.cs.nthu.edu.tw/jang/benchmark. These data represent a time series of a gas furnace process, where $u(t)$ represents the input gas flow and $y(t)$ the output CO2 concentration.
Table 1
Results of the RRBF test on the Mackey–Glass series prediction

Nb    Min                  Max             Moy1           Moy2           Dev Std1       Dev Std2
50    3.90e-4 (0.043%)     1.1669 (129%)   0.1862 (20%)   0.1776 (19%)   0.251 (27%)    0.2482 (27%)
100   3.27e-5 (0.0036%)    1.1632 (129%)   0.0969 (10%)   0.0879 (9%)    0.184 (20%)    0.1778 (19%)
150   4.13e-5 (0.00458%)   0.7129 (79%)    0.0655 (7%)    0.0564 (6%)    0.103 (11%)    0.0982 (11%)
200   2.60e-5 (0.00288%)   0.3915 (43%)    0.0502 (5%)    0.0408 (4%)    0.058 (6%)     0.0559 (6%)
250   4.54e-5 (0.00504%)   0.3000 (33%)    0.0480 (5%)    0.0369 (4%)    0.054 (6%)     0.0518 (5%)
300   1.46e-5 (0.00162%)   0.2727 (30%)    0.0441 (5%)    0.0318 (3%)    0.048 (5%)     0.0456 (5%)
350   2.45e-6 (0.00027%)   0.2874 (31%)    0.0439 (4%)    0.0296 (3%)    0.048 (5%)     0.0445 (5%)
400   3.35e-5 (0.0037%)    0.3114 (34%)    0.0375 (4%)    0.0236 (2%)    0.042 (4%)     0.0382 (4%)
450   9.56e-5 (0.01062%)   0.2893 (32%)    0.0360 (4%)    0.0209 (2%)    0.042 (4%)     0.0368 (4%)
500   1.50e-5 (0.00166%)   0.2789 (31%)    0.0380 (4%)    0.0203 (2%)    0.043 (4%)     0.0371 (4%)

Nb is the number of training points. The columns Min and Max give the minimal and maximal prediction errors. Moy1 is the average prediction error on the data excluding the training population, and Moy2 the average error on all the data. Dev Std1 and Dev Std2 are the corresponding standard deviations. The percentages are given with respect to the signal amplitude of 0.9.
Fig. 8. Prediction results: (a) neural network output and the Mackey–Glass series values and (b) error of the neural network prediction.
The goal of this application is to predict the value $y(t)$ from the knowledge of $y(t-1)$ and $u(t-1)$.
The RRBF network used has two inputs: one for $y(t)$ and another for $u(t)$. The past of each input signal is taken into account by a looped neuron, and the output of the neural network gives the predicted $y$ value. The network is thus composed of four input neurons (a linear neuron and a looped neuron for each input signal) and one output neuron. The intermediate neurons are determined by the first-stage training process described previously. The first 145 points of the database are used for the training process, and the second-stage training algorithm determines the connection weights. The best results were obtained with $\varpi = 500$ and $k = 0.05$ for the sigmoid function, and $\theta = 0.84$ for the training of the influence rays.
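As an illustration of this architecture, the sketch below (our construction, not the authors' code) builds the four input-neuron outputs that feed the RBF layer, with the parameters quoted above:

import numpy as np

def gas_furnace_states(u, y, varpi=500.0, k=0.05):
    # Four input neurons: for each signal (u and y), a linear neuron
    # plus one looped neuron (Eq. (19)); returns one state per step.
    f = lambda a: (1 - np.exp(-k * a)) / (1 + np.exp(-k * a))
    mu = my = 0.0                      # looped-neuron memories
    states = []
    for ut, yt in zip(u, y):
        mu = f(varpi * mu + ut)        # looped neuron on the gas input
        my = f(varpi * my + yt)        # looped neuron on the CO2 output
        states.append([ut, mu, yt, my])
    return np.array(states)

# states[t] is the RBF-layer input used to predict the next CO2 value.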
Table 3 shows the results of the network test on this application. The RRBF neural network gives a prediction result with an average error of about 8%. The training process takes one iteration.
5. Discussion

The Recurrent Radial Basis Function network presented in this article was successfully validated on two time-series prediction problems. Figs. 8 and 9 show the results and the prediction error of the RRBF for the Mackey–Glass and the Logistic Map series. This dynamic capability is obtained thanks to the looped input nodes (Fig. 3): the local output feedback gives the neuron a dynamic memory (Fig. 5).
Table 2
Results of the RRBF test on the Logistic Map series prediction

Nb    Moy1                   Moy2                   Dev Std1               Dev Std2
10    0.0945 (9%)            0.0898 (9%)            0.0636 (6%)            0.0652 (6%)
20    7.26e-4 (7.26e-2%)     6.53e-4 (6.53e-2%)     5.11e-4 (5.11e-2%)     5.32e-4 (5.32e-2%)
30    1.59e-6 (1.59e-4%)     1.35e-6 (1.35e-4%)     1.69e-6 (1.69e-4%)     1.66e-6 (1.66e-4%)
40    4.69e-8 (4.69e-6%)     3.75e-8 (3.75e-6%)     3.66e-8 (3.66e-6%)     3.77e-8 (3.77e-6%)
50    1.33e-9 (1.33e-7%)     1.00e-9 (1.00e-7%)     1.64e-9 (1.64e-7%)     1.53e-9 (1.53e-7%)
60    4.29e-10 (4.29e-8%)    3.02e-10 (3.02e-8%)    8.06e-10 (8.06e-8%)    7.00e-10 (7.00e-8%)
70    7.11e-11 (7.11e-9%)    5.10e-11 (5.10e-9%)    1.90e-10 (1.90e-8%)    1.55e-10 (1.55e-8%)
80    4.23e-12 (4.23e-10%)   3.25e-12 (3.25e-10%)   9.86e-12 (9.86e-10%)   7.74e-12 (7.74e-10%)
90    1.51e-11 (1.51e-9%)    1.32e-11 (1.32e-9%)    1.23e-11 (1.23e-9%)    1.45e-11 (1.45e-9%)
100   2.14e-11 (2.14e-9%)    1.55e-11 (1.55e-9%)    1.68e-11 (1.68e-9%)    1.38e-11 (1.38e-9%)

Nb is the number of training points. Moy1 is the average prediction error on the data excluding the training population, and Moy2 the average error on all the data. Dev Std1 and Dev Std2 are the corresponding standard deviations. The percentages are given with respect to the signal amplitude.
Fig. 9. (a) Comparison of the prediction results of the network with the values of the Logistic Map series and (b) prediction error of the neural network.
We therefore do not need temporal windows to store or block the input data, as some neural architectures do: NETtalk, introduced by Sejnowski and Rosenberg (1986), the TDNN of Waibel et al. (1989) and the TDRBF of Berthold (1994). These temporal-window techniques have several disadvantages (Elman, 1990). First, the data must be blocked by an external mechanism: when should the data be presented to the network? The second disadvantage is the limited dimension of the temporal window. Recurrent networks are not affected by these issues. We have shown in Fig. 7 that the RRBF with four looped neurons is sensitive to a past of about 100 time steps.
A second advantage of the RRBF is the flexibility of the training process. A two-stage learning algorithm was used: the first stage determines the RBF parameters, and the second stage computes the output weights. Only a few seconds are required to train the RRBF on a personal computer with a 700 MHz processor.
The main difficulty is to find the parameters that give the best output result. These parameters are: the number of input looped neurons $N > 0$, the self-connection value $w_{ii} > 0$, the sigmoid-function parameter $k > 0$, and the first-stage training parameter $0 < \theta < 1$. In most cases, good results can be obtained with only one looped neuron ($N = 1$). This input neuron is configured to have the longest memory, obtained with $k w_{ii} = 2$ (Fig. 5). The $k$ parameter is chosen so as to give a quasi-linear aspect to the sigmoid function around the initial point ($k \approx 0.05$). The last parameter to adjust is the first-stage training threshold $\theta$.
The results obtained by the RRBF show that the RCE algorithm does not rigorously calculate the parameters of the Gaussian nodes: the neural network is overtrained. This result is entirely coherent, because all the data of the training set are stored as prototypes. Clustering techniques like the k-means algorithm, which minimizes the sum of squared errors (SSE) between the inputs and the hidden-node centers, will certainly give better results than the RCE algorithm. However, these techniques also have some disadvantages. We presented in our previous work an example that highlights them (Zemouri et al., 2002b):

* There is no formal method for specifying the number of hidden nodes.
* The nodes are initialized randomly, so several runs are needed to obtain the best result.

Our future work will concern the development of a new method that boosts the performance of the k-means algorithm (Figs. 10–12).
Fig. 10. (a) CO2 output concentration $y(t)$ of the gas furnace and (b) input gas $u(t)$ of the furnace.
Table 3
Results of the RRBF test on the nonlinear system prediction

Nb    Min              Max              Moy1           Moy2          Dev Std1       Dev Std2
145   0.0067 (0.04%)   18.0235 (120%)   1.5274 (10%)   1.2441 (8%)   2.3267 (15%)   3.4950 (23%)

Nb is the number of training points. The columns Min and Max give the minimal and maximal prediction errors. Moy1 is the average prediction error on the data excluding the training population, and Moy2 the average error on all the data. Dev Std1 and Dev Std2 are the corresponding standard deviations. The percentages are given with respect to the signal amplitude.
6. Conclusion

We have presented in this article an application of the RRBF network to three time-series prediction problems: the Mackey–Glass series, the Logistic Map and the Box and Jenkins gas furnace data. Thanks to its dynamic memory, the RRBF network is able to learn temporal sequences. This dynamic memory is obtained by a self-connection on the input neurons: the input data are not blocked by an external mechanism, but are memorized by the input neurons. The training time is relatively short: one iteration for the RBF parameter calculation and one matrix computation for the output weights. In the three examples, all the training data were correctly reproduced.

The results obtained in the three time-series prediction applications validate the dynamic data treatment performed by the RRBF network.
References
Bernauer, E., 1996. Les réseaux de neurones et l'aide au diagnostic: un modèle de neurones bouclés pour l'apprentissage de séquences temporelles. Ph.D. Thesis, LAAS, France.
Berthold, M.R., 1994. A time delay radial basis function network for phoneme recognition. In: Proceedings of the International Conference on Neural Networks, Orlando, Vol. 7, pp. 4470–4473.
Berthold, M.R., Diamond, J., 1995. Boosting the performance of RBF networks with dynamic decay adjustment. In: Tesauro, G., Touretzky, D.S., Leen, T.K. (Eds.), Advances in Neural Information Processing Systems. MIT Press, Cambridge, MA, pp. 521–528.
Box, G.E.P., Jenkins, G.M., 1970. Time Series Analysis, Forecasting and Control. Holden-Day, San Francisco, pp. 532–533.
Brunet, J., Jaume, D., Labarrère, M., Rault, A., Vergé, M., 1990. Détection et diagnostic de pannes, approche par modélisation. Traitement des nouvelles technologies, série diagnostic et maintenance. Hermès, France.
Chappelier, J.C., 1996. RST: une architecture connexionniste pour la prise en compte de relations spatiales et temporelles. Ph.D. Thesis, Ecole Nationale Supérieure des Télécommunications, France.
Chiu, S., 1994. Fuzzy model identification based on cluster estimation. Journal of Intelligent & Fuzzy Systems 2 (3), 267–278.
Combacau, M., 1991. Commande et surveillance des systèmes à événements discrets complexes: application aux ateliers flexibles. Ph.D. Thesis, Université Paul Sabatier, Toulouse, France.
Dash, S., Venkatasubramanian, V., 2000. Challenges in the industrial applications of fault diagnostic systems. In: Proceedings of the Conference on Process Systems Engineering, Computers and Chemical Engineering 24 (2–7). Keystone, Colorado, pp. 785–791.
Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39, 1–38.
Elman, J.L., 1990. Finding structure in time. Cognitive Science 14, 179–211.
Frasconi, P., Gori, M., Maggini, M., Soda, G., 1995. Unified integration of explicit knowledge and learning by example in recurrent networks. IEEE Transactions on Knowledge and Data Engineering 7 (2), 340–346.
Ghosh, J., Nag, A., 2000. In: Howlett, R.J., Jain, L.C. (Eds.), Radial Basis Function Neural Network Theory and Applications. Physica-Verlag, Würzburg.
Ghosh, J., Beck, S., Deuser, L., 1992. A neural network based hybrid system for detection, characterization and classification of short-duration oceanic signals. IEEE Journal of Oceanic Engineering 17 (4), 351–363.
Hernandez, N.G., 1999. Système de diagnostic par réseaux de neurones et statistiques: application à la détection d'hypovigilance d'un conducteur automobile. Ph.D. Thesis, LAAS, France.
Hudak, M.J., 1992. RCE classifiers: theory and practice. Cybernetics and Systems 23, 483–515.
Jang, J.-S.R., 1993. ANFIS: adaptive-network-based fuzzy inference systems. IEEE Transactions on Systems, Man, and Cybernetics 23, 665–685.
Koivo, H.N., 1994. Artificial neural networks in fault diagnosis and control. Control Engineering Practice 2 (1), 89–101.
Le Cun, Y., 1985. Une procédure d'apprentissage pour réseau à seuil asymétrique. Cognitiva 85, 599–604.
Lefebvre, D., 2000. Contribution à la modélisation des systèmes dynamiques à événements discrets pour la commande et la surveillance. Habilitation à Diriger des Recherches, Université de Franche-Comté, IUT Belfort–Montbéliard, France.
Mak, M.W., Kung, S.Y., 2000. Estimation of elliptical basis function parameters by the EM algorithm with application to speaker verification. IEEE Transactions on Neural Networks 11 (4), 961–969.
Micchelli, C.A., 1986. Interpolation of scattered data: distance matrices and conditionally positive definite functions. Constructive Approximation 2, 11–22.
Moody, J., Darken, C., 1989. Fast learning in networks of locally-tuned processing units. Neural Computation 1, 281–294.
Rengaswamy, R., Venkatasubramanian, V., 1995. A syntactic pattern recognition approach for process monitoring and fault diagnosis. Engineering Applications of Artificial Intelligence 8 (1), 35–51.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning internal representations by error propagation. In: Rumelhart, D.E., McClelland, J.L. (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1. The MIT Press, Bradford Books, Cambridge, MA, pp. 318–362.
Fig. 11. Comparison of the test results of the CO2 concentration prediction of the gas furnace with the real values (training and test populations).
Fig. 12. Prediction error of the RRBF network.
Sejnowski, T.J., Rosenberg, C.R., 1986. NETtalk: a parallel network that learns to read aloud. Electrical Engineering and Computer Science Technical Report, The Johns Hopkins University.
Tsoi, A.C., Back, A.D., 1994. Locally recurrent globally feedforward networks: a critical review of architectures. IEEE Transactions on Neural Networks 5 (2), 229–239.
Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., Lang, K., 1989. Phoneme recognition using time-delay neural networks. IEEE Transactions on Acoustics, Speech and Signal Processing 37 (3), 328–339.
Xu, L., 1998. RBF nets, mixture experts, and Bayesian Ying-Yang learning. Neurocomputing 19 (1–3), 223–257.
Zemouri, R., Racoceanu, D., Zerhouni, N., 2002a. Application of the dynamic RBF network in a monitoring problem of the production systems. In: 15th IFAC World Congress on Automatic Control, Barcelona, Spain.
Zemouri, R., Racoceanu, D., Zerhouni, N., 2002b. Réseaux de neurones récurrents à fonctions de base radiales RRFR: application au pronostic. Revue d'Intelligence Artificielle, RSTI série RIA 16 (3), 307–338.