Evolutionary Neuro - Fuzzy System with Internal Dynamics for System Identification


Universität Duisburg-Essen

Fakultät für Ingenieurwissenschaften

    AUTOMATISIERUNGSTECHNIK UND KOMPLEXE SYSTEME

    Evolutionary Neuro-Fuzzy System with

    Internal Dynamics for System Identification

Dipl.-Eng.

    Cristian FLOREA

    Coordinators

    Prof. Dr. Eng. Steven X. DING

    Dr. Eng. Lavinia FERARIU

    June 2006


    Abstract

The aim of this project is to explore the capabilities of a dynamic neuro-fuzzy structure in system identification, which is the first step towards control.

The motivation of this study is based on the properties of neuro-fuzzy systems, which are summarized in the first chapter.

The visible part of a system's dynamics is represented by the measurements of its inputs and outputs (and sometimes state variables). The results are presented as time series. Studying them has proved an appropriate way to determine the internal structure of the generating system.

Neuro-fuzzy structures capable of generating similar time series can sometimes be considered functionally equivalent to the real systems, thus providing an alternative way of modelling.

Of course, today's mathematical background and technology already constitute powerful modelling tools that are used in most cases, even when the model is relatively complex (usually, only the main characteristics need to be modelled).

Neuro-fuzzy systems are an efficient option mainly for very complex systems, but they remain an alternative way to model any given real system.

    The practical component of this project provides a tool for system identification, developed

    using the MATLAB development environment.

At this moment, validation of the toolbox uses simplified models of real systems, but a milestone in the research process should be set by using more complex systems (social, economic, financial).


    I. Neuro-fuzzy systems .......................................................................................................... 6

    I.1. Neural Networks ........................................................................................................ 6

    I.2. Fuzzy Systems ............................................................................................................ 7

    I.3. Combining Neural Networks and Fuzzy Systems ...................................................... 8

    I.3.1. Neuro-Fuzzy Systems Characteristics ................................................................ 9

    I.3.2. Neuro-fuzzy Systems properties ........................................................................ 9

    I.3.3. Types of neuro-fuzzy systems .......................................................................... 10

    II. Time Series ....................................................................................................................... 11

    II.1. Time series classification ......................................................................................... 11

    II.1.1. Continuity of measuring ................................................................................... 11

    II.1.2. Number of variables ......................................................................................... 11

II.1.3. Linearity ........................................................................................... 11

II.1.4. Stationarity ....................................................................................... 11

    II.2. Prediction strategies ................................................................................................. 12

    II.2.1. Prerequisites ..................................................................................................... 12

    II.2.2. Strategies .......................................................................................................... 13

    III. Neuro-Fuzzy Systems in Time Series Analysis ........................................................... 14

    III.1. Fuzzy Neurons ...................................................................................................... 15

    III.2. Neurons with fuzzy weights ................................................................................. 15

    III.3. The Yamakawa Neuron Model ............................................................................ 16

    III.4. The Dynamic Yamakawa Neuron Model ............................................................. 18

    IV. The MATLAB Implementation ................................................................................... 21

    IV.1. Performance evaluation ........................................................................................ 21

    IV.2. Training algorithms .............................................................................................. 21

    IV.2.1. Gradient-based algorithms (Backpropagation) ................................................ 22

    IV.2.2. Genetic Algorithms .......................................................................................... 28

IV.3. Training strategies ................................................................................ 29

IV.4. Data structures ...................................................................................... 29

    IV.4.1. Class ARMA .................................................................................................... 30

    IV.4.2. Class SFS.......................................................................................................... 30

    IV.4.3. Class BRANCH ................................................................................................ 31

    IV.4.4. Class NFS ......................................................................................................... 31

    IV.5. Future development .............................................................................................. 32

    IV.5.1. Evolutionary strategies ..................................................................................... 32

    IV.5.2. Generating training data sets ............................................................................ 32

    IV.5.3. Graphical user interface ................................................................................... 32


    IV.6. Resources ............................................................................................................. 32

    V. Testing and results ............................................................................................................ 33

    V.1. Vehicle lateral dynamic model ................................................................................. 33

    V.1.1. Model description ............................................................................................. 33

    V.1.2. Physical simplifications .................................................................................... 34

    V.1.3. Unknown input signal ...................................................................................... 35

    V.1.4. Model parameter variation ............................................................................... 35

    V.1.5. Model noise ...................................................................................................... 37

    V.1.6. Typical failures ................................................................................................. 37

    V.1.7. Physical parameters of the vehicle lateral dynamical model ........................... 38

V.1.7.1. System variables ........................................................................... 38

V.1.7.2. Sensor noise data .......................................................................... 38

V.1.8. Reference .......................................................................................... 39

    V.2. Test 1 ........................................................................................................................ 39

    V.3. Test 2 ........................................................................................................................ 42

    VI. Conclusions .................................................................................................................. 45

    VII. Bibliography and references ......................................................................................... 46


    I. Neuro-fuzzy systems

Neural networks and fuzzy systems, as alternative methods in data processing, are suited for modelling intelligent behaviour and mimic the actions of a human expert capable of solving complex problems.

    This goal is achieved through observation and learning instead of classical mathematical

    modelling (using classical laws of physics, chemistry, biology, economics and many more).

Considering the previous remarks, the process of knowledge assimilation has the leading role. According to knowledge classification, assimilation has three major components:

- interviewing and observing - suited for knowledge that can be expressed as a set of rules;
- instruction;
- learning.

I.1. Neural Networks

    Neural networks are systems that try to make use of some of the known or expected

    organizing principles of the human brain. They consist of a number of independent, simple

    processors - the neurons. These neurons communicate with each other via weighted

    connections.

    At first, research in this area was driven by neurobiological interests. The modelling of single

    neurons and the so-called learning rules for modifying synaptic weights were the initial

    research topics.

    Modern research in neural networks considers the development of architectures and learning

    algorithms, and examines the applicability of these models to information processing tasks.

Although there are still many researchers who model biological neural networks with artificial neural networks to learn more about the structure of the human brain and the way it works, biological plausibility is usually neglected and only the problem of information processing with artificial neural networks is considered. These models have in common that they are based on rather simple processing units, or neurons, exchanging information via weighted connections.

    Different types of neural networks can solve different problems, like pattern recognition,

    pattern completion, determining similarities between patterns or data - also in terms of

    interpolation or extrapolation - and automatic classification.

    Learning in neural networks means to determine a mapping from an input to an output space

    by using example patterns. If the same or similar input patterns are presented to the network

    after learning, it should produce an appropriate output pattern.

    Neural networks can be used if training data is available. It is not necessary to have a

    mathematical model of the problem of interest, and there is no need to provide any form of

    prior knowledge. On the other hand the solution obtained from the learning process usually

    cannot be interpreted.

Although there are some approaches for extracting rules from neural networks, most neural network architectures are black boxes. It cannot be checked whether their solution is plausible, i.e. their final state cannot be interpreted in terms of rules. This also means that a neural network usually cannot be initialized with prior knowledge, even if such knowledge is available, and


    thus the network must learn from scratch. The learning process itself can take very long,

    and there is usually no guarantee of success.

The following table synthesizes the advantages and drawbacks of neural networks:

Advantages:
- No mathematical model is needed;
- No prior knowledge is required;
- Many training methods have been developed.

Disadvantages:
- Black-box-like system: usually, the manner in which the results are obtained is not interpretable;
- Adaptation to environment changes may prove difficult, so retraining is needed;
- Prior knowledge, if it exists, cannot be used;
- The training process is not guaranteed to converge.

Table 1

    I.2. Fuzzy Systems

    When using fuzzy set theory, it is easy to model the fuzzy boundaries of linguistic terms by

    introducing gradual memberships. In contrast to classical set theory, in which an object or a

    case either is a member of a given set (defined, e.g., by some property) or not, fuzzy set

    theory makes it possible that an object or a case belongs to a set only to a certain degree.

    Interpretations of membership degrees include similarity, preference, and uncertainty. They

    can state how similar an object or case is to a prototypical one, they can indicate preferences

    between sub-optimal solutions to a problem, or they can model uncertainty about the true

    situation, if this situation is described in imprecise terms. In general, due to their closeness to

    human reasoning, solutions obtained using fuzzy approaches are easy to understand and to

    apply.

    Due to these strengths, fuzzy systems are the method of choice, if linguistic, vague, or

    imprecise information has to be modelled.

Fuzzy systems are based on if-then rules. The antecedent of a rule consists of fuzzy descriptions of the input values, and the consequent defines a - possibly fuzzy - output value for the given input. The benefits of these fuzzy systems lie in the suitable knowledge representation. But problems may arise when fuzzy concepts have to be represented by concrete membership degrees which guarantee that the fuzzy system works as expected.

    A fuzzy system can be used to solve a problem if knowledge about the solution is available in

    the form of linguistic if-then rules. By defining suitable fuzzy sets to represent linguistic

    terms used within the rules, a fuzzy system can be created from these rules.

No formal model of the problem of interest and no training data are required.


Let's summarize in the following table:

Advantages:
- No mathematical model is required;
- Prior knowledge represented as if-then rules may be used;
- Ease of implementation;
- If-then rules facilitate the interpretation of results.

Disadvantages:
- If-then rules are a prerequisite;
- No learning capabilities;
- There are no standard methods for parameter adjustment;
- Difficulties in interpreting the results may occur;
- Adaptation to a changing environment could be difficult;
- Improvements are not guaranteed by parameter adaptation.

Table 2

    I.3. Combining Neural Networks and Fuzzy Systems

Presently, the neuro-fuzzy approach is becoming one of the major areas of interest because it combines the benefits of neural networks and fuzzy logic systems and removes their individual disadvantages by exploiting their common features.

    Different architectures of neuro-fuzzy system have been investigated. These architectures

    have been applied in many applications especially in the process control.

Neural networks and fuzzy logic have some common features, such as distributed representation of knowledge, model-free estimation, and the ability to handle data with uncertainty and imprecision.

Fuzzy logic has tolerance for imprecision of data, while neural networks have tolerance for noisy data. A neural network's learning capability provides a good way to adjust expert knowledge, and it automatically generates additional fuzzy rules and membership functions to meet certain specifications. This reduces the design time and cost.

    On the other hand, the fuzzy logic approach possibly enhances the generalization capability of

    a neural network by providing more reliable output when extrapolation is needed beyond the

    limits of the training data.

    The basic idea of combining fuzzy systems and neural networks is to design an architecture

    that uses a fuzzy system to represent knowledge in an interpretable manner and the learning

    ability of a neural network to optimize its parameters.

    The drawbacks of both of the individual approaches - the black box behaviour of neural

    networks, and the problems of finding suitable membership values for fuzzy systems - could

    thus be avoided.

    A combination can constitute an interpretable model that is capable of learning and can use

    problem-specific prior knowledge. Therefore, neuro-fuzzy methods are especially suited for

    applications, where user interaction in model design or interpretation is desired.


    I.3.1. Neuro-Fuzzy Systems Characteristics

Although there are a lot of different approaches, usually the term neuro-fuzzy system is used for approaches which display the following properties:

- A neuro-fuzzy system is based on a fuzzy system which is trained by a learning algorithm derived from neural network theory. The (heuristic) learning procedure operates on local information, and causes only local modifications in the underlying fuzzy system.

- A neuro-fuzzy system can be viewed as a 3-layer feed-forward neural network. The first layer represents input variables, the middle (hidden) layer represents fuzzy rules, and the third layer represents output variables. Fuzzy sets are encoded as (fuzzy) connection weights. It is not necessary to represent a fuzzy system like this to apply a learning algorithm to it; however, it can be convenient, because it represents the data flow of input processing and learning within the model. (Sometimes a 5-layer architecture is used, where the fuzzy sets are represented in the units of the second and fourth layers.)

- A neuro-fuzzy system can always (i.e. before, during and after learning) be interpreted as a system of fuzzy rules. It is possible to create the system from training data from scratch, as it is possible to initialize it with prior knowledge in the form of fuzzy rules. (Not all neuro-fuzzy models specify learning procedures for fuzzy rule creation.)

- The learning procedure of a neuro-fuzzy system takes the semantic properties of the underlying fuzzy system into account. This results in constraints on the possible modifications applicable to the system parameters. (Not all neuro-fuzzy approaches have this property.)

- A neuro-fuzzy system approximates an n-dimensional (unknown) function that is partially defined by the training data. The fuzzy rules encoded within the system represent vague samples, and can be viewed as prototypes of the training data. A neuro-fuzzy system should not be seen as a kind of (fuzzy) expert system, and it has nothing to do with fuzzy logic in the narrow sense.
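The layered view described above can be illustrated with a minimal forward pass. The sketch below is hypothetical (Gaussian membership functions, product firing strengths, normalised zero-order Sugeno-style output) and is not the implementation developed in this project:

```python
import math

def gauss(x, c, s):
    # Gaussian membership degree of x in a fuzzy set with centre c and width s.
    return math.exp(-((x - c) / s) ** 2)

def nfs_forward(x, rules):
    # rules: list of (centres, widths, consequent), one entry per hidden "rule" unit.
    # Layer 1: inputs; layer 2: rule firing strengths (product of memberships);
    # layer 3: normalised weighted sum of the rule consequents.
    strengths = []
    for centres, widths, _ in rules:
        w = 1.0
        for xi, c, s in zip(x, centres, widths):
            w *= gauss(xi, c, s)
        strengths.append(w)
    total = sum(strengths)
    return sum(w * cons for w, (_, _, cons) in zip(strengths, rules)) / total

rules = [
    ([0.0, 0.0], [1.0, 1.0], 0.0),   # "if x1 is low and x2 is low then y = 0"
    ([1.0, 1.0], [1.0, 1.0], 1.0),   # "if x1 is high and x2 is high then y = 1"
]
print(nfs_forward([0.0, 0.0], rules))  # close to 0
print(nfs_forward([1.0, 1.0], rules))  # close to 1
```

The fuzzy sets (centres and widths) play the role of the connection weights mentioned above, which is what the learning algorithm would adjust.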

    I.3.2. Neuro-fuzzy Systems properties

From the fuzzy system's point of view, the main advantage is the learning capability; for neural networks, the use of prior knowledge for the initial conditions is an obvious gain, potentially speeding up the training process. Also, the if-then rule set offers the possibility of interpretable results.

    The architecture of a neuro-fuzzy system is determined by the rules and fuzzy sets specific to

    the problem. In this case there is no need to specify some network parameters, like the

    number of hidden layers.

Unfortunately, combining the two types of systems does not guarantee success, because training convergence is not assured.

An often-used method for training fuzzy systems consists in representing the system as a neural network and applying the specific training algorithms (i.e. backpropagation). This approach implies some alteration of the structure and/or algorithm, because the functions used for fuzzy inference (i.e. min, max) are not differentiable. There are two possibilities:

- replace the specific fuzzy inference functions with differentiable ones (in this case there may be a loss of interpretability of the results);
- use training methods that are not gradient-based.
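The first option can be illustrated with smooth surrogates for MIN and MAX. The beta-parameterised soft-minimum below is one common choice and is only a sketch; the text does not prescribe a particular replacement:

```python
import math

def softmin(xs, beta=20.0):
    # Smooth, differentiable approximation of MIN: each value is weighted by
    # exp(-beta * x); a larger beta gives a result closer to the exact minimum.
    ws = [math.exp(-beta * x) for x in xs]
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)

def softmax_val(xs, beta=20.0):
    # Smooth approximation of MAX, obtained by negating the soft-minimum
    # (not the neural-network "softmax" probability distribution).
    return -softmin([-x for x in xs], beta)

print(softmin([0.2, 0.8, 0.5]))      # close to 0.2
print(softmax_val([0.2, 0.8, 0.5]))  # close to 0.8
```

Because both surrogates are differentiable everywhere, gradient-based training such as backpropagation can be applied, at the cost of some interpretability, as noted above.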


Modern neuro-fuzzy systems are usually modelled as multi-layer feed-forward neural networks. Here we can mention:

- ANFIS (Adaptive-Network-based Fuzzy Inference System): a Sugeno fuzzy system is modelled as a 5-layer, feed-forward neural network;
- GARIC (Generalized Approximate Reasoning-based Intelligent Control): implements a fuzzy controller using specialized feed-forward neural networks.

Of course, there are other principles too, such as those used for fuzzy associative memories and self-organizing feature maps (which transform an input of arbitrary dimension into a one- or two-dimensional discrete map, subject to a topological neighbourhood-preserving constraint).

Advantages:
- No mathematical model is necessary;
- Prior knowledge represented as if-then rules is not required, but it can be used;
- There are many training/learning algorithms;
- Ease of implementation;
- Interpretable results.

Disadvantages:
- Training convergence is not guaranteed.

Table 3

I.3.3. Types of neuro-fuzzy systems

Usually there are two major combinations of neural networks and fuzzy systems.

In the first case, a neural network and a fuzzy system work independently. The neural network is used to determine some parameters of the fuzzy system; the adjustment can be online or offline. Because of this co-working, these systems are named cooperative neuro-fuzzy systems.

    In the second case, we have hybrid neuro-fuzzy systems, where a homogeneous neural

    network-like architecture is obtained by interpreting a fuzzy system as a neural network or by

    directly implementing it as one.


    II. Time Series

Inertia is an intrinsic property of all systems, so observing their previous behaviour may provide useful information that allows deriving laws and rules about their evolution.

A time series is a sequence of observations (data points), typically measured at successive times spaced at uniform intervals.

There are two main goals of time series analysis:

- identifying the nature of the phenomenon represented by the sequence of observations;
- forecasting (predicting future values of the time series variable).

    Both of these goals require that the pattern of observed time series data is identified and more

    or less formally described. Once the pattern is established, we can interpret and integrate it

    with other data (i.e., use it in our theory of the investigated phenomenon).

    Regardless of the depth of our understanding and the validity of our interpretation (theory) of

    the phenomenon, we can extrapolate the identified pattern to predict future events.

    If a time series can be predicted, then it is called deterministic. Usually, in practice, time

    series are stochastic processes due to noisy observations. Consequently future observations

    are determined only in part by past values.

    II.1. Time series classification

Essentially, the aspects characterizing a time series determine the criterion, or set of criteria, used for classification.

    II.1.1. Continuity of measuring

If it is possible to measure the values of a time series at any given time, then the time series is called continuous.

    In practice, generally, time series contain observations made at predetermined moments,

    separated by constant time intervals. In this case we have discrete time series.

    A continuous time series can be transformed into a discrete one by measuring its values at

    equally distant moments, thus obtaining a sampled time series.

    Another type is the integrative (cumulative) time series, where measuring is possible only for

    cumulative values of a variable (i.e. rainfall quantities).

    II.1.2. Number of variables

    A time series that contains values of a single variable is named monovariable. Otherwise it is

    a multivariable time series.

    II.1.3. Linearity

Linearity of a time series can be determined using methods such as the Hurst coefficient, the Lyapunov (characteristic) exponent, and the correlation dimension.

    II.1.4. Stationarity

A stationary process has the property that the mean, variance and autocorrelation structure do not change over time.


Usually, time series are non-stationary, so methods were developed to transform a non-stationary time series into a stationary one.

1. Difference the data. That is, given the series $Z_i$, we create the new series $Y_i = Z_i - Z_{i-1}$.

The differenced data will contain one point fewer than the original data. Although you can difference the data more than once, one difference is usually sufficient.

2. If the data contain a trend, we can fit some type of curve to the data and then model the residuals from that fit. Since the purpose of the fit is simply to remove long-term trend, a simple fit, such as a straight line, is typically used.

3. For non-constant variance, taking the logarithm or square root of the series may stabilize the variance. For negative data, you can add a suitable constant to make the entire series positive before applying the transformation. This constant can then be subtracted from the model to obtain predicted (i.e. fitted) values and forecasts for future points.
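Transformations 1 and 3 can be sketched in a few lines; the helper names below are illustrative only:

```python
import math

def difference(z):
    # First difference: Y_i = Z_i - Z_{i-1}; the result has one point fewer.
    return [z[i] - z[i - 1] for i in range(1, len(z))]

def log_stabilize(z):
    # Log transform for non-constant variance. Negative data is shifted
    # positive first; the shift must be subtracted back from the fitted
    # values and forecasts, as described above.
    shift = max(0.0, 1.0 - min(z))
    return [math.log(v + shift) for v in z], shift

trend = [2 * t + 1 for t in range(5)]   # linear trend: 1, 3, 5, 7, 9
print(difference(trend))                # constant series: [2, 2, 2, 2]
```

Differencing a linear trend yields a constant series, which is why one difference often suffices for trend removal.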

    II.2. Prediction strategies

    II.2.1. Prerequisites

Before presenting some prediction strategies, we need to define how to organize the information given by the time series in order to use it efficiently.

1. Data sets

Usually, time series data is divided into three sets:

- Training set: used to train the prediction system; by means of trial and error, the dimension of the training set is varied until an optimal size is reached;
- Validation set: used for monitoring the training process, to make sure that the prediction system has not over-learned the training set;
- Testing set: used after training to study the performance of the prediction system.

2. Sample delay and window size

Given a time series $x_t, x_{t-1}, \dots, x_{t-i}, \dots$ and considering that we should predict $x_{t+n}$, one must decide how many samples are used (this is called the window size) and how the data is sampled. For a time series $x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-ik}, \dots$, $k$ is called the sample delay. Both parameters are determined experimentally.
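Assuming the notation above, building the lagged input/target pairs could look like this (the function name and defaults are illustrative, not part of the toolbox):

```python
def make_windows(series, window, delay=1, horizon=1):
    # Builds (input, target) pairs from a time series: each input holds
    # `window` samples spaced `delay` apart (the sample delay k), and the
    # target is the value `horizon` steps ahead.
    pairs = []
    span = (window - 1) * delay
    for t in range(span, len(series) - horizon):
        inputs = [series[t - j * delay] for j in range(window)]
        pairs.append((inputs, series[t + horizon]))
    return pairs

series = [0, 1, 2, 3, 4, 5, 6]
print(make_windows(series, window=3, delay=2, horizon=1))
```

Varying `window` and `delay` and comparing the resulting prediction errors is one way to carry out the experimental tuning mentioned above.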

3. Measure of prediction error

Given a time series $x_1, x_2, x_3, \dots, x_n$ with mean value $x_{mean}$, the standard deviation

$x_{sd} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - x_{mean})^2}$   (1)

is used as the prediction error, given that the prediction is always the mean value. Note that, for random time series, the mean value is the best prediction.

A second method for measuring the prediction error starts from a series of predicted values $\hat{x}_1, \hat{x}_2, \dots, \hat{x}_n$; the prediction error is defined as

$Error = \frac{x_{error}}{x_{sd}}$, where $x_{error} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \hat{x}_i)^2}$   (2)

In this case, the prediction quality is measured as the improvement relative to the prediction of the mean value.
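Equations (1) and (2) translate directly into code; this is a sketch with illustrative names:

```python
import math

def std_dev(x):
    # Equation (1): sample standard deviation around the mean,
    # i.e. the error of always predicting the mean value.
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / (len(x) - 1))

def prediction_error(x, x_hat):
    # Equation (2): deviation of the predictions x_hat from the actual
    # values, normalised by the standard deviation. A value below 1 means
    # the predictor improves on the mean-value prediction.
    x_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / (len(x) - 1))
    return x_err / std_dev(x)

x = [1.0, 2.0, 3.0, 4.0]
print(prediction_error(x, [1.1, 2.1, 2.9, 4.0]))  # well below 1
```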

II.2.2. Strategies

One approach consists in constructing a function $F_n$ that can directly predict $x_{t+n}$:

$x_{t+n} = F_n(x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-ik})$   (3)

Another way is to construct a function $F_I$,

$\hat{x}_{t+1} = F_I(x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-Nk})$

which can predict one step ahead, $\hat{x}_{t+1}$, and then apply this function iteratively $n$ times.

Finally, one can construct a function $F_1$ to predict $x_{t+1}$, then retrain it, so that $F_2$ is obtained that predicts $x_{t+2}$, and so on:

$\hat{x}_{t+1} = F_1(x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-ik})$

$\hat{x}_{t+2} = F_2(x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-ik})$

$\dots$


III. Neuro-Fuzzy Systems in Time Series Analysis

Classical time series analysis has two major components:

1. Time domain analysis - usually more profitable for stochastic signals - uses correlation techniques to study signal characteristics. The correlation of two signals is defined as:

$r_{xy}(t) = \int_{-\infty}^{\infty} x(\tau)\, y(\tau - t)\, d\tau$

In particular, when a function is correlated with itself, the operator is called autocorrelation; otherwise it is called cross-correlation.

Correlation determines how much similarity there is between the two argument functions. Some of the general properties are:

- For autocorrelation, the maximum value always occurs at $t = 0$, and the function decreases (or stays constant) as $t$ approaches infinity;
- The greater the area under the correlation curve, the greater the similarity between the two signals.

2. Frequency domain analysis - best suited for periodic signals - is based on the Fourier transform.

Suppose $x$ is a complex-valued, Lebesgue-integrable function. The Fourier transform to the frequency domain is given by the function

$X(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$, with $\omega \in \mathbb{R}$

Fourier analysis also uses Fourier series, the discrete-time Fourier transform, and the discrete Fourier transform.
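As a small worked example of frequency domain analysis, the discrete Fourier transform of a pure sinusoid concentrates its energy in the bin matching the signal frequency. The naive O(N^2) DFT below is for illustration only:

```python
import cmath
import math

def dft(x):
    # Discrete Fourier transform: X[k] = sum_t x[t] * exp(-j*2*pi*k*t/N).
    N = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / N) for t in range(N))
            for k in range(N)]

# A sinusoid with 2 cycles over 8 samples puts its energy in bins 2 and 6
# (bin 6 is the mirror-image negative frequency), each with magnitude N/2.
N = 8
signal = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
mags = [abs(c) for c in dft(signal)]
print(round(mags[2], 6))  # 4.0 (= N/2)
print(round(mags[1], 6))  # 0.0
```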

    On the other hand new approaches such as neural networks and especially neuro-fuzzy

    systems offer ways for modelling a systems behaviour using artificial intelligence specific

    techniques.

Neuro-fuzzy systems are more suitable for modelling large-scale, complex systems that would otherwise impose the use of very complex and hard-to-compute mathematical equations.

Nowadays, in many cases, the classical methods are, sometimes by far, more efficient than neuro-fuzzy modelling, but the research community is seeking new, improved and efficient ways of modelling and training these structures.

    The scope of this project is to explore the capabilities of a dynamic neuro-fuzzy structure in

    system identification, which is the first step towards control.


    III.1. Fuzzy Neurons

The simplest neuro-fuzzy system is the fuzzy neuron, which implements some basic fuzzy-logic functions.

    Figure 1 The Fuzzy Neuron

For example, f can be one of the MIN or MAX functions:

y_min = f(x_1, ..., x_i, ..., x_n) = MIN(x_1, ..., x_i, ..., x_n)
y_max = f(x_1, ..., x_i, ..., x_n) = MAX(x_1, ..., x_i, ..., x_n)

Using fuzzy logical neurons, the output is more or less influenced by the values of the inputs. This influence depends on both the weights and the fusion operation:

for a neuron of type AND, the influence of the inputs having a weak weight is most important;

for a neuron of type OR, the inputs whose weight is significant are rather taken into account.
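A minimal sketch of such a fuzzy logic neuron (Python, for illustration only; the toolbox itself is in MATLAB): the inputs are membership degrees in [0, 1] and f is the MIN or MAX fusion function:

```python
# Fuzzy logic neuron: f is the MIN (AND-type) or MAX (OR-type) fusion function.
def fuzzy_neuron(inputs, kind="AND"):
    if kind == "AND":
        return min(inputs)   # y_min = MIN(x_1, ..., x_n)
    return max(inputs)       # y_max = MAX(x_1, ..., x_n)

print(fuzzy_neuron([0.2, 0.7, 0.5], "AND"))  # -> 0.2
print(fuzzy_neuron([0.2, 0.7, 0.5], "OR"))   # -> 0.7
```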

    III.2. Neurons with fuzzy weights

    Another way to fuzzify the neuron model is the use of fuzzy weights instead of crisp values.

Fuzzy weights are interpreted as membership functions; thus the linear synaptic connections are replaced with non-linearities labelled as loosely connected or tightly connected.

Excitatory or inhibitory connections are represented through fuzzy intersection or through fuzzy complement followed by fuzzy intersection.

a. Conventional neuron with fuzzy weights

Considering the standard neuron

[Figure: standard neuron with inputs x1, ..., xi, ..., xn, weights w1, ..., wi, ..., wn and output y]


where x1, x2, ..., xn are the inputs, w1, w2, ..., wn the weights, and y the output; the neuron also has a bias (offset). Positive weights are excitatory and negative ones inhibitory. The model has a single parameter for each synapse and one non-linear function f. Due to its simplicity, more complex functions are achieved with large, complicated architectures.

A more powerful neuron model is obtained when fuzzy weights are used and, more importantly, when the inputs and outputs are also membership functions (take fuzzy values). Such a model is the Yamakawa neuron, which is presented in the next section.

b. Direct fuzzification of neural networks

In this case, inputs and/or outputs and/or weights are generalized to fuzzy values. The following table presents all possible combinations:

Fuzzy Neural Network    Weights    Inputs    Outputs
Type 1                  Crisp      Fuzzy     Crisp
Type 2                  Crisp      Fuzzy     Fuzzy
Type 3                  Fuzzy      Fuzzy     Fuzzy
Type 4                  Fuzzy      Crisp     Fuzzy
Type 5                  Crisp      Crisp     Fuzzy
Type 6                  Fuzzy      Crisp     Crisp
Type 7                  Fuzzy      Fuzzy     Crisp

Type 1 networks were used to classify fuzzy input vectors into crisp classes; types 2-4 were used to implement fuzzy if-then rules.

According to some research, types 5-7 cannot be implemented. For type 5 the output will always be crisp, while for types 6 and 7 there is no need to fuzzify the weights.

    III.3. The Yamakawa Neuron Model

Let's consider the linear combinatory model:

    Figure 2 The Linear Combinator

The Yamakawa neuron is derived from the model above, where the weights a_i are replaced by non-linear functions implemented with Sugeno fuzzy systems (SISO: Single Input, Single Output) and the bias a_0 is set to zero.



    Figure 3 The Yamakawa Model

    The structure of the non-linear synapse is presented in the next figure:

    Figure 4 The Yamakawa Model Synapse Structure (Sugeno Fuzzy System)

and we have:

f_i(x_i) = [ Σ_{j=1}^{m} g_{i,j}(x_i) · w_{i,j} ] / [ Σ_{j=1}^{m} g_{i,j}(x_i) ]

where
g_{i,j} - the j-th membership function of the i-th Sugeno fuzzy system
w_{i,j} - the j-th variable weight of the i-th Sugeno fuzzy system
m - the number of membership functions



    III.4. The Dynamic Yamakawa Neuron Model

The Yamakawa model offers more computational power but is still a static structure. In order to be more practical, one can transform it into a dynamic structure. Considering that, nowadays, most time series met in practice are discrete, ARMA (Auto-Regressive, Moving Average) filters prove to be suitable for modelling dynamic behaviour.

    The proposed structure is represented in the following figure:

    Figure 5 Dynamic Synapse

    Reconsidering the Yamakawa model, the following representation is obtained:

    Figure 6 The Dynamic Yamakawa Model

    Lets consider a second-order ARMA filter with the following structure:

    Figure 7 Structure of a second order ARMA filter



The input-output transfer of the ARMA filter is given by:

y(k) = [ (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)

where q^{-1} is the delay operator.
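Expanding the delay operator, the filter is the difference equation y(k) = b0·x(k) + b1·x(k-1) + b2·x(k-2) - a1·y(k-1) - a2·y(k-2). A small illustrative Python sketch (zero initial conditions assumed; the toolbox itself is MATLAB):

```python
# Second-order ARMA filter as a difference equation with zero initial conditions.
def arma_filter(x, b, a):
    b0, b1, b2 = b
    a1, a2 = a
    y = []
    for k in range(len(x)):
        xk1 = x[k - 1] if k >= 1 else 0.0
        xk2 = x[k - 2] if k >= 2 else 0.0
        yk1 = y[k - 1] if k >= 1 else 0.0
        yk2 = y[k - 2] if k >= 2 else 0.0
        y.append(b0 * x[k] + b1 * xk1 + b2 * xk2 - a1 * yk1 - a2 * yk2)
    return y

# impulse response of y(k) = x(k) + 0.5*y(k-1), i.e. a1 = -0.5
print(arma_filter([1.0, 0.0, 0.0, 0.0], (1.0, 0.0, 0.0), (-0.5, 0.0)))
# -> [1.0, 0.5, 0.25, 0.125]
```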

For the Sugeno fuzzy systems, Gaussian membership functions are considered, with centres uniformly distributed in [-1, 1]:

y(k) = [ Σ_{i=1}^{m} β_i e^{-(x(k) - c_i)² / (2σ_i²)} ] / [ Σ_{i=1}^{m} e^{-(x(k) - c_i)² / (2σ_i²)} ]

β_i - output singletons (variable weights);
σ_i - dispersions of the Gaussian membership functions;
c_i - centres of the Gaussian membership functions;
m - number of membership functions;

In order to write the input-output transfer equation for the entire system, we need to agree on some notations:

Figure 8 The Yamakawa Dynamic Model with signal notations

y(k) = [ (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] s(k)

s(k) = Σ_{i=1}^{n} v_i(k)

v_i(k) = [ (b^{2i}_0 + b^{2i}_1 q^{-1} + b^{2i}_2 q^{-2}) / (1 + a^{2i}_1 q^{-1} + a^{2i}_2 q^{-2}) ] w_i(k),  1 ≤ i ≤ n

w_i(k) = [ Σ_{j=1}^{m} β_{i,j} e^{-(z_i(k) - c_{i,j})² / (2σ_{i,j}²)} ] / [ Σ_{j=1}^{m} e^{-(z_i(k) - c_{i,j})² / (2σ_{i,j}²)} ],  1 ≤ i ≤ n



z_i(k) = [ (b^{1i}_0 + b^{1i}_1 q^{-1} + b^{1i}_2 q^{-2}) / (1 + a^{1i}_1 q^{-1} + a^{1i}_2 q^{-2}) ] x_i(k),  1 ≤ i ≤ n


    IV. The MATLAB Implementation

Having studied this dynamic structure's capabilities during my licence degree project, the purpose of the current project is to make the training process more efficient (concerning duration, performance and ease of use). The objective is to find those algorithms, strategies and training parameters that ensure fast training, maximum performance and a minimum of user-adjustable parameters / variables.

    IV.1. Performance evaluation

The structure proposed in this paper is intended to model a dynamic system using techniques specific to the neuro-fuzzy paradigm. Thus, evaluating the modelling performance based on the output estimation error comes naturally.

In this case, the mean square error is considered:

J(k, θ) = (1/2) · e²(k, θ) = (1/2) · [y_d(k) - y(k, θ)]²

where θ is the neuro-fuzzy system parameter vector.

    IV.2. Training algorithms

    The training algorithms are implemented as sets of functions designed to alter the parameters

    of the neuro-fuzzy system according to the criterion above.

Two types of algorithms were considered: first, gradient-descent based algorithms, specific for training neural networks, and second, a genetic algorithm.

Both types have the same logical structure:

scale data_sets
while not stop
    output = evaluate(structure)
    error = target_output - output
    modify parameters
    if stop_criterion
        stop = true
    else
        stop = false
    end if
end while


Data scaling is made according to the following procedure:

training_min = min(training_set)
training_max = max(training_set)

scaled_training_set = 2 · (training_set - training_min) / (training_max - training_min) - 1
scaled_validation_set = 2 · (validation_set - training_min) / (training_max - training_min) - 1
scaled_testing_set = 2 · (testing_set - training_min) / (training_max - training_min) - 1

Although the training set values will be in [-1, 1], the validation and testing sets may exceed this interval. This scaling procedure assures that the transformation is bijective.
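A sketch of the procedure in Python (illustrative only; the toolbox is MATLAB): the affine map is fitted on the training set alone and then reused unchanged for the other sets:

```python
# Fit the [-1, 1] scaling on the training set; reuse it for validation/testing.
def make_scaler(training_set):
    lo, hi = min(training_set), max(training_set)
    def scale(data):
        return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in data]
    return scale

scale = make_scaler([0.0, 5.0, 10.0])
print(scale([0.0, 5.0, 10.0]))      # -> [-1.0, 0.0, 1.0]
print(scale([12.0])[0] > 1.0)       # -> True: a value outside the training range exceeds [-1, 1]
```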

    IV.2.1. Gradient-based algorithms (Backpropagation)

This class of algorithms uses the following formula for updating the system parameters:

θ(k+1) = θ(k) - η · ∂J(k, θ)/∂θ    (4)

where η is an algorithm-specific parameter called the learning rate.

Considering the expression for J(k, θ), the above formula becomes:

θ(k+1) = θ(k) + η · e(k) · ∂y(k, θ)/∂θ    (5)

So, besides the learning rate and the estimation error, the parameter variation depends only on the derivative of the system's output.

    Remark

For an ARMA filter having the input-output transfer described by

y(k) = [ (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)    (6)

    the following expressions were derived:

derivative of output with respect to input:

∂y/∂x (k) = (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2})    (7)

derivative of output with respect to denominator coefficients:

∂y/∂a_i (k) = - [ q^{-i} / (1 + a_1 q^{-1} + a_2 q^{-2}) ] y(k),  i = 1, 2    (8)


derivative of output with respect to numerator coefficients:

∂y/∂b_i (k) = [ q^{-i} / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k),  i = 0, 1, 2    (9)

For this revision of the toolbox, a pole-zero factorized expression was preferred for the ARMA filters. This allows a better control of the filters' behaviour.

The following expressions were considered:

y(k) = k_p · [ (1 - z_1 q^{-1})(1 - z_2 q^{-1}) / ((1 - p_1 q^{-1})(1 - p_2 q^{-1})) ] x(k)    (10)

y(k) = k_p · [ (1 - (z_1 + z_2) q^{-1} + z_1 z_2 q^{-2}) / (1 - (p_1 + p_2) q^{-1} + p_1 p_2 q^{-2}) ] x(k) = [ (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)    (11)

derivative of output with respect to input:

∂y/∂x (k) = k_p · (1 - (z_1 + z_2) q^{-1} + z_1 z_2 q^{-2}) / (1 - (p_1 + p_2) q^{-1} + p_1 p_2 q^{-2})    (12)

derivative of output with respect to gain:

∂y/∂k_p (k) = [ (1 - (z_1 + z_2) q^{-1} + z_1 z_2 q^{-2}) / (1 - (p_1 + p_2) q^{-1} + p_1 p_2 q^{-2}) ] x(k)    (13)

derivative of output with respect to poles:

∂y/∂a_i (k) = - [ q^{-i} / (1 + a_1 q^{-1} + a_2 q^{-2}) ] y(k),  i = 1, 2    (14)

∂y/∂p_1 = (∂y/∂a_1) · (∂a_1/∂p_1) + (∂y/∂a_2) · (∂a_2/∂p_1) = - ∂y/∂a_1 + p_2 · ∂y/∂a_2
∂y/∂p_2 = (∂y/∂a_1) · (∂a_1/∂p_2) + (∂y/∂a_2) · (∂a_2/∂p_2) = - ∂y/∂a_1 + p_1 · ∂y/∂a_2    (15)

(since a_1 = -(p_1 + p_2) and a_2 = p_1 p_2)

∂y/∂p_1 (k) = [ (q^{-1} - p_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] y(k)
∂y/∂p_2 (k) = [ (q^{-1} - p_1 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] y(k)    (16)


derivative of output with respect to zeros:

∂y/∂b_i (k) = [ q^{-i} / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k),  i = 0, 1, 2    (17)

∂y/∂z_1 = (∂y/∂b_1) · (∂b_1/∂z_1) + (∂y/∂b_2) · (∂b_2/∂z_1) = - k_p · ∂y/∂b_1 + k_p z_2 · ∂y/∂b_2
∂y/∂z_2 = (∂y/∂b_1) · (∂b_1/∂z_2) + (∂y/∂b_2) · (∂b_2/∂z_2) = - k_p · ∂y/∂b_1 + k_p z_1 · ∂y/∂b_2    (18)

(since b_0 = k_p, b_1 = -k_p (z_1 + z_2) and b_2 = k_p z_1 z_2)

∂y/∂z_1 (k) = - k_p · [ (q^{-1} - z_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)
∂y/∂z_2 (k) = - k_p · [ (q^{-1} - z_1 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)    (19)

For the Sugeno fuzzy systems the expressions considered are:

input-output transfer:

y(k) = [ Σ_{i=1}^{m} β_i e^{-(x(k) - c_i)² / (2σ_i²)} ] / [ Σ_{i=1}^{m} e^{-(x(k) - c_i)² / (2σ_i²)} ]    (20)

Let

P(k) = Σ_{i=1}^{m} β_i e^{-(x(k) - c_i)² / (2σ_i²)}    (21)

and

Q(k) = Σ_{i=1}^{m} e^{-(x(k) - c_i)² / (2σ_i²)}    (22)

Then

y(k) = P(k) / Q(k)    (23)


derivative of output with respect to input:

Considering the notations above, one can write

∂y/∂x (k) = [ (∂P/∂x)(k) · Q(k) - P(k) · (∂Q/∂x)(k) ] / Q²(k)    (24)

where

(∂P/∂x)(k) = - Σ_{i=1}^{m} β_i · [ (x(k) - c_i) / σ_i² ] · e^{-(x(k) - c_i)² / (2σ_i²)}    (25)

and

(∂Q/∂x)(k) = - Σ_{i=1}^{m} [ (x(k) - c_i) / σ_i² ] · e^{-(x(k) - c_i)² / (2σ_i²)}    (26)

derivative of output with respect to singletons:

From the input-output transfer expression, one can derive

∂y/∂β_i (k) = (∂P/∂β_i)(k) / Q(k),  i = 1, ..., m    (27)

having

(∂P/∂β_i)(k) = e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (28)

derivative of output with respect to centres:

The expressions above lead us to write

∂y/∂c_i (k) = [ (∂P/∂c_i)(k) · Q(k) - P(k) · (∂Q/∂c_i)(k) ] / Q²(k),  i = 1, ..., m    (29)

with

(∂P/∂c_i)(k) = β_i · [ (x(k) - c_i) / σ_i² ] · e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (30)

and

(∂Q/∂c_i)(k) = [ (x(k) - c_i) / σ_i² ] · e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (31)


derivative of output with respect to dispersions:

Some simple calculus and we have

∂y/∂σ_i (k) = [ (∂P/∂σ_i)(k) · Q(k) - P(k) · (∂Q/∂σ_i)(k) ] / Q²(k),  i = 1, ..., m    (32)

where the derivatives involved have the following expressions:

(∂P/∂σ_i)(k) = β_i · [ (x(k) - c_i)² / σ_i³ ] · e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (33)

(∂Q/∂σ_i)(k) = [ (x(k) - c_i)² / σ_i³ ] · e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (34)

    With notations from Figure 8 we can write the training formula for each component.

    For the algorithm itself, several options are available:

    i. Parameter alteration

Sometimes the training set will contain fewer but more relevant samples. In this case sequential training may be used; the parameter change then occurs after each sample is evaluated. Otherwise, batch training is used.

For batch training, suppose the training set has m samples; then the following expressions are considered:

θ(k+1) = θ(k) + η · mean_error · mean_derivative    (35)

where

mean_error = (1/m) · Σ_{t=1}^{m} e(t)    (36)

and

mean_derivative = (1/m) · Σ_{t=1}^{m} ∂y(t)/∂θ    (37)

    ii. Learning rate

    The implicit option is the modifiable but fixed learning rate, which means that for a

    training session the learning rate will be constant, being specified before starting the

    procedure.

Another option is the variable learning rate. In this case, during the training procedure, the learning rate is modified according to a previously established rule:

Annealing (gradually lowering)

In order to reach the minimum, and stay there, we must anneal (gradually lower) the global learning rate. A simple, non-adaptive annealing schedule for this purpose is the search-then-converge schedule:


η(k) = η_0 / (1 + k/T)    (38)

Its name derives from the fact that it keeps η nearly constant for the first T training patterns, allowing the network to find the general location of the minimum, before annealing η at a (very slow) pace that is known from theory to guarantee convergence to the minimum. The characteristic time T of this schedule is a new free parameter that must be determined by trial and error.
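The schedule (38) can be sketched as follows (illustrative Python; the values of η_0 and T are arbitrary examples):

```python
# Search-then-converge schedule: lr(k) = lr0 / (1 + k / T)
def search_then_converge(lr0, T):
    return lambda k: lr0 / (1.0 + k / T)

lr = search_then_converge(lr0=0.1, T=100.0)
print(lr(0))     # -> 0.1 (nearly constant while k << T)
print(lr(100))   # -> 0.05 (halved after T patterns)
print(lr(900))   # ~0.01: very slow annealing for k >> T
```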

Bold driver

A useful batch method for adapting the global learning rate is the bold driver algorithm. Its operation is simple: after each epoch, compare the network's loss e(k) to its previous value, e(k-1). If the error has decreased, increase η by a small proportion (typically 1%-5%). If the error has increased by more than a tiny proportion (say, 10^-10), however, undo the last parameter change and decrease η sharply, typically by 50%. Thus bold driver will keep growing η slowly until it finds itself taking a step that has clearly gone too far up onto the opposite slope of the error function. Since this means that the system has arrived in a tricky area of the error surface, it makes sense to reduce the step size quite drastically at this point.

Unfortunately, bold driver cannot be used in this form for online learning: the stochastic fluctuations in e(k) would hopelessly confuse the algorithm.
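A sketch of the bold driver rule (illustrative Python; the 5% growth and 50% shrink factors are the typical values quoted above, not mandatory choices):

```python
# Bold driver: adapt the global learning rate by comparing the loss
# after an epoch with its previous value.
def bold_driver(lr, prev_loss, loss, grow=1.05, shrink=0.5, tol=1e-10):
    if loss < prev_loss * (1.0 + tol):
        return lr * grow, True     # error decreased: grow lr by ~5%, keep the step
    return lr * shrink, False      # error increased: halve lr, undo the last step

lr, keep_step = bold_driver(0.1, prev_loss=1.0, loss=0.9)
print(round(lr, 4), keep_step)     # -> 0.105 True
lr, keep_step = bold_driver(0.1, prev_loss=1.0, loss=1.2)
print(round(lr, 4), keep_step)     # -> 0.05 False
```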

    iii. Momentum

Another technique that can help the system out of local minima is the use of a momentum term. This is probably the most popular extension of the backpropagation algorithm; it is hard to find cases where it is not used. With momentum m, the parameter update at a given moment k becomes:

Δθ(k) = f(k) + m · Δθ(k-1)    (39)

where f(k) is a factor depending on the current / mean error and the current / mean derivative, and 0 < m < 1 is a new global parameter which must be determined by trial and error. Momentum simply adds a fraction m of the previous parameter update to the current one.

When the gradient keeps pointing in the same direction, this will increase the size of the steps taken towards the minimum. It is therefore often necessary to reduce the global learning rate when using a lot of momentum (m close to 1); combining a high learning rate with a lot of momentum, the system will rush past the minimum with huge steps.

When the gradient keeps changing direction, momentum will smooth out the variations.
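Eq. (39) in an illustrative Python sketch: with a constant gradient term f(k), the update steps grow towards the limit f/(1-m):

```python
# Parameter update with momentum: delta(k) = f(k) + m * delta(k-1)
def momentum_step(theta, f_k, prev_delta, m=0.9):
    delta = f_k + m * prev_delta
    return theta + delta, delta

theta, delta = 0.0, 0.0
for _ in range(3):                   # gradient keeps pointing the same way
    theta, delta = momentum_step(theta, f_k=0.1, prev_delta=delta)
print(round(delta, 3))               # steps grow: 0.1, 0.19, then 0.271
```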

    iv. Stopping condition

For sequential training, the algorithm will stop after all data samples in the training set have been presented to the system.

When using batch training, the algorithm will stop after the specified number of epochs.


In both cases, there is an option to stop the training process when a desired (imposed) estimation error is reached, that is

|e(k)| ≤ e_max    (40)

IV.2.2. Genetic Algorithms

The training procedure using genetic algorithms implies the following steps:

Step 1. From the system parameters construct a vector with a predefined structure;
Step 2. Create a population consisting of a specified number (NPop) of randomly generated vectors like the one defined at the previous step; NPop is an algorithm-specific parameter;
Step 3. Initialize NGen (the number of generations);
Step 4. Evaluate the current population; for each individual the mean square estimation error is computed;
Step 5. Select a number of individuals for reproduction;
Step 6. Apply the crossover operator with a specified probability Pc;
Step 7. Apply the mutation operator with a specified probability Pm;
Step 8. Select NPop individuals from the extended population (parents and offspring);
Step 9. If NGen is reached go to the next step, else repeat from Step 4;
Step 10. From the final population select the best individual;
Step 11. Knowing the structure of an individual, set the parameters.
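The steps above can be sketched as a small genetic algorithm over real-valued parameter vectors (illustrative Python; the selection scheme and operators are deliberately simplistic, while the actual toolbox relies on the Sheffield GA Toolbox):

```python
import random

random.seed(0)  # reproducible illustration

def ga(fitness, dim, npop=20, ngen=50, pc=0.7, pm=0.2):
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(npop)]
    for _ in range(ngen):
        pop.sort(key=fitness)                    # Step 4: evaluate
        parents = pop[: npop // 2]               # Step 5: selection
        children = []
        while len(children) < npop - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, dim)
            child = a[:cut] + b[cut:] if random.random() < pc else a[:]  # Step 6
            if random.random() < pm:             # Step 7: mutation
                j = random.randrange(dim)
                child[j] += random.gauss(0.0, 0.1)
            children.append(child)
        pop = parents + children                 # Step 8: extended population
    return min(pop, key=fitness)                 # Step 10: best individual

best = ga(lambda v: sum(x * x for x in v), dim=3)  # a MSE-like fitness stands in
print(sum(x * x for x in best) < 0.5)              # -> True
```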

    Regarding the Sugeno fuzzy systems, when constructing individuals by simply gathering all

    parameters and putting them together, the crossover might produce worse individuals, due to

    the fact that it is possible for one or more Sugeno systems to have similar characteristics.

    To overcome this drawback, two methods were considered:

    i. Test for similarity

In this case, all vectors containing the centers of the Gaussian membership functions are clustered based on the relative distance between them. From each cluster a representative is chosen. Then, these representatives are used to construct the initial population.

The parameters of the clustering algorithm are chosen by trial and error.

    ii. Replace genetic algorithms with evolutionary strategies

Compared with genetic algorithms, in evolutionary strategies there is no crossover (or the probability of crossover is drastically lowered), the main operator being mutation; the population contains fewer individuals and a parent produces more offspring.

    In this case, Gaussian mutation will be applied. Gaussian mutation adds to each

    component of the individual a small quantity obtained from a Gaussian distribution.

    Due to the fact that implementation of evolutionary strategies is more complicated,

    this aspect is left for future development of the toolbox.
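Gaussian mutation itself is straightforward (illustrative Python; the mutation strength sigma is an assumed value):

```python
import random

# Add a small normally-distributed perturbation to every component.
def gaussian_mutation(individual, sigma=0.05):
    return [x + random.gauss(0.0, sigma) for x in individual]

random.seed(1)
mutant = gaussian_mutation([0.2, -0.4, 0.9])
print(mutant)  # each component moved by a small random amount
```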


    IV.3. Training strategies

Due to the fact that the structure implemented by this project has a relatively large number of parameters, it is more efficient to separate them by their power to affect the performance of the system.

    In this revision of the toolbox a two-stage training procedure is implemented.

    Mainly, the first stage works with the static version of the system, ARMA filters being

    characterized only by gain.

The second stage starts with the insertion of randomly generated poles and zeros (numbers in the [-1, 1] interval are generated to assure stability and minimum phase). After that, all parameters are adjusted, making sure at the same time that the performance is increased, thus not ruining the work done in the first stage.

    For the moment, two strategies were defined:

1) Stage 1 - gradient-descent batch training on the static structure;
   Stage 2 - gradient-descent batch training on the dynamic structure.

2) Stage 1 - genetic algorithms on the static structure;
   Stage 2 - gradient-descent batch training on the dynamic structure.

    Remark

During early tests it has been observed that adjustments made to the Sugeno systems' parameters sometimes drive the evaluation to NaN: when no membership function is activated, the denominator of the input-output transfer evaluates to 0 (or very close to 0, considering the error caused by number representation), and the final output of the system is the result of a division by 0.

It is also worth mentioning that the initialization of the fuzzy systems guarantees that at least one of the membership functions is activated.

To overcome this behaviour, when a Sugeno fuzzy system evaluates to NaN, the last parameter change is discarded.

    IV.4. Data structures

The current revision of the toolbox is implemented using the OOP capabilities of the MATLAB environment.

Although the previous implementation was also modular and scalable, it was quite hard to debug training-related problems. Encapsulation reduced the amount of debuggable code and ensured minimum spread and propagation of errors. Another issue solved by this approach is the speed of training and evaluation.

In the following sections the implemented classes are enumerated. Although implemented, standard methods like constructor, display, set and get will not be mentioned.


    IV.4.1. Class ARMA

A. Fields

Gain - gain of the filter (scalar)
Poles - poles of the filter (1-by-2 vector)
Zeros - zeros of the filter (1-by-2 vector)
Input - present and last 2 input samples (1-by-3 vector)
Output - present and last 2 output samples (1-by-3 vector)
Doutdin - present and last 2 samples of the derivative of output with respect to input (1-by-3 vector)
Doutdgain - present and last 2 samples of the derivative of output with respect to gain (1-by-3 vector)
Doutdpoles - present and last 2 samples of the derivative of output with respect to poles (2-by-3 vector)
Doutdzeros - present and last 2 samples of the derivative of output with respect to zeros (2-by-3 vector)

B. Methods

Evaluate - evaluates the output
Evalderiv - evaluates the output and the derivatives
Updategd - updates the filter parameters according to the gradient-descent expression
Reset - resets the initial conditions

    IV.4.2. Class SFS

A. Fields

Center - centers of the Gaussian membership functions
Beta - output singletons
Sigma - dispersions of the Gaussian membership functions
Doutdin - derivative of output with respect to input (scalar)
Doutdcenter - derivative of output with respect to centers (nmf-by-1 vector)
Doutdbeta - derivative of output with respect to singletons (nmf-by-1 vector)
Doutdsigma - derivative of output with respect to dispersions (nmf-by-1 vector)

where nmf represents the number of membership functions

B. Methods

Evaluate - evaluates the output


Evalderiv - evaluates the output and the derivatives
Updategd - updates the fuzzy system parameters according to the gradient-descent expression

    IV.4.3. Class BRANCH

A. Fields

Arma1 - input filter (ARMA object)
Sfs - Sugeno fuzzy system (SFS object)
Arma2 - output filter (ARMA object)

B. Methods

Evaluate - evaluates the output expression (calls the object-specific evaluate method)
Evalderiv - evaluates the output and derivatives expressions (calls the object-specific evalderiv method)
Updategd - updates the parameters according to the gradient-descent expression (calls the object-specific updategd method)
Reset - resets the initial conditions for arma1 and arma2
Initpz - initializes the arma1 and arma2 poles and zeros with random values in [-1, 1]

    IV.4.4. Class NFS

A. Fields

Branches - synapses (vector of BRANCH objects)
Arma - output filter (ARMA object)

B. Methods

Evaluate - evaluates the output expression (calls the object-specific evaluate method)
Evalderiv - evaluates the output and derivatives expressions (calls the object-specific evalderiv method)
Updategd - updates the parameters according to the gradient-descent expression (calls the object-specific updategd method)
Reset - resets the initial conditions for the ARMA object and each BRANCH object
Initpz - initializes the ARMA objects' poles and zeros with random values in [-1, 1] (calls the object-specific initpz method)
Train - trains the structure according to the specified input, target and strategy
Scale - scales the data sets (private method called by train)


Ga - implements a genetic algorithm (private method called by train)
Sequentialgd - implements a sequential gradient-descent training algorithm (private method called by train)
Batchgd - implements a batch gradient-descent training algorithm (private method called by train)

    IV.5. Future development

    IV.5.1. Evolutionary strategies

    Evolutionary strategies may prove to be more efficient in some cases than genetic algorithms,

    especially for final tuning of the parameters.

    Currently, genetic algorithms are implemented using a free, specialized toolbox developed by

    the Evolutionary Computation Research Group in the Department of Automatic Control and

    System Engineering from the University of Sheffield, UK.

    IV.5.2. Generating training data sets

Besides the training process itself, data preparation is the most time-consuming activity involved. Choosing the most relevant input signals and then the best-suited data samples for training is of crucial importance for successful training. Thus an automatic or supervised data preparation process could prove to be a useful tool.

Also, implementing such a process would help online training.

    IV.5.3. Graphical user interface

    After the completion of the toolbox, a specialized graphical user interface would increase the

    ergonomics in using this tool.

    IV.6. Resources

The MATLAB resources, besides the standard package, that were used to implement this project are:

Statistics Toolbox - data clustering functions and structures
Genetic Algorithms Toolbox - free toolbox, not included in the MATLAB packages, developed at the Department of Automatic Control and System Engineering of the University of Sheffield, UK (http://www.shef.ac.uk/acse/research/ecrg/gat.html).


    V. Testing and results

    V.1. Vehicle lateral dynamic model

    V.1.1. Model description

The model is a simplified one-track vehicle lateral dynamics linear model with roll.

    System structure:

one input δ*_L (steering angle),

two outputs a_y and r (lateral acceleration and yaw rate), and

two state variables β and r (slip angle and yaw rate).

    The model expression in state space form:

β̇ = -((C'_V + C'_H) / (m·v_ref))·β + ((C'_H·l_H - C'_V·l_V) / (m·v_ref²) - 1)·r + (C'_V / (m·v_ref))·δ*_L + (g / v_ref)·sin(φ_x)

ṙ = ((C'_H·l_H - C'_V·l_V) / I_z)·β - ((C'_V·l_V² + C'_H·l_H²) / (I_z·v_ref))·r + (C'_V·l_V / I_z)·δ*_L

with the measurement equations

a_y = -((C'_V + C'_H) / m)·β + ((C'_H·l_H - C'_V·l_V) / (m·v_ref))·r + (C'_V / m)·δ*_L

(the yaw rate r is measured directly)

[Figure: one-track model (3 DOF), showing the yaw rate r, the velocity v, the x-y axes and the centre of gravity CG]


    V.1.2. Physical simplifications

The vehicle lateral dynamics is a very complicated physical phenomenon; here we use a simplified model, the one-track model, to describe it. Some important assumptions have been made for the application of the one-track model:

1) the height of the centre of gravity is zero,
2) there is no pitch and roll motion, and
3) the model is purely linear.

    For the derivation of the lateral dynamics, a coordinate system is fixed to the centre of

    gravity. The equations of motion are described according to the force balances and torque

    balances at the centre of gravity.

    Therefore, from the application viewpoint, because of the one-track model's simplifications, especially in the tire model, it has been verified that the model is a good approximation of the vehicle dynamics only when the lateral acceleration a_y is smaller than 0.4g on normal dry asphalt roads [1]. It is also only valid for non-critical driving situations, because the pitch and roll motions have been neglected.



    V.1.3. Unknown input signal

    In this model there exists one unknown input signal, the road bank angle φ_x. This signal cannot be measured directly in the general vehicle system, so it is normally treated as an unknown input.

    V.1.4. Model parameter variation

    Vehicle reference velocity v_ref

    The system matrices A, B and C are functions of the vehicle reference velocity; therefore the system is strictly an LTV system. However, for the purpose of vehicle lateral dynamics research, the variation of the longitudinal vehicle velocity is comparatively slow, so it can be considered constant during one observation (such as a short time window of 1 second for the residual evaluation).

    Vehicle mass m

    As the load of the vehicle varies, the vehicle sprung mass and the inertia change accordingly. The changes are especially large for trucks; for a passenger car, however, the change caused by the number of passengers can normally be neglected compared to the large total mass.



    Vehicle cornering stiffness C

    Cornering stiffness is the change in lateral force per unit slip-angle change at a specified normal load, in the linear range of the tire:

    $$ C_\alpha = \frac{dF_y}{d\alpha} = f(F_z, \ldots) $$

    Nominal values for our research car (Mercedes-Benz S500):

    C'_V = 103600 [N/rad], C'_H = 179000 [N/rad]

    Actually, the tire sideslip stiffnesses C'_V and C'_H depend on the road-tire friction coefficient, wheel load, camber, toe-in, wheel pressure, etc., see [3]. The problem with this fact is that the number of unknown parameters and functions is very large and they are very complex. There are exact functions for the non-linear tire model, such as the magic formula and the HSRI model, but they can only be used in off-line tire or vehicle simulations.

    The general simple way to linearize the non-linear tire model is to linearize its characteristic at the origin, so the sideslip stiffness C_α is taken as a constant. However, this assumption is only valid for small sideslip angles and a constant road adhesion coefficient.

    In some papers [1], [2], based on the stiffness of the steering mechanism (steering column, gear, etc.), the following assumption has been used: C'_H = k C'_V.
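The linearization at the origin described above can be sketched numerically: take any non-linear lateral-force curve and read off its slope at zero slip angle. The Python snippet below is illustrative only; the Pacejka-style coefficients B, C, D, E are made-up example values, not the benchmark car's.

```python
import numpy as np

def magic_formula(alpha, B=10.0, C=1.9, D=4500.0, E=0.97):
    """Pacejka-style lateral force vs. slip angle (illustrative coefficients)."""
    return D * np.sin(C * np.arctan(B * alpha - E * (B * alpha - np.arctan(B * alpha))))

def cornering_stiffness(tire, h=1e-6):
    """C_alpha = dF_y/d_alpha at the origin, by central difference."""
    return (tire(h) - tire(-h)) / (2.0 * h)

C_alpha = cornering_stiffness(magic_formula)
# For the magic formula, the analytic slope at the origin is B * C * D.
```

The numerically estimated stiffness matches the analytic slope B·C·D, which is exactly the constant C_α used in the linear one-track model.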

    [Figure: tire slip angle, defined between the tire heading direction and the direction of travel, which generates the lateral force]


    V.1.5. Model noise

    The sensor noises are caused by the lateral acceleration sensor, the yaw rate sensor and the steering angle sensor. All sensor noise data were measured and supplied by the Bosch company [2]. The details are given in the table.

    V.1.6. Typical failures

    Some typical failure types and values for the benchmark system are given in Table 1. The given values are only meant to show a realistic range for the faults; other fault values are also possible. For the steering angle, a ramp fault is improbable because of the sensor type, so no fault value is given. Multiplicative faults are also not very probable, and no realistic fault values are known to the authors at this moment.

    Table 1 Typical failures for the benchmark system

    Offset faults           | Step                        | Ramp
    Yaw rate                | 2 °/s, 5 °/s, 10 °/s        | 10 °/s/min
    Lateral acceleration    | 2 m/s^2, 5 m/s^2            | 4 m/s^2/s, 10 m/s^2/s
    Steering angle          | 15 °, 30 °                  | --

    Multiplicative faults   |                             |
    Yaw rate                | (100 ± 20) %, (100 ± 40) %  | 100 % to (100 ± 50) % in 10 s
    Lateral acceleration    | (100 ± 50) %, (100 ± 80) %  | 100 % to (100 ± 50) % in 10 s
    Steering angle          | --                          | --
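The offset faults in Table 1 are additive signals: a constant step and/or a ramp that starts at the fault time. A minimal Python sketch (illustrative, not benchmark code; the function name is hypothetical) shows how such faults can be injected into a sensor signal for testing:

```python
import numpy as np

def add_offset_fault(signal, t, t_fault, step=0.0, ramp=0.0):
    """Additive sensor fault: a step [unit] plus a ramp [unit/s] from t_fault on.

    Values such as a 2 deg/s yaw-rate step or a 10 deg/s per minute ramp
    (Table 1) map directly onto the `step` and `ramp` arguments.
    """
    faulty = signal.copy()
    active = t >= t_fault
    faulty[active] += step + ramp * (t[active] - t_fault)
    return faulty

t = np.arange(0.0, 10.0, 0.1)
r_nominal = np.zeros_like(t)                 # nominal yaw rate signal
r_faulty = add_offset_fault(r_nominal, t, t_fault=5.0, step=np.deg2rad(2.0))
```

Before t_fault the signal is untouched; afterwards it carries the constant 2 °/s offset.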


    V.1.7. Physical parameters of the vehicle lateral dynamic model

    Physical constants

    Symbol | Symbol in Matlab | Value      | Unit       | Explanation
    g      | g                | 9.80665    | [m/s^2]    | Gravity constant

    Vehicle parameters

    i_L    | i_L              | 18.0       | [-]        | Steering transmission ratio
    m_R    | m_R              | 1630       | [kg]       | Rolling (sprung) mass
    m_NR   | m_NR             | 220        | [kg]       | Non-rolling (unsprung) mass
    m      | m = m_R + m_NR   |            | [kg]       | Total mass
    l_V    | l_V              | 1.52931    | [m]        | Distance from the vehicle CG to the front axle
    l_H    | l_H              | 1.53069    | [m]        | Distance from the vehicle CG to the rear axle
    I_z    | I_z              | 3870       | [kg·m^2]   | Moment of inertia about the z-axis of the vehicle
    K_R    | K_phi            | 0.9429     | [-]        | Roll coefficient

    Tire model parameters

    C'_V   | c_alpha_V        | 103600     | [N/rad]    | Front tire cornering stiffness
    C'_H   | c_alpha_H        | 179000     | [N/rad]    | Rear tire cornering stiffness

    System variables

    Symbol | Symbol in Matlab | Unit       | Explanation
    β      | Beta             | [rad]      | Vehicle side slip angle
    r      | r                | [rad/s]    | Vehicle yaw rate
    δ*_L   | Delta_L          | [rad]      | Vehicle steering angle
    a_y    | Ay               | [m/s^2]    | Vehicle lateral acceleration
    v_ref  | v_ref            | [km/h]     | Vehicle longitudinal velocity

    Sensor noise data

    Symbol | Symbol in Matlab | Standard deviation | Unit      | Explanation
    n_ay   | N_ay             | (0.2, 2.4)         | [m/s^2]   | Lateral acceleration sensor noise
    n_r    | N_r              | (0.2, 0.9)         | [rad/s]   | Yaw rate sensor noise
    n_δL   | N_delta          |                    | [rad]     | Steering angle sensor noise
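For simulation studies, sensor noise of this kind is usually modelled as zero-mean Gaussian noise with the tabulated standard deviation. A minimal Python sketch (illustrative only; the function name is hypothetical) is:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def add_sensor_noise(signal, std, rng=rng):
    """Add zero-mean Gaussian noise with the table's standard deviation."""
    return signal + rng.normal(0.0, std, size=signal.shape)

# Example: a noisy lateral acceleration measurement with std = 0.2 m/s^2
a_y_clean = np.zeros(20000)
a_y_meas = add_sensor_noise(a_y_clean, std=0.2)
```

The empirical standard deviation of the generated measurement matches the configured sensor noise level.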

    V.1.8. References

    [1] Marcus Börner, Rolf Isermann, "Adaptive one-track model for critical lateral driving situations".
    [2] Bosch GmbH, "Fehlerarten für die ISP-Sensorik" (fault types for the ISP sensors), internal report, 2003.
    [3] S. X. Ding, Y. Ma, H.-G. Schulz, B. Chu et al., "Fault tolerant estimation of vehicle lateral dynamics", IFAC Safeprocess, 2003.
    [4] Mitschke, M., "Dynamik der Kraftfahrzeuge", Band C, Springer-Verlag, 1990.

    V.2. Test 1

    The following graphic represents measurements of the steering angle.

    Figure 9 The steering angle (sample period 0.1 s)

    From this data set, a subset is selected for training. The main principle used for determining the most useful data subset concerns the bandwidth of the signal over the specified interval. As long as the correct data set is selected, fewer samples produce, in most cases, better and more efficient training.

    For the current project, determining a procedure and implementing an algorithm for choosing the training set is out of scope, but, as mentioned before, such a procedure would make the toolbox much easier to use.
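The subset selection used in the tests below is a simple windowed decimation: keep every `step`-th sample between `window_start` and `window_stop`. A sketch in Python (the toolbox is in MATLAB; the function name here is hypothetical):

```python
def select_training_subset(data, window_start, window_stop, step):
    """Keep every `step`-th sample inside [window_start, window_stop)."""
    return data[window_start:window_stop:step]

# With the Test 1 settings (window_start=5000, window_stop=8500, step=10),
# 350 training samples are drawn from the full measurement record.
samples = select_training_subset(list(range(10000)), 5000, 8500, 10)
```

With the Test 1 settings this yields 350 samples, a substantial reduction of the training set.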

    The output variable, the lateral acceleration, is plotted in the next graphic:


    Figure 10 Lateral Acceleration

    From observing the previous graphic, it is obvious that this data set includes a small amount

    of noise.

    For the first training test the following parameters were set:

    Training subset
      window_start   5000
      window_stop    8500
      step           10

    Training strategy
      Phase 1: Gradient descent, batch training
        Learning rate     0.01
        Number of epochs  100
      Phase 2: Gradient descent, batch training
        Learning rate     0.005
        Number of epochs  300
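The two-phase strategy above (a coarse phase with a larger learning rate, then a fine phase with a smaller one) can be sketched generically. In this Python illustration a linear least-squares fit stands in for the neuro-fuzzy structure, since the actual toolbox model is not reproduced here; only the phase scheduling is the point.

```python
import numpy as np

def batch_gd(params, grad_fn, lr, epochs):
    """One training phase: full-batch gradient descent at a fixed learning rate."""
    for _ in range(epochs):
        params = params - lr * grad_fn(params)
    return params

# Stand-in problem: fit X @ w = y by minimizing the MSE.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
grad = lambda w: 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE

w = np.zeros(2)
w = batch_gd(w, grad, lr=0.01, epochs=100)    # phase 1: coarse search
w = batch_gd(w, grad, lr=0.005, epochs=300)   # phase 2: fine tuning
```

The second phase with the lower learning rate refines the solution found by the first without overshooting, which mirrors the intent of the two-phase setup in the tests.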

    After the training process, the following response was obtained:


    Figure 11 Training results

    Using only the steering angle as input, the neuro-fuzzy structure obviously cannot be very precise, because it cannot model the internal noise that causes the small ripples in the reference output.

    This suggests introducing an internal feedback.

    Testing the structure on the initial data set, the following results were obtained:


    Figure 12 Testing Set (MSE=0.028)

    V.3. Test 2

    As stated in the previous section, considering an internal feedback as an extra input signal for the neuro-fuzzy structure might produce better results.

    Thus, with the following notations:

    x = steering angle;
    y = lateral acceleration;

    for this second test, two inputs are considered for the neuro-fuzzy structure:

    input_k = [x_k, y_{k-1}]    (41)
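Building the regressors of equation (41) from the two measured time series is a simple shift-and-stack operation. A Python sketch (illustrative; the helper name is hypothetical):

```python
import numpy as np

def build_regressors(x, y):
    """Inputs for Test 2: at step k feed [x_k, y_{k-1}].

    The first output sample has no predecessor, so the series start at k = 1.
    """
    inputs = np.column_stack([x[1:], y[:-1]])
    targets = y[1:]
    return inputs, targets

x = np.arange(5.0)          # steering angle samples x_k
y = 10.0 * np.arange(5.0)   # lateral acceleration samples y_k
inputs, targets = build_regressors(x, y)
```

Each input row pairs the current steering angle with the previous lateral acceleration, which is exactly the internal-feedback scheme of Test 2.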

    Training subset
      window_start   5000
      window_stop    8500
      step           10

    Training strategy
      Phase 1: Gradient descent, batch training
        Learning rate     0.009
        Number of epochs  100
      Phase 2: Gradient descent, batch training
        Learning rate     0.005
        Number of epochs  300

    Figure 13 Training results (2 inputs, MSE = 0.01)

    For both tests, the training process took around 58 seconds, which is a major improvement over the previous version of the toolbox.

    For the testing set, the following graphic shows the results:


    Figure 14 Testing set (MSE=0.0047)

    Other tests with more inputs were considered, but the results are inconclusive at this time.


    VI. Conclusions

    The main improvement brought by this project over the previous version of the MATLAB toolbox is speed. It is well known that artificial intelligence techniques like neural networks, fuzzy systems, neuro-fuzzy systems and evolutionary algorithms (genetic algorithms, evolution strategies, genetic programming) are time consuming and technology dependent. This version of the toolbox has addressed the excessive time demands of the training process, its performance now being influenced only by the training parameters.

    On the other hand, there is a great number of parameters that need to be adjusted to obtain optimum performance, so parameter management is required.

    One partial solution is to define training strategies that allow some degree of separation between the parameters.

    A previous study of the proposed structure revealed a great sensitivity of the output to variations of the Sugeno fuzzy system's parameters: small updates can cause large changes in performance (in most cases, a decrease). This means that varying the fuzzy parameters allows a quick exploration of the search space and finding the approximate location of the global optimum.

    The restrictions imposed on the ARMA filters, mainly due to stability considerations, make the filter parameters well suited for fine-tuning the performance of the entire system.
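The stability restriction on the ARMA filters amounts to the usual condition that all poles of the autoregressive part lie strictly inside the unit circle. A Python sketch of such a check (illustrative; the toolbox's own implementation is not shown here):

```python
import numpy as np

def arma_is_stable(a):
    """Stability of the AR part of an ARMA filter y_k = sum_i a_i * y_{k-i} + ...

    All roots of z^n - a_1 z^(n-1) - ... - a_n must lie inside the unit circle.
    """
    poly = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return bool(np.all(np.abs(np.roots(poly)) < 1.0))

arma_is_stable([0.5, 0.3])   # stable: both roots inside the unit circle
arma_is_stable([1.2])        # unstable: pole at z = 1.2
```

During tuning, any candidate filter parameter update that fails this check can simply be rejected, which is one way to enforce the restriction mentioned above.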

    Another modification of the toolbox is the implementation of the data structures. An OOP approach was chosen, so monitoring the behaviour of each component became an easy task; finding and isolating errors and faults is also ensured.

    In conclusion, this version brings a great increase in speed, making the use of the toolbox more efficient. Of course, future studies will have to find the best way to increase performance further. It is obvious that classic training algorithms are not well suited, so specialized versions must be developed.

    The results presented confirm the modelling capabilities of the structure, and also that the modelling can be done efficiently.


    VII. Bibliography and references

    1) Kaufman, Arnold - Fundamental Theoretical Elements
    2) Dasgupta, Dipankar - Evolutionary Algorithms in Engineering Applications
    3) Hagan, Martin T. - Neural Network Design
    4) Russell, Stuart J. - Artificial Intelligence: A Modern Approach
    5) Bellman, Richard Ernest - Methods of Nonlinear Analysis
    6) Bäck, Thomas - Evolutionary Algorithms in Theory and Practice (Evolution Strategies, Evolutionary Programming, Genetic Algorithms)
    7) www.wikipedia.org
    8) http://www.shef.ac.uk/acse/research/ecrg/gat.html
