Evolutionary Neuro - Fuzzy System with Internal Dynamics for System Identification


Universität Duisburg-Essen

Fakultät für Ingenieurwissenschaften

    AUTOMATISIERUNGSTECHNIK UND KOMPLEXE SYSTEME

    Evolutionary Neuro-Fuzzy System with

    Internal Dynamics for System Identification

Dipl.-Eng.

    Cristian FLOREA

    Coordinators

    Prof. Dr. Eng. Steven X. DING

    Dr. Eng. Lavinia FERARIU

    June 2006


    Abstract

The aim of this project is to explore the capabilities of a dynamic neuro-fuzzy structure in system identification, which is the first step towards control.

The motivation of this study is based on the properties of neuro-fuzzy systems, which are summarized in the first chapter.

The visible part of a system's dynamics is represented by the measurements of its inputs and outputs (and sometimes state variables). The results are presented as time series. Studying them has proved an appropriate way to determine the internal structure of the generating system.

Neuro-fuzzy structures capable of generating similar time series can sometimes be considered functionally equivalent to the real systems, thus providing an alternative way of modelling.

Of course, today's mathematical background and technology already constitute powerful modelling tools that are used in most cases, even when the model is relatively complex (usually, only the main characteristics need to be modelled).

Neuro-fuzzy systems are an efficient option mainly for very complex systems, but they remain an alternative way to model any given real system.

    The practical component of this project provides a tool for system identification, developed

    using the MATLAB development environment.

At this moment, validation of the toolbox uses simplified models of real systems, but a milestone in the research process should be set by using more complex systems (social, economic, financial).


    I. Neuro-fuzzy systems .......................................................................................................... 6

    I.1. Neural Networks ........................................................................................................ 6

    I.2. Fuzzy Systems ............................................................................................................ 7

    I.3. Combining Neural Networks and Fuzzy Systems ...................................................... 8

    I.3.1. Neuro-Fuzzy Systems Characteristics ................................................................ 9

    I.3.2. Neuro-fuzzy Systems properties ........................................................................ 9

    I.3.3. Types of neuro-fuzzy systems .......................................................................... 10

    II. Time Series ....................................................................................................................... 11

    II.1. Time series classification ......................................................................................... 11

    II.1.1. Continuity of measuring ................................................................................... 11

    II.1.2. Number of variables ......................................................................................... 11

II.1.3. Linearity ........................................................................................... 11

II.1.4. Stationarity ....................................................................................... 11

    II.2. Prediction strategies ................................................................................................. 12

    II.2.1. Prerequisites ..................................................................................................... 12

    II.2.2. Strategies .......................................................................................................... 13

    III. Neuro-Fuzzy Systems in Time Series Analysis ........................................................... 14

    III.1. Fuzzy Neurons ...................................................................................................... 15

    III.2. Neurons with fuzzy weights ................................................................................. 15

    III.3. The Yamakawa Neuron Model ............................................................................ 16

    III.4. The Dynamic Yamakawa Neuron Model ............................................................. 18

    IV. The MATLAB Implementation ................................................................................... 21

    IV.1. Performance evaluation ........................................................................................ 21

    IV.2. Training algorithms .............................................................................................. 21

    IV.2.1. Gradient-based algorithms (Backpropagation) ................................................ 22

    IV.2.2. Genetic Algorithms .......................................................................................... 28

IV.3. Training strategies ................................................................................ 29

IV.4. Data structures ...................................................................................... 29

    IV.4.1. Class ARMA .................................................................................................... 30

    IV.4.2. Class SFS.......................................................................................................... 30

    IV.4.3. Class BRANCH ................................................................................................ 31

    IV.4.4. Class NFS ......................................................................................................... 31

    IV.5. Future development .............................................................................................. 32

    IV.5.1. Evolutionary strategies ..................................................................................... 32

    IV.5.2. Generating training data sets ............................................................................ 32

    IV.5.3. Graphical user interface ................................................................................... 32


    IV.6. Resources ............................................................................................................. 32

    V. Testing and results ............................................................................................................ 33

    V.1. Vehicle lateral dynamic model ................................................................................. 33

    V.1.1. Model description ............................................................................................. 33

    V.1.2. Physical simplifications .................................................................................... 34

    V.1.3. Unknown input signal ...................................................................................... 35

    V.1.4. Model parameter variation ............................................................................... 35

    V.1.5. Model noise ...................................................................................................... 37

    V.1.6. Typical failures ................................................................................................. 37

    V.1.7. Physical parameters of the vehicle lateral dynamical model ........................... 38

V.1.7.1. System variables ........................................................................... 38

V.1.7.2. Sensor noise data .......................................................................... 38

V.1.8. Reference .......................................................................................... 39

    V.2. Test 1 ........................................................................................................................ 39

    V.3. Test 2 ........................................................................................................................ 42

    VI. Conclusions .................................................................................................................. 45

    VII. Bibliography and references ......................................................................................... 46


    I. Neuro-fuzzy systems

Neural networks and fuzzy systems, as alternative methods in data processing, are suited for modelling intelligent behaviour and mimic the actions of a human expert capable of solving complex problems.

    This goal is achieved through observation and learning instead of classical mathematical

    modelling (using classical laws of physics, chemistry, biology, economics and many more).

Considering the previous remarks, the process of knowledge assimilation has the leading role. According to knowledge classification, assimilation has three major components:

- interviewing and observing - suited for knowledge that can be expressed as a set of rules;
- instruction;
- learning.

I.1. Neural Networks

    Neural networks are systems that try to make use of some of the known or expected

    organizing principles of the human brain. They consist of a number of independent, simple

    processors - the neurons. These neurons communicate with each other via weighted

    connections.

    At first, research in this area was driven by neurobiological interests. The modelling of single

    neurons and the so-called learning rules for modifying synaptic weights were the initial

    research topics.

    Modern research in neural networks considers the development of architectures and learning

    algorithms, and examines the applicability of these models to information processing tasks.

Although there are still many researchers who model biological neural networks with artificial neural networks to learn more about the structure of the human brain and the way it works, biological plausibility is usually neglected and only the problem of information processing with artificial neural networks is considered. These models have in common that they are based on rather simple processing units, or neurons, exchanging information via weighted connections.

    Different types of neural networks can solve different problems, like pattern recognition,

    pattern completion, determining similarities between patterns or data - also in terms of

    interpolation or extrapolation - and automatic classification.

    Learning in neural networks means to determine a mapping from an input to an output space

    by using example patterns. If the same or similar input patterns are presented to the network

    after learning, it should produce an appropriate output pattern.

    Neural networks can be used if training data is available. It is not necessary to have a

    mathematical model of the problem of interest, and there is no need to provide any form of

    prior knowledge. On the other hand the solution obtained from the learning process usually

    cannot be interpreted.

Although there are some approaches for extracting rules from neural networks, most neural network architectures are black boxes. It cannot be checked whether their solution is plausible, i.e. their final state cannot be interpreted in terms of rules. This also means that a neural network usually cannot be initialized with prior knowledge, even if such knowledge is available, and


    thus the network must learn from scratch. The learning process itself can take very long,

    and there is usually no guarantee of success.

The following table synthesizes the advantages and drawbacks of neural networks:

Advantages:
- No mathematical model is needed;
- No prior knowledge is required;
- Many training methods have been developed.

Disadvantages:
- Black-box-like system: usually, the manner in which the results are obtained is not interpretable;
- Adaptation to environment changes may prove difficult, so retraining is needed;
- Prior knowledge, if it exists, cannot be used;
- The training process is not guaranteed to converge.

Table 1

    I.2. Fuzzy Systems

    When using fuzzy set theory, it is easy to model the fuzzy boundaries of linguistic terms by

    introducing gradual memberships. In contrast to classical set theory, in which an object or a

    case either is a member of a given set (defined, e.g., by some property) or not, fuzzy set

    theory makes it possible that an object or a case belongs to a set only to a certain degree.

    Interpretations of membership degrees include similarity, preference, and uncertainty. They

    can state how similar an object or case is to a prototypical one, they can indicate preferences

    between sub-optimal solutions to a problem, or they can model uncertainty about the true

    situation, if this situation is described in imprecise terms. In general, due to their closeness to

    human reasoning, solutions obtained using fuzzy approaches are easy to understand and to

    apply.

    Due to these strengths, fuzzy systems are the method of choice, if linguistic, vague, or

    imprecise information has to be modelled.

Fuzzy systems are based on if-then rules. The antecedent of a rule consists of fuzzy descriptions of the input values, and the consequent defines a - possibly fuzzy - output value for the given input. The benefits of these fuzzy systems lie in the suitable knowledge representation. But problems may arise when fuzzy concepts have to be represented by concrete membership degrees which guarantee that the fuzzy system works as expected.

    A fuzzy system can be used to solve a problem if knowledge about the solution is available in

    the form of linguistic if-then rules. By defining suitable fuzzy sets to represent linguistic

    terms used within the rules, a fuzzy system can be created from these rules.

No formal model of the problem of interest and no training data are required.


Let's summarize in the following table:

Advantages:
- No mathematical model is required;
- Prior knowledge represented as if-then rules may be used;
- Ease of implementation;
- If-then rules facilitate the interpretation of results.

Disadvantages:
- If-then rules are a prerequisite;
- No learning capabilities;
- There are no standard methods for parameter adjustment;
- Difficulties in interpreting the results may occur;
- Adaptation to a changing environment could be difficult;
- Improvements are not guaranteed by parameter adaptation.

Table 2

    I.3. Combining Neural Networks and Fuzzy Systems

Presently, the neuro-fuzzy approach is becoming one of the major areas of interest because it combines the benefits of neural networks and fuzzy logic systems and removes their individual disadvantages by exploiting their common features.

    Different architectures of neuro-fuzzy system have been investigated. These architectures

    have been applied in many applications especially in the process control.

Neural networks and fuzzy logic have some common features, such as distributed representation of knowledge, model-free estimation, and the ability to handle data with uncertainty and imprecision.

Fuzzy logic has tolerance for imprecision of data, while neural networks have tolerance for noisy data. A neural network's learning capability provides a good way to adjust expert knowledge, and it automatically generates additional fuzzy rules and membership functions to meet certain specifications. This reduces the design time and cost.

    On the other hand, the fuzzy logic approach possibly enhances the generalization capability of

    a neural network by providing more reliable output when extrapolation is needed beyond the

    limits of the training data.

    The basic idea of combining fuzzy systems and neural networks is to design an architecture

    that uses a fuzzy system to represent knowledge in an interpretable manner and the learning

    ability of a neural network to optimize its parameters.

    The drawbacks of both of the individual approaches - the black box behaviour of neural

    networks, and the problems of finding suitable membership values for fuzzy systems - could

    thus be avoided.

    A combination can constitute an interpretable model that is capable of learning and can use

    problem-specific prior knowledge. Therefore, neuro-fuzzy methods are especially suited for

    applications, where user interaction in model design or interpretation is desired.


    I.3.1. Neuro-Fuzzy Systems Characteristics

Although there are a lot of different approaches, usually the term neuro-fuzzy system is used for approaches which display the following properties:

- A neuro-fuzzy system is based on a fuzzy system which is trained by a learning algorithm derived from neural network theory. The (heuristic) learning procedure operates on local information, and causes only local modifications in the underlying fuzzy system.

- A neuro-fuzzy system can be viewed as a 3-layer feed-forward neural network. The first layer represents input variables, the middle (hidden) layer represents fuzzy rules, and the third layer represents output variables. Fuzzy sets are encoded as (fuzzy) connection weights. It is not necessary to represent a fuzzy system like this to apply a learning algorithm to it; however, it can be convenient, because it represents the data flow of input processing and learning within the model. (Sometimes a 5-layer architecture is used, where the fuzzy sets are represented in the units of the second and fourth layers.)

- A neuro-fuzzy system can always (i.e. before, during and after learning) be interpreted as a system of fuzzy rules. It is possible to create the system from training data from scratch, as it is possible to initialize it with prior knowledge in the form of fuzzy rules. (Not all neuro-fuzzy models specify learning procedures for fuzzy rule creation.)

- The learning procedure of a neuro-fuzzy system takes the semantic properties of the underlying fuzzy system into account. This results in constraints on the possible modifications applicable to the system parameters. (Not all neuro-fuzzy approaches have this property.)

- A neuro-fuzzy system approximates an n-dimensional (unknown) function that is partially defined by the training data. The fuzzy rules encoded within the system represent vague samples, and can be viewed as prototypes of the training data. A neuro-fuzzy system should not be seen as a kind of (fuzzy) expert system, and it has nothing to do with fuzzy logic in the narrow sense.
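The layered view described above can be illustrated with a minimal forward pass. The sketch below is hypothetical (Gaussian membership functions, product firing strengths, normalised zero-order Sugeno-style output) and is not the implementation developed in this project:

```python
import math

def gauss(x, c, s):
    # Gaussian membership degree of x in a fuzzy set with centre c and width s.
    return math.exp(-((x - c) / s) ** 2)

def nfs_forward(x, rules):
    # rules: list of (centres, widths, consequent), one entry per hidden "rule" unit.
    # Layer 1: inputs; layer 2: rule firing strengths (product of memberships);
    # layer 3: normalised weighted sum of the rule consequents.
    strengths = []
    for centres, widths, _ in rules:
        w = 1.0
        for xi, c, s in zip(x, centres, widths):
            w *= gauss(xi, c, s)
        strengths.append(w)
    total = sum(strengths)
    return sum(w * cons for w, (_, _, cons) in zip(strengths, rules)) / total

rules = [
    ([0.0, 0.0], [1.0, 1.0], 0.0),   # "if x1 is low and x2 is low then y = 0"
    ([1.0, 1.0], [1.0, 1.0], 1.0),   # "if x1 is high and x2 is high then y = 1"
]
print(nfs_forward([0.0, 0.0], rules))  # close to 0
print(nfs_forward([1.0, 1.0], rules))  # close to 1
```

The fuzzy sets (centres and widths) play the role of the connection weights mentioned above, which is what the learning algorithm would adjust.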

    I.3.2. Neuro-fuzzy Systems properties

From the fuzzy system's point of view, the main advantage is the learning capability; for neural networks, the use of prior knowledge for the initial conditions is an obvious gain, potentially speeding up the training process. Also, the if-then rule set offers the possibility of interpretable results.

    The architecture of a neuro-fuzzy system is determined by the rules and fuzzy sets specific to

    the problem. In this case there is no need to specify some network parameters, like the

    number of hidden layers.

Unfortunately, combining the two types of systems does not guarantee success, because training convergence is not assured.

An often-used method for training fuzzy systems consists in representing the system as a neural network and applying the specific training algorithms (i.e. backpropagation). This approach implies some alteration of the structure and/or algorithm, because the functions used for fuzzy inference (i.e. min, max) are not differentiable. There are two possibilities:

- replace the specific fuzzy inference functions with differentiable ones (in this case there may be a loss of interpretability of the results);
- use training methods that are not gradient-based.
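The first option can be illustrated with smooth surrogates for MIN and MAX. The beta-parameterised soft-minimum below is one common choice and is only a sketch; the text does not prescribe a particular replacement:

```python
import math

def softmin(xs, beta=20.0):
    # Smooth, differentiable approximation of MIN: each value is weighted by
    # exp(-beta * x); a larger beta gives a result closer to the exact minimum.
    ws = [math.exp(-beta * x) for x in xs]
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)

def softmax_val(xs, beta=20.0):
    # Smooth approximation of MAX, obtained by negating the soft-minimum
    # (not the neural-network "softmax" probability distribution).
    return -softmin([-x for x in xs], beta)

print(softmin([0.2, 0.8, 0.5]))      # close to 0.2
print(softmax_val([0.2, 0.8, 0.5]))  # close to 0.8
```

Because both surrogates are differentiable everywhere, gradient-based training such as backpropagation can be applied, at the cost of some interpretability, as noted above.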


Modern neuro-fuzzy systems are usually modelled as multi-layer feed-forward neural networks. Here we can mention:

- ANFIS (Adaptive-Network-based Fuzzy Inference System): a Sugeno fuzzy system is modelled as a 5-layer, feed-forward neural network;
- GARIC (Generalized Approximate Reasoning-based Intelligent Control): implements a fuzzy controller using specialized feed-forward neural networks.

Of course, there are other principles too, such as those used for fuzzy associative memories and self-organizing feature maps (which transform an input of arbitrary dimension into a one- or two-dimensional discrete map, subject to a topological neighbourhood-preserving constraint).

Advantages:
- No mathematical model is necessary;
- Prior knowledge represented as if-then rules is not required, but it can be used;
- There are many training/learning algorithms;
- Ease of implementation;
- Interpretable results.

Disadvantages:
- Training convergence is not guaranteed.

Table 3

I.3.3. Types of neuro-fuzzy systems

Usually there are two major combinations of neural networks and fuzzy systems.

In the first case, a neural network and a fuzzy system work independently. The neural network is used to determine some parameters of the fuzzy system; the adjustment can be online or offline. Because of this co-working, these systems are named cooperative neuro-fuzzy systems.

    In the second case, we have hybrid neuro-fuzzy systems, where a homogeneous neural

    network-like architecture is obtained by interpreting a fuzzy system as a neural network or by

    directly implementing it as one.


    II. Time Series

Inertia is an intrinsic property of all systems, so observing their previous behaviour may provide useful information that allows deriving laws and rules about their evolution.

A time series is a sequence of observations (data points), typically measured at successive times spaced at uniform intervals.

There are two main goals of time series analysis:

- identifying the nature of the phenomenon represented by the sequence of observations;
- forecasting (predicting future values of the time series variable).

    Both of these goals require that the pattern of observed time series data is identified and more

    or less formally described. Once the pattern is established, we can interpret and integrate it

    with other data (i.e., use it in our theory of the investigated phenomenon).

    Regardless of the depth of our understanding and the validity of our interpretation (theory) of

    the phenomenon, we can extrapolate the identified pattern to predict future events.

    If a time series can be predicted, then it is called deterministic. Usually, in practice, time

    series are stochastic processes due to noisy observations. Consequently future observations

    are determined only in part by past values.

    II.1. Time series classification

Essentially, the aspects characterizing a time series determine the criterion, or set of criteria, used for classification.

    II.1.1. Continuity of measuring

If it is possible to measure the values of a time series at any given time, then the time series is called continuous.

    In practice, generally, time series contain observations made at predetermined moments,

    separated by constant time intervals. In this case we have discrete time series.

    A continuous time series can be transformed into a discrete one by measuring its values at

    equally distant moments, thus obtaining a sampled time series.

    Another type is the integrative (cumulative) time series, where measuring is possible only for

    cumulative values of a variable (i.e. rainfall quantities).

    II.1.2. Number of variables

    A time series that contains values of a single variable is named monovariable. Otherwise it is

    a multivariable time series.

    II.1.3. Linearity

Linearity of a time series can be determined using methods such as the Hurst coefficient, the Lyapunov (characteristic) exponent, and the correlation dimension.

    II.1.4. Stationarity

A stationary process has the property that the mean, variance and autocorrelation structure do not change over time.


Usually, time series are non-stationary, so methods were developed to transform a non-stationary time series into a stationary one.

1. Difference the data. That is, given the series $Z_i$, we create the new series $Y_i = Z_i - Z_{i-1}$.

The differenced data will contain one point fewer than the original data. Although you can difference the data more than once, one difference is usually sufficient.

2. If the data contain a trend, we can fit some type of curve to the data and then model the residuals from that fit. Since the purpose of the fit is simply to remove long-term trend, a simple fit, such as a straight line, is typically used.

3. For non-constant variance, taking the logarithm or square root of the series may stabilize the variance. For negative data, you can add a suitable constant to make the entire series positive before applying the transformation. This constant can then be subtracted from the model to obtain predicted (i.e. fitted) values and forecasts for future points.
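Transformations 1 and 3 can be sketched in a few lines; the helper names below are illustrative only:

```python
import math

def difference(z):
    # First difference: Y_i = Z_i - Z_{i-1}; the result has one point fewer.
    return [z[i] - z[i - 1] for i in range(1, len(z))]

def log_stabilize(z):
    # Log transform for non-constant variance. Negative data is shifted
    # positive first; the shift must be subtracted back from the fitted
    # values and forecasts, as described above.
    shift = max(0.0, 1.0 - min(z))
    return [math.log(v + shift) for v in z], shift

trend = [2 * t + 1 for t in range(5)]   # linear trend: 1, 3, 5, 7, 9
print(difference(trend))                # constant series: [2, 2, 2, 2]
```

Differencing a linear trend yields a constant series, which is why one difference often suffices for trend removal.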

    II.2. Prediction strategies

    II.2.1. Prerequisites

Before presenting some prediction strategies, we need to define how to organize the information given by the time series in order to use it efficiently.

1. Data sets

Usually, time series data is divided into three sets:

- Training set: used to train the prediction system; by means of trial and error, the dimension of the training set is varied until an optimal size is reached;
- Validation set: used for monitoring the training process, to make sure that the prediction system has not over-learned the training set;
- Testing set: used after training to study the performance of the prediction system.

2. Sample delay and window size

Given a time series $x_t, x_{t-1}, \dots, x_{t-i}, \dots$ and considering that we should predict $x_{t+n}$, one must decide how many samples are used (this is called the window size) and how the data is sampled. For a time series $x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-ik}, \dots$, $k$ is called the sample delay. Both parameters are determined experimentally.
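Assuming the notation above, building the lagged input/target pairs could look like this (the function name and defaults are illustrative, not part of the toolbox):

```python
def make_windows(series, window, delay=1, horizon=1):
    # Builds (input, target) pairs from a time series: each input holds
    # `window` samples spaced `delay` apart (the sample delay k), and the
    # target is the value `horizon` steps ahead.
    pairs = []
    span = (window - 1) * delay
    for t in range(span, len(series) - horizon):
        inputs = [series[t - j * delay] for j in range(window)]
        pairs.append((inputs, series[t + horizon]))
    return pairs

series = [0, 1, 2, 3, 4, 5, 6]
print(make_windows(series, window=3, delay=2, horizon=1))
```

Varying `window` and `delay` and comparing the resulting prediction errors is one way to carry out the experimental tuning mentioned above.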

3. Measure of prediction error

Given a time series $x_1, x_2, x_3, \dots, x_n$ with mean value $x_{mean}$, the standard deviation

$x_{sd} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - x_{mean})^2}$   (1)

is used as the prediction error, given that the prediction is always the mean value. Note that, for random time series, the mean value is the best prediction.

A second method for measuring the prediction error starts from a series of predicted values $\hat{x}_1, \hat{x}_2, \dots, \hat{x}_n$; the prediction error is defined as

$Error = \frac{x_{error}}{x_{sd}}$, where $x_{error} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \hat{x}_i)^2}$   (2)

In this case, the prediction quality is measured as the improvement relative to the prediction of the mean value.
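Equations (1) and (2) translate directly into code; this is a sketch with illustrative names:

```python
import math

def std_dev(x):
    # Equation (1): sample standard deviation around the mean,
    # i.e. the error of always predicting the mean value.
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / (len(x) - 1))

def prediction_error(x, x_hat):
    # Equation (2): deviation of the predictions x_hat from the actual
    # values, normalised by the standard deviation. A value below 1 means
    # the predictor improves on the mean-value prediction.
    x_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / (len(x) - 1))
    return x_err / std_dev(x)

x = [1.0, 2.0, 3.0, 4.0]
print(prediction_error(x, [1.1, 2.1, 2.9, 4.0]))  # well below 1
```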

II.2.2. Strategies

One approach consists in constructing a function $F_n$ that can directly predict $x_{t+n}$:

$x_{t+n} = F_n(x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-ik})$   (3)

Another way is to construct a function $F_I$,

$\hat{x}_{t+1} = F_I(x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-Nk})$

which can predict one step ahead, $\hat{x}_{t+1}$, and then apply this function iteratively $n$ times.

Finally, one can construct a function $F_1$ to predict $x_{t+1}$, then retrain it, so that $F_2$ is obtained that predicts $x_{t+2}$, and so on:

$\hat{x}_{t+1} = F_1(x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-ik})$

$\hat{x}_{t+2} = F_2(x_t, x_{t-k}, x_{t-2k}, \dots, x_{t-ik})$

$\dots$


III. Neuro-Fuzzy Systems in Time Series Analysis

Classical time series analysis has two major components:

1. Time domain analysis - usually more profitable for stochastic signals - uses correlation techniques to study signal characteristics. The correlation of two signals is defined as:

$r_{xy}(t) = \int_{-\infty}^{\infty} x(\tau)\, y(\tau - t)\, d\tau$

In particular, when a function is correlated with itself, the operator is called autocorrelation; otherwise it is called cross-correlation.

Correlation determines how much similarity there is between the two argument functions. Some of the general properties are:

- For autocorrelation, the maximum value always occurs at $t = 0$, and the function decreases (or stays constant) as $t$ approaches infinity;
- The greater the area under the correlation curve, the greater the similarity between the two signals.

2. Frequency domain analysis - best suited for periodic signals - is based on the Fourier transform.

Suppose $x$ is a complex-valued, Lebesgue-integrable function. The Fourier transform to the frequency domain is given by the function

$X(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$, with $\omega \in \mathbb{R}$

Fourier analysis also uses Fourier series, the discrete-time Fourier transform, and the discrete Fourier transform.
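As a small worked example of frequency domain analysis, the discrete Fourier transform of a pure sinusoid concentrates its energy in the bin matching the signal frequency. The naive O(N^2) DFT below is for illustration only:

```python
import cmath
import math

def dft(x):
    # Discrete Fourier transform: X[k] = sum_t x[t] * exp(-j*2*pi*k*t/N).
    N = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / N) for t in range(N))
            for k in range(N)]

# A sinusoid with 2 cycles over 8 samples puts its energy in bins 2 and 6
# (bin 6 is the mirror-image negative frequency), each with magnitude N/2.
N = 8
signal = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
mags = [abs(c) for c in dft(signal)]
print(round(mags[2], 6))  # 4.0 (= N/2)
print(round(mags[1], 6))  # 0.0
```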

    On the other hand new approaches such as neural networks and especially neuro-fuzzy

    systems offer ways for modelling a systems behaviour using artificial intelligence specific

    techniques.

Neuro-fuzzy systems are more suitable for modelling large-scale, complex systems that would otherwise impose the use of very complex and hard-to-compute mathematical equations.

Nowadays, in many cases, the classical methods are, sometimes by far, more efficient than neuro-fuzzy modelling, but the research community is seeking new, improved and efficient ways of modelling and training these structures.

    The scope of this project is to explore the capabilities of a dynamic neuro-fuzzy structure in

    system identification, which is the first step towards control.


    III.1. Fuzzy Neurons

The simplest neuro-fuzzy system is the fuzzy neuron, which implements some basic fuzzy-logic functions.

    Figure 1 The Fuzzy Neuron

For example, f can be one of the MIN or MAX functions:

y_min = f(x_1, ..., x_i, ..., x_n) = MIN(x_1, ..., x_i, ..., x_n)
y_max = f(x_1, ..., x_i, ..., x_n) = MAX(x_1, ..., x_i, ..., x_n)

Using fuzzy logical neurons, the output is more or less influenced by the values of the inputs. This influence depends on both the weights and the fusion operation:

for a neuron of type AND, the influence of the inputs having a weak weight is most important;

for a neuron of type OR, the inputs whose weight is significant are rather taken into account.
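A minimal sketch of such a fuzzy logic neuron (Python, for illustration only; the toolbox itself is in MATLAB): the inputs are membership degrees in [0, 1] and f is the MIN or MAX fusion function:

```python
# Fuzzy logic neuron: f is the MIN (AND-type) or MAX (OR-type) fusion function.
def fuzzy_neuron(inputs, kind="AND"):
    if kind == "AND":
        return min(inputs)   # y_min = MIN(x_1, ..., x_n)
    return max(inputs)       # y_max = MAX(x_1, ..., x_n)

print(fuzzy_neuron([0.2, 0.7, 0.5], "AND"))  # -> 0.2
print(fuzzy_neuron([0.2, 0.7, 0.5], "OR"))   # -> 0.7
```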

    III.2. Neurons with fuzzy weights

    Another way to fuzzify the neuron model is the use of fuzzy weights instead of crisp values.

Fuzzy weights are interpreted as membership functions; thus the linear synaptic connections are replaced with non-linearities labelled as loosely connected or tightly connected.

Excitatory or inhibitory connections are represented through fuzzy intersection or through fuzzy complement followed by fuzzy intersection.

a. Conventional neuron with fuzzy weights

Considering the standard neuron

[Figure: standard neuron with inputs x1, ..., xi, ..., xn, weights w1, ..., wi, ..., wn and output y]


where x1, x2, ..., xn are the inputs, w1, w2, ..., wn the weights, and y the output; the neuron also has a bias (offset). Positive weights are excitatory and negative ones inhibitory. The model has a single parameter for each synapse and one non-linear function f. Due to its simplicity, more complex functions are achieved with large, complicated architectures.

A more powerful neuron model is obtained when fuzzy weights are used and, more importantly, when the inputs and outputs are also membership functions (take fuzzy values). Such a model is the Yamakawa neuron, which is presented in the next section.

b. Direct fuzzification of neural networks

In this case, inputs and/or outputs and/or weights are generalized to fuzzy values. The following table presents all possible combinations:

Fuzzy Neural Network    Weights    Inputs    Outputs
Type 1                  Crisp      Fuzzy     Crisp
Type 2                  Crisp      Fuzzy     Fuzzy
Type 3                  Fuzzy      Fuzzy     Fuzzy
Type 4                  Fuzzy      Crisp     Fuzzy
Type 5                  Crisp      Crisp     Fuzzy
Type 6                  Fuzzy      Crisp     Crisp
Type 7                  Fuzzy      Fuzzy     Crisp

Type 1 networks were used to classify fuzzy input vectors into crisp classes; types 2-4 were used to implement fuzzy if-then rules.

According to some research, types 5-7 cannot be implemented. For type 5 the output will always be crisp, while for types 6 and 7 there is no need to fuzzify the weights.

    III.3. The Yamakawa Neuron Model

Let's consider the linear combinatory model:

    Figure 2 The Linear Combinator

The Yamakawa neuron is derived from the model above, where the weights a_i are replaced by non-linear functions implemented with Sugeno fuzzy systems (SISO: Single Input, Single Output) and the bias a_0 is set to zero.



    Figure 3 The Yamakawa Model

    The structure of the non-linear synapse is presented in the next figure:

    Figure 4 The Yamakawa Model Synapse Structure (Sugeno Fuzzy System)

and we have:

f_i(x_i) = [ Σ_{j=1}^{m} g_{i,j}(x_i) · w_{i,j} ] / [ Σ_{j=1}^{m} g_{i,j}(x_i) ]

where
g_{i,j} - the j-th membership function of the i-th Sugeno fuzzy system
w_{i,j} - the j-th variable weight of the i-th Sugeno fuzzy system
m - the number of membership functions



    III.4. The Dynamic Yamakawa Neuron Model

The Yamakawa model offers more computational power but is still a static structure. In order to be more practical, one can transform it into a dynamic structure. Considering that, nowadays, most time series met in practice are discrete, ARMA (Auto-Regressive, Moving Average) filters prove to be suitable for modelling dynamic behaviour.

    The proposed structure is represented in the following figure:

    Figure 5 Dynamic Synapse

    Reconsidering the Yamakawa model, the following representation is obtained:

    Figure 6 The Dynamic Yamakawa Model

    Lets consider a second-order ARMA filter with the following structure:

    Figure 7 Structure of a second order ARMA filter



The input-output transfer of the ARMA filter is given by:

y(k) = [ (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)

where q^{-1} is the delay operator.
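Expanding the delay operator, the filter is the difference equation y(k) = b0·x(k) + b1·x(k-1) + b2·x(k-2) - a1·y(k-1) - a2·y(k-2). A small illustrative Python sketch (zero initial conditions assumed; the toolbox itself is MATLAB):

```python
# Second-order ARMA filter as a difference equation with zero initial conditions.
def arma_filter(x, b, a):
    b0, b1, b2 = b
    a1, a2 = a
    y = []
    for k in range(len(x)):
        xk1 = x[k - 1] if k >= 1 else 0.0
        xk2 = x[k - 2] if k >= 2 else 0.0
        yk1 = y[k - 1] if k >= 1 else 0.0
        yk2 = y[k - 2] if k >= 2 else 0.0
        y.append(b0 * x[k] + b1 * xk1 + b2 * xk2 - a1 * yk1 - a2 * yk2)
    return y

# impulse response of y(k) = x(k) + 0.5*y(k-1), i.e. a1 = -0.5
print(arma_filter([1.0, 0.0, 0.0, 0.0], (1.0, 0.0, 0.0), (-0.5, 0.0)))
# -> [1.0, 0.5, 0.25, 0.125]
```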

For the Sugeno fuzzy systems, Gaussian membership functions are considered, with centres uniformly distributed in [-1, 1]:

y(k) = [ Σ_{i=1}^{m} β_i e^{-(x(k) - c_i)² / (2σ_i²)} ] / [ Σ_{i=1}^{m} e^{-(x(k) - c_i)² / (2σ_i²)} ]

β_i - output singletons (variable weights);
σ_i - dispersions of the Gaussian membership functions;
c_i - centres of the Gaussian membership functions;
m - number of membership functions;

In order to write the input-output transfer equation for the entire system, we need to agree on some notations:

Figure 8 The Yamakawa Dynamic Model with signal notations

y(k) = [ (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] s(k)

s(k) = Σ_{i=1}^{n} v_i(k)

v_i(k) = [ (b^{2i}_0 + b^{2i}_1 q^{-1} + b^{2i}_2 q^{-2}) / (1 + a^{2i}_1 q^{-1} + a^{2i}_2 q^{-2}) ] w_i(k),  1 ≤ i ≤ n

w_i(k) = [ Σ_{j=1}^{m} β_{i,j} e^{-(z_i(k) - c_{i,j})² / (2σ_{i,j}²)} ] / [ Σ_{j=1}^{m} e^{-(z_i(k) - c_{i,j})² / (2σ_{i,j}²)} ],  1 ≤ i ≤ n



z_i(k) = [ (b^{1i}_0 + b^{1i}_1 q^{-1} + b^{1i}_2 q^{-2}) / (1 + a^{1i}_1 q^{-1} + a^{1i}_2 q^{-2}) ] x_i(k),  1 ≤ i ≤ n


    IV. The MATLAB Implementation

Having studied this dynamic structure's capabilities during my licence degree project, the purpose of the current project is to make the training process more efficient (concerning duration, performance and ease of use). The objective is to find those algorithms, strategies and training parameters that ensure fast training, maximum performance and a minimum of user-adjustable parameters / variables.

    IV.1. Performance evaluation

The structure proposed in this paper is intended to model a dynamic system using techniques specific to the neuro-fuzzy paradigm. Thus, evaluating the modelling performance based on the output estimation error comes naturally.

In this case, the mean square error is considered:

J(k, θ) = (1/2) · e²(k, θ) = (1/2) · [y_d(k) - y(k, θ)]²

where θ is the neuro-fuzzy system parameter vector.

    IV.2. Training algorithms

    The training algorithms are implemented as sets of functions designed to alter the parameters

    of the neuro-fuzzy system according to the criterion above.

Two types of algorithms were considered: first, gradient-descent based algorithms, specific for training neural networks, and second, a genetic algorithm.

Both types have the same logical structure:

scale data_sets
while not stop
    output = evaluate(structure)
    error = target_output - output
    modify parameters
    if stop_criterion
        stop = true
    else
        stop = false
    end if
end while


Data scaling is made according to the following procedure:

training_min = min(training_set)
training_max = max(training_set)

scaled_training_set = 2 · (training_set - training_min) / (training_max - training_min) - 1
scaled_validation_set = 2 · (validation_set - training_min) / (training_max - training_min) - 1
scaled_testing_set = 2 · (testing_set - training_min) / (training_max - training_min) - 1

Although the training set values will be in [-1, 1], the validation and testing sets may exceed this interval. This scaling procedure assures that the transformation is bijective.
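A sketch of the procedure in Python (illustrative only; the toolbox is MATLAB): the affine map is fitted on the training set alone and then reused unchanged for the other sets:

```python
# Fit the [-1, 1] scaling on the training set; reuse it for validation/testing.
def make_scaler(training_set):
    lo, hi = min(training_set), max(training_set)
    def scale(data):
        return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in data]
    return scale

scale = make_scaler([0.0, 5.0, 10.0])
print(scale([0.0, 5.0, 10.0]))      # -> [-1.0, 0.0, 1.0]
print(scale([12.0])[0] > 1.0)       # -> True: a value outside the training range exceeds [-1, 1]
```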

    IV.2.1. Gradient-based algorithms (Backpropagation)

This class of algorithms uses the following formula for updating the system parameters:

θ(k+1) = θ(k) - η · ∂J(k, θ)/∂θ    (4)

where η is an algorithm-specific parameter called the learning rate.

Considering the expression for J(k, θ), the above formula becomes:

θ(k+1) = θ(k) + η · e(k) · ∂y(k, θ)/∂θ    (5)

So, besides the learning rate and the estimation error, the parameter variation depends only on the derivative of the system's output.

    Remark

For an ARMA filter having the input-output transfer described by

y(k) = [ (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)    (6)

    the following expressions were derived:

derivative of output with respect to input:

∂y/∂x (k) = (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2})    (7)

derivative of output with respect to denominator coefficients:

∂y/∂a_i (k) = - [ q^{-i} / (1 + a_1 q^{-1} + a_2 q^{-2}) ] y(k),  i = 1, 2    (8)


derivative of output with respect to numerator coefficients:

∂y/∂b_i (k) = [ q^{-i} / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k),  i = 0, 1, 2    (9)

For this revision of the toolbox, a pole-zero factorized expression was preferred for the ARMA filters. This allows a better control of the filters' behaviour.

The following expressions were considered:

y(k) = k_p · [ (1 - z_1 q^{-1})(1 - z_2 q^{-1}) / ((1 - p_1 q^{-1})(1 - p_2 q^{-1})) ] x(k)    (10)

y(k) = k_p · [ (1 - (z_1 + z_2) q^{-1} + z_1 z_2 q^{-2}) / (1 - (p_1 + p_2) q^{-1} + p_1 p_2 q^{-2}) ] x(k) = [ (b_0 + b_1 q^{-1} + b_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)    (11)

derivative of output with respect to input:

∂y/∂x (k) = k_p · (1 - (z_1 + z_2) q^{-1} + z_1 z_2 q^{-2}) / (1 - (p_1 + p_2) q^{-1} + p_1 p_2 q^{-2})    (12)

derivative of output with respect to gain:

∂y/∂k_p (k) = [ (1 - (z_1 + z_2) q^{-1} + z_1 z_2 q^{-2}) / (1 - (p_1 + p_2) q^{-1} + p_1 p_2 q^{-2}) ] x(k)    (13)

derivative of output with respect to poles:

∂y/∂a_i (k) = - [ q^{-i} / (1 + a_1 q^{-1} + a_2 q^{-2}) ] y(k),  i = 1, 2    (14)

∂y/∂p_1 = (∂y/∂a_1) · (∂a_1/∂p_1) + (∂y/∂a_2) · (∂a_2/∂p_1) = - ∂y/∂a_1 + p_2 · ∂y/∂a_2
∂y/∂p_2 = (∂y/∂a_1) · (∂a_1/∂p_2) + (∂y/∂a_2) · (∂a_2/∂p_2) = - ∂y/∂a_1 + p_1 · ∂y/∂a_2    (15)

(since a_1 = -(p_1 + p_2) and a_2 = p_1 p_2)

∂y/∂p_1 (k) = [ (q^{-1} - p_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] y(k)
∂y/∂p_2 (k) = [ (q^{-1} - p_1 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] y(k)    (16)


derivative of output with respect to zeros:

∂y/∂b_i (k) = [ q^{-i} / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k),  i = 0, 1, 2    (17)

∂y/∂z_1 = (∂y/∂b_1) · (∂b_1/∂z_1) + (∂y/∂b_2) · (∂b_2/∂z_1) = - k_p · ∂y/∂b_1 + k_p z_2 · ∂y/∂b_2
∂y/∂z_2 = (∂y/∂b_1) · (∂b_1/∂z_2) + (∂y/∂b_2) · (∂b_2/∂z_2) = - k_p · ∂y/∂b_1 + k_p z_1 · ∂y/∂b_2    (18)

(since b_0 = k_p, b_1 = -k_p (z_1 + z_2) and b_2 = k_p z_1 z_2)

∂y/∂z_1 (k) = - k_p · [ (q^{-1} - z_2 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)
∂y/∂z_2 (k) = - k_p · [ (q^{-1} - z_1 q^{-2}) / (1 + a_1 q^{-1} + a_2 q^{-2}) ] x(k)    (19)

For the Sugeno fuzzy systems the expressions considered are:

input-output transfer:

y(k) = [ Σ_{i=1}^{m} β_i e^{-(x(k) - c_i)² / (2σ_i²)} ] / [ Σ_{i=1}^{m} e^{-(x(k) - c_i)² / (2σ_i²)} ]    (20)

Let

P(k) = Σ_{i=1}^{m} β_i e^{-(x(k) - c_i)² / (2σ_i²)}    (21)

and

Q(k) = Σ_{i=1}^{m} e^{-(x(k) - c_i)² / (2σ_i²)}    (22)

Then

y(k) = P(k) / Q(k)    (23)


derivative of output with respect to input:

Considering the notations above, one can write

∂y/∂x (k) = [ (∂P/∂x)(k) · Q(k) - P(k) · (∂Q/∂x)(k) ] / Q²(k)    (24)

where

(∂P/∂x)(k) = - Σ_{i=1}^{m} β_i · [ (x(k) - c_i) / σ_i² ] · e^{-(x(k) - c_i)² / (2σ_i²)}    (25)

and

(∂Q/∂x)(k) = - Σ_{i=1}^{m} [ (x(k) - c_i) / σ_i² ] · e^{-(x(k) - c_i)² / (2σ_i²)}    (26)

derivative of output with respect to singletons:

From the input-output transfer expression, one can derive

∂y/∂β_i (k) = (∂P/∂β_i)(k) / Q(k),  i = 1, ..., m    (27)

having

(∂P/∂β_i)(k) = e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (28)

derivative of output with respect to centres:

The expressions above lead us to write

∂y/∂c_i (k) = [ (∂P/∂c_i)(k) · Q(k) - P(k) · (∂Q/∂c_i)(k) ] / Q²(k),  i = 1, ..., m    (29)

with

(∂P/∂c_i)(k) = β_i · [ (x(k) - c_i) / σ_i² ] · e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (30)

and

(∂Q/∂c_i)(k) = [ (x(k) - c_i) / σ_i² ] · e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (31)


derivative of output with respect to dispersions:

Some simple calculus and we have

∂y/∂σ_i (k) = [ (∂P/∂σ_i)(k) · Q(k) - P(k) · (∂Q/∂σ_i)(k) ] / Q²(k),  i = 1, ..., m    (32)

where the derivatives involved have the following expressions:

(∂P/∂σ_i)(k) = β_i · [ (x(k) - c_i)² / σ_i³ ] · e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (33)

(∂Q/∂σ_i)(k) = [ (x(k) - c_i)² / σ_i³ ] · e^{-(x(k) - c_i)² / (2σ_i²)},  i = 1, ..., m    (34)

    With notations from Figure 8 we can write the training formula for each component.

    For the algorithm itself, several options are available:

    i. Parameter alteration

Sometimes the training set will contain fewer but more relevant samples. In this case sequential training may be used; the parameter change then occurs after each sample is evaluated. Otherwise, batch training is used.

For batch training, suppose the training set has m samples; then the following expressions are considered:

θ(k+1) = θ(k) + η · mean_error · mean_derivative    (35)

where

mean_error = (1/m) · Σ_{t=1}^{m} e(t)    (36)

and

mean_derivative = (1/m) · Σ_{t=1}^{m} ∂y(t)/∂θ    (37)

    ii. Learning rate

    The implicit option is the modifiable but fixed learning rate, which means that for a

    training session the learning rate will be constant, being specified before starting the

    procedure.

Another option is the variable learning rate. In this case, during the training procedure, the learning rate is modified according to a previously established rule:

Annealing (gradually lowering)

In order to reach the minimum, and stay there, we must anneal (gradually lower) the global learning rate. A simple, non-adaptive annealing schedule for this purpose is the search-then-converge schedule:


η(k) = η_0 / (1 + k/T)    (38)

Its name derives from the fact that it keeps η nearly constant for the first T training patterns, allowing the network to find the general location of the minimum, before annealing η at a (very slow) pace that is known from theory to guarantee convergence to the minimum. The characteristic time T of this schedule is a new free parameter that must be determined by trial and error.
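The schedule (38) can be sketched as follows (illustrative Python; the values of η_0 and T are arbitrary examples):

```python
# Search-then-converge schedule: lr(k) = lr0 / (1 + k / T)
def search_then_converge(lr0, T):
    return lambda k: lr0 / (1.0 + k / T)

lr = search_then_converge(lr0=0.1, T=100.0)
print(lr(0))     # -> 0.1 (nearly constant while k << T)
print(lr(100))   # -> 0.05 (halved after T patterns)
print(lr(900))   # ~0.01: very slow annealing for k >> T
```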

Bold driver

A useful batch method for adapting the global learning rate is the bold driver algorithm. Its operation is simple: after each epoch, compare the network's loss e(k) to its previous value, e(k-1). If the error has decreased, increase η by a small proportion (typically 1%-5%). If the error has increased by more than a tiny proportion (say, 10^-10), however, undo the last parameter change and decrease η sharply, typically by 50%. Thus bold driver will keep growing η slowly until it finds itself taking a step that has clearly gone too far up onto the opposite slope of the error function. Since this means that the system has arrived in a tricky area of the error surface, it makes sense to reduce the step size quite drastically at this point.

Unfortunately, bold driver cannot be used in this form for online learning: the stochastic fluctuations in e(k) would hopelessly confuse the algorithm.
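A sketch of the bold driver rule (illustrative Python; the 5% growth and 50% shrink factors are the typical values quoted above, not mandatory choices):

```python
# Bold driver: adapt the global learning rate by comparing the loss
# after an epoch with its previous value.
def bold_driver(lr, prev_loss, loss, grow=1.05, shrink=0.5, tol=1e-10):
    if loss < prev_loss * (1.0 + tol):
        return lr * grow, True     # error decreased: grow lr by ~5%, keep the step
    return lr * shrink, False      # error increased: halve lr, undo the last step

lr, keep_step = bold_driver(0.1, prev_loss=1.0, loss=0.9)
print(round(lr, 4), keep_step)     # -> 0.105 True
lr, keep_step = bold_driver(0.1, prev_loss=1.0, loss=1.2)
print(round(lr, 4), keep_step)     # -> 0.05 False
```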

    iii. Momentum

Another technique that can help the system out of local minima is the use of a momentum term. This is probably the most popular extension of the backpropagation algorithm; it is hard to find cases where it is not used. With momentum m, the parameter update at a given moment k becomes:

Δθ(k) = f(k) + m · Δθ(k-1)    (39)

where f(k) is a factor depending on the current / mean error and the current / mean derivative, and 0 < m < 1 is a new global parameter which must be determined by trial and error. Momentum simply adds a fraction m of the previous parameter update to the current one.

When the gradient keeps pointing in the same direction, this will increase the size of the steps taken towards the minimum. It is therefore often necessary to reduce the global learning rate when using a lot of momentum (m close to 1); combining a high learning rate with a lot of momentum, the system will rush past the minimum with huge steps.

When the gradient keeps changing direction, momentum will smooth out the variations.
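Eq. (39) in an illustrative Python sketch: with a constant gradient term f(k), the update steps grow towards the limit f/(1-m):

```python
# Parameter update with momentum: delta(k) = f(k) + m * delta(k-1)
def momentum_step(theta, f_k, prev_delta, m=0.9):
    delta = f_k + m * prev_delta
    return theta + delta, delta

theta, delta = 0.0, 0.0
for _ in range(3):                   # gradient keeps pointing the same way
    theta, delta = momentum_step(theta, f_k=0.1, prev_delta=delta)
print(round(delta, 3))               # steps grow: 0.1, 0.19, then 0.271
```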

    iv. Stopping condition

For sequential training, the algorithm will stop after all data samples in the training set have been presented to the system.

When using batch training, the algorithm will stop after the specified number of epochs.


In both cases, there is an option to stop the training process when a desired (imposed) estimation error is reached, that is

|e(k)| ≤ e_max    (40)

IV.2.2. Genetic Algorithms

The training procedure using genetic algorithms implies the following steps:

Step 1. From the system parameters construct a vector with a predefined structure;
Step 2. Create a population consisting of a specified number (NPop) of randomly generated vectors like the one defined at the previous step; NPop is an algorithm-specific parameter;
Step 3. Initialize NGen (the number of generations);
Step 4. Evaluate the current population; for each individual the mean square estimation error is computed;
Step 5. Select a number of individuals for reproduction;
Step 6. Apply the crossover operator with a specified probability Pc;
Step 7. Apply the mutation operator with a specified probability Pm;
Step 8. Select NPop individuals from the extended population (parents and offspring);
Step 9. If NGen is reached go to the next step, else repeat from Step 4;
Step 10. From the final population select the best individual;
Step 11. Knowing the structure of an individual, set the parameters.
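The steps above can be sketched as a small genetic algorithm over real-valued parameter vectors (illustrative Python; the selection scheme and operators are deliberately simplistic, while the actual toolbox relies on the Sheffield GA Toolbox):

```python
import random

random.seed(0)  # reproducible illustration

def ga(fitness, dim, npop=20, ngen=50, pc=0.7, pm=0.2):
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(npop)]
    for _ in range(ngen):
        pop.sort(key=fitness)                    # Step 4: evaluate
        parents = pop[: npop // 2]               # Step 5: selection
        children = []
        while len(children) < npop - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, dim)
            child = a[:cut] + b[cut:] if random.random() < pc else a[:]  # Step 6
            if random.random() < pm:             # Step 7: mutation
                j = random.randrange(dim)
                child[j] += random.gauss(0.0, 0.1)
            children.append(child)
        pop = parents + children                 # Step 8: extended population
    return min(pop, key=fitness)                 # Step 10: best individual

best = ga(lambda v: sum(x * x for x in v), dim=3)  # a MSE-like fitness stands in
print(sum(x * x for x in best) < 0.5)              # -> True
```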

    Regarding the Sugeno fuzzy systems, when constructing individuals by simply gathering all

    parameters and putting them together, the crossover might produce worse individuals, due to

    the fact that it is possible for one or more Sugeno systems to have similar characteristics.

    To overcome this drawback, two methods were considered:

    i. Test for similarity

In this case, all vectors containing the centers of the Gaussian membership functions are clustered based on the relative distance between them. From each cluster a representative is chosen. Then, these representatives are used to construct the initial population.

The parameters of the clustering algorithm are chosen by trial and error.

    ii. Replace genetic algorithms with evolutionary strategies

Compared with genetic algorithms, in evolutionary strategies there is no crossover (or the probability of crossover is drastically lowered), the main operator being mutation; the population contains fewer individuals and a parent produces more offspring.

    In this case, Gaussian mutation will be applied. Gaussian mutation adds to each

    component of the individual a small quantity obtained from a Gaussian distribution.

    Due to the fact that implementation of evolutionary strategies is more complicated,

    this aspect is left for future development of the toolbox.
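Gaussian mutation itself is straightforward (illustrative Python; the mutation strength sigma is an assumed value):

```python
import random

# Add a small normally-distributed perturbation to every component.
def gaussian_mutation(individual, sigma=0.05):
    return [x + random.gauss(0.0, sigma) for x in individual]

random.seed(1)
mutant = gaussian_mutation([0.2, -0.4, 0.9])
print(mutant)  # each component moved by a small random amount
```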


    IV.3. Training strategies

Due to the fact that the structure implemented by this project has a relatively large number of parameters, it is more efficient to separate them by their power to affect the performance of the system.

    In this revision of the toolbox a two-stage training procedure is implemented.

    Mainly, the first stage works with the static version of the system, ARMA filters being

    characterized only by gain.

The second stage starts with the insertion of randomly generated poles and zeros (numbers in the [-1, 1] interval are generated to assure stability and minimum phase). After that, all parameters are adjusted, making sure at the same time that the performance is increased, thus not ruining the work done in the first stage.

    For the moment, two strategies were defined:

1) Stage 1 - gradient-descent batch training on the static structure;
   Stage 2 - gradient-descent batch training on the dynamic structure.

2) Stage 1 - genetic algorithms on the static structure;
   Stage 2 - gradient-descent batch training on the dynamic structure.

    Remark

During early tests it has been observed that adjustments made to the Sugeno systems' parameters sometimes drive the evaluation to NaN: when no membership function is activated, the denominator of the input-output transfer evaluates to 0 (or very close to 0, considering the error caused by number representation), and the final output of the system is the result of a division by 0.

It is also worth mentioning that the initialization of the fuzzy systems guarantees that at least one of the membership functions is activated.

To overcome this behaviour, when a Sugeno fuzzy system evaluates to NaN, the last parameter change is discarded.

    IV.4. Data structures

The current revision of the toolbox is implemented using the OOP capabilities of the MATLAB environment.

Although the previous implementation was also modular and scalable, it was quite hard to debug training-related problems. Encapsulation reduced the amount of debuggable code and ensured minimum spread and propagation of errors. Another issue solved by this approach is the speed of training and evaluation.

In the following sections the implemented classes are enumerated. Although implemented, standard methods like constructor, display, set and get will not be mentioned.


    IV.4.1. Class ARMA

A. Fields

Gain - gain of the filter (scalar)
Poles - poles of the filter (1-by-2 vector)
Zeros - zeros of the filter (1-by-2 vector)
Input - present and last 2 input samples (1-by-3 vector)
Output - present and last 2 output samples (1-by-3 vector)
Doutdin - present and last 2 samples of the derivative of output with respect to input (1-by-3 vector)
Doutdgain - present and last 2 samples of the derivative of output with respect to gain (1-by-3 vector)
Doutdpoles - present and last 2 samples of the derivative of output with respect to poles (2-by-3 vector)
Doutdzeros - present and last 2 samples of the derivative of output with respect to zeros (2-by-3 vector)

B. Methods

Evaluate - evaluates the output
Evalderiv - evaluates the output and the derivatives
Updategd - updates the filter parameters according to the gradient-descent expression
Reset - resets the initial conditions

    IV.4.2. Class SFS

A. Fields

Center - centers of the Gaussian membership functions
Beta - output singletons
Sigma - dispersions of the Gaussian membership functions
Doutdin - derivative of output with respect to input (scalar)
Doutdcenter - derivative of output with respect to centers (nmf-by-1 vector)
Doutdbeta - derivative of output with respect to singletons (nmf-by-1 vector)
Doutdsigma - derivative of output with respect to dispersions (nmf-by-1 vector)

where nmf represents the number of membership functions

B. Methods

Evaluate - evaluates the output


Evalderiv - evaluates the output and the derivatives
Updategd - updates the fuzzy system parameters according to the gradient-descent expression

    IV.4.3. Class BRANCH

A. Fields

Arma1 - input filter (ARMA object)
Sfs - Sugeno fuzzy system (SFS object)
Arma2 - output filter (ARMA object)

B. Methods

Evaluate - evaluates the output expression (calls the object-specific evaluate method)
Evalderiv - evaluates the output and derivatives expressions (calls the object-specific evalderiv method)
Updategd - updates the parameters according to the gradient-descent expression (calls the object-specific updategd method)
Reset - resets the initial conditions for arma1 and arma2
Initpz - initializes the arma1 and arma2 poles and zeros with random values in [-1, 1]

    IV.4.4. Class NFS

A. Fields

Branches - synapses (vector of BRANCH objects)
Arma - output filter (ARMA object)

B. Methods

Evaluate - evaluates the output expression (calls the object-specific evaluate method)
Evalderiv - evaluates the output and derivatives expressions (calls the object-specific evalderiv method)
Updategd - updates the parameters according to the gradient-descent expression (calls the object-specific updategd method)
Reset - resets the initial conditions for the ARMA object and each BRANCH object
Initpz - initializes the ARMA objects' poles and zeros with random values in [-1, 1] (calls the object-specific initpz method)
Train - trains the structure according to the specified input, target and strategy
Scale - scales the data sets (private method called by train)


Ga - implements a genetic algorithm (private method called by train)
Sequentialgd - implements a sequential gradient-descent training algorithm (private method called by train)
Batchgd - implements a batch gradient-descent training algorithm (private method called by train)

    IV.5. Future development

    IV.5.1. Evolutionary strategies

    Evolutionary strategies may prove to be more efficient in some cases than genetic algorithms,

    especially for final tuning of the parameters.

    Currently, genetic algorithms are implemented using a free, specialized toolbox developed by

    the Evolutionary Computation Research Group in the Department of Automatic Control and

    System Engineering from the University of Sheffield, UK.

    IV.5.2. Generating training data sets

Besides the training process itself, data preparation is the most time-consuming activity involved. Choosing the most relevant input signals and then the best-suited data samples for training is of crucial importance for successful training. Thus an automatic or supervised data preparation process could prove to be a useful tool.

Also, implementing such a process would help online training.

    IV.5.3. Graphical user interface

    After the completion of the toolbox, a specialized graphical user interface would increase the

    ergonomics in using this tool.

    IV.6. Resources

The MATLAB resources, besides the standard package, that were used to implement this project are:

Statistics Toolbox - data clustering functions and structures
Genetic Algorithms Toolbox - free toolbox, not included in the MATLAB packages, developed at the Department of Automatic Control and System Engineering of the University of Sheffield, UK (http://www.shef.ac.uk/acse/research/ecrg/gat.html).


    V. Testing and results

    V.1. Vehicle lateral dynamic model

    V.1.1. Model description

The model is a simplified one-track vehicle lateral dynamics linear model with roll.

    System structure:

one input δ*_L (steering angle),

two outputs a_y and r (lateral acceleration and yaw rate), and

two state variables β and r (slip angle and yaw rate).

    The model expression in state space form:

β̇ = -((C'_V + C'_H) / (m·v_ref))·β + ((C'_H·l_H - C'_V·l_V) / (m·v_ref²) - 1)·r + (C'_V / (m·v_ref))·δ*_L + (g / v_ref)·sin(φ_x)

ṙ = ((C'_H·l_H - C'_V·l_V) / I_z)·β - ((C'_V·l_V² + C'_H·l_H²) / (I_z·v_ref))·r + (C'_V·l_V / I_z)·δ*_L

with the measurement equations

a_y = -((C'_V + C'_H) / m)·β + ((C'_H·l_H - C'_V·l_V) / (m·v_ref))·r + (C'_V / m)·δ*_L

(the yaw rate r is measured directly)

[Figure: one-track model (3 DOF), showing the yaw rate r, the velocity v, the x-y axes and the centre of gravity CG]


    V.1.2. Physical simplifications

The vehicle lateral dynamics is a very complicated physical phenomenon; here we use a simplified model, the one-track model, to describe it. Some important assumptions have been made for the application of the one-track model:

1) the height of the centre of gravity is zero,
2) there is no pitch and roll motion, and
3) the model is purely linear.

    For the derivation of the lateral dynamics, a coordinate system is fixed to the centre of

    gravity. The equations of motion are described according to the force balances and torque

    balances at the centre of gravity.

    Therefore, from the application viewpoint, because of the one-track model's simplifications, especially in the tire model, it has been verified that the model is a good approximation of the vehicle dynamics only when the lateral acceleration a_y is smaller than 0.4g on normal dry asphalt roads [1]. It is also only valid for non-critical driving situations, because the pitch and roll motions have been neglected.



    V.1.3. Unknown input signal

    In this model there exists one unknown input signal, the road bank angle φ_x. This signal cannot be measured directly in the general vehicle system, so it is normally treated as an unknown input.

    V.1.4. Model parameter variation

    Vehicle reference velocity v_ref

    The system matrices A, B and C are functions of the vehicle reference velocity; therefore the system is strictly an LTV system. However, for the purpose of vehicle lateral dynamics research, the variation of the longitudinal vehicle velocity is comparatively slow, so it can be considered constant during one observation (such as a short time window of 1 second for the residual evaluation).

    Vehicle mass m

    As the load of the vehicle varies, the vehicle sprung mass and the inertia change accordingly. The changes are especially large for trucks; for a passenger car, however, the change caused by the number of passengers can normally be neglected compared to the large total mass.



    Vehicle cornering stiffness C

    Cornering stiffness is the change in lateral force per unit slip-angle change at a specified normal load, in the linear range of the tire:

    $$ C_\alpha = \frac{dF_y}{d\alpha} = f(F_z, \ldots) $$

    Nominal values for our research car (Mercedes-Benz S500):

    C'_V = 103600 [N/rad], C'_H = 179000 [N/rad]

    Actually, the tire sideslip stiffnesses C'_V and C'_H depend on the road-tire friction coefficient, wheel load, camber, toe-in, wheel pressure, etc., see [3]. The problem with this fact is that the number of unknown parameters and functions is very large and they are very complex. There are exact functions for the non-linear tire model, such as the magic formula and the HSRI model, but they can only be used in off-line tire or vehicle simulations.

    The general simple way to linearize the non-linear tire model is to linearize its characteristic at the origin, so the sideslip stiffness C_α is taken as a constant. However, this assumption is only valid for small sideslip angles and a constant road adhesion coefficient.

    In some papers [1], [2], based on the stiffness of the steering mechanism (steering column, gear, etc.), the following assumption has been used: C'_H = k C'_V.
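The linearization at the origin described above can be sketched numerically: take any non-linear lateral-force curve and read off its slope at zero slip angle. The Python snippet below is illustrative only; the Pacejka-style coefficients B, C, D, E are made-up example values, not the benchmark car's.

```python
import numpy as np

def magic_formula(alpha, B=10.0, C=1.9, D=4500.0, E=0.97):
    """Pacejka-style lateral force vs. slip angle (illustrative coefficients)."""
    return D * np.sin(C * np.arctan(B * alpha - E * (B * alpha - np.arctan(B * alpha))))

def cornering_stiffness(tire, h=1e-6):
    """C_alpha = dF_y/d_alpha at the origin, by central difference."""
    return (tire(h) - tire(-h)) / (2.0 * h)

C_alpha = cornering_stiffness(magic_formula)
# For the magic formula, the analytic slope at the origin is B * C * D.
```

The numerically estimated stiffness matches the analytic slope B·C·D, which is exactly the constant C_α used in the linear one-track model.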

    [Figure: tire slip angle, defined between the tire heading direction and the direction of travel, which generates the lateral force]


    V.1.5. Model noise

    The sensor noises are caused by the lateral acceleration sensor, the yaw rate sensor and the steering angle sensor. All sensor noise data were measured and supplied by the Bosch company [2]. The details are given in the table.

    V.1.6. Typical failures

    Some typical failure types and values for the benchmark system are given in Table 1. The given values are only meant to show a realistic range for the faults; other fault values are also possible. For the steering angle, a ramp fault is improbable because of the sensor type, so no fault value is given. Multiplicative faults are also not very probable, and no realistic fault values are known to the authors at this moment.

    Table 1 Typical failures for the benchmark system

    Offset faults           | Step                        | Ramp
    Yaw rate                | 2 °/s, 5 °/s, 10 °/s        | 10 °/s/min
    Lateral acceleration    | 2 m/s^2, 5 m/s^2            | 4 m/s^2/s, 10 m/s^2/s
    Steering angle          | 15 °, 30 °                  | --

    Multiplicative faults   |                             |
    Yaw rate                | (100 ± 20) %, (100 ± 40) %  | 100 % to (100 ± 50) % in 10 s
    Lateral acceleration    | (100 ± 50) %, (100 ± 80) %  | 100 % to (100 ± 50) % in 10 s
    Steering angle          | --                          | --
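The offset faults in Table 1 are additive signals: a constant step and/or a ramp that starts at the fault time. A minimal Python sketch (illustrative, not benchmark code; the function name is hypothetical) shows how such faults can be injected into a sensor signal for testing:

```python
import numpy as np

def add_offset_fault(signal, t, t_fault, step=0.0, ramp=0.0):
    """Additive sensor fault: a step [unit] plus a ramp [unit/s] from t_fault on.

    Values such as a 2 deg/s yaw-rate step or a 10 deg/s per minute ramp
    (Table 1) map directly onto the `step` and `ramp` arguments.
    """
    faulty = signal.copy()
    active = t >= t_fault
    faulty[active] += step + ramp * (t[active] - t_fault)
    return faulty

t = np.arange(0.0, 10.0, 0.1)
r_nominal = np.zeros_like(t)                 # nominal yaw rate signal
r_faulty = add_offset_fault(r_nominal, t, t_fault=5.0, step=np.deg2rad(2.0))
```

Before t_fault the signal is untouched; afterwards it carries the constant 2 °/s offset.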


    V.1.7. Physical parameters of the vehicle lateral dynamic model

    Physical constants

    Symbol | Symbol in Matlab | Value      | Unit       | Explanation
    g      | g                | 9.80665    | [m/s^2]    | Gravity constant

    Vehicle parameters

    i_L    | i_L              | 18.0       | [-]        | Steering transmission ratio
    m_R    | m_R              | 1630       | [kg]       | Rolling (sprung) mass
    m_NR   | m_NR             | 220        | [kg]       | Non-rolling (unsprung) mass
    m      | m = m_R + m_NR   |            | [kg]       | Total mass
    l_V    | l_V              | 1.52931    | [m]        | Distance from the vehicle CG to the front axle
    l_H    | l_H              | 1.53069    | [m]        | Distance from the vehicle CG to the rear axle
    I_z    | I_z              | 3870       | [kg·m^2]   | Moment of inertia about the z-axis of the vehicle
    K_R    | K_phi            | 0.9429     | [-]        | Roll coefficient

    Tire model parameters

    C'_V   | c_alpha_V        | 103600     | [N/rad]    | Front tire cornering stiffness
    C'_H   | c_alpha_H        | 179000     | [N/rad]    | Rear tire cornering stiffness

    System variables

    Symbol | Symbol in Matlab | Unit       | Explanation
    β      | Beta             | [rad]      | Vehicle side slip angle
    r      | r                | [rad/s]    | Vehicle yaw rate
    δ*_L   | Delta_L          | [rad]      | Vehicle steering angle
    a_y    | Ay               | [m/s^2]    | Vehicle lateral acceleration
    v_ref  | v_ref            | [km/h]     | Vehicle longitudinal velocity

    Sensor noise data

    Symbol | Symbol in Matlab | Standard deviation | Unit      | Explanation
    n_ay   | N_ay             | (0.2, 2.4)         | [m/s^2]   | Lateral acceleration sensor noise
    n_r    | N_r              | (0.2, 0.9)         | [rad/s]   | Yaw rate sensor noise
    n_δL   | N_delta          |                    | [rad]     | Steering angle sensor noise
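For simulation studies, sensor noise of this kind is usually modelled as zero-mean Gaussian noise with the tabulated standard deviation. A minimal Python sketch (illustrative only; the function name is hypothetical) is:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def add_sensor_noise(signal, std, rng=rng):
    """Add zero-mean Gaussian noise with the table's standard deviation."""
    return signal + rng.normal(0.0, std, size=signal.shape)

# Example: a noisy lateral acceleration measurement with std = 0.2 m/s^2
a_y_clean = np.zeros(20000)
a_y_meas = add_sensor_noise(a_y_clean, std=0.2)
```

The empirical standard deviation of the generated measurement matches the configured sensor noise level.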

    V.1.8. References

    [1] Marcus Börner, Rolf Isermann, "Adaptive one-track model for critical lateral driving situations".
    [2] Bosch GmbH, "Fehlerarten für die ISP-Sensorik" (fault types for the ISP sensors), internal report, 2003.
    [3] S. X. Ding, Y. Ma, H.-G. Schulz, B. Chu et al., "Fault tolerant estimation of vehicle lateral dynamics", IFAC Safeprocess, 2003.
    [4] Mitschke, M., "Dynamik der Kraftfahrzeuge", Band C, Springer-Verlag, 1990.

    V.2. Test 1

    The following graphic represents measurements of the steering angle.

    Figure 9 The steering angle (sample period 0.1 s)

    From this data set, a subset is selected for training. The main principle used for determining the most useful data subset concerns the bandwidth of the signal over the specified interval. As long as the correct data set is selected, fewer samples produce, in most cases, better and more efficient training.

    For the current project, determining a procedure and implementing an algorithm for choosing the training set is out of scope, but, as mentioned before, such a procedure would make the toolbox much easier to use.
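The subset selection used in the tests below is a simple windowed decimation: keep every `step`-th sample between `window_start` and `window_stop`. A sketch in Python (the toolbox is in MATLAB; the function name here is hypothetical):

```python
def select_training_subset(data, window_start, window_stop, step):
    """Keep every `step`-th sample inside [window_start, window_stop)."""
    return data[window_start:window_stop:step]

# With the Test 1 settings (window_start=5000, window_stop=8500, step=10),
# 350 training samples are drawn from the full measurement record.
samples = select_training_subset(list(range(10000)), 5000, 8500, 10)
```

With the Test 1 settings this yields 350 samples, a substantial reduction of the training set.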

    The output variable, the lateral acceleration, is plotted in the next graphic:


    Figure 10 Lateral Acceleration

    From observing the previous graphic, it is obvious that this data set includes a small amount

    of noise.

    For the first training test the following parameters were set:

    Training subset
      window_start   5000
      window_stop    8500
      step           10

    Training strategy
      Phase 1: Gradient descent, batch training
        Learning rate     0.01
        Number of epochs  100
      Phase 2: Gradient descent, batch training
        Learning rate     0.005
        Number of epochs  300
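The two-phase strategy above (a coarse phase with a larger learning rate, then a fine phase with a smaller one) can be sketched generically. In this Python illustration a linear least-squares fit stands in for the neuro-fuzzy structure, since the actual toolbox model is not reproduced here; only the phase scheduling is the point.

```python
import numpy as np

def batch_gd(params, grad_fn, lr, epochs):
    """One training phase: full-batch gradient descent at a fixed learning rate."""
    for _ in range(epochs):
        params = params - lr * grad_fn(params)
    return params

# Stand-in problem: fit X @ w = y by minimizing the MSE.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
grad = lambda w: 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE

w = np.zeros(2)
w = batch_gd(w, grad, lr=0.01, epochs=100)    # phase 1: coarse search
w = batch_gd(w, grad, lr=0.005, epochs=300)   # phase 2: fine tuning
```

The second phase with the lower learning rate refines the solution found by the first without overshooting, which mirrors the intent of the two-phase setup in the tests.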

    After the training process, the following response was obtained:


    Figure 11 Training results

    Using only the steering angle as input, the neuro-fuzzy structure obviously cannot be very precise, because it cannot model the internal noise that causes the small ripples in the reference output.

    This suggests introducing an internal feedback.

    Testing the structure on the initial data set, the following results were obtained:


    Figure 12 Testing Set (MSE=0.028)

    V.3. Test 2

    As stated in the previous section, considering an internal feedback as an extra input signal for the neuro-fuzzy structure might produce better results.

    Thus, with the following notations:

    x = steering angle;
    y = lateral acceleration;

    for this second test, two inputs are considered for the neuro-fuzzy structure:

    input_k = [x_k, y_{k-1}]    (41)
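Building the regressors of equation (41) from the two measured time series is a simple shift-and-stack operation. A Python sketch (illustrative; the helper name is hypothetical):

```python
import numpy as np

def build_regressors(x, y):
    """Inputs for Test 2: at step k feed [x_k, y_{k-1}].

    The first output sample has no predecessor, so the series start at k = 1.
    """
    inputs = np.column_stack([x[1:], y[:-1]])
    targets = y[1:]
    return inputs, targets

x = np.arange(5.0)          # steering angle samples x_k
y = 10.0 * np.arange(5.0)   # lateral acceleration samples y_k
inputs, targets = build_regressors(x, y)
```

Each input row pairs the current steering angle with the previous lateral acceleration, which is exactly the internal-feedback scheme of Test 2.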

    Training subset
      window_start   5000
      window_stop    8500
      step           10

    Training strategy
      Phase 1: Gradient descent, batch training
        Learning rate     0.009
        Number of epochs  100
      Phase 2: Gradient descent, batch training
        Learning rate     0.005
        Number of epochs  300

    Figure 13 Training results (2 inputs, MSE = 0.01)

    For both tests, the training process took around 58 seconds, which is a major improvement over the previous version of the toolbox.

    For the testing set, the following graphic shows the results:


    Figure 14 Testing set (MSE=0.0047)

    Other tests with more inputs were considered, but the results are inconclusive at this time.


    VI. Conclusions

    The main improvement brought by this project over the previous version of the MATLAB toolbox is speed. It is well known that artificial intelligence techniques like neural networks, fuzzy systems, neuro-fuzzy systems and evolutionary algorithms (genetic algorithms, evolution strategies, genetic programming) are time consuming and technology dependent. This version of the toolbox has addressed the excessive time demands of the training process, its performance now being influenced only by the training parameters.

    On the other hand, there is a great number of parameters that need to be adjusted to obtain optimum performance, so parameter management is required.

    One partial solution is to define training strategies that allow some degree of separation between the parameters.

    A previous study of the proposed structure revealed a great sensitivity of the output to variations of the Sugeno fuzzy system's parameters: small updates can cause large changes in performance (in most cases, a decrease). This means that varying the fuzzy parameters allows a quick exploration of the search space and finding the approximate location of the global optimum.

    The restrictions imposed on the ARMA filters, mainly due to stability considerations, make the filter parameters well suited for fine-tuning the performance of the entire system.
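The stability restriction on the ARMA filters amounts to the usual condition that all poles of the autoregressive part lie strictly inside the unit circle. A Python sketch of such a check (illustrative; the toolbox's own implementation is not shown here):

```python
import numpy as np

def arma_is_stable(a):
    """Stability of the AR part of an ARMA filter y_k = sum_i a_i * y_{k-i} + ...

    All roots of z^n - a_1 z^(n-1) - ... - a_n must lie inside the unit circle.
    """
    poly = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return bool(np.all(np.abs(np.roots(poly)) < 1.0))

arma_is_stable([0.5, 0.3])   # stable: both roots inside the unit circle
arma_is_stable([1.2])        # unstable: pole at z = 1.2
```

During tuning, any candidate filter parameter update that fails this check can simply be rejected, which is one way to enforce the restriction mentioned above.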

    Another modification of the toolbox is the implementation of the data structures. An OOP approach was chosen, so monitoring the behaviour of each component became an easy task; finding and isolating errors and faults is also ensured.

    In conclusion, this version brings a great increase in speed, making the use of the toolbox more efficient. Of course, future studies will have to find the best way to increase performance further. It is obvious that classic training algorithms are not well suited, so specialized versions must be developed.

    The results presented confirm the modelling capabilities of the structure, and also that the modelling can be done efficiently.


    VII. Bibliography and references

    1) Kaufman, Arnold - Fundamental Theoretical Elements
    2) Dasgupta, Dipankar - Evolutionary Algorithms in Engineering Applications
    3) Hagan, Martin T. - Neural Network Design
    4) Russell, Stuart J. - Artificial Intelligence: A Modern Approach
    5) Bellman, Richard Ernest - Methods of Nonlinear Analysis
    6) Bäck, Thomas - Evolutionary Algorithms in Theory and Practice (Evolution Strategies, Evolutionary Programming, Genetic Algorithms)
    7) www.wikipedia.org
    8) http://www.shef.ac.uk/acse/research/ecrg/gat.html
