International Journal of Mechanical & Mechatronics Engineering IJMME-IJENS Vol:15 No:03 1
150603-7474-IJMME-IJENS © June 2015 IJENS I J E N S
Bearing fault diagnosis using CWT, BGA and Artificial Bee Colony Algorithm
S. Devendiran 1, K. Manivannan 2, C. Rajeswari 3, Joshua Michael Amarnath 4 & Apoorv Prasad 5
1, 2 & 4 School of Mechanical and Building Sciences, VIT University, Vellore, India
3 & 5 School of Information Technology & Engineering, VIT University, Vellore, India
[email protected] 3 [email protected]
4joshua [email protected]
Abstract– Health diagnosis of bearings is essential to reduce
breakdowns of rotating machinery. An intelligent method to
diagnose bearing faults from vibration signals is proposed.
This paper applies a binary genetic algorithm (BGA) to the
feature selection process and discusses the role of the fitness
function by applying different fitness functions in the GA.
Vibration signals for various bearing conditions are acquired
from a test rig, and statistical features are extracted from the
wavelet coefficients obtained by the continuous wavelet
transform (CWT). A new heuristic classifier, the artificial bee
colony (ABC) algorithm, is applied, and its fault diagnosis
results are compared with those of a learning vector
quantization (LVQ) classifier on the basis of classification
accuracy. The predominant features are selected with the
binary genetic algorithm (BGA), a well-known feature
selection approach.
Index Term-- Test rig, Fault diagnosis, Continuous wavelet
decomposition, Statistical features, Binary genetic algorithm
(BGA), Artificial bee colony (ABC) algorithm, Learning
vector quantization (LVQ)
I. INTRODUCTION
The dynamic performance of rotating components strongly
influences the efficiency of any rotating machine. This is
particularly true of bearings, which are considered the heart of
rotating machinery. Accurate diagnosis of rolling bearing
faults can reduce or prevent accidents, because the
consequences of a rolling bearing breakdown can be serious [1].
Fault-finding methodologies for rolling bearings have gained
importance in protecting machinery from failure [2]; it is also
important to know the nature and severity of a bearing fault in
order to select the most appropriate maintenance action. In
bearing health diagnostics, vibration analysis is the most
preferred, reliable and popular technique [3-4]. Many signal
processing techniques have been developed and applied to
machine diagnosis in this research area. They include
conventional techniques such as spectral analysis [5-6]. Some
researchers applied the FFT to the intrinsic mode functions of
the Hilbert–Huang transform, which provides multi-resolution
at various frequency scales and captures both the frequency
content of the signal and its variation, for bearing health
diagnosis [7]. A few researchers compared the Fourier
Transform (FT), Windowed Fourier Transform (WFT) and
Wavelet Transform (WT) for phase calculation and found that
WT and WFT were more appropriate than FT at
discontinuities [8]. The wavelet transform is applied to find
changes in the vibration signals obtained from monitored
bearings. In particular, the wavelet and envelope detection
(ED) methods were deployed in fault diagnosis of rolling
element bearings; the results showed that both methods are
effective in finding the bearing fault, but the wavelet method
is less time-consuming [9]. Continuous wavelet transforms
(CWT) are often used to find singularity points in output
signals sampled from machines and, for fault diagnostics,
wavelet transform modulus maxima are used to detect abrupt
changes in the vibration signals obtained from operating
bearings [10]. A specific method, singularity analysis across
all scales of the continuous wavelet transform, is performed to
identify the location (in time) of defect-induced bursts in the
vibration signals. The one-dimensional CWT does not reduce
the dimensionality of the raw data, but it preserves features
that the DWT misses and is thus complementary to the DWT;
simulation results showed more distinctive fault signatures in
the wavelet decomposition coefficients than in the actual
signal [11].
In recent years, intelligent fault diagnosis of rolling bearings
based on statistical features extracted from time- and
frequency-domain vibration signals has received significant
attention, because these features contain significant
information about component failure [12-13]. Over the last
decade, rolling-element bearing health monitoring has been an
important research topic in the pattern recognition domain.
However, most studies have focused on fault type
classification based on acquired vibration fault samples using
classifiers such as support vector machines and neural
networks [14-15]. Researchers [16] prefer to classify data with
a selected subset of features in order to reduce the
dimensionality of the data without compromising the useful
information and, consequently, to reduce computation time.
The aim of the present work is to develop a new automatic
monitoring and diagnosing procedure that detects the
condition of rolling-element bearings at an early stage with
less computation time and higher classification accuracy.
A traditional frequency spectrum analysis of the normal and
faulty signals, interpreted manually by means of a
knowledge-based system, is also presented to diagnose
whether the defect is in the inner race, outer race or rolling
element. An automated procedure that combines CWT-based
feature extraction, GA for optimum feature selection and a
comparison of results from classifiers such as the ABC and
LVQ algorithms has not yet been attempted. This paper is one
such attempt to apply the above-mentioned methods in
bearing fault diagnosis.
II. EXPERIMENTAL STUDIES
An experimental setup was designed to perform the tests that
validate the proposed methodology (Fig. 1). The experimental
setup shown in Fig. 2 consists of a variable frequency drive
(VFD), a three-phase 0.5 hp AC motor, a bearing, a belt drive,
a gearbox and a brake drum dynamometer with a weighing
scale. A standard deep groove ball bearing (No. 6001) is used
in this experiment. A tri-axial accelerometer (vibration sensor)
is fixed on the bearing block to measure the vibration signals.
A 24-bit ATA0824DAQ51 data acquisition system was used,
and the signals were collected at a sampling frequency of
12800 Hz. The bearing was driven by the motor at a constant
rotating speed of 1700 r/min. A constant load was applied by
the brake drum dynamometer, and the speed was monitored
by a tachometer. The bearing conditions (normal, outer race
fault, inner race fault and ball fault, each fault being a 1 mm
deep crack) are depicted in Fig. 2. These artificial faults were
created using the electric discharge machining process. For
each bearing condition, experiments were carried out to
acquire time-domain vibration signals at the prescribed load
and speed. The FFT (Fast Fourier Transform) is the most
common transformation technique in health monitoring
because fault conditions are easy to interpret from it, but the
WT is more appropriate than the FFT at discontinuities. The
WT transfers the data from the time domain to the
time-frequency domain.
Fig. 1. Flow chart for automated bearing condition diagnosis
Fig. 2 shows the test rig together with the normal, outer race,
inner race and ball fault (1 mm crack depth) conditions; the
faults were formed using the wire-cut EDM process. The
vibration signal acquisition experiment is carried out for all
bearing conditions. The number of data samples for each
bearing condition is given in Table I. Sample time-domain
signals of the four bearing conditions are illustrated in Fig. 3.
The Fast Fourier transform (FFT) is a commonly used
transformation technique in health monitoring; it transforms
the time series data to the frequency domain by decomposing
the sampled signal into sine and cosine waves. The FFT is
executed on sample signals for all the states of the bearing.
The bearing can be diagnosed by analysing abnormal
frequency-domain amplitudes. The frequency of the abnormal
vibration is called the fault frequency and is determined by
the fault location. The following equations give the fault
characteristic frequencies for the different parts of the bearing.
The characteristic bearing frequencies are BPFO (Ball Pass
Frequency Outer Race), BPFI (Ball Pass Frequency Inner
Race), FTF (Fundamental Train Frequency) and BSF (Ball
Spin Frequency). These characteristic frequencies are useful
for finding defects of the bearing components from the
frequency of the component concerned and its harmonics.
Frequency analysis may be the most fundamental approach for
bearing condition monitoring and fault detection.
Traditionally, finding those frequencies and measuring the
amplitude variations at each frequency, its sidebands and its
harmonics gives information on the health condition of the
bearing (shown in Fig. 4 and Fig. 5).
Shaft rotational frequency: $F_s\,(\mathrm{Hz}) = \dfrac{\text{Shaft speed (r/min)}}{60}$ (1)
Ball passing frequency outer race: $\mathrm{BPFO} = \dfrac{N_b}{2}\, F_s \left(1 - \dfrac{B_d}{P_d}\cos\phi\right)$ (2)
Ball passing frequency inner race: $\mathrm{BPFI} = \dfrac{N_b}{2}\, F_s \left(1 + \dfrac{B_d}{P_d}\cos\phi\right)$ (3)
Fundamental train frequency: $\mathrm{FTF} = \dfrac{F_s}{2} \left(1 - \dfrac{B_d}{P_d}\cos\phi\right)$ (4)
Ball spin frequency: $\mathrm{BSF} = \dfrac{P_d}{2 B_d}\, F_s \left(1 - \dfrac{B_d^2}{P_d^2}\cos^2\phi\right)$ (5)
where $N_b$ is the number of balls, $B_d$ the ball diameter, $P_d$ the pitch diameter and $\phi$ the contact angle.
[Fig. 1 flow chart steps: Raw signal extraction → Continuous wavelet transform → Feature extraction → Feature selection by GA → ABC/LVQ classification → Classification comparison]
Fig. 2. Test rig (experimental set up) and different bearing conditions
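As a worked illustration of Eqs. (1)-(5), the following Python sketch evaluates the characteristic frequencies at the 1700 r/min shaft speed used in this experiment. The function name and the bearing geometry (number of balls, ball diameter, pitch diameter, contact angle) are illustrative assumptions and are not values reported in this paper; the actual dimensions of the bearing under test should be substituted.

```python
import math

def bearing_frequencies(rpm, n_balls, ball_d, pitch_d, contact_deg=0.0):
    """Return shaft, BPFO, BPFI, FTF and BSF frequencies in Hz (Eqs. 1-5)."""
    fs = rpm / 60.0                                            # Eq. (1)
    # r = (B_d / P_d) * cos(contact angle), shared by Eqs. (2)-(5)
    r = (ball_d / pitch_d) * math.cos(math.radians(contact_deg))
    return {
        "Fs": fs,
        "BPFO": 0.5 * n_balls * fs * (1.0 - r),                # Eq. (2)
        "BPFI": 0.5 * n_balls * fs * (1.0 + r),                # Eq. (3)
        "FTF": 0.5 * fs * (1.0 - r),                           # Eq. (4)
        "BSF": 0.5 * (pitch_d / ball_d) * fs * (1.0 - r * r),  # Eq. (5)
    }

# Hypothetical geometry in mm, chosen only to exercise the formulas.
freqs = bearing_frequencies(rpm=1700, n_balls=8, ball_d=4.76, pitch_d=20.0)
for name, value in freqs.items():
    print(f"{name}: {value:.2f} Hz")
```

Two quick sanity checks follow from the formulas themselves: BPFO + BPFI equals the number of balls times the shaft frequency, and FTF equals BPFO divided by the number of balls.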
Fig. 3. Time domain raw signals (a) Normal (b)Outer race fault (c) Ball fault (d)Inner race fault
Fig. 4. Frequency spectrum signals with harmonic variations in characteristic frequencies (a) normal (b) outer race fault (c) ball fault (d) Inner race fault
In the proposed work, the wavelet transform is used as a dimensionality reduction function for the raw data. The
features are extracted from the wavelet-transformed data. These features form a transformed space that is used as the input to
the next processes, feature selection and classification. A limitation of the FFT is that it cannot capture non-stationary transient
information in the samples, which is why this paper focuses on the wavelet transform.
TABLE I
Sample structure used for training and testing of proposed
Fig. 5. Frequency spectrum signals (a) normal (b) outer race fault (c) ball fault (d) Inner race fault
III. THEORETICAL BACKGROUND OF WAVELET TRANSFORM
The Wavelet Transform (WT) is a time-frequency decomposition
of a sample signal onto "wavelet" basis functions. Wavelet
analysis is widely used for decomposing, de-noising and
analysing non-stationary signals. At high frequencies the WT
gives good time resolution and poor frequency resolution,
while at low frequencies it gives good frequency resolution
and poor time resolution. Wavelet analysis proceeds by
breaking up a signal into shifted and scaled versions of its
mother (or original) wavelet, that is, obtaining one
high-frequency term from each level and one low-frequency
residual from the last level of decomposition. In other words,
decomposition is the process of breaking a signal into
lower-resolution components level by level. Two categories of
wavelet transform are widely used: the Continuous Wavelet
Transform (CWT) and the Discrete Wavelet Transform
(DWT). The continuous wavelet transform creates a
time-frequency representation of the signal with very good
time and frequency localization. This sets the wavelet
transform apart from the Fourier Transform, whose result is
an accumulation of higher-frequency sine waves spread
throughout the frequency axis.
Bearing condition | Number of samples for training | Number of samples for testing
All normal | 100 | 20
Inner race fault | 100 | 20
Outer race fault | 100 | 20
Ball fault | 100 | 20
CWT is widely used to divide a continuous-time function into
wavelets. The continuous wavelet transform of a time function
z(t) is denoted as:
$$CWT(x, y) = \int_{-\infty}^{\infty} z(t)\, \psi^{*}_{(x,y)}(t)\, dt \qquad (6)$$
where $\psi_{(x,y)}(t)$ is a continuous function in both the time
domain and the frequency domain, called the mother (original)
wavelet, and * denotes the complex conjugate.
Further expansion of $\psi_{(x,y)}(t)$ gives
$$\psi_{(x,y)}(t) = \frac{1}{\sqrt{|x|}}\, \psi\!\left(\frac{t - y}{x}\right), \quad x, y \in \mathbb{R},\ x \neq 0 \qquad (7)$$
In general, the mother wavelet is the source function from
which the translated and scaled daughter wavelets are
generated. As given in equation (7), the transformed signal
CWT(x, y) is defined on the x-y plane, where the scale x and
the translation y change the frequency and the time location
of the wavelet. Whenever high frequency resolution is
required, decreasing x constructs a high-frequency wavelet,
and vice versa. As y increases, the wavelet traverses the length
of the input signal, and the coefficients increase or decrease in
response to changes in the local time and frequency content of
the signal. Acquired signals are decomposed by the
continuous wavelet transform, and the result is then used for
extracting various statistical features. Transform coefficients
are a measure of similarity between the raw signal and the
daughter wavelets [17].
The Morlet wavelet has equal octave intervals and led to the
first formalization of the continuous wavelet transform. It is a
cosine function that decays exponentially at both ends
(Fig. 4); it looks like an impulse function modulated with a
cosine function. The Morlet wavelet is more suitable in cases
of variations found in abnormal stationary signals.
Fig. 4. Morlet wavelet
TABLE II
Feature | Equation | Definition
Mean | $k_{mean} = \frac{1}{n}\sum_{i=1}^{n} k_i$ | Average of all values in the population
Standard deviation | $k_{Sd} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(k_i - k_{mean})^2}$ | Square root of an unbiased estimator of the variance of the population
Kurtosis | $k_{kur} = \frac{\frac{1}{n}\sum_{i=1}^{n}(k_i - k_{mean})^4}{k_{Sd}^4}$ | Fourth central moment of the values, divided by the fourth power of the standard deviation
Root mean square | $k_{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} k_i^2}$ | Square root of the mean of the squared values
Variance | $k_{var} = \frac{\sum_{i=1}^{n}(k_i - k_{mean})^2}{n-1}$ | Measures how far a set of numbers is spread out
Peak to RMS | $k_{peak.to.rms} = \frac{\max_i |k_i|}{k_{rms}}$ | Ratio of the largest absolute value to the root mean squared value
Peak to peak | $k_{p2p} = k_h - k_l$ | Difference between the largest ($k_h$) and smallest ($k_l$) values
Skewness | $k_{skew} = \frac{\frac{1}{n}\sum_{i=1}^{n}(k_i - k_{mean})^3}{k_{Sd}^3}$ | Third central moment of the values, divided by the cube of the standard deviation
Minimum | $k_{min} = \min(k_i)$ | Minimum value in the set
Maximum | $k_{max} = \max(k_i)$ | Maximum value in the set
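The ten features of Table II can be computed from a vector of wavelet coefficients as in the following Python sketch. The function name and the toy input are assumptions made for illustration; the sketch follows the definitions in the table (unbiased variance, central moments normalised by the standard deviation) and assumes a non-constant input so that the standard deviation is non-zero.

```python
import math

def statistical_features(k):
    """Compute the ten Table II features for a list of coefficients k."""
    n = len(k)
    mean = sum(k) / n
    var = sum((v - mean) ** 2 for v in k) / (n - 1)   # unbiased variance
    sd = math.sqrt(var)                               # must be non-zero
    rms = math.sqrt(sum(v * v for v in k) / n)
    m3 = sum((v - mean) ** 3 for v in k) / n          # third central moment
    m4 = sum((v - mean) ** 4 for v in k) / n          # fourth central moment
    peak = max(abs(v) for v in k)                     # largest absolute value
    return {
        "mean": mean,
        "std": sd,
        "kurtosis": m4 / sd ** 4,
        "rms": rms,
        "variance": var,
        "peak2rms": peak / rms,
        "peak2peak": max(k) - min(k),
        "skewness": m3 / sd ** 3,
        "min": min(k),
        "max": max(k),
    }

# Toy coefficient vector, used only to show the call.
feats = statistical_features([1.0, -1.0, 1.0, -1.0])
print(feats)
```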
After a series of experiments, it was found that a scale factor
of 8 gave better results than all other scales. The equation for
the Morlet wavelet transform is given by equation (8).
( )
√ (
)
(8)
: central frequency of mother wavelet
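To make the role of the scale factor concrete, the Python sketch below evaluates the CWT of Eqs. (6)-(7) at a single scale by direct summation. It assumes the common real-valued Morlet form exp(-t²/2)·cos(5t), which may differ from the exact expression intended in Eq. (8); the function names, the truncation of the wavelet at ±4 scale widths and the unit sample spacing are likewise assumptions made for illustration.

```python
import math

def morlet(t, w0=5.0):
    """Real-valued Morlet wavelet (assumed form): Gaussian-windowed cosine."""
    return math.exp(-0.5 * t * t) * math.cos(w0 * t)

def cwt_single_scale(z, scale=8.0, half_width=4.0):
    """Return CWT(scale, y) for every sample shift y, with unit sample spacing."""
    reach = int(half_width * scale)          # wavelet is negligible beyond this
    # Daughter wavelet of Eq. (7), sampled at integer offsets t - y.
    kernel = [morlet(m / scale) / math.sqrt(scale)
              for m in range(-reach, reach + 1)]
    coeffs = []
    for y in range(len(z)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = y + j - reach              # sample index t for offset t - y
            if 0 <= idx < len(z):
                acc += z[idx] * w            # real wavelet: conjugate is itself
        coeffs.append(acc)
    return coeffs

# A unit impulse stands in for a defect-induced burst.
signal = [0.0] * 64
signal[32] = 1.0
coeffs = cwt_single_scale(signal, scale=8.0)
print(max(coeffs), coeffs.index(max(coeffs)))
```

For the impulse input the coefficients trace out the scaled wavelet itself, peaking at the impulse location, which is the sense in which the transform localises bursts in time.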
Figure 3 depicts the raw signal and the corresponding Morlet
continuous wavelet transform coefficients for the four types
of signals: normal bearing, inner race fault, outer race fault
and ball fault. Comparing the raw and transformed signals
shows that the anomalies are more distinct in the transformed
signal, which helps to increase classifier accuracy.
Statistical features such as the mean and standard deviation
(shown in Table II) were extracted from the coefficients, and
the calculated feature sets (shown in Table III) are used as
input for feature selection and classification.
TABLE III
Sample feature values extracted of training data (before normalization) from 1D Morlet-wavelet decomposition for 3 different conditions of bearing fault
IV.GENETIC ALGORITHM
The Genetic Algorithm (GA) is a heuristic search technique
that imitates the process of natural selection. It is routinely
used to generate good solutions to search and optimization
problems. The algorithm works towards an optimal solution
through a series of iterative calculations based on the
Darwinian principle of 'survival of the fittest'. GA begins
with a set of chromosomes called the population. After
initializing the population randomly, GA evaluates the fitness
of each individual. The algorithm then generates successive
generations of the population in order to obtain an optimal
solution. Successive generations are created using operators
inspired by natural evolution, such as mutation and crossover.
At each generation a fitness function is used to evaluate the
candidate solutions. Crossover and mutation are the main
operations that affect the fitness value. Every successive
generation is the product of mutation or crossover of the
previous generation, and the candidates are chosen according
to their fitness value. The candidates (chromosomes) with the
best fitness have a higher probability of being selected for
reproduction. Thus, after successive generations, the best
candidate or solution is obtained [18].
In this paper the Genetic Algorithm is used to select the best
features, which are then used as the input to the classifier.
The initial population therefore consists of randomly selected
features. Each candidate represents a feature among the total
of ten.
The objectives of the GA are:
Establish the least within-class distance
Establish the maximum between-class distance
These two objectives are applied to each feature and the best
features are selected. Finding the feature with the minimum
within-class distance ensures that the samples lie at the least
distance from the centre of their respective class. This gives
better resemblance among the samples within each class and
improves the chances of correct classification [19]. The
within-class distance is defined as:
$$D_c = \frac{1}{n_c}\sum_{j=1}^{n_c}\left(S_j^{c} - M_c\right)^{T}\left(S_j^{c} - M_c\right) \qquad (9)$$
where c is the class, c = 1, 2, 3, 4; $S_j^{c}$ is sample j of class c,
j = 1, 2, 3, …, $n_c$; and $M_c$ is the mean vector of class c.
A maximum between-class distance ensures that the feature
diverges well from the other classes, thereby increasing
classification accuracy.
$$D_b = \sum_{c=1}^{4} n_c \left(M_c - M\right)^{T}\left(M_c - M\right) \qquad (10)$$
where M is the mean vector of all the classes.
The cost function or the fitness function defined for GA is
Feature | Normal sample-1 | Normal sample-2 | Inner-race fault sample-1 | Inner-race fault sample-2 | Outer-race fault sample-1 | Outer-race fault sample-2 | Ball fault sample-1 | Ball fault sample-2
Mean | -0.0003 | 0.00051 | -7.00E-05 | 0.00034 | -0.00049 | 0.00036 | -0.00029 | 2.00E-05
Standard deviation | 0.1755 | 0.16975 | 0.12791 | 0.13382 | 0.10916 | 0.11237 | 0.20063 | 0.23035
Kurtosis | 2.49133 | 2.38996 | 2.3897 | 3.05922 | 4.16713 | 3.72449 | 7.08792 | 6.79926
RMS | 0.17541 | 0.16967 | 0.12784 | 0.13375 | 0.10911 | 0.11231 | 0.20053 | 0.23023
Variance | 0.0308 | 0.02882 | 0.01636 | 0.01791 | 0.01192 | 0.01263 | 0.04025 | 0.05306
Peak2rms | 2.58228 | 2.53587 | 2.67953 | 3.54373 | 4.28417 | 3.72643 | 4.95698 | 5.15222
Peak2peak | 0.89626 | 0.85445 | 0.68458 | 0.91314 | 0.89701 | 0.81399 | 1.97231 | 2.29468
Skewness | 0.00247 | -0.00595 | -0.0057 | -0.00236 | 0.02021 | 0.01894 | 0.02296 | 0.02338
Minimum | -0.45296 | -0.4242 | -0.34256 | -0.47398 | -0.46744 | -0.39547 | -0.99404 | -1.10846
Maximum | -0.45296 | -0.4242 | 0.34201 | 0.43916 | -0.46744 | -0.39547 | 0.97827 | 1.18622
$$D = D_c + \frac{1}{D_b} \qquad (11)$$
The candidate (chromosome) that best minimizes this function
is chosen and given as the classifier input, and the accuracy of
the classifier is compared for various combinations of
features. The procedure and pseudo code of the GA are given
in Fig. 6(a) and 6(b).
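The distance-based cost can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in this work: the cost is taken as D = D_c + 1/D_b, the within-class distance of Eq. (9) is summed over the classes, M is taken as the sample-weighted grand mean, and the names `distance_fitness` and `mask` (a 0/1 flag per feature) are hypothetical.

```python
def class_mean(samples):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]

def sq_dist(a, b):
    """(a - b)^T (a - b) for two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def distance_fitness(classes, mask):
    """Cost of a feature subset; smaller is better (assumed D = D_c + 1/D_b)."""
    if not any(mask):
        return float("inf")                      # no feature selected
    # Keep only the features switched on in the mask.
    picked = [[[v for v, m in zip(s, mask) if m] for s in cls] for cls in classes]
    means = [class_mean(cls) for cls in picked]
    total = sum(len(cls) for cls in picked)
    grand = [sum(len(cls) * mu[i] for cls, mu in zip(picked, means)) / total
             for i in range(len(means[0]))]
    # Eq. (9), summed over classes: mean squared distance to own class centre.
    d_within = sum(sum(sq_dist(s, mu) for s in cls) / len(cls)
                   for cls, mu in zip(picked, means))
    # Eq. (10): weighted squared distance of class centres to the grand mean.
    d_between = sum(len(cls) * sq_dist(mu, grand)
                    for cls, mu in zip(picked, means))
    if d_between == 0:
        return float("inf")                      # classes not separated at all
    return d_within + 1.0 / d_between

# Toy data: feature 0 separates the two classes, feature 1 does not.
classes = [
    [[0.0, 5.0], [0.2, -5.0]],
    [[1.0, -5.0], [1.2, 5.0]],
]
print(distance_fitness(classes, [1, 0]), distance_fitness(classes, [0, 1]))
```

On the toy data the discriminative feature receives a finite, small cost while the non-discriminative one is rejected, which is the behaviour the two objectives are meant to produce.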
V. ARTIFICIAL BEE COLONY ALGORITHM
The Artificial Bee Colony (ABC) algorithm is a swarm-based
algorithm [20] based on the foraging behaviour of honeybees.
In ABC the bee colony consists of three types of bees:
employed bees, onlooker bees and scout bees. To apply ABC,
the optimization problem must be converted into the problem
of finding the parameter vector that minimizes an objective
function. The algorithm then randomly initializes the solution
vectors and iteratively improves them, thereby approaching
the optimal solution. Each solution vector is a food source for
the foraging bees [16].
The algorithm can be divided into four parts for better
understanding.
1. Initialization Phase
2. Employed Bee Phase
3. Onlooker Bee Phase
4. Scout Bee Phase
1) In the initialization phase, all the vectors of the population
of food sources ($X_f$) are initialized. The size of the population
is the total number of bees (employed bees + onlooker bees).
Each food source contains n variables ($X_{f,i}$, i = 1, 2, …, n)
that have to be optimized so that the objective function is
minimized. The initialization is done by the following
equation:
$$X_{f,i} = L_i + rand(0,1)\,(U_i - L_i) \qquad (12)$$
where $L_i$ and $U_i$ are the lower and upper bounds of variable i.
Generate random initial population
Evaluate each individual's average fitness (ref. Eqs. 9, 10 & 11)
Repeat
    Select best-ranking individuals
    Randomly pair individuals to reproduce
    Obtain crossover offspring
    Also obtain mutated offspring
    Determine each individual's fitness
    Update the population with the new offspring
Until the termination condition is met
RETURN best individual among population (selected
feature vector)
END procedure
Fig. 6 (b) Pseudo code of genetic algorithm
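The pseudo code above can be turned into a small runnable Python sketch. It assumes that each chromosome is a 0/1 mask over the ten features, that the better-ranking half of the population survives each generation, and that single-point crossover with bit-flip mutation is used; these operator choices, the parameter values and the toy cost function are assumptions for illustration, not settings reported in this paper.

```python
import random

def binary_ga(cost, n_bits, pop_size=20, generations=50, p_mut=0.05, seed=0):
    """Minimise cost(mask) over 0/1 masks of length n_bits."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)                        # rank: lowest cost first
        parents = pop[: pop_size // 2]            # select best-ranking individuals
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)         # randomly pair individuals
            cut = rng.randint(1, n_bits - 1)      # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per bit.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = parents + children                  # update the population
    return min(pop, key=cost)

# Toy cost: Hamming distance to a hypothetical "ideal" feature mask.
target = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]

def hamming_cost(mask):
    return sum(m != t for m, t in zip(mask, target))

best = binary_ga(hamming_cost, n_bits=10)
print(best, hamming_cost(best))
```

In the setting of this paper the toy cost would be replaced by the distance-based fitness of Eq. (11), evaluated on the training features selected by the mask.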
Fig. 6 (a) Flow chart of genetic algorithm
[Flow chart steps: Initial population generation with different combinations of features → Evaluate fitness of each combination by finding the in-class and between-class distances → Choose best-ranking combinations → Pair the combinations → Obtain new combinations using crossover and mutation functions → Evaluate fitness of each combination → Update the population with the best combinations → Terminate? (No: repeat; Yes: final best combination of features)]
2) Each employed bee searches for a new food source (solution,
$V_f$) that lies near the one in its memory ($X_f$). The new food
source is obtained using the equation:
$$V_{f,i} = X_{f,i} + \phi_{f,i}\,(X_{f,i} - X_{k,i}) \qquad (13)$$
where $X_k$ is a randomly selected food source (k ≠ f) and
$\phi_{f,i}$ is a random number in [-1, 1].
The fitness of the newly found solution is evaluated. If its
fitness value (nectar amount) is greater than that of the
previous solution, the bee memorizes the new solution and
discards the old one. The fitness ($fit_f$) is calculated using the
formula:
$$fit_f(x_f) = \begin{cases} \dfrac{1}{1 + f_f(x_f)} & \text{if } f_f(x_f) \ge 0 \\ 1 + \mathrm{abs}\left(f_f(x_f)\right) & \text{if } f_f(x_f) < 0 \end{cases} \qquad (14)$$
Here $f_f(x_f)$ is the objective function value of food source
$x_f$, which should be minimized. When $f_f(x_f)$ is greater than
zero, the fitness value becomes smaller as $f_f(x_f)$ grows
(making that food source less profitable), and whenever
$f_f(x_f)$ is minimized the fitness value is correspondingly
higher (making the food source more profitable).
3) Onlooker bees are categorized as unemployed bees along
with scout bees. The employed bees complete the search
process and return to share the fitness value (food source
information or nectar value) with the onlooker bees. The
onlooker bees evaluate the fitness values and choose a food
source on a probabilistic basis. The probability $P_f$ with which
$x_f$ is chosen by an onlooker bee is defined by the equation:
$$P_f = \frac{fit_f(x_f)}{\sum_{m=1}^{PS} fit_m(x_m)} \qquad (15)$$
After an onlooker bee chooses the food source $x_f$, a
neighbouring food source $v_f$ is determined using equation (13)
and its nectar amount or fitness value is calculated. A greedy
selection is then applied between $v_f$ and $x_f$. This ensures
that better solutions attract more onlooker bees.
4) An employed bee tries to improve its food source by
searching the neighbourhood. If the employed bee fails to
improve the food source after a predetermined number of
iterations, it becomes a scout bee, moves to a random food
source and continues searching. The procedure and pseudo
code of the ABC algorithm are given in Fig. 7(a) and 7(b).
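The four phases can be put together in a compact Python sketch that minimises a generic objective function using Eqs. (12)-(15). The function name `abc_minimize`, the parameter defaults, the clamping of candidates to the bounds and the sphere test function are assumptions chosen for illustration and are not the settings or the cost function used in this work.

```python
import random

def abc_minimize(f, lower, upper, n_sources=10, limit=20, cycles=100, seed=0):
    """Minimise f over the box [lower, upper] with a basic ABC loop."""
    rng = random.Random(seed)
    dim = len(lower)

    def new_source():                                       # Eq. (12)
        return [lower[i] + rng.random() * (upper[i] - lower[i]) for i in range(dim)]

    def fit(x):                                             # Eq. (14)
        v = f(x)
        return 1.0 / (1.0 + v) if v >= 0 else 1.0 + abs(v)

    def try_neighbour(fidx):
        k = rng.choice([j for j in range(n_sources) if j != fidx])
        i = rng.randrange(dim)
        v = list(X[fidx])
        v[i] += rng.uniform(-1.0, 1.0) * (X[fidx][i] - X[k][i])   # Eq. (13)
        v[i] = min(max(v[i], lower[i]), upper[i])           # keep inside bounds
        if fit(v) > fit(X[fidx]):                           # greedy selection
            X[fidx], trials[fidx] = v, 0
        else:
            trials[fidx] += 1

    X = [new_source() for _ in range(n_sources)]            # initialization phase
    trials = [0] * n_sources
    best = min(X, key=f)
    for _ in range(cycles):
        for fidx in range(n_sources):                       # employed bee phase
            try_neighbour(fidx)
        fits = [fit(x) for x in X]
        total = sum(fits)
        probs = [v / total for v in fits]                   # Eq. (15)
        for _ in range(n_sources):                          # onlooker bee phase
            fidx = rng.choices(range(n_sources), weights=probs)[0]
            try_neighbour(fidx)
        for fidx in range(n_sources):                       # scout bee phase
            if trials[fidx] > limit:
                X[fidx], trials[fidx] = new_source(), 0
        best = min(X + [best], key=f)                       # memorize best so far
    return best

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

best = abc_minimize(sphere, lower=[-5.0, -5.0], upper=[5.0, 5.0])
print(best, sphere(best))
```

Because the best solution is memorized after every cycle, running more cycles with the same seed can only keep or improve the returned objective value.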
Initialization Phase:
    Initialize all vectors of the population of food sources, Eq. (12)
    Send the employed bees to the current food sources.
REPEAT:
    Employed Bee Phase:
        For each employed bee, search for a new food source v_f near x_f, Eq. (13)
        Evaluate the fitness (fit_f) of the newly found food source, Eq. (14)
        Apply the greedy selection process on v_f and x_f
    Onlooker Bee Phase:
        Obtain the fitness information of x_f from the employed bees.
        Calculate the probability value P_f of each food source, Eq. (15)
        A neighbour v_f is chosen using Eq. (13) and its fitness value is evaluated
        Apply the greedy selection process on v_f and x_f
    Scout Bee Phase:
        For each scout bee, if there is an abandoned solution, replace it
        with a new solution, which is randomly produced
    Memorize the best solution so far
UNTIL cycle = MAX CYCLE NUMBER or MAX CPU TIME
End procedure
Fig. 7 (b) Pseudo code of ABC classification algorithm
Fig. 7 (a) Flow chart of ABC classification algorithm
[Flow chart steps: Initialize employed bees randomly with solution vectors → Send the employed bees to the solution vectors → For each sample, take the distance from the solution vector; the sample belongs to the class whose solution vector gives the minimum distance → Use the mean square error of misclassification as the fitness value → Employed bees share fitness values with onlooker bees → Onlooker bees find the probability values of the fitness → Based on the best probability value, a new solution is chosen → Check for abandoned solutions → Terminate? (No: repeat; Yes: final class centres)]
VI.LEARNING VECTOR QUANTIZATION
Learning Vector Quantization (LVQ) is a supervised
classification algorithm and a special case of artificial neural
networks. It applies a winner-takes-all methodology and is
related to Self-Organizing Maps (SOM) [22] and the
K-Nearest Neighbor algorithm [23].
There are a few different LVQ algorithms, but all are based on
the following basic concept [24]:
Weight vectors (class centres) are randomly initialized.
A set of learning sample inputs ($X_i$) is given to the classifier
along with the respective correct class labels.
The distance between each class centre and the input vector is
determined and a winner is selected.
This classification method uses a weight vector that is the
centre of each class. Initially a set of learning input samples is
given to the classifier, each with its correct class label. The
weight vector for each class is randomly initialized using an
input vector from within the class. The classifier then
calculates the Euclidean distance (Eq. 16) between the input
vector and the class centre for all classes. The input is
assigned to the class with the minimum distance. Each
classified input is then checked against the class label that the
input vector carries [20]. If the input vector is correctly
classified, the centre of the corresponding class is pushed
towards the input vector (Eq. 17). Otherwise, the centre of the
corresponding class is pushed away from the input vector
(Eq. 18). The following equations define the LVQ process:
$$D_i = \lVert X_i - W_i \rVert \qquad (16)$$
$$W_i(t+1) = W_i(t) + \alpha\,\left(X_i - W_i(t)\right) \qquad (17)$$
$$W_i(t+1) = W_i(t) - \alpha\,\left(X_i - W_i(t)\right) \qquad (18)$$
where $W_i$ is the weight vector (class centre) and $\alpha$ is the
learning rate.
All the inputs are fed to the classifier in turn and the final
class centres are obtained. This is called the training phase.
In the validation phase a new input is given to the classifier,
which calculates the Euclidean distance between the input
vector and each class centre (obtained after training). The
input is assigned to the class with the minimum distance. The
pseudo code and the procedure of the LVQ classification
algorithm are given in Fig. 8(a) and 8(b).
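The training and validation phases described above can be sketched in Python as a basic LVQ1 loop with one centre per class. The function names, the learning-rate schedule (initial value and decay per pass) and the toy data are assumptions made for illustration; squared Euclidean distance is used for the winner search, which selects the same winner as Eq. (16).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_lvq(samples, labels, centres, alpha=0.3, epochs=20, decay=0.9):
    """centres: dict mapping class label -> initial weight vector."""
    w = {c: list(v) for c, v in centres.items()}
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            winner = min(w, key=lambda c: sq_dist(x, w[c]))   # Eq. (16)
            sign = 1.0 if winner == y else -1.0               # Eq. (17) or Eq. (18)
            w[winner] = [wi + sign * alpha * (xi - wi)
                         for wi, xi in zip(w[winner], x)]
        alpha *= decay                                        # reduce learning rate
    return w

def classify(x, w):
    """Validation phase: label of the nearest class centre."""
    return min(w, key=lambda c: sq_dist(x, w[c]))

# Toy two-class data; each centre starts from a sample of its own class.
samples = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
labels = ["normal", "normal", "fault", "fault"]
weights = train_lvq(samples, labels, {"normal": samples[0], "fault": samples[2]})
print(classify([0.1, 0.0], weights), classify([1.0, 0.9], weights))
```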
Fig. 8 (a) Pseudo code of LVQ classification algorithm
Initialize random reference weight vectors, Wi (centres)
For each input vector Xi (repeat until all input vectors are considered):
    Using equation (16), compute the Euclidean distance Di between the weight vector Wi and
    the input vector Xi
    Find the weight vector with the minimum Euclidean distance
    If the sample is correctly classified:
        Push the weight vector towards the input vector (Eq. 17)
    Else
        Push the weight vector away from the input vector (Eq. 18)
    Reduce learning rate, 𝛼
End
Fig. 8 (b) Flow chart of LVQ classification algorithm
VII.IMPLEMENTATION OF FEATURE SELECTION AND
CLASSIFICATION
The Matlab platform is used to execute feature selection and
classification. Initially a number of features are extracted
from the raw signal. The large number of values in the data
can increase computational complexity and thus reduce
classifier efficiency. This can be avoided by selecting the best
features required for classification and removing the
unnecessary ones, which improves the performance of the
classifier. The Genetic Algorithm offers an efficient way of
performing feature selection. The feature selection is
optimized with a distance-based selection method. In this
work, the initial population is randomly selected. The number
of chromosomes is equal to the number of features taken into
the optimized selection process. Ten features are taken as GA
input parameters and 50 generations were executed. The
Genetic Algorithm is used to select the best features, which
are then used as the input to the classifier. The initial
population therefore consists of randomly selected features.
Each candidate represents a feature among the total of ten.
VIII. IMPLEMENTATION of GA
[Figure: flowchart of LVQ training. Initialize random weight
vectors; compute the Euclidean distance between each sample and
the weight vectors; obtain the vector with minimum Euclidean
distance and classify; if correctly classified, push the weight
vector towards the input sample, otherwise push it away from the
input sample; reduce the learning rate α; repeat until the
termination condition is met; output the class center vectors.]
The LVQ classification method uses a weight vector for each
class, which acts as the center of that class. Initially, a set
of training input samples is given to the classifier, and each
input has a correct class label. The weight vector for each
class is initialized randomly using an input vector from within
the class. The classifier then calculates the Euclidean distance
(Eq. 11) between the input vector and the class center of every
class. The input is assigned to the class that corresponds to
the minimum distance. Each classified input is then checked for
correctness using the class label that the input vector carries
[25]. If the input vector is correctly classified, the center of
the corresponding class is pushed towards the input vector
(Eq. 12). Otherwise, the center of the corresponding class is
pushed away from the input vector (Eq. 13). In this way all the
inputs are fed to the classifier and the final class centers are
obtained. This is called the training phase. In the validation
phase, a new input is given to the classifier, which calculates
the Euclidean distance between the input vector and each class
center obtained after training. The input is assigned to the
class that corresponds to the minimum distance. The weight
vectors correspond to the class centers.
The ABC algorithm is an optimization algorithm, and it is used
as another classifier in this work. It exploits the foraging
behavior of bees in order to optimize a problem. To do that, the
problem must be defined and a cost function designed
accordingly. The cost function differs from one application to
another and is designed to return a value that the algorithm
minimizes. A classifier is implemented with an optimization
algorithm by designing a cost function that returns the mean
square error of misclassification, which the algorithm then
tries to minimize. The ABC algorithm initializes each employed
bee with a random solution vector (a vector containing four
class centers). Since 10 employed bees are used, there are 10
solution vectors. The fitness of each vector is then calculated
in three steps. First, the Euclidean distance from each sample
to the four class centers is found. Second, the sample is
assigned to the class corresponding to the minimum distance.
Third, the mean square error of classification is calculated.
This value is the fitness function. The employed bees then share
the fitness information with the onlooker bees.
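The three-step fitness calculation can be illustrated with a
minimal Python sketch. It is not the authors' implementation: the
function name, the toy samples and the reading of "mean square
error of misclassification" as the mean of the squared 0/1 error
indicator are assumptions made here for illustration.

```python
import numpy as np

def misclassification_cost(solution, samples, labels, n_classes=4):
    """Cost of one candidate solution (one food source).

    `solution` is a flat vector holding n_classes class centers.
    Assumption: the error of a sample is 1 if it is misclassified, else 0.
    """
    centers = np.asarray(solution, dtype=float).reshape(n_classes, -1)
    # Step 1: Euclidean distance from every sample to every class center.
    diffs = samples[:, None, :] - centers[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)
    # Step 2: assign each sample to the class with the minimum distance.
    predicted = np.argmin(distances, axis=1)
    # Step 3: mean square error of the 0/1 misclassification indicator.
    errors = (predicted != labels).astype(float)
    return float(np.mean(errors ** 2))

# Hypothetical toy data: two samples for each of four classes.
samples = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0],
                    [0.0, 5.0], [0.0, 5.1], [5.0, 0.0], [5.1, 0.0]])
labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])

good = np.array([[0, 0], [5, 5], [0, 5], [5, 0]], dtype=float).ravel()
swapped = np.array([[5, 5], [0, 0], [0, 5], [5, 0]], dtype=float).ravel()

print(misclassification_cost(good, samples, labels))     # 0.0
print(misclassification_cost(swapped, samples, labels))  # 0.5
```

A solution vector with well-placed centers yields a cost of 0,
while swapping two centers misclassifies half of the toy samples.
An optimizer such as ABC would search over `solution` to minimize
this value.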
The onlooker bees then calculate a probability value using Eq. 7
from the data acquired from the employed bees. The solution
vector corresponding to the highest probability value is
memorized. If a food source does not yield good fitness after a
predefined number of iterations (the limit value), it is
considered an abandoned source, and the employed bee
corresponding to that source becomes a scout bee that searches
for a new one. This process continues until a fixed number of
iterations has been carried out. As the iterations increase, the
center of each class moves towards a better position. The colony
size, which comprises employed bees and onlooker bees, is taken
as 20, and the number of food sources as 10.
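The LVQ training and validation phases described earlier in this
section can be sketched in Python as follows. This is a minimal
sketch, not the authors' code: it assumes the standard LVQ1
update for the steps referred to as Eq. 12 and Eq. 13, and the
learning rate, decay factor, epoch count and toy data are
hypothetical values chosen for illustration.

```python
import numpy as np

def train_lvq(samples, labels, n_classes, alpha=0.1, decay=0.9,
              epochs=20, seed=0):
    """Return one center (weight vector) per class, trained LVQ1-style."""
    rng = np.random.default_rng(seed)
    # Initialize each class center with a random input vector of that class.
    centers = np.array(
        [samples[rng.choice(np.flatnonzero(labels == c))]
         for c in range(n_classes)],
        dtype=float,
    )
    for _ in range(epochs):
        for i in rng.permutation(len(samples)):
            x, y = samples[i], labels[i]
            # Winning center: minimum Euclidean distance to the input.
            winner = np.argmin(np.linalg.norm(centers - x, axis=1))
            if winner == y:
                # Correctly classified: push the center towards the input.
                centers[winner] += alpha * (x - centers[winner])
            else:
                # Misclassified: push the center away from the input.
                centers[winner] -= alpha * (x - centers[winner])
        alpha *= decay  # reduce the learning rate after each pass
    return centers

def classify(centers, x):
    """Validation phase: index of the nearest class center."""
    return int(np.argmin(np.linalg.norm(centers - np.asarray(x), axis=1)))

# Hypothetical toy data: four tight clusters, four samples each.
means = np.array([[0, 0], [5, 5], [0, 5], [5, 0]], dtype=float)
offsets = np.array([[-0.2, 0], [0.2, 0], [0, -0.2], [0, 0.2]])
samples = (means[:, None, :] + offsets[None, :, :]).reshape(-1, 2)
labels = np.repeat(np.arange(4), 4)

centers = train_lvq(samples, labels, n_classes=4)
print([classify(centers, p) for p in ([0.2, -0.1], [4.8, 5.1])])
```

After training, a new input is labelled by calling `classify`
with the trained centers, which mirrors the validation phase.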
VIII. RESULTS AND DISCUSSIONS
The bearing signals were acquired at a sampling rate of
12800 Hz through an accelerometer. After the vibration signals
for the four bearing conditions are obtained, they are
decomposed and statistical features are extracted from the
Morlet wavelet coefficients. For GA-based feature selection, the
initial parameters are taken as follows: the population size is
80, the chromosome length is 20 (2 sets of features, each
containing 10 features) and the number of generations is 100.
The GA returns the best combination of selected features
satisfying the given objective. A value greater than 0.5 marks a
feature as selected, and the fitness of the process is also
obtained. For example, from the result 11011000110010001000 the
features F1, F2, F4, F5, F9, F10, F13 and F17 are selected and
the remaining ones are abandoned. The feature subsets selected
by the GA are used to train and test the ABC and LVQ
classifiers. The total feature set is split into training (70%)
and testing (30%) data sets. The samples in the test set are
used to evaluate the LVQ and ABC classifiers.
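The mapping from a chromosome to a feature subset can be checked
with a short Python sketch. The function name is hypothetical;
the 0.5 threshold and the example chromosome are taken from the
text above.

```python
def decode_chromosome(genes, threshold=0.5):
    """Return the names of the features whose gene exceeds the threshold."""
    return [f"F{i}" for i, g in enumerate(genes, start=1)
            if float(g) > threshold]

chromosome = "11011000110010001000"  # example result quoted in the text
selected = decode_chromosome(chromosome)
print(selected)
# ['F1', 'F2', 'F4', 'F5', 'F9', 'F10', 'F13', 'F17']
```

The decoded subset matches the eight features listed in the text.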
TABLE IV
Output efficiencies of LVQ classifier
                                                       Accuracy on test data (%)                 Average accuracy on
LVQ classifier (features used)                Scheme   Normal     Inner-race  Outer-race  Ball   testing data (%)
                                                       condition  fault       fault       fault
All 20 features (without feature selection)   LVQ-S1   90         97          95          98     95.00
F1,F2,F4,F5,F9,F10,F13,F17                    LVQ-S2   100        100         90          100    97.50
F3,F5,F6,F7,F9,F11,F12,F14                    LVQ-S3   100        95          90          100    96.25
Random 5 features - F1,F10,F11,F14,F16        LVQ-S4   75         100         70          100    86.25
Table IV and Table V, together with Fig. 9(a) and Fig. 9(b),
give the description of the respective schemes and the
corresponding prediction results. Among the LVQ schemes, the
overall average testing accuracy is highest for scheme LVQ-S2,
at 97.5%. The LVQ classifier without feature selection gives
95% accuracy, whereas with feature selection (LVQ-S2) it gives
97.5%. ABC with GA gives a classification accuracy of 98.75%,
which is the maximum among the two classifiers. Meanwhile, the
ABC classifier provides 95.50% accuracy when built without
feature selection. The accuracy thus increases with GA feature
selection even though fewer features are used. We can therefore
infer that the combination of GA and ABC (scheme ABC-S2) gives
the best result.
TABLE V
Output efficiencies of ABC classifier
                                                       Accuracy on test data (%)                 Average accuracy on
ABC classifier (features used)                Scheme   Normal     Inner-race  Outer-race  Ball   testing data (%)
                                                       condition  fault       fault       fault
All 20 features (without feature selection)   ABC-S1   92         100         95          95     95.50
F1,F2,F4,F5,F9,F10,F13,F17                    ABC-S2   95         100         95          100    98.75
F3,F5,F6,F7,F9,F11,F12,F14                    ABC-S3   100        100         100         90     97.50
Random 5 features - F1,F10,F11,F14,F16        ABC-S4   70         100         95          95     90.00
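The average column in Tables IV and V appears to be the mean of
the four per-condition accuracies. The Python sketch below
recomputes that mean using only the values printed in the two
tables; the variable names are hypothetical, and the assumption
that the average is an unweighted mean is made here rather than
stated in the text.

```python
# Per-condition test accuracies (%) as printed in Tables IV and V:
# (normal, inner-race fault, outer-race fault, ball fault)
per_class = {
    "LVQ-S1": (90, 97, 95, 98),
    "LVQ-S2": (100, 100, 90, 100),
    "LVQ-S3": (100, 95, 90, 100),
    "LVQ-S4": (75, 100, 70, 100),
    "ABC-S1": (92, 100, 95, 95),
    "ABC-S2": (95, 100, 95, 100),
    "ABC-S3": (100, 100, 100, 90),
    "ABC-S4": (70, 100, 95, 95),
}
# Average accuracies (%) as printed in the last column of the tables.
reported = {
    "LVQ-S1": 95.00, "LVQ-S2": 97.50, "LVQ-S3": 96.25, "LVQ-S4": 86.25,
    "ABC-S1": 95.50, "ABC-S2": 98.75, "ABC-S3": 97.50, "ABC-S4": 90.00,
}

# Unweighted mean of the four per-condition values for each scheme.
computed = {s: sum(acc) / len(acc) for s, acc in per_class.items()}
mismatches = [s for s in per_class
              if abs(computed[s] - reported[s]) > 1e-9]

for scheme in per_class:
    print(scheme, computed[scheme], reported[scheme])
print("rows where the mean differs from the printed average:", mismatches)
```

Under this assumption every row reproduces the printed average
except ABC-S2, whose printed per-condition values average to
97.50 while the table and the text report 98.75. The source does
not indicate which entry is affected.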
Fig. 9 (a) Classification scheme Vs Accuracy of LVQ
Fig. 9 (b) Classification scheme Vs Accuracy of ABC
XII. CONCLUSION
This paper introduced another effective approach that combines
the strength of the genetic algorithm, an optimization
technique, in the feature selection process with ABC as a
classification algorithm to solve bearing fault diagnosis
problems. In feature selection, GA has proved its capability
through quick convergence, strong search capability and the
selection of minimal features. The results show how the present
approach increases the predictive accuracy for bearing fault
diagnosis. The proposed methods are compared without deploying
the feature selection process before classification and with
the feature selection process, for both classification
algorithms, ABC and LVQ. Classification accuracy measures are
used to evaluate the performance of the proposed approaches, and
they clearly demonstrate the effectiveness of the present
approach.
REFERENCES
[1] X. Chiementin, F. Bolaers and J.-P. Dron, Early detection of fatigue damage on rolling element bearings using adapted wavelet, Journal of Vibration and Acoustics 129 (4) (2007) 495–506.
[2] J. Mathew, R.J. Alfredson, The condition monitoring of rolling element bearings using vibration analysis, Trans. ASME, J. Vibr. Acoust. Stress Reliab. Design 106 (1984) 447–453.
[3] C. Scheffer, P. Girdhar, Practical machinery vibration analysis and predictive maintenance, Newnes, 2004.
[4] N. Tandon, A. Choudhury, A review of vibration and acoustic measurement methods for the detection of defects in rolling element bearings, Tribology International 32 (1999) 469–480.
[5] T. Harris, Rolling bearing analysis, Wiley, New York, 1991.
[6] J. Taylor, The vibration analysis handbook, Vibration Consultants, Tampa, FL, 2003.
[7] V.K. Rai, A.R. Mohanty, Bearing fault diagnosis using FFT of intrinsic mode functions in Hilbert–Huang transform, Mechanical Systems and Signal Processing 21 (2007) 2607–2615.
[8] Zonghua Zhang, Zhao Jing, Zhaohui Wang, Dengfeng Kuang, Comparison of Fourier transform, windowed Fourier transform, and wavelet transform methods for phase calculation at discontinuities in fringe projection profilometry, Optics and Lasers in Engineering 50 (2012) 1152–1160.
[9] P.W. Tse, Y.H. Peng, R. Yam, Wavelet analysis and envelope detection for rolling element bearing fault diagnosis - their effectiveness and flexibilities, Journal of Vibration and Acoustics 123 (2001) 303–310.
[10] K. Mori, N. Kasashima, T. Yoshioka, Y. Ueno, Prediction of spalling on a ball bearing by applying the discrete wavelet transform to vibration signals, Wear 195 (1996) 162–168.
[11] Q. Sun, Y. Tang, Singularity analysis using continuous wavelet transform for bearing fault diagnosis, Mechanical Systems and Signal Processing 16 (2002) 1025–1041.
[12] Xiaoran Zhu, Youyun Zhang, Yongsheng Zhu, Intelligent fault diagnosis of rolling bearing based on kernel neighborhood rough sets and statistical features, Journal of Mechanical Science and Technology 26 (9) (2012) 2649–2657.
[13] Jing Lin, Liangsheng Qu, Feature extraction based on Morlet wavelet and its application for mechanical fault diagnosis, Journal of Sound and Vibration 234 (1) (2000) 135–148.
[14] J. Yang, Y. Zhang, Y. Zhu, Intelligent fault diagnosis of rolling element bearing based on SVMs and fractal dimension, Mechanical Systems and Signal Processing 21 (2007) 2012–2024.
[15] H. Wang, P. Chen, Intelligent diagnosis method for rolling element bearing faults using possibility theory and neural network, Computers & Industrial Engineering 60 (2011) 511–518.
[16] Weixiang Sun, Jin Chen, Jiaqing Li, Decision tree and PCA-based fault diagnosis, Mechanical Systems and Signal Processing 21 (2007) 1300–1317.
[17] Jing Lin, Feature extraction of machine sound using wavelet and its application in fault diagnosis, NDT&E International 34 (2001) 25–30.
[18] Cheng-Lung Huang, Chieh-Jen Wang, A GA-based feature selection and parameters optimization for support vector machines, Expert Systems with Applications 31 (2006) 231–240.
[19] Ngoc-Tu Nguyen, Hong-Hee Lee and Jeong-Min Kwon, Optimal feature selection using genetic algorithm for mechanical fault detection of induction motor, Journal of Mechanical Science and Technology 22 (2008) 490–496.
[20] Dervis Karaboga, Celal Ozturk, A novel clustering approach: Artificial Bee Colony (ABC) algorithm, Applied Soft Computing 11 (2011) 652–657.
[21] Changsheng Zhang, Dantong Ouyang, Jiaxu Ning, An artificial bee colony approach for clustering, Expert Systems with Applications 37 (2010) 4761–4767.
[22] T. Kohonen, Self-Organizing Maps, Springer, Berlin, 1997.
[23] Jianye Liu, Yongchun Liang, Xiaoyun Sun, Application of Learning Vector Quantization Network in Fault Diagnosis of Power Transformer, Proceedings of IEEE conference 2009, International Conference on Mechatronics and Automation.
[24] Fahad and Sikander, Classification of textual documents using learning vector quantization, Information Technology Journal 6 (1) (2007) 154–159.
[25] Ouyang Sen, Song Zhengxiang, Wang Jianhua, Chen Degui, Application of LVQ neural networks combined with genetic algorithm in power quality signals classification, IEEE, 2002.