
A Brief Performance Evaluation of ECG Feature Extraction Techniques for Artificial Neural Network Based Classification

Rajesh Ghongade#, Dr. A.A. Ghatol*

#Vishwakarma Institute of Information Technology, Pune-411048, India

*Dr. Babasaheb Ambedkar Technological University, Lonere-402103, India

Abstract- The electrocardiogram (ECG) is the most easily accessible bio-electric signal that provides doctors with reasonably accurate data regarding the patient's heart condition. Many cardiac problems are visible as distortions in the ECG. Normally, ECG-related diagnoses are carried out manually by medical practitioners. The major task in diagnosing the heart condition is analyzing each heartbeat and correlating the distortions found therein with various heart diseases. Since abnormal heartbeats can occur randomly, it becomes very tedious and time-consuming to analyze, say, a 24-hour ECG signal, as it may contain hundreds of thousands of heartbeats. Hence it is desirable to automate the entire process of heartbeat classification and, preferably, to diagnose accurately.

In this paper the authors focus on various schemes for extracting useful features of ECG signals for use with artificial neural networks (ANNs). Once feature extraction is done, ANNs can be trained to classify the patterns reasonably accurately. Arrhythmia is one type of abnormality detectable in an ECG signal. The three classes of ECG signals considered are Normal, Fusion and Premature Ventricular Contraction (PVC). The task of an ANN-based system is to correctly identify the three classes, most importantly the PVC type, since it can precede a fatal cardiac condition. Transform feature extraction and morphological feature extraction schemes are mostly preferred. Three transform schemes, namely Discrete Fourier Transform, Principal Component Analysis and Discrete Wavelet Transform, along with three morphological feature extraction schemes, are discussed and compared in this paper.

Keywords: ECG, MLP, Feature extraction, DWT, DFT, PCA

I. INTRODUCTION

The heart is one of the most important organs of the human body and is hence termed a vital organ. Its major function is to deliver oxygen to all the cells of the body. If this function is impaired, the oxygen supply is hampered and the cells are deprived of this essential element. Irregularity of heart function is termed arrhythmia and is the most commonly observed cardiac disease. There are numerous types of arrhythmia; premature ventricular contraction, which is a pre-indication of the lethal ventricular fibrillation [1][2], is considered here. The ECG is one of the most effective diagnostic tools available to medics. Given the many thousands of heartbeats in a recording, it becomes tedious to locate a distorted heartbeat and then classify it.[12]

Figure 1 depicts the three types of heartbeats. Here we discuss various schemes for extracting useful information from the ECG signal and using these features to train an ANN to recognize the three types of heartbeats and classify them correctly as Normal (N), Fusion (F) and Premature Ventricular Contraction (PVC). We have used the ECG data available from the MIT/BIH Arrhythmia Database, since it is considered the benchmark data.

Figure 1. Types of heartbeats

[Plot: ECG waveform with the Q, R and S points marked; x-axis Samples (0, 90, 180); y-axis ticks 800 to 1500]

Figure 2. ECG morphology

II. ECG DATA PRE-PROCESSING

The MIT/BIH Arrhythmia Database [14] is the standard used by many researchers. This database contains 25 records of 30 minutes duration each. The ECG signal is sampled at 360 Hz with a resolution of 11 bits. Since we are interested in only certain types of arrhythmia, three records were used. Since each record is a continuous waveform, individual heartbeats must be extracted. This is done by locating the R peak and extracting 90 samples on either side of it. It is not mandatory to use all 180 samples for classification, but here we use all of them.[1] Figure 2 shows the data selection method for one heartbeat. Since the MIT/BIH database comes with annotations for each heartbeat, it is necessary to associate the categories with the individual extracted beats. A total of 1827 beats were extracted, with 609 beats belonging to each category, viz. Normal, Fusion and PVC.
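The beat selection described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `extract_beats`, the way R-peak positions and labels are passed in, and the choice of the half-open window [R - 90, R + 90) (which gives exactly 180 samples with the R peak at index 90) are not taken from the paper.

```python
import numpy as np

def extract_beats(signal, r_peaks, labels, half_window=90):
    """Cut a fixed window around each annotated R peak (hypothetical helper).

    signal  : 1-D array holding one continuous ECG record
    r_peaks : sample indices of the R peaks (e.g. from the annotations)
    labels  : beat category for each R peak (e.g. 'N', 'F', 'V')
    """
    signal = np.asarray(signal, dtype=float)
    beats, kept_labels = [], []
    for r, label in zip(r_peaks, labels):
        start, stop = r - half_window, r + half_window
        # Skip beats whose window would run past either end of the record.
        if start < 0 or stop > len(signal):
            continue
        beats.append(signal[start:stop])
        kept_labels.append(label)
    # One row per beat, 2 * half_window samples per row.
    return np.array(beats).reshape(-1, 2 * half_window), kept_labels
```

Keeping the label next to each extracted window is what associates the database annotation with the individual beat.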

We could use all 180 samples for training an ANN, but this may give rise to two problems, namely overtraining and excessive computational overhead. Overtraining affects the accuracy adversely, while computational overhead constrains the speed and the required resources. Hence only the useful features have to be identified and used to train the ANN, so as to avoid these problems.[12][13]

Figure 2. Selection of samples

III. TRANSFORM FEATURE EXTRACTION

A. DISCRETE FOURIER TRANSFORM

The Discrete Fourier Transform (DFT) of a sequence $\{u(n),\ n = 0, 1, \ldots, N-1\}$ is defined as

$$v(k) = \sum_{n=0}^{N-1} u(n)\, W_N^{kn}, \qquad k = 0, 1, \ldots, N-1 \tag{1}$$

where

$$W_N = e^{-j 2\pi / N} \tag{2}$$

The resulting sequence consists of real and imaginary components.[11] Again, not all components are required as features, and we can select those components with the higher energy contribution. Further, only the real part of the components can be used. We have selected the real part of only 20 components, which preserves the ECG waveform morphology. It was also found that using more than 20 components does not contribute much to the classifier accuracy. Once the 20 significant components are selected, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
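A minimal sketch of this DFT feature extraction is given below. The paper does not say which 20 components were kept; the sketch assumes the first 20 (lowest-frequency) bins, and the function name `dft_features` is hypothetical. `np.fft.fft` uses the same kernel $e^{-j2\pi kn/N}$ as the DFT definition.

```python
import numpy as np

def dft_features(beat, n_components=20):
    """Real part of the first n_components DFT bins of one beat.

    Assumption: the retained components are the lowest-frequency bins;
    the paper only states that 20 high-energy components are kept.
    """
    beat = np.asarray(beat, dtype=float)
    spectrum = np.fft.fft(beat)          # v(k), k = 0 .. N-1
    return spectrum[:n_components].real  # discard the imaginary part
```

For a 180-sample beat this reduces the MLP input from 180 values to 20.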

B. PRINCIPAL COMPONENT ANALYSIS

The aim of PCA, also known as the Karhunen-Loève Transform (KLT), is to reduce the dimensionality of the data set.[1] The importance of the variables is statistically evaluated. The method is thus based on the significance of the information: the KLT identifies the directions of the signal with maximum energy or variance. Below we present the algorithm to obtain the principal components of a vector set X represented by an N × M matrix, where N is the number of segments or vectors $x_i$, $i = 0, 1, \ldots, N-1$, in the vector set and M is the number of samples per segment, i.e. the dimension of the vectors that constitute the vector set.

PCA ALGORITHM:

a) Obtain the mean vector and the mean adjusted data:

$$\mu_x = \frac{1}{N} \sum_{i=0}^{N-1} x_i \tag{3}$$

$$\text{Mean Adjusted Data} = X - \mu_x$$

b) Obtain the covariance matrix:

$$C_x = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - \mu_x)(x_i - \mu_x)^{T} \tag{4}$$

c) Obtain the eigenvectors and eigenvalues:

$$C_x e = \lambda e \tag{5}$$

where $e$ is an eigenvector and $\lambda$ is the corresponding eigenvalue.

d) Choose components and form a feature vector: We get 180 components, corresponding to the dimensionality of the input sequence. Components that are significant from the point of view of contribution to the total energy of the signal are selected. The selected components together must constitute about 99% of the total energy of the signal. This procedure decreases the data dimensionality without significant loss of information.

$$\text{Feature Vector} = (eig_1\ \ eig_2\ \ eig_3\ \ \ldots\ \ eig_P) \tag{6}$$

e) Create the new data set:

$$\text{New Data} = [\text{Feature Vector}]^{T}\, [\text{Mean Adjusted Data}] \tag{7}$$

Once the 14 significant components are selected, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
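Steps a) to e) of the PCA algorithm can be sketched with NumPy as follows. The function name `pca_features` and the `energy` parameter are illustrative. Beats are assumed to be stored as rows of X (N × M), so the projection of step e) is written as `adjusted @ feature_vector`, which is the transpose of the product in Eq. (7).

```python
import numpy as np

def pca_features(X, energy=0.99):
    """Project beats onto the leading principal components (sketch).

    X      : N x M array, one beat per row
    energy : fraction of total variance the kept components must reach
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)                        # step a, Eq. (3)
    adjusted = X - mean                          # mean adjusted data
    cov = adjusted.T @ adjusted / X.shape[0]     # step b, Eq. (4), M x M
    eigvals, eigvecs = np.linalg.eigh(cov)       # step c, Eq. (5)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step d: smallest P whose cumulative share of the energy reaches `energy`.
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    p = min(int(np.searchsorted(cumulative, energy)) + 1, len(eigvals))
    feature_vector = eigvecs[:, :p]              # Eq. (6), M x P
    new_data = adjusted @ feature_vector         # step e, N x P
    return new_data, feature_vector, mean
```

New beats would be projected with the stored `mean` and `feature_vector`, so that training and test data share the same basis.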


C. DISCRETE WAVELET TRANSFORM

Wavelet analysis consists of decomposing a signal or an image into a hierarchical set of approximations and details. The levels in the hierarchy often correspond to those in a dyadic (dual) scale.

The continuous wavelet transform is defined as:

$$CWT(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi\!\left(\frac{t - b}{a}\right) dt \tag{8}$$

where a and b are the dilation and translation parameters, respectively. A wide variety of functions can be chosen as the mother wavelet $\psi$, provided $\psi(t) \in L^2$ and

$$\int_{-\infty}^{\infty} \psi(t)\, dt = 0 \tag{9}$$

An elegant way to compute the Discrete Wavelet Transform (DWT) is to convolve the signal with a pair of appropriately designed quadrature mirror filters (QMF) and then downsample by a factor of two. The QMF pair that decomposes the signal consists of a low-pass filter H and a high-pass filter G, which split the signal bandwidth in half.

The 23 coefficients corresponding to the approximation A3 are retained for training; the other components are discarded. Again, the 23 coefficients are selected on the basis of the morphology preservation criterion.[8]



Once these 23 DWT coefficients are selected, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
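The filter-bank computation of the DWT can be illustrated with a short sketch. The paper does not name the wavelet, so the Haar filter pair is assumed here purely for illustration. Repeating the last sample of an odd-length signal is likewise an assumption, chosen so that 180 samples yield 90, 45 and then 23 approximation coefficients, matching the stated size of A3.

```python
import numpy as np

# Haar QMF pair (an assumption; the paper does not state the wavelet used).
H = np.array([1.0, 1.0]) / np.sqrt(2.0)    # low-pass filter
G = np.array([1.0, -1.0]) / np.sqrt(2.0)   # high-pass filter

def dwt_step(x):
    """One decomposition level: filter with H and G, then downsample by two."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                   # odd length: repeat the last sample
        x = np.append(x, x[-1])
    approx = np.convolve(x, H)[1::2]   # keep every second filtered sample
    detail = np.convolve(x, G)[1::2]
    return approx, detail

def dwt_approximation(beat, levels=3):
    """Approximation coefficients after `levels` decompositions (A3 by default)."""
    approx = np.asarray(beat, dtype=float)
    for _ in range(levels):
        approx, _ = dwt_step(approx)   # details are discarded
    return approx
```

Only the low-pass branch is iterated, which is why the detail coefficients of every level can be dropped once A3 has been computed.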


IV. MORPHOLOGICAL FEATURE EXTRACTION

A. R-peak

One distinctive feature that separates the Fusion type from the other types, i.e. Normal and PVC, is the amplitude of the R-peak. Figure 3 depicts the R-peak distribution for the three classes. The R-peak value is computed as the difference between the R-peak and the mean level of the ECG sample. We observe that the three classes of heartbeats are more or less distinctly separated.

[Plot "R-Peak": x-axis ECG beats (1 to 1501); y-axis mV (0 to 800)]

Figure 3. R-peak value distribution over dataset

B. QRS area

Another important observation is that the QRS area differs considerably among the three classes under consideration, as Figure 4 shows. This value can evidently serve as an important feature for classification.

[Plot "QRS-AREA": x-axis ECG Beats (1 to 1501); y-axis V-s (0 to 30000)]

Figure 4. QRS Area value distribution over dataset

C. Q-S distance

The time difference between the Q and the S points is also found to be an important feature that differentiates the three classes. Figure 5 clearly shows the difference in the Q-S distance values.
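The three morphological features can be computed along the following lines. The paper does not describe how the Q and S points are located or how the QRS area is integrated, so everything specific here is an assumption: Q and S are taken as the minima within `search` samples before and after the R peak, the R peak is assumed to sit at the centre of the window, and the area is the sum of absolute deviations from the beat mean between Q and S.

```python
import numpy as np

def morphological_features(beat, fs=360.0, search=30):
    """R-peak amplitude, QRS area and Q-S distance of one beat (sketch).

    Assumptions: R peak at the window centre; Q and S are the minima
    within `search` samples on either side of R; fs is in Hz.
    """
    beat = np.asarray(beat, dtype=float)
    r = len(beat) // 2
    baseline = beat.mean()                   # mean level of the ECG sample
    r_peak = beat[r] - baseline              # R-peak relative to the mean level
    q = r - search + int(np.argmin(beat[r - search:r]))
    s = r + 1 + int(np.argmin(beat[r + 1:r + 1 + search]))
    qrs_area = float(np.sum(np.abs(beat[q:s + 1] - baseline)))
    qs_distance = (s - q) / fs               # in seconds
    return r_peak, qrs_area, qs_distance
```

The three returned values form a three-element feature vector per beat, far smaller than the 180 raw samples.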

V. ARTIFICIAL NEURAL NETWORKS

An artificial neural network is inspired by the biological neurons present in the human brain. Though the sheer number of biological neurons and their high interconnectivity cannot be duplicated, scaled-down models in the form of artificial neural networks show a similar capacity for learning and generalizing the intrinsic characteristics of the information presented to them.

A. Multilayer Perceptron (MLP)

This is one of the most popular and powerful ANN architectures. The MLP not only overcomes the severe limitation of a simple perceptron, but also offers the most important feature of learning. It is thus a very powerful pattern classifier. Function approximation can also be done quite effectively by the MLP. Figure 6 depicts the MLP architecture used for experimentation.

[Plot "Q-S DIST": x-axis ECG Beats (1 to 1501); y-axis seconds (0 to 70)]

Figure 5. Q-S distance distribution over the entire dataset

[Diagram: input layer (x0, x1, x2, ..., x29), hidden layer (x'0, ..., x'29), output layer (y0, y1, y2)]

Figure 6. Multilayer perceptron

The entire process of classification consists of two phases: a training phase and a testing phase. MLP training is executed using the well-known technique of backpropagation. Once a satisfactory training error level is reached, training is terminated and new data is presented to the network to test its performance.

B. Backpropagation Algorithm

The following steps describe the backpropagation algorithm.

a) Initialize the weights and biases with small random values.

b) Present a continuous-valued input vector $x_0, x_1, \ldots, x_{N-1}$ and specify the desired outputs $d_0, d_1, \ldots, d_{M-1}$.

c) Compute the actual outputs $y_0, y_1, \ldots, y_{M-1}$.

d) Adapt weights: Use a recursive algorithm starting at the output nodes and working back to the first hidden layer. Adjust weights by

$$w_{i,j}(t+1) = w_{i,j}(t) + \eta\, \delta_j\, x'_i$$

where $w_{i,j}(t)$ is the weight from hidden node i or from an input to node j at time t, $x'_i$ is either the output of node i or an input, $\eta$ is a gain term and $\delta_j$ is an error term for node j. If node j is an output node then

$$\delta_j = \left(1 - (y_j)^2\right)(d_j - y_j)$$

where $d_j$ is the desired output of node j and $y_j$ is the actual output.

If node j is an internal hidden node then

$$\delta_j = \left(1 - (x'_j)^2\right) \sum_k \delta_k\, w_{j,k}$$

where k is over all the nodes in the layers above node j.

Internal node thresholds are adapted in a similar manner by assuming they are connection weights on links from auxiliary constant-valued inputs (called biases).

Convergence is faster if a momentum term is added and weight changes are smoothed by

$$w_{i,j}(t+1) = w_{i,j}(t) + \eta\, \delta_j\, x'_i + \alpha\,[\,w_{i,j}(t) - w_{i,j}(t-1)\,]$$

where $0 < \alpha < 1$.
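The weight-update rules above can be sketched as a small NumPy training loop for an MLP with one hidden layer. The tanh activation is assumed because the error terms contain the factor (1 - y²), which is the derivative of tanh. The function name `train_mlp`, the default hyperparameters and the per-pattern update order are illustrative choices, not the authors' settings; the default of 30 hidden nodes is only read off Figure 6.

```python
import numpy as np

def train_mlp(X, D, n_hidden=30, eta=0.05, alpha=0.5, epochs=200, seed=0):
    """Backpropagation with momentum for a one-hidden-layer tanh MLP (sketch).

    X : patterns, one per row; D : desired outputs in [-1, 1], one row each.
    eta is the gain term and alpha the momentum term of the update rule.
    """
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], D.shape[1]
    # Step a: small random weights; the last row of each matrix is the bias.
    W1 = rng.uniform(-0.1, 0.1, (n_in + 1, n_hidden))
    W2 = rng.uniform(-0.1, 0.1, (n_hidden + 1, n_out))
    dW1_prev, dW2_prev = np.zeros_like(W1), np.zeros_like(W2)
    errors = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, d in zip(X, D):
            # Steps b and c: present the input and compute the actual outputs.
            x_in = np.append(x, 1.0)        # constant input acts as the bias
            h = np.tanh(x_in @ W1)          # hidden outputs x'
            h_in = np.append(h, 1.0)
            y = np.tanh(h_in @ W2)
            # Step d: error terms for the output nodes, then the hidden nodes.
            delta_out = (1.0 - y ** 2) * (d - y)
            delta_hid = (1.0 - h ** 2) * (W2[:-1] @ delta_out)
            # Weight change = eta * delta * input + alpha * previous change.
            dW2 = eta * np.outer(h_in, delta_out) + alpha * dW2_prev
            dW1 = eta * np.outer(x_in, delta_hid) + alpha * dW1_prev
            W2 += dW2
            W1 += dW1
            dW2_prev, dW1_prev = dW2, dW1
            sq_err += float(np.sum((d - y) ** 2))
        errors.append(sq_err / len(X))      # mean squared error per epoch
    return W1, W2, errors
```

The term `alpha * dW_prev` equals α[w(t) - w(t-1)], since the previous weight change is exactly the difference between the last two weight values. The per-epoch error list gives a simple stopping criterion: training can end once it falls below a chosen level.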
