
  • 8/3/2019 A Brief Performance Evaluation of ECG Feature Extraction Techniques for Artificial Neural Network Based Classificati


    A Brief Performance Evaluation of ECG Feature Extraction Techniques for Artificial Neural Network Based Classification

    Rajesh Ghongade#, Dr. A.A. Ghatol*

    #Vishwakarma Institute of Information Technology, Pune-411048, India

    *Dr. Babasaheb Ambedkar Technological University, Lonere-402103, India.

    Abstract-The electrocardiogram (ECG) is the most easily accessible bio-electric signal and provides doctors with reasonably accurate data about the condition of a patient's heart. Many cardiac problems are visible as distortions in the ECG. Normally, ECG-related diagnoses are carried out manually by medical practitioners. The major task in diagnosing the heart condition is analyzing each heartbeat and correlating the distortions found in it with various heart diseases. Since abnormal heartbeats can occur randomly, it is tedious and time-consuming to analyze, say, a 24-hour ECG signal, as it may contain hundreds of thousands of heartbeats. Hence it is desirable to automate the entire process of heartbeat classification, and preferably to diagnose accurately.

    In this paper the authors focus on various schemes for extracting useful features of ECG signals for use with artificial neural networks (ANNs). Once feature extraction is done, ANNs can be trained to classify the patterns reasonably accurately. Arrhythmia is one type of abnormality detectable from an ECG signal. The three classes of ECG signals considered are Normal, Fusion and Premature Ventricular Contraction (PVC). The task of an ANN-based system is to identify the three classes correctly, most importantly the PVC type, this being a fatal cardiac condition. Transform feature extraction and morphological feature extraction schemes are mostly preferred. Three transform schemes (Discrete Fourier Transform, Principal Component Analysis and Discrete Wavelet Transform) and three morphological feature extraction schemes are discussed and compared in this paper.

    Keywords: ECG, MLP, Feature extraction, DWT, DFT, PCA

    I. INTRODUCTION

    The heart is one of the most important organs of the human body and is therefore termed a vital organ. Its major function is to deliver oxygen to all the cells in the body. If this function is impaired, the oxygen supply is hampered and the cells remain deprived of this essential element. Irregularity of the heart function can be termed arrhythmia and is the most commonly observed cardiac disease. There are numerous types of arrhythmia; of these, premature ventricular contraction, which is a pre-indication of the lethal ventricular fibrillation [1][2], is taken into consideration here. The ECG is one of the most effective diagnostic tools available to medics. Of the many thousands of heartbeats available, it is tedious to locate a distorted heartbeat and then classify it.[12]

    Figure 1 depicts the three types of heartbeats. Here we discuss various schemes to extract useful information from the ECG signal and use these features to train an ANN to recognize these three types of heartbeats and classify them correctly as Normal (N), Fusion (F) and Premature Ventricular Contraction (PVC). We have used the ECG data available from the MIT/BIH Arrhythmia Database since it is considered the benchmark data.

    Figure 1. Types of heartbeats

    Figure 2. ECG morphology (amplitude versus samples over one 180-sample beat, with the Q, R and S points marked)

    II. ECG DATA PRE-PROCESSING

    The data available from the MIT/BIH Arrhythmia Database [14] is the standard used by many researchers. This database contains 25 records of 30 minutes duration. The ECG signal is sampled at 360 Hz with a resolution of 11 bits. Since we are interested in only certain types of arrhythmia, three records were used. Since each record is a continuous waveform, it is necessary to extract single heartbeats from it. This is done by locating the R peak and extracting 90 samples on either side of it. It is not mandatory to use all 180 samples for classification, but here we use all of them.[1] Figure 2 shows the data selection method for one heartbeat. Since the MIT/BIH database comes with


    annotations for each heartbeat, the categories must be associated with the individual extracted beats. A total of 1827 beats were extracted, with 609 beats belonging to each category, viz. Normal, Fusion and PVC.
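The beat selection described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `extract_beats`, the toy record and the label strings are hypothetical, and the window is assumed to run from 90 samples before the R peak to 89 samples after it, so that each beat has exactly 180 samples.

```python
def extract_beats(signal, r_peaks, labels, half_window=90):
    """Cut a window of 2 * half_window samples around each annotated R peak."""
    beats, beat_labels = [], []
    for r, label in zip(r_peaks, labels):
        start, stop = r - half_window, r + half_window
        # Skip beats whose window would run past either end of the record.
        if start < 0 or stop > len(signal):
            continue
        beats.append(signal[start:stop])
        beat_labels.append(label)
    return beats, beat_labels


# Toy record (hypothetical data): flat baseline with three spikes as R peaks.
record = [1000.0] * 600
for peak in (50, 200, 450):
    record[peak] = 1400.0

beats, beat_labels = extract_beats(record, [50, 200, 450], ["N", "F", "V"])
```

The first toy peak lies too close to the start of the record and is dropped, which is one simple way to handle incomplete windows; padding the record would be an alternative.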

    We could use all 180 samples to train an ANN, but this may give rise to two problems: overtraining and excessive computational overhead. Overtraining adversely affects accuracy, while computational overhead constrains speed and the resources required. Hence only the useful features should be identified and used to train the ANN, which avoids the above-mentioned problems.[12][13]

    Figure 2. Selection of samples

    III. TRANSFORM FEATURE EXTRACTION

    A. DISCRETE FOURIER TRANSFORM

    The Discrete Fourier Transform (DFT) of a sequence $\{u(n),\ n = 0, 1, \ldots, N-1\}$ is defined as

    $$v(k) = \sum_{n=0}^{N-1} u(n)\, W_N^{kn}, \qquad k = 0, 1, \ldots, N-1 \tag{1}$$

    where

    $$W_N = e^{-j 2\pi / N} \tag{2}$$

    The resulting sequence consists of real and imaginary components.[11] Again, not all components are required as features; we can select those components with the higher energy contribution. Further, only the real part of the components can be used. We have selected the real part of only 20 components, which preserves the ECG waveform morphology. It was also found that using more than 20 components does not contribute much to the classifier accuracy. Once the 20 significant components are selected, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
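A minimal sketch of this feature extraction is given below, evaluating equation (1) directly. The function names are hypothetical, and taking the first 20 (lowest-frequency) components is an assumption: the text says only that 20 components with higher energy contribution were selected.

```python
import cmath
import math


def dft(u, n_components=None):
    """Direct evaluation of v(k) = sum_n u(n) * W_N^(k n) for the first k values."""
    n_samples = len(u)
    if n_components is None:
        n_components = n_samples
    result = []
    for k in range(n_components):
        total = 0j
        for n, sample in enumerate(u):
            # W_N^(k n) with W_N = exp(-j 2 pi / N); k*n is reduced modulo N.
            angle = -2j * math.pi * ((k * n) % n_samples) / n_samples
            total += sample * cmath.exp(angle)
        result.append(total)
    return result


def dft_features(beat, n_keep=20):
    """Real parts of the first n_keep DFT components (assumed selection rule)."""
    return [v.real for v in dft(beat, n_keep)]


# Toy beat (hypothetical data): 180 samples of a cosine with exactly 3 cycles.
beat = [math.cos(2 * math.pi * 3 * n / 180) for n in range(180)]
features = dft_features(beat)
```

For the toy cosine, all the energy falls in component k = 3, whose real part equals N/2 = 90. A fast Fourier transform routine from a numerical library would normally replace the direct double loop.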

    B. PRINCIPAL COMPONENT ANALYSIS

    The aim of PCA, also known as the Karhunen-Loève Transform (KLT), is to reduce the dimensionality of the data set.[1] The importance of the variables is evaluated statistically, so the method is based on the significance of the information: the KLT identifies the directions of the signal with maximum energy or variance. Below we present the algorithm to obtain the principal components of a vector set X represented by an N × M matrix, where N is the number of segments or vectors $x_i$ in the vector set, $i = 0, 1, \ldots, N-1$, and M is the number of samples per segment, i.e. the dimension of the vectors that constitute the vector set.

    PCA ALGORITHM:

    a) Obtain the mean vector and the mean-adjusted data:

    $$\mu_x = \frac{1}{N} \sum_{i=0}^{N-1} x_i \tag{3}$$

    $$\text{Mean Adjusted Data} = X - \mu_x$$

    b) Obtain the covariance matrix:

    $$C_x = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - \mu_x)(x_i - \mu_x)^T \tag{4}$$

    c) Obtain the eigenvectors and eigenvalues:

    $$C_x e = \lambda e \tag{5}$$

    where $e$ is an eigenvector and $\lambda$ is the corresponding eigenvalue.

    d) Choose components and form a feature vector: We get 180 components, corresponding to the dimensionality of the input sequence. Components that are significant from the point of view of contribution to the total energy of the signal are selected. The selected components together must constitute about 99% of the total energy of the signal. This procedure decreases the data dimensionality without significant loss of information.

    $$\text{Feature Vector} = (eig_1\ eig_2\ eig_3\ \ldots\ eig_P) \tag{6}$$

    e) Create the new data set:

    $$\text{New Data} = [\text{Feature Vector}]^T\, [\text{Mean Adjusted Data}] \tag{7}$$

    Once the 14 significant components are selected, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
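Steps a) to e) can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, and power iteration with deflation is swapped in as one simple way to obtain the leading eigenvectors of $C_x$; a numerical library's symmetric eigen-solver would be the usual choice for 180-dimensional beats.

```python
import math
import random


def mean_adjust(X):
    """Step a): mean vector and mean-adjusted data (rows are segments)."""
    n, m = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / n for j in range(m)]
    return mu, [[row[j] - mu[j] for j in range(m)] for row in X]


def covariance(Xc):
    """Step b): C_x = (1/N) * sum_i (x_i - mu)(x_i - mu)^T."""
    n, m = len(Xc), len(Xc[0])
    return [[sum(r[a] * r[b] for r in Xc) / n for b in range(m)] for a in range(m)]


def top_eigenpairs(C, energy=0.99, iters=500, seed=0):
    """Steps c) and d): leading eigenpairs until `energy` of the trace is covered."""
    rng = random.Random(seed)
    m = len(C)
    C = [row[:] for row in C]                 # work on a copy; it gets deflated
    total = sum(C[i][i] for i in range(m))    # trace equals the sum of eigenvalues
    pairs, covered = [], 0.0
    while len(pairs) < m and covered < energy * total:
        e = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        for _ in range(iters):
            e = [sum(C[i][j] * e[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in e))
            if norm == 0.0:                   # nothing left to extract
                return pairs
            e = [x / norm for x in e]
        lam = sum(e[i] * sum(C[i][j] * e[j] for j in range(m)) for i in range(m))
        pairs.append((lam, e))
        covered += lam
        for i in range(m):                    # deflation: remove the found component
            for j in range(m):
                C[i][j] -= lam * e[i] * e[j]
    return pairs


def project(Xc, pairs):
    """Step e): project each mean-adjusted segment onto the kept eigenvectors."""
    return [[sum(e[j] * row[j] for j in range(len(row))) for _, e in pairs]
            for row in Xc]


# Toy data (hypothetical): 10 points lying almost exactly on the line (t, 2t, 0).
X = [[float(t), 2.0 * t + 0.01 * (-1) ** t, 0.0] for t in range(10)]
mu, Xc = mean_adjust(X)
pairs = top_eigenpairs(covariance(Xc))
features = project(Xc, pairs)
```

On the toy data a single component already carries more than 99% of the energy, so each three-dimensional point is reduced to one feature.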

    C. DISCRETE WAVELET TRANSFORM

    Wavelet analysis consists of decomposing a signal or an image into a hierarchical set of approximations and details. The levels in the hierarchy often correspond to those in a dyadic (power-of-two) scale.

    The continuous wavelet transform is defined as:

    $$CWT(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi\!\left(\frac{t - b}{a}\right) dt \tag{8}$$

    where a and b are the dilation and translation parameters, respectively. A wide variety of functions can be chosen as the mother wavelet $\psi$, provided $\psi(t) \in L^2$ and

    $$\int_{-\infty}^{\infty} \psi(t)\, dt = 0 \tag{9}$$

    An elegant way to compute the Discrete Wavelet Transform (DWT) is to convolve the signal with a pair of appropriately designed quadrature mirror filters (QMF) and then down-sample by a factor of two. The QMF pair that decomposes the signal consists of a low-pass filter H and a high-pass filter G, which split the signal bandwidth in half.

    The 23 coefficients corresponding to A3 are retained for training; the other components are discarded. Again, the 23 coefficients are selected on the basis of the morphology preservation criterion.[8]


    Once these 23 DWT coefficients are selected, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
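The filter-and-down-sample scheme can be sketched as below. Two points are assumptions: the text does not name the mother wavelet, so the Haar pair (H = [1/√2, 1/√2], G = [1/√2, -1/√2]) is used purely for illustration, and A3 is read as the approximation after three decomposition levels. The function names are hypothetical.

```python
import math


def haar_step(signal):
    """One analysis stage: filter with H and G, then down-sample by two."""
    if len(signal) % 2:
        # Pad odd lengths by repeating the last sample (an assumed boundary rule).
        signal = signal + [signal[-1]]
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail


def approximation(signal, level=3):
    """Keep only the approximation branch for `level` stages; details are discarded."""
    approx = list(signal)
    for _ in range(level):
        approx, _detail = haar_step(approx)
    return approx


# Toy beat (hypothetical data): a constant 180-sample segment.
a3 = approximation([1.0] * 180)
```

With this boundary rule a 180-sample beat shrinks to 90, 45 and then 23 coefficients, which matches the feature count quoted in the text; other wavelets or padding modes can give a different length.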

    IV. MORPHOLOGICAL FEATURE EXTRACTION

    A. R-peak

    One very distinctive feature that distinguishes the Fusion type from the other types, i.e. Normal and PVC, is the amplitude of the R-peak. Figure 3 depicts the R-peak distribution for the three classes. The R-peak amplitude is computed as the difference between the mean level of the ECG sample and the R-peak. We observe that the three classes of heartbeats are more or less distinctly separated.

    Figure 3. R-peak value distribution over dataset (R-peak value in mV versus ECG beat index)

    B. QRS area

    Another important observation is that the QRS area differs considerably among the three classes under consideration, as Figure 4 shows. This value can evidently serve as an important feature for classification.

    Figure 4. QRS Area value distribution over dataset (QRS area, axis labeled V-s, versus ECG beat index)

    C. Q-S distance

    The time difference between the Q and the S points is also found to be an important feature that differentiates the three classes. Figure 5 clearly shows the difference in the Q-S distance values.
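The three morphological features can be sketched for a single 180-sample beat as follows. Only the R-peak amplitude follows a definition given in the text (R-peak value minus the mean level of the beat). How the Q and S points are located and how the QRS area is integrated are not specified, so the rules below (lowest sample within a search window on each side of R, and the summed absolute deviation from the mean level between Q and S) are assumptions, as are the function name and parameters.

```python
def morphological_features(beat, r_index=90, search=30, fs=360.0):
    """Return (R-peak amplitude, QRS area, Q-S distance in seconds) for one beat."""
    baseline = sum(beat) / len(beat)           # mean level of the beat
    r_amplitude = beat[r_index] - baseline

    # Assumed Q and S locations: lowest samples just before and just after R.
    left = beat[r_index - search:r_index]
    right = beat[r_index + 1:r_index + 1 + search]
    q_index = r_index - search + left.index(min(left))
    s_index = r_index + 1 + right.index(min(right))

    qs_distance = (s_index - q_index) / fs     # seconds at a 360 Hz sampling rate
    # Assumed area: absolute deviation from the mean level, integrated from Q to S.
    qrs_area = sum(abs(v - baseline) for v in beat[q_index:s_index + 1]) / fs
    return r_amplitude, qrs_area, qs_distance


# Toy beat (hypothetical data): zero baseline with Q, R and S spikes.
beat = [0.0] * 180
beat[85], beat[90], beat[96] = -1.0, 3.0, -2.0
r_amplitude, qrs_area, qs_distance = morphological_features(beat)
```

In the toy beat the Q and S points sit 11 samples apart, so the Q-S distance is 11/360 s. Real recordings would need a more robust delineation of the Q and S points than a simple minimum search.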

    V. ARTIFICIAL NEURAL NETWORKS

    An artificial neural network is inspired by the biological neurons present in the human brain. Though the sheer number of biological neurons and their high interconnectivity are impossible to duplicate, scaled-down models in the form of artificial neural networks show a similar capacity for learning and generalizing the intrinsic characteristics of the information presented to them.

    A. Multilayer Perceptron (MLP)

    This is one of the most popular and powerful ANN architectures. The MLP not only overcomes the severe limitation of a simple perceptron, which can separate only linearly separable classes, but also offers the most important feature of learning. It is thus a very powerful pattern classifier.

    Figure 5. Q-S distance distribution over the entire dataset (Q-S distance, axis labeled seconds, versus ECG beat index)

    Function approximation can also be done quite effectively by the MLP. Figure 6 depicts the MLP architecture used for experimentation.

    Figure 6. Multilayer perceptron (input layer $x_0, \ldots, x_{29}$; hidden layer $x'_0, \ldots, x'_{29}$; output layer $y_0, y_1, y_2$)

    The entire process of classification consists of two phases: a training phase and a testing phase. MLP training is carried out using the well-known backpropagation technique. Once a satisfactory training error level is reached, training is terminated and new data is presented to the network to test its performance.

    B. Backpropagation Algorithm

    The following steps describe the backpropagation algorithm.

    a) Initialize the weights and biases with small random values.

    b) Present a continuous-valued input vector $x_0, x_1, \ldots, x_{N-1}$ and specify the desired outputs $d_0, d_1, \ldots, d_{M-1}$.

    c) Compute the actual outputs $y_0, y_1, \ldots, y_{M-1}$.

    d) Adapt weights: Use a recursive algorithm starting at the output nodes and working back to the first hidden layer. Adjust weights by

    $$w_{i,j}(t+1) = w_{i,j}(t) + \eta\, \delta_j\, x'_i$$

    where $w_{i,j}(t)$ is the weight from hidden node i or from an input to node j at time t, $x'_i$ is either the output of node i or an input, $\eta$ is a gain term and $\delta_j$ is an error term for node j. If node j is an output node, then

    $$\delta_j = \left(1 - (y_j)^2\right)(d_j - y_j)$$

    where $d_j$ is the desired output of node j and $y_j$ is the actual output.

    If node j is an internal hidden node, then

    $$\delta_j = \left(1 - (x'_j)^2\right) \sum_k \delta_k\, w_{jk},$$


    where k is over all the nodes in the layers above node j. Internal node thresholds are adapted in a similar manner by assuming they are connection weights on links from auxiliary constant-valued inputs (called biases).
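The weight adaptation step can be sketched for a network with one hidden layer as below. The factor (1 - y²) in the error terms is the derivative of tanh, so tanh units are assumed here. The function names, network size, gain value and toy data are all hypothetical, chosen only to keep the example small; the network of Figure 6 would use 30 inputs and 3 outputs.

```python
import math
import random


def forward(x, w_hidden, w_out):
    """tanh MLP with one hidden layer; the last weight in each row is the bias."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    out = [math.tanh(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, out


def backprop_step(x, d, w_hidden, w_out, eta=0.1):
    """One weight adaptation for a single (input, desired output) pair."""
    hidden, y = forward(x, w_hidden, w_out)
    # Output nodes: delta_j = (1 - y_j^2) (d_j - y_j)
    delta_out = [(1.0 - yj * yj) * (dj - yj) for yj, dj in zip(y, d)]
    # Hidden nodes: delta_j = (1 - x'_j^2) * sum_k delta_k w_jk
    delta_hid = [
        (1.0 - h * h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    # w_ij(t+1) = w_ij(t) + eta * delta_j * x'_i; biases see a constant input of 1.
    for k, row in enumerate(w_out):
        for j, v in enumerate(hidden + [1.0]):
            row[j] += eta * delta_out[k] * v
    for j, row in enumerate(w_hidden):
        for i, v in enumerate(x + [1.0]):
            row[i] += eta * delta_hid[j] * v


def total_error(data, w_hidden, w_out):
    """Sum of squared output errors over the whole data set."""
    err = 0.0
    for x, d in data:
        _, y = forward(x, w_hidden, w_out)
        err += sum((dj - yj) ** 2 for dj, yj in zip(d, y))
    return err


# Toy problem (hypothetical data): the target depends only on the first input.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]
w_out = [[rng.uniform(-0.1, 0.1) for _ in range(4)] for _ in range(1)]
data = [([0.0, 0.0], [-0.9]), ([0.0, 1.0], [-0.9]),
        ([1.0, 0.0], [0.9]), ([1.0, 1.0], [0.9])]

before = total_error(data, w_hidden, w_out)
for _ in range(500):
    for x, d in data:
        backprop_step(x, d, w_hidden, w_out)
after = total_error(data, w_hidden, w_out)
```

The hidden-layer error terms are computed before any weights are changed, so both layers are updated from the same forward pass.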

    Convergence is faster if a momentum term is added and weight changes are smoothed by

    $$w_{i,j}(t+1) = w_{i,j}(t) + \eta\, \delta_j\, x'_i + \alpha \left[ w_{i,j}(t) - w_{i,j}(t-1) \right]$$

    Where 0