Research Article A Novel Clinical Decision Support System...

Research ArticleA Novel Clinical Decision Support System UsingImproved Adaptive Genetic Algorithm for the Assessment ofFetal Well-Being

Sindhu Ravindran1 Asral Bahari Jambek1 Hariharan Muthusamy2 and Siew-Chin Neoh3

1School of Microelectronic Engineering Universiti Malaysia Perlis (UniMAP) Campus Pauh Putra 02600 Perlis Malaysia2School of Mechatronic Engineering Universiti Malaysia Perlis (UniMAP) Campus Pauh Putra 02600 Perlis Malaysia3Department of Computing Science and Digital Technologies University of Northumbria Newcastle NE UK

Correspondence should be addressed to Hariharan Muthusamy hariunimapedumy

Received 12 September 2014 Revised 16 November 2014 Accepted 16 November 2014

Academic Editor Maria N D S Cordeiro

Copyright copy 2015 Sindhu Ravindran et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

Anovel clinical decision support system is proposed in this paper for evaluating the fetal well-being from the cardiotocogram (CTG)dataset through an Improved Adaptive Genetic Algorithm (IAGA) and Extreme Learning Machine (ELM) IAGA employs a newscaling technique (called sigma scaling) to avoid premature convergence and applies adaptive crossover and mutation techniqueswith masking concepts to enhance population diversity Also this search algorithm utilizes three different fitness functions (twosingle objective fitness functions and multi-objective fitness function) to assess its performance The classification results unfoldthat promising classification accuracy of 94 is obtained with an optimal feature subset using IAGA Also the classification resultsare compared with those of other Feature Reduction techniques to substantiate its exhaustive search towards the global optimumBesides five other benchmark datasets are used to gauge the strength of the proposed IAGA algorithm

1 Introduction

In clinical practice cardiotocography (CTG) was introducedin the late 1960s which is a noninvasive and cost-effectivetechnique for evaluating the fetal well-being This techniquehas been widely used by obstetricians for examining thefetal well-being (inside the motherrsquos uterus) as the fetus isnot available for direct observations The babyrsquos fetal heartrate (FHR) and the motherrsquos uterine contractions (UC) arerecorded on a paper trace known as cardiotocograph TheCTG technique is highly instrumental in the early identifica-tion of a pathological state (ie congenital heart defect fetaldistress or hypoxia) and it helps the obstetrician to predictfuture complications

During the critical period of labor these FHR signalsare used as a denotation of the fetal condition and as awarning of possible fetal and neonatal compromise namelymetabolic acidosis Severe hypoxic injury of the fetus canresult in the neurodevelopmental disability and cerebral palsyor even death [1] Hence such FHR patterns are devised in

such a way that such risky conditions of the fetus need to beidentified in earlier stages in order to alert the obstetriciansto intervene before there is an irreversible damage to thefetus Although the FHR signals interpret and provide earlyestimations and warnings about the fetal condition [2] stillthere has been lot of scepticism as there is inconsistencyin such interpretation and the increase of false positivediagnosis On one hand advances in signal processing andpattern recognition techniques have paved theway to developan efficient medical diagnosis system to analyze and classifythe FHR signal appropriately [3]

In this paper a clinical decision support system calledImproved Adaptive Genetic Algorithm (IAGA) has beendeveloped to discern the FHR signals of the CTG recordingsinto their respective groups For this purpose the CTGdataset has been acquired from the UCI machine learningrepository for experimentation This dataset consists of 2126CTG samples and each of these samples has a feature lengthof 21 Out of these 2126 samples 1655 samples belong tothe normal state 176 samples belong to the pathological

Hindawi Publishing CorporationComputational and Mathematical Methods in MedicineVolume 2015 Article ID 283532 11 pageshttpdxdoiorg1011552015283532

2 Computational and Mathematical Methods in Medicine

Table 1 List of features in CTG dataset

Snumber

Name of thefeatures Description

1 LB FHR baseline (beats per minute)2 AC Number of accelerations per second3 FM Number of fetal movements per second4 UC Number of uterine contractions per second5 DL Number of light decelerations per second6 DS Number of severe decelerations per second

7 DP Number of prolonged decelerations persecond

8 ASTV Percentage of time with abnormal shortterm variability

9 MSTV Mean value of short term variability

10 ALTV Percentage of time with abnormal long termvariability

11 MLTV Mean value of long term variability12 Width Width of FHR histogram13 Min Minimum of FHR histogram14 Max Maximum of FHR histogram15 Nmax Number of histogram peaks16 Nzeros Number of histogram zeros17 Mode Histogram mode18 Mean Histogram mean19 Median Histogram median20 Variance Histogram variance

21 Tendency

Histogram tendencyminus1 = left asymmetric0 = symmetric1 = right asymmetric

state and 295 samples belong to the suspect state Featuresof this CTG dataset used for experimentation have beenexplained in Table 1 Single and multiobjective fitness func-tions have been used to gauge the efficacy of the IAGASeveral classifiers like 119896-NN (119896-Nearest Neighbor) SVM(SupportVectorMachine) BN (BayesianNetwork) andELM(Extreme LearningMachine) have been used to discern thesefeatures

With the objective of proving the robustness of thissearch algorithm apart from the CTG dataset five otherbenchmark datasets that is MEEI voice dataset Parkin-sonrsquos (PD) dataset Cleveland Heart Disease (CAD) datasetErythemato-Squamous (ES) dataset and Breast Tissue (BT)dataset have been taken from the UCI machine learningrepository for experimentation and tested with this algo-rithm The main aim is to select the best clinical features ofthese datasets through the proposed IAGAmethod which inturn attains promising classification accuracywith aminimalnumber of features

The rest of this paper is organised as follows Section 2describes the literature works done in this specific problemdomain Section 3 explains about the feature selection carriedout using IAGA Section 4 presents the classifiers employed

and the performance measures Section 5 enunciates the dis-cussion of the classification results of CTG dataset Section 6elaborates the classification results of other datasets andSection 7 concludes the entire work

2 Previous Works on CTG Dataset

Numerous approaches have been investigated using conven-tional and artificial intelligence techniques for feature extrac-tion and also to come out with diagnostic systems [4] In[3] the automatic classification of FHR signal which belongsto hypoxic and normal newborns has been carried outthrough a hidden Markov models (HMM) based approachYet again an ANBLIR (artificial neural network based onlogical interpretation of fuzzy if-then rules) system is usedto evaluate the risk of low-fetal birth weight as normal orabnormal using CTG signals recorded during the pregnancyin [5]

21 Basic GA As an expeditious search strategy GA hasbeen utilized in the assessment of the fetal well-beingFor example Ocak (2013) reported that a genetic basedoptimization followed by SVM classification helps in pre-dicting the substantial features for assessing the fetal stateThe classification accuracy with FHR data set obtained was9923with 13 features [6] An adaptive neurofuzzy inferencesystem (ANFIS) has been proposed for the prediction of fetalstatus from theCTG recordings as normal or pathological [7]

Yılmaz and Kılıkcıer have suggested a combined schemeof binary decision tree (BDT) and particle swarm opti-mization (PSO) for handling this specific classification taskThrough least squares support vector machine (LS-SVM)a good classification accuracy rate of 9162 was achievedBesides this experimentation has resulted in the three-classclassification of the CTG dataset using receiver operationcharacteristic analysis and Cobweb representation [8] How-ever the problem of design of a better genetic based searchheuristic that provides higher classification accuracy alongwith a reduced number of voice features in this specific prob-lem domain is still open In other words the reduced optimalfeature subset must be sufficient enough to classify the datasamples into their respective classes than the previous works

Basic GA deals with candidate solutions which arerepresented by individuals (or chromosomes) in a largepopulation It starts with the random generation of ini-tial set of chromosomes followed by their correspondingfitness evaluation The successive generations are created(iteratively) by the picking of the highly fit individuals ofthe current generation On achieving the eligible individualthat satisfies our constraint the GA cycle is stopped usingthe termination criteria and this individual becomes thesolution of the problem There are three significant geneticoperators called selection crossover and mutation whichhelp in the reproduction process of the chromosomes [9]Ideally a GA cycle takes place using these genetic operatorsand such genetic parameters influencing this process tend toproduce good individuals Such a basic GA cycle is depictedin Figure 1

Computational and Mathematical Methods in Medicine 3

Start

Fitness evaluation

Selection crossover and mutation

New generation

Select the fittest individual

Termination criteria

Stop

Yes

No

Initial generation of individuals

Figure 1 Basic GA cycle

3 An Improved Adaptive Genetic Algorithm(IAGA) for Feature Selection

To begin with initially the input data of the CTG datasetis extracted from the FHR signals IAGA selects the bestoptimal feature subset which is then fed as inputs to theclassifiers and these classifier inputs are distinguished intotheir respective number of classes The classification resultsare presented in terms of performance measures like PositivePrediction Negative Prediction Sensitivity Specificity AreaUnder Receiver Operating Curve (AUC) Overall Accuracy119865-measure 119866-mean and Kappa statistic and so forthBesides the performance of the IAGA method has beencompared with three existing feature reduction and featureselection methods (PCA SFS and SBS) so as to affirm thestrength of proposed search algorithm

Despite the algorithm of Basic GA being applied to thisCTG dataset IAGA is implemented in two different wayswith certain modification in the crossover and mutationprobabilities and their mode of usage during the searchprocedure Totally three types of FS (Basic GA IAGA-method 1 and IAGA-method 2) have been performed onthis dataset followed by feature classification Initially BasicGA is employed to this CTG dataset for optimizing thefeatures which was mentioned earlier in Section 2 This is

further improved as IAGA for the same purpose which variesfrom Basic GA in terms of selection function crossover andmutation rates and fitness functions Three different fitnessfunctions have been formulated as the evaluation measuresof IAGA

31 Stochastic Universal Sampling The proposed IAGAmethod employs the stochastic universal sampling (SUS)technique for selecting the fittest chromosomes This tech-nique was developed by Baker in 1987 [10] Based on thisrandom search the SUS function utilizes a single randomvalue to sample the chromosomes It makes use of severalstrategies of GA to select the best parent chromosomes forreproduction Also it ensures that these chromosomes arehighly capable of being reproduced Through this selectionscheme the genetic diversity of the population is highlymaintained

32 Sigma Scaling An important parameter that must befine-tuned during the genetic search is the selection pressure(SP) that is the degree to which the selection emphasizesthe better chromosomes When this SP is very low it willresult in the low convergence rate of the search algorithm Onthe other hand a higher SP will make the search procedureto attain premature convergence easily Hence to overcomesuch conditions various fitness scaling methods (a methodused to adjust the chromosomal fitness) have been proposedto prevent GA from attaining the premature convergenceSigma scaling is one such a method that maintains a constantSP throughout the entire generation and it is used in thisIAGA method for the reorientation of chromosomal fitnessSuppose119891(119905)119909 is the fitness of some individual 119909 of generation 119905and suppose the average fitness and standard deviation (SD)of the fitness of the individuals in generation 119905 are given by119891

(119905) and 120590

(119905) respectively then the adjusted fitness of 119909 ingeneration 119905 is given as follows

ℎ

(119905)

119909 =

min(0 1 +119891

(119905)119909 minus 119891

(119905)

120590

(119905)) 120590

(119905)= 0

1 120590

(119905)= 0

(1)

33 Adaptive Crossover andMutationwithMasking Uniformcrossover is applied in the search procedure of IAGA witha crossover rate (119875119888) of 09 During uniform crossover eachgene in the offspring is created by copying the correspondinggene from one or the other parent chosen according to arandomly generated crossovermask A single crossovermaskis randomly generated during the search of IAGA for eachpair of parents (P1mdashfirst parent and P2mdashsecond parent) inwhich every ldquo1rdquo and ldquo0rdquo in the mask imply the copied genefrom the first parent and second parent respectively [11]The offspring produced in this way constitutes a collection ofgenes from both the parents

Flip-bit mutation is applied with a mutation rate (119875119898) of003 The mutation operator makes use of a random maskIt changes those corresponding bits which imply ldquo1rdquo in themask Depending upon this random nature of the mask thevalue of 119875119898 is determined and this highly influences the


random behavior of the 119875119898 value [12] However it is good tovary the mutation rate between 01 and 0001

34 Fitness Functions

341 Objective Function I Three fitness functions have beendevised to assess the chromosomal quality and they are givenbelow

Objv1 = Accuracy (2)

342 Objective Function II During the classification ofbinary and multiclass problems error rate and classificationaccuracy are the two popular measures used for evaluatingthe classifier performance Generally the classifierrsquos accu-racy is determined as its performance measure in balanceddatasets [13] But when dealing with imbalanced data accu-racy is known to be unsuitable to measure the classificationperformance as it may mislead the classification process dueto the emphasis on the influence of the majority class

In order to overcome this convenience a very fewmetricshave been devised as fitness functions and they are geometricmean (119866-mean) AUC (area under ROC curve) and 119865-measure In this approach of IGA geometric mean (119866-mean)has been chosen as an objective function as it is one of themost popular measures for handling imbalanced data andeasier to calculate than 119865-measure and AUC In this regard anew fitness function (Objv2) has been devised to evaluate thefitness where the classification performancewill be evaluatedthrough geometric mean and its equation is defined as

Objv2 =119888

prod

119894=1

(119872119894)1119888

(3)

where

119872119894 =Number of correctly classified samples

Total number of samples

119888 = number of classes in the dataset(4)

For binary class the value of 119888 = 2 and it corresponds toclassify the data into two groups For multiclass the value of119888 = 1 2 3 119873 and it corresponds to classify the data into119873 number of classes

343 Objective Function III Eventually in this IAGAapproach a single objective fitness function (Objv2) has beencombined with a multiobjective fitness function (number ofzeros in the chromosome) to bias the genetic search towardsthe global optimum and it is defined as

Objv3 = (119908)Objv2 + (1 minus 119908) ( 119885119873

) (5)

where Objv2 corresponds to the 119866-mean value and 119908 (0 lt

119908 lt 1) is the equalizing factor which adjusts the significanceof 119866-mean 119885 implies the number of zeros in the chromo-some and119873 implies the length of the chromosome

35 Methods of IAGA

351 IAGA-Method 1 The adaptive approach of IAGA hasbeen devised in two different ways In basic GA when thevalues of119875119888 and119875119898 aremaintained constantly throughout theentire search procedure there will be no improvement in theindividualsrsquo fitness or it may result in premature convergenceowing to attain a suboptimal solutionThis not only will affectthe performance of search algorithm but also fails to achievethe desired solution expeditiously This can be avoided bymodifying the values of 119875119888 and 119875119898 in an adaptive mannerin accordance with the chromosomal fitness in their specificgenerations [14]

Based on the convergence to the optimum this IAGA-method 1 (IAGA-M1) determines the adaptive crossoverand mutation rates This is done with the help of certainmeasures like average fitness value (119891avg) and maximumfitness value (119891max) of the population respectively 119883 givesthis relationship between the maximum and average fitnessas follows

119883 = 119891max minus 119891avg (6)

However when GA converges to local optimum that iswhen the value of119883 decreases the values of119875119888 and119875119898 have tobe increased Inversely when the value of 119883 increases thesevalues have to be decreased Besides when GA converges to alocally optimal or even globally optimal solution the increasebeing done in the values of 119875119888 and 119875119898 will disrupt the near-optimal solutions [14] Due to this effect the population maynever converge to the global optimum and the performanceof GA will be diminished considerably

In order to overcome these issues two more fitnessmeasures called 1198911015840 and 11989110158401015840 have been taken into account insuch a way that these measures help to preserve the excellentindividuals of the population 1198911015840 is the bigger fitness of thetwo crossover chromosomes and 11989110158401015840 is the fitness value of theindividual that has to be mutated These measures are highlyinstrumental in overcoming the premature convergence andpreserving the excellent individuals Previously the values of119875119888 and 119875119898 are varied depending upon the 119883 value But nowwe can conclude that 119875119888 and 119875119898 are not only related to 119883 butalso related to 119891max minus119891

1015840 and 119891max minus11989110158401015840 [15] Eventually the 119875119888

and119875119898 values are determined using the following equation as

119875119888 =

1198961 (119891max minus 1198911015840)

(119891max minus 119891avg)119891

1015840ge 119891avg

09 119891

1015840lt 119891avg

119875119898 =

1198962 (119891max minus 11989110158401015840)

(119891max minus 119891avg)119891

10158401015840ge 119891avg

003 119891

10158401015840lt 119891avg

(7)

For both these equations in (7) 1198961 and 1198962 are predeter-mined values that are less than 10 considering the probabilityformutation and crossover [14] In this experiment the valuesof 1198961 and 1198962 have been empirically determined and fixed as04 and 01 respectively


352 IAGA-Method 2 In IAGA-method 2 (IAGA-M2) thevalues of 119875119888 and 119875119898 are devised with their maximum andminimumprobabilities that is instead of applying two valuesfor 119875119888 and 119875119898 four values namely 119875119888max 119875119888min 119875119898maxand 119875119898min are infused This adaptive approach has beenfollowed in order to maintain diversity thereby sustainingthe convergence capacity of IAGA [16] Hence the crossoverand mutation rates are chosen as described in the followingequation

119875119888

=

119875119888max minus (119875119888max minus 119875119888min)

(1 + exp (120582 ((1198911015840 minus 119891avg) (119891max minus 119891avg))))119891

1015840ge 119891avg

119875119888max 119891

1015840lt 119891avg

119875119898

=



1015840ge 119891avg

119875119898max 119891

1015840lt 119891avg

(8)

where 119891max is the maximum fitness 119891avg is the average fitnessof the population 1198911015840 is the bigger fitness of the two crossoverchromosomes and 119891

10158401015840 is the fitness value of the individualthat has to bemutated119875119888max and119875119888min are themaximum andminimum probabilities of crossover119875119898max and119875119898min are themaximum and minimum probabilities of mutation [16] andlambda is a constant (120582 = 2)

While modifying the crossover and mutation rates thevalues of 119875119888max and 119875119888min are chosen as 09 and 06 on thebasis of empirical methods Similarly the values of119875119898max and119875119898min are also chosen as 01 and 0001 based on empiricalmethods respectively Hence when the chromosomes aresubjected to crossover and mutation the values of 119875119888 and119875119898 are modified adaptively and then masking is applied Thespecifications of IAGA are tabulated in Table 2 and Figure 2shows the entire genetic cycle using the two methods ofIAGA

4 Experimental Setup

41 Classifiers The performance of the classifiers employedin this work is examined by 10-fold cross validation schemewhich have served as an evaluator of this proposed IAGAalgorithm For the classification of selected features four lin-ear and nonlinear classifiers like 119896-NN SVM BN and ELMare applied For performing binary classification both 119896-NNand SVM are used in which SVM gives better classificationresults For multiclass classification 119896-NN BN and ELMare employed wherein ELM achieves better classificationperformance A detailed description of these classifiers isdiscussed below

411 Support Vector Machine Support vector machine(SVM) is an acknowledged classification technique beingwidely used for solving classification problems In general

Table 2 Specifications of IAGA

Parameters SpecificationsProbability of crossover 119875119888 09Type of crossover Uniform crossoverProbability of mutation 119875119898 003Type of mutation Flip-bit mutationSelection method Stochastic selectionNumber of runs 30Length of chromosome 22Population size 21Number of elites 1Maximum probability of crossover 09Minimum probability of crossover 06Maximum probability of mutation 01Minimum probability of mutation 0001

SVM separates the classes with a decision surface thatincreases themargin between the classesThe surface is calledthe optimal hyperplane and the data points closest to thishyper plane are called support vectors [17] The data used inthis work is not linearly separable Hence nonlinear kernelfunctions are employed to transform the data into a newfeature space where a hyperplane tends to separate the dataThe optimal separating hyperplane is being searched by thesekernel functions in a new feature space to increase its distancefrom the closest training point Due to a better generalizationcapability and low computational cost RBF kernel has beenapplied in this work for separating the optimal hyperplaneand the equation of the kernel function is given below

119870(119909 119909119894) = exp(minus

1003817

1003817

1003817

1003817

119909 minus 1199091198941003817

1003817

1003817

1003817

2

2120590

2)

(9)

There are two significant parameters called regularizationparameter (120574) and squared bandwidth of the RBF kernel(120590

2) which should be optimally selected to obtain promising

classification accuracy By trial and error the value of sigma(120590

2) and gamma (120574) for the RBF kernel was set to 04 and 90

respectively

412 Extreme Learning Machine A new learning algorithmfor the single hidden layer feed forward networks (SLFNs)[18] called extreme learning machine (ELM) was proposedbyHuang et al It has beenwidely used in various applicationsto overcome the slow training speed and overfitting problemsof the conventional neural network learning algorithms [19]For the given 119873 training samples the output of a SLFNnetwork with 119871 hidden nodes can be expressed as thefollowing

119891119871 (119909119895) =

119871

sum

119894

120573119894119892 (119908119894 sdot 119909119895 + 119887119894) 119895 = 1 2 3 119873 (10)

It can be described as 119891(119909) = ℎ(119909)120573 where 119909119895 119908119894 and119887119894 are the input training vector input weights and biases to


Selection using SUS with sigma

Population initialization

Creation of crossover masks

Fixing the crossover rate

Implement uniform

Creation of mutation masks

Fixing the mutation rate

Completed 30 generations and obtained the optimal solution



Implement uniform



(Pc) adaptively using (8)(Pc) adaptively using (6)

(Pm) adaptively using (7) (Pm) adaptively using (9)

Implement flip-bit mutation

New generation of chromosomes

GA cycle is stopped


IAGA-methodIAGA-method

and weighted aggregationFitness evaluation using accuracy G-mean

Figure 2 Genetic cycle using IAGA-methods 1 and 2

the hidden layer respectively 120573119894 is the output weights thatlinks the 119894th hidden node to the output layer and 119892(sdot) is theactivation function of the hidden nodes

Training an SLFN is simply finding a least square solutionby using Moore-Penrose generalized inverse

120573 = 119867dagger119879(11)

where 119867dagger = (119867

1015840119867)

minus1119867

1015840 or 119867

1015840(119867119867

1015840)

minus1 depending onthe singularity of 1198671015840119867 or 1198671198671015840 Assume that 1198671015840119867 is not asingular and the coefficient 1120576 (120576 is positive regularizationcoefficient) is added to the diagonal of1198671015840119867 in the calculationof the output weights 120573119894 Hence more stable learning systemwith better generalization performance can be obtained Theoutput function of ELM can be written compactly as

119891 (119909) = ℎ (119909)119867

1015840(

1

120576

+ 119867119867

1015840)

minus1

119879

(12)

During the implementation of ELM kernel the hiddenlayer feature mappings need not to be known to users andRBF kernel has been employed in this work Best values

for positive regularization coefficient (120576) as 4 and RBFkernel parameter as 1 were found empirically after severalexperiments

42 Implementation of the IAGA with Benchmark DatasetsThe implementation of the proposed IAGA methods alongwith the respective FS and FR methods has been elaboratedin this section The entire process takes place in three majorsteps which are

(i) feature reductionselection using four existing meth-ods

(ii) optimization using IAGA method(iii) classification through 119896-NN SVM BN and ELM

Initially PCA is applied to the data samples of theconcerned dataset and their dimensionality is reduced to thedesired range From PCA only the principal componentsof having 98 of the variability are selected as featuresSimultaneously SFS and SBS are employed with the samedataset to select the minimized feature subset In SBS and


Dimensionality Feature selection

SBS

Chromosome

with masking

New population and its

Objv1 Objv2 and Objv3

Optimal feature

Existing FRFS methods

Benchmark datasets

Binary class

Proposed IAGA method

reduction using PCAusing SFS and selection using SUS

Applying adaptivecrossover and mutation

with sigma scaling

fitness evaluation usingsubset is obtainedFeature classification using k-

NN SVM BN and ELM

Multiclass

are met

Terminationcriteria

Figure 3 Implementation of IAGA in pattern classification

SFS only 50 of the features are selected from optimallyordered set of all features of the six datasets The reducedfeature subsets of both these methods are individually fed asinputs to the classifiers

Secondly the proposed IAGA method is applied forchoosing the best features of the specific datasetThismethodinvolves certain significant levels of the genetic process likepopulation initialization chromosome selection using SUSwith sigma scaling applying crossover and mutation usingmasking techniques fitness evaluation using the three fitnessfunctions (Objv1 Objv2 and Objv3) and satisfying thetermination criteria The selected feature subset obtainedthrough this optimization technique is fed as inputs tothe classifiers Hence three types of inputs are sent to theclassifiers (inputs from PCA inputs from SFS and SBS andreduced inputs from IAGA)

Finally the classifier tends to classify the inputs into theirrespective binary or multiclass groups based on the numberof the classes of the concerned dataset For instance whenCTG dataset is subjected to the aforementioned processesclassifiers like 119896-NN BN and ELM will classify the datasamples into pathological suspect and normal samplesSimilarly whenPDdataset is subjected to the aforementionedprocesses the 119896-NN and SVM classifiers will classify thedata samples into pathological and normal samples Figure 3describes the implementation of the proposed IAGAmethodsalong with the benchmark datasets

43 Performance Measures-Confusion Matrix A confusionmatrix is a plot used to evaluate the performance of aclassifier As the proposed IAGA algorithm deals with bothbinary andmulticlass classification problems the experimen-tal results are presented in terms of several performancemeasures Among these measures the classification resultsof binary classification are explained in terms of Sensitivity

Specificity AUC Overall Accuracy 119865-measure119866-mean andKappa statistic The experimental results of multiclass classi-fication are described in terms of Accuracy and119866-mean Truepositive (TP) true negative (TN) false positive (FP) andfalse negative (FN) are the four basic elements of a confusionmatrix and the aforementioned performance measures areevaluated from these elements which are explained below

Sensitivity (SE)

SE = 100 times

TPTP + FN

(13)

Specificity (SP)

SP = 100 times

TNTN + FP

(14)

Accuracy (ACC)

ACC = 100 times

TP + TNTP + TN + FP + FN

(15)

119865-measure

119865-measure = 2 times precision times recallprecision + recall

(16)

where

precision =

TPTP + FP

recall = TPTP + FN

(17)

Area under ROC

ROC =

SE + SP2

(18)


75

80

85

90

95

100

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2

Accu

racy

Feature reductionselection methods

BN accuracyELM accuracy

k-NN accuracy

Figure 4 Classification accuracies of all the FRFS methods usingCTG dataset

Kappa statistic

KS =1198750 minus 119875119888

1 minus 119875119888

(19)

where 1198750 is the total agreement probability and 119875119888 isthe hypothetical probability of chance agreement

119866-mean

119866-mean =

119888

prod

119894=1

(119872119894)1119888

(20)

5 Results and Discussion

51 Classification Results of CTG Dataset Pattern classifi-cation is performed on the CTG data by employing IAGAmethods to perform FS In addition this dataset is alsotreated with the three existing FS methods like PCA SFSand SBS and their simulation results are compared withthose of the IAGAmethods Table 3 presents the comparisonof simulation results obtained after applying the proposedIAGAmethods and three existing FS methods It can be seenfrom this table that from PCA 16 principal componentscontaining 98 of the variability are selected

When FS is performed using SBS and SFS first 11 features(50 of the features) are selected from the optimally orderedset of 22 features Totally the simulation results unfoldthat the three existing FS methods have brought out a bestaverage classification accuracy of 9214 9210 and 9271respectively In the view of comparing the classificationperformance of the IAGA methods directly with that ofthe three existing FS methods (PCA SBS and SFS) first6 features are chosen and classification experiments areconducted The classification accuracy obtained by selectingthe first 6 features using PCA SBS and SFS is 8984 9144and 9064 respectively

Figures 4 and 5 show the comparative view of the clas-sification accuracies and number of selection features using

02468

1012141618

BN featuresELM features

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2


k-NN features

Num

ber o

f sel

ecte

d fe

atur

es

Figure 5 Number of selected features using all the FRFS methodsfor CTG dataset

Table 3 Classification accuracies of all FSFR methods using CTGdataset

FS methods Number of selectedattributes

Accuracy obtainedusing ELM

PCA 16 92146 8960

SFS 11 92106 9144

SBS 11 92716 9055

Basic GA 14 9787IAGA-M1 6 9361IAGA-M2 13 9361

Table 4 Performance measures of CTG dataset

Metrics ELMAcc in 119866-mean

PCA 9214 8097SFS 9210 8330SBS 9271 8340Basic GA 9787 9507IAGA-M1 9361 8557IAGA-M2 9361 8588

all the FRFS methods for the CTG dataset The inferencefrom these Figures reveals that among all the FRFSmethodsthe overall (best) classification accuracy is yielded throughIAGA-M1 for 6 features

Table 4 explains the performance measures of all theFRFS methods using this dataset The observations showthat the two IAGA methods have come out with a moderateaccuracy ranging between 85 and 95 However when theoverall performance is taken into account the best averageclassification accuracy has been achieved by both IAGA-M1 and M2 with 9361 But when comparing the feature


Table 5 Classification accuracies of classifiers using CTG dataset

Classifier Classification accuracyWith original features After FS of increase

ELM 9103 9361 264

Table 6 Confusion matrix of CTG dataset

Method ELMPathological Suspect Normal

IAGA-M11620 61 1033 224 142 10 152

count IAGA-M1 has accomplished better results than IAGA-M2 by selecting only 6 CTG features and those six featuresare LB (FHR baseline) AC (accelerations) UC (uterine con-tractions) ASTV (abnormal short term variability) ALTV(abnormal long term variability) and MLTV (mean valueof long term variability) Also these two methods haveproduced best 119866-mean values of 8557 and 8588 throughELM classification

Table 5 represents the classification results obtainedbefore and after undergoing FS procedure At the initial stageof classification when the original CTG samples are directlyfed into the ELM classifier the accuracy range obtained wasmediocre However when these features are selected throughIAGA the accuracy seemed to increase by 264 significantlyEventually the maximum classification accuracy of 9401 isobtained using IAGA-M1 for the discrimination of the CTGsamples with the SD of 024

52 Confusion Matrix of CTG Dataset Table 6 explicatesthe simulation results of CTG dataset in the form of con-fusion matrix for the three classifiers Since 3 class classi-fications have been performed on this dataset there seemsto be moderate amount of misclassifications for all thethree groups Out of the 1655 pathological samples 1620samples were correctly classified with remaining 35 incor-rect classifications For the 295 suspect samples there are224 correct classifications with 71 misclassifications Finallyout of 176 normal samples 152 normal samples are cor-rectly classified with the remaining 24 incorrectly classifiedsamples

6 Classification Results ofOther Benchmark Datasets

61 Comparison with PreviousWorks for All the Datasets Theobservation from Table 7 shows the comparison of the clas-sification results of CTG dataset and the other five datasetswith their previous works in the literature Inferences fromthe literature enunciate that the proposed IAGA techniquehas achieved the best classification accuracies (eg 9361 forCTG dataset) and optimal feature subset (6 clinical features)so far The classification performance has been achievedsimilarly for the remaining datasets as well

0246810121416

7580859095

100105

MEEI PD CAD ES BT CTG

Feat

ure c

ount

Accu

racy

Datasets

Maximum accuracySelected features

Figure 6 Comparison plot of overall performance of six datasets

62 Overall Classification Performance The overall clas-sification performance is displayed in the form of com-parison plot in Figure 6 and this figure substantiates themaximum classification performances achieved using theproposed search techniques When analysing the overall per-formance the best classification accuracy has been achievedthrough IAGA-M2 for four datasets and IAGA-M1 for twodatasets

63 Comparison of Fitness Functions of the IAGA MethodsThree fitness functions namely classification accuracy 119866-mean and weighted aggregation (119866-mean + sum of unse-lected features) have been devised as the first second andthird fitness function respectively Table 8 depicts the overallcomparison between these three different fitness functions ofthe proposed IAGA methods The overall observation fromthis table implies that the third fitness function (Objv3) (indi-cated by the symbol radic) of all the three proposed methodshas produced the maximum number of best classificationaccuracies for most of the datasets

7 Conclusion

This paper has proposed an efficient clinical support systemcalled IAGA to discern the highly discriminative clinical fea-tures from the CTG dataset through ELM classifier to assessthe fetal well-being The classification results indicate thatIAGAmethod has performed better in terms of classificationaccuracy and reduced feature count when compared withthe previous works in the literature The classification resultsare presented in terms of various performance measures likeSensitivity Specificity AUC Overall Accuracy 119865-measure119866-mean and Kappa statistic In order to demonstrate theeffectiveness of this algorithm five other benchmark datasetshave been testedwith the proposed IAGA searchmethod andthe classification results are elaborated in detail Also theseresults are compared with other existing feature selectionand feature reduction methods to potentiate its robustnessObserving the classification results obtained it can be con-cluded that this decision support system has achieved theoptimal solution obtained so far and has been instrumentalfor the obstetricians in predicting the fetal well-being moreaccurately


Table 7 Comparison with previous works of all the datasets

S number [Reference Number] Features and methods Selected features Classifier AccuracyMulticlass classification

CTG dataset1 [7] ANFIS mdash mdash 97152 [6] GA 13 SVM 99233 [8] LS-SVM-PSO-BDT mdash SVM 91624 Proposed study IAGA-M1 6 ELM 9361 plusmn 042

ES dataset1 [20] IFSFS 21 SVM 98612 [21] Two-stage GFSBFS 20 16 19 SVM 100 100 97063 [22] GA based FS algorithm 16 BN 99204 Proposed study IAGA-M2 14 BN 9883 plusmn 012

BT dataset1 [23] Normalization mdash SVM 71692 [24] Electrical impedance spectroscopy 8 923 [25] ACO and fuzzy system mdash SVM 71694 Proposed study IAGA-M2 3 ELM 9358 plusmn 042

Binary ClassificationMEEI dataset

1 [26] 30 acoustic features and PCA 17 SVM 9812 [27] LDA based filter bank energies Not reported LDA 853 [28] 22 acoustic features and IFS 16 SVM 91554 Proposed study 22 acoustic features and IAGA 8 SVM 100

PD dataset1 [29] GA 10 SVM 992 [30] GA 9 k-NN 98203 Proposed study IAGA-M1 8 k-NN 9938 plusmn 022

CAD dataset1 [31] GA 9 SVM 832 [32] WEKA filtering method 7 MLP 863 Proposed study IAGA-M2 3 SVM 8323 plusmn 084

Table 8 Overall comparison of fitness functions

Datasets Basic GA IAGA-method 1 IAGA-method 2Objv1 Objv2 Objv3 Objv1 Objv2 Objv3 Objv1 Objv2 Objv3

MEEI radic radic times

PD times radic times

CAD times radic radic

ES radic times times

BT radic radic radic

CTG radic radic times

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] F Tan X Fu Y Zhang and A G Bourgeois ldquoA geneticalgorithm-based method for feature subset selectionrdquo SoftComputing vol 12 no 2 pp 111ndash120 2008

[2] G Georgoulas C D Stylios and P P Groumpos ldquoPredictingthe risk of metabolic acidosis for newborns based on fetal heartrate signal classification using support vector machinesrdquo IEEETransactions on Biomedical Engineering vol 53 no 5 pp 875ndash884 2006

[3] G G Georgoulas C D Stylios G Nokas and P P GroumposldquoClassification of fetal heart rate during labour using hiddenmarkov modelsrdquo in Proceedings of the IEEE International JointConference on Neural Networks vol 3 pp 2471ndash2475 July 2004


[4] N Krupa M A MA E Zahedi S Ahmed and F MHassan ldquoAntepartum fetal heart rate feature extraction andclassification using empirical mode decomposition and supportvector machinerdquo BioMedical Engineering Online vol 10 article6 2011

[5] R Czabanski J Jezewski J Wrobel and J Jezewski ldquoPredictingthe risk of low-fetal birth weight from cardiotocographicsignals using ANBLIR system with deterministic annealingand 120576-insensitive learningrdquo IEEE Transactions on InformationTechnology in Biomedicine vol 14 no 4 pp 1062ndash1074 2010

[6] H Ocak ldquoA medical decision support system based on supportvector machines and the genetic algorithm for the evaluation offetal well-beingrdquo Journal ofMedical Systems vol 37 no 2 article9913 2013

[7] H Ocak and H M Ertunc ldquoPrediction of fetal state fromthe cardiotocogram recordings using adaptive neuro-fuzzyinference systemsrdquo Neural Computing and Applications vol 23no 6 pp 1583ndash1589 2013

[8] E Yılmaz and C Kılıkcıer ldquoDetermination of fetal state fromcardiotocogram using LS-SVM with particle swarm optimiza-tion and binary decision treerdquoComputational andMathematicalMethods in Medicine vol 2013 Article ID 487179 8 pages 2013

[9] S N Sivanandam and S N Deepa Introduction to GeneticAlgorithms Springer Berlin Germany 2007

[10] J E Baker ldquoReducing bias and inefficiency in the selectionalgorithmrdquo inProceedings of the 2nd International Conference onGenetic Algorithms onGenetic Algorithms andTheir Applicationpp 14ndash21 1987

[11] Q Guo R M Hierons M Harman and K Derderian ldquoCon-structing multiple unique inputoutput sequences using meta-heuristic optimisation techniquesrdquo IEE Proceedings Softwarevol 152 no 3 pp 127ndash140 2005

[12] N S Choubey andM U Kharat ldquoGrammar induction strategyusing genetic algorithm case study of fifteen toy languagesrdquoThePacific Journal of Science and Technology vol 11 pp 294ndash3002010

[13] B Yuan and W Liu ldquoA measure oriented training scheme forimbalanced classification problemsrdquo inNew Frontiers in AppliedData Mining pp 293ndash303 2012

[14] M Srinivas and L M Patnaik ldquoAdaptive probabilities ofcrossover and mutation in genetic algorithmsrdquo IEEE Transac-tions on Systems Man and Cybernetics vol 24 no 4 pp 656ndash667 1994

[15] W-Y Lin W-Y Lee and T-P Hong ldquoAdapting crossover andmutation rates in genetic algorithmsrdquo Journal of InformationScience and Engineering vol 19 no 5 pp 889ndash903 2003

[16] D Wang X Wang J Tang C Zhu and T Xu ldquoFault diagnosisof the PT fuel system based on improved adaptive geneticalgorithmrdquo Journal of Computational Information Systems vol8 no 9 pp 3651ndash3658 2012

[17] G K Pakhale and P K Gupta ldquoComparison of advancedpixel based (ANN and SVM) and object-oriented classificationapproaches using landsat-7 Etm+ datardquo International Journal ofEngineering and Technology vol 2 no 4 pp 245ndash251 2010

[18] G-B Huang H Zhou X Ding and R Zhang ldquoExtremelearning machine for regression and multiclass classificationrdquoIEEE Transactions on Systems Man and Cybernetics Part BCybernetics vol 42 no 2 pp 513ndash529 2012

[19] S Ding Y Zhang X Xu and L Bao ldquoA novel extremelearning machine based on hybrid kernel functionrdquo Journal ofComputers vol 8 no 8 pp 2110ndash2117 2013

[20] J Xie and C Wang ldquoUsing support vector machines witha novel hybrid feature selection method for diagnosis oferythemato-squamous diseasesrdquo Expert Systems with Applica-tions vol 38 no 5 pp 5809ndash5815 2011

[21] J Xie J Lei W Xie Y Shi and X Liu ldquoTwo-stage hybrid fea-ture selection algorithms for diagnosing erythemato-squamousdiseasesrdquo Health Information Science and Systems vol 1 p 102013

[22] A Ozcift and A Gulten ldquoA robust multi-class feature selec-tion strategy based on rotation forest ensemble algorithm fordiagnosis of erythemato-squamous diseasesrdquo Journal ofMedicalSystems vol 36 no 2 pp 941ndash949 2012

[23] M Nandy ldquoAn analytical study of supervised and unsuper-vised classification methods for breast cancer diagnosisrdquo IJCAProceedings on Computing Communication and Sensor Network2013 vol 2013 no 2 pp 1ndash4 2013

[24] J Estrela da Silva J P Marques de Sa and J Jossinet ldquoClassi-fication of breast tissue by electrical impedance spectroscopyrdquoMedical and Biological Engineering and Computing vol 38 no1 pp 26ndash30 2000

[25] A Einipour ldquoInference system modeling using hybrid evolu-tionary algorithm application to breast cancer data setrdquo Inter-national Journal of Computer Science andNetwork Solutions vol1 pp 21ndash31 2013

[26] C Peng W Chen X Zhu B Wan and D Wei ldquoPathologicalvoice classification based on a single Vowelrsquos acoustic featuresrdquoin Proceedings of the 7th IEEE International Conference onComputer and Information Technology (CIT rsquo07) pp 1106ndash1110October 2007

[27] J-Y Lee S Jeong and M Hahn ldquoClassification of pathologicaland normal voice based on linear discriminant analysisrdquo inAdaptive and Natural Computing Algorithms pp 382ndash390Springer Berlin Germany 2007

[28] M K Arjmandi M Pooyan M Mikaili M Vali and AMoqarehzadeh ldquoIdentification of voice disorders using long-time features and support vector machine with different featurereduction methodsrdquo Journal of Voice vol 25 no 6 pp e275ndashe289 2011

[29] H Xiao ldquoDiagnosis of Parkinsonrsquos disease using genetic algo-rithm and support vector machine with acoustic character-isticsrdquo in Proceedings of the 5th International Conference onBiomedical Engineering and Informatics (BMEI rsquo12) pp 1072ndash1076 Chongqing China October 2012

[30] R A Shirvan and E Tahami ldquoVoice analysis for detectingParkinsonrsquos disease using genetic algorithm and KNN classifi-cation methodrdquo in Proceedings of the 18th Iranian Conference ofBiomedical Engineering (ICBME rsquo11) pp 278ndash283 Tehran IranDecember 2011

[31] E Ephzibah ldquoCost effective approach on feature selection usinggenetic algorithms and LS-SVM classifierrdquo IJCA Special Issueon Evolutionary Computation for Optimization TechniquesECOT 2010

[32] J Hossain N F Mohd Sani A Mustapha and L S AffendeyldquoUsing feature selection as accuracy benchmarking in clinicaldata miningrdquo Journal of Computer Science vol 9 no 7 pp 883ndash888 2013

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014


MEDIATORSINFLAMMATION

of


Behavioural Neurology

EndocrinologyInternational Journal of



Disease Markers


BioMed Research International

OncologyJournal of



Oxidative Medicine and Cellular Longevity


PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of



Computational and Mathematical Methods in Medicine

OphthalmologyJournal of


Diabetes ResearchJournal of



Research and TreatmentAIDS


Gastroenterology Research and Practice


Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom


Table 1 List of features in CTG dataset

Snumber

Name of thefeatures Description

1 LB FHR baseline (beats per minute)2 AC Number of accelerations per second3 FM Number of fetal movements per second4 UC Number of uterine contractions per second5 DL Number of light decelerations per second6 DS Number of severe decelerations per second

7 DP Number of prolonged decelerations persecond

8 ASTV Percentage of time with abnormal shortterm variability

9 MSTV Mean value of short term variability

10 ALTV Percentage of time with abnormal long termvariability

11 MLTV Mean value of long term variability12 Width Width of FHR histogram13 Min Minimum of FHR histogram14 Max Maximum of FHR histogram15 Nmax Number of histogram peaks16 Nzeros Number of histogram zeros17 Mode Histogram mode18 Mean Histogram mean19 Median Histogram median20 Variance Histogram variance

21 Tendency

Histogram tendencyminus1 = left asymmetric0 = symmetric1 = right asymmetric

state and 295 samples belong to the suspect state Featuresof this CTG dataset used for experimentation have beenexplained in Table 1 Single and multiobjective fitness func-tions have been used to gauge the efficacy of the IAGASeveral classifiers like 119896-NN (119896-Nearest Neighbor) SVM(SupportVectorMachine) BN (BayesianNetwork) andELM(Extreme LearningMachine) have been used to discern thesefeatures

With the objective of proving the robustness of thissearch algorithm apart from the CTG dataset five otherbenchmark datasets that is MEEI voice dataset Parkin-sonrsquos (PD) dataset Cleveland Heart Disease (CAD) datasetErythemato-Squamous (ES) dataset and Breast Tissue (BT)dataset have been taken from the UCI machine learningrepository for experimentation and tested with this algo-rithm The main aim is to select the best clinical features ofthese datasets through the proposed IAGAmethod which inturn attains promising classification accuracywith aminimalnumber of features

The rest of this paper is organised as follows Section 2describes the literature works done in this specific problemdomain Section 3 explains about the feature selection carriedout using IAGA Section 4 presents the classifiers employed

and the performance measures Section 5 enunciates the dis-cussion of the classification results of CTG dataset Section 6elaborates the classification results of other datasets andSection 7 concludes the entire work

2 Previous Works on CTG Dataset

Numerous approaches have been investigated using conven-tional and artificial intelligence techniques for feature extrac-tion and also to come out with diagnostic systems [4] In[3] the automatic classification of FHR signal which belongsto hypoxic and normal newborns has been carried outthrough a hidden Markov models (HMM) based approachYet again an ANBLIR (artificial neural network based onlogical interpretation of fuzzy if-then rules) system is usedto evaluate the risk of low-fetal birth weight as normal orabnormal using CTG signals recorded during the pregnancyin [5]

21 Basic GA As an expeditious search strategy GA hasbeen utilized in the assessment of the fetal well-beingFor example Ocak (2013) reported that a genetic basedoptimization followed by SVM classification helps in pre-dicting the substantial features for assessing the fetal stateThe classification accuracy with FHR data set obtained was9923with 13 features [6] An adaptive neurofuzzy inferencesystem (ANFIS) has been proposed for the prediction of fetalstatus from theCTG recordings as normal or pathological [7]

Yılmaz and Kılıkcıer have suggested a combined schemeof binary decision tree (BDT) and particle swarm opti-mization (PSO) for handling this specific classification taskThrough least squares support vector machine (LS-SVM)a good classification accuracy rate of 9162 was achievedBesides this experimentation has resulted in the three-classclassification of the CTG dataset using receiver operationcharacteristic analysis and Cobweb representation [8] How-ever the problem of design of a better genetic based searchheuristic that provides higher classification accuracy alongwith a reduced number of voice features in this specific prob-lem domain is still open In other words the reduced optimalfeature subset must be sufficient enough to classify the datasamples into their respective classes than the previous works

Basic GA deals with candidate solutions which arerepresented by individuals (or chromosomes) in a largepopulation It starts with the random generation of ini-tial set of chromosomes followed by their correspondingfitness evaluation The successive generations are created(iteratively) by the picking of the highly fit individuals ofthe current generation On achieving the eligible individualthat satisfies our constraint the GA cycle is stopped usingthe termination criteria and this individual becomes thesolution of the problem There are three significant geneticoperators called selection crossover and mutation whichhelp in the reproduction process of the chromosomes [9]Ideally a GA cycle takes place using these genetic operatorsand such genetic parameters influencing this process tend toproduce good individuals Such a basic GA cycle is depictedin Figure 1


Start

Fitness evaluation


New generation



Stop

Yes

No









(119905) and 120590


ℎ

(119905)

119909 =

min(0 1 +119891

(119905)119909 minus 119891

(119905)

120590

(119905)) 120590

(119905)= 0

1 120590

(119905)= 0

(1)










Objv2 =119888

prod

119894=1

(119872119894)1119888

(3)

where






Objv3 = (119908)Objv2 + (1 minus 119908) ( 119885119873

) (5)



35 Methods of IAGA



119883 = 119891max minus 119891avg (6)





119875119888 =

1198961 (119891max minus 1198911015840)

(119891max minus 119891avg)119891

1015840ge 119891avg

09 119891

1015840lt 119891avg

119875119898 =

1198962 (119891max minus 11989110158401015840)

(119891max minus 119891avg)119891

10158401015840ge 119891avg

003 119891

10158401015840lt 119891avg

(7)




119875119888

=



1015840ge 119891avg

119875119888max 119891

1015840lt 119891avg

119875119898

=



1015840ge 119891avg

119875119898max 119891

1015840lt 119891avg

(8)










119870(119909 119909119894) = exp(minus

1003817

1003817

1003817

1003817

119909 minus 1199091198941003817

1003817

1003817

1003817

2

2120590

2)

(9)





respectively


119891119871 (119909119895) =

119871

sum

119894

120573119894119892 (119908119894 sdot 119909119895 + 119887119894) 119895 = 1 2 3 119873 (10)







Implement uniform






Implement uniform







GA cycle is stopped







120573 = 119867dagger119879(11)

where 119867dagger = (119867

1015840119867)

minus1119867

1015840 or 119867

1015840(119867119867

1015840)


119891 (119909) = ℎ (119909)119867

1015840(

1

120576

+ 119867119867

1015840)

minus1

119879

(12)









SBS

Chromosome

with masking



Optimal feature


Benchmark datasets

Binary class




with sigma scaling


NN SVM BN and ELM

Multiclass

are met

Terminationcriteria







Sensitivity (SE)

SE = 100 times

TPTP + FN

(13)

Specificity (SP)

SP = 100 times

TNTN + FP

(14)

Accuracy (ACC)

ACC = 100 times


(15)

119865-measure


(16)

where

precision =

TPTP + FP

recall = TPTP + FN

(17)

Area under ROC

ROC =

SE + SP2

(18)


75

80

85

90

95

100

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2

Accu

racy



k-NN accuracy


Kappa statistic

KS =1198750 minus 119875119888

1 minus 119875119888

(19)


119866-mean

119866-mean =

119888

prod

119894=1

(119872119894)1119888

(20)





02468

1012141618


PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2


k-NN features

Num

ber o

f sel

ecte

d fe

atur

es





PCA 16 92146 8960

SFS 11 92106 9144

SBS 11 92716 9055










ELM 9103 9361 264



IAGA-M11620 61 1033 224 142 10 152






0246810121416

7580859095

100105


Feat

ure c

ount

Accu

racy

Datasets





7 Conclusion






















References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of

















Start

Fitness evaluation


New generation



Stop

Yes

No









(119905) and 120590


ℎ

(119905)

119909 =

min(0 1 +119891

(119905)119909 minus 119891

(119905)

120590

(119905)) 120590

(119905)= 0

1 120590

(119905)= 0

(1)










Objv2 =119888

prod

119894=1

(119872119894)1119888

(3)

where






Objv3 = (119908)Objv2 + (1 minus 119908) ( 119885119873

) (5)



35 Methods of IAGA



119883 = 119891max minus 119891avg (6)





119875119888 =

1198961 (119891max minus 1198911015840)

(119891max minus 119891avg)119891

1015840ge 119891avg

09 119891

1015840lt 119891avg

119875119898 =

1198962 (119891max minus 11989110158401015840)

(119891max minus 119891avg)119891

10158401015840ge 119891avg

003 119891

10158401015840lt 119891avg

(7)




119875119888

=



1015840ge 119891avg

119875119888max 119891

1015840lt 119891avg

119875119898

=



1015840ge 119891avg

119875119898max 119891

1015840lt 119891avg

(8)










119870(119909 119909119894) = exp(minus

1003817

1003817

1003817

1003817

119909 minus 1199091198941003817

1003817

1003817

1003817

2

2120590

2)

(9)





respectively


119891119871 (119909119895) =

119871

sum

119894

120573119894119892 (119908119894 sdot 119909119895 + 119887119894) 119895 = 1 2 3 119873 (10)







Implement uniform






Implement uniform







GA cycle is stopped







120573 = 119867dagger119879(11)

where 119867dagger = (119867

1015840119867)

minus1119867

1015840 or 119867

1015840(119867119867

1015840)


119891 (119909) = ℎ (119909)119867

1015840(

1

120576

+ 119867119867

1015840)

minus1

119879

(12)









SBS

Chromosome

with masking



Optimal feature


Benchmark datasets

Binary class




with sigma scaling


NN SVM BN and ELM

Multiclass

are met

Terminationcriteria







Sensitivity (SE)

SE = 100 times

TPTP + FN

(13)

Specificity (SP)

SP = 100 times

TNTN + FP

(14)

Accuracy (ACC)

ACC = 100 times


(15)

119865-measure


(16)

where

precision =

TPTP + FP

recall = TPTP + FN

(17)

Area under ROC

ROC =

SE + SP2

(18)


75

80

85

90

95

100

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2

Accu

racy



k-NN accuracy


Kappa statistic

KS =1198750 minus 119875119888

1 minus 119875119888

(19)


119866-mean

119866-mean =

119888

prod

119894=1

(119872119894)1119888

(20)





02468

1012141618


PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2


k-NN features

Num

ber o

f sel

ecte

d fe

atur

es





PCA 16 92146 8960

SFS 11 92106 9144

SBS 11 92716 9055










ELM 9103 9361 264



IAGA-M11620 61 1033 224 142 10 152






0246810121416

7580859095

100105


Feat

ure c

ount

Accu

racy

Datasets





7 Conclusion






















References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of























Objv2 =119888

prod

119894=1

(119872119894)1119888

(3)

where






Objv3 = (119908)Objv2 + (1 minus 119908) ( 119885119873

) (5)



35 Methods of IAGA



119883 = 119891max minus 119891avg (6)





119875119888 =

1198961 (119891max minus 1198911015840)

(119891max minus 119891avg)119891

1015840ge 119891avg

09 119891

1015840lt 119891avg

119875119898 =

1198962 (119891max minus 11989110158401015840)

(119891max minus 119891avg)119891

10158401015840ge 119891avg

003 119891

10158401015840lt 119891avg

(7)




119875119888

=



1015840ge 119891avg

119875119888max 119891

1015840lt 119891avg

119875119898

=



1015840ge 119891avg

119875119898max 119891

1015840lt 119891avg

(8)










119870(119909 119909119894) = exp(minus

1003817

1003817

1003817

1003817

119909 minus 1199091198941003817

1003817

1003817

1003817

2

2120590

2)

(9)





respectively


119891119871 (119909119895) =

119871

sum

119894

120573119894119892 (119908119894 sdot 119909119895 + 119887119894) 119895 = 1 2 3 119873 (10)







Implement uniform






Implement uniform







GA cycle is stopped







120573 = 119867dagger119879(11)

where 119867dagger = (119867

1015840119867)

minus1119867

1015840 or 119867

1015840(119867119867

1015840)


119891 (119909) = ℎ (119909)119867

1015840(

1

120576

+ 119867119867

1015840)

minus1

119879

(12)









SBS

Chromosome

with masking



Optimal feature


Benchmark datasets

Binary class




with sigma scaling


NN SVM BN and ELM

Multiclass

are met

Terminationcriteria







Sensitivity (SE)

SE = 100 times

TPTP + FN

(13)

Specificity (SP)

SP = 100 times

TNTN + FP

(14)

Accuracy (ACC)

ACC = 100 times


(15)

119865-measure


(16)

where

precision =

TPTP + FP

recall = TPTP + FN

(17)

Area under ROC

ROC =

SE + SP2

(18)


75

80

85

90

95

100

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2

Accu

racy



k-NN accuracy


Kappa statistic

KS =1198750 minus 119875119888

1 minus 119875119888

(19)


119866-mean

119866-mean =

119888

prod

119894=1

(119872119894)1119888

(20)





02468

1012141618


PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2


k-NN features

Num

ber o

f sel

ecte

d fe

atur

es





PCA 16 92146 8960

SFS 11 92106 9144

SBS 11 92716 9055










ELM 9103 9361 264



IAGA-M11620 61 1033 224 142 10 152






0246810121416

7580859095

100105


Feat

ure c

ount

Accu

racy

Datasets





7 Conclusion






















References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of


















119875119888

=



1015840ge 119891avg

119875119888max 119891

1015840lt 119891avg

119875119898

=



1015840ge 119891avg

119875119898max 119891

1015840lt 119891avg

(8)










119870(119909 119909119894) = exp(minus

1003817

1003817

1003817

1003817

119909 minus 1199091198941003817

1003817

1003817

1003817

2

2120590

2)

(9)





respectively


119891119871 (119909119895) =

119871

sum

119894

120573119894119892 (119908119894 sdot 119909119895 + 119887119894) 119895 = 1 2 3 119873 (10)







Implement uniform






Implement uniform







GA cycle is stopped







120573 = 119867dagger119879(11)

where 119867dagger = (119867

1015840119867)

minus1119867

1015840 or 119867

1015840(119867119867

1015840)


119891 (119909) = ℎ (119909)119867

1015840(

1

120576

+ 119867119867

1015840)

minus1

119879

(12)









SBS

Chromosome

with masking



Optimal feature


Benchmark datasets

Binary class




with sigma scaling


NN SVM BN and ELM

Multiclass

are met

Terminationcriteria







Sensitivity (SE)

SE = 100 times

TPTP + FN

(13)

Specificity (SP)

SP = 100 times

TNTN + FP

(14)

Accuracy (ACC)

ACC = 100 times


(15)

119865-measure


(16)

where

precision =

TPTP + FP

recall = TPTP + FN

(17)

Area under ROC

ROC =

SE + SP2

(18)


75

80

85

90

95

100

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2

Accu

racy



k-NN accuracy


Kappa statistic

KS =1198750 minus 119875119888

1 minus 119875119888

(19)


119866-mean

119866-mean =

119888

prod

119894=1

(119872119894)1119888

(20)





02468

1012141618


PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2


k-NN features

Num

ber o

f sel

ecte

d fe

atur

es





PCA 16 92146 8960

SFS 11 92106 9144

SBS 11 92716 9055










ELM 9103 9361 264



IAGA-M11620 61 1033 224 142 10 152






0246810121416

7580859095

100105


Feat

ure c

ount

Accu

racy

Datasets





7 Conclusion






















References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of





















Implement uniform






Implement uniform







GA cycle is stopped







120573 = 119867dagger119879(11)

where 119867dagger = (119867

1015840119867)

minus1119867

1015840 or 119867

1015840(119867119867

1015840)


119891 (119909) = ℎ (119909)119867

1015840(

1

120576

+ 119867119867

1015840)

minus1

119879

(12)









SBS

Chromosome

with masking



Optimal feature


Benchmark datasets

Binary class




with sigma scaling


NN SVM BN and ELM

Multiclass

are met

Terminationcriteria







Sensitivity (SE)

SE = 100 times

TPTP + FN

(13)

Specificity (SP)

SP = 100 times

TNTN + FP

(14)

Accuracy (ACC)

ACC = 100 times


(15)

119865-measure


(16)

where

precision =

TPTP + FP

recall = TPTP + FN

(17)

Area under ROC

ROC =

SE + SP2

(18)


75

80

85

90

95

100

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2

Accu

racy



k-NN accuracy


Kappa statistic

KS =1198750 minus 119875119888

1 minus 119875119888

(19)


119866-mean

119866-mean =

119888

prod

119894=1

(119872119894)1119888

(20)





02468

1012141618


PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2


k-NN features

Num

ber o

f sel

ecte

d fe

atur

es





PCA 16 92146 8960

SFS 11 92106 9144

SBS 11 92716 9055










ELM 9103 9361 264



IAGA-M11620 61 1033 224 142 10 152






0246810121416

7580859095

100105


Feat

ure c

ount

Accu

racy

Datasets





7 Conclusion






















References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of


















SBS

Chromosome

with masking



Optimal feature


Benchmark datasets

Binary class




with sigma scaling


NN SVM BN and ELM

Multiclass

are met

Terminationcriteria







Sensitivity (SE)

SE = 100 times

TPTP + FN

(13)

Specificity (SP)

SP = 100 times

TNTN + FP

(14)

Accuracy (ACC)

ACC = 100 times


(15)

119865-measure


(16)

where

precision =

TPTP + FP

recall = TPTP + FN

(17)

Area under ROC

ROC =

SE + SP2

(18)


75

80

85

90

95

100

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2

Accu

racy



k-NN accuracy


Kappa statistic

KS =1198750 minus 119875119888

1 minus 119875119888

(19)


119866-mean

119866-mean =

119888

prod

119894=1

(119872119894)1119888

(20)





02468

1012141618


PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2


k-NN features

Num

ber o

f sel

ecte

d fe

atur

es





PCA 16 92146 8960

SFS 11 92106 9144

SBS 11 92716 9055










ELM 9103 9361 264



IAGA-M11620 61 1033 224 142 10 152






0246810121416

7580859095

100105


Feat

ure c

ount

Accu

racy

Datasets





7 Conclusion






















References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of

















75

80

85

90

95

100

PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2

Accu

racy



k-NN accuracy


Kappa statistic

KS =1198750 minus 119875119888

1 minus 119875119888

(19)


119866-mean

119866-mean =

119888

prod

119894=1

(119872119894)1119888

(20)





02468

1012141618


PCA

SFS

SBS

Basic

GA

IAG

A-M

1

IAG

A-M

2


k-NN features

Num

ber o

f sel

ecte

d fe

atur

es





PCA 16 92146 8960

SFS 11 92106 9144

SBS 11 92716 9055










ELM 9103 9361 264



IAGA-M11620 61 1033 224 142 10 152






0246810121416

7580859095

100105


Feat

ure c

ount

Accu

racy

Datasets





7 Conclusion






















References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of



















ELM 9103 9361 264



IAGA-M11620 61 1033 224 142 10 152






0246810121416

7580859095

100105


Feat

ure c

ount

Accu

racy

Datasets





7 Conclusion






















References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of




































References







































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of



















































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of
















Research Article A Novel Clinical Decision Support System...

Documents

Transcript of Research Article A Novel Clinical Decision Support System...