Research Article A Study of Moment Based Features...

18
Research Article A Study of Moment Based Features on Handwritten Digit Recognition Pawan Kumar Singh, Ram Sarkar, and Mita Nasipuri Department of Computer Science and Engineering, Jadavpur University, 188 Raja S. C. Mullick Road, Kolkata, West Bengal 700032, India Correspondence should be addressed to Pawan Kumar Singh; [email protected] Received 3 November 2015; Revised 16 January 2016; Accepted 27 January 2016 Academic Editor: Miin-Shen Yang Copyright © 2016 Pawan Kumar Singh et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Handwritten digit recognition plays a significant role in many user authentication applications in the modern world. As the handwritten digits are not of the same size, thickness, style, and orientation, therefore, these challenges are to be faced to resolve this problem. A lot of work has been done for various non-Indic scripts particularly, in case of Roman, but, in case of Indic scripts, the research is limited. is paper presents a script invariant handwritten digit recognition system for identifying digits written in five popular scripts of Indian subcontinent, namely, Indo-Arabic, Bangla, Devanagari, Roman, and Telugu. A 130-element feature set which is basically a combination of six different types of moments, namely, geometric moment, moment invariant, affine moment invariant, Legendre moment, Zernike moment, and complex moment, has been estimated for each digit sample. Finally, the technique is evaluated on CMATER and MNIST databases using multiple classifiers and, aſter performing statistical significance tests, it is observed that Multilayer Perceptron (MLP) classifier outperforms the others. Satisfactory recognition accuracies are attained for all the five mentioned scripts. 1. Introduction e field of automated reading of printed or handwritten doc- uments by the electronic devices is known as Optical Char- acter Recognition (OCR) system, which is broadly defined as the process of recognizing either printed or handwritten text from document images and converting it into electronic form. OCR systems can contribute tremendously to the advancement of the automation process and can improve the interaction between man and machine in many applications, including office automation, bank check verification, postal automation, and a large variety of business and data entry applications. Handwritten digit recognition is the method of recognizing and classifying handwritten digits from 0 to 9 without human interaction [1]. Although the recognition of handwritten numerals has been studied for more than three decades and many techniques with high accuracy rates have already been developed, the research in this area continues with the aim of improving the recognition rates further. Handwritten digit recognition is a complex problem due to the fact that variation exists in writing style of different writers. e phenomenon that makes the problem more challenging is the inherent variation in writing styles at different instances. Due to this reason, building a generic recognizer that is capable of recognizing handwritten digits written by diverse writers is not always feasible [2]. However, the extraction of the most informative features with highly discriminatory ability to improve the classification accuracy with reduced complexity remains one of the most important problems for this task. It is a task of great importance for which there are standard databases that allow different approaches to be compared and validated. India is a multilingual country with 23 constitutionally recognized languages written in 12 major scripts [1]. Besides these, hundreds of other languages are used in India, each one with a number of dialects. e officially recognized languages are Hindi, Bengali, Punjabi, Marathi, Gujarati, Oriya, Sindhi, Assamese, Nepali, Urdu, Sanskrit, Tamil, Telugu, Kannada, Malayalam, Kashmiri, Manipuri, Konkani, Maithili, Santhali, Bodo, English, and Dogri. e 12 major scripts used to write these languages are Devanagari, Bangla, Oriya, Gujarati, Gurumukhi, Tamil, Telugu, Kannada, Malayalam, Manipuri, Hindawi Publishing Corporation Applied Computational Intelligence and So Computing Volume 2016, Article ID 2796863, 17 pages http://dx.doi.org/10.1155/2016/2796863

Transcript of Research Article A Study of Moment Based Features...

Page 1: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Research ArticleA Study of Moment Based Features onHandwritten Digit Recognition

Pawan Kumar Singh Ram Sarkar and Mita Nasipuri

Department of Computer Science and Engineering Jadavpur University 188 Raja S C Mullick RoadKolkata West Bengal 700032 India

Correspondence should be addressed to Pawan Kumar Singh pawansinghjugmailcom

Received 3 November 2015 Revised 16 January 2016 Accepted 27 January 2016

Academic Editor Miin-Shen Yang

Copyright copy 2016 Pawan Kumar Singh et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

Handwritten digit recognition plays a significant role in many user authentication applications in the modern world As thehandwritten digits are not of the same size thickness style and orientation therefore these challenges are to be faced to resolvethis problem A lot of work has been done for various non-Indic scripts particularly in case of Roman but in case of Indicscripts the research is limited This paper presents a script invariant handwritten digit recognition system for identifying digitswritten in five popular scripts of Indian subcontinent namely Indo-Arabic BanglaDevanagari Roman and Telugu A 130-elementfeature set which is basically a combination of six different types of moments namely geometric moment moment invariant affinemoment invariant Legendremoment Zernikemoment and complexmoment has been estimated for each digit sample Finally thetechnique is evaluated on CMATER and MNIST databases using multiple classifiers and after performing statistical significancetests it is observed that Multilayer Perceptron (MLP) classifier outperforms the others Satisfactory recognition accuracies areattained for all the five mentioned scripts

1 Introduction

Thefield of automated reading of printed or handwritten doc-uments by the electronic devices is known as Optical Char-acter Recognition (OCR) system which is broadly definedas the process of recognizing either printed or handwrittentext from document images and converting it into electronicform OCR systems can contribute tremendously to theadvancement of the automation process and can improve theinteraction between man and machine in many applicationsincluding office automation bank check verification postalautomation and a large variety of business and data entryapplications Handwritten digit recognition is the method ofrecognizing and classifying handwritten digits from 0 to 9without human interaction [1] Although the recognition ofhandwritten numerals has been studied for more than threedecades and many techniques with high accuracy rates havealready been developed the research in this area continueswith the aim of improving the recognition rates further

Handwritten digit recognition is a complex problem dueto the fact that variation exists in writing style of different

writers The phenomenon that makes the problem morechallenging is the inherent variation in writing styles atdifferent instances Due to this reason building a genericrecognizer that is capable of recognizing handwritten digitswritten by diverse writers is not always feasible [2] Howeverthe extraction of the most informative features with highlydiscriminatory ability to improve the classification accuracywith reduced complexity remains one of the most importantproblems for this task It is a task of great importancefor which there are standard databases that allow differentapproaches to be compared and validated

India is a multilingual country with 23 constitutionallyrecognized languages written in 12 major scripts [1] Besidesthese hundreds of other languages are used in India each onewith a number of dialectsThe officially recognized languagesare Hindi Bengali Punjabi Marathi Gujarati Oriya SindhiAssamese Nepali Urdu Sanskrit Tamil Telugu KannadaMalayalam KashmiriManipuri KonkaniMaithili SanthaliBodo English and Dogri The 12 major scripts used to writethese languages are Devanagari Bangla Oriya GujaratiGurumukhi Tamil Telugu Kannada Malayalam Manipuri

Hindawi Publishing CorporationApplied Computational Intelligence and So ComputingVolume 2016 Article ID 2796863 17 pageshttpdxdoiorg10115520162796863

2 Applied Computational Intelligence and Soft Computing

Roman and Urdu In a multilingual country like India it is acommon scenario that a document like job application formrailway ticket reservation form and so forth is composed oftext contents written in different languagesscripts in orderto reach a larger cross section of people The variation ofdifferent scripts may be in the form of numerals or alphanumerals in a single document page But the techniquesdeveloped for text identification generally do not incorporatethe recognition of digitsThis is because the features requiredfor the text identification may not be applicable for identify-ing the digits

The paper is organized as follows Section 2 presents abrief review of some of the previous approaches to hand-written digit recognition whereas in Section 3 we introduceour script independent handwritten digit recognition systemSection 4describes the performance of our systemon realisticdatabases of handwritten digits and finally Section 5 con-cludes the paper

2 Review of Related Works

Gorgevik and Cakmakov [3] developed Support VectorMachine (SVM) based digits recognition system for hand-written Roman numerals They extracted four types of fea-tures from each digit image (1) projection histograms (2)contour profiles (3) ring-zones and (4) Kirsch featuresTheyreported 9727 recognition accuracy on National Instituteof Standards and Technology (NIST) handwritten digitsdatabase [4] In [5] Chen et al proposed max-min poste-rior pseudoprobabilities framework for Roman handwrittendigit recognition They extracted 256 dimension directionalfeatures from the input image Finally these features weretransformed into a set of 128 features using Principal Com-ponent Analysis (PCA) They reported recognition accuracyof 9876 on NIST database [4] Labusch et al [6] describeda sparse coding based feature extraction method with SVMas a classifier They found recognition accuracy of 9941on MNIST (Modified NIST) handwritten digits database [7]The work described in [8] combined three recognizers bymajority vote and one of them is based on Kirsch gradient(four orientations) dimensionality reduction by PCA andclassification by SVM They achieved an accuracy rate of9505 with 093 error on 10000 test samples of MNISTdatabase [7] Mane and Ragha [9] performed handwrittendigit recognition using elastic image matching techniquebased on eigendeformation which is estimated by the PCAof actual deformations automatically selected by the elasticmatching They achieved an overall accuracy of 9491 ontheir owndatabase collected fromdifferent individuals of var-ious professions for the experiment Cruz et al [10] presenteda handwritten digit recognition system which uses multiplefeature extraction methods and classifier ensemble A totalof six feature extraction algorithms namely MultizoningModified Edge Maps Structural Characteristics ProjectionsConcavities Measurements and Gradient Directional wereevaluated in this paper A scheme using neural networks as acombiner achieved a recognition rate of 9968 on a trainingset of 60000 images and a test set of 10000 images of MNISTdatabase

Dhandra et al [11] investigated a script independentautomatic numeral recognition system for recognition ofKannada Telugu and Devanagari handwritten numerals Inthe proposed method 30 classes were reduced to 18 classesby extracting the global and local structural features likedirectional density estimation water reservoirs maximumprofile distances and fill-hole density Finally a probabilisticneural network (PNN) classifier was used for the recognitionsystem which yielded an accuracy of 9720 on a totalof 2550 numeral images written in Kannada Telugu andDevanagari scripts In [12] Yang et al proposed supervisedmatrix factorization method used directly as multiclass clas-sifier They reported recognition accuracy of 9871 withsupervised learning approach on MNIST database [7] In[13] a mixture of multiclass logistic regression models wasdescribed They claimed recognition accuracy of 98 onthe Indian digit database provided by CENPARMI [14] Daset al [15] described a technique for creating a pool of localregions and selection of an optimal set of local regions fromthat pool for extracting optimal discriminating informationfor handwritten Bangla digit recognition Genetic algorithm(GA) was then applied on these local regions to samplethe best discriminating features The features extracted fromthese selected local regions were then classified with SVMand recognition accuracy of 97 was achieved In [16] awavelet analysis based technique for feature extraction wasreported For classification SVM and k-Nearest Neighbor (k-NN) were used and an overall recognition accuracy of 9704was reported on MNIST digit database [7] A comparativestudy in [17] was conducted by training the neural networkusing Backpropagation (BP) algorithm and further usingPCA for feature extraction Digit recognition was finallycarried out using 13 algorithms neural network algorithmand the Fisher Discriminant Analysis (FDA) algorithm TheFDA algorithm proved less efficient with an overall accuracyof 7767 whereas the BP algorithm with PCA for its featureextraction gave an accuracy of 912

In [18] a set of structural features (namely number ofholes water reservoirs in four directions maximum profiledistances in four directions and fill-hole density) and k-NNclassifier were employed for classification and recognition ofhandwritten digits They reported recognition accuracy of9694 on 5000 samples of MNIST digit database [7] In[19] AlKhateeb and Alseid proposed an Arabic handwrittendigit recognition system using Dynamic Bayesian NetworkThey employed DCT coefficients based features for classifi-cation The system was tested on Indo-Arabic digits database(ADBase) which contains 70000 Indo-Arabic digits [20] andan average recognition accuracy of 8526 was achieved on10000 samples Ebrahimzadeh and Jampour [21] proposedan appearance feature-based approach using Histogram ofOriented Gradients (HOG) for handwritten digit recogni-tion A linear SVM was then used for classification of thedigits in MNIST dataset and an overall accuracy of 9725had been realized Gil et al [22] presented a novel approachusing SVM binary classifiers and unbalanced decision treesTwo classifiers were proposed in this study where one usedthe digit characteristics as input and the other used thewhole image as such It is observed that a handwritten

Applied Computational Intelligence and Soft Computing 3

digit recognition accuracy of 100 was achieved on MNISTdatabase using the whole image as input El Qacimy et al[23] investigated the effectiveness of four feature extractionapproaches based on Discrete Cosine Transform (DCT)namely DCT upper left corner (ULC) coefficients DCTzigzag coefficients block based DCT ULC coefficients andblock based DCT zigzag coefficients The coefficients of eachDCT variant were used as input data for SVM classifier andit was found that block based DCT zigzag feature extractionyielded a superior recognition accuracy of 9876 onMNISTdatabase AL-Mansoori [24] implemented aMLP classifier torecognize and predict handwritten digits A dataset of 5000samples were obtained from MNIST database and an overallaccuracy of 9932 was achieved

From the above literature it is clear thatmost of the workshave been done for the Roman script whereas relatively fewworks [11 15 19] have been reported for the digit recognitionwritten in Indic scripts The main reasons for this slowprogress could be attributed to the complexity of the shapeof Indic scripts as opposed to Roman script Again thediscriminating power of the features exploited till now isnot easily measurable investigative experimentations will benecessary for identifying new feature descriptors for effec-tive classification of complex handwritten digits of differentscripts It is also revealed that the methods described inthe literature suffer from larger computational time mainlydue to feature extraction from large dataset In addition theabove recognition systems fail to meet the desired accuracywhen exposed to different multiscript scenario Hence itwould be beneficial for multilingual country like India ifthere is a method which is independent of script and yieldsreasonable recognition accuracy This has motivated us tointroduce a script invariant handwritten digit recognitionsystem for identifying digits written in five popular scriptsnamely Indo-ArabicBanglaDevanagariRoman andTeluguThe key module of the proposed methodology is shown inFigure 1

3 Feature Extraction Methodology

One of the basic problems in the design of any patternrecognition system is the selection of a set of appropriatefeatures to be extracted from the object of interest Researchon the utilization of moments for object characterizationin both invariant and noninvariant tasks has received con-siderable attention in recent years Describing digit imageswith moments instead of other more commonly used patternrecognition features (described in [21ndash23]) means that globalproperties of the digit image are used rather than localproperties So for the present work we considered amomentbased approach which is described in the next subsection

31 Moments Moments are pure statistical measure of pixeldistribution around the center of gravity of the image andallow capturing global shapes information [25]Theydescribenumerical quantities at some distance from a referencepoint or axis Moments are commonly used in statisticsto characterize the distribution of random variables and

Handwritten digit images

Feature extraction using moment based features

Classification of digits using multiple classifiers

Statistical significance tests for comparison of multipleclassifiers using multiple datasets

Selection of appropriate classifier

Detailed performance evaluation of chosen classifierusing training and test sets

Figure 1 Schematic diagram illustrating the key modules of theproposed methodology

similarly in mechanics to characterize bodies by their spatialdistribution of mass

A complete characterization of moment functional over aclass of univariate functions was given by Hausdorff [26] in1921

Let 120583119899 be a real sequence of numbers and let us define

Δ119898

120583119899=

119898

sum119894=0

(minus1)119894

(119898

119894)120583119899+119894 (1)

Note that Δ119898120583119899can be viewed as the 119898th order derivative of

120583119899By theHausdorff theorem a necessary and sufficient con-

dition that there exists a monotonic function 119865(119909) satisfyingthe system

120583119899= int1

0

119909119899

119889119865 (119909) 119899 = 0 1 2 (2)

is that the system of linear inequalities

Δ119896

120583119899ge 0 119896 = 0 1 2 (3)

should be satisfied that is if 119891(119909) is a positive function (incase of image processing) then the set of functionals

int1

0

119909119899

119891 (119909) 119889119909 119899 = 0 1 (4)

completely characterizes the functionA necessary and sufficient condition that there exists a

function 119865(119909) of bounded variation satisfying (7) is that thesequence

119901

sum119898=0

(119901

119898)1003816100381610038161003816Δ119901minus119898

120583119898

1003816100381610038161003816 119901 = 0 1 2 (5)

4 Applied Computational Intelligence and Soft Computing

should be bounded The use of moments for image analysisis straightforward if we consider a binary or gray level imagesegment as a two-dimensional density distribution functionIt can be assumed that an image can be represented by a real-valued measurable function 119891(119909 119910) In this way momentsmay be used to characterize an image segment and extractproperties that have analogies in statistics and mechanics Inimage processing and computer vision an image momentis a certain particular weighted average (moment) of theimage pixelsrsquo intensities or a function of such momentsusually chosen to have some attractive property or interpre-tation The first significant work considering moments forpattern recognition was performed by Hu [27] He derivedrelative and absolute combinations of moment values thatare invariant with respect to scale position and orientationbased on the theories of invariant algebra that deal with theproperties of certain classes of algebraic expressions whichremain invariant under general linear transformations Sizeinvariant moments are derived from algebraic invariants butcan be shown to be the result of simple size normalizationTranslation invariance is achieved by computing momentsthat have been translated by the negative distance to thecentroid and thus normalized so that the center of mass ofthe distribution is at the origin (central moments)

32 Geometric Moments Geometric moments are defined asthe projection of the image intensity function119891(119909 119910) onto themonomial 119909119901119910119902 [25] The (119901 + 119902)th order geometric moment119872119901119902

of a gray level image 119891(119909 119910) is defined as

119872119901119902= ∬infin

minusinfin

119909119901

119910119902

119891 (119909 119910) 119889119909 119889119910 (6)

where 119901 119902 = 0 1 2 infin Note that the monomial product119909119901

119910119902 is the basis function for this moment definition A set

of 119899 moments consists of all 119872119901119902rsquos for 119901 + 119902 le 119899 that is

the set contains (12)(119899 + 1)(119899 + 2) elements If 119891(119909 119910) ispiecewise continuous and contains nonzero values only ina finite region of the 119909119910-plane then the moment sequence119872119901119902 is uniquely determined by 119891(119909 119910) and conversely

119891(119909 119910) is uniquely determined by 119872119901119902 Considering the

fact that an image segment has finite area or in the worst caseis piecewise continuous moments of all orders exist and acomplete moment set can be computed and used uniquely todescribe the information contained in the image Howeverobtaining all the information contained in the image requiresan infinite number of moment values Therefore to selecta meaningful subset of the moment values that containsufficient information to characterize the image uniquely fora specific application becomes very important In case of adigital image of size 119872 times 119873 the double integral in (6) isreplaced by a summation which turns into this simplifiedform

119898119901119902=

119872

sum119909=1

119873

sum119910=1

119909119901

119910119902

119891 (119909 119910) (7)

where 119901 119902 = 0 1 2 are integersWhen119891(119909 119910) changes by translating rotating or scaling

then the image may be positioned such that its center of mass

(COM) is coincided with the origin of the field of view thatis (119909 = 0) and (119910 = 0) and then the moments computedfor that object are referred to as central moment [25] and it isdesignated by 120583

119901119902 The simplified form of central moment of

order (119901 + 119902) is defined as follows

120583119901119902=

119872

sum119909=1

119873

sum119910=1

(119909 minus 119909)119901

(119910 minus 119910)119902

119891 (119909 119910) (8)

where 119909 = 1198981011989800and 119910 = 119898

0111989800

The pixel point (119909 119910) is theCOMof the imageThe centralmoments 120583

119901119902computed using the centroid of the image are

equivalent to 119898119901119902

whose center has been shifted to centroidof the image Therefore the central moments are invariantto image translations Scale invariance can be obtained bynormalization The normalized central moments denoted by120578119901119902

are defined as

120578119901119902=120583119901119902

120583120574

00

(9)

where 120574 = (119901 + 119902)2 + 1 for (119901 + 119902) = 2 3 The second order moments 120578

02 12057811 12057820 known as the

moments of inertia may be used to determine an importantimage feature called orientation [25] Here the feature valuesF1ndashF3 have been computed from moments of inertia of theword images In general the orientation of an image describeshow the image lies in the field of view or the directions of theprincipal axes In terms of moments the orientation of theprincipal axis 120579 taken as feature value F4 is given by

120579 =1

2tanminus1 (

212058311

12058320minus 12058302

) (10)

where 120579 is the angle of the principal axis nearest to the 119909-axis and is in the range minus1205874 le 120579 le 1205874 The minimumandmaximumdistances (119903min and 119903max) between the centroidand the boundary of an image are also feature descriptorsTheratio 119903max119903min is called elongation or eccentricity (F5) and canbe defined in terms of central moments as follows

119890 =(12058320minus 12058302)2

+ 41205832

11

12058300

(11)

33 Moment Invariants Based on the theory of algebraicinvariants Hu [27] derived relative and absolute combina-tions of moments that are invariant with respect to scaleposition and orientation The method of moment invariantsis derived from algebraic invariants applied to the momentgenerating function under a rotation transformationThe setof absolute moment invariants consists of a set of nonlinearcombinations of central moment values that remain invariantunder rotation A set of seven invariant moments can bederived based on the normalized central moments of order

Applied Computational Intelligence and Soft Computing 5

three that are invariant with respect to image scale transla-tion and rotation Consider

1206011= 12057820+ 12057802

1206012= (12057820minus 12057802)2

+ 41205782

11

1206013= (12057830minus 312057812)2

+ (312057821minus 12057803)2

1206014= (12057830+ 12057812)2

+ (12057821+ 12057803)2

1206015= (12057830minus 312057812) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057821minus 12057803)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

1206016= (12057820minus 12057802) [(12057830+ 12057812)2

minus (12057821+ 12057803)2

]

+ 412057811(12057830+ 12057812) (12057821+ 12057803)

1206017= (312057821minus 12057803) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057812minus 12057830)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

(12)

This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work

34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows

1198681=

1

120583400

(1205832012058302minus 1205832

11)

1198682=

1

1205831000

(1205832

301205832

03minus 612058330120583211205831212058303+ 4120583301205833

12

+ 4120583031205833

21minus 31205832

211205832

12)

1198683=

1

120583700

(12058320(1205832112058303minus 1205832

12) minus 12058311(1205833012058303minus 1205832112058312)

+ 12058302(1205833012058312minus 1205832

21))

1198684=

1

1205831100

(1205833

201205832

03minus 61205832

20120583111205831212058303minus 61205832

20120583211205830212058303

+ 91205832

20120583021205832

12+ 12120583

201205832

111205830312058321

+ 61205832012058311120583021205833012058303minus 18120583

2012058311120583021205832112058312

minus 81205833

111205830312058330minus 6120583201205832

021205833012058312+ 9120583201205832

021205832

21

+ 121205832

11120583021205833012058312minus 61205832

111205832

021205833012058321+ 1205832

021205832

30)

1198685=

1

120583600

(1205834012058304minus 41205833112058313+ 31205832

22)

1198686=

1

120583900

(120583401205830412058322+ 2120583311205832212058313minus 120583401205832

13minus 120583041205832

31

minus 1205833

22)

(13)

A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work

35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows

119871119901119902=(2119901 + 1) (2119902 + 1)

4

sdot ∬+1

minus1

119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910

(14)

where the kernel function 119875119901(119909) denotes the 119901th-order

Legendre polynomial and is given by

119875119901(119909) =

119901

sum119896=0

119862119901119896[(1 minus 119909)

119896

+ (minus1)119901

(1 + 119909)119896

] (15)

where

119862119901119896=(minus1)119896

2119896+1

(119901 + 119896)

(119901 minus 119896) (119896)2 (16)

Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871

119901119902 defined by (14) is usually

approximated by the formula

119871119901119902

=(2119901 + 1) (2119902 + 1)

(119873 minus 1)2

119873

sum119894=1

119873

sum119895=1

119875119901(119909119894) 119875119902(119910119895) 119891 (119909

119894 119910119895)

(17)

where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910

119895= (2119895minus119873minus1)(119873minus1)

and for a binary image 119891(119909119894 119910119895) is given as

119891 (119909119894 119910119895) =

1 if (119894 119895) is in original object

0 otherwise(18)

As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form

119901119902=(2119901 + 1) (2119902 + 1)

4

119873

sum119894=1

119873

sum119895=1

ℎ119901119902(119909119894119910119895) 119891 (119909

119894 119910119895) (19)

6 Applied Computational Intelligence and Soft Computing

Legendre polynomials

minus05 0 05 1minus1

x

Pn(x)

P0(x)

P1(x)

P2(x)

P3(x)

P4(x)

P5(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

(a)

Legendre polynomials

minus05 0 05 1minus1

x

P6(x)

P7(x)

P8(x)

P9(x)

P10(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

Pn(x)

(b)

Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875

1(119909) to 119875

5(119909) and (b) 119875

6(119909) to 119875

10(119909)

where

ℎ119901119902(119909119894 119910119895) = int

119909119894+Δ1199092

119909119894minusΔ1199092

int119910119895+Δ1199102

119910119895minusΔ1199102

119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)

withΔ119909 = 119909119894minus119909119894minus1

= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1

= 2(119873minus1)To evaluate the double integral ℎ

119901119902(119909119894 119910119895) defined by (20)

an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments

119901119902defined by (19) Therefore this

method requires a large number of computing operations Asone can see

119901119902can be expressed with the help of a useful

formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be

derived based on the set of invariant moments found in theprevious subsection

11987100= 12060100

11987110= (

3

4) 12060110

11987101= (

3

4) 12060101

11987120= (

5

4) [(

3

2) 12060120minus (

1

2) 12060100]

11987102= (

5

4) [(

3

2) 12060102minus (

1

2) 12060100]

11987111= (

9

4) 12060111

11987130= (

7

4) [(

5

2) 12060130minus (

3

2) 12060110]

11987103= (

7

4) [(

5

2) 12060103minus (

3

2) 12060101]

11987121= (

15

4) [(

3

2) 12060121minus (

1

2) 12060101]

11987112= (

15

4) [(

3

2) 12060112minus (

1

2) 12060110]

(21)

36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows

119909 = 119903 cos 120579

119910 = 119903 sin 120579(22)

Applied Computational Intelligence and Soft Computing 7

where

119903 = radic1199092 + 1199102

120579 = tanminus1 (119910

119909)

(23)

An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation

Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881

119899119898(119909 119910) The form of these polynomials is as

follows

119881119899119898(119909 119910) = 119881

119899119898(119901 120579) = 119877

119899119898(120588) exp (119895119898120579) (24)

where

119899 positive integer or zero

119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899

120588 length of vector from origin to 119891(119909 119910) pixel

120579 angle between vector 120588 and 119909-axis in counterclock-wise direction

As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows

119881119899119898

=119899 + 1

120587∬119891(119909 119910)119881

lowast

119899119898(120588 120579) 119889119909 119889119910 (25)

in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and

119881119899119898

=119899 + 1

120587int2120587

0

int1

0

119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(26)

in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by

119881119903

119899119898=119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(27)

By change of variable 1205791= 120579 minus 120572

119881119903

119899119898=119899 + 1

120587int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588)

sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [

119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579

1) 120588 119889120588 119889120579]

sdot exp (minus119895119898120572) = 119881119899119898

exp (minus119895119898120572)

(28)

Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881

119899119898| can be taken as a rotation invariant

feature of the underlying image function The real-valuedradial polynomial 119877

119899119898(120588) is defined as follows

119877119899119898(120588) =

(119899minus|119898|)2

sum119909=0

(minus1)119904

sdot(119899 minus 119904)

119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904

(29)

where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional

moments 120583119901119902

as follows

119885119899119897=(119899 + 1)

120587

sdot

119899

sum119896=119897

119902

sum119895=0

119897

sum119898=0

(minus119894)119898

(119902

119895)(

119897

119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898

(30)

Zernikemomentsmay bemore easily derived from rotationalmoments119863

119899119896 by

119885119899119897=

119899

sum119896=119897

119861119899119897119896119863119899119896 (31)

When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909

2

+ 1199102

le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows

(1) The magnitude of Zernike moment has rotationalinvariant property

(2) They are robust to noise and shape variations to someextent

(3) Since the basis is orthogonal they have minimumredundant information

8 Applied Computational Intelligence and Soft Computing

Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10

Order(119899)

Zernike moments of order 119899with repetition119898 (119881

119899119898)

Total number ofmoments up to order 10

0 11988100

36

1 11988111

2 11988120 11988122

3 11988131 11988133

4 11988140 11988142 11988144

5 11988151 11988153 11988155

6 11988160 11988162 11988164 11988166

7 11988171 11988173 11988175 11988177

8 11988180 11988182 119881841198818611988188

9 11988191 11988193 119881951198819711988199

10 119881100 119881102 119881104 119881106 119881108 1198811010

(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments

(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails

Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work

The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image

37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by

119862119901119902= int1198862

1198861

int1198872

1198871

(119909 + 119895119910)119901

(119909 minus 119895119910)119902

119891 (119909 119910) 119889119909 119889119910 (32)

where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows

(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated

(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject

(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem

The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the

complex moments of order (119901 + 119902) can be written as follows

119862119897

119899= 119862119901119902

= int2120587

0

int+infin

0

120588119901+119902

119890119895(119901minus119902)120579

119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)

where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862

119901119902and 119862

119903

119901119902 the

relationship [33] between them is given as follows

119862119903

119901119902= 119862119901119902119890minus119895(119901minus119902)120579

(34)

where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862

119901119902| remains

unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2

Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3

4 Experimental Study and Analysis

In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows

Accuracy Rate () =Number of Correctly classified digits

Total number of digitstimes 100 (35)

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 2: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

2 Applied Computational Intelligence and Soft Computing

Roman and Urdu In a multilingual country like India it is acommon scenario that a document like job application formrailway ticket reservation form and so forth is composed oftext contents written in different languagesscripts in orderto reach a larger cross section of people The variation ofdifferent scripts may be in the form of numerals or alphanumerals in a single document page But the techniquesdeveloped for text identification generally do not incorporatethe recognition of digitsThis is because the features requiredfor the text identification may not be applicable for identify-ing the digits

The paper is organized as follows Section 2 presents abrief review of some of the previous approaches to hand-written digit recognition whereas in Section 3 we introduceour script independent handwritten digit recognition systemSection 4describes the performance of our systemon realisticdatabases of handwritten digits and finally Section 5 con-cludes the paper

2 Review of Related Works

Gorgevik and Cakmakov [3] developed Support VectorMachine (SVM) based digits recognition system for hand-written Roman numerals They extracted four types of fea-tures from each digit image (1) projection histograms (2)contour profiles (3) ring-zones and (4) Kirsch featuresTheyreported 9727 recognition accuracy on National Instituteof Standards and Technology (NIST) handwritten digitsdatabase [4] In [5] Chen et al proposed max-min poste-rior pseudoprobabilities framework for Roman handwrittendigit recognition They extracted 256 dimension directionalfeatures from the input image Finally these features weretransformed into a set of 128 features using Principal Com-ponent Analysis (PCA) They reported recognition accuracyof 9876 on NIST database [4] Labusch et al [6] describeda sparse coding based feature extraction method with SVMas a classifier They found recognition accuracy of 9941on MNIST (Modified NIST) handwritten digits database [7]The work described in [8] combined three recognizers bymajority vote and one of them is based on Kirsch gradient(four orientations) dimensionality reduction by PCA andclassification by SVM They achieved an accuracy rate of9505 with 093 error on 10000 test samples of MNISTdatabase [7] Mane and Ragha [9] performed handwrittendigit recognition using elastic image matching techniquebased on eigendeformation which is estimated by the PCAof actual deformations automatically selected by the elasticmatching They achieved an overall accuracy of 9491 ontheir owndatabase collected fromdifferent individuals of var-ious professions for the experiment Cruz et al [10] presenteda handwritten digit recognition system which uses multiplefeature extraction methods and classifier ensemble A totalof six feature extraction algorithms namely MultizoningModified Edge Maps Structural Characteristics ProjectionsConcavities Measurements and Gradient Directional wereevaluated in this paper A scheme using neural networks as acombiner achieved a recognition rate of 9968 on a trainingset of 60000 images and a test set of 10000 images of MNISTdatabase

Dhandra et al [11] investigated a script independentautomatic numeral recognition system for recognition ofKannada Telugu and Devanagari handwritten numerals Inthe proposed method 30 classes were reduced to 18 classesby extracting the global and local structural features likedirectional density estimation water reservoirs maximumprofile distances and fill-hole density Finally a probabilisticneural network (PNN) classifier was used for the recognitionsystem which yielded an accuracy of 9720 on a totalof 2550 numeral images written in Kannada Telugu andDevanagari scripts In [12] Yang et al proposed supervisedmatrix factorization method used directly as multiclass clas-sifier They reported recognition accuracy of 9871 withsupervised learning approach on MNIST database [7] In[13] a mixture of multiclass logistic regression models wasdescribed They claimed recognition accuracy of 98 onthe Indian digit database provided by CENPARMI [14] Daset al [15] described a technique for creating a pool of localregions and selection of an optimal set of local regions fromthat pool for extracting optimal discriminating informationfor handwritten Bangla digit recognition Genetic algorithm(GA) was then applied on these local regions to samplethe best discriminating features The features extracted fromthese selected local regions were then classified with SVMand recognition accuracy of 97 was achieved In [16] awavelet analysis based technique for feature extraction wasreported For classification SVM and k-Nearest Neighbor (k-NN) were used and an overall recognition accuracy of 9704was reported on MNIST digit database [7] A comparativestudy in [17] was conducted by training the neural networkusing Backpropagation (BP) algorithm and further usingPCA for feature extraction Digit recognition was finallycarried out using 13 algorithms neural network algorithmand the Fisher Discriminant Analysis (FDA) algorithm TheFDA algorithm proved less efficient with an overall accuracyof 7767 whereas the BP algorithm with PCA for its featureextraction gave an accuracy of 912

In [18] a set of structural features (namely number ofholes water reservoirs in four directions maximum profiledistances in four directions and fill-hole density) and k-NNclassifier were employed for classification and recognition ofhandwritten digits They reported recognition accuracy of9694 on 5000 samples of MNIST digit database [7] In[19] AlKhateeb and Alseid proposed an Arabic handwrittendigit recognition system using Dynamic Bayesian NetworkThey employed DCT coefficients based features for classifi-cation The system was tested on Indo-Arabic digits database(ADBase) which contains 70000 Indo-Arabic digits [20] andan average recognition accuracy of 8526 was achieved on10000 samples Ebrahimzadeh and Jampour [21] proposedan appearance feature-based approach using Histogram ofOriented Gradients (HOG) for handwritten digit recogni-tion A linear SVM was then used for classification of thedigits in MNIST dataset and an overall accuracy of 9725had been realized Gil et al [22] presented a novel approachusing SVM binary classifiers and unbalanced decision treesTwo classifiers were proposed in this study where one usedthe digit characteristics as input and the other used thewhole image as such It is observed that a handwritten

Applied Computational Intelligence and Soft Computing 3

digit recognition accuracy of 100 was achieved on MNISTdatabase using the whole image as input El Qacimy et al[23] investigated the effectiveness of four feature extractionapproaches based on Discrete Cosine Transform (DCT)namely DCT upper left corner (ULC) coefficients DCTzigzag coefficients block based DCT ULC coefficients andblock based DCT zigzag coefficients The coefficients of eachDCT variant were used as input data for SVM classifier andit was found that block based DCT zigzag feature extractionyielded a superior recognition accuracy of 9876 onMNISTdatabase AL-Mansoori [24] implemented aMLP classifier torecognize and predict handwritten digits A dataset of 5000samples were obtained from MNIST database and an overallaccuracy of 9932 was achieved

From the above literature it is clear thatmost of the workshave been done for the Roman script whereas relatively fewworks [11 15 19] have been reported for the digit recognitionwritten in Indic scripts The main reasons for this slowprogress could be attributed to the complexity of the shapeof Indic scripts as opposed to Roman script Again thediscriminating power of the features exploited till now isnot easily measurable investigative experimentations will benecessary for identifying new feature descriptors for effec-tive classification of complex handwritten digits of differentscripts It is also revealed that the methods described inthe literature suffer from larger computational time mainlydue to feature extraction from large dataset In addition theabove recognition systems fail to meet the desired accuracywhen exposed to different multiscript scenario Hence itwould be beneficial for multilingual country like India ifthere is a method which is independent of script and yieldsreasonable recognition accuracy This has motivated us tointroduce a script invariant handwritten digit recognitionsystem for identifying digits written in five popular scriptsnamely Indo-ArabicBanglaDevanagariRoman andTeluguThe key module of the proposed methodology is shown inFigure 1

3 Feature Extraction Methodology

One of the basic problems in the design of any patternrecognition system is the selection of a set of appropriatefeatures to be extracted from the object of interest Researchon the utilization of moments for object characterizationin both invariant and noninvariant tasks has received con-siderable attention in recent years Describing digit imageswith moments instead of other more commonly used patternrecognition features (described in [21ndash23]) means that globalproperties of the digit image are used rather than localproperties So for the present work we considered amomentbased approach which is described in the next subsection

31 Moments Moments are pure statistical measure of pixeldistribution around the center of gravity of the image andallow capturing global shapes information [25]Theydescribenumerical quantities at some distance from a referencepoint or axis Moments are commonly used in statisticsto characterize the distribution of random variables and

Handwritten digit images

Feature extraction using moment based features

Classification of digits using multiple classifiers

Statistical significance tests for comparison of multipleclassifiers using multiple datasets

Selection of appropriate classifier

Detailed performance evaluation of chosen classifierusing training and test sets

Figure 1 Schematic diagram illustrating the key modules of theproposed methodology

similarly in mechanics to characterize bodies by their spatialdistribution of mass

A complete characterization of moment functional over aclass of univariate functions was given by Hausdorff [26] in1921

Let 120583119899 be a real sequence of numbers and let us define

Δ119898

120583119899=

119898

sum119894=0

(minus1)119894

(119898

119894)120583119899+119894 (1)

Note that Δ119898120583119899can be viewed as the 119898th order derivative of

120583119899By theHausdorff theorem a necessary and sufficient con-

dition that there exists a monotonic function 119865(119909) satisfyingthe system

120583119899= int1

0

119909119899

119889119865 (119909) 119899 = 0 1 2 (2)

is that the system of linear inequalities

Δ119896

120583119899ge 0 119896 = 0 1 2 (3)

should be satisfied that is if 119891(119909) is a positive function (incase of image processing) then the set of functionals

int1

0

119909119899

119891 (119909) 119889119909 119899 = 0 1 (4)

completely characterizes the functionA necessary and sufficient condition that there exists a

function 119865(119909) of bounded variation satisfying (7) is that thesequence

119901

sum119898=0

(119901

119898)1003816100381610038161003816Δ119901minus119898

120583119898

1003816100381610038161003816 119901 = 0 1 2 (5)

4 Applied Computational Intelligence and Soft Computing

should be bounded The use of moments for image analysisis straightforward if we consider a binary or gray level imagesegment as a two-dimensional density distribution functionIt can be assumed that an image can be represented by a real-valued measurable function 119891(119909 119910) In this way momentsmay be used to characterize an image segment and extractproperties that have analogies in statistics and mechanics Inimage processing and computer vision an image momentis a certain particular weighted average (moment) of theimage pixelsrsquo intensities or a function of such momentsusually chosen to have some attractive property or interpre-tation The first significant work considering moments forpattern recognition was performed by Hu [27] He derivedrelative and absolute combinations of moment values thatare invariant with respect to scale position and orientationbased on the theories of invariant algebra that deal with theproperties of certain classes of algebraic expressions whichremain invariant under general linear transformations Sizeinvariant moments are derived from algebraic invariants butcan be shown to be the result of simple size normalizationTranslation invariance is achieved by computing momentsthat have been translated by the negative distance to thecentroid and thus normalized so that the center of mass ofthe distribution is at the origin (central moments)

32 Geometric Moments Geometric moments are defined asthe projection of the image intensity function119891(119909 119910) onto themonomial 119909119901119910119902 [25] The (119901 + 119902)th order geometric moment119872119901119902

of a gray level image 119891(119909 119910) is defined as

119872119901119902= ∬infin

minusinfin

119909119901

119910119902

119891 (119909 119910) 119889119909 119889119910 (6)

where 119901 119902 = 0 1 2 infin Note that the monomial product119909119901

119910119902 is the basis function for this moment definition A set

of 119899 moments consists of all 119872119901119902rsquos for 119901 + 119902 le 119899 that is

the set contains (12)(119899 + 1)(119899 + 2) elements If 119891(119909 119910) ispiecewise continuous and contains nonzero values only ina finite region of the 119909119910-plane then the moment sequence119872119901119902 is uniquely determined by 119891(119909 119910) and conversely

119891(119909 119910) is uniquely determined by 119872119901119902 Considering the

fact that an image segment has finite area or in the worst caseis piecewise continuous moments of all orders exist and acomplete moment set can be computed and used uniquely todescribe the information contained in the image Howeverobtaining all the information contained in the image requiresan infinite number of moment values Therefore to selecta meaningful subset of the moment values that containsufficient information to characterize the image uniquely fora specific application becomes very important In case of adigital image of size 119872 times 119873 the double integral in (6) isreplaced by a summation which turns into this simplifiedform

119898119901119902=

119872

sum119909=1

119873

sum119910=1

119909119901

119910119902

119891 (119909 119910) (7)

where 119901 119902 = 0 1 2 are integersWhen119891(119909 119910) changes by translating rotating or scaling

then the image may be positioned such that its center of mass

(COM) is coincided with the origin of the field of view thatis (119909 = 0) and (119910 = 0) and then the moments computedfor that object are referred to as central moment [25] and it isdesignated by 120583

119901119902 The simplified form of central moment of

order (119901 + 119902) is defined as follows

120583119901119902=

119872

sum119909=1

119873

sum119910=1

(119909 minus 119909)119901

(119910 minus 119910)119902

119891 (119909 119910) (8)

where 119909 = 1198981011989800and 119910 = 119898

0111989800

The pixel point (119909 119910) is theCOMof the imageThe centralmoments 120583

119901119902computed using the centroid of the image are

equivalent to 119898119901119902

whose center has been shifted to centroidof the image Therefore the central moments are invariantto image translations Scale invariance can be obtained bynormalization The normalized central moments denoted by120578119901119902

are defined as

120578119901119902=120583119901119902

120583120574

00

(9)

where 120574 = (119901 + 119902)2 + 1 for (119901 + 119902) = 2 3 The second order moments 120578

02 12057811 12057820 known as the

moments of inertia may be used to determine an importantimage feature called orientation [25] Here the feature valuesF1ndashF3 have been computed from moments of inertia of theword images In general the orientation of an image describeshow the image lies in the field of view or the directions of theprincipal axes In terms of moments the orientation of theprincipal axis 120579 taken as feature value F4 is given by

120579 =1

2tanminus1 (

212058311

12058320minus 12058302

) (10)

where 120579 is the angle of the principal axis nearest to the 119909-axis and is in the range minus1205874 le 120579 le 1205874 The minimumandmaximumdistances (119903min and 119903max) between the centroidand the boundary of an image are also feature descriptorsTheratio 119903max119903min is called elongation or eccentricity (F5) and canbe defined in terms of central moments as follows

119890 =(12058320minus 12058302)2

+ 41205832

11

12058300

(11)

33 Moment Invariants Based on the theory of algebraicinvariants Hu [27] derived relative and absolute combina-tions of moments that are invariant with respect to scaleposition and orientation The method of moment invariantsis derived from algebraic invariants applied to the momentgenerating function under a rotation transformationThe setof absolute moment invariants consists of a set of nonlinearcombinations of central moment values that remain invariantunder rotation A set of seven invariant moments can bederived based on the normalized central moments of order

Applied Computational Intelligence and Soft Computing 5

three that are invariant with respect to image scale transla-tion and rotation Consider

1206011= 12057820+ 12057802

1206012= (12057820minus 12057802)2

+ 41205782

11

1206013= (12057830minus 312057812)2

+ (312057821minus 12057803)2

1206014= (12057830+ 12057812)2

+ (12057821+ 12057803)2

1206015= (12057830minus 312057812) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057821minus 12057803)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

1206016= (12057820minus 12057802) [(12057830+ 12057812)2

minus (12057821+ 12057803)2

]

+ 412057811(12057830+ 12057812) (12057821+ 12057803)

1206017= (312057821minus 12057803) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057812minus 12057830)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

(12)

This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work

34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows

1198681=

1

120583400

(1205832012058302minus 1205832

11)

1198682=

1

1205831000

(1205832

301205832

03minus 612058330120583211205831212058303+ 4120583301205833

12

+ 4120583031205833

21minus 31205832

211205832

12)

1198683=

1

120583700

(12058320(1205832112058303minus 1205832

12) minus 12058311(1205833012058303minus 1205832112058312)

+ 12058302(1205833012058312minus 1205832

21))

1198684=

1

1205831100

(1205833

201205832

03minus 61205832

20120583111205831212058303minus 61205832

20120583211205830212058303

+ 91205832

20120583021205832

12+ 12120583

201205832

111205830312058321

+ 61205832012058311120583021205833012058303minus 18120583

2012058311120583021205832112058312

minus 81205833

111205830312058330minus 6120583201205832

021205833012058312+ 9120583201205832

021205832

21

+ 121205832

11120583021205833012058312minus 61205832

111205832

021205833012058321+ 1205832

021205832

30)

1198685=

1

120583600

(1205834012058304minus 41205833112058313+ 31205832

22)

1198686=

1

120583900

(120583401205830412058322+ 2120583311205832212058313minus 120583401205832

13minus 120583041205832

31

minus 1205833

22)

(13)

A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work

35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows

119871119901119902=(2119901 + 1) (2119902 + 1)

4

sdot ∬+1

minus1

119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910

(14)

where the kernel function 119875119901(119909) denotes the 119901th-order

Legendre polynomial and is given by

119875119901(119909) =

119901

sum119896=0

119862119901119896[(1 minus 119909)

119896

+ (minus1)119901

(1 + 119909)119896

] (15)

where

119862119901119896=(minus1)119896

2119896+1

(119901 + 119896)

(119901 minus 119896) (119896)2 (16)

Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871

119901119902 defined by (14) is usually

approximated by the formula

119871119901119902

=(2119901 + 1) (2119902 + 1)

(119873 minus 1)2

119873

sum119894=1

119873

sum119895=1

119875119901(119909119894) 119875119902(119910119895) 119891 (119909

119894 119910119895)

(17)

where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910

119895= (2119895minus119873minus1)(119873minus1)

and for a binary image 119891(119909119894 119910119895) is given as

119891 (119909119894 119910119895) =

1 if (119894 119895) is in original object

0 otherwise(18)

As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form

119901119902=(2119901 + 1) (2119902 + 1)

4

119873

sum119894=1

119873

sum119895=1

ℎ119901119902(119909119894119910119895) 119891 (119909

119894 119910119895) (19)

6 Applied Computational Intelligence and Soft Computing

Legendre polynomials

minus05 0 05 1minus1

x

Pn(x)

P0(x)

P1(x)

P2(x)

P3(x)

P4(x)

P5(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

(a)

Legendre polynomials

minus05 0 05 1minus1

x

P6(x)

P7(x)

P8(x)

P9(x)

P10(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

Pn(x)

(b)

Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875

1(119909) to 119875

5(119909) and (b) 119875

6(119909) to 119875

10(119909)

where

ℎ119901119902(119909119894 119910119895) = int

119909119894+Δ1199092

119909119894minusΔ1199092

int119910119895+Δ1199102

119910119895minusΔ1199102

119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)

withΔ119909 = 119909119894minus119909119894minus1

= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1

= 2(119873minus1)To evaluate the double integral ℎ

119901119902(119909119894 119910119895) defined by (20)

an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments

119901119902defined by (19) Therefore this

method requires a large number of computing operations Asone can see

119901119902can be expressed with the help of a useful

formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be

derived based on the set of invariant moments found in theprevious subsection

11987100= 12060100

11987110= (

3

4) 12060110

11987101= (

3

4) 12060101

11987120= (

5

4) [(

3

2) 12060120minus (

1

2) 12060100]

11987102= (

5

4) [(

3

2) 12060102minus (

1

2) 12060100]

11987111= (

9

4) 12060111

11987130= (

7

4) [(

5

2) 12060130minus (

3

2) 12060110]

11987103= (

7

4) [(

5

2) 12060103minus (

3

2) 12060101]

11987121= (

15

4) [(

3

2) 12060121minus (

1

2) 12060101]

11987112= (

15

4) [(

3

2) 12060112minus (

1

2) 12060110]

(21)

36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows

119909 = 119903 cos 120579

119910 = 119903 sin 120579(22)

Applied Computational Intelligence and Soft Computing 7

where

119903 = radic1199092 + 1199102

120579 = tanminus1 (119910

119909)

(23)

An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation

Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881

119899119898(119909 119910) The form of these polynomials is as

follows

119881119899119898(119909 119910) = 119881

119899119898(119901 120579) = 119877

119899119898(120588) exp (119895119898120579) (24)

where

119899 positive integer or zero

119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899

120588 length of vector from origin to 119891(119909 119910) pixel

120579 angle between vector 120588 and 119909-axis in counterclock-wise direction

As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows

119881119899119898

=119899 + 1

120587∬119891(119909 119910)119881

lowast

119899119898(120588 120579) 119889119909 119889119910 (25)

in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and

119881119899119898

=119899 + 1

120587int2120587

0

int1

0

119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(26)

in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by

119881119903

119899119898=119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(27)

By change of variable 1205791= 120579 minus 120572

119881119903

119899119898=119899 + 1

120587int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588)

sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [

119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579

1) 120588 119889120588 119889120579]

sdot exp (minus119895119898120572) = 119881119899119898

exp (minus119895119898120572)

(28)

Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881

119899119898| can be taken as a rotation invariant

feature of the underlying image function The real-valuedradial polynomial 119877

119899119898(120588) is defined as follows

119877119899119898(120588) =

(119899minus|119898|)2

sum119909=0

(minus1)119904

sdot(119899 minus 119904)

119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904

(29)

where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional

moments 120583119901119902

as follows

119885119899119897=(119899 + 1)

120587

sdot

119899

sum119896=119897

119902

sum119895=0

119897

sum119898=0

(minus119894)119898

(119902

119895)(

119897

119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898

(30)

Zernikemomentsmay bemore easily derived from rotationalmoments119863

119899119896 by

119885119899119897=

119899

sum119896=119897

119861119899119897119896119863119899119896 (31)

When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909

2

+ 1199102

le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows

(1) The magnitude of Zernike moment has rotationalinvariant property

(2) They are robust to noise and shape variations to someextent

(3) Since the basis is orthogonal they have minimumredundant information

8 Applied Computational Intelligence and Soft Computing

Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10

Order(119899)

Zernike moments of order 119899with repetition119898 (119881

119899119898)

Total number ofmoments up to order 10

0 11988100

36

1 11988111

2 11988120 11988122

3 11988131 11988133

4 11988140 11988142 11988144

5 11988151 11988153 11988155

6 11988160 11988162 11988164 11988166

7 11988171 11988173 11988175 11988177

8 11988180 11988182 119881841198818611988188

9 11988191 11988193 119881951198819711988199

10 119881100 119881102 119881104 119881106 119881108 1198811010

(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments

(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails

Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work

The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image

37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by

119862119901119902= int1198862

1198861

int1198872

1198871

(119909 + 119895119910)119901

(119909 minus 119895119910)119902

119891 (119909 119910) 119889119909 119889119910 (32)

where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows

(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated

(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject

(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem

The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the

complex moments of order (119901 + 119902) can be written as follows

119862119897

119899= 119862119901119902

= int2120587

0

int+infin

0

120588119901+119902

119890119895(119901minus119902)120579

119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)

where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862

119901119902and 119862

119903

119901119902 the

relationship [33] between them is given as follows

119862119903

119901119902= 119862119901119902119890minus119895(119901minus119902)120579

(34)

where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862

119901119902| remains

unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2

Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3

4 Experimental Study and Analysis

In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows

Accuracy Rate () =Number of Correctly classified digits

Total number of digitstimes 100 (35)

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 3: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Applied Computational Intelligence and Soft Computing 3

digit recognition accuracy of 100 was achieved on MNISTdatabase using the whole image as input El Qacimy et al[23] investigated the effectiveness of four feature extractionapproaches based on Discrete Cosine Transform (DCT)namely DCT upper left corner (ULC) coefficients DCTzigzag coefficients block based DCT ULC coefficients andblock based DCT zigzag coefficients The coefficients of eachDCT variant were used as input data for SVM classifier andit was found that block based DCT zigzag feature extractionyielded a superior recognition accuracy of 9876 onMNISTdatabase AL-Mansoori [24] implemented aMLP classifier torecognize and predict handwritten digits A dataset of 5000samples were obtained from MNIST database and an overallaccuracy of 9932 was achieved

From the above literature it is clear thatmost of the workshave been done for the Roman script whereas relatively fewworks [11 15 19] have been reported for the digit recognitionwritten in Indic scripts The main reasons for this slowprogress could be attributed to the complexity of the shapeof Indic scripts as opposed to Roman script Again thediscriminating power of the features exploited till now isnot easily measurable investigative experimentations will benecessary for identifying new feature descriptors for effec-tive classification of complex handwritten digits of differentscripts It is also revealed that the methods described inthe literature suffer from larger computational time mainlydue to feature extraction from large dataset In addition theabove recognition systems fail to meet the desired accuracywhen exposed to different multiscript scenario Hence itwould be beneficial for multilingual country like India ifthere is a method which is independent of script and yieldsreasonable recognition accuracy This has motivated us tointroduce a script invariant handwritten digit recognitionsystem for identifying digits written in five popular scriptsnamely Indo-ArabicBanglaDevanagariRoman andTeluguThe key module of the proposed methodology is shown inFigure 1

3 Feature Extraction Methodology

One of the basic problems in the design of any patternrecognition system is the selection of a set of appropriatefeatures to be extracted from the object of interest Researchon the utilization of moments for object characterizationin both invariant and noninvariant tasks has received con-siderable attention in recent years Describing digit imageswith moments instead of other more commonly used patternrecognition features (described in [21ndash23]) means that globalproperties of the digit image are used rather than localproperties So for the present work we considered amomentbased approach which is described in the next subsection

31 Moments Moments are pure statistical measure of pixeldistribution around the center of gravity of the image andallow capturing global shapes information [25]Theydescribenumerical quantities at some distance from a referencepoint or axis Moments are commonly used in statisticsto characterize the distribution of random variables and

Handwritten digit images

Feature extraction using moment based features

Classification of digits using multiple classifiers

Statistical significance tests for comparison of multipleclassifiers using multiple datasets

Selection of appropriate classifier

Detailed performance evaluation of chosen classifierusing training and test sets

Figure 1 Schematic diagram illustrating the key modules of theproposed methodology

similarly in mechanics to characterize bodies by their spatialdistribution of mass

A complete characterization of moment functional over aclass of univariate functions was given by Hausdorff [26] in1921

Let 120583119899 be a real sequence of numbers and let us define

Δ119898

120583119899=

119898

sum119894=0

(minus1)119894

(119898

119894)120583119899+119894 (1)

Note that Δ119898120583119899can be viewed as the 119898th order derivative of

120583119899By theHausdorff theorem a necessary and sufficient con-

dition that there exists a monotonic function 119865(119909) satisfyingthe system

120583119899= int1

0

119909119899

119889119865 (119909) 119899 = 0 1 2 (2)

is that the system of linear inequalities

Δ119896

120583119899ge 0 119896 = 0 1 2 (3)

should be satisfied that is if 119891(119909) is a positive function (incase of image processing) then the set of functionals

int1

0

119909119899

119891 (119909) 119889119909 119899 = 0 1 (4)

completely characterizes the functionA necessary and sufficient condition that there exists a

function 119865(119909) of bounded variation satisfying (7) is that thesequence

119901

sum119898=0

(119901

119898)1003816100381610038161003816Δ119901minus119898

120583119898

1003816100381610038161003816 119901 = 0 1 2 (5)

4 Applied Computational Intelligence and Soft Computing

should be bounded The use of moments for image analysisis straightforward if we consider a binary or gray level imagesegment as a two-dimensional density distribution functionIt can be assumed that an image can be represented by a real-valued measurable function 119891(119909 119910) In this way momentsmay be used to characterize an image segment and extractproperties that have analogies in statistics and mechanics Inimage processing and computer vision an image momentis a certain particular weighted average (moment) of theimage pixelsrsquo intensities or a function of such momentsusually chosen to have some attractive property or interpre-tation The first significant work considering moments forpattern recognition was performed by Hu [27] He derivedrelative and absolute combinations of moment values thatare invariant with respect to scale position and orientationbased on the theories of invariant algebra that deal with theproperties of certain classes of algebraic expressions whichremain invariant under general linear transformations Sizeinvariant moments are derived from algebraic invariants butcan be shown to be the result of simple size normalizationTranslation invariance is achieved by computing momentsthat have been translated by the negative distance to thecentroid and thus normalized so that the center of mass ofthe distribution is at the origin (central moments)

32 Geometric Moments Geometric moments are defined asthe projection of the image intensity function119891(119909 119910) onto themonomial 119909119901119910119902 [25] The (119901 + 119902)th order geometric moment119872119901119902

of a gray level image 119891(119909 119910) is defined as

119872119901119902= ∬infin

minusinfin

119909119901

119910119902

119891 (119909 119910) 119889119909 119889119910 (6)

where 119901 119902 = 0 1 2 infin Note that the monomial product119909119901

119910119902 is the basis function for this moment definition A set

of 119899 moments consists of all 119872119901119902rsquos for 119901 + 119902 le 119899 that is

the set contains (12)(119899 + 1)(119899 + 2) elements If 119891(119909 119910) ispiecewise continuous and contains nonzero values only ina finite region of the 119909119910-plane then the moment sequence119872119901119902 is uniquely determined by 119891(119909 119910) and conversely

119891(119909 119910) is uniquely determined by 119872119901119902 Considering the

fact that an image segment has finite area or in the worst caseis piecewise continuous moments of all orders exist and acomplete moment set can be computed and used uniquely todescribe the information contained in the image Howeverobtaining all the information contained in the image requiresan infinite number of moment values Therefore to selecta meaningful subset of the moment values that containsufficient information to characterize the image uniquely fora specific application becomes very important In case of adigital image of size 119872 times 119873 the double integral in (6) isreplaced by a summation which turns into this simplifiedform

119898119901119902=

119872

sum119909=1

119873

sum119910=1

119909119901

119910119902

119891 (119909 119910) (7)

where 119901 119902 = 0 1 2 are integersWhen119891(119909 119910) changes by translating rotating or scaling

then the image may be positioned such that its center of mass

(COM) is coincided with the origin of the field of view thatis (119909 = 0) and (119910 = 0) and then the moments computedfor that object are referred to as central moment [25] and it isdesignated by 120583

119901119902 The simplified form of central moment of

order (119901 + 119902) is defined as follows

120583119901119902=

119872

sum119909=1

119873

sum119910=1

(119909 minus 119909)119901

(119910 minus 119910)119902

119891 (119909 119910) (8)

where 119909 = 1198981011989800and 119910 = 119898

0111989800

The pixel point (119909 119910) is theCOMof the imageThe centralmoments 120583

119901119902computed using the centroid of the image are

equivalent to 119898119901119902

whose center has been shifted to centroidof the image Therefore the central moments are invariantto image translations Scale invariance can be obtained bynormalization The normalized central moments denoted by120578119901119902

are defined as

120578119901119902=120583119901119902

120583120574

00

(9)

where 120574 = (119901 + 119902)2 + 1 for (119901 + 119902) = 2 3 The second order moments 120578

02 12057811 12057820 known as the

moments of inertia may be used to determine an importantimage feature called orientation [25] Here the feature valuesF1ndashF3 have been computed from moments of inertia of theword images In general the orientation of an image describeshow the image lies in the field of view or the directions of theprincipal axes In terms of moments the orientation of theprincipal axis 120579 taken as feature value F4 is given by

120579 =1

2tanminus1 (

212058311

12058320minus 12058302

) (10)

where 120579 is the angle of the principal axis nearest to the 119909-axis and is in the range minus1205874 le 120579 le 1205874 The minimumandmaximumdistances (119903min and 119903max) between the centroidand the boundary of an image are also feature descriptorsTheratio 119903max119903min is called elongation or eccentricity (F5) and canbe defined in terms of central moments as follows

119890 =(12058320minus 12058302)2

+ 41205832

11

12058300

(11)

33 Moment Invariants Based on the theory of algebraicinvariants Hu [27] derived relative and absolute combina-tions of moments that are invariant with respect to scaleposition and orientation The method of moment invariantsis derived from algebraic invariants applied to the momentgenerating function under a rotation transformationThe setof absolute moment invariants consists of a set of nonlinearcombinations of central moment values that remain invariantunder rotation A set of seven invariant moments can bederived based on the normalized central moments of order

Applied Computational Intelligence and Soft Computing 5

three that are invariant with respect to image scale transla-tion and rotation Consider

1206011= 12057820+ 12057802

1206012= (12057820minus 12057802)2

+ 41205782

11

1206013= (12057830minus 312057812)2

+ (312057821minus 12057803)2

1206014= (12057830+ 12057812)2

+ (12057821+ 12057803)2

1206015= (12057830minus 312057812) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057821minus 12057803)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

1206016= (12057820minus 12057802) [(12057830+ 12057812)2

minus (12057821+ 12057803)2

]

+ 412057811(12057830+ 12057812) (12057821+ 12057803)

1206017= (312057821minus 12057803) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057812minus 12057830)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

(12)

This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work

34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows

1198681=

1

120583400

(1205832012058302minus 1205832

11)

1198682=

1

1205831000

(1205832

301205832

03minus 612058330120583211205831212058303+ 4120583301205833

12

+ 4120583031205833

21minus 31205832

211205832

12)

1198683=

1

120583700

(12058320(1205832112058303minus 1205832

12) minus 12058311(1205833012058303minus 1205832112058312)

+ 12058302(1205833012058312minus 1205832

21))

1198684=

1

1205831100

(1205833

201205832

03minus 61205832

20120583111205831212058303minus 61205832

20120583211205830212058303

+ 91205832

20120583021205832

12+ 12120583

201205832

111205830312058321

+ 61205832012058311120583021205833012058303minus 18120583

2012058311120583021205832112058312

minus 81205833

111205830312058330minus 6120583201205832

021205833012058312+ 9120583201205832

021205832

21

+ 121205832

11120583021205833012058312minus 61205832

111205832

021205833012058321+ 1205832

021205832

30)

1198685=

1

120583600

(1205834012058304minus 41205833112058313+ 31205832

22)

1198686=

1

120583900

(120583401205830412058322+ 2120583311205832212058313minus 120583401205832

13minus 120583041205832

31

minus 1205833

22)

(13)

A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work

35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows

119871119901119902=(2119901 + 1) (2119902 + 1)

4

sdot ∬+1

minus1

119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910

(14)

where the kernel function 119875119901(119909) denotes the 119901th-order

Legendre polynomial and is given by

119875119901(119909) =

119901

sum119896=0

119862119901119896[(1 minus 119909)

119896

+ (minus1)119901

(1 + 119909)119896

] (15)

where

119862119901119896=(minus1)119896

2119896+1

(119901 + 119896)

(119901 minus 119896) (119896)2 (16)

Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871

119901119902 defined by (14) is usually

approximated by the formula

119871119901119902

=(2119901 + 1) (2119902 + 1)

(119873 minus 1)2

119873

sum119894=1

119873

sum119895=1

119875119901(119909119894) 119875119902(119910119895) 119891 (119909

119894 119910119895)

(17)

where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910

119895= (2119895minus119873minus1)(119873minus1)

and for a binary image 119891(119909119894 119910119895) is given as

119891 (119909119894 119910119895) =

1 if (119894 119895) is in original object

0 otherwise(18)

As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form

119901119902=(2119901 + 1) (2119902 + 1)

4

119873

sum119894=1

119873

sum119895=1

ℎ119901119902(119909119894119910119895) 119891 (119909

119894 119910119895) (19)

6 Applied Computational Intelligence and Soft Computing

Legendre polynomials

minus05 0 05 1minus1

x

Pn(x)

P0(x)

P1(x)

P2(x)

P3(x)

P4(x)

P5(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

(a)

Legendre polynomials

minus05 0 05 1minus1

x

P6(x)

P7(x)

P8(x)

P9(x)

P10(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

Pn(x)

(b)

Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875

1(119909) to 119875

5(119909) and (b) 119875

6(119909) to 119875

10(119909)

where

ℎ119901119902(119909119894 119910119895) = int

119909119894+Δ1199092

119909119894minusΔ1199092

int119910119895+Δ1199102

119910119895minusΔ1199102

119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)

withΔ119909 = 119909119894minus119909119894minus1

= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1

= 2(119873minus1)To evaluate the double integral ℎ

119901119902(119909119894 119910119895) defined by (20)

an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments

119901119902defined by (19) Therefore this

method requires a large number of computing operations Asone can see

119901119902can be expressed with the help of a useful

formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be

derived based on the set of invariant moments found in theprevious subsection

11987100= 12060100

11987110= (

3

4) 12060110

11987101= (

3

4) 12060101

11987120= (

5

4) [(

3

2) 12060120minus (

1

2) 12060100]

11987102= (

5

4) [(

3

2) 12060102minus (

1

2) 12060100]

11987111= (

9

4) 12060111

11987130= (

7

4) [(

5

2) 12060130minus (

3

2) 12060110]

11987103= (

7

4) [(

5

2) 12060103minus (

3

2) 12060101]

11987121= (

15

4) [(

3

2) 12060121minus (

1

2) 12060101]

11987112= (

15

4) [(

3

2) 12060112minus (

1

2) 12060110]

(21)

36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows

119909 = 119903 cos 120579

119910 = 119903 sin 120579(22)

Applied Computational Intelligence and Soft Computing 7

where

119903 = radic1199092 + 1199102

120579 = tanminus1 (119910

119909)

(23)

An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation

Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881

119899119898(119909 119910) The form of these polynomials is as

follows

119881119899119898(119909 119910) = 119881

119899119898(119901 120579) = 119877

119899119898(120588) exp (119895119898120579) (24)

where

119899 positive integer or zero

119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899

120588 length of vector from origin to 119891(119909 119910) pixel

120579 angle between vector 120588 and 119909-axis in counterclock-wise direction

As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows

119881119899119898

=119899 + 1

120587∬119891(119909 119910)119881

lowast

119899119898(120588 120579) 119889119909 119889119910 (25)

in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and

119881119899119898

=119899 + 1

120587int2120587

0

int1

0

119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(26)

in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by

119881119903

119899119898=119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(27)

By change of variable 1205791= 120579 minus 120572

119881119903

119899119898=119899 + 1

120587int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588)

sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [

119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579

1) 120588 119889120588 119889120579]

sdot exp (minus119895119898120572) = 119881119899119898

exp (minus119895119898120572)

(28)

Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881

119899119898| can be taken as a rotation invariant

feature of the underlying image function The real-valuedradial polynomial 119877

119899119898(120588) is defined as follows

119877119899119898(120588) =

(119899minus|119898|)2

sum119909=0

(minus1)119904

sdot(119899 minus 119904)

119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904

(29)

where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional

moments 120583119901119902

as follows

119885119899119897=(119899 + 1)

120587

sdot

119899

sum119896=119897

119902

sum119895=0

119897

sum119898=0

(minus119894)119898

(119902

119895)(

119897

119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898

(30)

Zernikemomentsmay bemore easily derived from rotationalmoments119863

119899119896 by

119885119899119897=

119899

sum119896=119897

119861119899119897119896119863119899119896 (31)

When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909

2

+ 1199102

le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows

(1) The magnitude of Zernike moment has rotationalinvariant property

(2) They are robust to noise and shape variations to someextent

(3) Since the basis is orthogonal they have minimumredundant information

8 Applied Computational Intelligence and Soft Computing

Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10

Order(119899)

Zernike moments of order 119899with repetition119898 (119881

119899119898)

Total number ofmoments up to order 10

0 11988100

36

1 11988111

2 11988120 11988122

3 11988131 11988133

4 11988140 11988142 11988144

5 11988151 11988153 11988155

6 11988160 11988162 11988164 11988166

7 11988171 11988173 11988175 11988177

8 11988180 11988182 119881841198818611988188

9 11988191 11988193 119881951198819711988199

10 119881100 119881102 119881104 119881106 119881108 1198811010

(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments

(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails

Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work

The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image

37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by

119862119901119902= int1198862

1198861

int1198872

1198871

(119909 + 119895119910)119901

(119909 minus 119895119910)119902

119891 (119909 119910) 119889119909 119889119910 (32)

where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows

(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated

(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject

(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem

The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the

complex moments of order (119901 + 119902) can be written as follows

119862119897

119899= 119862119901119902

= int2120587

0

int+infin

0

120588119901+119902

119890119895(119901minus119902)120579

119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)

where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862

119901119902and 119862

119903

119901119902 the

relationship [33] between them is given as follows

119862119903

119901119902= 119862119901119902119890minus119895(119901minus119902)120579

(34)

where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862

119901119902| remains

unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2

Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3

4 Experimental Study and Analysis

In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows

Accuracy Rate () =Number of Correctly classified digits

Total number of digitstimes 100 (35)

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 4: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

4 Applied Computational Intelligence and Soft Computing

should be bounded The use of moments for image analysisis straightforward if we consider a binary or gray level imagesegment as a two-dimensional density distribution functionIt can be assumed that an image can be represented by a real-valued measurable function 119891(119909 119910) In this way momentsmay be used to characterize an image segment and extractproperties that have analogies in statistics and mechanics Inimage processing and computer vision an image momentis a certain particular weighted average (moment) of theimage pixelsrsquo intensities or a function of such momentsusually chosen to have some attractive property or interpre-tation The first significant work considering moments forpattern recognition was performed by Hu [27] He derivedrelative and absolute combinations of moment values thatare invariant with respect to scale position and orientationbased on the theories of invariant algebra that deal with theproperties of certain classes of algebraic expressions whichremain invariant under general linear transformations Sizeinvariant moments are derived from algebraic invariants butcan be shown to be the result of simple size normalizationTranslation invariance is achieved by computing momentsthat have been translated by the negative distance to thecentroid and thus normalized so that the center of mass ofthe distribution is at the origin (central moments)

32 Geometric Moments Geometric moments are defined asthe projection of the image intensity function119891(119909 119910) onto themonomial 119909119901119910119902 [25] The (119901 + 119902)th order geometric moment119872119901119902

of a gray level image 119891(119909 119910) is defined as

119872119901119902= ∬infin

minusinfin

119909119901

119910119902

119891 (119909 119910) 119889119909 119889119910 (6)

where 119901 119902 = 0 1 2 infin Note that the monomial product119909119901

119910119902 is the basis function for this moment definition A set

of 119899 moments consists of all 119872119901119902rsquos for 119901 + 119902 le 119899 that is

the set contains (12)(119899 + 1)(119899 + 2) elements If 119891(119909 119910) ispiecewise continuous and contains nonzero values only ina finite region of the 119909119910-plane then the moment sequence119872119901119902 is uniquely determined by 119891(119909 119910) and conversely

119891(119909 119910) is uniquely determined by 119872119901119902 Considering the

fact that an image segment has finite area or in the worst caseis piecewise continuous moments of all orders exist and acomplete moment set can be computed and used uniquely todescribe the information contained in the image Howeverobtaining all the information contained in the image requiresan infinite number of moment values Therefore to selecta meaningful subset of the moment values that containsufficient information to characterize the image uniquely fora specific application becomes very important In case of adigital image of size 119872 times 119873 the double integral in (6) isreplaced by a summation which turns into this simplifiedform

119898119901119902=

119872

sum119909=1

119873

sum119910=1

119909119901

119910119902

119891 (119909 119910) (7)

where 119901 119902 = 0 1 2 are integersWhen119891(119909 119910) changes by translating rotating or scaling

then the image may be positioned such that its center of mass

(COM) is coincided with the origin of the field of view thatis (119909 = 0) and (119910 = 0) and then the moments computedfor that object are referred to as central moment [25] and it isdesignated by 120583

119901119902 The simplified form of central moment of

order (119901 + 119902) is defined as follows

120583119901119902=

119872

sum119909=1

119873

sum119910=1

(119909 minus 119909)119901

(119910 minus 119910)119902

119891 (119909 119910) (8)

where 119909 = 1198981011989800and 119910 = 119898

0111989800

The pixel point (119909 119910) is theCOMof the imageThe centralmoments 120583

119901119902computed using the centroid of the image are

equivalent to 119898119901119902

whose center has been shifted to centroidof the image Therefore the central moments are invariantto image translations Scale invariance can be obtained bynormalization The normalized central moments denoted by120578119901119902

are defined as

120578119901119902=120583119901119902

120583120574

00

(9)

where 120574 = (119901 + 119902)2 + 1 for (119901 + 119902) = 2 3 The second order moments 120578

02 12057811 12057820 known as the

moments of inertia may be used to determine an importantimage feature called orientation [25] Here the feature valuesF1ndashF3 have been computed from moments of inertia of theword images In general the orientation of an image describeshow the image lies in the field of view or the directions of theprincipal axes In terms of moments the orientation of theprincipal axis 120579 taken as feature value F4 is given by

120579 =1

2tanminus1 (

212058311

12058320minus 12058302

) (10)

where 120579 is the angle of the principal axis nearest to the 119909-axis and is in the range minus1205874 le 120579 le 1205874 The minimumandmaximumdistances (119903min and 119903max) between the centroidand the boundary of an image are also feature descriptorsTheratio 119903max119903min is called elongation or eccentricity (F5) and canbe defined in terms of central moments as follows

119890 =(12058320minus 12058302)2

+ 41205832

11

12058300

(11)

33 Moment Invariants Based on the theory of algebraicinvariants Hu [27] derived relative and absolute combina-tions of moments that are invariant with respect to scaleposition and orientation The method of moment invariantsis derived from algebraic invariants applied to the momentgenerating function under a rotation transformationThe setof absolute moment invariants consists of a set of nonlinearcombinations of central moment values that remain invariantunder rotation A set of seven invariant moments can bederived based on the normalized central moments of order

Applied Computational Intelligence and Soft Computing 5

three that are invariant with respect to image scale transla-tion and rotation Consider

1206011= 12057820+ 12057802

1206012= (12057820minus 12057802)2

+ 41205782

11

1206013= (12057830minus 312057812)2

+ (312057821minus 12057803)2

1206014= (12057830+ 12057812)2

+ (12057821+ 12057803)2

1206015= (12057830minus 312057812) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057821minus 12057803)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

1206016= (12057820minus 12057802) [(12057830+ 12057812)2

minus (12057821+ 12057803)2

]

+ 412057811(12057830+ 12057812) (12057821+ 12057803)

1206017= (312057821minus 12057803) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057812minus 12057830)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

(12)

This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work

34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows

1198681=

1

120583400

(1205832012058302minus 1205832

11)

1198682=

1

1205831000

(1205832

301205832

03minus 612058330120583211205831212058303+ 4120583301205833

12

+ 4120583031205833

21minus 31205832

211205832

12)

1198683=

1

120583700

(12058320(1205832112058303minus 1205832

12) minus 12058311(1205833012058303minus 1205832112058312)

+ 12058302(1205833012058312minus 1205832

21))

1198684=

1

1205831100

(1205833

201205832

03minus 61205832

20120583111205831212058303minus 61205832

20120583211205830212058303

+ 91205832

20120583021205832

12+ 12120583

201205832

111205830312058321

+ 61205832012058311120583021205833012058303minus 18120583

2012058311120583021205832112058312

minus 81205833

111205830312058330minus 6120583201205832

021205833012058312+ 9120583201205832

021205832

21

+ 121205832

11120583021205833012058312minus 61205832

111205832

021205833012058321+ 1205832

021205832

30)

1198685=

1

120583600

(1205834012058304minus 41205833112058313+ 31205832

22)

1198686=

1

120583900

(120583401205830412058322+ 2120583311205832212058313minus 120583401205832

13minus 120583041205832

31

minus 1205833

22)

(13)

A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work

35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows

119871119901119902=(2119901 + 1) (2119902 + 1)

4

sdot ∬+1

minus1

119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910

(14)

where the kernel function 119875119901(119909) denotes the 119901th-order

Legendre polynomial and is given by

119875119901(119909) =

119901

sum119896=0

119862119901119896[(1 minus 119909)

119896

+ (minus1)119901

(1 + 119909)119896

] (15)

where

119862119901119896=(minus1)119896

2119896+1

(119901 + 119896)

(119901 minus 119896) (119896)2 (16)

Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871

119901119902 defined by (14) is usually

approximated by the formula

119871119901119902

=(2119901 + 1) (2119902 + 1)

(119873 minus 1)2

119873

sum119894=1

119873

sum119895=1

119875119901(119909119894) 119875119902(119910119895) 119891 (119909

119894 119910119895)

(17)

where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910

119895= (2119895minus119873minus1)(119873minus1)

and for a binary image 119891(119909119894 119910119895) is given as

119891 (119909119894 119910119895) =

1 if (119894 119895) is in original object

0 otherwise(18)

As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form

119901119902=(2119901 + 1) (2119902 + 1)

4

119873

sum119894=1

119873

sum119895=1

ℎ119901119902(119909119894119910119895) 119891 (119909

119894 119910119895) (19)

6 Applied Computational Intelligence and Soft Computing

Legendre polynomials

minus05 0 05 1minus1

x

Pn(x)

P0(x)

P1(x)

P2(x)

P3(x)

P4(x)

P5(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

(a)

Legendre polynomials

minus05 0 05 1minus1

x

P6(x)

P7(x)

P8(x)

P9(x)

P10(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

Pn(x)

(b)

Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875

1(119909) to 119875

5(119909) and (b) 119875

6(119909) to 119875

10(119909)

where

ℎ119901119902(119909119894 119910119895) = int

119909119894+Δ1199092

119909119894minusΔ1199092

int119910119895+Δ1199102

119910119895minusΔ1199102

119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)

withΔ119909 = 119909119894minus119909119894minus1

= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1

= 2(119873minus1)To evaluate the double integral ℎ

119901119902(119909119894 119910119895) defined by (20)

an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments

119901119902defined by (19) Therefore this

method requires a large number of computing operations Asone can see

119901119902can be expressed with the help of a useful

formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be

derived based on the set of invariant moments found in theprevious subsection

11987100= 12060100

11987110= (

3

4) 12060110

11987101= (

3

4) 12060101

11987120= (

5

4) [(

3

2) 12060120minus (

1

2) 12060100]

11987102= (

5

4) [(

3

2) 12060102minus (

1

2) 12060100]

11987111= (

9

4) 12060111

11987130= (

7

4) [(

5

2) 12060130minus (

3

2) 12060110]

11987103= (

7

4) [(

5

2) 12060103minus (

3

2) 12060101]

11987121= (

15

4) [(

3

2) 12060121minus (

1

2) 12060101]

11987112= (

15

4) [(

3

2) 12060112minus (

1

2) 12060110]

(21)

36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows

119909 = 119903 cos 120579

119910 = 119903 sin 120579(22)

Applied Computational Intelligence and Soft Computing 7

where

119903 = radic1199092 + 1199102

120579 = tanminus1 (119910

119909)

(23)

An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation

Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881

119899119898(119909 119910) The form of these polynomials is as

follows

119881119899119898(119909 119910) = 119881

119899119898(119901 120579) = 119877

119899119898(120588) exp (119895119898120579) (24)

where

119899 positive integer or zero

119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899

120588 length of vector from origin to 119891(119909 119910) pixel

120579 angle between vector 120588 and 119909-axis in counterclock-wise direction

As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows

119881119899119898

=119899 + 1

120587∬119891(119909 119910)119881

lowast

119899119898(120588 120579) 119889119909 119889119910 (25)

in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and

119881119899119898

=119899 + 1

120587int2120587

0

int1

0

119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(26)

in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by

119881119903

119899119898=119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(27)

By change of variable 1205791= 120579 minus 120572

119881119903

119899119898=119899 + 1

120587int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588)

sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [

119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579

1) 120588 119889120588 119889120579]

sdot exp (minus119895119898120572) = 119881119899119898

exp (minus119895119898120572)

(28)

Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881

119899119898| can be taken as a rotation invariant

feature of the underlying image function The real-valuedradial polynomial 119877

119899119898(120588) is defined as follows

119877119899119898(120588) =

(119899minus|119898|)2

sum119909=0

(minus1)119904

sdot(119899 minus 119904)

119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904

(29)

where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional

moments 120583119901119902

as follows

119885119899119897=(119899 + 1)

120587

sdot

119899

sum119896=119897

119902

sum119895=0

119897

sum119898=0

(minus119894)119898

(119902

119895)(

119897

119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898

(30)

Zernikemomentsmay bemore easily derived from rotationalmoments119863

119899119896 by

119885119899119897=

119899

sum119896=119897

119861119899119897119896119863119899119896 (31)

When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909

2

+ 1199102

le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows

(1) The magnitude of Zernike moment has rotationalinvariant property

(2) They are robust to noise and shape variations to someextent

(3) Since the basis is orthogonal they have minimumredundant information

8 Applied Computational Intelligence and Soft Computing

Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10

Order(119899)

Zernike moments of order 119899with repetition119898 (119881

119899119898)

Total number ofmoments up to order 10

0 11988100

36

1 11988111

2 11988120 11988122

3 11988131 11988133

4 11988140 11988142 11988144

5 11988151 11988153 11988155

6 11988160 11988162 11988164 11988166

7 11988171 11988173 11988175 11988177

8 11988180 11988182 119881841198818611988188

9 11988191 11988193 119881951198819711988199

10 119881100 119881102 119881104 119881106 119881108 1198811010

(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments

(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails

Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work

The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image

37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by

119862119901119902= int1198862

1198861

int1198872

1198871

(119909 + 119895119910)119901

(119909 minus 119895119910)119902

119891 (119909 119910) 119889119909 119889119910 (32)

where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows

(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated

(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject

(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem

The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the

complex moments of order (119901 + 119902) can be written as follows

119862119897

119899= 119862119901119902

= int2120587

0

int+infin

0

120588119901+119902

119890119895(119901minus119902)120579

119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)

where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862

119901119902and 119862

119903

119901119902 the

relationship [33] between them is given as follows

119862119903

119901119902= 119862119901119902119890minus119895(119901minus119902)120579

(34)

where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862

119901119902| remains

unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2

Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3

4 Experimental Study and Analysis

In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows

Accuracy Rate () =Number of Correctly classified digits

Total number of digitstimes 100 (35)

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 5: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Applied Computational Intelligence and Soft Computing 5

three that are invariant with respect to image scale transla-tion and rotation Consider

1206011= 12057820+ 12057802

1206012= (12057820minus 12057802)2

+ 41205782

11

1206013= (12057830minus 312057812)2

+ (312057821minus 12057803)2

1206014= (12057830+ 12057812)2

+ (12057821+ 12057803)2

1206015= (12057830minus 312057812) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057821minus 12057803)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

1206016= (12057820minus 12057802) [(12057830+ 12057812)2

minus (12057821+ 12057803)2

]

+ 412057811(12057830+ 12057812) (12057821+ 12057803)

1206017= (312057821minus 12057803) (12057830+ 12057812)

sdot [(12057830+ 12057812)2

minus 3 (12057821+ 12057803)2

] + (312057812minus 12057830)

sdot (12057821+ 12057803) [3 (120578

30+ 12057812)2

minus (12057821+ 12057803)2

]

(12)

This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work

34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows

1198681=

1

120583400

(1205832012058302minus 1205832

11)

1198682=

1

1205831000

(1205832

301205832

03minus 612058330120583211205831212058303+ 4120583301205833

12

+ 4120583031205833

21minus 31205832

211205832

12)

1198683=

1

120583700

(12058320(1205832112058303minus 1205832

12) minus 12058311(1205833012058303minus 1205832112058312)

+ 12058302(1205833012058312minus 1205832

21))

1198684=

1

1205831100

(1205833

201205832

03minus 61205832

20120583111205831212058303minus 61205832

20120583211205830212058303

+ 91205832

20120583021205832

12+ 12120583

201205832

111205830312058321

+ 61205832012058311120583021205833012058303minus 18120583

2012058311120583021205832112058312

minus 81205833

111205830312058330minus 6120583201205832

021205833012058312+ 9120583201205832

021205832

21

+ 121205832

11120583021205833012058312minus 61205832

111205832

021205833012058321+ 1205832

021205832

30)

1198685=

1

120583600

(1205834012058304minus 41205833112058313+ 31205832

22)

1198686=

1

120583900

(120583401205830412058322+ 2120583311205832212058313minus 120583401205832

13minus 120583041205832

31

minus 1205833

22)

(13)

A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work

35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows

119871119901119902=(2119901 + 1) (2119902 + 1)

4

sdot ∬+1

minus1

119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910

(14)

where the kernel function 119875119901(119909) denotes the 119901th-order

Legendre polynomial and is given by

119875119901(119909) =

119901

sum119896=0

119862119901119896[(1 minus 119909)

119896

+ (minus1)119901

(1 + 119909)119896

] (15)

where

119862119901119896=(minus1)119896

2119896+1

(119901 + 119896)

(119901 minus 119896) (119896)2 (16)

Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871

119901119902 defined by (14) is usually

approximated by the formula

119871119901119902

=(2119901 + 1) (2119902 + 1)

(119873 minus 1)2

119873

sum119894=1

119873

sum119895=1

119875119901(119909119894) 119875119902(119910119895) 119891 (119909

119894 119910119895)

(17)

where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910

119895= (2119895minus119873minus1)(119873minus1)

and for a binary image 119891(119909119894 119910119895) is given as

119891 (119909119894 119910119895) =

1 if (119894 119895) is in original object

0 otherwise(18)

As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form

119901119902=(2119901 + 1) (2119902 + 1)

4

119873

sum119894=1

119873

sum119895=1

ℎ119901119902(119909119894119910119895) 119891 (119909

119894 119910119895) (19)

6 Applied Computational Intelligence and Soft Computing

Legendre polynomials

minus05 0 05 1minus1

x

Pn(x)

P0(x)

P1(x)

P2(x)

P3(x)

P4(x)

P5(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

(a)

Legendre polynomials

minus05 0 05 1minus1

x

P6(x)

P7(x)

P8(x)

P9(x)

P10(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

Pn(x)

(b)

Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875

1(119909) to 119875

5(119909) and (b) 119875

6(119909) to 119875

10(119909)

where

ℎ119901119902(119909119894 119910119895) = int

119909119894+Δ1199092

119909119894minusΔ1199092

int119910119895+Δ1199102

119910119895minusΔ1199102

119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)

withΔ119909 = 119909119894minus119909119894minus1

= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1

= 2(119873minus1)To evaluate the double integral ℎ

119901119902(119909119894 119910119895) defined by (20)

an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments

119901119902defined by (19) Therefore this

method requires a large number of computing operations Asone can see

119901119902can be expressed with the help of a useful

formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be

derived based on the set of invariant moments found in theprevious subsection

11987100= 12060100

11987110= (

3

4) 12060110

11987101= (

3

4) 12060101

11987120= (

5

4) [(

3

2) 12060120minus (

1

2) 12060100]

11987102= (

5

4) [(

3

2) 12060102minus (

1

2) 12060100]

11987111= (

9

4) 12060111

11987130= (

7

4) [(

5

2) 12060130minus (

3

2) 12060110]

11987103= (

7

4) [(

5

2) 12060103minus (

3

2) 12060101]

11987121= (

15

4) [(

3

2) 12060121minus (

1

2) 12060101]

11987112= (

15

4) [(

3

2) 12060112minus (

1

2) 12060110]

(21)

36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows

119909 = 119903 cos 120579

119910 = 119903 sin 120579(22)

Applied Computational Intelligence and Soft Computing 7

where

119903 = radic1199092 + 1199102

120579 = tanminus1 (119910

119909)

(23)

An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation

Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881

119899119898(119909 119910) The form of these polynomials is as

follows

119881119899119898(119909 119910) = 119881

119899119898(119901 120579) = 119877

119899119898(120588) exp (119895119898120579) (24)

where

119899 positive integer or zero

119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899

120588 length of vector from origin to 119891(119909 119910) pixel

120579 angle between vector 120588 and 119909-axis in counterclock-wise direction

As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows

119881119899119898

=119899 + 1

120587∬119891(119909 119910)119881

lowast

119899119898(120588 120579) 119889119909 119889119910 (25)

in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and

119881119899119898

=119899 + 1

120587int2120587

0

int1

0

119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(26)

in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by

119881119903

119899119898=119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(27)

By change of variable 1205791= 120579 minus 120572

119881119903

119899119898=119899 + 1

120587int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588)

sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [

119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579

1) 120588 119889120588 119889120579]

sdot exp (minus119895119898120572) = 119881119899119898

exp (minus119895119898120572)

(28)

Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881

119899119898| can be taken as a rotation invariant

feature of the underlying image function The real-valuedradial polynomial 119877

119899119898(120588) is defined as follows

119877119899119898(120588) =

(119899minus|119898|)2

sum119909=0

(minus1)119904

sdot(119899 minus 119904)

119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904

(29)

where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional

moments 120583119901119902

as follows

119885119899119897=(119899 + 1)

120587

sdot

119899

sum119896=119897

119902

sum119895=0

119897

sum119898=0

(minus119894)119898

(119902

119895)(

119897

119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898

(30)

Zernikemomentsmay bemore easily derived from rotationalmoments119863

119899119896 by

119885119899119897=

119899

sum119896=119897

119861119899119897119896119863119899119896 (31)

When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909

2

+ 1199102

le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows

(1) The magnitude of Zernike moment has rotationalinvariant property

(2) They are robust to noise and shape variations to someextent

(3) Since the basis is orthogonal they have minimumredundant information

8 Applied Computational Intelligence and Soft Computing

Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10

Order(119899)

Zernike moments of order 119899with repetition119898 (119881

119899119898)

Total number ofmoments up to order 10

0 11988100

36

1 11988111

2 11988120 11988122

3 11988131 11988133

4 11988140 11988142 11988144

5 11988151 11988153 11988155

6 11988160 11988162 11988164 11988166

7 11988171 11988173 11988175 11988177

8 11988180 11988182 119881841198818611988188

9 11988191 11988193 119881951198819711988199

10 119881100 119881102 119881104 119881106 119881108 1198811010

(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments

(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails

Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work

The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image

37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by

119862119901119902= int1198862

1198861

int1198872

1198871

(119909 + 119895119910)119901

(119909 minus 119895119910)119902

119891 (119909 119910) 119889119909 119889119910 (32)

where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows

(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated

(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject

(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem

The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the

complex moments of order (119901 + 119902) can be written as follows

119862119897

119899= 119862119901119902

= int2120587

0

int+infin

0

120588119901+119902

119890119895(119901minus119902)120579

119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)

where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862

119901119902and 119862

119903

119901119902 the

relationship [33] between them is given as follows

119862119903

119901119902= 119862119901119902119890minus119895(119901minus119902)120579

(34)

where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862

119901119902| remains

unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2

Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3

4 Experimental Study and Analysis

In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows

Accuracy Rate () =Number of Correctly classified digits

Total number of digitstimes 100 (35)

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 6: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

6 Applied Computational Intelligence and Soft Computing

Legendre polynomials

minus05 0 05 1minus1

x

Pn(x)

P0(x)

P1(x)

P2(x)

P3(x)

P4(x)

P5(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

(a)

Legendre polynomials

minus05 0 05 1minus1

x

P6(x)

P7(x)

P8(x)

P9(x)

P10(x)

minus1

0

02

04

06

08

1

minus08

minus06

minus04

minus02

Pn(x)

(b)

Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875

1(119909) to 119875

5(119909) and (b) 119875

6(119909) to 119875

10(119909)

where

ℎ119901119902(119909119894 119910119895) = int

119909119894+Δ1199092

119909119894minusΔ1199092

int119910119895+Δ1199102

119910119895minusΔ1199102

119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)

withΔ119909 = 119909119894minus119909119894minus1

= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1

= 2(119873minus1)To evaluate the double integral ℎ

119901119902(119909119894 119910119895) defined by (20)

an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments

119901119902defined by (19) Therefore this

method requires a large number of computing operations Asone can see

119901119902can be expressed with the help of a useful

formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be

derived based on the set of invariant moments found in theprevious subsection

11987100= 12060100

11987110= (

3

4) 12060110

11987101= (

3

4) 12060101

11987120= (

5

4) [(

3

2) 12060120minus (

1

2) 12060100]

11987102= (

5

4) [(

3

2) 12060102minus (

1

2) 12060100]

11987111= (

9

4) 12060111

11987130= (

7

4) [(

5

2) 12060130minus (

3

2) 12060110]

11987103= (

7

4) [(

5

2) 12060103minus (

3

2) 12060101]

11987121= (

15

4) [(

3

2) 12060121minus (

1

2) 12060101]

11987112= (

15

4) [(

3

2) 12060112minus (

1

2) 12060110]

(21)

36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows

119909 = 119903 cos 120579

119910 = 119903 sin 120579(22)

Applied Computational Intelligence and Soft Computing 7

where

119903 = radic1199092 + 1199102

120579 = tanminus1 (119910

119909)

(23)

An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation

Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881

119899119898(119909 119910) The form of these polynomials is as

follows

119881119899119898(119909 119910) = 119881

119899119898(119901 120579) = 119877

119899119898(120588) exp (119895119898120579) (24)

where

119899 positive integer or zero

119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899

120588 length of vector from origin to 119891(119909 119910) pixel

120579 angle between vector 120588 and 119909-axis in counterclock-wise direction

As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows

119881119899119898

=119899 + 1

120587∬119891(119909 119910)119881

lowast

119899119898(120588 120579) 119889119909 119889119910 (25)

in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and

119881119899119898

=119899 + 1

120587int2120587

0

int1

0

119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(26)

in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by

119881119903

119899119898=119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(27)

By change of variable 1205791= 120579 minus 120572

119881119903

119899119898=119899 + 1

120587int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588)

sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [

119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579

1) 120588 119889120588 119889120579]

sdot exp (minus119895119898120572) = 119881119899119898

exp (minus119895119898120572)

(28)

Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881

119899119898| can be taken as a rotation invariant

feature of the underlying image function The real-valuedradial polynomial 119877

119899119898(120588) is defined as follows

119877119899119898(120588) =

(119899minus|119898|)2

sum119909=0

(minus1)119904

sdot(119899 minus 119904)

119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904

(29)

where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional

moments 120583119901119902

as follows

119885119899119897=(119899 + 1)

120587

sdot

119899

sum119896=119897

119902

sum119895=0

119897

sum119898=0

(minus119894)119898

(119902

119895)(

119897

119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898

(30)

Zernikemomentsmay bemore easily derived from rotationalmoments119863

119899119896 by

119885119899119897=

119899

sum119896=119897

119861119899119897119896119863119899119896 (31)

When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909

2

+ 1199102

le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows

(1) The magnitude of Zernike moment has rotationalinvariant property

(2) They are robust to noise and shape variations to someextent

(3) Since the basis is orthogonal they have minimumredundant information

8 Applied Computational Intelligence and Soft Computing

Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10

Order(119899)

Zernike moments of order 119899with repetition119898 (119881

119899119898)

Total number ofmoments up to order 10

0 11988100

36

1 11988111

2 11988120 11988122

3 11988131 11988133

4 11988140 11988142 11988144

5 11988151 11988153 11988155

6 11988160 11988162 11988164 11988166

7 11988171 11988173 11988175 11988177

8 11988180 11988182 119881841198818611988188

9 11988191 11988193 119881951198819711988199

10 119881100 119881102 119881104 119881106 119881108 1198811010

(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments

(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails

Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work

The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image

37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by

119862119901119902= int1198862

1198861

int1198872

1198871

(119909 + 119895119910)119901

(119909 minus 119895119910)119902

119891 (119909 119910) 119889119909 119889119910 (32)

where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows

(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated

(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject

(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem

The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the

complex moments of order (119901 + 119902) can be written as follows

119862119897

119899= 119862119901119902

= int2120587

0

int+infin

0

120588119901+119902

119890119895(119901minus119902)120579

119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)

where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862

119901119902and 119862

119903

119901119902 the

relationship [33] between them is given as follows

119862119903

119901119902= 119862119901119902119890minus119895(119901minus119902)120579

(34)

where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862

119901119902| remains

unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2

Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3

4 Experimental Study and Analysis

In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows

Accuracy Rate () =Number of Correctly classified digits

Total number of digitstimes 100 (35)

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Applied Computational Intelligence and Soft Computing 7

where

119903 = radic1199092 + 1199102

120579 = tanminus1 (119910

119909)

(23)

An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation

Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881

119899119898(119909 119910) The form of these polynomials is as

follows

119881119899119898(119909 119910) = 119881

119899119898(119901 120579) = 119877

119899119898(120588) exp (119895119898120579) (24)

where

119899 positive integer or zero

119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899

120588 length of vector from origin to 119891(119909 119910) pixel

120579 angle between vector 120588 and 119909-axis in counterclock-wise direction

As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows

119881119899119898

=119899 + 1

120587∬119891(119909 119910)119881

lowast

119899119898(120588 120579) 119889119909 119889119910 (25)

in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and

119881119899119898

=119899 + 1

120587int2120587

0

int1

0

119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(26)

in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by

119881119903

119899119898=119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579

(27)

By change of variable 1205791= 120579 minus 120572

119881119903

119899119898=119899 + 1

120587int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588)

sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [

119899 + 1

120587

sdot int2120587

0

int1

0

119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579

1) 120588 119889120588 119889120579]

sdot exp (minus119895119898120572) = 119881119899119898

exp (minus119895119898120572)

(28)

Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881

119899119898| can be taken as a rotation invariant

feature of the underlying image function The real-valuedradial polynomial 119877

119899119898(120588) is defined as follows

119877119899119898(120588) =

(119899minus|119898|)2

sum119909=0

(minus1)119904

sdot(119899 minus 119904)

119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904

(29)

where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional

moments 120583119901119902

as follows

119885119899119897=(119899 + 1)

120587

sdot

119899

sum119896=119897

119902

sum119895=0

119897

sum119898=0

(minus119894)119898

(119902

119895)(

119897

119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898

(30)

Zernikemomentsmay bemore easily derived from rotationalmoments119863

119899119896 by

119885119899119897=

119899

sum119896=119897

119861119899119897119896119863119899119896 (31)

When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909

2

+ 1199102

le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows

(1) The magnitude of Zernike moment has rotationalinvariant property

(2) They are robust to noise and shape variations to someextent

(3) Since the basis is orthogonal they have minimumredundant information

8 Applied Computational Intelligence and Soft Computing

Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10

Order(119899)

Zernike moments of order 119899with repetition119898 (119881

119899119898)

Total number ofmoments up to order 10

0 11988100

36

1 11988111

2 11988120 11988122

3 11988131 11988133

4 11988140 11988142 11988144

5 11988151 11988153 11988155

6 11988160 11988162 11988164 11988166

7 11988171 11988173 11988175 11988177

8 11988180 11988182 119881841198818611988188

9 11988191 11988193 119881951198819711988199

10 119881100 119881102 119881104 119881106 119881108 1198811010

(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments

(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails

Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work

The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image

37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by

119862119901119902= int1198862

1198861

int1198872

1198871

(119909 + 119895119910)119901

(119909 minus 119895119910)119902

119891 (119909 119910) 119889119909 119889119910 (32)

where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows

(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated

(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject

(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem

The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the

complex moments of order (119901 + 119902) can be written as follows

119862119897

119899= 119862119901119902

= int2120587

0

int+infin

0

120588119901+119902

119890119895(119901minus119902)120579

119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)

where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862

119901119902and 119862

119903

119901119902 the

relationship [33] between them is given as follows

119862119903

119901119902= 119862119901119902119890minus119895(119901minus119902)120579

(34)

where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862

119901119902| remains

unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2

Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3

4 Experimental Study and Analysis

In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows

Accuracy Rate () =Number of Correctly classified digits

Total number of digitstimes 100 (35)

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

8 Applied Computational Intelligence and Soft Computing

Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10

Order(119899)

Zernike moments of order 119899with repetition119898 (119881

119899119898)

Total number ofmoments up to order 10

0 11988100

36

1 11988111

2 11988120 11988122

3 11988131 11988133

4 11988140 11988142 11988144

5 11988151 11988153 11988155

6 11988160 11988162 11988164 11988166

7 11988171 11988173 11988175 11988177

8 11988180 11988182 119881841198818611988188

9 11988191 11988193 119881951198819711988199

10 119881100 119881102 119881104 119881106 119881108 1198811010

(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments

(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails

Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work

The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image

37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by

119862119901119902= int1198862

1198861

int1198872

1198871

(119909 + 119895119910)119901

(119909 minus 119895119910)119902

119891 (119909 119910) 119889119909 119889119910 (32)

where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows

(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated

(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject

(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem

The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the

complex moments of order (119901 + 119902) can be written as follows

119862119897

119899= 119862119901119902

= int2120587

0

int+infin

0

120588119901+119902

119890119895(119901minus119902)120579

119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)

where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862

119901119902and 119862

119903

119901119902 the

relationship [33] between them is given as follows

119862119903

119901119902= 119862119901119902119890minus119895(119901minus119902)120579

(34)

where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862

119901119902| remains

unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2

Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3

4 Experimental Study and Analysis

In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows

Accuracy Rate () =Number of Correctly classified digits

Total number of digitstimes 100 (35)

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 9: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Applied Computational Intelligence and Soft Computing 9

Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10

Order(119899) Complex moments (119862

119901119902) Complex moments of order 119899 with repetition 119897 (119862119897

119899) Total number of moments

up to order 100 119862

00 1198620

0

66

1 11986210 11986201 119862

1

1 119862minus1

1

2 11986220 11986211 11986202 119862

2

2 1198620

2 119862minus2

2

3 11986230 11986221 11986212 11986203 119862

3

3 1198621

3 119862minus1

3119862minus3

3

4 11986240 11986231 11986222 11986213 11986204 119862

4

4 1198622

4 1198620

4 119862minus2

4 119862minus4

4

5 11986250 11986241 11986232 11986223 11986214 11986205 119862

5

5 1198623

5 1198621

5 119862minus1

5 119862minus3

5 119862minus5

5

6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862

6

6 1198624

6 1198622

6 1198620

6 119862minus2

6 119862minus4

6 119862minus6

6

7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862

7

7 1198625

7 1198623

7 1198621

7 119862minus1

7 119862minus3

7 119862minus4

7 119862minus7

7

8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862

8

8 1198626

8 1198624

8 1198622

8 1198620

8 119862minus2

8 119862minus4

8 119862minus6

8 119862minus8

8

9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862

9

9 1198627

9 1198625

9 1198623

9 1198621

9 119862minus1

9 119862minus3

9 119862minus5

9 119862minus7

9 119862minus9

9

10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862

10

10 1198628

10 1198626

10 1198624

10 1198622

10 1198620

10 119862minus2

10 119862minus4

10 119862minus6

10 119862minus8

10 119862minus10

10

Table 3 Description of feature vector

Serialnumber List of moments Number of

features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66

Total 130

41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India

The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively

Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb

A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3

42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 10: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

10 Applied Computational Intelligence and Soft Computing

(a) (b)

(c) (d)

(e)

Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu

each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed

Naıve Bayes Naıve Bayes classifier for details refer to[35]

Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 11: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Applied Computational Intelligence and Soft Computing 11

Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)

DatasetsRecognition accuracy ()

ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic

1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)

2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)

3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)

4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)

5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)

6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)

7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)

8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)

9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)

10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)

11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)

12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)

Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433

Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]

The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers

The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)

43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers

on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie

Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen

the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows

119877119895=1

119873

119873

sum119894=1

119903119894

119895 (36)

The null hypothesis states that all the classifiers are equivalentand so their ranks 119877

119895should be equal To justify it the

Friedman statistic [40] is computed as follows

1205942

119865=

12119873

119896 (119896 + 1)[

[

sum119895

1198772

119895minus119896 (119896 + 1)

2

4]

]

(37)

Under the current experimentation this statistic is dis-tributed according to 1205942

119865with 119896 minus 1 (=7) degrees of freedom

Using (37) the value of 1205942119865is calculated as 3046 From the

table of critical values (see any standard statistical book) thevalue of 1205942

119865with 7 degrees of freedom is 140671 for 120572 = 005

(where 120572 is known as level of significance) It can be seen thatthe computed 1205942

119865differs significantly from the standard 1205942

119865

So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the

following formula

119865119865=

(119873 minus 1) 1205942

119865

119873(119896 minus 1) minus 1205942119865

(38)

119865119865is distributed according to the 119865-distribution with 119896 minus 1

(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865

119865is calculated as 80659 The critical value of 119865

(7 77) for 120572 = 005 is 2147 (see any standard statistical book)

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 12: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

12 Applied Computational Intelligence and Soft Computing

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

8486889092949698

100

Reco

gniti

on ac

cura

cy (

)

(a)

Classifiers

ArabicBanglaDevanagari

TeluguRoman

888990919293949596979899

100

95

confi

denc

e sco

re (

)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts

05

10152025303540

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Mod

el b

uild

ing

time (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(a)

0

1

2

3

4

5

6

Classifiers

ArabicBanglaDevanagari

TeluguRoman

Reco

gniti

on ti

me (

s)

Naiuml

ve B

ayes

Baye

s Net

MLP

SVM

Rand

om F

ores

t

Bagg

ing

Mul

ticla

ss C

lass

ifier

Logi

stic

(b)

Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition

which shows a significant difference between the standardand calculated values of 119865

119865 Thus both Friedman and Iman

et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known

as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows

CD = 119902120572

radic119896 (119896 + 1)

6119873 (39)

For the Nemenyi test the value of 119902005

for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902

005for eight classifiers is 2690 (see

Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 13: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Applied Computational Intelligence and Soft Computing 13

49555245

3125 3545 36253205 3205

3031

Differences of mean ranksDifference of mean ranksCD

0

1

2

3

4

5

6M

ean

rank

s

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|

Pairwise comparison of classifiers for the Nemenyi test

(a)

49555245

3125 3545 36253205 3205

269

Differences of mean ranksDifference of mean ranksCD

|R3minusR1|

|R3minusR2|

|R3minusR4|

|R3minusR5|

|R3minusR6|

|R3minusR7|

|R3minusR8|0

1

2

3

4

5

6

Mea

n ra

nks

Comparison of classifiers with MLP for the Bonferroni-Dunn test

(b)

Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test

difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers

44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows

(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)

(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine

moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine

moment invariant + Legendre moment + Zernikemoment (F1ndashF64)

(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)

86

88

90

92

94

96

98

100

Feature set and all possible combinations

ArabicBanglaDevanagari

RomanTelugu

Reco

gniti

on ac

cura

cy (

)

F1

ndashF18

F1

ndashF64

F19

ndashF28

F1

ndashF28

F29

ndashF64

F19

ndashF64

F65

ndashF130

F29

ndashF130

F1

ndashF130

Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier

(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)

The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations

45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 14: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

14 Applied Computational Intelligence and Soft Computing

Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09922 01758 0293

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989

Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09944 00535 0115

1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000

Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

0988 00342 00847

0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998

mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively

5 Conclusion

India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 15: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Applied Computational Intelligence and Soft Computing 15

Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09975 016 02716

1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997

Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals

Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC

ldquo0rdquo

09867 00051 00449

0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965

on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally

expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist

To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 16: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

16 Applied Computational Intelligence and Soft Computing

Acknowledgments

The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India

References

[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015

[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003

[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005

[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997

[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007

[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008

[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998

[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007

[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009

[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010

[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011

[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011

[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011

[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003

[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012

[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013

[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013

[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014

[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014

[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008

[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014

[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014

[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014

[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015

[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992

[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921

[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962

[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 17: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Applied Computational Intelligence and Soft Computing 17

[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005

[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996

[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990

[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984

[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985

[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions

in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995

[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001

[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996

[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992

[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014

[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 18: Research Article A Study of Moment Based Features …downloads.hindawi.com/journals/acisc/2016/2796863.pdfResearch Article A Study of Moment Based Features on Handwritten Digit Recognition

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014