Online Handwritten English Numeral
8/10/2019 Online Handwritten English Numeral
1/54
DISSERTATION
on
ONLINE HANDWRITTEN ENGLISH NUMERAL
CHARACTER RECOGNITION
THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR
THE DEGREE OF
Master of Technology in
IT (COURSEWARE ENGINEERING)
SUBMITTED BY
DALIA PRATIHAR
EXAMINATION ROLL NUMBER: M48CWE13-05
REGISTRATION NUMBER: 117311 of 2011-2012
UNDER THE SUPERVISION
OF
Sri Arunasish Acharya
SCHOOL OF EDUCATION TECHNOLOGY
FACULTY COUNCIL FOR UG AND PG STUDIES
IN ENGINEERING AND TECHNOLOGY
JADAVPUR UNIVERSITY
KOLKATA-700032
2013
Faculty Council for UG and PG studies in Engineering and Technology
JADAVPUR UNIVERSITY, KOLKATA- 700032
Certificate of Recommendation
This is to certify that Dalia Pratihar (M48CWE13-05) has completed her dissertation
entitled Online Handwritten English Numeral Character Recognition, under the direct supervision and guidance of Sri Arunasish Acharya, Assistant Professor, School of Education Technology, Jadavpur University, Kolkata. We are satisfied with her work, which is being presented in partial fulfillment of the requirements for the degree of MTech in I.T. (Courseware Engineering) of Jadavpur University, Kolkata - 700032.
_____________________________
Sri Arunasish Acharya
Assistant Professor,
School of Education Technology,
Jadavpur University,
Kolkata 700032
_____________________________
Prof. Samar Bhattacharya
Director,
School of Education Technology,
Jadavpur University,
Kolkata 700032
JADAVPUR UNIVERSITY
FACULTY OF ENGINEERING & TECHNOLOGY
CERTIFICATE OF APPROVAL *
The thesis at instance is hereby approved as a creditable study of an Engineering subject carried out and presented in a manner satisfactory to warrant its acceptance as a prerequisite to the degree for which it has been submitted. It is understood that by this approval the undersigned do not necessarily endorse or approve any statement made, opinion expressed or conclusion drawn therein, but approve the thesis only for the purpose for which it is submitted.
Final Examination for
Evaluation of the Thesis.
________________________________________
________________________________________
________________________________________
Signature of the Examiners
* Only in case the thesis is approved.
Declaration of Originality and Compliance of Academic Ethics
I hereby declare that this thesis contains a literature survey and original research work by the undersigned candidate, as part of her MTech in I.T. (Courseware Engineering) studies.
All information in this document has been obtained and presented in accordance with academic rules and ethical conduct.
I also declare that, as required by these rules and conduct, I have fully cited and referenced all materials and results that are not original to this work.
Name : DALIA PRATIHAR
Exam Roll Number : M48CWE13-05
Registration Number: 117311 of 2011-2012
Thesis Title : ONLINE HANDWRITTEN ENGLISH NUMERAL
CHARACTER RECOGNITION
Signature with date:
ACKNOWLEDGEMENT
I owe thanks to a great many people who helped and supported me during the
project work.
My deepest thanks to Sri Arunasish Acharya, the Supervisor of the project, for guiding and correcting various documents of mine with much attention and care. Without his constant endeavours, this project would not have been completed successfully.
I express my sincere thanks to Prof. Samar Bhattacharya, Director of School
of Education Technology, for extending his valuable support.
My deep sense of gratitude to Prof. Kalyan Kumar Datta, Dr.(Prof)
Pramathanath Basu, Dr.Ranjan Parekh, Dr. Matangini Chattopadhyay, Smt. Saswati
Mukherjee and Mr. Joydeep Mukherjee for their continuous support during the entire
course of the research.
I would also like to thank my classmates, without whose support and motivation this project would have remained a distant reality. Thanks and appreciation to all the helpful people of my department for their support.
I also extend my heartfelt thanks to my family and well wishers.
With regards,
Dalia Pratihar
Examination Roll Number: M48CWE13-05
Registration Number: 117311 of 2011-2012
SCHOOL OF EDUCATION TECHNOLOGY
JADAVPUR UNIVERSITY
KOLKATA-700032
TABLE OF CONTENTS
Executive Summary
1. Introduction
   1.1 Problem Statement
   1.2 Objectives
   1.3 Character Recognition (General Idea)
   1.4 Assumptions & Scope
   1.5 Organization of Thesis
2. Literature Survey
3. Concepts & Problem Analysis
4. Design & Development
   4.1 Introduction
   4.2 System Components
   4.3 Procedure and Flowchart
   4.4 Pseudocode
   4.5 Character Database
   4.6 Hardware & Software Used
       4.6.1 Development Platform
       4.6.2 External Interfaces
   4.7 Design Methodology
5. Results & Interpretation
6. Conclusion & Future Scope
   6.1 Conclusion
   6.2 Future Scope
References
Appendix
EXECUTIVE SUMMARY
In the present work, a prototype system for recognition of online handwritten English numeral characters has been developed. A database of this script has been created for the digits 0-9. The present database contains 150 samples in total, collected from 15 different users. A novel method of feature extraction has been proposed, which employs the concept of the image centroid. Apart from this, the range of black pixels in both the vertical and horizontal directions is computed, which forms another feature of the present algorithm. Finally, the statistical method of Euclidean distance has been employed for the final recognition. The accuracy has also been measured, based on which some scope for further improvement has been suggested.
Chapter 1
INTRODUCTION

1.1. PROBLEM STATEMENT
Develop a highly accurate online English numeral character recognition system.
1.2. OBJECTIVES
This project has the following objectives:
To design and develop a high-performance online handwritten English numeral character recognition system.
To explore the effectiveness of the present work with respect to the works already done in the field of character recognition.
1.3. CHARACTER RECOGNITION (GENERAL IDEA)
Imparting machines with human-like capabilities for accomplishing various tasks has always been an active area of interest, as it eases many tedious tasks that could earlier be performed only by human beings. Character Recognition is one such domain, where the man-machine divide narrows down significantly. Apart from data entered through the keyboard, the machine is trained to understand data fed through other methods like scanning, bar code reading etc. The birth of such Optical Character Recognition (OCR) systems can be traced back to as early as the 1870s. Presently, a few thousand systems are sold every week, and their cost too has decreased significantly, making them easily accessible to the general public. [22]
1.4. ASSUMPTIONS AND SCOPE
In a classification task, we are allotted a pattern and the job is to classify it into one out of c classes. The number of classes, c, is assumed to be known a priori and here equals 10. Each pattern class is represented by a set of feature values. We assume that each pattern is represented uniquely by a single feature vector and that it can belong to only one class. Also, all the classes are equiprobable; that is, no single class has greater priority than the others in this recognition task. Moreover, we assume that all the input digits are perfectly straight and non-tilted; this forms a constraint on the writer. Also, we take the input using a touchpad and assume that the user writes fairly accurately.
1.5. ORGANIZATION OF THESIS
In the present work, a system for Online English Handwritten Digit Recognition has been designed. Furthermore, a set of experiments is conducted using different images of the same digit written by different persons, and the results are verified. Based on the results of these experiments, the advantages and disadvantages of the method are then analyzed and discussed. Finally, some conclusions about the method are drawn.

Chapter 1: The thesis is divided into six chapters in total. The first chapter is an introduction to our project work, which includes the problem statement, the objectives of the dissertation, a brief discussion, the assumptions made and the scope of the project. Lastly, it gives an idea about the organization of the other chapters and a brief idea about their contents, as listed below:
Chapter 2: This chapter, titled Literature Survey, gives a thorough, if not exhaustive, account of the works carried out in the field of Character Recognition which have been referred to in the present work.
Chapter 3: This chapter, titled Concepts & Problem Analysis, describes the background concepts that are needed in order to analyze the problem and hence derive an optimal solution for the task in hand, that is, recognizing the input numerals.
Chapter 4: This chapter, titled Design & Development, demonstrates the novel algorithm proposed in this dissertation, the logic behind it, the database developed both for the ideal and the test cases, and the coding done in the software tool.
Chapter 5: This chapter, entitled Results & Interpretations, presents some experimental studies, their results and their interpretations.

Chapter 6: This final chapter, named Conclusion & Future Scope, concludes the thesis and discusses some of the limitations and the scope for enhancement of this project.
Chapter 2
LITERATURE SURVEY
Online character recognition has been an area of extensive study and research in the past.
Character recognition systems contribute tremendously to the advance of the automation process
and can be of significant benefit to man-machine communication in many applications, such as (1) reading aids for the blind, (2) automatic text entry into the computer for desktop publication, library cataloging and ledgering, (3) automatic reading for sorting of postal mail, bank cheques and other documents, (4) document data compression: from document image to ASCII format, (5) language processing, (6) multi-media system design, etc. [9]
In designing a highly accurate character recognition system, the challenge is to extract the most efficient features from the different character images so that they can be easily identified by the system. Several methods of feature extraction for character recognition have been reported in the literature. [10]
In paper [11], an OCR system has been designed and tested for Urdu characters. After research, the entire alphabet set of 40 characters was zoned into 21 classes. Initially, after binarization and segmentation, a chain code of each column of the image is generated. This chain code is then stored in an XML file. The file contains all 21 classes of Urdu alphabets as parent nodes or elements. Each child node has three attributes: the name of the character; the chain code of that character, calculated from its image earlier; and the Unicode of the character, which is assigned to the identified character at the end of the matching procedure. The ideal and test images are matched within some pre-set error margin.
In paper [12], a novel diagonal feature extraction scheme for recognizing off-line handwritten characters is proposed. Every character image, of size 90x60 pixels, is divided into 54 equal zones, each of size 10x10 pixels. The features are then extracted from each zone's pixels by moving along the diagonals of its respective 10x10 block. Each such 10x10 zone has 19 diagonal lines, and the foreground pixels present along each diagonal line are summed to get a single sub-feature; thus 19 sub-features are obtained from each of the 54 zones. These 19 sub-feature values are averaged to form a single feature value, which is placed in the corresponding zone. This procedure is repeated sequentially for all 54 zones. Zones whose diagonals are empty of foreground pixels have feature values equal to zero.
Finally, 54 features are extracted for each character. Apart from these, 9 and 6 features are obtained by averaging the values placed in the zones row-wise and column-wise, respectively (since the original image size was assumed to be 90x60 pixels). As a result, every character is represented by 69 (that is, 54 + 9 + 6) features. These extracted features are then used to train a feed-forward back-propagation neural network, which is used for classification.
In paper [13], some statistical features like zonal density, projection histograms (horizontal, vertical and both diagonals, that is, diagonal-1 (left diagonal) and diagonal-2 (right diagonal)) and distance profiles (from the left, right, top and bottom sides) have been used. The distance profiles are computed by calculating the distance (number of pixels) from the outer edge of the character to the image boundary. In computing the Background Directional Feature, they have considered a mask along 8 directions and then passed it over the original image to obtain the cumulative fraction of pixels in each direction. One noteworthy contribution of this paper is the incorporation of a post-processing technique to construct a meaningful sentence using the component constructs.
The recognition system proposed in paper [14] inputs some text lines and extracts certain features like projection profiles (vertical and horizontal), the density of the black pixels and the variance of the horizontal profile derivative. A text line containing words has been labeled with Htop, Hupper, Hbase and Hbottom. Moreover, while considering the vertical projection profile, more refined features have been taken into account, like the height of the middle part and the height of the upper part. Finally, they have used a Bayesian classifier for recognition of the fonts, where the fonts are known a priori.
In paper [15], the normalized and thinned character image is divided into 12 sectors, with each sector covering a fixed angle equal to 30 degrees. For each sector, the distance of each black pixel from the image centre is computed and then summed up to give a single value. Also, per sector, the slope of each black pixel with respect to the image centre is calculated. These values are normalized by dividing each value by the number of black pixels. Thus, 12x2=24 features are obtained from these 12 sectors. Next, 4 sectors have been considered, with a sector angle of 90 degrees each. Then an occupancy value is computed, defined as the proportion of black pixels in that sector with respect to the entire image. Also, the end point in each of the 4 sectors is found by the neighboring-pixel analysis method. So, the features, totaling 32, include vector distances, angles, occupancy and end-points. For recognition, both neural network and fuzzy logic techniques are then adopted.
In paper [16], skewed images are rectified by transforming them from a tilted to an upright form. After extracting the directional element feature (DEF) from each character image, City Block Distance with Deviation (CBDD) and Asymmetric Mahalanobis Distance (AMD) are proposed for rough classification and fine classification, respectively.
Based on the theory of Image Segmentation, the centroid of a character can be found. Around this centroid, the image is divided into 36 equal-angle (each 10 degrees) regions, clockwise, and the direction feature of the character distribution is obtained in paper [17]. Then the slope of each black pixel with respect to the image centroid is found. The angle thus obtained is used to attach the concerned pixel to one of the 36 regions. The handwriting restraint is removed by adopting measures to remove deflection, that is, skewness. In
the principle of high-accuracy matching, the minimal matching database has been used to approach real-time character matching.
In paper [18], after segmentation, each isolated image glyph is processed to extract features of the glyph like the character height, the character width, the number of horizontal lines (long and short), the number of vertical lines (long and short), the horizontally oriented curves, the vertically oriented curves, the number of circles, the number of sloped lines, the image centroid and special dots. The heights, the circles and the slopes are computed by applying appropriate masks. The glyphs are then ready for classification based on these features. The extracted features are passed to a Support Vector Machine (SVM), where the characters are classified by a supervised learning algorithm.
Chapter 3
CONCEPTS & PROBLEM ANALYSIS
Pixel
Short for Picture Element, a pixel is generally thought of as the smallest single component of a digital image. The address of a pixel corresponds to its physical coordinates. Each pixel is a sample of the original image; hence more samples typically provide a more accurate representation of the original. [1]
Fig. 3.1 Fig. 3.2
For example, Fig. 3.1 above shows an image drawn in Adobe Photoshop 7.0 using the brush tool. The same image, when zoomed to 1600%, reveals the component pixels, a portion of which is shown in Fig. 3.2.
The intensity of each pixel may vary. In color image systems, a color is typically represented by three or four component intensities, such as red, green and blue (RGB systems) or cyan, magenta, yellow and black (CMYK systems). [1]
The number of bits used to represent each pixel determines how many distinct colors can be displayed using that pixel. For example, in 8-bit color mode, the monitor uses 8 bits for each pixel, making it possible to display 2^8 = 256 different colors or shades of gray. [2]
Pattern Recognition
Pattern recognition can be broadly defined as a process to generate a meaningful description of data and a deeper understanding of a problem through manipulation of a large set of primitive, seemingly haphazard, quantified data. Some of that large data set may come from statistics, a document, or graphics, and is eventually expected to be in a visual form. Preprocessing of these data is necessary for error correction, for image enhancement, and for their understanding and recognition. Preprocessing operations are generally classified as low-level operations, while pattern recognition, including analysis, description and understanding of the image (or the large data set), is high-level processing. [3]
Fig. 3.3
There are three general approaches for implementing pattern recognition systems, namely:
Statistical Pattern Recognition (StatPR)
Syntactic Pattern Recognition (SynPR)
Neural Pattern Recognition (NeurPR)
several other cases where this type of recognition comes into play, like vehicle license plate recognition for security and surveillance purposes [6], traffic sign detection [7], signature verification [8] etc. More specifically, it is an application of the science of Pattern Recognition.
Irrespective of the problem type, the steps involved in recognizing a character are as follows:
1. Data Acquisition/Sensing
2. Pre-processing
3. Segmentation
4. Feature Extraction
5. Classification
6. Post-processing [9]
Fig. 3.5
The preprocessing step involves image binarization, image skeletonization/thinning and image segmentation, amongst others.
Image thinning is the process where the input image is processed to remove unnecessary pixelswhile retaining connectivity. Hilditch thinning algorithm is the most widely used algorithm in
this regard. [23][24]
Skew detection and correction is another preprocessing technique whereby slanted images arerectified to produce upright ones. Several papers demonstrate such preprocessing based ontechniques like vertical projection profiles [25], Radon transform [26] etc.
Image segmentation is a vital preprocessing step where the scanned text is partitioned into paragraphs, lines, words and finally single characters. The works carried out in this area include techniques like projection profiles [25] and the Hough Transform [27], amongst several others.
Pattern
A pattern is a d-dimensional vector x = (x1, ..., xd), where each element of the vector corresponds to the value of a different feature. A classification task is perfect when each pattern is uniquely represented by a set of unique values of these features.
(a) (b)
Fig. 3.6
For example, an acetylene structure is uniquely described by pattern (a), whereas pattern (b) describes a benzene ring. The unique orientation/pattern of the component carbon and hydrogen atoms accounts for the difference between these compounds.
Feature
These are used to describe an observation. A pattern is also referred to as a sample or an observation.
Feature extraction
The process of extracting certain attributes from the collected data. These attributes are used to map the original data to a feature space in which the data will be separable, so that classification can be performed.
Feature selection
The process of selecting the most relevant of the extracted features, that is, the features that map to the optimal feature space for performing classification.
For example, in order to discriminate between 6 and 9, apart from using features like the number of pixels, if we incorporate another feature, the phase angle, into the feature set, then the discrimination process gives the correct result.
Character
In computer software, any symbol that requires one byte of storage is known as a character. This includes all the ASCII and extended ASCII characters, including the space character.
Fig. 3.7
The above figure shows the character A written in various fonts like Times New Roman, Courier
New and Verdana, in font size 36.
Character Recognition Modes
Offline
The image of the written text may be sensed "off line" from a piece of paper by optical scanning and then fed into the system as a graphic file. From that file, the character of interest has to be located and then processed for recognition.
Online
In this case, the movements of the pen tip may be sensed "on line", or in real time, for example by a pen-based computer screen surface. In the present project, a writing tablet has been used to facilitate online digit recognition.
Euclidean Distance
Very often, especially when measuring distance in the plane, we use the formula for the Euclidean distance. According to the Euclidean distance formula, the distance between two points in the plane with coordinates (x, y) and (a, b) is given by:

dist((x, y), (a, b)) = √((x − a)² + (y − b)²)
As an example, the (Euclidean) distance between the points (2, -1) and (-2, 2) is found to be

dist((2, -1), (-2, 2)) = √((2 − (−2))² + ((−1) − 2)²) = √(4² + (−3)²)
= √(16 + 9) = √25 = 5
The source of this formula is in the Pythagorean theorem. [11]
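The same calculation is straightforward to express in code. The following Python sketch (illustrative only; the thesis implementation is in MATLAB) reproduces the worked example above and generalizes to feature vectors of any dimension:

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points (or feature vectors)
    of equal dimension."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

print(euclidean((2, -1), (-2, 2)))  # 5.0
```

The same function applies unchanged to the multi-dimensional feature vectors used by the classifier in Chapter 4.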
BMP File Format
The BMP file format, also known as the bitmap image file, the device independent bitmap (DIB) file format, or simply bitmap, is an image file format used to store bitmap digital images independently of the display device (such as a graphics adapter), especially on the Microsoft Windows and OS/2 operating systems. The main aim of DIBs is to allow bitmaps to be moved from one device to another (hence the device-independent part of the name). [20] The BMP format supports single-layer bitmaps of 1, 4, 8 or 24 bits per pixel. In our project, we have used the .bmp format for the following reasons:

BMP files store each pixel independently and hence maintain the accuracy and quality of the stored image. There is no compression and thus no loss of pixel information. In the present work, the images undergo several steps and are saved as input for the next step; hence preserving the quality of the images is a very important factor, and the bmp format is well suited for that. Also, this format can represent complex images and shapes and retains image properties even when the image is magnified. [21]
Having gained a fair idea about the concepts required, the task in hand can now be started.
Chapter 4
DESIGN & DEVELOPMENT

4.1. INTRODUCTION
In this project, a novel system for English numeral recognition has been proposed. It focuses on online recognition of the digits. The recognition system is based on the statistical approach to pattern recognition. As will be evident in the following sub-sections, the system extracts some features related to the digitized images and compares them with the ideal values to find the closest match. The degree of this similarity is assessed by means of a Euclidean distance classifier.
4.2. SYSTEM COMPONENTS
The numeral recognition system developed has several sub-systems, as follows:
1. Image Acquisition
For the ideal dataset of our system, we have used the Adobe Photoshop 6.0 Text Tool, in Times New Roman font, size 72pt, color black. The image size is immaterial here, as the digitized images are subsequently resized to an optimal dimension. The images are saved as Bitmap images in Windows format, with 24-bit pixel depth. The test images are input into the system using a tablet. The images are then read into the Matlab package using standard Matlab I/O functions.
2. Binarization
In this step, the images are converted from color images into binary ones. In order to reduce storage requirements and to increase processing speed, it is often desirable to represent gray-scale or color images as binary images by picking a threshold value. As evident from the name, a binary image is composed of only two types of pixels, black and white. So, the bits per pixel are reduced
to a single bit. Image binarization via thresholding may be either local or global [10]. In our system, the intensity threshold used to binarize the images was chosen purely experimentally.
3. Image Cropping
The binarized images are now cropped in such a way that the bounding box touches the outermost black pixels exactly; that is, no extra white space is allowed beyond the border black pixels.
4. Size Normalization
This is an important step in our system, where the cropped image is resized to a 40x40 image for comparison. This is absolutely necessary, as no constraint was imposed on the input image size; hence the images are not fit for comparison unless size-normalized.
5. Image Centroid Calculation
In this step, the centroid of an image is calculated. The image centroid, in the horizontal (or vertical) direction, is defined as

centroid (horizontal / vertical) = (sum of the positions of the black pixels in that direction) / (total number of black pixels)

So, the image centroid is not necessarily the same as the image centre.
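As a rough sketch (illustrative Python, not the thesis MATLAB code; the convention that 1 marks a black pixel is an assumption for this example), the centroid definition above amounts to:

```python
def centroid(img):
    """Centroid of the black pixels of a binary image, stored as a
    list of rows; here 1 is taken to mark a black pixel."""
    pts = [(r, c) for r, row in enumerate(img)
                  for c, v in enumerate(row) if v == 1]
    n = len(pts)
    # Average row position and average column position of the black pixels.
    return (sum(r for r, _ in pts) / n, sum(c for _, c in pts) / n)

I = [[1, 0, 0],
     [0, 1, 1],
     [1, 1, 1]]
print(centroid(I))  # approximately (1.33, 1.0)
```

Note that for this small image the centroid of the black pixels differs from the geometric centre (1, 1), exactly as the text observes.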
6. Division into blocks
The image is then divided into 4 blocks of equal size, starting from the top-left corner. Now, for each such block, we do the following:

i. Calculate the block centroid using the above formula.
ii. Find the distance of this centroid from the original image centroid.
iii. Sum up all these distances into a single value.
iv. Calculate the Euclidean distance between this value and the corresponding value for each of the ideal images. This constitutes one of our features.
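Steps i-iii above can be sketched as follows (a hypothetical Python illustration, assuming the image is a 0/1 matrix with 1 marking black pixels and that blocks containing no black pixels are skipped; the thesis code is in MATLAB):

```python
import math

def block_centroid_feature(img):
    """Sum of the distances from each quadrant's centroid to the
    whole-image centroid (steps i-iii)."""
    def cen(pts):
        n = len(pts)
        return (sum(r for r, _ in pts) / n, sum(c for _, c in pts) / n)

    pts = [(r, c) for r, row in enumerate(img)
                  for c, v in enumerate(row) if v == 1]
    gx, gy = cen(pts)                       # whole-image centroid
    h, w = len(img) // 2, len(img[0]) // 2  # quadrant dimensions
    total = 0.0
    for r0 in (0, h):
        for c0 in (0, w):
            blk = [(r, c) for r, c in pts
                   if r0 <= r < r0 + h and c0 <= c < c0 + w]
            if blk:                         # skip quadrants with no black pixels
                bx, by = cen(blk)
                total += math.hypot(bx - gx, by - gy)
    return total
```

Step iv then compares this single value against the corresponding value computed for each ideal image.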
7. Euclidean Distance with respect to image centroid
The Euclidean distance between the ideal images and the test image in question is calculated and stored as our second feature.
8. Calculation of the range of black pixels, both row-wise and column-wise
In this step, the difference between the extreme black pixels per row and per column is calculated and then summed up to give our third and fourth features, respectively. This is done as follows:
For example, let us consider the following image matrix, where 1 implies that a black pixel is present.

             col: 1  2  3     Range (row-wise)
I =   row 1       1  0  0     1 - 1 = 0
      row 2       0  1  1     3 - 2 = 1
      row 3       1  1  1     3 - 1 = 2     Sum = 0 + 1 + 2 = 3

Range (col-wise)  2  1  1     Sum = 2 + 1 + 1 = 4
These values are calculated for both the test and the ideal images, and for each test image the Euclidean distance between these values and the ideal images' values is computed to give two more of our features.
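A minimal Python sketch of this range computation (illustrative only, assuming a 0/1 matrix with 1 marking black pixels; rows or columns containing no black pixels contribute zero):

```python
def range_features(img):
    """Row-wise and column-wise sums of black-pixel ranges."""
    def span(lines):
        total = 0
        for line in lines:
            idx = [i for i, v in enumerate(line) if v == 1]
            if idx:
                total += idx[-1] - idx[0]  # rightmost minus leftmost
        return total

    # Transposing the matrix lets the same loop handle the columns.
    return span(img), span(list(zip(*img)))

I = [[1, 0, 0],
     [0, 1, 1],
     [1, 1, 1]]
print(range_features(I))  # (3, 4)
```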
9. Summation of all the Euclidean Distance values
In this stage, the four matrices containing the Euclidean distances are summed up to generate the final data matrix. In this matrix, the rows correspond to test images and the columns correspond to ideal images. So, in a given row, the column holding the minimum value identifies the recognized character. The recognition is correct if the column that contains this minimum value corresponds to the test image in question.
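The final decision rule reduces to a nearest-ideal (minimum-distance) lookup. The Python sketch below illustrates the idea with made-up feature vectors; the names and data are hypothetical, not taken from the thesis:

```python
def recognize(test_vec, ideal_vecs):
    """Return the digit whose ideal feature vector is closest
    (in Euclidean distance) to the test feature vector."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    return min(ideal_vecs, key=lambda digit: dist(test_vec, ideal_vecs[digit]))

ideals = {0: (5.0, 2.0), 1: (1.0, 9.0)}  # hypothetical summed-distance features
print(recognize((1.2, 8.5), ideals))     # 1
```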
4.3. PROCEDURE AND FLOWCHART
A complete procedure for handwritten English numeral recognition is given below:
o Capture the test numerals.
o Perform binarization.
o Perform the normalization process (cropping, resizing, thinning).
o Apply the feature extraction techniques (image centroid and pixel position range techniques).
o Implement the Euclidean distance classifier.
o Get the recognized character.
A complete flowchart of the handwritten English numeral recognition system is given below:
Fig. 4.1

4.4. PSEUDOCODE
Some of the algorithms used are as follows:
Image Binarization

Start at the top-left of the image.
For each pixel in the input image,
    if (luminance > 0.6)
        pixel := 1   /* white pixel */
    else
        pixel := 0   /* black pixel */
So, after applying this algorithm, the output binary image has values of 1 (white) for all
pixels in the input image with luminance greater than LEVEL = 0.6, and 0 (black) for all other pixels.
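The thresholding rule above can be sketched in a few lines of Python (illustrative; the thesis implementation uses MATLAB, and the [0, 1] luminance scale is an assumption here):

```python
def binarize(gray, level=0.6):
    """Threshold a gray-scale image with luminance values in [0, 1]:
    luminance > level becomes 1 (white), everything else 0 (black)."""
    return [[1 if v > level else 0 for v in row] for row in gray]

print(binarize([[0.9, 0.2], [0.61, 0.6]]))  # [[1, 0], [1, 0]]
```

Note that a luminance of exactly 0.6 maps to black, since the comparison in the pseudocode is strict.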
Image Cropping
For each image,
Identify the extreme top-left corner of the black pixels (x1, y1)
Identify the extreme top-right corner of the black pixels (x1, y2)
Identify the extreme bottom-left corner of the black pixels (x2, y1)
Identify the extreme bottom-right corner of the black pixels (x2, y2)

Crop the image so that it is entirely contained within the bounding box of the above four points. So, the dimensions of the cropping rectangle are:

(y2 - y1 + 1) x (x2 - x1 + 1)

where
y2 - y1 + 1 = image width
x2 - x1 + 1 = image height
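The bounding-box computation can be sketched as follows (illustrative Python, following the binarization convention above that 0 marks a black pixel; not the thesis MATLAB code):

```python
def crop(img):
    """Crop a binary image (0 = black) to the bounding box of its
    black pixels."""
    rows = [r for r, row in enumerate(img) if 0 in row]
    cols = [c for c in range(len(img[0])) if any(row[c] == 0 for row in img)]
    x1, x2 = rows[0], rows[-1]   # topmost and bottommost black rows
    y1, y2 = cols[0], cols[-1]   # leftmost and rightmost black columns
    return [row[y1:y2 + 1] for row in img[x1:x2 + 1]]

I = [[1, 1, 1, 1],
     [1, 0, 0, 1],
     [1, 0, 1, 1],
     [1, 1, 1, 1]]
print(crop(I))  # [[0, 0], [0, 1]]
```

The cropped result has exactly the (y2 - y1 + 1) x (x2 - x1 + 1) dimensions stated above.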
Image resizing
For each image, normalize it such that the resultant image has width = 40 pixels and height = 40 pixels.
Image Thinning
For each image,
    Remove pixels so that the image shrinks to a minimally connected stroke. This removal operation is repeated until the resulting image has the following properties:
        As thin as possible
        Connected
        Centered
Centroid Calculation
Let
    n = number of black pixels = 0 (initially)
    sumr = sum of the x co-ordinates of the black pixels = 0 (initially)
    sumc = sum of the y co-ordinates of the black pixels = 0 (initially)
Start at top-left corner of the image.
while (the extreme bottom-right corner is not reached)
    if (a black pixel is found)
        n := n + 1   /* increment the black pixel count by one */
        sumr := sumr + x co-ordinate of the black pixel
        sumc := sumc + y co-ordinate of the black pixel
    else
        proceed to the next pixel
end
Finally, the centroid (x, y) is:

x = sumr / n,   y = sumc / n
Division of image into blocks
Start at the top-left corner of the image.
The 40x40 image is divided into four equal 20x20 blocks: North-West, North-East, South-West and South-East, as shown below.

Fig. 4.2
Range calculation
Row-wise:
Start at top-left of the image.
For each row,
    Diff(d) = (y co-ordinate of the rightmost black pixel) - (y co-ordinate of the leftmost black pixel)
End

Sum = sum of all the values (d) over the rows
Columnwise:
Start at top-left of the image.
For each column,
Diff(d) = |(x co-ordinate of bottommost black pixel) - (x co-ordinate of topmost black pixel)|
end

Sum = sum of the values d over all columns
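The row-wise range feature can be sketched in Python (illustrative; `rowwise_range_sum` is a hypothetical helper, and the column-wise version is the same computation with rows and columns swapped):

```python
def rowwise_range_sum(img):
    """For each row, take |rightmost black column - leftmost black column|
    and sum these ranges over all rows (rows with no black pixel add 0)."""
    total = 0
    for row in img:
        cols = [j for j, px in enumerate(row) if px == 0]
        if cols:
            total += max(cols) - min(cols)
    return total

img = [[0, 1, 0, 1],   # black at columns 0 and 2 -> range 2
       [1, 0, 0, 1],   # black at columns 1 and 2 -> range 1
       [1, 1, 1, 1]]   # no black pixels          -> range 0
print(rowwise_range_sum(img))  # 3
```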
4.5. CHARACTER DATABASE
One hundred and fifty samples, 10 from each of 15 persons, were collected to construct our test image database, while, as discussed earlier, the Text Tool of the Adobe Photoshop software was used to generate the 10 ideal digitized images.
4.6. HARDWARE & SOFTWARE USED
4.6.1. Development Platform
HARDWARE
Processor: Intel Core 2 Duo T5870 @ 2.00 GHz
RAM: 2.99 GB
Motherboard: Gigabyte Technology 8I865G (Intel chipset)
OPERATING SYSTEM USED
Microsoft Windows XP Professional Version 2002, Service Pack 3
SOFTWARE
MATLAB 7.12
Adobe Photoshop 7.0
MS Paint
4.6.2. External Interfaces
HARDWARE INTERFACES
Tablet and pen
SOFTWARE INTERFACES
A GUI was built in MATLAB to take the user's command for recognizing a character.
4.7. DESIGN METHODOLOGY
The design started with collecting handwritten samples of English numerals. A blank image file of 200x200 pixels and 72 ppi resolution is created in Adobe Photoshop, and the writer writes the digit into this file by means of a tablet and its associated pen.
Fig. 4.3
Thereafter, the file is saved in the appropriate folder with the .bmp extension.
Fig. 4.4
In this way all handwritten samples of English numeral characters have been collected.
Fig. 4.5
Then these images are imported into MATLAB using in-built functions.
After cropping, the images are saved in a separate folder.
Fig. 4.6
Thereafter, these are resized and thinned and kept aside in a new folder.
Fig. 4.6
The final results are obtained once the user clicks a button developed using MATLAB guide.
Fig. 4.7
Chapter 5
RESULTS & INTERPRETATIONS

A sample of the final data matrix, which shows the Euclidean distance values, is shown below:
Fig. 5.1
Here, the orange box denotes the digit recognized by the system, and the green one denotes the digit that should have been recognized in case of a mismatch. The result is emphasized by means of a flag, where R = correct recognition and W = wrong recognition.
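The underlying minimum-Euclidean-distance decision can be sketched in Python. This is a simplified stand-in using hypothetical 2-D feature vectors; the project actually combines several feature distances before taking the minimum:

```python
import math

def classify(test_feats, ideal_feats):
    """For each test feature vector, pick the ideal digit whose feature
    vector is nearest in Euclidean distance."""
    labels = []
    for t in test_feats:
        dists = [math.dist(t, ideal) for ideal in ideal_feats]
        labels.append(dists.index(min(dists)))  # row index = digit 0..9
    return labels

# Hypothetical 2-D features (e.g. centroids) for ideal digits 0, 1 and 2
ideal = [(10.0, 10.0), (20.0, 5.0), (30.0, 30.0)]
test  = [(11.0, 9.0), (29.0, 31.0)]
print(classify(test, ideal))  # [0, 2]
```

A flag such as R/W then follows from comparing each predicted label with the true digit.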
The final results are displayed in the MATLAB command window as follows:
Fig. 5.2
Here the digits shown in the column represent the recognized digits, whether correct or incorrect, and the accuracy shows the percentage of digits that have been recognized correctly.
After analyzing the results of all 150 samples, we derive the actual accuracy of the system developed. One thing that must be noted here is that the accuracy so derived depends entirely on the size of the test image set and hence is not fixed.
The total dataset accuracy values are saved in a certain folder as follows:
Fig. 5.3
The accuracy values and the digits recognized for all the datasets from 1-15 are shown below:

Fig. 5.4
So, the overall accuracy for the 150 (15 x 10) samples comes out to be

= ((100.00 + 100.00 + 53.33 + 100.00 + 80.00 + 60.00 + 73.33 + 73.33 + 73.33 + 80.00) / 10) %
= (793.32 / 10) %
= 79.332 %
~ 80 %
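The averaging above can be double-checked with a few lines of Python:

```python
# Per-digit accuracies (digits 0-9) as reported above
per_digit = [100.00, 100.00, 53.33, 100.00, 80.00,
             60.00, 73.33, 73.33, 73.33, 80.00]

overall = sum(per_digit) / len(per_digit)
print(round(overall, 3))  # 79.332
```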
Of this, the best accuracy is for 0, 1 and 3. The worst accuracy is for digit 2.
Chapter 6
CONCLUSION & FUTURE SCOPE
6.1. CONCLUSION
In this project, a system for recognizing handwritten English numerals has been developed by adopting a qualitative research methodology. The research is termed qualitative in the sense that it began with a broad problem statement, identifying English numerals, and plans were then chalked out to accomplish that task. After initial research and a literature survey, a fair number of concepts were drawn, based on which an efficient algorithm was proposed. Thereafter a top-down design methodology was adopted, with some inherent design constraints as discussed earlier. Experiments were then carried out to collect sample data, and existing statistical methods were used to process that data and derive the final interpretations of the system's results. Conclusions regarding the efficiency of the system were thus drawn from the statistical findings. A fair recognition accuracy of 80% has been achieved, which is quite acceptable given the novelty of the proposed algorithm.
6.2. FUTURE SCOPE
However, the system developed has some shortcomings. First, it is capable of testing only isolated characters from 0-9; the user cannot expect it to recognize a number containing multiple digits. Second, there is no option for skew and slant normalization. Also, the system expects the user to write all the digits 0-9 for proper recognition, as the backend logic is coded accordingly.

The accuracy of the system can be increased by applying a classifier other than the Euclidean distance based one. The approach can be extended to the recognition of words, sentences and documents by implementing segmentation techniques. Post-processing checks incorporating semantic information can also help increase the accuracy and efficiency of the system. A neural network could be applied in future to make the system learn and adapt, which would do away with the need for a database. Finally, a better user interface is certainly possible, one that incorporates all the steps from image input to recognition within itself rather than as disjoint software components. More transparency could also be built into the system so as to show the user what is actually happening and what the outputs of each step are.
Nevertheless, the system is a useful piece of work toward the development of a viable Optical Character Recognition system.
References
[1]Pixel - Wikipedia, the free encyclopediahttps://en.wikipedia.org/wiki/Pixel
[2] What is pixel - A Word Definition From the Webopedia Computer Dictionaryhttp://www.webopedia.com/TERM/P/pixel.html
[3] Bow, S.T., Pattern Recognition, Marcel Dekker, New York, 1984, ISBN 0-8247-0659-5,Page v
[4] Schalkoff. Robert, PATTERN RECOGNITION: STATISTICAL, STRUCTURAL AND NEURAL APPROACHES, John Wiley & Sons, 01-Sep-2007, page 2, page 18-19, 2007
[5] http://www.byclb.com/TR/Tutorials/neural_networks/ch1_1.htm
[6] Duan. Tran Duc, Du. Tran Le Hong, Phuoc. Tran Vinh, Hoang. Nguyen Viet, Building anAutomatic Vehicle License-Plate Recognition System, Intl. Conf. in Computer Science RIVF05, February 21-24, 2005, Can Tho, Vietnam, page 59-63, 2005
[7] Garcia. Miguel Angel, Sotelo. Miguel Angel, Gorostiza. E. Martin, Traffic Sign Detection inStatic Images using Matlab, IEEE 0-7803-7937-3/'03, page 212-215, 2003
[8] Deng. Peter Shaohua, Liao. Hong-Yuan Mark, Ho. Chin Wen, Tyan. Hsiao-Rong, Wavelet-Based Off-Line Handwritten Signature Verification, Computer Vision and Image Understanding, Vol. 76, No. 3, December, page 173-190, 1999
[9] Pal. U., Chaudhuri. B.B., Indian script character recognition: a survey, Pattern Recognition 37 (2004), page 1887-1899, 2004
http://www.elsevier.com/locate/patcog
[10] Arica. Nafiz, Yarman-Vural. Fatos T., An Overview of Character Recognition Focused on Off-Line Handwriting, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 31, no. 2, May 2001, page 216-233, 2001
[11]Nawaz. Tabassam, Naqvi. Syed Ammar Hassan Shah, Rehman. Habib ur, Faiz. Anoshia,Optical Character Recognition System for Urdu (Naskh Font) Using Pattern Matching
Technique, International Journal of Image Processing, (IJIP)Volume (3) : Issue (3), page 92-104
[12] J.Pradeep, E.Srinivasan, S.Himavathi, Diagonal Based Feature Extraction For HandwrittenAlphabets Recognition System Using Neural Network, International Journal of ComputerScience & Information Technology (IJCSIT), Vol 3, No 1, Feb 2011, page 27-38, 2011
[13] Siddharth. Kartar Singh, Jangid. Mahesh, Dhir. Renu, Rani. Rajneesh, HandwrittenGurmukhi Character Recognition Using Statistical and Background Directional DistributionFeatures, ISSN : 0975-3397 Vol. 3 No. 6 June 2011, page 2332-2345, 2011
[14] Zramdini. Abdel Wahab, Ingold. Rolf, Optical font recognition from projection profiles,
Electronic Publishing, Vol. 6(3), (September 1993), page 249-260, 1993
[15]Madasu. Vamsi K., Lovell. Brian C., Hanmandlu. M., Hand printed Character Recognitionusing Neural Networks
[16] Kato. Nei, Omachi. Shinichiro, Aso. Hirotomo, Nemoto. Yoshiaki, A HandwrittenCharacter Recognition System Using Directional Element Feature and Asymmetric MahalanobisDistance, IEEE Transactions On Pattern Analysis And Machine Intelligence, vol. 21, no. 3,March 1999, page 258-262, 1999
[17] Lin. Huiqin, Ou. Wennuan, Zhu. Tonglin, The Research of Algorithm for Handwritten
Character Recognition in Correcting Assignment System, IEEE Sixth International Conferenceon Image and Graphics, 978-0-7695-4541-7/11, page 456-460, 2011
[18] R. Seethalakshmi, T.R. Sreeranjani, Balachandar T., Singh. Abnikant, Singh. Markandey,Ratan. Ritwaj, Kumar. Sarvesh, Optical Character Recognition for printed Tamil text usingUnicode, Journal of Zhejiang University SCIENCE ISSN 1009-3095, page 1297-1305, 2005
[19] The Distance Formulahttp://www.cut-the-knot.org/pythagoras/DistanceFormula.shtml
[20] BMP file format - Wikipedia, the free encyclopediahttp://en.wikipedia.org/wiki/BMP_file_format
[21] The Advantages of BMP _ eHowhttp://www.ehow.com/list_6118423_advantages-bmp.html
[22] Line Eikvil, OCR Optical Character Recognition, December 1993http://citeseerx.ist.psu.edu
[23] Yin. Ming, Narita. Seinosuke, Speedup Method for Real-Time Thinning Algorithm,Digital Image Computing Techniques and Applications, 21--22 January 2002, Melbourne,Australia, 2002
[24] Shanthi. N, Duraiswamy. K, Preprocessing Algorithms for the Recognition of TamilHandwritten Characters, 3rd International CALIBER - 2005, Cochin, 2-4 February, 2005, page77-82, 2005
[25] Papandreou. A., Gatos. B., A Novel Skew Detection Technique Based on VerticalProjections, International Conference on Document Analysis and Recognition, 2011, IEEE1520-5363/11, page 384-388, 2011
[26] Louloudis. G., Gatos. B., Pratikakis. I., Halatsis. C., Text line and word segmentation of handwritten documents, Pattern Recognition 42 (2009), page 3169-3183, 2009
http://www.elsevier.com/locate/pr
Appendix
Some code snippets are given below:
Final_34.m
function varargout = final_34(varargin)
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @final_34_OpeningFcn, ...
                   'gui_OutputFcn',  @final_34_OutputFcn, ...
                   'gui_LayoutFcn',  [], ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end
if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
end
function final_34_OpeningFcn(hObject, ~, handles, varargin)
handles.output = hObject;
end

function varargout = final_34_OutputFcn(~, ~, handles)
varargout{1} = handles.output;
end

function recognize_Callback(~, ~, ~)
call_crop_test34();
prepro_test34();
centroid_call();
centroid_call_test34();
callcentroid_seg();
callcentroid_seg_test34();
centroidseg_dist();
centroidseg_dist_test34();
cal_diff_seg_dist34();
allsegdist_diff();
call_getextrm_ideal();
call_getextrm_test34();
cal_getextrm_ideal_col();
cal_getextrm_test_col34();
cal_diff_xtrm();
cal_diff_xtrm_col();
calceuclid();
summing_run_cent();
min_vals();
accu();
end
accu.m
count = 0;
for i = 1:10
    if ((i-1) == in(i))
        count = count + 1;
    end
end
perc = (count/10) * 100;
disp('accuracy=');
disp(perc);
min_vals.m
% One call of min over the full matrix gives, for every test digit,
% the column index of the nearest ideal digit.
[minimum, in] = min(sum_run_cent, [], 2);
in = in - 1;   % convert 1-based column index to the digit 0-9
disp(in);
call_crop_test34.m
for i = 1:10
    cropping34(i);
end
cropping34.m
function []=cropping34(f)
b = imread(strcat('C:\Program Files\MATLAB\R2011a\bin\test images new34\', num2str(f-1), '.bmp'));
b = im2bw(b, 0.6);
k = 0;
s = size(b);
for i = 1:s(1)
    for j = 1:s(2)
        if (b(i,j) == 0)
            k = k + 1;
            arr_x(k) = i;   % rows containing black pixels
            arr_y(k) = j;   % columns containing black pixels
        end
    end
end

x1 = min(arr_x);
x2 = max(arr_x);
y1 = min(arr_y);
y2 = max(arr_y);

rect = [y1-1 x1-1 y2-y1+1 x2-x1+1];
b = imcrop(b, rect);

imwrite(b, strcat('C:\Program Files\MATLAB\R2011a\bin\test34_cropped\', num2str(f-1), '.bmp'), 'bmp');
end
prepro_test34.m
for i = 1:10
    b = imread(strcat('C:\Program Files\MATLAB\R2011a\bin\test34_cropped\', num2str(i-1), '.bmp'));
    b = imresize(b, [40 40]);
    b = bwmorph(~b, 'thin', Inf);   % thin to a minimally connected stroke
    b = ~b;
    imwrite(b, strcat('C:\Program Files\MATLAB\R2011a\bin\test34_thinned\', num2str(i-1), '.bmp'), 'bmp');
end
centroid_call.m
for i = 1:10
    p(i, 1:2) = getcentroid(i);
end
centroid_call_test34.m
for i = 1:10
    q(i, 1:2) = getcentroid_test34(i);
end
getcentroid.m
function cent=getcentroid(i)
f = imread(strcat('C:\MATLAB7\bin\imagedatabase\', num2str(i-1), '.bmp'));
ti = im2bw(f, 0.6);   % binarize; ti and s are needed by the loops below
s = size(ti);
c = 0;
sumr = 0;
sumc = 0;

for i = 1:s(1)
    for j = 1:s(2)
        if (ti(i,j) == 0)
            sumr = sumr + i;
            sumc = sumc + j;
            c = c + 1;
        end
    end
end

cr = sumr / c;
cc = sumc / c;
cent = [cr cc];
end
getcentroid_test34.m
function cent_test=getcentroid_test34(i)
f = imread(strcat('C:\Program Files\MATLAB\R2011a\bin\test34_thinned\', num2str(i-1), '.bmp'));
ti = im2bw(f, 0.6);   % binarize; ti and s are needed by the loops below
s = size(ti);
c = 0;
sumr = 0;
sumc = 0;

for i = 1:s(1)
    for j = 1:s(2)
        if (ti(i,j) == 0)
            sumr = sumr + i;
            sumc = sumc + j;
            c = c + 1;
        end
    end
end

cr = sumr / c;
cc = sumc / c;
cent_test = [cr cc];
end
centroid_seg.m
for i = 1:10
    cent_seg(i, 1:8) = getcentroid_seg(i);
end
getcentroid_seg.m
function cent=getcentroid_seg(i)
f = imread(strcat('C:\MATLAB7\bin\imagedatabase\', num2str(i-1), '.bmp'));
ti = im2bw(f, 0.6);   % binarize; ti is needed by the loops below
s = size(ti);

c_seg1 = 0; sumr_seg1 = 0; sumc_seg1 = 0;

for i = 1:s(1)/2
    for j = 1:s(2)/2
        if (ti(i,j) == 0)
            sumr_seg1 = sumr_seg1 + i;
            sumc_seg1 = sumc_seg1 + j;
            c_seg1 = c_seg1 + 1;
        end
    end
end

cr_seg1 = sumr_seg1 / c_seg1;
cc_seg1 = sumc_seg1 / c_seg1;

%****************************** seg2 *****************************%

c_seg2 = 0; sumr_seg2 = 0; sumc_seg2 = 0;

for i = 1:s(1)/2
    for j = (s(2)/2)+1:s(2)
        if (ti(i,j) == 0)
            sumr_seg2 = sumr_seg2 + i;
            sumc_seg2 = sumc_seg2 + j;
            c_seg2 = c_seg2 + 1;
        end
    end
end

cr_seg2 = sumr_seg2 / c_seg2;
cc_seg2 = sumc_seg2 / c_seg2;

%****************************** seg3 *****************************%

c_seg3 = 0; sumr_seg3 = 0; sumc_seg3 = 0;

for i = (s(1)/2)+1:s(1)
    for j = (s(2)/2)+1:s(2)
        if (ti(i,j) == 0)
            sumr_seg3 = sumr_seg3 + i;
            sumc_seg3 = sumc_seg3 + j;
            c_seg3 = c_seg3 + 1;
        end
    end
end

cr_seg3 = sumr_seg3 / c_seg3;
cc_seg3 = sumc_seg3 / c_seg3;

%****************************** seg4 *****************************%

c_seg4 = 0; sumr_seg4 = 0; sumc_seg4 = 0;

for i = (s(1)/2)+1:s(1)
    for j = 1:s(2)
        if (ti(i,j) == 0)
            sumr_seg4 = sumr_seg4 + i;
            sumc_seg4 = sumc_seg4 + j;
            c_seg4 = c_seg4 + 1;
        end
    end
end
cr_seg4 = sumr_seg4 / c_seg4;
cc_seg4 = sumc_seg4 / c_seg4;

cent = [cr_seg1 cc_seg1 cr_seg2 cc_seg2 cr_seg3 cc_seg3 cr_seg4 cc_seg4];
end
callcentroid_seg_test.m
for i = 1:10
    cent_seg_test(i, 1:8) = getcentroid_seg_test34(i);
end
getcentroid_seg_test34.m
function cent=getcentroid_seg_test34(i)
f = imread(strcat('C:\Program Files\MATLAB\R2011a\bin\test34_thinned\', num2str(i-1), '.bmp'));
% calculation of per-segment centroids
ti = im2bw(f, 0.6);
s = size(ti);

c_seg1 = 0; sumr_seg1 = 0; sumc_seg1 = 0;

for i = 1:s(1)/2
    for j = 1:s(2)/2
        if (ti(i,j) == 0)
            sumr_seg1 = sumr_seg1 + i;
            sumc_seg1 = sumc_seg1 + j;
            c_seg1 = c_seg1 + 1;
        end
    end
end

cr_seg1 = sumr_seg1 / c_seg1;
cc_seg1 = sumc_seg1 / c_seg1;
%****************************** seg2 *****************************%

c_seg2 = 0; sumr_seg2 = 0; sumc_seg2 = 0;

for i = 1:s(1)/2
    for j = (s(2)/2)+1:s(2)
        if (ti(i,j) == 0)
            sumr_seg2 = sumr_seg2 + i;
            sumc_seg2 = sumc_seg2 + j;
            c_seg2 = c_seg2 + 1;
        end
    end
end

cr_seg2 = sumr_seg2 / c_seg2;
cc_seg2 = sumc_seg2 / c_seg2;

%****************************** seg3 *****************************%

c_seg3 = 0; sumr_seg3 = 0; sumc_seg3 = 0;

for i = (s(1)/2)+1:s(1)
    for j = (s(2)/2)+1:s(2)
        if (ti(i,j) == 0)
            sumr_seg3 = sumr_seg3 + i;
            sumc_seg3 = sumc_seg3 + j;
            c_seg3 = c_seg3 + 1;
        end
    end
end

cr_seg3 = sumr_seg3 / c_seg3;
cc_seg3 = sumc_seg3 / c_seg3;

%****************************** seg4 *****************************%

c_seg4 = 0; sumr_seg4 = 0; sumc_seg4 = 0;
for i = (s(1)/2)+1:s(1)
    for j = 1:s(2)
        if (ti(i,j) == 0)
            sumr_seg4 = sumr_seg4 + i;
            sumc_seg4 = sumc_seg4 + j;
            c_seg4 = c_seg4 + 1;
        end
    end
end

cr_seg4 = sumr_seg4 / c_seg4;
cc_seg4 = sumc_seg4 / c_seg4;

cent = [cr_seg1 cc_seg1 cr_seg2 cc_seg2 cr_seg3 cc_seg3 cr_seg4 cc_seg4];
end

centroidseg_dist.m
for i = 1:10
    dist_seg1(i,1) = (p(i,1)-cent_seg(i,1))^2 + (p(i,2)-cent_seg(i,2))^2;
    dist_seg1(i,1) = sqrt(dist_seg1(i,1));

    dist_seg1(i,2) = (p(i,1)-cent_seg(i,3))^2 + (p(i,2)-cent_seg(i,4))^2;
    dist_seg1(i,2) = sqrt(dist_seg1(i,2));

    dist_seg1(i,3) = (p(i,1)-cent_seg(i,5))^2 + (p(i,2)-cent_seg(i,6))^2;
    dist_seg1(i,3) = sqrt(dist_seg1(i,3));

    dist_seg1(i,4) = (p(i,1)-cent_seg(i,7))^2 + (p(i,2)-cent_seg(i,8))^2;
    dist_seg1(i,4) = sqrt(dist_seg1(i,4));
end

for i = 1:10
    dist_seg1(i,5) = dist_seg1(i,1) + dist_seg1(i,2) + dist_seg1(i,3) + dist_seg1(i,4);
end
centroidseg_dist_test34.m
for i = 1:10
    dist_seg1_test(i,1) = (q(i,1)-cent_seg_test(i,1))^2 + (q(i,2)-cent_seg_test(i,2))^2;
    dist_seg1_test(i,1) = sqrt(dist_seg1_test(i,1));

    dist_seg1_test(i,2) = (q(i,1)-cent_seg_test(i,3))^2 + (q(i,2)-cent_seg_test(i,4))^2;
    dist_seg1_test(i,2) = sqrt(dist_seg1_test(i,2));

    dist_seg1_test(i,3) = (q(i,1)-cent_seg_test(i,5))^2 + (q(i,2)-cent_seg_test(i,6))^2;
    dist_seg1_test(i,3) = sqrt(dist_seg1_test(i,3));

    dist_seg1_test(i,4) = (q(i,1)-cent_seg_test(i,7))^2 + (q(i,2)-cent_seg_test(i,8))^2;
    dist_seg1_test(i,4) = sqrt(dist_seg1_test(i,4));
end

for i = 1:10
    dist_seg1_test(i,5) = dist_seg1_test(i,1) + dist_seg1_test(i,2) + dist_seg1_test(i,3) + dist_seg1_test(i,4);
end
cal_diff_seg_dist34.m
for i = 1:10
    for m = 1:10
        diff_seg(i,m) = abs(dist_seg1_test(i,5) - dist_seg1(m,5));
    end
end
allsegdist_diff.m
sum_seg_dist_four_diff = diff_seg;
for i = 1:10
    for j = 1:10
        if (isnan(sum_seg_dist_four_diff(i,j)))
            sum_seg_dist_four_diff(i,j) = 9999;   % NaN arises when a segment holds no black pixels
        end
    end
end
call_getextrm_ideal.m
for i = 1:10
    extrm(i,1) = getextrm_ideal(i);
end
call_getextrm_test34.m
for i = 1:10
    extrm_test(i,1) = getextrm_test34(i);
end
getextrm_ideal.m
function xt=getextrm_ideal(i)
f = imread(strcat('C:\MATLAB7\bin\imagedatabase\', num2str(i-1), '.bmp'));
ti = im2bw(f, 0.6);   % binarize; ti is needed by the loops below
s = size(ti);
left(40) = zeros();
right(40) = zeros();
diff(40) = zeros();

% leftmost black pixel in each row
for i = 1:s(1)
    for j = 1:s(2)
        if (ti(i,j) == 0)
            left(i) = j;
            break;
        end
    end
end

% rightmost black pixel in each row (scan columns from right to left)
for m = 1:s(1)
    for n = s(2):-1:1
        if (ti(m,n) == 0)
            right(m) = n;
            break;
        end
    end
end

diff = abs(right - left);
disp(diff);
xt = 0;
for p = 1:40
    xt = xt + diff(p);
end
disp(xt);
end
getextrm_test34.m
function xt=getextrm_test34(i)
f=imread(strcat('C:\Program Files\MATLAB\R2011a\bin\test34_thinned\',num2str(i-1),'.bmp'));
ti = im2bw(f, 0.6);   % binarize; ti is needed by the loops below
s = size(f);
left(40) = zeros();
right(40) = zeros();
diff(40) = zeros();

% leftmost black pixel in each row
for i = 1:s(1)
    for j = 1:s(2)
        if (ti(i,j) == 0)
            left(i) = j;
            break;
        end
    end
end

% rightmost black pixel in each row (scan columns from right to left)
for m = 1:s(1)
    for n = s(2):-1:1
        if (ti(m,n) == 0)
            right(m) = n;
            break;
        end
    end
end

diff = abs(right - left);
disp(diff);
xt = 0;
for p = 1:40
    xt = xt + diff(p);
end
disp(xt);
end
cal_getextrm_ideal_col.m
for i = 1:10
    extrm_col(i,1) = getextrm_ideal_col(i);
end
cal_getextrm_test_col34.m
for i = 1:10
    extrm_col_test(i,1) = getextrm_test_col34(i);
end
getextrm_ideal_col.m
function xt=getextrm_ideal_col(i)
f=imread(strcat('C:\MATLAB7\bin\imagedatabase\',num2str(i-1),'.bmp'));
ti = im2bw(f, 0.6);   % binarize; ti is needed by the loops below
s = size(f);

top(40) = zeros();
bottom(40) = zeros();
diff(40) = zeros();

% topmost black pixel in each column
for j = 1:s(2)
    for i = 1:s(1)
        if (ti(i,j) == 0)
            top(j) = i;
            break;
        end
    end
end

% bottommost black pixel in each column (scan rows from bottom to top)
for n = 1:s(2)
    for m = s(1):-1:1
        if (ti(m,n) == 0)
            bottom(n) = m;
            break;
        end
    end
end

diff = abs(bottom - top);
disp(diff);
xt = 0;
for p = 1:40
    xt = xt + diff(p);
end
disp(xt);
end
getextrm_test_col34.m
function xt=getextrm_test_col34(i)
f = imread(strcat('C:\Program Files\MATLAB\R2011a\bin\test34_thinned\', num2str(i-1), '.bmp'));
ti = im2bw(f, 0.6);   % binarize; ti is needed by the loops below
s = size(f);
top(40) = zeros();
bottom(40) = zeros();
diff(40) = zeros();

% topmost black pixel in each column (ti is the binarized image)
for j = 1:s(2)
    for i = 1:s(1)
        if (ti(i,j) == 0)
            top(j) = i;
            break;
        end
    end
end

% bottommost black pixel in each column (scan rows from bottom to top)
for n = 1:s(2)
    for m = s(1):-1:1
        if (ti(m,n) == 0)
            bottom(n) = m;
            break;
        end
    end
end

diff = abs(bottom - top);
disp(diff);
xt = 0;
for p = 1:40
    xt = xt + diff(p);
end
disp(xt);
end
cal_diff_xtrm.m
for i = 1:10
    for j = 1:10
        diff_xtrm(i,j) = abs(extrm_test(i,1) - extrm(j,1));
    end
end
cal_diff_xtrm_col.m
for i = 1:10
    for j = 1:10
        diff_xtrm_col(i,j) = abs(extrm_col_test(i,1) - extrm_col(j,1));
    end
end
caleuclid.m
m = 1;
n = 1;
for i = 1:10
    for j = 1:10
        D1(m,j) = sqrt((q(i,1)-p(j,1))^2 + (q(i,2)-p(j,2))^2);
    end
    m = m + 1;
end
disp(D1);
summing_run_cent.m
sum_run_cent = sum_seg_dist_four_diff + D1 + diff_xtrm + diff_xtrm_col;
sum_run_cent = sum_run_cent / 100;
disp(sum_run_cent);