Application of Genetic Algorithm in Optical Character...
Transcript of Application of Genetic Algorithm in Optical Character...
54 Farhan Inamdar, Pankaj Dadhich, Yogen Lohite, Yogesh Chandrate
International Journal of Innovations & Advancement in Computer Science
IJIACS
ISSN 2347 – 8616
Volume 6, Issue 2
February 2017
Application of Genetic Algorithm in Optical Character
Recognition Techniques
Farhan Inamdar1, Pankaj Dadhich
2, Yogen Lohite
3, Yogesh Chandrate
4,
1,2,3,4Lecturer Guru Gobind Singh Polytechnic, Nashik
Abstract— Optical Character recognition is major
technique in Image Processing fields which has various
application in obtaining more clear character in the field
of Medical, Industrial processes as well as Domestic. In
this technology scanned, Handwritten, printed or
typewritten text is converted into editable format to
obtain its more clear digital version.Charector
recognition may be carried out by online and offline in
online recognition stylus or electronic tablet is interface
with computer to abstract information about character
whereas in off-line recognition target is digitally scanned
by optical scanner.
In this paper I have concentrated on offline character
recognition along with genetic algorithm and its
advantages
Keywords- Genetic Algorithm ,Digitization Optical
Character Recognition(OCR),Pre-processing
1. INTRODUNCTION
Optical Character recognition is major technique in
Image Processing fields which has various
application in obtaining more clear character in the
field of Medical, Industrial processes as well as
Domestic. In this technology scanned, Handwritten,
printed or typewritten text is converted into editable
format to obtain its more clear digital
version.Charector recognition may be carried out by
online and offline in online recognition stylus or
electronic tablet is interface with computer to
abstract information about character whereas in off-
line recognition target is digitally scanned by
optical scanner. OCR consists of many phases such
as Pre-processing, Segmentation, Feature Extraction,
Classifications and Recognition. The input of one
step is the output of next step. The task of pre-
processing relates to the removal of noise and
variation in handwritten. Several area where OCR
used including mail sorting, bank processing,
document reading and postal address recognition
require offline handwriting recognition systems,
pattern recognition
Phases of General Character Recognition System
A) Pre-processing: In The pre-processing
phase, there is a series of operations performed on
the scanned input image such as noise removing and
normalisation. It enhances the image rendering it
suitable for segmentation the gray-level character
image is normalized into a window sized. After
noise reduction, we produced a bitmap image . Then,
the bitmap image was transformed into a thinned
image.
B) Segmentation: The Segmentation phase is
the most important process. Segmentation is done
by separation from the individual characters of an
image. Segmentation is done to make the separation
between the individual characters of an image.
Sometimes components of two adjacent characters
may be touched or overlapped and this situation
create difficulties in the segmentation task.
55 Farhan Inamdar, Pankaj Dadhich, Yogen Lohite, Yogesh Chandrate
International Journal of Innovations & Advancement in Computer Science
IJIACS
ISSN 2347 – 8616
Volume 6, Issue 2
February 2017
Touching or overlapping problem occurs frequently
because of modified characters in upper-zone and
lower-zone.
C) Feature Extraction: In this phase, features
of individual character are extracted. The
performance of an each character recognition
system that depends on the features that are
extracted. The extracted features from input
character should allow classification of a character
in a unique way. We used diagonal features,
intersection and open end points features, transition
features, zoning features, directional features,
parabola curve fitting–based features, and power
curve fitting–based features in order to find the
feature set for a given character.
D) Classification : The classification is the
process of identifying each character and assigning
to it the correct character class, so that texts in
images are converted in to computer understandable
form. This process used extracted feature of text
image for classification i.e. input to this stage is
output of the feature extraction process. Classifiers
compare the input feature with stored pattern and
find out best matching class for input. There are
many technique used for classification such as
Artificial Neural Network (ANN), Template
Matching, Support Vector Matching (SVM) etc.
E) Post-processing Module : The output of
Text Recognition Module is in the form text data
which is understand by computer, So there need to
store it in to some proper format( i.e. txt or MS-
Word )for farther use such as Editing or Searching
in that data.
2. LITERATURE REVIEW
Claudiu et al. (2011) [1] has investigated using
simple training data pre-processing gave us experts
with errors less correlated than those of different
nets trained on the same or bootstrapped data.
Hence committees that simply average the expert
outputs considerably improve recognition rates.
Georgios et al. (2010) [2] has presented a
methodology for off-line handwritten character
recognition. The proposed methodology relies on a
new feature extraction technique based on recursive
subdivisions of the character image. Feature
extraction is followed by a two-stage classification
scheme based on the level of granularity of the
feature extraction method. Classes with high values
in the confusion matrix are merged at a certain level
and for each group of merged classes, granularity
features from the level that best distinguishes them
are employed. Two handwritten character databases
(CEDAR and CIL) as well as two handwritten digit
databases (MNIST and CEDAR) were used in order
to demonstrate the effectiveness of the proposed
technique.
Sankaran et al. (2012) [3] has presented present a
novel recognition approach that results in a 15%
decrease in word error rate on heavily degraded
Indian language document images. Classical OCR
approaches perform poorly over complex scripts
such as those for Indian languages. Sankaran et al.
(2012) [3] addressed these issues by proposing to
recognize character n-gram images, which are
basically groupings of consecutive
character/component segments. Their approach was
unique, since they use the character ngrams as a
primitive for recognition rather than for
postprocessing. By exploiting the additional context
present in the character n-gram images, we enable
better disambiguation S between confusing
characters in the recognition phase. The labels
obtained from recognizing the constituent n-grams
are then fused to obtain a label for the word that
emitted them. Their method is inherently robust to
degradations such as cuts and merges which are
common in digital libraries of scanned documents.
We also present a reliable and scalable scheme for
recognizing character n-gram images. Tests on
English and Malayalam document images show
considerable improvement in recognition in the case
of heavily degraded documents.
Jawahar et al. (2012) [4] has propose a recognition
scheme for the Indian script of Devanagari.
Recognition accuracy of Devanagari script is not
yet comparable to its Roman counterparts. This is
mainly due to the complexity of the script, writing
style etc. Our solution uses a Recurrent Neural
Network known as Bidirectional Long- Short Term
Memory (BLSTM). Our approach does not require
word to character segmentation, which is one of the
most common reason for high word error rate.
Jawahar et al. (2012) [4] has reported a reduction of
more than 20% in word error rate and over 9%
reduction in character error rate while comparing
with the best available OCR system.
Badawy, W. et al. (2012) [6] has discussed the
Automatic license plate recognition (ALPR) is the
extraction of vehicle license plate information from
an image or a sequence of images. The extracted
56 Farhan Inamdar, Pankaj Dadhich, Yogen Lohite, Yogesh Chandrate
International Journal of Innovations & Advancement in Computer Science
IJIACS
ISSN 2347 – 8616
Volume 6, Issue 2
February 2017
information can be used with or without a database
in many applications, such as electronic payment
systems (toll payment, parking fee payment), and
freeway and arterial monitoring systems for traffic
surveillance. The ALPR uses either a color, black
and white, or infrared camera to take images.
Ntirogiannis et al. (2013) [7] has studied that the
document image binarization is of great importance
in the document image analysis and recognition
pipeline since it affects further stages of the
recognition process. The evaluation of a
binarization method aids in studying its algorithmic
behaviour, as well as verifying its effectiveness, by
providing qualitative and quantitative indication of
its performance. This paper addresses a pixel-based
binarization evaluation methodology for historical
handwritten/machine-printed document images. In
the proposed evaluation scheme, the recall and
precision evaluation measures are properly
modified using a weighting scheme that diminishes
any potential evaluation bias.
Yang et al. (2012) [8] has proposed a novel adaptive
binarization method based on wavelet filter is
proposed in this paper, which shows comparable
performance to other similar methods and processes
faster, so that it is more suitable for real-time
processing and applicable for mobile devices. The
proposed method is evaluated on complex scene
images of ICDAR 2005 Robust Reading
Competition, and experimental results provide a
support for our work.
3. PROPOSED WORK
Algorithm
• Read the unknown image
• Process the image (conversion into gray scale
and then into bit string of 0,1)
• Recognition of image
• Initialise the generation
• Evaluate fitness function each individual
• Select best 2 chromosome for next generation
• Preform crossover to generate new chromosome
• Perform mutation
• Stop the algorithm when best fitness value is
found
Flowchart
In this study we have used MATLAB package for
the application of genetic algorithm to initialize
application values. Furthermore Roulette Wheel
Selection method used as parents to crossover that
we have applied to a set of binary numbers (0,1
encoding) representing capital English alphabets.
The alphabets ranges from A to Z represent in a
matrix of 8 * 6 array dimension (see figure) which
behaves as initiated value for comparing.
57 Farhan Inamdar, Pankaj Dadhich, Yogen Lohite, Yogesh Chandrate
International Journal of Innovations & Advancement in Computer Science
IJIACS
ISSN 2347 – 8616
Volume 6, Issue 2
February 2017
Each sample (character) considered a chromosome,
which has 48 genes presented as Table as below
Apply Generic Algorithm:
For each population two or more chromosomes are
selected to be parents to crossover. The problem we
have faced here is to know how to select these
chromosomes to create a new offspring. There are
many methods to overcome this problem and for
selecting the best chromosome like Roulette Wheel
Selection, Boltzmann Tournament Selection for
Genetic, Rank Selection and some others. The
chromosomes with higher fitness have a higher
possibility to be selected to produce offspring for
the next generation. After many generations of
evolution, the optimal solution of the problem is
hopefully to be found in the population [4] [5] [6].
The selected fitness function that we have utilized
in our research is as follows:
Where:
α : Desired gene from the Actual data;
β : Fault level of a chromosome computed by
genetic algorithm and n : the number of genes in
chromosome.
Using the above fitness function, we have trained
our system like if the fault level of a chromosome is
equal to the desired the fitness value of that
chromosome will be equal to 1 or if generation
number is more than thirty (30) the process of GAs
will be terminated.
The system structure can be described as below,
Process the character before recognition since
procedures takes the unknown character and then
return the unknown character without empty or
points in to array. We have used 30 generations, for
each generation 10 chromosomes and for each
chromosome has 48 genes. The genes of all
individuals consist of either 0 or 1(assuming a
binary encoding for simplicity).
The First Generation:
Firstly, we generate chromosomes by combining all
features extracted from unknown character in array.
First test for unclear A:
For example if we have the following image
Character of unclear A
Initial chromosome for primary offspring :
Choosing two chromosomes with a higher fitness
and applying crossover and mutation operation to
get a new chromosome, which is considered an
initial for a second offspring and so on (see bold
rows below) and because of the great fitness
function values of these selected chromosomes
which mean that character have enough recognition
from the first steps and hence the process stop.
58 Farhan Inamdar, Pankaj Dadhich, Yogen Lohite, Yogesh Chandrate
International Journal of Innovations & Advancement in Computer Science
IJIACS
ISSN 2347 – 8616
Volume 6, Issue 2
February 2017
Two chromosomes with the high fitness function
value are as follows
Now we will choose two chromosomes with a
higher fitness and applying crossover and mutation
operation to get new chromosomes, the two selected
chromosomes are as bellow.
Using fitness calculation function we have found
that a function values are greater than the previous
two (02) bold chromosomes hence a crossover for
these chromosomes are given as follows.
Moreover the all configured chromosomes and then
crossover process or mutation as follows:
After account of fitness value that genetic algorithm
is able to identify the character in the 30th
generation.
Above Table consists final chromosome, which
specify character A after testing with different only
in one gene ( in the 37th position) comparing with
all initial chromosome which specify normal (clear)
character A as target.
In the similar way, we can recognise all alphabet
and numbers as well.
5. RESULT
This section explains the GUI of the MATLAB
toolkit and OCR using Genetic algorithm. The
following are the sample empty GUI window of the
MATLAB toolkit.
59 Farhan Inamdar, Pankaj Dadhich, Yogen Lohite, Yogesh Chandrate
International Journal of Innovations & Advancement in Computer Science
IJIACS
ISSN 2347 – 8616
Volume 6, Issue 2
February 2017
The various components marked in the above image
are explained as the following:
(1) Steps involved in GA tool for OCR.
(2) Select parent image – The push button option to
select parent image for correlation. The input
images to be fused are loaded and displayed in
these image frames (First Image, Second
Image).
(3) Initial chromosome for primary offspring – The
push button option to display the Input
Character Images
(4) Start GA for Character A – The push button
interface to initiate the GA for alphabet A
(5) Comparative Analysis – The push button option
to display the comparative analysis of (different
pixel).
Optimization Using Genetic Algorithm:
Input image to be recognise
Output editable .txt file.
4. ADVANTAGES AND APPLICATIONS
Advantages:
It provides efficient techniques for optimization
and machine learning applications
This project gives an alternative to traditional
optimization technique by using directed
random searches to locate optimal solutions
The Recognition problem is solved by the GA,
which may yield a different solution for the
same word each time.
95-97% of characters recognition.
60 Farhan Inamdar, Pankaj Dadhich, Yogen Lohite, Yogesh Chandrate
International Journal of Innovations & Advancement in Computer Science
IJIACS
ISSN 2347 – 8616
Volume 6, Issue 2
February 2017
Applications:
Optical character recognition has been applied to a
number of applications. Some of them are Listed
below
Invoice Imaging
Legal Industry
Banking
Healthcare
Captcha
Institutional Repositories and Digital Libraries
Optical Music Recognition
Automatic Number Recognition
Handwriting Recognition
5. CONCLUSION
The genetic algorithms being trained using
standard templates of the capital alphabets and
then calculate a fitness function value.
Character recognition process produces
comparatively big values for specific character
but for rest characters fitness values found small
in less time due to obvious features
Training accomplish that genetic algorithm was
able to identify the character in the 30th
generation.
It‟s also true that the fitness function did not
produce one „1‟ value which mean 100%
recognition for character „A‟ but accomplish
with 97% in the 30th generation with fitness
function value equal 0.5
The system was then tested for most of 26
characters and it shows 95-97% of characters
recognition
REFERENCES
[1] Khaled M.G Noaman, Jamil Abdulhameed M. Saif,
Ibrahim A.A. Alqubati, Optical Character
Recognition Based on Genetic Algorithms, Journal
of Emerging Trends in Computing and Information
Sciences, Vol. 6, No. 4 April 2015
[2] Dan ClaudiuCires¸an and Ueli Meier and Luca
Maria Gambardella and JurgenSchmidhuber,
“Convolutional Neural Network Committees for
Handwritten Character Classification”, 2011
International Conference on Document Analysis and
Recognition, IEEE, 2011.
[3] GeorgiosVamvakas, Basilis Gatos, Stavros J.
Perantonis, “Handwritten character recognition
through two-stage foreground sub-sampling”
,Pattern Recognition, Volume 43, Issue 8, August
2010.
[4] Shrey Dutta, Naveen Sankaran, PramodSankar K.,
C.V. Jawahar, “Robust Recognition of Degraded
Documents Using Character N-Grams”, IEEE, 2012.
[5] Naveen Sankaran and C.V Jawahar, “Recognition of
Printed Devanagari Text Using BLSTM Neural
Network”, IEEE, 2012.
[6] Yong-Qin Zhang, Yu Ding, Jin-Sheng Xiao, Jiaying
Liu and Zongming Guo1, “Visibility enhancement
using an image filtering approach”, Zhang et al.
EURASIP Journal on Advances in Signal Processing
2012.
[7] Badawy, W. "Automatic License Plate Recognition
(ALPR): A State of the Art Review." (2012): 1-1.
[8] Ntirogiannis, Konstantinos, Basilis Gatos, and
IoannisPratikakis. "A Performance Evaluation
Methodology for Historical Document Image
Binarization." (2013): 1-1.
[9] Yang, Jufeng, Kai Wang, Jiaofeng Li, Jiao Jiao, and
Jing Xu. "A fast adaptive binarization method for
complex scene images." In Image Processing (ICIP),
2012 19th IEEE International Conference on, pp.
1889-1892. IEEE, 2012.
[10] Sumetphong, Chaivatna, and
SupachaiTangwongsan. "An Optimal Approach
towards Recognizing Broken Thai Characters in
OCR Systems." Digital Image Computing
Techniques and Applications (DICTA), 2012
International Conference on. IEEE, 2012.