Offline Omni Font Arabic Optical Text Recognition System using Prolog Classification Technique
Rami Al-Sahhar
Ideas for today and tomorrow
Agenda
OCR Overview
The Arabic OCR Problem
OCR Challenges
Proposed Solution
Detailed system stages
Sample Run
Future Work
Demo
OCR Overview
Optical character recognition (OCR) is the process of converting an image of text, such as a scanned paper document, into computer-editable text
The ultimate goal of OCR is to simulate the human ability to read both machine-printed and hand-written texts
Most of the work on OCR has been on Latin and Chinese characters
Arabic character recognition started recently and advanced relatively slowly due to the complexity of recognizing Arabic text, which has characters that are cursive in nature.
Arabic character recognition is still an open and challenging field of research
The Arabic OCR Problem
The goal is to propose a complete system that classifies and recognizes machine-printed Arabic text
The input to the system is a TIFF image file
The Arabic font size varies from 8 to 36
The font type is Simplified Arabic or Traditional Arabic
The image is scanned at a resolution of 300 dpi
The output is editable text in a word processor program (MS Word)
OCR Challenges
Understanding TIFF image format and pixel representation
Programmatically reading the TIFF image pixel by pixel from right to left
Feature extraction
Segmentation-free recognition
Isolation of spaces, words, letters and lines
Noise reduction
Dots and holes
Overlapped characters
OCR Challenges: Arabic Character Characteristics
Right to left
Always cursive
The shape of a character changes according to its location in the word
Up to four different shapes per character (isolated, initial, medial and final)
28 basic characters: 15 with dots, 13 without
No fixed character width and no fixed size
OCR Challenges: Arabic Character Characteristics
Figure: a sample of written Arabic showing some of its characteristics
Figure: group of Arabic character shapes
The Proposed Solution
The proposed system starts from the document image acquisition stage and ends with the recognized Arabic text in the standard Simplified Arabic TrueType font in MS Word 2007
We started designing our system by experimenting with prior researchers’ techniques, adopting or modifying some of them if they met our requirements, but otherwise developing our own techniques
Consequently, the components of our system are either due to the work of others, the result of our improvement of others’ work, or our own completely new techniques.
The Proposed Solution
ATR (Arabic Text Recognition) system model
Stages: TIFF image file -> Preprocessing -> Feature extraction -> Post-enhancement -> Classification and recognition -> Recognized text
The diagram labels two stages as C-based and two as Prolog-based
The Proposed Solution
Preprocessing Phase
Digitalization, scaling, word-level segmentation, noise removal and elimination of redundant information as far as possible
Image information retrieval
Load/read the input (TIFF) image file as binary; retrieve the image properties (size, width, height, pixel resolution, image channels and image alignment); and create memory storage for the system's intermediate processing
Image digitalization
Digitizes the TIFF image in order to apply fixed-level thresholding and to convert the gray-scale or bitmapped image to a binary (0s and 1s) image
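The fixed-level thresholding step can be sketched in Python as follows. This is a minimal illustration, not the system's own implementation: the function name `binarize`, the threshold of 128 and the convention that 1 marks ink (dark pixels) are all assumptions.

```python
def binarize(gray, threshold=128):
    """Fixed-level thresholding: map each gray value (0-255) to 0 or 1.

    Assumption: dark pixels are ink, so values below the threshold become 1.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A tiny 3x4 gray-scale "image": low values are dark ink, high values are paper.
gray_image = [
    [250, 240, 30, 245],
    [20, 15, 10, 250],
    [255, 128, 127, 255],
]
binary_image = binarize(gray_image)
for row in binary_image:
    print(row)
```

A single fixed level is simple and fast, but it presumes evenly lit scans; that fits the 300 dpi scanned input described in this deck better than it would fit photographs.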
The Proposed Solution
Text line detection and word segmentation
The system computes vertical and horizontal histograms to retrieve the number of lines per page and the number of components (words) per line
We calculate the font baseline and size by finding the maximum of the horizontal histogram of each line on the page
This enables dots and other special marks such as Shadda, Madda and Tanween to be classified as upper or lower components relative to this baseline
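The histogram-based line detection and baseline estimate described above can be sketched in Python. This is an illustrative sketch under assumptions: the image is a list of rows of 0/1 values with 1 meaning ink, lines are separated by completely empty rows, and the function names are hypothetical.

```python
def horizontal_histogram(binary):
    """Count ink pixels (1s) in every row."""
    return [sum(row) for row in binary]


def vertical_histogram(binary):
    """Count ink pixels (1s) in every column."""
    return [sum(column) for column in zip(*binary)]


def find_text_lines(binary):
    """Return (top_row, height) for each run of consecutive non-empty rows."""
    lines, start = [], None
    # The trailing 0 closes a line that touches the bottom edge of the image.
    for y, count in enumerate(horizontal_histogram(binary) + [0]):
        if count > 0 and start is None:
            start = y
        elif count == 0 and start is not None:
            lines.append((start, y - start))
            start = None
    return lines


def baseline_offset(binary, top, height):
    """Row (relative to the line top) with the most ink: the assumed baseline."""
    hist = horizontal_histogram(binary[top:top + height])
    return hist.index(max(hist))


page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
]
lines = find_text_lines(page)
print(lines)
print([baseline_offset(page, top, height) for top, height in lines])
```

Applying the same gap search to `vertical_histogram` of a single line would split that line into components, which is one plausible reading of how the words per line are counted.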
The Proposed Solution
B&W image is found in file name: [ test1.tif]
Processing a [1615x2160] image with [1] channel(s)
Image Origin : [Top-left Origin] , Align : [4-]
Data Order :[Interleaved Color Channels]
Number of Lines(s) found: [6]
Line #0 , Y = 78 , Height = [67]
Line #1 , Y = 185 , Height =[ 67]
Line #2 , Y = 292 , Height = [67]
Line #3 , Y = 399 , Height = [67]
Line #4 , Y = 506 , Height = [67]
Line #5 , Y = 613 , Height = [67]
Font Baseline =[ 38 pixels]
Number of Components found at Image Line #0 : [9]
Number of Components found at Image Line #1 : [14]
Number of Components found at Image Line #2 : [16]
Number of Components found at Image Line #3 : [18]
Number of Components found at Image Line #4 : [10]
Number of Components found at Image Line #5 : [6]
Preprocessing phase: text line detection
The Proposed Solution
Number of retrieved contours : [2]
************* Bounding Rectangle (1,22)-(72,65) ************
Component [1] Origin Y = 22 , Height = 43 Area = 429.000000
Component [2] Origin Y = 47, Height = 7 Area = 19.000000
Max Component Area = 429.000000 , Y = 22 , H = 43
Preprocessing phase: word segmentation
The Proposed Solution
Feature Extraction Phase
Feature extraction is the most challenging part of character or text recognition
The choice of good features significantly improves the recognition rate and minimizes the error in case of noise
The main selected features are:
Outer contours described in Freeman chain codes
Contours’ corners
Dot information
Estimated font size
All of these features are extracted for all detected components during the page scanning
The Proposed Solution
Freeman chain code
Chain code was introduced by Freeman as a means of representing lines or boundaries of shapes by a connected sequence of straight-line segments of specified length and direction
Figure: an example of the 8-connectivity chain code
Figure: chain code numbering schemes
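To make the 8-connectivity chain code concrete, here is a small Python sketch that encodes and decodes a pixel path. The numbering is an assumption: 0 points east and the codes advance counter-clockwise in 45-degree steps, which is a common convention but may differ from the numbering scheme in the slide figure. The names `FREEMAN_STEPS`, `chain_code` and `decode` are hypothetical.

```python
# Direction code -> (dx, dy) step, in image coordinates (x grows right, y grows down).
# Assumed numbering: 0 = east, then counter-clockwise in 45-degree steps.
FREEMAN_STEPS = {
    0: (1, 0), 1: (1, -1), 2: (0, -1), 3: (-1, -1),
    4: (-1, 0), 5: (-1, 1), 6: (0, 1), 7: (1, 1),
}
STEP_TO_CODE = {step: code for code, step in FREEMAN_STEPS.items()}


def chain_code(points):
    """Encode a path of 8-connected pixel coordinates as Freeman codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        codes.append(STEP_TO_CODE[(x1 - x0, y1 - y0)])
    return codes


def decode(start, codes):
    """Rebuild the pixel path from a start point and its chain code."""
    path = [start]
    for code in codes:
        dx, dy = FREEMAN_STEPS[code]
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path


# Boundary of a 2x2-step square, traced from the top-left corner.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
codes = chain_code(square)
print(codes)
```

Because each code stores only a direction, the contour is described independently of where it sits in the image; the start point is needed only to rebuild absolute coordinates.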
The Proposed Solution
Contour extraction process
This is the core process that extracts the main word-level features of the Arabic text in Freeman chain code format
After extracting the Freeman codes, we aggregate consecutive identical codes into pairs (X, Y), where X is the direction (i.e. from 0 to 7) and Y is the run length in pixels
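The aggregation into (direction, length) pairs amounts to run-length encoding of the chain code. A minimal Python sketch follows; the function name `to_pairs` is hypothetical, and the input is the first 27 codes of the sample contour listed in this deck.

```python
from itertools import groupby


def to_pairs(codes):
    """Run-length encode chain codes into (direction, length) pairs."""
    return [(direction, len(list(run))) for direction, run in groupby(codes)]


# First 27 codes of the sample contour shown in this deck.
sample = [2] * 14 + [3, 2, 3, 2, 3, 6, 5, 6, 6, 5, 7, 7, 7]
pairs = to_pairs(sample)
print(pairs)
```

For this prefix the result agrees with the first eleven pairs of the sample output, which suggests, without confirming it, that the system's pairing is a plain run-length encoding.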
The Proposed Solution
Contours Freeman Chain Codes
[2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,2,3,2,3,6,5,6,6,5,7,7,7,6,7,6,6,6,6,5,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,3,4,3,1,1,1,1,2,2,2,3,4,4,3,4,4,4,4,4,4,4,4,5,4,6,5,6,6,6,1,0,0,7,7,7,5,4,5,4,4,5,4,4,4,4,4,3,2,2,2,2,2,2,2,2,2,3,2,3,2,3,6,5,6,5,7,6,7,7,6,7,6,6,6,6,5,4,4,4,4,4,4,4,4,4,4,4,4,4,3,2,2,2,2,2,2,2,3,2,2,2,3,2,3,2,3,3,3,3,4,3,6,6,6,6,7,0,7,7,7,7,6,7,6,6,6,7,6,6,6,6,6,5,4,4,4,4,4,4,4,4,4,4,4,4,4,4,3,3,6,6,6,6,7,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,1,0,1,7,7,7,0,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Contours Freeman Chain Code Pairs - Aligned
Total Pairs : [95] ==> [(2,14),(3,1),(2,1),(3,1),(2,1),(3,1),(6,1),(5,1),(6,2),(5,1),(7,3),(6,1),(7,1),(6,4),(5,1),(4,19),(3,1),(4,1),(3,1),(1,4),(2,3),(3,1),(4,2),(3,1),(4,8),(5,1),(4,1),(6,1),(5,1),(6,3),(1,1),(0,2),(7,3),(5,1),(4,1),(5,1),(4,2),(5,1),(4,5),(3,1),(2,9),(3,1),(2,1),(3,1),(2,1),(3,1),(6,1),(5,1),(6,1),(5,1),(7,1),(6,1),(7,2),(6,1),(7,1),(6,4),(5,1),(4,13),(3,1),(2,7),(3,1),(2,3),(3,1),(2,1),(3,1),(2,1),(3,4),(4,1),(3,1),(6,4),(7,1),(0,1),(7,4),(6,1),(7,1),(6,3),(7,1),(6,5),(5,1),(4,14),(3,2),(6,4),(7,2),(0,38),(1,1),(0,2),(1,1),(0,1),(1,1),(0,1),(1,1),(7,3),(0,1),(7,1),(0,22)]
Contour Corners Positions:
[5,18,23,33,37,60,67,81,87,90,92,103,113,118,132,146,160,162,168,172,178,180,189,202,206,210,258,263,]
Feature extraction: Freeman chain codes, pairs and corner positions
The Proposed Solution
Corner Detection
This phase detects and extracts the component’s contour corners of the text under processing
It is based on an implementation of contour detection and curve representation by circular local histogram of contour chain code presented by [Arrebola, Camacho, Bandera, & Sandoval (1999)]
The corner detection phase is very important for the next classification and recognition phase
It helps our Prolog engine to determine the unique shape of the character’s feature regardless of the character orientation
The output of this phase is a stream of corner information to be input for the next phase
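As a rough illustration of what a stream of corner positions looks like, the Python sketch below marks corners where the mean contour direction changes sharply. It is a simplified stand-in and not the circular-histogram method cited above: the window size `k`, the 60-degree threshold, the function names and the 0 = east code numbering are all assumptions.

```python
import math

# Angle of each Freeman code, assuming 0 = east and 45-degree steps.
ANGLES = [math.radians(45 * code) for code in range(8)]


def mean_direction(codes):
    """Angle (radians) of the summed unit vectors of a run of chain codes."""
    x = sum(math.cos(ANGLES[c]) for c in codes)
    y = sum(math.sin(ANGLES[c]) for c in codes)
    return math.atan2(y, x)


def turning_angle(codes, i, k):
    """Change of direction (degrees) between the k codes before and after index i."""
    before = mean_direction(codes[i - k:i])
    after = mean_direction(codes[i:i + k])
    diff = abs(after - before)
    return math.degrees(min(diff, 2 * math.pi - diff))


def find_corners(codes, k=3, min_angle=60.0):
    """Indices where the turning angle is a strict local maximum above min_angle."""
    angles = {i: turning_angle(codes, i, k) for i in range(k, len(codes) - k + 1)}
    corners = []
    for i, angle in angles.items():
        neighbours = [angles[j] for j in range(i - k, i + k + 1) if j in angles and j != i]
        if angle >= min_angle and all(angle > other for other in neighbours):
            corners.append(i)
    return corners


# Ten steps east followed by ten steps north: one right-angle corner at index 10.
l_shape = [0] * 10 + [2] * 10
print(find_corners(l_shape))
```

The output is a list of positions along the chain code, the same kind of value as the "Contour Corners Positions" list in the sample output, although the positions produced by this simplified rule are not expected to match it.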
The Proposed Solution
Contour enhancement
We introduced an algorithm that removes noisy pixels falling within any straight line and approximates the contours of Arabic characters by straight lines.
These enhancement rules, which were derived from repeated testing on Arabic characters, reduce the time required for character recognition
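The deck does not list the enhancement rules themselves, so the following Python sketch shows only one plausible rule of this kind: a very short run that interrupts two runs of the same direction is treated as noise and absorbed into them. The rule, the `max_noise` parameter and the function name `smooth_runs` are assumptions, not the authors' algorithm.

```python
def smooth_runs(pairs, max_noise=1):
    """Absorb short runs that interrupt two runs of the same direction.

    pairs: list of (direction, length) chain-code runs.
    A run of at most max_noise pixels lying between two runs that share a
    direction is merged into them, so the total length is preserved.
    """
    out = []
    i = 0
    while i < len(pairs):
        direction, length = pairs[i]
        is_noise = (
            out
            and i + 1 < len(pairs)
            and length <= max_noise
            and direction != out[-1][0]
            and pairs[i + 1][0] == out[-1][0]
        )
        if is_noise:
            # Merge the noisy run and the run after it into the previous run.
            out[-1] = (out[-1][0], out[-1][1] + length + pairs[i + 1][1])
            i += 2
        elif out and out[-1][0] == direction:
            out[-1] = (direction, out[-1][1] + length)
            i += 1
        else:
            out.append((direction, length))
            i += 1
    return out


# Five consecutive runs taken from the sample pair list in this deck.
noisy = [(4, 19), (3, 1), (4, 1), (3, 1), (1, 4)]
print(smooth_runs(noisy))
```

Merging trades a one-pixel deviation for a longer straight segment, so the contour is only approximately preserved; fewer and longer runs mean fewer items for the later matching stage to examine.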
The Proposed Solution
Dot Detection and Font Size Estimation
Dot detection is another challenging and important task
It helps the Prolog classification engine to recognize words that include dots, and it decides which characters are un-dotted by nature
Font size estimation is a critical task in omni-size character recognition systems
Font size estimation is usually used to find the pen width in online recognition systems
Approaches that estimate the font size from dots are presented in [Shirali-Shahreza & Shirali-Shahreza (2006)]
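One way to picture dot detection is to separate a word's small connected components from its main body and to place them above or below the baseline. The Python sketch below does this under stated assumptions: the area ratio `dot_ratio`, the dictionary layout and the function name are hypothetical, and the baseline value is assumed to be in the same coordinates as the component origins.

```python
def classify_components(components, baseline_y, dot_ratio=0.25):
    """Split a word's connected components into the main body and its dots.

    components: list of dicts with 'y' (top), 'height' and 'area'.
    A component counts as a dot when its area is below dot_ratio times the
    largest area (an assumed heuristic); its vertical centre decides whether
    it lies above or below the baseline.
    """
    body = max(components, key=lambda c: c["area"])
    dots = {"upper": 0, "lower": 0}
    for comp in components:
        if comp is body or comp["area"] >= dot_ratio * body["area"]:
            continue
        centre = comp["y"] + comp["height"] / 2
        dots["upper" if centre < baseline_y else "lower"] += 1
    return body, dots


# Component values taken from the word-segmentation log shown earlier in the deck.
word = [
    {"y": 22, "height": 43, "area": 429.0},
    {"y": 47, "height": 7, "area": 19.0},
]
body, dots = classify_components(word, baseline_y=38)
print(body["area"], dots)
```

With these values the small component is counted as one lower dot. Touching dots form a single connected component, so a real system would also need the component's size or shape to tell one dot from two or three.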
The Proposed Solution
Font size against calculated component’s height (in pixels)
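The chart data is not reproduced in this transcript, so the Python sketch below shows only the nominal conversion between font size in points and height in pixels at a given scan resolution (1 point = 1/72 inch). Measured component heights are smaller than the nominal size and vary by font, so a real estimator would be calibrated against measurements like those in the chart; the function names are hypothetical.

```python
def points_to_pixels(size_pt, dpi=300):
    """Convert a size in points to pixels at the given scan resolution."""
    return size_pt * dpi / 72.0


def pixels_to_points(height_px, dpi=300):
    """Convert a height in pixels to typographic points (1 point = 1/72 inch)."""
    return height_px * 72.0 / dpi


# Nominal pixel heights for the smallest and largest font sizes in scope.
print(points_to_pixels(8), points_to_pixels(36))
```

At 300 dpi one point is a little over four pixels, which is why the resolution has to be read from the image properties before any size estimate is made.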
The Proposed Solution
Definite Clause Grammar (DCG):
Provides a mechanism for defining the grammar rules of a language
These rules are automatically translated to a Prolog program which defines a parser for the language being defined
Grammar rules are a feature only in some Prolog systems, and are designed to facilitate the parsing of natural language
Using this notation, a grammar is represented as a set of logical rules
When the DCG rules are consulted (or optimized), they are translated into Prolog clauses
The Proposed Solution
Word-level Classification and Recognition Phase
This is the most critical phase in our proposed ATR system
It is written in the Prolog language using Prolog matching, backtracking and DCG techniques
The input for this phase is data on two features:
The first input stream is the corner sequence of the word-level outer contours for each component, which represents the elevation information of the input stream (the upper part that holds most of the features)
The second input stream is the dot information found in the same component
The Proposed Solution
The Prolog matching and backtracking techniques also use the corner sequence stream to classify the unknown inputs into character classes, while the Prolog DCG technique uses the dot information stream to recognize the actual Arabic letters of a particular character class
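The division of labour between the two streams can be illustrated with a small Python lookup: the shape stream selects a character class, and the dot stream selects the letter within that class. This is a stand-in for illustration, not the Prolog matching or DCG parsing itself. The class names are those used in the deck's DCG listing, the dot counts follow standard Arabic orthography, and the table layout and function name are assumptions.

```python
# Character class -> {(dot_count, dot_position): letter name}.
LETTERS_BY_CLASS = {
    "bc": {(1, "lower"): "ba", (2, "upper"): "ta", (3, "upper"): "tha", (2, "lower"): "ya_md"},
    "h_c": {(0, None): "h_", (1, "lower"): "jeem", (1, "upper"): "kha"},
    "dc": {(0, None): "dal", (1, "upper"): "thal"},
    "rc": {(0, None): "ra", (1, "upper"): "zay"},
    "sc": {(0, None): "seen", (3, "upper"): "sheen"},
}


def recognize(shape_classes, dot_info):
    """Pair each shape class with its dot information and look up the letter."""
    letters = []
    for char_class, dots in zip(shape_classes, dot_info):
        # "?" marks a class/dot combination that the table does not cover.
        letters.append(LETTERS_BY_CLASS[char_class].get(dots, "?"))
    return letters


shape_stream = ["bc", "h_c", "rc"]
dot_stream = [(1, "lower"), (0, None), (1, "upper")]
print(recognize(shape_stream, dot_stream))
```

Grouping letters that share a body shape keeps the shape matcher small: it only has to tell a handful of classes apart, and the cheaper dot information resolves the rest.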
The Proposed Solution
DCG implementation: the DCG grammar structure and some of the character classes are described below:
% DCG part for Arabic text recognition based on two input streams
% usage: phrase(s(R),[m,h_c,d1,m,dc]).
s([H|T]) --> cc(H), subs(T).   % every string is a character class followed by a sub-string
s(R) --> cc(R).                % or a string can be simply a character class
subs(R) --> s(R).              % a substring is nothing but a string (recursively)
cc(R) --> ch(R).               % a character class can be a simple character
% or character classes can belong to any of the following classes
cc(R) --> bc(R).               % Ba class (ba, ta, tha, ya_md)
cc(R) --> h_c(R).              % H_ class (h_, jeem, kha)
cc(R) --> dc(R).               % Dal class (dal, thal)
cc(R) --> rc(R).               % Ra' class (ra, zay)
cc(R) --> sc(R).               % Seen class (seen, sheen)
The Proposed Solution
Microsoft Word Document Integration
This is the final phase of our optical text recognition system
It is written in the Prolog language to interface with the Microsoft Word program
It uses the Microsoft Word Document API to write the recognized characters into a new Word document
It writes the output text in the same recognized font size in a predefined font type
It also writes the white spaces and new lines to maintain the same original text alignment and format
The Proposed Solution
System flow diagram (labels recovered from the figure)
Input: Document Image (TIFF File)
Processing blocks: Image Information Retrieval, Image Digitalization, Word-level Segmentation, Word-level Contour Extraction, Contour Enhancement, Corner Detection, Dots Detection, Character-Shape Prolog Matching, DCG Engine, Word-level Recognition, MS Word Document Integration
Data items: Pixel Resolution, Lines per Page, Words per Line, Component Coordinates (X, Y), Component Area, Height and Width, Height/Width, Font Size, Font Baseline, Freeman Chain Codes, Character Shape Stream, Dot Information Stream, Character Reference Database
Output: Recognized Text (Word Document)
Sample Run
Original TIFF image with Arabic text
The recognized Arabic text in MS Word 2007
Future Work
Support more Arabic font types
Support more image types (GIF, BMP, JPEG, etc.)
Support different font sizes in the same page
Support Arabic and English fonts together, as well as numeric and special characters
Support spell checking and word suggestions
Implement the system as an Arabic business card reader
Capture and Recognize feature for iPhone
Demo
OCR Applications
Industries and institutions in which control of large amounts of paperwork is critical
Banking, credit card and insurance industries
The medical community
To capture, store and transmit radiology images
Libraries and archives
For conservation and preservation of vulnerable documents and for the provision of access to source documents