Omr and ocr
arslan-arshad
OMR, OCR and ICR
Optical Mark Recognition, Optical Character Recognition and Intelligent Character Recognition
Definition/Concept of OMR
A technology that allows an input device (e.g. imaging scanner) to read hand-drawn marks such as small circles or rectangles on specially designed paper.
Often used for test, survey, or questionnaire answer sheets.
The process of capturing data by contrasting reflectivity at predetermined positions on a page.
Sometimes referred to as Optical Mark Reader.
OMR Forms
An OMR reader works with a specialized document that contains timing tracks along one edge of the form. These tracks, which look like black boxes along the top or bottom of the form, tell the scanner where to read for marks.
OMR “reads” mark information from forms as numbers/letters and puts it into the computer.
OMR Forms
Timing tracks indicate where to read for marks and where to clip images.
Timing Tracks
OMR Scanners and Software
Scanners have specifically placed LEDs (light-emitting diodes).
LEDs sense marks in certain columns once a timing track is detected.
Software interprets the output from the scan and translates it to the desired format (e.g. ASCII).
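The read-and-translate step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular scanner works: the reflectivity threshold, the five-choice layout and the function names are all assumptions made for the example.

```python
# Minimal sketch: turn per-column reflectivity readings into ASCII answers.
# MARK_THRESHOLD and CHOICES are assumed values, not taken from any scanner.
MARK_THRESHOLD = 0.5   # reflectivity below this is treated as a pencil mark
CHOICES = "ABCDE"      # one answer bubble per column

def decode_row(reflectivity):
    """Decode the readings taken at one timing-track position."""
    marked = [CHOICES[i] for i, r in enumerate(reflectivity) if r < MARK_THRESHOLD]
    if not marked:
        return " "     # no bubble filled in
    if len(marked) > 1:
        return "*"     # more than one bubble filled in
    return marked[0]

def decode_sheet(rows):
    """Decode every row and return the answers as one ASCII string."""
    return "".join(decode_row(row) for row in rows)

sheet = [
    [0.9, 0.1, 0.9, 0.8, 0.9],   # bubble B is dark
    [0.9, 0.9, 0.9, 0.9, 0.9],   # nothing marked
    [0.2, 0.9, 0.3, 0.9, 0.9],   # two bubbles marked
    [0.9, 0.9, 0.9, 0.9, 0.15],  # bubble E is dark
]
print(repr(decode_sheet(sheet)))  # 'B *E'
```

Blank and multiply-marked rows are kept as placeholder characters so that they can be flagged for manual review rather than silently dropped.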
Scanner Characteristics:
- ~130 pages per minute (e.g. Kodak i 830)
- ~85 pages per minute (e.g. Axiome AXM 980 or Kodak 3000 Series)
- INSIGHT 4ES (3,000/hour)
Kodak i 830
OMR Scanners and Software
OMR software is used to capture data from OMR sheets (e.g. Remark Office OMR, Smartshoot OMR).
Software Characteristics:
Performing specific imaging functions such as:
- image acquisition,
- file conversion,
- data extraction, and
- file read/write commands (e.g. ISIS)
Axiome AXM 980
Remark Office OMR
OMR Storage Characteristics
Storage:
Barcodes: Identification of forms.
OMR marks and barcodes are read and moved directly into a database management system (e.g. SQL), then to a census database.
Images are not normally scanned and stored.
However, the capability of saving the scanned image is there!
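As a rough illustration of marks and barcodes going straight into a database, the sketch below stores decoded answers in an in-memory SQLite database keyed by the form barcode. The table layout, the barcode values and the function name are hypothetical; a real census system would use its own schema and database server.

```python
import sqlite3

# In-memory database used purely for illustration (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE responses ("
    " barcode TEXT, question INTEGER, answer TEXT,"
    " PRIMARY KEY (barcode, question))"
)

def store_form(conn, barcode, answers):
    """Insert one row per question for the form identified by its barcode."""
    rows = [(barcode, q, a) for q, a in enumerate(answers, start=1)]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO responses VALUES (?, ?, ?)", rows)

# Hypothetical barcodes and decoded answer strings.
store_form(conn, "FORM-0001", "BADC")
store_form(conn, "FORM-0002", "BBDA")

count_b = conn.execute(
    "SELECT COUNT(*) FROM responses WHERE question = 1 AND answer = 'B'"
).fetchone()[0]
print(count_b)  # 2
```

Using the barcode as part of the primary key means a form scanned twice is rejected instead of being counted twice.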
Storage of Scanned Images (Recent Mainstream Capability)
Increasingly critical for validating results
Images can be used for correcting poorly filled out forms
Images can be used for validating results
Comprehensive image database of forms
OMR Accuracy
To achieve high accuracy, well-structured design and good-quality printing of these forms are critical.
If the timing track and the bubbles on the form are not in the exact columns where the LEDs in the read head can detect them, there is no way for the scanner to read the marks. This misalignment is referred to as skew and float.
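To see why print alignment matters, here is a small sketch of how far a bubble drifts from its expected column. It assumes that float is a uniform sideways shift of the printed form and skew is a small rotation, which is a common reading of the terms but is not spelled out in the slides; the 0.5 mm tolerance is likewise an assumed figure.

```python
import math

def bubble_offset(row_y_mm, float_mm, skew_degrees):
    """Sideways offset of a bubble at distance row_y_mm down the page.

    Assumption: float shifts every row equally, skew adds a drift that
    grows with the distance from the top of the form.
    """
    return float_mm + row_y_mm * math.tan(math.radians(skew_degrees))

def row_is_readable(row_y_mm, float_mm, skew_degrees, tolerance_mm=0.5):
    """True if the bubble still sits within the (assumed) read-head tolerance."""
    return abs(bubble_offset(row_y_mm, float_mm, skew_degrees)) <= tolerance_mm

# Same printing error, first row versus a row 250 mm further down the page.
print(row_is_readable(0, 0.2, 0.5))    # True
print(row_is_readable(250, 0.2, 0.5))  # False
```

Under these assumptions even half a degree of skew moves bubbles near the bottom of the page by about 2 mm, well outside the tolerance.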
OMR Advantages
OMR is a data collection technology that does not require a recognition engine. Therefore:
It is fast, using minimum processing power to process forms
Costs are predictable and defined
OMR capture speeds range around 4000 forms per hr
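The quoted capture speeds translate into processing time with simple arithmetic. The sketch below uses the rates mentioned in these slides; the batch size of 100,000 forms is a made-up figure for illustration.

```python
def per_hour(pages_per_minute):
    """Convert a pages-per-minute rating into pages per hour."""
    return pages_per_minute * 60

def hours_needed(forms, forms_per_hour):
    """Scanning time in hours, ignoring loading, jams and re-scans."""
    return forms / forms_per_hour

batch = 100_000  # hypothetical number of forms

print(hours_needed(batch, 4000))  # 25.0 hours at 4000 forms per hour
print(per_hour(130))              # 7800 pages per hour at ~130 pages per minute
print(per_hour(85))               # 5100 pages per hour at ~85 pages per minute
```

These are upper bounds on throughput; real runs also spend time on form handling and verification.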
OMR Disadvantages
OMR cannot recognize hand-printed or machine-printed characters.
With OMR, images of forms are not normally captured by scanners, so electronic retrieval is usually not possible.
Tick boxes may not be suitable for all types of questions
OMR Challenges/Issues
The entire process must be tested:
- Information capture
- Recognizing
- Verifying results
Questionnaire design and preparation is critical
Forms must be readable to the scanner when collected
Field operators must take particular care in filling out questionnaires
Completeness and consistency checks must be in place
Care must be taken over the condition of the questionnaire (dust, humidity, transportation, etc.)
Price of OMR
Today, the most economical Chinese-made hardware scanners cost at least Rs. 180,000/- per scanner. Reasonably acceptable models reach Rs. 2,50,000/- and beyond.
Software prices depend on use. There are multiple packages, such as Remark Office OMR (costing nearly 5000/year).
The average cost is around 0.25 per sheet.
Major Commercial Suppliers
Pearson NCS - UK Company with US manufacturing base (http://www.ncspearson.com)
Scantron - US Company with US manufacturing base (http://www.scantron.com)
Sekonic - Japanese Company with Japanese manufacturing base (http://www.sekonic.co.jp)
Axiome - Swiss Company with Swiss Manufacturing base (http://www.axiome.ch)
What is OCR and ICR?
OCR:
“Gives scanning and imaging systems the ability to turn images of machine printed characters into machine readable characters.”
Images of the machine printed characters are extracted from a bitmap of the scanned image.
ICR:
“Gives scanning and imaging systems the ability to turn images of hand written characters into machine readable characters.”
Images of the hand written characters are extracted from a bitmap of the scanned image.
Forms
OCR/ICR is more flexible since:
No timing tracks are required
The image can float on a page
The use of drop-out color reduces the size of the scanner’s output and enhances the accuracy
ICR/OCR technology often uses registration marks on the four corners of a document in the recognition of an image.
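One way registration marks can help is by telling the software how far the scanned image has shifted or rotated relative to the form template. The sketch below estimates this from the two top corner marks only. The coordinates and the function name are hypothetical, and real products may use all four marks and a more complete transform.

```python
import math

def estimate_alignment(expected_tl, expected_tr, found_tl, found_tr):
    """Return (dx, dy, rotation in degrees) of the scan relative to the template.

    Each argument is an (x, y) position of a registration mark:
    tl = top-left corner, tr = top-right corner.
    """
    dx = found_tl[0] - expected_tl[0]
    dy = found_tl[1] - expected_tl[1]
    expected_angle = math.atan2(expected_tr[1] - expected_tl[1],
                                expected_tr[0] - expected_tl[0])
    found_angle = math.atan2(found_tr[1] - found_tl[1],
                             found_tr[0] - found_tl[0])
    return dx, dy, math.degrees(found_angle - expected_angle)

# A scan that is shifted by (2, 3) pixels but not rotated.
shift_only = estimate_alignment((10, 10), (200, 10), (12, 13), (202, 13))
print(shift_only)  # (2, 3, 0.0)
```

With the shift and rotation known, field positions from the template can be mapped onto the scanned image, which is why the image is allowed to float on the page.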
OCR/ICR Scanner
Forms are scanned, and the recognition engine of the OCR/ICR system then interprets the images and turns images of handwritten or printed characters into ASCII data (machine-readable characters).
Speeds Range from: 85-160 sheets/min (dependent on the recognition engine)
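The idea of a recognition engine turning character images into ASCII can be shown with a toy template matcher. This is a deliberately simplified sketch: the 3x5 bitmaps, the three-letter alphabet and the pixel-difference score are assumptions for the example, and commercial engines use far more sophisticated methods.

```python
# Toy templates: each character is a 3x5 bitmap, "1" = ink, "0" = paper.
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}

def distance(a, b):
    """Number of pixels that differ between two bitmaps of equal size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def recognize(glyph):
    """Return the template character closest to the extracted glyph."""
    return min(TEMPLATES, key=lambda ch: distance(glyph, TEMPLATES[ch]))

def recognize_field(glyphs):
    """Recognize a sequence of glyphs and return ASCII bytes."""
    return "".join(recognize(g) for g in glyphs).encode("ascii")

noisy_t = ("111", "010", "010", "011", "010")  # a "T" with one stray pixel
field = [TEMPLATES["L"], TEMPLATES["I"], noisy_t]
print(recognize_field(field))  # b'LIT'
```

Even this toy version shows why recognition costs processing time: every extracted character has to be compared against the whole set of known shapes.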
OCR/ICR Software
There are plenty of free software packages on the market.
1. Microsoft OneNote 2007
2. MS Office Document Imaging
3. SimpleOCR
4. TopOCR
5. FreeOCR
These packages are free and easily available. They use an OCR algorithm to recognize letters from an image.
Premium versions support automatic scanning from a scanner and are a bit faster than the free software. Prices lie between 5,000 and 20,000.
Office Imaging App
OCR/ICR Storage Characteristics
Storage/Retrieval
Images are scanned and stored and maintained electronically
There is no need to store the paper forms as long as you safeguard the electronic files
With OCR/ICR technologies, images can be scanned, indexed, and written to optical media
Ideal OCR/ICR Accuracy Thresholds
Accuracy:
Accuracy achieved by data entry clerks (~99.5%) is approximately equal to that of a perfectly tuned OCR/ICR system (~99.5%)
Up to 99.9% accuracy with editing (like OMR)
The recognition engine must be tuned, tested and validated very carefully
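The accuracy figures above are easier to judge as error counts. The sketch below applies them to a hypothetical workload of one million characters; the workload size is an assumption for illustration only.

```python
def expected_errors(characters, accuracy):
    """Expected number of wrongly captured characters at a given accuracy."""
    return characters * (1 - accuracy)

workload = 1_000_000  # hypothetical number of characters to capture

print(round(expected_errors(workload, 0.995)))  # 5000 errors at ~99.5%
print(round(expected_errors(workload, 0.999)))  # 1000 errors at 99.9% with editing
```

Moving from 99.5% to 99.9% therefore removes four out of every five errors, which is the practical payoff of the editing step.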
OCR/ICR Advantages
Recognition engines used with imaging can capture highly specialized data sets
OCR/ICR recognize machine-printed or hand-printed characters.
Scanning and recognition allow efficient management and planning for the rest of the processing workload
Quick retrieval for editing and reprocessing
OCR/ICR Disadvantages
May require significant manual intervention.
Additional workload for data collectors.
ICR has severe limitations when it comes to human handwriting.
Characters must be hand-printed/machine-printed with separate characters in boxes.
Ineffective when dealing with cursive characters.
OMR-OCR/ICR Compared
OCR/ICR Challenges/Issues
Has corresponding issues with OMR
Algorithm development (Preparation of memory dictionary)
Processing time considerations due to recognition engine
Development costs
Price of OCR/ICR
OCR is less costly than OMR.
A scanner for OCR costs between 6,000 and 80,000.
The price depends on the scanning speed and the quality provided by the scanner.
On the other hand, there are many free OCR software packages (Microsoft OneNote 2007, SimpleOCR, TopOCR, FreeOCR and MS Office Document Imaging).
Major Commercial Suppliers
Top Image Systems (TIS) (http://www.topimagesystems.com)
ReadSoft (http://www.readsoft.com)
Teleform (http://www.intelliscan.com/TeleForm1.htm)
Scanner Suppliers
Fujitsu, Canon, Bell & Howell, Kodak
Technology Evolution
[Figure: technology evolution from OCR to ICR to Intelligent Recognition]
Text styles: Machine Print, Constrained Handprint, Unconstrained Handprint, Bad quality machine print, Cursive
Form types (specially designed for automatic recognition): Constraining boxes or combs; Drop out ink for preprinted text & boxes
Form types (no special form design): No constraining boxes or combs; Condensed strings; Dirty & Noisy forms; Bad quality paper; Legacy Forms
Illustration: Conference on Technology Options for 2011 Census
THANK YOU!