Dr. István Marosi Scansoft-Recognita, Inc., Hungary SSIP 2005, Szeged Character Recognition...

Post on 06-Jan-2018

217 views 0 download

description

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Get image B/W Scanning Gray Scanning Color Scanning Load from image file Preprocess image Layout recognition Text recognition User assisted correction Result exportation

Transcript of Dr. István Marosi Scansoft-Recognita, Inc., Hungary SSIP 2005, Szeged Character Recognition...

Dr. István MarosiScansoft-Recognita, Inc., Hungary

SSIP 2005, Szeged

Character Recognition Internals

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognitionText recognitionUser assisted correctionResult exportation

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionGet image

B/W ScanningGray ScanningColor ScanningLoad from image file

Preprocess imageLayout recognitionText recognitionUser assisted correctionResult exportation

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionGet imagePreprocess image

Color separationThresholdingDespecklingRotationDeskewing

Layout recognitionText recognitionUser assisted correctionResult exportation

Color SeparationDe-speckle, de-skew

04 Jul 2005 Istvan Marosi

The Preprocessed ImageJoined chars

04 Jul 2005 Istvan Marosi

Joined charsThe Preprocessed Image

04 Jul 2005 Istvan Marosi

The Preprocessed ImageJoined chars

04 Jul 2005 Istvan Marosi

The Preprocessed ImageBroken chars

04 Jul 2005 Istvan Marosi

The Preprocessed ImageBroken chars

04 Jul 2005 Istvan Marosi

The Preprocessed ImageBroken chars

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisition

Layout recognitionText zones

Columns of flowed textTablesInverse text

Graphic zonesText recognitionUser assisted correctionResult exportation

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisition

Layout recognitionText zonesGraphic zones

Line ArtPhoto

Text recognitionUser assisted correctionResult exportation

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognition

Text recognitionSegmentationCalculation of Feature Vector ElementsClassificationLanguage AnalysisVoting

User assisted correctionResult exportation

04 Jul 2005 Istvan Marosi

SegmentationWhat are those pixel groups belonging to a single letter?

04 Jul 2005 Istvan Marosi

SegmentationWhat are those pixel groups belonging to a single letter?

04 Jul 2005 Istvan Marosi

SegmentationWhat are those pixel groups belonging to a single letter?

04 Jul 2005 Istvan Marosi

SegmentationWhat are those pixel groups belonging to a single letter?

04 Jul 2005 Istvan Marosi

SegmentationWhat are those pixel groups belonging to a single letter?

04 Jul 2005 Istvan Marosi

SegmentationWhat are those pixel groups belonging to a single letter?

04 Jul 2005 Istvan Marosi

SegmentationWhat are those pixel groups belonging to a single letter?

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognition

Text recognitionSegmentationCalculation of Feature Vector ElementsClassificationLanguage AnalysisVoting

User assisted correctionResult exportation

04 Jul 2005 Istvan Marosi

Calculation of FV Elements: Contour Tracing

Find a (new) white-black transitionFollow the “edge” of the pixels using the MIN or MAX ruleAdministrate the already traced white-black transitionsCollect information while going aroundAnd repeat the process on new shapes ...

04 Jul 2005 Istvan Marosi

Contour TracingFind a (new) white-black transition

Follow the “edge” of the pixels using the MIN or MAX ruleAdministrate the already traced white-black transitionsCollect information while going aroundAnd repeat the process on new shapes ...

04 Jul 2005 Istvan Marosi

Contour TracingFind a (new) white-black transitionFollow the “edge” of the pixels using the MIN or MAX rule

if black(a) then turn(ccw)else if black(b) then forwardelse turn(cw)

a b

04 Jul 2005 Istvan Marosi

Contour TracingFind a (new) white-black transitionFollow the “edge” of the pixels using the MIN or MAX rule

if black(a) then turn(ccw)else if black(b) then forwardelse turn(cw)

a b

if white(b) then turn(cw)else if white(a) then forwardelse turn(ccw)

ab

04 Jul 2005 Istvan Marosi

Contour TracingFind a (new) white-black transitionFollow the “edge” of the pixels using the MIN or MAX ruleAdministrate the already traced white-black transitionsCollect information while going aroundAnd repeat the process on new shapes ...

04 Jul 2005 Istvan Marosi

Some Easily Calculatable Data

Problem #1

Turning CW: In=In-1+1Turning CCW: In=In-1-1 Going Forward: In=In-1

04 Jul 2005 Istvan Marosi

Some Easily Calculatable Data

Problem #2

Turning CW: In=In-1+1Turning CCW: In=In-1-1 Going Forward: In=In-1

04 Jul 2005 Istvan Marosi

Some Easily Calculatable Data

Problem #3

Going Up: In=In-1-Xn

Going Down: In=In-1+Xn

Going Right: In=In-1

Going Left: In=In-1

04 Jul 2005 Istvan Marosi

Some Easily Calculatable Data

Problem #4

Going Up: In=In-1-Xn

Going Down: In=In-1+Xn

Going Right: In=In-1

Going Left: In=In-1

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognition

Text recognitionSegmentationCalculation of Feature Vector ElementsClassificationLanguage AnalysisVoting

User assisted correctionResult exportation

AB A B

04 Jul 2005 Istvan Marosi

A

B AB

Classification; Training modelsRestricted Coulomb Energy (RCE) Network(Dr. Leon Cooper, Dr. Charles Elbaum)

04 Jul 2005 Istvan Marosi

Classification; Training modelsRestricted Coulomb Energy (RCE) Network(Dr. Leon Cooper, Dr. Charles Elbaum)

A

B AB

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

Default radius Rmax

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

Default radius Rmax

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

Default radius Rmax

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

Decreased radius

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

Decreased radius Rmin

04 Jul 2005 Istvan Marosi

Classification; Training modelsNestor Learning System (NLS)

Pass 2Decreased radius

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognition

Text recognitionSegmentationCalculation of Feature Vector ElementsClassificationLanguage AnalysisVoting

User assisted correctionResult exportation

04 Jul 2005 Istvan Marosi

VotingText recognition in OmniPage Pro

OCR Engines available:Caere’s engine (codename: Salt & Pepper)

Recognita’s engine (codename: Paprika)

ScanSoft’s engine (codename: Fireworx)

04 Jul 2005 Istvan Marosi

Text recognition in OmniPage ProOCR Engines available:

Caere’s engine (Salt & Pepper)

Uses a Matrix Matching based algorithmfeature set: 40 cells of an 8x5 gridgood overall description of a shapeweaker at detailed structure

Recognita’s engine (Paprika)

Uses a Contour Tracing based algorithmfeture set: convex and concave arcs on the contourgood detailed description of a shapeweaker at overall structure

Voting

04 Jul 2005 Istvan Marosi

Text recognition in OmniPage ProOCR Engines available:

Caere’s engine (Salt & Pepper)

Recognita’s engine (Paprika)

ScanSoft’s engine (Fireworx)

Segmentation algorithms:

Voting

04 Jul 2005 Istvan Marosi

Text recognition in OmniPage ProOCR Engines available:

Caere’s engine (Salt & Pepper)

Recognita’s engine (Paprika)

ScanSoft’s engine (Fireworx)

Segmentation algorithms:Developed by independent groupsHave different strengths and weaknesses

Voting

04 Jul 2005 Istvan Marosi

Text recognition in OmniPage ProOCR Engines availableSegmentation algorithms

Conclusion:They are complementaryLet’s create a voting system

Voting

04 Jul 2005 Istvan Marosi

Voting strategiesExternal „Black box”voting

~20% gain

Image

Paprika Salt &Pepper

Vote

Txt 3 Txt 1

Dict

Final Txt

Voting

Fire-worx

Txt 2

04 Jul 2005 Istvan Marosi

Voting strategiesExternal „Black box”voting

Internal „Shape”voting

Voting Image

Paprika

Fire-worx

BronzeTxt 3

Txt 2

Dict

Final Txt

Salt &Pepper

Txt 1

04 Jul 2005 Istvan Marosi

Paprika

Original segmentation:Every independent connected component is a

character

Good segmentation: recognizeBad segmentation: reject

Image

Recognize originalsegmentationK.B.

Voting

04 Jul 2005 Istvan Marosi

Paprika

Image

Recognize originalsegmentation

Txt 2Train adaptive classifier

from original shapes

K.B.

AdaptiveK.B.

VotingTxt 1

04 Jul 2005 Istvan Marosi

Paprika

Try several segmentationsLoop if unrecognizable

Image

Recognize originalsegmentation

Txt 2Train adaptive classifier

from original shapes

Recognize broken andjoined shapes

K.B.

AdaptiveK.B.

VotingTxt 1

Dict

04 Jul 2005 Istvan Marosi

Paprika

Image

Recognize originalsegmentation

Txt 2Train adaptive classifier

from original shapes

Recognize broken andjoined shapes

K.B.

AdaptiveK.B.

Train adaptive classifierfrom ‘ugly’ shapes

VotingTxt 1

Dict

04 Jul 2005 Istvan Marosi

Paprika

Image

Recognize originalsegmentation

Txt 3

Txt 2Train adaptive classifier

from original shapes

Recognize broken andjoined shapes

K.B.

AdaptiveK.B.

Train adaptive classifierfrom ‘ugly’ shapes

Recognize more brokenand joined shapes Try several segmentations

Loop if unrecognizable

VotingTxt 1

Dict

04 Jul 2005 Istvan Marosi

Image

Paprika

Fire-worx

BronzeTxt 3

Txt 1

Dict

Final Txt

Salt &Pepper

Txt 1

Voting strategies

~60% gain

Voting

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognitionText recognition

User assisted correctionBy the user’s random editing...

Pop-up verifierManual Training

By proofreading of doubtful wordsResult exportation

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognitionText recognition

User assisted correctionBy the user’s random editing...By proofreading of doubtful words

Correct: User dictionaryChanged: IntelliTrain

Remember trained charactersApply them on following pages

Result exportation

04 Jul 2005 Istvan Marosi

IntelliTrainRecognized word: sorneUüng

04 Jul 2005 Istvan Marosi

IntelliTrainRecognized word: sorneUüngFixed word: something

04 Jul 2005 Istvan Marosi

IntelliTrainRecognized word: sorneUüngFixed word: something

04 Jul 2005 Istvan Marosi

IntelliTrainRecognized word: sorneUüngFixed word: somethingSubstitutions found: m rn

thi Uü

04 Jul 2005 Istvan Marosi

IntelliTrainRecognized word: sorneUüngFixed word: somethingSubstitutions found: m rn

thi UüPerform automatically:

Learn image pattern and substitution infoFind similar substituted (‘blue’) text on actual pageMatch against pattern of substitution and correctFind such errors on following pages, too

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognitionText recognitionUser assisted correction

Result exportationCombine pages into a Document

Header / Footer recognitionPage numbersHyperlinks (e.g. „See Table 20”)

Save results

04 Jul 2005 Istvan Marosi

OCR InternalsMain tasks of an OCR system:

Image acquisitionLayout recognitionText recognitionUser assisted correction

Result exportationCombine pages into a DocumentSave results

doc filee-mailSpeech synthesizer

04 Jul 2005 Istvan Marosi