Module 8 -- Automatic Separation and Form ID; …class.kofax.com/KC10/KC10Mod08.pdfModule 8 --...


Module 8 -- Automatic Separation and Form ID; Recognition Profiles

Kofax Technical Training Page 1

Slide 1

Module 8: Automatic Separation and Form ID; Recognition Profiles

Slide 2

Automatic Separation and Form ID; Recognition Profiles

• Automatic separation and form identification

• Adding form types

• Adding sample pages

• Zone types

• Modifying recognition profiles

• Image cleanup profiles

• Copy and paste

• Rubber band OCR/ICR

Slide 3

Form Identification: Topology-based

An analysis of the sample page on a pixel-by-pixel basis results in a histogram. The histogram becomes a “fingerprint” for the sample page.

When documents are scanned, Kofax Capture compares them to the sample page(s) and tries to match the “fingerprints” using two criteria: Confidence and Difference.
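
The fingerprint idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm Kofax Capture uses: the fingerprint here is a coarse intensity histogram, the similarity measure is histogram intersection, Confidence is taken to be the score of the best-matching sample page, and Difference is taken to be the gap between the best and second-best scores. The function names and threshold defaults are hypothetical.

```python
from collections import Counter

def fingerprint(page, bins=4):
    """Normalized intensity histogram of a page given as rows of 0-255 pixel values."""
    counts = Counter(min(px * bins // 256, bins - 1) for row in page for px in row)
    total = sum(counts.values()) or 1  # avoid dividing by zero on an empty page
    return [counts.get(b, 0) / total for b in range(bins)]

def similarity(fp_a, fp_b):
    """Histogram intersection as a percentage; 100 means identical histograms."""
    return 100.0 * sum(min(a, b) for a, b in zip(fp_a, fp_b))

def identify(page, samples, min_confidence=50.0, min_difference=10.0):
    """Return the best-matching form type name, or None if the match is too weak.

    Assumption: 'confidence' is the best score and 'difference' is the margin
    over the runner-up. Both thresholds are hypothetical values.
    """
    fp = fingerprint(page)
    scores = sorted(
        ((similarity(fp, fingerprint(sample)), name) for name, sample in samples.items()),
        reverse=True,
    )
    best_score, best_name = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    if best_score >= min_confidence and best_score - runner_up >= min_difference:
        return best_name
    return None

# Two tiny "sample pages": one half dark, one all white.
samples = {"order": [[0, 0, 255, 255]], "invoice": [[255, 255, 255, 255]]}
```

A page that looks equally like two sample pages fails the Difference test even when its Confidence is high, which is the situation the later slides address with Form Identification zones.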

Slide 4

Automatic Separation and Form ID

• Kofax Capture then automatically performs document separation in the Recognition Server using the form identification method just described.

Select “Kofax page recognition” as the Separation and Form Identification method to make use of topology-based form identification and document separation in the Recognition Server.

Slide 5

Enhancing Automatic Separation and Form ID

• However, there are instances where the documents are far too similar and the standard topology-based form identification method does not yield consistently accurate document separation results.

• In these scenarios, Form Identification zones can be added and the Confidence and Difference settings can be altered (resulting in a custom Separation and Form Identification profile) to aid in document separation.

• This custom Separation and Form Identification profile works in conjunction with the Form ID zone(s).

Form Identification zones use a Recognition Profile to identify a unique set of pixels (a form number, name, logo, or shape), and each zone is linked to a specific form type.

Slide 6

Confidence and Difference Settings

The Confidence and Difference settings can be altered, resulting in a custom Separation and Form Identification profile.

Slide 7

Disabling Topology-based Form ID

• In extreme cases, the topology-based form identification can be completely disabled, relying entirely on one or more Form Identification zones for document separation.

• This is accomplished by creating a custom Separation and Form Identification profile where the Confidence and Difference sliders are both set to zero percent.

• Kofax Capture then uses only the Form Identification zones for performing document separation, enhancing accuracy.

NOTE: When using this technique, proper page registration (as discussed in Module 7) can be critical to consistently attaining the required form identification.

Slide 8

Custom Separation and Form ID Profile

Disable the topology-based Form ID by setting the Confidence and Difference sliders to 0%. This will cause Kofax Capture to perform automatic document separation based entirely on Form ID zones.

Slide 9

Renaming

Objects (batch classes, document classes, folder classes, form types, field types) can be renamed to eliminate confusion. In this case, a new form type is going to be added to this document class, so the batch class, document class and form type will be renamed.

To rename an object, right-click it and select “Rename”, or click the Rename icon in the ribbon.

Slide 10

Add Form Type

Right-click the newly renamed Catalog Orders document class and select “New Form Type” from the context menu. Give the new form type a name.

Slide 11

Image Tab

Permanent Image Cleanup profiles affect the final image output and are applied to the entire form (not just on a zone-by-zone basis).

Since index zones are going to be set up on this page, Kofax Capture automatically registers the page (so “Register all pages” does not need to be checked).

The Automatic Index Field Recognition Threshold slider may be modified as necessary.

Slide 12

Page Level Bar Code Tab

The bar code on the second form type is going to be read as an index zone, so there is no need to modify any settings on the Page Level Bar Code tab.

Click [OK] when settings are complete.

Slide 13

Add Sample Pages

Don’t forget: before adding a sample page, make sure the scanner properties match those of the production environment.

Click [Add Sample Pages].

Slide 14

Sample Page Ready

The sample page can be scanned or imported.

As before, the Sample Page icon appears associated with the form type and the image appears in the Viewer panel.

Slide 15

Zone Types

• Multiple zone types are available for defining locations and activities to be performed by Kofax Capture in production. Zones are selected and configured from the Insert tab and include:

• Data or Index

• Text Registration

• Shape Registration

• Separator

• Form Identification

• Group Zones

Slide 16

Data or Index Zones

• Data or Index zones are associated with fields

• Data may be manually entered by a Validation operator (like the State field)

• Or it can be automatically read using one of the recognition engines

• Fields may also be populated using a database lookup or via custom validation scripts

• The zone drawn also defines the field of view for the operator in the Validation and Verification queues

Slide 17

Text Registration Zones

• Registration zones use optical recognition technology to recognize the first four letters of a text string

• A registration point is placed at the bottom-left corner of the first character; registration points are the reference points from which zones are registered

• Registration points help to align pages, aiding zone positioning for accurate optical data extraction

Remember: Registration is the alignment or calibration of a page being imported or scanned to its sample page, helping to ensure that zones line up, with the goal of enhancing the accuracy of optical recognition operations. Registration can be enhanced by one or more Registration Zones (Text and/or Shape).
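
How a registration point helps zones line up can be illustrated with a small Python sketch. It assumes the simplest case, a pure shift between the sample page and the scanned page; real registration may also need to handle skew or scaling, and the function name, coordinate layout, and zone names are hypothetical.

```python
def register_zones(sample_point, found_point, zones):
    """Shift every zone by the offset between two registration points.

    sample_point: (x, y) of the registration point on the sample page.
    found_point:  (x, y) of the same point located on the scanned page.
    zones:        {name: (x, y, width, height)} as drawn on the sample page.
    Assumption: the pages differ only by a translation.
    """
    dx = found_point[0] - sample_point[0]
    dy = found_point[1] - sample_point[1]
    return {name: (x + dx, y + dy, w, h) for name, (x, y, w, h) in zones.items()}

# The scanned page is shifted 4 pixels right and 3 pixels up from the sample.
shifted = register_zones((100, 200), (104, 197), {"State": (300, 400, 120, 40)})
```

Only the zone origin moves; the zone size is unchanged because a translation does not stretch the page.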

Slide 18

Shape Registration Zones

• One or more registration points based on Shape Registration zones may also be created for registration.

• Shape Registration zones use shapes like corners, line intersections, and BLOBs (Binary Large OBjects) to establish registration points.

Slide 19

Separator Zones

• Separator zones use optical recognition technology to recognize an alphanumeric string.

• Any time that string is found in the specified location, document separation occurs.

• Separator zones are used when there is a single form type per batch class.

• Separator zones require that a custom Separation and Form Identification profile be set up at the batch class level.

• Only one Separator zone is allowed per batch class.
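
The separation rule for Separator zones can be sketched as follows. This is a hedged illustration, not Kofax Capture code: it assumes the separator zone has already been read on every page, and that a page on which the string is found becomes the first page of a new document. The function name and input shape are hypothetical.

```python
def separate(zone_reads, separator_text):
    """Group page numbers into documents.

    zone_reads: text recognized in the separator zone on each page, in scan order.
    A page whose zone text equals separator_text starts a new document
    (assumption); the very first page always starts one.
    """
    documents = []
    for page_number, text in enumerate(zone_reads, start=1):
        if text.strip() == separator_text or not documents:
            documents.append([])
        documents[-1].append(page_number)
    return documents

# Five scanned pages; the string appears on pages 1 and 3.
docs = separate(["ORDER FORM", "", "ORDER FORM", "", ""], "ORDER FORM")
```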

Slide 20

Form Identification Zones

• Like Separator zones, Form Identification zones use optical recognition technology to recognize an alphanumeric string.

• Upon recognition, the zone can:

• Identify the form type

• Separate the document

• Designed for use with multiple form types in the batch class.

Slide 21

Group Zones

• Group zones allow the collection of multiple values and return the best single value (like a check box) or concatenate the multiple values (like our First Name/Last Name below) into a single field value.

• All zones within a Group zone, as well as the Group zone itself, are associated with a common field.
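
The two ways a Group zone can produce a single field value can be sketched in Python. The sketch assumes each member zone yields a value together with a recognition confidence; that pairing, and both function names, are illustrative assumptions rather than the product's data model.

```python
def best_value(members):
    """Return the value of the member zone with the highest confidence.

    members: list of (value, confidence) pairs, one per member zone.
    """
    return max(members, key=lambda member: member[1])[0]

def concatenated_value(members, separator=" "):
    """Join the non-empty member values, in zone order, into one field value."""
    return separator.join(value for value, _ in members if value)

# Check boxes: only one is confidently marked.
card = best_value([("Visa", 12), ("MasterCard", 91), ("Amex", 5)])
# Name fields: both values are kept and joined.
name = concatenated_value([("Jane", 88), ("Doe", 93)])
```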

Slide 22

Creating a Group Zone

1. Define each included member zone first.

2. Select the Group Zone icon from the Insert tab.

3. Draw a box around the group with the mouse.

The group zone has a magenta frame. This magenta bounding box defines the area of the image that the Validation and Verification operators will view.

Slide 23

Group Zone Properties

Configure the properties for the group zone.

Slide 24

Zones are Indicated on the Sample Page

Slide 25

Selecting Recognition Profiles

Recognition profiles use recognition engines. Kofax profiles may be edited and saved under a new name (“Save As”) to use whenever needed.

If Auto recognition is enabled, the zone will be processed using a Recognition profile.

Slide 26

Modifying Recognition Profiles

• Remember: Kofax-supplied profiles can be used as-is, or they can be modified and saved as a new custom profile.

• When a profile is modified, it becomes a new custom profile that gets added to the pool of profiles from which to select when defining optical recognition operations.

• Changing the profile in one instance will affect the behavior of that profile in all instances of its use.

Slide 27

Recognition Profile Settings

• Recognition Profiles contain settings for maximizing data extraction.

• For example, a choice of different languages is available…

• Or the print type may be selected…

• Or the content and character set…

Notice that [Save] is disabled, but [Save As] is available.

Remember: Kofax profiles cannot be overwritten; they can only be modified and then saved as new custom profiles.

Slide 28

Image Cleanup Profiles

• Every recognition profile may use an image cleanup profile.

• This image cleanup is applied to the individual zone using the recognition profile, so the changes are temporary and do not affect the image output at export.

• Zonal image cleanup increases accuracy for automatic recognition.

• Image cleanup profiles propagate upward and can be shared by recognition profiles.

Select [Edit] to modify the Image Cleanup settings.

Slide 29

Black Border Removal & Cropping

Black border removal gets rid of black borders around white documents. The white noise gap is the number of pixels Kofax Capture must encounter before it stops border removal, because at that point it assumes it has reached the form itself.

Please refer to the Reference Section of your course material for examples of image cleanup.
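
The role of the white noise gap can be pictured on a single scan line. The Python sketch below is an assumption about the behavior, not Kofax's implementation: it treats the gap as a count of consecutive white pixels, works from the left edge only, and uses 1 for black and 0 for white. The function name is hypothetical.

```python
def strip_left_border(row, white_noise_gap):
    """Whiten the left border of one scan line (1 = black, 0 = white).

    Scanning from the left edge, white runs shorter than white_noise_gap are
    treated as noise inside the border. The border ends where the first run of
    white_noise_gap consecutive white pixels begins (assumed interpretation).
    """
    white_run = 0
    for i, px in enumerate(row):
        white_run = white_run + 1 if px == 0 else 0
        if white_run >= white_noise_gap:
            edge = i - white_noise_gap + 1  # first pixel of the qualifying white run
            return [0] * edge + list(row[edge:])
    return [0] * len(row)  # no qualifying gap: the whole line counts as border

# Border with one stray white pixel, then the form begins after three white pixels.
cleaned_row = strip_left_border([1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0], white_noise_gap=3)
```

A larger gap value makes border removal more aggressive, since more white pixels are tolerated before the border is considered finished.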

Slide 30

Deshade

Deshading removes areas of shading on a document. Set the minimum width and height for an area of shading and tell Kofax the maximum size of a “speckle.” Remember, to Kofax, speckles (black and white pixels) make up what we see as shading.

Notice that a choice of measurement units is available: English (inches), Metric (centimeters) or Pixels.

Slide 31

Deskew

Deskewing straightens crooked documents, up to 12 degrees of skew. Documents skewed by more than 12 degrees must be rescanned, or VRS can be used (it can deskew up to 45 degrees).

Slide 32

Despeckle

Speckles up to 32x32 pixels can be removed through the despeckle settings.

Slide 33

Line Removal

Horizontal and vertical lines may be eliminated by defining what a line is (and isn’t). In the top example, any horizontal line that is less than 200 pixels (1 inch long at 200 dpi) is NOT removed. Maybe it’s the crossbar on the letter “T.”

The maximum gap in a line that Kofax allows before it stops removing the line can be specified, and the “Edge clean factor” can be set to smooth jagged lines.

“Repair character” fixes characters that have been damaged by line removal, such as when someone doesn’t stay between the lines when filling out a form.
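
The minimum-length and maximum-gap settings can be illustrated on one row of pixels. This Python sketch is a simplified assumption of how such a filter could work, not the product's algorithm: it handles horizontal lines only, ignores line thickness, edge cleaning, and character repair, and uses 1 for black and 0 for white. The function and parameter names are hypothetical.

```python
def remove_horizontal_lines(row, min_length, max_gap=0):
    """Clear black runs that qualify as lines in one pixel row (1 = black).

    A candidate line may contain white gaps of up to max_gap pixels.
    Candidates shorter than min_length (a letter's crossbar, say) are kept.
    """
    cleaned = list(row)
    start = None       # index where the current candidate line began
    last_black = None  # index of the most recent black pixel in it
    # Trailing white padding guarantees the final candidate gets evaluated.
    for i, px in enumerate(list(row) + [0] * (max_gap + 1)):
        if px:
            if start is None:
                start = i
            last_black = i
        elif start is not None and i - last_black > max_gap:
            if last_black - start + 1 >= min_length:
                for j in range(start, last_black + 1):
                    cleaned[j] = 0
            start = None
    return cleaned

row = [1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1]
bridged = remove_horizontal_lines(row, min_length=6, max_gap=1)
strict = remove_horizontal_lines(row, min_length=6, max_gap=0)
```

With a gap allowance of one pixel the first eight pixels count as a single broken line and are removed; with no allowance, every run is shorter than the minimum length and the row is left untouched.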

Slide 34

Streak Removal

Vertical streaks may be caused by errant pixels being held in memory too long during the scan operation. A streak width of up to 10 pixels can be specified.

Slide 35

Edge Enhancement

Edge enhancement filters are either ON or OFF. There are no user-definable properties. Look at the word “Widgets” in the reference material provided for an example of what each of these filters does.

Slide 36

Selecting Units of Measure

It is possible to start with inches, but change to pixels.

Built-in conversion calculator
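
The arithmetic behind such a conversion is simple enough to sketch. The Python function below is a hypothetical stand-in, not the product's calculator: it converts a measurement in inches, centimeters, or pixels to whole pixels at a given scan resolution.

```python
CM_PER_INCH = 2.54

def to_pixels(value, unit, dpi):
    """Convert a measurement to whole pixels at the given resolution (dots per inch)."""
    if unit == "pixels":
        return round(value)
    if unit == "inches":
        return round(value * dpi)
    if unit == "centimeters":
        return round(value / CM_PER_INCH * dpi)
    raise ValueError(f"unknown unit: {unit}")

# One inch at 200 dpi is the 200-pixel line length used in the Line Removal example.
one_inch = to_pixels(1, "inches", 200)
```

The same physical size maps to a different pixel count at each resolution, which is why the scanner properties used for sample pages should match production.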

Slide 37

Testing an Image Cleanup Profile

Image cleanup settings can be tested from this window.

Slide 38

Image Cleanup Test Results

Horizontal and Vertical line removal working together

View of image without image cleanup

Results of recognition operation after image cleanup and extraction

View of image with image cleanup applied

Slide 39

Save As to Create Image Cleanup Profile

Select [Save As] to name the profile, and then [OK], and finally [Close] to exit.

Slide 40

Save As to Create Recognition Profile

Note that the image cleanup profile propagates upward and constitutes a change to the recognition profile. So select [Save As] to name the new recognition profile, and then [Close] to exit.

Slide 41

Image Cleanup Profile Re-use

Remember: Image cleanup profiles propagate upward and can be shared by recognition profiles.

Slide 42

Recognition and Image Cleanup Profile Re-use

Look for the opportunity to reuse image cleanup and recognition profiles.

Remember: Changing the profile in one instance will affect the behavior of that profile in all instances of its use.

Slide 43

Copy and Paste – Objects Setup

Use Copy to duplicate an existing object (batch class, document class, folder class, or form type). The copy method is useful for creating a new object that is similar to an existing one.

Highlight the object and select [Copy].

Slide 44

Paste Option

When the Paste option is selected, a copied batch class is given a different name (such as Name1) with links to the existing document class(es).

[Diagram: the copied batch class links to the original document classes.]

Slide 45

Paste Special Option

Paste with a copy of the original Document Classes -- a copy of all document class(es) and their related objects will be created with unique name(s) and associated with the copied batch class.

Paste with a link to the original Document Classes -- a copy of the batch class will be created, linked to the original document classes (same as the Paste option).

[Diagram: a copied batch class with copied document classes, and a copied batch class linked to the original document classes.]
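
The difference between linking and copying is the familiar one between a shallow and a deep copy. The Python sketch below models it with two hypothetical classes; the class names, the naming suffix, and the functions are illustrative assumptions and have nothing to do with how Kofax Capture stores batch classes.

```python
import copy
from dataclasses import dataclass, field

@dataclass
class DocumentClass:
    name: str
    form_types: list = field(default_factory=list)

@dataclass
class BatchClass:
    name: str
    document_classes: list = field(default_factory=list)

def paste(batch):
    """Paste: a new batch class that links to the SAME document class objects."""
    return BatchClass(batch.name + "1", list(batch.document_classes))

def paste_special_copy(batch):
    """Paste Special with copies: document classes are duplicated and renamed."""
    clones = copy.deepcopy(batch.document_classes)
    for doc in clones:
        doc.name += "1"  # hypothetical way of making the copied names unique
    return BatchClass(batch.name + "1", clones)

orders = DocumentClass("Catalog Orders", ["Order Form"])
original = BatchClass("Orders", [orders])
linked = paste(original)
copied = paste_special_copy(original)

# A later change to the original document class shows up only through the link.
orders.form_types.append("Rush Order")
```

After the last line, the linked batch class sees the new form type while the copied one does not, which is the practical consequence to weigh when choosing between the two paste options.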

Slide 46

Copy and Paste – Zones

Zones can also be copied and pasted for ease and speed of administration.

Highlight a zone, right-click and select “Copy”.

Slide 47

Paste New Zones

Right-click and select “Paste”.

Slide 48

New Zone and Index Field are Created

A new zone is created and can be placed as desired on the sample page.

A new index field is created in the document class.

Slide 49

Rubber Band OCR/ICR Setup

Rubber band OCR/ICR is set up in the Administration module as a Batch Class Advanced property.

Select a recognition profile based on one of the following profiles (this selection applies only to fields that are not already set up for automatic recognition):

• Kofax High Performance OCR Zonal

• Kofax High Performance ICR Zonal

• Kofax Advanced OCR Zonal

• Kofax ICR Zonal

• A custom profile based on one of the above profiles

Slide 50

Rubber Band OCR/ICR in Validation & Verification

Step 1: The Validation and/or Verification operator selects the rubber band OCR/ICR icon from the Home tab.

The cursor changes shape.

Slide 51

Rubber Band OCR/ICR – Step 2

Step 2: With the new cursor icon, the operator draws a zone around the data to be read and extracted.

Slide 52

Rubber Band OCR/ICR – Step 3

Step 3: The index field is populated with the value read.

Remember: This feature first has to be set up in the Administration module (Batch Class Advanced properties).

Slide 53

Demonstration

Adding form types and sample pages, setting up zones, modifying image cleanup and recognition profiles, copy and paste, and rubber band OCR

Slide 54

Lab

• Adding form types and sample pages

• Setting up zones

• Modifying image cleanup and recognition profiles

• Copy and paste

• Rubber band OCR

Refer to the Kofax Capture 10 Lab and Reference Guide