LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition. Supervised by Prof. LYU, Rung Tsong Michael. Prepared by: Wong Chi Hang, Tsang Siu Fung. Department of Computer Science & Engineering, The Chinese University of Hong Kong.


Transcript of LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Page 1: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Supervised by Prof. LYU, Rung Tsong Michael

Prepared by: Wong Chi Hang

Tsang Siu Fung

Department of Computer Science & Engineering

The Chinese University of Hong Kong

Page 2: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Outline

• Introduction

• Overall Design

• Korean OCR

• Face Detection

• Future Work

Page 3: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Introduction – What is VTT?

Smart Traveller with Visual Translator (VTT)

• Runs on a mobile device that is convenient for a traveller to carry: mobile phone, Pocket PC, Palm, etc.

• Recognizes foreign text and translates it into the user's native language

• Detects a face and recognizes it to recall the person's name

Page 4: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Introduction – Motivation

More and more people own mobile devices such as Pocket PCs, Palms and mobile phones.

Mobile devices are becoming more powerful.

Many people travel abroad.

Page 5: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Introduction – Motivation (Cont.)

Types of programs for mobile devices:

• Communication and network

• Multimedia

• Games

• Personal management

• System tool

• Utility

Page 6: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Introduction – Motivation (Cont.)

Applications for travellers? Almost none!

Travellers very often run into problems with an unfamiliar foreign language.

Therefore, the demand for a traveller application is very large.

Page 7: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Introduction – Objective

Help travellers overcome language and memory problems

Two main features:

• Recognize and translate Korean to English (we cannot read Korean)

• Detect and recognize faces (sometimes we forget the name of a friend)

Page 8: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Introduction – Objective (Cont.)

Target of Korean OCR

• Signs and guideposts

• Printed characters

• Contrasting text color and background color

Target of Face Recognizer

• One face in the photo

• Frontal face

• Limited set of faces

Page 9: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Introduction – Objective (Cont.)

Real-life examples

• Sometimes we lose our way and need to know where we are.

• Sometimes we forget somebody we have met before.

Page 10: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Overall Design of VTT System

[Figure: block diagram of the VTT system. Components: User, GUI, Camera API, Camera, Korean OCR, Face Recognizer, Face Database, Stroke Database & Dictionary. Arrows are labeled Request/Response, Request/Data, Request/Output, Query/Result and Update/Result.]

Page 11: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Design

[Figure: KOCR processing pipeline. Stages: Image Segmentation (Text Area Detection, Character Extraction, Binarization, Stroke Extraction), Stroke Feature Extraction, Stroke Recognition, Stroke Combination, Translator. The Stroke Database & Dictionary is consulted through Query/Result arrows. Data labels: Input Image, Binary Character Image, Strokes, Strokes' Feature Vectors, Recognized Results, Korean Chars, English.]

Page 12: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Text Area Detection

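Edge detection using the Sobel filter

$$G_y: \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \qquad G_x: \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$$

$$F_E = |G_x| + |G_y|$$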
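The edge-strength formula can be illustrated with a minimal pure-Python sketch. The function name `sobel_edge_strength` and the decision to leave border pixels at 0 are assumptions made for this example; the slides do not say how borders are handled.

```python
# Sobel kernels: GX responds to vertical edges, GY to horizontal edges.
GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[-1, -2, -1],
      [0, 0, 0],
      [1, 2, 1]]


def sobel_edge_strength(gray):
    """Return F_E = |G_x| + |G_y| for a 2-D list of gray levels.

    Border pixels stay 0 (an assumption for this sketch).
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    p = gray[y + dy][x + dx]
                    gx += GX[dy + 1][dx + 1] * p
                    gy += GY[dy + 1][dx + 1] * p
            out[y][x] = abs(gx) + abs(gy)
    return out


# A vertical step edge: dark left half, bright right half.
image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_edge_strength(image)
print(edges[1])  # [0, 40, 40, 0]
```

Only the horizontal gradient fires on this step edge, so the whole response comes from the G_x term.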

Page 13: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Text Area Detection (Cont.)

Horizontal and Vertical Edge Projection

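[Figure: edge image with its horizontal projection and vertical projection profiles; a threshold is marked on the profiles.]

$$P_h(y) = \frac{1}{\text{width}} \sum_{i=0}^{\text{width}-1} F_E(i, y)$$

$$P_v(x) = \frac{1}{\text{height}} \sum_{j=0}^{\text{height}-1} F_E(x, j)$$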
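The two projections are row and column averages of the edge image, and a text area is where both exceed the threshold. A minimal sketch follows; `edge_projections`, `indices_above` and the threshold value 2 are hypothetical names and numbers chosen for illustration, since the slides do not give the actual threshold.

```python
def edge_projections(edge):
    """Return (P_h, P_v): the mean edge strength of each row and each column."""
    h, w = len(edge), len(edge[0])
    p_h = [sum(edge[y][i] for i in range(w)) / w for y in range(h)]
    p_v = [sum(edge[j][x] for j in range(h)) / h for x in range(w)]
    return p_h, p_v


def indices_above(profile, threshold):
    """Positions whose projection value exceeds the threshold."""
    return [i for i, v in enumerate(profile) if v > threshold]


# Toy edge image with a strong block in the middle.
edge = [[0, 0, 0, 0],
        [0, 8, 8, 0],
        [0, 8, 8, 0],
        [0, 0, 0, 0]]
p_h, p_v = edge_projections(edge)
rows = indices_above(p_h, 2)   # candidate text rows
cols = indices_above(p_v, 2)   # candidate text columns
print(rows, cols)  # [1, 2] [1, 2]
```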

Page 14: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Binarization

Color segmentation based on the color histogram

[Figure: color histogram with the threshold marked.]

Page 15: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Stroke Extraction

Labeling of connected components with 8-connectivity
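A minimal sketch of connected-component labeling with 8-connectivity, using a breadth-first flood fill. This is one common way to do it, not necessarily the project's implementation; the name `label_components` is an assumption.

```python
from collections import deque


def label_components(binary):
    """Label foreground pixels (truthy values) with 8-connectivity.

    Returns (labels, count); label 0 means background.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    # Visit all 8 neighbours, diagonals included.
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny][nx]
                                    and labels[ny][nx] == 0):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count


binary = [[1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 1]]
labels, count = label_components(binary)
print(count)  # 2
```

The two diagonal pixels in the top-left corner form one component only because diagonals count as neighbours; with 4-connectivity they would be two separate strokes.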

Page 16: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Stroke Extraction (Cont.)

Why do we choose strokes rather than whole characters?

• A Korean character is composed of several stroke types

• There is only a limited set of stroke types in Korean

Page 17: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Stroke Feature

Our proposed feature:

• Five rays on each side

• Difference of adjacent rays (-1, 0 or 1)

• Has holes (0 or 1)

• Dimension ratio of the stroke (width/height) (-1, 0 or 1)

Page 18: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Stroke Feature (Cont.)

Problems faced:

• Training the stroke database takes a long time

• Two or more strokes may stick together

Page 19: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Stroke Recognition

• Exact matching by pre-learned stroke features

• Trained decision tree

Page 20: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

KOCR – Pattern Identification

Six patterns of Korean characters

Identified by simple if-then-else statements

[Figure: the six character patterns, numbered 0 to 5.]

Page 21: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Face Detection

Outline

1. Find Face Region

2. Find the potential eye region

3. Locate the iris

4. Improvement

Page 22: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

1. Find Face Region

There are three methods available

1. Projection of the image

2. Based on gray-scale image

3. Color-based model

Page 23: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

1. Find Face Region -Projection of the image

Consider only a single color channel: blue, green or red.

Usually the blue channel is used because it avoids interference from the facial features.

Project the blue pixel values vertically (sum each column) to find the left and right edges of the face.

Page 24: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

1. Find Face Region (Cont.) -Projection of the image

[Figure: projection curve. Horizontal axis: x. Vertical axis: sum of pixel value. The edges of the face are marked on the curve.]

Page 25: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

1. Find Face Region (Cont.) -Projection of the image

The high-frequency components of this projection curve should be filtered out using the FFT (Fast Fourier Transform).

This method assumes that the face occupies a large area of the image.
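The projection and smoothing steps can be sketched as follows. To stay dependency-free, the sketch uses a naive discrete Fourier transform in place of a true FFT (same result, slower); the names `column_sums` and `lowpass` and the cutoff parameter `keep` are assumptions, and the slides do not say how the face edges are then read off the smoothed curve.

```python
import cmath


def column_sums(blue):
    """Vertical projection: sum of the blue values in each column."""
    h, w = len(blue), len(blue[0])
    return [sum(blue[y][x] for y in range(h)) for x in range(w)]


def lowpass(signal, keep):
    """Zero every frequency above `keep` and transform back.

    Uses a naive O(n^2) DFT as a stand-in for the FFT.
    """
    n = len(signal)
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Frequency k and n - k are mirror images of each other.
        if min(k, n - k) > keep:
            spectrum[k] = 0
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


curve = column_sums([[1, 2], [3, 4]])      # [4, 6]
smooth = lowpass([1, -1, 1, -1, 1, -1, 1, -1], keep=1)
```

The alternating test signal contains only the highest frequency, so the low-pass filter flattens it to approximately zero, while a constant curve would pass through unchanged.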

Page 26: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

1. Find Face Region -Based on gray-scale image

• No color information

• Pattern recognition

Page 27: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

1. Find Face Region -Color-based model

We use this method because of its simplicity and robustness.

A color model is used to represent color.

Since the human retina has three types of color photoreceptor (cone cells), a color model needs three numerical components.

Page 28: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Color-based model (Cont.)

There are many color models, such as RGB, YUV (luminance-chrominance) and HSB (hue, saturation and brightness).

Usually the RGB color model is transformed to another color model such as YUV or HSB.

Page 29: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Color-based model (Cont.)-YUV

We use the YUV (YCbCr) color model.

The Y component represents the intensity of the image.

Cb and Cr represent the blue and red chrominance components respectively.
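The slides do not state which RGB-to-YCbCr variant is used. As an illustration, the sketch below assumes the full-range ITU-R BT.601 conversion used by JPEG/JFIF, with 8-bit inputs; the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 / JFIF coefficients).

    Assumption: this particular variant is chosen only for illustration.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


white = rgb_to_ycbcr(255, 255, 255)   # neutral color: Cb and Cr sit at 128
red = rgb_to_ycbcr(255, 0, 0)         # reddish color: Cr above 128, Cb below
```

Y carries the brightness, so skin detection can work on the (Cb, Cr) pair alone and be less sensitive to lighting intensity.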

Page 30: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Color-based model (Cont.) -YCbCr Image

[Figure: the original image and its Y, Cb and Cr component images.]

Page 31: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color

How can the YUV color model represent face color?

What happens when we transform the pixels into a Cr-Cb histogram?

Page 32: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color

We use a simple ellipse equation to model skin color.

[Figure: skin-color region in the Cb-Cr plane (axes: Cb, Cr).]

Page 33: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color

The equation of the ellipse:

$$\frac{X^2}{L^2} + \frac{Y^2}{S^2} = 1$$

where L is the length of the long axis and S is the length of the short axis.

We choose L = 35.42, S = 20.615, θ = -0.726 (radians)
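A pixel can be classified as skin when its chrominance falls inside the ellipse. The sketch below makes two assumptions that the slides do not spell out: X and Y are taken to be the (Cb, Cr) coordinates relative to an ellipse center and rotated by θ, and the center `(cb0, cr0)` is a required parameter because its value is not given. The center used in the example call is purely hypothetical.

```python
import math

L_AXIS = 35.42     # L from the slide
S_AXIS = 20.615    # S from the slide
THETA = -0.726     # rotation angle in radians, from the slide


def is_skin(cb, cr, cb0, cr0):
    """True if (cb, cr) lies inside the ellipse X^2/L^2 + Y^2/S^2 <= 1.

    (cb0, cr0) is the ellipse center; its value is NOT given in the
    slides and must be supplied by the caller.
    """
    dx, dy = cb - cb0, cr - cr0
    # Rotate into the ellipse's own axes (assumed meaning of X and Y).
    x = math.cos(THETA) * dx + math.sin(THETA) * dy
    y = -math.sin(THETA) * dx + math.cos(THETA) * dy
    return (x * x) / (L_AXIS ** 2) + (y * y) / (S_AXIS ** 2) <= 1.0


center = (110.0, 150.0)  # hypothetical center, for illustration only
inside = is_skin(112.0, 149.0, *center)
```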

Page 34: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color -Color segmentation

• The white regions represent the skin color pixels

Page 35: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color -Color segmentation (modified version 1)

We distribute some agents uniformly over the image.

Each agent then checks whether its pixel is skin-like and has not been visited by another agent.

If yes, it produces 4 more agents at its four neighboring points.

If no, it moves to one of its four neighboring points at random.
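The agent rules above can be sketched as a small simulation. Several details are assumptions added so that the sketch terminates and is reproducible: agents start on a regular grid with spacing `step`, each agent may make at most `max_moves` random moves before it dies, agents that step outside the image are dropped, and new agents inherit the region id of their parent. None of these values come from the slides.

```python
import random
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def agent_segmentation(is_skin, step=4, max_moves=3, seed=0):
    """Label skin pixels with agents; return (region map, pixels examined)."""
    rng = random.Random(seed)
    h, w = len(is_skin), len(is_skin[0])
    region = [[0] * w for _ in range(h)]   # 0 = not labelled
    examined = set()
    agents = deque()
    next_id = 0
    # Distribute the initial agents uniformly on a grid.
    for y in range(0, h, step):
        for x in range(0, w, step):
            next_id += 1
            agents.append((y, x, next_id, max_moves))
    while agents:
        y, x, rid, moves = agents.popleft()
        examined.add((y, x))
        if is_skin[y][x] and region[y][x] == 0:
            # Skin-like and unvisited: claim it and spawn 4 new agents.
            region[y][x] = rid
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    agents.append((ny, nx, rid, max_moves))
        elif moves > 0:
            # Otherwise move to a random neighbouring point.
            dy, dx = rng.choice(NEIGHBOURS)
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                agents.append((ny, nx, rid, moves - 1))
    return region, len(examined)


# 10 x 10 mask with one 4 x 4 skin blob.
mask = [[2 <= y <= 5 and 2 <= x <= 5 for x in range(10)] for y in range(10)]
region, examined = agent_segmentation(mask)
```

Once any agent lands on the blob, the spawning rule floods the whole 4-connected blob, while background pixels are only sampled along the agents' short random walks.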

Page 36: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color (Cont.)

-Color segmentation (modified version1)

• This agent produces 4 more agents

If the pixel is skin-like and has not been visited by another agent, the agent produces 4 more agents at its four neighboring points.

Page 37: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color (Cont.)

-Color segmentation (modified version 1)

Otherwise, it moves to one of its four neighboring points at random.

This agent moves to one of its four neighboring points

Page 38: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color (Cont.)

-Color segmentation (modified version 1)

Each agent searches its own region.

The regions are shown on the next slide in different colors.

Page 39: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color (Cont.)

-Color segmentation (modified version 1)

The advantage of this algorithm is that we need not search the whole image. Therefore, it is fast.

Page 40: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Representation of Face color (Cont.)

-Color segmentation (modified version1)

19270 of 102900 pixels are searched (about 18.7%)

There are 37 regions

Page 41: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

2. Eye detection

After segmenting the face region, some parts inside it are not classified as skin color.

They are probably the eye and mouth regions.

We consider only the red component of these regions because it usually carries the most information about the face.

Page 42: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

2. Eye detection (Cont.)

We extract these regions using a pseudo-convex hull.

Page 43: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

2. Eye detection (Cont.)

We apply the following to the potential eye regions:

1. Histogram equalization

2. Threshold

Page 44: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

2. Eye detection (Cont.)

Histogram equalization

Threshold with < 49

After histogram equalization and thresholding, the search space for the eyes is greatly reduced.
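A minimal sketch of the two steps, assuming 8-bit gray levels and the standard cumulative-histogram form of equalization. The threshold 49 is the value from the slide; reading "< 49" as "keep the pixels darker than 49" is an interpretation, and the function names are hypothetical.

```python
def equalize(gray, levels=256):
    """Histogram-equalize a 2-D list of integer gray levels."""
    flat = [p for row in gray for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of the gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:           # constant image: nothing to stretch
        return [row[:] for row in gray]
    lut = [round(max(c - cdf_min, 0) * (levels - 1) / (n - cdf_min))
           for c in cdf]
    return [[lut[p] for p in row] for row in gray]


def threshold_below(gray, limit=49):
    """1 where the pixel is darker than `limit`, else 0."""
    return [[1 if p < limit else 0 for p in row] for row in gray]


eq = equalize([[10, 20], [30, 40]])
mask = threshold_below(eq)
print(eq)    # [[0, 85], [170, 255]]
print(mask)  # [[1, 0], [0, 0]]
```

Equalization spreads the gray levels over the full range first, so a fixed threshold such as 49 selects a comparable fraction of dark pixels regardless of the original exposure.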

Page 45: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris

After the operations above, we have almost found the eye.

However, we still need to locate the iris. We tried two different methods:

• Template matching

• Hough transform

Page 46: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.)-Template matching

It is based on normalized cross-correlation (NCC), which measures the similarity between two images.

Page 47: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.) -Template matching

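Let I1, I2 be images of the same size.

I1(pi) = ai , I2(pi) = bi

$$NCC(I_1, I_2) = \frac{\sum_i (a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_i (a_i - \bar{a})^2 \, \sum_i (b_i - \bar{b})^2}}$$

where $\bar{a}$ and $\bar{b}$ are the mean pixel values of I1 and I2.

NCC(I1, I2) lies in the range [-1, 1]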
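The NCC formula translates directly into code. In this minimal sketch the images are passed as flat lists of pixel values; returning 0.0 when one image is constant (zero denominator) is an assumption, since the slides do not cover that case.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equally sized pixel lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    if den == 0:
        return 0.0  # assumption: a constant image carries no pattern
    return num / den


same_shape = ncc([1, 2, 3], [2, 4, 6])   # close to  1: same pattern, brighter
inverted = ncc([1, 2, 3], [3, 2, 1])     # close to -1: inverted pattern
```

Because the means are subtracted and the result is normalized, the score does not change when one image is uniformly brighter or has higher contrast, which is why it suits template matching.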

Page 48: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.) -Template matching

We use this template and calculate the NCC.

This template can be obtained by averaging all the eye images.

Page 49: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.) -Template matching

The red region shows the result.

Page 50: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.) -Hough transform

The Hough transform can recover the complete shape of an edge from only a small portion of the edge information.

It works with a parametric representation of the object we are looking for.

We use Hough transform with 2D circle parametric representation to find the iris.

Page 51: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.) -Hough transform

We find the edge of eye by Sobel filter.

Page 52: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.) -Hough transform

We overlay a circle on the edge image and count the number of pixels lying on the circle.

Page 53: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.) -Hough transform

A(x, y, r) <- number of pixels on the circle

where A(x, y, r) is the accumulator, (x, y) are the coordinates of the center and r is the radius of the circle.

The search space for the circle is [x, y, r] = [17, 17, 8].
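The accumulator can be sketched by trying every candidate circle and counting the edge pixels that lie on it. The sampling of 64 points per circle, the names `circle_points` and `best_circle`, and the radius range in the example are assumptions for illustration; the slides do not describe how the circle is rasterized.

```python
import math


def circle_points(cx, cy, r, samples=64):
    """Integer pixel positions on a circle, sampled at `samples` angles."""
    pts = set()
    for k in range(samples):
        t = 2 * math.pi * k / samples
        pts.add((round(cx + r * math.cos(t)), round(cy + r * math.sin(t))))
    return pts


def best_circle(edge, radii):
    """Return (votes, (cx, cy, r)) for the circle covering most edge pixels."""
    h, w = len(edge), len(edge[0])
    best = (0, None)
    for r in radii:
        for cy in range(h):
            for cx in range(w):
                # A(cx, cy, r): number of edge pixels lying on this circle.
                votes = sum(1 for (x, y) in circle_points(cx, cy, r)
                            if 0 <= x < w and 0 <= y < h and edge[y][x])
                if votes > best[0]:
                    best = (votes, (cx, cy, r))
    return best


# Synthetic 17 x 17 edge image containing one circle of radius 4.
edge = [[0] * 17 for _ in range(17)]
for x, y in circle_points(8, 8, 4):
    edge[y][x] = 1
votes, circle = best_circle(edge, radii=range(2, 7))
```

On this clean synthetic image the true circle collects the most votes; on real eye images, noisy or partial edges can make other candidates win.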

Page 54: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

3. Locate the iris (Cont.) -Hough transform

We have tried this method.

It fails to find the iris.

Page 55: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

4. Improvement

Skin color detection

• Neural network with a simplified (polynomial) activation function

• Probability function (e.g. Bayesian estimation)

Set up a face shape model

• It estimates the shape of the face

Page 56: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

4. Improvement (Cont.)

Template matching

• Replace it with a deformable template or a probability function.

Page 57: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Future Work

• Stroke combination

• Dictionary

• Face detection improvement

• Face recognition: normal-luminance light source, about 20 people, > 90% accuracy

• Port the system to Pocket PC

Page 58: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Q&A

Page 59: LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

~The End~