Post on 06-Apr-2018
8/2/2019 Eye Blink Full Doc
Eye Blinks
Abstract
This graduation project presents an application capable of replacing the
traditional mouse with the human face as a new way to interact with the
computer. Facial features (nose tip and eyes) are detected and tracked in
real time so that their actions can be used as mouse events. The coordinates
and movement of the nose tip in the live video feed are translated into the
coordinates and movement of the mouse pointer on the user's screen, and the
left/right eye blinks fire left/right mouse click events. The only
external device that the user needs is a webcam that feeds the program with
the video stream. In recent years, technology has advanced and become less
expensive. With the availability of high-speed processors and inexpensive
webcams, more and more people have become interested in real-time
applications that involve image processing. One of the promising fields in
artificial intelligence is HCI (Human-Computer Interaction),
which aims to use human features (e.g. face, hands) to interact with the
computer. One way to achieve that is to capture the desired feature with a
webcam and monitor its action in order to translate it to some events that
communicate with the computer.
In our work we tried to assist people with hand disabilities that prevent
them from using the mouse, by designing an application that uses facial
features (nose tip and eyes) to interact with the computer. The nose tip was
selected as the pointing device because of its location and shape: since it
lies in the middle of the face, it is comfortable to use as the feature that
moves the mouse pointer and defines its coordinates. Moreover, it lies on
the axis about which the face rotates, so it essentially does not
change its distinctive convex shape, which makes it easier to track as the
face moves. The eyes are used to simulate mouse clicks, so the user can fire
click events by blinking.
EXISTING SYSTEM
While different devices have been used in HCI (e.g. infrared cameras,
sensors, microphones), we use an off-the-shelf webcam that affords moderate
resolution and frame rate as the capturing device, in order to keep the
program affordable for all users.
PROPOSED SYSTEM
To present an algorithm that distinguishes true eye blinks from involuntary
ones, and that detects and tracks the desired facial features precisely and
fast enough to be applied in real time.
SYSTEM SPECIFICATION
Operating system: Windows XP SP2 or Windows Server 2003
Pentium 4 processor or better
1 GB of RAM (required)
JDK 1.5 or later
JMF 2.x or later
Webcam supporting 30 frames/sec
MODULES
Facial features (nose tip and eyes) are detected and tracked in real
time so that their actions can be used as mouse events.
Nose tip movements in the live video feed are translated into movements
of the mouse pointer (cursor).
Left/right eye blinks replace left/right mouse click events.
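The pointer-mapping module above can be illustrated with a minimal sketch. This is not the project's actual code (the application is a Java/JMF program); the frame and screen sizes, and the horizontal mirroring, are illustrative assumptions.

```python
def nose_to_pointer(nose_x, nose_y, frame_w, frame_h, screen_w, screen_h):
    # Map the tracked nose-tip position in a video frame to screen
    # coordinates. The x-axis is mirrored so that moving the head to the
    # user's left moves the pointer left (webcams show a mirror image).
    px = (frame_w - 1 - nose_x) * (screen_w - 1) // (frame_w - 1)
    py = nose_y * (screen_h - 1) // (frame_h - 1)
    return px, py
```

For example, a nose tip at the top-left of a 320x240 frame lands at the top-right of a 1024x768 screen because of the mirroring.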
Face Detection
In this module, we propose a real-time face detection algorithm using a Six-
Segmented Rectangular (SSR) filter, distance information, and a template
matching technique. Between-the-Eyes is selected as the face representative
in our detection because its characteristics are common to most people and
it is easily seen over a wide range of face orientations. First, a rectangle
divided into six segments is scanned across the face image, and the
bright-dark relations between its segments are tested to decide whether its
center can be a candidate for Between-the-Eyes. Next, the distance
information obtained from a stereo camera and template matching are applied
to detect the true Between-the-Eyes among the candidates. We implemented
this system on a PC with a Xeon 2.2 GHz CPU. The system runs in real time at
30 frames/sec with a detection rate of 92%.
The current evolution of computer technologies has enhanced various
applications in human-computer interfaces. Face and gesture recognition is
part of this field and can be applied in areas such as robotics, security
systems, driver monitoring, and video coding systems.
Since the human face is a dynamic object with a high degree of variability,
various detection techniques have been proposed. In his survey, Hjelmas [1]
classifies face detection techniques into two categories: the feature-based approach
and the image-based approach. Techniques in the first category make use of
apparent properties of the face such as face geometry, skin color, and
motion. Although feature-based techniques can achieve high speed in face
detection, they suffer from poor reliability under varying lighting
conditions. The image-based approach, in the second category, takes
advantage of current advances in pattern recognition theory. Most
image-based approaches apply a window scanning technique for detecting
faces [1], which requires heavy computation; the image-based approach alone
is therefore not suitable for real-time applications.
In order to achieve a fast and reliable face detection system, we propose a
method that combines the feature-based and image-based approaches to detect
the point between the eyes (hereafter called Between-the-Eyes) by using the
Six-Segmented Rectangular filter (SSR filter). The proposed SSR filter, a
rectangle divided into six segments, operates on the bright-dark relations
around the Between-the-Eyes area. We select Between-the-Eyes as the face
representative because it is common to most people and easy to find over a
wide range of face orientations [2]. Between-the-Eyes has dark parts (eyes
and eyebrows) on both sides, and comparably bright parts above (forehead)
and below (nose and cheekbone). This characteristic is stable for any facial
expression [2].
In this paper, we use an intermediate representation of the image called the
integral image, from the work of Viola and Jones [3], to calculate the sum
of pixel values in each segment of the SSR filter. First, the SSR filter is
scanned over the image and the average gray level of each segment is
computed from the integral image. Then the bright-dark relations between the
segments are tested to see whether the filter's center can be a candidate
point for Between-the-Eyes. Next, a stereo camera is used to find the
distance information and a suitable Between-the-Eyes template size. The
Between-the-Eyes candidates are then evaluated by template matching against
an average Between-the-Eyes template (obtained from 400 images of 40 people
in the ORL face database [4]). Finally, the true Between-the-Eyes is
detected.
The proposed technique uses only gray-level information, so it is more
robust to changes in lighting conditions. Moreover, the method is not
affected by beards, mustaches, hair, or nostril visibility, since only the
information around the eyes, eyebrows, and nose area is required. We
implemented this system on a PC with a Xeon 2.2 GHz CPU. The system runs at
30 frames/sec with a detection rate of 92%.
In Section 2 we describe the concept of the integral image, followed in
Section 3 by an explanation of using the SSR filter to extract
Between-the-Eyes candidates. In Section 4 we explain the candidate selection
method using the stereo camera and average Between-the-Eyes template
matching. Section 5 presents the whole real-time face detection system,
Section 6 shows the experimental results, and Section 7 concludes.
Integral Image
The SSR filter is computed using an intermediate representation of the image
called the integral image. For the original image i(x, y), the integral
image is defined as [3]

ii(x, y) = sum of i(x', y') over all x' <= x, y' <= y (1)

The integral image can be computed in one pass over the original image by
the following pair of recurrences:

s(x, y) = s(x, y - 1) + i(x, y) (2)
ii(x, y) = ii(x - 1, y) + s(x, y) (3)

where s(x, y) is the cumulative row sum, s(x, -1) = 0, and ii(-1, y) = 0.
Using the integral image, the sum of the pixels within rectangle D (denoted
sr) can be computed at high speed with four array references, as shown in
Fig. 1:
sr = (ii(x, y) + ii(x - W, y - L)) - (ii(x - W, y) + ii(x, y - L)) (4)
Figure 1. Integral Image
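To make expressions (2)-(4) concrete, here is a minimal Python sketch (the original system was not written this way; NumPy stands in for the one-pass recurrences) that builds an integral image and computes a rectangle sum with four array references:

```python
import numpy as np

def integral_image(img):
    # ii(x, y) = sum of i(x', y') for all x' <= x and y' <= y,
    # computed with cumulative sums (the effect of recurrences (2), (3)).
    return np.cumsum(np.cumsum(np.asarray(img, dtype=np.int64), axis=0), axis=1)

def rect_sum(ii, x, y, w, l):
    # Sum over the W x L rectangle whose bottom-right pixel is (x, y),
    # using four array references as in expression (4); out-of-range
    # references follow the convention ii(-1, y) = ii(x, -1) = 0.
    total = int(ii[y, x])
    if x - w >= 0:
        total -= int(ii[y, x - w])
    if y - l >= 0:
        total -= int(ii[y - l, x])
    if x - w >= 0 and y - l >= 0:
        total += int(ii[y - l, x - w])
    return total
```

With this representation, each of the six SSR segment sums costs only four lookups regardless of segment size, which is what makes real-time scanning feasible.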
SSR filter
1 SSR filter
At the beginning, a rectangle is scanned throughout the input image. This rectangle
is segmented into six segments as shown in Fig.2 (a).
Figure 2. SSR Filter
We denote the total sum of pixel values of each segment (B1-B6) as Sb1-Sb6.
The proposed SSR filter is used to detect Between-the-Eyes based on two
characteristics of face geometry.
(1) The nose area (Sn) is brighter than the right and left eye areas (Ser
and Sel, respectively), as shown in Fig. 2 (b), where
Sn = Sb2 + Sb5
Ser = Sb1 + Sb4
Sel = Sb3 + Sb6
Then,
Sn > Ser (5)
Sn > Sel (6)
(2) The eye area (both eyes and eyebrows) (Se) is relatively darker than the
cheekbone area (including the nose) (Sc), as shown in Fig. 2 (c), where
Se = Sb1 + Sb2 + Sb3
Sc = Sb4 + Sb5 + Sb6
Then,
Se < Sc (7)
When expressions (5), (6), and (7) are all satisfied, the center of the
rectangle can be a candidate for Between-the-Eyes.
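The candidate test can be written directly from the segment sums. In this sketch the segment layout is assumed to be B1 B2 B3 on the upper row and B4 B5 B6 on the lower row, matching the groupings in expressions (5)-(7):

```python
def is_between_eyes_candidate(sb):
    # sb[k] is the pixel sum of segment Bk, with B1 B2 B3 on the upper row
    # (eyes/eyebrows) and B4 B5 B6 on the lower row (cheekbones and nose).
    s_n = sb[2] + sb[5]            # nose column
    s_er = sb[1] + sb[4]           # right-eye column
    s_el = sb[3] + sb[6]           # left-eye column
    s_e = sb[1] + sb[2] + sb[3]    # eye/eyebrow row
    s_c = sb[4] + sb[5] + sb[6]    # cheekbone row (including nose)
    # Expressions (5), (6), (7): nose brighter than both eye columns,
    # and the eye row darker than the cheekbone row.
    return s_n > s_er and s_n > s_el and s_e < s_c
```

A bright nose column flanked by dark eye columns passes the test; a uniform region fails because the strict inequalities do not hold.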
Figure 3. Between-the-Eyes candidates from SSR filter
In Fig. 3 (b), the Between-the-Eyes candidate areas are displayed in white
and the non-candidate areas in black. By performing a labeling process on
Fig. 3 (b), the result of using the SSR filter to detect Between-the-Eyes
candidates is obtained, as shown in Fig. 3 (a).
2 Filter Size Estimation
In order to find the most suitable filter size, we used 400 facial images of
40 people (10 each) from the ORL face database [4]. The images were taken at
different times, under various lighting conditions, with different gestures,
and with and without eyeglasses. Each image is 92x112 pixels with 256 gray
levels.
We performed the filter size estimation manually for all 400 facial images
to find the standard filter size that covers the two eyes, two eyebrows, and
the cheekbone area (including the nose). The result is a rectangle of 60x30
pixels. In the experiment, we counted whether a true candidate was included
or was in the vicinity. Varying the standard 60x30 filter size in steps of
20%, the true candidate detection rate and the number of candidates for each
filter size are shown in Table 1. The standard 60x30 filter achieves a 92%
detection rate, which shows that this filter size functions effectively. On
the other hand, the detection rate drops to 52% with a filter of 84x42
pixels, because a large filter may include unnecessary parts of the face
such as hair or beard. Since the sum of pixel values is used in expressions
(5), (6), and (7), the filters of size 24x12
and 36x18 shown in Fig. 4 achieve an unexpectedly high detection rate of
Between-the-Eyes even though these filter sizes do not completely contain
both eye areas, because some parts of the eyes are still darker than the
nose area.
Fig. 5 shows examples of successful Between-the-Eyes detections, while some
failures are shown in Fig. 6. These detection errors may be caused by the
illumination. The detection failure of the middle image in Fig. 6 is mainly
caused by the reflection on the eyeglasses.
Figure 4. Various sizes of SSR filter
Figure 5. Examples of successful Between-the-Eyes detection
Figure 6. Examples of failures in Between-the-Eyes detection
Moreover, Fig. 7 shows an example of successful Between-the-Eyes detection
for an image in which horizontal illumination hits one side of the face. In
this case, the SSR filter still functions effectively even though one side
of the face is covered by shadow. Therefore the SSR filter can be used to
detect Between-the-Eyes under varying lighting conditions.
Table 1. Detection results of various SSR filter sizes (from 400 face images)
Figure 7. Example of successful Between-the-Eyes detection for a face with
illumination hitting one side
According to Table 1, rectangles of 0.6~1.2 times the standard size (60x30)
can be used to detect Between-the-Eyes candidates. Therefore faces from
0.83~1.67 times the standard image size (92x112 pixels) can be detected by
the proposed SSR filter.
Candidate Selection
1 Stereo camera
In a real situation, face size varies according to the distance from the
face to the cameras. We use two cameras to construct a binocular stereo
system and obtain distance information, so that a suitable Between-the-Eyes
template size can be estimated for the template matching technique discussed
later in Section 4.2. Since the stereo camera system is a standard
technique, its detailed explanation is omitted in this paper.
We performed experiments to find the suitable Between-the-Eyes template size
by using the disparity between the right and left images, based on the
principle of a binocular stereo camera system. First, we manually measured
the horizontal difference in pixels (disparity) between the Between-the-Eyes
positions in the face images obtained from the right and left cameras. Then
the width between the right and left temples was measured manually; this
width should correspond to the width of the Between-the-Eyes template.
The relation between disparities and suitable Between-the-Eyes template
sizes is shown in Fig. 8. Based on this relation, we can select an
appropriate template size according to the disparity measured in an actual
scene. This is why the proposed technique is applicable to faces at
distances between 0.5 and 3.5 m from the cameras.
From the experiments and the relation in Fig. 8, we derived the relations
between SSR filter size, disparity, and Between-the-Eyes template size shown
in Table 2. Only two filter sizes, 40x20 and 24x12, are used, since they are
flexible enough to detect faces within the pre-defined range. For example,
for a face with a disparity of 20, the 40x20 SSR filter is used and the
Between-the-Eyes template size is 48x24 pixels. The template is then scaled
to the average Between-the-Eyes template size for template matching. Faces
whose disparity falls outside the range in Table 2 are assumed to be
undetectable.
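The rescaling step can be sketched as follows. The paper does not state which interpolation was used, so the nearest-neighbour resize here is an assumption; it maps a disparity-dependent crop, e.g. 48x24, down to the 32x16 average-template size described in Section 4.2:

```python
import numpy as np

def scale_to_template(patch, out_h=16, out_w=32):
    # Nearest-neighbour rescale of a cropped Between-the-Eyes patch
    # (height x width) to the average-template size used for matching.
    in_h, in_w = patch.shape
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return patch[np.ix_(rows, cols)]
```

Scaling every candidate to one fixed template size is what lets a single average template serve faces at all distances in the 0.5-3.5 m range.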
Figure 8. The relation between the horizontal difference in pixels
(disparity) and the Between-the-Eyes template size
Table 2. Filter size, disparity, and related Between-the-Eyes template size
2 Average Between-the-Eyes Template Matching
Because the SSR filter extracts not only the true Between-the-Eyes but also
some false candidates, we use an average Between-the-Eyes template matching
technique to select among them. The average Between-the-Eyes pattern used in
this paper was obtained in the same manner as [2], from 400 face images of
40 people in the ORL face database [4].
Figure 9. Average Between-the-Eyes template and its variance pattern
Fig. 9 shows the average Between-the-Eyes template and its variance pattern,
of size 32x16. The gray levels of each sample were normalized to an average
of zero and a variance of one. We then calculated an average pattern and its
variance at each pixel. Next, the gray levels were converted to an average
of 128 with a standard deviation of 64, giving the average
pattern as an image. To obtain the variance pattern, each value was
multiplied by 255. Both the average and variance patterns are symmetric.
To avoid the influence of unbalanced illumination, we evaluate the right and
left halves of the face separately, because the lighting conditions
typically differ between them. We also avoid the effect of hair and beard,
and reduce the calculation load, by discarding the top three rows from the
calculation. In the end, a pattern of 16x13 pixels (per side) is used in
template matching.
Define the average Between-the-Eyes template and its variance for the left
side of the face as tl_ij, vl_ij (i = 0,...,15; j = 3,...,15) and for the
right side as tr_ij, vr_ij (i = 0,...,15; j = 3,...,15). tr_ij and tl_ij
have an average value of 128 with a standard deviation of 64, while vr_ij
and vl_ij have a maximum gray level of 255.
To evaluate the candidates, we define the Between-the-Eyes pattern as p_mn
(m = 0,...,31; n = 0,...,15). The right and left halves of p_mn are then
re-defined separately as pr_ij (i = 0,...,15; j = 3,...,15) and pl_ij
(i = 0,...,15; j = 3,...,15), respectively, each converted to an average
value of 128 and a standard deviation of 64.
The left mismatching value (Dl) and the right mismatching value (Dr) are
then calculated by comparing the normalized candidate patterns with the
corresponding templates and their variance patterns.
Only a candidate with both Dl and Dr below a pre-defined threshold (D) is
counted as a true candidate. If more than one candidate has both Dl
and Dr below the threshold, the candidate with the smallest mismatch value
is judged to be the true Between-the-Eyes candidate.
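The mismatch equation itself is not legible in this copy. Given that each half-pattern is normalized to mean 128 and standard deviation 64 and that a per-pixel variance pattern is available, one plausible form is a variance-weighted mean of squared differences; the sketch below is an assumption, not the paper's exact formula:

```python
import numpy as np

def normalize_patch(p):
    # Convert a half-pattern to mean 128 and standard deviation 64,
    # matching how the templates were prepared.
    p = np.asarray(p, dtype=np.float64)
    std = p.std()
    if std == 0:
        return np.full_like(p, 128.0)
    return (p - p.mean()) / std * 64.0 + 128.0

def mismatch(p, t, v, eps=1.0):
    # Assumed mismatch value D: a variance-weighted mean of squared
    # differences between the normalized candidate half-pattern p and the
    # average template half t; v is the variance pattern, eps avoids
    # division by zero where the template variance vanishes.
    p = normalize_patch(p)
    return float(np.mean((p - t) ** 2 / (np.asarray(v, dtype=np.float64) + eps)))
```

In use, a candidate would be accepted only if both its left and right mismatch values fall below the threshold D, and the candidate with the smallest value wins when several pass.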
3 Detection of Eye-Like Points
Since Between-the-Eyes lies at the midpoint of the left and right eye
alignment, we detect both eyes to confirm the location of the true
Between-the-Eyes. When the locations of both eyes are extracted from the
selected face area, Between-the-Eyes is re-registered as the midpoint
between them.
We search for the eye areas within the Between-the-Eyes template obtained in
Section 4.1. The eye detection is done in a simple way, using a technique
from [5]. To avoid the influence of illumination, we search for the right
eye and the left eye independently. First, the rectangular areas on both
sides of the Between-the-Eyes candidate, where the eyes should be found, are
extracted. For the selected 32x16 Between-the-Eyes area, we avoid the effect
of eyebrows, hair, and beard by ignoring 1 pixel at the border. The two eye
areas are then assumed to be 12x14 pixels on each side of the face
(neglecting three pixels in the middle of the Between-the-Eyes template as
the nose area).
Next, we find a threshold level for each area to binarize the image. The
threshold level is determined when the total number of pixels of all
components, excluding the border, exceeds a pre-defined value [6] (10 in
this paper). In some cases the eyebrows have almost the same gray level as
the eyes, so we select the component within a certain size range (5~25
pixels) with the lowest position.
To resolve the similarity in gray level between the eyes and eyebrows, a
search using the left-right eye alignment is performed. This process focuses
on the 3x3 pixels in the middle of each eye area. Then conditions on the
distance between the located eyes (De) and the angle (Ae) at the
Between-the-Eyes candidate are tested using the following expressions, both
obtained from experiments:
15 < De < 21 (10)
115 < Ae < 180 (11)
Only a candidate whose eye relations satisfy both conditions is
re-registered as the true Between-the-Eyes. Otherwise, the Between-the-Eyes
and eye areas cannot be determined.
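Conditions (10) and (11) can be checked directly from the two detected eye points and the candidate point. In this sketch the angle Ae is taken at the candidate, between the directions to the two eyes, which is one natural reading of the text:

```python
import math

def eyes_plausible(left_eye, right_eye, between):
    # Test conditions (10) and (11): the distance De between the located
    # eyes must lie in (15, 21) pixels, and the angle Ae at the
    # Between-the-Eyes candidate in (115, 180) degrees.
    (lx, ly), (rx, ry), (bx, by) = left_eye, right_eye, between
    de = math.hypot(rx - lx, ry - ly)
    # Angle at the candidate point, between the vectors to each eye.
    v1 = (lx - bx, ly - by)
    v2 = (rx - bx, ry - by)
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return False
    cos_ae = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    ae = math.degrees(math.acos(max(-1.0, min(1.0, cos_ae))))
    return 15 < de < 21 and 115 < ae < 180
```

A candidate sitting slightly above the line joining two eyes about 18 pixels apart passes; a candidate far off the eye axis, or eyes at an implausible distance, fails.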
Real-Time Face Detection System
The processing flow of the real-time face detection system is shown in
Fig. 10.
Figure 10. Processing Flow of Real-Time Face Detection
Experiment
We implemented the system on a PC with a Xeon 2.2 GHz CPU. In the
experiment, two commercial NTSC video cameras, a multi-video composer, and a
video capture board are used, without any special hardware. The two NTSC
cameras form a binocular stereo system. The multi-video composer combines
four NTSC video
signals into one NTSC signal; we use only two of them in our experiment.
Each video image becomes one half of its original size, so the captured
image size for each camera is 320x240. However, to avoid the interlaced
scanning problem for moving objects, we use only the even lines;
consequently, the image size is 320x120 for each camera. The resulting
horizontal image resolution is double the vertical one, as shown in the
bottom two images in Fig. 11. We keep this non-uniform resolution to obtain
as accurate a disparity as possible.
On the other hand, a regular image is needed for the Between-the-Eyes
template matching, so we reconstruct a smaller image by sub-sampling, as
shown in the uppermost-left image of Fig. 11.
Fig. 11 shows the face detection result from an experiment performed in the
laboratory with an unspecified background. The uppermost-left image is a
monochrome image from the right camera, using only the green component;
Between-the-Eyes detection is applied to this 160x120 monochrome image. The
lower image is obtained from the right camera, and the lowest image from the
left camera.
Figure 11. Face Detection Result
The detection result from the SSR filter is shown in the uppermost-right
image. In its upper corner is the Between-the-Eyes candidate area after
cutting and scaling to match the average template, with its binarized image
of detected eyes and eyebrows displayed below. However, since the SSR filter
uses no information about the inclination of the face, this technique cannot
detect faces inclined by more than 10 degrees. In cases of large reflection
on eyeglasses, the proposed technique also occasionally fails to detect the
true Between-the-Eyes. In the real implementation, the system operates at 30
frames/sec, which achieves real-time processing speed.
The proposed real-time face detection system consists of three major
components: the SSR filter, the stereo camera system, and the average
Between-the-Eyes template matching unit. First, the SSR filter tests the
bright-dark relations of the average gray levels of its segments to decide
whether its center can be a Between-the-Eyes candidate. We used the integral
image proposed by Viola [3] in the SSR filter calculation in order to scan
the filter over the image in real time. Since only gray-level information is
used, the proposed technique is more robust to changes in lighting
conditions than skin-color extraction methods. Next, the stereo camera
system provides distance information so that a suitable Between-the-Eyes
template size can be estimated; this reduces the calculation load and allows
faces of different sizes to be detected. We then perform the average
Between-the-Eyes template matching to select the true candidate, followed by
detection of both eye areas to verify the result. We implemented the system
on a PC with a Xeon 2.2 GHz CPU; it ran at 30 frames/sec, satisfying
real-time processing speed. However, the proposed technique is still limited
in face orientation; further development to solve this problem should be
performed.
Hough Transform
Common Names: Hough transform
Brief Description
The Hough transform is a technique that can be used to isolate features of a
particular shape within an image. Because it requires that the desired
features be specified in some parametric form, the classical Hough transform
is most commonly used for the detection of regular curves such as lines,
circles, and ellipses. A generalized Hough transform can be employed in
applications where a simple analytic description of the feature(s) is not
possible. Due to the computational complexity of the generalized Hough
algorithm, we restrict the main focus of this discussion to the classical
Hough transform. Despite its domain restrictions, the classical Hough
transform (hereafter referred to without the "classical" prefix) retains
many applications, as most manufactured parts (and many anatomical parts
investigated in medical imagery) contain feature boundaries that can be
described by regular curves. The main advantage of the Hough transform
technique is that it is tolerant of gaps in feature boundary descriptions
and is relatively unaffected by image noise.
How It Works
The Hough technique is particularly useful for computing a global
description of a feature(s) (where the number of solution classes need not
be known a priori) from (possibly noisy) local measurements. The motivating
idea behind the Hough technique for line detection is that each input
measurement (e.g. a coordinate point) indicates its contribution to a
globally consistent solution (e.g. the physical line that gave rise to that
image point).
As a simple example, consider the common problem of fitting a set of line
segments to a set of discrete image points (e.g. pixel locations output by
an edge detector). Figure 1 shows some possible solutions to this problem.
Here the lack of a priori knowledge about the number of desired line
segments (and the ambiguity about what constitutes a line segment) renders
the problem under-constrained.
Figure 1. a) Coordinate points. b) and c) Possible straight-line fittings.
We can analytically describe a line segment in a number of forms. However, a
convenient equation for describing a set of lines uses the parametric or
normal notation:

x cos(theta) + y sin(theta) = rho
where rho is the length of a normal from the origin to the line and theta is
the orientation of this normal with respect to the X-axis. (See Figure 2.)
For any point (x, y) on this line, rho and theta are constant.
Figure 2 Parametric description of a straight line.
In an image analysis context, the coordinates of the edge-segment points
(x_i, y_i) in the image are known and therefore serve as constants in the
parametric line equation, while rho and theta are the unknown variables we
seek. If we plot the possible (rho, theta) values defined by each
(x_i, y_i), points in cartesian image space map to curves (i.e. sinusoids)
in the polar Hough parameter space. This point-to-curve transformation is
the Hough transformation for straight lines. When viewed in Hough parameter
space, points that are collinear in the cartesian image space
become readily apparent, as they yield curves that intersect at a common
(rho, theta) point.
The transform is implemented by quantizing the Hough parameter space into
finite intervals, or accumulator cells. As the algorithm runs, each
(x_i, y_i) is transformed into a discretized (rho, theta) curve and the
accumulator cells that lie along this curve are incremented. Resulting peaks
in the accumulator array represent strong evidence that a corresponding
straight line exists in the image.
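The quantized accumulator can be sketched in a few lines of Python. The resolutions and ranges here are illustrative parameters, not values from the text:

```python
import math
import numpy as np

def hough_lines(points, theta_steps=180, rho_res=1.0, rho_max=200.0):
    # Accumulator over the normal form x*cos(theta) + y*sin(theta) = rho.
    # Each point votes along its sinusoid; collinear points pile their
    # votes into a common (rho, theta) cell.
    thetas = np.linspace(0.0, math.pi, theta_steps, endpoint=False)
    n_rho = int(2 * rho_max / rho_res) + 1
    acc = np.zeros((n_rho, theta_steps), dtype=np.int32)
    cols = np.arange(theta_steps)
    for x, y in points:
        rhos = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round((rhos + rho_max) / rho_res).astype(int)
        ok = (idx >= 0) & (idx < n_rho)
        acc[idx[ok], cols[ok]] += 1
    return acc, thetas
```

For instance, ten points on the line y = x all vote into the cell at theta = 135 degrees, rho = 0, producing a peak of height ten there.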
We can use this same procedure to detect other features with analytical
descriptions. For instance, in the case of circles, the parametric equation
is

(x - a)^2 + (y - b)^2 = r^2

where a and b are the coordinates of the center of the circle and r is the
radius. In this case, the computational complexity of the algorithm begins
to increase, as we now have three coordinates in the parameter space and a
3-D accumulator. (In general, the computation and the size of the
accumulator array increase polynomially with the number of parameters. Thus,
the basic Hough technique described here is only practical for simple
curves.)
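A brute-force sketch of the circle case shows concretely why the 3-D accumulator is costly; the centre grid and radius range are illustrative:

```python
import math
import numpy as np

def hough_circles(points, r_min, r_max, size):
    # 3-D accumulator over (a, b, r) for (x - a)^2 + (y - b)^2 = r^2.
    # Every edge point votes for every centre/radius combination it is
    # consistent with, so both time and memory grow with each parameter.
    acc = np.zeros((size, size, r_max - r_min + 1), dtype=np.int32)
    for x, y in points:
        for a in range(size):
            for b in range(size):
                r = round(math.hypot(x - a, y - b))
                if r_min <= r <= r_max:
                    acc[a, b, r - r_min] += 1
    return acc
```

Six points lying on a circle of radius 5 centred at (10, 10) all vote into the cell (10, 10, r = 5), which becomes the global peak.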
Guidelines for Use
The Hough transform can be used to identify the parameter(s) of the curve
that best fits a set of given edge points. This edge description is commonly
obtained from a feature-detecting operator such as the Roberts Cross, Sobel,
or Canny edge detector, and may be noisy, i.e. it may contain multiple edge
fragments corresponding to a single whole feature. Furthermore, as the
output of an edge detector defines only
where features are in an image, the work of the Hough transform is to
determine both what the features are (i.e. to detect the feature(s) for
which it has a parametric (or other) description) and how many of them exist
in the image.
In order to illustrate the Hough transform in detail, we begin with a simple
image of two occluding rectangles.
The Canny edge detector can produce a set of boundary descriptions for this
part. Here we see the overall boundaries in the image, but this result tells
us nothing about the identity (and quantity) of the feature(s) within this
boundary description. In this case, we can use the Hough (line-detecting)
transform to detect the eight separate straight-line segments of this image
and thereby identify the true geometric structure of the subject.
If we use these edge/boundary points as input to the Hough transform, a
curve is generated in polar (rho, theta) space for each edge point in
cartesian space. The accumulator array can then be viewed as an intensity
image.
Histogram equalizing this image allows us to see the patterns of information
contained in the low-intensity pixel values.
Note that, although rho and theta are notionally polar coordinates, the
accumulator space is plotted rectangularly, with theta as the abscissa and
rho as the ordinate. Note also that the accumulator space wraps around at
the vertical edge of the image, so that, in fact, there are only 8 real
peaks.
Curves generated by collinear points in the gradient image intersect in
peaks in the Hough transform space. These intersection points characterize
the straight-line segments of the original image. There are a number of
methods one might employ to extract these bright points, or local maxima,
from the accumulator array. For example, a simple method involves
thresholding and then applying some thinning to the isolated clusters of
bright spots in the accumulator array image. Here we use a relative
threshold to extract the unique (rho, theta) points corresponding to each of
the straight-line edges in the original image. (In other words, we take only
those local maxima in the accumulator array whose values are equal to or
greater than some fixed percentage of the global maximum value.)
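The relative-threshold rule can be sketched as follows. The 8-neighbourhood local-maximum test here is a simple stand-in assumption, not the page's exact thinning method:

```python
import numpy as np

def relative_threshold_peaks(acc, fraction=0.4):
    # Keep only local maxima whose votes reach a fixed fraction of the
    # global maximum: the "relative threshold" extraction of bright
    # points from the accumulator array.
    acc = np.asarray(acc)
    cutoff = fraction * acc.max()
    peaks = []
    h, w = acc.shape
    for r in range(h):
        for c in range(w):
            v = acc[r, c]
            if v < cutoff or v == 0:
                continue
            # 8-neighbourhood local-maximum test.
            win = acc[max(0, r - 1):r + 2, max(0, c - 1):c + 2]
            if v >= win.max():
                peaks.append((r, c))
    return peaks
```

Raising the fraction keeps only the strongest lines; lowering it admits weaker (and possibly spurious) peaks, which is exactly the trade-off discussed in the fog example below.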
Mapping back from Hough transform space (i.e. de-Houghing) into cartesian
space yields a set of line descriptions of the image subject. By overlaying
this image on an inverted version of the original, we can confirm that the
Hough transform found the 8 true sides of the two rectangles and thus
revealed the underlying geometry of the occluded scene.
Note that the accuracy of alignment of detected and original image lines, which is
obviously not perfect in this simple example, is determined by the quantization of
the accumulator array. (Also note that many of the image edges have several
detected lines. This arises from having several nearby Hough-space peaks with
similar line parameter values. Techniques exist for controlling this effect, but were
not used here to illustrate the output of the standard Hough transform.)
Note also that the lines generated by the Hough transform are infinite in length. If
we wish to identify the actual line segments which generated the transform
parameters, further image analysis is required in order to see which portions of these
infinitely long lines actually have points on them.
To illustrate the Hough technique's robustness to noise, the Canny edge
description was corrupted by 1% salt-and-pepper noise before Hough
transforming it.
De-Houghing this result (and overlaying it on the original) again yields the
expected line descriptions. (As in the above case, the relative threshold is
40%.)
The sensitivity of the Hough transform to gaps in the feature boundary can
be investigated by transforming an image whose boundaries have been edited
using a paint program, and then de-Houghing it (using a relative threshold
of 40%).
In this case, because the accumulator space did not receive as many entries as in
previous examples, only 7 peaks were found, but these are all structurally relevant
lines.
We will now show some examples with natural imagery. In the first case, we have a city scene where the buildings are partially obscured by fog.
If we want to find the true edges of the buildings, an edge detector (e.g. Canny) cannot recover this information very well.
However, the Hough transform can detect some of the straight lines representing building edges within the obscured region, as the histogram-equalized accumulator space representation of the original image shows.
If we set the relative threshold to 70%, only a few of the long edges are detected in the de-Houghed image, and there is a lot of duplication where many lines or edge fragments are nearly collinear. Applying a more generous relative threshold, i.e. 50%, yields more of the expected lines, but at the expense of many spurious lines arising from the many collinear edge fragments.
Our final example comes from a remote sensing application. Here we would like to detect the streets in an image of a reasonably rectangular city sector. We can edge detect the image using the Canny edge detector.
However, street information is not available as output of the edge detector alone. The de-Houghed image shows that the Hough line detector is able to recover some of this information. Because the contrast in the original image is poor, only a limited set of features (i.e. streets) is identified.
Common Variants
Generalized Hough Transform
The generalized Hough transform is used when the shape of the feature that we wish to isolate does not have a simple analytic equation describing its boundary. In this case, instead of using a parametric equation of the curve, we use a look-up table to define the relationship between the boundary positions and orientations and the Hough parameters. (The look-up table values must be computed during a preliminary phase using a prototype shape.)
For example, suppose that we know the shape and orientation of the desired feature. (See Figure 3.) We can specify an arbitrary reference point within the feature, with respect to which the shape (i.e. the distance and angle of normal lines drawn from the boundary to this reference point) of the feature is defined. Our look-up table (i.e. R-table) will consist of these distance and direction pairs, indexed by the orientation of the boundary.
Figure 3 Description of R-table components.
The Hough transform space is now defined in terms of the possible positions of the shape in the image, i.e. the possible ranges of the reference point (x_c, y_c). In other words, the transformation is defined by:

x_c = x + r.cos α
y_c = y + r.sin α

(The r and α values are derived from the R-table for particular known boundary orientations.) If the orientation of the desired feature is unknown, this procedure is complicated by the fact that we must extend the accumulator by incorporating an extra parameter to account for changes in orientation.
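Under the assumptions above, the R-table construction and the voting step can be sketched roughly as follows. This is a minimal illustration, not the document's own code; the orientation quantization, the (dx, dy) offset representation and the dense accumulator layout are all assumptions made for the example:

```python
import numpy as np
from collections import defaultdict

def build_r_table(boundary, reference):
    """R-table: for each boundary point, store the offset from the point
    to the reference, indexed by (quantized) boundary orientation phi."""
    table = defaultdict(list)
    for (x, y, phi) in boundary:            # phi: boundary orientation, radians
        key = round(phi, 2)                 # crude orientation quantization
        table[key].append((reference[0] - x, reference[1] - y))
    return table

def vote(edge_points, table, shape):
    """Each edge point votes for every reference position consistent with
    its orientation; peaks in the accumulator mark likely shape positions."""
    acc = np.zeros(shape, dtype=int)
    for (x, y, phi) in edge_points:
        for dx, dy in table.get(round(phi, 2), []):
            xc, yc = x + dx, y + dy
            if 0 <= xc < shape[0] and 0 <= yc < shape[1]:
                acc[xc, yc] += 1
    return acc
```

If the same shape appears translated in the edge data, every edge point votes for the shifted reference point, which therefore accumulates one vote per boundary point.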
Exercises
1. Find the Hough line transform of the objects shown in Figure 4.
Figure 4 Features to input to the Hough transform line detector.
2. Starting from the basic image, create a series of images with which you can investigate the ability of the Hough line detector to extract occluded features. For example, begin using translation and image addition to create an image containing the original image overlapped by a translated copy of that image. Next, use edge detection to obtain a boundary description of your subject. Finally, apply the Hough algorithm to recover the geometries of the occluded features.
3. Investigate the robustness of the Hough algorithm to image noise. Starting
from an edge detected version of the basic image
try the following: a) Generate a series of boundary descriptions of the image using different levels of Gaussian noise. How noisy (i.e. broken) does the edge description have to be before Hough is unable to detect the original geometric structure of the scene? b) Corrode the boundary descriptions with different levels of salt and pepper noise. At what point does the combination of broken edges and added intensity spikes render the Hough line detector useless?
4. Try the Hough transform line detector on the example images. Experiment with the Hough circle detector on the example images.
5. One way of reducing the computation required to perform the Hough
transform is to make use of gradient information which is often available as
output from an edge detector. In the case of the Hough circle detector, the
edge gradient tells us in which direction a circle must lie from a given edge
coordinate point. (See Figure 5.)
Figure 5 Hough circle detection with gradient information.
a) Describe how you would modify the 3-D circle detector accumulator array in order to take this information into account. b) To this algorithm we may want to add gradient magnitude information. Suggest how to introduce weighted incrementing of the accumulator.
6. The Hough transform can be seen as an efficient implementation of a generalized matched filter strategy. In other words, if we created a template composed of a circle of 1's (at a fixed radius R) and 0's everywhere else in the image, then we could convolve it with the gradient image to yield an accumulator array-like description of all the circles of radius R in the image. Show formally that the basic Hough transform (i.e. the algorithm with no use of gradient direction information) is equivalent to template matching.
7. Explain how to use the generalized Hough transform to detect octagons.
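As a hint for exercise 6, the equivalence can be sketched by writing the circle accumulator as a correlation. For an edge map E and a fixed radius R, the vote count at a candidate centre (a, b) is

\[
A(a,b) \;=\; \sum_{u,v} E(a+u,\,b+v)\,T(u,v),
\qquad
T(u,v) \;=\; \begin{cases} 1 & u^2+v^2 = R^2 \\ 0 & \text{otherwise,} \end{cases}
\]

which is exactly the correlation of the edge image with a ring template of radius R, i.e. matched filtering with that template.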
Hough transform
The Hough transform is a feature extraction technique used in digital image processing. The classical transform identifies lines in the image, but it has been
extended to identifying positions of arbitrary shapes. The transform universally used
today was invented by Richard Duda and Peter Hart in 1972, who called it a
"generalized Hough transform" after the related patent of Paul Hough. The
transform was popularized in the computer vision community by Dana H. Ballard
through a 1981 journal article titled "Generalizing the Hough transform to detect
arbitrary shapes".
Theory
To extract features from digital images, it is useful to be able to find simple shapes - straight lines, circles, ellipses and the like - in images. In order to achieve this goal, one must be able to detect a group of pixels that lie on a straight line or a smooth curve. That is what the Hough transform is supposed to do.
The simplest case of the Hough transform is the linear Hough transform. To illustrate the idea, let's start with a straight line. In the image space, the straight line can be described as y = mx + b and is plotted for each pair of values (x, y). However, the characteristic of that straight line is not x or y, but its slope m and intercept b. Based on that fact, the straight line y = mx + b can be represented as a point (b, m) in the parameter space (the b vs. m graph).
Using the slope-intercept parameters can make the application complicated, since both parameters are unbounded: as lines become more and more vertical, the magnitudes of m and b grow towards infinity. For computational purposes, therefore, it is better to parameterize the lines in the Hough transform with two other parameters, commonly called r and θ (theta). The parameter r represents the smallest distance between the line and the origin, while θ is the angle of the vector from the origin to this closest point. Using this parametrization, the equation of the line can be written as:

r = x.cos θ + y.sin θ
It is therefore possible to associate with each line of the image a pair (r, θ), which is unique if θ ∈ [0, π) and r ∈ R, or if θ ∈ [0, 2π) and r ≥ 0. The (r, θ) plane is sometimes referred to as Hough space. This representation makes the Hough transform conceptually very close to the so-called Radon transform.
It is well known that an infinite number of lines can go through a single point of the plane. If that point has coordinates (x0, y0) in the image plane, all the lines that go through it obey the following equation:

r(θ) = x0.cos θ + y0.sin θ

This corresponds to a sinusoidal curve in the (r, θ) plane, which is unique to that point. If the curves corresponding to two points are superimposed, the location (in the Hough space) where they cross corresponds to lines (in the original image space) that pass through both points. More generally, a set of points that form a straight line will produce sinusoids which cross at the parameters for that line. Thus, the problem of detecting collinear points can be converted to the problem of finding concurrent curves.
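The claim that collinear points trace sinusoids through a common (r, θ) can be checked numerically. A small illustrative sketch, using three points on the line y = 2 (for which r = 2 at θ = π/2):

```python
import numpy as np

def hough_curve(x0, y0, thetas):
    """The sinusoid r(theta) = x0*cos(theta) + y0*sin(theta) traced in
    Hough space by the single image point (x0, y0)."""
    return x0 * np.cos(thetas) + y0 * np.sin(thetas)

thetas = np.linspace(0.0, np.pi, 181)
# three collinear points on the line y = 2
curves = [hough_curve(x, 2.0, thetas) for x in (0.0, 1.0, 3.0)]
i = int(np.argmin(np.abs(thetas - np.pi / 2)))  # index of theta = pi/2
```

At θ = π/2 all three sinusoids pass through r = 2, the parameters of the common line.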
Implementation
The Hough transform algorithm uses an array, called the accumulator, to detect the existence of a line y = mx + b. The dimension of the accumulator is equal to the number of unknown parameters of the Hough transform problem. For example, the linear Hough transform problem has two unknown parameters: m and b. The two dimensions of the accumulator array would correspond to quantized values for m and b. For each pixel and its neighborhood, the Hough transform algorithm determines if there is enough evidence of an edge at that pixel. If so, it will calculate the parameters of that line, then look for the accumulator bin that the parameters fall into, and increment the value of that bin. By finding the bins with the highest values, the most likely lines can be extracted, and their (approximate) geometric definitions read off. The simplest way of finding these peaks is by applying some form of threshold, but different techniques may yield better results in different circumstances - determining which lines are found as well as how many. Since the lines returned do not contain any length information, it is often necessary to find which parts of the image match up with which lines.
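The accumulator procedure described above can be sketched in a few lines. This is an illustrative NumPy version using the (r, θ) parametrization, not a production implementation:

```python
import numpy as np

def hough_lines(edge_img, n_theta=180):
    """Standard Hough line transform: every edge pixel votes for all
    quantized (r, theta) pairs satisfying r = x*cos(theta) + y*sin(theta)."""
    h, w = edge_img.shape
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    r_max = int(np.ceil(np.hypot(h, w)))
    rs = np.arange(-r_max, r_max + 1)          # quantized r axis
    acc = np.zeros((rs.size, n_theta), dtype=int)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    ys, xs = np.nonzero(edge_img)              # edge pixel coordinates
    for x, y in zip(xs, ys):
        # one vote per quantized theta, at the corresponding rounded r
        r = np.round(x * cos_t + y * sin_t).astype(int)
        acc[r + r_max, np.arange(n_theta)] += 1
    return acc, rs, thetas
```

For a small image containing a single horizontal line of edge pixels, the bin at (r, θ) matching that line collects one vote per pixel.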
Example
Consider three data points, shown here as black dots.
For each data point, a number of lines are plotted
going through it, all at different angles. These are
shown here as solid lines.
For each solid line a line is plotted which is
perpendicular to it and which intersects the origin.
These are shown as dashed lines.
The length and angle of each dashed line is measured.
In the diagram above, the results are shown in tables.
This is repeated for each data point.
A graph of length against angle, known as a Hough
space graph, is then created.
The point where the lines intersect gives a distance and angle. This distance and angle indicate the line which passes through the points being tested. In the graph shown the lines intersect at the purple point; this corresponds to the solid purple line in the diagrams above, which passes through the three points.
The following is a different example showing the results of a Hough transform on a
raster image containing two thick lines.
The results of this transform were stored in a matrix. Each cell value represents the number of curves through the corresponding point. Higher cell values are rendered brighter. The two distinctly bright spots are the intersections of the curves of the two lines. From these spots' positions, the angle and distance from the image center of the two lines in the input image can be determined.
Variations and extensions
Using the gradient direction to reduce the number of votes
An improvement suggested by O'Gorman and Clowes can be used to detect lines if one takes into account that the local gradient of the image intensity will necessarily be orthogonal to the edge. Since edge detection generally involves computing the intensity gradient magnitude, the gradient direction is often found as a side effect. If a given point of coordinates (x, y) happens to indeed be on a line, then the local
direction of the gradient gives the θ parameter corresponding to said line, and the r parameter is then immediately obtained. In fact, the real gradient direction is only estimated with a given amount of accuracy (approximately ±20°), which means that the sinusoid must be traced within ±20° of the estimated angle. This however reduces the computation time and has the interesting effect of reducing the number of useless votes, thus enhancing the visibility of the spikes corresponding to real lines in the image.
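A rough sketch of this gradient-restricted voting follows. It is illustrative only; the ±20° window and the sparse dictionary accumulator are assumptions made for the example:

```python
import numpy as np

def hough_with_gradient(points, grad_dirs, n_theta=180, window_deg=20):
    """Each edge point votes only for theta values within +/- window_deg
    of its gradient direction (the normal of the candidate line), instead
    of voting over the whole theta range."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    half = np.deg2rad(window_deg)
    votes = {}                      # sparse accumulator: (r, theta_index) -> count
    for (x, y), g in zip(points, grad_dirs):
        for ti in np.nonzero(np.abs(thetas - g) <= half)[0]:
            r = int(round(x * np.cos(thetas[ti]) + y * np.sin(thetas[ti])))
            votes[(r, ti)] = votes.get((r, ti), 0) + 1
    return votes, thetas
```

Each point now casts roughly 40 votes instead of 180, yet the bin of the true line still receives one vote per point.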
Hough transform of curves, and Generalised Hough transform
Although the version of the transform described above applies only to finding
straight lines, a similar transform can be used for finding any shape which can be
represented by a set of parameters. A circle, for instance, can be transformed into a
set of three parameters, representing its center and radius, so that the Hough space
becomes three dimensional. Arbitrary ellipses and curves can also be found this
way, as can any shape easily expressed as a set of parameters. For more complicated
shapes, the Generalised Hough transform is used, which allows a feature to vote for
a particular position, orientation and/or scaling of the shape using a predefined look-
up table.
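The three-parameter circle case mentioned above can be sketched with a 3-D accumulator over centre (a, b) and radius R. This illustrative version votes by rounding the distance from each candidate centre to each edge point, which is a simple (if brute-force) way to realize the same accumulation:

```python
import numpy as np

def hough_circles(edge_points, shape, radii):
    """3-D accumulator over circle centre (a, b) and radius R: each edge
    point adds a vote to every centre whose rounded distance equals R."""
    acc = np.zeros((shape[0], shape[1], len(radii)), dtype=int)
    aa, bb = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]),
                         indexing="ij")
    for x, y in edge_points:
        # rounded distance from every candidate centre to this edge point
        d = np.round(np.hypot(aa - x, bb - y)).astype(int)
        for ri, R in enumerate(radii):
            acc[:, :, ri] += (d == R)
    return acc
```

For edge points sampled from a circle, the accumulator peaks near the true centre in the slice of the true radius.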
Using weighted features
One common variation is to search the accumulator in stages: finding the bins with the highest count in one stage can be used to constrain the range of values searched in the next.
Limitations
The Hough Transform is only efficient if a high number of votes fall in the right bin,
so that the bin can be easily detected amid the background noise. This means that
the bin must not be too small, or else some votes will fall in the neighboring bins,
thus reducing the visibility of the main bin.
Also, when the number of parameters is large (that is, when we are using the Generalised Hough Transform with typically more than three parameters), the average number of votes cast in a single bin is very low, and the bins corresponding to a real figure in the image do not necessarily appear to have a much higher number of votes than their neighbors. Thus, the Generalised Hough Transform must be used with great care to detect anything other than lines or circles.
Finally, much of the efficiency of the Hough Transform is dependent on the quality
of the input data: the edges must be detected well for the Hough Transform to be
efficient. Use of the Hough Transform on noisy images is a very delicate matter and
generally, a denoising stage must be used before. In the case where the image is
corrupted by speckle, as is the case in radar images, the Radon transform is
sometimes preferred to detect lines, since it has the nice effect of attenuating the
noise through summation.
What is eye tracking? (References)
Is there an easier way for the disabled to communicate? How does a 6-month-old
baby perceive the world? Where is the most effective ad space on a website?
Eye tracking can be used to find answers to questions like these, as well as many others, by measuring a person's point of gaze (i.e. where they are looking) and determining eye/head position.
The origins of eye tracking are over a century old, but in the last 5 years large
technological advances have opened up new possibilities. Modern day eye tracking
can be used not only in a laboratory, but in homes, schools, and businesses where it
aids in research and analysis and is used for interacting with computers as well as
with friends and family.
Simple Idea, Complex Math
Eye tracking works by reflecting invisible infrared light onto an eye, recording the reflection pattern with a sensor system, and then calculating the exact point of gaze using a geometrical model. Once the point of gaze is determined, it can be visualized and shown on a computer monitor. The point of gaze can also be used to control and interface with different machines. This technique is referred to as eye control.
Improving the experience
The main challenges of eye tracking are not only in developing the right algorithms
and sensor solutions, which are a prerequisite for a high level of accuracy, but also
in the way users interact with a specific eye tracking device. Eye trackers should be
able to perform with all types of eyes and account for such things as glasses, contact
lenses, head movement and light conditions. Users should also be able to save
personal settings and even look away from the eye tracker without needing to
recalibrate.
Until recently, different types of eyes required different methods of eye tracking.
Dark pupil tracking worked better for people with dark eyes and bright pupil
tracking worked better for children and people with blue eyes. Recently, both of
these techniques have been combined to eliminate the need for two separate eye
trackers.
Another important aspect in eye tracking is the "track box". This is the imaginary box in which a user can move his/her head and still be tracked by the device. With a larger track box, the user will have more freedom of movement and experience greater comfort.
Multiple Applications
With the right idea there is no limit to the applications of eye tracking. Currently, some of the major uses for analysis are academic research (e.g. cognitive science, psychology and medical research) and market research and usability studies, such as evaluations of advertising or package design and of software or web usability.
Eye tracking techniques can also be used for interaction - people can control a computer and make things happen just by looking at it. Eye control can be used as the sole interaction technique or combined with keyboard, mouse, physical buttons and voice.
Eye control is used in communication devices for disabled persons and in various
industrial and medical applications.
Future Value
The crude, complex, and highly intrusive eye tracking techniques of the past have
been replaced by refined and user-friendly methods that are producing valuable
results today and paving the way for the future. Eye tracking and eye control have a
limitless future. Areas like personal computing, the automotive industry, medical
research, and education will soon be utilizing eye tracking in ways never thought
possible.
Eye Tracking technology
Tobii's eye tracking technology utilizes advanced image processing of a person's face, eyes and reflections in the eyes of near-infrared reference lights to accurately estimate:
the 3D position in space of each eye
the precise target toward which each eye's gaze is directed
Key advantages
Tobii has taken eye tracking technology a significant step forward through a number of key innovations that enable large market applications. Key advantages of Tobii's eye tracking technology are:
Fully automatic eye tracking
High tracking accuracy
Ability to track nearly all people
Completely non-intrusive
Good tolerance of head-motion
Patented techniques
Compared to other technologies, a number of innovations have been made to overcome traditional problems associated with eye tracking, such as cumbersome equipment, poor tracking precision and limited tolerance to head motion. Some of the key aspects of Tobii's technology include:
Patented techniques to use fixed wide field of view optics in combination
with high resolution sensors
Patented techniques for accurate estimation of the 3D position in space for
both eyes
Sophisticated image processing and patented control logic to allow for 100% automatic tracking and high tracking ability; tracks almost everyone, even those with glasses
Advanced algorithms to compensate for head motion without loss in accuracy
Unique techniques to enable long-lasting calibrations
Application technology
Tobii conducts research and development into eye tracking applications. We have
developed an extensive toolbox of software that allows us to rapidly create eye
control applications and eye gaze analysis applications.
Tobii's eye-based interaction technology includes the Tobii eye control engine, a powerful ActiveX-based API for the rapid creation of eye control applications in the Windows environment. This allows our customers and partners to quickly develop and customize applications to utilize eye gaze as a modality in computer interfaces. This is not yet on the market, but is available to key partners on a project basis.