
7/23/2019 4thYear Project Thesis Final Version_images



1.2 Objective

The main aim of this project is to design and build a human-computer
interface that interprets palm movement by applying the principles of
static hand gesture interpretation.

Gesture was the first mode of communication for primitive cave dwellers.
Human civilization later developed verbal communication to a high degree,
yet non-verbal communication has not lost its importance. It is used not
only by physically challenged people but also in diverse application areas
such as aviation, surveying, and music direction. It lets a user interact
with a computer without peripheral devices such as a keyboard, mouse, or
remote control. Researchers around the world are actively developing robust
and efficient gesture recognition systems, especially hand gesture
recognition systems, for various applications. Hand gesture recognition
provides a natural way to interact and communicate with machines of
different kinds.

Figure 1: The schematic view of gesture identification (blocks: Extraction
Method; Motion Direction Detection; Feature Extraction & Utilization)




1.3 Overview of Project

Figure 2: Input given to the system as motion from left to

right and vice versa.

Once we have obtained the binary images of hand palm gestures, the

next step is the extraction of the contour of the object in the image.

First we need to understand what contours are and how they differ

from edges of an object. Edges are computed as points that are extrema of

the image gradient in the direction of the gradient. We can think of them as

the min and max points in a 1D function. The point is, edge pixels are a local

notion. They just point out a significant difference between neighboring




In computer science, images are represented in the form of 2-D matrices

of pixels. A pixel may be defined as a minute area of illumination on a

display screen, one of many from which an image is composed. It is the

smallest controllable element of a picture represented on the screen. The

address of a pixel corresponds to its physical coordinates.
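As a minimal illustration of the matrix view described above, the following Python sketch stores a small grayscale image as a list of rows and reads one pixel by its coordinates. The sample intensities and the helper name get_pixel are illustrative assumptions, not part of the project code.

```python
# A tiny 4x3 grayscale image stored as a 2-D matrix (a list of rows).
# The intensity values (0-255) are made up for illustration.
image = [
    [0, 50, 100, 150],
    [10, 60, 110, 160],
    [20, 70, 120, 170],
]


def get_pixel(img, x, y):
    """Return the intensity at column x, row y (origin at the top-left)."""
    return img[y][x]


height = len(image)    # number of rows
width = len(image[0])  # number of columns
center_value = get_pixel(image, 2, 1)
```

Note that the row index comes first when indexing the matrix, while coordinates are usually written as (x, y), which is why the helper swaps the order.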

Figure 3: Pixel Representation of Image

2.1.1.2 Color Models

A color model is an abstract mathematical model describing the way
colors can be represented as tuples of numbers, typically as three or four
values or color components. When this model is associated with a
precise description of how the components are to be interpreted
(viewing conditions, etc.), the resulting set of colors is called a color
space.

The various color models are as follows:




on. The "black" areas have not actually become darker but appear

"black" relative to the higher intensity "white" projected onto the screen

around it.

Figure 4: 3D representation of the human color

space.

The human tristimulus space has the property that additive mixing of

colors corresponds to the adding of vectors in this space. This makes it

easy to, for example, describe the possible colors (gamut) that can be

constructed from the red, green, and blue primaries in a computer

display.
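The vector-addition property can be illustrated with a small Python sketch. It assumes colors are given as linear-light (R, G, B) intensity triples in the range 0 to 1, which is a simplification of the tristimulus space discussed above; the function names add_lights and in_gamut are hypothetical.

```python
def add_lights(a, b):
    """Additively mix two lights given as (R, G, B) tuples of linear intensities."""
    return tuple(x + y for x, y in zip(a, b))


def in_gamut(color, max_value=1.0):
    """True if every component can be produced by a display limited to max_value."""
    return all(0.0 <= c <= max_value for c in color)


red = (1.0, 0.0, 0.0)
green = (0.0, 1.0, 0.0)

yellow = add_lights(red, green)   # component-wise sum of the two vectors
too_bright = add_lights(red, red)  # exceeds what the assumed display can emit
```

Under these assumptions, a mixture is displayable exactly when the summed vector stays inside the unit cube spanned by the three primaries.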




CIE XYZ color space:

One of the first mathematically defined color spaces is the CIE XYZ

color space (also known as CIE 1931 color space), created by the

International Commission on Illumination in 1931. These data were

measured for human observers and a 2-degree field of view. In 1964,

supplemental data for a 10-degree field of view were published.

Figure 5: CIE 1931 Standard Colorimetric Observer

functions between 380 nm and 780 nm (at 5 nm

intervals).

Note that the tabulated sensitivity curves have a certain amount of

arbitrariness in them. The shapes of the individual X, Y and Z sensitivity

curves can be measured with a reasonable accuracy. However, the

overall luminosity function (which in fact is a weighted sum of these

three curves) is subjective, since it involves asking a test person whether

two light sources have the same brightness, even if they are in completely




Mixtures of light of these primary colors cover a large part of the human

color space and thus produce a large part of human color experiences.

This is why color television sets or color computer monitors need only

produce mixtures of red, green and blue light.

Figure 6: RGB cube

Other primary colors could in principle be used, but with red, green and

blue the largest portion of the human color space can be captured.

Unfortunately there is no exact consensus as to what loci in the

chromaticity diagram the red, green, and blue colors should have, so the

same RGB values can give rise to slightly different colors on different

screens.

HSV and HSL representations:

Recognizing that the geometry of the RGB model is poorly aligned

with the color-making attributes recognized by human vision, computer

graphics researchers developed two alternate representations of RGB,

HSV and HSL (hue, saturation, value and hue, saturation, lightness), in

the late 1970s. HSV and HSL improve on the color cube representation




of RGB by arranging colors of each hue in a radial slice, around a central

axis of neutral colors which ranges from black at the bottom to white at

the top. The fully saturated colors of each hue then lie in a circle, a color

wheel.

Figure 7: HSV color Model

HSV models itself on paint mixture, with its saturation and value

dimensions resembling mixtures of a brightly colored paint with,

respectively, white and black. HSL tries to resemble more perceptual

color models such as NCS or Munsell. It places the fully saturated colors

in a circle of lightness ½, so that lightness 1 always implies white, and

lightness 0 always implies black.
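The lightness behavior described above can be checked with Python's standard colorsys module. This is only a sketch for illustration: colorsys works on RGB components in the range 0 to 1 and returns HSL in the order (hue, lightness, saturation), and the helper name describe is an assumption.

```python
import colorsys


def describe(rgb):
    """Return the HSV and HLS triples for an (R, G, B) color with components in 0..1."""
    r, g, b = rgb
    return {
        "hsv": colorsys.rgb_to_hsv(r, g, b),  # (hue, saturation, value)
        "hls": colorsys.rgb_to_hls(r, g, b),  # (hue, lightness, saturation)
    }


pure_red = describe((1.0, 0.0, 0.0))
white = describe((1.0, 1.0, 1.0))
black = describe((0.0, 0.0, 0.0))
```

A fully saturated hue such as pure red lands at lightness 0.5 in HSL but at value 1.0 in HSV, while white and black sit at lightness 1 and 0 respectively.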




2.1.1.3 RGB Color Model

The RGB color model is an additive color model in which red, green, and

blue light are added together in various ways to reproduce a broad array

of colors. The name of the model comes from the initials of the three

additive primary colors, red, green, and blue.

Figure 8: RGB Color Model

The main purpose of the RGB color model is for the sensing,

representation, and display of images in electronic systems, such as

televisions and computers, though it has also been used in conventional

photography. Before the electronic age, the RGB color model already
had a solid theory behind it, based on human perception of colors.




recognition. Finally the input images are recognized as a meaningful gesture

based on the gesture modeling and analysis. The details of the above phases

are discussed in the following paragraphs. A schematic diagram of the

popularly used hand gesture recognition system is shown below.

Figure 9: Generalized System Architecture for Hand Gesture

Recognition

2.1.2.1 Data Acquisition

For efficient hand gesture recognition, data acquisition should be as
accurate as possible. A suitable input device should be selected for data
acquisition; a number of input devices are available.




Figure 10. Background Subtraction

Background subtraction is a widely used approach for detecting moving
objects in videos from static cameras. The rationale is to detect moving
objects from the difference between the current frame and a reference
frame, often called the “background image” or “background model”.
Background subtraction is mostly done when the image in question is part
of a video stream. It provides important cues for numerous applications in
computer vision, for example surveillance tracking or human pose
estimation. However, background subtraction is generally based on a
static-background hypothesis, which is often not applicable in real
environments. With indoor scenes, reflections
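The frame-differencing idea described above can be sketched in Python as follows. This is a minimal illustration on small grayscale matrices, assuming a fixed reference frame and a hand-picked threshold of 30 intensity levels; neither the threshold nor the function name subtract_background comes from the thesis.

```python
def subtract_background(frame, background, threshold=30):
    """Return a binary mask: 1 where |frame - background| > threshold, else 0."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Reference frame of an empty scene (illustrative values).
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

# Current frame: a bright object has entered, plus slight sensor noise.
frame = [
    [10, 12, 10, 10],
    [10, 200, 210, 10],
    [10, 10, 205, 9],
]

mask = subtract_background(frame, background)
```

The threshold absorbs small intensity fluctuations, so only pixels that differ strongly from the reference frame are marked as foreground in the resulting binary image.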




known Conway's Game of Life, for example, uses the Moore neighborhood.

It is similar to the notion of 8-connected pixels in computer graphics.

In the Moore neighborhood tracing algorithm, when the current pixel P
has the foreground color, the Moore neighborhood of P is examined in a
clockwise manner, starting with the pixel from which P was entered and
advancing pixel by pixel until a new foreground pixel in the neighborhood
of P is encountered. The algorithm is described more precisely below.

Figure 11: Moore Neighborhood

If pixel 4 is a white pixel, we set pixel 4 as the new P (i.e. the current
pixel), backtrack to its previous pixel (pixel 3 in this case), and explore
its 8 neighbors.

Moore neighborhood of P, numbered clockwise:

1 2 3
0 P 4
7 6 5
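The tracing procedure described above can be sketched in Python as follows. This is a simplified version under stated assumptions: the image is a binary matrix with 1 as the foreground (white) value, the start pixel is the first foreground pixel in raster order, and tracing stops as soon as the start pixel is reached again, which is a simple stopping rule that may not suit every shape. All function names are illustrative.

```python
# Clockwise Moore neighborhood as (row offset, column offset), numbered 0..7
# starting at the west neighbor, matching the layout of Figure 11.
OFFSETS = [(0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1)]


def is_foreground(img, r, c):
    """True if (r, c) lies inside the image and holds a foreground (1) pixel."""
    return 0 <= r < len(img) and 0 <= c < len(img[0]) and img[r][c] == 1


def find_start(img):
    """Return the first foreground pixel in raster order, or None."""
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value == 1:
                return (r, c)
    return None


def trace_boundary(img):
    """Trace the outer boundary clockwise until the start pixel is reached again."""
    start = find_start(img)
    if start is None:
        return []
    boundary = [start]
    p = start
    back = (start[0], start[1] - 1)  # the start pixel was entered from the west
    max_steps = 8 * len(img) * len(img[0])  # safety bound for this sketch
    for _ in range(max_steps):
        # Position of the backtrack pixel within the neighborhood of p.
        k = OFFSETS.index((back[0] - p[0], back[1] - p[1]))
        found = None
        for step in range(1, 9):
            dr, dc = OFFSETS[(k + step) % 8]
            q = (p[0] + dr, p[1] + dc)
            if is_foreground(img, q[0], q[1]):
                found = q
                break
            back = q  # remember the last background pixel examined
        if found is None or found == start:
            break  # isolated pixel, or the contour has closed
        boundary.append(found)
        p = found
    return boundary


square = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
contour = trace_boundary(square)
```

For the 2x2 square, the trace visits the four foreground pixels in clockwise order, given as (row, column) pairs.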




2.2.4 Polygonal Approximation

The purpose of the algorithm is, given a curve composed of line

segments, to find a similar curve with fewer points. The algorithm defines

'dissimilar' based on the maximum distance between the original curve and




the simplified curve. The simplified curve consists of a subset of the points

that defined the original curve.

Figure 12: Simplifying a piecewise linear curve with the Douglas–Peucker
algorithm.
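A minimal Python sketch of this simplification is given here, assuming 2-D points as (x, y) tuples and a distance tolerance named epsilon; the parameter name and the helper functions are illustrative and are not taken from the thesis. A point is kept only when it lies farther than epsilon from the segment joining the current end points.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment with end points a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Return a subset of points approximating the curve within epsilon."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = douglas_peucker(points[: index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    return [first, last]


curve = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(curve, 0.5)
```

With a tolerance of 0.5 the small bump at (1, 0.1) is dropped while the peak at (3, 3) is preserved; a larger tolerance collapses the curve to its two end points.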

The algorithm recursively divides the line. Initially it is given all the

points between the first and last point. It automatically marks the first and

last point to be kept. It then finds the point that is furthest from the line

segment with the first and last points as end points (this point is obviously




Figure 13. Centroid Calculation of Hand Palm

Because the palm serves as the origin of the hand model's coordinate
system, its position must be determined objectively. This paper therefore
proposes a palm center detection method based on the moments of the
contour, which locates the center of gravity of the hand (the palm center
coordinate).

The spatial moments of an image are computed as

$$M_{ij} = \sum_{x,y} f(x,y)\, x^{j}\, y^{i}$$

The central moments are

$$\mu_{ij} = \sum_{x,y} f(x,y)\, (x - \bar{x})^{j}\, (y - \bar{y})^{i}$$

where $(\bar{x}, \bar{y})$ is the mass center:
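A minimal Python sketch of these moments is given here. It follows the index order of the spatial-moment formula above and assumes the standard definition of the mass center as the first-order moments divided by the zeroth-order moment, since the excerpt does not show that formula. The function names and the sample mask are illustrative.

```python
def spatial_moment(img, i, j):
    """M_ij = sum over (x, y) of f(x, y) * x**j * y**i, in the text's index order."""
    return sum(
        value * (x ** j) * (y ** i)
        for y, row in enumerate(img)
        for x, value in enumerate(row)
    )


def centroid(img):
    """Mass center (x_bar, y_bar); assumes first-order moments divided by M_00."""
    m00 = spatial_moment(img, 0, 0)
    if m00 == 0:
        raise ValueError("image has no foreground mass")
    x_bar = spatial_moment(img, 0, 1) / m00  # sum of f*x over sum of f
    y_bar = spatial_moment(img, 1, 0) / m00  # sum of f*y over sum of f
    return (x_bar, y_bar)


def central_moment(img, i, j):
    """Moment of order (i, j) taken about the mass center."""
    x_bar, y_bar = centroid(img)
    return sum(
        value * ((x - x_bar) ** j) * ((y - y_bar) ** i)
        for y, row in enumerate(img)
        for x, value in enumerate(row)
    )


# Binary mask with a 2x2 foreground block standing in for the hand region.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
palm_center = centroid(mask)
```

For a binary mask, M_00 is simply the foreground pixel count, and the first-order central moments vanish by construction, which is a quick sanity check on the computed center.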




The basic operating principle is as follows. Use the IO port TRIG to
trigger ranging; it needs a high-level signal of at least 10 µs. The module
then automatically sends eight 40 kHz square-wave pulses and checks
whether any signal returns. If a signal returns, the IO port ECHO outputs
a high-level signal whose duration is the time from transmission to
reception of the ultrasonic pulse.

Test distance = (duration of high level × sound velocity (340 m/s)) / 2.

We can use the above calculation to find the distance between the
obstacle and the ultrasonic module.
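The distance formula above can be written as a short Python sketch. It only illustrates the arithmetic, assuming the echo duration has already been measured; the function names are hypothetical, and the project's actual microcontroller code is not shown in this excerpt.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # value used in the text


def echo_to_distance_m(duration_s):
    """Distance in meters from the ECHO high-level duration in seconds.

    The pulse travels to the obstacle and back, hence the division by 2.
    """
    return duration_s * SPEED_OF_SOUND_M_PER_S / 2.0


def echo_us_to_distance_cm(duration_us):
    """Same calculation for a duration in microseconds, result in centimeters."""
    return echo_to_distance_m(duration_us * 1e-6) * 100.0


example_cm = echo_us_to_distance_cm(1000)  # a 1 ms echo
```

Under these assumptions a 1 ms echo corresponds to roughly 17 cm, that is, about 0.017 cm per microsecond of echo time.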

Fig.14. Ultrasonic Sensors

2.3.2 ARDUINO BOARD

The Arduino Uno is a microcontroller board based on the ATmega328. It
has 14 digital input/output pins (of which 6 can be used as PWM outputs),
6 analog inputs, a 16 MHz ceramic resonator, a USB connection, a power
jack, an ICSP header, and a reset button. It contains everything needed to
support the microcontroller; simply connect it to a computer with a USB
cable or power it with an AC-to-DC adapter or battery to get started.




The Uno differs from all preceding boards in that it does not use the
FTDI USB-to-serial driver chip. Instead, it features the ATmega16U2
(ATmega8U2 up to version R2) programmed as a USB-to-serial converter.

Revision 2 of the Uno board has a resistor pulling the 8U2 HWB line to
ground, making it easier to put into DFU mode.

“Uno” means one in Italian, and the board is named to mark the upcoming
release of Arduino 1.0. The Uno and version 1.0 will be the reference
versions of Arduino, moving forward. The Uno is the latest in a series of
USB Arduino boards.

Fig.15 Arduino Board

Summary of Arduino board

Microcontroller      ATmega328
Operating voltage    5 V




Fig.16 Jumper Wires




The sensor sends a message back to the computer brick reporting the time
taken for the signal to return. The brick then uses this information to
compute how far away the object is.

The ultrasonic sensor sends out sound from one side and receives the sound
reflected from an object on the other side.

The sensor uses the time it takes for the sound to return from the object
in front of it to determine the distance to that object. The “sonic” in
ultrasonic refers to sound, and “ultra” means that humans cannot hear it
(but bats and dogs can hear those sounds).

The ultrasonic sensor can report distances in centimeters or inches. It
can measure from 0 to 2.5 meters, with a precision of 3 cm.

It works very well and provides good readings when sensing large objects
with hard surfaces. However, reflections from soft fabrics, curved objects
(such as balls), or very thin and small objects can be difficult for the
sensor to read.

Note: Two ultrasonic sensors in the same room may interfere with each
other’s readings.

Fig.18 Ultrasonic Sensor Circuit Diagram




Fig 19. Connection Setup of Sensors and Arduino ATmega328 Microcontroller.




4.2 Results & Analysis:

Figure 20: Experimental Result