Hand Gesture Imp


    CHAPTER 1 : INTRODUCTION

Nowadays, robots are used in different domains ranging from search and rescue in dangerous environments to interactive entertainment. The more robots are employed in our daily life, the more natural communication with them is required. Since the introduction of the most common computer input devices, not a lot has changed. This is probably because the existing devices are adequate. It is also now that computers have been so tightly integrated with everyday life that new applications and hardware are constantly introduced. The means of communicating with computers at the moment are limited to keyboards, mice, light pens, trackballs, keypads, etc. These devices have grown to be familiar but inherently limit the speed and naturalness with which we interact with the computer. On the other hand, hand gesture, as a natural interface means, has been attracting much attention for interactive communication with robots in recent years.

As the computer industry has followed Moore's Law since the mid-1960s, powerful machines are built equipped with more peripherals. Vision based interfaces are feasible, and at the present moment the computer is able to "see". Hence users are allowed richer and more user-friendly man-machine interaction. This can lead to new interfaces that will allow the deployment of new commands that are not possible with the current input devices. Plenty of time will be saved as well.

Recently, there has been a surge of interest in recognizing human hand gestures. In this context, vision based hand detection and tracking techniques are used to provide an efficient real time interface with the robot. However, the problem of visual hand recognition and tracking is quite challenging. Many early approaches used position markers or colored gloves to make the problem of hand recognition easier, but due to their inconvenience, they cannot be considered a natural interface for robot control. Thanks to the latest advances in the computer vision field, the recent vision based approaches do not need any extra hardware except a camera. These techniques can be categorized as model based and appearance based methods. While model based techniques can recognize the hand motion and its shape exactly, they are computationally expensive and therefore infeasible for a real time control application.

The appearance based techniques, on the other hand, are faster, but they still deal with some issues such as:

the complex nature of the hand, with more than 20 DOF (degrees of freedom)

cluttered and varying backgrounds

variation in lighting conditions

real time computational demand

This project on the one hand addresses solutions to the mentioned issues in the hand recognition problem using 2D images, and on the other hand proposes an innovative natural commanding system for Human Robot Interaction (HRI).

Hand gesture recognition has various applications like computer games, machinery control (e.g. crane), and thorough mouse replacement. One of the most structured sets of gestures belongs to sign language. In sign language, each gesture has an assigned meaning (or meanings). Computer recognition of hand gestures may provide a more natural human-computer interface, allowing people to point, or to rotate a CAD model by rotating their hands. Hand gestures can be classified in two categories: static and dynamic. A static gesture is a particular hand configuration and pose, represented by a single image. A dynamic gesture is a moving gesture, represented by a sequence of images. We will focus on the recognition of static images. Interactive applications pose particular challenges. The response time should be very fast. The user should sense no appreciable delay between when he or she makes a gesture or motion and when the computer responds. The computer vision algorithms should be reliable and work for different people. There are also economic constraints: the vision-based interfaces will be replacing existing ones, which are often very low cost. Even for added functionality, consumers may not want to spend more. When additional hardware is needed, the cost is considerably higher.

    CHAPTER 2 : SYSTEM DESCRIPTION

The system developed for hand based robot control consists of the set-up of the robot, a 2D imaging system, and a control application.

    1. BLOCK DIAGRAM


In the project we use a video web camera for gesture identification. It sniffs frames of the live video stream at some time interval. In our case the frame capture rate for gesture search is 1 frame per two seconds.

For video capturing, the camera used has a resolution of 640x480 and gives output in YCbCr format. Using a MATLAB program we convert it into grayscale format. By using MATLAB we take snapshots of the obtained video at an interval of two seconds. The size of the obtained image is 640x480.
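A minimal MATLAB sketch of this capture step (assuming the Image Acquisition Toolbox; the 'winvideo' adaptor name and device ID are machine-dependent assumptions):

vid = videoinput('winvideo', 1);        % webcam source (adaptor/ID may differ)
set(vid, 'ReturnedColorSpace', 'rgb');  % return RGB frames
preview(vid);                           % live video stream
for k = 1:10                            % sniff frames at a fixed interval
    snap = getsnapshot(vid);            % take a snapshot of the video
    gray = rgb2gray(snap);              % convert the frame to grayscale
    pause(2);                           % 1 frame per two seconds
end
delete(vid);                            % release the camera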


Similarly all the four gestures are trained to obtain four networks respectively. The networks are saved in the workspace of MATLAB. This acts as the database for classification.

The proposed technique to control a robotic system using hand gesture display is divided into four subparts:

Capture a frame containing some gesture presentation.

Extract the hand gesture area from the captured frame.

Determine the gesture by pattern matching using a radial basis function.

Determine the control instruction corresponding to the matched gesture, and give that instruction to the specified robotic system.

Now for real time execution, snapshots of the video are taken. Then appropriate cropping is done and the result is given to the neural network. Gesture classification is done by the radial basis function. The correlation property between the images is used here, i.e. the cross-correlation is calculated. This determines the threshold level for gesture classification. The neural net classification output is used to drive the hardware. The Arduino is used for hardware interfacing. The control commands are sent to the Arduino from MATLAB directly. On the basis of this command the motion of the robot is adjusted.

Thus from the corpus of gestures, the specific gesture of interest can be identified, and on the basis of that, a specific command for execution of an action can be given to the robotic system.
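A minimal MATLAB sketch of this correlation-based matching (all names are illustrative: templates would hold one stored image per trained gesture, resized to a common size, and the 0.7 threshold is an assumption to be tuned):

test = double(croppedFrame);             % cropped snapshot from the camera
scores = zeros(1, numel(templates));
for g = 1:numel(templates)
    % 2-D cross-correlation coefficient between test image and template
    scores(g) = corr2(test, double(templates{g}));
end
[best, idx] = max(scores);
if best > 0.7                            % assumed acceptance threshold
    command = idx;                       % gesture index selects the robot command
end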

In our project the software platform used is MATLAB 10. The various sections of the system, like gesture recognition, the radial basis function, image processing, and interfacing using the Arduino, are explained in the forthcoming sections.

    2. WEBCAM

A webcam is a video camera that feeds its images in real time to a computer or computer network, often via USB, Ethernet, or Wi-Fi. Their most popular use is the establishment of video links, permitting computers to act as videophones or videoconference stations. The common use as a video camera for the World Wide Web gave the webcam its name. Other popular uses include security surveillance, computer vision, video broadcasting, and recording social videos. Webcams are known for their low manufacturing cost and flexibility, making them the lowest cost form of videotelephony. They have also become a source of security and privacy issues, as some built-in webcams can be remotely activated via spyware.

Webcams typically include a lens, an image sensor, support electronics, and may also include a microphone for sound. Various lenses are available, the most common in consumer-grade webcams being a plastic lens that can be screwed in and out to focus the camera. Fixed focus lenses, which have no provision for adjustment, are also available. As a camera system's depth of field is greater for small image formats and for lenses with a large f-number (small aperture), the systems used in webcams have a sufficiently large depth of field that the use of a fixed focus lens does not impact image sharpness to a great extent.

2.1. Technology

Image sensors can be CMOS or CCD, the former being dominant for low-cost cameras, but CCD cameras do not necessarily outperform CMOS-based cameras in the low cost price range. Most consumer webcams are capable of providing VGA resolution video at a frame rate of 30 frames per second. Many newer devices can produce video in multi-megapixel resolutions, and a few can run at high frame rates, such as the PlayStation Eye, which can produce 320x240 video at 120 frames per second.

Support electronics read the image from the sensor and transmit it to the host computer. The camera pictured to the right, for example, uses a Sonix SN9C101 to transmit its image over USB. Some cameras, such as mobile phone cameras, use a CMOS sensor with supporting electronics "on die", i.e. the sensor and the support electronics are built on a single silicon chip to save space and manufacturing costs. Most webcams feature built-in microphones to make video calling and videoconferencing more convenient. The USB video device class (UVC) specification allows for interconnectivity of webcams to computers without the need for proprietary device drivers.


    ".11. L(&e O)*ec$ T&(c+'n

In some interactive applications, the computer needs to track the position or orientation of a hand that is prominent in the image. Relevant applications might be computer games, or interactive machine control. In such cases, a description of the overall properties of the image may be adequate. Image moments, which are fast to compute, provide a very coarse summary of global averages of orientation and position. If the hand is on a uniform background, this method can distinguish hand positions and simple pointing gestures. The large-object-tracking method makes use of a low-cost detector/processor to quickly calculate moments. This is called the artificial retina chip. This chip combines image detection with some low-level image processing (named artificial retina by analogy with those combined abilities of the human retina). The chip can compute various functions useful in the fast algorithms for interactive graphics applications.

    ".12 Sh(,e &econ'$'on

    "ost applications, such as recogni0ing particular static hand signal, require a richer

    description of the shape of the input ob6ect than image moments provide. If the hand signals

    fell in a predetermined set, and the camera views a close-up of the hand, we may use an

    example-based approach, combined with a simple method top analy0e hand signals called

    orientation histograms. These example-based applications involve two phasesJ training and

    running. In the training phase, the user shows the system one or more examples of a specific

    hand shape. The computer forms and stores the corresponding orientation histograms. In the

  • 7/24/2019 Hand Gesture Imp

    7/46

    run phase, the computer compares the orientation histogram of the current image with each of

    the stored templates and selects the category of the closest match, or interpolates between

    templates, as appropriate. This method should be robust against small differences in the si0e

    of the hand but probably would be sensitive to changes in hand orientation. Cesture

    recognition can be seen as a way for computers to begin to understand human body language,

    this building a richer bridge between machines and humans than primitive text user interfacesor even C?Is 7graphical user interfaces8, which still limit the ma6ority of input to keyboard

    and mouse. Cesture recognition enables humans to interface with the machine 7"I8 and

    interact naturally without any mechanical devices. ?sing the concept of gesture recognition,

    it is possible to point a finger at the computer screen so that the cursor will move accordingly.

    This could potentially make conventional input devices such as mouse, keyboards and even

    touch-screens redundant.
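As a rough MATLAB sketch of the orientation-histogram comparison described above (not the project's own code; the bin count, gradient operator, and normalization are assumptions):

img = double(imread('hand.png'));        % hypothetical close-up hand image
[gx, gy] = gradient(img);                % image gradients
ang = atan2(gy, gx);                     % gradient orientation at each pixel
mag = hypot(gx, gy);                     % gradient magnitude at each pixel
nbins = 36;
edges = linspace(-pi, pi, nbins + 1);
h = zeros(1, nbins);
for b = 1:nbins                          % magnitude-weighted orientation histogram
    m = ang >= edges(b) & ang < edges(b+1);
    h(b) = sum(mag(m));
end
h = h / sum(h);                          % normalize, for robustness to hand size
% In the run phase, compare h to each stored template histogram
% (e.g. by Euclidean distance) and pick the closest match.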

4. NEURAL NETWORKS

Neural networks are composed of simple elements operating in parallel. These elements are inspired by biological nervous systems. As in nature, the network function is determined largely by the connections between elements. We can train a neural network to perform a particular function by adjusting the values of the connections (weights) between elements. Commonly neural networks are adjusted, or trained, so that a particular input leads to a specific target output. Such a situation is shown in figure 3. There, the network is adjusted, based on a comparison of the output and the target, until the network output matches the target. Typically many such input/target pairs are used in this supervised learning to train a network.

Neural networks have been trained to perform complex functions in various fields of application including pattern recognition, identification, classification, speech, vision, and control systems.


Today neural networks can be trained to solve problems that are difficult for conventional computers or human beings. Supervised training methods are commonly used, but other networks can be obtained from unsupervised training techniques or from direct design methods. Unsupervised networks can be used, for instance, to identify groups of data. Certain kinds of linear networks and Hopfield networks are designed directly. In summary, there is a variety of kinds of design and learning techniques that enrich the choices a user can make. The field of neural networks has a history of some five decades but has found solid application only in the past fifteen years, and the field is still developing rapidly. Thus, it is distinctly different from the fields of control systems or optimization, where the terminology, basic mathematics, and design procedures have been firmly established and applied for many years.

4.1. Neuron Model

4.11. Simple Neuron

A neuron with a single scalar input and no bias is shown on the left below. The scalar input p is transmitted through a connection that multiplies its strength by the scalar weight w, to form the product wp, again a scalar. Here the weighted input wp is the only argument of the transfer function f, which produces the scalar output a. The neuron on the right has a scalar bias, b. You may view the bias as simply being added to the product wp, as shown by the summing junction, or as shifting the function f to the left by an amount b. The bias is much like a weight, except that it has a constant input of 1. The transfer function net input n, again a scalar, is the sum of the weighted input wp and the bias b. This sum is the argument of the transfer function f. Here f is a transfer function, typically a step function or a sigmoid function, that takes the argument n and produces the output a. Note that w and b are both adjustable scalar parameters of the neuron. The central idea of neural networks is that such parameters can be adjusted so that the network exhibits some desired or interesting behaviour.


Thus, we can train the network to do a particular job by adjusting the weight or bias parameters, or perhaps the network itself will adjust these parameters to achieve some desired end. All of the neurons in the program written in MATLAB have a bias. However, you may omit a bias in a neuron if you wish.

4.12. Transfer Functions

Three of the most commonly used transfer functions are shown in figure (6).

The hard limit transfer function shown above limits the output of the neuron to either 0, if the net input argument n is less than 0, or 1, if n is greater than or equal to 0. This is the function used in the radial basis function algorithm written in MATLAB to create neurons that make a classification decision.
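As a small MATLAB illustration of the neuron and hard limit function just described (the values here are arbitrary):

p = 0.7; w = 2.0; b = -1.0;   % scalar input, weight, and bias (arbitrary values)
n = w*p + b;                  % net input: weighted input plus bias
a = double(n >= 0);           % hard limit: output 1 if n >= 0, otherwise 0
% With the Neural Network Toolbox this is equivalent to a = hardlim(n).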

There are two modes of learning: supervised and unsupervised. Below there is a brief description of each one, to determine the best one for our problem.


    -.1". S%,e&'#e Le(&n'n

Supervised learning is based on the system trying to predict outcomes for known examples and is a commonly used training method. It compares its predictions to the target answer and "learns" from its mistakes. The data start as inputs to the input layer neurons. The neurons pass the inputs along to the next nodes. As inputs are passed along, the weighting, or connection, is applied, and when the inputs reach the next node, the weightings are summed and either intensified or weakened. This continues until the data reach the output layer, where the model predicts an outcome. In a supervised learning system, the predicted output is compared to the actual output for that case. If the predicted output is equal to the actual output, no change is made to the weights in the system. But if the predicted output is higher or lower than the actual outcome in the data, the error is propagated back through the system and the weights are adjusted accordingly. This feeding of errors backwards through the network is called "back-propagation."

    @oth the "ulti-$ayer erceptron and the /adial @asis 5unction are supervised learning

    techniques. The "ulti-$ayer erceptron uses the back-propagation while the /adial @asis

    5unction is a feed-forward approach which trains on a single pass of the data.

4.14. Unsupervised Learning

Neural networks which use unsupervised learning are most effective for describing data rather than predicting it. The neural network is not shown any outputs or answers as part of the training process; in fact, there is no concept of output fields in this type of system. The primary unsupervised technique is the Kohonen network. The main uses of Kohonen and other unsupervised neural systems are in cluster analysis, where the goal is to group "like" cases together. The advantage of the neural network for this type of analysis is that it requires no initial assumptions about what constitutes a group or how many groups there are. The system starts with a clean slate and is not biased about which factors should be most important.

4.2. Advantages of Neural Computing

There is a variety of benefits that an analyst realizes from using neural networks in their work.

Pattern recognition is a powerful technique for harnessing the information in the data and generalizing about it. Neural nets learn to recognize the patterns which exist in the data set.

The system is developed through learning rather than programming. Programming is much more time consuming for the analyst and requires the analyst to specify the exact behaviour of the model. Neural nets teach themselves the patterns in the data, freeing the analyst for more interesting work.

Neural networks are flexible in a changing environment. Rule based systems or programmed systems are limited to the situation for which they were designed; when conditions change, they are no longer valid. Although neural networks may take some time to learn a sudden drastic change, they are excellent at adapting to constantly changing information.

Neural networks can build informative models where more conventional approaches fail. Because neural networks can handle very complex interactions, they can easily model data which is too difficult to model with traditional approaches such as inferential statistics or programming logic.

The performance of neural networks is at least as good as classical statistical modeling, and better on most problems. Neural networks build models that are more reflective of the structure of the data, in significantly less time.

Neural networks now operate well with modest computer hardware. Although neural networks are computationally intensive, the routines have been optimized to the point that they can now run in reasonable time on personal computers. They do not require supercomputers as they did in the early days of neural network research.

    -.". L'/'$($'on# o0 Ne%&(l Co/,%$'n

There are some limitations to neural computing. The key limitation is the neural network's inability to explain the model it has built in a useful way. Analysts often want to know why the model is behaving as it is. Neural networks get better answers, but they have a hard time explaining how they got there.

There are a few other limitations that should be understood. First, it is difficult to extract rules from neural networks. This is sometimes important to people who have to explain their answer to others and to people who have been involved with artificial intelligence, particularly expert systems, which are rule-based.

As with most analytical methods, you cannot just throw data at a neural net and get a good answer. You have to spend time understanding the problem or the outcome you are trying to predict. And you must be sure that the data used to train the system are appropriate and are measured in a way that reflects the behaviour of the factors. If the data are not representative of the problem, neural computing will not produce good results. This is a classic situation where "garbage in" will certainly produce "garbage out."

Finally, it can take time to train a model from a very complex data set. Neural techniques are computer intensive and will be slow on low end PCs or machines without math coprocessors. It is important to remember, though, that the overall time to results can still be faster than other data analysis approaches, even when the system takes longer to train. Processing speed alone is not the only factor in performance, and neural networks do not require the time for programming, debugging, or testing assumptions that other analytical approaches do.

5. RADIAL BASIS FUNCTION

A Radial Basis Function (RBF) neural network has an input layer, a hidden layer, and an output layer. The neurons in the hidden layer contain Gaussian transfer functions whose outputs are inversely proportional to the distance from the center of the neuron.

RBF networks are similar to K-Means clustering and PNN/GRNN networks. The main difference is that PNN/GRNN networks have one neuron for each point in the training file, whereas RBF networks have a variable number of neurons that is usually much smaller than the number of training points. For problems with small to medium sized training sets, PNN/GRNN networks are usually more accurate than RBF networks, but PNN/GRNN networks are impractical for large training sets.

5.1. How RBF networks work


Although the implementation is very different, RBF neural networks are conceptually similar to K-Nearest Neighbor (k-NN) models. The basic idea is that the predicted target value of an item is likely to be about the same as for other items that have close values of the predictor variables. Consider this figure:

Assume that each case in the training set has two predictor variables, x and y. The cases are plotted using their x,y coordinates as shown in the figure. Also assume that the target variable has two categories: positive, which is denoted by a square, and negative, which is denoted by a dash. Now, suppose we are trying to predict the value of a new case represented by the triangle with predictor values x=6, y=5.1. Should we predict the target as positive or negative?

Notice that the triangle is positioned almost exactly on top of a dash representing a negative value. But that dash is in a fairly unusual position compared to the other dashes, which are clustered below the squares and left of center. So it could be that the underlying negative value is an odd case.

The nearest neighbor classification performed for this example depends on how many neighboring points are considered. If 1-NN is used and only the closest point is considered, then clearly the new point should be classified as negative, since it is on top of a known negative point. On the other hand, if 9-NN classification is used and the closest 9 points are considered, then the effect of the surrounding 8 positive points may overbalance the close negative point.

An RBF network positions one or more RBF neurons in the space described by the predictor variables (x,y in this example). This space has as many dimensions as there are predictor variables. The Euclidean distance is computed from the point being evaluated (e.g., the triangle in this figure) to the center of each neuron, and a radial basis function (RBF), also called a kernel function, is applied to the distance to compute the weight (influence) for each neuron. The radial basis function is so named because the radius distance is the argument to the function: Weight = RBF(distance). The further a neuron is from the point being evaluated, the less influence it has.

Different types of radial basis functions could be used, but the most common is the Gaussian function:
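One standard form of this kernel (the exact normalization varies between texts), for a neuron at radius distance r with spread sigma, is:

Weight = RBF(r) = exp(-r^2 / (2*sigma^2))

so the weight falls off smoothly toward zero as the distance from the neuron's center grows.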

If there is more than one predictor variable, then the RBF function has as many dimensions as there are variables. The following picture illustrates three neurons in a space with two predictor variables, X and Y. Z is the value coming out of the RBF functions:


The best predicted value for the new point is found by summing the output values of the RBF functions multiplied by the weights computed for each neuron.

The radial basis function for a neuron has a center and a radius (also called a spread). The radius may be different for each neuron, and, in RBF networks generated by DTREG, the radius may be different in each dimension.


With larger spread, neurons at a distance from a point have a greater influence.

5.2. RBF Network Architecture

RBF networks have three layers:

1. Input layer - There is one neuron in the input layer for each predictor variable. In the case of categorical variables, N-1 neurons are used, where N is the number of categories. The input neurons (or processing before the input layer) standardize the range of the values by subtracting the median and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden layer.

2. Hidden layer - This layer has a variable number of neurons (the optimal number is determined by the training process). Each neuron consists of a radial basis function centered on a point with as many dimensions as there are predictor variables. The spread (radius) of the RBF function may be different for each dimension. The centers and spreads are determined by the training process. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function to this distance using the spread values. The resulting value is passed to the summation layer.

3. Summation layer - The value coming out of a neuron in the hidden layer is multiplied by a weight associated with the neuron (W1, W2, ..., Wn in this figure) and passed to the summation, which adds up the weighted values and presents this sum as the output of the network. Not shown in this figure is a bias value of 1.0 that is multiplied by a weight W0 and fed into the summation layer. For classification problems, there is one output (and a separate set of weights and summation unit) for each target category. The value output for a category is the probability that the case being evaluated has that category.

An input vector x is used as input to all radial basis functions, each with different parameters. The output of the network is a linear combination of the outputs from the radial basis functions.
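A minimal MATLAB sketch of this forward pass for a single-output network (illustrative names; in practice the centers, spread, and weights come from training):

function y = rbfForward(x, centers, sigma, w, w0)
% x: input column vector; centers: one column per hidden neuron;
% sigma: spread; w: output weights; w0: bias weight (illustrative names).
nH = size(centers, 2);
phi = zeros(nH, 1);
for j = 1:nH
    r = norm(x - centers(:, j));         % Euclidean distance to the center
    phi(j) = exp(-r^2 / (2 * sigma^2));  % Gaussian kernel value
end
y = w' * phi + w0;                       % summation layer: weighted sum plus bias
end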

    3.". T&('n'n RB Ne$4o&+#

The following parameters are determined by the training process:

1. The number of neurons in the hidden layer.
2. The coordinates of the center of each hidden-layer RBF function.
3. The radius (spread) of each RBF function in each dimension.
4. The weights applied to the RBF function outputs as they are passed to the summation layer.

Various methods have been used to train RBF networks. One approach first uses K-means clustering to find cluster centers, which are then used as the centers for the RBF functions. However, K-means clustering is a computationally intensive procedure, and it often does not generate the optimal number of centers. Another approach is to use a random subset of the training points as the centers.

DTREG uses a training algorithm developed by Sheng Chen.


6. IMAGE PROCESSING TOOLBOX

Image Processing Toolbox provides a comprehensive set of reference-standard algorithms and graphical tools for image processing, analysis, visualization, and algorithm development. You can perform image enhancement, image deblurring, feature detection, noise reduction, image segmentation, spatial transformations, and image registration. Many functions in the toolbox are multithreaded to take advantage of multicore and multiprocessor computers. Image Processing Toolbox supports a diverse set of image types, including high dynamic range, gigapixel resolution, ICC-compliant colour, and tomographic images. Graphical tools let you explore an image, examine a region of pixels, adjust the contrast, create contours or histograms, and manipulate regions of interest (ROIs). With the toolbox algorithms you can restore degraded images, detect and measure features, analyze shapes and textures, and adjust the colour balance of images.

    7. ARDUINO

Arduino is an open-source electronics prototyping platform based on flexible, easy-to-use hardware and software. It's intended for artists, designers, hobbyists, and anyone interested in creating interactive objects or environments. Arduino can sense the environment by receiving input from a variety of sensors, and can affect its surroundings by controlling lights, motors, and other actuators. The microcontroller on the board is programmed using the Arduino programming language (based on Wiring) and the Arduino development environment (based on Processing). Arduino projects can be stand-alone, or they can communicate with software running on a computer (e.g. MATLAB, Flash, Processing).

Arduino is different from other platforms on the market because of these features:

It is a multiplatform environment; it can run on Windows, Macintosh, and Linux.

It is based on the Processing programming IDE, an easy-to-use development environment used by artists and designers.

You program it via a USB cable, not a serial port. This feature is useful because many modern computers don't have serial ports.

It is open source hardware and software: if you wish, you can download the circuit diagram, buy all the components, and make your own, without paying anything to the makers of Arduino.

There is an active community of users, so there are plenty of people who can help you.

The Arduino Project was developed in an educational environment and is therefore great for newcomers to get things working quickly.

7.1. Bare Minimum Code to Run the Arduino

void setup() {
  // we can put setup code here, to run once
}

void loop() {
  // we can put main code here, it runs repeatedly
}

The setup() function is called when a sketch starts. Use it to initialize variables, pin modes, start using libraries, etc. The setup function will only run once, after each powerup or reset of the Arduino board. After creating a setup() function, the loop() function does precisely what its name suggests, and loops consecutively, allowing our program to change and respond as it runs. Code in the loop() section of our sketch is used to actively control the Arduino board.

Any line that starts with two slashes (//) will not be read by the compiler, so we can write anything we want after it.

7.2. LED Blinking Example

/* Blink
   Turns on an LED for one second, then off for one second, repeatedly.
   This example code is in the public domain. */

void setup() {
  // initialize the digital pin as an output.
  // Pin 13 has an LED connected on most Arduino boards
  pinMode(13, OUTPUT);
}

void loop() {
  digitalWrite(13, HIGH); // set the LED on
  delay(1000);            // wait for a second
  digitalWrite(13, LOW);  // set the LED off
  delay(1000);            // wait for a second
}


In the program above, the first thing we do is initialize pin 13 as an output pin with the line

pinMode(13, OUTPUT);

In the main loop, we turn the LED on with the line:

digitalWrite(13, HIGH);

This supplies 5 volts to pin 13. That creates a voltage difference across the pins of the LED, and lights it up. Then we turn it off with the line:

digitalWrite(13, LOW);

That takes pin 13 back to 0 volts, and turns the LED off. In between the on and the off, we want enough time for a person to see the change, so the delay() commands tell the Arduino to do nothing for 1000 milliseconds, or one second. When we use the delay() command, nothing else happens for that amount of time.

    CHAPTER " : IMAGE DATABASE

The starting point of the project was the creation of a database with all the images that would be used for training and testing.

The image database can have different formats. Images can be hand drawn, digitized photographs, or a 2D/3D hand model. Photographs were used, as they are the most realistic approach.

Images came from two main sources: various ASL databases on the Internet, and photographs taken with a web camera. This meant that they had different sizes, different resolutions, and sometimes almost completely different angles of shooting. Images belonging to the last case were very few, and they were discarded, as there was no chance of classifying them correctly.

Two operations were carried out on all of the images. They were converted to grayscale, and the background was made uniform. The internet databases already had uniform backgrounds, but the ones taken with the camera had to be processed in Adobe Photoshop.

The database itself was constantly changing throughout the completion of the project, as it was what would decide the robustness of the algorithm. Therefore, it had to be built in such a way that different situations could be tested, and thresholds above which the algorithm didn't classify correctly could be decided. The construction of such a database is clearly dependent on the application. If the application is, for example, a crane controller operated by the same person for long periods, the algorithm doesn't have to be robust to different persons' images. In this case noise and motion blur should be tolerable.

We can see an example as shown in the figure:


In the first row are the training images; in the second, the testing images. For most of the gestures the training set originates from a single gesture. Those were enhanced in Adobe Photoshop using various filters. The reason for this is that the algorithm is to be very robust for images of the same database. If a misclassification were to happen, it would preferably be for unknown images.

The final form of the database is this.

Train set: Four training sets of images, each one containing thirty five images. Each set originates from a single image.

Test set: The number of test images varies for each gesture. There is no reason for keeping those at a constant number. Some images can tolerate much more variance, including images from new databases, and they can be tested extensively, while other images are restricted to fewer testing images.

The system could be approached either at high or low level. The former would employ models of the hand, fingers and joints, and perhaps fit such a model to the visual data. This approach offers robustness, but at the expense of speed.

A low-level approach would process data at a level not much higher than that of pixel intensities. Although this approach would not have the power to make inferences about occluded data, it could be simple and fast. The pattern recognition system that will be used can be seen in the figure.


Some transformation T converts an image into a feature vector, which will then be compared with the feature vectors of a training set of gestures. We will be seeking the simplest possible transformation T that allows gesture recognition.

CHAPTER 4 : RADIAL BASIS NETWORK IMPLEMENTATION IN MATLAB

    1. LEARNING RULES

We will define a learning rule as a procedure for modifying the weights and biases of a network. (This procedure may also be referred to as a training algorithm.) The learning rule is applied to train the network to perform some particular task. Learning rules in the MATLAB toolbox fall into two broad categories: supervised learning and unsupervised learning. Those two categories were described in detail in the previous chapter. The algorithm has been developed using supervised learning.

In supervised learning, the learning rule is provided with a set of examples (the training set) of proper network behavior: inputs p to the network together with the corresponding correct (target) outputs t. As the inputs are applied to the network, the network outputs are compared to the targets. The learning rule is then used to adjust the weights and biases of the network in order to move the network outputs closer to the targets. The Radial Basis Function learning rule falls in this supervised learning category.

2. LINEAR CLASSIFICATION


Linear networks can be trained to perform classification with the function NEWRB. NEWRB designs a radial basis network. Radial basis networks can be used to approximate functions. NEWRB adds neurons to the hidden layer of a radial basis network until it meets the specified mean squared error goal.

[net,tr] = newrb(P,T,GOAL,SPREAD,MN,DF) takes these arguments:

P - RxQ matrix of Q input vectors.
T - SxQ matrix of Q target class vectors.
GOAL - Mean squared error goal, default = 0.0.
SPREAD - Spread of radial basis functions, default = 1.0.
MN - Maximum number of neurons, default is Q.
DF - Number of neurons to add between displays, default = 25,

and returns a new radial basis network.

The larger SPREAD is, the smoother the function approximation will be. Too large a spread means a lot of neurons will be required to fit a fast changing function. Too small a spread means many neurons will be required to fit a smooth function, and the network may not generalize well. Call NEWRB with different spreads to find the best value for a given problem.

Examples:

Here we design a radial basis network given inputs P and targets T.

P = [1 2 3];
T = [2.0 4.1 5.9];
net = newrb(P,T);

Here the network is simulated for a new input.

P = 1.5;
Y = sim(net,P)

Algorithm

NEWRB creates a two layer network. The first layer has RADBAS neurons, and calculates its weighted inputs with DIST and its net input with NETPROD. The second layer has PURELIN neurons, and calculates its weighted input with DOTPROD and its net inputs with NETSUM. Both layers have biases. Initially the RADBAS layer has no neurons. The following steps are repeated until the network's mean squared error falls below GOAL or the maximum number of neurons is reached:

1) The network is simulated.
2) The input vector with the greatest error is found.
3) A RADBAS neuron is added with weights equal to that vector.
4) The PURELIN layer weights are redesigned to minimize error.

Below in the figure, we can see a flow chart of the algorithm. It always operates in the same order. There is a graphical interface only for selecting the test set, as that is the only user input. The error is also displayed graphically for each epoch that goes through.


CHAPTER 5 : PHASES OF THE PROJECT

The project had two phases:

Training Phase

Execution Phase

    1. TRAINING PHASE

An artificial neural network is a system based on the operation of biological neural networks; in other words, it is an emulation of a biological neural system. Why would the implementation of artificial neural networks be necessary? Although computing these days is truly advanced, there are certain tasks that a program made for a common microprocessor is unable to perform; even so, a software implementation of a neural network can be made, with its advantages and disadvantages.

1.1. Training of artificial neural networks

A neural network has to be configured such that the application of a set of inputs produces (either 'directly' or via a relaxation process) the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule. We can categorize the learning situations into two distinct sorts. These are:

Supervised learning, or associative learning, in which the network is trained by providing it with input and matching output patterns. These input-output pairs can be provided by an external teacher, or by the system which contains the neural network (self-supervised).


Unsupervised learning, or self-organization, in which an (output) unit is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.

Reinforcement learning: This type of learning may be considered an intermediate form of the above two types of learning. Here the learning machine performs some action on the environment and gets a feedback response from the environment. The learning system grades its action as good (rewarding) or bad (punishable) based on the environmental response, and accordingly adjusts its parameters. Generally, parameter adjustment is continued until an equilibrium state occurs, following which there will be no more changes in its parameters. Self-organizing neural learning may be categorized under this type of learning.

The Radial Basis Function learning rule falls in the supervised learning category. In supervised learning, the learning rule is provided with a set of examples (the training set) of proper network behavior: inputs p to the network together with the corresponding correct (target) outputs t. As the inputs are applied to the network, the network outputs are compared to the targets. The learning rule is then used to adjust the weights and biases of the network in order to move the network outputs closer to the targets.


1.21. Video capturing

The camera which is used is a web cam with a resolution of 640x480, which gives output in YCbCr format. Using a MATLAB program we convert it into grayscale format. By using MATLAB we take snapshots of the obtained video at an interval of two seconds. The size of the obtained image is 640x480.


1.22. Neural network

35 images of a single gesture are provided to the network after appropriate cropping is done. These 35 gesture images are concatenated to form a single matrix, which forms the input vector to the network. For each and every input vector we assign a target vector. The target vector is taken to be a single image of the gesture, which is a common one for all input vectors.

Similarly all the four gestures are trained to obtain four networks respectively. The networks are saved in the workspace of MATLAB. This acts as the database for classification.


The specific gesture is used to convey a specific command for the execution of an action by the robot. The respective control actions for the gestures given above (in clockwise order) are forward, backward, right, and left movement of the robot.

2. EXECUTION PHASE

In this section we describe a methodology for implementing the execution phase of artificial neural networks (ANN) in hardware devices. The various entities are interconnected to form a neuron that can be mapped to a hardware device. Using a similar approach, neurons are grouped in layers, which are then interconnected to construct an artificial neural network. The methodology is intended to lead a neural network designer through the steps required to take the design into a hardware device, starting with the results provided by a neuro-simulator, obtaining the network parameters and translating them into a fully synthesizable design.

2.1. Execution Phase: Block Diagram


2.11. Steps in execution phase:


A real time hand tracking scheme can be used for capturing the gesture; if a gesture is available in the vision range, it is captured by the camera.

Convert the image to a binary image.

Crop out the unwanted portion with the cropping algorithm.

Give it to the neural net to perform classification.

Based on the stored database, the neural net will return the classified result.

The neural net classification output is used to drive the hardware.

The Arduino is used for hardware interfacing.

The control commands are sent to the Arduino from MATLAB directly.

On the basis of this command the motion of the robot is adjusted (a sketch of this loop is given below).
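A minimal MATLAB sketch of this execution loop (illustrative throughout: vid is a videoinput object as in Chapter 2, templates holds one reference image per gesture, a is the arduino object from Chapter 6, imgcrop is the cropping function from Chapter 9, and the pin numbers are assumptions):

pins = [2 3 4 5];                        % assumed pins: forward, backward, right, left
while true
    frame = getsnapshot(vid);            % capture a frame from the webcam
    bw = im2bw(rgb2gray(frame), 0.5);    % convert the image to a binary image
    hand = imgcrop(double(bw));          % crop out the unwanted portion
    scores = zeros(1, 4);
    for g = 1:4                          % classify against the stored database
        scores(g) = corr2(hand, templates{g});
    end
    [~, gesture] = max(scores);
    for k = 1:4                          % drive the hardware through the Arduino
        a.digitalWrite(pins(k), double(k == gesture));
    end
    pause(2);                            % one gesture decision every two seconds
end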

CHAPTER 6 : INTERFACING USING ARDUINO

As we already discussed, different functions corresponding to each meaningful hand gesture are written and stored in the database for controlling the robot. That is, for each meaningful hand gesture we want to make the robot move in the corresponding direction. To make this happen, for each gesture identified, MATLAB should send the corresponding control signal to the robot. The interfacing of MATLAB with the hardware is done using the Arduino. It simplifies the serial communication between MATLAB and the hardware. Another advantage of using the Arduino is that it can be controlled directly from MATLAB by creating an Arduino object inside the software itself. This is possible through the MATLAB support package for Arduino, which uses the serial port interface in MATLAB for communication with the Arduino board.


    "!$!@ support package for !rduino comes with a server program adiosrv.pde that can be

    downloaded on the !rduino board. nce downloaded, the adiosrv.pde will2

    %. 1ait and listen for "!T$!@ commands

    3. ?pon receiving "!T$!@ command, execute and return the result

    The steps in writedownload the server program to !rduino board is the following2

    %. pen !rduino I4:3. Select the adiosrv.pde file by navigating through 5ile pen in the I4:

    A. 9lick the upload button

    1. TYPICAL USAGE

    "ake sure the board is connected to the computer via ?S@ port, make sure to know which

    serial port the !rduino is connected to 7this is the same port found at the end of the drivers

    installation step8, and finally, make sure that the port is not used by the I4: 7in other words,

    the I4: must be closed or disconnected8, so that "!T$!@ can use the serial connection.

    5rom "!T$!@, launch the command aLarduino7DportD8 where DportD is the 9" port to

    which the !rduino is connected to, e.g. D9"BD or D9"=D on 1indows, or DdevttyS%(%D on

    ?nix 7use Ddevtty?S@(D for !rduino versions prior to ?no8 or D4:"D 7see below for the4:" mode8 and make sure the function terminates successfully.

    Then use the commands a.pin"ode, 7to change the mode of each pin between input and

    output8 a.digital/ead, a.digital1rite, a.analog/ead, and a.analog1rite, to perform digital

    input, digital output, analog input, and analog output.

2. EXAMPLE

The previous LED Blink example can be directly implemented in MATLAB as follows ('COM5' stands for whichever COM port the board is on):

a = arduino('COM5');      % set up the Arduino object by specifying the COM port
a.pinMode(13,'output');   % declare the 13th pin as output
while(1)                  % execute the infinite loop
    a.digitalWrite(13,1); % set the LED on
    pause(1);             % wait for a second
    a.digitalWrite(13,0); % set the LED off
    pause(1);             % wait for a second
end

    CHAPTER 7 : ROBOT CONTROL

The robot uses two wheel control, with a castor wheel provided for support.

1. MOTOR DRIVER IC L293D


When the enable input is high, the associated driver gets enabled. As a result, the outputs become active and work in phase with their inputs. Similarly, when the enable input is low, that driver is disabled, and its outputs are off and in the high-impedance state. The L293D has an output current of 600mA and a peak output current of 1.2A per channel. Moreover, for protection of the circuit from back EMF, output diodes are included within the IC. The output supply (VCC2) has a wide range, from 4.5V to 36V.

Thus if both motors are moving in the forward direction, the robot moves forward, while if both motors are moving in the reverse direction, the robot moves backward. If the left motor is in forward motion and the right motor is in reverse motion, then the robot turns to its right, while if the right motor is in forward motion and the left motor is in reverse motion, then the robot turns to its left. Thus with appropriate control instructions from the Arduino, we can control the movement of the robot.
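A minimal MATLAB sketch of this drive logic using the Arduino interface from Chapter 6 (the pin numbers are assumptions; they depend on how the L293D inputs are wired to the Arduino):

function drive(a, leftFwd, rightFwd)
% leftFwd/rightFwd: 1 = forward, 0 = reverse for each motor (illustrative).
a.digitalWrite(2, leftFwd);        % left motor, L293D input 1 (assumed pin)
a.digitalWrite(3, 1 - leftFwd);    % left motor, L293D input 2 (assumed pin)
a.digitalWrite(4, rightFwd);       % right motor, L293D input 3 (assumed pin)
a.digitalWrite(5, 1 - rightFwd);   % right motor, L293D input 4 (assumed pin)
end

With this convention, drive(a,1,1) moves forward, drive(a,0,0) moves backward, drive(a,1,0) turns right, and drive(a,0,1) turns left.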


2. DC GEARED MOTOR

The geared instrument DC motor is ideally suited to a wide range of applications requiring a combination of low speed operation and small unit size. The integral iron core DC motor provides smooth operation and a bidirectional variable speed capability, while the gearhead utilizes a multistage metal spur gear train rated for a working torque of up to 0.2 Nm. The unit, which is suitable for mounting in any attitude, provides reliable operation over a wide ambient temperature range and is equipped with an integral VDR (voltage dependent resistor) electrical suppression system to minimize electrical interference. The unit offers a range of gear ratio options for operating speeds from 5-200 rpm and is ideally suited to applications where small size and low unit price are important design criteria.

In DC geared motors, as the name suggests, the motors spin freely under the command of a controller. This makes them easier to control, as the controller knows exactly how far they have rotated, without having to use a sensor. Therefore they are used on many rolling wheels. The motor has six wires coming out of it. Four of them are used for receiving data from the microcontroller board (Arduino) for its movement, one is connected to a 12V DC supply to drive the motor, and the other wire is connected to a 5V DC supply to power the driver IC.


CHAPTER 8 : HARDWARE SETUP

    CHAPTER < : SOURCE CODE

    1. PROGRAM TO TRAIN THE NEURAL NETWORK

%% This program is used to make the neural network for the identification of gesture1.
%% The same program can be used for other gestures by replacing the sample images
%% with the images corresponding to that gesture.

clc;
clear all;

%% Load the sample images (of different poses) to make the net for gesture1
img1=imread('firsgesture1.png');
img2=imread('firsgesture2.png');
img3=imread('firsgesture3.png');
img4=imread('firsgesture4.png');
img5=imread('firsgesture5.png');
img6=imread('firsgesture6.png');
img7=imread('firsgesture7.png');
img8=imread('firsgesture8.png');
img9=imread('firsgesture9.png');
img10=imread('firsgesture10.png');
img11=imread('firsgesture11.png');
img12=imread('firsgesture12.png');
img13=imread('firsgesture13.png');
img14=imread('firsgesture14.png');
img15=imread('firsgesture15.png');
img16=imread('firsgesture16.png');
img17=imread('firsgesture17.png');
img18=imread('firsgesture18.png');
img19=imread('firsgesture19.png');
img20=imread('firsgesture20.png');
img21=imread('firsgesture21.png');
img22=imread('firsgesture22.png');
img23=imread('firsgesture23.png');
img24=imread('firsgesture24.png');
img25=imread('firsgesture25.png');
img26=imread('firsgesture26.png');
img27=imread('firsgesture27.png');
img28=imread('firsgesture28.png');
img29=imread('firsgesture29.png');
img30=imread('firsgesture30.png');
img31=imread('firsgesture31.png');
img32=imread('firsgesture32.png');
img33=imread('firsgesture33.png');
img34=imread('firsgesture34.png');
img35=imread('firsgesture35.png');

    WW 9rop the sample images

    imgcpd%Limcrop7a%8J

    imgcpd3Limcrop7a38J

    imgcpdALimcrop7aA8J

    imgcpd;Limcrop7a;8J

    imgcpdBLimcrop7aB8J

    imgcpd'Limcrop7a'8JimgcpdVLimcrop7aV8J

  • 7/24/2019 Hand Gesture Imp

    36/46

    imgcpd=Limcrop7a=8J

    imgcpd&Limcrop7a&8J

    imgcpd%(Limcrop7a%(8J

    imgcpd%%Limcrop7a%%8J

    imgcpd%3Limcrop7a%38J

    imgcpd%ALimcrop7a%A8Jimgcpd%;Limcrop7a%;8J

    imgcpd%BLimcrop7a%B8J

    imgcpd%'Limcrop7a%'8J

    imgcpd%VLimcrop7a%V8J

    imgcpd%=Limcrop7a%=8J

    imgcpd%&Limcrop7a%&8J

    imgcpd3(Limcrop7a3(8J

    imgcpd3%Limcrop7a3%8J

    imgcpd33Limcrop7a338J

    imgcpd3ALimcrop7a3A8J

    imgcpd3;Limcrop7a3;8Jimgcpd3BLimcrop7a3B8J

    imgcpd3'Limcrop7a3'8J

    imgcpd3VLimcrop7a3V8J

    imgcpd3=Limcrop7a3=8J

    imgcpd3&Limcrop7a3&8J

    imgcpdA(Limcrop7aA(8J

    imgcpdA%Limcrop7aA%8J

    imgcpdA3Limcrop7aA38J

    imgcpdAALimcrop7aAA8J

    imgcpdA;Limcrop7aA;8J

    imgcpdABLimcrop7aAB8J

    WW Train the network. p is the input vector and t is the target vector

    pLGimgcpd%,imgcpd3,imgcpdA,imgcpd;,imgcpdB,imgcpd',imgcpdV,imgcpd=,imgcpd&,

    imgcpd%(,imgcpd%%,imgcpd%3,imgcpd%A,imgcpd%;,imgcpd%B,imgcpd%',imgcpd%V,

    imgcpd%=,imgcpd%&,imgcpd3(,imgcpd3%,imgcpd33,imgcpd3A,imgcpd3;,imgcpd3B,

    imgcpd3',imgcpd3V,imgcpd3=,imgcpd3&,imgcpdA(,imgcpdA%,imgcpdA3,imgcpdAA,

    imgcpdA;,imgcpdABHJ

    tLGimgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,

    imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,

    imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%HJ

    pLdouble7p8J

    tLdouble7t8J

    netLnewrb7p,t,.((%,.%,%(((,3B8JWtrain using /@5 net

    save7Dgesture%netD,DnetD8J Wsave the network
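A quick sanity check of the stored network (a sketch, assuming the files above are on the MATLAB path) is to simulate it on one of its own training poses; the output should be close to the cropped sample itself, since pose 1 is also the target:

    load('gesture1net', 'net');                    % restores the variable 'net'
    sample = imgcrop(imread('firsgesture1.png'));  % training pose 1 (also the target)
    out = sim(net, sample);                        % simulate the RBF net
    fprintf('mean abs deviation: %g\n', mean(abs(out(:) - sample(:))));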

    2. PROGRAM TO CROP THE CAPTURED IMAGE

%% MATLAB function imgcrop to crop the captured image
function imgout = imgcrop(imgin)          % function definition
imgin = imresize(imgin, [240,240]);       % resize the original image
columnsum = sum(imgin);                   % find the sum of column entries
rowsum = sum(imgin');                     % find the sum of row entries
q1 = 1;
q2 = 240;
q3 = 1;
q4 = 240;
for w = 1:240                             % search for the left vertical cropping boundary
    if (columnsum(1,w) >= 10)
        q1 = w;
        break;
    end
end
for w = q1:240                            % search for the right vertical cropping boundary
    if (columnsum(1,w) >= 10)
        q2 = w;
    end
end
for w = 1:240                             % search for the upper horizontal cropping boundary
    if (rowsum(1,w) >= 10)
        q3 = w;
        break;
    end
end
for w = q3:240                            % search for the lower horizontal cropping boundary
    if (rowsum(1,w) >= 50)
        q4 = w;
    end
end
% crop the image between the boundaries, pad and resize to the working size
imgout = imgin(q3:q4, q1:q2);
imgout = [zeros(q4-q3+1, 160-(q2-q1+1)), imgout; zeros(120-(q4-q3+1), 160)];
imgout = imresize(imgout, [120,160]);
imgout = double(imgout);                  % convert the image type to double for future use
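A typical call looks as follows (a sketch; the 0.75 binarization threshold is the one used in the real time script of section 3):

    bw  = im2bw(imread('refimg1.png'), .75);  % binarize a captured or stored frame
    roi = imgcrop(bw);                        % 120x160 double with the hand region isolated
    imshow(roi);                              % inspect the cropping result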

    ". PROGRAM OR REAL TIME HAND TRACKING AND GESTURE

    IDENTIICATION

%% MATLAB program for real time hand tracking and gesture identification
clc;
clear all;
close all;
q = 0;                                    % initialize the switch variable
a = arduino('COM4');                      % initialize the arduino object
a.pinMode(13,'output');                   % define pin 13 as output pin
a.pinMode(12,'output');                   % define pin 12 as output pin
a.pinMode(11,'output');                   % define pin 11 as output pin
a.pinMode(10,'output');                   % define pin 10 as output pin

closepreview;
vid = videoinput('winvideo', 1, 'YUY2_320x240'); % specify the video adaptor
src = getselectedsource(vid);
vid.ReturnedColorspace = 'grayscale';     % define the color format to GRAYSCALE
vid.FramesPerTrigger = 1;
preview(vid);                             % preview the video object

% load the network for each gesture (each .mat file stores a variable 'net')
g1 = load('gesture1net'); gesture1net = g1.net;
g2 = load('gesture2net'); gesture2net = g2.net;
g3 = load('gesture3net'); gesture3net = g3.net;
g4 = load('gesture4net'); gesture4net = g4.net;

imgref1 = imread('refimg1.png');          % load the reference image for each gesture
imgref2 = imread('refimg2.png');
imgref3 = imread('refimg3.png');
imgref4 = imread('refimg4.png');
imgcrpd1 = imgcrop(imgref1);              % crop the reference images
imgcrpd2 = imgcrop(imgref2);
imgcrpd3 = imgcrop(imgref3);
imgcrpd4 = imgcrop(imgref4);
imgtrained1 = sim(gesture1net, imgcrpd1); % pass the cropped references through the nets
imgtrained2 = sim(gesture2net, imgcrpd2);
imgtrained3 = sim(gesture3net, imgcrpd3);
imgtrained4 = sim(gesture4net, imgcrpd4);

while(1)
    preview(vid);                         % preview the video object
    currentimg = getsnapshot(vid);        % capture the image of interest
    currentimg = im2bw(currentimg, .75);  % convert the captured image to binary
    imshow(currentimg);                   % display the binary image
    cropedimg = imgcrop(currentimg);      % crop the binary image

    y = sim(gesture1net, cropedimg);      % simulate using the first net
    corrwgt1 = xcorr2(imgtrained1, y);    % find out the correlation
    corrwgt1 = im2col(corrwgt1, [1 1]);   % correlation weights to a column vector
    corrwgt1 = sum(corrwgt1);             % take the column value sum
    y = sim(gesture2net, cropedimg);      % simulate using the second net
    corrwgt2 = xcorr2(imgtrained2, y);
    corrwgt2 = im2col(corrwgt2, [1 1]);
    corrwgt2 = sum(corrwgt2);
    y = sim(gesture3net, cropedimg);      % simulate using the third net
    corrwgt3 = xcorr2(imgtrained3, y);
    corrwgt3 = im2col(corrwgt3, [1 1]);
    corrwgt3 = sum(corrwgt3);
    y = sim(gesture4net, cropedimg);      % simulate using the fourth net
    corrwgt4 = xcorr2(imgtrained4, y);
    corrwgt4 = im2col(corrwgt4, [1 1]);
    corrwgt4 = sum(corrwgt4);
    corrwghtmatrix = [corrwgt1 corrwgt2 corrwgt3 corrwgt4];

    % define the switch variable by thresholding the correlation weights
    % (the numeric thresholds are empirical values tuned for this camera,
    % lighting and training set)
    w = corrwghtmatrix;
    if (w(1) > 3e7)
        q = 1;                            % gesture 1 -> forward
    elseif (w(3) > 1e7 && w(3) < 6e7)
        q = 3;                            % gesture 3 -> right
    elseif (w(2) > 1e7)
        q = 2;                            % gesture 2 -> backward
    elseif (w(4) > 6e7)
        q = 4;                            % gesture 4 -> left
    else
        q = 0;                            % no gesture recognized
    end

    % generation of control signals to the motor driver using the switch method
    switch(q)
        case 1
            display('FORWARD');
            a.digitalWrite(13,0);
            a.digitalWrite(12,1);
            a.digitalWrite(11,0);
            a.digitalWrite(10,1);
        case 2
            display('BACKWARD');
            a.digitalWrite(13,1);
            a.digitalWrite(12,0);
            a.digitalWrite(11,1);
            a.digitalWrite(10,0);
        case 3
            display('RIGHT');
            a.digitalWrite(13,0);
            a.digitalWrite(12,1);
            a.digitalWrite(11,0);
            a.digitalWrite(10,0);
        case 4
            display('LEFT');
            a.digitalWrite(13,0);
            a.digitalWrite(12,0);
            a.digitalWrite(11,0);
            a.digitalWrite(10,1);
        otherwise
            display('NO MOTION');
            a.digitalWrite(13,0);
            a.digitalWrite(12,0);
            a.digitalWrite(11,0);
            a.digitalWrite(10,0);
    end
end
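The matching step of the loop above reduces each gesture hypothesis to a single scalar; in isolation (a minimal sketch with stand-in random arrays rather than project data) it reads:

    ref = rand(120,160);                           % stands in for a stored imgtrained output
    img = ref + 0.05*rand(120,160);                % stands in for the net output on a new frame
    score = sum(im2col(xcorr2(ref, img), [1 1]))   % one correlation weight, as in the loop

The gesture whose weight clears its threshold wins; everything else maps to NO MOTION.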

4. PROGRAM TO BE BURNED ON ARDUINO BOARD


adiosrv.pde

/* Analog and Digital Input and Output Server for MATLAB */

/* This file is meant to be used with the MATLAB arduino IO package; however, it can be
used from the IDE environment (or any other serial terminal) by typing commands like:
0e0 : assigns digital pin #4 (e) as input
0f1 : assigns digital pin #5 (f) as output
0n1 : assigns digital pin #13 (n) as output
1c  : reads digital pin #2 (c)
1e  : reads digital pin #4 (e)
2n0 : sets digital pin #13 (n) low
2n1 : sets digital pin #13 (n) high
2f1 : sets digital pin #5 (f) high
2f0 : sets digital pin #5 (f) low
4j2 : sets digital pin #9 (j) to 50=ascii(2) over 255
4jz : sets digital pin #9 (j) to 122=ascii(z) over 255
3a  : reads analog pin #0 (a)
3f  : reads analog pin #5 (f)
R0  : sets analog reference to DEFAULT
R1  : sets analog reference to INTERNAL
R2  : sets analog reference to EXTERNAL */

/* Nothing gets executed in the loop if nothing is available to be read, but as soon as
anything becomes available, then the part coded after the if statement (that is, the
real stuff) gets executed */

if (Serial.available() > 0) {

    /* whatever is available from the serial is read here */
    val = Serial.read();

    /* This part basically implements a state machine that reads the serial port and makes just one
       transition to a new state, depending on both the previous state and the command that is read
       from the serial port. Some commands need additional inputs from the serial port, so they need
       2 or 3 state transitions (each one happening as soon as anything new is available from the
       serial port) to be fully executed. After a command is fully executed the state returns to its
       initial value s=-1 */

    switch (s) {

        /* s=-1 means NOTHING RECEIVED YET ***************** */
        case -1:
            /* calculate next state */
            if (val > 47 && val < 90) {
                /* the first received value indicates the mode: 49 is ascii for 1, ... 90 is ascii
                   for Z; s=0 is change-pin mode; s=10 is DI; s=20 is DO; s=30 is AI; s=40 is AO;
                   s=90 is query script type (1 basic, 2 motor); s=340 is change analog reference */
                s = 10 * (val - 48);
            }
            /* the following statements are needed to handle unexpected first values coming from
               the serial (if the value is unrecognized then it defaults to s=-1) */
            if ((s > 40 && s < 90) || (s > 90 && s != 340)) {
                s = -1;
            }
            /* the break statement gets out of the switch-case, so
               we go back and wait for new serial data */
            break; /* s=-1 (initial state) taken care of */

        /* s=0 or 1 means CHANGE PIN MODE */
        case 0:
            /* the second received value indicates the pin from abs('c')=99, pin 2,
               to abs('t')=116, pin 19 */
            if (val > 98 && val < 117) {
                pin = val - 97;  /* calculate pin */
                s = 1;           /* next we will need to get 0 or 1 from serial */
            }
            else {
                s = -1;          /* if value is not a pin then return to -1 */
            }
            break; /* s=0 taken care of */

        case 1:
            /* the third received value indicates the value 0 or 1 */
            if (val > 47 && val < 50) {
                /* set pin mode */
                if (val == 48) {
                    pinMode(pin, INPUT);
                }
                else {
                    pinMode(pin, OUTPUT);
                }
            }
            s = -1; /* we are done with CHANGE PIN so go to -1 */
            break;  /* s=1 taken care of */

        /* s=10 means DIGITAL INPUT ************************* */

        case 10:
            /* the second received value indicates the pin from abs('c')=99, pin 2,
               to abs('t')=116, pin 19 */
            if (val > 98 && val < 117) {
                pin = val - 97;          /* calculate pin */
                dgv = digitalRead(pin);  /* perform Digital Input */
                Serial.println(dgv);     /* send value via serial */
            }
            s = -1; /* we are done with DI so next state is -1 */
            break;  /* s=10 taken care of */

        /* s=20 or 21 means DIGITAL OUTPUT ****************** */

        case 20:
            /* the second received value indicates the pin from abs('c')=99, pin 2,
               to abs('t')=116, pin 19 */
            if (val > 98 && val < 117) {
                pin = val - 97;  /* calculate pin */
                s = 21;          /* next we will need to get 0 or 1 from serial */
            }
            else {
                s = -1;          /* if value is not a pin then return to -1 */
            }
            break; /* s=20 taken care of */

        case 21:
            /* the third received value indicates the value 0 or 1 */
            if (val > 47 && val < 50) {
                dgv = val - 48;          /* calculate value */
                digitalWrite(pin, dgv);  /* perform Digital Output */
            }
            s = -1; /* we are done with DO so next state is -1 */
            break;  /* s=21 taken care of */

        /* s=30 means ANALOG INPUT ************************** */

        case 30:
            /* the second received value indicates the pin from abs('a')=97, pin 0, to
               abs('f')=102, pin 5; note that these are the digital pins from 14 to 19
               located in the lower right part of the board */
            if (val > 96 && val < 103) {
                pin = val - 97;         /* calculate pin */
                agv = analogRead(pin);  /* perform Analog Input */
                Serial.println(agv);    /* send value via serial */
            }
            s = -1; /* we are done with AI so next state is -1 */
            break;  /* s=30 taken care of */

        /* s=40 or 41 means ANALOG OUTPUT ******************* */

        case 40:
            /* the second received value indicates the pin from abs('c')=99, pin 2,
               to abs('t')=116, pin 19 */
            if (val > 98 && val < 117) {
                pin = val - 97;  /* calculate pin */
                s = 41;          /* next we will need to get a value from serial */
            }
            else {
                s = -1;          /* if value is not a pin then return to -1 */
            }
            break; /* s=40 taken care of */

        case 41:
            /* the third received value indicates the analog value */
            analogWrite(pin, val);  /* perform Analog Output */
            s = -1; /* we are done with AO so next state is -1 */
            break;  /* s=41 taken care of */

        /* s=90 means Query Script Type (1 basic, 2 motor) * */
        case 90:
            if (val == 57) {
                /* if the string sent is 99 send the script type via serial */
                Serial.println(1);
            }
            s = -1; /* we are done with this so next state is -1 */
            break;  /* s=90 taken care of */

        /* s=340 or 341 means ANALOG REFERENCE ************* */
        case 340:
            /* the second received value indicates the reference, which is encoded as
               0, 1, 2 for DEFAULT, INTERNAL and EXTERNAL respectively */
            switch (val) {
                case 48: analogReference(DEFAULT);  break;  /* 48 is ascii '0' */
                case 49: analogReference(INTERNAL); break;
                case 50: analogReference(EXTERNAL); break;
                default: break;
            }
            s = -1; /* we are done with this so next state is -1 */
            break;  /* s=340 taken care of */

    } /* end switch on state s */
  } /* end if serial available */
} /* end loop statement */
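Because the sketch only parses plain ASCII commands, the protocol documented in its header can be exercised from MATLAB even without the IO package. This is a minimal sketch: 'COM4' matches the main script above, while the 115200 baud rate is an assumption about the sketch's Serial.begin setting:

    s = serial('COM4', 'BaudRate', 115200);  % baud rate assumed; adjust to the sketch
    fopen(s);
    fprintf(s, '0n1');  % assign digital pin #13 (n) as output
    fprintf(s, '2n1');  % set digital pin #13 (n) high
    fprintf(s, '2n0');  % set digital pin #13 (n) low
    fclose(s);
    delete(s);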

CHAPTER 10 : CHALLENGES FACED

There are many challenges associated with the accuracy and usefulness of gesture recognition software. For image-based gesture recognition there are limitations on the equipment used and on image noise. Images or video may not be under consistent lighting, or in the same location. Items in the background or distinct features of the users may make recognition more difficult. The variety of implementations for image-based gesture recognition may also cause issues for the viability of the technology in general usage. For example, an algorithm calibrated for one camera may not work for a different camera. The amount of background noise also causes tracking and recognition difficulties, especially when occlusions (partial and full) occur. Furthermore, the distance from the camera, and the camera's resolution and quality, also cause variations in recognition accuracy. In order to capture human gestures by visual sensors, robust computer vision methods are also required, for example for hand tracking and hand posture recognition, or for capturing movements of the head, facial expressions or gaze direction. The major challenges faced in the completion of the project were:

Real-time tracking of the gesture in a noisy atmosphere.

Setting of the threshold value for gesture classification.

Time taken for training a gesture.

Variation in illumination of the surroundings.

CHAPTER 11 : RESULTS, ANALYSIS, CONCLUSION AND FUTURE WORKS

    RESULTS AND ANALYSIS

We have conducted experiments based on images we have acquired using a simple webcam, and we have collected these data on uniform backgrounds. Figure 10 illustrates the real time hand tracking and gesture identification for the command of forward motion.

Our algorithm leads to the correct recognition of all four gestures.

Also, the computation time needed to obtain these results is very small, since the algorithm is quite simple. Thirty five images of a single gesture were taken for the training purpose,

and the network was formed. The real time image was captured by the webcam and given to the net for classification. Based on the stored database, the neural net returns the classified result. This neural net classification output was used to drive the hardware, interfaced with the help of the Arduino. The control commands were sent to the Arduino directly from MATLAB, and on the basis of these commands the motion of the ROBOT was adjusted.

We have noted that images taken under insufficient light have led to incorrect results. In these cases the failure mainly stems from the

erroneous segmentation of some background portions as the hand region.


Our algorithm appears to perform well on uniform backgrounds with appropriate illumination. Overall, we find the performance of this simple algorithm quite satisfactory in the context of our motivating robot control application.

    CONCLUSION AND FUTURE WORKS

We proposed a fast and simple algorithm for hand gesture recognition for controlling a robot, and we have demonstrated the effectiveness of this computationally efficient algorithm on real images we have acquired. In our system of gesture controlled robots, we have only considered a limited number of gestures. Our algorithm can be extended in a number of ways to recognize a broader set of gestures. The gesture recognition portion of our algorithm is still simple, and would need to be improved if this technique were to be used in challenging operating conditions. Reliable performance of hand gesture recognition techniques in a general setting requires dealing with occlusions, temporal tracking for recognizing dynamic gestures, as well as 3D modeling of the hand, which are still

mostly beyond the current state of the art. The advantage of using neural networks is that conclusions can be drawn from the network output: if a vector is not classified correctly, we can check its output and work out a solution. Even with limited processing power, very efficient algorithms can still be designed: an advanced DSP processor can reduce the size of the module, recognize "static" gestures more reliably, and support other biometric uses. Our software has been designed to be reusable, so that more complex behaviors may be added to our work. Because we limited ourselves to low processing power, our work could easily be made more performant by adding a state-of-the-art processor. The use of a real embedded OS could improve our system in terms of speed and stability. In addition, implementing more sensor modalities would improve robustness even in very complex scenes. Our system has shown that interaction with machines through gestures is a feasible task, and the set of detected gestures could be extended with more commands by implementing a more complex model of an advanced vehicle, usable not only in a limited space but also in broader areas such as roads. In the future, a service robot could execute many different tasks, from private movement to a fully-fledged advanced automotive that can enable the disabled in every sense.

    SNAPSHOTS
