Hand Gesture Imp
7/24/2019 Hand Gesture Imp
1/46
CHAPTER 1 : INTRODUCTION
Nowadays, robots are used in different domains ranging from search and rescue in dangerous environments to interactive entertainment. The more robots are employed in our daily life, the more natural communication with them is required. Since the introduction of the most common computer input devices, not a lot has changed. This is probably because the existing devices are adequate. It is also now that computers have been so tightly integrated with everyday life that new applications and hardware are constantly introduced. The means of communicating with computers at the moment are limited to keyboards, mice, light pens, trackballs, keypads, etc. These devices have grown to be familiar but inherently limit the speed and naturalness with which we interact with the computer. On the other hand, hand gesture, as a natural interface means, has been attracting much attention for interactive communication with robots in recent years.
As the computer industry has followed Moore's Law since the mid 1960s, powerful machines are built equipped with more peripherals. Vision based interfaces are feasible and at the present moment the computer is able to "see". Hence users are allowed richer and more user-friendly man-machine interaction. This can lead to new interfaces that will allow the deployment of new commands that are not possible with the current input devices. Plenty of time will be saved as well. Recently, there has been a surge of interest in recognizing human hand gestures. In this context, vision based hand detection and tracking techniques are used to provide an efficient real time interface with the robot. However, the problem of visual hand recognition and tracking is quite challenging. Many early approaches used position markers or colored gloves to make the problem of hand recognition easier, but due to their inconvenience, they cannot be considered a natural interface for robot control. Thanks to the latest advances in the computer vision field, recent vision based approaches do not need any extra hardware except a camera. These techniques can be categorized as model based and appearance based methods. While model based techniques can recognize the hand motion and its shape exactly, they are computationally expensive and therefore infeasible for a real time control application.
The appearance based techniques, on the other hand, are faster but still deal with some issues such as:
the complex nature of the hand with more than 20 DOF
cluttered and variant backgrounds
variation in lighting conditions
real time computational demand
This project on the one hand addresses the solution to the mentioned issues in the hand recognition problem using 2D images, and on the other hand proposes an innovative natural commanding system for Human Robot Interaction (HRI).
Hand gesture recognition has various applications like computer games, machinery control (e.g. crane), and thorough mouse replacement. One of the most structured sets of gestures belongs to sign language. In sign language, each gesture has an assigned meaning (or meanings). Computer recognition of hand gestures may provide a more natural human-computer interface, allowing people to point, or rotate a CAD model by rotating their hands. Hand
gestures can be classified in two categories: static and dynamic. A static gesture is a particular hand configuration and pose, represented by a single image. A dynamic gesture is a moving gesture, represented by a sequence of images. We will focus on the recognition of static images. Interactive applications pose particular challenges. The response time should be very fast: the user should sense no appreciable delay between when he or she makes a gesture or motion and when the computer responds. The computer vision algorithms should be reliable and work for different people. There are also economic constraints: the vision-based interfaces will be replacing existing ones, which are often very low cost. Even for added functionality, consumers may not want to spend more, and when additional hardware is needed the cost is considerably higher.
CHAPTER 2 : SYSTEM DESCRIPTION
The system developed for hand based robot control consists of the set-up of the robot, a 2D imaging system and a control application.
1. BLOCK DIAGRAM
In the project we use a video web camera for gesture identification. It sniffs frames of the live video stream at some time interval. In our case the frame capture rate for gesture search is 1 frame per two seconds.
For video capturing, the camera used has a resolution of 640×480; the obtained video is in YCbCr format. Using a MATLAB program we convert it into grayscale format. Using MATLAB we take snapshots of the obtained video at an interval of two seconds.
The size of the obtained image is 640×480.
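This grayscale-conversion step can be illustrated outside MATLAB as well. The Y (luma) channel of a YCbCr stream is already the grayscale intensity, and the same value can be computed from RGB with the BT.601 weights. A minimal sketch in Python/NumPy (illustrative only, not the report's MATLAB code):

```python
import numpy as np

def rgb_to_gray(rgb):
    """Grayscale via the BT.601 luma weights -- the same Y component a
    YCbCr camera stream already carries."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb @ weights).astype(np.uint8)

# A 1x2 test image: one white pixel and one black pixel.
img = np.array([[[255, 255, 255], [0, 0, 0]]], dtype=np.float64)
gray = rgb_to_gray(img)
print(gray.tolist())  # [[255, 0]]
```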
Similarly, all four gestures are trained to obtain four networks respectively. The networks are saved in the MATLAB workspace. This acts as the database for classification.
The proposed technique to control the robotic system using hand gesture display is divided into four subparts:
Capture a frame containing some gesture presentation.
Extract the hand gesture area from the captured frame.
Determine the gesture by pattern matching using a radial basis function.
Determine the control instruction corresponding to the matched gesture, and give that instruction to the specified robotic system.
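The four subparts above can be sketched as a control loop. Everything in this Python sketch is hypothetical scaffolding: the frame source, the dictionary "classifier" and the command codes are stand-ins for illustration, not the report's trained radial basis network.

```python
# Hypothetical sketch of the four-step gesture-to-command loop.
COMMANDS = {"forward": "F", "back": "B", "left": "L", "right": "R"}

def capture_frame(source):
    """Step 1: take the next frame from a frame source (here, a list)."""
    return next(source, None)

def extract_gesture_area(frame):
    """Step 2: crop the hand area (here, just pick the labelled region)."""
    return frame["hand_region"]

def classify_gesture(region):
    """Step 3: pattern-match the region to a gesture name (stand-in)."""
    return region  # in the real system, the RBF network decides this

def command_for(gesture):
    """Step 4: map the matched gesture to a robot instruction."""
    return COMMANDS.get(gesture, "STOP")

frames = iter([{"hand_region": "forward"}, {"hand_region": "left"}])
sent = []
while (frame := capture_frame(frames)) is not None:
    gesture = classify_gesture(extract_gesture_area(frame))
    sent.append(command_for(gesture))

print(sent)  # ['F', 'L']
```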
Now for real time execution, snapshots of the video are taken. Then appropriate cropping is done and the result is given to the neural network. Gesture classification is done by the radial basis function. The correlation property between the images is used here, i.e. the cross-correlation is calculated. This determines the threshold level for gesture classification. The neural net classification output is used to drive the hardware. An Arduino is used for hardware interfacing. The control commands are sent to the Arduino from MATLAB directly. On the basis of this command the motion of the robot is adjusted.
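The thresholded cross-correlation test described above can be sketched as follows. This is an illustrative Python version (the report's implementation is in MATLAB, and the threshold value here is hypothetical):

```python
import numpy as np

def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equal-size images.
    Returns a value in [-1, 1]; 1 means a perfect match."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom)

template = np.array([[0., 1.], [1., 0.]])
same     = np.array([[0., 2.], [2., 0.]])   # same pattern, brighter
other    = np.array([[1., 0.], [0., 1.]])   # opposite pattern

THRESHOLD = 0.9  # hypothetical acceptance level
print(normalized_cross_correlation(template, same) >= THRESHOLD)   # True
print(normalized_cross_correlation(template, other) >= THRESHOLD)  # False
```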
Thus, from the corpus of gestures, the specific gesture of interest can be identified, and on the basis of that, the specific command for execution of an action can be given to the robotic system.
In our project the software platform used is MATLAB 10. The various sections of the system, like gesture recognition, the radial basis function, image processing and interfacing using Arduino, are explained in the forthcoming sections.
2. WEBCAM
A webcam is a video camera that feeds its images in real time to a computer or computer network, often via USB, Ethernet, or Wi-Fi. Their most popular use is the establishment of video links, permitting computers to act as videophones or videoconference stations. The common use as a video camera for the World Wide Web gave the webcam its name. Other popular uses include security surveillance, computer vision, video broadcasting and recording social videos. Webcams are known for their low manufacturing cost and flexibility, making them the lowest cost form of videotelephony. They have also become a source of security and privacy issues, as some built-in webcams can be remotely activated via spyware.
Webcams typically include a lens, an image sensor, support electronics, and may also include a microphone for sound. Various lenses are available, the most common in consumer-grade webcams being a plastic lens that can be screwed in and out to focus the camera. Fixed focus lenses, which have no provision for adjustment, are also available. As a camera system's depth of field is greater for small image formats and for lenses with a large f-number (small aperture), the systems used in webcams have a sufficiently large depth of field that the use of a fixed focus lens does not impact image sharpness to a great extent.
2.1. Technology
Image sensors can be CMOS or CCD, the former being dominant for low-cost cameras, but CCD cameras do not necessarily outperform CMOS-based cameras in the low cost price range. Most consumer webcams are capable of providing VGA resolution video at a frame rate of 30 frames per second. Many newer devices can produce video in multi-megapixel resolutions, and a few can run at high frame rates, such as the PlayStation Eye, which can produce 320×240 video at 120 frames per second.
Support electronics read the image from the sensor and transmit it to the host computer. The camera pictured to the right, for example, uses a Sonix SN9C101 to transmit its image over USB. Some cameras, such as mobile phone cameras, use a CMOS sensor with supporting electronics "on die", i.e. the sensor and the support electronics are built on a single silicon chip to save space and manufacturing costs. Most webcams feature built-in microphones to make video calling and videoconferencing more convenient. The USB video device class (UVC) specification allows for interconnectivity of webcams to computers without the need for proprietary device drivers. Microsoft Windows
3.11. Large Object Tracking
In some interactive applications, the computer needs to track the position or orientation of a hand that is prominent in the image. Relevant applications might be computer games, or interactive machine control. In such cases, a description of the overall properties of the image may be adequate. Image moments, which are fast to compute, provide a very coarse summary of global averages of orientation and position. If the hand is on a uniform background, this method can distinguish hand positions and simple pointing gestures. The large-object-tracking method makes use of a low-cost detector/processor to quickly calculate moments. This is called the artificial retina chip. This chip combines image detection with some low-level image processing (named artificial retina by analogy with those combined abilities of the human retina). The chip can compute various functions useful in fast algorithms for interactive graphics applications.
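The image-moment summary described above can be sketched in a few lines (a Python illustration, not the artificial retina chip's implementation): the centroid comes from the raw moments, and the orientation of the principal axis from the second-order central moments.

```python
import numpy as np

def centroid_and_orientation(img):
    """Centroid and principal-axis orientation of a binary image,
    computed from raw and central image moments."""
    ys, xs = np.nonzero(img)
    cx, cy = xs.mean(), ys.mean()                 # centroid (m10/m00, m01/m00)
    mu11 = ((xs - cx) * (ys - cy)).sum()          # central moments
    mu20 = ((xs - cx) ** 2).sum()
    mu02 = ((ys - cy) ** 2).sum()
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
    return (cx, cy), theta

# A small diagonal blob: its principal axis should be at about 45 degrees.
img = np.eye(5)
(cx, cy), theta = centroid_and_orientation(img)
print((cx, cy))                   # (2.0, 2.0)
print(round(np.degrees(theta)))   # 45
```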
3.12. Shape Recognition
Most applications, such as recognizing a particular static hand signal, require a richer description of the shape of the input object than image moments provide. If the hand signals fall in a predetermined set, and the camera views a close-up of the hand, we may use an example-based approach, combined with a simple method to analyze hand signals called orientation histograms. These example-based applications involve two phases: training and running. In the training phase, the user shows the system one or more examples of a specific hand shape. The computer forms and stores the corresponding orientation histograms. In the
run phase, the computer compares the orientation histogram of the current image with each of the stored templates and selects the category of the closest match, or interpolates between templates, as appropriate. This method should be robust against small differences in the size of the hand but would probably be sensitive to changes in hand orientation. Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even GUIs (graphical user interfaces), which still limit the majority of input to keyboard and mouse. Gesture recognition enables humans to interface with the machine (HMI) and interact naturally without any mechanical devices. Using the concept of gesture recognition, it is possible to point a finger at the computer screen so that the cursor will move accordingly. This could potentially make conventional input devices such as mice, keyboards and even touch-screens redundant.
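The training/run scheme for orientation histograms can be sketched as below. This is a toy Python illustration with synthetic edge images; the bin count and the Euclidean distance measure are illustrative choices, not necessarily those of the original system.

```python
import numpy as np

def orientation_histogram(img, bins=8):
    """Histogram of gradient orientations, weighted by gradient magnitude:
    a crude, translation-insensitive shape signature."""
    gy, gx = np.gradient(img.astype(float))
    angles = np.arctan2(gy, gx)          # in -pi..pi
    mags = np.hypot(gx, gy)
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi),
                           weights=mags)
    total = hist.sum()
    return hist / total if total else hist

def closest_template(probe, templates):
    """Run phase: pick the stored template with the nearest histogram."""
    probe_hist = orientation_histogram(probe)
    return min(templates,
               key=lambda name: np.linalg.norm(probe_hist - templates[name]))

# Training phase (toy): one vertical-edge and one horizontal-edge image.
vert = np.zeros((8, 8)); vert[:, 4:] = 1.0
horz = np.zeros((8, 8)); horz[4:, :] = 1.0
templates = {"vertical": orientation_histogram(vert),
             "horizontal": orientation_histogram(horz)}

# A shifted vertical edge still matches "vertical" -- position-insensitive.
shifted_vert = np.zeros((8, 8)); shifted_vert[:, 2:] = 1.0
print(closest_template(shifted_vert, templates))  # vertical
```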
4. NEURAL NETWORKS
Neural networks are composed of simple elements operating in parallel. These elements are inspired by biological nervous systems. As in nature, the network function is determined largely by the connections between elements. We can train a neural network to perform a particular function by adjusting the values of the connections (weights) between elements. Commonly neural networks are adjusted, or trained, so that a particular input leads to a specific target output. Such a situation is shown in figure 3. There, the network is adjusted, based on a comparison of the output and the target, until the network output matches the target. Typically many such input/target pairs are used in this supervised learning to train a network.
Neural networks have been trained to perform complex functions in various fields of application including pattern recognition, identification, classification, speech, vision and control systems.
Today neural networks can be trained to solve problems that are difficult for conventional computers or human beings. Supervised training methods are commonly used, but other networks can be obtained from unsupervised training techniques or from direct design methods. Unsupervised networks can be used, for instance, to identify groups of data. Certain kinds of linear networks and Hopfield networks are designed directly. In summary, there are a variety of kinds of design and learning techniques that enrich the choices that a user can make. The field of neural networks has a history of some five decades but has found solid application only in the past fifteen years, and the field is still developing rapidly. Thus, it is distinctly different from the fields of control systems or optimization, where the terminology, basic mathematics, and design procedures have been firmly established and applied for many years.
4.1. Neuron Model
4.11. Simple Neuron
A neuron with a single scalar input and no bias is shown on the left below. The scalar input p is transmitted through a connection that multiplies its strength by the scalar weight w, to form the product wp, again a scalar. Here the weighted input wp is the only argument of the transfer function f, which produces the scalar output a. The neuron on the right has a scalar bias, b. You may view the bias as simply being added to the product wp, as shown by the summing junction, or as shifting the function f to the left by an amount b. The bias is much like a weight, except that it has a constant input of 1. The transfer function net input n, again a scalar, is the sum of the weighted input wp and the bias b. This sum is the argument of the transfer function f. Here f is a transfer function, typically a step function or a sigmoid function, that takes the argument n and produces the output a. Note that w and b are both adjustable scalar parameters of the neuron. The central idea of neural networks is that such parameters can be adjusted so that the network exhibits some desired or interesting behaviour.
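The single-neuron model above, a = f(wp + b), is small enough to write out directly. A Python sketch with a hard-limit (step) and a log-sigmoid transfer function:

```python
import math

def hardlim(n):
    """Hard-limit transfer function: 0 if n < 0, else 1."""
    return 0 if n < 0 else 1

def logsig(n):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-n))

def neuron(p, w, b, f):
    """Single neuron: scalar input p, weight w, bias b, transfer f.
    Net input n = w*p + b; output a = f(n)."""
    return f(w * p + b)

print(neuron(p=0.5, w=2.0, b=-2.0, f=hardlim))       # 0  (n = -1)
print(neuron(p=2.0, w=2.0, b=-2.0, f=hardlim))       # 1  (n = 2)
print(round(neuron(p=0.0, w=1.0, b=0.0, f=logsig), 2))  # 0.5
```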
Thus, we can train the network to do a particular job by adjusting the weight or bias parameters, or perhaps the network itself will adjust these parameters to achieve some desired end. All of the neurons in the program written in MATLAB have a bias. However, you may omit a bias in a neuron if you wish.
4.12. Transfer Functions
Three of the most commonly used transfer functions are shown in figure (6).
The hard limit transfer function shown above limits the output of the neuron to either 0, if the net input argument n is less than 0, or 1, if n is greater than or equal to 0. This is the function used in the radial basis function algorithm written in MATLAB to create neurons that make a classification decision.
There are two modes of learning: supervised and unsupervised. Below there is a brief description of each one, to determine the best one for our problem.
4.13. Supervised Learning
Supervised learning is based on the system trying to predict outcomes for known examples and is a commonly used training method. It compares its predictions to the target answer and "learns" from its mistakes. The data start as inputs to the input layer neurons. The neurons pass the inputs along to the next nodes. As inputs are passed along, the weighting, or connection, is applied, and when the inputs reach the next node the weightings are summed and either intensified or weakened. This continues until the data reach the output layer, where the model predicts an outcome. In a supervised learning system, the predicted output is compared to the actual output for that case. If the predicted output is equal to the actual output, no change is made to the weights in the system. But if the predicted output is higher or lower than the actual outcome in the data, the error is propagated back through the system and the weights are adjusted accordingly. This feeding of errors backwards through the network is called "back-propagation."
Both the Multi-Layer Perceptron and the Radial Basis Function are supervised learning techniques. The Multi-Layer Perceptron uses back-propagation, while the Radial Basis Function is a feed-forward approach which trains on a single pass of the data.
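The error-correcting update described above can be shown in its simplest form: a single linear neuron trained by the delta rule. Full multi-layer back-propagation extends the same idea through hidden layers via the chain rule; this toy Python sketch is an illustration, not the report's MATLAB code.

```python
def train_step(w, b, p, target, lr=0.1):
    """One supervised update: predict, compare with the target, and move
    the weight and bias against the error ("learns from its mistakes")."""
    a = w * p + b                  # linear prediction
    error = target - a
    return w + lr * error * p, b + lr * error

# Learn the mapping a = 2*p + 1 from a handful of examples.
w, b = 0.0, 0.0
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (-1.0, -1.0)]
for _ in range(200):               # repeated passes over the data
    for p, t in data:
        w, b = train_step(w, b, p, t)

print(round(w, 2), round(b, 2))    # converges close to 2.0 and 1.0
```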
4.14. Unsupervised Learning
Neural networks which use unsupervised learning are most effective for describing data rather than predicting it. The neural network is not shown any outputs or answers as part of the training process; in fact, there is no concept of output fields in this type of system. The primary unsupervised technique is the Kohonen network. The main uses of Kohonen and other unsupervised neural systems are in cluster analysis, where the goal is to group "like" cases together. The advantage of the neural network for this type of analysis is that it requires no initial assumptions about what constitutes a group or how many groups there are. The system starts with a clean slate and is not biased about which factors should be most important.
4.2. Advantages of Neural Computing
There are a variety of benefits that an analyst realizes from using neural networks in their work.
Pattern recognition is a powerful technique for harnessing the information in the data and generalizing about it. Neural nets learn to recognize the patterns which exist in the data set.
The system is developed through learning rather than programming. Programming is much more time consuming for the analyst and requires the analyst to specify the exact behaviour of the model. Neural nets teach themselves the patterns in the data, freeing the analyst for more interesting work.
Neural networks are flexible in a changing environment. Rule based systems or programmed systems are limited to the situation for which they were designed; when conditions change, they are no longer valid. Although neural networks may take some time to learn a sudden drastic change, they are excellent at adapting to constantly changing information.
Neural networks can build informative models where more conventional approaches fail. Because neural networks can handle very complex interactions, they can easily
model data which is too difficult to model with traditional approaches such as inferential statistics or programming logic.
Performance of neural networks is at least as good as classical statistical modeling, and better on most problems. Neural networks build models that are more reflective of the structure of the data in significantly less time.
Neural networks now operate well with modest computer hardware. Although neural networks are computationally intensive, the routines have been optimized to the point that they can now run in reasonable time on personal computers. They do not require supercomputers as they did in the early days of neural network research.
4.3. Limitations of Neural Computing
There are some limitations to neural computing. The key limitation is the neural network's inability to explain the model it has built in a useful way. Analysts often want to know why the model is behaving as it is. Neural networks get better answers but they have a hard time explaining how they got there.
There are a few other limitations that should be understood. First, it is difficult to extract rules from neural networks. This is sometimes important to people who have to explain their answer to others and to people who have been involved with artificial intelligence, particularly expert systems, which are rule-based.
As with most analytical methods, you cannot just throw data at a neural net and get a good answer. You have to spend time understanding the problem or the outcome you are trying to predict. And you must be sure that the data used to train the system are appropriate and are measured in a way that reflects the behaviour of the factors. If the data are not representative of the problem, neural computing will not produce good results. This is a classic situation where "garbage in" will certainly produce "garbage out."
Finally, it can take time to train a model from a very complex data set. Neural techniques are computer intensive and will be slow on low end PCs or machines without math coprocessors. It is important to remember, though, that the overall time to results can still be faster than other data analysis approaches, even when the system takes longer to train. Processing speed alone is not the only factor in performance, and neural networks do not require the time for programming and debugging or testing assumptions that other analytical approaches do.
5. RADIAL BASIS FUNCTION
A Radial Basis Function (RBF) neural network has an input layer, a hidden layer and an output layer. The neurons in the hidden layer contain Gaussian transfer functions whose outputs are inversely proportional to the distance from the center of the neuron.
RBF networks are similar to K-Means clustering and PNN/GRNN networks. The main difference is that PNN/GRNN networks have one neuron for each point in the training file, whereas RBF networks have a variable number of neurons that is usually much less than the number of training points. For problems with small to medium size training sets, PNN/GRNN networks are usually more accurate than RBF networks, but PNN/GRNN networks are impractical for large training sets.
5.1. How RBF Networks Work
Although the implementation is very different, RBF neural networks are conceptually similar to K-Nearest Neighbor (k-NN) models. The basic idea is that the predicted target value of an item is likely to be about the same as for other items that have close values of the predictor variables. Consider this figure:
Assume that each case in the training set has two predictor variables, x and y. The cases are plotted using their x,y coordinates as shown in the figure. Also assume that the target variable has two categories: positive, denoted by a square, and negative, denoted by a dash. Now, suppose we are trying to predict the value of a new case represented by the triangle with predictor values x=6, y=5.1. Should we predict the target as positive or negative?
Notice that the triangle is positioned almost exactly on top of a dash representing a negative value. But that dash is in a fairly unusual position compared to the other dashes, which are clustered below the squares and left of center. So it could be that the underlying negative value is an odd case.
The nearest neighbor classification performed for this example depends on how many neighboring points are considered. If 1-NN is used and only the closest point is considered, then clearly the new point should be classified as negative, since it is on top of a known negative point. On the other hand, if 9-NN classification is used and the closest 9 points are considered, then the effect of the surrounding 8 positive points may overbalance the close negative point.
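The 1-NN versus 9-NN behaviour just described can be reproduced on a toy layout loosely mimicking the square/dash figure (a Python sketch; the point coordinates here are made up for illustration):

```python
import numpy as np
from collections import Counter

def knn_predict(train_pts, labels, query, k):
    """Classify `query` by majority vote among its k nearest neighbours."""
    d = np.linalg.norm(np.asarray(train_pts) - np.asarray(query), axis=1)
    nearest = np.argsort(d)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# One odd negative right under the query, eight positives around it.
pts = [(6.0, 5.0)] + [(6 + dx, 5 + dy)
                      for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                      if (dx, dy) != (0, 0)]
labels = ["neg"] + ["pos"] * 8
query = (6.0, 5.1)

print(knn_predict(pts, labels, query, k=1))  # neg -- only the closest counts
print(knn_predict(pts, labels, query, k=9))  # pos -- 8 positives outvote it
```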
An RBF network positions one or more RBF neurons in the space described by the predictor variables (x, y in this example). This space has as many dimensions as there are predictor variables. The Euclidean distance is computed from the point being evaluated (e.g., the
triangle in this figure) to the center of each neuron, and a radial basis function (RBF), also called a kernel function, is applied to the distance to compute the weight (influence) for each neuron. The radial basis function is so named because the radius distance is the argument to the function: Weight = RBF(distance).
The further a neuron is from the point being evaluated, the less influence it has.
Different types of radial basis functions could be used, but the most common is the Gaussian function:
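One common parameterization of the Gaussian kernel is exp(-d^2 / (2*spread^2)); the exact form used by DTREG may differ, so treat this Python sketch as illustrative:

```python
import math

def gaussian_rbf(distance, spread):
    """Gaussian radial basis function: the weight (influence) of a neuron
    decays with the radius distance from its centre."""
    return math.exp(-(distance ** 2) / (2 * spread ** 2))

print(gaussian_rbf(0.0, 1.0))            # 1.0 -- full weight at the centre
print(round(gaussian_rbf(1.0, 1.0), 3))  # 0.607
print(round(gaussian_rbf(3.0, 1.0), 3))  # 0.011 -- far points barely count
```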
If there is more than one predictor variable, then the RBF function has as many dimensions as there are variables. The following picture illustrates three neurons in a space with two predictor variables, X and Y. Z is the value coming out of the RBF functions:
The best predicted value for the new point is found by summing the output values of the RBF functions multiplied by weights computed for each neuron.
The radial basis function for a neuron has a center and a radius (also called a spread). The radius may be different for each neuron and, in RBF networks generated by DTREG, the radius may be different in each dimension.
With a larger spread, neurons at a distance from a point have a greater influence.
5.2. RBF Network Architecture
RBF networks have three layers:
1. Input layer: There is one neuron in the input layer for each predictor variable. In the case of categorical variables, N-1 neurons are used, where N is the number of categories. The input neurons (or processing before the input layer) standardize the range of the values by
subtracting the median and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden layer.
2. Hidden layer: This layer has a variable number of neurons (the optimal number is determined by the training process). Each neuron consists of a radial basis function centered on a point with as many dimensions as there are predictor variables. The spread (radius) of the RBF function may be different for each dimension. The centers and spreads are determined by the training process. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function to this distance using the spread values. The resulting value is passed to the summation layer.
3. Summation layer: The value coming out of a neuron in the hidden layer is multiplied by a weight associated with the neuron (W1, W2, ..., Wn in this figure) and passed to the summation, which adds up the weighted values and presents this sum as the output of the network. Not shown in this figure is a bias value of 1.0 that is multiplied by a weight W0 and fed into the summation layer. For classification problems, there is one output (and a separate set of weights and summation unit) for each target category. The value output for a category is the probability that the case being evaluated has that category.
An input vector x is used as input to all radial basis functions, each with different parameters. The output of the network is a linear combination of the outputs from the radial basis functions.
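That linear combination can be written out directly. In this Python sketch the centers, spreads and weights are hand-picked hypothetical values rather than trained ones:

```python
import math

def rbf_forward(x, centers, spreads, weights, w0=0.0):
    """Output of an RBF network: a weighted sum of Gaussian kernels
    plus a bias term W0."""
    total = w0
    for c, s, w in zip(centers, spreads, weights):
        d = math.dist(x, c)                          # Euclidean distance
        total += w * math.exp(-(d ** 2) / (2 * s ** 2))
    return total

# Two hidden neurons in a 2-D predictor space.
centers = [(0.0, 0.0), (4.0, 4.0)]
spreads = [1.0, 1.0]
weights = [1.0, -1.0]

# Near each centre, that neuron's weight dominates the output.
print(round(rbf_forward((0.0, 0.0), centers, spreads, weights), 3))
print(round(rbf_forward((4.0, 4.0), centers, spreads, weights), 3))
```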
5.3. Training RBF Networks
The following parameters are determined by the training process:
1. The number of neurons in the hidden layer.
2. The coordinates of the center of each hidden-layer RBF function.
3. The radius (spread) of each RBF function in each dimension.
4. The weights applied to the RBF function outputs as they are passed to the summation layer.
Various methods have been used to train RBF networks. One approach first uses K-means clustering to find cluster centers, which are then used as the centers for the RBF functions. However, K-means clustering is a computationally intensive procedure, and it often does not generate the optimal number of centers. Another approach is to use a random subset of the training points as the centers.
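The K-means approach to picking centers can be sketched with a minimal Lloyd's-algorithm implementation (illustrative only; it is not DTREG's actual training procedure):

```python
import numpy as np

def kmeans_centers(points, k, iters=20, seed=0):
    """Pick k RBF centres by plain Lloyd's k-means: assign each point to
    its nearest centre, then move each centre to the mean of its points."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    centers = pts[rng.choice(len(pts), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = pts[assign == j].mean(axis=0)
    return centers

# Two well-separated blobs: the centres land near the two blob means.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = kmeans_centers(pts, k=2)
print(sorted(np.round(centers, 2).tolist()))
```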
DTREG uses a training algorithm developed by Sheng Chen,
comprehensive set of reference-standard algorithms and graphical tools for image processing, analysis, visualization, and algorithm development. You can perform image enhancement, image deblurring, feature detection, noise reduction, image segmentation, spatial transformations, and image registration. Many functions in the toolbox are multithreaded to take advantage of multicore and multiprocessor computers. Image Processing Toolbox supports a diverse set of image types, including high dynamic range, gigapixel resolution, ICC-compliant colour, and tomographic images. Graphical tools let you explore an image, examine a region of pixels, adjust the contrast, create contours or histograms, and manipulate regions of interest (ROIs). With the toolbox algorithms you can restore degraded images, detect and measure features, analyze shapes and textures, and adjust the colour balance of images.
7. ARDUINO
Arduino is an open-source electronics prototyping platform based on flexible, easy-to-use hardware and software. It's intended for artists, designers, hobbyists, and anyone interested in creating interactive objects or environments.
Arduino can sense the environment by receiving input from a variety of sensors and can affect its surroundings by controlling lights, motors, and other actuators. The microcontroller on the board is programmed using the Arduino programming language (based on Wiring) and the Arduino development environment (based on Processing). Arduino projects can be stand-alone or they can communicate with software running on a computer (e.g. MATLAB, Flash, Processing).
Arduino is different from other platforms on the market because of these features:
It is a multiplatform environment; it can run on Windows, Macintosh, and Linux.
It is based on the Processing programming IDE, an easy-to-use development environment used by artists and designers.
You program it via a USB cable, not a serial port. This feature is useful because many modern computers don't have serial ports.
It is open source hardware and software: if you wish, you can download the circuit diagram, buy all the components, and make your own, without paying anything to the makers of Arduino.
There is an active community of users, so there are plenty of people who can help you.
The Arduino Project was developed in an educational environment and is therefore great for newcomers to get things working quickly.
7.1. Bare Minimum Code to Run the Arduino
void setup() {
  // we can put setup code here, to run once
}

void loop() {
  // we can put main code here, to run repeatedly
}
The setup() function is called when a sketch starts. Use it to initialize variables, pin modes, start using libraries, etc. The setup function will only run once, after each powerup or reset of the Arduino board.
After creating a setup() function, the loop() function does precisely what its name suggests, and loops consecutively, allowing our program to change and respond as it runs. Code in the loop() section of our sketch is used to actively control the Arduino board.
Any line that starts with two slashes (//) will not be read by the compiler, so we can write anything we want after it.
7.2. LED Blinking Example
/* Blink
   Turns on an LED for one second, then off for one second, repeatedly.
   This example code is in the public domain. */

void setup() {
  // initialize the digital pin as an output.
  // Pin 13 has an LED connected on most Arduino boards.
  pinMode(13, OUTPUT);
}

void loop() {
  digitalWrite(13, HIGH);   // set the LED on
  delay(1000);              // wait for a second
  digitalWrite(13, LOW);    // set the LED off
  delay(1000);              // wait for a second
}
In the program above, the first thing we do is initialize pin 13 as an output pin with the line
pinMode(13, OUTPUT);
In the main loop, we turn the LED on with the line:
digitalWrite(13, HIGH);
This supplies 5 volts to pin 13. That creates a voltage difference across the pins of the LED, and lights it up. Then we turn it off with the line:
digitalWrite(13, LOW);
That takes pin 13 back to 0 volts, and turns the LED off. In between the on and the off, we want enough time for a person to see the change, so the delay() commands tell the Arduino to do nothing for 1000 milliseconds, or one second. When we use the delay() command, nothing else happens for that amount of time.
CHAPTER 3 : IMAGE DATABASE
The starting point of the project was the creation of a database with all the images that would be used for training and testing.
The image database can have different formats. Images can be hand drawn, digitized photographs, or renderings of a 2D/3D hand model. Photographs were used, as they are the most realistic approach.
Images came from two main sources: various ASL databases on the Internet, and photographs taken with a web camera. This meant that they had different sizes, different resolutions and sometimes almost completely different shooting angles. Images belonging to the last case were very few, and they were discarded, as there was no chance of classifying them correctly.
Two operations were carried out on all of the images: they were converted to grayscale, and the background was made uniform. The Internet databases already had uniform backgrounds, but the images taken with the camera had to be processed in Adobe Photoshop.
The database itself was constantly changing throughout the completion of the project, as it was the database that would decide the robustness of the algorithm. Therefore, it had to be built in such a way that different situations could be tested, and the thresholds above which the algorithm did not classify correctly could be determined. The construction of such a database is clearly dependent on the application. If the application is, for example, a crane controller operated by the same person for long periods, the algorithm does not have to be robust to images of different persons; in that case, however, noise and motion blur should still be tolerated.
We can see an example as shown in the figure:
In the first row are the training images; in the second, the testing images. For most of the gestures, the training set originates from a single gesture image. Those images were enhanced in Adobe Photoshop using various filters. The reason for this is that the algorithm has to be very robust for images of the same database; if a misclassification were to happen, it would preferably be for unknown images.
The final form of the database is this:
Train set: four training sets of images, each one containing thirty-five images. Each set originates from a single image.
Test set: the number of test images varies for each gesture. There is no reason for keeping it constant. Some gestures can tolerate much more variance, including images from new databases, and can be tested extensively, while other gestures are restricted to fewer testing images.
The system could be approached at either a high or a low level. The former would employ models of the hand, fingers and joints, and perhaps fit such a model to the visual data. This approach offers robustness, but at the expense of speed.
A low-level approach would process data at a level not much higher than that of pixel intensities. Although this approach would not have the power to make inferences about occluded data, it could be simple and fast. The pattern recognition system that will be used can be seen in the figure.
Some transformation T converts an image into a feature vector, which is then compared with the feature vectors of a training set of gestures. We will be seeking the simplest possible transformation T that allows gesture recognition.
CHAPTER 4 : RADIAL BASIS NETWORK IMPLEMENTATION IN MATLAB
1. LEARNING RULES
We will define a learning rule as a procedure for modifying the weights and biases of a network. (This procedure may also be referred to as a training algorithm.) The learning rule is applied to train the network to perform some particular task. Learning rules in the MATLAB toolbox fall into two broad categories: supervised learning and unsupervised learning.
Those two categories were described in detail in the previous chapter. The algorithm has been developed using supervised learning.
In supervised learning, the learning rule is provided with a set of examples (the training set) of proper network behavior: input/target pairs {p, t}, where p is an input to the network and t is the corresponding correct (target) output. As the inputs are applied to the network, the network outputs are compared to the targets. The learning rule is then used to adjust the weights and biases of the network in order to move the network outputs closer to the targets. The Radial Basis Function learning rule falls into this supervised learning category.
2. LINEAR CLASSIFICATION
Linear networks can be trained to perform classification with the function NEWRB. NEWRB designs a radial basis network. Radial basis networks can be used to approximate functions. NEWRB adds neurons to the hidden layer of a radial basis network until it meets the specified mean squared error goal.
[net,tr] = newrb(P,T,GOAL,SPREAD,MN,DF) takes these arguments:
P - RxQ matrix of Q input vectors.
T - SxQ matrix of Q target class vectors.
GOAL - Mean squared error goal, default = 0.0.
SPREAD - Spread of radial basis functions, default = 1.0.
MN - Maximum number of neurons, default is Q.
DF - Number of neurons to add between displays, default = 25,
and returns a new radial basis network.
The larger SPREAD is, the smoother the function approximation will be. Too large a spread means a lot of neurons will be required to fit a fast-changing function. Too small a spread means many neurons will be required to fit a smooth function, and the network may not generalize well. Call NEWRB with different spreads to find the best value for a given problem.
Examples:
Here we design a radial basis network given inputs P and targets T.
P = [1 2 3];
T = [2.0 4.1 5.9];
net = newrb(P,T);
Here the network is simulated for a new input.
P = 1.5;
Y = sim(net,P)
Algorithm
NEWRB creates a two-layer network. The first layer has RADBAS neurons, and calculates its weighted inputs with DIST and its net input with NETPROD. The second layer has PURELIN neurons, and calculates its weighted input with DOTPROD and its net inputs with NETSUM. Both layers have biases. Initially the RADBAS layer has no neurons. The following steps are repeated until the network's mean squared error falls below GOAL or the maximum number of neurons is reached:
1) The network is simulated.
2) The input vector with the greatest error is found.
3) A RADBAS neuron is added with weights equal to that vector.
4) The PURELIN layer weights are redesigned to minimize error.
Below, in the figure, we can see a flow chart of the algorithm. It always operates in the same order. There is a graphical interface only for selecting the test set, as that is the only user input. The error is also displayed graphically for each epoch that passes.
CHAPTER 5 : PHASES OF THE PROJECT
The project had two phases:
Training Phase
Execution Phase
1. TRAINING PHASE
An artificial neural network is a system based on the operation of biological neural networks; in other words, it is an emulation of a biological neural system. Why would the implementation of artificial neural networks be necessary? Although computing these days is truly advanced, there are certain tasks that a program made for a common microprocessor is unable to perform; even so, a software implementation of a neural network can be made, with its own advantages and disadvantages.
1.1. Training of artificial neural networks
A neural network has to be configured such that the application of a set of inputs produces (either 'directly' or via a relaxation process) the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule. We can categorize the learning situations into two distinct sorts. These are:
Supervised learning, or associative learning, in which the network is trained by providing it with input and matching output patterns. These input-output pairs can be provided by an external teacher, or by the system which contains the neural network (self-supervised).
Unsupervised learning, or self-organization, in which an (output) unit is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.
Reinforcement learning: This type of learning may be considered an intermediate form of the above two types of learning. Here the learning machine performs some action on the environment and gets a feedback response from the environment. The learning system grades its action as good (rewarding) or bad (punishable) based on the environmental response and accordingly adjusts its parameters. Generally, parameter adjustment is continued until an equilibrium state occurs, following which there will be no more changes in its parameters. Self-organizing neural learning may be categorized under this type of learning.
The Radial Basis Function learning rule falls into the supervised learning category. In supervised learning, the learning rule is provided with a set of examples (the training set) of proper network behavior: input/target pairs {p, t}, where p is an input to the network and t is the corresponding correct (target) output. As the inputs are applied to the network, the network outputs are compared to the targets. The learning rule is then used to adjust the weights and biases of the network in order to move the network outputs closer to the targets.
1.21. Video capturing
The camera used is a webcam with a resolution of 640x480 in YCbCr format. Using a MATLAB program we convert the video to grayscale format. Using MATLAB we take snapshots of the obtained video at an interval of two seconds.
The size of the obtained image is 640x480.
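The two operations (grayscale conversion and taking a snapshot every two seconds) can be modelled in Python with plain arrays. This is an illustrative sketch; the project itself used MATLAB's video functions, and the luminance weights used here are the standard ones, assumed rather than taken from the report:

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an RGB frame (H x W x 3) to grayscale using the
    standard luminance weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def sample_every(frames, fps, period_s=2.0):
    """Keep one frame every period_s seconds from a stream captured
    at fps frames per second."""
    step = int(round(fps * period_s))
    return frames[::step]

# 10 seconds of dummy 320x240 video at 5 fps -> 50 frames.
frames = [np.zeros((240, 320, 3)) for _ in range(50)]
snaps = sample_every(frames, fps=5, period_s=2.0)
print(len(snaps))  # one snapshot every two seconds -> 5
```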
1.22. Neural network
35 images of a single gesture are provided to the network after appropriate cropping is done. These 35 gesture images are concatenated to form a single matrix, which forms the input vector to the network. For each input vector we assign a target vector. The target vector is taken to be a single image of a single gesture, which is a common one for all input vectors.
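The construction described above can be sketched in Python with NumPy. This is illustrative only; the 120x160 image size is taken from the cropping code later in the report, and the variable names are assumptions:

```python
import numpy as np

# 35 cropped gesture images (each 120x160) concatenated column-wise
# into one input matrix, with the same reference image repeated as
# the target for every sample.
images = [np.random.rand(120, 160) for _ in range(35)]
reference = images[0]                         # common target image

p = np.concatenate(images, axis=1)            # input matrix, 120 x (160*35)
t = np.concatenate([reference] * 35, axis=1)  # target matrix, same shape
print(p.shape, t.shape)  # (120, 5600) (120, 5600)
```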
Similarly, all four gestures are trained to obtain four networks respectively. The networks are saved in the MATLAB workspace. This acts as the database for classification.
A specific gesture is used to convey a specific command for the execution of an action by the robot. The respective control actions for the gestures given above (in clockwise order) are forward, backward, right and left movement of the robot.
2. EXECUTION PHASE
In this section we describe a methodology for implementing the execution phase of artificial neural networks (ANNs) in hardware devices. The various entities are interconnected to form a neuron that can be mapped to a hardware device. Using a similar approach, neurons are grouped in layers, which are then interconnected themselves to construct an artificial neural network. The methodology is intended to lead a neural network designer through the steps required to take the design into a hardware device, starting with the results provided by a neural network simulator, obtaining the network parameters and translating them into a fully synthesizable design.
2.1. Execution Phase: Block Diagram
2.11. Steps in execution phase:
A real-time hand tracking scheme can be used for capturing the gesture; if a gesture is available in the vision range, it is captured by the camera.
Convert the image to a binary image.
Crop out the unwanted portion with the cropping algorithm.
Give it to the neural net to perform classification.
Based on the stored database, the neural net will return the classified result.
The neural net classification output is used to drive the hardware.
The Arduino is used for hardware interfacing.
The control commands are sent to the Arduino from MATLAB directly.
On the basis of this command, the motion of the ROBOT is adjusted.
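The steps above can be condensed into a single loop body. The following Python sketch uses placeholder functions for the capture, binarization, cropping and classification stages; all names here are hypothetical stand-ins for the project's MATLAB functions, shown only to make the data flow explicit:

```python
# Command codes follow the gesture-to-action mapping described above.
COMMANDS = {0: "NO MOTION", 1: "FORWARD", 2: "BACKWARD",
            3: "RIGHT", 4: "LEFT"}

def execute_step(capture, binarize, crop, classify, send_to_arduino):
    frame = capture()                   # grab a frame from the camera
    binary = binarize(frame)            # convert to a binary image
    hand = crop(binary)                 # crop away the unwanted region
    gesture = classify(hand)            # neural-net classification result
    send_to_arduino(COMMANDS[gesture])  # drive the robot accordingly
    return COMMANDS[gesture]

# Dummy stand-ins show the data flow without any hardware attached.
sent = []
cmd = execute_step(lambda: "frame", lambda f: f, lambda b: b,
                   lambda h: 1, sent.append)
print(cmd, sent)  # FORWARD ['FORWARD']
```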
CHAPTER 6 : INTERFACING USING ARDUINO
As we already discussed, different functions corresponding to each meaningful hand gesture are written and stored in the database for controlling the robot. That is, for each meaningful hand gesture we want to make the robot move in the corresponding direction. To make this happen, for each gesture identified, MATLAB should send the corresponding control signal to the robot. The interfacing of MATLAB with the hardware is done using an Arduino. It simplifies the serial communication between MATLAB and the hardware. Another advantage of using an Arduino is that it can be controlled directly from MATLAB by creating an Arduino object inside the software itself. This is possible through the MATLAB support package for Arduino, which uses the serial port interface in MATLAB for communication with the Arduino board.
The MATLAB support package for Arduino comes with a server program, adiosrv.pde, that can be downloaded onto the Arduino board. Once downloaded, adiosrv.pde will:
1. Wait and listen for MATLAB commands
2. Upon receiving a MATLAB command, execute it and return the result
The steps to write/download the server program to the Arduino board are the following:
1. Open the Arduino IDE
2. Select the adiosrv.pde file by navigating through File > Open in the IDE
3. Click the upload button
1. TYPICAL USAGE
Make sure the board is connected to the computer via a USB port, make sure to know which serial port the Arduino is connected to (this is the same port found at the end of the driver installation step), and finally, make sure that the port is not used by the IDE (in other words, the IDE must be closed or disconnected), so that MATLAB can use the serial connection.
From MATLAB, launch the command a=arduino('port'), where 'port' is the COM port to which the Arduino is connected, e.g. 'COM5' or 'COM8' on Windows, or '/dev/ttyS101' on Unix (use '/dev/ttyUSB0' for Arduino versions prior to the Uno) or 'DEMO' (see below for the DEMO mode), and make sure the function terminates successfully.
Then use the commands a.pinMode (to change the mode of each pin between input and output), a.digitalRead, a.digitalWrite, a.analogRead, and a.analogWrite to perform digital input, digital output, analog input, and analog output.
2. EXAMPLE
The previous LED Blink example can be implemented directly in MATLAB as follows:
a = arduino('COM5')          % set up the Arduino object by specifying the COM port
a.pinMode(13,'output')       % declare the 13th pin as output
while(1)                     % execute the infinite loop
    a.digitalWrite(13,1)     % set the LED on
    pause(1)                 % wait for a second
    a.digitalWrite(13,0)     % set the LED off
    pause(1)                 % wait for a second
end
CHAPTER 7 : ROBOT CONTROL
The robot is a two-wheel drive with a castor wheel provided for support.
1. MOTOR DRIVER IC L293D
When the enable input is high, the associated driver gets enabled. As a result, the outputs become active and work in phase with their inputs. Similarly, when the enable input is low, that driver is disabled, and its outputs are off and in the high-impedance state. The L293D has an output current of 600 mA and a peak output current of 1.2 A per channel. Moreover, for protection of the circuit from back EMF, output diodes are included within the IC. The output supply (VCC2) has a wide range, from 4.5 V to 36 V.
Thus, if both motors move in the forward direction, the robot moves forward, while if both motors move in reverse, the robot moves backward. If the left motor moves forward and the right motor in reverse, then the robot turns to its right, while if the right motor moves forward and the left motor in reverse, then the robot turns to its left. Thus, with appropriate control instructions from the Arduino, we can control the movement of the robot.
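The drive logic described in this paragraph reduces to a small lookup table. The following Python sketch is illustrative, with +1 and -1 encoding forward and reverse rotation of the left and right motors:

```python
# Differential-drive mapping: command -> (left motor, right motor),
# where +1 = forward and -1 = reverse, exactly as described above.
DRIVE = {
    "FORWARD":  (+1, +1),   # both motors forward
    "BACKWARD": (-1, -1),   # both motors reverse
    "RIGHT":    (+1, -1),   # left forward, right reverse
    "LEFT":     (-1, +1),   # right forward, left reverse
}

def wheel_directions(command):
    return DRIVE[command]

print(wheel_directions("RIGHT"))  # (1, -1)
```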
2. DC GEARED MOTOR
The geared instrument DC motor is ideally suited to a wide range of applications requiring a combination of low-speed operation and small unit size. The integral iron-core DC motor provides smooth operation and a bidirectional variable-speed capability, while the gearhead utilizes a multistage metal spur gear train rated for a working torque up to 0.2 Nm. The unit, which is suitable for mounting in any attitude, provides reliable operation over a wide ambient temperature range and is equipped with an integral VDR (voltage dependent resistor) electrical suppression system to minimize electrical interference. The unit offers a range of gear ratio options for operating speeds from 5-200 rpm and is ideally suited to applications where small size and low unit price are important design criteria.
DC geared motors, as the name suggests, spin under the command of a controller. This makes them easier to control, as the controller knows exactly how far they have rotated without having to use a sensor. Therefore they are used on many rolling wheels.
The motor assembly has six wires coming out of it. Four of them are used for receiving data from the microcontroller board (Arduino) for its movement, one is connected to a 12 V DC supply to drive the motor, and the remaining wire is connected to a 5 V DC supply to power the driver IC.
CHAPTER 8 : HARDWARE SETUP
CHAPTER 9 : SOURCE CODE
1. PROGRAM TO TRAIN THE NEURAL NETWORK
%% This program is used to make the neural network for the identification of gesture 1.
%% The same program can be used for other gestures by replacing the sample images with
%% the images corresponding to that gesture
clc;
clear all;
%% Load the sample images (of different poses) to make the net for gesture 1
img1=imread('firsgesture1.png');
img2=imread('firsgesture2.png');
img3=imread('firsgesture3.png');
img4=imread('firsgesture4.png');
img5=imread('firsgesture5.png');
img6=imread('firsgesture6.png');
img7=imread('firsgesture7.png');
img8=imread('firsgesture8.png');
img9=imread('firsgesture9.png');
img10=imread('firsgesture10.png');
img11=imread('firsgesture11.png');
img12=imread('firsgesture12.png');
img13=imread('firsgesture13.png');
img14=imread('firsgesture14.png');
img15=imread('firsgesture15.png');
img16=imread('firsgesture16.png');
img17=imread('firsgesture17.png');
img18=imread('firsgesture18.png');
img19=imread('firsgesture19.png');
img20=imread('firsgesture20.png');
img21=imread('firsgesture21.png');
img22=imread('firsgesture22.png');
img23=imread('firsgesture23.png');
img24=imread('firsgesture24.png');
img25=imread('firsgesture25.png');
img26=imread('firsgesture26.png');
img27=imread('firsgesture27.png');
img28=imread('firsgesture28.png');
img29=imread('firsgesture29.png');
img30=imread('firsgesture30.png');
img31=imread('firsgesture31.png');
img32=imread('firsgesture32.png');
img33=imread('firsgesture33.png');
img34=imread('firsgesture34.png');
img35=imread('firsgesture35.png');
%% Crop the sample images
imgcpd1=imcrop(img1);
imgcpd2=imcrop(img2);
imgcpd3=imcrop(img3);
imgcpd4=imcrop(img4);
imgcpd5=imcrop(img5);
imgcpd6=imcrop(img6);
imgcpd7=imcrop(img7);
imgcpd8=imcrop(img8);
imgcpd9=imcrop(img9);
imgcpd10=imcrop(img10);
imgcpd11=imcrop(img11);
imgcpd12=imcrop(img12);
imgcpd13=imcrop(img13);
imgcpd14=imcrop(img14);
imgcpd15=imcrop(img15);
imgcpd16=imcrop(img16);
imgcpd17=imcrop(img17);
imgcpd18=imcrop(img18);
imgcpd19=imcrop(img19);
imgcpd20=imcrop(img20);
imgcpd21=imcrop(img21);
imgcpd22=imcrop(img22);
imgcpd23=imcrop(img23);
imgcpd24=imcrop(img24);
imgcpd25=imcrop(img25);
imgcpd26=imcrop(img26);
imgcpd27=imcrop(img27);
imgcpd28=imcrop(img28);
imgcpd29=imcrop(img29);
imgcpd30=imcrop(img30);
imgcpd31=imcrop(img31);
imgcpd32=imcrop(img32);
imgcpd33=imcrop(img33);
imgcpd34=imcrop(img34);
imgcpd35=imcrop(img35);
%% Train the network. p is the input vector and t is the target vector
p=[imgcpd1,imgcpd2,imgcpd3,imgcpd4,imgcpd5,imgcpd6,imgcpd7,imgcpd8,imgcpd9,...
imgcpd10,imgcpd11,imgcpd12,imgcpd13,imgcpd14,imgcpd15,imgcpd16,imgcpd17,...
imgcpd18,imgcpd19,imgcpd20,imgcpd21,imgcpd22,imgcpd23,imgcpd24,imgcpd25,...
imgcpd26,imgcpd27,imgcpd28,imgcpd29,imgcpd30,imgcpd31,imgcpd32,imgcpd33,...
imgcpd34,imgcpd35];
t=[imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,...
imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,...
imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,...
imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1,imgcpd1];
p=double(p);
t=double(t);
gesture1net=newrb(p,t,.001,.1,1000,25); %train using the RBF net
save('gesture1net','gesture1net'); %save the network
2. PROGRAM TO CROP THE CAPTURED IMAGE
%% MATLAB function imgcrop to crop the captured image
function imgout=imgcrop(imgin) %function definition
cropimg=imresize(imgin,[240,240]); %resize the original image
columnsum=sum(cropimg); %find the sum of the column entries
rowsum=sum(cropimg'); %find the sum of the row entries
q1=1;
q2=240;
q3=1;
q4=240;
for w=1:240 %search for the first vertical cropping boundary
    if (columnsum(1,w)>=10)
        q1=w;
        break;
    end
end
for w=q1:240 %search for the second vertical cropping boundary
    if (columnsum(1,w)>=10)
        q2=w;
    end
end
for w=1:240 %search for the first horizontal cropping boundary
    if (rowsum(1,w)>=10)
        q3=w;
        break;
    end
end
for w=q3:240 %search for the second horizontal cropping boundary
    if (rowsum(1,w)>=50)
        q4=w;
    end
end
%crop the image between the boundaries and resize to the final size
imgout=cropimg(q3:q4,q1:q2);
imgout=[zeros(q4-q3+1,160-(q2-q1+1)),imgout;zeros(120-(q4-q3+1),160)];
imgout=imresize(imgout,[120,160]);
imgout=double(imgout); %convert the image type to double for future use
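The boundary search used by the MATLAB function above can be sketched compactly in Python. This is an illustrative re-implementation, not the project's code: the threshold and the fixed working size are simplifications, and the vectorized search replaces the explicit loops:

```python
import numpy as np

def crop_to_content(img, thresh=0):
    """Find the first and last rows/columns whose pixel sums exceed
    a threshold, then keep only the region between those boundaries
    (the same idea as the four scanning loops above)."""
    colsum = img.sum(axis=0)
    rowsum = img.sum(axis=1)
    cols = np.where(colsum > thresh)[0]
    rows = np.where(rowsum > thresh)[0]
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# A white rectangle on a black background crops down to the rectangle.
img = np.zeros((240, 240))
img[50:100, 80:160] = 1.0
print(crop_to_content(img).shape)  # (50, 80)
```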
". PROGRAM OR REAL TIME HAND TRACKING AND GESTURE
IDENTIICATION
%% MATLAB program for real time hand tracking and gesture identification
clc;
clear all;
close all;
q=0; %initialize the switch variable
a = arduino('COM4'); %initialize the arduino object
a.pinMode(13,'output'); %define pin 13 as an output pin
a.pinMode(12,'output'); %define pin 12 as an output pin
a.pinMode(11,'output'); %define pin 11 as an output pin
a.pinMode(10,'output'); %define pin 10 as an output pin
closepreview;
vid = videoinput('winvideo', 1, 'YUY2_320x240'); %specify the video adaptor
src = getselectedsource(vid);
vid.ReturnedColorspace = 'grayscale'; %define the color format as GRAYSCALE
vid.FramesPerTrigger = 1;
preview(vid); %preview the video object
load gesture1net; %load the network for each gesture
load gesture2net;
load gesture3net;
load gesture4net;
imgref1=imread('refimg1.png'); %load the reference image for each gesture
imgref2=imread('refimg2.png');
imgref3=imread('refimg3.png');
imgref4=imread('refimg4.png');
imgcrpd1=imcrop(imgref1); %crop the reference images
imgcrpd2=imcrop(imgref2);
imgcrpd3=imcrop(imgref3);
imgcrpd4=imcrop(imgref4);
imgtrained1=sim(gesture1net,imgcrpd1); %simulate the nets on the cropped reference images
imgtrained2=sim(gesture2net,imgcrpd2);
imgtrained3=sim(gesture3net,imgcrpd3);
imgtrained4=sim(gesture4net,imgcrpd4);
while(1)
    preview(vid); %preview the video object
    currentimg=getsnapshot(vid); %capture the image of interest
    currentimg=im2bw(currentimg,.75); %convert the captured image to binary
    imshow(currentimg); %display the binary image
    cropedimg=imcrop(currentimg); %crop the binary image
    y=sim(gesture1net,cropedimg); %simulate using the first net
    corrwgt1=xcorr2(imgtrained1,y); %find the correlation
    corrwgt1=im2col(corrwgt1,[1 1]); %correlation weights to a column vector
    corrwgt1=sum(corrwgt1); %take the column sum
    y=sim(gesture2net,cropedimg); %simulate using the second net
    corrwgt2=xcorr2(imgtrained2,y); %find the correlation
    corrwgt2=im2col(corrwgt2,[1 1]); %correlation weights to a column vector
    corrwgt2=sum(corrwgt2); %take the column sum
    y=sim(gesture3net,cropedimg); %simulate using the third net
    corrwgt3=xcorr2(imgtrained3,y); %find the correlation
    corrwgt3=im2col(corrwgt3,[1 1]); %correlation weights to a column vector
    corrwgt3=sum(corrwgt3); %take the column sum
    y=sim(gesture4net,cropedimg); %simulate using the fourth net
    corrwgt4=xcorr2(imgtrained4,y); %find the correlation
    corrwgt4=im2col(corrwgt4,[1 1]); %correlation weights to a column vector
    corrwgt4=sum(corrwgt4); %take the column sum
    corrwghtmatrix=[corrwgt1 corrwgt2 corrwgt3 corrwgt4];
    %define the switch variable by thresholding the correlation weights
    w=max(corrwghtmatrix);
    if (w>3e7)
        q=1;
    elseif (w>2.1e7 && w<2.6e7)
        q=3;
    elseif (w>1e7)
        q=2;
    elseif (w>6e6)
        q=4;
    else
        q=0;
    end
    %generation of control signals to the motor driver using the switch method
    switch(q)
        case 1
            display('FORWARD');
            a.digitalWrite(13,0);
            a.digitalWrite(12,1);
            a.digitalWrite(11,0);
            a.digitalWrite(10,1);
        case 2
            display('BACKWARD');
            a.digitalWrite(13,1);
            a.digitalWrite(12,0);
            a.digitalWrite(11,1);
            a.digitalWrite(10,0);
        case 3
            display('RIGHT');
            a.digitalWrite(13,0);
            a.digitalWrite(12,1);
            a.digitalWrite(11,0);
            a.digitalWrite(10,0);
        case 4
            display('LEFT');
            a.digitalWrite(13,0);
            a.digitalWrite(12,0);
            a.digitalWrite(11,0);
            a.digitalWrite(10,1);
        otherwise
            display('NO MOTION');
            a.digitalWrite(13,0);
            a.digitalWrite(12,0);
            a.digitalWrite(11,0);
            a.digitalWrite(10,0);
    end
end
4. PROGRAM TO BE BURNED ON ARDUINO BOARD
adiosrv.pde
/* Analog and Digital Input and Output Server for MATLAB */
/* This file is meant to be used with the MATLAB arduino IO package; however, it can be
used from the IDE environment (or any other serial terminal) by typing commands like:
0e0 : assigns digital pin #4 (e) as input
0f1 : assigns digital pin #5 (f) as output
0n1 : assigns digital pin #13 (n) as output
1c : reads digital pin #2 (c)
1e : reads digital pin #4 (e)
2n0 : sets digital pin #13 (n) low
2n1 : sets digital pin #13 (n) high
2f1 : sets digital pin #5 (f) high
2f0 : sets digital pin #5 (f) low
4j2 : sets digital pin #9 (j) to 50=ascii(2) over 255
4jz : sets digital pin #9 (j) to 122=ascii(z) over 255
3a : reads analog pin #0 (a)
3f : reads analog pin #5 (f)
R0 : sets analog reference to DEFAULT
R1 : sets analog reference to INTERNAL
R2 : sets analog reference to EXTERNAL */
/* Nothing gets executed in the loop if nothing is available to be read, but as soon as
anything becomes available, then the part coded after the if statement (that is, the
real stuff) gets executed */
if (Serial.available() > 0) {
  /* whatever is available from the serial is read here */
  val = Serial.read();
  /* This part basically implements a state machine that reads the serial port and makes
  just one transition to a new state, depending on both the previous state and the command
  that is read from the serial port. Some commands need additional inputs from the serial
  port, so they need 2 or 3 state transitions (each one happening as soon as anything new
  is available from the serial port) to be fully executed. After a command is fully
  executed the state returns to its initial value s=-1 */
  switch (s) {

  /* s=-1 means NOTHING RECEIVED YET ******************* */
  case -1:
    /* calculate next state */
    if (val>47 && val<90) {
      /* the first received value indicates the mode: 49 is ascii for 1, ... 90 is ascii
      for Z; s=0 is change-pin mode; s=10 is DI; s=20 is DO; s=30 is AI; s=40 is AO;
      s=90 is query script type (1 basic, 2 motor); s=340 is change analog reference */
      s=10*(val-48);
    }
    /* the following statements are needed to handle unexpected first values coming from
    the serial (if the value is unrecognized then it defaults to s=-1) */
    if ((s>40 && s<90) || (s>90 && s!=340)) {
      s=-1;
    }
    /* the break statement gets out of the switch-case, so
    we go back to line 97 and wait for new serial data */
    break; /* s=-1 (initial state) taken care of */

  /* s=0 or 1 means CHANGE PIN MODE */
  case 0:
    /* the second received value indicates the pin, from abs('c')=99, pin 2, to abs('t')=116,
    pin 19 */
    if (val>98 && val<117) {
      pin=val-97; /* calculate pin */
      s=1; /* next we will need to get 0 or 1 from serial */
    }
    else {
      s=-1; /* if value is not a pin then return to -1 */
    }
    break; /* s=0 taken care of */

  case 1:
    /* the third received value indicates the value 0 or 1 */
    if (val>47 && val<50) {
      /* set pin mode */
      if (val==48) {
        pinMode(pin,INPUT);
      }
else ^
pin"ode7pin,?T?T8J
_
_
sL-%J [ we are done with 9!NC: IN so go to -% [
breakJ [ sL% taken care of [[ sL%( means 4ICIT!$ IN?T [[[[[[[[[[[[[[[[[[[[[[[[[[ [
case %(2
[ the second received value indicates the pin from abs 7DcD8 L&&, pin 3, to abs 7DtD8 L%%', pin
%& [
if 7val&= UU valZ%%V8 ^
pinLval-&VJ [ calculate pin [
dgvLdigital/ead7pin8J [ perform 4igital Input [
Serial.println7dgv8J [ send value via serial [
_
sL-%J [ we are done with 4I so next state is -% [
breakJ [ sL%( taken care of [[ sL3( or 3% means 4ICIT!$ ?T?T [[[[[[[[[[[[[[[[[[[ [
case 3(2
[ the second received value indicates the pin from abs 7DcD8 L&&, pin 3, to abs 7DtD8 L%%', pin
%& [
if 7val&= UU valZ%%V8 ^
pinLval-&VJ [ calculate pin [
sL3%J [ next we will need to get ( or % from serial [
_
else ^
sL-%J [ if value is not a pin then return to -% [
_
breakJ [ sL3( taken care of [
case 3%2
[ the third received value indicates the value ( or % [
if 7val;V UU valZB(8 ^
dgvLval-;=J [ calculate value [
digital1rite7pin,dgv8J [ perform 4igital utput [
_
sL-%J [ we are done with 4 so next state is -% [
breakJ [ sL3% taken care of [[ sLA( means !N!$C IN?T [[[[[[[[[[[[[[[[[[[[[[[[[[[ [
    /* s=30 means ANALOG INPUT ****************************** */
    case 30:
      /* the second received value indicates the pin
         from abs('a')=97, pin 0, to abs('f')=102, pin 5;
         note that these are the digital pins from 14 to 19
         located in the lower right part of the board */
      if (val>96 && val<103) {
        pin=val-97;           /* calculate pin           */
        agv=analogRead(pin);  /* perform Analog Input    */
        Serial.println(agv);  /* send value via serial   */
      }
      s=-1;  /* we are done with AI so next state is -1 */
      break; /* s=30 taken care of */
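The analog-input case uses a different pin alphabet from the digital cases, which is easy to get wrong on the host side. The tiny decoder below is our own illustration (the function name is hypothetical) of exactly the guard and arithmetic in case 30:

```c
#include <assert.h>

/* Hypothetical decoder for the pin byte handled by case 30 (ANALOG
   INPUT): 'a'=97 selects analog pin 0 through 'f'=102 for analog
   pin 5 (digital pins 14..19 on the board); anything else is
   rejected, mirroring the val>96 && val<103 guard. */
static int analog_pin_from_byte(int val) {
    if (val > 96 && val < 103)
        return val - 97;          /* same arithmetic as pin=val-97 */
    return -1;                    /* not an analog pin byte        */
}
```

So, assuming the mode byte '3' dispatches to s=30, the two-byte stream "3a" would read analog pin 0.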
    /* s=40 or 41 means ANALOG OUTPUT ************************ */
    case 40:
      /* the second received value indicates the pin
         from abs('c')=99, pin 2, to abs('t')=116, pin 19 */
      if (val>98 && val<117) {
        pin=val-97;  /* calculate pin                              */
        s=41;        /* next we will need to get value from serial */
      }
      else {
        s=-1;        /* if value is not a pin then return to -1    */
      }
      break; /* s=40 taken care of */
    case 41:
      /* the third received value indicates the analog value */
      analogWrite(pin,val);  /* perform Analog Output */
      s=-1;  /* we are done with AO so next state is -1 */
      break; /* s=41 taken care of */
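Note that case 41 passes the received byte to analogWrite unchanged, so the PWM value travels as a raw byte 0..255 rather than as ASCII digits. The encoder below is a hypothetical host-side counterpart (names and the '4' mode byte are our assumptions, following the s = 10*(val-48) dispatch convention):

```c
#include <assert.h>

/* Hypothetical encoder for the 3-byte ANALOG OUTPUT command of
   cases 40/41: mode byte '4' (so s becomes 40), pin byte 97+pin,
   then the raw PWM value 0..255 -- case 41 uses the byte as-is,
   unlike the ASCII-digit value encoding of the digital cases. */
static int encode_analog_out(int pin, int value, unsigned char cmd[3]) {
    if (pin < 2 || pin > 19 || value < 0 || value > 255)
        return -1;                      /* outside the accepted range */
    cmd[0] = '4';                       /* mode byte: analog output   */
    cmd[1] = (unsigned char)(97 + pin); /* 'c' = pin 2 ... 't' = 19   */
    cmd[2] = (unsigned char)value;      /* raw duty cycle 0..255      */
    return 0;
}
```

For example, roughly 50% duty on pin 9 would be the bytes {'4', 'j', 128}, since 97+9 = 106 = 'j'.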
    /* s=90 means Query Script Type (1 basic, 2 motor) ******* */
    case 90:
      if (val==57) {
        /* if string sent is 99 send script type via serial */
        Serial.println(1);
      }
      s=-1;  /* we are done with this so next state is -1 */
      break; /* s=90 taken care of */
    /* s=340 or 341 means ANALOG REFERENCE ******************* */
    case 340:
      /* the second received value indicates the reference,
         which is encoded as 0, 1, 2 for DEFAULT, INTERNAL
         and EXTERNAL, respectively */
      switch (val) {
        case 48:
          analogReference(DEFAULT);
          break;
        case 49:
          analogReference(INTERNAL);
          break;
        case 50:
          analogReference(EXTERNAL);
          break;
        default:
          break; /* unrecognized reference, no action */
      }
      s=-1;  /* we are done with ANALOG REFERENCE so next state is -1 */
      break; /* s=340 taken care of */
  } /* end switch on state s */
} /* end loop statement */
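As a rough end-to-end check of the protocol, the fragment below replays a byte stream through a cut-down model of this state machine. It is our sketch, not the report's code: only the first-byte dispatch (assumed to be s = 10*(val-48), per the MATLAB Arduino IO convention) and the DIGITAL OUTPUT path (s=20/21) are modeled faithfully; every other recognized mode is simplified to a single swallowed byte, so multi-argument modes are not replayed exactly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical host-side replay: count how many digitalWrite-
   equivalent actions a command stream would trigger. */
static int count_digital_writes(const char *stream) {
    int s = -1, writes = 0;
    size_t i, n = strlen(stream);
    for (i = 0; i < n; i++) {
        int val = (unsigned char)stream[i];
        switch (s) {
        case -1:                  /* dispatch on the mode byte     */
            s = (val > 47 && val < 91) ? 10 * (val - 48) : -1;
            if ((s > 40 && s < 90) || (s > 90 && s != 340)) s = -1;
            break;
        case 20:                  /* expect the pin byte           */
            s = (val > 98 && val < 117) ? 21 : -1;
            break;
        case 21:                  /* expect '0' or '1'             */
            if (val > 47 && val < 50) writes++;
            s = -1;
            break;
        default:                  /* other modes: simplified to    */
            s = -1;               /* "swallow one byte and reset"  */
            break;
        }
    }
    return writes;
}
```

For example, the stream "2n1" (digital output, pin 13, value 1) triggers exactly one write, while a bad pin byte aborts the command without writing.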
CHAPTER 10 : CHALLENGES FACED
There are many challenges associated with the accuracy and usefulness of gesture recognition software. For image-based gesture recognition there are limitations on the equipment used and on image noise. Images or video may not be under consistent lighting, or in the same location. Items in the background or distinct features of the users may make recognition more difficult. The variety of implementations for image-based gesture recognition may also cause issues for the viability of the technology for general usage. For example, an algorithm calibrated for one camera may not work for a different camera. The amount of background noise also causes tracking and recognition difficulties, especially when occlusions (partial and full) occur. Furthermore, the distance from the camera, and the camera's resolution and quality, also cause variations in recognition accuracy. In order to capture human gestures by visual sensors, robust computer vision methods are also required, for example for hand tracking and hand posture recognition, or for capturing movements of the head, facial expressions or gaze direction. The major challenges faced in the completion of the project were:
- Real-time tracking of gestures in a noisy atmosphere.
- Setting of the threshold value for gesture classification.
- Time taken for training a gesture.
- Variation in illumination of the surroundings.
CHAPTER 11 : RESULTS, ANALYSIS, CONCLUSION AND FUTURE WORKS
RESULTS AND ANALYSIS
We have conducted experiments based on images we have acquired using a simple webcam. We have collected these data on uniform backgrounds. The accompanying figure illustrates the real-time hand tracking and gesture identification for the command of forward motion.
Our algorithm leads to the correct recognition of all four gestures. Also, the computation time needed to obtain these results is very small, since the algorithm is quite simple. Thirty-five images of a single gesture were taken for the training purpose and the network was formed. The real-time image was captured by the webcam and was given to the net for classification. Based on the stored database, the neural net returns the classified result. This neural-net classification output was used to drive the hardware, interfaced with the help of Arduino. The control commands were sent to the Arduino from MATLAB directly, and on the basis of these commands the motion of the ROBOT was adjusted.
We have noted that images taken under insufficient light have led to incorrect results. In these cases the failure mainly stems from the erroneous segmentation of some background portions as the hand region.
Our algorithm appears to perform well on uniform backgrounds with appropriate illumination. Overall, we find the performance of this simple algorithm quite satisfactory in the context of our motivating robot control application.
CONCLUSION AND FUTURE WORKS
We proposed a fast and simple algorithm for hand gesture recognition for controlling a robot, and we have demonstrated the effectiveness of this computationally efficient algorithm on real images we have acquired. In our system of gesture-controlled robots, we have only considered a limited number of gestures. Our algorithm can be extended in a number of ways to recognize a broader set of gestures. The gesture recognition portion of our algorithm is quite simple, and would need to be improved if this technique were to be used in challenging operating conditions. Reliable performance of hand gesture recognition techniques in a general setting requires dealing with occlusions, temporal tracking for recognizing dynamic gestures, as well as 3D modeling of the hand, which are still mostly beyond the current state of the art.
The advantage of using neural networks is that conclusions can be drawn from the network output: if a vector is not classified correctly, we can check its output and work out a solution. Even with limited processing power, it would be possible to design very efficient algorithms, for example by using an advanced DSP processor to reduce the size of the module, by recognizing further static gestures, or by extending the control to other biometric uses. Our software has been designed to be reusable, so that behaviors that are more complex may be added to our work. Because we limited ourselves to low processing power, our work could easily be made more performant by adding a state-of-the-art processor. The use of a real embedded OS could improve our system in terms of speed and stability. In addition, implementing more sensor modalities would improve robustness even in very complex scenes.
Our system has shown that interaction with machines through gestures is a feasible task. The set of detected gestures could be extended to more commands by implementing a more complex model of an advanced vehicle, usable not only in a limited space but also in broader areas such as roads. In the future, a service robot could execute many different tasks, from private movement to a fully-fledged advanced vehicle that can assist the disabled in every sense.
SNAPSHOTS