Hand Gesture Imp


    CHAPTER 1 : INTRODUCTION

Nowadays, robots are used in different domains ranging from search and rescue in dangerous environments to interactive entertainment. The more robots are employed in our daily life, the more natural communication with them is required. Since the introduction of the most common computer input devices, not a lot has changed. This is probably because the existing devices are adequate. It is also now that computers have been so tightly integrated with everyday life that new applications and hardware are constantly introduced. The means of communicating with computers at the moment are limited to keyboards, mice, light pens, trackballs, keypads, etc. These devices have grown to be familiar but inherently limit the speed and naturalness with which we interact with the computer. On the other hand, hand gesture, as a natural interface means, has been attracting much attention for interactive communication with robots in recent years.

As the computer industry has followed Moore's Law since the mid-1960s, powerful machines are built equipped with more peripherals. Vision based interfaces are feasible, and at the present moment the computer is able to "see". Hence users are allowed richer and more user-friendly man-machine interaction. This can lead to new interfaces that will allow the deployment of new commands that are not possible with the current input devices. Plenty of time will be saved as well.

Recently, there has been a surge of interest in recognizing human hand gestures. In this context, vision based hand detection and tracking techniques are used to provide an efficient real time interface with the robot. However, the problem of visual hand recognition and tracking is quite challenging. Many early approaches used position markers or colored gloves to make the problem of hand recognition easier, but due to their inconvenience, they cannot be considered a natural interface for robot control. Thanks to the latest advances in the computer vision field, the recent vision based approaches do not need any extra hardware except a camera. These techniques can be categorized as model based and appearance based methods. While model based techniques can recognize the hand motion and its shape exactly, they are computationally expensive and therefore infeasible for a real time control application.

The appearance based techniques, on the other hand, are faster, but they still deal with some issues such as:

the complex nature of the hand, with more than 20 DOF (degrees of freedom)

cluttered and varying backgrounds

variation in lighting conditions

real time computational demand

This project on the one hand addresses solutions to the mentioned issues in the hand recognition problem using 2D images, and on the other hand proposes an innovative natural commanding system for Human Robot Interaction (HRI).

Hand gesture recognition has various applications like computer games, machinery control (e.g. crane), and thorough mouse replacement. One of the most structured sets of gestures belongs to sign language. In sign language, each gesture has an assigned meaning (or meanings). Computer recognition of hand gestures may provide a more natural human-computer interface, allowing people to point, or to rotate a CAD model by rotating their hands. Hand gestures can be classified in two categories: static and dynamic. A static gesture is a particular hand configuration and pose, represented by a single image. A dynamic gesture is a moving gesture, represented by a sequence of images. We will focus on the recognition of static images. Interactive applications pose particular challenges. The response time should be very fast. The user should sense no appreciable delay between when he or she makes a gesture or motion and when the computer responds. The computer vision algorithms should be reliable and work for different people. There are also economic constraints: the vision-based interfaces will be replacing existing ones, which are often very low cost. Even for added functionality, consumers may not want to spend more. When additional hardware is needed, the cost is considerably higher.

    CHAPTER 2 : SYSTEM DESCRIPTION

The system developed for hand based robot control consists of the set-up of the robot, a 2D imaging system, and a control application.

    1. BLOCK DIAGRAM


In the project we use a video web camera for gesture identification. It sniffs frames of the live video stream at some time interval. In our case the frame capture rate for gesture search is 1 frame per two seconds.

For video capturing, the camera used has a resolution of 640x480 and gives output in YCbCr format. Using a MATLAB program we convert it into grayscale format. By using MATLAB we take snapshots of the obtained video at an interval of two seconds. The size of the obtained image is 640x480.
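A minimal MATLAB sketch of this capture step (assuming the Image Acquisition Toolbox; the 'winvideo' adaptor name and device ID are machine-dependent assumptions):

vid = videoinput('winvideo', 1);        % webcam source (adaptor/ID may differ)
set(vid, 'ReturnedColorSpace', 'rgb');  % return RGB frames
preview(vid);                           % live video stream
for k = 1:10                            % sniff frames at a fixed interval
    snap = getsnapshot(vid);            % take a snapshot of the video
    gray = rgb2gray(snap);              % convert the frame to grayscale
    pause(2);                           % 1 frame per two seconds
end
delete(vid);                            % release the camera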


Similarly all the four gestures are trained to obtain four networks respectively. The networks are saved in the workspace of MATLAB. This acts as the database for classification.

The proposed technique to control a robotic system using hand gesture display is divided into four subparts:

Capture a frame containing some gesture presentation.

Extract the hand gesture area from the captured frame.

Determine the gesture by pattern matching using a radial basis function.

Determine the control instruction corresponding to the matched gesture, and give that instruction to the specified robotic system.

Now for real time execution, snapshots of the video are taken. Then appropriate cropping is done and the result is given to the neural network. Gesture classification is done by the radial basis function. The correlation property between the images is used here, i.e. the cross-correlation is calculated. This determines the threshold level for gesture classification. The neural net classification output is used to drive the hardware. The Arduino is used for hardware interfacing. The control commands are sent to the Arduino from MATLAB directly. On the basis of this command the motion of the robot is adjusted.

Thus from the corpus of gestures, the specific gesture of interest can be identified, and on the basis of that, a specific command for execution of an action can be given to the robotic system.
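A minimal MATLAB sketch of this correlation-based matching (all names are illustrative: templates would hold one stored image per trained gesture, resized to a common size, and the 0.7 threshold is an assumption to be tuned):

test = double(croppedFrame);             % cropped snapshot from the camera
scores = zeros(1, numel(templates));
for g = 1:numel(templates)
    % 2-D cross-correlation coefficient between test image and template
    scores(g) = corr2(test, double(templates{g}));
end
[best, idx] = max(scores);
if best > 0.7                            % assumed acceptance threshold
    command = idx;                       % gesture index selects the robot command
end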

In our project the software platform used is MATLAB 10. The various sections of the system, like gesture recognition, the radial basis function, image processing, and interfacing using the Arduino, are explained in the forthcoming sections.

    2. WEBCAM

A webcam is a video camera that feeds its images in real time to a computer or computer network, often via USB, Ethernet, or Wi-Fi. Their most popular use is the establishment of video links, permitting computers to act as videophones or videoconference stations. The common use as a video camera for the World Wide Web gave the webcam its name. Other popular uses include security surveillance, computer vision, video broadcasting, and recording social videos. Webcams are known for their low manufacturing cost and flexibility, making them the lowest cost form of videotelephony. They have also become a source of security and privacy issues, as some built-in webcams can be remotely activated via spyware.

Webcams typically include a lens, an image sensor, support electronics, and may also include a microphone for sound. Various lenses are available, the most common in consumer-grade webcams being a plastic lens that can be screwed in and out to focus the camera. Fixed focus lenses, which have no provision for adjustment, are also available. As a camera system's depth of field is greater for small image formats and for lenses with a large f-number (small aperture), the systems used in webcams have a sufficiently large depth of field that the use of a fixed focus lens does not impact image sharpness to a great extent.

2.1. Technology

Image sensors can be CMOS or CCD, the former being dominant for low-cost cameras, but CCD cameras do not necessarily outperform CMOS-based cameras in the low cost price range. Most consumer webcams are capable of providing VGA resolution video at a frame rate of 30 frames per second. Many newer devices can produce video in multi-megapixel resolutions, and a few can run at high frame rates, such as the PlayStation Eye, which can produce 320x240 video at 120 frames per second.

Support electronics read the image from the sensor and transmit it to the host computer. The camera pictured to the right, for example, uses a Sonix SN9C101 to transmit its image over USB. Some cameras, such as mobile phone cameras, use a CMOS sensor with supporting electronics "on die", i.e. the sensor and the support electronics are built on a single silicon chip to save space and manufacturing costs. Most webcams feature built-in microphones to make video calling and videoconferencing more convenient. The USB video device class (UVC) specification allows for interconnectivity of webcams to computers without the need for proprietary device drivers.


    ".11. L(&e O)*ec$ T&(c+'n

In some interactive applications, the computer needs to track the position or orientation of a hand that is prominent in the image. Relevant applications might be computer games, or interactive machine control. In such cases, a description of the overall properties of the image may be adequate. Image moments, which are fast to compute, provide a very coarse summary of global averages of orientation and position. If the hand is on a uniform background, this method can distinguish hand positions and simple pointing gestures. The large-object-tracking method makes use of a low-cost detector/processor to quickly calculate moments. This is called the artificial retina chip. This chip combines image detection with some low-level image processing (named artificial retina by analogy with those combined abilities of the human retina). The chip can compute various functions useful in the fast algorithms for interactive graphics applications.

    ".12 Sh(,e &econ'$'on

    "ost applications, such as recogni0ing particular static hand signal, require a richer

    description of the shape of the input ob6ect than image moments provide. If the hand signals

    fell in a predetermined set, and the camera views a close-up of the hand, we may use an

    example-based approach, combined with a simple method top analy0e hand signals called

    orientation histograms. These example-based applications involve two phasesJ training and

    running. In the training phase, the user shows the system one or more examples of a specific

    hand shape. The computer forms and stores the corresponding orientation histograms. In the

  • 7/24/2019 Hand Gesture Imp

    7/46

    run phase, the computer compares the orientation histogram of the current image with each of

    the stored templates and selects the category of the closest match, or interpolates between

    templates, as appropriate. This method should be robust against small differences in the si0e

    of the hand but probably would be sensitive to changes in hand orientation. Cesture

    recognition can be seen as a way for computers to begin to understand human body language,

    this building a richer bridge between machines and humans than primitive text user interfacesor even C?Is 7graphical user interfaces8, which still limit the ma6ority of input to keyboard

    and mouse. Cesture recognition enables humans to interface with the machine 7"I8 and

    interact naturally without any mechanical devices. ?sing the concept of gesture recognition,

    it is possible to point a finger at the computer screen so that the cursor will move accordingly.

    This could potentially make conventional input devices such as mouse, keyboards and even

    touch-screens redundant.
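As a rough MATLAB sketch of the orientation-histogram comparison described above (not the project's own code; the bin count, gradient operator, and normalization are assumptions):

img = double(imread('hand.png'));        % hypothetical close-up hand image
[gx, gy] = gradient(img);                % image gradients
ang = atan2(gy, gx);                     % gradient orientation at each pixel
mag = hypot(gx, gy);                     % gradient magnitude at each pixel
nbins = 36;
edges = linspace(-pi, pi, nbins + 1);
h = zeros(1, nbins);
for b = 1:nbins                          % magnitude-weighted orientation histogram
    m = ang >= edges(b) & ang < edges(b+1);
    h(b) = sum(mag(m));
end
h = h / sum(h);                          % normalize, for robustness to hand size
% In the run phase, compare h to each stored template histogram
% (e.g. by Euclidean distance) and pick the closest match.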

4. NEURAL NETWORKS

Neural networks are composed of simple elements operating in parallel. These elements are inspired by biological nervous systems. As in nature, the network function is determined largely by the connections between elements. We can train a neural network to perform a particular function by adjusting the values of the connections (weights) between elements. Commonly neural networks are adjusted, or trained, so that a particular input leads to a specific target output. Such a situation is shown in figure 3. There, the network is adjusted, based on a comparison of the output and the target, until the network output matches the target. Typically many such input/target pairs are used in this supervised learning to train a network.

Neural networks have been trained to perform complex functions in various fields of application including pattern recognition, identification, classification, speech, vision, and control systems.


Today neural networks can be trained to solve problems that are difficult for conventional computers or human beings. Supervised training methods are commonly used, but other networks can be obtained from unsupervised training techniques or from direct design methods. Unsupervised networks can be used, for instance, to identify groups of data. Certain kinds of linear networks and Hopfield networks are designed directly. In summary, there is a variety of kinds of design and learning techniques that enrich the choices a user can make. The field of neural networks has a history of some five decades but has found solid application only in the past fifteen years, and the field is still developing rapidly. Thus, it is distinctly different from the fields of control systems or optimization, where the terminology, basic mathematics, and design procedures have been firmly established and applied for many years.

4.1. Neuron Model

4.11. Simple Neuron

A neuron with a single scalar input and no bias is shown on the left below. The scalar input p is transmitted through a connection that multiplies its strength by the scalar weight w, to form the product wp, again a scalar. Here the weighted input wp is the only argument of the transfer function f, which produces the scalar output a. The neuron on the right has a scalar bias, b. You may view the bias as simply being added to the product wp, as shown by the summing junction, or as shifting the function f to the left by an amount b. The bias is much like a weight, except that it has a constant input of 1. The transfer function net input n, again a scalar, is the sum of the weighted input wp and the bias b. This sum is the argument of the transfer function f. Here f is a transfer function, typically a step function or a sigmoid function, that takes the argument n and produces the output a. Note that w and b are both adjustable scalar parameters of the neuron. The central idea of neural networks is that such parameters can be adjusted so that the network exhibits some desired or interesting behaviour.


Thus, we can train the network to do a particular job by adjusting the weight or bias parameters, or perhaps the network itself will adjust these parameters to achieve some desired end. All of the neurons in the program written in MATLAB have a bias. However, you may omit a bias in a neuron if you wish.

4.12. Transfer Functions

Three of the most commonly used transfer functions are shown in figure (6).

The hard limit transfer function shown above limits the output of the neuron to either 0, if the net input argument n is less than 0, or 1, if n is greater than or equal to 0. This is the function used in the radial basis function algorithm written in MATLAB to create neurons that make a classification decision.
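As a small MATLAB illustration of the neuron and hard limit function just described (the values here are arbitrary):

p = 0.7; w = 2.0; b = -1.0;   % scalar input, weight, and bias (arbitrary values)
n = w*p + b;                  % net input: weighted input plus bias
a = double(n >= 0);           % hard limit: output 1 if n >= 0, otherwise 0
% With the Neural Network Toolbox this is equivalent to a = hardlim(n).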

There are two modes of learning: supervised and unsupervised. Below there is a brief description of each one, to determine the best one for our problem.


    -.1". S%,e&'#e Le(&n'n

Supervised learning is based on the system trying to predict outcomes for known examples and is a commonly used training method. It compares its predictions to the target answer and "learns" from its mistakes. The data start as inputs to the input layer neurons. The neurons pass the inputs along to the next nodes. As inputs are passed along, the weighting, or connection, is applied, and when the inputs reach the next node, the weightings are summed and either intensified or weakened. This continues until the data reach the output layer, where the model predicts an outcome. In a supervised learning system, the predicted output is compared to the actual output for that case. If the predicted output is equal to the actual output, no change is made to the weights in the system. But if the predicted output is higher or lower than the actual outcome in the data, the error is propagated back through the system and the weights are adjusted accordingly. This feeding of errors backwards through the network is called "back-propagation."

    @oth the "ulti-$ayer erceptron and the /adial @asis 5unction are supervised learning

    techniques. The "ulti-$ayer erceptron uses the back-propagation while the /adial @asis

    5unction is a feed-forward approach which trains on a single pass of the data.

4.14. Unsupervised Learning

Neural networks which use unsupervised learning are most effective for describing data rather than predicting it. The neural network is not shown any outputs or answers as part of the training process; in fact, there is no concept of output fields in this type of system. The primary unsupervised technique is the Kohonen network. The main uses of Kohonen and other unsupervised neural systems are in cluster analysis, where the goal is to group "like" cases together. The advantage of the neural network for this type of analysis is that it requires no initial assumptions about what constitutes a group or how many groups there are. The system starts with a clean slate and is not biased about which factors should be most important.

4.2. Advantages of Neural Computing

There is a variety of benefits that an analyst realizes from using neural networks in their work.

Pattern recognition is a powerful technique for harnessing the information in the data and generalizing about it. Neural nets learn to recognize the patterns which exist in the data set.

The system is developed through learning rather than programming. Programming is much more time consuming for the analyst and requires the analyst to specify the exact behaviour of the model. Neural nets teach themselves the patterns in the data, freeing the analyst for more interesting work.

Neural networks are flexible in a changing environment. Rule based systems or programmed systems are limited to the situation for which they were designed; when conditions change, they are no longer valid. Although neural networks may take some time to learn a sudden drastic change, they are excellent at adapting to constantly changing information.

Neural networks can build informative models where more conventional approaches fail. Because neural networks can handle very complex interactions, they can easily model data which is too difficult to model with traditional approaches such as inferential statistics or programming logic.

The performance of neural networks is at least as good as classical statistical modeling, and better on most problems. Neural networks build models that are more reflective of the structure of the data, in significantly less time.

Neural networks now operate well with modest computer hardware. Although neural networks are computationally intensive, the routines have been optimized to the point that they can now run in reasonable time on personal computers. They do not require supercomputers as they did in the early days of neural network research.

    -.". L'/'$($'on# o0 Ne%&(l Co/,%$'n

There are some limitations to neural computing. The key limitation is the neural network's inability to explain the model it has built in a useful way. Analysts often want to know why the model is behaving as it is. Neural networks get better answers, but they have a hard time explaining how they got there.

There are a few other limitations that should be understood. First, it is difficult to extract rules from neural networks. This is sometimes important to people who have to explain their answer to others and to people who have been involved with artificial intelligence, particularly expert systems, which are rule-based.

As with most analytical methods, you cannot just throw data at a neural net and get a good answer. You have to spend time understanding the problem or the outcome you are trying to predict. And you must be sure that the data used to train the system are appropriate and are measured in a way that reflects the behaviour of the factors. If the data are not representative of the problem, neural computing will not produce good results. This is a classic situation where "garbage in" will certainly produce "garbage out."

Finally, it can take time to train a model from a very complex data set. Neural techniques are computer intensive and will be slow on low end PCs or machines without math coprocessors. It is important to remember, though, that the overall time to results can still be faster than other data analysis approaches, even when the system takes longer to train. Processing speed alone is not the only factor in performance, and neural networks do not require the time for programming, debugging, or testing assumptions that other analytical approaches do.

5. RADIAL BASIS FUNCTION

A Radial Basis Function (RBF) neural network has an input layer, a hidden layer, and an output layer. The neurons in the hidden layer contain Gaussian transfer functions whose outputs are inversely proportional to the distance from the center of the neuron.

RBF networks are similar to K-Means clustering and PNN/GRNN networks. The main difference is that PNN/GRNN networks have one neuron for each point in the training file, whereas RBF networks have a variable number of neurons that is usually much smaller than the number of training points. For problems with small to medium sized training sets, PNN/GRNN networks are usually more accurate than RBF networks, but PNN/GRNN networks are impractical for large training sets.

5.1. How RBF networks work


Although the implementation is very different, RBF neural networks are conceptually similar to K-Nearest Neighbor (k-NN) models. The basic idea is that the predicted target value of an item is likely to be about the same as for other items that have close values of the predictor variables. Consider this figure:

Assume that each case in the training set has two predictor variables, x and y. The cases are plotted using their x,y coordinates as shown in the figure. Also assume that the target variable has two categories: positive, which is denoted by a square, and negative, which is denoted by a dash. Now, suppose we are trying to predict the value of a new case represented by the triangle with predictor values x=6, y=5.1. Should we predict the target as positive or negative?

Notice that the triangle is positioned almost exactly on top of a dash representing a negative value. But that dash is in a fairly unusual position compared to the other dashes, which are clustered below the squares and left of center. So it could be that the underlying negative value is an odd case.

The nearest neighbor classification performed for this example depends on how many neighboring points are considered. If 1-NN is used and only the closest point is considered, then clearly the new point should be classified as negative, since it is on top of a known negative point. On the other hand, if 9-NN classification is used and the closest 9 points are considered, then the effect of the surrounding 8 positive points may overbalance the close negative point.

An RBF network positions one or more RBF neurons in the space described by the predictor variables (x,y in this example). This space has as many dimensions as there are predictor variables. The Euclidean distance is computed from the point being evaluated (e.g., the triangle in this figure) to the center of each neuron, and a radial basis function (RBF), also called a kernel function, is applied to the distance to compute the weight (influence) for each neuron. The radial basis function is so named because the radius distance is the argument to the function: Weight = RBF(distance). The further a neuron is from the point being evaluated, the less influence it has.

Different types of radial basis functions could be used, but the most common is the Gaussian function:
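One standard form of this kernel (the exact normalization varies between texts), for a neuron at radius distance r with spread sigma, is:

Weight = RBF(r) = exp(-r^2 / (2*sigma^2))

so the weight falls off smoothly toward zero as the distance from the neuron's center grows.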

If there is more than one predictor variable, then the RBF function has as many dimensions as there are variables. The following picture illustrates three neurons in a space with two predictor variables, X and Y. Z is the value coming out of the RBF functions:


The best predicted value for the new point is found by summing the output values of the RBF functions multiplied by the weights computed for each neuron.

The radial basis function for a neuron has a center and a radius (also called a spread). The radius may be different for each neuron, and, in RBF networks generated by DTREG, the radius may be different in each dimension.


With larger spread, neurons at a distance from a point have a greater influence.

5.2. RBF Network Architecture

RBF networks have three layers:

1. Input layer - There is one neuron in the input layer for each predictor variable. In the case of categorical variables, N-1 neurons are used, where N is the number of categories. The input neurons (or processing before the input layer) standardize the range of the values by subtracting the median and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden layer.

2. Hidden layer - This layer has a variable number of neurons (the optimal number is determined by the training process). Each neuron consists of a radial basis function centered on a point with as many dimensions as there are predictor variables. The spread (radius) of the RBF function may be different for each dimension. The centers and spreads are determined by the training process. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function to this distance using the spread values. The resulting value is passed to the summation layer.

3. Summation layer - The value coming out of a neuron in the hidden layer is multiplied by a weight associated with the neuron (W1, W2, ..., Wn in this figure) and passed to the summation, which adds up the weighted values and presents this sum as the output of the network. Not shown in this figure is a bias value of 1.0 that is multiplied by a weight W0 and fed into the summation layer. For classification problems, there is one output (and a separate set of weights and summation unit) for each target category. The value output for a category is the probability that the case being evaluated has that category.

An input vector x is used as input to all radial basis functions, each with different parameters. The output of the network is a linear combination of the outputs from the radial basis functions.
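A minimal MATLAB sketch of this forward pass for a single-output network (illustrative names; in practice the centers, spread, and weights come from training):

function y = rbfForward(x, centers, sigma, w, w0)
% x: input column vector; centers: one column per hidden neuron;
% sigma: spread; w: output weights; w0: bias weight (illustrative names).
nH = size(centers, 2);
phi = zeros(nH, 1);
for j = 1:nH
    r = norm(x - centers(:, j));         % Euclidean distance to the center
    phi(j) = exp(-r^2 / (2 * sigma^2));  % Gaussian kernel value
end
y = w' * phi + w0;                       % summation layer: weighted sum plus bias
end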

    3.". T&('n'n RB Ne$4o&+#

The following parameters are determined by the training process:

1. The number of neurons in the hidden layer.
2. The coordinates of the center of each hidden-layer RBF function.
3. The radius (spread) of each RBF function in each dimension.
4. The weights applied to the RBF function outputs as they are passed to the summation layer.

Various methods have been used to train RBF networks. One approach first uses K-means clustering to find cluster centers, which are then used as the centers for the RBF functions. However, K-means clustering is a computationally intensive procedure, and it often does not generate the optimal number of centers. Another approach is to use a random subset of the training points as the centers.

DTREG uses a training algorithm developed by Sheng Chen.


6. IMAGE PROCESSING TOOLBOX

Image Processing Toolbox provides a comprehensive set of reference-standard algorithms and graphical tools for image processing, analysis, visualization, and algorithm development. You can perform image enhancement, image deblurring, feature detection, noise reduction, image segmentation, spatial transformations, and image registration. Many functions in the toolbox are multithreaded to take advantage of multicore and multiprocessor computers. Image Processing Toolbox supports a diverse set of image types, including high dynamic range, gigapixel resolution, ICC-compliant colour, and tomographic images. Graphical tools let you explore an image, examine a region of pixels, adjust the contrast, create contours or histograms, and manipulate regions of interest (ROIs). With the toolbox algorithms you can restore degraded images, detect and measure features, analyze shapes and textures, and adjust the colour balance of images.

    7. ARDUINO

Arduino is an open-source electronics prototyping platform based on flexible, easy-to-use hardware and software. It's intended for artists, designers, hobbyists, and anyone interested in creating interactive objects or environments. Arduino can sense the environment by receiving input from a variety of sensors, and can affect its surroundings by controlling lights, motors, and other actuators. The microcontroller on the board is programmed using the Arduino programming language (based on Wiring) and the Arduino development environment (based on Processing). Arduino projects can be stand-alone, or they can communicate with software running on a computer (e.g. MATLAB, Flash, Processing).

Arduino is different from other platforms on the market because of these features:

It is a multiplatform environment; it can run on Windows, Macintosh, and Linux.

It is based on the Processing programming IDE, an easy-to-use development environment used by artists and designers.

You program it via a USB cable, not a serial port. This feature is useful because many modern computers don't have serial ports.

It is open source hardware and software: if you wish, you can download the circuit diagram, buy all the components, and make your own, without paying anything to the makers of Arduino.

There is an active community of users, so there are plenty of people who can help you.

The Arduino Project was developed in an educational environment and is therefore great for newcomers to get things working quickly.

7.1. Bare Minimum Code to Run the Arduino

void setup() {
  // we can put setup code here, to run once
}

void loop() {
  // we can put main code here, it runs repeatedly
}

The setup() function is called when a sketch starts. Use it to initialize variables, pin modes, start using libraries, etc. The setup function will only run once, after each powerup or reset of the Arduino board. After creating a setup() function, the loop() function does precisely what its name suggests, and loops consecutively, allowing our program to change and respond as it runs. Code in the loop() section of our sketch is used to actively control the Arduino board.

Any line that starts with two slashes (//) will not be read by the compiler, so we can write anything we want after it.

7.2. LED Blinking Example

/* Blink
   Turns on an LED for one second, then off for one second, repeatedly.
   This example code is in the public domain. */

void setup() {
  // initialize the digital pin as an output.
  // Pin 13 has an LED connected on most Arduino boards
  pinMode(13, OUTPUT);
}

void loop() {
  digitalWrite(13, HIGH); // set the LED on
  delay(1000);            // wait for a second
  digitalWrite(13, LOW);  // set the LED off
  delay(1000);            // wait for a second
}


In the program above, the first thing we do is initialize pin 13 as an output pin with the line

pinMode(13, OUTPUT);

In the main loop, we turn the LED on with the line:

digitalWrite(13, HIGH);

This supplies 5 volts to pin 13. That creates a voltage difference across the pins of the LED, and lights it up. Then we turn it off with the line:

digitalWrite(13, LOW);

That takes pin 13 back to 0 volts, and turns the LED off. In between the on and the off, we want enough time for a person to see the change, so the delay() commands tell the Arduino to do nothing for 1000 milliseconds, or one second. When we use the delay() command, nothing else happens for that amount of time.

    CHAPTER " : IMAGE DATABASE

The starting point of the project was the creation of a database with all the images that would be used for training and testing.

The image database can have different formats. Images can be hand drawn, digitized photographs, or a 2D/3D hand model. Photographs were used, as they are the most realistic approach.

Images came from two main sources: various ASL databases on the Internet, and photographs taken with a web camera. This meant that they had different sizes, different resolutions, and sometimes almost completely different angles of shooting. Images belonging to the last case were very few, and they were discarded, as there was no chance of classifying them correctly.

Two operations were carried out on all of the images. They were converted to grayscale, and the background was made uniform. The internet databases already had uniform backgrounds, but the ones taken with the camera had to be processed in Adobe Photoshop.

The database itself was constantly changing throughout the completion of the project, as it was what would decide the robustness of the algorithm. Therefore, it had to be built in such a way that different situations could be tested, and thresholds above which the algorithm didn't classify correctly could be decided. The construction of such a database is clearly dependent on the application. If the application is, for example, a crane controller operated by the same person for long periods, the algorithm doesn't have to be robust to different persons' images. In this case noise and motion blur should be tolerable.

We can see an example as shown in the figure:


In the first row are the training images; in the second, the testing images. For most of the gestures the training set originates from a single gesture. Those were enhanced in Adobe Photoshop using various filters. The reason for this is that the algorithm is to be very robust for images of the same database. If a misclassification were to happen, it would preferably be for unknown images.

The final form of the database is this.

Train set: Four training sets of images, each one containing thirty five images. Each set originates from a single image.

Test set: The number of test images varies for each gesture. There is no reason for keeping those at a constant number. Some images can tolerate much more variance, including images from new databases, and they can be tested extensively, while other images are restricted to fewer testing images.

The system could be approached either at high or low level. The former would employ models of the hand, fingers and joints, and perhaps fit such a model to the visual data. This approach offers robustness, but at the expense of speed.

A low-level approach would process data at a level not much higher than that of pixel intensities. Although this approach would not have the power to make inferences about occluded data, it could be simple and fast. The pattern recognition system that will be used can be seen in the figure.


Some transformation T converts an image into a feature vector, which will then be compared with the feature vectors of a training set of gestures. We will be seeking the simplest possible transformation T that allows gesture recognition.

CHAPTER 4 : RADIAL BASIS NETWORK IMPLEMENTATION IN MATLAB

    1. LEARNING RULES

We will define a learning rule as a procedure for modifying the weights and biases of a network. (This procedure may also be referred to as a training algorithm.) The learning rule is applied to train the network to perform some particular task. Learning rules in the MATLAB toolbox fall into two broad categories: supervised learning and unsupervised learning. Those two categories were described in detail in the previous chapter. The algorithm has been developed using supervised learning.

In supervised learning, the learning rule is provided with a set of examples (the training set) of proper network behavior: inputs p to the network together with the corresponding correct (target) outputs t. As the inputs are applied to the network, the network outputs are compared to the targets. The learning rule is then used to adjust the weights and biases of the network in order to move the network outputs closer to the targets. The Radial Basis Function learning rule falls in this supervised learning category.

2. LINEAR CLASSIFICATION


Linear networks can be trained to perform classification with the function NEWRB. NEWRB designs a radial basis network. Radial basis networks can be used to approximate functions. NEWRB adds neurons to the hidden layer of a radial basis network until it meets the specified mean squared error goal.

[net,tr] = newrb(P,T,GOAL,SPREAD,MN,DF) takes these arguments:

P - RxQ matrix of Q input vectors.
T - SxQ matrix of Q target class vectors.
GOAL - Mean squared error goal, default = 0.0.
SPREAD - Spread of radial basis functions, default = 1.0.
MN - Maximum number of neurons, default is Q.
DF - Number of neurons to add between displays, default = 25,

and returns a new radial basis network.

The larger SPREAD is, the smoother the function approximation will be. Too large a spread means a lot of neurons will be required to fit a fast changing function. Too small a spread means many neurons will be required to fit a smooth function, and the network may not generalize well. Call NEWRB with different spreads to find the best value for a given problem.

Examples:

Here we design a radial basis network given inputs P and targets T.

P = [1 2 3];
T = [2.0 4.1 5.9];
net = newrb(P,T);

Here the network is simulated for a new input.

P = 1.5;
Y = sim(net,P)

Algorithm

NEWRB creates a two layer network. The first layer has RADBAS neurons, and calculates its weighted inputs with DIST and its net input with NETPROD. The second layer has PURELIN neurons, and calculates its weighted input with DOTPROD and its net inputs with NETSUM. Both layers have biases. Initially the RADBAS layer has no neurons. The following steps are repeated until the network's mean squared error falls below GOAL or the maximum number of neurons is reached:

1) The network is simulated.
2) The input vector with the greatest error is found.
3) A RADBAS neuron is added with weights equal to that vector.
4) The PURELIN layer weights are redesigned to minimize error.

Below in the figure, we can see a flow chart of the algorithm. It always operates in the same order. There is a graphical interface only for selecting the test set, as that is the only user input. The error is also displayed graphically for each epoch that goes through.


CHAPTER 5 : PHASES OF THE PROJECT

The project had two phases:

Training Phase

Execution Phase

    1. TRAINING PHASE

An artificial neural network is a system based on the operation of biological neural networks; in other words, it is an emulation of a biological neural system. Why would the implementation of artificial neural networks be necessary? Although computing these days is truly advanced, there are certain tasks that a program made for a common microprocessor is unable to perform; even so, a software implementation of a neural network can be made, with its advantages and disadvantages.

1.1. Training of artificial neural networks

A neural network has to be configured such that the application of a set of inputs produces (either 'directly' or via a relaxation process) the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule. We can categorize the learning situations into two distinct sorts. These are:

Supervised learning, or associative learning, in which the network is trained by providing it with input and matching output patterns. These input-output pairs can be provided by an external teacher, or by the system which contains the neural network (self-supervised).


Unsupervised learning, or self-organization, in which an (output) unit is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.

Reinforcement learning: This type of learning may be considered an intermediate form of the above two types of learning. Here the learning machine performs some action on the environment and gets a feedback response from the environment. The learning system grades its action as good (rewarding) or bad (punishable) based on the environmental response, and accordingly adjusts its parameters. Generally, parameter adjustment is continued until an equilibrium state occurs, following which there will be no more changes in its parameters. Self-organizing neural learning may be categorized under this type of learning.

The Radial Basis Function learning rule falls in the supervised learning category. In supervised learning, the learning rule is provided with a set of examples (the training set) of proper network behavior: inputs p to the network together with the corresponding correct (target) outputs t. As the inputs are applied to the network, the network outputs are compared to the targets. The learning rule is then used to adjust the weights and biases of the network in order to move the network outputs closer to the targets.


1.21. Video capturing

The camera which is used is a web cam with a resolution of 640x480, which gives output in YCbCr format. Using a MATLAB program we convert it into grayscale format. By using MATLAB we take snapshots of the obtained video at an interval of two seconds. The size of the obtained image is 640x480.


1.22. Neural network

35 images of a single gesture are provided to the network after appropriate cropping is done. These 35 gesture images are concatenated to form a single matrix, which forms the input vector to the network. For each and every input vector we assign a target vector. The target vector is taken to be a single image of the gesture, which is a common one for all input vectors.

Similarly all the four gestures are trained to obtain four networks respectively. The networks are saved in the workspace of MATLAB. This acts as the database for classification.


The specific gesture is used to convey a specific command for the execution of an action by the robot. The respective control actions for the gestures given above (in clockwise order) are forward, backward, right, and left movement of the robot.

2. EXECUTION PHASE

In this section we describe a methodology for implementing the execution phase of artificial neural networks (ANN) in hardware devices. The various entities are interconnected to form a neuron that can be mapped to a hardware device. Using a similar approach, neurons are grouped in layers, which are then interconnected to construct an artificial neural network. The methodology is intended to lead a neural network designer through the steps required to take the design into a hardware device, starting with the results provided by a neuro-simulator, obtaining the network parameters and translating them into a fully synthesizable design.

2.1. Execution Phase: Block Diagram


2.11. Steps in execution phase:


A real time hand tracking scheme can be used for capturing the gesture; if a gesture is available in the vision range, it is captured by the camera.

Convert the image to a binary image.

Crop out the unwanted portion with the cropping algorithm.

Give it to the neural net to perform classification.

Based on the stored database, the neural net will return the classified result.

The neural net classification output is used to drive the hardware.

The Arduino is used for hardware interfacing.

The control commands are sent to the Arduino from MATLAB directly.

On the basis of this command the motion of the robot is adjusted (a sketch of this loop is given below).
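A minimal MATLAB sketch of this execution loop (illustrative throughout: vid is a videoinput object as in Chapter 2, templates holds one reference image per gesture, a is the arduino object from Chapter 6, imgcrop is the cropping function from Chapter 9, and the pin numbers are assumptions):

pins = [2 3 4 5];                        % assumed pins: forward, backward, right, left
while true
    frame = getsnapshot(vid);            % capture a frame from the webcam
    bw = im2bw(rgb2gray(frame), 0.5);    % convert the image to a binary image
    hand = imgcrop(double(bw));          % crop out the unwanted portion
    scores = zeros(1, 4);
    for g = 1:4                          % classify against the stored database
        scores(g) = corr2(hand, templates{g});
    end
    [~, gesture] = max(scores);
    for k = 1:4                          % drive the hardware through the Arduino
        a.digitalWrite(pins(k), double(k == gesture));
    end
    pause(2);                            % one gesture decision every two seconds
end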

CHAPTER 6 : INTERFACING USING ARDUINO

As we already discussed, different functions corresponding to each meaningful hand gesture are written and stored in the database for controlling the robot. That is, for each meaningful hand gesture we want to make the robot move in the corresponding direction. To make this happen, for each gesture identified, MATLAB should send the corresponding control signal to the robot. The interfacing of MATLAB with the hardware is done using the Arduino. It simplifies the serial communication between MATLAB and the hardware. Another advantage of using the Arduino is that it can be controlled directly from MATLAB by creating an Arduino object inside the software itself. This is possible through the MATLAB support package for Arduino, which uses the serial port interface in MATLAB for communication with the Arduino board.


    "!$!@ support package for !rduino comes with a server program adiosrv.pde that can be

    downloaded on the !rduino board. nce downloaded, the adiosrv.pde will2

    %. 1ait and listen for "!T$!@ commands

    3. ?pon receiving "!T$!@ command, execute and return the result

    The steps in writedownload the server program to !rduino board is the following2

    %. pen !rduino I4:3. Select the adiosrv.pde file by navigating through 5ile pen in the I4:

    A. 9lick the upload button

    1. TYPICAL USAGE

    "ake sure the board is connected to the computer via ?S@ port, make sure to know which

    serial port the !rduino is connected to 7this is the same port found at the end of the drivers

    installation step8, and finally, make sure that the port is not used by the I4: 7in other words,

    the I4: must be closed or disconnected8, so that "!T$!@ can use the serial connection.

    5rom "!T$!@, launch the command aLarduino7DportD8 where DportD is the 9" port to

    which the !rduino is connected to, e.g. D9"BD or D9"=D on 1indows, or DdevttyS%(%D on

    ?nix 7use Ddevtty?S@(D for !rduino versions prior to ?no8 or D4:"D 7see below for the4:" mode8 and make sure the function terminates successfully.

    Then use the commands a.pin"ode, 7to change the mode of each pin between input and

    output8 a.digital/ead, a.digital1rite, a.analog/ead, and a.analog1rite, to perform digital

    input, digital output, analog input, and analog output.

2. EXAMPLE

The previous LED Blink example can be directly implemented in MATLAB as follows ('COM5' stands for whichever COM port the board is on):

a = arduino('COM5');      % set up the Arduino object by specifying the COM port
a.pinMode(13,'output');   % declare the 13th pin as output
while(1)                  % execute the infinite loop
    a.digitalWrite(13,1); % set the LED on
    pause(1);             % wait for a second
    a.digitalWrite(13,0); % set the LED off
    pause(1);             % wait for a second
end

    CHAPTER 7 : ROBOT CONTROL

The robot uses two wheel control, with a castor wheel provided for support.

1. MOTOR DRIVER IC L293D


When the enable input is high, the associated driver gets enabled. As a result, the outputs become active and work in phase with their inputs. Similarly, when the enable input is low, that driver is disabled, and its outputs are off and in the high-impedance state. The L293D has an output current of 600mA and a peak output current of 1.2A per channel. Moreover, for protection of the circuit from back EMF, output diodes are included within the IC. The output supply (VCC2) has a wide range, from 4.5V to 36V.

Thus if both motors are moving in the forward direction, the robot moves forward, while if both motors are moving in the reverse direction, the robot moves backward. If the left motor is in forward motion and the right motor is in reverse motion, then the robot turns to its right, while if the right motor is in forward motion and the left motor is in reverse motion, then the robot turns to its left. Thus with appropriate control instructions from the Arduino, we can control the movement of the robot.
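A minimal MATLAB sketch of this drive logic using the Arduino interface from Chapter 6 (the pin numbers are assumptions; they depend on how the L293D inputs are wired to the Arduino):

function drive(a, leftFwd, rightFwd)
% leftFwd/rightFwd: 1 = forward, 0 = reverse for each motor (illustrative).
a.digitalWrite(2, leftFwd);        % left motor, L293D input 1 (assumed pin)
a.digitalWrite(3, 1 - leftFwd);    % left motor, L293D input 2 (assumed pin)
a.digitalWrite(4, rightFwd);       % right motor, L293D input 3 (assumed pin)
a.digitalWrite(5, 1 - rightFwd);   % right motor, L293D input 4 (assumed pin)
end

With this convention, drive(a,1,1) moves forward, drive(a,0,0) moves backward, drive(a,1,0) turns right, and drive(a,0,1) turns left.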


2. DC GEARED MOTOR

The geared instrument DC motor is ideally suited to a wide range of applications requiring a combination of low speed operation and small unit size. The integral iron core DC motor provides smooth operation and a bidirectional variable speed capability, while the gearhead utilizes a multistage metal spur gear train rated for a working torque of up to 0.2 Nm. The unit, which is suitable for mounting in any attitude, provides reliable operation over a wide ambient temperature range and is equipped with an integral VDR (voltage dependent resistor) electrical suppression system to minimize electrical interference. The unit offers a range of gear ratio options for operating speeds from 5-200 rpm and is ideally suited to applications where small size and low unit price are important design criteria.

In DC geared motors, as the name suggests, the motors spin freely under the command of a controller. This makes them easier to control, as the controller knows exactly how far they have rotated, without having to use a sensor. Therefore they are used on many rolling wheels. The motor has six wires coming out of it. Four of them are used for receiving data from the microcontroller board (Arduino) for its movement, one is connected to a 12V DC supply to drive the motor, and the other wire is connected to a 5V DC supply to power the driver IC.


CHAPTER 8 : HARDWARE SETUP

    CHAPTER < : SOURCE CODE

    1. PROGRAM TO TRAIN THE NEURAL NETWORK

%% This program is used to make the neural network for the identification of gesture1.
%% The same program can be used for other gestures by replacing the sample images
%% with the images corresponding to that gesture.

clc;
clear all;

%% Load the sample images (of different poses) to make the net for gesture1
img1=imread('firsgesture1.png');
img2=imread('firsgesture2.png');
img3=imread('firsgesture3.png');
img4=imread('firsgesture4.png');
img5=imread('firsgesture5.png');
img6=imread('firsgesture6.png');
img7=imread('firsgesture7.png');
img8=imread('firsgesture8.png');
img9=imread('firsgesture9.png');
img10=imread('firsgesture10.png');
img11=imread('firsgesture11.png');
img12=imread('firsgesture12.png');
img13=imread('firsgesture13.png');
img14=imread('firsgesture14.png');
img15=imread('firsgesture15.png');
img16=imread('firsgesture16.png');
img17=imread('firsgesture17.png');
img18=imread('firsgesture18.png');
img19=imread('firsgesture19.png');
img20=imread('firsgesture20.png');
img21=imread('firsgesture21.png');
img22=imread('firsgesture22.png');
img23=imread('firsgesture23.png');
img24=imread('firsgesture24.png');
img25=imread('firsgesture25.png');
img26=imread('firsgesture26.png');
img27=imread('firsgesture27.png');
img28=imread('firsgesture28.png');
img29=imread('firsgesture29.png');
img30=imread('firsgesture30.png');
img31=imread('firsgesture31.png');
img32=imread('firsgesture32.png');
img33=imread('firsgesture33.png');
img34=imread('firsgesture34.png');
img35=imread('firsgesture35.png');

    WW 9rop the sample images

    imgcpd%Limcrop7a%8J

    imgcpd3Limcrop7a38J

    imgcpdALimcrop7aA8J

    imgcpd;Limcrop7a;8J

    imgcpdBLimcrop7aB8J

    imgcpd'Limcrop7a'8JimgcpdVLimcrop7aV8J

  • 7/24/2019 Hand Gesture Imp

    36/46

    imgcpd=Limcrop7a=8J

    imgcpd&Limcrop7a&8J

    imgcpd%(Limcrop7a%(8J

    imgcpd%%Limcrop7a%%8J

    imgcpd%3Limcrop7a%38J

    imgcpd%ALimcrop7a%A8Jimgcpd%;Limcrop7a%;8J

    imgcpd%BLimcrop7a%B8J

    imgcpd%'Limcrop7a%'8J

    imgcpd%VLimcrop7a%V8J

    imgcpd%=Limcrop7a%=8J

    imgcpd%&Limcrop7a%&8J

    imgcpd3(Limcrop7a3(8J

    imgcpd3%Limcrop7a3%8J

    imgcpd33Limcrop7a338J

    imgcpd3ALimcrop7a3A8J

    imgcpd3;Limcrop7a3;8Jimgcpd3BLimcrop7a3B8J

    imgcpd3'Limcrop7a3'8J

    imgcpd3VLimcrop7a3V8J

    imgcpd3=Limcrop7a3=8J

    imgcpd3&Limcrop7a3&8J

    imgcpdA(Limcrop7aA(8J

    imgcpdA%Limcrop7aA%8J

    imgcpdA3Limcrop7aA38J

    imgcpdAALimcrop7aAA8J

    imgcpdA;Limcrop7aA;8J

    imgcpdABLimcrop7aAB8J

    WW Train the network. p is the input vector and t is the target vector

    pLGimgcpd%,imgcpd3,imgcpdA,imgcpd;,imgcpdB,imgcpd',imgcpdV,imgcpd=,imgcpd&,

    imgcpd%(,imgcpd%%,imgcpd%3,imgcpd%A,imgcpd%;,imgcpd%B,imgcpd%',imgcpd%V,

    imgcpd%=,imgcpd%&,imgcpd3(,imgcpd3%,imgcpd33,imgcpd3A,imgcpd3;,imgcpd3B,

    imgcpd3',imgcpd3V,imgcpd3=,imgcpd3&,imgcpdA(,imgcpdA%,imgcpdA3,imgcpdAA,

    imgcpdA;,imgcpdABHJ

    tLGimgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,

    imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,

    imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%,imgcpd%HJ

    pLdouble7p8J

    tLdouble7t8J

    netLnewrb7p,t,.((%,.%,%(((,3B8JWtrain using /@5 net

    save7Dgesture%netD,DnetD8J Wsave the network
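A quick sanity check of the stored network (a sketch, assuming the files above are on the MATLAB path) is to simulate it on one of its own training poses; the output should be close to the cropped sample itself, since pose 1 is also the target:

    load('gesture1net', 'net');                    % restores the variable 'net'
    sample = imgcrop(imread('firsgesture1.png'));  % training pose 1 (also the target)
    out = sim(net, sample);                        % simulate the RBF net
    fprintf('mean abs deviation: %g\n', mean(abs(out(:) - sample(:))));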

    2. PROGRAM TO CROP THE CAPTURED IMAGE

%% MATLAB function imgcrop to crop the captured image
function imgout = imgcrop(imgin)          % function definition
imgin = imresize(imgin, [240,240]);       % resize the original image
columnsum = sum(imgin);                   % find the sum of column entries
rowsum = sum(imgin');                     % find the sum of row entries
q1 = 1;
q2 = 240;
q3 = 1;
q4 = 240;
for w = 1:240                             % search for the left vertical cropping boundary
    if (columnsum(1,w) >= 10)
        q1 = w;
        break;
    end
end
for w = q1:240                            % search for the right vertical cropping boundary
    if (columnsum(1,w) >= 10)
        q2 = w;
    end
end
for w = 1:240                             % search for the upper horizontal cropping boundary
    if (rowsum(1,w) >= 10)
        q3 = w;
        break;
    end
end
for w = q3:240                            % search for the lower horizontal cropping boundary
    if (rowsum(1,w) >= 50)
        q4 = w;
    end
end
% crop the image between the boundaries, pad and resize to the working size
imgout = imgin(q3:q4, q1:q2);
imgout = [zeros(q4-q3+1, 160-(q2-q1+1)), imgout; zeros(120-(q4-q3+1), 160)];
imgout = imresize(imgout, [120,160]);
imgout = double(imgout);                  % convert the image type to double for future use
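A typical call looks as follows (a sketch; the 0.75 binarization threshold is the one used in the real time script of section 3):

    bw  = im2bw(imread('refimg1.png'), .75);  % binarize a captured or stored frame
    roi = imgcrop(bw);                        % 120x160 double with the hand region isolated
    imshow(roi);                              % inspect the cropping result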

    ". PROGRAM OR REAL TIME HAND TRACKING AND GESTURE

    IDENTIICATION

%% MATLAB program for real time hand tracking and gesture identification
clc;
clear all;
close all;
q = 0;                                    % initialize the switch variable
a = arduino('COM4');                      % initialize the arduino object
a.pinMode(13,'output');                   % define pin 13 as output pin
a.pinMode(12,'output');                   % define pin 12 as output pin
a.pinMode(11,'output');                   % define pin 11 as output pin
a.pinMode(10,'output');                   % define pin 10 as output pin

closepreview;
vid = videoinput('winvideo', 1, 'YUY2_320x240'); % specify the video adaptor
src = getselectedsource(vid);
vid.ReturnedColorspace = 'grayscale';     % define the color format to GRAYSCALE
vid.FramesPerTrigger = 1;
preview(vid);                             % preview the video object

% load the network for each gesture (each .mat file stores a variable 'net')
g1 = load('gesture1net'); gesture1net = g1.net;
g2 = load('gesture2net'); gesture2net = g2.net;
g3 = load('gesture3net'); gesture3net = g3.net;
g4 = load('gesture4net'); gesture4net = g4.net;

imgref1 = imread('refimg1.png');          % load the reference image for each gesture
imgref2 = imread('refimg2.png');
imgref3 = imread('refimg3.png');
imgref4 = imread('refimg4.png');
imgcrpd1 = imgcrop(imgref1);              % crop the reference images
imgcrpd2 = imgcrop(imgref2);
imgcrpd3 = imgcrop(imgref3);
imgcrpd4 = imgcrop(imgref4);
imgtrained1 = sim(gesture1net, imgcrpd1); % pass the cropped references through the nets
imgtrained2 = sim(gesture2net, imgcrpd2);
imgtrained3 = sim(gesture3net, imgcrpd3);
imgtrained4 = sim(gesture4net, imgcrpd4);

while(1)
    preview(vid);                         % preview the video object
    currentimg = getsnapshot(vid);        % capture the image of interest
    currentimg = im2bw(currentimg, .75);  % convert the captured image to binary
    imshow(currentimg);                   % display the binary image
    cropedimg = imgcrop(currentimg);      % crop the binary image

    y = sim(gesture1net, cropedimg);      % simulate using the first net
    corrwgt1 = xcorr2(imgtrained1, y);    % find out the correlation
    corrwgt1 = im2col(corrwgt1, [1 1]);   % correlation weights to a column vector
    corrwgt1 = sum(corrwgt1);             % take the column value sum
    y = sim(gesture2net, cropedimg);      % simulate using the second net
    corrwgt2 = xcorr2(imgtrained2, y);
    corrwgt2 = im2col(corrwgt2, [1 1]);
    corrwgt2 = sum(corrwgt2);
    y = sim(gesture3net, cropedimg);      % simulate using the third net
    corrwgt3 = xcorr2(imgtrained3, y);
    corrwgt3 = im2col(corrwgt3, [1 1]);
    corrwgt3 = sum(corrwgt3);
    y = sim(gesture4net, cropedimg);      % simulate using the fourth net
    corrwgt4 = xcorr2(imgtrained4, y);
    corrwgt4 = im2col(corrwgt4, [1 1]);
    corrwgt4 = sum(corrwgt4);
    corrwghtmatrix = [corrwgt1 corrwgt2 corrwgt3 corrwgt4];

    % define the switch variable by thresholding the correlation weights
    % (the numeric thresholds are empirical values tuned for this camera,
    % lighting and training set)
    w = corrwghtmatrix;
    if (w(1) > 3e7)
        q = 1;                            % gesture 1 -> forward
    elseif (w(3) > 1e7 && w(3) < 6e7)
        q = 3;                            % gesture 3 -> right
    elseif (w(2) > 1e7)
        q = 2;                            % gesture 2 -> backward
    elseif (w(4) > 6e7)
        q = 4;                            % gesture 4 -> left
    else
        q = 0;                            % no gesture recognized
    end

    % generation of control signals to the motor driver using the switch method
    switch(q)
        case 1
            display('FORWARD');
            a.digitalWrite(13,0);
            a.digitalWrite(12,1);
            a.digitalWrite(11,0);
            a.digitalWrite(10,1);
        case 2
            display('BACKWARD');
            a.digitalWrite(13,1);
            a.digitalWrite(12,0);
            a.digitalWrite(11,1);
            a.digitalWrite(10,0);
        case 3
            display('RIGHT');
            a.digitalWrite(13,0);
            a.digitalWrite(12,1);
            a.digitalWrite(11,0);
            a.digitalWrite(10,0);
        case 4
            display('LEFT');
            a.digitalWrite(13,0);
            a.digitalWrite(12,0);
            a.digitalWrite(11,0);
            a.digitalWrite(10,1);
        otherwise
            display('NO MOTION');
            a.digitalWrite(13,0);
            a.digitalWrite(12,0);
            a.digitalWrite(11,0);
            a.digitalWrite(10,0);
    end
end
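The matching step of the loop above reduces each gesture hypothesis to a single scalar; in isolation (a minimal sketch with stand-in random arrays rather than project data) it reads:

    ref = rand(120,160);                           % stands in for a stored imgtrained output
    img = ref + 0.05*rand(120,160);                % stands in for the net output on a new frame
    score = sum(im2col(xcorr2(ref, img), [1 1]))   % one correlation weight, as in the loop

The gesture whose weight clears its threshold wins; everything else maps to NO MOTION.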

4. PROGRAM TO BE BURNED ON ARDUINO BOARD


adiosrv.pde

/* Analog and Digital Input and Output Server for MATLAB */

/* This file is meant to be used with the MATLAB arduino IO package; however, it can be
used from the IDE environment (or any other serial terminal) by typing commands like:
0e0 : assigns digital pin #4 (e) as input
0f1 : assigns digital pin #5 (f) as output
0n1 : assigns digital pin #13 (n) as output
1c  : reads digital pin #2 (c)
1e  : reads digital pin #4 (e)
2n0 : sets digital pin #13 (n) low
2n1 : sets digital pin #13 (n) high
2f1 : sets digital pin #5 (f) high
2f0 : sets digital pin #5 (f) low
4j2 : sets digital pin #9 (j) to 50=ascii(2) over 255
4jz : sets digital pin #9 (j) to 122=ascii(z) over 255
3a  : reads analog pin #0 (a)
3f  : reads analog pin #5 (f)
R0  : sets analog reference to DEFAULT
R1  : sets analog reference to INTERNAL
R2  : sets analog reference to EXTERNAL */

/* Nothing gets executed in the loop if nothing is available to be read, but as soon as
anything becomes available, then the part coded after the if statement (that is, the
real stuff) gets executed */

if (Serial.available() > 0) {

    /* whatever is available from the serial is read here */
    val = Serial.read();

    /* This part basically implements a state machine that reads the serial port and makes just one
       transition to a new state, depending on both the previous state and the command that is read
       from the serial port. Some commands need additional inputs from the serial port, so they need
       2 or 3 state transitions (each one happening as soon as anything new is available from the
       serial port) to be fully executed. After a command is fully executed the state returns to its
       initial value s=-1 */

    switch (s) {

        /* s=-1 means NOTHING RECEIVED YET ***************** */
        case -1:
            /* calculate next state */
            if (val > 47 && val < 90) {
                /* the first received value indicates the mode: 49 is ascii for 1, ... 90 is ascii
                   for Z; s=0 is change-pin mode; s=10 is DI; s=20 is DO; s=30 is AI; s=40 is AO;
                   s=90 is query script type (1 basic, 2 motor); s=340 is change analog reference */
                s = 10 * (val - 48);
            }
            /* the following statements are needed to handle unexpected first values coming from
               the serial (if the value is unrecognized then it defaults to s=-1) */
            if ((s > 40 && s < 90) || (s > 90 && s != 340)) {
                s = -1;
            }
            /* the break statement gets out of the switch-case, so
               we go back and wait for new serial data */
            break; /* s=-1 (initial state) taken care of */

        /* s=0 or 1 means CHANGE PIN MODE */
        case 0:
            /* the second received value indicates the pin from abs('c')=99, pin 2,
               to abs('t')=116, pin 19 */
            if (val > 98 && val < 117) {
                pin = val - 97;  /* calculate pin */
                s = 1;           /* next we will need to get 0 or 1 from serial */
            }
            else {
                s = -1;          /* if value is not a pin then return to -1 */
            }
            break; /* s=0 taken care of */

        case 1:
            /* the third received value indicates the value 0 or 1 */
            if (val > 47 && val < 50) {
                /* set pin mode */
                if (val == 48) {
                    pinMode(pin, INPUT);
                }
                else {
                    pinMode(pin, OUTPUT);
                }
            }
            s = -1; /* we are done with CHANGE PIN so go to -1 */
            break;  /* s=1 taken care of */

        /* s=10 means DIGITAL INPUT ************************* */

        case 10:
            /* the second received value indicates the pin from abs('c')=99, pin 2,
               to abs('t')=116, pin 19 */
            if (val > 98 && val < 117) {
                pin = val - 97;          /* calculate pin */
                dgv = digitalRead(pin);  /* perform Digital Input */
                Serial.println(dgv);     /* send value via serial */
            }
            s = -1; /* we are done with DI so next state is -1 */
            break;  /* s=10 taken care of */

        /* s=20 or 21 means DIGITAL OUTPUT ****************** */

        case 20:
            /* the second received value indicates the pin from abs('c')=99, pin 2,
               to abs('t')=116, pin 19 */
            if (val > 98 && val < 117) {
                pin = val - 97;  /* calculate pin */
                s = 21;          /* next we will need to get 0 or 1 from serial */
            }
            else {
                s = -1;          /* if value is not a pin then return to -1 */
            }
            break; /* s=20 taken care of */

        case 21:
            /* the third received value indicates the value 0 or 1 */
            if (val > 47 && val < 50) {
                dgv = val - 48;          /* calculate value */
                digitalWrite(pin, dgv);  /* perform Digital Output */
            }
            s = -1; /* we are done with DO so next state is -1 */
            break;  /* s=21 taken care of */

        /* s=30 means ANALOG INPUT ************************** */

        case 30:
            /* the second received value indicates the pin from abs('a')=97, pin 0, to
               abs('f')=102, pin 5; note that these are the digital pins from 14 to 19
               located in the lower right part of the board */
            if (val > 96 && val < 103) {
                pin = val - 97;         /* calculate pin */
                agv = analogRead(pin);  /* perform Analog Input */
                Serial.println(agv);    /* send value via serial */
            }
            s = -1; /* we are done with AI so next state is -1 */
            break;  /* s=30 taken care of */

        /* s=40 or 41 means ANALOG OUTPUT ******************* */

        case 40:
            /* the second received value indicates the pin from abs('c')=99, pin 2,
               to abs('t')=116, pin 19 */
            if (val > 98 && val < 117) {
                pin = val - 97;  /* calculate pin */
                s = 41;          /* next we will need to get a value from serial */
            }
            else {
                s = -1;          /* if value is not a pin then return to -1 */
            }
            break; /* s=40 taken care of */

        case 41:
            /* the third received value indicates the analog value */
            analogWrite(pin, val);  /* perform Analog Output */
            s = -1; /* we are done with AO so next state is -1 */
            break;  /* s=41 taken care of */

        /* s=90 means Query Script Type (1 basic, 2 motor) * */
        case 90:
            if (val == 57) {
                /* if the string sent is 99 send the script type via serial */
                Serial.println(1);
            }
            s = -1; /* we are done with this so next state is -1 */
            break;  /* s=90 taken care of */

        /* s=340 or 341 means ANALOG REFERENCE ************* */
        case 340:
            /* the second received value indicates the reference, which is encoded as
               0, 1, 2 for DEFAULT, INTERNAL and EXTERNAL respectively */
            switch (val) {
                case 48: analogReference(DEFAULT);  break;  /* 48 is ascii '0' */
                case 49: analogReference(INTERNAL); break;
                case 50: analogReference(EXTERNAL); break;
                default: break;
            }
            s = -1; /* we are done with this so next state is -1 */
            break;  /* s=340 taken care of */

    } /* end switch on state s */
  } /* end if serial available */
} /* end loop statement */
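Because the sketch only parses plain ASCII commands, the protocol documented in its header can be exercised from MATLAB even without the IO package. This is a minimal sketch: 'COM4' matches the main script above, while the 115200 baud rate is an assumption about the sketch's Serial.begin setting:

    s = serial('COM4', 'BaudRate', 115200);  % baud rate assumed; adjust to the sketch
    fopen(s);
    fprintf(s, '0n1');  % assign digital pin #13 (n) as output
    fprintf(s, '2n1');  % set digital pin #13 (n) high
    fprintf(s, '2n0');  % set digital pin #13 (n) low
    fclose(s);
    delete(s);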

CHAPTER 10 : CHALLENGES FACED

There are many challenges associated with the accuracy and usefulness of gesture recognition software. For image-based gesture recognition there are limitations on the equipment used and on image noise. Images or video may not be under consistent lighting, or in the same location. Items in the background or distinct features of the users may make recognition more difficult. The variety of implementations for image-based gesture recognition may also cause issues for the viability of the technology in general usage. For example, an algorithm calibrated for one camera may not work for a different camera. The amount of background noise also causes tracking and recognition difficulties, especially when occlusions (partial and full) occur. Furthermore, the distance from the camera, and the camera's resolution and quality, also cause variations in recognition accuracy. In order to capture human gestures by visual sensors, robust computer vision methods are also required, for example for hand tracking and hand posture recognition, or for capturing movements of the head, facial expressions or gaze direction. The major challenges faced in the completion of the project were:

Real-time tracking of the gesture in a noisy atmosphere.

Setting of the threshold value for gesture classification.

Time taken for training a gesture.

Variation in illumination of the surroundings.

CHAPTER 11 : RESULTS, ANALYSIS, CONCLUSION AND FUTURE WORKS

    RESULTS AND ANALYSIS

We have conducted experiments based on images we have acquired using a simple webcam, and we have collected these data on uniform backgrounds. Figure 10 illustrates the real time hand tracking and gesture identification for the command of forward motion.

Our algorithm leads to the correct recognition of all four gestures.

Also, the computation time needed to obtain these results is very small, since the algorithm is quite simple. Thirty five images of a single gesture were taken for the training purpose,

and the network was formed. The real time image was captured by the webcam and given to the net for classification. Based on the stored database, the neural net returns the classified result. This neural net classification output was used to drive the hardware, interfaced with the help of the Arduino. The control commands were sent to the Arduino directly from MATLAB, and on the basis of these commands the motion of the ROBOT was adjusted.

We have noted that images taken under insufficient light have led to incorrect results. In these cases the failure mainly stems from the

erroneous segmentation of some background portions as the hand region.


Our algorithm appears to perform well on uniform backgrounds with appropriate illumination. Overall, we find the performance of this simple algorithm quite satisfactory in the context of our motivating robot control application.

    CONCLUSION AND FUTURE WORKS

We proposed a fast and simple algorithm for hand gesture recognition for controlling a robot, and we have demonstrated the effectiveness of this computationally efficient algorithm on real images we have acquired. In our system of gesture controlled robots, we have only considered a limited number of gestures. Our algorithm can be extended in a number of ways to recognize a broader set of gestures. The gesture recognition portion of our algorithm is still simple, and would need to be improved if this technique were to be used in challenging operating conditions. Reliable performance of hand gesture recognition techniques in a general setting requires dealing with occlusions, temporal tracking for recognizing dynamic gestures, as well as 3D modeling of the hand, which are still

mostly beyond the current state of the art. The advantage of using neural networks is that conclusions can be drawn from the network output: if a vector is not classified correctly, we can check its output and work out a solution. Even with limited processing power, very efficient algorithms can still be designed: an advanced DSP processor can reduce the size of the module, recognize "static" gestures more reliably, and support other biometric uses. Our software has been designed to be reusable, so that more complex behaviors may be added to our work. Because we limited ourselves to low processing power, our work could easily be made more performant by adding a state-of-the-art processor. The use of a real embedded OS could improve our system in terms of speed and stability. In addition, implementing more sensor modalities would improve robustness even in very complex scenes. Our system has shown that interaction with machines through gestures is a feasible task, and the set of detected gestures could be extended with more commands by implementing a more complex model of an advanced vehicle, usable not only in a limited space but also in broader areas such as roads. In the future, a service robot could execute many different tasks, from private movement to a fully-fledged advanced automotive that can enable the disabled in every sense.

    SNAPSHOTS
