    ARTIFICIAL NEURAL NETWORK APPROACH FOR HAND GESTURE RECOGNITION

    MISS. SHWETA K. YEWALE

    ME Student

    Prof. Ram Meghe Institute of Technology & Research, Badnera

    MR. PANKAJ K. BHARNE

    ME Student

    Sipna College of Engg. & Technology, Amravati

    Abstract:

    Gesture recognition is important for developing alternative human-computer interaction modalities. It enables humans to interface with machines in a more natural way. Several algorithms are available for recognizing gestures, and several approaches to gesture recognition can be implemented in MATLAB. Artificial neural networks (ANNs) are flexible in a changing environment. This paper gives an overview of ANNs for gesture recognition and describes the process of gesture recognition using an ANN.

    Keywords: Gesture Recognition; Artificial Neural Network; Histogram; Vectorization; Classification.

    1. Introduction

    Human hand gestures provide the most important means for non-verbal interaction among people. They range from simple manipulative gestures that are used to point at and move objects around to more complex communicative ones that express our feelings and allow us to communicate with others.

    Man-machine interfaces based on hand gesture recognition have been developed vigorously in recent years. Because of the effects of lighting and complex backgrounds, most visual hand gesture recognition systems work only in restricted environments.

    Many methods for hand gesture recognition using visual analysis have been proposed. Sebastian Marcel, Oliver Bernier, Jean Emmanuel Viallet and Daniel Collobert have proposed an approach using Input-Output Hidden Markov Models [1]. Xia Liu and Kikuo Fujimura have proposed hand gesture recognition using depth data [2]. For hand detection, many approaches use color or motion information [3, 4]. Attila Licsar and Tamas Sziranyi have developed a hand gesture recognition system based on shape analysis of the static gesture [5]. Another method, proposed by E. Stergiopoulou and N. Papamarkos [6], detects the hand region through color segmentation. Byung-Woo Min, Ho-Sub Yoon, Jung Soh, Yun-Mo Yangc and Toshiaki Ejima have suggested a method of hand gesture recognition using Hidden Markov Models [7]. Another very important method is suggested by Meide Zhao, Francis K.H. Quek and Xindong Wu [8], who used AQ Family Algorithms and R-MINI Algorithms for the detection of hand gestures.

    Another efficient technique uses Fast Multi-Scale Analysis for the recognition of hand gestures, as suggested by Yikai Fang, Jian Cheng, Kongqiao Wang and Hanqing Lu [9], but this method is computationally expensive. Chris Joslin et al. have suggested a method for enabling dynamic gesture recognition for hand gestures [10]. Rotation-invariant methods are widely used for texture classification and recognition. Timi Ojala et al. have suggested a method for texture classification using Local Binary Patterns [11].

    Gestures are expressive, meaningful body motions, i.e., physical movements of the fingers, hands, arms, head, face, or body with the intent to convey information or interact with the environment. Gestures can exist in isolation or involve external objects. [12]

    Shweta K. Yewale et al. / International Journal of Engineering Science and Technology (IJEST)

    ISSN : 0975-5462 Vol. 3 No. 4 April 2011 2603

    Gesture recognition is the process by which gestures made by the user are made known to the system. [13] Gesture recognition is also important for developing alternative human-computer interaction modalities [14]. It enables humans to interface with machines in a more natural way.

    2. Artificial Neural Networks

    Neural nets represent an approach to Artificial Intelligence that attempts to model the human brain. Neurons are processing units that operate in parallel inside the human brain. There are an estimated 10 billion neurons in the human brain with about 60 trillion connections between these neurons. Each neuron receives inputs from other neurons in the form of tiny electrical signals and, likewise, it also outputs electrical signals to other neurons. These outputs are weighted in the sense that the neuron does not fire any output unless a certain threshold/bias is reached. These weights can be altered through learning experiences; this is how the brain learns. The brain is therefore a network of neurons acting in parallel: a neural network.

    Similarly, an Artificial Neural Net consists of artificial neurons, which are mathematical models of biological neurons. Like the biological neuron, an artificial neuron (called a perceptron) receives numerical values and also outputs a numerical value. Figure 1 shows a representation of an artificial neuron.

    Figure 1. Representation of an Artificial Neuron

    The input into the perceptron consists of the numerical input values, each multiplied by a weight, summed together with a bias. The perceptron only fires an output when the total strength of the input signals exceeds a certain threshold. As in biological neural networks, this output is fed to other perceptrons.

    The weighted input to a perceptron is acted upon by a function (the transfer function) and this will determine

    the activation or output. Common transfer functions used in Artificial Neural networks include the Hard Limiter,

    Log-Sigmoid and the Sign function.
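    As a minimal sketch of the neuron described above, the following Python code computes the weighted sum plus bias and applies each of the three transfer functions named in the text. The input values, weights and bias are hypothetical and chosen only for illustration, and the convention that the hard limiter returns 1 at exactly zero is an assumption.

```python
import math

def hard_limiter(n):
    """Hard limiter: 1 if the net input is non-negative, else 0 (assumed convention)."""
    return 1.0 if n >= 0 else 0.0

def log_sigmoid(n):
    """Log-sigmoid: squashes the net input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def sign(n):
    """Sign function: -1, 0 or +1 depending on the sign of the net input."""
    return (n > 0) - (n < 0)

def neuron_output(inputs, weights, bias, transfer):
    """Weighted sum of the inputs plus the bias, passed through a transfer function."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return transfer(net)

# Hypothetical values: the net input is 1.0*0.4 + 0.5*(-0.6) - 0.2, about -0.1.
inputs = [1.0, 0.5]
weights = [0.4, -0.6]
bias = -0.2

fired = neuron_output(inputs, weights, bias, hard_limiter)   # does not fire
graded = neuron_output(inputs, weights, bias, log_sigmoid)   # slightly below 0.5
signed = neuron_output(inputs, weights, bias, sign)          # negative net input
```

    The hard limiter and sign function give the all-or-nothing firing behaviour described above, while the log-sigmoid gives a graded output.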

    3. Gesture Recognition Using Artificial Neural Networks

    An artificial neural network involves a network of simple processing elements (artificial neurons) which can exhibit complex global behavior, determined by the connections between the processing elements and the element parameters. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations. This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical. The tasks to which artificial neural networks are applied include classification (including pattern and sequence recognition), novelty detection and sequential decision making.

    The supervised learning paradigm is also applicable to sequential data (e.g., for speech and gesture recognition). In MATLAB, feedforward and backpropagation algorithms are used for gesture recognition. [15]

    3.1 Backpropagation

    Backpropagation is a supervised learning technique used for training artificial neural networks. It was first described by Paul Werbos in 1974, and further developed by David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams in 1986.

    It is most useful for feed-forward networks (networks that have no feedback, or simply, that have no

    connections that loop). The term is an abbreviation for "backwards propagation of errors". Backpropagation

    requires that the transfer function used by the artificial neurons (or "nodes") be differentiable. Figure 2 shows the Backpropagation Network.

    Figure 2. Backpropagation Network

    Actual Algorithm:

    (1) Initialize the weights in the network (often randomly)

    (2) Repeat

        for each example e in the training set do

            O = neural-net-output (network, e); forward pass

    (3)     T = teacher output for e

    (4)     Calculate error (T - O) at the output units

    (5)     Compute delta_wi for all weights from hidden layer to output layer; backward pass

    (6)     Compute delta_wi for all weights from input layer to hidden layer; backward pass continued

    (7)     Update the weights in the network

        end

    (8) until all examples classified correctly or stopping criterion satisfied

    (9) Return (network)
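    The numbered steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' MATLAB implementation: it assumes one hidden layer, log-sigmoid units, a squared-error measure, a fixed number of epochs as the stopping criterion, and a hypothetical learning rate of 0.5. The function names and the logical-OR training set are made up for the example.

```python
import math
import random

def sigmoid(n):
    return 1.0 / (1.0 + math.exp(-n))

def forward(net, x):
    """Forward pass: returns (hidden activations, output activations)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_ih"], net["b_h"])]
    output = [sigmoid(sum(w * h for w, h in zip(ws, hidden)) + b)
              for ws, b in zip(net["w_ho"], net["b_o"])]
    return hidden, output

def total_error(net, examples):
    """Sum of squared errors (T - O)^2 over all examples and output units."""
    return sum((t - o) ** 2
               for x, target in examples
               for t, o in zip(target, forward(net, x)[1]))

def train(examples, n_in, n_hidden, n_out, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    # (1) Initialize the weights randomly (biases start at zero).
    net = {
        "w_ih": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b_h": [0.0] * n_hidden,
        "w_ho": [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)],
        "b_o": [0.0] * n_out,
    }
    for _ in range(epochs):                    # (2) repeat; (8) fixed epoch count
        for x, target in examples:
            hidden, out = forward(net, x)      # forward pass gives O
            # (4)-(5) output deltas: error (T - O) times sigmoid derivative O(1 - O)
            d_out = [(t - o) * o * (1 - o) for t, o in zip(target, out)]
            # (6) hidden deltas: output deltas propagated back through w_ho
            d_hid = [h * (1 - h) * sum(d_out[k] * net["w_ho"][k][j] for k in range(n_out))
                     for j, h in enumerate(hidden)]
            # (7) update the weights and biases
            for k in range(n_out):
                for j in range(n_hidden):
                    net["w_ho"][k][j] += rate * d_out[k] * hidden[j]
                net["b_o"][k] += rate * d_out[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    net["w_ih"][j][i] += rate * d_hid[j] * x[i]
                net["b_h"][j] += rate * d_hid[j]
    return net                                 # (9) return the network

# Hypothetical training set: logical OR of two inputs.
examples = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]
untrained = train(examples, 2, 3, 1, epochs=0)   # same seed, no updates
trained = train(examples, 2, 3, 1)
```

    Comparing total_error(untrained, examples) with total_error(trained, examples) shows how far the weight updates have reduced the error on the training set.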

    As the algorithm's name implies, the errors (and therefore the learning) propagate backwards from the output nodes to the inner nodes. So technically speaking, backpropagation is used to calculate the gradient of the error of the network with respect to the network's modifiable weights. This gradient is almost always then used in a simple stochastic gradient descent algorithm to find weights that minimize the error. Often the term "backpropagation" is used in a more general sense, to refer to the entire procedure encompassing both the calculation of the gradient and its use in stochastic gradient descent. Backpropagation usually allows quick convergence on satisfactory local minima for error in the kind of networks to which it is suited.

    It is important to note that backpropagation networks are necessarily multilayer (usually with one input, one hidden, and one output layer). In order for the hidden layer to serve any useful function, multilayer networks must have non-linear activation functions for the multiple layers: a multilayer network using only linear activation functions is equivalent to some single-layer, linear network. Non-linear activation functions that are commonly used include the logistic function, the softmax function, and Gaussian functions. The backpropagation algorithm for calculating a gradient has been rediscovered a number of times, and is a special case of a more general technique called automatic differentiation in the reverse accumulation mode.
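    The claim that a purely linear multilayer network collapses to a single linear layer can be checked numerically. The sketch below uses hypothetical integer weight matrices (so the comparison is exact) and omits biases for brevity; it shows that applying two linear layers in sequence gives the same result as applying the single matrix W2 * W1.

```python
def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

# Hypothetical weights: 2 inputs -> 3 hidden units -> 2 outputs.
w1 = [[1, 2], [0, -1], [3, 1]]
w2 = [[2, 0, 1], [-1, 1, 0]]
x = [4, -3]

two_layer = matvec(w2, matvec(w1, x))   # hidden layer with identity activation
collapsed = matvec(matmul(w2, w1), x)   # one equivalent linear layer
```

    Both computations give the same output vector, which is why a non-linear activation is needed before the hidden layer adds any representational power.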

    3.2 Feedforward Multilayer Perceptron Network

    The feedforward neural network was the first and arguably simplest type of artificial neural network devised. In this network, the information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network.

    In computing, feed-forward normally refers to a multi-layer perceptron network in which the outputs from all neurons go to following but not preceding layers, so there are no feedback loops. Figure 3 shows a representation of a simple feed-forward neural network with four inputs, one hidden layer and four outputs. Neural networks learn by changing their weights.

    Figure 3. A simple feed-forward Neural Net

    4. Gesture Recognition Process Using MATLAB

    4.1 Procedure

    MATLAB is an interactive system whose basic data element is an array that does not require dimensioning.

    MATLAB is the tool of choice for high-productivity research, development, and analysis.

    The gesture recognition system is shown in Figure 4, which shows the flow of the system for recognizing patterns. A transformation converts an image into a feature vector, which is then compared with the feature vectors of a training set of gestures. [16]

    Figure 4. Gesture Recognition System
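    One simple way to picture the comparison of a feature vector against a training set is a nearest-neighbour lookup. The sketch below is only an illustration of that idea: the Euclidean distance measure, the gesture labels and the three-element feature vectors are all assumptions, and the system described in this paper uses a trained neural network for the final classification.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def nearest_gesture(feature, training_set):
    """Return the label of the training vector closest to `feature`."""
    return min(training_set, key=lambda item: euclidean(feature, item[1]))[0]

# Hypothetical training set of (label, feature vector) pairs.
training_set = [
    ("open_hand", [0.9, 0.1, 0.0]),
    ("fist", [0.1, 0.8, 0.1]),
    ("point", [0.0, 0.2, 0.8]),
]
label = nearest_gesture([0.2, 0.7, 0.1], training_set)
```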

    The procedure of gesture recognition using MATLAB is divided into six steps:

    Step 1 - The first thing for the program to do is to read the image database.

    Step 2 - Resize all the images that were read in Step 1 to 150x140 pixels. This size seems optimal for offering enough detail while keeping the processing time low.

    Step 3 - Next, find the edges. Two filters were used for this:

    For the x direction: x = [0 -1 1]

    For the y direction: y = [0 1 -1]', which is the same as x but transposed and multiplied by -1.

    Figure 5 shows two images of the result with the x-filter and y-filter.

    Figure 5. X-Y filters
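    The effect of the two edge filters can be sketched on a tiny image. The code below is an assumption-laden illustration: it treats each filter as a sliding weighted sum (correlation) centred on the current pixel, so the x-filter yields the right neighbour minus the current pixel and the y-filter yields the current pixel minus the pixel below. It also drops the last column or row instead of padding the border. The paper does not state which convention its MATLAB code used.

```python
def filter_x(image):
    """Apply [0 -1 1] along each row: right neighbour minus current pixel."""
    return [[row[c + 1] - row[c] for c in range(len(row) - 1)] for row in image]

def filter_y(image):
    """Apply [0 1 -1]' down each column: current pixel minus the pixel below."""
    return [[image[r][c] - image[r + 1][c] for c in range(len(image[0]))]
            for r in range(len(image) - 1)]

# Hypothetical 3x4 grayscale image with a vertical edge between columns 1 and 2.
image = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [10, 10, 50, 50],
]
dx = filter_x(image)   # responds to the vertical edge
dy = filter_y(image)   # all zeros: no change from row to row
```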

    Step 4 - Divide the two resulting matrices (images) dx and dy element by element and then take the atan (tan^-1). This gives the gradient orientation.
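    A minimal sketch of this step follows, assuming the quotient is dy / dx and that dx and dy have the same shape. The handling of a zero denominator is an assumption made for the example: a zero dx with non-zero dy maps to plus or minus pi/2, and a pixel with no gradient at all maps to 0.

```python
import math

def gradient_orientation(dx, dy):
    """Element-wise atan(dy / dx); angles are in radians within [-pi/2, pi/2]."""
    result = []
    for row_x, row_y in zip(dx, dy):
        row = []
        for gx, gy in zip(row_x, row_y):
            if gx == 0:
                # Avoid dividing by zero (assumed convention, see lead-in).
                angle = 0.0 if gy == 0 else math.copysign(math.pi / 2, gy)
            else:
                angle = math.atan(gy / gx)
            row.append(angle)
        result.append(row)
    return result

# Hypothetical gradient images.
dx = [[1.0, 0.0], [2.0, -1.0]]
dy = [[1.0, 3.0], [0.0, 1.0]]
theta = gradient_orientation(dx, dy)
```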

    Step 5 - The MATLAB function im2col is then called to rearrange the image blocks into columns. This step is not strictly necessary, but it is required if we want to display the orientation histogram. The MATLAB function rose creates an angle histogram, which is a polar plot showing the distribution of values grouped according to their numeric range. Each group is shown as one bin. Some examples are shown below. While developing the algorithm, these histograms are the fastest way of getting a good idea of how well the detection works. Figure 6 shows the original images that generated the histograms, in the same order.
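    The counting behind an angle histogram can be sketched without any plotting. The function below is a hypothetical stand-in, not MATLAB's rose: it simply counts how many angles fall in each of a number of equal sectors covering a full circle, after wrapping negative angles around.

```python
import math

def angle_histogram(angles, bins=20):
    """Count angles (in radians) in `bins` equal sectors covering [0, 2*pi)."""
    counts = [0] * bins
    width = 2 * math.pi / bins
    for a in angles:
        a = a % (2 * math.pi)                    # wrap negative angles around
        index = min(int(a / width), bins - 1)    # guard against rounding at 2*pi
        counts[index] += 1
    return counts

# Hypothetical orientation values, all within [-pi/2, pi/2] as atan produces.
angles = [0.1, 0.2, math.pi / 4, -0.1, -math.pi / 4]
counts = angle_histogram(angles, bins=4)
```

    With four bins, the counts land only in the first and last quarter of the circle, because every input angle lies between -pi/2 and pi/2.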

    Fig (a) Before Histogram Equalization

    Fig (b) After Histogram Equalization

    Figure 6. Histogram of images

    Step 6 - Convert the column matrix with the radian values to degrees. This way we can scan the vector for values ranging from 0° to 90°. This is because, for real elements of X, atan(X) is in the range [-π/2, π/2]. This can also be seen from the orientation histograms, where values appear only in the first and last quarters.
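    The conversion and the range argument can be illustrated in a few lines. The input values below are hypothetical; the point is only that atan of any real number, once converted to degrees, stays between -90 and 90.

```python
import math

def to_degrees(column):
    """Convert a column of orientation angles from radians to degrees."""
    return [math.degrees(a) for a in column]

# Hypothetical ratios, including very large magnitudes.
column = [math.atan(v) for v in (-1e9, -1.0, 0.0, 1.0, 1e9)]
angles_deg = to_degrees(column)
```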

    4.2 Pre-processing

    The web-cam captures the input at a slow rate of 15-25 frames per second, and an image differencing technique is used to determine the sequence of (x, y) coordinates representing the gesture.
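    A rough sketch of image differencing is shown below. The paper does not say how a coordinate is derived from the difference image, so this example assumes grayscale frames, a hypothetical change threshold of 30, and the centroid of the changed pixels as the (x, y) coordinate for that frame.

```python
def moving_point(prev_frame, frame, threshold=30):
    """Return the (x, y) centroid of pixels that changed by more than
    `threshold` between two grayscale frames, or None if nothing moved."""
    xs, ys = [], []
    for y, (row_prev, row_now) in enumerate(zip(prev_frame, frame)):
        for x, (a, b) in enumerate(zip(row_prev, row_now)):
            if abs(b - a) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Hypothetical 3x4 frames: a bright blob appears in the lower-right corner.
prev_frame = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
frame = [[0, 0, 0, 0], [0, 0, 200, 200], [0, 0, 200, 200]]
point = moving_point(prev_frame, frame)
```

    Applying the function to each pair of consecutive frames would yield one coordinate per frame in which motion was detected.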

    This raw set of (x, y) coordinates will have to be preprocessed before it can be fed into the trained neural net for classification. One of the major limitations of neural nets is that they require a fixed number of inputs. Preprocessing must ensure that this condition is met. This means that a gesture with an inadequate number of inputs must not be passed on to the neural classifier, or it must be enlarged in an appropriate manner to meet the size requirement. A gesture that is too long must be sampled appropriately to fit the exact number of inputs in the neural classifier.
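    One way to meet the fixed-size requirement is to resample every gesture to the same number of points. The sketch below is an assumption, not the method used in the paper: it uses linear interpolation along the point index, which enlarges short gestures and subsamples long ones with the same code path.

```python
def resample(points, n):
    """Return exactly n (x, y) points spread evenly along the index of
    `points`, using linear interpolation between neighbouring points."""
    if not points:
        raise ValueError("cannot resample an empty gesture")
    if len(points) == 1:
        return [points[0]] * n
    if n == 1:
        return [points[0]]
    out = []
    last = len(points) - 1
    for i in range(n):
        pos = i * last / (n - 1)        # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        x = points[lo][0] + frac * (points[hi][0] - points[lo][0])
        y = points[lo][1] + frac * (points[hi][1] - points[lo][1])
        out.append((x, y))
    return out

# Hypothetical gestures: one too short, one too long for a 5-point input.
short_gesture = [(0, 0), (4, 8)]
long_gesture = [(i, 2 * i) for i in range(9)]
enlarged = resample(short_gesture, 5)
reduced = resample(long_gesture, 5)
```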

    The resultant processed input can now be passed into the classifier. Yet further preprocessing can be performed. Preprocessing can also be used to extract further meaning from the raw data, with the interpreted data then passed on to the neural classifier. This has the general effect of improving the performance of neural nets.

    The input sequence of n (x, y) coordinates is preprocessed into a vector sequence, which is then passed into the trained neural net for classification. The general effect of this is improved gesture recognition performance compared to using raw (x, y) coordinates. Scaling can also be introduced to improve performance. Table 1 outlines the vectorization of points representing a gesture. [17]

    Table 1. The vectorization process
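    The contents of Table 1 are not reproduced in this transcript, so the following is only a plausible sketch of vectorization. It assumes that each vector is the displacement between two consecutive points and that "scaling" means normalizing each displacement to unit length; both are assumptions made for illustration.

```python
import math

def vectorize(points, scale=True):
    """Turn n (x, y) points into n - 1 displacement vectors between
    consecutive points; optionally scale each vector to unit length."""
    vectors = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if scale:
            length = math.hypot(dx, dy)
            if length > 0:               # leave zero-length vectors unchanged
                dx, dy = dx / length, dy / length
        vectors.append((dx, dy))
    return vectors

# Hypothetical gesture with one repeated point.
points = [(0, 0), (3, 4), (3, 4), (6, 4)]
raw_vectors = vectorize(points, scale=False)
unit_vectors = vectorize(points)
```

    Working with displacements instead of raw coordinates makes the input independent of where in the frame the gesture was performed, and unit scaling additionally removes the effect of gesture speed.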

    4.3 Classification

    The set of inputs is passed through the trained neural network, which classifies the gesture into one of several predefined classes that can be identified by the system.

    Figure 7. The classification process for system control
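    The final decision can be sketched as picking the most active output unit of the network. The class names and the rejection threshold of 0.5 below are hypothetical; the paper does not specify how the network outputs are mapped to classes.

```python
def classify(outputs, class_names, min_activation=0.5):
    """Pick the class whose output unit is most active; return None when
    even the best activation is below `min_activation`."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    if outputs[best] < min_activation:
        return None
    return class_names[best]

# Hypothetical predefined gesture classes and network outputs.
class_names = ["swipe_left", "swipe_right", "circle"]
recognized = classify([0.1, 0.85, 0.2], class_names)
rejected = classify([0.2, 0.3, 0.1], class_names)
```

    Returning None for weak activations lets the system ignore movements that do not resemble any predefined gesture.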

    Conclusion

    Human hand gestures provide the most important means for non-verbal interaction among people. At present,

    artificial neural networks are emerging as the technology of choice for many applications, such as pattern

    recognition, gesture recognition, prediction, system identification, and control.

    ANNs provide a good and powerful solution for gesture recognition in MATLAB. Artificial neural networks are applicable to multivariate non-linear problems and have a fast computational ability. The ability of neural nets to generalize makes them a natural fit for gesture recognition.

    References

    [1] Sebastian Marcel, Oliver Bernier, Jean Emmanuel Viallet and Daniel Collobert. (2000). Hand Gesture Recognition using Input Output Hidden Markov Models, Proc. of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 456-461.

    [2] Xia Liu and Kikuo Fujimura. (2004). Hand Gesture Recognition using Depth Data, Proc. of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 529-534.

    [3] L. Bretzner, I. Laptev, and T. Lindeberg. (2003). Hand Gesture using multi-scale color features, hierarchical models and particle filtering, Proc. of the Fifth International Conference on Automatic Face and Gesture Recognition, pp. 423-428.

    [4] V. Pavlovic, et al. (1997). Visual Interpretation Of Hand Gesture For Human-Computer Interaction: A Review, IEEE Trans. on Pattern Anal. Mach. Intel. 19(7), pp. 677-695.

    [5] Attila Licsar and Tamas Sziranyi. (2002). Supervised training based hand gesture recognition system, Proc. of the 16th International Conference on Pattern Recognition, Vol. 3, pp. 30999-31003.

    [6] E. Stergiopoulou and N. Papamarkos. (2006). A New Technique on Hand Gesture Recognition, Proc. of the IEEE International Conference on Image Processing, pp. 2657-2660.

    [7] Byung-Woo Min, Ho-Sub Yoon, Jung Soh, Yun-Mo Yangc and Toshiaki Ejima. (1997). Hand Gesture Recognition Using Hidden Markov Models, Proc. of the IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 4232-4235.

    [8] Meide Zhao, Francis K.H. Quek and Xindong Wu. (November 1998). RIEVL: Recursive Induction Learning in Hand Gesture Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11.

    [9] Yikai Fang, Jian Cheng, Kongqiao Wang and Hanqing Lu. (2007). Hand Gesture Recognition Using Fast Multi-scale Analysis, Proc. of the Fourth International Conference on Image and Graphics, pp. 694-698.

    [10] Chris Joslin, Ayman El-Sawah, Qing Chen and Nicolas Georganas. (2005). Dynamic Gesture Recognition, Proc. of the Instrumental and Measurement Technology Conference, pp. 1706-1710.

    [11] Timi Ojala, Matti Pietikainen and Topi Maenpaa. (2002). Multi-resolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 24, pp. 971-987.

    [12] Prateem Chakraborty, Prashant Sarawgi, Ankit Mehrotra, Gaurav Agarwal and Ratika Pradhan. (19-21 March, 2008). Hand Gesture Recognition: A Comparative Study, Proceedings of the International MultiConference of Engineers and Computer Scientists 2008, Vol. I, IMECS 2008, Hong Kong.

    [13] Daniel Thalmann. (October 2002). Gesture Recognition Motion Capture, Motion Retargeting, and Action Recognition, EPFL VRlab.

    [14] Aditya Ramamoorthy et al. Recognition of dynamic hand gestures, pp. 1-13. Department of Electrical Engineering, IIT New Delhi-110016, India.

    [15] www.mathworks.com/products/neuralnet

    [16] Rajeshree Rokade, Dharmpal Doye and Manesh Kokare. (2009). Hand Gesture Recognition by Thinning Method, International Conference on Digital Image Processing.

    [17] Peter Wentworth. (2nd November 2008). An Investigation into Gesture Recognition in BingBee using Neural Nets in MATLAB, Rhodes University.
