8/2/2019 The Sixth Sense Final
1/19
The Sixth Sense: Gesture Recognition
Department of CS&E,PESCE, Mandya Page 1
Introduction
Gesture recognition is a topic in computer science and language technology with the
goal of interpreting human gestures via mathematical algorithms. Gestures can originate
from any bodily motion or state but commonly originate from the face or hand. Current
focuses in the field include emotion recognition from the face and hand gesture
recognition. Many approaches have been made using cameras and computer vision
algorithms to interpret sign language. However, the identification and recognition of
posture, gait, and human behaviour are also the subject of gesture recognition
techniques.

Gesture recognition can be seen as a way for computers to begin to understand human body
language, thus building a richer bridge between machines and humans than primitive
text user interfaces or even GUIs (graphical user interfaces), which still limit the majority of
input to keyboard and mouse.
Gesture recognition enables humans to interface with the machine (HMI) and interact
naturally without any mechanical devices. Using the concept of gesture recognition, it is
possible to point a finger at the computer screen so that the cursor will move accordingly. This
could potentially make conventional input devices such as mice, keyboards and even
touchscreens redundant.
Gesture recognition can be conducted with techniques from computer vision and image
processing.
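As a minimal illustration of the image-processing side, motion between two video frames can be detected by simple frame differencing, one of the most basic front ends for vision-based gesture recognition. The sketch below models frames as 2-D grey-level arrays; the threshold value is an illustrative assumption, not a standard setting.

```python
# Minimal frame-differencing sketch: flag pixels that changed noticeably
# between two consecutive grey-level frames. Threshold is illustrative.
import numpy as np

def motion_mask(prev_frame: np.ndarray, cur_frame: np.ndarray, thresh: int = 30) -> np.ndarray:
    """Return a binary mask of pixels that changed by more than thresh."""
    diff = np.abs(cur_frame.astype(int) - prev_frame.astype(int))
    return diff > thresh

# Example: a small "hand" region brightens in the second frame.
prev = np.zeros((8, 8), dtype=np.uint8)
cur = prev.copy()
cur[2:5, 2:5] = 200          # moving object appears here
mask = motion_mask(prev, cur)
print(mask.sum())            # number of changed pixels -> 9
```

A real system would clean this mask (e.g. with morphological filtering) before interpreting the moving region as a gesture.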
Gesture recognition allows people to interface with computers using gestures of the human body, typically hand movements.
In gesture recognition technology, a camera reads the movements of the human body and
communicates the data to a computer that uses the gestures as input to control devices
or applications. For example, a person clapping his hands together in front of a camera
can produce the sound of cymbals being crashed together when the gesture is fed through
a computer.
One way gesture recognition is being used is to help the physically impaired to interact
with computers, such as interpreting sign language. The technology also has the potential
to change the way users interact with computers by eliminating input devices such
as joysticks, mice and keyboards and allowing the unencumbered body to give signals to
the computer through gestures such as finger pointing.
Unlike haptic interfaces, gesture recognition does not require the user to wear any special
equipment or attach any devices to the body. The gestures of the body are read by a
camera instead of sensors attached to a device such as a data glove.
In addition to hand and body movement, gesture recognition technology can also be used to read facial
expressions, speech (i.e., lip reading) and eye movements. The literature includes ongoing work in
the computer vision field on capturing gestures or more general human pose and movements by
cameras connected to a computer.
Gesture Based Interaction
Fig 1. The system detects the hands and fingers in real-time.
Touch screens such as those found on the iPhone or iPad are the latest form of technology
allowing interaction with smartphones, computers and other devices. However, scientists at
Fraunhofer FIT have developed the next-generation non-contact gesture and finger recognition
system. The novel system detects hand and finger positions in real-time and translates these
into appropriate interaction commands. Furthermore, the system does not require special
gloves or markers and is capable of supporting multiple users.
With touch screens becoming increasingly popular, classic interaction techniques such as a
mouse and keyboard are becoming less frequently used. One example of a breakthrough is
the Apple iPhone which was released in summer 2007. Since then many other devices
featuring touch screens and similar characteristics have been successfully launched -- with
more advanced devices even supporting multiple users simultaneously, e.g. the Microsoft
Surface table becoming available. This is an entire surface which can be used for input.
However, this form of interaction is specifically designed for two-dimensional surfaces.
Fraunhofer FIT has developed the next generation of multi-touch environment, one that
requires no physical contact and is entirely gesture-based.
This system detects multiple fingers and hands at the same time and allows the user to
interact with objects on a display. The users move their hands and fingers in the air and the
system automatically recognizes and interprets the gestures accordingly.
Fig 2. The Data or Cyber Glove: A device capable of recording hand
movements, both the position of the hand and its orientation as well as finger movements; it
is capable of simple gesture recognition and general tracking of three-dimensional hand
orientation.
An input device for virtual reality in the form of a glove which measures the movements of
the wearer's fingers and transmits them to the computer. Sophisticated data gloves also
measure movement of the wrist and elbow. A data glove may also contain control buttons or
act as an output device, e.g. vibrating under control of the computer. The user usually sees a
virtual image of the data glove and can point or grip and push objects.
The CyberGlove is a fully instrumented glove that provides up to 22 high-accuracy joint-
angle measurements. It uses proprietary resistive bend-sensing technology to accurately
transform hand and finger motions into real-time digital joint-angle data. The
VirtualHandStudio software converts the data into a graphical hand which mirrors the subtle
movements of the physical hand. It is available in two models and for either hand.
The 18-sensor model features two bend sensors on each finger, four abduction sensors, plus
sensors measuring thumb crossover, palm arch, wrist flexion and wrist abduction.
The 22-sensor model has three flexion sensors per finger, four abduction sensors, a palm-arch
sensor, and sensors to measure wrist flexion and abduction. Each sensor is extremely thin and
flexible, being virtually undetectable in the lightweight elastic glove.
The CyberGlove has been used in a wide variety of real-world applications, including digital
prototype evaluation, virtual reality biomechanics, and animation. The CyberGlove has
become the de facto standard for high-performance hand measurement and real-time motion
capture, and it is designed for comfort and functionality.
The CyberGlove has a software programmable switch and LED on the wristband to permit
the system software developer to provide the CyberGlove wearer with additional input/output
capability.
The instrumentation unit provides a variety of convenient functions and features including
time-stamp, CyberGlove status, external sampling synchronization and analog sensor outputs.
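The 22-sensor layout described above (three flexion angles per finger, four abduction sensors, a palm-arch sensor, and two wrist sensors) can be pictured as a simple decoding step. The sketch below is purely illustrative: the sample ordering and field names are assumptions, not the CyberGlove's actual data protocol.

```python
# Hedged sketch: group a 22-value joint-angle frame into a per-finger
# structure. Ordering and names are illustrative assumptions only.
FINGERS = ["thumb", "index", "middle", "ring", "pinky"]

def decode_frame(samples):
    """Split a 22-value frame: 3 flexion angles per finger (15 values),
    4 abduction angles, 1 palm-arch value, 2 wrist values."""
    assert len(samples) == 22
    pose = {f: samples[i * 3:(i + 1) * 3] for i, f in enumerate(FINGERS)}
    pose["abduction"] = samples[15:19]
    pose["palm_arch"] = samples[19]
    pose["wrist"] = {"flexion": samples[20], "abduction": samples[21]}
    return pose

frame = list(range(22))      # dummy sensor readings
pose = decode_frame(frame)
print(pose["index"], pose["wrist"])
```

The point of the sketch is simply that 22 scalar angles suffice to describe a fairly detailed hand pose, which is why glove-based tracking is so compact compared with full image processing.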
Gesture types
In computer interfaces, two types of gestures are distinguished:
Offline gestures: gestures that are processed after the user's interaction with the object. An
example is the gesture to activate a menu.
Online gestures: direct manipulation gestures. They are used to scale or rotate a tangible object.
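The distinction above can be sketched in code: an offline gesture is classified once the stroke is complete, while an online gesture updates the manipulated object continuously while the hand moves. The function names, the closed-stroke menu rule and the pinch-to-scale rule below are illustrative assumptions.

```python
# Offline gesture: interpreted only after the interaction ends.
def handle_offline(stroke):
    """Classify a finished stroke; here a closed stroke opens a menu."""
    closed = stroke[0] == stroke[-1]
    return "open_menu" if closed else "ignore"

# Online gesture: applied continuously during the interaction.
def handle_online(obj_scale, pinch_distance, ref_distance):
    """Directly manipulate an object: its scale follows the pinch distance."""
    return obj_scale * pinch_distance / ref_distance

print(handle_offline([(0, 0), (1, 1), (0, 0)]))   # closed loop -> open_menu
print(handle_online(1.0, 150.0, 100.0))           # fingers spread -> 1.5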
Possible types of gestures
Gesture recognition is useful for processing information from humans that is not
conveyed through speech or typing. There are also various types of gestures which can be
identified by computers.
Sign language recognition. Just as speech recognition can transcribe speech to
text, certain types of gesture recognition software can transcribe the symbols
represented through sign language into text.
For socially assistive robotics. By using proper sensors (accelerometers and
gyros) worn on the body of a patient and by reading the values from those
sensors, robots can assist in patient rehabilitation. The best example is stroke
rehabilitation.
Directional indication through pointing. Pointing has a very specific purpose in
our society, to reference an object or location based on its position relative to
ourselves. The use of gesture recognition to determine where a person is pointing
is useful for identifying the context of statements or instructions. This application
is of particular interest in the field of robotics.
Control through facial gestures. Controlling a computer through facial gestures
is a useful application of gesture recognition for users who may not physically
be able to use a mouse or keyboard. Eye tracking in particular may be of use for
controlling cursor motion or focusing on elements of a display.
Input devices
The ability to track a person's movements and determine what gestures they may be
performing can be achieved through various tools. Although there is a large amount of
research done in image/video based gesture recognition, there is some variation within the
tools and environments used between implementations.
Depth-aware cameras. Using specialized cameras such as time-of-flight cameras, one
can generate a depth map of what is being seen through the camera at short range, and use
this data to approximate a 3D representation of what is
being seen. These can be effective for detecting hand gestures thanks to their short-range
capabilities.
Stereo cameras. Using two cameras whose relations to one another are known, a
3D representation can be approximated from the output of the cameras. To get the
cameras' relations, one can use a positioning reference such as a lexian-stripe or
infrared emitters. In combination with direct motion measurement (6D-Vision),
gestures can be detected directly.
Controller-based gestures. These controllers act as an extension of the body so
that when gestures are performed, some of their motion can be conveniently
captured by software. Mouse gestures are one such example, where the motion of
the mouse is correlated to a symbol being drawn by a person's hand, as is the Wii
Remote, which can study changes in acceleration over time to represent gestures.
Single camera. A normal camera can be used for gesture recognition where the
resources/environment would not be convenient for other forms of image-based
recognition. Although not necessarily as effective as stereo or depth-aware
cameras, a single camera makes the technology accessible to a
wider audience.
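The advantage of depth-aware cameras is easy to see in code: once a depth map exists, the hand can often be segmented by simply keeping the pixels closer than a cut-off distance, with no colour or texture analysis at all. The depth values and the cut-off below are illustrative assumptions.

```python
# Sketch of hand segmentation from a depth map: keep pixels nearer than
# a cut-off depth. Values in millimetres; all numbers are illustrative.
import numpy as np

def segment_near(depth_map: np.ndarray, max_depth_mm: float = 600.0) -> np.ndarray:
    """Binary mask of pixels nearer than max_depth_mm (0 = no reading)."""
    return (depth_map > 0) & (depth_map < max_depth_mm)

depth = np.full((6, 6), 2000.0)   # background roughly 2 m away
depth[1:4, 1:4] = 450.0           # hand roughly 45 cm away
depth[0, 0] = 0.0                 # invalid sensor reading
hand = segment_near(depth)
print(hand.sum())                 # pixels classified as "hand" -> 9
```

A single RGB camera has no such shortcut, which is one reason single-camera recognition is harder even though the hardware is cheaper.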
Algorithms Used in Gesture Recognition
Depending on the type of the input data, the approach for interpreting a gesture could be done
in different ways. However, most of the techniques rely on key pointers represented in a 3D
coordinate system. Based on the relative motion of these, the gesture can be detected with a
high accuracy, depending on the quality of the input and the algorithm's approach.
In order to interpret movements of the body, one has to classify them according to common
properties and the message the movements may express. For example, in sign language each
gesture represents a word or phrase.
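The "key pointers in a 3D coordinate system" idea can be made concrete with a tiny example: a pinch gesture can be detected purely from the relative distance of the thumb-tip and index-tip key points. The 15 mm threshold is an illustrative assumption.

```python
# Sketch: detect a pinch from two 3-D key points by their distance.
import math

def is_pinch(thumb_tip, index_tip, thresh_mm=15.0):
    """True when the two fingertips are closer than thresh_mm."""
    return math.dist(thumb_tip, index_tip) < thresh_mm

print(is_pinch((0.0, 0.0, 0.0), (5.0, 5.0, 5.0)))     # ~8.7 mm apart -> True
print(is_pinch((0.0, 0.0, 0.0), (30.0, 0.0, 0.0)))    # 30 mm apart -> False
```

Real systems track many such key points over time and classify their joint motion, but the building block is exactly this kind of relative-geometry test.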
3D model-based algorithms
Fig. 1 A real hand (left) is interpreted as a collection of vertices and lines in the 3D mesh
version (right), and the software uses their relative position and interaction in order to infer
the gesture.
The 3D model approach can use volumetric or skeletal models, or even a combination of the
two. Volumetric approaches have been heavily used in the computer animation industry and for
computer vision purposes. The models are generally created from complicated 3D surfaces, like
NURBS or polygon meshes. The drawback of this method is that it is very computationally
intensive, and systems for live analysis are still to be developed. For the moment, a more
interesting approach would be to map simple primitive objects to the person's most important
body parts (for example, cylinders for the arms and neck, a sphere for the head) and analyse the
way these interact with each other. Furthermore, some abstract structures like super-quadrics
and generalised cylinders may be even more suitable for approximating the body parts. What is
exciting about this approach is that the parameters for these objects are quite simple. In order
to better model the relation between these, we make use of constraints and hierarchies
between our objects.
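The primitive-mapping idea pays off because "interaction" between body parts then reduces to cheap geometric tests on a handful of parameters. The sketch below shows only the simplest case, sphere-sphere contact; the class, the radii and the scenario are illustrative assumptions.

```python
# Hedged sketch of the primitive-mapping idea: body parts become simple
# shapes with few parameters, and interaction is a cheap geometric test.
import math

class Sphere:
    def __init__(self, center, radius):
        self.center, self.radius = center, radius

def touching(a: Sphere, b: Sphere) -> bool:
    """Two spheres interact if their surfaces meet or overlap."""
    return math.dist(a.center, b.center) <= a.radius + b.radius

head = Sphere((0.0, 0.0, 0.0), 10.0)
hand = Sphere((12.0, 0.0, 0.0), 4.0)   # e.g. a hand raised to the face
print(touching(head, hand))            # 12 <= 10 + 4 -> True
```

Cylinder-cylinder and cylinder-sphere tests are only slightly more involved, which is why this representation can run in real time where full mesh analysis cannot.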
Skeletal-based algorithms
Fig. 2 The skeletal version (right) effectively models the hand (left). It has fewer
parameters than the volumetric version and is easier to compute, making it suitable for real-
time gesture analysis systems.
Instead of using intensive processing of the 3D models and dealing with a lot of parameters,
one can just use a simplified version of joint angle parameters along with segment lengths.
This is known as a skeletal representation of the body, where a virtual skeleton of the person
is computed and parts of the body are mapped to certain segments. The analysis here is done
using the position and orientation of these segments and the relation between each of
them (for example, the angle between the joints and the relative position or orientation).
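The joint angles that skeletal algorithms analyse follow directly from the segment endpoints: the angle at a joint is the angle between the two bone vectors meeting there, obtained from their dot product. The sketch below assumes 3-D joint positions are already available from a tracker.

```python
# Sketch: joint angle at the middle of three tracked joint positions,
# the kind of low-parameter feature skeletal algorithms use in real time.
import math

def joint_angle(a, b, c):
    """Angle (degrees) at joint b formed by segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    cos = dot / (math.hypot(*ba) * math.hypot(*bc))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

# Three collinear joints (a straight limb) give 180 degrees.
print(round(joint_angle((0, 0, 0), (1, 0, 0), (2, 0, 0))))
# A right-angle bend gives 90 degrees.
print(round(joint_angle((0, 0, 0), (1, 0, 0), (1, 1, 0))))
```

A whole-body pose then reduces to a short vector of such angles plus segment lengths, which is exactly why this representation is cheap enough for real-time analysis.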
Appearance-based models
Fig 3. These binary silhouette (left) or contour (right) images represent typical input for
appearance-based algorithms. They are compared with different hand templates and, if they
match, the corresponding gesture is inferred.
These models don't use a spatial representation of the body anymore; they derive the
parameters directly from the images or videos using a template database. Some are based on
deformable 2D templates of parts of the human body, particularly hands. Deformable
templates are sets of points on the outline of an object, used as interpolation nodes for the
object's outline approximation. One of the simplest interpolation functions is the linear one, which
performs an average shape from point sets, point variability parameters and external
deformators. These template-based models are mostly used for hand-tracking, but could also
be of use for simple gesture classification.
A second approach in gesture detecting using appearance-based models uses image
sequences as gesture templates. Parameters for this method are either the images themselves,
or certain features derived from these. Most of the time, only one (monoscopic) or two
(stereoscopic) views are used.
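Template matching on binary silhouettes, as in Fig 3, can be sketched very simply: the observed silhouette is compared against each template by overlap, and the best-matching template's gesture label is inferred. The templates, labels and the intersection-over-union score below are illustrative assumptions.

```python
# Sketch of appearance-based matching: pick the template silhouette with
# the highest overlap (intersection over union) with the observed one.
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def classify(silhouette, templates):
    """Return the label of the template with the highest overlap."""
    return max(templates, key=lambda label: iou(silhouette, templates[label]))

fist = np.zeros((5, 5), bool); fist[1:4, 1:4] = True
palm = np.zeros((5, 5), bool); palm[0:5, 1:4] = True
templates = {"fist": fist, "open_palm": palm}

observed = np.zeros((5, 5), bool); observed[1:4, 1:4] = True
print(classify(observed, templates))
```

Deformable-template methods refine this idea by letting the template outline bend to fit the observation, instead of requiring a rigid pixel-level match.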
Challenges
There are many challenges associated with the accuracy and usefulness of gesture
recognition software. For image-based gesture recognition there are limitations on the
equipment used and image noise. Images or video may not be under consistent lighting, or in
the same location. Items in the background or distinct features of the users may make
recognition more difficult.
The variety of implementations for image-based gesture recognition may also cause issues for
the viability of the technology in general usage. For example, an algorithm calibrated for one
camera may not work for a different camera. The amount of background noise also causes
tracking and recognition difficulties, especially when occlusions (partial and full) occur.
Furthermore, the distance from the camera, and the camera's resolution and quality, also
cause variations in recognition accuracy.
In order to capture human gestures by visual sensors, robust computer vision methods are also
required, for example for hand tracking and hand posture recognition or for capturing
movements of the head, facial expressions or gaze direction.
Upcoming New Technologies
The SixthSense Device
The SixthSense prototype comprises a pocket projector, a mirror and a camera. The
hardware components are coupled in a pendant-like mobile wearable device. Both the
projector and the camera are connected to the mobile computing device in the user's pocket.
The projector projects visual information, enabling surfaces, walls and physical objects
around us to be used as interfaces, while the camera recognizes and tracks the user's hand
gestures and physical objects using computer-vision-based techniques. The software
program processes the video stream data captured by the camera and tracks the locations of
the colored markers (visual tracking fiducials) at the tips of the user's fingers using
simple computer-vision techniques. The movements and arrangements of these
fiducials are interpreted into gestures that act as interaction instructions for the projected
application interfaces. The maximum number of tracked fingers is only constrained by the
number of unique fiducials, so SixthSense also supports multi-user interaction. The
SixthSense prototype implements several applications that demonstrate the usefulness,
viability and flexibility of the system. The map application lets the user navigate a map
displayed on a nearby surface using hand gestures, similar to gestures supported by multi-
touch based systems, letting the user zoom in, zoom out or pan using intuitive hand
movements. The drawing application lets the user draw on any surface by tracking the
fingertip movements of the user's index finger. SixthSense also recognizes the user's
freehand gestures (postures).
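The colour-marker tracking described above can be sketched in a few lines: in each frame, find the centroid of the pixels matching a fiducial's colour. The RGB representation, tolerance and colour test below are illustrative assumptions; SixthSense's actual implementation is not reproduced here.

```python
# Hedged sketch of colour-fiducial tracking: centroid of pixels whose
# colour is within a tolerance of the marker colour. Values illustrative.
import numpy as np

def track_marker(frame: np.ndarray, color, tol=30):
    """Return (row, col) centroid of pixels within tol of color, or None."""
    mask = (np.abs(frame.astype(int) - color) <= tol).all(axis=2)
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return None
    return (ys.mean(), xs.mean())

frame = np.zeros((10, 10, 3), dtype=np.uint8)
frame[4:6, 6:8] = (255, 0, 0)            # red cap on the index fingertip
print(track_marker(frame, (255, 0, 0)))  # centroid near (4.5, 6.5)
```

Tracking one distinctly coloured marker per fingertip, frame after frame, yields the motion trajectories that are then interpreted as gestures.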
Construction and Working
Fig 1. The SixthSense system
The SixthSense prototype comprises a pocket projector, a mirror and a camera
contained in a pendant-like wearable device. Both the projector and the
camera are connected to a mobile computing device in the user's pocket. The
projector projects visual information enabling surfaces, walls and physical
objects around us to be used as interfaces; while the camera recognizes and
tracks user's hand gestures and physical objects using computer-vision based
techniques. The software program processes the video stream data captured
by the camera and tracks the locations of the colored markers (visual tracking
fiducials) at the tips of the user's fingers. The movements and arrangements of
these fiducials are interpreted into gestures that act as interaction instructions
for the projected application interfaces. SixthSense supports multi-touch and multi-user
interaction.
Fig 2. The procedure carried out in SixthSense
The hardware that makes SixthSense work is a pendant-like mobile wearable interface.
It has a camera, a mirror and a projector, and is connected wirelessly to a Bluetooth, 3G or Wi-Fi smartphone that can slip comfortably into one's pocket.
The camera recognizes individuals, images, pictures and gestures one makes with their hands.
Information is sent to the smartphone for processing.
The downward-facing projector projects the output image onto the mirror.
The mirror reflects the image onto the desired surface.
Thus, digital information is freed from its confines and placed in the physical world.
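The hardware flow (camera, smartphone, projector, mirror, surface) can be pictured as a simple pipeline of stages. Every stage below is a named stand-in for real hardware or software, purely illustrative.

```python
# Illustrative pipeline: each function stands in for one hardware stage.
def camera(scene):            return {"frame": scene}
def phone(data):              return {"output": f"annotated({data['frame']})"}
def projector(processed):     return f"beam[{processed['output']}]"
def mirror(beam):             return f"surface<{beam}>"

def sixth_sense(scene):
    """Camera -> smartphone -> projector -> mirror -> physical surface."""
    return mirror(projector(phone(camera(scene))))

print(sixth_sense("hand gesture"))
```

The composition makes the design choice visible: every stage has one narrow job, and only the smartphone stage does any computation.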
Example Applications
The SixthSense prototype contains a number of demonstration applications.
The map application lets the user navigate a map displayed on a nearby surface, using
hand gestures to zoom and pan.
The drawing application lets the user draw on any surface by tracking the fingertip
movements of the user's index finger.
SixthSense also augments the physical objects
the user interacts with.
The system recognizes a user's freehand gestures as well as icons/symbols drawn in the air
with the index finger. For example, the user can project photos onto a nearby surface or wall
and flick through the photos he/she has taken, and drawing an @
symbol lets the user check his mail.
Conclusion
The goal of virtual environments (VE) is to provide natural, efficient, powerful,
and flexible interaction. Gesture as an input modality can help meet these
requirements because human gestures are natural and flexible, and may be
efficient and powerful, especially as compared with alternative interaction modes.
The traditional two-dimensional (2D), keyboard- and mouse-oriented graphical user
interface (GUI) is not well suited for virtual environments. Synthetic
environments provide the opportunity to utilize several different sensing
modalities and technologies and to integrate them into the user experience.
Devices which sense body position and orientation, direction of gaze, speech and
sound, facial expression, galvanic skin response, and other aspects of human
behaviour or state can be used to mediate communication between the human
and the environment. Combinations of communication modalities and sensing
devices can produce a wide range of unimodal and multimodal interface
techniques. The potential for these techniques to support natural and
powerful interfaces for communication in VEs appears promising.