Gesture Recognition?

Post on 28-Jan-2015


Gesture Recognition

B98902037 邱大祐

bingdow@gmail.com

+886937201724

Abstract

• In this presentation, I introduce a hand gesture recognition system that captures hand movements and simple gestures.

• Key features of my method: low cost and low complexity.

Motion Capturing Devices

1. Microsoft Kinect + XBOX ($US 330)

2. Leap Motion ($US 69.99)

Using a Mono Webcam

• At under $US 15, it is cheap and easy to set up.

Flow Diagram

Hand Detection (1)

• 1. Detect Motion Region
• 2. Edge Detection
• 3. Threshold
• 4. Skin Color Estimation
• 5. Cross Comparison
• 6. Redraw Hand Shape

[Figure: the flow diagram of hand detection. Labels: AND operator, motion region image, skin color estimation image, edge detection, binary image, histogram equalization image.]

Hand Detection (2)

• During gesture capture, the hand moves significantly more than any other region.

• D_i(x, y) = |F_i(x, y) - F_{i+1}(x, y)|

• F_i(x, y): the i-th frame image.
• x, y: the coordinates on the frame.
• D_i(x, y): the difference between F_i(x, y) and F_{i+1}(x, y).
• Thresholding
• Get the ROI (Region Of Interest)

Hand Detection (3)

• D_i(x, y) = |F_i(x, y) - F_{i+1}(x, y)|

• F_i(x, y): the i-th frame image.
• x, y: the coordinates on the frame.
• D_i(x, y): the difference between F_i(x, y) and F_{i+1}(x, y).
• Calculate the absolute value of the frame difference.
• Bounding box

Hand Detection (4)

• Noise removal in the frame difference calculation.
• Noise can come from slightly moving objects, e.g. a slightly moving edge.
• Set a threshold to remove noise (threshold = 80).
• D_i(x, y) = 1, if |F_i(x, y) - F_{i+1}(x, y)| > Threshold
  D_i(x, y) = 0, otherwise.
• Otsu's method: not good for this task.
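The thresholded frame difference can be sketched in a few lines. This is a minimal sketch, assuming grayscale frames stored as lists of rows of integers in 0..255. The function name `motion_mask` is hypothetical; only the threshold value 80 comes from the slides.

```python
def motion_mask(prev_frame, next_frame, threshold=80):
    """Return a binary mask: 1 where |F_i - F_{i+1}| > threshold, else 0."""
    mask = []
    for row_a, row_b in zip(prev_frame, next_frame):
        mask.append([1 if abs(a - b) > threshold else 0
                     for a, b in zip(row_a, row_b)])
    return mask


# Two tiny 3x3 grayscale frames; only the centre pixel changes strongly.
frame_i = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
frame_next = [[12, 10, 10], [10, 200, 10], [10, 10, 85]]
print(motion_mask(frame_i, frame_next))  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

The small changes (2 and 75 gray levels) stay below the threshold and are treated as noise, while the large change at the centre survives.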

Hand Detection (5)

• Compute the frame difference to find the motion region, then run a bounding box algorithm.
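The slides do not say which bounding box algorithm is used. As an illustration only, the simplest choice is the axis-aligned extent of all motion pixels. The function name `bounding_box` and the inclusive (left, top, right, bottom) convention are assumptions.

```python
def bounding_box(mask):
    """Return (left, top, right, bottom) of the nonzero pixels, or None.

    Coordinates are inclusive; None means no motion pixel was found.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, v in enumerate(row) if v]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


motion = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
print(bounding_box(motion))  # (1, 1, 2, 2)
```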

Hand Detection (6)

Hand Detection (7)

• Skin Color Estimation
• In RGB color space, one can constrain the skin color, e.g. R > G > B… I use HSV color space instead.
• (0 <= H <= 20, 30 <= S <= 150, 60 <= V <= 255)
• This method might include a wide range of colors, but that does not matter because we do a cross comparison later. Noise such as walls and posters can be removed based on motion region detection.

HSV: Hue, Saturation, Value
RGB: Red, Green, Blue
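The HSV range test can be sketched with Python's standard `colorsys` module. The slides do not state the scale of the ranges; this sketch assumes the 8-bit OpenCV convention (H in 0..179, S and V in 0..255), and the name `is_skin` is hypothetical.

```python
import colorsys

# Ranges from the slide. The scale (H in 0..179, S and V in 0..255) is an
# assumption borrowed from OpenCV's 8-bit HSV convention.
H_RANGE = (0, 20)
S_RANGE = (30, 150)
V_RANGE = (60, 255)


def is_skin(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the HSV skin ranges."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # colorsys returns values in [0, 1]; rescale to the assumed 8-bit ranges.
    h8, s8, v8 = h * 180.0, s * 255.0, v * 255.0
    return (H_RANGE[0] <= h8 <= H_RANGE[1]
            and S_RANGE[0] <= s8 <= S_RANGE[1]
            and V_RANGE[0] <= v8 <= V_RANGE[1])


print(is_skin(220, 170, 140))  # True: a light skin tone
print(is_skin(30, 60, 200))    # False: blue
```

As the slide notes, such a range test is deliberately loose; a wooden desk or a poster may also pass it, which is why the result is combined with the motion region afterwards.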

Hand Detection (8)

• Label the estimated skin color as red.

Hand Detection (9)

• Apply morphological closing to the skin color estimation image.

• Apply histogram equalization to the source image.
• Binarize the source image.
• Run edge detection (Sobel) on the source image.

Hand Detection (10)

• Finally, we can do a cross comparison using the motion region, the binary image, the skin color estimation image, and the edge detection image.

• After that, hand detection and calibration are done, and we can use these results to track basic hand gestures.
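The flow diagram names an AND operator, so one plausible reading of the cross comparison is a pixel-wise AND of the binary images. The sketch below assumes that reading and assumes all masks have the same size; `cross_compare` is a hypothetical name.

```python
def cross_compare(*masks):
    """Pixel-wise AND of equally sized binary masks (lists of rows of 0/1)."""
    return [[1 if all(values) else 0 for values in zip(*rows)]
            for rows in zip(*masks)]


motion_region = [[1, 1], [0, 1]]
skin_estimate = [[1, 0], [1, 1]]
edge_image = [[1, 1], [1, 1]]
print(cross_compare(motion_region, skin_estimate, edge_image))  # [[1, 0], [0, 1]]
```

A pixel survives only if every cue agrees, which is how static skin-colored background and moving non-skin objects are both rejected.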

Hand Detection (11)

• After cross comparison, we obtain a full hand shape.

Background Subtraction (1)

• So far, the result is not very stable because the webcam has no light source of its own.

• Ambient luminance sometimes causes variation in the result, e.g. part of the face is matched.

• Next, we focus on removing ambiguous matches.

Binary or Skin Image

• In a dark environment, we cannot detect skin color in the palm region because it is full of white pixels.

• We need to improve detection under dark conditions.

• Use mathematical morphology (closing).

Ambiguous Match Problem

The arm moves significantly together with the palm. This image was taken in a dark environment. Such conditions make the matching area larger than expected.

Mathematical Morphology

• Dilation and erosion operations.
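Closing is dilation followed by erosion. The following is a minimal sketch on binary masks, assuming a 3x3 square structuring element and windows clipped at the image border; neither choice is stated in the slides, and the function names are hypothetical.

```python
def _morph(mask, pick):
    """Apply min (erosion) or max (dilation) over each 3x3 neighbourhood."""
    h, w = len(mask), len(mask[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Neighbourhood clipped to the image, so border pixels use fewer values.
            window = [mask[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(pick(window))
        out.append(row)
    return out


def dilate(mask):
    return _morph(mask, max)


def erode(mask):
    return _morph(mask, min)


def closing(mask):
    """Closing = dilation followed by erosion; it fills small holes."""
    return erode(dilate(mask))


# A 5x5 blob with a one-pixel hole in the middle.
blob = [[1] * 5 for _ in range(5)]
blob[2][2] = 0
print(closing(blob)[2][2])  # 1: the hole is filled
```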

Background Subtraction (2)

• Simple estimation

1. F_1(x, y): background. F_i(x, y): background if no ROI is detected.

2. Use the height, width, and centroid of the white pixels to estimate the proper hand region.
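Both items can be sketched as small helpers. The slides do not specify how the height, width, and centroid are turned into a hand-region decision, so this sketch only computes those quantities; `white_pixel_stats` and `update_background` are hypothetical names.

```python
def white_pixel_stats(mask):
    """Return (centroid_x, centroid_y, width, height) of nonzero pixels, or None."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx = sum(xs) / len(points)
    cy = sum(ys) / len(points)
    return (cx, cy, max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def update_background(background, frame, roi):
    """Keep the old background while a ROI is present; otherwise adopt the frame."""
    return background if roi is not None else frame


hand_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
print(white_pixel_stats(hand_mask))  # (1.5, 0.5, 2, 2)
```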

Background Subtraction (3)

• Object tracking: to be implemented.
• Motion vector: a motion-based recognition.

Search Space (1)

A sketch map of the search space, and a group of search spaces together with the hand region.

Search Space (2)

• The search space is composed of neighborhood relationships.

• The search space can be used to predict the next movement of the target object and to reduce the tracking time.

• Whenever there is any variation in the search space, we can simply regard it as a movement in that direction.

Motion Vector

• Use correlation to do image matching.
• Find (dx, dy) to minimize

$$\sum_{(x, y) \in R} \left| \mathrm{PIX}(ima, x, y) - \mathrm{PIX}(imb, x + dx, y + dy) \right|$$
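The minimization can be sketched as an exhaustive search over a small window of offsets, scoring each offset by the sum of absolute differences over the region R. The search radius, the inclusive region format, and the name `best_motion_vector` are assumptions made for illustration.

```python
def best_motion_vector(ima, imb, region, search=2):
    """Return the (dx, dy) minimising the sum of absolute differences.

    region is (left, top, right, bottom), inclusive, in ima coordinates.
    Offsets that would read outside imb are skipped.
    """
    left, top, right, bottom = region
    h, w = len(imb), len(imb[0])
    best, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if (left + dx < 0 or top + dy < 0
                    or right + dx >= w or bottom + dy >= h):
                continue
            cost = sum(abs(ima[y][x] - imb[y + dy][x + dx])
                       for y in range(top, bottom + 1)
                       for x in range(left, right + 1))
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

The cost grows with the region area times the number of candidate offsets, which is why restricting the search to a small neighborhood keeps tracking cheap.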

Feature Detection

• Feature detection can detect the fingers.
• We can use feature points to build a convex hull and feed it back to the hand detection image.
• Ideally, we could detect multiple hands and detect hand regions precisely.
• Also, we can use these feature points to detect hand gestures more precisely.
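Building a convex hull from feature points can be sketched with Andrew's monotone chain algorithm. The slides do not say which hull algorithm is used, so this is a swapped-in, standard choice; `convex_hull` is a hypothetical name and the points are plain (x, y) tuples.

```python
def convex_hull(points):
    """Andrew's monotone chain.

    Returns the hull vertices in counter-clockwise order for standard
    (y-up) axes; collinear and interior points are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


feature_points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
print(convex_hull(feature_points))  # [(0, 0), (4, 0), (4, 4), (0, 4)]
```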

Feature Detection (1)

• cvFindContours: contour detection
• SIFT, SURF algorithms: feature detection
• Three-point method (or 11-point method)
• Assume the contour has N points, P_1, P_2, …, P_N.
• P_i = (x_i, y_i).
• The approximate curvature:

$$C_i = \frac{x'_i y''_i - y'_i x''_i}{\left((x'_i)^2 + (y'_i)^2\right)^{3/2}}$$

SIFT: Scale-Invariant Feature Transform
SURF: Speeded-Up Robust Features

Feature Detection (2)

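$$C_i = \frac{x'_i y''_i - y'_i x''_i}{\left((x'_i)^2 + (y'_i)^2\right)^{3/2}}$$

$$x'_i = x_{i+1} - x_i, \qquad x''_i = x_{i+1} - 2x_i + x_{i-1}$$

$$y'_i = y_{i+1} - y_i, \qquad y''_i = y_{i+1} - 2y_i + y_{i-1}$$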
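The curvature approximation can be sketched as follows. This is a sketch under stated assumptions: the contour is closed (indices wrap around), the first derivative is a forward difference, and the second derivative is the three-point difference. The name `contour_curvature` is hypothetical.

```python
def contour_curvature(points):
    """Approximate curvature at each point of a closed contour.

    Uses forward first differences and three-point second differences,
    with indices wrapping around the contour.
    """
    n = len(points)
    result = []
    for i in range(n):
        x0, y0 = points[i - 1]          # previous point (wraps at i = 0)
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]    # next point (wraps at i = n - 1)
        dx, dy = x2 - x1, y2 - y1
        ddx, ddy = x2 - 2 * x1 + x0, y2 - 2 * y1 + y0
        denom = (dx * dx + dy * dy) ** 1.5
        result.append((dx * ddy - dy * ddx) / denom if denom else 0.0)
    return result


# A small square traced counter-clockwise: corners and edge midpoints alternate.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(contour_curvature(square))
```

On this square the curvature is zero along the straight edges and nonzero at the corners, which is the property a fingertip detector would threshold on.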

The Detection Method We Chose

• The Viola–Jones object detection framework, proposed in 2001 by Paul Viola and Michael Jones, was the first object detection framework to provide competitive object detection rates in real time.

• It was motivated primarily by the problem of face detection.

• This algorithm is implemented in OpenCV as cvHaarDetectObjects().

Feature Detection (4)

• Feature types used by Viola and Jones.

SIFT and SURF Detection

• With DEFAULT settings, they are too slow for a real-time system.

• In fact, gesture recognition does not need that many features.

The Features We Use

• Haar-like features

• Different colors denote different Haar-like features.

Haar-like features modularization

• We use the spatial relationships between Haar-like features to build a modular finger model.

Recognition and Tracking (1)

Recognition and Tracking (2)

Rotate(Vf);

Scale(Vf);

for (i = 1 to n) {
    θ = arccos(Vf · Vei / (|Vf| |Vei|));
    if (θ ≈ 0)
        break; /* matched with Vei */
}
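The matching loop can be written as runnable Python. This sketch covers only the angle test; the Rotate and Scale steps are not shown because the slides do not define them. The tolerance of 0.1 radians that stands in for "θ ≈ 0", and the names `angle_between` and `match_template`, are assumptions.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def match_template(vf, templates, tolerance=0.1):
    """Return the index of the first template within `tolerance` radians, or None."""
    for i, ve in enumerate(templates):
        if angle_between(vf, ve) <= tolerance:
            return i
    return None


templates = [(0.0, 1.0), (2.0, 0.1), (1.0, 0.0)]
print(match_template((1.0, 0.0), templates))  # 1
```

Because the angle depends only on direction, the test itself is already insensitive to the lengths of the two vectors.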

Overcome stereo problems

• When a user sitting in front of a laptop or PC gestures, the hand must appear in the image as palm, wrist, and arm, in top-down order.

PART II

How to reduce the CPU load

Simple Detection Algorithm (1)

• To capture general hand motion, this module is enough.

• Low complexity
• DO NOT USE significant feature detection to detect motion variation in the beginning.

Simple Detection Algorithm (2)

• DO NOT USE significant feature detection to detect motion variation in the beginning.

• Why?
• As an example, my classmates ran feature detection first to form the hand region. This is NOT recommended.

• The more complex a method is, the lower its priority.

Simple method, simple processing

[Figure: the hand region inside the camera view.]

The processing diagram

[Figure labels: detect hand region size variation; start tracking; analyze the path and output the recognized movement; iteration method.]

Iteration control (1)

• Initializing: this prevents log overflow.
• Timing: when to start, and when to end?
• Once we enter the tracking phase, each frame in which a hand region is detected should be logged.
• Our parameters: frame > 20, time > 0.2 sec.

[Figure: a loop with the decision "Done yet?" and the branch "No".]

Iteration control (2)

• When analyzing the path, the path length is a critical parameter.

• If the path we set is too long, it causes serious latency.

• If the path we set is too short, recognition becomes too sensitive, which hurts the user experience.
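The slides do not describe how the logged path is analyzed. As an illustration of why the path length matters, the sketch below classifies a path by its net displacement and ignores paths that are too short. The function name `classify_path`, the threshold of 30 pixels, and the four direction labels are all hypothetical.

```python
def classify_path(path, min_length=30):
    """Map a list of (x, y) hand positions to a coarse direction, or None.

    min_length is a hypothetical threshold in pixels on net displacement.
    Image coordinates are assumed: y grows downwards.
    """
    if len(path) < 2:
        return None
    dx = path[-1][0] - path[0][0]
    dy = path[-1][1] - path[0][1]
    if max(abs(dx), abs(dy)) < min_length:
        return None  # too short: ignore to avoid over-sensitive triggers
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


print(classify_path([(10, 50), (40, 52), (90, 55)]))  # right
print(classify_path([(10, 50), (12, 51)]))            # None
```

Raising `min_length` makes the system wait for a longer movement before it responds (more latency), while lowering it makes small jitters count as gestures.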

The path looks like this:

Iteration control (3)

• Path segmentation is critical.
• It should be tuned repeatedly so that the system responds at exactly the time the user wants.

To be continued…

Thank you