Post on 16-Apr-2017
Human Action Recognition Using 3D Joint Information and Pyramidal HOOFD Features
MSc Thesis by Barış Can Üstündağ
Thesis Advisor: Prof. Dr. Mustafa Ünel
Outline
Introduction to Human Action Recognition
  Motivation, Applications
  Related Work
Human Action Recognition Using 3D Joint Information and HOOFD Features
  Acquiring Depth Data
  Feature Extraction
    3D Joints
    HOOFD
  Feature Representation
  Classification
Experiments
  Datasets
    MSR Action 3D Dataset
    MSR Action Pairs Dataset
    MSRC-12 Gesture Dataset
Conclusions & Future Work
Motivation
Motion Perception: Gunnar Johansson [1971]
  Sequence of images for human motion analysis
  Moving Light Displays enable identification of people and gender
Motion Capture [2014]: Dawn of the Planet of the Apes
Motivation
Vast amount of data
Video categorization: how many human-pixels are there?
  Movies: 35%
  TV: 34%
  YouTube: 40%
Motivation - Application
Rehabilitation
  15M people suffer from a stroke every year
  Automated systems
  Gamification
Motivation: Why depth?
Release of low-cost depth cameras
  Kinect (2010)
  Leap Motion (2013)
  Google Tango (developers only, 2014)
Effective and robust performance given
  Complex backgrounds
  Challenging viewpoints
  Occlusions
Related Work
Intensity based
  Extraction of cuboids, Dollar et al. [CVPR, 2005]
  Motion History Images and Motion Energy Images, Gorelick et al. [PAMI, 2007]
Depth map based
  Histogram of Oriented 4D Normals (HON4D), Oreifej et al. [CVPR, 2013]
  Depth Motion Maps, Yang et al. [JRTIP, 2012]
Skeletal data based
  Sequence of Most Informative Joints (SMIJ), Ofli et al. [CVIU, 2013]
  View Invariant Human Action Recognition Using Histogram of 3D Joints, Xia et al. [CVPR, 2012]
Human Action Recognition Using 3D Joint Information and HOOFD Features
Depth Acquisition
  Formation of shadows
  Eliminating the noise
Feature Extraction
  3D Joints
  HOOFD
Feature Representation
  Signal Warping
  Pyramidal HOOFD Features
Classification
  Naive Bayes
  Support Vector Machines
Kinect
Depth data acquisition is accomplished using the Light Coding method.
Before the depth data can be processed in any application, two issues must be handled:
  Shadows: generated by the foreground objects
  Noise: rough object boundaries cause gaps and holes in the depth data
Bilateral Filter
  Space term
  Range term
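The filter formula on this slide did not survive extraction. The sketch below assumes the standard Gaussian form of the bilateral filter, in which each neighbor is weighted by the product of a space term and a range term. The function name, the parameter names (`radius`, `sigma_space`, `sigma_range`) and their default values are illustrative assumptions, not settings taken from the thesis.

```python
import math


def bilateral_filter(depth, radius=2, sigma_space=2.0, sigma_range=30.0):
    """Smooth a 2D depth map (list of rows) while keeping depth edges sharp."""
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = depth[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue  # skip neighbors outside the image
                    # Space term: spatially closer pixels get a larger weight.
                    space = math.exp(-(dx * dx + dy * dy) / (2 * sigma_space ** 2))
                    # Range term: pixels with a similar depth get a larger weight.
                    diff = depth[ny][nx] - center
                    rng = math.exp(-(diff * diff) / (2 * sigma_range ** 2))
                    weighted_sum += space * rng * depth[ny][nx]
                    weight_total += space * rng
            # weight_total >= 1 because the center pixel always has weight 1.
            out[y][x] = weighted_sum / weight_total
    return out
```

Because the range term suppresses neighbors whose depth differs strongly from the center pixel, small noise is averaged out while object boundaries are left largely intact.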
Joint Features
20 joints are provided by the Kinect SDK.
10 joint angles and their derivatives are calculated.
Joint positions are mapped to spherical coordinates:
  Origin is aligned to the hip center
  Radius parameter is discarded
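A minimal sketch of these joint features follows. The axis convention (inclination measured from the z axis), the choice of which joints form each of the 10 angles, and all function names are assumptions made for illustration; the slides do not specify them.

```python
import math


def joints_to_spherical(joints, hip_center):
    """Map 3D joints to (azimuth, inclination) pairs relative to the hip center."""
    hx, hy, hz = hip_center
    angles = []
    for x, y, z in joints:
        dx, dy, dz = x - hx, y - hy, z - hz
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        if r == 0.0:
            angles.append((0.0, 0.0))  # joint coincides with the origin
            continue
        azimuth = math.atan2(dy, dx)
        inclination = math.acos(max(-1.0, min(1.0, dz / r)))
        angles.append((azimuth, inclination))  # the radius r is discarded
    return angles


def joint_angle(a, b, c):
    """Angle in radians at joint b between the segments b->a and b->c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.sqrt(sum(p * p for p in v1)) * math.sqrt(sum(q * q for q in v2))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def finite_difference(values, dt=1.0):
    """Approximate the time derivative of a per-frame angle sequence."""
    return [(values[t + 1] - values[t]) / dt for t in range(len(values) - 1)]
```

Discarding the radius makes the angular features independent of limb length, so subjects of different sizes produce comparable values.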
Histogram of Oriented Optical Flows from Depth (HOOFD)
Optical Flow from Depth Data
  Depth data is mapped to an intensity image: depth values (z) are represented as intensity (I)
  The resulting optical flow field is invariant to sudden changes of brightness
Optical Flow
  2D displacement of pixel patches on the image plane
  Brightness constancy equation
  Linearized with a Taylor series expansion, assuming small (u, v)
The brightness constancy equation states that the brightness values of individual pixels on a local patch are preserved.
Linearizing the equation around I(x, y, t) with a Taylor series expansion gives its linearized form.
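The equations on this slide were not preserved in the extracted text. In the standard formulation, which is assumed here, the brightness constancy equation and its first-order Taylor linearization read as follows, where (u, v) is the flow vector and I_x, I_y, I_t are the partial derivatives of the intensity image:

```latex
\begin{align}
I(x, y, t) &= I(x + u,\; y + v,\; t + 1) \\
I(x + u,\; y + v,\; t + 1) &\approx I(x, y, t) + I_x u + I_y v + I_t \\
I_x u + I_y v + I_t &= 0
\end{align}
```

The last line is one equation in two unknowns per pixel, which is why an additional constraint (a local patch or a global smoothness term) is needed to recover (u, v).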
Optical Flow: Lucas-Kanade Method
  Apply the constraint within a local patch
  Minimize using the least-squares method
Although the equation is assumed to equal 0, in practice it does not hold exactly.
We therefore discretize the equation and apply it within a local patch, which yields a cost function.
Minimizing this cost function by least squares gives the optical flow vectors.
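The least-squares step can be sketched for a single patch as below. This is an illustrative single-scale version that assumes central differences for the spatial derivatives and a frame difference for the temporal derivative; the function name and the `half` patch parameter are hypothetical, and the thesis implementation may differ.

```python
def lucas_kanade_patch(frame0, frame1, cy, cx, half=2):
    """Estimate one flow vector (u, v) for the patch centered at (cy, cx).

    frame0, frame1: 2D lists of intensities. The patch plus a one-pixel
    border must lie inside the image. Returns None when the patch has too
    little texture for a unique solution.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            ix = (frame0[y][x + 1] - frame0[y][x - 1]) / 2.0  # dI/dx
            iy = (frame0[y + 1][x] - frame0[y - 1][x]) / 2.0  # dI/dy
            it = frame1[y][x] - frame0[y][x]                  # dI/dt
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Normal equations: [[sxx, sxy], [sxy, syy]] @ [u, v] = -[sxt, syt]
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v
```

A flat patch has zero gradients, so the 2x2 system is singular and no flow can be estimated there; gaps and holes in depth data tend to create exactly such regions.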
Optical Flow: Horn-Schunck Method
  Assumption: global smoothness in the flow over the whole image
  Smoothness error
  Error in the brightness constancy equation
  Minimize the sum of both error terms
The literature also offers another method, proposed by Horn and Schunck, which introduces a global smoothness constraint over the whole image.
This is a useful way to correct errors that are caused by the gaps and holes in the depth data.
Smoothness is enforced by penalizing spatial variation in the velocities, i.e. the optical flow vectors.
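A compact sketch of the classic Horn-Schunck iteration follows. It assumes a plain 4-neighbor average in place of the weighted neighborhood average of the original formulation, and the names `alpha` (smoothness weight) and `iterations` are illustrative choices rather than values from the thesis.

```python
def horn_schunck(frame0, frame1, alpha=1.0, iterations=100):
    """Dense flow (u, v) between two equally sized 2D lists of intensities."""
    h, w = len(frame0), len(frame0[0])

    def at(img, y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbor.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # Central differences in space, frame difference in time.
    ix = [[(at(frame0, y, x + 1) - at(frame0, y, x - 1)) / 2.0
           for x in range(w)] for y in range(h)]
    iy = [[(at(frame0, y + 1, x) - at(frame0, y - 1, x)) / 2.0
           for x in range(w)] for y in range(h)]
    it = [[frame1[y][x] - frame0[y][x] for x in range(w)] for y in range(h)]

    u = [[0.0] * w for _ in range(h)]
    v = [[0.0] * w for _ in range(h)]
    for _ in range(iterations):
        new_u = [[0.0] * w for _ in range(h)]
        new_v = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                # Local averages carry the smoothness assumption.
                u_avg = (at(u, y - 1, x) + at(u, y + 1, x)
                         + at(u, y, x - 1) + at(u, y, x + 1)) / 4.0
                v_avg = (at(v, y - 1, x) + at(v, y + 1, x)
                         + at(v, y, x - 1) + at(v, y, x + 1)) / 4.0
                # Residual of the linearized brightness constancy equation.
                residual = ix[y][x] * u_avg + iy[y][x] * v_avg + it[y][x]
                denom = alpha ** 2 + ix[y][x] ** 2 + iy[y][x] ** 2
                new_u[y][x] = u_avg - ix[y][x] * residual / denom
                new_v[y][x] = v_avg - iy[y][x] * residual / denom
        u, v = new_u, new_v
    return u, v
```

A larger `alpha` weights the smoothness error more heavily, which lets flow from textured regions fill in pixels where the data term is unreliable.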
Histogram of Oriented Optical Flow from Depth
Binning according to:
  Primary angle between the flow vector and the horizontal axis
  Magnitude of the flow vector
Orientation & Magnitude images
Histogram Binning example with bin size = 4
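The binning can be sketched as follows. The sketch assumes the common HOOF convention: each flow vector votes into the bin of its primary angle, the vote is weighted by the vector's magnitude, and the histogram is normalized to sum to 1. Whether the thesis uses exactly this weighting and normalization is not stated on the slide, and `num_bins` is an illustrative parameter name.

```python
import math


def hoof_histogram(flow_vectors, num_bins=4):
    """Histogram of flow orientations, weighted by flow magnitude.

    flow_vectors: iterable of (u, v) pairs.
    """
    hist = [0.0] * num_bins
    for u, v in flow_vectors:
        magnitude = math.hypot(u, v)
        if magnitude == 0.0:
            continue  # zero flow has no orientation
        # Primary angle in [-pi/2, pi/2]: left-pointing vectors are mirrored
        # onto their right-pointing counterparts.
        theta = math.atan2(v, abs(u))
        b = int((theta + math.pi / 2) / math.pi * num_bins)
        b = min(b, num_bins - 1)  # theta == pi/2 falls into the last bin
        hist[b] += magnitude
    total = sum(hist)
    return [x / total for x in hist] if total > 0 else hist
```

For instance, `hoof_histogram([(1, 0), (-1, 0), (0, 2)])` places the two horizontal vectors in the same bin, so a motion and its mirror image produce the same histogram.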
Signal Warping
  If the action instance is longer than the target length: discard frames
  If the action instance is shorter than the target length: replicate and insert frames
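One simple way to realize this warping is uniform index resampling, sketched below. The slide does not say which frames are discarded or replicated, so the even spacing and the names `warp_to_length` and `target_len` are assumptions.

```python
def warp_to_length(frames, target_len):
    """Resample a frame sequence to exactly target_len frames.

    Longer sequences lose evenly spaced frames; shorter sequences have
    frames replicated in place.
    """
    n = len(frames)
    if n == 0 or target_len <= 0:
        raise ValueError("need a non-empty sequence and a positive target length")
    # (i * n) // target_len is always in [0, n - 1] for 0 <= i < target_len.
    return [frames[(i * n) // target_len] for i in range(target_len)]
```

Usage: `warp_to_length([0, 1, 2, 3, 4, 5], 3)` returns `[0, 2, 4]`, and `warp_to_length([0, 1, 2], 6)` returns `[0, 0, 1, 1, 2, 2]`, so every action instance ends up with a feature vector of the same length.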
Pyramidal HOOFD Features
Histogram of Oriented Optical Flow from Depth
After obtaining the optical flow:
  1. Patches are extracted around each joint
  2. HOOFDs are calculated in a pyramidal fashion (Level 1, Level 2, Level 3)
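The slides do not define the pyramid layout. The sketch below assumes a spatial pyramid in which level k splits the joint patch into 2^(k-1) x 2^(k-1) cells and the per-cell histograms are concatenated. This layout, the helper `_hoof`, and the parameter names are hypothetical and only illustrate one plausible reading of "pyramidal".

```python
import math


def _hoof(vectors, num_bins):
    """Magnitude-weighted histogram of primary flow angles, normalized to 1."""
    hist = [0.0] * num_bins
    for u, v in vectors:
        mag = math.hypot(u, v)
        if mag == 0.0:
            continue
        theta = math.atan2(v, abs(u))  # primary angle in [-pi/2, pi/2]
        b = min(int((theta + math.pi / 2) / math.pi * num_bins), num_bins - 1)
        hist[b] += mag
    total = sum(hist)
    return [x / total for x in hist] if total > 0 else hist


def pyramidal_hoofd(flow_patch, levels=3, num_bins=4):
    """Concatenate per-cell histograms over an assumed spatial pyramid.

    flow_patch: square 2D list of (u, v) tuples around one joint; its side
    length is assumed to be divisible by 2 ** (levels - 1).
    """
    size = len(flow_patch)
    feature = []
    for level in range(levels):
        cells = 2 ** level           # 1, 2, 4 cells per side
        step = size // cells
        for cy in range(cells):
            for cx in range(cells):
                vectors = [flow_patch[y][x]
                           for y in range(cy * step, (cy + 1) * step)
                           for x in range(cx * step, (cx + 1) * step)]
                feature.extend(_hoof(vectors, num_bins))
    return feature
```

Under these assumptions a three-level pyramid with 4 bins yields (1 + 4 + 16) x 4 = 84 values per joint patch, combining a coarse summary of the whole patch with finer, location-aware histograms.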
Supervised learning methods
  Training examples are attached to known classes
  Example application: spam filtering in an e-mail client
  Examples: Naive Bayes, Support Vector Machines
Naive Bayes Classifier
  Independence assumption between features
  For example, a car may be red and have 17-inch wheels; each of these features contributes independently to the probability that the car is a Volkswagen
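The independence assumption means the class score is a sum of per-feature log-likelihoods. The sketch below assumes Gaussian per-feature likelihoods, which the slides do not specify; the function names and the small variance floor are illustrative.

```python
import math
from collections import defaultdict


def train_gaussian_nb(samples, labels):
    """Fit per-class priors and per-feature means and variances."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(samples)), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the label with the highest log-posterior for sample x."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        # Independence assumption: per-feature log-likelihoods simply add up.
        for value, mean, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Each feature contributes its own term to the score, mirroring the car example: color and wheel size are evaluated separately and then combined.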
Support Vector Machines
  Finds the optimal (maximum-margin) hyperplane that defines the decision boundary between two classes
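As a rough illustration of the separating hyperplane, the sketch below trains a linear SVM by sub-gradient descent on the regularized hinge loss. This is an approximate stand-in chosen for brevity, not the solver used in the thesis, and the hyperparameters `lam`, `lr` and `epochs` are arbitrary illustrative values.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Fit a hyperplane w.x + b = 0; labels must be -1 or +1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization step: shrinking w favors a wider margin.
            w = [wi - lr * lam * wi for wi in w]
            if margin < 1:
                # Hinge-loss step: push the violating sample past the margin.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict_linear_svm(w, b, x):
    """Classify x by the side of the hyperplane it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

Only samples that violate the margin change the hyperplane, which reflects the idea that the decision boundary is determined by the points closest to it. Multi-class action recognition would need an extension such as one-vs-rest, which is not shown here.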
Datasets
  MSR Action 3D: 10 subjects, 20 actions
  MSR Action Pairs: 10 subjects, 12 actions
  MSRC-12 Gesture: 30 subjects, 12 actions
Experiments
Experiment - 1
Settings
  Dataset: MSRC-12 Gesture
  Feature: Joint Features
  Ratio:
    Leave-one-subject-out cross-validation
    50% training, 50% test
    75% training, 25% test
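Leave-one-subject-out cross-validation holds out all samples of one subject per fold and trains on the rest. A minimal sketch of the split, with a hypothetical function name and a plain list of per-sample subject IDs as input:

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids: one subject identifier per sample, in sample order.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

Usage: for `subject_ids = [1, 1, 2, 3, 3]` the first fold is `(1, [2, 3, 4], [0, 1])`. Since no subject appears in both sets of a fold, the reported accuracy reflects performance on unseen people.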
Experiment - 2
Settings
  Feature: HOOFD Features
  Dataset: MSR Action 3D
  Ratio: 50% training, 50% test
Notes on compared methods:
  HON4D: to make the descriptors more discriminative, the 4D space is quantized using the vertices of a polychoron
  Dictionary Learning with Group Sparsity and Geometric Constraint, with Temporal Pyramid Matching
[Figure: Smash action; Forward Punch action]
Experiment - 3
Settings
  Feature: HOOFD Features
  Dataset: MSR Action Pairs
  Ratio: 50% training, 50% test
Conclusion & Future Work
We developed a novel human action recognition framework by fusing 3D joint information and HOOFD features.
We proposed a new feature called Histogram of Oriented Optical Flow from Depth (HOOFD).
Several experiments with publicly available datasets were conducted to assess the performance of the proposed technique.
Comparison with state-of-the-art algorithms shows the success of our algorithm.
As future work:
  The potential of HOOFD will be explored further
  Other popular classification approaches will be employed (Bag of Words, Random Forest, Boosted Trees)
Thank you. Questions?