HMM-based Activity Recognition with a Ceiling RGB-D Camera
Daniele Liciotti, Emanuele Frontoni, Primo Zingaretti, Nicola Bellotto, and Tom Duckett
Università Politecnica delle Marche (Italy), University of Lincoln (UK)
{d.liciotti, e.frontoni, p.zingaretti}@univpm.it, {nbellotto, tduckett}@lincoln.ac.uk
Introduction
The main goal is to classify and predict the probability of the action performed by an observed subject. We perform activity detection and recognition using an inexpensive RGB-D camera. Human activities, despite their unstructured nature, tend to have a natural hierarchical structure; for instance, making a coffee generally involves a three-step process of turning on the coffee machine, putting sugar in the cup and opening the fridge for milk. Action sequence recognition is then handled using a discriminative hidden Markov model (HMM). The innovative aspects are the proposal of an adequate HMM structure and the use of 3D head and hand positions to estimate the probability that a certain action will be performed, which, to the best of our knowledge, has never been done before for ADL recognition in indoor environments.
ADLs Model
[Block diagram: 3D head and hand points form the observation sequence O for a bank of models HMM1, HMM2, HMM3, each covering activities 1…n; the classification stage selects the model with maximum likelihood.]
Figure 1: Block diagram of the recognition process.
Information provided by the head and hand detection algorithms is used as input for a set of HMMs, each of which recognises a different action sequence. After training a model λ, we consider an action sequence

s = {s1, s2, . . . , sn}

and calculate the probability P(s|λ) of the observation sequence under that model. We then classify the action as the one with the largest posterior probability. Figure 1 depicts the general scheme of the recognition process. In particular, we used three different HMMs, whose observations are the 3D points of: the head; the hands; both head and hands together. Finally, the classification module outputs the action xj that maximises P_HMMi, the probability under model HMMi that the trajectory follows the activity sequence s given the sequence of n observations o1:n, i.e.:

xj = arg max_i P_HMMi(X1:n ∈ seqn(s) | o1:n)
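The maximum-likelihood selection above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes discretised observations and made-up toy parameters, scoring each candidate HMM with the forward algorithm in the log domain and returning the best-scoring activity label, as in Figure 1.

```python
import numpy as np

def log_likelihood(obs, pi, A, B):
    """Forward algorithm in the log domain: returns log P(obs | lambda),
    where lambda = (pi, A, B) is a discrete-observation HMM."""
    log_alpha = np.log(pi) + np.log(B[:, obs[0]])
    for o in obs[1:]:
        m = log_alpha.max()                      # log-sum-exp for stability
        log_alpha = m + np.log(np.exp(log_alpha - m) @ A) + np.log(B[:, o])
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())

def classify(obs, models):
    """Maximum-likelihood selection over a bank of HMMs: return the
    activity label whose model best explains the observation sequence."""
    return max(models, key=lambda k: log_likelihood(obs, *models[k]))

# Toy two-state models with illustrative (not real) parameters.
A = np.array([[0.9, 0.1], [0.1, 0.9]])
pi = np.array([0.9, 0.1])
models = {
    "coffee": (pi, A, np.array([[0.9, 0.1], [0.1, 0.9]])),  # mostly emits 0
    "fridge": (pi, A, np.array([[0.1, 0.9], [0.9, 0.1]])),  # mostly emits 1
}
print(classify([0, 0, 0, 0], models))  # -> coffee
```

In practice one model per activity is trained on its own sequences, and the argmax over per-model likelihoods implements the "select model with maximum likelihood" block of Figure 1.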
Setup and Acquisition
Figure 2: Snapshots of RADiAL session registration (panels (a)–(d)).
To evaluate the usefulness of our approach for activity recognition, we built a new dataset (RADiAL) that contains common daily activities such as making coffee, making tea, opening the fridge and using the kettle. The RGB-D camera was installed on the ceiling of the L-CAS laboratory, approximately 4 m above the floor (Figure 3).
Figure 3: Reconstructed layout of the kitchenette (fridge, tea/sugar, kettle, coffee machine) where the RGB-D camera is installed.
The RADiAL dataset¹ was collected in an open-plan office of the Lincoln Centre for Autonomous Systems (L-CAS). The office consists of a kitchenette, resting area, lounge and 20 working places occupied by students and postdoctoral researchers. We installed a ceiling RGB-D camera (Figure 3) that took a snapshot (with dimensions of 320×240 pixels, Figure 2) of the kitchenette area every second for 5 days, and we hand-annotated the activities of one of the researchers over time. Furthermore, the RADiAL dataset contains the 3D positions of the head and hands of each person, with a minute-by-minute timeline of 5 different activities performed in the kitchen over the course of the days. RADiAL contains 100 trials, each including the actions related to one person.
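The 3D head and hand trajectories can feed the HMMs in several ways. One plausible preprocessing step, sketched below with entirely hypothetical region coordinates (the kitchenette geometry is not published here), is to quantise each 3D point to its nearest area of interest, yielding a discrete observation sequence:

```python
import numpy as np

# Hypothetical xy centres (metres) of the kitchenette areas of interest;
# illustrative values only, not taken from RADiAL.
REGIONS = {"fridge": (0.5, 0.5),
           "tea/sugar": (1.5, 0.5),
           "kettle": (2.5, 0.5),
           "coffee machine": (2.5, 1.5)}

def to_symbols(points_3d):
    """Quantise 3D head/hand points (x, y, z) to the index of the nearest
    region centre, giving a discrete observation sequence for an HMM."""
    centres = np.array(list(REGIONS.values()))
    xy = np.asarray(points_3d, dtype=float)[:, :2]        # ignore height
    dists = np.linalg.norm(xy[:, None, :] - centres[None, :, :], axis=2)
    return dists.argmin(axis=1).tolist()

print(to_symbols([[0.4, 0.6, 1.7], [2.6, 1.4, 1.1]]))  # -> [0, 3]
```

A continuous-observation HMM (e.g. with Gaussian emissions over the raw 3D coordinates) is an equally valid alternative; the quantisation above is only one way to obtain discrete symbols.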
Experimental Results
Five models were used to recognize the activities in the RADiAL dataset, corresponding respectively to the activities “other” (all the other activities performed in a kitchen environment), “coffee” (making a coffee), “kettle” (taking the kettle), “tea/sugar” (making tea or taking sugar), and “fridge” (opening the fridge). The results were obtained using two different validation techniques.
(a) HMM1 — confusion matrix (rows: true label; columns: predicted label)

             other  coffee  kettle  tea/sugar  fridge
other        0.574  0.236   0.029   0.105      0.056
coffee       0.072  0.797   0.114   0.015      0.002
kettle       0.053  0.070   0.701   0.161      0.015
tea/sugar    0.069  0.048   0.121   0.704      0.058
fridge       0.091  0.033   0.025   0.243      0.609
(b) HMM2 — confusion matrix (rows: true label; columns: predicted label)

             other  coffee  kettle  tea/sugar  fridge
other        0.704  0.106   0.032   0.089      0.068
coffee       0.045  0.831   0.052   0.070      0.003
kettle       0.020  0.194   0.579   0.202      0.006
tea/sugar    0.036  0.067   0.198   0.678      0.021
fridge       0.056  0.084   0.022   0.184      0.654
(c) HMM3 — confusion matrix (rows: true label; columns: predicted label)

             other  coffee  kettle  tea/sugar  fridge
other        0.759  0.073   0.024   0.087      0.058
coffee       0.031  0.875   0.041   0.052      0.002
kettle       0.014  0.139   0.696   0.143      0.007
tea/sugar    0.027  0.047   0.156   0.748      0.021
fridge       0.036  0.063   0.019   0.168      0.713
Figure 4: k-fold cross-validation confusion matrices.
Table 1: Classification results (cross-validation).

                 HMM1                 HMM2                 HMM3
             PPV   TPR   F1       PPV   TPR   F1       PPV   TPR   F1
other        0.73  0.57  0.64     0.89  0.70  0.79     0.93  0.76  0.84
coffee       0.67  0.80  0.73     0.69  0.83  0.75     0.76  0.87  0.81
kettle       0.60  0.70  0.65     0.47  0.58  0.52     0.58  0.70  0.63
tea/sugar    0.66  0.70  0.68     0.64  0.68  0.66     0.70  0.75  0.72
fridge       0.74  0.61  0.67     0.74  0.65  0.69     0.78  0.71  0.74
avg          0.68  0.68  0.68     0.73  0.71  0.71     0.78  0.77  0.77
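The per-class PPV (precision), TPR (recall) and F1 scores in Table 1 follow from the raw confusion counts. A minimal reference computation is sketched below (note that the matrices in Figure 4 are row-normalised, so the raw counts, not the normalised rates, are needed to reproduce the PPV column exactly):

```python
import numpy as np

def per_class_scores(cm):
    """PPV (precision), TPR (recall) and F1 per class, from a confusion
    matrix of raw counts with true labels on rows, predictions on columns."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    ppv = tp / cm.sum(axis=0)            # column sums: predicted per class
    tpr = tp / cm.sum(axis=1)            # row sums: actual per class
    f1 = 2 * ppv * tpr / (ppv + tpr)
    return ppv, tpr, f1

# Toy 2x2 example with made-up counts, for illustration only.
ppv, tpr, f1 = per_class_scores([[5, 1], [2, 4]])
```

The averages in the last row of Table 1 are then the unweighted means of the per-class columns.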
¹ http://vrai.dii.univpm.it/radial-dataset