Fingerspelling Recognition with Support Vector Machines and Hidden Conditional Random Fields
- A comparison with Neural Networks and Hidden Markov Models -
César R. de Souza, Ednaldo B. Pizzolato and Mauro dos Santos Anjo
Universidade Federal de São Carlos (Federal University of São Carlos)
IBERAMIA 2012
Cartagena de Índias, Colombia, 2012
Introduction: Context, Motivation, Objectives and the Organization of this Presentation
Multidisciplinary: Computing and Linguistics
Ethnologue lists about 130 sign languages in existence worldwide (LEWIS, 2009)
Context, Motivation, Objectives, Agenda
Two fronts
Social: aim to improve quality of life for the deaf and increase social inclusion
Scientific: investigation of the distinct interaction methods, computational models and their respective challenges
This paper: investigate the behavior and applicability of SVMs and HCRFs in the recognition of specific signs from the Brazilian Sign Language
Long term: walk towards the creation of a full-fledged recognition system for LIBRAS
This work represents a small but important step towards this goal
Agenda
- Introduction: Context, Motivation, Objectives, Agenda
- Libras: Brazilian Sign Language
- Review: Literature
- Methods and Tools: Support Vector Machines, Conditional Random Fields
- Experiments
- Results
- Conclusion
The Brazilian Sign Language (LIBRAS): Structures and the manual alphabet
Introduction Libras Review Methods Experiments Results Conclusion
Natural language. Not mimicry. Not universal.
It is not only “a problem of the deaf or a language pathology” (QUADROS & KARNOPP, 2004)
LIBRAS, Difficulties, Grammar
Highly context-sensitive: the same sign may have distinct meanings
Interpretation is hard, even for humans
Fingerspelling is only part of the grammar
Needed when explicitly spelling the name of a person or a location
Subset of the full-language recognition problem
Literature
Layer architectures are common: static gestures x dynamic gestures
One of the best works on LIBRAS handles only the movement aspect of the language (Dias et al.)
Few studies explore SVMs, but many use Neural Networks
No studies on HCRFs and LIBRAS
Example: recognition of a fingerspelled word using a two-layered architecture
[Figure: a static gesture classifier (ANN) labels each frame, producing a noisy label sequence such as P P P A A A T T T O O; a sequence classifier (HMM) maps that sequence to the word "Pato".]
YANG, SCLAROFF and LEE, 2009: multiple layers, SVMs
Elmezain, 2011: HCRF, in-air drawing recognition
Models and Tools: Overview of the chosen techniques and reasons for their choice
Static Gesture Recognition
Neural Networks and Support Vector Machines for the detection of static signs
Find $f$ such that $f(x)$ assigns each hand posture $x$ its letter (a, b, c, …)
Biologically inspired: McCulloch & Pitts, Rosenblatt, Rumelhart
Neural Networks
Support Vector Machines: Maximum Margin, Multiple Classes
Perceptron: hyperplane decision, linearly separable problems
Layer architecture: universal approximator
Learning is an ill-posed problem: multiple local minima, ill-conditioning
Strong theoretical basis: Statistical Learning Theory, Structural Risk Minimization (SRM)
- Risk minimization through margin maximization
- Capacity control through margin control
- Sparse solutions considering only a few support vectors
Large-margin classifiers
Problem: binary-only classifier. How to generalize to multiple classes?
Classical approaches: one-against-all, one-against-one, discriminant functions
Drawbacks: only works when equiprobable; evaluation of c(c-1)/2 machines; non-guaranteed optimum results
With 27 static gestures, this would result in 351 SVM evaluations each time a new classification is required!
Directed Acyclic Graphs: generalization of decision trees, allowing for non-directed cycles
Require at most c-1 evaluations
So, for 27 static gestures, only 26 SVM evaluations are required: only 7.4% of the original effort
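The evaluation counts quoted above can be reproduced with a few lines of arithmetic. This is only a sketch of the counting argument; the function names are illustrative and not part of any library.

```python
def one_against_one_evaluations(c: int) -> int:
    """Number of binary machines evaluated per decision by pairwise voting."""
    return c * (c - 1) // 2


def dag_evaluations(c: int) -> int:
    """Number of binary machines evaluated along a single path of the DAG."""
    return c - 1


c = 27  # static gestures in the manual alphabet
full = one_against_one_evaluations(c)
path = dag_evaluations(c)
share = round(100.0 * path / full, 1)
print(full, path, share)
```

The saving grows with the number of classes, since the pairwise count is quadratic in c while the DAG path is linear.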
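Elimination process: one class eliminated at a time
[Figure: decision DAG over the candidates A, B, C, D. The root evaluates A x D; the second level evaluates B x D and A x C; the third level evaluates C x D, B x C and A x B; the leaves are D, C, B and A. The candidate list shrinks from ABCD to BCD or ABC, then to CD, BC or AB, as each losing class is removed.]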
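The elimination process can be sketched as a loop over a shrinking candidate list. This is a minimal illustration, not the implementation used in the experiments: the `duel` callback stands in for a trained binary SVM, and the nearest-prototype rule below is a made-up substitute used only to make the example runnable.

```python
from typing import Callable, Dict, List, Tuple


def dag_classify(classes: List[str],
                 duel: Callable[[str, str], str]) -> Tuple[str, int]:
    """Run one path through the decision DAG.

    At each node the first and last remaining candidates are compared
    and the loser is eliminated, so c classes need c - 1 comparisons.
    """
    candidates = list(classes)
    evaluations = 0
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        winner = duel(first, last)
        evaluations += 1
        if winner == first:
            candidates.pop()      # the last candidate lost
        else:
            candidates.pop(0)     # the first candidate lost
    return candidates[0], evaluations


# Hypothetical stand-in for the binary machines: nearest 1-D prototype.
PROTOTYPES: Dict[str, float] = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}


def make_duel(x: float) -> Callable[[str, str], str]:
    def duel(a: str, b: str) -> str:
        return a if abs(x - PROTOTYPES[a]) <= abs(x - PROTOTYPES[b]) else b
    return duel


label, used = dag_classify(["A", "B", "C", "D"], make_duel(1.9))
print(label, used)
```

For this input the path visited is A x D, then B x D, then B x C, which corresponds to one root-to-leaf route in the figure.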
However, no matter the model, we'll have (extreme) noise due to pose transitions
How can we cope with that?
Dynamic Gesture Recognition
Hidden Markov Models, Conditional Random Fields and Hidden Conditional Random Fields for dynamic gesture recognition
Find $f$ such that, given extremely noisy sequences of labels, it estimates the word being signed.
[Figure: noisy label sequences such as "blyrei", "hil" and "hmeylrlwo" are mapped by $f(x)$ to the words "bye", "hi" and "hello".]
Hidden Markov Models (HMMs)
Conditional Random Fields (CRFs)
Hidden Conditional Random Fields (HCRFs)
Hidden Markov Models
Joint probability model of an observation sequence and its relationship with time
$$p(\mathbf{x}, \mathbf{y}) = \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t)$$
where $A$ holds the transition probabilities $p(y_t \mid y_{t-1})$ and $B$ the emission probabilities $p(x_t \mid y_t)$.
Hidden Markov Models
Marginalizing over $\mathbf{y}$, we obtain the observation sequence likelihood
$$p(\mathbf{x}) = \sum_{\mathbf{y}} \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t)$$
which can be used for classification using either the ML or MAP criteria
$$p(\omega_i \mid \mathbf{x}) = \frac{p(\omega_i)\, p(\mathbf{x} \mid \omega_i)}{\sum_{j=1}^{c} p(\omega_j)\, p(\mathbf{x} \mid \omega_j)}$$
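One model for each word
[Figure: the observation sequence is scored by one HMM per word (Word 1, Word 2, ..., Word n), giving $p(\mathbf{x} \mid \omega_1), p(\mathbf{x} \mid \omega_2), \ldots, p(\mathbf{x} \mid \omega_n)$.]
$$\hat{\omega} = \arg\max_{\omega_j \in \Omega} p(\mathbf{x} \mid \omega_j)\, p(\omega_j)$$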
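The one-model-per-word scheme can be sketched with the forward recursion, which computes the marginal $p(\mathbf{x})$ without enumerating hidden paths. This is a minimal sketch under stated assumptions: observations are discrete label indices, an explicit initial state distribution is added (the slides leave it implicit), and the two toy word models and their probabilities are invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple


class DiscreteHMM:
    """Discrete-observation HMM; all tables here are illustrative."""

    def __init__(self, initial: List[float],
                 transition: List[List[float]],
                 emission: List[List[float]]) -> None:
        self.initial = initial        # initial[i]       = p(y_1 = i)
        self.transition = transition  # transition[i][j] = p(y_t = j | y_{t-1} = i)
        self.emission = emission      # emission[i][o]   = p(x_t = o | y_t = i)

    def likelihood(self, observations: Sequence[int]) -> float:
        """p(x), summed over all hidden paths by the forward recursion."""
        n = len(self.initial)
        alpha = [self.initial[i] * self.emission[i][observations[0]]
                 for i in range(n)]
        for o in observations[1:]:
            alpha = [sum(alpha[i] * self.transition[i][j] for i in range(n))
                     * self.emission[j][o]
                     for j in range(n)]
        return sum(alpha)


def classify(models: Dict[str, DiscreteHMM],
             priors: Dict[str, float],
             observations: Sequence[int]) -> Tuple[str, Dict[str, float]]:
    """MAP decision: pick the word maximizing p(x | word) * p(word)."""
    joint = {w: priors[w] * m.likelihood(observations)
             for w, m in models.items()}
    evidence = sum(joint.values())  # assumed non-zero for this toy example
    posteriors = {w: v / evidence for w, v in joint.items()}
    best = max(posteriors, key=lambda w: posteriors[w])
    return best, posteriors


left_to_right = [[0.7, 0.3], [0.0, 1.0]]
models = {
    "ab": DiscreteHMM([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.2, 0.8]]),
    "ba": DiscreteHMM([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.8, 0.2]]),
}
priors = {"ab": 0.5, "ba": 0.5}
best_word, posteriors = classify(models, priors, [0, 0, 1, 1])
print(best_word, posteriors)
```

With equal priors the MAP and ML decisions coincide; unequal priors only rescale each word's likelihood before the comparison.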
Hidden Markov Models have found great applications in speech recognition
However, a fundamental paradigm shift recently occurred in this field
Probability distributions governing speech signals could not be modeled accurately, rendering “Bayes decision theory inapplicable under those circumstances” (Juang & Rabiner, 2005)
Conditional Random Fields: generalization of the Markov models
Discriminative models: model $p(\mathbf{y} \mid \mathbf{x})$ without incorporating $p(\mathbf{x})$
Designates a family of MRFs: each new observation originates a new MRF
[Figure: relationship between the models, based on the tutorial by Sutton, C., McCallum, A., 2007]
Generative: Naïve Bayes, extended to sequences gives the HMM, extended to graphs gives directional models
Discriminative: Logistic Regression, extended to sequences gives the linear-chain CRF, extended to graphs gives the CRF
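Conditional Random Fields: generalization of the Markov models
$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p)$$
Potential functions, defined over the potential cliques $C_p$:
$$\Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p) = \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$
Partition function:
$$Z(\mathbf{x}) = \sum_{\mathbf{y}} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p)$$
Putting these together:
$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$
Here $f_{pk}$ is the characteristic function vector and $\lambda_{pk}$ the parameter vector, which can be optimized using gradient methods.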
How do we initialize those models? Reaching HCRFs from an HMM
The HMM joint probability, written with the transition matrix $A$ and the emission matrix $B$,
$$p(\mathbf{x}, \mathbf{y}) = \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t) = \prod_{t=1}^{T} A(y_t, y_{t-1})\, B(x_t, y_t)$$
is to be brought into the CRF form
$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$
Taking the exponential of the logarithm turns the product into a sum:
$$p(\mathbf{x}, \mathbf{y}) = \prod_{t=1}^{T} A(y_t, y_{t-1})\, B(x_t, y_t) = \exp\left\{ \ln \prod_{t=1}^{T} A(y_t, y_{t-1})\, B(x_t, y_t) \right\} = \exp\left\{ \sum_{t=1}^{T} \ln A(y_t, y_{t-1}) + \sum_{t=1}^{T} \ln B(x_t, y_t) \right\}$$
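In
$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{t=1}^{T} \ln A(y_t, y_{t-1}) + \sum_{t=1}^{T} \ln B(x_t, y_t) \right\}$$
the entries of $\ln A$ ($n \times n$) and $\ln B$ ($n \times m$) become the parameters. For $n = 3$ states and $m = 2$ symbols:
$$\ln A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \qquad \ln B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix}$$
$$\boldsymbol{\lambda}: \; k = n \times n + n \times m$$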
The parameter vector stacks the entries of $\ln A$ and $\ln B$, with $k = n \times n + n \times m$:
$$\boldsymbol{\lambda} = (a_{11}, a_{12}, a_{13}, a_{21}, a_{22}, a_{23}, a_{31}, a_{32}, a_{33}, b_{11}, b_{12}, b_{21}, b_{22}, b_{31}, b_{32})$$
Each parameter is paired with one feature function:
$$f_{edge}(y_t, y_{t-1}, \mathbf{x}; i, j) = \mathbf{1}\{y_t = i\}\, \mathbf{1}\{y_{t-1} = j\}$$
$$f_{node}(y_t, y_{t-1}, \mathbf{x}; i, o) = \mathbf{1}\{y_t = i\}\, \mathbf{1}\{x_t = o\}$$
The feature vector $\boldsymbol{f}$ follows the same ordering: edge features for $(i, j) = (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)$, then node features for $(i, o) = (1,1), (1,2), (2,1), (2,2), (3,1), (3,2)$.
Writing the sums with indicator functions:
$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\, \mathbf{1}\{y_t = i\}\, \mathbf{1}\{y_{t-1} = j\} + \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{m} b_{ij}\, \mathbf{1}\{y_t = i\}\, \mathbf{1}\{x_t = j\} \right\}$$
$$= \exp\left\{ \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\, f_{edge}(y_t, y_{t-1}, \mathbf{x}; i, j) + \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{m} b_{ij}\, f_{node}(y_t, y_{t-1}, \mathbf{x}; i, j) \right\}$$
$$= \exp\left\{ \sum_{k=1}^{K} \lambda_k f_k(\mathbf{x}, \mathbf{y}) \right\}$$
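The joint probability is now in exponential form,
$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{k=1}^{K} \lambda_k f_k(\mathbf{x}, \mathbf{y}) \right\}$$
and normalizing over all label sequences gives the conditional model
$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\left\{ \sum_{k=1}^{K} \lambda_k f_k(\mathbf{x}, \mathbf{y}) \right\}$$
which matches the general CRF form $p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$.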
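The initialization of a linear-chain CRF from an HMM can be checked numerically on a toy model. The sketch below assumes a fixed start state $y_0$ (the slides leave the first transition implicit) and uses made-up probabilities; it builds the weight vector from $\ln A$ and $\ln B$, counts the edge and node features along a path, and compares the exponential form with the HMM joint probability.

```python
import math
from itertools import product
from typing import List, Sequence

# Toy HMM with n = 2 states and m = 2 symbols (illustrative values).
# A[i][j] = p(y_t = i | y_{t-1} = j), so each column sums to one.
# B[i][o] = p(x_t = o | y_t = i), so each row sums to one.
A = [[0.7, 0.4],
     [0.3, 0.6]]
B = [[0.9, 0.1],
     [0.2, 0.8]]
N_STATES, N_SYMBOLS = 2, 2
START = 0  # assumed fixed state y_0 preceding the sequence


def hmm_joint(x: Sequence[int], y: Sequence[int]) -> float:
    """p(x, y) as a product of transition and emission probabilities."""
    p, prev = 1.0, START
    for xt, yt in zip(x, y):
        p *= A[yt][prev] * B[yt][xt]
        prev = yt
    return p


def weights() -> List[float]:
    """Weight vector: entries of ln A (row-major), then entries of ln B."""
    lam = [math.log(A[i][j]) for i in range(N_STATES) for j in range(N_STATES)]
    lam += [math.log(B[i][o]) for i in range(N_STATES) for o in range(N_SYMBOLS)]
    return lam


def features(x: Sequence[int], y: Sequence[int]) -> List[int]:
    """Edge and node indicator features, summed over time."""
    f = [0] * (N_STATES * N_STATES + N_STATES * N_SYMBOLS)
    prev = START
    for xt, yt in zip(x, y):
        f[yt * N_STATES + prev] += 1                       # edge (i, j)
        f[N_STATES * N_STATES + yt * N_SYMBOLS + xt] += 1  # node (i, o)
        prev = yt
    return f


def crf_score(lam: List[float], x: Sequence[int], y: Sequence[int]) -> float:
    """exp{ sum_k lambda_k f_k(x, y) }."""
    return math.exp(sum(l * c for l, c in zip(lam, features(x, y))))


def crf_posterior(lam: List[float], x: Sequence[int], y: Sequence[int]) -> float:
    """p(y | x): the score normalized by Z(x), found by brute-force enumeration."""
    z = sum(crf_score(lam, x, list(path))
            for path in product(range(N_STATES), repeat=len(x)))
    return crf_score(lam, x, y) / z


lam = weights()
x, y = [0, 1, 1], [0, 1, 1]
print(hmm_joint(x, y), crf_score(lam, x, y), crf_posterior(lam, x, y))
```

Brute-force enumeration of $Z(\mathbf{x})$ is only viable for tiny examples; it is used here to keep the check transparent.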
Drawback: assumes both $\mathbf{x}$ and $\mathbf{y}$ are known
Hidden Conditional Random Fields: generalization of the hidden Markov classifiers
[Figure: parameter space, with the hidden Markov classifier parameters $\theta_{HMMc}$ shown inside the HCRF parameters $\theta_{HCRF}$.]
Sequence classification: model $p(\omega \mid \mathbf{x})$ without explicitly modeling $p(\mathbf{x})$
Does not require $\mathbf{y}$ to be known: the sequence of states is now hidden
The class posterior marginalizes over the hidden state sequence:
$$p(\omega \mid \mathbf{x}) = \sum_{\mathbf{y}} p(\mathbf{y}, \omega \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \sum_{\mathbf{y}} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c, \omega_c; \theta_p)$$
$$\Psi_c(\mathbf{x}_c, \mathbf{y}_c, \omega_c; \theta_p) = \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c, \omega_c) \right\}$$
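Single model for all words
[Figure: HCRF graph with observations $x_{t-2}, x_{t-1}, x_t$, hidden states $y_{t-2}, y_{t-1}, y_t$ and a single class node $\omega$.]
$$\hat{\omega} = \arg\max_{\omega_j \in \Omega} p(\omega_j \mid \mathbf{x})$$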
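A linear-chain version of this single-model classifier can be sketched as follows. The parameterization is an assumption made for illustration (per-class bias, edge and node weights, and a fixed start state); the weights are arbitrary numbers, not trained values. The hidden states are summed out with a forward-style recursion, and a brute-force path score is included only to check that recursion.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class ClassPotentials:
    """Hypothetical per-class weights of a linear-chain HCRF."""
    bias: float              # weight tied to the class label alone
    edge: List[List[float]]  # edge[i][j]: weight for y_t = i, y_{t-1} = j
    node: List[List[float]]  # node[i][o]: weight for y_t = i, x_t = o


START = 0  # assumed fixed state y_0 preceding the sequence


def path_score(p: ClassPotentials, x: Sequence[int], y: Sequence[int]) -> float:
    """Unnormalized score of one hidden path (brute-force reference)."""
    s, prev = p.bias, START
    for xt, yt in zip(x, y):
        s += p.edge[yt][prev] + p.node[yt][xt]
        prev = yt
    return math.exp(s)


def class_score(p: ClassPotentials, x: Sequence[int]) -> float:
    """Sum of path scores over all hidden paths, by a forward recursion."""
    n = len(p.node)
    alpha = [math.exp(p.edge[i][START] + p.node[i][x[0]]) for i in range(n)]
    for o in x[1:]:
        alpha = [sum(alpha[j] * math.exp(p.edge[i][j]) for j in range(n))
                 * math.exp(p.node[i][o])
                 for i in range(n)]
    return math.exp(p.bias) * sum(alpha)


def hcrf_posterior(model: Dict[str, ClassPotentials],
                   x: Sequence[int]) -> Dict[str, float]:
    """p(class | x): class scores normalized over all classes."""
    scores = {w: class_score(p, x) for w, p in model.items()}
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}


model = {
    "hi": ClassPotentials(0.0, [[0.5, -1.0], [0.2, 0.3]],
                          [[1.0, -0.5], [-0.5, 1.0]]),
    "bye": ClassPotentials(-0.2, [[0.1, 0.4], [-0.3, 0.6]],
                           [[-1.0, 0.8], [0.7, -0.9]]),
}
observations = [0, 0, 1]
posterior = hcrf_posterior(model, observations)
print(max(posterior, key=lambda w: posterior[w]), posterior)
```

Setting the edge and node weights of each class to the log-probabilities of a per-word HMM, and the bias to the log prior, would recover the MAP hidden Markov classifier, which is one way to read the "generalization" claim above.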
Experiments and Results: Fingerspelling recognition with SVMs and HCRFs against ANNs and HMMs
[Figure: the two-layered architecture under comparison. Static gesture classifier: ANN or SVM. Sequence classifier: HMM or HCRF. Frame labels such as P P P A A A T T T O O are mapped to the word "Pato".]
Static gesture recognition
- Database of 8100 grayscale images
- Input instances with 1024 features
- 27 classes (manual alphabet signs)
Static gestures (hand postures)
Dynamic gestures (spelled words)
Introduction Libras Review Methods Experiments Results Conclusion
Neural Networks Evaluate initialization heuristics (Nguyen-Widrow)
Support Vector Machines Evaluate the heuristic value for based on the inter-
quartile range of the norm statistics for the input dataset
Static gestures(hand postures)
Dynamic gestures (spelled words)
Static sign classification

Resilient Backpropagation ANN
Hidden Neurons   Kappa
50               0.851
100              0.887
300              0.921
500              0.922
1000             0.924

Gaussian kernel SVM
Sigma            Kappa
0.1              0.000
1                0.106
10               0.569
100              0.950
392 (heuristic)  0.917
1000             0.863

[Figure: kappa and average number of support vectors as a function of sigma. Series: grid search, heuristic, support vectors.]
Static sign classification

Resilient Backpropagation ANN
Hidden Neurons   Kappa
50               0.851
100              0.887
300              0.921
500              0.922
1000             0.925

Gaussian kernel SVM
Sigma            Kappa
0.1              0.000
1                0.106
10               0.569
100              0.959
392 (heuristic)  0.917
1000             0.863

[Figure: kappa and average number of support vectors as a function of sigma. Series: grid search, heuristic, support vectors.]
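The kappa statistic reported in these tables corrects raw accuracy for agreement expected by chance. The slides do not state which variant was used; the sketch below assumes Cohen's kappa computed from expected and predicted labels, with made-up example labels.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(expected: Sequence[Hashable],
                predicted: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed - chance) / (1 - chance).

    Undefined when chance agreement equals 1 (a single class
    everywhere); that case is not handled in this sketch.
    """
    n = len(expected)
    observed = sum(e == p for e, p in zip(expected, predicted)) / n
    expected_counts = Counter(expected)
    predicted_counts = Counter(predicted)
    chance = sum(expected_counts[c] * predicted_counts[c]
                 for c in expected_counts) / (n * n)
    return (observed - chance) / (1.0 - chance)


truth = list("AABBCC")
kappa_one_error = cohen_kappa(truth, list("AABBCA"))
kappa_constant = cohen_kappa(truth, list("AAAAAA"))
print(kappa_one_error, kappa_constant)
```

A classifier that always outputs the same class scores a kappa of zero on balanced data even though its raw accuracy is not zero, which is why kappa is a stricter summary than accuracy for a 27-class problem.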
[Figure: Hyperparameter surface for Gaussian SVMs. Two panels plotted over Sigma (σ²) and C: kappa, and average number of SVs.]
Static gesture classification: statistically significant results (p < 0.01)
Points of interest
- Polynomial machines have increased sparsity but smaller kappa
- Neural networks were faster to evaluate, but not to learn, unless using linear SVMs
- Sigma plays a much more important role than C in Gaussian machines
- Heuristics for choosing sigma and C resulted in great performance values
[Figure: the two-layered architecture with an SVM as the static gesture classifier and an HCRF as the sequence classifier, mapping the frame labels to the word "Pato".]
Dynamic gesture classification
- Database containing 540 signed words, with a total of 63,703 static signs
- The previous layer labels the entire dataset
- Then we tested all possible model combinations
- Estimated kappa sampled from 10-fold CV
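The 10-fold cross-validation protocol can be sketched as an index splitter. This is a generic illustration, not the authors' tooling: the shuffling seed and the interleaved fold assignment are assumptions, and the 540 items stand for the signed words in the database.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        validation = folds[i]
        training = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield training, validation


splits = list(k_fold_indices(540, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each word is used for validation exactly once, so the kappa values from the ten folds can be treated as a sample when comparing model combinations.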
Dynamic gesture classification

Labeller  Classification  Algorithm   Training Kappa  Validation Kappa
SVM       HMM             Baum-Welch  0.95            0.82
SVM       HCRF            RProp       0.98            0.83
ANN       HMM             Baum-Welch  0.95            0.80
ANN       HCRF            RProp       0.99            0.82
- SVM+HCRF showed the best validation result (10-fold CV)
- Combinations using HCRF showed the best results in general
- Training results are statistically different
- We do not have enough evidence to say that the validation results are not equivalent
Hidden Conditional Random Fields, in this specific problem, had a higher ability to retain knowledge while keeping the same generality
In other words, they overfitted less.
Conclusion and future work
Fingerspelling experiments: SVMs & HCRFs vs ANNs & HMMs
Static Gesture Recognition
- Statistically significant results favoring SVMs
- Linear SVMs on DDAGs: best compromise between speed, accuracy and ease of use
- SVMs have shown easier training and reduced training times
- Heuristic initializations work rather well, with less parameter tuning
Dynamic Gesture Recognition
- Choice of gesture classifier had much more impact
- Linear-chain HCRFs: increased knowledge absorption without overfitting
Future work
- Detect standard words rather than fingerspelling (already complete)
- Use Structural Support Vector Machines, which are equivalent to HCRFs but are trained using a hinge loss function
- Use a mixed language model to categorize full phrases
References
BOWDEN, R. et al. A Linguistic Feature Vector for the Visual Interpretation of Sign Language. European Conference on Computer Vision. [S.l.]: Springer-Verlag. 2004. p. 391-401.
BRADSKI, G. R. Computer Vision Face Tracking For Use in a Perceptual User Interface. Intel Technology Journal, n. Q2, 1998. Available at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.7673
DIAS, D. B. et al. Hand movement recognition for Brazilian Sign Language: a study using distance-based neural networks. Proceedings of the 2009 International Joint Conference on Neural Networks. Atlanta, Georgia, USA: IEEE Press. 2009. p. 2355-2362.
FERREIRA-BRITO, L. Por uma gramática de Línguas de Sinais. 2nd ed. Rio de Janeiro: Tempo Brasileiro, 2010. 273 p. ISBN 85-282-0069-8.
FERREIRA-BRITO, L.; LANGEVIN, R. The Sublexical Structure of a Sign Language. Mathématiques, Informatique et Sciences Humaines, v. 125, p. 17-40, 1994.
FEUERSTACK, S.; COLNAGO, J. H.; SOUZA, C. R. D. Designing and Executing Multimodal Interfaces for the Web based on State Chart XML. Proceedings of 3a. Conferência Web W3C Brasil 2011. Rio de Janeiro: [s.n.]. 2011.
PIZZOLATO, E. B.; ANJO, M. D. S.; PEDROSO, G. C. Automatic recognition of finger spelling for LIBRAS based on a two-layer architecture. Proceedings of the 2010 ACM Symposium on Applied Computing. Sierre, Switzerland: ACM. 2010. p. 969-973.
VIOLA, P.; JONES, M. Robust Real-time Object Detection. International Journal of Computer Vision. [S.l.]: [s.n.]. 2001.
VAPNIK, V. N. The nature of statistical learning theory. New York, NY, USA: Springer-Verlag New York, Inc., 1995. ISBN 0-387-94559-8.
VAPNIK, V. N. Statistical learning theory. [S.l.]: Wiley, 1998. ISBN 0471030031.
YANG, R.; SARKAR, S. Detecting Coarticulation in Sign Language using Conditional Random Fields. Pattern Recognition, 2006. ICPR 2006. 18th International Conference on. [S.l.]: [s.n.]. 2006. p. 108-112.
Acknowledgements
Guilherme Cartacho
Fin!
Appendix A: A Framework for Research Support
Accord.NET Framework
- Machine Learning and Artificial Intelligence
- Computer Vision / Audition
- Mathematics and Statistics
Builds upon well established foundations
It has been used to
- Recognize gestures using Wii
- Study and evaluate performance in 3D gesture recognition
- Predict attacks in computer networks
- Compare touch and in-air gestures using Kinect
- Provide sensor information in multimodal interfaces
It has been used in an increasing number of publications
Guido Soetens, Estimating the limitations of single-handed multi-touch input. Master Thesis, Utrecht University. September, 2012.
K. N. Pushpalatha, A. K. Gautham, D. R. Shashikumar, K. B. ShivaKumar. Iris Recognition System with Frequency Domain Features optimized with PCA and SVM Classifier, IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 5, No 1, September 2012.
Arnaud Ogier, Thierry Dorval. HCS-Analyzer: Open source software for High-Content Screening data correction and analysis. Bioinformatics. First published online May 13, 2012.
Ludovico Buffon, Evelina Lamma, Fabrizio Riguzzi, and Davide Forment. Un sistema di vision inspection basato su reti neurali. In Popularize Artificial Intelligence. Proceedings of the AI*IA Workshop and Prize for Celebrating 100th Anniversary of Alan Turing's Birth (PAI 2012), Rome, Italy, June 15, 2012, number 860 in CEUR Workshop Proceedings, pages 1-6, Aachen, Germany, 2012.
Liam Williams, Spotting The Wisdom In The Crowds. Master Thesis on Joint Mathematics and Computer Science. Imperial College London, Department of Computing. June, 2012.
Alosefer, Y.; Rana, O.F.; "Predicting client-side attacks via behaviour analysis using honeypot data," Next Generation Web Services Practices (NWeSP), 2011 7th International Conference on , vol., no., pp.31-36, 19-21 Oct. 2011
Brummitt, L. Scrabble Referee: Word Recognition Component, 2011. Final project report. University of Sheffield, Sheffield, England.
Cani, V., 2011. Image Stitching for UAV remote sensing application. Master Degree Thesis. Computer Engineering, School of Castelldefels of Universitat Politècnica de Catalunya. Barcelona, Spain.
Hassani, A. Z.; "Touch versus in-air Hand Gestures: Evaluating the acceptance by seniors of Human-Robot Interaction using Microsoft Kinect," Master Thesis, University of Twente, Enschede, Netherlands, 2011.
Kaplan, K., 2011. ADES: Automatic Driver Evaluation System. PhD Thesis, Boğaziçi University, Istanbul, Turkey.
Wright, M., Lin, C.-J., O'Neill, E., Cosker, D. and Johnson, P., 2011. 3D Gesture recognition: An evaluation of user and system performance. In: Pervasive Computing - 9th International Conference, Pervasive 2011, Proceedings. Heidelberg: Springer Verlag, pp. 294-313.
Lourenço, J., 2010. Wii3D: Extending the Nintendo Wii Remote into 3D. Final course project report, Rhodes University, Grahamstown. 110p.
Mendelssohn, T.; 2010. Gestureboard - Entwicklung eines Wiimote-basierten, gestengesteuerten, Whiteboard-Systems für den Bildungsbereich. Final project report. Hochschule Furtwangen University, Furtwangen im Schwarzwald, Germany.
http://accord.googlecode.com