Fingerspelling Recognition with Support Vector Machines and Hidden Conditional Random Fields

A comparison with Neural Networks and Hidden Markov Models. César R. de Souza, Ednaldo B. Pizzolato and Mauro dos Santos Anjo, Universidade Federal de São Carlos (Federal University of São Carlos). IBERAMIA 2012, Cartagena de Índias, Colombia, 2012.


Transcript of Fingerspelling Recognition with Support Vector Machines and Hidden Conditional Random Fields

Page 1: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

- A comparison with Neural Networks and Hidden Markov Models -

César R. de Souza, Ednaldo B. Pizzolato and Mauro dos Santos Anjo

Universidade Federal de São Carlos (Federal University of São Carlos)

IBERAMIA 2012

Cartagena de Índias, Colombia, 2012

Page 2: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Introduction

Context, Motivation, Objectives and the Organization of this Presentation

Page 3: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Context

Multidisciplinary: Computing and Linguistics

Ethnologue lists about 130 sign languages in existence worldwide (LEWIS, 2009)

Page 4: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Motivation

Two fronts

Social: aim to improve quality of life for the deaf and increase social inclusion

Scientific: investigation of the distinct interaction methods, computational models and their respective challenges

Page 5: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Objectives

This paper: investigate the behavior and applicability of SVMs and HCRFs in the recognition of specific signs from the Brazilian Sign Language (LIBRAS)

Long term: work towards the creation of a full-fledged recognition system for LIBRAS

This work represents a small but important step towards this goal

Page 6: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Agenda

Introduction: Context, Motivation, Objectives, Agenda

Libras: Brazilian Sign Language

Review: Literature

Methods and Tools: Support Vector Machines, Conditional Random Fields

Experiments

Results

Conclusion

Page 7: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

The Brazilian Sign Language (LIBRAS)

Structures and the manual alphabet

Page 8: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

LIBRAS

Natural language

Not mimics

Not universal

It is not only “a problem of the deaf or a language pathology” (QUADROS & KARNOPP, 2004)

Page 9: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Difficulties

Highly context-sensitive: the same sign may have distinct meanings

Interpretation is hard even for humans

Page 10: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Grammar

Fingerspelling is only part of the grammar

Needed when explicitly spelling the name of a person or a location

Subset of the full-language recognition problem

Page 11: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Page 12: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Literature

Layered architectures are common

Static gestures vs. dynamic gestures

One of the best works on LIBRAS handles only the movement aspect of the language (Dias et al.)

Page 13: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Few studies explore SVMs, but many use Neural Networks

No studies on HCRFs and LIBRAS

Page 14: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Example: recognition of a fingerspelled word using a two-layered architecture

[Figure: a static gesture classifier (ANN) labels each frame, producing a noisy sequence of letter labels (P, A, T, O); a sequence classifier (HMM) maps that sequence to the word "Pato"]

Page 15: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

YANG, SCLAROFF and LEE, 2009: multiple layers, SVMs

Elmezain, 2011: HCRF, in-air drawing recognition

Page 16: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Models and Tools

Overview of the chosen techniques and reasons for their choice

Page 17: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Static Gesture Recognition

Neural Networks and Support Vector Machines for the detection of static signs

Page 18: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

[Figure: hand-posture images mapped by $f(x)$ to the letters a, b, c]

Find $f(x)$ such that…

Page 19: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Neural Networks

Biologically inspired

McCulloch & Pitts, Rosenblatt, Rumelhart

Page 20: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Perceptron: hyperplane decision, linearly separable problems

Layered architecture: universal approximator

Learning is an ill-posed problem: multiple local minima, ill-conditioning

Page 21: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Support Vector Machines

Strong theoretical basis: Statistical Learning Theory, Structural Risk Minimization (SRM)

Page 22: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Maximum Margin

Large-margin classifiers

Risk minimization through margin maximization

Capacity control through margin control

Sparse solutions considering only a few support vectors

Page 23: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Multiple Classes

Problem: binary-only classifier

How to generalize to multiple classes?

Page 24: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Classical approaches: one-against-all, one-against-one, discriminant functions

Drawbacks: only works when equiprobable; evaluation of c(c-1)/2 machines; non-guaranteed optimum results

With 27 static gestures, this would result in 351 SVM evaluations each time a new classification is required!

Page 25: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Directed Acyclic Graphs: generalization of Decision Trees, allowing for non-directed cycles

Require at most c-1 evaluations

So, for 27 static gestures, only 26 SVM evaluations are required: only 7.4% of the original effort
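The evaluation counts quoted on these slides can be checked with a few lines of arithmetic. The sketch below only restates the two formulas from the slides, c(c-1)/2 for one-against-one and c-1 for a decision DAG; the function names are illustrative and not taken from the presentation.

```python
def pairwise_machines(c: int) -> int:
    """Number of binary SVMs evaluated by a one-against-one vote: c(c-1)/2."""
    return c * (c - 1) // 2


def ddag_evaluations(c: int) -> int:
    """Number of binary SVMs evaluated along one path of a decision DAG: c-1."""
    return c - 1


c = 27                                   # static gestures in the manual alphabet
total = pairwise_machines(c)             # 351 machines
path = ddag_evaluations(c)               # 26 evaluations
share = round(100 * path / total, 1)     # 7.4 (percent of the original effort)
print(total, path, share)
```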

Page 26: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

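Elimination process: one class eliminated at a time

[Figure: decision DAG over the candidate classes A, B, C, D. The root tests A x D with all four candidates (ABCD). If A loses the candidates become BCD, tested with B x D; if D loses they become ABC, tested with A x C. The third level tests C x D, B x C or A x B on the remaining pairs (CD, BC, AB). Each test eliminates the losing class, and the leaves are the single classes D, C, B, A]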
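The elimination process in the diagram can be sketched as a loop that always tests the first remaining candidate against the last one and drops the loser. This is a minimal illustration under stated assumptions: `duel` stands in for a trained binary SVM, and `make_toy_duel` is a hypothetical placeholder that simply favours the letter closest to a chosen "true" class. Neither comes from the presentation.

```python
from typing import Callable, Sequence


def ddag_classify(classes: Sequence[str],
                  duel: Callable[[str, str], str]) -> tuple[str, int]:
    """Eliminate one class per binary decision until a single class remains."""
    candidates = list(classes)
    evaluations = 0
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        winner = duel(first, last)        # one binary machine evaluation
        evaluations += 1
        if winner == first:
            candidates.pop()              # the last candidate lost
        else:
            candidates.pop(0)             # the first candidate lost
    return candidates[0], evaluations


def make_toy_duel(true_class: str) -> Callable[[str, str], str]:
    """Hypothetical stand-in for the pairwise SVMs (assumption, not a real model)."""
    def duel(a: str, b: str) -> str:
        dist_a = abs(ord(a) - ord(true_class))
        dist_b = abs(ord(b) - ord(true_class))
        return a if dist_a <= dist_b else b
    return duel


label, used = ddag_classify("ABCD", make_toy_duel("C"))
print(label, used)  # C 3: the path A x D, then B x D, then B x C
```

With four classes the loop always performs three evaluations, and with 27 classes it performs 26, whichever class wins.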

Page 27: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

However, no matter the model, we’ll have (extreme) noise due to pose transitions

How can we cope with that?

Page 28: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Dynamic Gesture Recognition

Hidden Markov Models, Conditional Random Fields and Hidden Conditional Random Fields for dynamic gesture recognition.

Page 29: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Find $f(x)$ such that, given extremely noisy sequences of labels, it estimates the word being signed.

[Figure: noisy label sequences mapped by $f(x)$ to words: "blyrei" to "bye", "hil" to "hi", "hmeylrlwo" to "hello"]

Page 30: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Hidden Markov Models (HMMs)

Conditional Random Fields (CRFs)

Hidden Conditional Random Fields (HCRFs)

Page 31: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Hidden Markov Models

Joint probability model of an observation sequence and its relationship with time

$$p(x, y) = \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t)$$

where $A$ denotes the transition probabilities $p(y_t \mid y_{t-1})$ and $B$ the emission probabilities $p(x_t \mid y_t)$

Page 32: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Hidden Markov Models

Marginalizing over y, we obtain the observation sequence likelihood

$$p(x) = \sum_{\mathbf{y}} \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t)$$

which can be used for classification using either the ML or MAP criteria

$$p(\omega_i \mid \mathbf{x}) = \frac{p(\omega_i)\, p(\mathbf{x} \mid \omega_i)}{\sum_{j=1}^{c} p(\omega_j)\, p(\mathbf{x} \mid \omega_j)}$$

Page 33: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

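One model for each word

[Figure: Word 1, Word 2, ..., Word n each have their own HMM, producing the likelihoods $p(\mathbf{x} \mid \omega_1)$, $p(\mathbf{x} \mid \omega_2)$, ..., $p(\mathbf{x} \mid \omega_n)$]

$$\hat{\omega} = \arg\max_{\omega_j \in \boldsymbol{\omega}} p(\mathbf{x} \mid \omega_j)\, p(\omega_j)$$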
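The "one model per word" scheme can be sketched as follows: the forward recursion computes the likelihood $p(\mathbf{x} \mid \omega_j)$ for each word model without enumerating every state sequence, and the MAP rule picks the word with the largest $p(\mathbf{x} \mid \omega_j)\, p(\omega_j)$. All numbers below are made-up toy parameters, the word names are placeholders, and an explicit initial-state distribution is assumed (the slides fold it into $p(y_1 \mid y_0)$).

```python
from itertools import product


def forward_likelihood(x, initial, A, B):
    """p(x) by the forward recursion; A[j][i] = p(y_t=i | y_{t-1}=j), B[i][o] = p(x_t=o | y_t=i)."""
    n = len(initial)
    alpha = [initial[i] * B[i][x[0]] for i in range(n)]
    for x_t in x[1:]:
        alpha = [sum(alpha[j] * A[j][i] for j in range(n)) * B[i][x_t]
                 for i in range(n)]
    return sum(alpha)


def brute_force_likelihood(x, initial, A, B):
    """Same quantity, summing the joint p(x, y) over every state sequence y."""
    total = 0.0
    for y in product(range(len(initial)), repeat=len(x)):
        p = initial[y[0]] * B[y[0]][x[0]]
        for t in range(1, len(x)):
            p *= A[y[t - 1]][y[t]] * B[y[t]][x[t]]
        total += p
    return total


def classify_map(x, models, priors):
    """Return argmax_j p(x | w_j) p(w_j) together with the posteriors p(w_j | x)."""
    scores = {w: forward_likelihood(x, *models[w]) * priors[w] for w in models}
    evidence = sum(scores.values())
    posteriors = {w: s / evidence for w, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Toy models (assumed values): 2 hidden states, 2 observable labels per word.
models = {
    "word_1": ([0.9, 0.1], [[0.8, 0.2], [0.1, 0.9]], [[0.9, 0.1], [0.2, 0.8]]),
    "word_2": ([0.2, 0.8], [[0.5, 0.5], [0.3, 0.7]], [[0.3, 0.7], [0.6, 0.4]]),
}
priors = {"word_1": 0.5, "word_2": 0.5}
x = [0, 0, 1]
best, posteriors = classify_map(x, models, priors)
print(best, posteriors)
```

With equal priors the MAP decision coincides with the ML decision; unequal priors are where the two criteria can differ.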

Page 34: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Hidden Markov Models have found great applications in speech recognition

However, a fundamental paradigm shift recently occurred in this field

Page 35: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Probability distributions governing speech signals could not be modeled accurately, rendering “Bayes decision theory inapplicable under those circumstances” (Juang & Rabiner, 2005)

Page 36: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Page 37: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Conditional Random Fields: generalization of the Markov models

Discriminative models: model $p(\mathbf{y} \mid \mathbf{x})$ without incorporating $p(\mathbf{x})$

Designates a family of MRFs (Markov random fields): each new observation originates a new MRF

Page 38: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

[Figure: relationship between models. Generative: Naïve Bayes, extended to sequences as the HMM and to general graphs as directed models. Discriminative: Logistic Regression, extended to sequences as the linear-chain CRF and to general graphs as the CRF. Infograph based on the tutorial by Sutton, C., McCallum, A., 2007]

Page 39: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

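Conditional Random Fields: generalization of the Markov models

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p)$$

Potential functions, defined over the potential cliques $C_p$:

$$\Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p) = \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk}\, f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$

Partition function:

$$Z(\mathbf{x}) = \sum_{\mathbf{y}} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p)$$

Substituting the potential functions:

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk}\, f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$

where $\lambda_{pk}$ is the parameter vector, which can be optimized using gradient methods, and $f_{pk}$ is the characteristic function vector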

Page 40: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

How do we initialize those models? Reaching HCRFs from an HMM

Start from the HMM joint probability, writing $A$ for the transition term $p(y_t \mid y_{t-1})$ and $B$ for the emission term $p(x_t \mid y_t)$:

$$p(x, y) = \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t) = \prod_{t=1}^{T} A(y_t, y_{t-1})\, B(x_t, y_t)$$

The goal is to rewrite it in the CRF form:

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk}\, f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$

Page 41: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Taking the exponential of the logarithm turns the product into a sum:

$$p(x, y) = \prod_{t=1}^{T} A(y_t, y_{t-1})\, B(x_t, y_t) = \exp\left\{ \ln \prod_{t=1}^{T} A(y_t, y_{t-1})\, B(x_t, y_t) \right\} = \exp\left\{ \sum_{t=1}^{T} \ln A(y_t, y_{t-1}) + \sum_{t=1}^{T} \ln B(x_t, y_t) \right\}$$

Page 42: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

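$$p(x, y) = \exp\left\{ \sum_{t=1}^{T} \ln A(y_t, y_{t-1}) + \sum_{t=1}^{T} \ln B(x_t, y_t) \right\}$$

$\ln A$ is an $n \times n$ matrix and $\ln B$ is an $n \times m$ matrix (shown here for $n = 3$, $m = 2$):

$$\ln A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \qquad \ln B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix}$$

Their entries form the parameter vector $\boldsymbol{\lambda}$, with $k = n \times n + n \times m$ components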

Page 43: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

$$p(x, y) = \exp\left\{ \sum_{t=1}^{T} \ln A(y_t, y_{t-1}) + \sum_{t=1}^{T} \ln B(x_t, y_t) \right\}$$

Parameter vector $\boldsymbol{\lambda}$, with $k = n \times n + n \times m$ components:

a11 a12 a13 a21 a22 a23 a31 a32 a33 b11 b12 b21 b22 b31 b32

Feature functions:

$$f_{edge}(y_t, y_{t-1}, \mathbf{x}; i, j) = 1\{y_t = i\}\, 1\{y_{t-1} = j\}$$

$$f_{node}(y_t, y_{t-1}, \mathbf{x}; i, o) = 1\{y_t = i\}\, 1\{x_t = o\}$$

Feature vector $\boldsymbol{f}$, one feature per parameter and in the same order:

edge features (i, j): (1,1) (1,2) (1,3) (2,1) (2,2) (2,3) (3,1) (3,2) (3,3)

node features (i, o): (1,1) (1,2) (2,1) (2,2) (3,1) (3,2)
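The correspondence between the HMM and the log-linear form can be checked numerically. The sketch below builds the weight vector from the logarithms of toy matrices A and B, counts how often each edge and node feature fires along a path, and compares the result with the plain HMM product. The matrices are invented values, and a fixed start state `Y0` is an assumption made here so that $p(y_1 \mid y_0)$ is defined; neither comes from the slides.

```python
import math

N_STATES, N_SYMBOLS = 3, 2            # n = 3 states, m = 2 symbols, as in the slide's example

# Toy values (assumed). A[i][j] = p(y_t = i | y_{t-1} = j), so each column sums to 1.
A = [[0.7, 0.1, 0.2],
     [0.2, 0.6, 0.3],
     [0.1, 0.3, 0.5]]
# B[i][o] = p(x_t = o | y_t = i), so each row sums to 1.
B = [[0.9, 0.1],
     [0.4, 0.6],
     [0.2, 0.8]]
Y0 = 0                                # assumed fixed start state


def hmm_joint(x, y):
    """p(x, y) as the product of transition and emission probabilities."""
    p, prev = 1.0, Y0
    for x_t, y_t in zip(x, y):
        p *= A[y_t][prev] * B[y_t][x_t]
        prev = y_t
    return p


# lambda: the n*n entries of ln A followed by the n*m entries of ln B.
weights = ([math.log(A[i][j]) for i in range(N_STATES) for j in range(N_STATES)]
           + [math.log(B[i][o]) for i in range(N_STATES) for o in range(N_SYMBOLS)])


def feature_counts(x, y):
    """Sum over t of the edge and node indicator features, in the same order as weights."""
    counts = [0] * len(weights)
    prev = Y0
    for x_t, y_t in zip(x, y):
        counts[y_t * N_STATES + prev] += 1                        # 1{y_t=i} 1{y_{t-1}=j}
        counts[N_STATES * N_STATES + y_t * N_SYMBOLS + x_t] += 1  # 1{y_t=i} 1{x_t=o}
        prev = y_t
    return counts


def log_linear_joint(x, y):
    """exp{ sum_k lambda_k f_k(x, y) }."""
    return math.exp(sum(w * f for w, f in zip(weights, feature_counts(x, y))))


x, y = [0, 1, 1], [0, 2, 1]
print(len(weights), hmm_joint(x, y), log_linear_joint(x, y))
```

Both expressions give the same value up to floating-point error, which is why an HMM's log-probabilities can serve as the starting point for the log-linear model's parameters.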

Page 44: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

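Writing $a_{ij}$ and $b_{ij}$ for the entries of $\ln A$ and $\ln B$:

$$p(x, y) = \exp\left\{ \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\, 1\{y_t = i\}\, 1\{y_{t-1} = j\} + \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{m} b_{ij}\, 1\{y_t = i\}\, 1\{x_t = j\} \right\}$$

Using the feature functions

$$f_{edge}(y_t, y_{t-1}, \mathbf{x}; i, j) = 1\{y_t = i\}\, 1\{y_{t-1} = j\}, \qquad f_{node}(y_t, y_{t-1}, \mathbf{x}; i, o) = 1\{y_t = i\}\, 1\{x_t = o\}$$

this becomes

$$p(x, y) = \exp\left\{ \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\, f_{edge}(y_t, y_{t-1}, \mathbf{x}; i, j) + \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{m} b_{ij}\, f_{node}(y_t, y_{t-1}, \mathbf{x}; i, j) \right\} = \exp\left\{ \sum_{k=1}^{K} \lambda_k\, f_k(\mathbf{x}, \mathbf{y}) \right\}$$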

Page 45: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

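$$p(x, y) = \exp\left\{ \sum_{k=1}^{K} \lambda_k\, f_k(\mathbf{x}, \mathbf{y}) \right\}$$

Normalizing by the partition function $Z(\mathbf{x})$ gives the conditional model:

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\left\{ \sum_{k=1}^{K} \lambda_k\, f_k(\mathbf{x}, \mathbf{y}) \right\}$$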

Page 46: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Drawback: assumes both $\mathbf{x}$ and $\mathbf{y}$ are known

Page 47: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Page 48: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Page 49: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Hidden Conditional Random Fields: generalization of the hidden Markov classifiers

[Figure: parameter space, showing the hidden Markov classifier parameters $\theta_{HMM_c}$ inside the HCRF parameters $\theta_{HCRF}$]

Page 50: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Hidden Conditional Random Fields: generalization of the hidden Markov classifiers

Sequence classification: model $p(\omega \mid \mathbf{x})$ without explicitly modeling $p(\mathbf{x})$

Do not require $\mathbf{y}$ to be known: the sequence of states is now hidden

Page 51: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Hidden Conditional Random Fields: generalization of the hidden Markov classifiers

$$p(\omega \mid \mathbf{x}) = \sum_{\mathbf{y}} p(\mathbf{y}, \omega \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \sum_{\mathbf{y}} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c, \omega_c; \theta_p)$$

$$\Psi_c(\mathbf{x}_c, \mathbf{y}_c, \omega_c; \theta_p) = \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk}\, f_{pk}(\mathbf{x}_c, \mathbf{y}_c, \omega_c) \right\}$$

Page 52: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

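Single model for all words

[Figure: graphical model with observations $x_{t-2}$, $x_{t-1}$, $x_t$, hidden states $y_{t-2}$, $y_{t-1}$, $y_t$ and a single class variable $\omega$]

$$\hat{\omega} = \arg\max_{\omega_j \in \boldsymbol{\omega}} p(\omega_j \mid \mathbf{x})$$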
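A single model over all words can be sketched as one log-linear scorer whose weights depend on the class: the hidden state sequence is summed out with a forward recursion in log space, and the scores are normalized across classes to give $p(\omega \mid \mathbf{x})$. The parameter layout (a class bias plus start, edge and node weights per class) and all values are assumptions chosen for illustration, not the parameterization used by the authors.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def class_log_score(x, params):
    """log of the sum over hidden paths y of exp{score(x, y, w)} for one class w."""
    bias, start, edge, node = params      # edge[j][i]: weight for moving from state j to state i
    n = len(start)
    alpha = [start[i] + node[i][x[0]] for i in range(n)]
    for x_t in x[1:]:
        alpha = [log_sum_exp([alpha[j] + edge[j][i] for j in range(n)]) + node[i][x_t]
                 for i in range(n)]
    return bias + log_sum_exp(alpha)


def hcrf_posterior(x, model):
    """p(w | x): class scores normalized by Z(x), which sums over all classes and paths."""
    log_scores = {w: class_log_score(x, p) for w, p in model.items()}
    log_z = log_sum_exp(list(log_scores.values()))
    return {w: math.exp(s - log_z) for w, s in log_scores.items()}


# Toy weights (assumed): 2 hidden states, 2 observable labels, 2 words.
model = {
    "word_1": (0.0, [0.5, -0.5], [[1.0, -1.0], [-0.5, 0.8]], [[1.2, -0.3], [-0.7, 0.9]]),
    "word_2": (0.1, [-0.2, 0.3], [[0.2, 0.4], [0.6, -0.1]], [[-0.4, 1.0], [0.8, -0.6]]),
}
posterior = hcrf_posterior([0, 0, 1], model)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

Unlike the one-HMM-per-word scheme, the weights here are not constrained to be log-probabilities, and all classes share the same normalizer.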

Page 53: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Experiments and Results

Fingerspelling recognition with SVMs and HCRFs against ANNs and HMMs

Page 54: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

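[Figure: two-layered architecture. The static gesture classifier (ANN or SVM) produces a noisy sequence of letter labels (P, A, T, O). The sequence classifier (HMM or HCRF, with observations $x_{t-2}$, $x_{t-1}$, $x_t$, hidden states $y_{t-2}$, $y_{t-1}$, $y_t$ and class $\omega$) maps that sequence to the word "Pato"]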

Page 55: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Static gestures (hand postures)

Static gesture recognition

Database of 8100 grayscale images

Input instances with 1024 features

27 classes (manual alphabet signs)

Page 56: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Neural Networks: evaluate initialization heuristics (Nguyen-Widrow)

Support Vector Machines: evaluate the heuristic value for sigma based on the inter-quartile range of the norm statistics for the input dataset

Page 57: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Static sign classification

Resilient Backpropagation ANN

Hidden neurons    Kappa
50                0.851
100               0.887
300               0.921
500               0.922
1000              0.924

Gaussian kernel SVM

Sigma             Kappa
0.1               0.000
1                 0.106
10                0.569
100               0.950
392 (heuristic)   0.917
1000              0.863

[Figure: Kappa (0 to 1) and average number of support vectors (0 to 700) as a function of sigma (0.1 to 1000, logarithmic axis); series: grid search, heuristic, support vectors]
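The tables report agreement as Kappa. Assuming this is Cohen's kappa computed from a confusion matrix (the slides do not spell out the formula), it compares the observed accuracy with the accuracy expected by chance from the row and column totals. The confusion matrix below is a made-up two-class example, not data from the experiments.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true class, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(size)) / total
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)   # row total * column total
        for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


toy_confusion = [[45, 5],
                 [10, 40]]
print(cohen_kappa(toy_confusion))  # observed 0.85, chance 0.50, so kappa is about 0.70
```

A kappa of 0 means agreement no better than chance, which is how the 0.000 entry for the smallest sigma should be read.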

Page 58: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Static sign classification

Resilient Backpropagation ANN

Hidden neurons    Kappa
50                0.851
100               0.887
300               0.921
500               0.922
1000              0.925

Gaussian kernel SVM

Sigma             Kappa
0.1               0.000
1                 0.106
10                0.569
100               0.959
392 (heuristic)   0.917
1000              0.863

[Figure: Kappa (0 to 1) and average number of support vectors (0 to 700) as a function of sigma (0.1 to 1000, logarithmic axis); series: grid search, heuristic, support vectors]

Page 59: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

[Figure: hyperparameter surface for Gaussian SVMs. Two panels over Sigma (σ²) and C, both on logarithmic axes: Kappa (0.80 to 1.00) and average number of SVs (0 to 600)]

Page 60: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Static gesture classification

Statistically significant results (p < 0.01)

Points of interest

Polynomial machines have increased sparsity but smaller kappa

Neural networks were faster to evaluate, but not to learn (unless using linear SVMs)

Sigma plays a much more important role than C in Gaussian machines

Heuristics for choosing sigma and C resulted in great performance values

Page 61: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Page 62: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Dynamic gestures (spelled words)

Page 63: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

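[Figure: two-layered architecture with an SVM as the static gesture classifier, producing a noisy sequence of letter labels (P, A, T), and an HCRF as the sequence classifier (observations $x_{t-2}$, $x_{t-1}$, $x_t$, hidden states $y_{t-2}$, $y_{t-1}$, $y_t$, class $\omega$), producing the word "Pato"]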

Page 64: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Dynamic gesture classification

Database containing 540 signed words, with a total of 63,703 static signs

The previous layer labels the entire dataset

Then we tested all possible model combinations

Estimated kappa sampled from 10-fold CV
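The 10-fold cross-validation mentioned above can be sketched as an index split: with 540 signed words, each fold holds out 54 words for validation and trains on the remaining 486. The shuffling and the fixed seed are assumptions made here for reproducibility; the slides do not say how the folds were drawn or whether they were stratified by word.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # assumed: random, unstratified folds
    folds = [indices[i::k] for i in range(k)]     # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


splits = list(k_fold_indices(540, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 486 54
```

The kappa would then be computed once per fold on the held-out words, giving ten samples per model combination.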

Page 65: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Page 66: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Dynamic gesture classification

Labeller   Classification   Algorithm     Training Kappa   Validation Kappa
SVM        HMM              Baum-Welch    0.95             0.82
SVM        HCRF             RProp         0.98             0.83
ANN        HMM              Baum-Welch    0.95             0.80
ANN        HCRF             RProp         0.99             0.82

Page 67: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

SVM+HCRF has shown the best validation result (10-fold CV)

Combinations using HCRF have shown the best results in general

Training results are statistically different

We do not have enough evidence to say that the validation results are not equivalent

Page 68: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Hidden Conditional Random Fields, in this specific problem, had a higher ability to retain knowledge while keeping the same generality

In other words, they achieved less overfitting.

Page 69: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Conclusion and future works

Page 70: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Fingerspelling experiments: SVMs & HCRFs vs ANNs & HMMs

Static Gesture Recognition

Statistically significant results favoring SVMs

Linear SVMs on DDAGs: best compromise between speed, accuracy and ease of use

SVMs have shown easier training, reduced training times

Heuristic initializations work rather well, less parameter tuning

Dynamic Gesture Recognition

Choice of gesture classifier had much more impact

Linear-chain HCRFs: increased knowledge absorption without overfitting

Page 71: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


Future works

Detect standard words rather than fingerspelling (already complete)

Use Structural Support Vector Machines, which are equivalent to HCRFs but are trained using a hinge loss function

Use a mixed language model to categorize full phrases

Page 72: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

References

Page 73: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


BOWDEN, R. et al. A Linguistic Feature Vector for the Visual Interpretation of Sign Language. European Conference on Computer Vision. [S.l.]: Springer-Verlag. 2004. p. 391-401.

BRADSKI, G. R. Computer Vision Face Tracking For Use in a Perceptual User Interface. Intel Technology Journal, n. Q2, 1998. Available at: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.7673>.

DIAS, D. B. et al. Hand movement recognition for Brazilian sign language: a study using distance-based neural networks. Proceedings of the 2009 International Joint Conference on Neural Networks. Atlanta, Georgia, USA: IEEE Press. 2009. p. 2355-2362.

FERREIRA-BRITO, L. Por uma gramática de Línguas de Sinais. 2nd. ed. Rio de Janeiro: Tempo Brasileiro, 2010. 273 p. ISBN 85-282-0069-8.

FERREIRA-BRITO, L.; LANGEVIN, R. The Sublexical Structure of a Sign Language. Mathématiques, Informatique et Sciences Humaines, v. 125, p. 17-40, 1994.

Page 74: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields


FEUERSTACK, S.; COLNAGO, J. H.; SOUZA, C. R. D. Designing and Executing Multimodal Interfaces for the Web based on State Chart XML. Proceedings of 3a. Conferência Web W3C Brasil 2011. Rio de Janeiro: [s.n.]. 2011.

PIZZOLATO, E. B.; ANJO, M. D. S.; PEDROSO, G. C. Automatic recognition of finger spelling for LIBRAS based on a two-layer architecture. Proceedings of the 2010 ACM Symposium on Applied Computing. Sierre, Switzerland: ACM. 2010. p. 969-973.

VIOLA, P.; JONES, M. Robust Real-time Object Detection. International Journal of Computer Vision. [S.l.]: [s.n.]. 2001.

VAPNIK, V. N. The nature of statistical learning theory. New York, NY, USA: Springer-Verlag New York, Inc., 1995. ISBN 0-387-94559-8.

VAPNIK, V. N. Statistical learning theory. [S.l.]: Wiley, 1998. ISBN 0471030031.

YANG, R.; SARKAR, S. Detecting Coarticulation in Sign Language using Conditional Random Fields. Pattern Recognition, 2006. ICPR 2006. 18th International Conference on. [S.l.]: [s.n.]. 2006. p. 108-112.

Page 75: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Acknowledgements

Guilherme Cartacho

Page 76: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Fin!

Page 77: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Appendix A

A Framework for Research Support

Page 78: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Accord.NET Framework

Machine Learning and Artificial Intelligence

Computer Vision / Audition

Mathematics and Statistics

Page 81: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields
Page 82: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Builds upon well-established foundations

Page 83: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

83

It has been used toRecognize gestures using Wii

Page 84: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

It has been used to study and evaluate performance in 3D gesture recognition

Page 85: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

It has been used to predict attacks in computer networks

Page 86: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

It has been used to compare touch and in-air gestures using Kinect

Page 87: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

It has been used to provide sensor information in multimodal interfaces

Page 88: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Guido Soetens, Estimating the limitations of single-handed multi-touch input. Master Thesis, Utrecht University. September, 2012.

K. N. Pushpalatha, A. K. Gautham, D. R. Shashikumar, K. B. ShivaKumar. Iris Recognition System with Frequency Domain Features optimized with PCA and SVM Classifier, IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 5, No 1, September 2012.

Arnaud Ogier, Thierry Dorval. HCS-Analyzer: Open source software for High-Content Screening data correction and analysis. Bioinformatics. First published online May 13, 2012.

It has been used in an increasing number of publications

Page 89: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Ludovico Buffon, Evelina Lamma, Fabrizio Riguzzi, and Davide Forment. Un sistema di vision inspection basato su reti neurali. In Popularize Artificial Intelligence. Proceedings of the AI*IA Workshop and Prize for Celebrating 100th Anniversary of Alan Turing's Birth (PAI 2012), Rome, Italy, June 15, 2012, number 860 in CEUR Workshop Proceedings, pages 1-6, Aachen, Germany, 2012.

Liam Williams, Spotting The Wisdom In The Crowds. Master Thesis on Joint Mathematics and Computer Science. Imperial College London, Department of Computing. June, 2012.

Alosefer, Y.; Rana, O. F. "Predicting client-side attacks via behaviour analysis using honeypot data," 2011 7th International Conference on Next Generation Web Services Practices (NWeSP), pp. 31-36, 19-21 Oct. 2011.

It has been used in an increasing number of publications

Page 90: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Brummitt, L. Scrabble Referee: Word Recognition Component, 2011. Final project report. University of Sheffield, Sheffield, England.

Cani, V., 2011. Image Stitching for UAV remote sensing application. Master Degree Thesis. Computer Engineering, School of Castelldefels of Universitat Politècnica de Catalunya. Barcelona, Spain.

Hassani, A. Z.; "Touch versus in-air Hand Gestures: Evaluating the acceptance by seniors of Human-Robot Interaction using Microsoft Kinect," Master Thesis, University of Twente, Enschede, Netherlands, 2011.

Kaplan, K., 2011. ADES: Automatic Driver Evaluation System. PhD Thesis, Boğaziçi University, Istanbul, Turkey.

It has been used in an increasing number of publications

Page 91: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

Wright, M., Lin, C.-J., O'Neill, E., Cosker, D. and Johnson, P., 2011. 3D Gesture recognition: An evaluation of user and system performance. In: Pervasive Computing - 9th International Conference, Pervasive 2011, Proceedings. Heidelberg: Springer Verlag, pp. 294-313.

Lourenço, J., 2010. Wii3D: Extending the Nintendo Wii Remote into 3D. Final course project report, Rhodes University, Grahamstown. 110p.

Mendelssohn, T.; 2010. Gestureboard - Entwicklung eines Wiimote-basierten, gestengesteuerten, Whiteboard-Systems für den Bildungsbereich. Final project report. Hochschule Furtwangen University, Furtwangen im Schwarzwald, Germany.

It has been used in an increasing number of publications

Page 92: Fingerspelling Recognition  with  Support Vector Machines  and  Hidden Conditional Random Fields

http://accord.googlecode.com