Fingerspelling Recognition with Support Vector Machines and Hidden Conditional Random Fields


- A comparison with Neural Networks and Hidden Markov Models -

César R. de Souza, Ednaldo B. Pizzolato and Mauro dos Santos Anjo

Universidade Federal de São Carlos (Federal University of São Carlos)

IBERAMIA 2012

Cartagena de Indias, Colombia, 2012

Introduction: Context, Motivation, Objectives and the Organization of this Presentation

Multidisciplinary: Computing and Linguistics

Ethnologue lists about 130 sign languages in the world (LEWIS, 2009)

Context, Motivation, Objectives, Agenda

Two fronts

Social: Aim to improve quality of life for the deaf and increase social inclusion

Scientific: Investigation of the distinct interaction methods, computational models and their respective challenges

This paper: Investigate the behavior and applicability of SVMs and HCRFs in the recognition of specific signs from the Brazilian Sign Language (LIBRAS)

Long term: Move towards the creation of a full-fledged recognition system for LIBRAS

This work represents a small but important step towards this goal

Agenda

Introduction
Libras: Brazilian Sign Language
Review: Literature
Methods and Tools: Support Vector Machines, Conditional Random Fields
Experiments
Results
Conclusion

The Brazilian Sign Language (LIBRAS): Structures and the manual alphabet

Introduction Libras Review Methods Experiments Results Conclusion

Natural language
Not mime
Not universal
It is not only “a problem of the deaf or a language pathology” (QUADROS & KARNOPP, 2004)

LIBRAS, Difficulties, Grammar

Highly context-sensitive: the same sign may have distinct meanings

Interpretation is hard, even for humans

Fingerspelling is only part of the grammar: it is needed when explicitly spelling the name of a person or a location

Subset of the full-language recognition problem

Literature

Layered architectures are common: static gestures vs. dynamic gestures

One of the best works on LIBRAS handles only the movement aspect of the language (Dias et al.)

Few studies explore SVMs, but many use Neural Networks

No studies on HCRFs and LIBRAS

Example: Recognition of a fingerspelled word using a two-layered architecture

Figure: a static gesture classifier (ANN) labels each frame (P P P A A A T T T O O), and a sequence classifier (HMM) maps the label sequence to the word "Pato".

YANG, SCLAROFF and LEE, 2009: Multiple layers, SVMs

Elmezain, 2011: HCRF, in-air drawing recognition

Models and Tools: Overview of the chosen techniques and reasons for their choice

Static Gesture Recognition

Neural Networks and Support Vector Machines for the detection of static signs


Find $f$ such that $f(x)$ maps each hand image $x$ to its sign label (a, b, c, …)

Biologically inspired: McCulloch & Pitts, Rosenblatt, Rumelhart

Neural Networks
Support Vector Machines: Maximum Margin, Multiple Classes

Perceptron: hyperplane decision, linearly separable problems

Layered architecture: universal approximator

Learning is an ill-posed problem: multiple local minima, ill-conditioning

Support Vector Machines

Strong theoretical basis: Statistical Learning Theory, Structural Risk Minimization (SRM)

Large-margin classifiers

Risk minimization through margin maximization
Capacity control through margin control
Sparse solutions considering only a few support vectors

Problem: binary-only classifier. How to generalize to multiple classes?

Classical approaches: One-against-all, One-against-one, Discriminant functions

Drawbacks: Only works when equiprobable; Evaluation of c(c-1)/2 machines; Non-guaranteed optimum results

With 27 static gestures, one-against-one would result in 27 × 26 / 2 = 351 SVM evaluations each time a new classification is required!

Directed Acyclic Graphs: generalization of decision trees, allowing for non-directed cycles

Require at most c-1 evaluations

So, for 27 static gestures, only 26 SVM evaluations are required: only 7.4% of the original effort (26 / 351)

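Elimination process: one class is eliminated at a time

Figure: decision DAG for four classes (A, B, C, D). The root node evaluates A x D with candidates ABCD. The second level evaluates B x D (candidates BCD, reached when A lost) and A x C (candidates ABC, reached when D lost). The third level evaluates C x D (candidates CD), B x C (candidates BC) and A x B (candidates AB). The leaves are the final decisions D, C, B and A.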
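The elimination process can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `ddag_classify`, `nearest_prototype_duel` and the 27 placeholder labels are hypothetical names, and the toy "duel" (closest 1-D prototype wins) merely stands in for a trained binary SVM. The sketch only shows that each pairwise evaluation discards one candidate, so c classes need c-1 evaluations.

```python
from itertools import combinations


def ddag_classify(x, classes, duel):
    """Classify x with a decision DAG over pairwise (one-against-one) machines.

    duel(x, a, b) must return the winner, either a or b. Each evaluation
    eliminates one candidate, so len(classes) - 1 duels are evaluated.
    """
    candidates = list(classes)
    evaluations = 0
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        winner = duel(x, first, last)
        evaluations += 1
        if winner == first:
            candidates.pop()       # the last candidate lost
        else:
            candidates.pop(0)      # the first candidate lost
    return candidates[0], evaluations


def nearest_prototype_duel(prototypes):
    """Toy stand-in for a trained binary SVM: the closer 1-D prototype wins."""
    def duel(x, a, b):
        return a if abs(x - prototypes[a]) <= abs(x - prototypes[b]) else b
    return duel


if __name__ == "__main__":
    # 27 hypothetical class labels, mirroring the 27 static gestures.
    classes = [chr(ord("A") + i) for i in range(26)] + ["_"]
    prototypes = {c: float(i) for i, c in enumerate(classes)}
    label, used = ddag_classify(3.2, classes, nearest_prototype_duel(prototypes))
    pairs = len(list(combinations(classes, 2)))
    print(label, used, pairs, round(100 * used / pairs, 1))
```

All 351 pairwise machines still have to be trained; the saving applies only to the number of machines evaluated per classification.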

However, no matter the model, we’ll have (extreme) noise due to pose transitions

How can we cope with that?

Dynamic Gesture Recognition

Hidden Markov Models, Conditional Random Fields and Hidden Conditional Random Fields for dynamic gesture recognition.

Find $f$ such that, given extremely noisy sequences of labels, $f(x)$ estimates the word being signed.

Example: the noisy label sequences "blyrei", "hil" and "hmeylrlwo" should map to the words "bye", "hi" and "hello".

Hidden Markov Models (HMMs)

Conditional Random Fields (CRFs)

Hidden Conditional Random Fields (HCRFs)


Hidden Markov Models

Joint probability model of an observation sequence and its relationship with time

$$p(\mathbf{x}, \mathbf{y}) = \prod_{t=1}^{T} \underbrace{p(y_t \mid y_{t-1})}_{A} \, \underbrace{p(x_t \mid y_t)}_{B}$$

Hidden Markov Models

Marginalizing over $\mathbf{y}$, we obtain the observation sequence likelihood

$$p(\mathbf{x}) = \sum_{\mathbf{y}} \prod_{t=1}^{T} p(y_t \mid y_{t-1}) \, p(x_t \mid y_t)$$

which can be used for classification using either the ML or MAP criteria

$$p(\omega_i \mid \mathbf{x}) = \frac{p(\omega_i) \, p(\mathbf{x} \mid \omega_i)}{\sum_{j=1}^{c} p(\omega_j) \, p(\mathbf{x} \mid \omega_j)}$$
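The marginalization over $\mathbf{y}$ does not have to enumerate every state path. A minimal Python sketch of the standard forward recursion is given below, checked against the brute-force sum. The function names, the explicit initial distribution `pi` (standing in for $p(y_1 \mid y_0)$) and the toy matrices are assumptions made for illustration; they are not taken from the presentation.

```python
from itertools import product


def forward_likelihood(obs, pi, A, B):
    """p(x) for a discrete HMM via the forward recursion, O(T * n^2).

    pi[i]   = p(y_1 = i)                 (initial distribution, assumed here)
    A[j][i] = p(y_t = i | y_{t-1} = j)   (transition matrix)
    B[i][o] = p(x_t = o | y_t = i)       (emission matrix)
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * A[j][i] for j in range(n)) * B[i][o]
                 for i in range(n)]
    return sum(alpha)


def brute_force_likelihood(obs, pi, A, B):
    """Same quantity, summing the joint p(x, y) over every state path y."""
    n, total = len(pi), 0.0
    for path in product(range(n), repeat=len(obs)):
        p = pi[path[0]] * B[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        total += p
    return total


if __name__ == "__main__":
    # Toy two-state, two-symbol model with made-up probabilities.
    pi = [0.5, 0.5]
    A = [[0.7, 0.3], [0.4, 0.6]]
    B = [[0.9, 0.1], [0.2, 0.8]]
    obs = [0, 1, 1]
    print(forward_likelihood(obs, pi, A, B))
    print(brute_force_likelihood(obs, pi, A, B))
```

For long sequences the products underflow, so a practical version would work with scaled or log-domain quantities.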

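One model for each word

Figure: one HMM per word (Word 1, Word 2, ..., Word n), each producing a likelihood $p(\mathbf{x} \mid \omega_1), p(\mathbf{x} \mid \omega_2), \dots, p(\mathbf{x} \mid \omega_n)$. The decision rule is

$$\hat{\omega} = \arg\max_{\omega_j \in \boldsymbol{\omega}} \; p(\mathbf{x} \mid \omega_j) \, p(\omega_j)$$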

Hidden Markov Models have found wide application in speech recognition

However, a fundamental paradigm shift recently occurred in this field

Probability distributions governing speech signals could not be modeled accurately, rendering “Bayes decision theory inapplicable under those circumstances” (Juang & Rabiner, 2005)

Conditional Random Fields: generalization of the Markov models

Discriminative models: model $p(\mathbf{y} \mid \mathbf{x})$ without incorporating $p(\mathbf{x})$

Designates a family of MRFs: each new observation originates a new MRF

Figure: relationship between the models (infograph based on the tutorial by Sutton, C., McCallum, A., 2007)

Generative: Naïve Bayes → HMM (sequence) → Directed models (graphs)
Discriminative: Logistic Regression → Linear-chain CRF (sequence) → CRF (graphs)

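Conditional Random Fields: generalization of the Markov models

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p)$$

Potential functions, defined over the potential cliques $C_p$:

$$\Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p) = \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$

Partition function:

$$Z(\mathbf{x}) = \sum_{\mathbf{y}} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c; \theta_p)$$

Combining the two:

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$

where $\lambda_{pk}$ is the parameter vector, which can be optimized using gradient methods, and $f_{pk}$ is the characteristic function vector.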

Conditional Random Fields: generalization of the Markov models

How do we initialize those models? Reaching HCRFs from an HMM

Start from the HMM joint probability, writing the transition probabilities as $A$ and the emission probabilities as $B$:

$$p(\mathbf{x}, \mathbf{y}) = \prod_{t=1}^{T} p(y_t \mid y_{t-1}) \, p(x_t \mid y_t) = \prod_{t=1}^{T} A(y_t, y_{t-1}) \, B(x_t, y_t)$$

and compare it with the CRF form:

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c) \right\}$$

Taking the logarithm of the HMM joint probability and exponentiating it back turns the product into a sum inside an exponential:

$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \ln \prod_{t=1}^{T} A(y_t, y_{t-1}) \, B(x_t, y_t) \right\} = \exp\left\{ \sum_{t=1}^{T} \ln A(y_t, y_{t-1}) + \sum_{t=1}^{T} \ln B(x_t, y_t) \right\}$$

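$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{t=1}^{T} \ln A(y_t, y_{t-1}) + \sum_{t=1}^{T} \ln B(x_t, y_t) \right\}$$

The entries of $\ln A$ ($n \times n$) and $\ln B$ ($n \times m$) become the parameter vector $\boldsymbol{\lambda}$. In the slide's notation, $a_{ij}$ and $b_{ij}$ denote entries of $\ln A$ and $\ln B$. For $n = 3$ states and $m = 2$ observation symbols:

$$\ln A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \qquad \ln B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}$$

$$\boldsymbol{\lambda}: \; k = n \times n + n \times m$$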

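$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{t=1}^{T} \ln A(y_t, y_{t-1}) + \sum_{t=1}^{T} \ln B(x_t, y_t) \right\}$$

The parameter vector lists the entries of $\ln A$ followed by the entries of $\ln B$:

$$\boldsymbol{\lambda} = (a_{11}, a_{12}, a_{13}, a_{21}, a_{22}, a_{23}, a_{31}, a_{32}, a_{33}, b_{11}, b_{12}, b_{21}, b_{22}, b_{31}, b_{32}), \qquad k = n \times n + n \times m$$

Each parameter is paired with one indicator feature function:

$$f^{edge}(y_t, y_{t-1}, \mathbf{x}; i, j) = \mathbf{1}\{y_t = i\} \, \mathbf{1}\{y_{t-1} = j\}$$

$$f^{node}(y_t, y_{t-1}, \mathbf{x}; i, o) = \mathbf{1}\{y_t = i\} \, \mathbf{1}\{x_t = o\}$$

$\boldsymbol{f}$: edge features for (i, j) = (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3), followed by node features for (i, o) = (1,1), (1,2), (2,1), (2,2), (3,1), (3,2)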

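Writing the logarithms as sums over indicator functions, with $a_{ij}$ and $b_{io}$ the entries of $\ln A$ and $\ln B$:

$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} \, \mathbf{1}\{y_t = i\} \, \mathbf{1}\{y_{t-1} = j\} + \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{o=1}^{m} b_{io} \, \mathbf{1}\{y_t = i\} \, \mathbf{1}\{x_t = o\} \right\}$$

Using the feature functions $f^{edge}(y_t, y_{t-1}, \mathbf{x}; i, j) = \mathbf{1}\{y_t = i\} \, \mathbf{1}\{y_{t-1} = j\}$ and $f^{node}(y_t, y_{t-1}, \mathbf{x}; i, o) = \mathbf{1}\{y_t = i\} \, \mathbf{1}\{x_t = o\}$:

$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} \, f^{edge}(y_t, y_{t-1}, \mathbf{x}; i, j) + \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{o=1}^{m} b_{io} \, f^{node}(y_t, y_{t-1}, \mathbf{x}; i, o) \right\}$$

$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{k=1}^{K} \lambda_k f_k(\mathbf{x}, \mathbf{y}) \right\}$$

where each $f_k(\mathbf{x}, \mathbf{y})$ is one indicator feature summed over all positions $t$.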

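The HMM joint probability is therefore an unnormalized CRF potential:

$$p(\mathbf{x}, \mathbf{y}) = \exp\left\{ \sum_{k=1}^{K} \lambda_k f_k(\mathbf{x}, \mathbf{y}) \right\}$$

Normalizing over all label sequences gives the linear-chain CRF:

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\left\{ \sum_{k=1}^{K} \lambda_k f_k(\mathbf{x}, \mathbf{y}) \right\}$$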
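The HMM-to-CRF initialization can be checked numerically. The Python sketch below builds the parameter vector from $\ln A$ and $\ln B$, counts the edge and node indicator features along a path, and confirms that $\exp\{\sum_k \lambda_k f_k(\mathbf{x}, \mathbf{y})\}$ reproduces the HMM joint probability. It is an illustration under stated assumptions: the function names, the fixed start state `y0` (standing in for $y_0$) and the toy matrices are hypothetical, and all probabilities must be strictly positive so that the logarithms exist.

```python
import math


def hmm_joint(x, y, A, B, y0=0):
    """HMM joint p(x, y) = prod_t A[prev][y_t] * B[y_t][x_t].

    A[j][i] = p(y_t = i | y_{t-1} = j), B[i][o] = p(x_t = o | y_t = i).
    y0 is an assumed, fixed start state standing in for y_0.
    """
    p, prev = 1.0, y0
    for xt, yt in zip(x, y):
        p *= A[prev][yt] * B[yt][xt]
        prev = yt
    return p


def hmm_to_crf_weights(A, B):
    """lambda: n*n edge weights (ln A) followed by n*m node weights (ln B)."""
    n, m = len(A), len(B[0])
    edge = [math.log(A[j][i]) for i in range(n) for j in range(n)]  # a_ij
    node = [math.log(B[i][o]) for i in range(n) for o in range(m)]  # b_io
    return edge + node


def feature_counts(x, y, n, m, y0=0):
    """f_k(x, y): each indicator feature summed over all positions t."""
    f = [0] * (n * n + n * m)
    prev = y0
    for xt, yt in zip(x, y):
        f[yt * n + prev] += 1          # f_edge with i = y_t, j = y_{t-1}
        f[n * n + yt * m + xt] += 1    # f_node with i = y_t, o = x_t
        prev = yt
    return f


def crf_score(x, y, lam, n, m, y0=0):
    """exp{ sum_k lambda_k * f_k(x, y) }."""
    f = feature_counts(x, y, n, m, y0)
    return math.exp(sum(w * c for w, c in zip(lam, f)))


if __name__ == "__main__":
    # Toy model with n = 3 states and m = 2 symbols (made-up probabilities).
    A = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]]
    B = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
    lam = hmm_to_crf_weights(A, B)
    x, y = [0, 1, 1, 0], [0, 1, 2, 2]
    print(len(lam), hmm_joint(x, y, A, B), crf_score(x, y, lam, 3, 2))
```

Once initialized this way, the weights are free to move away from log-probabilities during gradient training, which is where the conditional model gains flexibility over the HMM.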

Drawback: assumes both $\mathbf{x}$ and $\mathbf{y}$ are known

Hidden Conditional Random Fields: generalization of the hidden Markov classifiers

Figure: parameter space, in which the hidden Markov classifier parameters $\theta_{HMMc}$ form a subset of the HCRF parameters $\theta_{HCRF}$

Sequence classification: model $p(\omega \mid \mathbf{x})$ without explicitly modeling $p(\mathbf{x})$

Does not require $\mathbf{y}$ to be known: the sequence of states is now hidden

Hidden Conditional Random Fields: generalization of the hidden Markov classifiers

$$p(\omega \mid \mathbf{x}) = \sum_{\mathbf{y}} p(\mathbf{y}, \omega \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \sum_{\mathbf{y}} \prod_{C_p \in \mathcal{C}} \prod_{\Psi_c \in C_p} \Psi_c(\mathbf{x}_c, \mathbf{y}_c, \omega_c; \theta_p)$$

$$\Psi_c(\mathbf{x}_c, \mathbf{y}_c, \omega_c; \theta_p) = \exp\left\{ \sum_{k=1}^{K(p)} \lambda_{pk} f_{pk}(\mathbf{x}_c, \mathbf{y}_c, \omega_c) \right\}$$

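Single model for all words

Figure: linear-chain HCRF graph with observations $x_{t-2}, x_{t-1}, x_t$, hidden states $y_{t-2}, y_{t-1}, y_t$ and a single class node $\omega$

$$\hat{\omega} = \arg\max_{\omega_j \in \boldsymbol{\omega}} \; p(\omega_j \mid \mathbf{x})$$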
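To make the class posterior concrete, the following Python sketch evaluates $p(\omega \mid \mathbf{x})$ for a toy linear-chain HCRF by summing the exponentiated score over every hidden path and normalizing across classes. Everything in it is an assumption for illustration: the function name, the layout of the weights as per-class `(bias, edge, node)` tuples, and the two example words are hypothetical, and the exhaustive enumeration is only viable for very short sequences (a real implementation would use a forward-style recursion).

```python
import math
from itertools import product


def hcrf_posterior(x, models):
    """p(w | x) for a toy linear-chain HCRF, enumerating hidden paths y.

    models maps each class label w to (bias, edge, node), where
    edge[j][i] weights the transition j -> i and node[i][o] weights
    hidden state i paired with observed symbol o (illustrative layout).
    """
    scores = {}
    for w, (bias, edge, node) in models.items():
        n, total = len(node), 0.0
        for y in product(range(n), repeat=len(x)):
            s = bias + node[y[0]][x[0]]
            for t in range(1, len(x)):
                s += edge[y[t - 1]][y[t]] + node[y[t]][x[t]]
            total += math.exp(s)
        scores[w] = total                     # sum over y of exp{score}
    z = sum(scores.values())                  # partition function Z(x)
    return {w: s / z for w, s in scores.items()}


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    models = {
        "hi": (0.0, zeros, [[1.0, -1.0], [1.0, -1.0]]),    # favours symbol 0
        "bye": (0.0, zeros, [[-1.0, 1.0], [-1.0, 1.0]]),   # favours symbol 1
    }
    posterior = hcrf_posterior([0, 0, 0], models)
    print(max(posterior, key=posterior.get), posterior)
```

All classes share one partition function, which is what distinguishes this single discriminative model from a bank of independently trained per-word HMMs.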

Experiments and Results: Fingerspelling recognition with SVMs and HCRFs against ANNs and HMMs

Figure: the two-layered architectures under comparison. A static gesture classifier (ANN or SVM) labels each frame (e.g. P P P A A A T T T O O), and a sequence classifier (HMM or HCRF) maps the label sequence to a word (e.g. "Pato").

Static gesture recognition
Database of 8100 grayscale images
Input instances with 1024 features
27 classes (manual alphabet signs)

Static gestures (hand postures)
Dynamic gestures (spelled words)

Neural Networks: evaluate initialization heuristics (Nguyen-Widrow)

Support Vector Machines: evaluate the heuristic value for $\sigma^2$ based on the interquartile range of the norm statistics for the input dataset

Static sign classification

Resilient Backpropagation ANN

Hidden Neurons    Kappa
50                0.851
100               0.887
300               0.921
500               0.922
1000              0.924

Gaussian kernel SVM

Sigma             Kappa
0.1               0.000
1                 0.106
10                0.569
100               0.950
392 (heuristic)   0.917
1000              0.863

Figure: kappa and average number of support vectors as a function of sigma (legend: grid search, heuristic, support vectors).

Static sign classification

Resilient Backpropagation ANN

Hidden Neurons    Kappa
50                0.851
100               0.887
300               0.921
500               0.922
1000              0.925

Gaussian kernel SVM

Sigma             Kappa
0.1               0.000
1                 0.106
10                0.569
100               0.959
392 (heuristic)   0.917
1000              0.863

Figure: kappa and average number of support vectors as a function of sigma (legend: grid search, heuristic, support vectors).
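The tables report agreement as kappa. Assuming this is Cohen's kappa (the slides do not spell out the definition), it can be computed from true and predicted labels as $(p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance. The function name and the toy labels in this Python sketch are illustrative only.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the fraction of matching labels; p_e is the chance agreement
    computed from the marginal label frequencies. Undefined (division by
    zero) when both label sequences contain one and the same single class.
    """
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    y_true = ["a", "a", "b", "b"]
    print(cohen_kappa(y_true, ["a", "a", "b", "a"]))   # partial agreement
    print(cohen_kappa(y_true, ["a", "a", "a", "a"]))   # constant prediction
```

A classifier that always predicts the same class scores a kappa of zero even when its raw accuracy is not zero, which is one way a machine can end up with a kappa of 0.000.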

Figure: Hyperparameter surface for Gaussian SVMs. Two panels plot kappa and the average number of support vectors (SVs) against Sigma (σ²) and C.

Static gesture classification: statistically significant results (p < 0.01)

Points of interest
Polynomial machines have increased sparsity but smaller kappa
Neural networks were faster to evaluate, but not to learn, unless using linear SVMs
Sigma plays a much more important role than C in Gaussian machines
Heuristics for choosing sigma and C resulted in great performance values

Figure: the SVM + HCRF combination. An SVM static gesture classifier labels each frame, and a linear-chain HCRF sequence classifier (observations $x_t$, hidden states $y_t$, class node $\omega$) outputs the word, e.g. "Pato".

Dynamic gesture classification
Database containing 540 signed words, with a total of 63,703 static signs

The previous layer labels the entire dataset
Then we tested all possible model combinations
Kappa estimated from 10-fold cross-validation (CV)
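A 10-fold split over the 540 signed words can be sketched as follows. This Python snippet is a plain illustration of k-fold index generation: the function name is hypothetical, and it uses contiguous folds without shuffling or stratification, which the actual experiment may well have handled differently.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, validation) index lists for k contiguous folds.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


if __name__ == "__main__":
    for fold, (train, validation) in enumerate(k_fold_indices(540, k=10)):
        print(fold, len(train), len(validation))
```

Each word is used for validation exactly once, and the reported kappa would then be summarized over the ten validation folds.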


Dynamic gesture classification

Labeller   Classification   Algorithm    Training Kappa   Validation Kappa
SVM        HMM              Baum-Welch   0.95             0.82
SVM        HCRF             RProp        0.98             0.83
ANN        HMM              Baum-Welch   0.95             0.80
ANN        HCRF             RProp        0.99             0.82

SVM+HCRF showed the best validation result (10-fold CV)
Combinations using HCRF showed the best results in general

Training results are statistically different
We do not have enough evidence to say the validation results are not equivalent

Hidden Conditional Random Fields, in this specific problem, had a higher ability to retain knowledge while keeping the same generality

In other words, they achieved less overfitting.

Conclusion and future works

Fingerspelling experiments: SVMs & HCRFs vs ANNs & HMMs

Static Gesture Recognition
Statistically significant results favoring SVMs
Linear SVMs on DDAGs: best compromise between speed, accuracy and ease of use
SVMs have shown easier training and reduced training times
Heuristic initializations work rather well, requiring less parameter tuning

Dynamic Gesture Recognition
Choice of gesture classifier had much more impact
Linear-chain HCRFs: increased knowledge absorption without overfitting

Future works

Detect standard words rather than fingerspelling (already complete)

Use Structural Support Vector Machines, which are equivalent to HCRFs but are trained using a hinge loss function

Use a mixed language model to categorize full phrases

References


BOWDEN, R. et al. A Linguistic Feature Vector for the Visual Interpretation of Sign Language. European Conference on Computer Vision. [S.l.]: Springer-Verlag. 2004. p. 391-401.

BRADSKI, G. R. Computer Vision Face Tracking For Use in a Perceptual User Interface. Intel Technology Journal, n. Q2, 1998. Available at: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.7673>.

DIAS, D. B. et al. Hand movement recognition for brazilian sign language: a study using distance-based neural networks. Proceedings of the 2009 international joint conference on Neural Networks. Atlanta, Georgia, USA: IEEE Press. 2009. p. 2355-2362.

FERREIRA-BRITO, L. Por uma gramática de Línguas de Sinais. 2nd. ed. Rio de Janeiro: Tempo Brasileiro, 2010. 273 p. ISBN 85-282-0069-8.

FERREIRA-BRITO, L.; LANGEVIN, R. The Sublexical Structure of a Sign Language. Mathématiques, Informatique et Sciences Humaines, v. 125, p. 17-40, 1994.


FEUERSTACK, S.; COLNAGO, J. H.; SOUZA, C. R. D. Designing and Executing Multimodal Interfaces for the Web based on State Chart XML. Proceedings of 3a. Conferência Web W3C Brasil 2011. Rio de Janeiro: [s.n.]. 2011.

PIZZOLATO, E. B.; ANJO, M. D. S.; PEDROSO, G. C. Automatic recognition of finger spelling for LIBRAS based on a two-layer architecture. Proceedings of the 2010 ACM Symposium on Applied Computing. Sierre, Switzerland: ACM. 2010. p. 969-973.

VIOLA, P.; JONES, M. Robust Real-time Object Detection. International Journal of Computer Vision. [S.l.]: [s.n.]. 2001.

VAPNIK, V. N. The nature of statistical learning theory. New York, NY, USA: Springer-Verlag New York, Inc., 1995. ISBN 0-387-94559-8.

VAPNIK, V. N. Statistical learning theory. [S.l.]: Wiley, 1998. ISBN 0471030031.

YANG, R.; SARKAR, S. Detecting Coarticulation in Sign Language using Conditional Random Fields. Pattern Recognition, 2006. ICPR 2006. 18th International Conference on. [S.l.]: [s.n.]. 2006. p. 108-112.

Acknowledgements

Guilherme Cartacho

Fin!

Appendix A: A Framework for Research Support

Accord.NET Framework
Machine Learning and Artificial Intelligence
Computer Vision / Audition
Mathematics and Statistics

Builds upon well-established foundations

It has been used to:
Recognize gestures using the Wii
Study and evaluate performance in 3D gesture recognition
Predict attacks in computer networks
Compare touch and in-air gestures using Kinect
Provide sensor information in multimodal interfaces

It has been used in an increasing number of publications

Guido Soetens, Estimating the limitations of single-handed multi-touch input. Master Thesis, Utrecht University. September, 2012.

K. N. Pushpalatha, A. K. Gautham, D. R. Shashikumar, K. B. ShivaKumar. Iris Recognition System with Frequency Domain Features optimized with PCA and SVM Classifier, IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 5, No 1, September 2012.

Arnaud Ogier, Thierry Dorval. HCS-Analyzer: Open source software for High-Content Screening data correction and analysis. Bioinformatics. First published online May 13, 2012.

Ludovico Buffon, Evelina Lamma, Fabrizio Riguzzi, and Davide Forment. Un sistema di vision inspection basato su reti neurali. In Popularize Artificial Intelligence. Proceedings of the AI*IA Workshop and Prize for Celebrating 100th Anniversary of Alan Turing's Birth (PAI 2012), Rome, Italy, June 15, 2012, number 860 in CEUR Workshop Proceedings, pages 1-6, Aachen, Germany, 2012.

Liam Williams, Spotting The Wisdom In The Crowds. Master Thesis on Joint Mathematics and Computer Science. Imperial College London, Department of Computing. June, 2012.

Alosefer, Y.; Rana, O.F.; "Predicting client-side attacks via behaviour analysis using honeypot data," Next Generation Web Services Practices (NWeSP), 2011 7th International Conference on , vol., no., pp.31-36, 19-21 Oct. 2011


Brummitt, L. Scrabble Referee: Word Recognition Component, 2011. Final project report. University of Sheffield, Sheffield, England.

Cani, V., 2011. Image Stitching for UAV remote sensing application. Master Degree Thesis. Computer Engineering, School of Castelldefels of Universitat Politècnica de Catalunya. Barcelona, Spain.

Hassani, A. Z.; "Touch versus in-air Hand Gestures: Evaluating the acceptance by seniors of Human-Robot Interaction using Microsoft Kinect," Master Thesis, University of Twente, Enschede, Netherlands, 2011.

Kaplan, K., 2011. ADES: Automatic Driver Evaluation System. PhD Thesis, Boğaziçi University, Istanbul, Turkey.


Wright, M., Lin, C.-J., O'Neill, E., Cosker, D. and Johnson, P., 2011. 3D Gesture recognition: An evaluation of user and system performance. In: Pervasive Computing - 9th International Conference, Pervasive 2011, Proceedings. Heidelberg: Springer Verlag, pp. 294-313.

Lourenço, J., 2010. Wii3D: Extending the Nintendo Wii Remote into 3D. Final course project report, Rhodes University, Grahamstown. 110p.

Mendelssohn, T.; 2010. Gestureboard - Entwicklung eines Wiimote-basierten, gestengesteuerten, Whiteboard-Systems für den Bildungsbereich. Final project report. Hochschule Furtwangen University, Furtwangen im Schwarzwald, Germany.


http://accord.googlecode.com