Lexical Stress in Speech Recognition

Master’s thesis presentation

Rogier van Dalen

Delft University of Technology: Man–Machine Interaction

13th May 2005


Topics

• Objective
• What is lexical stress?
• Properties of lexical stress
• Model
• System
• Results

Objective

Can lexical stress be used in a speech recogniser to make it perform better?

• Find properties of lexical stress
• Model speech recogniser
• Implement speech recogniser
• Test speech recogniser

Garden-variety speech recognition

Input modelled as a concatenation of phonemes

What is lexical stress?

/ho:rIkda:r@nka:nOn/

Hoor ik daar een kanon? (Dutch: ‘Do I hear a kanon there?’)

kanón ‘gun’ or kánon ‘song’?

Use of lexical stress

• Minimal pairs
  (a) subject – (to) subject
  Du. aanbod ‘offer’ – aan bod ‘first in line’
  Du. voorkomen ‘prevent’ – voorkomen ‘happen’
  Portuguese falara ‘I had spoken’ – falará ‘he will speak’

Use of lexical stress

• Word recognition
  Du. oktober – octopus
  tigress – digress

Use of lexical stress

• Segmentation
  conduct ascends uphill
  ‘a doctor sends a pill’?

Properties of lexical stress

• Stress works on the syllable level

A lexicon

a /eI/ [@]
are /A:/ [@]
the /Di:/ [D@]
garden /gA:d@n/
ordinary /O:dInEri/
table /teIb@l/
variety /v@raI@ti/

A lexicon with stress marks

a /eI/ [@]
are /A:/ [@]
the /Di:/ [D@]
garden /"gA:d@n/ ["gA:dn"]
ordinary /"O:dInEri/ ["O:dn"ri]
table /"teIb@l/ ["theIbl"]
variety /v@"raI@ti/ [v""raI@ti]

Properties of lexical stress

• Stress works on the syllable level
• Unstressed syllables are reduced

/ka:"nOn/ ‘gun’ or /"ka:nOn/ ‘song’?

[Figure: two pitch tracks, pitch (0–400 Hz) against time, for recordings of 0.53 s and 0.49 s, each segmented as k | a: | n | O | n. Panel labels, in order: kanón /ka:"nOn/ ‘gun’ and kánon /"ka:nOn/ ‘song’.]

/"ka:nOn?/ ‘song?’ or /ka:"nOn?/ ‘gun?’?

[Figure: two pitch tracks for the words spoken as questions, pitch (0–400 Hz) against time, for recordings of 0.50 s and 0.57 s, each segmented as k | a: | n | O | n. Panel labels, in order: kanón /ka:"nOn/ ‘gun’ and kánon /"ka:nOn/ ‘song’.]

/"ka:nOn/ ‘song’ or /ka:"nOn/ ‘gun’?

[Figure: two waveforms, amplitude against time, for recordings of 0.53 s (amplitude −0.6068 to 0.6289) and 0.49 s (amplitude −0.5964 to 0.6173), each segmented as k | a: | n | O | n. Panel labels, in order: kanón /ka:"nOn/ ‘gun’ and kánon /"ka:nOn/ ‘song’.]

Properties of lexical stress

• Stress works on the syllable level
• Unstressed syllables are reduced
• Stressed syllables have longer durations
• Stressed syllables are louder

/ka:"nOn/ ‘gun’ or /"ka:nOn/ ‘song’?

[Figure: two spectrograms, frequency (0–5000 Hz) against time, for recordings of 0.53 s and 0.49 s, each segmented as k | a: | n | O | n. Panel labels, in order: kanón /ka:"nOn/ ‘gun’ and kánon /"ka:nOn/ ‘song’.]

/"ka:nOn?/ ‘song?’ or /ka:"nOn?/ ‘gun?’?

[Figure: two spectrograms for the words spoken as questions, frequency (0–5000 Hz) against time, for recordings of 0.50 s and 0.57 s, each segmented as k | a: | n | O | n. Panel labels, in order: kanón /ka:"nOn/ ‘gun’ and kánon /"ka:nOn/ ‘song’.]

Properties of lexical stress

• Stress works on the syllable level
• Unstressed syllables are reduced
• Stressed syllables have longer durations
• Stressed syllables are louder
• Stressed syllables have more high frequencies

Distinguishing phonemes

[Figure: diagram showing the phonemes /@/, /a:/, /y:/, /u:/ and /O/.]

Distinguishing phonemes

[Figure: diagram showing /@/ together with unstressed and stressed pairs: /a:/ and /"a:/, /y:/ and /"y:/, /u:/ and /"u:/, /O/ and /"O/.]

Integration in a speech recogniser

• /A a: p t O v V/
• Stressed and unstressed versions of phonemes
  /A "A a: "a: p "p t "t O "O v "v V "V/
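The doubling of the phoneme set can be sketched as follows. This is only an illustration, assuming the slide's notation in which a leading `"` marks the stressed variant; `with_stress_variants` is a hypothetical helper and not part of the recogniser described here.

```python
def with_stress_variants(phonemes):
    """Return each phoneme followed by its stressed variant (prefixed with ")."""
    doubled = []
    for phoneme in phonemes:
        doubled.append(phoneme)          # unstressed version
        doubled.append('"' + phoneme)    # stressed version
    return doubled


# The example phoneme set from the slide.
base = ["A", "a:", "p", "t", "O", "v", "V"]
print(" ".join(with_stress_variants(base)))
```

Every phoneme gets a second model, so the number of models to train doubles.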

Integration in the lexicon

are @
a @
the D @
garden "g "A: d n
ordinary "O: d n r i
table "t "eI b l
variety v "r "aI @ t i
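A minimal sketch of how such entries could be generated, assuming each pronunciation is available as a list of syllables plus the index of the stressed syllable. The syllable boundaries below and the helper `mark_stress` are assumptions made for illustration; the slides do not say how the lexicon was produced.

```python
def mark_stress(syllables, stressed_index):
    """Flatten syllables into phonemes, prefixing every phoneme of the
    stressed syllable (vowels and consonants alike) with the mark "."""
    marked = []
    for index, syllable in enumerate(syllables):
        for phoneme in syllable:
            if index == stressed_index:
                marked.append('"' + phoneme)
            else:
                marked.append(phoneme)
    return marked


# Hypothetical syllabified entries; None means the word carries no stress.
lexicon = {
    "the": ([["D", "@"]], None),
    "garden": ([["g", "A:"], ["d", "n"]], 0),
    "table": ([["t", "eI"], ["b", "l"]], 0),
    "variety": ([["v"], ["r", "aI"], ["@"], ["t", "i"]], 1),
}

for word, (syllables, stressed) in lexicon.items():
    print(word, " ".join(mark_stress(syllables, stressed)))
```

Marking whole syllables rather than only vowels matches the entries above, where consonants such as "g and "t also carry the mark.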

Integration in feature vectors

MFCCs

Spectral tilt features
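The slides do not say how the spectral tilt features are computed. One common way to express tilt, used here purely as an assumption, is the least-squares slope of the spectrum level (in dB) against frequency; a less negative slope means relatively more energy at high frequencies. The function names and the example numbers below are hypothetical.

```python
def spectral_tilt(freqs_hz, levels_db):
    """Least-squares slope of level (dB) against frequency (Hz), in dB per Hz."""
    n = len(freqs_hz)
    mean_f = sum(freqs_hz) / n
    mean_l = sum(levels_db) / n
    covariance = sum((f - mean_f) * (l - mean_l)
                     for f, l in zip(freqs_hz, levels_db))
    variance = sum((f - mean_f) ** 2 for f in freqs_hz)
    return covariance / variance


def extend_features(mfccs, tilt):
    """Append the tilt value to an existing MFCC vector for one frame."""
    return list(mfccs) + [tilt]


# Made-up spectrum for one frame: the level drops 6 dB per 1000 Hz.
freqs = [0.0, 1000.0, 2000.0, 3000.0]
levels = [0.0, -6.0, -12.0, -18.0]
tilt = spectral_tilt(freqs, levels)
frame = extend_features([1.2, -0.4, 0.7], tilt)  # placeholder MFCC values
print(len(frame))
```

Appending the value to each frame keeps the rest of the recogniser unchanged; only the dimension of the feature vector grows.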

Modelling duration

[Figure: left-to-right hidden Markov model with three emitting states (2–4). Forward transitions a12, a23, a34, a45; self-loops a22, a33, a44; output distributions b2, b3, b4.]
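In a model of this shape, the number of frames spent in a state depends only on its self-loop probability (a22, a33 or a44): staying d frames has probability a^(d−1)·(1−a), a geometric distribution with mean 1/(1−a). The sketch below illustrates this; the probability values are made up and the function names are hypothetical.

```python
def duration_pmf(self_loop, frames):
    """Probability of staying exactly `frames` frames in a state
    whose self-loop probability is `self_loop`."""
    return self_loop ** (frames - 1) * (1.0 - self_loop)


def expected_duration(self_loop):
    """Mean number of frames spent in the state (geometric distribution)."""
    return 1.0 / (1.0 - self_loop)


# Illustrative values only: a larger self-loop probability models a longer phoneme.
for a in (0.5, 0.8, 0.9):
    print(a, expected_duration(a), duration_pmf(a, 1))
```

Whatever the self-loop probability, a duration of one frame is always the most probable, so duration is modelled only coarsely this way.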

Model: baseline

[Diagram: Feature extraction → Viterbi (phoneme level) → phoneme alternatives {@, eI} l {i, aI} @ {m, n} → Viterbi (word level) → ‘alien’. The phoneme level uses HMMs for aI, @, eI, i, l, m, n; the word level uses the lexicon.]

Lexicon:
a /@/
alien /eIli@n/
lion /laI@n/
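To see why this input is ambiguous, the sketch below enumerates every path through the phoneme alternatives and keeps those that split into lexicon words. It is an exhaustive search without probabilities, not the Viterbi algorithm the recogniser uses, and the reading of the alternatives from the diagram is an assumption.

```python
from itertools import product

# Lexicon from the slide, one tuple of phonemes per word.
LEXICON = {
    "a": ("@",),
    "alien": ("eI", "l", "i", "@", "n"),
    "lion": ("l", "aI", "@", "n"),
}

# Phoneme alternatives per position, as read from the diagram.
LATTICE = [("@", "eI"), ("l",), ("i", "aI"), ("@",), ("m", "n")]


def segment(phonemes, lexicon):
    """Return every way to split a phoneme tuple into lexicon words."""
    if not phonemes:
        return [()]
    results = []
    for word, pronunciation in lexicon.items():
        if phonemes[:len(pronunciation)] == pronunciation:
            for rest in segment(phonemes[len(pronunciation):], lexicon):
                results.append((word,) + rest)
    return results


def word_hypotheses(lattice, lexicon):
    """Collect the word sequences consistent with any path through the lattice."""
    hypotheses = set()
    for path in product(*lattice):
        hypotheses.update(segment(path, lexicon))
    return hypotheses


print(sorted(word_hypotheses(LATTICE, LEXICON)))
# → [('a', 'lion'), ('alien',)]
```

Both ‘alien’ and ‘a lion’ fit the phoneme alternatives, so the recogniser needs further evidence to choose between them.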

Model: stress-enabled

[Diagram: Feature extraction and Stress feature extraction → Viterbi (phoneme level) → phoneme alternatives {@, eI} {l, "l} {"i, aI, "aI} @ {m, n} → Duration analysis → Viterbi (word level) → ‘a lion’. The phoneme level uses HMMs for aI, "aI, @, eI, "eI, i, "i, l, "l, m, "m, n, "n; the word level uses the lexicon.]

Lexicon:
a /@/
alien /"eI-li-@n/
lion /"laI-@n/

Model: implemented

[Diagram: Feature extraction and Stress feature extraction → Viterbi (phoneme level) → phoneme alternatives {@, eI} {l, "l} {"i, aI, "aI} @ {m, n} → Viterbi (word level) → ‘a lion’. The phoneme level uses HMMs for aI, "aI, @, eI, "eI, i, "i, l, "l, m, "m, n, "n; the word level uses the lexicon. There is no Duration analysis step.]

Lexicon:
a /@/
alien /"eI-li-@n/
lion /"laI-@n/

System

[Diagram: tool chain. Components shown: HCopy, Praat, sox, cvf, HCompV, HERest, trained HMMs, HVite. Example phoneme sequence: f o: "n "i: m I k t r A: n "s "k "r "I "p S n "z. Output: ‘recognised words’.]

System

• Hidden Markov Model Toolkit (HTK)
• Corpus Gesproken Nederlands (Spoken Dutch Corpus)
• 772 recordings
• 54 842 files
• 775 034 words
• 53 hours

Time

Using 6 to 8 computers:
• Training: 1–8 hours per iteration
• Evaluation: 4–10 hours per iteration
• 60 training iterations for 2 recognisers

Experimental set-up

Conventional speech recogniser

Stress-enabled speech recogniser

Results: duration /"i:/–/i:/

[Figure: distributions of duration for /"i:/ and /i:/. Horizontal axis 0–300; vertical axis 0–0.16.]

Results: duration /"n/–/n/

[Figure: distributions of duration for /"n/ and /n/. Horizontal axis 0–300; vertical axis 0–0.35.]

Results: spectral tilt /"a:/–/a:/

[Figure: distributions of spectral tilt for /"a:/ and /a:/. Horizontal axis −15 to 30; vertical axis 0–0.16.]

Results: spectral tilt /"d/–/d/

[Figure: distributions of spectral tilt for /"d/ and /d/. Horizontal axis −20 to 30; vertical axis 0–0.1.]

Results: training

[Figure: recognition accuracy (%) against training iteration. Horizontal axis 0–60 iterations; vertical axis ticks at 0 and from 25 to 45.]

Results: recognition improvement

Conventional speech recogniser: word error rate 56.72 %
Stress-enabled speech recogniser: word error rate 55.27 %

A 2.6 % relative improvement.
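The relative figure follows from the two word error rates: the absolute drop is 1.45 percentage points, and dividing by the baseline error rate gives the relative improvement. A short check, with `relative_improvement` as a hypothetical helper name:

```python
def relative_improvement(baseline_wer, new_wer):
    """Reduction in word error rate as a fraction of the baseline rate."""
    return (baseline_wer - new_wer) / baseline_wer


# Word error rates (in %) from the results above.
improvement = relative_improvement(56.72, 55.27)
print(f"{100 * improvement:.1f} %")
# → 2.6 %
```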

Conclusion

Using lexical stress in an automatic speech recogniser for continuous, large-vocabulary speech can improve the recognition rate.

• Consonants

Future work

• Model duration
• Model phrasal stress
