Chinese Spoken Dialogue Systems -- Speech Activities in CST

Thomas Fang Zheng
Center of Speech Technology (CST)
State Key Lab of Intelligent Technology and Systems
Department of Computer Science & Technology, Tsinghua University
[email protected], http://sp.cs.tsinghua.edu.cn/~fzheng/
30 Oct 2001 at Communications Research Lab, Kyoto



Outline

- Brief Introduction to CST
- Speech R&D Activities (w/ paper references)
- A Flight Spoken Dialogue System - EasyFlight
  - System Overview
  - Keyword Based Robust Parser
  - Powerful Dialogue Manager
- Demonstrations
  - EasyFlight - Flight inquiry & reservation dialog system
  - EasyNav - THU Campus navigation dialog system
- Thanks


Center of Speech Technology

- Founded in 1979 as the Speech Laboratory
- Joined the State Key Laboratory of Intelligent Technology and Systems in 1999 and was renamed the Center of Speech Technology
- http://sp.cs.tsinghua.edu.cn/


Members of CST in 2001

[Chart: member counts by category. Categories: Faculty, Post Doctors, Doctoral Students, Master Students. Values as extracted: 5, 1, 13, 7; the value-to-category mapping was lost in extraction.]


Funding Resources

- State fundamental research plans:
  - NSF
  - 863
  - 973
  - 985 (Tsinghua University)
- Collaboration with industry:
  - Analog Devices, Inc.
  - IBM
  - Intel
  - Keysun Information Technology Limited
  - Lucent Technologies
  - Microsoft
  - Nokia
  - SoundTek Technology Limited
  - Weniwen Technologies Limited
  - ...


Speech R&D Activities

- Acoustic Modeling
  - Feature Extraction and Selection
  - Acoustic Modeling
  - Accurate & fast AM Search
  - Robustness
    - Speech Enhancement
    - Fractals
    - Speaker Adaptation
    - Speaker Normalization
    - Chinese Pronunciation Modeling
- Language Modeling
  - Characteristics of Chinese
  - Language Modeling and Search
  - LM Adaptation & New Word Induction
- Natural/Spoken Language Understanding (NLU/SLU)
  - NLU - GLR Based Parsing
  - SLU - KW based robust parsing
  - Dialogue Manager
- Applications
  - Command and control
  - Keyword spotting
  - Language Learning
  - Input method editor
  - Chinese dictation machine
  - Spoken dialogues
  - Speaker identification and verification
- Resources


Feature Extraction and Selection

1. Fan Wang, Fang Zheng, and Wenhu Wu. “An MCE based Classification Tree Using Hierarchical Feature-Weighting in Speech Recognition,” EuroSpeech’2001, 3:1947-1950, Sept. 3-7, 2001, Aalborg, Denmark

2. Xinyan Zhang. “Subband analysis based robust speech recognition,” Graduate Project: Tsinghua University, Beijing. June 2001.


Acoustic Modeling

1. Jiyong Zhang, Fang Zheng, Jing Li, Chunhua Luo, and Guoliang Zhang, “Improved Context-Dependent Acoustic Modeling for Continuous Chinese Speech Recognition,” EuroSpeech, 3:1617-1620, Sept. 3-7, 2001, Aalborg, Denmark

2. Zheng Fang, Wu Wenhu, and Fang Ditang, “Center-Distance Continuous Probability Models And the Distance Measure,” J. of Computer Science and Technology, 13(5): 426-437, Sept., 1998

3. ZHENG Fang, MOU Xiaolong, WU Wenhu, and FANG Ditang, “On the Embedded Multiple-Model Scoring Scheme for Speech Recognition,” International Symposium on Chinese Spoken Language Processing (ISCSLP'98), ASR-A3, pp.49-53, Dec.7-9, 1998, Singapore

4. Guo Qing, Zheng Fang, Wu Jian and Wu Wenhu, “A new method used in HMM for modeling frame correlation,” ICASSP, pp. I-169~172, March 15~19, 1999, Phoenix


Accurate & fast AM Search

1. Guoliang Zhang, Fang Zheng, and Wenhu Wu, “A Two-Layer Lexical Tree Based Beam Search in Continuous Chinese Speech Recognition,” EuroSpeech, 3:1801-1804, Sept. 3-7, 2001, Aalborg, Denmark

2. Jian Wu, and Fang Zheng. “Reducing time-synchronous beam search effort using stage based look-ahead and language model rank based pruning,” ICSLP’00, pp. IV-262~265

3. Zhanjiang Song, Fang Zheng, and Wenhu Wu. “Statistical knowledge based frame synchronous search strategies in continuous speech recognition,” ICASSP’00, pp. III-1583~1586

4. Jiyong Zhang, Fang Zheng, Shu Du, Zhanjiang Song and Mingxing Xu. “Merging based syllable detection automaton in continuous Chinese speech recognition,” J. of Software, 10(11): 1212~1215, Nov. 1999 (in Chinese)

5. Fang Zheng, Zhanjiang Song, Mingxing Xu, et al. “EasyTalk: A Large-Vocabulary Speaker-Independent Chinese Dictation Machine,” EuroSpeech'99, Vol. 2, pp.819-822, Budapest, Hungary, Sept. 1999

6. Fang Zheng, Mingxing Xu, and Wenhu Wu. “Search strategies in continuous speech recognition,” 5th National Conference on Man-Machine Speech Communications (NCMMSC’98), pp. 138-143, Jul. 26-31, 1998, Harbin (in Chinese)


Speech Enhancement

1. YANG Dali, XU Mingxing, WU Wenhu, ZHENG Fang, “A Noise Cancellation Method Based on Wavelet Transform,” International Symposium on Chinese Spoken Language Processing, pp. 211-214, Oct. 13-15, 2000, Beijing


Fractals

1. Fan Wang, Fang Zheng, and Wenhu Wu, “A C/V segmentation method for Mandarin speech based on multiscale fractal dimension,” International Conference on Spoken Language Processing, pp. IV-648~651, Oct. 16-20, Beijing

2. WANG Fan, ZHENG Fang, and WU Wenhu, “A self-adapting endpoint detection algorithm for speech recognition in noisy environments based on 1/f process,” International Symposium on Chinese Spoken Language Processing, pp. 327-330, Oct. 13-15, 2000, Beijing


Speaker Adaptation

1. Lei He, Jian Wu, Ditang Fang, Wenhu Wu, “Speaker adaptation based on combination of map estimation and weighted neighbor regression,” IEEE ICASSP, pp.II-981~984, June 5-9, 2000, Istanbul, Turkey


Speaker Normalization

1. Lei HE, Ditang FANG, and Wenhu WU, “Speaker normalization training and adaptation for speech recognition,” International Conference on Spoken Language Processing, pp. IV-342~345, Oct. 16-20, Beijing

2. Tranzai LEE, Fang ZHENG, and Wenhu WU, “Reference point alignment frequency warp method for speaker adaptation,” International Conference on Signal Processing, pp. II-756~759, Aug. 21-25, 2000, Beijing


Chinese Pronunciation Modeling

1. Fang Zheng, Zhanjiang Song, Pascale Fung, and William Byrne, “Mandarin Pronunciation Modeling Based on CASS Corpus,” Sino-French Symposium on Speech and Language Processing, pp. 47-53, Oct. 16, 2000, Beijing

2. Pascale Fung, William Byrne, ZHENG Fang Thomas, Terri Kamm, LIU Yi, SONG Zhanjiang, Veera Venkataramani, and Umar Ruhi, “Pronunciation modeling of Mandarin casual speech,” Workshop 2000 on Speech and Language Processing: Final Report for MPM Group, http://www.clsp.jhu.edu/index.shtml

3. Fang Zheng, Zhanjiang Song, Pascale Fung, and William Byrne, “Modeling Pronunciation Variation Using Context-Dependent Weighting and B/S Refined Acoustic Modeling,” EuroSpeech, 1:57-60, Sept. 3-7, 2001, Aalborg, Denmark


Language Modeling

1. Jian Wu and Fang Zheng, “On enhancing Katz-smoothing based back-off language model,” International Conference on Spoken Language Processing, pp. I-198~201, Oct. 16-20, Beijing

2. Xiaolong Mou, Jinming Zhan, Fang Zheng and Wenhu Wu. “The N-Gram Language Model Based on the Back-off Estimation Algorithm,” The 5th National Conference on Man-Machine Speech Communication (NCMMSC’98), 206-209, July 26-31, 1998, Harbin (in Chinese)


LM Search

1. Fang Zheng, Jian Wu and Zhanjiang Song, “Improving the Syllable-Synchronous Network Search Algorithm for Word Decoding in Continuous Chinese Speech Recognition,” J. Computer Science & Technology, 15(5): 461-471, Sept. 2000

2. Fang Zheng, “A Syllable-Synchronous Network Search Algorithm for Word Decoding in Chinese Speech Recognition,” ICASSP, pp. II-601~604, March 15~19, 1999, Phoenix

3. Fang Zheng, Jian Wu and Wenhu Wu, “Input Chinese sentences using digits,” International Conference on Spoken Language Processing, pp. III-127~130, Oct. 16-20, Beijing


LM Adaptation & New Word Induction

1. Genqing Wu, Fang Zheng, Ling Jin, and Wenhu Wu, “An online incremental language model adaptation,” EuroSpeech, 3:2139-2142, Sept. 3-7, 2001, Aalborg, Denmark


NLU - GLR Based Parsing

1. Yinfei Huang, Fang Zheng, Yi Su, Fang Li, Wenhu Wu, “A Theme Structure Method for the Ellipsis Resolution,” EuroSpeech, 3:2153-2156, Sept. 3-7, 2001, Aalborg, Denmark

2. Yi Su, Fang Zheng, and Yinfei Huang, “Design of a Semantic Parser with Support to Ellipsis Resolution in a Chinese Spoken Language Dialogue System,” EuroSpeech, 3:2161-2164, Sept. 3-7, 2001, Aalborg, Denmark

3. Yinfei HUANG, Fang ZHENG, Mingxing XU, Pengju Yan, and Wenhu WU, “Language understanding component for Chinese dialogue system,” International Conference on Spoken Language Processing, pp. III-1053~1056, Oct. 16-20, Beijing

4. Yan Pengju, Zheng Fang, Xu Mingxing, Huang Yinfei, “Word-class stochastic model in a spoken language dialogue system,” International Symposium on Chinese Spoken Language Processing, pp. 141-144, Oct. 13-15, 2000, Beijing


SLU - KW Based Robust Parsing

1. Pengju Yan, Fang Zheng, Hui Sun, and Mingxing Xu, “Parsing spontaneous speech in the dialogue systems,” EuroSpeech, 3:2149-2152, Sept. 3-7, 2001, Aalborg, Denmark


Dialogue Manager (DM)

1. Xiaojun Wu, Fang Zheng and Mingxing Xu. “TOPIC Forest: A plan-based dialogue management structure,” International Conference on Acoustics, Speech and Signal Processing, Vol. I., May 7-11, Salt Lake City, USA

2. Li Fang, Zheng Fang, Wu Wenhu, Huang Yinfei, “Dynamic Query Organization and Response Generation in Spoken Dialogue System,” 19th International Conference on Computer Processing of Oriental Languages, May 14-16, Seoul, Korea


Applications & References

- Command and control
  - Fang Zheng, Qixiu Hu, Xiang Deng, et al. “An introduction to a kind of voice dialers for dummies,” 4th National Conference on Man-Machine Speech Communications (NCMMSC’96), pp.165-168, Oct. 1996, Beijing (in Chinese)
  - Yinfei Huang, Fang Zheng, and Wenhu Wu. “EasyCmd: Navigation by Voice Commands,” International Symposium on Chinese Spoken Language Processing (ISCSLP’00), pp. 145-148, Oct. 13-15, 2000, Beijing


- Keyword spotting
  - Zheng Fang, Xu Mingxing, Mou Xiaolong, et al. “HarkMan - A Vocabulary-Independent Keyword Spotter for Spontaneous Chinese Speech,” J. of Computer Science and Technology (JCST), 14(1): 18-26, Jan., 1999


- Language Learning (Pronunciation Scoring)
  - Zhanjiang Song, Fang Zheng, Mingxing Xu, and Wenhu Wu. “An Effective Scoring Method for Speaking Skill Evaluation System,” EuroSpeech'99, Vol. 1, pp.187-190, Budapest, Hungary, Sept. 1999


- Input method editor (IME)
  - Fang Zheng, Jian Wu, and Wenhu Wu. “Input Chinese sentences using digits,” International Conference on Spoken Language Processing (ICSLP’00), pp. III-127~130, Oct. 16-20, Beijing
  - Ling JIN, Genqing Wu, Fang Zheng, and Wenhu Wu. “Improved strategies for intelligent sentence input method engine system,” International Symposium on Chinese Spoken Language Processing (ISCSLP’00), pp. 247-250, Oct. 13-15, 2000, Beijing


- Chinese dictation machine (CDM)
  - Fang Zheng, Zhanjiang Song, Mingxing Xu, et al. “EasyTalk: A Large-Vocabulary Speaker-Independent Chinese Dictation Machine,” EuroSpeech'99, Vol. 2, pp.819-822, Budapest, Hungary, Sept. 1999
  - Jian Wu, and Fang Zheng. “Reducing time-synchronous beam search effort using stage based look-ahead and language model rank based pruning,” ICSLP’00, pp. IV-262~265


- Spoken dialogues
  - Yinfei Huang, Fang Zheng, Mingxing Xu, et al. “Language understanding component for Chinese dialogue system,” ICSLP’00, pp. III-1053~1056, Oct. 16-20, Beijing
  - Yan Pengju, Zheng Fang, Xu Mingxing, et al. “Word-class stochastic model in a spoken language dialogue system,” ISCSLP’00, pp. 141-144, Oct. 13-15, 2000, Beijing
  - Pengju Yan, Fang Zheng, Hui Sun, et al. “Parsing Spontaneous speech in the dialogue systems,” to be submitted
  - Xiaojun Wu, Fang Zheng and Mingxing Xu. “TOPIC Forest: A plan-based dialogue management structure,” to appear in ICASSP’2001


- Speaker identification and verification
- Language Identification
- …


Resources

- Chinese Speech Database
  - Standard Chinese (25 CD-ROMs)
  - Chinese w/ Yue accent (41 CD-ROMs)
  - Real-world spontaneous telephone dialogue (200 hours)
  - Chinese annotated spontaneous speech (CASS) corpus (6 hours)
  - 863 Speech Recognition Database (40 CD-ROMs)
  - 863 Speech Synthesis Database (8 CD-ROMs)
- Chinese Text Database
  - People’s Daily


EasyFlight

A Spoken Dialogue System for Flight Information Inquiry and Flight Reservation


System Overview

- EasyFlight is a spoken dialogue system providing
  - Flight information inquiry; and
  - Flight reservation.
- EasyFlight features:
  - Context-dependent understanding (w/ a remembering and forgetting scheme to support ellipsis (省略))
  - Robust parsing (to handle spoken language phenomena)
  - Topic changeable (to allow the user to shift among topics freely)
  - Mixed-initiative (混合主导) (both the user and the machine can guide the following conversation at any time)
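The slides do not spell out the remembering and forgetting scheme. As a rough illustration only, one way to support ellipsis is to keep slot values from earlier turns and drop them after a fixed number of turns. The class name `Context`, the `ttl` parameter and the slot names below are assumptions of this sketch, not details of EasyFlight.

```python
class Context:
    """Remember slot values across turns and forget them after `ttl` turns."""

    def __init__(self, ttl=3):
        self.ttl = ttl
        self.slots = {}  # slot name -> (value, age in turns)

    def update(self, frame):
        """Merge this turn's frame with remembered slots and return the result."""
        # Age everything we remember; drop slots older than ttl turns.
        self.slots = {
            name: (value, age + 1)
            for name, (value, age) in self.slots.items()
            if age + 1 <= self.ttl
        }
        # Slots mentioned in this turn override old values and reset their age.
        for name, value in frame.items():
            self.slots[name] = (value, 0)
        return {name: value for name, (value, _) in self.slots.items()}


ctx = Context(ttl=2)
turn1 = ctx.update({"DEPARTURE_CITY": "Shenzhen", "ARRIVAL_CITY": "Beijing"})
turn2 = ctx.update({"WEEKLY_DATE": "next Monday"})  # cities are elided by the user
turn3 = ctx.update({})
turn4 = ctx.update({})                               # cities are now forgotten
```

In the second turn the user only gives a date, and the cities are filled in from memory; two turns later they have aged out.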


System Block Diagram

[Figure: system block diagram. Components: Keyword Spotter, Syntactic Analyzer, Semantic Analyzer, Dialogue Manager, Response Generator, Text-to-Speech, Domain Database, Dialog History & Status Maintenance. Data labels: User Utterance, Keyword Lattice, Syntax Tree, Semantic frame, Dynamic Vocabulary, Dynamic Rule Set, Contexts, Inquiry & Update, Results, Response Focus, Text response, Texts/Tags, Speech Response. The connections between blocks were lost in extraction.]


Keyword Based Robust Parser

- We use a keyword-based parser and a context-free grammar (CFG) for spoken language understanding
  - The symbols of the grammar are semantically relevant items
- Why keywords?
- Why grammar?
- How do we do it?


- Why keywords?
  - Spoken dialogues often contain
    - Speech recognition errors: deletion, substitution, and insertion
    - Spontaneous speech phenomena: garbage, hesitation, repetition, correction, fragment, ellipsis, word disordering, ill forms, and so on
  - So it is difficult to obtain a fully correct recognized sentence for full-sentence parsing
  - An alternative: keyword spotting, semantics-based grammar, partial parsing (each partial result is maintained)


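- Why grammar?
  - The sentence structure can be viewed as a deterministic tree.

[Figure: syntax tree with node labels S, NP, PRON, VP, V, NP, ADJ, N. The layout suggests S → NP VP, NP → PRON, VP → V NP, NP → ADJ N, with the leaves 明天的 (“tomorrow's”) and 票 (“ticket”) under ADJ and N. The words under PRON and V were lost in extraction.]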


- Why grammar? (cont’d)
  - The structure of the underlying semantics (语义) and/or the domain knowledge can also be viewed as a deterministic tree.

QUERY_FOR_FLIGHTS
  DATE_TIME
    WEEKLY_DATE: next Monday
  ROUTE
    ARRIVAL_CITY: Beijing
    DEPARTURE_CITY: Shenzhen
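The semantic tree on this slide can be written down as nested data and flattened into a slot frame. This is only a sketch: the node labels are taken from the slide, while the tuple layout, the parent-child nesting read off the figure, and the `flatten` helper are assumptions.

```python
# (label, children) for inner nodes, (label, surface string) for leaves.
tree = ("QUERY_FOR_FLIGHTS", [
    ("DATE_TIME", [("WEEKLY_DATE", "next Monday")]),
    ("ROUTE", [
        ("ARRIVAL_CITY", "Beijing"),
        ("DEPARTURE_CITY", "Shenzhen"),
    ]),
])


def flatten(node):
    """Walk the tree depth-first and collect {leaf label: value} pairs."""
    label, body = node
    if isinstance(body, str):  # leaf: the label carries a surface value
        return {label: body}
    slots = {}
    for child in body:
        slots.update(flatten(child))
    return slots


frame = flatten(tree)
```

A flat frame like this is the kind of structure a dialogue manager could use to build a database query.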


- Problems when using grammar
  - Chinese is an ideographic (表意的) language
    - Chinese sentences are more casual than English ones
    - They are difficult to model with syntactic grammars
  - In dialogue systems, ungrammatical phenomena are commonly seen
    - ellipses or missing words/phrases
    - repetitions
    - garbage
    - fragments
    - disordering
    - ill forms


- Solutions
  - Define special types of CFG rules to deal with spoken language phenomena.
  - Unlike a traditional grammar, which uses parts of speech (POSes) as terminal symbols, use keyword categories as terminal symbols and semantic units as non-terminal symbols to form a semantics-based grammar
  - Enhance and modify the traditional chart parser into the Marionette parser


- A keyword based robust parser includes:
  - Keyword List, used as the lexicon for the recognizer and as the terminal symbols in the semantic grammar
  - Grammar Definition, in which four types of rules are defined
  - Grammar Transcription, a semantic grammar based on the analysis of a real-world domain corpus
  - Marionette Parser, a chart parser making use of the aforesaid grammar and eliminating ambiguities by pruning/optimizing


- Keyword list
  - ~700 lexical words
  - ~70 semantic categories
  - 3 larger classes
    - Material class (实体类) - each word contains some real domain-specific info.
    - Tag class (标记类) - each category plays a different role in identifying the user’s intention
    - Atom class (原子类) - no word has substantial semantic meaning of its own, but words can be combined to become larger constituents (成分).

Keyword category examples

Category               Explanation or example
mat_city_name          “北京” (“Beijing”)
mat_airline_code       “CA” (“Air China”)
mat_aircraft_type      “波音747” (“Boeing 747”)
mat_time_of_the_day    “上午” (“morning”)
tag_from_here          “从这儿” (“from here”)
tag_to                 “到” (“to”)
tag_exit_or_not        “有没有” (“exist or not”)
tag_how_many           “多少” (“how many”)
ato_week               “礼拜” (weekday prefix)
ato_january_prefix     “元” (January prefix)
ato_0to9_yao           “一” (digits for ID spelling)
ato_1to6               “六” (digit suffix for weekday)
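As a small illustration of how the keyword list ties words to categories and classes, the twelve examples above can be put in a lookup table. This is a text-only sketch: the real keyword spotter works on speech, and the `spot` and `keyword_class` helpers, along with the greedy longest-match strategy, are assumptions made for the example.

```python
# Word -> category, copied from the examples on the slide (identifiers as printed).
KEYWORDS = {
    "北京": "mat_city_name",
    "CA": "mat_airline_code",
    "波音747": "mat_aircraft_type",
    "上午": "mat_time_of_the_day",
    "从这儿": "tag_from_here",
    "到": "tag_to",
    "有没有": "tag_exit_or_not",
    "多少": "tag_how_many",
    "礼拜": "ato_week",
    "元": "ato_january_prefix",
    "一": "ato_0to9_yao",
    "六": "ato_1to6",
}


def keyword_class(category):
    """Map a category to its class using the mat_/tag_/ato_ prefix."""
    return {"mat": "material", "tag": "tag", "ato": "atom"}[category.split("_")[0]]


def spot(text):
    """Greedy longest-match scan; characters that start no keyword are skipped."""
    hits, i = [], 0
    longest = max(len(word) for word in KEYWORDS)
    while i < len(text):
        for n in range(min(longest, len(text) - i), 0, -1):
            word = text[i:i + n]
            if word in KEYWORDS:
                hits.append((word, KEYWORDS[word]))
                i += n
                break
        else:
            i += 1  # filler or unknown character
    return hits


route = spot("从这儿到北京")
weekday = spot("嗯礼拜六")
```

Here `route` holds three keywords (“from here”, “to”, “Beijing”), and the filler 嗯 in the second input is simply skipped.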


- Grammar Definition
  - 4 types of grammar rules to cope with spontaneous speech phenomena
    - Up-tying type (苛刻型) - the sub-constituents are strictly tied together, as in a conventional grammar
    - By-passing type (跳跃型) - the sub-constituents are combined whether or not there are gap words in between
    - Up-messing type (无序型) - the sub-constituents can appear in any order
    - Over-crossing type (交叉型) - the occupations of the sub-constituents can overlap with each other
  - Overall features
    - Keywords are taken as terminal symbols
    - All constituents belong to semantic categories instead of syntactic categories
    - Thus the grammar is a semantic one
    - The grammar size is over 250
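One possible way to hold the four rule types in code is an enum plus a small rule record. The arrow markers (`*→`, `→`, `@→`, `#→`) are inferred from the rule examples elsewhere in this talk; their mapping to rule types, and the names `RuleType`, `Rule` and `parse_rule`, are assumptions of this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class RuleType(Enum):
    UP_TYING = "*→"       # children strictly adjacent and in order
    BY_PASSING = "→"      # children in order, gap words allowed in between
    UP_MESSING = "@→"     # children in any order
    OVER_CROSSING = "#→"  # children's occupations may overlap


@dataclass(frozen=True)
class Rule:
    head: str
    kind: RuleType
    body: tuple


def parse_rule(text):
    """Parse 'head ARROW child + child ...' into a Rule."""
    # Try the two-character arrows first so '*→' is not read as a plain '→'.
    for kind in sorted(RuleType, key=lambda k: -len(k.value)):
        if kind.value in text:
            head, body = text.split(kind.value, 1)
            children = tuple(part.strip() for part in body.split("+"))
            return Rule(head.strip(), kind, children)
    raise ValueError(f"no arrow found in {text!r}")


plane = parse_rule("plane_info @→ mat_airline_code + mat_aircraft_type")
week_day = parse_rule("sub_week_day → ato_week + ato_1to6")
```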


- Semantic Grammar Transcription (examples)
  - up-tying (苛刻型) rules
    - Some crucial information must not be mixed with or interrupted by other terms, e.g. a personal ID no.
    - E.g. (in China, an ID no. can be 15 or 18 digits long)
      - sub_id_card_head *→ ato_0to9_yao + ato_0to9_yao + … + ato_0to9_yao (15 identical terms)
      - id_card_no → sub_id_card_head
      - id_card_no *→ sub_id_card_head + ato_0to9_yao + ato_0to9_yao + ato_0to9_yao
    - This is the traditional rule type.
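A minimal sketch of up-tying matching, assuming the input has already been reduced to a list of keyword categories: the children must appear contiguously and in order, so a single inserted filler breaks the match. The helper names and the `"filler"` marker are hypothetical.

```python
def match_up_tying(categories, body, start):
    """Return the end index if `body` occurs contiguously at `start`, else None."""
    end = start + len(body)
    if list(categories[start:end]) == list(body):
        return end
    return None


DIGIT = "ato_0to9_yao"
ID_HEAD = (DIGIT,) * 15  # sub_id_card_head: 15 identical terms
ID_TAIL = (DIGIT,) * 3   # three more digits for the 18-digit form


def id_card_length(categories):
    """Return 15 or 18 if the whole sequence is an ID number, else None."""
    end = match_up_tying(categories, ID_HEAD, 0)
    if end is None:
        return None
    if end == len(categories):
        return 15
    if match_up_tying(categories, ID_TAIL, end) == len(categories):
        return 18
    return None


broken = [DIGIT] * 7 + ["filler"] + [DIGIT] * 8  # a filler inside the number
```

With these definitions, 15 or 18 consecutive digits are accepted, while the `broken` sequence is rejected because the filler interrupts the run.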


- Semantic Grammar Transcription (examples)
  - by-passing (跳跃型) rules
    - In contrast, some utterances may contain inserted recognition garbage, fillers, or meaningless parts, e.g. “星期啊三嗯星期四” (“Wednesday ... Thursday” with the fillers 啊 and 嗯 inserted)
    - E.g.
      - sub_week_day → ato_week + ato_1to6
      - sub_week_day_list → sub_week_day
      - sub_week_day_list → sub_week_day + sub_week_day_list
      - sub_date → sub_week_day_list
    - Many rules are of this type.
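By-passing matching can be sketched as an in-order match that steps over gap tokens. In this sketch a gap is any token whose category is `None`; that convention, the function names, and the assumption that 星期 belongs to `ato_week` and 三/四 to `ato_1to6` are all illustrative choices rather than details from the slides.

```python
def match_by_passing(tokens, body, start=0):
    """Match `body` categories in order from `start`, skipping filler tokens.

    `tokens` is a list of (word, category) pairs; category None marks a
    filler. Returns (positions, next_index) on success, else None.
    """
    positions, i = [], start
    for wanted in body:
        while i < len(tokens) and tokens[i][1] is None:
            i += 1  # by-pass the gap word
        if i == len(tokens) or tokens[i][1] != wanted:
            return None
        positions.append(i)
        i += 1
    return positions, i


def week_days(tokens):
    """Collect every sub_week_day (ato_week + ato_1to6) in the utterance."""
    found, i = [], 0
    while i < len(tokens):
        hit = match_by_passing(tokens, ("ato_week", "ato_1to6"), i)
        if hit is None:
            i += 1
            continue
        positions, i = hit
        found.append("".join(tokens[p][0] for p in positions))
    return found


# "星期啊三嗯星期四" split into keywords and fillers
tokens = [("星期", "ato_week"), ("啊", None), ("三", "ato_1to6"),
          ("嗯", None), ("星期", "ato_week"), ("四", "ato_1to6")]
days = week_days(tokens)
```

Both weekdays are recovered despite the fillers, which gives the list that a `sub_week_day_list` rule would then combine.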


- Semantic Grammar Transcription (examples)
  - up-messing (无序型) rules
    - Some information, such as time, city names, plane types etc., can appear without following any predefined order
    - E.g.
      - timeloc_info_cond @→ info_date_time_cond + info_fromto
      - plane_info @→ mat_airline_code + mat_aircraft_type
      - flight_info_cond @→ timeloc_info_cond + plane_info
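Order-free matching for up-messing rules can be sketched as a multiset comparison: the rule fires if the observed categories are some permutation of the rule body. The function name and the use of `collections.Counter` are choices made for this illustration only.

```python
from collections import Counter


def match_up_messing(categories, body):
    """True if `categories` is a permutation of `body` (order does not matter)."""
    return Counter(categories) == Counter(body)


PLANE_INFO = ("mat_airline_code", "mat_aircraft_type")

airline_first = match_up_messing(["mat_airline_code", "mat_aircraft_type"], PLANE_INFO)
aircraft_first = match_up_messing(["mat_aircraft_type", "mat_airline_code"], PLANE_INFO)
incomplete = match_up_messing(["mat_airline_code"], PLANE_INFO)
```

Both orders of airline code and aircraft type are accepted, while a body with a missing child is not.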


❑ Semantic Grammar Transcription (examples)
❖ over-crossing (交叉型) rules

Some phrases/constituents, such as “是…吗” ("is it ...?"), can have other constituents appear in between, e.g. “是到北京吗” ("Is it to Beijing?"), “是两张吗” ("Is it two tickets?")

E.g.
▪ mark_q_is → tag_is_or_not
▪ mark_q_is → tag_is + tag_question_mark
▪ mark_q_is → tag_is_q
▪ confirm_request #→ mark_q_is + confirm_c
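One way to picture an over-crossing combination is to treat each constituent's occupation as the set of token indices it covers. The sketch below rests on assumptions: the word segmentation of “是到北京吗”, the function names, and the reading that the ranges may interleave while the covered tokens themselves must stay disjoint.

```python
from typing import FrozenSet, Optional


def ranges_interleave(a: FrozenSet[int], b: FrozenSet[int]) -> bool:
    """True when the token ranges spanned by a and b overlap."""
    return min(a) <= max(b) and min(b) <= max(a)


def combine_over_crossing(a: FrozenSet[int],
                          b: FrozenSet[int]) -> Optional[FrozenSet[int]]:
    """Combine two constituents unless they claim the same token."""
    if a & b:
        return None
    return a | b


# Assumed segmentation: 是(0) 到(1) 北京(2) 吗(3)
mark_q_is = frozenset({0, 3})   # 是 ... 吗
confirm_c = frozenset({1, 2})   # 到 北京
crossing = ranges_interleave(mark_q_is, confirm_c)
confirm_request = combine_over_crossing(mark_q_is, confirm_c)
```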


❑ When ambiguities (歧义) arise, the candidate constituents are evaluated against several criteria, and the highest-ranked constituent survives:
❖ position (in sentence)
❖ occupation (number of leaf nodes)
❖ depth, etc.
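A small sketch of ranking competing constituents by these criteria. The order in which the criteria are compared, and the preference for shallower trees as a final tie-break, are assumptions of this example; the preference for larger and later constituents follows the precedence rules listed for the Marionette Parser.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Constituent:
    name: str
    start: int    # position of the first covered token in the sentence
    leaves: int   # occupation: number of leaf nodes covered
    depth: int    # depth of the sub-tree


def rank_key(c: Constituent) -> Tuple[int, int, int]:
    # Assumed priority: larger occupation, then later position,
    # then (hypothetical tie-break) smaller depth.
    return (c.leaves, c.start, -c.depth)


def survivor(candidates: List[Constituent]) -> Constituent:
    """Return the highest-ranked candidate."""
    return max(candidates, key=rank_key)


candidates = [
    Constituent("sub_week_day", start=0, leaves=2, depth=1),
    Constituent("sub_week_day", start=3, leaves=2, depth=1),
    Constituent("ato_week", start=4, leaves=1, depth=0),
]
winner = survivor(candidates)
```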


❑ Marionette Parser - an enhanced chart parser
❖ Maintaining all partial results
❖ Combining non-adjacent constituents (By-passing rules)
❖ Considering all possible orders of the constituents (Up-messing rules)
❖ Grouping the constituents whether or not their occupations overlap with each other (Over-crossing rules)
❖ Giving later sub-constituents precedence over earlier ones
❖ Giving larger sub-constituents precedence over smaller ones.


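❖ A part of the parsing algorithm

For constituent $C$ at position $(p_1, p_2)$:

a) for each arc $Y \to Y_1 Y_2 \cdots Y_k \circ C\, Y_{k+1} \cdots Y_n$ at position $(p_0, p_1')$, where $p_1' \le p_1$, add a new arc $Y \to Y_1 Y_2 \cdots Y_k\, C \circ Y_{k+1} \cdots Y_n$ at position $(p_0, p_2)$;

b) for each arc $Y \mathrel{@\!\to} Y_1 Y_2 \cdots Y_k \circ Y_{k+1} \cdots Y_n$ where $C$ is compatible and not yet applied, add a new arc $Y \mathrel{@\!\to} Y_1 Y_2 \cdots Y_k \circ Y_{k+1} \cdots Y_n$ at the calculated actual position;

c) for each arc $Y \mathrel{\#\!\to} Y_1 Y_2 \cdots Y_k \circ Y_{k+1} \cdots Y_n$ where $C$ is compatible, not yet applied, and no overlapping is met, add a new arc $Y \mathrel{\#\!\to} Y_1 Y_2 \cdots Y_k \circ Y_{k+1} \cdots Y_n$ at the calculated actual position.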
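Step (a) can be pictured with a short sketch: an arc that ends at or before the start of the new constituent, and expects that constituent next, is extended to the constituent's end. The `Arc` class and `extend_arcs` function are hypothetical names for this illustration, and reading the dot as "next expected symbol" is the usual chart-parser convention, assumed here rather than confirmed by the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Arc:
    lhs: str
    rhs: Tuple[str, ...]
    dot: int     # number of right-hand-side symbols already matched
    start: int   # p0
    end: int     # p1'


def extend_arcs(arcs: List[Arc], label: str, p1: int, p2: int) -> List[Arc]:
    """Return the new arcs created by a constituent `label` at (p1, p2)."""
    new_arcs = []
    for arc in arcs:
        expects_label = arc.dot < len(arc.rhs) and arc.rhs[arc.dot] == label
        # arc.end <= p1 (rather than ==) is what permits by-passing a gap.
        if expects_label and arc.end <= p1:
            new_arcs.append(Arc(arc.lhs, arc.rhs, arc.dot + 1, arc.start, p2))
    return new_arcs


# "ato_week" was seen at (0, 1); a filler sits at (1, 2); "ato_1to6" at (2, 3).
active = [Arc("sub_week_day", ("ato_week", "ato_1to6"), dot=1, start=0, end=1)]
extended = extend_arcs(active, "ato_1to6", p1=2, p2=3)
```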


❖ In a semantic tree (as in a syntactic tree)
Statically, each rule node corresponds to a semantic function.
Dynamically, each constituent node has a pointer to a semantic function.

Semantic analysis is carried out by calling semantic functions:
▪ First, the topmost node’s semantic function is called;
▪ Each child node’s semantic function is then called recursively by its parent node;
▪ This continues until the complete semantics is obtained.
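The recursive calling scheme can be sketched as follows. The node names come from the grammar examples, but the tree shape (tag nodes omitted), the `Node` class, the semantic functions, and the output keys are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    rule: str
    children: List["Node"] = field(default_factory=list)
    text: str = ""

    def evaluate(self) -> Dict[str, str]:
        # Each constituent node points to the semantic function of its rule.
        return SEMANTIC_FUNCTIONS[self.rule](self)


def sem_city(node: Node) -> Dict[str, str]:
    return {"city": node.text}


def sem_from(node: Node) -> Dict[str, str]:
    return {"departure_city": node.children[0].evaluate()["city"]}


def sem_to(node: Node) -> Dict[str, str]:
    return {"arrival_city": node.children[0].evaluate()["city"]}


def sem_merge(node: Node) -> Dict[str, str]:
    # The parent calls its children's functions and merges their results.
    merged: Dict[str, str] = {}
    for child in node.children:
        merged.update(child.evaluate())
    return merged


SEMANTIC_FUNCTIONS: Dict[str, Callable[[Node], Dict[str, str]]] = {
    "mat_city_name": sem_city,
    "sub_from": sem_from,
    "sub_to": sem_to,
    "info_fromto": sem_merge,
}

tree = Node("info_fromto", [
    Node("sub_from", [Node("mat_city_name", text="北京")]),
    Node("sub_to", [Node("mat_city_name", text="上海")]),
])
semantics = tree.evaluate()  # the call starts at the topmost node
```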


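[Figure: A parsing example. A semantic parse tree (nodes: flight_info, key_info, tp_info, info_fromto, sub_from, sub_to, sub_city_name_list, mat_city_name, tag_from, tag_to, info_data_time) over a Chinese utterance containing 从 ("from"), 到 ("to"), 北京 (Beijing) and 上海 (Shanghai), with rejected branches marked ×. Phenomena handled: garbage, repetition, various orders, ellipses, fragments, ill-formed input.]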


Powerful Dialogue Manager (DM)
❑ Role:
❖ Maintain dialogue contexts and states
❖ Direct the dialogues
❖ Accept parsing results and generate responses

❑ Desired features:
❖ Be able to deal with multiple topics
❖ Topics can be changed freely
❖ Be able to make full use of information shared by different topics and to support ellipsis (when the topic changes from one to another)
❖ User and machine mixed-initiative (混合主导)
❖ Be adaptive to users’ interests & parlance
❖ Be domain-transparent to user (easy to port to new systems)


❑ Problems
❖ Representation problems
Complex relationships among topic items
Items with different importance
Shared information items among topics
❖ Runtime problems
To distinguish different user interests
To handle free topic changes

❑ Solution
❖ A plan-based Topic Forest (TF) structure with Shared Information Index (SII); and
❖ A Finite-State controller.


❑ DM Overview
❖ Input: Semantic Frame
❖ Main Structure
Topic Forest
Shared Info Index
❖ Output: Text Response
❖ Reasoning Engine: Strategies
❖ Dialogue control: Finite-state controller


❑ DM Input - Semantic Frame
❖ Topic information
Current topic
Semantic slots (info. items)
❖ Non-topic information
Statement/question
Question focus
Reference items

[Figure: Semantic Frame layout. Fields: Topic; Semantic slot 1, Semantic slot 2, Semantic slot 3, ...; Sentence Type; Question Focus; Reference item 1, Reference item 2, ...]
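The frame layout maps naturally onto a small record type. The sketch below is an assumption about how such a frame could be represented; the field names, the string encodings, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SemanticFrame:
    # Topic information
    topic: str
    slots: Dict[str, str] = field(default_factory=dict)
    # Non-topic information
    sentence_type: str = "statement"          # "statement" or "question"
    question_focus: Optional[str] = None
    reference_items: List[str] = field(default_factory=list)


# Hypothetical frame for "When does the flight from Beijing to Shanghai leave?"
frame = SemanticFrame(
    topic="Flight Information",
    slots={"Departure City": "北京", "Arrival City": "上海"},
    sentence_type="question",
    question_focus="Departure Time",
)
```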


❑ DM Main Structure - Topic Forest (TF)
❖ Consisting of Topic Trees (each for a single topic domain)
❖ Representing domain topics & maintaining dialog history
❖ Designed off-line & loaded dynamically

Primary Property (PP): Dominant Info. Items
Secondary Property (SP): Detail Info. Items
Additional Property (AP): Optional Info. Items

Topic Node: Domain Topic Info
Mid Node: Relation among son nodes (AND/OR), used for adaptation to users’ interests
Leaf Node: Topic Information Item

[Figure: Part of the Topic Forest of EasyFlight (3 topic trees for 3 topics). Flight Information: PP (AND), AP (AND), SP (OR). Airline Information: PP (OR). Time Difference: PP (AND), AP (AND). Leaf nodes shown: Date, Departure Time, Flight Number, Arrival City, Departure City, Arrival Time, Plane Type, Airline, Full Name, Code, Abbreviation, Time A, Time B, Time Difference, City A, City B.]
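A topic tree with AND/OR mid nodes could be modelled roughly as below. This is a sketch under assumptions: the class names are invented, the reading of AND/OR as "all children filled" versus "any child filled" is a guess at the intended semantics, and the assignment of City A and City B to the PP node of the Time Difference topic is hypothetical, since the figure's edges are not recoverable.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Leaf:
    """Leaf node: one topic information item."""
    name: str
    value: Optional[str] = None

    def satisfied(self) -> bool:
        return self.value is not None


@dataclass
class Mid:
    """Mid node: AND/OR relation among its son nodes."""
    relation: str  # "AND" or "OR"
    children: List[Union["Leaf", "Mid"]]

    def satisfied(self) -> bool:
        results = [child.satisfied() for child in self.children]
        return all(results) if self.relation == "AND" else any(results)


city_a = Leaf("City A", "北京")
city_b = Leaf("City B")
primary_and = Mid("AND", [city_a, city_b])   # hypothetical PP (AND) node
primary_or = Mid("OR", [city_a, city_b])     # same leaves under OR
before = (primary_and.satisfied(), primary_or.satisfied())
city_b.value = "东京"
after = (primary_and.satisfied(), primary_or.satisfied())
```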


❑ DM Main Structure - Shared Info Index (SII)
❖ A collection of one-to-many mappings
❖ Helps to deal with ellipsis between topics
❖ Automatically generated after the Topic Forest is loaded

[Figure: Shared Info Index connects all topics. Index entries such as Flight Number, Departure City and Arrival City each point to the matching leaf nodes in the Flight Information and Ticket Price topic trees (leaves shown include Flight Number, Departure City, Arrival City, Date); a "Special Semanteme" entry is also shown.]
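The one-to-many mapping and its use for cross-topic ellipsis can be sketched as follows. The flat dictionary representation of the forest, the function names, and the "first filled value wins" lookup are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# topic name -> {information item name -> current value (None if unfilled)}
TopicForest = Dict[str, Dict[str, Optional[str]]]


def build_sii(forest: TopicForest) -> Dict[str, List[str]]:
    """Map each information item to every topic that contains it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for topic, leaves in forest.items():
        for item in leaves:
            index[item].append(topic)
    return dict(index)


def resolve_ellipsis(forest: TopicForest, sii: Dict[str, List[str]],
                     topic: str, item: str) -> Optional[str]:
    """Borrow a value for `item` from another topic that already has it."""
    for other in sii.get(item, []):
        if other != topic and forest[other][item] is not None:
            return forest[other][item]
    return None


forest: TopicForest = {
    "Flight Information": {"Flight Number": None, "Departure City": "北京",
                           "Arrival City": "上海", "Date": None},
    "Ticket Price": {"Flight Number": None, "Departure City": None,
                     "Arrival City": None},
}
sii = build_sii(forest)
borrowed = resolve_ellipsis(forest, sii, "Ticket Price", "Departure City")
```

When the user switches to the Ticket Price topic without repeating the cities, the lookup above recovers 北京 from the Flight Information tree.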


❑ DM Output - Text Response
❖ Response Generator - Response Generation Functions
One leaf node - one focus - one function
Different responses according to dialog status
▪ Inquiry failure, confirmation of topic information, ...
Learning users’ parlance


❑ DM Reasoning Engine - Strategies
❖ Maintaining dialogue state history
By filling the topic forest with the obtained information
Information sources:
▪ User information: that expressed by the user
▪ Inquiry result: that obtained from the database
Filling operations:
▪ Appending: keep (remember) previous information
▪ Replacing: erase (forget) previous information

❖ Ellipsis Processing
Ellipsis in the same topic: by maintaining a list of the most recent topic information items (stored in the corresponding topic tree)
Ellipsis among different topics: by using the Shared Information Index (SII)

❖ Reasoning Strategy
Topic Forest based
Querying database
Determining response focus
Users’ interests adaptation
Independent of domain knowledge
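The two filling operations can be illustrated with a short sketch. Storing each item's values as a list, and the `fill` function with its `mode` argument, are assumptions of this example rather than the system's actual interface.

```python
from typing import Dict, List


def fill(tree: Dict[str, List[str]], item: str, value: str, mode: str) -> None:
    """Fill one information item of a topic tree.

    mode="append"  : keep (remember) previously stored values.
    mode="replace" : erase (forget) previously stored values.
    """
    if mode == "append":
        tree.setdefault(item, []).append(value)
    elif mode == "replace":
        tree[item] = [value]
    else:
        raise ValueError(f"unknown filling mode: {mode}")


flight_info: Dict[str, List[str]] = {}
fill(flight_info, "Arrival City", "上海", "append")
fill(flight_info, "Arrival City", "广州", "append")
after_append = list(flight_info["Arrival City"])
fill(flight_info, "Arrival City", "北京", "replace")
after_replace = list(flight_info["Arrival City"])
```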


❑ Dialogue Control - Finite-state Controller
❖ Finite-state based method to control dialogue progress, suitable for item-by-item confirmation
❖ Built on the Topic Forest structure
With the previous structure unchanged (only some leaf nodes added)
With other topics unaffected
With intelligent inquiry supported

[Figure: Flight Information topic with added leaf nodes. Under AP (AND): Plane Type, Airline, Personal ID, Ticket Amount, State, ...]


❖ The state transition network

[Figure: State transition network. States: Flight Inquiry, Flight Confirmation, Ticket Asking, Ticket Confirmation, PersonalID Asking, PersonalID Confirmation, Ticket Reservation. Transition labels: flight undecided; flight concluded; yes & ticket amount unknown; yes & ticket amount known; yes & personal ID unknown; yes & personal ID known; yes; no; correction; extracted; unextracted. Legend: state, initial state, transition (optionally with a condition).]
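A finite-state controller over these states can be sketched as a transition table. The state and label names come from the network above, but how the labels connect the states is not recoverable from the transcript, so the wiring below, the choice of Flight Inquiry as the initial state, and the function names are all assumptions.

```python
from typing import Dict, Iterable, Tuple

# (current state, transition label) -> next state; the wiring is a guess.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("Flight Inquiry", "flight undecided"): "Flight Inquiry",
    ("Flight Inquiry", "flight concluded"): "Flight Confirmation",
    ("Flight Confirmation", "no"): "Flight Inquiry",
    ("Flight Confirmation", "yes & ticket amount unknown"): "Ticket Asking",
    ("Flight Confirmation", "yes & ticket amount known"): "Ticket Confirmation",
    ("Ticket Asking", "extracted"): "Ticket Confirmation",
    ("Ticket Asking", "unextracted"): "Ticket Asking",
    ("Ticket Confirmation", "no"): "Ticket Asking",
    ("Ticket Confirmation", "correction"): "Ticket Confirmation",
    ("Ticket Confirmation", "yes & personal ID unknown"): "PersonalID Asking",
    ("Ticket Confirmation", "yes & personal ID known"): "PersonalID Confirmation",
    ("PersonalID Asking", "extracted"): "PersonalID Confirmation",
    ("PersonalID Asking", "unextracted"): "PersonalID Asking",
    ("PersonalID Confirmation", "no"): "PersonalID Asking",
    ("PersonalID Confirmation", "correction"): "PersonalID Confirmation",
    ("PersonalID Confirmation", "yes"): "Ticket Reservation",
}


def step(state: str, label: str) -> str:
    """Follow one transition, rejecting labels not defined for the state."""
    if (state, label) not in TRANSITIONS:
        raise ValueError(f"no transition {label!r} from state {state!r}")
    return TRANSITIONS[(state, label)]


def run(labels: Iterable[str], state: str = "Flight Inquiry") -> str:
    """Run a sequence of transition labels from the assumed initial state."""
    for label in labels:
        state = step(state, label)
    return state


final_state = run([
    "flight concluded",
    "yes & ticket amount unknown",
    "extracted",
    "yes & personal ID known",
    "yes",
])
```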


Demonstrations


EasyFlight

❑ Flight information inquiry, and

❑ Flight reservation.


Greetings.


User wants to book a ticket. Machine takes the initiative by asking for the departure city.


User responds and also gives the arrival city. Machine asks for the date.


Machine provides 14 flights and asks for time.


User wants the earliest flight. Machine wants user to confirm.


User changes topic to ask about the plane type.


User changes topic back and changes mind, deciding to buy two tickets. Machine asks for confirmation.


User finally confirms.


Thanks.


EasyNav

❑ THU Campus Navigation system.

❑ Selected for a demonstration to President Jiang Zemin during the 2000 Spring Festival.

❑ The ONLY system selected for demonstration at the THU Information Center during THU's 90th Anniversary.


How long does it take to walk from X to Y?


Where can something be done?


Different ways to ask “which X is closer to Loc_A”? (1/5)


Different ways to ask “which X is closer to Loc_A”? (2/5)


Different ways to ask “which X is closer to Loc_A”? (3/5)


Different ways to ask “which X is closer to Loc_A”? (4/5)


Different ways to ask “which X is closer to Loc_A”? (5/5)


What is to the east of Loc_A?


What are there to the east of Loc_A?


Where is Loc_A?


What X is nearby? (Ellipsis)


Which is better? (Ellipsis)


Which is cheaper? (Ellipsis)


How to get to Loc_A?


How to get to Loc_B from Loc_A?


How long does it take to get to X?


How far is it from Loc_A to the nearest X?


(When no information in database?)


(When two places are the same.)


(When no information in database.)


Good-bye.


Thanks for listening

Thomas Fang Zheng

Center of Speech Technology

State Key Lab of Intelligent Technology and Systems

Department of Computer Science & Technology

Tsinghua University

[email protected], http://sp.cs.tsinghua.edu.cn/~fzheng/